From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Xie, Huawei" Subject: Re: [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors. Date: Wed, 20 May 2015 05:26:21 +0000 Message-ID: References: <1430720780-27525-1-git-send-email-changchun.ouyang@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" To: "Ouyang, Changchun" Return-path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 1906B5A9B for ; Wed, 20 May 2015 07:28:25 +0200 (CEST) Content-Language: en-US List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 5/18/2015 9:23 PM, Ouyang, Changchun wrote:=0A= > Hi Huawei,=0A= >=0A= >> -----Original Message-----=0A= >> From: Xie, Huawei=0A= >> Sent: Monday, May 18, 2015 5:39 PM=0A= >> To: Ouyang, Changchun; dev@dpdk.org=0A= >> Subject: Re: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle= =0A= >> chained vring descriptors.=0A= >>=0A= >> On 5/4/2015 2:27 PM, Ouyang Changchun wrote:=0A= >>> Vring enqueue need consider the 2 cases:=0A= >>> 1. Vring descriptors chained together, the first one is for virtio=0A= >>> header, the rest are for real data; 2. Only one descriptor, virtio=0A= >>> header and real data share one single descriptor;=0A= >>>=0A= >>> So does vring dequeue.=0A= >>>=0A= >>> Signed-off-by: Changchun Ouyang =0A= >>> ---=0A= >>> lib/librte_vhost/vhost_rxtx.c | 60=0A= >>> +++++++++++++++++++++++++++++++------------=0A= >>> 1 file changed, 44 insertions(+), 16 deletions(-)=0A= >>>=0A= >>> diff --git a/lib/librte_vhost/vhost_rxtx.c=0A= >>> b/lib/librte_vhost/vhost_rxtx.c index 510ffe8..3135883 100644=0A= >>> --- a/lib/librte_vhost/vhost_rxtx.c=0A= >>> +++ b/lib/librte_vhost/vhost_rxtx.c=0A= >>> @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> struct virtio_net_hdr_mrg_rxbuf virtio_hdr =3D {{0, 0, 0, 0, 0, 0}, 0= };=0A= >>> uint64_t buff_addr =3D 0;=0A= >>> uint64_t buff_hdr_addr =3D 0;=0A= >>> - uint32_t head[MAX_PKT_BURST], packet_len =3D 0;=0A= >>> + uint32_t head[MAX_PKT_BURST];=0A= >>> uint32_t head_idx, packet_success =3D 0;=0A= >>> uint16_t avail_idx, res_cur_idx;=0A= >>> uint16_t res_base_idx, res_end_idx;=0A= >>> @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> rte_prefetch0(&vq->desc[head[packet_success]]);=0A= >>>=0A= >>> while (res_cur_idx !=3D res_end_idx) {=0A= >>> + uint32_t offset =3D 0;=0A= >>> + uint32_t data_len, len_to_cpy;=0A= >>> + uint8_t plus_hdr =3D 0;=0A= >>> +=0A= >>> /* Get descriptor from available ring */=0A= >>> desc =3D &vq->desc[head[packet_success]];=0A= >>>=0A= >>> @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >>> queue_id,=0A= >>>=0A= >>> /* Copy virtio_hdr to packet and increment buffer address */=0A= >>> buff_hdr_addr =3D buff_addr;=0A= >>> - packet_len =3D rte_pktmbuf_data_len(buff) + vq->vhost_hlen;=0A= >>>=0A= >>> /*=0A= >>> * If the descriptors are chained the header and data are @@=0A= >>> -136,24 +139,44 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> desc =3D &vq->desc[desc->next];=0A= >>> /* Buffer address translation. */=0A= >>> buff_addr =3D gpa_to_vva(dev, desc->addr);=0A= >>> - desc->len =3D rte_pktmbuf_data_len(buff);=0A= >>> } else {=0A= >>> buff_addr +=3D vq->vhost_hlen;=0A= >>> - desc->len =3D packet_len;=0A= >>> + plus_hdr =3D 1;=0A= >>> }=0A= >>>=0A= >>> + data_len =3D rte_pktmbuf_data_len(buff);=0A= >>> + len_to_cpy =3D RTE_MIN(data_len, desc->len);=0A= >>> + do {=0A= >>> + if (len_to_cpy > 0) {=0A= >>> + /* Copy mbuf data to buffer */=0A= >>> + rte_memcpy((void *)(uintptr_t)buff_addr,=0A= >>> + (const void=0A= >> *)(rte_pktmbuf_mtod(buff, const char *) + offset),=0A= >>> + len_to_cpy);=0A= >>> + PRINT_PACKET(dev, (uintptr_t)buff_addr,=0A= >>> + len_to_cpy, 0);=0A= >>> +=0A= >>> + desc->len =3D len_to_cpy + (plus_hdr ? vq-=0A= >>> vhost_hlen : 0);=0A= >> Do we really need to rewrite the desc->len again and again? At least we= only=0A= >> have the possibility to change the value of desc->len of the last descri= ptor.=0A= > Well, I think we need change each descriptor's len in the chain here,=0A= > If aggregate all len to the last descriptor's len, it is possibly the len= gth will exceed its original len,=0A= > e.g. use 8 descriptor(each has len of 1024) chained to recv a 8K packet, = then last descriptor's len=0A= > will be 8K, and all other descriptor is 0, I don't think this situation m= ake sense. =0A= Let me explain this way.=0A= We receive a packet with 350 bytes size, and we have descriptor chain of=0A= 10 descs, each with 100 byte size.=0A= At least we don't need to change the len field of first three descriptors.= =0A= Whether we need to change the 4th len field to 50, and the rest of them=0A= to zero is still a question(used->len is updated to 350).=0A= We need check the VIRTIO spec.=0A= >>> + offset +=3D len_to_cpy;=0A= >>> + if (desc->flags & VRING_DESC_F_NEXT) {=0A= >>> + desc =3D &vq->desc[desc->next];=0A= >>> + buff_addr =3D gpa_to_vva(dev, desc-=0A= >>> addr);=0A= >>> + len_to_cpy =3D RTE_MIN(data_len -=0A= >> offset, desc->len);=0A= >>> + } else=0A= >>> + break;=0A= >> Still there are two issues here.=0A= >> a) If the data couldn't be fully copied to chain of guest buffers, we sh= ouldn't=0A= >> do any copy.=0A= > Why don't copy any data is better than the current implementation?=0A= We don't need to pass part of a packet to guest.=0A= >=0A= >> b) scatter mbuf isn't considered.=0A= > If we also consider scatter mbuf here, then this function will have exact= ly same logic with mergeable_rx,=0A= > Do you want to totally remove this function, just keep the mergeable rx f= unction for all cases?=0A= >=0A= > Changchun=0A= >=0A= >=0A= >=0A= =0A=