virtio-comment.lists.oasis-open.org archive mirror
From: Xuewei Niu <niuxuewei97@gmail.com>
To: sgarzare@redhat.com
Cc: fupan.lfp@antgroup.com, mst@redhat.com,
	niuxuewei.nxw@antgroup.com, niuxuewei97@gmail.com,
	parav@nvidia.com, stefanha@redhat.com,
	virtio-comment@lists.linux.dev
Subject: Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
Date: Mon,  7 Apr 2025 10:17:15 +0800
Message-ID: <20250407021715.2736840-1-niuxuewei.nxw@antgroup.com>
In-Reply-To: <lr6fvhjts6ytekqnlbvccvvjbnmbgspzynme4yii4bla6fiyig@mzgor4y3ojsh>

> On Mon, Mar 31, 2025 at 02:18:27PM +0800, Xuewei Niu wrote:
> >> > On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote:
> >> > > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote:
> >> > > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote:
> >> > > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote:
> >> > > >> >> >This patch brings a new feature, called "multi devices", to the virtio
> >> > > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a
> >> > > >> >> >"device_order" field to the config for the virtio vsock.
> >> > > >> >> >
> >> > > >> >> >== Motivation ==
> >> > > >> >> >
> >> > > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host
> >> > > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting
> >> > > >> >> >in the inability to enable more than one backend. For instance, two devices
> >> > > >> >> >are required: one to transfer data to the VMM via virtio-vsock,
> >> > > >> >>
> >> > > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to
> >> > > >> >> communicate with the hypervisor, but in virtio-vsock we never supported
> >> > > >> >> it. Could this be the use case?
> >> > > >> >>
> >> > > >> >> We could in this way add a new feature for those devices that
> >> > > >> >> communicate only with the VMM, where the CID of the VM is quite useless.
> >> > > >> >> So instead of having multiple CIDs per VM, we could continue to have a
> >> > > >> >> single CID, but the transport could support 2 devices, one to
> >> > > >> >> communicate with the VMM (CID = 0) and one to communicate with the host
> >> > > >> >> apps (CID = 2).
> >> > > >> >>
> >> > > >> >> Maybe this is orthogonal to this proposal, though, because it might
> >> > > >> >> still make sense to have multiple vsock devices, even though it's not
> >> > > >> >> very clear to me.
> >> > > >> >
> >> > > >> >In terms of the current situation, two devices are enough.
> >> > > >> >
> >> > > >> >We are the team of Kata Containers, so we are focusing on cloud-native
> >> > > >> >computing. What I mentioned below might be beyond the scope of the virtio
> >> > > >> >spec, just for your reference.
> >> > > >> >
> >> > > >> >The background is that the architecture of proxy mesh has evolved over
> >> > > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]).
> >> > > >> >
> >> > > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass
> >> > > >> >both host and guest network stacks. It is possible to establish a fast path
> >> > > >> >between the pod and the proxy.
> >> > > >> >
> >> > > >> >When we have multiple networks, it is intuitive to have multiple NICs. So
> >> > > >> >does vsock.
> >> > > >>
> >> > > >> Be careful though, we don't want to complicate vsock to become like a
> >> > > >> NIC.
> >> > > >>
> >> > > >> >
> >> > > >> >When multiple networks are available, it means that it is possible to have
> >> > > >> >multiple proxies (i.e. user processes). In this case, two devices are not
> >> > > >> >enough. This feature makes vsock more flexible and scalable.
> >> > > >>
> >> > > >> This is a good point, but I really don't understand why a VM should have
> >> > > >> multiple CIDs assigned.
> >> > > >
> >> > > >I think priority is not the biggest issue here. So let us focus on how to
> >> > > >route the connection to the right device among more than two devices.
> >> > >
> >> > > That's why I was recommending a different approach. IMO the user should
> >> > > not do this, but that should be transparent, hidden in the driver.
> >> > >
> >> > > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to
> >> > > be sent to the VMM, then we have to use the device that supports it.
> >> > > Whereas if the user connects to VMADDR_CID_HOST we have to use the other
> >> > > device.
> >> > >
> >> > > The user doesn't have to do anything, only use the right destination CID
> >> > > if it wants to talk to the VMM or another host process.
> >> >
> >> > Obviously, if we want to support more than 2 devices, we need this
> >> > that you are proposing. But IMO we need also to support
> >> > VMADDR_CID_HYPERVISOR, and we should prevent the user from doing
> >> > bind() on a random CID if one of the two devices only talks to the
> >> > VMM.
> >>
> >> I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can
> >> work on this later.
> 
> Would be nice to have both together, but I'm fine if you want to 
> postpone it.
> 
> >>
> >> > Because, again, how does the user know which CID to bind?
> >>
> >> Nice catch! I am trying to give a solution for this issue regarding the
> >> scenario of more than two devices.
> >>
> >> Let users access the `device_order` and the `guest_cid` field. Host user
> >> program and guest user program can make an advance agreement. For example,
> >> the first device (whose `device_order` is smallest) is used to communicate
> >> with host process 1, the second device with host process 2, and so
> >> on.
> >>
> >> If the guest user program wants to direct a message to host process 2,
> >> the steps would be:
> >>
> >> 1. Guest user program gets the second device's `guest_cid`.
> >> 2. Guest user program binds to the CID.
> >>
> >> This could work because the `device_order` is a VM-level
> >> configuration. (On the contrary, the `guest_cid` is a host-level
> >> configuration).
> >>
> >> If people don't need this feature (use 1 or 2 devices only), they can use
> >> vsock as the simple way. Otherwise, people should accept the more
> >> complicated way.
> >>
> >> WDYT?
> >
> >Or we can replace the device_order with the guest_lid (aka local id). The
> >guest_lid is a VM-level address space, while the guest_cid is a host-level
> >address space.
> >
> >```c
> >struct virtio_vsock_config {
> >	__le64 guest_cid;
> >	__le16 guest_lid; /* previous device_order */
> >};
> >```
> >
> >With this design, the relationship between the device and the guest_lid
> >should be set properly before building the guest app and launching the
> >VM.
> >
> >For example, host process 0's guest_lid is 1000, and host process 1's is
> >2000. Their guest_cid will be determined when the VM started. The device
> >table should be like this:
> >
> >* device0: process=VM   guest_lid=0     guest_cid=0 <default device>
> >* device1: process=0    guest_lid=1000  guest_cid=x
> >* device2: process=1    guest_lid=2000  guest_cid=y
> >
> >The driver should expose an interface, such as ioctl, receiving a
> >local_cid. Guest apps can use it to obtain the actual guest_cid.
> 
> No, please, I don't think adding virtio-specific behaviour in AF_VSOCK 
> is what we want.
> 
> Let's continue with device_order and see what others say.
> 
> I think we need to try to get a better understanding of what to do, 
> depending on the direction:
> 
> - host -> guest: it might make sense to have multiple devices with different
>    CIDs, and the host will know which one to use depending on the CID
>    assigned to the device (e.g. vhost, vhost-user, device in VMM)
> 
> - guest -> host: again I think we should differentiate the device to use
>    depending on the destination CID which can be VMADDR_CID_HOST,
>    VMADDR_CID_HYPERVISOR, or in the case where sibling communication is
>    supported a CID >= 3, so maybe we should have some features or flags
>    in the config space to describe destination CID supported for each
>    device

I don't understand the point of adding new features/flags. Could you
explain a bit more?

We already have the guest_cid field in the config space. The guest knows all
devices present in the VM.

If the app tries to bind a random CID, it will fail since the driver can't
find the device by the CID.
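
As a rough sketch of that lookup, assuming the driver keeps a table of the
devices it has probed: only `guest_cid` comes from the config space, while
`struct vsock_dev`, `vsock_find_dev_by_cid()` and `vsock_check_bind()` are
hypothetical names used for illustration, not code from the patch or from
any existing driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device record; guest_cid mirrors the config space field. */
struct vsock_dev {
	uint64_t guest_cid;
};

/* Return the device whose guest_cid equals cid, or NULL if none matches. */
const struct vsock_dev *
vsock_find_dev_by_cid(const struct vsock_dev *devs, size_t n, uint64_t cid)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].guest_cid == cid)
			return &devs[i];
	}
	return NULL;
}

/* Hypothetical bind check: 0 if some device owns the CID, -1 otherwise. */
int vsock_check_bind(const struct vsock_dev *devs, size_t n, uint64_t cid)
{
	return vsock_find_dev_by_cid(devs, n, cid) ? 0 : -1;
}
```

With two devices whose CIDs are 3 and 42, a bind to 42 would pass this check
and a bind to 7 would be rejected.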

> so that the guest knows which device to use depending on the destination
> CID.

Yes, this is what I was describing in the previous comment. The message
will be directed to the device by the destination CID.
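
To make the discussion concrete, here is one possible shape of routing by
destination CID, assuming each device advertised which destinations it can
reach. The `dst_mask` field, the `DST_*` flags and `vsock_pick_dev()` are
hypothetical and are not part of the patch. The CID values 0 and 2 match
VMADDR_CID_HYPERVISOR and VMADDR_CID_HOST, and are redefined locally so the
sketch builds without kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Well-known AF_VSOCK CIDs, redefined locally for a standalone build. */
#define CID_HYPERVISOR 0u
#define CID_HOST       2u

/* Hypothetical destination flags; nothing like this exists in the patch. */
#define DST_HYPERVISOR (1u << 0)
#define DST_HOST       (1u << 1)
#define DST_SIBLING    (1u << 2)

struct vsock_route_dev {
	uint64_t guest_cid;
	uint32_t dst_mask;	/* hypothetical: destinations this device serves */
};

/* Map a destination CID to the flag a device must advertise to carry it. */
static uint32_t dst_flag_for_cid(uint64_t dst_cid)
{
	if (dst_cid == CID_HYPERVISOR)
		return DST_HYPERVISOR;
	if (dst_cid == CID_HOST)
		return DST_HOST;
	if (dst_cid >= 3)
		return DST_SIBLING;
	return 0;	/* CID 1 is not routed to any device in this sketch */
}

/* Return the index of the first device serving dst_cid, or -1 if none. */
int vsock_pick_dev(const struct vsock_route_dev *devs, size_t n,
		   uint64_t dst_cid)
{
	uint32_t want = dst_flag_for_cid(dst_cid);

	if (!want)
		return -1;
	for (size_t i = 0; i < n; i++) {
		if (devs[i].dst_mask & want)
			return (int)i;
	}
	return -1;
}
```

The app would only choose the destination CID; picking the device would stay
inside the driver.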

Thanks,
Xuewei

> I don't want to stop this patch, but I would like to make it easy for 
> the user to use.
> 
> Thanks,
> Stefano
