From: Xuewei Niu <niuxuewei97@gmail.com>
To: sgarzare@redhat.com
Cc: fupan.lfp@antgroup.com, mst@redhat.com,
niuxuewei.nxw@antgroup.com, niuxuewei97@gmail.com,
parav@nvidia.com, stefanha@redhat.com,
virtio-comment@lists.linux.dev
Subject: Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
Date: Thu, 27 Mar 2025 16:18:30 +0800
Message-ID: <20250327081830.2309856-1-niuxuewei.nxw@antgroup.com>
In-Reply-To: <jvqe4zwumc5lxpcltzjfgtwi3uppi4hc2wpbgphpb2vbezsg3b@n5hmgt3gwayk>
> On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote:
> > On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote:
> > >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote:
> > >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote:
> > >> >> >This patch brings a new feature, called "multi devices", to the virtio
> > >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a
> > >> >> >"device_order" field to the config for the virtio vsock.
> > >> >> >
> > >> >> >== Motivation ==
> > >> >> >
> > >> >> >Vsock is a lightweight and widely used data exchange mechanism between host
> > >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting
> > >> >> >in the inability to enable more than one backend. For instance, two devices
> > >> >> >are required: one to transfer data to the VMM via virtio-vsock,
> > >> >>
> > >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to
> > >> >> communicate with the hypervisor, but in virtio-vsock we never supported
> > >> >> it. Could this be the use case?
> > >> >>
> > >> >> We could in this way add a new feature for those devices that
> > >> >> communicate only with the VMM, where the CID of the VM is quite useless.
> > >> >> So instead of having multiple CIDs per VM, we could continue to have a
> > >> >> single CID, but the transport could support 2 devices, one to
> > >> >> communicate with the VMM (CID = 0) and one to communicate with the host
> > >> >> apps (CID = 2).
> > >> >>
> > >> >> Maybe this is orthogonal to this proposal, though, because it might
> > >> >> still make sense to have multiple vsock devices, even though it's not
> > >> >> very clear to me.
> > >> >
> > >> >In terms of the current situation, two devices are enough.
> > >> >
> > >> >We are the team of Kata Containers, so we are focusing on cloud-native
> > >> >computing. What I mentioned below might be beyond the scope of the virtio
> > >> >spec, just for your reference.
> > >> >
> > >> >The background is that the architecture of proxy mesh has evolved over
> > >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]).
> > >> >
> > >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass
> > >> >both host and guest network stacks. It is possible to establish a fast path
> > >> >between the pod and the proxy.
> > >> >
> > >> >When we have multiple networks, it is intuitive to have multiple NICs. So
> > >> >does vsock.
> > >>
> > >> Be careful though, we don't want to complicate vsock to become like a
> > >> NIC.
> > >>
> > >> >
> > >> >When multiple networks are available, it means that it is possible to have
> > >> >multiple proxies (i.e. user processes). In this case, two devices are not
> > >> >enough. This feature makes vsock more flexible and scalable.
> > >>
> > >> This is a good point, but I really don't understand why a VM should have
> > >> multiple CIDs assigned.
> > >
> > >I think priority is not the biggest issue here. So let us focus on how to
> > >route the connection to the right device among more than two devices.
> >
> > That's why I was recommending a different approach. IMO the user should
> > not do this, but that should be transparent, hidden in the driver.
> >
> > By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to
> > be sent to the VMM, then we have to use the device that supports it.
> > Whereas if the user connects to VMADDR_CID_HOST we have to use the other
> > device.
> >
> > The user doesn't have to do anything, only use the right destination CID
> > if it wants to talk to the VMM or another host process.
>
> Obviously, if we want to support more than 2 devices, we need this
> that you are proposing. But IMO we need also to support
> VMADDR_CID_HYPERVISOR, and we should prevent the user from doing
> bind() on a random CID if one of the two devices only talks to the
> VMM.
I agree with supporting `VMADDR_CID_HYPERVISOR` for virtio-vsock. I can
work on this later.
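A minimal sketch of the destination-CID routing quoted above, as it could
look inside a driver. The `Device` class, the `talks_to_vmm` flag and
`select_device` are hypothetical names used only for illustration; they are
not part of the spec or of any existing driver. Only the two CID values come
from AF_VSOCK.

```python
from dataclasses import dataclass

VMADDR_CID_HYPERVISOR = 0  # AF_VSOCK well-known CID for the hypervisor/VMM
VMADDR_CID_HOST = 2        # AF_VSOCK well-known CID for host applications


@dataclass(frozen=True)
class Device:
    name: str
    talks_to_vmm: bool  # hypothetical flag: this device only reaches the VMM


def select_device(devices, dst_cid):
    """Pick the device for a destination CID; the user never chooses it.

    Assumption: any destination other than VMADDR_CID_HYPERVISOR goes
    through the device that talks to the host.
    """
    want_vmm = dst_cid == VMADDR_CID_HYPERVISOR
    for dev in devices:
        if dev.talks_to_vmm == want_vmm:
            return dev
    raise LookupError(f"no device for destination CID {dst_cid}")
```

With this shape, the user only passes the right destination CID to
connect(), and no bind() is needed to select a device.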
> Because, again, how does the user know which CID to bind?
Nice catch! Here is a possible solution for the scenario with more than two
devices.
Let users read the `device_order` and `guest_cid` fields. The host user
program and the guest user program can agree on a mapping in advance. For
example, the first device (the one with the smallest `device_order`) is used
to communicate with host process 1, the second device with host process 2,
and so on.
If the guest user program wants to direct a message to host process 2, the
steps would be:
1. The guest user program gets the second device's `guest_cid`.
2. The guest user program binds to that CID.
This can work because `device_order` is a VM-level configuration. (By
contrast, `guest_cid` is a host-level configuration.)
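The two steps could be sketched in guest user space as follows. This is
only an illustration under assumptions: how a program would read
`device_order` and `guest_cid` is not defined yet, so the sketch takes them
as a list of (device_order, guest_cid) pairs, and the helper names are made
up. Binding to a per-device CID also assumes the behavior proposed in this
patch.

```python
import socket


def guest_cid_by_rank(devices, rank):
    """Step 1: return the guest_cid of the rank-th device.

    devices: iterable of (device_order, guest_cid) pairs (assumed input).
    rank: 0 selects the device with the smallest device_order.
    """
    ordered = sorted(devices, key=lambda d: d[0])
    return ordered[rank][1]


def bind_on_device(guest_cid, port):
    """Step 2: bind an AF_VSOCK stream socket to the chosen CID (Linux only)."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    sock.bind((guest_cid, port))
    return sock
```

For example, `guest_cid_by_rank(devices, 1)` gives the CID of the second
device, which by prior agreement leads to host process 2.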
If people don't need this feature (they use only 1 or 2 devices), they can
keep using vsock in the simple way. Otherwise, they have to accept the more
complicated way.
WDYT?
Thanks,
Xuewei
> >
> > >
> > >Our solution uses CID as device identification. From the users'
> > >perspective, they can direct the connection to the appropriate device by
> > >specifying a CID in either the `connect` or `bind` syscall.
> >
> > How does the user know which device/CID to bind if it wants to talk with
> > the VMM or with the application?
> >
> > >
> > >Assigning one CID to a VM looks good to me. But I am not sure how to
> > >distinguish the devices. For example, should we expose a ioctl or a
> > >sockopt?
> >
> > Nope, just simply use the right destination CID in the connect()
> > (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind().
> >
> > For receiving, the user can check the source CID after connection and
> > decide to discard connections from VMADDR_CID_HYPERVISOR or
> > VMADDR_CID_HOST depending on the service.
> >
> > Thanks,
> > Stefano