From: Stefano Garzarella <sgarzare@redhat.com>
To: Xuewei Niu <niuxuewei97@gmail.com>
Cc: fupan.lfp@antgroup.com, mst@redhat.com,
niuxuewei.nxw@antgroup.com, parav@nvidia.com,
virtio-comment@lists.linux.dev
Subject: Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
Date: Wed, 26 Mar 2025 11:36:41 +0100
Message-ID: <CAGxU2F5haeggJjW0Y+BF=A4jXpOy-yOa-ZY_Y=F-rErbFLskTQ@mail.gmail.com>
In-Reply-To: <fiqrmwvxg36dljg6krolmxbjquazjbnu7vwg2savfk7y4pn7en@ejvem5zwoydz>
On Wed, 26 Mar 2025 at 11:32, Stefano Garzarella <sgarzare@redhat.com> wrote:
> On Wed, Mar 26, 2025 at 06:00:31PM +0800, Xuewei Niu wrote:
> >> On Tue, Mar 25, 2025 at 11:19:46AM +0800, Xuewei Niu wrote:
> >> >> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote:
> >> >> >This patch brings a new feature, called "multi devices", to the virtio
> >> >> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a
> >> >> >"device_order" field to the config for the virtio vsock.
> >> >> >
> >> >> >== Motivation ==
> >> >> >
> >> >> >Vsock is a lightweight and widely used data exchange mechanism between host
> >> >> >and guest. Currently, the virtio-vsock only supports one device, resulting
> >> >> >in the inability to enable more than one backend. For instance, two devices
> >> >> >are required: one to transfer data to the VMM via virtio-vsock,
> >> >>
> >> >> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to
> >> >> communicate with the hypervisor, but in virtio-vsock we never supported
> >> >> it. Could this be the use case?
> >> >>
> >> >> We could in this way add a new feature for those devices that
> >> >> communicate only with the VMM, where the CID of the VM is quite useless.
> >> >> So instead of having multiple CIDs per VM, we could continue to have a
> >> >> single CID, but the transport could support 2 devices, one to
> >> >> communicate with the VMM (CID = 0) and one to communicate with the host
> >> >> apps (CID = 2).
> >> >>
> >> >> Maybe this is orthogonal to this proposal, though, because it might
> >> >> still make sense to have multiple vsock devices, even though it's not
> >> >> very clear to me.
> >> >
> >> >In terms of the current situation, two devices are enough.
> >> >
> >> >We are the team of Kata Containers, so we are focusing on cloud-native
> >> >computing. What I mentioned below might be beyond the scope of the virtio
> >> >spec, just for your reference.
> >> >
> >> >The background is that the architecture of proxy mesh has evolved over
> >> >the past few years: from per-pod to per-host (e.g. Istio Ambient Mesh[1]).
> >> >
> >> >Thanks to the TSI[2] and vhost-user protocol, network packets can bypass
> >> >both host and guest network stacks. It is possible to establish a fast path
> >> >between the pod and the proxy.
> >> >
> >> >When we have multiple networks, it is intuitive to have multiple NICs. So
> >> >does vsock.
> >>
> >> Be careful though, we don't want to complicate vsock to become like a
> >> NIC.
> >>
> >> >
> >> >When multiple networks are available, it means that it is possible to have
> >> >multiple proxies (i.e. user processes). In this case, two devices are not
> >> >enough. This feature makes vsock more flexible and scalable.
> >>
> >> This is a good point, but I really don't understand why a VM should have
> >> multiple CIDs assigned.
> >
> >I think priority is not the biggest issue here. So let us focus on how to
> >route the connection to the right device among more than two devices.
>
> That's why I was recommending a different approach. IMO the user should
> not do this, but that should be transparent, hidden in the driver.
>
> By supporting VMADDR_CID_HYPERVISOR, we know very well if a packet is to
> be sent to the VMM, then we have to use the device that supports it.
> Whereas if the user connects to VMADDR_CID_HOST we have to use the other
> device.
>
> The user doesn't have to do anything, only use the right destination CID
> if it wants to talk to the VMM or another host process.
Obviously, if we want to support more than 2 devices, we need what
you are proposing. But IMO we also need to support
VMADDR_CID_HYPERVISOR, and we should prevent the user from calling
bind() on a random CID if one of the two devices only talks to the
VMM.
Because, again, how does the user know which CID to bind?
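
To make the idea of hiding the device choice in the driver concrete,
here is a minimal sketch of routing by destination CID. It is only an
illustration, not driver code: `VsockDevice`, `peer_cid` and
`select_device` are hypothetical names, and the one-device-per-peer
mapping is an assumption. Only the two CID values come from AF_VSOCK.

```python
from dataclasses import dataclass

# Well-known AF_VSOCK CIDs.
VMADDR_CID_HYPERVISOR = 0  # the hypervisor/VMM
VMADDR_CID_HOST = 2        # applications running on the host


@dataclass(frozen=True)
class VsockDevice:
    """Hypothetical description of one vsock device seen by the guest."""
    name: str
    peer_cid: int  # assumption: each device serves exactly one peer CID


def select_device(devices, dst_cid):
    """Return the device whose peer matches the destination CID.

    The caller only supplies dst_cid (as it would in connect()); it never
    names a device or binds to a device-specific CID.
    """
    for dev in devices:
        if dev.peer_cid == dst_cid:
            return dev
    raise LookupError(f"no vsock device serves destination CID {dst_cid}")
```

With one device per peer, a connect() to VMADDR_CID_HYPERVISOR would pick
the VMM-facing device and a connect() to VMADDR_CID_HOST the other one.
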
>
> >
> >Our solution uses CID as device identification. From the users'
> >perspective, they can direct the connection to the appropriate device by
> >specifying a CID in either the `connect` or `bind` syscall.
>
> How does the user know which device/CID to bind if it wants to talk with
> the VMM or with the application?
>
> >
> >Assigning one CID to a VM looks good to me. But I am not sure how to
> >distinguish the devices. For example, should we expose an ioctl or a
> >sockopt?
>
> Nope, just simply use the right destination CID in the connect()
> (VMADDR_CID_HYPERVISOR or VMADDR_CID_HOST), without doing any bind().
>
> For receiving, the user can check the source CID after connection and
> decide to discard connections from VMADDR_CID_HYPERVISOR or
> VMADDR_CID_HOST depending on the service.
>
> Thanks,
> Stefano
>
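
From the user's side, the connect() and receive paths described above
could look roughly like the sketch below. It assumes a Linux guest where
Python exposes socket.AF_VSOCK; VMADDR_CID_HYPERVISOR is defined locally
because it is the AF_VSOCK value 0, and `connect_to`, `is_allowed_peer`
and `accept_from` are hypothetical helper names used only for
illustration.

```python
import socket

# Well-known AF_VSOCK CIDs.
VMADDR_CID_HYPERVISOR = 0  # the hypervisor/VMM
VMADDR_CID_HOST = 2        # applications running on the host


def connect_to(dst_cid, port):
    """Connect to a peer by destination CID only, without any bind()."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    sock.connect((dst_cid, port))
    return sock


def is_allowed_peer(peer_cid, allowed_cids):
    """Decide whether a connection from peer_cid should be kept."""
    return peer_cid in allowed_cids


def accept_from(listener, allowed_cids):
    """Accept connections, discarding those from unwanted source CIDs.

    `listener` is assumed to be an AF_VSOCK socket that is already
    listening; accept() then yields (conn, (peer_cid, peer_port)).
    """
    while True:
        conn, (peer_cid, _peer_port) = listener.accept()
        if is_allowed_peer(peer_cid, allowed_cids):
            return conn
        conn.close()
```

A service meant only for the VMM would call
accept_from(listener, {VMADDR_CID_HYPERVISOR}), while one meant for host
applications would pass {VMADDR_CID_HOST}.
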