From: Xuewei Niu <niuxuewei97@gmail.com>
To: sgarzare@redhat.com
Cc: fupan.lfp@antgroup.com, mst@redhat.com,
niuxuewei.nxw@antgroup.com, niuxuewei97@gmail.com,
parav@nvidia.com, virtio-comment@lists.linux.dev
Subject: Re: [PATCH v6 RESEND] virtio-vsock: Add support for multi devices
Date: Tue, 25 Mar 2025 11:19:46 +0800
Message-ID: <20250325031946.1934483-1-niuxuewei.nxw@antgroup.com>
In-Reply-To: <h5vwvrs7gsnkyw3vmkumhbtuqdecfmksnwlvfmxr62t2lhz3hc@lpinl24bwkte>
> On Mon, Mar 24, 2025 at 02:43:35PM +0800, Xuewei Niu wrote:
> >This patch brings a new feature, called "multi devices", to the virtio
> >vsock. It introduces a "VIRTIO_VSOCK_F_MULTI_DEVICES" feature bit, and a
> >"device_order" field to the config for the virtio vsock.
> >
> >== Motivation ==
> >
> >Vsock is a lightweight and widely used data exchange mechanism between host
> >and guest. Currently, the virtio-vsock only supports one device, resulting
> >in the inability to enable more than one backend. For instance, two devices
> >are required: one to transfer data to the VMM via virtio-vsock,
>
> Come to think of it, AF_VSOCK defines CID 0 (VMADDR_CID_HYPERVISOR) to
> communicate with the hypervisor, but in virtio-vsock we never supported
> it. Could this be the use case?
>
> We could in this way add a new feature for those devices that
> communicate only with the VMM, where the CID of the VM is quite useless.
> So instead of having multiple CIDs per VM, we could continue to have a
> single CID, but the transport could support 2 devices, one to
> communicate with the VMM (CID = 0) and one to communicate with the host
> apps (CID = 2).
>
> Maybe this is orthogonal to this proposal, though, because it might
> still make sense to have multiple vsock devices, even though it's not
> very clear to me.

For our current use case, two devices are enough.

We are from the Kata Containers team, so our focus is cloud-native
computing. What follows may be beyond the scope of the virtio spec; it is
just for your reference.

The background is that the proxy mesh architecture has evolved over the
past few years, from per-pod to per-host proxies (e.g. Istio Ambient
Mesh[1]). Thanks to TSI[2] and the vhost-user protocol, network packets can
bypass both the host and guest network stacks, so it is possible to
establish a fast path between the pod and the proxy.

When there are multiple networks, it is intuitive to have multiple NICs.
The same applies to vsock.

When multiple networks are available, there may be multiple proxies (i.e.
user processes). In that case, two devices are not enough. This feature
makes vsock more flexible and scalable.

It looks like you don't like the multi-device design. May I ask why? Is it
too heavyweight?

> > and another to a user process via vhost-user-vsock.
>
> So to recap, one device would be used only to communicate with the VMM,
> and the other device to communicate with other external processes,
> right?
>
> Do you have any other use cases?
>
> >Apart from that, a side gain is that theoretically the performance might be
> >improved since each device has its own queue. But it varies depending on
> >the implementation.
>
> This though might be easier to implement supported multi-queue in the
> device, instead of adding n devices to the VM.

I think multi-queue and multi-device are independent of each other, just as
they are for network devices. A single vsock device can be considered a
group of queues (if multi-queue is supported), and it can be assigned its
own thread to handle the traffic.

So I accepted Parav's suggestion and mention it as a side gain.

> >== Typical Usages ==
> >
> >Assuming there are two virtio-vsock devices on the guest, with CIDs 3 and 4
> >respectively. And the device with CID 3 is default.
> >
> >Connect to the host using the device with CID 3.
> >
> >```c
> >// use default one (no bind)
> >fd = socket(AF_VSOCK);
> >connect(fd, 2, 1234);
> >n = write(fd, buffer);
> >
> >// or bind explicitly
> >fd = socket(AF_VSOCK);
> >bind(fd, 3, -1);
> >connect(fd, 2, 1234);
> >n = write(fd, buffer);
> >```
> >
> >Connect to the host using the device with CID 4.
> >
> >```c
> >// must bind explicitly as the device with CID 4 is not default.
> >fd = socket(AF_VSOCK);
> >bind(fd, 4, -1);
> >connect(fd, 2, 1234);
> >n = write(fd, buffer);
> >```
> >
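For reference, here is a fuller userspace sketch of the bind-then-connect
flow above, written against the real `struct sockaddr_vm` API. Selecting a
device by binding to its CID is the behaviour proposed in this patch, not
something mainline Linux does today, and `vsock_addr()` and
`vsock_connect_via()` are hypothetical helper names used only for
illustration.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

/* Build a vsock address for the given CID and port. */
struct sockaddr_vm vsock_addr(unsigned int cid, unsigned int port)
{
    struct sockaddr_vm addr;

    memset(&addr, 0, sizeof(addr));
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = cid;
    addr.svm_port = port;
    return addr;
}

/*
 * Connect to remote_cid:port. If local_cid is VMADDR_CID_ANY the socket is
 * left unbound, so the default device would be used (proposed behaviour).
 * Returns a connected fd, or -1 on error.
 */
int vsock_connect_via(unsigned int local_cid, unsigned int remote_cid,
                      unsigned int port)
{
    struct sockaddr_vm local = vsock_addr(local_cid, VMADDR_PORT_ANY);
    struct sockaddr_vm remote = vsock_addr(remote_cid, port);
    int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* Bind only when the caller asked for a specific device. */
    if (local_cid != VMADDR_CID_ANY &&
        bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
        goto err;
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0)
        goto err;
    return fd;
err:
    close(fd);
    return -1;
}
```

With these helpers, `vsock_connect_via(4, VMADDR_CID_HOST, 1234)` matches
the explicit-bind example for CID 4, and passing `VMADDR_CID_ANY` as the
first argument matches the "no bind" case.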
> >The first version of multi-devices implementation is available at [1].
> >
> >[1] https://lore.kernel.org/virtualization/20240517144607.2595798-1-niuxuewei.nxw@antgroup.com
> >
> >Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
> >---
> > device-types/vsock/description.tex | 30 ++++++++++++++++++++++++++++--
> > 1 file changed, 28 insertions(+), 2 deletions(-)
> >
> >diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex
> >index 7d91d15..7d0cfe4 100644
> >--- a/device-types/vsock/description.tex
> >+++ b/device-types/vsock/description.tex
> >@@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
> > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported.
> > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported.
> > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied.
> >+\item[VIRTIO_VSOCK_F_MULTI_DEVICES (3)] multiple devices feature is supported.
> > \end{description}
> >
> > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits}
> >@@ -34,6 +35,12 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
> > VIRTIO_VSOCK_F_NO_IMPLIED_STREAM, the driver MAY act as if
> > VIRTIO_VSOCK_F_STREAM has also been negotiated.
> >
> >+The driver SHOULD ignore devices that do not have
> >+VIRTIO_VSOCK_F_MULTI_DEVICES if the feature has been negotiated.
> >+
> >+The driver SHOULD ignore all subsequent devices if a device without
> >+VIRTIO_VSOCK_F_MULTI_DEVICES is present.
> >+
> > \devicenormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits}
> >
> > The device SHOULD offer the VIRTIO_VSOCK_F_NO_IMPLIED_STREAM feature.
> >@@ -52,6 +59,7 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device
> > \begin{lstlisting}
> > struct virtio_vsock_config {
> > le64 guest_cid;
> >+ le16 device_order;
> > };
> > \end{lstlisting}
> >
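To illustrate how a driver might read the extended config, here is a rough
sketch that decodes the layout from a raw byte buffer. It assumes
\field{device_order} sits at offset 8, directly after \field{guest_cid},
with no padding, as the listing suggests. `struct vsock_config`,
`read_le()` and `decode_vsock_config()` are hypothetical names, not
existing driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of struct virtio_vsock_config. */
struct vsock_config {
    uint64_t guest_cid;
    uint16_t device_order; /* meaningful only when has_order is true */
    bool has_order;
};

/* Read an unsigned little-endian integer of len bytes (len <= 8). */
static uint64_t read_le(const uint8_t *p, size_t len)
{
    uint64_t v = 0;

    for (size_t i = 0; i < len; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/*
 * Decode the config space. device_order is read only when
 * VIRTIO_VSOCK_F_MULTI_DEVICES was negotiated (multi_devices == true).
 * Returns 0 on success, -1 if the buffer is too short.
 */
int decode_vsock_config(const uint8_t *buf, size_t len, bool multi_devices,
                        struct vsock_config *out)
{
    size_t need = multi_devices ? 10 : 8;

    if (len < need)
        return -1;
    out->guest_cid = read_le(buf, 8);
    out->has_order = multi_devices;
    out->device_order = multi_devices ? (uint16_t)read_le(buf + 8, 2) : 0;
    return 0;
}
```

A driver that did not negotiate the feature keeps reading only the first
8 bytes, so the existing layout stays compatible.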
> >@@ -77,11 +85,27 @@ \subsection{Device configuration layout}\label{sec:Device Types / Socket Device
> > \hline
> > \end{tabular}
> >
> >+The \field{device_order} is used to identify the default device. Up to
> >+65,535 devices can be supported due to the size.
> >+
> >+\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout}
> >+
> >+The device MUST provide a distinct \field{device_order} if
> >+VIRTIO_VSOCK_F_MULTI_DEVICES feature has been negotiated.
> >+
> >+\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Socket Device / Device configuration layout}
> >+
> >+The driver MUST treat the device with the lowest \field{device_order} as
> >+the default device.
> >+
> > \subsection{Device Initialization}\label{sec:Device Types / Socket Device / Device Initialization}
> >
> > \begin{enumerate}
> > \item The guest's cid is read from \field{guest_cid}.
> >
> >+\item If VIRTIO_VSOCK_F_MULTI_DEVICES has been negotiated, the device's
> >+order will be read from \field{device_order}.
> >+
> > \item Buffers are added to the event virtqueue to receive events from the device.
> >
> > \item Buffers are added to the rx virtqueue to start receiving packets.
> >@@ -233,8 +257,10 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
> >
> > \drivernormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit}
> >
> >-The \field{guest_cid} configuration field MUST be used as the source CID when
> >-sending outgoing packets.
> >+If \field{src_cid} is missing in outgoing packets, the driver MUST assign
>
> I think here we have to define what the driver does, since the driver
> has to populate that field, how is it missing?
>
> Maybe we are confusing interaction with user space, so we should say
> something like, “If the source socket is not bound to any source CID,
> the driver MUST assign ...”
Will do.
> >+one. If more than one device is present, the driver SHOULD use the default
> >+device's \field{guest_cid} configuration. Otherwise, the driver SHOULD use
> >+the \field{guest_cid} of the only available device.
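To make the intended rule concrete, here is a minimal sketch of how a
driver could pick the source CID under the rewording you suggest. It is not
the Linux implementation: `struct vsock_dev`, `SRC_CID_UNBOUND`,
`default_device()` and `pick_src_cid()` are hypothetical names, and it
assumes every listed device has negotiated VIRTIO_VSOCK_F_MULTI_DEVICES.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the socket was not bound to any source CID. */
#define SRC_CID_UNBOUND UINT64_MAX

/* Hypothetical per-device state mirroring struct virtio_vsock_config. */
struct vsock_dev {
    uint64_t guest_cid;
    uint16_t device_order;
};

/* The default device is the one with the lowest device_order. */
const struct vsock_dev *default_device(const struct vsock_dev *devs, size_t n)
{
    const struct vsock_dev *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (!best || devs[i].device_order < best->device_order)
            best = &devs[i];
    return best; /* NULL when no device is present */
}

/*
 * Source CID for an outgoing packet: the bound CID if the socket has one,
 * otherwise the default device's guest_cid. With a single device, the
 * default is simply that device. Returns 0 on success, -1 if no device
 * is present.
 */
int pick_src_cid(const struct vsock_dev *devs, size_t n,
                 uint64_t bound_cid, uint64_t *src_cid)
{
    const struct vsock_dev *def = default_device(devs, n);

    if (!def)
        return -1;
    *src_cid = (bound_cid != SRC_CID_UNBOUND) ? bound_cid : def->guest_cid;
    return 0;
}
```

The sketch does not check that a bound CID actually belongs to one of the
devices; a real driver would presumably reject such a bind earlier.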
> >
> > A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
> > unknown \field{type} value.
> >--
> >2.34.1

[1]: https://istio.io/latest/blog/2022/introducing-ambient-mesh/
[2]: https://github.com/containers/libkrun?tab=readme-ov-file#networking

Thanks,
Xuewei