From: Andrey Shinkevich <andrey.shinkevich@huawei.com>
To: Shashi Mallela <shashi.mallela@linaro.org>
Cc: "peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"drjones@redhat.com" <drjones@redhat.com>,
	"Cota@braap.org" <Cota@braap.org>,
	"richard.henderson@linaro.org" <richard.henderson@linaro.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"Chengen (William, FixNet)" <chengen@huawei.com>,
	yuzenghui <yuzenghui@huawei.com>,
	"Wanghaibin (D)" <wanghaibin.wang@huawei.com>,
	"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: GICv3 for MTTCG
Date: Thu, 17 Jun 2021 16:43:59 +0000
Message-ID: <d8acff2435364e019931b8d13296ad69@huawei.com>
In-Reply-To: 41839E15-50DF-4EFB-AF54-6CDB089859BD@getmailspring.com

Dear Shashi,

I have applied version 4 of the series "GICv3 LPI and ITS feature
implementation" on top of commit 3e9f48b, as before (because GCC 7.5
is unavailable in the YUM repository for CentOS 7.9).

The guest OS still hangs during boot when QEMU is configured with 4 or
more vCPUs (with 1 to 3 vCPUs the guest boots and runs fine, and MTTCG
works properly):

Welcome to EulerOS 2.0 ... (Initramfs)!

…

[  OK  ] Mounted Kernel Configuration File System.

[  OK  ] Started udev Coldplug all Devices.

[  OK  ] Reached target System Initialization.

[  OK  ] Reached target Basic System.



IT HANGS HERE (with 4 or more vCPUs)!!!


[  OK  ] Found device /dev/mapper/euleros-root.

[  OK  ] Reached target Initrd Root Device.

[  OK  ] Started dracut initqueue hook.

          Starting File System Check on /dev/mapper/euleros-root...

[  OK  ] Reached target Remote File Systems (Pre).

[  OK  ] Reached target Remote File Systems.

[  OK  ] Started File System Check on /dev/mapper/euleros-root.

          Mounting /sysroot...

[  OK  ] Mounted /sysroot.

…


The backtrace of the QEMU threads looks like a deadlock in MTTCG,
doesn't it?

Thread 7 (Thread 0x7f476e489700 (LWP 24967)):

#0  0x00007f477c2bbd19 in syscall () at /lib64/libc.so.6

#1  0x000055747d41a270 in qemu_event_wait (val=<optimized out>, 
f=<optimized out>) at /home/andy/git/qemu/include/qemu/futex.h:29

#2  0x000055747d41a270 in qemu_event_wait (ev=ev@entry=0x55747e051c28 
<rcu_call_ready_event>) at ../util/qemu-thread-posix.c:460

#3  0x000055747d444d78 in call_rcu_thread (opaque=opaque@entry=0x0) at 
../util/rcu.c:258

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 6 (Thread 0x7f472ce42700 (LWP 24970)):

#0  0x00007f477c2b6ccd in poll () at /lib64/libc.so.6

#1  0x00007f47805c137c in g_main_context_iterate.isra.19 () at 
/lib64/libglib-2.0.so.0

#2  0x00007f47805c16ca in g_main_loop_run () at /lib64/libglib-2.0.so.0

#3  0x000055747d29b071 in iothread_run 
(opaque=opaque@entry=0x55747f85f280) at ../iothread.c:80

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 5 (Thread 0x7f461f9ff700 (LWP 24971)):

#0  0x00007f477c59ca35 in pthread_cond_wait@@GLIBC_2.3.2 () at 
/lib64/libpthread.so.0

#1  0x000055747d419b1d in qemu_cond_wait_impl (cond=0x55747f916670, 
mutex=0x55747e04dc00 <qemu_global_mutex>, file=0x55747d5dbe5c 
"../softmmu/cpus.c", line=417) at ../util/qemu-thread-posix.c:174

#2  0x000055747d20ae36 in qemu_wait_io_event 
(cpu=cpu@entry=0x55747f8b7920) at ../softmmu/cpus.c:417

#3  0x000055747d18d6a1 in mttcg_cpu_thread_fn 
(arg=arg@entry=0x55747f8b7920) at ../accel/tcg/tcg-accel-ops-mttcg.c:98

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 4 (Thread 0x7f461f1fe700 (LWP 24972)):

#0  0x00007f477c59ca35 in pthread_cond_wait@@GLIBC_2.3.2 () at 
/lib64/libpthread.so.0

#1  0x000055747d419b1d in qemu_cond_wait_impl (cond=0x55747f9897e0, 
mutex=0x55747e04dc00 <qemu_global_mutex>, file=0x55747d5dbe5c 
"../softmmu/cpus.c", line=417) at ../util/qemu-thread-posix.c:174

#2  0x000055747d20ae36 in qemu_wait_io_event 
(cpu=cpu@entry=0x55747f924bc0) at ../softmmu/cpus.c:417

#3  0x000055747d18d6a1 in mttcg_cpu_thread_fn 
(arg=arg@entry=0x55747f924bc0) at ../accel/tcg/tcg-accel-ops-mttcg.c:98

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 3 (Thread 0x7f461e9fd700 (LWP 24973)):

#0  0x00007f477c59ca35 in pthread_cond_wait@@GLIBC_2.3.2 () at 
/lib64/libpthread.so.0

#1  0x000055747d419b1d in qemu_cond_wait_impl (cond=0x55747f9f5b40, 
mutex=0x55747e04dc00 <qemu_global_mutex>, file=0x55747d5dbe5c 
"../softmmu/cpus.c", line=417) at ../util/qemu-thread-posix.c:174

#2  0x000055747d20ae36 in qemu_wait_io_event 
(cpu=cpu@entry=0x55747f990ba0) at ../softmmu/cpus.c:417

#3  0x000055747d18d6a1 in mttcg_cpu_thread_fn 
(arg=arg@entry=0x55747f990ba0) at ../accel/tcg/tcg-accel-ops-mttcg.c:98

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 2 (Thread 0x7f461e1fc700 (LWP 24974)):

#0  0x00007f477c59ca35 in pthread_cond_wait@@GLIBC_2.3.2 () at 
/lib64/libpthread.so.0


#1  0x000055747d419b1d in qemu_cond_wait_impl (cond=0x55747fa626c0, 
mutex=0x55747e04dc00 <qemu_global_mutex>, file=0x55747d5dbe5c 
"../softmmu/cpus.c", line=417) at ../util/qemu-thread-posix.c:174

#2  0x000055747d20ae36 in qemu_wait_io_event 
(cpu=cpu@entry=0x55747f9fcf00) at ../softmmu/cpus.c:417

#3  0x000055747d18d6a1 in mttcg_cpu_thread_fn 
(arg=arg@entry=0x55747f9fcf00) at ../accel/tcg/tcg-accel-ops-mttcg.c:98

#4  0x000055747d419406 in qemu_thread_start (args=<optimized out>) at 
../util/qemu-thread-posix.c:521

#5  0x00007f477c598ea5 in start_thread () at /lib64/libpthread.so.0

#6  0x00007f477c2c19fd in clone () at /lib64/libc.so.6



Thread 1 (Thread 0x7f4781db4d00 (LWP 24957)):

#0  0x00007f477c2b6d8f in ppoll () at /lib64/libc.so.6

#1  0x000055747d431439 in qemu_poll_ns (__ss=0x0, 
__timeout=0x7ffcc3188330, __nfds=<optimized out>, __fds=<optimized out>) 
at /usr/include/bits/poll2.h:77

#2  0x000055747d431439 in qemu_poll_ns (fds=<optimized out>, 
nfds=<optimized out>, timeout=timeout@entry=3792947) at 
../util/qemu-timer.c:348

#3  0x000055747d4466ce in main_loop_wait (timeout=<optimized out>) at 
../util/main-loop.c:249

#4  0x000055747d4466ce in main_loop_wait 
(nonblocking=nonblocking@entry=0) at ../util/main-loop.c:530

#5  0x000055747d2695c7 in qemu_main_loop () at ../softmmu/runstate.c:725

#6  0x000055747ccc1bde in main (argc=<optimized out>, argv=<optimized 
out>, envp=<optimized out>) at ../softmmu/main.c:50

(gdb)
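
To check the deadlock reading, it can help to reduce each thread in a dump like the one above to the QEMU functions on its stack. Below is a minimal sketch in Python, assuming the `thread apply all bt` output has been saved as plain text. The helper names `parse_backtrace` and `vcpu_states` are hypothetical and not part of QEMU or gdb; the `SAMPLE` text is an excerpt of the trace above.

```python
import re

# Header line of a thread in gdb output, e.g. "Thread 5 (Thread 0x... (LWP 24971)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Frame line, e.g. "#2  0x000055747d20ae36 in qemu_wait_io_event"
# (the argument list may have been wrapped onto the next line by the mailer)
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?([^\s(]+)")

# Excerpt of the dump above (assumption: saved as plain text)
SAMPLE = """\
Thread 5 (Thread 0x7f461f9ff700 (LWP 24971)):
#0  0x00007f477c59ca35 in pthread_cond_wait@@GLIBC_2.3.2 () at
/lib64/libpthread.so.0
#2  0x000055747d20ae36 in qemu_wait_io_event
(cpu=cpu@entry=0x55747f8b7920) at ../softmmu/cpus.c:417
#3  0x000055747d18d6a1 in mttcg_cpu_thread_fn
(arg=arg@entry=0x55747f8b7920) at ../accel/tcg/tcg-accel-ops-mttcg.c:98
Thread 1 (Thread 0x7f4781db4d00 (LWP 24957)):
#0  0x00007f477c2b6d8f in ppoll () at /lib64/libc.so.6
#5  0x000055747d2695c7 in qemu_main_loop () at ../softmmu/runstate.c:725
"""


def parse_backtrace(text):
    """Map gdb thread number -> list of function names, innermost frame first."""
    threads = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def vcpu_states(threads):
    """For each MTTCG vCPU thread, report whether it sits in qemu_wait_io_event."""
    return {
        tid: "qemu_wait_io_event" in frames
        for tid, frames in threads.items()
        if "mttcg_cpu_thread_fn" in frames
    }


if __name__ == "__main__":
    print(vcpu_states(parse_backtrace(SAMPLE)))
```

A vCPU thread reported as `True` is parked in `qemu_wait_io_event()`, that is, in `pthread_cond_wait()`, which releases the mutex while it sleeps. If every vCPU thread is in that state, the threads may be idle and waiting for a wake-up that never arrives rather than blocked on a lock, but that interpretation is an assumption that would need confirming.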


I run QEMU via virt-manager as follows:

qemu      7311     1 70 19:15 ?        00:00:05 
/usr/local/bin/qemu-system-aarch64 -name 
guest=EulerOS-2.8-Rich,debug-threads=on -S -object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-95-EulerOS-2.8-Rich/master-key.aes 
-machine virt-6.1,accel=tcg,usb=off,dump-guest-core=off,gic-version=3 
-cpu max -drive 
file=/usr/share/AAVMF/AAVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on 
-drive 
file=/var/lib/libvirt/qemu/nvram/EulerOS-2.8-Rich_VARS.fd,if=pflash,format=raw,unit=1 
-m 4096 -smp 4,sockets=4,cores=1,threads=1 -uuid 
c95e0e92-011b-449a-8e3f-b5f0938aaaa7 -display none -no-user-config 
-nodefaults -chardev socket,id=charmonitor,fd=26,server,nowait -mon 
chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown 
-boot strict=on -device 
pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 
-device 
pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 
-device 
pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 
-device 
pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 
-device qemu-xhci,p2=8,p3=8,id=usb,bus=pci.2,addr=0x0 -device 
virtio-scsi-pci,id=scsi0,bus=pci.3,addr=0x0 -drive 
file=/var/lib/libvirt/images/EulerOS-2.8-Rich.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 
-device 
scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 
-drive if=none,id=drive-scsi0-0-0-1,readonly=on -device 
scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1 
-netdev tap,fd=28,id=hostnet0 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f9:e0:69,bus=pci.1,addr=0x0 
-chardev pty,id=charserial0 -serial chardev:charserial0 -msg timestamp=on
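
The process listing above is hard to scan as a single wrapped line. Below is a minimal sketch in Python that splits such a command line into one option per entry, assuming the text has been copied without the leading `ps` columns. `group_options` is a hypothetical helper, and the `CMDLINE` excerpt keeps only a few of the options shown above.

```python
import shlex

# Shortened excerpt of the command line above (assumption: ps columns removed)
CMDLINE = (
    "/usr/local/bin/qemu-system-aarch64 "
    "-machine virt-6.1,accel=tcg,usb=off,dump-guest-core=off,gic-version=3 "
    "-cpu max -m 4096 -smp 4,sockets=4,cores=1,threads=1 "
    "-display none -nodefaults"
)


def group_options(cmdline: str) -> list[str]:
    """Group each '-option' with the value tokens that follow it.

    Limitation: a value that itself starts with '-' would be treated
    as the start of a new option.
    """
    groups: list[str] = []
    for token in shlex.split(cmdline):
        if token.startswith("-") or not groups:
            groups.append(token)
        else:
            groups[-1] += " " + token
    return groups


if __name__ == "__main__":
    for entry in group_options(CMDLINE):
        print(entry)
```

Printed this way, the `-machine`, `-cpu` and `-smp` settings that matter for reproducing the hang each land on their own line.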

The issue is reproducible and persists.
1. Do you think that applying the series causes the deadlock in
MTTCG, or could there be another reason?
2. Which part of the QEMU source code should I investigate to locate the issue?
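
To collect the same kind of thread dump repeatedly while investigating, gdb can be driven non-interactively, which also keeps pager prompts out of the output. Below is a minimal sketch in Python, assuming `gdb` is on the PATH and the user is allowed to attach to the QEMU process. The helper names are hypothetical, and 7311 is the PID from the process listing above.

```python
import subprocess


def gdb_backtrace_argv(pid: int) -> list[str]:
    """Build a gdb command line that dumps all thread backtraces and exits."""
    return [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "set pagination off",   # no "Type <return> to continue" prompts
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def capture_backtrace(pid: int) -> str:
    """Attach to the process, return the backtrace text, then detach."""
    result = subprocess.run(
        gdb_backtrace_argv(pid),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Assumption: 7311 is the PID of the hung QEMU process.
    print(capture_backtrace(7311))
```

Taking two or three dumps a few seconds apart and comparing them is one way to tell threads that never move from threads that are merely idle at the moment of sampling.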

Best regards,
Andrey Shinkevich


On 5/13/21 7:45 PM, Shashi Mallela wrote:
> Hi Andrey,
> 
> To clarify, the patch series
> 
>     https://lists.gnu.org/archive/html/qemu-arm/2021-04/msg00944.html
>     "GICv3 LPI and ITS feature implementation"
> 
> is applicable for virt machine 6.1 onwards, i.e. ITS TCG functionality is
> not available for version 6.0, which is being tried here.
> 
> Thanks
> Shashi
> 
> On May 13 2021, at 12:35 pm, Andrey Shinkevich 
> <andrey.shinkevich@huawei.com> wrote:
> 
>     Dear colleagues,
> 
>     Thank you all very much for your responses. Let me reply with one
>     message.
> 
>     I configured QEMU for AARCH64 guest:
>     $ ./configure --target-list=aarch64-softmmu
> 
>     When I start QEMU with GICv3 on an x86 host:
>     qemu-system-aarch64 -machine virt-6.0,accel=tcg,gic-version=3
> 
>     QEMU reports this error from hw/pci/msix.c:
>     error_setg(errp, "MSI-X is not supported by interrupt controller");
> 
>     Probably, the variable 'msi_nonbroken' would be initialized in
>     hw/intc/arm_gicv3_its_common.c:
>     gicv3_its_init_mmio(..)
> 
>     I guess that it works with KVM acceleration only rather than with TCG.
> 
>     The error persists after applying the series:
>     https://lists.gnu.org/archive/html/qemu-arm/2021-04/msg00944.html
>     "GICv3 LPI and ITS feature implementation"
>     (special thanks for referring me to that)
> 
>     Could you please clarify this and advise how that error can be fixed?
>     Should MSI-X support be implemented additionally for GICv3?
> 
>     When successful, I would like to test QEMU for a maximum number of cores
>     to get the best MTTCG performance.
>     Probably, we will get just some percentage of performance enhancement
>     with the BQL series applied, won't we? I will test it as well.
> 
>     Best regards,
>     Andrey Shinkevich
> 
> 
>     On 5/12/21 6:43 PM, Alex Bennée wrote:
>      >
>      > Andrey Shinkevich <andrey.shinkevich@huawei.com> writes:
>      >
>      >> Dear colleagues,
>      >>
>      >> I am looking for ways to accelerate the MTTCG for ARM guest on
>      >> x86-64 host.
>      >> The maximum number of CPUs for MTTCG that uses GICv2 is limited
>      >> by 8:
>      >>
>      >> include/hw/intc/arm_gic_common.h:#define GIC_NCPU 8
>      >>
>      >> The version 3 of the Generic Interrupt Controller (GICv3) is not
>      >> supported in QEMU for some reason unknown to me. It would allow to
>      >> increase the limit of CPUs and accelerate the MTTCG performance on a
>      >> multiple core hypervisor.
>      >
>      > It is supported, you just need to select it.
>      >
>      >> I have got an idea to implement the Interrupt Translation
>      >> Service (ITS) for using by MTTCG for ARM architecture.
>      >
>      > There is some work to support ITS under TCG already posted:
>      >
>      > Subject: [PATCH v3 0/8] GICv3 LPI and ITS feature implementation
>      > Date: Thu, 29 Apr 2021 19:41:53 -0400
>      > Message-Id: <20210429234201.125565-1-shashi.mallela@linaro.org>
>      >
>      > please do review and test.
>      >
>      >> Do you find that idea useful and feasible?
>      >> If yes, how much time do you estimate for such a project to
>      >> complete by one developer?
>      >> If no, what are reasons for not implementing GICv3 for MTTCG in
>      >> QEMU?
>      >
>      > As far as MTTCG performance is concerned there is a degree of
>      > diminishing returns to be expected as the synchronisation cost
>      > between threads will eventually outweigh the gains of additional
>      > threads.
>      >
>      > There are a number of parts that could improve this performance. The
>      > first would be picking up the BQL reduction series from your
>      > FutureWei colleagues who worked on the problem when they were
>      > Linaro assignees:
>      >
>      > Subject: [PATCH v2 0/7] accel/tcg: remove implied BQL from cpu_handle_interrupt/exception path
>      > Date: Wed, 19 Aug 2020 14:28:49 -0400
>      > Message-Id: <20200819182856.4893-1-robert.foley@linaro.org>
>      >
>      > There was also a longer series moving towards per-CPU locks:
>      >
>      > Subject: [PATCH v10 00/73] per-CPU locks
>      > Date: Wed, 17 Jun 2020 17:01:18 -0400
>      > Message-Id: <20200617210231.4393-1-robert.foley@linaro.org>
>      >
>      > I believe the initial measurements showed that the BQL cost
>      > started to edge up with GIC interactions. We did discuss
>      > approaches for this and I think one idea was to use non-BQL
>      > locking for the GIC. You would need to revert:
>      >
>      > Subject: [PATCH-for-5.2] exec: Remove MemoryRegion::global_locking field
>      > Date: Thu, 6 Aug 2020 17:07:26 +0200
>      > Message-Id: <20200806150726.962-1-philmd@redhat.com>
>      >
>      > and then implement a more fine tuned locking in the GIC emulation
>      > itself. However I think the BQL and per-CPU locks are lower hanging
>      > fruit to tackle first.
>      >
>      >>
>      >> Best regards,
>      >> Andrey Shinkevich
>      >
>      >
> 


