Linux-Serial Archive mirror
From: Liuye <liu.yeC@h3c.com>
To: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "jason.wessel@windriver.com" <jason.wessel@windriver.com>,
	"dianders@chromium.org" <dianders@chromium.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"jirislaby@kernel.org" <jirislaby@kernel.org>,
	"kgdb-bugreport@lists.sourceforge.net"
	<kgdb-bugreport@lists.sourceforge.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-serial@vger.kernel.org" <linux-serial@vger.kernel.org>
Subject: Re: [PATCH] kdb: Fix the deadlock issue in KDB debugging.
Date: Thu, 14 Mar 2024 07:06:22 +0000
Message-ID: <56ed54fd241c462189d2d030ad51eac6@h3c.com>
In-Reply-To: <20240313141745.GD202685@aspen.lan>

>On Wed, Mar 13, 2024 at 01:22:17AM +0000, Liuye wrote:
>> >On Tue, Mar 12, 2024 at 10:04:54AM +0000, Liuye wrote:
>> >> >On Tue, Mar 12, 2024 at 08:37:11AM +0000, Liuye wrote:
>> >> >> I know that you said schedule_work is not NMI safe, which is the 
>> >> >> first issue. Perhaps it can be fixed using irq_work_queue. But 
>> >> >> even if irq_work_queue is used to implement it, there will still 
>> >> >> be a deadlock problem because slave cpu1 still has not released 
>> >> >> the running queue lock of master CPU0.
>> >> >
>> >> >This doesn't sound right to me. Why do you think CPU1 won't 
>> >> >release the run queue lock?
>> >>
>> >> In this example, CPU1 is waiting for CPU0 to release 
>> >> dbg_slave_lock.
>> >
>> >That shouldn't be a problem. CPU0 will have released that lock by the 
>> >time the irq work is dispatched.
>>
>> Release dbg_slave_lock in CPU0. Before that, schedule_work needs to be 
>> handled, and we are back to the previous issue.
>
>Sorry but I still don't understand what problem you think can happen here. What is wrong with calling schedule_work() from the IRQ work handler?
>
>Both irq_work_queue() and schedule_work() are calls to queue deferred work. It does not matter when the work is queued (providing we are lock safe). What matters is when the work is actually executed.
>
>Please can you describe the problem you think exists based on when the work is executed.

CPU0 enters KDB while handling a serial port interrupt and sends an IPI (NMI) to the other CPUs.
Once the system has settled, CPU0 is in interrupt context and the other CPUs are in NMI context.
Before another CPU enters NMI context, it may already have acquired CPU0's run queue lock.
When CPU0 then runs kgdboc_restore_input() and calls schedule_work(), need_more_worker() decides whether to wake a worker on system_wq.
That wakeup makes CPU0 try to acquire its own run queue lock, which the other CPU still holds.
But the other CPU is still in NMI context and cannot leave it, because it is waiting for CPU0 to release dbg_slave_lock, which only happens after schedule_work() returns.

After thinking about it, the problem is not whether schedule_work() is NMI safe, but that a worker on system_wq must not be woken immediately when schedule_work() is called.
I replaced schedule_work() with schedule_delayed_work(), and this solved my problem.
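
To make the ordering argument above easier to check, here is a small user-space toy model of it. It is only a sketch under assumptions: the struct, its fields, and every function name are made up for illustration and are not kernel APIs, and real lock acquisition is reduced to boolean flags. It models the claim that an immediate wakeup needs CPU0's run queue lock while the holder is parked in NMI context, whereas a delayed queueing only arms a timer and attempts the wakeup after the other CPUs have resumed.

```c
#include <stdbool.h>

/* Toy state for the scenario; all names here are hypothetical. */
struct kdb_model {
	bool cpu1_holds_cpu0_rq_lock; /* CPU1 took CPU0's run queue lock before its NMI */
	bool cpu1_parked_in_nmi;      /* CPU1 waits for CPU0 to drop dbg_slave_lock */
	bool work_queued;             /* the restore-input work has been queued */
};

/*
 * A wakeup on CPU0 needs CPU0's run queue lock. If the holder is parked,
 * the lock can never be released, so CPU0 would spin forever.
 * Returns true when the wakeup can complete, false when it would deadlock.
 */
static bool wake_worker(const struct kdb_model *m)
{
	return !(m->cpu1_holds_cpu0_rq_lock && m->cpu1_parked_in_nmi);
}

/* Models schedule_work() inside the debugger: queue and wake at once. */
static bool queue_immediate(struct kdb_model *m)
{
	m->work_queued = true;
	return wake_worker(m);
}

/* Models schedule_delayed_work(): only arm a timer, no wakeup yet. */
static bool queue_delayed(struct kdb_model *m)
{
	m->work_queued = true;
	return true; /* in this model, queueing takes no run queue lock */
}

/* Models CPU0 releasing dbg_slave_lock: CPU1 leaves NMI and unlocks. */
static void resume_slaves(struct kdb_model *m)
{
	m->cpu1_parked_in_nmi = false;
	m->cpu1_holds_cpu0_rq_lock = false;
}

/* Models the timer expiring later: the wakeup is attempted only now. */
static bool timer_fires(const struct kdb_model *m)
{
	return m->work_queued && wake_worker(m);
}
```

In this model, queue_immediate() fails while CPU1 is parked with the lock held, and queue_delayed() followed by resume_slaves() and timer_fires() succeeds. The model assumes the timer expires after the other CPUs have resumed; it does not prove that a fixed delay guarantees this on real hardware.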

The new patch is as follows:

Index: drivers/tty/serial/kgdboc.c
===================================================================
--- drivers/tty/serial/kgdboc.c (revision 57862)
+++ drivers/tty/serial/kgdboc.c (working copy)
@@ -92,12 +92,12 @@
        mutex_unlock(&kgdboc_reset_mutex);
 }

-static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
+static DECLARE_DELAYED_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);

 static void kgdboc_restore_input(void)
 {
        if (likely(system_state == SYSTEM_RUNNING))
-               schedule_work(&kgdboc_restore_input_work);
+               schedule_delayed_work(&kgdboc_restore_input_work,2*HZ);
 }

 static int kgdboc_register_kbd(char **cptr)
@@ -128,7 +128,7 @@
                        i--;
                }
        }
-       flush_work(&kgdboc_restore_input_work);
+       flush_delayed_work(&kgdboc_restore_input_work);
 }
 #else /* ! CONFIG_KDB_KEYBOARD */
 #define kgdboc_register_kbd(x) 0
