* [PATCH 1/1] Revert "genirq: Remove the second parameter from handle_irq_event_percpu()"
@ 2016-01-13 10:31 zyjzyj2000
2016-01-13 13:07 ` Thomas Gleixner
2016-01-21 7:52 ` [V2 PATCH 1/1] genirq: fix desc->action become NULL error zyjzyj2000
0 siblings, 2 replies; 7+ messages in thread
From: zyjzyj2000 @ 2016-01-13 10:31 UTC
To: zyjzyj2000, tglx, linux-kernel
From: Zhu Yanjun <zyjzyj2000@gmail.com>
After this commit 71f64340fc0e ("genirq: Remove the second parameter
from handle_irq_event_percpu()") is applied, the variable action is
not protected by raw_spin_lock. The following calltrace will pop up.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff810a4991>] handle_irq_event_percpu+0x31/0x1c0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.0 #30
task: ffff88003d2ed040 ti: ffff88003d380000 task.ti: ffff88003d380000
RIP: 0010:[<ffffffff810a4991>] [<ffffffff810a4991>] handle_irq_event_percpu+0x31/0x1c0
RSP: 0018:ffff88003eb03ed8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000010003
RDX: 0000000080010003 RSI: 0000000000000000 RDI: ffff88003d02ac00
RBP: ffff88003eb03f10 R08: ffff88003d380000 R09: 0000000000000002
R10: 0000000000027e88 R11: 0000000000000282 R12: 0000000000000004
R13: ffff88003d02ac38 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88003eb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000001e0a000 CR4: 00000000000006e0
Stack:
ffff88003d02ac00 0000000000000007 ffff88003d02ac00 ffff88003d02acb4
ffff88003d02ac38 0000000000000034 0000000000000000 ffff88003eb03f38
ffffffff810a4b5c ffff88003d02ac00 ffff88003d02acb4 ffff88003d02ac38
Call Trace:
<IRQ>
[<ffffffff810a4b5c>] handle_irq_event+0x3c/0x60
[<ffffffff810a7c9f>] handle_edge_irq+0xcf/0x160
[<ffffffff810067ba>] handle_irq+0x1a/0x30
[<ffffffff819b0d37>] do_IRQ+0x57/0xf0
[<ffffffff819af1ff>] common_interrupt+0x7f/0x7f
<EOI>
[<ffffffff819ae192>] ? _raw_write_unlock_irq+0x12/0x30
[<ffffffff819ae1be>] _raw_spin_unlock_irq+0xe/0x10
[<ffffffff8107703a>] finish_task_switch+0x9a/0x1f0
[<ffffffff819aa375>] __schedule+0x3c5/0xb60
[<ffffffff819aac8f>] schedule+0x3f/0x90
[<ffffffff819aaf18>] schedule_preempt_disabled+0x18/0x30
[<ffffffff8108f2ec>] cpu_startup_entry+0x13c/0x320
[<ffffffff810379b1>] start_secondary+0xf1/0x100
RIP [<ffffffff810a4991>] handle_irq_event_percpu+0x31/0x1c0
RSP <ffff88003eb03ed8>
CR2: 0000000000000008
---[ end trace c62dc8f0b2aee0f5 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception in interrupt
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
kernel/irq/chip.c | 2 +-
kernel/irq/handle.c | 7 ++++---
kernel/irq/internals.h | 2 +-
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 5797909..ce483ac 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -692,7 +692,7 @@ void handle_percpu_irq(struct irq_desc *desc)
 	if (chip->irq_ack)
 		chip->irq_ack(&desc->irq_data);

-	handle_irq_event_percpu(desc);
+	handle_irq_event_percpu(desc, desc->action);

 	if (chip->irq_eoi)
 		chip->irq_eoi(&desc->irq_data);
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index a302cf9..e25a83b 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -132,11 +132,11 @@ void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action)
 	wake_up_process(action->thread);
 }

-irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)
+irqreturn_t
+handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
 {
 	irqreturn_t retval = IRQ_NONE;
 	unsigned int flags = 0, irq = desc->irq_data.irq;
-	struct irqaction *action = desc->action;

 	do {
 		irqreturn_t res;
@@ -184,13 +184,14 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)

 irqreturn_t handle_irq_event(struct irq_desc *desc)
 {
+	struct irqaction *action = desc->action;
 	irqreturn_t ret;

 	desc->istate &= ~IRQS_PENDING;
 	irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
 	raw_spin_unlock(&desc->lock);

-	ret = handle_irq_event_percpu(desc);
+	ret = handle_irq_event_percpu(desc, action);

 	raw_spin_lock(&desc->lock);
 	irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index fcab63c..25a2c9c 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -83,7 +83,7 @@ extern void irq_mark_irq(unsigned int irq);

 extern void init_kstat_irqs(struct irq_desc *desc, int node, int nr);

-irqreturn_t handle_irq_event_percpu(struct irq_desc *desc);
+irqreturn_t handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action);
 irqreturn_t handle_irq_event(struct irq_desc *desc);

 /* Resending of interrupts :*/
--
1.9.1
* Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from handle_irq_event_percpu()"
From: Thomas Gleixner @ 2016-01-13 13:07 UTC
To: zyjzyj2000; +Cc: LKML, Huang Shijie, Jiang Liu, Peter Zijlstra
On Wed, 13 Jan 2016, zyjzyj2000@gmail.com wrote:
> After this commit 71f64340fc0e ("genirq: Remove the second parameter
> from handle_irq_event_percpu()") is applied, the variable action is
> not protected by raw_spin_lock. The following calltrace will pop up.
Thanks for the report. I missed that detail when merging the patch!
Just for correctness' sake: you did not explain why this can happen.
It's not about the variable action, it's about desc->action not being
protected anymore. So the reason why this oopses is that the action is being
removed concurrently.
CPU 0                           CPU 1

free_irq()                      lock(desc)
  lock(desc)                    handle_edge_irq()
                                  handle_irq_event(desc)
                                    unlock(desc)
  desc->action = NULL               handle_irq_event_percpu(desc)
                                      action = desc->action
While the original code did:
free_irq()                      lock(desc)
  lock(desc)                    handle_edge_irq()
                                  handle_irq_event()
                                    action = desc->action
                                    unlock(desc)
  desc->action = NULL               handle_irq_event_percpu(desc, action)
So now the question is whether we revert that patch or simply change
handle_irq_event_percpu() to deal with that. Patch below.
That preserves the code size reduction of commit 71f64340fc0e. This is safe
because we either see a valid desc->action or NULL. If the action is about to
be removed, it is still valid, as free_irq() is blocked on synchronize_irq().
free_irq()                      lock(desc)
  lock(desc)                    handle_edge_irq()
                                  handle_irq_event(desc)
                                    set(INPROGRESS)
                                    unlock(desc)
                                    handle_irq_event_percpu(desc)
                                      action = desc->action
  desc->action = NULL
  synchronize_irq()
    while(INPROGRESS);              lock(desc)
                                    clr(INPROGRESS)
  free(action)
That's basically the same mechanism as we have for shared
interrupts. action->next can become NULL while handle_irq_event_percpu()
runs. Either it sees the action or NULL. It does not matter, because action
itself cannot go away.
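As an illustration only (this is not kernel code), the point about tolerating a NULL head can be sketched in plain userspace C. The names struct action, count_call() and run_actions() are hypothetical stand-ins for struct irqaction, a device handler and the handler loop. The sketch assumes, as argued above, that an action which has been seen cannot be freed while the loop runs.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct irqaction; not the kernel definition. */
struct action {
	int (*handler)(void *data);
	void *data;
	struct action *next;
};

/* Sample handler: bumps the int counter passed as data, reports "handled". */
static int count_call(void *data)
{
	++*(int *)data;
	return 1;
}

/*
 * Pre-tested loop: if head is already NULL, the body never runs and
 * nothing is dereferenced. A do {} while loop would dereference head
 * before testing it.
 */
static int run_actions(struct action *head)
{
	int handled = 0;
	struct action *a = head;	/* snapshot, like action = desc->action */

	while (a) {
		handled |= a->handler(a->data);
		a = a->next;
	}
	return handled;
}
```

Calling run_actions(NULL) returns 0 without touching any memory, while a chain of two actions invokes both handlers in order.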
Thanks,
tglx
8<-------------
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -136,9 +136,15 @@ irqreturn_t handle_irq_event_percpu(stru
 {
 	irqreturn_t retval = IRQ_NONE;
 	unsigned int flags = 0, irq = desc->irq_data.irq;
-	struct irqaction *action = desc->action;
+	struct irqaction *action;

-	do {
+	/*
+	 * READ_ONCE is not required here. The compiler cannot reload action
+	 * because it'll be action->next for the second iteration of the loop.
+	 */
+	action = desc->action;
+
+	while (action) {
 		irqreturn_t res;

 		trace_irq_handler_entry(irq, action);
@@ -173,7 +179,7 @@ irqreturn_t handle_irq_event_percpu(stru

 		retval |= res;
 		action = action->next;
-	} while (action);
+	}

 	add_interrupt_randomness(irq, flags);
* Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from handle_irq_event_percpu()"
From: Huang Shijie @ 2016-01-14 1:29 UTC
To: Thomas Gleixner; +Cc: zyjzyj2000, LKML, Jiang Liu, Peter Zijlstra, nd
I prefer this patch; reverting the old patch is not a good solution.
thanks
Huang Shijie
* [tip:irq/urgent] genirq: Validate action before dereferencing it in handle_irq_event_percpu()
From: tip-bot for Thomas Gleixner @ 2016-01-14 19:15 UTC
To: linux-tip-commits
Cc: peterz, shijie.huang, tglx, mingo, linux-kernel, jiang.liu, hpa
Commit-ID: 570540d50710ed192e98e2f7f74578c9486b6b05
Gitweb: http://git.kernel.org/tip/570540d50710ed192e98e2f7f74578c9486b6b05
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 13 Jan 2016 14:07:25 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 14 Jan 2016 20:09:49 +0100
genirq: Validate action before dereferencing it in handle_irq_event_percpu()
commit 71f64340fc0e changed the handling of irq_desc->action from
CPU 0                   CPU 1
free_irq()              lock(desc)
lock(desc)              handle_edge_irq()
                        if (desc->action) {
                        handle_irq_event()
                          action = desc->action
                          unlock(desc)
desc->action = NULL       handle_irq_event_percpu(desc, action)
                          action->xxx
to
CPU 0                   CPU 1
free_irq()              lock(desc)
lock(desc)              handle_edge_irq()
                        if (desc->action) {
                        handle_irq_event()
                          unlock(desc)
desc->action = NULL       handle_irq_event_percpu(desc)
                          action = desc->action
                          action->xxx
So if free_irq() manages to set the action to NULL between the unlock and
the readout, we happily dereference a NULL pointer.
We could simply revert 71f64340fc0e, but we want to preserve the better code
generation. A simple solution is to change the action loop from a do {} while
to a while {} loop.
This is safe because we either see a valid desc->action or NULL. If the action
is about to be removed it is still valid as free_irq() is blocked on
synchronize_irq().
CPU 0                   CPU 1
free_irq()              lock(desc)
lock(desc)              handle_edge_irq()
                        handle_irq_event(desc)
                          set(INPROGRESS)
                          unlock(desc)
                          handle_irq_event_percpu(desc)
                          action = desc->action
desc->action = NULL       while (action) {
                            action->xxx
                            ...
                            action = action->next;
synchronize_irq()
while(INPROGRESS);        lock(desc)
                          clr(INPROGRESS)
free(action)
That's basically the same mechanism as we have for shared
interrupts. action->next can become NULL while handle_irq_event_percpu()
runs. Either it sees the action or NULL. It does not matter, because action
itself cannot go away before the interrupt in progress flag has been cleared.
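The lifetime rule can be modelled in userspace C as a rough sketch. This is not the genirq implementation: struct desc, desc_init(), handle_event() and remove_action() are hypothetical names, an atomic_flag spinlock stands in for desc->lock, an atomic_bool for the INPROGRESS bit, and the busy-wait loop for synchronize_irq().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for irqaction and irq_desc. */
struct action {
	int calls;
};

struct desc {
	atomic_flag lock;		/* models desc->lock */
	atomic_bool in_progress;	/* models the INPROGRESS bit */
	struct action *_Atomic action;	/* models desc->action */
};

static void desc_init(struct desc *d, struct action *a)
{
	atomic_flag_clear(&d->lock);
	atomic_init(&d->in_progress, false);
	atomic_init(&d->action, a);
}

static void desc_lock(struct desc *d)
{
	while (atomic_flag_test_and_set(&d->lock))
		;	/* spin until the lock is free */
}

static void desc_unlock(struct desc *d)
{
	atomic_flag_clear(&d->lock);
}

/* Handler side: set INPROGRESS under the lock, drop it, then read action. */
static int handle_event(struct desc *d)
{
	struct action *a;
	int handled = 0;

	desc_lock(d);
	atomic_store(&d->in_progress, true);
	desc_unlock(d);

	a = atomic_load(&d->action);	/* either a live action or NULL */
	if (a) {
		a->calls++;
		handled = 1;
	}

	desc_lock(d);
	atomic_store(&d->in_progress, false);
	desc_unlock(d);
	return handled;
}

/* Removal side: unlink under the lock, wait for handlers, only then free. */
static void remove_action(struct desc *d)
{
	struct action *old;

	desc_lock(d);
	old = atomic_exchange(&d->action, NULL);
	desc_unlock(d);

	while (atomic_load(&d->in_progress))
		;	/* models synchronize_irq() */

	free(old);
}
```

In this model a handler that already read the pointer keeps in_progress set until it is done, so remove_action() cannot reach free() early. A handler that arrives after the unlink simply sees NULL and does nothing.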
Fixes: commit 71f64340fc0e "genirq: Remove the second parameter from handle_irq_event_percpu()"
Reported-by: zyjzyj2000@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Huang Shijie <shijie.huang@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1601131224190.3575@nanos
---
kernel/irq/handle.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index a302cf9..57bff78 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -138,7 +138,8 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)
 	unsigned int flags = 0, irq = desc->irq_data.irq;
 	struct irqaction *action = desc->action;

-	do {
+	/* action might have become NULL since we dropped the lock */
+	while (action) {
 		irqreturn_t res;

 		trace_irq_handler_entry(irq, action);
@@ -173,7 +174,7 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)

 		retval |= res;
 		action = action->next;
-	} while (action);
+	}

 	add_interrupt_randomness(irq, flags);
* Re: [PATCH 1/1] Revert "genirq: Remove the second parameter from handle_irq_event_percpu()"
From: zhuyj @ 2016-01-18 8:00 UTC
To: Huang Shijie, Thomas Gleixner; +Cc: LKML, Jiang Liu, Peter Zijlstra, nd
Hi, all
I tested this patch. So far, I have not found any similar problem.
Best Regards!
Zhu Yanjun
* [V2 PATCH 1/1] genirq: fix desc->action become NULL error
From: zyjzyj2000 @ 2016-01-21 7:52 UTC
To: zyjzyj2000, linux-kernel, jiang.liu, peterz, nd, tglx,
shijie.huang
Hi, all
Following the suggestions from Thomas Gleixner, I made a new patch
to fix this problem.
Changes:
Commit 71f64340fc0e is no longer reverted. Instead, a NULL test on
action is inserted.
Best Regards!
Zhu Yanjun
* [V2 PATCH 1/1] genirq: fix desc->action become NULL error
From: zyjzyj2000 @ 2016-01-21 7:52 UTC
To: zyjzyj2000, linux-kernel, jiang.liu, peterz, nd, tglx,
shijie.huang
From: Zhu Yanjun <zyjzyj2000@gmail.com>
After commit 71f64340fc0e ("genirq: Remove the second parameter
from handle_irq_event_percpu()") is applied, desc->action is no longer
protected by raw_spin_lock. The following call trace appears:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff810a4991>] handle_irq_event_percpu+0x31/0x1c0
...
Call Trace:
<IRQ>
[<ffffffff810a4b5c>] handle_irq_event+0x3c/0x60
[<ffffffff810a7c9f>] handle_edge_irq+0xcf/0x160
[<ffffffff810067ba>] handle_irq+0x1a/0x30
[<ffffffff819b0d37>] do_IRQ+0x57/0xf0
[<ffffffff819af1ff>] common_interrupt+0x7f/0x7f
<EOI>
[<ffffffff819ae192>] ? _raw_write_unlock_irq+0x12/0x30
[<ffffffff819ae1be>] _raw_spin_unlock_irq+0xe/0x10
[<ffffffff8107703a>] finish_task_switch+0x9a/0x1f0
[<ffffffff819aa375>] __schedule+0x3c5/0xb60
[<ffffffff819aac8f>] schedule+0x3f/0x90
[<ffffffff819aaf18>] schedule_preempt_disabled+0x18/0x30
[<ffffffff8108f2ec>] cpu_startup_entry+0x13c/0x320
[<ffffffff810379b1>] start_secondary+0xf1/0x100
RIP [<ffffffff810a4991>] handle_irq_event_percpu+0x31/0x1c0
...
The reason is as follows:
desc->action is not protected anymore, so the action can be removed
concurrently, just before the handler reads it.
CPU 0                           CPU 1

free_irq()                      lock(desc)
  lock(desc)                    handle_edge_irq()
                                  handle_irq_event(desc)
                                    unlock(desc)
  desc->action = NULL               handle_irq_event_percpu(desc)
                                      action = desc->action
Testing action before dereferencing it is safe because we either see a valid
desc->action or NULL. If the action is about to be removed, it is still valid,
as free_irq() is blocked on synchronize_irq().
free_irq()                      lock(desc)
  lock(desc)                    handle_edge_irq()
                                  handle_irq_event(desc)
                                    set(INPROGRESS)
                                    unlock(desc)
                                    handle_irq_event_percpu(desc)
                                      action = desc->action
  desc->action = NULL
  synchronize_irq()
    while(INPROGRESS);              lock(desc)
                                    clr(INPROGRESS)
  free(action)
That's basically the same mechanism as we have for shared
interrupts. The variable action->next can become NULL while
handle_irq_event_percpu() runs. Either it sees the action or
NULL. It does not matter, because action itself cannot go away.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
---
kernel/irq/handle.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index a302cf9..7510b72 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -136,9 +136,14 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)
 {
 	irqreturn_t retval = IRQ_NONE;
 	unsigned int flags = 0, irq = desc->irq_data.irq;
-	struct irqaction *action = desc->action;
+	struct irqaction *action;

-	do {
+	/*
+	 * READ_ONCE is not required here. The compiler cannot reload action
+	 * because it'll be action->next for the second iteration of the loop.
+	 */
+	action = desc->action;
+	while (action) {
 		irqreturn_t res;

 		trace_irq_handler_entry(irq, action);
@@ -173,7 +179,7 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc)

 		retval |= res;
 		action = action->next;
-	} while (action);
+	}

 	add_interrupt_randomness(irq, flags);
--
1.7.9.5