All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
@ 2013-12-23 14:39 Prarit Bhargava
  2013-12-24  2:51 ` Chen, Gong
  2013-12-27 22:01 ` H. Peter Anvin
  0 siblings, 2 replies; 8+ messages in thread
From: Prarit Bhargava @ 2013-12-23 14:39 UTC (permalink / raw
  To: linux-kernel
  Cc: Prarit Bhargava, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Gong Chen,
	Andi Kleen, H. Peter Anvin, x86, stable

Patch is against linux-tip.git and was tested on both linux.git and tip without
any issues.  As expected, the number of required vectors for the down'd cpu
drops from 202 to 181 on my test system (which has 509 vectors assigned in
total).

Many thanks to Gong Chen for catching this.

P.

----8<----

Gong Chen caught this coding error during inspection of the patch.  The
code should be AND not OR.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Janet Morgan <janet.morgan@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ruiv Wang <ruiv.wang@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: x86@kernel.org
Cc: <stable@vger.kernel.org>
---
 arch/x86/kernel/irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 7d40698..aed7acc 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -281,7 +281,7 @@ int check_irq_vectors_for_cpu_disable(void)
 			desc = irq_to_desc(irq);
 			data = irq_desc_get_irq_data(desc);
 			affinity = data->affinity;
-			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
+			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
 			    !cpumask_subset(affinity, cpu_online_mask))
 				this_count++;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-23 14:39 [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable() Prarit Bhargava
@ 2013-12-24  2:51 ` Chen, Gong
  2013-12-24 13:19   ` Prarit Bhargava
  2013-12-27 22:01 ` H. Peter Anvin
  1 sibling, 1 reply; 8+ messages in thread
From: Chen, Gong @ 2013-12-24  2:51 UTC (permalink / raw
  To: Prarit Bhargava
  Cc: linux-kernel, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Andi Kleen,
	H. Peter Anvin, x86, stable

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

On Mon, Dec 23, 2013 at 09:39:12AM -0500, Prarit Bhargava wrote:
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 7d40698..aed7acc 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -281,7 +281,7 @@ int check_irq_vectors_for_cpu_disable(void)
>  			desc = irq_to_desc(irq);
>  			data = irq_desc_get_irq_data(desc);
>  			affinity = data->affinity;
> -			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>  			    !cpumask_subset(affinity, cpu_online_mask))
>  				this_count++;
Hi, Prarit

I noticed that you don't mention another question I asked in last mail.

"It looks like cpu_online_mask will be updated until cpu_disable_common
is called, but your check_vectors is called before that."

native_cpu_disable
        cpu_disable_common
                remove_cpu_from_maps
                        /*
                         * until here, cpu_online_mask/cpu_online_bits
                         * is cleared
                         */
                        set_cpu_online(cpu, false);

Something I missed?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-24  2:51 ` Chen, Gong
@ 2013-12-24 13:19   ` Prarit Bhargava
  2013-12-25  2:40     ` Chen, Gong
  0 siblings, 1 reply; 8+ messages in thread
From: Prarit Bhargava @ 2013-12-24 13:19 UTC (permalink / raw
  To: linux-kernel, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Andi Kleen,
	H. Peter Anvin, x86, stable



On 12/23/2013 09:51 PM, Chen, Gong wrote:
> On Mon, Dec 23, 2013 at 09:39:12AM -0500, Prarit Bhargava wrote:
>> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
>> index 7d40698..aed7acc 100644
>> --- a/arch/x86/kernel/irq.c
>> +++ b/arch/x86/kernel/irq.c
>> @@ -281,7 +281,7 @@ int check_irq_vectors_for_cpu_disable(void)
>>  			desc = irq_to_desc(irq);
>>  			data = irq_desc_get_irq_data(desc);
>>  			affinity = data->affinity;
>> -			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
>> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>  			    !cpumask_subset(affinity, cpu_online_mask))
>>  				this_count++;
> Hi, Prarit
> 
> I noticed that you don't mention another question I asked in last mail.
> 
> "It looks like cpu_online_mask will be updated until cpu_disable_common
> is called, but your check_vectors is called before that."

Oh, I'm sorry ... Yes, check_irq_vectors_for_cpu_disable() is called before we
remove the CPU from the maps.  If there is an error then we have to do much less
clean up of the code.  I explicitly take into account the cpu that is being
downed into the check vectors code.


> 
> native_cpu_disable		
>         cpu_disable_common
>                 remove_cpu_from_maps
>                         /*
>                          * until here, cpu_online_mask/cpu_online_bits
>                          * is cleared
>                          */
>                         set_cpu_online(cpu, false);
> 
> Something I missed?

No -- are you pointing out a bug?

P.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-24 13:19   ` Prarit Bhargava
@ 2013-12-25  2:40     ` Chen, Gong
  2013-12-27 16:13       ` Prarit Bhargava
  0 siblings, 1 reply; 8+ messages in thread
From: Chen, Gong @ 2013-12-25  2:40 UTC (permalink / raw
  To: Prarit Bhargava
  Cc: linux-kernel, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Andi Kleen,
	H. Peter Anvin, x86, stable

[-- Attachment #1: Type: text/plain, Size: 3100 bytes --]

On Tue, Dec 24, 2013 at 08:19:09AM -0500, Prarit Bhargava wrote:
> On 12/23/2013 09:51 PM, Chen, Gong wrote:
> > On Mon, Dec 23, 2013 at 09:39:12AM -0500, Prarit Bhargava wrote:
> >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> >> index 7d40698..aed7acc 100644
> >> --- a/arch/x86/kernel/irq.c
> >> +++ b/arch/x86/kernel/irq.c
> >> @@ -281,7 +281,7 @@ int check_irq_vectors_for_cpu_disable(void)
> >>  			desc = irq_to_desc(irq);
> >>  			data = irq_desc_get_irq_data(desc);
> >>  			affinity = data->affinity;
> >> -			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
> >> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
> >>  			    !cpumask_subset(affinity, cpu_online_mask))
> >>  				this_count++;
> > Hi, Prarit
> > 
> > I noticed that you don't mention another question I asked in last mail.
> > 
> > "It looks like cpu_online_mask will be updated until cpu_disable_common
> > is called, but your check_vectors is called before that."
> 
> Oh, I'm sorry ... Yes, check_irq_vectors_for_cpu_disable() is called before we
> remove the CPU from the maps.  If there is an error then we have to do much less
> clean up of the code.  I explicitly take into account the cpu that is being
> downed into the check vectors code.
> 
Here is my question: How to decide this_count can be incrased?
1) it is a valid irq(irq_has_action) 2) it is not percpu irq(!irqd_is_per_cpu)
3) it is not shared with left online cpus(!cpumask_subset)

For item 3, I have some concerns.
Your current codes are called before cpu_disable_common, so affinity and
cpu_online_mask are both not updated. BTW, it means your calculation
for count is not correct because it concludes one to-be-off-lined cpu

+       for_each_online_cpu(cpu) {
+               if (cpu == smp_processor_id())
+                       continue;
+               for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS;
+                    vector++) {
+                       if (per_cpu(vector_irq, cpu)[vector] < 0)
+                               count++;
+               }
+       }

Back to my question, assume cpu1 will be off-lined and one irq affinity is
set as (1, 2) -- this irq will be bypassed. Looks good. But if one irq
affinity is set as only (1), -- this irq is bypassed, too. Not right!

Furthermore, you even can't use cpumask_subset as evaluation condition,
whatever affinity/cpu_online_mask is updated or not. Let me paste the
comment of cpumask_subset:
/**
 * cpumask_subset - (*src1p & ~*src2p) == 0
 * @src1p: the first input
 * @src2p: the second input
 *
 * Returns 1 if *@src1p is a subset of *@src2p, else returns 0
 */
Here we can see, even if src1p is an empty set, it still can be considered
as the subset of src2p. For our this special case, I mean cpu1 will
be off-lined and one irq affinity is set as (1). If this irq affinity
is updated to (0), it means no cpu is bound to this irq, but the
calculation of cpumask_subset will return true and this irq will be
bypassed. For this case, cpumask_empty should be more suitable.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-25  2:40     ` Chen, Gong
@ 2013-12-27 16:13       ` Prarit Bhargava
  2013-12-28  1:07         ` H. Peter Anvin
  0 siblings, 1 reply; 8+ messages in thread
From: Prarit Bhargava @ 2013-12-27 16:13 UTC (permalink / raw
  To: linux-kernel, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Andi Kleen,
	H. Peter Anvin, x86, stable



On 12/24/2013 09:40 PM, Chen, Gong wrote:
> On Tue, Dec 24, 2013 at 08:19:09AM -0500, Prarit Bhargava wrote:
>> On 12/23/2013 09:51 PM, Chen, Gong wrote:
>>> On Mon, Dec 23, 2013 at 09:39:12AM -0500, Prarit Bhargava wrote:
>>>> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
>>>> index 7d40698..aed7acc 100644
>>>> --- a/arch/x86/kernel/irq.c
>>>> +++ b/arch/x86/kernel/irq.c
>>>> @@ -281,7 +281,7 @@ int check_irq_vectors_for_cpu_disable(void)
>>>>  			desc = irq_to_desc(irq);
>>>>  			data = irq_desc_get_irq_data(desc);
>>>>  			affinity = data->affinity;
>>>> -			if (irq_has_action(irq) || !irqd_is_per_cpu(data) ||
>>>> +			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
>>>>  			    !cpumask_subset(affinity, cpu_online_mask))
>>>>  				this_count++;
>>> Hi, Prarit
>>>
>>> I noticed that you don't mention another question I asked in last mail.
>>>
>>> "It looks like cpu_online_mask will be updated until cpu_disable_common
>>> is called, but your check_vectors is called before that."
>>
>> Oh, I'm sorry ... Yes, check_irq_vectors_for_cpu_disable() is called before we
>> remove the CPU from the maps.  If there is an error then we have to do much less
>> clean up of the code.  I explicitly take into account the cpu that is being
>> downed into the check vectors code.
>>
> Here is my question: How to decide this_count can be incrased?
> 1) it is a valid irq(irq_has_action) 2) it is not percpu irq(!irqd_is_per_cpu)
> 3) it is not shared with left online cpus(!cpumask_subset)
> 
> For item 3, I have some concerns.
> Your current codes are called before cpu_disable_common, so affinity and
> cpu_online_mask are both not updated. BTW, it means your calculation
> for count is not correct because it concludes one to-be-off-lined cpu
> 
> +       for_each_online_cpu(cpu) {
> +               if (cpu == smp_processor_id())
> +                       continue;
> +               for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS;
> +                    vector++) {
> +                       if (per_cpu(vector_irq, cpu)[vector] < 0)
> +                               count++;
> +               }
> +       }
> 
> Back to my question, assume cpu1 will be off-lined and one irq affinity is
> set as (1, 2) -- this irq will be bypassed. Looks good. But if one irq
> affinity is set as only (1), -- this irq is bypassed, too. Not right!

Oh, yes, this is a bug.  ... and as you point out ...

> 
> Furthermore, you even can't use cpumask_subset as evaluation condition,
> whatever affinity/cpu_online_mask is updated or not. Let me paste the
> comment of cpumask_subset:
> /**
>  * cpumask_subset - (*src1p & ~*src2p) == 0
>  * @src1p: the first input
>  * @src2p: the second input
>  *
>  * Returns 1 if *@src1p is a subset of *@src2p, else returns 0
>  */
> Here we can see, even if src1p is an empty set, it still can be considered
> as the subset of src2p. For our this special case, I mean cpu1 will
> be off-lined and one irq affinity is set as (1). If this irq affinity
> is updated to (0), it means no cpu is bound to this irq, but the
> calculation of cpumask_subset will return true and this irq will be
> bypassed. For this case, cpumask_empty should be more suitable.

I definitely should have used cpumask_empty().  I'll retest ...

P.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-23 14:39 [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable() Prarit Bhargava
  2013-12-24  2:51 ` Chen, Gong
@ 2013-12-27 22:01 ` H. Peter Anvin
  1 sibling, 0 replies; 8+ messages in thread
From: H. Peter Anvin @ 2013-12-27 22:01 UTC (permalink / raw
  To: Prarit Bhargava, linux-kernel
  Cc: Michel Lespinasse, Seiji Aguchi, Yang Zhang, Paul Gortmaker,
	Janet Morgan, Tony Luck, Ruiv Wang, Gong Chen, Andi Kleen, x86,
	stable

On 12/23/2013 06:39 AM, Prarit Bhargava wrote:
> Patch is against linux-tip.git and was tested on both linux.git and tip without
> any issues.  As expected, the number of required vectors for the down'd cpu
> drops from 202 to 181 on my test system (which has 509 vectors assigned in
> total).
> 
> Many thanks to Gong Chen for catching this.
> 
> P.
> 
> ----8<----
> 
> Gong Chen caught this coding error during inspection of the patch.  The
> code should be AND not OR.
> 

Please include a description that is understandable by someone who isn't
familiar with the back story to this patch.  The patch descriptions
become part of the permanent record of the history of the Linux kernel
and are critical to it being understandable.

	-hpa


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-27 16:13       ` Prarit Bhargava
@ 2013-12-28  1:07         ` H. Peter Anvin
  2013-12-28 15:58           ` Prarit Bhargava
  0 siblings, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2013-12-28  1:07 UTC (permalink / raw
  To: Prarit Bhargava, linux-kernel, Michel Lespinasse, Seiji Aguchi,
	Yang Zhang, Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang,
	Andi Kleen, x86, stable

On 12/27/2013 08:13 AM, Prarit Bhargava wrote:
>>
>> Back to my question, assume cpu1 will be off-lined and one irq affinity is
>> set as (1, 2) -- this irq will be bypassed. Looks good. But if one irq
>> affinity is set as only (1), -- this irq is bypassed, too. Not right!
> 
> Oh, yes, this is a bug.  ... and as you point out ...
> 

Does this mean the patch that is currently in my tree should not be
pushed to Linus?  It sounds like that to me...

	-hpa


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable()
  2013-12-28  1:07         ` H. Peter Anvin
@ 2013-12-28 15:58           ` Prarit Bhargava
  0 siblings, 0 replies; 8+ messages in thread
From: Prarit Bhargava @ 2013-12-28 15:58 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: linux-kernel, Michel Lespinasse, Seiji Aguchi, Yang Zhang,
	Paul Gortmaker, Janet Morgan, Tony Luck, Ruiv Wang, Andi Kleen,
	x86, stable



On 12/27/2013 08:07 PM, H. Peter Anvin wrote:
> On 12/27/2013 08:13 AM, Prarit Bhargava wrote:
>>>
>>> Back to my question, assume cpu1 will be off-lined and one irq affinity is
>>> set as (1, 2) -- this irq will be bypassed. Looks good. But if one irq
>>> affinity is set as only (1), -- this irq is bypassed, too. Not right!
>>
>> Oh, yes, this is a bug.  ... and as you point out ...
>>
> 
> Does this mean the patch that is currently in my tree should not be
> pushed to Linus?  It sounds like that to me...

Yes, hpa -- please drop the patch.  I will resubmit an updated patch after testing.

P.

> 
> 	-hpa
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-12-28 15:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-23 14:39 [PATCH] x86, irq, fix logical AND/OR error in check_irq_vectors_for_cpu_disable() Prarit Bhargava
2013-12-24  2:51 ` Chen, Gong
2013-12-24 13:19   ` Prarit Bhargava
2013-12-25  2:40     ` Chen, Gong
2013-12-27 16:13       ` Prarit Bhargava
2013-12-28  1:07         ` H. Peter Anvin
2013-12-28 15:58           ` Prarit Bhargava
2013-12-27 22:01 ` H. Peter Anvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.