All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
@ 2015-07-12  8:03 Jörg Otte
  2015-07-12 16:33 ` Jörg Otte
  2015-07-12 16:52   ` Linus Torvalds
  0 siblings, 2 replies; 15+ messages in thread
From: Jörg Otte @ 2015-07-12  8:03 UTC (permalink / raw)
  To: David Airlie, Linus Torvalds, Daniel Vetter, dri-devel,
	Linux Kernel Mailing List

4.2.0-rc1-00201-g59c3cb5 introducued a null pointer derefence and a
system freeze
when Xorg is started ( 4.2.0-rc1-00062-gc4b5fd3 was fine) :

BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
PGD 0
Oops: 0000 [#1] SMP
CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6
Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
task: ffff8802149d6c00 ti: ffff880206df4000 task.ti: ffff880206df4000
RIP: 0010:[<ffffffffbd3447bb>]  [<ffffffffbd3447bb>] 0xffffffffbd3447bb
RSP: 0018:ffff880206df7b08  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88021578f480 RCX: ffff88021578f4d0
RDX: 0000000000000000 RSI: ffff88021630b000 RDI: ffff880214a68000
RBP: ffff88021630b000 R08: ffff88021578f4e0 R09: ffff88021578f4f0
R10: 0000000000003c18 R11: 00000000fffffff2 R12: ffff880214a68000
R13: ffff88021634e800 R14: 0000000000000000 R15: 0000000000000000
FS:  00007ff3caa60880(0000) GS:ffff88021f280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000009 CR3: 0000000206e07000 CR4: 00000000001407e0
Stack:
 ffff880200010000 ffff880200010000 ffff880200000000 ffff880200000001
 ffff88021578f500 ffffffffbd2df135 ffff880213f71c00 ffff880214a68000
 0000000000000000 ffff880214a70000 0000000000000001 ffff880214a68000
Call Trace:
 [<ffffffffbd2df135>] ? 0xffffffffbd2df135
 [<ffffffffbd2b6ca8>] ? 0xffffffffbd2b6ca8
 [<ffffffffbd33cc7e>] ? 0xffffffffbd33cc7e
 [<ffffffffbd343673>] ? 0xffffffffbd343673
 [<ffffffffbd2d0728>] ? 0xffffffffbd2d0728
 [<ffffffffbd2d088e>] ? 0xffffffffbd2d088e
 [<ffffffffbd2d10c5>] ? 0xffffffffbd2d10c5
 [<ffffffffbd2c6976>] ? 0xffffffffbd2c6976
 [<ffffffffbd2d0fe0>] ? 0xffffffffbd2d0fe0
 [<ffffffffbd0c6a1f>] ? 0xffffffffbd0c6a1f
 [<ffffffffbd0e79e1>] ? 0xffffffffbd0e79e1
 [<ffffffffbd0e7ed1>] ? 0xffffffffbd0e7ed1
 [<ffffffffbd6df557>] ? 0xffffffffbd6df557
Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44
24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41>
80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00
RIP  [<ffffffffbd3447bb>] 0xffffffffbd3447bb
 RSP <ffff880206df7b08>
CR2: 0000000000000009
---[ end trace dd0931f7f0d43d12 ] ---

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-12  8:03 [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference Jörg Otte
@ 2015-07-12 16:33 ` Jörg Otte
  2015-07-12 16:52   ` Linus Torvalds
  1 sibling, 0 replies; 15+ messages in thread
From: Jörg Otte @ 2015-07-12 16:33 UTC (permalink / raw)
  To: David Airlie, Linus Torvalds, Daniel Vetter, dri-devel,
	Linux Kernel Mailing List, Ander Conselvan de Oliveira

2015-07-12 10:03 GMT+02:00 Jörg Otte <jrg.otte@gmail.com>:
> 4.2.0-rc1-00201-g59c3cb5 introducued a null pointer derefence and a
> system freeze
> when Xorg is started ( 4.2.0-rc1-00062-gc4b5fd3 was fine) :
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> PGD 0
> Oops: 0000 [#1] SMP
> CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6
> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
> task: ffff8802149d6c00 ti: ffff880206df4000 task.ti: ffff880206df4000
> RIP: 0010:[<ffffffffbd3447bb>]  [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> RSP: 0018:ffff880206df7b08  EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff88021578f480 RCX: ffff88021578f4d0
> RDX: 0000000000000000 RSI: ffff88021630b000 RDI: ffff880214a68000
> RBP: ffff88021630b000 R08: ffff88021578f4e0 R09: ffff88021578f4f0
> R10: 0000000000003c18 R11: 00000000fffffff2 R12: ffff880214a68000
> R13: ffff88021634e800 R14: 0000000000000000 R15: 0000000000000000
> FS:  00007ff3caa60880(0000) GS:ffff88021f280000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000009 CR3: 0000000206e07000 CR4: 00000000001407e0
> Stack:
>  ffff880200010000 ffff880200010000 ffff880200000000 ffff880200000001
>  ffff88021578f500 ffffffffbd2df135 ffff880213f71c00 ffff880214a68000
>  0000000000000000 ffff880214a70000 0000000000000001 ffff880214a68000
> Call Trace:
>  [<ffffffffbd2df135>] ? 0xffffffffbd2df135
>  [<ffffffffbd2b6ca8>] ? 0xffffffffbd2b6ca8
>  [<ffffffffbd33cc7e>] ? 0xffffffffbd33cc7e
>  [<ffffffffbd343673>] ? 0xffffffffbd343673
>  [<ffffffffbd2d0728>] ? 0xffffffffbd2d0728
>  [<ffffffffbd2d088e>] ? 0xffffffffbd2d088e
>  [<ffffffffbd2d10c5>] ? 0xffffffffbd2d10c5
>  [<ffffffffbd2c6976>] ? 0xffffffffbd2c6976
>  [<ffffffffbd2d0fe0>] ? 0xffffffffbd2d0fe0
>  [<ffffffffbd0c6a1f>] ? 0xffffffffbd0c6a1f
>  [<ffffffffbd0e79e1>] ? 0xffffffffbd0e79e1
>  [<ffffffffbd0e7ed1>] ? 0xffffffffbd0e7ed1
>  [<ffffffffbd6df557>] ? 0xffffffffbd6df557
> Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44
> 24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41>
> 80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00
> RIP  [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>  RSP <ffff880206df7b08>
> CR2: 0000000000000009
> ---[ end trace dd0931f7f0d43d12 ] ---

I can fix the problem for me by reverting:

commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jul 7 11:15:47 2015 +0200

    drm/i915: Use crtc_state->active in primary check_plane func
    Since
    commit 8c7b5ccb729870e606321b3703e2c2e698c49a95
    Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
    Date:   Tue Apr 21 17:13:19 2015 +0300
    drm/i915: Use atomic helpers for computing changed flags

Thanks, Jörg

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-12  8:03 [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference Jörg Otte
@ 2015-07-12 16:52   ` Linus Torvalds
  2015-07-12 16:52   ` Linus Torvalds
  1 sibling, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2015-07-12 16:52 UTC (permalink / raw)
  To: Jörg Otte, Daniel Vetter
  Cc: David Airlie, DRI, Linux Kernel Mailing List, Maarten Lankhorst

On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb

Ugh. Please enable KALLSYMS to get sane symbols.

But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
so it's pretty clearly just that change frm

-       if (intel_crtc->active) {
+       if (crtc_state->base.active) {

and "crtc_state" is NULL.

And the code very much knows that crtc_state can be NULL, since it's
initialized with

        crtc_state = state->base.state ?
                intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;

Tssk. Daniel? Should I just revert that commit dec4f799d0a4
("drm/i915: Use crtc_state->active in primary check_plane func") for
now, or is there a better fix? Like just checking crtc_state for NULL?

                    Linus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
@ 2015-07-12 16:52   ` Linus Torvalds
  0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2015-07-12 16:52 UTC (permalink / raw)
  To: Jörg Otte, Daniel Vetter; +Cc: Linux Kernel Mailing List, DRI

On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb

Ugh. Please enable KALLSYMS to get sane symbols.

But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
so it's pretty clearly just that change frm

-       if (intel_crtc->active) {
+       if (crtc_state->base.active) {

and "crtc_state" is NULL.

And the code very much knows that crtc_state can be NULL, since it's
initialized with

        crtc_state = state->base.state ?
                intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;

Tssk. Daniel? Should I just revert that commit dec4f799d0a4
("drm/i915: Use crtc_state->active in primary check_plane func") for
now, or is there a better fix? Like just checking crtc_state for NULL?

                    Linus
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-12 16:52   ` Linus Torvalds
@ 2015-07-13  5:56     ` Maarten Lankhorst
  -1 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-13  5:56 UTC (permalink / raw)
  To: Linus Torvalds, Jörg Otte, Daniel Vetter
  Cc: David Airlie, DRI, Linux Kernel Mailing List

Op 12-07-15 om 18:52 schreef Linus Torvalds:
> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> Ugh. Please enable KALLSYMS to get sane symbols.
>
> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
> so it's pretty clearly just that change frm
>
> -       if (intel_crtc->active) {
> +       if (crtc_state->base.active) {
>
> and "crtc_state" is NULL.
>
> And the code very much knows that crtc_state can be NULL, since it's
> initialized with
>
>         crtc_state = state->base.state ?
>                 intel_atomic_get_crtc_state(state->base.state,
> intel_crtc) : NULL;
>
> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
> ("drm/i915: Use crtc_state->active in primary check_plane func") for
> now, or is there a better fix? Like just checking crtc_state for NULL?
>
>                     Linus
More symbols would be nice.

With the transitional helpers when crtc_state == NULL you don't want to update the scalers or funny things happen.
Fix is probably something like this:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..830e07b23a15 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane,
 	if (ret)
 		return ret;
 
-	if (crtc_state->base.active) {
+	if (crtc_state ? crtc_state->base.active || crtc->state->active) {
 		struct intel_plane_state *old_state =
 			to_intel_plane_state(plane->state);
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
@ 2015-07-13  5:56     ` Maarten Lankhorst
  0 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-13  5:56 UTC (permalink / raw)
  To: Linus Torvalds, Jörg Otte, Daniel Vetter
  Cc: Linux Kernel Mailing List, DRI

Op 12-07-15 om 18:52 schreef Linus Torvalds:
> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> Ugh. Please enable KALLSYMS to get sane symbols.
>
> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
> so it's pretty clearly just that change frm
>
> -       if (intel_crtc->active) {
> +       if (crtc_state->base.active) {
>
> and "crtc_state" is NULL.
>
> And the code very much knows that crtc_state can be NULL, since it's
> initialized with
>
>         crtc_state = state->base.state ?
>                 intel_atomic_get_crtc_state(state->base.state,
> intel_crtc) : NULL;
>
> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
> ("drm/i915: Use crtc_state->active in primary check_plane func") for
> now, or is there a better fix? Like just checking crtc_state for NULL?
>
>                     Linus
More symbols would be nice.

With the transitional helpers when crtc_state == NULL you don't want to update the scalers or funny things happen.
Fix is probably something like this:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..830e07b23a15 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane,
 	if (ret)
 		return ret;
 
-	if (crtc_state->base.active) {
+	if (crtc_state ? crtc_state->base.active || crtc->state->active) {
 		struct intel_plane_state *old_state =
 			to_intel_plane_state(plane->state);
 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-12 16:52   ` Linus Torvalds
@ 2015-07-13  6:22     ` Daniel Vetter
  -1 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2015-07-13  6:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jörg Otte, Daniel Vetter, David Airlie, DRI,
	Linux Kernel Mailing List, Maarten Lankhorst

On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
> > IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> 
> Ugh. Please enable KALLSYMS to get sane symbols.
> 
> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
> so it's pretty clearly just that change frm
> 
> -       if (intel_crtc->active) {
> +       if (crtc_state->base.active) {
> 
> and "crtc_state" is NULL.
> 
> And the code very much knows that crtc_state can be NULL, since it's
> initialized with
> 
>         crtc_state = state->base.state ?
>                 intel_atomic_get_crtc_state(state->base.state,
> intel_crtc) : NULL;
> 
> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
> ("drm/i915: Use crtc_state->active in primary check_plane func") for
> now, or is there a better fix? Like just checking crtc_state for NULL?

Indeed embarrassing. I've missed that we still have 1 caller left that's
using the transitional helpers, and those don't fill out
plane_state->state backpointers to the global atomic update since there is
no global atomic update for transitional helpers. Below diff should fix
this - we need to preferentially check crts_state->active and if that's
not set intel_crtc->active should yield the right result for the one
remaining caller (it's in the crtc_disable paths).

For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
to transition to a different QA team ("we'll do this in 1 week without
upfront planing") I essentially don't have proper QA support for 1-2
months by now. The other trouble in this area specifically is that this
code is already completely changed in -next again, so any testing done on
integration trees (like -next or drm-intel-nightly) won't test any patches
for 4.2.
-Daniel

Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
decide to apply this right away.
---
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..85ac6d85dc39 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane,
 	if (ret)
 		return ret;
 
-	if (crtc_state->base.active) {
+	if (crtc_state ? crtc_state->base.active : intel_crtc->active) {
 		struct intel_plane_state *old_state =
 			to_intel_plane_state(plane->state);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
@ 2015-07-13  6:22     ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2015-07-13  6:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Vetter, Linux Kernel Mailing List, DRI, Jörg Otte

On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
> > IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
> 
> Ugh. Please enable KALLSYMS to get sane symbols.
> 
> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
> so it's pretty clearly just that change frm
> 
> -       if (intel_crtc->active) {
> +       if (crtc_state->base.active) {
> 
> and "crtc_state" is NULL.
> 
> And the code very much knows that crtc_state can be NULL, since it's
> initialized with
> 
>         crtc_state = state->base.state ?
>                 intel_atomic_get_crtc_state(state->base.state,
> intel_crtc) : NULL;
> 
> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
> ("drm/i915: Use crtc_state->active in primary check_plane func") for
> now, or is there a better fix? Like just checking crtc_state for NULL?

Indeed embarrassing. I've missed that we still have 1 caller left that's
using the transitional helpers, and those don't fill out
plane_state->state backpointers to the global atomic update since there is
no global atomic update for transitional helpers. Below diff should fix
this - we need to preferentially check crts_state->active and if that's
not set intel_crtc->active should yield the right result for the one
remaining caller (it's in the crtc_disable paths).

For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
to transition to a different QA team ("we'll do this in 1 week without
upfront planing") I essentially don't have proper QA support for 1-2
months by now. The other trouble in this area specifically is that this
code is already completely changed in -next again, so any testing done on
integration trees (like -next or drm-intel-nightly) won't test any patches
for 4.2.
-Daniel

Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
decide to apply this right away.
---
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..85ac6d85dc39 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane,
 	if (ret)
 		return ret;
 
-	if (crtc_state->base.active) {
+	if (crtc_state ? crtc_state->base.active : intel_crtc->active) {
 		struct intel_plane_state *old_state =
 			to_intel_plane_state(plane->state);
 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-13  6:22     ` Daniel Vetter
@ 2015-07-13  7:23       ` Maarten Lankhorst
  -1 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-13  7:23 UTC (permalink / raw)
  To: Linus Torvalds, Jörg Otte, David Airlie, DRI,
	Linux Kernel Mailing List

Op 13-07-15 om 08:22 schreef Daniel Vetter:
> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>> Ugh. Please enable KALLSYMS to get sane symbols.
>>
>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>> so it's pretty clearly just that change frm
>>
>> -       if (intel_crtc->active) {
>> +       if (crtc_state->base.active) {
>>
>> and "crtc_state" is NULL.
>>
>> And the code very much knows that crtc_state can be NULL, since it's
>> initialized with
>>
>>         crtc_state = state->base.state ?
>>                 intel_atomic_get_crtc_state(state->base.state,
>> intel_crtc) : NULL;
>>
>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>> now, or is there a better fix? Like just checking crtc_state for NULL?
> Indeed embarrassing. I've missed that we still have 1 caller left that's
> using the transitional helpers, and those don't fill out
> plane_state->state backpointers to the global atomic update since there is
> no global atomic update for transitional helpers. Below diff should fix
> this - we need to preferentially check crts_state->active and if that's
> not set intel_crtc->active should yield the right result for the one
> remaining caller (it's in the crtc_disable paths).
>
> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
> to transition to a different QA team ("we'll do this in 1 week without
> upfront planing") I essentially don't have proper QA support for 1-2
> months by now. The other trouble in this area specifically is that this
> code is already completely changed in -next again, so any testing done on
> integration trees (like -next or drm-intel-nightly) won't test any patches
> for 4.2.
> -Daniel
>
> Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
> decide to apply this right away.
>
Well your version has the benefit of compiling without errors. :-)

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
@ 2015-07-13  7:23       ` Maarten Lankhorst
  0 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-13  7:23 UTC (permalink / raw)
  To: Linus Torvalds, Jörg Otte, David Airlie, DRI,
	Linux Kernel Mailing List

Op 13-07-15 om 08:22 schreef Daniel Vetter:
> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>> Ugh. Please enable KALLSYMS to get sane symbols.
>>
>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>> so it's pretty clearly just that change frm
>>
>> -       if (intel_crtc->active) {
>> +       if (crtc_state->base.active) {
>>
>> and "crtc_state" is NULL.
>>
>> And the code very much knows that crtc_state can be NULL, since it's
>> initialized with
>>
>>         crtc_state = state->base.state ?
>>                 intel_atomic_get_crtc_state(state->base.state,
>> intel_crtc) : NULL;
>>
>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>> now, or is there a better fix? Like just checking crtc_state for NULL?
> Indeed embarrassing. I've missed that we still have 1 caller left that's
> using the transitional helpers, and those don't fill out
> plane_state->state backpointers to the global atomic update since there is
> no global atomic update for transitional helpers. Below diff should fix
> this - we need to preferentially check crts_state->active and if that's
> not set intel_crtc->active should yield the right result for the one
> remaining caller (it's in the crtc_disable paths).
>
> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
> to transition to a different QA team ("we'll do this in 1 week without
> upfront planing") I essentially don't have proper QA support for 1-2
> months by now. The other trouble in this area specifically is that this
> code is already completely changed in -next again, so any testing done on
> integration trees (like -next or drm-intel-nightly) won't test any patches
> for 4.2.
> -Daniel
>
> Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
> decide to apply this right away.
>
Well your version has the benefit of compiling without errors. :-)

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-13  7:23       ` Maarten Lankhorst
  (?)
@ 2015-07-13  7:42       ` Jörg Otte
  2015-07-13  7:58         ` Maarten Lankhorst
  -1 siblings, 1 reply; 15+ messages in thread
From: Jörg Otte @ 2015-07-13  7:42 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List

2015-07-13 9:23 GMT+02:00 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>:
> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>
>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>> so it's pretty clearly just that change frm
>>>
>>> -       if (intel_crtc->active) {
>>> +       if (crtc_state->base.active) {
>>>
>>> and "crtc_state" is NULL.
>>>
>>> And the code very much knows that crtc_state can be NULL, since it's
>>> initialized with
>>>
>>>         crtc_state = state->base.state ?
>>>                 intel_atomic_get_crtc_state(state->base.state,
>>> intel_crtc) : NULL;
>>>
>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>> using the transitional helpers, and those don't fill out
>> plane_state->state backpointers to the global atomic update since there is
>> no global atomic update for transitional helpers. Below diff should fix
>> this - we need to preferentially check crts_state->active and if that's
>> not set intel_crtc->active should yield the right result for the one
>> remaining caller (it's in the crtc_disable paths).
>>
>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>> to transition to a different QA team ("we'll do this in 1 week without
>> upfront planing") I essentially don't have proper QA support for 1-2
>> months by now. The other trouble in this area specifically is that this
>> code is already completely changed in -next again, so any testing done on
>> integration trees (like -next or drm-intel-nightly) won't test any patches
>> for 4.2.
>> -Daniel
>>
>> Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
>> decide to apply this right away.
>>
> Well your version has the benefit of compiling without errors. :-)
>
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Just noticed another problem:
On each resume I get the following error:
-----------[ cut here ]------------
WARNING: CPU: 2 PID: 2663 at
/data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
0xffffffff9a33d5e9()
WARN_ON(!crtc->state->enable)
CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
orkqueue: events_unbound 0xffffffff9a055750
0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000
ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000
ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664
all Trace:
[<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2
[<ffffffff9a03c416>] ? 0xffffffff9a03c416
[<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5
[<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9
[<ffffffff9a343ac3>] ? 0xffffffff9a343ac3
[<ffffffff9a34444a>] ? 0xffffffff9a34444a
[<ffffffff9a345518>] ? 0xffffffff9a345518
[<ffffffff9a3246f0>] ? 0xffffffff9a3246f0
[<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8
[<ffffffff9a236170>] ? 0xffffffff9a236170
[<ffffffff9a38b28d>] ? 0xffffffff9a38b28d
[<ffffffff9a38b784>] ? 0xffffffff9a38b784
[<ffffffff9a38baa4>] ? 0xffffffff9a38baa4
[<ffffffff9a05577d>] ? 0xffffffff9a05577d
[<ffffffff9a04dc47>] ? 0xffffffff9a04dc47
[<ffffffff9a04dfab>] ? 0xffffffff9a04dfab
[<ffffffff9a04dea0>] ? 0xffffffff9a04dea0
[<ffffffff9a05331c>] ? 0xffffffff9a05331c
[<ffffffff9a053260>] ? 0xffffffff9a053260
[<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f
[<ffffffff9a053260>] ? 0xffffffff9a053260
--[ end trace 1b6d28ee34071679 ]---

Nervertheless resume works, so it doesn't hurt me.


BTW: I get also up to 40..50!  compile warnings like:
i915/i915_drv.h: In function 'i915_debugfs_connector_add':
i915/i915_drv.h:3119:53: warning: no return statement in function
returning non-void [-Wreturn-type]

which may cause yet uncovered troubles.

Thanks, Jörg

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-13  7:42       ` Jörg Otte
@ 2015-07-13  7:58         ` Maarten Lankhorst
  2015-07-13  8:50           ` Jörg Otte
  0 siblings, 1 reply; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-13  7:58 UTC (permalink / raw)
  To: Jörg Otte
  Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List

Op 13-07-15 om 09:42 schreef Jörg Otte:
> 2015-07-13 9:23 GMT+02:00 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>:
>> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>>>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>>
>>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>>> so it's pretty clearly just that change frm
>>>>
>>>> -       if (intel_crtc->active) {
>>>> +       if (crtc_state->base.active) {
>>>>
>>>> and "crtc_state" is NULL.
>>>>
>>>> And the code very much knows that crtc_state can be NULL, since it's
>>>> initialized with
>>>>
>>>>         crtc_state = state->base.state ?
>>>>                 intel_atomic_get_crtc_state(state->base.state,
>>>> intel_crtc) : NULL;
>>>>
>>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>>> using the transitional helpers, and those don't fill out
>>> plane_state->state backpointers to the global atomic update since there is
>>> no global atomic update for transitional helpers. Below diff should fix
>>> this - we need to preferentially check crts_state->active and if that's
>>> not set intel_crtc->active should yield the right result for the one
>>> remaining caller (it's in the crtc_disable paths).
>>>
>>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>>> to transition to a different QA team ("we'll do this in 1 week without
>>> upfront planing") I essentially don't have proper QA support for 1-2
>>> months by now. The other trouble in this area specifically is that this
>>> code is already completely changed in -next again, so any testing done on
>>> integration trees (like -next or drm-intel-nightly) won't test any patches
>>> for 4.2.
>>> -Daniel
>>>
>>> Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
>>> decide to apply this right away.
>>>
>> Well your version has the benefit of compiling without errors. :-)
>>
>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Just noticed another problem:
> On each resume I get the following error:
> -----------[ cut here ]------------
> WARNING: CPU: 2 PID: 2663 at
> /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
> 0xffffffff9a33d5e9()
> WARN_ON(!crtc->state->enable)
> CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
> ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
> orkqueue: events_unbound 0xffffffff9a055750
> 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000
> ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000
> ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664
> all Trace:
> [<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2
> [<ffffffff9a03c416>] ? 0xffffffff9a03c416
> [<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5
> [<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9
> [<ffffffff9a343ac3>] ? 0xffffffff9a343ac3
> [<ffffffff9a34444a>] ? 0xffffffff9a34444a
> [<ffffffff9a345518>] ? 0xffffffff9a345518
> [<ffffffff9a3246f0>] ? 0xffffffff9a3246f0
> [<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8
> [<ffffffff9a236170>] ? 0xffffffff9a236170
> [<ffffffff9a38b28d>] ? 0xffffffff9a38b28d
> [<ffffffff9a38b784>] ? 0xffffffff9a38b784
> [<ffffffff9a38baa4>] ? 0xffffffff9a38baa4
> [<ffffffff9a05577d>] ? 0xffffffff9a05577d
> [<ffffffff9a04dc47>] ? 0xffffffff9a04dc47
> [<ffffffff9a04dfab>] ? 0xffffffff9a04dfab
> [<ffffffff9a04dea0>] ? 0xffffffff9a04dea0
> [<ffffffff9a05331c>] ? 0xffffffff9a05331c
> [<ffffffff9a053260>] ? 0xffffffff9a053260
> [<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f
> [<ffffffff9a053260>] ? 0xffffffff9a053260
> --[ end trace 1b6d28ee34071679 ]---
>
> Nervertheless resume works, so it doesn't hurt me.
>
>
> BTW: I get also up to 40..50!  compile warnings like:
> i915/i915_drv.h: In function 'i915_debugfs_connector_add':
> i915/i915_drv.h:3119:53: warning: no return statement in function
> returning non-void [-Wreturn-type]
>
> which may cause yet uncovered troubles.
>
> Thanks, Jörg
kallsyms please!

Looks like intel_crtc_disable being called with a mode change on a already disabled crtc, it's gone in 4.3 because of the atomic rework.

Does something like below work?

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..725d2b727704 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
 	struct drm_connector *connector;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/* crtc should still be enabled when we disable it. */
-	WARN_ON(!crtc->state->enable);
-
 	intel_crtc_disable_planes(crtc);
 	dev_priv->display.crtc_disable(crtc);
 	dev_priv->display.off(crtc);
@@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
 			continue;
 
 		if (!crtc_state->enable) {
-			intel_crtc_disable(crtc);
+			if (crtc->state->enable)
+				intel_crtc_disable(crtc);
 		} else if (crtc->state->enable) {
 			intel_crtc_disable_planes(crtc);
 			dev_priv->display.crtc_disable(crtc);


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
  2015-07-13  7:58         ` Maarten Lankhorst
@ 2015-07-13  8:50           ` Jörg Otte
  2015-07-14 11:00               ` Maarten Lankhorst
  0 siblings, 1 reply; 15+ messages in thread
From: Jörg Otte @ 2015-07-13  8:50 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List

2015-07-13 9:58 GMT+02:00 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>:
> Op 13-07-15 om 09:42 schreef Jörg Otte:
>> 2015-07-13 9:23 GMT+02:00 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>:
>>> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>>>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte <jrg.otte@gmail.com> wrote:
>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>>>>> IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
>>>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>>>
>>>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>>>> so it's pretty clearly just that change frm
>>>>>
>>>>> -       if (intel_crtc->active) {
>>>>> +       if (crtc_state->base.active) {
>>>>>
>>>>> and "crtc_state" is NULL.
>>>>>
>>>>> And the code very much knows that crtc_state can be NULL, since it's
>>>>> initialized with
>>>>>
>>>>>         crtc_state = state->base.state ?
>>>>>                 intel_atomic_get_crtc_state(state->base.state,
>>>>> intel_crtc) : NULL;
>>>>>
>>>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>>>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>>>> using the transitional helpers, and those don't fill out
>>>> plane_state->state backpointers to the global atomic update since there is
>>>> no global atomic update for transitional helpers. Below diff should fix
>>>> this - we need to preferentially check crts_state->active and if that's
>>>> not set intel_crtc->active should yield the right result for the one
>>>> remaining caller (it's in the crtc_disable paths).
>>>>
>>>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>>>> to transition to a different QA team ("we'll do this in 1 week without
>>>> upfront planing") I essentially don't have proper QA support for 1-2
>>>> months by now. The other trouble in this area specifically is that this
>>>> code is already completely changed in -next again, so any testing done on
>>>> integration trees (like -next or drm-intel-nightly) won't test any patches
>>>> for 4.2.
>>>> -Daniel
>>>>
>>>> Oh and Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> in case you
>>>> decide to apply this right away.
>>>>
>>> Well your version has the benefit of compiling without errors. :-)
>>>
>>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Just noticed another problem:
>> On each resume I get the following error:
>> -----------[ cut here ]------------
>> WARNING: CPU: 2 PID: 2663 at
>> /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
>> 0xffffffff9a33d5e9()
>> WARN_ON(!crtc->state->enable)
>> CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
>> ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
>> orkqueue: events_unbound 0xffffffff9a055750
>> 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000
>> ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000
>> ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664
>> all Trace:
>> [<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2
>> [<ffffffff9a03c416>] ? 0xffffffff9a03c416
>> [<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5
>> [<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9
>> [<ffffffff9a343ac3>] ? 0xffffffff9a343ac3
>> [<ffffffff9a34444a>] ? 0xffffffff9a34444a
>> [<ffffffff9a345518>] ? 0xffffffff9a345518
>> [<ffffffff9a3246f0>] ? 0xffffffff9a3246f0
>> [<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8
>> [<ffffffff9a236170>] ? 0xffffffff9a236170
>> [<ffffffff9a38b28d>] ? 0xffffffff9a38b28d
>> [<ffffffff9a38b784>] ? 0xffffffff9a38b784
>> [<ffffffff9a38baa4>] ? 0xffffffff9a38baa4
>> [<ffffffff9a05577d>] ? 0xffffffff9a05577d
>> [<ffffffff9a04dc47>] ? 0xffffffff9a04dc47
>> [<ffffffff9a04dfab>] ? 0xffffffff9a04dfab
>> [<ffffffff9a04dea0>] ? 0xffffffff9a04dea0
>> [<ffffffff9a05331c>] ? 0xffffffff9a05331c
>> [<ffffffff9a053260>] ? 0xffffffff9a053260
>> [<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f
>> [<ffffffff9a053260>] ? 0xffffffff9a053260
>> --[ end trace 1b6d28ee34071679 ]---
>>
>> Nervertheless resume works, so it doesn't hurt me.
>>
>>
>> BTW: I get also up to 40..50!  compile warnings like:
>> i915/i915_drv.h: In function 'i915_debugfs_connector_add':
>> i915/i915_drv.h:3119:53: warning: no return statement in function
>> returning non-void [-Wreturn-type]
>>
>> which may cause yet uncovered troubles.
>>
>> Thanks, Jörg
> kallsyms please!
>
> Looks like intel_crtc_disable being called with a mode change on a already disabled crtc, it's gone in 4.3 because of the atomic rework.
>
> Does something like below work?
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index ba9321998a41..725d2b727704 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
>         struct drm_connector *connector;
>         struct drm_i915_private *dev_priv = dev->dev_private;
>
> -       /* crtc should still be enabled when we disable it. */
> -       WARN_ON(!crtc->state->enable);
> -
>         intel_crtc_disable_planes(crtc);
>         dev_priv->display.crtc_disable(crtc);
>         dev_priv->display.off(crtc);
> @@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
>                         continue;
>
>                 if (!crtc_state->enable) {
> -                       intel_crtc_disable(crtc);
> +                       if (crtc->state->enable)
> +                               intel_crtc_disable(crtc);
>                 } else if (crtc->state->enable) {
>                         intel_crtc_disable_planes(crtc);
>                         dev_priv->display.crtc_disable(crtc);
>
The patch works for me.

Thanks, Jörg

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] drm/i915: Do not call intel_crtc_disable if the crtc is already disabled.
  2015-07-13  8:50           ` Jörg Otte
@ 2015-07-14 11:00               ` Maarten Lankhorst
  0 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-14 11:00 UTC (permalink / raw)
  To: Jörg Otte
  Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List

When resuming with dpms off, the following warn can happen:

[  118.334082] ------------[ cut here ]------------
[  118.334105] WARNING: CPU: 2 PID: 2274 at drivers/gpu/drm/i915/intel_display.c:6319 __intel_set_mode+0xae5/0xb90 [i915]()
[  118.334106] WARN_ON(!crtc->state->enable)
[  118.334137] Modules linked in: i915
[  118.334139] CPU: 2 PID: 2274 Comm: kworker/u16:117 Not tainted 4.2.0-rc2-fixes+ #4148
[  118.334140] Hardware name: LENOVO 2349AV8/2349AV8, BIOS G1ETA5WW (2.65 ) 04/15/2014
[  118.334144] Workqueue: events_unbound async_run_entry_fn
[  118.334147]  ffffffffc017eef0 ffff8800ada93998 ffffffff817aa62a 0000000080000001
[  118.334149]  ffff8800ada939e8 ffff8800ada939d8 ffffffff810807e1 ffff8800ada939c8
[  118.334151]  ffff8800cea3b3d8 0000000000000000 ffff8800ad86b008 ffff880117705668
[  118.334151] Call Trace:
[  118.334155]  [<ffffffff817aa62a>] dump_stack+0x4f/0x7b
[  118.334157]  [<ffffffff810807e1>] warn_slowpath_common+0x81/0xc0
[  118.334158]  [<ffffffff81080861>] warn_slowpath_fmt+0x41/0x50
[  118.334173]  [<ffffffffc0120375>] __intel_set_mode+0xae5/0xb90 [i915]
[  118.334188]  [<ffffffffc0121312>] ? intel_modeset_compute_config+0x52/0xb40 [i915]
[  118.334191]  [<ffffffff8144de53>] ? drm_atomic_set_fb_for_plane+0x63/0x80
[  118.334205]  [<ffffffffc01269d9>] intel_set_mode+0x29/0x60 [i915]
[  118.334219]  [<ffffffffc012730a>] intel_crtc_restore_mode+0x13a/0x1f0 [i915]
[  118.334232]  [<ffffffffc0101160>] ? gen6_write16+0x250/0x250 [i915]
[  118.334246]  [<ffffffffc01283ec>] intel_modeset_setup_hw_state+0x89c/0xcd0 [i915]
[  118.334248]  [<ffffffff8137d260>] ? pci_pm_thaw+0x90/0x90
[  118.334255]  [<ffffffffc00ac11b>] i915_drm_resume+0xcb/0x160 [i915]
[  118.334262]  [<ffffffffc00ac1d2>] i915_pm_resume+0x22/0x30 [i915]
[  118.334263]  [<ffffffff8137d2c3>] pci_pm_resume+0x63/0xa0
[  118.334266]  [<ffffffff81467550>] dpm_run_callback+0x70/0x420
[  118.334267]  [<ffffffff81467cbd>] device_resume+0x9d/0x1c0
[  118.334269]  [<ffffffff814673d0>] ? initcall_debug_start+0x60/0x60
[  118.334270]  [<ffffffff81467dfc>] async_resume+0x1c/0x50
[  118.334271]  [<ffffffff810a6a94>] async_run_entry_fn+0x34/0xd0
[  118.334273]  [<ffffffff8109d4ad>] process_one_work+0x1dd/0x7e0
[  118.334275]  [<ffffffff8109d41a>] ? process_one_work+0x14a/0x7e0
[  118.334276]  [<ffffffff8109daf9>] worker_thread+0x49/0x450
[  118.334278]  [<ffffffff8109dab0>] ? process_one_work+0x7e0/0x7e0
[  118.334280]  [<ffffffff810a3cb9>] kthread+0xf9/0x110
[  118.334282]  [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90
[  118.334284]  [<ffffffff817b414f>] ret_from_fork+0x3f/0x70
[  118.334286]  [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90
[  118.334287] ---[ end trace 01f2cf6371b82d7a ]---

This warn is harmless, and can be fixed by not calling intel_crtc_disable when
the crtc is already disabled.

Reported-and-Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85ac6d85dc39..30e0f54ba19d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
 	struct drm_connector *connector;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/* crtc should still be enabled when we disable it. */
-	WARN_ON(!crtc->state->enable);
-
 	intel_crtc_disable_planes(crtc);
 	dev_priv->display.crtc_disable(crtc);
 	dev_priv->display.off(crtc);
@@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
 			continue;
 
 		if (!crtc_state->enable) {
-			intel_crtc_disable(crtc);
+			if (crtc->state->enable)
+				intel_crtc_disable(crtc);
 		} else if (crtc->state->enable) {
 			intel_crtc_disable_planes(crtc);
 			dev_priv->display.crtc_disable(crtc);


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH] drm/i915: Do not call intel_crtc_disable if the crtc is already disabled.
@ 2015-07-14 11:00               ` Maarten Lankhorst
  0 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2015-07-14 11:00 UTC (permalink / raw)
  To: Jörg Otte; +Cc: Linus Torvalds, Linux Kernel Mailing List, DRI

When resuming with dpms off, the following warn can happen:

[  118.334082] ------------[ cut here ]------------
[  118.334105] WARNING: CPU: 2 PID: 2274 at drivers/gpu/drm/i915/intel_display.c:6319 __intel_set_mode+0xae5/0xb90 [i915]()
[  118.334106] WARN_ON(!crtc->state->enable)
[  118.334137] Modules linked in: i915
[  118.334139] CPU: 2 PID: 2274 Comm: kworker/u16:117 Not tainted 4.2.0-rc2-fixes+ #4148
[  118.334140] Hardware name: LENOVO 2349AV8/2349AV8, BIOS G1ETA5WW (2.65 ) 04/15/2014
[  118.334144] Workqueue: events_unbound async_run_entry_fn
[  118.334147]  ffffffffc017eef0 ffff8800ada93998 ffffffff817aa62a 0000000080000001
[  118.334149]  ffff8800ada939e8 ffff8800ada939d8 ffffffff810807e1 ffff8800ada939c8
[  118.334151]  ffff8800cea3b3d8 0000000000000000 ffff8800ad86b008 ffff880117705668
[  118.334151] Call Trace:
[  118.334155]  [<ffffffff817aa62a>] dump_stack+0x4f/0x7b
[  118.334157]  [<ffffffff810807e1>] warn_slowpath_common+0x81/0xc0
[  118.334158]  [<ffffffff81080861>] warn_slowpath_fmt+0x41/0x50
[  118.334173]  [<ffffffffc0120375>] __intel_set_mode+0xae5/0xb90 [i915]
[  118.334188]  [<ffffffffc0121312>] ? intel_modeset_compute_config+0x52/0xb40 [i915]
[  118.334191]  [<ffffffff8144de53>] ? drm_atomic_set_fb_for_plane+0x63/0x80
[  118.334205]  [<ffffffffc01269d9>] intel_set_mode+0x29/0x60 [i915]
[  118.334219]  [<ffffffffc012730a>] intel_crtc_restore_mode+0x13a/0x1f0 [i915]
[  118.334232]  [<ffffffffc0101160>] ? gen6_write16+0x250/0x250 [i915]
[  118.334246]  [<ffffffffc01283ec>] intel_modeset_setup_hw_state+0x89c/0xcd0 [i915]
[  118.334248]  [<ffffffff8137d260>] ? pci_pm_thaw+0x90/0x90
[  118.334255]  [<ffffffffc00ac11b>] i915_drm_resume+0xcb/0x160 [i915]
[  118.334262]  [<ffffffffc00ac1d2>] i915_pm_resume+0x22/0x30 [i915]
[  118.334263]  [<ffffffff8137d2c3>] pci_pm_resume+0x63/0xa0
[  118.334266]  [<ffffffff81467550>] dpm_run_callback+0x70/0x420
[  118.334267]  [<ffffffff81467cbd>] device_resume+0x9d/0x1c0
[  118.334269]  [<ffffffff814673d0>] ? initcall_debug_start+0x60/0x60
[  118.334270]  [<ffffffff81467dfc>] async_resume+0x1c/0x50
[  118.334271]  [<ffffffff810a6a94>] async_run_entry_fn+0x34/0xd0
[  118.334273]  [<ffffffff8109d4ad>] process_one_work+0x1dd/0x7e0
[  118.334275]  [<ffffffff8109d41a>] ? process_one_work+0x14a/0x7e0
[  118.334276]  [<ffffffff8109daf9>] worker_thread+0x49/0x450
[  118.334278]  [<ffffffff8109dab0>] ? process_one_work+0x7e0/0x7e0
[  118.334280]  [<ffffffff810a3cb9>] kthread+0xf9/0x110
[  118.334282]  [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90
[  118.334284]  [<ffffffff817b414f>] ret_from_fork+0x3f/0x70
[  118.334286]  [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90
[  118.334287] ---[ end trace 01f2cf6371b82d7a ]---

This warn is harmless, and can be fixed by not calling intel_crtc_disable when
the crtc is already disabled.

Reported-and-Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85ac6d85dc39..30e0f54ba19d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
 	struct drm_connector *connector;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/* crtc should still be enabled when we disable it. */
-	WARN_ON(!crtc->state->enable);
-
 	intel_crtc_disable_planes(crtc);
 	dev_priv->display.crtc_disable(crtc);
 	dev_priv->display.off(crtc);
@@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
 			continue;
 
 		if (!crtc_state->enable) {
-			intel_crtc_disable(crtc);
+			if (crtc->state->enable)
+				intel_crtc_disable(crtc);
 		} else if (crtc->state->enable) {
 			intel_crtc_disable_planes(crtc);
 			dev_priv->display.crtc_disable(crtc);

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-07-14 11:00 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-12  8:03 [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference Jörg Otte
2015-07-12 16:33 ` Jörg Otte
2015-07-12 16:52 ` Linus Torvalds
2015-07-12 16:52   ` Linus Torvalds
2015-07-13  5:56   ` Maarten Lankhorst
2015-07-13  5:56     ` Maarten Lankhorst
2015-07-13  6:22   ` Daniel Vetter
2015-07-13  6:22     ` Daniel Vetter
2015-07-13  7:23     ` Maarten Lankhorst
2015-07-13  7:23       ` Maarten Lankhorst
2015-07-13  7:42       ` Jörg Otte
2015-07-13  7:58         ` Maarten Lankhorst
2015-07-13  8:50           ` Jörg Otte
2015-07-14 11:00             ` [PATCH] drm/i915: Do not call intel_crtc_disable if the crtc is already disabled Maarten Lankhorst
2015-07-14 11:00               ` Maarten Lankhorst

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.