* 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
@ 2009-12-15 16:27 Peter Palfrader
  2009-12-22 11:47 ` Peter Palfrader
  2009-12-22 12:04 ` Andi Kleen
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Palfrader @ 2009-12-15 16:27 UTC
  To: linux-kernel; +Cc: DSA

Hi,

we tried to upgrade a couple of our ProLiant servers from 2.6.31.6 to
2.6.32.1.

On two of our DL385g1 servers we had problems booting 2.6.32.1, as they
panicked.

One of them eventually booted correctly once we decided to log its
serial console output; the same approach did not help with the other
box.


[    5.304749] BUG: unable to handle kernel NULL pointer dereference at 000000000000001f
..
[    5.308739] Call Trace:
[    5.308739]  [<ffffffff810c3840>] kstrdup+0x40/0x70
[    5.308739]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
[    5.308739]  [<ffffffff8115121d>] create_dir+0x3d/0xc0
[    5.308739]  [<ffffffff81090af1>] ? autoremove_wake_function+0x11/0x40
[    5.308739]  [<ffffffff811512d4>] sysfs_create_dir+0x34/0x50
[    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
[    5.308739]  [<ffffffff8138e961>] kobject_add_internal+0xe1/0x1e0
[    5.308739]  [<ffffffff8138eb78>] kobject_add_varg+0x38/0x60
[    5.308739]  [<ffffffff8138ec15>] kobject_init_and_add+0x75/0x90
[    5.308739]  [<ffffffff81150560>] ? sysfs_ilookup_test+0x0/0x20
[    5.308739]  [<ffffffff8115082d>] ? sysfs_find_dirent+0x2d/0x40
[    5.308739]  [<ffffffff81150ec1>] ? sysfs_addrm_finish+0x21/0x250
[    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
[    5.308739]  [<ffffffff810e6fe4>] ? kmem_cache_alloc+0x84/0xc0
[    5.308739]  [<ffffffff814238d4>] bus_add_driver+0x94/0x260
[    5.308739]  [<ffffffff81424ed9>] driver_register+0x79/0x160
[    5.308739]  [<ffffffff815a28a3>] __hid_register_driver+0x43/0x80
[    5.308739]  [<ffffffff81a3d7ff>] ? gyration_init+0x0/0x1b
[    5.308739]  [<ffffffff81a3d818>] gyration_init+0x19/0x1b
[    5.308739]  [<ffffffff81009048>] do_one_initcall+0x38/0x1a0
[    5.308739]  [<ffffffff81a0e6b5>] kernel_init+0x172/0x1ca
[    5.308739]  [<ffffffff81036a0a>] child_rip+0xa/0x20
[    5.308739]  [<ffffffff81a0e543>] ? kernel_init+0x0/0x1ca
[    5.308739]  [<ffffffff81036a00>] ? child_rip+0x0/0x20

The trace above is from the machine that reliably fails to boot.
The complete serial console output is at
http://asteria.noreply.org/~weasel/volatile/2009-12-15-1VAB84BxJzE/ravel
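
A note on the wording of the BUG line: the fault address 0x1f is not literally NULL. The x86 page-fault handler reports any bad kernel address below one page as a "NULL pointer dereference" and anything higher as a "paging request", so a small value like this usually means something was read at a small offset from a NULL (or nearly NULL) pointer. A minimal Python sketch of that classification, assuming 4 KiB pages; `PAGE_SIZE` and `describe_fault` are names made up here for illustration, not kernel identifiers:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86-64


def describe_fault(address: int) -> str:
    """Mimic how the oops header words a bad kernel address."""
    if address < PAGE_SIZE:
        return "NULL pointer dereference"
    return "paging request"


# Address taken from the BUG line in this report.
fault = int("000000000000001f", 16)
print(hex(fault), "->", describe_fault(fault))
```

This says nothing about which pointer was bad; it only explains why an address of 0x1f is labelled as a NULL dereference.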




What I caught on the other box, the one that eventually did boot, is
similar but not identical:
[   19.028333] Call Trace:
[   19.028333]  [<ffffffff81150560>] ? sysfs_ilookup_test+0x0/0x20
[   19.028333]  [<ffffffff810c3840>] kstrdup+0x40/0x70
[   19.028333]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
[   19.028333]  [<ffffffff81150b17>] ? sysfs_add_one+0x27/0xd0
[   19.028333]  [<ffffffff81151bf7>] sysfs_do_create_link+0x87/0x160
[   19.028333]  [<ffffffff81151cee>] sysfs_create_link+0xe/0x10
[   19.028333]  [<ffffffff81422072>] device_add+0x272/0x730
[   19.028333]  [<ffffffff8139779e>] ? kvasprintf+0x6e/0x90
[   19.028333]  [<ffffffff81422549>] device_register+0x19/0x20
[   19.028333]  [<ffffffff8142262c>] device_create_vargs+0xdc/0xf0
[   19.028333]  [<ffffffff8142268b>] device_create+0x4b/0x50
[   19.028333]  [<ffffffff813e9702>] ? extract_entropy+0xe2/0x140
[   19.028333]  [<ffffffff813f573f>] misc_register+0xbf/0x180
[   19.028333]  [<ffffffff8107a4e0>] ? init_oops_id+0x0/0x40
[   19.028333]  [<ffffffff81a2626b>] ? pm_qos_power_init+0x0/0xe1
[   19.028333]  [<ffffffff81a262a3>] pm_qos_power_init+0x38/0xe1
[   19.028333]  [<ffffffff81009048>] do_one_initcall+0x38/0x1a0
[   19.028333]  [<ffffffff81a0e6b5>] kernel_init+0x172/0x1ca
[   19.028333]  [<ffffffff81036a0a>] child_rip+0xa/0x20
[   19.028333]  [<ffffffff81a0e543>] ? kernel_init+0x0/0x1ca
[   19.028333]  [<ffffffff81036a00>] ? child_rip+0x0/0x20

The corresponding console output is at
http://asteria.noreply.org/~weasel/volatile/2009-12-15-1VAB84BxJzE/klecker-bad

See
http://asteria.noreply.org/~weasel/volatile/2009-12-15-1VAB84BxJzE/klecker-good
for the output during a successful boot.
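
To compare the two traces mechanically, something like the following Python sketch could be used. It keeps only the frames that are not prefixed with `?` (those are addresses found on the stack that the unwinder could not confirm as part of the call chain) and lists the symbols both traces share. The regular expression, the function name `reliable_frames` and the abbreviated excerpts are assumptions made for this illustration:

```python
import re

# Matches lines such as
# "[    5.308739]  [<ffffffff810c3840>] ? kstrdup+0x40/0x70"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace: str) -> list:
    """Return the symbol names of frames that are not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            frames.append(m.group("sym"))
    return frames


# Abbreviated excerpts of the two traces quoted above.
RAVEL = """\
[    5.308739]  [<ffffffff810c3840>] kstrdup+0x40/0x70
[    5.308739]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
[    5.308739]  [<ffffffff8115121d>] create_dir+0x3d/0xc0
[    5.308739]  [<ffffffff81090af1>] ? autoremove_wake_function+0x11/0x40
[    5.308739]  [<ffffffff811512d4>] sysfs_create_dir+0x34/0x50
"""
KLECKER = """\
[   19.028333]  [<ffffffff81150560>] ? sysfs_ilookup_test+0x0/0x20
[   19.028333]  [<ffffffff810c3840>] kstrdup+0x40/0x70
[   19.028333]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
[   19.028333]  [<ffffffff81150b17>] ? sysfs_add_one+0x27/0xd0
[   19.028333]  [<ffffffff81151bf7>] sysfs_do_create_link+0x87/0x160
"""

a, b = reliable_frames(RAVEL), reliable_frames(KLECKER)
common = [sym for sym in a if sym in b]
print(common)
```

On these excerpts the shared frames are `kstrdup` and `sysfs_new_dirent`, at the same return addresses in both traces, while the callers below them differ.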

The config file can be found at
http://asteria.noreply.org/~weasel/volatile/2009-12-15-1VAB84BxJzE/config-2.6.32.1-dsa-amd64


Cheers,
Peter
-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-15 16:27 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f Peter Palfrader
@ 2009-12-22 11:47 ` Peter Palfrader
  2009-12-22 12:04 ` Andi Kleen
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Palfrader @ 2009-12-22 11:47 UTC
  To: linux-kernel, DSA

On Tue, 15 Dec 2009, Peter Palfrader wrote:

> we tried to upgrade a couple of our ProLiant servers from 2.6.31.6 to
> 2.6.32.1.
> 
> On two of our DL385g1 servers we had problems booting 2.6.32.1, as they
> panicked.

Several more do not boot .32 reliably.  Anything I can try?

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-15 16:27 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f Peter Palfrader
  2009-12-22 11:47 ` Peter Palfrader
@ 2009-12-22 12:04 ` Andi Kleen
  2009-12-22 18:33   ` Peter Palfrader
  1 sibling, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2009-12-22 12:04 UTC
  To: linux-kernel; +Cc: DSA, linux-input

Peter Palfrader <weasel@debian.org> writes:


> [    5.304749] BUG: unable to handle kernel NULL pointer dereference at 000000000000001f
> ..
> [    5.308739] Call Trace:
> [    5.308739]  [<ffffffff810c3840>] kstrdup+0x40/0x70
> [    5.308739]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
> [    5.308739]  [<ffffffff8115121d>] create_dir+0x3d/0xc0
> [    5.308739]  [<ffffffff81090af1>] ? autoremove_wake_function+0x11/0x40
> [    5.308739]  [<ffffffff811512d4>] sysfs_create_dir+0x34/0x50
> [    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
> [    5.308739]  [<ffffffff8138e961>] kobject_add_internal+0xe1/0x1e0
> [    5.308739]  [<ffffffff8138eb78>] kobject_add_varg+0x38/0x60
> [    5.308739]  [<ffffffff8138ec15>] kobject_init_and_add+0x75/0x90
> [    5.308739]  [<ffffffff81150560>] ? sysfs_ilookup_test+0x0/0x20
> [    5.308739]  [<ffffffff8115082d>] ? sysfs_find_dirent+0x2d/0x40
> [    5.308739]  [<ffffffff81150ec1>] ? sysfs_addrm_finish+0x21/0x250
> [    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
> [    5.308739]  [<ffffffff810e6fe4>] ? kmem_cache_alloc+0x84/0xc0
> [    5.308739]  [<ffffffff814238d4>] bus_add_driver+0x94/0x260
> [    5.308739]  [<ffffffff81424ed9>] driver_register+0x79/0x160
> [    5.308739]  [<ffffffff815a28a3>] __hid_register_driver+0x43/0x80
> [    5.308739]  [<ffffffff81a3d7ff>] ? gyration_init+0x0/0x1b
> [    5.308739]  [<ffffffff81a3d818>] gyration_init+0x19/0x1b

Seems to be caused by the "gyration driver", whatever that is. Do you
have such a USB device?

It could be some module mismatch. It looks suspicious, and from a
quick look the gyration driver does nothing bad in that init path. Try a
make clean and remove/rebuild/reinstall all the modules on the target
system.

If that doesn't help perhaps disable CONFIG_HID_GYRATION,
but from your other oops something more seems to be broken anyways.

> [    5.308739]  [<ffffffff81009048>] do_one_initcall+0x38/0x1a0
> [    5.308739]  [<ffffffff81a0e6b5>] kernel_init+0x172/0x1ca
> [    5.308739]  [<ffffffff81036a0a>] child_rip+0xa/0x20
> [    5.308739]  [<ffffffff81a0e543>] ? kernel_init+0x0/0x1ca
> [    5.308739]  [<ffffffff81036a00>] ? child_rip+0x0/0x20

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-22 12:04 ` Andi Kleen
@ 2009-12-22 18:33   ` Peter Palfrader
  2009-12-22 18:42     ` Andi Kleen
  2009-12-22 18:57     ` Peter Palfrader
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Palfrader @ 2009-12-22 18:33 UTC
  To: Andi Kleen; +Cc: linux-kernel, DSA, linux-input

On Tue, 22 Dec 2009, Andi Kleen wrote:

> > [    5.304749] BUG: unable to handle kernel NULL pointer dereference at 000000000000001f
> > ..
> > [    5.308739] Call Trace:
> > [    5.308739]  [<ffffffff810c3840>] kstrdup+0x40/0x70
> > [    5.308739]  [<ffffffff81150d77>] sysfs_new_dirent+0xf7/0x110
> > [    5.308739]  [<ffffffff8115121d>] create_dir+0x3d/0xc0
> > [    5.308739]  [<ffffffff81090af1>] ? autoremove_wake_function+0x11/0x40
> > [    5.308739]  [<ffffffff811512d4>] sysfs_create_dir+0x34/0x50
> > [    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
> > [    5.308739]  [<ffffffff8138e961>] kobject_add_internal+0xe1/0x1e0
> > [    5.308739]  [<ffffffff8138eb78>] kobject_add_varg+0x38/0x60
> > [    5.308739]  [<ffffffff8138ec15>] kobject_init_and_add+0x75/0x90
> > [    5.308739]  [<ffffffff81150560>] ? sysfs_ilookup_test+0x0/0x20
> > [    5.308739]  [<ffffffff8115082d>] ? sysfs_find_dirent+0x2d/0x40
> > [    5.308739]  [<ffffffff81150ec1>] ? sysfs_addrm_finish+0x21/0x250
> > [    5.308739]  [<ffffffff8138e7ea>] ? kobject_get+0x1a/0x30
> > [    5.308739]  [<ffffffff810e6fe4>] ? kmem_cache_alloc+0x84/0xc0
> > [    5.308739]  [<ffffffff814238d4>] bus_add_driver+0x94/0x260
> > [    5.308739]  [<ffffffff81424ed9>] driver_register+0x79/0x160
> > [    5.308739]  [<ffffffff815a28a3>] __hid_register_driver+0x43/0x80
> > [    5.308739]  [<ffffffff81a3d7ff>] ? gyration_init+0x0/0x1b
> > [    5.308739]  [<ffffffff81a3d818>] gyration_init+0x19/0x1b
> 
> Seems to be caused by the "gyration driver", whatever that is. Do you
> have such a USB device?

Doubtful.

> It could be some module mismatch. It looks suspicious, and from a
> quick look the gyration driver does nothing bad in that init path. Try a
> make clean and remove/rebuild/reinstall all the modules on the target
> system.
> 
> If that doesn't help perhaps disable CONFIG_HID_GYRATION,
> but from your other oops something more seems to be broken anyways.

This is a static kernel - no module support.  Anyway, I also tried
without CONFIG_USB_HID (which pulls in all the other HID_* things) but
no luck.

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-22 18:33   ` Peter Palfrader
@ 2009-12-22 18:42     ` Andi Kleen
  2009-12-22 19:01       ` Peter Palfrader
  2009-12-22 18:57     ` Peter Palfrader
  1 sibling, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2009-12-22 18:42 UTC
  To: Andi Kleen, linux-kernel, DSA, linux-input

> This is a static kernel - no module support.  Anyway, I also tried
> without CONFIG_USB_HID (which pulls in all the other HID_* things) but
> no luck.

Try a make distclean + rebuild anyways.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-22 18:33   ` Peter Palfrader
  2009-12-22 18:42     ` Andi Kleen
@ 2009-12-22 18:57     ` Peter Palfrader
  2009-12-24 13:04       ` Peter Palfrader
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Palfrader @ 2009-12-22 18:57 UTC
  To: Andi Kleen, linux-kernel, DSA, linux-input

On Tue, 22 Dec 2009, Peter Palfrader wrote:

> > If that doesn't help perhaps disable CONFIG_HID_GYRATION,
> > but from your other oops something more seems to be broken anyways.
> 
> This is a static kernel - no module support.  Anyway, I also tried
> without CONFIG_USB_HID (which pulls in all the other HID_* things) but
> no luck.

However, disabling all of HID (CONFIG_HID_SUPPORT=n) makes the system
boot (Previously HID, HIDRAW and HID_SUPPORT were still enabled).
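
For anyone wanting to check which HID options a given build actually has enabled, a small Python sketch along these lines can summarise a `.config`. It relies on the usual convention that enabled symbols appear as `CONFIG_FOO=y` and disabled ones as `# CONFIG_FOO is not set`; the function name `hid_options` and the sample text are hypothetical and are not taken from the config file posted earlier in the thread:

```python
import re


def hid_options(config_text: str) -> dict:
    """Map each CONFIG_* symbol containing 'HID' to its value ('n' if not set)."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        m = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if m:
            name, value = m.groups()
        else:
            m = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
            if not m:
                continue  # comment, blank line, or unrelated text
            name, value = m.group(1), "n"
        if "HID" in name:
            options[name] = value
    return options


# Hypothetical excerpt, not the posted config file.
SAMPLE = """\
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HIDRAW=y
# CONFIG_USB_HID is not set
CONFIG_INPUT=y
"""

for name, value in hid_options(SAMPLE).items():
    print(name, value)
```

The sample mirrors the earlier attempt described above, where USB_HID was off but HID, HIDRAW and HID_SUPPORT were still enabled.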

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-22 18:42     ` Andi Kleen
@ 2009-12-22 19:01       ` Peter Palfrader
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Palfrader @ 2009-12-22 19:01 UTC
  To: Andi Kleen; +Cc: linux-kernel, DSA, linux-input

On Tue, 22 Dec 2009, Andi Kleen wrote:

> > This is a static kernel - no module support.  Anyway, I also tried
> > without CONFIG_USB_HID (which pulls in all the other HID_* things) but
> > no luck.
> 
> Try a make distclean + rebuild anyways.

I usually do.  make-kpkg doesn't really like building from dirty
directories all that much.

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-22 18:57     ` Peter Palfrader
@ 2009-12-24 13:04       ` Peter Palfrader
  2009-12-26 17:12         ` Andi Kleen
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Palfrader @ 2009-12-24 13:04 UTC
  To: Andi Kleen, linux-kernel, DSA, linux-input

On Tue, 22 Dec 2009, Peter Palfrader wrote:

> On Tue, 22 Dec 2009, Peter Palfrader wrote:
> 
> > > If that doesn't help perhaps disable CONFIG_HID_GYRATION,
> > > but from your other oops something more seems to be broken anyways.
> > 
> > This is a static kernel - no module support.  Anyway, I also tried
> > without CONFIG_USB_HID (which pulls in all the other HID_* things) but
> > no luck.
> 
> However, disabling all of HID (CONFIG_HID_SUPPORT=n) makes the system
> boot (Previously HID, HIDRAW and HID_SUPPORT were still enabled).

However, I still see panics on boot occasionally, though not as often or
as reproducibly.  So far only on DL385 (Opteron) systems.

And all of the backtraces go through sysfs_new_dirent() near the top.
-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

* Re: 2.6.32.1: BUG and panic: unable to handle kernel NULL pointer dereference at 000000000000001f
  2009-12-24 13:04       ` Peter Palfrader
@ 2009-12-26 17:12         ` Andi Kleen
  0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2009-12-26 17:12 UTC
  To: Andi Kleen, linux-kernel, DSA, linux-input

On Thu, Dec 24, 2009 at 02:04:25PM +0100, Peter Palfrader wrote:
> On Tue, 22 Dec 2009, Peter Palfrader wrote:
> 
> > On Tue, 22 Dec 2009, Peter Palfrader wrote:
> > 
> > > > If that doesn't help perhaps disable CONFIG_HID_GYRATION,
> > > > but from your other oops something more seems to be broken anyways.
> > > 
> > > This is a static kernel - no module support.  Anyway, I also tried
> > > without CONFIG_USB_HID (which pulls in all the other HID_* things) but
> > > no luck.
> > 
> > However, disabling all of HID (CONFIG_HID_SUPPORT=n) makes the system
> > boot (Previously HID, HIDRAW and HID_SUPPORT were still enabled).

That's suspicious if you don't have such devices; it would
point to something being confused in the driver probing
layer.

> 
> However, I still see panics on boot occasionally, though not as often or
> as reproducibly.  So far only on DL385 (Opteron) systems.

Multiple systems and the same oopses?

> 
> And all of the backtraces go through sysfs_new_dirent() near the top.

Please post full oopses.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
