From: Tang Chen <tangchen@cn.fujitsu.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org, linux-numa@vger.kernel.org
Cc: Wen Congyang <wency@cn.fujitsu.com>
Subject: [BUG] Failed to online cpu on a hot-added NUMA node.
Date: Mon, 10 Sep 2012 18:31:52 +0800
Message-ID: <504DC198.6080602@cn.fujitsu.com>
Hi,
When I hot-add a node, all the CPUs on it are offline.
When I online one of them, I get the following error message.
[ 762.759364] Call Trace:
[ 762.759371] [<ffffffff8106ec2f>] warn_slowpath_common+0x7f/0xc0
[ 762.759374] [<ffffffff8106ec8a>] warn_slowpath_null+0x1a/0x20
[ 762.759377] [<ffffffff810b463b>] init_sched_groups_power+0xcb/0xd0
[ 762.759380] [<ffffffff810b49fc>] build_sched_domains+0x3bc/0x6a0
[ 762.759387] [<ffffffff810e2e73>] ? __lock_release+0x133/0x1a0
[ 762.759390] [<ffffffff810b51f7>] partition_sched_domains+0x347/0x530
[ 762.759393] [<ffffffff810b4ff2>] ? partition_sched_domains+0x142/0x530
[ 762.759399] [<ffffffff81102bd3>] cpuset_update_active_cpus+0x83/0x90
[ 762.759402] [<ffffffff810b5418>] cpuset_cpu_active+0x38/0x70
[ 762.759411] [<ffffffff81681167>] notifier_call_chain+0x67/0x150
[ 762.759417] [<ffffffff81670bff>] ? native_cpu_up+0x194/0x1c7
[ 762.759422] [<ffffffff810a36be>] __raw_notifier_call_chain+0xe/0x10
[ 762.759426] [<ffffffff81072d70>] __cpu_notify+0x20/0x40
[ 762.759430] [<ffffffff81672af7>] _cpu_up+0xfc/0x144
[ 762.759433] [<ffffffff81672c12>] cpu_up+0xd3/0xe6
[ 762.759439] [<ffffffff81662a1c>] store_online+0x9c/0xd0
[ 762.759447] [<ffffffff81441f80>] dev_attr_store+0x20/0x30
[ 762.759454] [<ffffffff812547a3>] sysfs_write_file+0xa3/0x100
[ 762.759462] [<ffffffff811d62a0>] vfs_write+0xd0/0x1a0
[ 762.759465] [<ffffffff811d6474>] sys_write+0x54/0xa0
[ 762.759471] [<ffffffff81686269>] system_call_fastpath+0x16/0x1b
[ 762.759473] ---[ end trace 75068e651299460b ]---
[ 762.759493] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
In init_sched_groups_power(), we get a NULL sg pointer, which should
have been initialized in build_overlap_sched_groups().
In build_overlap_sched_groups(),
cpumask_copy(sg_span, sched_domain_span(child));
the new cpu is not set in sched_domain_span(child). It should be set in
build_sched_domain(),
cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
But at the NUMA topology level, the masks for the CPUs on the new node
are not set in the sched_domains_numa_masks array when they are
hot-added, which means they are not set in tl->mask(cpu).
Should we set the hot-added CPUs' masks in sched_domains_numa_masks
when they are onlined?
If I want to fix this, do I need to add a new notifier to the notifier
chain?
Thanks. :)