Linux-arch Archive mirror
From: Nathan Chancellor <nathan@kernel.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexey Gladkov <legion@kernel.org>, Kyle Huey <me@kylehuey.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: [PATCH 09/10] kthread: Ensure struct kthread is present for all kthreads
Date: Wed, 22 Dec 2021 17:37:17 -0700
Message-ID: <YcPEvfyWK8DKPGlL@archlinux-ax161>
In-Reply-To: <87pmpow7ga.fsf@email.froward.int.ebiederm.org>

On Wed, Dec 22, 2021 at 05:22:45PM -0600, Eric W. Biederman wrote:
> Nathan Chancellor <nathan@kernel.org> writes:
> 
> > On Wed, Dec 22, 2021 at 12:30:57PM -0600, Eric W. Biederman wrote:
> >> Nathan Chancellor <nathan@kernel.org> writes:
> >> 
> >> > Hi Eric,
> >> >
> >> > On Wed, Dec 08, 2021 at 02:25:31PM -0600, Eric W. Biederman wrote:
> >> >> Today the rules are a bit iffy and arbitrary about which kernel
> >> >> threads have struct kthread present.  Both idle threads and threads
> >> >> started with create_kthread want struct kthread present so that is
> >> >> effectively all kernel threads.  Make the rule that if PF_KTHREAD
> >> >> and the task is running then struct kthread is present.
> >> >> 
> >> >> This will allow the kernel thread code to use tsk->exit_code
> >> >> with different semantics from ordinary processes.
> >> >> 
> >> >> To ensure that struct kthread is present for all
> >> >> kernel threads move its allocation into copy_process.
> >> >> 
> >> >> Add a deallocation of struct kthread in exec for processes
> >> >> that were kernel threads.
> >> >> 
> >> >> Move the allocation of struct kthread for the initial thread
> >> >> earlier so that it is not repeated for each additional idle
> >> >> thread.
> >> >> 
> >> >> Move the initialization of struct kthread into set_kthread_struct
> >> >> so that the structure is always and reliably initialized.
> >> >> 
> >> >> Clear set_child_tid in free_kthread_struct to ensure the kthread
> >> >> struct is reliably freed during exec.  The function
> >> >> free_kthread_struct does not need to clear vfork_done during exec as
> >> >> exec_mm_release called from exec_mmap has already cleared vfork_done.
> >> >> 
> >> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >> >
> >> > This patch as commit 40966e316f86 ("kthread: Ensure struct kthread is
> >> > present for all kthreads") in -next causes an ARCH=arm
> >> > multi_v5_defconfig kernel to fail to boot in QEMU. I had to apply commit
> >> > 6692c98c7df5 ("fork: Stop protecting back_fork_cleanup_cgroup_lock with
> >> > CONFIG_NUMA") to get it to build and I applied commit dd621ee0cf8e
> >> > ("kthread: Warn about failed allocations for the init kthread") to avoid
> >> > the known runtime warning.
> >> >
> >> > $ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- distclean multi_v5_defconfig all
> >> >
> >> > $ qemu-system-arm \
> >> >     -initrd rootfs.cpio \
> >> >     -append earlycon \
> >> >     -machine palmetto-bmc \
> >> >     -no-reboot \
> >> >     -dtb arch/arm/boot/dts/aspeed-bmc-opp-palmetto.dtb \
> >> >     -display none \
> >> >     -kernel arch/arm/boot/zImage \
> >> >     -m 512m \
> >> >     -nodefaults \
> >> >     -serial mon:stdio
> >> > qemu-system-arm: warning: nic ftgmac100.0 has no peer
> >> > qemu-system-arm: warning: nic ftgmac100.1 has no peer
> >> > Booting Linux on physical CPU 0x0
> >> > Linux version 5.16.0-rc1-00016-g40966e316f86-dirty (nathan@archlinux-ax161) (arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 PREEMPT Wed Dec 22 18:08:53 UTC 2021
> >> > CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
> >> > CPU: VIVT data cache, VIVT instruction cache
> >> > OF: fdt: Machine model: Palmetto BMC
> >> > earlycon: ns16550a0 at MMIO 0x1e784000 (options '')
> >> > printk: bootconsole [ns16550a0] enabled
> >> > Memory policy: Data cache writethrough
> >> > cma: Reserved 16 MiB at 0x5b000000
> >> > Zone ranges:
> >> >   DMA      [mem 0x0000000040000000-0x000000005edfffff]
> >> >   Normal   empty
> >> >   HighMem  [mem 0x000000005ee00000-0x000000005fffffff]
> >> > Movable zone start for each node
> >> > Early memory node ranges
> >> >   node   0: [mem 0x0000000040000000-0x000000005bffffff]
> >> >   node   0: [mem 0x000000005c000000-0x000000005dffffff]
> >> >   node   0: [mem 0x000000005e000000-0x000000005edfffff]
> >> >   node   0: [mem 0x000000005ee00000-0x000000005fffffff]
> >> > Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
> >> > Built 1 zonelists, mobility grouping on.  Total pages: 130084
> >> > Kernel command line: earlycon
> >> > Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
> >> > Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
> >> > mem auto-init: stack:off, heap alloc:off, heap free:off
> >> > Memory: 433140K/524288K available (9628K kernel code, 2019K rwdata, 2368K rodata, 340K init, 661K bss, 74764K reserved, 16384K cma-reserved, 0K highmem)
> >> > SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> >> > rcu: Preemptible hierarchical RCU implementation.
> >> > rcu:    RCU event tracing is enabled.
> >> >         Trampoline variant of Tasks RCU enabled.
> >> > rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> >> > NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> >> > i2c controller registered, irq 16
> >> > random: get_random_bytes called from start_kernel+0x408/0x624 with crng_init=0
> >> > clocksource: FTTMR010-TIMER2: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> >> > sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> >> > Switching to timer-based delay loop, resolution 41ns
> >> > Console: colour dummy device 80x30
> >> > printk: console [tty0] enabled
> >> > printk: bootconsole [ns16550a0] disabled
> >> >
> >> > After that, it just hangs.
> >> >
> >> > The rootfs is available at https://github.com/ClangBuiltLinux/boot-utils
> >> > in the images/arm folder.
> >> >
> >> > If there is any more information that I can provide or changes to test,
> >> > please let me know.
> 
> I have managed to reproduce, fix and verify my fix, please
> see below.
> 
> 
> Subject: [PATCH] kthread: Never put_user the set_child_tid address
> 
> Kernel threads abuse set_child_tid.  Historically that has been fine
> as set_child_tid was initialized after the kernel thread had been
> forked.  Unfortunately storing struct kthread in set_child_tid after
> the thread is running makes struct kthread unusable for storing
> result codes of the thread.
> 
> When set_child_tid is set to struct kthread during fork that results
> in schedule_tail writing the thread id to the beginning of struct
> kthread (if put_user does not realize it is a kernel address).
> 
> Solve this by skipping the put_user for all kthreads.
> 
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Link: https://lkml.kernel.org/r/YcNsG0Lp94V13whH@archlinux-ax161
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Thanks a lot for the quick fix. I can confirm that it resolves the
failure on my side.

Tested-by: Nathan Chancellor <nathan@kernel.org>

> ---
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ee222b89c692..d8adbea77be1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4908,7 +4908,7 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
>  	finish_task_switch(prev);
>  	preempt_enable();
>  
> -	if (current->set_child_tid)
> +	if (!(current->flags & PF_KTHREAD) && current->set_child_tid)
>  		put_user(task_pid_vnr(current), current->set_child_tid);
>  
>  	calculate_sigpending();
> -- 
> 2.29.2
> 
> 
> Eric
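
The effect of the one-line guard in the patch above can be illustrated with
a small userspace model. This is only a sketch under stated assumptions:
struct task_model, struct kthread_model, and both schedule_tail_*_model
functions are hypothetical stand-ins invented for illustration, not kernel
code, and a plain store takes the place of put_user. Only the PF_KTHREAD
value is taken from include/linux/sched.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value of PF_KTHREAD in include/linux/sched.h. */
#define PF_KTHREAD 0x00200000u

/* Hypothetical stand-in for struct kthread: only its first word matters here. */
struct kthread_model {
	int first_word;
};

/* Hypothetical, stripped-down stand-in for struct task_struct. */
struct task_model {
	unsigned int flags;
	int *set_child_tid;	/* user address for ordinary tasks; kthreads reuse it */
	int tid;
};

/* Old check: writes the tid whenever set_child_tid is non-NULL. */
bool schedule_tail_old_model(struct task_model *t)
{
	assert(t != NULL);
	if (t->set_child_tid) {
		*t->set_child_tid = t->tid;	/* models put_user() */
		return true;
	}
	return false;
}

/* Patched check: kernel threads never get the tid written. */
bool schedule_tail_new_model(struct task_model *t)
{
	assert(t != NULL);
	if (!(t->flags & PF_KTHREAD) && t->set_child_tid) {
		*t->set_child_tid = t->tid;	/* models put_user() */
		return true;
	}
	return false;
}
```

With a task whose flags include PF_KTHREAD and whose set_child_tid points at
a kthread_model, the old model overwrites first_word with the tid, while the
patched model leaves it untouched. An ordinary task still gets its tid
written in both cases.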
