LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH] kprobe/ftrace: bail out if ftrace was killed
@ 2024-04-26 22:58 Stephen Brennan
  2024-04-29 13:48 ` Masami Hiramatsu
  2024-04-29 17:47 ` [PATCH v2] " Stephen Brennan
  0 siblings, 2 replies; 6+ messages in thread
From: Stephen Brennan @ 2024-04-26 22:58 UTC (permalink / raw
  To: Steven Rostedt, Masami Hiramatsu, Mark Rutland
  Cc: Guo Ren, Huacai Chen, WANG Xuerui, James E.J. Bottomley,
	Helge Deller, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
	Aneesh Kumar K.V, Naveen N. Rao, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	Stephen Brennan

If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
---

Apologies for the wide net cast here. I recognize that a change like this
may need to be split up and go through arch-specific trees. I hoped to get
feedback on the patch itself. If it's satisfactory and the architecture
maintainers prefer it split out, I'm glad to do it. Thanks!

 arch/csky/kernel/probes/ftrace.c     | 3 +++
 arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
 arch/parisc/kernel/ftrace.c          | 3 +++
 arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
 arch/riscv/kernel/probes/ftrace.c    | 3 +++
 arch/s390/kernel/ftrace.c            | 3 +++
 arch/x86/kernel/kprobes/ftrace.c     | 3 +++
 include/linux/ftrace.h               | 2 ++
 8 files changed, 23 insertions(+)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..3931bf9f707b 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	struct pt_regs *regs;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..82c952cb5be0 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..3660834f54c3 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..85eb55aa1457 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 	struct pt_regs *regs;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(nip, parent_nip);
 	if (bit < 0)
 		return;
diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..8814fbe4c888 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..ccbe8ccf945b 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index dd2ec14adb77..c73f9ab7ff50 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 54d53f345d14..ba83e99c1fbe 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
 #define register_ftrace_function(ops) ({ 0; })
 #define unregister_ftrace_function(ops) ({ 0; })
 static inline void ftrace_kill(void) { }
+static inline int ftrace_is_dead(void) { return 0; }
 static inline void ftrace_free_init_mem(void) { }
 static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
 static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
@@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
 
 /* totally disable ftrace - can not re-enable after this */
 void ftrace_kill(void);
+int ftrace_is_dead(void);
 
 static inline void tracer_disable(void)
 {
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] kprobe/ftrace: bail out if ftrace was killed
  2024-04-26 22:58 [PATCH] kprobe/ftrace: bail out if ftrace was killed Stephen Brennan
@ 2024-04-29 13:48 ` Masami Hiramatsu
  2024-04-29 17:47   ` Stephen Brennan
  2024-04-29 17:47 ` [PATCH v2] " Stephen Brennan
  1 sibling, 1 reply; 6+ messages in thread
From: Masami Hiramatsu @ 2024-04-29 13:48 UTC (permalink / raw
  To: Stephen Brennan
  Cc: Steven Rostedt, Mark Rutland, Guo Ren, Huacai Chen, WANG Xuerui,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Aneesh Kumar K.V,
	Naveen N. Rao, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390

Hi Stephen,

On Fri, 26 Apr 2024 15:58:34 -0700
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:

> If an error happens in ftrace, ftrace_kill() will prevent disarming
> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> freed, yet the kprobes will still be active, and when triggered, they
> will use the freed memory, likely resulting in a page fault and panic.

Hmm, indeed.

> 
> This behavior can be reproduced quite easily, by creating a kprobe and
> then triggering a ftrace_kill(). For simplicity, we can simulate an
> ftrace error with a kernel module like [1]:
> 
> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
> 
>   sudo perf probe --add commit_creds
>   sudo perf trace -e probe:commit_creds
>   # In another terminal
>   make
>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>   # Back to perf terminal
>   # ctrl-c
>   sudo perf probe --del commit_creds
> 
> After a short period, a page fault and panic would occur as the kprobe
> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> is supposed to be used only in extreme circumstances, it is invoked in
> FTRACE_WARN_ON() and so there are many places where an unexpected bug
> could be triggered, yet the system may continue operating, possibly
> without the administrator noticing. If ftrace_kill() does not panic the
> system, then we should do everything we can to continue operating,
> rather than leave a ticking time bomb.

OK, the patch looks good to me.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks!

> 
> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> ---
> 
> Apologies for the wide net cast here. I recognize that a change like this
> may need to be split up and go through arch-specific trees. I hoped to get
> feedback on the patch itself. If it's satisfactory and the architecture
> maintainers prefer it split out, I'm glad to do it. Thanks!
> 
>  arch/csky/kernel/probes/ftrace.c     | 3 +++
>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>  arch/parisc/kernel/ftrace.c          | 3 +++
>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>  arch/riscv/kernel/probes/ftrace.c    | 3 +++
>  arch/s390/kernel/ftrace.c            | 3 +++
>  arch/x86/kernel/kprobes/ftrace.c     | 3 +++
>  include/linux/ftrace.h               | 2 ++
>  8 files changed, 23 insertions(+)
> 
> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
> index 834cffcfbce3..3931bf9f707b 100644
> --- a/arch/csky/kernel/probes/ftrace.c
> +++ b/arch/csky/kernel/probes/ftrace.c
> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	struct pt_regs *regs;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
> index 73858c9029cc..82c952cb5be0 100644
> --- a/arch/loongarch/kernel/ftrace_dyn.c
> +++ b/arch/loongarch/kernel/ftrace_dyn.c
> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	struct kprobe_ctlblk *kcb;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
> index 621a4b386ae4..3660834f54c3 100644
> --- a/arch/parisc/kernel/ftrace.c
> +++ b/arch/parisc/kernel/ftrace.c
> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> index 072ebe7f290b..85eb55aa1457 100644
> --- a/arch/powerpc/kernel/kprobes-ftrace.c
> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>  	struct pt_regs *regs;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(nip, parent_nip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
> index 7142ec42e889..8814fbe4c888 100644
> --- a/arch/riscv/kernel/probes/ftrace.c
> +++ b/arch/riscv/kernel/probes/ftrace.c
> @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> index c46381ea04ec..ccbe8ccf945b 100644
> --- a/arch/s390/kernel/ftrace.c
> +++ b/arch/s390/kernel/ftrace.c
> @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
> index dd2ec14adb77..c73f9ab7ff50 100644
> --- a/arch/x86/kernel/kprobes/ftrace.c
> +++ b/arch/x86/kernel/kprobes/ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 54d53f345d14..ba83e99c1fbe 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
>  #define register_ftrace_function(ops) ({ 0; })
>  #define unregister_ftrace_function(ops) ({ 0; })
>  static inline void ftrace_kill(void) { }
> +static inline int ftrace_is_dead(void) { return 0; }
>  static inline void ftrace_free_init_mem(void) { }
>  static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
>  static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
> @@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
>  
>  /* totally disable ftrace - can not re-enable after this */
>  void ftrace_kill(void);
> +int ftrace_is_dead(void);
>  
>  static inline void tracer_disable(void)
>  {
> -- 
> 2.39.3
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] kprobe/ftrace: bail out if ftrace was killed
  2024-04-26 22:58 [PATCH] kprobe/ftrace: bail out if ftrace was killed Stephen Brennan
  2024-04-29 13:48 ` Masami Hiramatsu
@ 2024-04-29 17:47 ` Stephen Brennan
  2024-04-30  1:29   ` Steven Rostedt
  1 sibling, 1 reply; 6+ messages in thread
From: Stephen Brennan @ 2024-04-29 17:47 UTC (permalink / raw
  To: Steven Rostedt, Masami Hiramatsu, Mark Rutland
  Cc: Guo Ren, Huacai Chen, WANG Xuerui, James E.J. Bottomley,
	Helge Deller, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
	Aneesh Kumar K.V, Naveen N. Rao, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390,
	Stephen Brennan

If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
---
Difference from v1: removed both existing declarations of ftrace_is_dead()
from kernel/trace/trace.h.

 arch/csky/kernel/probes/ftrace.c     | 3 +++
 arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
 arch/parisc/kernel/ftrace.c          | 3 +++
 arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
 arch/riscv/kernel/probes/ftrace.c    | 3 +++
 arch/s390/kernel/ftrace.c            | 3 +++
 arch/x86/kernel/kprobes/ftrace.c     | 3 +++
 include/linux/ftrace.h               | 2 ++
 kernel/trace/trace.h                 | 2 --
 9 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..3931bf9f707b 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	struct pt_regs *regs;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..82c952cb5be0 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..3660834f54c3 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..85eb55aa1457 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 	struct pt_regs *regs;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(nip, parent_nip);
 	if (bit < 0)
 		return;
diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..8814fbe4c888 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..ccbe8ccf945b 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe *p;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index dd2ec14adb77..c73f9ab7ff50 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 	struct kprobe_ctlblk *kcb;
 	int bit;
 
+	if (unlikely(ftrace_is_dead()))
+		return;
+
 	bit = ftrace_test_recursion_trylock(ip, parent_ip);
 	if (bit < 0)
 		return;
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 54d53f345d14..ba83e99c1fbe 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
 #define register_ftrace_function(ops) ({ 0; })
 #define unregister_ftrace_function(ops) ({ 0; })
 static inline void ftrace_kill(void) { }
+static inline int ftrace_is_dead(void) { return 0; }
 static inline void ftrace_free_init_mem(void) { }
 static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
 static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
@@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
 
 /* totally disable ftrace - can not re-enable after this */
 void ftrace_kill(void);
+int ftrace_is_dead(void);
 
 static inline void tracer_disable(void)
 {
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 64450615ca0c..70a37ee41813 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1026,7 +1026,6 @@ static inline int ftrace_trace_task(struct trace_array *tr)
 	return this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid) !=
 		FTRACE_PID_IGNORE;
 }
-extern int ftrace_is_dead(void);
 int ftrace_create_function_files(struct trace_array *tr,
 				 struct dentry *parent);
 void ftrace_destroy_function_files(struct trace_array *tr);
@@ -1046,7 +1045,6 @@ static inline int ftrace_trace_task(struct trace_array *tr)
 {
 	return 1;
 }
-static inline int ftrace_is_dead(void) { return 0; }
 static inline int
 ftrace_create_function_files(struct trace_array *tr,
 			     struct dentry *parent)
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] kprobe/ftrace: bail out if ftrace was killed
  2024-04-29 13:48 ` Masami Hiramatsu
@ 2024-04-29 17:47   ` Stephen Brennan
  0 siblings, 0 replies; 6+ messages in thread
From: Stephen Brennan @ 2024-04-29 17:47 UTC (permalink / raw
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Mark Rutland, Guo Ren, Huacai Chen, WANG Xuerui,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Aneesh Kumar K.V,
	Naveen N. Rao, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390

Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:
> Hi Stephen,
>
> On Fri, 26 Apr 2024 15:58:34 -0700
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>
> Hmm, indeed.
>
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>   sudo perf probe --add commit_creds
>>   sudo perf trace -e probe:commit_creds
>>   # In another terminal
>>   make
>>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>   # Back to perf terminal
>>   # ctrl-c
>>   sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>
> OK, the patch looks good to me.
>
> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Thanks!

Hi Masami,

Thank you! Sadly I took a second look at the patch and noticed I forgot
to remove the existing declarations of ftrace_is_dead() from
kernel/trace/trace.h. I've sent v2 in reply to v1 in order to correct
that. I'm sorry for the churn.

Thanks,
Stephen

>> 
>> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
>> ---
>> 
>> Apologies for the wide net cast here. I recognize that a change like this
>> may need to be split up and go through arch-specific trees. I hoped to get
>> feedback on the patch itself. If it's satisfactory and the architecture
>> maintainers prefer it split out, I'm glad to do it. Thanks!
>> 
>>  arch/csky/kernel/probes/ftrace.c     | 3 +++
>>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>>  arch/parisc/kernel/ftrace.c          | 3 +++
>>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>>  arch/riscv/kernel/probes/ftrace.c    | 3 +++
>>  arch/s390/kernel/ftrace.c            | 3 +++
>>  arch/x86/kernel/kprobes/ftrace.c     | 3 +++
>>  include/linux/ftrace.h               | 2 ++
>>  8 files changed, 23 insertions(+)
>> 
>> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
>> index 834cffcfbce3..3931bf9f707b 100644
>> --- a/arch/csky/kernel/probes/ftrace.c
>> +++ b/arch/csky/kernel/probes/ftrace.c
>> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	struct pt_regs *regs;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
>> index 73858c9029cc..82c952cb5be0 100644
>> --- a/arch/loongarch/kernel/ftrace_dyn.c
>> +++ b/arch/loongarch/kernel/ftrace_dyn.c
>> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	struct kprobe_ctlblk *kcb;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
>> index 621a4b386ae4..3660834f54c3 100644
>> --- a/arch/parisc/kernel/ftrace.c
>> +++ b/arch/parisc/kernel/ftrace.c
>> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
>> index 072ebe7f290b..85eb55aa1457 100644
>> --- a/arch/powerpc/kernel/kprobes-ftrace.c
>> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
>> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>>  	struct pt_regs *regs;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(nip, parent_nip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
>> index 7142ec42e889..8814fbe4c888 100644
>> --- a/arch/riscv/kernel/probes/ftrace.c
>> +++ b/arch/riscv/kernel/probes/ftrace.c
>> @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
>> index c46381ea04ec..ccbe8ccf945b 100644
>> --- a/arch/s390/kernel/ftrace.c
>> +++ b/arch/s390/kernel/ftrace.c
>> @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
>> index dd2ec14adb77..c73f9ab7ff50 100644
>> --- a/arch/x86/kernel/kprobes/ftrace.c
>> +++ b/arch/x86/kernel/kprobes/ftrace.c
>> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 54d53f345d14..ba83e99c1fbe 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
>>  #define register_ftrace_function(ops) ({ 0; })
>>  #define unregister_ftrace_function(ops) ({ 0; })
>>  static inline void ftrace_kill(void) { }
>> +static inline int ftrace_is_dead(void) { return 0; }
>>  static inline void ftrace_free_init_mem(void) { }
>>  static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
>>  static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
>> @@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
>>  
>>  /* totally disable ftrace - can not re-enable after this */
>>  void ftrace_kill(void);
>> +int ftrace_is_dead(void);
>>  
>>  static inline void tracer_disable(void)
>>  {
>> -- 
>> 2.39.3
>> 
>
>
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] kprobe/ftrace: bail out if ftrace was killed
  2024-04-29 17:47 ` [PATCH v2] " Stephen Brennan
@ 2024-04-30  1:29   ` Steven Rostedt
  2024-04-30 23:01     ` Stephen Brennan
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2024-04-30  1:29 UTC (permalink / raw
  To: Stephen Brennan
  Cc: Masami Hiramatsu, Mark Rutland, Guo Ren, Huacai Chen, WANG Xuerui,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Aneesh Kumar K.V,
	Naveen N. Rao, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390

On Mon, 29 Apr 2024 10:47:18 -0700
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:

> If an error happens in ftrace, ftrace_kill() will prevent disarming
> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> freed, yet the kprobes will still be active, and when triggered, they
> will use the freed memory, likely resulting in a page fault and panic.
> 
> This behavior can be reproduced quite easily, by creating a kprobe and
> then triggering a ftrace_kill(). For simplicity, we can simulate an
> ftrace error with a kernel module like [1]:
> 
> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
> 
>   sudo perf probe --add commit_creds
>   sudo perf trace -e probe:commit_creds
>   # In another terminal
>   make
>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>   # Back to perf terminal
>   # ctrl-c
>   sudo perf probe --del commit_creds
> 
> After a short period, a page fault and panic would occur as the kprobe
> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> is supposed to be used only in extreme circumstances, it is invoked in
> FTRACE_WARN_ON() and so there are many places where an unexpected bug
> could be triggered, yet the system may continue operating, possibly
> without the administrator noticing. If ftrace_kill() does not panic the
> system, then we should do everything we can to continue operating,
> rather than leave a ticking time bomb.
> 
> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> ---
> Difference from v1: removed both existing declarations of ftrace_is_dead()
> from kernel/trace/trace.h.
> 
>  arch/csky/kernel/probes/ftrace.c     | 3 +++
>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>  arch/parisc/kernel/ftrace.c          | 3 +++
>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>  arch/riscv/kernel/probes/ftrace.c    | 3 +++
>  arch/s390/kernel/ftrace.c            | 3 +++
>  arch/x86/kernel/kprobes/ftrace.c     | 3 +++
>  include/linux/ftrace.h               | 2 ++
>  kernel/trace/trace.h                 | 2 --
>  9 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
> index 834cffcfbce3..3931bf9f707b 100644
> --- a/arch/csky/kernel/probes/ftrace.c
> +++ b/arch/csky/kernel/probes/ftrace.c
> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	struct pt_regs *regs;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
> index 73858c9029cc..82c952cb5be0 100644
> --- a/arch/loongarch/kernel/ftrace_dyn.c
> +++ b/arch/loongarch/kernel/ftrace_dyn.c
> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	struct kprobe_ctlblk *kcb;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
> index 621a4b386ae4..3660834f54c3 100644
> --- a/arch/parisc/kernel/ftrace.c
> +++ b/arch/parisc/kernel/ftrace.c
> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> index 072ebe7f290b..85eb55aa1457 100644
> --- a/arch/powerpc/kernel/kprobes-ftrace.c
> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>  	struct pt_regs *regs;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(nip, parent_nip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
> index 7142ec42e889..8814fbe4c888 100644
> --- a/arch/riscv/kernel/probes/ftrace.c
> +++ b/arch/riscv/kernel/probes/ftrace.c
> @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> index c46381ea04ec..ccbe8ccf945b 100644
> --- a/arch/s390/kernel/ftrace.c
> +++ b/arch/s390/kernel/ftrace.c
> @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe *p;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
> index dd2ec14adb77..c73f9ab7ff50 100644
> --- a/arch/x86/kernel/kprobes/ftrace.c
> +++ b/arch/x86/kernel/kprobes/ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>  	struct kprobe_ctlblk *kcb;
>  	int bit;
>  
> +	if (unlikely(ftrace_is_dead()))
> +		return;
> +
>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>  	if (bit < 0)
>  		return;
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 54d53f345d14..ba83e99c1fbe 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
>  #define register_ftrace_function(ops) ({ 0; })
>  #define unregister_ftrace_function(ops) ({ 0; })
>  static inline void ftrace_kill(void) { }
> +static inline int ftrace_is_dead(void) { return 0; }
>  static inline void ftrace_free_init_mem(void) { }
>  static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
>  static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
> @@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
>  
>  /* totally disable ftrace - can not re-enable after this */
>  void ftrace_kill(void);
> +int ftrace_is_dead(void);
>  
>  static inline void tracer_disable(void)
>  {
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 64450615ca0c..70a37ee41813 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1026,7 +1026,6 @@ static inline int ftrace_trace_task(struct trace_array *tr)
>  	return this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid) !=
>  		FTRACE_PID_IGNORE;
>  }
> -extern int ftrace_is_dead(void);

Honestly I rather not expose this function outside of the tracing
infrastructure. Instead, we should have a kprobe_ftrace_kill() function,
and have ftrace_kill() call that.

Then kprobe_ftrace_kill() can set its own variable that is exposed to all
these functions and they can test that instead of adding the extra overhead
in the fast path of a function call to ftrace_is_dead()

extern bool kprobes_ftrace_disabled __read_mostly;

void kprobe_ftrace_kill(void)
{
	kprobes_ftrace_disabled = true;
}

And you can then replace all these with:

	if (kprobes_ftrace_disabled)
		return;

Which is faster.

-- Steve

>  int ftrace_create_function_files(struct trace_array *tr,
>  				 struct dentry *parent);
>  void ftrace_destroy_function_files(struct trace_array *tr);
> @@ -1046,7 +1045,6 @@ static inline int ftrace_trace_task(struct trace_array *tr)
>  {
>  	return 1;
>  }
> -static inline int ftrace_is_dead(void) { return 0; }
>  static inline int
>  ftrace_create_function_files(struct trace_array *tr,
>  			     struct dentry *parent)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] kprobe/ftrace: bail out if ftrace was killed
  2024-04-30  1:29   ` Steven Rostedt
@ 2024-04-30 23:01     ` Stephen Brennan
  0 siblings, 0 replies; 6+ messages in thread
From: Stephen Brennan @ 2024-04-30 23:01 UTC (permalink / raw
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Mark Rutland, Guo Ren, Huacai Chen, WANG Xuerui,
	James E.J. Bottomley, Helge Deller, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Aneesh Kumar K.V,
	Naveen N. Rao, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-kernel, linux-trace-kernel, linux-csky, loongarch,
	linux-parisc, linuxppc-dev, linux-riscv, linux-s390

Steven Rostedt <rostedt@goodmis.org> writes:
> On Mon, 29 Apr 2024 10:47:18 -0700
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>   sudo perf probe --add commit_creds
>>   sudo perf trace -e probe:commit_creds
>>   # In another terminal
>>   make
>>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>   # Back to perf terminal
>>   # ctrl-c
>>   sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>> 
>> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
>> ---
>> Difference from v1: removed both existing declarations of ftrace_is_dead()
>> from kernel/trace/trace.h.
>> 
>>  arch/csky/kernel/probes/ftrace.c     | 3 +++
>>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>>  arch/parisc/kernel/ftrace.c          | 3 +++
>>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>>  arch/riscv/kernel/probes/ftrace.c    | 3 +++
>>  arch/s390/kernel/ftrace.c            | 3 +++
>>  arch/x86/kernel/kprobes/ftrace.c     | 3 +++
>>  include/linux/ftrace.h               | 2 ++
>>  kernel/trace/trace.h                 | 2 --
>>  9 files changed, 23 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
>> index 834cffcfbce3..3931bf9f707b 100644
>> --- a/arch/csky/kernel/probes/ftrace.c
>> +++ b/arch/csky/kernel/probes/ftrace.c
>> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	struct pt_regs *regs;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
>> index 73858c9029cc..82c952cb5be0 100644
>> --- a/arch/loongarch/kernel/ftrace_dyn.c
>> +++ b/arch/loongarch/kernel/ftrace_dyn.c
>> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	struct kprobe_ctlblk *kcb;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
>> index 621a4b386ae4..3660834f54c3 100644
>> --- a/arch/parisc/kernel/ftrace.c
>> +++ b/arch/parisc/kernel/ftrace.c
>> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
>> index 072ebe7f290b..85eb55aa1457 100644
>> --- a/arch/powerpc/kernel/kprobes-ftrace.c
>> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
>> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
>>  	struct pt_regs *regs;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(nip, parent_nip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
>> index 7142ec42e889..8814fbe4c888 100644
>> --- a/arch/riscv/kernel/probes/ftrace.c
>> +++ b/arch/riscv/kernel/probes/ftrace.c
>> @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
>> index c46381ea04ec..ccbe8ccf945b 100644
>> --- a/arch/s390/kernel/ftrace.c
>> +++ b/arch/s390/kernel/ftrace.c
>> @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe *p;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
>> index dd2ec14adb77..c73f9ab7ff50 100644
>> --- a/arch/x86/kernel/kprobes/ftrace.c
>> +++ b/arch/x86/kernel/kprobes/ftrace.c
>> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
>>  	struct kprobe_ctlblk *kcb;
>>  	int bit;
>>  
>> +	if (unlikely(ftrace_is_dead()))
>> +		return;
>> +
>>  	bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  	if (bit < 0)
>>  		return;
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index 54d53f345d14..ba83e99c1fbe 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -399,6 +399,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
>>  #define register_ftrace_function(ops) ({ 0; })
>>  #define unregister_ftrace_function(ops) ({ 0; })
>>  static inline void ftrace_kill(void) { }
>> +static inline int ftrace_is_dead(void) { return 0; }
>>  static inline void ftrace_free_init_mem(void) { }
>>  static inline void ftrace_free_mem(struct module *mod, void *start, void *end) { }
>>  static inline int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
>> @@ -914,6 +915,7 @@ static inline bool is_ftrace_trampoline(unsigned long addr)
>>  
>>  /* totally disable ftrace - can not re-enable after this */
>>  void ftrace_kill(void);
>> +int ftrace_is_dead(void);
>>  
>>  static inline void tracer_disable(void)
>>  {
>> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
>> index 64450615ca0c..70a37ee41813 100644
>> --- a/kernel/trace/trace.h
>> +++ b/kernel/trace/trace.h
>> @@ -1026,7 +1026,6 @@ static inline int ftrace_trace_task(struct trace_array *tr)
>>  	return this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid) !=
>>  		FTRACE_PID_IGNORE;
>>  }
>> -extern int ftrace_is_dead(void);
>
> Honestly I rather not expose this function outside of the tracing
> infrastructure. Instead, we should have a kprobe_ftrace_kill() function,
> and have ftrace_kill() call that.
>
> Then kprobe_ftrace_kill() can set its own variable that is exposed to all
> these functions and they can test that instead of adding the extra overhead
> in the fast path of a function call to ftrace_is_dead()
>
> extern bool kprobes_ftrace_disabled __read_mostly;
>
> void kprobe_ftrace_kill(void)
> {
> 	kprobes_ftrace_disabled = true;
> }
>
> And you can then replace all these with:
>
> 	if (kprobes_ftrace_disabled)
> 		return;
>
> Which is faster.

Thanks, that does make a lot more sense. It's faster, and doesn't
involve exporting that function. I'll go ahead and use this approach.

Thanks,
Stephen

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-04-30 23:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-26 22:58 [PATCH] kprobe/ftrace: bail out if ftrace was killed Stephen Brennan
2024-04-29 13:48 ` Masami Hiramatsu
2024-04-29 17:47   ` Stephen Brennan
2024-04-29 17:47 ` [PATCH v2] " Stephen Brennan
2024-04-30  1:29   ` Steven Rostedt
2024-04-30 23:01     ` Stephen Brennan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).