* net: cxgb4: Call Trace reported with PREEMPT_RT: BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
@ 2024-04-23  4:10 John B. Wyatt IV
  2024-04-23 15:03 ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 2+ messages in thread
From: John B. Wyatt IV @ 2024-04-23  4:10 UTC
  To: Raju Rangoju, Sebastian Andrzej Siewior
  Cc: Juri Lelli, Clark Williams, netdev, kernel-rts-sst, LKML

Hello Raju, Hello Sebastian,

Red Hat QE found this issue with cxgb4 only when the kernel has PREEMPT_RT set
with the preempt-rt patchset:

git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

We are also seeing this in the Real-time builds of RHEL9 and 8.

The specific build is an internal build that was pulled from the mirror Clark
Williams set up for Fedora and RHEL testing.

https://gitlab.com/cki-project/kernel-ark/-/tree/os-build-rt?ref_type=heads

We use the branch: os-build-rt

I was unable to find the cause of this and I thought I should report it.

Please let me know if you have any questions or need any testing done.

Call trace is below:

kernel-rt-6.9.0-0.rc4.f8dba31b0a82.38.test.eln136.x86_64
 BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
 caller is cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
 Hardware name: Dell Inc. PowerEdge R750/0WT8Y6, BIOS 1.5.4 12/17/2021
 Call Trace:
  <TASK>
 dump_stack_lvl (lib/dump_stack.c:116) 
 check_preemption_disabled (lib/smp_processor_id.c:49) 
 cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
 cxgb4_self_test+0x8f/0xe0 cxgb4
 ethtool_self_test (net/ethtool/ioctl.c:2002) 
 __dev_ethtool (net/ethtool/ioctl.c:2997) 
 ? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3)) 
 ? kmalloc_trace (./arch/x86/include/asm/jump_label.h:55 ./include/linux/memcontrol.h:1839 mm/slub.c:1980 mm/slub.c:3807 mm/slub.c:3845 mm/slub.c:3992) 
 dev_ethtool (net/ethtool/ioctl.c:3177) 
 dev_ioctl (net/core/dev_ioctl.c:724) 
 sock_do_ioctl (net/socket.c:1236) 
 __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:904 fs/ioctl.c:890 fs/ioctl.c:890) 
 do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
 ? mod_objcg_state (mm/memcontrol.c:3421) 
 ? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3)) 
 ? try_charge_memcg (mm/memcontrol.c:2745) 
 ? __mod_node_page_state (./include/linux/preempt.h:477 (discriminator 3) mm/vmstat.c:405 (discriminator 3)) 
 ? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3)) 
 ? rt_spin_unlock (kernel/locking/rtmutex.c:230 (discriminator 5) kernel/locking/spinlock_rt.c:84 (discriminator 5)) 
 ? do_anonymous_page (./include/linux/pgtable.h:114 mm/memory.c:4490) 
 ? __handle_mm_fault (mm/memory.c:3878 mm/memory.c:5300 mm/memory.c:5441) 
 ? syscall_exit_to_user_mode (kernel/entry/common.c:221) 
 ? __count_memcg_events (./include/linux/preempt.h:477 (discriminator 3) mm/memcontrol.c:704 (discriminator 3) mm/memcontrol.c:963 (discriminator 3)) 
 ? handle_mm_fault (mm/memory.c:5483 mm/memory.c:5622) 
 ? do_user_addr_fault (arch/x86/mm/fault.c:1443 (discriminator 1)) 
 ? clear_bhb_loop (arch/x86/entry/entry_64.S:1539) 
 ? clear_bhb_loop (arch/x86/entry/entry_64.S:1539) 
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
 RIP: 0033:0x7f55216c557b
 Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 68 0f 00 f7 d8 64 89 01 48
All code
========
   0:	ff                   	(bad)
   1:	ff                   	(bad)
   2:	ff 85 c0 79 9b 49    	incl   0x499b79c0(%rbp)
   8:	c7 c4 ff ff ff ff    	mov    $0xffffffff,%esp
   e:	5b                   	pop    %rbx
   f:	5d                   	pop    %rbp
  10:	4c 89 e0             	mov    %r12,%rax
  13:	41 5c                	pop    %r12
  15:	c3                   	ret
  16:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
  1d:	00 00 
  1f:	f3 0f 1e fa          	endbr64
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 75 68 0f 00 	mov    0xf6875(%rip),%rcx        # 0xf68af
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 75 68 0f 00 	mov    0xf6875(%rip),%rcx        # 0xf6885
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
 RSP: 002b:00007ffd867a78f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00007ffd867a7980 RCX: 00007f55216c557b
 RDX: 00007ffd867a7990 RSI: 0000000000008946 RDI: 0000000000000003
 RBP: 0000556fe43632e0 R08: 0000000000000003 R09: 0000000000000001
 R10: 0000000000000fff R11: 0000000000000246 R12: 0000556fe43632a0
 R13: 0000000000000018 R14: 0000000000000001 R15: 0000000000000000
  </TASK>

-- 
Sincerely,
John Wyatt
Software Engineer, Core Kernel
Red Hat



* Re: net: cxgb4: Call Trace reported with PREEMPT_RT: BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
  2024-04-23  4:10 net: cxgb4: Call Trace reported with PREEMPT_RT: BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718 John B. Wyatt IV
@ 2024-04-23 15:03 ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 2+ messages in thread
From: Luis Claudio R. Goncalves @ 2024-04-23 15:03 UTC
  To: John B. Wyatt IV
  Cc: Raju Rangoju, Sebastian Andrzej Siewior, Juri Lelli,
	Clark Williams, netdev, LKML

On Tue, Apr 23, 2024 at 12:10:10AM -0400, John B. Wyatt IV wrote:
> Hello Raju, Hello Sebastian,
> 
> Red Hat QE found this issue with cxgb4 only when the kernel has PREEMPT_RT set
> with the preempt-rt patchset:
> 
> git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git
> 
> We are also seeing this in the Real-time builds of RHEL9 and 8.
> 
> The specific build is an internal build that was pulled from the mirror Clark
> Williams set up for Fedora and RHEL testing.
> 
> https://gitlab.com/cki-project/kernel-ark/-/tree/os-build-rt?ref_type=heads
> 
> We use the branch: os-build-rt
> 
> I was unable to find the cause of this and I thought I should report it.
> 
> Please let me know if you have any questions or need any testing done.
> 
> Call trace is below:
> 
> kernel-rt-6.9.0-0.rc4.f8dba31b0a82.38.test.eln136.x86_64
>  BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
>  caller is cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
>  Hardware name: Dell Inc. PowerEdge R750/0WT8Y6, BIOS 1.5.4 12/17/2021
>  Call Trace:
>   <TASK>
>  dump_stack_lvl (lib/dump_stack.c:116) 
>  check_preemption_disabled (lib/smp_processor_id.c:49) 
>  cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
>  cxgb4_self_test+0x8f/0xe0 cxgb4
>  ethtool_self_test (net/ethtool/ioctl.c:2002) 
>  __dev_ethtool (net/ethtool/ioctl.c:2997) 

Hi John,

The patch below is untested but should fix the problem you reported:

======

cxgb4: fix smp_processor_id() usage in selftests

When PREEMPT_RT is enabled, the following call chain can result in a
"BUG: using smp_processor_id() in preemptible [00000000] code:
ethtool/xxxx" error message:

    ethtool_self_test()
        cxgb4_self_test()
            cxgb4_selftest_lb_pkt()
                __netif_tx_lock(q->txq, smp_processor_id());  <--- BOOM

Replacing smp_processor_id() with raw_smp_processor_id() is safe in this
context, given that __netif_tx_lock() is an inline function that takes a
spinlock and then records the cpu value as the lock owner.

Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 1948b7bf99661..803dc62a4db04 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2674,7 +2674,7 @@ int cxgb4_selftest_lb_pkt(struct net_device *netdev)
 	lb->loopback = 1;
 
 	q = &adap->sge.ethtxq[pi->first_qset];
-	__netif_tx_lock(q->txq, smp_processor_id());
+	__netif_tx_lock(q->txq, raw_smp_processor_id());
 
 	reclaim_completed_tx(adap, &q->q, -1, true);
 	credits = txq_avail(&q->q) - ndesc;
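
For readers following the reasoning in the commit message, here is a minimal
userspace sketch of the "take the lock, then record the cpu" pattern it
describes. This is not the kernel code: the struct and function names
(model_txq, model_tx_lock, model_tx_unlock) are hypothetical stand-ins, and a
C11 atomic is used in place of the kernel spinlock. The point it shows is that
the cpu argument is evaluated by the caller before the lock is taken and is
only stored as an owner tag afterwards.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the TX queue; not the kernel struct. */
struct model_txq {
	atomic_int locked;     /* models the queue's xmit spinlock (0 = free) */
	int xmit_lock_owner;   /* cpu recorded by the lock holder, -1 if none */
};

static void model_txq_init(struct model_txq *q)
{
	atomic_init(&q->locked, 0);
	q->xmit_lock_owner = -1;
}

/*
 * Models the pattern described above: acquire the lock first, then record
 * the caller-supplied cpu value. The cpu number was sampled by the caller
 * before this function ran, so it is a tag, not a guarantee about where
 * the task is running now.
 */
static void model_tx_lock(struct model_txq *q, int cpu)
{
	while (atomic_exchange_explicit(&q->locked, 1, memory_order_acquire))
		; /* spin until the lock is free */
	q->xmit_lock_owner = cpu;
}

/* Clears the owner tag, then releases the lock. */
static void model_tx_unlock(struct model_txq *q)
{
	q->xmit_lock_owner = -1;
	atomic_store_explicit(&q->locked, 0, memory_order_release);
}
```

A caller would use it as model_tx_lock(&q, cpu) ... model_tx_unlock(&q), where
cpu is whatever value was read beforehand. The debug check that fires in the
trace above is attached to reading the cpu number in preemptible context, not
to the store performed under the lock.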


