Fix for the insmod/rmmod netfs bug

See attached

On Mon, May 13, 2024 at 2:34 AM Steve French wrote:
>
> The problem with the recent netfs/folio series is easy to repro, and
> doesn't show up if I remove the mempools patch:
>
> Author: David Howells
> Date: Fri Mar 15 18:03:30 2024 +0000
>
> cifs: Add mempools for cifs_io_request and cifs_io_subrequest structs
>
> Add mempools for the allocation of cifs_io_request and cifs_io_subrequest
> structs for netfslib to use so that it can guarantee eventual allocation in
> writeback.
>
> Repro is just to do modprobe and then rmmod
>
> [root@fedora29 xfstests-dev]# modprobe cifs
> [root@fedora29 xfstests-dev]# dmesg -c
> [ 589.547809] Key type cifs.spnego registered
> [ 589.547857] Key type cifs.idmap registered
> [root@fedora29 xfstests-dev]# rmmod cifs
> Segmentation fault
>
> [ 593.793058] RIP: 0010:free_large_kmalloc+0x78/0xb0
> [ 593.793063] Code: 74 0a 5d 41 5c 41 5d c3 cc cc cc cc 48 89 ef 5d
> 41 5c 41 5d e9 99 06 f4 ff 48 c7 c6 50 cf 38 9d 48 89 ef e8 7a f4 f8
> ff 0f 0b <0f> 0b 80 3d a6 3d 91 02 00 41 bc 00 f0 ff ff 75 a2 4c 89 ee
> 48 c7
> [ 593.793068] RSP: 0018:ff1100011ceafe00 EFLAGS: 00010246
> [ 593.793074] RAX: 0017ffffc0000000 RBX: 1fe22000239d5fc6 RCX: dffffc0000000000
> [ 593.793078] RDX: ffd4000009265808 RSI: ffffffffc1960140 RDI: ffd4000009265800
> [ 593.793082] RBP: ffd4000009265800 R08: ffffffff9b287a70 R09: 0000000000000001
> [ 593.793086] R10: ffffffff9df472e7 R11: 0000000000000001 R12: ffffffffc195ff60
> [ 593.793090] R13: ffffffffc1960140 R14: 0000000000000000 R15: 0000000000000000
> [ 593.793093] FS: 00007fd5849cc280(0000) GS:ff110004cb200000(0000)
> knlGS:0000000000000000
> [ 593.793098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 593.793101] CR2: 000055c6c44c7d58 CR3: 000000010da2a004 CR4: 0000000000371ef0
> [ 593.793110] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 593.793114] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 593.793118] Call Trace:
> [ 593.793121]
> [ 593.793125] ? __warn+0xa4/0x220
> [ 593.793133] ? free_large_kmalloc+0x78/0xb0
> [ 593.793140] ? report_bug+0x1d4/0x1e0
> [ 593.793151] ? handle_bug+0x42/0x80
> [ 593.793158] ? exc_invalid_op+0x18/0x50
> [ 593.793164] ? asm_exc_invalid_op+0x1a/0x20
> [ 593.793178] ? rcu_is_watching+0x20/0x50
> [ 593.793188] ? free_large_kmalloc+0x78/0xb0
> [ 593.793197] exit_cifs+0x89/0x6a0 [cifs]
> [ 593.793363] __do_sys_delete_module.constprop.0+0x23f/0x450
> [ 593.793370] ? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
> [ 593.793375] ? mark_held_locks+0x24/0x90
> [ 593.793383] ? __x64_sys_close+0x54/0xa0
> [ 593.793388] ? lockdep_hardirqs_on_prepare+0x139/0x200
> [ 593.793394] ? kasan_quarantine_put+0x97/0x1f0
> [ 593.793404] ? mark_held_locks+0x24/0x90
> [ 593.793414] do_syscall_64+0x78/0x180
> [ 593.793421] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 593.793427] RIP: 0033:0x7fd584aecd4b
> [ 593.793433] Code: 73 01 c3 48 8b 0d 3d 11 0c 00 f7 d8 64 89 01 48
> 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00
> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 11 0c 00 f7 d8 64 89
> 01 48
> [ 593.793437] RSP: 002b:00007ffe0a36ec18 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [ 593.793443] RAX: ffffffffffffffda RBX: 000055c6c44bd7a0 RCX: 00007fd584aecd4b
> [ 593.793447] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055c6c44bd808
> [ 593.793451] RBP: 0000000000000000 R08: 00007ffe0a36db91 R09: 0000000000000000
> [ 593.793454] R10: 00007fd584b5eae0 R11: 0000000000000206 R12: 00007ffe0a36ee40
> [ 593.793458] R13: 00007ffe0a3706d1 R14: 000055c6c44bd260 R15: 000055c6c44bd7a0
> [ 593.793474]
> [ 593.793477] irq event stamp: 12729
> [ 593.793480] hardirqs last enabled at (12735): []
> console_unlock+0x15b/0x170
> [ 593.793487] hardirqs last disabled at (12740): []
> console_unlock+0x140/0x170
> [ 593.793492] softirqs last enabled at (11910): []
> __irq_exit_rcu+0xfe/0x120
> [ 593.793498] softirqs last disabled at (11901): []
> __irq_exit_rcu+0xfe/0x120
> [ 593.793503] ---[ end trace 0000000000000000 ]---
> [ 593.793546] object pointer: 0x00000000da6e868b
> [ 593.793550] ==================================================================
> [ 593.793553] BUG: KASAN: invalid-free in exit_cifs+0x89/0x6a0 [cifs]
> [ 593.793698] Free of addr ffffffffc1960140 by task rmmod/1306
>
> [ 593.793703] CPU: 4 PID: 1306 Comm: rmmod Tainted: G W
> 6.9.0 #1
> [ 593.793707] Hardware name: Red Hat KVM, BIOS 1.16.1-1.el9 04/01/2014
> [ 593.793709] Call Trace:
> [ 593.793711]
> [ 593.793714] dump_stack_lvl+0x79/0xb0
> [ 593.793718] print_report+0xcb/0x620
> [ 593.793724] ? exit_cifs+0x89/0x6a0 [cifs]
> [ 593.793861] ? exit_cifs+0x89/0x6a0 [cifs]
> [ 593.794002] kasan_report_invalid_free+0x9a/0xc0
> [ 593.794008] ? exit_cifs+0x89/0x6a0 [cifs]
> [ 593.794173] free_large_kmalloc+0x38/0xb0
> [ 593.794178] exit_cifs+0x89/0x6a0 [cifs]
> [ 593.794327] __do_sys_delete_module.constprop.0+0x23f/0x450
> [ 593.794331] ? __pfx___do_sys_delete_module.constprop.0+0x10/0x10
> [ 593.794335] ? mark_held_locks+0x24/0x90
> [ 593.794339] ? __x64_sys_close+0x54/0xa0
> [ 593.794342] ? lockdep_hardirqs_on_prepare+0x139/0x200
> [ 593.794347] ? kasan_quarantine_put+0x97/0x1f0
> [ 593.794352] ? mark_held_locks+0x24/0x90
> [ 593.794357] do_syscall_64+0x78/0x180
> [ 593.794361] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 593.794367] RIP: 0033:0x7fd584aecd4b
> [ 593.794370] Code: 73 01 c3 48 8b 0d 3d 11 0c 00 f7 d8 64 89 01 48
> 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00
> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 11 0c 00 f7 d8 64 89
> 01 48
> [ 593.794373] RSP: 002b:00007ffe0a36ec18 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [ 593.794377] RAX: ffffffffffffffda RBX: 000055c6c44bd7a0 RCX: 00007fd584aecd4b
> [ 593.794380] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055c6c44bd808
> [ 593.794382] RBP: 0000000000000000 R08: 00007ffe0a36db91 R09: 0000000000000000
> [ 593.794385] R10: 00007fd584b5eae0 R11: 0000000000000206 R12: 00007ffe0a36ee40
> [ 593.794387] R13: 00007ffe0a3706d1 R14: 000055c6c44bd260 R15: 000055c6c44bd7a0
> [ 593.794394]
>
> [ 593.794398] The buggy address belongs to the variable:
> [ 593.794399] cifs_io_subrequest_pool+0x0/0xfffffffffff3dec0 [cifs]
>
> [ 593.794557] Memory state around the buggy address:
> [ 593.794559] ffffffffc1960000: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9
> f9 f9 f9 f9
> [ 593.794562] ffffffffc1960080: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9
> f9 f9 f9 f9
> [ 593.794565] >ffffffffc1960100: 00 00 f9 f9 f9 f9 f9 f9 00 00 00 00
> 00 00 00 00
> [ 593.794567] ^
> [ 593.794570] ffffffffc1960180: 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 f9
> [ 593.794572] ffffffffc1960200: f9 f9 f9 f9 00 00 00 00 00 00 00 00
> 00 00 00 00
> [ 593.794575] ==================================================================
>
> On Sat, May 11, 2024 at 12:59 PM Steve French wrote:
> >
> > This was running against linux-next as of about an hour ago
> >
> > On Sat, May 11, 2024 at 12:53 PM Steve French wrote:
> > >
> > > Tried running the regression tests against for-next and saw crash
> > > early in the test run in
> > >
> > > # FS QA Test No. cifs/006
> > > #
> > > # check deferred closes on handles of deleted files
> > > #
> > > umount: /mnt/test: not mounted.
> > > umount: /mnt/test: not mounted.
> > > umount: /mnt/scratch: not mounted.
> > > umount: /mnt/scratch: not mounted.
> > > ./run-xfstests.sh: line 25: 4556 Segmentation fault rmmod cifs
> > > modprobe: ERROR: could not insert 'cifs': Device or resource busy
> > >
> > > More information here:
> > > http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/5/builds/123/steps/14/logs/stdio
> > >
> > > Are you also seeing that? There are not many likely candidates for
> > > what patch is causing the problem (could be related to the folios
> > > changes) e.g.
> > >
> > > 7c1ac89480e8 cifs: Enable large folio support
> > > 3ee1a1fc3981 cifs: Cut over to using netfslib
> > > 69c3c023af25 cifs: Implement netfslib hooks
> > > c20c0d7325ab cifs: Make add_credits_and_wake_if() clear deducted credits
> > > edea94a69730 cifs: Add mempools for cifs_io_request and
> > > cifs_io_subrequest structs
> > > 3758c485f6c9 cifs: Set zero_point in the copy_file_range() and
> > > remap_file_range()
> > > 1a5b4edd97ce cifs: Move cifs_loose_read_iter() and
> > > cifs_file_write_iter() to file.c
> > > dc5939de82f1 cifs: Replace the writedata replay bool with a netfs sreq flag
> > > 56257334e8e0 cifs: Make wait_mtu_credits take size_t args
> > > ab58fbdeebc7 cifs: Use more fields from netfs_io_subrequest
> > > a975a2f22cdc cifs: Replace cifs_writedata with a wrapper around
> > > netfs_io_subrequest
> > > 753b67eb630d cifs: Replace cifs_readdata with a wrapper around
> > > netfs_io_subrequest
> > > 0f7c0f3f5150 cifs: Use alternative invalidation to using launder_folio
> > > 2e9d7e4b984a mm: Remove the PG_fscache alias for PG_private_2
> > >
> > > Any ideas?
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> > >
> > >
> > --
> > Thanks,
> >
> > Steve
> >
>
> --
> Thanks,
>
> Steve

--
Thanks,

Steve