OCFS2-Devel Archive mirror
 help / color / mirror / Atom feed
From: Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>
To: Joseph Qi <joseph.qi@linux.alibaba.com>,
	Joseph Qi <jiangqi903@gmail.com>,
	ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] report BUG: io_uring triggers umount error
Date: Wed, 1 Mar 2023 17:49:17 +0800	[thread overview]
Message-ID: <ce41878f-f56d-b365-9dd9-89c3f8a26cc4@suse.com> (raw)
In-Reply-To: <50c26dfd-afd4-5dc2-212c-a02e84a75440@linux.alibaba.com>

On 3/1/23 10:06 AM, Joseph Qi wrote:
> Hi,
> 
> On 2/27/23 10:59 AM, Joseph Qi via Ocfs2-devel wrote:
>>
>>
>> On 2/26/23 11:22 PM, Heming Zhao wrote:
>>> On Sun, Feb 26, 2023 at 04:45:44PM +0800, Heming Zhao via Ocfs2-devel wrote:
>>>> On Fri, Feb 24, 2023 at 06:59:47PM +0800, Joseph Qi wrote:
>>>>>
>>>>>
>>>>> On 2/24/23 4:03 PM, Heming Zhao wrote:
>>>>>> On 2/24/23 3:52 PM, Joseph Qi wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2/24/23 3:48 PM, Heming Zhao via Ocfs2-devel wrote:
>>>>>>>> On 2/24/23 2:54 PM, Joseph Qi wrote:
>>>>>>>>> I can reproduce this in my local VM.
>>>>>>>>> I've traced ocfs2_dismount_volume and found that it hasn't been called.
>>>>>>>>> So EBUSY is returned in VFS layer. I guess something wrong when doing
>>>>>>>>> a copy with linked SQEs (normal copy seems no problem).
>>>>>>>>>
>>>>>>>>
>>>>>>>> I am inclined to agree with you. I also test liburing examples apps
>>>>>>>> on ext4 partition, everything looks fine.
>>>>>>>>
>>>>>>>> I used below bpftrace method, the retval is '3'.
>>>>>>>>    bpftrace -e 'kr:mnt_get_count{printf("%d\n", retval);}'
>>>>>>>>
>>>>>>>> It responds to flow: path_umount() => do_umount => mnt_get_count (gets '3')
>>>>>>>>
>>>>>>> Yes, that's the place return EBUSY.
>>>>>>> So the problem seems to be getmnt/putmnt not match in this case.
>>>>>>>
>>>>>>
>>>>>> I didn't familiar with setting up kernel bi-search env. I used one last year
>>>>>> openSUSE tumblweed (with kernel 5.16.2), this umount issue doesn't exist.
>>>>>> So there is a possibility one ocfs2 commit introduced this issue.
>>>>> You can checkout each mailine version like Linux 6.0, 6.1, ... and try
>>>>> to check if it can be reproduced.
>>>>
>>>> I drop this method, my machine cpu is old, compiling kernel will take too much
>>>> time.
>>>>
>>>>>
>>>>> I've tried trace mntget/mntput using the following bpftrace script,
>>>>> link-cp output shows it misses a fput.
>>>>>
>>>>> #include <linux/mount.h>
>>>>> #include <linux/string.h>
>>>>>
>>>>> kprobe:mntget
>>>>> {
>>>>> 	$n = ((struct vfsmount *)arg0)->mnt_sb->s_type->name;
>>>>>
>>>>> 	if (!strncmp(str($n), "ocfs2", 5)) {
>>>>> 		@[comm] += 1;
>>>>>
>>>>> 		printf("%s", kstack);
>>>>> 	}
>>>>> }
>>>>>
>>>>> kprobe:mntput
>>>>> {
>>>>> 	$n = ((struct vfsmount *)arg0)->mnt_sb->s_type->name;
>>>>>
>>>>> 	if (!strncmp(str($n), "ocfs2", 5)) {
>>>>> 		@[comm] +=1;
>>>>>
>>>>> 		printf("%s", kstack);
>>>>> 	}
>>>>> }
>>>>>
>>>>
>>>> I used below script to find source file hold f_count.
>>>>
>>>> e.g: ./link-cp /mnt/aa.bin /mnt/1
>>>> The script will show f_count of aa.bin only down to 3. We expect f_count value
>>>> down to 1 then trigger mntput.
>>>>
>>>> ```
>>>> #include <linux/mount.h>
>>>> #include <linux/string.h>
>>>> #include <linux/dcache.h>
>>>> #include <linux/path.h>
>>>> #include <linux/fs_struct.h>
>>>> #include <linux/fs.h>
>>>> #include <linux/io_uring_types.h>
>>>>
>>>> k:fput
>>>> {
>>>> 	$f = (struct file *)arg0;
>>>> 	$cnt = $f->f_count;
>>>>
>>>> 	if (!strncmp(comm, "link-cp", 7)) {
>>>> 		printf("\n [link-cp] fput() name:%s f_count:%lld <%p> mnt:%p\n",
>>>> 			   	str($f->f_path.dentry->d_name.name), $cnt.counter, kptr($f), kptr($f->f_path.mnt));
>>>> 		//printf("%s\n", kstack);
>>>> 	}
>>>> }
>>>> ```
>>>>
>>>
>>> I am not familiar with io_uring, and can't make sure my analysis is correct.
>>> Because ocfs2_file_read_iter() returns -EOPNOTSUPP, then fails
>>> io_iter_do_read(), the io_issue_sqe missing fput() which causes umount failure
>>> issue.
>>>
>>> io_issue_sqe
>>>   + io_assign_file //call fget(), but missing fput()
>>>   + def->issue()
>>>   |  io_read
>>>   |   + io_iter_do_read
>>>   |   |  ocfs2_file_read_iter
>>>   |   |   return: -EOPNOTSUPP //iocb->ki_flags:0x8
>>>   |   + kiocb_done
>>>   |      + io_rw_done(&rw->kiocb, ret)
>>>   |      |   kiocb->ki_complete(kiocb, ret)
>>>   |      |    io_complete_rw
>>>   |      |      __io_complete_rw_common
>>>   |      |       + req->flags |= REQ_F_REISSUE | REQ_F_PARTIAL_IO;
>>>   |      |       + return true
>>>   |      + io_req_task_queue_reissue/io_req_task_queue_fail
>>>   |      + return IOU_ISSUE_SKIP_COMPLETE;
>>>   |
>>>   + return 0;
>>>
>> Seems not the case.
>> Reissue will queue work to iowq, and fput will finally get called in
>> io_wq_free_work() once work is handled.
>>
> I've sent a fix to io_uring, please refer:
> https://lore.kernel.org/io-uring/167758917178.12826.13481934362166793957.b4-ty@kernel.dk/T/#t
> 
> Thanks,
> Joseph
> 

Got it. Thank you for your quickly fix.

Let's make ocfs2 more powerful/stable.



- Heming

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

      reply	other threads:[~2023-03-01  9:49 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-24  0:32 [Ocfs2-devel] report BUG: io_uring triggers umount error Heming Zhao via Ocfs2-devel
2023-02-24  6:54 ` Joseph Qi via Ocfs2-devel
2023-02-24  7:48   ` Heming Zhao via Ocfs2-devel
2023-02-24  7:52     ` Joseph Qi via Ocfs2-devel
2023-02-24  8:03       ` Heming Zhao via Ocfs2-devel
2023-02-24 10:59         ` Joseph Qi via Ocfs2-devel
2023-02-26  8:45           ` Heming Zhao via Ocfs2-devel
2023-02-26 15:22             ` Heming Zhao via Ocfs2-devel
2023-02-27  2:59               ` Joseph Qi via Ocfs2-devel
2023-02-28  4:50                 ` Joseph Qi via Ocfs2-devel
2023-03-01  2:06                 ` Joseph Qi via Ocfs2-devel
2023-03-01  9:49                   ` Heming Zhao via Ocfs2-devel [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ce41878f-f56d-b365-9dd9-89c3f8a26cc4@suse.com \
    --to=ocfs2-devel@oss.oracle.com \
    --cc=heming.zhao@suse.com \
    --cc=jiangqi903@gmail.com \
    --cc=joseph.qi@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).