From: "ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	david <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	Alasdair Kergon <agk@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	Linux NVDIMM <nvdimm@lists.linux.dev>
Subject: RE: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
Date: Thu, 17 Jun 2021 08:12:49 +0000
Message-ID: <OSBPR01MB292031EC271D4AD843389A3FF40E9@OSBPR01MB2920.jpnprd01.prod.outlook.com>
In-Reply-To: <CAPcyv4ihuErfVWHL0F1OExQashutJjBdaLn5X5oPm44OkQ+a_A@mail.gmail.com>

> -----Original Message-----
> From: Dan Williams <dan.j.williams@intel.com>
> Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
>
> On Wed, Jun 16, 2021 at 11:51 PM ruansy.fnst@fujitsu.com
> <ruansy.fnst@fujitsu.com> wrote:
> >
> > > -----Original Message-----
> > > From: Dan Williams <dan.j.williams@intel.com>
> > > Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for
> > > superblock
> > >
> > > [ drop old linux-nvdimm@lists.01.org, add nvdimm@lists.linux.dev ]
> > >
> > > On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> > > >
> > > > Memory failure occurs in fsdax mode will finally be handled in
> > > > filesystem. We introduce this interface to find out files or
> > > > metadata affected by the corrupted range, and try to recover the
> > > > corrupted data if possiable.
> > > >
> > > > Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> > > > ---
> > > >  include/linux/fs.h | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > index c3c88fdb9b2a..92af36c4225f 100644
> > > > --- a/include/linux/fs.h
> > > > +++ b/include/linux/fs.h
> > > > @@ -2176,6 +2176,8 @@ struct super_operations {
> > > >                                   struct shrink_control *);
> > > >         long (*free_cached_objects)(struct super_block *,
> > > >                                     struct shrink_control *);
> > > > +       int (*corrupted_range)(struct super_block *sb, struct block_device *bdev,
> > > > +                              loff_t offset, size_t len, void *data);
> > >
> > > Why does the superblock need a new operation? Wouldn't whatever
> > > function is specified here just be specified to the dax_dev as the
> > > ->notify_failure() holder callback?
> >
> > Because we need to find out which file is effected by the given poison
> > page so that memory-failure code can do collect_procs() and kill_procs()
> > jobs. And it needs filesystem to use its rmap feature to search the file
> > from a given offset. So, we need this implemented by the specified
> > filesystem and called by dax_device's holder.
> >
> > This is the call trace I described in cover letter:
> > memory_failure()
> >  * fsdax case
> >  pgmap->ops->memory_failure() => pmem_pgmap_memory_failure()
> >   dax_device->holder_ops->corrupted_range() =>
> >                                       - fs_dax_corrupted_range()
> >                                       - md_dax_corrupted_range()
> >    sb->s_ops->currupted_range() => xfs_fs_corrupted_range()  <== **HERE**
> >     xfs_rmap_query_range()
> >      xfs_currupt_helper()
> >       * corrupted on metadata
> >           try to recover data, call xfs_force_shutdown()
> >       * corrupted on file data
> >           try to recover data, call mf_dax_kill_procs()
> >  * normal case
> >  mf_generic_kill_procs()
> >
> > As you can see, this new added operation is an important for the whole
> > progress.
>
> I don't think you need either fs_dax_corrupted_range() nor
> sb->s_ops->corrupted_range(). In fact that fs_dax_corrupted_range()
> looks broken because the filesystem may not even be mounted on the device
> associated with the error.

If the filesystem is not mounted, then no process can be using the broken
page, and no one needs to be killed in memory-failure. So I think we can
just return and handle the error at the driver level if needed.

> The holder_data and holder_op should be sufficient
> from communicating the stack of notifications:
>
>     pgmap->notify_memory_failure() => pmem_pgmap_notify_failure()
>     pmem_dax_dev->holder_ops->notify_failure(pmem_dax_dev) => md_dax_notify_failure()
>     md_dax_dev->holder_ops->notify_failure() => xfs_notify_failure()
>
> I.e. the entire chain just walks dax_dev holder ops.

Oh, I see. We just need to implement holder_ops in the filesystem or
mapped_device directly. I made the routine too complicated.


--
Thanks,
Ruan.
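
The holder-ops walk described above can be modeled in userspace. What
follows is only a sketch under stated assumptions, not kernel code: the
struct layouts, the notify_failure() signature, the toy_fs and toy_md
types, dax_notify_failure(), and the offset translation in the md layer
are all hypothetical. Only the names dax_device, holder_ops, holder_data
and ->notify_failure() are taken from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model: these mimic names from the thread, not kernel definitions. */
struct dax_device;

struct dax_holder_operations {
	/* Hypothetical signature; the thread only names ->notify_failure(). */
	int (*notify_failure)(struct dax_device *dax_dev, long long offset, size_t len);
};

struct dax_device {
	const char *name;
	void *holder_data;                               /* layer stacked on this device */
	const struct dax_holder_operations *holder_ops;  /* how to notify that layer */
};

/* Hypothetical helper: pass a failure to whoever holds the device. */
static int dax_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	assert(dax_dev != NULL);
	if (!dax_dev->holder_ops || !dax_dev->holder_ops->notify_failure)
		return -EOPNOTSUPP;  /* no holder: caller would fall back to generic handling */
	return dax_dev->holder_ops->notify_failure(dax_dev, offset, len);
}

/* Toy filesystem at the top of the stack (stands in for xfs_notify_failure()). */
struct toy_fs {
	int mounted;
	int failures_seen;
	long long last_offset;
};

static int toy_fs_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	struct toy_fs *fs = dax_dev->holder_data;

	if (!fs->mounted)
		return 0;  /* nothing mapped, so nothing to kill */
	fs->failures_seen++;
	fs->last_offset = offset;
	printf("%s: fs notified, offset=%lld len=%zu\n", dax_dev->name, offset, len);
	return 0;
}

static const struct dax_holder_operations toy_fs_holder_ops = {
	.notify_failure = toy_fs_notify_failure,
};

/* Toy mapped device in the middle (stands in for md_dax_notify_failure()). */
struct toy_md {
	struct dax_device md_dax_dev;  /* the dax_device this layer exports upward */
	long long start;               /* assumed: where the target begins on the lower device */
};

static int toy_md_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	struct toy_md *md = dax_dev->holder_data;

	/* Translate the offset (an assumption) and keep walking holder ops. */
	return dax_notify_failure(&md->md_dax_dev, offset - md->start, len);
}

static const struct dax_holder_operations toy_md_holder_ops = {
	.notify_failure = toy_md_notify_failure,
};

/* Build pmem -> md -> fs and inject one failure; returns failures the fs saw. */
static int demo_stacked_notify(void)
{
	struct toy_fs fs = { .mounted = 1 };
	struct toy_md md = {
		.md_dax_dev = { "md_dax_dev", &fs, &toy_fs_holder_ops },
		.start = 4096,
	};
	struct dax_device pmem = { "pmem_dax_dev", &md, &toy_md_holder_ops };
	int rc = dax_notify_failure(&pmem, 8192, 4096);

	return rc ? rc : fs.failures_seen;
}
```

Calling demo_stacked_notify() from a main() walks pmem_dax_dev to the md
layer to the filesystem using nothing but each device's holder_ops, so no
superblock operation is involved in this model.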