From: "ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	david <david@fromorbit.com>, Christoph Hellwig <hch@lst.de>,
	Alasdair Kergon <agk@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	Linux NVDIMM <nvdimm@lists.linux.dev>
Subject: RE: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
Date: Thu, 17 Jun 2021 08:12:49 +0000
Message-ID: <OSBPR01MB292031EC271D4AD843389A3FF40E9@OSBPR01MB2920.jpnprd01.prod.outlook.com>
In-Reply-To: <CAPcyv4ihuErfVWHL0F1OExQashutJjBdaLn5X5oPm44OkQ+a_A@mail.gmail.com>

> -----Original Message-----
> From: Dan Williams <dan.j.williams@intel.com>
> Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
> 
> On Wed, Jun 16, 2021 at 11:51 PM ruansy.fnst@fujitsu.com
> <ruansy.fnst@fujitsu.com> wrote:
> >
> > > -----Original Message-----
> > > From: Dan Williams <dan.j.williams@intel.com>
> > > Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for
> > > superblock
> > >
> > > [ drop old linux-nvdimm@lists.01.org, add nvdimm@lists.linux.dev ]
> > >
> > > > On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:
> > > >
> > > > > A memory failure that occurs in fsdax mode will ultimately be handled in
> > > > > the filesystem.  We introduce this interface to find the files or
> > > > > metadata affected by the corrupted range, and to try to recover the
> > > > > corrupted data if possible.
> > > >
> > > > Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> > > > ---
> > > >  include/linux/fs.h | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > > index c3c88fdb9b2a..92af36c4225f 100644
> > > > --- a/include/linux/fs.h
> > > > +++ b/include/linux/fs.h
> > > > @@ -2176,6 +2176,8 @@ struct super_operations {
> > > >                                   struct shrink_control *);
> > > >         long (*free_cached_objects)(struct super_block *,
> > > >                                     struct shrink_control *);
> > > > > +       int (*corrupted_range)(struct super_block *sb, struct block_device *bdev,
> > > > > +                              loff_t offset, size_t len, void *data);
> > >
> > > Why does the superblock need a new operation? Wouldn't whatever
> > > function is specified here just be specified to the dax_dev as the
> > > ->notify_failure() holder callback?
> >
> > Because we need to find out which file is affected by the given poisoned page so
> > that the memory-failure code can do the collect_procs() and kill_procs() jobs.  And it
> > needs the filesystem to use its rmap feature to find the file from a given offset.
> > So, we need this implemented by the specific filesystem and called by
> > dax_device's holder.
> >
> > This is the call trace I described in cover letter:
> > memory_failure()
> >  * fsdax case
> >  pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
> >   dax_device->holder_ops->corrupted_range() =>
> >                                       - fs_dax_corrupted_range()
> >                                       - md_dax_corrupted_range()
> >    sb->s_ops->corrupted_range()    => xfs_fs_corrupted_range()  <== **HERE**
> >     xfs_rmap_query_range()
> >      xfs_currupt_helper()
> >       * corrupted on metadata
> >           try to recover data, call xfs_force_shutdown()
> >       * corrupted on file data
> >           try to recover data, call mf_dax_kill_procs()
> >  * normal case
> >  mf_generic_kill_procs()
> >
> > As you can see, this newly added operation is an important part of the whole
> > process.
> 
> I don't think you need either fs_dax_corrupted_range() or
> sb->s_ops->corrupted_range(). In fact, fs_dax_corrupted_range()
> looks broken because the filesystem may not even be mounted on the device
> associated with the error.

If the filesystem is not mounted, then no process will be using the broken page, and nothing needs to be killed in memory-failure.  So, I think we can just return and handle the error at the driver level if needed.

> The holder_data and holder_op should be sufficient
> for communicating the stack of notifications:
> 
> pgmap->notify_memory_failure() => pmem_pgmap_notify_failure()
> pmem_dax_dev->holder_ops->notify_failure(pmem_dax_dev) =>
> md_dax_notify_failure()
> md_dax_dev->holder_ops->notify_failure() => xfs_notify_failure()
> 
> I.e. the entire chain just walks dax_dev holder ops.

Oh, I see.  We just need to implement holder_ops in the filesystem or mapped_device directly.  I made the routine too complicated.
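
To check my understanding, here is a rough user-space model of the chain described above, where each layer only walks the dax_dev holder ops.  This is a sketch, not the kernel code: the struct layouts, the fake_fs/fake_md types, the dax_holder_notify_failure() helper and the linear offset translation are all assumptions made up for illustration.  Only the names notify_failure, holder_data, holder_ops, md_dax_notify_failure and xfs_notify_failure come from this thread.

```c
/* User-space model only: hypothetical stand-ins, not the kernel's definitions. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct dax_device;

struct dax_holder_operations {
	int (*notify_failure)(struct dax_device *dax_dev, long long offset, size_t len);
};

struct dax_device {
	const char *name;
	void *holder_data;                              /* whoever sits on top of this device */
	const struct dax_holder_operations *holder_ops; /* NULL when nothing holds the device */
};

/* Forward a failure to the holder of dax_dev, if there is one. */
static int dax_holder_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	assert(dax_dev != NULL);
	if (!dax_dev->holder_ops || !dax_dev->holder_ops->notify_failure) {
		/* No filesystem or mapped device on top: nobody to notify. */
		printf("%s: no holder, nothing to notify\n", dax_dev->name);
		return 0;
	}
	return dax_dev->holder_ops->notify_failure(dax_dev, offset, len);
}

/* Filesystem end of the chain (assumed shape). */
struct fake_fs {
	const char *name;
	int failures_seen;
	long long last_offset;
};

static int xfs_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	struct fake_fs *fs = dax_dev->holder_data;

	fs->failures_seen++;
	fs->last_offset = offset;
	/* A real filesystem would query its rmap here to find the owners. */
	printf("%s: failure at offset %lld, len %zu\n", fs->name, offset, len);
	return 0;
}

/* Mapped device: holder of the lower dax_dev, and owner of its own dax_dev. */
struct fake_md {
	struct dax_device md_dax_dev;
	long long start; /* assumed linear mapping: offset of this target on the lower device */
};

static int md_dax_notify_failure(struct dax_device *dax_dev, long long offset, size_t len)
{
	struct fake_md *md = dax_dev->holder_data;

	/* Translate to the mapped device's address space, then keep walking up. */
	return dax_holder_notify_failure(&md->md_dax_dev, offset - md->start, len);
}

static const struct dax_holder_operations xfs_holder_ops = {
	.notify_failure = xfs_notify_failure,
};

static const struct dax_holder_operations md_holder_ops = {
	.notify_failure = md_dax_notify_failure,
};
```

In this model the pmem driver would call dax_holder_notify_failure() on its own dax_dev, and the same call covers both the stacked case (pmem, then md, then the filesystem) and the case where nothing holds the device, which simply returns.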


--
Thanks,
Ruan.

