From: Alistair Popple <apopple@nvidia.com>
To: linux-mm@kvack.org
Cc: david@fromorbit.com, dan.j.williams@intel.com,
	jhubbard@nvidia.com, rcampbell@nvidia.com, willy@infradead.org,
	jgg@nvidia.com, linux-fsdevel@vger.kernel.org, jack@suse.cz,
	djwong@kernel.org, hch@lst.de, david@redhat.com,
	ruansy.fnst@fujitsu.com, nvdimm@lists.linux.dev,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	jglisse@redhat.com, Alistair Popple <apopple@nvidia.com>
Subject: [RFC 00/10] fs/dax: Fix FS DAX page reference counts
Date: Thu, 11 Apr 2024 10:57:21 +1000
Message-ID: <cover.fe275e9819458a4bbb9451b888cafb88af8867d4.1712796818.git-series.apopple@nvidia.com>

FS DAX pages have always maintained their own page reference counts
without following the normal rules for page reference counting. In
particular, pages are considered free when the refcount hits one rather
than zero, and no reference is taken when a page is mapped.

Tracking this requires special PTE bits (PTE_DEVMAP) and a secondary
mechanism for allowing GUP to hold references on the page (see
get_dev_pagemap). However, there doesn't seem to be any reason why FS
DAX pages need their own reference counting scheme.

This RFC is an initial attempt at removing the special reference
counting and instead refcounting FS DAX pages the same way as normal
pages.
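
As a rough illustration only, the two conventions described above can be
modelled in a few lines of userspace C. This is a toy sketch, not kernel
code: struct toy_page and every legacy_*/normal_*/toy_* helper are
hypothetical names invented for the example, and the real struct page,
GUP and mapping paths are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page. Hypothetical; NOT the kernel's struct page. */
struct toy_page {
	int refcount;
};

/* Generic get/put used by both conventions. */
void toy_get(struct toy_page *p)
{
	p->refcount++;
}

void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);	/* catch underflow in the model */
	p->refcount--;
}

/*
 * Legacy FS DAX convention as described in the text: an idle page
 * rests at a refcount of one, and mapping it takes no reference.
 */
void legacy_init(struct toy_page *p)
{
	p->refcount = 1;
}

void legacy_map(struct toy_page *p)
{
	(void)p;			/* deliberately no refcount change */
}

bool legacy_is_free(const struct toy_page *p)
{
	return p->refcount == 1;
}

/*
 * Normal page convention: a page is free at zero and every mapping
 * holds its own reference.
 */
void normal_init(struct toy_page *p)
{
	p->refcount = 0;
}

void normal_map(struct toy_page *p)
{
	toy_get(p);
}

void normal_unmap(struct toy_page *p)
{
	toy_put(p);
}

bool normal_is_free(const struct toy_page *p)
{
	return p->refcount == 0;
}
```

In this model a page that has gone through legacy_map() still satisfies
legacy_is_free(), so the refcount alone cannot tell a mapped page from an
unused one. After normal_map() the page is visibly in use until the
matching normal_unmap().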

There are still a couple of rough edges - in particular I haven't
completely removed the devmap PTE bit references from arch specific
code and there is probably some more cleanup of dev_pagemap reference
counting that could be done, particularly in mm/gup.c. I also haven't
yet compiled on anything other than x86_64.

Before continuing further with this clean-up, though, I would appreciate
some feedback on the viability of this approach and any issues I may
have overlooked, as I am not intimately familiar with the FS DAX code
(or, for that matter, the FS layer in general).

I have of course run some basic testing which didn't reveal any
problems.

Signed-off-by: Alistair Popple <apopple@nvidia.com>

Alistair Popple (10):
  mm/gup.c: Remove redundant check for PCI P2PDMA page
  mm/hmm: Remove dead check for HugeTLB and FS DAX
  pci/p2pdma: Don't initialise page refcount to one
  fs/dax: Don't track page mapping/index
  fs/dax: Refactor wait for dax idle page
  fs/dax: Add dax_page_free callback
  mm: Allow compound zone device pages
  fs/dax: Properly refcount fs dax pages
  mm/khugepage.c: Warn if trying to scan devmap pmd
  mm: Remove pXX_devmap

 Documentation/mm/arch_pgtable_helpers.rst    |   6 +-
 arch/arm64/include/asm/pgtable.h             |  24 +---
 arch/powerpc/include/asm/book3s/64/pgtable.h |  42 +-----
 arch/powerpc/mm/book3s64/hash_pgtable.c      |   3 +-
 arch/powerpc/mm/book3s64/pgtable.c           |   8 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c     |   5 +-
 arch/powerpc/mm/pgtable.c                    |   2 +-
 arch/x86/include/asm/pgtable.h               |  31 +---
 drivers/dax/super.c                          |   2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c       |   2 +-
 drivers/nvdimm/pmem.c                        |  10 +-
 drivers/pci/p2pdma.c                         |   4 +-
 fs/dax.c                                     | 158 +++++++-----------
 fs/ext4/inode.c                              |   5 +-
 fs/fuse/dax.c                                |   4 +-
 fs/fuse/virtio_fs.c                          |   8 +-
 fs/userfaultfd.c                             |   2 +-
 fs/xfs/xfs_file.c                            |   4 +-
 include/linux/dax.h                          |  16 ++-
 include/linux/huge_mm.h                      |  11 +-
 include/linux/memremap.h                     |  12 +-
 include/linux/migrate.h                      |   2 +-
 include/linux/mm.h                           |  41 +-----
 include/linux/page-flags.h                   |   6 +-
 include/linux/pgtable.h                      |  17 +--
 lib/test_hmm.c                               |   2 +-
 mm/debug_vm_pgtable.c                        |  51 +------
 mm/gup.c                                     | 165 +------------------
 mm/hmm.c                                     |  40 +----
 mm/huge_memory.c                             | 180 +++++++++-----------
 mm/internal.h                                |   2 +-
 mm/khugepaged.c                              |   2 +-
 mm/mapping_dirty_helpers.c                   |   4 +-
 mm/memory-failure.c                          |   6 +-
 mm/memory.c                                  | 109 ++++++++----
 mm/memremap.c                                |  36 +---
 mm/migrate_device.c                          |   6 +-
 mm/mm_init.c                                 |   5 +-
 mm/mprotect.c                                |   2 +-
 mm/mremap.c                                  |   5 +-
 mm/page_vma_mapped.c                         |   5 +-
 mm/pgtable-generic.c                         |   7 +-
 mm/swap.c                                    |   2 +-
 mm/vmscan.c                                  |   5 +-
 44 files changed, 338 insertions(+), 721 deletions(-)

base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa
-- 
git-series 0.9.1


Thread overview: 45+ messages
2024-04-11  0:57 Alistair Popple [this message]
2024-04-11  0:57 ` [RFC 01/10] mm/gup.c: Remove redundant check for PCI P2PDMA page Alistair Popple
2024-04-11 12:59   ` Jason Gunthorpe
2024-04-11 13:47   ` David Hildenbrand
2024-04-12  1:37     ` Alistair Popple
2024-04-11  0:57 ` [RFC 02/10] mm/hmm: Remove dead check for HugeTLB and FS DAX Alistair Popple
2024-04-11 12:25   ` Jason Gunthorpe
2024-04-11 13:37     ` Peter Xu
2024-04-12  1:28       ` Alistair Popple
2024-04-11  0:57 ` [RFC 03/10] pci/p2pdma: Don't initialise page refcount to one Alistair Popple
2024-04-11 12:29   ` Jason Gunthorpe
2024-04-12  5:40     ` Alistair Popple
2024-04-12 17:20   ` Dan Williams
2024-05-09 21:59   ` Logan Gunthorpe
2024-05-09 23:14     ` Alistair Popple
2024-04-11  0:57 ` [RFC 04/10] fs/dax: Don't track page mapping/index Alistair Popple
2024-04-12 15:22   ` Jan Kara
2024-04-12 17:31     ` Dan Williams
2024-04-15  7:03       ` Alistair Popple
2024-04-15 20:51         ` Dan Williams
2024-04-16  0:07           ` Alistair Popple
2024-04-16  0:36             ` Dan Williams
2024-04-12 17:21   ` Dan Williams
2024-04-11  0:57 ` [RFC 05/10] fs/dax: Refactor wait for dax idle page Alistair Popple
2024-04-12 14:37   ` Jan Kara
2024-04-13 20:19   ` John Hubbard
2024-04-15  8:41     ` Alistair Popple
2024-04-11  0:57 ` [RFC 06/10] fs/dax: Add dax_page_free callback Alistair Popple
2024-04-11  0:57 ` [RFC 07/10] mm: Allow compound zone device pages Alistair Popple
2024-04-11 12:32   ` Jason Gunthorpe
2024-04-11 14:10   ` Matthew Wilcox
2024-04-12  1:38     ` Alistair Popple
2024-04-11  0:57 ` [RFC 08/10] fs/dax: Properly refcount fs dax pages Alistair Popple
2024-04-11  0:57 ` [RFC 09/10] mm/khugepage.c: Warn if trying to scan devmap pmd Alistair Popple
2024-04-11 13:45   ` David Hildenbrand
2024-04-12  1:34     ` Alistair Popple
2024-04-11  0:57 ` [RFC 10/10] mm: Remove pXX_devmap Alistair Popple
2024-04-11 12:57   ` Jason Gunthorpe
2024-04-11 17:28 ` [RFC 00/10] fs/dax: Fix FS DAX page reference counts Dan Williams
2024-04-11 17:35   ` Jason Gunthorpe
2024-04-11 17:56     ` Dan Williams
2024-04-12  3:54   ` Alistair Popple
2024-04-12  6:55     ` Alistair Popple
2024-04-12 11:53       ` Jason Gunthorpe
2024-04-12 17:32         ` Dan Williams
