gfs2.lists.linux.dev archive mirror
* [PATCH v2 00/16] fanotify: add pre-content hooks
@ 2024-08-08 19:27 Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 01/16] fanotify: don't skip extra event info if no info_mode is set Josef Bacik
                   ` (16 more replies)
  0 siblings, 17 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

v1: https://lore.kernel.org/linux-fsdevel/cover.1721931241.git.josef@toxicpanda.com/

v1->v2:
- reworked the page fault logic based on Jan's suggestion and turned it into a
  helper.
- Added 3 patches per-fs where we need to call the fsnotify helper from their
  ->fault handlers.
- Disabled readahead in the case that there's a pre-content watch in place.
- Disabled huge faults when there's a pre-content watch in place (entirely
  because it's untested, theoretically it should be straightforward to do).
- Updated the command numbers.
- Addressed the random spelling/grammar mistakes that Jan pointed out.
- Addressed the other random nits from Jan.

--- Original email ---

Hello,

These are the patches for the bare bones pre-content fanotify support.  The
majority of this work is Amir's, my contribution to this has solely been around
adding the page fault hooks, testing and validating everything.  I'm sending it
because Amir is traveling a bunch, and I touched it last so I'm going to take
all the hate and he can take all the credit.

There is a PoC that I've been using to validate this work; you can find the git
repo here:

https://github.com/josefbacik/remote-fetch

This consists of 3 different tools.

1. populate.  This just creates all the stub files in the directory from the
   source directory.  Just run ./populate ~/linux ~/hsm-linux and it'll
   recursively create all of the stub files and directories.
2. remote-fetch.  This is the actual PoC, you just point it at the source and
   destination directory and then you can do whatever.  ./remote-fetch ~/linux
   ~/hsm-linux.
3. mmap-validate.  This was to validate the pagefault thing, this is likely what
   will be turned into the selftest with remote-fetch.  It creates a file and
   then you can validate the file matches the right pattern with both normal
   reads and mmap.  Normally I do something like

   ./mmap-validate create ~/src/foo
   ./populate ~/src ~/dst
   ./remote-fetch ~/src ~/dst
   ./mmap-validate validate ~/dst/foo
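
For anyone who doesn't want to clone the repo, here is a minimal sketch of the
kind of check the validate step performs: compare what a file looks like through
read() and through mmap() against a known pattern.  The pattern (byte i is
i % 251), the function names and the error handling are all assumptions made for
illustration; the real mmap-validate tool may well do this differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical pattern: the byte at offset `off` is off % 251. */
static unsigned char pattern_byte(size_t off)
{
	return (unsigned char)(off % 251);
}

/* Write `len` bytes of the pattern to `path`.  Returns 0 on success. */
int pattern_create(const char *path, size_t len)
{
	unsigned char buf[4096];
	size_t off = 0;
	int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);

	if (fd < 0)
		return -1;
	while (off < len) {
		size_t n = len - off < sizeof(buf) ? len - off : sizeof(buf);

		for (size_t i = 0; i < n; i++)
			buf[i] = pattern_byte(off + i);
		if (write(fd, buf, n) != (ssize_t)n) {
			close(fd);
			return -1;
		}
		off += n;
	}
	return close(fd);
}

/* Check the first `len` bytes through pread().  0 means every byte matched. */
static int validate_read(int fd, size_t len)
{
	unsigned char buf[4096];
	size_t off = 0;

	while (off < len) {
		size_t want = len - off < sizeof(buf) ? len - off : sizeof(buf);
		ssize_t n = pread(fd, buf, want, (off_t)off);

		if (n <= 0)
			return -1;
		for (ssize_t i = 0; i < n; i++)
			if (buf[i] != pattern_byte(off + (size_t)i))
				return -1;
		off += (size_t)n;
	}
	return 0;
}

/* Check the same bytes through a shared mapping, i.e. the page fault path. */
static int validate_mmap(int fd, size_t len)
{
	unsigned char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	int ret = 0;

	if (map == MAP_FAILED)
		return -1;
	for (size_t off = 0; off < len; off++) {
		if (map[off] != pattern_byte(off)) {
			ret = -1;
			break;
		}
	}
	munmap(map, len);
	return ret;
}

/* Validate `path` both ways.  0 means both views match the pattern. */
int pattern_validate(const char *path, size_t len)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = validate_read(fd, len);
	if (!ret)
		ret = validate_mmap(fd, len);
	close(fd);
	return ret;
}
```

In the workflow above, pattern_create() would run against the source directory
and pattern_validate() against the stub in the destination directory, so that
both the read() and the page fault paths have to go through the pre-content
hooks.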

I did a bunch of testing and also gathered some performance numbers.  I copied a
kernel tree, ran remote-fetch against it, and then ran make -j4:

Normal
real    9m49.709s
user    28m11.372s
sys     4m57.304s

HSM
real    10m6.454s
user    29m10.517s
sys     5m2.617s

So the build takes ~17 seconds longer with HSM.  I then did a make mrproper on
both trees to see the size:

[root@fedora ~]# du -hs /src/linux
1.6G    /src/linux
[root@fedora ~]# du -hs dst
125M    dst

This mirrors the sort of savings we've seen in production.
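
To put the timings in relative terms, the arithmetic can be sketched as below.
The helper names are made up for this note; the inputs are the "real" times
quoted above.  The difference works out to 16.745 seconds, or roughly 2.8% of
the normal build time.

```c
#include <stdio.h>

/* Convert a time(1)-style "XmY.Zs" reading into seconds. */
double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Relative overhead of `with` over `base`, in percent. */
double overhead_percent(double base, double with)
{
	return (with - base) / base * 100.0;
}

/* Print the extra wall time of the HSM build over the normal build. */
void print_hsm_overhead(void)
{
	double normal = to_seconds(9, 49.709);	/* "Normal" real time */
	double hsm = to_seconds(10, 6.454);	/* "HSM" real time */

	printf("extra wall time: %.3f s (%.2f%%)\n",
	       hsm - normal, overhead_percent(normal, hsm));
}
```

The du numbers are rounded by -h, so the size ratio (125M against 1.6G, on the
order of 8%) is only approximate.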

Meta has had these patches (minus the page fault patch) deployed in production
for almost a year with our own utility for doing on-demand package fetching.
The savings from this has been pretty significant.

The page-fault hooks are necessary for the last thing we need, which is
on-demand range fetching of executables.  Some of our binaries are several gigs
large, so having the ability to remote fetch them on demand is a huge win for
us, not only in space savings but also in container startup time.

There will be tests for this going into LTP once we're satisfied with the
patches and they're on their way upstream.  Thanks,

Josef

Amir Goldstein (8):
  fsnotify: introduce pre-content permission event
  fsnotify: generate pre-content permission event on open
  fanotify: introduce FAN_PRE_ACCESS permission event
  fanotify: introduce FAN_PRE_MODIFY permission event
  fanotify: pass optional file access range in pre-content event
  fanotify: rename a misnamed constant
  fanotify: report file range info with pre-content events
  fanotify: allow to set errno in FAN_DENY permission response

Josef Bacik (8):
  fanotify: don't skip extra event info if no info_mode is set
  fanotify: add a helper to check for pre content events
  fanotify: disable readahead if we have pre-content watches
  mm: don't allow huge faults for files with pre content watches
  fsnotify: generate pre-content permission event on page fault
  bcachefs: add pre-content fsnotify hook to fault
  gfs2: add pre-content fsnotify hook to fault
  xfs: add pre-content fsnotify hook for write faults

 fs/bcachefs/fs-io-pagecache.c      |  13 ++++
 fs/gfs2/file.c                     |  13 ++++
 fs/namei.c                         |   9 +++
 fs/notify/fanotify/fanotify.c      |  32 ++++++--
 fs/notify/fanotify/fanotify.h      |  20 +++++
 fs/notify/fanotify/fanotify_user.c | 116 +++++++++++++++++++++++------
 fs/notify/fsnotify.c               |  14 +++-
 fs/xfs/xfs_file.c                  |  20 ++++-
 include/linux/fanotify.h           |  20 +++--
 include/linux/fsnotify.h           |  54 ++++++++++++--
 include/linux/fsnotify_backend.h   |  59 ++++++++++++++-
 include/linux/mm.h                 |   2 +
 include/uapi/linux/fanotify.h      |  17 +++++
 mm/filemap.c                       | 109 +++++++++++++++++++++++++--
 mm/memory.c                        |  22 ++++++
 mm/readahead.c                     |  13 ++++
 security/selinux/hooks.c           |   3 +-
 17 files changed, 482 insertions(+), 54 deletions(-)

-- 
2.43.0



* [PATCH v2 01/16] fanotify: don't skip extra event info if no info_mode is set
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 02/16] fsnotify: introduce pre-content permission event Josef Bacik
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

New pre-content events will be path events but they will also carry
additional range information. Remove the optimization to skip checking
whether info structures need to be generated for path events. This
results in no change in generated info structures for existing events.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/notify/fanotify/fanotify_user.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 9ec313e9f6e1..2e2fba8a9d20 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -160,9 +160,6 @@ static size_t fanotify_event_len(unsigned int info_mode,
 	int fh_len;
 	int dot_len = 0;
 
-	if (!info_mode)
-		return event_len;
-
 	if (fanotify_is_error_event(event->mask))
 		event_len += FANOTIFY_ERROR_INFO_LEN;
 
@@ -740,12 +737,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (fanotify_is_perm_event(event->mask))
 		FANOTIFY_PERM(event)->fd = fd;
 
-	if (info_mode) {
-		ret = copy_info_records_to_user(event, info, info_mode, pidfd,
-						buf, count);
-		if (ret < 0)
-			goto out_close_fd;
-	}
+	ret = copy_info_records_to_user(event, info, info_mode, pidfd,
+					buf, count);
+	if (ret < 0)
+		goto out_close_fd;
 
 	if (f)
 		fd_install(fd, f);
-- 
2.43.0



* [PATCH v2 02/16] fsnotify: introduce pre-content permission event
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 01/16] fanotify: don't skip extra event info if no info_mode is set Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 03/16] fsnotify: generate pre-content permission event on open Josef Bacik
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

The new FS_PRE_ACCESS permission event is similar to FS_ACCESS_PERM,
but it is meant for a different use case of filling file content before
access to a file range, so it has slightly different semantics.

Generate FS_PRE_ACCESS/FS_ACCESS_PERM as two separate events, same as
we did for FS_OPEN_PERM/FS_OPEN_EXEC_PERM.

FS_PRE_MODIFY is a new permission event, with similar semantics as
FS_PRE_ACCESS, which is called before a file is modified.

FS_ACCESS_PERM is reported also on blockdev and pipes, but the new
pre-content events are only reported for regular files and dirs.

The pre-content events are meant to be used by hierarchical storage
managers that want to fill the content of files on first access.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fsnotify.c             |  2 +-
 include/linux/fsnotify.h         | 27 ++++++++++++++++++++++++---
 include/linux/fsnotify_backend.h | 13 +++++++++++--
 security/selinux/hooks.c         |  3 ++-
 4 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 272c8a1dab3c..1ca4a8da7f29 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -621,7 +621,7 @@ static __init int fsnotify_init(void)
 {
 	int ret;
 
-	BUILD_BUG_ON(HWEIGHT32(ALL_FSNOTIFY_BITS) != 23);
+	BUILD_BUG_ON(HWEIGHT32(ALL_FSNOTIFY_BITS) != 25);
 
 	ret = init_srcu_struct(&fsnotify_mark_srcu);
 	if (ret)
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 278620e063ab..028ce807805a 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -133,12 +133,13 @@ static inline int fsnotify_file(struct file *file, __u32 mask)
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
 /*
- * fsnotify_file_area_perm - permission hook before access to file range
+ * fsnotify_file_area_perm - permission hook before access/modify of file range
  */
 static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 					  const loff_t *ppos, size_t count)
 {
-	__u32 fsnotify_mask = FS_ACCESS_PERM;
+	struct inode *inode = file_inode(file);
+	__u32 fsnotify_mask;
 
 	/*
 	 * filesystem may be modified in the context of permission events
@@ -147,7 +148,27 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 	 */
 	lockdep_assert_once(file_write_not_started(file));
 
-	if (!(perm_mask & MAY_READ))
+	/*
+	 * Generate FS_PRE_ACCESS/FS_ACCESS_PERM as two separate events.
+	 */
+	if (perm_mask & MAY_READ) {
+		int ret = fsnotify_file(file, FS_ACCESS_PERM);
+
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Pre-content events are only reported for regular files and dirs.
+	 */
+	if (!S_ISDIR(inode->i_mode) && !S_ISREG(inode->i_mode))
+		return 0;
+
+	if (perm_mask & MAY_WRITE)
+		fsnotify_mask = FS_PRE_MODIFY;
+	else if (perm_mask & MAY_READ)
+		fsnotify_mask = FS_PRE_ACCESS;
+	else
 		return 0;
 
 	return fsnotify_file(file, fsnotify_mask);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 8be029bc50b1..200a5e3b1cd4 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -56,6 +56,9 @@
 #define FS_ACCESS_PERM		0x00020000	/* access event in a permissions hook */
 #define FS_OPEN_EXEC_PERM	0x00040000	/* open/exec event in a permission hook */
 
+#define FS_PRE_ACCESS		0x00080000	/* Pre-content access hook */
+#define FS_PRE_MODIFY		0x00100000	/* Pre-content modify hook */
+
 /*
  * Set on inode mark that cares about things that happen to its children.
  * Always set for dnotify and inotify.
@@ -77,8 +80,14 @@
  */
 #define ALL_FSNOTIFY_DIRENT_EVENTS (FS_CREATE | FS_DELETE | FS_MOVE | FS_RENAME)
 
-#define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
-				  FS_OPEN_EXEC_PERM)
+/* Content events can be used to inspect file content */
+#define FSNOTIFY_CONTENT_PERM_EVENTS (FS_OPEN_PERM | FS_OPEN_EXEC_PERM | \
+				      FS_ACCESS_PERM)
+/* Pre-content events can be used to fill file content */
+#define FSNOTIFY_PRE_CONTENT_EVENTS  (FS_PRE_ACCESS | FS_PRE_MODIFY)
+
+#define ALL_FSNOTIFY_PERM_EVENTS (FSNOTIFY_CONTENT_PERM_EVENTS | \
+				  FSNOTIFY_PRE_CONTENT_EVENTS)
 
 /*
  * This is a list of all events that may get sent to a parent that is watching
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 55c78c318ccd..2997edf3e7cd 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3406,7 +3406,8 @@ static int selinux_path_notify(const struct path *path, u64 mask,
 		perm |= FILE__WATCH_WITH_PERM;
 
 	/* watches on read-like events need the file:watch_reads permission */
-	if (mask & (FS_ACCESS | FS_ACCESS_PERM | FS_CLOSE_NOWRITE))
+	if (mask & (FS_ACCESS | FS_ACCESS_PERM | FS_PRE_ACCESS |
+		    FS_CLOSE_NOWRITE))
 		perm |= FILE__WATCH_READS;
 
 	return path_has_perm(current_cred(), path, perm);
-- 
2.43.0



* [PATCH v2 03/16] fsnotify: generate pre-content permission event on open
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 01/16] fanotify: don't skip extra event info if no info_mode is set Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 02/16] fsnotify: introduce pre-content permission event Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 11:51   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event Josef Bacik
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

FS_PRE_ACCESS or FS_PRE_MODIFY will be generated on open depending on
the file open mode.  The pre-content event will be generated in addition to
FS_OPEN_PERM, but without sb_writers held and after the file was truncated
if it was opened with O_CREAT and/or O_TRUNC.

The event will have a range info of (0..0) to provide an opportunity
to fill entire file content on open.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/namei.c               |  9 +++++++++
 include/linux/fsnotify.h | 10 +++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 3a4c40e12f78..c16487e3742d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3735,6 +3735,15 @@ static int do_open(struct nameidata *nd,
 	}
 	if (do_truncate)
 		mnt_drop_write(nd->path.mnt);
+
+	/*
+	 * This permission hook is different than fsnotify_open_perm() hook.
+	 * This is a pre-content hook that is called without sb_writers held
+	 * and after the file was truncated.
+	 */
+	if (!error)
+		error = fsnotify_file_perm(file, MAY_OPEN);
+
 	return error;
 }
 
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 028ce807805a..a28daf136fea 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -168,6 +168,10 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 		fsnotify_mask = FS_PRE_MODIFY;
 	else if (perm_mask & MAY_READ)
 		fsnotify_mask = FS_PRE_ACCESS;
+	else if (perm_mask & MAY_OPEN && file->f_mode & FMODE_WRITER)
+		fsnotify_mask = FS_PRE_MODIFY;
+	else if (perm_mask & MAY_OPEN)
+		fsnotify_mask = FS_PRE_ACCESS;
 	else
 		return 0;
 
@@ -176,10 +180,14 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 
 /*
  * fsnotify_file_perm - permission hook before file access
+ *
+ * Called from read()/write() with perm_mask MAY_READ/MAY_WRITE.
+ * Called from open() with MAY_OPEN without sb_writers held and after the file
+ * was truncated. Note that this is a different event from fsnotify_open_perm().
  */
 static inline int fsnotify_file_perm(struct file *file, int perm_mask)
 {
-	return fsnotify_file_area_perm(file, perm_mask, NULL, 0);
+	return fsnotify_file_area_perm(file, perm_mask, &file->f_pos, 0);
 }
 
 /*
-- 
2.43.0



* [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (2 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 03/16] fsnotify: generate pre-content permission event on open Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 11:57   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY " Josef Bacik
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

Similar to FAN_ACCESS_PERM permission event, but it is only allowed with
class FAN_CLASS_PRE_CONTENT and only allowed on regular files and dirs.

Unlike FAN_ACCESS_PERM, it is safe to write to the file being accessed
in the context of the event handler.

This pre-content event is meant to be used by hierarchical storage
managers that want to fill the content of files on first read access.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
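
Note for readers: the class rule enforced in do_fanotify_mark() by this patch
can be sketched as a small pure function.  check_mark_class() and the
group_class enum are hypothetical names used only here (the kernel compares
group->priority against FSNOTIFY_PRIO_* instead), and the FAN_PRE_ACCESS value
is the one proposed in this patch, so treat it as provisional.

```c
#include <errno.h>

#define FAN_OPEN_PERM		0x00010000
#define FAN_ACCESS_PERM		0x00020000
#define FAN_OPEN_EXEC_PERM	0x00040000
#define FAN_PRE_ACCESS		0x00080000	/* value proposed by this patch */

#define CONTENT_PERM_EVENTS	(FAN_OPEN_PERM | FAN_OPEN_EXEC_PERM | \
				 FAN_ACCESS_PERM)
#define PRE_CONTENT_EVENTS	(FAN_PRE_ACCESS)
#define PERM_EVENTS		(CONTENT_PERM_EVENTS | PRE_CONTENT_EVENTS)

/* Stand-ins for FAN_CLASS_NOTIF, FAN_CLASS_CONTENT, FAN_CLASS_PRE_CONTENT. */
enum group_class { CLASS_NOTIF, CLASS_CONTENT, CLASS_PRE_CONTENT };

/* Return 0 if `mask` may be used with a group of class `cls`, else -EINVAL. */
int check_mark_class(unsigned int mask, enum group_class cls)
{
	/* Permission events are not allowed for FAN_CLASS_NOTIF. */
	if ((mask & PERM_EVENTS) && cls == CLASS_NOTIF)
		return -EINVAL;
	/* Pre-content events are not allowed for FAN_CLASS_CONTENT. */
	if ((mask & PRE_CONTENT_EVENTS) && cls == CLASS_CONTENT)
		return -EINVAL;
	return 0;
}
```

In other words, the existing permission events keep working with both
FAN_CLASS_CONTENT and FAN_CLASS_PRE_CONTENT groups, while FAN_PRE_ACCESS needs
FAN_CLASS_PRE_CONTENT.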
 fs/notify/fanotify/fanotify.c      |  3 ++-
 fs/notify/fanotify/fanotify_user.c | 17 ++++++++++++++---
 include/linux/fanotify.h           | 14 ++++++++++----
 include/uapi/linux/fanotify.h      |  2 ++
 4 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 224bccaab4cc..7dac8e4486df 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -910,8 +910,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
 	BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
 	BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
 	BUILD_BUG_ON(FAN_RENAME != FS_RENAME);
+	BUILD_BUG_ON(FAN_PRE_ACCESS != FS_PRE_ACCESS);
 
-	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 21);
+	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 22);
 
 	mask = fanotify_group_event_mask(group, iter_info, &match_mask,
 					 mask, data, data_type, dir);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 2e2fba8a9d20..c294849e474f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1628,6 +1628,7 @@ static int fanotify_events_supported(struct fsnotify_group *group,
 				     unsigned int flags)
 {
 	unsigned int mark_type = flags & FANOTIFY_MARK_TYPE_BITS;
+	bool is_dir = d_is_dir(path->dentry);
 	/* Strict validation of events in non-dir inode mask with v5.17+ APIs */
 	bool strict_dir_events = FAN_GROUP_FLAG(group, FAN_REPORT_TARGET_FID) ||
 				 (mask & FAN_RENAME) ||
@@ -1665,9 +1666,15 @@ static int fanotify_events_supported(struct fsnotify_group *group,
 	 * but because we always allowed it, error only when using new APIs.
 	 */
 	if (strict_dir_events && mark_type == FAN_MARK_INODE &&
-	    !d_is_dir(path->dentry) && (mask & FANOTIFY_DIRONLY_EVENT_BITS))
+	    !is_dir && (mask & FANOTIFY_DIRONLY_EVENT_BITS))
 		return -ENOTDIR;
 
+	/* Pre-content events are only supported on regular files and dirs */
+	if (mask & FANOTIFY_PRE_CONTENT_EVENTS) {
+		if (!is_dir && !d_is_reg(path->dentry))
+			return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -1769,11 +1776,15 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 		goto fput_and_out;
 
 	/*
-	 * Permission events require minimum priority FAN_CLASS_CONTENT.
+	 * Permission events are not allowed for FAN_CLASS_NOTIF.
+	 * Pre-content permission events are not allowed for FAN_CLASS_CONTENT.
 	 */
 	ret = -EINVAL;
 	if (mask & FANOTIFY_PERM_EVENTS &&
-	    group->priority < FSNOTIFY_PRIO_CONTENT)
+	    group->priority == FSNOTIFY_PRIO_NORMAL)
+		goto fput_and_out;
+	else if (mask & FANOTIFY_PRE_CONTENT_EVENTS &&
+		 group->priority == FSNOTIFY_PRIO_CONTENT)
 		goto fput_and_out;
 
 	if (mask & FAN_FS_ERROR &&
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 4f1c4f603118..5c811baf44d2 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -88,6 +88,16 @@
 #define FANOTIFY_DIRENT_EVENTS	(FAN_MOVE | FAN_CREATE | FAN_DELETE | \
 				 FAN_RENAME)
 
+/* Content events can be used to inspect file content */
+#define FANOTIFY_CONTENT_PERM_EVENTS (FAN_OPEN_PERM | FAN_OPEN_EXEC_PERM | \
+				      FAN_ACCESS_PERM)
+/* Pre-content events can be used to fill file content */
+#define FANOTIFY_PRE_CONTENT_EVENTS  (FAN_PRE_ACCESS)
+
+/* Events that require a permission response from user */
+#define FANOTIFY_PERM_EVENTS	(FANOTIFY_CONTENT_PERM_EVENTS | \
+				 FANOTIFY_PRE_CONTENT_EVENTS)
+
 /* Events that can be reported with event->fd */
 #define FANOTIFY_FD_EVENTS (FANOTIFY_PATH_EVENTS | FANOTIFY_PERM_EVENTS)
 
@@ -103,10 +113,6 @@
 				 FANOTIFY_INODE_EVENTS | \
 				 FANOTIFY_ERROR_EVENTS)
 
-/* Events that require a permission response from user */
-#define FANOTIFY_PERM_EVENTS	(FAN_OPEN_PERM | FAN_ACCESS_PERM | \
-				 FAN_OPEN_EXEC_PERM)
-
 /* Extra flags that may be reported with event or control handling of events */
 #define FANOTIFY_EVENT_FLAGS	(FAN_EVENT_ON_CHILD | FAN_ONDIR)
 
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index a37de58ca571..bcada21a3a2e 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -26,6 +26,8 @@
 #define FAN_ACCESS_PERM		0x00020000	/* File accessed in perm check */
 #define FAN_OPEN_EXEC_PERM	0x00040000	/* File open/exec in perm check */
 
+#define FAN_PRE_ACCESS		0x00080000	/* Pre-content access hook */
+
 #define FAN_EVENT_ON_CHILD	0x08000000	/* Interested in child events */
 
 #define FAN_RENAME		0x10000000	/* File was renamed */
-- 
2.43.0



* [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY permission event
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (3 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 11:57   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event Josef Bacik
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

Generate FAN_PRE_MODIFY permission event from fsnotify_file_perm()
pre-write hook to notify fanotify listeners of an intent to modify
a file.

Like FAN_PRE_ACCESS, it is only allowed with FAN_CLASS_PRE_CONTENT
and unlike FAN_MODIFY, it is only allowed on regular files.

Like FAN_PRE_ACCESS, it is generated without sb_start_write() held,
so it is safe to perform filesystem modifications in the context of
the event handler.

This pre-content event is meant to be used by hierarchical storage
managers that want to fill the content of files on first write access.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify.c      | 3 ++-
 fs/notify/fanotify/fanotify_user.c | 2 ++
 include/linux/fanotify.h           | 3 ++-
 include/uapi/linux/fanotify.h      | 1 +
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 7dac8e4486df..b163594843f5 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -911,8 +911,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
 	BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
 	BUILD_BUG_ON(FAN_RENAME != FS_RENAME);
 	BUILD_BUG_ON(FAN_PRE_ACCESS != FS_PRE_ACCESS);
+	BUILD_BUG_ON(FAN_PRE_MODIFY != FS_PRE_MODIFY);
 
-	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 22);
+	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 23);
 
 	mask = fanotify_group_event_mask(group, iter_info, &match_mask,
 					 mask, data, data_type, dir);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index c294849e474f..3a7101544f30 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1673,6 +1673,8 @@ static int fanotify_events_supported(struct fsnotify_group *group,
 	if (mask & FANOTIFY_PRE_CONTENT_EVENTS) {
 		if (!is_dir && !d_is_reg(path->dentry))
 			return -EINVAL;
+		if (is_dir && mask & FAN_PRE_MODIFY)
+			return -EISDIR;
 	}
 
 	return 0;
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 5c811baf44d2..ae6cb2688d52 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -92,7 +92,8 @@
 #define FANOTIFY_CONTENT_PERM_EVENTS (FAN_OPEN_PERM | FAN_OPEN_EXEC_PERM | \
 				      FAN_ACCESS_PERM)
 /* Pre-content events can be used to fill file content */
-#define FANOTIFY_PRE_CONTENT_EVENTS  (FAN_PRE_ACCESS)
+#define FANOTIFY_PRE_CONTENT_EVENTS  (FAN_PRE_ACCESS | FAN_PRE_MODIFY)
+#define FANOTIFY_PRE_MODIFY_EVENTS   (FAN_PRE_MODIFY)
 
 /* Events that require a permission response from user */
 #define FANOTIFY_PERM_EVENTS	(FANOTIFY_CONTENT_PERM_EVENTS | \
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index bcada21a3a2e..ac00fad66416 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -27,6 +27,7 @@
 #define FAN_OPEN_EXEC_PERM	0x00040000	/* File open/exec in perm check */
 
 #define FAN_PRE_ACCESS		0x00080000	/* Pre-content access hook */
+#define FAN_PRE_MODIFY		0x00100000	/* Pre-content modify hook */
 
 #define FAN_EVENT_ON_CHILD	0x08000000	/* Interested in child events */
 
-- 
2.43.0



* [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (4 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY " Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 12:00   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 07/16] fanotify: rename a misnamed constant Josef Bacik
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

We would like to add file range information to pre-content events.

Pass a struct file_range with optional offset and length to the event
handler along with the pre-content permission event.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
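
Note for readers: the pattern used here is a typed void pointer, where each
accessor returns NULL unless the data type carries what it is asked for.  A
minimal userspace sketch, assuming stand-in types (path_stub for struct path,
off_t for loff_t) and invented names for the enum values and helpers:

```c
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for enum fsnotify_data_type; only the cases needed here. */
enum data_type { DATA_NONE, DATA_FILE_RANGE, DATA_PATH };

struct path_stub { const char *name; };	/* stand-in for struct path */

struct file_range {
	const struct path_stub *path;
	const off_t *ppos;		/* NULL when no offset is known */
	size_t count;
};

/* Return the range only if `data` really is a file range. */
const struct file_range *data_file_range(const void *data, int type)
{
	return type == DATA_FILE_RANGE ? data : NULL;
}

/* A file range also carries a path, so both types can yield one. */
const struct path_stub *data_path(const void *data, int type)
{
	switch (type) {
	case DATA_PATH:
		return data;
	case DATA_FILE_RANGE:
		return ((const struct file_range *)data)->path;
	default:
		return NULL;
	}
}

/* Reduced version of the fields filled in for a permission event. */
struct perm_event {
	const struct path_stub *path;
	const off_t *ppos;
	size_t count;
};

struct perm_event make_perm_event(const void *data, int type)
{
	const struct file_range *range = data_file_range(data, type);
	struct perm_event ev = {
		.path = data_path(data, type),
		.ppos = range ? range->ppos : NULL,
		.count = range ? range->count : 0,
	};

	return ev;
}
```

Existing callers that pass a plain path keep working and simply get an event
without range information, which is why the range is described as optional.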
 fs/notify/fanotify/fanotify.c    | 10 ++++++++--
 fs/notify/fanotify/fanotify.h    |  2 ++
 include/linux/fsnotify.h         | 17 ++++++++++++++++-
 include/linux/fsnotify_backend.h | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index b163594843f5..4e8dce39fa8f 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -549,9 +549,13 @@ static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
 	return &pevent->fae;
 }
 
-static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
+static struct fanotify_event *fanotify_alloc_perm_event(const void *data,
+							int data_type,
 							gfp_t gfp)
 {
+	const struct path *path = fsnotify_data_path(data, data_type);
+	const struct file_range *range =
+			    fsnotify_data_file_range(data, data_type);
 	struct fanotify_perm_event *pevent;
 
 	pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp);
@@ -565,6 +569,8 @@ static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
 	pevent->hdr.len = 0;
 	pevent->state = FAN_EVENT_INIT;
 	pevent->path = *path;
+	pevent->ppos = range ? range->ppos : NULL;
+	pevent->count = range ? range->count : 0;
 	path_get(path);
 
 	return &pevent->fae;
@@ -802,7 +808,7 @@ static struct fanotify_event *fanotify_alloc_event(
 	old_memcg = set_active_memcg(group->memcg);
 
 	if (fanotify_is_perm_event(mask)) {
-		event = fanotify_alloc_perm_event(path, gfp);
+		event = fanotify_alloc_perm_event(data, data_type, gfp);
 	} else if (fanotify_is_error_event(mask)) {
 		event = fanotify_alloc_error_event(group, fsid, data,
 						   data_type, &hash);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index e5ab33cae6a7..93598b7d5952 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -425,6 +425,8 @@ FANOTIFY_PE(struct fanotify_event *event)
 struct fanotify_perm_event {
 	struct fanotify_event fae;
 	struct path path;
+	const loff_t *ppos;		/* optional file range info */
+	size_t count;
 	u32 response;			/* userspace answer to the event */
 	unsigned short state;		/* state of the event */
 	int fd;		/* fd we passed to userspace for this event */
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index a28daf136fea..4609d9b6b087 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -132,6 +132,21 @@ static inline int fsnotify_file(struct file *file, __u32 mask)
 }
 
 #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
+static inline int fsnotify_file_range(struct file *file, __u32 mask,
+				      const loff_t *ppos, size_t count)
+{
+	struct file_range range;
+
+	if (file->f_mode & FMODE_NONOTIFY)
+		return 0;
+
+	range.path = &file->f_path;
+	range.ppos = ppos;
+	range.count = count;
+	return fsnotify_parent(range.path->dentry, mask, &range,
+			       FSNOTIFY_EVENT_FILE_RANGE);
+}
+
 /*
  * fsnotify_file_area_perm - permission hook before access/modify of file range
  */
@@ -175,7 +190,7 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
 	else
 		return 0;
 
-	return fsnotify_file(file, fsnotify_mask);
+	return fsnotify_file_range(file, fsnotify_mask, ppos, count);
 }
 
 /*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 200a5e3b1cd4..276320846bfd 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -298,6 +298,7 @@ static inline void fsnotify_group_assert_locked(struct fsnotify_group *group)
 /* When calling fsnotify tell it if the data is a path or inode */
 enum fsnotify_data_type {
 	FSNOTIFY_EVENT_NONE,
+	FSNOTIFY_EVENT_FILE_RANGE,
 	FSNOTIFY_EVENT_PATH,
 	FSNOTIFY_EVENT_INODE,
 	FSNOTIFY_EVENT_DENTRY,
@@ -310,6 +311,17 @@ struct fs_error_report {
 	struct super_block *sb;
 };
 
+struct file_range {
+	const struct path *path;
+	const loff_t *ppos;
+	size_t count;
+};
+
+static inline const struct path *file_range_path(const struct file_range *range)
+{
+	return range->path;
+}
+
 static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
 {
 	switch (data_type) {
@@ -319,6 +331,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
 		return d_inode(data);
 	case FSNOTIFY_EVENT_PATH:
 		return d_inode(((const struct path *)data)->dentry);
+	case FSNOTIFY_EVENT_FILE_RANGE:
+		return d_inode(file_range_path(data)->dentry);
 	case FSNOTIFY_EVENT_ERROR:
 		return ((struct fs_error_report *)data)->inode;
 	default:
@@ -334,6 +348,8 @@ static inline struct dentry *fsnotify_data_dentry(const void *data, int data_typ
 		return (struct dentry *)data;
 	case FSNOTIFY_EVENT_PATH:
 		return ((const struct path *)data)->dentry;
+	case FSNOTIFY_EVENT_FILE_RANGE:
+		return file_range_path(data)->dentry;
 	default:
 		return NULL;
 	}
@@ -345,6 +361,8 @@ static inline const struct path *fsnotify_data_path(const void *data,
 	switch (data_type) {
 	case FSNOTIFY_EVENT_PATH:
 		return data;
+	case FSNOTIFY_EVENT_FILE_RANGE:
+		return file_range_path(data);
 	default:
 		return NULL;
 	}
@@ -360,6 +378,8 @@ static inline struct super_block *fsnotify_data_sb(const void *data,
 		return ((struct dentry *)data)->d_sb;
 	case FSNOTIFY_EVENT_PATH:
 		return ((const struct path *)data)->dentry->d_sb;
+	case FSNOTIFY_EVENT_FILE_RANGE:
+		return file_range_path(data)->dentry->d_sb;
 	case FSNOTIFY_EVENT_ERROR:
 		return ((struct fs_error_report *) data)->sb;
 	default:
@@ -379,6 +399,18 @@ static inline struct fs_error_report *fsnotify_data_error_report(
 	}
 }
 
+static inline const struct file_range *fsnotify_data_file_range(
+							const void *data,
+							int data_type)
+{
+	switch (data_type) {
+	case FSNOTIFY_EVENT_FILE_RANGE:
+		return (struct file_range *)data;
+	default:
+		return NULL;
+	}
+}
+
 /*
  * Index to merged marks iterator array that correlates to a type of watch.
  * The type of watched object can be deduced from the iterator type, but not
-- 
2.43.0



* [PATCH v2 07/16] fanotify: rename a misnamed constant
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (5 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 11:41   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 08/16] fanotify: report file range info with pre-content events Josef Bacik
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

FANOTIFY_PIDFD_INFO_HDR_LEN is not the length of the header.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/notify/fanotify/fanotify_user.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 3a7101544f30..5ece186d5c50 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -119,7 +119,7 @@ struct kmem_cache *fanotify_perm_event_cachep __ro_after_init;
 #define FANOTIFY_EVENT_ALIGN 4
 #define FANOTIFY_FID_INFO_HDR_LEN \
 	(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
-#define FANOTIFY_PIDFD_INFO_HDR_LEN \
+#define FANOTIFY_PIDFD_INFO_LEN \
 	sizeof(struct fanotify_event_info_pidfd)
 #define FANOTIFY_ERROR_INFO_LEN \
 	(sizeof(struct fanotify_event_info_error))
@@ -174,14 +174,14 @@ static size_t fanotify_event_len(unsigned int info_mode,
 		dot_len = 1;
 	}
 
-	if (info_mode & FAN_REPORT_PIDFD)
-		event_len += FANOTIFY_PIDFD_INFO_HDR_LEN;
-
 	if (fanotify_event_has_object_fh(event)) {
 		fh_len = fanotify_event_object_fh_len(event);
 		event_len += fanotify_fid_info_len(fh_len, dot_len);
 	}
 
+	if (info_mode & FAN_REPORT_PIDFD)
+		event_len += FANOTIFY_PIDFD_INFO_LEN;
+
 	return event_len;
 }
 
@@ -511,7 +511,7 @@ static int copy_pidfd_info_to_user(int pidfd,
 				   size_t count)
 {
 	struct fanotify_event_info_pidfd info = { };
-	size_t info_len = FANOTIFY_PIDFD_INFO_HDR_LEN;
+	size_t info_len = FANOTIFY_PIDFD_INFO_LEN;
 
 	if (WARN_ON_ONCE(info_len > count))
 		return -EFAULT;
-- 
2.43.0



* [PATCH v2 08/16] fanotify: report file range info with pre-content events
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (6 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 07/16] fanotify: rename a misnamed constant Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response Josef Bacik
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

With group class FAN_CLASS_PRE_CONTENT, report offset and length info
along with FAN_PRE_ACCESS and FAN_PRE_MODIFY permission events.

This information is meant to be used by hierarchical storage managers
that want to fill in partial file content on first access to a range.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
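A listener has to walk the info records that follow the event metadata
to find the new record.  Below is a minimal userspace sketch of that
walk.  The struct and function names (info_header, info_range,
find_range_info) are local stand-ins, not uapi names; the layout and the
type value 6 are copied from this patch and are assumptions that may not
match what is eventually merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct fanotify_event_info_header. */
struct info_header {
	uint8_t info_type;
	uint8_t pad;
	uint16_t len;
};

/* Local mirror of the range record proposed in this patch. */
struct info_range {
	struct info_header hdr;
	uint64_t offset;
	uint64_t count;
};

#define INFO_TYPE_RANGE 6	/* FAN_EVENT_INFO_TYPE_RANGE in this patch */

/*
 * Walk the info records in buf[0..len) and copy out the first range
 * record.  Returns 1 if one was found, 0 if absent or malformed.
 */
static int find_range_info(const unsigned char *buf, size_t len,
			   struct info_range *out)
{
	size_t off = 0;

	assert(buf && out);
	while (len - off >= sizeof(struct info_header)) {
		struct info_header hdr;

		/* memcpy avoids unaligned access into the read buffer. */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > len - off)
			return 0;
		if (hdr.info_type == INFO_TYPE_RANGE &&
		    hdr.len >= sizeof(*out)) {
			memcpy(out, buf + off, sizeof(*out));
			return 1;
		}
		off += hdr.len;
	}
	return 0;
}
```

Skipping unknown record types by hdr.len keeps the walk working when
other records, such as the pidfd record, precede the range record.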
 fs/notify/fanotify/fanotify.h      |  8 +++++++
 fs/notify/fanotify/fanotify_user.c | 38 ++++++++++++++++++++++++++++++
 include/uapi/linux/fanotify.h      |  7 ++++++
 3 files changed, 53 insertions(+)

diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 93598b7d5952..7f06355afa1f 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -448,6 +448,14 @@ static inline bool fanotify_is_perm_event(u32 mask)
 		mask & FANOTIFY_PERM_EVENTS;
 }
 
+static inline bool fanotify_event_has_access_range(struct fanotify_event *event)
+{
+	if (!(event->mask & FANOTIFY_PRE_CONTENT_EVENTS))
+		return false;
+
+	return FANOTIFY_PERM(event)->ppos;
+}
+
 static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
 {
 	return container_of(fse, struct fanotify_event, fse);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 5ece186d5c50..ed56fe6f5ec7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -123,6 +123,8 @@ struct kmem_cache *fanotify_perm_event_cachep __ro_after_init;
 	sizeof(struct fanotify_event_info_pidfd)
 #define FANOTIFY_ERROR_INFO_LEN \
 	(sizeof(struct fanotify_event_info_error))
+#define FANOTIFY_RANGE_INFO_LEN \
+	(sizeof(struct fanotify_event_info_range))
 
 static int fanotify_fid_info_len(int fh_len, int name_len)
 {
@@ -182,6 +184,9 @@ static size_t fanotify_event_len(unsigned int info_mode,
 	if (info_mode & FAN_REPORT_PIDFD)
 		event_len += FANOTIFY_PIDFD_INFO_LEN;
 
+	if (fanotify_event_has_access_range(event))
+		event_len += FANOTIFY_RANGE_INFO_LEN;
+
 	return event_len;
 }
 
@@ -526,6 +531,30 @@ static int copy_pidfd_info_to_user(int pidfd,
 	return info_len;
 }
 
+static int copy_range_info_to_user(struct fanotify_event *event,
+				   char __user *buf, int count)
+{
+	struct fanotify_perm_event *pevent = FANOTIFY_PERM(event);
+	struct fanotify_event_info_range info = { };
+	size_t info_len = FANOTIFY_RANGE_INFO_LEN;
+
+	if (WARN_ON_ONCE(info_len > count))
+		return -EFAULT;
+
+	if (WARN_ON_ONCE(!pevent->ppos))
+		return -EINVAL;
+
+	info.hdr.info_type = FAN_EVENT_INFO_TYPE_RANGE;
+	info.hdr.len = info_len;
+	info.offset = *(pevent->ppos);
+	info.count = pevent->count;
+
+	if (copy_to_user(buf, &info, info_len))
+		return -EFAULT;
+
+	return info_len;
+}
+
 static int copy_info_records_to_user(struct fanotify_event *event,
 				     struct fanotify_info *info,
 				     unsigned int info_mode, int pidfd,
@@ -647,6 +676,15 @@ static int copy_info_records_to_user(struct fanotify_event *event,
 		total_bytes += ret;
 	}
 
+	if (fanotify_event_has_access_range(event)) {
+		ret = copy_range_info_to_user(event, buf, count);
+		if (ret < 0)
+			return ret;
+		buf += ret;
+		count -= ret;
+		total_bytes += ret;
+	}
+
 	return total_bytes;
 }
 
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index ac00fad66416..cc28dce5f744 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -145,6 +145,7 @@ struct fanotify_event_metadata {
 #define FAN_EVENT_INFO_TYPE_DFID	3
 #define FAN_EVENT_INFO_TYPE_PIDFD	4
 #define FAN_EVENT_INFO_TYPE_ERROR	5
+#define FAN_EVENT_INFO_TYPE_RANGE	6
 
 /* Special info types for FAN_RENAME */
 #define FAN_EVENT_INFO_TYPE_OLD_DFID_NAME	10
@@ -191,6 +192,12 @@ struct fanotify_event_info_error {
 	__u32 error_count;
 };
 
+struct fanotify_event_info_range {
+	struct fanotify_event_info_header hdr;
+	__u64 offset;
+	__u64 count;
+};
+
 /*
  * User space may need to record additional information about its decision.
  * The extra information type records what kind of information is included.
-- 
2.43.0



* [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (7 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 08/16] fanotify: report file range info with pre-content events Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 12:06   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 10/16] fanotify: add a helper to check for pre content events Josef Bacik
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

From: Amir Goldstein <amir73il@gmail.com>

With a FAN_DENY response, the user trying to perform the filesystem
operation gets an error with errno set to EPERM.

It is useful for a hierarchical storage management (HSM) service to be
able to deny access for reasons more diverse than EPERM, for example
EAGAIN, if the HSM could retry the operation later.

Allow fanotify groups with priority FAN_CLASS_PRE_CONTENT to respond
to permission events with the response value FAN_DENY_ERRNO(errno)
instead of FAN_DENY, to return a custom error.

Limit custom error values to errors expected on read(2)/write(2) and
open(2) of regular files. This list could be extended in the future.
Userspace can test for legitimate values of FAN_DENY_ERRNO(errno) by
writing a response to an fanotify group fd with a value of FAN_NOFD in
the fd field of the response.

The change in fanotify_response is backward compatible, because errno is
written in the high 8 bits of the 32-bit response field and old kernels
reject response values with high bits set.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
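The packing described above can be checked in userspace without a
patched kernel.  This is a sketch only: the macros are local copies of
the values this patch proposes, and response_decision(),
response_errno(), deny_result() and errno_is_allowed() are hypothetical
helpers that mirror the kernel-side logic in the diff below rather than
any exported API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local copies of the uapi values used or added by this patch. */
#define ALLOW		0x01u			/* FAN_ALLOW */
#define DENY		0x02u			/* FAN_DENY */
#define RESP_ACCESS	(ALLOW | DENY)
#define RESP_FLAGS	(0x10u | 0x20u)		/* FAN_AUDIT | FAN_INFO */
#define ERRNO_BITS	8
#define ERRNO_SHIFT	(32 - ERRNO_BITS)
#define ERRNO_MASK	((1u << ERRNO_BITS) - 1)
#define DENY_ERRNO(err) \
	(DENY | ((((uint32_t)(err)) & ERRNO_MASK) << ERRNO_SHIFT))

/* Low bits: the allow/deny decision plus the audit/info flags. */
static uint32_t response_decision(uint32_t res)
{
	return res & (RESP_ACCESS | RESP_FLAGS);
}

/* High byte: the custom errno, or 0 if none was supplied. */
static int response_errno(uint32_t res)
{
	return (int)(res >> ERRNO_SHIFT);
}

/* What the faulting task would see for a deny response. */
static int deny_result(uint32_t res)
{
	int err = response_errno(res);

	assert((response_decision(res) & RESP_ACCESS) == DENY);
	return err ? -err : -EPERM;
}

/* Mirror of the allow-list in process_access_response(). */
static int errno_is_allowed(int err)
{
	switch (err) {
	case 0:
	case EIO:
	case EPERM:
	case EBUSY:
	case ETXTBSY:
	case EAGAIN:
	case ENOSPC:
	case EDQUOT:
		return 1;
	default:
		return 0;
	}
}
```

A plain FAN_DENY leaves the high byte at zero, so deny_result() falls
back to -EPERM, which is the existing behaviour.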
 fs/notify/fanotify/fanotify.c      | 18 ++++++++++-----
 fs/notify/fanotify/fanotify.h      | 10 +++++++++
 fs/notify/fanotify/fanotify_user.c | 36 +++++++++++++++++++++++++-----
 include/linux/fanotify.h           |  5 ++++-
 include/uapi/linux/fanotify.h      |  7 ++++++
 5 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 4e8dce39fa8f..1cbf41b34080 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -224,7 +224,8 @@ static int fanotify_get_response(struct fsnotify_group *group,
 				 struct fanotify_perm_event *event,
 				 struct fsnotify_iter_info *iter_info)
 {
-	int ret;
+	int ret, errno;
+	u32 decision;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
@@ -257,20 +258,27 @@ static int fanotify_get_response(struct fsnotify_group *group,
 		goto out;
 	}
 
+	decision = fanotify_get_response_decision(event->response);
 	/* userspace responded, convert to something usable */
-	switch (event->response & FANOTIFY_RESPONSE_ACCESS) {
+	switch (decision & FANOTIFY_RESPONSE_ACCESS) {
 	case FAN_ALLOW:
 		ret = 0;
 		break;
 	case FAN_DENY:
+		/* Check custom errno from pre-content events */
+		errno = fanotify_get_response_errno(event->response);
+		if (errno) {
+			ret = -errno;
+			break;
+		}
+		fallthrough;
 	default:
 		ret = -EPERM;
 	}
 
 	/* Check if the response should be audited */
-	if (event->response & FAN_AUDIT)
-		audit_fanotify(event->response & ~FAN_AUDIT,
-			       &event->audit_rule);
+	if (decision & FAN_AUDIT)
+		audit_fanotify(decision & ~FAN_AUDIT, &event->audit_rule);
 
 	pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__,
 		 group, event, ret);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 7f06355afa1f..d0722ef13138 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -528,3 +528,13 @@ static inline unsigned int fanotify_mark_user_flags(struct fsnotify_mark *mark)
 
 	return mflags;
 }
+
+static inline u32 fanotify_get_response_decision(u32 res)
+{
+	return res & (FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS);
+}
+
+static inline int fanotify_get_response_errno(int res)
+{
+	return res >> FAN_ERRNO_SHIFT;
+}
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index ed56fe6f5ec7..0a37f1c761aa 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -337,11 +337,13 @@ static int process_access_response(struct fsnotify_group *group,
 	struct fanotify_perm_event *event;
 	int fd = response_struct->fd;
 	u32 response = response_struct->response;
+	u32 decision = fanotify_get_response_decision(response);
+	int errno = fanotify_get_response_errno(response);
 	int ret = info_len;
 	struct fanotify_response_info_audit_rule friar;
 
-	pr_debug("%s: group=%p fd=%d response=%u buf=%p size=%zu\n", __func__,
-		 group, fd, response, info, info_len);
+	pr_debug("%s: group=%p fd=%d response=%x errno=%d buf=%p size=%zu\n",
+		 __func__, group, fd, response, errno, info, info_len);
 	/*
 	 * make sure the response is valid, if invalid we do nothing and either
 	 * userspace can send a valid response or we will clean it up after the
@@ -350,18 +352,42 @@ static int process_access_response(struct fsnotify_group *group,
 	if (response & ~FANOTIFY_RESPONSE_VALID_MASK)
 		return -EINVAL;
 
-	switch (response & FANOTIFY_RESPONSE_ACCESS) {
+	switch (decision & FANOTIFY_RESPONSE_ACCESS) {
 	case FAN_ALLOW:
+		if (errno)
+			return -EINVAL;
+		break;
 	case FAN_DENY:
+		/* Custom errno is supported only for pre-content groups */
+		if (errno && group->priority != FSNOTIFY_PRIO_PRE_CONTENT)
+			return -EINVAL;
+
+		/*
+		 * Limit errno to values expected on open(2)/read(2)/write(2)
+		 * of regular files.
+		 */
+		switch (errno) {
+		case 0:
+		case EIO:
+		case EPERM:
+		case EBUSY:
+		case ETXTBSY:
+		case EAGAIN:
+		case ENOSPC:
+		case EDQUOT:
+			break;
+		default:
+			return -EINVAL;
+		}
 		break;
 	default:
 		return -EINVAL;
 	}
 
-	if ((response & FAN_AUDIT) && !FAN_GROUP_FLAG(group, FAN_ENABLE_AUDIT))
+	if ((decision & FAN_AUDIT) && !FAN_GROUP_FLAG(group, FAN_ENABLE_AUDIT))
 		return -EINVAL;
 
-	if (response & FAN_INFO) {
+	if (decision & FAN_INFO) {
 		ret = process_access_response_info(info, info_len, &friar);
 		if (ret < 0)
 			return ret;
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index ae6cb2688d52..547514542669 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -132,7 +132,10 @@
 /* These masks check for invalid bits in permission responses. */
 #define FANOTIFY_RESPONSE_ACCESS (FAN_ALLOW | FAN_DENY)
 #define FANOTIFY_RESPONSE_FLAGS (FAN_AUDIT | FAN_INFO)
-#define FANOTIFY_RESPONSE_VALID_MASK (FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS)
+#define FANOTIFY_RESPONSE_ERRNO	(FAN_ERRNO_MASK << FAN_ERRNO_SHIFT)
+#define FANOTIFY_RESPONSE_VALID_MASK \
+	(FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS | \
+	 FANOTIFY_RESPONSE_ERRNO)
 
 /* Do not use these old uapi constants internally */
 #undef FAN_ALL_CLASS_BITS
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index cc28dce5f744..7b746c5fcbd8 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -233,6 +233,13 @@ struct fanotify_response_info_audit_rule {
 /* Legit userspace responses to a _PERM event */
 #define FAN_ALLOW	0x01
 #define FAN_DENY	0x02
+/* errno other than EPERM can be specified in upper byte of deny response */
+#define FAN_ERRNO_BITS	8
+#define FAN_ERRNO_SHIFT (32 - FAN_ERRNO_BITS)
+#define FAN_ERRNO_MASK	((1 << FAN_ERRNO_BITS) - 1)
+#define FAN_DENY_ERRNO(err) \
+	(FAN_DENY | ((((__u32)(err)) & FAN_ERRNO_MASK) << FAN_ERRNO_SHIFT))
+
 #define FAN_AUDIT	0x10	/* Bitmask to create audit record for result */
 #define FAN_INFO	0x20	/* Bitmask to indicate additional information */
 
-- 
2.43.0



* [PATCH v2 10/16] fanotify: add a helper to check for pre content events
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (8 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 12:10   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches Josef Bacik
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

We want to emit events during page fault, and calling into fanotify
could be expensive, so add a helper to allow us to skip calling into
fanotify from page fault.  This will also be used to disable readahead
for content watched files which will be handled in a subsequent patch.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/notify/fsnotify.c             | 12 ++++++++++++
 include/linux/fsnotify_backend.h | 14 ++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 1ca4a8da7f29..cbfaa000f815 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -201,6 +201,18 @@ static inline bool fsnotify_object_watched(struct inode *inode, __u32 mnt_mask,
 	return mask & marks_mask & ALL_FSNOTIFY_EVENTS;
 }
 
+#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
+bool fsnotify_file_has_pre_content_watches(struct file *file)
+{
+	struct inode *inode = file_inode(file);
+	__u32 mnt_mask = real_mount(file->f_path.mnt)->mnt_fsnotify_mask;
+
+	return fsnotify_object_watched(inode, mnt_mask,
+				       FSNOTIFY_PRE_CONTENT_EVENTS);
+}
+#endif
+
+
 /*
  * Notify this dentry's parent about a child's events with child name info
  * if parent is watching or if inode/sb/mount are interested in events with
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 276320846bfd..b495a0676dd3 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -900,6 +900,15 @@ static inline void fsnotify_init_event(struct fsnotify_event *event)
 	INIT_LIST_HEAD(&event->list);
 }
 
+#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
+bool fsnotify_file_has_pre_content_watches(struct file *file);
+#else
+static inline bool fsnotify_file_has_pre_content_watches(struct file *file)
+{
+	return false;
+}
+#endif /* CONFIG_FANOTIFY_ACCESS_PERMISSIONS */
+
 #else
 
 static inline int fsnotify(__u32 mask, const void *data, int data_type,
@@ -938,6 +947,11 @@ static inline u32 fsnotify_get_cookie(void)
 static inline void fsnotify_unmount_inodes(struct super_block *sb)
 {}
 
+static inline bool fsnotify_file_has_pre_content_watches(struct file *file)
+{
+	return false;
+}
+
 #endif	/* CONFIG_FSNOTIFY */
 
 #endif	/* __KERNEL __ */
-- 
2.43.0



* [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (9 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 10/16] fanotify: add a helper to check for pre content events Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 12:12   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches Josef Bacik
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

With page faults we can trigger readahead on the file, and then
subsequent faults can find these pages and insert them into the file
without emitting an fanotify event.  To avoid this case, disable
readahead if we have pre-content watches on the file.  This way we are
guaranteed to get an event for every range we attempt to access on a
pre-content watched file.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 mm/filemap.c   | 12 ++++++++++++
 mm/readahead.c | 13 +++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index ca8c8d889eef..8b1684b62177 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3122,6 +3122,14 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	unsigned long vm_flags = vmf->vma->vm_flags;
 	unsigned int mmap_miss;
 
+	/*
+	 * If we have pre-content watches we need to disable readahead to make
+	 * sure that we don't populate our mapping with 0 filled pages that we
+	 * never emitted an event for.
+	 */
+	if (fsnotify_file_has_pre_content_watches(file))
+		return fpin;
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	/* Use the readahead code, even if readahead is disabled */
 	if ((vm_flags & VM_HUGEPAGE) && HPAGE_PMD_ORDER <= MAX_PAGECACHE_ORDER) {
@@ -3190,6 +3198,10 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 	struct file *fpin = NULL;
 	unsigned int mmap_miss;
 
+	/* See comment in do_sync_mmap_readahead. */
+	if (fsnotify_file_has_pre_content_watches(file))
+		return fpin;
+
 	/* If we don't want any read-ahead, don't bother */
 	if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
 		return fpin;
diff --git a/mm/readahead.c b/mm/readahead.c
index 817b2a352d78..bc068d9218e3 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -128,6 +128,7 @@
 #include <linux/blk-cgroup.h>
 #include <linux/fadvise.h>
 #include <linux/sched/mm.h>
+#include <linux/fsnotify.h>
 
 #include "internal.h"
 
@@ -674,6 +675,14 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 {
 	bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM);
 
+	/*
+	 * If we have pre-content watches we need to disable readahead to make
+	 * sure that we don't find 0 filled pages in cache that we never emitted
+	 * events for.
+	 */
+	if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file))
+		return;
+
 	/*
 	 * Even if readahead is disabled, issue this request as readahead
 	 * as we'll need it to satisfy the requested range. The forced
@@ -704,6 +713,10 @@ void page_cache_async_ra(struct readahead_control *ractl,
 	if (!ractl->ra->ra_pages)
 		return;
 
+	/* See the comment in page_cache_sync_ra. */
+	if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file))
+		return;
+
 	/*
 	 * Same bit is used for PG_readahead and PG_reclaim.
 	 */
-- 
2.43.0



* [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (10 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 12:13   ` Christian Brauner
  2024-08-08 19:27 ` [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault Josef Bacik
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

There's nothing stopping us from supporting this, we could simply pass
the order into the helper and emit the proper length.  However, there
are currently no tests to validate that this works properly, so disable
it until there's a desire to support this along with the appropriate
tests.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
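For reference, passing the order through could look roughly like the
sketch below.  It is not part of this series: fault_event_range(),
struct fault_range and SKETCH_PAGE_SHIFT are hypothetical names, and
4 KiB base pages are assumed.  It only shows the arithmetic, with the
start rounded down to the huge page boundary and the length scaled by
the order.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Byte range that a pre-content event would report for one fault. */
struct fault_range {
	int64_t pos;
	uint64_t count;
};

/*
 * pgoff is the faulting page index within the file and order is the
 * fault order (0 for a base page, 9 for a 2 MiB PMD on x86-64).
 */
static struct fault_range fault_event_range(uint64_t pgoff,
					    unsigned int order)
{
	struct fault_range r;
	uint64_t first;

	assert(order < 64 - SKETCH_PAGE_SHIFT);
	/* Round the page index down to the start of the huge page. */
	first = pgoff & ~((UINT64_C(1) << order) - 1);
	r.pos = (int64_t)(first << SKETCH_PAGE_SHIFT);
	r.count = UINT64_C(1) << (order + SKETCH_PAGE_SHIFT);
	return r;
}
```

With order 0 this reduces to the PAGE_SIZE event that the series emits
today.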
 mm/memory.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index d10e616d7389..3010bcc5e4f9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -78,6 +78,7 @@
 #include <linux/ptrace.h>
 #include <linux/vmalloc.h>
 #include <linux/sched/sysctl.h>
+#include <linux/fsnotify.h>
 
 #include <trace/events/kmem.h>
 
@@ -5252,8 +5253,17 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
 	if (vma_is_anonymous(vma))
 		return do_huge_pmd_anonymous_page(vmf);
+	/*
+	 * Currently we just emit PAGE_SIZE for our fault events, so don't allow
+	 * a huge fault if we have a pre content watch on this file.  This would
+	 * be trivial to support, but there would need to be tests to ensure
+	 * this works properly and those don't exist currently.
+	 */
+	if (file && fsnotify_file_has_pre_content_watches(file))
+		return VM_FAULT_FALLBACK;
 	if (vma->vm_ops->huge_fault)
 		return vma->vm_ops->huge_fault(vmf, PMD_ORDER);
 	return VM_FAULT_FALLBACK;
@@ -5263,6 +5273,7 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
 	const bool unshare = vmf->flags & FAULT_FLAG_UNSHARE;
 	vm_fault_t ret;
 
@@ -5277,6 +5288,9 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
 	}
 
 	if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
+		/* See comment in create_huge_pmd. */
+		if (file && fsnotify_file_has_pre_content_watches(file))
+			goto split;
 		if (vma->vm_ops->huge_fault) {
 			ret = vma->vm_ops->huge_fault(vmf, PMD_ORDER);
 			if (!(ret & VM_FAULT_FALLBACK))
@@ -5296,9 +5310,13 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
 	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
 	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
 	/* No support for anonymous transparent PUD pages yet */
 	if (vma_is_anonymous(vma))
 		return VM_FAULT_FALLBACK;
+	/* See comment in create_huge_pmd. */
+	if (file && fsnotify_file_has_pre_content_watches(file))
+		return VM_FAULT_FALLBACK;
 	if (vma->vm_ops->huge_fault)
 		return vma->vm_ops->huge_fault(vmf, PUD_ORDER);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -5310,12 +5328,16 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
 	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
 	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
 	vm_fault_t ret;
 
 	/* No support for anonymous transparent PUD pages yet */
 	if (vma_is_anonymous(vma))
 		goto split;
 	if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
+		/* See comment in create_huge_pmd. */
+		if (file && fsnotify_file_has_pre_content_watches(file))
+			goto split;
 		if (vma->vm_ops->huge_fault) {
 			ret = vma->vm_ops->huge_fault(vmf, PUD_ORDER);
 			if (!(ret & VM_FAULT_FALLBACK))
-- 
2.43.0



* [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (11 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 10:34   ` Amir Goldstein
  2024-08-08 19:27 ` [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault Josef Bacik
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

FS_PRE_ACCESS or FS_PRE_MODIFY will be generated on page fault depending
on the faulting method.

This pre-content event is meant to be used by hierarchical storage
managers that want to fill in the file content on first read access.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
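The helper added below, filemap_maybe_emit_fsnotify_event(), checks its
conditions in a fixed order.  The following userspace model is a sketch
of that order only; struct fault_state, enum outcome and
model_pre_content_fault() are hypothetical names, and the model
collapses the VM_FAULT_SIGBUS | VM_FAULT_RETRY return into a single
SIGBUS outcome.

```c
#include <assert.h>
#include <stdbool.h>

enum outcome {
	SKIP,		/* return 0, no event emitted */
	RETRY,		/* VM_FAULT_RETRY, nothing done */
	SIGBUS,		/* the fault fails */
	EMITTED,	/* event allowed, caller must retry the fault */
};

struct fault_state {
	bool already_tried;	/* FAULT_FLAG_TRIED is set */
	bool watched;		/* file has pre-content watches */
	bool nowait;		/* FAULT_FLAG_RETRY_NOWAIT is set */
	bool can_drop_lock;	/* maybe_unlock_mmap_for_io() pinned a file */
	int perm_result;	/* 0 on allow, negative errno on deny */
};

static enum outcome model_pre_content_fault(const struct fault_state *s)
{
	assert(s);
	if (s->already_tried)	/* second pass, event already handled */
		return SKIP;
	if (!s->watched)
		return SKIP;
	if (s->nowait)		/* cannot block waiting for userspace */
		return RETRY;
	if (!s->can_drop_lock)	/* must not wait with the lock held */
		return SIGBUS;
	if (s->perm_result)	/* listener denied the access */
		return SIGBUS;
	return EMITTED;
}
```

Checking FAULT_FLAG_TRIED first is what keeps the retried fault from
emitting a second event for the same range.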
 include/linux/mm.h |  2 +
 mm/filemap.c       | 97 ++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab3d78116043..c33f3b7f7261 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3503,6 +3503,8 @@ extern vm_fault_t filemap_fault(struct vm_fault *vmf);
 extern vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		pgoff_t start_pgoff, pgoff_t end_pgoff);
 extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
+extern vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
+						    struct file **fpin);
 
 extern unsigned long stack_guard_gap;
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
diff --git a/mm/filemap.c b/mm/filemap.c
index 8b1684b62177..3d232166b051 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -46,6 +46,7 @@
 #include <linux/pipe_fs_i.h>
 #include <linux/splice.h>
 #include <linux/rcupdate_wait.h>
+#include <linux/fsnotify.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include "internal.h"
@@ -3112,13 +3113,13 @@ static int lock_folio_maybe_drop_mmap(struct vm_fault *vmf, struct folio *folio,
  * that.  If we didn't pin a file then we return NULL.  The file that is
  * returned needs to be fput()'ed when we're done with it.
  */
-static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
+static struct file *do_sync_mmap_readahead(struct vm_fault *vmf,
+					   struct file *fpin)
 {
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
 	DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
-	struct file *fpin = NULL;
 	unsigned long vm_flags = vmf->vma->vm_flags;
 	unsigned int mmap_miss;
 
@@ -3190,12 +3191,12 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
  * was pinned if we have to drop the mmap_lock in order to do IO.
  */
 static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
-					    struct folio *folio)
+					    struct folio *folio,
+					    struct file *fpin)
 {
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
-	struct file *fpin = NULL;
 	unsigned int mmap_miss;
 
 	/* See comment in do_sync_mmap_readahead. */
@@ -3260,6 +3261,72 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
 	return ret;
 }
 
+/**
+ * filemap_maybe_emit_fsnotify_event - maybe emit a pre-content event.
+ * @vmf:	struct vm_fault containing details of the fault.
+ * @fpin:	pointer to the struct file pointer that may be pinned.
+ *
+ * If we have pre-content watches on this file we will need to emit an event for
+ * this range.  We will handle dropping the lock and emitting the event.
+ *
+ * If FAULT_FLAG_RETRY_NOWAIT is set then we'll return VM_FAULT_RETRY.
+ *
+ * If no event was emitted then *fpin will be NULL and we will return 0.
+ *
+ * If any error occurred we will return VM_FAULT_SIGBUS, *fpin could still be
+ * set and will need to have fput() called on it.
+ *
+ * If we emitted the event then we will return 0 and *fpin will be set.  The
+ * caller must fput() it and must return VM_FAULT_RETRY after any other
+ * operations it does, in order to re-fault the page and make sure the
+ * appropriate locking is maintained.
+ *
+ * Return: the appropriate vm_fault_t return code, 0 on success.
+ */
+vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
+					     struct file **fpin)
+{
+	struct file *file = vmf->vma->vm_file;
+	loff_t pos = vmf->pgoff << PAGE_SHIFT;
+	int mask = (vmf->flags & FAULT_FLAG_WRITE) ? MAY_WRITE : MAY_READ;
+	int ret;
+
+	/*
+	 * We already did this and now we're retrying with everything locked,
+	 * don't emit the event and continue.
+	 */
+	if (vmf->flags & FAULT_FLAG_TRIED)
+		return 0;
+
+	/* No watches, nothing to emit, return 0. */
+	if (!fsnotify_file_has_pre_content_watches(file))
+		return 0;
+
+	/* We are NOWAIT, we can't wait, just return VM_FAULT_RETRY. */
+	if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
+		return VM_FAULT_RETRY;
+
+	/*
+	 * If this fails then we're not allowed to drop the fault lock, return a
+	 * SIGBUS so we don't errantly populate pagecache with bogus data for
+	 * this file.
+	 */
+	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
+	if (*fpin == NULL)
+		return VM_FAULT_SIGBUS | VM_FAULT_RETRY;
+
+	/*
+	 * We can't fput(*fpin) at this point because we could have been passed
+	 * in fpin from a previous call.
+	 */
+	ret = fsnotify_file_area_perm(*fpin, mask, &pos, PAGE_SIZE);
+	if (ret)
+		return VM_FAULT_SIGBUS;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(filemap_maybe_emit_fsnotify_event);
+
 /**
  * filemap_fault - read in file data for page fault handling
  * @vmf:	struct vm_fault containing details of the fault
@@ -3299,6 +3366,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	if (unlikely(index >= max_idx))
 		return VM_FAULT_SIGBUS;
 
+	/*
+	 * If we have pre-content watchers then we need to generate events on
+	 * page fault so that we can populate any data before the fault.
+	 */
+	ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
+	if (unlikely(ret)) {
+		if (fpin) {
+			fput(fpin);
+			ret |= VM_FAULT_RETRY;
+		}
+		return ret;
+	}
+
 	/*
 	 * Do we have something in the page cache already?
 	 */
@@ -3309,21 +3389,24 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		 * the lock.
 		 */
 		if (!(vmf->flags & FAULT_FLAG_TRIED))
-			fpin = do_async_mmap_readahead(vmf, folio);
+			fpin = do_async_mmap_readahead(vmf, folio, fpin);
 		if (unlikely(!folio_test_uptodate(folio))) {
 			filemap_invalidate_lock_shared(mapping);
 			mapping_locked = true;
 		}
 	} else {
 		ret = filemap_fault_recheck_pte_none(vmf);
-		if (unlikely(ret))
+		if (unlikely(ret)) {
+			if (fpin)
+				goto out_retry;
 			return ret;
+		}
 
 		/* No page in the page cache at all */
 		count_vm_event(PGMAJFAULT);
 		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
 		ret = VM_FAULT_MAJOR;
-		fpin = do_sync_mmap_readahead(vmf);
+		fpin = do_sync_mmap_readahead(vmf, fpin);
 retry_find:
 		/*
 		 * See comment in filemap_create_folio() why we need
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (12 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-09 13:11   ` Amir Goldstein
  2024-08-08 19:27 ` [PATCH v2 15/16] gfs2: " Josef Bacik
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

bcachefs has its own locking around filemap_fault, so we have to make
sure we do the fsnotify hook before the locking.  Add the check to emit
the event before the locking and return VM_FAULT_RETRY to retrigger the
fault once the event has been emitted.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/bcachefs/fs-io-pagecache.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/bcachefs/fs-io-pagecache.c b/fs/bcachefs/fs-io-pagecache.c
index a9cc5cad9cc9..359856df52d4 100644
--- a/fs/bcachefs/fs-io-pagecache.c
+++ b/fs/bcachefs/fs-io-pagecache.c
@@ -562,6 +562,7 @@ void bch2_set_folio_dirty(struct bch_fs *c,
 vm_fault_t bch2_page_fault(struct vm_fault *vmf)
 {
 	struct file *file = vmf->vma->vm_file;
+	struct file *fpin = NULL;
 	struct address_space *mapping = file->f_mapping;
 	struct address_space *fdm = faults_disabled_mapping();
 	struct bch_inode_info *inode = file_bch_inode(file);
@@ -570,6 +571,18 @@ vm_fault_t bch2_page_fault(struct vm_fault *vmf)
 	if (fdm == mapping)
 		return VM_FAULT_SIGBUS;
 
+	ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
+	if (unlikely(ret)) {
+		if (fpin) {
+			fput(fpin);
+			ret |= VM_FAULT_RETRY;
+		}
+		return ret;
+	} else if (fpin) {
+		fput(fpin);
+		return VM_FAULT_RETRY;
+	}
+
 	/* Lock ordering: */
 	if (fdm > mapping) {
 		struct bch_inode_info *fdm_host = to_bch_ei(fdm->host);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 15/16] gfs2: add pre-content fsnotify hook to fault
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (13 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-08 19:27 ` [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults Josef Bacik
  2024-08-08 22:15 ` [PATCH v2 00/16] fanotify: add pre-content hooks Dave Chinner
  16 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

gfs2 takes the glock before calling into filemap fault, so add the
fsnotify hook for ->fault before we take the glock in order to avoid any
possible deadlock with the HSM.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/gfs2/file.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 08982937b5df..b841b1720b5c 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -553,9 +553,22 @@ static vm_fault_t gfs2_fault(struct vm_fault *vmf)
 	struct inode *inode = file_inode(vmf->vma->vm_file);
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_holder gh;
+	struct file *fpin = NULL;
 	vm_fault_t ret;
 	int err;
 
+	ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
+	if (unlikely(ret)) {
+		if (fpin) {
+			fput(fpin);
+			ret |= VM_FAULT_RETRY;
+		}
+		return ret;
+	} else if (fpin) {
+		fput(fpin);
+		return VM_FAULT_RETRY;
+	}
+
 	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
 	err = gfs2_glock_nq(&gh);
 	if (err) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (14 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 15/16] gfs2: " Josef Bacik
@ 2024-08-08 19:27 ` Josef Bacik
  2024-08-08 22:03   ` Dave Chinner
  2024-08-08 22:15 ` [PATCH v2 00/16] fanotify: add pre-content hooks Dave Chinner
  16 siblings, 1 reply; 36+ messages in thread
From: Josef Bacik @ 2024-08-08 19:27 UTC (permalink / raw)
  To: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

xfs has its own handling for write faults, so we need to add the
pre-content fsnotify hook for this case.  Reads go through filemap_fault
so they're handled properly there.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/xfs/xfs_file.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4cdc54dc9686..585a8c2eea0f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1325,14 +1325,28 @@ __xfs_filemap_fault(
 	bool			write_fault)
 {
 	struct inode		*inode = file_inode(vmf->vma->vm_file);
+	struct file		*fpin = NULL;
+	vm_fault_t		ret;
 
 	trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
 
-	if (write_fault)
-		return xfs_write_fault(vmf, order);
 	if (IS_DAX(inode))
 		return xfs_dax_read_fault(vmf, order);
-	return filemap_fault(vmf);
+
+	if (!write_fault)
+		return filemap_fault(vmf);
+
+	ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
+	if (unlikely(ret)) {
+		if (fpin)
+			fput(fpin);
+		return ret;
+	} else if (fpin) {
+		fput(fpin);
+		return VM_FAULT_RETRY;
+	}
+
+	return xfs_write_fault(vmf, order);
 }
 
 static inline bool
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread
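
The three per-filesystem hunks above (bcachefs, gfs2, xfs) all wrap the new
helper in nearly the same caller-side pattern. As a rough userspace sketch of
that pattern, and not kernel code, the following models it with stand-in
types. Every demo_* name and DEMO_* value is hypothetical and exists only for
illustration; the flag values are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; these are NOT the kernel's flag values. */
#define DEMO_FAULT_RETRY  0x1u
#define DEMO_FAULT_SIGBUS 0x2u

/* Hypothetical stand-in for a pinned struct file. */
struct demo_file {
	int refs;
};

/* Stand-in for fput(): drop the reference taken when the file was pinned. */
static void demo_fput(struct demo_file *f)
{
	assert(f->refs > 0);
	f->refs--;
}

/*
 * Models what a ->fault handler does with the helper's result before it
 * takes its own locks.  Returns nonzero flags if the handler must bail
 * out, or 0 if it may carry on with the fault.
 */
static unsigned int demo_handle_pre_content(unsigned int ret,
					    struct demo_file *fpin)
{
	if (ret) {
		/*
		 * Error or NOWAIT.  A pinned file means the mmap lock
		 * was dropped, so drop the pin and ask for a retry.
		 */
		if (fpin) {
			demo_fput(fpin);
			ret |= DEMO_FAULT_RETRY;
		}
		return ret;
	}

	if (fpin) {
		/* Event emitted and lock dropped: re-fault the page. */
		demo_fput(fpin);
		return DEMO_FAULT_RETRY;
	}

	/* No watch or already retried: proceed with the fault. */
	return 0;
}
```

Note that the xfs hunk as posted does not OR in VM_FAULT_RETRY on its error
path, while the bcachefs and gfs2 hunks do; the sketch follows the latter two.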

* Re: [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults
  2024-08-08 19:27 ` [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults Josef Bacik
@ 2024-08-08 22:03   ` Dave Chinner
  2024-08-09 14:15     ` Josef Bacik
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2024-08-08 22:03 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:18PM -0400, Josef Bacik wrote:
> xfs has its own handling for write faults, so we need to add the
> pre-content fsnotify hook for this case.  Reads go through filemap_fault
> so they're handled properly there.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/xfs/xfs_file.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 4cdc54dc9686..585a8c2eea0f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1325,14 +1325,28 @@ __xfs_filemap_fault(
>  	bool			write_fault)
>  {
>  	struct inode		*inode = file_inode(vmf->vma->vm_file);
> +	struct file		*fpin = NULL;
> +	vm_fault_t		ret;
>  
>  	trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
>  
> -	if (write_fault)
> -		return xfs_write_fault(vmf, order);
>  	if (IS_DAX(inode))
>  		return xfs_dax_read_fault(vmf, order);
> -	return filemap_fault(vmf);
> +
> +	if (!write_fault)
> +		return filemap_fault(vmf);

Doesn't this break DAX read faults? i.e. they have to go through
xfs_dax_read_fault(), not filemap_fault().

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread
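
To make the ordering concern in the reply above easier to check, here is a
small userspace model of the dispatch in __xfs_filemap_fault() before and
after the hunk, read only from the diff as posted. The demo_* names are
hypothetical, and the model ignores everything outside the quoted hunk. Traced
this way, the read-fault dispatch looks unchanged, and it is the DAX write
fault that appears to change handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Which function the hunk ends up calling (names in comments). */
enum demo_handler {
	DEMO_WRITE_FAULT,    /* xfs_write_fault() */
	DEMO_DAX_READ_FAULT, /* xfs_dax_read_fault() */
	DEMO_FILEMAP_FAULT,  /* filemap_fault() */
};

/* Order of checks in the removed lines: write fault is tested first. */
static enum demo_handler demo_dispatch_before(bool dax, bool write_fault)
{
	if (write_fault)
		return DEMO_WRITE_FAULT;
	if (dax)
		return DEMO_DAX_READ_FAULT;
	return DEMO_FILEMAP_FAULT;
}

/* Order of checks in the added lines: IS_DAX() is tested first. */
static enum demo_handler demo_dispatch_after(bool dax, bool write_fault)
{
	if (dax)
		return DEMO_DAX_READ_FAULT;
	if (!write_fault)
		return DEMO_FILEMAP_FAULT;
	/* Reached only after the pre-content event has been handled. */
	return DEMO_WRITE_FAULT;
}
```

Comparing the two functions over all four (dax, write_fault) combinations
shows a difference only for dax == true with write_fault == true.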

* Re: [PATCH v2 00/16] fanotify: add pre-content hooks
  2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
                   ` (15 preceding siblings ...)
  2024-08-08 19:27 ` [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults Josef Bacik
@ 2024-08-08 22:15 ` Dave Chinner
  2024-08-09 14:18   ` Josef Bacik
  16 siblings, 1 reply; 36+ messages in thread
From: Dave Chinner @ 2024-08-08 22:15 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:02PM -0400, Josef Bacik wrote:
> v1: https://lore.kernel.org/linux-fsdevel/cover.1721931241.git.josef@toxicpanda.com/
> 
> v1->v2:
> - reworked the page fault logic based on Jan's suggestion and turned it into a
>   helper.
> - Added 3 patches per-fs where we need to call the fsnotify helper from their
>   ->fault handlers.
> - Disabled readahead in the case that there's a pre-content watch in place.
> - Disabled huge faults when there's a pre-content watch in place (entirely
>   because it's untested, theoretically it should be straightforward to do).
> - Updated the command numbers.
> - Addressed the random spelling/grammar mistakes that Jan pointed out.
> - Addressed the other random nits from Jan.
> 
> --- Original email ---
> 
> Hello,
> 
> These are the patches for the bare bones pre-content fanotify support.  The
> majority of this work is Amir's, my contribution to this has solely been around
> adding the page fault hooks, testing and validating everything.  I'm sending it
> because Amir is traveling a bunch, and I touched it last so I'm going to take
> all the hate and he can take all the credit.

Brave man. :)

> There is a PoC that I've been using to validate this work, you can find the git
> repo here
> 
> https://github.com/josefbacik/remote-fetch
> 
> This consists of 3 different tools.
> 
> 1. populate.  This just creates all the stub files in the directory from the
>    source directory.  Just run ./populate ~/linux ~/hsm-linux and it'll
>    recursively create all of the stub files and directories.
> 2. remote-fetch.  This is the actual PoC, you just point it at the source and
>    destination directory and then you can do whatever.  ./remote-fetch ~/linux
>    ~/hsm-linux.
> 3. mmap-validate.  This was to validate the pagefault thing, this is likely what
>    will be turned into the selftest with remote-fetch.  It creates a file and
>    then you can validate the file matches the right pattern with both normal
>    reads and mmap.  Normally I do something like
> 
>    ./mmap-validate create ~/src/foo
>    ./populate ~/src ~/dst
>    ./remote-fetch ~/src ~/dst
>    ./mmap-validate validate ~/dst/foo

This smells like something that should be added to fstests.

FWIW, fstests used to have a whole "fake-hsm" infrastructure
subsystem in it for testing DMAPI events used by HSMs. It was
removed in this commit:

commit 6497ede7ad4e9fc8e5a5a121bd600df896b7d9c6
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Thu Feb 11 13:33:38 2021 -0800

    fstests: remove DMAPI tests

    Upstream XFS has never supported DMAPI, so remove the tests for this
    feature.

    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Acked-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Eryu Guan <guaneryu@gmail.com>

See ./dmapi/src/sample_hsm/ for the HSM test code that was removed
in that patchset - it might provide some infrastructure that can be
used to test the fanotify HSM event infrastructure without
reinventing the entire wheel...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault
  2024-08-08 19:27 ` [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault Josef Bacik
@ 2024-08-09 10:34   ` Amir Goldstein
  2024-08-09 14:19     ` Josef Bacik
  0 siblings, 1 reply; 36+ messages in thread
From: Amir Goldstein @ 2024-08-09 10:34 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, brauner, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 8, 2024 at 9:28 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> FS_PRE_ACCESS or FS_PRE_MODIFY will be generated on page fault depending
> on the faulting method.
>
> This pre-content event is meant to be used by hierarchical storage
> managers that want to fill in the file content on first read access.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  include/linux/mm.h |  2 +
>  mm/filemap.c       | 97 ++++++++++++++++++++++++++++++++++++++++++----
>  2 files changed, 92 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ab3d78116043..c33f3b7f7261 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3503,6 +3503,8 @@ extern vm_fault_t filemap_fault(struct vm_fault *vmf);
>  extern vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>                 pgoff_t start_pgoff, pgoff_t end_pgoff);
>  extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
> +extern vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
> +                                                   struct file **fpin);
>
>  extern unsigned long stack_guard_gap;
>  /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 8b1684b62177..3d232166b051 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -46,6 +46,7 @@
>  #include <linux/pipe_fs_i.h>
>  #include <linux/splice.h>
>  #include <linux/rcupdate_wait.h>
> +#include <linux/fsnotify.h>
>  #include <asm/pgalloc.h>
>  #include <asm/tlbflush.h>
>  #include "internal.h"
> @@ -3112,13 +3113,13 @@ static int lock_folio_maybe_drop_mmap(struct vm_fault *vmf, struct folio *folio,
>   * that.  If we didn't pin a file then we return NULL.  The file that is
>   * returned needs to be fput()'ed when we're done with it.
>   */
> -static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
> +static struct file *do_sync_mmap_readahead(struct vm_fault *vmf,
> +                                          struct file *fpin)
>  {
>         struct file *file = vmf->vma->vm_file;
>         struct file_ra_state *ra = &file->f_ra;
>         struct address_space *mapping = file->f_mapping;
>         DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
> -       struct file *fpin = NULL;
>         unsigned long vm_flags = vmf->vma->vm_flags;
>         unsigned int mmap_miss;
>
> @@ -3190,12 +3191,12 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
>   * was pinned if we have to drop the mmap_lock in order to do IO.
>   */
>  static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
> -                                           struct folio *folio)
> +                                           struct folio *folio,
> +                                           struct file *fpin)
>  {
>         struct file *file = vmf->vma->vm_file;
>         struct file_ra_state *ra = &file->f_ra;
>         DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
> -       struct file *fpin = NULL;
>         unsigned int mmap_miss;
>
>         /* See comment in do_sync_mmap_readahead. */
> @@ -3260,6 +3261,72 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
>         return ret;
>  }
>
> +/**
> + * filemap_maybe_emit_fsnotify_event - maybe emit a pre-content event.
> + * @vmf:       struct vm_fault containing details of the fault.
> + * @fpin:      pointer to the struct file pointer that may be pinned.
> + *
> + * If we have pre-content watches on this file we will need to emit an event for
> + * this range.  We will handle dropping the lock and emitting the event.
> + *
> + * If FAULT_FLAG_RETRY_NOWAIT is set then we'll return VM_FAULT_RETRY.
> + *
> + * If no event was emitted then *fpin will be NULL and we will return 0.
> + *
> + * If any error occurred we will return VM_FAULT_SIGBUS, *fpin could still be
> + * set and will need to have fput() called on it.
> + *
> + * If we emitted the event then we will return 0 and *fpin will be set, this
> + * must have fput() called on it, and the caller must return VM_FAULT_RETRY after
> + * any other operations it does in order to re-fault the page and make sure the
> + * appropriate locking is maintained.
> + *
> + * Return: the appropriate vm_fault_t return code, 0 on success.
> + */
> +vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
> +                                            struct file **fpin)
> +{
> +       struct file *file = vmf->vma->vm_file;
> +       loff_t pos = vmf->pgoff << PAGE_SHIFT;
> +       int mask = (vmf->flags & FAULT_FLAG_WRITE) ? MAY_WRITE : MAY_READ;

You missed my comment about using MAY_ACCESS here
and altering the fsnotify hook, so that the legacy FAN_ACCESS_PERM event
won't be generated from page fault.

Thanks,
Amir.

> +       int ret;
> +
> +       /*
> +        * We already did this and now we're retrying with everything locked,
> +        * don't emit the event and continue.
> +        */
> +       if (vmf->flags & FAULT_FLAG_TRIED)
> +               return 0;
> +
> +       /* No watches, nothing to emit, return 0. */
> +       if (!fsnotify_file_has_pre_content_watches(file))
> +               return 0;
> +
> +       /* We are NOWAIT, we can't wait, just return VM_FAULT_RETRY. */
> +       if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
> +               return VM_FAULT_RETRY;
> +
> +       /*
> +        * If this fails then we're not allowed to drop the fault lock, return a
> +        * SIGBUS so we don't errantly populate pagecache with bogus data for
> +        * this file.
> +        */
> +       *fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
> +       if (*fpin == NULL)
> +               return VM_FAULT_SIGBUS | VM_FAULT_RETRY;
> +
> +       /*
> +        * We can't fput(*fpin) at this point because we could have been passed
> +        * in fpin from a previous call.
> +        */
> +       ret = fsnotify_file_area_perm(*fpin, mask, &pos, PAGE_SIZE);
> +       if (ret)
> +               return VM_FAULT_SIGBUS;
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(filemap_maybe_emit_fsnotify_event);
> +
>  /**
>   * filemap_fault - read in file data for page fault handling
>   * @vmf:       struct vm_fault containing details of the fault
> @@ -3299,6 +3366,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>         if (unlikely(index >= max_idx))
>                 return VM_FAULT_SIGBUS;
>
> +       /*
> +        * If we have pre-content watchers then we need to generate events on
> +        * page fault so that we can populate any data before the fault.
> +        */
> +       ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
> +       if (unlikely(ret)) {
> +               if (fpin) {
> +                       fput(fpin);
> +                       ret |= VM_FAULT_RETRY;
> +               }
> +               return ret;
> +       }
> +
>         /*
>          * Do we have something in the page cache already?
>          */
> @@ -3309,21 +3389,24 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>                  * the lock.
>                  */
>                 if (!(vmf->flags & FAULT_FLAG_TRIED))
> -                       fpin = do_async_mmap_readahead(vmf, folio);
> +                       fpin = do_async_mmap_readahead(vmf, folio, fpin);
>                 if (unlikely(!folio_test_uptodate(folio))) {
>                         filemap_invalidate_lock_shared(mapping);
>                         mapping_locked = true;
>                 }
>         } else {
>                 ret = filemap_fault_recheck_pte_none(vmf);
> -               if (unlikely(ret))
> +               if (unlikely(ret)) {
> +                       if (fpin)
> +                               goto out_retry;
>                         return ret;
> +               }
>
>                 /* No page in the page cache at all */
>                 count_vm_event(PGMAJFAULT);
>                 count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
>                 ret = VM_FAULT_MAJOR;
> -               fpin = do_sync_mmap_readahead(vmf);
> +               fpin = do_sync_mmap_readahead(vmf, fpin);
>  retry_find:
>                 /*
>                  * See comment in filemap_create_folio() why we need
> --
> 2.43.0
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 07/16] fanotify: rename a misnamed constant
  2024-08-08 19:27 ` [PATCH v2 07/16] fanotify: rename a misnamed constant Josef Bacik
@ 2024-08-09 11:41   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 11:41 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:09PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> FANOTIFY_PIDFD_INFO_HDR_LEN is not the length of the header.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---

Should probably be routed separately?
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 03/16] fsnotify: generate pre-content permission event on open
  2024-08-08 19:27 ` [PATCH v2 03/16] fsnotify: generate pre-content permission event on open Josef Bacik
@ 2024-08-09 11:51   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 11:51 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:05PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> FS_PRE_ACCESS or FS_PRE_MODIFY will be generated on open depending on
> file open mode.  The pre-content event will be generated in addition to
> FS_OPEN_PERM, but without sb_writers held and after file was truncated
> in case file was opened with O_CREAT and/or O_TRUNC.
> 
> The event will have a range info of (0..0) to provide an opportunity
> to fill entire file content on open.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---

More magic hooks in that code...

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event
  2024-08-08 19:27 ` [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event Josef Bacik
@ 2024-08-09 11:57   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 11:57 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:06PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> Similar to FAN_ACCESS_PERM permission event, but it is only allowed with
> class FAN_CLASS_PRE_CONTENT and only allowed on regular files and dirs.
> 
> Unlike FAN_ACCESS_PERM, it is safe to write to the file being accessed
> in the context of the event handler.
> 
> This pre-content event is meant to be used by hierarchical storage
> managers that want to fill the content of files on first read access.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fanotify/fanotify.c      |  3 ++-
>  fs/notify/fanotify/fanotify_user.c | 17 ++++++++++++++---
>  include/linux/fanotify.h           | 14 ++++++++++----
>  include/uapi/linux/fanotify.h      |  2 ++
>  4 files changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 224bccaab4cc..7dac8e4486df 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -910,8 +910,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>  	BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
>  	BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
>  	BUILD_BUG_ON(FAN_RENAME != FS_RENAME);
> +	BUILD_BUG_ON(FAN_PRE_ACCESS != FS_PRE_ACCESS);
>  
> -	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 21);
> +	BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 22);
>  
>  	mask = fanotify_group_event_mask(group, iter_info, &match_mask,
>  					 mask, data, data_type, dir);
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 2e2fba8a9d20..c294849e474f 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1628,6 +1628,7 @@ static int fanotify_events_supported(struct fsnotify_group *group,
>  				     unsigned int flags)
>  {
>  	unsigned int mark_type = flags & FANOTIFY_MARK_TYPE_BITS;
> +	bool is_dir = d_is_dir(path->dentry);
>  	/* Strict validation of events in non-dir inode mask with v5.17+ APIs */
>  	bool strict_dir_events = FAN_GROUP_FLAG(group, FAN_REPORT_TARGET_FID) ||
>  				 (mask & FAN_RENAME) ||
> @@ -1665,9 +1666,15 @@ static int fanotify_events_supported(struct fsnotify_group *group,
>  	 * but because we always allowed it, error only when using new APIs.
>  	 */
>  	if (strict_dir_events && mark_type == FAN_MARK_INODE &&
> -	    !d_is_dir(path->dentry) && (mask & FANOTIFY_DIRONLY_EVENT_BITS))
> +	    !is_dir && (mask & FANOTIFY_DIRONLY_EVENT_BITS))
>  		return -ENOTDIR;
>  
> +	/* Pre-content events are only supported on regular files and dirs */
> +	if (mask & FANOTIFY_PRE_CONTENT_EVENTS) {
> +		if (!is_dir && !d_is_reg(path->dentry))
> +			return -EINVAL;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -1769,11 +1776,15 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>  		goto fput_and_out;
>  
>  	/*
> -	 * Permission events require minimum priority FAN_CLASS_CONTENT.
> +	 * Permission events are not allowed for FAN_CLASS_NOTIF.
> +	 * Pre-content permission events are not allowed for FAN_CLASS_CONTENT.
>  	 */
>  	ret = -EINVAL;
>  	if (mask & FANOTIFY_PERM_EVENTS &&
> -	    group->priority < FSNOTIFY_PRIO_CONTENT)
> +	    group->priority == FSNOTIFY_PRIO_NORMAL)
> +		goto fput_and_out;
> +	else if (mask & FANOTIFY_PRE_CONTENT_EVENTS &&
> +		 group->priority == FSNOTIFY_PRIO_CONTENT)
>  		goto fput_and_out;
>  
>  	if (mask & FAN_FS_ERROR &&
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index 4f1c4f603118..5c811baf44d2 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -88,6 +88,16 @@
>  #define FANOTIFY_DIRENT_EVENTS	(FAN_MOVE | FAN_CREATE | FAN_DELETE | \
>  				 FAN_RENAME)
>  
> +/* Content events can be used to inspect file content */
> +#define FANOTIFY_CONTENT_PERM_EVENTS (FAN_OPEN_PERM | FAN_OPEN_EXEC_PERM | \
> +				      FAN_ACCESS_PERM)
> +/* Pre-content events can be used to fill file content */
> +#define FANOTIFY_PRE_CONTENT_EVENTS  (FAN_PRE_ACCESS)
> +
> +/* Events that require a permission response from user */
> +#define FANOTIFY_PERM_EVENTS	(FANOTIFY_CONTENT_PERM_EVENTS | \
> +				 FANOTIFY_PRE_CONTENT_EVENTS)
> +

Fwiw, this is one of my pet peeves with fanotify. It uses nesting of
defines very liberally. For the occasional reader that needs to
understand what flags are checked for, it's quite an exercise having to
go back and resolve multiple levels of defines. I would humbly urge
some restraint in that area.

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread
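
As a reading aid for the priority check in the do_fanotify_mark() hunk quoted
above, the following userspace sketch restates it with the permission mask
written out flat rather than nested. All DEMO_* bit values and demo_* names
are hypothetical and chosen only for illustration; they are not the real
FAN_* or FSNOTIFY_PRIO_* definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for FSNOTIFY_PRIO_NORMAL / _CONTENT / _PRE_CONTENT. */
enum demo_prio {
	DEMO_PRIO_NORMAL,
	DEMO_PRIO_CONTENT,
	DEMO_PRIO_PRE_CONTENT,
};

/* Hypothetical event bits; the real FAN_* values differ. */
#define DEMO_OPEN_PERM      (1u << 0)
#define DEMO_ACCESS_PERM    (1u << 1)
#define DEMO_OPEN_EXEC_PERM (1u << 2)
#define DEMO_PRE_ACCESS     (1u << 3)

#define DEMO_PRE_CONTENT_EVENTS (DEMO_PRE_ACCESS)

/* Every event that needs a permission response, listed in one place. */
#define DEMO_PERM_EVENTS (DEMO_OPEN_PERM | DEMO_ACCESS_PERM | \
			  DEMO_OPEN_EXEC_PERM | DEMO_PRE_ACCESS)

/* Returns 0 if the mask is acceptable for the group class, else -EINVAL. */
static int demo_check_priority(uint32_t mask, enum demo_prio prio)
{
	/* Permission events are not allowed for the notification class. */
	if ((mask & DEMO_PERM_EVENTS) && prio == DEMO_PRIO_NORMAL)
		return -EINVAL;

	/* Pre-content events are not allowed for the content class. */
	if ((mask & DEMO_PRE_CONTENT_EVENTS) && prio == DEMO_PRIO_CONTENT)
		return -EINVAL;

	return 0;
}
```

Under these assumptions, demo_check_priority(DEMO_PRE_ACCESS, prio) succeeds
only for DEMO_PRIO_PRE_CONTENT, while the older permission events are accepted
for either of the two content classes.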

* Re: [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY permission event
  2024-08-08 19:27 ` [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY " Josef Bacik
@ 2024-08-09 11:57   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 11:57 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:07PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> Generate FAN_PRE_MODIFY permission event from fsnotify_file_perm()
> pre-write hook to notify fanotify listeners on an intent to make
> modification to a file.
> 
> Like FAN_PRE_ACCESS, it is only allowed with FAN_CLASS_PRE_CONTENT
> and unlike FAN_MODIFY, it is only allowed on regular files.
> 
> Like FAN_PRE_ACCESS, it is generated without sb_start_write() held,
> so it is safe to perform filesystem modifications in the context of
> event handler.
> 
> This pre-content event is meant to be used by hierarchical storage
> managers that want to fill the content of files on first write access.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event
  2024-08-08 19:27 ` [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event Josef Bacik
@ 2024-08-09 12:00   ` Christian Brauner
  2024-08-09 18:36     ` Josef Bacik
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 12:00 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:08PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> We would like to add file range information to pre-content events.
> 
> Pass a struct file_range with optional offset and length to event handler
> along with pre-content permission event.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fanotify/fanotify.c    | 10 ++++++++--
>  fs/notify/fanotify/fanotify.h    |  2 ++
>  include/linux/fsnotify.h         | 17 ++++++++++++++++-
>  include/linux/fsnotify_backend.h | 32 ++++++++++++++++++++++++++++++++
>  4 files changed, 58 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index b163594843f5..4e8dce39fa8f 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -549,9 +549,13 @@ static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
>  	return &pevent->fae;
>  }
>  
> -static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
> +static struct fanotify_event *fanotify_alloc_perm_event(const void *data,
> +							int data_type,
>  							gfp_t gfp)
>  {
> +	const struct path *path = fsnotify_data_path(data, data_type);
> +	const struct file_range *range =
> +			    fsnotify_data_file_range(data, data_type);
>  	struct fanotify_perm_event *pevent;
>  
>  	pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp);
> @@ -565,6 +569,8 @@ static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
>  	pevent->hdr.len = 0;
>  	pevent->state = FAN_EVENT_INIT;
>  	pevent->path = *path;
> +	pevent->ppos = range ? range->ppos : NULL;
> +	pevent->count = range ? range->count : 0;
>  	path_get(path);
>  
>  	return &pevent->fae;
> @@ -802,7 +808,7 @@ static struct fanotify_event *fanotify_alloc_event(
>  	old_memcg = set_active_memcg(group->memcg);
>  
>  	if (fanotify_is_perm_event(mask)) {
> -		event = fanotify_alloc_perm_event(path, gfp);
> +		event = fanotify_alloc_perm_event(data, data_type, gfp);
>  	} else if (fanotify_is_error_event(mask)) {
>  		event = fanotify_alloc_error_event(group, fsid, data,
>  						   data_type, &hash);
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index e5ab33cae6a7..93598b7d5952 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -425,6 +425,8 @@ FANOTIFY_PE(struct fanotify_event *event)
>  struct fanotify_perm_event {
>  	struct fanotify_event fae;
>  	struct path path;
> +	const loff_t *ppos;		/* optional file range info */
> +	size_t count;
>  	u32 response;			/* userspace answer to the event */
>  	unsigned short state;		/* state of the event */
>  	int fd;		/* fd we passed to userspace for this event */
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index a28daf136fea..4609d9b6b087 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -132,6 +132,21 @@ static inline int fsnotify_file(struct file *file, __u32 mask)
>  }
>  
>  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
> +static inline int fsnotify_file_range(struct file *file, __u32 mask,
> +				      const loff_t *ppos, size_t count)
> +{
> +	struct file_range range;
> +
> +	if (file->f_mode & FMODE_NONOTIFY)
> +		return 0;
> +
> +	range.path = &file->f_path;
> +	range.ppos = ppos;
> +	range.count = count;
> +	return fsnotify_parent(range.path->dentry, mask, &range,
> +			       FSNOTIFY_EVENT_FILE_RANGE);
> +}
> +
>  /*
>   * fsnotify_file_area_perm - permission hook before access/modify of file range
>   */
> @@ -175,7 +190,7 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
>  	else
>  		return 0;
>  
> -	return fsnotify_file(file, fsnotify_mask);
> +	return fsnotify_file_range(file, fsnotify_mask, ppos, count);
>  }
>  
>  /*
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 200a5e3b1cd4..276320846bfd 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -298,6 +298,7 @@ static inline void fsnotify_group_assert_locked(struct fsnotify_group *group)
>  /* When calling fsnotify tell it if the data is a path or inode */
>  enum fsnotify_data_type {
>  	FSNOTIFY_EVENT_NONE,
> +	FSNOTIFY_EVENT_FILE_RANGE,
>  	FSNOTIFY_EVENT_PATH,
>  	FSNOTIFY_EVENT_INODE,
>  	FSNOTIFY_EVENT_DENTRY,
> @@ -310,6 +311,17 @@ struct fs_error_report {
>  	struct super_block *sb;
>  };
>  
> +struct file_range {
> +	const struct path *path;
> +	const loff_t *ppos;
> +	size_t count;
> +};
> +
> +static inline const struct path *file_range_path(const struct file_range *range)
> +{
> +	return range->path;
> +}
> +
>  static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
>  {
>  	switch (data_type) {
> @@ -319,6 +331,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
>  		return d_inode(data);
>  	case FSNOTIFY_EVENT_PATH:
>  		return d_inode(((const struct path *)data)->dentry);
> +	case FSNOTIFY_EVENT_FILE_RANGE:
> +		return d_inode(file_range_path(data)->dentry);
>  	case FSNOTIFY_EVENT_ERROR:
>  		return ((struct fs_error_report *)data)->inode;
>  	default:
> @@ -334,6 +348,8 @@ static inline struct dentry *fsnotify_data_dentry(const void *data, int data_typ
>  		return (struct dentry *)data;
>  	case FSNOTIFY_EVENT_PATH:
>  		return ((const struct path *)data)->dentry;
> +	case FSNOTIFY_EVENT_FILE_RANGE:
> +		return file_range_path(data)->dentry;
>  	default:
>  		return NULL;
>  	}
> @@ -345,6 +361,8 @@ static inline const struct path *fsnotify_data_path(const void *data,
>  	switch (data_type) {
>  	case FSNOTIFY_EVENT_PATH:
>  		return data;
> +	case FSNOTIFY_EVENT_FILE_RANGE:
> +		return file_range_path(data);
>  	default:
>  		return NULL;
>  	}
> @@ -360,6 +378,8 @@ static inline struct super_block *fsnotify_data_sb(const void *data,
>  		return ((struct dentry *)data)->d_sb;
>  	case FSNOTIFY_EVENT_PATH:
>  		return ((const struct path *)data)->dentry->d_sb;
> +	case FSNOTIFY_EVENT_FILE_RANGE:
> +		return file_range_path(data)->dentry->d_sb;
>  	case FSNOTIFY_EVENT_ERROR:
>  		return ((struct fs_error_report *) data)->sb;
>  	default:
> @@ -379,6 +399,18 @@ static inline struct fs_error_report *fsnotify_data_error_report(
>  	}
>  }
>  
> +static inline const struct file_range *fsnotify_data_file_range(
> +							const void *data,
> +							int data_type)
> +{
> +	switch (data_type) {
> +	case FSNOTIFY_EVENT_FILE_RANGE:
> +		return (struct file_range *)data;
> +	default:
> +		return NULL;

Wouldn't you want something like

case FSNOTIFY_EVENT_NONE:
	return NULL;
default:
	WARN_ON_ONCE(data_type);
	return NULL;

to guard against garbage being passed to fsnotify_data_file_range()?
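
For illustration, a minimal userspace sketch of the helper with that guard
folded in. The enum and struct are trimmed copies of the ones in this patch;
warn_once() and data_file_range() are hypothetical stand-ins (WARN_ON_ONCE()
only exists in the kernel), so treat this as a model of the control flow, not
as the code that would be merged.

```c
#include <stddef.h>
#include <stdio.h>

/* Trimmed copy of the enum from this patch; FSNOTIFY_EVENT_NONE is 0. */
enum fsnotify_data_type {
	FSNOTIFY_EVENT_NONE,
	FSNOTIFY_EVENT_FILE_RANGE,
	FSNOTIFY_EVENT_PATH,
};

/* Userspace stand-in for the kernel's struct file_range. */
struct file_range {
	const void *path;
	const long long *ppos;
	size_t count;
};

/* Hypothetical stand-in for WARN_ON_ONCE(): print on the first hit only. */
static int warn_count;

static void warn_once(int data_type)
{
	if (warn_count++ == 0)
		fprintf(stderr, "unexpected data_type %d\n", data_type);
}

static const struct file_range *data_file_range(const void *data,
						int data_type)
{
	switch (data_type) {
	case FSNOTIFY_EVENT_FILE_RANGE:
		return data;
	case FSNOTIFY_EVENT_NONE:
		return NULL;		/* no payload, nothing to warn about */
	default:
		warn_once(data_type);	/* payload of some other type */
		return NULL;
	}
}
```

Note that in this sketch every other type, FSNOTIFY_EVENT_PATH included, trips
the warning. Whether that is desirable depends on whether callers such as
fanotify_alloc_perm_event() are expected to pass non-range data.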

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response
  2024-08-08 19:27 ` [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response Josef Bacik
@ 2024-08-09 12:06   ` Christian Brauner
  2024-08-09 18:38     ` Josef Bacik
  0 siblings, 1 reply; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 12:06 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:11PM GMT, Josef Bacik wrote:
> From: Amir Goldstein <amir73il@gmail.com>
> 
> With FAN_DENY response, user trying to perform the filesystem operation
> gets an error with errno set to EPERM.
> 
> It is useful for hierarchical storage management (HSM) service to be able
> to deny access for reasons more diverse than EPERM, for example EAGAIN,
> if HSM could retry the operation later.
> 
> Allow fanotify groups with priority FAN_CLASS_PRE_CONTENT to respond
> to permission events with the response value FAN_DENY_ERRNO(errno),
> instead of FAN_DENY to return a custom error.
> 
> Limit custom error values to errors expected on read(2)/write(2) and
> open(2) of regular files. This list could be extended in the future.
> Userspace can test for legitimate values of FAN_DENY_ERRNO(errno) by
> writing a response to an fanotify group fd with a value of FAN_NOFD in
> the fd field of the response.
> 
> The change in fanotify_response is backward compatible, because errno is
> written in the high 8 bits of the 32bit response field and old kernels
> reject response values with high bits set.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/notify/fanotify/fanotify.c      | 18 ++++++++++-----
>  fs/notify/fanotify/fanotify.h      | 10 +++++++++
>  fs/notify/fanotify/fanotify_user.c | 36 +++++++++++++++++++++++++-----
>  include/linux/fanotify.h           |  5 ++++-
>  include/uapi/linux/fanotify.h      |  7 ++++++
>  5 files changed, 65 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 4e8dce39fa8f..1cbf41b34080 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -224,7 +224,8 @@ static int fanotify_get_response(struct fsnotify_group *group,
>  				 struct fanotify_perm_event *event,
>  				 struct fsnotify_iter_info *iter_info)
>  {
> -	int ret;
> +	int ret, errno;
> +	u32 decision;
>  
>  	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>  
> @@ -257,20 +258,27 @@ static int fanotify_get_response(struct fsnotify_group *group,
>  		goto out;
>  	}
>  
> +	decision = fanotify_get_response_decision(event->response);
>  	/* userspace responded, convert to something usable */
> -	switch (event->response & FANOTIFY_RESPONSE_ACCESS) {
> +	switch (decision & FANOTIFY_RESPONSE_ACCESS) {
>  	case FAN_ALLOW:
>  		ret = 0;
>  		break;
>  	case FAN_DENY:
> +		/* Check custom errno from pre-content events */
> +		errno = fanotify_get_response_errno(event->response);

Fwiw, you're fetching from event->response again but have already
stashed it in @decision earlier. Probably just an oversight.

> +		if (errno) {
> +			ret = -errno;
> +			break;
> +		}
> +		fallthrough;
>  	default:
>  		ret = -EPERM;
>  	}
>  
>  	/* Check if the response should be audited */
> -	if (event->response & FAN_AUDIT)
> -		audit_fanotify(event->response & ~FAN_AUDIT,
> -			       &event->audit_rule);
> +	if (decision & FAN_AUDIT)
> +		audit_fanotify(decision & ~FAN_AUDIT, &event->audit_rule);
>  
>  	pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__,
>  		 group, event, ret);
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 7f06355afa1f..d0722ef13138 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -528,3 +528,13 @@ static inline unsigned int fanotify_mark_user_flags(struct fsnotify_mark *mark)
>  
>  	return mflags;
>  }
> +
> +static inline u32 fanotify_get_response_decision(u32 res)
> +{
> +	return res & (FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS);
> +}
> +
> +static inline int fanotify_get_response_errno(int res)
> +{
> +	return res >> FAN_ERRNO_SHIFT;
> +}
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index ed56fe6f5ec7..0a37f1c761aa 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -337,11 +337,13 @@ static int process_access_response(struct fsnotify_group *group,
>  	struct fanotify_perm_event *event;
>  	int fd = response_struct->fd;
>  	u32 response = response_struct->response;
> +	u32 decision = fanotify_get_response_decision(response);
> +	int errno = fanotify_get_response_errno(response);
>  	int ret = info_len;
>  	struct fanotify_response_info_audit_rule friar;
>  
> -	pr_debug("%s: group=%p fd=%d response=%u buf=%p size=%zu\n", __func__,
> -		 group, fd, response, info, info_len);
> +	pr_debug("%s: group=%p fd=%d response=%x errno=%d buf=%p size=%zu\n",
> +		 __func__, group, fd, response, errno, info, info_len);
>  	/*
>  	 * make sure the response is valid, if invalid we do nothing and either
>  	 * userspace can send a valid response or we will clean it up after the
> @@ -350,18 +352,42 @@ static int process_access_response(struct fsnotify_group *group,
>  	if (response & ~FANOTIFY_RESPONSE_VALID_MASK)
>  		return -EINVAL;
>  
> -	switch (response & FANOTIFY_RESPONSE_ACCESS) {
> +	switch (decision & FANOTIFY_RESPONSE_ACCESS) {
>  	case FAN_ALLOW:
> +		if (errno)
> +			return -EINVAL;
> +		break;
>  	case FAN_DENY:
> +		/* Custom errno is supported only for pre-content groups */
> +		if (errno && group->priority != FSNOTIFY_PRIO_PRE_CONTENT)
> +			return -EINVAL;
> +
> +		/*
> +		 * Limit errno to values expected on open(2)/read(2)/write(2)
> +		 * of regular files.
> +		 */
> +		switch (errno) {
> +		case 0:
> +		case EIO:
> +		case EPERM:
> +		case EBUSY:
> +		case ETXTBSY:
> +		case EAGAIN:
> +		case ENOSPC:
> +		case EDQUOT:
> +			break;
> +		default:
> +			return -EINVAL;
> +		}
>  		break;
>  	default:
>  		return -EINVAL;
>  	}
>  
> -	if ((response & FAN_AUDIT) && !FAN_GROUP_FLAG(group, FAN_ENABLE_AUDIT))
> +	if ((decision & FAN_AUDIT) && !FAN_GROUP_FLAG(group, FAN_ENABLE_AUDIT))
>  		return -EINVAL;
>  
> -	if (response & FAN_INFO) {
> +	if (decision & FAN_INFO) {
>  		ret = process_access_response_info(info, info_len, &friar);
>  		if (ret < 0)
>  			return ret;
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index ae6cb2688d52..547514542669 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -132,7 +132,10 @@
>  /* These masks check for invalid bits in permission responses. */
>  #define FANOTIFY_RESPONSE_ACCESS (FAN_ALLOW | FAN_DENY)
>  #define FANOTIFY_RESPONSE_FLAGS (FAN_AUDIT | FAN_INFO)
> -#define FANOTIFY_RESPONSE_VALID_MASK (FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS)
> +#define FANOTIFY_RESPONSE_ERRNO	(FAN_ERRNO_MASK << FAN_ERRNO_SHIFT)
> +#define FANOTIFY_RESPONSE_VALID_MASK \
> +	(FANOTIFY_RESPONSE_ACCESS | FANOTIFY_RESPONSE_FLAGS | \
> +	 FANOTIFY_RESPONSE_ERRNO)
>  
>  /* Do not use these old uapi constants internally */
>  #undef FAN_ALL_CLASS_BITS
> diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> index cc28dce5f744..7b746c5fcbd8 100644
> --- a/include/uapi/linux/fanotify.h
> +++ b/include/uapi/linux/fanotify.h
> @@ -233,6 +233,13 @@ struct fanotify_response_info_audit_rule {
>  /* Legit userspace responses to a _PERM event */
>  #define FAN_ALLOW	0x01
>  #define FAN_DENY	0x02
> +/* errno other than EPERM can be specified in upper byte of deny response */
> +#define FAN_ERRNO_BITS	8
> +#define FAN_ERRNO_SHIFT (32 - FAN_ERRNO_BITS)
> +#define FAN_ERRNO_MASK	((1 << FAN_ERRNO_BITS) - 1)
> +#define FAN_DENY_ERRNO(err) \
> +	(FAN_DENY | ((((__u32)(err)) & FAN_ERRNO_MASK) << FAN_ERRNO_SHIFT))
> +
>  #define FAN_AUDIT	0x10	/* Bitmask to create audit record for result */
>  #define FAN_INFO	0x20	/* Bitmask to indicate additional information */
>  
> -- 
> 2.43.0
> 
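
As a reading aid for the response layout described in the commit message, here
is a small userspace sketch. The FAN_* values are copied from this patch and
the existing uapi header; OLD_VALID_MASK, response_decision(),
response_errno() and deny_errno_allowed() are names made up for the sketch.
One deliberate difference: response_errno() takes an unsigned value so that
the shift stays well defined for errno values above 127, whereas the helper in
the patch takes an int.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from this patch and the existing uapi header. */
#define FAN_ALLOW	0x01
#define FAN_DENY	0x02
#define FAN_AUDIT	0x10
#define FAN_INFO	0x20
#define FAN_ERRNO_BITS	8
#define FAN_ERRNO_SHIFT	(32 - FAN_ERRNO_BITS)
#define FAN_ERRNO_MASK	((1 << FAN_ERRNO_BITS) - 1)
#define FAN_DENY_ERRNO(err) \
	(FAN_DENY | ((((uint32_t)(err)) & FAN_ERRNO_MASK) << FAN_ERRNO_SHIFT))

/* The mask accepted before this patch: access bits plus flags. */
#define OLD_VALID_MASK	(FAN_ALLOW | FAN_DENY | FAN_AUDIT | FAN_INFO)

/* Low bits: allow/deny decision plus the audit/info flags. */
static uint32_t response_decision(uint32_t res)
{
	return res & OLD_VALID_MASK;
}

/* High byte: custom errno, 0 when plain FAN_DENY was written. */
static int response_errno(uint32_t res)
{
	return (int)(res >> FAN_ERRNO_SHIFT);
}

/* Mirrors the errno whitelist in process_access_response(). */
static int deny_errno_allowed(int err)
{
	switch (err) {
	case 0:
	case EIO:
	case EPERM:
	case EBUSY:
	case ETXTBSY:
	case EAGAIN:
	case ENOSPC:
	case EDQUOT:
		return 1;
	default:
		return 0;
	}
}
```

With these definitions, FAN_DENY_ERRNO(EAGAIN) decodes back to FAN_DENY plus
EAGAIN, and the same value has bits set outside OLD_VALID_MASK, which is why a
kernel without this patch would reject it with -EINVAL.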

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 10/16] fanotify: add a helper to check for pre content events
  2024-08-08 19:27 ` [PATCH v2 10/16] fanotify: add a helper to check for pre content events Josef Bacik
@ 2024-08-09 12:10   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 12:10 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:12PM GMT, Josef Bacik wrote:
> We want to emit events during page fault, and calling into fanotify
> could be expensive, so add a helper to allow us to skip calling into
> fanotify from page fault.  This will also be used to disable readahead
> for content watched files which will be handled in a subsequent patch.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches
  2024-08-08 19:27 ` [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches Josef Bacik
@ 2024-08-09 12:12   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 12:12 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:13PM GMT, Josef Bacik wrote:
> With page faults we can trigger readahead on the file, and then
> subsequent faults can find these pages and insert them into the file
> without emitting an fanotify event.  To avoid this case, disable
> readahead if we have pre-content watches on the file.  This way we are
> guaranteed to get an event for every range we attempt to access on a
> pre-content watched file.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---

Looks sensible,
Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches
  2024-08-08 19:27 ` [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches Josef Bacik
@ 2024-08-09 12:13   ` Christian Brauner
  0 siblings, 0 replies; 36+ messages in thread
From: Christian Brauner @ 2024-08-09 12:13 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 08, 2024 at 03:27:14PM GMT, Josef Bacik wrote:
> There's nothing stopping us from supporting this, we could simply pass
> the order into the helper and emit the proper length.  However currently
> there's no tests to validate this works properly, so disable it until
> there's a desire to support this along with the appropriate tests.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---

Reviewed-by: Christian Brauner <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault
  2024-08-08 19:27 ` [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault Josef Bacik
@ 2024-08-09 13:11   ` Amir Goldstein
  2024-08-09 14:21     ` Josef Bacik
  0 siblings, 1 reply; 36+ messages in thread
From: Amir Goldstein @ 2024-08-09 13:11 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kernel-team, linux-fsdevel, jack, brauner, linux-xfs, gfs2,
	linux-bcachefs

On Thu, Aug 8, 2024 at 9:28 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> bcachefs has its own locking around filemap_fault, so we have to make
> sure we do the fsnotify hook before the locking.  Add the check to emit
> the event before the locking and return VM_FAULT_RETRY to retrigger the
> fault once the event has been emitted.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/bcachefs/fs-io-pagecache.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/fs/bcachefs/fs-io-pagecache.c b/fs/bcachefs/fs-io-pagecache.c
> index a9cc5cad9cc9..359856df52d4 100644
> --- a/fs/bcachefs/fs-io-pagecache.c
> +++ b/fs/bcachefs/fs-io-pagecache.c
> @@ -562,6 +562,7 @@ void bch2_set_folio_dirty(struct bch_fs *c,
>  vm_fault_t bch2_page_fault(struct vm_fault *vmf)
>  {
>         struct file *file = vmf->vma->vm_file;
> +       struct file *fpin = NULL;
>         struct address_space *mapping = file->f_mapping;
>         struct address_space *fdm = faults_disabled_mapping();
>         struct bch_inode_info *inode = file_bch_inode(file);
> @@ -570,6 +571,18 @@ vm_fault_t bch2_page_fault(struct vm_fault *vmf)
>         if (fdm == mapping)
>                 return VM_FAULT_SIGBUS;
>
> +       ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
> +       if (unlikely(ret)) {
> +               if (fpin) {
> +                       fput(fpin);
> +                       ret |= VM_FAULT_RETRy;

Typo RETRy

> +               }
> +               return ret;
> +       } else if (fpin) {
> +               fput(fpin);
> +               return VM_FAULT_RETRY;
> +       }
> +

This chunk is almost duplicated at every call site in the filesystems.
Could it be enclosed in a helper?
It is bad enough that we have to spray those in filesystem code,
so could we at least keep the copy&paste errors to a bare minimum?

Thanks,
Amir.
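
One possible shape for such a helper, sketched in userspace so it can be
compiled on its own. Everything here is an assumption for illustration:
fsnotify_fault_finish() is a hypothetical name, struct file and fput() are
stand-ins for the kernel ones, and the VM_FAULT_* values are placeholders.
The only thing taken from the patch is the logic of the duplicated chunk.

```c
#include <stddef.h>

/* Userspace stand-ins; the numeric values are placeholders for this sketch. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_SIGBUS	0x0002u
#define VM_FAULT_RETRY	0x0400u

struct file {
	int refs;
};

static void fput(struct file *f)
{
	f->refs--;
}

/*
 * Hypothetical helper: fold the return value of
 * filemap_maybe_emit_fsnotify_event() and the pinned file into the value
 * the ->fault handler should return.  Non-zero means "return this now",
 * zero means "no event was emitted, carry on with the fault".
 */
static vm_fault_t fsnotify_fault_finish(vm_fault_t ret, struct file *fpin)
{
	if (fpin) {
		/* A file was pinned: drop the pin and ask for a re-fault. */
		fput(fpin);
		ret |= VM_FAULT_RETRY;
	}
	return ret;
}
```

A call site would then shrink to calling the emit helper, passing its result
through fsnotify_fault_finish(), and returning early when the result is
non-zero. Both branches of the original chunk collapse into the single
"if (fpin)" test because the error and success paths only differ in the value
already held in ret.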

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults
  2024-08-08 22:03   ` Dave Chinner
@ 2024-08-09 14:15     ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 14:15 UTC (permalink / raw)
  To: Dave Chinner
  Cc: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

On Fri, Aug 09, 2024 at 08:03:41AM +1000, Dave Chinner wrote:
> On Thu, Aug 08, 2024 at 03:27:18PM -0400, Josef Bacik wrote:
> > xfs has its own handling for write faults, so we need to add the
> > pre-content fsnotify hook for this case.  Reads go through filemap_fault
> > so they're handled properly there.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/xfs/xfs_file.c | 20 +++++++++++++++++---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 4cdc54dc9686..585a8c2eea0f 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1325,14 +1325,28 @@ __xfs_filemap_fault(
> >  	bool			write_fault)
> >  {
> >  	struct inode		*inode = file_inode(vmf->vma->vm_file);
> > +	struct file		*fpin = NULL;
> > +	vm_fault_t		ret;
> >  
> >  	trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
> >  
> > -	if (write_fault)
> > -		return xfs_write_fault(vmf, order);
> >  	if (IS_DAX(inode))
> >  		return xfs_dax_read_fault(vmf, order);
> > -	return filemap_fault(vmf);
> > +
> > +	if (!write_fault)
> > +		return filemap_fault(vmf);
> 
> Doesn't this break DAX read faults? i.e. they have to go through
> xfs_dax_read_fault(), not filemap_fault().

Oops, my bad. I had it right before, then decided to make it cleaner and forgot
what the original code was doing. I'll fix it up, thanks!

Josef
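
For reference, the dispatch order in the lines the patch removes can be
modelled as a small pure function. The enum and function names below are made
up for this sketch; only the order of the checks is taken from the diff
context above, and it is not meant to show what v3 will look like.

```c
#include <stdbool.h>

/* Which handler __xfs_filemap_fault() ends up calling. */
enum fault_path {
	PATH_WRITE_FAULT,	/* xfs_write_fault() */
	PATH_DAX_READ,		/* xfs_dax_read_fault() */
	PATH_FILEMAP,		/* filemap_fault() */
};

/* Pre-patch order: write faults first, then DAX reads, then page cache. */
static enum fault_path pick_fault_path(bool write_fault, bool is_dax)
{
	if (write_fault)
		return PATH_WRITE_FAULT;
	if (is_dax)
		return PATH_DAX_READ;
	return PATH_FILEMAP;
}
```

Any rework has to keep this mapping intact for all four combinations of
write_fault and IS_DAX(), which is a cheap property to check by hand.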

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/16] fanotify: add pre-content hooks
  2024-08-08 22:15 ` [PATCH v2 00/16] fanotify: add pre-content hooks Dave Chinner
@ 2024-08-09 14:18   ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 14:18 UTC (permalink / raw)
  To: Dave Chinner
  Cc: kernel-team, linux-fsdevel, jack, amir73il, brauner, linux-xfs,
	gfs2, linux-bcachefs

On Fri, Aug 09, 2024 at 08:15:55AM +1000, Dave Chinner wrote:
> On Thu, Aug 08, 2024 at 03:27:02PM -0400, Josef Bacik wrote:
> > v1: https://lore.kernel.org/linux-fsdevel/cover.1721931241.git.josef@toxicpanda.com/
> > 
> > v1->v2:
> > - reworked the page fault logic based on Jan's suggestion and turned it into a
> >   helper.
> > - Added 3 patches per-fs where we need to call the fsnotify helper from their
> >   ->fault handlers.
> > - Disabled readahead in the case that there's a pre-content watch in place.
> > - Disabled huge faults when there's a pre-content watch in place (entirely
> >   because it's untested, theoretically it should be straightforward to do).
> > - Updated the command numbers.
> > - Addressed the random spelling/grammar mistakes that Jan pointed out.
> > - Addressed the other random nits from Jan.
> > 
> > --- Original email ---
> > 
> > Hello,
> > 
> > These are the patches for the bare bones pre-content fanotify support.  The
> > majority of this work is Amir's, my contribution to this has solely been around
> > adding the page fault hooks, testing and validating everything.  I'm sending it
> > because Amir is traveling a bunch, and I touched it last so I'm going to take
> > all the hate and he can take all the credit.
> 
> Brave man. :)
> 
> > There is a PoC that I've been using to validate this work, you can find the git
> > repo here
> > 
> > https://github.com/josefbacik/remote-fetch
> > 
> > This consists of 3 different tools.
> > 
> > 1. populate.  This just creates all the stub files in the directory from the
> >    source directory.  Just run ./populate ~/linux ~/hsm-linux and it'll
> >    recursively create all of the stub files and directories.
> > 2. remote-fetch.  This is the actual PoC, you just point it at the source and
> >    destination directory and then you can do whatever.  ./remote-fetch ~/linux
> >    ~/hsm-linux.
> > 3. mmap-validate.  This was to validate the pagefault thing, this is likely what
> >    will be turned into the selftest with remote-fetch.  It creates a file and
> >    then you can validate the file matches the right pattern with both normal
> >    reads and mmap.  Normally I do something like
> > 
> >    ./mmap-validate create ~/src/foo
> >    ./populate ~/src ~/dst
> >    ./remote-fetch ~/src ~/dst
> >    ./mmap-validate validate ~/dst/foo
> 
> This smells like something that should be added to fstests.
> 
> FWIW, fstests used to have a whole "fake-hsm" infrastructure
> subsystem in it for testing DMAPI events used by HSMs. They were
> removed in this commit:
> 
> commit 6497ede7ad4e9fc8e5a5a121bd600df896b7d9c6
> Author: Darrick J. Wong <djwong@kernel.org>
> Date:   Thu Feb 11 13:33:38 2021 -0800
> 
>     fstests: remove DMAPI tests
> 
>     Upstream XFS has never supported DMAPI, so remove the tests for this
>     feature.
> 
>     Signed-off-by: Darrick J. Wong <djwong@kernel.org>
>     Acked-by: Christoph Hellwig <hch@lst.de>
>     Signed-off-by: Eryu Guan <guaneryu@gmail.com>
> 
> See ./dmapi/src/sample_hsm/ for the HSM test code that was removed
> in that patchset - it might provide some infrastructure that can be
> used to test the fanotify HSM event infrastructure without
> reinventing the entire wheel...

Yup as soon as this is merged into a tree my first stop is LTP, which is where
all the fanotify tests currently exist.  It won't cost me anything to add it to
fstests as well, so I'll follow up with that.

Generally I'd post the tests at the same time, but since it's dependent on what
we settle on for the implementation behavior I'm holding that stuff back.
Thanks,

Josef

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault
  2024-08-09 10:34   ` Amir Goldstein
@ 2024-08-09 14:19     ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 14:19 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: kernel-team, linux-fsdevel, jack, brauner, linux-xfs, gfs2,
	linux-bcachefs

On Fri, Aug 09, 2024 at 12:34:34PM +0200, Amir Goldstein wrote:
> On Thu, Aug 8, 2024 at 9:28 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > FS_PRE_ACCESS or FS_PRE_MODIFY will be generated on page fault depending
> > on the faulting method.
> >
> > This pre-content event is meant to be used by hierarchical storage
> > managers that want to fill in the file content on first read access.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  include/linux/mm.h |  2 +
> >  mm/filemap.c       | 97 ++++++++++++++++++++++++++++++++++++++++++----
> >  2 files changed, 92 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index ab3d78116043..c33f3b7f7261 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3503,6 +3503,8 @@ extern vm_fault_t filemap_fault(struct vm_fault *vmf);
> >  extern vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> >                 pgoff_t start_pgoff, pgoff_t end_pgoff);
> >  extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
> > +extern vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
> > +                                                   struct file **fpin);
> >
> >  extern unsigned long stack_guard_gap;
> >  /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 8b1684b62177..3d232166b051 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -46,6 +46,7 @@
> >  #include <linux/pipe_fs_i.h>
> >  #include <linux/splice.h>
> >  #include <linux/rcupdate_wait.h>
> > +#include <linux/fsnotify.h>
> >  #include <asm/pgalloc.h>
> >  #include <asm/tlbflush.h>
> >  #include "internal.h"
> > @@ -3112,13 +3113,13 @@ static int lock_folio_maybe_drop_mmap(struct vm_fault *vmf, struct folio *folio,
> >   * that.  If we didn't pin a file then we return NULL.  The file that is
> >   * returned needs to be fput()'ed when we're done with it.
> >   */
> > -static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
> > +static struct file *do_sync_mmap_readahead(struct vm_fault *vmf,
> > +                                          struct file *fpin)
> >  {
> >         struct file *file = vmf->vma->vm_file;
> >         struct file_ra_state *ra = &file->f_ra;
> >         struct address_space *mapping = file->f_mapping;
> >         DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
> > -       struct file *fpin = NULL;
> >         unsigned long vm_flags = vmf->vma->vm_flags;
> >         unsigned int mmap_miss;
> >
> > @@ -3190,12 +3191,12 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
> >   * was pinned if we have to drop the mmap_lock in order to do IO.
> >   */
> >  static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
> > -                                           struct folio *folio)
> > +                                           struct folio *folio,
> > +                                           struct file *fpin)
> >  {
> >         struct file *file = vmf->vma->vm_file;
> >         struct file_ra_state *ra = &file->f_ra;
> >         DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
> > -       struct file *fpin = NULL;
> >         unsigned int mmap_miss;
> >
> >         /* See comment in do_sync_mmap_readahead. */
> > @@ -3260,6 +3261,72 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
> >         return ret;
> >  }
> >
> > +/**
> > + * filemap_maybe_emit_fsnotify_event - maybe emit a pre-content event.
> > + * @vmf:       struct vm_fault containing details of the fault.
> > + * @fpin:      pointer to the struct file pointer that may be pinned.
> > + *
> > + * If we have pre-content watches on this file we will need to emit an event for
> > + * this range.  We will handle dropping the lock and emitting the event.
> > + *
> > + * If FAULT_FLAG_RETRY_NOWAIT is set then we'll return VM_FAULT_RETRY.
> > + *
> > + * If no event was emitted then *fpin will be NULL and we will return 0.
> > + *
> > + * If any error occurred we will return VM_FAULT_SIGBUS, *fpin could still be
> > + * set and will need to have fput() called on it.
> > + *
> > + * If we emitted the event then we will return 0 and *fpin will be set, this
> > + * must have fput() called on it, and the caller must call VM_FAULT_RETRY after
> > + * any other operations it does in order to re-fault the page and make sure the
> > + * appropriate locking is maintained.
> > + *
> > + * Return: the appropriate vm_fault_t return code, 0 on success.
> > + */
> > +vm_fault_t filemap_maybe_emit_fsnotify_event(struct vm_fault *vmf,
> > +                                            struct file **fpin)
> > +{
> > +       struct file *file = vmf->vma->vm_file;
> > +       loff_t pos = vmf->pgoff << PAGE_SHIFT;
> > +       int mask = (vmf->flags & FAULT_FLAG_WRITE) ? MAY_WRITE : MAY_READ;
> 
> You missed my comment about using MAY_ACCESS here
> and altering the fsnotify hook, so that the legacy FAN_ACCESS_PERM event
> won't be generated from page faults.

I did miss that, I'll fix it up in v3, thanks!

Josef

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault
  2024-08-09 13:11   ` Amir Goldstein
@ 2024-08-09 14:21     ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 14:21 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: kernel-team, linux-fsdevel, jack, brauner, linux-xfs, gfs2,
	linux-bcachefs

On Fri, Aug 09, 2024 at 03:11:34PM +0200, Amir Goldstein wrote:
> On Thu, Aug 8, 2024 at 9:28 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > bcachefs has its own locking around filemap_fault, so we have to make
> > sure we do the fsnotify hook before the locking.  Add the check to emit
> > the event before the locking and return VM_FAULT_RETRY to retrigger the
> > fault once the event has been emitted.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/bcachefs/fs-io-pagecache.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/fs/bcachefs/fs-io-pagecache.c b/fs/bcachefs/fs-io-pagecache.c
> > index a9cc5cad9cc9..359856df52d4 100644
> > --- a/fs/bcachefs/fs-io-pagecache.c
> > +++ b/fs/bcachefs/fs-io-pagecache.c
> > @@ -562,6 +562,7 @@ void bch2_set_folio_dirty(struct bch_fs *c,
> >  vm_fault_t bch2_page_fault(struct vm_fault *vmf)
> >  {
> >         struct file *file = vmf->vma->vm_file;
> > +       struct file *fpin = NULL;
> >         struct address_space *mapping = file->f_mapping;
> >         struct address_space *fdm = faults_disabled_mapping();
> >         struct bch_inode_info *inode = file_bch_inode(file);
> > @@ -570,6 +571,18 @@ vm_fault_t bch2_page_fault(struct vm_fault *vmf)
> >         if (fdm == mapping)
> >                 return VM_FAULT_SIGBUS;
> >
> > +       ret = filemap_maybe_emit_fsnotify_event(vmf, &fpin);
> > +       if (unlikely(ret)) {
> > +               if (fpin) {
> > +                       fput(fpin);
> > +                       ret |= VM_FAULT_RETRy;
> 
> Typo RETRy

Hmm I swear I had bcachefs turned on in my config, I'll fix this and also fix my
config.

> 
> > +               }
> > +               return ret;
> > +       } else if (fpin) {
> > +               fput(fpin);
> > +               return VM_FAULT_RETRY;
> > +       }
> > +
> 
> This chunk is almost duplicated at all call sites in filesystems.
> Could it maybe be enclosed in a helper?
> It is bad enough that we have to spray those in filesystem code,
> so can we at least keep the copy&paste errors to the bare minimum?

You should have seen what I had to begin with ;).  I agree, I'll rework this to
reduce how much we're carrying around.  Thanks,

Josef
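
For readers following along, here is a minimal userspace sketch of how the duplicated call-site chunk could be folded into one helper. It is not the rework that was later posted: every name in it (sketch_fault_finish, sketch_fput, struct sketch_file, and the SKETCH_VM_FAULT_* values) is a hypothetical stand-in, and the logic only mirrors the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for vm_fault_t bits; the values are illustrative only. */
typedef unsigned int sketch_vm_fault_t;
#define SKETCH_VM_FAULT_SIGBUS 0x0002u
#define SKETCH_VM_FAULT_RETRY  0x0400u

/* Stand-in for struct file, reduced to a bare reference count. */
struct sketch_file {
	int refcount;
};

/* Stand-in for fput(): drop one reference. */
static void sketch_fput(struct sketch_file *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Fold the per-filesystem boilerplate into one place.
 *
 * Returns true if the caller must return *ret right away, false if no
 * event was emitted and the fault handler should carry on as usual.
 */
static bool sketch_fault_finish(sketch_vm_fault_t *ret, struct sketch_file *fpin)
{
	if (fpin) {
		/* A pinned file means locks were dropped: release it and re-fault. */
		sketch_fput(fpin);
		*ret |= SKETCH_VM_FAULT_RETRY;
		return true;
	}
	/* No pinned file: bail out only if an error code was reported. */
	return *ret != 0;
}
```

With a helper of this shape, each filesystem's fault handler would shrink to one call to the emit function followed by `if (sketch_fault_finish(&ret, fpin)) return ret;`, which leaves much less room for per-filesystem typos.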


* Re: [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event
  2024-08-09 12:00   ` Christian Brauner
@ 2024-08-09 18:36     ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 18:36 UTC (permalink / raw)
  To: Christian Brauner
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Fri, Aug 09, 2024 at 02:00:29PM +0200, Christian Brauner wrote:
> On Thu, Aug 08, 2024 at 03:27:08PM GMT, Josef Bacik wrote:
> > From: Amir Goldstein <amir73il@gmail.com>
> > 
> > We would like to add file range information to pre-content events.
> > 
> > Pass a struct file_range with optional offset and length to event handler
> > along with pre-content permission event.
> > 
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/notify/fanotify/fanotify.c    | 10 ++++++++--
> >  fs/notify/fanotify/fanotify.h    |  2 ++
> >  include/linux/fsnotify.h         | 17 ++++++++++++++++-
> >  include/linux/fsnotify_backend.h | 32 ++++++++++++++++++++++++++++++++
> >  4 files changed, 58 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > index b163594843f5..4e8dce39fa8f 100644
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -549,9 +549,13 @@ static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
> >  	return &pevent->fae;
> >  }
> >  
> > -static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
> > +static struct fanotify_event *fanotify_alloc_perm_event(const void *data,
> > +							int data_type,
> >  							gfp_t gfp)
> >  {
> > +	const struct path *path = fsnotify_data_path(data, data_type);
> > +	const struct file_range *range =
> > +			    fsnotify_data_file_range(data, data_type);
> >  	struct fanotify_perm_event *pevent;
> >  
> >  	pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp);
> > @@ -565,6 +569,8 @@ static struct fanotify_event *fanotify_alloc_perm_event(const struct path *path,
> >  	pevent->hdr.len = 0;
> >  	pevent->state = FAN_EVENT_INIT;
> >  	pevent->path = *path;
> > +	pevent->ppos = range ? range->ppos : NULL;
> > +	pevent->count = range ? range->count : 0;
> >  	path_get(path);
> >  
> >  	return &pevent->fae;
> > @@ -802,7 +808,7 @@ static struct fanotify_event *fanotify_alloc_event(
> >  	old_memcg = set_active_memcg(group->memcg);
> >  
> >  	if (fanotify_is_perm_event(mask)) {
> > -		event = fanotify_alloc_perm_event(path, gfp);
> > +		event = fanotify_alloc_perm_event(data, data_type, gfp);
> >  	} else if (fanotify_is_error_event(mask)) {
> >  		event = fanotify_alloc_error_event(group, fsid, data,
> >  						   data_type, &hash);
> > diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> > index e5ab33cae6a7..93598b7d5952 100644
> > --- a/fs/notify/fanotify/fanotify.h
> > +++ b/fs/notify/fanotify/fanotify.h
> > @@ -425,6 +425,8 @@ FANOTIFY_PE(struct fanotify_event *event)
> >  struct fanotify_perm_event {
> >  	struct fanotify_event fae;
> >  	struct path path;
> > +	const loff_t *ppos;		/* optional file range info */
> > +	size_t count;
> >  	u32 response;			/* userspace answer to the event */
> >  	unsigned short state;		/* state of the event */
> >  	int fd;		/* fd we passed to userspace for this event */
> > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> > index a28daf136fea..4609d9b6b087 100644
> > --- a/include/linux/fsnotify.h
> > +++ b/include/linux/fsnotify.h
> > @@ -132,6 +132,21 @@ static inline int fsnotify_file(struct file *file, __u32 mask)
> >  }
> >  
> >  #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
> > +static inline int fsnotify_file_range(struct file *file, __u32 mask,
> > +				      const loff_t *ppos, size_t count)
> > +{
> > +	struct file_range range;
> > +
> > +	if (file->f_mode & FMODE_NONOTIFY)
> > +		return 0;
> > +
> > +	range.path = &file->f_path;
> > +	range.ppos = ppos;
> > +	range.count = count;
> > +	return fsnotify_parent(range.path->dentry, mask, &range,
> > +			       FSNOTIFY_EVENT_FILE_RANGE);
> > +}
> > +
> >  /*
> >   * fsnotify_file_area_perm - permission hook before access/modify of file range
> >   */
> > @@ -175,7 +190,7 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
> >  	else
> >  		return 0;
> >  
> > -	return fsnotify_file(file, fsnotify_mask);
> > +	return fsnotify_file_range(file, fsnotify_mask, ppos, count);
> >  }
> >  
> >  /*
> > diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> > index 200a5e3b1cd4..276320846bfd 100644
> > --- a/include/linux/fsnotify_backend.h
> > +++ b/include/linux/fsnotify_backend.h
> > @@ -298,6 +298,7 @@ static inline void fsnotify_group_assert_locked(struct fsnotify_group *group)
> >  /* When calling fsnotify tell it if the data is a path or inode */
> >  enum fsnotify_data_type {
> >  	FSNOTIFY_EVENT_NONE,
> > +	FSNOTIFY_EVENT_FILE_RANGE,
> >  	FSNOTIFY_EVENT_PATH,
> >  	FSNOTIFY_EVENT_INODE,
> >  	FSNOTIFY_EVENT_DENTRY,
> > @@ -310,6 +311,17 @@ struct fs_error_report {
> >  	struct super_block *sb;
> >  };
> >  
> > +struct file_range {
> > +	const struct path *path;
> > +	const loff_t *ppos;
> > +	size_t count;
> > +};
> > +
> > +static inline const struct path *file_range_path(const struct file_range *range)
> > +{
> > +	return range->path;
> > +}
> > +
> >  static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> >  {
> >  	switch (data_type) {
> > @@ -319,6 +331,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> >  		return d_inode(data);
> >  	case FSNOTIFY_EVENT_PATH:
> >  		return d_inode(((const struct path *)data)->dentry);
> > +	case FSNOTIFY_EVENT_FILE_RANGE:
> > +		return d_inode(file_range_path(data)->dentry);
> >  	case FSNOTIFY_EVENT_ERROR:
> >  		return ((struct fs_error_report *)data)->inode;
> >  	default:
> > @@ -334,6 +348,8 @@ static inline struct dentry *fsnotify_data_dentry(const void *data, int data_typ
> >  		return (struct dentry *)data;
> >  	case FSNOTIFY_EVENT_PATH:
> >  		return ((const struct path *)data)->dentry;
> > +	case FSNOTIFY_EVENT_FILE_RANGE:
> > +		return file_range_path(data)->dentry;
> >  	default:
> >  		return NULL;
> >  	}
> > @@ -345,6 +361,8 @@ static inline const struct path *fsnotify_data_path(const void *data,
> >  	switch (data_type) {
> >  	case FSNOTIFY_EVENT_PATH:
> >  		return data;
> > +	case FSNOTIFY_EVENT_FILE_RANGE:
> > +		return file_range_path(data);
> >  	default:
> >  		return NULL;
> >  	}
> > @@ -360,6 +378,8 @@ static inline struct super_block *fsnotify_data_sb(const void *data,
> >  		return ((struct dentry *)data)->d_sb;
> >  	case FSNOTIFY_EVENT_PATH:
> >  		return ((const struct path *)data)->dentry->d_sb;
> > +	case FSNOTIFY_EVENT_FILE_RANGE:
> > +		return file_range_path(data)->dentry->d_sb;
> >  	case FSNOTIFY_EVENT_ERROR:
> >  		return ((struct fs_error_report *) data)->sb;
> >  	default:
> > @@ -379,6 +399,18 @@ static inline struct fs_error_report *fsnotify_data_error_report(
> >  	}
> >  }
> >  
> > +static inline const struct file_range *fsnotify_data_file_range(
> > +							const void *data,
> > +							int data_type)
> > +{
> > +	switch (data_type) {
> > +	case FSNOTIFY_EVENT_FILE_RANGE:
> > +		return (struct file_range *)data;
> > +	default:
> > +		return NULL;
> 
> Wouldn't you want something like
> 
> case FSNOTIFY_EVENT_NONE
> 	return NULL;
> default:
> 	WARN_ON_ONCE(data_type);
> 	return NULL;
> 
> to guard against garbage being passed to fsnotify_data_file_range()?

We don't do this in any of the other helpers, and this is used generically in
fanotify_alloc_perm_event(), which handles having no range properly.  Thanks,

Josef
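
The point about a missing range being handled can be illustrated with a small userspace model of the accessor pattern. This is only a sketch: the sketch_* types and functions are hypothetical and merely mirror the shape of the quoted hunks, with long long standing in for loff_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the data-type tags used when passing event data. */
enum sketch_data_type {
	SKETCH_EVENT_NONE,
	SKETCH_EVENT_FILE_RANGE,
	SKETCH_EVENT_PATH,
};

struct sketch_path {
	const char *name;
};

struct sketch_file_range {
	const struct sketch_path *path;
	const long long *ppos;	/* optional file offset */
	size_t count;
};

struct sketch_perm_event {
	const long long *ppos;
	size_t count;
};

/* Return the range only when the tag says the data really is one. */
static const struct sketch_file_range *
sketch_data_file_range(const void *data, int data_type)
{
	switch (data_type) {
	case SKETCH_EVENT_FILE_RANGE:
		return data;
	default:
		return NULL;
	}
}

/* The consumer copes with "no range" by falling back to NULL and 0. */
static void sketch_fill_perm_event(struct sketch_perm_event *ev,
				   const void *data, int data_type)
{
	const struct sketch_file_range *range =
		sketch_data_file_range(data, data_type);

	assert(ev != NULL);
	ev->ppos = range ? range->ppos : NULL;
	ev->count = range ? range->count : 0;
}
```

Because the caller treats a NULL range as "no range information", a path-only event passes through the same allocation path without any special casing.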


* Re: [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response
  2024-08-09 12:06   ` Christian Brauner
@ 2024-08-09 18:38     ` Josef Bacik
  0 siblings, 0 replies; 36+ messages in thread
From: Josef Bacik @ 2024-08-09 18:38 UTC (permalink / raw)
  To: Christian Brauner
  Cc: kernel-team, linux-fsdevel, jack, amir73il, linux-xfs, gfs2,
	linux-bcachefs

On Fri, Aug 09, 2024 at 02:06:56PM +0200, Christian Brauner wrote:
> On Thu, Aug 08, 2024 at 03:27:11PM GMT, Josef Bacik wrote:
> > From: Amir Goldstein <amir73il@gmail.com>
> > 
> > With FAN_DENY response, user trying to perform the filesystem operation
> > gets an error with errno set to EPERM.
> > 
> > It is useful for hierarchical storage management (HSM) service to be able
> > to deny access for reasons more diverse than EPERM, for example EAGAIN,
> > if HSM could retry the operation later.
> > 
> > > Allow fanotify groups with priority FAN_CLASS_PRE_CONTENT to respond
> > to permission events with the response value FAN_DENY_ERRNO(errno),
> > instead of FAN_DENY to return a custom error.
> > 
> > Limit custom error values to errors expected on read(2)/write(2) and
> > open(2) of regular files. This list could be extended in the future.
> > Userspace can test for legitimate values of FAN_DENY_ERRNO(errno) by
> > writing a response to an fanotify group fd with a value of FAN_NOFD in
> > the fd field of the response.
> > 
> > The change in fanotify_response is backward compatible, because errno is
> > written in the high 8 bits of the 32bit response field and old kernels
> > > reject response values with high bits set.
> > 
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/notify/fanotify/fanotify.c      | 18 ++++++++++-----
> >  fs/notify/fanotify/fanotify.h      | 10 +++++++++
> >  fs/notify/fanotify/fanotify_user.c | 36 +++++++++++++++++++++++++-----
> >  include/linux/fanotify.h           |  5 ++++-
> >  include/uapi/linux/fanotify.h      |  7 ++++++
> >  5 files changed, 65 insertions(+), 11 deletions(-)
> > 
> > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > index 4e8dce39fa8f..1cbf41b34080 100644
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -224,7 +224,8 @@ static int fanotify_get_response(struct fsnotify_group *group,
> >  				 struct fanotify_perm_event *event,
> >  				 struct fsnotify_iter_info *iter_info)
> >  {
> > -	int ret;
> > +	int ret, errno;
> > +	u32 decision;
> >  
> >  	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
> >  
> > @@ -257,20 +258,27 @@ static int fanotify_get_response(struct fsnotify_group *group,
> >  		goto out;
> >  	}
> >  
> > +	decision = fanotify_get_response_decision(event->response);
> >  	/* userspace responded, convert to something usable */
> > -	switch (event->response & FANOTIFY_RESPONSE_ACCESS) {
> > +	switch (decision & FANOTIFY_RESPONSE_ACCESS) {
> >  	case FAN_ALLOW:
> >  		ret = 0;
> >  		break;
> >  	case FAN_DENY:
> > +		/* Check custom errno from pre-content events */
> > +		errno = fanotify_get_response_errno(event->response);
> 
> Fwiw, you're fetching from event->response again but have already
> stashed it in @decision earlier. Probably just an oversight.
> 

Decision is the part that has the errno masked off, event->response is the full
mask which will have the errno set in the upper bits, so we have to do the
separate call with event->response to get the errno.  Thanks,

Josef
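
To make the split between the decision and the errno concrete, here is a small userspace sketch of a 32-bit response word that carries the errno in its high 8 bits, as the commit message describes. The SKETCH_* macros and sketch_* functions are hypothetical and are not the definitions from the patch; the 0x01/0x02 decision values are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative decision values kept in the low bits of the response. */
#define SKETCH_FAN_ALLOW 0x01u
#define SKETCH_FAN_DENY  0x02u

/* Assumed layout: the errno occupies the high 8 bits of the 32-bit word. */
#define SKETCH_ERRNO_BITS  8
#define SKETCH_ERRNO_SHIFT (32 - SKETCH_ERRNO_BITS)
#define SKETCH_ERRNO_MASK  ((1u << SKETCH_ERRNO_BITS) - 1)

/* Build a "deny with custom errno" response word. */
static uint32_t sketch_deny_errno(int err)
{
	assert(err > 0 && (uint32_t)err <= SKETCH_ERRNO_MASK);
	return SKETCH_FAN_DENY | ((uint32_t)err << SKETCH_ERRNO_SHIFT);
}

/* The decision is the response with the errno bits masked off. */
static uint32_t sketch_response_decision(uint32_t response)
{
	return response & ~(SKETCH_ERRNO_MASK << SKETCH_ERRNO_SHIFT);
}

/* The errno has to be read from the full response, not from the decision. */
static int sketch_response_errno(uint32_t response)
{
	return (int)((response >> SKETCH_ERRNO_SHIFT) & SKETCH_ERRNO_MASK);
}
```

Passing the already-masked decision to sketch_response_errno() would always yield 0, which is why the errno lookup goes back to the full response value.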


end of thread, other threads:[~2024-08-09 18:38 UTC | newest]

Thread overview: 36+ messages
2024-08-08 19:27 [PATCH v2 00/16] fanotify: add pre-content hooks Josef Bacik
2024-08-08 19:27 ` [PATCH v2 01/16] fanotify: don't skip extra event info if no info_mode is set Josef Bacik
2024-08-08 19:27 ` [PATCH v2 02/16] fsnotify: introduce pre-content permission event Josef Bacik
2024-08-08 19:27 ` [PATCH v2 03/16] fsnotify: generate pre-content permission event on open Josef Bacik
2024-08-09 11:51   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 04/16] fanotify: introduce FAN_PRE_ACCESS permission event Josef Bacik
2024-08-09 11:57   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 05/16] fanotify: introduce FAN_PRE_MODIFY " Josef Bacik
2024-08-09 11:57   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 06/16] fanotify: pass optional file access range in pre-content event Josef Bacik
2024-08-09 12:00   ` Christian Brauner
2024-08-09 18:36     ` Josef Bacik
2024-08-08 19:27 ` [PATCH v2 07/16] fanotify: rename a misnamed constant Josef Bacik
2024-08-09 11:41   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 08/16] fanotify: report file range info with pre-content events Josef Bacik
2024-08-08 19:27 ` [PATCH v2 09/16] fanotify: allow to set errno in FAN_DENY permission response Josef Bacik
2024-08-09 12:06   ` Christian Brauner
2024-08-09 18:38     ` Josef Bacik
2024-08-08 19:27 ` [PATCH v2 10/16] fanotify: add a helper to check for pre content events Josef Bacik
2024-08-09 12:10   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 11/16] fanotify: disable readahead if we have pre-content watches Josef Bacik
2024-08-09 12:12   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 12/16] mm: don't allow huge faults for files with pre content watches Josef Bacik
2024-08-09 12:13   ` Christian Brauner
2024-08-08 19:27 ` [PATCH v2 13/16] fsnotify: generate pre-content permission event on page fault Josef Bacik
2024-08-09 10:34   ` Amir Goldstein
2024-08-09 14:19     ` Josef Bacik
2024-08-08 19:27 ` [PATCH v2 14/16] bcachefs: add pre-content fsnotify hook to fault Josef Bacik
2024-08-09 13:11   ` Amir Goldstein
2024-08-09 14:21     ` Josef Bacik
2024-08-08 19:27 ` [PATCH v2 15/16] gfs2: " Josef Bacik
2024-08-08 19:27 ` [PATCH v2 16/16] xfs: add pre-content fsnotify hook for write faults Josef Bacik
2024-08-08 22:03   ` Dave Chinner
2024-08-09 14:15     ` Josef Bacik
2024-08-08 22:15 ` [PATCH v2 00/16] fanotify: add pre-content hooks Dave Chinner
2024-08-09 14:18   ` Josef Bacik
