* [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast
@ 2019-09-06 13:56 Ritesh Harjani
2019-09-16 3:22 ` Ritesh Harjani
2019-09-16 21:54 ` Jeff Layton
0 siblings, 2 replies; 3+ messages in thread
From: Ritesh Harjani @ 2019-09-06 13:56 UTC (permalink / raw
To: jlayton, viro; +Cc: hsiangkao, linux-fsdevel, aneesh.kumar, wugyuan, riteshh
d_is_negative can race with d_instantiate_new()
-> __d_set_inode_and_type().
For e.g. in use cases where Thread-1 is creating
symlink (doing d_instantiate_new()) & Thread-2 is doing
cat of that symlink while doing lookup_fast (via REF-walk-
one such case is, when ->permission returns -ECHILD).
During this race if __d_set_and_inode_type() does out-of-order
execution and set the dentry->d_flags before setting
dentry->inode, then it can result into following kernel panic.
This change fixes the issue by directly checking for inode.
E.g. kernel panic, since inode was NULL.
trailing_symlink() -> may_follow_link() -> inode->i_uid.
Issue signature:-
[NIP : trailing_symlink+80]
[LR : trailing_symlink+1092]
#4 [c00000198069bb70] trailing_symlink at c0000000004bae60 (unreliable)
#5 [c00000198069bc00] path_openat at c0000000004bdd14
#6 [c00000198069bc90] do_filp_open at c0000000004c0274
#7 [c00000198069bdb0] do_sys_open at c00000000049b248
#8 [c00000198069be30] system_call at c00000000000b388
Sequence of events:-
Thread-2(Comm: ln) Thread-1(Comm: cat)
dentry = __d_lookup() //nonRCU
__d_set_and_inode_type() (Out-of-order execution)
flags = READ_ONCE(dentry->d_flags);
flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
flags |= type_flags;
WRITE_ONCE(dentry->d_flags, flags);
if (unlikely(d_is_negative()) // fails
{}
// since d_flags is already updated in
// Thread-2 in parallel but inode
// not yet set.
// d_is_negative returns false
*inode = d_backing_inode(path->dentry);
// means inode is still NULL
dentry->d_inode = inode;
trailing_symlink()
may_follow_link()
inode = nd->link_inode;
// nd->link_inode = NULL
//Then it crashes while
//doing inode->i_uid
Reported-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
Tested-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
fs/namei.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/namei.c b/fs/namei.c
index 209c51a5226c..b5867fe988e0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1623,7 +1623,21 @@ static int lookup_fast(struct nameidata *nd,
dput(dentry);
return status;
}
- if (unlikely(d_is_negative(dentry))) {
+
+ /*
+ * Caution: d_is_negative() can race with
+ * __d_set_inode_and_type().
+ * For e.g. in use cases where Thread-1 is creating
+ * symlink (doing d_instantiate_new()) & Thread-2 is doing
+ * cat of that symlink and falling here (via Ref-walk) while
+ * doing lookup_fast (one such case is when ->permission
+ * returns -ECHILD).
+ * Now if __d_set_inode_and_type() does out-of-order execution
+ * i.e. it first sets the dentry->d_flags & then dentry->inode
+ * then it can result into inode being NULL, causing panic later.
+ * Hence directly check if inode is NULL here.
+ */
+ if (unlikely(d_really_is_negative(dentry))) {
dput(dentry);
return -ENOENT;
}
--
2.21.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast
2019-09-06 13:56 [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast Ritesh Harjani
@ 2019-09-16 3:22 ` Ritesh Harjani
2019-09-16 21:54 ` Jeff Layton
1 sibling, 0 replies; 3+ messages in thread
From: Ritesh Harjani @ 2019-09-16 3:22 UTC (permalink / raw
To: viro; +Cc: jlayton, hsiangkao, linux-fsdevel, aneesh.kumar, wugyuan
Hello Al Viro,
Any comments on this patch?
This is the same approach as suggested by you [1]
[1]: https://patchwork.kernel.org/patch/10909881/
-ritesh
On 9/6/19 7:26 PM, Ritesh Harjani wrote:
> d_is_negative can race with d_instantiate_new()
> -> __d_set_inode_and_type().
> For e.g. in use cases where Thread-1 is creating
> symlink (doing d_instantiate_new()) & Thread-2 is doing
> cat of that symlink while doing lookup_fast (via REF-walk-
> one such case is, when ->permission returns -ECHILD).
>
> During this race if __d_set_and_inode_type() does out-of-order
> execution and set the dentry->d_flags before setting
> dentry->inode, then it can result into following kernel panic.
>
> This change fixes the issue by directly checking for inode.
>
> E.g. kernel panic, since inode was NULL.
> trailing_symlink() -> may_follow_link() -> inode->i_uid.
> Issue signature:-
> [NIP : trailing_symlink+80]
> [LR : trailing_symlink+1092]
> #4 [c00000198069bb70] trailing_symlink at c0000000004bae60 (unreliable)
> #5 [c00000198069bc00] path_openat at c0000000004bdd14
> #6 [c00000198069bc90] do_filp_open at c0000000004c0274
> #7 [c00000198069bdb0] do_sys_open at c00000000049b248
> #8 [c00000198069be30] system_call at c00000000000b388
>
> Sequence of events:-
> Thread-2(Comm: ln) Thread-1(Comm: cat)
>
> dentry = __d_lookup() //nonRCU
>
> __d_set_and_inode_type() (Out-of-order execution)
> flags = READ_ONCE(dentry->d_flags);
> flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
> flags |= type_flags;
> WRITE_ONCE(dentry->d_flags, flags);
>
> if (unlikely(d_is_negative()) // fails
> {}
> // since d_flags is already updated in
> // Thread-2 in parallel but inode
> // not yet set.
> // d_is_negative returns false
>
> *inode = d_backing_inode(path->dentry);
> // means inode is still NULL
>
> dentry->d_inode = inode;
>
> trailing_symlink()
> may_follow_link()
> inode = nd->link_inode;
> // nd->link_inode = NULL
> //Then it crashes while
> //doing inode->i_uid
>
> Reported-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
> Tested-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
> fs/namei.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 209c51a5226c..b5867fe988e0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1623,7 +1623,21 @@ static int lookup_fast(struct nameidata *nd,
> dput(dentry);
> return status;
> }
> - if (unlikely(d_is_negative(dentry))) {
> +
> + /*
> + * Caution: d_is_negative() can race with
> + * __d_set_inode_and_type().
> + * For e.g. in use cases where Thread-1 is creating
> + * symlink (doing d_instantiate_new()) & Thread-2 is doing
> + * cat of that symlink and falling here (via Ref-walk) while
> + * doing lookup_fast (one such case is when ->permission
> + * returns -ECHILD).
> + * Now if __d_set_inode_and_type() does out-of-order execution
> + * i.e. it first sets the dentry->d_flags & then dentry->inode
> + * then it can result into inode being NULL, causing panic later.
> + * Hence directly check if inode is NULL here.
> + */
> + if (unlikely(d_really_is_negative(dentry))) {
> dput(dentry);
> return -ENOENT;
> }
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast
2019-09-06 13:56 [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast Ritesh Harjani
2019-09-16 3:22 ` Ritesh Harjani
@ 2019-09-16 21:54 ` Jeff Layton
1 sibling, 0 replies; 3+ messages in thread
From: Jeff Layton @ 2019-09-16 21:54 UTC (permalink / raw
To: Ritesh Harjani, viro; +Cc: hsiangkao, linux-fsdevel, aneesh.kumar, wugyuan
On Fri, 2019-09-06 at 19:26 +0530, Ritesh Harjani wrote:
> d_is_negative can race with d_instantiate_new()
> -> __d_set_inode_and_type().
> For e.g. in use cases where Thread-1 is creating
> symlink (doing d_instantiate_new()) & Thread-2 is doing
> cat of that symlink while doing lookup_fast (via REF-walk-
> one such case is, when ->permission returns -ECHILD).
>
> During this race if __d_set_and_inode_type() does out-of-order
> execution and set the dentry->d_flags before setting
> dentry->inode, then it can result into following kernel panic.
>
> This change fixes the issue by directly checking for inode.
>
> E.g. kernel panic, since inode was NULL.
> trailing_symlink() -> may_follow_link() -> inode->i_uid.
> Issue signature:-
> [NIP : trailing_symlink+80]
> [LR : trailing_symlink+1092]
> #4 [c00000198069bb70] trailing_symlink at c0000000004bae60 (unreliable)
> #5 [c00000198069bc00] path_openat at c0000000004bdd14
> #6 [c00000198069bc90] do_filp_open at c0000000004c0274
> #7 [c00000198069bdb0] do_sys_open at c00000000049b248
> #8 [c00000198069be30] system_call at c00000000000b388
>
> Sequence of events:-
> Thread-2(Comm: ln) Thread-1(Comm: cat)
>
> dentry = __d_lookup() //nonRCU
>
> __d_set_and_inode_type() (Out-of-order execution)
> flags = READ_ONCE(dentry->d_flags);
> flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
> flags |= type_flags;
> WRITE_ONCE(dentry->d_flags, flags);
>
> if (unlikely(d_is_negative()) // fails
> {}
> // since d_flags is already updated in
> // Thread-2 in parallel but inode
> // not yet set.
> // d_is_negative returns false
>
> *inode = d_backing_inode(path->dentry);
> // means inode is still NULL
>
> dentry->d_inode = inode;
>
> trailing_symlink()
> may_follow_link()
> inode = nd->link_inode;
> // nd->link_inode = NULL
> //Then it crashes while
> //doing inode->i_uid
>
> Reported-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
> Tested-by: Guang Yuan Wu <wugyuan@cn.ibm.com>
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
> fs/namei.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 209c51a5226c..b5867fe988e0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1623,7 +1623,21 @@ static int lookup_fast(struct nameidata *nd,
> dput(dentry);
> return status;
> }
> - if (unlikely(d_is_negative(dentry))) {
> +
> + /*
> + * Caution: d_is_negative() can race with
> + * __d_set_inode_and_type().
> + * For e.g. in use cases where Thread-1 is creating
> + * symlink (doing d_instantiate_new()) & Thread-2 is doing
> + * cat of that symlink and falling here (via Ref-walk) while
> + * doing lookup_fast (one such case is when ->permission
> + * returns -ECHILD).
> + * Now if __d_set_inode_and_type() does out-of-order execution
> + * i.e. it first sets the dentry->d_flags & then dentry->inode
> + * then it can result into inode being NULL, causing panic later.
> + * Hence directly check if inode is NULL here.
> + */
> + if (unlikely(d_really_is_negative(dentry))) {
> dput(dentry);
> return -ENOENT;
> }
Looks reasonable to me. The only alternative I see is to put the
barriers back in there (which still might not be a bad idea), but this
should at least address this race.
Acked-by: Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-09-16 21:54 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-09-06 13:56 [PATCH 1/1] vfs: Really check for inode ptr in lookup_fast Ritesh Harjani
2019-09-16 3:22 ` Ritesh Harjani
2019-09-16 21:54 ` Jeff Layton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.