LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup
@ 2024-04-24 22:54 Luis Chamberlain
  2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw)
  To: akpm, ziy, linux-mm
  Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry,
	p.raghav, da.gomez, mcgrof

I've been stress testing large folio splits using the new debugfs interface
with a new fstests test [0], to make the debugfs interface more useful it helps
to do bounds checks on input parameters so this fixes two issues found while
doing automatic stress testing of the debugfs interface with found files.

Provided there are no issues I think this should go in for v6.9-rc6.

Also, *should* we strive to support splits for large folios to min order
with say MADV_NOHUGEPAGE ?

[0] https://lkml.kernel.org/r/20240424224649.1494092-1-mcgrof@kernel.org

Luis Chamberlain (2):
  mm/huge_memory: skip invalid debugfs file entry for folio split
  mm/huge_memory: cap max length on debugfs file entry folio split

 mm/huge_memory.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
@ 2024-04-24 22:54 ` Luis Chamberlain
  2024-04-25  1:03   ` Zi Yan
  2024-04-25 21:01   ` Andrew Morton
  2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain
  2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
  2 siblings, 2 replies; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw)
  To: akpm, ziy, linux-mm
  Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry,
	p.raghav, da.gomez, mcgrof

If the file entry is too long we may easily end up going out of bounds
and crash after strsep() on sscanf(). To avoid this ensure we bound the
string to an expected length before we use sscanf() on it.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/huge_memory.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9e9879d2f501..8386d24a163e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
 		char file_path[MAX_INPUT_BUF_SZ];
 		pgoff_t off_start = 0, off_end = 0;
 		size_t input_len = strlen(input_buf);
+		size_t max_left_over;
 
 		tok = strsep(&buf, ",");
 		if (tok) {
@@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
 			goto out;
 		}
 
+		max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path);
+		if (!buf ||
+		    strnlen(buf, max_left_over) < 7 ||
+		    strnlen(buf, max_left_over) > max_left_over) {
+			ret = -EINVAL;
+			goto out;
+		}
+
 		ret = sscanf(buf, "0x%lx,0x%lx,%d", &off_start, &off_end, &new_order);
 		if (ret != 2 && ret != 3) {
 			ret = -EINVAL;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry folio split
  2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
  2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
@ 2024-04-24 22:54 ` Luis Chamberlain
  2024-04-25  1:05   ` Zi Yan
  2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
  2 siblings, 1 reply; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-24 22:54 UTC (permalink / raw)
  To: akpm, ziy, linux-mm
  Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry,
	p.raghav, da.gomez, mcgrof

Don't allow to query beyond a mapped file's length. Since this is just
a debugfs interface allow userspace to be lazy and use a large value so
we can just use the entire file.

Without this we can end up wasting cycles looking for folios which
just don't exist for no good reason.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/huge_memory.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8386d24a163e..86a8c7b3b8dc 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3535,7 +3535,7 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
 	struct file *candidate;
 	struct address_space *mapping;
 	int ret = -EINVAL;
-	pgoff_t index;
+	pgoff_t index, fsize;
 	int nr_pages = 1;
 	unsigned long total = 0, split = 0;
 
@@ -3547,11 +3547,14 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
 	if (IS_ERR(candidate))
 		goto out;
 
+	mapping = candidate->f_mapping;
+	fsize = i_size_read(mapping->host);
+	if (off_end > fsize)
+		off_end = fsize;
+
 	pr_debug("split file-backed THPs in file: %s, page offset: [0x%lx - 0x%lx]\n",
 		 file_path, off_start, off_end);
 
-	mapping = candidate->f_mapping;
-
 	for (index = off_start; index < off_end; index += nr_pages) {
 		struct folio *folio = filemap_get_folio(mapping, index);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup
  2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
  2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
  2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain
@ 2024-04-24 22:55 ` Luis Chamberlain
  2 siblings, 0 replies; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-24 22:55 UTC (permalink / raw)
  To: akpm, ziy, linux-mm
  Cc: fstests, linux-xfs, linux-kernel, willy, hare, john.g.garry,
	p.raghav, da.gomez

The subject hints at a cleanup but I decided to leave that for another
separate patch which I'll send next.

  Luis

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
@ 2024-04-25  1:03   ` Zi Yan
  2024-04-25 22:40     ` Luis Chamberlain
  2024-04-25 21:01   ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: Zi Yan @ 2024-04-25  1:03 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

[-- Attachment #1: Type: text/plain, Size: 1610 bytes --]

On 24 Apr 2024, at 18:54, Luis Chamberlain wrote:

> If the file entry is too long we may easily end up going out of bounds
> and crash after strsep() on sscanf(). To avoid this ensure we bound the
> string to an expected length before we use sscanf() on it.
>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  mm/huge_memory.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9e9879d2f501..8386d24a163e 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
>  		char file_path[MAX_INPUT_BUF_SZ];
>  		pgoff_t off_start = 0, off_end = 0;
>  		size_t input_len = strlen(input_buf);
> +		size_t max_left_over;
>
>  		tok = strsep(&buf, ",");
>  		if (tok) {
> @@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
>  			goto out;
>  		}
>
> +		max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path);
> +		if (!buf ||
> +		    strnlen(buf, max_left_over) < 7 ||

What is this magic number 7? strlen("0xN,0xN") as the minimal input string size?
Maybe use sizeof("0xN,0xN") - 1 instead?

> +		    strnlen(buf, max_left_over) > max_left_over) {
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
>  		ret = sscanf(buf, "0x%lx,0x%lx,%d", &off_start, &off_end, &new_order);
>  		if (ret != 2 && ret != 3) {
>  			ret = -EINVAL;
> -- 
> 2.43.0

Everything else looks good to me. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry folio split
  2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain
@ 2024-04-25  1:05   ` Zi Yan
  0 siblings, 0 replies; 10+ messages in thread
From: Zi Yan @ 2024-04-25  1:05 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

[-- Attachment #1: Type: text/plain, Size: 1617 bytes --]

On 24 Apr 2024, at 18:54, Luis Chamberlain wrote:

> Don't allow to query beyond a mapped file's length. Since this is just
> a debugfs interface allow userspace to be lazy and use a large value so
> we can just use the entire file.
>
> Without this we can end up wasting cycles looking for folios which
> just don't exist for no good reason.
>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  mm/huge_memory.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8386d24a163e..86a8c7b3b8dc 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3535,7 +3535,7 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
>  	struct file *candidate;
>  	struct address_space *mapping;
>  	int ret = -EINVAL;
> -	pgoff_t index;
> +	pgoff_t index, fsize;
>  	int nr_pages = 1;
>  	unsigned long total = 0, split = 0;
>
> @@ -3547,11 +3547,14 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
>  	if (IS_ERR(candidate))
>  		goto out;
>
> +	mapping = candidate->f_mapping;
> +	fsize = i_size_read(mapping->host);
> +	if (off_end > fsize)
> +		off_end = fsize;
> +
>  	pr_debug("split file-backed THPs in file: %s, page offset: [0x%lx - 0x%lx]\n",
>  		 file_path, off_start, off_end);
>
> -	mapping = candidate->f_mapping;
> -
>  	for (index = off_start; index < off_end; index += nr_pages) {
>  		struct folio *folio = filemap_get_folio(mapping, index);

LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
  2024-04-25  1:03   ` Zi Yan
@ 2024-04-25 21:01   ` Andrew Morton
  2024-04-29  4:04     ` Luis Chamberlain
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2024-04-25 21:01 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote:

> If the file entry is too long we may easily end up going out of bounds
> and crash after strsep() on sscanf(). 
> 

Can you explain why?  I'm not seeing it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-25  1:03   ` Zi Yan
@ 2024-04-25 22:40     ` Luis Chamberlain
  0 siblings, 0 replies; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-25 22:40 UTC (permalink / raw)
  To: Zi Yan
  Cc: akpm, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

On Wed, Apr 24, 2024 at 09:03:51PM -0400, Zi Yan wrote:
> On 24 Apr 2024, at 18:54, Luis Chamberlain wrote:
> 
> > If the file entry is too long we may easily end up going out of bounds
> > and crash after strsep() on sscanf(). To avoid this ensure we bound the
> > string to an expected length before we use sscanf() on it.
> >
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  mm/huge_memory.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 9e9879d2f501..8386d24a163e 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3623,6 +3623,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
> >  		char file_path[MAX_INPUT_BUF_SZ];
> >  		pgoff_t off_start = 0, off_end = 0;
> >  		size_t input_len = strlen(input_buf);
> > +		size_t max_left_over;
> >
> >  		tok = strsep(&buf, ",");
> >  		if (tok) {
> > @@ -3632,6 +3633,14 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
> >  			goto out;
> >  		}
> >
> > +		max_left_over = MAX_INPUT_BUF_SZ - strlen(file_path);
> > +		if (!buf ||
> > +		    strnlen(buf, max_left_over) < 7 ||
> 
> What is this magic number 7? strlen("0xN,0xN") as the minimal input string size?
> Maybe use sizeof("0xN,0xN") - 1 instead?

Sure and I forgot the fixes tag, will send a v2.

  Luis

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-25 21:01   ` Andrew Morton
@ 2024-04-29  4:04     ` Luis Chamberlain
  2024-04-29 16:23       ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Luis Chamberlain @ 2024-04-29  4:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

On Thu, Apr 25, 2024 at 02:01:26PM -0700, Andrew Morton wrote:
> On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote:
> 
> > If the file entry is too long we may easily end up going out of bounds
> > and crash after strsep() on sscanf(). 
> > 
> 
> Can you explain why?  I'm not seeing it.

I couldn't see it either but I just looked at the crash below and
its the only thing I could think of. So I think its when userspace
somehow abuses MAX_INPUT_BUF_SZ a lot somehow.

The test I am using:

https://lore.kernel.org/all/20240424224649.1494092-1-mcgrof@kernel.org/

The crash:

Apr 28 20:09:21 debian12-xfs-reflink-16k-4ks unknown: run fstests generic/745 at 2024-04-28 20:09:21
Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): EXPERIMENTAL: Filesystem with Large Block Size (16384 bytes) enabled.
Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): Mounting V5 Filesystem 6633551d-bd66-4a7c-a4ce-7e312e611742
Apr 28 20:09:22 debian12-xfs-reflink-16k-4ks kernel: XFS (loop5): Ending clean mount
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/fss896, page offset: [0x0 - 0x0] with order: 2
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 0 file-backed THP split
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/f0, page offset: [0x0 - 0x6a4000] with order: 2
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 2 file-backed THP split
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/f1, page offset: [0x0 - 0x0] with order: 2
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 0 file-backed THP split
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p2/d0/d1/d4/f5, page offset: [0x0 - 0xc83d64] with order: 2
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 1 of 6 file-backed THP split
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p2/d0/d1/d4/f9, page offset: [0x0 - 0xcad301] with order: 2
Apr 28 20:09:25 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 9 of 79 file-backed THP split


<--- after ~2 minutes -->

Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/dc/dd/d274/d2ca/de12/d139/d4b1/d3cf/d4bb/d1240/d96c/d902/d7a5/d8b0/dbfb/dd18/d795/d8a1/dcc5/d1950/d19ed/d1204/d1452/db5f/dbb7/d10e8/d316/d707/f8f3, page offset: [0x0 - 0x17c0000] with order: 2
Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 6 file-backed THP split
Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: split file-backed THPs in file: /media/scratch/test/p0/dc/dd/d274/d2ca/de12/d139/d4b1/d3cf/d4bb/d1240/d96c/d902/d7a5/d8b0/dbfb/dd18/d795/d8a1/dcc5/d1950/d19ed/d1204/d1452/db5f/dbb7/d10e8/d316/d707/fbb8, page offset: [0x0 - 0x32b533] with order: 2
Apr 28 20:11:24 debian12-xfs-reflink-16k-4ks kernel: huge_memory: 0 of 5 file-backed THP split
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: #PF: supervisor read access in kernel mode
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: #PF: error_code(0x0000) - not-present page
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: PGD 0 P4D 0
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CPU: 7 PID: 2177 Comm: 745 Not tainted 6.9.0-rc5+ #5
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RIP: 0010:vsscanf (lib/vsprintf.c:3465) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Code: 0f 85 1d 02 00 00 48 8b 54 24 08 0f b6 02 3c 25 40 0f 95 c7 84 c0 0f 95 c1 40 20 cf 41 89 fe 74 40 48 8d 4a 01 48 89 4c 24 08 <41> 0f b6 45 00 38 02 0f 84 87 02 00 00 48 8b 44 24 38 65 48 2b 04
All code
========
   0:	0f 85 1d 02 00 00    	jne    0x223
   6:	48 8b 54 24 08       	mov    0x8(%rsp),%rdx
   b:	0f b6 02             	movzbl (%rdx),%eax
   e:	3c 25                	cmp    $0x25,%al
  10:	40 0f 95 c7          	setne  %dil
  14:	84 c0                	test   %al,%al
  16:	0f 95 c1             	setne  %cl
  19:	40 20 cf             	and    %cl,%dil
  1c:	41 89 fe             	mov    %edi,%r14d
  1f:	74 40                	je     0x61
  21:	48 8d 4a 01          	lea    0x1(%rdx),%rcx
  25:	48 89 4c 24 08       	mov    %rcx,0x8(%rsp)
  2a:*	41 0f b6 45 00       	movzbl 0x0(%r13),%eax		<-- trapping instruction
  2f:	38 02                	cmp    %al,(%rdx)
  31:	0f 84 87 02 00 00    	je     0x2be
  37:	48 8b 44 24 38       	mov    0x38(%rsp),%rax
  3c:	65                   	gs
  3d:	48                   	rex.W
  3e:	2b                   	.byte 0x2b
  3f:	04                   	.byte 0x4

Code starting with the faulting instruction
===========================================
   0:	41 0f b6 45 00       	movzbl 0x0(%r13),%eax
   5:	38 02                	cmp    %al,(%rdx)
   7:	0f 84 87 02 00 00    	je     0x294
   d:	48 8b 44 24 38       	mov    0x38(%rsp),%rax
  12:	65                   	gs
  13:	48                   	rex.W
  14:	2b                   	.byte 0x2b
  15:	04                   	.byte 0x4
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RSP: 0018:ffffb24ec358bad8 EFLAGS: 00010202
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RAX: 0000000000000030 RBX: ffffb24ec358bb50 RCX: ffffffffa03fbc18
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RDX: ffffffffa03fbc17 RSI: ffffffffa03fbc17 RDI: 0000000000000001
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RBP: 0000000000000000 R08: ffffb24ec358bbdc R09: 000000000000002c
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R10: ffffb24ec358bbfa R11: 0000000000000000 R12: 0000000000000000
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffffb24ec358bf08
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: FS:  00007f8b39189740(0000) GS:ffff88837bdc0000(0000) knlGS:0000000000000000
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: CR2: 0000000000000000 CR3: 000000010c9f2003 CR4: 0000000000770ef0
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: PKRU: 55555554
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Call Trace:
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel:  <TASK>
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? page_fault_oops (arch/x86/mm/fault.c:713) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? do_user_addr_fault (./include/linux/kprobes.h:591 (discriminator 1) arch/x86/mm/fault.c:1265 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1563) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? vsscanf (lib/vsprintf.c:3465) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? stack_depot_save_flags (lib/stackdepot.c:609) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: sscanf (lib/vsprintf.c:3725) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __check_object_size (mm/usercopy.c:226 (discriminator 1) mm/usercopy.c:213 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: split_huge_pages_write (mm/huge_memory.c:3664) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: full_proxy_write (fs/debugfs/file.c:328 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: vfs_write (fs/read_write.c:588) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? __count_memcg_events (mm/memcontrol.c:723 (discriminator 1) mm/memcontrol.c:962 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? count_memcg_events.constprop.0 (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) ./include/linux/memcontrol.h:1097 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? handle_mm_fault (mm/memory.c:5483 mm/memory.c:5622) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ? preempt_count_add (./include/linux/ftrace.h:974 (discriminator 1) kernel/sched/core.c:5852 (discriminator 1) kernel/sched/core.c:5849 (discriminator 1) kernel/sched/core.c:5877 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: ksys_write (fs/read_write.c:643) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RIP: 0033:0x7f8b39284240
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
All code
========
   0:	40 00 48 8b          	rex add %cl,-0x75(%rax)
   4:	15 c1 9b 0d 00       	adc    $0xd9bc1,%eax
   9:	f7 d8                	neg    %eax
   b:	64 89 02             	mov    %eax,%fs:(%rdx)
   e:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
  15:	eb b7                	jmp    0xffffffffffffffce
  17:	0f 1f 00             	nopl   (%rax)
  1a:	80 3d a1 23 0e 00 00 	cmpb   $0x0,0xe23a1(%rip)        # 0xe23c2
  21:	74 17                	je     0x3a
  23:	b8 01 00 00 00       	mov    $0x1,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 58                	ja     0x8a
  32:	c3                   	ret
  33:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  3a:	48 83 ec 28          	sub    $0x28,%rsp
  3e:	48                   	rex.W
  3f:	89                   	.byte 0x89

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 58                	ja     0x60
   8:	c3                   	ret
   9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  10:	48 83 ec 28          	sub    $0x28,%rsp
  14:	48                   	rex.W
  15:	89                   	.byte 0x89
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RSP: 002b:00007fff2a14b5f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RAX: ffffffffffffffda RBX: 0000000000000109 RCX: 00007f8b39284240
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RDX: 0000000000000109 RSI: 00005589ea673400 RDI: 0000000000000001
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: RBP: 00005589ea673400 R08: 0000000000000007 R09: 0000000000000073
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000109
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel: R13: 00007f8b3935f760 R14: 0000000000000109 R15: 00007f8b3935a9e0
Apr 28 20:11:25 debian12-xfs-reflink-16k-4ks kernel:  </TASK>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split
  2024-04-29  4:04     ` Luis Chamberlain
@ 2024-04-29 16:23       ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2024-04-29 16:23 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: ziy, linux-mm, fstests, linux-xfs, linux-kernel, willy, hare,
	john.g.garry, p.raghav, da.gomez

On Sun, 28 Apr 2024 21:04:50 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote:

> On Thu, Apr 25, 2024 at 02:01:26PM -0700, Andrew Morton wrote:
> > On Wed, 24 Apr 2024 15:54:48 -0700 Luis Chamberlain <mcgrof@kernel.org> wrote:
> > 
> > > If the file entry is too long we may easily end up going out of bounds
> > > and crash after strsep() on sscanf(). 
> > > 
> > 
> > Can you explain why?  I'm not seeing it.
> 
> I couldn't see it either but I just looked at the crash below and
> its the only thing I could think of. So I think its when userspace
> somehow abuses MAX_INPUT_BUF_SZ a lot somehow.

This isn't a good basis for making kernel changes :(

Can you investigate a little further please?  What actually is present
at *buf when your new checks succeed?  Could we be seeing 0xNNN,0xNNN
and leaving new_order unaltered?  Or something.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-04-29 16:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-24 22:54 [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain
2024-04-24 22:54 ` [PATCH 1/2] mm/huge_memory: skip invalid debugfs file entry for folio split Luis Chamberlain
2024-04-25  1:03   ` Zi Yan
2024-04-25 22:40     ` Luis Chamberlain
2024-04-25 21:01   ` Andrew Morton
2024-04-29  4:04     ` Luis Chamberlain
2024-04-29 16:23       ` Andrew Morton
2024-04-24 22:54 ` [PATCH 2/2] mm/huge_memory: cap max length on debugfs file entry " Luis Chamberlain
2024-04-25  1:05   ` Zi Yan
2024-04-24 22:55 ` [PATCH 0/2] mm/huge_memory: couple fixes and one cleanup Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).