All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF")
@ 2019-05-17  2:41 Hou Tao
  2019-05-17  6:59 ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Hou Tao @ 2019-05-17  2:41 UTC (permalink / raw
  To: linux-xfs, david; +Cc: Darrick J. Wong

Hi,

I don't understand why the commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") is needed here:

diff --git a/fs/iomap.c b/fs/iomap.c
index 72f3864a2e6b..77c214194edf 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1677,7 +1677,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
                dio->submit.cookie = submit_bio(bio);
        } while (nr_pages);

-       if (need_zeroout) {
+       /*
+        * We need to zeroout the tail of a sub-block write if the extent type
+        * requires zeroing or the write extends beyond EOF. If we don't zero
+        * the block tail in the latter case, we can expose stale data via mmap
+        * reads of the EOF block.
+        */
+       if (need_zeroout ||
+           ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
                /* zero out from the end of the write to the end of the block */
                pad = pos & (fs_block_size - 1);
                if (pad)

If need_zeroout is false, it means the block neither is a unwritten block nor
a newly-mapped block, but that also means the block must had been a unwritten block
or a newly-mapped block before this write, so the block must have been zeroed, correct ?

It also introduces unnecessary sub-block zeroing if we repeat the same sub-block write.

I also have tried to reproduce the problem by using fsx as noted in the commit message,
but cann't reproduce it. Maybe I do it in the wrong way:

$ ./ltp/fsx -d -g H -H -z -C -I -w 1024 -F -r 1024 -t 4096 -Z /tmp/xfs/fsx

The XFS related with /tmp/xfs is formatted with "-b size=4096". I also try "-b size=1024",
but still no luck.

Could someone explain the scenario in which the extra block zeroing is needed ? Thanks.

Regards,
Tao

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF")
  2019-05-17  2:41 Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") Hou Tao
@ 2019-05-17  6:59 ` Dave Chinner
  2019-05-17 12:56   ` Hou Tao
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2019-05-17  6:59 UTC (permalink / raw
  To: Hou Tao; +Cc: linux-xfs, Darrick J. Wong

On Fri, May 17, 2019 at 10:41:44AM +0800, Hou Tao wrote:
> Hi,
> 
> I don't understand why the commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") is needed here:
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 72f3864a2e6b..77c214194edf 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -1677,7 +1677,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>                 dio->submit.cookie = submit_bio(bio);
>         } while (nr_pages);
> 
> -       if (need_zeroout) {
> +       /*
> +        * We need to zeroout the tail of a sub-block write if the extent type
> +        * requires zeroing or the write extends beyond EOF. If we don't zero
> +        * the block tail in the latter case, we can expose stale data via mmap
> +        * reads of the EOF block.
> +        */
> +       if (need_zeroout ||
> +           ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
>                 /* zero out from the end of the write to the end of the block */
>                 pad = pos & (fs_block_size - 1);
>                 if (pad)
> 
> If need_zeroout is false, it means the block neither is a unwritten block nor
> a newly-mapped block, but that also means the block must had been a unwritten block
> or a newly-mapped block before this write, so the block must have been zeroed, correct ?

No. One the contrary, it's a direct IO write to beyond the end of
the file which means the block has not been zeroed at all. If it is
an unwritten extent, it most definitely does not contain zeroes
(unwritten extents are a flag in the extent metadata, not zeroed
disk space) and so it doesn't matter it is written or unwritten we
must zero it before we update the file size.

Why? Because if we then mmap the page that spans EOF, whatever is on
disk beyond EOF is exposed to the user process. Hence if we don't
zero the tail of the block beyond EOF during DIO writes then we can
leak stale information to unprivileged users....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF")
  2019-05-17  6:59 ` Dave Chinner
@ 2019-05-17 12:56   ` Hou Tao
  2019-05-18  3:10     ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Hou Tao @ 2019-05-17 12:56 UTC (permalink / raw
  To: Dave Chinner; +Cc: linux-xfs, Darrick J. Wong

Hi,

On 2019/5/17 14:59, Dave Chinner wrote:
> On Fri, May 17, 2019 at 10:41:44AM +0800, Hou Tao wrote:
>> Hi,
>>
>> I don't understand why the commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") is needed here:
>>
>> diff --git a/fs/iomap.c b/fs/iomap.c
>> index 72f3864a2e6b..77c214194edf 100644
>> --- a/fs/iomap.c
>> +++ b/fs/iomap.c
>> @@ -1677,7 +1677,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>>                 dio->submit.cookie = submit_bio(bio);
>>         } while (nr_pages);
>>
>> -       if (need_zeroout) {
>> +       /*
>> +        * We need to zeroout the tail of a sub-block write if the extent type
>> +        * requires zeroing or the write extends beyond EOF. If we don't zero
>> +        * the block tail in the latter case, we can expose stale data via mmap
>> +        * reads of the EOF block.
>> +        */
>> +       if (need_zeroout ||
>> +           ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
>>                 /* zero out from the end of the write to the end of the block */
>>                 pad = pos & (fs_block_size - 1);
>>                 if (pad)
>>
>> If need_zeroout is false, it means the block neither is a unwritten block nor
>> a newly-mapped block, but that also means the block must had been a unwritten block
>> or a newly-mapped block before this write, so the block must have been zeroed, correct ?
> 
> No. One the contrary, it's a direct IO write to beyond the end of
> the file which means the block has not been zeroed at all. If it is
> an unwritten extent, it most definitely does not contain zeroes
> (unwritten extents are a flag in the extent metadata, not zeroed
> disk space) and so it doesn't matter it is written or unwritten we
> must zero it before we update the file size.
> 
Ah, I still can not understand the reason why "the block has not been zeroed at all".
Do you mean the scenario in which the past-EOF write tries to write an already mapped
block and the past-EOF part of this block has not been zeroed ?

Because if the past-EOF write tries to write to a new allocated block, then IOMAP_F_NEW
must have been set in iomap->flags and need_zeroout will be true. If it tries to write
to an unwritten block, need_zeroout will also be true.

If it tries to write a block in which the past-EOF part has not been zeroed, even without
the past-EOF direct write, the data exposure is still possible, right ?
If not, could you please give a specific example on how this happens ? Thanks.

Regards,
Tao

> Why? Because if we then mmap the page that spans EOF, whatever is on
> disk beyond EOF is exposed to the user process. Hence if we don't
> zero the tail of the block beyond EOF during DIO writes then we can
> leak stale information to unprivileged users....
> 
> Cheers,
> 
> Dave.
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF")
  2019-05-17 12:56   ` Hou Tao
@ 2019-05-18  3:10     ` Dave Chinner
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2019-05-18  3:10 UTC (permalink / raw
  To: Hou Tao; +Cc: linux-xfs, Darrick J. Wong

On Fri, May 17, 2019 at 08:56:35PM +0800, Hou Tao wrote:
> Hi,
> 
> On 2019/5/17 14:59, Dave Chinner wrote:
> > On Fri, May 17, 2019 at 10:41:44AM +0800, Hou Tao wrote:
> >> Hi,
> >>
> >> I don't understand why the commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") is needed here:
> >>
> >> diff --git a/fs/iomap.c b/fs/iomap.c
> >> index 72f3864a2e6b..77c214194edf 100644
> >> --- a/fs/iomap.c
> >> +++ b/fs/iomap.c
> >> @@ -1677,7 +1677,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
> >>                 dio->submit.cookie = submit_bio(bio);
> >>         } while (nr_pages);
> >>
> >> -       if (need_zeroout) {
> >> +       /*
> >> +        * We need to zeroout the tail of a sub-block write if the extent type
> >> +        * requires zeroing or the write extends beyond EOF. If we don't zero
> >> +        * the block tail in the latter case, we can expose stale data via mmap
> >> +        * reads of the EOF block.
> >> +        */
> >> +       if (need_zeroout ||
> >> +           ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
> >>                 /* zero out from the end of the write to the end of the block */
> >>                 pad = pos & (fs_block_size - 1);
> >>                 if (pad)
> >>
> >> If need_zeroout is false, it means the block neither is a unwritten block nor
> >> a newly-mapped block, but that also means the block must had been a unwritten block
> >> or a newly-mapped block before this write, so the block must have been zeroed, correct ?
> > 
> > No. One the contrary, it's a direct IO write to beyond the end of
> > the file which means the block has not been zeroed at all. If it is
> > an unwritten extent, it most definitely does not contain zeroes
> > (unwritten extents are a flag in the extent metadata, not zeroed
> > disk space) and so it doesn't matter it is written or unwritten we
> > must zero it before we update the file size.
> > 
> Ah, I still can not understand the reason why "the block has not been zeroed at all".
> Do you mean the scenario in which the past-EOF write tries to write an already mapped
> block and the past-EOF part of this block has not been zeroed ?

Yup. Speculative delayed allocation beyond EOF triggered by mmap()
or buffered writes() that has then been allocated by writeback, then
we do a dio that extends EOF into that space.

> Because if the past-EOF write tries to write to a new allocated block, then IOMAP_F_NEW
> must have been set in iomap->flags and need_zeroout will be true. If it tries to write
> to an unwritten block, need_zeroout will also be true.

But if it tries to write to an already allocated, written block,
need_zeroout is not true and it will need to zero.
> 
> If it tries to write a block in which the past-EOF part has not been zeroed, even without
> the past-EOF direct write, the data exposure is still possible, right ?

Not if we zero both sides of the dio correctly before we extend the
file size on disk.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-18  3:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-05-17  2:41 Question about commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") Hou Tao
2019-05-17  6:59 ` Dave Chinner
2019-05-17 12:56   ` Hou Tao
2019-05-18  3:10     ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.