Linux-api Archive mirror
From: "Darrick J. Wong" <djwong@kernel.org>
To: John Garry <john.g.garry@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	martin.petersen@oracle.com, himanshu.madhani@oracle.com
Subject: Re: [PATCH 2/4] readv.2: Document RWF_ATOMIC flag
Date: Tue, 24 Oct 2023 08:39:06 -0700
Message-ID: <20231024153906.GJ11391@frogsfrogsfrogs>
In-Reply-To: <7da93082-2985-85f4-7688-a082728de0a5@oracle.com>

On Tue, Oct 24, 2023 at 01:35:33PM +0100, John Garry wrote:
> On 09/10/2023 22:05, Darrick J. Wong wrote:
> > > > If the file range is a sparse hole, the directio setup will allocate
> > > > space and create an unwritten mapping before issuing the write bio.  The
> > > > rest of the process works the same as preallocations and has the same
> > > > behaviors.
> > > > 
> > > > If the file range is allocated and was previously written, the write is
> > > > issued and that's all that's needed from the fs.  After a crash, reads
> > > > of the storage device produce the old contents or the new contents.
> > > This is exactly what I explained when reviewing the code that
> > > rejected RWF_ATOMIC without O_DSYNC on metadata dirty inodes.
> > I'm glad we agree. 😄
> > 
> > John, when you're back from vacation, can we get rid of this language
> > and all those checks under _is_dsync() in the iomap patch?
> > 
> > (That code is 100% the result of me handwaving and bellyaching 6 months
> > ago when the team was trying to get all the atomic writes bits working
> > prior to LSF and I was too burned out to think the xfs part through.
> > As a result, I decided that we'd only support strict overwrites for the
> > first iteration.)
> 
> So this following additive code in iomap_dio_bio_iter() should be dropped:
> 
> ----8<-----
> 
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -275,10 +275,11 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
>  static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>  		struct iomap_dio *dio)
>  {
> 
> ...
> 
> @@ -292,6 +293,13 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>  	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
>  		return -EINVAL;
> 
> +	if (atomic_write && !iocb_is_dsync(dio->iocb)) {
> +		if (iomap->flags & IOMAP_F_DIRTY)
> +			return -EIO;
> +		if (iomap->type != IOMAP_MAPPED)
> +			return -EIO;
> +	}
> +
> 
> ---->8-----
> 
> ok?

Yes.
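
For anyone following along, here is a rough userspace model of what the
hunk above did and what dropping it changes.  This is only a sketch: the
model_* names and MODEL_F_DIRTY are made up for illustration and stand in
for iomap->type, IOMAP_F_DIRTY and iocb_is_dsync(); it is not the iomap
code itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the iomap mapping types and dirty flag. */
enum model_map_type { MODEL_HOLE, MODEL_UNWRITTEN, MODEL_MAPPED };
#define MODEL_F_DIRTY 0x1u

/*
 * Models the check being dropped: a non-dsync atomic write was rejected
 * unless it was a strict overwrite of a clean, already-written mapping.
 */
int model_precheck_old(bool atomic_write, bool dsync,
		       enum model_map_type type, unsigned int flags)
{
	if (atomic_write && !dsync) {
		if (flags & MODEL_F_DIRTY)
			return -EIO;
		if (type != MODEL_MAPPED)
			return -EIO;
	}
	return 0;
}

/*
 * Models the behaviour once the hunk is removed: the mapping state no
 * longer causes an atomic write to be rejected at this point.
 */
int model_precheck_new(bool atomic_write, bool dsync,
		       enum model_map_type type, unsigned int flags)
{
	(void)atomic_write;
	(void)dsync;
	(void)type;
	(void)flags;
	return 0;
}
```

In this model a plain atomic write into an unwritten extent goes from
-EIO to being allowed, which matches the hole and preallocation cases
quoted above.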

> > 
> > > > Summarizing:
> > > > 
> > > > An (ATOMIC|SYNC) request provides the strongest guarantees (data
> > > > will not be torn, and all file metadata updates are persisted before
> > > > the write is returned to userspace).  Programs see either the old data
> > > > or the new data, even if there's a crash.
> > > > 
> > > > (ATOMIC|DSYNC) is less strong -- data will not be torn, and any file
> > > > updates for just that region are persisted before the write is returned.
> > > > 
> > > > (ATOMIC) is the least strong -- data will not be torn.  Neither the
> > > > filesystem nor the device makes guarantees that anything ended up on
> > > > stable storage, but if it does, programs see either the old data or the
> > > > new data.
> > > Yup, that makes sense to me.
> > Perhaps this ^^ is what we should be documenting here.
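
If it helps the manpage text, the three levels above could be sketched
from userspace roughly like this.  Assumptions: the untorn_* names are
invented for the example, and the fallback value for RWF_ATOMIC is what
later uapi headers use rather than anything this series guarantees.  The
caller is still expected to meet the alignment and size rules reported
through statx.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/* Fallbacks for older headers.  The RWF_ATOMIC value is an assumption. */
#ifndef RWF_DSYNC
#define RWF_DSYNC 0x00000002
#endif
#ifndef RWF_SYNC
#define RWF_SYNC 0x00000004
#endif
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/* Hypothetical names for the three guarantee levels in the summary. */
enum untorn_level {
	UNTORN_ONLY,	/* not torn, no persistence guarantee */
	UNTORN_DSYNC,	/* not torn, updates for that region persisted */
	UNTORN_SYNC,	/* not torn, all file metadata updates persisted */
};

/* Map a guarantee level to the pwritev2() flags that request it. */
int untorn_rwf_flags(enum untorn_level level)
{
	switch (level) {
	case UNTORN_SYNC:
		return RWF_ATOMIC | RWF_SYNC;
	case UNTORN_DSYNC:
		return RWF_ATOMIC | RWF_DSYNC;
	default:
		return RWF_ATOMIC;
	}
}

/*
 * Issue one untorn write of a single buffer.  Returns whatever
 * pwritev2() returns; on kernels or files without atomic write support
 * this is expected to fail with -1 and errno set.
 */
ssize_t untorn_pwrite(int fd, const void *buf, size_t len, off_t off,
		      enum untorn_level level)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };

	return pwritev2(fd, &iov, 1, off, untorn_rwf_flags(level));
}
```

A caller that only cares about not seeing torn data after a crash would
use UNTORN_ONLY and fsync later; one that needs the write durable on
return would pick UNTORN_DSYNC or UNTORN_SYNC.
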
> > 
> > > > Maybe we should rename the whole UAPI s/atomic/untorn/...
> > > Perhaps, though "torn writes" is nomenclature that nobody outside
> > > storage and filesystem developers really knows about. All I ever
> > > hear from userspace developers is "we want atomic/all-or-nothing
> > > data writes"...

How about O_NOTEARS -> PWF_NOTEARS -> REQ_NOTEARS?

<obligatory "There's no crying in baseball" link, etc.>

--D

> > Fair 'enuf.
> 
> 
> Thanks,
> John
