LKML Archive mirror
 help / color / mirror / Atom feed
* [PATCH] ext4: remove checks for data=ordered and journal_async_commit options
@ 2024-03-20 10:58 Jongseok Kim
  2024-03-20 12:03 ` Jan Kara
  0 siblings, 1 reply; 2+ messages in thread
From: Jongseok Kim @ 2024-03-20 10:58 UTC (permalink / raw
  To: tytso, jack, adilger.kernel; +Cc: linux-ext4, linux-kernel, Jongseok Kim

On Tue, Nov 25, 2014 at 04:56:15PM +0100, Jan Kara wrote:
> Option journal_async_commit breaks gurantees of data=ordered mode as it
> sends only a single cache flush after writing a transaction commit
> block. Thus even though the transaction including the commit block is
> fully stored on persistent storage, file data may still linger in drives
> caches and will be lost on power failure. Since all checksums match on
> journal recovery, we replay the transaction thus possibly exposing stale
> user data.
>
> To fix this data exposure issue, remove the possibility to use
> journal_async_commit in data=ordered mode.
>
> Signed-off-by: Jan Kara <jack@suse.cz>

In employing the journal_async_commit feature, the approach involves
slightly early submission of a request for a transaction commit block
without the inclusion of PREFLUSH and FUA. Instead, once all the blocks of
the transaction have been written, the journal device is flushed once.
This methodology, even under data=ordered mode, due to the procedure of
flushing the j_fs_dev, guarantees that the file data is permanently stored
by the time the commit block request is initiated.

This discussion pertains to scenarios where the file data device and
the journal device are distinct. If the devices are the same,
then all file data is written before the flush of the journal device,
making a single flush sufficient in this context.

Consequently, it remains entirely permissible to activate
the journal_async_commit option, even when operating in ordered mode.

Should my interpretation deviate in any manner,
I earnestly request your guidance and correction.

Signed-off-by: Jongseok Kim <ks77sj@gmail.com>
---
 fs/ext4/super.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index cfb8449c731f..2141c2eb4bf0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4954,13 +4954,6 @@ static int ext4_load_and_init_journal(struct super_block *sb,
 		break;
 	}
 
-	if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA &&
-	    test_opt(sb, JOURNAL_ASYNC_COMMIT)) {
-		ext4_msg(sb, KERN_ERR, "can't mount with "
-			"journal_async_commit in data=ordered mode");
-		goto out;
-	}
-
 	set_task_ioprio(sbi->s_journal->j_task, ctx->journal_ioprio);
 
 	sbi->s_journal->j_submit_inode_data_buffers =
@@ -6510,13 +6503,6 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
 			err = -EINVAL;
 			goto restore_opts;
 		}
-	} else if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA) {
-		if (test_opt(sb, JOURNAL_ASYNC_COMMIT)) {
-			ext4_msg(sb, KERN_ERR, "can't mount with "
-				"journal_async_commit in data=ordered mode");
-			err = -EINVAL;
-			goto restore_opts;
-		}
 	}
 
 	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_NO_MBCACHE) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] ext4: remove checks for data=ordered and journal_async_commit options
  2024-03-20 10:58 [PATCH] ext4: remove checks for data=ordered and journal_async_commit options Jongseok Kim
@ 2024-03-20 12:03 ` Jan Kara
  0 siblings, 0 replies; 2+ messages in thread
From: Jan Kara @ 2024-03-20 12:03 UTC (permalink / raw
  To: Jongseok Kim; +Cc: tytso, jack, adilger.kernel, linux-ext4, linux-kernel

On Wed 20-03-24 19:58:08, Jongseok Kim wrote:
> On Tue, Nov 25, 2014 at 04:56:15PM +0100, Jan Kara wrote:
> > Option journal_async_commit breaks gurantees of data=ordered mode as it
> > sends only a single cache flush after writing a transaction commit
> > block. Thus even though the transaction including the commit block is
> > fully stored on persistent storage, file data may still linger in drives
> > caches and will be lost on power failure. Since all checksums match on
> > journal recovery, we replay the transaction thus possibly exposing stale
> > user data.
> >
> > To fix this data exposure issue, remove the possibility to use
> > journal_async_commit in data=ordered mode.
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> In employing the journal_async_commit feature, the approach involves
> slightly early submission of a request for a transaction commit block
> without the inclusion of PREFLUSH and FUA. Instead, once all the blocks of
> the transaction have been written, the journal device is flushed once.
> This methodology, even under data=ordered mode, due to the procedure of
> flushing the j_fs_dev, guarantees that the file data is permanently stored
> by the time the commit block request is initiated.
> 
> This discussion pertains to scenarios where the file data device and
> the journal device are distinct. If the devices are the same,
> then all file data is written before the flush of the journal device,
> making a single flush sufficient in this context.
> 
> Consequently, it remains entirely permissible to activate
> the journal_async_commit option, even when operating in ordered mode.

So I agree that when journal is on a different device jbd2 will flush the
data device before writing commit record so we are safe in data=ordered
mode even with async commit. Good idea. But:

1) For dealing with this complication we need to have a sensible usecase.
Do you have performance numbers from this combination that show async
commit brings noticeable performance benefit? Please include them in the
changelog.

2) The code should make sure we are really using external journal device
before allowing async commit in data=ordered mode.

Thanks!

								Honza


> Should my interpretation deviate in any manner,
> I earnestly request your guidance and correction.
> 
> Signed-off-by: Jongseok Kim <ks77sj@gmail.com>
> ---
>  fs/ext4/super.c | 14 --------------
>  1 file changed, 14 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index cfb8449c731f..2141c2eb4bf0 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4954,13 +4954,6 @@ static int ext4_load_and_init_journal(struct super_block *sb,
>  		break;
>  	}
>  
> -	if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA &&
> -	    test_opt(sb, JOURNAL_ASYNC_COMMIT)) {
> -		ext4_msg(sb, KERN_ERR, "can't mount with "
> -			"journal_async_commit in data=ordered mode");
> -		goto out;
> -	}
> -
>  	set_task_ioprio(sbi->s_journal->j_task, ctx->journal_ioprio);
>  
>  	sbi->s_journal->j_submit_inode_data_buffers =
> @@ -6510,13 +6503,6 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
>  			err = -EINVAL;
>  			goto restore_opts;
>  		}
> -	} else if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA) {
> -		if (test_opt(sb, JOURNAL_ASYNC_COMMIT)) {
> -			ext4_msg(sb, KERN_ERR, "can't mount with "
> -				"journal_async_commit in data=ordered mode");
> -			err = -EINVAL;
> -			goto restore_opts;
> -		}
>  	}
>  
>  	if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_NO_MBCACHE) {
> -- 
> 2.34.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-03-20 12:03 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-20 10:58 [PATCH] ext4: remove checks for data=ordered and journal_async_commit options Jongseok Kim
2024-03-20 12:03 ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).