DM-Devel Archive mirror
 help / color / mirror / Atom feed
From: Martin Wilck <martin.wilck@suse.com>
To: Benjamin Marzinski <bmarzins@redhat.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
	device-mapper development <dm-devel@lists.linux.dev>
Subject: Re: [PATCH v2 4/5] libmultipath: pad dev_loss_tmo to avoid race with no_path_retry
Date: Thu, 02 May 2024 20:44:28 +0200	[thread overview]
Message-ID: <f37205021bc5d7e57425022e57f315ff003d2214.camel@suse.com> (raw)
In-Reply-To: <ZjO50C2hB8MilnHV@redhat.com>

On Thu, 2024-05-02 at 12:05 -0400, Benjamin Marzinski wrote:
> On Thu, May 02, 2024 at 05:14:56PM +0200, Martin Wilck wrote:
> > On Thu, 2024-04-25 at 19:35 -0400, Benjamin Marzinski wrote:
> > > Currently multipath makes sure that dev_loss_tmo is at least as
> > > long
> > > as
> > > the configured no path queueing time. The goal is to make sure
> > > that
> > > path
> > > devices aren't removed while the multipath device is still
> > > queueing
> > > in
> > > hopes that they will become usable again.
> > > 
> > > This is racy. Multipathd may take longer to check the paths than
> > > configured. If strict_timing isn't set, it will definitely take
> > > longer.
> > > To account for this, pad the minimum dev_loss_tmo value by five
> > > seconds
> > > (one default checker interval) plus one second per minute of no
> > > path
> > > queueing time.
> > 
> > What's the rationale for the "one second per minute" part?
> > It feels kind of arbitrary to me, and I don't find it obvious that
> > we
> > need larger padding for larger timeouts.
> 
> The way the checker works, without strict_timing, we do all the
> checking
> work, and then we wait for a second. Obviously, the checking work
> takes
> time, so every nominal one second checker loop will in reality take
> longer than a second. The larger you have configured (no_path_retry *
> checkint), the further away the actual time when we disable queueing
> will be from (no_path_retry * checkint) seconds.
> 
> With strict_timing on, things work better. As long as the checking
> work
> doesn't take more than a second, you will stay fairly close to one
> second per loop, but even still, it can be longer. nanosleep() will
> sleep till at least the time specified. If multipathd is not running
> as
> a realtime process, it will usually sleep for longer.
> 
> Also, if the checker work takes longer than a second, then
> strict_timing
> will make sure that the loop takes (close to) exactly 2 (or whatever
> the
> next number is) seconds. However, retry_count_tick() still only ticks
> down by 1 for each loop. This means that the real time before
> queueing
> is disabled can diverge from (no_path_retry * checkint). And again,
> the
> longer you've configured it to wait, the more it diverges.
> 
> I thought about passing ticks to retry_count_tick(), so that it would
> correctly deal with these multi-tick loops, but this can cause it to
> diverge from the actual number of retries attempted, in a way that
> would
> mean we disable queueing before the desired number of retries have
> occurred, which is a bigger problem, IMHO.  No matter how long the
> checker work takes, you will only ever do one path check per loop. In
> the worse case, say that the checker work (or anything else in
> multipathd) gets stuck for a couple of minutes. You would only do one
> path check, but hundreds of ticks would be passed to the
> retry_count_tick(), meaning that you could disable queueing after
> only
> one retry. Now you could limit the number of ticks you passed to
> retry_count_tick(), but even still, if the a path was 1 tick away
> from
> being checked, and the checker work took 3 seconds, you would still
> pass
> 3 ticks to retry_count_tick(), which would cause it to start to
> diverge
> from the actual time it takes for the path to do the configured
> amount
> of retries.
> 
> We could improve retry_count_tick(), so that it actually tracks the
> number of retries you did, but that still leaves the problems when
> either strict_timing isn't on or multipathd isn't a realtime process.
> It seemed easier to just say, the longer you have configured
> multipathd
> to wait to disable queueing, the more the time when it actually
> disables
> queueing can diverge from the configured time.
> 
> The number I picked for the rate of divergence is just a guest, and
> if
> you don't like it, I'm fine with removing it. After all, the whole
> point
> of this patchset is to make sure that the worst thing that can happen
> if
> paths disappear while we are still queueing is that a multipath
> device
> isn't autoremoved.

I was just asking for a rationale. This is fine.

I think we can, and should, improve multipathd's timing. Unfortunately
this has been on my agenda for a long time already.

For 4/5, too:

Reviewed-by: Martin Wilck <mwilck@suse.com>


  reply	other threads:[~2024-05-02 18:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-25 23:35 [PATCH v2 0/5] multipath: fix hang in flush_map_nopaths Benjamin Marzinski
2024-04-25 23:35 ` [PATCH v2 1/5] libmultipath: export partmap_in_use Benjamin Marzinski
2024-04-25 23:35 ` [PATCH v2 2/5] libmultipath: change flush_on_last_del to fix a multipathd hang Benjamin Marzinski
2024-04-30 17:06   ` Martin Wilck
2024-04-30 21:29     ` Benjamin Marzinski
2024-05-02  8:36       ` Martin Wilck
2024-05-02 15:06   ` Martin Wilck
2024-04-25 23:35 ` [PATCH v2 3/5] libmultipath: remove redundant config option from InfiniBox config Benjamin Marzinski
2024-04-25 23:35 ` [PATCH v2 4/5] libmultipath: pad dev_loss_tmo to avoid race with no_path_retry Benjamin Marzinski
2024-05-02 15:14   ` Martin Wilck
2024-05-02 16:05     ` Benjamin Marzinski
2024-05-02 18:44       ` Martin Wilck [this message]
2024-04-25 23:35 ` [PATCH v2 5/5] libmultipath: fix deferred_remove function arguments Benjamin Marzinski
2024-05-02 15:52   ` Martin Wilck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f37205021bc5d7e57425022e57f315ff003d2214.camel@suse.com \
    --to=martin.wilck@suse.com \
    --cc=bmarzins@redhat.com \
    --cc=christophe.varoqui@opensvc.com \
    --cc=dm-devel@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).