From: Alexander Aring <aahringo@redhat.com>
To: teigland@redhat.com
Cc: gfs2@lists.linux.dev, aahringo@redhat.com
Subject: [PATCHv2 dlm/next 0/9] dlm: scand fix, rhashtable, timers and lookup hotpath speedup
Date: Mon, 15 Apr 2024 14:39:34 -0400
Message-ID: <20240415183943.645497-1-aahringo@redhat.com>

Hi,

this patch series makes a large change to the DLM rsb hashtable logic.
It removes the double lookup for rsb hashtables and converts them to
rhashtable, replacing our own bucket hlist hash implementation.

First there is a fix for scand that I found while implementing this
patch series. Remove messages could still be sent while the lockspace
is releasing the resource, which could result in a use after free.

Next there is a conversion to use lists (keep/toss lists) instead of
iterating over the hash buckets. We are moving to rhashtable, and
rhashtables do not lend themselves to being iterated because the bucket
sizing is handled inside the rhashtable implementation. We introduce the
lists just to do the iteration. As a side benefit, the debugfs dump code
shrinks considerably because we use the list dump helpers of debugfs.
There is also a potential refcount bug when holding references to rsbs
in toss state, because receiving a remove message requires that no rsb
references are held. Another issue is holding rsbs in keep state, as
they then do not go into toss state when they are required to.

It is now forbidden to hold references to rsbs in toss state. The
refcounter must only be functional in rsb keep state. That will
hopefully show us more invalid usage of the rsb refcounter when the rsb
is in toss state.

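
As an illustration only, the refcount rule can be sketched in plain
userspace C. This is not code from the patches: struct rsb_sketch and
the rsb_*() helpers are hypothetical names, and the assumption that the
last put moves the rsb to toss state and that reuse starts with a fresh
reference is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an rsb; not the kernel struct. */
enum rsb_state { RSB_KEEP, RSB_TOSS };

struct rsb_sketch {
	enum rsb_state state;
	int refcount;		/* only meaningful while state == RSB_KEEP */
};

/* Initialise an rsb in keep state with one reference. */
static void rsb_init(struct rsb_sketch *r)
{
	r->state = RSB_KEEP;
	r->refcount = 1;
}

/* Take a reference; returns false for invalid usage on a tossed rsb. */
static bool rsb_hold(struct rsb_sketch *r)
{
	if (r->state != RSB_KEEP)
		return false;
	r->refcount++;
	return true;
}

/* Drop a reference; assumed here: the last put moves the rsb to toss. */
static bool rsb_put(struct rsb_sketch *r)
{
	if (r->state != RSB_KEEP || r->refcount <= 0)
		return false;
	if (--r->refcount == 0)
		r->state = RSB_TOSS;
	return true;
}

/* Reuse a tossed rsb: back to keep state with a fresh reference. */
static bool rsb_revive(struct rsb_sketch *r)
{
	if (r->state != RSB_TOSS)
		return false;
	r->state = RSB_KEEP;
	r->refcount = 1;
	return true;
}
```

With this shape any hold or put on a tossed rsb is rejected right away,
which is the kind of invalid usage the rule is meant to expose.
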
The scand was fixed earlier in the series, but it is now also removed.
The scand process held the hashtable/hash bucket lock for a long time
because it iterated over the whole hash. We now use timers to reduce
the time the hashtable lock is held. We introduce a per lockspace toss
queue of tossed rsbs, ordered by timer expiry: the first item is the
rsb that expires earliest and the last item is the one that expires
latest. This makes it easy to change the timer expiration to the next
one in the queue.

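
The queue idea can be sketched in userspace C as follows. This is a
minimal sketch and not the code in the patches: struct toss_entry,
struct toss_queue and the toss_*() helpers are hypothetical, the timer
is modelled as a plain field instead of a kernel timer, jiffies
wraparound is ignored, and it assumes every rsb gets the same timeout
so that appending keeps the queue ordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified rsb entry on the toss queue. */
struct toss_entry {
	unsigned long expires;		/* absolute expiry time, e.g. ticks */
	struct toss_entry *next;
};

/* Hypothetical per lockspace state: one queue and one timer. */
struct toss_queue {
	struct toss_entry *head;	/* earliest expiry */
	struct toss_entry *tail;	/* latest expiry */
	bool timer_armed;
	unsigned long timer_expires;	/* valid only if timer_armed */
};

/* Point the single timer at the head of the queue, or disarm it. */
static void toss_rearm(struct toss_queue *q)
{
	q->timer_armed = q->head != NULL;
	if (q->head)
		q->timer_expires = q->head->expires;
}

/*
 * Append a newly tossed rsb. With one shared timeout a later toss never
 * expires earlier, so appending at the tail keeps the queue sorted.
 */
static void toss_add(struct toss_queue *q, struct toss_entry *e)
{
	e->next = NULL;
	if (q->tail)
		q->tail->next = e;
	else
		q->head = e;
	q->tail = e;
	toss_rearm(q);
}

/* Timer callback: pop all expired entries, then re-arm for the next. */
static int toss_expire(struct toss_queue *q, unsigned long now)
{
	int expired = 0;

	while (q->head && q->head->expires <= now) {
		struct toss_entry *e = q->head;

		q->head = e->next;
		if (!q->head)
			q->tail = NULL;
		e->next = NULL;
		expired++;
	}
	toss_rearm(q);
	return expired;
}
```

The callback only ever looks at the head of the queue, so the time spent
under the lock depends on the number of expired rsbs and not on the size
of the whole hash.
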
The last two patches move the very likely lookup hotpath to mostly take
the read lock. This should avoid contention in most cases. The unlikely
path still needs to hold the write lock, redo the lookup and check
whether the state of the rsb changed. However, I think we hit the likely
path over 90% of the time, where we only need to hold the read lock and
avoid contention between processing DLM messages and the user
triggering new DLM requests.

- Alex
changes since v1:
 - introduce dlm_timer_resume() and call it in between LSFL_RUNNING
   and the ls_in_recovery lock. Comment this function and some rare
   cases where it is only a "try" resume.
 - add more comments on why the ls_rsbtbl_lock is held in the timer and
   correct the retry comment for the case where the timer hits
   contention. It is only a "try" as well, as others can set new timer
   expirations.

changes since RFC:
 - hold the write_lock in find_rsb_dir/nodir when hitting the do_toss
   path and then do the lookup and check on the do_toss rsb fields
 - move from a per rsb timer to a per lockspace timer and introduce
   a per rsb toss queue.

Alexander Aring (9):
  dlm: increment ls_count on find_ls_to_scan()
  dlm: change to non per bucket hashtable lock
  dlm: merge toss and keep hash into one
  dlm: fix avoid rsb hold during debugfs dump
  dlm: switch to use rhashtable for rsbs
  dlm: remove refcounting if rsb is on toss
  dlm: drop scand kthread and use timers
  dlm: likely read lock path for rsb lookup
  dlm: convert lkbidr to rwlock

 fs/dlm/config.c       |   8 +
 fs/dlm/config.h       |   2 +
 fs/dlm/debug_fs.c     | 212 ++---------
 fs/dlm/dir.c          |  14 +-
 fs/dlm/dlm_internal.h |  40 +-
 fs/dlm/lock.c         | 867 ++++++++++++++++++++++++------------------
 fs/dlm/lock.h         |   5 +-
 fs/dlm/lockspace.c    | 151 ++------
 fs/dlm/member.c       |   2 +
 fs/dlm/recover.c      |  29 +-
 fs/dlm/recoverd.c     |  50 +--
 11 files changed, 643 insertions(+), 737 deletions(-)

--
2.43.0