Git Mailing List Archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Subject: [PATCH 00/13] reftable: prepare for re-seekable iterators
Date: Wed, 8 May 2024 13:03:33 +0200	[thread overview]
Message-ID: <cover.1715166175.git.ps@pks.im> (raw)

[-- Attachment #1: Type: text/plain, Size: 3313 bytes --]

Hi,

the reftable library uses iterators both to iterate through a set of
records, but also to look up a single record. In past patch series, I
have focussed quite a lot to optimize the case where we iterate through
a large set of records. But looking up a records is still quite
inefficient when doing multiple lookups. This is because whenever we
want to look up a record, we need to create a new iterator, including
all of its internal data structures.

To address this inefficiency, the patch series at hand refactors the
reftable library such that creation of iterators and seeking on an
iterator are separate steps. This refactoring prepares us for reusing
iterators to perform multiple seeks, which in turn will allow us to
reuse internal data structures for subsequent seeks.

The patch series is structured as follows:

  - Patches 1 to 5 perform some general cleanups to make the reftable
    iterators easier to understand.

  - Patchges 6 to 9 refactor the iterators internally such that creation
    of the iterator and seeking on it is clearly separated.

  - Patches 10 to 13 adapt the external interfaces such that they allow
    for reuse of iterators.

Note: this series does not yet go all the way to re-seekable iterators,
and there are no users yet. The patch series is complex enough as-is
already, so I decided to defer that to the next iteration. Thus, the
whole refactoring here should essentially be a large no-op that prepares
the infrastructure for re-seekable iterators.

The series depends on pks/reftable-write-optim at fa74f32291
(reftable/block: reuse compressed array, 2024-04-08).

Thanks!

Patrick

Patrick Steinhardt (13):
  reftable/block: use `size_t` to track restart point index
  reftable/reader: avoid copying index iterator
  reftable/reader: unify indexed and linear seeking
  reftable/reader: separate concerns of table iter and reftable reader
  reftable/reader: inline `reader_seek_internal()`
  reftable/reader: set up the reader when initializing table iterator
  reftable/merged: split up initialization and seeking of records
  reftable/merged: simplify indices for subiterators
  reftable/generic: move seeking of records into the iterator
  reftable/generic: adapt interface to allow reuse of iterators
  reftable/reader: adapt interface to allow reuse of iterators
  reftable/stack: provide convenience functions to create iterators
  reftable/merged: adapt interface to allow reuse of iterators

 refs/reftable-backend.c      |  48 ++++----
 reftable/block.c             |   4 +-
 reftable/generic.c           |  94 +++++++++++----
 reftable/generic.h           |   9 +-
 reftable/iter.c              |  23 +++-
 reftable/merged.c            | 148 ++++++++----------------
 reftable/merged.h            |   6 +
 reftable/merged_test.c       |  19 ++-
 reftable/reader.c            | 218 +++++++++++++++--------------------
 reftable/readwrite_test.c    |  35 ++++--
 reftable/reftable-generic.h  |   8 +-
 reftable/reftable-iterator.h |  21 ++++
 reftable/reftable-merged.h   |  15 ---
 reftable/reftable-reader.h   |  45 ++------
 reftable/reftable-stack.h    |  18 +++
 reftable/stack.c             |  29 ++++-
 16 files changed, 378 insertions(+), 362 deletions(-)

-- 
2.45.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

             reply	other threads:[~2024-05-08 11:03 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-08 11:03 Patrick Steinhardt [this message]
2024-05-08 11:03 ` [PATCH 01/13] reftable/block: use `size_t` to track restart point index Patrick Steinhardt
2024-05-08 11:03 ` [PATCH 02/13] reftable/reader: avoid copying index iterator Patrick Steinhardt
2024-05-08 11:03 ` [PATCH 03/13] reftable/reader: unify indexed and linear seeking Patrick Steinhardt
2024-05-08 11:03 ` [PATCH 04/13] reftable/reader: separate concerns of table iter and reftable reader Patrick Steinhardt
2024-05-08 11:03 ` [PATCH 05/13] reftable/reader: inline `reader_seek_internal()` Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 06/13] reftable/reader: set up the reader when initializing table iterator Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 07/13] reftable/merged: split up initialization and seeking of records Patrick Steinhardt
2024-05-10 19:18   ` Justin Tobler
2024-05-13  8:36     ` Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 08/13] reftable/merged: simplify indices for subiterators Patrick Steinhardt
2024-05-10 19:25   ` Justin Tobler
2024-05-13  8:36     ` Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 09/13] reftable/generic: move seeking of records into the iterator Patrick Steinhardt
2024-05-10 21:44   ` Justin Tobler
2024-05-13  8:36     ` Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 10/13] reftable/generic: adapt interface to allow reuse of iterators Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 11/13] reftable/reader: " Patrick Steinhardt
2024-05-10 21:48   ` Justin Tobler
2024-05-13  8:36     ` Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 12/13] reftable/stack: provide convenience functions to create iterators Patrick Steinhardt
2024-05-08 11:04 ` [PATCH 13/13] reftable/merged: adapt interface to allow reuse of iterators Patrick Steinhardt
2024-05-08 23:42 ` [PATCH 00/13] reftable: prepare for re-seekable iterators Junio C Hamano
2024-05-09  0:16   ` Junio C Hamano
2024-05-10  7:48   ` Patrick Steinhardt
2024-05-10 15:40     ` Junio C Hamano
2024-05-10 16:13       ` Patrick Steinhardt
2024-05-10 17:17         ` Junio C Hamano
2024-05-13  8:46 ` [PATCH v2 " Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 01/13] reftable/block: use `size_t` to track restart point index Patrick Steinhardt
2024-05-21 13:34     ` Karthik Nayak
2024-05-22  7:23       ` Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 02/13] reftable/reader: avoid copying index iterator Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 03/13] reftable/reader: unify indexed and linear seeking Patrick Steinhardt
2024-05-21 14:41     ` Karthik Nayak
2024-05-22  7:23       ` Patrick Steinhardt
2024-05-22  7:56         ` Karthik Nayak
2024-05-13  8:47   ` [PATCH v2 04/13] reftable/reader: separate concerns of table iter and reftable reader Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 05/13] reftable/reader: inline `reader_seek_internal()` Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 06/13] reftable/reader: set up the reader when initializing table iterator Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 07/13] reftable/merged: split up initialization and seeking of records Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 08/13] reftable/merged: simplify indices for subiterators Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 09/13] reftable/generic: move seeking of records into the iterator Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 10/13] reftable/generic: adapt interface to allow reuse of iterators Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 11/13] reftable/reader: " Patrick Steinhardt
2024-05-13  8:47   ` [PATCH v2 12/13] reftable/stack: provide convenience functions to create iterators Patrick Steinhardt
2024-05-13  8:48   ` [PATCH v2 13/13] reftable/merged: adapt interface to allow reuse of iterators Patrick Steinhardt
2024-05-15 20:15   ` [PATCH v2 00/13] reftable: prepare for re-seekable iterators Justin Tobler
2024-05-21 15:31   ` Karthik Nayak
2024-05-22  7:23     ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1715166175.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).