BPF Archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Quentin Monnet <qmo@kernel.org>
Cc: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>,
	Eduard Zingerman <eddyz87@gmail.com>,
	bpf@vger.kernel.org,  ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, kafai@meta.com,  kernel-team@meta.com,
	Mykyta Yatsenko <yatsenko@meta.com>
Subject: Re: [PATCH bpf-next] bpftool: introduce btf c dump sorting
Date: Wed, 8 May 2024 09:21:33 -0700	[thread overview]
Message-ID: <CAEf4BzYzbCTV8cr-mCpVy5dco+1j1oig7SVPEmBjvKJEyb5JFw@mail.gmail.com> (raw)
In-Reply-To: <e20be6d3-b9c7-4a64-add1-f4c7a6d3a4bc@kernel.org>

On Tue, May 7, 2024 at 4:30 PM Quentin Monnet <qmo@kernel.org> wrote:
>
> On 07/05/2024 22:02, Andrii Nakryiko wrote:
> > On Mon, May 6, 2024 at 6:45 AM Mykyta Yatsenko
> > <mykyta.yatsenko5@gmail.com> wrote:
> >>
> >> From: Mykyta Yatsenko <yatsenko@meta.com>
> >>
> >> Provide a way to sort bpftool c dump output, to simplify vmlinux.h
> >> diffing and forcing more natural definitions ordering.
> >>
> >> Use `normalized` argument in bpftool CLI after `format c` for example:
> >> ```
> >> bpftool btf dump file /sys/kernel/btf/fuse format c normalized
> >> ```
> >>
> >> Definitions are sorted by their BTF kind ranks, lexicographically and
> >> typedefs are forced to go right after their base type.
> >>
> >> Type ranks
> >>
> >> Assign ranks to btf kinds (defined in function btf_type_rank) to set
> >> next order:
> >> 1. Anonymous enums
> >> 2. Anonymous enums64
> >> 3. Named enums
> >> 4. Named enums64
> >> 5. Trivial types typedefs (ints, then floats)
> >> 6. Structs
> >> 7. Unions
> >> 8. Function prototypes
> >> 9. Forward declarations
> >>
> >> Lexicographical ordering
> >>
> >> Definitions within the same BTF kind are ordered by their names.
> >> Anonymous enums are ordered by their first element.
> >>
> >> Forcing typedefs to go right after their base type
> >>
> >> To make sure that typedefs are emitted right after their base type,
> >> we build a list of type's typedefs (struct typedef_ref) and after
> >> emitting type, its typedefs are emitted as well (lexicographically)
> >>
> >> There is a small flaw in this implementation:
> >> Type dependencies are resolved by bpf lib, so when type is dumped
> >> because it is a dependency, its typedefs are not output right after it,
> >> as bpflib does not have the list of typedefs for a given type.
> >>
> >> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
> >> ---
> >>  tools/bpf/bpftool/btf.c | 264 +++++++++++++++++++++++++++++++++++++++-
> >>  1 file changed, 259 insertions(+), 5 deletions(-)
> >>
> >
> > I applied this locally to experiment. Generated vmlinux.h for the
> > production (a bit older) kernel and then for latest bpf-next/master
> > kernel. And then tried diff between normalized vmlinux.h dumps and
> > non-normalized.
> >
> > It took a bit for the diff tool to generate, but I think diff for
> > normalized vmlinux.h is actually very usable. You can see an example
> > at [1]. It shows whole new types being added in front of existing
> > ones. And for existing ones it shows only parts that actually changed.
> > It's quite nice. And note that I used a relatively stale production
> > kernel vs latest upstream bpf-next, *AND* with different (bigger)
> > Kconfig. So for more incremental changes in kernel config/version the
> > diff should be much slower.
> >
> > I think this idea of normalizing vmlinux.h works and is useful.
> >
> > Eduard, Quentin, please take a look when you get a chance.
> >
> > My high-level feedback. I like the idea and it seems to work well in
> > practice. I do think, though, that the current implementation is a bit
> > over-engineered. I'd drop all the complexity with TYPEDEF and try to
> > get almost the same behavior with a slightly different ranking
> > strategy.
> >
> > Tracking which types are emitted seems unnecessary btf_dumper is doing
> > that already internally. So I think overall flow could be basically
> > three steps:
> >
> >   - precalculate/cache "sort names" and ranks;
> >   - sort based on those two, construct 0-based list of types to emit
> >   - just go linearly over that sorted list, call btf_dump__dump_type()
> > on each one with original type ID; if the type was already emitted or
> > is not the type that's emitted as an independent type (e.g.,
> > FUNC_PROTO), btf_dump__dump_type() should do the right thing (do
> > nothing).
> >
> > Any flaws in the above proposal?
> >
> >   [1] https://gist.github.com/anakryiko/cca678c8f77833d9eb99ffc102612e28
>
> Hi, thanks for the patch - and thanks Andrii for the Cc. I didn't have
> time to look at the code yet (will do), but the idea looks great.
>
> My main question would be, how much overhead does the sorting add to the
> BTF dump, and if this overhead is low, is it even worth having a

I actually measured. I didn't save numbers, but it was something like
50% slower in current implementation (and with simplifications I
proposed it should be faster still, probably), so not a lot slower.
It's not noticeable in practice.

So yes, we can probably make this a default. I'd probably still leave
an option to dump using "natural" BTF order, just in case.


> dedicated command-line keyword to trigger the sorting, or should we just
> make it the default behaviour for the C-formatted dump? (Or is there any
> advantage in dumping with the current, unsorted order?)
>
> Quentin

  reply	other threads:[~2024-05-08 16:21 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-06 13:44 [PATCH bpf-next] bpftool: introduce btf c dump sorting Mykyta
2024-05-07 21:02 ` Andrii Nakryiko
2024-05-07 23:30   ` Quentin Monnet
2024-05-08 16:21     ` Andrii Nakryiko [this message]
2024-05-08 23:07   ` Mykyta Yatsenko
2024-05-08 23:21     ` Andrii Nakryiko
2024-05-08 23:35       ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzYzbCTV8cr-mCpVy5dco+1j1oig7SVPEmBjvKJEyb5JFw@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=kafai@meta.com \
    --cc=kernel-team@meta.com \
    --cc=mykyta.yatsenko5@gmail.com \
    --cc=qmo@kernel.org \
    --cc=yatsenko@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).