From: jim.cromie@gmail.com
To: kernelnewbies <kernelnewbies@kernelnewbies.org>
Subject: how to use perf record effectively
Date: Fri, 10 Nov 2023 12:11:14 -0700
Message-ID: <CAJfuBxwOx+picsXQk4opgvs=6hud5TjW4X3XHefWnxPGKr+G7A@mail.gmail.com>
I have a working patchset which de-duplicates the pr_debug
per-callsite (module, filename, function) data.
It loads that column data into 3 maple trees,
and simple accessor functions retrieve the data
by lookup with the pr_debug address.
So it stores these callsites:
[ 0.721980] dyndbg: 3653 prdebugs in 307 modules, 19 KiB in ddebug tables, 114 kiB ..
into these intervals:
[ 104.047210] dyndbg: mt-funcs has 2174 entries
[ 104.047816] dyndbg: mt-files has 539 entries
[ 104.048410] dyndbg: mt-mods has 312 entries
Once these are loaded, the __dyndbg_sites section,
which separates the 3 columns from the __dyndbg section,
can be recycled.
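The lookup scheme described above (one entry per run of adjacent callsites that share a value, found by the callsite's address) can be illustrated outside the kernel. A minimal sketch in Python, assuming a sorted list of half-open address ranges stands in for a maple tree; `IntervalColumn`, the addresses, the descriptor size, and the function names are all hypothetical and are not taken from the patchset:

```python
import bisect


class IntervalColumn:
    """Maps half-open address ranges [start, end) to one shared string.

    Stand-in for one de-duplicated column (module, filename or function);
    a sorted list plus binary search replaces the kernel's maple tree.
    """

    def __init__(self):
        self._starts = []   # range start addresses, kept sorted
        self._ends = []     # matching exclusive end addresses
        self._values = []   # value shared by every address in the range

    def add(self, addr, size, value):
        """Record that the descriptor at [addr, addr + size) has `value`.

        Descriptors must be added in ascending address order. A descriptor
        that directly follows the previous range and carries the same value
        extends that range instead of creating a new entry.
        """
        if self._values and self._ends[-1] == addr and self._values[-1] == value:
            self._ends[-1] = addr + size
            return
        self._starts.append(addr)
        self._ends.append(addr + size)
        self._values.append(value)

    def lookup(self, addr):
        """Return the value covering `addr`, or None if no range does."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0 and addr < self._ends[i]:
            return self._values[i]
        return None

    def __len__(self):
        return len(self._values)


# Four hypothetical 56-byte descriptors; the first three share a function.
funcs = IntervalColumn()
base, size = 0x1000, 56
for n, name in enumerate(["foo_init", "foo_init", "foo_init", "foo_exit"]):
    funcs.add(base + n * size, size, name)
```

Here four descriptors collapse into two entries, which is the same effect the entry counts above show at kernel scale.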
ALL GOOD SO FAR.
BUT WHAT'S THE RUNTIME COST OF THIS?
perf stat -r200 cat /proc/dynamic_debug/control > /dev/null;
This should be a good test: it calls all 3 accessors for each
pr_debug in the kernel.
But comparing master against this branch shows little change,
and adding --table to see the variations in the runs
suggests that the change is less than the variation within a test.
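As a userspace cross-check on the run-to-run variation, the same read can be timed repeatedly and summarised as a mean with its relative standard error. A minimal sketch in Python, assuming the wall-clock time of a full read is an acceptable proxy; `time_reads` and its parameters are hypothetical, and this does not reproduce perf's hardware counters:

```python
import math
import statistics
import time


def time_reads(path, runs=200):
    """Read `path` in full `runs` times; return (mean_seconds, rel_stderr_pct)."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(1 << 16):   # drain the file in 64 KiB chunks
                pass
        samples.append(time.perf_counter() - t0)
    mean = statistics.fmean(samples)
    # Standard error of the mean: sample stddev / sqrt(n).
    if len(samples) > 1:
        stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    else:
        stderr = 0.0
    return mean, (100.0 * stderr / mean if mean else 0.0)
```

Calling `time_reads("/proc/dynamic_debug/control")` on each kernel gives two means whose difference can be weighed against the reported percentages.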
MASTER - v6.6
bash-5.2# perf stat -r 200 cat /proc/dynamic_debug/control > /dev/null
Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):
             10.29 msec task-clock                #    0.713 CPUs utilized            ( +-  0.56% )
                43      context-switches          #    4.177 K/sec                    ( +-  0.03% )
                 1      cpu-migrations            #   97.142 /sec                     ( +-  5.80% )
                73      page-faults               #    7.091 K/sec                    ( +-  0.10% )
           8906200      cycles                    #    0.865 GHz                      ( +-  0.17% )
            147349      stalled-cycles-frontend   #    1.65% frontend cycles idle     ( +-  0.18% )
             24971      stalled-cycles-backend    #    0.28% backend cycles idle      ( +-  8.18% )
          20589718      instructions              #     2.31 insn per cycle
                                                  #     0.01 stalled cycles per insn  ( +-  0.02% )
           5470202      branches                  #  531.388 M/sec                    ( +-  0.01% )
                 0      branch-misses

         0.0144421 +- 0.0000647 seconds time elapsed  ( +-  0.45% )
DE_DUPLICATION branch
bash-5.2# perf stat -r200 cat /proc/dynamic_debug/control > /dev/null
Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):
             21.89 msec task-clock                #    0.622 CPUs utilized            ( +-  0.69% )
                44      context-switches          #    2.010 K/sec                    ( +-  0.12% )
                 1      cpu-migrations            #   45.693 /sec                     ( +-  3.87% )
                73      page-faults               #    3.336 K/sec                    ( +-  0.10% )
          52017542      cycles                    #    2.377 GHz                      ( +-  0.54% )
            177875      stalled-cycles-frontend   #    0.34% frontend cycles idle     ( +-  0.48% )
            134469      stalled-cycles-backend    #    0.26% backend cycles idle      ( +-  4.24% )
         134707837      instructions              #     2.59 insn per cycle
                                                  #     0.00 stalled cycles per insn  ( +-  0.30% )
          39386555      branches                  #    1.800 G/sec                    ( +-  0.29% )
                 0      branch-misses

          0.035188 +- 0.000167 seconds time elapsed  ( +-  0.47% )
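The two summaries can be put side by side by computing per-counter ratios. A minimal sketch in Python using only the figures quoted above; the dictionary and function names are hypothetical, and treating each `+-` value as the uncertainty of the mean is an assumption:

```python
import math

# Values copied from the two perf stat summaries above.
master = {
    "task-clock-msec": 10.29,
    "cycles": 8906200,
    "instructions": 20589718,
    "branches": 5470202,
    "elapsed-sec": 0.0144421,
}
dedup = {
    "task-clock-msec": 21.89,
    "cycles": 52017542,
    "instructions": 134707837,
    "branches": 39386555,
    "elapsed-sec": 0.035188,
}


def ratios(base, test):
    """Return test/base for every counter present in both summaries."""
    return {k: test[k] / base[k] for k in base if k in test}


def sigma_distance(a, a_err, b, b_err):
    """Difference of two means in units of their combined uncertainty."""
    return abs(b - a) / math.hypot(a_err, b_err)


for name, r in ratios(master, dedup).items():
    print(f"{name:16s} x{r:.2f}")

# Elapsed time: mean and +- value from each summary (assumed to be the
# uncertainty of the mean, see the note above).
print("elapsed, in combined sigmas:",
      round(sigma_distance(0.0144421, 0.0000647, 0.035188, 0.000167)))
```

The reported clock rates also differ between the two runs (0.865 GHz versus 2.377 GHz), so the time-based rows may not have been measured under the same conditions; the instruction and branch counts are likely less sensitive to that.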
I tried perf stat record, then perf diff on the results;
it showed empty comparisons on a handful of event types:
[jimc@frodo boots-dump]$ perf diff -v \
    v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0* \
    v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675* \
    v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906* > foo
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
[jimc@frodo boots-dump]$ more foo
# Event 'task-clock'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'context-switches'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'cpu-migrations'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'page-faults'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'cycles'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'stalled-cycles-frontend'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'stalled-cycles-backend'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'instructions'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'branches'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
# Event 'branch-misses'
#
# Data files:
# [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
# [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
# [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
# [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
# [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
# [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#
Does anyone here have enough experience with perf to recommend
some tests to tease out the differences?
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies