Kernel Newbies archive mirror
From: jim.cromie@gmail.com
To: kernelnewbies <kernelnewbies@kernelnewbies.org>
Subject: how to use perf record effectively
Date: Fri, 10 Nov 2023 12:11:14 -0700
Message-ID: <CAJfuBxwOx+picsXQk4opgvs=6hud5TjW4X3XHefWnxPGKr+G7A@mail.gmail.com>

I have a working patchset which de-duplicates the pr_debug
per-callsite (module, filename, function) data.

It loads that column data into 3 maple trees,
and simple accessor functions retrieve the data
by looking up the pr_debug's address.
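
To illustrate the idea (not the patchset's actual code), here is a minimal
Python sketch of the de-duplication. It assumes callsites are visited in
ascending address order and that adjacent callsites often share a module,
file, or function. A sorted list with bisect stands in for the maple tree;
RangeMap, the addresses, and the names are all hypothetical.

```python
import bisect


class RangeMap:
    """Maps inclusive address ranges to one shared value (maple-tree stand-in)."""

    def __init__(self):
        self._starts = []   # sorted range start addresses
        self._ends = []     # matching inclusive range ends
        self._values = []   # one stored value per range

    def append(self, addr, value):
        """Add addr in ascending order; extend the last range if the value repeats."""
        if self._values and self._values[-1] == value:
            self._ends[-1] = addr
        else:
            self._starts.append(addr)
            self._ends.append(addr)
            self._values.append(value)

    def lookup(self, addr):
        """Return the value whose range covers addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0 and addr <= self._ends[i]:
            return self._values[i]
        return None

    def __len__(self):
        return len(self._values)


# Hypothetical callsites: (address, module, filename, function).
callsites = [
    (0x1000, "modA", "a.c", "init"),
    (0x1038, "modA", "a.c", "init"),
    (0x1070, "modA", "a.c", "exit"),
    (0x10a8, "modA", "b.c", "probe"),
    (0x2000, "modB", "c.c", "remove"),
]

mods, files, funcs = RangeMap(), RangeMap(), RangeMap()
for addr, mod, fname, func in callsites:
    mods.append(addr, mod)
    files.append(addr, fname)
    funcs.append(addr, func)

# Five callsites collapse into fewer entries per column.
print(len(mods), len(files), len(funcs))
print(mods.lookup(0x1070), files.lookup(0x1070), funcs.lookup(0x1070))
```

Each column gets its own map because the run lengths differ: many callsites
share a module, fewer share a file, fewer still share a function.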

So it stores these callsites:
[    0.721980] dyndbg: 3653 prdebugs in 307 modules, 19 KiB in ddebug tables, 114 kiB ..

into these intervals:
[  104.047210] dyndbg: mt-funcs has 2174 entries
[  104.047816] dyndbg: mt-files has 539 entries
[  104.048410] dyndbg: mt-mods has 312 entries

Once these are loaded, the __dyndbg_sites section,
which separates the 3 columns from the __dyndbg section,
can be recycled.

ALL GOOD SO FAR.
BUT WHAT'S THE RUNTIME COST OF THIS?

perf stat -r200 cat /proc/dynamic_debug/control > /dev/null;

This should be a good test: it calls all 3 accessors for each
pr_debug in the kernel.

But comparing master against this branch shows little change,
and adding --table to see the variation across runs
suggests that the change is smaller than the variation within a test.

MASTER - v6.6

bash-5.2# perf stat -r 200 cat /proc/dynamic_debug/control > /dev/null

 Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):

             10.29 msec task-clock                       #    0.713 CPUs utilized               ( +-  0.56% )
                43      context-switches                 #    4.177 K/sec                       ( +-  0.03% )
                 1      cpu-migrations                   #   97.142 /sec                        ( +-  5.80% )
                73      page-faults                      #    7.091 K/sec                       ( +-  0.10% )
           8906200      cycles                           #    0.865 GHz                         ( +-  0.17% )
            147349      stalled-cycles-frontend          #    1.65% frontend cycles idle        ( +-  0.18% )
             24971      stalled-cycles-backend           #    0.28% backend cycles idle         ( +-  8.18% )
          20589718      instructions                     #    2.31  insn per cycle
                                                  #    0.01  stalled cycles per insn     ( +-  0.02% )
           5470202      branches                         #  531.388 M/sec                       ( +-  0.01% )
                 0      branch-misses

         0.0144421 +- 0.0000647 seconds time elapsed  ( +-  0.45% )


DE_DUPLICATION branch

bash-5.2# perf stat -r200 cat /proc/dynamic_debug/control > /dev/null

 Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):

             21.89 msec task-clock                       #    0.622 CPUs utilized               ( +-  0.69% )
                44      context-switches                 #    2.010 K/sec                       ( +-  0.12% )
                 1      cpu-migrations                   #   45.693 /sec                        ( +-  3.87% )
                73      page-faults                      #    3.336 K/sec                       ( +-  0.10% )
          52017542      cycles                           #    2.377 GHz                         ( +-  0.54% )
            177875      stalled-cycles-frontend          #    0.34% frontend cycles idle        ( +-  0.48% )
            134469      stalled-cycles-backend           #    0.26% backend cycles idle         ( +-  4.24% )
         134707837      instructions                     #    2.59  insn per cycle
                                                  #    0.00  stalled cycles per insn     ( +-  0.30% )
          39386555      branches                         #    1.800 G/sec                       ( +-  0.29% )
                 0      branch-misses

          0.035188 +- 0.000167 seconds time elapsed  ( +-  0.47% )
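
For a quick side-by-side of the two runs above, the ratios can be computed
directly from the posted counters. This is only a sketch: the dictionary
names are made up for illustration, and the values are copied from the two
perf stat outputs above. The reported clock differs between the runs
(0.865 GHz vs 2.377 GHz), so the instructions count is probably a steadier
basis for comparison than task-clock or cycles.

```python
# Values copied from the two perf stat outputs above.
master = {
    "task-clock (msec)": 10.29,
    "cycles": 8906200,
    "instructions": 20589718,
    "branches": 5470202,
    "elapsed (sec)": 0.0144421,
}
dedup = {
    "task-clock (msec)": 21.89,
    "cycles": 52017542,
    "instructions": 134707837,
    "branches": 39386555,
    "elapsed (sec)": 0.035188,
}

# Ratio of the de-duplication branch to master, per counter.
ratios = {name: dedup[name] / master[name] for name in master}

for name, ratio in ratios.items():
    print(f"{name:20s} {master[name]:>14} {dedup[name]:>14}  x{ratio:.2f}")
```

Going by these numbers as posted, instructions differ by roughly 6.5x and
elapsed time by roughly 2.4x, both well outside the +- percentages perf
reports for either run.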



I tried perf stat record, then perf diff on the results;
it showed empty comparisons for a handful of event types:

[jimc@frodo boots-dump]$ perf diff -v \
    v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0* \
    v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675* \
    v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906* > foo
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
[jimc@frodo boots-dump]$ more foo
# Event 'task-clock'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'context-switches'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'cpu-migrations'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'page-faults'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'cycles'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'stalled-cycles-frontend'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'stalled-cycles-backend'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'instructions'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'branches'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'branch-misses'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#



Does anyone here have enough experience with perf to recommend
some tests to tease out the differences?
