All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/4] Dump off-cpu samples directly
@ 2024-04-22  8:05 Howard Chu
  0 siblings, 0 replies; 2+ messages in thread
From: Howard Chu @ 2024-04-22  8:05 UTC (permalink / raw
  To: peterz
  Cc: mingo, acme, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, kan.liang, yangjihong1, zegao2021,
	leo.yan, ravi.bangoria, linux-perf-users, linux-kernel, bpf,
	Howard Chu

As mentioned in: https://bugzilla.kernel.org/show_bug.cgi?id=207323

Currently, off-cpu samples are dumped when perf record is exiting. This
results in off-cpu samples being after the regular samples. Also, samples
are stored in large BPF maps which contain all the stack traces and
accumulated off-cpu time, but they are eventually going to fill up after
running for an extensive period. This patch fixes those problems by dumping
samples directly into perf ring buffer, and dispatching those samples to the
correct format.

Before, off-cpu samples are after regular samples

```
         swapper       0 [000] 963432.136150:    2812933    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [000] 963432.637911:    4932876    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [001] 963432.798072:    6273398    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [000] 963433.541152:    5279005    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
sh 1410180 [000] 18446744069.414584:    2528851 offcpu-time: 
	    7837148e6e87 wait4+0x17 (/usr/lib/libc.so.6)


sh 1410185 [000] 18446744069.414584:    2314223 offcpu-time: 
	    7837148e6e87 wait4+0x17 (/usr/lib/libc.so.6)


awk 1409644 [000] 18446744069.414584:     191785 offcpu-time: 
	    702609d03681 read+0x11 (/usr/lib/libc.so.6)
	          4a02a4 [unknown] ([unknown])
```

After, regular samples(cycles:P) and off-cpu(offcpu-time) samples are
collected simultaneously:

```
upowerd     741 [000] 963757.428701:     297848 offcpu-time: 
	    72b2da11e6bc read+0x4c (/usr/lib/libc.so.6)


      irq/9-acpi      56 [000] 963757.429116:    8760875    cycles:P:  ffffffffb779849f acpi_os_read_port+0x2f ([kernel.kallsyms])
upowerd     741 [000] 963757.429172:     459522 offcpu-time: 
	    72b2da11e6bc read+0x4c (/usr/lib/libc.so.6)


         swapper       0 [002] 963757.434529:    5759904    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
perf 1419260 [000] 963757.434550: 1001012116 offcpu-time: 
	    7274e5d190bf __poll+0x4f (/usr/lib/libc.so.6)
	    591acfc5daf0 perf_evlist__poll+0x24 (/root/hw/perf-tools-next/tools/perf/perf)
	    591acfb1ca50 perf_evlist__poll_thread+0x160 (/root/hw/perf-tools-next/tools/perf/perf)
	    7274e5ca955a [unknown] (/usr/lib/libc.so.6)
```

Here's a simple flowchart:

[parse_event (sample type: PERF_SAMPLE_RAW)] --> [config (bind fds,
sample_id, sample_type)] --> [off_cpu_strip (sample type: PERF_SAMPLE_RAW)] -->
[record_done(hooks off_cpu_finish)] --> [change_type(sample type: OFFCPU_SAMPLE_TYPES)]


---

 tools/perf/builtin-record.c             |  98 ++++++++++++++++++++--
 tools/perf/tests/shell/record_offcpu.sh |  29 -------
 tools/perf/util/bpf_off_cpu.c           | 245 +++++++++++++++++++++++++------------------------------
 tools/perf/util/bpf_skel/off_cpu.bpf.c  | 163 +++++++++++++++++++++++++++++-------
 tools/perf/util/evsel.c                 |   8 --
 tools/perf/util/off_cpu.h               |  14 +++-
 tools/perf/util/perf-hooks-list.h       |   1 +
 7 files changed, 347 insertions(+), 211 deletions(-)

-- 
2.44.0

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [PATCH v1 0/4] Dump off-cpu samples directly
@ 2024-04-22  8:35 Howard Chu
  0 siblings, 0 replies; 2+ messages in thread
From: Howard Chu @ 2024-04-22  8:35 UTC (permalink / raw
  To: peterz
  Cc: mingo, acme, namhyung, mark.rutland, alexander.shishkin, jolsa,
	irogers, adrian.hunter, kan.liang, zegao2021, leo.yan,
	ravi.bangoria, linux-perf-users, linux-kernel, bpf, Howard Chu

As mentioned in: https://bugzilla.kernel.org/show_bug.cgi?id=207323

Currently, off-cpu samples are dumped when perf record is exiting. This
results in off-cpu samples being after the regular samples. Also, samples
are stored in large BPF maps which contain all the stack traces and
accumulated off-cpu time, but they are eventually going to fill up after
running for an extensive period. This patch fixes those problems by dumping
samples directly into perf ring buffer, and dispatching those samples to the
correct format.

Before, off-cpu samples are after regular samples

```
         swapper       0 [000] 963432.136150:    2812933    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [000] 963432.637911:    4932876    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [001] 963432.798072:    6273398    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
         swapper       0 [000] 963433.541152:    5279005    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
sh 1410180 [000] 18446744069.414584:    2528851 offcpu-time: 
	    7837148e6e87 wait4+0x17 (/usr/lib/libc.so.6)


sh 1410185 [000] 18446744069.414584:    2314223 offcpu-time: 
	    7837148e6e87 wait4+0x17 (/usr/lib/libc.so.6)


awk 1409644 [000] 18446744069.414584:     191785 offcpu-time: 
	    702609d03681 read+0x11 (/usr/lib/libc.so.6)
	          4a02a4 [unknown] ([unknown])
```

After, regular samples(cycles:P) and off-cpu(offcpu-time) samples are
collected simultaneously:

```
upowerd     741 [000] 963757.428701:     297848 offcpu-time: 
	    72b2da11e6bc read+0x4c (/usr/lib/libc.so.6)


      irq/9-acpi      56 [000] 963757.429116:    8760875    cycles:P:  ffffffffb779849f acpi_os_read_port+0x2f ([kernel.kallsyms])
upowerd     741 [000] 963757.429172:     459522 offcpu-time: 
	    72b2da11e6bc read+0x4c (/usr/lib/libc.so.6)


         swapper       0 [002] 963757.434529:    5759904    cycles:P:  ffffffffb7db1bc2 intel_idle+0x62 ([kernel.kallsyms])
perf 1419260 [000] 963757.434550: 1001012116 offcpu-time: 
	    7274e5d190bf __poll+0x4f (/usr/lib/libc.so.6)
	    591acfc5daf0 perf_evlist__poll+0x24 (/root/hw/perf-tools-next/tools/perf/perf)
	    591acfb1ca50 perf_evlist__poll_thread+0x160 (/root/hw/perf-tools-next/tools/perf/perf)
	    7274e5ca955a [unknown] (/usr/lib/libc.so.6)
```

Here's a simple flowchart:

[parse_event (sample type: PERF_SAMPLE_RAW)] --> [config (bind fds,
sample_id, sample_type)] --> [off_cpu_strip (sample type: PERF_SAMPLE_RAW)] -->
[record_done(hooks off_cpu_finish)] --> [change_type(sample type: OFFCPU_SAMPLE_TYPES)]


---

 tools/perf/builtin-record.c             |  98 ++++++++++++++++++++--
 tools/perf/tests/shell/record_offcpu.sh |  29 -------
 tools/perf/util/bpf_off_cpu.c           | 245 +++++++++++++++++++++++++------------------------------
 tools/perf/util/bpf_skel/off_cpu.bpf.c  | 163 +++++++++++++++++++++++++++++-------
 tools/perf/util/evsel.c                 |   8 --
 tools/perf/util/off_cpu.h               |  14 +++-
 tools/perf/util/perf-hooks-list.h       |   1 +
 7 files changed, 347 insertions(+), 211 deletions(-)

-- 
2.44.0

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-04-22  8:35 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-22  8:35 [PATCH v1 0/4] Dump off-cpu samples directly Howard Chu
  -- strict thread matches above, loose matches on Subject: below --
2024-04-22  8:05 Howard Chu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.