* [RFC][PATCH v3 00/16] Introduce kmemdump
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
kmemdump is a mechanism which allows the kernel to mark specific memory
areas for dumping or specific backend usage.
Once regions are marked, kmemdump keeps an internal list with the regions
and registers them in the backend.
Further, depending on the backend driver, these regions can be dumped using
firmware or a different hardware block.
Because the regions are marked beforehand, while the system is up and running,
there is no need for, nor dependency on, a panic handler or a working kernel
that can dump the debug information.
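To illustrate the two points above (the internal region list, and regions being recorded before any real backend exists), here is a minimal user-space sketch in C. It is only a model, not the kernel code from this series: all the model_* and counting_* names are hypothetical, the table size is arbitrary, and locking and the predefined IDs are left out.

```c
#include <stddef.h>

#define MODEL_MAX_ZONES 8	/* arbitrary size for this sketch */

/* One slot of the fixed-size region table (hypothetical model type). */
struct model_zone {
	int used;
	void *addr;
	size_t size;
};

/* A backend is a pair of callbacks, loosely following the series. */
struct model_backend {
	const char *name;
	int (*register_region)(int id, void *addr, size_t size);
	int (*unregister_region)(int id);
};

static int noop_register(int id, void *addr, size_t size)
{
	(void)id; (void)addr; (void)size;
	return 0;
}

static int noop_unregister(int id)
{
	(void)id;
	return 0;
}

/* Default backend: accepts every region and does nothing with it. */
static const struct model_backend model_default_backend = {
	"default", noop_register, noop_unregister,
};

static struct model_zone model_zones[MODEL_MAX_ZONES];
static const struct model_backend *model_backend_cur = &model_default_backend;

/*
 * Record a region and hand it to the current backend.
 * Returns the slot index, or -1 if the table is full or the backend refuses.
 */
int model_register(void *addr, size_t size)
{
	for (int id = 0; id < MODEL_MAX_ZONES; id++) {
		if (model_zones[id].used)
			continue;
		if (model_backend_cur->register_region(id, addr, size))
			return -1;
		model_zones[id].used = 1;
		model_zones[id].addr = addr;
		model_zones[id].size = size;
		return id;
	}
	return -1;
}

/* Switch backends: withdraw regions from the old one, replay into the new. */
void model_set_backend(const struct model_backend *be)
{
	for (int id = 0; id < MODEL_MAX_ZONES; id++)
		if (model_zones[id].used)
			model_backend_cur->unregister_region(id);

	model_backend_cur = be;

	for (int id = 0; id < MODEL_MAX_ZONES; id++)
		if (model_zones[id].used)
			be->register_region(id, model_zones[id].addr,
					    model_zones[id].size);
}

/* Example backend that only counts the regions it currently knows about. */
static int counting_seen;

static int counting_register(int id, void *addr, size_t size)
{
	(void)id; (void)addr; (void)size;
	counting_seen++;
	return 0;
}

static int counting_unregister(int id)
{
	(void)id;
	counting_seen--;
	return 0;
}

const struct model_backend counting_backend = {
	"counting", counting_register, counting_unregister,
};

int counting_backend_seen(void)
{
	return counting_seen;
}
```

Registering two buffers with model_register() and only then calling model_set_backend(&counting_backend) makes the new backend see both regions, even though it did not exist when they were registered.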
The kmemdump approach works when pstore, kdump, or another mechanism does not.
Pstore relies on persistent storage, either a dedicated RAM area or flash, which
has the disadvantage of keeping the memory reserved all the time, or on another
specific non-volatile memory. Some devices cannot keep the RAM contents across a
reboot, so ramoops does not work. Some devices do not allow kexec to run
another kernel to debug the crashed one.
For such devices, which have another mechanism to help debugging, like
firmware, kmemdump is a viable solution.
kmemdump can create a core image, similar to /proc/vmcore, with only
the registered regions included. This can be loaded into the crash tool or gdb
and analyzed.
For this to work, specific information from the kernel is registered at
kmemdump init time; the kmemdump user does not need to do anything.
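For readers unfamiliar with how a core image can describe a handful of scattered regions, here is a rough user-space sketch in C. It is not the code from mm/kmemdump/kmemdump_coreimage.c. It assumes a layout of one ELF header, then one PT_LOAD program header per region, then the region data concatenated in the same order, and it leaves out the vmcoreinfo note. struct dump_region and build_core_headers() are hypothetical names.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one registered region. */
struct dump_region {
	uint64_t vaddr;	/* kernel virtual address of the region */
	uint64_t size;	/* size of the region in bytes */
};

/*
 * Fill an ELF64 core header and one PT_LOAD program header per region.
 * Region data is assumed to follow the headers back to back, in order.
 * Returns the file offset at which the first region's data starts.
 */
size_t build_core_headers(Elf64_Ehdr *eh, Elf64_Phdr *ph,
			  const struct dump_region *r, size_t n)
{
	size_t data_start = sizeof(*eh) + n * sizeof(*ph);
	size_t off = data_start;

	memset(eh, 0, sizeof(*eh));
	memcpy(eh->e_ident, ELFMAG, SELFMAG);
	eh->e_ident[EI_CLASS] = ELFCLASS64;
	eh->e_ident[EI_DATA] = ELFDATA2LSB;
	eh->e_ident[EI_VERSION] = EV_CURRENT;
	eh->e_type = ET_CORE;
	eh->e_machine = EM_AARCH64;	/* assumption: arm64 target */
	eh->e_version = EV_CURRENT;
	eh->e_phoff = sizeof(*eh);	/* program headers follow the ELF header */
	eh->e_ehsize = sizeof(*eh);
	eh->e_phentsize = sizeof(*ph);
	eh->e_phnum = (Elf64_Half)n;

	for (size_t i = 0; i < n; i++) {
		memset(&ph[i], 0, sizeof(ph[i]));
		ph[i].p_type = PT_LOAD;
		ph[i].p_flags = PF_R | PF_W;
		ph[i].p_offset = off;		/* where the data sits in the file */
		ph[i].p_vaddr = r[i].vaddr;	/* where it lived in the kernel */
		ph[i].p_filesz = r[i].size;
		ph[i].p_memsz = r[i].size;
		off += r[i].size;
	}

	return data_start;
}
```

With headers like these, a debugger can map a kernel virtual address back to an offset in the assembled file, which is why only the registered regions need to be present.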
This version of the kmemdump patch series includes two backend drivers:
one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
backend for Android devices, reworked from this source here:
https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
written originally by Jone Chou <jonechou@google.com>
*** History, motivation and available online resources ***
Initial version of kmemdump and discussion is available here:
https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
Kmemdump has been presented and discussed at Linaro Connect 2025,
including motivation, scope, usability and feasibility.
A recording of the talk is available here for anyone interested:
https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
Linaro blog on kmemdump can be found here:
https://www.linaro.org/blog/introduction-to-kmemdump/
The implementation is based on the initial Pstore/directly mapped zones
published as an RFC here:
https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
The back-end implementation for qcom_minidump is based on the minidump
patch series and driver written by Mukesh Ojha, thanks:
https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
The RFC v2 version with .section creation and macro annotation kmemdump
is available here:
https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
*** How to use kmemdump with minidump backend on Qualcomm platform guide ***
Prerequisites:
Crash tool compiled with target=ARM64, plus minor changes that are required for
the usual crash mode (minimal mode works without the patch).
A patch can be applied from here: https://p.calebs.dev/49a048
This patch will eventually be sent to the crash tool in a reworked form.
Target kernel must be built with :
CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
information needed for crash tool.
Also, the kernel requires these as well:
CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
Kernel arguments:
Kernel firmware must be set to mode 'mini' by kernel module parameter
like this : qcom_scm.download_mode=mini
After the kernel boots and the qcom_minidump module is loaded, everything is
ready for a possible crash.
Once the crash happens, the firmware will kick in and you will see on
the console messages such as Sahara init, showing that the firmware is
waiting in download mode. (This is subject to the firmware supporting this
mode; I am using the sa8775p-ride board.)
Example of log on the console:
"
[...]
B - 1096414 - usb: init start
B - 1100287 - usb: qusb_dci_platform , 0x19
B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
B - 1107455 - usb: usb2phy: PRIM success , 0x4
B - 1112670 - usb: dci, chgr_type_det_err
B - 1117154 - usb: ID:0x260, value: 0x4
B - 1121942 - usb: ID:0x108, value: 0x1d90
B - 1124992 - usb: timer_start , 0x4c4b40
B - 1129140 - usb: vbus_det_pm_unavail
B - 1133136 - usb: ID:0x252, value: 0x4
B - 1148874 - usb: SUPER , 0x900e
B - 1275510 - usb: SUPER , 0x900e
B - 1388970 - usb: ID:0x20d, value: 0x0
B - 1411113 - usb: ENUM success
B - 1411113 - Sahara Init
B - 1414285 - Sahara Open
"
Once the board is in download mode, you can use the qdl tool (I
personally use edl and have not tried qdl yet) to get all the regions as
separate files.
The tool on the host computer will list the regions in the order they
were downloaded.
Once you have all the files, simply use `cat` to put them all together,
in the order of the indexes.
For my kernel config and setup, here is my cat command (you can use a script
or something, I haven't done that so far):
`cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
memory/md_Kunknown46.BIN memory/md_Kunknown49.BIN memory/md_Kunknown50.BIN \
memory/md_Kunknown51.BIN > ~/minidump_image`
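The ordering step could be scripted. Below is a small C sketch of the only tricky part: ordering the files by the numeric index at the end of each name, since a plain alphabetical sort would put index 10 before index 2. It assumes the index is the run of decimal digits right before the file extension, as in the names above. dump_index() and cmp_dump_paths() are hypothetical helper names, not part of any existing tool.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the decimal index that ends the file name, just before the
 * extension (e.g. 10 for "memory/md_Kjiffies10.BIN"), or -1 if the
 * name carries no such index.
 */
long dump_index(const char *path)
{
	const char *base = strrchr(path, '/');
	const char *dot, *p;

	base = base ? base + 1 : path;
	dot = strrchr(base, '.');
	if (!dot)
		dot = base + strlen(base);

	/* Walk backwards over the digits that precede the extension. */
	p = dot;
	while (p > base && isdigit((unsigned char)p[-1]))
		p--;
	if (p == dot)
		return -1;

	return strtol(p, NULL, 10);
}

/* qsort() comparator for an array of "const char *" paths. */
int cmp_dump_paths(const void *a, const void *b)
{
	long ia = dump_index(*(const char *const *)a);
	long ib = dump_index(*(const char *const *)b);

	return (ia > ib) - (ia < ib);
}
```

Sorting the list of downloaded files with qsort() and cmp_dump_paths(), then concatenating them in that order, gives the same result as the manual cat above. Gaps in the numbering (the list above has no 47 or 48) do not matter, since only the relative order is used.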
Once you have the resulting file, use the `crash` tool to load it, like this:
`./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
There is also a --minimal mode for ./crash that would work without any patch applied
to crash tool, but you can't inspect symbols, etc.
Once you load crash you will see something like this :
KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
DUMPFILE: /home/eugen/new
CPUS: 8 [OFFLINE: 5]
DATE: Thu Jan 1 02:00:00 EET 1970
UPTIME: 00:00:22
TASKS: 0
NODENAME: qemuarm64
RELEASE: 6.17.0-rc5-next-20250910-00020-g7dfa02aeae7e
VERSION: #116 SMP PREEMPT Thu Sep 11 18:28:06 EEST 2025
MACHINE: aarch64 (unknown Mhz)
MEMORY: 34.2 GB
PANIC: ""
crash> log
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
[ 0.000000] Linux version 6.17.0-rc5-next-20250910-00020-g7dfa02aeae7e (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #116 SMP PREEMPT Thu Sep 11 18:28:06 EEST 2025
*** Debug Kinfo backend driver ***
I don't have any device to actually test this. So I have not.
I hacked the driver to just use a kmalloc'ed area to save things instead
of the shared memory, and dumped everything there and checked whether it is identical
with what the downstream driver would have saved.
So this synthetic test passed and memories are identical.
Anyone who actually wants to test this, feel free to reply to the patch.
I have also written a simple DT binding for the driver.
Thanks for everyone reviewing and bringing ideas into the discussion.
Eugen
Changelog since the v2 of the RFC:
- V2 available here : https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
- Removed the .section as requested by David Hildenbrand.
- Moved all kmemdump registration(when possible) to vmcoreinfo.
- Because of this, some of the variables that I was registering had to be non-static
so I had to modify this as per David Hildenbrand suggestion.
- Fixed minor things in the Kinfo driver: one field was broken, fixed some
compiler warnings, fixed the copyright and removed some useless includes.
- Moved the whole kmemdump from drivers/debug into mm/ and Kconfigs into mm/Kconfig.debug
and it's now available in kernel hacking, as per Randy Dunlap review
- Reworked some of the Documentation as per review from Jon Corbet
Changelog since the v1 of the RFC:
- V1 available here: https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
- Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
This means new API, macros, new way to store the regions inside kmemdump
(ditched the IDR, moved to static allocation, have a static default backend, etc)
- Reworked qcom_minidump driver based on review from Bjorn Andersson
- Reworked printk log buffer registration based on review from Petr Mladek
I apologize if I missed any review comments. I know there is still lots of work
to do on this series and I hope to improve it more and more.
Patches are sent on top of next-20250910
Eugen Hristev (16):
kmemdump: Introduce kmemdump
Documentation: Add kmemdump
kmemdump: Add coreimage ELF layer
Documentation: kmemdump: Add section for coreimage ELF
kernel/vmcore_info: Register dynamic information into Kmemdump
kmemdump: Introduce qcom-minidump backend driver
soc: qcom: smem: Add minidump device
init/version: Add banner_len to save banner length
genirq/irqdesc: Have nr_irqs as non-static
panic: Have tainted_mask as non-static
mm/swapfile: Have nr_swapfiles as non-static
printk: Register information into Kmemdump
sched: Add sched_get_runqueues_area
kernel/vmcoreinfo: Register kmemdump core image information
kmemdump: Add Kinfo backend driver
dt-bindings: Add Google Kinfo
Documentation/dev-tools/index.rst | 1 +
Documentation/dev-tools/kmemdump.rst | 139 +++++++
.../bindings/misc/google,kinfo.yaml | 36 ++
MAINTAINERS | 19 +
drivers/soc/qcom/smem.c | 10 +
include/linux/kmemdump.h | 130 +++++++
include/linux/printk.h | 1 +
init/version-timestamp.c | 1 +
init/version.c | 1 +
kernel/irq/irqdesc.c | 2 +-
kernel/panic.c | 2 +-
kernel/printk/printk.c | 47 +++
kernel/sched/core.c | 15 +
kernel/sched/sched.h | 2 +
kernel/vmcore_info.c | 149 ++++++++
mm/Kconfig.debug | 2 +
mm/Makefile | 1 +
mm/kmemdump/Kconfig.debug | 53 +++
mm/kmemdump/Makefile | 6 +
mm/kmemdump/kinfo.c | 293 +++++++++++++++
mm/kmemdump/kmemdump.c | 234 ++++++++++++
mm/kmemdump/kmemdump_coreimage.c | 222 +++++++++++
mm/kmemdump/qcom_minidump.c | 353 ++++++++++++++++++
mm/swapfile.c | 2 +-
24 files changed, 1718 insertions(+), 3 deletions(-)
create mode 100644 Documentation/dev-tools/kmemdump.rst
create mode 100644 Documentation/devicetree/bindings/misc/google,kinfo.yaml
create mode 100644 include/linux/kmemdump.h
create mode 100644 mm/kmemdump/Kconfig.debug
create mode 100644 mm/kmemdump/Makefile
create mode 100644 mm/kmemdump/kinfo.c
create mode 100644 mm/kmemdump/kmemdump.c
create mode 100644 mm/kmemdump/kmemdump_coreimage.c
create mode 100644 mm/kmemdump/qcom_minidump.c
--
2.43.0
* [RFC][PATCH v3 01/16] kmemdump: Introduce kmemdump
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Kmemdump mechanism allows any driver to mark a specific memory area
for later dumping/debugging purpose, depending on the functionality
of the attached backend.
The backend would interface any hardware mechanism that will allow
dumping to complete regardless of the state of the kernel
(running, frozen, crashed, or any particular state).
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 6 ++
include/linux/kmemdump.h | 63 ++++++++++++
mm/Kconfig.debug | 2 +
mm/Makefile | 1 +
mm/kmemdump/Kconfig.debug | 14 +++
mm/kmemdump/Makefile | 3 +
mm/kmemdump/kmemdump.c | 202 ++++++++++++++++++++++++++++++++++++++
7 files changed, 291 insertions(+)
create mode 100644 include/linux/kmemdump.h
create mode 100644 mm/kmemdump/Kconfig.debug
create mode 100644 mm/kmemdump/Makefile
create mode 100644 mm/kmemdump/kmemdump.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 8cf4990a8ff6..1713cccefc91 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13810,6 +13810,12 @@ L: linux-iio@vger.kernel.org
S: Supported
F: drivers/iio/accel/kionix-kx022a*
+KMEMDUMP
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: include/linux/kmemdump.h
+F: mm/kmemdump/kmemdump.c
+
KMEMLEAK
M: Catalin Marinas <catalin.marinas@arm.com>
S: Maintained
diff --git a/include/linux/kmemdump.h b/include/linux/kmemdump.h
new file mode 100644
index 000000000000..8e764bb2d8ac
--- /dev/null
+++ b/include/linux/kmemdump.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _KMEMDUMP_H
+#define _KMEMDUMP_H
+
+enum kmemdump_uid {
+ KMEMDUMP_ID_START = 0,
+ KMEMDUMP_ID_USER_START,
+ KMEMDUMP_ID_USER_END,
+ KMEMDUMP_ID_NO_ID,
+};
+
+#ifdef CONFIG_KMEMDUMP
+/**
+ * struct kmemdump_zone - region mark zone information
+ * @id: unique id for this zone
+ * @zone: pointer to the memory area for this zone
+ * @size: size of the memory area of this zone
+ */
+struct kmemdump_zone {
+ enum kmemdump_uid id;
+ void *zone;
+ size_t size;
+};
+
+#define KMEMDUMP_BACKEND_MAX_NAME 128
+/**
+ * struct kmemdump_backend - region mark backend information
+ * @name: the name of the backend
+ * @register_region: callback to register region in the backend
+ * @unregister_region: callback to unregister region in the backend
+ */
+struct kmemdump_backend {
+ char name[KMEMDUMP_BACKEND_MAX_NAME];
+ int (*register_region)(const struct kmemdump_backend *be,
+ enum kmemdump_uid uid, void *vaddr, size_t size);
+ int (*unregister_region)(const struct kmemdump_backend *be,
+ enum kmemdump_uid uid);
+};
+
+int kmemdump_register_backend(const struct kmemdump_backend *backend);
+void kmemdump_unregister_backend(const struct kmemdump_backend *backend);
+
+int kmemdump_register_id(enum kmemdump_uid id, void *zone, size_t size);
+
+#define kmemdump_register(...) \
+ kmemdump_register_id(KMEMDUMP_ID_NO_ID, __VA_ARGS__)
+
+void kmemdump_unregister(enum kmemdump_uid id);
+#else
+static inline int kmemdump_register_id(enum kmemdump_uid uid, void *area,
+ size_t size)
+{
+ return 0;
+}
+
+#define kmemdump_register(...)
+
+static inline void kmemdump_unregister(enum kmemdump_uid id)
+{
+}
+#endif
+
+#endif
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 32b65073d0cc..b6aad5cb09c1 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -309,3 +309,5 @@ config PER_VMA_LOCK_STATS
overhead in the page fault path.
If in doubt, say N.
+
+source "mm/kmemdump/Kconfig.debug"
diff --git a/mm/Makefile b/mm/Makefile
index 21abb3353550..ca1691dd8924 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -90,6 +90,7 @@ obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o
obj-$(CONFIG_KSM) += ksm.o
obj-$(CONFIG_PAGE_POISONING) += page_poison.o
obj-$(CONFIG_KASAN) += kasan/
+obj-$(CONFIG_KMEMDUMP) += kmemdump/
obj-$(CONFIG_KFENCE) += kfence/
obj-$(CONFIG_KMSAN) += kmsan/
obj-$(CONFIG_FAILSLAB) += failslab.o
diff --git a/mm/kmemdump/Kconfig.debug b/mm/kmemdump/Kconfig.debug
new file mode 100644
index 000000000000..5654180141c0
--- /dev/null
+++ b/mm/kmemdump/Kconfig.debug
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config KMEMDUMP
+ bool "KMEMDUMP: Allow the kernel to register memory regions for dumping purpose"
+ help
+ Kmemdump mechanism allows any driver to mark a specific memory area
+ for later dumping/debugging purpose, depending on the functionality
+ of the attached backend.
+ The backend would interface any hardware mechanism that will allow
+ dumping to complete regardless of the state of the kernel
+ (running, frozen, crashed, or any particular state).
+
+ Note that modules using this feature must be rebuilt if option
+ changes.
diff --git a/mm/kmemdump/Makefile b/mm/kmemdump/Makefile
new file mode 100644
index 000000000000..f5b917a6ef5e
--- /dev/null
+++ b/mm/kmemdump/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-y += kmemdump.o
diff --git a/mm/kmemdump/kmemdump.c b/mm/kmemdump/kmemdump.c
new file mode 100644
index 000000000000..c016457620a4
--- /dev/null
+++ b/mm/kmemdump/kmemdump.c
@@ -0,0 +1,202 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/kmemdump.h>
+
+#define MAX_ZONES 201
+
+static int default_register_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *area, size_t sz)
+{
+ return 0;
+}
+
+static int default_unregister_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id)
+{
+ return 0;
+}
+
+static const struct kmemdump_backend kmemdump_default_backend = {
+ .name = "default",
+ .register_region = default_register_region,
+ .unregister_region = default_unregister_region,
+};
+
+static const struct kmemdump_backend *backend = &kmemdump_default_backend;
+static DEFINE_MUTEX(kmemdump_lock);
+static struct kmemdump_zone kmemdump_zones[MAX_ZONES];
+
+/**
+ * kmemdump_register_id() - Register region into kmemdump with given ID.
+ * @req_id: Requested unique kmemdump_uid that identifies the region
+ * This can be KMEMDUMP_ID_NO_ID, in which case the function will
+ * find an unused ID and return it.
+ * @zone: pointer to the zone of memory
+ * @size: region size
+ *
+ * Return: On success, it returns the unique id for the region.
+ * On failure, it returns negative error value.
+ */
+int kmemdump_register_id(enum kmemdump_uid req_id, void *zone, size_t size)
+{
+ struct kmemdump_zone *z;
+ enum kmemdump_uid uid = req_id;
+ int ret;
+
+ if (uid < KMEMDUMP_ID_START)
+ return -EINVAL;
+
+ if (uid >= MAX_ZONES)
+ return -ENOSPC;
+
+ mutex_lock(&kmemdump_lock);
+
+ if (uid == KMEMDUMP_ID_NO_ID)
+ while (uid < MAX_ZONES) {
+ if (!kmemdump_zones[uid].id)
+ break;
+ uid++;
+ }
+
+ if (uid == MAX_ZONES) {
+ mutex_unlock(&kmemdump_lock);
+ return -ENOSPC;
+ }
+
+ z = &kmemdump_zones[uid];
+
+ if (z->id) {
+ mutex_unlock(&kmemdump_lock);
+ return -EALREADY;
+ }
+
+ ret = backend->register_region(backend, uid, zone, size);
+ if (ret) {
+ mutex_unlock(&kmemdump_lock);
+ return ret;
+ }
+
+ z->zone = zone;
+ z->size = size;
+ z->id = uid;
+
+ mutex_unlock(&kmemdump_lock);
+
+ return uid;
+}
+EXPORT_SYMBOL_GPL(kmemdump_register_id);
+
+/**
+ * kmemdump_unregister() - Unregister region from kmemdump.
+ * @id: unique id that was returned when this region was successfully
+ * registered initially.
+ *
+ * Return: None
+ */
+void kmemdump_unregister(enum kmemdump_uid id)
+{
+ struct kmemdump_zone *z = NULL;
+
+ mutex_lock(&kmemdump_lock);
+
+ z = &kmemdump_zones[id];
+ if (!z->id) {
+ mutex_unlock(&kmemdump_lock);
+ return;
+ }
+
+ backend->unregister_region(backend, z->id);
+
+ memset(z, 0, sizeof(*z));
+
+ mutex_unlock(&kmemdump_lock);
+}
+EXPORT_SYMBOL_GPL(kmemdump_unregister);
+
+/**
+ * kmemdump_register_backend() - Register a backend into kmemdump.
+ * @be: Pointer to a driver allocated backend. This backend must have
+ * two callbacks for registering and deregistering a zone from the
+ * backend.
+ *
+ * Only one backend is supported at a time.
+ *
+ * Return: On success, it returns 0, negative error value otherwise.
+ */
+int kmemdump_register_backend(const struct kmemdump_backend *be)
+{
+ enum kmemdump_uid uid;
+ int ret;
+
+ if (!be || !be->register_region || !be->unregister_region)
+ return -EINVAL;
+
+ mutex_lock(&kmemdump_lock);
+
+ /* Try to call the old backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ backend->unregister_region(backend,
+ kmemdump_zones[uid].id);
+
+ backend = be;
+ pr_debug("kmemdump backend %s registered successfully.\n",
+ backend->name);
+
+ /* Call the new backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++) {
+ if (!kmemdump_zones[uid].id)
+ continue;
+ ret = backend->register_region(backend,
+ kmemdump_zones[uid].id,
+ kmemdump_zones[uid].zone,
+ kmemdump_zones[uid].size);
+ if (ret)
+ pr_debug("register region failed with %d\n", ret);
+ }
+
+ mutex_unlock(&kmemdump_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kmemdump_register_backend);
+
+/**
+ * kmemdump_unregister_backend() - Unregister the backend from kmemdump.
+ * @be: Pointer to a driver allocated backend. This backend must match
+ * the initially registered backend.
+ *
+ * Only one backend is supported at a time.
+ * Before deregistering, this will call the backend to unregister all the
+ * previously registered zones.
+ *
+ * Return: None
+ */
+void kmemdump_unregister_backend(const struct kmemdump_backend *be)
+{
+ enum kmemdump_uid uid;
+
+ mutex_lock(&kmemdump_lock);
+
+ if (backend != be) {
+ mutex_unlock(&kmemdump_lock);
+ return;
+ }
+
+ /* Try to call the old backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ backend->unregister_region(backend,
+ kmemdump_zones[uid].id);
+
+ pr_debug("kmemdump backend %s removed successfully.\n", be->name);
+
+ backend = &kmemdump_default_backend;
+
+ mutex_unlock(&kmemdump_lock);
+}
+EXPORT_SYMBOL_GPL(kmemdump_unregister_backend);
+
--
2.43.0
* [RFC][PATCH v3 02/16] Documentation: Add kmemdump
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Document the new kmemdump kernel feature.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
Documentation/dev-tools/index.rst | 1 +
Documentation/dev-tools/kmemdump.rst | 131 +++++++++++++++++++++++++++
MAINTAINERS | 1 +
3 files changed, 133 insertions(+)
create mode 100644 Documentation/dev-tools/kmemdump.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 65c54b27a60b..1b6674efeda0 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -28,6 +28,7 @@ Documentation/process/debugging/index.rst
kmsan
ubsan
kmemleak
+ kmemdump
kcsan
kfence
kselftest
diff --git a/Documentation/dev-tools/kmemdump.rst b/Documentation/dev-tools/kmemdump.rst
new file mode 100644
index 000000000000..504321de951a
--- /dev/null
+++ b/Documentation/dev-tools/kmemdump.rst
@@ -0,0 +1,131 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========
+kmemdump
+========
+
+This document provides information about the kmemdump feature.
+
+Overview
+========
+
+kmemdump is a mechanism that allows any driver or producer to register a
+chunk of memory into it, to be used at a later time for a specific
+purpose like debugging or memory dumping.
+
+kmemdump allows a backend to be connected, this backend interfaces a
+specific hardware that can debug or dump the memory previously registered
+into kmemdump.
+
+The reasoning for kmemdump is to minimize the required debug information
+in case of a kernel problem. A traditional debug method involves dumping
+the whole kernel memory and then inspecting it. Kmemdump allows the
+users to select which memory is of interest, in order to help this
+specific use case in production, where memory and connectivity
+are limited.
+
+Although the kernel has multiple debugging mechanisms, kmemdump fits
+a particular model which is not covered by the others.
+
+kmemdump Internals
+==================
+
+API
+---
+
+A memory region is registered with a call to kmemdump_register_id() which
+takes as parameters the ID of the region, a pointer to the virtual memory
+start address and the size. If successful, this call returns a unique ID for
+the allocated zone (either the requested ID or an allocated ID).
+IDs are predefined in the kmemdump header. A second registration with the
+same ID is not allowed, the caller needs to deregister first.
+A dedicated NO_ID is defined, which has kmemdump allocate a new unique ID
+for the request and return it. This case is useful with multiple dynamic
+loop allocations where ID is not significant.
+
+The region is unregistered with a call to kmemdump_unregister() which
+takes the id as a parameter.
+
+For dynamically allocated memory, kmemdump defines a variety of wrappers
+on top of allocation functions which are given as parameters.
+This makes the dynamic allocation easy to use without additional calls
+to registration functions. However kmemdump still exposes the register API
+for cases where it may be needed (e.g. size is not exactly known at allocation
+time).
+
+For static variables, a variety of annotation macros are provided. These
+macros will create an annotation struct inside a separate section.
+
+
+Backend
+-------
+
+Backend is represented by a struct kmemdump_backend which has to be filled
+in by the backend driver. Further, this struct is passed to kmemdump with a
+kmemdump_register_backend() call. kmemdump_unregister_backend() will remove
+the backend from kmemdump.
+
+Once a backend is being registered, all previously registered regions are
+being sent to the backend for registration.
+
+When the backend is being removed, all regions are being first deregistered
+from the backend.
+
+kmemdump will request the backend to register a region with register_region()
+call, and deregister a region with unregister_region() call. These two
+functions are mandatory to be provided by a backend at registration time.
+
+Data structures
+---------------
+
+struct kmemdump_backend represents the kmemdump backend and should be
+initialized by the backend driver.
+
+The regions are being stored in a simple fixed size array. It avoids
+memory allocation overhead. This is not performance critical nor does
+allocating a few hundred entries create a memory consumption problem.
+
+The static variables registered into kmemdump are being annotated into
+a dedicated .kmemdump memory section. This is then walked by kmemdump
+at a later time and each variable is registered.
+
+kmemdump Initialization
+-----------------------
+
+After system boots, kmemdump will be ready to accept region registration
+from producer drivers. Even if the backend may not be registered yet,
+there is a default no-op backend that is registered. At any time the backend
+can be changed with a real backend in which case all regions are being
+registered to the new backend.
+
+backend functionality
+---------------------
+
+kmemdump backend can keep its own list of regions and use the specific
+hardware available to dump the memory regions or use them for debugging.
+
+kmemdump example
+================
+
+A production scenario for kmemdump is the following:
+The kernel registers the linux_banner variable into kmemdump with
+a simple call like:
+
+ kmemdump_register(linux_banner, sizeof(linux_banner));
+
+The backend will receive a call to its register_region() callback after it
+probes and registers with kmemdump.
+The backend will then note into a specific table the address of the banner
+and the size of it.
+The specific table is then written to a shared memory area that can be
+read by upper level firmware.
+When the kernel freezes (hypothetically), the kernel will no longer feed
+the watchdog. The watchdog will trigger a higher exception level interrupt
+which will be handled by the upper level firmware. This firmware will then
+read the shared memory table and find an entry with the start and size of
+the banner. It will then copy it for debugging purpose. The upper level
+firmware will then be able to provide useful debugging information,
+like in this example, the banner.
+
+As seen here, kmemdump facilitates the interaction between the kernel
+and a specific backend.
diff --git a/MAINTAINERS b/MAINTAINERS
index 1713cccefc91..974f43c3902b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13813,6 +13813,7 @@ F: drivers/iio/accel/kionix-kx022a*
KMEMDUMP
M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
+F: Documentation/dev-tools/kmemdump.rst
F: include/linux/kmemdump.h
F: mm/kmemdump/kmemdump.c
--
2.43.0
* [RFC][PATCH v3 03/16] kmemdump: Add coreimage ELF layer
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Implement kmemdumping into an ELF coreimage.
With this feature enabled, kmemdump will assemble all the regions
into a coreimage, by having an initial first region with an ELF header,
a second region with vmcoreinfo data, and then register vital kernel
information in the subsequent regions.
This image can then be dumped, assembled into a single file and loaded
into debugging tools like crash/gdb.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 1 +
include/linux/kmemdump.h | 67 ++++++++++
mm/kmemdump/Kconfig.debug | 18 ++-
mm/kmemdump/Makefile | 1 +
mm/kmemdump/kmemdump.c | 32 +++++
mm/kmemdump/kmemdump_coreimage.c | 222 +++++++++++++++++++++++++++++++
6 files changed, 339 insertions(+), 2 deletions(-)
create mode 100644 mm/kmemdump/kmemdump_coreimage.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 974f43c3902b..fc8cd34cf190 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13816,6 +13816,7 @@ S: Maintained
F: Documentation/dev-tools/kmemdump.rst
F: include/linux/kmemdump.h
F: mm/kmemdump/kmemdump.c
+F: mm/kmemdump/kmemdump_coreimage.c
KMEMLEAK
M: Catalin Marinas <catalin.marinas@arm.com>
diff --git a/include/linux/kmemdump.h b/include/linux/kmemdump.h
index 8e764bb2d8ac..ac2eb1b4ba06 100644
--- a/include/linux/kmemdump.h
+++ b/include/linux/kmemdump.h
@@ -4,6 +4,52 @@
enum kmemdump_uid {
KMEMDUMP_ID_START = 0,
+ KMEMDUMP_ID_COREIMAGE_ELF,
+ KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
+ KMEMDUMP_ID_COREIMAGE_CONFIG,
+ KMEMDUMP_ID_COREIMAGE_MEMSECT,
+ KMEMDUMP_ID_COREIMAGE__totalram_pages,
+ KMEMDUMP_ID_COREIMAGE___cpu_possible_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_present_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_online_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_active_mask,
+ KMEMDUMP_ID_COREIMAGE_jiffies_64,
+ KMEMDUMP_ID_COREIMAGE_linux_banner,
+ KMEMDUMP_ID_COREIMAGE_nr_threads,
+ KMEMDUMP_ID_COREIMAGE_nr_irqs,
+ KMEMDUMP_ID_COREIMAGE_tainted_mask,
+ KMEMDUMP_ID_COREIMAGE_taint_flags,
+ KMEMDUMP_ID_COREIMAGE_mem_section,
+ KMEMDUMP_ID_COREIMAGE_node_data,
+ KMEMDUMP_ID_COREIMAGE_node_states,
+ KMEMDUMP_ID_COREIMAGE___per_cpu_offset,
+ KMEMDUMP_ID_COREIMAGE_nr_swapfiles,
+ KMEMDUMP_ID_COREIMAGE_init_uts_ns,
+ KMEMDUMP_ID_COREIMAGE_printk_rb_static,
+ KMEMDUMP_ID_COREIMAGE_printk_rb_dynamic,
+ KMEMDUMP_ID_COREIMAGE_prb,
+ KMEMDUMP_ID_COREIMAGE_prb_descs,
+ KMEMDUMP_ID_COREIMAGE_prb_infos,
+ KMEMDUMP_ID_COREIMAGE_prb_data,
+ KMEMDUMP_ID_COREIMAGE_runqueues,
+ KMEMDUMP_ID_COREIMAGE_high_memory,
+ KMEMDUMP_ID_COREIMAGE_init_mm,
+ KMEMDUMP_ID_COREIMAGE_init_mm_pgd,
+ KMEMDUMP_ID_COREIMAGE__sinittext,
+ KMEMDUMP_ID_COREIMAGE__einittext,
+ KMEMDUMP_ID_COREIMAGE__end,
+ KMEMDUMP_ID_COREIMAGE__text,
+ KMEMDUMP_ID_COREIMAGE__stext,
+ KMEMDUMP_ID_COREIMAGE__etext,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_num_syms,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_relative_base,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_offsets,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_names,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_token_table,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_token_index,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_markers,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_seqs_of_names,
+ KMEMDUMP_ID_COREIMAGE_swapper_pg_dir,
KMEMDUMP_ID_USER_START,
KMEMDUMP_ID_USER_END,
KMEMDUMP_ID_NO_ID,
@@ -60,4 +106,25 @@ static inline void kmemdump_unregister(enum kmemdump_uid id)
}
#endif
+#ifdef CONFIG_KMEMDUMP
+#ifdef CONFIG_KMEMDUMP_COREIMAGE
+int init_elfheader(void);
+void update_elfheader(const struct kmemdump_zone *z);
+int clear_elfheader(const struct kmemdump_zone *z);
+#else
+static inline int init_elfheader(void)
+{
+ return 0;
+}
+
+static inline void update_elfheader(const struct kmemdump_zone *z)
+{
+}
+
+static inline int clear_elfheader(const struct kmemdump_zone *z)
+{
+ return 0;
+}
+#endif
+#endif
#endif
diff --git a/mm/kmemdump/Kconfig.debug b/mm/kmemdump/Kconfig.debug
index 5654180141c0..f62bde50a81b 100644
--- a/mm/kmemdump/Kconfig.debug
+++ b/mm/kmemdump/Kconfig.debug
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
-config KMEMDUMP
- bool "KMEMDUMP: Allow the kernel to register memory regions for dumping purpose"
+menuconfig KMEMDUMP
+ bool "KMEMDUMP: Register memory regions for dumping purpose"
help
Kmemdump mechanism allows any driver to mark a specific memory area
for later dumping/debugging purpose, depending on the functionality
@@ -12,3 +12,17 @@ config KMEMDUMP
Note that modules using this feature must be rebuilt if option
changes.
+
+config KMEMDUMP_COREIMAGE
+ depends on KMEMDUMP
+ select VMCORE_INFO
+ bool "Assemble memory regions into a coredump readable with debuggers"
+ help
+ Enabling this will assemble all the memory regions into a
+ core ELF file. The first region will include program headers for
+ all the regions. The second region is the vmcoreinfo and specific
+ coredump structures.
+ All the other regions follow. Specific kernel variables required
+ for debug tools are being registered.
+ The coredump file can then be loaded into GDB or crash tool and
+ further inspected.
diff --git a/mm/kmemdump/Makefile b/mm/kmemdump/Makefile
index f5b917a6ef5e..eed67f15a8d0 100644
--- a/mm/kmemdump/Makefile
+++ b/mm/kmemdump/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-y += kmemdump.o
+obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
diff --git a/mm/kmemdump/kmemdump.c b/mm/kmemdump/kmemdump.c
index c016457620a4..3827b0597cac 100644
--- a/mm/kmemdump/kmemdump.c
+++ b/mm/kmemdump/kmemdump.c
@@ -28,6 +28,32 @@ static const struct kmemdump_backend kmemdump_default_backend = {
static const struct kmemdump_backend *backend = &kmemdump_default_backend;
static DEFINE_MUTEX(kmemdump_lock);
static struct kmemdump_zone kmemdump_zones[MAX_ZONES];
+static bool kmemdump_initialized;
+
+static int __init init_kmemdump(void)
+{
+ enum kmemdump_uid uid;
+
+ init_elfheader();
+
+ mutex_lock(&kmemdump_lock);
+ /*
+ * Some regions may have been registered very early.
+ * Update the elf header for all existing regions,
+ * except for KMEMDUMP_ID_COREIMAGE_ELF and
+ * KMEMDUMP_ID_COREIMAGE_VMCOREINFO, those are included in the
+ * ELF header upon its creation.
+ */
+ for (uid = KMEMDUMP_ID_COREIMAGE_CONFIG; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ update_elfheader(&kmemdump_zones[uid]);
+
+ kmemdump_initialized = true;
+ mutex_unlock(&kmemdump_lock);
+
+ return 0;
+}
+late_initcall(init_kmemdump);
/**
* kmemdump_register_id() - Register region into kmemdump with given ID.
@@ -83,6 +109,9 @@ int kmemdump_register_id(enum kmemdump_uid req_id, void *zone, size_t size)
z->size = size;
z->id = uid;
+ if (kmemdump_initialized)
+ update_elfheader(z);
+
mutex_unlock(&kmemdump_lock);
return uid;
@@ -110,6 +139,9 @@ void kmemdump_unregister(enum kmemdump_uid id)
backend->unregister_region(backend, z->id);
+ if (kmemdump_initialized)
+ clear_elfheader(z);
+
memset(z, 0, sizeof(*z));
mutex_unlock(&kmemdump_lock);
diff --git a/mm/kmemdump/kmemdump_coreimage.c b/mm/kmemdump/kmemdump_coreimage.c
new file mode 100644
index 000000000000..a7b51a171d8e
--- /dev/null
+++ b/mm/kmemdump/kmemdump_coreimage.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/io.h>
+#include <linux/elfcore.h>
+#include <linux/kmemdump.h>
+#include <linux/vmcore_info.h>
+
+#define CORE_STR "CORE"
+
+#define MAX_NUM_ENTRIES 201
+
+static struct elfhdr *ehdr;
+static size_t elf_offset;
+
+static void append_kcore_note(char *notes, size_t *i, const char *name,
+ unsigned int type, const void *desc,
+ size_t descsz)
+{
+ struct elf_note *note = (struct elf_note *)&notes[*i];
+
+ note->n_namesz = strlen(name) + 1;
+ note->n_descsz = descsz;
+ note->n_type = type;
+ *i += sizeof(*note);
+ memcpy(&notes[*i], name, note->n_namesz);
+ *i = ALIGN(*i + note->n_namesz, 4);
+ memcpy(&notes[*i], desc, descsz);
+ *i = ALIGN(*i + descsz, 4);
+}
+
+static void append_kcore_note_nodesc(char *notes, size_t *i, const char *name,
+ unsigned int type, size_t descsz)
+{
+ struct elf_note *note = (struct elf_note *)&notes[*i];
+
+ note->n_namesz = strlen(name) + 1;
+ note->n_descsz = descsz;
+ note->n_type = type;
+ *i += sizeof(*note);
+ memcpy(&notes[*i], name, note->n_namesz);
+ *i = ALIGN(*i + note->n_namesz, 4);
+}
+
+static struct elf_phdr *elf_phdr_entry_addr(struct elfhdr *ehdr, int idx)
+{
+ struct elf_phdr *ephdr = (struct elf_phdr *)((size_t)ehdr + ehdr->e_phoff);
+
+ return &ephdr[idx];
+}
+
+/**
+ * clear_elfheader() - Remove the program header for a specific memory zone
+ * @z: pointer to the kmemdump zone
+ *
+ * Return: On success, it returns 0, errno otherwise
+ */
+int clear_elfheader(const struct kmemdump_zone *z)
+{
+ struct elf_phdr *phdr;
+ struct elf_phdr *tmp_phdr;
+ unsigned int phidx;
+ unsigned int i;
+
+ for (i = 0; i < ehdr->e_phnum; i++) {
+ phdr = elf_phdr_entry_addr(ehdr, i);
+ if (phdr->p_paddr == virt_to_phys(z->zone) &&
+ phdr->p_memsz == ALIGN(z->size, 4))
+ break;
+ }
+
+ if (i == ehdr->e_phnum) {
+ pr_debug("Cannot find program header entry in elf\n");
+ return -EINVAL;
+ }
+
+ phidx = i;
+
+ /* Clear program header */
+ tmp_phdr = elf_phdr_entry_addr(ehdr, phidx);
+ for (i = phidx; i < ehdr->e_phnum - 1; i++) {
+ tmp_phdr = elf_phdr_entry_addr(ehdr, i + 1);
+ phdr = elf_phdr_entry_addr(ehdr, i);
+ memcpy(phdr, tmp_phdr, sizeof(*phdr));
+ phdr->p_offset = phdr->p_offset - ALIGN(z->size, 4);
+ }
+ memset(tmp_phdr, 0, sizeof(*tmp_phdr));
+ ehdr->e_phnum--;
+
+ elf_offset -= ALIGN(z->size, 4);
+
+ return 0;
+}
+
+/**
+ * update_elfheader() - Add the program header for a specific memory zone
+ * @z: pointer to the kmemdump zone
+ *
+ * Return: None
+ */
+void update_elfheader(const struct kmemdump_zone *z)
+{
+ struct elf_phdr *phdr;
+
+ phdr = elf_phdr_entry_addr(ehdr, ehdr->e_phnum++);
+
+ phdr->p_type = PT_LOAD;
+ phdr->p_offset = elf_offset;
+ phdr->p_vaddr = (elf_addr_t)z->zone;
+ phdr->p_paddr = (elf_addr_t)virt_to_phys(z->zone);
+ phdr->p_filesz = phdr->p_memsz = ALIGN(z->size, 4);
+ phdr->p_flags = PF_R | PF_W;
+
+ elf_offset += ALIGN(z->size, 4);
+}
+
+/**
+ * init_elfheader() - Prepare coreinfo elf header
+ * This function prepares the elf header for the coredump image.
+ * Initially there is a single program header for the elf NOTE.
+ * The note contains the usual core dump information, and the
+ * vmcoreinfo.
+ *
+ * Return: 0 on success, errno otherwise
+ */
+int init_elfheader(void)
+{
+ struct elf_phdr *phdr;
+ void *notes;
+ unsigned int elfh_size;
+ unsigned int phdr_off;
+ size_t note_len, i = 0;
+
+ struct elf_prstatus prstatus = {};
+ struct elf_prpsinfo prpsinfo = {
+ .pr_sname = 'R',
+ .pr_fname = "vmlinux",
+ };
+
+ /*
+ * Header buffer contains:
+ * ELF header, Note entry with PR status, PR ps info, and vmcoreinfo
+ * MAX_NUM_ENTRIES Program headers,
+ */
+ elfh_size = sizeof(*ehdr);
+ elfh_size += sizeof(struct elf_prstatus);
+ elfh_size += sizeof(struct elf_prpsinfo);
+ elfh_size += sizeof(VMCOREINFO_NOTE_NAME);
+ elfh_size += ALIGN(vmcoreinfo_size, 4);
+ elfh_size += (sizeof(*phdr)) * (MAX_NUM_ENTRIES);
+
+ elfh_size = ALIGN(elfh_size, 4);
+
+ /* Never freed */
+ ehdr = kzalloc(elfh_size, GFP_KERNEL);
+ if (!ehdr)
+ return -ENOMEM;
+
+ /* Assign Program headers offset, it's right after the elf header. */
+ phdr = (struct elf_phdr *)(ehdr + 1);
+ phdr_off = sizeof(*ehdr);
+
+ memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
+ ehdr->e_ident[EI_CLASS] = ELF_CLASS;
+ ehdr->e_ident[EI_DATA] = ELF_DATA;
+ ehdr->e_ident[EI_VERSION] = EV_CURRENT;
+ ehdr->e_ident[EI_OSABI] = ELF_OSABI;
+ ehdr->e_type = ET_CORE;
+ ehdr->e_machine = ELF_ARCH;
+ ehdr->e_version = EV_CURRENT;
+ ehdr->e_ehsize = sizeof(*ehdr);
+ ehdr->e_phentsize = sizeof(*phdr);
+
+ elf_offset = elfh_size;
+
+ notes = (void *)(((char *)ehdr) + elf_offset);
+
+ /* we have a single program header now */
+ ehdr->e_phnum = 1;
+
+ /* Length of the note is made of :
+ * 3 elf notes structs (prstatus, prpsinfo, vmcoreinfo)
+ * 3 notes names (2 core strings, 1 vmcoreinfo name)
+ * sizeof each note
+ */
+ note_len = (3 * sizeof(struct elf_note) +
+ 2 * ALIGN(sizeof(CORE_STR), 4) +
+ VMCOREINFO_NOTE_NAME_BYTES +
+ ALIGN(sizeof(struct elf_prstatus), 4) +
+ ALIGN(sizeof(struct elf_prpsinfo), 4) +
+ ALIGN(vmcoreinfo_size, 4));
+
+ phdr->p_type = PT_NOTE;
+ phdr->p_offset = elf_offset;
+ phdr->p_filesz = note_len;
+
+ /* advance elf offset */
+ elf_offset += note_len;
+
+ strscpy(prpsinfo.pr_psargs, saved_command_line,
+ sizeof(prpsinfo.pr_psargs));
+
+ append_kcore_note(notes, &i, CORE_STR, NT_PRSTATUS, &prstatus,
+ sizeof(prstatus));
+ append_kcore_note(notes, &i, CORE_STR, NT_PRPSINFO, &prpsinfo,
+ sizeof(prpsinfo));
+ append_kcore_note_nodesc(notes, &i, VMCOREINFO_NOTE_NAME, 0,
+ ALIGN(vmcoreinfo_size, 4));
+
+ ehdr->e_phoff = phdr_off;
+
+ /* This is the first kmemdump region, the ELF header */
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_ELF, ehdr,
+ elfh_size + note_len - ALIGN(vmcoreinfo_size, 4));
+
+ /*
+ * The second region is the vmcoreinfo, which goes right after.
+ * It's being registered through vmcoreinfo.
+ */
+
+ return 0;
+}
+
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
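The program header bookkeeping in update_elfheader() and clear_elfheader()
above can be illustrated outside the kernel. The following is only a minimal
userspace sketch, assuming a simplified, hypothetical `struct seg` in place of
`struct elf_phdr`; the names `image_add` and `image_remove` are illustrative
and do not exist in the patch. It shows the one property the coreimage relies
on: regions are laid out back to back, so removing one must pull the file
offset of every later region down by the removed (4-byte aligned) size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN4(x) (((x) + 3u) & ~(size_t)3u)
#define MAX_ENTRIES 8

/* Hypothetical, simplified stand-in for struct elf_phdr. */
struct seg {
	uint64_t paddr;  /* physical address of the region */
	size_t   size;   /* 4-byte aligned size inside the file */
	size_t   offset; /* file offset where the region data starts */
};

struct image {
	struct seg segs[MAX_ENTRIES];
	unsigned int nsegs;
	size_t next_offset; /* file offset for the next appended region */
};

/* Append a segment at the current end of the image. */
int image_add(struct image *img, uint64_t paddr, size_t size)
{
	struct seg *s;

	assert(img);
	if (img->nsegs >= MAX_ENTRIES)
		return -1;

	s = &img->segs[img->nsegs++];
	s->paddr = paddr;
	s->size = ALIGN4(size);
	s->offset = img->next_offset;
	img->next_offset += s->size;
	return 0;
}

/* Remove a segment; later segments slide down by the removed size. */
int image_remove(struct image *img, uint64_t paddr, size_t size)
{
	size_t removed = ALIGN4(size);
	unsigned int i, idx;

	assert(img);
	for (idx = 0; idx < img->nsegs; idx++)
		if (img->segs[idx].paddr == paddr &&
		    img->segs[idx].size == removed)
			break;
	if (idx == img->nsegs)
		return -1;

	/* Copy each later entry one slot down and fix its file offset. */
	for (i = idx; i + 1 < img->nsegs; i++) {
		img->segs[i] = img->segs[i + 1];
		img->segs[i].offset -= removed;
	}
	memset(&img->segs[img->nsegs - 1], 0, sizeof(img->segs[0]));
	img->nsegs--;
	img->next_offset -= removed;
	return 0;
}
```

In this sketch `next_offset` plays the role of `elf_offset` in the patch: it
starts right after the header and notes, and only ever moves by aligned
region sizes.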
* [RFC][PATCH v3 04/16] Documentation: kmemdump: Add section for coreimage ELF
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (2 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 03/16] kmemdump: Add coreimage ELF layer Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 05/16] kernel/vmcore_info: Register dynamic information into Kmemdump Eugen Hristev
` (13 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add section describing the utility of coreimage ELF generation for
kmemdump.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
Documentation/dev-tools/kmemdump.rst | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/Documentation/dev-tools/kmemdump.rst b/Documentation/dev-tools/kmemdump.rst
index 504321de951a..5616843e0407 100644
--- a/Documentation/dev-tools/kmemdump.rst
+++ b/Documentation/dev-tools/kmemdump.rst
@@ -27,6 +27,14 @@ are limited.
Although the kernel has multiple debugging mechanisms, kmemdump fits
a particular model which is not covered by the others.
+kmemdump can also prepare specific regions of the kernel that can be
+put together to form a minimal core image file. To achieve this, the first
+region is an ELF header with program headers for each region, and another
+region contains a specific ELF NOTE section with vmcoreinfo.
+There are also multiple regions registered with basic kernel information
+that will allow debugging tools like 'crash' to load the image.
+To enable this feature, use CONFIG_KMEMDUMP_COREIMAGE.
+
kmemdump Internals
==================
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
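The ELF NOTE layout mentioned in the documentation above (a fixed header,
then the name, then the descriptor, each padded to 4 bytes) can be sketched in
plain userspace C. This is a rough illustration only: `struct note_hdr`,
`note_size` and `note_append` are hypothetical names, modelled on the three
32-bit fields of the standard ELF note header, and are not part of the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/* Same three 32-bit fields as the standard ELF note header. */
struct note_hdr {
	uint32_t n_namesz; /* name length, including the trailing NUL */
	uint32_t n_descsz; /* descriptor length in bytes */
	uint32_t n_type;   /* note type, e.g. NT_PRSTATUS */
};

/* Bytes one note occupies: header + padded name + padded descriptor. */
size_t note_size(const char *name, size_t descsz)
{
	return sizeof(struct note_hdr) + ALIGN4(strlen(name) + 1) +
	       ALIGN4(descsz);
}

/* Append one note at offset *off in buf and advance *off past it. */
void note_append(unsigned char *buf, size_t *off, const char *name,
		 uint32_t type, const void *desc, size_t descsz)
{
	struct note_hdr hdr = {
		.n_namesz = (uint32_t)(strlen(name) + 1),
		.n_descsz = (uint32_t)descsz,
		.n_type = type,
	};

	assert(buf && off);
	memcpy(&buf[*off], &hdr, sizeof(hdr));
	*off += sizeof(hdr);
	memcpy(&buf[*off], name, hdr.n_namesz);
	*off = ALIGN4(*off + hdr.n_namesz);
	memcpy(&buf[*off], desc, descsz);
	*off = ALIGN4(*off + descsz);
}
```

Summing `note_size()` over all notes gives the length of the PT_NOTE segment,
provided the first note starts at a 4-byte aligned offset. The caller is
assumed to have sized the buffer accordingly; the sketch does no bounds
checking.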
* [RFC][PATCH v3 05/16] kernel/vmcore_info: Register dynamic information into Kmemdump
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (3 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 04/16] Documentation: kmemdump: Add section for coreimage ELF Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 06/16] kmemdump: Introduce qcom-minidump backend driver Eugen Hristev
` (12 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Register vmcoreinfo information into kmemdump.
Because the size of the info is known only after all entries have been
added, there is no point in registering the whole page; instead, call
the kmemdump registration once everything is in place, with the right size.
A second reason is that the vmcoreinfo is added as a region inside
the ELF coreimage note, so there is no point in having blank space at the end.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/vmcore_info.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index e066d31d08f8..3e2e846ba9c8 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -14,6 +14,7 @@
#include <linux/cpuhotplug.h>
#include <linux/memblock.h>
#include <linux/kmemleak.h>
+#include <linux/kmemdump.h>
#include <asm/page.h>
#include <asm/sections.h>
@@ -118,6 +119,12 @@ phys_addr_t __weak paddr_vmcoreinfo_note(void)
}
EXPORT_SYMBOL(paddr_vmcoreinfo_note);
+static void vmcoreinfo_kmemdump(void)
+{
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
+ (void *)vmcoreinfo_data, vmcoreinfo_size);
+}
+
static int __init crash_save_vmcoreinfo_init(void)
{
vmcoreinfo_data = (unsigned char *)get_zeroed_page(GFP_KERNEL);
@@ -227,6 +234,7 @@ static int __init crash_save_vmcoreinfo_init(void)
arch_crash_save_vmcoreinfo();
update_vmcoreinfo_note();
+ vmcoreinfo_kmemdump();
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC][PATCH v3 06/16] kmemdump: Introduce qcom-minidump backend driver
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (4 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 05/16] kernel/vmcore_info: Register dynamic information into Kmemdump Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 07/16] soc: qcom: smem: Add minidump device Eugen Hristev
` (11 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Qualcomm Minidump is a backend driver for kmemdump.
Regions are registered in the shared memory on Qualcomm platforms
and linked into the minidump table of contents.
The firmware can then read the table of contents and dump the memory
accordingly.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 5 +
mm/kmemdump/Kconfig.debug | 12 ++
mm/kmemdump/Makefile | 1 +
mm/kmemdump/qcom_minidump.c | 353 ++++++++++++++++++++++++++++++++++++
4 files changed, 371 insertions(+)
create mode 100644 mm/kmemdump/qcom_minidump.c
diff --git a/MAINTAINERS b/MAINTAINERS
index fc8cd34cf190..8234acb24cbc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13818,6 +13818,11 @@ F: include/linux/kmemdump.h
F: mm/kmemdump/kmemdump.c
F: mm/kmemdump/kmemdump_coreimage.c
+KMEMDUMP QCOM MINIDUMP BACKEND DRIVER
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: mm/kmemdump/qcom_minidump.c
+
KMEMLEAK
M: Catalin Marinas <catalin.marinas@arm.com>
S: Maintained
diff --git a/mm/kmemdump/Kconfig.debug b/mm/kmemdump/Kconfig.debug
index f62bde50a81b..91cec45bc3ca 100644
--- a/mm/kmemdump/Kconfig.debug
+++ b/mm/kmemdump/Kconfig.debug
@@ -26,3 +26,15 @@ config KMEMDUMP_COREIMAGE
for debug tools are being registered.
The coredump file can then be loaded into GDB or crash tool and
further inspected.
+
+config KMEMDUMP_QCOM_MINIDUMP_BACKEND
+ tristate "Qualcomm Minidump kmemdump backend driver"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on KMEMDUMP
+ help
+ Say y here to enable the Qualcomm Minidump kmemdump backend
+ driver.
+ With this backend, the registered regions are linked
+ into the minidump table of contents. The firmware can
+ then read the table of contents and extract the memory
+ regions on a case-by-case basis.
diff --git a/mm/kmemdump/Makefile b/mm/kmemdump/Makefile
index eed67f15a8d0..6ec3871203ef 100644
--- a/mm/kmemdump/Makefile
+++ b/mm/kmemdump/Makefile
@@ -2,3 +2,4 @@
obj-y += kmemdump.o
obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
+obj-$(CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND) += qcom_minidump.o
diff --git a/mm/kmemdump/qcom_minidump.c b/mm/kmemdump/qcom_minidump.c
new file mode 100644
index 000000000000..604a58240c20
--- /dev/null
+++ b/mm/kmemdump/qcom_minidump.c
@@ -0,0 +1,353 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Qualcomm Minidump backend driver for Kmemdump
+ * Copyright (C) 2016,2024-2025 Linaro Ltd
+ * Copyright (C) 2015 Sony Mobile Communications Inc
+ * Copyright (c) 2012-2013, The Linux Foundation. All rights reserved.
+ */
+
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/soc/qcom/smem.h>
+#include <linux/kmemdump.h>
+#include <linux/container_of.h>
+
+/*
+ * On some older Qualcomm devices, the boot firmware statically allocates
+ * 300 entries as the total number of supported regions (including all
+ * co-processors) in the minidump table, of which Linux was using 201.
+ * In the future, this limitation might be removed by having the boot
+ * firmware allocate the regions dynamically. To stay compatible with
+ * older devices, keep the current limit for Linux at 201.
+ */
+#define MAX_NUM_REGIONS 201
+
+#define MAX_NUM_SUBSYSTEMS 10
+#define MAX_REGION_NAME_LENGTH 16
+#define SBL_MINIDUMP_SMEM_ID 602
+#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+
+#define MINIDUMP_SS_ENCR_NOTREQ (0 << 24 | 0 << 16 | 'N' << 8 | 'R' << 0)
+
+#define MINIDUMP_SUBSYSTEM_APSS 0
+
+const char *kmemdump_id_to_md_string[] = {
+ "",
+ "ELF",
+ "vmcoreinfo",
+ "config",
+ "memsect",
+ "totalram",
+ "cpu_possible",
+ "cpu_present",
+ "cpu_online",
+ "cpu_active",
+ "jiffies",
+ "linux_banner",
+ "nr_threads",
+ "nr_irqs",
+ "tainted_mask",
+ "taint_flags",
+ "mem_section",
+ "node_data",
+ "node_states",
+ "__per_cpu_offset",
+ "nr_swapfiles",
+ "init_uts_ns",
+ "printk_rb_static",
+ "printk_rb_dynamic",
+ "prb",
+ "prb_descs",
+ "prb_infos",
+ "prb_data",
+ "runqueues",
+ "high_memory",
+ "init_mm",
+ "init_mm_pgd",
+};
+
+/**
+ * struct minidump_region - Minidump region
+ * @name : Name of the region to be dumped
+ * @seq_num : Used to differentiate regions with the same name.
+ * @valid : This entry to be dumped (if set to 1)
+ * @address : Physical address of region to be dumped
+ * @size : Size of the region
+ */
+struct minidump_region {
+ char name[MAX_REGION_NAME_LENGTH];
+ __le32 seq_num;
+ __le32 valid;
+ __le64 address;
+ __le64 size;
+};
+
+/**
+ * struct minidump_subsystem - Subsystem's SMEM Table of content
+ * @status : Subsystem toc init status
+ * @enabled : if set to 1, this region would be copied during coredump
+ * @encryption_status: Encryption status for this subsystem
+ * @encryption_required : Decides to encrypt the subsystem regions or not
+ * @region_count : Number of regions added in this subsystem toc
+ * @regions_baseptr : regions base pointer of the subsystem
+ */
+struct minidump_subsystem {
+ __le32 status;
+ __le32 enabled;
+ __le32 encryption_status;
+ __le32 encryption_required;
+ __le32 region_count;
+ __le64 regions_baseptr;
+};
+
+/**
+ * struct minidump_global_toc - Global Table of Content
+ * @status : Global Minidump init status
+ * @revision : Minidump revision
+ * @enabled : Minidump enable status
+ * @subsystems : Array of subsystems toc
+ */
+struct minidump_global_toc {
+ __le32 status;
+ __le32 revision;
+ __le32 enabled;
+ struct minidump_subsystem subsystems[MAX_NUM_SUBSYSTEMS];
+};
+
+#define MINIDUMP_MAX_NAME_LENGTH 12
+/**
+ * struct qcom_minidump_region - Minidump region information
+ *
+ * @name: Minidump region name
+ * @virt_addr: Virtual address of the entry.
+ * @phys_addr: Physical address of the entry to dump.
+ * @size: Number of bytes to dump from the @phys_addr location,
+ * and it should be 4 byte aligned.
+ * @id: Region id.
+ */
+struct qcom_minidump_region {
+ char name[MINIDUMP_MAX_NAME_LENGTH];
+ void *virt_addr;
+ phys_addr_t phys_addr;
+ size_t size;
+ unsigned int id;
+};
+
+/**
+ * struct minidump - Minidump driver data information
+ *
+ * @dev: Minidump device struct.
+ * @toc: Minidump table of contents subsystem.
+ * @regions: Minidump regions array.
+ * @md_be: Minidump backend.
+ */
+struct minidump {
+ struct device *dev;
+ struct minidump_subsystem *toc;
+ struct minidump_region *regions;
+ struct kmemdump_backend md_be;
+};
+
+static struct minidump *md;
+
+#define be_to_minidump(be) container_of(be, struct minidump, md_be)
+
+/**
+ * qcom_apss_md_table_init() - Initialize the minidump table
+ * @md: minidump data
+ * @mdss_toc: minidump subsystem table of contents
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int qcom_apss_md_table_init(struct minidump *md,
+ struct minidump_subsystem *mdss_toc)
+{
+ md->toc = mdss_toc;
+ md->regions = devm_kcalloc(md->dev, MAX_NUM_REGIONS,
+ sizeof(*md->regions), GFP_KERNEL);
+ if (!md->regions)
+ return -ENOMEM;
+
+ md->toc->regions_baseptr = cpu_to_le64(virt_to_phys(md->regions));
+ md->toc->enabled = cpu_to_le32(MINIDUMP_SS_ENABLED);
+ md->toc->status = cpu_to_le32(1);
+ md->toc->region_count = cpu_to_le32(0);
+
+ /* Tell bootloader not to encrypt the regions of this subsystem */
+ md->toc->encryption_status = cpu_to_le32(MINIDUMP_SS_ENCR_DONE);
+ md->toc->encryption_required = cpu_to_le32(MINIDUMP_SS_ENCR_NOTREQ);
+
+ return 0;
+}
+
+/**
+ * qcom_md_get_region_index() - Lookup minidump region by kmemdump id
+ * @md: minidump data
+ * @id: minidump region id
+ *
+ * Return: On success, it returns the internal region index, on failure,
+ * returns negative error value
+ */
+static int qcom_md_get_region_index(struct minidump *md, int id)
+{
+ unsigned int count = le32_to_cpu(md->toc->region_count);
+ unsigned int i;
+
+ for (i = 0; i < count; i++)
+ if (md->regions[i].seq_num == id)
+ return i;
+
+ return -ENOENT;
+}
+
+/**
+ * register_md_region() - Register a new minidump region
+ * @be: kmemdump backend, this should be the minidump backend
+ * @id: unique id to identify the region
+ * @vaddr: virtual memory address of the region start
+ * @size: size of the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int register_md_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *vaddr, size_t size)
+{
+ struct minidump *md = be_to_minidump(be);
+ struct minidump_region *mdr;
+ unsigned int num_region, region_cnt;
+ const char *name = "unknown";
+
+ if (!vaddr || !size)
+ return -EINVAL;
+
+ if (id < ARRAY_SIZE(kmemdump_id_to_md_string))
+ name = kmemdump_id_to_md_string[id];
+
+ if (qcom_md_get_region_index(md, id) >= 0) {
+ dev_dbg(md->dev, "%s:%d region is already registered\n",
+ name, id);
+ return -EEXIST;
+ }
+
+ /* Check if there is a room for a new entry */
+ num_region = le32_to_cpu(md->toc->region_count);
+ if (num_region >= MAX_NUM_REGIONS) {
+ dev_err(md->dev, "maximum region limit %u reached\n",
+ num_region);
+ return -ENOSPC;
+ }
+
+ region_cnt = le32_to_cpu(md->toc->region_count);
+ mdr = &md->regions[region_cnt];
+ scnprintf(mdr->name, MAX_REGION_NAME_LENGTH, "K%.8s", name);
+ mdr->seq_num = id;
+ mdr->address = cpu_to_le64(__pa(vaddr));
+ mdr->size = cpu_to_le64(ALIGN(size, 4));
+ mdr->valid = cpu_to_le32(MINIDUMP_REGION_VALID);
+ region_cnt++;
+ md->toc->region_count = cpu_to_le32(region_cnt);
+
+ return 0;
+}
+
+/**
+ * unregister_md_region() - Unregister a previously registered minidump region
+ * @be: pointer to backend
+ * @id: unique id to identify the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int unregister_md_region(const struct kmemdump_backend *be,
+ unsigned int id)
+{
+ struct minidump *md = be_to_minidump(be);
+ struct minidump_region *mdr;
+ unsigned int region_cnt;
+ int idx;
+
+ idx = qcom_md_get_region_index(md, id);
+ if (idx < 0) {
+ dev_err(md->dev, "%d region is not present\n", id);
+ return idx;
+ }
+
+ mdr = &md->regions[0];
+ region_cnt = le32_to_cpu(md->toc->region_count);
+ /*
+ * Shift all the regions that follow the removed region left by
+ * one index to fill the gap, and zero out the last region
+ * at the end.
+ */
+ memmove(&mdr[idx], &mdr[idx + 1], (region_cnt - idx - 1) * sizeof(*mdr));
+ memset(&mdr[region_cnt - 1], 0, sizeof(*mdr));
+ region_cnt--;
+ md->toc->region_count = cpu_to_le32(region_cnt);
+
+ return 0;
+}
+
+static int qcom_md_probe(struct platform_device *pdev)
+{
+ struct minidump_global_toc *mdgtoc;
+ size_t size;
+ int ret;
+
+ md = kzalloc(sizeof(*md), GFP_KERNEL);
+ if (!md)
+ return -ENOMEM;
+
+ md->dev = &pdev->dev;
+
+ strscpy(md->md_be.name, "qcom_minidump");
+ md->md_be.register_region = register_md_region;
+ md->md_be.unregister_region = unregister_md_region;
+
+ mdgtoc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, &size);
+ if (IS_ERR(mdgtoc)) {
+ ret = PTR_ERR(mdgtoc);
+ dev_err(md->dev, "Couldn't find minidump smem item %d\n", ret);
+ goto qcom_md_probe_fail;
+ }
+
+ if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
+ dev_err(md->dev, "minidump table is not initialized\n");
+ ret = -ENAVAIL;
+ goto qcom_md_probe_fail;
+ }
+
+ ret = qcom_apss_md_table_init(md, &mdgtoc->subsystems[MINIDUMP_SUBSYSTEM_APSS]);
+ if (ret)
+ goto qcom_md_probe_fail;
+
+ return kmemdump_register_backend(&md->md_be);
+
+qcom_md_probe_fail:
+ kfree(md);
+ return ret;
+}
+
+static void qcom_md_remove(struct platform_device *pdev)
+{
+ kmemdump_unregister_backend(&md->md_be);
+ kfree(md);
+}
+
+static struct platform_driver qcom_md_driver = {
+ .probe = qcom_md_probe,
+ .remove = qcom_md_remove,
+ .driver = {
+ .name = "qcom-minidump",
+ },
+};
+
+module_platform_driver(qcom_md_driver);
+
+MODULE_AUTHOR("Eugen Hristev <eugen.hristev@linaro.org>");
+MODULE_AUTHOR("Mukesh Ojha <quic_mojha@quicinc.com>");
+MODULE_DESCRIPTION("Qualcomm kmemdump minidump backend driver");
+MODULE_LICENSE("GPL");
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
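The region table handling in register_md_region() and unregister_md_region()
above follows a common dense-array pattern: append at the end, and on removal
close the gap with memmove() so that the firmware only ever sees `count`
contiguous valid entries. Below is a minimal userspace sketch of that pattern.
`struct region`, `struct table` and the `table_*` helpers are hypothetical and
much simpler than the SMEM layout used by the driver (no endianness handling,
no names, no valid marker).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 4

/* Hypothetical, simplified region entry; not the firmware layout. */
struct region {
	uint32_t id;
	uint64_t addr;
	uint64_t size;
};

struct table {
	struct region regions[MAX_REGIONS];
	unsigned int count;
};

/* Return the index of the entry with this id, or -1 if absent. */
int table_find(const struct table *t, uint32_t id)
{
	unsigned int i;

	assert(t);
	for (i = 0; i < t->count; i++)
		if (t->regions[i].id == id)
			return (int)i;
	return -1;
}

/* Append an entry; reject duplicates and a full table. */
int table_add(struct table *t, uint32_t id, uint64_t addr, uint64_t size)
{
	if (table_find(t, id) >= 0 || t->count >= MAX_REGIONS)
		return -1;

	t->regions[t->count] = (struct region){
		.id = id,
		.addr = addr,
		.size = (size + 3) & ~(uint64_t)3, /* 4-byte aligned */
	};
	t->count++;
	return 0;
}

/* Remove an entry and close the gap so the table stays dense. */
int table_remove(struct table *t, uint32_t id)
{
	int idx = table_find(t, id); /* signed, so -1 is detectable */

	if (idx < 0)
		return -1;

	memmove(&t->regions[idx], &t->regions[idx + 1],
		(t->count - idx - 1) * sizeof(t->regions[0]));
	memset(&t->regions[t->count - 1], 0, sizeof(t->regions[0]));
	t->count--;
	return 0;
}
```

The lookup result is kept in a signed variable on purpose: a "not found"
return can only be tested with `< 0` if the index type is signed.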
* [RFC][PATCH v3 07/16] soc: qcom: smem: Add minidump device
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (5 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 06/16] kmemdump: Introduce qcom-minidump backend driver Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 08/16] init/version: Add banner_len to save banner length Eugen Hristev
` (10 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add a minidump platform device.
Minidump can collect various memory snippets using dedicated firmware.
To know which snippets to collect, each snippet must be registered
by the kernel into a specific shared memory table which is controlled
by the qcom smem driver.
To instantiate the minidump platform device, register it using
platform_device_register_data().
Later on, the minidump driver will probe and register itself into
kmemdump as a backend.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
drivers/soc/qcom/smem.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
index c4c45f15dca4..03315722d71a 100644
--- a/drivers/soc/qcom/smem.c
+++ b/drivers/soc/qcom/smem.c
@@ -270,6 +270,7 @@ struct smem_region {
* @partitions: list of partitions of current processor/host
* @item_count: max accepted item number
* @socinfo: platform device pointer
+ * @mdinfo: minidump device pointer
* @num_regions: number of @regions
* @regions: list of the memory regions defining the shared memory
*/
@@ -280,6 +281,7 @@ struct qcom_smem {
u32 item_count;
struct platform_device *socinfo;
+ struct platform_device *mdinfo;
struct smem_ptable *ptable;
struct smem_partition global_partition;
struct smem_partition partitions[SMEM_HOST_COUNT];
@@ -1236,12 +1238,20 @@ static int qcom_smem_probe(struct platform_device *pdev)
if (IS_ERR(smem->socinfo))
dev_dbg(&pdev->dev, "failed to register socinfo device\n");
+ smem->mdinfo = platform_device_register_data(&pdev->dev, "qcom-minidump",
+ PLATFORM_DEVID_AUTO, NULL,
+ 0);
+ if (IS_ERR(smem->mdinfo))
+ dev_err(&pdev->dev, "failed to register platform md device\n");
+
return 0;
}
static void qcom_smem_remove(struct platform_device *pdev)
{
platform_device_unregister(__smem->socinfo);
+ if (!IS_ERR(__smem->mdinfo))
+ platform_device_unregister(__smem->mdinfo);
hwspin_lock_free(__smem->hwlock);
__smem = NULL;
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC][PATCH v3 08/16] init/version: Add banner_len to save banner length
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (6 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 07/16] soc: qcom: smem: Add minidump device Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static Eugen Hristev
` (9 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add banner_len to store the length of linux_banner.
Dumping mechanisms need the length in order to save the banner.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
include/linux/printk.h | 1 +
init/version-timestamp.c | 1 +
init/version.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 45c663124c9b..5bc617222948 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -12,6 +12,7 @@
struct console;
extern const char linux_banner[];
+extern const int banner_len;
extern const char linux_proc_banner[];
extern int oops_in_progress; /* If set, an oops, panic(), BUG() or die() is in progress */
diff --git a/init/version-timestamp.c b/init/version-timestamp.c
index 043cbf80a766..1fdd795be747 100644
--- a/init/version-timestamp.c
+++ b/init/version-timestamp.c
@@ -28,3 +28,4 @@ struct uts_namespace init_uts_ns = {
const char linux_banner[] =
"Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";
+const int banner_len = sizeof(linux_banner);
diff --git a/init/version.c b/init/version.c
index 94c96f6fbfe6..68d16748b081 100644
--- a/init/version.c
+++ b/init/version.c
@@ -48,6 +48,7 @@ BUILD_LTO_INFO;
struct uts_namespace init_uts_ns __weak;
const char linux_banner[] __weak;
+const int banner_len __weak;
#include "version-timestamp.c"
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
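The reason patch 08 records the length next to the definition can be illustrated in plain C: printk.h declares linux_banner as an array of unknown bound, so sizeof can only be applied where the array is defined. The following is a minimal userspace sketch, not kernel code; demo_banner, demo_banner_len and demo_banner_text_len() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for linux_banner; the real string is built from UTS macros. */
static const char demo_banner[] = "Linux version 6.17.0-demo (user@host) #1 SMP\n";

/* sizeof on an array of known bound counts every byte, including the NUL. */
static const int demo_banner_len = sizeof(demo_banner);

/* Length of the text alone, without the terminating NUL. */
static size_t demo_banner_text_len(void)
{
	size_t n = strlen(demo_banner);

	/* The array is exactly one byte longer than its text. */
	assert(n + 1 == sizeof(demo_banner));
	return n;
}
```

Note that a length taken this way includes the terminating NUL, so a consumer that wants only the text would subtract one.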
* [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (7 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 08/16] init/version: Add banner_len to save banner length Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-16 21:10 ` Thomas Gleixner
2025-09-12 15:08 ` [RFC][PATCH v3 10/16] panic: Have tainted_mask " Eugen Hristev
` (8 subsequent siblings)
17 siblings, 1 reply; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
nr_irqs is required for debugging the kernel, and needs to be
accessible so that vmcoreinfo can register it with kmemdump.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/irq/irqdesc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index db714d3014b5..6c3c8c4687fd 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -139,7 +139,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
desc_smp_init(desc, node, affinity);
}
-static unsigned int nr_irqs = NR_IRQS;
+unsigned int nr_irqs = NR_IRQS;
/**
* irq_get_nr_irqs() - Number of interrupts supported by the system.
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC][PATCH v3 10/16] panic: Have tainted_mask as non-static
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (8 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 11/16] mm/swapfile: Have nr_swapfiles " Eugen Hristev
` (7 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
tainted_mask is required for debugging the kernel, and needs to be
accessible so that vmcoreinfo can register it with kmemdump.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/panic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/panic.c b/kernel/panic.c
index d9c7cd09aeb9..048c33dab98a 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -54,7 +54,7 @@ static unsigned int __read_mostly sysctl_oops_all_cpu_backtrace;
#endif /* CONFIG_SMP */
int panic_on_oops = CONFIG_PANIC_ON_OOPS_VALUE;
-static unsigned long tainted_mask =
+unsigned long tainted_mask =
IS_ENABLED(CONFIG_RANDSTRUCT) ? (1 << TAINT_RANDSTRUCT) : 0;
static int pause_on_oops;
static int pause_on_oops_flag;
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC][PATCH v3 11/16] mm/swapfile: Have nr_swapfiles as non-static
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (9 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 10/16] panic: Have tainted_mask " Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 12/16] printk: Register information into Kmemdump Eugen Hristev
` (6 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
nr_swapfiles is required for debugging the kernel, and needs to be
accessible so that vmcoreinfo can register it with kmemdump.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index a7ffabbe65ef..2ef51da2c642 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -63,7 +63,7 @@ static struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
static inline void unlock_cluster(struct swap_cluster_info *ci);
static DEFINE_SPINLOCK(swap_lock);
-static unsigned int nr_swapfiles;
+unsigned int nr_swapfiles;
atomic_long_t nr_swap_pages;
/*
* Some modules use swappable objects and may try to swap them out under
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [RFC][PATCH v3 12/16] printk: Register information into Kmemdump
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (10 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 11/16] mm/swapfile: Have nr_swapfiles " Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 13/16] sched: Add sched_get_runqueues_area Eugen Hristev
` (5 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Kmemdump requires the prb and its data, descriptors and infos.
Register them from log_buf_vmcoreinfo_setup().
When the log buffer is replaced by a dynamically allocated one at
runtime, unregister the old regions first and then register the new ones.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/printk/printk.c | 47 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5aee9ffb16b9..f75489fd82df 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -49,6 +49,7 @@
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
#include <linux/panic.h>
+#include <linux/kmemdump.h>
#include <linux/uaccess.h>
#include <asm/sections.h>
@@ -964,6 +965,43 @@ const struct file_operations kmsg_fops = {
};
#ifdef CONFIG_VMCORE_INFO
+static void log_buf_vmcoreinfo_kmemdump_update(void *data, size_t data_size,
+ void *descs, size_t descs_size,
+ void *infos, size_t infos_size)
+{
+ kmemdump_unregister(KMEMDUMP_ID_COREIMAGE_prb_data);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_data,
+ (void *)data, data_size);
+
+ kmemdump_unregister(KMEMDUMP_ID_COREIMAGE_prb_descs);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_descs,
+ (void *)descs, descs_size);
+
+ kmemdump_unregister(KMEMDUMP_ID_COREIMAGE_prb_infos);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_infos,
+ (void *)infos, infos_size);
+}
+
+static void log_buf_vmcoreinfo_kmemdump(void)
+{
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb,
+ (void *)&prb, sizeof(prb));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_descs,
+ (void *)&_printk_rb_static_descs,
+ sizeof(_printk_rb_static_descs));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_infos,
+ (void *)&_printk_rb_static_infos,
+ sizeof(_printk_rb_static_infos));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_prb_data,
+ (void *)&__log_buf, __LOG_BUF_LEN);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_printk_rb_static,
+ (void *)&printk_rb_static,
+ sizeof(printk_rb_static));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_printk_rb_dynamic,
+ (void *)&printk_rb_dynamic,
+ sizeof(printk_rb_dynamic));
+}
+
/*
* This appends the listed symbols to /proc/vmcore
*
@@ -1029,6 +1067,8 @@ void log_buf_vmcoreinfo_setup(void)
VMCOREINFO_STRUCT_SIZE(latched_seq);
VMCOREINFO_OFFSET(latched_seq, val);
+
+ log_buf_vmcoreinfo_kmemdump();
}
#endif
@@ -1214,6 +1254,11 @@ void __init setup_log_buf(int early)
new_descs, ilog2(new_descs_count),
new_infos);
+#ifdef CONFIG_VMCORE_INFO
+ log_buf_vmcoreinfo_kmemdump_update(new_log_buf, new_log_buf_len,
+ new_descs, new_descs_size,
+ new_infos, new_infos_size);
+#endif
local_irq_save(flags);
log_buf_len = new_log_buf_len;
@@ -1257,8 +1302,10 @@ void __init setup_log_buf(int early)
return;
err_free_descs:
+ kmemdump_unregister(KMEMDUMP_ID_COREIMAGE_prb_descs);
memblock_free(new_descs, new_descs_size);
err_free_log_buf:
+ kmemdump_unregister(KMEMDUMP_ID_COREIMAGE_prb_data);
memblock_free(new_log_buf, new_log_buf_len);
out:
print_log_buf_usage_stats();
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
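The unregister-then-register sequence used in patch 12 when the log buffer is replaced can be sketched in userspace as follows. This is only an illustration under stated assumptions: the demo_* names are hypothetical, and the rule that a second registration of a busy ID is rejected is an assumption of this sketch, not something the series states about kmemdump_register_id().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IDs mirroring the three replaceable printk regions. */
enum demo_id { DEMO_ID_PRB_DATA, DEMO_ID_PRB_DESCS, DEMO_ID_PRB_INFOS, DEMO_ID_MAX };

struct demo_region {
	void *vaddr;
	size_t size;
	int used;
};

static struct demo_region demo_table[DEMO_ID_MAX];

/* Assumption of this sketch: registering an ID that is in use fails. */
static int demo_register_id(enum demo_id id, void *vaddr, size_t size)
{
	assert((unsigned int)id < DEMO_ID_MAX);
	if (demo_table[id].used)
		return -1;
	demo_table[id].vaddr = vaddr;
	demo_table[id].size = size;
	demo_table[id].used = 1;
	return 0;
}

/* Dropping an ID that is not registered is a harmless no-op here. */
static void demo_unregister(enum demo_id id)
{
	assert((unsigned int)id < DEMO_ID_MAX);
	demo_table[id].used = 0;
}

/* Replace a region: drop the stale entry first, then add the new one. */
static int demo_replace(enum demo_id id, void *vaddr, size_t size)
{
	demo_unregister(id);
	return demo_register_id(id, vaddr, size);
}
```

Under that assumption, the old entry has to be dropped first; otherwise the backend would keep describing the static buffer after the dynamic one has taken over.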
* [RFC][PATCH v3 13/16] sched: Add sched_get_runqueues_area
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (11 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 12/16] printk: Register information into Kmemdump Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 14/16] kernel/vmcoreinfo: Register kmemdump core image information Eugen Hristev
` (4 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add a simple function to get the runqueues area and its size for
dumping purposes.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/sched/core.c | 15 +++++++++++++++
kernel/sched/sched.h | 2 ++
2 files changed, 17 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9af28286e61a..a054dd1fda68 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -120,6 +120,21 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+/**
+ * sched_get_runqueues_area() - obtain runqueues area for dumping
+ * @start: pointer to the start of the area, to be filled in
+ * @size: size of the area, to be filled in
+ *
+ * The obtained area is only to be used for dumping purpose
+ *
+ * Return: none
+ */
+void sched_get_runqueues_area(void **start, size_t *size)
+{
+ *start = &runqueues;
+ *size = sizeof(runqueues);
+}
+
#ifdef CONFIG_SCHED_PROXY_EXEC
DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
static int __init setup_proxy_exec(char *str)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b5367c514c14..3b9cedb1fbeb 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1330,6 +1330,8 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
#define raw_rq() raw_cpu_ptr(&runqueues)
+void sched_get_runqueues_area(void **start, size_t *size);
+
#ifdef CONFIG_SCHED_PROXY_EXEC
static inline void rq_set_donor(struct rq *rq, struct task_struct *t)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
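The accessor added in patch 13 follows a common pattern: the object stays private to its file, and a small function reports its address and size through out-parameters. A minimal userspace sketch of that pattern follows; struct demo_rq, demo_runqueues and demo_get_runqueues_area() are hypothetical names, and the plain array only stands in for the kernel's per-CPU variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct rq. */
struct demo_rq {
	unsigned int nr_running;
	unsigned long clock;
};

#define DEMO_NR_CPUS 4

/* File-private storage; callers never see the symbol itself. */
static struct demo_rq demo_runqueues[DEMO_NR_CPUS];

/* Report where the area starts and how many bytes it spans. */
void demo_get_runqueues_area(void **start, size_t *size)
{
	assert(start && size);
	*start = demo_runqueues;
	*size = sizeof(demo_runqueues);
}
```

A caller then passes the two values straight to its registration routine without needing the type definition.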
* [RFC][PATCH v3 14/16] kernel/vmcoreinfo: Register kmemdump core image information
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (12 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 13/16] sched: Add sched_get_runqueues_area Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver Eugen Hristev
` (3 subsequent siblings)
17 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
The coreimage generated by kmemdump requires some kernel information
in order to be successfully loaded by `crash` or gdb.
Register all this information through vmcoreinfo once vmcoreinfo is set up.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/vmcore_info.c | 141 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 141 insertions(+)
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index 3e2e846ba9c8..1d83e95cf9be 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -15,6 +15,7 @@
#include <linux/memblock.h>
#include <linux/kmemleak.h>
#include <linux/kmemdump.h>
+#include <linux/sched/stat.h>
#include <asm/page.h>
#include <asm/sections.h>
@@ -24,6 +25,17 @@
#include "kallsyms_internal.h"
#include "kexec_internal.h"
+void sched_get_runqueues_area(void **start, size_t *size);
+
+extern unsigned int nr_irqs;
+extern unsigned long tainted_mask;
+extern unsigned int nr_swapfiles;
+
+#ifdef CONFIG_IKCONFIG_PROC
+extern char kernel_config_data;
+extern char kernel_config_data_end;
+#endif
+
/* vmcoreinfo stuff */
unsigned char *vmcoreinfo_data;
size_t vmcoreinfo_size;
@@ -121,8 +133,137 @@ EXPORT_SYMBOL(paddr_vmcoreinfo_note);
static void vmcoreinfo_kmemdump(void)
{
+ void *start;
+ size_t size;
+ int i;
+
kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
(void *)vmcoreinfo_data, vmcoreinfo_size);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_linux_banner,
+ (void *)&linux_banner, banner_len);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_init_uts_ns,
+ (void *)&init_uts_ns, sizeof(init_uts_ns));
+
+ sched_get_runqueues_area(&start, &size);
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
+ (void *)start, size);
+
+#ifdef CONFIG_IKCONFIG_PROC
+ /* Register 8 bytes before and after, to catch the marker too */
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_CONFIG,
+ (void *)&kernel_config_data - 8,
+ &kernel_config_data_end - &kernel_config_data + 16);
+#endif
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE___cpu_possible_mask,
+ (void *)&__cpu_possible_mask,
+ sizeof(__cpu_possible_mask));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE___cpu_active_mask,
+ (void *)&__cpu_active_mask,
+ sizeof(__cpu_active_mask));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE___cpu_online_mask,
+ (void *)&__cpu_online_mask,
+ sizeof(__cpu_online_mask));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE___cpu_present_mask,
+ (void *)&__cpu_present_mask,
+ sizeof(__cpu_present_mask));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_nr_irqs,
+ (void *)&nr_irqs, sizeof(nr_irqs));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_tainted_mask,
+ (void *)&tainted_mask, sizeof(tainted_mask));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_taint_flags,
+ (void *)&taint_flags, sizeof(taint_flags));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_jiffies_64,
+ (void *)&jiffies_64, sizeof(jiffies_64));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_nr_threads,
+ (void *)&nr_threads, sizeof(nr_threads));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_node_states,
+ (void *)&node_states, sizeof(node_states));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_init_mm,
+ (void *)&init_mm, sizeof(init_mm));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_init_mm_pgd,
+ (void *)&init_mm.pgd, sizeof(*init_mm.pgd));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__totalram_pages,
+ (void *)&_totalram_pages, sizeof(_totalram_pages));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_nr_swapfiles,
+ (void *)&nr_swapfiles, sizeof(nr_swapfiles));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE___per_cpu_offset,
+ (void *)&__per_cpu_offset, sizeof(__per_cpu_offset));
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_high_memory,
+ (void *)&high_memory, sizeof(high_memory));
+#ifdef CONFIG_NUMA
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_node_data,
+ (void *)&node_data,
+ MAX_NUMNODES * sizeof(struct pglist_data));
+
+ for (i = 0; i < MAX_NUMNODES; i++) {
+ if (!NODE_DATA(i))
+ continue;
+ kmemdump_register((void *)NODE_DATA(i),
+ roundup(sizeof(pg_data_t), SMP_CACHE_BYTES));
+ }
+#endif
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_mem_section,
+ (void *)&mem_section, sizeof(mem_section));
+ for (i = 0; i < NR_SECTION_ROOTS; i++) {
+ if (!mem_section[i])
+ continue;
+ kmemdump_register((void *)mem_section[i],
+ SECTIONS_PER_ROOT * sizeof(struct mem_section));
+ }
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_MEMSECT,
+ (void *)mem_section,
+ sizeof(struct mem_section *) * NR_SECTION_ROOTS);
+
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_num_syms,
+ (void *)&kallsyms_num_syms,
+ sizeof(kallsyms_num_syms));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_relative_base,
+ (void *)&kallsyms_relative_base,
+ sizeof(kallsyms_relative_base));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_offsets,
+ (void *)&kallsyms_offsets,
+ sizeof(&kallsyms_offsets));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_names,
+ (void *)&kallsyms_names,
+ sizeof(&kallsyms_names));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_token_table,
+ (void *)&kallsyms_token_table,
+ sizeof(&kallsyms_token_table));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_token_index,
+ (void *)&kallsyms_token_index,
+ sizeof(&kallsyms_token_index));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_markers,
+ (void *)&kallsyms_markers,
+ sizeof(&kallsyms_markers));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_kallsyms_seqs_of_names,
+ (void *)&kallsyms_seqs_of_names,
+ sizeof(&kallsyms_seqs_of_names));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__sinittext,
+ (void *)&_sinittext, sizeof(&_sinittext));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__einittext,
+ (void *)&_einittext, sizeof(&_einittext));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__end,
+ (void *)&_end, sizeof(&_end));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__text,
+ (void *)&_text, sizeof(&_text));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__stext,
+ (void *)&_stext, sizeof(&_stext));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE__etext,
+ (void *)&_etext, sizeof(&_etext));
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_swapper_pg_dir,
+ (void *)&swapper_pg_dir, sizeof(&swapper_pg_dir));
}
static int __init crash_save_vmcoreinfo_init(void)
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
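The "8 bytes before and after" arithmetic used in patch 14 for the kernel config region can be sketched in userspace. The in-kernel config blob is bracketed by the 8-byte markers "IKCFG_ST" and "IKCFG_ED"; the sketch below uses a hypothetical plain-text payload in place of the real compressed data, and all demo_* names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MARKER_LEN 8

/* Hypothetical blob: start marker, payload, end marker. */
static const char demo_blob[] = "IKCFG_ST" "CONFIG_DEMO=y\n" "IKCFG_ED";

/* Stand-ins for kernel_config_data and kernel_config_data_end. */
static const char *demo_data_start(void)
{
	return demo_blob + DEMO_MARKER_LEN;
}

static const char *demo_data_end(void)
{
	/* sizeof counts the trailing NUL, which is not part of the blob. */
	return demo_blob + sizeof(demo_blob) - 1 - DEMO_MARKER_LEN;
}

/* Widen the payload range by one marker length on each side. */
static void demo_region_with_markers(const char **start, size_t *size)
{
	assert(start && size);
	*start = demo_data_start() - DEMO_MARKER_LEN;
	*size = (size_t)(demo_data_end() - demo_data_start()) +
		2 * DEMO_MARKER_LEN;
}

/* Check that a region really begins and ends with the two markers. */
static int demo_has_markers(const char *start, size_t size)
{
	return size >= 2 * DEMO_MARKER_LEN &&
	       memcmp(start, "IKCFG_ST", DEMO_MARKER_LEN) == 0 &&
	       memcmp(start + size - DEMO_MARKER_LEN, "IKCFG_ED",
		      DEMO_MARKER_LEN) == 0;
}
```

Including the markers lets a tool that scans the dump for them locate the config data without knowing its address.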
* [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (13 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 14/16] kernel/vmcoreinfo: Register kmemdump core image information Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-16 5:48 ` Alexey Klimov
2025-09-22 10:01 ` Tudor Ambarus
2025-09-12 15:08 ` [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo Eugen Hristev
` (2 subsequent siblings)
17 siblings, 2 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add Kinfo backend driver.
This backend driver selects only the regions that are of interest to the
firmware, and it copies those into a shared memory area that is supplied via OF.
The firmware is only interested in the addresses of some symbols.
The list format is Kinfo-compatible, as used on devices like Google Pixel phones.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 5 +
mm/kmemdump/Kconfig.debug | 13 ++
mm/kmemdump/Makefile | 1 +
mm/kmemdump/kinfo.c | 293 ++++++++++++++++++++++++++++++++++++++
4 files changed, 312 insertions(+)
create mode 100644 mm/kmemdump/kinfo.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 8234acb24cbc..65d9e5db46a9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13818,6 +13818,11 @@ F: include/linux/kmemdump.h
F: mm/kmemdump/kmemdump.c
F: mm/kmemdump/kmemdump_coreimage.c
+KMEMDUMP KINFO BACKEND DRIVER
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: mm/kmemdump/kinfo.c
+
KMEMDUMP QCOM MINIDUMP BACKEND DRIVER
M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
diff --git a/mm/kmemdump/Kconfig.debug b/mm/kmemdump/Kconfig.debug
index 91cec45bc3ca..ff88bf8017ae 100644
--- a/mm/kmemdump/Kconfig.debug
+++ b/mm/kmemdump/Kconfig.debug
@@ -38,3 +38,16 @@ config KMEMDUMP_QCOM_MINIDUMP_BACKEND
into the minidump table of contents. Further on, the firmware
will be able to read the table of contents and extract the
memory regions on case-by-case basis.
+
+config KMEMDUMP_KINFO_BACKEND
+ tristate "Shared memory KInfo compatible backend"
+ depends on KMEMDUMP
+ select VMCORE_INFO
+ help
+ Say y here to enable the Shared memory KInfo compatible backend
+ driver.
+ With this backend, the registered regions are copied to a shared
+ memory zone at register time.
+ The shared memory zone is supplied via OF.
+ This backend will select only regions that are of interest,
+ and keep only addresses. The format of the list is Kinfo compatible.
diff --git a/mm/kmemdump/Makefile b/mm/kmemdump/Makefile
index 6ec3871203ef..1ec94ee6c008 100644
--- a/mm/kmemdump/Makefile
+++ b/mm/kmemdump/Makefile
@@ -3,3 +3,4 @@
obj-y += kmemdump.o
obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
obj-$(CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND) += qcom_minidump.o
+obj-$(CONFIG_KMEMDUMP_KINFO_BACKEND) += kinfo.o
diff --git a/mm/kmemdump/kinfo.c b/mm/kmemdump/kinfo.c
new file mode 100644
index 000000000000..9f0ec8a1aaa2
--- /dev/null
+++ b/mm/kmemdump/kinfo.c
@@ -0,0 +1,293 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *
+ * Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
+ * Copyright 2021 Google LLC
+ * Copyright 2025 Linaro Ltd. Eugen Hristev <eugen.hristev@linaro.org>
+ */
+#include <linux/platform_device.h>
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/kmemdump.h>
+#include <linux/module.h>
+#include <linux/utsname.h>
+
+#define BUILD_INFO_LEN 256
+#define DEBUG_KINFO_MAGIC 0xCCEEDDFF
+
+/*
+ * Header structure must be byte-packed, since the table is provided to
+ * bootloader.
+ */
+struct kernel_info {
+ /* For kallsyms */
+ __u8 enabled_all;
+ __u8 enabled_base_relative;
+ __u8 enabled_absolute_percpu;
+ __u8 enabled_cfi_clang;
+ __u32 num_syms;
+ __u16 name_len;
+ __u16 bit_per_long;
+ __u16 module_name_len;
+ __u16 symbol_len;
+ __u64 _relative_pa;
+ __u64 _text_pa;
+ __u64 _stext_pa;
+ __u64 _etext_pa;
+ __u64 _sinittext_pa;
+ __u64 _einittext_pa;
+ __u64 _end_pa;
+ __u64 _offsets_pa;
+ __u64 _names_pa;
+ __u64 _token_table_pa;
+ __u64 _token_index_pa;
+ __u64 _markers_pa;
+ __u64 _seqs_of_names_pa;
+
+ /* For frame pointer */
+ __u32 thread_size;
+
+ /* For virt_to_phys */
+ __u64 swapper_pg_dir_pa;
+
+ /* For linux banner */
+ __u8 last_uts_release[__NEW_UTS_LEN];
+
+ /* Info of running build */
+ __u8 build_info[BUILD_INFO_LEN];
+
+ /* For module kallsyms */
+ __u32 enabled_modules_tree_lookup;
+ __u32 mod_mem_offset;
+ __u32 mod_kallsyms_offset;
+} __packed;
+
+struct kernel_all_info {
+ __u32 magic_number;
+ __u32 combined_checksum;
+ struct kernel_info info;
+} __packed;
+
+struct debug_kinfo {
+ struct device *dev;
+ void *all_info_addr;
+ u32 all_info_size;
+ struct kmemdump_backend kinfo_be;
+};
+
+static struct debug_kinfo *kinfo;
+
+#define be_to_kinfo(be) container_of(be, struct debug_kinfo, kinfo_be)
+
+static void update_kernel_all_info(struct kernel_all_info *all_info)
+{
+ int index;
+ struct kernel_info *info;
+ u32 *checksum_info;
+
+ all_info->magic_number = DEBUG_KINFO_MAGIC;
+ all_info->combined_checksum = 0;
+
+ info = &all_info->info;
+ checksum_info = (u32 *)info;
+ for (index = 0; index < sizeof(*info) / sizeof(u32); index++)
+ all_info->combined_checksum ^= checksum_info[index];
+}
+
+static int build_info_set(const char *str, const struct kernel_param *kp)
+{
+ struct kernel_all_info *all_info = kinfo->all_info_addr;
+ size_t build_info_size;
+
+ if (kinfo->all_info_addr == 0 || kinfo->all_info_size == 0)
+ return -ENAVAIL;
+
+ all_info = (struct kernel_all_info *)kinfo->all_info_addr;
+ build_info_size = sizeof(all_info->info.build_info);
+
+ memcpy(&all_info->info.build_info, str, min(build_info_size - 1,
+ strlen(str)));
+ update_kernel_all_info(all_info);
+
+ if (strlen(str) >= build_info_size) {
+ pr_warn("%s: Build info buffer (len: %zu) can't hold entire string '%s'\n",
+ __func__, build_info_size, str);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static const struct kernel_param_ops build_info_op = {
+ .set = build_info_set,
+};
+
+module_param_cb(build_info, &build_info_op, NULL, 0200);
+MODULE_PARM_DESC(build_info, "Write build info to field 'build_info' of debug kinfo.");
+
+static int register_kinfo_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *vaddr, size_t size)
+{
+ struct debug_kinfo *kinfo = be_to_kinfo(be);
+ struct kernel_all_info *all_info = kinfo->all_info_addr;
+ struct kernel_info *info = &all_info->info;
+ struct uts_namespace *uts;
+
+ switch (id) {
+ case KMEMDUMP_ID_COREIMAGE__sinittext:
+ info->_sinittext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__einittext:
+ info->_einittext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__end:
+ info->_end_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__text:
+ info->_text_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__stext:
+ info->_stext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__etext:
+ info->_etext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_num_syms:
+ info->num_syms = *(__u32 *)vaddr;
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_relative_base:
+ info->_relative_pa = (u64)__pa(*(u64 *)vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_offsets:
+ info->_offsets_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_names:
+ info->_names_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_token_table:
+ info->_token_table_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_token_index:
+ info->_token_index_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_markers:
+ info->_markers_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_seqs_of_names:
+ info->_seqs_of_names_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_swapper_pg_dir:
+ info->swapper_pg_dir_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_init_uts_ns:
+ uts = vaddr;
+ strscpy(info->last_uts_release, uts->name.release, __NEW_UTS_LEN);
+ break;
+ default:
+ break;
+ }
+
+ update_kernel_all_info(all_info);
+ return 0;
+}
+
+static int unregister_kinfo_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id)
+{
+ return 0;
+}
+
+static int debug_kinfo_probe(struct platform_device *pdev)
+{
+ struct device_node *mem_region;
+ struct reserved_mem *rmem;
+ struct kernel_info *info;
+ struct kernel_all_info *all_info;
+
+ mem_region = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
+ if (!mem_region) {
+ dev_warn(&pdev->dev, "no such memory-region\n");
+ return -ENODEV;
+ }
+
+ rmem = of_reserved_mem_lookup(mem_region);
+ if (!rmem) {
+ dev_warn(&pdev->dev, "no such reserved mem of node name %s\n",
+ pdev->dev.of_node->name);
+ return -ENODEV;
+ }
+
+ /* Need to wait for reserved memory to be mapped */
+ if (!rmem->priv)
+ return -EPROBE_DEFER;
+
+ if (!rmem->base || !rmem->size) {
+ dev_warn(&pdev->dev, "unexpected reserved memory\n");
+ return -EINVAL;
+ }
+
+ if (rmem->size < sizeof(struct kernel_all_info)) {
+ dev_warn(&pdev->dev, "unexpected reserved memory size\n");
+ return -EINVAL;
+ }
+
+ kinfo = kzalloc(sizeof(*kinfo), GFP_KERNEL);
+ if (!kinfo)
+ return -ENOMEM;
+
+ kinfo->dev = &pdev->dev;
+
+ strscpy(kinfo->kinfo_be.name, "debug_kinfo");
+ kinfo->kinfo_be.register_region = register_kinfo_region;
+ kinfo->kinfo_be.unregister_region = unregister_kinfo_region;
+ kinfo->all_info_addr = rmem->priv;
+ kinfo->all_info_size = rmem->size;
+
+ all_info = kinfo->all_info_addr;
+
+ memset(all_info, 0, sizeof(struct kernel_all_info));
+ info = &all_info->info;
+ info->enabled_all = IS_ENABLED(CONFIG_KALLSYMS_ALL);
+ info->enabled_absolute_percpu = IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU);
+ info->enabled_base_relative = IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE);
+ info->enabled_cfi_clang = IS_ENABLED(CONFIG_CFI_CLANG);
+ info->name_len = KSYM_NAME_LEN;
+ info->bit_per_long = BITS_PER_LONG;
+ info->module_name_len = MODULE_NAME_LEN;
+ info->symbol_len = KSYM_SYMBOL_LEN;
+ info->thread_size = THREAD_SIZE;
+ info->enabled_modules_tree_lookup = IS_ENABLED(CONFIG_MODULES_TREE_LOOKUP);
+ info->mod_mem_offset = offsetof(struct module, mem);
+ info->mod_kallsyms_offset = offsetof(struct module, kallsyms);
+
+ return kmemdump_register_backend(&kinfo->kinfo_be);
+}
+
+static void debug_kinfo_remove(struct platform_device *pdev)
+{
+ kmemdump_unregister_backend(&kinfo->kinfo_be);
+ kfree(kinfo);
+}
+
+static const struct of_device_id debug_kinfo_of_match[] = {
+ { .compatible = "google,debug-kinfo" },
+ {},
+};
+MODULE_DEVICE_TABLE(of, debug_kinfo_of_match);
+
+static struct platform_driver debug_kinfo_driver = {
+ .probe = debug_kinfo_probe,
+ .remove = debug_kinfo_remove,
+ .driver = {
+ .name = "debug-kinfo",
+ .of_match_table = of_match_ptr(debug_kinfo_of_match),
+ },
+};
+module_platform_driver(debug_kinfo_driver);
+
+MODULE_AUTHOR("Eugen Hristev <eugen.hristev@linaro.org>");
+MODULE_AUTHOR("Jone Chou <jonechou@google.com>");
+MODULE_DESCRIPTION("kmemdump Kinfo Driver");
+MODULE_LICENSE("GPL");
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
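The checksum in update_kernel_all_info() from patch 15 is a plain XOR over the 32-bit words of the packed info structure. A userspace sketch of the same scheme is shown below. The demo_* structures are hypothetical, much smaller stand-ins for struct kernel_info and struct kernel_all_info, and the verify helper reflects an assumption about how a consumer could check the table; the series itself does not show the firmware side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KINFO_MAGIC 0xCCEEDDFFu

/* Hypothetical, reduced stand-in for struct kernel_info. */
struct demo_info {
	uint32_t num_syms;
	uint16_t name_len;
	uint16_t bit_per_long;
	uint64_t text_pa;
} __attribute__((packed));

/* Hypothetical stand-in for struct kernel_all_info. */
struct demo_all_info {
	uint32_t magic_number;
	uint32_t combined_checksum;
	struct demo_info info;
} __attribute__((packed));

/* XOR all complete 32-bit words; trailing bytes are ignored. */
static uint32_t demo_xor_words(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t sum = 0, word;
	size_t i;

	for (i = 0; i + sizeof(word) <= len; i += sizeof(word)) {
		/* memcpy avoids unaligned loads from the packed struct. */
		memcpy(&word, p + i, sizeof(word));
		sum ^= word;
	}
	return sum;
}

/* Producer side: stamp the magic and recompute the checksum. */
static void demo_update(struct demo_all_info *all)
{
	assert(all != NULL);
	all->magic_number = DEMO_KINFO_MAGIC;
	all->combined_checksum = demo_xor_words(&all->info, sizeof(all->info));
}

/* Assumed consumer side: accept the table only if both fields match. */
static int demo_verify(const struct demo_all_info *all)
{
	return all->magic_number == DEMO_KINFO_MAGIC &&
	       all->combined_checksum ==
	       demo_xor_words(&all->info, sizeof(all->info));
}
```

Because the checksum covers the whole info structure, it has to be recomputed after every field update, which is why the driver calls update_kernel_all_info() at the end of each registration.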
* [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (14 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver Eugen Hristev
@ 2025-09-12 15:08 ` Eugen Hristev
2025-09-14 11:56 ` Krzysztof Kozlowski
2025-09-12 15:56 ` [RFC][PATCH v3 00/16] Introduce kmemdump David Hildenbrand
2025-09-16 7:49 ` Mukesh Ojha
17 siblings, 1 reply; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 15:08 UTC (permalink / raw)
To: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
Add documentation for the Google Kinfo kmemdump backend driver.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
.../bindings/misc/google,kinfo.yaml | 36 +++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 37 insertions(+)
create mode 100644 Documentation/devicetree/bindings/misc/google,kinfo.yaml
diff --git a/Documentation/devicetree/bindings/misc/google,kinfo.yaml b/Documentation/devicetree/bindings/misc/google,kinfo.yaml
new file mode 100644
index 000000000000..b1e4fac43586
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/google,kinfo.yaml
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/misc/google,kinfo.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Google Pixel Kinfo debug driver
+
+maintainers:
+ - Eugen Hristev <eugen.hristev@linaro.org>
+
+description:
+ The Google Pixel Kinfo debug driver uses a supplied reserved memory area
+ to save debugging information on the running kernel.
+
+properties:
+ compatible:
+ items:
+ - const: google,debug-kinfo
+
+ memory-region:
+ maxItems: 1
+ description: Reference to the reserved-memory for the data
+
+required:
+ - compatible
+ - memory-region
+
+additionalProperties: true
+
+examples:
+ - |
+ debug-kinfo {
+ compatible = "google,debug-kinfo";
+ memory-region = <&debug_kinfo_reserved>;
+ };
diff --git a/MAINTAINERS b/MAINTAINERS
index 65d9e5db46a9..6a846c51db04 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13821,6 +13821,7 @@ F: mm/kmemdump/kmemdump_coreimage.c
KMEMDUMP KINFO BACKEND DRIVER
M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
+F: Documentation/devicetree/bindings/misc/google,kinfo.yaml
F: mm/kmemdump/kinfo.c
KMEMDUMP QCOM MINIDUMP BACKEND DRIVER
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 00/16] Introduce kmemdump
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (15 preceding siblings ...)
2025-09-12 15:08 ` [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo Eugen Hristev
@ 2025-09-12 15:56 ` David Hildenbrand
2025-09-12 18:35 ` Eugen Hristev
2025-09-16 7:49 ` Mukesh Ojha
17 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-09-12 15:56 UTC (permalink / raw)
To: Eugen Hristev, linux-arm-msm, linux-kernel, linux-mm, tglx,
andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
>
> Changelog since the v2 of the RFC:
> - V2 available here : https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
> - Removed the .section as requested by David Hildenbrand.
> - Moved all kmemdump registration(when possible) to vmcoreinfo.
> - Because of this, some of the variables that I was registering had to be non-static
> so I had to modify this as per David Hildenbrand suggestion.
> - Fixed minor things in the Kinfo driver: one field was broken, fixed some
> compiler warnings, fixed the copyright and remove some useless includes.
> - Moved the whole kmemdump from drivers/debug into mm/ and Kconfigs into mm/Kconfig.debug
> and it's now available in kernel hacking, as per Randy Dunlap review
> - Reworked some of the Documentation as per review from Jon Corbet
IIUC, it's now only printk.c where we do kmemdump-related magic, right?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 00/16] Introduce kmemdump
2025-09-12 15:56 ` [RFC][PATCH v3 00/16] Introduce kmemdump David Hildenbrand
@ 2025-09-12 18:35 ` Eugen Hristev
0 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-12 18:35 UTC (permalink / raw)
To: David Hildenbrand, linux-arm-msm, linux-kernel, linux-mm, tglx,
andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/12/25 18:56, David Hildenbrand wrote:
>>
>> Changelog since the v2 of the RFC:
>> - V2 available here : https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
>> - Removed the .section as requested by David Hildenbrand.
>> - Moved all kmemdump registration(when possible) to vmcoreinfo.
>> - Because of this, some of the variables that I was registering had to be non-static
>> so I had to modify this as per David Hildenbrand suggestion.
>> - Fixed minor things in the Kinfo driver: one field was broken, fixed some
>> compiler warnings, fixed the copyright and remove some useless includes.
>> - Moved the whole kmemdump from drivers/debug into mm/ and Kconfigs into mm/Kconfig.debug
>> and it's now available in kernel hacking, as per Randy Dunlap review
>> - Reworked some of the Documentation as per review from Jon Corbet
>
> IIUC, it's now only printk.c where we do kmemdump-related magic, right?
>
Yes. The other places just have some changes such that I am able to
gather the data inside vmcoreinfo. (remove static, add some function to
get sizes)
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo
2025-09-12 15:08 ` [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo Eugen Hristev
@ 2025-09-14 11:56 ` Krzysztof Kozlowski
0 siblings, 0 replies; 42+ messages in thread
From: Krzysztof Kozlowski @ 2025-09-14 11:56 UTC (permalink / raw)
To: Eugen Hristev, linux-arm-msm, linux-kernel, linux-mm, tglx,
andersson, pmladek, rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 12/09/2025 17:08, Eugen Hristev wrote:
> Add documentation for Google Kinfo kmemdump backend driver.
>
> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
> ---
> .../bindings/misc/google,kinfo.yaml | 36 +++++++++++++++++++
> MAINTAINERS | 1 +
> 2 files changed, 37 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/misc/google,kinfo.yaml
>
> diff --git a/Documentation/devicetree/bindings/misc/google,kinfo.yaml b/Documentation/devicetree/bindings/misc/google,kinfo.yaml
> new file mode 100644
> index 000000000000..b1e4fac43586
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/misc/google,kinfo.yaml
> @@ -0,0 +1,36 @@
> +# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/misc/google,kinfo.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Google Pixel Kinfo debug driver
> +
> +maintainers:
> + - Eugen Hristev <eugen.hristev@linaro.org>
> +
> +description:
> + The Google Pixel Kinfo debug driver uses a supplied reserved memory area
> + to save debugging information on the running kernel.
Bindings should be for hardware, not drivers, so this does not belong to
DT. It might be a dedicated reserved memory node, though.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver
2025-09-12 15:08 ` [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver Eugen Hristev
@ 2025-09-16 5:48 ` Alexey Klimov
2025-09-22 10:01 ` Tudor Ambarus
1 sibling, 0 replies; 42+ messages in thread
From: Alexey Klimov @ 2025-09-16 5:48 UTC (permalink / raw)
To: Eugen Hristev
Cc: linux-arm-msm, linux-kernel, linux-mm, tglx, andersson, pmladek,
rdunlap, corbet, david, mhocko, tudor.ambarus, mukesh.ojha,
linux-arm-kernel, linux-hardening, jonechou, rostedt, linux-doc,
devicetree
On Fri Sep 12, 2025 at 4:08 PM BST, Eugen Hristev wrote:
[..]
> --- /dev/null
> +++ b/mm/kmemdump/kinfo.c
> @@ -0,0 +1,293 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *
> + * Copyright 2002 Rusty Russell <rusty@rustcorp.com.au> IBM Corporation
> + * Copyright 2021 Google LLC
> + * Copyright 2025 Linaro Ltd. Eugen Hristev <eugen.hristev@linaro.org>
> + */
> +#include <linux/platform_device.h>
> +#include <linux/kallsyms.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_reserved_mem.h>
> +#include <linux/kmemdump.h>
> +#include <linux/module.h>
> +#include <linux/utsname.h>
Could you please check if the headers are sorted here
and in all other patches in this series?
Also module.h is duplicated.
[..]
> +static int build_info_set(const char *str, const struct kernel_param *kp)
> +{
> + struct kernel_all_info *all_info = kinfo->all_info_addr;
here ^^
> + size_t build_info_size;
> +
> + if (kinfo->all_info_addr == 0 || kinfo->all_info_size == 0)
> + return -ENAVAIL;
> +
> + all_info = (struct kernel_all_info *)kinfo->all_info_addr;
Maybe assignment of all_info on declaration in the beginning of this function
is not needed then?
> + build_info_size = sizeof(all_info->info.build_info);
> +
> + memcpy(&all_info->info.build_info, str, min(build_info_size - 1,
> + strlen(str)));
> + update_kernel_all_info(all_info);
> +
> + if (strlen(str) > build_info_size) {
> + pr_warn("%s: Build info buffer (len: %zd) can't hold entire string '%s'\n",
> + __func__, build_info_size, str);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
[...]
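To illustrate, here is a minimal userspace sketch with a single bounded copy
and a truncation check that uses the same bound as the copy. It is not the
driver code: copy_build_info() and BUILD_INFO_LEN are names made up for this
example, standing in for the build_info field, and the choice to treat
"does not fit including the terminator" as truncation is my assumption about
the intent.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sizeof(all_info->info.build_info). */
#define BUILD_INFO_LEN 16

/*
 * Copy str into a fixed-size, NUL-terminated buffer.
 * Returns 0 on success, -ENOMEM if str had to be truncated.
 */
static int copy_build_info(char dst[BUILD_INFO_LEN], const char *str)
{
	size_t len, n;

	assert(dst && str);

	len = strlen(str);
	n = len < BUILD_INFO_LEN - 1 ? len : BUILD_INFO_LEN - 1;

	memcpy(dst, str, n);
	dst[n] = '\0';

	/* Same bound as the copy above, so every truncation is reported. */
	return len > BUILD_INFO_LEN - 1 ? -ENOMEM : 0;
}
```

With this shape the destination pointer is assigned once, after the validity
check, and a string of exactly BUILD_INFO_LEN characters is reported as
truncated rather than silently cut.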
Best regards,
Alexey
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 00/16] Introduce kmemdump
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
` (16 preceding siblings ...)
2025-09-12 15:56 ` [RFC][PATCH v3 00/16] Introduce kmemdump David Hildenbrand
@ 2025-09-16 7:49 ` Mukesh Ojha
2025-09-16 15:25 ` Luck, Tony
17 siblings, 1 reply; 42+ messages in thread
From: Mukesh Ojha @ 2025-09-16 7:49 UTC (permalink / raw)
To: Eugen Hristev
Cc: kees, tony.luck, gpiccoli, linux-arm-msm, linux-kernel, linux-mm,
tglx, andersson, pmladek, rdunlap, corbet, david, mhocko,
tudor.ambarus, linux-arm-kernel, linux-hardening, jonechou,
rostedt, linux-doc, devicetree
On Fri, Sep 12, 2025 at 06:08:39PM +0300, Eugen Hristev wrote:
> kmemdump is a mechanism which allows the kernel to mark specific memory
> areas for dumping or specific backend usage.
> Once regions are marked, kmemdump keeps an internal list with the regions
> and registers them in the backend.
> Further, depending on the backend driver, these regions can be dumped using
> firmware or different hardware block.
> Regions being marked beforehand, when the system is up and running, there
> is no need nor dependency on a panic handler, or a working kernel that can
> dump the debug information.
> The kmemdump approach works when pstore, kdump, or another mechanism do not.
> Pstore relies on persistent storage, a dedicated RAM area or flash, which
> has the disadvantage of having the memory reserved all the time, or another
> specific non volatile memory. Some devices cannot keep the RAM contents on
> reboot so ramoops does not work. Some devices do not allow kexec to run
> another kernel to debug the crashed one.
> For such devices, that have another mechanism to help debugging, like
> firmware, kmemdump is a viable solution.
>
> kmemdump can create a core image, similar with /proc/vmcore, with only
> the registered regions included. This can be loaded into crash tool/gdb and
> analyzed.
> To have this working, specific information from the kernel is registered,
> and this is done at kmemdump init time, no need for the kmemdump user to
> do anything.
>
> This version of the kmemdump patch series includes two backend drivers:
> one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
> backend for Android devices, reworked from this source here:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
> written originally by Jone Chou <jonechou@google.com>
+Adding some pstore experts to bring this to their attention if this can
be followed and if they find it useful.
Wouldn't it be a good idea to add pstore as one of the kmemdump backends, so
that all kmemdump-captured data automatically flows into a separate record
(section) in pstore (e.g., ram)? All pstore users would then benefit from the
kmemdump-captured data, and the kmemdump section could later be recovered from
ramoops by userspace on the next boot, so that the crash utility can be run on it.
kmemdump captured data
|
---------------------
| |
V V
Vendors firmware pstore(ram)
backend backend
(minidump)
Thanks,
-Mukesh
>
> *** History, motivation and available online resources ***
>
> Initial version of kmemdump and discussion is available here:
> https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
>
> Kmemdump has been presented and discussed at Linaro Connect 2025,
> including motivation, scope, usability and feasibility.
> Video of the recording is available here for anyone interested:
> https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
>
> Linaro blog on kmemdump can be found here:
> https://www.linaro.org/blog/introduction-to-kmemdump/
>
> The implementation is based on the initial Pstore/directly mapped zones
> published as an RFC here:
> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>
> The back-end implementation for qcom_minidump is based on the minidump
> patch series and driver written by Mukesh Ojha, thanks:
> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>
> The RFC v2 version with .section creation and macro annotation kmemdump
> is available here:
> https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
>
> *** How to use kmemdump with minidump backend on Qualcomm platform guide ***
>
> Prerequisites:
> Crash tool compiled with target=ARM64 and minor changes required for usual crash
> mode (minimal mode works without the patch)
> A patch can be applied from here https://p.calebs.dev/49a048
> This patch will be eventually sent in a reworked way to crash tool.
>
> Target kernel must be built with :
> CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
> information needed for crash tool.
>
> Also, the kernel requires these as well:
> CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
> CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
>
> Kernel arguments:
> Kernel firmware must be set to mode 'mini' by kernel module parameter
> like this : qcom_scm.download_mode=mini
>
> After the kernel boots, and qcom_minidump module is loaded, everything is ready for
> a possible crash.
>
> Once the crash happens, the firmware will kick in and you will see on
> the console the message saying Sahara init, etc, that the firmware is
> waiting in download mode. (this is subject to firmware supporting this
> mode, I am using sa8775p-ride board)
>
> Example of log on the console:
> "
> [...]
> B - 1096414 - usb: init start
> B - 1100287 - usb: qusb_dci_platform , 0x19
> B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
> B - 1107455 - usb: usb2phy: PRIM success , 0x4
> B - 1112670 - usb: dci, chgr_type_det_err
> B - 1117154 - usb: ID:0x260, value: 0x4
> B - 1121942 - usb: ID:0x108, value: 0x1d90
> B - 1124992 - usb: timer_start , 0x4c4b40
> B - 1129140 - usb: vbus_det_pm_unavail
> B - 1133136 - usb: ID:0x252, value: 0x4
> B - 1148874 - usb: SUPER , 0x900e
> B - 1275510 - usb: SUPER , 0x900e
> B - 1388970 - usb: ID:0x20d, value: 0x0
> B - 1411113 - usb: ENUM success
> B - 1411113 - Sahara Init
> B - 1414285 - Sahara Open
> "
>
> Once the board is in download mode, you can use the qdl tool (I
> personally use edl , have not tried qdl yet), to get all the regions as
> separate files.
> The tool from the host computer will list the regions in the order they
> were downloaded.
>
> Once you have all the files simply use `cat` to put them all together,
> in the order of the indexes.
> For my kernel config and setup, here is my cat command : (you can use a script
> or something, I haven't done that so far):
>
> `cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
> memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
> memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
> memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
> memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
> memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
> memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
> memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
> memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
> memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
> memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
> memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
> memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
> memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
> memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
> memory/md_Kunknown46.BIN memory/md_Kunknown49.BIN memory/md_Kunknown50.BIN \
> memory/md_Kunknown51.BIN > ~/minidump_image`
>
> Once you have the resulted file, use `crash` tool to load it, like this:
> `./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
>
> There is also a --minimal mode for ./crash that would work without any patch applied
> to crash tool, but you can't inspect symbols, etc.
>
> Once you load crash you will see something like this :
>
> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
> DUMPFILE: /home/eugen/new
> CPUS: 8 [OFFLINE: 5]
> DATE: Thu Jan 1 02:00:00 EET 1970
> UPTIME: 00:00:22
> TASKS: 0
> NODENAME: qemuarm64
> RELEASE: 6.17.0-rc5-next-20250910-00020-g7dfa02aeae7e
> VERSION: #116 SMP PREEMPT Thu Sep 11 18:28:06 EEST 2025
> MACHINE: aarch64 (unknown Mhz)
> MEMORY: 34.2 GB
> PANIC: ""
> crash> log
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
> [ 0.000000] Linux version 6.17.0-rc5-next-20250910-00020-g7dfa02aeae7e (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #116 SMP PREEMPT Thu Sep 11 18:28:06 EEST 2025
>
>
> *** Debug Kinfo backend driver ***
> I don't have any device to actually test this. So I have not.
> I hacked the driver to just use a kmalloc'ed area to save things instead
> of the shared memory, and dumped everything there and checked whether it is identical
> with what the downstream driver would have saved.
> So this synthetic test passed and memories are identical.
> Anyone who actually wants to test this, feel free to reply to the patch.
> I have also written a simple DT binding for the driver.
>
> Thanks for everyone reviewing and bringing ideas into the discussion.
>
> Eugen
>
> Changelog since the v2 of the RFC:
> - V2 available here : https://lore.kernel.org/all/20250724135512.518487-1-eugen.hristev@linaro.org/
> - Removed the .section as requested by David Hildenbrand.
> - Moved all kmemdump registration(when possible) to vmcoreinfo.
> - Because of this, some of the variables that I was registering had to be non-static
> so I had to modify this as per David Hildenbrand suggestion.
> - Fixed minor things in the Kinfo driver: one field was broken, fixed some
> compiler warnings, fixed the copyright and remove some useless includes.
> - Moved the whole kmemdump from drivers/debug into mm/ and Kconfigs into mm/Kconfig.debug
> and it's now available in kernel hacking, as per Randy Dunlap review
> - Reworked some of the Documentation as per review from Jon Corbet
>
>
> Changelog since the v1 of the RFC:
> - V1 available here: https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
> - Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
> This means new API, macros, new way to store the regions inside kmemdump
> (ditched the IDR, moved to static allocation, have a static default backend, etc)
> - Reworked qcom_minidump driver based on review from Bjorn Andersson
> - Reworked printk log buffer registration based on review from Petr Mladek
>
> I apologize if I missed any review comments. I know there is still lots of work
> on this series and hope I will improve it more and more.
> Patches are sent on top of next-20250910
>
> Eugen Hristev (16):
> kmemdump: Introduce kmemdump
> Documentation: Add kmemdump
> kmemdump: Add coreimage ELF layer
> Documentation: kmemdump: Add section for coreimage ELF
> kernel/vmcore_info: Register dynamic information into Kmemdump
> kmemdump: Introduce qcom-minidump backend driver
> soc: qcom: smem: Add minidump device
> init/version: Add banner_len to save banner length
> genirq/irqdesc: Have nr_irqs as non-static
> panic: Have tainted_mask as non-static
> mm/swapfile: Have nr_swapfiles as non-static
> printk: Register information into Kmemdump
> sched: Add sched_get_runqueues_area
> kernel/vmcoreinfo: Register kmemdump core image information
> kmemdump: Add Kinfo backend driver
> dt-bindings: Add Google Kinfo
>
> Documentation/dev-tools/index.rst | 1 +
> Documentation/dev-tools/kmemdump.rst | 139 +++++++
> .../bindings/misc/google,kinfo.yaml | 36 ++
> MAINTAINERS | 19 +
> drivers/soc/qcom/smem.c | 10 +
> include/linux/kmemdump.h | 130 +++++++
> include/linux/printk.h | 1 +
> init/version-timestamp.c | 1 +
> init/version.c | 1 +
> kernel/irq/irqdesc.c | 2 +-
> kernel/panic.c | 2 +-
> kernel/printk/printk.c | 47 +++
> kernel/sched/core.c | 15 +
> kernel/sched/sched.h | 2 +
> kernel/vmcore_info.c | 149 ++++++++
> mm/Kconfig.debug | 2 +
> mm/Makefile | 1 +
> mm/kmemdump/Kconfig.debug | 53 +++
> mm/kmemdump/Makefile | 6 +
> mm/kmemdump/kinfo.c | 293 +++++++++++++++
> mm/kmemdump/kmemdump.c | 234 ++++++++++++
> mm/kmemdump/kmemdump_coreimage.c | 222 +++++++++++
> mm/kmemdump/qcom_minidump.c | 353 ++++++++++++++++++
> mm/swapfile.c | 2 +-
> 24 files changed, 1718 insertions(+), 3 deletions(-)
> create mode 100644 Documentation/dev-tools/kmemdump.rst
> create mode 100644 Documentation/devicetree/bindings/misc/google,kinfo.yaml
> create mode 100644 include/linux/kmemdump.h
> create mode 100644 mm/kmemdump/Kconfig.debug
> create mode 100644 mm/kmemdump/Makefile
> create mode 100644 mm/kmemdump/kinfo.c
> create mode 100644 mm/kmemdump/kmemdump.c
> create mode 100644 mm/kmemdump/kmemdump_coreimage.c
> create mode 100644 mm/kmemdump/qcom_minidump.c
>
> --
> 2.43.0
>
--
-Mukesh Ojha
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: [RFC][PATCH v3 00/16] Introduce kmemdump
2025-09-16 7:49 ` Mukesh Ojha
@ 2025-09-16 15:25 ` Luck, Tony
2025-09-16 15:27 ` Eugen Hristev
0 siblings, 1 reply; 42+ messages in thread
From: Luck, Tony @ 2025-09-16 15:25 UTC (permalink / raw)
To: Mukesh Ojha, Eugen Hristev
Cc: kees@kernel.org, gpiccoli@igalia.com,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, tglx@linutronix.de, andersson@kernel.org,
pmladek@suse.com, rdunlap@infradead.org, corbet@lwn.net,
david@redhat.com, mhocko@suse.com, tudor.ambarus@linaro.org,
linux-arm-kernel@lists.infradead.org,
linux-hardening@vger.kernel.org, jonechou@google.com,
rostedt@goodmis.org, linux-doc@vger.kernel.org,
devicetree@vger.kernel.org
> +Adding some pstore experts to bring this to their attention if this can
> be followed and if they find it useful.
Depends on the capabilities of the pstore backend. Some of them
(ERST, EFI variables) have tiny capacity (just a few kilobytes) so
well suited for saving the tail of the console log, but if the user specified
more than a couple of pages to be dumped using this mechanism, that
would exceed the persistent store capacity.
-Tony
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 00/16] Introduce kmemdump
2025-09-16 15:25 ` Luck, Tony
@ 2025-09-16 15:27 ` Eugen Hristev
0 siblings, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-16 15:27 UTC (permalink / raw)
To: Luck, Tony, Mukesh Ojha
Cc: kees@kernel.org, gpiccoli@igalia.com,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, tglx@linutronix.de, andersson@kernel.org,
pmladek@suse.com, rdunlap@infradead.org, corbet@lwn.net,
david@redhat.com, mhocko@suse.com, tudor.ambarus@linaro.org,
linux-arm-kernel@lists.infradead.org,
linux-hardening@vger.kernel.org, jonechou@google.com,
rostedt@goodmis.org, linux-doc@vger.kernel.org,
devicetree@vger.kernel.org
On 9/16/25 18:25, Luck, Tony wrote:
>> +Adding some pstore experts to bring this to their attention if this can
>> be followed and if they find it useful.
>
> Depends on the capabilities of the pstore backend. Some of them
> (ERST, EFI variables) have tiny capacity (just a few kilobytes) so
> well suited for saving the tail of the console log, but if the user specified
> more than a couple of pages to be dumped using this mechanism, that
> would exceed the persistent store capacity.
>
> -Tony
The backend can fully decide what to select from all the regions.
Some regions of interest are named (listed inside an enum with an ID),
and some have an incremental ID that is being assigned.
Either way, the backend can choose to ignore what is unwanted.
E.g. patch 15/16, where the kinfo driver selects just a few of the
regions which are of interest for Pixel debugging, the rest being ignored.
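As a rough userspace sketch of that kind of filtering in a small-capacity
backend: the IDs, struct region and backend_register() below are hypothetical
and do not match the real kmemdump API, they only show the idea of accepting a
few named regions and skipping everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region IDs: a few named ones, then dynamically assigned ones. */
enum {
	REGION_ID_COREIMAGE_ELF = 1,
	REGION_ID_VMCOREINFO,
	REGION_ID_PRINTK_BUF,
	REGION_ID_DYNAMIC_BASE = 100,
};

struct region {
	unsigned int id;
	const void *addr;
	size_t size;
};

/* IDs this backend cares about; every other region is ignored. */
static const unsigned int wanted[] = {
	REGION_ID_VMCOREINFO,
	REGION_ID_PRINTK_BUF,
};

static bool backend_wants(unsigned int id)
{
	size_t i;

	for (i = 0; i < sizeof(wanted) / sizeof(wanted[0]); i++)
		if (wanted[i] == id)
			return true;
	return false;
}

/*
 * Register callback of the backend. Returns the number of bytes accepted,
 * or 0 when the region is unwanted or would exceed the remaining capacity.
 */
static size_t backend_register(const struct region *r, size_t *capacity_left)
{
	assert(r && capacity_left);

	if (!backend_wants(r->id) || r->size > *capacity_left)
		return 0;

	*capacity_left -= r->size;
	return r->size;
}
```

A pstore backend with only a few kilobytes available could apply the same
pattern and simply refuse regions once its capacity is used up.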
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-12 15:08 ` [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static Eugen Hristev
@ 2025-09-16 21:10 ` Thomas Gleixner
2025-09-16 21:16 ` Thomas Gleixner
0 siblings, 1 reply; 42+ messages in thread
From: Thomas Gleixner @ 2025-09-16 21:10 UTC (permalink / raw)
To: Eugen Hristev, linux-arm-msm, linux-kernel, linux-mm, andersson,
pmladek, rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
On Fri, Sep 12 2025 at 18:08, Eugen Hristev wrote:
> nr_irqs is required for debugging the kernel, and needs to be
> accessible for kmemdump into vmcoreinfo.
That's a patently bad idea.
Care to grep how many instances of 'nr_irqs' variables are in the
kernel?
That name is way too generic to be made global.
Thanks,
tglx
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-16 21:10 ` Thomas Gleixner
@ 2025-09-16 21:16 ` Thomas Gleixner
2025-09-17 5:43 ` Eugen Hristev
0 siblings, 1 reply; 42+ messages in thread
From: Thomas Gleixner @ 2025-09-16 21:16 UTC (permalink / raw)
To: Eugen Hristev, linux-arm-msm, linux-kernel, linux-mm, andersson,
pmladek, rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree, Eugen Hristev
On Tue, Sep 16 2025 at 23:10, Thomas Gleixner wrote:
> On Fri, Sep 12 2025 at 18:08, Eugen Hristev wrote:
>> nr_irqs is required for debugging the kernel, and needs to be
>> accessible for kmemdump into vmcoreinfo.
>
> That's a patently bad idea.
>
> Care to grep how many instances of 'nr_irqs' variables are in the
> kernel?
>
> That name is way too generic to be made global.
Aside from that, there is _ZERO_ justification to expose variables globally,
which have been made file local with a lot of effort in the past.
I pointed you to a solution for that and just because David does not
like it does not mean that it's acceptable to fiddle in subsystems and expose
their carefully localized variables.
Thanks
tglx
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-16 21:16 ` Thomas Gleixner
@ 2025-09-17 5:43 ` Eugen Hristev
2025-09-17 7:16 ` David Hildenbrand
0 siblings, 1 reply; 42+ messages in thread
From: Eugen Hristev @ 2025-09-17 5:43 UTC (permalink / raw)
To: Thomas Gleixner, linux-arm-msm, linux-kernel, linux-mm, andersson,
pmladek, rdunlap, corbet, david, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/17/25 00:16, Thomas Gleixner wrote:
> On Tue, Sep 16 2025 at 23:10, Thomas Gleixner wrote:
>> On Fri, Sep 12 2025 at 18:08, Eugen Hristev wrote:
>>> nr_irqs is required for debugging the kernel, and needs to be
>>> accessible for kmemdump into vmcoreinfo.
>>
>> That's a patently bad idea.
>>
>> Care to grep how many instances of 'nr_irqs' variables are in the
>> kernel?
>>
>> That name is way too generic to be made global.
>
> Aside from that, there is _ZERO_ justification to expose variables globally,
> which have been made file local with a lot of effort in the past.
>
> I pointed you to a solution for that and just because David does not
> like it does not mean that it's acceptable to fiddle in subsystems and expose
> their carefully localized variables.
>
I agree. I explained the solution to David. He wanted to un-static
everything. I disagreed.
I implemented your idea in the v2 of the patch series.
Did you have a look at how it turned out?
Perhaps I can improve on that and make it more acceptable.
Eugen
> Thanks
>
> tglx
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 5:43 ` Eugen Hristev
@ 2025-09-17 7:16 ` David Hildenbrand
2025-09-17 14:10 ` Thomas Gleixner
0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-09-17 7:16 UTC (permalink / raw)
To: Eugen Hristev, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 17.09.25 07:43, Eugen Hristev wrote:
>
>
> On 9/17/25 00:16, Thomas Gleixner wrote:
>> On Tue, Sep 16 2025 at 23:10, Thomas Gleixner wrote:
>>> On Fri, Sep 12 2025 at 18:08, Eugen Hristev wrote:
>>>> nr_irqs is required for debugging the kernel, and needs to be
>>>> accessible for kmemdump into vmcoreinfo.
>>>
>>> That's a patently bad idea.
>>>
>>> Care to grep how many instances of 'nr_irqs' variables are in the
>>> kernel?
>>>
>>> That name is way too generic to be made global.
>>
>> Aside from that, there is _ZERO_ justification to expose variables globally,
>> which have been made file local with a lot of effort in the past.
>>
>> I pointed you to a solution for that and just because David does not
>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>> their carefully localized variables.
It would have been great if we could have had that discussion in the
previous thread.
I didn't like what I saw in v2. In particular, having subsystems fiddle
with kmemdump specifics.
I prefer if we can find a way to not have subsystems do that.
>>
>
> I agree. I explained the solution to David. He wanted to un-static
> everything. I disagreed.
Some other subsystem wants to have access to this information. I agree
that exposing these variables as r/w globally is not ideal.
I raised the alternative of exposing areas or other information through
simple helper functions that kmemdump can just use to compose whatever
it needs to compose.
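A minimal sketch of what such a helper could look like, in plain userspace C.
The names (nr_irqs_example, struct mem_area, example_get_nr_irqs_area()) are
hypothetical and only meant to show that the variable stays file local while a
consumer learns where it lives and how big it is.

```c
#include <stddef.h>

/* Hypothetical descriptor a dump core could consume. */
struct mem_area {
	const void *addr;
	size_t size;
};

/* In a real tree this prototype would live in a subsystem header. */
struct mem_area example_get_nr_irqs_area(void);

/* Stays file local: nobody outside this file can write to it by name. */
static unsigned int nr_irqs_example = 64;

/* Read-only helper exposing only the location and size of the data. */
struct mem_area example_get_nr_irqs_area(void)
{
	struct mem_area area = {
		.addr = &nr_irqs_example,
		.size = sizeof(nr_irqs_example),
	};

	return area;
}
```

The pointer handed out is const, so the compiler at least flags accidental
writes through it.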
Do we really need that .section thingy?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 7:16 ` David Hildenbrand
@ 2025-09-17 14:10 ` Thomas Gleixner
2025-09-17 14:26 ` Eugen Hristev
2025-09-17 14:46 ` David Hildenbrand
0 siblings, 2 replies; 42+ messages in thread
From: Thomas Gleixner @ 2025-09-17 14:10 UTC (permalink / raw)
To: David Hildenbrand, Eugen Hristev, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
> On 17.09.25 07:43, Eugen Hristev wrote:
>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>> I pointed you to a solution for that and just because David does not
>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>> their carefully localized variables.
>
> It would have been great if we could have had that discussion in the
> previous thread.
Sorry. I was busy with other stuff and did not pay attention to that
discussion.
> Some other subsystem wants to have access to this information. I agree
> that exposing these variables as r/w globally is not ideal.
It's a no-no in this case. We had bugs (long ago) where people fiddled
with this stuff (I assume accidentally, for my mental sanity's sake) and
caused really nasty-to-debug issues. C is a horrible language to
encapsulate stuff properly, as we all know.
> I raised the alternative of exposing areas or other information through
> simple helper functions that kmemdump can just use to compose whatever
> it needs to compose.
>
> Do we really need that .section thingy?
The section thing is simple and straightforward as it just puts the
annotated stuff into the section along with size and id, and I definitely
find that more palatable than sprinkling random functions all over the
place to register stuff.
Sure, you can achieve the same thing with an accessor function. In case
of nr_irqs there is already one: irq_get_nr_irqs(). But for places which
do not already expose the information for real functional reasons, adding
such helpers just for this coredump muck is really worse than having a
clearly descriptive and obvious annotation which results in the section
build.
The charm of sections is that they need neither extra code nor stubs nor
ifdeffery when a certain subsystem is disabled and therefore no
information is available.
I'm not insisting on sections, but having a table of 2k instead of
a hundred functions, stubs and whatever is definitely a win to me.
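For illustration only, a minimal userspace sketch of what such a
section-based annotation could look like. Every name here (struct
dump_desc, DUMP_ANNOTATE, the "dump_table" section) is made up for the
example and is not taken from the series. It assumes GCC or Clang on an
ELF target, where the linker provides __start_<name>/__stop_<name> for
sections whose name is a valid C identifier. The sketch stores pointers
to descriptors in the section rather than the descriptors themselves,
which is one possible layout, not necessarily the one intended above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: id, address and size of one annotated object. */
struct dump_desc {
	unsigned int id;
	const void *addr;
	size_t size;
};

/*
 * Hypothetical annotation. The descriptor stays file-local; only a pointer
 * to it is emitted into the "dump_table" section, so every section entry is
 * pointer-sized and the section can be walked as a plain array.
 */
#define DUMP_ANNOTATE(_id, _var)					\
	static const struct dump_desc dump_desc_##_var = {		\
		.id = (_id), .addr = &(_var), .size = sizeof(_var),	\
	};								\
	static const struct dump_desc *const dump_ptr_##_var		\
	__attribute__((used, section("dump_table"))) = &dump_desc_##_var

/* Provided by the linker for sections whose name is a valid C identifier. */
extern const struct dump_desc *const __start_dump_table[];
extern const struct dump_desc *const __stop_dump_table[];

/* Stand-ins for file-local subsystem variables; both remain static. */
static unsigned int demo_nr_irqs = 64;
DUMP_ANNOTATE(1, demo_nr_irqs);

static unsigned long demo_counters[4];
DUMP_ANNOTATE(2, demo_counters);

/* Number of annotated objects that ended up in the section. */
size_t dump_table_count(void)
{
	return (size_t)(__stop_dump_table - __start_dump_table);
}

/* Walk the section and return the descriptor with the given id, or NULL. */
const struct dump_desc *dump_table_find(unsigned int id)
{
	const struct dump_desc *const *p;

	for (p = __start_dump_table; p < __stop_dump_table; p++) {
		assert(*p && (*p)->size > 0);
		if ((*p)->id == id)
			return *p;
	}
	return NULL;
}
```

Note that the annotated variables stay static: nothing outside the file
can name them, yet a consumer walking the table still learns their
address and size. A file that is not built contributes no entries, so
no stubs are needed on the consumer side.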
Thanks,
tglx
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 14:10 ` Thomas Gleixner
@ 2025-09-17 14:26 ` Eugen Hristev
2025-09-17 14:46 ` David Hildenbrand
1 sibling, 0 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-17 14:26 UTC (permalink / raw)
To: Thomas Gleixner, David Hildenbrand, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/17/25 17:10, Thomas Gleixner wrote:
> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>> On 17.09.25 07:43, Eugen Hristev wrote:
>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>> I pointed you to a solution for that and just because David does not
>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>> their carefully localized variables.
>>
>> It would have been great if we could have had that discussion in the
>> previous thread.
>
> Sorry. I was busy with other stuff and did not pay attention to that
> discussion.
>
>> Some other subsystem wants to have access to this information. I agree
>> that exposing these variables as r/w globally is not ideal.
>
> It's a nono in this case. We had bugs (long ago) where people fiddled
> with this stuff (I assume accidentally for my mental sanity sake) and
> caused really nasty to debug issues. C is a horrible language to
> encapsulate stuff properly as we all know.
>
>> I raised the alternative of exposing areas or other information through
>> simple helper functions that kmemdump can just use to compose whatever
>> it needs to compose.
>>
>> Do we really need that .section thingy?
>
> The section thing is simple and straight forward as it just puts the
> annotated stuff into the section along with size and id and I definitely
> find that more palatable, than sprinkling random functions all over the
> place to register stuff.
+1 from my side.
>
> Sure, you can achieve the same thing with an accessor function. In case
> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
Not really. I cannot use this accessor function because it returns the
<value> of nr_irqs. To have this working with a debug tool, I need to
dump the actual memory where nr_irqs resides. This is because a debug
tool will not call any function or code; rather, it looks in the dump
for the location of the variable to find its value. And nr_irqs is not
in the coredump image unless it is itself registered with kmemdump.
So to make it work, the accessor would have to return a pointer to
nr_irqs, which is wrong. Returning a pointer to a static, outside of the
subsystem, is not right from my point of view.
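A small standalone sketch of the distinction, with made-up names (struct
region, demo_nr_irqs, region_covers) that are not part of the series: a
value accessor hands out a copy that lives at some other address, while
an offline tool resolves the symbol to its own address and expects to
find the bytes there. In the scheme discussed in this thread the region
description would come from an annotation inside the owning file; the
function returning it here only keeps the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: what ends up in the dump image. */
struct region {
	uintptr_t start;
	size_t size;
};

static unsigned int demo_nr_irqs = 128;	/* file-local, stand-in for nr_irqs */

/* Value accessor: the caller gets a copy that lives somewhere else. */
unsigned int demo_get_nr_irqs(void)
{
	return demo_nr_irqs;
}

/* Description produced inside the owning file: the object's own address and size. */
struct region demo_nr_irqs_region(void)
{
	struct region r = { (uintptr_t)&demo_nr_irqs, sizeof(demo_nr_irqs) };

	return r;
}

/* Would a tool that resolves a symbol to [addr, addr + len) find those bytes in r? */
int region_covers(const struct region *r, uintptr_t addr, size_t len)
{
	assert(r);
	return addr >= r->start && len <= r->size &&
	       addr - r->start <= r->size - len;
}
```

A region built around the copy returned by demo_get_nr_irqs() does not
cover the symbol's address, so a debugger looking up the symbol would
find nothing there, even though the value itself was available.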
> do not expose the information already for real functional reasons adding
> such helpers just for this coredump muck is really worse than having a
> clearly descriptive and obvious annotation which results in the section
> build.
>
> The charm of sections is that they need neither extra code nor stubs nor
> ifdeffery when a certain subsystem is disabled and therefore no
> information is available.
>
> I'm not insisting on sections, but having a table of 2k instead of
> hundred functions, stubs and whatever is definitely a win to me.
>
> Thanks,
>
> tglx
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 14:10 ` Thomas Gleixner
2025-09-17 14:26 ` Eugen Hristev
@ 2025-09-17 14:46 ` David Hildenbrand
2025-09-17 15:02 ` Eugen Hristev
1 sibling, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-09-17 14:46 UTC (permalink / raw)
To: Thomas Gleixner, Eugen Hristev, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 17.09.25 16:10, Thomas Gleixner wrote:
> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>> On 17.09.25 07:43, Eugen Hristev wrote:
>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>> I pointed you to a solution for that and just because David does not
>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>> their carefully localized variables.
>>
>> It would have been great if we could have had that discussion in the
>> previous thread.
>
> Sorry. I was busy with other stuff and did not pay attention to that
> discussion.
I understand, I'm busy with too much stuff such that sometimes it might
be good to interrupt me earlier: "David, nooo, you're all wrong"
>
>> Some other subsystem wants to have access to this information. I agree
>> that exposing these variables as r/w globally is not ideal.
>
> It's a nono in this case. We had bugs (long ago) where people fiddled
> with this stuff (I assume accidentally for my mental sanity sake) and
> caused really nasty to debug issues. C is a horrible language to
> encapsulate stuff properly as we all know.
Yeah, there is this ACCESS_PRIVATE stuff but it only works with structs
and relies on sparse IIRC.
>
>> I raised the alternative of exposing areas or other information through
>> simple helper functions that kmemdump can just use to compose whatever
>> it needs to compose.
>>
>> Do we really need that .section thingy?
>
> The section thing is simple and straight forward as it just puts the
> annotated stuff into the section along with size and id and I definitely
> find that more palatable, than sprinkling random functions all over the
> place to register stuff.
>
> Sure, you can achieve the same thing with an accessor function. In case
> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
Right, the challenge really is that we want the memory range covered by
that address, otherwise it would be easy.
> do not expose the information already for real functional reasons adding
> such helpers just for this coredump muck is really worse than having a
> clearly descriptive and obvious annotation which results in the section
> build.
Yeah, I'm mostly unhappy about the "#include <linux/kmemdump.h>" stuff.
Guess it would all feel less "kmemdump" specific if we would just have a
generic way to tag/describe certain physical memory areas and kmemdump
would simply make use of that.
For example, wondering if it could come in handy to have an ordinary
vmcoreinfo header contain this information as well?
Case in point, right now we do in crash_save_vmcoreinfo_init()
VMCOREINFO_SYMBOL_ARRAY(mem_section);
VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
VMCOREINFO_STRUCT_SIZE(mem_section);
And in kmemdump code we do
kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_mem_section,
(void *)&mem_section, sizeof(mem_section));
I guess both cases actually describe roughly the same information: An
area with a given name.
Note 1: Wondering if sizeof(mem_section) is actually correct in the
kmemdump case
Note 2: Wondering if kmemdump would also want the struct size, not just
the area length.
(memblock alloc wrappers are a separate discussion)
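The concern in Note 1 can be shown in isolation. The sketch below uses
stand-in names (demo_section, table_array, table_ptr) and assumes only
that the registered symbol may, in some configurations, be a pointer to
a table allocated at runtime rather than the table itself; whether that
applies to mem_section in a given config has to be checked against the
kernel source.

```c
#include <assert.h>
#include <stddef.h>

struct demo_section {			/* stand-in for struct mem_section */
	unsigned long map;
};

#define DEMO_ROOTS 8

/* Variant 1: the symbol is the table itself. */
struct demo_section table_array[DEMO_ROOTS];

/* Variant 2: the symbol is only a pointer to a table allocated at runtime. */
struct demo_section *table_ptr;

size_t span_of_array(void)
{
	return sizeof(table_array);	/* DEMO_ROOTS * sizeof(struct demo_section) */
}

size_t span_of_pointer(void)
{
	return sizeof(table_ptr);	/* size of one pointer, not of the table */
}

/* For variant 2 the length has to be computed from the element count. */
size_t span_of_allocation(size_t nr_entries)
{
	assert(nr_entries > 0);
	return nr_entries * sizeof(*table_ptr);
}
```

In the pointer variant, registering &symbol with sizeof(symbol) captures
only the pointer. The table it points to would need its own region,
sized from the number of entries.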
>
> The charm of sections is that they need neither extra code nor stubs nor
> ifdeffery when a certain subsystem is disabled and therefore no
> information is available.
Extra code is a very good point.
>
> I'm not insisting on sections, but having a table of 2k instead of
> hundred functions, stubs and whatever is definitely a win to me.
So far it looks like it's not that many, but of course, the question
would be how it evolves.
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 14:46 ` David Hildenbrand
@ 2025-09-17 15:02 ` Eugen Hristev
2025-09-17 15:18 ` David Hildenbrand
0 siblings, 1 reply; 42+ messages in thread
From: Eugen Hristev @ 2025-09-17 15:02 UTC (permalink / raw)
To: David Hildenbrand, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/17/25 17:46, David Hildenbrand wrote:
> On 17.09.25 16:10, Thomas Gleixner wrote:
>> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>>> On 17.09.25 07:43, Eugen Hristev wrote:
>>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>>> I pointed you to a solution for that and just because David does not
>>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>>> their carefully localized variables.
>>>
>>> It would have been great if we could have had that discussion in the
>>> previous thread.
>>
>> Sorry. I was busy with other stuff and did not pay attention to that
>> discussion.
>
> I understand, I'm busy with too much stuff such that sometimes it might
> be good to interrupt me earlier: "David, nooo, you're all wrong"
>
>>
>>> Some other subsystem wants to have access to this information. I agree
>>> that exposing these variables as r/w globally is not ideal.
>>
>> It's a nono in this case. We had bugs (long ago) where people fiddled
>> with this stuff (I assume accidentally for my mental sanity sake) and
>> caused really nasty to debug issues. C is a horrible language to
>> encapsulate stuff properly as we all know.
>
> Yeah, there is this ACCESS_PRIVATE stuff but it only works with structs
> and relies on sparse IIRC.
>
>>
>>> I raised the alternative of exposing areas or other information through
>>> simple helper functions that kmemdump can just use to compose whatever
>>> it needs to compose.
>>>
>>> Do we really need that .section thingy?
>>
>> The section thing is simple and straight forward as it just puts the
>> annotated stuff into the section along with size and id and I definitely
>> find that more palatable, than sprinkling random functions all over the
>> place to register stuff.
>>
>> Sure, you can achieve the same thing with an accessor function. In case
>> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
>
> Right, the challenge really is that we want the memory range covered by
> that address, otherwise it would be easy.
>
>> do not expose the information already for real functional reasons adding
>> such helpers just for this coredump muck is really worse than having a
>> clearly descriptive and obvious annotation which results in the section
>> build.
>
> Yeah, I'm mostly unhappy about the "#include <linux/kmemdump.h>" stuff.
>
> Guess it would all feel less "kmemdump" specific if we would just have a
> generic way to tag/describe certain physical memory areas and kmemdump
> would simply make use of that.
The idea was to make "kmemdump" exactly this generic way to tag/describe
the memory.
If we called it differently, simply "dump", would that be better?
E.g. include linux/dump.h
and then DUMP(var, size)?
Could we maybe call it MARK? Or TAG?
TAG_MEM(area, size)
This would go to a separate section called .tagged_memory.
Then anyone can walk through the section and collect the data.
I am just coming up with ideas here.
Could it even be part of mm.h instead of having a new header?
Then we wouldn't need to include one more.
>
> For example, wondering if it could come in handy to have an ordinary
> vmcoreinfo header contain this information as well?
>
> Case in point, right now we do in crash_save_vmcoreinfo_init()
>
> VMCOREINFO_SYMBOL_ARRAY(mem_section);
> VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> VMCOREINFO_STRUCT_SIZE(mem_section);
>
> And in kmemdump code we do
>
> kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_mem_section,
> (void *)&mem_section, sizeof(mem_section));
>
> I guess both cases actually describe roughly the same information: An
> area with a given name.
>
> Note 1: Wondering if sizeof(mem_section) is actually correct in the
> kmemdump case
>
> Note 2: Wondering if kmemdump would also want the struct size, not just
> the area length.
For kmemdump, right now, debugging without vmlinux symbols is practically
impossible, so we get all that information from vmlinux.
>
> (memblock alloc wrappers are a separate discussion)
>
>>
>> The charm of sections is that they need neither extra code nor stubs nor
>> ifdeffery when a certain subsystem is disabled and therefore no
>> information is available.
>
> Extra code is a very good point.
>
>>
>> I'm not insisting on sections, but having a table of 2k instead of
>> hundred functions, stubs and whatever is definitely a win to me.
>
> So far it looks like it's not that many, but of course, the question
> would be how it evolves.
>
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 15:02 ` Eugen Hristev
@ 2025-09-17 15:18 ` David Hildenbrand
2025-09-17 15:32 ` Eugen Hristev
2025-09-17 18:42 ` Thomas Gleixner
0 siblings, 2 replies; 42+ messages in thread
From: David Hildenbrand @ 2025-09-17 15:18 UTC (permalink / raw)
To: Eugen Hristev, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 17.09.25 17:02, Eugen Hristev wrote:
>
>
> On 9/17/25 17:46, David Hildenbrand wrote:
>> On 17.09.25 16:10, Thomas Gleixner wrote:
>>> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>>>> On 17.09.25 07:43, Eugen Hristev wrote:
>>>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>>>> I pointed you to a solution for that and just because David does not
>>>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>>>> their carefully localized variables.
>>>>
>>>> It would have been great if we could have had that discussion in the
>>>> previous thread.
>>>
>>> Sorry. I was busy with other stuff and did not pay attention to that
>>> discussion.
>>
>> I understand, I'm busy with too much stuff such that sometimes it might
>> be good to interrupt me earlier: "David, nooo, you're all wrong"
>>
>>>
>>>> Some other subsystem wants to have access to this information. I agree
>>>> that exposing these variables as r/w globally is not ideal.
>>>
>>> It's a nono in this case. We had bugs (long ago) where people fiddled
>>> with this stuff (I assume accidentally for my mental sanity sake) and
>>> caused really nasty to debug issues. C is a horrible language to
>>> encapsulate stuff properly as we all know.
>>
>> Yeah, there is this ACCESS_PRIVATE stuff but it only works with structs
>> and relies on sparse IIRC.
>>
>>>
>>>> I raised the alternative of exposing areas or other information through
>>>> simple helper functions that kmemdump can just use to compose whatever
>>>> it needs to compose.
>>>>
>>>> Do we really need that .section thingy?
>>>
>>> The section thing is simple and straight forward as it just puts the
>>> annotated stuff into the section along with size and id and I definitely
>>> find that more palatable, than sprinkling random functions all over the
>>> place to register stuff.
>>>
>>> Sure, you can achieve the same thing with an accessor function. In case
>>> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
>>
>> Right, the challenge really is that we want the memory range covered by
>> that address, otherwise it would be easy.
>>
>>> do not expose the information already for real functional reasons adding
>>> such helpers just for this coredump muck is really worse than having a
>>> clearly descriptive and obvious annotation which results in the section
>>> build.
>>
>> Yeah, I'm mostly unhappy about the "#include <linux/kmemdump.h>" stuff.
>>
>> Guess it would all feel less "kmemdump" specific if we would just have a
>> generic way to tag/describe certain physical memory areas and kmemdump
>> would simply make use of that.
>
> The idea was to make "kmemdump" exactly this generic way to tag/describe
> the memory.
That's probably where I got lost: after reading the cover letter I
assumed that this is primarily to program kmemdump backends, which I
understood to be just special hw/firmware areas, whereby kinfo acts as a
filter.
> If we would call it differently , simply dump , would it be better ?
> e.g. include linux/dump.h
> and then DUMP(var, size) ?
>
> could we call it maybe MARK ? or TAG ?
> TAG_MEM(area, size)
I'm wondering whether there could be any other user for this kind of
information.
Like R/O access in a debug kernel to these areas, exporting the
ranges/names + easy read access to content through debugfs or something.
Guess that partially falls under the "dump" category.
Including that information in vmcoreinfo would probably allow quickly
extracting some information even without the debug symbols around
(I run into that every now and then).
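As a rough sketch of what "name plus location as text" could look like:
the SYMBOL(name)=hexaddr line below follows the convention vmcoreinfo
already uses for symbols, while AREA_SIZE(...) is a key invented for this
example, as is the helper name note_named_area(). It is plain userspace
C and only illustrates the format, not how vmcoreinfo is built in the
kernel.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one named area as two text lines. SYMBOL(...) follows the existing
 * vmcoreinfo convention; AREA_SIZE(...) is a made-up key for this sketch.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int note_named_area(char *buf, size_t len, const char *name,
		    const void *addr, size_t size)
{
	int n;

	assert(buf && name && strlen(name) > 0);
	n = snprintf(buf, len, "SYMBOL(%s)=%" PRIxPTR "\nAREA_SIZE(%s)=%zu\n",
		     name, (uintptr_t)addr, name, size);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

With lines like these in the note, a tool could locate and size an area
by name without needing debug symbols.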
>
> this would go to a separate section called .tagged_memory.
>
Maybe just "tagged_memory.h" or sth. like that? I'm bad at naming, so I
would let others make better suggestions.
> Then anyone can walk through the section and collect the data.
>
> I am just coming up with ideas here.
> Could it be even part of mm.h instead of having a new header perhaps ?
> Then we won't need to include one more.
I don't really have anything against a new include, just against one that
sounds like a very specific subsystem rather than something more generic.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 15:18 ` David Hildenbrand
@ 2025-09-17 15:32 ` Eugen Hristev
2025-09-17 15:44 ` David Hildenbrand
2025-09-17 18:42 ` Thomas Gleixner
1 sibling, 1 reply; 42+ messages in thread
From: Eugen Hristev @ 2025-09-17 15:32 UTC (permalink / raw)
To: David Hildenbrand, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/17/25 18:18, David Hildenbrand wrote:
> On 17.09.25 17:02, Eugen Hristev wrote:
>>
>>
>> On 9/17/25 17:46, David Hildenbrand wrote:
>>> On 17.09.25 16:10, Thomas Gleixner wrote:
>>>> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>>>>> On 17.09.25 07:43, Eugen Hristev wrote:
>>>>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>>>>> I pointed you to a solution for that and just because David does not
>>>>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>>>>> their carefully localized variables.
>>>>>
>>>>> It would have been great if we could have had that discussion in the
>>>>> previous thread.
>>>>
>>>> Sorry. I was busy with other stuff and did not pay attention to that
>>>> discussion.
>>>
>>> I understand, I'm busy with too much stuff such that sometimes it might
>>> be good to interrupt me earlier: "David, nooo, you're all wrong"
>>>
>>>>
>>>>> Some other subsystem wants to have access to this information. I agree
>>>>> that exposing these variables as r/w globally is not ideal.
>>>>
>>>> It's a nono in this case. We had bugs (long ago) where people fiddled
>>>> with this stuff (I assume accidentally for my mental sanity sake) and
>>>> caused really nasty to debug issues. C is a horrible language to
>>>> encapsulate stuff properly as we all know.
>>>
>>> Yeah, there is this ACCESS_PRIVATE stuff but it only works with structs
>>> and relies on sparse IIRC.
>>>
>>>>
>>>>> I raised the alternative of exposing areas or other information through
>>>>> simple helper functions that kmemdump can just use to compose whatever
>>>>> it needs to compose.
>>>>>
>>>>> Do we really need that .section thingy?
>>>>
>>>> The section thing is simple and straight forward as it just puts the
>>>> annotated stuff into the section along with size and id and I definitely
>>>> find that more palatable, than sprinkling random functions all over the
>>>> place to register stuff.
>>>>
>>>> Sure, you can achieve the same thing with an accessor function. In case
>>>> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
>>>
>>> Right, the challenge really is that we want the memory range covered by
>>> that address, otherwise it would be easy.
>>>
>>>> do not expose the information already for real functional reasons adding
>>>> such helpers just for this coredump muck is really worse than having a
>>>> clearly descriptive and obvious annotation which results in the section
>>>> build.
>>>
>>> Yeah, I'm mostly unhappy about the "#include <linux/kmemdump.h>" stuff.
>>>
>>> Guess it would all feel less "kmemdump" specific if we would just have a
>>> generic way to tag/describe certain physical memory areas and kmemdump
>>> would simply make use of that.
>>
>> The idea was to make "kmemdump" exactly this generic way to tag/describe
>> the memory.
>
> That's probably where I got lost, after reading the cover letter
> assuming that this is primarily to program kmemdump backends, which I
> understood to just special hw/firmware areas, whereby kinfo acts as a
> filter.
If there is a mechanism to tag all this memory, or regions, into a
specific section, what would we do with it next?
Its purpose would be to be parsed and reused by different drivers
that are able to actually use it.
So there has to be some kind of middleman that holds onto this list
of regions, manages it (unique id, add/remove), and allows certain
drivers to use it.
Now it would be interesting to have different kinds of drivers connect
to it (or backends, as I called them).
One of these programs an internal table for the firmware to use.
Another writes information into a dedicated reserved-memory area for the
bootloader to use on the next soft reboot (memory preserved).
I called this middleman kmemdump. But it can be named differently, and
it can reside in different places in the kernel.
But what I would like to avoid is to just tag all this memory and have
any kind of driver connect to the table. That works, but it gives only
loose control over the table. E.g. no kmemdump, tag all the
memory into sections, and have specific drivers (that would reside where?)
walk it.
>
>> If we would call it differently , simply dump , would it be better ?
>> e.g. include linux/dump.h
>> and then DUMP(var, size) ?
>>
>> could we call it maybe MARK ? or TAG ?
>> TAG_MEM(area, size)
>
> I'm wondering whether there could be any other user for this kind of
> information.
>
> Like R/O access in a debug kernel to these areas, exporting the
> ranges/names + easy read access to content through debugfs or something.
One idea I had was to have a JTAG script read the table, parse it, and
know where some information resides.
Another idea is to use U-Boot in case of persistent memory across reboot:
U-Boot can read all the sections and assemble a ready-to-download
coredump. (Sure, this doesn't work in all cases.)
In the case of a hypervisor, a routine could be implemented there
that would read it in case the OS is non-responsive, or even in the
secure monitor.
Another suggestion I had from someone was to use a pure software default
backend that just keeps the regions stored, so that they could be
accessed from userspace or read by a crash analyzer.
>
> Guess that partially falls under the "dump" category.
>
> Including that information in a vmcore info would probably allow to
> quickly extract some information even without the debug symbols around
> (I run into that every now and then).
>
>>
>> this would go to a separate section called .tagged_memory.
>>
>
> Maybe just "tagged_memory.h" or sth. like that? I'm bad at naming, so I
> would let others make better suggestions.
>
>> Then anyone can walk through the section and collect the data.
>>
>> I am just coming up with ideas here.
>> Could it be even part of mm.h instead of having a new header perhaps ?
>> Then we won't need to include one more.
>
> I don't really have something against a new include, just not one that
> sounded like a very specific subsystem, not something more generic.
>
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 15:32 ` Eugen Hristev
@ 2025-09-17 15:44 ` David Hildenbrand
0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2025-09-17 15:44 UTC (permalink / raw)
To: Eugen Hristev, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 17.09.25 17:32, Eugen Hristev wrote:
>
>
> On 9/17/25 18:18, David Hildenbrand wrote:
>> On 17.09.25 17:02, Eugen Hristev wrote:
>>>
>>>
>>> On 9/17/25 17:46, David Hildenbrand wrote:
>>>> On 17.09.25 16:10, Thomas Gleixner wrote:
>>>>> On Wed, Sep 17 2025 at 09:16, David Hildenbrand wrote:
>>>>>> On 17.09.25 07:43, Eugen Hristev wrote:
>>>>>>> On 9/17/25 00:16, Thomas Gleixner wrote:
>>>>>>>> I pointed you to a solution for that and just because David does not
>>>>>>>> like it does not mean that it's acceptable to fiddle in subsystems and expose
>>>>>>>> their carefully localized variables.
>>>>>>
>>>>>> It would have been great if we could have had that discussion in the
>>>>>> previous thread.
>>>>>
>>>>> Sorry. I was busy with other stuff and did not pay attention to that
>>>>> discussion.
>>>>
>>>> I understand, I'm busy with too much stuff such that sometimes it might
>>>> be good to interrupt me earlier: "David, nooo, you're all wrong"
>>>>
>>>>>
>>>>>> Some other subsystem wants to have access to this information. I agree
>>>>>> that exposing these variables as r/w globally is not ideal.
>>>>>
>>>>> It's a nono in this case. We had bugs (long ago) where people fiddled
>>>>> with this stuff (I assume accidentally for my mental sanity sake) and
>>>>> caused really nasty to debug issues. C is a horrible language to
>>>>> encapsulate stuff properly as we all know.
>>>>
>>>> Yeah, there is this ACCESS_PRIVATE stuff but it only works with structs
>>>> and relies on sparse IIRC.
>>>>
>>>>>
>>>>>> I raised the alternative of exposing areas or other information through
>>>>>> simple helper functions that kmemdump can just use to compose whatever
>>>>>> it needs to compose.
>>>>>>
>>>>>> Do we really need that .section thingy?
>>>>>
>>>>> The section thing is simple and straight forward as it just puts the
>>>>> annotated stuff into the section along with size and id and I definitely
>>>>> find that more palatable, than sprinkling random functions all over the
>>>>> place to register stuff.
>>>>>
>>>>> Sure, you can achieve the same thing with an accessor function. In case
>>>>> of nr_irqs there is already one: irq_get_nr_irqs(), but for places which
>>>>
>>>> Right, the challenge really is that we want the memory range covered by
>>>> that address, otherwise it would be easy.
>>>>
>>>>> do not expose the information already for real functional reasons adding
>>>>> such helpers just for this coredump muck is really worse than having a
>>>>> clearly descriptive and obvious annotation which results in the section
>>>>> build.
>>>>
>>>> Yeah, I'm mostly unhappy about the "#include <linux/kmemdump.h>" stuff.
>>>>
>>>> Guess it would all feel less "kmemdump" specific if we would just have a
>>>> generic way to tag/describe certain physical memory areas and kmemdump
>>>> would simply make use of that.
>>>
>>> The idea was to make "kmemdump" exactly this generic way to tag/describe
>>> the memory.
>>
>> That's probably where I got lost, after reading the cover letter
>> assuming that this is primarily to program kmemdump backends, which I
>> understood to just special hw/firmware areas, whereby kinfo acts as a
>> filter.
>
> If there is a mechanism to tag all this memory, or regions, into a
> specific section, what we would do with it next ?
> It would have a purpose to be parsed and reused by different drivers,
> that would be able to actually use it.
> So there has a to be some kind of middleman, that holds onto this list
> of regions, manages it (unique id, add/remove), and allows certain
> drivers to use it.
Right, just someone that maintains the list and possibly allows
traversing the list and possibly getting notifications on add/remove.
> Now it would be interesting to have different kind of drivers connect to
> it (or backends how I called them).
> One of these programs an internal table for the firmware to use.
> Another , writes information into a dedicated reserved-memory for the
> bootloader to use on the next soft reboot (memory preserved).
> I called this middleman kmemdump. But it can be named differently, and
> it can reside in different places in the kernel.
> But what I would like to avoid is to just tag all this memory and have
> any kind of driver connect to the table. That works, but it's quite
> loose on having control over the table. E.g. no kmemdump, tag all the
> memory to sections, and have specific drivers (that would reside where?)
> walk it.
Yeah, you want just some simple "registry" with traversal+notification.
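To make "registry with traversal+notification" concrete, here is a
minimal userspace sketch. All names (named_region, registry_add,
registry_set_notifier, backend_track_bytes, ...) are hypothetical and
chosen for the example only. It uses a fixed-size table, a single
notifier and no locking, all of which a real kernel implementation would
have to handle differently.

```c
#include <assert.h>
#include <stddef.h>

#define REG_MAX 16

/* Hypothetical registry entry: a named memory area. */
struct named_region {
	const char *name;
	const void *addr;
	size_t size;
	int used;
};

enum reg_event { REG_ADD, REG_REMOVE };

/* Callback a backend installs to hear about add/remove. */
typedef void (*reg_notify_fn)(enum reg_event ev,
			      const struct named_region *r, void *ctx);

static struct named_region regions[REG_MAX];
static reg_notify_fn notify;
static void *notify_ctx;

void registry_set_notifier(reg_notify_fn fn, void *ctx)
{
	notify = fn;
	notify_ctx = ctx;
}

/* Returns the slot index used as id, or -1 if the table is full. */
int registry_add(const char *name, const void *addr, size_t size)
{
	int i;

	assert(name && addr && size > 0);
	for (i = 0; i < REG_MAX; i++) {
		if (regions[i].used)
			continue;
		regions[i] = (struct named_region){ name, addr, size, 1 };
		if (notify)
			notify(REG_ADD, &regions[i], notify_ctx);
		return i;
	}
	return -1;
}

/* Returns 0 on success, -1 if the id does not name a live region. */
int registry_remove(int id)
{
	if (id < 0 || id >= REG_MAX || !regions[id].used)
		return -1;
	if (notify)
		notify(REG_REMOVE, &regions[id], notify_ctx);
	regions[id].used = 0;
	return 0;
}

/* Traversal: call fn (if not NULL) for each live region; return how many. */
size_t registry_for_each(void (*fn)(const struct named_region *r, void *ctx),
			 void *ctx)
{
	size_t n = 0;
	int i;

	for (i = 0; i < REG_MAX; i++) {
		if (!regions[i].used)
			continue;
		if (fn)
			fn(&regions[i], ctx);
		n++;
	}
	return n;
}

/* Example backend: keeps a running total of registered bytes. */
void backend_track_bytes(enum reg_event ev, const struct named_region *r,
			 void *ctx)
{
	size_t *total = ctx;

	if (ev == REG_ADD)
		*total += r->size;
	else
		*total -= r->size;
}
```

A backend that programs a firmware table would do its work in the
notifier; one that only needs a snapshot (for example at panic time)
would use the traversal instead.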
>
>>
>>> If we would call it differently , simply dump , would it be better ?
>>> e.g. include linux/dump.h
>>> and then DUMP(var, size) ?
>>>
>>> could we call it maybe MARK ? or TAG ?
>>> TAG_MEM(area, size)
Just because I thought about it again, "named memory" could be an
alternative to "tagged memory".
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 15:18 ` David Hildenbrand
2025-09-17 15:32 ` Eugen Hristev
@ 2025-09-17 18:42 ` Thomas Gleixner
2025-09-17 19:03 ` David Hildenbrand
1 sibling, 1 reply; 42+ messages in thread
From: Thomas Gleixner @ 2025-09-17 18:42 UTC (permalink / raw)
To: David Hildenbrand, Eugen Hristev, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On Wed, Sep 17 2025 at 17:18, David Hildenbrand wrote:
> On 17.09.25 17:02, Eugen Hristev wrote:
>> On 9/17/25 17:46, David Hildenbrand wrote:
>>> On 17.09.25 16:10, Thomas Gleixner wrote:
>>>> Sorry. I was busy with other stuff and did not pay attention to that
>>>> discussion.
>>>
>>> I understand, I'm busy with too much stuff such that sometimes it might
>>> be good to interrupt me earlier: "David, nooo, you're all wrong"
I know that feeling.
>> The idea was to make "kmemdump" exactly this generic way to tag/describe
>> the memory.
>
> That's probably where I got lost, after reading the cover letter
> assuming that this is primarily to program kmemdump backends, which I
> understood to just special hw/firmware areas, whereby kinfo acts as a
> filter.
>
>> If we would call it differently , simply dump , would it be better ?
>> e.g. include linux/dump.h
>> and then DUMP(var, size) ?
>>
>> could we call it maybe MARK ? or TAG ?
>> TAG_MEM(area, size)
>
> I'm wondering whether there could be any other user for this kind of
> information.
>
> Like R/O access in a debug kernel to these areas, exporting the
> ranges/names + easy read access to content through debugfs or something.
>
> Guess that partially falls under the "dump" category.
I'd rather call it inspection.
> Including that information in a vmcore info would probably allow to
> quickly extract some information even without the debug symbols around
> (I run into that every now and then).
Correct.
>> this would go to a separate section called .tagged_memory.
That'd be confusing vs. actual memory tags, no?
> Maybe just "tagged_memory.h" or sth. like that? I'm bad at naming, so I
> would let others make better suggestions.
inspect.h :)
I'm going to use 'inspect' as prefix for the thoughts below, but that's
obviously subject to s/inspect/$BETTERNAME/g :)
>> Then anyone can walk through the section and collect the data.
>>
>> I am just coming up with ideas here.
>> Could it be even part of mm.h instead of having a new header perhaps ?
>> Then we won't need to include one more.
>
> I don't really have something against a new include, just not one that
> sounded like a very specific subsystem, not something more generic.
Right. We really don't want to have five different mechanisms for five
infrastructures which all allow inspecting kernel memory (live or
dead) in one way or the other. The difference between them is mostly:
- Which subset of the information they expose for inspection
- The actual exposure mechanism: crash dump, firmware storage,
run-time snapshots in a filesystem, ....
Having one shared core infrastructure to expose data to those mechanisms
makes everyone's life simpler.
That obviously needs to collect the superset of data, but that's just a
bit more memory consumed. That's arguably significantly smaller than
supporting a zoo of mechanisms to register data for different
infrastructures.
I'm quite sure that at least a substantial amount of the required
information can be collected at compile time in special section
tables. The rest can be collected in runtime tables, which have the same
format as the compile time section tables to avoid separate parsers.
Let me just float some ideas here on what that might look like. It might
be completely impractical, but then it might be at least fodder for
thoughts.
As this is specific for the compiled kernel version you can define an
extensible struct format for the table.
	struct inspect_entry {
		unsigned long	properties;
		unsigned int	type;
		unsigned int	id;
		const char	name[$MAX_NAME_LEN];
		unsigned long	address;
		unsigned long	length;
		....
	};
@type
refers either to a table with type information, which describes
the struct in some way or just generate a detached compile time
description.
@id
a unique id created at compile time or via registration at
runtime. Might not be required
@name:
Name of the memory region. That might go into a separate table
which is referenced by @id, but that's up for debate.
@address:
@length:
obvious :)
...
Whatever a particular consumer might require
@properties:
A "bitfield", which allows to mark this entry as (in)valid for a
particular consumer.
That obviously requires to modify these properties when the
requirements of a consumer change, new consumers arrive or new
producers are added, but I think it's easier to do that at the
producer side than maintaining filters on all consumer ends
forever.
Though I might be wrong as usual. IOW this needs some thoughts. :)
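
A minimal sketch of how one such entry could be filled in, written as
plain userspace C purely for illustration. INSPECT_MAX_NAME_LEN, the
INSPECT_PROP_* bits and both helper functions are assumptions made up
for this sketch and are not part of the series. The name field is
non-const here so that it can be filled at runtime.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical limit; the proposal only says $MAX_NAME_LEN. */
#define INSPECT_MAX_NAME_LEN 32

/* Hypothetical per-consumer validity bits for @properties. */
#define INSPECT_PROP_CRASHDUMP (1UL << 0)
#define INSPECT_PROP_FIRMWARE  (1UL << 1)

struct inspect_entry {
	unsigned long properties;
	unsigned int type;
	unsigned int id;
	char name[INSPECT_MAX_NAME_LEN];
	unsigned long address;
	unsigned long length;
};

/* Build one entry describing @len bytes at @ptr; @type stays zero. */
static struct inspect_entry inspect_describe(const char *name, unsigned int id,
					     const void *ptr, unsigned long len,
					     unsigned long props)
{
	struct inspect_entry e = { 0 };

	assert(name != NULL);
	e.properties = props;
	e.id = id;
	/* snprintf() truncates over-long names and always NUL-terminates. */
	snprintf(e.name, sizeof(e.name), "%s", name);
	e.address = (unsigned long)ptr;
	e.length = len;
	return e;
}

/* Non-zero if the entry is marked valid for the consumer bit @prop. */
static int inspect_valid_for(const struct inspect_entry *e, unsigned long prop)
{
	return (e->properties & prop) != 0;
}
```

A caller would describe, say, a counter with
inspect_describe("counter", 1, &counter, sizeof(counter),
INSPECT_PROP_FIRMWARE), and a firmware consumer would then test the
entry with inspect_valid_for().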
The interesting engineering challenge with such a scheme is to come up
with an annotation mechanism which is extensible.
Runtime is trivial as it just needs to fill in the new field in the
datastructure and all other runtime users have that zero
initialized automatically, if you get the mechanism correct in the
first place. Think in templates :)
Compile time is a bit more effort, but that should be solvable with
key/value pairs.
Don't even waste a thought about creating the final tables and
sections in macro magic. All the annotation macros have to do is to
emit the pairs in a structured way into discardable sections.
Those sections are then converted in post processing into the actual
section table formats and added to the kernel image. Not a
spectacular new concept. The kernel build does this already today.
Just keep the compile time annotation macro magic simple and
stupid. It can waste 10k per entry at compile time and then let
postprocessing worry about downsizing and consolidation. Nothing to
see here :)
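
To make the "same format, one parser" point a bit more concrete, here is
a rough userspace sketch. All names in it (inspect_register,
inspect_walk, INSPECT_RUNTIME_MAX, the sample entries) are hypothetical,
and a static const array merely stands in for whatever table the build
post-processing would generate. It also shows the zero-initialization
property: the template passed to inspect_register() never mentions the
"flags" field, yet the stored entry has it set to zero.

```c
#include <assert.h>
#include <stddef.h>

struct inspect_entry {
	unsigned int type;
	unsigned int id;
	const char *name;
	unsigned long address;
	unsigned long length;
	unsigned long flags;	/* a "new" field that older users never set */
};

/* Stand-in for the table the build would generate (made-up values). */
static const struct inspect_entry inspect_static_tbl[] = {
	{ .id = 1, .name = "banner",  .address = 0x1000, .length = 64 },
	{ .id = 2, .name = "nr_irqs", .address = 0x2000, .length = 4 },
};

#define INSPECT_RUNTIME_MAX 16
static struct inspect_entry inspect_runtime_tbl[INSPECT_RUNTIME_MAX];
static size_t inspect_runtime_cnt;

/* Copy a caller-provided template; fields it did not set stay zero. */
static int inspect_register(const struct inspect_entry *tmpl)
{
	assert(tmpl != NULL);
	if (inspect_runtime_cnt >= INSPECT_RUNTIME_MAX)
		return -1;
	inspect_runtime_tbl[inspect_runtime_cnt++] = *tmpl;
	return 0;
}

typedef void (*inspect_fn)(const struct inspect_entry *e, void *ctx);

/* One walker serves both tables because the entry layout is identical. */
static void inspect_walk(const struct inspect_entry *tbl, size_t n,
			 inspect_fn fn, void *ctx)
{
	for (size_t i = 0; i < n; i++)
		fn(&tbl[i], ctx);
}

static void inspect_sum_length(const struct inspect_entry *e, void *ctx)
{
	*(unsigned long *)ctx += e->length;
}

/* Total bytes described by the static plus the runtime entries. */
static unsigned long inspect_total_length(void)
{
	unsigned long total = 0;

	inspect_walk(inspect_static_tbl,
		     sizeof(inspect_static_tbl) / sizeof(inspect_static_tbl[0]),
		     inspect_sum_length, &total);
	inspect_walk(inspect_runtime_tbl, inspect_runtime_cnt,
		     inspect_sum_length, &total);
	return total;
}
```

The sketch has no locking and no removal path; both would be needed in
anything real.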
Hope that helps.
Thanks,
tglx
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 18:42 ` Thomas Gleixner
@ 2025-09-17 19:03 ` David Hildenbrand
2025-09-18 8:23 ` Thomas Gleixner
0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-09-17 19:03 UTC (permalink / raw)
To: Thomas Gleixner, Eugen Hristev, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
>
>>> this would go to a separate section called .tagged_memory.
>
> That'd be confusing vs. actual memory tags, no?
Yeah, I came to that conclusion just after an upstream call we just had
about that topic (bi-weekly MM alignment session).
I'm open to any suggestions that make it more generic. My first
instinct was "named memory regions".
>
>> Maybe just "tagged_memory.h" or sth. like that? I'm bad at naming, so I
>> would let others make better suggestions.
>
> inspect.h :)
>
> I'm going to use 'inspect' as prefix for the thoughts below, but that's
> obviously subject to s/inspect/$BETTERNAME/g :)
>
>>> Then anyone can walk through the section and collect the data.
>>>
>>> I am just coming up with ideas here.
>>> Could it be even part of mm.h instead of having a new header perhaps ?
>>> Then we won't need to include one more.
>>
>> I don't really have something against a new include, just not one that
>> sounded like a very specific subsystem, not something more generic.
>
> Right. We really don't want to have five different mechanisms for five
> infrastructures which all allow inspecting kernel memory (live or
> dead) in one way or the other. The difference between them is mostly:
>
> - Which subset of the information they expose for inspection
>
> - The actual exposure mechanism: crash dump, firmware storage,
> run-time snapshots in a filesystem, ....
>
> Having one shared core infrastructure to expose data to those mechanisms
> makes everyone's life simpler.
>
> That obviously needs to collect the superset of data, but that's just a
> bit more memory consumed. That's arguably significantly smaller than
> supporting a zoo of mechanisms to register data for different
> infrastructures.
>
> I'm quite sure that at least a substantial amount of the required
> information can be collected at compile time in special section
> tables. The rest can be collected in runtime tables, which have the same
> format as the compile time section tables to avoid separate parsers.
>
> Let me just float some ideas here on what that might look like. It might
> be completely impractical, but then it might be at least fodder for
> thoughts.
Thanks a bunch for writing all that down!
>
> As this is specific for the compiled kernel version you can define an
> extensible struct format for the table.
>
> struct inspect_entry {
> unsigned long properties;
> unsigned int type;
> unsigned int id;
> const char name[$MAX_NAME_LEN];
> unsigned long address;
> unsigned long length;
> ....
> };
>
> @type
> refers either to a table with type information, which describes
> the struct in some way or just generate a detached compile time
> description.
>
> @id
> a unique id created at compile time or via registration at
> runtime. Might not be required
We discussed that maybe one would want some kind of a "class"
description. For example we might have to register one pgdat area per
node. Giving each one a unique name might be impractical / unreasonable.
Still, someone would want to select / filter out all entries of the same
"class".
Just a thought.
>
> @name:
> Name of the memory region. That might go into a separate table
> which is referenced by @id, but that's up for debate.
Jup.
>
> @address:
> @length:
> obvious :)
>
> ...
> Whatever a particular consumer might require
>
> @properties:
>
> A "bitfield", which allows to mark this entry as (in)valid for a
> particular consumer.
>
> That obviously requires to modify these properties when the
> requirements of a consumer change, new consumers arrive or new
> producers are added, but I think it's easier to do that at the
> producer side than maintaining filters on all consumer ends
> forever.
Question would be if that is not up to a consumer to decide ("allowlist"
/ filter) by class or id, stored elsewhere.
>
> Though I might be wrong as usual. IOW this needs some thoughts. :)
>
> The interesting engineering challenge with such a scheme is to come up
> with an annotation mechanism which is extensible.
>
> Runtime is trivial as it just needs to fill in the new field in the
> datastructure and all other runtime users have that zero
> initialized automatically, if you get the mechanism correct in the
> first place. Think in templates :)
>
> Compile time is a bit more effort, but that should be solvable with
> key/value pairs.
>
> Don't even waste a thought about creating the final tables and
> sections in macro magic. All the annotation macros have to do is to
> emit the pairs in a structured way into discardable sections.
>
> Those sections are then converted in post processing into the actual
> section table formats and added to the kernel image. Not a
> spectacular new concept. The kernel build does this already today.
>
> Just keep the compile time annotation macro magic simple and
> stupid. It can waste 10k per entry at compile time and then let
> postprocessing worry about downsizing and consolidation. Nothing to
> see here :)
Sounds interesting!
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-17 19:03 ` David Hildenbrand
@ 2025-09-18 8:23 ` Thomas Gleixner
2025-09-18 13:53 ` Eugen Hristev
0 siblings, 1 reply; 42+ messages in thread
From: Thomas Gleixner @ 2025-09-18 8:23 UTC (permalink / raw)
To: David Hildenbrand, Eugen Hristev, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On Wed, Sep 17 2025 at 21:03, David Hildenbrand wrote:
>> As this is specific for the compiled kernel version you can define an
>> extensible struct format for the table.
>>
>> struct inspect_entry {
>> unsigned long properties;
>> unsigned int type;
>> unsigned int id;
>> const char name[$MAX_NAME_LEN];
>> unsigned long address;
>> unsigned long length;
>> ....
>> };
>>
>> @type
>> refers either to a table with type information, which describes
>> the struct in some way or just generate a detached compile time
>> description.
>>
>> @id
>> a unique id created at compile time or via registration at
>> runtime. Might not be required
>
> We discussed that maybe one would want some kind of a "class"
> description. For example we might have to register one pgdat area per
> node. Giving each one a unique name might be impractical / unreasonable.
>
> Still, someone would want to select / filter out all entries of the same
> "class".
>
> Just a thought.
Right. As I said, this was mostly an insta brain dump to start a
discussion. Seems it worked :)
>> @properties:
>>
>> A "bitfield", which allows to mark this entry as (in)valid for a
>> particular consumer.
>>
>> That obviously requires to modify these properties when the
>> requirements of a consumer change, new consumers arrive or new
>> producers are added, but I think it's easier to do that at the
>> producer side than maintaining filters on all consumer ends
>> forever.
>
> Question would be if that is not up to a consumer to decide ("allowlist"
> / filter) by class or id, stored elsewhere.
Yes, I looked at it the wrong way round. We should leave the filtering
to the consumers. If you use allow lists, then a newly introduced class
won't be automatically exposed everywhere.
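
A small userspace sketch of what consumer-side filtering with an allow
list could look like. The class names, the bitmask representation and
the helper names are all assumptions for illustration; the thread only
mentions pgdat as an example of a class. The point it tries to show is
that a class added later (the IRQ class here) is ignored by a consumer
until that consumer opts in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classes; only "pgdat" is mentioned in the discussion. */
enum inspect_class {
	INSPECT_CLASS_PGDAT,
	INSPECT_CLASS_PRINTK,
	INSPECT_CLASS_IRQ,	/* imagine this one was introduced later */
	INSPECT_CLASS_MAX,
};

struct inspect_entry {
	enum inspect_class cls;
	unsigned long address;
	unsigned long length;
};

/* Each consumer carries its own allow list, one bit per class. */
struct inspect_consumer {
	const char *name;
	unsigned long allow;
};

#define INSPECT_ALLOW(cls) (1UL << (cls))

/* Non-zero if this consumer opted in to the entry's class. */
static int inspect_consumer_accepts(const struct inspect_consumer *c,
				    const struct inspect_entry *e)
{
	assert(e->cls < INSPECT_CLASS_MAX);
	return (c->allow & INSPECT_ALLOW(e->cls)) != 0;
}

/* Count the entries of tbl[0..n) that the consumer would pick up. */
static size_t inspect_consumer_count(const struct inspect_consumer *c,
				     const struct inspect_entry *tbl, size_t n)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (inspect_consumer_accepts(c, &tbl[i]))
			hits++;
	return hits;
}
```

With this shape the producers only state what an entry is, and the
table itself carries no per-consumer knowledge.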
Thanks,
tglx
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-18 8:23 ` Thomas Gleixner
@ 2025-09-18 13:53 ` Eugen Hristev
2025-09-18 18:43 ` Randy Dunlap
2025-09-25 20:11 ` David Hildenbrand
0 siblings, 2 replies; 42+ messages in thread
From: Eugen Hristev @ 2025-09-18 13:53 UTC (permalink / raw)
To: Thomas Gleixner, David Hildenbrand, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/18/25 11:23, Thomas Gleixner wrote:
> On Wed, Sep 17 2025 at 21:03, David Hildenbrand wrote:
>>> As this is specific for the compiled kernel version you can define an
>>> extensible struct format for the table.
>>>
>>> struct inspect_entry {
>>> unsigned long properties;
>>> unsigned int type;
>>> unsigned int id;
>>> const char name[$MAX_NAME_LEN];
>>> unsigned long address;
>>> unsigned long length;
>>> ....
>>> };
>>>
>>> @type
>>> refers either to a table with type information, which describes
>>> the struct in some way or just generate a detached compile time
>>> description.
>>>
>>> @id
>>> a unique id created at compile time or via registration at
>>> runtime. Might not be required
>>
>> We discussed that maybe one would want some kind of a "class"
>> description. For example we might have to register one pgdat area per
>> node. Giving each one a unique name might be impractical / unreasonable.
>>
>> Still, someone would want to select / filter out all entries of the same
>> "class".
>>
>> Just a thought.
>
> Right. As I said, this was mostly an insta brain dump to start a
> discussion. Seems it worked :)
>
>>> @properties:
>>>
>>> A "bitfield", which allows to mark this entry as (in)valid for a
>>> particular consumer.
>>>
>>> That obviously requires to modify these properties when the
>>> requirements of a consumer change, new consumers arrive or new
>>> producers are added, but I think it's easier to do that at the
>>> producer side than maintaining filters on all consumer ends
>>> forever.
>>
>> Question would be if that is not up to a consumer to decide ("allowlist"
>> / filter) by class or id, stored elsewhere.
>
> Yes, I looked at it the wrong way round. We should leave the filtering
> to the consumers. If you use allow lists, then a newly introduced class
> won't be automatically exposed everywhere.
>
> Thanks,
>
> tglx
So, one direction to follow from this discussion is to have the
inspection entry and inspection table for all these entries.
Now, one burning question open for debate is: should this reside in mm?
mm/inspect.h would have to define the inspection entry struct, and some
macros to help everyone add an inspection entry.
E.g. INSPECTION_ENTRY(my ptr, my size);
and this would be used all over the kernel wherever folks want to
register something.
Now the second part is, where to keep all the inspection drivers?
Would it make sense to have mm/inspection/inspection_helpers.h, which
would keep the table start/end and some macros to traverse the tables,
and which would be included by the inspection drivers?
Inspection drivers would then probe via any mechanism, and tap into the
inspection table.
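
As a rough illustration of "table start/end plus traversal macros", here
is a userspace sketch. It assumes an ELF toolchain with GNU ld or lld,
which provide __start_<section> and __stop_<section> symbols for
sections whose names are valid C identifiers. The macro takes a variable
rather than a (ptr, size) pair so that it can derive a unique symbol
name; that choice, the section name "inspect_tbl", the helper names and
the sample variables are all assumptions of this sketch. It builds the
table directly from the macro, which is the simple variant, not the
key/value plus post-processing scheme suggested earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct inspect_entry {
	const char *name;
	const void *address;
	unsigned long length;
};

/*
 * Hypothetical annotation macro: describe @var and place a pointer to
 * the description into the "inspect_tbl" section. Storing pointers
 * keeps the section a plain array of pointer-sized slots.
 */
#define INSPECTION_ENTRY(var)						\
	static const struct inspect_entry inspect_desc_##var = {	\
		.name = #var,						\
		.address = &(var),					\
		.length = sizeof(var),					\
	};								\
	static const struct inspect_entry *const inspect_ptr_##var	\
	__attribute__((used, section("inspect_tbl"))) = &inspect_desc_##var

/* Table bounds, provided by the linker (GNU ld / lld on ELF). */
extern const struct inspect_entry *const __start_inspect_tbl[];
extern const struct inspect_entry *const __stop_inspect_tbl[];

#define for_each_inspect_entry(pos)					\
	for ((pos) = __start_inspect_tbl; (pos) < __stop_inspect_tbl; (pos)++)

/* Two sample producers. */
static int sample_counter;
static char sample_banner[64];

INSPECTION_ENTRY(sample_counter);
INSPECTION_ENTRY(sample_banner);

/* What an inspection driver would do: walk the table and pick entries. */
static const struct inspect_entry *inspect_find(const char *name)
{
	const struct inspect_entry *const *pos;

	assert(name != NULL);
	for_each_inspect_entry(pos)
		if (strcmp((*pos)->name, name) == 0)
			return *pos;
	return NULL;
}

static size_t inspect_count(void)
{
	return (size_t)(__stop_inspect_tbl - __start_inspect_tbl);
}
```

A driver including such a helper header would only ever see the bounds
and the iterator, never the individual annotation sites.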
I am thinking that my model with a single backend can be enhanced by
allowing any inspection driver to access it. And further on, each
inspection driver would register a notifier to be called when an entry
is being created or not. This would mean N possible drivers connected to
the table at the same time (if that would make sense...).
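
A sketch of the N-drivers idea, again as self-contained userspace C. In
the kernel this would more likely build on the existing notifier chains
from include/linux/notifier.h; the hand-rolled list, the event names and
the counting driver below are all made up for illustration, and the
sketch has no locking.

```c
#include <assert.h>
#include <stddef.h>

struct inspect_entry {
	const char *name;
	unsigned long address;
	unsigned long length;
};

enum inspect_event {
	INSPECT_ENTRY_ADDED,
	INSPECT_ENTRY_REMOVED,
};

struct inspect_notifier {
	void (*notify)(struct inspect_notifier *nb, enum inspect_event ev,
		       const struct inspect_entry *e);
	struct inspect_notifier *next;
};

static struct inspect_notifier *inspect_notifiers;

/* Any number of drivers can hook into the same table. */
static void inspect_notifier_register(struct inspect_notifier *nb)
{
	assert(nb != NULL && nb->notify != NULL);
	nb->next = inspect_notifiers;
	inspect_notifiers = nb;
}

/* Called by the core whenever an entry is added to or removed from the table. */
static void inspect_notify_all(enum inspect_event ev,
			       const struct inspect_entry *e)
{
	for (struct inspect_notifier *nb = inspect_notifiers; nb; nb = nb->next)
		nb->notify(nb, ev, e);
}

/* Example driver: merely tracks how many entries are currently live. */
struct counting_driver {
	struct inspect_notifier nb;	/* first member, so nb casts back */
	int live;
};

static void counting_notify(struct inspect_notifier *nb, enum inspect_event ev,
			    const struct inspect_entry *e)
{
	struct counting_driver *drv = (struct counting_driver *)nb;

	(void)e;
	drv->live += (ev == INSPECT_ENTRY_ADDED) ? 1 : -1;
}
```

A driver that registers late would miss earlier events, so a real
implementation would probably replay the existing entries at
registration time.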
Would it make sense for pstore to have an inspection driver that would
be connected here to get different kinds of stuff?
Would it make sense to have some debugfs driver that would just expose
different regions to user space? Perhaps something similar to
/proc/kcore, but covering only the exposed inspection entries rather
than the whole kernel memory.
Now, for the dynamic memory, e.g. memblock_alloc and friends,
would it be interesting to have a flag, e.g. MEMBLOCK_INSPECT, that would
be used when calling it, and in the background, this would request an
inspection_entry being created? Or does it make more sense to call some
function like inspect_register as a separate call directly at the
allocation point?
Feel free to throw your opinion at each of the above.
Thanks for helping out!
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-18 13:53 ` Eugen Hristev
@ 2025-09-18 18:43 ` Randy Dunlap
2025-09-25 20:11 ` David Hildenbrand
1 sibling, 0 replies; 42+ messages in thread
From: Randy Dunlap @ 2025-09-18 18:43 UTC (permalink / raw)
To: Eugen Hristev, Thomas Gleixner, David Hildenbrand, linux-arm-msm,
linux-kernel, linux-mm, andersson, pmladek, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 9/18/25 6:53 AM, Eugen Hristev wrote:
>
>
> So, one direction to follow from this discussion is to have the
> inspection entry and inspection table for all these entries.
> Now, one burning question open for debate, is, should this reside into mm ?
> mm/inspect.h would have to define the inspection entry struct, and some
> macros to help everyone add an inspection entry.
> E.g. INSPECTION_ENTRY(my ptr, my size);
> and this would be used all over the kernel wherever folks want to
> register something.
> Now the second part is, where to keep all the inspection drivers ?
> Would it make sense to have mm/inspection/inspection_helpers.h which
> would keep the table start/end, some macros to traverse the tables, and
> this would be included by the inspection drivers.
> inspection drivers would then probe via any mechanism, and tap into the
> inspection table.
Surely someone wants to inspect more than mm/ variables.
I prefer kernel/inspect/ etc.
> I am thinking that my model with a single backend can be enhanced by
> allowing any inspection driver to access it. And further on, each
> inspection driver would register a notifier to be called when an entry
> is being created or not. This would mean N possible drivers connected to
> the table at the same time. ( if that would make sense...)
> Would it make sense for pstore to have an inspection driver that would
> be connected here to get different kinds of stuff ?
> Would it make sense to have some debugfs driver that would just expose
> to user space different regions ? Perhaps something similar with
> /proc/kcore but not the whole kernel memory rather only the exposed
> inspection entries.
> Now, for the dynamic memory, e.g. memblock_alloc and friends ,
> would it be interesting to have a flag e.g. MEMBLOCK_INSPECT, that would
> be used when calling it, and in the background, this would request an
> inspection_entry being created ? Or it makes more sense to call some
> function like inspect_register as a different call directly at the
> allocation point ?
>
> Feel free to throw your opinion at each of the above.
> Thanks for helping out !
In general I like the way that this is going.
Thanks to all of you for this discussion.
--
~Randy
* Re: [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver
2025-09-12 15:08 ` [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver Eugen Hristev
2025-09-16 5:48 ` Alexey Klimov
@ 2025-09-22 10:01 ` Tudor Ambarus
1 sibling, 0 replies; 42+ messages in thread
From: Tudor Ambarus @ 2025-09-22 10:01 UTC (permalink / raw)
To: Eugen Hristev, linux-arm-msm, linux-kernel, linux-mm, tglx,
andersson, pmladek, rdunlap, corbet, david, mhocko
Cc: mukesh.ojha, linux-arm-kernel, linux-hardening, jonechou, rostedt,
linux-doc, devicetree
Hi,
On 9/12/25 4:08 PM, Eugen Hristev wrote:
> Add Kinfo backend driver.
> This backend driver will select only regions of interest for the firmware,
> and it copy those into a shared memory area that is supplied via OF.
> The firmware is only interested in addresses for some symbols.
> The list format is kinfo-compatible, with devices like Google Pixel phone.
>
> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
> ---
> MAINTAINERS | 5 +
> mm/kmemdump/Kconfig.debug | 13 ++
> mm/kmemdump/Makefile | 1 +
> mm/kmemdump/kinfo.c | 293 ++++++++++++++++++++++++++++++++++++++
> 4 files changed, 312 insertions(+)
> create mode 100644 mm/kmemdump/kinfo.c
I tested the series on a Pixel 6 and could see the backtraces correctly
decoded by the bootloader:
Tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Thanks!
* Re: [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static
2025-09-18 13:53 ` Eugen Hristev
2025-09-18 18:43 ` Randy Dunlap
@ 2025-09-25 20:11 ` David Hildenbrand
1 sibling, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2025-09-25 20:11 UTC (permalink / raw)
To: Eugen Hristev, Thomas Gleixner, linux-arm-msm, linux-kernel,
linux-mm, andersson, pmladek, rdunlap, corbet, mhocko
Cc: tudor.ambarus, mukesh.ojha, linux-arm-kernel, linux-hardening,
jonechou, rostedt, linux-doc, devicetree
On 18.09.25 15:53, Eugen Hristev wrote:
>
>
> On 9/18/25 11:23, Thomas Gleixner wrote:
>> On Wed, Sep 17 2025 at 21:03, David Hildenbrand wrote:
>>>> As this is specific for the compiled kernel version you can define an
>>>> extensible struct format for the table.
>>>>
>>>> struct inspect_entry {
>>>> unsigned long properties;
>>>> unsigned int type;
>>>> unsigned int id;
>>>> const char name[$MAX_NAME_LEN];
>>>> unsigned long address;
>>>> unsigned long length;
>>>> ....
>>>> };
>>>>
>>>> @type
>>>> refers either to a table with type information, which describes
>>>> the struct in some way or just generate a detached compile time
>>>> description.
>>>>
>>>> @id
>>>> a unique id created at compile time or via registration at
>>>> runtime. Might not be required
>>>
>>> We discussed that maybe one would want some kind of a "class"
>>> description. For example we might have to register one pgdat area per
>>> node. Giving each one a unique name might be impractical / unreasonable.
>>>
>>> Still, someone would want to select / filter out all entries of the same
>>> "class".
>>>
>>> Just a thought.
>>
>> Right. As I said, this was mostly an insta brain dump to start a
>> discussion. Seems it worked :)
>>
>>>> @properties:
>>>>
>>>> A "bitfield", which allows to mark this entry as (in)valid for a
>>>> particular consumer.
>>>>
>>>> That obviously requires to modify these properties when the
>>>> requirements of a consumer change, new consumers arrive or new
>>>> producers are added, but I think it's easier to do that at the
>>>> producer side than maintaining filters on all consumer ends
>>>> forever.
>>>
>>> Question would be if that is not up to a consumer to decide ("allowlist"
>>> / filter) by class or id, stored elsewhere.
>>
>> Yes, I looked at it the wrong way round. We should leave the filtering
>> to the consumers. If you use allow lists, then a newly introduced class
>> won't be automatically exposed everywhere.
>>
>> Thanks,
>>
>> tglx
>
>
> So, one direction to follow from this discussion is to have the
> inspection entry and inspection table for all these entries.
> Now, one burning question open for debate, is, should this reside into mm ?
> mm/inspect.h would have to define the inspection entry struct, and some
> macros to help everyone add an inspection entry.
> E.g. INSPECTION_ENTRY(my ptr, my size);
> and this would be used all over the kernel wherever folks want to
> register something.
If we're moving this to kernel/ or similar, I'd suggest not calling this
just "inspect" but something that contains the term "mem".
"mem-inspect.h"?
> Now the second part is, where to keep all the inspection drivers ?
> Would it make sense to have mm/inspection/inspection_helpers.h which
> would keep the table start/end, some macros to traverse the tables, and
> this would be included by the inspection drivers.
> inspection drivers would then probe via any mechanism, and tap into the
> inspection table.
Good question. I think some examples of alternatives might help to
drive that discussion.
> I am thinking that my model with a single backend can be enhanced by
> allowing any inspection driver to access it. And further on, each
> inspection driver would register a notifier to be called when an entry
> is being created or not. This would mean N possible drivers connected to
> the table at the same time. ( if that would make sense...)
Yeah, I think some notifier mechanism is what we want.
> Would it make sense for pstore to have an inspection driver that would
> be connected here to get different kinds of stuff ?
Something for the pstore folks to answer :)
> Would it make sense to have some debugfs driver that would just expose
> to user space different regions ? Perhaps something similar with
> /proc/kcore but not the whole kernel memory rather only the exposed
> inspection entries.
Definitely, this is what I previously mentioned. Maybe we would only
indicate region metadata, and actual access to regions would simply
happen through /proc/kcore if someone wants to dump data from user space.
> Now, for the dynamic memory, e.g. memblock_alloc and friends ,
> would it be interesting to have a flag e.g. MEMBLOCK_INSPECT, that would
> be used when calling it, and in the background, this would request an
> inspection_entry being created ? Or it makes more sense to call some
> function like inspect_register as a different call directly at the
> allocation point ?
We'd probably want some interface to define the metadata
(name/class/whatever), a simple flag likely will not do, right?
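
To illustrate why a bare flag falls short, here is a userspace sketch of
an allocation helper that takes the metadata explicitly. calloc() stands
in for memblock_alloc and friends, and inspect_meta, inspect_register,
inspect_alloc and the fixed-size table are hypothetical names used only
for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Metadata a caller supplies; a single flag could not carry this. */
struct inspect_meta {
	const char *name;
	unsigned int cls;
};

struct inspect_entry {
	struct inspect_meta meta;
	void *address;
	size_t length;
};

#define INSPECT_MAX_ENTRIES 8
static struct inspect_entry inspect_tbl[INSPECT_MAX_ENTRIES];
static size_t inspect_cnt;

/* Record one dynamically allocated region together with its metadata. */
static int inspect_register(const struct inspect_meta *meta, void *addr,
			    size_t len)
{
	assert(meta != NULL && addr != NULL);
	if (inspect_cnt >= INSPECT_MAX_ENTRIES)
		return -1;
	inspect_tbl[inspect_cnt].meta = *meta;
	inspect_tbl[inspect_cnt].address = addr;
	inspect_tbl[inspect_cnt].length = len;
	inspect_cnt++;
	return 0;
}

/* Allocate zeroed memory and register it in one step. */
static void *inspect_alloc(size_t len, const struct inspect_meta *meta)
{
	void *p = calloc(1, len);

	if (!p)
		return NULL;
	if (inspect_register(meta, p, len) != 0) {
		free(p);
		return NULL;
	}
	return p;
}
```

Two per-node areas could then share one class while still being
distinct entries, which matches the pgdat example from earlier in the
thread.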
--
Cheers
David / dhildenb
Thread overview: 42+ messages
2025-09-12 15:08 [RFC][PATCH v3 00/16] Introduce kmemdump Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 01/16] kmemdump: " Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 02/16] Documentation: Add kmemdump Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 03/16] kmemdump: Add coreimage ELF layer Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 04/16] Documentation: kmemdump: Add section for coreimage ELF Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 05/16] kernel/vmcore_info: Register dynamic information into Kmemdump Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 06/16] kmemdump: Introduce qcom-minidump backend driver Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 07/16] soc: qcom: smem: Add minidump device Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 08/16] init/version: Add banner_len to save banner length Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 09/16] genirq/irqdesc: Have nr_irqs as non-static Eugen Hristev
2025-09-16 21:10 ` Thomas Gleixner
2025-09-16 21:16 ` Thomas Gleixner
2025-09-17 5:43 ` Eugen Hristev
2025-09-17 7:16 ` David Hildenbrand
2025-09-17 14:10 ` Thomas Gleixner
2025-09-17 14:26 ` Eugen Hristev
2025-09-17 14:46 ` David Hildenbrand
2025-09-17 15:02 ` Eugen Hristev
2025-09-17 15:18 ` David Hildenbrand
2025-09-17 15:32 ` Eugen Hristev
2025-09-17 15:44 ` David Hildenbrand
2025-09-17 18:42 ` Thomas Gleixner
2025-09-17 19:03 ` David Hildenbrand
2025-09-18 8:23 ` Thomas Gleixner
2025-09-18 13:53 ` Eugen Hristev
2025-09-18 18:43 ` Randy Dunlap
2025-09-25 20:11 ` David Hildenbrand
2025-09-12 15:08 ` [RFC][PATCH v3 10/16] panic: Have tainted_mask " Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 11/16] mm/swapfile: Have nr_swapfiles " Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 12/16] printk: Register information into Kmemdump Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 13/16] sched: Add sched_get_runqueues_area Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 14/16] kernel/vmcoreinfo: Register kmemdump core image information Eugen Hristev
2025-09-12 15:08 ` [RFC][PATCH v3 15/16] kmemdump: Add Kinfo backend driver Eugen Hristev
2025-09-16 5:48 ` Alexey Klimov
2025-09-22 10:01 ` Tudor Ambarus
2025-09-12 15:08 ` [RFC][PATCH v3 16/16] dt-bindings: Add Google Kinfo Eugen Hristev
2025-09-14 11:56 ` Krzysztof Kozlowski
2025-09-12 15:56 ` [RFC][PATCH v3 00/16] Introduce kmemdump David Hildenbrand
2025-09-12 18:35 ` Eugen Hristev
2025-09-16 7:49 ` Mukesh Ojha
2025-09-16 15:25 ` Luck, Tony
2025-09-16 15:27 ` Eugen Hristev