LinuxPPC-Dev Archive mirror
 help / color / mirror / Atom feed
From: Sourabh Jain <sourabhjain@linux.ibm.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
	linuxppc-dev@lists.ozlabs.org
Cc: Baoquan he <bhe@redhat.com>, Jiri Bohac <jbohac@suse.cz>,
	Hari Bathini <hbathini@linux.ibm.com>,
	Madhavan Srinivasan <maddy@linux.ibm.com>,
	Mahesh Salgaonkar <mahesh@linux.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Shivang Upadhyay <shivangu@linux.ibm.com>,
	kexec@lists.infradead.org
Subject: Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
Date: Tue, 4 Nov 2025 10:48:51 +0530	[thread overview]
Message-ID: <7957bd55-5bda-406f-aab3-64e0620bd452@linux.ibm.com> (raw)
In-Reply-To: <87y0on4ebh.ritesh.list@gmail.com>



On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>
>> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
>> crashkernel= command line option") and commit ab475510e042 ("kdump:
>> implement reserve_crashkernel_cma") added CMA support for kdump
>> crashkernel reservation.
>>
>> Extend crashkernel CMA reservation support to powerpc.
>>
>> The following changes are made to enable CMA reservation on powerpc:
>>
>> - Parse and obtain the CMA reservation size along with other crashkernel
>>    parameters
>> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
>> - Include the CMA-reserved ranges in the usable memory ranges for the
>>    kdump kernel to use.
>> - Exclude the CMA-reserved ranges from the crash kernel memory to
>>    prevent them from being exported through /proc/vmcore.
>>
>> With the introduction of the CMA crashkernel regions,
>> crash_exclude_mem_range() needs to be called multiple times to exclude
>> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
>> avoid repetitive logic for validating mem_ranges size and handling
>> reallocation when required, this functionality is moved to a new wrapper
>> function crash_exclude_mem_range_guarded().
>>
>> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
>> after pageblock_order is initialized.
>>
>> Update kernel-parameters.txt to document CMA support for crashkernel on
>> powerpc architecture.
>>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Jiri Bohac <jbohac@suse.cz>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>> Cc: kexec@lists.infradead.org
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>> Changlog:
>>
>> v3 -> v4
>>   - Removed repeated initialization to tmem in
>>     crash_exclude_mem_range_guarded()
>>   - Call crash_exclude_mem_range() with right crashk ranges
>>
>> v4 -> v5:
>>   - Document CMA-based crashkernel support for ppc64 in kernel-parameters.txt
>> ---
>>   .../admin-guide/kernel-parameters.txt         |  2 +-
>>   arch/powerpc/include/asm/kexec.h              |  2 +
>>   arch/powerpc/kernel/setup-common.c            |  4 +-
>>   arch/powerpc/kexec/core.c                     | 10 ++++-
>>   arch/powerpc/kexec/ranges.c                   | 43 ++++++++++++++-----
>>   5 files changed, 47 insertions(+), 14 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 6c42061ca20e..0f386b546cec 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1013,7 +1013,7 @@
>>   			It will be ignored when crashkernel=X,high is not used
>>   			or memory reserved is below 4G.
>>   	crashkernel=size[KMG],cma
>> -			[KNL, X86] Reserve additional crash kernel memory from
>> +			[KNL, X86, ppc64] Reserve additional crash kernel memory from
> Shouldn't this be PPC and not ppc64?
>
> If I see the crash_dump support...
>
> config ARCH_SUPPORTS_CRASH_DUMP
> 	def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
>
> The changes below aren't specific to ppc64 correct?

The thing is this feature is only supported with KEXEC_FILE and which 
only supported on PPC64:

config ARCH_SUPPORTS_KEXEC_FILE
     def_bool PPC64

Hence I kept it as ppc64.

I think I should update that in the commit message.

Also do you think is it good to restrict this feature to KEXEC_FILE?

>
>>   			CMA. This reservation is usable by the first system's
>>   			userspace memory and kernel movable allocations (memory
>>   			balloon, zswap). Pages allocated from this memory range
>> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
>> index 4bbf9f699aaa..bd4a6c42a5f3 100644
>> --- a/arch/powerpc/include/asm/kexec.h
>> +++ b/arch/powerpc/include/asm/kexec.h
>> @@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
>>   #ifdef CONFIG_CRASH_RESERVE
>>   int __init overlaps_crashkernel(unsigned long start, unsigned long size);
>>   extern void arch_reserve_crashkernel(void);
>> +extern void kdump_cma_reserve(void);
>>   #else
>>   static inline void arch_reserve_crashkernel(void) {}
>>   static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
>> +static inline void kdump_cma_reserve(void) { }
>>   #endif
>>   
>>   #if defined(CONFIG_CRASH_DUMP)
>> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
>> index 68d47c53876c..c8c42b419742 100644
>> --- a/arch/powerpc/kernel/setup-common.c
>> +++ b/arch/powerpc/kernel/setup-common.c
>> @@ -35,6 +35,7 @@
>>   #include <linux/of_irq.h>
>>   #include <linux/hugetlb.h>
>>   #include <linux/pgtable.h>
>> +#include <asm/kexec.h>
>>   #include <asm/io.h>
>>   #include <asm/paca.h>
>>   #include <asm/processor.h>
>> @@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
>>   	initmem_init();
>>   
>>   	/*
>> -	 * Reserve large chunks of memory for use by CMA for fadump, KVM and
>> +	 * Reserve large chunks of memory for use by CMA for kdump, fadump, KVM and
>>   	 * hugetlb. These must be called after initmem_init(), so that
>>   	 * pageblock_order is initialised.
>>   	 */
>>   	fadump_cma_init();
>> +	kdump_cma_reserve();
>>   	kvm_cma_reserve();
>>   	gigantic_hugetlb_cma_reserve();
>>   
>> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
>> index d1a2d755381c..25744737eff5 100644
>> --- a/arch/powerpc/kexec/core.c
>> +++ b/arch/powerpc/kexec/core.c
>> @@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
>>   {
>>   }
>>   
>> +unsigned long long cma_size;
>> +
> nit:
> Since this is a gloabal powerpc variable you are defining, then can we
> keep it's name to crashk_cma_size?

Yeah make sense. I will update the variable name.


>
>>   /*
>>    * Do not allocate memory (or fail in any way) in machine_kexec().
>>    * We are past the point of no return, committed to rebooting now.
>> @@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
>>   
>>   	/* use common parsing */
>>   	ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
>> -				&crash_base, NULL, NULL, NULL);
>> +				&crash_base, NULL, &cma_size, NULL);
>>   
>>   	if (ret)
>>   		return;
>> @@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
>>   	reserve_crashkernel_generic(crash_size, crash_base, 0, false);
>>   }
>>   
>> +void __init kdump_cma_reserve(void)
>> +{
>> +	if (cma_size)
>> +		reserve_crashkernel_cma(cma_size);
>> +}
>> +
> nit:
> cma_size is already checked for null within reserve_crashkernel_cma(),
> so we don't really need kdump_cma_reserve() function call as such.
>
> Also kdump_cma_reserve() only make sense with #ifdef CRASHKERNEL_CMA..
> so instead do you think we can directly call reserve_crashkernel_cma(cma_size)?

I think the above kdump_cma_reserve() definition should come under 
CONFIG_CRASH_RESERVE
because the way it is declared in arch/powerpc/include/asm/kexec.h.

I would like to keep kdump_cma_reserve() as is it because of two reasons:

- It keeps setup_arch() free from kdump #ifdefs
- In case if we want to add some condition on this reservation it would 
straight forward.

So lets keep kdump_cma_reserve as is, unless you have strong opinion on 
not to.

>>   int __init overlaps_crashkernel(unsigned long start, unsigned long size)
>>   {
>>   	return (start + size) > crashk_res.start && start <= crashk_res.end;
>> diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
>> index 3702b0bdab14..3bd27c38726b 100644
>> --- a/arch/powerpc/kexec/ranges.c
>> +++ b/arch/powerpc/kexec/ranges.c
>> @@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
>>    */
>>   int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>   {
>> -	int ret;
>> +	int ret, i;
>>   
>>   	/*
>>   	 * Early boot failure observed on guests when low memory (first memory
>> @@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>   	if (ret)
>>   		goto out;
>>   
>> +	for (i = 0; i < crashk_cma_cnt; i++) {
>> +		ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
>> +				    crashk_cma_ranges[i].end - crashk_cma_ranges[i].start + 1);
>> +		if (ret)
>> +			goto out;
>> +	}
>> +
>>   	ret = add_rtas_mem_range(mem_ranges);
>>   	if (ret)
>>   		goto out;
>> @@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>   #endif /* CONFIG_KEXEC_FILE */
>>   
>>   #ifdef CONFIG_CRASH_DUMP
>> +static int crash_exclude_mem_range_guarded(struct crash_mem **mem_ranges,
>> +					   unsigned long long mstart,
>> +					   unsigned long long mend)
>> +{
>> +	struct crash_mem *tmem = *mem_ranges;
>> +
>> +	/* Reallocate memory ranges if there is no space to split ranges */
>> +	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> +		tmem = realloc_mem_ranges(mem_ranges);
>> +		if (!tmem)
>> +			return -ENOMEM;
>> +	}
>> +
>> +	return crash_exclude_mem_range(tmem, mstart, mend);
>> +}
>> +
>>   /**
>>    * get_crash_memory_ranges - Get crash memory ranges. This list includes
>>    *                           first/crashing kernel's memory regions that
>> @@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>   int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>>   {
>>   	phys_addr_t base, end;
>> -	struct crash_mem *tmem;
>>   	u64 i;
>>   	int ret;
>>   
>> @@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>>   			sort_memory_ranges(*mem_ranges, true);
>>   	}
>>   
>> -	/* Reallocate memory ranges if there is no space to split ranges */
>> -	tmem = *mem_ranges;
>> -	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> -		tmem = realloc_mem_ranges(mem_ranges);
>> -		if (!tmem)
>> -			goto out;
>> -	}
>> -
>>   	/* Exclude crashkernel region */
>> -	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
>> +	ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_res.start, crashk_res.end);
>>   	if (ret)
>>   		goto out;
>>   
>> +	for (i = 0; i < crashk_cma_cnt; ++i) {
>> +		ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_cma_ranges[i].start,
>> +					      crashk_cma_ranges[i].end);
>> +		if (ret)
>> +			goto out;
>> +	}
>> +
>>   	/*
>>   	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
>>   	 *        regions are exported to save their context at the time of
>> -- 
>> 2.51.0



  reply	other threads:[~2025-11-04  5:19 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-03  4:37 [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation Sourabh Jain
2025-11-03 10:10 ` Ritesh Harjani
2025-11-04  5:18   ` Sourabh Jain [this message]
2025-11-04  9:34     ` Sourabh Jain
2025-11-04 10:24       ` Ritesh Harjani
2025-11-04 12:38         ` Sourabh Jain
2025-11-04 10:18     ` Ritesh Harjani
2025-11-04 10:35       ` Sourabh Jain
2025-11-04 10:51         ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7957bd55-5bda-406f-aab3-64e0620bd452@linux.ibm.com \
    --to=sourabhjain@linux.ibm.com \
    --cc=bhe@redhat.com \
    --cc=hbathini@linux.ibm.com \
    --cc=jbohac@suse.cz \
    --cc=kexec@lists.infradead.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=mahesh@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=ritesh.list@gmail.com \
    --cc=shivangu@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).