From: David Hildenbrand <david@redhat.com>
To: "Verma, Vishal L" <vishal.l.verma@intel.com>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"Jiang, Dave" <dave.jiang@intel.com>,
"osalvador@suse.de" <osalvador@suse.de>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"Huang, Ying" <ying.huang@intel.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
"Hocko, Michal" <mhocko@suse.com>,
"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
"jmoyer@redhat.com" <jmoyer@redhat.com>,
"Jonathan.Cameron@Huawei.com" <Jonathan.Cameron@Huawei.com>
Subject: Re: [PATCH v7 2/3] mm/memory_hotplug: split memmap_on_memory requests across memblocks
Date: Tue, 31 Oct 2023 11:13:14 +0100
Message-ID: <e5d9423e-5a61-4fbe-b971-52e4283c1afd@redhat.com>
In-Reply-To: <cdeef06d81abb3fc4b5f4bea6b3fd5b83972249b.camel@intel.com>
On 31.10.23 03:14, Verma, Vishal L wrote:
> On Mon, 2023-10-30 at 11:20 +0100, David Hildenbrand wrote:
>> On 26.10.23 00:44, Vishal Verma wrote:
>>>
> [..]
>
>>> @@ -2146,11 +2186,69 @@ void try_offline_node(int nid)
>>> }
>>> EXPORT_SYMBOL(try_offline_node);
>>>
>>> -static int __ref try_remove_memory(u64 start, u64 size)
>>> +static void __ref remove_memory_blocks_and_altmaps(u64 start, u64 size)
>>> {
>>> - struct memory_block *mem;
>>> - int rc = 0, nid = NUMA_NO_NODE;
>>> + unsigned long memblock_size = memory_block_size_bytes();
>>> struct vmem_altmap *altmap = NULL;
>>> + struct memory_block *mem;
>>> + u64 cur_start;
>>> + int rc;
>>> +
>>> + /*
>>> + * For memmap_on_memory, the altmaps could have been added on
>>> + * a per-memblock basis. Loop through the entire range if so,
>>> + * and remove each memblock and its altmap.
>>> + */
>>
>> /*
>> * altmaps where added on a per-memblock basis; we have to process
>> * each individual memory block.
>> */
>>
>>> + for (cur_start = start; cur_start < start + size;
>>> + cur_start += memblock_size) {
>>> + rc = walk_memory_blocks(cur_start, memblock_size, &mem,
>>> + test_has_altmap_cb);
>>> + if (rc) {
>>> + altmap = mem->altmap;
>>> + /*
>>> + * Mark altmap NULL so that we can add a debug
>>> + * check on memblock free.
>>> + */
>>> + mem->altmap = NULL;
>>> + }
>>
>> Simpler (especially, we know that there must be an altmap):
>>
>> mem = find_memory_block(pfn_to_section_nr(cur_start));
>> altmap = mem->altmap;
>> mem->altmap = NULL;
>>
>> I think we might be able to remove test_has_altmap_cb() then.
>>
>>> +
>>> + remove_memory_block_devices(cur_start, memblock_size);
>>> +
>>> + arch_remove_memory(cur_start, memblock_size, altmap);
>>> +
>>> + /* Verify that all vmemmap pages have actually been freed. */
>>> + if (altmap) {
>>
>> There must be an altmap, so this can be done unconditionally.
>
> Hi David,
Hi!
>
> All other comments make sense, making those changes now.
>
> However for this one, does the WARN() below go away then?
>
> I was wondering if maybe arch_remove_memory() is responsible for
> freeing the altmap here, and at this stage we're just checking if that
> happened. If it didn't, WARN and then free it.
I think that has to stay, to make sure arch_remove_memory() did the
right thing and that we don't, due to a bug, still have some altmap
pages in use after they should have been completely freed.
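Untested, just to illustrate the shape I have in mind (a sketch, not
the actual patch; the WARN message text is made up):

static void __ref remove_memory_blocks_and_altmaps(u64 start, u64 size)
{
	unsigned long memblock_size = memory_block_size_bytes();
	struct vmem_altmap *altmap;
	struct memory_block *mem;
	u64 cur_start;

	/*
	 * Each memory block in this range was hot-added with its own
	 * altmap, so process them one by one.
	 */
	for (cur_start = start; cur_start < start + size;
	     cur_start += memblock_size) {
		mem = find_memory_block(pfn_to_section_nr(PFN_DOWN(cur_start)));
		altmap = mem->altmap;
		/* Clear it so memblock removal can sanity-check for stale altmaps. */
		mem->altmap = NULL;

		remove_memory_block_devices(cur_start, memblock_size);
		arch_remove_memory(cur_start, memblock_size, altmap);

		/* Verify that all vmemmap pages have actually been freed. */
		WARN(altmap->alloc, "altmap not fully unmapped\n");
		kfree(altmap);
	}
}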
>
> I drilled down the path, and I don't see altmap actually getting freed
> in vmem_altmap_free(), but I wasn't sure if <something else> was meant
> to free it as altmap->alloc went down to 0.
vmem_altmap_free() only does the "altmap->alloc -= nr_pfns" accounting;
it is called when arch_remove_memory() frees the vmemmap pages and
detects that they actually come from the altmap reserve and not from
the buddy/early-boot allocator.
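From memory (not necessarily the exact source), it is essentially just:

void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns)
{
	/* "Return" the vmemmap pages to the altmap reserve by accounting only. */
	altmap->alloc -= nr_pfns;
}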
Freeing an altmap is just unaccounting it in the altmap structure; here
we make sure that we are actually back down to 0 and that there is no
weird altmap-freeing bug in arch_remove_memory().
--
Cheers,
David / dhildenb