From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Yang Shi <shy828301@gmail.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Barry Song <21cnbao@gmail.com>, Lance Yang <ioworker0@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5] mm/rmap: do not add fully unmapped large folio to deferred split list
Date: Fri, 26 Apr 2024 22:15:55 +0200
Message-ID: <fb58cd88-d5f3-42ea-a710-0cff3764cd32@redhat.com>
In-Reply-To: <CB058583-C99A-4897-B3AE-11C56D6E3A59@nvidia.com>

On 26.04.24 21:20, Zi Yan wrote:
> On 26 Apr 2024, at 15:08, David Hildenbrand wrote:
> 
>> On 26.04.24 21:02, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> In __folio_remove_rmap(), a large folio is added to deferred split list
>>> if any page in a folio loses its final mapping. But it is possible that
>>> the folio is fully unmapped and adding it to deferred split list is
>>> unnecessary.
>>>
>>> For PMD-mapped THPs, that was not really an issue, because removing the
>>> last PMD mapping in the absence of PTE mappings would not have added the
>>> folio to the deferred split queue.
>>>
>>> However, for PTE-mapped THPs, which are now more prominent due to mTHP,
>>> they are always added to the deferred split queue. One side effect
>>> is that the THP_DEFERRED_SPLIT_PAGE stat for a PTE-mapped folio can be
>>> unintentionally increased, making it look like there are many partially
>>> mapped folios -- although the whole folio is fully unmapped stepwise.
>>>
>>> Core-mm now tries batch-unmapping consecutive PTEs of PTE-mapped THPs
>>> where possible starting from commit b06dc281aa99 ("mm/rmap: introduce
>>> folio_remove_rmap_[pte|ptes|pmd]()"). When it happens, a whole PTE-mapped
>>> folio is unmapped in one go and can avoid being added to deferred split
>>> list, reducing the THP_DEFERRED_SPLIT_PAGE noise. But there will still be
>>> noise when we cannot batch-unmap a complete PTE-mapped folio in one go
>>> -- or where this type of batching is not implemented yet, e.g., migration.
>>>
>>> To avoid the unnecessary addition, folio->_nr_pages_mapped is checked
>>> to tell if the whole folio is unmapped. If the folio is already on
>>> deferred split list, it will be skipped, too.
>>>
>>> Note: commit 98046944a159 ("mm: huge_memory: add the missing
>>> folio_test_pmd_mappable() for THP split statistics") tried to exclude
>>> mTHP deferred split stats from THP_DEFERRED_SPLIT_PAGE, but it does not
>>> fix the above issue. A fully unmapped PTE-mapped order-9 THP was still
>>> added to deferred split list and counted as THP_DEFERRED_SPLIT_PAGE,
>>> since nr is 512 (non zero), level is RMAP_LEVEL_PTE, and inside
>>> deferred_split_folio() the order-9 folio is folio_test_pmd_mappable().
>>>
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>    mm/rmap.c | 12 +++++++++---
>>>    1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 2608c40dffad..a9bd64ebdd9a 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -1495,6 +1495,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>>>    {
>>>    	atomic_t *mapped = &folio->_nr_pages_mapped;
>>>    	int last, nr = 0, nr_pmdmapped = 0;
>>> +	bool partially_mapped = false;
>>>    	enum node_stat_item idx;
>>>
>>>    	__folio_rmap_sanity_checks(folio, page, nr_pages, level);
>>> @@ -1515,6 +1516,8 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>>>    					nr++;
>>>    			}
>>>    		} while (page++, --nr_pages > 0);
>>> +
>>> +		partially_mapped = !!nr && !!atomic_read(mapped);
>>
>> Nit: The && should remove the need for both !!.
> 
> My impression was that !! is needed to convert from int to bool, and I do
> find "!!int && !!int" used in the kernel.

I might be wrong about this, but if you wouldn't write

	if (!!nr && !!atomic_read(mapped))

then

	bool partially_mapped = nr && atomic_read(mapped);

is sufficient.

&& would make sure that the result is either 0 or 1, which
you can store safely in a bool, no matter which underlying type
is used to store that value.
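
To make that concrete, a quick sketch (the values are made up, nothing
from the patch itself): the result of && is already a plain 0/1 int, so
it can be stored in any integer type without losing it:

---
#include <stdio.h>

int main(void)
{
	/* Hypothetical stand-ins for nr / atomic_read(mapped);
	 * only the top bit of the first one is set. */
	unsigned long long nr = 0x8000000000000000ull;
	unsigned long long mapped = 3;

	/* && always evaluates to an int that is exactly 0 or 1. */
	int r = nr && mapped;
	printf("nr && mapped: %d\n", r);	/* prints 1 */

	return 0;
}
---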

But I *think* nowadays, the compiler will always handle that
correctly, even without the "&&" (ever since C99 added _Bool).

Likely, also

	bool partially_mapped = nr & atomic_read(mapped);

would nowadays convert cleanly to bool, but it looks stupid. (And it
is not even equivalent: the bitwise AND of two non-zero values can
still be 0, e.g., 1 & 2 == 0.)
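
A tiny sketch to make that pitfall concrete (made-up values again):

---
#include <stdio.h>
#include <stdbool.h>

int main(void)
{
	int nr = 1;	/* binary 01 */
	int mapped = 2;	/* binary 10 */

	bool logical = nr && mapped;	/* 1: both are non-zero */
	bool bitwise = nr & mapped;	/* 0: no bit in common! */

	printf("nr && mapped: %d\n", logical);
	printf("nr & mapped:  %d\n", bitwise);
	return 0;
}
---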


Related: https://lkml.org/lkml/2013/8/31/138

---
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <inttypes.h>

/* Only the top bit is set: the low 32 bits are all zero. */
volatile uint64_t a = 0x8000000000000000ull;

int main(void)
{
	printf("uint64_t a = a: 0x%" PRIx64 "\n", a);

	/* Truncation to int keeps only the low 32 bits: 0 here. */
	int i1 = a;
	printf("int i1 = a: %d\n", i1);

	/* !! normalizes to 0/1 first, so nothing is lost. */
	int i2 = !!a;
	printf("int i2 = !!a: %d\n", i2);

	/* The _Bool conversion normalizes any non-zero value to 1. */
	bool b1 = a;
	printf("bool b1 = a: %d\n", b1);

	/* ... which makes the !! redundant here. */
	bool b2 = !!a;
	printf("bool b2 = !!a: %d\n", b2);

	return 0;
}
---
$ ./test
uint64_t a = a: 0x8000000000000000
int i1 = a: 0
int i2 = !!a: 1
bool b1 = a: 1
bool b2 = !!a: 1
---

Note that if bool were defined as "int", you would need the !!;
otherwise you would lose information (exactly what happens to i1
above).

But even for b1, gcc now generates:

  40118c:       48 8b 05 7d 2e 00 00    mov    0x2e7d(%rip),%rax        # 404010 <a>
  401193:       48 85 c0                test   %rax,%rax
  401196:       0f 95 c0                setne  %al
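
(For reference, output like the above can be reproduced with something
along these lines; the exact flags are an assumption, not necessarily
what I used:)

---
$ gcc -O2 -no-pie -o test test.c
$ objdump -d test          # look at the code generated for main
---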


My stdbool.h contains

#define bool	_Bool

And I think it is the _Bool added in C99 that makes this work; IIRC
the standard says that converting any scalar value to _Bool yields 0
if the value compares equal to 0, and 1 otherwise (C99 6.3.1.2).

But I didn't double-check the exact wording, and it's time for the
weekend :)

-- 
Cheers,

David / dhildenb


Thread overview: 15+ messages
2024-04-26 19:02 [PATCH v5] mm/rmap: do not add fully unmapped large folio to deferred split list Zi Yan
2024-04-26 19:08 ` David Hildenbrand
2024-04-26 19:20   ` Zi Yan
2024-04-26 20:15     ` David Hildenbrand [this message]
2024-04-26 20:22       ` Zi Yan
2024-04-27  4:06       ` Lance Yang
2024-04-27  6:51         ` David Hildenbrand
2024-04-27  9:32     ` Barry Song
2024-04-26 20:42 ` Yang Shi
2024-04-27  4:09 ` Lance Yang
2024-05-01 13:24 ` Alexander Gordeev
2024-05-01 13:38   ` Zi Yan
2024-05-01 15:54     ` David Hildenbrand
2024-05-02 13:18     ` Alexander Gordeev
2024-05-02 13:20       ` Zi Yan
