Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

From: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com>
To: Dave Hansen <dave.hansen@intel.com>,
	Will Deacon <will@kernel.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nick Piggin <npiggin@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	"David S. Miller" <davem@davemloft.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Arnd Bergmann <arnd@arndb.de>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	sparclinux@vger.kernel.org, kernel@openvz.org
Subject: Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()
Date: Sat, 18 Dec 2021 17:31:43 +0300
Message-ID: <d6094dc4-3976-e06f-696b-c55f696fe287@virtuozzo.com>
In-Reply-To: <fcbb726d-fe6a-8fe4-20fd-6a10cdef007a@intel.com>

>> This allows archs to optimize it, by
>> freeing multiple tables in a single release_pages() call. This is
>> faster than individual put_page() calls, especially with memcg
>> accounting enabled.
> 
> Could we quantify "faster"?  There's a non-trivial amount of code being
> added here and it would be nice to back it up with some cold-hard numbers.

I currently don't have numbers for this patch taken alone. This patch originates from work done some
years ago to reduce the cost of memory accounting, and an x86-only version of it has been in the
virtuozzo/openvz kernel since then. Other patches from that work have been upstreamed, but this one was
missed.

Still, it's obvious that release_pages() should be faster than a loop calling put_page() - isn't that
exactly the reason why release_pages() exists and is different from a loop calling put_page()?
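
To illustrate the argument (not to measure it), here is a toy userspace model. It assumes the cost is dominated by an accounting update that the per-page path performs once per freed page and the batched path performs once per batch. All names (model_page, model_put_page, model_release_pages, accounting_updates) are hypothetical and are not kernel APIs; the real release_pages() is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	int refcount;
};

/* Counts how often the pretend shared accounting state is touched. */
static unsigned long accounting_updates;
static unsigned long pages_freed;

/* Per-page path: one accounting update for every page that is freed. */
static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0) {
		accounting_updates++;
		pages_freed++;
	}
}

/* Batched path: drop all references first, then account once per batch. */
static void model_release_pages(struct model_page **pages, size_t nr)
{
	size_t i, freed = 0;

	for (i = 0; i < nr; i++) {
		assert(pages[i]->refcount > 0);
		if (--pages[i]->refcount == 0)
			freed++;
	}
	if (freed) {
		accounting_updates++;
		pages_freed += freed;
	}
}
```

In this model, freeing N single-reference pages costs N accounting updates through model_put_page() and one through model_release_pages(). Whether the real saving is significant still has to be shown with measurements.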

>>   static void __tlb_remove_table_free(struct mmu_table_batch *batch)
>>   {
>> -	int i;
>> -
>> -	for (i = 0; i < batch->nr; i++)
>> -		__tlb_remove_table(batch->tables[i]);
>> -
>> +	__tlb_remove_tables(batch->tables, batch->nr);
>>   	free_page((unsigned long)batch);
>>   }
> 
> This leaves a single call-site for __tlb_remove_table():
> 
>> static void tlb_remove_table_one(void *table)
>> {
>>          tlb_remove_table_sync_one();
>>          __tlb_remove_table(table);
>> }
> 
> Is that worth it, or could it just be:
> 
> 	__tlb_remove_tables(&table, 1);

I considered that while preparing the patch. However, it resulted in an even larger change across
archs, due to the removal of the non-batched call, so I decided not to go that way.
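
For reference, the single-element form suggested above can be modeled in userspace as follows. This is only a sketch of the calling convention: model_remove_tables(), model_sync_one() and model_remove_table_one() are hypothetical stand-ins, and the counters exist only to make the behavior observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the counters only make the behavior observable. */
static int sync_calls;
static int tables_freed;

/* Batched variant: takes an array of table pointers and its length. */
static void model_remove_tables(void **tables, int nr)
{
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++) {
		assert(tables[i] != NULL);
		tables_freed++;
	}
}

static void model_sync_one(void)
{
	sync_calls++;
}

/*
 * Single-table path expressed through the batched variant: the address
 * of the parameter serves as a one-element array.
 */
static void model_remove_table_one(void *table)
{
	model_sync_one();
	model_remove_tables(&table, 1);
}
```

With this shape the non-batched helper is no longer needed by generic code, but every arch that provides it would have to be converted, which is the larger change mentioned above.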

Also, Peter's suggestion to move the free_page_and_swap_cache()-based implementation of
__tlb_remove_table() into mm/mmu_gather.c under an ifdef, and then do the optimization locally in
mm/mmu_gather.c, looks better.

>> +void free_pages_and_swap_cache_nolru(struct page **pages, int nr)
>> +{
>> +	__free_pages_and_swap_cache(pages, nr, false);
>>   }
> 
> This went unmentioned in the changelog.  But, it seems like there's a
> specific optimization here.  In the existing code,
> free_pages_and_swap_cache() is wasteful if no page in pages[] is on the
> LRU.  It doesn't need the lru_add_drain().

This is a somewhat different topic.

Within the scope of this patch, the _nolru version was added because there was no LRU draining in the
looped call to __tlb_remove_table(). Adding it to the batched version would not break anything, but it
would add overhead that was not there before, which is in direct conflict with the original goal.

If the version that drains the LRU is indeed not needed, it can be cleaned up in a separate
patchset.
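
The split between the two entry points can be sketched in userspace as below. This assumes, as in the quoted hunk, a common helper that takes a boolean and two thin wrappers around it. All model_* names and the counters are hypothetical; only the shape of the flag-controlled drain is meant to match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters that make the two paths observable. */
static int drain_calls;
static int pages_released;

static void model_lru_add_drain(void)
{
	drain_calls++;
}

/* Common helper: the drain happens only when the caller asks for it. */
static void model_free_pages_common(void **pages, int nr, bool do_lru)
{
	int i;

	assert(nr >= 0);
	if (do_lru)
		model_lru_add_drain();
	for (i = 0; i < nr; i++) {
		assert(pages[i] != NULL);
		pages_released++;
	}
}

/* Existing-style entry point: always drains first. */
static void model_free_pages(void **pages, int nr)
{
	model_free_pages_common(pages, nr, true);
}

/* _nolru-style entry point: skips the drain, matching the old loop. */
static void model_free_pages_nolru(void **pages, int nr)
{
	model_free_pages_common(pages, nr, false);
}
```

The _nolru wrapper keeps the batched path free of a drain that the per-table loop never performed.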

> 		if (!do_lru)
> 			VM_WARN_ON_ONCE_PAGE(PageLRU(pagep[i]),
> 					     pagep[i]);
> 		free_swap_cache(...);

This looks like a good safety measure, will add it.
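
A userspace sketch of how such a check behaves, assuming warn-once semantics per call site: MODEL_WARN_ON_ONCE, model_page and model_free_batch are hypothetical names, and the macro is a simplified stand-in rather than the kernel's VM_WARN_ON_ONCE_PAGE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical page model: only tracks whether the page is on the LRU. */
struct model_page {
	bool on_lru;
};

static int warnings_emitted;

/* Simplified warn-once: a static flag per call site suppresses repeats. */
#define MODEL_WARN_ON_ONCE(cond, what)					\
	do {								\
		static bool warned_;					\
		if ((cond) && !warned_) {				\
			warned_ = true;					\
			warnings_emitted++;				\
			fprintf(stderr, "warning: %s\n", (what));	\
		}							\
	} while (0)

/* Frees a batch; on the no-LRU path, an LRU page is reported once. */
static int model_free_batch(struct model_page **pages, int nr, bool do_lru)
{
	int i, freed = 0;

	assert(nr >= 0);
	for (i = 0; i < nr; i++) {
		if (!do_lru)
			MODEL_WARN_ON_ONCE(pages[i]->on_lru,
					   "LRU page passed to the no-LRU path");
		freed++;
	}
	return freed;
}
```

The check costs nothing on the path that drains the LRU and reports a violation only once on the path that does not, so a misuse shows up in testing without flooding the log.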

> But, even more than that, do all the architectures even need the
> free_swap_cache()?

I was under the impression that process page tables are a valid target for swapping out, although I may
be wrong here.

Nikita
