Linux-mm Archive mirror
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH net-next v2 09/15] mm: page_frag: reuse MSB of 'size' field for pfmemalloc
Date: Mon, 6 May 2024 20:33:58 +0800
Message-ID: <a208cde1-41f2-c838-0bd1-a37d58f2179b@huawei.com>
In-Reply-To: <CAKgT0Ufm0=1cmyRLcrcu1_FAAeBokj3rpFAXJvVxgARXSStAuA@mail.gmail.com>

On 2024/4/30 22:54, Alexander Duyck wrote:
> On Tue, Apr 30, 2024 at 5:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> On 2024/4/29 22:49, Alexander Duyck wrote:
>>
>> ...
>>
>>>>>
>>>>
>>>> After considering a few different layouts for 'struct page_frag_cache',
>>>> the one below seems to be the most optimized:
>>>>
>>>> struct page_frag_cache {
>>>>         /* page address & pfmemalloc & order */
>>>>         void *va;
>>>
>>> I see. So basically just pack the much smaller bitfields in here.
>>>
>>>>
>>>> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) && (BITS_PER_LONG <= 32)
>>>>         u16 pagecnt_bias;
>>>>         u16 size;
>>>> #else
>>>>         u32 pagecnt_bias;
>>>>         u32 size;
>>>> #endif
>>>> };
>>>>
>>>> The lower bits of 'va' are or'ed with the page order & pfmemalloc instead
>>>> of offset or pagecnt_bias, so that we don't have to add more checks
>>>> to handle the problem of not having enough space for offset or
>>>> pagecnt_bias when PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE on a 32-bit system.
>>>> And as the page address & pfmemalloc & order are unchanged for the same
>>>> page in the same 'page_frag_cache' instance, it makes sense to fit them
>>>> together.
>>>>
>>>> Also, it seems better to replace 'offset' with 'size', which indicates
>>>> the remaining size for the cache in a 'page_frag_cache' instance, so that
>>>> we might be able to do a single 'size >= fragsz' check for the case of the
>>>> cache being enough, which should be the fast path if we ensure size is zero
>>>> when 'va' == NULL.
>>>
>>> I'm not sure the rename to size is called for as it is going to be
>>> confusing. Maybe something like "remaining"?
>>
>> Yes, using 'size' for that is a bit confusing.
>> Besides the above 'remaining', a quick search suggests we may have the
>> options below too:
>> 'residual', 'unused', 'surplus'
>>
>>>
>>>> Something like below:
>>>>
>>>> #define PAGE_FRAG_CACHE_ORDER_MASK      GENMASK(1, 0)
>>>> #define PAGE_FRAG_CACHE_PFMEMALLOC_BIT  BIT(2)
>>>
>>> The only downside is that it is ossifying things so that we can only
>>
>> There are 12 bits that are always usable, so we can always extend
>> ORDER_MASK to more bits if a larger order number is needed.
>>
>>> ever do order 3 as the maximum cache size. It might be better to do a
>>> full 8 bits as on x86 it would just mean accessing the low end of a
>>> 16b register. Then you can have pfmemalloc at bit 8.
>>
>> I am not sure I understand the above, as it seems we have the options below:
>> 1. Use something like the above ORDER_MASK and PFMEMALLOC_BIT.
>> 2. Use a bitfield, something like below:
>>
>> unsigned long va:20;		/* or 52 for a 64-bit system */
>> unsigned long pfmemalloc:1;
>> unsigned long order:11;
>>
>> Or is there a better idea in your mind?
> 
> All I was suggesting was to make the ORDER_MASK 8 bits. Doing that, the
> compiler can just store VA in a register such as RCX and, instead of
> having to bother with a mask, it could then just use CL to access the
> order. As for the flag bits such as pfmemalloc, the 4 bits starting
> at bit 8 would make the most sense, since they don't naturally align to
> anything and would have to be masked anyway.

Ok.
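
To check my understanding of the 8-bit order suggestion, here is a minimal
user-space sketch of that layout. The macro and helper names are made up
for illustration and are not from the patch, and it assumes the page address
always has its low 12 bits clear (4K or larger pages):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: order in bits 0-7, pfmemalloc flag at bit 8. */
#define FRAG_ORDER_MASK		0xffUL
#define FRAG_PFMEMALLOC_BIT	(1UL << 8)
/* Low 12 bits are free for metadata when 'va' is at least 4K aligned. */
#define FRAG_META_MASK		0xfffUL

static inline unsigned long frag_encode(void *va, unsigned int order,
					bool pfmemalloc)
{
	unsigned long addr = (unsigned long)(uintptr_t)va;

	assert(!(addr & FRAG_META_MASK));	/* va must be page aligned */
	assert(order <= FRAG_ORDER_MASK);

	return addr | order | (pfmemalloc ? FRAG_PFMEMALLOC_BIT : 0);
}

/* Truncating to 8 bits needs no explicit mask in the source. */
static inline unsigned int frag_order(unsigned long encoded)
{
	return (uint8_t)encoded;
}

/* Flag bits do not align to a sub-register, so they are masked. */
static inline bool frag_pfmemalloc(unsigned long encoded)
{
	return !!(encoded & FRAG_PFMEMALLOC_BIT);
}

static inline void *frag_va(unsigned long encoded)
{
	return (void *)(uintptr_t)(encoded & ~FRAG_META_MASK);
}
```

Whether the compiler really turns frag_order() into a plain low-byte
register access depends on the architecture and optimization level, so that
part would need checking in the generated code.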

> 
> Using a bitfield like you suggest would be problematic as the compiler
> would assume a shift is needed so you would have to add a shift to
> your code to offset it for accessing VA.
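
For comparison, a rough sketch of the two ways of getting 'va' back. The
struct and helper names are hypothetical, a 64-bit layout with 4K pages is
assumed, and the 64-bit bitfield base type relies on GCC/Clang behaviour
(bitfield layout is implementation-defined):

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12

/* Option 2 from above: everything in bitfields (64-bit variant). */
struct va_bitfield {
	unsigned long long va:52;
	unsigned long long pfmemalloc:1;
	unsigned long long order:11;
};

/* A 52-bit field cannot hold the full address, so store it shifted. */
static inline void bitfield_set_va(struct va_bitfield *b, uint64_t addr)
{
	b->va = addr >> SKETCH_PAGE_SHIFT;
}

/* Every read of 'va' then needs the extra shift back. */
static inline uint64_t bitfield_get_va(const struct va_bitfield *b)
{
	return (uint64_t)b->va << SKETCH_PAGE_SHIFT;
}

/* Option 1: metadata or'ed into the low bits, a single mask recovers va. */
static inline uint64_t masked_get_va(uint64_t encoded)
{
	return encoded & ~0xfffULL;
}
```
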
> 
>>>
>>>> struct page_frag_cache {
>>>>         /* page address & pfmemalloc & order */
>>>>         void *va;
>>>>
>>>
>>> When you start combining things like this we normally would convert va
>>> to an unsigned long.
>>
>> Ack.

It seems to make more sense to convert va to something like 'struct encoded_va',
mirroring 'struct encoded_page' at the link below:

https://elixir.bootlin.com/linux/v6.7-rc8/source/include/linux/mm_types.h#L222
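
A minimal sketch of what I have in mind, written as user-space C. The
helper names and the 12-bit mask are assumptions for illustration only, and
the type is deliberately left incomplete so that an encoded value can not be
dereferenced by accident:

```c
#include <assert.h>
#include <stdint.h>

/* Opaque type, never defined: only the helpers below may interpret it. */
struct encoded_va;

/* Assumed: the low 12 bits of a page-aligned address carry metadata. */
#define ENCODED_VA_BITS		0xfffUL

static inline struct encoded_va *encode_va(void *va, unsigned long bits)
{
	assert(!((uintptr_t)va & ENCODED_VA_BITS));	/* va is page aligned */
	assert(!(bits & ~ENCODED_VA_BITS));		/* bits fit in the mask */

	return (struct encoded_va *)((uintptr_t)va | bits);
}

/* Strip the metadata bits to get the usable virtual address back. */
static inline void *encoded_va_ptr(const struct encoded_va *ev)
{
	return (void *)((uintptr_t)ev & ~(uintptr_t)ENCODED_VA_BITS);
}

/* Return only the metadata bits (order, pfmemalloc, ...). */
static inline unsigned long encoded_va_bits(const struct encoded_va *ev)
{
	return (unsigned long)((uintptr_t)ev & ENCODED_VA_BITS);
}
```

With that, 'struct page_frag_cache' would hold a 'struct encoded_va *'
instead of a plain 'void *va', and any direct use of the encoded value as a
pointer would show up as a type error.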

>>
>>>
>>>> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) && (BITS_PER_LONG <= 32)
>>>>         u16 pagecnt_bias;
>>>>         u16 size;
>>>> #else
>>>>         u32 pagecnt_bias;
>>>>         u32 size;
>>>> #endif
>>>> };
>>>>
>>>>
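
To illustrate the single-check fast path with a 'remaining' field, a rough
user-space sketch follows. All names are hypothetical, it assumes 4K base
pages, an 8-bit order in the low bits of the encoded va, fragsz > 0, and
fragments handed out from the start of the page upwards. The refill slow
path is left out:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_ORDER_MASK	0xffUL
#define SKETCH_META_MASK	0xfffUL

struct frag_cache_sketch {
	unsigned long encoded_va;	/* page address | pfmemalloc | order */
	uint32_t pagecnt_bias;
	uint32_t remaining;		/* kept at 0 while encoded_va == 0 */
};

/* Fast path only: returns NULL when the caller must refill the cache. */
static inline void *frag_alloc_fast(struct frag_cache_sketch *nc,
				    uint32_t fragsz)
{
	unsigned long size, offset;

	/* One check covers both "cache empty" and "not enough room". */
	if (nc->remaining < fragsz)
		return NULL;

	size = SKETCH_PAGE_SIZE << (nc->encoded_va & SKETCH_ORDER_MASK);
	offset = size - nc->remaining;

	nc->remaining -= fragsz;
	nc->pagecnt_bias--;

	return (void *)(uintptr_t)((nc->encoded_va & ~SKETCH_META_MASK) + offset);
}
```
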


