BPF Archive mirror
 help / color / mirror / Atom feed
From: "wuqiang.matt" <wuqiang.matt@bytedance.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	Andrii Nakryiko <andrii@kernel.org>,
	linux-trace-kernel@vger.kernel.org, rostedt@goodmis.org,
	mhiramat@kernel.org
Cc: bpf@vger.kernel.org, "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] objpool: enable inlining objpool_push() and objpool_pop() operations
Date: Fri, 10 May 2024 17:15:58 +0800	[thread overview]
Message-ID: <6994f6c1-29eb-46cb-942e-c2d1e3fe9f5d@bytedance.com> (raw)
In-Reply-To: <93840eb4-609d-49d3-b48a-9c26bfb5b8ec@suse.cz>

On 2024/5/10 16:20, Vlastimil Babka wrote:
> On 5/10/24 9:59 AM, wuqiang.matt wrote:
>> On 2024/5/7 21:55, Vlastimil Babka wrote:
>   >>
>>>> +	} while (!try_cmpxchg_acquire(&slot->tail, &tail, tail + 1));
>>>> +
>>>> +	/* now the tail position is reserved for the given obj */
>>>> +	WRITE_ONCE(slot->entries[tail & slot->mask], obj);
>>>> +	/* update sequence to make this obj available for pop() */
>>>> +	smp_store_release(&slot->last, tail + 1);
>>>> +
>>>> +	return 0;
>>>> +}
>>>>    
>>>>    /**
>>>>     * objpool_push() - reclaim the object and return back to objpool
>>>> @@ -134,7 +219,19 @@ void *objpool_pop(struct objpool_head *pool);
>>>>     * return: 0 or error code (it fails only when user tries to push
>>>>     * the same object multiple times or wrong "objects" into objpool)
>>>>     */
>>>> -int objpool_push(void *obj, struct objpool_head *pool);
>>>> +static inline int objpool_push(void *obj, struct objpool_head *pool)
>>>> +{
>>>> +	unsigned long flags;
>>>> +	int rc;
>>>> +
>>>> +	/* disable local irq to avoid preemption & interruption */
>>>> +	raw_local_irq_save(flags);
>>>> +	rc = __objpool_try_add_slot(obj, pool, raw_smp_processor_id());
>>>
>>> And IIUC, we could in theory objpool_pop() on one cpu, then later another
>>> cpu might do objpool_push() and cause the latter cpu's pool to go over
>>> capacity? Is there some implicit requirements of objpool users to take care
>>> of having matched cpu for pop and push? Are the current objpool users
>>> obeying this requirement? (I can see the selftests do, not sure about the
>>> actual users).
>>> Or am I missing something? Thanks.
>>
>> The objects are all pre-allocated along with creation of the new objpool
>> and the total number of objects never exceeds the capacity on local node.
> 
> Aha, I see, the capacity of entries is enough to hold objects from all nodes
> in the most unfortunate case they all end up freed from a single cpu.
> 
>> So objpool_push() would always find an available slot from the ring-array
>> for the given object to insert back. objpool_pop() would try looping all
>> the percpu slots until an object is found or whole objpool is empty.
> 
> So it's correct, but seems rather wasteful to have the whole capacity for
> entries replicated on every cpu? It does make objpool_push() simple and
> fast, but as you say, objpool_pop() still has to search potentially all
> non-local percpu slots, with disabled irqs, which is far from ideal.

Yes, it's a trade-off between performance and memory usage, with a slight
increase of memory consumption for a significant improvement of performance.

The reason of disabling local irqs is objpool uses a 32bit sequence number
as the state description of each element. It could likely overflow and go
back with the same value for extreme cases. 64bit value could eliminate the
collision but seems too heavy.

> And the "abort if the slot was already full" comment for
> objpool_try_add_slot() seems still misleading? Maybe that was your initial
> idea but changed later?

Right, the comments are just left unchanged during iterations. The original
implementation kept each percpu ring-array very compact and objpool_push will
try looping all cpu nodes to return the given object to objpool.

Actually my new update would remove objpool_try_add_slot and integrate it's 
functionality into objpool_push. I'll submit the new patch when I finish the
verification.

> 
>> Currently kretprobe is the only actual usecase of objpool.
>>
>> I'm testing an updated objpool in our HIDS project for critical pathes,
>> which is widely deployed on servers inside my company. The new version
>> eliminates the raw_local_irq_save and raw_local_irq_restore pair of
>> objpool_push and gains up to 5% of performance boost.
> 
> Mind Ccing me and linux-mm once you are posting that?

Sure, I'll make sure to let you know.

> Thanks,
> Vlastimil
> 

Regards,
Matt Wu

  reply	other threads:[~2024-05-10  9:16 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-24 21:52 [PATCH 0/2] Objpool performance improvements Andrii Nakryiko
2024-04-24 21:52 ` [PATCH 1/2] objpool: enable inlining objpool_push() and objpool_pop() operations Andrii Nakryiko
2024-05-07 13:55   ` Vlastimil Babka
2024-05-10  7:59     ` wuqiang.matt
2024-05-10  8:20       ` Vlastimil Babka
2024-05-10  9:15         ` wuqiang.matt [this message]
2024-04-24 21:52 ` [PATCH 2/2] objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids Andrii Nakryiko
2024-04-26 14:25 ` [PATCH 0/2] Objpool performance improvements Masami Hiramatsu
2024-04-26 16:05   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6994f6c1-29eb-46cb-942e-c2d1e3fe9f5d@bytedance.com \
    --to=wuqiang.matt@bytedance.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).