From: Roman Gushchin <roman.gushchin@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/7] memcg: reduce memory for the lruvec and memcg stats
Date: Mon, 29 Apr 2024 09:00:16 -0700
Message-ID: <Zi_EEOUS_iCh2Nfh@P9FQF9L96D>
In-Reply-To: <20240427003733.3898961-4-shakeel.butt@linux.dev>

On Fri, Apr 26, 2024 at 05:37:29PM -0700, Shakeel Butt wrote:
> At the moment, the amount of memory allocated for stats related structs
> in the mem_cgroup corresponds to the size of enum node_stat_item.
> However, not all fields in enum node_stat_item have corresponding memcg
> stats. So, let's use an indirection mechanism similar to the one used
> for memcg vmstats management.
> 
> For a given x86_64 config, the size of stats with and without patch is:
> 
> struct (size in bytes)        w/o     with
> 
> struct lruvec_stats           1128     648
> struct lruvec_stats_percpu     752     432
> struct memcg_vmstats          1832    1352
> struct memcg_vmstats_percpu   1280     960
> 
> These memory savings are further compounded by the fact that these
> structs are allocated for each CPU and for each node. To be precise,
> for each memcg the memory saved would be:
> 
> Memory saved = ((21 * 3 * NR_NODES) + (21 * 2 * NR_NODES * NR_CPUS) +
> 	       (21 * 3) + (21 * 2 * NR_CPUS)) * sizeof(long)
> 
> Where 21 is the number of fields eliminated.

Nice savings!
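To put a number on it: plugging in, say, NR_NODES = 2 and NR_CPUS = 64
(with sizeof(long) == 8), the formula above works out to

  ((21 * 3 * 2) + (21 * 2 * 2 * 64) + (21 * 3) + (21 * 2 * 64)) * 8
    = (126 + 5376 + 63 + 2688) * 8
    = 66024 bytes,

i.e. roughly 64KB saved per memcg.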

> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
>  mm/memcontrol.c | 138 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 115 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5e337ed6c6bf..c164bc9b8ed6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -576,35 +576,105 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
>  	return mz;
>  }
>  
> +/* Subset of node_stat_item for memcg stats */
> +static const unsigned int memcg_node_stat_items[] = {
> +	NR_INACTIVE_ANON,
> +	NR_ACTIVE_ANON,
> +	NR_INACTIVE_FILE,
> +	NR_ACTIVE_FILE,
> +	NR_UNEVICTABLE,
> +	NR_SLAB_RECLAIMABLE_B,
> +	NR_SLAB_UNRECLAIMABLE_B,
> +	WORKINGSET_REFAULT_ANON,
> +	WORKINGSET_REFAULT_FILE,
> +	WORKINGSET_ACTIVATE_ANON,
> +	WORKINGSET_ACTIVATE_FILE,
> +	WORKINGSET_RESTORE_ANON,
> +	WORKINGSET_RESTORE_FILE,
> +	WORKINGSET_NODERECLAIM,
> +	NR_ANON_MAPPED,
> +	NR_FILE_MAPPED,
> +	NR_FILE_PAGES,
> +	NR_FILE_DIRTY,
> +	NR_WRITEBACK,
> +	NR_SHMEM,
> +	NR_SHMEM_THPS,
> +	NR_FILE_THPS,
> +	NR_ANON_THPS,
> +	NR_KERNEL_STACK_KB,
> +	NR_PAGETABLE,
> +	NR_SECONDARY_PAGETABLE,
> +#ifdef CONFIG_SWAP
> +	NR_SWAPCACHE,
> +#endif
> +};
> +
> +static const unsigned int memcg_stat_items[] = {
> +	MEMCG_SWAP,
> +	MEMCG_SOCK,
> +	MEMCG_PERCPU_B,
> +	MEMCG_VMALLOC,
> +	MEMCG_KMEM,
> +	MEMCG_ZSWAP_B,
> +	MEMCG_ZSWAPPED,
> +};
> +
> +#define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> +#define NR_MEMCG_STATS (NR_MEMCG_NODE_STAT_ITEMS + ARRAY_SIZE(memcg_stat_items))
> +static int8_t mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
> +
> +static void init_memcg_stats(void)
> +{
> +	int8_t i, j = 0;
> +
> +	/* Switch to a wider type (e.g. int16_t) if this assertion fires. */
> +	BUILD_BUG_ON(NR_MEMCG_STATS >= 127 /* INT8_MAX */);
> +
> +	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i)
> +		mem_cgroup_stats_index[memcg_node_stat_items[i]] = ++j;
> +
> +	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i)
> +		mem_cgroup_stats_index[memcg_stat_items[i]] = ++j;
> +}
> +
> +static inline int memcg_stats_index(int idx)
> +{
> +	return mem_cgroup_stats_index[idx] - 1;
> +}
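
For other readers of the thread: as I read it, callers then consume the
translated index roughly like this -- my sketch, not code from this
series:

	/*
	 * Sketch only: a stat update goes through the translation table
	 * and bails out for items that have no per-memcg counterpart
	 * (for which memcg_stats_index() returns -1). The function name
	 * and the flat state array are illustrative, not from the patch.
	 */
	static void mod_memcg_stat_sketch(long *state, int idx, int val)
	{
		int i = memcg_stats_index(idx);

		if (i < 0)	/* item not tracked per-memcg */
			return;

		state[i] += val;
	}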

Hm, I'm slightly worried about the performance penalty from the increased
cache footprint. Can't we use some formula to translate idx to memcg_idx
instead of a translation table?
If that requires re-arranging the items, we can add a translation table
on the read side to preserve the visible order in procfs/sysfs.
Or am I overthinking this, and is the real difference negligible?
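
To illustrate what I mean, an untested sketch: if the memcg-maintained
items were renumbered into one contiguous range at the front (glossing
over the fact that node_stat_item and the memcg-only items live in
separate ranges today), the translation collapses to a bounds check:

	/*
	 * Untested sketch of the formula-based alternative: assumes the
	 * memcg-tracked items occupy indices [0, NR_MEMCG_STATS), so no
	 * per-item lookup table is needed on the write path.
	 */
	static inline int memcg_stats_index(int idx)
	{
		return idx < NR_MEMCG_STATS ? idx : -1;
	}

A small read-side table could then restore today's procfs/sysfs order.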

Thanks!

Thread overview (33+ messages):
2024-04-27  0:37 [PATCH v2 0/7] memcg: reduce memory consumption by memcg stats Shakeel Butt
2024-04-27  0:37 ` [PATCH v2 1/7] memcg: reduce memory size of mem_cgroup_events_index Shakeel Butt
2024-04-27  0:42   ` Yosry Ahmed
2024-04-27  1:15     ` Shakeel Butt
2024-04-29 15:36   ` Roman Gushchin
2024-04-27  0:37 ` [PATCH v2 2/7] memcg: dynamically allocate lruvec_stats Shakeel Butt
2024-04-27  1:23   ` Yosry Ahmed
2024-04-29 15:50   ` Roman Gushchin
2024-04-29 19:46     ` Shakeel Butt
2024-04-29 21:02       ` Roman Gushchin
2024-04-29 21:59         ` Shakeel Butt
2024-04-27  0:37 ` [PATCH v2 3/7] memcg: reduce memory for the lruvec and memcg stats Shakeel Butt
2024-04-27  0:51   ` Yosry Ahmed
2024-04-27  1:16     ` Shakeel Butt
2024-04-27  1:18       ` Yosry Ahmed
2024-04-29 16:00   ` Roman Gushchin [this message]
2024-04-29 20:00     ` Shakeel Butt
2024-04-29 17:35   ` T.J. Mercier
2024-04-29 20:13     ` Shakeel Butt
2024-04-29 22:23       ` T.J. Mercier
2024-04-27  0:37 ` [PATCH v2 4/7] memcg: cleanup __mod_memcg_lruvec_state Shakeel Butt
2024-04-27  0:53   ` Yosry Ahmed
2024-04-29 15:45   ` Roman Gushchin
2024-04-27  0:37 ` [PATCH v2 5/7] memcg: pr_warn_once for unexpected events and stats Shakeel Butt
2024-04-27  0:58   ` Yosry Ahmed
2024-04-27  1:18     ` Shakeel Butt
2024-04-27 14:22       ` Johannes Weiner
2024-04-29 19:54         ` Shakeel Butt
2024-04-29 16:06   ` Roman Gushchin
2024-04-29 19:56     ` Shakeel Butt
2024-04-27  0:37 ` [PATCH v2 6/7] memcg: use proper type for mod_memcg_state Shakeel Butt
2024-04-27  0:37 ` [PATCH v2 7/7] mm: cleanup WORKINGSET_NODES in workingset Shakeel Butt
2024-04-29 16:07   ` Roman Gushchin
