Linux-mm Archive mirror
 help / color / mirror / Atom feed
From: Kairui Song <ryncsn@gmail.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Muchun Song <muchun.song@linux.dev>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Michal Hocko <mhocko@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	gthelen@google.coma, rientjes@google.com,
	 Chris Li <chrisl@kernel.org>
Subject: Re: [PATCH rfc 0/9] mm: memcg: separate legacy cgroup v1 code and put under config option
Date: Thu, 23 May 2024 01:58:49 +0800	[thread overview]
Message-ID: <CAMgjq7ARr0=OG8GQOJvzLtfdrtiwvJ19-mx1snxqmLE0Za+QCw@mail.gmail.com> (raw)
In-Reply-To: <jf44dfyaenz6xmn2hcodaginrshw5d5hfhmakdxtj4x6szk6b2@cr2rxamkgj2m>

On Thu, May 9, 2024 at 2:33 PM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Wed, May 08, 2024 at 08:41:29PM -0700, Roman Gushchin wrote:
> > Cgroups v2 have been around for a while and many users have fully adopted them,
> > so they never use cgroups v1 features and functionality. Yet they have to "pay"
> > for the cgroup v1 support anyway:
> > 1) the kernel binary contains useless cgroup v1 code,
> > 2) some common structures like task_struct and mem_cgroup have never used
> >    cgroup v1-specific members,
> > 3) some code paths have additional checks which are not needed.
> >
> > Cgroup v1's memory controller has a number of features that are not supported
> > by cgroup v2 and their implementation is pretty much self contained.
> > Most notably, these features are: soft limit reclaim, oom handling in userspace,
> > complicated event notification system, charge migration.
> >
> > Cgroup v1-specific code in memcontrol.c is close to 4k lines in size and it's
> > intervened with generic and cgroup v2-specific code. It's a burden on
> > developers and maintainers.
> >
> > This patchset aims to solve these problems by:
> > 1) moving cgroup v1-specific memcg code to the new mm/memcontrol-v1.c file,
> > 2) putting definitions shared by memcontrol.c and memcontrol-v1.c into the
> >    mm/internal.h header
> > 3) introducing the CONFIG_MEMCG_V1 config option, turned on by default
> > 4) making memcontrol-v1.c to compile only if CONFIG_MEMCG_V1 is set
> > 5) putting unused struct memory_cgroup and task_struct members under
> >    CONFIG_MEMCG_V1 as well.
> >
> > This is an RFC version, which is not 100% polished yet, so but it would be great
> > to discuss and agree on the overall approach.
> >
> > Some open questions, opinions are appreciated:
> > 1) I consider renaming non-static functions in memcontrol-v1.c to have
> >    mem_cgroup_v1_ prefix. Is this a good idea?
> > 2) Do we want to extend it beyond the memory controller? Should
> > 3) Is it better to use a new include/linux/memcontrol-v1.h instead of
> >    mm/internal.h? Or mm/memcontrol-v1.h.
> >
>
> Hi Roman,
>
> A very timely and important topic and we should definitely talk about it
> during LSFMM as well. I have been thinking about this problem for quite
> sometime and I am getting more and more convinced that we should aim to
> completely deprecate memcg-v1.
>
> More specifically:
>
> 1. What are the memcg-v1 features which have no alternative in memcg-v2
> and are blocker for memcg-v1 users? (setting aside the cgroup v2
> structual restrictions)
>
> 2. What are unused memcg-v1 features which we should start deprecating?
>
> IMO we should systematically start deprecating memcg-v1 features and
> start unblocking the users stuck on memcg-v1.
>
> Now regarding the proposal in this series, I think it can be a first
> step but should not give an impression that we are done. The only
> concern I have is the potential of "out of sight, out of mind" situation
> with this change but if we keep the momentum of deprecation of memcg-v1
> it should be fine.
>
> I have CCed Greg and David from Google to get their opinion on what
> memcg-v1 features are blocker for their memcg-v2 migration and if they
> have concern in deprecation of memcg-v1 features.
>
> Anyone else still on memcg-v1, please do provide your input.
>

Hi,

Sorry for joining the discussion late, but I'd like to add some info
here: We are using the "memsw" feature a lot. It's a very useful knob
for container memory overcommitting: It's a great abstraction of the
"expected total memory usage" of a container, so containers can't
allocate too much memory using SWAP, but still be able to SWAP out.

For a simple example, with memsw.limit == memory.limit, containers
can't exceed their original memory limit, even with SWAP enabled, they
get OOM killed as how they used to, but the host is now able to
offload cold pages.

Similar ability seems absent with V2: With memory.swap.max == 0, the
host can't use SWAP to reclaim container memory at all. But with a
value larger than that, containers are able to overuse memory, causing
delayed OOM kill, thrashing, CPU/Memory usage ratio could be heavily
out of balance, especially with compress SWAP backends.

Cgroup accounting of ZSWAP/ZRAM doesn't really help, we want to
account for the total raw usage, not the compressed usage. One example
is that if a container uses tons of duplicated pages, then it can
allocate much more memory than it is limited, that could cause
trouble.

I saw Chris also mentioned Google has a workaround internally for it
for Cgroup V2. This will be a blocker for us and a similar workaround
might be needed. It will be great so see an upstream support for this.


  parent reply	other threads:[~2024-05-22 17:59 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-09  3:41 [PATCH rfc 0/9] mm: memcg: separate legacy cgroup v1 code and put under config option Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 1/9] mm: memcg: introduce memcontrol-v1.c Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 2/9] mm: memcg: move soft limit reclaim code to memcontrol-v1.c Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 3/9] mm: memcg: move charge migration " Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 4/9] mm: memcg: move legacy memcg event code into memcontrol-v1.c Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 5/9] mm: memcg: move cgroup v1 interface files to memcontrol-v1.c Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 6/9] mm: memcg: move cgroup v1 oom handling code into memcontrol-v1.c Roman Gushchin
2024-05-10 13:26   ` Michal Hocko
2024-05-25  1:03     ` Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 7/9] mm: memcg: put cgroup v1-specific code under a config option Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 8/9] mm: memcg: put corresponding struct mem_cgroup members under CONFIG_MEMCG_V1 Roman Gushchin
2024-05-09  3:41 ` [PATCH rfc 9/9] mm: memcg: put cgroup v1-related members of task_struct under config option Roman Gushchin
2024-05-09  6:33 ` [PATCH rfc 0/9] mm: memcg: separate legacy cgroup v1 code and put " Shakeel Butt
2024-05-09 17:30   ` Roman Gushchin
2024-05-10  2:59   ` David Rientjes
2024-05-10  7:10     ` Chris Li
2024-05-10  8:10     ` Michal Hocko
2024-05-16  3:35   ` Yafang Shao
2024-05-16 17:29     ` Roman Gushchin
2024-05-17  2:21       ` Yafang Shao
2024-05-18  2:13         ` Roman Gushchin
2024-05-18  7:32     ` Shakeel Butt
2024-05-20  2:14       ` Yafang Shao
2024-05-22 17:58   ` Kairui Song [this message]
2024-05-23 19:55     ` Roman Gushchin
2024-05-23 20:26       ` Chris Li
2024-05-28 17:20       ` Kairui Song
2024-05-09 14:22 ` Johannes Weiner
2024-05-09 14:36   ` Johannes Weiner
2024-05-09 14:57     ` Roman Gushchin
2024-05-10 14:18       ` Johannes Weiner
2024-05-10 13:33 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMgjq7ARr0=OG8GQOJvzLtfdrtiwvJ19-mx1snxqmLE0Za+QCw@mail.gmail.com' \
    --to=ryncsn@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=chrisl@kernel.org \
    --cc=gthelen@google.coma \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).