* Fwd: CGroups and pthreads
       [not found] ` <CALaYU_BZ8iuHnAgkss1wO7BK3qULgotYSpmX4nqX=uC+aTnddA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-01-27 16:38   ` Dermot McGahon
  2014-01-28 17:00   ` Dermot McGahon
  1 sibling, 0 replies; 5+ messages in thread
From: Dermot McGahon @ 2014-01-27 16:38 UTC
  To: cgroups-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

Is it possible to apply cgroup memory subsystem controls to threads
created with pthread_create() / clone(), or only to tasks that have been
created using fork and exec?

In testing, we seem to be seeing that all allocations are accounted
for against the PID / TGID and never the pthread_create()'d TID, even
though the TID is an LWP and can be seen using top (though RSS is
aggregate and global of course).

Attached is a simple test program used to print PID / TID and allocate
memory from a cloned TID. After setting breakpoints in child and
parent, setting up a cgroups hierarchy of 'parent' and 'child',
applying memory.limit_in_bytes and memory.memsw.limit_in_bytes to the
child cgroup only, and adding the PID to the parent group and the TID
to the child group, we see that behaviour.

Is that expected? I realise that the subsystems are all different, but
what is confusing us slightly is that we have previously used the CPU
subsystem to set cpu.shares, and adding LWPs / TIDs to individual
cgroups worked just fine for that.

Am I misconfiguring somehow, or is this a known difference between the
CPU and MEMORY subsystems?

[-- Attachment #2: pthread_test.c --]
[-- Type: text/x-csrc, Size: 1094 bytes --]

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Runs in the new thread: allocate and touch 100 MB, then idle so the
   cgroup counters can be inspected. */
void * thread_func( void * arg )
{
    (void) arg;
    printf( "thread pid=%d, thread tid=%ld\n", (int) getpid(), syscall( SYS_gettid ) );

    size_t one_hundred_mb = 100 * 1024 * 1024;
    void * allocatedChunk = malloc ( one_hundred_mb );

    if ( allocatedChunk == NULL )
    {
        printf("couldn't allocate\n");
    }
    else
    {
        /* Touch the pages only after the NULL check, so the memory is
           actually faulted in and charged. */
        memset( allocatedChunk, 0, one_hundred_mb );
        printf( "PID: %d, TID: %ld - has allocated 100mb\n", (int) getpid(), syscall( SYS_gettid ) );
    }

    sleep(1000);
    return NULL;
}

int main( void )
{
    printf( "main pid=%d, main tid=%ld\n", (int) getpid(), syscall( SYS_gettid ) );

    pthread_t thread1;
    if ( pthread_create( &thread1, NULL, thread_func, NULL ) != 0 )
    {
        fprintf( stderr, "pthread_create failed\n" );
        return 1;
    }

/*    Alternative: allocate from a forked child instead of a thread.
    pid_t childpid;
    childpid = fork();

    if ( childpid >= 0 )
    {
       if ( childpid == 0 )
       {
          thread_func( NULL ); // child
       }
       else
       {
          sleep(1000); // parent
       }
    }
    else
    {
       perror("fork");
       exit(1);
    } */

    sleep(1000);
    return 0;
}



* Re: CGroups and pthreads
       [not found] ` <CALaYU_BZ8iuHnAgkss1wO7BK3qULgotYSpmX4nqX=uC+aTnddA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-01-27 16:38   ` Fwd: CGroups and pthreads Dermot McGahon
@ 2014-01-28 17:00   ` Dermot McGahon
       [not found]     ` <CALaYU_AGFVdo1jaaNmN=KDH2Nr3=_Ud8WXzTXdgxpmJuwL_FAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Dermot McGahon @ 2014-01-28 17:00 UTC
  To: cgroups-u79uwXL29TY76Z2rM5mHXA

Kernel version is 2.6.32 stock kernel for Red Hat Enterprise Linux 6.

Is there anyone on this list with good knowledge of how the memory subsystem
behaves?


On 27 January 2014 16:36, Dermot McGahon <dmcgahon-mOLzCpKykOlBDgjK7y7TUQ@public.gmane.org> wrote:
> Is it possible to apply cgroup memory subsystem controls to threads
> created with pthread_create() / clone or only tasks that have been
> created using fork and exec?
>
> In testing, we seem to be seeing that all allocations are accounted
> for against the PPID / TGID and never the pthread_create()'d TID, even
> though the TID is an LWP and can be seen using top (though RSS is
> aggregate and global of course).
>
> Attached is a simple test program used to print PID / TID and allocate
> memory from a cloned TID. After setting breakpoints in child and
> parent and setting up a cgroups hierarchy of 'parent' and 'child',
> apply memory.limit_in_bytes and memory.memsw.limit_in_bytes to the
> child cgroup only and adding the PID to the parent group and the TID
> to the child group we see that behaviour.
>
> Is that expected? I realise that the subsystems are all different but
> what is confusing us slightly is that we have previously used the CPU
> subsystem to set cpu_shares and adding LWP / TID's to individual
> cgroups worked just fine for that
>
> Am I misconfiguring somehow or is this a known difference between CPU
> and MEMORY subsystems?
>


* Re: CGroups and pthreads
       [not found]     ` <CALaYU_AGFVdo1jaaNmN=KDH2Nr3=_Ud8WXzTXdgxpmJuwL_FAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-01-29 16:16       ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2014-01-29 16:16 UTC
  To: Dermot McGahon; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue 28-01-14 17:00:19, Dermot McGahon wrote:
> Kernel version is 2.6.32 stock kernel for Red Hat Enterprise Linux 6.
> 
> Is there anyone on this list with good knowledge of how the memory subsystem
> behaves?

Posting to linux-mm@kvack.org would give you a better chance of getting
a reply.
 
> On 27 January 2014 16:36, Dermot McGahon <dmcgahon-mOLzCpKykOlBDgjK7y7TUQ@public.gmane.org> wrote:
> > Is it possible to apply cgroup memory subsystem controls to threads
> > created with pthread_create() / clone or only tasks that have been
> > created using fork and exec?

Memory cgroup controller charges memory per address space which is
shared between threads.

[...]
> > Is that expected? I realise that the subsystems are all different
> > but what is confusing us slightly is that we have previously used
> > the CPU subsystem to set cpu_shares and adding LWP / TID's to
> > individual cgroups worked just fine for that

Please note that this will most probably change in the future because
all the containers will be per-task rather than per-thread.
-- 
Michal Hocko
SUSE Labs


* Fwd: CGroups and pthreads
       [not found] <CALaYU_BZ8iuHnAgkss1wO7BK3qULgotYSpmX4nqX=uC+aTnddA@mail.gmail.com>
       [not found] ` <CALaYU_BZ8iuHnAgkss1wO7BK3qULgotYSpmX4nqX=uC+aTnddA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-01-29 17:15 ` Dermot McGahon
  2014-01-31 20:24   ` James Bottomley
  1 sibling, 1 reply; 5+ messages in thread
From: Dermot McGahon @ 2014-01-29 17:15 UTC
  To: linux-mm

[-- Attachment #1: Type: text/plain, Size: 1998 bytes --]

Forwarding a question that was first asked on the cgroups mailing list.
Someone recommended asking here instead. We believe that we received
the correct answer, which is that the cgroup memory subsystem always
charges to the leader of the Process Group rather than to the TID.
Could someone confirm that this is definitely the case? (Testing does
bear that out.) It does make sense to us, since who is to say which
thread the process's shared memory should be accounted to?
Unfortunately, in our specific scenario, which is a JVM that generally
allocates out of the heap but occasionally loads native libraries that
can allocate using malloc() in known threads, we would have that
information. But we can see that in the general case it may not be
that useful to account per-thread.

Would appreciate any comments you may have.

-----------

Question originally posted to cgroups mailing list:

Is it possible to apply cgroup memory subsystem controls to threads
created with pthread_create() / clone(), or only to tasks that have been
created using fork and exec?

In testing, we seem to be seeing that all allocations are accounted
for against the PID / TGID and never the pthread_create()'d TID, even
though the TID is an LWP and can be seen using top (though RSS is
aggregate and global of course).

Attached is a simple test program used to print PID / TID and allocate
memory from a cloned TID. After setting breakpoints in child and
parent, setting up a cgroups hierarchy of 'parent' and 'child',
applying memory.limit_in_bytes and memory.memsw.limit_in_bytes to the
child cgroup only, and adding the PID to the parent group and the TID
to the child group, we see that behaviour.

Is that expected? I realise that the subsystems are all different, but
what is confusing us slightly is that we have previously used the CPU
subsystem to set cpu.shares, and adding LWPs / TIDs to individual
cgroups worked just fine for that.

Am I misconfiguring somehow, or is this a known difference between the
CPU and MEMORY subsystems?



* Re: Fwd: CGroups and pthreads
  2014-01-29 17:15 ` Fwd: " Dermot McGahon
@ 2014-01-31 20:24   ` James Bottomley
  0 siblings, 0 replies; 5+ messages in thread
From: James Bottomley @ 2014-01-31 20:24 UTC
  To: Dermot McGahon; +Cc: linux-mm, cgroups

[cc to cgroups@ added]
On Wed, 2014-01-29 at 17:15 +0000, Dermot McGahon wrote:
> Forwarding a question that was first asked on cgroups mailing list.
> Someone recommended asking here instead.

Right, but you still need to keep cgroups in the cc, otherwise the
thread gets fractured.

>  We believe that we received
> the correct answer, which is that cgroup memory subsystem charges
> always to the leader of the Process Group rather than to the TID.
> Could someone confirm that is definitely the case (testing does bear
> that out).

Michal Hocko already told you that the memory controller charges per
address space.  Threads within a process all share the same address
space so there's no physical way they can get charged separately.

>  It does make sense to us, since who is to say which thread
> should the process shared memory be accounted to. Unfortunately, in
> our specific scenario, which is a JVM that generally allocated out of
> the heap but occasionally loads native libraries that can allocate
> using malloc() in known threads, we would have that information. But
> we can see that in the general case it may not be that useful to
> account per-thread.

What is it you're trying to do?  Give a per thread memory allocation
limit?  That's not possible with cgroups because the threads share an
address space ... I don't even think it's possible with current glibc
and limits because heap space is shared between the threads as well.
This is a consequence of the fact that the brk system call is per
process not per thread.

> Would appreciate any comments you may have.
> 
> -----------
> 
> Question originally posted to cgroups mailing list:
> 
> Is it possible to apply cgroup memory subsystem controls to threads
> created with pthread_create() / clone or only tasks that have been
> created using fork and exec?

It is only possible to assert separate controls for things which have
different address spaces.  Usually fork/exec gives the new process a new
address space (although it doesn't have to).

> In testing, we seem to be seeing that all allocations are accounted
> for against the PPID / TGID and never the pthread_create()'d TID, even
> though the TID is an LWP and can be seen using top (though RSS is
> aggregate and global of course).
> 
> Attached is a simple test program used to print PID / TID and allocate
> memory from a cloned TID. After setting breakpoints in child and
> parent and setting up a cgroups hierarchy of 'parent' and 'child',
> apply memory.limit_in_bytes and memory.memsw.limit_in_bytes to the
> child cgroup only and adding the PID to the parent group and the TID
> to the child group we see that behaviour.
> 
> Is that expected? I realise that the subsystems are all different but
> what is confusing us slightly is that we have previously used the CPU
> subsystem to set cpu_shares and adding LWP / TID's to individual
> cgroups worked just fine for that
> 
> Am I misconfiguring somehow or is this a known difference between CPU
> and MEMORY subsystems?

Yes, CPU operates within the scheduler and all schedulable entities
(that's threads or processes) can be accounted separately.  memcg
operates on address spaces, so only things with separate address spaces
can be accounted separately.

James


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

