From: Haggai Eran <haggaie@mellanox.com>
To: Parav Pandit <pandit.parav@gmail.com>
Cc: <cgroups@vger.kernel.org>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-rdma@vger.kernel.org>,
	<tj@kernel.org>, <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Doug Ledford <dledford@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>, <james.l.morris@oracle.com>,
	<serge@hallyn.com>, Or Gerlitz <ogerlitz@mellanox.com>,
	Matan Barak <matanb@mellanox.com>, <raindel@mellanox.com>,
	<akpm@linux-foundation.org>,
	<linux-security-module@vger.kernel.org>
Subject: Re: [PATCH 5/7] devcg: device cgroup's extension for RDMA resource.
Date: Tue, 8 Sep 2015 17:10:50 +0300
Message-ID: <55EEEC6A.4030702@mellanox.com>
In-Reply-To: <CAG53R5VkCQGE9Ufb2JJO0Ksic_nNMJzVRF4wTrATUrnh1C0taA@mail.gmail.com>

On 08/09/2015 13:50, Parav Pandit wrote:
> On Tue, Sep 8, 2015 at 2:06 PM, Haggai Eran <haggaie@mellanox.com> wrote:
>> On 07/09/2015 23:38, Parav Pandit wrote:
>>> +void devcgroup_rdma_uncharge_resource(struct ib_ucontext *ucontext,
>>> +                                   enum devcgroup_rdma_rt type, int num)
>>> +{
>>> +     struct dev_cgroup *dev_cg, *p;
>>> +     struct task_struct *ctx_task;
>>> +
>>> +     if (!num)
>>> +             return;
>>> +
>>> +     /* get cgroup of ib_ucontext it belong to, to uncharge
>>> +      * so that when its called from any worker tasks or any
>>> +      * other tasks to which this resource doesn't belong to,
>>> +      * it can be uncharged correctly.
>>> +      */
>>> +     if (ucontext)
>>> +             ctx_task = get_pid_task(ucontext->tgid, PIDTYPE_PID);
>>> +     else
>>> +             ctx_task = current;
>> So what happens if a process creates a ucontext, forks, and then the
>> child creates and destroys a CQ? If I understand correctly, created
>> resources are always charged to the current process (the child), but
>> when it is destroyed the owner of the ucontext (the parent) will be
>> uncharged.
>>
>> Since ucontexts are not meant to be used by multiple processes, I think
>> it would be okay to always charge the owner process (the one that
>> created the ucontext).
> 
> I need to think about it. I would like to avoid keep per task resource
> counters for two reasons.
> For a while I thought that native fork() doesn't take care to share
> the RDMA resources and all CQ, QP dmaable memory from PID namespace
> perspective.
> 
> 1. Because, it could well happen that process and its child process is
> created in PID namespace_A, after which child is migrated to new PID
> namespace_B.
> after which parent from the namespace_A is terminated. I am not sure
> how the ucontext ownership changes from parent to child process at
> that point today.
> I prefer to keep this complexity out if at all it exists as process
> migration across namespaces is not a frequent event for which to
> optimize the code for.
> 
> 2. by having per task counter (as cost of memory some memory) allows
> to avoid using atomic during charge(), uncharge().
> 
> The intent is to have per task (process and thread) to have their
> resource counter instance, but I can see that its broken where its
> charging parent process as of now without atomics.
> As you said its ok to always charge the owner process, I have to relax
> 2nd requirement and fallback to use atomics for charge(), uncharge()
> or I have to get rid of ucontext from the uncharge() API which is
> difficult due to fput() being in worker thread context.
> 

I think the cost of atomic operations here would normally be negligible
compared to the cost of accessing the hardware to allocate or deallocate
these resources.
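
To make the suggestion concrete, here is a rough userspace sketch of
charging and uncharging the ucontext owner with atomic counters. It is
not the patch's code: the names (owner_counters, uctx, charge, uncharge)
and the simple per-type limit are assumptions made up for illustration,
and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
enum rdma_rt { RDMA_RT_CQ, RDMA_RT_QP, RDMA_RT_MAX };

/* Counters that belong to the process which created the ucontext. */
struct owner_counters {
	atomic_int usage[RDMA_RT_MAX];
	int limit[RDMA_RT_MAX];
};

/* Simplified ucontext: the owner is fixed at creation time. */
struct uctx {
	struct owner_counters *owner;
};

/*
 * Charge num resources of the given type to the owner of ctx,
 * regardless of which task is calling. Returns 0 on success and
 * -1 if the charge would exceed the owner's limit.
 */
static int charge(struct uctx *ctx, enum rdma_rt type, int num)
{
	atomic_int *u = &ctx->owner->usage[type];
	int old = atomic_load(u);

	do {
		if (old + num > ctx->owner->limit[type])
			return -1;
		/* On failure, old is reloaded and the limit is rechecked. */
	} while (!atomic_compare_exchange_weak(u, &old, old + num));

	return 0;
}

/* Uncharge from the same owner that was charged. */
static void uncharge(struct uctx *ctx, enum rdma_rt type, int num)
{
	atomic_fetch_sub(&ctx->owner->usage[type], num);
}
```

Because both paths resolve the counters through the ucontext rather than
through the calling task, a child that creates and destroys a CQ on an
inherited ucontext charges and uncharges the same counters, and a worker
thread doing the final cleanup can uncharge correctly as well.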
