From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754439AbbIHKu0 (ORCPT ); Tue, 8 Sep 2015 06:50:26 -0400
Received: from mail-wi0-f171.google.com ([209.85.212.171]:38564 "EHLO
	mail-wi0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754003AbbIHKuT (ORCPT );
	Tue, 8 Sep 2015 06:50:19 -0400
MIME-Version: 1.0
In-Reply-To: <55EE9DF5.7030401@mellanox.com>
References: <1441658303-18081-1-git-send-email-pandit.parav@gmail.com>
	<1441658303-18081-6-git-send-email-pandit.parav@gmail.com>
	<55EE9DF5.7030401@mellanox.com>
Date: Tue, 8 Sep 2015 16:20:17 +0530
Message-ID:
Subject: Re: [PATCH 5/7] devcg: device cgroup's extension for RDMA resource.
From: Parav Pandit
To: Haggai Eran
Cc: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	tj@kernel.org, lizefan@huawei.com, Johannes Weiner, Doug Ledford,
	Jonathan Corbet, james.l.morris@oracle.com, serge@hallyn.com,
	Or Gerlitz, Matan Barak, raindel@mellanox.com,
	akpm@linux-foundation.org, linux-security-module@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 8, 2015 at 2:06 PM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> +void devcgroup_rdma_uncharge_resource(struct ib_ucontext *ucontext,
>> +				      enum devcgroup_rdma_rt type, int num)
>> +{
>> +	struct dev_cgroup *dev_cg, *p;
>> +	struct task_struct *ctx_task;
>> +
>> +	if (!num)
>> +		return;
>> +
>> +	/* get cgroup of ib_ucontext it belong to, to uncharge
>> +	 * so that when its called from any worker tasks or any
>> +	 * other tasks to which this resource doesn't belong to,
>> +	 * it can be uncharged correctly.
>> +	 */
>> +	if (ucontext)
>> +		ctx_task = get_pid_task(ucontext->tgid, PIDTYPE_PID);
>> +	else
>> +		ctx_task = current;
>
> So what happens if a process creates a ucontext, forks, and then the
> child creates and destroys a CQ? If I understand correctly, created
> resources are always charged to the current process (the child), but
> when it is destroyed the owner of the ucontext (the parent) will be
> uncharged.
>
> Since ucontexts are not meant to be used by multiple processes, I think
> it would be okay to always charge the owner process (the one that
> created the ucontext).

I need to think about it.
I would like to keep per-task resource counters, for two reasons.
For a while I thought that a native fork() does not take care of
sharing the RDMA resources and all the CQ and QP DMA-able memory, from
a PID namespace perspective.

1. It could well happen that a process and its child are created in
PID namespace_A, after which the child is migrated to a new PID
namespace_B, after which the parent in namespace_A is terminated.
I am not sure how ucontext ownership changes from the parent to the
child process at that point today.
I prefer to keep this complexity out, if it exists at all, as process
migration across namespaces is not a frequent event worth optimizing
the code for.

2. Having a per-task counter (at the cost of some memory) avoids
using atomics during charge() and uncharge().

The intent is for each task (process and thread) to have its own
resource counter instance, but I can see that it is broken where it is
charging the parent process as of now without atomics.
As you said, it is OK to always charge the owner process, so I have to
either relax the 2nd requirement and fall back to atomics for charge()
and uncharge(), or get rid of ucontext from the uncharge() API, which
is difficult because fput() runs in worker thread context.
>
>> +	dev_cg = task_devcgroup(ctx_task);
>> +
>> +	spin_lock(&ctx_task->rdma_res_counter->lock);
>> +	ctx_task->rdma_res_counter->usage[type] -= num;
>> +
>> +	for (p = dev_cg; p; p = parent_devcgroup(p))
>> +		uncharge_resource(p, type, num);
>> +
>> +	spin_unlock(&ctx_task->rdma_res_counter->lock);
>> +
>> +	if (type == DEVCG_RDMA_RES_TYPE_UCTX)
>> +		rdma_free_res_counter(ctx_task);
>> +}
>> +EXPORT_SYMBOL(devcgroup_rdma_uncharge_resource);
>
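
For illustration only, the fallback discussed above (always charge the
owner of the ucontext and use atomics for charge() and uncharge()) could
look roughly like the userspace sketch below. It is not the patch's code
and not kernel code: it uses C11 <stdatomic.h> in place of kernel
atomics, and struct owner_counter, enum res_type, owner_counter_init(),
owner_try_charge() and owner_uncharge() are hypothetical names invented
for this example. It also assumes num is a small non-negative value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical resource types; the patch itself uses enum devcgroup_rdma_rt. */
enum res_type { RES_UCTX, RES_CQ, RES_QP, RES_MAX };

/*
 * Hypothetical counter kept once per ucontext owner and shared by every
 * task that creates or destroys resources on that ucontext.
 */
struct owner_counter {
	atomic_int usage[RES_MAX];
	int limit[RES_MAX];
};

static void owner_counter_init(struct owner_counter *c, const int *limits)
{
	for (int t = 0; t < RES_MAX; t++) {
		atomic_init(&c->usage[t], 0);
		c->limit[t] = limits[t];
	}
}

/* Charge num units to the owner; leaves usage untouched if over the limit. */
static bool owner_try_charge(struct owner_counter *c, enum res_type t, int num)
{
	int old = atomic_load(&c->usage[t]);

	do {
		if (old + num > c->limit[t])
			return false;
		/* On failure, old is refreshed with the current value and we retry. */
	} while (!atomic_compare_exchange_weak(&c->usage[t], &old, old + num));

	return true;
}

/* Uncharge num units from the same owner counter that was charged. */
static void owner_uncharge(struct owner_counter *c, enum res_type t, int num)
{
	int prev = atomic_fetch_sub(&c->usage[t], num);

	/* Going below zero would mean charge and uncharge hit different counters. */
	assert(prev >= num);
}
```

Because the parent, a forked child and a worker thread would all update
the same owner counter, a CQ created by the child and destroyed later is
charged and uncharged in one place. The cost is the atomic
read-modify-write on every charge and uncharge, which is what reason 2
above was trying to avoid.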