From: Parav Pandit
Subject: Re: [PATCH 0/7] devcg: device cgroup extension for rdma resource
Date: Wed, 28 Oct 2015 13:44:12 +0530
In-Reply-To: <55FE8C06.8010504@mellanox.com>
References: <55F25781.20308@redhat.com>
 <20150911145213.GQ8114@mtj.duckdns.org>
 <1828884A29C6694DAF28B7E6B8A82373A903A586@ORSMSX109.amr.corp.intel.com>
 <20150911194311.GA18755@obsidianresearch.com>
 <1828884A29C6694DAF28B7E6B8A82373A903A5DB@ORSMSX109.amr.corp.intel.com>
 <20150914172832.GA21652@obsidianresearch.com>
 <20150914201840.GA8764@obsidianresearch.com>
 <20150915034549.GA27847@obsidianresearch.com>
 <55FE8C06.8010504@mellanox.com>
To: Haggai Eran
Cc: Jason Gunthorpe, "Hefty, Sean", Tejun Heo, Doug Ledford,
 cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
 lizefan@huawei.com, Johannes Weiner, Jonathan Corbet,
 james.l.morris@oracle.com, serge@hallyn.com, Or Gerlitz, Matan Barak,
 raindel@mellanox.com, akpm@linux-foundation.org,
 linux-security-module@vger.kernel.org

Hi,

I finally got a chance to make progress on redesigning the rdma
cgroup controller for most of the use cases we discussed in this
email chain. I am posting the RFC now and will follow up with code in
a new email.

Parav

On Sun, Sep 20, 2015 at 4:05 PM, Haggai Eran wrote:
> On 15/09/2015 06:45, Jason Gunthorpe wrote:
>> No, I'm saying the resource pool is *well defined* and *fixed* by
>> each hardware.
>>
>> The only question is how we expose the N resource limits, the list
>> of which is totally vendor specific.
>
> I don't see why you say the limits are vendor specific. It is true
> that different RDMA devices have different implementations and
> capabilities, but they all expose the same set of RDMA objects with
> their limitations. Whether those limitations come from the hardware,
> from the driver, or just because the address space is limited, they
> can still be exhausted.
>
>> Yes, using a % scheme fixes the ratios: 1% is going to be a certain
>> number of PDs, QPs, MRs, CQs, etc., at a ratio fixed by the driver
>> configuration. That is the trade-off for API simplicity.
>>
>> Yes, this results in some resources being over-provisioned.
>
> I agree that such a scheme would be easy to configure, but I don't
> think it can work well in all situations. Imagine you want to let
> one container use almost all RC QPs because you want it to connect
> to the entire cluster through RC. Other containers can still use a
> single datagram QP to connect to the entire cluster, but they would
> require many address handles. If you force a fixed ratio of
> resources on each container, it would be hard to describe such a
> partitioning.
>
> I think it would be better to expose different controls for the
> different RDMA resources.
>
> Regards,
> Haggai
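
To make the per-resource direction concrete, here is a minimal
userspace sketch of what per-cgroup, per-resource charging could look
like. All names in it (rdma_res_type, rdma_pool, pool_try_charge,
pool_uncharge) are hypothetical illustrations, not the RFC's actual
kernel API; the sketch only models the idea that each verbs object
type gets its own limit, so an RC-heavy container and a
datagram-heavy container can be partitioned independently.

#include <stdbool.h>
#include <stdio.h>

/* Each verbs object type gets its own limit, instead of one % knob.
 * Hypothetical type names, for illustration only. */
enum rdma_res_type {
        RDMA_RES_QP,    /* queue pairs */
        RDMA_RES_CQ,    /* completion queues */
        RDMA_RES_MR,    /* memory regions */
        RDMA_RES_PD,    /* protection domains */
        RDMA_RES_AH,    /* address handles */
        RDMA_RES_MAX,
};

struct rdma_pool {
        unsigned int max[RDMA_RES_MAX];   /* per-cgroup limit */
        unsigned int usage[RDMA_RES_MAX]; /* current charge   */
};

/* Charge one object; fail with no side effects if over the limit. */
static bool pool_try_charge(struct rdma_pool *p, enum rdma_res_type t)
{
        if (p->usage[t] >= p->max[t])
                return false;
        p->usage[t]++;
        return true;
}

static void pool_uncharge(struct rdma_pool *p, enum rdma_res_type t)
{
        if (p->usage[t] > 0)
                p->usage[t]--;
}

int main(void)
{
        /* One container gets nearly all QPs for RC connections; the
         * other gets few QPs but many AHs for datagram traffic. The
         * numbers are made up for illustration. */
        struct rdma_pool rc_heavy = {
                .max = { [RDMA_RES_QP] = 60000, [RDMA_RES_AH] = 16 },
        };
        struct rdma_pool ud_heavy = {
                .max = { [RDMA_RES_QP] = 4, [RDMA_RES_AH] = 60000 },
        };

        printf("rc_heavy QP charge: %d\n",
               pool_try_charge(&rc_heavy, RDMA_RES_QP));
        printf("ud_heavy AH charge: %d\n",
               pool_try_charge(&ud_heavy, RDMA_RES_AH));
        pool_uncharge(&rc_heavy, RDMA_RES_QP);
        return 0;
}

Compared with a single percentage knob, this lets one pool grant many
QPs and few AHs while the other does the opposite, which is exactly
the partitioning that a fixed ratio cannot express.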