All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
To: Parav Pandit <pandit.parav@gmail.com>
Cc: "Hefty, Sean" <sean.hefty@intel.com>, Tejun Heo <tj@kernel.org>,
	Doug Ledford <dledford@redhat.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"lizefan@huawei.com" <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	"james.l.morris@oracle.com" <james.l.morris@oracle.com>,
	"serge@hallyn.com" <serge@hallyn.com>,
	Haggai Eran <haggaie@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Matan Barak <matanb@mellanox.com>,
	"raindel@mellanox.com" <raindel@mellanox.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-security-module@vger.kernel.org" 
	<linux-security-module@vger.kernel.org>
Subject: Re: [PATCH 0/7] devcg: device cgroup extension for rdma resource
Date: Mon, 14 Sep 2015 21:45:49 -0600	[thread overview]
Message-ID: <20150915034549.GA27847@obsidianresearch.com> (raw)
In-Reply-To: <CAG53R5XY1q+AqJvgtK_Qd4Sai2kZX9vhDKD_2dNXpw4Gf=nz0A@mail.gmail.com>

On Tue, Sep 15, 2015 at 08:38:54AM +0530, Parav Pandit wrote:

> As you precisely described, about wild ratio,
> we are asking vendor driver (bottom most layer) to statically define
> what the resource pool is, without telling him which application are
> we going to run to use those pool.
> Therefore vendor layer cannot ever define "right" resource pool.

No, I'm saying the resource pool is *well defined* and *fixed* by each
hardware.

The only question is how do we expose the N resource limits, the list
of which is totally vendor specific.

Yes, using a % scheme fixes the ratios, 1% is going to be a certain
number of PD's, QP's, MRs, CQ's, etc at a ratio fixed by the driver
configuration. That is the trade off for API simplicity.

Yes, this results in some resources being over provisioned.

I have no idea if that is usable for the workloads people want to run..

But *there is no middle option*. Either each and every single hardware
limited resources has a dedicated per-container limit, or they are
*somehow* bundled and the ratios become fixed.

If Tejun says we can't have something so emphemeral as a vendor
specific list of hardware resource pools - then what choice is
left?

> Instead of bringing such complex solution, that affecting all the
> layers which solves the same problem as this patch,
> its better to keep definition of "bundle" in the user
> library/application deployment engine.
> where bundle is set of those resources.

The kernel has to do the restriction, so at some point you are telling
the kernel to limit each and every unique resource the HW has, which
is back to the original patch set, munging how the data is passed
makes no difference to the basic objection, IMHO.

> rdma cgroup will allow us to run post 512 or 1024 containers without
> using PCIe SR-IOV, without creating any vendor specific resource
> pools.

If you ignore any vendor specific resource limits then you've just
left open a hole, a wayward container can exhaust all others - so what
was the point of doing all this work?

Jason

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Parav Pandit <pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: "Hefty,
	Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org"
	<lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>,
	"james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org"
	<james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	"serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org"
	<serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>,
	Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	"raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org"
	<raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	"akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org"
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	"linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH 0/7] devcg: device cgroup extension for rdma resource
Date: Mon, 14 Sep 2015 21:45:49 -0600	[thread overview]
Message-ID: <20150915034549.GA27847@obsidianresearch.com> (raw)
In-Reply-To: <CAG53R5XY1q+AqJvgtK_Qd4Sai2kZX9vhDKD_2dNXpw4Gf=nz0A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, Sep 15, 2015 at 08:38:54AM +0530, Parav Pandit wrote:

> As you precisely described, about wild ratio,
> we are asking vendor driver (bottom most layer) to statically define
> what the resource pool is, without telling him which application are
> we going to run to use those pool.
> Therefore vendor layer cannot ever define "right" resource pool.

No, I'm saying the resource pool is *well defined* and *fixed* by each
hardware.

The only question is how do we expose the N resource limits, the list
of which is totally vendor specific.

Yes, using a % scheme fixes the ratios, 1% is going to be a certain
number of PD's, QP's, MRs, CQ's, etc at a ratio fixed by the driver
configuration. That is the trade off for API simplicity.

Yes, this results in some resources being over provisioned.

I have no idea if that is usable for the workloads people want to run..

But *there is no middle option*. Either each and every single hardware
limited resources has a dedicated per-container limit, or they are
*somehow* bundled and the ratios become fixed.

If Tejun says we can't have something so emphemeral as a vendor
specific list of hardware resource pools - then what choice is
left?

> Instead of bringing such complex solution, that affecting all the
> layers which solves the same problem as this patch,
> its better to keep definition of "bundle" in the user
> library/application deployment engine.
> where bundle is set of those resources.

The kernel has to do the restriction, so at some point you are telling
the kernel to limit each and every unique resource the HW has, which
is back to the original patch set, munging how the data is passed
makes no difference to the basic objection, IMHO.

> rdma cgroup will allow us to run post 512 or 1024 containers without
> using PCIe SR-IOV, without creating any vendor specific resource
> pools.

If you ignore any vendor specific resource limits then you've just
left open a hole, a wayward container can exhaust all others - so what
was the point of doing all this work?

Jason

  reply	other threads:[~2015-09-15  3:46 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-07 20:38 [PATCH 0/7] devcg: device cgroup extension for rdma resource Parav Pandit
2015-09-07 20:38 ` Parav Pandit
2015-09-07 20:38 ` [PATCH 1/7] devcg: Added user option to rdma resource tracking Parav Pandit
2015-09-07 20:38   ` Parav Pandit
2015-09-07 20:38 ` [PATCH 2/7] devcg: Added rdma resource tracking module Parav Pandit
2015-09-07 20:38   ` Parav Pandit
2015-09-07 20:38 ` [PATCH 3/7] devcg: Added infrastructure for rdma device cgroup Parav Pandit
2015-09-08  5:31   ` Haggai Eran
2015-09-08  5:31     ` Haggai Eran
2015-09-08  7:02     ` Parav Pandit
2015-09-08  7:02       ` Parav Pandit
2015-09-07 20:38 ` [PATCH 4/7] devcg: Added rdma resource tracker object per task Parav Pandit
2015-09-08  5:48   ` Haggai Eran
2015-09-08  5:48     ` Haggai Eran
2015-09-08  7:04     ` Parav Pandit
2015-09-08  8:24       ` Haggai Eran
2015-09-08  8:24         ` Haggai Eran
2015-09-08  8:26         ` Parav Pandit
2015-09-07 20:38 ` [PATCH 5/7] devcg: device cgroup's extension for RDMA resource Parav Pandit
2015-09-07 20:38   ` Parav Pandit
2015-09-08  8:22   ` Haggai Eran
2015-09-08  8:22     ` Haggai Eran
2015-09-08 10:18     ` Parav Pandit
2015-09-08 13:50       ` Haggai Eran
2015-09-08 13:50         ` Haggai Eran
2015-09-08 14:13         ` Parav Pandit
2015-09-08  8:36   ` Haggai Eran
2015-09-08  8:36     ` Haggai Eran
2015-09-08 10:50     ` Parav Pandit
2015-09-08 10:50       ` Parav Pandit
2015-09-08 14:10       ` Haggai Eran
2015-09-08 14:10         ` Haggai Eran
2015-09-07 20:38 ` [PATCH 6/7] devcg: Added support to use RDMA device cgroup Parav Pandit
2015-09-08  8:40   ` Haggai Eran
2015-09-08  8:40     ` Haggai Eran
2015-09-08 10:22     ` Parav Pandit
2015-09-08 13:40       ` Haggai Eran
2015-09-08 13:40         ` Haggai Eran
2015-09-07 20:38 ` [PATCH 7/7] devcg: Added Documentation of " Parav Pandit
2015-09-07 20:38   ` Parav Pandit
2015-09-07 20:55 ` [PATCH 0/7] devcg: device cgroup extension for rdma resource Parav Pandit
2015-09-08 12:45 ` Haggai Eran
2015-09-08 12:45   ` Haggai Eran
2015-09-08 15:23 ` Tejun Heo
2015-09-08 15:23   ` Tejun Heo
2015-09-09  3:57   ` Parav Pandit
2015-09-10 16:49     ` Tejun Heo
2015-09-10 17:46       ` Parav Pandit
2015-09-10 17:46         ` Parav Pandit
2015-09-10 20:22         ` Tejun Heo
2015-09-11  3:39           ` Parav Pandit
2015-09-11  4:04             ` Tejun Heo
2015-09-11  4:04               ` Tejun Heo
2015-09-11  4:24               ` Doug Ledford
2015-09-11  4:24                 ` Doug Ledford
2015-09-11 14:52                 ` Tejun Heo
2015-09-11 14:52                   ` Tejun Heo
2015-09-11 16:26                   ` Parav Pandit
2015-09-11 16:34                     ` Tejun Heo
2015-09-11 16:34                       ` Tejun Heo
2015-09-11 16:39                       ` Parav Pandit
2015-09-11 16:39                         ` Parav Pandit
2015-09-11 19:25                         ` Tejun Heo
2015-09-14 10:18                           ` Parav Pandit
2015-09-14 10:18                             ` Parav Pandit
2015-09-11 16:47                   ` Parav Pandit
2015-09-11 16:47                     ` Parav Pandit
2015-09-11 19:05                     ` Tejun Heo
2015-09-11 19:05                       ` Tejun Heo
2015-09-11 19:22                   ` Hefty, Sean
2015-09-11 19:43                     ` Jason Gunthorpe
2015-09-11 19:43                       ` Jason Gunthorpe
2015-09-11 20:06                       ` Hefty, Sean
2015-09-14 11:09                         ` Parav Pandit
2015-09-14 14:04                           ` Parav Pandit
2015-09-14 15:21                             ` Tejun Heo
2015-09-14 15:21                               ` Tejun Heo
2015-09-14 17:28                           ` Jason Gunthorpe
2015-09-14 17:28                             ` Jason Gunthorpe
2015-09-14 18:54                             ` Parav Pandit
2015-09-14 18:54                               ` Parav Pandit
2015-09-14 20:18                               ` Jason Gunthorpe
2015-09-15  3:08                                 ` Parav Pandit
2015-09-15  3:45                                   ` Jason Gunthorpe [this message]
2015-09-15  3:45                                     ` Jason Gunthorpe
2015-09-16  4:41                                     ` Parav Pandit
2015-09-16  4:41                                       ` Parav Pandit
2015-09-20 10:35                                     ` Haggai Eran
2015-09-20 10:35                                       ` Haggai Eran
2015-10-28  8:14                                       ` Parav Pandit
2015-10-28  8:14                                         ` Parav Pandit
2015-09-14 10:15                     ` Parav Pandit
2015-09-11  4:43               ` Parav Pandit
2015-09-11 15:03                 ` Tejun Heo
2015-09-10 17:48       ` Hefty, Sean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150915034549.GA27847@obsidianresearch.com \
    --to=jgunthorpe@obsidianresearch.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=dledford@redhat.com \
    --cc=haggaie@mellanox.com \
    --cc=hannes@cmpxchg.org \
    --cc=james.l.morris@oracle.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=matanb@mellanox.com \
    --cc=ogerlitz@mellanox.com \
    --cc=pandit.parav@gmail.com \
    --cc=raindel@mellanox.com \
    --cc=sean.hefty@intel.com \
    --cc=serge@hallyn.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.