All the mail mirrored from lore.kernel.org
* [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation
@ 2018-09-12 16:40 Chris Wilson
  2018-09-12 16:40 ` [PATCH 2/3] drm/i915: Restrict wait to client's timeline on i915_request alloc failure Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Chris Wilson @ 2018-09-12 16:40 UTC (permalink / raw)
  To: intel-gfx

If we try and fail to allocate an i915_request, we apply some
backpressure on the clients to throttle the memory allocations coming
from i915.ko. Currently, we wait until completely idle, but this is far
too heavy and leads to some situations where the only escape is to
declare a client hung and reset the GPU. The intent is to only ratelimit
the allocation requests, so we need only wait for a jiffy before using
the normal direct reclaim.
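
For context, the hunk below sits in i915_request_alloc(); the
surrounding flow is roughly the following (a simplified sketch, not the
verbatim source; in particular the retry is paraphrased from memory):

	rq = kmem_cache_alloc(i915->requests,
			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
	if (unlikely(!rq)) {
		/* Throttle the client; this hunk changes the wait from
		 * "until completely idle" (MAX_SCHEDULE_TIMEOUT) to a
		 * single jiffy. */
		ret = i915_gem_wait_for_idle(i915,
					     I915_WAIT_LOCKED |
					     I915_WAIT_INTERRUPTIBLE,
					     1);
		if (ret)
			goto err_unreserve;

		/* Retry the allocation, now letting direct reclaim block. */
		rq = kmem_cache_alloc(i915->requests, GFP_KERNEL);
		if (!rq) {
			ret = -ENOMEM;
			goto err_unreserve;
		}
	}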

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106680
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 09ed48833b54..588bc5a4d18b 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -736,7 +736,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 		ret = i915_gem_wait_for_idle(i915,
 					     I915_WAIT_LOCKED |
 					     I915_WAIT_INTERRUPTIBLE,
-					     MAX_SCHEDULE_TIMEOUT);
+					     1);
 		if (ret)
 			goto err_unreserve;
 
-- 
2.19.0


* [PATCH 2/3] drm/i915: Restrict wait to client's timeline on i915_request alloc failure
  2018-09-12 16:40 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
@ 2018-09-12 16:40 ` Chris Wilson
  2018-09-12 16:40 ` [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion Chris Wilson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-09-12 16:40 UTC (permalink / raw)
  To: intel-gfx

Let's try not to overly penalize the unlucky client by making them wait
for others to complete their work, and only apply the ratelimit if they
themselves have outstanding work. Still, we apply global reclaim to the
client (we need to scavenge some memory for it) so it doesn't get away
completely scot-free.
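
A note on the error handling in the hunk below (my annotation; the
return-value convention is i915_request_wait()'s as I recall it, so
treat this as a sketch): only a signal aborts the allocation, while a
timeout merely ends the throttle.

	err = i915_request_wait(rq,
				I915_WAIT_LOCKED | I915_WAIT_INTERRUPTIBLE,
				1);	/* wait for at most one jiffy */
	/*
	 * err >= 0     : the request completed within the jiffy; carry on
	 * err == -ETIME: the jiffy expired; carry on (a pure ratelimit)
	 * err == -EINTR: a signal arrived; abort the allocation with -EINTR
	 */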

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 588bc5a4d18b..72bcb4ca0c45 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -733,12 +733,16 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
 		/* Ratelimit ourselves to prevent oom from malicious clients */
-		ret = i915_gem_wait_for_idle(i915,
-					     I915_WAIT_LOCKED |
-					     I915_WAIT_INTERRUPTIBLE,
-					     1);
-		if (ret)
+		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
+					 &i915->drm.struct_mutex);
+		if (rq && i915_request_wait(rq,
+					    I915_WAIT_LOCKED |
+					    I915_WAIT_INTERRUPTIBLE,
+					    1) == -EINTR) {
+			ret = -EINTR;
 			goto err_unreserve;
+		}
+		i915_retire_requests(i915);
 
 		/*
 		 * We've forced the client to stall and catch up with whatever
-- 
2.19.0


* [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion
  2018-09-12 16:40 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
  2018-09-12 16:40 ` [PATCH 2/3] drm/i915: Restrict wait to client's timeline on i915_request alloc failure Chris Wilson
@ 2018-09-12 16:40 ` Chris Wilson
  2018-09-13 11:16   ` Tvrtko Ursulin
  2018-09-12 17:40 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation Patchwork
  2018-09-13  0:41 ` ✓ Fi.CI.IGT: " Patchwork
  3 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-09-12 16:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Under mempressure, our goal is to allow ourselves sufficient time to
reclaim the RCU protected slabs without overly penalizing our clients.
Currently, we use a one-jiffy wait if the client is still active as a
means of throttling the allocations, but we can instead wait for the end
of the RCU grace period of the client's previous allocation.
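
(For reference, the grace-period "cookie" pattern this relies on, built
from two helpers in the stock kernel RCU API, looks like this in sketch
form:

	unsigned long state;

	state = get_state_synchronize_rcu();	/* snapshot the current RCU
						 * grace period */
	/* ... later, e.g. on the next allocation failure ... */
	cond_synchronize_rcu(state);		/* returns immediately if that
						 * grace period has already
						 * elapsed, otherwise blocks
						 * until it completes */

so the wait is bounded by a single grace period and, for a client that
is not hammering the allocator, is usually a no-op.)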

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 14 ++++++--------
 drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 72bcb4ca0c45..a492385b2089 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -732,17 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq = kmem_cache_alloc(i915->requests,
 			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
+		i915_retire_requests(i915);
+
 		/* Ratelimit ourselves to prevent oom from malicious clients */
 		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
 					 &i915->drm.struct_mutex);
-		if (rq && i915_request_wait(rq,
-					    I915_WAIT_LOCKED |
-					    I915_WAIT_INTERRUPTIBLE,
-					    1) == -EINTR) {
-			ret = -EINTR;
-			goto err_unreserve;
-		}
-		i915_retire_requests(i915);
+		if (rq)
+			cond_synchronize_rcu(rq->rcustate);
 
 		/*
 		 * We've forced the client to stall and catch up with whatever
@@ -762,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 		}
 	}
 
+	rq->rcustate = get_state_synchronize_rcu();
+
 	INIT_LIST_HEAD(&rq->active_list);
 	rq->i915 = i915;
 	rq->engine = engine;
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 9898301ab7ef..7fa94b024968 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -100,6 +100,14 @@ struct i915_request {
 	struct i915_timeline *timeline;
 	struct intel_signal_node signaling;
 
+	/*
+	 * The rcu epoch of when this request was allocated. Used to judiciously
+	 * apply backpressure on future allocations to ensure that under
 +	 * mempressure there are sufficient RCU ticks for us to reclaim our
+	 * RCU protected slabs.
+	 */
+	unsigned long rcustate;
+
 	/*
 	 * Fences for the various phases in the request's lifetime.
 	 *
-- 
2.19.0


* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-12 16:40 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
  2018-09-12 16:40 ` [PATCH 2/3] drm/i915: Restrict wait to client's timeline on i915_request alloc failure Chris Wilson
  2018-09-12 16:40 ` [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion Chris Wilson
@ 2018-09-12 17:40 ` Patchwork
  2018-09-13  0:41 ` ✓ Fi.CI.IGT: " Patchwork
  3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-09-12 17:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
URL   : https://patchwork.freedesktop.org/series/49572/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4813 -> Patchwork_10160 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/49572/revisions/1/mbox/

== Known issues ==

  Here are the changes found in Patchwork_10160 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_suspend@basic-s4-devices:
      fi-bdw-samus:       PASS -> INCOMPLETE (fdo#107773)

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a:
      fi-ilk-650:         PASS -> DMESG-WARN (fdo#106387)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
      fi-blb-e6850:       PASS -> INCOMPLETE (fdo#107718)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_hangcheck:
      fi-cfl-guc:         DMESG-FAIL (fdo#107710) -> PASS

    igt@gem_exec_suspend@basic-s4-devices:
      fi-kbl-7500u:       DMESG-WARN (fdo#107139, fdo#105128) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-byt-clapper:     FAIL (fdo#103167) -> PASS

    
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#105128 https://bugs.freedesktop.org/show_bug.cgi?id=105128
  fdo#106387 https://bugs.freedesktop.org/show_bug.cgi?id=106387
  fdo#107139 https://bugs.freedesktop.org/show_bug.cgi?id=107139
  fdo#107710 https://bugs.freedesktop.org/show_bug.cgi?id=107710
  fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718
  fdo#107773 https://bugs.freedesktop.org/show_bug.cgi?id=107773


== Participating hosts (51 -> 46) ==

  Missing    (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4813 -> Patchwork_10160

  CI_DRM_4813: 3c13515b12339366b414637b69227a4e3cbe21ae @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10160: 3f7aff32f54467dfbc1b6bffbe82182eaa3ebcc0 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

3f7aff32f544 drm/i915: Wait for the previous RCU grace period, not request completion
27b5d3fba477 drm/i915: Restrict wait to client's timeline on i915_request alloc failure
908461b0dad8 drm/i915: Limit the backpressure for i915_request allocation

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10160/issues.html

* ✓ Fi.CI.IGT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-12 16:40 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
                   ` (2 preceding siblings ...)
  2018-09-12 17:40 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation Patchwork
@ 2018-09-13  0:41 ` Patchwork
  3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-09-13  0:41 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
URL   : https://patchwork.freedesktop.org/series/49572/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4813_full -> Patchwork_10160_full =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_10160_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_10160_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_10160_full:

  === IGT changes ===

    ==== Warnings ====

    igt@kms_cursor_legacy@cursor-vs-flip-legacy:
      shard-snb:          PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_10160_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_cursor_legacy@cursor-vs-flip-atomic-transitions:
      shard-hsw:          PASS -> FAIL (fdo#103355)

    igt@perf@polling:
      shard-hsw:          PASS -> FAIL (fdo#102252)

    
    ==== Possible fixes ====

    igt@gem_exec_await@wide-contexts:
      shard-glk:          FAIL (fdo#106680) -> PASS

    
  fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
  fdo#103355 https://bugs.freedesktop.org/show_bug.cgi?id=103355
  fdo#106680 https://bugs.freedesktop.org/show_bug.cgi?id=106680


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4813 -> Patchwork_10160

  CI_DRM_4813: 3c13515b12339366b414637b69227a4e3cbe21ae @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10160: 3f7aff32f54467dfbc1b6bffbe82182eaa3ebcc0 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10160/shards.html

* Re: [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion
  2018-09-12 16:40 ` [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion Chris Wilson
@ 2018-09-13 11:16   ` Tvrtko Ursulin
  2018-09-13 11:18     ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-09-13 11:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


On 12/09/2018 17:40, Chris Wilson wrote:
> Under mempressure, our goal is to allow ourselves sufficient time to
> reclaim the RCU protected slabs without overly penalizing our clients.
> Currently, we use a one-jiffy wait if the client is still active as a
> means of throttling the allocations, but we can instead wait for the end
> of the RCU grace period of the client's previous allocation.

Why did you opt for three patches changing the same code and just squash 
to last?

Regards,

Tvrtko

> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 14 ++++++--------
>   drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
>   2 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 72bcb4ca0c45..a492385b2089 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -732,17 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	rq = kmem_cache_alloc(i915->requests,
>   			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
>   	if (unlikely(!rq)) {
> +		i915_retire_requests(i915);
> +
>   		/* Ratelimit ourselves to prevent oom from malicious clients */
>   		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
>   					 &i915->drm.struct_mutex);
> -		if (rq && i915_request_wait(rq,
> -					    I915_WAIT_LOCKED |
> -					    I915_WAIT_INTERRUPTIBLE,
> -					    1) == -EINTR) {
> -			ret = -EINTR;
> -			goto err_unreserve;
> -		}
> -		i915_retire_requests(i915);
> +		if (rq)
> +			cond_synchronize_rcu(rq->rcustate);
>   
>   		/*
>   		 * We've forced the client to stall and catch up with whatever
> @@ -762,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   		}
>   	}
>   
> +	rq->rcustate = get_state_synchronize_rcu();
> +
>   	INIT_LIST_HEAD(&rq->active_list);
>   	rq->i915 = i915;
>   	rq->engine = engine;
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 9898301ab7ef..7fa94b024968 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -100,6 +100,14 @@ struct i915_request {
>   	struct i915_timeline *timeline;
>   	struct intel_signal_node signaling;
>   
> +	/*
> +	 * The rcu epoch of when this request was allocated. Used to judiciously
> +	 * apply backpressure on future allocations to ensure that under
> +	 * mempressure there are sufficient RCU ticks for us to reclaim our
> +	 * RCU protected slabs.
> +	 */
> +	unsigned long rcustate;
> +
>   	/*
>   	 * Fences for the various phases in the request's lifetime.
>   	 *
> 

* Re: [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion
  2018-09-13 11:16   ` Tvrtko Ursulin
@ 2018-09-13 11:18     ` Chris Wilson
  2018-09-13 11:29       ` Tvrtko Ursulin
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-09-13 11:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Daniel Vetter

Quoting Tvrtko Ursulin (2018-09-13 12:16:42)
> 
> On 12/09/2018 17:40, Chris Wilson wrote:
> > Under mempressure, our goal is to allow ourselves sufficient time to
> > reclaim the RCU protected slabs without overly penalizing our clients.
> > Currently, we use a one-jiffy wait if the client is still active as a
> > means of throttling the allocations, but we can instead wait for the end
> > of the RCU grace period of the client's previous allocation.
> 
> Why did you opt for three patches changing the same code and just squash 
> to last?

1 introduced a timeout, 2 limited it to the single timeline, 3 changed
what we are waiting on entirely. Each of those is a big jump, and only
(1) is required to fix the bug.
-Chris

* Re: [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion
  2018-09-13 11:18     ` Chris Wilson
@ 2018-09-13 11:29       ` Tvrtko Ursulin
  2018-09-13 11:35         ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-09-13 11:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


On 13/09/2018 12:18, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-09-13 12:16:42)
>>
>> On 12/09/2018 17:40, Chris Wilson wrote:
>>> Under mempressure, our goal is to allow ourselves sufficient time to
>>> reclaim the RCU protected slabs without overly penalizing our clients.
>>> Currently, we use a one-jiffy wait if the client is still active as a
>>> means of throttling the allocations, but we can instead wait for the end
>>> of the RCU grace period of the client's previous allocation.
>>
>> Why did you opt for three patches changing the same code and just squash
>> to last?
> 
> 1 introduced a timeout, 2 limited it to the single timeline, 3 changed
> what we are waiting on entirely. Each of those are big jumps, and only
> (1) is required to fix the bug.

I completely understand that, I just question why we want to review all 
the intermediate solutions only to end up with the superior one in terms 
of logic, design and simplicity.

Because as said before, I don't really approve of patch 1 that much, 
even if it fixes a bug.

Two is already superior, but three is right to the point of what problem 
you want to solve. (Even though I haven't looked into the exact RCU API 
yet, it looks believable.)

Anyway, I'll let the other guys comment, maybe it is just me who is too 
picky.

Regards,

Tvrtko

* Re: [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion
  2018-09-13 11:29       ` Tvrtko Ursulin
@ 2018-09-13 11:35         ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-09-13 11:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Daniel Vetter

Quoting Tvrtko Ursulin (2018-09-13 12:29:46)
> 
> On 13/09/2018 12:18, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-09-13 12:16:42)
> >>
> >> On 12/09/2018 17:40, Chris Wilson wrote:
> >>> Under mempressure, our goal is to allow ourselves sufficient time to
> >>> reclaim the RCU protected slabs without overly penalizing our clients.
> >>> Currently, we use a one-jiffy wait if the client is still active as a
> >>> means of throttling the allocations, but we can instead wait for the end
> >>> of the RCU grace period of the client's previous allocation.
> >>
> >> Why did you opt for three patches changing the same code and just squash
> >> to last?
> > 
> > 1 introduced a timeout, 2 limited it to the single timeline, 3 changed
> > what we are waiting on entirely. Each of those is a big jump, and only
> > (1) is required to fix the bug.
> 
> I completely understand that, just question why we want to review all 
> the intermediate solutions only to end up with superior one in terms of 
> both logic, design and simplicity.

Depends on viewpoint.
 
> Because as said before, I don't really approve of patch 1 that much, 
> even if it fixes a bug.
> 
> Two is already superior, but three is right to the point of what problem 
> you want to solve. (Even if I haven't looked into the exact RCU API yet, 
> but it looks believable.)

2 mixes global/local without any clue as to whether local is a good
idea. I think that switch deserves an argument (because what good is
pretending to only check the local client when there's a massive global
bottleneck in the following lines).

The switch over to waiting on a single grace period is itself also
dubious, because there is even less to link that back to GPU behaviour,
and I feel that may be more crucial than the handwaving in (1) gives
credit for.

And then there are the shivers that come from having a big global
barrier in something that needs to learn to be lean and scalable.
-Chris

* [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation
@ 2018-09-14  8:00 Chris Wilson
  2018-09-14 10:09 ` Tvrtko Ursulin
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-09-14  8:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

If we try and fail to allocate an i915_request, we apply some
backpressure on the clients to throttle the memory allocations coming
from i915.ko. Currently, we wait until completely idle, but this is far
too heavy and leads to some situations where the only escape is to
declare a client hung and reset the GPU. The intent is to only ratelimit
the allocation requests and to allow ourselves to recycle requests and
memory from any long queues built up by a client hog.

Although system memory is inherently a global resource, we don't want
an unlucky client to be overly penalized by paying the price of reaping
a hog. To reduce the influence of one client on another, instead of
waiting for the entire GPU to idle we can impose a barrier on the local
client.
(One end goal for request allocation is for scalability to many
concurrent allocators; simultaneous execbufs.)

To prevent ourselves from getting caught out by long-running requests
(requests that may never finish without intervention from the very
userspace we are blocking), we need to impose a finite timeout, ideally
shorter than hangcheck. A long time ago Paul McKenney suggested that RCU
users should ratelimit themselves with judicious use of
cond_synchronize_rcu(). This
gives us the opportunity to reduce our indefinite wait for the GPU to
idle to a wait for the RCU grace period of the previous allocation along
this timeline to expire, satisfying both the local and finite properties
we desire for our ratelimiting.

There are still a few global steps (reclaim not least amongst those!)
when we exhaust the immediate slab pool, but at least now the wait is
itself decoupled from struct_mutex for our glorious highly parallel
future!
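
Putting the pieces together, the resulting flow reads roughly as below;
this is the same code as the diff that follows, with explanatory
comments added here for exposition:

	if (unlikely(!rq)) {
		/* Return any completed requests to the slab cache first. */
		i915_retire_requests(i915);

		/* Throttle against this client's own timeline only. */
		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
					 &i915->drm.struct_mutex);
		if (rq)
			/* Finite: at most one RCU grace period, and usually
			 * a no-op since the grace period of the previous
			 * allocation has typically already expired. */
			cond_synchronize_rcu(rq->rcustate);
	}
	...
	/* The cookie this request's successor will throttle against. */
	rq->rcustate = get_state_synchronize_rcu();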

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106680
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 14 ++++++++------
 drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 09ed48833b54..a492385b2089 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -732,13 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq = kmem_cache_alloc(i915->requests,
 			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
+		i915_retire_requests(i915);
+
 		/* Ratelimit ourselves to prevent oom from malicious clients */
-		ret = i915_gem_wait_for_idle(i915,
-					     I915_WAIT_LOCKED |
-					     I915_WAIT_INTERRUPTIBLE,
-					     MAX_SCHEDULE_TIMEOUT);
-		if (ret)
-			goto err_unreserve;
+		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
+					 &i915->drm.struct_mutex);
+		if (rq)
+			cond_synchronize_rcu(rq->rcustate);
 
 		/*
 		 * We've forced the client to stall and catch up with whatever
@@ -758,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 		}
 	}
 
+	rq->rcustate = get_state_synchronize_rcu();
+
 	INIT_LIST_HEAD(&rq->active_list);
 	rq->i915 = i915;
 	rq->engine = engine;
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 9898301ab7ef..7fa94b024968 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -100,6 +100,14 @@ struct i915_request {
 	struct i915_timeline *timeline;
 	struct intel_signal_node signaling;
 
+	/*
+	 * The rcu epoch of when this request was allocated. Used to judiciously
+	 * apply backpressure on future allocations to ensure that under
 +	 * mempressure there are sufficient RCU ticks for us to reclaim our
+	 * RCU protected slabs.
+	 */
+	unsigned long rcustate;
+
 	/*
 	 * Fences for the various phases in the request's lifetime.
 	 *
-- 
2.19.0


* Re: [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-14  8:00 [PATCH 1/3] " Chris Wilson
@ 2018-09-14 10:09 ` Tvrtko Ursulin
  0 siblings, 0 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14 10:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


On 14/09/2018 09:00, Chris Wilson wrote:
> If we try and fail to allocate an i915_request, we apply some
> backpressure on the clients to throttle the memory allocations coming
> from i915.ko. Currently, we wait until completely idle, but this is far
> too heavy and leads to some situations where the only escape is to
> declare a client hung and reset the GPU. The intent is to only ratelimit
> the allocation requests and to allow ourselves to recycle requests and
> memory from any long queues built up by a client hog.
> 
> Although system memory is inherently a global resource, we don't want
> an unlucky client to be overly penalized by paying the price of reaping
> a hog. To reduce the influence of one client on another, instead of
> waiting for the entire GPU to idle we can impose a barrier on the local
> client.
> (One end goal for request allocation is for scalability to many
> concurrent allocators; simultaneous execbufs.)
> 
> To prevent ourselves from getting caught out by long-running requests
> (requests that may never finish without intervention from the very
> userspace we are blocking), we need to impose a finite timeout, ideally
> shorter than hangcheck. A long time ago Paul McKenney suggested that RCU
> users should ratelimit themselves with judicious use of
> cond_synchronize_rcu(). This
> gives us the opportunity to reduce our indefinite wait for the GPU to
> idle to a wait for the RCU grace period of the previous allocation along
> this timeline to expire, satisfying both the local and finite properties
> we desire for our ratelimiting.
> 
> There are still a few global steps (reclaim not least amongst those!)
> when we exhaust the immediate slab pool, but at least now the wait is
> itself decoupled from struct_mutex for our glorious highly parallel
> future!
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106680
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 14 ++++++++------
>   drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
>   2 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 09ed48833b54..a492385b2089 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -732,13 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	rq = kmem_cache_alloc(i915->requests,
>   			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
>   	if (unlikely(!rq)) {
> +		i915_retire_requests(i915);
> +
>   		/* Ratelimit ourselves to prevent oom from malicious clients */
> -		ret = i915_gem_wait_for_idle(i915,
> -					     I915_WAIT_LOCKED |
> -					     I915_WAIT_INTERRUPTIBLE,
> -					     MAX_SCHEDULE_TIMEOUT);
> -		if (ret)
> -			goto err_unreserve;
> +		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
> +					 &i915->drm.struct_mutex);
> +		if (rq)
> +			cond_synchronize_rcu(rq->rcustate);
>   
>   		/*
>   		 * We've forced the client to stall and catch up with whatever
> @@ -758,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   		}
>   	}
>   
> +	rq->rcustate = get_state_synchronize_rcu();
> +
>   	INIT_LIST_HEAD(&rq->active_list);
>   	rq->i915 = i915;
>   	rq->engine = engine;
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 9898301ab7ef..7fa94b024968 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -100,6 +100,14 @@ struct i915_request {
>   	struct i915_timeline *timeline;
>   	struct intel_signal_node signaling;
>   
> +	/*
> +	 * The rcu epoch of when this request was allocated. Used to judiciously
> +	 * apply backpressure on future allocations to ensure that under
> +	 * mempressure there are sufficient RCU ticks for us to reclaim our
> +	 * RCU protected slabs.
> +	 */
> +	unsigned long rcustate;
> +
>   	/*
>   	 * Fences for the various phases in the request's lifetime.
>   	 *
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

end of thread, other threads:[~2018-09-14 10:09 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12 16:40 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
2018-09-12 16:40 ` [PATCH 2/3] drm/i915: Restrict wait to client's timeline on i915_request alloc failure Chris Wilson
2018-09-12 16:40 ` [PATCH 3/3] drm/i915: Wait for the previous RCU grace period, not request completion Chris Wilson
2018-09-13 11:16   ` Tvrtko Ursulin
2018-09-13 11:18     ` Chris Wilson
2018-09-13 11:29       ` Tvrtko Ursulin
2018-09-13 11:35         ` Chris Wilson
2018-09-12 17:40 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation Patchwork
2018-09-13  0:41 ` ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2018-09-14  8:00 [PATCH 1/3] " Chris Wilson
2018-09-14 10:09 ` Tvrtko Ursulin
