From: Tomas Elf <tomas.elf@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
Date: Tue, 16 Jun 2015 16:46:05 +0100
Message-ID: <558044BD.1080203@intel.com>
In-Reply-To: <20150616134445.GW23637@phenom.ffwll.local>

On 16/06/2015 14:44, Daniel Vetter wrote:
> On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
>> When submitting semaphores in execlist mode the hang checker crashes in this
>> function because it is only runnable in ring submission mode. The reason this
>> is of particular interest to the TDR patch series is because we use semaphores
>> as a means to induce hangs during testing (which is the recommended way to
>> induce hangs for gen8+). It's not clear how this is supposed to work in
>> execlist mode since:
>>
>> 1. This function requires a ring buffer.
>>
>> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
>> corresponding context, which we get from a request.
>>
>> 3. Retrieving a request from the hang checker is not straightforward since that
>> requires us to grab the struct_mutex in order to synchronize against the
>> request retirement thread.
>>
>> 4. Grabbing the struct_mutex from the hang checker is not something we will
>> do, because it puts us at risk of deadlock: a hung thread might already be
>> holding the struct_mutex.
>>
>> Therefore it's not obvious how we're supposed to deal with this. For now, we're
>> doing an early exit from this function, which avoids any kernel panic situation
>> when running our own internal TDR ULT.
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>
> We should have a Testcase: line here which mentions the igt testcase that
> provokes this bug. Or we need to fill this gap asap.
> -Daniel

You know this better than I do: Is there an IGT test that submits a 
semaphore in execlist mode? Because that's all you need to do to 
reproduce this. We could certainly add one if there is none like that 
already.
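
For anyone following along, the guard in the patch can be sketched as a
standalone userspace model. This is only an illustration under stated
assumptions: the model_* names are hypothetical stand-ins and not the real
i915 types, and the return value is simplified to the ring buffer that would
be scanned rather than the signalling engine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a legacy ring buffer; not the real i915 type. */
struct model_ringbuffer {
	uint32_t *vaddr;
	size_t size_dwords;
};

/* Hypothetical stand-in for an engine; buffer is NULL in execlist mode. */
struct model_engine {
	struct model_ringbuffer *buffer;
};

/* Stand-in for the i915.enable_execlists module parameter. */
static bool model_enable_execlists;

/*
 * Simplified model of the early exit: return the ring buffer that the
 * hang checker could scan, or NULL if it must not look any further.
 */
static struct model_ringbuffer *
model_semaphore_waits_for(struct model_engine *engine)
{
	assert(engine != NULL);

	/*
	 * In execlist mode engine->buffer is not set up, so bail out
	 * before anything dereferences it.
	 */
	if (model_enable_execlists)
		return NULL;

	/* Legacy ring submission: only proceed with a mapped buffer. */
	if (engine->buffer == NULL || engine->buffer->vaddr == NULL)
		return NULL;

	return engine->buffer;
}
```

With the flag set, the function returns NULL without touching the buffer
pointer at all, which is the whole point of the early exit.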

Thanks,
Tomas

>
>> ---
>>   drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index 46bcbff..40c44fc 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>>   	u64 offset = 0;
>>   	int i, backwards;
>>
>> +	/*
>> +	 * This function does not support execlist mode - any attempt to
>> +	 * proceed further into this function will result in a kernel panic
>> +	 * when dereferencing ring->buffer, which is not set up in execlist
>> +	 * mode.
>> +	 *
>> +	 * The correct way of doing it would be to derive the currently
>> +	 * executing ring buffer from the current context, which is derived
>> +	 * from the currently running request. Unfortunately, to get the
>> +	 * current request we would have to grab the struct_mutex before doing
>> +	 * anything else, which would be ill-advised since some other thread
>> +	 * might have grabbed it already and managed to hang itself, causing
>> +	 * the hang checker to deadlock.
>> +	 *
>> +	 * Therefore, this function does not support execlist mode in its
>> +	 * current form. Just return NULL and move on.
>> +	 */
>> +	if (i915.enable_execlists)
>> +		return NULL;
>> +
>>   	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
>>   	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
>>   		return NULL;
>> --
>> 1.7.9.5
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

