All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/1] Fix Navi cards crashing with FP exception on 5.6+
@ 2020-04-29 15:02 Daniel Kolesa
  2020-04-29 15:02 ` [PATCH 1/1] drm/amd/display: work around fp code being emitted outside of DC_FP_START/END Daniel Kolesa
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Kolesa @ 2020-04-29 15:02 UTC (permalink / raw
  To: amd-gfx; +Cc: Daniel Kolesa

This is not a real fix, but rather a workaround that should be applied to the
5.6 tree and newer. The idea here is to work around the compiler emitting FP
instructions outside the DC_FP_START/DC_FP_END blocks, which it currently does,
which causes crashes at least on ppc64le.

A proper solution would be to move all the floating point code into its own
files, then compile those with hard-float and strictly wrap all calls into
those with the DC_FP_START/DC_FP_END blocks, with the rest of the code being
compiled without floating point, but that's too extensive of a change to do,
and would not be possible in a stable tree.

The proper solution is already being discussed, it seems:

https://lore.kernel.org/lkml/CAG48ez2Sx4ELkM94aD_h_J7K7KBOeuGmvZLKRkg3n_f2WoZ_cg@mail.gmail.com/

Daniel Kolesa (1):
  drm/amd/display: work around fp code being emitted outside of
    DC_FP_START/END

 .../drm/amd/display/dc/dcn20/dcn20_resource.c | 31 ++++++++++++++-----
 1 file changed, 23 insertions(+), 8 deletions(-)

-- 
2.26.2

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/1] drm/amd/display: work around fp code being emitted outside of DC_FP_START/END
  2020-04-29 15:02 [PATCH 0/1] Fix Navi cards crashing with FP exception on 5.6+ Daniel Kolesa
@ 2020-04-29 15:02 ` Daniel Kolesa
  2020-04-30 14:58   ` Alex Deucher
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Kolesa @ 2020-04-29 15:02 UTC (permalink / raw
  To: amd-gfx; +Cc: Daniel Kolesa

The dcn20_validate_bandwidth function would have code touching the
incorrect registers emitted outside of the boundaries of the
DC_FP_START/END macros, at least on ppc64le. Work around the
problem by wrapping the whole function instead.

Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
---
 .../drm/amd/display/dc/dcn20/dcn20_resource.c | 31 ++++++++++++++-----
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index e310d67..1b0bca9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -3034,25 +3034,32 @@ static bool dcn20_validate_bandwidth_internal(struct dc *dc, struct dc_state *co
 	return out;
 }
 
-
-bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
-		bool fast_validate)
+/*
+ * This must be noinline to ensure anything that deals with FP registers
+ * is contained within this call; previously our compiling with hard-float
+ * would result in fp instructions being emitted outside of the boundaries
+ * of the DC_FP_START/END macros, which makes sense as the compiler has no
+ * idea about what is wrapped and what is not
+ *
+ * This is largely just a workaround to avoid breakage introduced with 5.6,
+ * ideally all fp-using code should be moved into its own file, only that
+ * should be compiled with hard-float, and all code exported from there
+ * should be strictly wrapped with DC_FP_START/END
+ */
+static noinline bool dcn20_validate_bandwidth_fp(struct dc *dc,
+		struct dc_state *context, bool fast_validate)
 {
 	bool voltage_supported = false;
 	bool full_pstate_supported = false;
 	bool dummy_pstate_supported = false;
 	double p_state_latency_us;
 
-	DC_FP_START();
 	p_state_latency_us = context->bw_ctx.dml.soc.dram_clock_change_latency_us;
 	context->bw_ctx.dml.soc.disable_dram_clock_change_vactive_support =
 		dc->debug.disable_dram_clock_change_vactive_support;
 
 	if (fast_validate) {
-		voltage_supported = dcn20_validate_bandwidth_internal(dc, context, true);
-
-		DC_FP_END();
-		return voltage_supported;
+		return dcn20_validate_bandwidth_internal(dc, context, true);
 	}
 
 	// Best case, we support full UCLK switch latency
@@ -3081,7 +3088,15 @@ bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
 
 restore_dml_state:
 	context->bw_ctx.dml.soc.dram_clock_change_latency_us = p_state_latency_us;
+	return voltage_supported;
+}
 
+bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
+		bool fast_validate)
+{
+	bool voltage_supported = false;
+	DC_FP_START();
+	voltage_supported = dcn20_validate_bandwidth_fp(dc, context, fast_validate);
 	DC_FP_END();
 	return voltage_supported;
 }
-- 
2.26.2

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] drm/amd/display: work around fp code being emitted outside of DC_FP_START/END
  2020-04-29 15:02 ` [PATCH 1/1] drm/amd/display: work around fp code being emitted outside of DC_FP_START/END Daniel Kolesa
@ 2020-04-30 14:58   ` Alex Deucher
  0 siblings, 0 replies; 3+ messages in thread
From: Alex Deucher @ 2020-04-30 14:58 UTC (permalink / raw
  To: Daniel Kolesa; +Cc: amd-gfx list

On Wed, Apr 29, 2020 at 11:08 AM Daniel Kolesa <daniel@octaforge.org> wrote:
>
> The dcn20_validate_bandwidth function would have code touching the
> incorrect registers emitted outside of the boundaries of the
> DC_FP_START/END macros, at least on ppc64le. Work around the
> problem by wrapping the whole function instead.
>
> Signed-off-by: Daniel Kolesa <daniel@octaforge.org>

Applied.  Thanks!

Alex

> ---
>  .../drm/amd/display/dc/dcn20/dcn20_resource.c | 31 ++++++++++++++-----
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index e310d67..1b0bca9 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -3034,25 +3034,32 @@ static bool dcn20_validate_bandwidth_internal(struct dc *dc, struct dc_state *co
>         return out;
>  }
>
> -
> -bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
> -               bool fast_validate)
> +/*
> + * This must be noinline to ensure anything that deals with FP registers
> + * is contained within this call; previously our compiling with hard-float
> + * would result in fp instructions being emitted outside of the boundaries
> + * of the DC_FP_START/END macros, which makes sense as the compiler has no
> + * idea about what is wrapped and what is not
> + *
> + * This is largely just a workaround to avoid breakage introduced with 5.6,
> + * ideally all fp-using code should be moved into its own file, only that
> + * should be compiled with hard-float, and all code exported from there
> + * should be strictly wrapped with DC_FP_START/END
> + */
> +static noinline bool dcn20_validate_bandwidth_fp(struct dc *dc,
> +               struct dc_state *context, bool fast_validate)
>  {
>         bool voltage_supported = false;
>         bool full_pstate_supported = false;
>         bool dummy_pstate_supported = false;
>         double p_state_latency_us;
>
> -       DC_FP_START();
>         p_state_latency_us = context->bw_ctx.dml.soc.dram_clock_change_latency_us;
>         context->bw_ctx.dml.soc.disable_dram_clock_change_vactive_support =
>                 dc->debug.disable_dram_clock_change_vactive_support;
>
>         if (fast_validate) {
> -               voltage_supported = dcn20_validate_bandwidth_internal(dc, context, true);
> -
> -               DC_FP_END();
> -               return voltage_supported;
> +               return dcn20_validate_bandwidth_internal(dc, context, true);
>         }
>
>         // Best case, we support full UCLK switch latency
> @@ -3081,7 +3088,15 @@ bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
>
>  restore_dml_state:
>         context->bw_ctx.dml.soc.dram_clock_change_latency_us = p_state_latency_us;
> +       return voltage_supported;
> +}
>
> +bool dcn20_validate_bandwidth(struct dc *dc, struct dc_state *context,
> +               bool fast_validate)
> +{
> +       bool voltage_supported = false;
> +       DC_FP_START();
> +       voltage_supported = dcn20_validate_bandwidth_fp(dc, context, fast_validate);
>         DC_FP_END();
>         return voltage_supported;
>  }
> --
> 2.26.2
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-04-30 14:58 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-29 15:02 [PATCH 0/1] Fix Navi cards crashing with FP exception on 5.6+ Daniel Kolesa
2020-04-29 15:02 ` [PATCH 1/1] drm/amd/display: work around fp code being emitted outside of DC_FP_START/END Daniel Kolesa
2020-04-30 14:58   ` Alex Deucher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.