From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751645AbbGMIuK (ORCPT ); Mon, 13 Jul 2015 04:50:10 -0400
Received: from mail-wi0-f176.google.com ([209.85.212.176]:34141 "EHLO
        mail-wi0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751493AbbGMIuI convert rfc822-to-8bit (ORCPT );
        Mon, 13 Jul 2015 04:50:08 -0400
MIME-Version: 1.0
In-Reply-To: <55A36FA7.7010707@linux.intel.com>
References: <20150713062222.GG3736@phenom.ffwll.local> <55A3678B.6080803@linux.intel.com> <55A36FA7.7010707@linux.intel.com>
Date: Mon, 13 Jul 2015 10:50:06 +0200
Message-ID:
Subject: Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
From: Jörg Otte
To: Maarten Lankhorst
Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2015-07-13 9:58 GMT+02:00 Maarten Lankhorst:
> On 13-07-15 at 09:42, Jörg Otte wrote:
>> 2015-07-13 9:23 GMT+02:00 Maarten Lankhorst:
>>> On 13-07-15 at 08:22, Daniel Vetter wrote:
>>>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte wrote:
>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>>>>> IP: [] 0xffffffffbd3447bb
>>>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>>>
>>>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>>>> so it's pretty clearly just that change from
>>>>>
>>>>> -       if (intel_crtc->active) {
>>>>> +       if (crtc_state->base.active) {
>>>>>
>>>>> and "crtc_state" is NULL.
>>>>>
>>>>> And the code very much knows that crtc_state can be NULL, since it's
>>>>> initialized with
>>>>>
>>>>>         crtc_state = state->base.state ?
>>>>>                 intel_atomic_get_crtc_state(state->base.state,
>>>>>                                             intel_crtc) : NULL;
>>>>>
>>>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>>>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>>>> using the transitional helpers, and those don't fill out
>>>> plane_state->state backpointers to the global atomic update since there is
>>>> no global atomic update for transitional helpers. Below diff should fix
>>>> this - we need to preferentially check crtc_state->active and if that's
>>>> not set intel_crtc->active should yield the right result for the one
>>>> remaining caller (it's in the crtc_disable paths).
>>>>
>>>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>>>> to transition to a different QA team ("we'll do this in 1 week without
>>>> upfront planning") I essentially don't have proper QA support for 1-2
>>>> months by now. The other trouble in this area specifically is that this
>>>> code is already completely changed in -next again, so any testing done on
>>>> integration trees (like -next or drm-intel-nightly) won't test any patches
>>>> for 4.2.
>>>> -Daniel
>>>>
>>>> Oh and Signed-off-by: Daniel Vetter in case you
>>>> decide to apply this right away.
>>>>
>>> Well your version has the benefit of compiling without errors.
>>> :-)
>>>
>>> Reviewed-by: Maarten Lankhorst
>> Just noticed another problem:
>> On each resume I get the following error:
>> -----------[ cut here ]------------
>> WARNING: CPU: 2 PID: 2663 at
>> /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
>> 0xffffffff9a33d5e9()
>> WARN_ON(!crtc->state->enable)
>> CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
>> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
>> Workqueue: events_unbound 0xffffffff9a055750
>> 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000
>> ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000
>> ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664
>> Call Trace:
>> [] ? 0xffffffff9a6d84d2
>> [] ? 0xffffffff9a03c416
>> [] ? 0xffffffff9a03c4d5
>> [] ? 0xffffffff9a33d5e9
>> [] ? 0xffffffff9a343ac3
>> [] ? 0xffffffff9a34444a
>> [] ? 0xffffffff9a345518
>> [] ? 0xffffffff9a3246f0
>> [] ? 0xffffffff9a2e1ce8
>> [] ? 0xffffffff9a236170
>> [] ? 0xffffffff9a38b28d
>> [] ? 0xffffffff9a38b784
>> [] ? 0xffffffff9a38baa4
>> [] ? 0xffffffff9a05577d
>> [] ? 0xffffffff9a04dc47
>> [] ? 0xffffffff9a04dfab
>> [] ? 0xffffffff9a04dea0
>> [] ? 0xffffffff9a05331c
>> [] ? 0xffffffff9a053260
>> [] ? 0xffffffff9a6dfa0f
>> [] ? 0xffffffff9a053260
>> --[ end trace 1b6d28ee34071679 ]---
>>
>> Nevertheless resume works, so it doesn't hurt me.
>>
>>
>> BTW: I also get up to 40..50! compile warnings like:
>> i915/i915_drv.h: In function 'i915_debugfs_connector_add':
>> i915/i915_drv.h:3119:53: warning: no return statement in function
>> returning non-void [-Wreturn-type]
>>
>> which may cause yet uncovered troubles.
>>
>> Thanks, Jörg
> kallsyms please!
>
> Looks like intel_crtc_disable being called with a mode change on an already disabled crtc, it's gone in 4.3 because of the atomic rework.
>
> Does something like below work?
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index ba9321998a41..725d2b727704 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
>          struct drm_connector *connector;
>          struct drm_i915_private *dev_priv = dev->dev_private;
>
> -        /* crtc should still be enabled when we disable it. */
> -        WARN_ON(!crtc->state->enable);
> -
>          intel_crtc_disable_planes(crtc);
>          dev_priv->display.crtc_disable(crtc);
>          dev_priv->display.off(crtc);
> @@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
>                          continue;
>
>                  if (!crtc_state->enable) {
> -                        intel_crtc_disable(crtc);
> +                        if (crtc->state->enable)
> +                                intel_crtc_disable(crtc);
>                  } else if (crtc->state->enable) {
>                          intel_crtc_disable_planes(crtc);
>                          dev_priv->display.crtc_disable(crtc);
>
The patch works for me.

Thanks, Jörg
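
For anyone following the thread, here is a minimal standalone sketch of the two guards discussed above. It is not the i915 code: the mock_* structs and helpers are hypothetical stand-ins invented for illustration, and the struct layout only mimics the idea of a pointer followed by two bools, which on a typical 64-bit build puts "active" at offset 9, matching the faulting address in the oops. The first helper shows the fallback Daniel describes (use crtc_state when there is one, otherwise the legacy flag); the second mirrors the condition in Maarten's patch (only disable a crtc that is currently enabled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the crtc state. */
struct mock_crtc_state {
        void *crtc;     /* back-pointer; 8 bytes on a 64-bit build */
        bool enable;
        bool active;    /* one byte past the pointer-sized field */
};

/* Hypothetical stand-in for the legacy per-crtc bookkeeping. */
struct mock_crtc {
        bool active;    /* plays the role of intel_crtc->active */
};

/*
 * Prefer the atomic state when the caller has one. Callers without a
 * global atomic update pass NULL, so fall back to the legacy flag
 * instead of dereferencing the NULL pointer.
 */
static bool mock_crtc_is_active(const struct mock_crtc *crtc,
                                const struct mock_crtc_state *crtc_state)
{
        return crtc_state ? crtc_state->active : crtc->active;
}

/*
 * Only ask for a disable when the new state is disabled and the
 * current state is still enabled; an already disabled crtc is skipped.
 */
static bool mock_needs_disable(const struct mock_crtc_state *cur,
                               const struct mock_crtc_state *next)
{
        assert(cur && next);
        return !next->enable && cur->enable;
}
```

Calling mock_crtc_is_active() with a NULL state returns the legacy flag rather than faulting, and mock_needs_disable() returns false when the current state is already disabled, which is the case that triggered the WARN_ON on resume.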