Date: Mon, 15 Apr 2024 20:36:16 +0200
From: Borislav Petkov
To: Serge Semin
Cc: Michal Simek, Alexander Stein, Tony Luck, James Morse,
    Mauro Carvalho Chehab, Robert Richter, Dinh Nguyen,
    Punnaiah Choudary Kalluri, Arnd Bergmann, Greg Kroah-Hartman,
    linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org,
    linux-kernel@vger.kernel.org, Sherry Sun, Borislav Petkov
Subject: Re: [PATCH v5 01/20] EDAC/synopsys: Fix ECC status data and IRQ disable race condition
Message-ID: <20240415183616.GDZh1zoFsBzvAEduRo@fat_crate.local>
References: <20240222181324.28242-1-fancer.lancer@gmail.com>
    <20240222181324.28242-2-fancer.lancer@gmail.com>
In-Reply-To: <20240222181324.28242-2-fancer.lancer@gmail.com>

On Thu, Feb 22, 2024 at 09:12:46PM +0300, Serge Semin wrote:
> The race condition around the ECCCLR register access happens in the IRQ
> disable method called in the device remove() procedure and in the ECC IRQ
> handler:
> 1. Enable IRQ:
>    a. ECCCLR = EN_CE | EN_UE
> 2. Disable IRQ:
>    a. ECCCLR = 0
> 3. IRQ handler:
>    a. ECCCLR = CLR_CE | CLR_CE_CNT | CLR_UE | CLR_UE_CNT
>    b. ECCCLR = 0
>    c. ECCCLR = EN_CE | EN_UE
> So if the IRQ disabling procedure is called concurrently with the IRQ
> handler method, the IRQ might actually be left enabled due to
> statement 3c.
>
> The root cause of the problem is that the ECCCLR register (which since
> v3.10a has been called ECCCTL) intermixes the ECC status data clear flags
> and the IRQ enable/disable flags. Thus the IRQ disabling (clear EN flags)
> and handling (write 1 to clear ECC status data) procedures must be
> serialised around the ECCCTL register modification to prevent the race.
>
> So fix the problem described above by adding a spin-lock around the
> ECCCLR modifications and preventing the IRQ handler from modifying the
> IRQ enable flags (there is no point in disabling the IRQ and then
> re-enabling it again within a single IRQ handler call, see statements
> 3a/3b and 3c above).

So I'm looking at the code and am looking at this and wondering how we
even ended up in this mess?!

An interrupt handler should not *enable* the interrupt again - that's
just crazy. And I should've seen that in

  4bcffe941758 ("EDAC/synopsys: Re-enable the error interrupts on v3 hw")

and stopped it right there. But well, it is what it is...

So I'd like to see the following flow:

* on init, the interrupt is enabled with enable_intr() *after*
  registering the interrupt handler.

* on exit, the interrupt is disabled with disable_intr() and then no
  interrupts are coming in anymore.

And then I don't think you'll need the spinlock and it'll be sane
design.

Right?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
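
The interleaving described in the quoted commit message (handler runs 3a/3b, remove() runs 2a, handler then runs 3c) can be replayed deterministically in user space. What follows is only a rough sketch under loud assumptions: the bit positions are invented for illustration and are not the real ECCCLR/ECCCTL layout, the register is modelled as a plain variable rather than MMIO, and the helper names handler_clear_status(), handler_reenable(), replay_race() and replay_in_order() are hypothetical. It is not the driver code.

```c
#include <stdint.h>

/* Hypothetical bit layout, chosen only for this model. */
#define CLR_CE      (1u << 0)
#define CLR_UE      (1u << 1)
#define CLR_CE_CNT  (1u << 2)
#define CLR_UE_CNT  (1u << 3)
#define EN_CE       (1u << 8)
#define EN_UE       (1u << 9)
#define EN_MASK     (EN_CE | EN_UE)

/* Stand-in for the ECCCLR register: a plain variable, not MMIO. */
static uint32_t ecc_clr;

static void enable_intr(void)  { ecc_clr = EN_CE | EN_UE; }  /* step 1a */
static void disable_intr(void) { ecc_clr = 0; }              /* step 2a */

/* First half of the IRQ handler: steps 3a and 3b. */
static void handler_clear_status(void)
{
	ecc_clr = CLR_CE | CLR_CE_CNT | CLR_UE | CLR_UE_CNT;  /* 3a */
	ecc_clr = 0;                                          /* 3b */
}

/* Second half of the IRQ handler: step 3c. */
static void handler_reenable(void)
{
	ecc_clr = EN_CE | EN_UE;                              /* 3c */
}

/* Racy order: remove() slips in between 3b and 3c. */
static int replay_race(void)
{
	enable_intr();
	handler_clear_status();  /* handler: 3a, 3b */
	disable_intr();          /* remove(): 2a */
	handler_reenable();      /* handler: 3c */
	return (ecc_clr & EN_MASK) != 0;  /* nonzero: IRQ left enabled */
}

/* Serialised order: the handler finishes before remove() runs. */
static int replay_in_order(void)
{
	enable_intr();
	handler_clear_status();
	handler_reenable();
	disable_intr();
	return (ecc_clr & EN_MASK) != 0;
}
```

In this model replay_race() reports the enable bits still set after the disable step, while replay_in_order() reports them cleared. The model says nothing about which fix is preferable; it only makes the ordering problem from statement 3c concrete.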