From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB885C433B4 for ; Tue, 13 Apr 2021 11:04:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BB15361249 for ; Tue, 13 Apr 2021 11:04:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345345AbhDMLFM (ORCPT ); Tue, 13 Apr 2021 07:05:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42224 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345320AbhDMLFJ (ORCPT ); Tue, 13 Apr 2021 07:05:09 -0400 Received: from mail-vk1-xa30.google.com (mail-vk1-xa30.google.com [IPv6:2607:f8b0:4864:20::a30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB363C061574 for ; Tue, 13 Apr 2021 04:04:49 -0700 (PDT) Received: by mail-vk1-xa30.google.com with SMTP id i2so3513189vka.13 for ; Tue, 13 Apr 2021 04:04:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=je0tXK2szor3C65YXdb81D9aE+MAP/3wuEdJm5mxyfw=; b=Mj7Ac0nkjEdSctz+w4445kA2ADO6XOmj12OFa73aesRvYDiBgn+UpPl8quADDUYvS2 xCraii6FMAvWuzrtW3Zja1qRh88iw06YjNgbrh0WSvy9EOKsGlWHIwZEq/Oaylx181vm ioZhfinaESXQZrcg94bpuFY+aj0RsEQwmVDHIzNgk21CAiEa5erqw6CCYqlgNPL6zdYF i2ruCYBzuUepCkhFbxq3Ms4ptjPP6Wihg3ByiWwb06T2h7jVwSnTFsiIkBBN6methXP4 +feO0ty6+q5s6imk/y9xuEKAQHtZ+FUM89habadKZmeVL2a/yqaXocEhL3i0jjqtXAIt BJNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=je0tXK2szor3C65YXdb81D9aE+MAP/3wuEdJm5mxyfw=; b=Eg+i8bBBpEPC2gzhS9XRAJvCo4Bgifo3bsA4An2BjDi7dkzuoYNI8e/IZbp86B0aYM Z+uOgvAFmgDLnl7dG8zYq/tgZWYPgjq+dxUsnkd96YdfWg9ESm1JuFMk7BtMBXKhMMSG ueZGVebY5kmn2MAz0ocy93HEvl7U7LC2GbMGV/20dSx90hPNvASJa0JBjLt/EcmPXLvQ OvVPqQHO4mHXV4dU5EKbKGk56IpqOtTqGtmN9Q4SkGOJQlVeSUqRxy5VToigGvFTn1AG ynKZRa3+kJUigRrO3odyvOZdHMk7n9VYNGB8TmUNoc2kzAn0Y05APeZRjT75dWaKZ7A2 iYcA== X-Gm-Message-State: AOAM531GeWwPgR8ozqDGzCPbDhTfrmUct2wYV3kk57leKBxP5m8T0qwQ BZLWv55J6RHa0f/7yrt1aZk0eku87FUkrm8I8ek= X-Google-Smtp-Source: ABdhPJw2T4zkH5O3s8wSbpVcYKdfS2ik6a773rVb2Mt8OtX9ebRpMLzavazIbmQCo0FNwIjv/JRFYGcacnplRsVrM3w= X-Received: by 2002:a1f:1184:: with SMTP id 126mr13452119vkr.21.1618311889073; Tue, 13 Apr 2021 04:04:49 -0700 (PDT) MIME-Version: 1.0 References: <20210413104503.GD15806@arm.com> In-Reply-To: <20210413104503.GD15806@arm.com> From: =?UTF-8?Q?Christoph_M=C3=BCllner?= Date: Tue, 13 Apr 2021 13:04:37 +0200 Message-ID: Subject: Re: [PATCH] riscv: locks: introduce ticket-based spinlock implementation To: Catalin Marinas Cc: Peter Zijlstra , Palmer Dabbelt , Anup Patel , Guo Ren , linux-riscv , Linux Kernel Mailing List , Guo Ren , will.deacon@arm.com, Arnd Bergmann Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 13, 2021 at 12:45 PM Catalin Marinas wrote: > > On Tue, Apr 13, 2021 at 12:25:00PM +0200, Christoph M=C3=BCllner wrote: > > On Tue, Apr 13, 2021 at 11:37 AM Peter Zijlstra = wrote: > > > On Tue, Apr 13, 2021 at 11:22:40AM +0200, Christoph M=C3=BCllner wrot= e: > > > > What about trylock()? > > > > I.e. one could implement trylock() without a loop, by letting > > > > trylock() fail if the SC fails. > > > > That looks safe on first view, but nobody does this right now. > > > > > > Generic code has to use cmpxchg(), and then you get something like th= is: > > > > > > bool trylock(atomic_t *lock) > > > { > > > u32 old =3D atomic_read(lock); > > > > > > if ((old >> 16) !=3D (old & 0xffff)) > > > return false; > > > > > > return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, = for RCsc */ > > > } > > > > This approach requires two loads (atomic_read() and cmpxchg()), which > > is not required. > > Detecting this pattern and optimizing it in a compiler is quite unlikel= y. > > > > A bit less generic solution would be to wrap the LL/SC (would be > > mandatory in this case) > > instructions and do something like this: > > > > uint32_t __smp_load_acquire_reserved(void*); > > int __smp_store_release_conditional(void*, uint32_t); > > > > typedef union { > > uint32_t v32; > > struct { > > uint16_t owner; > > uint16_t next; > > }; > > } arch_spinlock_t; > > > > int trylock(arch_spinlock_t *lock) > > { > > arch_spinlock_t l; > > int success; > > do { > > l.v32 =3D __smp_load_acquire_reserved(lock); > > if (l.owner !=3D l.next) > > return 0; > > l.next++; > > success =3D __smp_store_release_conditional(lock, l.v32); > > } while (!success); > > return success; > > } > > > > But here we can't tell the compiler to optimize the code between LL and= SC... > > This indeed needs some care. IIUC RISC-V has similar restrictions as arm > here, no load/store instructions are allowed between LR and SC. You > can't guarantee that the compiler won't spill some variable onto the > stack. RISC-V behaves similar, but the specification is a bit more precise: To guarantee forward progress, the ("constrained") LL/SC sequence has to consist of <=3D16 instructions. Further, the "dynamic code executed between the LR and SC instructions can only contain instructions from the base =E2= =80=9CI=E2=80=9D instruction set, excluding loads, stores, backward jumps, taken backward branches, JALR, FENCE, and SYSTEM instructions". And GCC generates a backward jump in-between LL and SC. So we have more than enough reasons, to no do it this way. > > BTW, I think the SC doesn't need release semantics above, only the LR > needs acquire semantics. > > -- > Catalin