From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50106C48BDF for ; Fri, 18 Jun 2021 08:32:40 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 13ECC6121D for ; Fri, 18 Jun 2021 08:32:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 13ECC6121D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=ACULAB.COM Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:In-Reply-To:References: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=h8+h9Abm8zHB8zcToTmC7o1vSEJSf1EpxQkUJVPG+FM=; b=2wvhPVx0Ki5Oa8 hWkD2wB7tz50GuEwWuWjzdK2OV9BRQSsfS+vsIRwXnO+nU5dZT2+2o+6SkxWp2nFYfVmtPBG8jOL4 OlsAoP/eVNMN2+jIriaWGEaww3Q0Z03e3/jvWvfpOP2LI2BSNY9tzFeu2E55jzreu+Emb0wxl7SyC KOR3HcrbPKz/pt5ZiLDm/R7O05enMM/lEvR1qLn1ZBRMnwUapttS0aYPVijw1HFBwNkU9vrDGtk4R na1WgGUMLDDcrUnrycc555+e3wP+G1F+iukRFZCdoMd0FfNVwMsAQTP2X9Z3yWr6YAuhVg+4vX7z4 gOLHqIomFA74yKuOQIcQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1lu9vK-00DJcO-Dl; Fri, 18 Jun 2021 08:32:22 +0000 Received: from eu-smtp-delivery-151.mimecast.com ([185.58.86.151]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1lu9vH-00DJRS-1x for linux-riscv@lists.infradead.org; Fri, 18 Jun 2021 08:32:20 +0000 Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121]) (Using TLS) by relay.mimecast.com with ESMTP id uk-mta-11-fKdMvnO2P96mfGP2YKpuQQ-1; Fri, 18 Jun 2021 09:32:15 +0100 X-MC-Unique: fKdMvnO2P96mfGP2YKpuQQ-1 Received: from AcuMS.Aculab.com (10.202.163.4) by 
AcuMS.aculab.com (10.202.163.4) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Fri, 18 Jun 2021 09:32:14 +0100 Received: from AcuMS.Aculab.com ([fe80::994c:f5c2:35d6:9b65]) by AcuMS.aculab.com ([fe80::994c:f5c2:35d6:9b65%12]) with mapi id 15.00.1497.018; Fri, 18 Jun 2021 09:32:14 +0100 From: David Laight To: 'Matteo Croce' CC: Guo Ren , linux-riscv , Linux Kernel Mailing List , linux-arch , Paul Walmsley , Palmer Dabbelt , Albert Ou , Atish Patra , Emil Renner Berthing , "Akira Tsukamoto" , Drew Fustini , Bin Meng Subject: RE: [PATCH 1/3] riscv: optimized memcpy Thread-Topic: [PATCH 1/3] riscv: optimized memcpy Thread-Index: AQHXYuC4XkdMIImxVUmoQbZ37iIZIqsYtSAggABBV22AAHkiQA== Date: Fri, 18 Jun 2021 08:32:14 +0000 Message-ID: <0fe90e43868f49b5953afe5abba41327@AcuMS.aculab.com> References: <20210615023812.50885-1-mcroce@linux.microsoft.com> <20210615023812.50885-2-mcroce@linux.microsoft.com> In-Reply-To: Accept-Language: en-GB, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.202.205.107] MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=C51A453 smtp.mailfrom=david.laight@aculab.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: aculab.com Content-Language: en-US X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210618_013219_419170_92B093B0 X-CRM114-Status: GOOD ( 17.78 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Matteo Croce > Sent: 18 June 2021 02:05 ... > > > It's running at 1 GHz. 
> > >
> > > I get 257 Mb/s with a memcpy, a bit more with a memset,
> > > but I get 1200 Mb/s with a cycle which just reads memory with 64 bit addressing.
> > >
> >
> > Err, I forget a mlock() before accessing the memory in userspace.

What is the mlock() for?
The data for a quick loop won't get paged out.
You want to test cache to cache copies, so the first loop
will always be slow.
After that each iteration should be much the same.
I use code like:
	for (;;) {
		start = read_tsc();
		do_test();
		histogram[(read_tsc() - start) >> n]++;
	}
(You need to exclude outliers)
to get a distribution for the execution times.
Tends to be pretty stable - even though different program
runs can give different values!

> > The real speed here is:
> >
> > 8 bit read: 155.42 Mb/s
> > 64 bit read: 277.29 Mb/s
> > 8 bit write: 138.57 Mb/s
> > 64 bit write: 239.21 Mb/s
> >
>
> Anyway, thanks for the info on nio2 timings.
> If you think that an unrolled loop would help, we can achieve the same in C.
> I think we could code something similar to a Duff device (or with jump
> labels) to unroll the loop but at the same time doing efficient small copies.

Unrolling has to be done with care.
It tends to improve benchmarks, but the extra code displaces
other code from the i-cache and slows down overall performance.
So you need 'just enough' unrolling to avoid cpu stalls.

On your system it looks like the memory/cache subsystem
is the bottleneck for the tests you are doing.
I'd really expect a 1GHz cpu to be able to read/write from
its data cache every clock.
So I'd expect transfer rates nearer 8000 MB/s, not 250 MB/s.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
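
The timing loop in the message above can be fleshed out roughly as
follows. This is only a sketch under stated assumptions, not the code
the author actually runs: it assumes a POSIX userspace system and uses
clock_gettime(CLOCK_MONOTONIC) as a stand-in for read_tsc(). The names
read_ns(), hist_bucket(), run_histogram() and HIST_BUCKETS, and the
4 KiB memcpy() used as do_test(), are illustrative and do not come from
the original mail. Clamping slow samples into the last bucket is one
possible way to keep outliers from distorting the rest of the
distribution.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define HIST_BUCKETS 64		/* hypothetical histogram size */

/* Buffers have external linkage so the copy cannot be optimised away. */
unsigned char test_src[4096], test_dst[4096];

/* Assumed stand-in for read_tsc(): monotonic time in nanoseconds. */
static uint64_t read_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Scale an elapsed time to a bucket; outliers land in the last bucket. */
static unsigned int hist_bucket(uint64_t elapsed, unsigned int shift)
{
	uint64_t idx = elapsed >> shift;

	return idx < HIST_BUCKETS - 1 ? (unsigned int)idx : HIST_BUCKETS - 1;
}

/* The operation under test; here a small copy that fits in the data cache. */
static void do_test(void)
{
	memcpy(test_dst, test_src, sizeof(test_dst));
}

/* Time do_test() repeatedly and count how often each duration occurs. */
static void run_histogram(unsigned long iterations, unsigned int shift,
			  unsigned long histogram[HIST_BUCKETS])
{
	unsigned long i;

	assert(shift < 64);	/* a wider shift would be undefined behaviour */

	for (i = 0; i < iterations; i++) {
		uint64_t start = read_ns();

		do_test();
		histogram[hist_bucket(read_ns() - start, shift)]++;
	}
}
```

Calling run_histogram() with a zeroed array and then printing the
non-empty buckets gives the distribution of execution times. The first
iteration warms the cache, so it would normally show up as one of the
slow samples.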