* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
@ 2018-03-21 11:18 Maxime Ripard
  2018-03-29 13:40 ` Mylène Josserand
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Maxime Ripard @ 2018-03-21 11:18 UTC (permalink / raw
  To: u-boot

From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>

Throughput tests have shown the sunxi_mmc driver to take over 10s to
read 10MB from a fast eMMC device due to excessive delays in polling
loops.

This commit restructures the main polling loops to use get_timer(...)
to determine whether a (millisecond) timeout has expired.  We choose
not to use the wait_bit function, as we don't need interruptability
with ctrl-c and have at least one case where two bits (one for an
error condition and another one for completion) need to be read and
using wait_bit would not have added to the clarity.

The observed speedup in testing on an A31 is greater than 10x (e.g. a
10MB write decreases from 9.302s to 0.884s).

Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 drivers/mmc/sunxi_mmc.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/sunxi_mmc.c b/drivers/mmc/sunxi_mmc.c
index 4edb4be46c81..d36c1689e7b1 100644
--- a/drivers/mmc/sunxi_mmc.c
+++ b/drivers/mmc/sunxi_mmc.c
@@ -187,15 +187,16 @@ static int mmc_update_clk(struct sunxi_mmc_priv *priv)
 {
 	unsigned int cmd;
 	unsigned timeout_msecs = 2000;
+	unsigned long start = get_timer(0);
 
 	cmd = SUNXI_MMC_CMD_START |
 	      SUNXI_MMC_CMD_UPCLK_ONLY |
 	      SUNXI_MMC_CMD_WAIT_PRE_OVER;
+
 	writel(cmd, &priv->reg->cmd);
 	while (readl(&priv->reg->cmd) & SUNXI_MMC_CMD_START) {
-		if (!timeout_msecs--)
+		if (get_timer(start) > timeout_msecs)
 			return -1;
-		udelay(1000);
 	}
 
 	/* clock update sets various irq status bits, clear these */
@@ -276,18 +277,21 @@ static int mmc_trans_data_by_cpu(struct sunxi_mmc_priv *priv, struct mmc *mmc,
 	unsigned i;
 	unsigned *buff = (unsigned int *)(reading ? data->dest : data->src);
 	unsigned byte_cnt = data->blocksize * data->blocks;
-	unsigned timeout_usecs = (byte_cnt >> 8) * 1000;
-	if (timeout_usecs < 2000000)
-		timeout_usecs = 2000000;
+	unsigned timeout_msecs = byte_cnt >> 8;
+	unsigned long  start;
+
+	if (timeout_msecs < 2000)
+		timeout_msecs = 2000;
 
 	/* Always read / write data through the CPU */
 	setbits_le32(&priv->reg->gctrl, SUNXI_MMC_GCTRL_ACCESS_BY_AHB);
 
+	start = get_timer(0);
+
 	for (i = 0; i < (byte_cnt >> 2); i++) {
 		while (readl(&priv->reg->status) & status_bit) {
-			if (!timeout_usecs--)
+			if (get_timer(start) > timeout_msecs)
 				return -1;
-			udelay(1);
 		}
 
 		if (reading)
@@ -303,16 +307,16 @@ static int mmc_rint_wait(struct sunxi_mmc_priv *priv, struct mmc *mmc,
 			 uint timeout_msecs, uint done_bit, const char *what)
 {
 	unsigned int status;
+	unsigned long start = get_timer(0);
 
 	do {
 		status = readl(&priv->reg->rint);
-		if (!timeout_msecs-- ||
+		if ((get_timer(start) > timeout_msecs) ||
 		    (status & SUNXI_MMC_RINT_INTERRUPT_ERROR_BIT)) {
 			debug("%s timeout %x\n", what,
 			      status & SUNXI_MMC_RINT_INTERRUPT_ERROR_BIT);
 			return -ETIMEDOUT;
 		}
-		udelay(1000);
 	} while (!(status & done_bit));
 
 	return 0;
@@ -404,15 +408,16 @@ static int sunxi_mmc_send_cmd_common(struct sunxi_mmc_priv *priv,
 	}
 
 	if (cmd->resp_type & MMC_RSP_BUSY) {
+		unsigned long start = get_timer(0);
 		timeout_msecs = 2000;
+
 		do {
 			status = readl(&priv->reg->status);
-			if (!timeout_msecs--) {
+			if (get_timer(start) > timeout_msecs) {
 				debug("busy timeout\n");
 				error = -ETIMEDOUT;
 				goto out;
 			}
-			udelay(1000);
 		} while (status & SUNXI_MMC_STATUS_CARD_DATA_BUSY);
 	}
 
-- 
2.14.3
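
For readers outside the U-Boot tree, the pattern the patch moves to can be sketched in isolation. This is only an illustration, not the driver code: sim_get_timer() and sim_readl() are hypothetical stand-ins for U-Boot's get_timer() and readl(), the simulated clock advances by 1 ms per register read so the behaviour is deterministic, and the error and timeout cases get separate return codes here, whereas mmc_rint_wait() returns -ETIMEDOUT for both.

```c
#include <stdint.h>

#define POLL_OK       0
#define POLL_ERROR   -1
#define POLL_TIMEOUT -2

/* Hypothetical millisecond clock; U-Boot's get_timer() plays this role. */
static unsigned long sim_now_ms;

static unsigned long sim_get_timer(unsigned long base)
{
	return sim_now_ms - base;
}

/* Hypothetical register model: reads return 0 until ready_at_ms. */
struct sim_reg {
	uint32_t value;            /* bits reported once the hardware is ready */
	unsigned long ready_at_ms; /* simulated time at which they appear */
};

static uint32_t sim_readl(const struct sim_reg *reg)
{
	sim_now_ms++; /* assumption: one read costs 1 ms of simulated time */
	return sim_now_ms >= reg->ready_at_ms ? reg->value : 0;
}

/*
 * Busy-poll one register for either a completion bit or an error bit.
 * There is no delay inside the loop; the timeout is derived from elapsed
 * time rather than from an iteration counter.
 */
static int poll_done_or_error(const struct sim_reg *reg, uint32_t done_bit,
			      uint32_t error_bit, unsigned long timeout_ms)
{
	unsigned long start = sim_get_timer(0);
	uint32_t status;

	for (;;) {
		status = sim_readl(reg);
		if (status & error_bit)
			return POLL_ERROR;
		if (status & done_bit)
			return POLL_OK;
		if (sim_get_timer(start) > timeout_ms)
			return POLL_TIMEOUT;
	}
}
```

Because two bits are examined on every read, a single-mask helper would need either two calls or a combined mask plus a second check afterwards, which is the clarity argument made in the commit message.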


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-03-21 11:18 [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver Maxime Ripard
@ 2018-03-29 13:40 ` Mylène Josserand
  2018-04-04  6:43 ` Jagan Teki
  2018-04-06  5:54 ` Maxime Ripard
  2 siblings, 0 replies; 13+ messages in thread
From: Mylène Josserand @ 2018-03-29 13:40 UTC (permalink / raw
  To: u-boot

Hello,

On Wed, 21 Mar 2018 12:18:58 +0100
Maxime Ripard <maxime.ripard@bootlin.com> wrote:

> From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> 
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
> 
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would not have added to the clarity.
>
> The observed speedup in testing on an A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).
> 
> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>

Tested-by: Mylène Josserand <mylene.josserand@bootlin.com>

Thanks,

-- 
Mylène Josserand, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-03-21 11:18 [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver Maxime Ripard
  2018-03-29 13:40 ` Mylène Josserand
@ 2018-04-04  6:43 ` Jagan Teki
  2018-04-04  7:06   ` Maxime Ripard
  2018-04-06  5:54 ` Maxime Ripard
  2 siblings, 1 reply; 13+ messages in thread
From: Jagan Teki @ 2018-04-04  6:43 UTC (permalink / raw
  To: u-boot

On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
<maxime.ripard@bootlin.com> wrote:
> From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
>
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would not have added to the clarity.
>
> The observed speedup in testing on an A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).

FYI: I've seen a significant improvement, but not 10x, on the A64
(bananapi-m64) with reads.

Before this change:

=> mmc dev 0
switch to partitions #0, OK
mmc0 is current device
=> fatload mmc 0:1 $kernel_addr_r Image
reading Image
16310784 bytes read in 821 ms (18.9 MiB/s)
=> mmc dev 1
switch to partitions #0, OK
mmc1(part 0) is current device
=> ext4load mmc 1:1 $kernel_addr_r Image
16310784 bytes read in 1109 ms (14 MiB/s)


After this change:

=> mmc dev 0
switch to partitions #0, OK
mmc0 is current device
=> fatload mmc 0:1 $kernel_addr_r Image
16310784 bytes read in 784 ms (19.8 MiB/s)
=> mmc dev 1
switch to partitions #0, OK
mmc1(part 0) is current device
=> ext4load mmc 1:1 $kernel_addr_r Image
16310784 bytes read in 793 ms (19.6 MiB/s)

Jagan.
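
The figures in the logs above can be cross-checked with a few lines of C. This is a minimal sketch; the helper names mib_per_sec() and speedup() are made up for the illustration and are not U-Boot functions.

```c
#include <stdint.h>

/* Throughput in MiB/s for `bytes` transferred in `ms` milliseconds. */
static double mib_per_sec(uint64_t bytes, uint64_t ms)
{
	return ((double)bytes / (1024.0 * 1024.0)) / ((double)ms / 1000.0);
}

/* Ratio of the old duration to the new one; above 1.0 means faster. */
static double speedup(uint64_t before_ms, uint64_t after_ms)
{
	return (double)before_ms / (double)after_ms;
}
```

With the numbers reported here, mib_per_sec(16310784, 1109) is about 14.0 and mib_per_sec(16310784, 793) is about 19.6, matching the ext4load lines. speedup(1109, 793) comes to roughly 1.40 for ext4load on mmc1, and speedup(821, 784) to roughly 1.05 for fatload on mmc0.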


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-04  6:43 ` Jagan Teki
@ 2018-04-04  7:06   ` Maxime Ripard
  2018-04-06  6:06     ` Jagan Teki
  0 siblings, 1 reply; 13+ messages in thread
From: Maxime Ripard @ 2018-04-04  7:06 UTC (permalink / raw
  To: u-boot

On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
> <maxime.ripard@bootlin.com> wrote:
> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> >
> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > read 10MB from a fast eMMC device due to excessive delays in polling
> > loops.
> >
> > This commit restructures the main polling loops to use get_timer(...)
> > to determine whether a (millisecond) timeout has expired.  We choose
> > not to use the wait_bit function, as we don't need interruptability
> > with ctrl-c and have at least one case where two bits (one for an
> > error condition and another one for completion) need to be read and
> > using wait_bit would not have added to the clarity.
> >
> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
> > 10MB write decreases from 9.302s to 0.884s).
> 
> FYI: I've seen a significant improvement, but not 10x, on the A64
> (bananapi-m64) with reads.
> 
> Before this change:
> 
> => mmc dev 0
> switch to partitions #0, OK
> mmc0 is current device
> => fatload mmc 0:1 $kernel_addr_r Image
> reading Image
> 16310784 bytes read in 821 ms (18.9 MiB/s)
> => mmc dev 1
> switch to partitions #0, OK
> mmc1(part 0) is current device
> => ext4load mmc 1:1 $kernel_addr_r Image
> 16310784 bytes read in 1109 ms (14 MiB/s)
> 
> 
> After this change:
> 
> => mmc dev 0
> switch to partitions #0, OK
> mmc0 is current device
> => fatload mmc 0:1 $kernel_addr_r Image
> 16310784 bytes read in 784 ms (19.8 MiB/s)
> => mmc dev 1
> switch to partitions #0, OK
> mmc1(part 0) is current device
> => ext4load mmc 1:1 $kernel_addr_r Image
> 16310784 bytes read in 793 ms (19.6 MiB/s)

Yeah, the smaller the file, the bigger the gain. Since your file is
almost twice as large, the gains are probably just noise at that
point, and the bottleneck starts to be your MMC.

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-03-21 11:18 [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver Maxime Ripard
  2018-03-29 13:40 ` Mylène Josserand
  2018-04-04  6:43 ` Jagan Teki
@ 2018-04-06  5:54 ` Maxime Ripard
  2018-04-16 19:55   ` Maxime Ripard
  2 siblings, 1 reply; 13+ messages in thread
From: Maxime Ripard @ 2018-04-06  5:54 UTC (permalink / raw
  To: u-boot

Hi Jaehoon,

On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> 
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
> 
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would not have added to the clarity.
>
> The observed speedup in testing on an A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).
> 
> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>

Any chance we can merge this for the next release?

Thanks!
Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-04  7:06   ` Maxime Ripard
@ 2018-04-06  6:06     ` Jagan Teki
  2018-04-24 19:57       ` Maxime Ripard
  0 siblings, 1 reply; 13+ messages in thread
From: Jagan Teki @ 2018-04-06  6:06 UTC (permalink / raw
  To: u-boot

On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
<maxime.ripard@bootlin.com> wrote:
> On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
>> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
>> <maxime.ripard@bootlin.com> wrote:
>> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> >
>> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > loops.
>> >
>> > This commit restructures the main polling loops to use get_timer(...)
>> > to determine whether a (millisecond) timeout has expired.  We choose
>> > not to use the wait_bit function, as we don't need interruptability
>> > with ctrl-c and have at least one case where two bits (one for an
>> > error condition and another one for completion) need to be read and
>> > using wait_bit would not have added to the clarity.
>> >
>> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
>> > 10MB write decreases from 9.302s to 0.884s).
>>
>> FYI: I've seen a significant improvement, but not 10x, on the A64
>> (bananapi-m64) with reads.
>>
>> Before this change:
>>
>> => mmc dev 0
>> switch to partitions #0, OK
>> mmc0 is current device
>> => fatload mmc 0:1 $kernel_addr_r Image
>> reading Image
>> 16310784 bytes read in 821 ms (18.9 MiB/s)
>> => mmc dev 1
>> switch to partitions #0, OK
>> mmc1(part 0) is current device
>> => ext4load mmc 1:1 $kernel_addr_r Image
>> 16310784 bytes read in 1109 ms (14 MiB/s)
>>
>>
>> After this change:
>>
>> => mmc dev 0
>> switch to partitions #0, OK
>> mmc0 is current device
>> => fatload mmc 0:1 $kernel_addr_r Image
>> 16310784 bytes read in 784 ms (19.8 MiB/s)
>> => mmc dev 1
>> switch to partitions #0, OK
>> mmc1(part 0) is current device
>> => ext4load mmc 1:1 $kernel_addr_r Image
>> 16310784 bytes read in 793 ms (19.6 MiB/s)
>
> Yeah, the smaller the file, the bigger the gain. Since your file is
> almost twice as large, the gains are probably just noise at that
> point, and the bottleneck starts to be your MMC.

Acked-by: Jagan Teki <jagan@openedev.com>


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-06  5:54 ` Maxime Ripard
@ 2018-04-16 19:55   ` Maxime Ripard
  2018-04-16 20:37     ` Michael Nazzareno Trimarchi
  0 siblings, 1 reply; 13+ messages in thread
From: Maxime Ripard @ 2018-04-16 19:55 UTC (permalink / raw
  To: u-boot

On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
> Hi Jaehoon,
> 
> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> > 
> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > read 10MB from a fast eMMC device due to excessive delays in polling
> > loops.
> > 
> > This commit restructures the main polling loops to use get_timer(...)
> > to determine whether a (millisecond) timeout has expired.  We choose
> > not to use the wait_bit function, as we don't need interruptability
> > with ctrl-c and have at least one case where two bits (one for an
> > error condition and another one for completion) need to be read and
> > using wait_bit would not have added to the clarity.
> >
> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
> > 10MB write decreases from 9.302s to 0.884s).
> > 
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> > Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
> 
> Any chance we can merge this for the next release?

Ping?

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-16 19:55   ` Maxime Ripard
@ 2018-04-16 20:37     ` Michael Nazzareno Trimarchi
  2018-04-20 20:10       ` Maxime Ripard
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Nazzareno Trimarchi @ 2018-04-16 20:37 UTC (permalink / raw
  To: u-boot

Hi

On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
<maxime.ripard@bootlin.com> wrote:
> On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
>> Hi Jaehoon,
>>
>> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
>> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> >
>> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > loops.
>> >
>> > This commit restructures the main polling loops to use get_timer(...)
>> > to determine whether a (millisecond) timeout has expired.  We choose
>> > not to use the wait_bit function, as we don't need interruptability
>> > with ctrl-c and have at least one case where two bits (one for an
>> > error condition and another one for completion) need to be read and
>> > using wait_bit would not have added to the clarity.
>> >
>> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
>> > 10MB write decreases from 9.302s to 0.884s).
>> >
>> > Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> > Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
>>
>> Any chance we can merge this for the next release?
>
> Ping?
>

Just curious, but what is the result with %s/udelay(1000)/udelay(1)/g in the driver?

Michael

> Maxime
>
> --
> Maxime Ripard, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> https://bootlin.com



-- 
| Michael Nazzareno Trimarchi                     Amarula Solutions BV |
| COO  -  Founder                                      Cruquiuskade 47 |
| +31(0)851119172                                 Amsterdam 1018 AM NL |
|                  [`as] http://www.amarulasolutions.com               |


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-16 20:37     ` Michael Nazzareno Trimarchi
@ 2018-04-20 20:10       ` Maxime Ripard
  2018-04-20 20:49         ` Michael Nazzareno Trimarchi
  0 siblings, 1 reply; 13+ messages in thread
From: Maxime Ripard @ 2018-04-20 20:10 UTC (permalink / raw
  To: u-boot

On Mon, Apr 16, 2018 at 10:37:11PM +0200, Michael Nazzareno Trimarchi wrote:
> Hi
> 
> On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
> <maxime.ripard@bootlin.com> wrote:
> > On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
> >> Hi Jaehoon,
> >>
> >> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> >> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> >> >
> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> >> > read 10MB from a fast eMMC device due to excessive delays in polling
> >> > loops.
> >> >
> >> > This commit restructures the main polling loops to use get_timer(...)
> >> > to determine whether a (millisecond) timeout has expired.  We choose
> >> > not to use the wait_bit function, as we don't need interruptability
> >> > with ctrl-c and have at least one case where two bits (one for an
> >> > error condition and another one for completion) need to be read and
> >> > using wait_bit would not have added to the clarity.
> >> >
> >> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
> >> > 10MB write decreases from 9.302s to 0.884s).
> >> >
> >> > Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> >> > Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
> >>
> >> Any chance we can merge this for the next release?
> >
> > Ping?
> >
> 
> Just curious, but what is the result with %s/udelay(1000)/udelay(1)/g in
> the driver?

This will probably speed up the transfer as well, but we don't need
that udelay in the first place. We don't have any application or OS to
be nice to, so we can just busy loop in order to achieve the higher
throughput. Or am I missing something?

Maxime
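
To make the cost of the in-loop delay concrete, here is a small simulation of the do/while shape used in mmc_rint_wait() and in the busy wait of sunxi_mmc_send_cmd_common() before the patch. It is a sketch under stated assumptions, not driver code: struct poll_model and simulate_poll() are hypothetical, time is counted in simulated microseconds, the cost of one register read is a free parameter, and the timeout check is left out.

```c
/* Hypothetical model of one polling loop. */
struct poll_model {
	unsigned long read_cost_us; /* assumed cost of one register read */
	unsigned long delay_us;     /* udelay() argument in the loop, 0 = none */
};

/*
 * Returns the simulated time, in microseconds, spent in the loop when the
 * hardware becomes ready at ready_at_us. At least one of read_cost_us and
 * delay_us must be non-zero, otherwise simulated time never advances.
 */
static unsigned long simulate_poll(struct poll_model m,
				   unsigned long ready_at_us)
{
	unsigned long now = 0;
	int done;

	do {
		now += m.read_cost_us;     /* status = readl(...) */
		done = now >= ready_at_us; /* state sampled by that read */
		now += m.delay_us;         /* delay runs before the exit test */
	} while (!done);

	return now;
}
```

With an assumed 1 us read cost and hardware that is ready immediately, the model gives 1001 us for udelay(1000), 2 us for udelay(1), and 1 us with no delay at all. The point is that, in this loop shape, the delay is paid once even when the very first read already shows completion.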

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-20 20:10       ` Maxime Ripard
@ 2018-04-20 20:49         ` Michael Nazzareno Trimarchi
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Nazzareno Trimarchi @ 2018-04-20 20:49 UTC (permalink / raw
  To: u-boot

Hi

On Fri, Apr 20, 2018 at 10:10 PM, Maxime Ripard
<maxime.ripard@bootlin.com> wrote:
> On Mon, Apr 16, 2018 at 10:37:11PM +0200, Michael Nazzareno Trimarchi wrote:
>> Hi
>>
>> On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
>> <maxime.ripard@bootlin.com> wrote:
>> > On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
>> >> Hi Jaehoon,
>> >>
>> >> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
>> >> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> >> >
>> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> >> > read 10MB from a fast eMMC device due to excessive delays in polling
>> >> > loops.
>> >> >
>> >> > This commit restructures the main polling loops to use get_timer(...)
>> >> > to determine whether a (millisecond) timeout has expired.  We choose
>> >> > not to use the wait_bit function, as we don't need interruptability
>> >> > with ctrl-c and have at least one case where two bits (one for an
>> >> > error condition and another one for completion) need to be read and
>> >> > using wait_bit would not have added to the clarity.
>> >> >
>> >> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
>> >> > 10MB write decreases from 9.302s to 0.884s).
>> >> >
>> >> > Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> >> > Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
>> >>
>> >> Any chance we can merge this for the next release?
>> >
>> > Ping?
>> >
>>
>> Just curious, but what is the result with %s/udelay(1000)/udelay(1)/g in
>> the driver?
>
> This will probably speed up the transfer as well, but we don't need
> that udelay in the first place. We don't have any application or OS to
> be nice to, so we can just busy loop in order to achieve the higher
> throughput. Or am I missing something?
>

One reason was to try to keep the code change smaller; the second was to
ping in another way so that the patch gets included.

Michael

> Maxime
>
> --
> Maxime Ripard, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> https://bootlin.com



-- 
| Michael Nazzareno Trimarchi                     Amarula Solutions BV |
| COO  -  Founder                                      Cruquiuskade 47 |
| +31(0)851119172                                 Amsterdam 1018 AM NL |
|                  [`as] http://www.amarulasolutions.com               |


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-06  6:06     ` Jagan Teki
@ 2018-04-24 19:57       ` Maxime Ripard
  2018-04-24 20:16         ` Tom Rini
  0 siblings, 1 reply; 13+ messages in thread
From: Maxime Ripard @ 2018-04-24 19:57 UTC (permalink / raw
  To: u-boot

Hi Jagan,

On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
> On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
> <maxime.ripard@bootlin.com> wrote:
> > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
> >> <maxime.ripard@bootlin.com> wrote:
> >> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> >> >
> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> >> > read 10MB from a fast eMMC device due to excessive delays in polling
> >> > loops.
> >> >
> >> > This commit restructures the main polling loops to use get_timer(...)
> >> > to determine whether a (millisecond) timeout has expired.  We choose
> >> > not to use the wait_bit function, as we don't need interruptability
> >> > with ctrl-c and have at least one case where two bits (one for an
> >> > error condition and another one for completion) need to be read and
> >> > using wait_bit would not have added to the clarity.
> >> >
> >> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
> >> > 10MB write decreases from 9.302s to 0.884s).
> >>
> >> FYI: I've seen a significant improvement, but not 10x, on the A64
> >> (bananapi-m64) with reads.
> >>
> >> Before this change:
> >>
> >> => mmc dev 0
> >> switch to partitions #0, OK
> >> mmc0 is current device
> >> => fatload mmc 0:1 $kernel_addr_r Image
> >> reading Image
> >> 16310784 bytes read in 821 ms (18.9 MiB/s)
> >> => mmc dev 1
> >> switch to partitions #0, OK
> >> mmc1(part 0) is current device
> >> => ext4load mmc 1:1 $kernel_addr_r Image
> >> 16310784 bytes read in 1109 ms (14 MiB/s)
> >>
> >>
> >> After this change:
> >>
> >> => mmc dev 0
> >> switch to partitions #0, OK
> >> mmc0 is current device
> >> => fatload mmc 0:1 $kernel_addr_r Image
> >> 16310784 bytes read in 784 ms (19.8 MiB/s)
> >> => mmc dev 1
> >> switch to partitions #0, OK
> >> mmc1(part 0) is current device
> >> => ext4load mmc 1:1 $kernel_addr_r Image
> >> 16310784 bytes read in 793 ms (19.6 MiB/s)
> >
> > Yeah, the smaller the file, the bigger the gain. Since your file is
> > almost twice as large, the gains are probably just noise at that
> > point, and the bottleneck starts to be your MMC.
> 
> Acked-by: Jagan Teki <jagan@openedev.com>

Jaehoon doesn't seem to be replying at all. Can we merge this through the
sunxi tree?

Thanks!
Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-24 19:57       ` Maxime Ripard
@ 2018-04-24 20:16         ` Tom Rini
  2018-04-25  5:01           ` Jagan Teki
  0 siblings, 1 reply; 13+ messages in thread
From: Tom Rini @ 2018-04-24 20:16 UTC (permalink / raw
  To: u-boot

On Tue, Apr 24, 2018 at 09:57:58PM +0200, Maxime Ripard wrote:
> Hi Jagan,
> 
> On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
> > On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
> > <maxime.ripard@bootlin.com> wrote:
> > > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> > >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
> > >> <maxime.ripard@bootlin.com> wrote:
> > >> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> > >> >
> > >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > >> > read 10MB from a fast eMMC device due to excessive delays in polling
> > >> > loops.
> > >> >
> > >> > This commit restructures the main polling loops to use get_timer(...)
> > >> > to determine whether a (millisecond) timeout has expired.  We choose
> > >> > not to use the wait_bit function, as we don't need interruptability
> > >> > with ctrl-c and have at least one case where two bits (one for an
> > >> > error condition and another one for completion) need to be read and
> > >> > using wait_bit would not have added to the clarity.
> > >> >
> > >> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
> > >> > 10MB write decreases from 9.302s to 0.884s).
> > >>
> > >> FYI: I've seen a significant improvement, but not 10x, on the A64
> > >> (bananapi-m64) with reads.
> > >>
> > >> Before this change:
> > >>
> > >> => mmc dev 0
> > >> switch to partitions #0, OK
> > >> mmc0 is current device
> > >> => fatload mmc 0:1 $kernel_addr_r Image
> > >> reading Image
> > >> 16310784 bytes read in 821 ms (18.9 MiB/s)
> > >> => mmc dev 1
> > >> switch to partitions #0, OK
> > >> mmc1(part 0) is current device
> > >> => ext4load mmc 1:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 1109 ms (14 MiB/s)
> > >>
> > >>
> > >> After this change:
> > >>
> > >> => mmc dev 0
> > >> switch to partitions #0, OK
> > >> mmc0 is current device
> > >> => fatload mmc 0:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 784 ms (19.8 MiB/s)
> > >> => mmc dev 1
> > >> switch to partitions #0, OK
> > >> mmc1(part 0) is current device
> > >> => ext4load mmc 1:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 793 ms (19.6 MiB/s)
> > >
> > > Yeah, the smaller the file, the bigger the gain. Since your file is
> > > almost twice as large, the gains are probably just noise at that
> > > point, and the bottleneck starts to be your MMC.
> > 
> > Acked-by: Jagan Teki <jagan@openedev.com>
> 
> Jaehoon doesn't seem to be replying at all. Can we merge this through the
> sunxi tree?

Yes.

Reviewed-by: Tom Rini <trini@konsulko.com>

-- 
Tom

* [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver
  2018-04-24 20:16         ` Tom Rini
@ 2018-04-25  5:01           ` Jagan Teki
  0 siblings, 0 replies; 13+ messages in thread
From: Jagan Teki @ 2018-04-25  5:01 UTC (permalink / raw
  To: u-boot

On Wed, Apr 25, 2018 at 1:46 AM, Tom Rini <trini@konsulko.com> wrote:
> On Tue, Apr 24, 2018 at 09:57:58PM +0200, Maxime Ripard wrote:
>> Hi Jagan,
>>
>> On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
>> > On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
>> > <maxime.ripard@bootlin.com> wrote:
>> > > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
>> > >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
>> > >> <maxime.ripard@bootlin.com> wrote:
>> > >> > From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> > >> >
>> > >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > >> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > >> > loops.
>> > >> >
>> > >> > This commit restructures the main polling loops to use get_timer(...)
>> > >> > to determine whether a (millisecond) timeout has expired.  We choose
>> > >> > not to use the wait_bit function, as we don't need interruptability
>> > >> > with ctrl-c and have at least one case where two bits (one for an
>> > >> > error condition and another one for completion) need to be read and
>> > >> > using wait_bit would not have added to the clarity.
>> > >> >
>> > >> > The observed speedup in testing on an A31 is greater than 10x (e.g. a
>> > >> > 10MB write decreases from 9.302s to 0.884s).
>> > >>
>> > >> FYI: I've seen a significant improvement, but not 10x, on the A64
>> > >> (bananapi-m64) with reads.
>> > >>
>> > >> Before this change:
>> > >>
>> > >> => mmc dev 0
>> > >> switch to partitions #0, OK
>> > >> mmc0 is current device
>> > >> => fatload mmc 0:1 $kernel_addr_r Image
>> > >> reading Image
>> > >> 16310784 bytes read in 821 ms (18.9 MiB/s)
>> > >> => mmc dev 1
>> > >> switch to partitions #0, OK
>> > >> mmc1(part 0) is current device
>> > >> => ext4load mmc 1:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 1109 ms (14 MiB/s)
>> > >>
>> > >>
>> > >> After this change:
>> > >>
>> > >> => mmc dev 0
>> > >> switch to partitions #0, OK
>> > >> mmc0 is current device
>> > >> => fatload mmc 0:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 784 ms (19.8 MiB/s)
>> > >> => mmc dev 1
>> > >> switch to partitions #0, OK
>> > >> mmc1(part 0) is current device
>> > >> => ext4load mmc 1:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 793 ms (19.6 MiB/s)
>> > >
>> > > Yeah, the smaller the file, the bigger the gain. Since your file is
>> > > almost twice as large, the gains are probably just noise at that
>> > > point, and the bottleneck starts to be your MMC.
>> >
>> > Acked-by: Jagan Teki <jagan@openedev.com>
>>
>> Jaehoon doesn't seem to be replying at all. Can we merge this through the
>> sunxi tree?

Applied to u-boot-sunxi/master

