* [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
@ 2024-03-22 16:41 Niklas Cassel
2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Niklas Cassel @ 2024-03-22 16:41 UTC (permalink / raw
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman
Cc: Kuppuswamy Sathyanarayanan, Damien Le Moal, Niklas Cassel,
linux-pci
The current code uses writel()/readl(), which has an implicit memory
barrier for every single readl()/writel().
Additionally, reading 4 bytes at a time over the PCI bus is not really
optimal, considering that this code is running in an ioctl handler.
Use memcpy_toio()/memcpy_fromio() for BAR tests.
Before patch with a 4MB BAR:
$ time /usr/bin/pcitest -b 1
BAR1: OKAY
real 0m 1.56s
After patch with a 4MB BAR:
$ time /usr/bin/pcitest -b 1
BAR1: OKAY
real 0m 0.54s
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
Changes since v3:
-Use scope-based resource management __free attribute from cleanup.h to
avoid overly verbose gotos and labels for error handling.
-Added a comment related to why we allocate a buffer of max 1MB.
(kmalloc() default upper limit is usually 4 MB on ARM and x86.)
drivers/misc/pci_endpoint_test.c | 54 +++++++++++++++++++++++++-------
1 file changed, 42 insertions(+), 12 deletions(-)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index 705029ad8eb5..bf64d3aff7d8 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -7,6 +7,7 @@
*/
#include <linux/crc32.h>
+#include <linux/cleanup.h>
#include <linux/delay.h>
#include <linux/fs.h>
#include <linux/io.h>
@@ -272,31 +273,60 @@ static const u32 bar_test_pattern[] = {
0xA5A5A5A5,
};
+static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
+ enum pci_barno barno, int offset,
+ void *write_buf, void *read_buf,
+ int size)
+{
+ memset(write_buf, bar_test_pattern[barno], size);
+ memcpy_toio(test->bar[barno] + offset, write_buf, size);
+
+ memcpy_fromio(read_buf, test->bar[barno] + offset, size);
+
+ return memcmp(write_buf, read_buf, size);
+}
+
static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
enum pci_barno barno)
{
- int j;
- u32 val;
- int size;
+ int j, bar_size, buf_size, iters, remain;
+ void *write_buf __free(kfree) = NULL;
+ void *read_buf __free(kfree) = NULL;
struct pci_dev *pdev = test->pdev;
if (!test->bar[barno])
return false;
- size = pci_resource_len(pdev, barno);
+ bar_size = pci_resource_len(pdev, barno);
if (barno == test->test_reg_bar)
- size = 0x4;
+ bar_size = 0x4;
- for (j = 0; j < size; j += 4)
- pci_endpoint_test_bar_writel(test, barno, j,
- bar_test_pattern[barno]);
+ /*
+ * Allocate a buffer of max size 1MB, and reuse that buffer while
+ * iterating over the whole BAR size (which might be much larger).
+ */
+ buf_size = min(SZ_1M, bar_size);
- for (j = 0; j < size; j += 4) {
- val = pci_endpoint_test_bar_readl(test, barno, j);
- if (val != bar_test_pattern[barno])
+ write_buf = kmalloc(buf_size, GFP_KERNEL);
+ if (!write_buf)
+ return false;
+
+ read_buf = kmalloc(buf_size, GFP_KERNEL);
+ if (!read_buf)
+ return false;
+
+ iters = bar_size / buf_size;
+ for (j = 0; j < iters; j++)
+ if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
+ write_buf, read_buf, buf_size))
+ return false;
+
+ remain = bar_size % buf_size;
+ if (remain)
+ if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
+ write_buf, read_buf, remain))
return false;
- }
return true;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
@ 2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
2024-03-25 7:58 ` Niklas Cassel
2024-04-18 9:08 ` Niklas Cassel
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2024-03-22 17:03 UTC (permalink / raw
To: Niklas Cassel, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman
Cc: Damien Le Moal, linux-pci
On 3/22/24 9:41 AM, Niklas Cassel wrote:
> The current code uses writel()/readl(), which has an implicit memory
> barrier for every single readl()/writel().
>
> Additionally, reading 4 bytes at a time over the PCI bus is not really
> optimal, considering that this code is running in an ioctl handler.
>
> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>
> Before patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 1.56s
>
> After patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 0.54s
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> Changes since v3:
> -Use scope-based resource management __free attribute from cleanup.h to
> avoid overly verbose gotos and labels for error handling.
> -Added a comment related to why we allocate a buffer of max 1MB.
> (kmalloc() default upper limit is usually 4 MB on ARM and x86.)
>
> drivers/misc/pci_endpoint_test.c | 54 +++++++++++++++++++++++++-------
> 1 file changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 705029ad8eb5..bf64d3aff7d8 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/crc32.h>
> +#include <linux/cleanup.h>
> #include <linux/delay.h>
> #include <linux/fs.h>
> #include <linux/io.h>
> @@ -272,31 +273,60 @@ static const u32 bar_test_pattern[] = {
> 0xA5A5A5A5,
> };
>
> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> + enum pci_barno barno, int offset,
> + void *write_buf, void *read_buf,
> + int size)
> +{
> + memset(write_buf, bar_test_pattern[barno], size);
> + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> +
> + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
> +
> + return memcmp(write_buf, read_buf, size);
> +}
> +
> static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
> enum pci_barno barno)
> {
> - int j;
> - u32 val;
> - int size;
> + int j, bar_size, buf_size, iters, remain;
> + void *write_buf __free(kfree) = NULL;
> + void *read_buf __free(kfree) = NULL;
Please check the following cleanup doc. Recommendation is to avoid above __free(kfree) = NULL declarations and directly define write_buf/read_buf.
https://lore.kernel.org/lkml/171097196970.1011049.9726486429680041876.stgit@dwillia2-xfh.jf.intel.com/T/#m05d71a668ff0fad46c2055dbdcde55d7381780b7
> struct pci_dev *pdev = test->pdev;
>
> if (!test->bar[barno])
> return false;
>
> - size = pci_resource_len(pdev, barno);
> + bar_size = pci_resource_len(pdev, barno);
>
> if (barno == test->test_reg_bar)
> - size = 0x4;
> + bar_size = 0x4;
>
> - for (j = 0; j < size; j += 4)
> - pci_endpoint_test_bar_writel(test, barno, j,
> - bar_test_pattern[barno]);
> + /*
> + * Allocate a buffer of max size 1MB, and reuse that buffer while
> + * iterating over the whole BAR size (which might be much larger).
> + */
> + buf_size = min(SZ_1M, bar_size);
>
> - for (j = 0; j < size; j += 4) {
> - val = pci_endpoint_test_bar_readl(test, barno, j);
> - if (val != bar_test_pattern[barno])
> + write_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!write_buf)
> + return false;
> +
> + read_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!read_buf)
> + return false;
> +
> + iters = bar_size / buf_size;
> + for (j = 0; j < iters; j++)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
> + write_buf, read_buf, buf_size))
> + return false;
> +
> + remain = bar_size % buf_size;
> + if (remain)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
> + write_buf, read_buf, remain))
> return false;
> - }
>
> return true;
> }
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
@ 2024-03-25 7:58 ` Niklas Cassel
0 siblings, 0 replies; 7+ messages in thread
From: Niklas Cassel @ 2024-03-25 7:58 UTC (permalink / raw
To: Kuppuswamy Sathyanarayanan, Dan Williams
Cc: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman,
Damien Le Moal, linux-pci
Hello Kuppuswamy, Dan,
On Fri, Mar 22, 2024 at 10:03:12AM -0700, Kuppuswamy Sathyanarayanan wrote:
> > +
> > static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
> > enum pci_barno barno)
> > {
> > - int j;
> > - u32 val;
> > - int size;
> > + int j, bar_size, buf_size, iters, remain;
> > + void *write_buf __free(kfree) = NULL;
> > + void *read_buf __free(kfree) = NULL;
>
> Please check the following cleanup doc. Recommendation is to avoid above __free(kfree) = NULL declarations and directly define write_buf/read_buf.
>
> https://lore.kernel.org/lkml/171097196970.1011049.9726486429680041876.stgit@dwillia2-xfh.jf.intel.com/T/#m05d71a668ff0fad46c2055dbdcde55d7381780b7
I don't think that the docs say that using:
void *ptr __free(kfree) = NULL;
is always an anti-pattern.
"If other cleanup helpers appeared in such a function that depended on
@dev being live to complete their unwind then using the
"struct obj_type *obj __free(...) = NULL" style is an anti-pattern that
potentially causes a use-after-free bug."
I think it says that it is an anti-pattern IFF there are two cleanup
helpers in a function AND they have have a inter-dependency.
This patch just adds a single cleanup helper, so there can be no
inter-dependency problem.
This pattern is common in Linus's current tree, see e.g.
$ git grep -C 3 -E "__free(.*) = NULL"
If this was a problem, I don't think we would have seen
100 different instances of this pattern.
In other words, I think this patch is fine.
Dan, please correct me if I'm wrong.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
@ 2024-04-18 9:08 ` Niklas Cassel
2024-04-18 17:49 ` Kuppuswamy Sathyanarayanan
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Niklas Cassel @ 2024-04-18 9:08 UTC (permalink / raw
To: Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman
Cc: Kuppuswamy Sathyanarayanan, Damien Le Moal, linux-pci
On Fri, Mar 22, 2024 at 05:41:38PM +0100, Niklas Cassel wrote:
> The current code uses writel()/readl(), which has an implicit memory
> barrier for every single readl()/writel().
>
> Additionally, reading 4 bytes at a time over the PCI bus is not really
> optimal, considering that this code is running in an ioctl handler.
>
> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>
> Before patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 1.56s
>
> After patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 0.54s
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> Changes since v3:
> -Use scope-based resource management __free attribute from cleanup.h to
> avoid overly verbose gotos and labels for error handling.
> -Added a comment related to why we allocate a buffer of max 1MB.
> (kmalloc() default upper limit is usually 4 MB on ARM and x86.)
>
> drivers/misc/pci_endpoint_test.c | 54 +++++++++++++++++++++++++-------
> 1 file changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 705029ad8eb5..bf64d3aff7d8 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/crc32.h>
> +#include <linux/cleanup.h>
> #include <linux/delay.h>
> #include <linux/fs.h>
> #include <linux/io.h>
> @@ -272,31 +273,60 @@ static const u32 bar_test_pattern[] = {
> 0xA5A5A5A5,
> };
>
> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> + enum pci_barno barno, int offset,
> + void *write_buf, void *read_buf,
> + int size)
> +{
> + memset(write_buf, bar_test_pattern[barno], size);
> + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> +
> + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
> +
> + return memcmp(write_buf, read_buf, size);
> +}
> +
> static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
> enum pci_barno barno)
> {
> - int j;
> - u32 val;
> - int size;
> + int j, bar_size, buf_size, iters, remain;
> + void *write_buf __free(kfree) = NULL;
> + void *read_buf __free(kfree) = NULL;
> struct pci_dev *pdev = test->pdev;
>
> if (!test->bar[barno])
> return false;
>
> - size = pci_resource_len(pdev, barno);
> + bar_size = pci_resource_len(pdev, barno);
>
> if (barno == test->test_reg_bar)
> - size = 0x4;
> + bar_size = 0x4;
>
> - for (j = 0; j < size; j += 4)
> - pci_endpoint_test_bar_writel(test, barno, j,
> - bar_test_pattern[barno]);
> + /*
> + * Allocate a buffer of max size 1MB, and reuse that buffer while
> + * iterating over the whole BAR size (which might be much larger).
> + */
> + buf_size = min(SZ_1M, bar_size);
>
> - for (j = 0; j < size; j += 4) {
> - val = pci_endpoint_test_bar_readl(test, barno, j);
> - if (val != bar_test_pattern[barno])
> + write_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!write_buf)
> + return false;
> +
> + read_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!read_buf)
> + return false;
> +
> + iters = bar_size / buf_size;
> + for (j = 0; j < iters; j++)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
> + write_buf, read_buf, buf_size))
> + return false;
> +
> + remain = bar_size % buf_size;
> + if (remain)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
> + write_buf, read_buf, remain))
> return false;
> - }
>
> return true;
> }
> --
> 2.44.0
>
Gentle ping :)
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
2024-04-18 9:08 ` Niklas Cassel
@ 2024-04-18 17:49 ` Kuppuswamy Sathyanarayanan
2024-05-04 14:34 ` Manivannan Sadhasivam
2024-05-17 10:09 ` Krzysztof Wilczyński
4 siblings, 0 replies; 7+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2024-04-18 17:49 UTC (permalink / raw
To: Niklas Cassel, Manivannan Sadhasivam, Krzysztof Wilczyński,
Kishon Vijay Abraham I, Arnd Bergmann, Greg Kroah-Hartman
Cc: Damien Le Moal, linux-pci
On 3/22/24 9:41 AM, Niklas Cassel wrote:
> The current code uses writel()/readl(), which has an implicit memory
> barrier for every single readl()/writel().
>
> Additionally, reading 4 bytes at a time over the PCI bus is not really
> optimal, considering that this code is running in an ioctl handler.
>
> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>
> Before patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 1.56s
>
> After patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 0.54s
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
LGTM
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Changes since v3:
> -Use scope-based resource management __free attribute from cleanup.h to
> avoid overly verbose gotos and labels for error handling.
> -Added a comment related to why we allocate a buffer of max 1MB.
> (kmalloc() default upper limit is usually 4 MB on ARM and x86.)
>
> drivers/misc/pci_endpoint_test.c | 54 +++++++++++++++++++++++++-------
> 1 file changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 705029ad8eb5..bf64d3aff7d8 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/crc32.h>
> +#include <linux/cleanup.h>
> #include <linux/delay.h>
> #include <linux/fs.h>
> #include <linux/io.h>
> @@ -272,31 +273,60 @@ static const u32 bar_test_pattern[] = {
> 0xA5A5A5A5,
> };
>
> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> + enum pci_barno barno, int offset,
> + void *write_buf, void *read_buf,
> + int size)
> +{
> + memset(write_buf, bar_test_pattern[barno], size);
> + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> +
> + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
> +
> + return memcmp(write_buf, read_buf, size);
> +}
> +
> static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
> enum pci_barno barno)
> {
> - int j;
> - u32 val;
> - int size;
> + int j, bar_size, buf_size, iters, remain;
> + void *write_buf __free(kfree) = NULL;
> + void *read_buf __free(kfree) = NULL;
> struct pci_dev *pdev = test->pdev;
>
> if (!test->bar[barno])
> return false;
>
> - size = pci_resource_len(pdev, barno);
> + bar_size = pci_resource_len(pdev, barno);
>
> if (barno == test->test_reg_bar)
> - size = 0x4;
> + bar_size = 0x4;
>
> - for (j = 0; j < size; j += 4)
> - pci_endpoint_test_bar_writel(test, barno, j,
> - bar_test_pattern[barno]);
> + /*
> + * Allocate a buffer of max size 1MB, and reuse that buffer while
> + * iterating over the whole BAR size (which might be much larger).
> + */
> + buf_size = min(SZ_1M, bar_size);
>
> - for (j = 0; j < size; j += 4) {
> - val = pci_endpoint_test_bar_readl(test, barno, j);
> - if (val != bar_test_pattern[barno])
> + write_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!write_buf)
> + return false;
> +
> + read_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!read_buf)
> + return false;
> +
> + iters = bar_size / buf_size;
> + for (j = 0; j < iters; j++)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
> + write_buf, read_buf, buf_size))
> + return false;
> +
> + remain = bar_size % buf_size;
> + if (remain)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
> + write_buf, read_buf, remain))
> return false;
> - }
>
> return true;
> }
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
` (2 preceding siblings ...)
2024-04-18 17:49 ` Kuppuswamy Sathyanarayanan
@ 2024-05-04 14:34 ` Manivannan Sadhasivam
2024-05-17 10:09 ` Krzysztof Wilczyński
4 siblings, 0 replies; 7+ messages in thread
From: Manivannan Sadhasivam @ 2024-05-04 14:34 UTC (permalink / raw
To: Niklas Cassel
Cc: Krzysztof Wilczyński, Kishon Vijay Abraham I, Arnd Bergmann,
Greg Kroah-Hartman, Kuppuswamy Sathyanarayanan, Damien Le Moal,
linux-pci
On Fri, Mar 22, 2024 at 05:41:38PM +0100, Niklas Cassel wrote:
> The current code uses writel()/readl(), which has an implicit memory
> barrier for every single readl()/writel().
>
> Additionally, reading 4 bytes at a time over the PCI bus is not really
> optimal, considering that this code is running in an ioctl handler.
>
> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>
> Before patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 1.56s
>
> After patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 0.54s
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
- Mani
> ---
> Changes since v3:
> -Use scope-based resource management __free attribute from cleanup.h to
> avoid overly verbose gotos and labels for error handling.
> -Added a comment related to why we allocate a buffer of max 1MB.
> (kmalloc() default upper limit is usually 4 MB on ARM and x86.)
>
> drivers/misc/pci_endpoint_test.c | 54 +++++++++++++++++++++++++-------
> 1 file changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
> index 705029ad8eb5..bf64d3aff7d8 100644
> --- a/drivers/misc/pci_endpoint_test.c
> +++ b/drivers/misc/pci_endpoint_test.c
> @@ -7,6 +7,7 @@
> */
>
> #include <linux/crc32.h>
> +#include <linux/cleanup.h>
> #include <linux/delay.h>
> #include <linux/fs.h>
> #include <linux/io.h>
> @@ -272,31 +273,60 @@ static const u32 bar_test_pattern[] = {
> 0xA5A5A5A5,
> };
>
> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
> + enum pci_barno barno, int offset,
> + void *write_buf, void *read_buf,
> + int size)
> +{
> + memset(write_buf, bar_test_pattern[barno], size);
> + memcpy_toio(test->bar[barno] + offset, write_buf, size);
> +
> + memcpy_fromio(read_buf, test->bar[barno] + offset, size);
> +
> + return memcmp(write_buf, read_buf, size);
> +}
> +
> static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
> enum pci_barno barno)
> {
> - int j;
> - u32 val;
> - int size;
> + int j, bar_size, buf_size, iters, remain;
> + void *write_buf __free(kfree) = NULL;
> + void *read_buf __free(kfree) = NULL;
> struct pci_dev *pdev = test->pdev;
>
> if (!test->bar[barno])
> return false;
>
> - size = pci_resource_len(pdev, barno);
> + bar_size = pci_resource_len(pdev, barno);
>
> if (barno == test->test_reg_bar)
> - size = 0x4;
> + bar_size = 0x4;
>
> - for (j = 0; j < size; j += 4)
> - pci_endpoint_test_bar_writel(test, barno, j,
> - bar_test_pattern[barno]);
> + /*
> + * Allocate a buffer of max size 1MB, and reuse that buffer while
> + * iterating over the whole BAR size (which might be much larger).
> + */
> + buf_size = min(SZ_1M, bar_size);
>
> - for (j = 0; j < size; j += 4) {
> - val = pci_endpoint_test_bar_readl(test, barno, j);
> - if (val != bar_test_pattern[barno])
> + write_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!write_buf)
> + return false;
> +
> + read_buf = kmalloc(buf_size, GFP_KERNEL);
> + if (!read_buf)
> + return false;
> +
> + iters = bar_size / buf_size;
> + for (j = 0; j < iters; j++)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
> + write_buf, read_buf, buf_size))
> + return false;
> +
> + remain = bar_size % buf_size;
> + if (remain)
> + if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
> + write_buf, read_buf, remain))
> return false;
> - }
>
> return true;
> }
> --
> 2.44.0
>
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
` (3 preceding siblings ...)
2024-05-04 14:34 ` Manivannan Sadhasivam
@ 2024-05-17 10:09 ` Krzysztof Wilczyński
4 siblings, 0 replies; 7+ messages in thread
From: Krzysztof Wilczyński @ 2024-05-17 10:09 UTC (permalink / raw
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Kishon Vijay Abraham I, Arnd Bergmann,
Greg Kroah-Hartman, Kuppuswamy Sathyanarayanan, Damien Le Moal,
linux-pci
Hello,
> The current code uses writel()/readl(), which has an implicit memory
> barrier for every single readl()/writel().
>
> Additionally, reading 4 bytes at a time over the PCI bus is not really
> optimal, considering that this code is running in an ioctl handler.
>
> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>
> Before patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 1.56s
>
> After patch with a 4MB BAR:
> $ time /usr/bin/pcitest -b 1
> BAR1: OKAY
> real 0m 0.54s
Applied to misc, thank you!
[1/1] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests
https://git.kernel.org/pci/pci/c/9ff0a8cdf12e
Krzysztof
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-05-17 10:09 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-22 16:41 [PATCH v4] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests Niklas Cassel
2024-03-22 17:03 ` Kuppuswamy Sathyanarayanan
2024-03-25 7:58 ` Niklas Cassel
2024-04-18 9:08 ` Niklas Cassel
2024-04-18 17:49 ` Kuppuswamy Sathyanarayanan
2024-05-04 14:34 ` Manivannan Sadhasivam
2024-05-17 10:09 ` Krzysztof Wilczyński
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).