Linux-ARM-MSM Archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
       [not found] ` <d8930bce-6db6-45f4-8f09-8a00fa48e607@notapiano>
@ 2024-05-22 10:03   ` Neil Armstrong
  2024-05-22 11:33     ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Armstrong @ 2024-05-22 10:03 UTC (permalink / raw
  To: Nícolas F. R. A. Prado, Andy Shevchenko
  Cc: Mark Brown, linux-spi, linux-kernel, linux-arm-msm

Hi,

On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
> On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
>> Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
>> have checks for orig_nents against 0. No need to duplicate this.
>> All the same applies to other DMA mapping API calls.
>>
>> Also note, there is no other user in the kernel that does this kind of
>> checks.
>>
>> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 
> Hi,
> 
> this commit caused a regression which I reported here:
> 
> https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
> 
> along with some thoughts on the cause and a possible solution, though I'm not
> familiar with this code base at all and would really appreciate any feedback you
> may have.

I also see the same regression on the SM8550 and SM8650 platforms,
please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.

Thanks,
Neil

> 
> Thanks,
> Nícolas
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 10:03   ` [PATCH v1 1/1] spi: Remove unneded check for orig_nents Neil Armstrong
@ 2024-05-22 11:33     ` Andy Shevchenko
  2024-05-22 11:53       ` Neil Armstrong
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2024-05-22 11:33 UTC (permalink / raw
  To: Neil Armstrong
  Cc: Nícolas F. R. A. Prado, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
> On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
> > On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
> > > Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
> > > have checks for orig_nents against 0. No need to duplicate this.
> > > All the same applies to other DMA mapping API calls.
> > > 
> > > Also note, there is no other user in the kernel that does this kind of
> > > checks.
> > > 
> > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > 
> > Hi,
> > 
> > this commit caused a regression which I reported here:
> > 
> > https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
> > 
> > along with some thoughts on the cause and a possible solution, though I'm not
> > familiar with this code base at all and would really appreciate any feedback you
> > may have.
> 
> I also see the same regression on the SM8550 and SM8650 platforms,
> please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.

There is still no answer from IOMMU patch author. Do you have the same trace
due to IOMMU calls? Anyway, I guess it would be nice to see it.

Meanwhile, I have three changes I posted in the replies to the initial report,
can you combine them all and test? This will be a plan B (? or A, depending on
the culprit).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 11:33     ` Andy Shevchenko
@ 2024-05-22 11:53       ` Neil Armstrong
  2024-05-22 13:18         ` Neil Armstrong
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Armstrong @ 2024-05-22 11:53 UTC (permalink / raw
  To: Andy Shevchenko
  Cc: Nícolas F. R. A. Prado, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

On 22/05/2024 13:33, Andy Shevchenko wrote:
> On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
>> On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
>>> On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
>>>> Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
>>>> have checks for orig_nents against 0. No need to duplicate this.
>>>> All the same applies to other DMA mapping API calls.
>>>>
>>>> Also note, there is no other user in the kernel that does this kind of
>>>> checks.
>>>>
>>>> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>>>
>>> Hi,
>>>
>>> this commit caused a regression which I reported here:
>>>
>>> https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
>>>
>>> along with some thoughts on the cause and a possible solution, though I'm not
>>> familiar with this code base at all and would really appreciate any feedback you
>>> may have.
>>
>> I also see the same regression on the SM8550 and SM8650 platforms,
>> please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.
> 
> There is still no answer from IOMMU patch author. Do you have the same trace
> due to IOMMU calls? Anyway, I guess it would be nice to see it.

Yes :
[    6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
<snip>
[    6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
<snip>
[    6.688286] Call trace:
[    6.688287]  iommu_dma_sync_sg_for_device+0x28/0x100
[    6.717582]  __dma_sync_sg_for_device+0x3c/0x40
[    6.717585]  spi_transfer_one_message+0x358/0x680
[    6.732229]  __spi_pump_transfer_message+0x188/0x494
[    6.732232]  __spi_sync+0x2a8/0x3c4
[    6.732234]  spi_sync+0x30/0x54
[    6.732236]  goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
[    6.739854]  _regmap_raw_write_impl+0x538/0x674
[    6.750053]  _regmap_raw_write+0xb4/0x144
[    6.750056]  regmap_raw_write+0x7c/0xc0
[    6.750058]  goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
[    6.765520]  goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
[    6.765522]  goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
[    6.772339]  spi_probe+0x84/0xe4
[    6.772342]  really_probe+0xbc/0x29c
[    6.784313]  __driver_probe_device+0x78/0x12c
[    6.784316]  driver_probe_device+0x3c/0x15c
[    6.784319]  __driver_attach+0x90/0x19c
[    6.784322]  bus_for_each_dev+0x7c/0xdc
[    6.794520]  driver_attach+0x24/0x30
[    6.794523]  bus_add_driver+0xe4/0x208
[    6.794526]  driver_register+0x5c/0x124
[    6.802586]  __spi_register_driver+0xa4/0xe4
[    6.802589]  goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
[    6.802591]  do_one_initcall+0x80/0x1c8
[    6.902310]  do_init_module+0x60/0x218
[    6.921988]  load_module+0x1bcc/0x1d8c
[    6.925847]  init_module_from_file+0x88/0xcc
[    6.930238]  __arm64_sys_finit_module+0x1dc/0x2e4
[    6.935074]  invoke_syscall+0x48/0x114
[    6.938944]  el0_svc_common.constprop.0+0xc0/0xe0
[    6.943781]  do_el0_svc+0x1c/0x28
[    6.947195]  el0_svc+0x34/0xd8
[    6.950348]  el0t_64_sync_handler+0x120/0x12c
[    6.954833]  el0t_64_sync+0x190/0x194
[    6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c2

Reverting  8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") removes the crash.

> 
> Meanwhile, I have three changes I posted in the replies to the initial report,
> can you combine them all and test? This will be a plan B (? or A, depending on
> the culprit).
> 

I'll try to apply them and test.

Neil


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 11:53       ` Neil Armstrong
@ 2024-05-22 13:18         ` Neil Armstrong
  2024-05-22 14:24           ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Armstrong @ 2024-05-22 13:18 UTC (permalink / raw
  To: Andy Shevchenko
  Cc: Nícolas F. R. A. Prado, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

Hi,

On 22/05/2024 13:53, Neil Armstrong wrote:
> On 22/05/2024 13:33, Andy Shevchenko wrote:
>> On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
>>> On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
>>>> On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
>>>>> Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
>>>>> have checks for orig_nents against 0. No need to duplicate this.
>>>>> All the same applies to other DMA mapping API calls.
>>>>>
>>>>> Also note, there is no other user in the kernel that does this kind of
>>>>> checks.
>>>>>
>>>>> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>>>>
>>>> Hi,
>>>>
>>>> this commit caused a regression which I reported here:
>>>>
>>>> https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
>>>>
>>>> along with some thoughts on the cause and a possible solution, though I'm not
>>>> familiar with this code base at all and would really appreciate any feedback you
>>>> may have.
>>>
>>> I also see the same regression on the SM8550 and SM8650 platforms,
>>> please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.
>>
>> There is still no answer from IOMMU patch author. Do you have the same trace
>> due to IOMMU calls? Anyway, I guess it would be nice to see it.
> 
> Yes :
> [    6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> <snip>
> [    6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
> <snip>
> [    6.688286] Call trace:
> [    6.688287]  iommu_dma_sync_sg_for_device+0x28/0x100
> [    6.717582]  __dma_sync_sg_for_device+0x3c/0x40
> [    6.717585]  spi_transfer_one_message+0x358/0x680
> [    6.732229]  __spi_pump_transfer_message+0x188/0x494
> [    6.732232]  __spi_sync+0x2a8/0x3c4
> [    6.732234]  spi_sync+0x30/0x54
> [    6.732236]  goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
> [    6.739854]  _regmap_raw_write_impl+0x538/0x674
> [    6.750053]  _regmap_raw_write+0xb4/0x144
> [    6.750056]  regmap_raw_write+0x7c/0xc0
> [    6.750058]  goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
> [    6.765520]  goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
> [    6.765522]  goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
> [    6.772339]  spi_probe+0x84/0xe4
> [    6.772342]  really_probe+0xbc/0x29c
> [    6.784313]  __driver_probe_device+0x78/0x12c
> [    6.784316]  driver_probe_device+0x3c/0x15c
> [    6.784319]  __driver_attach+0x90/0x19c
> [    6.784322]  bus_for_each_dev+0x7c/0xdc
> [    6.794520]  driver_attach+0x24/0x30
> [    6.794523]  bus_add_driver+0xe4/0x208
> [    6.794526]  driver_register+0x5c/0x124
> [    6.802586]  __spi_register_driver+0xa4/0xe4
> [    6.802589]  goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
> [    6.802591]  do_one_initcall+0x80/0x1c8
> [    6.902310]  do_init_module+0x60/0x218
> [    6.921988]  load_module+0x1bcc/0x1d8c
> [    6.925847]  init_module_from_file+0x88/0xcc
> [    6.930238]  __arm64_sys_finit_module+0x1dc/0x2e4
> [    6.935074]  invoke_syscall+0x48/0x114
> [    6.938944]  el0_svc_common.constprop.0+0xc0/0xe0
> [    6.943781]  do_el0_svc+0x1c/0x28
> [    6.947195]  el0_svc+0x34/0xd8
> [    6.950348]  el0t_64_sync_handler+0x120/0x12c
> [    6.954833]  el0t_64_sync+0x190/0x194
> [    6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c2
> 
> Reverting  8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") removes the crash.
> 
>>
>> Meanwhile, I have three changes I posted in the replies to the initial report,
>> can you combine them all and test? This will be a plan B (? or A, depending on
>> the culprit).
>>
> 
> I'll try to apply them and test.

I stacked the 3 changes, and it works:
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD

For reference, the changeset looks like:
============><============================================================================================
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 289feccca376..0851c5e1fd1f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1220,6 +1220,11 @@ void spi_unmap_buf(struct spi_controller *ctlr, struct device *dev,
  	spi_unmap_buf_attrs(ctlr, dev, sgt, dir, 0);
  }

+/* Dummy SG for unidirect transfers */
+static struct scatterlist dummy_sg = {
+	.page_link = SG_END,
+};
+
  static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
  {
  	struct device *tx_dev, *rx_dev;
@@ -1243,6 +1248,7 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
  	else
  		rx_dev = ctlr->dev.parent;

+	ret = -ENOMSG;
  	list_for_each_entry(xfer, &msg->transfers, transfer_list) {
  		/* The sync is done before each transfer. */
  		unsigned long attrs = DMA_ATTR_SKIP_CPU_SYNC;
@@ -1257,6 +1263,9 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
  						attrs);
  			if (ret != 0)
  				return ret;
+		} else {
+			memset(&xfer->tx_sg, 0, sizeof(xfer->tx_sg));
+			xfer->tx_sg.sgl = &dummy_sg;
  		}

  		if (xfer->rx_buf != NULL) {
@@ -1270,8 +1279,14 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)

  				return ret;
  			}
+		} else {
+			memset(&xfer->rx_sg, 0, sizeof(xfer->rx_sg));
+			xfer->rx_sg.sgl = &dummy_sg;
  		}
  	}
+	/* No transfer has been mapped, bail out with success */
+	if (ret)
+		return 0;

  	ctlr->cur_rx_dma_dev = rx_dev;
  	ctlr->cur_tx_dma_dev = tx_dev;
============><============================================================================================

Thanks,
Neil
> 
> Neil
> 


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 13:18         ` Neil Armstrong
@ 2024-05-22 14:24           ` Andy Shevchenko
  2024-05-22 15:12             ` Nícolas F. R. A. Prado
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2024-05-22 14:24 UTC (permalink / raw
  To: Neil Armstrong
  Cc: Nícolas F. R. A. Prado, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

On Wed, May 22, 2024 at 03:18:18PM +0200, Neil Armstrong wrote:
> On 22/05/2024 13:53, Neil Armstrong wrote:
> > On 22/05/2024 13:33, Andy Shevchenko wrote:
> > > On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
> > > > On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
> > > > > On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
> > > > > > Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
> > > > > > have checks for orig_nents against 0. No need to duplicate this.
> > > > > > All the same applies to other DMA mapping API calls.
> > > > > > 
> > > > > > Also note, there is no other user in the kernel that does this kind of
> > > > > > checks.
> > > > > > 
> > > > > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > this commit caused a regression which I reported here:
> > > > > 
> > > > > https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
> > > > > 
> > > > > along with some thoughts on the cause and a possible solution, though I'm not
> > > > > familiar with this code base at all and would really appreciate any feedback you
> > > > > may have.
> > > > 
> > > > I also see the same regression on the SM8550 and SM8650 platforms,
> > > > please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.
> > > 
> > > There is still no answer from IOMMU patch author. Do you have the same trace
> > > due to IOMMU calls? Anyway, I guess it would be nice to see it.
> > 
> > Yes :
> > [    6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> > <snip>
> > [    6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
> > <snip>
> > [    6.688286] Call trace:
> > [    6.688287]  iommu_dma_sync_sg_for_device+0x28/0x100
> > [    6.717582]  __dma_sync_sg_for_device+0x3c/0x40
> > [    6.717585]  spi_transfer_one_message+0x358/0x680
> > [    6.732229]  __spi_pump_transfer_message+0x188/0x494
> > [    6.732232]  __spi_sync+0x2a8/0x3c4
> > [    6.732234]  spi_sync+0x30/0x54
> > [    6.732236]  goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
> > [    6.739854]  _regmap_raw_write_impl+0x538/0x674
> > [    6.750053]  _regmap_raw_write+0xb4/0x144
> > [    6.750056]  regmap_raw_write+0x7c/0xc0
> > [    6.750058]  goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
> > [    6.765520]  goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
> > [    6.765522]  goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
> > [    6.772339]  spi_probe+0x84/0xe4
> > [    6.772342]  really_probe+0xbc/0x29c
> > [    6.784313]  __driver_probe_device+0x78/0x12c
> > [    6.784316]  driver_probe_device+0x3c/0x15c
> > [    6.784319]  __driver_attach+0x90/0x19c
> > [    6.784322]  bus_for_each_dev+0x7c/0xdc
> > [    6.794520]  driver_attach+0x24/0x30
> > [    6.794523]  bus_add_driver+0xe4/0x208
> > [    6.794526]  driver_register+0x5c/0x124
> > [    6.802586]  __spi_register_driver+0xa4/0xe4
> > [    6.802589]  goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
> > [    6.802591]  do_one_initcall+0x80/0x1c8
> > [    6.902310]  do_init_module+0x60/0x218
> > [    6.921988]  load_module+0x1bcc/0x1d8c
> > [    6.925847]  init_module_from_file+0x88/0xcc
> > [    6.930238]  __arm64_sys_finit_module+0x1dc/0x2e4
> > [    6.935074]  invoke_syscall+0x48/0x114
> > [    6.938944]  el0_svc_common.constprop.0+0xc0/0xe0
> > [    6.943781]  do_el0_svc+0x1c/0x28
> > [    6.947195]  el0_svc+0x34/0xd8
> > [    6.950348]  el0t_64_sync_handler+0x120/0x12c
> > [    6.954833]  el0t_64_sync+0x190/0x194
> > [    6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c2
> > 
> > Reverting  8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") removes the crash.
> > 
> > > 
> > > Meanwhile, I have three changes I posted in the replies to the initial report,
> > > can you combine them all and test? This will be a plan B (? or A, depending on
> > > the culprit).
> > > 
> > 
> > I'll try to apply them and test.
> 
> I stacked the 3 changes, and it works:
> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD

Thank you!

I will try to develop and submit a fix against IOMMU which I believe is the
correct place to address this. So, this one will be plan B in case the IOMMU
folks will refuse the other one.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 14:24           ` Andy Shevchenko
@ 2024-05-22 15:12             ` Nícolas F. R. A. Prado
  2024-05-22 15:24               ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-05-22 15:12 UTC (permalink / raw
  To: Andy Shevchenko
  Cc: Neil Armstrong, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

On Wed, May 22, 2024 at 05:24:40PM +0300, Andy Shevchenko wrote:
> On Wed, May 22, 2024 at 03:18:18PM +0200, Neil Armstrong wrote:
> > On 22/05/2024 13:53, Neil Armstrong wrote:
> > > On 22/05/2024 13:33, Andy Shevchenko wrote:
> > > > On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
> > > > > On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
> > > > > > On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
> > > > > > > Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
> > > > > > > have checks for orig_nents against 0. No need to duplicate this.
> > > > > > > All the same applies to other DMA mapping API calls.
> > > > > > > 
> > > > > > > Also note, there is no other user in the kernel that does this kind of
> > > > > > > checks.
> > > > > > > 
> > > > > > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > this commit caused a regression which I reported here:
> > > > > > 
> > > > > > https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
> > > > > > 
> > > > > > along with some thoughts on the cause and a possible solution, though I'm not
> > > > > > familiar with this code base at all and would really appreciate any feedback you
> > > > > > may have.
> > > > > 
> > > > > I also see the same regression on the SM8550 and SM8650 platforms,
> > > > > please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.
> > > > 
> > > > There is still no answer from IOMMU patch author. Do you have the same trace
> > > > due to IOMMU calls? Anyway, I guess it would be nice to see it.
[..]
> > > > 
> > > > Meanwhile, I have three changes I posted in the replies to the initial report,
> > > > can you combine them all and test? This will be a plan B (? or A, depending on
> > > > the culprit).
> > > > 
> > > 
> > > I'll try to apply them and test.
> > 
> > I stacked the 3 changes, and it works:
> > Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD
> 
> Thank you!
> 
> I will try to develop and submit a fix against IOMMU which I believe is the
> correct place to address this. So, this one will be plan B in case the IOMMU
> folks will refuse the other one.

Hi,

that change did not work for me. Stack trace follows at the end.

But adding the following on top did fix it:

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 0851c5e1fd1f..5d3972d9d1da 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1253,8 +1253,13 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
                /* The sync is done before each transfer. */
                unsigned long attrs = DMA_ATTR_SKIP_CPU_SYNC;

-               if (!ctlr->can_dma(ctlr, msg->spi, xfer))
+               if (!ctlr->can_dma(ctlr, msg->spi, xfer)) {
+                       memset(&xfer->tx_sg, 0, sizeof(xfer->tx_sg));
+                       xfer->tx_sg.sgl = &dummy_sg;
+                       memset(&xfer->rx_sg, 0, sizeof(xfer->rx_sg));
+                       xfer->rx_sg.sgl = &dummy_sg;
                        continue;
+               }

                if (xfer->tx_buf != NULL) {
                        ret = spi_map_buf_attrs(ctlr, tx_dev, &xfer->tx_sg,


The thing is that even with all the previous changes applied, if one of the
transfers inside the message doesn't support DMA, the null pointer would still
be passed to the DMA API.

Alternatively, I think a better way to achieve the same is (I have verified this
also works):

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 0851c5e1fd1f..3f2ee70d080a 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1322,7 +1322,7 @@ static int __spi_unmap_msg(struct spi_controller *ctlr, struct spi_message *msg)
 	return 0;
 }

-static void spi_dma_sync_for_device(struct spi_controller *ctlr,
+static void spi_dma_sync_for_device(struct spi_controller *ctlr, struct spi_message *msg,
 				    struct spi_transfer *xfer)
 {
 	struct device *rx_dev = ctlr->cur_rx_dma_dev;
@@ -1331,11 +1331,14 @@ static void spi_dma_sync_for_device(struct spi_controller *ctlr,
 	if (!ctlr->cur_msg_mapped)
 		return;

+	if (!ctlr->can_dma(ctlr, msg->spi, xfer))
+		return;
+
 	dma_sync_sgtable_for_device(tx_dev, &xfer->tx_sg, DMA_TO_DEVICE);
 	dma_sync_sgtable_for_device(rx_dev, &xfer->rx_sg, DMA_FROM_DEVICE);
 }

-static void spi_dma_sync_for_cpu(struct spi_controller *ctlr,
+static void spi_dma_sync_for_cpu(struct spi_controller *ctlr, struct spi_message *msg,
 				 struct spi_transfer *xfer)
 {
 	struct device *rx_dev = ctlr->cur_rx_dma_dev;
@@ -1344,6 +1347,9 @@ static void spi_dma_sync_for_cpu(struct spi_controller *ctlr,
 	if (!ctlr->cur_msg_mapped)
 		return;

+	if (!ctlr->can_dma(ctlr, msg->spi, xfer))
+		return;
+
 	dma_sync_sgtable_for_cpu(rx_dev, &xfer->rx_sg, DMA_FROM_DEVICE);
 	dma_sync_sgtable_for_cpu(tx_dev, &xfer->tx_sg, DMA_TO_DEVICE);
 }
@@ -1637,10 +1643,10 @@ static int spi_transfer_one_message(struct spi_controller *ctlr,
 			reinit_completion(&ctlr->xfer_completion);

 fallback_pio:
-			spi_dma_sync_for_device(ctlr, xfer);
+			spi_dma_sync_for_device(ctlr, msg, xfer);
 			ret = ctlr->transfer_one(ctlr, msg->spi, xfer);
 			if (ret < 0) {
-				spi_dma_sync_for_cpu(ctlr, xfer);
+				spi_dma_sync_for_cpu(ctlr, msg, xfer);

 				if (ctlr->cur_msg_mapped &&
 				   (xfer->error & SPI_TRANS_FAIL_NO_START)) {
@@ -1665,7 +1671,7 @@ static int spi_transfer_one_message(struct spi_controller *ctlr,
 					msg->status = ret;
 			}

-			spi_dma_sync_for_cpu(ctlr, xfer);
+			spi_dma_sync_for_cpu(ctlr, msg, xfer);
 		} else {
 			if (xfer->len)
 				dev_err(&msg->spi->dev,


I'll let you decide what is the best fix for the issue (what has been posted so
far in this thread or another fix in IOMMU). If you go with this, feel free to
add my

Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

Thanks,
Nícolas

Stack trace:

[    3.086173] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
[    3.103002] Mem abort info:
[    3.105885]   ESR = 0x0000000096000004
[    3.109738]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.115198]   SET = 0, FnV = 0
[    3.118342]   EA = 0, S1PTW = 0
[    3.121570]   FSC = 0x04: level 0 translation fault
[    3.126584] Data abort info:
[    3.129545]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    3.135188]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    3.140383]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    3.145837] [000000000000001c] user address but active_mm is swapper
[    3.152369] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[    3.156423] input: cros_ec as /devices/platform/soc@0/ac0000.geniqup/a80000.spi/spi_master/spi6/spi6.0/a80000.spi:ec@0:keyboard-controller/input/input0
[    3.158806] Modules linked in:
[    3.158812] CPU: 6 PID: 68 Comm: kworker/u32:2 Not tainted 6.9.0-next-20240515-00005-gad5f430e51d2 #424
[    3.158820] Hardware name: Google Kingoftown (DT)
[    3.158823] Workqueue: events_unbound deferred_probe_work_func
[    3.175091] input: cros_ec_buttons as /devices/platform/soc@0/ac0000.geniqup/a80000.spi/spi_master/spi6/spi6.0/a80000.spi:ec@0:keyboard-controller/input/input1
[    3.175859]
[    3.175861] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    3.186858] cros-ec-spi spi6.0: Chrome EC device registered
[    3.190317] pc : iommu_dma_sync_sg_for_device+0x28/0x100
[    3.190327] lr : __dma_sync_sg_for_device+0x28/0x4c
[    3.211853] tpm_tis_spi spi0.0: Cr50 firmware version: B2-C:0 RO_A:0.0.11/bc74f7dc RW_A:0.6.70/cr50_v2.0.3361-b70f24b1b
[    3.212461] sp : ffff800080942dc0
[    3.212464] x29: ffff800080942dc0 x28: ffff7eb103be2000 x27: ffff7eb101012010
[    3.257562] x26: ffff800080943008 x25: ffff7eb103be2480 x24: ffffb819fada5180
[    3.264887] x23: ffff7eb101012010 x22: 0000000000000001 x21: 0000000000000000
[    3.272213] x20: ffffb819fb10c718 x19: 0000000000000000 x18: ffffb819fbde8988
[    3.279539] x17: 0000000000010108 x16: 0000000000000000 x15: 0000000000000002
[    3.286863] x14: 0000000000000001 x13: 000000000016356d x12: 0000000000000001
[    3.294188] x11: ffff800080942cd0 x10: ffff7eb103c74ff8 x9 : ffff7eb103be2469
[    3.301515] x8 : ffff7eb10101cf04 x7 : 00000000ffffffff x6 : 0000000000000001
[    3.308849] x5 : ffffb819fa51a780 x4 : ffffb819f9704acc x3 : 0000000000000001
[    3.316182] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff7eb101012010
[    3.323509] Call trace:
[    3.326023]  iommu_dma_sync_sg_for_device+0x28/0x100
[    3.331125]  __dma_sync_sg_for_device+0x28/0x4c
[    3.335790]  spi_transfer_one_message+0x378/0x6e4
[    3.340634]  __spi_pump_transfer_message+0x1e4/0x504
[    3.345737]  __spi_sync+0x2a0/0x3c4
[    3.349334]  spi_sync+0x30/0x54
[    3.352568]  spi_mem_exec_op+0x26c/0x41c
[    3.356610]  spi_nor_spimem_read_data+0x148/0x158
[    3.361454]  spi_nor_read_data+0x30/0x3c
[    3.365494]  spi_nor_read_sfdp+0x74/0xe4
[    3.369532]  spi_nor_parse_sfdp+0x120/0x11d0
[    3.373921]  spi_nor_sfdp_init_params_deprecated+0x3c/0x8c
[    3.379562]  spi_nor_scan+0x7ac/0xef8
[    3.383336]  spi_nor_probe+0x94/0x2ec
[    3.387109]  spi_mem_probe+0x6c/0xac
[    3.390796]  spi_probe+0x84/0xe4
[    3.394119]  really_probe+0xbc/0x2a0
[    3.397793]  __driver_probe_device+0x78/0x12c
[    3.402270]  driver_probe_device+0x40/0x160
[    3.406569]  __device_attach_driver+0xb8/0x134
[    3.411144]  bus_for_each_drv+0x84/0xe0
[    3.415091]  __device_attach+0xa8/0x1b0
[    3.419040]  device_initial_probe+0x14/0x20
[    3.423340]  bus_probe_device+0xa8/0xac
[    3.427288]  device_add+0x590/0x750
[    3.430872]  __spi_add_device+0x138/0x208
[    3.434999]  of_register_spi_device+0x394/0x57c
[    3.439656]  spi_register_controller+0x394/0x760
[    3.444399]  qcom_qspi_probe+0x328/0x390
[    3.448442]  platform_probe+0x68/0xd8
[    3.452216]  really_probe+0xbc/0x2a0
[    3.455898]  __driver_probe_device+0x78/0x12c
[    3.460382]  driver_probe_device+0x40/0x160
[    3.464682]  __device_attach_driver+0xb8/0x134
[    3.469246]  bus_for_each_drv+0x84/0xe0
[    3.473193]  __device_attach+0xa8/0x1b0
[    3.477137]  device_initial_probe+0x14/0x20
[    3.481435]  bus_probe_device+0xa8/0xac
[    3.485383]  deferred_probe_work_func+0x88/0xc0
[    3.490044]  process_one_work+0x154/0x298
[    3.494172]  worker_thread+0x304/0x408
[    3.498035]  kthread+0x118/0x11c
[    3.501358]  ret_from_fork+0x10/0x20
[    3.505046] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
[    3.511301] ---[ end trace 0000000000000000 ]---

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents
  2024-05-22 15:12             ` Nícolas F. R. A. Prado
@ 2024-05-22 15:24               ` Andy Shevchenko
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Shevchenko @ 2024-05-22 15:24 UTC (permalink / raw
  To: Nícolas F. R. A. Prado
  Cc: Neil Armstrong, Mark Brown, linux-spi, linux-kernel,
	linux-arm-msm

On Wed, May 22, 2024 at 11:12:43AM -0400, Nícolas F. R. A. Prado wrote:
> On Wed, May 22, 2024 at 05:24:40PM +0300, Andy Shevchenko wrote:
> > On Wed, May 22, 2024 at 03:18:18PM +0200, Neil Armstrong wrote:
> > > On 22/05/2024 13:53, Neil Armstrong wrote:
> > > > On 22/05/2024 13:33, Andy Shevchenko wrote:
> > > > > On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
> > > > > > On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
> > > > > > > On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
> > > > > > > > Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
> > > > > > > > have checks for orig_nents against 0. No need to duplicate this.
> > > > > > > > All the same applies to other DMA mapping API calls.
> > > > > > > > 
> > > > > > > > Also note, there is no other user in the kernel that does this kind of
> > > > > > > > checks.
> > > > > > > > 
> > > > > > > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > this commit caused a regression which I reported here:
> > > > > > > 
> > > > > > > https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano
> > > > > > > 
> > > > > > > along with some thoughts on the cause and a possible solution, though I'm not
> > > > > > > familiar with this code base at all and would really appreciate any feedback you
> > > > > > > may have.
> > > > > > 
> > > > > > I also see the same regression on the SM8550 and SM8650 platforms,
> > > > > > please CC linux-arm-msm@vger.kernel.org and me for a potential fix to test on those platforms.
> > > > > 
> > > > > There is still no answer from IOMMU patch author. Do you have the same trace
> > > > > due to IOMMU calls? Anyway, I guess it would be nice to see it.
> [..]
> > > > > 
> > > > > Meanwhile, I have three changes I posted in the replies to the initial report,
> > > > > can you combine them all and test? This will be a plan B (? or A, depending on
> > > > > the culprit).
> > > > > 
> > > > 
> > > > I'll try to apply them and test.
> > > 
> > > I stacked the 3 changes, and it works:
> > > Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD
> > 
> > Thank you!
> > 
> > I will try to develop and submit a fix against IOMMU which I believe is the
> > correct place to address this. So, this one will be plan B in case the IOMMU
> > folks will refuse the other one.
> 
> Hi,
> 
> that change did not work for me. Stack trace follows at the end.
> 
> But adding the following on top did fix it:
> 
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 0851c5e1fd1f..5d3972d9d1da 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -1253,8 +1253,13 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
>                 /* The sync is done before each transfer. */
>                 unsigned long attrs = DMA_ATTR_SKIP_CPU_SYNC;
> 
> -               if (!ctlr->can_dma(ctlr, msg->spi, xfer))
> +               if (!ctlr->can_dma(ctlr, msg->spi, xfer)) {
> +                       memset(&xfer->tx_sg, 0, sizeof(xfer->tx_sg));
> +                       xfer->tx_sg.sgl = &dummy_sg;
> +                       memset(&xfer->rx_sg, 0, sizeof(xfer->rx_sg));
> +                       xfer->rx_sg.sgl = &dummy_sg;
>                         continue;
> +               }
> 
>                 if (xfer->tx_buf != NULL) {
>                         ret = spi_map_buf_attrs(ctlr, tx_dev, &xfer->tx_sg,
> 
> 
> The thing is that even with all the previous changes applied, if one of the
> transfers inside the message doesn't support DMA, the null pointer would still
> be passed to the DMA API.

Ah, indeed, the conditionals also can be translated to

	can_dma = can_dma();

	if (can_dma && _buf)
		...
	else
		...

but...


> Alternatively, I think a better way to achieve the same is (I have verified this
> also works):

I agree that this one is much better approach. I will clean it up and send
as a quick fix. IOMMU will still be on the table as I think it's wrong to
dereference SG when nents = 0.

> I'll let you decide what is the best fix for the issue (what has been posted so
> far in this thread or another fix in IOMMU). If you go with this, feel free to
> add my
> 
> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

Thank you!

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-22 15:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20240507201028.564630-1-andriy.shevchenko@linux.intel.com>
     [not found] ` <d8930bce-6db6-45f4-8f09-8a00fa48e607@notapiano>
2024-05-22 10:03   ` [PATCH v1 1/1] spi: Remove unneded check for orig_nents Neil Armstrong
2024-05-22 11:33     ` Andy Shevchenko
2024-05-22 11:53       ` Neil Armstrong
2024-05-22 13:18         ` Neil Armstrong
2024-05-22 14:24           ` Andy Shevchenko
2024-05-22 15:12             ` Nícolas F. R. A. Prado
2024-05-22 15:24               ` Andy Shevchenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).