Linux-ide Archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Power management fixes
@ 2024-01-11 11:51 Damien Le Moal
  2024-01-11 11:51 ` [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby Damien Le Moal
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Damien Le Moal @ 2024-01-11 11:51 UTC (permalink / raw
  To: linux-ide, Niklas Cassel
  Cc: Dieter Mummenschanz, Wang Zhihao, linux-regressions

The first patch is a fairly obvious fix. The second one needs testing.

Dieter, Zhihao,

Can you please try these patches to see if they resolve your issues
with system low power state transitions ?

Damien Le Moal (2):
  ata: libata-core: Do not try to set sleeping devices to standby
  ata: libata-core: Revert "ata: libata-core: Fix
    ata_pci_shutdown_one()"

 drivers/ata/libata-core.c | 75 ++++++++++++++++++++++++---------------
 1 file changed, 47 insertions(+), 28 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby
  2024-01-11 11:51 [PATCH 0/2] Power management fixes Damien Le Moal
@ 2024-01-11 11:51 ` Damien Le Moal
  2024-02-14 11:03   ` Niklas Cassel
  2024-01-11 11:51 ` [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()" Damien Le Moal
       [not found] ` <DU0P251MB082515FC8FE77424231B475CF4682@DU0P251MB0825.EURP251.PROD.OUTLOOK.COM>
  2 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-11 11:51 UTC (permalink / raw
  To: linux-ide, Niklas Cassel
  Cc: Dieter Mummenschanz, Wang Zhihao, linux-regressions

In ata ata_dev_power_set_standby(), check that the target device is not
sleeping. If it is, there is no need to do anything.

Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/ata/libata-core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 09ed67772fae..d9f80f4f70f5 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2017,6 +2017,10 @@ void ata_dev_power_set_standby(struct ata_device *dev)
 	struct ata_taskfile tf;
 	unsigned int err_mask;
 
+	/* If the device is already sleeping, do nothing. */
+	if (dev->flags & ATA_DFLAG_SLEEPING)
+		return;
+
 	/*
 	 * Some odd clown BIOSes issue spindown on power off (ACPI S4 or S5)
 	 * causing some drives to spin up and down again. For these, do nothing
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-01-11 11:51 [PATCH 0/2] Power management fixes Damien Le Moal
  2024-01-11 11:51 ` [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby Damien Le Moal
@ 2024-01-11 11:51 ` Damien Le Moal
  2024-01-11 18:10   ` Sergei Shtylyov
       [not found] ` <DU0P251MB082515FC8FE77424231B475CF4682@DU0P251MB0825.EURP251.PROD.OUTLOOK.COM>
  2 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-11 11:51 UTC (permalink / raw
  To: linux-ide, Niklas Cassel
  Cc: Dieter Mummenschanz, Wang Zhihao, linux-regressions

This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.

Several users have signaled issues with commit fd3a6837d8e1 ("ata:
libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
system SoC to go to a low power state. The reason for this problem
is not well understood but given that this patch is harmless with the
improvements to ata_dev_power_set_standby(), restore it to allow system
lower power state transitions.

For regular system shutdown, ata_dev_power_set_standby() will be
executed twice: once the scsi device is removed and another when
ata_pci_shutdown_one() executes and EH completes unloading the devices.
Make the second call to ata_dev_power_set_standby() do nothing by using
ata_dev_power_is_active() and return if the device is already in
standby.

Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 30 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index d9f80f4f70f5..20a366942626 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
 	return true;
 }
 
+static bool ata_dev_power_is_active(struct ata_device *dev)
+{
+	struct ata_taskfile tf;
+	unsigned int err_mask;
+
+	ata_tf_init(dev, &tf);
+	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
+	tf.protocol = ATA_PROT_NODATA;
+	tf.command = ATA_CMD_CHK_POWER;
+
+	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
+	if (err_mask) {
+		ata_dev_err(dev, "Check power mode failed (err_mask=0x%x)\n",
+			    err_mask);
+		/*
+		 * Assume we are in standby mode so that we always force a
+		 * spinup in ata_dev_power_set_active().
+		 */
+		return false;
+	}
+
+	ata_dev_dbg(dev, "Power mode: 0x%02x\n", tf.nsect);
+
+	/* Active or idle */
+	return tf.nsect == 0xff;
+}
+
 /**
  *	ata_dev_power_set_standby - Set a device power mode to standby
  *	@dev: target device
@@ -2017,8 +2044,9 @@ void ata_dev_power_set_standby(struct ata_device *dev)
 	struct ata_taskfile tf;
 	unsigned int err_mask;
 
-	/* If the device is already sleeping, do nothing. */
-	if (dev->flags & ATA_DFLAG_SLEEPING)
+	/* If the device is already sleeping or in standby, do nothing. */
+	if ((dev->flags & ATA_DFLAG_SLEEPING) ||
+	    !ata_dev_power_is_active(dev))
 		return;
 
 	/*
@@ -2046,33 +2074,6 @@ void ata_dev_power_set_standby(struct ata_device *dev)
 			    err_mask);
 }
 
-static bool ata_dev_power_is_active(struct ata_device *dev)
-{
-	struct ata_taskfile tf;
-	unsigned int err_mask;
-
-	ata_tf_init(dev, &tf);
-	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
-	tf.protocol = ATA_PROT_NODATA;
-	tf.command = ATA_CMD_CHK_POWER;
-
-	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
-	if (err_mask) {
-		ata_dev_err(dev, "Check power mode failed (err_mask=0x%x)\n",
-			    err_mask);
-		/*
-		 * Assume we are in standby mode so that we always force a
-		 * spinup in ata_dev_power_set_active().
-		 */
-		return false;
-	}
-
-	ata_dev_dbg(dev, "Power mode: 0x%02x\n", tf.nsect);
-
-	/* Active or idle */
-	return tf.nsect == 0xff;
-}
-
 /**
  *	ata_dev_power_set_active -  Set a device power mode to active
  *	@dev: target device
@@ -6184,10 +6185,24 @@ EXPORT_SYMBOL_GPL(ata_pci_remove_one);
 void ata_pci_shutdown_one(struct pci_dev *pdev)
 {
 	struct ata_host *host = pci_get_drvdata(pdev);
+	struct ata_port *ap;
+	unsigned long flags;
 	int i;
 
+	/* Tell EH to disable all devices */
 	for (i = 0; i < host->n_ports; i++) {
-		struct ata_port *ap = host->ports[i];
+		ap = host->ports[i];
+		spin_lock_irqsave(ap->lock, flags);
+		ap->pflags |= ATA_PFLAG_UNLOADING;
+		ata_port_schedule_eh(ap);
+		spin_unlock_irqrestore(ap->lock, flags);
+	}
+
+	for (i = 0; i < host->n_ports; i++) {
+		ap = host->ports[i];
+
+		/* Wait for EH to complete before freezing the port */
+		ata_port_wait_eh(ap);
 
 		ap->pflags |= ATA_PFLAG_FROZEN;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-01-11 11:51 ` [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()" Damien Le Moal
@ 2024-01-11 18:10   ` Sergei Shtylyov
  2024-01-11 23:13     ` Damien Le Moal
  2024-02-19 15:29     ` Niklas Cassel
  0 siblings, 2 replies; 23+ messages in thread
From: Sergei Shtylyov @ 2024-01-11 18:10 UTC (permalink / raw
  To: Damien Le Moal, linux-ide, Niklas Cassel
  Cc: Dieter Mummenschanz, Wang Zhihao, linux-regressions

On 1/11/24 2:51 PM, Damien Le Moal wrote:

> This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.
> 
> Several users have signaled issues with commit fd3a6837d8e1 ("ata:
> libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
> system SoC to go to a low power state. The reason for this problem
> is not well understood but given that this patch is harmless with the
> improvements to ata_dev_power_set_standby(), restore it to allow system
> lower power state transitions.
> 
> For regular system shutdown, ata_dev_power_set_standby() will be
> executed twice: once the scsi device is removed and another when
> ata_pci_shutdown_one() executes and EH completes unloading the devices.
> Make the second call to ata_dev_power_set_standby() do nothing by using
> ata_dev_power_is_active() and return if the device is already in
> standby.
> 
> Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
>  drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
>  1 file changed, 45 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index d9f80f4f70f5..20a366942626 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
>  	return true;
>  }
>  
> +static bool ata_dev_power_is_active(struct ata_device *dev)
> +{
> +	struct ata_taskfile tf;
> +	unsigned int err_mask;
> +
> +	ata_tf_init(dev, &tf);
> +	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;

   Why set ATA_TFLAG_ISADDR, BTW? This command doesn't use any taskfile
regs but the device/head reg. Material for a fix, I guess... :-)

> +	tf.protocol = ATA_PROT_NODATA;
> +	tf.command = ATA_CMD_CHK_POWER;
> +
[...]

MBR, Sergey

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-01-11 18:10   ` Sergei Shtylyov
@ 2024-01-11 23:13     ` Damien Le Moal
  2024-02-19 15:29     ` Niklas Cassel
  1 sibling, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2024-01-11 23:13 UTC (permalink / raw
  To: Sergei Shtylyov, linux-ide, Niklas Cassel
  Cc: Dieter Mummenschanz, Wang Zhihao

On 1/12/24 03:10, Sergei Shtylyov wrote:
> On 1/11/24 2:51 PM, Damien Le Moal wrote:
> 
>> This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.
>>
>> Several users have signaled issues with commit fd3a6837d8e1 ("ata:
>> libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
>> system SoC to go to a low power state. The reason for this problem
>> is not well understood but given that this patch is harmless with the
>> improvements to ata_dev_power_set_standby(), restore it to allow system
>> lower power state transitions.
>>
>> For regular system shutdown, ata_dev_power_set_standby() will be
>> executed twice: once the scsi device is removed and another when
>> ata_pci_shutdown_one() executes and EH completes unloading the devices.
>> Make the second call to ata_dev_power_set_standby() do nothing by using
>> ata_dev_power_is_active() and return if the device is already in
>> standby.
>>
>> Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> ---
>>  drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
>>  1 file changed, 45 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>> index d9f80f4f70f5..20a366942626 100644
>> --- a/drivers/ata/libata-core.c
>> +++ b/drivers/ata/libata-core.c
>> @@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
>>  	return true;
>>  }
>>  
>> +static bool ata_dev_power_is_active(struct ata_device *dev)
>> +{
>> +	struct ata_taskfile tf;
>> +	unsigned int err_mask;
>> +
>> +	ata_tf_init(dev, &tf);
>> +	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
> 
>    Why set ATA_TFLAG_ISADDR, BTW? This command doesn't use any taskfile
> regs but the device/head reg. Material for a fix, I guess... :-)

Good point. I will look into it. This code was moved from libata-scsi
translation, so it has been like this for a very long time...

> 
>> +	tf.protocol = ATA_PROT_NODATA;
>> +	tf.command = ATA_CMD_CHK_POWER;
>> +
> [...]
> 
> MBR, Sergey

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] Power management fixes
       [not found] ` <DU0P251MB082515FC8FE77424231B475CF4682@DU0P251MB0825.EURP251.PROD.OUTLOOK.COM>
@ 2024-01-22  8:49   ` Damien Le Moal
       [not found]     ` <trinity-0be6e8a8-e6d3-4d60-be0d-59592a9edd65-1706010022623@3c-app-webde-bap10>
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-22  8:49 UTC (permalink / raw
  To: 王 志豪, linux-ide@vger.kernel.org,
	Niklas Cassel
  Cc: Dieter Mummenschanz, linux-regressions

On 1/11/24 21:09, 王 志豪 wrote:
> thanks a lot!I will take a try and reply you soon

Did you try the patches ?

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Aw: Re: [PATCH 0/2] Power management fixes
       [not found]     ` <trinity-0be6e8a8-e6d3-4d60-be0d-59592a9edd65-1706010022623@3c-app-webde-bap10>
@ 2024-01-23 11:52       ` Damien Le Moal
       [not found]         ` <trinity-0df92d73-be55-433c-bdb2-4387f7ea590b-1706686178879@3c-app-webde-bap43>
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-23 11:52 UTC (permalink / raw
  To: Dieter Mummenschanz, linux-ide

On 1/23/24 20:40, Dieter Mummenschanz wrote:
> Damien,
> sorry for getting back to you so late. So is this patch series just a revert or 
> is it something new? Can I patch and test against 6.8-rc or should I use 6.7? 
> Anyway I need at least a couple of days since I'm very busy ATM.

Yes, the second patch is essentially a revert. If you can test with 6.8-rc1 it
would be great.

Thanks.

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Aw: Re: Re: [PATCH 0/2] Power management fixes
       [not found]         ` <trinity-0df92d73-be55-433c-bdb2-4387f7ea590b-1706686178879@3c-app-webde-bap43>
@ 2024-01-31  7:38           ` Damien Le Moal
  2024-01-31 11:49             ` Niklas Cassel
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-31  7:38 UTC (permalink / raw
  To: Dieter Mummenschanz, linux-ide, Niklas Cassel

On 1/31/24 16:29, Dieter Mummenschanz wrote:
> Damien,
> so I've applied the patch to 6.8-rc2. Interesting thing is that the behaviour is 
> exactly the same as before (w/o the patch). Besides not 
> honoring CONFIG_SATA_MOBILE_LPM_POLICY=3 after boot my system refuses to 
> transition into lower power states > pc2 after resume even after letting it sit 
> idle for 10 minutes. Transition is only reached after issuing hdparm -Y. So if 
> the patch restores the original behaviour then why did it stop working?!

Interesting... hdparm -Y puts the drive to sleep while the revert puts the drive
in standby state. Sleep is a near complete shutdown of the drive, while standby
is not.

In any case, I think something else is causing this. Probably PCI or ACPI
related changes. So a bisect would be needed to understand this. But this is
going to be very painful to do for this as each test take a while I guess ? How
long do you need to wait to see the system going into low power state (when it
is working that is) ?

Also, which kernel version is the last one you know is OK ?

> dmesg:
> https://pastes.io/1vmmvhvfub
> Regards
> Dieter
> *Gesendet:* Dienstag, 23. Januar 2024 um 12:52 Uhr
> *Von:* "Damien Le Moal" <dlemoal@kernel.org>
> *An:* "Dieter Mummenschanz" <dmummenschanz@web.de>, linux-ide@vger.kernel.org
> *Betreff:* Re: Aw: Re: [PATCH 0/2] Power management fixes
> On 1/23/24 20:40, Dieter Mummenschanz wrote:
>  > Damien,
>  > sorry for getting back to you so late. So is this patch series just a revert or
>  > is it something new? Can I patch and test against 6.8-rc or should I use 6.7?
>  > Anyway I need at least a couple of days since I'm very busy ATM.
> 
> Yes, the second patch is essentially a revert. If you can test with 6.8-rc1 it
> would be great.
> 
> Thanks.
> 
> --
> Damien Le Moal
> Western Digital Research

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Aw: Re: Re: [PATCH 0/2] Power management fixes
  2024-01-31  7:38           ` Aw: " Damien Le Moal
@ 2024-01-31 11:49             ` Niklas Cassel
  2024-01-31 12:09               ` Damien Le Moal
  2024-02-01  7:10               ` Dieter Mummenschanz
  0 siblings, 2 replies; 23+ messages in thread
From: Niklas Cassel @ 2024-01-31 11:49 UTC (permalink / raw
  To: Damien Le Moal; +Cc: Dieter Mummenschanz, linux-ide

On Wed, Jan 31, 2024 at 04:38:28PM +0900, Damien Le Moal wrote:
> On 1/31/24 16:29, Dieter Mummenschanz wrote:
> > Damien,
> > so I've applied the patch to 6.8-rc2. Interesting thing is that the behaviour is 
> > exactly the same as before (w/o the patch). Besides not 
> > honoring CONFIG_SATA_MOBILE_LPM_POLICY=3 after boot my system refuses to 
> > transition into lower power states > pc2 after resume even after letting it sit 
> > idle for 10 minutes. Transition is only reached after issuing hdparm -Y. So if 
> > the patch restores the original behaviour then why did it stop working?!

Hello Dieter,

Just to confirm,
while testing Damien's patch on top of 6.8-rc2,
did you still do:

for foo in /sys/class/scsi_host/host*/link_power_management_policy;
  do echo med_power_with_dipm > $foo;
done

It should be needed until we add (or modify the existing entry, if any)
your PCI vendor and device id to use "board_ahci_low_power" instead of
"board_ahci", see e.g.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/ata/ahci.c?id=b8b8b4e0c052b2c06e1c4820a8001f4e0f77900f


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Aw: Re: Re: [PATCH 0/2] Power management fixes
  2024-01-31 11:49             ` Niklas Cassel
@ 2024-01-31 12:09               ` Damien Le Moal
  2024-02-01  7:12                 ` Aw: " Dieter Mummenschanz
  2024-02-01  7:10               ` Dieter Mummenschanz
  1 sibling, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2024-01-31 12:09 UTC (permalink / raw
  To: Niklas Cassel; +Cc: Dieter Mummenschanz, linux-ide

On 1/31/24 20:49, Niklas Cassel wrote:
> On Wed, Jan 31, 2024 at 04:38:28PM +0900, Damien Le Moal wrote:
>> On 1/31/24 16:29, Dieter Mummenschanz wrote:
>>> Damien,
>>> so I've applied the patch to 6.8-rc2. Interesting thing is that the behaviour is 
>>> exactly the same as before (w/o the patch). Besides not 
>>> honoring CONFIG_SATA_MOBILE_LPM_POLICY=3 after boot my system refuses to 
>>> transition into lower power states > pc2 after resume even after letting it sit 
>>> idle for 10 minutes. Transition is only reached after issuing hdparm -Y. So if 
>>> the patch restores the original behaviour then why did it stop working?!
> 
> Hello Dieter,
> 
> Just to confirm,
> while testing Damien's patch on top of 6.8-rc2,
> did you still do:
> 
> for foo in /sys/class/scsi_host/host*/link_power_management_policy;
>   do echo med_power_with_dipm > $foo;
> done
> 
> It should be needed until we add (or modify the existing entry, if any)
> your PCI vendor and device id to use "board_ahci_low_power" instead of
> "board_ahci", see e.g.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/ata/ahci.c?id=b8b8b4e0c052b2c06e1c4820a8001f4e0f77900f

Indeed. What is the AHCI adapter here ?
It would be interesting if it is a Tiger Lake :)

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Aw: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-01-31 11:49             ` Niklas Cassel
  2024-01-31 12:09               ` Damien Le Moal
@ 2024-02-01  7:10               ` Dieter Mummenschanz
  2024-02-01 10:51                 ` Niklas Cassel
  1 sibling, 1 reply; 23+ messages in thread
From: Dieter Mummenschanz @ 2024-02-01  7:10 UTC (permalink / raw
  To: Niklas Cassel, linux-ide

Hi Niklas,

> Just to confirm,
> while testing Damien's patch on top of 6.8-rc2,
> did you still do:

> for foo in /sys/class/scsi_host/host*/link_power_management_policy;
> do echo med_power_with_dipm > $foo;
> done

YES! This I always have to do this on this laptop in order to get transitions.

> It should be needed until we add (or modify the existing entry, if any)
> your PCI vendor and device id to use "board_ahci_low_power" instead of
> "board_ahci", see e.g.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/ata/ahci.c?> id=b8b8b4e0c052b2c06e1c4820a8001f4e0f77900f

Not sure I understand. So this CAN be fined inside the Kernel?


Kind regards,
Dieter

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Aw: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-01-31 12:09               ` Damien Le Moal
@ 2024-02-01  7:12                 ` Dieter Mummenschanz
  2024-02-01  8:09                   ` Damien Le Moal
  0 siblings, 1 reply; 23+ messages in thread
From: Dieter Mummenschanz @ 2024-02-01  7:12 UTC (permalink / raw
  To: Damien Le Moal, cassel, linux-ide


> Indeed. What is the AHCI adapter here ?
> It would be interesting if it is a Tiger Lake :)

No It's an older Coffeelake/Cannonlake Platform.

lspci -v |grep AHCI shows:

00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10) (prog-if 01 [AHCI 1.0])
        Subsystem: CLEVO/KAPOK Computer Cannon Lake PCH SATA AHCI Controller
 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Aw: Re: Re: Re: [PATCH 0/2] Power management fixes
  2024-02-01  7:12                 ` Aw: " Dieter Mummenschanz
@ 2024-02-01  8:09                   ` Damien Le Moal
  0 siblings, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2024-02-01  8:09 UTC (permalink / raw
  To: Dieter Mummenschanz, cassel, linux-ide

On 2/1/24 16:12, Dieter Mummenschanz wrote:
> 
>> Indeed. What is the AHCI adapter here ?
>> It would be interesting if it is a Tiger Lake :)
> 
> No It's an older Coffeelake/Cannonlake Platform.
> 
> lspci -v |grep AHCI shows:
> 
> 00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10) (prog-if 01 [AHCI 1.0])
>         Subsystem: CLEVO/KAPOK Computer Cannon Lake PCH SATA AHCI Controller

Thanks. Can you send the PCI-IDs too please ? (lspci -n -s 00:17.0)

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-02-01  7:10               ` Dieter Mummenschanz
@ 2024-02-01 10:51                 ` Niklas Cassel
  2024-02-02 14:53                   ` Aw: " Dieter Mummenschanz
  0 siblings, 1 reply; 23+ messages in thread
From: Niklas Cassel @ 2024-02-01 10:51 UTC (permalink / raw
  To: Dieter Mummenschanz; +Cc: linux-ide, Damien Le Moal

On Thu, Feb 01, 2024 at 08:10:11AM +0100, Dieter Mummenschanz wrote:
> > It should be needed until we add (or modify the existing entry, if any)
> > your PCI vendor and device id to use "board_ahci_low_power" instead of
> > "board_ahci", see e.g.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/ata/ahci.c?> id=b8b8b4e0c052b2c06e1c4820a8001f4e0f77900f
> 
> Not sure I understand. So this CAN be fined inside the Kernel?

Yes, this part of the problem can be avoided by adding an explicit entry
that uses "board_ahci_low_power", as I explained in:
https://lore.kernel.org/linux-ide/ZaATdGDOo5jiBqCR@x1-carbon/T/#u

We seem to have a problem with Tiger Lake, but that problem seems to be
related to Intel VMD.

From looking at your logs, you don't seem to have Intel VMD, however
I'm guessing that some other motherboards that uses Cannon Lake might
have Intel VMD, so I guess the safest thing is to wait until that issue
has been resolved before adding a "board_ahci_low_power" entry for
Cannon Lake.



However, we are still confused why the revert in this series did not
enable your system to enter deeper power states, as you have previously
confirmed that reverting this change:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/ata/libata-core.c?id=d035e4eb38b3ea5ae9ead342f888fd3c394b0fe0
allowed you to enter deeper power states once again.

I see now that Damien's revert (patch 2/2 in this series) is not a simple
$ git revert fd3a6837d8e18cb7be80dcca1283276290336a7a
it seems to have some other small changes in the same patch as well.

Sorry for asking you to test something once more...
But could you please test with:
v6.6-rc2 + git revert fd3a6837d8e18cb7be80dcca1283276290336a7a


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Aw: Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-02-01 10:51                 ` Niklas Cassel
@ 2024-02-02 14:53                   ` Dieter Mummenschanz
  2024-02-05 19:00                     ` Niklas Cassel
  0 siblings, 1 reply; 23+ messages in thread
From: Dieter Mummenschanz @ 2024-02-02 14:53 UTC (permalink / raw
  To: Niklas Cassel; +Cc: linux-ide, Damien Le Moal


> Yes, this part of the problem can be avoided by adding an explicit entry
> that uses "board_ahci_low_power", as I explained in:
> https://lore.kernel.org/linux-ide/ZaATdGDOo5jiBqCR@x1-carbon/T/#u[https://lore.kernel.org/linux-> ide/ZaATdGDOo5jiBqCR@x1-carbon/T/#u]

> We seem to have a problem with Tiger Lake, but that problem seems to be
> related to Intel VMD.

> From looking at your logs, you don't seem to have Intel VMD, however
> I'm guessing that some other motherboards that uses Cannon Lake might
> have Intel VMD, so I guess the safest thing is to wait until that issue
> has been resolved before adding a "board_ahci_low_power" entry for
> Cannon Lake.

This mighht be kind of naive but what about adding an kernel option to force LPM even if the chipset/board is not oficially supported?

> I see now that Damien's revert (patch 2/2 in this series) is not a simple
> $ git revert fd3a6837d8e18cb7be80dcca1283276290336a7a
> it seems to have some other small changes in the same patch as well.

> Sorry for asking you to test something once more...
> But could you please test with:
> v6.6-rc2 + git revert fd3a6837d8e18cb7be80dcca1283276290336a7a

No problem. I've explicitly reverted with 6.8-rc2 and have the kernel running for 1 1/2 days now with 3 suspends and on all of them my system transitions to pc8 so I guess we're good but I'll keep on testing to be sure. Hope this helps.

Kind regards,
Dieter

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-02-02 14:53                   ` Aw: " Dieter Mummenschanz
@ 2024-02-05 19:00                     ` Niklas Cassel
       [not found]                       ` <trinity-0bc8e6ea-7808-4508-af3a-be22281abf24-1707231996854@3c-app-webde-bs42>
  0 siblings, 1 reply; 23+ messages in thread
From: Niklas Cassel @ 2024-02-05 19:00 UTC (permalink / raw
  To: Dieter Mummenschanz; +Cc: linux-ide, Damien Le Moal

Hello Dieter,

On Fri, Feb 02, 2024 at 03:53:36PM +0100, Dieter Mummenschanz wrote:
> 
> > Yes, this part of the problem can be avoided by adding an explicit entry
> > that uses "board_ahci_low_power", as I explained in:
> > https://lore.kernel.org/linux-ide/ZaATdGDOo5jiBqCR@x1-carbon/T/#u[https://lore.kernel.org/linux-> ide/ZaATdGDOo5jiBqCR@x1-carbon/T/#u]
> 
> > We seem to have a problem with Tiger Lake, but that problem seems to be
> > related to Intel VMD.
> 
> > From looking at your logs, you don't seem to have Intel VMD, however
> > I'm guessing that some other motherboards that uses Cannon Lake might
> > have Intel VMD, so I guess the safest thing is to wait until that issue
> > has been resolved before adding a "board_ahci_low_power" entry for
> > Cannon Lake.
> 
> This mighht be kind of naive but what about adding an kernel option to force LPM even if the chipset/board is not oficially supported?

That is a good suggestion.
However, I'm hoping that this series:
https://lore.kernel.org/linux-ide/3e50681d-7a68-4c4a-91f6-278a3cfe23f8@amd.com/T/

Will avoid the need for that, as it would kill the
"board_ahci_low_power" board type completely.


> 
> > I see now that Damien's revert (patch 2/2 in this series) is not a simple
> > $ git revert fd3a6837d8e18cb7be80dcca1283276290336a7a
> > it seems to have some other small changes in the same patch as well.
> 
> > Sorry for asking you to test something once more...
> > But could you please test with:
> > v6.6-rc2 + git revert fd3a6837d8e18cb7be80dcca1283276290336a7a
> 
> No problem. I've explicitly reverted with 6.8-rc2 and have the kernel running for 1 1/2 days now with 3 suspends and on all of them my system transitions to pc8 so I guess we're good but I'll keep on testing to be sure. Hope this helps.

After talking with Damien, we are still completely lost.

commit fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
only modifies the ata_pci_shutdown_one(), which is only called on shutdown.

It is not called in the system suspend / system resume path,
so it should not be possible for this commit to have an impact.


Reading all the emails and bugzilla again, what you are seeing is that:

On a fresh boot, and leaving the computer untouched for a few minutes,
and looking at e.g. turbostat, some of your CPUs have spent time in
the lower-power C-states, e.g. C8/C9/C10.

However, if you do a system suspend + system resume, and then leave
the computer untouched for a few minutes, and look at e.g. turbotstat,
_none_ of your CPUs have spend time in the lower-power C-states.

Is this correct?


If so, it seems that suspend/resume must have messed up some setting...

The commit that added the code to ata_pci_shutdown_one()
was added in:
5b6fba546da2 ("ata: libata-core: Detach a port devices on shutdown")
This was first added in v6.7-rc1.
(So this commit was never included in v6.6.)

The code added in 5b6fba546da2 was later removed in:
commit fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
which was first added in v6.7-rc1.

Are you certain that v6.6 works and v6.7 is bad?

There are quite few libata patches between v6.6 and v6.7:
$ git log --oneline v6.7 --not v6.6 -- drivers/ata

Damien has sent many libata patches related to suspend/resume in the
last couple of kernel releases, so it is possible that something has
regressed. We just have a really hard time seeing how fd3a6837d8e1
could be the bad commit.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
       [not found]                       ` <trinity-0bc8e6ea-7808-4508-af3a-be22281abf24-1707231996854@3c-app-webde-bs42>
@ 2024-02-06 21:46                         ` Niklas Cassel
  2024-02-08 14:37                           ` Aw: " Dieter Mummenschanz
  0 siblings, 1 reply; 23+ messages in thread
From: Niklas Cassel @ 2024-02-06 21:46 UTC (permalink / raw
  To: Dieter Mummenschanz; +Cc: linux-ide, Damien Le Moal

Hello Dieter,

On Tue, Feb 06, 2024 at 04:06:36PM +0100, Dieter Mummenschanz wrote:
> Hello Niklas,
> 
> > That is a good suggestion.
> > However, I'm hoping that this series:
> > https://lore.kernel.org/linux-ide/3e50681d-7a68-4c4a-91f6-278a3cfe23f8@amd.com/T/[https://lore.kernel.org/linux-> ide/3e50681d-7a68-4c4a-91f6-278a3cfe23f8@amd.com/T/]
> 
> > Will avoid the need for that, as it would kill the
> > "board_ahci_low_power" board type completely.
> 
> Very nice. If there is anything "baked" enough to test, please let me know.

It should be ready for testing:
https://github.com/floatious/linux/tree/external-port-v2

With this, you should no longer need:
for foo in /sys/class/scsi_host/host*/link_power_management_policy;
  do echo med_power_with_dipm > $foo;
done

(Assuming you have compiled your kernel with CONFIG_SATA_MOBILE_LPM_POLICY=3).


> 
> > After talking with Damien, we are still completely lost.
> > commit fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
> > only modifies the ata_pci_shutdown_one(), which is only called on shutdown.
> 
> > It is not called in the system suspend / system resume path,
> > so it should not be possible for this commit to have an impact.
> 
> I don't understand it either. For what it's worth, this morning after resuming, my system was stuck at pc2 for 10+ Minutes. After issuing hdparm -Y /dev/sdb I saw lower transitions down to pc8 right away without(!) issuing the same command for sda. So maybe fd3a6837d8e1 isn't the culprit after all. I'll keep testing.

You shouldn't need to do:
$ hdparm -Y

As that bypasses some of the internal logic in libata.

I assume that you didn't need this on v6.6 and older?

(Instead libata should put the device to standby or sleep itself,
it shouldn't need to be done explicitly by the user.)


> 
> 
> > Reading all the emails and bugzilla again, what you are seeing is that:
> 
> > On a fresh boot, and leaving the computer untouched for a few minutes,
> > and looking at e.g. turbostat, some of your CPUs have spent time in
> > the lower-power C-states, e.g. C8/C9/C10.
> 
> Correct! Btw. I'm looking at powertop and also have a daemon that feeds the current c-states to conky for display so I have an eye on the c-states at all times.
> 
> > However, if you do a system suspend + system resume, and then leave
> > the computer untouched for a few minutes, and look at e.g. turbotstat,
> > _none_ of your CPUs have spend time in the lower-power C-states.
> 
> > Is this correct?
> 
> Yes! The system spends ~90% in pc2 no lower c-states.
> 
> > If so, it seems that suspend/resume must have messed up some setting...
> 
> > The commit that added the code to ata_pci_shutdown_one()
> > was added in:
> > 5b6fba546da2 ("ata: libata-core: Detach a port devices on shutdown")
> > This was first added in v6.7-rc1.
> > (So this commit was never included in v6.6.)
> 
> > The code added in 5b6fba546da2 was later removed in:
> > commit fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
> > which was first added in v6.7-rc1.
> 
> > Are you certain that v6.6 works and v6.7 is bad?
> 
> Yes, absolutely!

How certain are you that v6.6 works?

Could it be the same problem there, that it works sometimes and doesn't
work sometimes?

The reason why I'm asking is that Damien's major changes got included
in v6.6-rc4:
https://lore.kernel.org/linux-ide/20230929133324.164211-1-dlemoal@kernel.org/

So I would be less surprised if you said that you can enter pc8 on every
boot on v6.5, but on v6.6 it only works occasionally.

But if you can enter pc8 on every boot on v6.6, but not on v6.7,
then it is probably easier to figure out which commit that broke
things, as there were not that many suspend/resume related changes
added in v6.7.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Aw: Re: Re: Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-02-06 21:46                         ` Niklas Cassel
@ 2024-02-08 14:37                           ` Dieter Mummenschanz
  2024-02-13 20:02                             ` Niklas Cassel
  0 siblings, 1 reply; 23+ messages in thread
From: Dieter Mummenschanz @ 2024-02-08 14:37 UTC (permalink / raw
  To: Niklas Cassel; +Cc: linux-ide, Damien Le Moal

Hello Niklas,


> It should be ready for testing:
> https://github.com/floatious/linux/tree/external-port-v2[https://github.com/floatious/linux/tree/external-port-v2]

> With this, you should no longer need:
> for foo in /sys/class/scsi_host/host*/link_power_management_policy;
> do echo med_power_with_dipm > $foo;
> done

> (Assuming you have compiled your kernel with CONFIG_SATA_MOBILE_LPM_POLICY=3).

Oh sweet! Will give it a spin over the weekend if I find the time.


> You shouldn't need to do:
> $ hdparm -Y

> As that bypasses some of the internal logic in libata.

> I assume that you didn't need this on v6.6 and older?

> (Instead libata should put the device to standby or sleep itself,
> it shouldn't need to be done explicitly by the user.)

I'm at a loss either. No idea why this is actually working. When I encountered the low-power issue I searched the web and stumbled upon hdparm -Y and gave it a shot.

> How certain are you that v6.6 works?

Let's put it this way: I've installed most 6.6-rc's so I had it running for several weeks and I always have an eye on conky showing me the c-states in 5 second interval. And during all that time and several suspends it never occured.

It definately started with the first 6.7 -rc I've installed (don't recall wich one first). There were no hardware changes and no BIOS updates during that time. If I find some spare time, I'm gonna give the latest 6.6 from kernel.org a spin to verify.

> Could it be the same problem there, that it works sometimes and doesn't
> work sometimes?

Again I've never encountered this issue with 6.6 or any other kernel before. I would have remembered that.

> The reason why I'm asking is that Damien's major changes got included
> in v6.6-rc4:
> https://lore.kernel.org/linux-ide/20230929133324.164211-1-dlemoal@kernel.org/[https://lore.kernel.org/linux-> ide/20230929133324.164211-1-dlemoal@kernel.org/]

> So I would be less surprised if you said that you can enter pc8 on every
> boot on v6.5, but on v6.6 it only works occasionally.

To be sure I would have to switch back to 6.6. If I find the time I'll give it a shot.

> But if you can enter pc8 on every boot on v6.6, but not on v6.7,
> then it is probably easier to figure out which commit that broke
> things, as there were not that many suspend/resume related changes
> added in v6.7.

Maybe you could point them out so I can try to bisect?

Kind regards,
Dieter


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: Re: Re: Re:  Re: Re: [PATCH 0/2] Power management fixes
  2024-02-08 14:37                           ` Aw: " Dieter Mummenschanz
@ 2024-02-13 20:02                             ` Niklas Cassel
  0 siblings, 0 replies; 23+ messages in thread
From: Niklas Cassel @ 2024-02-13 20:02 UTC (permalink / raw
  To: Dieter Mummenschanz; +Cc: linux-ide, Damien Le Moal

Hello Dieter,

On Thu, Feb 08, 2024 at 03:37:48PM +0100, Dieter Mummenschanz wrote:

> > I assume that you didn't need this on v6.6 and older?
> 
> > (Instead libata should put the device to standby or sleep itself,
> > it shouldn't need to be done explicitly by the user.)
> 
> I'm at a loss either. No idea why this is actually working. When I encountered the low-power issue I searched the web and stumbled upon hdparm -Y and gave it a shot.

Are you saying that you always do this before suspending?
Both on v6.6 and v6.7?


> > But if you can enter pc8 on every boot on v6.6, but not on v6.7,
> > then it is probably easier to figure out which commit that broke
> > things, as there were not that many suspend/resume related changes
> > added in v6.7.
> 
> Maybe you could point them out so I can try to bisect?

Sure.

These are the libata (and related SCSI) patches added in v6.7
(that are not in v6.6):

b09d7f8fd50f scsi: sd: Fix system start for ATA devices
6371be7aeb98 scsi: Change SCSI device boolean fields to single bit flags
fd3a6837d8e1 ata: libata-core: Fix ata_pci_shutdown_one()
2da4c5e24e86 ata: libata-core: Improve ata_dev_power_set_active()
54d7211da7cd ata: libata-eh: Spinup disk on resume after revalidation
1b947279798f ata: libata: Cleanup inline DMA helper functions
0fecb50891aa ata: libata-eh: Reduce "disable device" message verbosity
7f95731c74d7 ata: libata-eh: Improve reset error messages
88b9f8928678 ata: libata-sata: Improve ata_sas_slave_configure()
3341b82368fb ata: libata-core: Do not resume runtime suspended ports
3a94af2488bf ata: libata-core: Do not poweroff runtime suspended ports
09b055cfb0e9 ata: libata-core: Remove ata_port_resume_async()
6702255d700a ata: libata-core: Remove ata_port_suspend_async()
5b6fba546da2 ata: libata-core: Detach a port devices on shutdown
cfead0dd81de ata: libata-core: Synchronize ata_port_detach() with hotplug
8c1f08170694 ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
c4367ac83805 scsi: Remove scsi device no_start_on_resume flag

I would start with v6.6 and apply them one by one.
(Starting with c4367ac83805)

However, it is also possible that the regression is completely unrelated
to libata changes (e.g. caused by a ACPI suspend/resume patch), and in
that case, you will need a complete bisection.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby
  2024-01-11 11:51 ` [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby Damien Le Moal
@ 2024-02-14 11:03   ` Niklas Cassel
  0 siblings, 0 replies; 23+ messages in thread
From: Niklas Cassel @ 2024-02-14 11:03 UTC (permalink / raw
  To: Damien Le Moal
  Cc: linux-ide, Dieter Mummenschanz, Wang Zhihao, linux-regressions

On Thu, Jan 11, 2024 at 08:51:22PM +0900, Damien Le Moal wrote:
> In ata ata_dev_power_set_standby(), check that the target device is not
> sleeping. If it is, there is no need to do anything.
> 
> Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
> Cc: stable@vger.kernel.org
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
>  drivers/ata/libata-core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 09ed67772fae..d9f80f4f70f5 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2017,6 +2017,10 @@ void ata_dev_power_set_standby(struct ata_device *dev)
>  	struct ata_taskfile tf;
>  	unsigned int err_mask;
>  
> +	/* If the device is already sleeping, do nothing. */
> +	if (dev->flags & ATA_DFLAG_SLEEPING)
> +		return;
> +
>  	/*
>  	 * Some odd clown BIOSes issue spindown on power off (ACPI S4 or S5)
>  	 * causing some drives to spin up and down again. For these, do nothing
> -- 
> 2.43.0
> 

Holding off [PATCH 2/2] until we get a better understanding of things,
thus for [PATCH 1/2] only:

Applied:
https://git.kernel.org/pub/scm/linux/kernel/git/libata/linux.git/log/?h=for-6.8-fixes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-01-11 18:10   ` Sergei Shtylyov
  2024-01-11 23:13     ` Damien Le Moal
@ 2024-02-19 15:29     ` Niklas Cassel
  2024-02-23 21:04       ` Sergey Shtylyov
  1 sibling, 1 reply; 23+ messages in thread
From: Niklas Cassel @ 2024-02-19 15:29 UTC (permalink / raw
  To: Sergei Shtylyov
  Cc: Damien Le Moal, linux-ide, Dieter Mummenschanz, Wang Zhihao,
	linux-regressions

Hello Sergei,

On Thu, Jan 11, 2024 at 09:10:43PM +0300, Sergei Shtylyov wrote:
> On 1/11/24 2:51 PM, Damien Le Moal wrote:
> 
> > This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.
> > 
> > Several users have signaled issues with commit fd3a6837d8e1 ("ata:
> > libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
> > system SoC to go to a low power state. The reason for this problem
> > is not well understood but given that this patch is harmless with the
> > improvements to ata_dev_power_set_standby(), restore it to allow system
> > lower power state transitions.
> > 
> > For regular system shutdown, ata_dev_power_set_standby() will be
> > executed twice: once the scsi device is removed and another when
> > ata_pci_shutdown_one() executes and EH completes unloading the devices.
> > Make the second call to ata_dev_power_set_standby() do nothing by using
> > ata_dev_power_is_active() and return if the device is already in
> > standby.
> > 
> > Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> > ---
> >  drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
> >  1 file changed, 45 insertions(+), 30 deletions(-)
> > 
> > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> > index d9f80f4f70f5..20a366942626 100644
> > --- a/drivers/ata/libata-core.c
> > +++ b/drivers/ata/libata-core.c
> > @@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
> >  	return true;
> >  }
> >  
> > +static bool ata_dev_power_is_active(struct ata_device *dev)
> > +{
> > +	struct ata_taskfile tf;
> > +	unsigned int err_mask;
> > +
> > +	ata_tf_init(dev, &tf);
> > +	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
> 
>    Why set ATA_TFLAG_ISADDR, BTW? This command doesn't use any taskfile
> regs but the device/head reg. Material for a fix, I guess... :-)
> 
> > +	tf.protocol = ATA_PROT_NODATA;
> > +	tf.command = ATA_CMD_CHK_POWER;
> > +
> [...]

Looking at the definition of the flag:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/libata.h?h=v6.8-rc5#n76

"enable r/w to nsect/lba regs"

This function does read from the nsect reg:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?h=v6.8-rc5#n2069

So I would prefer to keep it as it.


Basically the whole motto for libata right now:
"If it ain't broke, don't fix it."


Sure, if we look at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-sff.c?h=v6.8-rc5#n343

it seems that flags ATA_TFLAG_ISADDR, ATA_TFLAG_LBA48, and ATA_TFLAG_DEVICE
is used to "guard" if these regs should be written to the TF.


However, if we look at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-sff.c?h=v6.8-rc5#n392

is seems that only flag ATA_TFLAG_LBA48 is used to "guard" if the regs should
be read from the TF.

So it looks like the intention was that these flags should be used
to guard if certain TF regs should be written or read, but in reality,
only some of the flags are actually for guarding reads.
(While all of the flags are used for guarding writes.)


Personally, I don't really see the point of using flags to guard writes
to the host controller. Why would we want to skip writing certain TF regs?
The struct ata_taskfile should be zero-initialized before filling it with
a command, so why not always write all TF regs and remove these flags?

Anyway, why touch it now and risk introducing regressions on some old PATA
hardware?


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-02-19 15:29     ` Niklas Cassel
@ 2024-02-23 21:04       ` Sergey Shtylyov
  2024-02-26  9:28         ` Niklas Cassel
  0 siblings, 1 reply; 23+ messages in thread
From: Sergey Shtylyov @ 2024-02-23 21:04 UTC (permalink / raw
  To: Niklas Cassel
  Cc: Damien Le Moal, linux-ide, Dieter Mummenschanz, Wang Zhihao,
	linux-regressions

On 2/19/24 6:29 PM, Niklas Cassel wrote:
[...]

>>> This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.
>>>
>>> Several users have signaled issues with commit fd3a6837d8e1 ("ata:
>>> libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
>>> system SoC to go to a low power state. The reason for this problem
>>> is not well understood but given that this patch is harmless with the
>>> improvements to ata_dev_power_set_standby(), restore it to allow system
>>> lower power state transitions.
>>>
>>> For regular system shutdown, ata_dev_power_set_standby() will be
>>> executed twice: once the scsi device is removed and another when
>>> ata_pci_shutdown_one() executes and EH completes unloading the devices.
>>> Make the second call to ata_dev_power_set_standby() do nothing by using
>>> ata_dev_power_is_active() and return if the device is already in
>>> standby.
>>>
>>> Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>>> ---
>>>  drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
>>>  1 file changed, 45 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>> index d9f80f4f70f5..20a366942626 100644
>>> --- a/drivers/ata/libata-core.c
>>> +++ b/drivers/ata/libata-core.c
>>> @@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
>>>  	return true;
>>>  }
>>>  
>>> +static bool ata_dev_power_is_active(struct ata_device *dev)
>>> +{
>>> +	struct ata_taskfile tf;
>>> +	unsigned int err_mask;
>>> +
>>> +	ata_tf_init(dev, &tf);
>>> +	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
>>
>>    Why set ATA_TFLAG_ISADDR, BTW? This command doesn't use any taskfile
>> regs but the device/head reg. Material for a fix, I guess... :-)
>>
>>> +	tf.protocol = ATA_PROT_NODATA;
>>> +	tf.command = ATA_CMD_CHK_POWER;
>>> +
>> [...]
> 
> Looking at the definition of the flag:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/libata.h?h=v6.8-rc5#n76
> 
> "enable r/w to nsect/lba regs"

   I'm afraid this comment doesn't reflect the reality in its r/w part --
if you look at e.g. ata_sff_tf_read(), you'll see that it always reads 
all the legacy taskfile and only checks ATA_TFLAG_LBA48 in order to
determine whether it should read the HOBs as well...

> This function does read from the nsect reg:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?h=v6.8-rc5#n2069
> 
> So I would prefer to keep it as it.

   IMO, it doesn't make much sense -- unless you assume that the device
could leave that reg unset as a result of this command...

> Basically the whole motto for libata right now:
> "If it ain't broke, don't fix it."

   Do you realize that each taskfile reg access takes e.g. 900-990 ns on
the Intel PIIX/ICH (the part # was 82371/82801) IDE controllers (with 33
MHz PCI bus)? Luckily, we just have to write (almost?) whole taskfile on
the read/write commands anyway...

> Sure, if we look at:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-sff.c?h=v6.8-rc5#n343
> 
> it seems that flags ATA_TFLAG_ISADDR, ATA_TFLAG_LBA48, and ATA_TFLAG_DEVICE
> is used to "guard" if these regs should be written to the TF.
> 
> 
> However, if we look at:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-sff.c?h=v6.8-rc5#n392
> 
> is seems that only flag ATA_TFLAG_LBA48 is used to "guard" if the regs should
> be read from the TF.

   Luckily, we have to read back the whole taskfile only on the read/write
errors...

> So it looks like the intention was that these flags should be used
> to guard if certain TF regs should be written or read, but in reality,
> only some of the flags are actually for guarding reads.
> (While all of the flags are used for guarding writes.)

   So you're seeing that inconsistency (I mentioned) yourself... :-)

> Personally, I don't really see the point of using flags to guard writes
> to the host controller. Why would we want to skip writing certain TF regs?
> 
> The struct ata_taskfile should be zero-initialized before filling it with

   TBH, I generally hate how libata implemented the taskfile accessors
and I hate how *struct* ata_taskfile looks too... :-)

> a command, so why not always write all TF regs and remove these flags?

   To stop wasting a good microsecond per a reg R/W cycle, perhaps? :-)
   Anyway, the ATA standards clearly describe what the regs are used by
each command and what to expect on a normal/erratic command completion...

   In drivers/ide/ we finanally ended up with 8-bit reg validity flags,
each bit corresponding to an individual taskfile reg:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60f85019c6c8c1aebf3485a313e0da094bc95d07

> Anyway, why touch it now and risk introducing regressions on some old PATA
> hardware?

   Do you realize that drivers/ide/ wasn't writing out the whole taskfile
when issuing this particular command since:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d364c7f50b3bb6dc77259974038567b821e2cf0a

   If there were regressions, we would have seen them a long time ago,
no? :-)

> Kind regards,
> Niklas

MBR, Sergey

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()"
  2024-02-23 21:04       ` Sergey Shtylyov
@ 2024-02-26  9:28         ` Niklas Cassel
  0 siblings, 0 replies; 23+ messages in thread
From: Niklas Cassel @ 2024-02-26  9:28 UTC (permalink / raw
  To: Sergey Shtylyov
  Cc: Damien Le Moal, linux-ide, Dieter Mummenschanz, Wang Zhihao,
	linux-regressions

Hello Sergey,

On Sat, Feb 24, 2024 at 12:04:46AM +0300, Sergey Shtylyov wrote:
> On 2/19/24 6:29 PM, Niklas Cassel wrote:
> [...]
> 
> >>> This reverts commit fd3a6837d8e18cb7be80dcca1283276290336a7a.
> >>>
> >>> Several users have signaled issues with commit fd3a6837d8e1 ("ata:
> >>> libata-core: Fix ata_pci_shutdown_one()") which causes failure of the
> >>> system SoC to go to a low power state. The reason for this problem
> >>> is not well understood but given that this patch is harmless with the
> >>> improvements to ata_dev_power_set_standby(), restore it to allow system
> >>> lower power state transitions.
> >>>
> >>> For regular system shutdown, ata_dev_power_set_standby() will be
> >>> executed twice: once the scsi device is removed and another when
> >>> ata_pci_shutdown_one() executes and EH completes unloading the devices.
> >>> Make the second call to ata_dev_power_set_standby() do nothing by using
> >>> ata_dev_power_is_active() and return if the device is already in
> >>> standby.
> >>>
> >>> Fixes: fd3a6837d8e1 ("ata: libata-core: Fix ata_pci_shutdown_one()")
> >>> Cc: stable@vger.kernel.org
> >>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> >>> ---
> >>>  drivers/ata/libata-core.c | 75 +++++++++++++++++++++++----------------
> >>>  1 file changed, 45 insertions(+), 30 deletions(-)
> >>>
> >>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>> index d9f80f4f70f5..20a366942626 100644
> >>> --- a/drivers/ata/libata-core.c
> >>> +++ b/drivers/ata/libata-core.c
> >>> @@ -2001,6 +2001,33 @@ bool ata_dev_power_init_tf(struct ata_device *dev, struct ata_taskfile *tf,
> >>>  	return true;
> >>>  }
> >>>  
> >>> +static bool ata_dev_power_is_active(struct ata_device *dev)
> >>> +{
> >>> +	struct ata_taskfile tf;
> >>> +	unsigned int err_mask;
> >>> +
> >>> +	ata_tf_init(dev, &tf);
> >>> +	tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;
> >>
> >>    Why set ATA_TFLAG_ISADDR, BTW? This command doesn't use any taskfile
> >> regs but the device/head reg. Material for a fix, I guess... :-)
> >>
> >>> +	tf.protocol = ATA_PROT_NODATA;
> >>> +	tf.command = ATA_CMD_CHK_POWER;
> >>> +
> >> [...]
> > 
> > Looking at the definition of the flag:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/libata.h?h=v6.8-rc5#n76
> > 
> > "enable r/w to nsect/lba regs"
> 
>    I'm afraid this comment doesn't reflect the reality in its r/w part --
> if you look at e.g. ata_sff_tf_read(), you'll see that it always reads 
> all the legacy taskfile and only checks ATA_TFLAG_LBA48 in order to
> determine whether it should read the HOBs as well...

Considering that you have experience with cleaning up a similar mess in
drivers/ide, patches for drivers/ata which fixes the comment and the
functions needlessly setting the ATA_TFLAG_DEVICE flag is more than
welcome.

From a quick look, there appears to be quite a few functions (in
addition to ata_dev_power_is_active()) which needlessly sets this flag.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-02-26  9:28 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-11 11:51 [PATCH 0/2] Power management fixes Damien Le Moal
2024-01-11 11:51 ` [PATCH 1/2] ata: libata-core: Do not try to set sleeping devices to standby Damien Le Moal
2024-02-14 11:03   ` Niklas Cassel
2024-01-11 11:51 ` [PATCH 2/2] ata: libata-core: Revert "ata: libata-core: Fix ata_pci_shutdown_one()" Damien Le Moal
2024-01-11 18:10   ` Sergei Shtylyov
2024-01-11 23:13     ` Damien Le Moal
2024-02-19 15:29     ` Niklas Cassel
2024-02-23 21:04       ` Sergey Shtylyov
2024-02-26  9:28         ` Niklas Cassel
     [not found] ` <DU0P251MB082515FC8FE77424231B475CF4682@DU0P251MB0825.EURP251.PROD.OUTLOOK.COM>
2024-01-22  8:49   ` [PATCH 0/2] Power management fixes Damien Le Moal
     [not found]     ` <trinity-0be6e8a8-e6d3-4d60-be0d-59592a9edd65-1706010022623@3c-app-webde-bap10>
2024-01-23 11:52       ` Aw: " Damien Le Moal
     [not found]         ` <trinity-0df92d73-be55-433c-bdb2-4387f7ea590b-1706686178879@3c-app-webde-bap43>
2024-01-31  7:38           ` Aw: " Damien Le Moal
2024-01-31 11:49             ` Niklas Cassel
2024-01-31 12:09               ` Damien Le Moal
2024-02-01  7:12                 ` Aw: " Dieter Mummenschanz
2024-02-01  8:09                   ` Damien Le Moal
2024-02-01  7:10               ` Dieter Mummenschanz
2024-02-01 10:51                 ` Niklas Cassel
2024-02-02 14:53                   ` Aw: " Dieter Mummenschanz
2024-02-05 19:00                     ` Niklas Cassel
     [not found]                       ` <trinity-0bc8e6ea-7808-4508-af3a-be22281abf24-1707231996854@3c-app-webde-bs42>
2024-02-06 21:46                         ` Niklas Cassel
2024-02-08 14:37                           ` Aw: " Dieter Mummenschanz
2024-02-13 20:02                             ` Niklas Cassel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).