* [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-22 7:31 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-22 7:31 UTC (permalink / raw)
To: linux-sh
If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
up with an infinite loop in genpd_dev_pm_{at,de}tach().
This may happen due to a genpd.prepared_count imbalance. This is a bug
elsewhere, but it will result in a system lock up, possibly during
reboot of an otherwise functioning system.
To avoid this, put a limit on the maximum number of loop iterations,
including a simple back-off mechanism. If the limit is reached, the
operation will just fail. An error message is already printed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/base/power/domain.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index cdd547bd67df8218..60e0309dd8dd0264 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -6,6 +6,7 @@
* This file is released under the GPLv2.
*/
+#include <linux/delay.h>
#include <linux/kernel.h>
#include <linux/io.h>
#include <linux/platform_device.h>
@@ -19,6 +20,9 @@
#include <linux/suspend.h>
#include <linux/export.h>
+#define GENPD_RETRIES 20
+#define GENPD_DELAY_US 10
+
#define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
({ \
type (*__routine)(struct device *__d); \
@@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
static void genpd_dev_pm_detach(struct device *dev, bool power_off)
{
struct generic_pm_domain *pd;
+ unsigned int i;
int ret = 0;
pd = pm_genpd_lookup_dev(dev);
@@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
dev_dbg(dev, "removing from PM domain %s\n", pd->name);
- while (1) {
+ for (i = 0; i < GENPD_RETRIES; i++) {
ret = pm_genpd_remove_device(pd, dev);
if (ret != -EAGAIN)
break;
+
+ if (i > GENPD_RETRIES / 2)
+ udelay(GENPD_DELAY_US);
cond_resched();
}
@@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
{
struct of_phandle_args pd_args;
struct generic_pm_domain *pd;
+ unsigned int i;
int ret;
if (!dev->of_node)
@@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
dev_dbg(dev, "adding to PM domain %s\n", pd->name);
- while (1) {
+ for (i = 0; i < GENPD_RETRIES; i++) {
ret = pm_genpd_add_device(pd, dev);
if (ret != -EAGAIN)
break;
+
+ if (i > GENPD_RETRIES / 2)
+ udelay(GENPD_DELAY_US);
cond_resched();
}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-22 7:31 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-22 7:31 UTC (permalink / raw)
To: Daniel Lezcano, Thomas Gleixner, Rafael J. Wysocki, Kevin Hilman,
Ulf Hansson
Cc: Magnus Damm, Laurent Pinchart, linux-pm, linux-sh, linux-kernel,
Geert Uytterhoeven
If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
up with an infinite loop in genpd_dev_pm_{at,de}tach().
This may happen due to a genpd.prepared_count imbalance. This is a bug
elsewhere, but it will result in a system lock up, possibly during
reboot of an otherwise functioning system.
To avoid this, put a limit on the maximum number of loop iterations,
including a simple back-off mechanism. If the limit is reached, the
operation will just fail. An error message is already printed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/base/power/domain.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index cdd547bd67df8218..60e0309dd8dd0264 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -6,6 +6,7 @@
* This file is released under the GPLv2.
*/
+#include <linux/delay.h>
#include <linux/kernel.h>
#include <linux/io.h>
#include <linux/platform_device.h>
@@ -19,6 +20,9 @@
#include <linux/suspend.h>
#include <linux/export.h>
+#define GENPD_RETRIES 20
+#define GENPD_DELAY_US 10
+
#define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
({ \
type (*__routine)(struct device *__d); \
@@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
static void genpd_dev_pm_detach(struct device *dev, bool power_off)
{
struct generic_pm_domain *pd;
+ unsigned int i;
int ret = 0;
pd = pm_genpd_lookup_dev(dev);
@@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
dev_dbg(dev, "removing from PM domain %s\n", pd->name);
- while (1) {
+ for (i = 0; i < GENPD_RETRIES; i++) {
ret = pm_genpd_remove_device(pd, dev);
if (ret != -EAGAIN)
break;
+
+ if (i > GENPD_RETRIES / 2)
+ udelay(GENPD_DELAY_US);
cond_resched();
}
@@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
{
struct of_phandle_args pd_args;
struct generic_pm_domain *pd;
+ unsigned int i;
int ret;
if (!dev->of_node)
@@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
dev_dbg(dev, "adding to PM domain %s\n", pd->name);
- while (1) {
+ for (i = 0; i < GENPD_RETRIES; i++) {
ret = pm_genpd_add_device(pd, dev);
if (ret != -EAGAIN)
break;
+
+ if (i > GENPD_RETRIES / 2)
+ udelay(GENPD_DELAY_US);
cond_resched();
}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-22 7:31 ` Geert Uytterhoeven
(?)
@ 2015-06-22 23:41 ` Rafael J. Wysocki
-1 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-22 23:41 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Daniel Lezcano, Thomas Gleixner, Kevin Hilman, Ulf Hansson,
Magnus Damm, Laurent Pinchart, linux-pm, linux-sh, linux-kernel
On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>
> This may happen due to a genpd.prepared_count imbalance. This is a bug
> elsewhere, but it will result in a system lock up, possibly during
> reboot of an otherwise functioning system.
>
> To avoid this, put a limit on the maximum number of loop iterations,
> including a simple back-off mechanism. If the limit is reached, the
> operation will just fail. An error message is already printed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
This was too late for my first 4.2 pull request, but I may push it in the
second half of the merge window. Does it depend on anything?
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-22 23:41 ` Rafael J. Wysocki
0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-22 23:41 UTC (permalink / raw)
To: linux-sh
On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>
> This may happen due to a genpd.prepared_count imbalance. This is a bug
> elsewhere, but it will result in a system lock up, possibly during
> reboot of an otherwise functioning system.
>
> To avoid this, put a limit on the maximum number of loop iterations,
> including a simple back-off mechanism. If the limit is reached, the
> operation will just fail. An error message is already printed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
This was too late for my first 4.2 pull request, but I may push it in the
second half of the merge window. Does it depend on anything?
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-22 23:41 ` Rafael J. Wysocki
0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-22 23:41 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Daniel Lezcano, Thomas Gleixner, Kevin Hilman, Ulf Hansson,
Magnus Damm, Laurent Pinchart, linux-pm, linux-sh, linux-kernel
On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>
> This may happen due to a genpd.prepared_count imbalance. This is a bug
> elsewhere, but it will result in a system lock up, possibly during
> reboot of an otherwise functioning system.
>
> To avoid this, put a limit on the maximum number of loop iterations,
> including a simple back-off mechanism. If the limit is reached, the
> operation will just fail. An error message is already printed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
This was too late for my first 4.2 pull request, but I may push it in the
second half of the merge window. Does it depend on anything?
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-22 23:41 ` Rafael J. Wysocki
(?)
@ 2015-06-23 7:16 ` Geert Uytterhoeven
-1 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 7:16 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner, Kevin Hilman,
Ulf Hansson, Magnus Damm, Laurent Pinchart, Linux PM list,
Linux-sh list, linux-kernel@vger.kernel.org
Hi Rafael,
On Tue, Jun 23, 2015 at 1:41 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> This was too late for my first 4.2 pull request, but I may push it in the
> second half of the merge window. Does it depend on anything?
Not that I'm aware of.
It applies fine on v4.1 or next-20150622.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 7:16 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 7:16 UTC (permalink / raw)
To: linux-sh
Hi Rafael,
On Tue, Jun 23, 2015 at 1:41 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> This was too late for my first 4.2 pull request, but I may push it in the
> second half of the merge window. Does it depend on anything?
Not that I'm aware of.
It applies fine on v4.1 or next-20150622.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-sh" in
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 7:16 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 7:16 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner, Kevin Hilman,
Ulf Hansson, Magnus Damm, Laurent Pinchart, Linux PM list,
Linux-sh list, linux-kernel@vger.kernel.org
Hi Rafael,
On Tue, Jun 23, 2015 at 1:41 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, June 22, 2015 09:31:22 AM Geert Uytterhoeven wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> This was too late for my first 4.2 pull request, but I may push it in the
> second half of the merge window. Does it depend on anything?
Not that I'm aware of.
It applies fine on v4.1 or next-20150622.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-23 12:50 ` Ulf Hansson
-1 siblings, 0 replies; 38+ messages in thread
From: Ulf Hansson @ 2015-06-23 12:50 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Daniel Lezcano, Thomas Gleixner, Rafael J. Wysocki, Kevin Hilman,
Magnus Damm, Laurent Pinchart, linux-pm@vger.kernel.org,
Linux-sh list, linux-kernel@vger.kernel.org
On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>
> This may happen due to a genpd.prepared_count imbalance. This is a bug
> elsewhere, but it will result in a system lock up, possibly during
> reboot of an otherwise functioning system.
>
> To avoid this, put a limit on the maximum number of loop iterations,
> including a simple back-off mechanism. If the limit is reached, the
> operation will just fail. An error message is already printed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> drivers/base/power/domain.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index cdd547bd67df8218..60e0309dd8dd0264 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -6,6 +6,7 @@
> * This file is released under the GPLv2.
> */
>
> +#include <linux/delay.h>
> #include <linux/kernel.h>
> #include <linux/io.h>
> #include <linux/platform_device.h>
> @@ -19,6 +20,9 @@
> #include <linux/suspend.h>
> #include <linux/export.h>
>
> +#define GENPD_RETRIES 20
> +#define GENPD_DELAY_US 10
> +
> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
> ({ \
> type (*__routine)(struct device *__d); \
> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
> {
> struct generic_pm_domain *pd;
> + unsigned int i;
> int ret = 0;
>
> pd = pm_genpd_lookup_dev(dev);
> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>
> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>
> - while (1) {
> + for (i = 0; i < GENPD_RETRIES; i++) {
> ret = pm_genpd_remove_device(pd, dev);
> if (ret != -EAGAIN)
> break;
> +
> + if (i > GENPD_RETRIES / 2)
> + udelay(GENPD_DELAY_US);
> cond_resched();
> }
>
> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
> {
> struct of_phandle_args pd_args;
> struct generic_pm_domain *pd;
> + unsigned int i;
> int ret;
>
> if (!dev->of_node)
> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>
> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>
> - while (1) {
> + for (i = 0; i < GENPD_RETRIES; i++) {
> ret = pm_genpd_add_device(pd, dev);
> if (ret != -EAGAIN)
> break;
> +
> + if (i > GENPD_RETRIES / 2)
> + udelay(GENPD_DELAY_US);
In this execution path, we retry when getting -EAGAIN while believing
the reason to the error are only *temporary* as we are soon waiting
for all devices in the genpd to be system PM resumed. At least that's
my understanding to why we want to deal with -EAGAIN here, but I might
be wrong.
In this regards, I wonder whether it could be better to re-try only a
few times but with a far longer interval time than a couple us. What
do you think?
However, what if the reason to why we get -EAGAIN isn't *temporary*,
because we are about to enter system PM suspend state. Then the caller
of this function which comes via some bus' ->probe(), will hang until
the a system PM resume is completed. Is that really going to work? So,
for this case your limited re-try approach will affect this scenario
as well, have you considered that?
> cond_resched();
> }
>
> --
> 1.9.1
>
Kind regards
Uffe
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 12:50 ` Ulf Hansson
0 siblings, 0 replies; 38+ messages in thread
From: Ulf Hansson @ 2015-06-23 12:50 UTC (permalink / raw)
To: linux-sh
On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>
> This may happen due to a genpd.prepared_count imbalance. This is a bug
> elsewhere, but it will result in a system lock up, possibly during
> reboot of an otherwise functioning system.
>
> To avoid this, put a limit on the maximum number of loop iterations,
> including a simple back-off mechanism. If the limit is reached, the
> operation will just fail. An error message is already printed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> drivers/base/power/domain.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index cdd547bd67df8218..60e0309dd8dd0264 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -6,6 +6,7 @@
> * This file is released under the GPLv2.
> */
>
> +#include <linux/delay.h>
> #include <linux/kernel.h>
> #include <linux/io.h>
> #include <linux/platform_device.h>
> @@ -19,6 +20,9 @@
> #include <linux/suspend.h>
> #include <linux/export.h>
>
> +#define GENPD_RETRIES 20
> +#define GENPD_DELAY_US 10
> +
> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
> ({ \
> type (*__routine)(struct device *__d); \
> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
> {
> struct generic_pm_domain *pd;
> + unsigned int i;
> int ret = 0;
>
> pd = pm_genpd_lookup_dev(dev);
> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>
> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>
> - while (1) {
> + for (i = 0; i < GENPD_RETRIES; i++) {
> ret = pm_genpd_remove_device(pd, dev);
> if (ret != -EAGAIN)
> break;
> +
> + if (i > GENPD_RETRIES / 2)
> + udelay(GENPD_DELAY_US);
> cond_resched();
> }
>
> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
> {
> struct of_phandle_args pd_args;
> struct generic_pm_domain *pd;
> + unsigned int i;
> int ret;
>
> if (!dev->of_node)
> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>
> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>
> - while (1) {
> + for (i = 0; i < GENPD_RETRIES; i++) {
> ret = pm_genpd_add_device(pd, dev);
> if (ret != -EAGAIN)
> break;
> +
> + if (i > GENPD_RETRIES / 2)
> + udelay(GENPD_DELAY_US);
In this execution path, we retry when getting -EAGAIN while believing
the reason to the error are only *temporary* as we are soon waiting
for all devices in the genpd to be system PM resumed. At least that's
my understanding to why we want to deal with -EAGAIN here, but I might
be wrong.
In this regards, I wonder whether it could be better to re-try only a
few times but with a far longer interval time than a couple us. What
do you think?
However, what if the reason to why we get -EAGAIN isn't *temporary*,
because we are about to enter system PM suspend state. Then the caller
of this function which comes via some bus' ->probe(), will hang until
the a system PM resume is completed. Is that really going to work? So,
for this case your limited re-try approach will affect this scenario
as well, have you considered that?
> cond_resched();
> }
>
> --
> 1.9.1
>
Kind regards
Uffe
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-23 13:20 ` Geert Uytterhoeven
-1 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 13:20 UTC (permalink / raw)
To: Ulf Hansson
Cc: Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Rafael J. Wysocki, Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
Hi Ulf,
On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>> drivers/base/power/domain.c | 16 ++++++++++++++--
>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -6,6 +6,7 @@
>> * This file is released under the GPLv2.
>> */
>>
>> +#include <linux/delay.h>
>> #include <linux/kernel.h>
>> #include <linux/io.h>
>> #include <linux/platform_device.h>
>> @@ -19,6 +20,9 @@
>> #include <linux/suspend.h>
>> #include <linux/export.h>
>>
>> +#define GENPD_RETRIES 20
>> +#define GENPD_DELAY_US 10
>> +
>> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
>> ({ \
>> type (*__routine)(struct device *__d); \
>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>> {
>> struct generic_pm_domain *pd;
>> + unsigned int i;
>> int ret = 0;
>>
>> pd = pm_genpd_lookup_dev(dev);
>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>
>> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>
>> - while (1) {
>> + for (i = 0; i < GENPD_RETRIES; i++) {
>> ret = pm_genpd_remove_device(pd, dev);
>> if (ret != -EAGAIN)
>> break;
>> +
>> + if (i > GENPD_RETRIES / 2)
>> + udelay(GENPD_DELAY_US);
>> cond_resched();
>> }
>>
>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>> {
>> struct of_phandle_args pd_args;
>> struct generic_pm_domain *pd;
>> + unsigned int i;
>> int ret;
>>
>> if (!dev->of_node)
>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>
>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>
>> - while (1) {
>> + for (i = 0; i < GENPD_RETRIES; i++) {
>> ret = pm_genpd_add_device(pd, dev);
>> if (ret != -EAGAIN)
>> break;
>> +
>> + if (i > GENPD_RETRIES / 2)
>> + udelay(GENPD_DELAY_US);
>
> In this execution path, we retry when getting -EAGAIN while believing
> the reason to the error are only *temporary* as we are soon waiting
> for all devices in the genpd to be system PM resumed. At least that's
> my understanding to why we want to deal with -EAGAIN here, but I might
> be wrong.
>
> In this regards, I wonder whether it could be better to re-try only a
> few times but with a far longer interval time than a couple us. What
> do you think?
That's indeed viable. I have no idea for how long this temporary state can
extend.
> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> because we are about to enter system PM suspend state. Then the caller
> of this function which comes via some bus' ->probe(), will hang until
> the a system PM resume is completed. Is that really going to work? So,
> for this case your limited re-try approach will affect this scenario
> as well, have you considered that?
There's a limit on the number of retries, so it won't hang indefinitely.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 13:20 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 13:20 UTC (permalink / raw)
To: linux-sh
Hi Ulf,
On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>> drivers/base/power/domain.c | 16 ++++++++++++++--
>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -6,6 +6,7 @@
>> * This file is released under the GPLv2.
>> */
>>
>> +#include <linux/delay.h>
>> #include <linux/kernel.h>
>> #include <linux/io.h>
>> #include <linux/platform_device.h>
>> @@ -19,6 +20,9 @@
>> #include <linux/suspend.h>
>> #include <linux/export.h>
>>
>> +#define GENPD_RETRIES 20
>> +#define GENPD_DELAY_US 10
>> +
>> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
>> ({ \
>> type (*__routine)(struct device *__d); \
>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>> {
>> struct generic_pm_domain *pd;
>> + unsigned int i;
>> int ret = 0;
>>
>> pd = pm_genpd_lookup_dev(dev);
>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>
>> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>
>> - while (1) {
>> + for (i = 0; i < GENPD_RETRIES; i++) {
>> ret = pm_genpd_remove_device(pd, dev);
>> if (ret != -EAGAIN)
>> break;
>> +
>> + if (i > GENPD_RETRIES / 2)
>> + udelay(GENPD_DELAY_US);
>> cond_resched();
>> }
>>
>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>> {
>> struct of_phandle_args pd_args;
>> struct generic_pm_domain *pd;
>> + unsigned int i;
>> int ret;
>>
>> if (!dev->of_node)
>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>
>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>
>> - while (1) {
>> + for (i = 0; i < GENPD_RETRIES; i++) {
>> ret = pm_genpd_add_device(pd, dev);
>> if (ret != -EAGAIN)
>> break;
>> +
>> + if (i > GENPD_RETRIES / 2)
>> + udelay(GENPD_DELAY_US);
>
> In this execution path, we retry when getting -EAGAIN while believing
> the reason to the error are only *temporary* as we are soon waiting
> for all devices in the genpd to be system PM resumed. At least that's
> my understanding to why we want to deal with -EAGAIN here, but I might
> be wrong.
>
> In this regards, I wonder whether it could be better to re-try only a
> few times but with a far longer interval time than a couple us. What
> do you think?
That's indeed viable. I have no idea for how long this temporary state can
extend.
> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> because we are about to enter system PM suspend state. Then the caller
> of this function which comes via some bus' ->probe(), will hang until
> the a system PM resume is completed. Is that really going to work? So,
> for this case your limited re-try approach will affect this scenario
> as well, have you considered that?
There's a limit on the number of retries, so it won't hang indefinitely.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-23 13:38 ` Rafael J. Wysocki
-1 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-23 13:38 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Ulf Hansson, Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Rafael J. Wysocki, Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
On Tue, Jun 23, 2015 at 3:20 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Ulf,
>
> On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
>>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>>
>>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>>> elsewhere, but it will result in a system lock up, possibly during
>>> reboot of an otherwise functioning system.
>>>
>>> To avoid this, put a limit on the maximum number of loop iterations,
>>> including a simple back-off mechanism. If the limit is reached, the
>>> operation will just fail. An error message is already printed.
>>>
>>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>>> ---
>>> drivers/base/power/domain.c | 16 ++++++++++++++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>>> --- a/drivers/base/power/domain.c
>>> +++ b/drivers/base/power/domain.c
>>> @@ -6,6 +6,7 @@
>>> * This file is released under the GPLv2.
>>> */
>>>
>>> +#include <linux/delay.h>
>>> #include <linux/kernel.h>
>>> #include <linux/io.h>
>>> #include <linux/platform_device.h>
>>> @@ -19,6 +20,9 @@
>>> #include <linux/suspend.h>
>>> #include <linux/export.h>
>>>
>>> +#define GENPD_RETRIES 20
>>> +#define GENPD_DELAY_US 10
>>> +
>>> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
>>> ({ \
>>> type (*__routine)(struct device *__d); \
>>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>>> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>> {
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret = 0;
>>>
>>> pd = pm_genpd_lookup_dev(dev);
>>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>>
>>> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_remove_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>> cond_resched();
>>> }
>>>
>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>> {
>>> struct of_phandle_args pd_args;
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret;
>>>
>>> if (!dev->of_node)
>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>
>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_add_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>
>> In this execution path, we retry when getting -EAGAIN while believing
>> the reason to the error are only *temporary* as we are soon waiting
>> for all devices in the genpd to be system PM resumed. At least that's
>> my understanding to why we want to deal with -EAGAIN here, but I might
>> be wrong.
>>
>> In this regards, I wonder whether it could be better to re-try only a
>> few times but with a far longer interval time than a couple us. What
>> do you think?
>
> That's indeed viable. I have no idea for how long this temporary state can
> extend.
A usual approach to this kind of thing is to use exponential fallback
where you increase the delay twice with respect to the previous one
every time.
Rafael
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 13:38 ` Rafael J. Wysocki
0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-23 13:38 UTC (permalink / raw)
To: linux-sh
On Tue, Jun 23, 2015 at 3:20 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Ulf,
>
> On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
>>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>>
>>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>>> elsewhere, but it will result in a system lock up, possibly during
>>> reboot of an otherwise functioning system.
>>>
>>> To avoid this, put a limit on the maximum number of loop iterations,
>>> including a simple back-off mechanism. If the limit is reached, the
>>> operation will just fail. An error message is already printed.
>>>
>>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>>> ---
>>> drivers/base/power/domain.c | 16 ++++++++++++++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>>> --- a/drivers/base/power/domain.c
>>> +++ b/drivers/base/power/domain.c
>>> @@ -6,6 +6,7 @@
>>> * This file is released under the GPLv2.
>>> */
>>>
>>> +#include <linux/delay.h>
>>> #include <linux/kernel.h>
>>> #include <linux/io.h>
>>> #include <linux/platform_device.h>
>>> @@ -19,6 +20,9 @@
>>> #include <linux/suspend.h>
>>> #include <linux/export.h>
>>>
>>> +#define GENPD_RETRIES 20
>>> +#define GENPD_DELAY_US 10
>>> +
>>> #define GENPD_DEV_CALLBACK(genpd, type, callback, dev) \
>>> ({ \
>>> type (*__routine)(struct device *__d); \
>>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>>> static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>> {
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret = 0;
>>>
>>> pd = pm_genpd_lookup_dev(dev);
>>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>>
>>> dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_remove_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>> cond_resched();
>>> }
>>>
>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>> {
>>> struct of_phandle_args pd_args;
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret;
>>>
>>> if (!dev->of_node)
>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>
>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_add_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>
>> In this execution path, we retry when getting -EAGAIN while believing
>> the reason to the error are only *temporary* as we are soon waiting
>> for all devices in the genpd to be system PM resumed. At least that's
>> my understanding to why we want to deal with -EAGAIN here, but I might
>> be wrong.
>>
>> In this regards, I wonder whether it could be better to re-try only a
>> few times but with a far longer interval time than a couple us. What
>> do you think?
>
> That's indeed viable. I have no idea for how long this temporary state can
> extend.
A usual approach to this kind of thing is to use exponential fallback
where you increase the delay twice with respect to the previous one
every time.
Rafael
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-23 13:45 ` Geert Uytterhoeven
-1 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 13:45 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Ulf Hansson, Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Rafael J. Wysocki, Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
Hi Rafael,
On Tue, Jun 23, 2015 at 3:38 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>>
>>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>>
>>>> - while (1) {
>>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>>> ret = pm_genpd_add_device(pd, dev);
>>>> if (ret != -EAGAIN)
>>>> break;
>>>> +
>>>> + if (i > GENPD_RETRIES / 2)
>>>> + udelay(GENPD_DELAY_US);
>>>
>>> In this execution path, we retry when getting -EAGAIN while believing
>>> the reason to the error are only *temporary* as we are soon waiting
>>> for all devices in the genpd to be system PM resumed. At least that's
>>> my understanding to why we want to deal with -EAGAIN here, but I might
>>> be wrong.
>>>
>>> In this regards, I wonder whether it could be better to re-try only a
>>> few times but with a far longer interval time than a couple us. What
>>> do you think?
>>
>> That's indeed viable. I have no idea for how long this temporary state can
>> extend.
>
> A usual approach to this kind of thing is to use exponential fallback
> where you increase the delay twice with respect to the previous one
> every time.
Right, but when do you give up?
Note that udelay() is a busy loop. Should it fall back to msleep() after
a while?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-23 13:45 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-23 13:45 UTC (permalink / raw)
To: linux-sh
Hi Rafael,
On Tue, Jun 23, 2015 at 3:38 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>>
>>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>>
>>>> - while (1) {
>>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>>> ret = pm_genpd_add_device(pd, dev);
>>>> if (ret != -EAGAIN)
>>>> break;
>>>> +
>>>> + if (i > GENPD_RETRIES / 2)
>>>> + udelay(GENPD_DELAY_US);
>>>
>>> In this execution path, we retry when getting -EAGAIN while believing
>>> the reason to the error are only *temporary* as we are soon waiting
>>> for all devices in the genpd to be system PM resumed. At least that's
>>> my understanding to why we want to deal with -EAGAIN here, but I might
>>> be wrong.
>>>
>>> In this regards, I wonder whether it could be better to re-try only a
>>> few times but with a far longer interval time than a couple us. What
>>> do you think?
>>
>> That's indeed viable. I have no idea for how long this temporary state can
>> extend.
>
> A usual approach to this kind of thing is to use exponential fallback
> where you increase the delay twice with respect to the previous one
> every time.
Right, but when do you give up?
Note that udelay() is a busy loop. Should it fall back to msleep() after
a while?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-24 8:33 ` Ulf Hansson
-1 siblings, 0 replies; 38+ messages in thread
From: Ulf Hansson @ 2015-06-24 8:33 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Rafael J. Wysocki, Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
[...]
>>>
>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>> {
>>> struct of_phandle_args pd_args;
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret;
>>>
>>> if (!dev->of_node)
>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>
>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_add_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>
>> In this execution path, we retry when getting -EAGAIN while believing
>> the reason to the error are only *temporary* as we are soon waiting
>> for all devices in the genpd to be system PM resumed. At least that's
>> my understanding to why we want to deal with -EAGAIN here, but I might
>> be wrong.
>>
>> In this regards, I wonder whether it could be better to re-try only a
>> few times but with a far longer interval time than a couple us. What
>> do you think?
>
> That's indeed viable. I have no idea for how long this temporary state can
> extend.
That will depend on the system PM resume time for the devices residing
in the genpd. So, I guess we need a guestimate then. How about a total
sleep time of a few seconds?
>
>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
>> because we are about to enter system PM suspend state. Then the caller
>> of this function which comes via some bus' ->probe(), will hang until
>> the a system PM resume is completed. Is that really going to work? So,
>> for this case your limited re-try approach will affect this scenario
>> as well, have you considered that?
>
> There's a limit on the number of retries, so it won't hang indefinitely.
What happens with the timer functions (like msleep()) during the
system PM suspend transition?
Kind regards
Uffe
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-24 8:33 ` Ulf Hansson
0 siblings, 0 replies; 38+ messages in thread
From: Ulf Hansson @ 2015-06-24 8:33 UTC (permalink / raw)
To: linux-sh
[...]
>>>
>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>> {
>>> struct of_phandle_args pd_args;
>>> struct generic_pm_domain *pd;
>>> + unsigned int i;
>>> int ret;
>>>
>>> if (!dev->of_node)
>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>
>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>
>>> - while (1) {
>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>> ret = pm_genpd_add_device(pd, dev);
>>> if (ret != -EAGAIN)
>>> break;
>>> +
>>> + if (i > GENPD_RETRIES / 2)
>>> + udelay(GENPD_DELAY_US);
>>
>> In this execution path, we retry when getting -EAGAIN while believing
>> the reason to the error are only *temporary* as we are soon waiting
>> for all devices in the genpd to be system PM resumed. At least that's
>> my understanding to why we want to deal with -EAGAIN here, but I might
>> be wrong.
>>
>> In this regards, I wonder whether it could be better to re-try only a
>> few times but with a far longer interval time than a couple us. What
>> do you think?
>
> That's indeed viable. I have no idea for how long this temporary state can
> extend.
That will depend on the system PM resume time for the devices residing
in the genpd. So, I guess we need a guestimate then. How about a total
sleep time of a few seconds?
>
>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
>> because we are about to enter system PM suspend state. Then the caller
>> of this function which comes via some bus' ->probe(), will hang until
>> the a system PM resume is completed. Is that really going to work? So,
>> for this case your limited re-try approach will affect this scenario
>> as well, have you considered that?
>
> There's a limit on the number of retries, so it won't hang indefinitely.
What happens with the timer functions (like msleep()) during the
system PM suspend transition?
Kind regards
Uffe
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-24 8:35 ` Geert Uytterhoeven
-1 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-24 8:35 UTC (permalink / raw)
To: Ulf Hansson
Cc: Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Rafael J. Wysocki, Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
On Wed, Jun 24, 2015 at 10:33 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> [...]
>
>>>>
>>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>>> {
>>>> struct of_phandle_args pd_args;
>>>> struct generic_pm_domain *pd;
>>>> + unsigned int i;
>>>> int ret;
>>>>
>>>> if (!dev->of_node)
>>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>>
>>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>>
>>>> - while (1) {
>>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>>> ret = pm_genpd_add_device(pd, dev);
>>>> if (ret != -EAGAIN)
>>>> break;
>>>> +
>>>> + if (i > GENPD_RETRIES / 2)
>>>> + udelay(GENPD_DELAY_US);
>>>
>>> In this execution path, we retry when getting -EAGAIN while believing
>>> the reason to the error are only *temporary* as we are soon waiting
>>> for all devices in the genpd to be system PM resumed. At least that's
>>> my understanding to why we want to deal with -EAGAIN here, but I might
>>> be wrong.
>>>
>>> In this regards, I wonder whether it could be better to re-try only a
>>> few times but with a far longer interval time than a couple us. What
>>> do you think?
>>
>> That's indeed viable. I have no idea for how long this temporary state can
>> extend.
>
> That will depend on the system PM resume time for the devices residing
> in the genpd. So, I guess we need a guestimate then. How about a total
> sleep time of a few seconds?
>
>>
>>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
>>> because we are about to enter system PM suspend state. Then the caller
>>> of this function which comes via some bus' ->probe(), will hang until
>>> the a system PM resume is completed. Is that really going to work? So,
>>> for this case your limited re-try approach will affect this scenario
>>> as well, have you considered that?
>>
>> There's a limit on the number of retries, so it won't hang indefinitely.
>
> What happens with the timer functions (like msleep()) during the
> system PM suspend transition?
I guess we can no longer call msleep() after syscore suspend?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-24 8:35 ` Geert Uytterhoeven
0 siblings, 0 replies; 38+ messages in thread
From: Geert Uytterhoeven @ 2015-06-24 8:35 UTC (permalink / raw)
To: linux-sh
On Wed, Jun 24, 2015 at 10:33 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> [...]
>
>>>>
>>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>>> {
>>>> struct of_phandle_args pd_args;
>>>> struct generic_pm_domain *pd;
>>>> + unsigned int i;
>>>> int ret;
>>>>
>>>> if (!dev->of_node)
>>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>>>
>>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>>>
>>>> - while (1) {
>>>> + for (i = 0; i < GENPD_RETRIES; i++) {
>>>> ret = pm_genpd_add_device(pd, dev);
>>>> if (ret != -EAGAIN)
>>>> break;
>>>> +
>>>> + if (i > GENPD_RETRIES / 2)
>>>> + udelay(GENPD_DELAY_US);
>>>
>>> In this execution path, we retry when getting -EAGAIN while believing
>>> the reason to the error are only *temporary* as we are soon waiting
>>> for all devices in the genpd to be system PM resumed. At least that's
>>> my understanding to why we want to deal with -EAGAIN here, but I might
>>> be wrong.
>>>
>>> In this regards, I wonder whether it could be better to re-try only a
>>> few times but with a far longer interval time than a couple us. What
>>> do you think?
>>
>> That's indeed viable. I have no idea for how long this temporary state can
>> extend.
>
> That will depend on the system PM resume time for the devices residing
> in the genpd. So, I guess we need a guestimate then. How about a total
> sleep time of a few seconds?
>
>>
>>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
>>> because we are about to enter system PM suspend state. Then the caller
>>> of this function which comes via some bus' ->probe(), will hang until
>>> the a system PM resume is completed. Is that really going to work? So,
>>> for this case your limited re-try approach will affect this scenario
>>> as well, have you considered that?
>>
>> There's a limit on the number of retries, so it won't hang indefinitely.
>
> What happens with the timer functions (like msleep()) during the
> system PM suspend transition?
I guess we can no longer call msleep() after syscore suspend?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
2015-06-18 10:22 ` Geert Uytterhoeven
@ 2015-06-24 13:48 ` Rafael J. Wysocki
-1 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-24 13:48 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Ulf Hansson, Geert Uytterhoeven, Daniel Lezcano, Thomas Gleixner,
Kevin Hilman, Magnus Damm, Laurent Pinchart,
linux-pm@vger.kernel.org, Linux-sh list,
linux-kernel@vger.kernel.org
On Wednesday, June 24, 2015 10:35:44 AM Geert Uytterhoeven wrote:
> On Wed, Jun 24, 2015 at 10:33 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > [...]
> >
> >>>>
> >>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
> >>>> {
> >>>> struct of_phandle_args pd_args;
> >>>> struct generic_pm_domain *pd;
> >>>> + unsigned int i;
> >>>> int ret;
> >>>>
> >>>> if (!dev->of_node)
> >>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
> >>>>
> >>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
> >>>>
> >>>> - while (1) {
> >>>> + for (i = 0; i < GENPD_RETRIES; i++) {
> >>>> ret = pm_genpd_add_device(pd, dev);
> >>>> if (ret != -EAGAIN)
> >>>> break;
> >>>> +
> >>>> + if (i > GENPD_RETRIES / 2)
> >>>> + udelay(GENPD_DELAY_US);
> >>>
> >>> In this execution path, we retry when getting -EAGAIN while believing
> >>> the reason to the error are only *temporary* as we are soon waiting
> >>> for all devices in the genpd to be system PM resumed. At least that's
> >>> my understanding to why we want to deal with -EAGAIN here, but I might
> >>> be wrong.
> >>>
> >>> In this regards, I wonder whether it could be better to re-try only a
> >>> few times but with a far longer interval time than a couple us. What
> >>> do you think?
> >>
> >> That's indeed viable. I have no idea for how long this temporary state can
> >> extend.
> >
> > That will depend on the system PM resume time for the devices residing
> > in the genpd. So, I guess we need a guestimate then. How about a total
> > sleep time of a few seconds?
> >
> >>
> >>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> >>> because we are about to enter system PM suspend state. Then the caller
> >>> of this function which comes via some bus' ->probe(), will hang until
> >>> the a system PM resume is completed. Is that really going to work? So,
> >>> for this case your limited re-try approach will affect this scenario
> >>> as well, have you considered that?
> >>
> >> There's a limit on the number of retries, so it won't hang indefinitely.
> >
> > What happens with the timer functions (like msleep()) during the
> > system PM suspend transition?
>
> I guess we can no longer call msleep() after syscore suspend?
That's correct. Time is effectively frozen at that point.
Rafael
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
@ 2015-06-24 13:48 ` Rafael J. Wysocki
0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2015-06-24 13:48 UTC (permalink / raw)
To: linux-sh
On Wednesday, June 24, 2015 10:35:44 AM Geert Uytterhoeven wrote:
> On Wed, Jun 24, 2015 at 10:33 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > [...]
> >
> >>>>
> >>>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
> >>>> {
> >>>> struct of_phandle_args pd_args;
> >>>> struct generic_pm_domain *pd;
> >>>> + unsigned int i;
> >>>> int ret;
> >>>>
> >>>> if (!dev->of_node)
> >>>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
> >>>>
> >>>> dev_dbg(dev, "adding to PM domain %s\n", pd->name);
> >>>>
> >>>> - while (1) {
> >>>> + for (i = 0; i < GENPD_RETRIES; i++) {
> >>>> ret = pm_genpd_add_device(pd, dev);
> >>>> if (ret != -EAGAIN)
> >>>> break;
> >>>> +
> >>>> + if (i > GENPD_RETRIES / 2)
> >>>> + udelay(GENPD_DELAY_US);
> >>>
> >>> In this execution path, we retry when getting -EAGAIN while believing
> >>> the reason to the error are only *temporary* as we are soon waiting
> >>> for all devices in the genpd to be system PM resumed. At least that's
> >>> my understanding to why we want to deal with -EAGAIN here, but I might
> >>> be wrong.
> >>>
> >>> In this regards, I wonder whether it could be better to re-try only a
> >>> few times but with a far longer interval time than a couple us. What
> >>> do you think?
> >>
> >> That's indeed viable. I have no idea for how long this temporary state can
> >> extend.
> >
> > That will depend on the system PM resume time for the devices residing
> > in the genpd. So, I guess we need a guestimate then. How about a total
> > sleep time of a few seconds?
> >
> >>
> >>> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> >>> because we are about to enter system PM suspend state. Then the caller
> >>> of this function which comes via some bus' ->probe(), will hang until
> >>> the a system PM resume is completed. Is that really going to work? So,
> >>> for this case your limited re-try approach will affect this scenario
> >>> as well, have you considered that?
> >>
> >> There's a limit on the number of retries, so it won't hang indefinitely.
> >
> > What happens with the timer functions (like msleep()) during the
> > system PM suspend transition?
>
> I guess we can no longer call msleep() after syscore suspend?
That's correct. Time is effectively frozen at that point.
Rafael
^ permalink raw reply [flat|nested] 38+ messages in thread