From: Geert Uytterhoeven <geert@linux-m68k.org>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Kevin Hilman <khilman@kernel.org>,
	Magnus Damm <damm@opensource.se>,
	Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Linux-sh list <linux-sh@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] PM / Domains: Avoid infinite loops in attach/detach code
Date: Tue, 23 Jun 2015 15:20:40 +0200
Message-ID: <CAMuHMdX4cWmkx_W4P8vh6soZD_LzHJWgL=QpVUrKdubo2LaePg@mail.gmail.com>
In-Reply-To: <CAPDyKFpRMrJZxZPv8w+jLEGAHu8U+AGSeG3LaJMXGW9zn__o+g@mail.gmail.com>

Hi Ulf,

On Tue, Jun 23, 2015 at 2:50 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 22 June 2015 at 09:31, Geert Uytterhoeven <geert+renesas@glider.be> wrote:
>> If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
>> up with an infinite loop in genpd_dev_pm_{at,de}tach().
>>
>> This may happen due to a genpd.prepared_count imbalance. This is a bug
>> elsewhere, but it will result in a system lock up, possibly during
>> reboot of an otherwise functioning system.
>>
>> To avoid this, put a limit on the maximum number of loop iterations,
>> including a simple back-off mechanism. If the limit is reached, the
>> operation will just fail. An error message is already printed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>>  drivers/base/power/domain.c | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index cdd547bd67df8218..60e0309dd8dd0264 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -6,6 +6,7 @@
>>   * This file is released under the GPLv2.
>>   */
>>
>> +#include <linux/delay.h>
>>  #include <linux/kernel.h>
>>  #include <linux/io.h>
>>  #include <linux/platform_device.h>
>> @@ -19,6 +20,9 @@
>>  #include <linux/suspend.h>
>>  #include <linux/export.h>
>>
>> +#define GENPD_RETRIES	20
>> +#define GENPD_DELAY_US	10
>> +
>>  #define GENPD_DEV_CALLBACK(genpd, type, callback, dev)	\
>>  ({								\
>>  	type (*__routine)(struct device *__d);			\
>> @@ -2131,6 +2135,7 @@ EXPORT_SYMBOL_GPL(of_genpd_get_from_provider);
>>  static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>  {
>>  	struct generic_pm_domain *pd;
>> +	unsigned int i;
>>  	int ret = 0;
>>
>>  	pd = pm_genpd_lookup_dev(dev);
>> @@ -2139,10 +2144,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off)
>>
>>  	dev_dbg(dev, "removing from PM domain %s\n", pd->name);
>>
>> -	while (1) {
>> +	for (i = 0; i < GENPD_RETRIES; i++) {
>>  		ret = pm_genpd_remove_device(pd, dev);
>>  		if (ret != -EAGAIN)
>>  			break;
>> +
>> +		if (i > GENPD_RETRIES / 2)
>> +			udelay(GENPD_DELAY_US);
>>  		cond_resched();
>>  	}
>>
>> @@ -2183,6 +2191,7 @@ int genpd_dev_pm_attach(struct device *dev)
>>  {
>>  	struct of_phandle_args pd_args;
>>  	struct generic_pm_domain *pd;
>> +	unsigned int i;
>>  	int ret;
>>
>>  	if (!dev->of_node)
>> @@ -2218,10 +2227,13 @@ int genpd_dev_pm_attach(struct device *dev)
>>
>>  	dev_dbg(dev, "adding to PM domain %s\n", pd->name);
>>
>> -	while (1) {
>> +	for (i = 0; i < GENPD_RETRIES; i++) {
>>  		ret = pm_genpd_add_device(pd, dev);
>>  		if (ret != -EAGAIN)
>>  			break;
>> +
>> +		if (i > GENPD_RETRIES / 2)
>> +			udelay(GENPD_DELAY_US);
>
> In this execution path, we retry when getting -EAGAIN while believing
> the reason to the error are only *temporary* as we are soon waiting
> for all devices in the genpd to be system PM resumed. At least that's
> my understanding to why we want to deal with -EAGAIN here, but I might
> be wrong.
>
> In this regards, I wonder whether it could be better to re-try only a
> few times but with a far longer interval time than a couple us. What
> do you think?

That's indeed viable. I have no idea for how long this temporary state can
extend.

> However, what if the reason to why we get -EAGAIN isn't *temporary*,
> because we are about to enter system PM suspend state. Then the caller
> of this function which comes via some bus' ->probe(), will hang until
> the a system PM resume is completed. Is that really going to work? So,
> for this case your limited re-try approach will affect this scenario
> as well, have you considered that?

There's a limit on the number of retries, so it won't hang indefinitely.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
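
For readers following the thread, the bounded retry loop discussed above can be tried outside the kernel. The following is only a user-space sketch under stated assumptions, not the code that was merged: `retry_bounded()`, `delay_us()`, `op_fn` and `flaky_op()` are hypothetical names introduced for illustration, `nanosleep()` stands in for the kernel's `udelay()`, and `cond_resched()` is omitted. `RETRIES` and `DELAY_US` mirror `GENPD_RETRIES` and `GENPD_DELAY_US` from the quoted patch.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define RETRIES		20	/* mirrors GENPD_RETRIES in the quoted patch */
#define DELAY_US	10	/* mirrors GENPD_DELAY_US in the quoted patch */

/* Hypothetical operation type: returns 0, -EAGAIN, or another negative errno. */
typedef int (*op_fn)(void *ctx);

/* User-space stand-in for udelay(): sleep for roughly 'us' microseconds. */
static void delay_us(long us)
{
	struct timespec ts = {
		.tv_sec = us / 1000000,
		.tv_nsec = (us % 1000000) * 1000,
	};

	nanosleep(&ts, NULL);
}

/*
 * Call op() at most 'retries' times, stopping as soon as it returns
 * anything other than -EAGAIN. As in the patch, the delay only kicks
 * in during the second half of the attempts.
 */
static int retry_bounded(op_fn op, void *ctx, unsigned int retries, long us)
{
	int ret = -EAGAIN;
	unsigned int i;

	assert(retries > 0);

	for (i = 0; i < retries; i++) {
		ret = op(ctx);
		if (ret != -EAGAIN)
			break;

		if (i > retries / 2)
			delay_us(us);
	}

	/* Still -EAGAIN here means the retry budget was exhausted. */
	return ret;
}

/* Hypothetical test operation: fails with -EAGAIN until the counter hits zero. */
static int flaky_op(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

Calling `retry_bounded(flaky_op, &n, RETRIES, DELAY_US)` with a small `n` succeeds after a few attempts, while a counter larger than `RETRIES` makes the call give up and return `-EAGAIN` instead of spinning forever. Ulf's alternative (fewer attempts, much longer interval) would correspond to passing a smaller `retries` and a larger `us`; which values are appropriate is exactly the open question in the thread.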