All the mail mirrored from lore.kernel.org
* [PATCH] watchdog: add a parameter for stop wdt before register
@ 2014-01-14  8:23 Dave Young
  2014-01-14  8:26 ` Wim Van Sebroeck
  2014-01-14 12:16 ` One Thousand Gnomes
  0 siblings, 2 replies; 13+ messages in thread
From: Dave Young @ 2014-01-14  8:23 UTC
  To: wim; +Cc: dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

In the kdump kernel, the watchdog can interrupt vmcore capturing because
we have no way to disable or stop it when the crash happens.

Add a module parameter, stop_before_register, so that the watchdog can be
stopped before registration in the driver loading path. We can then load
the watchdog driver as early as possible in the kdump kernel to ensure
vmcore capturing.

Don Zickus mentioned the case where the BIOS starts the watchdog and the
kernel is expected to keep it alive. To address this case, the module
parameter is false by default, so the watchdog is stopped only when the
user provides "watchdog.stop_before_register=1" on the kernel command line.

Signed-off-by: Dave Young <dyoung@redhat.com>
---
 drivers/watchdog/watchdog_core.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- linux.orig/drivers/watchdog/watchdog_core.c
+++ linux/drivers/watchdog/watchdog_core.c
@@ -42,6 +42,7 @@
 
 static DEFINE_IDA(watchdog_ida);
 static struct class *watchdog_class;
+static bool stop_before_register;
 
 static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
 {
@@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
 	if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
 		return -EINVAL;
 
+	if (stop_before_register)
+		wdd->ops->stop(wdd);
+
 	watchdog_check_min_max_timeout(wdd);
 
 	/*
@@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
 subsys_initcall(watchdog_init);
 module_exit(watchdog_exit);
 
+module_param(stop_before_register, bool, 0644);
+
 MODULE_AUTHOR("Alan Cox <alan@lxorguk.ukuu.org.uk>");
 MODULE_AUTHOR("Wim Van Sebroeck <wim@iguana.be>");
 MODULE_DESCRIPTION("WatchDog Timer Driver Core");


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14  8:23 [PATCH] watchdog: add a parameter for stop wdt before register Dave Young
@ 2014-01-14  8:26 ` Wim Van Sebroeck
  2014-01-14  8:41   ` Dave Young
  2014-01-14 12:16 ` One Thousand Gnomes
  1 sibling, 1 reply; 13+ messages in thread
From: Wim Van Sebroeck @ 2014-01-14  8:26 UTC
  To: Dave Young; +Cc: dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

Hi Dave,

> In kdump kernel watchdog could interrupt vmcore capturing because we
> have no way to disable/stop it while crashing happens.
> 
> Add a module parameter stop_before_register so watchdog can be stopped
> before register in driver loading path. Thus we can try to load the
> watchdog driver as early as possible in kdump kernel to ensure vmcore
> capturing.
> 
> Don Zickus mentioned that there's the case that bios start the watchdog
> and it is expected that the kernel keep the watchdog alive. To address
> this case I added the module parameter which is false by default so
> it will stop the watchdog only when user provice kernel cmdline
> "watchdog.stop_before_register=1".  
> 
> Signed-off-by: Dave Young <dyoung@redhat.com>
> ---
>  drivers/watchdog/watchdog_core.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- linux.orig/drivers/watchdog/watchdog_core.c
> +++ linux/drivers/watchdog/watchdog_core.c
> @@ -42,6 +42,7 @@
>  
>  static DEFINE_IDA(watchdog_ida);
>  static struct class *watchdog_class;
> +static bool stop_before_register;
>  
>  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
>  {
> @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
>  	if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
>  		return -EINVAL;
>  
> +	if (stop_before_register)
> +		wdd->ops->stop(wdd);
> +
>  	watchdog_check_min_max_timeout(wdd);
>  
>  	/*
> @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
>  subsys_initcall(watchdog_init);
>  module_exit(watchdog_exit);
>  
> +module_param(stop_before_register, bool, 0644);
> +
>  MODULE_AUTHOR("Alan Cox <alan@lxorguk.ukuu.org.uk>");
>  MODULE_AUTHOR("Wim Van Sebroeck <wim@iguana.be>");
>  MODULE_DESCRIPTION("WatchDog Timer Driver Core");

Hmm, I need to look closer at this, but my first thought is:
what about devices that cannot be stopped once started?
They should be able to override this module parameter...

Kind regards,
Wim.



* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14  8:26 ` Wim Van Sebroeck
@ 2014-01-14  8:41   ` Dave Young
  2014-01-14  9:44     ` Dave Young
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Young @ 2014-01-14  8:41 UTC
  To: Wim Van Sebroeck; +Cc: dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
> Hi Dave,
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> > 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> > 
> > Don Zickus mentioned that there's the case that bios start the watchdog
> > and it is expected that the kernel keep the watchdog alive. To address
> > this case I added the module parameter which is false by default so
> > it will stop the watchdog only when user provice kernel cmdline
> > "watchdog.stop_before_register=1".  
> > 
> > Signed-off-by: Dave Young <dyoung@redhat.com>
> > ---
> >  drivers/watchdog/watchdog_core.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > --- linux.orig/drivers/watchdog/watchdog_core.c
> > +++ linux/drivers/watchdog/watchdog_core.c
> > @@ -42,6 +42,7 @@
> >  
> >  static DEFINE_IDA(watchdog_ida);
> >  static struct class *watchdog_class;
> > +static bool stop_before_register;
> >  
> >  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
> >  {
> > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
> >  	if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
> >  		return -EINVAL;
> >  
> > +	if (stop_before_register)
> > +		wdd->ops->stop(wdd);
> > +
> >  	watchdog_check_min_max_timeout(wdd);
> >  
> >  	/*
> > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
> >  subsys_initcall(watchdog_init);
> >  module_exit(watchdog_exit);
> >  
> > +module_param(stop_before_register, bool, 0644);
> > +
> >  MODULE_AUTHOR("Alan Cox <alan@lxorguk.ukuu.org.uk>");
> >  MODULE_AUTHOR("Wim Van Sebroeck <wim@iguana.be>");
> >  MODULE_DESCRIPTION("WatchDog Timer Driver Core");
> 
> Hmm, need to look closer to this, but my first thought is:
> what about devices that cannot be stopped once started...
> They should be able to override this module_parameter...

Hi, Wim

Thanks for quick feedback!

I'm not sure what "cannot be stopped" means here. If it means a policy
that the watchdog should not be stopped, then since watchdog_core is always
built in, the param can only be provided via the boot cmdline, so I think it
would be fine?

For a device which really *cannot* stop, will stop() just fail silently?

Thanks
Dave


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14  8:41   ` Dave Young
@ 2014-01-14  9:44     ` Dave Young
  0 siblings, 0 replies; 13+ messages in thread
From: Dave Young @ 2014-01-14  9:44 UTC
  To: Wim Van Sebroeck; +Cc: dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

On 01/14/14 at 04:41pm, Dave Young wrote:
> On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
> > Hi Dave,
> > 
> > > In kdump kernel watchdog could interrupt vmcore capturing because we
> > > have no way to disable/stop it while crashing happens.
> > > 
> > > Add a module parameter stop_before_register so watchdog can be stopped
> > > before register in driver loading path. Thus we can try to load the
> > > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > > capturing.
> > > 
> > > Don Zickus mentioned that there's the case that bios start the watchdog
> > > and it is expected that the kernel keep the watchdog alive. To address
> > > this case I added the module parameter which is false by default so
> > > it will stop the watchdog only when user provice kernel cmdline
> > > "watchdog.stop_before_register=1".  
> > > 
> > > Signed-off-by: Dave Young <dyoung@redhat.com>
> > > ---
> > >  drivers/watchdog/watchdog_core.c |    6 ++++++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > --- linux.orig/drivers/watchdog/watchdog_core.c
> > > +++ linux/drivers/watchdog/watchdog_core.c
> > > @@ -42,6 +42,7 @@
> > >  
> > >  static DEFINE_IDA(watchdog_ida);
> > >  static struct class *watchdog_class;
> > > +static bool stop_before_register;
> > >  
> > >  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
> > >  {
> > > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
> > >  	if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
> > >  		return -EINVAL;
> > >  
> > > +	if (stop_before_register)
> > > +		wdd->ops->stop(wdd);
> > > +
> > >  	watchdog_check_min_max_timeout(wdd);
> > >  
> > >  	/*
> > > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
> > >  subsys_initcall(watchdog_init);
> > >  module_exit(watchdog_exit);
> > >  
> > > +module_param(stop_before_register, bool, 0644);
> > > +
> > >  MODULE_AUTHOR("Alan Cox <alan@lxorguk.ukuu.org.uk>");
> > >  MODULE_AUTHOR("Wim Van Sebroeck <wim@iguana.be>");
> > >  MODULE_DESCRIPTION("WatchDog Timer Driver Core");
> > 
> > Hmm, need to look closer to this, but my first thought is:
> > what about devices that cannot be stopped once started...
> > They should be able to override this module_parameter...
> 
> Hi, Wim
> 
> Thanks for quick feedback!
> 
> I'm not sure the meaning of "cannot be stopped", if it means that
> the policy that it should not be stopped, I think since the watchdog_core
> is always built-in so the param can only be provided via boot cmdline it would
> be fine?

Hmm, the wdt driver can be removed and then loaded again with insmod.
I think you are talking about nowayout; for this case we should probably add
the below:

if (stop_before_register && !nowayout)
  stop it.

But is there a general way to check nowayout? Could you give some
hints?
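
One possible shape for that check, sketched as a user-space mock rather than
a real core change. As far as I know the core keeps a per-device nowayout
flag as the WDOG_NO_WAY_OUT bit in wdd->status, but that is an assumption
here; the mock_* types, the MOCK_WDOG_NO_WAY_OUT bit index and
maybe_stop_before_register() are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space mock of the few watchdog core fields involved; not kernel code. */
#define MOCK_WDOG_NO_WAY_OUT 3	/* hypothetical bit index for "nowayout" */

struct mock_wdd;

struct mock_wdt_ops {
	int (*start)(struct mock_wdd *wdd);
	int (*stop)(struct mock_wdd *wdd);
};

struct mock_wdd {
	const struct mock_wdt_ops *ops;
	unsigned long status;	/* bit flags, one of them "nowayout" */
	bool running;
};

static bool stop_before_register;	/* stands in for the patch's parameter */

/* Returns true only if a stop was attempted and the driver reported success. */
static bool maybe_stop_before_register(struct mock_wdd *wdd)
{
	/* Registration has already rejected devices without a stop op. */
	assert(wdd != NULL && wdd->ops != NULL && wdd->ops->stop != NULL);

	if (!stop_before_register)
		return false;
	if (wdd->status & (1UL << MOCK_WDOG_NO_WAY_OUT))
		return false;	/* honour nowayout: never stop */
	return wdd->ops->stop(wdd) == 0;
}

/* Trivial mock driver used to exercise the check. */
static int mock_start(struct mock_wdd *wdd) { wdd->running = true; return 0; }
static int mock_stop(struct mock_wdd *wdd) { wdd->running = false; return 0; }
static const struct mock_wdt_ops mock_ops = {
	.start = mock_start,
	.stop = mock_stop,
};
```

With this shape a device that has nowayout set keeps running even when
stop_before_register=1 is given, which is the behaviour asked about above.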

> 
> For device which really *cannot* stop, the stop() will fail silently?


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14  8:23 [PATCH] watchdog: add a parameter for stop wdt before register Dave Young
  2014-01-14  8:26 ` Wim Van Sebroeck
@ 2014-01-14 12:16 ` One Thousand Gnomes
  2014-01-14 16:24   ` Vivek Goyal
  2014-01-15  0:59   ` Dave Young
  1 sibling, 2 replies; 13+ messages in thread
From: One Thousand Gnomes @ 2014-01-14 12:16 UTC
  To: Dave Young; +Cc: wim, dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

On Tue, 14 Jan 2014 16:23:23 +0800
Dave Young <dyoung@redhat.com> wrote:

> In kdump kernel watchdog could interrupt vmcore capturing because we
> have no way to disable/stop it while crashing happens.

Lots of watchdogs cannot be stopped.

> Add a module parameter stop_before_register so watchdog can be stopped
> before register in driver loading path. Thus we can try to load the
> watchdog driver as early as possible in kdump kernel to ensure vmcore
> capturing.

If you want to kdump then don't start the watchdog. The goal of the
watchdog is to make sure the system never gets stuck. Adding conditions
and special cases simply increases the odds of something bad not
triggering the watchdog.

If you have a system that can stop the watchdog then providing no way out
is not set you can open it and stop it.

I don't see the need for any kernel change here

- if it can't be stopped you lost
- if "nowayout" is set then by design you lost
- if it can be stopped, you can open and stop it
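
A minimal user-space sketch of "open and stop it", assuming the standard
/dev/watchdog character device and the WDIOC_SETOPTIONS ioctl from
<linux/watchdog.h>. The function name stop_watchdog() is made up for
illustration, and whether the hardware really stops still depends on the
driver and on nowayout.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Ask the watchdog behind `path` (typically /dev/watchdog) to stop.
 * Returns 0 if the device was opened, the magic character written and the
 * device closed; returns -1 if any of those steps failed.
 * Note that opening the device starts the watchdog if it was not running.
 */
int stop_watchdog(const char *path)
{
	int flags = WDIOS_DISABLECARD;
	int fd;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* Explicit disable request; not every driver implements it. */
	(void)ioctl(fd, WDIOC_SETOPTIONS, &flags);

	/* "Magic close": writing 'V' right before close asks the driver to stop. */
	if (write(fd, "V", 1) != 1) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

A caller would run stop_watchdog("/dev/watchdog") before arming kdump; with
nowayout set the request is ignored by design.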

Now whether in the !nowayout case the watchdog core should catch whatever
hooks/notifiers are available and stop any watchdogs it can on a
kexec/kdump is a more interesting question and probably needs to default
to not doing so but with the option to force otherwise for debugging work.

Alan


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14 12:16 ` One Thousand Gnomes
@ 2014-01-14 16:24   ` Vivek Goyal
  2014-01-15  1:11     ` Dave Young
  2014-01-15  0:59   ` Dave Young
  1 sibling, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2014-01-14 16:24 UTC
  To: One Thousand Gnomes, Dave Young
  Cc: wim, dzickus, bhe, linux-watchdog, linux-kernel

On Tue, Jan 14, 2014 at 12:16:39PM +0000, One Thousand Gnomes wrote:
> On Tue, 14 Jan 2014 16:23:23 +0800
> Dave Young <dyoung@redhat.com> wrote:
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> 
> Lots of watchdogs cannot be stopped.
> 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> 
> If you want to kdump then don't start the watchdog. The goal of the
> watchdog is to make sure the system never gets stuck. Adding conditions
> and special cases simply increases the odds of something bad not
> triggering the watchdog.
> 
> If you have a system that can stop the watchdog then providing no way out
> is not set you can open it and stop it.
> 
> I don't see the need for any kernel change here
> 
> - if it can't be stopped you lost
> - if "nowayout" is set then by design you lost
> - if it can be stopped, you can open and stop it
> 
> Now whether in the !nowayout case the watchdog core should catch whatever
> hooks/notifiers are available and stop any watchdogs it can on a
> kexec/kdump is a more interesting question and probably needs to default
> to not doing so but with the option to force otherwise for debugging work.

Hi All,

I thought this problem was resolved (at least conceptually) last time
when Don Zickus brought it up.

He mentioned that the conclusion was to keep the watchdog interval long
enough, say 60 seconds, and keep on kicking it fast enough, say every
10-20 seconds. That would ensure that after the crash there are at least
60 - 20 = 40 seconds left before the watchdog expires. In that time
we should try to boot into the second kernel and load the watchdog driver
early enough from the initramfs, so that it can start kicking the watchdog
again.
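
The arithmetic above can be written down as a small worked example. This is
only a sketch: the function names and the boot_to_first_kick_s parameter
(time from crash until the second kernel's first kick) are hypothetical, and
it assumes the last kick happened a full kick interval before the crash.

```c
#include <assert.h>

/*
 * Worst-case seconds left on the watchdog at the moment of a crash,
 * assuming the last kick happened a full kick interval earlier.
 */
int worst_case_margin(int timeout_s, int kick_interval_s)
{
	assert(timeout_s > 0 && kick_interval_s > 0);

	if (kick_interval_s >= timeout_s)
		return 0;	/* kicking too slowly: no guaranteed margin */
	return timeout_s - kick_interval_s;
}

/*
 * Nonzero if the second kernel can boot, load the driver and issue its
 * first kick before the worst-case margin runs out.
 */
int kdump_boot_fits(int timeout_s, int kick_interval_s, int boot_to_first_kick_s)
{
	return boot_to_first_kick_s < worst_case_margin(timeout_s, kick_interval_s);
}
```

With a 60 second timeout and a 20 second kick interval,
worst_case_margin(60, 20) gives the 40 seconds mentioned above, so a second
kernel that needs 30 seconds to reach its first kick fits and one that needs
45 seconds does not.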

I am wondering what happened to this idea. Dave, did we try to implement or
experiment with this?

Thanks
Vivek


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14 12:16 ` One Thousand Gnomes
  2014-01-14 16:24   ` Vivek Goyal
@ 2014-01-15  0:59   ` Dave Young
  2014-01-15 12:15     ` One Thousand Gnomes
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Young @ 2014-01-15  0:59 UTC
  To: One Thousand Gnomes
  Cc: wim, dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

On 01/14/14 at 12:16pm, One Thousand Gnomes wrote:
> On Tue, 14 Jan 2014 16:23:23 +0800
> Dave Young <dyoung@redhat.com> wrote:
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> 
> Lots of watchdogs cannot be stopped.
> 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> 
> If you want to kdump then don't start the watchdog. The goal of the
> watchdog is to make sure the system never gets stuck. Adding conditions
> and special cases simply increases the odds of something bad not
> triggering the watchdog.

Watchdog and crash dump really do conflict to some degree. From the watchdog
point of view, it can reboot the system when the kdump kernel hangs. But from
the kdump point of view, we want to ensure the vmcore is saved for later
debugging.

Maybe we can select only one in this case.
> 
> If you have a system that can stop the watchdog then providing no way out
> is not set you can open it and stop it.
> 
> I don't see the need for any kernel change here
> 
> - if it can't be stopped you lost
> - if "nowayout" is set then by design you lost
> - if it can be stopped, you can open and stop it

For the last one, once the crash happens we have no chance to open and stop it.

> 
> Now whether in the !nowayout case the watchdog core should catch whatever
> hooks/notifiers are available and stop any watchdogs it can on a
> kexec/kdump is a more interesting question and probably needs to default
> to not doing so but with the option to force otherwise for debugging work.

Unfortunately there is no chance to do so in the crash path. It is also not
good to add more logic to that path.

Thanks
Dave


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-14 16:24   ` Vivek Goyal
@ 2014-01-15  1:11     ` Dave Young
  2014-01-15 16:46       ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Young @ 2014-01-15  1:11 UTC
  To: Vivek Goyal
  Cc: One Thousand Gnomes, wim, dzickus, bhe, linux-watchdog,
	linux-kernel

On 01/14/14 at 11:24am, Vivek Goyal wrote:
> On Tue, Jan 14, 2014 at 12:16:39PM +0000, One Thousand Gnomes wrote:
> > On Tue, 14 Jan 2014 16:23:23 +0800
> > Dave Young <dyoung@redhat.com> wrote:
> > 
> > > In kdump kernel watchdog could interrupt vmcore capturing because we
> > > have no way to disable/stop it while crashing happens.
> > 
> > Lots of watchdogs cannot be stopped.
> > 
> > > Add a module parameter stop_before_register so watchdog can be stopped
> > > before register in driver loading path. Thus we can try to load the
> > > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > > capturing.
> > 
> > If you want to kdump then don't start the watchdog. The goal of the
> > watchdog is to make sure the system never gets stuck. Adding conditions
> > and special cases simply increases the odds of something bad not
> > triggering the watchdog.
> > 
> > If you have a system that can stop the watchdog then providing no way out
> > is not set you can open it and stop it.
> > 
> > I don't see the need for any kernel change here
> > 
> > - if it can't be stopped you lost
> > - if "nowayout" is set then by design you lost
> > - if it can be stopped, you can open and stop it
> > 
> > Now whether in the !nowayout case the watchdog core should catch whatever
> > hooks/notifiers are available and stop any watchdogs it can on a
> > kexec/kdump is a more interesting question and probably needs to default
> > to not doing so but with the option to force otherwise for debugging work.
> 
> Hi All,
> 
> I thought this problem was resolved (atleast conceptually) last time
> when Don Zickus brought it up.
> 
> He mentioned that it was concluded that keep watchdog interval long
> enough, say 60 seconds and keep on kicking it fast enough, say every
> 10-20 seconds. That would ensure that after the crash, there is atleast
> 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> we should try to boot into second kernel load watchdog driver early enough
> from initramfs which can start kicking watchdog again.

Some drivers, such as iTCO_wdt, already stop the watchdog while the module
loads, so we can load them as early as possible and there is no need to kick
them again. That is why I wrote this patch: to add the stop to generic code
so more drivers can be covered.

But as Alan said, I am beginning to feel that this is not a good design. The
iTCO_wdt nowayout becomes useless because of this stop_before_register.

For wdts which cannot be stopped we can still continue working on kicking
them early in the initramfs.

> 
> I am wondering what happened to this idea. Dave, did we try to implement/
> experiment with this?

No, we have only just begun with iTCO_wdt and are trying to add iTCO_wdt
first. Other devices have not been investigated.

Thanks
Dave


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-15  0:59   ` Dave Young
@ 2014-01-15 12:15     ` One Thousand Gnomes
  2014-01-15 16:55       ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: One Thousand Gnomes @ 2014-01-15 12:15 UTC
  To: Dave Young; +Cc: wim, dzickus, bhe, vgoyal, linux-watchdog, linux-kernel

> watchdog and crash dump really conflicts to some degree, from the watchdog
> point of view it can reboot system whhen kdump kernel hangs. But from kdump
> point of view it want ensure saving the vmcore for later debugging.
> 
> Maybe we can only select only one in this case.

You want to be able to make a decision at runtime which to use.

> > - if it can be stopped, you can open and stop it
> 
> For the last one since crashing happens we have no chance to open and stop.

When you decide you need to set up to catch a core rather than just
crash you can open and stop the watchdog (if supported), and you can then
set up for a kdump and then at some point later if it crashes capture the
dump.

As you say the two are basically incompatible models of operating, but
that also means if you are about to take a dump you want to ensure you will
not be disturbed. So as far as I can see if you might need to take a
dump, turn the watchdog off in advance. Worrying about it as you take a
dump is too late.

Alan


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-15  1:11     ` Dave Young
@ 2014-01-15 16:46       ` Vivek Goyal
  2014-01-16  1:50         ` Dave Young
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2014-01-15 16:46 UTC
  To: Dave Young
  Cc: One Thousand Gnomes, wim, dzickus, bhe, linux-watchdog,
	linux-kernel

On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:

[..]
> > I thought this problem was resolved (atleast conceptually) last time
> > when Don Zickus brought it up.
> > 
> > He mentioned that it was concluded that keep watchdog interval long
> > enough, say 60 seconds and keep on kicking it fast enough, say every
> > 10-20 seconds. That would ensure that after the crash, there is atleast
> > 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> > we should try to boot into second kernel load watchdog driver early enough
> > from initramfs which can start kicking watchdog again.
> 
> Some drivers did stop the watchdog while module loading such as iTCO_wdt.

Instead of stopping it, why not keep on kicking it until user space takes
over this job? That would also make sure that if the kdump kernel hangs,
the watchdog will do the job it is supposed to do.
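
The user-space half of that hand-over could look roughly like the sketch
below, assuming the WDIOC_KEEPALIVE and WDIOC_SETTIMEOUT ioctls from
<linux/watchdog.h>. The helper names are made up for illustration, and the
in-kernel kicking before user space takes over is not shown.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Kick the watchdog once. Returns 0 on success, -1 on failure (errno set). */
int kick_watchdog(int fd)
{
	return ioctl(fd, WDIOC_KEEPALIVE, 0);
}

/*
 * Ask the driver for a new timeout in seconds; the driver may round it.
 * Returns 0 on success, -1 on failure (errno set).
 */
int set_watchdog_timeout(int fd, int seconds)
{
	assert(seconds > 0);
	return ioctl(fd, WDIOC_SETTIMEOUT, &seconds);
}
```

An early initramfs helper would open /dev/watchdog, optionally raise the
timeout, and then call kick_watchdog() periodically while the vmcore is being
saved, so a hung kdump kernel is still rebooted.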

Thanks
Vivek


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-15 12:15     ` One Thousand Gnomes
@ 2014-01-15 16:55       ` Vivek Goyal
  2014-01-16  1:52         ` Dave Young
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2014-01-15 16:55 UTC
  To: One Thousand Gnomes
  Cc: Dave Young, wim, dzickus, bhe, linux-watchdog, linux-kernel

On Wed, Jan 15, 2014 at 12:15:56PM +0000, One Thousand Gnomes wrote:
> > watchdog and crash dump really conflicts to some degree, from the watchdog
> > point of view it can reboot system whhen kdump kernel hangs. But from kdump
> > point of view it want ensure saving the vmcore for later debugging.
> > 
> > Maybe we can only select only one in this case.
> 
> You want to be able to make a decision at runtime which to use.
> 
> > > - if it can be stopped, you can open and stop it
> > 
> > For the last one since crashing happens we have no chance to open and stop.
> 
> When you decide you need to set up to catch a core rather than just
> crash you can open and stop the watchdog (if supported), and you can then
> set up for a kdump and then at some point later if it crashes capture the
> dump.

Disabling the watchdog when the kdump service starts will not make many
people happy. If the kernel hangs, we have no way to reboot it.

What about the other idea of keeping the watchdog interval long enough that
the new kernel can boot, the driver can load, and then the new driver/user
space can continue to kick the watchdog? And if the second kernel hangs, the
watchdog will reboot the system.

Thanks
Vivek


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-15 16:46       ` Vivek Goyal
@ 2014-01-16  1:50         ` Dave Young
  0 siblings, 0 replies; 13+ messages in thread
From: Dave Young @ 2014-01-16  1:50 UTC
  To: Vivek Goyal
  Cc: One Thousand Gnomes, wim, dzickus, bhe, linux-watchdog,
	linux-kernel

On 01/15/14 at 11:46am, Vivek Goyal wrote:
> On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:
> 
> [..]
> > > I thought this problem was resolved (atleast conceptually) last time
> > > when Don Zickus brought it up.
> > > 
> > > He mentioned that it was concluded that keep watchdog interval long
> > > enough, say 60 seconds and keep on kicking it fast enough, say every
> > > 10-20 seconds. That would ensure that after the crash, there is atleast
> > > 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> > > we should try to boot into second kernel load watchdog driver early enough
> > > from initramfs which can start kicking watchdog again.
> > 
> > Some drivers did stop the watchdog while module loading such as iTCO_wdt.
> 
> Instead of stopping why not keep on kicking it till user space takes
> over this job. This will also make sure that if kdump kernel hangs,
> watchdog wil do the job it is supposed to do?

Yes, rethinking this problem, kicking it is better than stopping it.
But we will also have a more uncertain amount of time between the last
userspace kick before the panic and the first one after it.

Thanks
Dave


* Re: [PATCH] watchdog: add a parameter for stop wdt before register
  2014-01-15 16:55       ` Vivek Goyal
@ 2014-01-16  1:52         ` Dave Young
  0 siblings, 0 replies; 13+ messages in thread
From: Dave Young @ 2014-01-16  1:52 UTC
  To: Vivek Goyal
  Cc: One Thousand Gnomes, wim, dzickus, bhe, linux-watchdog,
	linux-kernel

On 01/15/14 at 11:55am, Vivek Goyal wrote:
> On Wed, Jan 15, 2014 at 12:15:56PM +0000, One Thousand Gnomes wrote:
> > > watchdog and crash dump really conflicts to some degree, from the watchdog
> > > point of view it can reboot system whhen kdump kernel hangs. But from kdump
> > > point of view it want ensure saving the vmcore for later debugging.
> > > 
> > > Maybe we can only select only one in this case.
> > 
> > You want to be able to make a decision at runtime which to use.
> > 
> > > > - if it can be stopped, you can open and stop it
> > > 
> > > For the last one since crashing happens we have no chance to open and stop.
> > 
> > When you decide you need to set up to catch a core rather than just
> > crash you can open and stop the watchdog (if supported), and you can then
> > set up for a kdump and then at some point later if it crashes capture the
> > dump.
> 
> Disabling watchdog if kdump serice starts will not make many happy. If
> kernel hangs, we don't have a functionality to reboot it.
> 
> What about other idea of keeping watchdog interval long enough that new
> kernel can boot, driver can load and then new driver/user space can
> continue to kick the watchdog. And if second kernel hangs, watchdog will
> reboot the system.

I would agree with this approach.

Thanks
Dave
