All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Stafford Horne <shorne@gmail.com>
To: Matt Redfearn <matt.redfearn@mips.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Jonas Bonn <jonas@southpole.se>,
	Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>,
	Jan Henrik Weinstock <jan.weinstock@ice.rwth-aachen.de>,
	Matt Redfearn <matt.redfearn@imgtec.com>,
	James Hogan <james.hogan@imgtec.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	openrisc@lists.librecores.org,
	Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Subject: Re: [PATCH v4 13/13] openrisc: add tick timer multi-core sync logic
Date: Wed, 1 Nov 2017 09:34:47 +0900	[thread overview]
Message-ID: <20171101003447.GC29237@lianli.shorne-pla.net> (raw)
In-Reply-To: <20171031231759.GB29237@lianli.shorne-pla.net>

On Wed, Nov 01, 2017 at 08:17:59AM +0900, Stafford Horne wrote:
> On Tue, Oct 31, 2017 at 02:06:21PM +0000, Matt Redfearn wrote:
> > Hi,
> > 
> > 
> > On 29/10/17 23:11, Stafford Horne wrote:
> > > In case timers are not in sync when cpus start (i.e. hot plug / offset
> > > resets) we need to synchronize the secondary cpus internal timer with
> > > the main cpu.  This is needed as in OpenRISC SMP there is only one
> > > clocksource registered which reads from the same ttcr register on each
> > > cpu.
> > > 
> > > This synchronization routine heavily borrows from mips implementation that
> > > does something similar.
> [..]
> > > diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> > > index 4763b8b9161e..4d80ce6fa045 100644
> > > --- a/arch/openrisc/kernel/smp.c
> > > +++ b/arch/openrisc/kernel/smp.c
> > > @@ -100,6 +100,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> > >   		pr_crit("CPU%u: failed to start\n", cpu);
> > >   		return -EIO;
> > >   	}
> > > +	synchronise_count_master(cpu);
> > >   	return 0;
> > >   }
> > > @@ -129,6 +130,8 @@ asmlinkage __init void secondary_start_kernel(void)
> > >   	set_cpu_online(cpu, true);
> > >   	complete(&cpu_running);
> > > +	synchronise_count_slave(cpu);
> > > +
> > 
> > 
> > Note that until 8f46cca1e6c06a058374816887059bcc017b382f, the MIPS timer
> > synchronization code contained the possibility of deadlock. If you mark a
> > CPU online before it goes into the synchronize loop, then the boot CPU can
> > schedule a different thread and send IPIs to all "online" CPUs. It gets
> > stuck waiting for the secondary to ack it's IPI, since this secondary CPU
> > has not enabled IRQs yet, and is stuck waiting for the master to synchronise
> > with it. The system then deadlocks.
> > Commit 8f46cca1e6c06a058374816887059bcc017b382f fixed this for MIPS and you
> > might want to similarly move the
> > 
> > set_cpu_online(cpu, true);
> > 
> > after counters are synchronized.
> 
> Thank you for the heads up.  I do remember having interim issues with the timer
> syncing but I havent seen it for a while.  I think I fixed it by also moving
> synchronise_count_slave.
> 
> Let me double check.  Also, I see your patch 8f46cca1e6c06a0583748168 was merged
> last year?

Hello,

I should have read a bit more closely, definitely this could be an issue if the
boot cpu has other things running.

However, looking at mainline I can see the clock sync comes after set_cpu_online
again after this patch in mips.

  6f542ebeaee0 MIPS: Fix race on setting and getting cpu_online_mask
  Author: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>

Is this deadlock an issue in mips again?

-Stafford

WARNING: multiple messages have this Message-ID (diff)
From: Stafford Horne <shorne@gmail.com>
To: openrisc@lists.librecores.org
Subject: [OpenRISC] [PATCH v4 13/13] openrisc: add tick timer multi-core sync logic
Date: Wed, 1 Nov 2017 09:34:47 +0900	[thread overview]
Message-ID: <20171101003447.GC29237@lianli.shorne-pla.net> (raw)
In-Reply-To: <20171031231759.GB29237@lianli.shorne-pla.net>

On Wed, Nov 01, 2017 at 08:17:59AM +0900, Stafford Horne wrote:
> On Tue, Oct 31, 2017 at 02:06:21PM +0000, Matt Redfearn wrote:
> > Hi,
> > 
> > 
> > On 29/10/17 23:11, Stafford Horne wrote:
> > > In case timers are not in sync when cpus start (i.e. hot plug / offset
> > > resets) we need to synchronize the secondary cpus internal timer with
> > > the main cpu.  This is needed as in OpenRISC SMP there is only one
> > > clocksource registered which reads from the same ttcr register on each
> > > cpu.
> > > 
> > > This synchronization routine heavily borrows from mips implementation that
> > > does something similar.
> [..]
> > > diff --git a/arch/openrisc/kernel/smp.c b/arch/openrisc/kernel/smp.c
> > > index 4763b8b9161e..4d80ce6fa045 100644
> > > --- a/arch/openrisc/kernel/smp.c
> > > +++ b/arch/openrisc/kernel/smp.c
> > > @@ -100,6 +100,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> > >   		pr_crit("CPU%u: failed to start\n", cpu);
> > >   		return -EIO;
> > >   	}
> > > +	synchronise_count_master(cpu);
> > >   	return 0;
> > >   }
> > > @@ -129,6 +130,8 @@ asmlinkage __init void secondary_start_kernel(void)
> > >   	set_cpu_online(cpu, true);
> > >   	complete(&cpu_running);
> > > +	synchronise_count_slave(cpu);
> > > +
> > 
> > 
> > Note that until 8f46cca1e6c06a058374816887059bcc017b382f, the MIPS timer
> > synchronization code contained the possibility of deadlock. If you mark a
> > CPU online before it goes into the synchronize loop, then the boot CPU can
> > schedule a different thread and send IPIs to all "online" CPUs. It gets
> > stuck waiting for the secondary to ack it's IPI, since this secondary CPU
> > has not enabled IRQs yet, and is stuck waiting for the master to synchronise
> > with it. The system then deadlocks.
> > Commit 8f46cca1e6c06a058374816887059bcc017b382f fixed this for MIPS and you
> > might want to similarly move the
> > 
> > set_cpu_online(cpu, true);
> > 
> > after counters are synchronized.
> 
> Thank you for the heads up.  I do remember having interim issues with the timer
> syncing but I havent seen it for a while.  I think I fixed it by also moving
> synchronise_count_slave.
> 
> Let me double check.  Also, I see your patch 8f46cca1e6c06a0583748168 was merged
> last year?

Hello,

I should have read a bit more closely, definitely this could be an issue if the
boot cpu has other things running.

However, looking at mainline I can see the clock sync comes after set_cpu_online
again after this patch in mips.

  6f542ebeaee0 MIPS: Fix race on setting and getting cpu_online_mask
  Author: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>

Is this deadlock an issue in mips again?

-Stafford

  reply	other threads:[~2017-11-01  0:34 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-29 23:11 [PATCH v4 00/13] OpenRISC SMP Support Stafford Horne
2017-10-29 23:11 ` [PATCH v4 01/13] openrisc: use shadow registers to save regs on exception Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 02/13] openrisc: add 1 and 2 byte cmpxchg support Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 03/13] openrisc: use qspinlocks and qrwlocks Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 04/13] dt-bindings: add openrisc to vendor prefixes list Stafford Horne
2017-10-29 23:11 ` [PATCH v4 05/13] irqchip: add initial support for ompic Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-30  2:29   ` Marc Zyngier
2017-10-30  2:29     ` [OpenRISC] " Marc Zyngier
2017-10-30  2:29     ` Marc Zyngier
2017-10-30  4:18     ` Stafford Horne
2017-10-30  4:18       ` [OpenRISC] " Stafford Horne
2017-10-30  4:18       ` Stafford Horne
2017-10-30  6:11       ` Marc Zyngier
2017-10-30  6:11         ` [OpenRISC] " Marc Zyngier
2017-11-01 12:17         ` Stafford Horne
2017-11-01 12:17           ` [OpenRISC] " Stafford Horne
2017-11-01 12:17           ` Stafford Horne
2017-10-29 23:11 ` [PATCH v4 06/13] openrisc: initial SMP support Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 07/13] openrisc: fix initial preempt state for secondary cpu tasks Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 08/13] openrisc: sleep instead of spin on secondary wait Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 09/13] openrisc: add cacheflush support to fix icache aliasing Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 10/13] openrisc: add simple_smp dts and defconfig for simulators Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 11/13] openrisc: support framepointers and STACKTRACE_SUPPORT Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 12/13] openrisc: enable LOCKDEP_SUPPORT and irqflags tracing Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-29 23:11 ` [PATCH v4 13/13] openrisc: add tick timer multi-core sync logic Stafford Horne
2017-10-29 23:11   ` [OpenRISC] " Stafford Horne
2017-10-31 14:06   ` Matt Redfearn
2017-10-31 14:06     ` [OpenRISC] " Matt Redfearn
2017-10-31 23:17     ` Stafford Horne
2017-10-31 23:17       ` [OpenRISC] " Stafford Horne
2017-11-01  0:34       ` Stafford Horne [this message]
2017-11-01  0:34         ` Stafford Horne
2017-11-01  9:26         ` Matt Redfearn
2017-11-01  9:26           ` [OpenRISC] " Matt Redfearn
2017-11-01  9:26           ` Matt Redfearn
2017-11-01 12:15           ` Stafford Horne
2017-11-01 12:15             ` [OpenRISC] " Stafford Horne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171101003447.GC29237@lianli.shorne-pla.net \
    --to=shorne@gmail.com \
    --cc=james.hogan@imgtec.com \
    --cc=jan.weinstock@ice.rwth-aachen.de \
    --cc=jonas@southpole.se \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matija.glavinic-pecotic.ext@nokia.com \
    --cc=matt.redfearn@imgtec.com \
    --cc=matt.redfearn@mips.com \
    --cc=openrisc@lists.librecores.org \
    --cc=stefan.kristiansson@saunalahti.fi \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.