All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Borislav Petkov <bp@alien8.de>,
	"musl@lists.openwall.com" <musl@lists.openwall.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers
Date: Wed, 9 Mar 2016 09:56:31 +0100	[thread overview]
Message-ID: <20160309085631.GA3247@gmail.com> (raw)
In-Reply-To: <06079088639eddd756e2092b735ce4a682081308.1457486598.git.luto@kernel.org>


* Andy Lutomirski <luto@kernel.org> wrote:

> musl implements system call cancellation in an unusual but clever way.

So I'm sceptical about the concept.

Could someone remind me why cancellation points matter to user-space?

I know the pthread APIs and semantics that are behind it, I just don't see how it 
can be truly utilized for any meaningful programmatic property: for example the 
moment you add any sort of ad-hoc printf() based tracing or any other spontaneous 
logging IO to your application, you add in a lot of potential cancellation points 
into various places in your user-space logic ...

It's _very_ easy to add inadvertent cancellation point to the code in practice, so 
using the default pthread cancellation model and relying on what is a cancellation 
point is crazy and very libc dependent in general. POSIX seems to be pretty vague 
about it as well. So unless you make heavy use of pthread_setcancelstate() to 
explicitly mark your work atoms, it's a really bad interface to rely on.

And if you are using pthread_setcancelstate(), instead of relying on calcellation, 
then you are not really using the built-in cancellation points but have to spike 
your code with pthread_testcancel(). In that case, why not just use your own 
explicit 'cancellation' points in a few strategic places - which is mostly just a 
simple flag really. That's what most worker thread models that I've seen use.

I suspect more complex runtimes like java runtimes couldn't care less, so it's 
really something that only libc using C/C++ code cares about.

> When a thread issues a cancellable syscall, musl issues the syscall
> through a special thunk that looks roughly like this:
> 
> cancellable_syscall:
> 	test whether a cancel is queued
> 	jnz cancel_me
> 	int $0x80
> end_cancellable_syscall:
> 
> If a pthread cancellation signal hits with
> cancellable_syscall <= EIP < end_cancellable_syscall, then the
> signal interrupted a cancellation point before the syscall in
> question started.  If so, it rewrites the calling context to skip
> the syscall and simulate a -EINTR return.  The caller will detect
> this simulated -EINTR or an actual -EINTR and handle a possible
> cancellation event.

Why is so much complexity added to avoid a ~3 instructions window where 
calcellation is tested? Cancellation at work atom boundaries is a fundamentally 
'polling' model anyway, and signal delivery is asynchronous, with a fundamental 
IPI delay if it's cross-CPU.

> This technique doesn't work if int $0x80 is replaced by a call to
> AT_SYSINFO: the signal handler can no longer tell whether it's
> interrupting a call to AT_SYSINFO or, if it is, where AT_SYSINFO was
> called from.
> 
> Add minimal helpers so that musl's signal handler can learn the
> status of a possible pending AT_SYSINFO invocation and, if it hasn't
> entered the kernel yet, abort it without needing to parse the vdso
> DWARF unwind data.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> musl people-
> 
> Does this solve your AT_SYSINFO cancellation problem?  I'd like to
> make sure it survives an actual implementation before I commit to the ABI.
> 
> x86 people-
> 
> Are you okay with this idea?
> 
> 
>  arch/x86/entry/vdso/Makefile                      |   3 +-
>  arch/x86/entry/vdso/vdso32/cancellation_helpers.c | 116 ++++++++++++++++++++++
>  arch/x86/entry/vdso/vdso32/vdso32.lds.S           |   2 +
>  tools/testing/selftests/x86/unwind_vdso.c         |  57 +++++++++--
>  4 files changed, 171 insertions(+), 7 deletions(-)
>  create mode 100644 arch/x86/entry/vdso/vdso32/cancellation_helpers.c

I'd really like to see a cost/benefit analysis here! Some before/after explanation 
- exactly what is not possible today (in practical terms), what are the practical 
effects of not being able to do that, and how would the bright future look like?

Thanks,

	Ingo

  reply	other threads:[~2016-03-09  8:56 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-09  1:24 [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers Andy Lutomirski
2016-03-09  8:56 ` Ingo Molnar [this message]
2016-03-09 11:34   ` [musl] " Szabolcs Nagy
2016-03-09 11:40     ` Szabolcs Nagy
2016-03-09 19:47     ` Linus Torvalds
2016-03-09 20:57       ` Andy Lutomirski
2016-03-09 21:26         ` Linus Torvalds
2016-03-10 10:57         ` Ingo Molnar
2016-03-10  3:34       ` Rich Felker
2016-03-10 11:16         ` Ingo Molnar
2016-03-10 16:41           ` Rich Felker
2016-03-10 18:03             ` Ingo Molnar
2016-03-10 23:28               ` Rich Felker
2016-03-11  0:18                 ` Szabolcs Nagy
2016-03-11  0:48                   ` Rich Felker
2016-03-11  1:14                     ` Andy Lutomirski
2016-03-11  1:39                     ` Szabolcs Nagy
2016-03-11  1:49                       ` Szabolcs Nagy
2016-03-11  1:55                       ` Rich Felker
2016-03-11  9:33                 ` Ingo Molnar
2016-03-11 11:39                   ` Szabolcs Nagy
2016-03-11 19:27                     ` Linus Torvalds
2016-03-11 19:30                       ` Andy Lutomirski
2016-03-11 19:39                         ` Linus Torvalds
2016-03-11 19:44                           ` Linus Torvalds
2016-03-12 17:05                             ` Ingo Molnar
2016-03-12 18:10                               ` Rich Felker
2016-03-12 17:00                       ` Ingo Molnar
2016-03-12 18:05                         ` Rich Felker
2016-03-12 18:48                           ` Ingo Molnar
2016-03-12 19:08                             ` Rich Felker
2016-03-12 17:08                     ` Ingo Molnar
2016-03-09 17:58 ` Andy Lutomirski
2016-03-09 21:19   ` Andy Lutomirski
2016-03-12 18:13     ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160309085631.GA3247@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=musl@lists.openwall.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.