* [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
@ 2007-05-17 22:06 Philipp Kohlbecher
2007-05-17 22:16 ` H. Peter Anvin
0 siblings, 1 reply; 8+ messages in thread
From: Philipp Kohlbecher @ 2007-05-17 22:06 UTC (permalink / raw
To: Dave Jones, H. Peter Anvin; +Cc: Arnd Bergmann, linux-kernel, linux-assembly
From: Philipp Kohlbecher <pk031698@uni-greifswald.de>
The kernel_execve function issues a software interrupt (int 0x80) to make
a system call to sys_execve. This function expects to find the stack segment
and stack pointer of the function that issued the system call in the pt_regs
struct. The syscall entry code that sets up this struct expects the stack
segment and the stack pointer of the issuing function already on the stack.
But the Intel processor saves these registers only if a stack-switch occurs,
i.e. for inter-privilege interrupts and exceptions (cf. Intel Software
Developer’s Manual, Vol. 3A, p. 5-17,
http://www.intel.com/design/processor/manuals/253668.pdf).
For an intra-privilege interrupt like the one issued in kernel_execve, these
registers must be saved manually.
Signed-off-by: Philipp Kohlbecher <pk031698@uni-greifswald.de>
---
I am a total newbie and this is my first patch, so please be merciful...
Also, please CC me, I am not on the list.
--- a/arch/i386/kernel/sys_i386.c 2007-05-13 18:40:33.000000000 +0200
+++ b/arch/i386/kernel/sys_i386.c 2007-05-13 17:23:10.000000000 +0200
@@ -258,7 +258,8 @@ int kernel_execve(const char *filename,
int kernel_execve(const char *filename, char *const argv[], char *const envp[])
{
long __res;
- asm volatile ("push %%ebx ; movl %2,%%ebx ; int $0x80 ; pop %%ebx"
+ asm volatile ("push %%ebx ; movl %2,%%ebx ; push %%ss ; push %%esp;"
+ "int $0x80 ; addl $(4*2),%%esp ; pop %%ebx"
: "=a" (__res)
: "0" (__NR_execve),"ri" (filename),"c" (argv), "d" (envp) : "memory");
return __res;
-
To unsubscribe from this list: send the line "unsubscribe linux-assembly" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-17 22:06 [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall Philipp Kohlbecher
@ 2007-05-17 22:16 ` H. Peter Anvin
2007-05-17 23:27 ` Philipp Kohlbecher
0 siblings, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2007-05-17 22:16 UTC (permalink / raw
To: Philipp Kohlbecher
Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
Philipp Kohlbecher wrote:
> From: Philipp Kohlbecher <pk031698@uni-greifswald.de>
>
> The kernel_execve function issues a software interrupt (int 0x80) to make
> a system call to sys_execve. This function expects to find the stack segment
> and stack pointer of the function that issued the system call in the pt_regs
> struct. The syscall entry code that sets up this struct expects the stack
> segment and the stack pointer of the issuing function already on the stack.
> But the Intel processor saves these registers only if a stack-switch occurs,
> i.e. for inter-privilege interrupts and exceptions (cf. Intel Software
> Developer’s Manual, Vol. 3A, p. 5-17,
> http://www.intel.com/design/processor/manuals/253668.pdf).
> For an intra-privilege interrupt like the one issued in kernel_execve, these
> registers must be saved manually.
>
Could you describe the failure scenario this causes? I'm trying to
understand how something that fundamental would have possibly slipped by
testing?
-hpa
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-17 22:16 ` H. Peter Anvin
@ 2007-05-17 23:27 ` Philipp Kohlbecher
2007-05-17 23:33 ` H. Peter Anvin
0 siblings, 1 reply; 8+ messages in thread
From: Philipp Kohlbecher @ 2007-05-17 23:27 UTC (permalink / raw
To: H. Peter Anvin; +Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
H. Peter Anvin wrote:
> Philipp Kohlbecher wrote:
>> From: Philipp Kohlbecher <pk031698@uni-greifswald.de>
>>
>> The kernel_execve function issues a software interrupt (int 0x80) to make
>> a system call to sys_execve. This function expects to find the stack segment
>> and stack pointer of the function that issued the system call in the pt_regs
>> struct. The syscall entry code that sets up this struct expects the stack
>> segment and the stack pointer of the issuing function already on the stack.
>> But the Intel processor saves these registers only if a stack-switch occurs,
>> i.e. for inter-privilege interrupts and exceptions (cf. Intel Software
>> Developer’s Manual, Vol. 3A, p. 5-17,
>> http://www.intel.com/design/processor/manuals/253668.pdf).
>> For an intra-privilege interrupt like the one issued in kernel_execve, these
>> registers must be saved manually.
>>
>
> Could you describe the failure scenario this causes?
I don't know of any problems this causes. The kernel needs to be aware
of the fact that the xss and esp fields of the pt_regs struct may
contain wrong values anyway, as hardware interrupts arriving while the
CPU is in kernel mode would also lead to this condition.
The file include/asm-i386/processor.h contains a comment to that effect
(lines 483-492).
With kernel_execve we can predict this, however, and account for it.
(This may be superfluous, but I don't think it hurts and it might
prevent future errors.)
- Phil Kohlbecher
-
To unsubscribe from this list: send the line "unsubscribe linux-assembly" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-17 23:27 ` Philipp Kohlbecher
@ 2007-05-17 23:33 ` H. Peter Anvin
2007-05-18 8:32 ` Philipp Kohlbecher
0 siblings, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2007-05-17 23:33 UTC (permalink / raw
To: Philipp Kohlbecher
Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
Philipp Kohlbecher wrote:
>
> I don't know of any problems this causes. The kernel needs to be aware
> of the fact that the xss and esp fields of the pt_regs struct may
> contain wrong values anyway, as hardware interrupts arriving while the
> CPU is in kernel mode would also lead to this condition.
> The file include/asm-i386/processor.h contains a comment to that effect
> (lines 483-492).
> With kernel_execve we can predict this, however, and account for it.
> (This may be superfluous, but I don't think it hurts and it might
> prevent future errors.)
>
... and it may *cause* future errors by making it harder to find bugs, too.
In other words, your patch doesn't actually fix anything, it *masks*
potential bugs which would also be triggered by interrupts in kernel
mode. This is bad.
You realize you just make a distinction between synchronous and
asynchronous events, such that your patch means that only asynchronous
events would catch the bugs you mask. This makes bug hunting much harder.
If anything it would be better to poison those fields so that we can
smoke out anything that trusts them.
-hpa
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-17 23:33 ` H. Peter Anvin
@ 2007-05-18 8:32 ` Philipp Kohlbecher
2007-05-18 22:41 ` H. Peter Anvin
0 siblings, 1 reply; 8+ messages in thread
From: Philipp Kohlbecher @ 2007-05-18 8:32 UTC (permalink / raw
To: H. Peter Anvin; +Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
H. Peter Anvin wrote:
> Philipp Kohlbecher wrote:
>> (This may be superfluous, but I don't think it hurts and it might
>> prevent future errors.)
>
> ... and it may *cause* future errors by making it harder to find bugs, too.
>
> In other words, your patch doesn't actually fix anything, it *masks*
> potential bugs which would also be triggered by interrupts in kernel
> mode. This is bad.
I am not sure these potential bugs would also be triggered by interrupts
in kernel mode. After all, kernel_execve is a special case: It is the
only case (AFAIK) where a software interrupt is generated while in
kernel mode (intra-privilege interrupt), but (if all goes right) the
iret "returns" to user mode (inter-privilege return).
This iret will pop the ss and esp registers off the stack, so
start_thread (include/asm-i386/processor.h:444) needs to put the correct
values in these stack positions, which it does.
Now, my point is that, when start_thread gets called at the end of a
successful kernel_execve, these stack positions cannot contain what they
are supposed to contain and no space is allocated on the stack for these
register values, so that something else gets overwritten.
It would seem cleaner to handle this special case differently by making
sure that the stack is set up the way it would be for an inter-privilege
interrupt.
- Phil Kohlbecher
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-18 8:32 ` Philipp Kohlbecher
@ 2007-05-18 22:41 ` H. Peter Anvin
2007-05-18 23:08 ` H. Peter Anvin
2007-05-21 10:48 ` Philipp Kohlbecher
0 siblings, 2 replies; 8+ messages in thread
From: H. Peter Anvin @ 2007-05-18 22:41 UTC (permalink / raw
To: Philipp Kohlbecher
Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
Philipp Kohlbecher wrote:
>>
>> In other words, your patch doesn't actually fix anything, it *masks*
>> potential bugs which would also be triggered by interrupts in kernel
>> mode. This is bad.
>
> I am not sure these potential bugs would also be triggered by interrupts
> in kernel mode. After all, kernel_execve is a special case: It is the
> only case (AFAIK) where a software interrupt is generated while in
> kernel mode (intra-privilege interrupt), but (if all goes right) the
> iret "returns" to user mode (inter-privilege return).
>
> This iret will pop the ss and esp registers off the stack, so
> start_thread (include/asm-i386/processor.h:444) needs to put the correct
> values in these stack positions, which it does.
Yes, and it *BETTER*, because those are user-mode values which refer to
the newly created task.
Sticking kernel mode values in those fields would add no value, except
as a poison (since %ss == KERNEL_DS and would cause a #GP(0) if it ever
reached IRET.) If anything, those fields should be pushed as zero or
some other poison bits. That would be slightly better than what's there
now, which is whatever garbage happens to be on the stack. Pushing the
kernel SS:ESP is just plain wrong (not to mention that the way you do it
doesn't even produce the right value for ESP -- you'd have to save away
ESP before you push SS.)
-hpa
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-18 22:41 ` H. Peter Anvin
@ 2007-05-18 23:08 ` H. Peter Anvin
2007-05-21 10:48 ` Philipp Kohlbecher
1 sibling, 0 replies; 8+ messages in thread
From: H. Peter Anvin @ 2007-05-18 23:08 UTC (permalink / raw
To: Philipp Kohlbecher
Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
H. Peter Anvin wrote:
> Sticking kernel mode values in those fields would add no value, except
> as a poison (since %ss == KERNEL_DS and would cause a #GP(0) if it ever
> reached IRET.)
#SS(0), rather, of course.
-hpa
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall
2007-05-18 22:41 ` H. Peter Anvin
2007-05-18 23:08 ` H. Peter Anvin
@ 2007-05-21 10:48 ` Philipp Kohlbecher
1 sibling, 0 replies; 8+ messages in thread
From: Philipp Kohlbecher @ 2007-05-21 10:48 UTC (permalink / raw
To: H. Peter Anvin; +Cc: Dave Jones, Arnd Bergmann, linux-kernel, linux-assembly
H. Peter Anvin wrote:
> Sticking kernel mode values in those fields would add no value, except
> as a poison (since %ss == KERNEL_DS and would cause a #GP(0) if it ever
> reached IRET.) If anything, those fields should be pushed as zero or
> some other poison bits. That would be slightly better than what's there
> now, which is whatever garbage happens to be on the stack. Pushing the
> kernel SS:ESP is just plain wrong (not to mention that the way you do it
> doesn't even produce the right value for ESP -- you'd have to save away
> ESP before you push SS.)
That's true. The xss and esp fields of the pt_regs struct always contain
either garbage (for interrupts occuring while in kernel mode) or
user-mode values (for interrupts occuring while in user mode).
So, filling these fields with kernel-mode values indeed doesn't make
much sense.
Allocating space on the stack and poisoning those values would make
sense, though, so I will modify the patch accordingly and resend it.
Thank you for your feedback!
- Philipp Kohlbecher
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-05-21 10:48 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-17 22:06 [PATCH 2.6.21.1] i386: save registers before intra-privilege syscall Philipp Kohlbecher
2007-05-17 22:16 ` H. Peter Anvin
2007-05-17 23:27 ` Philipp Kohlbecher
2007-05-17 23:33 ` H. Peter Anvin
2007-05-18 8:32 ` Philipp Kohlbecher
2007-05-18 22:41 ` H. Peter Anvin
2007-05-18 23:08 ` H. Peter Anvin
2007-05-21 10:48 ` Philipp Kohlbecher
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).