From: Andrew Cooper <Andrew.Cooper3@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>, Wei Liu <wl@xen.org>,
	Xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 2/2] x86/spec-ctrl: Reduce HVM RSB overhead where possible
Date: Thu, 11 Aug 2022 18:06:51 +0000	[thread overview]
Message-ID: <ba78d9ff-e31e-210f-ebbb-9abb4a4e3fd3@citrix.com> (raw)
In-Reply-To: <b45778e0-9ad3-d32d-e226-7171cfb59394@suse.com>

On 10/08/2022 09:00, Jan Beulich wrote:
> On 09.08.2022 19:00, Andrew Cooper wrote:
>> --- a/xen/arch/x86/hvm/vmx/entry.S
>> +++ b/xen/arch/x86/hvm/vmx/entry.S
>> @@ -44,6 +44,7 @@ ENTRY(vmx_asm_vmexit_handler)
>>          .endm
>>          ALTERNATIVE "", restore_spec_ctrl, X86_FEATURE_SC_MSR_HVM
>>          /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>> +        /* On PBRSB-vulenrable hardware, `ret` not safe before the start of vmx_vmexit_handler() */
> Besides the spelling issue mentioned by Jason I think this line also
> wants wrapping. Maybe the two comments also want combining to just
> one, such that "WARNING!" clearly applies to both parts.
>
>> @@ -515,7 +515,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
>>              boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) ||
>>              opt_eager_fpu || opt_md_clear_hvm)       ? ""               : " None",
>>             boot_cpu_has(X86_FEATURE_SC_MSR_HVM)      ? " MSR_SPEC_CTRL" : "",
>> -           boot_cpu_has(X86_FEATURE_SC_RSB_HVM)      ? " RSB"           : "",
>> +           boot_cpu_has(X86_FEATURE_SC_RSB_HVM)      ? " RSB"           :
>> +           boot_cpu_has(X86_BUG_PBRSB)               ? " PBRSB"         : "",
>>             opt_eager_fpu                             ? " EAGER_FPU"     : "",
>>             opt_md_clear_hvm                          ? " MD_CLEAR"      : "",
>>             boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM)  ? " IBPB-entry"    : "");
> Along the lines of half of what fdbf8bdfebc2 ("x86/spec-ctrl:
> correct per-guest-type reporting of MD_CLEAR") did, I think you also want
> to extend the other (earlier) conditional in this function invocation.

Oh yes, good point.

> I also wonder whether it wouldn't be helpful to parenthesize the new
> construct, such that it'll be more obvious that this is a double
> conditional operator determining a single function argument.

I haven't done that elsewhere.  Personally, I find it easier to follow
the argument boundaries from the commas on the RHS.
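
To illustrate the construct in isolation, here is a minimal standalone C
sketch (hypothetical stand-ins for boot_cpu_has() and the feature bits,
not the Xen code itself) showing the two spellings of that single
argument:

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical stand-ins, so the argument grouping is visible in isolation. */
    static bool has_sc_rsb_hvm = false;
    static bool has_bug_pbrsb  = true;

    int main(void)
    {
        /* As in the patch: the chained ?: forms one "%s" argument,
         * delimited by the commas on either side. */
        printf(" Support for HVM VMs:%s\n",
               has_sc_rsb_hvm ? " RSB"   :
               has_bug_pbrsb  ? " PBRSB" : "");

        /* The suggested alternative: parenthesise the double conditional
         * so the grouping of that one argument is explicit at a glance. */
        printf(" Support for HVM VMs:%s\n",
               (has_sc_rsb_hvm ? " RSB"   :
                has_bug_pbrsb  ? " PBRSB" : ""));

        return 0;
    }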

>
>> @@ -718,6 +719,77 @@ static bool __init rsb_is_full_width(void)
>>      return true;
>>  }
>>  
>> +/*
>> + * HVM guests can create arbitrary RSB entries, including ones which point at
>> + * Xen supervisor mappings.
>> + *
>> + * Traditionally, the RSB is not isolated on vmexit, so Xen needs to take
>> + * safety precautions to prevent RSB speculation from consuming guest values.
>> + *
>> + * Intel eIBRS specifies that the RSB is flushed:
>> + *   1) on VMExit when IBRS=1, or
>> + *   2) shortly thereafter when Xen restores the host IBRS=1 setting.
>> + * However, a subset of eIBRS-capable parts also suffer PBRSB and need
>> + * software assistance to maintain RSB safety.
>> + */
>> +static __init enum hvm_rsb {
>> +    hvm_rsb_none,
>> +    hvm_rsb_pbrsb,
>> +    hvm_rsb_stuff32,
>> +} hvm_rsb_calculations(uint64_t caps)
>> +{
>> +    if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
>> +         boot_cpu_data.x86 != 6 )
>> +        return hvm_rsb_stuff32;
>> +
>> +    if ( !(caps & ARCH_CAPS_IBRS_ALL) )
>> +        return hvm_rsb_stuff32;
>> +
>> +    if ( caps & ARCH_CAPS_PBRSB_NO )
>> +        return hvm_rsb_none;
>> +
>> +    /*
>> +     * We're choosing between the eIBRS-capable models which don't enumerate
>> +     * PBRSB_NO.  Earlier steppings of some models don't enumerate eIBRS and
>> +     * are excluded above.
>> +     */
>> +    switch ( boot_cpu_data.x86_model )
>> +    {
>> +        /*
>> +         * Core (inc Hybrid) CPUs to date (August 2022) are vulenrable.
>> +         */
>> +    case 0x55: /* Skylake X */
>> +    case 0x6a: /* Ice Lake SP */
>> +    case 0x6c: /* Ice Lake D */
>> +    case 0x7e: /* Ice Lake client */
>> +    case 0x8a: /* Lakefield (SNC/TMT) */
>> +    case 0x8c: /* Tiger Lake U */
>> +    case 0x8d: /* Tiger Lake H */
>> +    case 0x8e: /* Skylake-L */
> Hmm, is SDM Vol 4's initial table wrong then in stating Kaby Lake /
> Coffee Lake for this and ...
>
>> +    case 0x97: /* Alder Lake S */
>> +    case 0x9a: /* Alder Lake H/P/U */
>> +    case 0x9e: /* Skylake */
> ... this? Otoh I notice that intel-family.h also says Skylake in
> respective comments, despite the constants themselves being named
> differently. Yet again ...
>
>> +    case 0xa5: /* Comet Lake */
>> +    case 0xa6: /* Comet Lake U62 */
> ... you call these Comet Lake when intel-family.h says Skylake also for
> these (and names the latter's variable COMETLAKE_L).
>
> What is in the comments here is of course not of primary concern for
> getting this patch in, but the named anomalies will stand out when all
> of this is switched over to use intel-family.h's constants.

Naming in the Skylake uarch is a total mess.  Half of it is core
codenames, and half is marketing attempting to cover for the fact that
nothing much changed across the tens of steppings of 0x8e/0x9e.

But yes, I do need to clean up a few details here.  I'm still waiting
for some corrections to be made in official docs.
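
To make the naming anomaly concrete, here is a hypothetical standalone
sketch (not the patch itself; constant names and values follow the Linux
intel-family.h naming, redefined locally purely for illustration) of a
subset of those case labels expressed with named constants:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical local copies of a few intel-family.h constants. */
    #define INTEL_FAM6_SKYLAKE_X    0x55
    #define INTEL_FAM6_KABYLAKE_L   0x8e  /* "Skylake-L" in the patch's comment */
    #define INTEL_FAM6_KABYLAKE     0x9e  /* "Skylake" in the patch's comment */
    #define INTEL_FAM6_COMETLAKE    0xa5  /* "Comet Lake" */
    #define INTEL_FAM6_COMETLAKE_L  0xa6  /* "Comet Lake U62" */

    /* Sketch only: matching a subset of the models via named constants,
     * showing where the codename and marketing names diverge. */
    static bool model_listed(uint8_t model)
    {
        switch ( model )
        {
        case INTEL_FAM6_SKYLAKE_X:
        case INTEL_FAM6_KABYLAKE_L:
        case INTEL_FAM6_KABYLAKE:
        case INTEL_FAM6_COMETLAKE:
        case INTEL_FAM6_COMETLAKE_L:
            return true;
        default:
            return false;
        }
    }

    int main(void)
    {
        printf("0x9e listed: %d\n", model_listed(INTEL_FAM6_KABYLAKE));
        return 0;
    }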

>
>> @@ -1327,9 +1400,33 @@ void __init init_speculation_mitigations(void)
>>       * HVM guests can always poison the RSB to point at Xen supervisor
>>       * mappings.
>>       */
>> +    hvm_rsb = hvm_rsb_calculations(caps);
>> +    if ( opt_rsb_hvm == -1 )
>> +        opt_rsb_hvm = hvm_rsb != hvm_rsb_none;
>> +
>>      if ( opt_rsb_hvm )
>>      {
>> -        setup_force_cpu_cap(X86_FEATURE_SC_RSB_HVM);
>> +        switch ( hvm_rsb )
>> +        {
>> +        case hvm_rsb_pbrsb:
>> +            setup_force_cpu_cap(X86_BUG_PBRSB);
>> +            break;
>> +
>> +        case hvm_rsb_none:
>> +            /*
>> +             * Somewhat arbitrary.  If something is wrong and the user has
>> +             * forced HVM RSB protections on a system where we think nothing
>> +             * is necessary, they they possibly know something we dont.
>> +             *
>> +             * Use stuff32 in this case, which is the most protection we can
>> +             * muster.
>> +             */
>> +            fallthrough;
>> +
>> +        case hvm_rsb_stuff32:
>> +            setup_force_cpu_cap(X86_FEATURE_SC_RSB_HVM);
>> +            break;
>> +        }
>>  
>>          /*
>>           * For SVM, Xen's RSB safety actions are performed before STGI, so
> For people using e.g. "spec-ctrl=no-ibrs" but leaving RSB stuffing enabled
> (or force-enabling it) we'd need to have an LFENCE somewhere as well.

We don't need one, but the reasoning is subtle.

Attempting to exploit PBRSB is a sub-case of trying to exploit general
RSB speculation on other processors which don't flush the RSB on vmexit.

Xen doesn't architecturally execute more RETs than CALLs (unlike other
open source hypervisors which do have a problem here), so an attacker
first needs to control speculation to find a non-architectural path with
excess RETs.

This already makes it a lack-of-defence-in-depth type of problem,
because if the attacker could control speculation like that, they'd have
no need to chain it into a more complicated exploit like this.

An attacker has to find enough RETs to unwind all the CALLs Xen has made
thus far (3 in this example: 2 from the first RSB loop, and the call up
into the vmexit handler), and then one extra RET to consume the bad RSB
entry.  I.e. they need to find an unexpected code sequence in Xen with 4
excess RETs, assuming they can find a speculation-controlling gadget
within vmx_vmexit_handler() alone.
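
Purely as a toy model of that arithmetic (standalone C, nothing
resembling Xen code):

    #include <stdio.h>

    /* Toy model: how many excess speculative RETs an attacker needs
     * before the poisoned guest RSB entry would be consumed. */
    int main(void)
    {
        int calls_from_rsb_loop = 2;  /* the first RSB loop's CALLs, per the text above */
        int call_into_handler   = 1;  /* the CALL up into the vmexit handler */
        int benign_entries      = calls_from_rsb_loop + call_into_handler;  /* 3 */

        /* One further RET consumes the attacker-controlled entry. */
        int excess_rets_needed  = benign_entries + 1;                       /* 4 */

        printf("excess RETs an attacker must find: %d\n", excess_rets_needed);
        return 0;
    }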

All the HVM functions are altcalls now, which would have been the
obvious place to try to attack, but they can't be attacked any more.  We
do still have some indirect branches, with other mechanisms in place to
protect them.

But... an attacker has to do all of this in the speculative shadow of
the mispredicted loop exit, which takes it firmly from "theoretical"
into "impossible" territory.

~Andrew

Thread overview: 9+ messages
2022-08-09 17:00 [PATCH 0/2] x86/spec-ctrl: Reduce HVM RSB overhead Andrew Cooper
2022-08-09 17:00 ` [PATCH 1/2] x86/spec-ctrl: Enumeration for PBRSB_NO Andrew Cooper
2022-08-10  7:10   ` Jan Beulich
2022-08-09 17:00 ` [PATCH 2/2] x86/spec-ctrl: Reduce HVM RSB overhead where possible Andrew Cooper
2022-08-09 20:20   ` Jason Andryuk
2022-08-11 17:05     ` Andrew Cooper
2022-08-10  8:00   ` Jan Beulich
2022-08-11 18:06     ` Andrew Cooper [this message]
2022-08-11 10:55 ` [PATCH 0/2] x86/spec-ctrl: Reduce HVM RSB overhead Andrew Cooper
