All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: "Wu, Feng" <feng.wu@intel.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"keir@xen.org" <keir@xen.org>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	xen-devel <xen-devel@lists.xen.org>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Wu, Feng" <feng.wu@intel.com>
Subject: Re: Fwd: [v3 14/15] Update Posted-Interrupts Descriptor during vCPU scheduling
Date: Tue, 14 Jul 2015 14:08:02 +0000	[thread overview]
Message-ID: <E959C4978C3B6342920538CF579893F00260F191@SHSMSX104.ccr.corp.intel.com> (raw)
In-Reply-To: <1436526348.22672.387.camel@citrix.com>



> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Friday, July 10, 2015 7:06 PM
> To: Jan Beulich
> Cc: Wu, Feng; andrew.cooper3@citrix.com; George Dunlap; Tian, Kevin; Zhang,
> Yang Z; xen-devel; keir@xen.org
> Subject: Re: [Xen-devel] Fwd: [v3 14/15] Update Posted-Interrupts Descriptor
> during vCPU scheduling
> 
> On Fri, 2015-07-10 at 07:22 +0100, Jan Beulich wrote:
> > >>> On 10.07.15 at 07:59, <feng.wu@intel.com> wrote:
> > > If you agree with doing all this in a central place, maybe we can create
> > > an arch hook for 'struct scheduler' to do this and call it in all the places
> > > vcpu_runstate_change() gets called. What is your opinion about this?
> >
> > Doing this in a central place is certainly the right approach, but
> > adding an arch hook that needs to be called everywhere
> > vcpu_runstate_change() wouldn't serve that purpose.
> >
> Indeed.
> 
> > Instead
> > we'd need to replace all current vcpu_runstate_change() calls
> > with calls to a new function calling both this and the to be added
> > arch hook.
> >
> Well, I also see the value of having this done in one place, but not to
> the point of adding something like this.
> 
> > But please wait for George's / Dario's feedback, because they
> > seem to be even less convinced than me about your model of
> > tying the updates to runstate changes.
> >
> Indeed. George stated very well the reason why vcpu_runstate_change()
> should not be used, and suggested arch hooks to be added in the relevant
> places. I particularly like this idea as, not only it would leave
> vcpu_runstate_change() alone, but it would also help disentangling this
> from runstates, which, IMO, is also important.
> 
> So, can we identify the state (runstate? :-/) transitions that needs
> intercepting, and find a suitable place where to place hooks? I mean,
> something like this:
> 
>  - running-->blocked: can be handled in the arch specific part of
>                       context switch (similarly to CMT/CAT, which
>                       already hooks into there). So, in this case, no
>                       need to add any hook, as arch specific code is
>                       called already;
> 
>  - running-->runnable: same as above;
> 
>  - running-->offline: not sure if you need to take action on this. If
>                       yes, context switch should be fine as well;
> 
>  - blocked-->runnable: I think we need this, don't we? If yes, we
>                        probably want an arch hook in vcpu_wake();
> 
>  - blocked-->offline: do you need it? Well, the hook in wake should work
>                       for this as well, if yes;
> 
>  - runnable/running-->offline: if necessary, we want an hook in
>                                vcpu_sleep_nosync().
> 
> Another way to look at this, less biased toward runstates (i.e., what
> I've been asking for since a while), would be:
> 
>  - do you need to perform an action upon context switch (on prev and/or
>    next vcpu)? If yes, there's an arch specific path in there already;
>  - do you need to perform an action when a vcpu wakes-up? If yes, we
>    need an arch hook in vcpu_wake();
>  - do you need to perform an action when a vcpu goes to sleep? If yes,
>    we need an arch hook in vcpu_sleep_nosync();
> 
> I think this makes a more than fair solution. I happen to like it even
> better than the centralized approach, actually! That is for personal
> taste, but also because I think it may be useful for others too, in
> future, to be able to execute arch specific code, e.g., upon wakes-up,
> in which case we will be able to use the hook that we're introducing
> here for PI.
> 
> Thanks and Regards,
> Dario

Hi Dario,

Thanks for the suggestion! I made a draft patch for this idea, It may have
some issues since It is just a draft version, kind of like prototype, I post
it here just like to know whether it is meet your expectation, if it is I
can continue with this direction and this may speed up the upstreaming
process. Thanks a lot!

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 6eebc1a..7e678c8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -740,6 +740,81 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
     vmx_save_dr(v);
+
+    if ( iommu_intpost )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        struct pi_desc old, new;
+        unsigned long flags;
+
+        if ( vcpu_runnable(v) || !test_bit(_VPF_blocked, &v->pause_flags) )
+        {
+            /*
+             * The vCPU is preempted or sleeped. We don't need to send
+             * notification event to a non-running vcpu, the interrupt
+             * information will be delivered to it before VM-ENTRY when
+             * the vcpu is scheduled to run next time.
+             */
+            pi_set_sn(pi_desc);
+
+        }
+        else if ( test_bit(_VPF_blocked, &v->pause_flags) )
+        {
+            /* The vCPU is blocked */
+            ASSERT(v->arch.hvm_vmx.pi_block_cpu == -1);
+
+            /*
+             * The vCPU is blocked on the block list. Add the blocked
+             * vCPU on the list of the v->arch.hvm_vmx.pi_block_cpu,
+             * which is the destination of the wake-up notification event.
+             */
+            v->arch.hvm_vmx.pi_block_cpu = v->processor;
+            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
+                              v->arch.hvm_vmx.pi_block_cpu), flags);
+            list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
+                          &per_cpu(pi_blocked_vcpu, v->arch.hvm_vmx.pi_block_cpu));
+            spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
+                               v->arch.hvm_vmx.pi_block_cpu), flags);
+
+            do {
+                old.control = new.control = pi_desc->control;
+
+                /*
+                 * We should not block the vCPU if
+                 * an interrupt was posted for it.
+                 */
+
+                if ( old.on )
+                {
+                    /*
+                     * The vCPU will be removed from the block list
+                     * during its state transferring from RUNSTATE_blocked
+                     * to RUNSTATE_runnable after the following tasklet
+                     * is executed.
+                     */
+                    tasklet_schedule(&v->arch.hvm_vmx.pi_vcpu_wakeup_tasklet);
+                    return;
+                }
+
+                /*
+                 * Change the 'NDST' field to v->arch.hvm_vmx.pi_block_cpu,
+                 * so when external interrupts from assigned deivces happen,
+                 * wakeup notifiction event will go to
+                 * v->arch.hvm_vmx.pi_block_cpu, then in pi_wakeup_interrupt()
+                 * we can find the vCPU in the right list to wake up.
+                 */
+                if ( x2apic_enabled )
+                    new.ndst = cpu_physical_id(v->arch.hvm_vmx.pi_block_cpu);
+                else
+                    new.ndst = MASK_INSR(cpu_physical_id(
+                                     v->arch.hvm_vmx.pi_block_cpu),
+                                     PI_xAPIC_NDST_MASK);
+                new.sn = 0;
+                new.nv = pi_wakeup_vector;
+            } while ( cmpxchg(&pi_desc->control, old.control, new.control)
+                      != old.control );
+        }
+    }
 }

 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -764,6 +839,22 @@ static void vmx_ctxt_switch_to(struct vcpu *v)

     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
+
+    if ( iommu_intpost )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+        ASSERT( pi_desc->sn == 1 );
+
+        if ( x2apic_enabled )
+            write_atomic(&pi_desc->ndst, cpu_physical_id(v->processor));
+        else
+            write_atomic(&pi_desc->ndst,
+                         MASK_INSR(cpu_physical_id(v->processor),
+                         PI_xAPIC_NDST_MASK));
+
+        pi_clear_sn(pi_desc);
+    }
 }


@@ -1900,6 +1991,42 @@ static void vmx_pi_desc_update(struct vcpu *v, int old_state)
     }
 } 
+void arch_vcpu_wake(struct vcpu *v)
+{
+    if ( !iommu_intpost || (v->runstate.state != RUNSTATE_blocked) )
+        return;
+
+    if ( likely(vcpu_runnable(v)) ||
+         !test_bit(_VPF_blocked, &v->pause_flags) )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        unsigned long flags;
+
+        /*
+         * blocked -> runnable/offline
+         * If the state is transferred from RUNSTATE_blocked,
+         * we should set 'NV' feild back to posted_intr_vector,
+         * so the Posted-Interrupts can be delivered to the vCPU
+         * by VT-d HW after it is scheduled to run.
+         */
+        write_atomic((uint8_t*)&pi_desc->nv, posted_intr_vector);
+
+        /*
+         * Delete the vCPU from the related block list
+         * if we are resuming from blocked state
+         */
+        if (v->arch.hvm_vmx.pi_block_cpu != -1)
+        {
+            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
+                              v->arch.hvm_vmx.pi_block_cpu), flags);
+            list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+            spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
+                                   v->arch.hvm_vmx.pi_block_cpu), flags);
+            v->arch.hvm_vmx.pi_block_cpu = -1;
+        }
+    }
+}
+
 void vmx_hypervisor_cpuid_leaf(uint32_t sub_idx,
                                uint32_t *eax, uint32_t *ebx,
                                uint32_t *ecx, uint32_t *edx)
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 20727d6..7b08797 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -397,6 +397,8 @@ void vcpu_wake(struct vcpu *v)
             vcpu_runstate_change(v, RUNSTATE_offline, NOW());
     }

+    arch_vcpu_wake(v);
+
     vcpu_schedule_unlock_irqrestore(lock, flags, v);

     TRACE_2D(TRC_SCHED_WAKE, v->domain->domain_id, v->vcpu_id);
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 9603cf0..be5aebf 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -266,6 +266,7 @@ static inline unsigned int domain_max_vcpus(const struct domain *d)
 }

 static void arch_pi_desc_update(struct vcpu *v, int old_state) {}
+static void arch_vcpu_wake(struct vcpu *v) {}

 #endif /* __ASM_DOMAIN_H__ */

diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index e175417..38c796c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -511,6 +511,7 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);

 void arch_pi_desc_update(struct vcpu *v, int old_state);
+void arch_vcpu_wake(struct vcpu *v);

 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */



> 
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

  parent reply	other threads:[~2015-07-14 14:08 UTC|newest]

Thread overview: 155+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-24  5:18 [v3 00/15] Add VT-d Posted-Interrupts support Feng Wu
2015-06-24  5:18 ` [v3 01/15] Vt-d Posted-intterrupt (PI) design Feng Wu
2015-06-24  6:15   ` Meng Xu
2015-06-24  6:19     ` Wu, Feng
2015-07-08  7:21   ` Tian, Kevin
2015-07-08  7:29     ` Wu, Feng
2015-06-24  5:18 ` [v3 02/15] Add helper macro for X86_FEATURE_CX16 feature detection Feng Wu
2015-06-24 17:31   ` Andrew Cooper
2015-07-08  7:23   ` Tian, Kevin
2015-06-24  5:18 ` [v3 03/15] Add cmpxchg16b support for x86-64 Feng Wu
2015-06-24 18:35   ` Andrew Cooper
2015-07-08  7:06     ` Wu, Feng
2015-07-08  8:12       ` Jan Beulich
2015-07-08  8:33         ` Wu, Feng
2015-07-08  8:43           ` Jan Beulich
2015-07-08  8:50             ` Wu, Feng
2015-07-08  8:50         ` Andrew Cooper
2015-07-10 12:57   ` Jan Beulich
2015-06-24  5:18 ` [v3 04/15] iommu: Add iommu_intpost to control VT-d Posted-Interrupts feature Feng Wu
2015-06-25  9:06   ` Andrew Cooper
2015-06-25  9:47     ` Wu, Feng
2015-06-25 10:16       ` Andrew Cooper
2015-06-25 12:47         ` Wu, Feng
2015-07-08  7:30   ` Tian, Kevin
2015-06-24  5:18 ` [v3 05/15] vt-d: VT-d Posted-Interrupts feature detection Feng Wu
2015-06-25 10:21   ` Andrew Cooper
2015-06-25 13:02     ` Wu, Feng
2015-07-08  7:32   ` Tian, Kevin
2015-07-08  8:00     ` Wu, Feng
2015-06-24  5:18 ` [v3 06/15] vmx: Extend struct pi_desc to support VT-d Posted-Interrupts Feng Wu
2015-06-29 15:04   ` Andrew Cooper
2015-07-08  7:48   ` Tian, Kevin
2015-07-10 13:08   ` Jan Beulich
2015-07-15  2:40     ` Wu, Feng
2015-07-15  8:20       ` Jan Beulich
2015-07-15  8:26         ` Wu, Feng
2015-07-15  8:36           ` Jan Beulich
2015-07-15  8:43             ` Wu, Feng
2015-07-15  9:28               ` Jan Beulich
2015-07-15  9:30                 ` Wu, Feng
2015-07-15  3:13     ` Wu, Feng
2015-06-24  5:18 ` [v3 07/15] vmx: Initialize VT-d Posted-Interrupts Descriptor Feng Wu
2015-06-29 15:32   ` Andrew Cooper
2015-06-30  1:46     ` Wu, Feng
2015-06-30  2:32     ` Dario Faggioli
2015-07-08  7:53   ` Tian, Kevin
2015-06-24  5:18 ` [v3 08/15] Suppress posting interrupts when 'SN' is set Feng Wu
2015-06-29 15:41   ` Andrew Cooper
2015-06-30  1:48     ` Wu, Feng
2015-07-08  9:06   ` Tian, Kevin
2015-07-08 10:11     ` Wu, Feng
2015-07-08 11:31       ` Tian, Kevin
2015-07-08 11:58         ` Wu, Feng
2015-07-10 13:20   ` Jan Beulich
2015-06-24  5:18 ` [v3 09/15] vt-d: Extend struct iremap_entry to support VT-d Posted-Interrupts Feng Wu
2015-06-29 16:04   ` Andrew Cooper
2015-06-30  1:52     ` Wu, Feng
2015-07-08  9:10   ` Tian, Kevin
2015-07-10 13:27   ` Jan Beulich
2015-06-24  5:18 ` [v3 10/15] vt-d: Add API to update IRTE when VT-d PI is used Feng Wu
2015-06-29 16:22   ` Andrew Cooper
2015-07-08  9:59   ` Tian, Kevin
2015-07-08 10:12     ` Wu, Feng
2015-07-10 14:01   ` Jan Beulich
2015-07-15  6:04     ` Wu, Feng
2015-07-15  8:24       ` Jan Beulich
2015-07-15  8:38         ` Wu, Feng
2015-07-15  8:46           ` Jan Beulich
2015-07-15  8:55             ` Wu, Feng
2015-07-15  9:32               ` Jan Beulich
2015-06-24  5:18 ` [v3 11/15] Update IRTE according to guest interrupt config changes Feng Wu
2015-06-29 16:46   ` Andrew Cooper
2015-07-08 10:22   ` Tian, Kevin
2015-07-08 10:31     ` Wu, Feng
2015-07-08 11:46       ` Tian, Kevin
2015-07-08 11:52         ` Wu, Feng
2015-07-08 11:54           ` Tian, Kevin
2015-07-10 14:23   ` Jan Beulich
2015-06-24  5:18 ` [v3 12/15] vmx: posted-interrupt handling when vCPU is blocked Feng Wu
2015-06-29 17:07   ` Andrew Cooper
2015-07-08 10:36     ` Wu, Feng
2015-07-08 10:48       ` Jan Beulich
     [not found]   ` <559181F9.6020106@citrix.com>
2015-06-30  2:51     ` Fwd: " Dario Faggioli
2015-06-30  2:59       ` Wu, Feng
2015-06-30  9:46         ` Dario Faggioli
2015-06-30 10:11   ` Andrew Cooper
2015-07-01 13:26     ` Dario Faggioli
2015-07-02  4:27       ` Wu, Feng
2015-07-02  8:30         ` Dario Faggioli
2015-07-02  8:58           ` Wu, Feng
2015-07-02 10:09             ` Dario Faggioli
2015-07-02 10:41               ` Wu, Feng
2015-07-02 10:30           ` Andrew Cooper
2015-07-02 10:56             ` Wu, Feng
2015-07-02 12:04             ` Dario Faggioli
2015-07-02 12:10               ` Wu, Feng
2015-07-02 12:16               ` Andrew Cooper
2015-07-02 12:38                 ` Dario Faggioli
2015-07-02 12:59                   ` Andrew Cooper
2015-07-03  1:33                     ` Wu, Feng
2015-07-02  4:25     ` Wu, Feng
2015-07-08 11:00   ` Tian, Kevin
2015-07-08 11:02     ` Wu, Feng
2015-07-08 12:46     ` Jan Beulich
2015-07-08 13:09       ` Andrew Cooper
2015-07-08 22:49         ` Tian, Kevin
2015-07-09  7:25           ` Jan Beulich
2015-07-10  6:21             ` Wu, Feng
2015-07-10  6:32               ` Jan Beulich
2015-07-10  7:29                 ` Wu, Feng
2015-07-10  8:49                   ` Jan Beulich
2015-07-10  8:57                     ` Wu, Feng
2015-07-08 22:31       ` Tian, Kevin
2015-06-24  5:18 ` [v3 13/15] vmx: Properly handle notification event when vCPU is running Feng Wu
2015-07-08 11:03   ` Tian, Kevin
2015-07-10 14:40   ` Jan Beulich
2015-06-24  5:18 ` [v3 14/15] Update Posted-Interrupts Descriptor during vCPU scheduling Feng Wu
     [not found]   ` <55918214.4030102@citrix.com>
2015-06-30  2:58     ` Fwd: " Dario Faggioli
2015-07-02  4:32       ` Wu, Feng
2015-07-02  4:34         ` Wu, Feng
2015-07-02  8:20         ` Dario Faggioli
2015-07-09  3:09           ` Wu, Feng
2015-07-09  8:18             ` Dario Faggioli
2015-07-09 11:19             ` George Dunlap
2015-07-09 11:29               ` George Dunlap
2015-07-09 11:38               ` Wu, Feng
2015-07-09 12:42                 ` Dario Faggioli
2015-07-10  0:07                   ` Wu, Feng
2015-07-10 12:40                     ` Dario Faggioli
2015-07-10 13:47                       ` Konrad Rzeszutek Wilk
2015-07-10 13:59                         ` Dario Faggioli
2015-07-09 12:53                 ` George Dunlap
2015-07-09 13:44                   ` Jan Beulich
2015-07-09 14:18                     ` Dario Faggioli
2015-07-09 14:27                       ` George Dunlap
2015-07-09 14:47                         ` Dario Faggioli
2015-07-10  5:59                         ` Wu, Feng
2015-07-10  6:22                           ` Jan Beulich
2015-07-10 11:05                             ` Dario Faggioli
2015-07-14  5:44                               ` Wu, Feng
2015-07-14 14:08                               ` Wu, Feng [this message]
2015-07-14 14:54                                 ` Jan Beulich
2015-07-14 15:20                                   ` Dario Faggioli
2015-07-14 16:41                                     ` George Dunlap
2015-07-14 16:02                                 ` Dario Faggioli
2015-07-15  0:54                                   ` Wu, Feng
2015-07-17  7:46                                   ` Wu, Feng
2015-07-17 10:13                                     ` Dario Faggioli
2015-07-17 22:57                                       ` Wu, Feng
2015-07-18 13:43                                         ` Dario Faggioli
2015-07-10  0:15                   ` Wu, Feng
2015-07-08 11:24   ` Tian, Kevin
2015-07-10 14:48   ` Jan Beulich
2015-06-24  5:18 ` [v3 15/15] Add a command line parameter for VT-d posted-interrupts Feng Wu
2015-07-08 11:25   ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E959C4978C3B6342920538CF579893F00260F191@SHSMSX104.ccr.corp.intel.com \
    --to=feng.wu@intel.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.