Xen-Devel Archive mirror
* [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
@ 2015-06-08 14:33 Paul Durrant
  2015-06-08 14:33 ` [PATCH 01/17] x86/hvm: simplify hvmemul_do_io() Paul Durrant
                   ` (17 more replies)
  0 siblings, 18 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC
  To: xen-devel; +Cc: Paul Durrant

This patch series re-works much of the code involved in emulation of port
and memory mapped I/O for HVM guests.

The code has become very convoluted and, by inspection, certain
emulations appear to malfunction.

The series is broken down into 17 patches (which are also available in
my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
on the emulation18 branch) as follows:

x86/hvm: simplify hvmemul_do_io()
x86/hvm: re-name struct hvm_mmio_handler to hvm_mmio_ops
x86/hvm: unify internal portio and mmio intercepts
x86/hvm: unify dpci portio intercept with standard portio intercept
x86/hvm: unify stdvga mmio intercept with standard mmio intercept
x86/hvm: revert 82ed8716b "fix direct PCI port I/O emulation retry...
x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
x86/hvm: split I/O completion handling from state model
x86/hvm: remove hvm_io_pending() check in hvmemul_do_io()
x86/hvm: remove HVMIO_dispatched I/O state
x86/hvm: remove hvm_io_state enumeration
x86/hvm: use ioreq_t to track in-flight state
x86/hvm: only acquire RAM pages for emulation when we need to
x86/hvm: remove extraneous parameter from hvmtrace_io_assist()
x86/hvm: make sure translated MMIO reads or writes fall within a page
x86/hvm: remove multiple open coded 'chunking' loops
x86/hvm: track large memory mapped accesses by linear address


* [PATCH 01/17] x86/hvm: simplify hvmemul_do_io()
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 02/17] x86/hvm: re-name struct hvm_mmio_handler to hvm_mmio_ops Paul Durrant
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

Currently hvmemul_do_io() handles paging for I/O to/from a guest address
inline, which forces every exit point to execute:

if ( ram_page )
    put_page(ram_page);

This patch introduces the wrapper functions hvmemul_do_io_addr() and
hvmemul_do_io_buffer(). The latter is used for I/O to/from a Xen buffer,
so the complexity of paging can be confined to the former, making the
common hvmemul_do_io() function less convoluted.

This patch also tightens up some types and introduces pio/mmio wrappers
for the above functions with comments to document their semantics.
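For illustration, here is a minimal, compilable sketch of the shape this
change moves to; the names (acquire_page(), release_page(), do_io(),
do_io_addr(), do_io_buffer()) and types are simplified stand-ins, not
the real Xen declarations in the diff below:

#include <stdio.h>

#define X86EMUL_OKAY  0
#define X86EMUL_RETRY 1

struct page_info { int refs; };

/* Stand-in for get_page_from_gfn() plus the p2m paging checks. */
static int acquire_page(struct page_info *page)
{
    page->refs++;
    return X86EMUL_OKAY;
}

/* Stand-in for put_page(). */
static void release_page(struct page_info *page)
{
    page->refs--;
}

/* Common path: exit points no longer need 'if ( ram_page ) put_page()'. */
static int do_io(unsigned long addr)
{
    if ( addr == 0 )
        return X86EMUL_RETRY; /* early exit needs no page cleanup */
    return X86EMUL_OKAY;
}

/* I/O to/from guest RAM: the paging complexity lives only here. */
static int do_io_addr(unsigned long addr, struct page_info *ram_page)
{
    int rc = acquire_page(ram_page);

    if ( rc != X86EMUL_OKAY )
        return rc;

    rc = do_io(addr);
    release_page(ram_page); /* single, unconditional release */

    return rc;
}

/* I/O to/from a Xen buffer: no guest page to pin at all. */
static int do_io_buffer(unsigned long addr)
{
    return do_io(addr);
}

int main(void)
{
    struct page_info page = { 0 };

    printf("rc=%d refs=%d rc=%d\n",
           do_io_addr(0x1000, &page), page.refs, do_io_buffer(0x1000));
    return 0;
}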

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c        |  275 ++++++++++++++++++++++++-------------
 xen/arch/x86/hvm/io.c             |    2 +-
 xen/include/asm-x86/hvm/emulate.h |   18 ++-
 3 files changed, 199 insertions(+), 96 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ac9c9d6..fa70ec0 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,41 +51,23 @@ static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-    int is_mmio, paddr_t addr, unsigned long *reps, int size,
-    paddr_t ram_gpa, int dir, int df, void *p_data)
+    int is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    uint8_t dir, bool_t df, bool_t data_is_addr, uint64_t data)
 {
     struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     ioreq_t p = {
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
         .dir = dir,
         .df = df,
-        .data = ram_gpa,
-        .data_is_ptr = (p_data == NULL),
+        .data = data,
+        .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
     };
-    unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
-    p2m_type_t p2mt;
-    struct page_info *ram_page;
+    void *p_data = (void *)data;
     int rc;
 
-    /* Check for paged out page */
-    ram_page = get_page_from_gfn(curr->domain, ram_gfn, &p2mt, P2M_UNSHARE);
-    if ( p2m_is_paging(p2mt) )
-    {
-        if ( ram_page )
-            put_page(ram_page);
-        p2m_mem_paging_populate(curr->domain, ram_gfn);
-        return X86EMUL_RETRY;
-    }
-    if ( p2m_is_shared(p2mt) )
-    {
-        if ( ram_page )
-            put_page(ram_page);
-        return X86EMUL_RETRY;
-    }
-
     /*
      * Weird-sized accesses have undefined behaviour: we discard writes
      * and read all-ones.
@@ -93,23 +75,13 @@ static int hvmemul_do_io(
     if ( unlikely((size > sizeof(long)) || (size & (size - 1))) )
     {
         gdprintk(XENLOG_WARNING, "bad mmio size %d\n", size);
-        ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
+        ASSERT(!data_is_addr); /* cannot happen with a REP prefix */
         if ( dir == IOREQ_READ )
             memset(p_data, ~0, size);
-        if ( ram_page )
-            put_page(ram_page);
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( !p.data_is_ptr && (dir == IOREQ_WRITE) )
-    {
-        memcpy(&p.data, p_data, size);
-        p_data = NULL;
-    }
-
-    vio = &curr->arch.hvm_vcpu.hvm_io;
-
-    if ( is_mmio && !p.data_is_ptr )
+    if ( is_mmio && !data_is_addr )
     {
         /* Part of a multi-cycle read or write? */
         if ( dir == IOREQ_WRITE )
@@ -117,11 +89,7 @@ static int hvmemul_do_io(
             paddr_t pa = vio->mmio_large_write_pa;
             unsigned int bytes = vio->mmio_large_write_bytes;
             if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                if ( ram_page )
-                    put_page(ram_page);
                 return X86EMUL_OKAY;
-            }
         }
         else
         {
@@ -131,8 +99,6 @@ static int hvmemul_do_io(
             {
                 memcpy(p_data, &vio->mmio_large_read[addr - pa],
                        size);
-                if ( ram_page )
-                    put_page(ram_page);
                 return X86EMUL_OKAY;
             }
         }
@@ -144,40 +110,28 @@ static int hvmemul_do_io(
         break;
     case HVMIO_completed:
         vio->io_state = HVMIO_none;
-        if ( p_data == NULL )
-        {
-            if ( ram_page )
-                put_page(ram_page);
+        if ( data_is_addr || dir == IOREQ_WRITE )
             return X86EMUL_UNHANDLEABLE;
-        }
         goto finish_access;
     case HVMIO_dispatched:
         /* May have to wait for previous cycle of a multi-write to complete. */
-        if ( is_mmio && !p.data_is_ptr && (dir == IOREQ_WRITE) &&
+        if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
              (addr == (vio->mmio_large_write_pa +
                        vio->mmio_large_write_bytes)) )
-        {
-            if ( ram_page )
-                put_page(ram_page);
             return X86EMUL_RETRY;
-        }
         /* fallthrough */
     default:
-        if ( ram_page )
-            put_page(ram_page);
         return X86EMUL_UNHANDLEABLE;
     }
 
     if ( hvm_io_pending(curr) )
     {
         gdprintk(XENLOG_WARNING, "WARNING: io already pending?\n");
-        if ( ram_page )
-            put_page(ram_page);
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state =
-        (p_data == NULL) ? HVMIO_dispatched : HVMIO_awaiting_completion;
+    vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
+        HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
 
     /*
@@ -190,7 +144,12 @@ static int hvmemul_do_io(
     p.count = *reps;
 
     if ( dir == IOREQ_WRITE )
+    {
+        if ( !data_is_addr )
+            memcpy(&p.data, (void *)data, size);
+
         hvmtrace_io_assist(is_mmio, &p);
+    }
 
     if ( is_mmio )
     {
@@ -235,7 +194,7 @@ static int hvmemul_do_io(
             rc = X86EMUL_RETRY;
             if ( !hvm_send_assist_req(s, &p) )
                 vio->io_state = HVMIO_none;
-            else if ( p_data == NULL )
+            else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
         }
         break;
@@ -245,20 +204,18 @@ static int hvmemul_do_io(
     }
 
     if ( rc != X86EMUL_OKAY )
-    {
-        if ( ram_page )
-            put_page(ram_page);
         return rc;
-    }
 
  finish_access:
     if ( dir == IOREQ_READ )
+    {
         hvmtrace_io_assist(is_mmio, &p);
 
-    if ( p_data != NULL )
-        memcpy(p_data, &vio->io_data, size);
+        if ( !data_is_addr )
+            memcpy((void *)data, &vio->io_data, size);
+    }
 
-    if ( is_mmio && !p.data_is_ptr )
+    if ( is_mmio && !data_is_addr )
     {
         /* Part of a multi-cycle read or write? */
         if ( dir == IOREQ_WRITE )
@@ -285,23 +242,153 @@ static int hvmemul_do_io(
         }
     }
 
-    if ( ram_page )
-        put_page(ram_page);
     return X86EMUL_OKAY;
 }
 
-int hvmemul_do_pio(
-    unsigned long port, unsigned long *reps, int size,
-    paddr_t ram_gpa, int dir, int df, void *p_data)
+int hvmemul_do_io_buffer(
+    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    uint8_t dir, bool_t df, void *buffer)
 {
-    return hvmemul_do_io(0, port, reps, size, ram_gpa, dir, df, p_data);
+    BUG_ON(buffer == NULL);
+
+    return hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
+                         (uint64_t)buffer);
 }
 
-static int hvmemul_do_mmio(
-    paddr_t gpa, unsigned long *reps, int size,
-    paddr_t ram_gpa, int dir, int df, void *p_data)
+static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
+{
+    struct vcpu *v = current;
+    struct domain *d = v->domain;
+    p2m_type_t p2mt;
+
+    *page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
+
+    if ( *page == NULL )
+        return X86EMUL_UNHANDLEABLE;
+
+    if ( p2m_is_paging(p2mt) )
+    {
+        put_page(*page);
+        p2m_mem_paging_populate(d, gmfn);
+        return X86EMUL_RETRY;
+    }
+
+    if ( p2m_is_shared(p2mt) )
+    {
+        put_page(*page);
+        return X86EMUL_RETRY;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static void hvmemul_release_page(struct page_info *page)
+{
+    put_page(page);
+}
+
+int hvmemul_do_io_addr(
+    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+    uint8_t dir, bool_t df, paddr_t ram_addr)
+{
+    struct page_info *ram_page;
+    int rc;
+
+    rc = hvmemul_acquire_page(paddr_to_pfn(ram_addr), &ram_page);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
+                       (uint64_t)ram_addr);
+
+    hvmemul_release_page(ram_page);
+
+    return rc;
+}
+
+/*
+ * Perform I/O between <port> and <buffer>. <dir> indicates the
+ * direction: IOREQ_READ means a read from <port> to <buffer> and
+ * IOREQ_WRITE means a write from <buffer> to <port>. Each access has
+ * width <size> and up to *<reps> accesses will be performed. If
+ * X86EMUL_OKAY is returned then <reps> will be updated with the number
+ * of accesses actually performed.
+ *
+ * NOTE: If *<reps> is greater than 1, each access will use the
+ *       <buffer> pointer; there is no implicit iteration over a
+ *       block of memory starting at <buffer>.
+ */
+int hvmemul_do_pio_buffer(uint16_t port,
+                          unsigned long *reps,
+                          unsigned int size,
+                          uint8_t dir,
+                          void *buffer)
+{
+    return hvmemul_do_io_buffer(0, port, reps, size, dir, 0, buffer);
+}
+
+/*
+ * Perform I/O between <port> and guest RAM starting at <ram_addr>.
+ * <dir> indicates the direction: IOREQ_READ means a read from <port> to
+ * RAM and IOREQ_WRITE means a write from RAM to <port>. Each access has
+ * width <size> and up to *<reps> accesses will be performed. If
+ * X86EMUL_OKAY is returned then <reps> will be updated with the number
+ * of accesses actually performed.
+ * Each access will be done to/from successive RAM addresses, increasing
+ * if <df> is 0 or decreasing if <df> is 1.
+ */
+static int hvmemul_do_pio_addr(uint16_t port,
+                               unsigned long *reps,
+                               unsigned int size,
+                               uint8_t dir,
+                               bool_t df,
+                               paddr_t ram_addr)
+{
+    return hvmemul_do_io_addr(0, port, reps, size, dir, df, ram_addr);
+}
+
+/*
+ * Perform I/O between MMIO space starting at <mmio_addr> and <buffer>.
+ * <dir> indicates the direction: IOREQ_READ means a read from MMIO to
+ * <buffer> and IOREQ_WRITE means a write from <buffer> to MMIO. Each
+ * access has width <size> and up to *<reps> accesses will be performed.
+ * If X86EMUL_OKAY is returned then <reps> will be updated with the number
+ * of accesses actually performed.
+ * Each access will be done to/from successive MMIO addresses, increasing
+ * if <df> is 0 or decreasing if <df> is 1.
+ *
+ * NOTE: If *<reps> is greater than 1, each access will use the
+ *       <buffer> pointer; there is no implicit iteration over a
+ *       block of memory starting at <buffer>.
+ */
+static int hvmemul_do_mmio_buffer(paddr_t mmio_addr,
+                                  unsigned long *reps,
+                                  unsigned int size,
+                                  uint8_t dir,
+                                  bool_t df,
+                                  void *buffer)
+{
+    return hvmemul_do_io_buffer(1, mmio_addr, reps, size, dir, df, buffer);
+}
+
+/*
+ * Perform I/O between MMIO space starting at <mmio_addr> and guest RAM
+ * starting at <ram_addr>. <dir> indicates the direction: IOREQ_READ
+ * means a read from MMIO to RAM and IOREQ_WRITE means a write from RAM
+ * to MMIO. Each access has width <size> and up to *<reps> accesses will
+ * be performed. If X86EMUL_OKAY is returned then <reps> will be updated
+ * with the number of accesses actually performed.
+ * Each access will be done to/from successive RAM *and* MMIO addresses,
+ * increasing if <df> is 0 or decreasing if <df> is 1.
+ */
+static int hvmemul_do_mmio_addr(paddr_t mmio_addr,
+                                unsigned long *reps,
+                                unsigned int size,
+                                uint8_t dir,
+                                bool_t df,
+                                paddr_t ram_addr)
 {
-    return hvmemul_do_io(1, gpa, reps, size, ram_gpa, dir, df, p_data);
+    return hvmemul_do_io_addr(1, mmio_addr, reps, size, dir, df, ram_addr);
 }
 
 /*
@@ -503,7 +590,8 @@ static int __hvmemul_read(
         gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
         while ( (off + chunk) <= PAGE_SIZE )
         {
-            rc = hvmemul_do_mmio(gpa, &reps, chunk, 0, IOREQ_READ, 0, p_data);
+            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
+                                        p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 return rc;
             addr += chunk;
@@ -537,7 +625,8 @@ static int __hvmemul_read(
                                     hvmemul_ctxt);
         while ( rc == X86EMUL_OKAY )
         {
-            rc = hvmemul_do_mmio(gpa, &reps, chunk, 0, IOREQ_READ, 0, p_data);
+            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
+                                        p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 break;
             addr += chunk;
@@ -645,7 +734,8 @@ static int hvmemul_write(
         gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
         while ( (off + chunk) <= PAGE_SIZE )
         {
-            rc = hvmemul_do_mmio(gpa, &reps, chunk, 0, IOREQ_WRITE, 0, p_data);
+            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
+                                        p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 return rc;
             addr += chunk;
@@ -675,7 +765,8 @@ static int hvmemul_write(
                                     hvmemul_ctxt);
         while ( rc == X86EMUL_OKAY )
         {
-            rc = hvmemul_do_mmio(gpa, &reps, chunk, 0, IOREQ_WRITE, 0, p_data);
+            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
+                                        p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 break;
             addr += chunk;
@@ -849,8 +940,8 @@ static int hvmemul_rep_ins(
     if ( p2mt == p2m_mmio_direct || p2mt == p2m_mmio_dm )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvmemul_do_pio(src_port, reps, bytes_per_rep, gpa, IOREQ_READ,
-                          !!(ctxt->regs->eflags & X86_EFLAGS_DF), NULL);
+    return hvmemul_do_pio_addr(src_port, reps, bytes_per_rep, IOREQ_READ,
+                               !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
 }
 
 static int hvmemul_rep_outs(
@@ -887,8 +978,8 @@ static int hvmemul_rep_outs(
     if ( p2mt == p2m_mmio_direct || p2mt == p2m_mmio_dm )
         return X86EMUL_UNHANDLEABLE;
 
-    return hvmemul_do_pio(dst_port, reps, bytes_per_rep, gpa, IOREQ_WRITE,
-                          !!(ctxt->regs->eflags & X86_EFLAGS_DF), NULL);
+    return hvmemul_do_pio_addr(dst_port, reps, bytes_per_rep, IOREQ_WRITE,
+                               !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
 }
 
 static int hvmemul_rep_movs(
@@ -944,12 +1035,12 @@ static int hvmemul_rep_movs(
         return X86EMUL_UNHANDLEABLE;
 
     if ( sp2mt == p2m_mmio_dm )
-        return hvmemul_do_mmio(
-            sgpa, reps, bytes_per_rep, dgpa, IOREQ_READ, df, NULL);
+        return hvmemul_do_mmio_addr(
+            sgpa, reps, bytes_per_rep, IOREQ_READ, df, dgpa);
 
     if ( dp2mt == p2m_mmio_dm )
-        return hvmemul_do_mmio(
-            dgpa, reps, bytes_per_rep, sgpa, IOREQ_WRITE, df, NULL);
+        return hvmemul_do_mmio_addr(
+            dgpa, reps, bytes_per_rep, IOREQ_WRITE, df, sgpa);
 
     /* RAM-to-RAM copy: emulate as equivalent of memmove(dgpa, sgpa, bytes). */
     bytes = *reps * bytes_per_rep;
@@ -1102,8 +1193,8 @@ static int hvmemul_rep_stos(
         return X86EMUL_UNHANDLEABLE;
 
     case p2m_mmio_dm:
-        return hvmemul_do_mmio(gpa, reps, bytes_per_rep, 0, IOREQ_WRITE, df,
-                               p_data);
+        return hvmemul_do_mmio_buffer(gpa, reps, bytes_per_rep, IOREQ_WRITE,
+                                      df, p_data);
     }
 }
 
@@ -1142,7 +1233,7 @@ static int hvmemul_read_io(
 {
     unsigned long reps = 1;
     *val = 0;
-    return hvmemul_do_pio(port, &reps, bytes, 0, IOREQ_READ, 0, val);
+    return hvmemul_do_pio_buffer(port, &reps, bytes, IOREQ_READ, val);
 }
 
 static int hvmemul_write_io(
@@ -1152,7 +1243,7 @@ static int hvmemul_write_io(
     struct x86_emulate_ctxt *ctxt)
 {
     unsigned long reps = 1;
-    return hvmemul_do_pio(port, &reps, bytes, 0, IOREQ_WRITE, 0, &val);
+    return hvmemul_do_pio_buffer(port, &reps, bytes, IOREQ_WRITE, &val);
 }
 
 static int hvmemul_read_cr(
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 68fb890..2ba6272 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -140,7 +140,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
     if ( dir == IOREQ_WRITE )
         data = guest_cpu_user_regs()->eax;
 
-    rc = hvmemul_do_pio(port, &reps, size, 0, dir, 0, &data);
+    rc = hvmemul_do_pio_buffer(port, &reps, size, dir, &data);
 
     switch ( rc )
     {
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index b3971c8..f69e758 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -50,11 +50,23 @@ struct segment_register *hvmemul_get_seg_reg(
     enum x86_segment seg,
     struct hvm_emulate_ctxt *hvmemul_ctxt);
 
-int hvmemul_do_pio(
-    unsigned long port, unsigned long *reps, int size,
-    paddr_t ram_gpa, int dir, int df, void *p_data);
+int hvmemul_do_pio_buffer(uint16_t port,
+                          unsigned long *reps,
+                          unsigned int size,
+                          uint8_t dir,
+                          void *buffer);
 
 void hvm_dump_emulation_state(const char *prefix,
                               struct hvm_emulate_ctxt *hvmemul_ctxt);
 
 #endif /* __ASM_X86_HVM_EMULATE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4


* [PATCH 02/17] x86/hvm: re-name struct hvm_mmio_handler to hvm_mmio_ops
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-06-08 14:33 ` [PATCH 01/17] x86/hvm: simplify hvmemul_do_io() Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 03/17] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The struct just contains three methods and no data, so the name
hvm_mmio_ops more accurately reflects its content. A subsequent patch
introduces a new structure that better warrants the name
hvm_mmio_handler, so doing the rename now, in this purely cosmetic
patch, avoids conflating functional and cosmetic changes.
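For illustration, the ops-table idiom the new name reflects looks like
this in miniature; the fake_* handlers and the main() harness are
invented placeholders, not the real hpet/vlapic code in the diff below:

#include <stdio.h>

/* Miniature version of the renamed structure: three methods, no data. */
typedef int (*mmio_check_t)(unsigned long addr);
typedef int (*mmio_read_t)(unsigned long addr, unsigned long *val);
typedef int (*mmio_write_t)(unsigned long addr, unsigned long val);

struct hvm_mmio_ops {
    mmio_check_t check;
    mmio_read_t  read;
    mmio_write_t write;
};

static int fake_check(unsigned long addr) { return addr < 0x1000; }
static int fake_read(unsigned long addr, unsigned long *val)
{
    *val = ~0UL;
    return 0;
}
static int fake_write(unsigned long addr, unsigned long val) { return 0; }

static const struct hvm_mmio_ops fake_mmio_ops = {
    .check = fake_check,
    .read  = fake_read,
    .write = fake_write
};

int main(void)
{
    unsigned long val = 0;

    /* Call sites now read 'ops.read(...)' rather than
     * 'handler.read_handler(...)'. */
    if ( fake_mmio_ops.check(0x10) )
        fake_mmio_ops.read(0x10, &val);
    fake_mmio_ops.write(0x10, val);

    printf("val=%#lx\n", val);
    return 0;
}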

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/tests/vhpet/emul.h                  |    8 +++---
 tools/tests/vhpet/main.c                  |    6 ++---
 xen/arch/x86/hvm/hpet.c                   |    8 +++---
 xen/arch/x86/hvm/intercept.c              |   42 ++++++++++++++---------------
 xen/arch/x86/hvm/vioapic.c                |    8 +++---
 xen/arch/x86/hvm/vlapic.c                 |    8 +++---
 xen/arch/x86/hvm/vmsi.c                   |    8 +++---
 xen/drivers/passthrough/amd/iommu_guest.c |    8 +++---
 xen/include/asm-x86/hvm/io.h              |   18 ++++++-------
 9 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/tools/tests/vhpet/emul.h b/tools/tests/vhpet/emul.h
index 09e4611..383acff 100644
--- a/tools/tests/vhpet/emul.h
+++ b/tools/tests/vhpet/emul.h
@@ -237,11 +237,11 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
 
-struct hvm_mmio_handler
+struct hvm_mmio_ops
 {
-    hvm_mmio_check_t check_handler;
-    hvm_mmio_read_t read_handler;
-    hvm_mmio_write_t write_handler;
+    hvm_mmio_check_t check;
+    hvm_mmio_read_t  read;
+    hvm_mmio_write_t write;
 };
 
 /* Marshalling and unmarshalling uses a buffer with size and cursor. */
diff --git a/tools/tests/vhpet/main.c b/tools/tests/vhpet/main.c
index fbd7510..6fe65ea 100644
--- a/tools/tests/vhpet/main.c
+++ b/tools/tests/vhpet/main.c
@@ -70,7 +70,7 @@ static int skip_error_on_load;
 
 static char *global_thousep;
 
-extern const struct hvm_mmio_handler hpet_mmio_handler;
+extern const struct hvm_mmio_ops hpet_mmio_ops;
 
 struct domain dom1;
 struct vcpu vcpu0;
@@ -297,13 +297,13 @@ void udelay(int w)
 unsigned int hpet_readl(unsigned long a)
 {
     unsigned long ret = 0;
-    hpet_mmio_handler.read_handler(current, a, 4, &ret);
+    hpet_mmio_ops.read(current, a, 4, &ret);
     return ret;
 }
 
 void hpet_writel(unsigned long d, unsigned long a)
 {
-    hpet_mmio_handler.write_handler(current, a, 4, d);
+    hpet_mmio_ops.write(current, a, 4, d);
     return;
 }
 
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index d898169..9585ca8 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -504,10 +504,10 @@ static int hpet_range(struct vcpu *v, unsigned long addr)
              (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 }
 
-const struct hvm_mmio_handler hpet_mmio_handler = {
-    .check_handler = hpet_range,
-    .read_handler  = hpet_read,
-    .write_handler = hpet_write
+const struct hvm_mmio_ops hpet_mmio_ops = {
+    .check = hpet_range,
+    .read  = hpet_read,
+    .write = hpet_write
 };
 
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index d52a48c..dc39b1b 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,20 +32,20 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_handler *const
+static const struct hvm_mmio_ops *const
 hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
 {
-    &hpet_mmio_handler,
-    &vlapic_mmio_handler,
-    &vioapic_mmio_handler,
-    &msixtbl_mmio_handler,
-    &iommu_mmio_handler
+    &hpet_mmio_ops,
+    &vlapic_mmio_ops,
+    &vioapic_mmio_ops,
+    &msixtbl_mmio_ops,
+    &iommu_mmio_ops
 };
 
 static int hvm_mmio_access(struct vcpu *v,
                            ioreq_t *p,
-                           hvm_mmio_read_t read_handler,
-                           hvm_mmio_write_t write_handler)
+                           hvm_mmio_read_t read,
+                           hvm_mmio_write_t write)
 {
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     unsigned long data;
@@ -64,11 +64,11 @@ static int hvm_mmio_access(struct vcpu *v,
                 vio->mmio_retrying = 0;
             }
             else
-                rc = read_handler(v, p->addr, p->size, &data);
+                rc = read(v, p->addr, p->size, &data);
             p->data = data;
         }
         else /* p->dir == IOREQ_WRITE */
-            rc = write_handler(v, p->addr, p->size, p->data);
+            rc = write(v, p->addr, p->size, p->data);
         return rc;
     }
 
@@ -86,7 +86,7 @@ static int hvm_mmio_access(struct vcpu *v,
             }
             else
             {
-                rc = read_handler(v, p->addr + step * i, p->size, &data);
+                rc = read(v, p->addr + step * i, p->size, &data);
                 if ( rc != X86EMUL_OKAY )
                     break;
             }
@@ -145,7 +145,7 @@ static int hvm_mmio_access(struct vcpu *v,
             }
             if ( rc != X86EMUL_OKAY )
                 break;
-            rc = write_handler(v, p->addr + step * i, p->size, data);
+            rc = write(v, p->addr + step * i, p->size, data);
             if ( rc != X86EMUL_OKAY )
                 break;
         }
@@ -169,7 +169,7 @@ bool_t hvm_mmio_internal(paddr_t gpa)
     unsigned int i;
 
     for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
-        if ( hvm_mmio_handlers[i]->check_handler(curr, gpa) )
+        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
             return 1;
 
     return 0;
@@ -182,21 +182,21 @@ int hvm_mmio_intercept(ioreq_t *p)
 
     for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
     {
-        hvm_mmio_check_t check_handler =
-            hvm_mmio_handlers[i]->check_handler;
+        hvm_mmio_check_t check =
+            hvm_mmio_handlers[i]->check;
 
-        if ( check_handler(v, p->addr) )
+        if ( check(v, p->addr) )
         {
             if ( unlikely(p->count > 1) &&
-                 !check_handler(v, unlikely(p->df)
-                                   ? p->addr - (p->count - 1L) * p->size
-                                   : p->addr + (p->count - 1L) * p->size) )
+                 !check(v, unlikely(p->df)
+                        ? p->addr - (p->count - 1L) * p->size
+                        : p->addr + (p->count - 1L) * p->size) )
                 p->count = 1;
 
             return hvm_mmio_access(
                 v, p,
-                hvm_mmio_handlers[i]->read_handler,
-                hvm_mmio_handlers[i]->write_handler);
+                hvm_mmio_handlers[i]->read,
+                hvm_mmio_handlers[i]->write);
         }
     }
 
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 1e48110..e4ab336 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -250,10 +250,10 @@ static int vioapic_range(struct vcpu *v, unsigned long addr)
              (addr < vioapic->base_address + VIOAPIC_MEM_LENGTH)));
 }
 
-const struct hvm_mmio_handler vioapic_mmio_handler = {
-    .check_handler = vioapic_range,
-    .read_handler = vioapic_read,
-    .write_handler = vioapic_write
+const struct hvm_mmio_ops vioapic_mmio_ops = {
+    .check = vioapic_range,
+    .read = vioapic_read,
+    .write = vioapic_write
 };
 
 static void ioapic_inj_irq(
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 92b0fa8..56171d6 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -997,10 +997,10 @@ static int vlapic_range(struct vcpu *v, unsigned long addr)
            (offset < PAGE_SIZE);
 }
 
-const struct hvm_mmio_handler vlapic_mmio_handler = {
-    .check_handler = vlapic_range,
-    .read_handler = vlapic_read,
-    .write_handler = vlapic_write
+const struct hvm_mmio_ops vlapic_mmio_ops = {
+    .check = vlapic_range,
+    .read = vlapic_read,
+    .write = vlapic_write
 };
 
 static void set_x2apic_id(struct vlapic *vlapic)
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index cb87909..a1b7165 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -374,10 +374,10 @@ static int msixtbl_range(struct vcpu *v, unsigned long addr)
     return !!virt;
 }
 
-const struct hvm_mmio_handler msixtbl_mmio_handler = {
-    .check_handler = msixtbl_range,
-    .read_handler = msixtbl_read,
-    .write_handler = msixtbl_write
+const struct hvm_mmio_ops msixtbl_mmio_ops = {
+    .check = msixtbl_range,
+    .read = msixtbl_read,
+    .write = msixtbl_write
 };
 
 static void add_msixtbl_entry(struct domain *d,
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index 98e7b38..7b0c102 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -919,8 +919,8 @@ static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
            addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
 }
 
-const struct hvm_mmio_handler iommu_mmio_handler = {
-    .check_handler = guest_iommu_mmio_range,
-    .read_handler = guest_iommu_mmio_read,
-    .write_handler = guest_iommu_mmio_write
+const struct hvm_mmio_ops iommu_mmio_ops = {
+    .check = guest_iommu_mmio_range,
+    .read = guest_iommu_mmio_read,
+    .write = guest_iommu_mmio_write
 };
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 886a9d6..f2aaec5 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -59,17 +59,17 @@ struct hvm_io_handler {
     struct  io_handler hdl_list[MAX_IO_HANDLER];
 };
 
-struct hvm_mmio_handler {
-    hvm_mmio_check_t check_handler;
-    hvm_mmio_read_t read_handler;
-    hvm_mmio_write_t write_handler;
+struct hvm_mmio_ops {
+    hvm_mmio_check_t check;
+    hvm_mmio_read_t  read;
+    hvm_mmio_write_t write;
 };
 
-extern const struct hvm_mmio_handler hpet_mmio_handler;
-extern const struct hvm_mmio_handler vlapic_mmio_handler;
-extern const struct hvm_mmio_handler vioapic_mmio_handler;
-extern const struct hvm_mmio_handler msixtbl_mmio_handler;
-extern const struct hvm_mmio_handler iommu_mmio_handler;
+extern const struct hvm_mmio_ops hpet_mmio_ops;
+extern const struct hvm_mmio_ops vlapic_mmio_ops;
+extern const struct hvm_mmio_ops vioapic_mmio_ops;
+extern const struct hvm_mmio_ops msixtbl_mmio_ops;
+extern const struct hvm_mmio_ops iommu_mmio_ops;
 
 #define HVM_MMIO_HANDLER_NR 5
 
-- 
1.7.10.4


* [PATCH 03/17] x86/hvm: unify internal portio and mmio intercepts
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
  2015-06-08 14:33 ` [PATCH 01/17] x86/hvm: simplify hvmemul_do_io() Paul Durrant
  2015-06-08 14:33 ` [PATCH 02/17] x86/hvm: re-name struct hvm_mmio_handler to hvm_mmio_ops Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 04/17] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The implementations of the mmio and portio intercepts are unnecessarily
different, which leads to much code duplication. This patch unifies much
of the intercept handling, leaving distinct handlers only for stdvga
mmio and dpci portio. Subsequent patches will unify those handlers too.
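For illustration, the unified dispatch has roughly the following shape;
this compilable sketch uses invented names (io_handler, io_ops,
find_io_handler) and omits the RCU-protected list, the write op and the
ioreq_t handling visible in the diff below:

#include <stddef.h>
#include <stdio.h>

#define IOREQ_TYPE_PIO  0
#define IOREQ_TYPE_COPY 1

struct io_handler;

/* One generic ops structure serves both portio and mmio handlers. */
struct io_ops {
    int (*accept)(const struct io_handler *h, unsigned long addr,
                  unsigned int size);
    int (*read)(const struct io_handler *h, unsigned long addr,
                unsigned int size, unsigned long *data);
};

struct io_handler {
    int type;                 /* IOREQ_TYPE_PIO or IOREQ_TYPE_COPY */
    unsigned long start, end; /* accepted range */
    const struct io_ops *ops;
};

static int range_accept(const struct io_handler *h, unsigned long addr,
                        unsigned int size)
{
    return addr >= h->start && addr + size <= h->end;
}

static int dummy_read(const struct io_handler *h, unsigned long addr,
                      unsigned int size, unsigned long *data)
{
    *data = 0xe9;
    return 0;
}

static const struct io_ops dummy_ops = {
    .accept = range_accept,
    .read   = dummy_read
};

/* Both portio and mmio intercepts now go through the same loop. */
static const struct io_handler *
find_io_handler(const struct io_handler *handlers, size_t n,
                int type, unsigned long addr, unsigned int size)
{
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        const struct io_handler *h = &handlers[i];

        if ( h->type == type && h->ops->accept(h, addr, size) )
            return h;
    }

    return NULL;
}

int main(void)
{
    const struct io_handler handlers[] = {
        { IOREQ_TYPE_PIO, 0xe9, 0xea, &dummy_ops },
    };
    const struct io_handler *h =
        find_io_handler(handlers, 1, IOREQ_TYPE_PIO, 0xe9, 1);
    unsigned long data = 0;

    if ( h != NULL )
        h->ops->read(h, 0xe9, 1, &data);

    printf("data=%#lx\n", data);
    return 0;
}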

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c                |   11 +-
 xen/arch/x86/hvm/hpet.c                   |    2 +
 xen/arch/x86/hvm/hvm.c                    |   42 ++-
 xen/arch/x86/hvm/intercept.c              |  581 ++++++++++++++++-------------
 xen/arch/x86/hvm/stdvga.c                 |   30 +-
 xen/arch/x86/hvm/vioapic.c                |    2 +
 xen/arch/x86/hvm/vlapic.c                 |    3 +
 xen/arch/x86/hvm/vmsi.c                   |    5 +
 xen/drivers/passthrough/amd/iommu_guest.c |   30 +-
 xen/include/asm-x86/hvm/domain.h          |    8 +-
 xen/include/asm-x86/hvm/io.h              |  135 +++----
 11 files changed, 477 insertions(+), 372 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fa70ec0..da5dfcc 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -151,16 +151,7 @@ static int hvmemul_do_io(
         hvmtrace_io_assist(is_mmio, &p);
     }
 
-    if ( is_mmio )
-    {
-        rc = hvm_mmio_intercept(&p);
-        if ( rc == X86EMUL_UNHANDLEABLE )
-            rc = hvm_buffered_io_intercept(&p);
-    }
-    else
-    {
-        rc = hvm_portio_intercept(&p);
-    }
+    rc = hvm_io_intercept(&p);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 9585ca8..24b5ec9 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -659,6 +659,8 @@ void hpet_init(struct domain *d)
         h->hpet.comparator64[i] = ~0ULL;
         h->pt[i].source = PTSRC_isa;
     }
+
+    register_mmio_handler(d, &hpet_mmio_ops);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b1023bb..f4b57a4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1452,18 +1452,30 @@ int hvm_domain_initialise(struct domain *d)
     INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
     spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
 
+    INIT_LIST_HEAD(&d->arch.hvm_domain.io_handler.list);
+    spin_lock_init(&d->arch.hvm_domain.io_handler.lock);
+
     hvm_init_cacheattr_region_list(d);
 
     rc = paging_enable(d, PG_refcounts|PG_translate|PG_external);
     if ( rc != 0 )
         goto fail0;
 
-    d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
-    d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
     rc = -ENOMEM;
-    if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
+
+    d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
+    if ( !d->arch.hvm_domain.params )
         goto fail1;
-    d->arch.hvm_domain.io_handler->num_slot = 0;
+
+    d->arch.hvm_domain.mmio_handler = xzalloc_array(struct hvm_mmio_handler,
+                                                    NR_MMIO_HANDLERS);
+    if ( !d->arch.hvm_domain.mmio_handler )
+        goto fail2;
+
+    d->arch.hvm_domain.portio_handler = xzalloc_array(struct hvm_portio_handler,
+                                                      NR_PORTIO_HANDLERS);
+    if ( !d->arch.hvm_domain.portio_handler )
+        goto fail3;
 
     /* Set the default IO Bitmap. */
     if ( is_hardware_domain(d) )
@@ -1472,7 +1484,7 @@ int hvm_domain_initialise(struct domain *d)
         if ( d->arch.hvm_domain.io_bitmap == NULL )
         {
             rc = -ENOMEM;
-            goto fail1;
+            goto fail4;
         }
         memset(d->arch.hvm_domain.io_bitmap, ~0, HVM_IOBITMAP_SIZE);
     }
@@ -1494,30 +1506,37 @@ int hvm_domain_initialise(struct domain *d)
 
     rc = vioapic_init(d);
     if ( rc != 0 )
-        goto fail1;
+        goto fail5;
 
     stdvga_init(d);
 
     rtc_init(d);
 
+    msixtbl_init(d);
+
     register_portio_handler(d, 0xe9, 1, hvm_print_line);
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
     rc = hvm_funcs.domain_initialise(d);
     if ( rc != 0 )
-        goto fail2;
+        goto fail6;
 
     return 0;
 
- fail2:
+ fail6:
     rtc_deinit(d);
     stdvga_deinit(d);
     vioapic_deinit(d);
- fail1:
+ fail5:
     if ( is_hardware_domain(d) )
         xfree(d->arch.hvm_domain.io_bitmap);
-    xfree(d->arch.hvm_domain.io_handler);
+ fail4:
+    xfree(d->arch.hvm_domain.portio_handler);
+ fail3:
+    xfree(d->arch.hvm_domain.mmio_handler);
+ fail2:
     xfree(d->arch.hvm_domain.params);
+ fail1:
  fail0:
     hvm_destroy_cacheattr_region_list(d);
     return rc;
@@ -1546,7 +1565,8 @@ void hvm_domain_relinquish_resources(struct domain *d)
 
 void hvm_domain_destroy(struct domain *d)
 {
-    xfree(d->arch.hvm_domain.io_handler);
+    xfree(d->arch.hvm_domain.portio_handler);
+    xfree(d->arch.hvm_domain.mmio_handler);
     xfree(d->arch.hvm_domain.params);
 
     hvm_destroy_cacheattr_region_list(d);
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index dc39b1b..c03502c 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,45 +32,134 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_ops *const
-hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
+static bool_t hvm_mmio_accept(struct hvm_io_handler *io_handler,
+                              struct vcpu *v,
+                              uint64_t addr,
+                              uint32_t size)
 {
-    &hpet_mmio_ops,
-    &vlapic_mmio_ops,
-    &vioapic_mmio_ops,
-    &msixtbl_mmio_ops,
-    &iommu_mmio_ops
+    struct hvm_mmio_handler *mmio_handler;
+    const struct hvm_mmio_ops *ops;
+
+    mmio_handler = container_of(io_handler,
+                                struct hvm_mmio_handler,
+                                io_handler);
+    ops = mmio_handler->ops;
+
+    BUG_ON(ops == NULL);
+
+    return ops->check(v, addr);
+}
+
+static int hvm_mmio_read(struct hvm_io_handler *io_handler,
+                         struct vcpu *v,
+                         uint64_t addr,
+                         uint32_t size,
+                         uint64_t *data)
+{
+    struct hvm_mmio_handler *mmio_handler;
+    const struct hvm_mmio_ops *ops;
+
+    mmio_handler = container_of(io_handler,
+                                struct hvm_mmio_handler,
+                                io_handler);
+    ops = mmio_handler->ops;
+
+    BUG_ON(ops == NULL);
+
+    return ops->read(v, addr, size, data);
+}
+
+static int hvm_mmio_write(struct hvm_io_handler *io_handler,
+                          struct vcpu *v,
+                          uint64_t addr,
+                          uint32_t size,
+                          uint64_t data)
+{
+    struct hvm_mmio_handler *mmio_handler;
+    const struct hvm_mmio_ops *ops;
+
+    mmio_handler = container_of(io_handler,
+                                struct hvm_mmio_handler,
+                                io_handler);
+    ops = mmio_handler->ops;
+
+    BUG_ON(ops == NULL);
+
+    return ops->write(v, addr, size, data);
+}
+
+static const struct hvm_io_ops mmio_ops = {
+    .accept = hvm_mmio_accept,
+    .read = hvm_mmio_read,
+    .write = hvm_mmio_write
+};
+
+static bool_t hvm_portio_accept(struct hvm_io_handler *io_handler,
+                                struct vcpu *v,
+                                uint64_t addr,
+                                uint32_t size)
+{
+    struct hvm_portio_handler *portio_handler;
+
+    portio_handler = container_of(io_handler,
+                                  struct hvm_portio_handler,
+                                  io_handler);
+
+    return (addr >= portio_handler->start) &&
+           ((addr + size) <= portio_handler->end);
+}
+
+static int hvm_portio_read(struct hvm_io_handler *io_handler,
+                           struct vcpu *v,
+                           uint64_t addr,
+                           uint32_t size,
+                           uint64_t *data)
+{
+    struct hvm_portio_handler *portio_handler;
+    uint32_t val;
+    int rc;
+
+    portio_handler = container_of(io_handler,
+                                  struct hvm_portio_handler,
+                                  io_handler);
+
+    rc = portio_handler->action(IOREQ_READ, addr, size, &val);
+    if ( rc == X86EMUL_OKAY )
+        *data = val;
+
+    return rc;
+}
+
+static int hvm_portio_write(struct hvm_io_handler *io_handler,
+                            struct vcpu *v,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t data)
+{
+    struct hvm_portio_handler *portio_handler;
+    uint32_t val = data;
+
+    portio_handler = container_of(io_handler,
+                                  struct hvm_portio_handler,
+                                  io_handler);
+
+    return portio_handler->action(IOREQ_WRITE, addr, size, &val);
+}
+
+static const struct hvm_io_ops portio_ops = {
+    .accept = hvm_portio_accept,
+    .read = hvm_portio_read,
+    .write = hvm_portio_write
 };
 
-static int hvm_mmio_access(struct vcpu *v,
-                           ioreq_t *p,
-                           hvm_mmio_read_t read,
-                           hvm_mmio_write_t write)
+static int process_io_intercept(struct vcpu *v, ioreq_t *p,
+                                struct hvm_io_handler *handler)
 {
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
-    unsigned long data;
+    const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-
-    if ( !p->data_is_ptr )
-    {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = read(v, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else /* p->dir == IOREQ_WRITE */
-            rc = write(v, p->addr, p->size, p->data);
-        return rc;
-    }
+    uint64_t data;
+    uint64_t addr;
 
     if ( p->dir == IOREQ_READ )
     {
@@ -86,31 +175,40 @@ static int hvm_mmio_access(struct vcpu *v,
             }
             else
             {
-                rc = read(v, p->addr + step * i, p->size, &data);
+                addr = (p->type == IOREQ_TYPE_COPY) ?
+                    p->addr + step * i :
+                    p->addr;
+                rc = ops->read(handler, v, addr, p->size, &data);
                 if ( rc != X86EMUL_OKAY )
                     break;
             }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
+
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_to_guest_phys(p->data + step * i,
+                                                &data, p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    /* Drop the write as real hardware would. */
+                    continue;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT(0);
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY)
+                    break;
             }
-            if ( rc != X86EMUL_OKAY)
-                break;
+            else
+                p->data = data;
         }
 
         if ( rc == X86EMUL_RETRY )
@@ -120,32 +218,41 @@ static int hvm_mmio_access(struct vcpu *v,
             memcpy(vio->mmio_large_read, &data, p->size);
         }
     }
-    else
+    else /* p->dir == IOREQ_WRITE */
     {
         for ( i = 0; i < p->count; i++ )
         {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
+            if ( p->data_is_ptr )
             {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
+                                                  p->size) )
+                {
+                case HVMCOPY_okay:
+                    break;
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
+                    rc = X86EMUL_RETRY;
+                    break;
+                case HVMCOPY_bad_gfn_to_mfn:
+                    data = ~0;
+                    break;
+                case HVMCOPY_bad_gva_to_gfn:
+                    ASSERT(0);
+                    /* fall through */
+                default:
+                    rc = X86EMUL_UNHANDLEABLE;
+                    break;
+                }
+                if ( rc != X86EMUL_OKAY )
+                    break;
             }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = write(v, p->addr + step * i, p->size, data);
+            else
+                data = p->data;
+
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                p->addr + step * i :
+                p->addr;
+            rc = ops->write(handler, v, addr, p->size, data);
             if ( rc != X86EMUL_OKAY )
                 break;
         }
@@ -163,240 +270,182 @@ static int hvm_mmio_access(struct vcpu *v,
     return rc;
 }
 
-bool_t hvm_mmio_internal(paddr_t gpa)
-{
-    struct vcpu *curr = current;
-    unsigned int i;
-
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
-        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
-            return 1;
-
-    return 0;
-}
+static DEFINE_RCU_READ_LOCK(intercept_rcu_lock);
 
-int hvm_mmio_intercept(ioreq_t *p)
+struct hvm_io_handler *hvm_find_io_handler(struct vcpu *v,
+                                           ioreq_t *p)
 {
-    struct vcpu *v = current;
-    int i;
-
-    for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
-    {
-        hvm_mmio_check_t check =
-            hvm_mmio_handlers[i]->check;
-
-        if ( check(v, p->addr) )
-        {
-            if ( unlikely(p->count > 1) &&
-                 !check(v, unlikely(p->df)
-                        ? p->addr - (p->count - 1L) * p->size
-                        : p->addr + (p->count - 1L) * p->size) )
-                p->count = 1;
-
-            return hvm_mmio_access(
-                v, p,
-                hvm_mmio_handlers[i]->read,
-                hvm_mmio_handlers[i]->write);
-        }
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
+    struct domain *d = v->domain;
+    struct hvm_io_handler *handler;
 
-static int process_portio_intercept(portio_action_t action, ioreq_t *p)
-{
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
+    rcu_read_lock(&intercept_rcu_lock);
 
-    if ( !p->data_is_ptr )
+    list_for_each_entry( handler,
+                         &d->arch.hvm_domain.io_handler.list,
+                         list_entry )
     {
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
-            p->data = data;
-        }
-        else
-        {
-            data = p->data;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
-        }
-        return rc;
-    }
+        const struct hvm_io_ops *ops = handler->ops;
+        uint64_t start, end;
 
-    if ( p->dir == IOREQ_READ )
-    {
-        for ( i = 0; i < p->count; i++ )
+        if ( handler->type != p->type )
+            continue;
+
+        switch ( handler->type )
         {
-            if ( vio->mmio_retrying )
+        case IOREQ_TYPE_PIO:
+            start = p->addr;
+            end = p->addr + p->size;
+            break;
+        case IOREQ_TYPE_COPY:
+            if ( p->df )
             {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
+                start = (p->addr - (p->count - 1L) * p->size);
+                end = p->addr + p->size;
             }
             else
             {
-                rc = action(IOREQ_READ, p->addr, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
+                start = p->addr;
+                end = p->addr + p->count * p->size;
             }
-            if ( rc != X86EMUL_OKAY)
-                break;
+            break;
+        default:
+            BUG();
         }
 
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
+        if ( ops->accept(handler, v, start, end - start) )
+            goto done;
     }
-    else /* p->dir == IOREQ_WRITE */
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            data = 0;
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY )
-                break;
-            rc = action(IOREQ_WRITE, p->addr, p->size, &data);
-            if ( rc != X86EMUL_OKAY )
-                break;
-        }
 
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
-    }
+    handler = NULL;
 
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
-    }
+done:
+    rcu_read_unlock(&intercept_rcu_lock);
 
-    return rc;
+    return handler;
 }
 
-/*
- * Check if the request is handled inside xen
- * return value: 0 --not handled; 1 --handled
- */
-int hvm_io_intercept(ioreq_t *p, int type)
+int hvm_io_intercept(ioreq_t *p)
 {
     struct vcpu *v = current;
-    struct hvm_io_handler *handler = v->domain->arch.hvm_domain.io_handler;
-    int i;
-    unsigned long addr, size;
+    struct hvm_io_handler *handler;
 
-    if ( type == HVM_PORTIO )
+    if ( p->type == IOREQ_TYPE_PIO )
     {
         int rc = dpci_ioport_intercept(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
             return rc;
     }
+    else if ( p->type == IOREQ_TYPE_COPY )
+    {
+        int rc = stdvga_intercept_mmio(p);
+        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
+            return rc;
+    }
 
-    for ( i = 0; i < handler->num_slot; i++ )
+    handler = hvm_find_io_handler(v, p);
+
+    if ( handler == NULL )
+        return X86EMUL_UNHANDLEABLE;
+
+    return process_io_intercept(v, p, handler);
+}
+
+void hvm_register_io_handler(struct domain *d,
+                             struct hvm_io_handler *handler,
+                             bool_t head)
+{
+    spin_lock(&d->arch.hvm_domain.io_handler.lock);
+    if ( head )
+        list_add_rcu(&handler->list_entry,
+                     &d->arch.hvm_domain.io_handler.list);
+    else
+        list_add_tail_rcu(&handler->list_entry,
+                          &d->arch.hvm_domain.io_handler.list);
+    spin_unlock(&d->arch.hvm_domain.io_handler.lock);
+}
+
+void register_mmio_handler(struct domain *d, const struct hvm_mmio_ops *ops)
+{
+    struct hvm_mmio_handler *mmio_handler;
+    unsigned int i;
+
+    for ( i = 0; i < NR_MMIO_HANDLERS; i++ )
     {
-        if ( type != handler->hdl_list[i].type )
-            continue;
-        addr = handler->hdl_list[i].addr;
-        size = handler->hdl_list[i].size;
-        if ( (p->addr >= addr) &&
-             ((p->addr + p->size) <= (addr + size)) )
-        {
-            if ( type == HVM_PORTIO )
-                return process_portio_intercept(
-                    handler->hdl_list[i].action.portio, p);
+        mmio_handler = &d->arch.hvm_domain.mmio_handler[i];
 
-            if ( unlikely(p->count > 1) &&
-                 (unlikely(p->df)
-                  ? p->addr - (p->count - 1L) * p->size < addr
-                  : p->addr + p->count * 1L * p->size - 1 >= addr + size) )
-                p->count = 1;
+        if ( mmio_handler->io_handler.ops == NULL )
+            break;
+    }
 
-            return handler->hdl_list[i].action.mmio(p);
-        }
+    BUG_ON(i == NR_MMIO_HANDLERS);
+
+    mmio_handler->io_handler.type = IOREQ_TYPE_COPY;
+    mmio_handler->io_handler.ops = &mmio_ops;
+
+    hvm_register_io_handler(d, &mmio_handler->io_handler, 0);
+
+    mmio_handler->ops = ops;
+}
+
+void register_portio_handler(struct domain *d, uint64_t addr,
+                             uint32_t size, portio_action_t action)
+{
+    struct hvm_portio_handler *portio_handler;
+    unsigned int i;
+
+    for ( i = 0; i < NR_PORTIO_HANDLERS; i++ )
+    {
+        portio_handler = &d->arch.hvm_domain.portio_handler[i];
+
+        if ( portio_handler->io_handler.ops == NULL )
+            break;
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    BUG_ON(i == NR_PORTIO_HANDLERS);
+
+    portio_handler->io_handler.type = IOREQ_TYPE_PIO;
+    portio_handler->io_handler.ops = &portio_ops;
+
+    hvm_register_io_handler(d, &portio_handler->io_handler, 0);
+
+    portio_handler->start = addr;
+    portio_handler->end = addr + size;
+    portio_handler->action = action;
 }
 
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
-    void *action, int type)
+void relocate_portio_handler(struct domain *d, uint64_t old_addr,
+                             uint64_t new_addr, uint32_t size)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int num = handler->num_slot;
+    struct hvm_portio_handler *portio_handler;
+    unsigned int i;
+
+    for ( i = 0; i < NR_PORTIO_HANDLERS; i++ )
+    {
+        portio_handler = &d->arch.hvm_domain.portio_handler[i];
+
+        if ( portio_handler->io_handler.ops == NULL )
+            break;
+
+        if ( portio_handler->start == old_addr &&
+             portio_handler->end == old_addr + size )
+            goto found;
+    }
 
-    BUG_ON(num >= MAX_IO_HANDLER);
+    BUG();
 
-    handler->hdl_list[num].addr = addr;
-    handler->hdl_list[num].size = size;
-    handler->hdl_list[num].action.ptr = action;
-    handler->hdl_list[num].type = type;
-    handler->num_slot++;
+ found:
+    portio_handler->start = new_addr;
+    portio_handler->end = new_addr + size;
 }
 
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type)
+bool_t hvm_mmio_internal(paddr_t gpa)
 {
-    struct hvm_io_handler *handler = d->arch.hvm_domain.io_handler;
-    int i;
-
-    for ( i = 0; i < handler->num_slot; i++ )
-        if ( (handler->hdl_list[i].addr == old_addr) &&
-             (handler->hdl_list[i].size == size) &&
-             (handler->hdl_list[i].type == type) )
-            handler->hdl_list[i].addr = new_addr;
+    ioreq_t p = {
+        .type = IOREQ_TYPE_COPY,
+        .addr = gpa
+    };
+
+    return (hvm_find_io_handler(current, &p) != NULL);
 }
 
 /*
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 13d1029..5730d35 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -547,12 +547,27 @@ static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
     return 1;
 }
 
-static int stdvga_intercept_mmio(ioreq_t *p)
+int stdvga_intercept_mmio(ioreq_t *p)
 {
     struct domain *d = current->domain;
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
+    uint64_t start, end;
     int buf = 0, rc;
 
+    if ( p->df )
+    {
+        start = (p->addr - (p->count - 1L) * p->size);
+        end = p->addr + p->size;
+    }
+    else
+    {
+        start = p->addr;
+        end = p->addr + p->count * p->size;
+    }
+
+    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
+        return X86EMUL_UNHANDLEABLE;
+
     if ( p->size > 8 )
     {
         gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
@@ -619,9 +634,6 @@ void stdvga_init(struct domain *d)
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
-        /* MMIO. */
-        register_buffered_io_handler(
-            d, VGA_MEM_BASE, VGA_MEM_SIZE, stdvga_intercept_mmio);
     }
 }
 
@@ -638,3 +650,13 @@ void stdvga_deinit(struct domain *d)
         s->vram_page[i] = NULL;
     }
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index e4ab336..df76019 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -456,6 +456,8 @@ int vioapic_init(struct domain *d)
     d->arch.hvm_domain.vioapic->domain = d;
     vioapic_reset(d);
 
+    register_mmio_handler(d, &vioapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 56171d6..6916e14 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1463,6 +1463,9 @@ int vlapic_init(struct vcpu *v)
                  vlapic_init_sipi_action,
                  (unsigned long)v);
 
+    if ( v->vcpu_id == 0 )
+        register_mmio_handler(v->domain, &vlapic_mmio_ops);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index a1b7165..9b192a0 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -511,6 +511,11 @@ found:
     spin_unlock_irq(&irq_desc->lock);
 }
 
+void msixtbl_init(struct domain *d)
+{
+    register_mmio_handler(d, &msixtbl_mmio_ops);
+}
+
 void msixtbl_pt_cleanup(struct domain *d)
 {
     struct msixtbl_entry *entry, *temp;
diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index 7b0c102..1ac85bd 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -868,6 +868,20 @@ static void guest_iommu_reg_init(struct guest_iommu *iommu)
     iommu->reg_ext_feature.hi = upper;
 }
 
+static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
+{
+    struct guest_iommu *iommu = vcpu_iommu(v);
+
+    return iommu && addr >= iommu->mmio_base &&
+           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
+}
+
+const struct hvm_mmio_ops iommu_mmio_ops = {
+    .check = guest_iommu_mmio_range,
+    .read = guest_iommu_mmio_read,
+    .write = guest_iommu_mmio_write
+};
+
 /* Domain specific initialization */
 int guest_iommu_init(struct domain* d)
 {
@@ -894,6 +908,8 @@ int guest_iommu_init(struct domain* d)
 
     spin_lock_init(&iommu->lock);
 
+    register_mmio_handler(d, &iommu_mmio_ops);
+
     return 0;
 }
 
@@ -910,17 +926,3 @@ void guest_iommu_destroy(struct domain *d)
 
     domain_hvm_iommu(d)->arch.g_iommu = NULL;
 }
-
-static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr)
-{
-    struct guest_iommu *iommu = vcpu_iommu(v);
-
-    return iommu && addr >= iommu->mmio_base &&
-           addr < iommu->mmio_base + IOMMU_MMIO_SIZE;
-}
-
-const struct hvm_mmio_ops iommu_mmio_ops = {
-    .check = guest_iommu_mmio_range,
-    .read = guest_iommu_mmio_read,
-    .write = guest_iommu_mmio_write
-};
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index bdab45d..d9c0cfe 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -92,7 +92,13 @@ struct hvm_domain {
 
     struct pl_time         pl_time;
 
-    struct hvm_io_handler *io_handler;
+    struct hvm_mmio_handler   *mmio_handler;
+    struct hvm_portio_handler *portio_handler;
+
+    struct {
+        spinlock_t       lock;
+        struct list_head list;
+    } io_handler;
 
     /* Lock protects access to irq, vpic and vioapic. */
     spinlock_t             irq_lock;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index f2aaec5..633a210 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -24,11 +24,36 @@
 #include <asm/hvm/vioapic.h>
 #include <public/hvm/ioreq.h>
 #include <public/event_channel.h>
+#include <xen/rcupdate.h>
+
+struct hvm_io_handler;
+
+typedef int (*hvm_io_read_t)(struct hvm_io_handler *handler,
+                             struct vcpu *v,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t *val);
+typedef int (*hvm_io_write_t)(struct hvm_io_handler *handler,
+                              struct vcpu *v,
+                              uint64_t addr,
+                              uint32_t size,
+                              uint64_t val);
+typedef bool_t (*hvm_io_accept_t)(struct hvm_io_handler *handler,
+                                  struct vcpu *v,
+                                  uint64_t addr,
+                                  uint32_t size);
+
+struct hvm_io_ops {
+    hvm_io_accept_t accept;
+    hvm_io_read_t   read;
+    hvm_io_write_t  write;
+};
 
-#define MAX_IO_HANDLER             16
-
-#define HVM_PORTIO                  0
-#define HVM_BUFFERED_IO             2
+struct hvm_io_handler {
+    struct list_head        list_entry;
+    const struct hvm_io_ops *ops;
+    uint8_t                 type;
+};
 
 typedef int (*hvm_mmio_read_t)(struct vcpu *v,
                                unsigned long addr,
@@ -40,81 +65,47 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long val);
 typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 
-typedef int (*portio_action_t)(
-    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
-typedef int (*mmio_action_t)(ioreq_t *);
-struct io_handler {
-    int                 type;
-    unsigned long       addr;
-    unsigned long       size;
-    union {
-        portio_action_t portio;
-        mmio_action_t   mmio;
-        void           *ptr;
-    } action;
-};
-
-struct hvm_io_handler {
-    int     num_slot;
-    struct  io_handler hdl_list[MAX_IO_HANDLER];
-};
-
 struct hvm_mmio_ops {
     hvm_mmio_check_t check;
     hvm_mmio_read_t  read;
     hvm_mmio_write_t write;
 };
 
-extern const struct hvm_mmio_ops hpet_mmio_ops;
-extern const struct hvm_mmio_ops vlapic_mmio_ops;
-extern const struct hvm_mmio_ops vioapic_mmio_ops;
-extern const struct hvm_mmio_ops msixtbl_mmio_ops;
-extern const struct hvm_mmio_ops iommu_mmio_ops;
+#define NR_MMIO_HANDLERS 5
 
-#define HVM_MMIO_HANDLER_NR 5
+struct hvm_mmio_handler {
+    const struct hvm_mmio_ops *ops;
+    struct hvm_io_handler     io_handler;
+};
+
+typedef int (*portio_action_t)(
+    int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+
+#define NR_PORTIO_HANDLERS 16
 
-int hvm_io_intercept(ioreq_t *p, int type);
-void register_io_handler(
-    struct domain *d, unsigned long addr, unsigned long size,
-    void *action, int type);
-void relocate_io_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size, int type);
+struct hvm_portio_handler {
+    uint64_t              start, end;
+    portio_action_t       action;
+    struct hvm_io_handler io_handler;
+};
 
-static inline int hvm_portio_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_PORTIO);
-}
+void hvm_register_io_handler(struct domain *d,
+                             struct hvm_io_handler *handler,
+                             bool_t head);
+struct hvm_io_handler *hvm_find_io_handler(struct vcpu *v,
+                                           ioreq_t *p);
 
-static inline int hvm_buffered_io_intercept(ioreq_t *p)
-{
-    return hvm_io_intercept(p, HVM_BUFFERED_IO);
-}
+int hvm_io_intercept(ioreq_t *p);
 
+void register_mmio_handler(struct domain *d,
+                           const struct hvm_mmio_ops *ops);
+void register_portio_handler(struct domain *d, uint64_t addr,
+                             uint32_t size, portio_action_t action);
+void relocate_portio_handler(struct domain *d, uint64_t old_addr,
+                             uint64_t new_addr, uint32_t size);
 bool_t hvm_mmio_internal(paddr_t gpa);
-int hvm_mmio_intercept(ioreq_t *p);
-int hvm_buffered_io_send(ioreq_t *p);
 
-static inline void register_portio_handler(
-    struct domain *d, unsigned long addr,
-    unsigned long size, portio_action_t action)
-{
-    register_io_handler(d, addr, size, action, HVM_PORTIO);
-}
-
-static inline void relocate_portio_handler(
-    struct domain *d, unsigned long old_addr, unsigned long new_addr,
-    unsigned long size)
-{
-    relocate_io_handler(d, old_addr, new_addr, size, HVM_PORTIO);
-}
-
-static inline void register_buffered_io_handler(
-    struct domain *d, unsigned long addr,
-    unsigned long size, mmio_action_t action)
-{
-    register_io_handler(d, addr, size, action, HVM_BUFFERED_IO);
-}
+int hvm_buffered_io_send(ioreq_t *p);
 
 void send_timeoffset_req(unsigned long timeoff);
 void send_invalidate_req(void);
@@ -127,6 +118,7 @@ void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
+void msixtbl_init(struct domain *d);
 
 struct hvm_hw_stdvga {
     uint8_t sr_index;
@@ -141,8 +133,19 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
+int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 #endif /* __ASM_X86_HVM_IO_H__ */
 
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 04/17] x86/hvm: unify dpci portio intercept with standard portio intercept
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (2 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 03/17] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 05/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling, thereby removing a substantial amount of
code duplication.
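
For illustration, here is the dispatch flow that dpci portio now shares
with every other intercept; this is a minimal sketch assembled from the
hvm_find_io_handler() and process_io_intercept() pieces introduced earlier
in the series, not a verbatim quote:

    /* Sketch: unified dispatch after this patch. The dpci handler is
     * registered at the head of the list, so its accept() op is
     * consulted before the fixed-range portio handlers. */
    handler = hvm_find_io_handler(v, p);  /* walks the handler list,  */
    if ( handler == NULL )                /* calling ops->accept() on */
        return X86EMUL_UNHANDLEABLE;      /* each entry in turn       */
    return process_io_intercept(v, p, handler); /* ops->read()/write() */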

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c           |    2 +
 xen/arch/x86/hvm/intercept.c     |    8 +-
 xen/arch/x86/hvm/io.c            |  232 +++++++++++++-------------------------
 xen/include/asm-x86/hvm/domain.h |    1 +
 xen/include/asm-x86/hvm/io.h     |    3 +
 xen/include/asm-x86/hvm/vcpu.h   |    2 +
 xen/include/xen/iommu.h          |    1 -
 7 files changed, 89 insertions(+), 160 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f4b57a4..1ecb528 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1517,6 +1517,8 @@ int hvm_domain_initialise(struct domain *d)
     register_portio_handler(d, 0xe9, 1, hvm_print_line);
     register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
+    register_dpci_portio_handler(d);
+
     rc = hvm_funcs.domain_initialise(d);
     if ( rc != 0 )
         goto fail6;
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index c03502c..9e20a72 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -329,13 +329,7 @@ int hvm_io_intercept(ioreq_t *p)
     struct vcpu *v = current;
     struct hvm_io_handler *handler;
 
-    if ( p->type == IOREQ_TYPE_PIO )
-    {
-        int rc = dpci_ioport_intercept(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
-    else if ( p->type == IOREQ_TYPE_COPY )
+    if ( p->type == IOREQ_TYPE_COPY )
     {
         int rc = stdvga_intercept_mmio(p);
         if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 2ba6272..367d51e 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -208,185 +208,113 @@ void hvm_io_assist(ioreq_t *p)
     }
 }
 
-static int dpci_ioport_read(uint32_t mport, ioreq_t *p)
+static bool_t dpci_portio_accept(struct hvm_io_handler *io_handler,
+                                 struct vcpu *v,
+                                 uint64_t addr,
+                                 uint32_t size)
 {
-    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data = 0;
-
-    for ( i = 0; i < p->count; i++ )
-    {
-        if ( vio->mmio_retrying )
-        {
-            if ( vio->mmio_large_read_bytes != p->size )
-                return X86EMUL_UNHANDLEABLE;
-            memcpy(&data, vio->mmio_large_read, p->size);
-            vio->mmio_large_read_bytes = 0;
-            vio->mmio_retrying = 0;
-        }
-        else switch ( p->size )
-        {
-        case 1:
-            data = inb(mport);
-            break;
-        case 2:
-            data = inw(mport);
-            break;
-        case 4:
-            data = inl(mport);
-            break;
-        default:
-            BUG();
-        }
+    struct hvm_iommu *hd = domain_hvm_iommu(v->domain);
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport;
+    uint32_t start, end;
+    uint32_t gport = addr, mport;
 
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                            &data, p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                /* Drop the write as real hardware would. */
-                continue;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-        else
-            p->data = data;
-    }
 
-    if ( rc == X86EMUL_RETRY )
+    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
     {
-        vio->mmio_retry = 1;
-        vio->mmio_large_read_bytes = p->size;
-        memcpy(vio->mmio_large_read, &data, p->size);
+        start = g2m_ioport->gport;
+        end = start + g2m_ioport->np;
+        if ( (gport >= start) && (gport < end) )
+            goto found;
     }
 
-    if ( i != 0 )
+    return 0;
+
+ found:
+    mport = (gport - start) + g2m_ioport->mport;
+
+    if ( !ioports_access_permitted(current->domain, mport,
+                                   mport + size - 1) )
     {
-        p->count = i;
-        rc = X86EMUL_OKAY;
+        gdprintk(XENLOG_ERR, "Error: access to gport=%#x denied!\n",
+                 (uint32_t)addr);
+        return 0;
     }
 
-    return rc;
+    vio->g2m_ioport = g2m_ioport;
+    return 1;
 }
 
-static int dpci_ioport_write(uint32_t mport, ioreq_t *p)
+static int dpci_portio_read(struct hvm_io_handler *io_handler,
+                            struct vcpu *v,
+                            uint64_t addr,
+                            uint32_t size,
+                            uint64_t *data)
 {
-    int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-    uint32_t data;
-
-    for ( i = 0; i < p->count; i++ )
-    {
-        data = p->data;
-        if ( p->data_is_ptr )
-        {
-            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
-                                              p->size) )
-            {
-            case HVMCOPY_okay:
-                break;
-            case HVMCOPY_gfn_paged_out:
-            case HVMCOPY_gfn_shared:
-                rc = X86EMUL_RETRY;
-                break;
-            case HVMCOPY_bad_gfn_to_mfn:
-                data = ~0;
-                break;
-            case HVMCOPY_bad_gva_to_gfn:
-                ASSERT(0);
-                /* fall through */
-            default:
-                rc = X86EMUL_UNHANDLEABLE;
-                break;
-            }
-            if ( rc != X86EMUL_OKAY)
-                break;
-        }
-
-        switch ( p->size )
-        {
-        case 1:
-            outb(data, mport);
-            break;
-        case 2:
-            outw(data, mport);
-            break;
-        case 4:
-            outl(data, mport);
-            break;
-        default:
-            BUG();
-        }
-    }
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    uint32_t mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    if ( rc == X86EMUL_RETRY )
-        current->arch.hvm_vcpu.hvm_io.mmio_retry = 1;
-
-    if ( i != 0 )
+    switch ( size )
     {
-        p->count = i;
-        rc = X86EMUL_OKAY;
+    case 1:
+        *data = inb(mport);
+        break;
+    case 2:
+        *data = inw(mport);
+        break;
+    case 4:
+        *data = inl(mport);
+        break;
+    default:
+        BUG();
     }
 
-    return rc;
+    return X86EMUL_OKAY;
 }
 
-int dpci_ioport_intercept(ioreq_t *p)
+static int dpci_portio_write(struct hvm_io_handler *io_handler,
+                             struct vcpu *v,
+                             uint64_t addr,
+                             uint32_t size,
+                             uint64_t data)
 {
-    struct domain *d = current->domain;
-    struct hvm_iommu *hd = domain_hvm_iommu(d);
-    struct g2m_ioport *g2m_ioport;
-    unsigned int mport, gport = p->addr;
-    unsigned int s = 0, e = 0;
-    int rc;
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
+    struct g2m_ioport *g2m_ioport = vio->g2m_ioport;
+    uint32_t mport = (addr - g2m_ioport->gport) + g2m_ioport->mport;
 
-    list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
+    switch ( size )
     {
-        s = g2m_ioport->gport;
-        e = s + g2m_ioport->np;
-        if ( (gport >= s) && (gport < e) )
-            goto found;
+    case 1:
+        outb(data, mport);
+        break;
+    case 2:
+        outw(data, mport);
+        break;
+    case 4:
+        outl(data, mport);
+        break;
+    default:
+        BUG();
     }
 
-    return X86EMUL_UNHANDLEABLE;
+    return X86EMUL_OKAY;
+}
 
- found:
-    mport = (gport - s) + g2m_ioport->mport;
+static const struct hvm_io_ops dpci_portio_ops = {
+    .accept = dpci_portio_accept,
+    .read = dpci_portio_read,
+    .write = dpci_portio_write
+};
 
-    if ( !ioports_access_permitted(d, mport, mport + p->size - 1) ) 
-    {
-        gdprintk(XENLOG_ERR, "Error: access to gport=%#x denied!\n",
-                 (uint32_t)p->addr);
-        return X86EMUL_UNHANDLEABLE;
-    }
+void register_dpci_portio_handler(struct domain *d)
+{
+    struct hvm_io_handler *handler = &d->arch.hvm_domain.dpci_handler;
 
-    switch ( p->dir )
-    {
-    case IOREQ_READ:
-        rc = dpci_ioport_read(mport, p);
-        break;
-    case IOREQ_WRITE:
-        rc = dpci_ioport_write(mport, p);
-        break;
-    default:
-        gdprintk(XENLOG_ERR, "Error: couldn't handle p->dir = %d", p->dir);
-        rc = X86EMUL_UNHANDLEABLE;
-    }
+    handler->type = IOREQ_TYPE_PIO;
+    handler->ops = &dpci_portio_ops;
 
-    return rc;
+    hvm_register_io_handler(d, handler, 1);
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index d9c0cfe..5d9c9f5 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -94,6 +94,7 @@ struct hvm_domain {
 
     struct hvm_mmio_handler   *mmio_handler;
     struct hvm_portio_handler *portio_handler;
+    struct hvm_io_handler     dpci_handler;
 
     struct {
         spinlock_t       lock;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 633a210..c3bab2f 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -137,6 +137,9 @@ int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
+
+void register_dpci_portio_handler(struct domain *d);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..dd08416 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -77,6 +77,8 @@ struct hvm_vcpu_io {
     bool_t mmio_retry, mmio_retrying;
 
     unsigned long msix_unmask_address;
+
+    struct g2m_ioport *g2m_ioport;
 };
 
 #define VMCX_EADDR    (~0ULL)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index b30bf41..1d00696 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -93,7 +93,6 @@ void pt_pci_init(void);
 
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
-int dpci_ioport_intercept(ioreq_t *p);
 int pt_irq_create_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 int pt_irq_destroy_bind(struct domain *, xen_domctl_bind_pt_irq_t *);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 05/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (3 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 04/17] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 06/17] x86/hvm: revert 82ed8716b "fix direct PCI port I/O emulation retry Paul Durrant
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

It's clear from the following check in hvmemul_rep_movs:

    if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
         (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
        return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_acquire_page() (used by
hvmemul_do_io_addr()) to make sure mmio <-> mmio is disallowed, and then
registers standard mmio intercept ops in stdvga_init().

With this patch all mmio and portio handled within Xen now goes through
process_io_intercept().
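
As a hedged illustration (names taken from the hunks below), the effect of
the new check is that an emulated rep op whose RAM-side address turns out
to be MMIO never reaches the intercept code at all:

    /* Sketch: hvmemul_do_io_addr() acquiring the RAM-side page. */
    rc = hvmemul_acquire_page(paddr_to_pfn(ram_addr), &ram_page);
    if ( rc != X86EMUL_OKAY )  /* X86EMUL_UNHANDLEABLE if p2m_is_mmio() */
        return rc;             /* mmio <-> mmio is left to the caller  */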

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c   |    7 ++
 xen/arch/x86/hvm/intercept.c |    7 --
 xen/arch/x86/hvm/stdvga.c    |  201 ++++++++++--------------------------------
 xen/include/asm-x86/hvm/io.h |    1 -
 4 files changed, 54 insertions(+), 162 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index da5dfcc..33e0242 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -270,6 +270,13 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
         return X86EMUL_RETRY;
     }
 
+    /* We cannot handle this case in the lower layers of I/O emulation */
+    if ( p2m_is_mmio(p2mt) )
+    {
+        put_page(*page);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 9e20a72..3c85756 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -329,13 +329,6 @@ int hvm_io_intercept(ioreq_t *p)
     struct vcpu *v = current;
     struct hvm_io_handler *handler;
 
-    if ( p->type == IOREQ_TYPE_COPY )
-    {
-        int rc = stdvga_intercept_mmio(p);
-        if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-            return rc;
-    }
-
     handler = hvm_find_io_handler(v, p);
 
     if ( handler == NULL )
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 5730d35..25561da 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -275,45 +275,45 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
     return ret;
 }
 
-static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
+static int stdvga_mem_read(struct vcpu *v, unsigned long addr,
+                           unsigned long size, unsigned long *data)
 {
-    uint64_t data = 0;
-
     switch ( size )
     {
     case 1:
-        data = stdvga_mem_readb(addr);
+        *data = stdvga_mem_readb(addr);
         break;
 
     case 2:
-        data = stdvga_mem_readb(addr);
-        data |= stdvga_mem_readb(addr + 1) << 8;
+        *data = stdvga_mem_readb(addr);
+        *data |= stdvga_mem_readb(addr + 1) << 8;
         break;
 
     case 4:
-        data = stdvga_mem_readb(addr);
-        data |= stdvga_mem_readb(addr + 1) << 8;
-        data |= stdvga_mem_readb(addr + 2) << 16;
-        data |= stdvga_mem_readb(addr + 3) << 24;
+        *data = stdvga_mem_readb(addr);
+        *data |= stdvga_mem_readb(addr + 1) << 8;
+        *data |= stdvga_mem_readb(addr + 2) << 16;
+        *data |= stdvga_mem_readb(addr + 3) << 24;
         break;
 
     case 8:
-        data =  (uint64_t)(stdvga_mem_readb(addr));
-        data |= (uint64_t)(stdvga_mem_readb(addr + 1)) << 8;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 2)) << 16;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 3)) << 24;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 4)) << 32;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 5)) << 40;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 6)) << 48;
-        data |= (uint64_t)(stdvga_mem_readb(addr + 7)) << 56;
+        *data =  (uint64_t)(stdvga_mem_readb(addr));
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 1)) << 8;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 2)) << 16;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 3)) << 24;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 4)) << 32;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 5)) << 40;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 6)) << 48;
+        *data |= (uint64_t)(stdvga_mem_readb(addr + 7)) << 56;
         break;
 
     default:
         gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
+        *data = 0;
         break;
     }
 
-    return data;
+    return X86EMUL_OKAY;
 }
 
 static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
@@ -424,8 +424,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
     }
 }
 
-static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
+static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
+                            unsigned long size, unsigned long data)
 {
+    ioreq_t p = { .type = IOREQ_TYPE_COPY,
+                  .addr = addr,
+                  .size = size,
+                  .count = 1,
+                  .dir = IOREQ_WRITE,
+                  .data = data,
+    };
+
     /* Intercept mmio write */
     switch ( size )
     {
@@ -460,153 +469,35 @@ static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
         gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
         break;
     }
-}
-
-static uint32_t read_data;
-
-static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
-{
-    int i;
-    uint64_t addr = p->addr;
-    p2m_type_t p2mt;
-    struct domain *d = current->domain;
-    int step = p->df ? -p->size : p->size;
 
-    if ( p->data_is_ptr )
-    {
-        uint64_t data = p->data, tmp;
-
-        if ( p->dir == IOREQ_READ )
-        {
-            for ( i = 0; i < p->count; i++ ) 
-            {
-                tmp = stdvga_mem_read(addr, p->size);
-                if ( hvm_copy_to_guest_phys(data, &tmp, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                            d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    /*
-                     * The only case we handle is vga_mem <-> vga_mem.
-                     * Anything else disables caching and leaves it to qemu-dm.
-                     */
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    stdvga_mem_write(data, tmp, p->size);
-                }
-                data += step;
-                addr += step;
-            }
-        }
-        else
-        {
-            for ( i = 0; i < p->count; i++ )
-            {
-                if ( hvm_copy_from_guest_phys(&tmp, data, p->size) !=
-                     HVMCOPY_okay )
-                {
-                    struct page_info *dp = get_page_from_gfn(
-                        d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-                    if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
-                         ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-                    {
-                        if ( dp )
-                            put_page(dp);
-                        return 0;
-                    }
-                    ASSERT(!dp);
-                    tmp = stdvga_mem_read(data, p->size);
-                }
-                stdvga_mem_write(addr, tmp, p->size);
-                data += step;
-                addr += step;
-            }
-        }
-    }
-    else if ( p->dir == IOREQ_WRITE )
-    {
-        for ( i = 0; i < p->count; i++ )
-        {
-            stdvga_mem_write(addr, p->data, p->size);
-            addr += step;
-        }
-    }
-    else
-    {
-        ASSERT(p->count == 1);
-        p->data = stdvga_mem_read(addr, p->size);
-    }
+    if ( !hvm_buffered_io_send(&p) )
+        return X86EMUL_UNHANDLEABLE;
 
-    read_data = p->data;
-    return 1;
+    return X86EMUL_OKAY;
 }
 
-int stdvga_intercept_mmio(ioreq_t *p)
+static int stdvga_mem_check(struct vcpu *v, unsigned long addr)
 {
-    struct domain *d = current->domain;
-    struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
-    uint64_t start, end;
-    int buf = 0, rc;
-
-    if ( p->df )
-    {
-        start = (p->addr - (p->count - 1L) * p->size);
-        end = p->addr + p->size;
-    }
-    else
-    {
-        start = p->addr;
-        end = p->addr + p->count * p->size;
-    }
-
-    if ( (start < VGA_MEM_BASE) || (end > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p->size > 8 )
-    {
-        gdprintk(XENLOG_WARNING, "invalid mmio size %d\n", (int)p->size);
-        return X86EMUL_UNHANDLEABLE;
-    }
+    struct hvm_hw_stdvga *s = &v->domain->arch.hvm_domain.stdvga;
+    int rc;
 
     spin_lock(&s->lock);
 
-    if ( s->stdvga && s->cache )
-    {
-        switch ( p->type )
-        {
-        case IOREQ_TYPE_COPY:
-            buf = mmio_move(s, p);
-            if ( !buf )
-                s->cache = 0;
-            break;
-        default:
-            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
-                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
-                     "isptr:%d dir:%d df:%d\n",
-                     p->type, (int)p->addr, (int)p->data, (int)p->size,
-                     (int)p->count, p->state,
-                     p->data_is_ptr, p->dir, p->df);
-            s->cache = 0;
-        }
-    }
-    else
-    {
-        buf = (p->dir == IOREQ_WRITE);
-    }
-
-    rc = (buf && hvm_buffered_io_send(p));
+    rc = s->stdvga && s->cache &&
+        (addr >= VGA_MEM_BASE) &&
+        (addr < (VGA_MEM_BASE + VGA_MEM_SIZE));
 
     spin_unlock(&s->lock);
 
-    return rc ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
+    return rc;
 }
 
+const struct hvm_mmio_ops stdvga_mem_ops = {
+    .check = stdvga_mem_check,
+    .read = stdvga_mem_read,
+    .write = stdvga_mem_write
+};
+
 void stdvga_init(struct domain *d)
 {
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
@@ -634,6 +525,8 @@ void stdvga_init(struct domain *d)
         register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio);
         /* Graphics registers. */
         register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio);
+        /* VGA memory */
+        register_mmio_handler(d, &stdvga_mem_ops);
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index c3bab2f..ce8aac9 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -133,7 +133,6 @@ struct hvm_hw_stdvga {
 };
 
 void stdvga_init(struct domain *d);
-int stdvga_intercept_mmio(ioreq_t *p);
 void stdvga_deinit(struct domain *d);
 
 extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/17] x86/hvm: revert 82ed8716b "fix direct PCI port I/O emulation retry...
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (4 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 05/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 07/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...and error handling"

NOTE: A straight reversion was not possible because of subsequent changes
       in the code, so this is at least partially a manual reversion.

By limiting hvmemul_do_io_addr() to reps falling within the pages on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic from
the intercept code and simplify it significantly.
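
In other words (sketch only; the real change is in the intercept.c hunk
below), the paging cases collapse into an assertion because the page
reference is already held:

    case HVMCOPY_gfn_paged_out:
    case HVMCOPY_gfn_shared:
        ASSERT(0); /* unreachable: reps are confined to referenced pages */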

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced, so the mmio_retry flag is retained for that
purpose but renamed to io_retry (since it is applicable to port I/O as well
as memory-mapped I/O).
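
A minimal sketch of the rep-clamping logic (forward, i.e. df == 0, case
only; the page-spanning path, which takes a second page reference and does
a single rep, is omitted here):

    /* Count how many reps of 'size' bytes stay within the page already
     * referenced; anything beyond that is left for a later pass. */
    for ( count = 0; count < *reps; count++ )
        if ( paddr_to_pfn(ram_addr + (count + 1) * size - 1) != ram_gmfn )
            break;               /* the next rep would cross the page */

    io_retry = (count < *reps);  /* reps were reduced: force a retry */
    *reps = count;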

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |   81 +++++++++++++++++++++++++++-------------
 xen/arch/x86/hvm/intercept.c   |   56 ++++++---------------------
 xen/include/asm-x86/hvm/vcpu.h |    2 +-
 3 files changed, 68 insertions(+), 71 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 33e0242..47da5b9 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -60,6 +60,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
+        .count = *reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -134,15 +135,6 @@ static int hvmemul_do_io(
         HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
 
-    /*
-     * When retrying a repeated string instruction, force exit to guest after
-     * completion of the retried iteration to allow handling of interrupts.
-     */
-    if ( vio->mmio_retrying )
-        *reps = 1;
-
-    p.count = *reps;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -156,17 +148,9 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-    case X86EMUL_RETRY:
-        *reps = p.count;
         p.state = STATE_IORESP_READY;
-        if ( !vio->mmio_retry )
-        {
-            hvm_io_assist(&p);
-            vio->io_state = HVMIO_none;
-        }
-        else
-            /* Defer hvm_io_assist() invocation to hvm_do_resume(). */
-            vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+        hvm_io_assist(&p);
+        vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -289,17 +273,64 @@ int hvmemul_do_io_addr(
     bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
     uint8_t dir, bool_t df, paddr_t ram_addr)
 {
-    struct page_info *ram_page;
+    struct vcpu *v = current;
+    unsigned long ram_gmfn = paddr_to_pfn(ram_addr);
+    struct page_info *ram_page[2];
+    int nr_pages = 0;
+    unsigned long count;
     int rc;
 
-    rc = hvmemul_acquire_page(paddr_to_pfn(ram_addr), &ram_page);
+    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
     if ( rc != X86EMUL_OKAY )
-        return rc;
+        goto out;
+
+    nr_pages++;
+
+    /* Determine how many reps will fit within this page */
+    for ( count = 0; count < *reps; count++ )
+    {
+        paddr_t start, end;
+
+        if ( df )
+        {
+            start = ram_addr - count * size;
+            end = ram_addr + size - 1;
+        }
+        else
+        {
+            start = ram_addr;
+            end = ram_addr + (count + 1) * size - 1;
+        }
+
+        if ( paddr_to_pfn(start) != ram_gmfn ||
+             paddr_to_pfn(end) != ram_gmfn )
+            break;
+    }
+
+    if ( count == 0 )
+    {
+        /*
+         * This access must span two pages, so grab a reference to
+         * the next page and do a single rep.
+         */
+        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+                                  &ram_page[nr_pages]);
+        if ( rc != X86EMUL_OKAY )
+            goto out;
+
+        nr_pages++;
+        count = 1;
+    }
+
+    v->arch.hvm_vcpu.hvm_io.io_retry = (count < *reps);
+    *reps = count;
 
     rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
                        (uint64_t)ram_addr);
 
-    hvmemul_release_page(ram_page);
+ out:
+    while ( --nr_pages >= 0 )
+        hvmemul_release_page(ram_page[nr_pages]);
 
     return rc;
 }
@@ -1547,8 +1578,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     }
 
     hvmemul_ctxt->exn_pending = 0;
-    vio->mmio_retrying = vio->mmio_retry;
-    vio->mmio_retry = 0;
 
     if ( cpu_has_vmx )
         hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
@@ -1559,7 +1588,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 
     rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
 
-    if ( rc == X86EMUL_OKAY && vio->mmio_retry )
+    if ( rc == X86EMUL_OKAY && vio->io_retry )
         rc = X86EMUL_RETRY;
     if ( rc != X86EMUL_RETRY )
     {
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 3c85756..d10682b 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -155,7 +155,6 @@ static const struct hvm_io_ops portio_ops = {
 static int process_io_intercept(struct vcpu *v, ioreq_t *p,
                                 struct hvm_io_handler *handler)
 {
-    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
     uint64_t data;
@@ -165,58 +164,38 @@ static int process_io_intercept(struct vcpu *v, ioreq_t *p,
     {
         for ( i = 0; i < p->count; i++ )
         {
-            if ( vio->mmio_retrying )
-            {
-                if ( vio->mmio_large_read_bytes != p->size )
-                    return X86EMUL_UNHANDLEABLE;
-                memcpy(&data, vio->mmio_large_read, p->size);
-                vio->mmio_large_read_bytes = 0;
-                vio->mmio_retrying = 0;
-            }
-            else
-            {
-                addr = (p->type == IOREQ_TYPE_COPY) ?
-                    p->addr + step * i :
-                    p->addr;
-                rc = ops->read(handler, v, addr, p->size, &data);
-                if ( rc != X86EMUL_OKAY )
-                    break;
-            }
+            addr = (p->type == IOREQ_TYPE_COPY) ?
+                p->addr + step * i :
+                p->addr;
+            rc = ops->read(handler, v, addr, p->size, &data);
+            if ( rc != X86EMUL_OKAY )
+                break;
 
             if ( p->data_is_ptr )
             {
                 switch ( hvm_copy_to_guest_phys(p->data + step * i,
-                                                &data, p->size) )
+                                            &data, p->size) )
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     /* Drop the write as real hardware would. */
                     continue;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT(0);
                     /* fall through */
                 default:
                     rc = X86EMUL_UNHANDLEABLE;
                     break;
                 }
-                if ( rc != X86EMUL_OKAY)
+                if ( rc != X86EMUL_OKAY )
                     break;
             }
             else
                 p->data = data;
         }
-
-        if ( rc == X86EMUL_RETRY )
-        {
-            vio->mmio_retry = 1;
-            vio->mmio_large_read_bytes = p->size;
-            memcpy(vio->mmio_large_read, &data, p->size);
-        }
     }
     else /* p->dir == IOREQ_WRITE */
     {
@@ -229,14 +208,12 @@ static int process_io_intercept(struct vcpu *v, ioreq_t *p,
                 {
                 case HVMCOPY_okay:
                     break;
-                case HVMCOPY_gfn_paged_out:
-                case HVMCOPY_gfn_shared:
-                    rc = X86EMUL_RETRY;
-                    break;
                 case HVMCOPY_bad_gfn_to_mfn:
                     data = ~0;
                     break;
                 case HVMCOPY_bad_gva_to_gfn:
+                case HVMCOPY_gfn_paged_out:
+                case HVMCOPY_gfn_shared:
                     ASSERT(0);
                     /* fall through */
                 default:
@@ -256,15 +233,6 @@ static int process_io_intercept(struct vcpu *v, ioreq_t *p,
             if ( rc != X86EMUL_OKAY )
                 break;
         }
-
-        if ( rc == X86EMUL_RETRY )
-            vio->mmio_retry = 1;
-    }
-
-    if ( i != 0 )
-    {
-        p->count = i;
-        rc = X86EMUL_OKAY;
     }
 
     return rc;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index dd08416..9ecb702 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -74,7 +74,7 @@ struct hvm_vcpu_io {
      * For string instruction emulation we need to be able to signal a
      * necessary retry through other than function return codes.
      */
-    bool_t mmio_retry, mmio_retrying;
+    bool_t io_retry;
 
     unsigned long msix_unmask_address;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (5 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 06/17] x86/hvm: revert 82ed8716b "fix direct PCI port I/O emulation retry Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 08/17] x86/hvm: split I/O completion handling from state model Paul Durrant
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Paul Durrant, Keir Fraser, Jan Beulich

By removing the calls to hvm_io_assist() in hvmemul_do_io() (where it is
replaced by a single assignment) and in hvm_complete_assist_req() (which is
itself replaced by a call to process_io_intercept() with a suitable set of
null ops), hvm_io_assist() can be moved into hvm.c and made static (and
hence becomes a candidate for inlining).
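
With null_handler in place, an access that has no backing device model
completes through the normal intercept path (sketch of the emulate.c hunk
below): reads return ~0 and writes are discarded, which is what
hvm_complete_assist_req() used to open-code.

    if ( !s )  /* no suitable ioreq server: just ignore the access */
        rc = process_io_intercept(curr, &p, &null_handler);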

This patch also fixes the I/O state test at the end of hvm_io_assist()
to check the correct value. Since the ioreq server patch series was
integrated, the current ioreq state is no longer an indicator of in-flight
I/O state, since an I/O scheduled by re-emulation may be targeted at a
different ioreq server.
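
A side-by-side sketch of the corrected test (both forms appear in the
hunks below):

    /* Check the vcpu's own I/O state, which re-emulation may have
     * re-armed, rather than the ioreq state, which may now belong to a
     * different ioreq server. */
    if ( vio->io_state == HVMIO_none ) /* was: p->state == STATE_IOREQ_NONE */
    {
        msix_write_completion(curr);
        vcpu_end_shutdown_deferral(curr);
    }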

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c   |   36 +++++++++++++++++---
 xen/arch/x86/hvm/hvm.c       |   74 ++++++++++++++++++++++++------------------
 xen/arch/x86/hvm/intercept.c |    4 +--
 xen/arch/x86/hvm/io.c        |   39 ----------------------
 xen/include/asm-x86/hvm/io.h |    3 +-
 5 files changed, 79 insertions(+), 77 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 47da5b9..59ebdfe 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -50,6 +50,34 @@ static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
     trace_var(event, 0/*!cycles*/, size, buffer);
 }
 
+static int null_read(struct hvm_io_handler *io_handler,
+                     struct vcpu *v,
+                     uint64_t addr,
+                     uint32_t size,
+                     uint64_t *data)
+{
+    *data = ~0ul;
+    return X86EMUL_OKAY;
+}
+
+static int null_write(struct hvm_io_handler *io_handler,
+                      struct vcpu *v,
+                      uint64_t addr,
+                      uint32_t size,
+                      uint64_t data)
+{
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops null_ops = {
+    .read = null_read,
+    .write = null_write
+};
+
+static struct hvm_io_handler null_handler = {
+    .ops = &null_ops
+};
+
 static int hvmemul_do_io(
     int is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uint64_t data)
@@ -148,8 +176,7 @@ static int hvmemul_do_io(
     switch ( rc )
     {
     case X86EMUL_OKAY:
-        p.state = STATE_IORESP_READY;
-        hvm_io_assist(&p);
+        vio->io_data = p.data;
         vio->io_state = HVMIO_none;
         break;
     case X86EMUL_UNHANDLEABLE:
@@ -160,8 +187,9 @@ static int hvmemul_do_io(
         /* If there is no suitable backing DM, just ignore accesses */
         if ( !s )
         {
-            hvm_complete_assist_req(&p);
-            rc = X86EMUL_OKAY;
+            rc = process_io_intercept(curr, &p, &null_handler);
+            if ( rc == X86EMUL_OKAY )
+                vio->io_data = p.data;
             vio->io_state = HVMIO_none;
         }
         else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 1ecb528..0317d90 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -411,6 +411,49 @@ bool_t hvm_io_pending(struct vcpu *v)
     return 0;
 }
 
+static void hvm_io_assist(ioreq_t *p)
+{
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_state io_state;
+
+    p->state = STATE_IOREQ_NONE;
+
+    io_state = vio->io_state;
+    vio->io_state = HVMIO_none;
+
+    switch ( io_state )
+    {
+    case HVMIO_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        break;
+    case HVMIO_handle_mmio_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        (void)handle_mmio();
+        break;
+    case HVMIO_handle_pio_awaiting_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)p->data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+        break;
+    default:
+        break;
+    }
+
+    /*
+     * Re-emulation may have scheduled another I/O so io_state set
+     * at the top of the function may have changed.
+     */
+    if ( vio->io_state == HVMIO_none )
+    {
+        msix_write_completion(curr);
+        vcpu_end_shutdown_deferral(curr);
+    }
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
     /* NB. Optimised for common case (p->state == STATE_IOREQ_NONE). */
@@ -2668,37 +2711,6 @@ bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
     return 0;
 }
 
-void hvm_complete_assist_req(ioreq_t *p)
-{
-    switch ( p->type )
-    {
-    case IOREQ_TYPE_PCI_CONFIG:
-        ASSERT_UNREACHABLE();
-        break;
-    case IOREQ_TYPE_COPY:
-    case IOREQ_TYPE_PIO:
-        if ( p->dir == IOREQ_READ )
-        {
-            if ( !p->data_is_ptr )
-                p->data = ~0ul;
-            else
-            {
-                int i, step = p->df ? -p->size : p->size;
-                uint32_t data = ~0;
-
-                for ( i = 0; i < p->count; i++ )
-                    hvm_copy_to_guest_phys(p->data + step * i, &data,
-                                           p->size);
-            }
-        }
-        /* FALLTHRU */
-    default:
-        p->state = STATE_IORESP_READY;
-        hvm_io_assist(p);
-        break;
-    }
-}
-
 void hvm_broadcast_assist_req(ioreq_t *p)
 {
     struct domain *d = current->domain;
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index d10682b..342083b 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -152,8 +152,8 @@ static const struct hvm_io_ops portio_ops = {
     .write = hvm_portio_write
 };
 
-static int process_io_intercept(struct vcpu *v, ioreq_t *p,
-                                struct hvm_io_handler *handler)
+int process_io_intercept(struct vcpu *v, ioreq_t *p,
+                         struct hvm_io_handler *handler)
 {
     const struct hvm_io_ops *ops = handler->ops;
     int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 367d51e..f2cb520 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -169,45 +169,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
     return 1;
 }
 
-void hvm_io_assist(ioreq_t *p)
-{
-    struct vcpu *curr = current;
-    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
-
-    p->state = STATE_IOREQ_NONE;
-
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
-    {
-    case HVMIO_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
-    default:
-        break;
-    }
-
-    if ( p->state == STATE_IOREQ_NONE )
-    {
-        msix_write_completion(curr);
-        vcpu_end_shutdown_deferral(curr);
-    }
-}
-
 static bool_t dpci_portio_accept(struct hvm_io_handler *io_handler,
                                  struct vcpu *v,
                                  uint64_t addr,
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index ce8aac9..06a9d11 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -95,6 +95,8 @@ void hvm_register_io_handler(struct domain *d,
 struct hvm_io_handler *hvm_find_io_handler(struct vcpu *v,
                                            ioreq_t *p);
 
+int process_io_intercept(struct vcpu *v, ioreq_t *p,
+                         struct hvm_io_handler *handler);
 int hvm_io_intercept(ioreq_t *p);
 
 void register_mmio_handler(struct domain *d,
@@ -114,7 +116,6 @@ int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn,
                                  struct npfec);
 int handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_io_assist(ioreq_t *p);
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
                   const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/17] x86/hvm: split I/O completion handling from state model
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (6 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 07/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 09/17] x86/hvm: remove hvm_io_pending() check in hvmemul_do_io() Paul Durrant
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The state of in-flight I/O and how its completion will be handled are
logically separate, and conflating the two makes the code unnecessarily
confusing.
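
The two concerns end up as separate fields in struct hvm_vcpu_io,
roughly as follows (a sketch using the names introduced below; the
state enumeration is abridged):

    enum hvm_io_state      io_state;      /* phase of the in-flight I/O */
    enum hvm_io_completion io_completion; /* action owed on completion:
                                             nothing, re-run the MMIO
                                             emulation, or finish a PIO
                                             read by updating rax */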

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c         |   42 +++++++++++++++++++++++++---------------
 xen/arch/x86/hvm/io.c          |    4 ++--
 xen/include/asm-x86/hvm/vcpu.h |   15 +++++++++-----
 3 files changed, 38 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0317d90..3891dce 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -415,31 +415,41 @@ static void hvm_io_assist(ioreq_t *p)
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    enum hvm_io_state io_state;
 
     p->state = STATE_IOREQ_NONE;
 
-    io_state = vio->io_state;
-    vio->io_state = HVMIO_none;
-
-    switch ( io_state )
+    switch ( vio->io_state )
     {
     case HVMIO_awaiting_completion:
+    {
+        enum hvm_io_completion completion = vio->io_completion;
+
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
+        vio->io_completion = HVMIO_no_completion;
+
+        switch ( completion )
+        {
+        case HVMIO_mmio_completion:
+            (void)handle_mmio();
+            break;
+
+        case HVMIO_pio_completion:
+            if ( vio->io_size == 4 ) /* Needs zero extension. */
+                guest_cpu_user_regs()->rax = (uint32_t)p->data;
+            else
+                memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+
+            vio->io_state = HVMIO_none;
+            break;
+        default:
+            break;
+        }
+
         break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
+    }
     default:
+        vio->io_state = HVMIO_none;
         break;
     }
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index f2cb520..e4b4350 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -93,7 +93,7 @@ int handle_mmio(void)
     if ( rc != X86EMUL_RETRY )
         vio->io_state = HVMIO_none;
     if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+        vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};
 
@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
             return 0;
         /* Completion in hvm_io_assist() with no re-emulation required. */
         ASSERT(dir == IOREQ_READ);
-        vio->io_state = HVMIO_handle_pio_awaiting_completion;
+        vio->io_completion = HVMIO_pio_completion;
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 9ecb702..ee5b258 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -34,11 +34,15 @@ enum hvm_io_state {
     HVMIO_none = 0,
     HVMIO_dispatched,
     HVMIO_awaiting_completion,
-    HVMIO_handle_mmio_awaiting_completion,
-    HVMIO_handle_pio_awaiting_completion,
     HVMIO_completed
 };
 
+enum hvm_io_completion {
+    HVMIO_no_completion = 0,
+    HVMIO_mmio_completion,
+    HVMIO_pio_completion
+};
+
 struct hvm_vcpu_asid {
     uint64_t generation;
     uint32_t asid;
@@ -46,9 +50,10 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state   io_state;
-    unsigned long       io_data;
-    int                 io_size;
+    enum hvm_io_state      io_state;
+    unsigned long          io_data;
+    int                    io_size;
+    enum hvm_io_completion io_completion;
 
     /*
      * HVM emulation:
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/17] x86/hvm: remove hvm_io_pending() check in hvmemul_do_io()
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (7 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 08/17] x86/hvm: split I/O completion handling from state model Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 10/17] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The check is done at the wrong point, since it is irrelevant when the
I/O is to be handled by the hypervisor, and its functionality can be
covered by returning X86EMUL_UNHANDLEABLE from hvm_send_assist_req()
instead.

This patch also removes the domain_crash() call from
hvm_send_assist_req(). Returning X86EMUL_UNHANDLEABLE allows the
higher layers of emulation to decide what to do instead.
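
The call site in hvmemul_do_io() then collapses to roughly the
following (condensed from the emulate.c hunk below):

    rc = hvm_send_assist_req(s, &p);
    if ( rc != X86EMUL_RETRY )
        vio->io_state = HVMIO_none;  /* nothing in flight after all */
    else if ( data_is_addr || dir == IOREQ_WRITE )
        rc = X86EMUL_OKAY;           /* no response data to wait for */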

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c    |   10 ++--------
 xen/arch/x86/hvm/hvm.c        |   16 ++++++----------
 xen/include/asm-x86/hvm/hvm.h |    2 +-
 3 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 59ebdfe..18d2401 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -153,12 +153,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( hvm_io_pending(curr) )
-    {
-        gdprintk(XENLOG_WARNING, "WARNING: io already pending?\n");
-        return X86EMUL_UNHANDLEABLE;
-    }
-
     vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
         HVMIO_dispatched : HVMIO_awaiting_completion;
     vio->io_size = size;
@@ -194,8 +188,8 @@ static int hvmemul_do_io(
         }
         else
         {
-            rc = X86EMUL_RETRY;
-            if ( !hvm_send_assist_req(s, &p) )
+            rc = hvm_send_assist_req(s, &p);
+            if ( rc != X86EMUL_RETRY )
                 vio->io_state = HVMIO_none;
             else if ( data_is_addr || dir == IOREQ_WRITE )
                 rc = X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3891dce..5fef4e7 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2664,7 +2664,7 @@ int hvm_buffered_io_send(ioreq_t *p)
     return 1;
 }
 
-bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
+int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
 {
     struct vcpu *curr = current;
     struct domain *d = curr->domain;
@@ -2672,7 +2672,7 @@ bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
 
     ASSERT(s);
     if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return 0; /* implicitly bins the i/o operation */
+        return X86EMUL_OKAY;
 
     list_for_each_entry ( sv,
                           &s->ioreq_vcpu_list,
@@ -2687,14 +2687,14 @@ bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
             {
                 gprintk(XENLOG_ERR, "device model set bad IO state %d\n",
                         p->state);
-                goto crash;
+                break;
             }
 
             if ( unlikely(p->vp_eport != port) )
             {
                 gprintk(XENLOG_ERR, "device model set bad event channel %d\n",
                         p->vp_eport);
-                goto crash;
+                break;
             }
 
             proto_p->state = STATE_IOREQ_NONE;
@@ -2710,15 +2710,11 @@ bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p)
              */
             p->state = STATE_IOREQ_READY;
             notify_via_xen_event_channel(d, port);
-            break;
+            return X86EMUL_RETRY;
         }
     }
 
-    return 1;
-
- crash:
-    domain_crash(d);
-    return 0;
+    return X86EMUL_UNHANDLEABLE;
 }
 
 void hvm_broadcast_assist_req(ioreq_t *p)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 77eeac5..57f9605 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -230,7 +230,7 @@ void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip);
 
 struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
                                                  ioreq_t *p);
-bool_t hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *p);
+int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *p);
 void hvm_broadcast_assist_req(ioreq_t *p);
 void hvm_complete_assist_req(ioreq_t *p);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/17] x86/hvm: remove HVMIO_dispatched I/O state
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (8 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 09/17] x86/hvm: remove hvm_io_pending() check in hvmemul_do_io() Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 11/17] x86/hvm: remove hvm_io_state enumeration Paul Durrant
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.
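
One behavioural consequence is that writes now also pass through
HVMIO_awaiting_completion, so PIO completion must check the direction
before touching the guest's rax (condensed from the hvm_io_assist()
hunk below; io_dir is the field added by this patch):

    case HVMIO_pio_completion:
        if ( vio->io_dir == IOREQ_READ )  /* a write copies nothing back */
        {
            if ( vio->io_size == 4 )      /* needs zero extension */
                guest_cpu_user_regs()->rax = (uint32_t)p->data;
            else
                memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
        }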

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |   15 ++-------------
 xen/arch/x86/hvm/hvm.c         |   35 ++++++++++++++---------------------
 xen/arch/x86/hvm/io.c          |   20 +++++++++-----------
 xen/include/asm-x86/hvm/vcpu.h |    2 +-
 4 files changed, 26 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 18d2401..941a25f 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -139,23 +139,14 @@ static int hvmemul_do_io(
         break;
     case HVMIO_completed:
         vio->io_state = HVMIO_none;
-        if ( data_is_addr || dir == IOREQ_WRITE )
-            return X86EMUL_UNHANDLEABLE;
         goto finish_access;
-    case HVMIO_dispatched:
-        /* May have to wait for previous cycle of a multi-write to complete. */
-        if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
-             (addr == (vio->mmio_large_write_pa +
-                       vio->mmio_large_write_bytes)) )
-            return X86EMUL_RETRY;
-        /* fallthrough */
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
-        HVMIO_dispatched : HVMIO_awaiting_completion;
+    vio->io_state = HVMIO_awaiting_completion;
     vio->io_size = size;
+    vio->io_dir = dir;
 
     if ( dir == IOREQ_WRITE )
     {
@@ -191,8 +182,6 @@ static int hvmemul_do_io(
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY )
                 vio->io_state = HVMIO_none;
-            else if ( data_is_addr || dir == IOREQ_WRITE )
-                rc = X86EMUL_OKAY;
         }
         break;
     }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5fef4e7..c18e1a8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -415,41 +415,34 @@ static void hvm_io_assist(ioreq_t *p)
 {
     struct vcpu *curr = current;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_completion completion = vio->io_completion;
 
     p->state = STATE_IOREQ_NONE;
 
-    switch ( vio->io_state )
-    {
-    case HVMIO_awaiting_completion:
-    {
-        enum hvm_io_completion completion = vio->io_completion;
+    BUG_ON(vio->io_state != HVMIO_awaiting_completion);
 
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        vio->io_completion = HVMIO_no_completion;
+    vio->io_state = HVMIO_completed;
+    vio->io_data = p->data;
+    vio->io_completion = HVMIO_no_completion;
 
-        switch ( completion )
-        {
-        case HVMIO_mmio_completion:
-            (void)handle_mmio();
-            break;
+    switch ( completion )
+    {
+    case HVMIO_mmio_completion:
+        (void)handle_mmio();
+        break;
 
-        case HVMIO_pio_completion:
+    case HVMIO_pio_completion:
+        if ( vio->io_dir == IOREQ_READ )
+        {
             if ( vio->io_size == 4 ) /* Needs zero extension. */
                 guest_cpu_user_regs()->rax = (uint32_t)p->data;
             else
                 memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-
-            vio->io_state = HVMIO_none;
-            break;
-        default:
-            break;
         }
 
+        vio->io_state = HVMIO_none;
         break;
-    }
     default:
-        vio->io_state = HVMIO_none;
         break;
     }
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index e4b4350..d22c5bf 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -90,19 +90,21 @@ int handle_mmio(void)
 
     rc = hvm_emulate_one(&ctxt);
 
-    if ( rc != X86EMUL_RETRY )
-        vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_completion = HVMIO_mmio_completion;
-    else
-        vio->mmio_access = (struct npfec){};
-
     switch ( rc )
     {
+    case X86EMUL_OKAY:
+        vio->mmio_access = (struct npfec){};
+        break;
+    case X86EMUL_RETRY:
+        vio->io_completion = HVMIO_mmio_completion;
+        break;
     case X86EMUL_UNHANDLEABLE:
+        vio->mmio_access = (struct npfec){};
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
         return 0;
     case X86EMUL_EXCEPTION:
+        vio->io_state = HVMIO_none;
+        vio->mmio_access = (struct npfec){};
         if ( ctxt.exn_pending )
             hvm_inject_trap(&ctxt.trap);
         break;
@@ -154,10 +156,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
         }
         break;
     case X86EMUL_RETRY:
-        if ( vio->io_state != HVMIO_awaiting_completion )
-            return 0;
-        /* Completion in hvm_io_assist() with no re-emulation required. */
-        ASSERT(dir == IOREQ_READ);
         vio->io_completion = HVMIO_pio_completion;
         break;
     default:
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index ee5b258..e86197e 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -32,7 +32,6 @@
 
 enum hvm_io_state {
     HVMIO_none = 0,
-    HVMIO_dispatched,
     HVMIO_awaiting_completion,
     HVMIO_completed
 };
@@ -53,6 +52,7 @@ struct hvm_vcpu_io {
     enum hvm_io_state      io_state;
     unsigned long          io_data;
     int                    io_size;
+    int                    io_dir;
     enum hvm_io_completion io_completion;
 
     /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/17] x86/hvm: remove hvm_io_state enumeration
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (9 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 10/17] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 12/17] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

Emulation request status is already covered by STATE_IOREQ_XXX values, so
just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY
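
For reference, the STATE_IOREQ_XXX values come from the public ioreq
interface (xen/include/public/hvm/ioreq.h); the definitions are:

    #define STATE_IOREQ_NONE        0  /* no I/O in flight               */
    #define STATE_IOREQ_READY       1  /* request ready for consumption  */
    #define STATE_IOREQ_INPROCESS   2  /* request being serviced         */
    #define STATE_IORESP_READY      3  /* response ready for consumption */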

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |   14 +++++++-------
 xen/arch/x86/hvm/hvm.c           |    8 ++++----
 xen/arch/x86/hvm/io.c            |    2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    6 +++---
 xen/include/asm-x86/hvm/vcpu.h   |    8 +-------
 6 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 941a25f..a17a13f 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -135,16 +135,16 @@ static int hvmemul_do_io(
 
     switch ( vio->io_state )
     {
-    case HVMIO_none:
+    case STATE_IOREQ_NONE:
         break;
-    case HVMIO_completed:
-        vio->io_state = HVMIO_none;
+    case STATE_IORESP_READY:
+        vio->io_state = STATE_IOREQ_NONE;
         goto finish_access;
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = HVMIO_awaiting_completion;
+    vio->io_state = STATE_IOREQ_READY;
     vio->io_size = size;
     vio->io_dir = dir;
 
@@ -162,7 +162,7 @@ static int hvmemul_do_io(
     {
     case X86EMUL_OKAY:
         vio->io_data = p.data;
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     case X86EMUL_UNHANDLEABLE:
     {
@@ -175,13 +175,13 @@ static int hvmemul_do_io(
             rc = process_io_intercept(curr, &p, &null_handler);
             if ( rc == X86EMUL_OKAY )
                 vio->io_data = p.data;
-            vio->io_state = HVMIO_none;
+            vio->io_state = STATE_IOREQ_NONE;
         }
         else
         {
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY )
-                vio->io_state = HVMIO_none;
+                vio->io_state = STATE_IOREQ_NONE;
         }
         break;
     }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c18e1a8..4a6028c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -419,9 +419,9 @@ static void hvm_io_assist(ioreq_t *p)
 
     p->state = STATE_IOREQ_NONE;
 
-    BUG_ON(vio->io_state != HVMIO_awaiting_completion);
+    BUG_ON(vio->io_state != STATE_IOREQ_READY);
 
-    vio->io_state = HVMIO_completed;
+    vio->io_state = STATE_IORESP_READY;
     vio->io_data = p->data;
     vio->io_completion = HVMIO_no_completion;
 
@@ -440,7 +440,7 @@ static void hvm_io_assist(ioreq_t *p)
                 memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
         }
 
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         break;
     default:
         break;
@@ -450,7 +450,7 @@ static void hvm_io_assist(ioreq_t *p)
      * Re-emulation may have scheduled another I/O so io_state set
      * at the top of the function may have changed.
      */
-    if ( vio->io_state == HVMIO_none )
+    if ( vio->io_state == STATE_IOREQ_NONE )
     {
         msix_write_completion(curr);
         vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index d22c5bf..1fd2f3f 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -103,7 +103,7 @@ int handle_mmio(void)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
         return 0;
     case X86EMUL_EXCEPTION:
-        vio->io_state = HVMIO_none;
+        vio->io_state = STATE_IOREQ_NONE;
         vio->mmio_access = (struct npfec){};
         if ( ctxt.exn_pending )
             hvm_inject_trap(&ctxt.trap);
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index be5797a..8b165c6 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != HVMIO_none )
+        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..8c2da9a 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -177,7 +177,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     hvm_emulate_prepare(&hvmemul_ctxt, regs);
 
-    if ( vio->io_state == HVMIO_completed )
+    if ( vio->io_state == STATE_IORESP_READY )
         realmode_emulate_one(&hvmemul_ctxt);
 
     /* Only deliver interrupts into emulated real mode. */
@@ -191,7 +191,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     curr->arch.hvm_vmx.vmx_emulate = 1;
     while ( curr->arch.hvm_vmx.vmx_emulate &&
             !softirq_pending(smp_processor_id()) &&
-            (vio->io_state == HVMIO_none) )
+            (vio->io_state == STATE_IOREQ_NONE) )
     {
         /*
          * Check for pending interrupts only every 16 instructions, because
@@ -216,7 +216,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != HVMIO_none )
+    if ( vio->io_state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index e86197e..ee9f46b 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -30,12 +30,6 @@
 #include <asm/hvm/svm/nestedsvm.h>
 #include <asm/mtrr.h>
 
-enum hvm_io_state {
-    HVMIO_none = 0,
-    HVMIO_awaiting_completion,
-    HVMIO_completed
-};
-
 enum hvm_io_completion {
     HVMIO_no_completion = 0,
     HVMIO_mmio_completion,
@@ -49,7 +43,7 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    enum hvm_io_state      io_state;
+    uint8_t                io_state;
     unsigned long          io_data;
     int                    io_size;
     int                    io_dir;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 12/17] x86/hvm: use ioreq_t to track in-flight state
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (10 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 11/17] x86/hvm: remove hvm_io_state enumeration Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 13/17] x86/hvm: only acquire RAM pages for emulation when we need to Paul Durrant
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

Use an ioreq_t rather than open-coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion, by re-issuing the handle_pio() call.
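
An ioreq_t already carries everything the open-coded fields did; the
members relied upon here are roughly (a sketch, not the full public
structure from xen/include/public/hvm/ioreq.h):

    typedef struct {
        uint64_t addr;        /* physical address or port         */
        uint64_t data;        /* data, or paddr of data           */
        uint32_t count;       /* number of reps                   */
        uint32_t size;        /* size of each rep, in bytes       */
        uint8_t  state;       /* STATE_IOREQ_XXX                  */
        uint8_t  data_is_ptr; /* data field is a guest paddr      */
        uint8_t  dir;         /* IOREQ_READ or IOREQ_WRITE        */
        uint8_t  df;          /* EFLAGS.DF: descending access     */
        uint8_t  type;        /* IOREQ_TYPE_PIO / IOREQ_TYPE_COPY */
        /* ... */
    } ioreq_t;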

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c       |  136 +++++++++++++++++++-------------------
 xen/arch/x86/hvm/hvm.c           |   19 ++----
 xen/arch/x86/hvm/io.c            |    2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c |    2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |    6 +-
 xen/include/asm-x86/hvm/vcpu.h   |    5 +-
 6 files changed, 80 insertions(+), 90 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a17a13f..d91cd74 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -93,6 +93,7 @@ static int hvmemul_do_io(
         .df = df,
         .data = data,
         .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
+        .state = STATE_IOREQ_READY,
     };
     void *p_data = (void *)data;
     int rc;
@@ -133,21 +134,66 @@ static int hvmemul_do_io(
         }
     }
 
-    switch ( vio->io_state )
+    switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
+        vio->io_req = p;
         break;
     case STATE_IORESP_READY:
-        vio->io_state = STATE_IOREQ_NONE;
-        goto finish_access;
+        p = vio->io_req;
+
+        ASSERT(p.type == is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO);
+        ASSERT(p.addr == addr);
+        ASSERT(p.size == size);
+        ASSERT(p.count == *reps);
+        ASSERT(p.dir == dir);
+        ASSERT(p.df == df);
+        ASSERT(p.data_is_ptr == data_is_addr);
+
+ resp_ready:
+        vio->io_req.state = STATE_IOREQ_NONE;
+
+        if ( dir == IOREQ_READ )
+        {
+            hvmtrace_io_assist(is_mmio, &p);
+
+            if ( !data_is_addr )
+                memcpy((void *)data, &p.data, size);
+        }
+
+        if ( is_mmio && !data_is_addr )
+        {
+            /* Part of a multi-cycle read or write? */
+            if ( dir == IOREQ_WRITE )
+            {
+                paddr_t pa = vio->mmio_large_write_pa;
+                unsigned int bytes = vio->mmio_large_write_bytes;
+                if ( bytes == 0 )
+                    pa = vio->mmio_large_write_pa = addr;
+                if ( addr == (pa + bytes) )
+                    vio->mmio_large_write_bytes += size;
+            }
+            else
+            {
+                paddr_t pa = vio->mmio_large_read_pa;
+                unsigned int bytes = vio->mmio_large_read_bytes;
+                if ( bytes == 0 )
+                    pa = vio->mmio_large_read_pa = addr;
+                if ( (addr == (pa + bytes)) &&
+                     ((bytes + size) <= sizeof(vio->mmio_large_read)) )
+                {
+                    memcpy(&vio->mmio_large_read[addr - pa], (void *)data,
+                           size);
+                    vio->mmio_large_read_bytes += size;
+                }
+            }
+        }
+
+        return X86EMUL_OKAY;
     default:
         return X86EMUL_UNHANDLEABLE;
     }
 
-    vio->io_state = STATE_IOREQ_READY;
-    vio->io_size = size;
-    vio->io_dir = dir;
-
     if ( dir == IOREQ_WRITE )
     {
         if ( !data_is_addr )
@@ -158,77 +204,31 @@ static int hvmemul_do_io(
 
     rc = hvm_io_intercept(&p);
 
-    switch ( rc )
-    {
-    case X86EMUL_OKAY:
-        vio->io_data = p.data;
-        vio->io_state = STATE_IOREQ_NONE;
-        break;
-    case X86EMUL_UNHANDLEABLE:
+    if ( rc == X86EMUL_UNHANDLEABLE )
     {
         struct hvm_ioreq_server *s =
             hvm_select_ioreq_server(curr->domain, &p);
 
         /* If there is no suitable backing DM, just ignore accesses */
-        if ( !s )
-        {
-            rc = process_io_intercept(curr, &p, &null_handler);
-            if ( rc == X86EMUL_OKAY )
-                vio->io_data = p.data;
-            vio->io_state = STATE_IOREQ_NONE;
-        }
-        else
-        {
-            rc = hvm_send_assist_req(s, &p);
-            if ( rc != X86EMUL_RETRY )
-                vio->io_state = STATE_IOREQ_NONE;
-        }
-        break;
-    }
-    default:
-        BUG();
+        rc = !s ?
+            process_io_intercept(curr, &p, &null_handler) :
+            hvm_send_assist_req(s, &p);
     }
 
-    if ( rc != X86EMUL_OKAY )
-        return rc;
-
- finish_access:
-    if ( dir == IOREQ_READ )
-    {
-        hvmtrace_io_assist(is_mmio, &p);
-
-        if ( !data_is_addr )
-            memcpy((void *)data, &vio->io_data, size);
-    }
-
-    if ( is_mmio && !data_is_addr )
+    switch ( rc )
     {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_write_pa = addr;
-            if ( addr == (pa + bytes) )
-                vio->mmio_large_write_bytes += size;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_read_pa = addr;
-            if ( (addr == (pa + bytes)) &&
-                 ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-            {
-                memcpy(&vio->mmio_large_read[bytes], p_data, size);
-                vio->mmio_large_read_bytes += size;
-            }
-        }
+    case X86EMUL_OKAY:
+        goto resp_ready;
+    case X86EMUL_UNHANDLEABLE:
+        vio->io_req.state = STATE_IOREQ_NONE;
+        break;
+    case X86EMUL_RETRY:
+        break;
+    default:
+        BUG();
     }
 
-    return X86EMUL_OKAY;
+    return rc;
 }
 
 int hvmemul_do_io_buffer(
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4a6028c..c17ff6a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -419,10 +419,10 @@ static void hvm_io_assist(ioreq_t *p)
 
     p->state = STATE_IOREQ_NONE;
 
-    BUG_ON(vio->io_state != STATE_IOREQ_READY);
+    BUG_ON(vio->io_req.state != STATE_IOREQ_READY);
 
-    vio->io_state = STATE_IORESP_READY;
-    vio->io_data = p->data;
+    vio->io_req.state = STATE_IORESP_READY;
+    vio->io_req.data = p->data;
     vio->io_completion = HVMIO_no_completion;
 
     switch ( completion )
@@ -432,15 +432,8 @@ static void hvm_io_assist(ioreq_t *p)
         break;
 
     case HVMIO_pio_completion:
-        if ( vio->io_dir == IOREQ_READ )
-        {
-            if ( vio->io_size == 4 ) /* Needs zero extension. */
-                guest_cpu_user_regs()->rax = (uint32_t)p->data;
-            else
-                memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        }
-
-        vio->io_state = STATE_IOREQ_NONE;
+        (void)handle_pio(vio->io_req.addr, vio->io_req.size,
+                         vio->io_req.dir);
         break;
     default:
         break;
@@ -450,7 +443,7 @@ static void hvm_io_assist(ioreq_t *p)
      * Re-emulation may have scheduled another I/O so io_state set
      * at the top of the function may have changed.
      */
-    if ( vio->io_state == STATE_IOREQ_NONE )
+    if ( vio->io_req.state == STATE_IOREQ_NONE )
     {
         msix_write_completion(curr);
         vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 1fd2f3f..86aab35 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -103,7 +103,7 @@ int handle_mmio(void)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
         return 0;
     case X86EMUL_EXCEPTION:
-        vio->io_state = STATE_IOREQ_NONE;
+        vio->io_req.state = STATE_IOREQ_NONE;
         vio->mmio_access = (struct npfec){};
         if ( ctxt.exn_pending )
             hvm_inject_trap(&ctxt.trap);
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 8b165c6..78667a2 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
          * Delay the injection because this would result in delivering
          * an interrupt *within* the execution of an instruction.
          */
-        if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
+        if ( v->arch.hvm_vcpu.hvm_io.io_req.state != STATE_IOREQ_NONE )
             return hvm_intblk_shadow;
 
         if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 8c2da9a..69c0297 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -177,7 +177,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
     hvm_emulate_prepare(&hvmemul_ctxt, regs);
 
-    if ( vio->io_state == STATE_IORESP_READY )
+    if ( vio->io_req.state == STATE_IORESP_READY )
         realmode_emulate_one(&hvmemul_ctxt);
 
     /* Only deliver interrupts into emulated real mode. */
@@ -191,7 +191,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     curr->arch.hvm_vmx.vmx_emulate = 1;
     while ( curr->arch.hvm_vmx.vmx_emulate &&
             !softirq_pending(smp_processor_id()) &&
-            (vio->io_state == STATE_IOREQ_NONE) )
+            (vio->io_req.state == STATE_IOREQ_NONE) )
     {
         /*
          * Check for pending interrupts only every 16 instructions, because
@@ -216,7 +216,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     }
 
     /* Need to emulate next time if we've started an IO operation */
-    if ( vio->io_state != STATE_IOREQ_NONE )
+    if ( vio->io_req.state != STATE_IOREQ_NONE )
         curr->arch.hvm_vmx.vmx_emulate = 1;
 
     if ( !curr->arch.hvm_vmx.vmx_emulate && !curr->arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index ee9f46b..2a1da4b 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -43,10 +43,7 @@ struct hvm_vcpu_asid {
 
 struct hvm_vcpu_io {
     /* I/O request in flight to device model. */
-    uint8_t                io_state;
-    unsigned long          io_data;
-    int                    io_size;
-    int                    io_dir;
+    ioreq_t                io_req;
     enum hvm_io_completion io_completion;
 
     /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 13/17] x86/hvm: only acquire RAM pages for emulation when we need to
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (11 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 12/17] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 14/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist() Paul Durrant
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

If hvmemul_do_io_addr() is called to complete a previously issued
emulation then there is no need to acquire the RAM pages again. There
is also no need to re-calculate the value of *reps, provided
hvmemul_do_io() updates it when returning X86EMUL_OKAY.
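
To illustrate the rep-counting in the STATE_IOREQ_NONE arm below, a
self-contained sketch with hypothetical numbers (size 4, ascending
access starting at guest physical 0xfff8, 4K pages):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t ram_addr = 0xfff8, size = 4, reps = 8, count;

        /* How many reps stay on the starting page? */
        for ( count = 0; count < reps; count++ )
        {
            uint64_t end = ram_addr + (count + 1) * size - 1;

            if ( (ram_addr >> 12) != (end >> 12) )
                break;
        }
        printf("%llu\n", (unsigned long long)count); /* prints 2 */
        return 0;
    }

A count of zero would mean even the first rep spans two pages, in
which case the second page is acquired as well and a single rep is
emulated.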

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |   86 ++++++++++++++++++++++++--------------------
 1 file changed, 48 insertions(+), 38 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d91cd74..ab7c716 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -145,7 +145,6 @@ static int hvmemul_do_io(
         ASSERT(p.type == is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO);
         ASSERT(p.addr == addr);
         ASSERT(p.size == size);
-        ASSERT(p.count == *reps);
         ASSERT(p.dir == dir);
         ASSERT(p.df == df);
         ASSERT(p.data_is_ptr == data_is_addr);
@@ -189,6 +188,7 @@ static int hvmemul_do_io(
             }
         }
 
+        *reps = p.count;
         return X86EMUL_OKAY;
     default:
         return X86EMUL_UNHANDLEABLE;
@@ -285,56 +285,66 @@ int hvmemul_do_io_addr(
     uint8_t dir, bool_t df, paddr_t ram_addr)
 {
     struct vcpu *v = current;
+    struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     unsigned long ram_gmfn = paddr_to_pfn(ram_addr);
     struct page_info *ram_page[2];
     int nr_pages = 0;
     unsigned long count;
     int rc;
 
-    rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
-    if ( rc != X86EMUL_OKAY )
-        goto out;
-
-    nr_pages++;
-
-    /* Determine how many reps will fit within this page */
-    for ( count = 0; count < *reps; count++ )
+    switch ( vio->io_req.state )
     {
-        paddr_t start, end;
+    case STATE_IOREQ_NONE:
+        rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
+        if ( rc != X86EMUL_OKAY )
+            goto out;
 
-        if ( df )
-        {
-            start = ram_addr - count * size;
-            end = ram_addr + size - 1;
-        }
-        else
+        nr_pages++;
+
+        /* Determine how many reps will fit within this page */
+        for ( count = 0; count < *reps; count++ )
         {
-            start = ram_addr;
-            end = ram_addr + (count + 1) * size - 1;
-        }
+            paddr_t start, end;
 
-        if ( paddr_to_pfn(start) != ram_gmfn ||
-             paddr_to_pfn(end) != ram_gmfn )
-            break;
-    }
+            if ( df )
+            {
+                start = ram_addr - count * size;
+                end = ram_addr + size - 1;
+            }
+            else
+            {
+                start = ram_addr;
+                end = ram_addr + (count + 1) * size - 1;
+            }
 
-    if ( count == 0 )
-    {
-        /*
-         * This access must span two pages, so grab a reference to
-         * the next page and do a single rep.
-         */
-        rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
-                                  &ram_page[nr_pages]);
-        if ( rc != X86EMUL_OKAY )
-            goto out;
+            if ( paddr_to_pfn(start) != ram_gmfn ||
+                 paddr_to_pfn(end) != ram_gmfn )
+                break;
+        }
 
-        nr_pages++;
-        count = 1;
-    }
+        if ( count == 0 )
+        {
+            /*
+             * This access must span two pages, so grab a reference
+             * to the next page and do a single rep.
+             */
+            rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+                                      &ram_page[nr_pages]);
+            if ( rc != X86EMUL_OKAY )
+                goto out;
+
+            nr_pages++;
+            count = 1;
+        }
 
-    v->arch.hvm_vcpu.hvm_io.io_retry = (count < *reps);
-    *reps = count;
+        v->arch.hvm_vcpu.hvm_io.io_retry = (count < *reps);
+        *reps = count;
+        break;
+    case STATE_IORESP_READY:
+        break;
+    default:
+        return X86EMUL_UNHANDLEABLE;
+    }
 
     rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
                        (uint64_t)ram_addr);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 14/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist()
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (12 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 13/17] x86/hvm: only acquire RAM pages for emulation when we need to Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 15/17] x86/hvm: make sure translated MMIO reads or writes fall within a page Paul Durrant
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The is_mmio parameter can be inferred from the ioreq type.
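
The two type values involved are defined by the public ioreq interface
(xen/include/public/hvm/ioreq.h):

    #define IOREQ_TYPE_PIO   0  /* pio      */
    #define IOREQ_TYPE_COPY  1  /* mmio ops */

so a simple (p->type == IOREQ_TYPE_COPY) test recovers the flag.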

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ab7c716..243824f 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -23,8 +23,9 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm/svm.h>
 
-static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
+static void hvmtrace_io_assist(ioreq_t *p)
 {
+    bool_t is_mmio = (p->type == IOREQ_TYPE_COPY);
     unsigned int size, event;
     unsigned char buffer[12];
 
@@ -154,7 +155,7 @@ static int hvmemul_do_io(
 
         if ( dir == IOREQ_READ )
         {
-            hvmtrace_io_assist(is_mmio, &p);
+            hvmtrace_io_assist(&p);
 
             if ( !data_is_addr )
                 memcpy((void *)data, &p.data, size);
@@ -199,7 +200,7 @@ static int hvmemul_do_io(
         if ( !data_is_addr )
             memcpy(&p.data, (void *)data, size);
 
-        hvmtrace_io_assist(is_mmio, &p);
+        hvmtrace_io_assist(&p);
     }
 
     rc = hvm_io_intercept(&p);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 15/17] x86/hvm: make sure translated MMIO reads or writes fall within a page
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (13 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 14/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist() Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 16/17] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...otherwise they will simply carry on to the next page using a normal
linear-to-phys translation.
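
A self-contained sketch of the chunking in question, with hypothetical
numbers (a 4-byte access at page offset 0xffe, 4K pages), showing
where the loop now gives up instead of walking onto the next page:

    #include <stdio.h>

    int main(void)
    {
        unsigned int off = 0xffe, bytes = 4, chunk = 4;

        while ( chunk & (chunk - 1) )   /* largest power of 2 <= bytes */
            chunk &= chunk - 1;
        if ( off + bytes > 0x1000 )     /* access crosses the page...  */
            while ( off & (chunk - 1) ) /* ...so align chunk to off    */
                chunk >>= 1;

        while ( off + chunk <= 0x1000 )
        {
            printf("issue %u bytes at offset %#x\n", chunk, off);
            if ( bytes == chunk )       /* whole access fitted in-page */
                return 0;
            off += chunk;
            bytes -= chunk;
            if ( bytes < chunk )
                chunk = bytes;
        }

        /* With this patch the leftover is failed as X86EMUL_UNHANDLEABLE
         * instead of being translated onto the next page. */
        printf("%u bytes spill onto the next page\n", bytes);
        return 0;
    }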

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 243824f..46fab98 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -645,7 +645,6 @@ static int __hvmemul_read(
                                         p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 return rc;
-            addr += chunk;
             off += chunk;
             gpa += chunk;
             p_data += chunk;
@@ -653,6 +652,8 @@ static int __hvmemul_read(
             if ( bytes < chunk )
                 chunk = bytes;
         }
+
+        return X86EMUL_UNHANDLEABLE;
     }
 
     if ( (seg != x86_seg_none) &&
@@ -789,7 +790,6 @@ static int hvmemul_write(
                                         p_data);
             if ( rc != X86EMUL_OKAY || bytes == chunk )
                 return rc;
-            addr += chunk;
             off += chunk;
             gpa += chunk;
             p_data += chunk;
@@ -797,6 +797,8 @@ static int hvmemul_write(
             if ( bytes < chunk )
                 chunk = bytes;
         }
+
+        return X86EMUL_UNHANDLEABLE;
     }
 
     if ( (seg != x86_seg_none) &&
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 16/17] x86/hvm: remove multiple open coded 'chunking' loops
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (14 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 15/17] x86/hvm: make sure translated MMIO reads or writes fall within a page Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-08 14:33 ` [PATCH 17/17] x86/hvm: track large memory mapped accesses by linear address Paul Durrant
  2015-06-09 14:43 ` [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.
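
The shared chunking policy both helpers rely on picks the largest
power of 2 not exceeding sizeof(long), then shrinks as the remainder
shrinks; a self-contained trace with a hypothetical 14-byte access on
a 64-bit build:

    #include <stdio.h>

    /* Portable stand-in for Xen's fls(): index of the highest set bit. */
    static unsigned int fls_(unsigned int x)
    {
        unsigned int r = 0;

        while ( x )
        {
            x >>= 1;
            r++;
        }
        return r;
    }

    int main(void)
    {
        unsigned int size = 14;
        unsigned int chunk = 1u << (fls_(size) - 1);

        if ( chunk > sizeof(long) )
            chunk = sizeof(long);

        while ( size != 0 )
        {
            printf("access %u bytes\n", chunk); /* 8, then 4, then 2 */
            size -= chunk;
            while ( chunk > size )
                chunk >>= 1;
        }
        return 0;
    }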

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |  237 +++++++++++++++++++++++++-------------------
 1 file changed, 133 insertions(+), 104 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 46fab98..8b9b7f2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -599,6 +599,124 @@ static int hvmemul_virtual_to_linear(
     return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(paddr_t mmio_addr,
+                                    unsigned int size,
+                                    uint8_t dir,
+                                    void *buffer)
+{
+    unsigned long one_rep = 1;
+    paddr_t page_off = mmio_addr & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    int rc = 0;
+
+    /* Accesses must fall within a page */
+    if (page_off + size > PAGE_SIZE)
+        return X86EMUL_UNHANDLEABLE;
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    chunk = 1 << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    while ( size != 0 )
+    {
+        rc = hvmemul_do_mmio_buffer(mmio_addr, &one_rep, chunk, dir, 0,
+                                    buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk */
+        mmio_addr += chunk;
+        buffer += chunk;
+        size -= chunk;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_phys_mmio_read(paddr_t mmio_addr,
+                                  unsigned int size,
+                                  void *buffer)
+{
+    return hvmemul_phys_mmio_access(mmio_addr, size, IOREQ_READ, buffer);
+}
+
+static int hvmemul_phys_mmio_write(paddr_t mmio_addr,
+                                   unsigned int size,
+                                   void *buffer)
+{
+    return hvmemul_phys_mmio_access(mmio_addr, size, IOREQ_WRITE, buffer);
+}
+
+static int hvmemul_linear_mmio_access(unsigned long mmio_addr,
+                                      unsigned int size,
+                                      uint8_t dir,
+                                      void *buffer,
+                                      uint32_t pfec,
+                                      struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+    unsigned long page_off = mmio_addr & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
+    rc = hvmemul_linear_to_phys(mmio_addr, &gpa, chunk,
+                                &one_rep, pfec, hvmemul_ctxt);
+    while ( rc == X86EMUL_OKAY )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        mmio_addr += chunk;
+        ASSERT((mmio_addr & (PAGE_SIZE - 1)) == 0);
+        buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        chunk = min_t(unsigned int, size, PAGE_SIZE);
+        rc = hvmemul_linear_to_phys(mmio_addr, &gpa, chunk,
+                                    &one_rep, pfec, hvmemul_ctxt);
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_read(unsigned long mmio_addr,
+                                    unsigned int size,
+                                    void *buffer,
+                                    uint32_t pfec,
+                                    struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+    return hvmemul_linear_mmio_access(mmio_addr, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt);
+}
+
+static int hvmemul_linear_mmio_write(unsigned long mmio_addr,
+                                     unsigned int size,
+                                     void *buffer,
+                                     uint32_t pfec,
+                                     struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+    return hvmemul_linear_mmio_access(mmio_addr, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt);
+}
+
 static int __hvmemul_read(
     enum x86_segment seg,
     unsigned long offset,
@@ -608,52 +726,26 @@ static int __hvmemul_read(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
-    unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
+    unsigned long addr, one_rep = 1;
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
-        seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
+        seg, offset, bytes, &one_rep, access_type, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /*
-     * We only need to handle sizes actual instruction operands can have. All
-     * such sizes are either powers of 2 or the sum of two powers of 2. Thus
-     * picking as initial chunk size the largest power of 2 not greater than
-     * the total size will always result in only power-of-2 size requests
-     * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2).
-     */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
 
     if ( ((access_type != hvm_access_insn_fetch
            ? vio->mmio_access.read_access
            : vio->mmio_access.insn_fetch)) &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
     {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
+        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) |
+               (addr & (PAGE_SIZE - 1)));
 
-        return X86EMUL_UNHANDLEABLE;
+        return hvmemul_phys_mmio_read(gpa, bytes, p_data);
     }
 
     if ( (seg != x86_seg_none) &&
@@ -673,30 +765,9 @@ static int __hvmemul_read(
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
             return X86EMUL_UNHANDLEABLE;
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+
+        return hvmemul_linear_mmio_read(addr, bytes, p_data,
+                                        pfec, hvmemul_ctxt);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
@@ -761,44 +832,24 @@ static int hvmemul_write(
     struct hvm_emulate_ctxt *hvmemul_ctxt =
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
-    unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
+    unsigned long addr, one_rep = 1;
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
-        seg, offset, bytes, &reps, hvm_access_write, hvmemul_ctxt, &addr);
+        seg, offset, bytes, &one_rep, hvm_access_write, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /* See the respective comment in __hvmemul_read(). */
-    while ( chunk & (chunk - 1) )
-        chunk &= chunk - 1;
-    if ( off + bytes > PAGE_SIZE )
-        while ( off & (chunk - 1) )
-            chunk >>= 1;
 
     if ( vio->mmio_access.write_access &&
          (vio->mmio_gva == (addr & PAGE_MASK)) )
     {
-        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
-        while ( (off + chunk) <= PAGE_SIZE )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                return rc;
-            off += chunk;
-            gpa += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-        }
+        gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) |
+               (addr & (PAGE_SIZE - 1)));
 
-        return X86EMUL_UNHANDLEABLE;
+        return hvmemul_phys_mmio_write(gpa, bytes, p_data);
     }
 
     if ( (seg != x86_seg_none) &&
@@ -814,30 +865,8 @@ static int hvmemul_write(
     case HVMCOPY_bad_gva_to_gfn:
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
-        rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                    hvmemul_ctxt);
-        while ( rc == X86EMUL_OKAY )
-        {
-            rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_WRITE, 0,
-                                        p_data);
-            if ( rc != X86EMUL_OKAY || bytes == chunk )
-                break;
-            addr += chunk;
-            off += chunk;
-            p_data += chunk;
-            bytes -= chunk;
-            if ( bytes < chunk )
-                chunk = bytes;
-            if ( off < PAGE_SIZE )
-                gpa += chunk;
-            else
-            {
-                rc = hvmemul_linear_to_phys(addr, &gpa, chunk, &reps, pfec,
-                                            hvmemul_ctxt);
-                off = 0;
-            }
-        }
-        return rc;
+        return hvmemul_linear_mmio_write(addr, bytes, p_data,
+                                         pfec, hvmemul_ctxt);
     case HVMCOPY_gfn_paged_out:
     case HVMCOPY_gfn_shared:
         return X86EMUL_RETRY;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 17/17] x86/hvm: track large memory mapped accesses by linear address
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (15 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 16/17] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
@ 2015-06-08 14:33 ` Paul Durrant
  2015-06-09 14:43 ` [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
  17 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-08 14:33 UTC (permalink / raw
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Keir Fraser, Jan Beulich

The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issuing component I/O, is defeated by accesses that cross a page
boundary, because it tracks physical rather than linear addresses. The code
is also only relevant to memory mapped I/O to or from a buffer.

This patch re-factors the code and moves it into
hvmemul_linear_mmio_access() where it is relevant and where it has
access to linear addresses.
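
[Editor's illustration, not part of the patch: a minimal standalone sketch
of the re-issue avoidance described above, mirroring the mmio_cache
structure added below; the function names chunk_done() and chunk_record()
are hypothetical. Chunks already completed for the current large access are
recorded per direction, keyed by their offset into the access, so that when
the emulation is retried a read chunk is replayed from the cache, and a
write chunk is skipped, rather than re-issued to the device model.]

    #include <stdint.h>
    #include <string.h>

    struct mmio_cache {
        unsigned long size;   /* bytes of the access completed so far */
        uint8_t buffer[32];   /* large enough for an m256 access */
    };

    /*
     * Returns 1 if this chunk was already done, replaying the cached data
     * for reads. Callers guarantee off + chunk <= sizeof(cache->buffer).
     */
    static int chunk_done(const struct mmio_cache *cache, unsigned int off,
                          void *buf, unsigned int chunk, int is_read)
    {
        if ( off + chunk > cache->size )
            return 0;
        if ( is_read )
            memcpy(buf, &cache->buffer[off], chunk);
        return 1;
    }

    /* Record a newly completed chunk at offset 'off' into the access. */
    static void chunk_record(struct mmio_cache *cache, unsigned int off,
                             const void *buf, unsigned int chunk)
    {
        memcpy(&cache->buffer[off], buf, chunk);
        cache->size += chunk;
    }

    int main(void)
    {
        struct mmio_cache read_cache = { 0 };
        uint8_t buf[4] = { 1, 2, 3, 4 };

        chunk_record(&read_cache, 0, buf, 4);          /* first pass */
        return !chunk_done(&read_cache, 0, buf, 4, 1); /* retry replays */
    }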

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/emulate.c     |   82 +++++++++++++---------------------------
 xen/include/asm-x86/hvm/vcpu.h |   16 ++++----
 2 files changed, 36 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8b9b7f2..288d69d 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -112,28 +112,7 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }
 
-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-                return X86EMUL_OKAY;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                memcpy(p_data, &vio->mmio_large_read[addr - pa],
-                       size);
-                return X86EMUL_OKAY;
-            }
-        }
-    }
+    vio = &curr->arch.hvm_vcpu.hvm_io;
 
     switch ( vio->io_req.state )
     {
@@ -161,34 +140,6 @@ static int hvmemul_do_io(
                 memcpy((void *)data, &p.data, size);
         }
 
-        if ( is_mmio && !data_is_addr )
-        {
-            /* Part of a multi-cycle read or write? */
-            if ( dir == IOREQ_WRITE )
-            {
-                paddr_t pa = vio->mmio_large_write_pa;
-                unsigned int bytes = vio->mmio_large_write_bytes;
-                if ( bytes == 0 )
-                    pa = vio->mmio_large_write_pa = addr;
-                if ( addr == (pa + bytes) )
-                    vio->mmio_large_write_bytes += size;
-            }
-            else
-            {
-                paddr_t pa = vio->mmio_large_read_pa;
-                unsigned int bytes = vio->mmio_large_read_bytes;
-                if ( bytes == 0 )
-                    pa = vio->mmio_large_read_pa = addr;
-                if ( (addr == (pa + bytes)) &&
-                     ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-                {
-                    memcpy(&vio->mmio_large_read[addr - pa], (void *)data,
-                           size);
-                    vio->mmio_large_read_bytes += size;
-                }
-            }
-        }
-
         *reps = p.count;
         return X86EMUL_OKAY;
     default:
@@ -666,8 +617,9 @@ static int hvmemul_linear_mmio_access(unsigned long mmio_addr,
                                       uint32_t pfec,
                                       struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
     unsigned long page_off = mmio_addr & (PAGE_SIZE - 1);
-    unsigned int chunk;
+    unsigned int chunk, buffer_off = 0;
     paddr_t gpa;
     unsigned long one_rep = 1;
     int rc;
@@ -677,13 +629,33 @@ static int hvmemul_linear_mmio_access(unsigned long mmio_addr,
                                 &one_rep, pfec, hvmemul_ctxt);
     while ( rc == X86EMUL_OKAY )
     {
-        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( (buffer_off + chunk) <= vio->mmio_cache[dir].size )
+        {
+            if ( dir == IOREQ_READ )
+                memcpy(buffer,
+                       &vio->mmio_cache[IOREQ_READ].buffer[buffer_off],
+                       chunk);
+            else
+                ASSERT(memcmp(buffer,
+                              &vio->mmio_cache[IOREQ_WRITE].buffer[buffer_off],
+                              chunk) == 0);
+        }
+        else
+        {
+            rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk */
+            memcpy(&vio->mmio_cache[dir].buffer[buffer_off], buffer, chunk);
+            vio->mmio_cache[dir].size += chunk;
+        }
 
         mmio_addr += chunk;
         ASSERT((mmio_addr & (PAGE_SIZE - 1)) == 0);
         buffer += chunk;
+        buffer_off += chunk;
         size -= chunk;
 
         if ( size == 0 )
@@ -1645,7 +1617,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
         rc = X86EMUL_RETRY;
     if ( rc != X86EMUL_RETRY )
     {
-        vio->mmio_large_read_bytes = vio->mmio_large_write_bytes = 0;
+        memset(&vio->mmio_cache, 0, sizeof(vio->mmio_cache));
         vio->mmio_insn_bytes = 0;
     }
     else
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 2a1da4b..83f536b 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -56,13 +56,15 @@ struct hvm_vcpu_io {
     unsigned long       mmio_gva;
     unsigned long       mmio_gpfn;
 
-    /* We may read up to m256 as a number of device-model transactions. */
-    paddr_t mmio_large_read_pa;
-    uint8_t mmio_large_read[32];
-    unsigned int mmio_large_read_bytes;
-    /* We may write up to m256 as a number of device-model transactions. */
-    unsigned int mmio_large_write_bytes;
-    paddr_t mmio_large_write_pa;
+    /*
+     * We may read or write up to m256 as a number of device-model
+     * transactions.
+     */
+    struct {
+        unsigned long size;
+        uint8_t buffer[32];
+    } mmio_cache[2]; /* Indexed by ioreq type */
+
     /* For retries we shouldn't re-fetch the instruction. */
     unsigned int mmio_insn_bytes;
     unsigned char mmio_insn[16];
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
  2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
                   ` (16 preceding siblings ...)
  2015-06-08 14:33 ` [PATCH 17/17] x86/hvm: track large memory mapped accesses by linear address Paul Durrant
@ 2015-06-09 14:43 ` Fabio Fantoni
  2015-06-09 15:21   ` Paul Durrant
  17 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2015-06-09 14:43 UTC (permalink / raw
  To: Paul Durrant, xen-devel

On 08/06/2015 16:33, Paul Durrant wrote:
> This patch series re-works much of the code involved in emulation of port
> and memory mapped I/O for HVM guests.
>
> The code has become very convoluted and, at least by inspection, certain
> emulations will apparently malfunction.
>
> The series is broken down into 17 patches (which are also available in
> my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> on the emulation18 branch) as follows:

Many thanks for your work.
I tested the patches, taking them from the emulation18 branch.
I tried xl create of a Windows 7 domU, but during xl create dom0 instantly
reboots, and there is nothing about the problem in the logs (kern.log,
syslog, qemu log, xl log, etc.).
I also tried a Linux (HVM) domU, and stdvga instead of the qxl I had
configured, but with the same result.

To debug the problem, should I enable all Xen debug options in the grub
entry and redirect the output to serial, as when debugging a dom0 boot
problem, or should I do something different?

Thanks for any reply, and sorry for my English.

> [...]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
  2015-06-09 14:43 ` [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
@ 2015-06-09 15:21   ` Paul Durrant
  2015-06-10  9:13     ` Fabio Fantoni
  0 siblings, 1 reply; 23+ messages in thread
From: Paul Durrant @ 2015-06-09 15:21 UTC (permalink / raw
  To: Fabio Fantoni, xen-devel@lists.xenproject.org

> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> Sent: 09 June 2015 15:44
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH 00/17] x86/hvm: I/O emulation cleanup and
> fix
> 
> On 08/06/2015 16:33, Paul Durrant wrote:
> [...]

Having serial is very useful in these cases. I've been debugging with a 32-bit Win 7 VM so it would be useful if you could bisect a bit. There are natural boundaries in the series you could try first:

Apply everything up to and including 'x86/hvm: unify stdvga mmio intercept with standard mmio intercept'
Then try everything up to and including 'x86/hvm: remove extraneous parameter from hvmtrace_io_assist()'
Then, if you get that far, try the rest one at a time.

  Paul


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
  2015-06-09 15:21   ` Paul Durrant
@ 2015-06-10  9:13     ` Fabio Fantoni
  2015-06-10 14:49       ` Fabio Fantoni
  0 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2015-06-10  9:13 UTC (permalink / raw
  To: Paul Durrant, xen-devel@lists.xenproject.org

On 09/06/2015 17:21, Paul Durrant wrote:
>> [...]
> Having serial is very useful in these cases. I've been debugging with a 32-bit Win 7 VM so it would be useful if you could bisect a bit. There are natural boundaries in the series you could try first:
>
> Apply everything up to and including 'x86/hvm: unify stdvga mmio intercept with standard mmio intercept'
> Then try everything up to and including 'x86/hvm: remove extraneous parameter from hvmtrace_io_assist()'
> Then, if you get that far, try the rest one at a time.
>
>    Paul

Thanks for the reply.
I tried to boot with full debug enabled and with serial over LAN, but the
server instantly reboots without showing an error or any useful output on
the redirected output or the monitor.
I used the same parameters as in other debugging I did a long time ago, if
I remember correctly; here is the grub2 entry:
> menuentry 'Wheezy con Linux 3.16.0-0.bpo.4-amd64 e XEN - RAID - Debug 
> su Seriale' --class debian --class gnu-linux --class gnu --class os {
>     set root='(RAID-ROOT)'
>     echo    'Caricamento Hypervisor Xen...'
>     multiboot    /boot/xen.gz placeholder dom0_mem=2G,max:3G 
> swiotlb=65762 loglvl=all guest_loglvl=all sync_console console_to_ring 
> com2=19200,8n1 console=com2
>     echo    'Caricamento Linux 3.16.0-0.bpo.4-amd64...'
>     linux    /boot/vmlinuz-3.16.0-0.bpo.4-amd64 placeholder 
> root=/dev/mapper/RAID-ROOT ro console=hvc0 earlyprintk=xen nomodeset
>     echo    'Caricamento ramdisk iniziale...'
>     initrd    /boot/initrd.img-3.16.0-0.bpo.4-amd64
> }
Here is the working entry, without debug:
>
> menuentry 'RAID - Debian 7.8 (wheezy) con Linux 3.16.0-0.bpo.4-amd64 e 
> XEN 4.6-unstable' --class debian --class gnu-linux --class gnu --class 
> os --class xen {
>     set fallback="1"
>     set root='(RAID-ROOT)'
>     echo    'Caricamento Hypervisor Xen 4.6-unstable...'
>     multiboot    /boot/xen.gz placeholder dom0_mem=2G,max:3G
>     echo    'Caricamento Linux 3.16.0-0.bpo.4-amd64 ...'
>     module    /boot/vmlinuz-3.16.0-0.bpo.4-amd64 placeholder 
> root=/dev/mapper/RAID-ROOT ro swiotlb=65762 quiet
>     echo    'Caricamento ramdisk iniziale...'
>     module    /boot/initrd.img-3.16.0-0.bpo.4-amd64
> }
Did I do something wrong?

Now I'll try to bisect the patch series following your advice.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
  2015-06-10  9:13     ` Fabio Fantoni
@ 2015-06-10 14:49       ` Fabio Fantoni
  2015-06-10 15:51         ` Paul Durrant
  0 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2015-06-10 14:49 UTC (permalink / raw
  To: Paul Durrant, xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Keir Fraser, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 5301 bytes --]

On 10/06/2015 11:13, Fabio Fantoni wrote:
> On 09/06/2015 17:21, Paul Durrant wrote:
>> [...]

I found the patch that causes the dom0 instant reboot:
  x86/hvm: remove multiple open coded 'chunking' loops in
hvmemul_read/write()

Also attached is the xl -vvv create output up to the crash, in case it is
useful; I did not find any useful information in the logs after the
reboot :(

If you need more information or tests, tell me and I'll post them.

> [...]

[-- Attachment #2: mmio-patches-problem.log --]
[-- Type: text/plain, Size: 13308 bytes --]

xl -vvv create /etc/xen/W7.cfg
Parsing config from /etc/xen/W7.cfg
libxl: debug: libxl_create.c:1586:do_domain_create: ao 0x95edc0: create: how=(nil) callback=(nil) poller=0x95ee20
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=unknown
libxl: debug: libxl_device.c:298:libxl__device_disk_set_backend: Disk vdev=hda, using backend phy
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=hdb spec.backend=unknown
libxl: debug: libxl_device.c:215:disk_try_backend: Disk vdev=hdb, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:298:libxl__device_disk_set_backend: Disk vdev=hdb, using backend qdisk
libxl: debug: libxl_create.c:961:initiate_domain_create: running bootloader
libxl: debug: libxl_bootloader.c:323:libxl__bootloader_run: not a PV domain, skipping bootloader
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x95f628: deregister unregistered
libxl: detail: libxl_dom.c:254:hvm_set_viridian_features: base group enabled
libxl: detail: libxl_dom.c:254:hvm_set_viridian_features: freq group enabled
libxl: detail: libxl_dom.c:254:hvm_set_viridian_features: time_ref_count group enabled
xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x26e358
xc: detail: elf_parse_binary: memory: 0x100000 -> 0x36e358
xc: detail: VIRTUAL MEMORY ARRANGEMENT:
xc: detail:   Loader:   0000000000100000->000000000036e358
xc: detail:   Modules:  0000000000000000->0000000000000000
xc: detail:   TOTAL:    0000000000000000->0000000078000000
xc: detail:   ENTRY:    0000000000100038
xc: detail: PHYSICAL MEMORY ALLOCATION:
xc: detail:   4KB PAGES: 0x0000000000000200
xc: detail:   2MB PAGES: 0x00000000000003bf
xc: detail:   1GB PAGES: 0x0000000000000000
xc: detail: elf_load_binary: phdr 0 at 0x7f45ec866000 -> 0x7f45ecacb2d4
domainbuilder: detail: xc_dom_gnttab_hvm_seed: called, pfn=0xff000
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=phy
libxl: debug: libxl_event.c:600:libxl__ev_xswatch_register: watch w=0x961b88 wpath=/local/domain/0/backend/vbd/1/768/state token=3/0: register slotnum=3
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=hdb spec.backend=qdisk
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=hdb spec.backend=qdisk
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x962d00: deregister unregistered
libxl: debug: libxl_create.c:1602:do_domain_create: ao 0x95edc0: inprogress: poller=0x95ee20, flags=i
libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0x961b88 wpath=/local/domain/0/backend/vbd/1/768/state token=3/0: event epath=/local/domain/0/backend/vbd/1/768/state
libxl: debug: libxl_event.c:841:devstate_watch_callback: backend /local/domain/0/backend/vbd/1/768/state wanted state 2 ok
libxl: debug: libxl_event.c:638:libxl__ev_xswatch_deregister: watch w=0x961b88 wpath=/local/domain/0/backend/vbd/1/768/state token=3/0: deregister slotnum=3
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x961b88: deregister unregistered
libxl: debug: libxl_device.c:1030:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_aoutils.c:566:libxl__async_exec_start: forking to execute: /etc/xen/scripts/block add 
libxl: debug: libxl_event.c:506:watchfd_callback: watch epath=/local/domain/0/backend/vbd/1/768/state token=3/0: empty slot
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x961c10: deregister unregistered
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x961c10: deregister unregistered
libxl: debug: libxl_dm.c:1502:libxl__spawn_local_dm: Spawning device-model /usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   /usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -xen-domid
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   1
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -no-shutdown
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   chardev=libxenstat-cmd,mode=control
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -nodefaults
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   W7
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -vnc
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   none
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -display
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   none
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -k
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   it
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -spice
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   port=6002,tls-port=0,addr=0.0.0.0,disable-ticketing,agent-mouse=on,disable-copy-paste
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   virtio-serial
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   spicevmc,id=vdagent,name=vdagent
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   virtserialport,chardev=vdagent,name=com.redhat.spice.0
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   qxl-vga,vram_size_mb=64,ram_size_mb=64
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -boot
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   order=dc
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ich9-usb-ehci1,id=usb,addr=0x1d.0x7,multifunction=on
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ich9-usb-uhci1,masterbus=usb.0,firstport=0,addr=0x1d.0,multifunction=on
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ich9-usb-uhci2,masterbus=usb.0,firstport=2,addr=0x1d.0x1,multifunction=on
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ich9-usb-uhci3,masterbus=usb.0,firstport=4,addr=0x1d.0x2,multifunction=on
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   spicevmc,name=usbredir,id=usbrc1
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   usb-redir,chardev=usbrc1,id=usbrc1
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   spicevmc,name=usbredir,id=usbrc2
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   usb-redir,chardev=usbrc2,id=usbrc2
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   spicevmc,name=usbredir,id=usbrc3
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   usb-redir,chardev=usbrc3,id=usbrc3
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   spicevmc,name=usbredir,id=usbrc4
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   usb-redir,chardev=usbrc4,id=usbrc4
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -soundhw
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   hda
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -smp
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   2,maxcpus=2
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   rtl8139,id=nic0,netdev=net0,mac=00:16:3e:42:ae:8f
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -netdev
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   type=tap,id=net0,ifname=vif1.0-emu,script=no,downscript=no
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -machine
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   xenfv
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   1920
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ahci,id=ahci0
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -drive
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   file=/mnt/vm/disks/W7.disk1.xm,if=none,id=ahcidisk-0,format=raw,cache=writeback
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ide-hd,bus=ahci0.0,unit=0,drive=ahcidisk-0
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -drive
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   if=none,id=ide-832,cache=writeback
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   -device
libxl: debug: libxl_dm.c:1504:libxl__spawn_local_dm:   ide-cd,drive=ide-832
libxl: debug: libxl_event.c:600:libxl__ev_xswatch_register: watch w=0x95f8a8 wpath=/local/domain/0/device-model/1/state token=3/1: register slotnum=3
libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0x95f8a8 wpath=/local/domain/0/device-model/1/state token=3/1: event epath=/local/domain/0/device-model/1/state
libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0x95f8a8 wpath=/local/domain/0/device-model/1/state token=3/1: event epath=/local/domain/0/device-model/1/state
libxl: debug: libxl_event.c:638:libxl__ev_xswatch_deregister: watch w=0x95f8a8 wpath=/local/domain/0/device-model/1/state token=3/1: deregister slotnum=3
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x95f8a8: deregister unregistered
libxl: debug: libxl_qmp.c:705:libxl__qmp_initialize: connected to /var/run/xen/qmp-libxl-1
libxl: debug: libxl_qmp.c:296:qmp_handle_response: message type: qmp
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "qmp_capabilities",
    "id": 1
}
'
libxl: debug: libxl_qmp.c:296:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "query-chardev",
    "id": 2
}
'
libxl: debug: libxl_qmp.c:296:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: '{
    "execute": "query-vnc",
    "id": 3
}
'
libxl: debug: libxl_qmp.c:296:qmp_handle_response: message type: return
libxl: debug: libxl_event.c:600:libxl__ev_xswatch_register: watch w=0x967048 wpath=/local/domain/0/backend/vif/1/0/state token=3/2: register slotnum=3
libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0x967048 wpath=/local/domain/0/backend/vif/1/0/state token=3/2: event epath=/local/domain/0/backend/vif/1/0/state
libxl: debug: libxl_event.c:845:devstate_watch_callback: backend /local/domain/0/backend/vif/1/0/state wanted state 2 still waiting state 1
libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0x967048 wpath=/local/domain/0/backend/vif/1/0/state token=3/2: event epath=/local/domain/0/backend/vif/1/0/state
libxl: debug: libxl_event.c:841:devstate_watch_callback: backend /local/domain/0/backend/vif/1/0/state wanted state 2 ok
libxl: debug: libxl_event.c:638:libxl__ev_xswatch_deregister: watch w=0x967048 wpath=/local/domain/0/backend/vif/1/0/state token=3/2: deregister slotnum=3
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x967048: deregister unregistered
libxl: debug: libxl_device.c:1030:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-bridge online
libxl: debug: libxl_aoutils.c:566:libxl__async_exec_start: forking to execute: /etc/xen/scripts/vif-bridge online 
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x9670d0: deregister unregistered
libxl: debug: libxl_device.c:1030:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-bridge add
libxl: debug: libxl_aoutils.c:566:libxl__async_exec_start: forking to execute: /etc/xen/scripts/vif-bridge add 
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x9670d0: deregister unregistered
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x9670d0: deregister unregistered
libxl: debug: libxl_event.c:1944:libxl__ao_progress_report: ao 0x95edc0: progress report: ignored
libxl: debug: libxl_event.c:1768:libxl__ao_complete: ao 0x95edc0: complete, rc=0
libxl: debug: libxl_event.c:1740:libxl__ao__destroy: ao 0x95edc0: destroy

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix
  2015-06-10 14:49       ` Fabio Fantoni
@ 2015-06-10 15:51         ` Paul Durrant
  0 siblings, 0 replies; 23+ messages in thread
From: Paul Durrant @ 2015-06-10 15:51 UTC (permalink / raw
  To: Fabio Fantoni, xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Keir (Xen.org), Jan Beulich

> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@m2r.biz]
> Sent: 10 June 2015 15:50
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Andrew Cooper; Keir (Xen.org); Jan Beulich
> Subject: Re: [Xen-devel] [PATCH 00/17] x86/hvm: I/O emulation cleanup and
> fix
> 
> On 10/06/2015 11:13, Fabio Fantoni wrote:
> [...]
> 
> I found the patch that causes the dom0 instant reboot:
>   x86/hvm: remove multiple open coded 'chunking' loops in
> hvmemul_read/write()
> 

I just ran into that too. There's a bogus assertion in there. I thought I was running with assertions on but clearly I wasn't.

I also found another couple of issues after backporting the patches onto a recent XenServer. I'll post a v2 of the series once I've incorporated all the fixes.

  Paul
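
[Editor's note: the thread does not identify the assertion. A plausible
candidate (an assumption, not confirmed above) is the page-alignment check
that patch 16 adds to hvmemul_linear_mmio_access(), visible as context in
patch 17's diff. It runs before the loop's termination test, so it also
fires on the final chunk of any access that does not end exactly on a page
boundary:

    mmio_addr += chunk;
    ASSERT((mmio_addr & (PAGE_SIZE - 1)) == 0); /* trips for an access
                                                   ending mid-page */
]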


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-06-10 15:52 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-08 14:33 [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Paul Durrant
2015-06-08 14:33 ` [PATCH 01/17] x86/hvm: simplify hvmemul_do_io() Paul Durrant
2015-06-08 14:33 ` [PATCH 02/17] x86/hvm: re-name struct hvm_mmio_handler to hvm_mmio_ops Paul Durrant
2015-06-08 14:33 ` [PATCH 03/17] x86/hvm: unify internal portio and mmio intercepts Paul Durrant
2015-06-08 14:33 ` [PATCH 04/17] x86/hvm: unify dpci portio intercept with standard portio intercept Paul Durrant
2015-06-08 14:33 ` [PATCH 05/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept Paul Durrant
2015-06-08 14:33 ` [PATCH 06/17] x86/hvm: revert 82ed8716b "fix direct PCI port I/O emulation retry Paul Durrant
2015-06-08 14:33 ` [PATCH 07/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io() Paul Durrant
2015-06-08 14:33 ` [PATCH 08/17] x86/hvm: split I/O completion handling from state model Paul Durrant
2015-06-08 14:33 ` [PATCH 09/17] x86/hvm: remove hvm_io_pending() check in hvmemul_do_io() Paul Durrant
2015-06-08 14:33 ` [PATCH 10/17] x86/hvm: remove HVMIO_dispatched I/O state Paul Durrant
2015-06-08 14:33 ` [PATCH 11/17] x86/hvm: remove hvm_io_state enumeration Paul Durrant
2015-06-08 14:33 ` [PATCH 12/17] x86/hvm: use ioreq_t to track in-flight state Paul Durrant
2015-06-08 14:33 ` [PATCH 13/17] x86/hvm: only acquire RAM pages for emulation when we need to Paul Durrant
2015-06-08 14:33 ` [PATCH 14/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist() Paul Durrant
2015-06-08 14:33 ` [PATCH 15/17] x86/hvm: make sure translated MMIO reads or writes fall within a page Paul Durrant
2015-06-08 14:33 ` [PATCH 16/17] x86/hvm: remove multiple open coded 'chunking' loops Paul Durrant
2015-06-08 14:33 ` [PATCH 17/17] x86/hvm: track large memory mapped accesses by linear address Paul Durrant
2015-06-09 14:43 ` [PATCH 00/17] x86/hvm: I/O emulation cleanup and fix Fabio Fantoni
2015-06-09 15:21   ` Paul Durrant
2015-06-10  9:13     ` Fabio Fantoni
2015-06-10 14:49       ` Fabio Fantoni
2015-06-10 15:51         ` Paul Durrant

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).