From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [RFC PATCH qemu v2 1/5] vfio: Switch from TARGET_PAGE_MASK to qemu_real_host_page_mask
Date: Tue, 14 Jul 2015 16:58:14 +1000
Message-ID: <55A4B306.6040402@ozlabs.ru>
In-Reply-To: <1436814815.1391.388.camel@redhat.com>

On 07/14/2015 05:13 AM, Alex Williamson wrote:
> On Tue, 2015-07-14 at 00:56 +1000, Alexey Kardashevskiy wrote:
>> These commits started the switch from TARGET_PAGE_MASK (hardcoded as 4K)
>> to the real host page size:
>> 4e51361d7 "cpu-all: complete "real" host page size API" and
>> f7ceed190 "vfio: cpu: Use "real" page size API"
>>
>> This patch finishes the transition by:
>> - %s/TARGET_PAGE_MASK/qemu_real_host_page_mask/
>> - %s/TARGET_PAGE_ALIGN/REAL_HOST_PAGE_ALIGN/
>> - removing the bitfield widths for offsets in VFIOQuirk::data, as
>> qemu_real_host_page_mask is a runtime variable, not a macro, and so
>> cannot be used as a bitfield width
>
> Can we assume that none of the changes to quirks have actually been
> tested?

No, why? :) I tried it on one of the NVIDIA cards I have here -
VGA compatible controller: NVIDIA Corporation GM107GL [Quadro K2200] (rev a2)
The driver was NVIDIA's (not nouveau) and the test was "acos" (a basic
CUDA test).

> I don't really support them being bundled in here since they
> really aren't related to what you're doing.

That makes sense; I'll move them to a separate patch and add a note about
how it helps on a 64K-pages host.

> For DMA we generally want
> to be host IOMMU page aligned,

Do all known IOMMUs use a constant page size? An IOMMU memory region does
not carry an IOMMU page size/mask; I wanted to add one there, but I am not
sure it is generic enough.
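
Something like the rough sketch below is what I had in mind. To be clear,
get_page_mask and vfio_iommu_page_mask() are made-up names for
illustration, not an existing QEMU API:

/* Hypothetical extension of MemoryRegionIOMMUOps so an IOMMU region can
 * report its page mask instead of callers assuming TARGET_PAGE_MASK: */
struct MemoryRegionIOMMUOps {
    IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr,
                               bool is_write);
    uint64_t (*get_page_mask)(MemoryRegion *iommu);  /* new, hypothetical */
};

static hwaddr vfio_iommu_page_mask(MemoryRegion *mr)
{
    /* Ask the IOMMU region for its page mask if it provides one ... */
    if (memory_region_is_iommu(mr) && mr->iommu_ops->get_page_mask) {
        return mr->iommu_ops->get_page_mask(mr);
    }
    /* ... otherwise fall back to the conservative 4K minimum. */
    return TARGET_PAGE_MASK;
}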

> which we can generally assume is the same
> as host page aligned,

They are almost never the same on sPAPR for 32-bit windows...

> but quirks are simply I/O regions, so I think they
> ought to continue to be target page aligned.

Without s/TARGET_PAGE_MASK/qemu_real_host_page_mask/, the
&vfio_nvidia_88000_quirk region makes kvm_set_phys_mem() fail and exit,
as the section size is 0x88000, which is not 64K aligned. It still works
with x-mmap=false (or with TCG, I suppose) though.
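
For illustration, a stand-alone sketch of the arithmetic that trips
there (the values are from this case, but the snippet is mine, not
QEMU's actual memslot code):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint64_t size = 0x88000;        /* the quirk section size */
    uint64_t host_page = 0x10000;   /* 64K host page, e.g. ppc64 */

    /* A KVM memory slot must be host-page aligned; 0x88000 has the low
     * 0x8000 bits set, so on a 64K host the slot is rejected. */
    if (size & (host_page - 1)) {
        printf("0x%" PRIx64 " is not 0x%" PRIx64 "-aligned\n",
               size, host_page);
    }
    return 0;
}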


>> This keeps using TARGET_PAGE_MASK for IOMMU regions though, as it is
>> the minimum page size which IOMMU regions may be using, and at the
>> moment memory regions do not carry their actual page size.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>
>> In reality DMA windows are always much bigger than a single 4K page
>> and aligned to 32/64MB; maybe only use qemu_real_host_page_mask here?
>
> I don't understand what this is asking either.  While the bulk of memory
> is going to be mapped in larger chunks, we do occasionally see 4k
> mappings on x86, particularly in some of the legacy low memory areas.


The question was not about individual mappings - those are handled by an
IOMMU memory region notifier; here we are dealing with DMA windows, which
are always megabytes in size, but nothing really prevents a guest from
requesting a 4K _window_ (with a single TCE entry). Whether we want to
support such small windows was the question.
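
To put rough numbers on it (illustrative values only; the 1GB window is
just an example, not a fixed sPAPR size):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint64_t tce_page = 0x1000;            /* 4K IOMMU page */
    uint64_t typical_window = 1ULL << 30;  /* e.g. a 1GB 32-bit window */
    uint64_t tiny_window = tce_page;       /* the pathological 4K window */

    /* A typical window needs a sizeable TCE table ... */
    printf("typical: %" PRIu64 " TCE entries\n", typical_window / tce_page);
    /* ... while the smallest possible window has a single entry. */
    printf("tiny:    %" PRIu64 " TCE entry\n", tiny_window / tce_page);
    return 0;
}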


>
>> ---
>>   hw/vfio/common.c | 26 ++++++++++++++++----------
>>   hw/vfio/pci.c    | 30 +++++++++++++++---------------
>>   2 files changed, 31 insertions(+), 25 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 85ee9b0..d115ec9 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -321,6 +321,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>       Int128 llend;
>>       void *vaddr;
>>       int ret;
>> +    const bool is_iommu = memory_region_is_iommu(section->mr);
>> +    const hwaddr page_mask =
>> +        is_iommu ? TARGET_PAGE_MASK : qemu_real_host_page_mask;
>>
>>       if (vfio_listener_skipped_section(section)) {
>>           trace_vfio_listener_region_add_skip(
>> @@ -330,16 +333,16 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>           return;
>>       }
>>
>> -    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>> -                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
>> +    if (unlikely((section->offset_within_address_space & ~page_mask) !=
>> +                 (section->offset_within_region & ~page_mask))) {
>>           error_report("%s received unaligned region", __func__);
>>           return;
>>       }
>>
>> -    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>> +    iova = ROUND_UP(section->offset_within_address_space, ~page_mask + 1);
>>       llend = int128_make64(section->offset_within_address_space);
>>       llend = int128_add(llend, section->size);
>> -    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
>> +    llend = int128_and(llend, int128_exts64(page_mask));
>>
>>       if (int128_ge(int128_make64(iova), llend)) {
>>           return;
>> @@ -347,7 +350,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>
>>       memory_region_ref(section->mr);
>>
>> -    if (memory_region_is_iommu(section->mr)) {
>> +    if (is_iommu) {
>>           VFIOGuestIOMMU *giommu;
>>
>>           trace_vfio_listener_region_add_iommu(iova,
>> @@ -423,6 +426,9 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>                                               iommu_data.type1.listener);
>>       hwaddr iova, end;
>>       int ret;
>> +    const bool is_iommu = memory_region_is_iommu(section->mr);
>> +    const hwaddr page_mask =
>> +        is_iommu ? TARGET_PAGE_MASK : qemu_real_host_page_mask;
>>
>>       if (vfio_listener_skipped_section(section)) {
>>           trace_vfio_listener_region_del_skip(
>> @@ -432,13 +438,13 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>           return;
>>       }
>>
>> -    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
>> -                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
>> +    if (unlikely((section->offset_within_address_space & ~page_mask) !=
>> +                 (section->offset_within_region & ~page_mask))) {
>>           error_report("%s received unaligned region", __func__);
>>           return;
>>       }
>>
>> -    if (memory_region_is_iommu(section->mr)) {
>> +    if (is_iommu) {
>>           VFIOGuestIOMMU *giommu;
>>
>>           QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
>> @@ -459,9 +465,9 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>            */
>>       }
>>
>> -    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>> +    iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
>>       end = (section->offset_within_address_space + int128_get64(section->size)) &
>> -          TARGET_PAGE_MASK;
>> +          page_mask;
>>
>>       if (iova >= end) {
>>           return;
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 2ed877f..7694afe 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -50,16 +50,16 @@ typedef struct VFIOQuirk {
>>       struct VFIOPCIDevice *vdev;
>>       QLIST_ENTRY(VFIOQuirk) next;
>>       struct {
>> -        uint32_t base_offset:TARGET_PAGE_BITS;
>> -        uint32_t address_offset:TARGET_PAGE_BITS;
>> +        uint32_t base_offset;
>> +        uint32_t address_offset;
>>           uint32_t address_size:3;
>>           uint32_t bar:3;
>>
>>           uint32_t address_match;
>>           uint32_t address_mask;
>>
>> -        uint32_t address_val:TARGET_PAGE_BITS;
>> -        uint32_t data_offset:TARGET_PAGE_BITS;
>> +        uint32_t address_val;
>> +        uint32_t data_offset;
>>           uint32_t data_size:3;
>>
>>           uint8_t flags;
>> @@ -1319,8 +1319,8 @@ static uint64_t vfio_generic_quirk_read(void *opaque,
>>   {
>>       VFIOQuirk *quirk = opaque;
>>       VFIOPCIDevice *vdev = quirk->vdev;
>> -    hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
>> -    hwaddr offset = quirk->data.address_match & ~TARGET_PAGE_MASK;
>> +    hwaddr base = quirk->data.address_match & qemu_real_host_page_mask;
>> +    hwaddr offset = quirk->data.address_match & ~qemu_real_host_page_mask;
>>       uint64_t data;
>>
>>       if (vfio_flags_enabled(quirk->data.flags, quirk->data.read_flags) &&
>> @@ -1349,8 +1349,8 @@ static void vfio_generic_quirk_write(void *opaque, hwaddr addr,
>>   {
>>       VFIOQuirk *quirk = opaque;
>>       VFIOPCIDevice *vdev = quirk->vdev;
>> -    hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
>> -    hwaddr offset = quirk->data.address_match & ~TARGET_PAGE_MASK;
>> +    hwaddr base = quirk->data.address_match & qemu_real_host_page_mask;
>> +    hwaddr offset = quirk->data.address_match & ~qemu_real_host_page_mask;
>>
>>       if (vfio_flags_enabled(quirk->data.flags, quirk->data.write_flags) &&
>>           ranges_overlap(addr, size, offset, quirk->data.address_mask + 1)) {
>> @@ -1650,9 +1650,9 @@ static void vfio_probe_ati_bar2_4000_quirk(VFIOPCIDevice *vdev, int nr)
>>
>>       memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_generic_quirk, quirk,
>>                             "vfio-ati-bar2-4000-quirk",
>> -                          TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
>> +                          REAL_HOST_PAGE_ALIGN(quirk->data.address_mask + 1));
>>       memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
>> -                          quirk->data.address_match & TARGET_PAGE_MASK,
>> +                          quirk->data.address_match & qemu_real_host_page_mask,
>>                             &quirk->mem, 1);
>>
>>       QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
>> @@ -1888,7 +1888,7 @@ static void vfio_nvidia_88000_quirk_write(void *opaque, hwaddr addr,
>>       VFIOQuirk *quirk = opaque;
>>       VFIOPCIDevice *vdev = quirk->vdev;
>>       PCIDevice *pdev = &vdev->pdev;
>> -    hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
>> +    hwaddr base = quirk->data.address_match & qemu_real_host_page_mask;
>>
>>       vfio_generic_quirk_write(opaque, addr, data, size);
>>
>> @@ -1943,9 +1943,9 @@ static void vfio_probe_nvidia_bar0_88000_quirk(VFIOPCIDevice *vdev, int nr)
>>
>>       memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_nvidia_88000_quirk,
>>                             quirk, "vfio-nvidia-bar0-88000-quirk",
>> -                          TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
>> +                          REAL_HOST_PAGE_ALIGN(quirk->data.address_mask + 1));
>>       memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
>> -                          quirk->data.address_match & TARGET_PAGE_MASK,
>> +                          quirk->data.address_match & qemu_real_host_page_mask,
>>                             &quirk->mem, 1);
>>
>>       QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
>> @@ -1980,9 +1980,9 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIOPCIDevice *vdev, int nr)
>>
>>       memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_generic_quirk, quirk,
>>                             "vfio-nvidia-bar0-1800-quirk",
>> -                          TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
>> +                          REAL_HOST_PAGE_ALIGN(quirk->data.address_mask + 1));
>>       memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
>> -                          quirk->data.address_match & TARGET_PAGE_MASK,
>> +                          quirk->data.address_match & qemu_real_host_page_mask,
>>                             &quirk->mem, 1);
>>
>>       QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);


-- 
Alexey
