From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges
Date: Tue, 14 Jul 2015 10:27:35 +0100
Message-ID: <55A4F2270200007800090834@mail.emea.novell.com>
References: <1436420047-25356-1-git-send-email-tiejun.chen@intel.com> <1436420047-25356-7-git-send-email-tiejun.chen@intel.com> <55A3D5600200007800090330@mail.emea.novell.com> <55A4AE88.2000200@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <55A4AE88.2000200@intel.com>
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Tiejun Chen
Cc: Wei Liu , Ian Campbell , Stefano Stabellini , Andrew Cooper , Ian Jackson , xen-devel@lists.xen.org, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

>>> On 14.07.15 at 08:39, wrote:
>>> - } *resource, mem_resource, high_mem_resource, io_resource;
>>> + } *resource, mem_resource, high_mem_resource, io_resource,
>>> +     exp_mem_resource;
>>
>> Despite having gone through the description and the rest of the patch
>> I can't guess what "exp_mem" stands for. Meaningful variable names
>> are quite helpful, often avoiding the need for comments.
>
> exp_mem_resource is the expanded mem_resource in the case of
> populating RAM.
>
> Maybe I should use the whole word, expand_mem_resource.

And what does "expand" mean here, then?

>>> @@ -309,29 +339,31 @@ void pci_setup(void)
>>>      }
>>>
>>>      /* Relocate RAM that overlaps PCI space (in 64k-page chunks). */
>>> +    cur_pci_mem_start = pci_mem_start;
>>>      while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
>>> +        relocate_ram_for_pci_memory(cur_pci_mem_start);
>>
>> Please be consistent about which variable you want to use in the loop
>> (pci_mem_start vs cur_pci_mem_start).
> Overall I just call relocate_ram_for_pci_memory() twice, and each time
> I pass cur_pci_mem_start. Is any place inconsistent?

In the quoted code you use pci_mem_start in the while() condition and
cur_pci_mem_start in that same while()'s body.

>> Also, this being the first substantial change to the function makes
>> clear that you _still_ leave the sizing loop untouched, and instead
>> make the allocation logic below more complicated. I said before a
>
> But this may be more reasonable than what was done before. From my
> point of view we always need to allocate the 32-bit MMIO first and
> then the 64-bit MMIO, since, as you said, we don't want to expand high
> memory if possible.
>
>> number of times that I don't think this helps the maintainability of
>> this already convoluted code. Among other things this manifests
>> itself in your second call to relocate_ram_for_pci_memory() in no way
>> playing by the constraints explained a few lines up from here in an
>> extensive comment.
>
> Don't the variables/comments express what I intend to do here, except
> for that exp_mem_resource?

I'm not talking about a lack of comments, but about your added use of
the function not being in line with what is explained in an earlier
(pre-existing) comment.

>> Therefore I'll not make any further comments on the rest of the
>> patch, but instead outline an allocation model that I think would
>> fit our needs: Subject to the constraints mentioned above, set up
>> a bitmap (maximum size 64k [2Gb = 2^19 pages needing 2^19
>> bits], i.e. a reasonably small memory block). Each bit represents a
>> page usable for MMIO: First of all remove the range from
>> PCI_MEM_END upwards. Then remove all RDM pages. Now do a
>> first pass over all devices, allocating (in the bitmap) space for only
>> the 32-bit MMIO BARs, starting with the biggest one(s), by finding
>> a best fit (i.e. preferably a range not usable by any bigger BAR)
>> from top down.
>> For example, if you have available
>>
>> [f0000000,f8000000)
>> [f9000000,f9001000)
>> [fa000000,fa003000)
>> [fa010000,fa012000)
>>
>> and you're looking for a single page slot, you should end up
>> picking fa002000.
>
> Why list [f9000000,f9001000)? There is just one page in this slot.

This was just a simple example I gave. Or maybe I don't understand your
question...

>> After this pass you should be able to do RAM relocation in a
>> single attempt, just like we do today (you may still grow the MMIO
>> window if you know you need to and can fit some of the 64-bit
>> BARs in there, subject to said constraints; this is in an attempt
>> to help OSes not comfortable with 64-bit resources).
>>
>> In a 2nd pass you'd then assign 64-bit resources: If you can fit
>> them below 4G (you still have the bitmap left of what you've got
>> available), put them there. Allocation strategy could be the same
>
> I think basically your logic is similar to what I did, as described in
> the changelog.

The goal is the same, but the approaches look quite different to me. In
particular my approach avoids calculating mmio_total up front, then
basing RAM relocation on it, only to find subsequently that more RAM
may need to be relocated.

> I think the bitmap mechanism is a good idea, but honestly it's not
> easy to cover all the requirements here. Just like bootmem on the
> Linux side, it's a little complicated to implement this entirely. So
> I'd prefer not to introduce this approach at the current stage.

I'm afraid it's going to be hard to convince me of any approach that
further complicates the current mechanism instead of overhauling it.

Jan
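
[Editor's illustration] The bitmap allocator outlined in this thread can be
sketched roughly as below. This is a hypothetical standalone sketch, not
hvmloader code: the names (mmio_bitmap, mark_usable, alloc_pages) are
invented, BAR alignment and the "preferably a range not usable by any
bigger BAR" preference are not modeled, and a plain smallest-run best fit
is used with top-down placement inside the chosen run (so for a single
page it would pick the exact-fit f9000000 slot rather than fa002000 as in
the example above).

```c
#include <stdint.h>

/* One bit per 4k page in the (at most) 2GB MMIO hole:
 * 2GB / 4k = 2^19 pages -> 2^19 bits -> a 64k bitmap, as stated above. */
#define NR_PAGES     (1u << 19)
#define BITMAP_BYTES (NR_PAGES / 8)

static uint8_t mmio_bitmap[BITMAP_BYTES];  /* bit set = page usable for MMIO */

static int  test_page(uint32_t pg)  { return mmio_bitmap[pg >> 3] &   (1 << (pg & 7)); }
static void set_page(uint32_t pg)   {        mmio_bitmap[pg >> 3] |=   1 << (pg & 7);  }
static void clear_page(uint32_t pg) {        mmio_bitmap[pg >> 3] &= ~(1 << (pg & 7)); }

/* Mark pages [start, end) as usable for MMIO (i.e. not RAM, not RDM). */
static void mark_usable(uint32_t start, uint32_t end)
{
    while ( start < end )
        set_page(start++);
}

/* Best fit: among all maximal free runs of at least 'want' pages, pick
 * the smallest such run, then allocate from the top of that run down.
 * Returns the first allocated page number, or -1 if nothing fits. */
static int32_t alloc_pages(uint32_t want)
{
    uint32_t best_start = 0, best_len = ~0u, pg = 0;
    int found = 0;

    while ( pg < NR_PAGES )
    {
        if ( !test_page(pg) ) { pg++; continue; }

        uint32_t run = pg;                    /* start of a free run */
        while ( pg < NR_PAGES && test_page(pg) )
            pg++;
        uint32_t len = pg - run;

        if ( len >= want && len < best_len )  /* smallest run that fits */
        {
            best_start = run;
            best_len = len;
            found = 1;
        }
    }

    if ( !found )
        return -1;

    uint32_t start = best_start + best_len - want;  /* top of the run */
    for ( uint32_t i = 0; i < want; i++ )
        clear_page(start + i);
    return (int32_t)start;
}
```

With page numbers taken relative to f0000000, the four example ranges
above become runs of 0x8000, 1, 3 and 2 pages; a single-page request then
lands in the exact-fit one-page run, and a second single-page request
takes the top page of the two-page run.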