From: "Jan Beulich"
To: Tiejun Chen
Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper, Ian Jackson, xen-devel@lists.xen.org, Keir Fraser
Subject: Re: [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges
Date: Mon, 13 Jul 2015 14:12:32 +0100

>>> On 09.07.15 at 07:33, Tiejun Chen wrote:
> @@ -50,17 +75,22 @@ void pci_setup(void)
>      /* Resources assignable to PCI devices via BARs. */
>      struct resource {
>          uint64_t base, max;
> -    } *resource, mem_resource, high_mem_resource, io_resource;
> +    } *resource, mem_resource, high_mem_resource, io_resource, exp_mem_resource;

Despite having gone through the description and the rest of the patch,
I can't guess what "exp_mem" stands for. Meaningful variable names are
quite helpful, often avoiding the need for comments.

>      /* Create a list of device BARs in descending order of size. */
>      struct bars {
> -        uint32_t is_64bar;
> +#define PCI_BAR_IS_64BIT 0x1
> +#define PCI_BAR_IS_ALLOCATED 0x2
> +        uint32_t flag;

flags (you already have two)

>          uint32_t devfn;
>          uint32_t bar_reg;
>          uint64_t bar_sz;
>      } *bars = (struct bars *)scratch_start;
> -    unsigned int i, nr_bars = 0;
> -    uint64_t mmio_hole_size = 0;
> +    unsigned int i, j, n, nr_bars = 0;
> +    uint64_t mmio_hole_size = 0, reserved_start, reserved_end, reserved_size;
> +    bool bar32_allocating = 0;
> +    uint64_t mmio32_unallocated_total = 0;
> +    unsigned long cur_pci_mem_start = 0;
>
>      const char *s;
>      /*
> @@ -222,7 +252,7 @@ void pci_setup(void)
>          if ( i != nr_bars )
>              memmove(&bars[i+1], &bars[i], (nr_bars-i) * sizeof(*bars));
>
> -        bars[i].is_64bar = is_64bar;
> +        bars[i].flag = is_64bar ? PCI_BAR_IS_64BIT : 0;
>          bars[i].devfn = devfn;
>          bars[i].bar_reg = bar_reg;
>          bars[i].bar_sz = bar_sz;
> @@ -309,29 +339,31 @@ void pci_setup(void)
>      }
>
>      /* Relocate RAM that overlaps PCI space (in 64k-page chunks). */
> +    cur_pci_mem_start = pci_mem_start;
>      while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
> +        relocate_ram_for_pci_memory(cur_pci_mem_start);

Please be consistent about which variable you want to use in the loop
(pci_mem_start vs cur_pci_mem_start).

Also, this being the first substantial change to the function makes
clear that you _still_ leave the sizing loop untouched, and instead
make the allocation logic below more complicated. I have said a number
of times before that I don't think this helps the maintainability of
this already convoluted code. Among other things, this manifests itself
in your second call to relocate_ram_for_pci_memory(), which in no way
plays by the constraints explained in the extensive comment a few lines
up from here.

Therefore I'll not make any further comments on the rest of the patch,
but instead outline an allocation model that I think would fit our
needs.

Subject to the constraints mentioned above, set up a bitmap (maximum
size 64k [2Gb = 2^19 pages needing 2^19 bits], i.e. a reasonably small
memory block), in which each bit represents a page usable for MMIO.
First of all, remove the range from PCI_MEM_END upwards. Then remove
all RDM pages.

Now do a first pass over all devices, allocating (in the bitmap) space
for only the 32-bit MMIO BARs, starting with the biggest one(s), by
finding a best fit (i.e. preferably a range not usable by any bigger
BAR) from top down. For example, if you have available

[f0000000,f8000000) [f9000000,f9001000)
[fa000000,fa003000) [fa010000,fa012000)

and you're looking for a single page slot, you should end up picking
fa002000: it is the highest free page which no bigger (size-aligned)
BAR could make use of, whereas e.g. fa011000 would still be usable as
part of a two-page BAR's slot.
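To make this concrete, here is a rough, self-contained sketch of the
bitmap setup (hypothetical throughout: struct rdm_range is made up, and
names like PCI_MEM_END and PAGE_SHIFT merely mimic hvmloader's
constants rather than coming from its headers):

#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT      12
#define MMIO_HOLE_BASE  0x80000000ULL   /* hole spans at most [2Gb,4Gb) */
#define MMIO_HOLE_PAGES (0x80000000ULL >> PAGE_SHIFT)   /* 2^19 pages */
#define PCI_MEM_END     0xfc000000ULL

static uint32_t mmio_bitmap[MMIO_HOLE_PAGES / 32];  /* 64k worth of bits */

struct rdm_range { uint64_t base, size; };  /* assumed RDM representation */

/* Mark [base, end) usable (1) or unusable (0), clipped to the window. */
static void mark_pages(uint64_t base, uint64_t end, int usable)
{
    unsigned long pg, last;

    if ( base < MMIO_HOLE_BASE )
        base = MMIO_HOLE_BASE;
    if ( end > (1ULL << 32) )
        end = 1ULL << 32;
    if ( base >= end )
        return;

    pg = (base - MMIO_HOLE_BASE) >> PAGE_SHIFT;
    last = (end - MMIO_HOLE_BASE) >> PAGE_SHIFT;
    for ( ; pg < last; ++pg )
        if ( usable )
            mmio_bitmap[pg / 32] |= 1U << (pg % 32);
        else
            mmio_bitmap[pg / 32] &= ~(1U << (pg % 32));
}

static void build_mmio_bitmap(const struct rdm_range *rdm,
                              unsigned int nr_rdm)
{
    unsigned int i;

    /* Start with every page in the window usable, ... */
    memset(mmio_bitmap, 0xff, sizeof(mmio_bitmap));
    /* ... remove the range from PCI_MEM_END upwards, ... */
    mark_pages(PCI_MEM_END, 1ULL << 32, 0);
    /* ... then remove all RDM pages. */
    for ( i = 0; i < nr_rdm; ++i )
        mark_pages(rdm[i].base, rdm[i].base + rdm[i].size, 0);
}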
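And a similarly hypothetical sketch of the top-down search, where
"preferably a range not usable by any bigger BAR" is approximated by
checking whether a size-aligned BAR of twice the size could still use
the slot (building on the helpers above; this reproduces the fa002000
pick in the example, but is just one way to realize the preference):

static int page_free(unsigned long pg)
{
    return (mmio_bitmap[pg / 32] >> (pg % 32)) & 1;
}

static int run_free(unsigned long pg, unsigned long count)
{
    while ( count-- )
        if ( !page_free(pg++) )
            return 0;
    return 1;
}

/* nr_pages must be a power of two; returns 0 if nothing fits below 4Gb. */
static uint64_t alloc_mmio32(unsigned long nr_pages)
{
    unsigned long pg, fallback = MMIO_HOLE_PAGES;

    /* Scan size-aligned slots from the top of the window downwards. */
    for ( pg = MMIO_HOLE_PAGES - nr_pages; pg < MMIO_HOLE_PAGES;
          pg -= nr_pages )
    {
        if ( !run_free(pg, nr_pages) )
            continue;
        /* Prefer a slot which a BAR of twice the size couldn't use. */
        if ( 2 * nr_pages > MMIO_HOLE_PAGES ||
             !run_free(pg & ~(2 * nr_pages - 1), 2 * nr_pages) )
            break;
        if ( fallback == MMIO_HOLE_PAGES )
            fallback = pg;  /* highest plain fit, kept in reserve */
    }

    if ( pg >= MMIO_HOLE_PAGES )  /* wrapped: no preferred slot found */
    {
        if ( fallback == MMIO_HOLE_PAGES )
            return 0;
        pg = fallback;
    }

    mark_pages(MMIO_HOLE_BASE + ((uint64_t)pg << PAGE_SHIFT),
               MMIO_HOLE_BASE + ((uint64_t)(pg + nr_pages) << PAGE_SHIFT),
               0);
    return MMIO_HOLE_BASE + ((uint64_t)pg << PAGE_SHIFT);
}

The first pass would then merely sort the 32-bit BARs by descending
size and call alloc_mmio32() for each of them.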
After this pass you should be able to do RAM relocation in a single
attempt, just like we do today (you may still grow the MMIO window if
you know you need to and can fit some of the 64-bit BARs in there,
subject to said constraints; this is in an attempt to help OSes not
comfortable with 64-bit resources).

In a 2nd pass you'd then assign 64-bit resources: if you can fit them
below 4G (you still have the bitmap telling you what you've got
available), put them there. The allocation strategy could be the same
as above (biggest first), perhaps allowing some logic to be factored
out, but here smallest first could probably work equally well. The main
consideration in deciding between the two is whether it is better to
fit as many (small) BARs as possible under 4G, or as big a set (in
total) as possible. I'd generally expect the former (as many as
possible, leaving only a few huge ones to go above 4G) to be the better
approach, but that's more a gut feeling than based on hard data.

Jan