From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges
Date: Wed, 15 Jul 2015 14:40:11 +0100
Message-ID:
References: <1436420047-25356-1-git-send-email-tiejun.chen@intel.com> <1436420047-25356-7-git-send-email-tiejun.chen@intel.com> <55A3D5600200007800090330@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <55A3D5600200007800090330@mail.emea.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper, Ian Jackson, "xen-devel@lists.xen.org", Tiejun Chen, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

On Mon, Jul 13, 2015 at 2:12 PM, Jan Beulich wrote:
> Therefore I'll not make any further comments on the rest of the
> patch, but instead outline an allocation model that I think would
> fit our needs: Subject to the constraints mentioned above, set up
> a bitmap (maximum size 64k [2Gb = 2^^19 pages needing 2^^19
> bits], i.e. reasonably small a memory block). Each bit represents a
> page usable for MMIO: First of all you remove the range from
> PCI_MEM_END upwards. Then remove all RDM pages. Now do a
> first pass over all devices, allocating (in the bitmap) space for only
> the 32-bit MMIO BARs, starting with the biggest one(s), by finding
> a best fit (i.e. preferably a range not usable by any bigger BAR)
> from top down. For example, if you have available
>
> [f0000000,f8000000)
> [f9000000,f9001000)
> [fa000000,fa003000)
> [fa010000,fa012000)
>
> and you're looking for a single page slot, you should end up
> picking fa002000.
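[For illustration only: a minimal C sketch of the bitmap model quoted above. All names are invented, not hvmloader's actual interface; the bitmap covers the top 2Gb (one bit per 4k page, 2^19 bits = 64k bytes), and the "best fit ... preferably a range not usable by any bigger BAR" rule is glossed to a plain smallest-run-first search, so for the quoted ranges a single-page request lands in the exact one-page hole at f9000000 rather than at fa002000.]

```c
#include <stdint.h>

/* One bit per 4k page in [2Gb, 4Gb); set = usable for MMIO. */
#define PAGE_SHIFT   12
#define BITMAP_BASE  0x80000000ULL
#define NR_PAGES     (1u << 19)
static uint8_t mmio_bitmap[NR_PAGES / 8];   /* 64k bytes */

static unsigned int page_of(uint64_t addr)
{
    return (addr - BITMAP_BASE) >> PAGE_SHIFT;
}

static int page_usable(unsigned int pg)
{
    return mmio_bitmap[pg / 8] & (1u << (pg % 8));
}

static void mark_pages(uint64_t start, uint64_t end, int usable)
{
    for ( unsigned int pg = page_of(start); pg < page_of(end); pg++ )
    {
        if ( usable )
            mmio_bitmap[pg / 8] |= 1u << (pg % 8);
        else
            mmio_bitmap[pg / 8] &= ~(1u << (pg % 8));
    }
}

/* Allocate nr contiguous pages from the smallest run of set bits that
 * fits, carving from the run's top end (the top-down preference).
 * Returns the allocated address, or 0 if nothing fits. */
static uint64_t bitmap_alloc(unsigned int nr)
{
    unsigned int best_start = 0, best_len = 0, pg = 0;

    while ( pg < NR_PAGES )
    {
        if ( !page_usable(pg) )
        {
            pg++;
            continue;
        }
        unsigned int start = pg, len = 0;       /* maximal free run */
        while ( pg < NR_PAGES && page_usable(pg) )
            pg++, len++;
        /* Smallest run that is still big enough; later run wins ties. */
        if ( len >= nr && (best_len == 0 || len <= best_len) )
            best_start = start, best_len = len;
    }
    if ( best_len == 0 )
        return 0;
    unsigned int alloc = best_start + best_len - nr;
    mark_pages(BITMAP_BASE + ((uint64_t)alloc << PAGE_SHIFT),
               BITMAP_BASE + ((uint64_t)(alloc + nr) << PAGE_SHIFT), 0);
    return BITMAP_BASE + ((uint64_t)alloc << PAGE_SHIFT);
}

/* Set up the four ranges from the example above. */
static void setup_example(void)
{
    mark_pages(0xf0000000, 0xf8000000, 1);
    mark_pages(0xf9000000, 0xf9001000, 1);
    mark_pages(0xfa000000, 0xfa003000, 1);
    mark_pages(0xfa010000, 0xfa012000, 1);
}
```

With this simplification, calling setup_example() and then bitmap_alloc(1) returns 0xf9000000: the one-page range is the smallest run that fits.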
>
> After this pass you should be able to do RAM relocation in a
> single attempt just like we do today (you may still grow the MMIO
> window if you know you need to and can fit some of the 64-bit
> BARs in there, subject to said constraints; this is in an attempt
> to help OSes not comfortable with 64-bit resources).
>
> In a 2nd pass you'd then assign 64-bit resources: If you can fit
> them below 4G (you still have the bitmap left of what you've got
> available), put them there. Allocation strategy could be the same
> as above (biggest first), perhaps allowing for some factoring out
> of logic, but here smallest first probably could work equally well.
> The main thought to decide between the two is whether it is
> better to fit as many (small) or as big (in total) as possible a set
> under 4G. I'd generally expect the former (as many as possible,
> leaving only a few huge ones to go above 4G) to be the better
> approach, but that's more a gut feeling than based on hard data.

I agree that it would be more sensible for hvmloader to make a "plan"
first, then do the memory relocation (if possible) in a single pass,
and finally go through and update the device BARs according to that
"plan".

However, I don't see how having a bitmap really helps in this case.  I
would think having a list of free ranges (perhaps aligned by powers of
two?), sorted small->large, makes the most sense.

So suppose we had the above example, but with the range
[fa000000,fa005000) instead, and we're looking for a 4-page region.
Then our "free list" initially would look like this:

 [f9000000,f9001000)
 [fa010000,fa012000)
 [fa000000,fa005000)
 [f0000000,f8000000)

After skipping the first two because they aren't big enough, we'd take
0x4000 from the third one, placing the BAR at [fa000000,fa004000), and
putting the remainder [fa004000,fa005000) back on the free list in
order, thus:

 [f9000000,f9001000)
 [fa004000,fa005000)
 [fa010000,fa012000)
 [f0000000,f8000000)

If we got to the end and hadn't found a region large enough, *and* we
could still expand the MMIO hole, we could lower pci_mem_start until it
could fit.

What do you think?

 -George
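[For illustration only: a minimal C sketch of the free-range-list walk described above, using an invented fixed-size array rather than anything from hvmloader. Requests are taken from the bottom of the smallest range that can hold them, and any remainder is reinserted in size order, reproducing the worked example exactly.]

```c
#include <assert.h>
#include <stdint.h>

struct range { uint64_t start, end; };      /* half-open: [start, end) */

#define MAX_RANGES 16
static struct range free_list[MAX_RANGES];  /* sorted smallest-first */
static unsigned int nr_ranges;

static uint64_t range_size(const struct range *r)
{
    return r->end - r->start;
}

/* Insert a range, keeping the list sorted small->large. */
static void insert_range(uint64_t start, uint64_t end)
{
    unsigned int i = nr_ranges++;

    assert(nr_ranges <= MAX_RANGES);
    while ( i > 0 && range_size(&free_list[i - 1]) > end - start )
    {
        free_list[i] = free_list[i - 1];    /* shift bigger entries up */
        i--;
    }
    free_list[i] = (struct range){ start, end };
}

/* Take `size` bytes from the first (i.e. smallest) range that can hold
 * it; the remainder, if any, goes back on the list in size order.
 * Returns the allocated start address, or 0 if nothing fits. */
static uint64_t alloc_range(uint64_t size)
{
    for ( unsigned int i = 0; i < nr_ranges; i++ )
    {
        struct range r = free_list[i];

        if ( range_size(&r) < size )
            continue;                       /* too small: skip it */
        for ( unsigned int j = i; j + 1 < nr_ranges; j++ )
            free_list[j] = free_list[j + 1];
        nr_ranges--;
        if ( range_size(&r) > size )
            insert_range(r.start + size, r.end);
        return r.start;
    }
    return 0;                               /* caller may grow the hole */
}

/* Set up the four ranges from the example above. */
static void setup_example(void)
{
    insert_range(0xf0000000, 0xf8000000);
    insert_range(0xf9000000, 0xf9001000);
    insert_range(0xfa000000, 0xfa005000);
    insert_range(0xfa010000, 0xfa012000);
}
```

Calling setup_example() and then alloc_range(0x4000) returns 0xfa000000, leaving the remainder [fa004000,fa005000) as the second-smallest entry, just as in the lists above; the "expand the MMIO hole" fallback would hang off the 0-return case.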