From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: "Lu, Aaron" <aaron.lu@intel.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"song@kernel.org" <song@kernel.org>
Subject: Re: [RFC PATCH 1/4] x86/mm/cpa: restore global bit when page is present
Date: Thu, 11 Aug 2022 20:30:34 +0900
Message-ID: <YvToWsNqXudd6cSN@hyeyoo>
In-Reply-To: <df4f40ff3e4c408931ed21ab4e8968bdb1871f79.camel@intel.com>

On Thu, Aug 11, 2022 at 08:16:08AM +0000, Lu, Aaron wrote:
> On Thu, 2022-08-11 at 05:21 +0000, Hyeonggon Yoo wrote:
> > On Mon, Aug 08, 2022 at 10:56:46PM +0800, Aaron Lu wrote:
> > > For configs that don't have PTI enabled or cpus that don't need
> > > meltdown mitigation, current kernel can lose GLOBAL bit after a page
> > > goes through a cycle of present -> not present -> present.
> > > 
> > > It happened like this (__vunmap() does this in vm_remove_mappings()):
> > > original page protection: 0x8000000000000163 (NX/G/D/A/RW/P)
> > > set_memory_np(page, 1):   0x8000000000000062 (NX/D/A/RW) lose G and P
> > > set_memory_p(page, 1):    0x8000000000000063 (NX/D/A/RW/P) restored P
> > > 
> > > In the end, this page's protection no longer has Global bit set and this
> > > would create problem for this merge small mapping feature.
> > > 
> > > For this reason, restore Global bit for systems that do not have PTI
> > > enabled if page is present.
> > > 
> > > (pgprot_clear_protnone_bits() deserves a better name if this patch is
> > > acceptable but first, I would like to get some feedback if this is the
> > > right way to solve this so I didn't bother with the name yet)
> > > 
> > > Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> > > ---
> > >  arch/x86/mm/pat/set_memory.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> > > index 1abd5438f126..33657a54670a 100644
> > > --- a/arch/x86/mm/pat/set_memory.c
> > > +++ b/arch/x86/mm/pat/set_memory.c
> > > @@ -758,6 +758,8 @@ static pgprot_t pgprot_clear_protnone_bits(pgprot_t prot)
> > >  	 */
> > >  	if (!(pgprot_val(prot) & _PAGE_PRESENT))
> > >  		pgprot_val(prot) &= ~_PAGE_GLOBAL;
> > > +	else
> > > +		pgprot_val(prot) |= _PAGE_GLOBAL & __default_kernel_pte_mask;
> > >  
> > >  	return prot;
> > >  }
> > 
> > IIUC it makes it impossible to set _PAGE_GLOBAL when PTI is on.
> > 
> 
> Yes. Is this a problem?
> I think that is the intended behaviour when PTI is on: not to enable
> Global bit on kernel mappings.

Please note that I'm not an expert on PTI.

But AFAIK, with PTI, at least everything (the kernel part) that is mapped into the
user page table is mapped as global when PGE is supported.

I'm not sure that "Global bit is never used for the kernel part when PTI is enabled"
is true.

Also, commit d1440b23c922d ("x86/mm: Factor out pageattr _PAGE_GLOBAL setting") that introduced
pgprot_clear_protnone_bits() says:
	
	This unconditional setting of _PAGE_GLOBAL is a problem when we have
	PTI and non-PTI and we want some areas to have _PAGE_GLOBAL and some
	not.

	This updated version of the code says:
	1. Clear _PAGE_GLOBAL when !_PAGE_PRESENT
	2. Never set _PAGE_GLOBAL implicitly
	3. Allow _PAGE_GLOBAL to be in cpa.set_mask
	4. Allow _PAGE_GLOBAL to be inherited from previous PTE
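
To make the four rules concrete, here is a minimal user-space sketch, not the
kernel code. The PTE_* macros and the function names are hypothetical; only the
bit positions and the example values are taken from the patch description
above. It models the mask handling as "clear mask_clr, then set mask_set, then
drop G if the entry is not present":

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 PTE flag bits (names are hypothetical stand-ins for _PAGE_*). */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_RW       (UINT64_C(1) << 1)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)
#define PTE_GLOBAL   (UINT64_C(1) << 8)
#define PTE_NX       (UINT64_C(1) << 63)

/* The "original page protection" from the patch description: NX/G/D/A/RW/P. */
#define ORIG_PROT (PTE_NX | PTE_GLOBAL | PTE_DIRTY | PTE_ACCESSED | \
		   PTE_RW | PTE_PRESENT)

/* Rule 1: clear G when not present. Rule 2: never set G implicitly. */
static uint64_t clear_protnone_bits(uint64_t prot)
{
	if (!(prot & PTE_PRESENT))
		prot &= ~PTE_GLOBAL;
	return prot;
}

/*
 * Rules 3 and 4: G only changes if it is in one of the masks; otherwise
 * it is inherited from the old protection value.
 */
static uint64_t cpa_apply(uint64_t old, uint64_t mask_set, uint64_t mask_clr)
{
	uint64_t prot = (old & ~mask_clr) | mask_set;

	return clear_protnone_bits(prot);
}

/* present -> not present -> present, like set_memory_np() + set_memory_p(). */
static uint64_t np_then_p(uint64_t prot)
{
	prot = cpa_apply(prot, 0, PTE_PRESENT);	/* clears P, rule 1 drops G */
	assert(!(prot & PTE_GLOBAL));
	return cpa_apply(prot, PTE_PRESENT, 0);	/* restores P, G stays lost */
}
```

In this model, np_then_p(ORIG_PROT) goes from 0x8000000000000163 through
0x8000000000000062 to 0x8000000000000063, which is the sequence quoted in the
patch description: rule 2 is exactly why G does not come back on its own.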

> > Maybe it would be less intrusive to make
> > set_direct_map_default_noflush() replace protection bits
> > with PAGE_KERNEL as it's only called for direct map, and the function
> > is to reset permission to default:
> > 
> > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> > index 1abd5438f126..0dd4433c1382 100644
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -2250,7 +2250,16 @@ int set_direct_map_invalid_noflush(struct page *page)
> > 
> >  int set_direct_map_default_noflush(struct page *page)
> >  {
> > -       return __set_pages_p(page, 1);
> > +       unsigned long tempaddr = (unsigned long) page_address(page);
> > +       struct cpa_data cpa = {
> > +                       .vaddr = &tempaddr,
> > +                       .pgd = NULL,
> > +                       .numpages = 1,
> > +                       .mask_set = PAGE_KERNEL,
> > +                       .mask_clr = __pgprot(~0),

Nah, this sets _PAGE_ENC unconditionally, which should be evaluated.
Maybe a less intrusive way would be:
		       .mask_set = __pgprot(_PAGE_PRESENT |
					   (_PAGE_GLOBAL & __default_kernel_pte_mask)),
		       .mask_clr = __pgprot(0),
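
As a rough user-space illustration of what these masks would do (a sketch
under assumptions, not kernel code): the PTE_* macros and function names below
are hypothetical, and default_pte_mask is a plain parameter standing in for
the kernel's default PTE mask, which I assume has the Global bit cleared when
PTI is enabled.

```c
#include <stdint.h>

/* Hypothetical stand-ins for _PAGE_PRESENT and _PAGE_GLOBAL. */
#define PTE_PRESENT (UINT64_C(1) << 0)
#define PTE_GLOBAL  (UINT64_C(1) << 8)

/* mask_set: always P, and G only if the default mask permits it. */
static uint64_t default_mask_set(uint64_t default_pte_mask)
{
	return PTE_PRESENT | (PTE_GLOBAL & default_pte_mask);
}

/* mask_clr is 0, so every other bit is inherited from the old value. */
static uint64_t reset_to_default(uint64_t old, uint64_t default_pte_mask)
{
	return old | default_mask_set(default_pte_mask);
}
```

With a mask that allows G (no PTI in this model), 0x8000000000000062 becomes
0x8000000000000163 again; with G masked out, it becomes 0x8000000000000063.
Bits such as _PAGE_ENC are left untouched either way, because nothing is
cleared and only P and (optionally) G are set.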

> > +                       .flags = 0};
> > +
> > +       return __change_page_attr_set_clr(&cpa, 0);
> >  }
> 
> Looks reasonable to me and it is indeed less intrusive. I'm only
> concerned there might be other paths that also go through present ->
> not present -> present and this change can not cover them.
>

AFAIK the only other paths that go through present -> not present -> present
(using CPA) are the ones used when DEBUG_PAGEALLOC is enabled.

Do we care about direct map fragmentation when using DEBUG_PAGEALLOC?

> > 
> > set_direct_map_{invalid,default}_noflush() is the exact reason
> > why the direct map becomes split after vmalloc/vfree with special
> > permissions.
> 
> Yes I agree, because it can lose G bit after the whole cycle when PTI
> is not on. When PTI is on, there is no such problem because G bit is
> not there initially.
> 
> Thanks,
> Aaron

-- 
Thanks,
Hyeonggon
