From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965050AbbFCRJv (ORCPT ); Wed, 3 Jun 2015 13:09:51 -0400 Received: from mga14.intel.com ([192.55.52.115]:57941 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964840AbbFCRHd (ORCPT ); Wed, 3 Jun 2015 13:07:33 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,548,1427785200"; d="scan'208";a="720166580" From: "Kirill A. Shutemov" To: Andrew Morton , Andrea Arcangeli , Hugh Dickins Cc: Dave Hansen , Mel Gorman , Rik van Riel , Vlastimil Babka , Christoph Lameter , Naoya Horiguchi , Steve Capper , "Aneesh Kumar K.V" , Johannes Weiner , Michal Hocko , Jerome Marchand , Sasha Levin , linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Kirill A. Shutemov" Subject: [PATCHv6 04/36] mm, thp: adjust conditions when we can reuse the page on WP fault Date: Wed, 3 Jun 2015 20:05:35 +0300 Message-Id: <1433351167-125878-5-git-send-email-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1433351167-125878-1-git-send-email-kirill.shutemov@linux.intel.com> References: <1433351167-125878-1-git-send-email-kirill.shutemov@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org With new refcounting we will be able map the same compound page with PTEs and PMDs. It requires adjustment to conditions when we can reuse the page on write-protection fault. For PTE fault we can't reuse the page if it's part of huge page. For PMD we can only reuse the page if nobody else maps the huge page or it's part. We can do it by checking page_mapcount() on each sub-page, but it's expensive. The cheaper way is to check page_count() to be equal 1: every mapcount takes page reference, so this way we can guarantee, that the PMD is the only mapping. This approach can give false negative if somebody pinned the page, but that doesn't affect correctness. Signed-off-by: Kirill A. Shutemov Tested-by: Sasha Levin Acked-by: Jerome Marchand Acked-by: Vlastimil Babka --- include/linux/swap.h | 3 ++- mm/huge_memory.c | 12 +++++++++++- mm/swapfile.c | 3 +++ 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index 0428e4c84e1d..17cdd6b9456b 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -524,7 +524,8 @@ static inline int page_swapcount(struct page *page) return 0; } -#define reuse_swap_page(page) (page_mapcount(page) == 1) +#define reuse_swap_page(page) \ + (!PageTransCompound(page) && page_mapcount(page) == 1) static inline int try_to_free_swap(struct page *page) { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1043b9b0659c..6035cc92be6b 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1128,7 +1128,17 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, page = pmd_page(orig_pmd); VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page); - if (page_mapcount(page) == 1) { + /* + * We can only reuse the page if nobody else maps the huge page or it's + * part. We can do it by checking page_mapcount() on each sub-page, but + * it's expensive. + * The cheaper way is to check page_count() to be equal 1: every + * mapcount takes page reference reference, so this way we can + * guarantee, that the PMD is the only mapping. + * This can give false negative if somebody pinned the page, but that's + * fine. + */ + if (page_mapcount(page) == 1 && page_count(page) == 1) { pmd_t entry; entry = pmd_mkyoung(orig_pmd); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); diff --git a/mm/swapfile.c b/mm/swapfile.c index 6dd365d1c488..3cd5f188b996 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -887,6 +887,9 @@ int reuse_swap_page(struct page *page) VM_BUG_ON_PAGE(!PageLocked(page), page); if (unlikely(PageKsm(page))) return 0; + /* The page is part of THP and cannot be reused */ + if (PageTransCompound(page)) + return 0; count = page_mapcount(page); if (count <= 1 && PageSwapCache(page)) { count += page_swapcount(page); -- 2.1.4 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f181.google.com (mail-pd0-f181.google.com [209.85.192.181]) by kanga.kvack.org (Postfix) with ESMTP id 5104C900016 for ; Wed, 3 Jun 2015 13:07:50 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so11651392pdb.1 for ; Wed, 03 Jun 2015 10:07:50 -0700 (PDT) Received: from mga14.intel.com (mga14.intel.com. [192.55.52.115]) by mx.google.com with ESMTP id uy7si1767505pbc.246.2015.06.03.10.07.32 for ; Wed, 03 Jun 2015 10:07:33 -0700 (PDT) From: "Kirill A. Shutemov" Subject: [PATCHv6 04/36] mm, thp: adjust conditions when we can reuse the page on WP fault Date: Wed, 3 Jun 2015 20:05:35 +0300 Message-Id: <1433351167-125878-5-git-send-email-kirill.shutemov@linux.intel.com> In-Reply-To: <1433351167-125878-1-git-send-email-kirill.shutemov@linux.intel.com> References: <1433351167-125878-1-git-send-email-kirill.shutemov@linux.intel.com> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton , Andrea Arcangeli , Hugh Dickins Cc: Dave Hansen , Mel Gorman , Rik van Riel , Vlastimil Babka , Christoph Lameter , Naoya Horiguchi , Steve Capper , "Aneesh Kumar K.V" , Johannes Weiner , Michal Hocko , Jerome Marchand , Sasha Levin , linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Kirill A. Shutemov" With new refcounting we will be able map the same compound page with PTEs and PMDs. It requires adjustment to conditions when we can reuse the page on write-protection fault. For PTE fault we can't reuse the page if it's part of huge page. For PMD we can only reuse the page if nobody else maps the huge page or it's part. We can do it by checking page_mapcount() on each sub-page, but it's expensive. The cheaper way is to check page_count() to be equal 1: every mapcount takes page reference, so this way we can guarantee, that the PMD is the only mapping. This approach can give false negative if somebody pinned the page, but that doesn't affect correctness. Signed-off-by: Kirill A. Shutemov Tested-by: Sasha Levin Acked-by: Jerome Marchand Acked-by: Vlastimil Babka --- include/linux/swap.h | 3 ++- mm/huge_memory.c | 12 +++++++++++- mm/swapfile.c | 3 +++ 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index 0428e4c84e1d..17cdd6b9456b 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -524,7 +524,8 @@ static inline int page_swapcount(struct page *page) return 0; } -#define reuse_swap_page(page) (page_mapcount(page) == 1) +#define reuse_swap_page(page) \ + (!PageTransCompound(page) && page_mapcount(page) == 1) static inline int try_to_free_swap(struct page *page) { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1043b9b0659c..6035cc92be6b 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1128,7 +1128,17 @@ int do_huge_pmd_wp_page(struct mm_struct *mm, struct vm_area_struct *vma, page = pmd_page(orig_pmd); VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page); - if (page_mapcount(page) == 1) { + /* + * We can only reuse the page if nobody else maps the huge page or it's + * part. We can do it by checking page_mapcount() on each sub-page, but + * it's expensive. + * The cheaper way is to check page_count() to be equal 1: every + * mapcount takes page reference reference, so this way we can + * guarantee, that the PMD is the only mapping. + * This can give false negative if somebody pinned the page, but that's + * fine. + */ + if (page_mapcount(page) == 1 && page_count(page) == 1) { pmd_t entry; entry = pmd_mkyoung(orig_pmd); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); diff --git a/mm/swapfile.c b/mm/swapfile.c index 6dd365d1c488..3cd5f188b996 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -887,6 +887,9 @@ int reuse_swap_page(struct page *page) VM_BUG_ON_PAGE(!PageLocked(page), page); if (unlikely(PageKsm(page))) return 0; + /* The page is part of THP and cannot be reused */ + if (PageTransCompound(page)) + return 0; count = page_mapcount(page); if (count <= 1 && PageSwapCache(page)) { count += page_swapcount(page); -- 2.1.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org