LKML Archive mirror
From: Minchan Kim <minchan@kernel.org>
To: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-api@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Jason Evans <je@fb.com>, Daniel Micay <danielmicay@gmail.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Shaohua Li <shli@kernel.org>, Michal Hocko <mhocko@suse.cz>,
	yalin.wang2010@gmail.com, Andy Lutomirski <luto@amacapital.net>
Subject: Re: [PATCH v5 00/12] MADV_FREE support
Date: Sun, 7 Feb 2016 21:31:20 +0900
Message-ID: <20160207123120.GA16116@bbox>
In-Reply-To: <56B5F5D2.70309@gmail.com>

On Sat, Feb 06, 2016 at 02:32:02PM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Minchan,
> 
> On 02/05/2016 03:15 AM, Minchan Kim wrote:
> > On Thu, Jan 28, 2016 at 08:16:25AM +0100, Michael Kerrisk (man-pages) wrote:
> >> Hello Minchan,
> >>
> >> On 11/30/2015 07:39 AM, Minchan Kim wrote:
> >>> In v4, Andrew wanted to settle the old basic MADV_FREE first and
> >>> introduce the new stuff (i.e., lazyfree LRU, swapless support and
> >>> lazyfreeness) later, so this version doesn't include them.
> >>>
> >>> I have tested it on mmotm-2015-11-25-17-08 with an additional
> >>> patch[1] from Kirill to prevent a BUG_ON, which he hasn't yet sent
> >>> to linux-mm as a formal patch. With it, I haven't found any
> >>> problem so far.
> >>>
> >>> Note that this version is based on the THP refcount redesign, so
> >>> I needed some modifications to MADV_FREE because split_huge_pmd
> >>> doesn't split a THP page any more and pmd_trans_huge(pmd) is not
> >>> enough to guarantee the page is not a THP page.
> >>> Also, for MADV_FREE lazy-split, THP split should respect the
> >>> pmd's dirtiness rather than marking the ptes of all subpages dirty
> >>> unconditionally. Please review the last patch in this patchset.
> >>
> >> Now that MADV_FREE has been merged, would you be willing to write
> >> a patch to the madvise(2) man page that describes the semantics,
> >> notes limitations and restrictions, and (ideally) has some sentences
> >> describing use cases?
> >>
> > 
> > Hello Michael,
> > 
> > Could you review this patch?
> > 
> > Thanks.
> > 
> > From 203372f901f574e991215fdff6907608ba53f932 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan@kernel.org>
> > Date: Fri, 5 Feb 2016 11:09:54 +0900
> > Subject: [PATCH] madvise.2: Add MADV_FREE
> > 
> > Document the MADV_FREE flag added to madvise() in Linux 4.5.
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  man2/madvise.2 | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index c1df67c..4704304 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -143,6 +143,25 @@ flag are special memory areas that are not managed
> >  by the virtual memory subsystem.
> >  Such pages are typically created by device drivers that
> >  map the pages into user space.)
> > +.TP
> > +.B MADV_FREE " (since Linux 4.5)"
> > +Application is finished with the given range, so kernel can free
> > +resources associated with it but the freeing could be delayed until
> > +memory pressure happens or canceled by write operation by user.
> > +
> > +After a successful MADV_FREE operation, user shouldn't expect kernel
> > +keeps stale data on the page. However, subsequent write of pages
> > +in the range will succeed and then kernel cannot free those dirtied pages
> > +so user can always see just written data. If there was no subsequent
> > +write, kernel can free those clean pages any time. In such case,
> > +user can see zero-fill-on-demand pages.
> > +
> > +Note that, it works only with private anonymous pages (see
> > +.BR mmap (2)).
> > +On swapless system, freeing pages in given range happens instantly
> > +regardless of memory pressure.
> > +
> > +
> >  .\"
> >  .\" ======================================================================
> >  .\"
> > 
> 
> Thanks for the nice text! I reworked it somewhat, trying to fill out a
> few details about how I understand things work, but I may have introduced
> errors, so I would be happy if you would check the following text:

The text below looks good to me.
Thanks, Michael.

> 
>        MADV_FREE (since Linux 4.5)
>               The  application  no  longer  requires  the pages in the
>               range specified by addr and len.  The  kernel  can  thus
>               free these pages, but the freeing could be delayed until
>               memory pressure occurs.  For each of the pages that  has
>               been  marked to be freed but has not yet been freed, the
>               free operation will be canceled  if  the  caller  writes
>               into  the page.  After a successful MADV_FREE operation,
>               any stale data (i.e., dirty, unwritten  pages)  will  be
>               lost  when  the kernel frees the pages.  However, subsequent
>               writes to pages in the range will succeed,  and the kernel
>               then cannot free those dirtied pages, so the caller always
>               sees the data it has just written.   If  there  is  no
>               subsequent  write,  the kernel can free the pages at any
>               time.  Once pages in the  range  have  been  freed,  the
>               caller  will  see  zero-fill-on-demand pages upon subse‐
>               quent page references.
> 
>               The MADV_FREE operation can be applied only  to  private
>               anonymous  pages  (see  mmap(2)).  On a swapless system,
>               freeing  pages  in  a  given  range  happens  instantly,
>               regardless of memory pressure.
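
For anyone who wants to try the semantics described above, here is a
minimal sketch in C. It is not part of the patch. The helper name
madv_free_demo() and the fallback definition of MADV_FREE (8, which I
assume matches the asm-generic value for older libc headers) are
illustrative assumptions. It maps private anonymous memory, fills it,
applies MADV_FREE, and then writes to the first page only. The written
byte must be visible afterwards, while an untouched page may show
either the old data or zeroes, depending on whether the kernel has
already freed it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* assumption: asm-generic value, for older libc headers */
#endif

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if the observed contents match the documented semantics,
 * 1 if the kernel rejected MADV_FREE (for example, a pre-4.5 kernel),
 * -1 if mmap() failed, and 2 on an unexpected result.
 */
int madv_free_demo(size_t npages)
{
    assert(npages >= 2); /* need one page to write and one to leave alone */

    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * pagesz;

    /* MADV_FREE applies only to private anonymous pages. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 'A', len); /* dirty every page */

    if (madvise(buf, len, MADV_FREE) != 0) {
        munmap(buf, len);
        return 1;
    }

    /* Writing to a page ensures the new data is kept for that page. */
    buf[0] = 'B';

    /* The last page was not written: old data or zero-fill are both valid. */
    char last = buf[len - 1];
    int ok = (buf[0] == 'B') && (last == 'A' || last == 0);

    munmap(buf, len);
    return ok ? 0 : 2;
}
```

Calling madv_free_demo(4) from a small main() should return 0 on a
kernel that supports MADV_FREE. Which of the two values the untouched
page shows is not something the caller can rely on.
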
> 
> Thanks,
> 
> Michael
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> 
