From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752450AbcBOUkR (ORCPT ); Mon, 15 Feb 2016 15:40:17 -0500
Received: from mail-wm0-f67.google.com ([74.125.82.67]:34048 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751735AbcBOUkP (ORCPT ); Mon, 15 Feb 2016 15:40:15 -0500
Date: Mon, 15 Feb 2016 21:40:12 +0100
From: Michal Hocko
To: akpm@linux-foundation.org, Tetsuo Handa
Cc: rientjes@google.com, mgorman@suse.de, oleg@redhat.com,
	torvalds@linux-foundation.org, hughd@google.com, andrea@kernel.org,
	riel@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3.1/5] oom: make oom_reaper freezable
Message-ID: <20160215204011.GC9223@dhcp22.suse.cz>
References: <1454505240-23446-1-git-send-email-mhocko@kernel.org>
	<1454505240-23446-4-git-send-email-mhocko@kernel.org>
	<201602042322.IAG65142.MOOJHFSVLOQFFt@I-love.SAKURA.ne.jp>
	<20160204144319.GD14425@dhcp22.suse.cz>
	<20160206064505.GB20537@dhcp22.suse.cz>
	<201602062333.CCI64980.MFJFFVOLtOOQSH@I-love.SAKURA.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201602062333.CCI64980.MFJFFVOLtOOQSH@I-love.SAKURA.ne.jp>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Andrew,
this can either be folded into the 3/5 patch or go in as a standalone one.
I would be inclined to keep it standalone for the record (the description
should be pretty clear about the intention) and because the issue is
highly unlikely. OOM during the PM freezer doesn't happen in 99% of cases.

On Sat 06-02-16 23:33:24, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > OK, I was thinking about it some more and it seems you are right here.
> > oom_reaper as a kernel thread is not freezable automatically and so it
> > might interfere after all the processes/kernel threads are considered
> > frozen. Then it really might shut down TIF_MEMDIE too early and wake out
> > oom_killer_disable. wait_event_freezable is not sufficient because the
> > oom_reaper might running while the PM freezer is freezing tasks and it
> > will miss it because it doesn't see it.
>
> I'm not using PM freezer, but your answer is opposite to my guess.
> I thought try_to_freeze_tasks(false) is called by freeze_kernel_threads()
> after oom_killer_disable() succeeded, and try_to_freeze_tasks(false) will
> freeze both userspace tasks (including OOM victims which got TIF_MEMDIE
> cleared by the OOM reaper) and kernel threads (including the OOM reaper).

Kernel threads which are not freezable are ignored by the freezer.

> Thus, I was guessing that clearing TIF_MEMDIE without reaching do_exit() is
> safe.

Does the following explain it better?
---
>>From d7f57b1ac07532657312c91f3bba67cf0332b32f Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Mon, 15 Feb 2016 10:09:39 +0100
Subject: [PATCH] oom: make oom_reaper freezable

After "oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address
space" oom_reaper will call exit_oom_victim on the target task after it is
done. This might however race with the PM freezer:

CPU0                        CPU1                        CPU2
freeze_processes
  try_to_freeze_tasks
                            # Allocation request
                            out_of_memory
  oom_killer_disable
                                                        wake_oom_reaper(P1)
                                                        __oom_reap_task
                                                          exit_oom_victim(P1)
    wait_event(oom_victims==0)
[...]
                            do_exit(P1)
                              perform IO/interfere with the freezer

which breaks the oom_killer_disable semantics. We no longer have a
guarantee that the oom victim won't interfere with the freezer because it
might be anywhere on the way to do_exit while the freezer thinks the task
has already terminated. It might trigger IO or touch devices which are
already frozen.

In order to close this race, make the oom_reaper thread freezable.
This will work because
	a) an already running oom_reaper will block the freezer from
	   entering the quiescent state
	b) wake_oom_reaper will not wake up the reaper after it has been
	   frozen
	c) the only way to call exit_oom_victim after try_to_freeze_tasks
	   is from the oom victim's context, when we know that further
	   interference shouldn't be possible

Signed-off-by: Michal Hocko
---
 mm/oom_kill.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ca61e6cfae52..7e9953a64489 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -521,6 +521,8 @@ static void oom_reap_task(struct task_struct *tsk)
 
 static int oom_reaper(void *unused)
 {
+	set_freezable();
+
 	while (true) {
 		struct task_struct *tsk = NULL;
 
-- 
2.7.0

-- 
Michal Hocko
SUSE Labs
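
For illustration only, here is a userspace toy model of the point made
above that kernel threads which are not freezable are ignored by the
freezer. It is not kernel code: struct model_task, model_freezer_tracks()
and model_freeze_pass() are hypothetical names made up for this sketch, and
the single "freezable" flag only stands in for the effect of calling
set_freezable() in a kernel thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	const char *name;
	bool kthread;	/* true for a kernel thread, false for a user task */
	bool freezable;	/* stands in for the effect of set_freezable() */
	bool frozen;	/* set once the modelled freezer has frozen the task */
};

/* Modelled policy: kernel threads that are not freezable are ignored. */
static bool model_freezer_tracks(const struct model_task *t)
{
	return !t->kthread || t->freezable;
}

/*
 * One modelled freezing pass: freeze every tracked task and return the
 * number of tasks that the freezer skipped and that may therefore keep
 * running while everything else is considered frozen.
 */
static size_t model_freeze_pass(struct model_task *tasks, size_t n)
{
	size_t still_running = 0;

	for (size_t i = 0; i < n; i++) {
		if (model_freezer_tracks(&tasks[i]))
			tasks[i].frozen = true;
		else
			still_running++;
	}
	return still_running;
}
```

With one user task and one kernel thread standing in for the oom_reaper,
a pass over the array leaves the kernel thread running as long as its
freezable flag is clear, and leaves nothing running once the flag is set.
That is the window the patch is meant to close.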