From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757007AbcBQJtB (ORCPT ); Wed, 17 Feb 2016 04:49:01 -0500
Received: from mail-wm0-f51.google.com ([74.125.82.51]:34129 "EHLO
	mail-wm0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756827AbcBQJs5 (ORCPT );
	Wed, 17 Feb 2016 04:48:57 -0500
Date: Wed, 17 Feb 2016 10:48:55 +0100
From: Michal Hocko
To: Andrew Morton
Cc: David Rientjes, Mel Gorman, Tetsuo Handa, Oleg Nesterov,
	Linus Torvalds, Hugh Dickins, Andrea Argangeli, Rik van Riel,
	linux-mm@kvack.org, LKML
Subject: [PATCH 6/5] oom, oom_reaper: disable oom_reaper for
Message-ID: <20160217094855.GC29196@dhcp22.suse.cz>
References: <1454505240-23446-1-git-send-email-mhocko@kernel.org>
 <1454505240-23446-6-git-send-email-mhocko@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1454505240-23446-6-git-send-email-mhocko@kernel.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Andrew,
although this can be folded into patch 5
(mm-oom_reaper-implement-oom-victims-queuing.patch), I think it would be
better to keep it separate and revert it after we sort out the proper
oom_kill_allocating_task behavior or handle exclusion at the oom_reaper
level.

Thanks!
---
>>From 7d8c953994f97fb38a8d71b53c06ecf8418616e9 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Wed, 17 Feb 2016 10:40:41 +0100
Subject: [PATCH] oom, oom_reaper: disable oom_reaper for
 oom_kill_allocating_task

Tetsuo has reported that oom_kill_allocating_task=1 will cause
oom_reaper_list corruption because oom_kill_process doesn't follow
standard OOM exclusion (aka ignores TIF_MEMDIE) and allows the same task
to be enqueued multiple times - e.g. by sacrificing the same child
multiple times.

Let's work around this issue for now until we decide how to handle
oom_kill_allocating_task properly (should it sacrifice children at all?)
or come up with some other protection.

Reported-by: Tetsuo Handa
Signed-off-by: Michal Hocko
---
 mm/oom_kill.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 7e9953a64489..078e07ec0906 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -678,7 +678,14 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	unsigned int victim_points = 0;
 	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
 					      DEFAULT_RATELIMIT_BURST);
-	bool can_oom_reap = true;
+	bool can_oom_reap;
+
+	/*
+	 * XXX: oom_kill_allocating_task doesn't follow normal OOM exclusion
+	 * and so the same task might enter oom_kill_process which oom_reaper
+	 * cannot handle currently.
+	 */
+	can_oom_reap = !sysctl_oom_kill_allocating_task;
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
-- 
2.7.0

-- 
Michal Hocko
SUSE Labs
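
For readers wondering why enqueuing the same task twice is fatal for the
list, here is a minimal userspace sketch. It assumes the queue is a
kernel-style circular doubly linked list; struct list_node, struct victim,
the queued flag and all helper names are hypothetical stand-ins for
illustration, not the code in mm/oom_kill.c. The guarded variant shows one
possible form of "some other protection", not a decided design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (illustration only). */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert new_node right after head, following the usual list_add() logic. */
static void list_add_node(struct list_node *new_node, struct list_node *head)
{
	struct list_node *next;

	assert(new_node != NULL && head != NULL);
	next = head->next;
	next->prev = new_node;
	new_node->next = next;
	new_node->prev = head;
	head->next = new_node;
}

/*
 * Walk the list for at most `limit` steps. Returns the number of entries,
 * or -1 if the walk does not get back to head within the limit, which
 * means there is a cycle that bypasses head (a corrupted list).
 */
static int list_count_bounded(const struct list_node *head, int limit)
{
	const struct list_node *pos = head->next;
	int n = 0;

	while (pos != head) {
		if (++n > limit)
			return -1;
		pos = pos->next;
	}
	return n;
}

/* Hypothetical victim record; `queued` is an assumed guard flag. */
struct victim {
	struct list_node node;
	bool queued;
};

/* Unguarded enqueue: a second call for the same victim corrupts the list. */
static void enqueue_unguarded(struct victim *v, struct list_node *queue)
{
	list_add_node(&v->node, queue);
}

/* Guarded enqueue: a second call for the same victim is a no-op. */
static bool enqueue_guarded(struct victim *v, struct list_node *queue)
{
	if (v->queued)
		return false;
	v->queued = true;
	list_add_node(&v->node, queue);
	return true;
}
```

In this sketch, adding a node that is already first in the queue makes its
next pointer refer to itself, so a walk starting at the head never gets
back to the head. The patch above sidesteps the problem differently: it
never hands the victim to the reaper at all when
oom_kill_allocating_task is set.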