From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753612AbbFKMyH (ORCPT ); Thu, 11 Jun 2015 08:54:07 -0400
Received: from casper.infradead.org ([85.118.1.10]:39633 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752370AbbFKMxu (ORCPT ); Thu, 11 Jun 2015 08:53:50 -0400
Message-Id: <20150611124636.448700267@infradead.org>
User-Agent: quilt/0.61-1
Date: Thu, 11 Jun 2015 14:46:36 +0200
From: Peter Zijlstra
To: umgwanakikbuti@gmail.com, mingo@elte.hu
Cc: ktkhai@parallels.com, rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com, pang.xunlei@linaro.org, oleg@redhat.com, wanpeng.li@linux.intel.com, linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: [PATCH 00/18] sched: balance callbacks v4
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Mike stumbled over a cute bug in the RT/DL balancing ops.

The exact scenario is __sched_setscheduler() changing a (runnable) task from
FIFO to OTHER. In switched_from_rt(), where we do pull_rt_task(), we temporarily
drop rq->lock. This gap allows regular cfs load-balancing to step in and migrate
our task.

However, check_class_changed() will happily continue with switched_to_fair(),
which assumes our task is still on the old rq, and makes the kernel go boom.

Instead of trying to patch this up and make things complicated, simply disallow
these methods to drop rq->lock, extend the current post_schedule stuff into
a balancing callback list, and use that.

This survives Mike's testcase.

Changes since -v3:
 - reworked the hrtimer stuff, again. -- Kirill, Oleg
 - small changes to the new lockdep stuff

Changes since -v2:
 - reworked the hrtimer patch. -- Kirill, tglx
 - added lock pinning

Changes since -v1:
 - made SMP=n build,
 - cured switched_from_dl()'s cancel_dl_timer().

No real tests on the new parts other than booting / building kernels.
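
For illustration only, the callback-list idea described above can be sketched
as a toy userspace model. This is not the code from the patches: every name
here (toy_rq, toy_callback, toy_queue_callback, ...) is hypothetical, a bool
stands in for rq->lock, and the deferred "pull" merely counts how often it ran.
The point it tries to show is that the class-change hooks queue their balancing
work instead of dropping the lock, so switched_to_fair()'s assumption holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. All names are hypothetical, not the kernel's. */
struct toy_rq;

struct toy_callback {
	struct toy_callback *next;
	void (*func)(struct toy_rq *rq);
	bool queued;                /* guards against double queueing */
};

struct toy_rq {
	bool locked;                /* stands in for rq->lock */
	bool task_on_rq;            /* "our task is still on this rq" */
	struct toy_callback *head;  /* pending balance callbacks */
	int pulls;                  /* times the deferred pull ran */
};

void toy_lock(struct toy_rq *rq)   { assert(!rq->locked); rq->locked = true; }
void toy_unlock(struct toy_rq *rq) { assert(rq->locked);  rq->locked = false; }

/* Queue work to run later; must be called with the lock held. */
void toy_queue_callback(struct toy_rq *rq, struct toy_callback *cb,
			void (*func)(struct toy_rq *rq))
{
	assert(rq->locked);
	if (cb->queued)
		return;
	cb->func = func;
	cb->next = rq->head;
	cb->queued = true;
	rq->head = cb;
}

/* Deferred balancing work; in this model it only counts invocations. */
void toy_pull_task(struct toy_rq *rq)
{
	rq->pulls++;
}

static struct toy_callback toy_pull_cb;

/* Class-change hook: queues the pull instead of dropping the lock. */
void toy_switched_from_rt(struct toy_rq *rq)
{
	assert(rq->locked);
	toy_queue_callback(rq, &toy_pull_cb, toy_pull_task);
}

/* Class-change hook: relies on the task not having moved. */
void toy_switched_to_fair(struct toy_rq *rq)
{
	assert(rq->locked);
	assert(rq->task_on_rq);
}

/* Run and clear everything that was queued during the locked section. */
void toy_run_callbacks(struct toy_rq *rq)
{
	struct toy_callback *cb = rq->head;

	rq->head = NULL;
	while (cb) {
		struct toy_callback *next = cb->next;

		cb->next = NULL;
		cb->queued = false;
		cb->func(rq);
		cb = next;
	}
}

/* Both hooks run in one unbroken locked section; callbacks run afterwards. */
void toy_set_scheduler(struct toy_rq *rq)
{
	toy_lock(rq);
	toy_switched_from_rt(rq);
	toy_switched_to_fair(rq);
	toy_unlock(rq);
	toy_run_callbacks(rq);
}
```

Calling toy_set_scheduler() on a toy_rq with task_on_rq set runs both hooks
under one unbroken locked section and only then runs the queued pull once.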