From: "Rafael J. Wysocki"
To: Juri Lelli
Cc: "Rafael J. Wysocki", Linux PM list, Linux Kernel Mailing List, Peter Zijlstra, Srinivas Pandruvada, Viresh Kumar, Steve Muckle, Thomas Gleixner
Date: Thu, 4 Feb 2016 18:19:01 +0100
Subject: Re: [PATCH 0/3] cpufreq: Replace timers with utilization update callbacks
In-Reply-To: <20160204105116.GF12132@e106622-lin>
References: <3071836.JbNxX8hU6x@vostro.rjw.lan> <18671470.kF8gVcBlTg@vostro.rjw.lan> <20160204105116.GF12132@e106622-lin>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 4, 2016 at 11:51 AM, Juri Lelli wrote:
> Hi Rafael,
>
> On 03/02/16 23:20, Rafael J. Wysocki wrote:
>> On Friday, January 29, 2016 11:52:15 PM Rafael J. Wysocki wrote:
>> > Hi,
>> >
>> > The following patch series introduces a mechanism allowing the cpufreq core
>> > and "setpolicy" drivers to provide utilization update callbacks to be invoked
>> > by the scheduler on utilization changes. Those callbacks can be used to run
>> > the sampling and frequency adjustments code (intel_pstate) or to schedule the
>> > execution of that code in process context (cpufreq core) instead of per-CPU
>> > deferrable timers used in cpufreq today (which Thomas complained about during
>> > the last Kernel Summit).
>> >
>> > [1/3] Introduce a mechanism for calling into cpufreq from the scheduler and
>> > registering callbacks to be executed from there.
>> >
>> > [2/3] Modify intel_pstate to use the mechanism introduced by [1/3] instead
>> > of per-CPU deferrable timers to do its work.
>> >
>> > This isn't entirely straightforward as the scheduler context running those
>> > callbacks is really special. Among other things it can only use raw
>> > spinlocks and cannot invoke wake_up_process() directly. Also, calling
>> > ktime_get() from there may be too expensive on some systems. All that has to
>> > be taken into account, but even then the change allows some lines of code to be
>> > cut from the driver.
>> >
>> > Some performance and energy consumption measurements have been carried out with
>> > an earlier version of this patch and it looks like the changes lead to a
>> > slightly better performing system that consumes slightly less energy at the
>> > same time overall.
>> >
>> > [3/3] Modify the cpufreq core to use the mechanism introduced by [1/3] instead
>> > of per-CPU deferrable timers to queue up the execution of governor work.
>> >
>> > Again, this isn't really straightforward for the above reasons, but still the
>> > code size is reduced a bit by the changes.
>> >
>> > I'm still unsure about the energy consumption and performance impact of [3/3]
>> > as earlier versions of it led to inconsistent results (most likely due to bugs
>> > in them that hopefully have been fixed in this version). In particular, the
>> > additional irq_work may turn out to be problematic, but more optimizations are
>> > possible on top of this one even if it makes things worse by itself.
>> >
>> > For example, it should be possible to move the execution of state selection
>> > code into the utilization update callback itself, at least in principle, for
>> > all governors. The P-state/OPP adjustment may need to be run from process
>> > context still, but for the drivers that can do it without sleeping it should
>> > be possible to move that into the utilization update callback as well.
>> >
>> > The patches are on top of 4.5-rc1 and have been tested on a couple of x86
>> > machines.
>>
>> Well, no responses here, so I'm inclined to believe that this series is fine
>> by everybody (at least by everybody in the CC).
>>
>
> I did intend to test and review this series, but then other patches
> required attention as well and I didn't find time to have a look at
> these. Sorry about that. Also, if I can speak for him, I think that
> Steve is OOO this week.

No problem at all.

>> I can wait for a few days more, but new material is starting to pile up on top
>> of these patches and I'll simply need to move forward at one point.
>>
>
> Unfortunately, I can't promise anything at the moment, but, if I find
> some time, I'll run some tests (BTW, do you have already something that I
> can put to run on my boxes?). I guess I can eventually do that after
> this gets merged as well.

Thanks!

Well, everything that might regress performance-wise or from the energy
consumption standpoint would be good to run.

Thanks,
Rafael
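
For readers following the thread, the callback mechanism described in the
quoted cover letter can be modelled in user space roughly as below. This is
only a sketch under stated assumptions, not the code from the patches: every
name in it (util_hook, set_util_hook, update_util, sampler, the 10 ms
interval, NR_CPUS = 4) is hypothetical. It shows a per-CPU hook that the
"scheduler" side invokes on utilization changes, a callback that does nothing
heavy itself and only marks work as pending once per sampling interval, and a
separate function standing in for the work later run from process context.
The timestamp is passed in by the caller so the callback never reads a clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4                        /* hypothetical CPU count */
#define SAMPLE_INTERVAL_NS 10000000ULL   /* hypothetical 10 ms sampling period */

/* Hypothetical per-CPU hook; not the names used by the actual patches. */
struct util_hook {
	void (*func)(struct util_hook *hook, uint64_t now_ns,
		     unsigned long util, unsigned long max);
};

static struct util_hook *util_hooks[NR_CPUS];

/* "cpufreq" side: install the hook for one CPU, or clear it with NULL. */
static void set_util_hook(unsigned int cpu, struct util_hook *hook)
{
	assert(cpu < NR_CPUS);
	util_hooks[cpu] = hook;
}

/* "scheduler" side: called whenever a CPU's utilization changes. */
static void update_util(unsigned int cpu, uint64_t now_ns,
			unsigned long util, unsigned long max)
{
	struct util_hook *hook = util_hooks[cpu];

	if (hook)
		hook->func(hook, now_ns, util, max);
}

/* Governor-like user of the hook. */
struct sampler {
	struct util_hook hook;     /* embedded so the callback can find us */
	uint64_t last_sample_ns;   /* time of the last accepted sample */
	bool work_pending;         /* stands in for queued deferred work */
	unsigned int samples;      /* number of deferred runs completed */
};

/* Runs in the restricted context: no sleeping, no clock reads, just a flag. */
static void sampler_cb(struct util_hook *hook, uint64_t now_ns,
		       unsigned long util, unsigned long max)
{
	struct sampler *s = (struct sampler *)
		((char *)hook - offsetof(struct sampler, hook));

	(void)util;
	(void)max;

	/* Skip if work is already queued or the interval has not elapsed. */
	if (s->work_pending || now_ns - s->last_sample_ns < SAMPLE_INTERVAL_NS)
		return;

	s->last_sample_ns = now_ns;
	s->work_pending = true;
}

/* Stands in for the governor work executed later in process context. */
static void sampler_run_deferred(struct sampler *s)
{
	if (!s->work_pending)
		return;
	s->samples++;
	s->work_pending = false;
}
```

To try it, initialise a struct sampler with .hook.func = sampler_cb, install
it with set_util_hook(0, &s.hook), feed it timestamps through update_util(),
and call sampler_run_deferred() wherever the deferred work would run. Updates
that arrive inside the sampling interval are dropped by the callback, which
is the rate limiting the per-CPU deferrable timers used to provide.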