Linux-PM Archive mirror
From: Qais Yousef <qyousef@layalina.io>
To: Ashay Jaiswal <quic_ashayj@quicinc.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Lukasz Luba <lukasz.luba@arm.com>, Wei Wang <wvw@google.com>,
	Rick Yiu <rickyiu@google.com>,
	Chung-Kai Mei <chungkai@google.com>,
	quic_anshar@quicinc.com, quic_atulpant@quicinc.com,
	quic_shashim@quicinc.com, quic_rgottimu@quicinc.com,
	quic_adharmap@quicinc.com, quic_kshivnan@quicinc.com,
	quic_pkondeti@quicinc.com
Subject: Re: [PATCH v2 8/8] sched/pelt: Introduce PELT multiplier
Date: Fri, 19 Apr 2024 14:19:58 +0100	[thread overview]
Message-ID: <20240419131958.s46xbqtmvr2yjn6j@airbuntu> (raw)
In-Reply-To: <c1f8c627-6497-4598-8b71-6be45e9c12f1@quicinc.com>

Hi Ashay

On 04/12/24 15:36, Ashay Jaiswal wrote:
> On 2/6/2024 10:37 PM, Ashay Jaiswal wrote:
> > 
> > 
> > On 1/30/2024 10:58 PM, Vincent Guittot wrote:
> >> On Sun, 28 Jan 2024 at 17:22, Ashay Jaiswal <quic_ashayj@quicinc.com> wrote:
> >>>
> >>> Hello Qais Yousef,
> >>>
> >>> Thank you for your response.
> >>>
> >>> On 1/21/2024 5:34 AM, Qais Yousef wrote:
> >>>> Hi Ashay
> >>>>
> >>>> On 01/20/24 13:22, Ashay Jaiswal wrote:
> >>>>> Hello Qais Yousef,
> >>>>>
> >>>>> We ran a few benchmarks with the PELT multiplier patch on a Snapdragon
> >>>>> 8Gen2 based internal Android device, and we are observing significant
> >>>>> improvements with the PELT8 configuration compared to PELT32.
> >>>>>
> >>>>> Following are some of the benchmark results with PELT32 and PELT8
> >>>>> configuration:
> >>>>>
> >>>>> +-----------------+---------------+----------------+----------------+
> >>>>> | Test case                       |     PELT32     |     PELT8      |
> >>>>> +-----------------+---------------+----------------+----------------+
> >>>>> |                 |    Overall    |     711543     |     971275     |
> >>>>> |                 +---------------+----------------+----------------+
> >>>>> |                 |    CPU        |     193704     |     224378     |
> >>>>> |                 +---------------+----------------+----------------+
> >>>>> |ANTUTU V9.3.9    |    GPU        |     284650     |     424774     |
> >>>>> |                 +---------------+----------------+----------------+
> >>>>> |                 |    MEM        |     125207     |     160548     |
> >>>>> |                 +---------------+----------------+----------------+
> >>>>> |                 |    UX         |     107982     |     161575     |
> >>>>> +-----------------+---------------+----------------+----------------+
> >>>>> |                 |   Single core |     1170       |     1268       |
> >>>>> |GeekBench V5.4.4 +---------------+----------------+----------------+
> >>>>> |                 |   Multi core  |     2530       |     3797       |
> >>>>> +-----------------+---------------+----------------+----------------+
> >>>>> |                 |    Twitter    |     >50 Janks  |     0          |
> >>>>> |     SCROLL      +---------------+----------------+----------------+
> >>>>> |                 |    Contacts   |     >30 Janks  |     0          |
> >>>>> +-----------------+---------------+----------------+----------------+
> >>>>>
> >>>>> Please let us know if you need any support with running further
> >>>>> workloads for the PELT32/PELT8 experiments; we can help with running
> >>>>> them.
> >>>>
> >>>> Thanks a lot for the test results. Was this tried with this patch alone or
> >>>> the whole series applied?
> >>>>
> >>> I have only applied patch 8 (sched/pelt: Introduce PELT multiplier) for the tests.
> >>>
> >>>> Have you tried tweaking each policy's response_time_ms introduced in
> >>>> patch 7 instead? With the series applied, boot with PELT8, record the
> >>>> response_time_ms values for each policy, then boot back again with
> >>>> PELT32 and use those values. Does this produce similar results?
> >>>>
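
For reference, this is roughly the experiment I have in mind. A minimal
sketch in Python, assuming the patch 7 tunable ends up next to schedutil's
rate_limit_us, i.e. under
/sys/devices/system/cpu/cpufreq/policy*/schedutil/response_time_ms (the
exact path may differ in the final version):

  #!/usr/bin/env python3
  # Record the per-policy response_time_ms values (e.g. while booted with
  # PELT8) into a json file, or write them back (e.g. after booting back
  # with PELT32). The sysfs path is an assumption based on where schedutil's
  # rate_limit_us lives; restoring needs root.
  import glob, json, sys

  PATTERN = "/sys/devices/system/cpu/cpufreq/policy*/schedutil/response_time_ms"
  STATE = "response_time_ms.json"

  def record():
      vals = {p: open(p).read().strip() for p in glob.glob(PATTERN)}
      json.dump(vals, open(STATE, "w"), indent=2)

  def restore():
      for path, val in json.load(open(STATE)).items():
          with open(path, "w") as f:
              f.write(val)

  if __name__ == "__main__":
      record() if "record" in sys.argv[1:] else restore()

Booting back with PELT32 and restoring the recorded values should then tell
us whether the per-policy tunable alone can reproduce the PELT8 behavior.
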
> >>> As the device is based on the 5.15 kernel, I will try to pull all 8
> >>> patches along with the dependency patches onto 5.15 and try out the
> >>> experiments as suggested.
> >>
> >> Generally speaking, it would be better to compare against the latest
> >> kernel, or at least one close to it, which includes the new features
> >> added since v5.15 (which is more than 2 years old now). I understand
> >> that this is not always easy or doable, but you could be surprised by
> >> the benefit of some features merged since v5.15, like [0].
> >>
> >> [0] https://lore.kernel.org/lkml/249816c9-c2b5-8016-f9ce-dab7b7d384e4@arm.com/
> >>
> > Thank you Vincent for the suggestion, I will try to get the results on
> > a device running the most recent kernel and update.
> > 
> > Thanks,
> > Ashay Jaiswal
> 
> Hello Qais Yousef and Vincent,
> 
> Sorry for the delay, setting up the internal device on the latest kernel is
> taking more time than anticipated. We are bringing up the latest kernel on
> the device and will complete the testing with the latest cpufreq patches as
> you suggested.
> 
> Regarding the PELT multiplier patch [1], are we planning to merge it
> separately, or will it be merged together with the cpufreq patches?
> 
> [1]: https://lore.kernel.org/all/20231208002342.367117-9-qyousef@layalina.io/

I am working on an updated version. I've been analysing the problem further
since the last posting and found more issues to fix to improve the response
time of the system in terms of migration and DVFS.

For the PELT multiplier, I now have an updated patch based on Vincent's
suggestion of a different implementation that doesn't require a new clock.
I will include it, but I haven't tested this part so far.

I hope to send the new version soon. I will CC you so you can try it and see
if these improvements help. In my view the boot-time PELT multiplier is only
necessary to help low-end types of systems where the max performance is
relatively low; since the scheduler has a constant model (and response time),
such systems need a different default behavior so workloads reach this max
performance point faster.
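
To illustrate the point, a rough back-of-the-envelope calculation (ignoring
util_est and the discrete 1024us accumulation, so a first-order approximation
only): a task that runs continuously from idle ramps its util as roughly
1 - 0.5^(t/halflife).

  # Approximate time for a continuously running task to reach a given
  # fraction of max util, for different PELT halflives. First-order
  # approximation only.
  import math

  def time_to_reach(frac, halflife_ms):
      return halflife_ms * math.log2(1.0 / (1.0 - frac))

  for hl_ms in (32, 16, 8):
      print(f"halflife {hl_ms:2d}ms: ~{time_to_reach(0.95, hl_ms):.0f}ms to reach 95% of max util")

So roughly ~138ms with PELT32 versus ~35ms with PELT8 to approach the top
OPP; on a low-end system whose fmax is already modest, that difference is
where I see the boot-time multiplier being justified.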

I can potentially see a powerful system needing that too, but IMHO the
trade-off with power will be very costly. If everything goes to plan, I hope
we can introduce a per-task util_est_faster like Peter suggested in an
earlier discussion, which should help workloads that need best
single-threaded (ST) perf to reach it faster without changing the system's
default behavior.

The biggest challenge is handling those bursty tasks, and I hope the proposal
I am working on will put us in the right direction towards a better default
behavior for them.

If you have any analysis of why you think faster PELT helps, it would be
great if you could share it.
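
As a starting point, a simple way to quantify it would be to measure how
quickly the big cluster ramps to fmax after a bursty task starts, under
PELT32 vs PELT8 (or vs a tuned response_time_ms). A minimal sketch, with the
CPU number, sampling period and window as assumptions to adjust for the
device:

  # Sample scaling_cur_freq on an assumed big CPU right after kicking off a
  # bursty workload, and report how long it takes to first hit
  # scaling_max_freq.
  import time

  CPU = 7  # assumed big core
  base = f"/sys/devices/system/cpu/cpu{CPU}/cpufreq"
  fmax = int(open(f"{base}/scaling_max_freq").read())

  samples = []
  t0 = time.monotonic()
  while time.monotonic() - t0 < 1.0:      # 1s window after the burst starts
      samples.append((time.monotonic() - t0,
                      int(open(f"{base}/scaling_cur_freq").read())))
      time.sleep(0.005)                   # ~5ms sampling period

  hit = next((t for t, f in samples if f >= fmax), None)
  print(f"time to reach fmax: {hit * 1000:.0f}ms" if hit is not None
        else "fmax not reached")

Comparing those numbers (and the jank counts) across PELT32, PELT8 and a
tuned response_time_ms would tell us whether the benefit really comes from
the faster PELT signal or mostly from the more aggressive DVFS response.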


Cheers

--
Qais Yousef


Thread overview: 24+ messages
2023-12-08  0:23 [PATCH v2 0/8] sched: cpufreq: Remove magic hardcoded numbers from margins Qais Yousef
2023-12-08  0:23 ` [PATCH v2 1/8] cpufreq: Change default transition delay to 2ms Qais Yousef
2023-12-08  0:23 ` [PATCH v2 2/8] sched: cpufreq: Rename map_util_perf to apply_dvfs_headroom Qais Yousef
2023-12-08  0:23 ` [PATCH v2 3/8] sched/pelt: Add a new function to approximate the future util_avg value Qais Yousef
2023-12-08  0:23 ` [PATCH v2 4/8] sched/pelt: Add a new function to approximate runtime to reach given util Qais Yousef
2023-12-08  0:23 ` [PATCH v2 5/8] sched/fair: Remove magic hardcoded margin in fits_capacity() Qais Yousef
2023-12-08  0:23 ` [PATCH v2 6/8] sched: cpufreq: Remove magic 1.25 headroom from apply_dvfs_headroom() Qais Yousef
2023-12-08  0:23 ` [PATCH v2 7/8] sched/schedutil: Add a new tunable to dictate response time Qais Yousef
2023-12-08 18:06   ` Rafael J. Wysocki
2023-12-10 20:40     ` Qais Yousef
2023-12-11 20:20       ` Rafael J. Wysocki
2023-12-12 13:16         ` Qais Yousef
2024-02-01 22:31   ` Qais Yousef
2023-12-08  0:23 ` [PATCH v2 8/8] sched/pelt: Introduce PELT multiplier Qais Yousef
2024-01-20  7:52   ` Ashay Jaiswal
2024-01-21  0:04     ` Qais Yousef
2024-01-28 16:21       ` Ashay Jaiswal
2024-01-30 17:28         ` Vincent Guittot
2024-02-06 17:07           ` Ashay Jaiswal
2024-04-12 10:06             ` Ashay Jaiswal
2024-04-19 13:19               ` Qais Yousef [this message]
2024-01-30 17:38   ` Vincent Guittot
2024-02-01 22:24     ` Qais Yousef
2024-02-04 11:32       ` Vincent Guittot
