All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: 雷博涵 <garthlei@pku.edu.cn>
To: Atish Patra <atishp@atishpatra.org>
Cc: linux-riscv@lists.infradead.org, Atish Patra <atishp@rivosinc.com>
Subject: Re: RISC-V PMU driver issue
Date: Wed, 13 Mar 2024 12:29:02 +0800	[thread overview]
Message-ID: <022728E0-4238-4EC4-A6CC-71C62E114019@pku.edu.cn> (raw)
In-Reply-To: <CAOnJCUK7OMq97y__M98G1r24zQFQbkyNgxrVdj5tVFJF6znXrA@mail.gmail.com>

Sorry for the duplicated message. The last one was an HTML e-mail so I resend this.

> 2024年3月13日 08:02,Atish Patra <atishp@atishpatra.org> 写道:
> 
> On Tue, Feb 27, 2024 at 7:49 PM 雷博涵 <garthlei@pku.edu.cn> wrote:
>> 
>> Hi all,
>> 
>> I am having problems with the RISC-V PMU driver. The overflow handler of my custom perf event kernel counter seems to read an incorrect value from the event.
> 
> If I understand you correctly, you have a custom kernel counter for
> the perf event ? Can you explain the use case for that ?

Actually it was the perf-event backend of an open-source tool called PMCTrack (https://github.com/jcsaezal/pmctrack), which I was trying to port to RISC-V. It creates a kernel counter using `perf_event_create_kernel_counter`, registering an overflow handler. The registered overflow handler calls `perf_event_read_value` to read the counter value.

> 
>> It seems that the issue lies in the function `riscv_pmu_event_set_period`, which sets `prev_count` to a new value but does not modify the underlying hardware counter. When `perf_event_read_value` gets called later in the user-defined overflow handler, `riscv_pmu_event_update` will update the `count` field again based on the unmodified hardware counter value and the modified `prev_count` field, which causes an incorrect reading.
> 
> What do you mean by user defined overflow handler ? The overflow
> handler registered via the custom perf event kernel counter which gets
> invoked from perf_event_overflow ?

Yes.

> 
>> I noticed that other PMU drivers, such as the ARM one, write to the underlying counter in their set_period functions, which prevents the problem. However, the RISC-V SBI specification does not have such an API to write to a counter without starting it. Using `local64_read(&hw_evt->period_left) <= 0` directly as the guard condition to place `riscv_pmu_event_set_period(event)` after it seems to work, but I am not sure whether it can cause other issues.
> 
> Not sure the exact code path you are referring to. You don't want to
> invoke  riscv_pmu_event_set_period if period_left <= 0 ?

First, `pmu_sbi_ovf_handler` calls `riscv_pmu_event_update`, so the perf event counter value (`event->count`) is updated. It calls `riscv_pmu_event_set_period` later, which calls `local64_set(&hwc->prev_count, (u64)-left)`. When `perf_event_read_value` gets invoked in the registered overflow handler, `riscv_pmu_event_update` will be invoked eventually. Because `prev_count` has been set to `(u64)-left` and the underlying hardware counter value (the value read via `rvpmu->ctr_read(event)`) has not been changed, `event->count` will be updated again, and the value this time will be incorrect.

I have found that the ARM PMU does not have the problem. The function `armpmu_event_set_period` in `arm_pmu.c` calls `armpmu->write_counter(event, (u64)(-left) & max_period)` after `local64_set(&hwc->prev_count, (u64)-left)`, so that the underlying hardware counter is also set to a new value. Thus, when `armpmu_event_update` gets called later in the registered handler, it will find that the underlying counter has the same value as `prev_count`, so `event->count` can keep the correct value.

However, as I mentioned before, the RISC-V SBI specification does not have such an API as “write_counter,” so the ARM solution could not be adapted easily. I was wondering if we could postpone `riscv_pmu_event_set_period` after the if-block, like this:

```
if (local64_read(&hw_evt->period_left) <= 0) {
  perf_event_overflow(event, &data, regs);
}
riscv_pmu_event_set_period(event);
```

This way `perf_event_read_value` can return the correct counter value. `riscv_pmu_event_set_period` gets invoked after perf_event_overflow. However, I am not sure if it can cause other problems.

> 
>> 
>> Thank you,
>> Bohan Lei
> 
> 
> -- 
> Regards,
> Atish

--
Regards,
Bohan Lei

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2024-03-13  4:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28  3:49 RISC-V PMU driver issue 雷博涵
2024-03-13  0:02 ` Atish Patra
2024-03-13  4:29   ` 雷博涵 [this message]
2024-03-18 22:18     ` Atish Patra
2024-04-27  0:07       ` Atish Patra
     [not found]         ` <2451aea.567f.18f3c8baa06.Coremail.garthlei@pku.edu.cn>
     [not found]           ` <CAHBxVyEotFmCfc+Ym-p7f=WAocAXtNA_z4MvOr7Ky8fXxPtVXA@mail.gmail.com>
2024-05-08  3:01             ` Re: " 雷博涵
2024-05-08  6:57               ` Atish Kumar Patra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=022728E0-4238-4EC4-A6CC-71C62E114019@pku.edu.cn \
    --to=garthlei@pku.edu.cn \
    --cc=atishp@atishpatra.org \
    --cc=atishp@rivosinc.com \
    --cc=linux-riscv@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.