Claude review: accel: ethosu: Add performance counter support

public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed

From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: accel: ethosu: Add performance counter support
Date: Sat, 16 May 2026 10:08:39 +1000	[thread overview]
Message-ID: <review-patch1-20260515032625.1880618-1-robh@kernel.org> (raw)
In-Reply-To: <20260515032625.1880618-1-robh@kernel.org>

Patch Review

**Bug: Uninitialized `pmu_regs` offset for non-U65**

In `ethosu_drv.c`, the probe function sets:
```c
ethosudev->pmu_regs = ethosudev->regs;
```
Then in `ethosu_init()`:
```c
if (!ethosu_is_u65(ethosudev))
    ethosudev->pmu_regs += 0x1000;
```

This `+= 0x1000` is applied to the `void __iomem *` pointer. However, at this point `pmu_regs` was already set to `ethosudev->regs` in `ethosu_probe()`. If `ethosu_init()` were ever called a second time (e.g. via a resume path or error-retry), the offset would be applied twice. More importantly, the `devm_platform_ioremap_resource()` mapping might not cover the extra 0x1000 offset — the resource size should be validated. Suggest using an explicit assignment `ethosudev->pmu_regs = ethosudev->regs + 0x1000;` instead of `+=` to make it idempotent and clearer.

**UAPI: `drm_ethosu_npu_info` struct packing changed without padding**

Adding `pmu_counters` (__u32) to `drm_ethosu_npu_info`:
```c
struct drm_ethosu_npu_info {
    __u32 id;
    __u32 config;
    __u32 sram_size;
    __u32 pmu_counters;  /* new */
};
```
This changes the struct size from 12 to 16 bytes. Since this struct is used via `DRM_ETHOSU_DEV_QUERY` with a size-based copy mechanism (`min(size, actual_structure_size)`), this is designed to be extensible and should be fine for forward/backward compatibility. However, the original 12-byte struct had no padding for natural alignment — was 12 bytes intentional, or should there have been an explicit pad field already? This is worth verifying that existing userspace won't break, since `sizeof(struct drm_ethosu_npu_info)` changes.

**UAPI: `drm_ethosu_perfmon_create` alignment/padding issues**

```c
struct drm_ethosu_perfmon_create {
    __u32 id;
    __u32 ncounters;
    __u16 counters[DRM_ETHOSU_MAX_PERF_EVENT_COUNTERS]; /* 8 * 2 = 16 bytes */
};
```
Total: 4 + 4 + 16 = 24 bytes. The struct has no trailing padding issue per se, but the `__u16` array following `__u32` fields is aligned fine. However, it would be good practice to add a `__u32 pad` for future extensibility since this is ioctl-level ABI.

**UAPI: `drm_ethosu_perfmon_destroy` is only 4 bytes**

```c
struct drm_ethosu_perfmon_destroy {
    __u32 id;
};
```
A 4-byte UAPI struct is unusual. Consider adding a `__u32 pad;` for future extensibility and to ensure 8-byte alignment for the ioctl struct.

**UAPI: `drm_ethosu_submit.pad` field repurposed**

The existing `pad` field (which was required to be zero) is repurposed as `perfmon_id`:
```c
-	/** Reserved, must be zero. */
-	__u32 pad;
+	/** Input: Id returned by DRM_ETHOSU_PERFMON_CREATE */
+	__u32 perfmon_id;
```
The validation of `pad == 0` is removed:
```c
-	if (args->pad) {
-		drm_dbg(dev, "Reserved field in drm_ethosu_submit struct should be 0.\n");
-		return -EINVAL;
-	}
```
This is the correct way to reuse a reserved field. Since `XA_FLAGS_ALLOC1` is used, valid perfmon IDs start at 1 and `perfmon_id == 0` means "no perfmon," which is backward-compatible. Good.

**Locking: `ethosu_perfmon_start` called under spinlock but not annotated**

In `ethosu_switch_perfmon()`:
```c
guard(spinlock)(&ethosu->perfmon_state.lock);
...
ethosu_perfmon_stop_locked(ethosu, ethosu->perfmon_state.active, true);
if (perfmon && ethosu->perfmon_state.active != perfmon)
    ethosu_perfmon_start(ethosu, perfmon);
```
`ethosu_perfmon_start()` is called under `perfmon_state.lock` (a spinlock), and it does `writel_relaxed()` which is fine from a sleeping context. But this function does MMIO writes — the spinlock ensures atomicity with respect to the stop path, which is correct. However, `ethosu_perfmon_start()` has a `WARN_ON_ONCE(!perfmon || ethosu->perfmon_state.active)` check — by the time we get there after the stop, `active` should be NULL. The second condition in the `if` check `ethosu->perfmon_state.active != perfmon` is redundant after the early return for `perfmon == active` case, since `stop_locked` sets `active = NULL`. Not a bug, just slightly confusing logic.

**Potential data race: `global_perfmon` read without lock**

In `ethosu_switch_perfmon()`:
```c
guard(spinlock)(&ethosu->perfmon_state.lock);
perfmon = READ_ONCE(ethosu->global_perfmon);
```
The `READ_ONCE` is good, but `global_perfmon` is also written in `ethosu_perfmon_set_global()` under the same spinlock, and in `ethosu_perfmon_delete()` under `spinlock_irqsave`. The lock coverage is consistent, which is correct. However, the `global_perfmon` pointer is set with a refcount from `ethosu_perfmon_find()` in `set_global`, but there's no corresponding `ethosu_perfmon_get()` when it's used in `switch_perfmon`. If the global perfmon is concurrently destroyed (via `ethosu_perfmon_delete` which drops the ref), the pointer used in `switch_perfmon` could reference freed memory after the spinlock is dropped. In practice, since `switch_perfmon` only uses the pointer within the spinlock scope and `delete` also takes the spinlock, this is safe — but the design is fragile.

**`ethosu_perfmon_stop_locked`: `pm_runtime_get_if_active` returns int, not bool**

```c
if (!pm_runtime_get_if_active(ethosu->base.dev)) {
```
`pm_runtime_get_if_active()` can return negative error codes, 0 (device not active), or positive (success). The `!` check treats negative errors the same as "not active" — this is probably fine (if PM errors out, bailing is reasonable), but it's worth noting.

**`ethosu_perfmon_stop_locked`: called from spinlock but calls `pm_runtime_put`**

```c
void ethosu_perfmon_stop_locked(...)
{
    ...
    pm_runtime_put(ethosu->base.dev);
}
```
`pm_runtime_put()` may sleep (it can trigger `runtime_suspend` callbacks synchronously). This is called under a spinlock context (from `ethosu_switch_perfmon` via `guard(spinlock)`). This is a **bug** — sleeping in atomic context. Should use `pm_runtime_put_autosuspend()` or `pm_runtime_mark_last_busy()` + `pm_runtime_put_autosuspend()` instead, or better yet, defer the pm_runtime_put to after the spinlock is released.

Similarly, `pm_runtime_get_if_active()` should also be checked for sleeping behavior, though that one is documented as safe in atomic context.

**`ethosu_ioctl_perfmon_get_values`: stopping a perfmon that may not be active**

```c
ethosu_perfmon_stop(ethosu, perfmon, true);
```
This stops the perfmon to capture current values, but doesn't restart it afterward. If this was a global perfmon that was running, it will be stopped permanently after a get_values call. The docstring says "no implicit synchronization" but the stop side-effect is implicit. This seems intentional based on the V3D model, but worth documenting.

**`ethosu_ioctl_perfmon_set_global`: no check that perfmon_id matches the perfmon being cleared**

When `DRM_ETHOSU_PERFMON_CLEAR_GLOBAL` is set:
```c
perfmon = ethosu_perfmon_find(ethosu_priv, req->id);
if (!perfmon)
    return -EINVAL;
...
if (req->flags & DRM_ETHOSU_PERFMON_CLEAR_GLOBAL) {
    old = ethosu->global_perfmon;
    if (!old) { ... return -EINVAL; }
    ethosu->global_perfmon = NULL;
```
The `req->id` is used to find a perfmon, but then the code doesn't verify that the found perfmon matches the current `global_perfmon`. Any valid perfmon ID will clear any global perfmon. The found `perfmon` is only used for the `find` validity check, then immediately put. This is potentially confusing API — either the `id` should be validated against the current global, or the clear operation shouldn't require a valid `id` at all.

**Missing `#include` for flex array / `kzalloc_flex` in ethosu_perfmon.c**

The code uses `kzalloc_flex()` but I don't see an explicit include for whatever header provides it. It likely comes transitively through `slab.h` or one of the DRM headers. Should be fine but worth verifying builds cleanly.

**`ethosu_job_err_cleanup`: perfmon_put before BO cleanup**

```c
static void ethosu_job_err_cleanup(struct ethosu_job *job)
{
    ethosu_perfmon_put(job->perfmon);
    for (i = 0; i < job->region_cnt; i++)
        drm_gem_object_put(job->region_bo[i]);
```
This is fine — just noting that `perfmon_put` is added at the top. If `job->perfmon` is NULL (no perfmon attached), the `ethosu_perfmon_put` NULL-checks internally, which is correct.

**Missing error check for `ethosu_perfmon_find` in submit path**

```c
if (perfmon_id)
    ejob->perfmon = ethosu_perfmon_find(file_priv, perfmon_id);
```
If the user passes a nonzero `perfmon_id` that doesn't exist in the xarray, `ethosu_perfmon_find` returns NULL (xa_load returns NULL for missing entries). The job proceeds with `perfmon == NULL`, silently ignoring the invalid perfmon ID. This should return `-EINVAL` if the user specified a perfmon_id but it wasn't found:
```c
if (perfmon_id) {
    ejob->perfmon = ethosu_perfmon_find(file_priv, perfmon_id);
    if (!ejob->perfmon)
        return -EINVAL;
}
```

**`counters` array in `ethosu_perfmon` struct is fixed-size but `values` is flex**

```c
struct ethosu_perfmon {
    u8 ncounters;
    u16 counters[DRM_ETHOSU_MAX_PERF_EVENT_COUNTERS];
    u64 values[] __counted_by(ncounters);
};
```
`ncounters` includes the cycle counter (+1), so it can be up to 9 (DRM_ETHOSU_MAX_PERF_COUNTERS). The `counters` array is sized `DRM_ETHOSU_MAX_PERF_EVENT_COUNTERS` (8), which stores only the user-specified event counters (not cycles). `values` is `ncounters` entries which includes cycles. This mismatch between `counters` array size (8 event types) and `values` array size (up to 9 values) is intentional but the mixed use of `ncounters` and `ncounters - 1` throughout the code is error-prone and hard to follow. Consider using a separate field for the event counter count vs total counter count.

**Minor: SPDX license tag inconsistency**

`ethosu_perfmon.c` header:
```c
// SPDX-License-Identifier: GPL-2.0-only or MIT
```
Other files use:
```c
// SPDX-License-Identifier: GPL-2.0-only OR MIT
```
The SPDX spec requires uppercase `OR`. This should be `GPL-2.0-only OR MIT`.

**Summary of issues by severity:**

1. **Bug**: `pm_runtime_put()` called under spinlock in `ethosu_perfmon_stop_locked` — sleeping in atomic context
2. **Bug**: Missing error check when `ethosu_perfmon_find` returns NULL in the submit path (invalid perfmon_id silently ignored)
3. **Issue**: `pmu_regs += 0x1000` is not idempotent — should be explicit assignment
4. **UAPI**: Several structs could benefit from pad fields for future ABI extensibility
5. **Minor**: SPDX license tag uses lowercase `or` instead of `OR`
6. **Style**: Pervasive `ncounters - 1` pattern throughout the code is fragile; a named field would be clearer

---
Generated by Claude Code Patch Reviewer

next prev parent reply	other threads:[~2026-05-16  0:08 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15  3:26 [PATCH v2] accel: ethosu: Add performance counter support Rob Herring (Arm)
2026-05-16  0:08 ` Claude review: " Claude Code Review Bot
2026-05-16  0:08 ` Claude Code Review Bot [this message]
  -- strict thread matches above, loose matches on Subject: below --
2026-03-06 16:36 [PATCH] " Rob Herring (Arm)
2026-03-08 22:34 ` Claude review: " Claude Code Review Bot
2026-03-08 22:34 ` Claude Code Review Bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=review-patch1-20260515032625.1880618-1-robh@kernel.org \
    --to=claude-review@example.com \
    --cc=dri-devel-reviews@example.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox