From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: Re: [PATCH v2] accel: ethosu: Add performance counter support Date: Mon, 25 May 2026 17:53:03 +1000 Message-ID: In-Reply-To: References: <20260515032625.1880618-1-robh@kernel.org> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **UAPI issues:** 1. **`drm_ethosu_npu_info` breaks existing ABI** (ethosu_accel.h). Adding `= pmu_counters` after `sram_size` changes the struct size. Since this struct = is used with `DRM_IOCTL_ETHOSU(R, ...)`, the ioctl number encodes the struc= t size. Old userspace calling `GET_NPU_INFO` will use a different ioctl num= ber and get `-ENOTTY`. This is likely acceptable for a new driver that hasn= 't shipped a stable ABI yet, but should be explicitly called out in the com= mit message. 2. **`drm_ethosu_perfmon_create` has no padding field** =E2=80=94 The struc= t is 24 bytes (`__u32 id` + `__u32 ncounters` + `__u16 counters[8]`), which= is 8-byte aligned. This is fine, but `ncounters` is modified by the kernel= (incremented to include cycle counter at line 1553: `req->ncounters++`). T= he DRM ioctl infrastructure copies data back to userspace for WR ioctls, so= this modified value will be visible to userspace. This is a UAPI concern = =E2=80=94 userspace passes N but gets N+1 back. This should either be docum= ented explicitly or handled differently (e.g., don't modify `req->ncounters= ` in place). 3. **`drm_ethosu_perfmon_destroy` pad field is never checked** =E2=80=94 v3= added a `__u32 pad` field for alignment, but the destroy ioctl handler nev= er validates `req->pad =3D=3D 0`. This is inconsistent with `perfmon_get_va= lues` which does check its pad field. **Locking / concurrency issues:** 4. **`global_perfmon` accessed without lock protection in `ethosu_switch_pe= rfmon`** (ethosu_job.c, around line 1290): ```c perfmon =3D ethosu->global_perfmon; ``` This reads `global_perfmon` under `perfmon_state.lock`, which is correct. H= owever, `ethosu_ioctl_perfmon_set_global` also accesses `global_perfmon` un= der the same lock, so this is actually fine. Good. 5. **`ethosu_perfmon_start` called under mutex in `ethosu_switch_perfmon` b= ut accesses MMIO registers** =E2=80=94 `ethosu_perfmon_start` does `writel_= relaxed` to program counters. This is called from `ethosu_job_run` which is= the scheduler's `run_job` callback. The function holds `perfmon_state.lock= ` (mutex). Since the NPU should already be powered at this point (job is be= ing submitted), this should be okay, but there's no `pm_runtime_get` protec= ting these register writes, unlike `ethosu_perfmon_stop_locked` which does = call `pm_runtime_get_if_active`. If the device gets suspended between the f= ence init and the MMIO writes, the register accesses would hit powered-down= hardware. The `job_lock` mutex below should serialize with the IRQ handler= , but pm_runtime is the concern. 6. **Race in `ethosu_perfmon_find`** (ethosu_perfmon.c, around line 1483-14= 93): ```c xa_lock(ðosu_priv->perfmons); perfmon =3D xa_load(ðosu_priv->perfmons, id); ethosu_perfmon_get(perfmon); xa_unlock(ðosu_priv->perfmons); ``` Using `xa_lock`/`xa_unlock` manually here works for serializing the load+re= fcount_inc, but `xa_erase` in `perfmon_destroy` does not hold this lock =E2= =80=94 it uses its own internal locking. There is a TOCTOU window: `perfmon= _find` could load and start incrementing the refcount on a perfmon that `pe= rfmon_destroy` just erased and is about to `perfmon_delete`. The `refcount_= inc` in `ethosu_perfmon_get` will WARN if the refcount is already 0. This n= eeds the xa_lock to be held across both the erase and the first put, or use= `xa_cmpxchg`. 7. **`ethosu_perfmon_get` uses `refcount_inc` which will WARN/BUG if refcou= nt is 0** =E2=80=94 If `ethosu_perfmon_find` races with destroy (see above)= , this could trigger a refcount saturation warning. Using `refcount_inc_not= _zero` and returning NULL on failure would be safer. **Correctness issues:** 8. **`ethosu_perfmon_find` can return NULL perfmon but `ethosu_ioctl_submit= _job` doesn't check** (ethosu_job.c, around line 1331-1332): ```c if (perfmon_id) ejob->perfmon =3D ethosu_perfmon_find(file_priv, perfmon_id); ``` If `perfmon_id` is nonzero but the perfmon doesn't exist (or was destroyed)= , `ethosu_perfmon_find` returns NULL and the job proceeds without a perfmon= rather than returning an error. This should return `-ENOENT` or `-EINVAL` = if the perfmon_id was specified but not found. 9. **`perfmon_get_values` stops the perfmon then doesn't restart it** (etho= su_perfmon.c, around line 1616): ```c ethosu_perfmon_stop(ethosu, perfmon, true); ``` After reading values, the perfmon is stopped and `perfmon_state.active` is = set to NULL. If this was a global perfmon actively counting, it won't be re= started until the next job switches perfmon. The comment says "No implicit = synchronization is performed, so the user has to guarantee that any jobs us= ing this perfmon have already been completed" =E2=80=94 but for a *global* = perfmon this semantic is awkward, since the user may want to read accumulat= ed values mid-stream. The behavior should at least be documented. 10. **Magic number `0x80000000` for cycle counter enable bit** (ethosu_perf= mon.c, lines ~1421, 1462, 1549): ```c mask =3D 0x80000000; ``` This appears to be bit 31 of the PMCNTENSET/PMCNTENCLR register, presumably= the cycle counter enable. It should be a named define (e.g., `PMCNTEN_CCNT= _EN BIT(31)`) for readability, consistent with the other PMU defines. 11. **v2 patch is included in the same mbox as v3** =E2=80=94 This will con= fuse automated tooling that tries to apply patches from the mbox. Only the = latest version should typically be in a single mbox for review/application = purposes. This appears to be an artifact of the filtering though, not the s= ubmitter's fault. **Minor issues:** 12. **Missing error path for `IS_ERR(ethosudev->regs)` propagation** =E2=80= =94 The existing code at `ethosu_probe` line 345 already has this bug (no e= rror check on `devm_platform_ioremap_resource`), but the patch adds `ethosu= dev->pmu_regs =3D ethosudev->regs` right after, inheriting the issue. The `= pmu_regs +=3D 0x1000` adjustment in `ethosu_init` would then corrupt an ERR= _PTR. Not introduced by this patch, but worth noting. 13. **`drm_ethosu_perfmon_create.ncounters` should validate `ncounters > 0`= ** =E2=80=94 If userspace passes `ncounters =3D 0`, the code increments to = 1 (for the cycle counter), allocates a perfmon with 1 value slot, and sets = `ncounters - 1 =3D 0` event counters. This technically works (only the cycl= e counter is used), but the intent is unclear =E2=80=94 should a perfmon wi= th zero event counters be allowed? 14. **Comment says "perform" instead of "perfmon"** =E2=80=94 In `ethosu_dr= v.h` (v2 only), the comment reads "The perform lifetime is controlled by us= erspace". v3 fixes this to "The perfmon lifetime...". Good. **Summary:** The overall design is sound and follows the V3D pattern well. = The most critical issues are the race between find and destroy (#6/#7), the= missing NULL-check after `perfmon_find` in the submit path (#8), and the U= API semantics around `ncounters` being modified in-place (#2). The pm_runti= me concern in `perfmon_start` (#5) deserves investigation. The rest are sty= le/documentation issues. --- Generated by Claude Code Patch Reviewer