* [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes
@ 2026-02-18 22:21 Rob Herring (Arm)
2026-02-18 22:21 ` [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows Rob Herring (Arm)
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Rob Herring (Arm) @ 2026-02-18 22:21 UTC (permalink / raw)
To: Tomeu Vizoso, Anders Roxell, Oded Gabbay, Thomas Zimmermann,
Frank Li
Cc: dri-devel, linux-kernel
This is series of fixes for issues I found in testing additional models
and adding more supported operations in mesa.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
Rob Herring (Arm) (3):
accel: ethosu: Fix job submit error clean-up refcount underflows
accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar
accel: ethosu: Handle possible underflow in IFM size calculations
drivers/accel/ethosu/ethosu_gem.c | 12 +++++++++---
drivers/accel/ethosu/ethosu_job.c | 26 ++++++++++++++++++--------
2 files changed, 27 insertions(+), 11 deletions(-)
---
base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
change-id: 20260218-ethos-fixes-e3508a579582
Best regards,
--
Rob Herring (Arm) <robh@kernel.org>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows
2026-02-18 22:21 [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Rob Herring (Arm)
@ 2026-02-18 22:21 ` Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 22:21 ` [PATCH 2/3] accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar Rob Herring (Arm)
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Rob Herring (Arm) @ 2026-02-18 22:21 UTC (permalink / raw)
To: Tomeu Vizoso, Anders Roxell, Oded Gabbay, Thomas Zimmermann,
Frank Li
Cc: dri-devel, linux-kernel
If the job submit fails before adding the job to the scheduler queue
such as when the GEM buffer bounds checks fail, then doing a
ethosu_job_put() results in a pm_runtime_put_autosuspend() without the
corresponding pm_runtime_resume_and_get(). The dma_fence_put()'s are
also unnecessary, but seem to be harmless.
Split the ethosu_job_cleanup() function into 2 parts for the before
and after the job is queued.
Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver")
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/accel/ethosu/ethosu_job.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/accel/ethosu/ethosu_job.c b/drivers/accel/ethosu/ethosu_job.c
index 26e7a2f64d71..70a144803b09 100644
--- a/drivers/accel/ethosu/ethosu_job.c
+++ b/drivers/accel/ethosu/ethosu_job.c
@@ -143,23 +143,29 @@ static int ethosu_job_push(struct ethosu_job *job)
return ret;
}
+static void ethosu_job_err_cleanup(struct ethosu_job *job)
+{
+ unsigned int i;
+
+ for (i = 0; i < job->region_cnt; i++)
+ drm_gem_object_put(job->region_bo[i]);
+
+ drm_gem_object_put(job->cmd_bo);
+
+ kfree(job);
+}
+
static void ethosu_job_cleanup(struct kref *ref)
{
struct ethosu_job *job = container_of(ref, struct ethosu_job,
refcount);
- unsigned int i;
pm_runtime_put_autosuspend(job->dev->base.dev);
dma_fence_put(job->done_fence);
dma_fence_put(job->inference_done_fence);
- for (i = 0; i < job->region_cnt; i++)
- drm_gem_object_put(job->region_bo[i]);
-
- drm_gem_object_put(job->cmd_bo);
-
- kfree(job);
+ ethosu_job_err_cleanup(job);
}
static void ethosu_job_put(struct ethosu_job *job)
@@ -454,12 +460,16 @@ static int ethosu_ioctl_submit_job(struct drm_device *dev, struct drm_file *file
}
}
ret = ethosu_job_push(ejob);
+ if (!ret) {
+ ethosu_job_put(ejob);
+ return 0;
+ }
out_cleanup_job:
if (ret)
drm_sched_job_cleanup(&ejob->base);
out_put_job:
- ethosu_job_put(ejob);
+ ethosu_job_err_cleanup(ejob);
return ret;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/3] accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar
2026-02-18 22:21 [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Rob Herring (Arm)
2026-02-18 22:21 ` [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows Rob Herring (Arm)
@ 2026-02-18 22:21 ` Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 22:21 ` [PATCH 3/3] accel: ethosu: Handle possible underflow in IFM size calculations Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Claude Code Review Bot
3 siblings, 1 reply; 8+ messages in thread
From: Rob Herring (Arm) @ 2026-02-18 22:21 UTC (permalink / raw)
To: Tomeu Vizoso, Anders Roxell, Oded Gabbay, Thomas Zimmermann,
Frank Li
Cc: dri-devel, linux-kernel
The NPU_OP_ELEMENTWISE instruction uses a scalar value for IFM2 if the
IFM2_BROADCAST "scalar" mode is set. It is a bit (7) on the u65 and
part of a field (bits 3:0) on the u85. The driver was hardcoded to the
u85.
Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver")
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/accel/ethosu/ethosu_gem.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/ethosu/ethosu_gem.c b/drivers/accel/ethosu/ethosu_gem.c
index 473b5f5d7514..a735f860a119 100644
--- a/drivers/accel/ethosu/ethosu_gem.c
+++ b/drivers/accel/ethosu/ethosu_gem.c
@@ -417,7 +417,10 @@ static int ethosu_gem_cmdstream_copy_and_validate(struct drm_device *ddev,
return ret;
break;
case NPU_OP_ELEMENTWISE:
- use_ifm2 = !((st.ifm2.broadcast == 8) || (param == 5) ||
+ use_scale = ethosu_is_u65(edev) ?
+ (st.ifm2.broadcast & 0x80) :
+ (st.ifm2.broadcast == 8);
+ use_ifm2 = !(use_scale || (param == 5) ||
(param == 6) || (param == 7) || (param == 0x24));
use_ifm = st.ifm.broadcast != 8;
ret = calc_sizes_elemwise(ddev, info, cmd, &st, use_ifm, use_ifm2);
--
2.51.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] accel: ethosu: Handle possible underflow in IFM size calculations
2026-02-18 22:21 [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Rob Herring (Arm)
2026-02-18 22:21 ` [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows Rob Herring (Arm)
2026-02-18 22:21 ` [PATCH 2/3] accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar Rob Herring (Arm)
@ 2026-02-18 22:21 ` Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 23:57 ` Claude review: accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Claude Code Review Bot
3 siblings, 1 reply; 8+ messages in thread
From: Rob Herring (Arm) @ 2026-02-18 22:21 UTC (permalink / raw)
To: Tomeu Vizoso, Anders Roxell, Oded Gabbay, Thomas Zimmermann,
Frank Li
Cc: dri-devel, linux-kernel
If the command stream has larger padding sizes than the IFM and OFM
diminsions, then the calculations will underflow to a negative value.
The result is a very large region bounds which is caught on submit, but
it's better to catch it earlier.
Current mesa ethosu driver has a signedness bug which resulted in
padding of 127 (the max) and triggers this issue.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/accel/ethosu/ethosu_gem.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/ethosu/ethosu_gem.c b/drivers/accel/ethosu/ethosu_gem.c
index a735f860a119..d1169001c83d 100644
--- a/drivers/accel/ethosu/ethosu_gem.c
+++ b/drivers/accel/ethosu/ethosu_gem.c
@@ -245,11 +245,14 @@ static int calc_sizes(struct drm_device *ddev,
((st->ifm.stride_kernel >> 1) & 0x1) + 1;
u32 stride_x = ((st->ifm.stride_kernel >> 5) & 0x2) +
(st->ifm.stride_kernel & 0x1) + 1;
- u32 ifm_height = st->ofm.height[2] * stride_y +
+ s32 ifm_height = st->ofm.height[2] * stride_y +
st->ifm.height[2] - (st->ifm.pad_top + st->ifm.pad_bottom);
- u32 ifm_width = st->ofm.width * stride_x +
+ s32 ifm_width = st->ofm.width * stride_x +
st->ifm.width - (st->ifm.pad_left + st->ifm.pad_right);
+ if (ifm_height < 0 || ifm_width < 0)
+ return -EINVAL;
+
len = feat_matrix_length(info, &st->ifm, ifm_width,
ifm_height, st->ifm.depth);
dev_dbg(ddev->dev, "op %d: IFM:%d:0x%llx-0x%llx\n",
--
2.51.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Claude review: accel: ethosu: Runtime PM refcounting and cmd stream validation fixes
2026-02-18 22:21 [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Rob Herring (Arm)
` (2 preceding siblings ...)
2026-02-18 22:21 ` [PATCH 3/3] accel: ethosu: Handle possible underflow in IFM size calculations Rob Herring (Arm)
@ 2026-02-18 23:57 ` Claude Code Review Bot
3 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-02-18 23:57 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: accel: ethosu: Runtime PM refcounting and cmd stream validation fixes
Author: "Rob Herring (Arm)" <robh@kernel.org>
Patches: 4
Reviewed: 2026-02-19T09:57:30.759643
---
This is a three-patch fix series for the Arm Ethos-U NPU accelerator driver, addressing runtime PM refcount underflow on job submit failure, incorrect scalar mode detection for elementwise operations on the u65, and potential integer underflow in IFM size calculations. The patches are from Rob Herring and fix real bugs found during testing with additional models.
The architectural approach in patch 1 — splitting cleanup into pre-queue and post-queue paths — is the right idea. The two hardware-specific fixes in patches 2 and 3 are straightforward and well-motivated. One concern in patch 1 is that `done_fence` (which is separately kzalloc'd) is not freed by `ethosu_job_err_cleanup`, resulting in a memory leak on the error paths that go through `out_cleanup_job` or `out_put_job`. This was also leaked in the original code (since `dma_fence_put` on a zero-initialized fence triggers a kref WARN and doesn't actually free), but it would be good to fix it properly while restructuring the cleanup code.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 8+ messages in thread
* Claude review: accel: ethosu: Fix job submit error clean-up refcount underflows
2026-02-18 22:21 ` [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows Rob Herring (Arm)
@ 2026-02-18 23:57 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-02-18 23:57 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
The commit message accurately describes the problem: when job submit fails before the job is queued (e.g. during GEM buffer bounds checks), calling `ethosu_job_put` leads to `ethosu_job_cleanup` which calls `pm_runtime_put_autosuspend` without a corresponding `pm_runtime_resume_and_get`. The fix correctly splits the cleanup into `ethosu_job_err_cleanup` (for pre-queue failures) and `ethosu_job_cleanup` (for post-queue teardown via kref).
However, there is a memory leak. `done_fence` is separately allocated:
> + ejob->done_fence = kzalloc(sizeof(*ejob->done_fence), GFP_KERNEL);
> + if (!ejob->done_fence) {
> + ret = -ENOMEM;
> + goto out_cleanup_job;
> + }
But `ethosu_job_err_cleanup` does not free it:
> +static void ethosu_job_err_cleanup(struct ethosu_job *job)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < job->region_cnt; i++)
> + drm_gem_object_put(job->region_bo[i]);
> +
> + drm_gem_object_put(job->cmd_bo);
> +
> + kfree(job);
> +}
Every error path through `out_cleanup_job` or `out_put_job` will leak the `done_fence` allocation. A `kfree(job->done_fence)` should be added to `ethosu_job_err_cleanup` before `kfree(job)`. (This was also leaked in the original code since `dma_fence_put` on a never-initialized fence doesn't actually free it, but this patch should fix it while it's restructuring the cleanup.)
A second concern is the `out_cleanup_job` label when reached from the `done_fence` allocation failure:
> out_cleanup_job:
> if (ret)
> drm_sched_job_cleanup(&ejob->base);
When `done_fence` allocation fails, we jump to `out_cleanup_job` and call `drm_sched_job_cleanup(&ejob->base)` even though `drm_sched_job_init` has not yet been called. The job was kzalloc'd so `ejob->base` is zeroed, but calling `drm_sched_job_cleanup` on an uninitialized scheduler job is at best relying on implementation details. This is a pre-existing issue not introduced by this patch, but it might be worth fixing while restructuring the error paths — perhaps by jumping to `out_put_job` instead.
The success path change looks correct:
> ret = ethosu_job_push(ejob);
> + if (!ret) {
> + ethosu_job_put(ejob);
> + return 0;
> + }
On success, `ethosu_job_push` has already called `pm_runtime_resume_and_get` and taken a kref, so `ethosu_job_put` drops the initial reference correctly.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 8+ messages in thread
* Claude review: accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar
2026-02-18 22:21 ` [PATCH 2/3] accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar Rob Herring (Arm)
@ 2026-02-18 23:57 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-02-18 23:57 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
The change correctly differentiates the u65 and u85 scalar detection for elementwise operations:
> + use_scale = ethosu_is_u65(edev) ?
> + (st.ifm2.broadcast & 0x80) :
> + (st.ifm2.broadcast == 8);
> + use_ifm2 = !(use_scale || (param == 5) ||
> (param == 6) || (param == 7) || (param == 0x24));
On the u65, scalar mode is indicated by bit 7 of the IFM2_BROADCAST register, while on the u85 the existing check `broadcast == 8` (i.e., bits 3:0 equal to 8) applies. The commit message explains the hardware difference clearly. The logic substitution from `!(A || B || C)` to `!(use_scale || B || C)` preserves the original structure correctly. No issues found.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 8+ messages in thread
* Claude review: accel: ethosu: Handle possible underflow in IFM size calculations
2026-02-18 22:21 ` [PATCH 3/3] accel: ethosu: Handle possible underflow in IFM size calculations Rob Herring (Arm)
@ 2026-02-18 23:57 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-02-18 23:57 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
The fix changes `ifm_height` and `ifm_width` from `u32` to `s32` and adds an early check for negative values:
> - u32 ifm_height = st->ofm.height[2] * stride_y +
> + s32 ifm_height = st->ofm.height[2] * stride_y +
> st->ifm.height[2] - (st->ifm.pad_top + st->ifm.pad_bottom);
> - u32 ifm_width = st->ofm.width * stride_x +
> + s32 ifm_width = st->ofm.width * stride_x +
> st->ifm.width - (st->ifm.pad_left + st->ifm.pad_right);
>
> + if (ifm_height < 0 || ifm_width < 0)
> + return -EINVAL;
The arithmetic itself is worth examining. `st->ofm.height[2]` is `u16`, `stride_y` is `u32` (max value 4), `st->ifm.height[2]` is `u16`, and `pad_top`/`pad_bottom` are `u8`. The maximum positive value is roughly `0xFFFF * 4 + 0xFFFF = 327675`, which fits comfortably in `s32`. The minimum value when padding dominates is `0 + 0 - (255 + 255) = -510`, which is representable in `s32`. So the type change is safe.
The values then get passed to `feat_matrix_length(info, &st->ifm, ifm_width, ifm_height, st->ifm.depth)` which takes `u32` parameters. Since the negative check happens before the call, only non-negative `s32` values reach the function, and the implicit conversion to `u32` is well-defined. No issues found. The commit message accurately describes the problem with padding values exceeding IFM dimensions and the mesa driver bug that triggered it.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-02-18 23:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-18 22:21 [PATCH 0/3] accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Rob Herring (Arm)
2026-02-18 22:21 ` [PATCH 1/3] accel: ethosu: Fix job submit error clean-up refcount underflows Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 22:21 ` [PATCH 2/3] accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 22:21 ` [PATCH 3/3] accel: ethosu: Handle possible underflow in IFM size calculations Rob Herring (Arm)
2026-02-18 23:57 ` Claude review: " Claude Code Review Bot
2026-02-18 23:57 ` Claude review: accel: ethosu: Runtime PM refcounting and cmd stream validation fixes Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox