* [PATCH] accel: ethosu: Remove redundant job_lock
@ 2026-05-16 14:40 Maíra Canal
From: Maíra Canal @ 2026-05-16 14:40 UTC (permalink / raw)
To: Rob Herring, Tomeu Vizoso, Oded Gabbay, Sumit Semwal,
Christian König
Cc: dri-devel, kernel-dev, Maíra Canal
The job_lock mutex guarded dev->in_flight_job across ethosu_job_run(),
the threaded IRQ handler, and ethosu_job_timedout(). However, the DRM
scheduler already provides most of the serialization required, which
makes this mutex redundant.
Analyzing the different scenarios:
1. run_job() and timedout_job() are mutually exclusive scheduler
callbacks, so the scheduler itself serializes them.
2. run_job() and the IRQ handler are implicitly serialized, but they can
overlap in time: dma_fence_signal() synchronously queues the next
run_job onto submit_wq, and the worker can execute run_job(next) on
another CPU before the IRQ thread finishes. The mutex previously kept
the IRQ's trailing "in_flight_job = NULL" from racing run_job(next)'s
"in_flight_job = next" store.
The handler is now restructured to clear in_flight_job before calling
dma_fence_signal(), so any run_job(next) woken by the signal observes
NULL.
3. timedout_job() and the IRQ handler can also overlap if the hardware
completes the timed-out job near the timeout boundary, since
drm_sched_stop()'s cancel_work_sync() only synchronizes with the
scheduler's workqueue. The IRQ handler saves in_flight_job into a
local with READ_ONCE() before dereferencing ->done_fence, and
run_job()/timedout_job() publish the field with WRITE_ONCE(). This
prevents the compiler from reloading the pointer, and keeps the IRQ
thread operating on the same job for the duration of the handler even
if timedout_job() concurrently clears the field.
Drop the mutex along with its initialization.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
---
Hi,
I noticed this pattern while reviewing "[PATCH v2] accel: ethosu: Add
performance counter support" and I had the impression this could be a
nice code simplification. However, I don't have the hardware to test it,
so this was only "compile-tested".
Best regards,
- Maíra
drivers/accel/ethosu/ethosu_device.h | 2 --
drivers/accel/ethosu/ethosu_job.c | 22 ++++++++--------------
2 files changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/accel/ethosu/ethosu_device.h b/drivers/accel/ethosu/ethosu_device.h
index b189fa783d6a..c2c59681a019 100644
--- a/drivers/accel/ethosu/ethosu_device.h
+++ b/drivers/accel/ethosu/ethosu_device.h
@@ -173,8 +173,6 @@ struct ethosu_device {
struct drm_ethosu_npu_info npu_info;
struct ethosu_job *in_flight_job;
- /* For in_flight_job and ethosu_job_hw_submit() */
- struct mutex job_lock;
/* For dma_fence */
spinlock_t fence_lock;
diff --git a/drivers/accel/ethosu/ethosu_job.c b/drivers/accel/ethosu/ethosu_job.c
index 418463c03bfb..5a9bd017f3bc 100644
--- a/drivers/accel/ethosu/ethosu_job.c
+++ b/drivers/accel/ethosu/ethosu_job.c
@@ -194,10 +194,8 @@ static struct dma_fence *ethosu_job_run(struct drm_sched_job *sched_job)
dev->fence_context, ++dev->emit_seqno);
dma_fence_get(fence);
- scoped_guard(mutex, &dev->job_lock) {
- dev->in_flight_job = job;
- ethosu_job_hw_submit(dev, job);
- }
+ WRITE_ONCE(dev->in_flight_job, job);
+ ethosu_job_hw_submit(dev, job);
return fence;
}
@@ -205,6 +203,7 @@ static struct dma_fence *ethosu_job_run(struct drm_sched_job *sched_job)
static void ethosu_job_handle_irq(struct ethosu_device *dev)
{
u32 status = readl_relaxed(dev->regs + NPU_REG_STATUS);
+ struct ethosu_job *job;
if (status & (STATUS_BUS_STATUS | STATUS_CMD_PARSE_ERR)) {
dev_err(dev->base.dev, "Error IRQ - %x\n", status);
@@ -212,11 +211,10 @@ static void ethosu_job_handle_irq(struct ethosu_device *dev)
return;
}
- scoped_guard(mutex, &dev->job_lock) {
- if (dev->in_flight_job) {
- dma_fence_signal(dev->in_flight_job->done_fence);
- dev->in_flight_job = NULL;
- }
+ job = READ_ONCE(dev->in_flight_job);
+ if (job) {
+ WRITE_ONCE(dev->in_flight_job, NULL);
+ dma_fence_signal(job->done_fence);
}
}
@@ -272,8 +270,7 @@ static enum drm_gpu_sched_stat ethosu_job_timedout(struct drm_sched_job *bad)
drm_sched_stop(&dev->sched, bad);
- scoped_guard(mutex, &dev->job_lock)
- dev->in_flight_job = NULL;
+ WRITE_ONCE(dev->in_flight_job, NULL);
/* Proceed with reset now. */
pm_runtime_force_suspend(dev->base.dev);
@@ -304,9 +301,6 @@ int ethosu_job_init(struct ethosu_device *edev)
int ret;
spin_lock_init(&edev->fence_lock);
- ret = devm_mutex_init(dev, &edev->job_lock);
- if (ret)
- return ret;
ret = devm_mutex_init(dev, &edev->sched_lock);
if (ret)
return ret;
--
2.54.0
* Claude review: accel: ethosu: Remove redundant job_lock
From: Claude Code Review Bot @ 2026-05-18 6:46 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: accel: ethosu: Remove redundant job_lock
Author: Maíra Canal <mcanal@igalia.com>
Patches: 1
Reviewed: 2026-05-18T16:46:56.799851
---
This is a single, well-reasoned patch that removes a redundant `job_lock` mutex from the ethosu accelerator driver, replacing it with `READ_ONCE`/`WRITE_ONCE` on `dev->in_flight_job`. The commit message provides an exceptionally thorough analysis of the three concurrency scenarios.
The approach is **correct**. The key invariants that make it safe:
1. **`credit_limit = 1`** (line 296) — only one job can be in-flight, so `run_job()` is never called while another `run_job()` is active.
2. **Scheduler serializes `run_job()` and `timedout_job()`** — they are mutually exclusive callbacks.
3. **IRQ handler clears `in_flight_job` before `dma_fence_signal()`** — so any `run_job(next)` woken by the signal sees NULL and can safely store the new job pointer.
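4. **Memory ordering appears sufficient**: `writel()` at the end of `ethosu_job_hw_submit()` orders prior memory writes before the MMIO write, so the `in_flight_job` store precedes the HW kick. On the completion side, the spinlock taken inside `dma_fence_signal()` does not by itself order the earlier NULL store (a lock acquire only constrains later accesses), but `queue_work()` is documented to make all stores preceding the call visible to the CPU that executes the work, so the `run_job(next)` queued by the signal should observe NULL.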
5. **`job_lock` is fully removed** — grep confirms zero remaining references.
**One caveat**: the author notes this is compile-tested only (no hardware). Given the subtle concurrency changes, hardware testing or careful review from the ethosu maintainer would be prudent before merging.
**Verdict: Looks good.** The analysis is sound and the code change matches the stated rationale.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel: ethosu: Remove redundant job_lock
From: Claude Code Review Bot @ 2026-05-18 6:46 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Commit message quality**: Excellent. The three-scenario analysis is accurate and demonstrates deep understanding of the DRM scheduler's guarantees and the concurrency model.
**Header change** (`ethosu_device.h`):
```c
- /* For in_flight_job and ethosu_job_hw_submit() */
- struct mutex job_lock;
```
Clean removal. The `in_flight_job` field remains but is now protected by `WRITE_ONCE`/`READ_ONCE` + scheduler serialization rather than the mutex.
**`ethosu_job_run()`** (lines 197–198):
```c
WRITE_ONCE(dev->in_flight_job, job);
ethosu_job_hw_submit(dev, job);
```
Correct. `WRITE_ONCE` prevents the compiler from tearing or eliding the store. The subsequent `writel()` in `ethosu_job_hw_submit()` provides a write memory barrier, ensuring the store to `in_flight_job` is globally visible before the HW processes the command and raises an interrupt.
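For readers less familiar with these helpers, the following is a simplified userspace approximation of what `READ_ONCE`/`WRITE_ONCE` amount to. The `ONCE_SKETCH_*` macros, `struct once_slot`, and the `once_*` functions are hypothetical names used only for illustration; the kernel's real macros add type checks, and this sketch assumes a compiler that supports GNU `__typeof__` (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations: a volatile access forces exactly one load/store. */
#define ONCE_SKETCH_READ(x)	(*(const volatile __typeof__(x) *)&(x))
#define ONCE_SKETCH_WRITE(x, val) \
	do { *(volatile __typeof__(x) *)&(x) = (val); } while (0)

/* Hypothetical stand-in for the device struct; not the real ethosu_device. */
struct once_slot {
	void *in_flight;	/* stands in for dev->in_flight_job */
};

/* Publish a pointer with a single volatile store. */
static void once_set(struct once_slot *s, void *p)
{
	assert(s != NULL);
	ONCE_SKETCH_WRITE(s->in_flight, p);
}

/* Take one snapshot; callers use the return value and never re-read the field. */
static void *once_get(struct once_slot *s)
{
	assert(s != NULL);
	return ONCE_SKETCH_READ(s->in_flight);
}
```

Note that these accesses constrain only the compiler. They do not order the store against other memory accesses, which is why the barrier implied by `writel()` still matters here.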
**`ethosu_job_handle_irq()`** (lines 214–218):
```c
job = READ_ONCE(dev->in_flight_job);
if (job) {
WRITE_ONCE(dev->in_flight_job, NULL);
dma_fence_signal(job->done_fence);
}
```
This is the most critical change. Two things to verify:
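1. **Ordering of clear-before-signal**: The `WRITE_ONCE(NULL)` must be visible to other CPUs before the `run_job(next)` woken by `dma_fence_signal()` executes. The spinlock that `dma_fence_signal()` takes internally (`spin_lock_irqsave`) does not guarantee this by itself: a lock acquire only keeps later accesses from moving before it and does not order preceding stores. The guarantee comes from the wake-up path instead. The commit message notes that `dma_fence_signal()` queues the next `run_job` onto `submit_wq`, and `queue_work()` is documented to make all stores that precede the call visible to the CPU that executes the work. So `run_job(next)` on another CPU should observe `in_flight_job == NULL`. **Correct.**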
2. **Race with `timedout_job()`**: If `timedout_job()` concurrently clears `in_flight_job`, the IRQ handler's local `job` copy (captured by `READ_ONCE`) remains valid because the job isn't freed until the scheduler calls `free_job`, which can't happen until after `done_fence` is signaled or the scheduler reprocesses the timed-out job. No use-after-free risk. **Correct.**
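Why the order of the two statements in the handler matters can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, and `toy_signal()` models the worst case in which `run_job(next)` runs to completion as soon as the fence is signaled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real kernel structs. */
struct toy_job {
	int signaled;	/* stands in for the state of job->done_fence */
};

struct toy_dev {
	struct toy_job *in_flight_job;
	struct toy_job *next;	/* job the "scheduler" submits once woken */
};

/* Models dma_fence_signal() followed immediately by run_job(next). */
static void toy_signal(struct toy_dev *dev, struct toy_job *done)
{
	assert(!done->signaled);	/* a fence is signaled at most once */
	done->signaled = 1;
	if (dev->next) {
		dev->in_flight_job = dev->next;	/* run_job(next) publishes */
		dev->next = NULL;
	}
}

/* Old statement order without the mutex: signal first, clear afterwards. */
static void toy_irq_signal_then_clear(struct toy_dev *dev)
{
	struct toy_job *job = dev->in_flight_job;

	if (job) {
		toy_signal(dev, job);
		dev->in_flight_job = NULL;	/* clobbers the next job's pointer */
	}
}

/* Order used by the patch: clear first, then signal via the local copy. */
static void toy_irq_clear_then_signal(struct toy_dev *dev)
{
	struct toy_job *job = dev->in_flight_job;

	if (job) {
		dev->in_flight_job = NULL;
		toy_signal(dev, job);
	}
}
```

With the first variant the pointer stored by the next job is overwritten with NULL, so its completion IRQ would find nothing to signal. With the second variant the next job's pointer survives.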
**`ethosu_job_timedout()`** (line 273):
```c
WRITE_ONCE(dev->in_flight_job, NULL);
```
Called after `drm_sched_stop()`, which synchronizes with the scheduler's workqueue but not with the threaded IRQ handler. The `WRITE_ONCE` ensures the compiler emits a single atomic store. If the IRQ handler has already captured the old pointer via `READ_ONCE`, it operates on its local copy safely (see scenario 3 in the commit message). **Correct.**
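The interleaving described above can be made concrete by splitting the handler into its two steps and running the timeout path in between. The sketch below is a userspace model, not driver code: all `sketch_*` names are hypothetical, `READ_ONCE`/`WRITE_ONCE` are approximated with relaxed C11 atomics, and job lifetime (when `free_job` runs) is deliberately not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ethosu_job. */
struct sketch_job {
	int done_fence_signaled;
};

/* Stands in for dev->in_flight_job; zero-initialized, so it starts as NULL. */
static _Atomic(struct sketch_job *) sketch_in_flight;

/* Models run_job(): publish the job pointer. */
static void sketch_publish(struct sketch_job *job)
{
	/* Only one job is in flight at a time, so the slot must be empty. */
	assert(atomic_load_explicit(&sketch_in_flight, memory_order_relaxed) == NULL);
	atomic_store_explicit(&sketch_in_flight, job, memory_order_relaxed);
}

/* Handler step 1: take a single snapshot of the field. */
static struct sketch_job *sketch_irq_snapshot(void)
{
	return atomic_load_explicit(&sketch_in_flight, memory_order_relaxed);
}

/* Handler step 2: clear the field, then signal using only the local copy. */
static void sketch_irq_finish(struct sketch_job *job)
{
	if (job) {
		atomic_store_explicit(&sketch_in_flight, NULL, memory_order_relaxed);
		job->done_fence_signaled = 1;	/* stands in for dma_fence_signal() */
	}
}

/* Models timedout_job() clearing the field concurrently. */
static void sketch_timedout(void)
{
	atomic_store_explicit(&sketch_in_flight, NULL, memory_order_relaxed);
}
```

Calling `sketch_irq_snapshot()`, then `sketch_timedout()`, then `sketch_irq_finish()` with the snapshot still signals the original job, because the handler never dereferences the shared field a second time.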
**`ethosu_job_init()`** (lines 303–305):
```c
- ret = devm_mutex_init(dev, &edev->job_lock);
- if (ret)
- return ret;
```
Clean removal of initialization code.
**Minor observation**: It might be worth adding a brief comment on `in_flight_job` in the struct definition noting the access protocol (e.g., `/* Written with WRITE_ONCE by run_job/timedout_job, read with READ_ONCE by IRQ handler */`), since the locking is now implicit rather than explicit. But this is optional and a matter of taste — the commit message serves as the permanent record.
**No bugs found. Reviewed-by worthy.**
---
Generated by Claude Code Patch Reviewer