public inbox for drm-ai-reviews@public-inbox.freedesktop.org
* [PATCH] accel: ethosu: Remove redundant job_lock
@ 2026-05-16 14:40 Maíra Canal
  2026-05-18  6:46 ` Claude review: " Claude Code Review Bot
  2026-05-18  6:46 ` Claude Code Review Bot
  0 siblings, 2 replies; 3+ messages in thread
From: Maíra Canal @ 2026-05-16 14:40 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Oded Gabbay, Sumit Semwal,
	Christian König
  Cc: dri-devel, kernel-dev, Maíra Canal

The job_lock mutex guarded dev->in_flight_job across ethosu_job_run(),
the threaded IRQ handler, and ethosu_job_timedout(). However, the DRM
scheduler already provides most of the serialization required, which
makes this mutex redundant.

Analyzing the different scenarios:

  1. run_job() and timedout_job() are mutually exclusive scheduler
     callbacks, so the scheduler itself serializes them.

  2. run_job() and the IRQ handler are implicitly serialized, but they can
     overlap in time: dma_fence_signal() synchronously queues the next
     run_job onto submit_wq, and the worker can execute run_job(next) on
     another CPU before the IRQ thread finishes. The mutex previously kept
     the IRQ's trailing "in_flight_job = NULL" from racing run_job(next)'s
     "in_flight_job = next" store.

     The handler is now restructured to clear in_flight_job before calling
     dma_fence_signal(), so any run_job(next) woken by the signal observes
     NULL.

  3. timedout_job() and the IRQ handler can also overlap if the hardware
     completes the timed-out job near the timeout boundary, since
     drm_sched_stop()'s cancel_work_sync() only synchronizes with the
     scheduler's workqueue. The IRQ handler saves in_flight_job into a
     local with READ_ONCE() before dereferencing ->done_fence, and
     run_job()/timedout_job() publish the field with WRITE_ONCE(). This
     prevents the compiler from reloading the pointer, and keeps the IRQ
     thread operating on the same job for the duration of the handler even
     if timedout_job() concurrently clears the field.

Drop the mutex along with its initialization.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
---

Hi,

I noticed this pattern while reviewing "[PATCH v2] accel: ethosu: Add
performance counter support" and I had the impression this could be a
nice code simplification. However, I don't have the hardware to test it,
so this was only "compile-tested".

Best regards,
- Maíra

 drivers/accel/ethosu/ethosu_device.h |  2 --
 drivers/accel/ethosu/ethosu_job.c    | 22 ++++++++--------------
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/accel/ethosu/ethosu_device.h b/drivers/accel/ethosu/ethosu_device.h
index b189fa783d6a..c2c59681a019 100644
--- a/drivers/accel/ethosu/ethosu_device.h
+++ b/drivers/accel/ethosu/ethosu_device.h
@@ -173,8 +173,6 @@ struct ethosu_device {
 	struct drm_ethosu_npu_info npu_info;
 
 	struct ethosu_job *in_flight_job;
-	/* For in_flight_job and ethosu_job_hw_submit() */
-	struct mutex job_lock;
 
 	/* For dma_fence */
 	spinlock_t fence_lock;
diff --git a/drivers/accel/ethosu/ethosu_job.c b/drivers/accel/ethosu/ethosu_job.c
index 418463c03bfb..5a9bd017f3bc 100644
--- a/drivers/accel/ethosu/ethosu_job.c
+++ b/drivers/accel/ethosu/ethosu_job.c
@@ -194,10 +194,8 @@ static struct dma_fence *ethosu_job_run(struct drm_sched_job *sched_job)
 		       dev->fence_context, ++dev->emit_seqno);
 	dma_fence_get(fence);
 
-	scoped_guard(mutex, &dev->job_lock) {
-		dev->in_flight_job = job;
-		ethosu_job_hw_submit(dev, job);
-	}
+	WRITE_ONCE(dev->in_flight_job, job);
+	ethosu_job_hw_submit(dev, job);
 
 	return fence;
 }
@@ -205,6 +203,7 @@ static struct dma_fence *ethosu_job_run(struct drm_sched_job *sched_job)
 static void ethosu_job_handle_irq(struct ethosu_device *dev)
 {
 	u32 status = readl_relaxed(dev->regs + NPU_REG_STATUS);
+	struct ethosu_job *job;
 
 	if (status & (STATUS_BUS_STATUS | STATUS_CMD_PARSE_ERR)) {
 		dev_err(dev->base.dev, "Error IRQ - %x\n", status);
@@ -212,11 +211,10 @@ static void ethosu_job_handle_irq(struct ethosu_device *dev)
 		return;
 	}
 
-	scoped_guard(mutex, &dev->job_lock) {
-		if (dev->in_flight_job) {
-			dma_fence_signal(dev->in_flight_job->done_fence);
-			dev->in_flight_job = NULL;
-		}
+	job = READ_ONCE(dev->in_flight_job);
+	if (job) {
+		WRITE_ONCE(dev->in_flight_job, NULL);
+		dma_fence_signal(job->done_fence);
 	}
 }
 
@@ -272,8 +270,7 @@ static enum drm_gpu_sched_stat ethosu_job_timedout(struct drm_sched_job *bad)
 
 	drm_sched_stop(&dev->sched, bad);
 
-	scoped_guard(mutex, &dev->job_lock)
-		dev->in_flight_job = NULL;
+	WRITE_ONCE(dev->in_flight_job, NULL);
 
 	/* Proceed with reset now. */
 	pm_runtime_force_suspend(dev->base.dev);
@@ -304,9 +301,6 @@ int ethosu_job_init(struct ethosu_device *edev)
 	int ret;
 
 	spin_lock_init(&edev->fence_lock);
-	ret = devm_mutex_init(dev, &edev->job_lock);
-	if (ret)
-		return ret;
 	ret = devm_mutex_init(dev, &edev->sched_lock);
 	if (ret)
 		return ret;
-- 
2.54.0



* Claude review: accel: ethosu: Remove redundant job_lock
  2026-05-16 14:40 [PATCH] accel: ethosu: Remove redundant job_lock Maíra Canal
@ 2026-05-18  6:46 ` Claude Code Review Bot
  2026-05-18  6:46 ` Claude Code Review Bot
  1 sibling, 0 replies; 3+ messages in thread
From: Claude Code Review Bot @ 2026-05-18  6:46 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: accel: ethosu: Remove redundant job_lock
Author: Maíra Canal <mcanal@igalia.com>
Patches: 1
Reviewed: 2026-05-18T16:46:56.799851

---

This is a single, well-reasoned patch that removes a redundant `job_lock` mutex from the ethosu accelerator driver, replacing it with `READ_ONCE`/`WRITE_ONCE` on `dev->in_flight_job`. The commit message provides an exceptionally thorough analysis of the three concurrency scenarios.

The approach is **correct**. The key invariants that make it safe:

1. **`credit_limit = 1`** (line 296) — only one job can be in-flight, so `run_job()` is never called while another `run_job()` is active.
2. **Scheduler serializes `run_job()` and `timedout_job()`** — they are mutually exclusive callbacks.
3. **IRQ handler clears `in_flight_job` before `dma_fence_signal()`** — so any `run_job(next)` woken by the signal sees NULL and can safely store the new job pointer.
4. **Memory ordering is sufficient** — `writel()` at the end of `ethosu_job_hw_submit()` includes a write barrier (ensuring `in_flight_job` is visible before HW is kicked), and `dma_fence_signal()` uses spinlocks internally (providing acquire/release semantics that order the NULL store relative to the wake-up of the next `run_job()`).
5. **`job_lock` is fully removed** — grep confirms zero remaining references.

**One caveat**: the author notes this is compile-tested only (no hardware). Given the subtle concurrency changes, hardware testing or careful review from the ethosu maintainer would be prudent before merging.

**Verdict: Looks good.** The analysis is sound and the code change matches the stated rationale.

---

Generated by Claude Code Patch Reviewer


* Claude review: accel: ethosu: Remove redundant job_lock
  2026-05-16 14:40 [PATCH] accel: ethosu: Remove redundant job_lock Maíra Canal
  2026-05-18  6:46 ` Claude review: " Claude Code Review Bot
@ 2026-05-18  6:46 ` Claude Code Review Bot
  1 sibling, 0 replies; 3+ messages in thread
From: Claude Code Review Bot @ 2026-05-18  6:46 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Commit message quality**: Excellent. The three-scenario analysis is accurate and demonstrates deep understanding of the DRM scheduler's guarantees and the concurrency model.

**Header change** (`ethosu_device.h`):
```c
-	/* For in_flight_job and ethosu_job_hw_submit() */
-	struct mutex job_lock;
```
Clean removal. The `in_flight_job` field remains but is now protected by `WRITE_ONCE`/`READ_ONCE` + scheduler serialization rather than the mutex.

**`ethosu_job_run()`** (lines 197–198):
```c
	WRITE_ONCE(dev->in_flight_job, job);
	ethosu_job_hw_submit(dev, job);
```
Correct. `WRITE_ONCE` prevents the compiler from tearing or eliding the store. The subsequent `writel()` in `ethosu_job_hw_submit()` provides a write memory barrier, ensuring the store to `in_flight_job` is globally visible before the HW processes the command and raises an interrupt.
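The publish-then-kick ordering described above can be sketched in userspace with C11 atomics. This is only an analogy under stated assumptions: `model_npu`, `model_submit()` and `model_completion()` are hypothetical names that do not exist in the driver, the `doorbell` field stands in for the MMIO register written by `writel()`, and a release store is used to approximate the barrier `writel()` issues before the MMIO write. Real hardware does not perform an acquire load; the sketch only shows why a consumer that has seen the kick also sees the published pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; these names do not exist in the driver. */
struct model_npu {
	_Atomic(void *) in_flight;	/* stands in for dev->in_flight_job */
	atomic_uint doorbell;		/* stands in for the register hit by writel() */
};

/* Models ethosu_job_run(): publish the job, then ring the doorbell. */
static void model_submit(struct model_npu *npu, void *job)
{
	atomic_store_explicit(&npu->in_flight, job, memory_order_relaxed);
	/* Release store approximates the barrier writel() issues before the MMIO write. */
	atomic_store_explicit(&npu->doorbell, 1u, memory_order_release);
}

/* Models the completion side: it reads the slot only after seeing the doorbell. */
static void *model_completion(struct model_npu *npu)
{
	if (!atomic_load_explicit(&npu->doorbell, memory_order_acquire))
		return NULL;
	return atomic_load_explicit(&npu->in_flight, memory_order_relaxed);
}

/* Single-threaded walk-through of the two steps. */
static void model_submit_demo(void)
{
	struct model_npu npu;
	int job = 0;

	atomic_init(&npu.in_flight, (void *)NULL);
	atomic_init(&npu.doorbell, 0u);

	assert(model_completion(&npu) == NULL);	/* nothing kicked yet */
	model_submit(&npu, &job);
	assert(model_completion(&npu) == &job);	/* doorbell seen, so slot is set */
}
```

If the two stores in `model_submit()` were swapped, a consumer could observe the doorbell with the slot still empty, which is the situation the ordering in `ethosu_job_run()` is meant to rule out.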

**`ethosu_job_handle_irq()`** (lines 214–218):
```c
	job = READ_ONCE(dev->in_flight_job);
	if (job) {
		WRITE_ONCE(dev->in_flight_job, NULL);
		dma_fence_signal(job->done_fence);
	}
```
This is the most critical change. Two things to verify:

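1. **Ordering of clear-before-signal**: The `WRITE_ONCE(NULL)` must be visible to other CPUs before the next `run_job()` stores its own pointer. `dma_fence_signal()` takes the fence spinlock (`spin_lock_irqsave`), but a lock acquire on its own does not order earlier stores ahead of it; the NULL store may sink into the critical section. What bounds it is the release on unlock, together with the ordering implied by the wake-up path: per the commit message, the signal queues the next `run_job` onto `submit_wq`, and `queue_work()` guarantees that stores preceding it are visible to the CPU that executes the work when it actually queues it. Under that reading, `run_job(next)` on another CPU observes `in_flight_job == NULL`. **Correct**, though the justification rests on the wake-up path rather than on the lock acquire.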

2. **Race with `timedout_job()`**: If `timedout_job()` concurrently clears `in_flight_job`, the IRQ handler's local `job` copy (captured by `READ_ONCE`) remains valid because the job isn't freed until the scheduler calls `free_job`, which can't happen until after `done_fence` is signaled or the scheduler reprocesses the timed-out job. No use-after-free risk. **Correct.**
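The clear-before-signal ordering from point 1 can be modelled in userspace as a rough sketch. Everything here is an assumption for illustration: `cbs_job`, `cbs_dev`, `cbs_handle_irq()` and `cbs_run_next()` are hypothetical names, the `done` flag stands in for `done_fence`, and a release store paired with an acquire load stands in for whatever ordering the fence signal and the scheduler wake-up provide in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names do not exist in the driver. */
struct cbs_job {
	atomic_bool done;	/* stands in for job->done_fence */
};

struct cbs_dev {
	_Atomic(struct cbs_job *) in_flight_job;
};

/* Models the IRQ thread: snapshot, clear, then signal. */
static struct cbs_job *cbs_handle_irq(struct cbs_dev *dev)
{
	struct cbs_job *job =
		atomic_load_explicit(&dev->in_flight_job, memory_order_relaxed);

	if (!job)
		return NULL;

	/* Clear the slot first so a waiter woken by the signal sees NULL. */
	atomic_store_explicit(&dev->in_flight_job, (struct cbs_job *)NULL,
			      memory_order_relaxed);
	/* Release store models the ordering implied by signalling the fence. */
	atomic_store_explicit(&job->done, true, memory_order_release);
	return job;
}

/*
 * Models run_job(next) after being woken by the signal on prev.
 * Returns true only if it was woken and found the slot empty.
 */
static bool cbs_run_next(struct cbs_dev *dev, struct cbs_job *prev,
			 struct cbs_job *next)
{
	if (!atomic_load_explicit(&prev->done, memory_order_acquire))
		return false;	/* not woken yet */
	if (atomic_load_explicit(&dev->in_flight_job, memory_order_relaxed) != NULL)
		return false;	/* stale pointer: the ordering was violated */
	atomic_store_explicit(&dev->in_flight_job, next, memory_order_relaxed);
	return true;
}

/* Single-threaded walk-through: job a completes, job b takes the slot. */
static void cbs_demo(void)
{
	struct cbs_job a, b;
	struct cbs_dev dev;

	atomic_init(&a.done, false);
	atomic_init(&b.done, false);
	atomic_init(&dev.in_flight_job, &a);

	assert(!cbs_run_next(&dev, &a, &b));	/* a not signalled yet */
	assert(cbs_handle_irq(&dev) == &a);
	assert(cbs_run_next(&dev, &a, &b));	/* woken, slot was NULL */
	assert(atomic_load(&dev.in_flight_job) == &b);
}
```

With the old signal-then-clear order, the trailing NULL store could land after `cbs_run_next()` had already stored `next`, which is the race the mutex used to prevent.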

**`ethosu_job_timedout()`** (line 273):
```c
	WRITE_ONCE(dev->in_flight_job, NULL);
```
Called after `drm_sched_stop()`, which synchronizes with the scheduler's workqueue but not with the threaded IRQ handler. The `WRITE_ONCE` keeps the compiler from tearing or duplicating the store to the pointer. If the IRQ handler has already captured the old pointer via `READ_ONCE`, it operates on its local copy safely (see scenario 3 in the commit message). **Correct.**
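The interleaving in question (IRQ handler snapshots the pointer, `timedout_job()` clears the field, IRQ handler continues) can be replayed deterministically in a small userspace sketch. The `snap_*` names are hypothetical, the handler is split into two phases purely so the timeout can be injected between them, and job lifetime is not modelled: the sketch assumes, as the review does, that the job outlives the handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names do not exist in the driver. */
struct snap_job {
	atomic_bool signalled;	/* stands in for job->done_fence */
};

struct snap_dev {
	_Atomic(struct snap_job *) in_flight_job;
};

/* Phase 1 of the handler: take the one and only snapshot of the field. */
static struct snap_job *snap_irq_begin(struct snap_dev *dev)
{
	return atomic_load_explicit(&dev->in_flight_job, memory_order_relaxed);
}

/* Models ethosu_job_timedout() clearing the field concurrently. */
static void snap_timedout(struct snap_dev *dev)
{
	atomic_store_explicit(&dev->in_flight_job, (struct snap_job *)NULL,
			      memory_order_relaxed);
}

/* Phase 2 of the handler: uses only the local snapshot, never the field. */
static bool snap_irq_finish(struct snap_dev *dev, struct snap_job *job)
{
	if (!job)
		return false;
	atomic_store_explicit(&dev->in_flight_job, (struct snap_job *)NULL,
			      memory_order_relaxed);
	atomic_store_explicit(&job->signalled, true, memory_order_release);
	return true;
}

/* Replays the race: snapshot, timeout clears the field, handler finishes. */
static void snap_demo(void)
{
	struct snap_job job;
	struct snap_dev dev;
	struct snap_job *local;

	atomic_init(&job.signalled, false);
	atomic_init(&dev.in_flight_job, &job);

	local = snap_irq_begin(&dev);
	snap_timedout(&dev);			/* field is now NULL */
	assert(atomic_load(&dev.in_flight_job) == NULL);
	assert(snap_irq_finish(&dev, local));	/* still signals the right job */
	assert(atomic_load(&job.signalled));
}
```

Had the handler re-read the field in phase 2 instead of using `local`, it would have dereferenced NULL after the timeout cleared it; that is the reload `READ_ONCE()` is there to prevent.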

**`ethosu_job_init()`** (lines 303–305):
```c
-	ret = devm_mutex_init(dev, &edev->job_lock);
-	if (ret)
-		return ret;
```
Clean removal of initialization code.

**Minor observation**: It might be worth adding a brief comment on `in_flight_job` in the struct definition noting the access protocol (e.g., `/* Written with WRITE_ONCE by run_job/timedout_job and the IRQ handler, snapshotted with READ_ONCE by the IRQ handler */`), since the locking is now implicit rather than explicit. This is optional and a matter of taste, since the commit message serves as the permanent record.
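As a rough illustration of what such a comment could look like, here is a standalone userspace sketch. The struct name `ethosu_device_sketch` and the helpers `sketch_publish()` and `sketch_snapshot()` are hypothetical, and the `READ_ONCE`/`WRITE_ONCE` macros are simplified volatile-cast approximations written for this example (relying on the GCC/Clang `__typeof__` extension), not the kernel's definitions. The comment wording is a suggestion, not text from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace approximations of the kernel macros (assumption). */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct ethosu_job;	/* opaque for the purposes of this sketch */

/* Hypothetical cut-down stand-in for struct ethosu_device. */
struct ethosu_device_sketch {
	/*
	 * No lock. Stored with WRITE_ONCE() by run_job(), timedout_job()
	 * and the IRQ thread; the IRQ thread takes a single READ_ONCE()
	 * snapshot and clears the field before signalling done_fence.
	 * run_job() and timedout_job() are serialized by the scheduler.
	 */
	struct ethosu_job *in_flight_job;
};

/* Store side of the protocol, as run_job()/timedout_job() would use it. */
static void sketch_publish(struct ethosu_device_sketch *dev,
			   struct ethosu_job *job)
{
	WRITE_ONCE(dev->in_flight_job, job);
}

/* Load side of the protocol, as the IRQ thread would use it. */
static struct ethosu_job *sketch_snapshot(struct ethosu_device_sketch *dev)
{
	return READ_ONCE(dev->in_flight_job);
}

/* Minimal walk-through of publish, snapshot, clear. */
static void sketch_demo(void)
{
	struct ethosu_device_sketch dev = { NULL };
	int placeholder = 0;	/* only its address is used */
	struct ethosu_job *job = (struct ethosu_job *)&placeholder;

	sketch_publish(&dev, job);
	assert(sketch_snapshot(&dev) == job);
	sketch_publish(&dev, NULL);
	assert(sketch_snapshot(&dev) == NULL);
}
```

Keeping the protocol next to the field means a later reader does not have to dig through `git log` to learn why plain assignments to `in_flight_job` would be a regression.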

**No bugs found. Reviewed-by worthy.**

---
Generated by Claude Code Patch Reviewer
