From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/sched: Add test suite for concurrent job submissions
Date: Sun, 12 Apr 2026 12:16:53 +1000
Message-ID: 
In-Reply-To: <20260408154947.85204-1-marco.pagani@linux.dev>
References: <20260408154947.85204-1-marco.pagani@linux.dev>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Patch Review

**Cleanup pattern diverges from existing code**

The existing suites use a simple `init`/`exit` pair with direct calls to
`drm_mock_sched_fini()` and `drm_mock_sched_entity_free()`. This patch
introduces `KUNIT_DEFINE_ACTION_WRAPPER` and `kunit_add_action_or_reset` for
the first time in this file. While the action-based cleanup is appropriate
for the concurrent case (where early assertion failures in the test body need
safe unwinding), it's worth noting this creates two different cleanup idioms
in the same file. Not a blocker, but a comment explaining *why* the
action-based approach is used here would help future readers.

**`complete_destroy_workqueue`: the safety net is sound but subtle**

```c
static void complete_destroy_workqueue(void *context)
{
	struct sched_concurrent_context *ctx = context;

	complete_all(&ctx->wait_go);
	destroy_workqueue(ctx->sub_wq);
}
```

The `complete_all()` before `destroy_workqueue()` serves as a safety net: if
the test aborts (via `KUNIT_ASSERT_*`) after workers are queued but before
`complete_all()` is called in the test body, workers would deadlock waiting
on the completion, and `destroy_workqueue()` would hang. This is correct but
non-obvious. A brief comment explaining this defensive pattern would be
valuable.

In the normal (non-aborting) flow, `complete_all()` is called twice: once in
the test body and once in cleanup.
This is harmless since `complete_all()` sets `done` to `UINT_MAX/2`
idempotently, but a reader might wonder if it's a bug.

**Grammar nit in comment**

```c
 * Init workers only after all jobs and entities have been successfully
 * allocated. In this way, the cleanup logic for when an assertion fail
 * can be simplified.
```

"assertion fail" → "assertion fails"

**Single parameter case for each test**

Both `drm_sched_parallel_cases[]` and `drm_sched_interleaved_cases[]` define
only a single parameter set. The parameterized infrastructure is in place but
currently exercises only one configuration (16 workers × 8 jobs for parallel;
16 workers, 100µs base, 8 in-flight for interleaved). Consider adding at
least one more case for each: e.g. a smaller configuration (fewer workers,
fewer jobs) as a quick smoke test, or a stress configuration. This would
better justify the parameterized approach and improve coverage. Not a blocker
for merging, but something to consider.

**Interleaved worker timeout calculation**

```c
timeout = msecs_to_jiffies(params->test_duration_ms * 2);
```

The timeout is 2× the total test duration budget (2 seconds). Since the
scheduler serializes all jobs across all workers, and the total expected
execution time is approximately `test_duration_ms` (1 second), a per-job
timeout of 2 seconds is generous. This seems reasonable for avoiding flaky
timeouts, but the `* 2` magic number could benefit from a brief comment
explaining the rationale.

**No exit function for the concurrent suite**

```c
static struct kunit_suite drm_sched_concurrent = {
	.name = "drm_sched_concurrent_tests",
	.init = drm_sched_concurrent_init,
	.test_cases = drm_sched_concurrent_tests,
};
```

The suite has no `.exit` callback, relying entirely on KUnit action-based
cleanup.
This is consistent with the action-based approach and correct: all resources
are freed via actions registered with `kunit_add_action_or_reset`. Just
noting this is intentional and fine.

**Entity cleanup actions registered per-worker in the test body**

```c
for (i = 0; i < params->num_workers; i++) {
	...
	worker->entity = drm_mock_sched_entity_new(test, ...);

	ret = kunit_add_action_or_reset(test, drm_mock_sched_entity_free_wrap,
					worker->entity);
	KUNIT_ASSERT_EQ(test, ret, 0);
```

This means up to 16 cleanup actions are registered for entities alone.
Combined with the workqueue and scheduler actions from init, the cleanup
stack could get long. This is fine for a test, but worth being aware of if
the parameter counts grow significantly.

**`drm_mock_sched_job_is_finished` vs `drm_mock_sched_job_wait_finished` in
final check**

In the interleaved test's final verification:

```c
for (i = 0; i < params->num_workers; i++) {
	...
	for (j = 0; j < worker->job_count; j++) {
		job = worker->jobs[j];

		done = drm_mock_sched_job_is_finished(job);
		KUNIT_EXPECT_TRUE(test, done);
	}
}
```

This uses the non-waiting `drm_mock_sched_job_is_finished()` rather than
`drm_mock_sched_job_wait_finished()` (which the parallel test uses). This is
correct: by the time `flush_workqueue()` returns, all workers have completed
their wait loops, so all jobs should already be finished. The non-waiting
check serves as a pure assertion that nothing went wrong, which is the right
semantic here.

**Overall**: This is solid test code that adds meaningful coverage. The main
suggestions are cosmetic: adding comments for the `complete_all` safety net,
fixing the grammar nit, and considering additional parameter cases. The
concurrency design (barrier start, per-worker entities, sliding window) is
well thought out.

---
Generated by Claude Code Patch Reviewer
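P.S. To make the parameter-coverage suggestion above concrete, a hedged
sketch of what extra cases might look like. The struct and field names below
are guessed from the numbers quoted in the review, not taken from the patch,
and would need to match the file's actual definitions:

```c
/* Hypothetical sketch: struct and field names are illustrative only. */
static const struct sched_concurrent_params drm_sched_parallel_cases[] = {
	{ .num_workers = 16, .jobs_per_worker = 8 },	/* existing configuration */
	{ .num_workers = 2,  .jobs_per_worker = 2 },	/* quick smoke test */
	{ .num_workers = 64, .jobs_per_worker = 32 },	/* stress configuration */
};

KUNIT_ARRAY_PARAM(drm_sched_parallel, drm_sched_parallel_cases, NULL);
```

`KUNIT_ARRAY_PARAM` accepts a NULL description callback; if the patch already
defines one, it would naturally describe the new cases as well.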