From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/sched: Add test suite for concurrent job submissions
Date: Sun, 12 Apr 2026 12:16:53 +1000
Message-ID: 
In-Reply-To: <20260408154947.85204-1-marco.pagani@linux.dev>
References: <20260408154947.85204-1-marco.pagani@linux.dev>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Patch Review

**Cleanup pattern diverges from existing code**

The existing suites use a simple `init`/`exit` pair with direct calls to
`drm_mock_sched_fini()` and `drm_mock_sched_entity_free()`. This patch
introduces `KUNIT_DEFINE_ACTION_WRAPPER` and `kunit_add_action_or_reset` for
the first time in this file. While the action-based cleanup is appropriate
for the concurrent case (where early assertion failures in the test body need
safe unwinding), it's worth noting this creates two different cleanup idioms
in the same file. Not a blocker, but a comment explaining *why* the
action-based approach is used here would help future readers.

**`complete_destroy_workqueue`: the safety net is sound but subtle**

```c
static void complete_destroy_workqueue(void *context)
{
	struct sched_concurrent_context *ctx = context;

	complete_all(&ctx->wait_go);
	destroy_workqueue(ctx->sub_wq);
}
```

The `complete_all()` before `destroy_workqueue()` serves as a safety net: if
the test aborts (via `KUNIT_ASSERT_*`) after workers are queued but before
`complete_all()` is called in the test body, workers would deadlock waiting
on the completion, and `destroy_workqueue()` would hang. This is correct but
non-obvious. A brief comment explaining this defensive pattern would be
valuable.

In the normal (non-aborting) flow, `complete_all()` is called twice: once in
the test body and once in cleanup.
This is harmless since `complete_all()` sets `done` to `UINT_MAX/2`
idempotently, but a reader might wonder if it's a bug.

**Grammar nit in comment**

```c
 * Init workers only after all jobs and entities have been successfully
 * allocated. In this way, the cleanup logic for when an assertion fail
 * can be simplified.
```

"assertion fail" → "assertion fails"

**Single parameter case for each test**

Both `drm_sched_parallel_cases[]` and `drm_sched_interleaved_cases[]` define
only a single parameter set. The parameterized infrastructure is in place but
currently exercises only one configuration (16 workers × 8 jobs for parallel;
16 workers, 100µs base, 8 in-flight for interleaved). Consider adding at
least one more case for each: e.g. a smaller configuration (fewer workers,
fewer jobs) as a quick smoke test, or a stress configuration. This would
better justify the parameterized approach and improve coverage. Not a blocker
for merging, but something to consider.

**Interleaved worker timeout calculation**

```c
timeout = msecs_to_jiffies(params->test_duration_ms * 2);
```

The timeout is 2× the total test duration budget (2 seconds). Since the
scheduler serializes all jobs across all workers, and the total expected
execution time is approximately `test_duration_ms` (1 second), a per-job
timeout of 2 seconds is generous. This seems reasonable for avoiding flaky
timeouts, but the `* 2` magic number could benefit from a brief comment
explaining the rationale.

**No exit function for the concurrent suite**

```c
static struct kunit_suite drm_sched_concurrent = {
	.name = "drm_sched_concurrent_tests",
	.init = drm_sched_concurrent_init,
	.test_cases = drm_sched_concurrent_tests,
};
```

The suite has no `.exit` callback, relying entirely on KUnit action-based
cleanup.
This is consistent with the action-based approach and correct: all resources
are freed via actions registered with `kunit_add_action_or_reset`. Just
noting this is intentional and fine.

**Entity cleanup actions registered per-worker in the test body**

```c
for (i = 0; i < params->num_workers; i++) {
	...
	worker->entity = drm_mock_sched_entity_new(test, ...);

	ret = kunit_add_action_or_reset(test, drm_mock_sched_entity_free_wrap,
					worker->entity);
	KUNIT_ASSERT_EQ(test, ret, 0);
```

This means up to 16 cleanup actions are registered for entities alone.
Combined with the workqueue and scheduler actions from init, the cleanup
stack could get long. This is fine for a test, but worth being aware of if
the parameter counts grow significantly.

**`drm_mock_sched_job_is_finished` vs `drm_mock_sched_job_wait_finished` in
final check**

In the interleaved test's final verification:

```c
for (i = 0; i < params->num_workers; i++) {
	...
	for (j = 0; j < worker->job_count; j++) {
		job = worker->jobs[j];

		done = drm_mock_sched_job_is_finished(job);
		KUNIT_EXPECT_TRUE(test, done);
	}
}
```

This uses the non-waiting `drm_mock_sched_job_is_finished()` rather than
`drm_mock_sched_job_wait_finished()` (which the parallel test uses). This is
correct: by the time `flush_workqueue()` returns, all workers have completed
their wait loops, so all jobs should already be finished. The non-waiting
check serves as a pure assertion that nothing went wrong, which is the right
semantic here.

**Overall**: This is solid test code that adds meaningful coverage. The main
suggestions are cosmetic: adding comments for the `complete_all` safety net,
fixing the grammar nit, and considering additional parameter cases. The
concurrency design (barrier start, per-worker entities, sliding window) is
well thought out.

---
Generated by Claude Code Patch Reviewer
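P.S. To make the parameter-coverage suggestion above concrete, a hedged
sketch of what extra cases might look like. The struct and field names below
are guessed from the numbers quoted in the review, not taken from the patch,
and would need to match the file's actual definitions:

```c
/* Hypothetical sketch: struct and field names are illustrative only. */
static const struct sched_concurrent_params drm_sched_parallel_cases[] = {
	{ .num_workers = 16, .jobs_per_worker = 8 },	/* existing configuration */
	{ .num_workers = 2,  .jobs_per_worker = 2 },	/* quick smoke test */
	{ .num_workers = 64, .jobs_per_worker = 32 },	/* stress configuration */
};

KUNIT_ARRAY_PARAM(drm_sched_parallel, drm_sched_parallel_cases, NULL);
```

`KUNIT_ARRAY_PARAM` accepts a NULL description callback; if the patch already
defines one, it would naturally describe the new cases as well.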