From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/sched: Add test suite for concurrent job submissions Date: Mon, 25 May 2026 22:03:53 +1000 Message-ID: In-Reply-To: <20260520093350.1036001-1-marco.pagani@linux.dev> References: <20260520093350.1036001-1-marco.pagani@linux.dev> <20260520093350.1036001-1-marco.pagani@linux.dev> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Teardown ordering (the v4 fix)** The cleanup action registration order in `drm_sched_concurrent_init` is: 1. `drm_mock_sched_fini_wrap` (registered first =E2=86=92 cleaned up last) 2. `complete_destroy_workqueue` (registered second =E2=86=92 cleaned up fir= st) Entity free actions are registered per-worker inside the test body, so they= run before both of the above (LIFO). The `complete_destroy_workqueue` func= tion: ```c static void complete_destroy_workqueue(void *context) { struct sched_concurrent_context *ctx =3D context; complete_all(&ctx->wait_go); destroy_workqueue(ctx->sub_wq); } ``` This correctly handles the early-failure case: if a test aborts before call= ing `complete_all` in the test body, stuck workers are unblocked here befor= e the workqueue is destroyed, and the scheduler is finalized last. This ord= ering looks correct. However, there's a subtle concern: entity free actions run *before* `comple= te_destroy_workqueue`. If a test aborts after workers are queued to the wor= kqueue but before `complete_all` is called, the cleanup would free entities= first, then `complete_all` would unblock workers who try to use those free= d entities. Looking at the test code, this scenario is avoided because ther= e are no assertions between the `queue_work` loop and `complete_all`: ```c for (i =3D 0; i < params->num_workers; i++) { worker =3D &workers[i]; INIT_WORK(&worker->work, drm_sched_parallel_worker); queue_work(ctx->sub_wq, &worker->work); } complete_all(&ctx->wait_go); flush_workqueue(ctx->sub_wq); ``` No assertion can fire between queuing and completing, so this is safe in pr= actice. Still, this is a fragile invariant =E2=80=94 a future maintainer ad= ding a `KUNIT_ASSERT` between these two blocks could reintroduce the use-af= ter-free. A brief comment here noting this constraint might be worthwhile. **Double `complete_all` is safe** In the normal (non-error) flow, `complete_all(&ctx->wait_go)` is called bot= h in the test body (line 260/429) and again in `complete_destroy_workqueue`= during cleanup. Since `complete_all` sets `done =3D UINT_MAX` and is idemp= otent, the second call is a no-op. This is correct. **Sliding window in interleaved worker** ```c max_in_flight_job =3D min(worker->job_count, params->num_in_flight_jobs); for (i =3D 0; i < max_in_flight_job; i++) drm_mock_sched_job_submit(worker->jobs[i]); for (i =3D 0; i < worker->job_count; i++) { done =3D drm_mock_sched_job_wait_finished(worker->jobs[i], timeout); ... j =3D i + max_in_flight_job; if (j < worker->job_count) drm_mock_sched_job_submit(worker->jobs[j]); } ``` The sliding window logic is correct. It pre-fills `max_in_flight_job` jobs,= then as each job completes, submits the next one, maintaining the window s= ize until jobs are exhausted. **Time budget calculation** ```c worker_share_us =3D (params->test_duration_ms * USEC_PER_MSEC) / params->num_workers; ... worker->job_duration_us =3D params->job_base_duration_us * (i + 1); worker->job_count =3D worker_share_us / worker->job_duration_us; worker->job_count =3D max(1U, worker->job_count); ``` With the given parameters (1000ms, 100us base, 16 workers): each worker get= s 62,500us. Worker 0 gets 625 jobs of 100us, worker 15 gets 39 jobs of 1600= us. Total serialized time sums to =E2=89=881 second. The per-job timeout of= `test_duration_ms * 2` (2 seconds) is sufficient. The `max(1U, worker->job_count)` ensures at least one job per worker even i= f the duration exceeds the share. With the current parameters this never tr= iggers, but it's a reasonable safety measure for future parameter sets. **Minor observations:** 1. The `kunit_info` calls in workers include a trailing `\n`, e.g.: ```c kunit_info(test_ctx->test, "Parallel worker %u submitting %u jobs started\n= ", worker->id, params->num_jobs); ``` `kunit_info` (like `pr_info`) appends a newline automatically. The explicit= `\n` results in a double newline in output. Not a bug, but slightly incons= istent with typical kernel logging style. Same applies to the "Worker %u su= bmitting" and "Job %u of worker %u timed out" messages. 2. The test suite currently has a single parameter set per test case. While= sufficient for initial coverage, it would be straightforward to add more p= arameter combinations (e.g., more workers than jobs, single worker with man= y jobs) in the future. 3. The `KUNIT_DEFINE_ACTION_WRAPPER` macros for `drm_mock_sched_fini_wrap` = and `drm_mock_sched_entity_free_wrap` are defined at file scope within the = concurrent test section. If other test suites in this file later need the s= ame wrappers, they're available, but the existing `drm_sched_basic_exit` st= ill calls `drm_mock_sched_fini` directly rather than using the wrapper. Thi= s is fine =E2=80=94 the wrapper is only needed for the `kunit_add_action_or= _reset` pattern. **Summary:** Clean, well-designed concurrency test. The v4 teardown fix is = correct. The only actionable nit is the extra `\n` in `kunit_info` format s= trings. Otherwise, this looks ready. --- Generated by Claude Code Patch Reviewer