From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/virtio: abort virtqueue wait on device removal to avoid hung task Date: Mon, 25 May 2026 21:05:48 +1000 Message-ID: In-Reply-To: <20260521-virtio-gpu_wait_event-v2-1-5796b3a71d03@redhat.com> References: <20260521-virtio-gpu_wait_event-v2-1-5796b3a71d03@redhat.com> <20260521-virtio-gpu_wait_event-v2-1-5796b3a71d03@redhat.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Positive aspects:** - The commit message clearly describes the problem and solution. - The v2 changes (removing arbitrary timeout, using a flag-driven approach)= are a good design evolution. - The abort paths correctly mirror the `drm_dev_enter()` failure cleanup in= both functions. - Calling `virtio_gpu_release_vqs()` before `drm_dev_unplug()` in both `rem= ove` and `shutdown` is correct for the ordering requirement. **Issue 1 (medium): `vqs_released` should be atomic or have explicit barrie= r** ```c + bool vqs_released; ``` This is a plain `bool` written by one context and read by others without an= y locking. While `wait_event()` internally provides memory barriers for its= condition check, the post-`wait_event` re-read of `vqs_released`: ```c + if (vgdev->vqs_released) { ``` is a plain load outside of any lock. On weakly-ordered architectures, the c= ompiler or CPU could theoretically reorder or cache this. In practice, `wai= t_event()` likely provides sufficient ordering since it just evaluated the = same variable, but the idiomatic kernel approach would be to either: - Use `READ_ONCE()`/`WRITE_ONCE()` for the accesses, or - Use an `atomic_t` flag (e.g., `atomic_read(&vgdev->vqs_released)`) At minimum, the write side should use `WRITE_ONCE()`: ```c WRITE_ONCE(vgdev->vqs_released, true); ``` and the reads should use `READ_ONCE()`. **Issue 2 (low): Race window between `drm_dev_enter()` succeeding and reach= ing `wait_event()`** Consider this sequence: 1. Thread A calls `drm_dev_enter()` =E2=80=94 succeeds, enters SRCU critica= l section 2. Thread B calls `virtio_gpu_release_vqs()` =E2=80=94 sets `vqs_released`,= wakes queues 3. Thread B calls `drm_dev_unplug()` =E2=80=94 starts `synchronize_srcu()`,= waits for Thread A 4. Thread A reaches `wait_event()` =E2=80=94 condition is already true due = to `vqs_released`, returns immediately 5. Thread A takes the abort path, calls `drm_dev_exit()`, releases SRCU 6. Thread B's `synchronize_srcu()` completes This actually works correctly. The `vqs_released` flag being set before `dr= m_dev_unplug()` means the `wait_event()` will see it regardless of timing. **Issue 3 (nit): Redundant comment in cursor path** ```c + /* See comment in virtio_gpu_queue_ctrl_sgs(). */ ``` This is fine for cross-referencing, but the comment it points to in `virtio= _gpu_queue_ctrl_sgs()`: ```c + /* + * Set by virtio_gpu_release_vqs() to unblock + * synchronize_srcu() wait in drm_dev_unplug(). + */ ``` is describing what sets the flag rather than what the code is doing. The co= de flow is self-explanatory =E2=80=94 the comment could be shortened to jus= t the "why": `/* Device being removed =E2=80=94 abort command. */` **Issue 4 (low): `virtio_gpu_release_vqs` is only used in one file but not = static** The function `virtio_gpu_release_vqs()` is defined in `virtgpu_drv.c` and o= nly used in that file. It's already implicitly static by not being declared= in the header, but marking it explicitly `static` would be more consistent: ```c +static void virtio_gpu_release_vqs(struct drm_device *dev) ``` Looking at the diff again, it already doesn't have `static` but also isn't = exported or declared in the header. Since it's only called from `virtio_gpu= _remove()` and `virtio_gpu_shutdown()` in the same file, it should be `stat= ic`. **Issue 5 (observation): Fence not signaled on abort** When the abort path fires in `virtio_gpu_queue_ctrl_sgs()`, the fence passe= d as a parameter is never signaled with an error. This is consistent with t= he existing `drm_dev_enter()` failure path (which also doesn't signal the f= ence), so this is not a regression introduced by this patch. However, it's = worth noting that this could leave waiters on the fence hanging =E2=80=94 t= hough during device removal this is arguably acceptable since the entire de= vice is going away. **Summary:** The patch is correct in its approach and addresses the real de= adlock. The main actionable feedback is to use `WRITE_ONCE()`/`READ_ONCE()`= for the `vqs_released` accesses to satisfy kernel memory model expectation= s, and to add `static` to `virtio_gpu_release_vqs()`. --- Generated by Claude Code Patch Reviewer