From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/virtio: abort virtqueue wait on device removal to avoid hung task Date: Thu, 04 Jun 2026 14:31:08 +1000 Message-ID: In-Reply-To: <20260601-virtio-gpu_wait_event-v3-1-89530517a98a@redhat.com> References: <20260601-virtio-gpu_wait_event-v3-1-89530517a98a@redhat.com> <20260601-virtio-gpu_wait_event-v3-1-89530517a98a@redhat.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Correctness =E2=80=94 memory ordering of `vqs_released`** The `vqs_released` flag is a plain `bool` written on one CPU and read on an= other (the waiting thread). While `wake_up_all()` does provide a memory bar= rier on the waking side, and `wait_event()` rechecks the condition after be= ing woken, there's a subtlety: the `wait_event()` macro checks the conditio= n *before* sleeping. If the flag is set just as another CPU is entering the= wait path, the plain bool read could theoretically be stale without an exp= licit barrier or atomic. In practice, this is likely fine on x86 (strong memory model) and the `wait= _event` internal barriers paper over most cases, but for correctness on all= architectures, consider using `READ_ONCE(vgdev->vqs_released)` in the wait= conditions and `WRITE_ONCE(vgdev->vqs_released, true)` in `virtio_gpu_rele= ase_vqs()`. This is the standard kernel pattern for lockless flag communica= tion: ```c // In virtio_gpu_release_vqs(): WRITE_ONCE(vgdev->vqs_released, true); // In wait conditions: wait_event(vgdev->ctrlq.ack_queue, vq->num_free >=3D elemcnt || READ_ONCE(vgdev->vqs_released)); ``` This prevents compiler tearing/reordering and is the idiomatic way to do th= is in the kernel. **Correctness =E2=80=94 the ctrl queue abort path looks correct** ```c if (vgdev->vqs_released) { if (fence && vbuf->objs) virtio_gpu_array_unlock_resv(vbuf->objs); free_vbuf(vgdev, vbuf); drm_dev_exit(idx); return -ENODEV; } ``` This matches the cleanup in the `drm_dev_enter()` failure path at lines 383= -388 of the current tree. The `drm_dev_exit(idx)` call is correct since we = entered the SRCU critical section earlier. Good. **Correctness =E2=80=94 the cursor queue abort path looks correct** ```c if (vgdev->vqs_released) { free_vbuf(vgdev, vbuf); drm_dev_exit(idx); return; } ``` Mirrors the `drm_dev_enter()` failure path at lines 555-558. The cursor pat= h doesn't have fence/objs handling, consistent with the existing code. Good. **Design =E2=80=94 ordering of `virtio_gpu_release_vqs()` vs `drm_dev_unplu= g()`** Placing `virtio_gpu_release_vqs()` before `drm_dev_unplug()` is correct =E2= =80=94 this is the whole point. The flag unblocks waiters so their SRCU cri= tical sections can complete, allowing `synchronize_srcu()` inside `drm_dev_= unplug()` to proceed. The ordering is sound. **Minor =E2=80=94 race window between `vqs_released` check and `goto again`= /`goto retry`** In the ctrl queue path, after the `vqs_released` check: ```c if (vgdev->vqs_released) { ... return -ENODEV; } goto again; ``` If `vqs_released` is false when checked but becomes true before re-entering= the `wait_event`, the next iteration will catch it. This is fine =E2=80=94= the condition is rechecked in `wait_event`. **Minor =E2=80=94 `vqs_released` initialization** The bool field is added to `struct virtio_gpu_device`. Since the struct is = allocated with `kzalloc` (through `drm_dev_alloc`), it will be zero-initial= ized, so `vqs_released` starts as `false`. This is correct. **Style =E2=80=94 comment verbosity** The comment block in `virtio_gpu_queue_ctrl_sgs()`: ```c /* * Set by virtio_gpu_release_vqs() to unblock * synchronize_srcu() wait in drm_dev_unplug(). */ ``` This is reasonable for explaining the non-obvious interaction with `drm_dev= _unplug()`. The shorter `/* See comment in virtio_gpu_queue_ctrl_sgs(). */`= cross-reference in the cursor path is appropriate. **Overall assessment: Good fix. The `READ_ONCE`/`WRITE_ONCE` suggestion is = the main feedback worth acting on. The logic, cleanup paths, and ordering a= re all correct.** --- Generated by Claude Code Patch Reviewer