From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/virtio: add timeout to virtqueue wait to avoid hung task
Date: Sat, 16 May 2026 13:58:17 +1000
Message-ID:
In-Reply-To: <20260512-virtio-gpu_wait_event-v1-1-207eb4c1a69a@redhat.com>
References: <20260512-virtio-gpu_wait_event-v1-1-207eb4c1a69a@redhat.com> <20260512-virtio-gpu_wait_event-v1-1-207eb4c1a69a@redhat.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Patch Review

**virtio_gpu_queue_ctrl_sgs timeout path (lines 142–150 of the diff):**

```c
+	if (!wait_event_timeout(vgdev->ctrlq.ack_queue,
+				vq->num_free >= elemcnt,
+				5 * HZ)) {
+		/* The device did not respond */
+		if (fence && vbuf->objs)
+			virtio_gpu_array_unlock_resv(vbuf->objs);
+		free_vbuf(vgdev, vbuf);
+		drm_dev_exit(idx);
+		return -ENODEV;
+	}
```

- **Cleanup is correct:** it mirrors the `drm_dev_enter()` failure path at lines 383–388 of the source. The fence has not been emitted yet at this point (that happens after the `goto again` loop succeeds), so not touching `fence` here is fine.
- **Missing error logging:** this is a significant event (device unresponsive for 5 seconds). At minimum a `drm_err(vgdev->ddev, "timed out waiting for ctrl virtqueue space\n")` should be emitted so the admin/developer can diagnose the failure.
- **`drm_dev_exit(idx)` is correct:** the function called `drm_dev_enter` earlier at line 383 and the timeout path needs to balance it.

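To make the logging suggestion concrete, here is a minimal userspace sketch of the control flow. It is not kernel code: `model_vq`, `model_wait_for_space()`, `model_drm_err()` and `model_queue_ctrl()` are hypothetical stand-ins for the virtqueue, `wait_event_timeout()`, `drm_err()` and the queueing function, and `HZ` is assumed to be 100 purely for illustration.

```c
/* Userspace model of the suggested timeout path. NOT kernel code:
 * every model_* name is a hypothetical stand-in for the kernel API
 * mentioned in its comment. */
#include <errno.h>
#include <stdio.h>

#define HZ 100 /* assumed tick rate, for illustration only */

struct model_vq {
	unsigned int num_free;       /* free descriptors right now */
	unsigned int freed_per_tick; /* descriptors the "device" returns per tick */
};

static int timeout_logs; /* number of timeout messages emitted so far */

/* Stand-in for wait_event_timeout(): returns 0 if the condition is still
 * false when the timeout expires, otherwise the remaining ticks (at least 1). */
static long model_wait_for_space(struct model_vq *vq, unsigned int needed,
				 long timeout)
{
	long left = timeout;

	while (vq->num_free < needed) {
		if (left == 0)
			return 0;
		vq->num_free += vq->freed_per_tick; /* simulated device progress */
		left--;
	}
	return left > 0 ? left : 1;
}

/* Stand-in for drm_err(): print the message and count it. */
static void model_drm_err(const char *msg)
{
	fprintf(stderr, "[drm] *ERROR* %s", msg);
	timeout_logs++;
}

/* Stand-in for the queueing function: log before unwinding on timeout. */
static int model_queue_ctrl(struct model_vq *vq, unsigned int elemcnt)
{
	if (!model_wait_for_space(vq, elemcnt, 5 * HZ)) {
		model_drm_err("timed out waiting for ctrl virtqueue space\n");
		/* The real path would unlock the reservation, call free_vbuf()
		 * and drm_dev_exit() here, exactly as the patch already does. */
		return -ENODEV;
	}
	vq->num_free -= elemcnt; /* descriptors consumed by the request */
	return 0;
}
```

With a model device that never frees descriptors (`freed_per_tick = 0`), `model_queue_ctrl()` returns `-ENODEV` and emits exactly one message; with a responsive device it returns 0 and logs nothing. The only change relative to the patch is the log call placed ahead of the existing cleanup.
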
**virtio_gpu_queue_cursor timeout path (lines 161–167 of the diff):**

```c
+	if (!wait_event_timeout(vgdev->cursorq.ack_queue,
+				vq->num_free >= outcnt,
+				5 * HZ)) {
+		/* The device did not respond */
+		free_vbuf(vgdev, vbuf);
+		drm_dev_exit(idx);
+		return;
+	}
```

- **Cleanup mirrors the `drm_dev_enter` failure path** at lines 555–557 of the source (`free_vbuf` + return). This is correct.
- **`drm_dev_exit(idx)` is correct:** it balances the `drm_dev_enter` at line 555.
- **Same missing logging concern:** the timeout should log a message.
- **Silent cursor drop:** the cursor update is silently lost on timeout. The sole caller `virtio_gpu_cursor_ping()` at line 1305 doesn't check for errors (and can't, since the function returns void). This is acceptable for a cursor ping but still deserves a log message.

**General issues across both sites:**

- **Race with legitimate completion:** The timeout is a fixed `5 * HZ` jiffies (5 seconds). Under memory pressure or host CPU contention, a legitimate completion that arrives at 5.1 seconds would trigger a false `-ENODEV`. Consider whether the timeout should be longer (e.g., 30s) or whether `wait_event_interruptible_timeout` would be more appropriate so the wait can be interrupted by signals, though that introduces its own complexity.
- **No `WARN` or telemetry:** Other drivers that add device-timeout paths often include a `drm_err` or `WARN_ONCE` to make the failure visible. Silently returning `-ENODEV` makes debugging harder.
- **Commit message claims consistency with an existing timeout but doesn't cite it.** The claim "consistent with the existing timeout pattern in the driver" should reference the specific code location so reviewers can verify it.

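On the `wait_event_interruptible_timeout` option: part of the added complexity is that its return value has three cases instead of two (negative `-ERESTARTSYS` when a signal arrives, 0 on timeout with the condition still false, positive when the condition became true). The sketch below is a userspace model of that mapping, not driver code. `MODEL_VQ_TIMEOUT_SECS`, `model_timeout_ticks()` and `model_wait_result_to_errno()` are hypothetical names, the 30-second value is only the example figure mentioned above, and `MODEL_ERESTARTSYS` mirrors the kernel-internal value 512 because userspace `<errno.h>` does not define it.

```c
/* Userspace model of handling the three-way result of
 * wait_event_interruptible_timeout(). NOT kernel code; all model_* and
 * MODEL_* names are hypothetical. */
#include <errno.h>

/* Mirrors ERESTARTSYS from the kernel's include/linux/errno.h. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical named timeout in seconds, replacing a bare "5 * HZ". */
#define MODEL_VQ_TIMEOUT_SECS 30

/* Convert the named timeout to ticks for a given tick rate. */
static long model_timeout_ticks(long hz)
{
	return MODEL_VQ_TIMEOUT_SECS * hz;
}

/* Map the wait result to the value the queueing path would return. */
static int model_wait_result_to_errno(long ret)
{
	if (ret < 0)
		return (int)ret; /* interrupted by a signal: propagate as-is */
	if (ret == 0)
		return -ENODEV;  /* timed out, condition still false */
	return 0;                /* condition became true: proceed */
}
```

Both error cases would still need the same unwinding as the patch's timeout branch before returning, and callers would have to tolerate a restartable error, which is the main cost of the interruptible variant.
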
**Summary:** The patch addresses a real problem but needs (1) logging on the timeout paths, (2) justification for the 5-second value (and consideration of whether it's too short for legitimate workloads), and (3) a commit message that cites which "existing timeout pattern" it refers to.

---
Generated by Claude Code Patch Reviewer