* [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback
@ 2026-05-15 13:18 w15303746062
2026-05-15 15:09 ` Greg KH
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: w15303746062 @ 2026-05-15 13:18 UTC (permalink / raw)
To: louis.chauvet, hamohammed.sa, simona, melissa.srw,
maarten.lankhorst, mripard, tzimmermann, airlied
Cc: dri-devel, linux-kernel, stable, Mingyu Wang
From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
[Note: This patch addresses a legacy VKMS implementation deadlock specific
to older stable trees (e.g., 6.18.y). Mainline has removed this code during
the generic DRM_CRTC_VBLANK_TIMER_FUNCS refactoring.]
During local fuzzing with Syzkaller, an RCU preempt stall (soft lockup)
was observed. This is caused by an ABBA deadlock between the
drm_vblank_disable_and_save() function and the vkms_vblank_simulate()
hrtimer callback.
The race condition occurs as follows:
Thread A (CPU 3 - DRM_IOCTL_MODE_SETCRTC):
- drm_vblank_disable_and_save() acquires `&dev->vblank_time_lock`.
- Calls __disable_vblank() -> vkms_disable_vblank().
- Calls hrtimer_cancel() to synchronously stop the vblank timer.
- BLOCK: hrtimer_cancel() spins indefinitely waiting for the timer
callback to finish executing on CPU 0.
Thread B (CPU 0 - hrtimer interrupt):
- Executes the hrtimer callback vkms_vblank_simulate().
- Calls drm_crtc_handle_vblank() -> drm_handle_vblank().
- BLOCK: drm_handle_vblank() tries to acquire `&dev->vblank_time_lock`
and spins forever because Thread A is holding it.
This patch fixes the deadlock by replacing hrtimer_cancel() with
hrtimer_try_to_cancel(). If the timer callback is running, try_to_cancel()
will safely return -1 and allow Thread A to proceed and release the lock.
Additionally, vkms_vblank_simulate() is modified to conditionally return
HRTIMER_NORESTART if drm_crtc_handle_vblank() fails (which it will,
because Thread A sets `vblank->enabled = false` immediately after
try_to_cancel). This acts as a self-destruct mechanism, preventing the
timer from blindly re-arming itself and causing an infinite loop of
DRM_ERROR messages.
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
drivers/gpu/drm/vkms/vkms_crtc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index e60573e0f3e9..a62153b73548 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -57,7 +57,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
dma_fence_end_signalling(fence_cookie);
- return HRTIMER_RESTART;
+ return ret ? HRTIMER_RESTART : HRTIMER_NORESTART;
}
static int vkms_enable_vblank(struct drm_crtc *crtc)
@@ -77,7 +77,7 @@ static void vkms_disable_vblank(struct drm_crtc *crtc)
{
struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
- hrtimer_cancel(&out->vblank_hrtimer);
+ hrtimer_try_to_cancel(&out->vblank_hrtimer);
}
static bool vkms_get_vblank_timestamp(struct drm_crtc *crtc,
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback
2026-05-15 13:18 [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback w15303746062
@ 2026-05-15 15:09 ` Greg KH
2026-05-15 23:05 ` Claude review: " Claude Code Review Bot
2026-05-15 23:05 ` Claude Code Review Bot
2 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2026-05-15 15:09 UTC (permalink / raw)
To: w15303746062
Cc: louis.chauvet, hamohammed.sa, simona, melissa.srw,
maarten.lankhorst, mripard, tzimmermann, airlied, dri-devel,
linux-kernel, stable, Mingyu Wang
On Fri, May 15, 2026 at 09:18:26PM +0800, w15303746062@163.com wrote:
> From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
>
> [Note: This patch addresses a legacy VKMS implementation deadlock specific
> to older stable trees (e.g., 6.18.y). Mainline has removed this code during
> the generic DRM_CRTC_VBLANK_TIMER_FUNCS refactoring.]
Why not apply those upstream commits here as well? No need to diverge
from Linus's tree, otherwise we will end up having a mess that nothing
can ever be backported to.
How many commits need to be backported? Have you tried?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 4+ messages in thread
* Claude review: drm/vkms: Fix ABBA deadlock in vblank disable and timer callback
2026-05-15 13:18 [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback w15303746062
2026-05-15 15:09 ` Greg KH
@ 2026-05-15 23:05 ` Claude Code Review Bot
2026-05-15 23:05 ` Claude Code Review Bot
2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-05-15 23:05 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: drm/vkms: Fix ABBA deadlock in vblank disable and timer callback
Author: <w15303746062@163.com>
Patches: 2
Reviewed: 2026-05-16T09:05:03.765320
---
This is a single patch targeting stable tree 6.18.y that attempts to fix an ABBA deadlock in the VKMS driver between `drm_vblank_disable_and_save()` (which holds `vblank_time_lock` and calls `hrtimer_cancel()`) and `vkms_vblank_simulate()` (an hrtimer callback that tries to acquire `vblank_time_lock` via `drm_handle_vblank()`).
**The deadlock analysis is correct** — this is a real lock ordering issue. In fact, **mainline has already fixed this exact class of bug** as part of the `DRM_CRTC_VBLANK_TIMER_FUNCS` refactoring. The new `drm_crtc_vblank_cancel_timer()` function at `drm_vblank.c:2261` uses the same `hrtimer_try_to_cancel()` approach plus interval-clearing, and the comment at line 2268-2271 explicitly documents this deadlock:
> Calling hrtimer_cancel() can result in a deadlock with DRM's vblank_time_lock and hrtimers' softirq_expiry_lock.
However, this patch has **significant issues** that make it unsuitable for merging as-is.
**Verdict: Needs revision.** The fix idea is sound but the implementation is incomplete and the patch targets the wrong tree.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
* Claude review: drm/vkms: Fix ABBA deadlock in vblank disable and timer callback
2026-05-15 13:18 [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback w15303746062
2026-05-15 15:09 ` Greg KH
2026-05-15 23:05 ` Claude review: " Claude Code Review Bot
@ 2026-05-15 23:05 ` Claude Code Review Bot
2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-05-15 23:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Positive aspects:**
- The deadlock analysis in the commit message is well-explained, with clear thread interleaving showing the ABBA pattern.
- The `hrtimer_cancel()` → `hrtimer_try_to_cancel()` change is the right direction, matching the approach that mainline adopted in `drm_crtc_vblank_cancel_timer()`.
**Issues:**
1. **`hrtimer_try_to_cancel()` return value is silently ignored:**
```c
- hrtimer_cancel(&out->vblank_hrtimer);
+ hrtimer_try_to_cancel(&out->vblank_hrtimer);
```
`hrtimer_try_to_cancel()` returns `-1` when the callback is currently executing, meaning the timer was **not** cancelled. The patch relies entirely on the timer self-stopping via `HRTIMER_NORESTART`, but there is a race window: the callback could be between `drm_crtc_handle_vblank()` returning true and evaluating `ret` for the return value. In that window, `ret` is true, so the timer re-arms itself even though vblank was just disabled. Compare with mainline's `drm_crtc_vblank_cancel_timer()` which clears the timer interval to zero under a spinlock as a definitive cancellation signal, then calls `hrtimer_try_to_cancel()`:
```c
spin_lock_irqsave(&vtimer->interval_lock, flags);
vtimer->interval = 0;
spin_unlock_irqrestore(&vtimer->interval_lock, flags);
hrtimer_try_to_cancel(&vtimer->timer);
```
The patch lacks an equivalent mechanism to guarantee the timer won't re-arm.
2. **The `ret` variable semantics are unclear:**
```c
- return HRTIMER_RESTART;
+ return ret ? HRTIMER_RESTART : HRTIMER_NORESTART;
```
The commit message says `drm_crtc_handle_vblank()` "fails" when vblank is disabled, but `drm_crtc_handle_vblank()` returns `false` simply when the vblank counter is not enabled — not necessarily an error. There are legitimate scenarios where it might return false transiently without implying the timer should permanently stop. This change could cause premature timer termination in edge cases during enable/disable transitions.
3. **Targeting wrong tree / applicability concern:** The cover note says this targets "6.18.y" stable, but 6.18.y doesn't exist yet. The patch is written against old VKMS code that had a per-output `vblank_hrtimer`. In mainline (drm-next), this code no longer exists — the `vkms_output` struct no longer contains `vblank_hrtimer`, and `vkms_vblank_simulate()` / `vkms_disable_vblank()` have been removed entirely. If this is truly for a stable backport, it needs a `Fixes:` tag referencing the original commit that introduced the deadlock, and the submitter should specify exactly which stable kernel versions are affected.
4. **Missing `Fixes:` tag:** A stable-targeted patch must have a `Fixes:` tag to identify which commit introduced the bug. Without it, the stable team cannot determine which kernels need the fix.
5. **No `Cc: stable` in the commit itself:** While `stable@vger.kernel.org` appears in the email Cc, the conventional `Cc: stable@vger.kernel.org` trailer is missing from the signed-off area.
6. **Incomplete deadlock resolution:** Even if `hrtimer_try_to_cancel()` returns `-1` and the timer callback is running, the callback could still be spinning on `vblank_time_lock` at that very moment. The patch assumes that after `try_to_cancel` returns, Thread A proceeds and releases `vblank_time_lock`, allowing the callback to finish. This is correct for the specific ABBA case described, but it means `vkms_disable_vblank()` returns **without the timer actually being cancelled**. The timer fires one more time, gets `HRTIMER_NORESTART` from the `ret` check, and stops. During that window, vblank events could still be processed after the caller believes vblank is disabled. The mainline fix handles this more robustly by zeroing the interval.
**Recommendation:** If the submitter intends this for an older stable tree, they should:
- Add a `Fixes:` tag
- Specify exactly which stable versions are affected
- Add a flag/state check (similar to mainline's interval-clearing) rather than relying solely on `drm_crtc_handle_vblank()`'s return value
- Consider whether a cleaner backport of the mainline `DRM_CRTC_VBLANK_TIMER_FUNCS` fix is feasible instead of a partial workaround
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-15 23:05 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-15 13:18 [PATCH 6.18.y] drm/vkms: Fix ABBA deadlock in vblank disable and timer callback w15303746062
2026-05-15 15:09 ` Greg KH
2026-05-15 23:05 ` Claude review: " Claude Code Review Bot
2026-05-15 23:05 ` Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox