public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/vmwgfx: Break ABBA deadlock in vblank disable path
@ 2026-05-22 12:35 w15303746062
  2026-05-25  8:39 ` Claude review: " Claude Code Review Bot
  2026-05-25  8:39 ` Claude Code Review Bot
  0 siblings, 2 replies; 3+ messages in thread
From: w15303746062 @ 2026-05-22 12:35 UTC (permalink / raw)
  To: zack.rusin
  Cc: bcm-kernel-feedback-list, maarten.lankhorst, mripard, tzimmermann,
	airlied, simona, dri-devel, linux-kernel, 25181214217

From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

A severe deadlock occurs when disabling the CRTC while the VKMS vblank
hrtimer is running. The issue is caused by a circular lock dependency
(ABBA) involving the DRM core's dev->vbl_lock and the hrtimer cancellation
sequence.

Stack traces from NMI backtrace confirm the deadlock:
CPU 0 (IRQ Context):
 [ <0>] hrtimer_interrupt
 [ <0>] vkms_vblank_simulate
 [ <0>] drm_crtc_handle_vblank
 [ <0>] _raw_spin_lock_irqsave (waiting for dev->vbl_lock)

CPU 2 (Process Context):
 [ <2>] drm_crtc_vblank_off
 [ <2>] vmw_vkms_disable_vblank
 [ <2>] hrtimer_cancel (blocks waiting for timer callback)

This results in a system lockup and RCU stall:
[ 3367.370429] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks
[ 3367.912523] rcu: rcu_preempt kthread starved for 10504 jiffies!

The driver incorrectly calls the blocking hrtimer_cancel() while holding
dev->vbl_lock inside the disable_vblank() callback.

Fix this by using hrtimer_try_to_cancel() in vmw_vkms_disable_vblank().
This callback must remain non-blocking as it is called with dev->vbl_lock
held by the DRM core. Subsequently, call hrtimer_cancel() in
vmw_vkms_crtc_atomic_disable() *after* drm_crtc_vblank_off() has released
the lock. This ensures the timer is safely and synchronously stopped
without inducing a deadlock.

Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
index 5abd7f5ad2db..96fc856b9e06 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
@@ -305,7 +305,10 @@ vmw_vkms_disable_vblank(struct drm_crtc *crtc)
 	if (!vmw->vkms_enabled)
 		return;
 
-	hrtimer_cancel(&du->vkms.timer);
+	/*
+	 * Non-blocking cancel to avoid ABBA deadlock while holding vbl_lock.
+	 */
+	hrtimer_try_to_cancel(&du->vkms.timer);
 	du->vkms.surface = NULL;
 	du->vkms.period_ns = ktime_set(0, 0);
 }
@@ -390,9 +393,16 @@ vmw_vkms_crtc_atomic_disable(struct drm_crtc *crtc,
 			     struct drm_atomic_state *state)
 {
 	struct vmw_private *vmw = vmw_priv(crtc->dev);
+	struct vmw_display_unit *du = vmw_crtc_to_du(crtc);
 
-	if (vmw->vkms_enabled)
-		drm_crtc_vblank_off(crtc);
+	if (vmw->vkms_enabled) {
+		drm_crtc_vblank_off(crtc);
+		/*
+		 * Synchronously stop the timer after releasing the vbl_lock
+		 * to ensure no further callbacks occur.
+		 */
+		hrtimer_cancel(&du->vkms.timer);
+	}
 }
 
 static bool
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Claude review: drm/vmwgfx: Break ABBA deadlock in vblank disable path
  2026-05-22 12:35 [PATCH] drm/vmwgfx: Break ABBA deadlock in vblank disable path w15303746062
@ 2026-05-25  8:39 ` Claude Code Review Bot
  2026-05-25  8:39 ` Claude Code Review Bot
  1 sibling, 0 replies; 3+ messages in thread
From: Claude Code Review Bot @ 2026-05-25  8:39 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: drm/vmwgfx: Break ABBA deadlock in vblank disable path
Author: w15303746062@163.com
Patches: 1
Reviewed: 2026-05-25T18:39:18.610164

---

**Already fixed upstream — NAK.**

This is a single patch targeting a real deadlock that **has already been fixed** in drm-next via a different, more comprehensive approach. The vmwgfx vkms code was reworked to use the DRM core's vblank timer infrastructure (`drm_crtc_vblank_cancel_timer()`, `drm_crtc_vblank_start_timer()`), which centralizes the timer management and properly uses `hrtimer_try_to_cancel()` internally (`drm_vblank.c:2278`). The old per-driver `du->vkms.timer` hrtimer and `du->vkms.period_ns` no longer exist in the current code.

The patch does not apply cleanly because the code it modifies has been replaced.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Claude review: drm/vmwgfx: Break ABBA deadlock in vblank disable path
  2026-05-22 12:35 [PATCH] drm/vmwgfx: Break ABBA deadlock in vblank disable path w15303746062
  2026-05-25  8:39 ` Claude review: " Claude Code Review Bot
@ 2026-05-25  8:39 ` Claude Code Review Bot
  1 sibling, 0 replies; 3+ messages in thread
From: Claude Code Review Bot @ 2026-05-25  8:39 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Status: Superseded by upstream refactoring**

**The bug analysis is correct.** The commit message accurately describes the ABBA deadlock: `drm_crtc_vblank_off()` holds `dev->vbl_lock` and calls `disable_vblank()`, which called blocking `hrtimer_cancel()`, while the timer callback calls `drm_crtc_handle_vblank()` which tries to acquire `dev->vbl_lock`. This is a legitimate deadlock scenario.

**The fix approach has a correctness gap.** In `vmw_vkms_disable_vblank()`:

```c
-	hrtimer_cancel(&du->vkms.timer);
+	hrtimer_try_to_cancel(&du->vkms.timer);
 	du->vkms.surface = NULL;
```

`hrtimer_try_to_cancel()` returns `-1` if the timer callback is currently executing, meaning the cancel did not happen. The patch then sets `du->vkms.surface = NULL` regardless, so a still-running timer callback could race against the surface being cleared. The follow-up `hrtimer_cancel()` in `vmw_vkms_crtc_atomic_disable()` would eventually synchronize, but between the `try_to_cancel` and the later `hrtimer_cancel`, the timer callback may fire again and access the now-NULL surface pointer.

**The upstream fix is superior.** In drm-next, the entire vblank timer has been moved into the DRM core (`struct drm_vblank_crtc_timer`). The `vmw_vkms_disable_vblank()` now calls `drm_crtc_vblank_cancel_timer(crtc)` which internally uses `hrtimer_try_to_cancel()` with proper interval-zeroing under a spinlock, and the old `du->vkms.timer` field no longer exists. The `vmw_vkms_crtc_atomic_disable()` function no longer needs a separate `hrtimer_cancel()` call.

**Minor style issues (moot given NAK):**
- The `period_ns` line (`du->vkms.period_ns = ktime_set(0, 0)`) was removed in the diff context but this field no longer exists upstream either.
- The added comments, while accurate, are somewhat verbose for kernel style.

**Recommendation:** This patch should be rejected. The underlying bug was real but has been fixed by the vblank timer infrastructure refactoring already in drm-next. If the submitter is targeting a stable kernel that predates that refactoring, they should note the target branch and add a `Fixes:` tag identifying the commit that introduced the original `hrtimer_cancel()` call, along with a `Cc: stable@vger.kernel.org` tag.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-25  8:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-22 12:35 [PATCH] drm/vmwgfx: Break ABBA deadlock in vblank disable path w15303746062
2026-05-25  8:39 ` Claude review: " Claude Code Review Bot
2026-05-25  8:39 ` Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox