From mboxrd@z Thu Jan  1 00:00:00 1970
From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence
Date: Sun, 12 Apr 2026 12:04:48 +1000
Message-ID: <review-patch5-20260408233458.22666-6-ville.syrjala@linux.intel.com>
In-Reply-To: <20260408233458.22666-6-ville.syrjala@linux.intel.com>
References: <20260408233458.22666-1-ville.syrjala@linux.intel.com> <20260408233458.22666-6-ville.syrjala@linux.intel.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Patch Review

This is the core patch. Key design:

1. A `dma_fence` is lazily created on first request (`intel_display_reset_f=
ence_get`) and stored in `display->reset.fence`, protected by `display->res=
et.mutex`.
2. `intel_atomic_commit_fence_wait()` gets the reset fence and waits on bot=
h the plane fence and the reset fence via `dma_fence_wait_any_timeout()`.
3. `intel_display_reset_prepare()` signals the reset fence before proceedin=
g with the reset.
4. `intel_display_reset_finish()` discards the old (signaled) fence so the =
next caller gets a fresh one.

**Race concern =E2=80=94 discard vs. in-flight waiters:** In `intel_display=
_reset_fence_discard()`, the fence is put (potentially freed) under the mut=
ex. But a concurrent `intel_atomic_commit_fence_wait()` has already taken a=
 reference via `dma_fence_get()`, so this is safe =E2=80=94 the refcount pr=
otects against use-after-free. This is correct.

**Potential concern =E2=80=94 signal ordering:** `intel_display_reset_prepa=
re()` signals the fence first, then proceeds to take `mode_config.mutex` an=
d modeset locks. The commit path, after waking from fence wait, proceeds in=
to `intel_atomic_commit_tail()` which eventually calls `drm_atomic_helper_w=
ait_for_dependencies()` etc. The reset path then acquires modeset locks. Th=
ere could be a race where the commit proceeds and tries to do hardware acce=
ss while the reset is in flight. However, this is the same as the original =
design intent =E2=80=94 the fence wait unblocking lets the commit proceed t=
o eventually release modeset locks that the reset needs. The commit path sh=
ould not do harmful hardware access before the reset completes because the =
GPU is already hung.

**`dma_fence_init` with context=3D0, seqno=3D0:** Using `dma_fence_init(fen=
ce, &ops, NULL, 0, 0)` =E2=80=94 context 0 with seqno 0 means all reset fen=
ces share the same context/seqno. This is fine for this use case since ther=
e's only ever one active reset fence at a time, and they are purely for sig=
naling, not ordering. However, `dma_fence` debug infrastructure may complai=
n about context 0 reuse. Consider using `dma_fence_context_alloc(1)` for a =
unique context.

**Missing `mutex_destroy` in cleanup:** `intel_display_reset_fence_init()` =
calls `mutex_init()` but there's no corresponding `mutex_destroy()` in the =
teardown path. `intel_display_driver_remove()` calls `intel_display_reset_f=
ence_discard()` which discards the fence but doesn't destroy the mutex. Min=
or leak of debug resources on CONFIG_DEBUG_MUTEXES kernels.

**xe Makefile addition:** Adding `intel_display_reset.o` to the xe build is=
 needed since `intel_display_reset_fence_get` is now called from shared dis=
play code. The compat stubs for `intel_clock_gating.h` from patch 4 make th=
is build cleanly.

---
Generated by Claude Code Patch Reviewer