From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm: msm: adreno: attempt to recover from ringbuffer drain timeout
Date: Mon, 09 Mar 2026 07:37:27 +1000
Message-ID: <review-patch1-20260308-adreno-ringbuffer-drain-timeout-recovery-v1-1-985a33faf108@postmarketos.org>
In-Reply-To: <20260308-adreno-ringbuffer-drain-timeout-recovery-v1-1-985a33faf108@postmarketos.org>
Patch Review
**Critical issues:**
1. **Missing locking context.** `recover_worker()` in `msm_gpu.c:463` takes `mutex_lock(&gpu->lock)` before calling `gpu->funcs->recover(gpu)`. The new call in `adreno_idle()` invokes `recover()` without taking `gpu->lock`. Callers of `adreno_idle()` (e.g., during `hw_init`, `pm_suspend`, CP register writes) are not necessarily holding this lock, and some may already hold it. As written, recovery runs unsynchronized on the paths that do not hold the lock; adding a `mutex_lock()` inside `adreno_idle()` would instead deadlock on the paths that already hold it, since kernel mutexes are not recursive.
2. **Bypasses critical recovery bookkeeping.** The `recover_worker()` does substantial work before calling `gpu->funcs->recover()`:
- It finds the faulting submit (`find_submit()`)
- Increments fault counters
- Captures crash state (`msm_gpu_crashstate_capture()`)
- Advances fences to skip the hung submit (`ring->memptrs->fence = ++fence`)
- Retires completed submits (`retire_submits()`)
- Replays remaining submits after recovery
The patch skips all of this. Without fence advancement and submit retirement, the GPU will likely re-execute the same faulting command after recovery, causing an infinite hang-recover loop.
3. **Calling recover from within idle is architecturally wrong.** `adreno_idle()` is called from many contexts including `hw_init` paths. For example, `a5xx_gpu.c:975` calls `a5xx_idle()` during `a5xx_hw_init()`. The `adreno_recover()` function itself calls `msm_gpu_hw_init()` (`adreno_gpu.c:709`), which would re-enter the init path and call idle again, creating **infinite recursion**.
4. **No pm_runtime handling.** The `recover_worker` does `pm_runtime_get_sync()` before calling recover and `pm_runtime_put()` after. The patch has no such protection.
5. **The actual call:**
```c
adreno_gpu->funcs->base.recover(gpu);
```
This calls the per-generation recover (e.g., `a5xx_recover`), not the base `adreno_recover()`. The per-gen recover functions (e.g., `a4xx_recover` at `a4xx_gpu.c:350`) do things like dump registers, which may have their own locking assumptions.
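The re-entry described in issue 3 can be made concrete with a small user-space model. This is only a sketch, not driver code: `recur_gpu`, `recur_idle()`, `recur_recover()`, `recur_hw_init()` and the `RECUR_MAX_DEPTH` cutoff are hypothetical stand-ins for the idle, recover, hw_init, idle chain, and the model assumes the ring never drains while recovery is in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary cutoff so the model terminates; the real call chain has none. */
#define RECUR_MAX_DEPTH 8

/* Hypothetical model state; not a structure from the msm driver. */
struct recur_gpu {
    bool ring_drains;    /* false models a ringbuffer that never drains */
    int depth;           /* current nesting level of recur_idle() */
    int max_depth_seen;  /* deepest nesting reached so far */
};

static bool recur_idle(struct recur_gpu *gpu);

/* Models a hw_init path that waits for idle, like a5xx_hw_init(). */
static bool recur_hw_init(struct recur_gpu *gpu)
{
    return recur_idle(gpu);
}

/* Models a recover callback that re-runs hardware init. */
static void recur_recover(struct recur_gpu *gpu)
{
    (void)recur_hw_init(gpu);
}

/* Models the patched idle: on a drain timeout it calls recover directly. */
static bool recur_idle(struct recur_gpu *gpu)
{
    gpu->depth++;
    if (gpu->depth > gpu->max_depth_seen)
        gpu->max_depth_seen = gpu->depth;

    if (!gpu->ring_drains && gpu->depth < RECUR_MAX_DEPTH)
        recur_recover(gpu);  /* re-enters recur_idle() via recur_hw_init() */

    gpu->depth--;
    assert(gpu->depth >= 0);
    return gpu->ring_drains;
}
```

In this model a hung ring drives the nesting all the way to the cutoff, while a healthy ring never nests beyond one level. Without the artificial cutoff, the hung case would recurse until the stack is exhausted.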
**Recommendation:** The right approach is to schedule recovery through the existing `recover_worker` mechanism rather than calling `recover()` directly:
```c
kthread_queue_work(gpu->worker, &gpu->recover_work);
```
This is what the fault/hang IRQ handlers already do (see `a8xx_gpu.c:939`, `a5xx_preempt.c:91`, etc.). It ensures that locking, fence management, crashstate capture, and submit replay all happen correctly. Even that needs care, however: `adreno_idle()` callers currently expect synchronous behavior (they check the return value), so queuing asynchronous recovery would require adjusting the callers too.
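The split between "report the timeout" and "recover later" might look roughly like the user-space model below. It is a sketch under stated assumptions, not a proposed patch: `defer_gpu`, `defer_idle()` and `defer_recover_worker()` are hypothetical names, a boolean stands in for `gpu->lock`, a flag stands in for the queued `recover_work`, and a single counter stands in for fence advancement on the hung ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the msm driver. */
struct defer_gpu {
    bool lock_held;       /* stands in for gpu->lock */
    bool recover_queued;  /* stands in for a queued gpu->recover_work */
    uint32_t fence;       /* last completed fence on the hung ring */
    int recoveries;       /* number of completed recovery passes */
};

/*
 * Idle check: on a drain timeout, only request recovery and report failure.
 * It never touches the lock, so it is safe whether or not the caller holds it.
 */
static bool defer_idle(struct defer_gpu *gpu, bool ring_drained)
{
    if (ring_drained)
        return true;
    gpu->recover_queued = true;  /* models kthread_queue_work() */
    return false;                /* the caller still sees the timeout */
}

/* Worker: runs later from its own context, never from inside defer_idle(). */
static void defer_recover_worker(struct defer_gpu *gpu)
{
    if (!gpu->recover_queued)
        return;
    gpu->recover_queued = false;

    assert(!gpu->lock_held);     /* a real mutex would block here instead */
    gpu->lock_held = true;       /* models mutex_lock(&gpu->lock) */
    gpu->fence++;                /* skip the hung submit before any replay */
    gpu->recoveries++;           /* models the hardware reset itself */
    gpu->lock_held = false;      /* models mutex_unlock(&gpu->lock) */
}
```

The point of the model is that `defer_idle()` still returns failure immediately, so a caller that checks the return value has to handle the timeout itself; recovery only happens once the worker runs with the lock free.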
---
Generated by Claude Code Patch Reviewer
Thread overview:
2026-03-08 21:12 [PATCH] drm: msm: adreno: attempt to recover from ringbuffer drain timeout Paul Sajna
2026-03-08 21:37 ` Claude review: " Claude Code Review Bot
2026-03-08 21:37 ` Claude Code Review Bot [this message]