public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm/nouveau/fifo: add recovery path for Tesla cache_error/dma_pusher
Date: Sat, 16 May 2026 11:34:50 +1000	[thread overview]
Message-ID: <review-patch2-20260513175014.96599-3-marek@czernohous.de> (raw)
In-Reply-To: <20260513175014.96599-3-marek@czernohous.de>

Patch Review

**Assessment: Functionally correct, a few items to address.**

**Issue 1 (should fix) — German comments in public header.** `include/nvkm/engine/fifo.h` is a header shared across nouveau. The struct field comments are in German:

```c
u32              count;                                  /* aktuelle Fenster-Tiefe */
ktime_t          ts[NVKM_FIFO_WEDGE_RING_MAX];          /* Ring von Timestamps */
u32              head;                                   /* Ring-Head */
```

These should be in English ("current window depth", "timestamp ring", "ring head") or removed entirely, since the field names are already reasonably descriptive.

**Issue 2 (design question) — permanent wedge with no reset path.** Once Tier-2 fires:

```c
if (atomic_xchg(&w->wedged, 1) != 0)
    return; /* already wedged this cycle */
```

The `wedged` atomic is never reset to 0. If userspace performs the `DRM_WEDGE_RECOVERY_REBIND` and the driver is re-bound, the `nvkm_fifo` structure is re-allocated so this is implicitly reset. But if the uevent is *not* acted on (no listener, or userspace ignores it), the wedge event fires exactly once and then Tier-2 is silently dead while Tier-1 keeps running. This is probably intentional, but worth documenting in a comment or the commit message.

**Issue 3 (minor) — `w->count` read without lock in worker.**

```c
static void nv04_fifo_wedge_work(struct work_struct *work)
{
    ...
    fault_count = w->count;
```

`w->count` is updated under `fifo->wedge.lock` in the IRQ path but read here without the lock. This is a benign race since the value is only used for the log message and tracepoint (not control flow), but KCSAN may flag it. A `READ_ONCE()` would silence any tooling complaint and make the intent explicit.

**Issue 4 (minor) — `nv04_fifo_wedge_init`/`fini` called for all chipsets.** These are called from `nvkm_fifo_new_` and `nvkm_fifo_dtor` which run for every GPU generation, not just Tesla. The overhead is trivial (spinlock init + work init), but naming them `nv04_*` and initializing them globally is a bit misleading. Consider either: (a) renaming to something more generic, or (b) conditioning on chipset. Very minor.

**Issue 5 (cosmetic) — dev_info prefix redundancy.**

```c
dev_info(drm_dev->dev,
         "nouveau: fifo wedged after %u faults in %u ms\n",
         fault_count, nouveau_fifo_wedge_window_ms);
```

`dev_info` already prefixes the device name. The manual `"nouveau: "` prefix is redundant and will produce output like `pci 0000:02:00.0: nouveau: fifo wedged ...`. Drop the `"nouveau: "` prefix.

**Correctness notes (all fine):**
- `schedule_work` from IRQ context: safe by design.
- `nvkm_chan_error` from IRQ context: safe — other interrupt handlers (gf100, ga100, gv100) already call it this way.
- Reuse of `flags` variable in `nv04_fifo_recover` for both the chan get/put and the spinlock: correct, the chan is released before the spin_lock.
- `ktime_t` zero as sentinel: safe since `ktime_get()` will never return 0 in practice.
- Module params declared with perm `0400` (read-only): no runtime write race possible.
- `CREATE_TRACE_POINTS` placement in `nouveau_drm.c`: correct, and this is the first trace event header for nouveau.
- Tracepoint `TRACE_INCLUDE_PATH` uses the `../../` relative path pattern: this is the standard kernel convention for out-of-tree-style trace headers.

---
Generated by Claude Code Patch Reviewer

  reply	other threads:[~2026-05-16  1:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 17:50 [PATCH 0/2] drm/nouveau: nv04 FIFO cleanup + recovery for Tesla Marek Czernohous
2026-05-13 17:50 ` [PATCH 1/2] drm/nouveau/fifo/nv04: filter benign CACHE_ERROR from Mesa NV50 bind probe Marek Czernohous
2026-05-16  1:34   ` Claude review: " Claude Code Review Bot
2026-05-13 17:50 ` [PATCH 2/2] drm/nouveau/fifo: add recovery path for Tesla cache_error/dma_pusher Marek Czernohous
2026-05-16  1:34   ` Claude Code Review Bot [this message]
2026-05-16  1:34 ` Claude review: drm/nouveau: nv04 FIFO cleanup + recovery for Tesla Claude Code Review Bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=review-patch2-20260513175014.96599-3-marek@czernohous.de \
    --to=claude-review@example.com \
    --cc=dri-devel-reviews@example.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox