public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm/tyr: add GPU reset handling
Date: Sat, 14 Mar 2026 07:15:47 +1000	[thread overview]
Message-ID: <review-patch4-20260313091646.16938-5-work@onurozkan.dev> (raw)
In-Reply-To: <20260313091646.16938-5-work@onurozkan.dev>

Patch Review

**This is the main patch and has several items worth discussing.**

#### 1. `as_bound()` safety in async work context

```rust
+    fn reset_work(self: &Arc<Self>) {
+        // ...
+        // SAFETY: `Controller` is part of driver-private data and only exists
+        // while the platform device is bound.
+        let pdev = unsafe { self.pdev.as_ref().as_bound() };
```

This safety argument is the weakest point in the series. The `Controller` holds an `ARef<platform::Device>` which is a refcounted reference — it keeps the device alive, but **being alive is not the same as being bound**. The device could be unbound (e.g., during module removal or device hot-unplug) while the workqueue work item is still executing. The `disable_sync()` in `ResetHandle::drop()` is meant to prevent this, but the safety argument in the comment doesn't articulate this clearly.

The safety justification should reference that `ResetHandle::drop()` calls `disable_sync()` which guarantees no work is running after that point, and `ResetHandle` is dropped before the device unbinds (because it's embedded in `TyrDrmDeviceData` which is driver-private data released during unbind). This chain of reasoning needs to be explicit in the safety comment.

#### 2. Memory ordering on `pending` flag

```rust
+        if self.0.pending.cmpxchg(false, true, Relaxed).is_ok()
+            && self.0.wq.enqueue(self.0.clone()).is_err()
+        {
+            self.0.pending.store(false, Release);
+        }
```

Using `Relaxed` on the cmpxchg is consistent with what Panthor does (`atomic_cmpxchg` in C defaults to relaxed on many architectures). However, there's a subtle issue: if `schedule()` is called from an IRQ handler or another CPU that has just observed a hardware fault, the `Relaxed` ordering means there's no guarantee that the work function will see the state that triggered the reset request. In practice, the workqueue infrastructure provides sufficient memory barriers (work item execution implies acquire semantics from the queue), so this is likely fine, but it's worth noting.

The `Release` store on the failure path is correct — it ensures that the "not pending" state is visible to other CPUs.

In `reset_work`:
```rust
+        self.pending.store(false, Release);
```

This `Release` at the end of `reset_work` pairs with the `Acquire` load in `is_pending()`:
```rust
+    pub(crate) fn is_pending(&self) -> bool {
+        self.0.pending.load(Acquire)
+    }
```

This pairing is correct — `Acquire` on load ensures that if `is_pending()` returns `false`, the caller sees all memory effects of the completed reset.

However, the `cmpxchg` in `schedule()` uses `Relaxed`, which means a successful cmpxchg (false→true) doesn't synchronize with the `Release` store at the end of the previous `reset_work()`. This means `schedule()` callers don't have a happens-before relationship with the completion of a prior reset. For the current code this is fine since `schedule()` doesn't read any shared state that depends on the reset having completed, but it's a footgun for future modifications. Consider using `Acquire` on the cmpxchg for safety.

#### 3. Drop ordering comment

```rust
+    // `ResetHandle::drop()` drains queued/running works and this must happen
+    // before clocks/regulators are dropped. So keep this field before them to
+    // ensure the correct drop order.
+    pub(crate) reset: reset::ResetHandle,
```

Relying on struct field declaration order for drop ordering is correct in Rust (fields are dropped in declaration order), and the comment documenting this is good practice. This is a well-known pattern.

#### 4. Formatting changes mixed in

The patch makes some unrelated formatting changes in `driver.rs`:

```rust
-    device::{
-        Bound,
-        Core,
-        Device, //
-    },
+    device::{
+        Core, //
+    },
```

and:

```rust
-    new_mutex,
-    of,
-    platform,
+    new_mutex, of, platform,
```

These are fine as cleanup of now-unused imports/reformatting, but the commit message doesn't mention them. They're a consequence of the refactoring so this is acceptable.

#### 5. `run_reset` visibility

```rust
+pub(super) fn run_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
```

The doc comment says:

```rust
+/// Its visibility is `pub(super)` only so the probe path can run an
+/// initial reset; it is not part of this module's public API.
```

This is `pub(super)` but could also be `pub(crate)` since `driver.rs` is a sibling module, not a parent. Actually, looking at the module structure (`tyr.rs` declares `mod driver` and `mod reset`), `pub(super)` makes `run_reset` visible to `tyr.rs` and its child modules including `driver.rs`. This is correct.

#### 6. Minor: `#[expect(dead_code)]`

```rust
+    #[expect(dead_code)]
+    pub(crate) fn schedule(&self) {
```

```rust
+    #[expect(dead_code)]
+    pub(crate) fn is_pending(&self) -> bool {
```

These are currently unused and marked with `#[expect(dead_code)]`. This is fine for an incremental series where callers will be added later, but in upstream submission you'd typically want the caller in the same series or the function added when it's needed. The `#[expect(dead_code)]` is better than `#[allow(dead_code)]` since it will warn when the attribute becomes unnecessary.

---
Generated by Claude Code Patch Reviewer

  parent reply	other threads:[~2026-03-13 21:15 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-13  9:16 [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Onur Özkan
2026-03-13  9:16 ` [PATCH v1 RESEND 1/4] drm/tyr: clear reset IRQ before soft reset Onur Özkan
2026-03-13 21:15   ` Claude review: " Claude Code Review Bot
2026-03-13  9:16 ` [PATCH v1 RESEND 2/4] rust: add Work::disable_sync Onur Özkan
2026-03-13 12:00   ` Alice Ryhl
2026-03-13 21:15   ` Claude review: " Claude Code Review Bot
2026-03-13  9:16 ` [PATCH v1 RESEND 3/4] rust: add ordered workqueue wrapper Onur Özkan
2026-03-13 21:15   ` Claude review: " Claude Code Review Bot
2026-03-13  9:16 ` [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling Onur Özkan
2026-03-13 14:56   ` Daniel Almeida
2026-03-13 21:15   ` Claude Code Review Bot [this message]
2026-03-13  9:52 ` [PATCH v1 RESEND 0/4] drm/tyr: implement GPU reset API Alice Ryhl
2026-03-13 11:12   ` Onur Özkan
2026-03-13 11:26     ` Alice Ryhl
2026-03-13 21:15 ` Claude review: " Claude Code Review Bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=review-patch4-20260313091646.16938-5-work@onurozkan.dev \
    --to=claude-review@example.com \
    --cc=dri-devel-reviews@example.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox