From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/tyr: add GPU reset handling Date: Sat, 14 Mar 2026 07:15:47 +1000 Message-ID: In-Reply-To: <20260313091646.16938-5-work@onurozkan.dev> References: <20260313091646.16938-1-work@onurozkan.dev> <20260313091646.16938-5-work@onurozkan.dev> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **This is the main patch and has several items worth discussing.** #### 1. `as_bound()` safety in async work context ```rust + fn reset_work(self: &Arc) { + // ... + // SAFETY: `Controller` is part of driver-private data and only ex= ists + // while the platform device is bound. + let pdev =3D unsafe { self.pdev.as_ref().as_bound() }; ``` This safety argument is the weakest point in the series. The `Controller` h= olds an `ARef` which is a refcounted reference =E2=80=94 = it keeps the device alive, but **being alive is not the same as being bound= **. The device could be unbound (e.g., during module removal or device hot-= unplug) while the workqueue work item is still executing. The `disable_sync= ()` in `ResetHandle::drop()` is meant to prevent this, but the safety argum= ent in the comment doesn't articulate this clearly. The safety justification should reference that `ResetHandle::drop()` calls = `disable_sync()` which guarantees no work is running after that point, and = `ResetHandle` is dropped before the device unbinds (because it's embedded i= n `TyrDrmDeviceData` which is driver-private data released during unbind). = This chain of reasoning needs to be explicit in the safety comment. #### 2. Memory ordering on `pending` flag ```rust + if self.0.pending.cmpxchg(false, true, Relaxed).is_ok() + && self.0.wq.enqueue(self.0.clone()).is_err() + { + self.0.pending.store(false, Release); + } ``` Using `Relaxed` on the cmpxchg is consistent with what Panthor does (`atomi= c_cmpxchg` in C defaults to relaxed on many architectures). However, there'= s a subtle issue: if `schedule()` is called from an IRQ handler or another = CPU that has just observed a hardware fault, the `Relaxed` ordering means t= here's no guarantee that the work function will see the state that triggere= d the reset request. In practice, the workqueue infrastructure provides suf= ficient memory barriers (work item execution implies acquire semantics from= the queue), so this is likely fine, but it's worth noting. The `Release` store on the failure path is correct =E2=80=94 it ensures tha= t the "not pending" state is visible to other CPUs. In `reset_work`: ```rust + self.pending.store(false, Release); ``` This `Release` at the end of `reset_work` pairs with the `Acquire` load in = `is_pending()`: ```rust + pub(crate) fn is_pending(&self) -> bool { + self.0.pending.load(Acquire) + } ``` This pairing is correct =E2=80=94 `Acquire` on load ensures that if `is_pen= ding()` returns `false`, the caller sees all memory effects of the complete= d reset. However, the `cmpxchg` in `schedule()` uses `Relaxed`, which means a succes= sful cmpxchg (false=E2=86=92true) doesn't synchronize with the `Release` st= ore at the end of the previous `reset_work()`. This means `schedule()` call= ers don't have a happens-before relationship with the completion of a prior= reset. For the current code this is fine since `schedule()` doesn't read a= ny shared state that depends on the reset having completed, but it's a foot= gun for future modifications. Consider using `Acquire` on the cmpxchg for s= afety. #### 3. Drop ordering comment ```rust + // `ResetHandle::drop()` drains queued/running works and this must hap= pen + // before clocks/regulators are dropped. So keep this field before the= m to + // ensure the correct drop order. + pub(crate) reset: reset::ResetHandle, ``` Relying on struct field declaration order for drop ordering is correct in R= ust (fields are dropped in declaration order), and the comment documenting = this is good practice. This is a well-known pattern. #### 4. Formatting changes mixed in The patch makes some unrelated formatting changes in `driver.rs`: ```rust - device::{ - Bound, - Core, - Device, // - }, + device::{ + Core, // + }, ``` and: ```rust - new_mutex, - of, - platform, + new_mutex, of, platform, ``` These are fine as cleanup of now-unused imports/reformatting, but the commi= t message doesn't mention them. They're a consequence of the refactoring so= this is acceptable. #### 5. `run_reset` visibility ```rust +pub(super) fn run_reset(dev: &Device, iomem: &Devres) -> Res= ult { ``` The doc comment says: ```rust +/// Its visibility is `pub(super)` only so the probe path can run an +/// initial reset; it is not part of this module's public API. ``` This is `pub(super)` but could also be `pub(crate)` since `driver.rs` is a = sibling module, not a parent. Actually, looking at the module structure (`t= yr.rs` declares `mod driver` and `mod reset`), `pub(super)` makes `run_rese= t` visible to `tyr.rs` and its child modules including `driver.rs`. This is= correct. #### 6. Minor: `#[expect(dead_code)]` ```rust + #[expect(dead_code)] + pub(crate) fn schedule(&self) { ``` ```rust + #[expect(dead_code)] + pub(crate) fn is_pending(&self) -> bool { ``` These are currently unused and marked with `#[expect(dead_code)]`. This is = fine for an incremental series where callers will be added later, but in up= stream submission you'd typically want the caller in the same series or the= function added when it's needed. The `#[expect(dead_code)]` is better than= `#[allow(dead_code)]` since it will warn when the attribute becomes unnece= ssary. --- Generated by Claude Code Patch Reviewer