From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm/xe/xe_hw_error: Add support for Core-Compute errors
Date: Tue, 03 Mar 2026 14:32:50 +1000
Message-ID: <review-patch4-20260228080858.3063532-11-riana.tauro@intel.com>
In-Reply-To: <20260228080858.3063532-11-riana.tauro@intel.com>

Patch Review

**Platform guard broadened without full coverage:**
```c
-	if (xe->info.platform != XE_BATTLEMAGE)
+	if (!IS_DGFX(xe))
 		return;
```

This now enters `hw_error_source_handler` for all discrete GPUs, but `hw_error_info_init` only initializes RAS for PVC. For Battlemage (and future DGFX platforms), `info` will be NULL. The `if (!info) goto clear_reg` guard prevents a crash, but it also means **Battlemage silently stops processing error bits other than CSC**. That is a regression from the current code, which does process them (even if only by clearing the register). Please confirm whether this behavioral change is intentional.

**`xe_hw_error_map` array size issue with `break` vs `continue`:**
```c
+	for_each_set_bit(err_bit, &err_src, XE_RAS_REG_SIZE) {
+		/* Check error bit is within bounds */
+		if (err_bit >= ARRAY_SIZE(xe_hw_error_map))
+			break;
```

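In this patch, `xe_hw_error_map` has only index `[0]`, so `ARRAY_SIZE` is 1 and the `break` exits the loop at the first set bit above 0. If bit 0 (GT) and bit 17 (CSC) are both set in the same status register read, bit 17 never reaches the per-bit handling in this loop, which is harmless only because CSC is handled separately above. Note that `for_each_set_bit` visits bits in ascending order, so for the bounds check alone `break` and `continue` skip the same bits. `continue` is still the safer idiom, because it stays correct if the check is later extended to reject unpopulated entries inside the array. In patch 5 the array grows to size 17, so bits 0-16 pass the bounds check; any test that rejects unpopulated entries in that range should use `continue`, otherwise a lower unhandled bit would end the loop before higher valid bits are seen.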
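
To make the `break` versus `continue` distinction concrete, here is a minimal userspace sketch. Everything prefixed `demo_` is hypothetical and not taken from the patch: the map contents, the sparse layout, and the helper itself are assumptions for illustration, and a plain ascending loop stands in for the kernel's `for_each_set_bit`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sparse error map: NULL marks an unhandled bit. */
static const char *const demo_error_map[] = {
	[0] = "gt",
	[3] = "soc",	/* bits 1 and 2 are deliberately left unpopulated */
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Walk the set bits of err_src in ascending order and count the handled
 * ones. stop_on_unhandled selects break (non-zero) or continue (zero)
 * when an unpopulated map entry is hit.
 */
static int demo_count_handled(unsigned long err_src, unsigned int nbits,
			      int stop_on_unhandled)
{
	int handled = 0;
	unsigned int bit;

	assert(nbits <= sizeof(err_src) * 8);

	for (bit = 0; bit < nbits; bit++) {
		if (!(err_src & (1UL << bit)))
			continue;
		/* Ascending walk: every later bit is out of bounds too. */
		if (bit >= DEMO_ARRAY_SIZE(demo_error_map))
			break;
		if (!demo_error_map[bit]) {
			if (stop_on_unhandled)
				break;
			continue;
		}
		handled++;
	}
	return handled;
}
```

With bits 0, 1 and 3 set (`0xB`), the `continue` variant counts two handled bits while the `break` variant stops at bit 1 and counts one. With only bits 0 and 17 set, both variants count one, because the bounds check ends the walk either way.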

**Potential double-counting in subslice error path:**
```c
+		case ERR_STAT_GT_VECTOR0:
+		case ERR_STAT_GT_VECTOR1: {
+			val = hweight32(vector);
+			atomic_add(val, &info[error_id].counter);
+			...
+			err_stat = xe_mmio_read32(mmio, ERR_STAT_GT_REG(hw_err));
+			for_each_set_bit(errbit, &err_stat, GT_HW_ERROR_MAX_ERR_BITS) {
+				if (PVC_ERROR_MASK_SET(hw_err, errbit))
+					atomic_inc(&info[error_id].counter);
+			}
```

The vector register bits and the error status register bits both increment the same counter. If these represent the same underlying errors reported through two different registers, this double-counts. Please clarify in comments whether these are truly independent error events.
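
As a rough illustration of the accounting concern, the sketch below keeps the two sources in separate fields so their contributions stay visible. It is a userspace model only: `demo_hweight32`, `struct demo_counts`, and `demo_count` are hypothetical names, the mask is passed in as a plain bitmask rather than derived from `PVC_ERROR_MASK_SET`, and nothing here says whether the hardware reports the two registers independently.

```c
#include <stdint.h>

/* Portable population count, standing in for the kernel's hweight32(). */
static unsigned int demo_hweight32(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/* Hypothetical split counters: one per reporting register. */
struct demo_counts {
	unsigned int from_vector;
	unsigned int from_err_stat;
};

static struct demo_counts demo_count(uint32_t vector, uint32_t err_stat,
				     uint32_t mask)
{
	struct demo_counts c;

	c.from_vector = demo_hweight32(vector);
	/* Only status bits selected by the mask are counted. */
	c.from_err_stat = demo_hweight32(err_stat & mask);
	return c;
}
```

A single shared counter, as in the quoted hunk, would report `from_vector + from_err_stat`. If both registers describe the same events, that sum overstates the error count; if they are independent, the sum is correct and a comment saying so would settle the question.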

**`PVC_ERROR_MASK_SET` macro missing parentheses around `hw_err`:**
```c
+#define PVC_ERROR_MASK_SET(hw_err, err_bit)	((hw_err == HARDWARE_ERROR_CORRECTABLE) ? \
```
Should be `((hw_err) == HARDWARE_ERROR_CORRECTABLE)` to prevent operator-precedence issues if a complex expression is passed.

Same issue in `PVC_GT_VECTOR_LEN`, `ERR_STAT_GT_VECTOR_REG`, and other macros using unparenthesized parameters.
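
A minimal sketch of how the unparenthesized parameter can misbehave follows. The enum values and the `DEMO_` macros are hypothetical and simplified from the quoted macro; they only show the precedence effect when a conditional expression is passed as the argument.

```c
/* Hypothetical severity values, chosen only for this illustration. */
enum demo_hw_err {
	DEMO_ERR_CORRECTABLE = 0,
	DEMO_ERR_NONFATAL = 1,
	DEMO_ERR_FATAL = 2,
};

/* Parameter used bare: the argument's operators mix with '=='. */
#define DEMO_IS_CORRECTABLE_BAD(hw_err)	(hw_err == DEMO_ERR_CORRECTABLE)

/* Parameter parenthesized: the argument is evaluated as a unit first. */
#define DEMO_IS_CORRECTABLE_GOOD(hw_err)	((hw_err) == DEMO_ERR_CORRECTABLE)

static int demo_bad(int is_fatal)
{
	/*
	 * Expands to:
	 *   is_fatal ? DEMO_ERR_FATAL
	 *            : (DEMO_ERR_CORRECTABLE == DEMO_ERR_CORRECTABLE)
	 * so the result is non-zero for either value of is_fatal.
	 */
	return DEMO_IS_CORRECTABLE_BAD(is_fatal ? DEMO_ERR_FATAL
						: DEMO_ERR_CORRECTABLE) != 0;
}

static int demo_good(int is_fatal)
{
	/* The whole conditional is compared against DEMO_ERR_CORRECTABLE. */
	return DEMO_IS_CORRECTABLE_GOOD(is_fatal ? DEMO_ERR_FATAL
						 : DEMO_ERR_CORRECTABLE) != 0;
}
```

Here `demo_bad(1)` reports a fatal error as correctable, while `demo_good(1)` does not. Current callers that pass a plain variable are unaffected, which is why this is a robustness fix rather than a functional bug.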

---
Generated by Claude Code Patch Reviewer
