From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: drm/xe/xe_hw_error: Add support for Core-Compute errors
Date: Tue, 03 Mar 2026 14:32:50 +1000
Message-ID: <review-patch4-20260228080858.3063532-11-riana.tauro@intel.com>
In-Reply-To: <20260228080858.3063532-11-riana.tauro@intel.com>
Patch Review
**Platform guard broadened without full coverage:**
```c
-	if (xe->info.platform != XE_BATTLEMAGE)
+	if (!IS_DGFX(xe))
 		return;
```
This now enters `hw_error_source_handler` for all discrete GPUs, but `hw_error_info_init` only initializes RAS for PVC. For Battlemage (and future DGFX), `info` will be NULL. The `if (!info) goto clear_reg` guard prevents a crash, but it also means **Battlemage silently stops processing error bits other than CSC** — a regression from the current code that does process them (even if just clearing the register). Consider whether this behavioral change is intentional.
**`xe_hw_error_map` array size issue with `break` vs `continue`:**
```c
+	for_each_set_bit(err_bit, &err_src, XE_RAS_REG_SIZE) {
+		/* Check error bit is within bounds */
+		if (err_bit >= ARRAY_SIZE(xe_hw_error_map))
+			break;
```
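In this patch, `xe_hw_error_map` has only index `[0]`, so `ARRAY_SIZE` is 1. `for_each_set_bit` visits bits in ascending order, so bit 0 is still handled, but the `break` ends the loop at the first set bit above 0. If bit 0 (GT) and bit 17 (CSC) are both set in the same status register read, this loop never reaches CSC; that is harmless only because CSC is handled separately above. Because the iteration is ascending, `continue` would skip the same bits as `break` here, but it states the intent more clearly and stays correct if the iteration order or the table layout ever changes. In patch 5 the array grows to size 17, so bits 0-16 pass this check and only a set bit at index 17 or higher ends the loop.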
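To make the bounds-check behavior concrete, here is a minimal userspace sketch, not kernel code. It assumes a 32-bit status register and a one-entry table, and it walks set bits in ascending order the way `for_each_set_bit` does. `MAP_SIZE`, `REG_BITS`, and both function names are hypothetical stand-ins introduced for illustration.

```c
#include <stdint.h>

/* Assumptions for illustration only: a one-entry table and a 32-bit register. */
#define MAP_SIZE 1u
#define REG_BITS 32u

/* Returns a mask of the bits the loop handles when the bounds check uses break. */
static uint32_t handled_with_break(uint32_t err_src)
{
	uint32_t handled = 0;

	for (unsigned int bit = 0; bit < REG_BITS; bit++) {
		if (!(err_src & (UINT32_C(1) << bit)))
			continue;	/* only set bits are visited */
		if (bit >= MAP_SIZE)
			break;		/* leaves the loop at the first out-of-range bit */
		handled |= UINT32_C(1) << bit;
	}
	return handled;
}

/* Same loop, but the bounds check skips the bit instead of leaving the loop. */
static uint32_t handled_with_continue(uint32_t err_src)
{
	uint32_t handled = 0;

	for (unsigned int bit = 0; bit < REG_BITS; bit++) {
		if (!(err_src & (UINT32_C(1) << bit)))
			continue;
		if (bit >= MAP_SIZE)
			continue;	/* skips only this bit */
		handled |= UINT32_C(1) << bit;
	}
	return handled;
}
```

With bits 0 and 17 both set, each variant handles bit 0 and never handles bit 17, so in this sketch anything above the table size has to be processed outside the loop.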
**Potential double-counting in subslice error path:**
```c
+ case ERR_STAT_GT_VECTOR0:
+ case ERR_STAT_GT_VECTOR1: {
+ val = hweight32(vector);
+ atomic_add(val, &info[error_id].counter);
+ ...
+ err_stat = xe_mmio_read32(mmio, ERR_STAT_GT_REG(hw_err));
+ for_each_set_bit(errbit, &err_stat, GT_HW_ERROR_MAX_ERR_BITS) {
+ if (PVC_ERROR_MASK_SET(hw_err, errbit))
+ atomic_inc(&info[error_id].counter);
+ }
```
The vector register bits and the error status register bits both increment the same counter. If these represent the same underlying errors reported through two different registers, this double-counts. Please clarify in comments whether these are truly independent error events.
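The concern can be illustrated with a small userspace sketch, assuming a single counter that accumulates both the population count of the vector register and the masked bits of the status register. `count_events`, `popcount32`, and the `stat_mask` parameter (standing in for the `PVC_ERROR_MASK_SET` check) are hypothetical names, not taken from the patch.

```c
#include <stdint.h>

/* Portable population count, playing the role of hweight32() in this sketch. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Adds both sources into one counter, as the quoted hunk appears to do.
 * stat_mask is a hypothetical stand-in for the per-severity mask check.
 */
static unsigned int count_events(uint32_t vector, uint32_t err_stat,
				 uint32_t stat_mask)
{
	unsigned int counter = 0;

	counter += popcount32(vector);			/* vector register bits */
	counter += popcount32(err_stat & stat_mask);	/* status register bits */
	return counter;
}
```

If one hardware fault sets one bit in each register, this sketch returns 2 for a single event, which is the double-count the comment above asks the author to rule out.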
**`PVC_ERROR_MASK_SET` macro missing parentheses around `hw_err`:**
```c
+#define PVC_ERROR_MASK_SET(hw_err, err_bit) ((hw_err == HARDWARE_ERROR_CORRECTABLE) ? \
```
Should be `((hw_err) == HARDWARE_ERROR_CORRECTABLE)` to prevent operator-precedence issues if a complex expression is passed.
Same issue in `PVC_GT_VECTOR_LEN`, `ERR_STAT_GT_VECTOR_REG`, and other macros using unparenthesized parameters.
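A short userspace sketch of the precedence hazard follows. The enum values, mask constants, and macro names are hypothetical and chosen only to show the effect. Because `==` binds tighter than `&` in C, an argument such as `raw & 0x3u` is regrouped inside the unparenthesized macro.

```c
/* Hypothetical severity values and masks, for illustration only. */
enum hw_err_kind { ERR_CORRECTABLE = 0, ERR_NONFATAL = 1, ERR_FATAL = 2 };

#define CORR_MASK  0xAu
#define OTHER_MASK 0xBu

/* Parameter not parenthesized: `raw & 0x3u == X` parses as `raw & (0x3u == X)`. */
#define MASK_UNSAFE(hw_err) ((hw_err == ERR_CORRECTABLE) ? CORR_MASK : OTHER_MASK)

/* Parameter parenthesized: the argument is evaluated as a unit first. */
#define MASK_SAFE(hw_err) (((hw_err) == ERR_CORRECTABLE) ? CORR_MASK : OTHER_MASK)

/* Both helpers pass the low two bits of raw as the severity field. */
static unsigned int mask_unsafe(unsigned int raw)
{
	return MASK_UNSAFE(raw & 0x3u);
}

static unsigned int mask_safe(unsigned int raw)
{
	return MASK_SAFE(raw & 0x3u);
}
```

For `raw = 0x4` the severity field is 0 (correctable), so `mask_safe` selects `CORR_MASK`, while `mask_unsafe` evaluates `raw & (0x3u == 0)`, which is always 0, and selects `OTHER_MASK`.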
---
Generated by Claude Code Patch Reviewer