From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/xe/xe_hw_error: Integrate DRM RAS with hardware error handling
Date: Tue, 24 Feb 2026 10:45:42 +1000
Message-ID:
In-Reply-To: <20260223060541.526397-10-riana.tauro@intel.com>
References: <20260223060541.526397-7-riana.tauro@intel.com> <20260223060541.526397-10-riana.tauro@intel.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0

Patch Review

**Loss of error detail in log messages:**

> - drm_err_ratelimited(&xe->drm, HW_ERR "Tile%d reported DEV_ERR_STAT_%s blank!\n",
> -                     tile->id, hw_err_str);
> + drm_err_ratelimited(&xe->drm, HW_ERR "Tile%d reported %s DEV_ERR_STAT register blank!\n",
> +                     tile->id, severity_str);

The previous code distinguished between CORRECTABLE, NONFATAL, and FATAL in the log. Now both NONFATAL and FATAL map to "uncorrectable-errors" via `hw_err_to_severity`, so the log can no longer tell nonfatal errors from fatal ones, a distinction that is useful for debugging. In addition, the severity names `"correctable-errors"` and `"uncorrectable-errors"` are userspace-facing node names and are not ideal for kernel log messages.

**RAS init failure is non-fatal:**

> + if (xe->info.platform != XE_PVC)
> +         return 0;
> +
> + return xe_drm_ras_init(xe);

`hw_error_info_init` is called from `xe_hw_error_init`, but the caller only logs the error and continues:

> + ret = hw_error_info_init(xe);
> + if (ret)
> +         drm_err(&xe->drm, "Failed to initialize XE DRM RAS (%pe)\n", ERR_PTR(ret));
> +
> + process_hw_errors(xe);

This means a RAS init failure is non-fatal. That is fine, but `process_hw_errors` will then call `hw_error_source_handler`, which (after patch 4) accesses `ras->info[severity]`. The NULL check in patch 4's `hw_error_source_handler` (`if (!info) goto clear_reg`) protects this path, so it is safe. Please confirm that continuing after a failed init is intentional.

---
Generated by Claude Code Patch Reviewer
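
On the log-message point above, here is a minimal sketch in C of one way to keep a log-facing severity string separate from the userspace-facing node name. Every identifier in it (`sketch_hw_err`, `sketch_node_name`, `sketch_log_str`, `sketch_format_blank_msg`) is hypothetical and chosen for illustration only; none is taken from the xe driver, whose actual enums and helpers may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's three hardware error classes. */
enum sketch_hw_err {
	SKETCH_ERR_CORRECTABLE,
	SKETCH_ERR_NONFATAL,
	SKETCH_ERR_FATAL,
};

/* Userspace-facing node name: NONFATAL and FATAL share one node. */
static const char *sketch_node_name(enum sketch_hw_err err)
{
	return err == SKETCH_ERR_CORRECTABLE ? "correctable-errors"
					     : "uncorrectable-errors";
}

/* Log-facing name: keeps all three classes distinct for debugging. */
static const char *sketch_log_str(enum sketch_hw_err err)
{
	switch (err) {
	case SKETCH_ERR_CORRECTABLE:
		return "CORRECTABLE";
	case SKETCH_ERR_NONFATAL:
		return "NONFATAL";
	case SKETCH_ERR_FATAL:
		return "FATAL";
	}
	return "UNKNOWN";
}

/* Formats the "register blank" message using the log-facing name. */
static int sketch_format_blank_msg(char *buf, size_t len, int tile_id,
				   enum sketch_hw_err err)
{
	return snprintf(buf, len, "Tile%d reported DEV_ERR_STAT_%s blank!",
			tile_id, sketch_log_str(err));
}
```

With a split like this, the RAS counters can still be grouped under two nodes while the kernel log keeps the three-way distinction the previous message had.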