From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/xe/xe_hw_error: Add support for PVC SoC errors Date: Thu, 05 Mar 2026 13:47:41 +1000 Message-ID: In-Reply-To: <20260304074412.464435-12-riana.tauro@intel.com> References: <20260304074412.464435-7-riana.tauro@intel.com> <20260304074412.464435-12-riana.tauro@intel.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Bug: Correctable errors blindly clear all register bits** ```c if (hw_err =3D=3D HARDWARE_ERROR_CORRECTABLE) { xe_mmio_write32(mmio, SOC_GLOBAL_ERR_STAT_REG(master, hw_err), REG_GENM= ASK(31, 0)); xe_mmio_write32(mmio, SOC_LOCAL_ERR_STAT_REG(master, hw_err), REG_GENMA= SK(31, 0)); xe_mmio_write32(mmio, SOC_GLOBAL_ERR_STAT_REG(slave, hw_err), REG_GENMA= SK(31, 0)); xe_mmio_write32(mmio, SOC_LOCAL_ERR_STAT_REG(slave, hw_err), REG_GENMAS= K(31, 0)); goto unmask_gsysevtctl; } ``` For correctable SoC errors, the registers are cleared without being read fi= rst. This means: (1) no counter is incremented for correctable SoC errors (= the counter value reported via netlink will always be 0), and (2) no diagno= stic log messages are emitted. This seems like a significant gap =E2=80=94 = the netlink interface claims to report soc-internal correctable errors but = the value will never change. **Concern: `log_soc_error` uses `strcmp` against string literal** ```c if (strcmp(name, "Undefined")) { ... atomic_inc(&info[index].counter); } ``` Using string comparison to decide whether to count an error is fragile. A s= entinel value (NULL) or a bitmask approach would be more robust. If someone= changes the "Undefined" string, the filter silently breaks. **Concern: GSYSEVTCTL unmask value is magic number** ```c xe_mmio_write32(mmio, SOC_GSYSEVTCTL_REG(master, slave, i), (HARDWARE_ERROR_MAX << 1) + 1); ``` `(HARDWARE_ERROR_MAX << 1) + 1` =3D `(3 << 1) + 1` =3D `7` =3D `0b111`. Thi= s magic number represents "enable all three error types." A define like `SO= C_GSYSEVTCTL_UNMASK_ALL` would improve readability. **Minor: Missing newline in drm_warn/drm_err format strings** ```c drm_warn(&xe->drm, "%s SOC %s detected", name, severity_str); drm_err_ratelimited(&xe->drm, "%s SOC %s detected", name, severity_str); ``` These format strings lack a trailing `\n`. While `drm_warn`/`drm_err` may a= dd one internally, it's inconsistent with all other log calls in this file = which include `\n`. **Minor: `slave` variable in `soc_hw_error_handler` is unused after initial= ization** ```c u32 master, slave, regbit; ... master =3D SOC_PVC_MASTER_BASE; slave =3D SOC_PVC_SLAVE_BASE; ``` `slave` is only used in the `SOC_GSYSEVTCTL_REG` macro calls and in the cor= rectable error clearing path. The variable is needed but could be `const` f= or clarity. --- Generated by Claude Code Patch Reviewer