From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/xe/drm_ras: Wire up error threshold callbacks Date: Fri, 05 Jun 2026 06:10:08 +1000 Message-ID: In-Reply-To: <20260604184849.1011985-5-raag.jadav@intel.com> References: <20260604184849.1011985-1-raag.jadav@intel.com> <20260604184849.1011985-5-raag.jadav@intel.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review This wires the threshold callbacks into the correctable error node only. **Issues:** 1. **Threshold only wired for correctable, not uncorrectable**: This is the= biggest issue with this patch. The `assign_node_params` function handles b= oth correctable and uncorrectable severity: ```c if (severity =3D=3D DRM_XE_RAS_ERR_SEV_CORRECTABLE) { node->query_error_counter =3D query_correctable_error_counter; node->clear_error_counter =3D clear_correctable_error_counter; + node->query_error_threshold =3D query_correctable_error_threshold; + node->set_error_threshold =3D set_correctable_error_threshold; } else { node->query_error_counter =3D query_uncorrectable_error_counter; node->clear_error_counter =3D clear_uncorrectable_error_counter; } ``` No `query_uncorrectable_error_threshold` or `set_uncorrectable_error_thresh= old` is added for the else branch. The core drm_ras code (patch 1) already = returns `-EOPNOTSUPP` when the callbacks are NULL, so this won't crash =E2= =80=94 but the commit message doesn't explain why threshold is only meaning= ful for correctable errors. If this is intentional (e.g., hardware only sup= ports threshold for correctable errors), it should be stated in the commit = message. 2. **Redundant validation**: The `query_correctable_error_threshold` callba= ck does its own validation: ```c if (!info || !info[error_id].name) return -ENOENT; if (!xe->info.has_sysctrl) return -EOPNOTSUPP; ``` But the core `get_node_error_threshold()` in patch 1 already validates: - The node exists (xa_load) - The callback exists (returns `-EOPNOTSUPP` if NULL) - The error_id is in range The `!info || !info[error_id].name` check is still useful as driver-level v= alidation (the counter may not be defined for that ID even if it's in range= ). The `has_sysctrl` check is also correct since threshold requires the mai= lbox. This is fine, just noting the defense-in-depth. 3. **The callback functions access `ras->info[severity]` directly**: This r= e-fetches the counter info array that was set up in `assign_node_params`. T= here's no lock protecting this access. If `cleanup_node_param` could race w= ith a netlink request (e.g., during driver unload while a user is querying)= , this could be a use-after-free. This is an existing issue with the counte= r callbacks too, not introduced by this patch. --- Generated by Claude Code Patch Reviewer