From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release() Date: Thu, 07 May 2026 13:31:56 +1000 Message-ID: In-Reply-To: <20260506-panthor-shrinker-fixes-v1-2-e7721526de96@collabora.com> References: <20260506-panthor-shrinker-fixes-v1-0-e7721526de96@collabora.com> <20260506-panthor-shrinker-fixes-v1-2-e7721526de96@collabora.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review This patch fixes a second race in the core code. The race table in the comm= it message is clear and well-documented. The problem: `drm_gem_lru_scan()` = moves a zero-refcount object to the stack-allocated `still_in_lru` *before*= the `kref_get_unless_zero()` check. If the object is concurrently being fr= eed, `drm_gem_object_release()` =E2=86=92 `drm_gem_lru_remove()` reads `obj= ->lru` (which now points to the stack-allocated `still_in_lru`), and by the= time it dereferences it, `drm_gem_lru_scan()` may have already returned an= d the stack frame is gone. The fix moves the `drm_gem_lru_move_tail_locked()` call to *after* the `kre= f_get_unless_zero()` check: ```c - drm_gem_lru_move_tail_locked(&still_in_lru, obj); ... - if (!kref_get_unless_zero(&obj->refcount)) + if (!kref_get_unless_zero(&obj->refcount)) { + if (obj->lru) + drm_gem_lru_remove_locked(obj); continue; + } + + drm_gem_lru_move_tail_locked(&still_in_lru, obj); ``` The `drm_gem_lru_remove_locked(obj)` call for zero-refcount objects is nece= ssary to prevent the scan loop from hitting the same dying object repeatedl= y (since `list_first_entry_or_null` would keep finding it). This is safe be= cause we hold the LRU lock here. **One concern:** The `if (obj->lru)` check =E2=80=94 at this point we hold = the LRU's mutex lock, and we just got the object from `lru->list`, so `obj-= >lru` should always be non-NULL (it should point to `lru`). The check is de= fensive, which is fine, but it might mask a logic error if `obj->lru` were = ever NULL here. That said, defensive coding in a race-fix patch is reasonab= le. **Interaction with the `still_in_lru` splice-back at the end of `drm_gem_lr= u_scan()`:** ```c list_for_each_entry (obj, &still_in_lru.list, lru_node) obj->lru =3D lru; list_splice_tail(&still_in_lru.list, &lru->list); lru->count +=3D still_in_lru.count; ``` After this patch, objects removed with `drm_gem_lru_remove_locked()` have `= obj->lru =3D NULL` and are removed from the list, so they won't appear in t= he `still_in_lru` splice-back. The `still_in_lru.count` will have been decr= emented by `drm_gem_lru_remove_locked()`, so the count arithmetic remains c= orrect. This looks right. **Note:** This patch references `drm_gem_lru_remove_locked()` which is curr= ently a static function defined earlier in the file (line 1555). The patch'= s diff context starts at line 1660, so this function is already available. = Good. No blocking issues. Patch is correct. --- Generated by Claude Code Patch Reviewer