From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: accel/amdxdna: Add per-process BO memory usage query support
Date: Wed, 25 Mar 2026 06:39:42 +1000
Message-ID:
In-Reply-To: <20260324163159.2425461-1-lizhi.hou@amd.com>
References: <20260324163159.2425461-1-lizhi.hou@amd.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

Patch Review

**1. UAPI: `pid` type is `__s64`; should be `__u32` or carry proper pid_t semantics**

```c
struct amdxdna_drm_bo_usage {
	/** @pid: The ID of the process to query from. */
	__s64 pid;
```

Using `__s64` for a PID is unusual. The existing `amdxdna_drm_hwctx_entry` also uses `__s64` for pid, so there is precedent within this driver, but PIDs are positive integers that fit in 32 bits. More importantly, there is no validation that `pid` is a valid (positive) value, only that it is non-zero. A negative pid would pass the check and silently match nothing.

**2. Security concern: any process can query any other process's BO usage**

```c
if (!tmp.pid)
	return -EINVAL;

list_for_each_entry(tmp_client, &xdna->client_list, node) {
	if (tmp.pid != tmp_client->pid)
		continue;
```

The ioctl has no DRM_ROOT_ONLY flag (line 218 in the current tree shows `DRM_IOCTL_DEF_DRV(AMDXDNA_GET_ARRAY, amdxdna_drm_get_array_ioctl, 0)`), so any unprivileged process can query the memory usage of any other process by PID. This leaks information about other users' memory usage. Consider restricting the query to the calling client's own PID unless the caller is privileged, or reporting only the caller's own usage and removing the PID lookup entirely.

**3. `copy_from_user` with `min_sz` can read uninitialized stack data**

```c
size_t min_sz = min(args->element_size, sizeof(struct amdxdna_drm_bo_usage));
...
if (copy_from_user(&tmp, buf, min_sz))
	return -EFAULT;
```

If `element_size < sizeof(struct amdxdna_drm_bo_usage)`, only a partial read occurs, but `tmp` is not zero-initialized. The `tmp.pid` check that follows may read uninitialized memory if `element_size` is smaller than `offsetof(struct amdxdna_drm_bo_usage, pid) + sizeof(pid)`. Use `= {}` or `memset()` to zero-initialize `tmp` before the partial copy.

**4. Locking: `amdxdna_gem_add_bo_usage` / `amdxdna_gem_del_bo_usage` take `mm_lock`, but `open_ref` is protected by `abo->lock`**

```c
static int amdxdna_gem_obj_open(...)
{
	guard(mutex)(&abo->lock);

	abo->open_ref++;
	if (abo->open_ref == 1) {
		abo->client = filp->driver_priv;
		amdxdna_gem_add_bo_usage(abo);	/* takes client->mm_lock inside */
	}
```

`abo->lock` is held while `client->mm_lock` is acquired inside `amdxdna_gem_add_bo_usage`. In `amdxdna_gem_heap_alloc`, `client->mm_lock` is held while `abo->client` is accessed. This creates a potential lock ordering dependency (abo->lock -> mm_lock). Verify that no code path takes these locks in the reverse order, or document the required ordering.

**5. Conflict with current tree: `open_ref` vs the existing `ref`**

The current drm-next tree already has a `ref` field in `amdxdna_gem_obj` and already provides both `open` and `close` callbacks. This patch renames `ref` to `open_ref` and introduces `close` as if it did not exist. The patch description and diffstat do not mention a rebase, so this will cause conflicts. The patch needs to be rebased onto the current tree.

**6.
`abo->client` is set/cleared based on `open_ref`, but `amdxdna_gem_obj_open` also does IOMMU mapping unconditionally**

```c
if (abo->open_ref == 1) {
	abo->client = filp->driver_priv;
	amdxdna_gem_add_bo_usage(abo);
}

if (amdxdna_iova_on(xdna)) {
	ret = amdxdna_iommu_map_bo(xdna, abo);
	if (ret)
		return ret;
}
```

The IOMMU mapping happens on every open (including `open_ref > 1`), but `client` is only assigned on the first open. If the IOMMU map fails when `open_ref > 1`, `open_ref` has already been incremented but is never decremented; the error path does not clean up. The current tree handles this correctly (early return when `ref > 0`), but the patch breaks that by removing the early return. The error path needs to decrement `open_ref` and, if this was the first open, also call `del_bo_usage`.

**7. DEV_HEAP is marked `internal = true` but is also counted via `heap_usage`**

```c
if (args->type == AMDXDNA_BO_DEV_HEAP) {
	abo->type = AMDXDNA_BO_DEV_HEAP;
	abo->internal = true;
```

And in `amdxdna_gem_add_bo_usage`:

```c
client->total_bo_usage += abo->mem.size;
if (abo->internal)
	client->total_int_bo_usage += abo->mem.size;
```

So DEV_HEAP is counted in both `total_bo_usage` and `total_int_bo_usage`, while individual heap sub-allocations are counted separately in `heap_usage`. The UAPI doc says DEV BOs do not add to the total footprint because their memory comes from the heap; skipping DEV BOs is correct, but the heap object itself still lands in `total_bo_usage`. The comment in the UAPI header is confusing: clarify whether `total_usage` includes the heap, and whether `heap_usage` is a subset of `total_usage` or a separate number.

**8.
`mutex_destroy` reordering in cleanup**

```c
-	mutex_destroy(&client->mm_lock);
	if (client->dev_heap)
		drm_gem_object_put(to_gobj(client->dev_heap));
+	mutex_destroy(&client->mm_lock);
```

This reordering is correct: `drm_gem_object_put()` on the dev heap can trigger `amdxdna_gem_heap_free()`, which takes `mm_lock`, so the mutex must still be valid at that point. Good fix, but it should be split out or at least mentioned in the commit message.

**9. `num_element` forced to 1 is not really an "array" query**

```c
if (args->num_element != 1)
	return -EINVAL;
```

The `GET_ARRAY` UAPI is designed for returning arrays of elements. Forcing exactly one element is a misuse of the API. Consider whether this should instead be a `GET_INFO` query, which fits a single-value lookup better.

**10. The `drm_WARN_ON` lock assertion depends on the aie2 caller's locking context**

```c
drm_WARN_ON(dev, !mutex_is_locked(&xdna->dev_lock));
```

This is fine for documenting the expected locking context, but it relies on the caller (`aie2_get_array`) always holding `dev_lock`, which is currently true via the ioctl path. However, the function takes a `struct drm_device *` rather than a `struct amdxdna_client *`, unlike the other get_array handlers, which is inconsistent. The function is called from `aie2_get_array`, which receives a `client`, and `dev_lock` is taken by the ioctl wrapper; passing `&xdna->ddev` directly couples the function to the aie2 backend's locking assumptions rather than to the ioctl framework's guarantees.

---
Generated by Claude Code Patch Reviewer