public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: accel/amdxdna: Add per-process BO memory usage query support
Date: Wed, 25 Mar 2026 06:39:42 +1000	[thread overview]
Message-ID: <review-patch1-20260324163159.2425461-1-lizhi.hou@amd.com> (raw)
In-Reply-To: <20260324163159.2425461-1-lizhi.hou@amd.com>

Patch Review

**1. UAPI: `pid` type is `__s64` — should be `__u32` or use proper pid_t semantics**

```c
struct amdxdna_drm_bo_usage {
	/** @pid: The ID of the process to query from. */
	__s64 pid;
```

Using `__s64` for a PID is unusual. The existing `amdxdna_drm_hwctx_entry` also uses `__s64` for pid, so there's precedent within this driver, but PIDs are typically positive integers that fit in 32 bits. More importantly, there's no validation that `pid` is a valid (positive) value — only that it's non-zero. A negative pid would pass the check and silently match nothing.

**2. Security concern: any process can query any other process's BO usage**

```c
	if (!tmp.pid)
		return -EINVAL;

	list_for_each_entry(tmp_client, &xdna->client_list, node) {
		if (tmp.pid != tmp_client->pid)
			continue;
```

The ioctl has no DRM_ROOT_ONLY flag (line 218 in the current tree shows `DRM_IOCTL_DEF_DRV(AMDXDNA_GET_ARRAY, amdxdna_drm_get_array_ioctl, 0)`), meaning any unprivileged process can query memory usage of any other process by PID. This leaks information about other users' memory usage. Consider restricting to the calling client's own PID unless the caller is privileged, or querying only the caller's own usage (removing the PID lookup entirely).

**3. `copy_from_user` with `min_sz` can read uninitialized stack data**

```c
	size_t min_sz = min(args->element_size, sizeof(struct amdxdna_drm_bo_usage));
	...
	if (copy_from_user(&tmp, buf, min_sz))
		return -EFAULT;
```

If `element_size < sizeof(struct amdxdna_drm_bo_usage)`, only a partial read occurs, but `tmp` is not zero-initialized. The `tmp.pid` check that follows may read uninitialized memory if `element_size` is smaller than `offsetof(struct amdxdna_drm_bo_usage, pid) + sizeof(__s64)`. Use `= {}` or `memset()` to zero-initialize `tmp` before the partial copy.

**4. Locking: `amdxdna_gem_add_bo_usage` / `amdxdna_gem_del_bo_usage` take `mm_lock`, but `open_ref` is protected by `abo->lock`**

```c
static int amdxdna_gem_obj_open(...)
{
	guard(mutex)(&abo->lock);
	abo->open_ref++;
	if (abo->open_ref == 1) {
		abo->client = filp->driver_priv;
		amdxdna_gem_add_bo_usage(abo);  // takes client->mm_lock inside
	}
```

The `abo->lock` is held while acquiring `client->mm_lock` inside `amdxdna_gem_add_bo_usage`. In `amdxdna_gem_heap_alloc`, `client->mm_lock` is held and `abo->client` is accessed. This creates a potential lock ordering dependency (abo->lock → mm_lock). Verify no code path takes these locks in the reverse order, or document the required ordering.

**5. Conflict with current tree: `open_ref` vs existing `ref`**

The current drm-next tree already has a `ref` field in `amdxdna_gem_obj` and already provides both `open` and `close` callbacks. This patch renames `ref` to `open_ref` and adds a `close` callback as though none existed. The patch description and diffstat don't mention a rebase, so this will conflict; the patch needs to be rebased onto the current tree.

**6. `abo->client` is set/cleared based on `open_ref`, but `amdxdna_gem_obj_open` also does IOMMU mapping unconditionally**

```c
	if (abo->open_ref == 1) {
		abo->client = filp->driver_priv;
		amdxdna_gem_add_bo_usage(abo);
	}
	if (amdxdna_iova_on(xdna)) {
		ret = amdxdna_iommu_map_bo(xdna, abo);
		if (ret)
			return ret;
	}
```

The IOMMU mapping happens on every open (including `open_ref > 1`), but `client` is only assigned on the first open. If the IOMMU map fails on `open_ref > 1`, `open_ref` is already incremented but never decremented — the error path doesn't clean up. In the current tree this is handled correctly (early return when `ref > 0`), but the patch breaks this by removing that early return. The error path needs to decrement `open_ref` and potentially call `del_bo_usage` if it was the first open.

**7. DEV_HEAP is marked `internal = true` but is also counted via `heap_usage`**

```c
	if (args->type == AMDXDNA_BO_DEV_HEAP) {
		abo->type = AMDXDNA_BO_DEV_HEAP;
		abo->internal = true;
```

And in `amdxdna_gem_add_bo_usage`:
```c
	client->total_bo_usage += abo->mem.size;
	if (abo->internal)
		client->total_int_bo_usage += abo->mem.size;
```

So the DEV_HEAP BO itself is counted in both `total_bo_usage` and `total_int_bo_usage`, while DEV BO sub-allocations carved from the heap are counted separately in `heap_usage`. The UAPI doc says DEV BOs don't add to the total footprint since their memory comes from the heap, which matches the code (DEV BOs are skipped in the totals). What remains unclear from the UAPI header comment is the relationship between the fields: does `total_usage` include the heap BO itself, and is `heap_usage` a subset of `total_usage` or disjoint from it? Please clarify this in the header.

**8. `mutex_destroy` reordering in cleanup**

```c
-	mutex_destroy(&client->mm_lock);
 	if (client->dev_heap)
 		drm_gem_object_put(to_gobj(client->dev_heap));
+	mutex_destroy(&client->mm_lock);
```

This reordering is correct — `drm_gem_object_put` on the dev_heap could trigger `amdxdna_gem_heap_free` which takes `mm_lock`, so the mutex must still be valid. Good fix but should be mentioned separately or in the commit message.

**9. `num_element` forced to 1 is not really an "array" query**

```c
	if (args->num_element != 1)
		return -EINVAL;
```

The `GET_ARRAY` UAPI is designed for returning arrays of elements. Forcing exactly 1 element is a bit of an API misuse. Consider whether this should instead be a `GET_INFO` query, which seems more appropriate for a single-value lookup.

**10. The `drm_WARN_ON` for lock assertion will fire in test scenarios**

```c
	drm_WARN_ON(dev, !mutex_is_locked(&xdna->dev_lock));
```

This is fine for documenting the expected locking context, but note it relies on the caller (aie2_get_array) always holding dev_lock, which is currently true via the ioctl path. However, the function signature takes `struct drm_device *` rather than `struct amdxdna_client *` (unlike other get_array handlers), which is inconsistent. The function is called from `aie2_get_array` which receives a `client`, and the dev_lock is held by the ioctl wrapper — passing `&xdna->ddev` directly couples the function to the aie2 backend's locking assumptions rather than the ioctl framework's guarantees.

---
Generated by Claude Code Patch Reviewer


Thread overview: 4+ messages
2026-03-24 16:31 [PATCH V1] accel/amdxdna: Add per-process BO memory usage query support Lizhi Hou
2026-03-24 17:01 ` Mario Limonciello
2026-03-24 20:39 ` Claude review: " Claude Code Review Bot
2026-03-24 20:39 ` Claude Code Review Bot [this message]
