public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: accel/amdxdna: Add carveout memory support for non-IOMMU systems
Date: Tue, 28 Apr 2026 14:22:37 +1000	[thread overview]
Message-ID: <review-patch1-20260427170949.2666601-1-lizhi.hou@amd.com> (raw)
In-Reply-To: <20260427170949.2666601-1-lizhi.hou@amd.com>

Patch Review

#### amdxdna_cbuf.c — Carveout buffer allocator

**`kzalloc_objs` usage for sgl allocation:**
```c
sgl = kzalloc_objs(*sg, n_entries);
```
The variable is named `sgl` but the macro dereferences `*sg` — this works because `sg` has the same type (`struct scatterlist`), but it's confusing. Should be `kzalloc_objs(*sgl, n_entries)` for clarity.

**`amdxdna_cbuf_map` manually builds sg_table:**
```c
sgt = kzalloc_obj(*sgt);
...
sgl = kzalloc_objs(*sg, n_entries);
...
sg_init_table(sgl, n_entries);
sgt->orig_nents = n_entries;
sgt->nents = n_entries;
sgt->sgl = sgl;
```
This manually allocates and wires up the sg_table instead of using `sg_alloc_table`. It also sets `sgt->nents` to `n_entries`, which is semantically questionable: `orig_nents` is the pre-mapping entry count, while `nents` should be the number of DMA-mapped entries (normally the return value of `dma_map_sg`). Because the DMA addresses are filled in directly via `dma_map_resource`, the two counts happen to coincide here, but the construction is conceptually wrong and fragile.
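A sketch of the `sg_alloc_table`-based construction, keeping the max-segment splitting from the patch (variable names follow the snippets above):

```c
sgt = kzalloc_obj(*sgt);
if (!sgt)
	return ERR_PTR(-ENOMEM);

/*
 * sg_alloc_table() allocates the sgl and initializes both nents and
 * orig_nents, so the unmap side can pair it symmetrically with
 * sg_free_table().
 */
ret = sg_alloc_table(sgt, n_entries, GFP_KERNEL);
if (ret) {
	kfree(sgt);
	return ERR_PTR(ret);
}

/* Then fill sg_dma_address()/sg_dma_len() per entry as before. */
```

This also removes the need to touch `sgt->nents`/`sgt->orig_nents` by hand.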

**`amdxdna_cbuf_unmap` uses `drm_prime_get_contiguous_size`:**
```c
dma_unmap_resource(attach->dev, sg_dma_address(sgt->sgl),
		   drm_prime_get_contiguous_size(sgt), direction,
		   DMA_ATTR_SKIP_CPU_SYNC);
```
In `amdxdna_cbuf_map`, the entire buffer is mapped with a single `dma_map_resource` call. Here, however, `drm_prime_get_contiguous_size` walks the sg entries to recompute a contiguous length. Those entries were filled synthetically and may have been split at max-segment boundaries, so the helper only recovers the full size because the DMA addresses happen to be contiguous across segments (which they are, by construction). It works, but it is unnecessarily roundabout: just pass `cbuf->node.size` directly, reachable via `attach->dmabuf->priv`.
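The simpler form, assuming the cbuf is reachable through the dma_buf's `priv` as noted:

```c
struct amdxdna_cbuf *cbuf = attach->dmabuf->priv;

dma_unmap_resource(attach->dev, sg_dma_address(sgt->sgl),
		   cbuf->node.size, direction, DMA_ATTR_SKIP_CPU_SYNC);
```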

**`amdxdna_cbuf_unmap` calls `sg_free_table` then `kfree(sgt)`:**
```c
sg_free_table(sgt);
kfree(sgt);
```
But the sgl was allocated with `kzalloc_objs` and manually assigned to `sgt->sgl` without using `sg_alloc_table`. `sg_free_table` will call `kfree(sgt->sgl)` which should be fine since it was `kzalloc`'d, but this asymmetry between manual allocation and `sg_free_table` cleanup is fragile. If `sg_init_table` were ever to do chaining differently, this could break.

**`amdxdna_cbuf_clear` error handling:**
```c
dma_buf_vmap(dbuf, &vmap);
if (!vmap.vaddr)
	return -EFAULT;
```
`dma_buf_vmap` returns an int error code that is ignored here; the code checks `vmap.vaddr` instead. The documented contract is the return value, and relying on `vmap.vaddr` being left NULL on failure is fragile. Returning `-EFAULT` also masks the real error code. Check the return value:
```c
ret = dma_buf_vmap(dbuf, &vmap);
if (ret)
    return ret;
```

**`amdxdna_get_cbuf` error path after `amdxdna_cbuf_clear` failure:**
```c
ret = amdxdna_cbuf_clear(dbuf);
if (ret) {
	dma_buf_put(dbuf);
	goto out;
}
```
When `amdxdna_cbuf_clear` fails, `dma_buf_put(dbuf)` drops the reference, which triggers `amdxdna_cbuf_release`; that in turn calls `drm_mm_remove_node` and `kfree(cbuf)`. Execution then jumps to `out:` and returns `ERR_PTR(ret)`. This is actually correct: the release callback handles the cleanup, and jumping past the `remove_node`/`free_cbuf` labels avoids a double free. The control flow is right but non-obvious; a comment would help.
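The comment could be as simple as (sketch, matching the snippet above):

```c
ret = amdxdna_cbuf_clear(dbuf);
if (ret) {
	/*
	 * Dropping the last reference runs amdxdna_cbuf_release(),
	 * which already removes the mm node and frees cbuf, so skip
	 * the remove_node/free_cbuf labels to avoid a double free.
	 */
	dma_buf_put(dbuf);
	goto out;
}
```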

**`pr_err` used instead of `XDNA_ERR`:**
```c
pr_err("Failed to dma_map_resource carveout dma buf, ret %d\n", ret);
```
and
```c
pr_err("Failed to vmap carveout dma buf\n");
```
The rest of the driver consistently uses `XDNA_ERR(xdna, ...)` for error logging. These two `pr_err` calls are inconsistent — they lack the device context that `XDNA_ERR` provides. The cbuf_priv has `cbuf->xdna` available to use.

#### amdxdna_debugfs.c — Debugfs carveout interface

**Security: no validation of physical address range:**
```c
ret = kstrtou64(sep, 0, &addr);
...
ret = amdxdna_carveout_init(xdna, addr, size);
```
The debugfs write handler accepts an arbitrary physical address and size from userspace (root). There's no validation that the address actually corresponds to reserved/carveout memory. A root user could point the driver at arbitrary physical memory (MMIO regions, other devices' memory, etc.). While debugfs is root-only and this is a debug feature, it would be prudent to at minimum validate against `memblock_is_reserved()` or similar.

**`file_to_xdna` macro is fragile:**
```c
#define file_to_xdna(file) (((struct seq_file *)(file)->private_data)->private)
```
This assumes the file was opened via `single_open` and that `private_data` is a `struct seq_file *`. That holds for the current usage, but it is a sharp macro: nothing type-checks the cast, and it silently breaks if the open path changes. A typed static inline helper would make the assumption explicit.
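Such a helper could look like this (sketch; the `struct amdxdna_dev *` return type is inferred from the macro's usage):

```c
static inline struct amdxdna_dev *file_to_xdna(struct file *file)
{
	struct seq_file *m = file->private_data;

	return m->private;
}
```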

#### amdxdna_gem.c — GEM object changes

**Removed import BO skip from `amdxdna_gem_skip_bo_usage`:**
```c
-	/* Do not count imported BOs since the buffer is not allocated by us. */
-	if (is_import_bo(abo))
-		return true;
```
Carveout BOs are self-imported, so removing this check is necessary to count them. But this also means genuinely external imported BOs (from other drivers) will now be counted in `total_bo_usage`. The fdinfo comment update acknowledges this, but it's a user-visible semantic change.

**`amdxdna_gem_obj_open` restructured logic:**
```c
+	if (abo->open_ref > 1)
+		return 0;
+
+	/* Attached to the client when first opened by it. */
+	abo->client = filp->driver_priv;
+
+	/* No need to set up dma addr mapping in PASID mode. */
+	if (!amdxdna_pasid_on(abo->client)) {
+		ret = amdxdna_dma_map_bo(xdna, abo);
```
This now calls `amdxdna_pasid_on(abo->client)` to decide whether to map. But with the change to `amdxdna_drm_open`, SVA bind failure is now a warning rather than an error. If SVA bind fails (no IOMMU at all), `client->pasid` remains `IOMMU_PASID_INVALID`, so `amdxdna_pasid_on()` returns false, and `amdxdna_dma_map_bo` is called. Inside `amdxdna_dma_map_bo`, if it's a carveout-imported BO, `drm_gem_shmem_get_pages_sgt` will be called on it — but carveout BOs are imported via `drm_gem_shmem_prime_import_sg_table` and already have an sgt from the dma_buf attachment. **This will likely fail or behave incorrectly** because `drm_gem_shmem_get_pages_sgt` expects to manage shmem pages, not pre-existing imported sgt.
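One way to avoid that (sketch; exact field names depend on the driver, but `struct drm_gem_shmem_object` caches an imported table in its `sgt` member):

```c
/*
 * In amdxdna_dma_map_bo(): reuse the attachment's sgt for imports
 * instead of asking shmem to build one from pages it doesn't own.
 */
if (is_import_bo(abo))
	sgt = abo->base.sgt; /* from drm_gem_shmem_prime_import_sg_table() */
else
	sgt = drm_gem_shmem_get_pages_sgt(&abo->base);
```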

**`amdxdna_gem_obj_vmap` and `amdxdna_gem_obj_vunmap`:**
```c
+	if (is_import_bo(abo))
+		ret = dma_buf_vmap(abo->dma_buf, map);
+	else
+		ret = drm_gem_shmem_object_vmap(obj, map);
```
These new wrappers dispatch between imported and shmem BOs. This is needed because carveout BOs are imported and need to go through the dma_buf vmap path (which calls `amdxdna_cbuf_vmap` → `memremap`). This is correct.

**Removed `fallthrough` between `AMDXDNA_BO_CMD` and `AMDXDNA_BO_SHARE`:**
```c
 	case AMDXDNA_BO_CMD:
-		fallthrough;
 	case AMDXDNA_BO_SHARE:
```
These two cases still fall through to the same code — removing the explicit `fallthrough` annotation is correct since there's no code between them (it's an empty case label, not a fallthrough). Good cleanup.

#### amdxdna_iommu.c — DMA mapping refactoring

**BO type check changed from `AMDXDNA_BO_SHMEM` to `AMDXDNA_BO_SHARE`:**
```c
-	if (abo->type != AMDXDNA_BO_DEV_HEAP && abo->type != AMDXDNA_BO_SHMEM)
+	if (abo->type != AMDXDNA_BO_DEV_HEAP && abo->type != AMDXDNA_BO_SHARE)
```
The removed line shows `AMDXDNA_BO_SHMEM`, but the current tree already carries this check with `AMDXDNA_BO_SHARE`, which suggests the patch was generated against a slightly different base. Verify that `AMDXDNA_BO_SHARE` is the right type here: carveout BOs get `type = AMDXDNA_BO_SHARE` from `amdxdna_gem_prime_import`, so this check includes them. The core concern remains that `drm_gem_shmem_get_pages_sgt` is then called for all of these BOs, including the imported carveout ones.

**`ssize_t size` kept alongside a new `unsigned long contig_sz`:**
At first glance the patch appears to replace the size variable, but the diff shows both:
```c
+	unsigned long contig_sz;
 	struct sg_table *sgt;
 	dma_addr_t dma_addr;
 	struct iova *iova;
 	ssize_t size;
```
The `ssize_t size` variable is kept for the IOMMU path where `iommu_map_sgtable` returns `ssize_t`. The new `unsigned long contig_sz` is added for the non-IOMMU path. This is fine.

**Non-IOVA path calls `drm_prime_get_contiguous_size`:**
```c
+		contig_sz = drm_prime_get_contiguous_size(sgt);
+		if (contig_sz < abo->mem.size) {
```
For carveout BOs, the sgt comes from the dma_buf attachment mapping (`amdxdna_cbuf_map`), which maps the physical address range to DMA addresses. The contiguous size check should pass since carveout memory is physically contiguous. However, this whole path through `drm_gem_shmem_get_pages_sgt` is questionable for imported BOs as noted above.

#### amdxdna_pci_drv.c — Probe and open changes

**SVA bind failure downgraded to warning:**
```c
+		/* No need to fail open since user may use pa + carveout later. */
+		if (amdxdna_sva_init(client))
+			XDNA_WARN(xdna, "PASID not available for pid %d", client->pid);
```
This is the most significant behavioral change. Previously, if SVA bind failed in a non-IOVA configuration, the open would fail entirely. Now it succeeds with a warning. This means clients can open the device even when neither IOMMU nor PASID is available, relying on carveout memory being configured later via debugfs. **The problem is**: if carveout is never configured, the client is in a broken state where buffer allocation will fail in confusing ways. It would be better to check at buffer allocation time whether we're in a valid configuration.
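An allocation-time guard could fail fast instead (sketch; `use_iova()` stands in for however the driver tests the IOVA path and is not a name from the patch):

```c
/*
 * In the BO-create path: refuse allocation when no address
 * translation scheme is available at all.
 */
if (!amdxdna_pasid_on(client) && !amdxdna_use_carveout(xdna) &&
    !use_iova(xdna))
	return ERR_PTR(-ENODEV);
```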

**`mmdrop` before `amdxdna_sva_fini` in cleanup:**
```c
+	mmdrop(client->mm);
+	amdxdna_sva_fini(client);
```
The original code called `iommu_sva_unbind_device` before `mmdrop`. The patch reverses this order. `iommu_sva_unbind_device` may access the mm_struct internally, so calling `mmdrop` first could lead to use-after-free if the mm refcount drops to zero. **This ordering should be reversed** — unbind SVA first, then drop the mm reference.
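The suggested order keeps the mm reference alive across the unbind:

```c
/* Unbind SVA first; iommu_sva_unbind_device() may still touch the mm. */
amdxdna_sva_fini(client);
mmdrop(client->mm);
```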

**`amdxdna_xdna_drm_release` cleanup action:**
```c
+static void amdxdna_xdna_drm_release(struct drm_device *drm, void *res)
+{
+	struct amdxdna_dev *xdna = res;
+	amdxdna_carveout_fini(xdna);
+}
```
Using `drmm_add_action` to clean up carveout on DRM device release is appropriate. However, `drmm_add_action` is called early in probe, before carveout is initialized. `amdxdna_carveout_fini` handles the NULL case (`if (!amdxdna_use_carveout(xdna)) return;`), so this is safe.

**`amdxdna_debugfs_init` called after `drm_dev_register`:**
```c
+	amdxdna_debugfs_init(xdna);
 	return 0;
```
Called after `drm_dev_register` which is correct — the debugfs root is created during registration. But there's no cleanup on the error paths above, and `amdxdna_debugfs_init` doesn't return errors (void return). This is fine since debugfs files are cleaned up automatically by DRM framework.

#### Summary of Required Fixes

1. **Critical**: Reverse `mmdrop`/`amdxdna_sva_fini` ordering in `amdxdna_client_cleanup` to avoid potential use-after-free.
2. **Important**: Verify that `drm_gem_shmem_get_pages_sgt` works correctly on carveout BOs that are self-imported via `drm_gem_shmem_prime_import_sg_table`. If not, `amdxdna_dma_map_bo` needs a different path for imported BOs.
3. **Important**: Check return value of `dma_buf_vmap` in `amdxdna_cbuf_clear` instead of just checking `vmap.vaddr`.
4. **Minor**: Use `XDNA_ERR` instead of `pr_err` in cbuf map/vmap callbacks.
5. **Minor**: Fix `kzalloc_objs(*sg, ...)` to `kzalloc_objs(*sgl, ...)`.
6. **Consider**: Add physical address range validation in the debugfs write handler.

---
Generated by Claude Code Patch Reviewer


Thread overview: 3+ messages
2026-04-27 17:09 [PATCH V2] accel/amdxdna: Add carveout memory support for non-IOMMU systems Lizhi Hou
2026-04-28  4:22 ` Claude review: " Claude Code Review Bot
2026-04-28  4:22 ` Claude Code Review Bot [this message]
