From mboxrd@z Thu Jan  1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: RDMA/mlx5: get tph for p2p access when registering dma-buf mr
Date: Thu, 23 Apr 2026 09:26:01 +1000
Message-ID: 
In-Reply-To: <20260420183920.3626389-3-zhipingz@meta.com>
References: <20260420183920.3626389-1-zhipingz@meta.com> <20260420183920.3626389-3-zhipingz@meta.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

Patch Review

**Critical: Steering tag index resource leak**

```c
+	ret = mlx5_st_alloc_index_by_tag(dev->mdev, steering_tag, st_index);
+	if (ret) {
+		*ph = MLX5_IB_NO_PH;
+		mlx5_ib_dbg(dev, "st_alloc_index_by_tag failed (%d)\n", ret);
+	}
```

`mlx5_st_alloc_index_by_tag()` allocates a refcounted entry in `st->idx_xa` (in non-direct mode). The resulting `st_index` is passed into `alloc_cacheable_mr()` and written into the hardware MKC, but it is **never stored** in a way that allows `mlx5_st_dealloc_index()` to be called when the MR is destroyed. Looking at `__mlx5_ib_dereg_mr()`, there is no call to `mlx5_st_dealloc_index()`; that only happens through the DMAH path (`mlx5_ib_dealloc_dmah()`).

This means every MR registered via the new dma-buf TPH fallback path leaks an xarray entry. Over time, the steering tag table fills up and `xa_alloc()` will eventually fail.

The MR needs to track whether it owns an st_index allocation and call `mlx5_st_dealloc_index()` during `__mlx5_ib_dereg_mr()`.

**Redundant `dma_buf_get()`/`dma_buf_put()`**

```c
+static void get_tph_mr_dmabuf(struct mlx5_ib_dev *dev, int fd, u16 *st_index,
+			      u8 *ph)
+{
+	...
+	dmabuf = dma_buf_get(fd);
+	if (IS_ERR(dmabuf))
+		return;
+	...
+	dma_buf_put(dmabuf);
+}
```

This function does `dma_buf_get(fd)` / `dma_buf_put(dmabuf)` solely to query TPH.
The caller `reg_user_mr_dmabuf()` will immediately afterward call `ib_umem_dmabuf_get()`, which does another `dma_buf_get()` on the same fd. This is a wasted get/put cycle. Consider restructuring so the TPH query happens after the dmabuf is already acquired (e.g., query from `umem_dmabuf->dmabuf`), avoiding the redundant fd lookup.

**Direct ops call bypasses dma-buf framework**

```c
+	if (!dmabuf->ops->get_tph)
+		goto end_dbuf_put;
+
+	ret = dmabuf->ops->get_tph(dmabuf, &steering_tag, ph, st_width);
```

This calls through the ops table directly rather than through a framework wrapper function. Every other dma-buf operation (map, unmap, pin, unpin, attach, etc.) goes through a wrapper in `drivers/dma-buf/dma-buf.c`. The NULL check should be inside the wrapper, not at the call site. Direct calls also bypass any future framework-level tracing, locking, or validation.

**`ret = 0` initialization in `mlx5_st_create()`**

```c
-	int ret;
+	int ret = 0;
```

This addresses a real compiler warning, but it belongs in a separate patch: it is an independent bugfix unrelated to the TPH-from-dma-buf feature and could then be backported on its own.

**`MODULE_IMPORT_NS("DMA_BUF")`**

```c
+MODULE_IMPORT_NS("DMA_BUF");
```

This is needed because `dma_buf_get()` and `dma_buf_put()` are exported in the `DMA_BUF` namespace. Correct.

**`st_width` derivation**

```c
+	st_width = (pdev->tph_req_type == PCI_TPH_REQ_EXT_TPH) ? 16 : 8;
```

This checks `tph_req_type` but doesn't first verify that TPH is actually enabled/supported on this device. If `pdev->tph_req_type` is `PCI_TPH_REQ_DISABLE` or 0 (no TPH), the function will derive `st_width = 8` and proceed to allocate a steering tag that the device can't use. The function should early-return if the device doesn't support TPH (or `dev->st` is NULL), before even querying the dma-buf.
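As a rough sketch of the guard suggested above (the function shape is assumed from the snippets quoted in this review, and the exact support check may differ; the real code might test TPH state through a dedicated helper rather than `dev->st` and `pdev->tph_req_type` directly):

```c
+static void get_tph_mr_dmabuf(struct mlx5_ib_dev *dev, int fd, u16 *st_index,
+			      u8 *ph)
+{
+	struct pci_dev *pdev = dev->mdev->pdev;
+
+	/* Defaults: no PH and no steering tag index if anything below bails. */
+	*ph = MLX5_IB_NO_PH;
+	*st_index = MLX5_MKC_PCIE_TPH_NO_STEERING_TAG_INDEX;
+
+	/* Bail out before touching the dma-buf if TPH can't be used at all. */
+	if (!dev->st || pdev->tph_req_type == PCI_TPH_REQ_DISABLE)
+		return;
+	...
+}
```

With the defaults written first, every early return leaves `*ph`/`*st_index` in the "no TPH" state, which also addresses the error-path concern about indeterminate output values.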
**Error handling: `ph` reset to `MLX5_IB_NO_PH` on alloc failure but `st_index` left as-is**

```c
+	ret = mlx5_st_alloc_index_by_tag(dev->mdev, steering_tag, st_index);
+	if (ret) {
+		*ph = MLX5_IB_NO_PH;
```

When `mlx5_st_alloc_index_by_tag()` fails, `*ph` is reset to `MLX5_IB_NO_PH` but `*st_index` is left in an indeterminate state (it was passed as a pointer and `mlx5_st_alloc_index_by_tag()` may or may not have written to it). The caller initialized `st_index = MLX5_MKC_PCIE_TPH_NO_STEERING_TAG_INDEX`, which should be fine if `mlx5_st_alloc_index_by_tag()` doesn't modify `*st_index` on error, but this isn't guaranteed; the xarray path could partially succeed. Safer to explicitly reset `*st_index = MLX5_MKC_PCIE_TPH_NO_STEERING_TAG_INDEX` in the error path.

**Refactoring of `mlx5_st_alloc_index` is clean**

The split into `mlx5_st_alloc_index_by_tag()` (takes a raw tag) and `mlx5_st_alloc_index()` (resolves CPU → tag, then calls `_by_tag`) is a clean refactor. The `#ifndef CONFIG_PCIE_TPH` static inline stub returning `-EOPNOTSUPP` is correct.

---

Generated by Claude Code Patch Reviewer