From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/amdgpu: implement drm_pagemap SDMA migration callbacks
Date: Tue, 28 Apr 2026 14:43:04 +1000
Message-ID: 
In-Reply-To: <20260427100522.7014-3-Junhua.Shen@amd.com>
References: <20260427100522.7014-1-Junhua.Shen@amd.com> <20260427100522.7014-3-Junhua.Shen@amd.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Patch Review

**Overall:** This is the largest patch (613 lines added) and implements the core migration mechanics. The SDMA copy logic follows the established KFD `svm_migrate_copy_memory_gart()` pattern, which is good.

**Bug: `pre_migrate_fence` is completely ignored.** Both callbacks accept the fence but never wait on it:

```c
+static int
+amdgpu_svm_copy_to_devmem(struct page **pages,
+			   struct drm_pagemap_addr *pagemap_addr,
+			   unsigned long npages,
+			   struct dma_fence *pre_migrate_fence)
+{
```

The `drm_pagemap_devmem_ops` documentation says: "dma-fence to wait for before migration start. May be NULL." If this fence represents pending operations on the source pages (e.g., from another GPU or DMA engine), ignoring it can cause data corruption. At minimum, add:

```c
	if (pre_migrate_fence) {
		ret = dma_fence_wait(pre_migrate_fence, true);
		if (ret)
			return ret;
	}
```

Or better, chain it with the SDMA fence.

**Issue: `AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS` is unnecessarily restrictive:**

```c
+	bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
+		   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+		   AMDGPU_GEM_CREATE_VRAM_CLEARED;
```

The `populate_devmem_pfn` callback already handles non-contiguous buddy blocks via `amdgpu_res_cursor`, so requiring contiguous VRAM for every migration will fail needlessly under fragmentation. The `VRAM_CLEARED` flag also adds unnecessary SDMA overhead, since the pages will be immediately overwritten by `copy_to_devmem`.
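If both flags are dropped, the allocation would reduce to something like the following (a sketch only; removing `VRAM_CLEARED` assumes every allocated page really is fully overwritten by the copy before it can be read back):

```c
+	/* Non-contiguous buddy blocks are fine; skip the redundant clear. */
+	bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
```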
Consider dropping both flags.

**Issue: Batch break logic has a subtle correctness concern.** In `amdgpu_svm_copy_to_devmem`:

```c
+		/* Check if next vram page is contiguous with current */
+		if (j > 0 && vram[j] != vram[j - 1] + PAGE_SIZE)
+			goto flush;
```

This checks only VRAM contiguity, not system DMA address contiguity. The GART window maps each system page individually, so this is probably fine, but the comment should clarify that this is intentional.

**Issue: `copy_to_ram` and `copy_to_devmem` have significant code duplication.** The two functions are nearly identical with src/dst swapped. Consider factoring the common batch-and-flush loop into a shared helper.

**Nit: Trace uses `%px` for pointer printing:**

```c
+	AMDGPU_MIGRATE_TRACE("Release svm_bo=%px bo=%px\n", svm_bo, svm_bo->bo);
```

Using `%px` exposes raw kernel pointers, which is a security concern (KASLR bypass). Use `%p` (hashed) unless this is for debug builds only. Since this is `pr_debug`, it is gated, but best practice is still `%p`.

**Issue: `amdgpu_svm_device_map` returns `DMA_MAPPING_ERROR` for P2P but encodes it as a success:**

```c
+	} else {
+		/* Cross-device P2P: not yet supported */
+		addr = DMA_MAPPING_ERROR;
+	}
+
+	return drm_pagemap_addr_encode(addr,
+				       AMDGPU_INTERCONNECT_VRAM, order, dir);
```

This encodes the error address with the `AMDGPU_INTERCONNECT_VRAM` protocol, and the caller may not check for `DMA_MAPPING_ERROR` if the protocol looks valid. This should return a proper error indication; check how the framework expects failures to be signaled from `device_map`.

**Issue: `amdgpu_svm_gart_map` submits a job but never waits for it.** The GART PTE update job is submitted and the fence is immediately dropped:

```c
+	fence = amdgpu_job_submit(job);
+	dma_fence_put(fence);
```

The subsequent `amdgpu_copy_buffer()` call needs the GART entries to be committed.
This relies on GPU command ordering within the same ring: if the GART update and the copy are submitted to the same SDMA ring, the hardware will serialize them. But that assumption should be documented, and if entity scheduling ever routes these submissions to different rings, it would break.

---
Generated by Claude Code Patch Reviewer