From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/amdgpu: integrate VRAM migration into SVM range map path Date: Tue, 28 Apr 2026 14:43:05 +1000 Message-ID: In-Reply-To: <20260427100522.7014-6-Junhua.Shen@amd.com> References: <20260427100522.7014-1-Junhua.Shen@amd.com> <20260427100522.7014-6-Junhua.Shen@amd.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Overall:** This is the integration patch tying everything together. It ha= s the most issues. **Bug: Migration failure is silently ignored:** ```c + /* Per-range migration */ + if (migrate_mode !=3D AMDGPU_SVM_MIGRATE_NONE) + amdgpu_svm_range_migrate_range(svm, range, migrate_mode); ``` The return value of `amdgpu_svm_range_migrate_range()` is discarded. If mig= ration fails (e.g., VRAM full, SDMA error), the code proceeds to map the ra= nge as if nothing happened. This could result in stale system memory mappin= gs when VRAM was expected, or worse, mapping garbage. At minimum, log the f= ailure. Depending on the desired semantics, either fail the map or fall bac= k gracefully with a warning. **Bug: Variable declaration after statement (C89 violation):** ```c lockdep_assert_held_write(&svm->svm_lock); =20 + enum amdgpu_svm_migrate_mode migrate_mode =3D AMDGPU_SVM_MIGRATE_NONE; bool old_access, new_access; ``` The `lockdep_assert_held_write()` is a statement, and variable declarations= must come before any statements in kernel C (C89/C99-mixed style). Move th= e declaration before the `lockdep_assert_held_write()` call, or move the as= sert after all declarations. **Concern: `amdgpu_svm_range_rebuild_locked()` behavior change.** The `!reb= uild` path was: ```c - ret =3D amdgpu_svm_range_update_gpu(svm, rebuild_start, rebuild_last, - 0, NULL, true, true, true); - if (!ret) - svm->flush_tlb(svm); - return ret; + return 0; ``` This changes from "unmap GPU mappings and flush TLB" to "do nothing." If th= e `!rebuild` case represents an invalidation (pages gone, need to unmap GPU= PTEs), then returning 0 silently leaves stale GPU mappings. What guarantee= s that the GPU unmap already happened by this point? This needs justificati= on in the commit message. **Issue: `#if IS_ENABLED(CONFIG_DRM_AMDGPU_SVM)` guards in `amdgpu_device.c= ` and `amdgpu_reset.c`:** ```c +#if IS_ENABLED(CONFIG_DRM_AMDGPU_SVM) + amdgpu_svm_migration_init(adev); +#endif ``` The header already provides a static inline stub returning 0 when `CONFIG_D= RM_AMDGPU_SVM` is disabled. These `#if` guards are unnecessary and add nois= e =E2=80=94 just call the function directly and rely on the stub. **Issue: `range_invalidate_gpu_mapping()` reordering in notifier.** Moving = this call: ```c - if (clear_pte) { - amdgpu_svm_range_gpu_unmap_in_notifier(svm, range, - mmu_range); - range_invalidate_gpu_mapping(range); - } + if (clear_pte) { + amdgpu_svm_range_gpu_unmap_in_notifier(svm, range, + mmu_range); + } =20 drm_gpusvm_range_unmap_pages(&svm->gpusvm, range, &ctx); + range_invalidate_gpu_mapping(range); ``` Now `range_invalidate_gpu_mapping()` is called unconditionally (outside the= `if (clear_pte)` block), and after `drm_gpusvm_range_unmap_pages()`. This = is a semantic change =E2=80=94 previously it only happened when PTE clearin= g was needed. Is this intentional? The commit message should explain why in= validation now always happens and why the ordering change is correct. **Minor: Deleted function `amdgpu_svm_range_update_gpu` still has callers?*= * The function is removed in this patch, but the `!rebuild` path in `amdgpu= _svm_range_rebuild_locked()` was the only caller. Since that path now retur= ns 0, the deletion is consistent. Good. **Nit: Trailing whitespace removal:** ```c -=09 } ``` Line 1565 =E2=80=94 this is fine cleanup but should ideally be in a separat= e trivial patch to keep the diff focused. **Good:** The `devmem_possible` and `device_private_page_owner` additions t= o `gpusvm_ctx` are cleanly integrated: ```c + struct drm_gpusvm_ctx gpusvm_ctx =3D { + .read_only =3D !!(attrs->flags & AMDGPU_SVM_FLAG_GPU_RO), + .devmem_possible =3D hw_devmem, + .device_private_page_owner =3D hw_devmem ? + AMDGPU_SVM_PGMAP_OWNER(svm->adev) : NULL, + }; ``` **Good:** The VRAM PTE flag handling is correct =E2=80=94 clearing SYSTEM a= nd SNOOPED bits for VRAM pages: ```c + if (entry->proto =3D=3D AMDGPU_INTERCONNECT_VRAM) + seg_pte_flags &=3D ~(AMDGPU_PTE_SYSTEM | AMDGPU_PTE_SNOOPED); ``` --- Generated by Claude Code Patch Reviewer