From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/gpusvm, pagemap: Do not assume DRM pagemap owns device pages Date: Sun, 12 Apr 2026 11:53:26 +1000 Message-ID: In-Reply-To: <20260409015512.3670302-1-matthew.brost@intel.com> References: <20260409015512.3670302-1-matthew.brost@intel.com> <20260409015512.3670302-1-matthew.brost@intel.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **drm_pagemap.h: `drm_pagemap_page_zone_device_data()` change** ```c + if (WARN_ON_ONCE(page_pgmap(page)->ops !=3D drm_pagemap_pagemap_ops_get()= )) + return NULL; ``` `WARN_ON_ONCE` is inappropriate here. The commit message itself says "it is= possible to encounter device pages that are not owned by DRM pagemap" =E2= =80=94 this is an expected condition, not a kernel bug. `WARN_ON` emits a f= ull stack trace to dmesg/console and is reserved for situations that indica= te a programming error. If encountering a non-DRM device page is a legitima= te runtime possibility, this should be a plain `if` check: ```c if (page_pgmap(page)->ops !=3D drm_pagemap_pagemap_ops_get()) return NULL; ``` Additionally, this change makes `drm_pagemap_page_zone_device_data()` retur= n NULL in new cases, but `drm_pagemap_page_to_dpagemap()` at `drm_pagemap.c= :1322-1327` was not updated: ```c struct drm_pagemap *drm_pagemap_page_to_dpagemap(struct page *page) { struct drm_pagemap_zdd *zdd =3D drm_pagemap_page_zone_device_data(page); return zdd->devmem_allocation->dpagemap; // NULL deref if zdd is NULL } ``` This exported function will now crash if called on a non-DRM device page. E= ither this function should also get a NULL guard, or you should audit all c= allers to ensure they're already protected. Currently in `drm_gpusvm.c:1507= `, the call to `drm_pagemap_page_to_dpagemap(page)` is reached after the ne= w `__zdd` NULL check, so that specific path is safe. But other callers (cur= rent or future) are not. **drm_gpusvm.c: NULL check after `drm_pagemap_page_zone_device_data()`** ```c + if (!__zdd) { + err =3D -EINVAL; + goto err_unmap; + } ``` This is the right defensive check. If `drm_pagemap_page_zone_device_data()`= returns NULL (either from the new ops check or from `!CONFIG_ZONE_DEVICE`)= , the code would otherwise: - Set `zdd =3D __zdd` (NULL) at line 1499 - Compare `pagemap !=3D page_pgmap(page)` at line 1500, enter that branch - Call `drm_pagemap_page_to_dpagemap(page)` at line 1507 which dereferences= the NULL zdd The `err_unmap` goto is the correct error path here. No issues with this hu= nk. **drm_pagemap.c: NULL check for `src_zdd` in `drm_pagemap_migrate_to_devmem= ()`** ```c - cur.dpagemap =3D src_zdd->dpagemap; - cur.ops =3D src_zdd->devmem_allocation->ops; - cur.device =3D cur.dpagemap->drm->dev; - pages[i] =3D src_page; + if (src_zdd) { + cur.dpagemap =3D src_zdd->dpagemap; + cur.ops =3D src_zdd->devmem_allocation->ops; + cur.device =3D cur.dpagemap->drm->dev; + pages[i] =3D src_page; + } else { + npages =3D i; + err =3D -EINVAL; + goto err_finalize; + } ``` Several issues: 1. **Does not apply to drm-next.** In the current drm-next tree, these line= s are inside `if (mdetails->source_peer_migrates)` (line 654), which the pa= tch context doesn't show. The patch was made against a different base. This= needs to be rebased. 2. **`npages =3D i` is correct** (vs `npages =3D i + 1` used at the existin= g error at line 688). At this point in the loop, `drm_pagemap_get_devmem_pa= ge(page, order, zdd)` at line 681 has NOT yet been called for page `i`, so = page `i`'s zone_device_data is uninitialized. `drm_pagemap_migration_unlock= _put_pages()` must not process it, so `npages =3D i` limits cleanup to page= s 0..i-1 only. This is correct. 3. **Consider whether this error path leaks destination page `i`.** The des= tination page at index `i` was allocated by `populate_devmem_pfn()` (line 6= 30) but `migrate_pfn()` at line 667 was not called for it. Whether `migrate= _vma_finalize()` can clean it up depends on whether `populate_devmem_pfn` a= lready set `MIGRATE_PFN_VALID`. If not, this page may leak. This should be = verified against the driver implementations (e.g., Xe's `populate_devmem_pf= n` callback). **Commit message / metadata:** `Suggested-by: sashiko.dev` =E2=80=94 this is not a valid kernel tag. The `= Suggested-by` format requires `Name `, per `Documentation/process/su= bmitting-patches.rst`. A bare domain name is not an email address. --- Generated by Claude Code Patch Reviewer