* [PATCH 0/3] Minor hmm_test fixes and cleanups
@ 2026-03-31 6:34 Alistair Popple
2026-03-31 6:34 ` [PATCH 1/3] lib: test_hmm: evict device pages on file close to avoid use-after-free Alistair Popple
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Alistair Popple @ 2026-03-31 6:34 UTC
To: linux-mm
Cc: zenghui.yu, Liam.Howlett, akpm, david, jgg, leon, linux-kernel,
ljs, mhocko, rppt, surenb, vbabka, dri-devel, balbirs,
Alistair Popple
Just a couple of minor fixups and cleanups for the HMM kernel selftests. These
were mostly reported by Zenghui Yu with special thanks to Lorenzo for analysing
and pointing out the problems.
Alistair Popple (3):
lib: test_hmm: evict device pages on file close to avoid
use-after-free
selftests/mm: hmm-tests: don't hardcode THP size to 2MB
lib: test_hmm: Implement a device release method
lib/test_hmm.c | 130 +++++++++++++++----------
tools/testing/selftests/mm/hmm-tests.c | 83 +++-------------
2 files changed, 93 insertions(+), 120 deletions(-)
--
2.53.0
* [PATCH 1/3] lib: test_hmm: evict device pages on file close to avoid use-after-free
2026-03-31 6:34 [PATCH 0/3] Minor hmm_test fixes and cleanups Alistair Popple
@ 2026-03-31 6:34 ` Alistair Popple
2026-03-31 6:41 ` Claude review: " Claude Code Review Bot
2026-03-31 6:34 ` [PATCH 2/3] selftests/mm: hmm-tests: don't hardcode THP size to 2MB Alistair Popple
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2026-03-31 6:34 UTC
To: linux-mm
Cc: zenghui.yu, Liam.Howlett, akpm, david, jgg, leon, linux-kernel,
ljs, mhocko, rppt, surenb, vbabka, dri-devel, balbirs,
Alistair Popple
When dmirror_fops_release() is called it frees the dmirror struct but
doesn't migrate device private pages back to system memory first. This
leaves those pages with a dangling zone_device_data pointer to the freed
dmirror.
If a subsequent fault occurs on those pages (e.g. during a coredump) the
dmirror_devmem_fault() callback dereferences the stale pointer causing a
kernel panic. This was reported [1] when running mm/ksft_hmm.sh on
arm64, where a test failure triggered SIGABRT and the resulting coredump
walked the VMAs faulting in the stale device private pages.
Fix this by calling dmirror_device_evict_chunk() for each devmem chunk
in dmirror_fops_release() to migrate all device private pages back to
system memory before freeing the dmirror struct. The function is moved
earlier in the file to avoid a forward declaration.
Fixes: b2ef9f5a5cb3 ("mm/hmm/test: add selftest driver for HMM")
Reported-by: Zenghui Yu <zenghui.yu@linux.dev>
Closes: https://lore.kernel.org/linux-mm/8bd0396a-8997-4d2e-a13f-5aac033083d7@linux.dev/
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
Note that I wasn't able to replicate the exact crash in [1] although I
replicated something similar. So I haven't been able to verify this
fixes the crash conclusively, but it should.
[1] https://lore.kernel.org/linux-mm/8bd0396a-8997-4d2e-a13f-5aac033083d7@linux.dev/
---
lib/test_hmm.c | 112 +++++++++++++++++++++++++++----------------------
1 file changed, 62 insertions(+), 50 deletions(-)
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 0964d53365e6..79fe7d233df1 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -185,11 +185,73 @@ static int dmirror_fops_open(struct inode *inode, struct file *filp)
return 0;
}
+static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
+{
+ unsigned long start_pfn = chunk->pagemap.range.start >> PAGE_SHIFT;
+ unsigned long end_pfn = chunk->pagemap.range.end >> PAGE_SHIFT;
+ unsigned long npages = end_pfn - start_pfn + 1;
+ unsigned long i;
+ unsigned long *src_pfns;
+ unsigned long *dst_pfns;
+ unsigned int order = 0;
+
+ src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
+ dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
+
+ migrate_device_range(src_pfns, start_pfn, npages);
+ for (i = 0; i < npages; i++) {
+ struct page *dpage, *spage;
+
+ spage = migrate_pfn_to_page(src_pfns[i]);
+ if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
+ continue;
+
+ if (WARN_ON(!is_device_private_page(spage) &&
+ !is_device_coherent_page(spage)))
+ continue;
+
+ order = folio_order(page_folio(spage));
+ spage = BACKING_PAGE(spage);
+ if (src_pfns[i] & MIGRATE_PFN_COMPOUND) {
+ dpage = folio_page(folio_alloc(GFP_HIGHUSER_MOVABLE,
+ order), 0);
+ } else {
+ dpage = alloc_page(GFP_HIGHUSER_MOVABLE | __GFP_NOFAIL);
+ order = 0;
+ }
+
+ /* TODO Support splitting here */
+ lock_page(dpage);
+ dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
+ if (src_pfns[i] & MIGRATE_PFN_WRITE)
+ dst_pfns[i] |= MIGRATE_PFN_WRITE;
+ if (order)
+ dst_pfns[i] |= MIGRATE_PFN_COMPOUND;
+ folio_copy(page_folio(dpage), page_folio(spage));
+ }
+ migrate_device_pages(src_pfns, dst_pfns, npages);
+ migrate_device_finalize(src_pfns, dst_pfns, npages);
+ kvfree(src_pfns);
+ kvfree(dst_pfns);
+}
+
static int dmirror_fops_release(struct inode *inode, struct file *filp)
{
struct dmirror *dmirror = filp->private_data;
+ struct dmirror_device *mdevice = dmirror->mdevice;
+ int i;
mmu_interval_notifier_remove(&dmirror->notifier);
+
+ if (mdevice->devmem_chunks) {
+ for (i = 0; i < mdevice->devmem_count; i++) {
+ struct dmirror_chunk *devmem =
+ mdevice->devmem_chunks[i];
+
+ dmirror_device_evict_chunk(devmem);
+ }
+ }
+
xa_destroy(&dmirror->pt);
kfree(dmirror);
return 0;
@@ -1377,56 +1439,6 @@ static int dmirror_snapshot(struct dmirror *dmirror,
return ret;
}
-static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
-{
- unsigned long start_pfn = chunk->pagemap.range.start >> PAGE_SHIFT;
- unsigned long end_pfn = chunk->pagemap.range.end >> PAGE_SHIFT;
- unsigned long npages = end_pfn - start_pfn + 1;
- unsigned long i;
- unsigned long *src_pfns;
- unsigned long *dst_pfns;
- unsigned int order = 0;
-
- src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
- dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
-
- migrate_device_range(src_pfns, start_pfn, npages);
- for (i = 0; i < npages; i++) {
- struct page *dpage, *spage;
-
- spage = migrate_pfn_to_page(src_pfns[i]);
- if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
- continue;
-
- if (WARN_ON(!is_device_private_page(spage) &&
- !is_device_coherent_page(spage)))
- continue;
-
- order = folio_order(page_folio(spage));
- spage = BACKING_PAGE(spage);
- if (src_pfns[i] & MIGRATE_PFN_COMPOUND) {
- dpage = folio_page(folio_alloc(GFP_HIGHUSER_MOVABLE,
- order), 0);
- } else {
- dpage = alloc_page(GFP_HIGHUSER_MOVABLE | __GFP_NOFAIL);
- order = 0;
- }
-
- /* TODO Support splitting here */
- lock_page(dpage);
- dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
- if (src_pfns[i] & MIGRATE_PFN_WRITE)
- dst_pfns[i] |= MIGRATE_PFN_WRITE;
- if (order)
- dst_pfns[i] |= MIGRATE_PFN_COMPOUND;
- folio_copy(page_folio(dpage), page_folio(spage));
- }
- migrate_device_pages(src_pfns, dst_pfns, npages);
- migrate_device_finalize(src_pfns, dst_pfns, npages);
- kvfree(src_pfns);
- kvfree(dst_pfns);
-}
-
/* Removes free pages from the free list so they can't be re-allocated */
static void dmirror_remove_free_pages(struct dmirror_chunk *devmem)
{
--
2.53.0
* [PATCH 2/3] selftests/mm: hmm-tests: don't hardcode THP size to 2MB
2026-03-31 6:34 [PATCH 0/3] Minor hmm_test fixes and cleanups Alistair Popple
2026-03-31 6:34 ` [PATCH 1/3] lib: test_hmm: evict device pages on file close to avoid use-after-free Alistair Popple
@ 2026-03-31 6:34 ` Alistair Popple
2026-03-31 6:41 ` Claude review: " Claude Code Review Bot
2026-03-31 6:34 ` [PATCH 3/3] lib: test_hmm: Implement a device release method Alistair Popple
2026-03-31 6:41 ` Claude review: Minor hmm_test fixes and cleanups Claude Code Review Bot
3 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2026-03-31 6:34 UTC
To: linux-mm
Cc: zenghui.yu, Liam.Howlett, akpm, david, jgg, leon, linux-kernel,
ljs, mhocko, rppt, surenb, vbabka, dri-devel, balbirs,
Alistair Popple
Several HMM tests hardcode TWOMEG as the THP size. This is wrong on
architectures where the PMD size is not 2MB such as arm64 with 64K base
pages where THP is 512MB. Fix this by using read_pmd_pagesize() from
vm_util instead.
While here also replace the custom file_read_ulong() helper used to
parse the default hugetlbfs page size from /proc/meminfo with the
existing default_huge_page_size() from vm_util.
Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM")
Fixes: 519071529d2a ("selftests/mm/hmm-tests: new tests for zone device THP migration")
Reported-by: Zenghui Yu <zenghui.yu@linux.dev>
Closes: https://lore.kernel.org/linux-mm/8bd0396a-8997-4d2e-a13f-5aac033083d7@linux.dev/
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
tools/testing/selftests/mm/hmm-tests.c | 83 +++++---------------------
1 file changed, 16 insertions(+), 67 deletions(-)
diff --git a/tools/testing/selftests/mm/hmm-tests.c b/tools/testing/selftests/mm/hmm-tests.c
index e8328c89d855..788689497e92 100644
--- a/tools/testing/selftests/mm/hmm-tests.c
+++ b/tools/testing/selftests/mm/hmm-tests.c
@@ -34,6 +34,7 @@
*/
#include <lib/test_hmm_uapi.h>
#include <mm/gup_test.h>
+#include <mm/vm_util.h>
struct hmm_buffer {
void *ptr;
@@ -548,7 +549,7 @@ TEST_F(hmm, anon_write_child)
for (migrate = 0; migrate < 2; ++migrate) {
for (use_thp = 0; use_thp < 2; ++use_thp) {
- npages = ALIGN(use_thp ? TWOMEG : HMM_BUFFER_SIZE,
+ npages = ALIGN(use_thp ? read_pmd_pagesize() : HMM_BUFFER_SIZE,
self->page_size) >> self->page_shift;
ASSERT_NE(npages, 0);
size = npages << self->page_shift;
@@ -728,7 +729,7 @@ TEST_F(hmm, anon_write_huge)
int *ptr;
int ret;
- size = 2 * TWOMEG;
+ size = 2 * read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -744,7 +745,7 @@ TEST_F(hmm, anon_write_huge)
buffer->fd, 0);
ASSERT_NE(buffer->ptr, MAP_FAILED);
- size = TWOMEG;
+ size /= 2;
npages = size >> self->page_shift;
map = (void *)ALIGN((uintptr_t)buffer->ptr, size);
ret = madvise(map, size, MADV_HUGEPAGE);
@@ -770,54 +771,6 @@ TEST_F(hmm, anon_write_huge)
hmm_buffer_free(buffer);
}
-/*
- * Read numeric data from raw and tagged kernel status files. Used to read
- * /proc and /sys data (without a tag) and from /proc/meminfo (with a tag).
- */
-static long file_read_ulong(char *file, const char *tag)
-{
- int fd;
- char buf[2048];
- int len;
- char *p, *q;
- long val;
-
- fd = open(file, O_RDONLY);
- if (fd < 0) {
- /* Error opening the file */
- return -1;
- }
-
- len = read(fd, buf, sizeof(buf));
- close(fd);
- if (len < 0) {
- /* Error in reading the file */
- return -1;
- }
- if (len == sizeof(buf)) {
- /* Error file is too large */
- return -1;
- }
- buf[len] = '\0';
-
- /* Search for a tag if provided */
- if (tag) {
- p = strstr(buf, tag);
- if (!p)
- return -1; /* looks like the line we want isn't there */
- p += strlen(tag);
- } else
- p = buf;
-
- val = strtol(p, &q, 0);
- if (*q != ' ') {
- /* Error parsing the file */
- return -1;
- }
-
- return val;
-}
-
/*
* Write huge TLBFS page.
*/
@@ -826,15 +779,13 @@ TEST_F(hmm, anon_write_hugetlbfs)
struct hmm_buffer *buffer;
unsigned long npages;
unsigned long size;
- unsigned long default_hsize;
+ unsigned long default_hsize = default_huge_page_size();
unsigned long i;
int *ptr;
int ret;
- default_hsize = file_read_ulong("/proc/meminfo", "Hugepagesize:");
- if (default_hsize < 0 || default_hsize*1024 < default_hsize)
+ if (!default_hsize)
SKIP(return, "Huge page size could not be determined");
- default_hsize = default_hsize*1024; /* KB to B */
size = ALIGN(TWOMEG, default_hsize);
npages = size >> self->page_shift;
@@ -1606,7 +1557,7 @@ TEST_F(hmm, compound)
struct hmm_buffer *buffer;
unsigned long npages;
unsigned long size;
- unsigned long default_hsize;
+ unsigned long default_hsize = default_huge_page_size();
int *ptr;
unsigned char *m;
int ret;
@@ -1614,10 +1565,8 @@ TEST_F(hmm, compound)
/* Skip test if we can't allocate a hugetlbfs page. */
- default_hsize = file_read_ulong("/proc/meminfo", "Hugepagesize:");
- if (default_hsize < 0 || default_hsize*1024 < default_hsize)
+ if (!default_hsize)
SKIP(return, "Huge page size could not be determined");
- default_hsize = default_hsize*1024; /* KB to B */
size = ALIGN(TWOMEG, default_hsize);
npages = size >> self->page_shift;
@@ -2106,7 +2055,7 @@ TEST_F(hmm, migrate_anon_huge_empty)
int *ptr;
int ret;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -2158,7 +2107,7 @@ TEST_F(hmm, migrate_anon_huge_zero)
int ret;
int val;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -2221,7 +2170,7 @@ TEST_F(hmm, migrate_anon_huge_free)
int *ptr;
int ret;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -2280,7 +2229,7 @@ TEST_F(hmm, migrate_anon_huge_fault)
int *ptr;
int ret;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -2332,7 +2281,7 @@ TEST_F(hmm, migrate_partial_unmap_fault)
{
struct hmm_buffer *buffer;
unsigned long npages;
- unsigned long size = TWOMEG;
+ unsigned long size = read_pmd_pagesize();
unsigned long i;
void *old_ptr;
void *map;
@@ -2398,7 +2347,7 @@ TEST_F(hmm, migrate_remap_fault)
{
struct hmm_buffer *buffer;
unsigned long npages;
- unsigned long size = TWOMEG;
+ unsigned long size = read_pmd_pagesize();
unsigned long i;
void *old_ptr, *new_ptr = NULL;
void *map;
@@ -2498,7 +2447,7 @@ TEST_F(hmm, migrate_anon_huge_err)
int *ptr;
int ret;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
@@ -2593,7 +2542,7 @@ TEST_F(hmm, migrate_anon_huge_zero_err)
int *ptr;
int ret;
- size = TWOMEG;
+ size = read_pmd_pagesize();
buffer = malloc(sizeof(*buffer));
ASSERT_NE(buffer, NULL);
--
2.53.0
* [PATCH 3/3] lib: test_hmm: Implement a device release method
2026-03-31 6:34 [PATCH 0/3] Minor hmm_test fixes and cleanups Alistair Popple
2026-03-31 6:34 ` [PATCH 1/3] lib: test_hmm: evict device pages on file close to avoid use-after-free Alistair Popple
2026-03-31 6:34 ` [PATCH 2/3] selftests/mm: hmm-tests: don't hardcode THP size to 2MB Alistair Popple
@ 2026-03-31 6:34 ` Alistair Popple
2026-03-31 6:41 ` Claude review: " Claude Code Review Bot
2026-03-31 6:41 ` Claude review: Minor hmm_test fixes and cleanups Claude Code Review Bot
3 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2026-03-31 6:34 UTC
To: linux-mm
Cc: zenghui.yu, Liam.Howlett, akpm, david, jgg, leon, linux-kernel,
ljs, mhocko, rppt, surenb, vbabka, dri-devel, balbirs,
Alistair Popple
Unloading the HMM test module produces the following warning:
[ 3782.224783] ------------[ cut here ]------------
[ 3782.226323] Device 'hmm_dmirror0' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.
[ 3782.230570] WARNING: drivers/base/core.c:2567 at device_release+0x185/0x210, CPU#20: rmmod/1924
[ 3782.233949] Modules linked in: test_hmm(-) nvidia_uvm(O) nvidia(O)
[ 3782.236321] CPU: 20 UID: 0 PID: 1924 Comm: rmmod Tainted: G O 7.0.0-rc1+ #374 PREEMPT(full)
[ 3782.240226] Tainted: [O]=OOT_MODULE
[ 3782.241639] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
[ 3782.246193] RIP: 0010:device_release+0x185/0x210
[ 3782.247860] Code: 00 00 fc ff df 48 8d 7b 50 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 86 00 00 00 48 8b 73 50 48 85 f6 74 11 48 8d 3d db 25 29 03 <67> 48 0f b9 3a e9 0d ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 89
[ 3782.254211] RSP: 0018:ffff888126577d98 EFLAGS: 00010246
[ 3782.256054] RAX: dffffc0000000000 RBX: ffffffffc2b70310 RCX: ffffffff8fe61ba1
[ 3782.258512] RDX: 1ffffffff856e062 RSI: ffff88811341eea0 RDI: ffffffff91bbacb0
[ 3782.261041] RBP: ffff888111475000 R08: 0000000000000001 R09: fffffbfff856e069
[ 3782.263471] R10: ffffffffc2b7034b R11: 00000000ffffffff R12: 0000000000000000
[ 3782.265983] R13: dffffc0000000000 R14: ffff88811341eea0 R15: 0000000000000000
[ 3782.268443] FS: 00007fd5a3689040(0000) GS:ffff88842c8d0000(0000) knlGS:0000000000000000
[ 3782.271236] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3782.273251] CR2: 00007fd5a36d2c10 CR3: 00000001242b8000 CR4: 00000000000006f0
[ 3782.275362] Call Trace:
[ 3782.276071] <TASK>
[ 3782.276678] kobject_put+0x146/0x270
[ 3782.277731] hmm_dmirror_exit+0x7a/0x130 [test_hmm]
[ 3782.279135] __do_sys_delete_module+0x341/0x510
[ 3782.280438] ? module_flags+0x300/0x300
[ 3782.281547] do_syscall_64+0x111/0x670
[ 3782.282620] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 3782.284091] RIP: 0033:0x7fd5a3793b37
[ 3782.285303] Code: 73 01 c3 48 8b 0d c9 82 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 82 0c 00 f7 d8 64 89 01 48
[ 3782.290708] RSP: 002b:00007ffd68b7dc68 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 3782.292817] RAX: ffffffffffffffda RBX: 000055e3c0d1c770 RCX: 00007fd5a3793b37
[ 3782.294735] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055e3c0d1c7d8
[ 3782.296661] RBP: 0000000000000000 R08: 1999999999999999 R09: 0000000000000000
[ 3782.298622] R10: 00007fd5a3806ac0 R11: 0000000000000206 R12: 00007ffd68b7deb0
[ 3782.300576] R13: 00007ffd68b7e781 R14: 000055e3c0d1b2a0 R15: 00007ffd68b7deb8
[ 3782.301963] </TASK>
[ 3782.302371] irq event stamp: 5019
[ 3782.302987] hardirqs last enabled at (5027): [<ffffffff8cf1f062>] __up_console_sem+0x52/0x60
[ 3782.304507] hardirqs last disabled at (5036): [<ffffffff8cf1f047>] __up_console_sem+0x37/0x60
[ 3782.306086] softirqs last enabled at (4940): [<ffffffff8cd9a4b0>] __irq_exit_rcu+0xc0/0xf0
[ 3782.307567] softirqs last disabled at (4929): [<ffffffff8cd9a4b0>] __irq_exit_rcu+0xc0/0xf0
[ 3782.309105] ---[ end trace 0000000000000000 ]---
This is because the test module doesn't have a device.release method. In
this case one probably isn't needed for correctness - the device structs
are in a static array so don't need freeing when the final reference
goes away.
However some device state is freed on exit, so to ensure this happens at
the right time and to silence the warning move the deinitialisation to
a release method and assign that as the device release callback. Whilst
here also fix a minor error handling bug where cdev_device_del() wasn't
being called if allocation failed.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
lib/test_hmm.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 79fe7d233df1..213504915737 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -1738,6 +1738,13 @@ static const struct dev_pagemap_ops dmirror_devmem_ops = {
.folio_split = dmirror_devmem_folio_split,
};
+static void dmirror_device_release(struct device *dev)
+{
+ struct dmirror_device *mdevice = container_of(dev, struct dmirror_device, device);
+
+ dmirror_device_remove_chunks(mdevice);
+}
+
static int dmirror_device_init(struct dmirror_device *mdevice, int id)
{
dev_t dev;
@@ -1749,6 +1756,8 @@ static int dmirror_device_init(struct dmirror_device *mdevice, int id)
cdev_init(&mdevice->cdevice, &dmirror_fops);
mdevice->cdevice.owner = THIS_MODULE;
+ mdevice->device.release = dmirror_device_release;
+
device_initialize(&mdevice->device);
mdevice->device.devt = dev;
@@ -1756,12 +1765,16 @@ static int dmirror_device_init(struct dmirror_device *mdevice, int id)
if (ret)
goto put_device;
+ /* Build a list of free ZONE_DEVICE struct pages */
+ ret = dmirror_allocate_chunk(mdevice, NULL, false);
+ if (ret)
+ goto put_device;
+
ret = cdev_device_add(&mdevice->cdevice, &mdevice->device);
if (ret)
goto put_device;
- /* Build a list of free ZONE_DEVICE struct pages */
- return dmirror_allocate_chunk(mdevice, NULL, false);
+ return 0;
put_device:
put_device(&mdevice->device);
@@ -1770,7 +1783,6 @@ static int dmirror_device_init(struct dmirror_device *mdevice, int id)
static void dmirror_device_remove(struct dmirror_device *mdevice)
{
- dmirror_device_remove_chunks(mdevice);
cdev_device_del(&mdevice->cdevice, &mdevice->device);
put_device(&mdevice->device);
}
--
2.53.0
* Claude review: Minor hmm_test fixes and cleanups
2026-03-31 6:34 [PATCH 0/3] Minor hmm_test fixes and cleanups Alistair Popple
` (2 preceding siblings ...)
2026-03-31 6:34 ` [PATCH 3/3] lib: test_hmm: Implement a device release method Alistair Popple
@ 2026-03-31 6:41 ` Claude Code Review Bot
3 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 6:41 UTC
To: dri-devel-reviews
Overall Series Review
Subject: Minor hmm_test fixes and cleanups
Author: Alistair Popple <apopple@nvidia.com>
Patches: 3
Reviewed: 2026-03-31T16:41:09.620184
---
This is a well-structured 3-patch series fixing real bugs and warnings in the HMM kernel selftests (`lib/test_hmm.c` and `tools/testing/selftests/mm/hmm-tests.c`). The fixes are appropriate and well-motivated:
1. **Patch 1** fixes a use-after-free crash path reported on arm64 by evicting device pages before freeing the dmirror struct on file close.
2. **Patch 2** fixes portability issues on architectures where PMD size != 2MB (e.g., arm64 with 64K base pages), and cleans up a custom helper in favor of existing utilities.
3. **Patch 3** adds a proper `device.release` method to silence a kernel warning on module unload, and fixes a minor error-handling bug.
The series is generally **good to merge**, with a couple of observations noted in the per-patch reviews.
---
Generated by Claude Code Patch Reviewer
* Claude review: lib: test_hmm: evict device pages on file close to avoid use-after-free
2026-03-31 6:34 ` [PATCH 1/3] lib: test_hmm: evict device pages on file close to avoid use-after-free Alistair Popple
@ 2026-03-31 6:41 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 6:41 UTC
To: dri-devel-reviews
Patch Review
**Correctness concern — evicts ALL device chunks, not just this dmirror's pages:**
The release handler iterates over all `mdevice->devmem_chunks` and calls `dmirror_device_evict_chunk()` on each:
```c
if (mdevice->devmem_chunks) {
for (i = 0; i < mdevice->devmem_count; i++) {
struct dmirror_chunk *devmem =
mdevice->devmem_chunks[i];
dmirror_device_evict_chunk(devmem);
}
}
```
This evicts **all** device private pages from the entire device, not just pages belonging to the closing dmirror instance. If multiple file descriptors (and thus multiple `dmirror` structs) are open on the same device simultaneously, closing one will evict pages that belong to other active dmirror instances. The `migrate_device_range()` call operates on physical PFN ranges and doesn't filter by owner.
In practice this is probably fine for a test driver (tests are sequential), but it's worth noting in the commit message or as a comment. The proper fix would filter pages by checking `zone_device_data` against the closing `dmirror`, but that's significantly more complex and probably overkill for a test module.
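As a rough illustration of the owner-filtering idea, here is a userspace model, not kernel code. The types `model_owner` and `model_page` and the function `evict_owned_pages()` are hypothetical stand-ins; the `owner` field only plays the role that the commit message attributes to `zone_device_data`. A real implementation would still have to go through the `migrate_device_*()` machinery.

```c
#include <stddef.h>

/* Userspace model only: hypothetical stand-ins, not struct page/dmirror. */
struct model_owner {
	int id;
};

struct model_page {
	struct model_owner *owner;	/* plays the role of zone_device_data */
	int on_device;			/* 1 while the page is in device memory */
};

/* Evict only device pages owned by `closing`; return how many were moved. */
static size_t evict_owned_pages(struct model_page *pages, size_t npages,
				const struct model_owner *closing)
{
	size_t i, evicted = 0;

	for (i = 0; i < npages; i++) {
		/* Skip pages already in system memory or owned by someone else. */
		if (!pages[i].on_device || pages[i].owner != closing)
			continue;

		pages[i].on_device = 0;	/* "migrate" back to system memory */
		pages[i].owner = NULL;	/* leave no dangling owner pointer */
		evicted++;
	}

	return evicted;
}
```

In this model, pages belonging to other open instances stay on the device, which is the behaviour the patch as posted does not provide.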
**Missing locking:** The `dmirror_device_remove_chunks()` function takes `mdevice->devmem_lock` and `mdevice->lock` when doing eviction, but the eviction in `dmirror_fops_release()` holds neither lock. `migrate_device_range()` likely handles its own synchronization, but the lack of `devmem_lock` around the `mdevice->devmem_chunks` / `mdevice->devmem_count` access is a potential TOCTOU issue if chunks could be added/removed concurrently.
**No NULL check on folio_alloc for compound pages:**
```c
if (src_pfns[i] & MIGRATE_PFN_COMPOUND) {
dpage = folio_page(folio_alloc(GFP_HIGHUSER_MOVABLE,
order), 0);
}
```
If `folio_alloc()` returns NULL, `folio_page(NULL, 0)` will crash. The non-compound path uses `__GFP_NOFAIL` but the compound path doesn't. This is a pre-existing issue (the function was just moved), not introduced by this patch.
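The general shape of a fix is to check the large allocation before using it and to record the order actually obtained. The following is a userspace sketch under stated assumptions: `alloc_fn`, `struct dst_alloc`, `alloc_dst()` and `fail_large_alloc()` are hypothetical names, and `malloc()` stands in for the page allocator. In the kernel, falling back to order 0 would also require handling the source folio, which is what the existing `TODO Support splitting here` comment refers to.

```c
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t bytes);

struct dst_alloc {
	void *mem;		/* NULL if even the small allocation failed */
	unsigned int order;	/* order actually obtained */
};

/* Demonstration allocator: refuses anything larger than one 4 KiB page. */
static void *fail_large_alloc(size_t bytes)
{
	return bytes > 4096 ? NULL : malloc(bytes);
}

/* Try the requested order first; fall back to a single page on failure. */
static struct dst_alloc alloc_dst(alloc_fn alloc, size_t page_size,
				  unsigned int order)
{
	struct dst_alloc d = { NULL, 0 };

	if (order) {
		d.mem = alloc(page_size << order);
		if (d.mem) {
			d.order = order;
			return d;
		}
	}

	/* Large allocation failed or was not requested: use order 0. */
	d.mem = alloc(page_size);
	return d;
}
```

The caller can then set the compound flag from `d.order` rather than from the order it asked for, and must still check `d.mem` before use.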
**Overall:** The fix is correct for the reported problem. The function move is clean — the code is identical.
---
Generated by Claude Code Patch Reviewer
* Claude review: selftests/mm: hmm-tests: don't hardcode THP size to 2MB
2026-03-31 6:34 ` [PATCH 2/3] selftests/mm: hmm-tests: don't hardcode THP size to 2MB Alistair Popple
@ 2026-03-31 6:41 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 6:41 UTC
To: dri-devel-reviews
Patch Review
This is a clean, correct fix. Good use of existing helpers.
**Positive observations:**
- Correctly replaces `TWOMEG` with `read_pmd_pagesize()` in all THP-related tests.
- The `TWOMEG` define and its remaining uses (lines 790, 1571) are kept for hugetlbfs alignment where a fixed minimum size is appropriate.
- The custom `file_read_ulong()` helper is properly replaced with `default_huge_page_size()` from `vm_util.h`, reducing code duplication.
- The `size /= 2` change in `anon_write_huge` is a nice simplification that avoids calling `read_pmd_pagesize()` twice.
**Minor nit:** The `#define TWOMEG (1 << 21)` is still present at line 57 but only used in two places for hugetlbfs alignment. Could add a comment clarifying it's not THP size, but this is very minor.
**No issues found.** This patch looks correct.
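For context on the 512MB figure in the commit message: assuming 8-byte page table entries and page tables that each occupy exactly one base page (an assumption here, though it matches the usual arm64 and x86-64 layouts), one PMD entry maps `page_size * (page_size / 8)` bytes. The helper name below is hypothetical; the selftests read the real value at run time through `read_pmd_pagesize()` rather than computing it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size mapped by one PMD entry for a given base page
 * size, assuming 8-byte entries and one base page per page table.
 */
static uint64_t pmd_size_for_page_size(uint64_t page_size)
{
	uint64_t entries_per_table = page_size / 8;

	return page_size * entries_per_table;
}
```

With 4K base pages this gives 2MB, which is why the hardcoded `TWOMEG` happened to work there; with 64K base pages it gives 512MB.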
---
Generated by Claude Code Patch Reviewer
* Claude review: lib: test_hmm: Implement a device release method
2026-03-31 6:34 ` [PATCH 3/3] lib: test_hmm: Implement a device release method Alistair Popple
@ 2026-03-31 6:41 ` Claude Code Review Bot
0 siblings, 0 replies; 8+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 6:41 UTC
To: dri-devel-reviews
Patch Review
**Good fix for the warning.** The release method is correctly wired up.
**Reordering of dmirror_allocate_chunk before cdev_device_add is a nice improvement:**
```c
/* Build a list of free ZONE_DEVICE struct pages */
ret = dmirror_allocate_chunk(mdevice, NULL, false);
if (ret)
goto put_device;
ret = cdev_device_add(&mdevice->cdevice, &mdevice->device);
if (ret)
goto put_device;
```
Previously, `dmirror_allocate_chunk` was called *after* `cdev_device_add`, meaning if allocation failed, `cdev_device_del()` was never called — the device was added but not properly removed. The new ordering is better: allocate resources first, only then expose the device. The `put_device` error path now correctly triggers `dmirror_device_release()` which calls `dmirror_device_remove_chunks()`, and `dmirror_device_remove_chunks()` safely handles the case where chunks haven't been allocated (the `if (mdevice->devmem_chunks)` guard).
**One subtle concern:** After the reorder, if `cdev_device_add()` fails, `put_device()` triggers `dmirror_device_release()` → `dmirror_device_remove_chunks()`. The `remove_chunks` function calls `dmirror_device_evict_chunk()` on each chunk. Since the device was never made visible (no `cdev_device_add`), no user could have migrated pages to the device memory, so the eviction should find no migrated pages and be harmless. This is safe.
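The error-path reasoning above can be modelled in userspace. This is only a sketch: `struct model_device`, `model_put()`, `model_release()` and `model_init()` are hypothetical names, the `fail_alloc` and `fail_add` parameters are test knobs, and a plain integer stands in for the kobject reference count. It shows a single `put` path that frees whatever was allocated, whether or not the allocation step ran.

```c
#include <stdlib.h>

struct model_device {
	int refcount;
	void (*release)(struct model_device *dev);
	int *chunks;		/* stands in for devmem_chunks */
	int release_calls;	/* bookkeeping for the demonstration */
};

/* Drop one reference; run the release callback when it reaches zero. */
static void model_put(struct model_device *dev)
{
	if (--dev->refcount == 0 && dev->release)
		dev->release(dev);
}

/* Release callback: frees device state, safe when nothing was allocated. */
static void model_release(struct model_device *dev)
{
	free(dev->chunks);	/* free(NULL) is a no-op */
	dev->chunks = NULL;
	dev->release_calls++;
}

/* Allocate first, expose second; every failure goes through model_put(). */
static int model_init(struct model_device *dev, int fail_alloc, int fail_add)
{
	dev->release = model_release;
	dev->refcount = 1;
	dev->chunks = NULL;
	dev->release_calls = 0;

	if (!fail_alloc)
		dev->chunks = calloc(4, sizeof(*dev->chunks));
	if (!dev->chunks)
		goto put;

	if (fail_add)		/* models cdev_device_add() failing */
		goto put;

	return 0;

put:
	model_put(dev);
	return -1;
}
```

Because the release callback owns the cleanup, the init function needs no separate unwind label per resource, which mirrors the structure the patch moves towards.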
**Overall:** Clean and correct. The error-handling improvement is a nice side-fix.
---
Generated by Claude Code Patch Reviewer