* [PATCH] drm/amdkfd: fix integer overflow in get_queue_ids()
@ 2026-05-23 14:26 Muhammad Bilal
2026-05-23 16:56 ` [PATCH] drm/amdkfd: fix NULL dereference " Muhammad Bilal
2026-05-25 7:31 ` Claude review: drm/amdkfd: fix integer overflow " Claude Code Review Bot
0 siblings, 2 replies; 5+ messages in thread
From: Muhammad Bilal @ 2026-05-23 14:26 UTC (permalink / raw)
To: Felix.Kuehling
Cc: alexander.deucher, christian.koenig, airlied, simona, amd-gfx,
dri-devel, linux-kernel, stable, Muhammad Bilal
get_queue_ids() computes the allocation size as:

    size_t array_size = num_queues * sizeof(uint32_t);

num_queues is a user-controlled u32 copied directly from the ioctl
argument (args.suspend_queues.num_queues or args.resume_queues.num_queues)
via kfd_ioctl_set_debug_trap() with no prior validation or clamping.

On 32-bit kernels, size_t is 32 bits wide. A caller supplying
num_queues = 0x40000001 causes the multiplication to silently wrap:

    0x40000001 * 4 = 0x100000004 -> truncated to 0x4

memdup_user() then allocates only 4 bytes. q_array_invalidate() is
called immediately after with the original num_queues value and
iterates 0x40000001 times writing KFD_DBG_QUEUE_INVALID_MASK into the
4-byte buffer, producing an unbounded heap buffer overflow.
q_array_get_index() in both callers walks the same buffer using the
same unchecked count.

Both call sites are affected:

  - suspend_queues() calls get_queue_ids() unconditionally
  - resume_queues() calls it only when usr_queue_id_array is non-NULL

Both callers already propagate IS_ERR() returns to userspace, so
returning ERR_PTR(-EINVAL) on overflow requires no new error handling.

The copy_to_user() calls at the tail of both functions also compute
num_queues * sizeof(uint32_t), but are only reachable after a
successful get_queue_ids() return, so they are safe once the
allocation is correctly bounded.

Fix by replacing the unchecked multiplication with check_mul_overflow().
Cast num_queues to size_t so all three arguments match the destination
type, avoiding implicit type mismatch on compilers that implement the
macro with typeof() rather than __builtin_mul_overflow() directly.

Add an explicit #include <linux/overflow.h> rather than relying on the
transitive pull through linux/slab.h.
Fixes: a70a93fa568b ("drm/amdkfd: add debug suspend and resume process queues operation")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e0a31e11f0ff..c08ad718dbd7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -25,6 +25,7 @@
#include <linux/ratelimit.h>
#include <linux/printk.h>
#include <linux/slab.h>
+#include <linux/overflow.h>
#include <linux/list.h>
#include <linux/types.h>
#include <linux/bitops.h>
@@ -3308,11 +3309,14 @@ static void copy_context_work_handler(struct work_struct *work)
static uint32_t *get_queue_ids(uint32_t num_queues, uint32_t *usr_queue_id_array)
{
- size_t array_size = num_queues * sizeof(uint32_t);
+ size_t array_size;
if (!usr_queue_id_array)
return NULL;
+ if (check_mul_overflow((size_t)num_queues, sizeof(uint32_t), &array_size))
+ return ERR_PTR(-EINVAL);
+
return memdup_user(usr_queue_id_array, array_size);
}
--
2.53.0
* [PATCH] drm/amdkfd: fix NULL dereference in get_queue_ids()
2026-05-23 14:26 [PATCH] drm/amdkfd: fix integer overflow in get_queue_ids() Muhammad Bilal
@ 2026-05-23 16:56 ` Muhammad Bilal
2026-05-25 7:31 ` Claude review: " Claude Code Review Bot
2026-05-25 7:31 ` Claude Code Review Bot
2026-05-25 7:31 ` Claude review: drm/amdkfd: fix integer overflow " Claude Code Review Bot
1 sibling, 2 replies; 5+ messages in thread
From: Muhammad Bilal @ 2026-05-23 16:56 UTC (permalink / raw)
To: Felix.Kuehling
Cc: alexander.deucher, christian.koenig, airlied, simona, amd-gfx,
dri-devel, linux-kernel, stable, Muhammad Bilal
When usr_queue_id_array is NULL and num_queues is non-zero,
get_queue_ids() returns NULL. The callers check only IS_ERR() on the
return value; since IS_ERR(NULL) == false the check passes, and
suspend_queues() calls q_array_invalidate() which immediately
dereferences NULL while iterating num_queues times.

Userspace can trigger this via kfd_ioctl_set_debug_trap() by supplying
num_queues > 0 with a zero queue_array_ptr, causing a kernel panic.

A NULL usr_queue_id_array with num_queues == 0 is a legitimate no-op
(q_array_invalidate never executes, and resume_queues already guards
all queue_ids dereferences behind a NULL check). Return ERR_PTR(-EINVAL)
only when num_queues is non-zero and the pointer is absent; both callers
already propagate IS_ERR() returns correctly to userspace.
Fixes: a70a93fa568b ("drm/amdkfd: add debug suspend and resume process queues operation")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index c08ad718dbd7..8488b3a6c2ba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -3312,7 +3312,7 @@ static uint32_t *get_queue_ids(uint32_t num_queues, uint32_t *usr_queue_id_array
size_t array_size;
if (!usr_queue_id_array)
- return NULL;
+ return num_queues ? ERR_PTR(-EINVAL) : NULL;
if (check_mul_overflow((size_t)num_queues, sizeof(uint32_t), &array_size))
return ERR_PTR(-EINVAL);
--
2.53.0
* Claude review: drm/amdkfd: fix NULL dereference in get_queue_ids()
2026-05-23 16:56 ` [PATCH] drm/amdkfd: fix NULL dereference " Muhammad Bilal
@ 2026-05-25 7:31 ` Claude Code Review Bot
2026-05-25 7:31 ` Claude Code Review Bot
1 sibling, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 7:31 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: drm/amdkfd: fix NULL dereference in get_queue_ids()
Author: Muhammad Bilal <meatuni001@gmail.com>
Patches: 2
Reviewed: 2026-05-25T17:31:37.605877
---
This is a two-patch series from Muhammad Bilal fixing two distinct security bugs in `get_queue_ids()` in the amdkfd debug-trap queue suspend/resume path. Both `num_queues` and `queue_array_ptr` are user-controlled via `kfd_ioctl_set_debug_trap()` (confirmed at `kfd_chardev.c:3225-3234`), making both bugs userspace-triggerable.
**Patch ordering**: The integer overflow fix (earlier Message-ID) is the base; the NULL dereference fix (In-Reply-To the first) is an incremental follow-up. Both patches are small, correct, and well-justified.
**Verdict**: Both patches are correct and should be applied. The commit messages are thorough and accurate. The only nit is that `suspend_queues()` has a pre-existing asymmetry with `resume_queues()` in how it handles the `copy_to_user`/`kfree` tail, but that is outside the scope of these fixes and is not a regression.
---
Generated by Claude Code Patch Reviewer
* Claude review: drm/amdkfd: fix NULL dereference in get_queue_ids()
2026-05-23 16:56 ` [PATCH] drm/amdkfd: fix NULL dereference " Muhammad Bilal
2026-05-25 7:31 ` Claude review: " Claude Code Review Bot
@ 2026-05-25 7:31 ` Claude Code Review Bot
1 sibling, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 7:31 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Correctness: Good**
The original code:
```c
size_t array_size = num_queues * sizeof(uint32_t);
```
computes `array_size` with a bare multiplication that can wrap on 32-bit kernels, where `size_t` is 32 bits wide. With `num_queues = 0x40000001`, the product `0x100000004` truncates to `0x4`, so `memdup_user()` allocates only 4 bytes. `q_array_invalidate()` then writes `num_queues` entries into that 4-byte buffer, overflowing the heap.
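The wrap described above can be reproduced in userspace. The following is a minimal sketch, not kernel code: it assumes a 32-bit `size_t` can be modelled with `uint32_t`, and `wrapped_array_size()` is a hypothetical name introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model (not a kernel function): uint32_t stands
 * in for size_t as it would be on a 32-bit kernel.
 */
static uint32_t wrapped_array_size(uint32_t num_queues)
{
	/* Unsigned arithmetic wraps modulo 2^32, as the unchecked
	 * multiplication would with a 32-bit size_t. */
	return num_queues * (uint32_t)sizeof(uint32_t);
}
```

Under this model, `wrapped_array_size(0x40000001)` yields 4, matching the truncation in the commit message, while a small count such as 16 yields the expected 64.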
The fix:
```c
size_t array_size;
...
if (check_mul_overflow((size_t)num_queues, sizeof(uint32_t), &array_size))
return ERR_PTR(-EINVAL);
```
is correct. The `(size_t)` cast on `num_queues` gives all three arguments to `check_mul_overflow()` the same type, which the commit message cites as protection against implementations of the macro that compare argument types with `typeof()`. Adding an explicit `#include <linux/overflow.h>` is appropriate, since it removes the reliance on a transitive include through `linux/slab.h`.
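The behaviour of the checked multiplication can be sketched in userspace with the GCC/Clang builtin `__builtin_mul_overflow()`, which the commit message names as one way the macro can be implemented. This is an illustration under assumptions: `uint32_t` again models a 32-bit `size_t`, and `queue_array_size32()` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper (requires GCC or Clang). Returns true
 * when the product does not fit in 32 bits, mirroring the "true means
 * overflow" convention of check_mul_overflow().
 */
static bool queue_array_size32(uint32_t num_queues, uint32_t *array_size)
{
	return __builtin_mul_overflow(num_queues,
				      (uint32_t)sizeof(uint32_t),
				      array_size);
}
```

With this model, `num_queues = 0x40000001` is reported as an overflow, which is the case the patch turns into `ERR_PTR(-EINVAL)`, whereas `0x3fffffff` still fits and produces `0xfffffffc`.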
**Callers handle the error correctly**: Both `suspend_queues()` (`kfd_device_queue_manager.c:3607`) and `resume_queues()` (`kfd_device_queue_manager.c:3503`) check `IS_ERR(queue_ids)` and propagate via `PTR_ERR()`.
**Note on the `copy_to_user()` calls**: The commit message correctly observes that the `num_queues * sizeof(uint32_t)` in `copy_to_user()` (lines 3588, 3704) is safe because it's only reachable after `get_queue_ids()` succeeded with the same `num_queues`, meaning the multiplication was already validated. This is sound reasoning.
**No issues found.**
---
Generated by Claude Code Patch Reviewer
* Claude review: drm/amdkfd: fix integer overflow in get_queue_ids()
2026-05-23 14:26 [PATCH] drm/amdkfd: fix integer overflow in get_queue_ids() Muhammad Bilal
2026-05-23 16:56 ` [PATCH] drm/amdkfd: fix NULL dereference " Muhammad Bilal
@ 2026-05-25 7:31 ` Claude Code Review Bot
1 sibling, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 7:31 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Correctness: Good**
The fix changes:
```c
if (!usr_queue_id_array)
return NULL;
```
to:
```c
if (!usr_queue_id_array)
return num_queues ? ERR_PTR(-EINVAL) : NULL;
```
This addresses a real bug. `suspend_queues()` (line 3603) calls `get_queue_ids()` unconditionally. If userspace supplies `num_queues > 0` with `queue_array_ptr == 0`, `get_queue_ids()` previously returned NULL. Because `IS_ERR(NULL) == false`, execution continued to `q_array_invalidate(num_queues, NULL)` at line 3611, which dereferences NULL in a loop and panics the kernel.
The `num_queues == 0` case correctly still returns NULL, preserving the existing no-op behavior. In `suspend_queues()`, this means `q_array_invalidate(0, NULL)` is called but doesn't iterate (loop bound is 0), the `q_array_get_index` calls also use `num_queues == 0` so they don't iterate, and `copy_to_user(NULL, NULL, 0)` is a no-op. `kfree(NULL)` is also a no-op. So the NULL return path in `suspend_queues` is safe, though it's a pre-existing asymmetry with `resume_queues` which guards the tail behind `if (queue_ids)`.
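Why the old NULL return slipped past the callers' `IS_ERR()` check can be shown with a small userspace model of the error-pointer convention. This is a sketch under assumptions: the `model_*` names and constants are hypothetical stand-ins for the kernel's `ERR_PTR()`, `IS_ERR()`, `MAX_ERRNO` and `EINVAL`, and only the NULL-array branch of the patched function is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's MAX_ERRNO and EINVAL. */
#define MODEL_MAX_ERRNO 4095
#define MODEL_EINVAL 22

/* Encode a negative errno as a pointer in the top 4095 addresses. */
static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the reserved error range; NULL is not one. */
static bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

/* Models only the NULL-array branch of the patched get_queue_ids(). */
static void *model_null_array_result(uint32_t num_queues)
{
	return num_queues ? model_err_ptr(-MODEL_EINVAL) : NULL;
}
```

In this model NULL is not an error pointer, so a caller that tests only `model_is_err()` would carry on with it. After the change, a non-zero `num_queues` yields an error pointer that the same check catches, and zero still yields NULL for the no-op path.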
In `resume_queues()` (line 3500), there's already a `if (usr_queue_id_array)` guard before calling `get_queue_ids()`, so the NULL dereference path was only exploitable via `suspend_queues()`. The fix is still correct for both callers.
**Minor observation (pre-existing, not introduced by this patch)**: `suspend_queues()` at lines 3704-3708 does not guard `copy_to_user()` and `kfree()` with a NULL check on `queue_ids`, unlike `resume_queues()` at line 3586. When `get_queue_ids()` returns NULL (the `num_queues == 0` no-op case), `copy_to_user(NULL, NULL, 0)` happens to be benign on all current architectures, but wrapping those calls in `if (queue_ids)` for consistency with `resume_queues` would be a small improvement. Not a bug, not a blocker, and out of scope for this fix.
**No issues found with the patch itself.**
---
Generated by Claude Code Patch Reviewer