* [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
@ 2026-03-28 21:39 Mikhail Gavrilov
2026-03-30 7:31 ` Lazar, Lijo
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-28 21:39 UTC (permalink / raw)
To: Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable, Mikhail Gavrilov
Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
switched from ida to idr_alloc_cyclic() protected by a spinlock, but
passes GFP_KERNEL to the allocator. idr_alloc_cyclic() may need to
allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
under a spinlock that disables preemption. With CONFIG_PREEMPT or
lockdep enabled this triggers:
BUG: sleeping function called from invalid context at
./include/linux/sched/mm.h:323
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
...
#1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
...
kmem_cache_alloc_noprof+0x41d/0x780
radix_tree_node_alloc.constprop.0+0x56/0x3a0
idr_get_free+0x330/0x830
idr_alloc_u32+0x14a/0x2e0
idr_alloc_cyclic+0xd3/0x1d0
amdgpu_pasid_alloc+0x51/0x210 [amdgpu]
A mutex is not an option because amdgpu_pasid_free() is reachable from
dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.
Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
the allocator draws from the preloaded pool and never sleeps. This is
the standard kernel pattern for IDR allocation under a spinlock.
Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index d88523568b62..515775eab2ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
if (bits == 0)
return -EINVAL;
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
if (pasid >= 0)
trace_amdgpu_pasid_allocated(pasid);
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-30 7:31 ` Lazar, Lijo
2026-03-31 7:50 ` Claude review: " Claude Code Review Bot
2026-03-31 7:50 ` Claude Code Review Bot
2 siblings, 0 replies; 5+ messages in thread
From: Lazar, Lijo @ 2026-03-30 7:31 UTC (permalink / raw)
To: Mikhail Gavrilov, Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable
On 29-Mar-26 3:09 AM, Mikhail Gavrilov wrote:
> Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
> switched from ida to idr_alloc_cyclic() protected by a spinlock, but
> passes GFP_KERNEL to the allocator. idr_alloc_cyclic() may need to
> allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
> under a spinlock that disables preemption. With CONFIG_PREEMPT or
> lockdep enabled this triggers:
>
> BUG: sleeping function called from invalid context at
> ./include/linux/sched/mm.h:323
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
> ...
> #1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
> at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
> ...
> kmem_cache_alloc_noprof+0x41d/0x780
> radix_tree_node_alloc.constprop.0+0x56/0x3a0
> idr_get_free+0x330/0x830
> idr_alloc_u32+0x14a/0x2e0
> idr_alloc_cyclic+0xd3/0x1d0
> amdgpu_pasid_alloc+0x51/0x210 [amdgpu]
>
> A mutex is not an option because amdgpu_pasid_free() is reachable from
> dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.
>
> Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
> radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
> the allocator draws from the preloaded pool and never sleeps. This is
> the standard kernel pattern for IDR allocation under a spinlock.
>
> Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index d88523568b62..515775eab2ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
> if (bits == 0)
> return -EINVAL;
>
> + idr_preload(GFP_KERNEL);
It's better/simpler to replace amdgpu_pasid_idr with xarray.
Thanks,
Lijo
> spin_lock(&amdgpu_pasid_idr_lock);
> pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
> - 1U << bits, GFP_KERNEL);
> + 1U << bits, GFP_NOWAIT);
> spin_unlock(&amdgpu_pasid_idr_lock);
> + idr_preload_end();
>
> if (pasid >= 0)
> trace_amdgpu_pasid_allocated(pasid);
^ permalink raw reply [flat|nested] 5+ messages in thread* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30 7:31 ` Lazar, Lijo
@ 2026-03-31 7:50 ` Claude Code Review Bot
2026-03-31 7:50 ` Claude Code Review Bot
2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:50 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
Author: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Patches: 2
Reviewed: 2026-03-31T17:50:57.526600
---
This is a single-patch fix for a real bug: `GFP_KERNEL` allocation inside a spinlock critical section, which can sleep and trigger a BUG with lockdep or `CONFIG_PREEMPT`. The fix uses the standard `idr_preload()` / `GFP_NOWAIT` pattern, which is the well-established kernel idiom for exactly this situation. The commit message is excellent — it clearly explains the bug, includes a lockdep splat, explains why a mutex isn't viable (IRQ context in `amdgpu_pasid_free_cb`), and documents the chosen solution.
**Verdict: This patch looks correct and ready to merge.**
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 5+ messages in thread
* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30 7:31 ` Lazar, Lijo
2026-03-31 7:50 ` Claude review: " Claude Code Review Bot
@ 2026-03-31 7:50 ` Claude Code Review Bot
2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:50 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Analysis of the bug:** Confirmed by reading the current tree at `amdgpu_ids.c:70-73` — the code currently calls `idr_alloc_cyclic()` with `GFP_KERNEL` while holding `amdgpu_pasid_idr_lock` (a spinlock). This is indeed a sleeping-in-atomic-context bug.
**Analysis of the fix:**
```c
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
```
This is textbook correct:
1. `idr_preload(GFP_KERNEL)` pre-allocates radix tree nodes (can sleep, called outside spinlock) and disables preemption.
2. `GFP_NOWAIT` inside the spinlock ensures the allocator uses the preloaded pool and never sleeps.
3. `idr_preload_end()` re-enables preemption after the spinlock is released.
**One minor ordering note:** `idr_preload_end()` is called after `spin_unlock()`, which is correct — `idr_preload()` disables preemption, so you must hold the preload context across the allocation. The unlock-before-preload_end ordering is the standard pattern (see e.g., `drivers/infiniband/`, `net/core/`).
**Fixes tag and Cc: stable** are both appropriate since this is a straightforward bugfix for a commit that introduced the regression.
**No issues found.**
Reviewed-by recommendation: This patch is correct and follows established kernel patterns.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management
@ 2026-03-30 5:30 Mikhail Gavrilov
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
0 siblings, 1 reply; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-30 5:30 UTC (permalink / raw)
To: Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable, Mikhail Gavrilov
Commit 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
converted the global PASID allocator from IDA to IDR with a spinlock
for cyclic allocation. This introduced two locking bugs:
1) idr_alloc_cyclic() is called with GFP_KERNEL under spin_lock(),
which can sleep.
2) amdgpu_pasid_free() can be called from hardirq context via the
fence signal path (amdgpu_pasid_free_cb), but the lock is taken
with plain spin_lock() in process context, creating a potential
deadlock:
CPU0
----
spin_lock(&amdgpu_pasid_idr_lock) // process context, IRQs on
<Interrupt>
spin_lock(&amdgpu_pasid_idr_lock) // deadlock
The hardirq call chain is:
sdma_v6_0_process_trap_irq
-> amdgpu_fence_process
-> dma_fence_signal
-> drm_sched_job_done
-> dma_fence_signal
-> amdgpu_pasid_free_cb
-> amdgpu_pasid_free
This was observed on an RX 7900 XTX when exiting a Vulkan game
running under Proton/Wine, which triggers the fence callback path
during VM teardown.
Patch 1 fixes the sleeping-under-spinlock by using idr_preload() with
GFP_KERNEL before taking the lock, then GFP_NOWAIT for the actual
allocation.
Patch 2 converts all three spin_lock/spin_unlock call sites to
spin_lock_irqsave/spin_unlock_irqrestore.
Tested on ASUS ROG STRIX B650E-I / Ryzen 9 7950X / RX 7900 XTX with
CONFIG_PROVE_LOCKING=y. The lockdep warning is no longer triggered
after applying both patches.
Mikhail Gavrilov (2):
drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
drm/amdgpu: use spin_lock_irqsave for PASID IDR lock
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 5+ messages in thread* [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-30 5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
@ 2026-03-30 5:30 ` Mikhail Gavrilov
2026-03-31 7:36 ` Claude review: " Claude Code Review Bot
0 siblings, 1 reply; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-30 5:30 UTC (permalink / raw)
To: Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable, Mikhail Gavrilov
Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
switched from ida to idr_alloc_cyclic() protected by a spinlock, but
passes GFP_KERNEL to the allocator. idr_alloc_cyclic() may need to
allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
under a spinlock that disables preemption. With CONFIG_PREEMPT or
lockdep enabled this triggers:
BUG: sleeping function called from invalid context at
./include/linux/sched/mm.h:323
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
...
#1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
...
kmem_cache_alloc_noprof+0x41d/0x780
radix_tree_node_alloc.constprop.0+0x56/0x3a0
idr_get_free+0x330/0x830
idr_alloc_u32+0x14a/0x2e0
idr_alloc_cyclic+0xd3/0x1d0
amdgpu_pasid_alloc+0x51/0x210 [amdgpu]
A mutex is not an option because amdgpu_pasid_free() is reachable from
dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.
Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
the allocator draws from the preloaded pool and never sleeps. This is
the standard kernel pattern for IDR allocation under a spinlock.
Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index d88523568b62..515775eab2ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
if (bits == 0)
return -EINVAL;
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
if (pasid >= 0)
trace_amdgpu_pasid_allocated(pasid);
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-31 7:36 ` Claude Code Review Bot
0 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:36 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Status: Good**
The fix uses the standard `idr_preload()` / `GFP_NOWAIT` pattern:
```c
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
```
This is the textbook kernel approach: `idr_preload(GFP_KERNEL)` pre-allocates radix-tree nodes in sleepable context with preemption disabled, then `GFP_NOWAIT` inside the spinlock draws from the preloaded pool. `idr_preload_end()` re-enables preemption after the lock is released.
The ordering is correct: `idr_preload()` before `spin_lock()`, and `idr_preload_end()` after `spin_unlock()`. No issues.
Note: The Fixes: tag references `14b81abe7bdc` while the cover letter mentions `8f1de51f49be`. Both should refer to the same commit that introduced the IDR conversion. The cover letter and patch 2 use `8f1de51f49be` — the discrepancy in the Fixes: tag on patch 1 (`14b81abe7bdc`) should be verified. If these are different commits, the correct one to cite is whichever actually introduced `idr_alloc_cyclic()` with `GFP_KERNEL` under the spinlock.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-03-31 7:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30 7:31 ` Lazar, Lijo
2026-03-31 7:50 ` Claude review: " Claude Code Review Bot
2026-03-31 7:50 ` Claude Code Review Bot
-- strict thread matches above, loose matches on Subject: below --
2026-03-30 5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-31 7:36 ` Claude review: " Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox