* [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-30 5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
@ 2026-03-30 5:30 ` Mikhail Gavrilov
2026-03-31 7:36 ` Claude review: " Claude Code Review Bot
2026-03-30 5:30 ` [PATCH v2 2/2] drm/amdgpu: use spin_lock_irqsave for PASID IDR lock Mikhail Gavrilov
2026-03-31 7:36 ` Claude review: drm/amdgpu: fix locking issues in PASID IDR management Claude Code Review Bot
2 siblings, 1 reply; 6+ messages in thread
From: Mikhail Gavrilov @ 2026-03-30 5:30 UTC (permalink / raw)
To: Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable, Mikhail Gavrilov
Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
switched from ida to idr_alloc_cyclic() protected by a spinlock, but
passes GFP_KERNEL to the allocator. idr_alloc_cyclic() may need to
allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
under a spinlock that disables preemption. With CONFIG_PREEMPT or
lockdep enabled this triggers:
BUG: sleeping function called from invalid context at
./include/linux/sched/mm.h:323
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
...
#1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
...
kmem_cache_alloc_noprof+0x41d/0x780
radix_tree_node_alloc.constprop.0+0x56/0x3a0
idr_get_free+0x330/0x830
idr_alloc_u32+0x14a/0x2e0
idr_alloc_cyclic+0xd3/0x1d0
amdgpu_pasid_alloc+0x51/0x210 [amdgpu]
A mutex is not an option because amdgpu_pasid_free() is reachable from
dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.
Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
the allocator draws from the preloaded pool and never sleeps. This is
the standard kernel pattern for IDR allocation under a spinlock.
Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index d88523568b62..515775eab2ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
if (bits == 0)
return -EINVAL;
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
if (pasid >= 0)
trace_amdgpu_pasid_allocated(pasid);
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-31 7:36 ` Claude Code Review Bot
0 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:36 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Status: Good**
The fix uses the standard `idr_preload()` / `GFP_NOWAIT` pattern:
```c
+ idr_preload(GFP_KERNEL);
spin_lock(&amdgpu_pasid_idr_lock);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
- 1U << bits, GFP_KERNEL);
+ 1U << bits, GFP_NOWAIT);
spin_unlock(&amdgpu_pasid_idr_lock);
+ idr_preload_end();
```
This is the textbook kernel approach: `idr_preload(GFP_KERNEL)` pre-allocates radix-tree nodes in sleepable context with preemption disabled, then `GFP_NOWAIT` inside the spinlock draws from the preloaded pool. `idr_preload_end()` re-enables preemption after the lock is released.
The ordering is correct: `idr_preload()` before `spin_lock()`, and `idr_preload_end()` after `spin_unlock()`. No issues.
Note: The Fixes: tag references `14b81abe7bdc` while the cover letter mentions `8f1de51f49be`. Both should refer to the same commit that introduced the IDR conversion. The cover letter and patch 2 use `8f1de51f49be` — the discrepancy in the Fixes: tag on patch 1 (`14b81abe7bdc`) should be verified. If these are different commits, the correct one to cite is whichever actually introduced `idr_alloc_cyclic()` with `GFP_KERNEL` under the spinlock.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] drm/amdgpu: use spin_lock_irqsave for PASID IDR lock
2026-03-30 5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-30 5:30 ` Mikhail Gavrilov
2026-03-31 7:36 ` Claude review: " Claude Code Review Bot
2026-03-31 7:36 ` Claude review: drm/amdgpu: fix locking issues in PASID IDR management Claude Code Review Bot
2 siblings, 1 reply; 6+ messages in thread
From: Mikhail Gavrilov @ 2026-03-30 5:30 UTC (permalink / raw)
To: Alex Deucher, Christian König
Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
stable, Mikhail Gavrilov
amdgpu_pasid_free() can be called from hardirq context via the fence
signal path:
sdma_v6_0_process_trap_irq
-> amdgpu_fence_process
-> dma_fence_signal
-> drm_sched_job_done
-> dma_fence_signal
-> amdgpu_pasid_free_cb
-> amdgpu_pasid_free
-> spin_lock(&amdgpu_pasid_idr_lock) <- hardirq context
But the lock was originally taken with plain spin_lock() in process
context (amdgpu_pasid_alloc), creating an inconsistent
{HARDIRQ-ON-W} -> {IN-HARDIRQ-W} lock state that can deadlock if an
interrupt arrives while the lock is held on the same CPU.
Use spin_lock_irqsave/spin_unlock_irqrestore for all call sites of
amdgpu_pasid_idr_lock to prevent the deadlock.
This patch applies on top of "drm/amdgpu: fix sleeping allocation
under spinlock in PASID IDR".
Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index 515775eab2ef..762ceb3c708a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -62,16 +62,17 @@ struct amdgpu_pasid_cb {
*/
int amdgpu_pasid_alloc(unsigned int bits)
{
+ unsigned long flags;
int pasid;
if (bits == 0)
return -EINVAL;
idr_preload(GFP_KERNEL);
- spin_lock(&amdgpu_pasid_idr_lock);
+ spin_lock_irqsave(&amdgpu_pasid_idr_lock, flags);
pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
1U << bits, GFP_NOWAIT);
- spin_unlock(&amdgpu_pasid_idr_lock);
+ spin_unlock_irqrestore(&amdgpu_pasid_idr_lock, flags);
idr_preload_end();
if (pasid >= 0)
@@ -86,11 +87,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
*/
void amdgpu_pasid_free(u32 pasid)
{
+ unsigned long flags;
trace_amdgpu_pasid_freed(pasid);
- spin_lock(&amdgpu_pasid_idr_lock);
+ spin_lock_irqsave(&amdgpu_pasid_idr_lock, flags);
idr_remove(&amdgpu_pasid_idr, pasid);
- spin_unlock(&amdgpu_pasid_idr_lock);
+ spin_unlock_irqrestore(&amdgpu_pasid_idr_lock, flags);
}
static void amdgpu_pasid_free_cb(struct dma_fence *fence,
@@ -633,7 +635,9 @@ void amdgpu_vmid_mgr_fini(struct amdgpu_device *adev)
*/
void amdgpu_pasid_mgr_cleanup(void)
{
- spin_lock(&amdgpu_pasid_idr_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&amdgpu_pasid_idr_lock, flags);
idr_destroy(&amdgpu_pasid_idr);
- spin_unlock(&amdgpu_pasid_idr_lock);
+ spin_unlock_irqrestore(&amdgpu_pasid_idr_lock, flags);
}
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Claude review: drm/amdgpu: use spin_lock_irqsave for PASID IDR lock
2026-03-30 5:30 ` [PATCH v2 2/2] drm/amdgpu: use spin_lock_irqsave for PASID IDR lock Mikhail Gavrilov
@ 2026-03-31 7:36 ` Claude Code Review Bot
0 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:36 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Status: Good**
Converts all three lock sites to `spin_lock_irqsave`/`spin_unlock_irqrestore`:
1. **`amdgpu_pasid_alloc()`** — process context allocator
2. **`amdgpu_pasid_free()`** — called from both process and hardirq context (via `amdgpu_pasid_free_cb`)
3. **`amdgpu_pasid_mgr_cleanup()`** — module cleanup
All three conversions are correct and follow the same pattern:
```c
+ unsigned long flags;
...
- spin_lock(&amdgpu_pasid_idr_lock);
+ spin_lock_irqsave(&amdgpu_pasid_idr_lock, flags);
...
- spin_unlock(&amdgpu_pasid_idr_lock);
+ spin_unlock_irqrestore(&amdgpu_pasid_idr_lock, flags);
```
**Minor nit**: In `amdgpu_pasid_free()`, the `flags` variable declaration is placed before the `trace_amdgpu_pasid_freed()` call with no blank line separating declarations from statements:
```c
void amdgpu_pasid_free(u32 pasid)
{
+ unsigned long flags;
trace_amdgpu_pasid_freed(pasid);
```
Kernel coding style typically expects a blank line between variable declarations and the first statement. Looking at the applied result, it does have the declaration immediately followed by the trace call with no blank line. This is a cosmetic nit — some maintainers care, some don't.
**Correctness note on `amdgpu_pasid_mgr_cleanup()`**: Using `spin_lock_irqsave` here is technically unnecessary since this runs during module unload where no concurrent IRQ-context users should exist, but it's the right thing to do for consistency — it ensures the lock annotation is uniform across all call sites, preventing future lockdep warnings if the call ordering ever changes.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread
* Claude review: drm/amdgpu: fix locking issues in PASID IDR management
2026-03-30 5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
2026-03-30 5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30 5:30 ` [PATCH v2 2/2] drm/amdgpu: use spin_lock_irqsave for PASID IDR lock Mikhail Gavrilov
@ 2026-03-31 7:36 ` Claude Code Review Bot
2 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:36 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: drm/amdgpu: fix locking issues in PASID IDR management
Author: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Patches: 3
Reviewed: 2026-03-31T17:36:41.530736
---
This is a well-structured 2-patch series fixing two real locking bugs introduced by commit `14b81abe7bdc` / `8f1de51f49be` ("drm/amdgpu: prevent immediate PASID reuse case"), which converted PASID allocation from IDA to IDR with a spinlock.
The two bugs are:
1. **Sleeping under spinlock**: `idr_alloc_cyclic()` called with `GFP_KERNEL` inside `spin_lock()`.
2. **IRQ deadlock**: `amdgpu_pasid_free()` reachable from hardirq context (fence signal path) but the lock taken with plain `spin_lock()` in process context.
Both fixes use well-established kernel patterns. The patches are correctly ordered (patch 1 is a prerequisite for patch 2, as patch 2 builds on the code introduced by patch 1). The cover letter is excellent with clear problem description, call chain, and test details.
**Verdict: Series looks correct and ready to apply.** One minor nit on patch 2 below.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread