public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
@ 2026-03-28 21:39 Mikhail Gavrilov
  2026-03-30  7:31 ` Lazar, Lijo
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-28 21:39 UTC (permalink / raw)
  To: Alex Deucher, Christian König
  Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
	stable, Mikhail Gavrilov

Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
switched from ida to idr_alloc_cyclic() protected by a spinlock, but
passes GFP_KERNEL to the allocator.  idr_alloc_cyclic() may need to
allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
under a spinlock that disables preemption.  With CONFIG_PREEMPT or
lockdep enabled this triggers:

  BUG: sleeping function called from invalid context at
       ./include/linux/sched/mm.h:323
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
  ...
  #1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
      at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
  ...
  kmem_cache_alloc_noprof+0x41d/0x780
  radix_tree_node_alloc.constprop.0+0x56/0x3a0
  idr_get_free+0x330/0x830
  idr_alloc_u32+0x14a/0x2e0
  idr_alloc_cyclic+0xd3/0x1d0
  amdgpu_pasid_alloc+0x51/0x210 [amdgpu]

A mutex is not an option because amdgpu_pasid_free() is reachable from
dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.

Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
the allocator draws from the preloaded pool and never sleeps.  This is
the standard kernel pattern for IDR allocation under a spinlock.

Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index d88523568b62..515775eab2ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
 	if (bits == 0)
 		return -EINVAL;
 
+	idr_preload(GFP_KERNEL);
 	spin_lock(&amdgpu_pasid_idr_lock);
 	pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
-				 1U << bits, GFP_KERNEL);
+				 1U << bits, GFP_NOWAIT);
 	spin_unlock(&amdgpu_pasid_idr_lock);
+	idr_preload_end();
 
 	if (pasid >= 0)
 		trace_amdgpu_pasid_allocated(pasid);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
  2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-30  7:31 ` Lazar, Lijo
  2026-03-31  7:50 ` Claude review: " Claude Code Review Bot
  2026-03-31  7:50 ` Claude Code Review Bot
  2 siblings, 0 replies; 5+ messages in thread
From: Lazar, Lijo @ 2026-03-30  7:31 UTC (permalink / raw)
  To: Mikhail Gavrilov, Alex Deucher, Christian König
  Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
	stable



On 29-Mar-26 3:09 AM, Mikhail Gavrilov wrote:
> Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
> switched from ida to idr_alloc_cyclic() protected by a spinlock, but
> passes GFP_KERNEL to the allocator.  idr_alloc_cyclic() may need to
> allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
> under a spinlock that disables preemption.  With CONFIG_PREEMPT or
> lockdep enabled this triggers:
> 
>    BUG: sleeping function called from invalid context at
>         ./include/linux/sched/mm.h:323
>    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
>    ...
>    #1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
>        at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
>    ...
>    kmem_cache_alloc_noprof+0x41d/0x780
>    radix_tree_node_alloc.constprop.0+0x56/0x3a0
>    idr_get_free+0x330/0x830
>    idr_alloc_u32+0x14a/0x2e0
>    idr_alloc_cyclic+0xd3/0x1d0
>    amdgpu_pasid_alloc+0x51/0x210 [amdgpu]
> 
> A mutex is not an option because amdgpu_pasid_free() is reachable from
> dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.
> 
> Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
> radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
> the allocator draws from the preloaded pool and never sleeps.  This is
> the standard kernel pattern for IDR allocation under a spinlock.
> 
> Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index d88523568b62..515775eab2ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
>   	if (bits == 0)
>   		return -EINVAL;
>   
> +	idr_preload(GFP_KERNEL);

It's better/simpler to replace amdgpu_pasid_idr with xarray.

Thanks,
Lijo

>   	spin_lock(&amdgpu_pasid_idr_lock);
>   	pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
> -				 1U << bits, GFP_KERNEL);
> +				 1U << bits, GFP_NOWAIT);
>   	spin_unlock(&amdgpu_pasid_idr_lock);
> +	idr_preload_end();
>   
>   	if (pasid >= 0)
>   		trace_amdgpu_pasid_allocated(pasid);


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
  2026-03-30  5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
@ 2026-03-31  7:36   ` Claude Code Review Bot
  0 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31  7:36 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Status: Good**

The fix uses the standard `idr_preload()` / `GFP_NOWAIT` pattern:

```c
+	idr_preload(GFP_KERNEL);
 	spin_lock(&amdgpu_pasid_idr_lock);
 	pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
-				 1U << bits, GFP_KERNEL);
+				 1U << bits, GFP_NOWAIT);
 	spin_unlock(&amdgpu_pasid_idr_lock);
+	idr_preload_end();
```

This is the textbook kernel approach: `idr_preload(GFP_KERNEL)` pre-allocates radix-tree nodes in sleepable context with preemption disabled, then `GFP_NOWAIT` inside the spinlock draws from the preloaded pool. `idr_preload_end()` re-enables preemption after the lock is released.

The ordering is correct: `idr_preload()` before `spin_lock()`, and `idr_preload_end()` after `spin_unlock()`. No issues.

Note: The Fixes: tag references `14b81abe7bdc` while the cover letter mentions `8f1de51f49be`. Both should refer to the same commit that introduced the IDR conversion. The cover letter and patch 2 use `8f1de51f49be` — the discrepancy in the Fixes: tag on patch 1 (`14b81abe7bdc`) should be verified. If these are different commits, the correct one to cite is whichever actually introduced `idr_alloc_cyclic()` with `GFP_KERNEL` under the spinlock.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
  2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
  2026-03-30  7:31 ` Lazar, Lijo
  2026-03-31  7:50 ` Claude review: " Claude Code Review Bot
@ 2026-03-31  7:50 ` Claude Code Review Bot
  2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31  7:50 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
Author: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Patches: 2
Reviewed: 2026-03-31T17:50:57.526600

---

This is a single-patch fix for a real bug: `GFP_KERNEL` allocation inside a spinlock critical section, which can sleep and trigger a BUG with lockdep or `CONFIG_PREEMPT`. The fix uses the standard `idr_preload()` / `GFP_NOWAIT` pattern, which is the well-established kernel idiom for exactly this situation. The commit message is excellent — it clearly explains the bug, includes a lockdep splat, explains why a mutex isn't viable (IRQ context in `amdgpu_pasid_free_cb`), and documents the chosen solution.

**Verdict: This patch looks correct and ready to merge.**

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Claude review: drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
  2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
  2026-03-30  7:31 ` Lazar, Lijo
@ 2026-03-31  7:50 ` Claude Code Review Bot
  2026-03-31  7:50 ` Claude Code Review Bot
  2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-03-31  7:50 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Analysis of the bug:** Confirmed by reading the current tree at `amdgpu_ids.c:70-73` — the code currently calls `idr_alloc_cyclic()` with `GFP_KERNEL` while holding `amdgpu_pasid_idr_lock` (a spinlock). This is indeed a sleeping-in-atomic-context bug.

**Analysis of the fix:**

```c
+	idr_preload(GFP_KERNEL);
 	spin_lock(&amdgpu_pasid_idr_lock);
 	pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
-				 1U << bits, GFP_KERNEL);
+				 1U << bits, GFP_NOWAIT);
 	spin_unlock(&amdgpu_pasid_idr_lock);
+	idr_preload_end();
```

This is textbook correct:
1. `idr_preload(GFP_KERNEL)` pre-allocates radix tree nodes (can sleep, called outside spinlock) and disables preemption.
2. `GFP_NOWAIT` inside the spinlock ensures the allocator uses the preloaded pool and never sleeps.
3. `idr_preload_end()` re-enables preemption after the spinlock is released.

**One minor ordering note:** `idr_preload_end()` is called after `spin_unlock()`, which is correct — `idr_preload()` disables preemption, so you must hold the preload context across the allocation. The unlock-before-preload_end ordering is the standard pattern (see e.g., `drivers/infiniband/`, `net/core/`).

**Fixes tag and Cc: stable** are both appropriate since this is a straightforward bugfix for a commit that introduced the regression.

**No issues found.** 

Reviewed-by recommendation: This patch is correct and follows established kernel patterns.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-03-31  7:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30  7:31 ` Lazar, Lijo
2026-03-31  7:50 ` Claude review: " Claude Code Review Bot
2026-03-31  7:50 ` Claude Code Review Bot
  -- strict thread matches above, loose matches on Subject: below --
2026-03-30  5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
2026-03-30  5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-31  7:36   ` Claude review: " Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox