public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
@ 2026-03-28 21:39 Mikhail Gavrilov
  2026-03-30  7:31 ` Lazar, Lijo
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-28 21:39 UTC (permalink / raw)
  To: Alex Deucher, Christian König
  Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
	stable, Mikhail Gavrilov

Commit 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
switched from ida to idr_alloc_cyclic() protected by a spinlock, but
passes GFP_KERNEL to the allocator.  idr_alloc_cyclic() may need to
allocate radix-tree nodes, which with GFP_KERNEL can sleep — illegal
under a spinlock that disables preemption.  With CONFIG_PREEMPT or
lockdep enabled this triggers:

  BUG: sleeping function called from invalid context at
       ./include/linux/sched/mm.h:323
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 570
  ...
  #1: ffffffffc2cd24f8 (amdgpu_pasid_idr_lock){+.+.}-{3:3},
      at: amdgpu_pasid_alloc+0x24/0x210 [amdgpu]
  ...
  kmem_cache_alloc_noprof+0x41d/0x780
  radix_tree_node_alloc.constprop.0+0x56/0x3a0
  idr_get_free+0x330/0x830
  idr_alloc_u32+0x14a/0x2e0
  idr_alloc_cyclic+0xd3/0x1d0
  amdgpu_pasid_alloc+0x51/0x210 [amdgpu]

A mutex is not an option because amdgpu_pasid_free() is reachable from
dma-fence callbacks (amdgpu_pasid_free_cb) which may run in IRQ context.

Use idr_preload(GFP_KERNEL) before taking the spinlock to pre-allocate
radix-tree nodes, then pass GFP_NOWAIT inside the critical section so
the allocator draws from the preloaded pool and never sleeps.  This is
the standard kernel pattern for IDR allocation under a spinlock.

Fixes: 14b81abe7bdc ("drm/amdgpu: prevent immediate PASID reuse case")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index d88523568b62..515775eab2ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -67,10 +67,12 @@ int amdgpu_pasid_alloc(unsigned int bits)
 	if (bits == 0)
 		return -EINVAL;
 
+	idr_preload(GFP_KERNEL);
 	spin_lock(&amdgpu_pasid_idr_lock);
 	pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1,
-				 1U << bits, GFP_KERNEL);
+				 1U << bits, GFP_NOWAIT);
 	spin_unlock(&amdgpu_pasid_idr_lock);
+	idr_preload_end();
 
 	if (pasid >= 0)
 		trace_amdgpu_pasid_allocated(pasid);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread
* [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management
@ 2026-03-30  5:30 Mikhail Gavrilov
  2026-03-30  5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
  0 siblings, 1 reply; 5+ messages in thread
From: Mikhail Gavrilov @ 2026-03-30  5:30 UTC (permalink / raw)
  To: Alex Deucher, Christian König
  Cc: Eric Huang, David Airlie, Simona Vetter, amd-gfx, dri-devel,
	stable, Mikhail Gavrilov

Commit 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
converted the global PASID allocator from IDA to IDR with a spinlock
for cyclic allocation.  This introduced two locking bugs:
 
1) idr_alloc_cyclic() is called with GFP_KERNEL under spin_lock(),
   which can sleep.
 
2) amdgpu_pasid_free() can be called from hardirq context via the
   fence signal path (amdgpu_pasid_free_cb), but the lock is taken
   with plain spin_lock() in process context, creating a potential
   deadlock:
 
     CPU0
     ----
     spin_lock(&amdgpu_pasid_idr_lock)   // process context, IRQs on
     <Interrupt>
       spin_lock(&amdgpu_pasid_idr_lock) // deadlock
 
   The hardirq call chain is:
 
     sdma_v6_0_process_trap_irq
      -> amdgpu_fence_process
       -> dma_fence_signal
        -> drm_sched_job_done
         -> dma_fence_signal
          -> amdgpu_pasid_free_cb
           -> amdgpu_pasid_free
 
   This was observed on an RX 7900 XTX when exiting a Vulkan game
   running under Proton/Wine, which triggers the fence callback path
   during VM teardown.
 
Patch 1 fixes the sleeping-under-spinlock by using idr_preload() with
GFP_KERNEL before taking the lock, then GFP_NOWAIT for the actual
allocation.
 
Patch 2 converts all three spin_lock/spin_unlock call sites to
spin_lock_irqsave/spin_unlock_irqrestore.
 
Tested on ASUS ROG STRIX B650E-I / Ryzen 9 7950X / RX 7900 XTX with
CONFIG_PROVE_LOCKING=y.  The lockdep warning is no longer triggered
after applying both patches.

Mikhail Gavrilov (2):
  drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR
  drm/amdgpu: use spin_lock_irqsave for PASID IDR lock

 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-03-31  7:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-28 21:39 [PATCH] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-30  7:31 ` Lazar, Lijo
2026-03-31  7:50 ` Claude review: " Claude Code Review Bot
2026-03-31  7:50 ` Claude Code Review Bot
  -- strict thread matches above, loose matches on Subject: below --
2026-03-30  5:30 [PATCH v2 0/2] drm/amdgpu: fix locking issues in PASID IDR management Mikhail Gavrilov
2026-03-30  5:30 ` [PATCH v2 1/2] drm/amdgpu: fix sleeping allocation under spinlock in PASID IDR Mikhail Gavrilov
2026-03-31  7:36   ` Claude review: " Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox