public inbox for drm-ai-reviews@public-inbox.freedesktop.org
* [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager
@ 2026-04-29 12:37 Tejas Upadhyay
  2026-04-30  9:14 ` Matthew Auld
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Tejas Upadhyay @ 2026-04-29 12:37 UTC (permalink / raw)
  To: intel-xe
  Cc: matthew.auld, Arunpravin.PaneerSelvam, dri-devel, amd-gfx,
	Tejas Upadhyay

gpu_buddy APIs are expected to be called with the driver-provided lock
held, but there is no runtime enforcement of this contract. Add lockdep
annotations to catch locking violations early.

Introduce gpu_buddy_driver_set_lock() for the driver to register the
lock that protects the buddy manager. Add gpu_buddy_driver_lock_held()
assertions to all exported gpu_buddy and drm_buddy APIs that
access/modify the manager state. The lock_dep_map field is only compiled
in when CONFIG_LOCKDEP is enabled, adding zero overhead to production
builds.

Wire up xe_ttm_vram_mgr to register its mutex with the buddy manager
after initialization.

Assisted-by: Claude:claude-opus-4.6
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 drivers/gpu/buddy.c                  | 11 ++++++++
 drivers/gpu/drm/drm_buddy.c          |  1 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  1 +
 include/linux/gpu_buddy.h            | 41 ++++++++++++++++++++++++++++
 4 files changed, 54 insertions(+)

diff --git a/drivers/gpu/buddy.c b/drivers/gpu/buddy.c
index 52686672e99f..eb1457376307 100644
--- a/drivers/gpu/buddy.c
+++ b/drivers/gpu/buddy.c
@@ -437,6 +437,9 @@ int gpu_buddy_init(struct gpu_buddy *mm, u64 size, u64 chunk_size)
 		root_count++;
 	} while (size);
 
+#ifdef CONFIG_LOCKDEP
+	mm->lock_dep_map = NULL;
+#endif
 	return 0;
 
 out_free_roots:
@@ -538,6 +541,7 @@ void gpu_buddy_reset_clear(struct gpu_buddy *mm, bool is_clear)
 	unsigned int order;
 	int i;
 
+	gpu_buddy_driver_lock_held(mm);
 	size = mm->size;
 	for (i = 0; i < mm->n_roots; ++i) {
 		order = ilog2(size) - ilog2(mm->chunk_size);
@@ -580,6 +584,7 @@ EXPORT_SYMBOL(gpu_buddy_reset_clear);
 void gpu_buddy_free_block(struct gpu_buddy *mm,
 			  struct gpu_buddy_block *block)
 {
+	gpu_buddy_driver_lock_held(mm);
 	BUG_ON(!gpu_buddy_block_is_allocated(block));
 	mm->avail += gpu_buddy_block_size(mm, block);
 	if (gpu_buddy_block_is_clear(block))
@@ -633,6 +638,7 @@ void gpu_buddy_free_list(struct gpu_buddy *mm,
 {
 	bool mark_clear = flags & GPU_BUDDY_CLEARED;
 
+	gpu_buddy_driver_lock_held(mm);
 	__gpu_buddy_free_list(mm, objects, mark_clear, !mark_clear);
 }
 EXPORT_SYMBOL(gpu_buddy_free_list);
@@ -1172,6 +1178,8 @@ int gpu_buddy_block_trim(struct gpu_buddy *mm,
 	u64 new_start;
 	int err;
 
+	gpu_buddy_driver_lock_held(mm);
+
 	if (!list_is_singular(blocks))
 		return -EINVAL;
 
@@ -1287,6 +1295,8 @@ int gpu_buddy_alloc_blocks(struct gpu_buddy *mm,
 	unsigned long pages;
 	int err;
 
+	gpu_buddy_driver_lock_held(mm);
+
 	if (size < mm->chunk_size)
 		return -EINVAL;
 
@@ -1475,6 +1485,7 @@ void gpu_buddy_print(struct gpu_buddy *mm)
 {
 	int order;
 
+	gpu_buddy_driver_lock_held(mm);
 	pr_info("chunk_size: %lluKiB, total: %lluMiB, free: %lluMiB, clear_free: %lluMiB\n",
 		mm->chunk_size >> 10, mm->size >> 20, mm->avail >> 20, mm->clear_avail >> 20);
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 841f3de5f307..faa025498de4 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -42,6 +42,7 @@ void drm_buddy_print(struct gpu_buddy *mm, struct drm_printer *p)
 {
 	int order;
 
+	gpu_buddy_driver_lock_held(mm);
 	drm_printf(p, "chunk_size: %lluKiB, total: %lluMiB, free: %lluMiB, clear_free: %lluMiB\n",
 		   mm->chunk_size >> 10, mm->size >> 20, mm->avail >> 20, mm->clear_avail >> 20);
 
diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 5fd0d5506a7e..7ebc4d278c3b 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -322,6 +322,7 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	if (err)
 		return err;
 
+	gpu_buddy_driver_set_lock(&mgr->mm, &mgr->lock);
 	ttm_set_driver_manager(&xe->ttm, mem_type, &mgr->manager);
 	ttm_resource_manager_set_used(&mgr->manager, true);
 
diff --git a/include/linux/gpu_buddy.h b/include/linux/gpu_buddy.h
index 5fa917ba5450..71941a039648 100644
--- a/include/linux/gpu_buddy.h
+++ b/include/linux/gpu_buddy.h
@@ -154,6 +154,7 @@ struct gpu_buddy_block {
  * @avail: Total free space currently available for allocation in bytes.
  * @clear_avail: Free space available in the clear tree (zeroed memory) in bytes.
  *               This is a subset of @avail.
+ * @lock_dep_map: Annotates gpu_buddy API with a driver provided lock.
  */
 struct gpu_buddy {
 /* private: */
@@ -179,8 +180,48 @@ struct gpu_buddy {
 	u64 size;
 	u64 avail;
 	u64 clear_avail;
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map *lock_dep_map;
+#endif
 };
 
+#ifdef CONFIG_LOCKDEP
+/**
+ * gpu_buddy_driver_set_lock() - Set the lock protecting accesses to GPU BUDDY
+ * @mm: Pointer to GPU buddy structure.
+ * @lock: the lock used to protect the gpu buddy. The locking primitive
+ * must contain a dep_map field.
+ *
+ * Call this to annotate gpu_buddy APIs which access/modify gpu_buddy manager
+ */
+#define gpu_buddy_driver_set_lock(mm, lock) \
+	do { \
+		struct gpu_buddy *__mm = (mm); \
+		if (!WARN(__mm->lock_dep_map, "GPU BUDDY MM lock should be set only once.")) \
+			__mm->lock_dep_map = &(lock)->dep_map; \
+	} while (0)
+#else
+#define gpu_buddy_driver_set_lock(mm, lock) do { (void)(mm); (void)(lock); } while (0)
+#endif
+
+#ifdef CONFIG_LOCKDEP
+/**
+ * gpu_buddy_driver_lock_held() - Assert GPU BUDDY manager lock is held
+ * @mm: Pointer to the GPU BUDDY structure.
+ *
+ * Ensure driver lock is held.
+ */
+static inline void gpu_buddy_driver_lock_held(struct gpu_buddy *mm)
+{
+	if (mm->lock_dep_map)
+		lockdep_assert(lock_is_held_type(mm->lock_dep_map, 0));
+}
+#else
+static inline void gpu_buddy_driver_lock_held(struct gpu_buddy *mm)
+{
+}
+#endif
+
 static inline u64
 gpu_buddy_block_offset(const struct gpu_buddy_block *block)
 {
-- 
2.52.0



* Re: [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager
  2026-04-29 12:37 [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager Tejas Upadhyay
@ 2026-04-30  9:14 ` Matthew Auld
  2026-04-30  9:34   ` Upadhyay, Tejas
  2026-04-30 13:12   ` Upadhyay, Tejas
  2026-05-05  1:39 ` Claude review: " Claude Code Review Bot
  2026-05-05  1:39 ` Claude Code Review Bot
  2 siblings, 2 replies; 6+ messages in thread
From: Matthew Auld @ 2026-04-30  9:14 UTC (permalink / raw)
  To: Tejas Upadhyay, intel-xe; +Cc: Arunpravin.PaneerSelvam, dri-devel, amd-gfx

On 29/04/2026 13:37, Tejas Upadhyay wrote:
> gpu_buddy APIs are expected to be called with the driver-provided lock
> held, but there is no runtime enforcement of this contract. Add lockdep
> annotations to catch locking violations early.
> 
> Introduce gpu_buddy_driver_set_lock() for the driver to register the
> lock that protects the buddy manager. Add gpu_buddy_driver_lock_held()
> assertions to all exported gpu_buddy and drm_buddy APIs that
> access/modify the manager state. The lock_dep_map field is only compiled
> in when CONFIG_LOCKDEP is enabled, adding zero overhead to production
> builds.
> 
> Wire up xe_ttm_vram_mgr to register its mutex with the buddy manager
> after initialization.
> 
> Assisted-by: Claude:claude-opus-4.6

I think add:

Suggested-by: Matthew Brost <matthew.brost@intel.com>

> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>

Reviewed-by: Matthew Auld <matthew.auld@intel.com>


* RE: [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager
  2026-04-30  9:14 ` Matthew Auld
@ 2026-04-30  9:34   ` Upadhyay, Tejas
  2026-04-30 13:12   ` Upadhyay, Tejas
  1 sibling, 0 replies; 6+ messages in thread
From: Upadhyay, Tejas @ 2026-04-30  9:34 UTC (permalink / raw)
  To: Auld, Matthew, intel-xe@lists.freedesktop.org
  Cc: Arunpravin.PaneerSelvam@amd.com, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org



> -----Original Message-----
> From: Auld, Matthew <matthew.auld@intel.com>
> Sent: 30 April 2026 14:45
> To: Upadhyay, Tejas <tejas.upadhyay@intel.com>; intel-xe@lists.freedesktop.org
> Cc: Arunpravin.PaneerSelvam@amd.com; dri-devel@lists.freedesktop.org;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/buddy: Integrate lockdep annotations for gpu
> buddy manager
> 
> On 29/04/2026 13:37, Tejas Upadhyay wrote:
> > gpu_buddy APIs are expected to be called with the driver-provided lock
> > held, but there is no runtime enforcement of this contract. Add
> > lockdep annotations to catch locking violations early.
> >
> > Introduce gpu_buddy_driver_set_lock() for the driver to register the
> > lock that protects the buddy manager. Add gpu_buddy_driver_lock_held()
> > assertions to all exported gpu_buddy and drm_buddy APIs that
> > access/modify the manager state. The lock_dep_map field is only
> > compiled in when CONFIG_LOCKDEP is enabled, adding zero overhead to
> > production builds.
> >
> > Wire up xe_ttm_vram_mgr to register its mutex with the buddy manager
> > after initialization.
> >
> > Assisted-by: Claude:claude-opus-4.6
> 
> I think add:
> 
> Suggested-by: Matthew Brost <matthew.brost@intel.com>

Sure, thanks for the review. Will add.

Tejas
> 
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> 
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>


* RE: [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager
  2026-04-30  9:14 ` Matthew Auld
  2026-04-30  9:34   ` Upadhyay, Tejas
@ 2026-04-30 13:12   ` Upadhyay, Tejas
  1 sibling, 0 replies; 6+ messages in thread
From: Upadhyay, Tejas @ 2026-04-30 13:12 UTC (permalink / raw)
  To: Auld, Matthew, intel-xe@lists.freedesktop.org
  Cc: Arunpravin.PaneerSelvam@amd.com, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org

The reported failures are not related to this change (see https://patchwork.freedesktop.org/series/165720/#rev1 ). I will merge it later. Please let me know if you have any concerns.

Tejas



* Claude review: drm/buddy: Integrate lockdep annotations for gpu buddy manager
  2026-04-29 12:37 [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager Tejas Upadhyay
  2026-04-30  9:14 ` Matthew Auld
  2026-05-05  1:39 ` Claude review: " Claude Code Review Bot
@ 2026-05-05  1:39 ` Claude Code Review Bot
  2 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-05-05  1:39 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: drm/buddy: Integrate lockdep annotations for gpu buddy manager
Author: Tejas Upadhyay <tejas.upadhyay@intel.com>
Patches: 4
Reviewed: 2026-05-05T11:39:29.714051

---

This is a single patch adding lockdep annotations to the `gpu_buddy` allocator, closely modeled after the existing `drm_gpusvm_driver_set_lock()` / `drm_gpusvm_driver_lock_held()` pattern. The approach is sound: store a pointer to the driver's `lockdep_map`, then assert it's held on entry to exported APIs. The implementation is mostly correct, but has several gaps and issues:

1. **Incomplete driver coverage**: Only xe is wired up, but amdgpu, i915, and the ttm test mock manager also call gpu_buddy APIs and need `gpu_buddy_driver_set_lock()` calls. Without those, the annotation is inert for 3 of 4 consumers.

2. **Incomplete API coverage**: Two exported functions (`gpu_buddy_fini` and `gpu_buddy_block_print`) plus one drm-layer export (`drm_buddy_block_print`) are missing annotations.

3. **Minor style inconsistency**: The no-LOCKDEP (`#else`) stub for `gpu_buddy_driver_set_lock()` is `do { (void)(mm); (void)(lock); } while (0)`, whereas the `drm_gpusvm` stub it was modeled on is `do {} while (0)`. Not a bug, but inconsistent with the template this was copied from.

4. **The macro adds a local-variable guard that the drm_gpusvm template lacks**: The `drm_gpusvm_driver_set_lock` macro accesses `(gpusvm)` directly, so its argument is evaluated more than once, while the gpu_buddy version caches it in a local `__mm`. This is an *improvement* over the gpusvm original rather than a defect, and needs no change.

Overall this is a reasonable v1 that needs a respin to cover the remaining callers and APIs.
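
To make the approach concrete, here is a minimal userspace sketch of the register-then-assert pattern described above. Everything in it is a hypothetical stand-in (the `toy_*` names, the boolean `held` flag in place of a real `lockdep_map`, and the `-1` return in place of a lockdep splat), so it models the idea only and is not the kernel implementation. It also shows why an unregistered lock makes the check inert, which is the concern behind item 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_dep_map { bool held; };
struct toy_mutex { struct toy_dep_map dep_map; };
struct toy_buddy {
	struct toy_dep_map *lock_dep_map;	/* NULL until a lock is registered */
	unsigned long avail;
};

static void toy_mutex_lock(struct toy_mutex *m)   { m->dep_map.held = true; }
static void toy_mutex_unlock(struct toy_mutex *m) { m->dep_map.held = false; }

/* Register the protecting lock; refuse a second registration. */
static bool toy_buddy_set_lock(struct toy_buddy *mm, struct toy_mutex *lock)
{
	if (mm->lock_dep_map)
		return false;
	mm->lock_dep_map = &lock->dep_map;
	return true;
}

/* Passes when no lock is registered (check skipped) or the lock is held. */
static bool toy_buddy_lock_ok(const struct toy_buddy *mm)
{
	return !mm->lock_dep_map || mm->lock_dep_map->held;
}

/* An "exported API" that checks the locking contract before touching state. */
static int toy_buddy_free(struct toy_buddy *mm, unsigned long size)
{
	if (!toy_buddy_lock_ok(mm))
		return -1;	/* stand-in for a lockdep warning */
	mm->avail += size;
	return 0;
}
```

In this model, a caller that never registers a lock always passes the check, while a caller that registers one must hold it around every call.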

---
Generated by Claude Code Patch Reviewer


* Claude review: drm/buddy: Integrate lockdep annotations for gpu buddy manager
  2026-04-29 12:37 [PATCH] drm/buddy: Integrate lockdep annotations for gpu buddy manager Tejas Upadhyay
  2026-04-30  9:14 ` Matthew Auld
@ 2026-05-05  1:39 ` Claude Code Review Bot
  2026-05-05  1:39 ` Claude Code Review Bot
  2 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-05-05  1:39 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Positive aspects:**
- Pattern is well-established — directly follows `drm_gpusvm` precedent
- `lock_dep_map` field is `#ifdef CONFIG_LOCKDEP` guarded, so zero overhead on production builds
- The macro uses a local variable `__mm` to avoid double-evaluation, which is slightly better than the gpusvm original
- Correct use of `lock_is_held_type(map, 0)` for exclusive (mutex) lock assertion
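
The double-evaluation point can be illustrated with a small userspace sketch. The `TOY_*` macros and `toy_*` helpers below are hypothetical and only mimic the shape of the two set-lock macros; they are not the kernel definitions. The argument expression has a visible side effect (a call counter), so the difference between the two forms is observable.

```c
#include <assert.h>
#include <stddef.h>

struct toy_map { int id; };
struct toy_mm { struct toy_map *lock_dep_map; };

/* Expands (mm) twice: once in the test and once in the assignment. */
#define TOY_SET_LOCK_DIRECT(mm, map) \
	do { \
		if (!(mm)->lock_dep_map) \
			(mm)->lock_dep_map = (map); \
	} while (0)

/* Expands (mm) once by caching it in a local variable. */
#define TOY_SET_LOCK_CACHED(mm, map) \
	do { \
		struct toy_mm *cached_mm = (mm); \
		if (!cached_mm->lock_dep_map) \
			cached_mm->lock_dep_map = (map); \
	} while (0)

static int toy_calls;			/* how often the argument was evaluated */
static struct toy_mm toy_instance;

/* Argument expression with a side effect, to make re-evaluation visible. */
static struct toy_mm *toy_get_mm(void)
{
	toy_calls++;
	return &toy_instance;
}
```

Passing `toy_get_mm()` to the direct form calls it twice when the field is still unset, while the cached form calls it once.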

**Issue 1 (Major): Missing driver callers**

Only xe gets `gpu_buddy_driver_set_lock()`, but amdgpu and i915 also use the buddy allocator with their own mutexes. Without wiring those up, the lockdep annotation does nothing for those drivers and `gpu_buddy_driver_lock_held()` will silently skip the assertion (because `lock_dep_map` is NULL).

```c
// xe_ttm_vram_mgr.c — wired up (good)
gpu_buddy_driver_set_lock(&mgr->mm, &mgr->lock);
```

Missing from `amdgpu_vram_mgr.c` (after `gpu_buddy_init` at line 934):
```c
err = gpu_buddy_init(&mgr->mm, man->size, PAGE_SIZE);
if (err)
    return err;
// needs: gpu_buddy_driver_set_lock(&mgr->mm, &mgr->lock);
```

Missing from `i915_ttm_buddy_manager.c` (after `gpu_buddy_init` at line 297):
```c
err = gpu_buddy_init(&bman->mm, size, chunk_size);
if (err)
    goto err_free_bman;
mutex_init(&bman->lock);
// needs: gpu_buddy_driver_set_lock(&bman->mm, &bman->lock);
```

Missing from `ttm_mock_manager.c` (test code, lower priority but still should be consistent).

**Issue 2 (Medium): Missing annotations on exported functions**

`gpu_buddy_fini()` and `gpu_buddy_block_print()` are exported and access manager state, but are not annotated:

```c
// buddy.c line 461 — no gpu_buddy_driver_lock_held(mm)
void gpu_buddy_fini(struct gpu_buddy *mm)
{
    ...
}
EXPORT_SYMBOL(gpu_buddy_fini);

// buddy.c line 1458 — no gpu_buddy_driver_lock_held(mm)
void gpu_buddy_block_print(struct gpu_buddy *mm, ...)
{
    ...
}
EXPORT_SYMBOL(gpu_buddy_block_print);
```

Similarly, `drm_buddy_block_print()` in `drm_buddy.c` is exported and reads buddy state but is not annotated (while `drm_buddy_print()` in the same file *is* annotated).

For `gpu_buddy_fini`, the annotation is debatable — typically teardown happens after the lock is no longer needed. But `gpu_buddy_fini` calls `__force_merge` which modifies the buddy tree, so either annotate it or add a comment explaining why it's intentionally excluded.

**Issue 3 (Minor): Inconsistent no-LOCKDEP stub style**

```c
// gpu_buddy_driver_set_lock non-lockdep stub:
#define gpu_buddy_driver_set_lock(mm, lock) do { (void)(mm); (void)(lock); } while (0)

// drm_gpusvm_driver_set_lock non-lockdep stub for comparison:
#define drm_gpusvm_driver_set_lock(gpusvm, lock) do {} while (0)
```

The `(void)` casts are not wrong (they suppress unused-variable warnings), but the gpusvm template the code is modeled after doesn't use them. This is a matter of style consistency — either approach works.
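
One behavioral difference between the two stub styles is worth keeping in mind: the `(void)` form still evaluates its arguments, while the empty form discards them entirely. The userspace sketch below uses hypothetical `TOY_*` macros and `toy_*` helpers that only mirror the shape of the two stubs, and counts how often an argument expression with a side effect is evaluated.

```c
#include <assert.h>

/* Same shape as the two no-LOCKDEP stubs quoted above (names are hypothetical). */
#define TOY_STUB_VOID(mm, lock)  do { (void)(mm); (void)(lock); } while (0)
#define TOY_STUB_EMPTY(mm, lock) do {} while (0)

static int toy_evals;	/* number of times an argument was evaluated */
static int toy_state;

/* Argument expression with a side effect. */
static int *toy_arg(void)
{
	toy_evals++;
	return &toy_state;
}

/* Returns how many argument evaluations the (void)-cast stub performs. */
static int toy_count_void(void)
{
	toy_evals = 0;
	TOY_STUB_VOID(toy_arg(), toy_arg());
	return toy_evals;
}

/* Returns how many argument evaluations the empty stub performs. */
static int toy_count_empty(void)
{
	toy_evals = 0;
	TOY_STUB_EMPTY(toy_arg(), toy_arg());
	return toy_evals;
}
```

For the plain `&mgr->mm, &mgr->lock` arguments used in this patch the difference is not observable, so this remains a style question.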

**Issue 4 (Minor): `gpu_buddy_init` NULL initialization placement**

```c
+#ifdef CONFIG_LOCKDEP
+	mm->lock_dep_map = NULL;
+#endif
 	return 0;
```

This is placed at the end of the success path in `gpu_buddy_init`, which is fine and matches the `drm_gpusvm` version, which also sets the field at the end of initialization. Correct and consistent; no change needed.

**Issue 5 (Nit): Commit message says "Assisted-by: Claude"**

```
Assisted-by: Claude:claude-opus-4.6
```

The colon-separated `Claude:claude-opus-4.6` form appears to match the `Assisted-by: AGENT_NAME:MODEL_VERSION` convention described in the kernel's coding-assistants process documentation, so it likely needs no change. If the subsystem maintainers prefer a different form, adjust it to their preference.

**Summary**: The core design is correct and follows established kernel patterns. The main issue is that the patch is incomplete — it needs to wire up all gpu_buddy consumers (amdgpu, i915, test mock) and annotate all exported functions, or explicitly justify why some are excluded. I'd suggest splitting this into a series: patch 1 for the infrastructure + buddy.c/drm_buddy.c annotations, then one patch per driver wiring up the lock registration.

---
Generated by Claude Code Patch Reviewer

