From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claude Code Review Bot
To: dri-devel-reviews@example.com
Subject: Claude review: drm/xe: Split TLB invalidation into submit and wait steps
Date: Wed, 04 Mar 2026 07:09:17 +1000
Message-ID: 
In-Reply-To: <20260303133409.11609-4-thomas.hellstrom@linux.intel.com>
References: <20260303133409.11609-1-thomas.hellstrom@linux.intel.com> <20260303133409.11609-4-thomas.hellstrom@linux.intel.com>
X-Mailer: Claude Code Patch Reviewer
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

Patch Review

This is a clean refactor that splits `xe_vm_range_tilemask_tlb_inval` into `xe_tlb_inval_range_tilemask_submit` + `xe_tlb_inval_batch_wait` and converts all callers.

**`xe_tlb_inval_batch` struct:**

```c
struct xe_tlb_inval_batch {
	struct xe_tlb_inval_fence fence[XE_MAX_TILES_PER_DEVICE * XE_MAX_GT_PER_TILE];
	unsigned int num_fences;
};
```

With `XE_MAX_TILES_PER_DEVICE=2` and `XE_MAX_GT_PER_TILE=2`, this is an array of four `xe_tlb_inval_fence` structs. Each fence contains a `struct dma_fence` (~72-80 bytes depending on config), a pointer, a `list_head`, an `int`, and a `ktime_t`, so the batch struct is roughly 450-500 bytes. That is fine on the stack (the original code also had a stack-allocated array of the same size). However, in patch 4 this gets **embedded per-userptr**, which adds the full batch to every `struct xe_userptr`. That is significant per-object overhead worth noting.

**Error handling in `xe_tlb_inval_range_tilemask_submit`:**

```c
wait:
	batch->num_fences = fence_id;
	if (err)
		xe_tlb_inval_batch_wait(batch);

	return err;
```

When `xe_tlb_inval_range` returns an error, the function waits for all previously-submitted fences and then returns the error. The doc says "If the function returns an error, there is no need to call `xe_tlb_inval_batch_wait()` on @batch."
This is consistent: on error, the internal wait has already been done and `num_fences` is reset to 0 by `xe_tlb_inval_batch_wait`.

**Caller conversion in `xe_vm_invalidate_vma`:**

The original code performed the `WRITE_ONCE(vma->tile_invalidated, ...)` after the TLB invalidation completed. The new code does it *after submit but before wait*:

```c
ret = xe_tlb_inval_range_tilemask_submit(xe, ..., &batch);

WRITE_ONCE(vma->tile_invalidated, vma->tile_mask);

if (!ret)
	xe_tlb_inval_batch_wait(&batch);
```

This is a behavioral change: `tile_invalidated` is now set before the TLB invalidation completes. That is acceptable if `tile_invalidated` means "we have initiated invalidation" rather than "invalidation is complete". Note that this reordering is introduced in this patch and kept in patch 4, where `xe_vm_invalidate_vma` is refactored into `xe_vm_invalidate_vma_submit` + `xe_vm_invalidate_vma`. The semantics should be documented clearly, since `READ_ONCE` consumers of `tile_invalidated` may now observe the flag before the TLB flush is actually done.

**`xe_tlb_inval_types.h` includes `xe_device_types.h`:**

```c
#include "xe_device_types.h"
```

This include is added to get the `XE_MAX_TILES_PER_DEVICE` define for the array size. It is a heavyweight include for a `_types.h` header and could create circular-dependency issues. Consider a forward declaration or a directly-defined constant instead.

Overall: **correct refactor, with minor concerns about the `tile_invalidated` reordering semantics and header weight.**

---
Generated by Claude Code Patch Reviewer