public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context
@ 2026-05-01 21:58 Lyude Paul
  2026-05-04  4:01 ` Dave Airlie
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Lyude Paul @ 2026-05-01 21:58 UTC (permalink / raw)
  To: dri-devel, nouveau, linux-kernel
  Cc: Dave Airlie, Danilo Krummrich, Dave Airlie, Timur Tabi,
	Suraj Kandpal, James Jones, Faith Ekstrand, stable,
	Maarten Lankhorst, Ben Skeggs, Simona Vetter, Thomas Zimmermann,
	Maxime Ripard, Lyude Paul

It looks like CRC read back has been slightly broken for a while now, in
particular on GPUs using GSP. On my test machines, it's worked normally
when attempting to use it from fbcon. After gnome-shell gets started
however, attempting to read /sys/kernel/debug/dri/$CARD/$CRTC/crc/data just
returns -EINVAL.

It turns out what's been happening is that since we've been using
nvif_mem_ctor_map() to both allocate and map the CRC notifier region - we
haven't actually asked for a contiguous allocation, and simply ask for
whatever type of memory allocation nouveau can find first. This doesn't
work because the CRC engine on nvidia GPUs doesn't support non-contiguous
allocations, which also causes us to fail setting up the kmsCrcNtfyCtxDma
object on pre-blackwell platforms since we don't have a single memory
address we can point nvif_object_ctor() to. Instead, ctx->mem.addr gets set
to ~0ULL.

It does however, seem to work when fbcon is running. The only reason I can
think of this is that before we start up a display environment, there is
pretty much nothing allocated in our VRAM that wasn't allocated by nouveau
itself - making it dramatically more likely that we end up finding a
contiguous allocation by default.

So, fix this by manually requesting a contiguous allocation when we
allocate our context notifiers.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 12885ecbfe62 ("drm/nouveau/kms/nvd9-: Add CRC support")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: James Jones <jajones@nvidia.com>
Cc: Faith Ekstrand <faith.ekstrand@collabora.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.9+
---
 drivers/gpu/drm/nouveau/dispnv50/crc.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c
index deb6af40ef328..5817f39934a8b 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/crc.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c
@@ -10,6 +10,7 @@
 #include <nvif/class.h>
 #include <nvif/cl0002.h>
 #include <nvif/timer.h>
+#include <nvif/if900b.h>
 
 #include <nvhw/class/cl907d.h>
 
@@ -499,16 +500,24 @@ nv50_crc_raster_type(enum nv50_crc_source source)
  * notifier needs it's own handle
  */
 static inline int
-nv50_crc_ctx_init(struct nv50_head *head, struct nvif_mmu *mmu,
+nv50_crc_ctx_init(struct drm_device *dev, struct nv50_head *head, struct nvif_mmu *mmu,
 		  struct nv50_crc_notifier_ctx *ctx, size_t len, int idx)
 {
-	struct nv50_core *core = nv50_disp(head->base.base.dev)->core;
+	struct nv50_core *core = nv50_disp(dev)->core;
 	int ret;
 
-	ret = nvif_mem_ctor_map(mmu, "kmsCrcNtfy", NVIF_MEM_VRAM, len, &ctx->mem);
+	/* The display engine requires a contiguous region of memory for the CRC notifier context */
+	ret = nvif_mem_ctor(mmu, "kmsCrcNtfy", mmu->mem, NVIF_MEM_VRAM | NVIF_MEM_MAPPABLE, 0, len,
+			    &(struct gf100_mem_v0) {
+				.contig = true,
+			    }, sizeof(struct gf100_mem_v0), &ctx->mem);
 	if (ret)
 		return ret;
 
+	ret = nvif_object_map(&ctx->mem.object, NULL, 0);
+	if (ret)
+		goto fail_fini;
+
 	/* No CTXDMAs on Blackwell. */
 	if (core->chan.base.user.oclass >= GB202_DISP_CORE_CHANNEL_DMA)
 		return 0;
@@ -576,7 +585,7 @@ int nv50_crc_set_source(struct drm_crtc *crtc, const char *source_str)
 
 	if (source) {
 		for (i = 0; i < ARRAY_SIZE(head->crc.ctx); i++) {
-			ret = nv50_crc_ctx_init(head, mmu, &crc->ctx[i],
+			ret = nv50_crc_ctx_init(dev, head, mmu, &crc->ctx[i],
 						func->notifier_len, i);
 			if (ret)
 				goto out_ctx_fini;

base-commit: 29d6da40d0b8bf3bbc3dcd1d2198434a0e1f71b0
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context
  2026-05-01 21:58 [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context Lyude Paul
@ 2026-05-04  4:01 ` Dave Airlie
  2026-05-04 23:05 ` Claude review: " Claude Code Review Bot
  2026-05-04 23:05 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Dave Airlie @ 2026-05-04  4:01 UTC (permalink / raw)
  To: Lyude Paul
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich, Dave Airlie,
	Timur Tabi, Suraj Kandpal, James Jones, Faith Ekstrand, stable,
	Maarten Lankhorst, Ben Skeggs, Simona Vetter, Thomas Zimmermann,
	Maxime Ripard

On Sat, 2 May 2026 at 07:59, Lyude Paul <lyude@redhat.com> wrote:
>
> It looks like CRC read back has been slightly broken for a while now, in
> particular on GPUs using GSP. On my test machines, it's worked normally
> when attempting to use it from fbcon. After gnome-shell gets started
> however, attempting to read /sys/kernel/debug/dri/$CARD/$CRTC/crc/data just
> returns -EINVAL.
>
> It turns out what's been happening is that since we've been using
> nvif_mem_ctor_map() to both allocate and map the CRC notifier region - we
> haven't actually asked for a contiguous allocation, and simply ask for
> whatever type of memory allocation nouveau can find first. This doesn't
> work because the CRC engine on nvidia GPUs doesn't support non-contiguous
> allocations, which also causes us to fail setting up the kmsCrcNtfyCtxDma
> object on pre-blackwell platforms since we don't have a single memory
> address we can point nvif_object_ctor() to. Instead, ctx->mem.addr gets set
> to ~0ULL.
>
> It does however, seem to work when fbcon is running. The only reason I can
> think of this is that before we start up a display environment, there is
> pretty much nothing allocated in our VRAM that wasn't allocated by nouveau
> itself - making it dramatically more likely that we end up finding a
> contiguous allocation by default.
>
> So, fix this by manually requesting a contiguous allocation when we
> allocate our context notifiers.
>
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> Fixes: 12885ecbfe62 ("drm/nouveau/kms/nvd9-: Add CRC support")
> Cc: Lyude Paul <lyude@redhat.com>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Timur Tabi <ttabi@nvidia.com>
> Cc: Suraj Kandpal <suraj.kandpal@intel.com>
> Cc: James Jones <jajones@nvidia.com>
> Cc: Faith Ekstrand <faith.ekstrand@collabora.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: nouveau@lists.freedesktop.org
> Cc: <stable@vger.kernel.org> # v5.9+
> ---
>  drivers/gpu/drm/nouveau/dispnv50/crc.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c
> index deb6af40ef328..5817f39934a8b 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/crc.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c
> @@ -10,6 +10,7 @@
>  #include <nvif/class.h>
>  #include <nvif/cl0002.h>
>  #include <nvif/timer.h>
> +#include <nvif/if900b.h>
>
>  #include <nvhw/class/cl907d.h>
>
> @@ -499,16 +500,24 @@ nv50_crc_raster_type(enum nv50_crc_source source)
>   * notifier needs it's own handle
>   */
>  static inline int
> -nv50_crc_ctx_init(struct nv50_head *head, struct nvif_mmu *mmu,
> +nv50_crc_ctx_init(struct drm_device *dev, struct nv50_head *head, struct nvif_mmu *mmu,
>                   struct nv50_crc_notifier_ctx *ctx, size_t len, int idx)
>  {
> -       struct nv50_core *core = nv50_disp(head->base.base.dev)->core;
> +       struct nv50_core *core = nv50_disp(dev)->core;
>         int ret;
>
> -       ret = nvif_mem_ctor_map(mmu, "kmsCrcNtfy", NVIF_MEM_VRAM, len, &ctx->mem);
> +       /* The display engine requires a contiguous region of memory for the CRC notifier context */
> +       ret = nvif_mem_ctor(mmu, "kmsCrcNtfy", mmu->mem, NVIF_MEM_VRAM | NVIF_MEM_MAPPABLE, 0, len,
> +                           &(struct gf100_mem_v0) {
> +                               .contig = true,
> +                           }, sizeof(struct gf100_mem_v0), &ctx->mem);

I don't think you can just throw gf100_mem_v0 in here like that, I
think you should maybe make nvif_mem_ctor_map deal with this since you
want a kernel mappable vram region it needs contig.

Dave.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Claude review: drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context
  2026-05-01 21:58 [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context Lyude Paul
  2026-05-04  4:01 ` Dave Airlie
@ 2026-05-04 23:05 ` Claude Code Review Bot
  2026-05-04 23:05 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-05-04 23:05 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context
Author: Lyude Paul <lyude@redhat.com>
Patches: 2
Reviewed: 2026-05-05T09:05:55.229008

---

This is a single-patch fix for a real bug in nouveau CRC readback. The commit message is excellent: it clearly explains the symptom (CRC reads return `-EINVAL` after gnome-shell starts), the root cause (non-contiguous VRAM allocation for a hardware engine that requires contiguous memory), and why the bug is intermittent (low VRAM fragmentation under fbcon makes contiguous allocation likely by accident). The fix is correct, the error handling is sound, and the `Fixes:` / `Cc: stable` tags are appropriate.

**Recommendation: Accept.** One very minor style observation below, nothing that should block the patch.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Claude review: drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context
  2026-05-01 21:58 [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context Lyude Paul
  2026-05-04  4:01 ` Dave Airlie
  2026-05-04 23:05 ` Claude review: " Claude Code Review Bot
@ 2026-05-04 23:05 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-05-04 23:05 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Correctness of the core fix:**

The old code called `nvif_mem_ctor_map()` which internally passes `NULL, 0` for the argv/argc parameters to `nvif_mem_ctor()`:

```c
// nvif/mem.c:31-32 (old path)
int ret = nvif_mem_ctor(mmu, name, mmu->mem, NVIF_MEM_MAPPABLE | type,
                        0, size, NULL, 0, mem);
```

The new code replaces this with a direct `nvif_mem_ctor()` call that passes a `gf100_mem_v0` struct with `.contig = true`:

```c
ret = nvif_mem_ctor(mmu, "kmsCrcNtfy", mmu->mem, NVIF_MEM_VRAM | NVIF_MEM_MAPPABLE, 0, len,
                    &(struct gf100_mem_v0) {
                            .contig = true,
                    }, sizeof(struct gf100_mem_v0), &ctx->mem);
```

This is correct. The compound literal zero-initializes `.version` to 0, matching the `v0` convention. The `gf100_mem_v0` data gets copied into `args->data[]` inside `nvif_mem_ctor_type()` (mem.c:70) and forwarded to the firmware/kernel memory handler, which will honor the contiguous flag.

The flags `NVIF_MEM_VRAM | NVIF_MEM_MAPPABLE` match exactly what `nvif_mem_ctor_map()` was computing internally (`NVIF_MEM_MAPPABLE | type` where `type` was `NVIF_MEM_VRAM`).

**Error handling:**

The separate `nvif_object_map()` call after `nvif_mem_ctor()` correctly jumps to `fail_fini` on failure, which calls `nvif_mem_dtor()` to clean up the allocation. This mirrors the cleanup that `nvif_mem_ctor_map()` did internally (mem.c:36).

**Minor style observation (non-blocking):**

The patch adds `struct drm_device *dev` as a new parameter to `nv50_crc_ctx_init()`, but `dev` is already reachable via the existing `head` parameter as `head->base.base.dev`. The parameter is only used once to replace that chain:

```c
-	struct nv50_core *core = nv50_disp(head->base.base.dev)->core;
+	struct nv50_core *core = nv50_disp(dev)->core;
```

For a stable-targeted bug fix, keeping the change minimal by not adding the parameter would be slightly preferable — just use `head->base.base.dev` as before. That said, it's a trivial cleanup and not worth a respin.

**Reviewed-by worthy:** Yes. The fix is well-motivated, correct, and appropriately scoped.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-04 23:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-01 21:58 [PATCH] drm/nouveau/kms/nvd9-: Use contiguous memory for CRC notifier context Lyude Paul
2026-05-04  4:01 ` Dave Airlie
2026-05-04 23:05 ` Claude review: " Claude Code Review Bot
2026-05-04 23:05 ` Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox