Claude review: drm/panfrost: Check another bo field for cache option query

public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed

* Claude review: drm/panfrost: Check another bo field for cache option query
  2026-05-25 23:03 ` [PATCH 1/4] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
@ 2026-05-27  5:22   ` Claude Code Review Bot
  0 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-05-27  5:22 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Status: Looks correct, good fix.**

The change in `panfrost_drv.c`:
```c
-		if (!bo->base.map_wc)
+		if (bo->wb_mmap)
```

The old code infers WB_MMAP by checking `!map_wc`. But `map_wc` is derived from *two* inputs, as seen in `panfrost_gem.c:should_map_wc()`:
```c
if (pfdev->coherent)
    return false;
if (bo->wb_mmap)
    return false;
return true;
```

On a coherent device, `map_wc` is always `false` regardless of `wb_mmap`. So `!bo->base.map_wc` would be `true` for *every* BO on a coherent device, incorrectly reporting `PANFROST_BO_WB_MMAP` for BOs that were never created with that flag. The fix to check `bo->wb_mmap` directly is the right thing to do -- it queries the user's original intent rather than a derived value.

**Minor**: The commit message could be clearer about *why* `map_wc` is wrong -- specifically mentioning the coherent-device case would help reviewers understand the concrete bug.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost
@ 2026-06-04 17:35 Adrián Larumbe
  2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong, Claude

This patch series comes with some minor fixes for Panfrost. Some of the
issues addressed were uncovered by LLM tools, while the one about RPM was
traditionally debugged.

The only proper bug fixes are the ones for perfcnt and RPM, and the others
are more about ensuring robustness. perfcnt bug was never observed in the
open, because no one had faced a HW reset when a perfcnt session was
ongoing. RPM fix addresses a well-known issue I had been observing for
quite some time, but that I had completely misunderstood.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
Changes in v2:
- Fixed race conditions introduced by the previous revision, in the RPM and
HWPerf commits specifically. These can be consulted at [1]
- Don't attempt to suspend the device unconditionally at the end of
probe(), and do it through autosuspend instead.
- Broke the RPM patch into one that fixes PM refcnt proper and another one
for MMU enablement at device init time.
- Moved perfcnt GPU disable helper into its own commmit. Same for
transplating shrinker initialisation and unplug.
- Added 'Fixes' and 'Reported-by' tags to all the relevant commits
- [1] https://sashiko.dev/#/patchset/20260526-claude-fixes-v1-0-16e92eaa4949%40collabora.com
- Link to v1: https://patch.msgid.link/20260526-claude-fixes-v1-0-16e92eaa4949@collabora.com

To: Boris Brezillon <boris.brezillon@collabora.com>
To: Rob Herring <robh@kernel.org>
To: Steven Price <steven.price@arm.com>
To: Adrián Larumbe <adrian.larumbe@collabora.com>
To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Maxime Ripard <mripard@kernel.org>
To: Thomas Zimmermann <tzimmermann@suse.de>
To: David Airlie <airlied@gmail.com>
To: Simona Vetter <simona@ffwll.ch>
To: Faith Ekstrand <faith.ekstrand@collabora.com>
To: "Marty E. Plummer" <hanetzer@startmail.com>
To: Tomeu Vizoso <tomeu@tomeuvizoso.net>
To: Eric Anholt <eric@anholt.net>
To: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: Neil Armstrong <neil.armstrong@linaro.org>

---
Adrián Larumbe (7):
      drm/panfrost: Check another bo field for cache option query
      drm/panfrost: Prevent division by 0
      drm/panfrost: Move shrinker initialization and unplug one level down
      drm/panfrost: Move perfcnt GPU disable sequence into a helper
      drm/panfrost: Make reset sequence deal with an active HWPerf session
      drm/panfrost: Fix PM usage_count mishandling
      drm/panfrost: Explicitly enable MMU interrupts at device init

 drivers/gpu/drm/panfrost/panfrost_devfreq.c      |  3 +-
 drivers/gpu/drm/panfrost/panfrost_device.c       |  8 +++
 drivers/gpu/drm/panfrost/panfrost_drv.c          | 14 ++---
 drivers/gpu/drm/panfrost/panfrost_gem.h          |  4 +-
 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c |  8 +--
 drivers/gpu/drm/panfrost/panfrost_mmu.c          | 11 +++-
 drivers/gpu/drm/panfrost/panfrost_mmu.h          |  2 +
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c      | 77 ++++++++++++++++++------
 drivers/gpu/drm/panfrost/panfrost_perfcnt.h      |  1 +
 9 files changed, 90 insertions(+), 38 deletions(-)
---
base-commit: c1079aebb4de218caa86c44f9a53700d1a582683
change-id: 20260523-claude-fixes-82b03a134a8c

Best regards,
--  
Adrián Larumbe <adrian.larumbe@collabora.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:57   ` Boris Brezillon
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  2026-06-04 17:35 ` [PATCH v2 2/7] drm/panfrost: Prevent division by 0 Adrián Larumbe
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong, Claude

When querying the properties of a BO, whether it was created in
Writeback caching mode for userspace mappings, is recorded in a Panfrost
BO's wb_mmap field.

Reported-by: Claude <noreply@anthropic.com>
Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 62eedf1ccba5 ("drm/panfrost: Add flag to map GEM object Write-Back Cacheable")
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 3d0bdba2a474..cb8e5015847f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -659,7 +659,7 @@ static int panfrost_ioctl_query_bo_info(struct drm_device *dev, void *data,
 		if (bo->is_heap)
 			args->create_flags |= PANFROST_BO_HEAP;
 
-		if (!bo->base.map_wc)
+		if (bo->wb_mmap)
 			args->create_flags |= PANFROST_BO_WB_MMAP;
 	}
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/7] drm/panfrost: Prevent division by 0
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
  2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:44   ` sashiko-bot
  2026-06-04 18:02   ` Boris Brezillon
  2026-06-04 17:35 ` [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down Adrián Larumbe
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong, Claude

When updating and debug-printing devfreq stats, in the very unlikely
offchance that total device time is less than 100 ns, clamp it to 0 to
avoid division by 0.

Reported-by: Claude <noreply@anthropic.com>
Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
---
 drivers/gpu/drm/panfrost/panfrost_devfreq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
index b51c30778811..e34de47bf43b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
@@ -78,7 +78,8 @@ static int panfrost_devfreq_get_dev_status(struct device *dev,
 
 	dev_dbg(pfdev->base.dev, "busy %lu total %lu %lu %% freq %lu MHz\n",
 		status->busy_time, status->total_time,
-		status->busy_time / (status->total_time / 100),
+		status->total_time >= 100 ?
+			status->busy_time / (status->total_time / 100) : 0,
 		status->current_frequency / 1000 / 1000);
 
 	return 0;

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
  2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
  2026-06-04 17:35 ` [PATCH v2 2/7] drm/panfrost: Prevent division by 0 Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 18:04   ` Boris Brezillon
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  2026-06-04 17:35 ` [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper Adrián Larumbe
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong

Since the moment we call drm_dev_register() the device should be in a
position to accept jobs, so it's best if the shrinker is already
initialized by then.

On top of that, make shrinker functions take an panfrost_device pointer
like other functions in the same sequence and rename them accordingly.

Essentially mimic the init/fini behaviour in Panthor.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_device.c       | 7 +++++++
 drivers/gpu/drm/panfrost/panfrost_drv.c          | 6 ------
 drivers/gpu/drm/panfrost/panfrost_gem.h          | 4 ++--
 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c | 8 ++------
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index 7fed22d555a5..87b372c9e675 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -282,9 +282,15 @@ int panfrost_device_init(struct panfrost_device *pfdev)
 	if (err)
 		goto out_job;
 
+	err = panfrost_gem_shrinker_init(pfdev);
+	if (err)
+		goto out_perfcnt;
+
 	panfrost_gem_init(pfdev);
 
 	return 0;
+out_perfcnt:
+	panfrost_perfcnt_fini(pfdev);
 out_job:
 	panfrost_jm_fini(pfdev);
 out_mmu:
@@ -306,6 +312,7 @@ int panfrost_device_init(struct panfrost_device *pfdev)
 
 void panfrost_device_fini(struct panfrost_device *pfdev)
 {
+	panfrost_gem_shrinker_fini(pfdev);
 	panfrost_perfcnt_fini(pfdev);
 	panfrost_jm_fini(pfdev);
 	panfrost_mmu_fini(pfdev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index cb8e5015847f..2d4b6aa95c66 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -1000,14 +1000,9 @@ static int panfrost_probe(struct platform_device *pdev)
 	if (err < 0)
 		goto err_out1;
 
-	err = panfrost_gem_shrinker_init(&pfdev->base);
-	if (err)
-		goto err_out2;
 
 	return 0;
 
-err_out2:
-	drm_dev_unregister(&pfdev->base);
 err_out1:
 	pm_runtime_disable(pfdev->base.dev);
 	panfrost_device_fini(pfdev);
@@ -1021,7 +1016,6 @@ static void panfrost_remove(struct platform_device *pdev)
 	struct panfrost_device *pfdev = platform_get_drvdata(pdev);
 
 	drm_dev_unregister(&pfdev->base);
-	panfrost_gem_shrinker_cleanup(&pfdev->base);
 
 	pm_runtime_get_sync(pfdev->base.dev);
 	pm_runtime_disable(pfdev->base.dev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index 79d4377019e9..323a1aee255e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -154,8 +154,8 @@ panfrost_gem_mapping_get(struct panfrost_gem_object *bo,
 void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping);
 void panfrost_gem_teardown_mappings_locked(struct panfrost_gem_object *bo);
 
-int panfrost_gem_shrinker_init(struct drm_device *dev);
-void panfrost_gem_shrinker_cleanup(struct drm_device *dev);
+int panfrost_gem_shrinker_init(struct panfrost_device *pfdev);
+void panfrost_gem_shrinker_fini(struct panfrost_device *pfdev);
 
 void panfrost_gem_set_label(struct drm_gem_object *obj, const char *label);
 int panfrost_gem_sync(struct drm_gem_object *obj, u32 type,
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index 2fe967a90bcb..fefae87535d6 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -95,10 +95,8 @@ panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
  *
  * This function registers and sets up the panfrost shrinker.
  */
-int panfrost_gem_shrinker_init(struct drm_device *dev)
+int panfrost_gem_shrinker_init(struct panfrost_device *pfdev)
 {
-	struct panfrost_device *pfdev = to_panfrost_device(dev);
-
 	pfdev->shrinker = shrinker_alloc(0, "drm-panfrost");
 	if (!pfdev->shrinker)
 		return -ENOMEM;
@@ -118,10 +116,8 @@ int panfrost_gem_shrinker_init(struct drm_device *dev)
  *
  * This function unregisters the panfrost shrinker.
  */
-void panfrost_gem_shrinker_cleanup(struct drm_device *dev)
+void panfrost_gem_shrinker_fini(struct panfrost_device *pfdev)
 {
-	struct panfrost_device *pfdev = to_panfrost_device(dev);
-
 	if (pfdev->shrinker)
 		shrinker_free(pfdev->shrinker);
 }

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
                   ` (2 preceding siblings ...)
  2026-06-04 17:35 ` [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:47   ` sashiko-bot
  2026-06-04 18:05   ` Boris Brezillon
  2026-06-04 17:35 ` [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session Adrián Larumbe
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong

Just for the sake of avoiding repetition.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 31 ++++++++++++-----------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
index 7020c0192e18..ad1156678e91 100644
--- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
+++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
@@ -35,6 +35,16 @@ struct panfrost_perfcnt {
 	struct completion dump_comp;
 };
 
+static void panfrost_perfcnt_gpu_disable(struct panfrost_device *pfdev)
+{
+	gpu_write(pfdev, GPU_PERFCNT_CFG,
+		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
+	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
+	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
+	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
+	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
+}
+
 void panfrost_perfcnt_clean_cache_done(struct panfrost_device *pfdev)
 {
 	complete(&pfdev->perfcnt->dump_comp);
@@ -193,12 +203,7 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
 	if (user != perfcnt->user)
 		return -EINVAL;
 
-	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
-	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
-	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
-	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
-	gpu_write(pfdev, GPU_PERFCNT_CFG,
-		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
+	panfrost_perfcnt_gpu_disable(pfdev);
 
 	perfcnt->user = NULL;
 	drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
@@ -327,12 +332,7 @@ int panfrost_perfcnt_init(struct panfrost_device *pfdev)
 	perfcnt->bosize = size;
 
 	/* Start with everything disabled. */
-	gpu_write(pfdev, GPU_PERFCNT_CFG,
-		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
-	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
+	panfrost_perfcnt_gpu_disable(pfdev);
 
 	init_completion(&perfcnt->dump_comp);
 	mutex_init(&perfcnt->lock);
@@ -344,10 +344,5 @@ int panfrost_perfcnt_init(struct panfrost_device *pfdev)
 void panfrost_perfcnt_fini(struct panfrost_device *pfdev)
 {
 	/* Disable everything before leaving. */
-	gpu_write(pfdev, GPU_PERFCNT_CFG,
-		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
-	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0);
-	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
+	panfrost_perfcnt_gpu_disable(pfdev);
 }

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
                   ` (3 preceding siblings ...)
  2026-06-04 17:35 ` [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:49   ` sashiko-bot
  2026-06-04 18:26   ` Boris Brezillon
  2026-06-04 17:35 ` [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling Adrián Larumbe
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong, Claude

Right now, if there's a HW reset and an HWPerf session is active,
panfrost_mmu_reset() will reset the AS count for every single open file's
mmu struct back to 0, and also invalidate their AS numbers. Then, when
disabling hwperf, panfrost_mmu_as_put() will WARN that mmu->as_count is
less than zero.

Fix this by introducing a perfcnt HW reset path.

The choice was made to render perfcnt unusable after reset, so that a
user might have to reprogram it with a full disable/enable sequence
before requesting more perfcnt dumps.

Reported-by: Claude <noreply@anthropic.com>
Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 7786fd108777 ("drm/panfrost: Expose performance counters through unstable ioctls")
---
 drivers/gpu/drm/panfrost/panfrost_device.c  |  1 +
 drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 46 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/panfrost/panfrost_perfcnt.h |  1 +
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index 87b372c9e675..2805d50c1b9b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -426,6 +426,7 @@ bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev,
 
 void panfrost_device_reset(struct panfrost_device *pfdev, bool enable_job_int)
 {
+	panfrost_perfcnt_reset(pfdev);
 	panfrost_gpu_soft_reset(pfdev);
 
 	panfrost_gpu_power_on(pfdev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
index ad1156678e91..c2087ea705fe 100644
--- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
+++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
@@ -33,6 +33,7 @@ struct panfrost_perfcnt {
 	struct panfrost_file_priv *user;
 	struct mutex lock;
 	struct completion dump_comp;
+	atomic_t hw_reset_happened;
 };
 
 static void panfrost_perfcnt_gpu_disable(struct panfrost_device *pfdev)
@@ -57,9 +58,13 @@ void panfrost_perfcnt_sample_done(struct panfrost_device *pfdev)
 
 static int panfrost_perfcnt_dump_locked(struct panfrost_device *pfdev)
 {
+	struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
 	u64 gpuva;
 	int ret;
 
+	if (atomic_read(&perfcnt->hw_reset_happened))
+		return -EIO;
+
 	reinit_completion(&pfdev->perfcnt->dump_comp);
 	gpuva = pfdev->perfcnt->mapping->mmnode.start << PAGE_SHIFT;
 	gpu_write(pfdev, GPU_PERFCNT_BASE_LO, lower_32_bits(gpuva));
@@ -140,6 +145,15 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
 		goto err_vunmap;
 	}
 
+	/* If a reset is ongoing, the AS we get right below will be torn
+	 * down, so rather than waiting until this becomes obvious in a
+	 * perfcnt_dump() ioctl, we ask the user to try again slightly later.
+	 */
+	if (atomic_read(&pfdev->reset.pending)) {
+		ret = -EAGAIN;
+		goto err_vunmap;
+	}
+
 	ret = panfrost_mmu_as_get(pfdev, perfcnt->mapping->mmu);
 	if (ret < 0)
 		goto err_vunmap;
@@ -173,6 +187,16 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
 	if (panfrost_has_hw_issue(pfdev, HW_ISSUE_8186))
 		gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0xffffffff);
 
+	/* If a reset happened, we've no way of knowing whether it was between the time we called
+	 * panfrost_mmu_as_get() or before perfcnt_enable(), so clearing this flag and going forward
+	 * isn't possible. We must clear the flag and try again in the hopes no resets will happen
+	 * between this and the next ioctl invocation.
+	 */
+	if (atomic_cmpxchg(&perfcnt->hw_reset_happened, 1, 0)) {
+		ret = EAGAIN;
+		goto err_disable;
+	}
+
 	/* The BO ref is retained by the mapping. */
 	drm_gem_object_put(&bo->base);
 
@@ -180,6 +204,8 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
 
 	return 0;
 
+err_disable:
+	panfrost_perfcnt_gpu_disable(pfdev);
 err_vunmap:
 	drm_gem_vunmap(&bo->base, &map);
 err_put_mapping:
@@ -209,7 +235,8 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
 	drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
 	perfcnt->buf = NULL;
 	panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv);
-	panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
+	if (!atomic_read(&perfcnt->hw_reset_happened))
+		panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
 	panfrost_gem_mapping_put(perfcnt->mapping);
 	perfcnt->mapping = NULL;
 	pm_runtime_put_autosuspend(pfdev->base.dev);
@@ -346,3 +373,20 @@ void panfrost_perfcnt_fini(struct panfrost_device *pfdev)
 	/* Disable everything before leaving. */
 	panfrost_perfcnt_gpu_disable(pfdev);
 }
+
+void panfrost_perfcnt_reset(struct panfrost_device *pfdev)
+{
+	struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
+
+	/* Since this function will be called either from a scheduled HW reset
+	 * or a runtime resume, tearing down any perfcnt resources means we're
+	 * doomed to deadlocking with perfcnt_{enable/disable}, since we'd have
+	 * to take the perfecnt lock. On top of that, it'd also violate DMA fence
+	 * signalling rules because GFP_KERNEL allocations are made with the perfcnt
+	 * lock taken in perfcnt_enable. In light of this, the only thing we can do
+	 * is disabling perfcnt unconditionally, and notifying the perfcnt user of
+	 * the reset having happpened so that they can take recovery measures.
+	 */
+	panfrost_perfcnt_gpu_disable(pfdev);
+	atomic_set(&perfcnt->hw_reset_happened, 1);
+}
diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.h b/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
index 8bbcf5f5fb33..8b9bc704b634 100644
--- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
+++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
@@ -14,5 +14,6 @@ int panfrost_ioctl_perfcnt_enable(struct drm_device *dev, void *data,
 				  struct drm_file *file_priv);
 int panfrost_ioctl_perfcnt_dump(struct drm_device *dev, void *data,
 				struct drm_file *file_priv);
+void panfrost_perfcnt_reset(struct panfrost_device *pfdev);
 
 #endif

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
                   ` (4 preceding siblings ...)
  2026-06-04 17:35 ` [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:50   ` sashiko-bot
  2026-06-04 18:36   ` Boris Brezillon
  2026-06-04 17:35 ` [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init Adrián Larumbe
  2026-06-04 20:16 ` Claude review: RPM, perfcnt and other minor fixes for Panfrost Claude Code Review Bot
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong

During device probe(), failure to do a PM get() will leave the usage_count
set to 0, which is the value assigned at device creation time. That means
when the autosuspend delay expires, runtime suspend callback won't be
invoked, so the device will remain powered on forever.

On top of that, failure to call PM put() during device unplug means
Panfrost device's PM usage_count increases monotonically for every new
module reload.

The combined outcome of both of the above was that devfreq OPP transition
notifications would be printed all the time, even when no jobs are being
submitted. This quickly fills the kernel ring buffer with junk.

Even direr than that was the fact MMU interrupts are only enabled when
the device is reset, so after device probe() the very first job targeting
the tiler heap BO would always time out, because the driver's PM runtime
resume callback would not be invoked.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 635430797d3f ("drm/panfrost: Rework runtime PM initialization")
Fixes: 876b15d2c88d ("drm/panfrost: Fix module unload")
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 2d4b6aa95c66..545fbf2c8d0c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -989,6 +989,7 @@ static int panfrost_probe(struct platform_device *pdev)
 	pm_runtime_set_active(pfdev->base.dev);
 	pm_runtime_mark_last_busy(pfdev->base.dev);
 	pm_runtime_enable(pfdev->base.dev);
+	pm_runtime_get_noresume(pfdev->base.dev);
 	pm_runtime_set_autosuspend_delay(pfdev->base.dev, 50); /* ~3 frames */
 	pm_runtime_use_autosuspend(pfdev->base.dev);
 
@@ -1000,10 +1001,12 @@ static int panfrost_probe(struct platform_device *pdev)
 	if (err < 0)
 		goto err_out1;
 
+	pm_runtime_put_autosuspend(pfdev->base.dev);
 
 	return 0;
 
 err_out1:
+	pm_runtime_put_noidle(pfdev->base.dev);
 	pm_runtime_disable(pfdev->base.dev);
 	panfrost_device_fini(pfdev);
 	pm_runtime_set_suspended(pfdev->base.dev);
@@ -1018,8 +1021,9 @@ static void panfrost_remove(struct platform_device *pdev)
 	drm_dev_unregister(&pfdev->base);
 
 	pm_runtime_get_sync(pfdev->base.dev);
-	pm_runtime_disable(pfdev->base.dev);
 	panfrost_device_fini(pfdev);
+	pm_runtime_put_noidle(pfdev->base.dev);
+	pm_runtime_disable(pfdev->base.dev);
 	pm_runtime_set_suspended(pfdev->base.dev);
 }
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
                   ` (5 preceding siblings ...)
  2026-06-04 17:35 ` [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling Adrián Larumbe
@ 2026-06-04 17:35 ` Adrián Larumbe
  2026-06-04 17:55   ` sashiko-bot
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  2026-06-04 20:16 ` Claude review: RPM, perfcnt and other minor fixes for Panfrost Claude Code Review Bot
  7 siblings, 2 replies; 28+ messages in thread
From: Adrián Larumbe @ 2026-06-04 17:35 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Steven Price, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Faith Ekstrand, Marty E. Plummer, Tomeu Vizoso, Eric Anholt,
	Alyssa Rosenzweig, Robin Murphy
  Cc: dri-devel, linux-kernel, Collabora Kernel Team,
	Adrián Larumbe, Neil Armstrong

Because the device must be in a position to accept jobs between the time
drm_dev_register() is called and autosuspend first kicks in, there's a very
narrow window inbetween during which jobs targeting the tiler buffer
object would time out, since the device's PM status is 'Active', but no MMU
interrupts were enabled at device initialisation time.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 73e467f60acd ("drm/panfrost: Consolidate reset handling")
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 11 +++++++++--
 drivers/gpu/drm/panfrost/panfrost_mmu.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 4a3162c3b659..0e1ad6256c2c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -336,6 +336,12 @@ void panfrost_mmu_as_put(struct panfrost_device *pfdev, struct panfrost_mmu *mmu
 	WARN_ON(atomic_read(&mmu->as_count) < 0);
 }
 
+void panfrost_mmu_enable_interrupts(struct panfrost_device *pfdev)
+{
+	mmu_write(pfdev, MMU_INT_CLEAR, ~0);
+	mmu_write(pfdev, MMU_INT_MASK, ~0);
+}
+
 void panfrost_mmu_reset(struct panfrost_device *pfdev)
 {
 	struct panfrost_mmu *mmu, *mmu_tmp;
@@ -355,8 +361,7 @@ void panfrost_mmu_reset(struct panfrost_device *pfdev)
 
 	spin_unlock(&pfdev->as_lock);
 
-	mmu_write(pfdev, MMU_INT_CLEAR, ~0);
-	mmu_write(pfdev, MMU_INT_MASK, ~0);
+	panfrost_mmu_enable_interrupts(pfdev);
 }
 
 static size_t get_pgsize(u64 addr, size_t size, size_t *count)
@@ -970,6 +975,8 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
 		return err;
 	}
 
+	panfrost_mmu_enable_interrupts(pfdev);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.h b/drivers/gpu/drm/panfrost/panfrost_mmu.h
index 27c3c65ed074..3140eb4fea0f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.h
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.h
@@ -12,6 +12,8 @@ struct panfrost_mmu;
 int panfrost_mmu_map(struct panfrost_gem_mapping *mapping);
 void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping);
 
+void panfrost_mmu_enable_interrupts(struct panfrost_device *pfdev);
+
 int panfrost_mmu_init(struct panfrost_device *pfdev);
 void panfrost_mmu_fini(struct panfrost_device *pfdev);
 void panfrost_mmu_reset(struct panfrost_device *pfdev);

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/7] drm/panfrost: Prevent division by 0
  2026-06-04 17:35 ` [PATCH v2 2/7] drm/panfrost: Prevent division by 0 Adrián Larumbe
@ 2026-06-04 17:44   ` sashiko-bot
  2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
  2026-06-04 18:02   ` Boris Brezillon
  1 sibling, 1 reply; 28+ messages in thread
From: sashiko-bot @ 2026-06-04 17:44 UTC (permalink / raw)
  To: Adrián Larumbe; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] The exact same division-by-zero vulnerability fixed in panfrost_devfreq.c is present in the Lima, Panthor, and Nouveau (gk20a) drivers.
--

commit dc73067c1030f65e3609c76a060b1b1176f85e4e
Author: Adrián Larumbe <adrian.larumbe@collabora.com>

drm/panfrost: Prevent division by 0

This commit clamps the devfreq total_time to 0 when it is less than 100ns
in panfrost_devfreq_get_dev_status() to prevent a division by zero error
during debug printing.

Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88

> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> index b51c307788110..e34de47bf43bb 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> @@ -78,7 +78,8 @@ static int panfrost_devfreq_get_dev_status(struct device *dev,
>  
>  	dev_dbg(pfdev->base.dev, "busy %lu total %lu %lu %% freq %lu MHz\n",
>  		status->busy_time, status->total_time,
> -		status->busy_time / (status->total_time / 100),
> +		status->total_time >= 100 ?
> +			status->busy_time / (status->total_time / 100) : 0,

[Severity: High]
This is a pre-existing issue, but the exact same division-by-zero vulnerability
fixed here is also present in other DRM drivers.

In drivers/gpu/drm/lima/lima_devfreq.c:lima_devfreq_get_dev_status():
    status->busy_time / (status->total_time / 100),

In drivers/gpu/drm/panthor/panthor_devfreq.c:panthor_devfreq_get_dev_status():
    status->busy_time / (status->total_time / 100),

In drivers/gpu/drm/nouveau/nvkm/subdev/clk/gk20a_devfreq.c:gk20a_devfreq_get_dev_status():
    status->busy_time / (status->total_time / 100),

If userspace repeatedly reads the devfreq load in quick succession, total_time
can be less than 100ns, leading to a division-by-zero kernel panic when debug
printing is enabled.

Should these other drivers be patched similarly?

>  		status->current_frequency / 1000 / 1000);
>  
>  	return 0;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655@collabora.com?part=2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper
  2026-06-04 17:35 ` [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper Adrián Larumbe
@ 2026-06-04 17:47   ` sashiko-bot
  2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
  2026-06-04 18:05   ` Boris Brezillon
  1 sibling, 1 reply; 28+ messages in thread
From: sashiko-bot @ 2026-06-04 17:47 UTC (permalink / raw)
  To: Adrián Larumbe; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] The refactored `panfrost_perfcnt_gpu_disable()` helper reverses the register write sequence when disabling performance counters, potentially breaking the HW_ISSUE_8186 (PRLAM-8186) hardware workaround and risking hardware state corruption.
--

commit b1a0eb47a40de5889be52c5109cc3ee780f78598
Author: Adrián Larumbe <adrian.larumbe@collabora.com>

drm/panfrost: Move perfcnt GPU disable sequence into a helper

This commit refactors the performance counter disable sequences into a
single helper function to avoid repetition.

> diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> index 7020c0192e189..ad1156678e91c 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> @@ -35,6 +35,16 @@ struct panfrost_perfcnt {
>  	struct completion dump_comp;
>  };
>  
> +static void panfrost_perfcnt_gpu_disable(struct panfrost_device *pfdev)
> +{
> +	gpu_write(pfdev, GPU_PERFCNT_CFG,
> +		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> +	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> +}

[Severity: High]
Does this new helper reverse the hardware disable sequence when called on
actively running counters?

In the original panfrost_perfcnt_disable_locked(), the block enable
registers (like GPU_PRFCNT_TILER_EN) were explicitly cleared to 0 before
the master configuration GPU_PERFCNT_CFG was turned off. The new helper
reverses this order.

Could shutting down the master configuration while sub-blocks are still
actively generating events cause hardware state corruption or bus hangs?

Modifying the main configuration while the Tiler counters are actively
enabled might also trigger hardware bugs (e.g., HW_ISSUE_8186), which the
original sequence safely avoided.

[ ... ]

> @@ -193,12 +203,7 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
>  	if (user != perfcnt->user)
>  		return -EINVAL;
>  
> -	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> -	gpu_write(pfdev, GPU_PERFCNT_CFG,
> -		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> +	panfrost_perfcnt_gpu_disable(pfdev);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655@collabora.com?part=4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session
  2026-06-04 17:35 ` [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session Adrián Larumbe
@ 2026-06-04 17:49   ` sashiko-bot
  2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
  2026-06-04 18:26   ` Boris Brezillon
  1 sibling, 1 reply; 28+ messages in thread
From: sashiko-bot @ 2026-06-04 17:49 UTC (permalink / raw)
  To: Adrián Larumbe; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Address Space reference (mmu->as_count) permanently leaks when hardware reset aborts perfcnt_enable_locked.
- [High] A TOCTOU race between disable_locked and the asynchronous reset workqueue reintroduces the mmu->as_count underflow WARN.
- [Medium] A positive error code (EAGAIN) is incorrectly returned directly to userspace instead of the negative standard (-EAGAIN).
- [Medium] Spurious initialization failures occur for new perfcnt sessions due to a stale hw_reset_happened flag.
--

commit 555dc068f79668462a998f3ee4fa95ead697090d
Author: Adrián Larumbe <adrian.larumbe@collabora.com>

drm/panfrost: Make reset sequence deal with an active HWPerf session

This commit introduces a perfcnt HW reset path to handle active HWPerf
sessions during a HW reset. It renders perfcnt unusable after reset and
notifies the user to take recovery measures.

Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
Fixes: 7786fd108777 ("drm/panfrost: Expose performance counters through unstable ioctls")

> diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> index ad1156678e91c..c2087ea705fe6 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c

[ ... ]

> @@ -173,6 +187,16 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
>  	if (panfrost_has_hw_issue(pfdev, HW_ISSUE_8186))
>  		gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0xffffffff);
>  
> +	/* If a reset happened, we've no way of knowing whether it was between the time we called
> +	 * panfrost_mmu_as_get() or before perfcnt_enable(), so clearing this flag and going forward
> +	 * isn't possible. We must clear the flag and try again in the hopes no resets will happen
> +	 * between this and the next ioctl invocation.
> +	 */
> +	if (atomic_cmpxchg(&perfcnt->hw_reset_happened, 1, 0)) {
> +		ret = EAGAIN;

[Severity: Medium]
Does this return a positive error code directly to userspace instead of the
standard negative one (-EAGAIN)?

DRM ioctls pass positive return values directly back to user space as the
ioctl syscall result. Because of this, user space will incorrectly interpret
this as success and subsequently fail on dump ioctls since the session wasn't
actually enabled.

> +		goto err_disable;

[Severity: High]
Can this leak the Address Space reference if a hardware reset aborts the
enable sequence?

If a GPU reset occurs before panfrost_mmu_as_get() is called, the reset
handler clears mmu->as_count to 0, and then panfrost_mmu_as_get() increments
it to 1. The code subsequently detects the reset via the hw_reset_happened
flag, clears it, and jumps to err_disable here. Because err_disable skips
panfrost_mmu_as_put(), the AS slot reference is permanently leaked,
eventually exhausting the limited hardware AS slots.

> +	}
> +
>  	/* The BO ref is retained by the mapping. */
>  	drm_gem_object_put(&bo->base);

[ ... ]

> @@ -209,7 +235,8 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
>  	drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
>  	perfcnt->buf = NULL;
>  	panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv);
> -	panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
> +	if (!atomic_read(&perfcnt->hw_reset_happened))
> +		panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);

[Severity: High]
Could a race between this check and an asynchronous reset reintroduce the
mmu->as_count underflow WARN that this patch was explicitly intended to
prevent?

The non-atomic gap between reading the flag and executing
panfrost_mmu_as_put() creates an exposure window. If an interleaved
asynchronous hardware reset occurs in this window, panfrost_device_reset()
will clear mmu->as_count to 0. The subsequent execution of
panfrost_mmu_as_put() decrements it to -1, directly triggering the WARN.

>  	panfrost_gem_mapping_put(perfcnt->mapping);
>  	perfcnt->mapping = NULL;
>  	pm_runtime_put_autosuspend(pfdev->base.dev);

[ ... ]

> @@ -346,3 +373,20 @@ void panfrost_perfcnt_fini(struct panfrost_device *pfdev)
>  	/* Disable everything before leaving. */
>  	panfrost_perfcnt_gpu_disable(pfdev);
>  }
> +
> +void panfrost_perfcnt_reset(struct panfrost_device *pfdev)
> +{
> +	struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
> +
> +	/* Since this function will be called either from a scheduled HW reset
> +	 * or a runtime resume, tearing down any perfcnt resources means we're
> +	 * doomed to deadlocking with perfcnt_{enable/disable}, since we'd have
> +	 * to take the perfecnt lock. On top of that, it'd also violate DMA fence
> +	 * signalling rules because GFP_KERNEL allocations are made with the perfcnt
> +	 * lock taken in perfcnt_enable. In light of this, the only thing we can do
> +	 * is disabling perfcnt unconditionally, and notifying the perfcnt user of
> +	 * the reset having happpened so that they can take recovery measures.
> +	 */
> +	panfrost_perfcnt_gpu_disable(pfdev);
> +	atomic_set(&perfcnt->hw_reset_happened, 1);

[Severity: Medium]
Will this unconditionally set the reset flag even if no perfcnt session is
active, causing the next new session to spuriously fail?

Since this function is called during any GPU reset, it sets the flag even
when perfcnt->user == NULL. If user space later attempts to start a
completely new session, panfrost_perfcnt_enable_locked() will consume this
stale flag and abort with EAGAIN, functionally breaking the feature on the
first attempt after a reset.

> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655@collabora.com?part=5

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
  2026-06-04 17:35 ` [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling Adrián Larumbe
@ 2026-06-04 17:50   ` sashiko-bot
  2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
  2026-06-04 18:36   ` Boris Brezillon
  1 sibling, 1 reply; 28+ messages in thread
From: sashiko-bot @ 2026-06-04 17:50 UTC (permalink / raw)
  To: Adrián Larumbe; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [High] Reordering PM teardown in device removal and probe error paths creates race conditions (UAF/kernel panics) and violates PM domain rules.

Pre-existing issues:
- [High] Improper DRM device unplug allows subsequent ioctls to access freed driver resources, causing Use-After-Free.
--

commit 385821af932c300aa013c14b13508c8e9cf10f64
Author: Adrián Larumbe <adrian.larumbe@collabora.com>

drm/panfrost: Fix PM usage_count mishandling

This commit addresses a PM usage_count mishandling issue during device probe
and unplug. It adds balanced PM get/put calls to prevent devices from
staying powered on indefinitely or triggering constant devfreq OPP transition
notifications.

> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 2d4b6aa95c66d..545fbf2c8d0cb 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c

[ ... ]

> @@ -1000,10 +1001,12 @@ static int panfrost_probe(struct platform_device *pdev)
>  	if (err < 0)
>  		goto err_out1;
>  
> +	pm_runtime_put_autosuspend(pfdev->base.dev);
>  
>  	return 0;
>  
>  err_out1:
> +	pm_runtime_put_noidle(pfdev->base.dev);
>  	pm_runtime_disable(pfdev->base.dev);
>  	panfrost_device_fini(pfdev);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Does dropping the usage count before pm_runtime_disable() create a race
condition where the suspend callback can run and disable clocks before
hardware shutdown?

Because the usage count is dropped early, a concurrent PM event could trigger
the suspend callback, disabling clocks. Then, panfrost_device_fini() calls
panfrost_gpu_fini() which writes to MMIO registers. Could writing to
unclocked registers on ARM SoCs cause fatal bus errors or panics?

>  	pm_runtime_set_suspended(pfdev->base.dev);

[ ... ]

> @@ -1018,8 +1021,9 @@ static void panfrost_remove(struct platform_device *pdev)
>  	drm_dev_unregister(&pfdev->base);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
This is a pre-existing issue, but does using drm_dev_unregister() instead of
drm_dev_unplug() allow subsequent ioctls to access freed driver resources?

Because drm_dev_unregister() removes the device from sysfs but does not set
the dev->unplugged flag, the DRM core (drm_ioctl_kernel) does not block
ioctls from userspace processes that already hold an open file descriptor.

Since panfrost_remove() proceeds to call panfrost_device_fini() which
destroys critical structures (job manager, MMU contexts, shrinker lists) and
powers down hardware, could a user issuing an ioctl during or after driver
unbind via sysfs access freed structures and unmapped MMIO regions, resulting
in a use-after-free?

>  
>  	pm_runtime_get_sync(pfdev->base.dev);
> -	pm_runtime_disable(pfdev->base.dev);
>  	panfrost_device_fini(pfdev);
> +	pm_runtime_put_noidle(pfdev->base.dev);
> +	pm_runtime_disable(pfdev->base.dev);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Does placing pm_runtime_put_noidle() and pm_runtime_disable() after
panfrost_device_fini() create a window where runtime PM can access freed
structures?

pm_runtime_put_noidle() drops the PM usage count to 0. At this point, a
concurrent PM event (e.g., userspace writing to /sys/.../power/control) can
trigger panfrost_device_runtime_suspend() before pm_runtime_disable()
executes.

Since panfrost_device_fini() has already destroyed driver structures like the
job manager, would the suspend callback access freed memory (like checking
panfrost_jm_is_idle), causing a use-after-free?

>  	pm_runtime_set_suspended(pfdev->base.dev);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655@collabora.com?part=6

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init
  2026-06-04 17:35 ` [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init Adrián Larumbe
@ 2026-06-04 17:55   ` sashiko-bot
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: sashiko-bot @ 2026-06-04 17:55 UTC (permalink / raw)
  To: Adrián Larumbe; +Cc: dri-devel

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] The MMU IRQ handler remains active while device clocks and power domains are disabled during error teardown, causing unclocked MMIO accesses (S-Error).
- [High] panfrost_reset() does not mask or synchronize the MMU interrupt before resetting the GPU hardware, leading to a race condition with the MMU IRQ thread.
--

commit cac1cfb873df895888c7871605b659cdaf55b78e
Author: Adrián Larumbe <adrian.larumbe@collabora.com>
Date:   Thu Jun 4 17:35:26 2026 +0000

    drm/panfrost: Explicitly enable MMU interrupts at device init

    Because the device must be in a position to accept jobs between the time
    drm_dev_register() is called and autosuspend first kicks in, there's a very
    narrow window inbetween during which jobs targeting the tiler buffer
    object would time out, since the device's PM status is 'Active', but no MMU
    interrupts were enabled at device initialisation time.

> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 4a3162c3b659f..0e1ad6256c2cb 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
[ ... ]
> @@ -355,8 +361,7 @@ void panfrost_mmu_reset(struct panfrost_device *pfdev)
>  
>  	spin_unlock(&pfdev->as_lock);
>  
> -	mmu_write(pfdev, MMU_INT_CLEAR, ~0);
> -	mmu_write(pfdev, MMU_INT_MASK, ~0);
> +	panfrost_mmu_enable_interrupts(pfdev);
>  }

[Severity: High]
This isn't a bug introduced by this patch, but while reviewing the interrupt
handling during reset, does panfrost_reset() need to mask and synchronize
the MMU interrupt before resetting the hardware?

When a job timeout occurs, panfrost_reset() correctly masks and synchronizes
the job manager interrupts:

drivers/gpu/drm/panfrost/panfrost_job.c:panfrost_reset() {
	/* Mask job interrupts and synchronize to make sure we won't be
	 * interrupted during our reset.
	 */
	job_write(pfdev, JOB_INT_MASK, 0);
	synchronize_irq(pfdev->js->irq);
...
	panfrost_device_reset(pfdev, true);
...
}

However, it does not do the same for the MMU interrupt (pfdev->mmu_irq).

Because MMU faults often cause job timeouts, the
panfrost_mmu_irq_handler_thread could be actively running concurrently on
another CPU.

If the IRQ thread attempts to handle an MMU page fault, it will write to MMU
registers while panfrost_device_reset() concurrently calls
panfrost_gpu_soft_reset() to reset the hardware.

Can uncoordinated MMIO accesses to hardware undergoing a soft reset lead to
unpredictable behavior or bus errors?

[ ... ]
> @@ -970,6 +975,8 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
>  		return err;
>  	}
>  
> +	panfrost_mmu_enable_interrupts(pfdev);
> +

[Severity: High]
This is a pre-existing issue, but by explicitly enabling the MMU interrupts
here during init, does this expose a race condition in the device
initialization error paths?

If a subsequent initialization step like panfrost_jm_init() fails,
panfrost_device_init() will jump to the out_mmu label:

drivers/gpu/drm/panfrost/panfrost_device.c:panfrost_device_init() {
...
	err = panfrost_jm_init(pfdev);
	if (err)
		goto out_mmu;
...
out_mmu:
	panfrost_mmu_fini(pfdev);
...
out_clk:
	panfrost_clk_fini(pfdev);
...
}

panfrost_mmu_fini() masks the interrupt in hardware, but does not synchronize
or disable the IRQ handler in the kernel:

void panfrost_mmu_fini(struct panfrost_device *pfdev)
{
	mmu_write(pfdev, MMU_INT_MASK, 0);
}

Because panfrost_mmu_init() registers a shared threaded IRQ handler using
devm_request_threaded_irq(), the handler remains active during the rest of
the teardown.

If an interrupt fires or the thread runs after panfrost_clk_fini() has
disabled the hardware clocks, panfrost_mmu_irq_handler() will read unclocked
MMIO registers (e.g., mmu_read(pfdev, MMU_INT_STAT)).

Could this lead to a fatal S-Error (synchronous external abort)?

>  	return 0;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260604-claude-fixes-v2-0-57c6bd4c1655@collabora.com?part=7

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query
  2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
@ 2026-06-04 17:57   ` Boris Brezillon
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 17:57 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong, Claude

On Thu, 04 Jun 2026 18:35:20 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> When querying the properties of a BO, whether it was created in
> Writeback caching mode for userspace mappings, is recorded in a Panfrost
> BO's wb_mmap field.
> 
> Reported-by: Claude <noreply@anthropic.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: 62eedf1ccba5 ("drm/panfrost: Add flag to map GEM object Write-Back Cacheable")

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 3d0bdba2a474..cb8e5015847f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -659,7 +659,7 @@ static int panfrost_ioctl_query_bo_info(struct drm_device *dev, void *data,
>  		if (bo->is_heap)
>  			args->create_flags |= PANFROST_BO_HEAP;
>  
> -		if (!bo->base.map_wc)
> +		if (bo->wb_mmap)
>  			args->create_flags |= PANFROST_BO_WB_MMAP;
>  	}
>  
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/7] drm/panfrost: Prevent division by 0
  2026-06-04 17:35 ` [PATCH v2 2/7] drm/panfrost: Prevent division by 0 Adrián Larumbe
  2026-06-04 17:44   ` sashiko-bot
@ 2026-06-04 18:02   ` Boris Brezillon
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 18:02 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong, Claude

On Thu, 04 Jun 2026 18:35:21 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> When updating and debug-printing devfreq stats, in the very unlikely
> offchance that total device time is less than 100 ns, clamp it to 0 to
> avoid division by 0.
> 
> Reported-by: Claude <noreply@anthropic.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
> ---
>  drivers/gpu/drm/panfrost/panfrost_devfreq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> index b51c30778811..e34de47bf43b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> @@ -78,7 +78,8 @@ static int panfrost_devfreq_get_dev_status(struct device *dev,
>  
>  	dev_dbg(pfdev->base.dev, "busy %lu total %lu %lu %% freq %lu MHz\n",
>  		status->busy_time, status->total_time,
> -		status->busy_time / (status->total_time / 100),
> +		status->total_time >= 100 ?
> +			status->busy_time / (status->total_time / 100) : 0,

or:

		status->busy_time * 100 / MAX(status->total_time, 1),

>  		status->current_frequency / 1000 / 1000);
>  
>  	return 0;
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down
  2026-06-04 17:35 ` [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down Adrián Larumbe
@ 2026-06-04 18:04   ` Boris Brezillon
  2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 18:04 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong

On Thu, 04 Jun 2026 18:35:22 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> Since the moment we call drm_dev_register() the device should be in a
> position to accept jobs, so it's best if the shrinker is already
> initialized by then.
> 
> On top of that, make shrinker functions take an panfrost_device pointer
> like other functions in the same sequence and rename them accordingly.
> 
> Essentially mimic the init/fini behaviour in Panthor.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c       | 7 +++++++
>  drivers/gpu/drm/panfrost/panfrost_drv.c          | 6 ------
>  drivers/gpu/drm/panfrost/panfrost_gem.h          | 4 ++--
>  drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c | 8 ++------
>  4 files changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index 7fed22d555a5..87b372c9e675 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -282,9 +282,15 @@ int panfrost_device_init(struct panfrost_device *pfdev)
>  	if (err)
>  		goto out_job;
>  
> +	err = panfrost_gem_shrinker_init(pfdev);
> +	if (err)
> +		goto out_perfcnt;
> +
>  	panfrost_gem_init(pfdev);

It feels weird to have the GEM shrinker initialized before the GEM
component. Also, I'd consider the shrinker part of the GEM logic, so
maybe call panfrost_gem_shrinker_init() from panfrost_gem_init() and
change the prototype of panfrost_gem_init() to return an int.

>  
>  	return 0;
> +out_perfcnt:
> +	panfrost_perfcnt_fini(pfdev);
>  out_job:
>  	panfrost_jm_fini(pfdev);
>  out_mmu:
> @@ -306,6 +312,7 @@ int panfrost_device_init(struct panfrost_device *pfdev)
>  
>  void panfrost_device_fini(struct panfrost_device *pfdev)
>  {
> +	panfrost_gem_shrinker_fini(pfdev);
>  	panfrost_perfcnt_fini(pfdev);
>  	panfrost_jm_fini(pfdev);
>  	panfrost_mmu_fini(pfdev);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index cb8e5015847f..2d4b6aa95c66 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -1000,14 +1000,9 @@ static int panfrost_probe(struct platform_device *pdev)
>  	if (err < 0)
>  		goto err_out1;
>  
> -	err = panfrost_gem_shrinker_init(&pfdev->base);
> -	if (err)
> -		goto err_out2;
>  
>  	return 0;
>  
> -err_out2:
> -	drm_dev_unregister(&pfdev->base);
>  err_out1:
>  	pm_runtime_disable(pfdev->base.dev);
>  	panfrost_device_fini(pfdev);
> @@ -1021,7 +1016,6 @@ static void panfrost_remove(struct platform_device *pdev)
>  	struct panfrost_device *pfdev = platform_get_drvdata(pdev);
>  
>  	drm_dev_unregister(&pfdev->base);
> -	panfrost_gem_shrinker_cleanup(&pfdev->base);
>  
>  	pm_runtime_get_sync(pfdev->base.dev);
>  	pm_runtime_disable(pfdev->base.dev);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
> index 79d4377019e9..323a1aee255e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
> @@ -154,8 +154,8 @@ panfrost_gem_mapping_get(struct panfrost_gem_object *bo,
>  void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping);
>  void panfrost_gem_teardown_mappings_locked(struct panfrost_gem_object *bo);
>  
> -int panfrost_gem_shrinker_init(struct drm_device *dev);
> -void panfrost_gem_shrinker_cleanup(struct drm_device *dev);
> +int panfrost_gem_shrinker_init(struct panfrost_device *pfdev);
> +void panfrost_gem_shrinker_fini(struct panfrost_device *pfdev);
>  
>  void panfrost_gem_set_label(struct drm_gem_object *obj, const char *label);
>  int panfrost_gem_sync(struct drm_gem_object *obj, u32 type,
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> index 2fe967a90bcb..fefae87535d6 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> @@ -95,10 +95,8 @@ panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
>   *
>   * This function registers and sets up the panfrost shrinker.
>   */
> -int panfrost_gem_shrinker_init(struct drm_device *dev)
> +int panfrost_gem_shrinker_init(struct panfrost_device *pfdev)
>  {
> -	struct panfrost_device *pfdev = to_panfrost_device(dev);
> -
>  	pfdev->shrinker = shrinker_alloc(0, "drm-panfrost");
>  	if (!pfdev->shrinker)
>  		return -ENOMEM;
> @@ -118,10 +116,8 @@ int panfrost_gem_shrinker_init(struct drm_device *dev)
>   *
>   * This function unregisters the panfrost shrinker.
>   */
> -void panfrost_gem_shrinker_cleanup(struct drm_device *dev)
> +void panfrost_gem_shrinker_fini(struct panfrost_device *pfdev)
>  {
> -	struct panfrost_device *pfdev = to_panfrost_device(dev);
> -
>  	if (pfdev->shrinker)
>  		shrinker_free(pfdev->shrinker);
>  }
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper
  2026-06-04 17:35 ` [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper Adrián Larumbe
  2026-06-04 17:47   ` sashiko-bot
@ 2026-06-04 18:05   ` Boris Brezillon
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 18:05 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong

On Thu, 04 Jun 2026 18:35:23 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> Just for the sake of avoiding repetition.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 31 ++++++++++++-----------------
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> index 7020c0192e18..ad1156678e91 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> @@ -35,6 +35,16 @@ struct panfrost_perfcnt {
>  	struct completion dump_comp;
>  };
>  
> +static void panfrost_perfcnt_gpu_disable(struct panfrost_device *pfdev)
> +{
> +	gpu_write(pfdev, GPU_PERFCNT_CFG,
> +		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> +	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
> +	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> +}
> +
>  void panfrost_perfcnt_clean_cache_done(struct panfrost_device *pfdev)
>  {
>  	complete(&pfdev->perfcnt->dump_comp);
> @@ -193,12 +203,7 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
>  	if (user != perfcnt->user)
>  		return -EINVAL;
>  
> -	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0x0);
> -	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> -	gpu_write(pfdev, GPU_PERFCNT_CFG,
> -		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> +	panfrost_perfcnt_gpu_disable(pfdev);
>  
>  	perfcnt->user = NULL;
>  	drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
> @@ -327,12 +332,7 @@ int panfrost_perfcnt_init(struct panfrost_device *pfdev)
>  	perfcnt->bosize = size;
>  
>  	/* Start with everything disabled. */
> -	gpu_write(pfdev, GPU_PERFCNT_CFG,
> -		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> -	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> +	panfrost_perfcnt_gpu_disable(pfdev);
>  
>  	init_completion(&perfcnt->dump_comp);
>  	mutex_init(&perfcnt->lock);
> @@ -344,10 +344,5 @@ int panfrost_perfcnt_init(struct panfrost_device *pfdev)
>  void panfrost_perfcnt_fini(struct panfrost_device *pfdev)
>  {
>  	/* Disable everything before leaving. */
> -	gpu_write(pfdev, GPU_PERFCNT_CFG,
> -		  GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
> -	gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_SHADER_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_MMU_L2_EN, 0);
> -	gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0);
> +	panfrost_perfcnt_gpu_disable(pfdev);
>  }
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session
  2026-06-04 17:35 ` [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session Adrián Larumbe
  2026-06-04 17:49   ` sashiko-bot
@ 2026-06-04 18:26   ` Boris Brezillon
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 18:26 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong, Claude

On Thu, 04 Jun 2026 18:35:24 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> Right now, if there's a HW reset and an HWPerf session is active,
> panfrost_mmu_reset() will reset the AS count for every single open file's
> mmu struct back to 0, and also invalidate their AS numbers. Then, when
> disabling hwperf, panfrost_mmu_as_put() will WARN that mmu->as_count is
> less than zero.
> 
> Fix this by introducing a perfcnt HW reset path.
> 
> The choice was made to render perfcnt unusable after reset, so that a
> user might have to reprogram it with a full disable/enable sequence
> before requesting more perfcnt dumps.

Can't we do better than that. We store the config of the perf session,
so if a reset is in progress, we can simply block on it, restore the
old config in the post_reset path, and get going with the DUMP request
once the reset has been done. There's certainly something to do to
report the discontinuity to userspace, but that's something we can
reflect through and extra field added to drm_panfrost_perfcnt_dump.

> 
> Reported-by: Claude <noreply@anthropic.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/88
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: 7786fd108777 ("drm/panfrost: Expose performance counters through unstable ioctls")
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c  |  1 +
>  drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 46 ++++++++++++++++++++++++++++-
>  drivers/gpu/drm/panfrost/panfrost_perfcnt.h |  1 +
>  3 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index 87b372c9e675..2805d50c1b9b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -426,6 +426,7 @@ bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev,
>  
>  void panfrost_device_reset(struct panfrost_device *pfdev, bool enable_job_int)
>  {
> +	panfrost_perfcnt_reset(pfdev);
>  	panfrost_gpu_soft_reset(pfdev);
>  
>  	panfrost_gpu_power_on(pfdev);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> index ad1156678e91..c2087ea705fe 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
> @@ -33,6 +33,7 @@ struct panfrost_perfcnt {
>  	struct panfrost_file_priv *user;
>  	struct mutex lock;
>  	struct completion dump_comp;
> +	atomic_t hw_reset_happened;
>  };
>  
>  static void panfrost_perfcnt_gpu_disable(struct panfrost_device *pfdev)
> @@ -57,9 +58,13 @@ void panfrost_perfcnt_sample_done(struct panfrost_device *pfdev)
>  
>  static int panfrost_perfcnt_dump_locked(struct panfrost_device *pfdev)
>  {
> +	struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
>  	u64 gpuva;
>  	int ret;
>  
> +	if (atomic_read(&perfcnt->hw_reset_happened))
> +		return -EIO;
> +
>  	reinit_completion(&pfdev->perfcnt->dump_comp);
>  	gpuva = pfdev->perfcnt->mapping->mmnode.start << PAGE_SHIFT;
>  	gpu_write(pfdev, GPU_PERFCNT_BASE_LO, lower_32_bits(gpuva));
> @@ -140,6 +145,15 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
>  		goto err_vunmap;
>  	}
>  
> +	/* If a reset is ongoing, the AS we get right below will be torn
> +	 * down, so rather than waiting until this becomes obvious in a
> +	 * perfcnt_dump() ioctl, we ask the user to try again slightly later.
> +	 */
> +	if (atomic_read(&pfdev->reset.pending)) {
> +		ret = -EAGAIN;
> +		goto err_vunmap;
> +	}
> +
>  	ret = panfrost_mmu_as_get(pfdev, perfcnt->mapping->mmu);
>  	if (ret < 0)
>  		goto err_vunmap;
> @@ -173,6 +187,16 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
>  	if (panfrost_has_hw_issue(pfdev, HW_ISSUE_8186))
>  		gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0xffffffff);
>  
> +	/* If a reset happened, we've no way of knowing whether it was between the time we called
> +	 * panfrost_mmu_as_get() or before perfcnt_enable(), so clearing this flag and going forward
> +	 * isn't possible. We must clear the flag and try again in the hopes no resets will happen
> +	 * between this and the next ioctl invocation.
> +	 */
> +	if (atomic_cmpxchg(&perfcnt->hw_reset_happened, 1, 0)) {
> +		ret = EAGAIN;
> +		goto err_disable;
> +	}

This should really be transparent to the user, apart from reporting
that samples might have been lost because of the reset.

> +
>  	/* The BO ref is retained by the mapping. */
>  	drm_gem_object_put(&bo->base);
>  
> @@ -180,6 +204,8 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
>  
>  	return 0;
>  
> +err_disable:
> +	panfrost_perfcnt_gpu_disable(pfdev);
>  err_vunmap:
>  	drm_gem_vunmap(&bo->base, &map);
>  err_put_mapping:
> @@ -209,7 +235,8 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
>  	drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
>  	perfcnt->buf = NULL;
>  	panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv);
> -	panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
> +	if (!atomic_read(&perfcnt->hw_reset_happened))
> +		panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);

Why not call panfrost_mmu_as_put() in panfrost_perfcnt_pre_reset(),
which is called before panfrost_mmu_reset()? If we do that, the AS
should be returned, and we can add a panfrost_perfcnt_post_reset()
that's called after panfrost_mmu_reset() and which re-acquires the AS
and re-instantiate the perfcnt settings.

>  	panfrost_gem_mapping_put(perfcnt->mapping);
>  	perfcnt->mapping = NULL;
>  	pm_runtime_put_autosuspend(pfdev->base.dev);
> @@ -346,3 +373,20 @@ void panfrost_perfcnt_fini(struct panfrost_device *pfdev)
>  	/* Disable everything before leaving. */
>  	panfrost_perfcnt_gpu_disable(pfdev);
>  }
> +
> +void panfrost_perfcnt_reset(struct panfrost_device *pfdev)
> +{
> +	struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
> +
> +	/* Since this function will be called either from a scheduled HW reset
> +	 * or a runtime resume, tearing down any perfcnt resources means we're
> +	 * doomed to deadlocking with perfcnt_{enable/disable}, since we'd have
> +	 * to take the perfecnt lock. On top of that, it'd also violate DMA fence
> +	 * signalling rules because GFP_KERNEL allocations are made with the perfcnt
> +	 * lock taken in perfcnt_enable.

Question is, do we really need these allocation to happen with the lock
held? And if yes, can't we protect perfcnt ops with a separate reset
lock?

> In light of this, the only thing we can do
> +	 * is disabling perfcnt unconditionally, and notifying the perfcnt user of
> +	 * the reset having happpened so that they can take recovery measures.

Informing userspace about the discontinuity is fine, but I think we
should try to restore the config in a post_reset() hook.

> +	 */
> +	panfrost_perfcnt_gpu_disable(pfdev);
> +	atomic_set(&perfcnt->hw_reset_happened, 1);
> +}
> diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.h b/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
> index 8bbcf5f5fb33..8b9bc704b634 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.h
> @@ -14,5 +14,6 @@ int panfrost_ioctl_perfcnt_enable(struct drm_device *dev, void *data,
>  				  struct drm_file *file_priv);
>  int panfrost_ioctl_perfcnt_dump(struct drm_device *dev, void *data,
>  				struct drm_file *file_priv);
> +void panfrost_perfcnt_reset(struct panfrost_device *pfdev);
>  
>  #endif
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
  2026-06-04 17:35 ` [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling Adrián Larumbe
  2026-06-04 17:50   ` sashiko-bot
@ 2026-06-04 18:36   ` Boris Brezillon
  1 sibling, 0 replies; 28+ messages in thread
From: Boris Brezillon @ 2026-06-04 18:36 UTC (permalink / raw)
  To: Adrián Larumbe
  Cc: Rob Herring, Steven Price, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Faith Ekstrand,
	Marty E. Plummer, Tomeu Vizoso, Eric Anholt, Alyssa Rosenzweig,
	Robin Murphy, dri-devel, linux-kernel, Collabora Kernel Team,
	Neil Armstrong

On Thu, 04 Jun 2026 18:35:25 +0100
Adrián Larumbe <adrian.larumbe@collabora.com> wrote:

> During device probe(), failure to do a PM get() will leave the usage_count
> set to 0, which is the value assigned at device creation time. That means
> when the autosuspend delay expires, runtime suspend callback won't be
> invoked, so the device will remain powered on forever.
> 
> On top of that, failure to call PM put() during device unplug means
> Panfrost device's PM usage_count increases monotonically for every new
> module reload.
> 
> The combined outcome of both of the above was that devfreq OPP transition
> notifications would be printed all the time, even when no jobs are being
> submitted. This quickly fills the kernel ring buffer with junk.
> 
> Even direr than that was the fact MMU interrupts are only enabled when
> the device is reset, so after device probe() the very first job targeting
> the tiler heap BO would always time out, because the driver's PM runtime
> resume callback would not be invoked.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Fixes: 635430797d3f ("drm/panfrost: Rework runtime PM initialization")
> Fixes: 876b15d2c88d ("drm/panfrost: Fix module unload")
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 2d4b6aa95c66..545fbf2c8d0c 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -989,6 +989,7 @@ static int panfrost_probe(struct platform_device *pdev)
>  	pm_runtime_set_active(pfdev->base.dev);
>  	pm_runtime_mark_last_busy(pfdev->base.dev);
>  	pm_runtime_enable(pfdev->base.dev);
> +	pm_runtime_get_noresume(pfdev->base.dev);
>  	pm_runtime_set_autosuspend_delay(pfdev->base.dev, 50); /* ~3 frames */
>  	pm_runtime_use_autosuspend(pfdev->base.dev);
>  
> @@ -1000,10 +1001,12 @@ static int panfrost_probe(struct platform_device *pdev)
>  	if (err < 0)
>  		goto err_out1;
>  
> +	pm_runtime_put_autosuspend(pfdev->base.dev);
>  
>  	return 0;
>  
>  err_out1:
> +	pm_runtime_put_noidle(pfdev->base.dev);

Do we really need this get_noresume/put_noidle dance, can't use call
pm_runtime_dont_use_autosuspend() instead like is done in panthor, or
is panthor broken too?

>  	pm_runtime_disable(pfdev->base.dev);
>  	panfrost_device_fini(pfdev);
>  	pm_runtime_set_suspended(pfdev->base.dev);
> @@ -1018,8 +1021,9 @@ static void panfrost_remove(struct platform_device *pdev)
>  	drm_dev_unregister(&pfdev->base);
>  
>  	pm_runtime_get_sync(pfdev->base.dev);
> -	pm_runtime_disable(pfdev->base.dev);
>  	panfrost_device_fini(pfdev);
> +	pm_runtime_put_noidle(pfdev->base.dev);
> +	pm_runtime_disable(pfdev->base.dev);
>  	pm_runtime_set_suspended(pfdev->base.dev);
>  }
>  
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: RPM, perfcnt and other minor fixes for Panfrost
  2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
                   ` (6 preceding siblings ...)
  2026-06-04 17:35 ` [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init Adrián Larumbe
@ 2026-06-04 20:16 ` Claude Code Review Bot
  7 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: RPM, perfcnt and other minor fixes for Panfrost
Author: =?utf-8?q?Adri=C3=A1n_Larumbe?= <adrian.larumbe@collabora.com>
Patches: 19
Reviewed: 2026-06-05T06:16:23.718180

---

This is a 7-patch series for the Panfrost GPU driver addressing RPM (Runtime Power Management) reference counting, perfcnt reset handling, and several minor robustness fixes. The series is well-structured with preparatory refactoring patches (3, 4) separated from the actual fixes (5, 6, 7). The two most important patches are patch 6 (PM usage_count fix) and patch 5 (perfcnt reset handling), which address real bugs.

There is one clear **bug** in patch 5 (missing negative return value), and a few design concerns around the perfcnt reset atomics that warrant discussion. The minor patches (1, 2, 3, 4) are straightforward and look correct.

Overall the series is in reasonable shape but patch 5 needs a fix before it can be merged.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: drm/panfrost: Check another bo field for cache option query
  2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
  2026-06-04 17:57   ` Boris Brezillon
@ 2026-06-04 20:16   ` Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Correct fix, but commit message could be clearer.**

The existing code:
```c
if (!bo->base.map_wc)
    args->create_flags |= PANFROST_BO_WB_MMAP;
```

This is indeed wrong. Looking at the source tree, `map_wc` is set by `should_map_wc()` which returns `false` for both coherent devices (where WC is impossible) AND for `wb_mmap` BOs. So on a coherent device, `!map_wc` is always true, meaning every BO would erroneously report `PANFROST_BO_WB_MMAP`. The fix to use `bo->wb_mmap` directly is correct — it reflects the actual creation flag:

```c
if (bo->wb_mmap)
    args->create_flags |= PANFROST_BO_WB_MMAP;
```

**Nit:** The commit message says "is recorded in a Panfrost BO's wb_mmap field" but doesn't explain *why* `!map_wc` was wrong. On a coherent device, `map_wc` is always false (line 460 in `panfrost_gem.c`), so `!map_wc` is always true — the old code would report all BOs as `WB_MMAP` on coherent devices. Explaining this would help reviewers.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: Re: [PATCH v2 2/7] drm/panfrost: Prevent division by 0
  2026-06-04 17:44   ` sashiko-bot
@ 2026-06-04 20:16     ` Claude Code Review Bot
  0 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Acceptable, minor concern.**

```c
status->total_time >= 100 ?
    status->busy_time / (status->total_time / 100) : 0,
```

This is a debug print only, so the worst case is a kernel oops in a `dev_dbg()` path. The fix is reasonable. The threshold of 100 comes from the integer division `total_time / 100` — if `total_time < 100`, the divisor is 0.

**Minor:** The commit message says "clamp it to 0" but it's not clamping `total_time` — it's substituting 0 for the percentage when `total_time` is too small. This is cosmetic.

**Question:** Could you consider using `div64_u64` or similar safe division helpers instead of this ternary? Though for a debug print, this is fine.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: drm/panfrost: Move shrinker initialization and unplug one level down
  2026-06-04 17:35 ` [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down Adrián Larumbe
  2026-06-04 18:04   ` Boris Brezillon
@ 2026-06-04 20:16   ` Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Looks correct.**

Moves `panfrost_gem_shrinker_init/fini` into `panfrost_device_init/fini` (mirroring Panthor), and renames `panfrost_gem_shrinker_cleanup` to `panfrost_gem_shrinker_fini` for consistency. The rationale is that the shrinker should be ready before `drm_dev_register()` is called.

The init/fini ordering looks correct:
- Init: `...perfcnt_init -> shrinker_init -> gem_init`
- Fini: `shrinker_fini -> perfcnt_fini -> jm_fini -> ...` (reverse order)

The function signatures are changed from `struct drm_device *` to `struct panfrost_device *` which makes sense since `panfrost_device_init` already works with `panfrost_device`.

**Note:** The error path in `panfrost_probe` loses the `drm_dev_unregister` call in `err_out2`. That's fine because `err_out2` was only for `panfrost_gem_shrinker_init` failure, which now happens inside `panfrost_device_init` (before `drm_dev_register` is called), so `drm_dev_unregister` would be wrong there.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: Re: [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper
  2026-06-04 17:47   ` sashiko-bot
@ 2026-06-04 20:16     ` Claude Code Review Bot
  0 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Correct, straightforward refactor.**

Extracts the repeated 5-line GPU write sequence into `panfrost_perfcnt_gpu_disable()`. Used in three places: init, fini, and disable_locked.

**One subtle ordering change:** In `panfrost_perfcnt_disable_locked`, the original code wrote to the enable registers first, then set the config mode to OFF. The new helper sets config mode OFF first:

```c
// New order in helper:
gpu_write(pfdev, GPU_PERFCNT_CFG, GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
...
```

vs. original disable_locked order:
```c
// Old order:
gpu_write(pfdev, GPU_PRFCNT_JM_EN, 0x0);
...
gpu_write(pfdev, GPU_PERFCNT_CFG, GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));
```

The init path already used the "config first" order, so the helper unifies on that. This is likely fine — turning off the mode register first should be safe — but worth noting if the HW cares about the ordering.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: Re: [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session
  2026-06-04 17:49   ` sashiko-bot
@ 2026-06-04 20:16     ` Claude Code Review Bot
  0 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Has a bug. Needs revision.**

**Bug — missing negative sign on EAGAIN:**

```c
if (atomic_cmpxchg(&perfcnt->hw_reset_happened, 1, 0)) {
    ret = EAGAIN;  // <-- BUG: should be -EAGAIN
    goto err_disable;
}
```

This returns a positive `EAGAIN` (11) which will be interpreted as success by the ioctl layer. It must be `-EAGAIN`.

**Race condition concern with `reset.pending` check:**

```c
if (atomic_read(&pfdev->reset.pending)) {
    ret = -EAGAIN;
    goto err_vunmap;
}
```

This check races with the reset path — `reset.pending` could be set immediately after this read. The comment acknowledges this and there's a second check later via `hw_reset_happened`, but the second check has the same TOCTOU issue. Between `panfrost_mmu_as_get()` and the `atomic_cmpxchg` of `hw_reset_happened`, a reset could happen that sets `hw_reset_happened=1`, and the cmpxchg will catch that. So the design intent is sound, except for the missing negation bug above.

**`err_disable` error path is incomplete:**

When hitting the `err_disable` label, the code calls `panfrost_perfcnt_gpu_disable(pfdev)` then falls through to `err_vunmap`. But at this point, `panfrost_mmu_as_get` has already succeeded (line before the `err_disable` goto target was added). The error path doesn't call `panfrost_mmu_as_put()`, so the AS reference will be leaked. The code should either:
1. Add `panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu)` before `err_disable` falls through to `err_vunmap`, or
2. Have `err_disable` jump to a label after `panfrost_mmu_as_put`.

Wait, looking more carefully: the `err_disable` label is placed right before `err_vunmap`, and `panfrost_mmu_as_get` was called *before* the block with the `atomic_cmpxchg`. So after `err_disable`, we skip `panfrost_mmu_as_put`. This is an AS reference leak.

**`panfrost_perfcnt_reset` called without lock:**

```c
void panfrost_perfcnt_reset(struct panfrost_device *pfdev)
{
    ...
    panfrost_perfcnt_gpu_disable(pfdev);
    atomic_set(&perfcnt->hw_reset_happened, 1);
}
```

This writes GPU registers without holding the perfcnt lock. If a perfcnt enable/dump is happening concurrently, the GPU register writes could interleave. The comment explains the reasoning (deadlock avoidance due to GFP_KERNEL allocations under lock), which is fair, but it means the GPU register writes in `panfrost_perfcnt_gpu_disable` are racing with the enable path's GPU register writes. This is probably tolerable since we're about to do a GPU soft reset anyway (the next call after `panfrost_perfcnt_reset` is `panfrost_gpu_soft_reset`), but it should be documented.

**`panfrost_perfcnt_disable_locked` — skipping `as_put` on reset:**

```c
if (!atomic_read(&perfcnt->hw_reset_happened))
    panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
```

This correctly avoids the WARN when the reset already zeroed the as_count. However, after this, `atomic_read(&perfcnt->hw_reset_happened)` is never cleared back to 0 in the disable path. It's cleared in `perfcnt_enable_locked` via `atomic_cmpxchg`. What if the user disables perfcnt and never re-enables? The flag stays set forever, which is fine since the struct outlives file close.

But what about `panfrost_perfcnt_close()`? It calls `panfrost_perfcnt_disable_locked()` which will skip `as_put` if `hw_reset_happened` is set. Then `panfrost_perfcnt_close` does `pm_runtime_put_autosuspend` at the end. This seems consistent.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: Re: [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling
  2026-06-04 17:50   ` sashiko-bot
@ 2026-06-04 20:16     ` Claude Code Review Bot
  0 siblings, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: This is the key bug fix. Looks correct.**

The fix adds `pm_runtime_get_noresume()` after `pm_runtime_enable()` to increment the usage count (since the device is already active), then adds `pm_runtime_put_autosuspend()` after `drm_dev_register()` to allow eventual autosuspend.

```c
pm_runtime_set_active(pfdev->base.dev);
pm_runtime_mark_last_busy(pfdev->base.dev);
pm_runtime_enable(pfdev->base.dev);
pm_runtime_get_noresume(pfdev->base.dev);  // NEW: count=1, device stays active
...
drm_dev_register(&pfdev->base, 0);
...
pm_runtime_put_autosuspend(pfdev->base.dev);  // NEW: count=0, autosuspend starts
```

The error path adds `pm_runtime_put_noidle` to balance the `get_noresume`.

In `panfrost_remove`, the reordering is:

```c
pm_runtime_get_sync(pfdev->base.dev);    // count=1
panfrost_device_fini(pfdev);             // teardown while device is active
pm_runtime_put_noidle(pfdev->base.dev);  // count=0
pm_runtime_disable(pfdev->base.dev);     // disable PM
```

This is correct. Previously `pm_runtime_disable` was called before `panfrost_device_fini`, which could leave the device in an inconsistent state if the fini path needed PM. The new ordering ensures the device is powered on during teardown and the put is balanced before disable.

The commit message clearly explains the symptoms (devfreq OPP notifications, first tiler heap job timeout) and root cause.

---

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Claude review: drm/panfrost: Explicitly enable MMU interrupts at device init
  2026-06-04 17:35 ` [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init Adrián Larumbe
  2026-06-04 17:55   ` sashiko-bot
@ 2026-06-04 20:16   ` Claude Code Review Bot
  1 sibling, 0 replies; 28+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 20:16 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Verdict: Correct, follows naturally from patch 6.**

```c
int panfrost_mmu_init(struct panfrost_device *pfdev)
{
    ...
    panfrost_mmu_enable_interrupts(pfdev);
    return 0;
}
```

Since patch 6 ensures the device starts active (with PM get/put around probe), there's a window between `drm_dev_register()` and when autosuspend triggers the first `runtime_suspend`/`runtime_resume` cycle. During this window, jobs could be submitted but MMU interrupts weren't enabled (they're only enabled in `panfrost_mmu_reset()`, called from `panfrost_device_reset()`, called from `runtime_resume()`). This patch closes that gap.

The refactoring of `mmu_write` calls into `panfrost_mmu_enable_interrupts()` is clean and used in both `panfrost_mmu_reset()` and `panfrost_mmu_init()`.

---

## SUMMARY OF ISSUES

1. **Patch 5 — Bug:** `ret = EAGAIN` must be `ret = -EAGAIN` (line ~988 in the mbox)
2. **Patch 5 — Bug:** `err_disable` error path leaks the AS reference from `panfrost_mmu_as_get()` — needs `panfrost_mmu_as_put()` before falling through to `err_vunmap`
3. **Patch 4 — Note:** Register write ordering changed in `disable_locked` (enable regs vs config reg) — verify HW is order-agnostic
4. **Patch 5 — Design:** `panfrost_perfcnt_reset()` writes GPU registers without locking — acceptable given the subsequent soft reset, but should be explicitly noted

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2026-06-04 20:16 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-04 17:35 [PATCH v2 0/7] RPM, perfcnt and other minor fixes for Panfrost Adrián Larumbe
2026-06-04 17:35 ` [PATCH v2 1/7] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
2026-06-04 17:57   ` Boris Brezillon
2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
2026-06-04 17:35 ` [PATCH v2 2/7] drm/panfrost: Prevent division by 0 Adrián Larumbe
2026-06-04 17:44   ` sashiko-bot
2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
2026-06-04 18:02   ` Boris Brezillon
2026-06-04 17:35 ` [PATCH v2 3/7] drm/panfrost: Move shrinker initialization and unplug one level down Adrián Larumbe
2026-06-04 18:04   ` Boris Brezillon
2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
2026-06-04 17:35 ` [PATCH v2 4/7] drm/panfrost: Move perfcnt GPU disable sequence into a helper Adrián Larumbe
2026-06-04 17:47   ` sashiko-bot
2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
2026-06-04 18:05   ` Boris Brezillon
2026-06-04 17:35 ` [PATCH v2 5/7] drm/panfrost: Make reset sequence deal with an active HWPerf session Adrián Larumbe
2026-06-04 17:49   ` sashiko-bot
2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
2026-06-04 18:26   ` Boris Brezillon
2026-06-04 17:35 ` [PATCH v2 6/7] drm/panfrost: Fix PM usage_count mishandling Adrián Larumbe
2026-06-04 17:50   ` sashiko-bot
2026-06-04 20:16     ` Claude review: " Claude Code Review Bot
2026-06-04 18:36   ` Boris Brezillon
2026-06-04 17:35 ` [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init Adrián Larumbe
2026-06-04 17:55   ` sashiko-bot
2026-06-04 20:16   ` Claude review: " Claude Code Review Bot
2026-06-04 20:16 ` Claude review: RPM, perfcnt and other minor fixes for Panfrost Claude Code Review Bot
  -- strict thread matches above, loose matches on Subject: below --
2026-05-25 23:03 [PATCH 0/4] RPM, percnt " Adrián Larumbe
2026-05-25 23:03 ` [PATCH 1/4] drm/panfrost: Check another bo field for cache option query Adrián Larumbe
2026-05-27  5:22   ` Claude review: " Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox