public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: Fix resource leaks
@ 2026-02-25  1:44 Ethan Tidmore
  2026-02-25  6:36 ` kernel test robot
  0 siblings, 1 reply; 4+ messages in thread
From: Ethan Tidmore @ 2026-02-25  1:44 UTC (permalink / raw)
  To: Alex Deucher, Christian König, amd-gfx
  Cc: David Airlie, Simona Vetter, Lijo Lazar, Mario Limonciello,
	Mario Limonciello, Ce Sun, Yo-Jung Leo Lin, Jammy Zhou, dri-devel,
	linux-kernel, Ethan Tidmore

There are multiple resource leaks due to ioremap() being used and
iounmap never being called in multiple possible error paths.

Change ioremap() to devm_ioremap() to fix all resource leaks at
once.

Detected by Smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4834 amdgpu_device_init() warn:
'adev->rmmio' from ioremap() not released on lines:
4539,4549,4563,4574,4592,4834.

Fixes: d38ceaf99ed01 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0acddcb04730..5cb58bed64ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4520,7 +4520,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	for (i = 0; i < AMD_IP_BLOCK_TYPE_NUM; i++)
 		atomic_set(&adev->pm.pwr_state[i], POWER_STATE_UNKNOWN);
 
-	adev->rmmio = ioremap(adev->rmmio_base, adev->rmmio_size);
+	adev->rmmio = devm_ioremap(adev->dev, adev->rmmio_base, adev->rmmio_size);
 	if (!adev->rmmio)
 		return -ENOMEM;
 
@@ -4843,8 +4843,6 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev)
 	/* Unmap all mapped bars - Doorbell, registers and VRAM */
 	amdgpu_doorbell_fini(adev);
 
-	iounmap(adev->rmmio);
-	adev->rmmio = NULL;
 	if (adev->mman.aper_base_kaddr)
 		iounmap(adev->mman.aper_base_kaddr);
 	adev->mman.aper_base_kaddr = NULL;
@@ -4970,13 +4968,6 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
 		vga_client_unregister(adev->pdev);
 
-	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
-
-		iounmap(adev->rmmio);
-		adev->rmmio = NULL;
-		drm_dev_exit(idx);
-	}
-
 	if (IS_ENABLED(CONFIG_PERF_EVENTS))
 		amdgpu_pmu_fini(adev);
 	if (adev->discovery.bin)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leaks
  2026-02-25  1:44 [PATCH] drm/amdgpu: Fix resource leaks Ethan Tidmore
@ 2026-02-25  6:36 ` kernel test robot
  2026-02-27  4:17   ` Claude review: " Claude Code Review Bot
  2026-02-27  4:17   ` Claude Code Review Bot
  0 siblings, 2 replies; 4+ messages in thread
From: kernel test robot @ 2026-02-25  6:36 UTC (permalink / raw)
  To: Ethan Tidmore, Alex Deucher, Christian König, amd-gfx
  Cc: oe-kbuild-all, David Airlie, Simona Vetter, Lijo Lazar,
	Mario Limonciello, Ce Sun, Yo-Jung Leo Lin, Jammy Zhou, dri-devel,
	linux-kernel, Ethan Tidmore

Hi Ethan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v7.0-rc1 next-20260224]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ethan-Tidmore/drm-amdgpu-Fix-resource-leaks/20260225-095342
base:   https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link:    https://lore.kernel.org/r/20260225014425.2474802-1-ethantidmore06%40gmail.com
patch subject: [PATCH] drm/amdgpu: Fix resource leaks
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260225/202602251459.1jr0PuUf-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260225/202602251459.1jr0PuUf-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602251459.1jr0PuUf-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function 'amdgpu_device_fini_sw':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4942:16: warning: unused variable 'idx' [-Wunused-variable]
    4942 |         int i, idx;
         |                ^~~


vim +/idx +4942 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

72c8c97b1522ce Andrey Grodzovsky   2021-05-12  4939  
72c8c97b1522ce Andrey Grodzovsky   2021-05-12  4940  void amdgpu_device_fini_sw(struct amdgpu_device *adev)
72c8c97b1522ce Andrey Grodzovsky   2021-05-12  4941  {
bd22e44ad415ac Christian König     2025-01-15 @4942  	int i, idx;
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4943  	bool px;
62d5f9f7110ad3 Leslie Shi          2022-01-05  4944  
a5c5d8d50ecf58 Lang Yu             2021-10-21  4945  	amdgpu_device_ip_fini(adev);
b61badd20b443e Vitaly Prosyak      2024-11-11  4946  	amdgpu_fence_driver_sw_fini(adev);
b31d306378d9ba Mario Limonciello   2023-01-04  4947  	amdgpu_ucode_release(&adev->firmware.gpu_info_fw);
d38ceaf99ed015 Alex Deucher        2015-04-20  4948  	adev->accel_working = false;
68ce8b24224265 Christian König     2022-03-02  4949  	dma_fence_put(rcu_dereference_protected(adev->gang_submit, true));
bd22e44ad415ac Christian König     2025-01-15  4950  	for (i = 0; i < MAX_XCP; ++i) {
bd22e44ad415ac Christian König     2025-01-15  4951  		dma_fence_put(adev->isolation[i].spearhead);
bd22e44ad415ac Christian König     2025-01-15  4952  		amdgpu_sync_free(&adev->isolation[i].active);
bd22e44ad415ac Christian König     2025-01-15  4953  		amdgpu_sync_free(&adev->isolation[i].prev);
bd22e44ad415ac Christian König     2025-01-15  4954  	}
04442bf70debb1 Lijo Lazar          2021-03-16  4955  
04442bf70debb1 Lijo Lazar          2021-03-16  4956  	amdgpu_reset_fini(adev);
04442bf70debb1 Lijo Lazar          2021-03-16  4957  
d38ceaf99ed015 Alex Deucher        2015-04-20  4958  	/* free i2c buses */
d38ceaf99ed015 Alex Deucher        2015-04-20  4959  	amdgpu_i2c_fini(adev);
bfca02892773d2 Shaoyun Liu         2018-02-01  4960  
6e8ca38ebc9b13 Lijo Lazar          2025-02-05  4961  	if (adev->bios) {
bfca02892773d2 Shaoyun Liu         2018-02-01  4962  		if (amdgpu_emu_mode != 1)
d38ceaf99ed015 Alex Deucher        2015-04-20  4963  			amdgpu_atombios_fini(adev);
e986e89659e18b Lijo Lazar          2025-02-05  4964  		amdgpu_bios_release(adev);
6e8ca38ebc9b13 Lijo Lazar          2025-02-05  4965  	}
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4966  
8a2b51392ac4a5 Lijo Lazar          2023-10-04  4967  	kfree(adev->fru_info);
8a2b51392ac4a5 Lijo Lazar          2023-10-04  4968  	adev->fru_info = NULL;
8a2b51392ac4a5 Lijo Lazar          2023-10-04  4969  
b5aaa82e2b12fe Flora Cui           2025-03-14  4970  	kfree(adev->xcp_mgr);
b5aaa82e2b12fe Flora Cui           2025-03-14  4971  	adev->xcp_mgr = NULL;
b5aaa82e2b12fe Flora Cui           2025-03-14  4972  
127ed492ad2df0 Lijo Lazar          2025-06-30  4973  	px = amdgpu_device_supports_px(adev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4974  
7b1c6263eaf4fd Alex Deucher        2023-10-17  4975  	if (px || (!dev_is_removable(&adev->pdev->dev) &&
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4976  				apple_gmux_detect(NULL, NULL)))
d38ceaf99ed015 Alex Deucher        2015-04-20  4977  		vga_switcheroo_unregister_client(adev->pdev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4978  
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4979  	if (px)
83ba126a9be318 Alex Deucher        2016-06-03  4980  		vga_switcheroo_fini_domain_pm_ops(adev->dev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03  4981  
38d6be8199331e Alex Deucher        2020-11-20  4982  	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
b8779475869a26 Christoph Hellwig   2021-07-16  4983  		vga_client_unregister(adev->pdev);
e9bc1bf7916e00 Yintian Tao         2019-06-05  4984  
d155bef0636e01 Arnd Bergmann       2019-07-08  4985  	if (IS_ENABLED(CONFIG_PERF_EVENTS))
9c7c85f7ea1fe5 Jonathan Kim        2019-06-19  4986  		amdgpu_pmu_fini(adev);
9e2096baab9add Lijo Lazar          2025-10-10  4987  	if (adev->discovery.bin)
a190d1c75c73ce Xiaojie Yuan        2019-03-27  4988  		amdgpu_discovery_fini(adev);
72c8c97b1522ce Andrey Grodzovsky   2021-05-12  4989  
cfbb6b0047448e Andrey Grodzovsky   2022-01-21  4990  	amdgpu_reset_put_reset_domain(adev->reset_domain);
cfbb6b0047448e Andrey Grodzovsky   2022-01-21  4991  	adev->reset_domain = NULL;
cfbb6b0047448e Andrey Grodzovsky   2022-01-21  4992  
72c8c97b1522ce Andrey Grodzovsky   2021-05-12  4993  	kfree(adev->pci_state);
1dd2fa0e00f17b Lijo Lazar          2025-06-28  4994  	kfree(adev->pcie_reset_ctx.swds_pcistate);
1dd2fa0e00f17b Lijo Lazar          2025-06-28  4995  	kfree(adev->pcie_reset_ctx.swus_pcistate);
d38ceaf99ed015 Alex Deucher        2015-04-20  4996  }
d38ceaf99ed015 Alex Deucher        2015-04-20  4997  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Claude review: Re: [PATCH] drm/amdgpu: Fix resource leaks
  2026-02-25  6:36 ` kernel test robot
@ 2026-02-27  4:17   ` Claude Code Review Bot
  2026-02-27  4:17   ` Claude Code Review Bot
  1 sibling, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-27  4:17 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: Re: [PATCH] drm/amdgpu: Fix resource leaks
Author: kernel test robot <lkp@intel.com>
Patches: 2
Reviewed: 2026-02-27T14:17:10.577336

---

This single patch converts `adev->rmmio` from `ioremap()` to `devm_ioremap()` and removes the manual `iounmap()` calls. While the intent to fix resource leaks on error paths is valid (the Smatch warning is real), the approach has **serious correctness problems** related to the device lifecycle and hot-unplug handling. The patch should **not be applied** in its current form.

The core issue: the patch removes `adev->rmmio = NULL` assignments that serve as **guards** in the driver's teardown logic, and changes the unmap timing from explicit driver-controlled teardown to implicit device-release-time cleanup, which fundamentally breaks the hot-unplug code path.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Claude review: Re: [PATCH] drm/amdgpu: Fix resource leaks
  2026-02-25  6:36 ` kernel test robot
  2026-02-27  4:17   ` Claude review: " Claude Code Review Bot
@ 2026-02-27  4:17   ` Claude Code Review Bot
  1 sibling, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-27  4:17 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**The problem being fixed is real**: after `ioremap()` at line 4537, multiple error paths in `amdgpu_device_init()` return without calling `iounmap()`, and none of them go through `amdgpu_device_fini_sw()`. The Smatch warning is legitimate.

However, using `devm_ioremap()` is the wrong fix here for several reasons:

**1. Removal of `adev->rmmio = NULL` breaks hot-unplug guard logic**

In `amdgpu_device_unmap_mmio()`, the patch removes:
```c
-	iounmap(adev->rmmio);
-	adev->rmmio = NULL;
```

The `adev->rmmio = NULL` assignment is critical because it's checked in `amdgpu_driver_unload_kms()` (`amdgpu_kms.c:91`):
```c
if (adev->rmmio == NULL)
    return;
```

This guard prevents `amdgpu_device_fini_hw()` from running on an already-torn-down device during hot-unplug. With the patch applied, `adev->rmmio` is never set to NULL, so this guard never triggers, potentially causing use-after-free or double-teardown issues during PCI hot-unplug.

**2. Lifetime mismatch with hot-unplug path**

`amdgpu_device_unmap_mmio()` is called from `amdgpu_device_fini_hw()` when `pci_dev_is_disconnected()` is true (line 4937-4938). This is designed to immediately unmap MMIO when the PCI device disappears, so accesses to the now-gone hardware stop immediately. With `devm_ioremap()`, the unmap is deferred to device release time, meaning `adev->rmmio` remains a valid pointer to an ioremap'd region for a physically absent device — the register space behind it is gone. While `no_hw_access` flag guards most register accesses, the explicit unmap provided a second safety net.

**3. Removal of `drm_dev_enter()` guarded cleanup in `amdgpu_device_fini_sw()` is wrong**

The patch removes:
```c
-	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
-
-		iounmap(adev->rmmio);
-		adev->rmmio = NULL;
-		drm_dev_exit(idx);
-	}
```

The `drm_dev_enter()` guard here is deliberate — it ensures the iounmap only happens when the DRM device is still accessible (i.e., NOT hot-unplugged). This two-site cleanup pattern (`amdgpu_device_unmap_mmio` for hot-unplug, `amdgpu_device_fini_sw` for normal shutdown) is intentional and correct. Removing it collapses two distinct lifecycle paths into "let devm handle it," which is the wrong abstraction for a driver with explicit hot-unplug support.

**4. devm cleanup ordering may conflict with other teardown**

`devm_ioremap()` resources are freed when the `struct device` is released. With `adev` allocated via `devm_drm_dev_alloc()`, the ordering between DRM device release and PCI device release can be subtle. Code in `amdgpu_device_fini_sw()` that runs *after* the removed iounmap block (e.g., `amdgpu_discovery_fini`, `amdgpu_pmu_fini`) may still need the MMIO mapping to be present (or at least deterministically know whether it is).

**Suggested alternative fix**: Instead of converting to `devm_ioremap()`, fix the actual error paths in `amdgpu_device_init()` by adding `iounmap(adev->rmmio)` before the failing returns after the ioremap, or consolidating the error cleanup through a goto label that performs the unmap. This is the more targeted and lower-risk fix for the Smatch warning.

---
Generated by Claude Code Patch Reviewer

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-02-27  4:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-25  1:44 [PATCH] drm/amdgpu: Fix resource leaks Ethan Tidmore
2026-02-25  6:36 ` kernel test robot
2026-02-27  4:17   ` Claude review: " Claude Code Review Bot
2026-02-27  4:17   ` Claude Code Review Bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox