* [PATCH] drm/amdgpu: Fix resource leaks
@ 2026-02-25 1:44 Ethan Tidmore
2026-02-25 6:36 ` kernel test robot
0 siblings, 1 reply; 4+ messages in thread
From: Ethan Tidmore @ 2026-02-25 1:44 UTC (permalink / raw)
To: Alex Deucher, Christian König, amd-gfx
Cc: David Airlie, Simona Vetter, Lijo Lazar, Mario Limonciello,
Mario Limonciello, Ce Sun, Yo-Jung Leo Lin, Jammy Zhou, dri-devel,
linux-kernel, Ethan Tidmore
There are multiple resource leaks due to ioremap() being used and
iounmap never being called in multiple possible error paths.
Change ioremap() to devm_ioremap() to fix all resource leaks at
once.
Detected by Smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4834 amdgpu_device_init() warn:
'adev->rmmio' from ioremap() not released on lines:
4539,4549,4563,4574,4592,4834.
Fixes: d38ceaf99ed01 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0acddcb04730..5cb58bed64ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4520,7 +4520,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
for (i = 0; i < AMD_IP_BLOCK_TYPE_NUM; i++)
atomic_set(&adev->pm.pwr_state[i], POWER_STATE_UNKNOWN);
- adev->rmmio = ioremap(adev->rmmio_base, adev->rmmio_size);
+ adev->rmmio = devm_ioremap(adev->dev, adev->rmmio_base, adev->rmmio_size);
if (!adev->rmmio)
return -ENOMEM;
@@ -4843,8 +4843,6 @@ static void amdgpu_device_unmap_mmio(struct amdgpu_device *adev)
/* Unmap all mapped bars - Doorbell, registers and VRAM */
amdgpu_doorbell_fini(adev);
- iounmap(adev->rmmio);
- adev->rmmio = NULL;
if (adev->mman.aper_base_kaddr)
iounmap(adev->mman.aper_base_kaddr);
adev->mman.aper_base_kaddr = NULL;
@@ -4970,13 +4968,6 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
vga_client_unregister(adev->pdev);
- if (drm_dev_enter(adev_to_drm(adev), &idx)) {
-
- iounmap(adev->rmmio);
- adev->rmmio = NULL;
- drm_dev_exit(idx);
- }
-
if (IS_ENABLED(CONFIG_PERF_EVENTS))
amdgpu_pmu_fini(adev);
if (adev->discovery.bin)
--
2.53.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] drm/amdgpu: Fix resource leaks
2026-02-25 1:44 [PATCH] drm/amdgpu: Fix resource leaks Ethan Tidmore
@ 2026-02-25 6:36 ` kernel test robot
2026-02-27 4:17 ` Claude review: " Claude Code Review Bot
2026-02-27 4:17 ` Claude Code Review Bot
0 siblings, 2 replies; 4+ messages in thread
From: kernel test robot @ 2026-02-25 6:36 UTC (permalink / raw)
To: Ethan Tidmore, Alex Deucher, Christian König, amd-gfx
Cc: oe-kbuild-all, David Airlie, Simona Vetter, Lijo Lazar,
Mario Limonciello, Ce Sun, Yo-Jung Leo Lin, Jammy Zhou, dri-devel,
linux-kernel, Ethan Tidmore
Hi Ethan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v7.0-rc1 next-20260224]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ethan-Tidmore/drm-amdgpu-Fix-resource-leaks/20260225-095342
base: https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link: https://lore.kernel.org/r/20260225014425.2474802-1-ethantidmore06%40gmail.com
patch subject: [PATCH] drm/amdgpu: Fix resource leaks
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260225/202602251459.1jr0PuUf-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260225/202602251459.1jr0PuUf-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602251459.1jr0PuUf-lkp@intel.com/
All warnings (new ones prefixed by >>):
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function 'amdgpu_device_fini_sw':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4942:16: warning: unused variable 'idx' [-Wunused-variable]
4942 | int i, idx;
| ^~~
vim +/idx +4942 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
72c8c97b1522ce Andrey Grodzovsky 2021-05-12 4939
72c8c97b1522ce Andrey Grodzovsky 2021-05-12 4940 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
72c8c97b1522ce Andrey Grodzovsky 2021-05-12 4941 {
bd22e44ad415ac Christian König 2025-01-15 @4942 int i, idx;
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4943 bool px;
62d5f9f7110ad3 Leslie Shi 2022-01-05 4944
a5c5d8d50ecf58 Lang Yu 2021-10-21 4945 amdgpu_device_ip_fini(adev);
b61badd20b443e Vitaly Prosyak 2024-11-11 4946 amdgpu_fence_driver_sw_fini(adev);
b31d306378d9ba Mario Limonciello 2023-01-04 4947 amdgpu_ucode_release(&adev->firmware.gpu_info_fw);
d38ceaf99ed015 Alex Deucher 2015-04-20 4948 adev->accel_working = false;
68ce8b24224265 Christian König 2022-03-02 4949 dma_fence_put(rcu_dereference_protected(adev->gang_submit, true));
bd22e44ad415ac Christian König 2025-01-15 4950 for (i = 0; i < MAX_XCP; ++i) {
bd22e44ad415ac Christian König 2025-01-15 4951 dma_fence_put(adev->isolation[i].spearhead);
bd22e44ad415ac Christian König 2025-01-15 4952 amdgpu_sync_free(&adev->isolation[i].active);
bd22e44ad415ac Christian König 2025-01-15 4953 amdgpu_sync_free(&adev->isolation[i].prev);
bd22e44ad415ac Christian König 2025-01-15 4954 }
04442bf70debb1 Lijo Lazar 2021-03-16 4955
04442bf70debb1 Lijo Lazar 2021-03-16 4956 amdgpu_reset_fini(adev);
04442bf70debb1 Lijo Lazar 2021-03-16 4957
d38ceaf99ed015 Alex Deucher 2015-04-20 4958 /* free i2c buses */
d38ceaf99ed015 Alex Deucher 2015-04-20 4959 amdgpu_i2c_fini(adev);
bfca02892773d2 Shaoyun Liu 2018-02-01 4960
6e8ca38ebc9b13 Lijo Lazar 2025-02-05 4961 if (adev->bios) {
bfca02892773d2 Shaoyun Liu 2018-02-01 4962 if (amdgpu_emu_mode != 1)
d38ceaf99ed015 Alex Deucher 2015-04-20 4963 amdgpu_atombios_fini(adev);
e986e89659e18b Lijo Lazar 2025-02-05 4964 amdgpu_bios_release(adev);
6e8ca38ebc9b13 Lijo Lazar 2025-02-05 4965 }
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4966
8a2b51392ac4a5 Lijo Lazar 2023-10-04 4967 kfree(adev->fru_info);
8a2b51392ac4a5 Lijo Lazar 2023-10-04 4968 adev->fru_info = NULL;
8a2b51392ac4a5 Lijo Lazar 2023-10-04 4969
b5aaa82e2b12fe Flora Cui 2025-03-14 4970 kfree(adev->xcp_mgr);
b5aaa82e2b12fe Flora Cui 2025-03-14 4971 adev->xcp_mgr = NULL;
b5aaa82e2b12fe Flora Cui 2025-03-14 4972
127ed492ad2df0 Lijo Lazar 2025-06-30 4973 px = amdgpu_device_supports_px(adev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4974
7b1c6263eaf4fd Alex Deucher 2023-10-17 4975 if (px || (!dev_is_removable(&adev->pdev->dev) &&
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4976 apple_gmux_detect(NULL, NULL)))
d38ceaf99ed015 Alex Deucher 2015-04-20 4977 vga_switcheroo_unregister_client(adev->pdev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4978
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4979 if (px)
83ba126a9be318 Alex Deucher 2016-06-03 4980 vga_switcheroo_fini_domain_pm_ops(adev->dev);
d37a3929ca0363 Orlando Chamberlain 2023-03-03 4981
38d6be8199331e Alex Deucher 2020-11-20 4982 if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
b8779475869a26 Christoph Hellwig 2021-07-16 4983 vga_client_unregister(adev->pdev);
e9bc1bf7916e00 Yintian Tao 2019-06-05 4984
d155bef0636e01 Arnd Bergmann 2019-07-08 4985 if (IS_ENABLED(CONFIG_PERF_EVENTS))
9c7c85f7ea1fe5 Jonathan Kim 2019-06-19 4986 amdgpu_pmu_fini(adev);
9e2096baab9add Lijo Lazar 2025-10-10 4987 if (adev->discovery.bin)
a190d1c75c73ce Xiaojie Yuan 2019-03-27 4988 amdgpu_discovery_fini(adev);
72c8c97b1522ce Andrey Grodzovsky 2021-05-12 4989
cfbb6b0047448e Andrey Grodzovsky 2022-01-21 4990 amdgpu_reset_put_reset_domain(adev->reset_domain);
cfbb6b0047448e Andrey Grodzovsky 2022-01-21 4991 adev->reset_domain = NULL;
cfbb6b0047448e Andrey Grodzovsky 2022-01-21 4992
72c8c97b1522ce Andrey Grodzovsky 2021-05-12 4993 kfree(adev->pci_state);
1dd2fa0e00f17b Lijo Lazar 2025-06-28 4994 kfree(adev->pcie_reset_ctx.swds_pcistate);
1dd2fa0e00f17b Lijo Lazar 2025-06-28 4995 kfree(adev->pcie_reset_ctx.swus_pcistate);
d38ceaf99ed015 Alex Deucher 2015-04-20 4996 }
d38ceaf99ed015 Alex Deucher 2015-04-20 4997
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 4+ messages in thread
* Claude review: Re: [PATCH] drm/amdgpu: Fix resource leaks
2026-02-25 6:36 ` kernel test robot
@ 2026-02-27 4:17 ` Claude Code Review Bot
2026-02-27 4:17 ` Claude Code Review Bot
1 sibling, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-27 4:17 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: Re: [PATCH] drm/amdgpu: Fix resource leaks
Author: kernel test robot <lkp@intel.com>
Patches: 2
Reviewed: 2026-02-27T14:17:10.577336
---
This single patch converts `adev->rmmio` from `ioremap()` to `devm_ioremap()` and removes the manual `iounmap()` calls. While the intent to fix resource leaks on error paths is valid (the Smatch warning is real), the approach has **serious correctness problems** related to the device lifecycle and hot-unplug handling. The patch should **not be applied** in its current form.
The core issue: the patch removes `adev->rmmio = NULL` assignments that serve as **guards** in the driver's teardown logic, and changes the unmap timing from explicit driver-controlled teardown to implicit device-release-time cleanup, which fundamentally breaks the hot-unplug code path.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
* Claude review: Re: [PATCH] drm/amdgpu: Fix resource leaks
2026-02-25 6:36 ` kernel test robot
2026-02-27 4:17 ` Claude review: " Claude Code Review Bot
@ 2026-02-27 4:17 ` Claude Code Review Bot
1 sibling, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-27 4:17 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**The problem being fixed is real**: after `ioremap()` at line 4537, multiple error paths in `amdgpu_device_init()` return without calling `iounmap()`, and none of them go through `amdgpu_device_fini_sw()`. The Smatch warning is legitimate.
However, using `devm_ioremap()` is the wrong fix here for several reasons:
**1. Removal of `adev->rmmio = NULL` breaks hot-unplug guard logic**
In `amdgpu_device_unmap_mmio()`, the patch removes:
```c
- iounmap(adev->rmmio);
- adev->rmmio = NULL;
```
The `adev->rmmio = NULL` assignment is critical because it's checked in `amdgpu_driver_unload_kms()` (`amdgpu_kms.c:91`):
```c
if (adev->rmmio == NULL)
return;
```
This guard prevents `amdgpu_device_fini_hw()` from running on an already-torn-down device during hot-unplug. With the patch applied, `adev->rmmio` is never set to NULL, so this guard never triggers, potentially causing use-after-free or double-teardown issues during PCI hot-unplug.
**2. Lifetime mismatch with hot-unplug path**
`amdgpu_device_unmap_mmio()` is called from `amdgpu_device_fini_hw()` when `pci_dev_is_disconnected()` is true (line 4937-4938). This is designed to immediately unmap MMIO when the PCI device disappears, so accesses to the now-gone hardware stop immediately. With `devm_ioremap()`, the unmap is deferred to device release time, meaning `adev->rmmio` remains a valid pointer to an ioremap'd region for a physically absent device — the register space behind it is gone. While `no_hw_access` flag guards most register accesses, the explicit unmap provided a second safety net.
**3. Removal of `drm_dev_enter()` guarded cleanup in `amdgpu_device_fini_sw()` is wrong**
The patch removes:
```c
- if (drm_dev_enter(adev_to_drm(adev), &idx)) {
-
- iounmap(adev->rmmio);
- adev->rmmio = NULL;
- drm_dev_exit(idx);
- }
```
The `drm_dev_enter()` guard here is deliberate — it ensures the iounmap only happens when the DRM device is still accessible (i.e., NOT hot-unplugged). This two-site cleanup pattern (`amdgpu_device_unmap_mmio` for hot-unplug, `amdgpu_device_fini_sw` for normal shutdown) is intentional and correct. Removing it collapses two distinct lifecycle paths into "let devm handle it," which is the wrong abstraction for a driver with explicit hot-unplug support.
**4. devm cleanup ordering may conflict with other teardown**
`devm_ioremap()` resources are freed when the `struct device` is released. With `adev` allocated via `devm_drm_dev_alloc()`, the ordering between DRM device release and PCI device release can be subtle. Code in `amdgpu_device_fini_sw()` that runs *after* the removed iounmap block (e.g., `amdgpu_discovery_fini`, `amdgpu_pmu_fini`) may still need the MMIO mapping to be present (or at least deterministically know whether it is).
**Suggested alternative fix**: Instead of converting to `devm_ioremap()`, fix the actual error paths in `amdgpu_device_init()` by adding `iounmap(adev->rmmio)` before the failing returns after the ioremap, or consolidating the error cleanup through a goto label that performs the unmap. This is the more targeted and lower-risk fix for the Smatch warning.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-27 4:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-25 1:44 [PATCH] drm/amdgpu: Fix resource leaks Ethan Tidmore
2026-02-25 6:36 ` kernel test robot
2026-02-27 4:17 ` Claude review: " Claude Code Review Bot
2026-02-27 4:17 ` Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox