* [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode
@ 2026-02-10 19:16 Lizhi Hou
2026-02-11 6:14 ` Claude review: " Claude Code Review Bot
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Lizhi Hou @ 2026-02-10 19:16 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, maciej.falkowski
Cc: Lizhi Hou, linux-kernel, max.zhen, sonal.santan,
mario.limonciello
Enabling turbo mode disables hardware clock gating. Suspend requires
hardware clock gating to be re-enabled, otherwise suspend will fail.
Fix this by calling aie2_runtime_cfg() from aie2_hw_stop() to
re-enable clock gating during suspend. Also ensure that firmware is
initialized in aie2_hw_start() before modifying clock-gating
settings during resume.
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
drivers/accel/amdxdna/aie2_pci.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 5b326e4610e6..0d41a6764892 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -323,6 +323,7 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
return;
}
+ aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, NULL);
aie2_mgmt_fw_fini(ndev);
xdna_mailbox_stop_channel(ndev->mgmt_chann);
xdna_mailbox_destroy_channel(ndev->mgmt_chann);
@@ -406,15 +407,15 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto stop_psp;
}
- ret = aie2_pm_init(ndev);
+ ret = aie2_mgmt_fw_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
+ XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
goto destroy_mgmt_chann;
}
- ret = aie2_mgmt_fw_init(ndev);
+ ret = aie2_pm_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
+ XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
goto destroy_mgmt_chann;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Claude review: accel/amdxdna: Fix suspend failure after enabling turbo mode
2026-02-10 19:16 [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode Lizhi Hou
@ 2026-02-11 6:14 ` Claude Code Review Bot
2026-02-11 6:15 ` Claude Code Review Bot
2026-02-11 19:23 ` [PATCH V1] " Mario Limonciello
2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-11 6:14 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: accel/amdxdna: Fix suspend failure after enabling turbo
mode
Author: Lizhi Hou <lizhi.hou@amd.com>
Patches: 1
Reviewed: 2026-02-11T16:14:59.929102
---
This single-patch series addresses a suspend/resume issue in the AMD XDNA accelerator driver related to turbo mode and hardware clock gating. The fix is straightforward and addresses a clear bug where enabling turbo mode (which disables clock gating) causes subsequent suspend operations to fail because suspend requires clock gating to be enabled.
**High-Level Assessment:**
The patch correctly identifies that the suspend path needs to re-enable clock gating before suspending the device, and the resume path needs to ensure firmware is initialized before modifying clock gating settings. However, there are several concerns with the implementation:
1. **Error handling**: The added call to `aie2_runtime_cfg()` in `aie2_hw_stop()` doesn't check return values
2. **Ordering logic**: The swap of `aie2_pm_init()` and `aie2_mgmt_fw_init()` in the resume path needs better justification
3. **Missing cleanup**: If the new ordering causes failures, the cleanup path may be incorrect
4. **Documentation**: No comments explain the ordering requirements
**Recommendation:** Needs revision to address error handling and provide clearer rationale for the initialization order change.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
* Claude review: accel/amdxdna: Fix suspend failure after enabling turbo mode
2026-02-10 19:16 [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode Lizhi Hou
2026-02-11 6:14 ` Claude review: " Claude Code Review Bot
@ 2026-02-11 6:15 ` Claude Code Review Bot
2026-02-11 19:23 ` [PATCH V1] " Mario Limonciello
2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-02-11 6:15 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Commit Message Review:**
The commit message provides a clear explanation of the problem and the solution. However, it could be improved:
- Should include a "Fixes:" tag referencing the commit that introduced turbo mode
- Should mention if this is user-visible (e.g., "Users experience suspend failures...")
- Could benefit from describing the failure mode (does the system hang? return error?)
**Code Review:**
```c
@@ -323,6 +323,7 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
return;
}
+ aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, NULL);
aie2_mgmt_fw_fini(ndev);
xdna_mailbox_stop_channel(ndev->mgmt_chann);
xdna_mailbox_destroy_channel(ndev->mgmt_chann);
```
**Issues at drivers/accel/amdxdna/aie2_pci.c:326:**
1. **Missing error handling**: `aie2_runtime_cfg()` likely returns an error code, but it's not checked here. What happens if re-enabling clock gating fails? Should the function proceed with firmware shutdown or bail out?
2. **Missing comment**: This non-obvious call deserves a comment explaining why clock gating must be re-enabled before firmware shutdown, something like:
```c
/* Re-enable clock gating before suspend, as turbo mode disables it */
```
3. **Ordering question**: Should this be done *after* firmware shutdown rather than before? If firmware operations require certain clock states, the current order might cause issues.
```c
@@ -406,15 +407,15 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto stop_psp;
}
- ret = aie2_pm_init(ndev);
+ ret = aie2_mgmt_fw_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
+ XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
goto destroy_mgmt_chann;
}
- ret = aie2_mgmt_fw_init(ndev);
+ ret = aie2_pm_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
+ XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
goto destroy_mgmt_chann;
}
```
**Issues at drivers/accel/amdxdna/aie2_pci.c:410-419:**
1. **Inadequate justification**: The commit message says "ensure that firmware is initialized in aie2_hw_start() before modifying clock-gating settings" but doesn't explain *why* this is necessary. Does `aie2_pm_init()` modify clock gating? If so, this should be explicit in the commit message.
2. **Broken error path**: Both error cases now `goto destroy_mgmt_chann`, but with the new ordering:
- If `aie2_mgmt_fw_init()` fails, we skip destroying it (correct)
- If `aie2_pm_init()` fails, we `goto destroy_mgmt_chann` but we've already initialized firmware - shouldn't we call `aie2_mgmt_fw_fini()` first?
This looks like a bug in the error handling path. You need different labels:
```c
ret = aie2_mgmt_fw_init(ndev);
if (ret) {
XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
goto destroy_mgmt_chann;
}
ret = aie2_pm_init(ndev);
if (ret) {
XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
goto fini_mgmt_fw; /* Need to cleanup firmware */
}
```
3. **Typo in error message**: "initial mgmt firmware failed" should be "initialize mgmt firmware failed" or "mgmt firmware initialization failed" for consistency with the other message format.
4. **Testing concern**: Has this been tested with error injection to verify the error paths work correctly? The changed ordering could mask issues.
**Additional Concerns:**
1. **Missing symmetric cleanup**: In `aie2_hw_stop()`, the patch adds `aie2_runtime_cfg()` before `aie2_mgmt_fw_fini()`, but there's no corresponding operation in `aie2_hw_start()` after `aie2_pm_init()`. Is this asymmetry intentional?
2. **Race conditions**: What happens if suspend is called while turbo mode is being enabled? Is there locking to prevent races between runtime PM and mode changes?
3. **Power consumption**: If re-enabling clock gating fails during suspend, does the device stay in a high-power state?
**Suggested Fix:**
```c
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 5b326e4610e6..XXXXXXXX 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -323,6 +323,12 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
return;
}
+ /*
+ * Re-enable clock gating for suspend. Turbo mode disables it,
+ * but suspend requires it to be enabled.
+ */
+ ret = aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, NULL);
+ WARN_ON(ret); /* Log but continue - shutdown is critical path */
+
aie2_mgmt_fw_fini(ndev);
xdna_mailbox_stop_channel(ndev->mgmt_chann);
xdna_mailbox_destroy_channel(ndev->mgmt_chann);
@@ -406,15 +412,20 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto stop_psp;
}
- ret = aie2_pm_init(ndev);
+ /*
+ * Initialize firmware before PM to ensure clock gating
+ * configuration happens after firmware is ready.
+ */
+ ret = aie2_mgmt_fw_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
+ XDNA_ERR(xdna, "mgmt firmware init failed, ret %d", ret);
goto destroy_mgmt_chann;
}
- ret = aie2_mgmt_fw_init(ndev);
+ ret = aie2_pm_init(ndev);
if (ret) {
- XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
+ XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
+ aie2_mgmt_fw_fini(ndev);
goto destroy_mgmt_chann;
}
```
**Summary:**
The patch addresses a real bug but needs revision for:
- Proper error handling in the suspend path
- Fixed error cleanup in the resume path
- Better documentation of ordering requirements
- Clearer commit message with Fixes: tag
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode
2026-02-10 19:16 [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode Lizhi Hou
2026-02-11 6:14 ` Claude review: " Claude Code Review Bot
2026-02-11 6:15 ` Claude Code Review Bot
@ 2026-02-11 19:23 ` Mario Limonciello
2 siblings, 0 replies; 4+ messages in thread
From: Mario Limonciello @ 2026-02-11 19:23 UTC (permalink / raw)
To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, maciej.falkowski
Cc: linux-kernel, max.zhen, sonal.santan
On 2/10/26 1:16 PM, Lizhi Hou wrote:
> Enabling turbo mode disables hardware clock gating. Suspend requires
> hardware clock gating to be re-enabled, otherwise suspend will fail.
> Fix this by calling aie2_runtime_cfg() from aie2_hw_stop() to
> re-enable clock gating during suspend. Also ensure that firmware is
> initialized in aie2_hw_start() before modifying clock-gating
> settings during resume.
>
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Fixes tag?
> ---
> drivers/accel/amdxdna/aie2_pci.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
> index 5b326e4610e6..0d41a6764892 100644
> --- a/drivers/accel/amdxdna/aie2_pci.c
> +++ b/drivers/accel/amdxdna/aie2_pci.c
> @@ -323,6 +323,7 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
> return;
> }
>
> + aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, NULL);
> aie2_mgmt_fw_fini(ndev);
> xdna_mailbox_stop_channel(ndev->mgmt_chann);
> xdna_mailbox_destroy_channel(ndev->mgmt_chann);
> @@ -406,15 +407,15 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
> goto stop_psp;
> }
>
> - ret = aie2_pm_init(ndev);
> + ret = aie2_mgmt_fw_init(ndev);
> if (ret) {
> - XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
> + XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
> goto destroy_mgmt_chann;
> }
>
> - ret = aie2_mgmt_fw_init(ndev);
> + ret = aie2_pm_init(ndev);
> if (ret) {
> - XDNA_ERR(xdna, "initial mgmt firmware failed, ret %d", ret);
> + XDNA_ERR(xdna, "failed to init pm, ret %d", ret);
> goto destroy_mgmt_chann;
> }
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-11 19:24 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-10 19:16 [PATCH V1] accel/amdxdna: Fix suspend failure after enabling turbo mode Lizhi Hou
2026-02-11 6:14 ` Claude review: " Claude Code Review Bot
2026-02-11 6:15 ` Claude Code Review Bot
2026-02-11 19:23 ` [PATCH V1] " Mario Limonciello
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox