public inbox for drm-ai-reviews@public-inbox.freedesktop.org
* [PATCH V1] accel/amdxdna: Return errors for failed debug BO commands
@ 2026-05-29 16:21 Lizhi Hou
  2026-05-31 13:17 ` Mario Limonciello
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Lizhi Hou @ 2026-05-29 16:21 UTC (permalink / raw)
  To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
	karol.wachowski
  Cc: Lizhi Hou, linux-kernel, max.zhen, sonal.santan

The config and sync debug BO commands currently may report success even
when the operation fails.

Capture the firmware return status and propagate the corresponding error
to userspace.

Fixes: 7ea046838021 ("accel/amdxdna: Support firmware debug buffer")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
 drivers/accel/amdxdna/aie2_ctx.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index 2ad343728782..da89b3701f5b 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -305,17 +305,13 @@ aie2_sched_drvcmd_resp_handler(void *handle, void __iomem *data, size_t size)
 	struct amdxdna_sched_job *job = handle;
 	int ret = 0;
 
-	if (unlikely(!data))
-		goto out;
-
-	if (unlikely(size != sizeof(u32))) {
+	if (unlikely(!data || size != sizeof(u32))) {
+		job->drv_cmd->result = U32_MAX;
 		ret = -EINVAL;
-		goto out;
+	} else {
+		job->drv_cmd->result = readl(data);
 	}
 
-	job->drv_cmd->result = readl(data);
-
-out:
 	aie2_sched_notify(job);
 	return ret;
 }
@@ -940,6 +936,7 @@ static int aie2_hwctx_cfg_debug_bo(struct amdxdna_hwctx *hwctx, u32 bo_hdl,
 	aie2_cmd_wait(hwctx, seq);
 	if (cmd.result) {
 		XDNA_ERR(xdna, "Response failure 0x%x", cmd.result);
+		ret = -EINVAL;
 		goto put_obj;
 	}
 
-- 
2.34.1



* Re: [PATCH V1] accel/amdxdna: Return errors for failed debug BO commands
  2026-05-29 16:21 [PATCH V1] accel/amdxdna: Return errors for failed debug BO commands Lizhi Hou
@ 2026-05-31 13:17 ` Mario Limonciello
  2026-06-04  6:14 ` Claude review: " Claude Code Review Bot
  2026-06-04  6:14 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Mario Limonciello @ 2026-05-31 13:17 UTC (permalink / raw)
  To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, karol.wachowski
  Cc: linux-kernel, max.zhen, sonal.santan



On 5/29/26 18:21, Lizhi Hou wrote:
> The config and sync debug BO commands currently may report success even
> when the operation fails.
> 
> Capture the firmware return status and propagate the corresponding error
> to userspace.
> 
> Fixes: 7ea046838021 ("accel/amdxdna: Support firmware debug buffer")
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
> ---
>   drivers/accel/amdxdna/aie2_ctx.c | 13 +++++--------
>   1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
> index 2ad343728782..da89b3701f5b 100644
> --- a/drivers/accel/amdxdna/aie2_ctx.c
> +++ b/drivers/accel/amdxdna/aie2_ctx.c
> @@ -305,17 +305,13 @@ aie2_sched_drvcmd_resp_handler(void *handle, void __iomem *data, size_t size)
>   	struct amdxdna_sched_job *job = handle;
>   	int ret = 0;
>   
> -	if (unlikely(!data))
> -		goto out;
> -
> -	if (unlikely(size != sizeof(u32))) {
> +	if (unlikely(!data || size != sizeof(u32))) {
> +		job->drv_cmd->result = U32_MAX;
>   		ret = -EINVAL;
> -		goto out;
> +	} else {
> +		job->drv_cmd->result = readl(data);
>   	}
>   
> -	job->drv_cmd->result = readl(data);
> -
> -out:
>   	aie2_sched_notify(job);
>   	return ret;
>   }
> @@ -940,6 +936,7 @@ static int aie2_hwctx_cfg_debug_bo(struct amdxdna_hwctx *hwctx, u32 bo_hdl,
>   	aie2_cmd_wait(hwctx, seq);
>   	if (cmd.result) {
>   		XDNA_ERR(xdna, "Response failure 0x%x", cmd.result);
> +		ret = -EINVAL;
>   		goto put_obj;
>   	}
>   



* Claude review: accel/amdxdna: Return errors for failed debug BO commands
  2026-05-29 16:21 [PATCH V1] accel/amdxdna: Return errors for failed debug BO commands Lizhi Hou
  2026-05-31 13:17 ` Mario Limonciello
@ 2026-06-04  6:14 ` Claude Code Review Bot
  2026-06-04  6:14 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-06-04  6:14 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: accel/amdxdna: Return errors for failed debug BO commands
Author: Lizhi Hou <lizhi.hou@amd.com>
Patches: 2
Reviewed: 2026-06-04T16:14:42.300725

---

This is a single-patch bugfix for the `accel/amdxdna` driver. The patch fixes two related problems where debug BO (buffer object) config/sync commands could silently report success to userspace despite firmware failures:

1. The response handler `aie2_sched_drvcmd_resp_handler` returned success (ret=0) and left `result` at its zero-initialized value when `data` was NULL, making callers believe the command succeeded.
2. `aie2_hwctx_cfg_debug_bo` never set `ret` to an error code when `cmd.result` indicated a firmware failure, so it returned 0 to userspace despite logging an error.

The fix is small, correct, and well-targeted. One minor observation about the scope of the commit message is given in the per-patch review.

---
Generated by Claude Code Patch Reviewer


* Claude review: accel/amdxdna: Return errors for failed debug BO commands
  2026-05-29 16:21 [PATCH V1] accel/amdxdna: Return errors for failed debug BO commands Lizhi Hou
  2026-05-31 13:17 ` Mario Limonciello
  2026-06-04  6:14 ` Claude review: " Claude Code Review Bot
@ 2026-06-04  6:14 ` Claude Code Review Bot
  2 siblings, 0 replies; 4+ messages in thread
From: Claude Code Review Bot @ 2026-06-04  6:14 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**Hunk 1 — `aie2_sched_drvcmd_resp_handler` refactor (lines 305–321)**

Good cleanup. The original code had a subtle bug: when `data` was NULL, it jumped to `out` with `ret = 0` and never wrote `job->drv_cmd->result`, leaving it at whatever the caller initialized (zero via `{ 0 }`). The size-mismatch path did set `ret = -EINVAL`, but it also skipped the write to `result`. Callers check `if (cmd.result)` to detect failure, so in both cases they would see success.

The new code:
```c
if (unlikely(!data || size != sizeof(u32))) {
    job->drv_cmd->result = U32_MAX;
    ret = -EINVAL;
} else {
    job->drv_cmd->result = readl(data);
}
```

This correctly handles both error paths: `ret = -EINVAL` for the return value, and `U32_MAX` as the result sentinel so callers checking `cmd.result` also see a failure. The `if/else` replacing the `goto out` pattern is cleaner for this two-case logic.
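
To make the difference concrete, here is a minimal userspace sketch of the old and new handler logic. It is a model, not the driver code: `struct mock_drv_cmd`, `old_resp_handler` and `new_resp_handler` are hypothetical names, a plain pointer dereference stands in for `readl()`, and `UINT32_MAX` stands in for the kernel's `U32_MAX`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver command; only `result` is modelled. */
struct mock_drv_cmd {
	uint32_t result;
};

/* Old logic: both early exits skip the write to cmd->result. */
static int old_resp_handler(struct mock_drv_cmd *cmd, const uint32_t *data,
			    size_t size)
{
	int ret = 0;

	if (!data)
		goto out;

	if (size != sizeof(uint32_t)) {
		ret = -EINVAL;
		goto out;
	}

	cmd->result = *data;	/* stands in for readl(data) */
out:
	return ret;
}

/* New logic: every path writes cmd->result, using a sentinel on error. */
static int new_resp_handler(struct mock_drv_cmd *cmd, const uint32_t *data,
			    size_t size)
{
	int ret = 0;

	if (!data || size != sizeof(uint32_t)) {
		cmd->result = UINT32_MAX;	/* failure is visible to the caller */
		ret = -EINVAL;
	} else {
		cmd->result = *data;	/* stands in for readl(data) */
	}

	return ret;
}
```

With a zero-initialized command, the old model leaves `result` at 0 for a NULL or wrongly sized response, which a caller testing `if (cmd.result)` reads as success. The new model reports `UINT32_MAX` in the same situations.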

**Hunk 2 — `aie2_hwctx_cfg_debug_bo` missing error return (line 940)**

This is the core bug. Before the fix:
```c
aie2_cmd_wait(hwctx, seq);
if (cmd.result) {
    XDNA_ERR(xdna, "Response failure 0x%x", cmd.result);
    goto put_obj;
}
```

At this point `ret` is still 0 from the successful `amdxdna_cmd_submit` call, so jumping to `put_obj` returns success to userspace despite logging the error. The fix adding `ret = -EINVAL` is correct and matches the pattern already used in `aie2_hwctx_sync_debug_bo` (line 994 in the tree).
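
The stale-`ret` problem is easy to reproduce outside the driver. The sketch below is a simplified model under stated assumptions: `mock_submit`, `cfg_debug_bo_before` and `cfg_debug_bo_after` are hypothetical names, the firmware result is passed in as a parameter instead of being read back after `aie2_cmd_wait`, and the cleanup at `put_obj` is reduced to the return statement.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a submit call that succeeds. */
static int mock_submit(void)
{
	return 0;
}

/* Before the fix: a firmware failure jumps to cleanup with ret still 0. */
static int cfg_debug_bo_before(uint32_t fw_result)
{
	int ret;

	ret = mock_submit();
	if (ret)
		goto put_obj;

	if (fw_result)
		goto put_obj;	/* ret was never updated, so 0 is returned */

put_obj:
	return ret;
}

/* After the fix: the failure path sets an error code before cleanup. */
static int cfg_debug_bo_after(uint32_t fw_result)
{
	int ret;

	ret = mock_submit();
	if (ret)
		goto put_obj;

	if (fw_result) {
		ret = -EINVAL;
		goto put_obj;
	}

put_obj:
	return ret;
}
```

For any non-zero firmware result, the first function returns 0 and the second returns `-EINVAL`; both return 0 when the firmware result is 0.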

**Minor observation on commit message:** The commit message says "The config and sync debug BO commands currently may report success even when the operation fails." However, `aie2_hwctx_sync_debug_bo` already returns `-EINVAL` when `cmd.result` is non-zero (tree lines 992–994). The "sync" part of the fix therefore comes only from the response handler change (the paths that previously left `result` unwritten, such as `data == NULL`), not from a missing error return like the one in the config function. The commit message is technically accurate but could be clearer: the config function had two bugs (response handler plus missing `ret` assignment), while the sync function had only the response handler one.

**No other issues found.** The patch is correct and ready to merge.

---
Generated by Claude Code Patch Reviewer
