* [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback
@ 2026-05-29 15:28 Lizhi Hou
2026-05-31 14:18 ` Mario Limonciello
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Lizhi Hou @ 2026-05-29 15:28 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
karol.wachowski
Cc: Lizhi Hou, linux-kernel, max.zhen, sonal.santan
aie2_sched_job_free() accesses job->drv_cmd for tracing purposes. However,
job->drv_cmd is owned by the caller and may already have been freed when
the job free callback runs, leading to a potential use-after-free.
Remove the job->drv_cmd access from aie2_sched_job_free().
Fixes: 8711eb2dde2e ("accel/amdxdna: Improve tracing for job lifecycle and mailbox RX worker")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
drivers/accel/amdxdna/aie2_ctx.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index 658a5fb1fda6..2ad343728782 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -437,8 +437,9 @@ static void aie2_sched_job_free(struct drm_sched_job *sched_job)
struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job);
struct amdxdna_hwctx *hwctx = job->hwctx;
+ /* job->drv_cmd could be freed, so use DEFAULT_IO */
trace_xdna_job(sched_job, hwctx->name, "job free",
- job->seq, job->drv_cmd ? job->drv_cmd->opcode : DEFAULT_IO);
+ job->seq, DEFAULT_IO);
if (!job->job_done)
up(&hwctx->priv->job_sem);
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-05-29 15:28 [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback Lizhi Hou
@ 2026-05-31 14:18 ` Mario Limonciello
2026-06-01 15:18 ` Lizhi Hou
2026-06-04 6:19 ` Claude review: " Claude Code Review Bot
2026-06-04 6:19 ` Claude Code Review Bot
2 siblings, 1 reply; 7+ messages in thread
From: Mario Limonciello @ 2026-05-31 14:18 UTC (permalink / raw)
To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, karol.wachowski
Cc: linux-kernel, max.zhen, sonal.santan
On 5/29/26 17:28, Lizhi Hou wrote:
> aie2_sched_job_free() accesses job->drv_cmd for tracing purposes. However,
> job->drv_cmd is owned by the caller and may already have been freed when
> the job free callback runs, leading to a potential use-after-free.
>
> Remove the job->drv_cmd access from aie2_sched_job_free().
>
> Fixes: 8711eb2dde2e ("accel/amdxdna: Improve tracing for job lifecycle and mailbox RX worker")
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> ---
> drivers/accel/amdxdna/aie2_ctx.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
> index 658a5fb1fda6..2ad343728782 100644
> --- a/drivers/accel/amdxdna/aie2_ctx.c
> +++ b/drivers/accel/amdxdna/aie2_ctx.c
> @@ -437,8 +437,9 @@ static void aie2_sched_job_free(struct drm_sched_job *sched_job)
> struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job);
> struct amdxdna_hwctx *hwctx = job->hwctx;
>
> + /* job->drv_cmd could be freed, so use DEFAULT_IO */
> trace_xdna_job(sched_job, hwctx->name, "job free",
> - job->seq, job->drv_cmd ? job->drv_cmd->opcode : DEFAULT_IO);
> + job->seq, DEFAULT_IO);
Could this still be a race with dov->drv_cmd being valid when the first
part of the expression is evaluated (job->drv_cmd) but invalid when
job->drv_cmd->opcode is accessed?
> if (!job->job_done)
> up(&hwctx->priv->job_sem);
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-05-31 14:18 ` Mario Limonciello
@ 2026-06-01 15:18 ` Lizhi Hou
2026-06-03 18:34 ` Mario Limonciello
0 siblings, 1 reply; 7+ messages in thread
From: Lizhi Hou @ 2026-06-01 15:18 UTC (permalink / raw)
To: Mario Limonciello, ogabbay, quic_jhugo, dri-devel,
karol.wachowski
Cc: linux-kernel, max.zhen, sonal.santan
On 5/31/26 07:18, Mario Limonciello wrote:
>
>
> On 5/29/26 17:28, Lizhi Hou wrote:
>> aie2_sched_job_free() accesses job->drv_cmd for tracing purposes.
>> However,
>> job->drv_cmd is owned by the caller and may already have been freed when
>> the job free callback runs, leading to a potential use-after-free.
>>
>> Remove the job->drv_cmd access from aie2_sched_job_free().
>>
>> Fixes: 8711eb2dde2e ("accel/amdxdna: Improve tracing for job
>> lifecycle and mailbox RX worker")
>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
>> ---
>> drivers/accel/amdxdna/aie2_ctx.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/accel/amdxdna/aie2_ctx.c
>> b/drivers/accel/amdxdna/aie2_ctx.c
>> index 658a5fb1fda6..2ad343728782 100644
>> --- a/drivers/accel/amdxdna/aie2_ctx.c
>> +++ b/drivers/accel/amdxdna/aie2_ctx.c
>> @@ -437,8 +437,9 @@ static void aie2_sched_job_free(struct
>> drm_sched_job *sched_job)
>> struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job);
>> struct amdxdna_hwctx *hwctx = job->hwctx;
>> + /* job->drv_cmd could be freed, so use DEFAULT_IO */
>> trace_xdna_job(sched_job, hwctx->name, "job free",
>> - job->seq, job->drv_cmd ? job->drv_cmd->opcode :
>> DEFAULT_IO);
>> + job->seq, DEFAULT_IO);
>
> Could this still be a race with dov->drv_cmd being valid when the
> first part of the expression is evaluated (job->drv_cmd) but invalid
> when job->drv_cmd->opcode is accessed?
When aie2_sched_job_free() is called, the job->drv_cmd could already be
freed. So it should never access job->drv_cmd at all. The entire
expression "job->drv_cmd ? job->drv_cmd->opcode : DEFAULT_IO" is removed.
Lizhi
>
>> if (!job->job_done)
>> up(&hwctx->priv->job_sem);
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-06-01 15:18 ` Lizhi Hou
@ 2026-06-03 18:34 ` Mario Limonciello
2026-06-03 20:17 ` Lizhi Hou
0 siblings, 1 reply; 7+ messages in thread
From: Mario Limonciello @ 2026-06-03 18:34 UTC (permalink / raw)
To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, karol.wachowski
Cc: linux-kernel, max.zhen, sonal.santan
On 6/1/26 10:18, Lizhi Hou wrote:
>
> On 5/31/26 07:18, Mario Limonciello wrote:
>>
>>
>> On 5/29/26 17:28, Lizhi Hou wrote:
>>> aie2_sched_job_free() accesses job->drv_cmd for tracing purposes.
>>> However,
>>> job->drv_cmd is owned by the caller and may already have been freed when
>>> the job free callback runs, leading to a potential use-after-free.
>>>
>>> Remove the job->drv_cmd access from aie2_sched_job_free().
>>>
>>> Fixes: 8711eb2dde2e ("accel/amdxdna: Improve tracing for job
>>> lifecycle and mailbox RX worker")
>>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
>>> ---
>>> drivers/accel/amdxdna/aie2_ctx.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/
>>> amdxdna/aie2_ctx.c
>>> index 658a5fb1fda6..2ad343728782 100644
>>> --- a/drivers/accel/amdxdna/aie2_ctx.c
>>> +++ b/drivers/accel/amdxdna/aie2_ctx.c
>>> @@ -437,8 +437,9 @@ static void aie2_sched_job_free(struct
>>> drm_sched_job *sched_job)
>>> struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job);
>>> struct amdxdna_hwctx *hwctx = job->hwctx;
>>> + /* job->drv_cmd could be freed, so use DEFAULT_IO */
>>> trace_xdna_job(sched_job, hwctx->name, "job free",
>>> - job->seq, job->drv_cmd ? job->drv_cmd->opcode :
>>> DEFAULT_IO);
>>> + job->seq, DEFAULT_IO);
>>
>> Could this still be a race with dov->drv_cmd being valid when the
>> first part of the expression is evaluated (job->drv_cmd) but invalid
>> when job->drv_cmd->opcode is accessed?
>
> When aie2_sched_job_free() is called, the job->drv_cmd could already be
> freed. So it should never access job->drv_cmd at all. The entire
> expression "job->drv_cmd ? job->drv_cmd->opcode : DEFAULT_IO" is removed.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
>
> Lizhi
>
>>
>>> if (!job->job_done)
>>> up(&hwctx->priv->job_sem);
>>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-06-03 18:34 ` Mario Limonciello
@ 2026-06-03 20:17 ` Lizhi Hou
0 siblings, 0 replies; 7+ messages in thread
From: Lizhi Hou @ 2026-06-03 20:17 UTC (permalink / raw)
To: Mario Limonciello, ogabbay, quic_jhugo, dri-devel,
karol.wachowski
Cc: linux-kernel, max.zhen, sonal.santan
Applied to drm-misc-next-fixes
On 6/3/26 11:34, Mario Limonciello wrote:
>
>
> On 6/1/26 10:18, Lizhi Hou wrote:
>>
>> On 5/31/26 07:18, Mario Limonciello wrote:
>>>
>>>
>>> On 5/29/26 17:28, Lizhi Hou wrote:
>>>> aie2_sched_job_free() accesses job->drv_cmd for tracing purposes.
>>>> However,
>>>> job->drv_cmd is owned by the caller and may already have been freed
>>>> when
>>>> the job free callback runs, leading to a potential use-after-free.
>>>>
>>>> Remove the job->drv_cmd access from aie2_sched_job_free().
>>>>
>>>> Fixes: 8711eb2dde2e ("accel/amdxdna: Improve tracing for job
>>>> lifecycle and mailbox RX worker")
>>>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
>>>> ---
>>>> drivers/accel/amdxdna/aie2_ctx.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/
>>>> amdxdna/aie2_ctx.c
>>>> index 658a5fb1fda6..2ad343728782 100644
>>>> --- a/drivers/accel/amdxdna/aie2_ctx.c
>>>> +++ b/drivers/accel/amdxdna/aie2_ctx.c
>>>> @@ -437,8 +437,9 @@ static void aie2_sched_job_free(struct
>>>> drm_sched_job *sched_job)
>>>> struct amdxdna_sched_job *job = drm_job_to_xdna_job(sched_job);
>>>> struct amdxdna_hwctx *hwctx = job->hwctx;
>>>> + /* job->drv_cmd could be freed, so use DEFAULT_IO */
>>>> trace_xdna_job(sched_job, hwctx->name, "job free",
>>>> - job->seq, job->drv_cmd ? job->drv_cmd->opcode :
>>>> DEFAULT_IO);
>>>> + job->seq, DEFAULT_IO);
>>>
>>> Could this still be a race with dov->drv_cmd being valid when the
>>> first part of the expression is evaluated (job->drv_cmd) but invalid
>>> when job->drv_cmd->opcode is accessed?
>>
>> When aie2_sched_job_free() is called, the job->drv_cmd could already
>> be freed. So it should never access job->drv_cmd at all. The entire
>> expression "job->drv_cmd ? job->drv_cmd->opcode : DEFAULT_IO" is
>> removed.
> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
>>
>> Lizhi
>>
>>>
>>>> if (!job->job_done)
>>>> up(&hwctx->priv->job_sem);
>>>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-05-29 15:28 [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback Lizhi Hou
2026-05-31 14:18 ` Mario Limonciello
@ 2026-06-04 6:19 ` Claude Code Review Bot
2026-06-04 6:19 ` Claude Code Review Bot
2 siblings, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 6:19 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: accel/amdxdna: Remove drv_cmd tracing from job free callback
Author: Lizhi Hou <lizhi.hou@amd.com>
Patches: 5
Reviewed: 2026-06-04T16:19:38.908940
---
This is a single-patch fix for a real use-after-free bug in `accel/amdxdna`. The analysis is correct: `job->drv_cmd` points to a caller-owned stack-local `struct amdxdna_drv_cmd` (e.g., the `cmd` variable in `aie2_hwctx_cfg_debug_bo()` at line 907 and `aie2_hwctx_sync_debug_bo()` at line 979). After `amdxdna_cmd_submit()` returns and the calling function finishes, that stack frame is gone — but the job may still be alive in the scheduler and `aie2_sched_job_free()` can run later via the `.free_job` callback, at which point `job->drv_cmd` is a dangling pointer.
The fix is correct and minimal. The patch is straightforward and appropriate for a `Fixes:` tag.
**Verdict: Looks good.** One minor observation below.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: accel/amdxdna: Remove drv_cmd tracing from job free callback
2026-05-29 15:28 [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback Lizhi Hou
2026-05-31 14:18 ` Mario Limonciello
2026-06-04 6:19 ` Claude review: " Claude Code Review Bot
@ 2026-06-04 6:19 ` Claude Code Review Bot
2 siblings, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-06-04 6:19 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Correctness: Good.** The use-after-free is real. The callers in `aie2_hwctx_cfg_debug_bo()` and `aie2_hwctx_sync_debug_bo()` pass a pointer to a stack-local `struct amdxdna_drv_cmd cmd`:
```c
struct amdxdna_drv_cmd cmd = { 0 };
...
ret = amdxdna_cmd_submit(client, &cmd, ...);
```
And in `amdxdna_cmd_submit()` this is stored as a bare pointer:
```c
job->drv_cmd = drv_cmd;
```
Since `aie2_sched_job_free` runs asynchronously via `drm_sched_backend_ops.free_job`, it can execute after the caller's stack frame is gone, making `job->drv_cmd` a dangling pointer. The fix correctly removes the access.
**Minor observation:** The added comment says `/* job->drv_cmd could be freed, so use DEFAULT_IO */`, but strictly speaking `drv_cmd` points to a stack-local — it isn't "freed" in the heap sense, the stack frame just goes out of scope. A more precise comment might say "could be invalid" or "may point to freed/stale memory." This is very minor and doesn't affect the fix.
**Other `drv_cmd` accesses:** The other uses of `job->drv_cmd` in `aie2_ctx.c` (the "signaling fence" trace at line 212, "job run" at line 384, the switch at line 395, and "sent to device" at line 429) all happen within `aie2_sched_job_run()` or its signal callback path, which runs while the job is actively submitted — the caller is blocked in `aie2_cmd_wait()` at that point, so the stack-local `cmd` is still valid. Those accesses are fine.
**Reviewed-by criteria:** This is a clean, minimal fix for a real bug. No concerns.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-04 6:19 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-29 15:28 [PATCH V1] accel/amdxdna: Remove drv_cmd tracing from job free callback Lizhi Hou
2026-05-31 14:18 ` Mario Limonciello
2026-06-01 15:18 ` Lizhi Hou
2026-06-03 18:34 ` Mario Limonciello
2026-06-03 20:17 ` Lizhi Hou
2026-06-04 6:19 ` Claude review: " Claude Code Review Bot
2026-06-04 6:19 ` Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox