* [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform
From: Lizhi Hou @ 2026-03-30 16:36 UTC
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan
From: David Zhang <yidong.zhang@amd.com>
This series adds initial support for AMD NPU (AIE4) platforms, including
the Physical Function with SR-IOV support on PCI device IDs 0x17F2 and
0x1B0B (NPU3).
Support for SR-IOV Virtual Functions and non-SR-IOV configurations will
be added in future patches.
The AIE4 platform uses mechanisms similar to AIE2 for mailbox, PSP, and
SMU interactions. To support this, the driver is refactored to introduce
shared implementations for mailbox, PSP, and SMU, allowing both AIE2 and
AIE4 to reuse common code.
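As a rough illustration of the refactoring described above, the sketch below shows one way common state can be embedded in generation-specific handles so that shared helpers only touch the common part. It is a userspace approximation, not the driver code: struct aie_common, struct aie2_handle, struct aie4_handle and aie_common_feature_on() are hypothetical names, and only the idea of a shared struct with a feature mask is taken from patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common state, loosely modelled on struct aie_device. */
struct aie_common {
	uint32_t mgmt_prot_major;
	uint32_t mgmt_prot_minor;
	unsigned long feature_mask;
};

/* Hypothetical AIE2 handle: embeds the common part by value. */
struct aie2_handle {
	struct aie_common aie;
	uint32_t total_col;
};

/* Hypothetical AIE4 handle: same common part, different private fields. */
struct aie4_handle {
	struct aie_common aie;
	uint32_t num_vfs;
};

/*
 * Shared helper: it only sees the common part, so code for either
 * generation can call it with &handle->aie.
 */
static int aie_common_feature_on(const struct aie_common *aie, unsigned int bit)
{
	assert(aie != NULL);
	assert(bit < sizeof(aie->feature_mask) * 8);

	return (int)((aie->feature_mask >> bit) & 1UL);
}
```

Callers pass the address of the embedded member, which matches the shape of the AIE_FEATURE_ON(&ndev->aie, ...) call sites in patch 1.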
Series structure
(patch 1) Create common mailbox functions
Factor the AIE mailbox send-and-wait and status handling into common code.
AIE2 and AIE4 use the same aie_send_mgmt_msg_wait() and DECLARE_AIE_MSG().
(patch 2) Add AIE4 SR-IOV support
Add the AIE4 Physical Function driver and SR-IOV support, including new
PCI IDs and device table entries. The amdxdna driver gains a
sriov_configure callback. The mailbox layer gets helpers so that it works
for both AIE2 (iohub) and AIE4 (no iohub). The UAPI is extended with
AMDXDNA_DEV_TYPE_PF.
(patch 3) Create common PSP interfaces
Move PSP support into common code.
(patch 4) AIE4 PSP support and firmware loading
Use the common PSP code for AIE4 hardware. Add firmware request code for
PSP operations and the NPU3 PSP BAR/register layout in npu3_regs.c.
(patch 5) Create common SMU interfaces
Move SMU support into common code.
(patch 6) AIE4 SMU support
Use the common SMU code for AIE4 hardware. Add SMU operations and the
NPU3 SMU BAR/register layout in npu3_regs.c.
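The send-and-wait and status handling mentioned for patch 1 can be sketched roughly as follows. This is a userspace model under stated assumptions: struct fake_channel, struct fake_aie and fake_send_mgmt_msg_wait() are hypothetical stand-ins, and the transport result and firmware status are canned values rather than real mailbox traffic. Only the error-folding order (no channel, timeout, non-zero firmware status) follows aie_send_mgmt_msg_wait() in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mailbox channel with canned results. */
struct fake_channel {
	int transport_ret;   /* value the "transport" returns */
	uint32_t fw_status;  /* status word the "firmware" writes back */
};

/* Hypothetical stand-in for the common device state. */
struct fake_aie {
	struct fake_channel *mgmt_chann;
};

/* Send one management message and fold the firmware status into the result. */
static int fake_send_mgmt_msg_wait(struct fake_aie *aie, uint32_t *status)
{
	int ret;

	assert(aie != NULL && status != NULL);

	/* Without a management channel there is nothing to talk to. */
	if (!aie->mgmt_chann)
		return -ENODEV;

	ret = aie->mgmt_chann->transport_ret;
	*status = aie->mgmt_chann->fw_status;

	/* A timeout leaves the channel unusable, so drop it. */
	if (ret == -ETIME)
		aie->mgmt_chann = NULL;

	/* The transport succeeded but the firmware reported a failure. */
	if (!ret && *status)
		ret = -EINVAL;

	return ret;
}
```

After a timeout the channel pointer is cleared, so every later call in this model returns -ENODEV, which is why callers such as aie2_destroy_context_req() treat -ENODEV separately from other errors.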
Testing
Build and load on a system with an AIE4/NPU3 PF; confirm the device is
bound and SR-IOV can be enabled (e.g. via sriov_numvfs). Confirm
AIE2-based devices (e.g. NPU4) still probe and run as before.
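For the SR-IOV step of the test above, a small helper along the following lines could be used to write a VF count to the sriov_numvfs attribute. It is only a sketch: write_numvfs() is a hypothetical name, and the attribute path has to be supplied by the caller because the PCI address of the PF depends on the system.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Write a VF count to a sriov_numvfs-style attribute file.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int write_numvfs(const char *attr_path, unsigned int num_vfs)
{
	FILE *f;
	int ret = 0;

	assert(attr_path != NULL);

	f = fopen(attr_path, "w");
	if (!f)
		return -errno;

	if (fprintf(f, "%u\n", num_vfs) < 0)
		ret = -EIO;

	/* Buffered write errors can surface only at close time. */
	if (fclose(f) != 0 && !ret)
		ret = -errno;

	return ret;
}
```

On a real system the path would be of the form /sys/bus/pci/devices/<BDF>/sriov_numvfs, where <BDF> is the address of the AIE4 PF. Writing 0 before writing a new non-zero count is the usual way to change an already enabled VF count.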
David Zhang (5):
accel/amdxdna: Add basic support for AIE4 devices
accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4
accel/amdxdna: Add AIE4 firmware loading
accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4
accel/amdxdna: Add AIE4 power on and off support
Lizhi Hou (1):
accel/amdxdna: Create shared functions for AIE2 and AIE4
drivers/accel/amdxdna/Makefile | 10 +-
drivers/accel/amdxdna/aie.c | 89 +++++
drivers/accel/amdxdna/aie.h | 104 +++++
drivers/accel/amdxdna/aie2_ctx.c | 4 +-
drivers/accel/amdxdna/aie2_error.c | 12 +-
drivers/accel/amdxdna/aie2_message.c | 140 +++----
drivers/accel/amdxdna/aie2_pci.c | 145 +++----
drivers/accel/amdxdna/aie2_pci.h | 93 +----
drivers/accel/amdxdna/aie2_pm.c | 6 +-
drivers/accel/amdxdna/aie2_psp.c | 161 --------
drivers/accel/amdxdna/aie2_smu.c | 156 --------
drivers/accel/amdxdna/aie4_message.c | 27 ++
drivers/accel/amdxdna/aie4_msg_priv.h | 49 +++
drivers/accel/amdxdna/aie4_pci.c | 491 ++++++++++++++++++++++++
drivers/accel/amdxdna/aie4_pci.h | 53 +++
drivers/accel/amdxdna/aie4_sriov.c | 88 +++++
drivers/accel/amdxdna/aie_psp.c | 235 ++++++++++++
drivers/accel/amdxdna/aie_smu.c | 153 ++++++++
drivers/accel/amdxdna/amdxdna_mailbox.c | 19 +-
drivers/accel/amdxdna/amdxdna_mailbox.h | 8 +-
drivers/accel/amdxdna/amdxdna_pci_drv.c | 19 +-
drivers/accel/amdxdna/amdxdna_pci_drv.h | 10 +
drivers/accel/amdxdna/npu1_regs.c | 25 +-
drivers/accel/amdxdna/npu3_regs.c | 77 ++++
drivers/accel/amdxdna/npu4_regs.c | 30 +-
drivers/accel/amdxdna/npu5_regs.c | 2 +-
drivers/accel/amdxdna/npu6_regs.c | 2 +-
include/uapi/drm/amdxdna_accel.h | 3 +-
28 files changed, 1609 insertions(+), 602 deletions(-)
create mode 100644 drivers/accel/amdxdna/aie.c
create mode 100644 drivers/accel/amdxdna/aie.h
delete mode 100644 drivers/accel/amdxdna/aie2_psp.c
delete mode 100644 drivers/accel/amdxdna/aie2_smu.c
create mode 100644 drivers/accel/amdxdna/aie4_message.c
create mode 100644 drivers/accel/amdxdna/aie4_msg_priv.h
create mode 100644 drivers/accel/amdxdna/aie4_pci.c
create mode 100644 drivers/accel/amdxdna/aie4_pci.h
create mode 100644 drivers/accel/amdxdna/aie4_sriov.c
create mode 100644 drivers/accel/amdxdna/aie_psp.c
create mode 100644 drivers/accel/amdxdna/aie_smu.c
create mode 100644 drivers/accel/amdxdna/npu3_regs.c
--
2.34.1
* [PATCH V1 1/6] accel/amdxdna: Create shared functions for AIE2 and AIE4
From: Lizhi Hou @ 2026-03-30 16:37 UTC
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: Lizhi Hou, linux-kernel, max.zhen, sonal.santan
The AIE4 platform uses a mailbox management channel mechanism similar to
AIE2 to communicate with the firmware.
Create aie.h and aie.c and move the functions and structures that can
be shared by both platforms from the AIE2-specific files into these
common files. This allows AIE2 and AIE4 to reuse the same implementation
and reduces code duplication.
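One of the functions moved into the common file is the firmware protocol check. The sketch below models its table walk in userspace C. The names struct fw_feature_entry and check_protocol() and the field layout are assumptions for illustration; the matching rules (same major, minor at or above min_minor, max_minor of 0 meaning no upper bound, features accumulated across all matching entries) follow aie_check_protocol() in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; a zero major terminates the table. */
struct fw_feature_entry {
	uint64_t features;
	uint32_t major;
	uint32_t min_minor;
	uint32_t max_minor; /* 0 means no upper bound */
};

/*
 * Accumulate the features of every entry matching the firmware version.
 * Returns 0 if at least one entry matched, -EOPNOTSUPP otherwise.
 */
static int check_protocol(const struct fw_feature_entry *tbl,
			  uint32_t fw_major, uint32_t fw_minor,
			  uint64_t *feature_mask)
{
	const struct fw_feature_entry *e;
	int found = 0;

	assert(tbl != NULL && feature_mask != NULL);

	for (e = tbl; e->major; e++) {
		if (e->major != fw_major)
			continue;
		if (fw_minor < e->min_minor)
			continue;
		if (e->max_minor > 0 && fw_minor > e->max_minor)
			continue;

		*feature_mask |= e->features;
		found = 1;
	}

	return found ? 0 : -EOPNOTSUPP;
}
```

Because the loop does not stop at the first match, a single firmware version can pick up feature bits from several table entries.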
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
drivers/accel/amdxdna/Makefile | 1 +
drivers/accel/amdxdna/aie.c | 89 +++++++++++++++
drivers/accel/amdxdna/aie.h | 31 ++++++
drivers/accel/amdxdna/aie2_ctx.c | 4 +-
drivers/accel/amdxdna/aie2_error.c | 12 +--
drivers/accel/amdxdna/aie2_message.c | 138 +++++++++---------------
drivers/accel/amdxdna/aie2_pci.c | 107 ++++++------------
drivers/accel/amdxdna/aie2_pci.h | 26 +----
drivers/accel/amdxdna/aie2_pm.c | 6 +-
drivers/accel/amdxdna/aie2_smu.c | 22 ++--
drivers/accel/amdxdna/amdxdna_pci_drv.h | 8 ++
drivers/accel/amdxdna/npu1_regs.c | 4 +-
drivers/accel/amdxdna/npu4_regs.c | 4 +-
drivers/accel/amdxdna/npu5_regs.c | 2 +-
drivers/accel/amdxdna/npu6_regs.c | 2 +-
15 files changed, 246 insertions(+), 210 deletions(-)
create mode 100644 drivers/accel/amdxdna/aie.c
create mode 100644 drivers/accel/amdxdna/aie.h
diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile
index cf9bf19dedb9..5c7911554c46 100644
--- a/drivers/accel/amdxdna/Makefile
+++ b/drivers/accel/amdxdna/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
amdxdna-y := \
+ aie.o \
aie2_ctx.o \
aie2_error.o \
aie2_message.o \
diff --git a/drivers/accel/amdxdna/aie.c b/drivers/accel/amdxdna/aie.c
new file mode 100644
index 000000000000..4b3d4493128e
--- /dev/null
+++ b/drivers/accel/amdxdna/aie.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <linux/errno.h>
+
+#include "aie.h"
+#include "amdxdna_mailbox_helper.h"
+#include "amdxdna_mailbox.h"
+#include "amdxdna_pci_drv.h"
+
+void aie_dump_mgmt_chann_debug(struct aie_device *aie)
+{
+ struct amdxdna_dev *xdna = aie->xdna;
+
+ XDNA_DBG(xdna, "i2x tail 0x%x", aie->mgmt_i2x.mb_tail_ptr_reg);
+ XDNA_DBG(xdna, "i2x head 0x%x", aie->mgmt_i2x.mb_head_ptr_reg);
+ XDNA_DBG(xdna, "i2x ringbuf 0x%x", aie->mgmt_i2x.rb_start_addr);
+ XDNA_DBG(xdna, "i2x rsize 0x%x", aie->mgmt_i2x.rb_size);
+ XDNA_DBG(xdna, "x2i tail 0x%x", aie->mgmt_x2i.mb_tail_ptr_reg);
+ XDNA_DBG(xdna, "x2i head 0x%x", aie->mgmt_x2i.mb_head_ptr_reg);
+ XDNA_DBG(xdna, "x2i ringbuf 0x%x", aie->mgmt_x2i.rb_start_addr);
+ XDNA_DBG(xdna, "x2i rsize 0x%x", aie->mgmt_x2i.rb_size);
+ XDNA_DBG(xdna, "x2i chann index 0x%x", aie->mgmt_chan_idx);
+ XDNA_DBG(xdna, "mailbox protocol major 0x%x", aie->mgmt_prot_major);
+ XDNA_DBG(xdna, "mailbox protocol minor 0x%x", aie->mgmt_prot_minor);
+}
+
+void aie_destroy_chann(struct aie_device *aie, struct mailbox_channel **chann)
+{
+ struct amdxdna_dev *xdna = aie->xdna;
+
+ drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
+
+ if (!*chann)
+ return;
+
+ xdna_mailbox_stop_channel(*chann);
+ xdna_mailbox_free_channel(*chann);
+ *chann = NULL;
+}
+
+int aie_send_mgmt_msg_wait(struct aie_device *aie, struct xdna_mailbox_msg *msg)
+{
+ struct amdxdna_dev *xdna = aie->xdna;
+ struct xdna_notify *hdl = msg->handle;
+ int ret;
+
+ drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
+
+ if (!aie->mgmt_chann)
+ return -ENODEV;
+
+ ret = xdna_send_msg_wait(xdna, aie->mgmt_chann, msg);
+ if (ret == -ETIME)
+ aie_destroy_chann(aie, &aie->mgmt_chann);
+
+ if (!ret && *hdl->status) {
+ XDNA_ERR(xdna, "command opcode 0x%x failed, status 0x%x",
+ msg->opcode, *hdl->data);
+ ret = -EINVAL;
+ }
+
+ return ret;
+}
+
+int aie_check_protocol(struct aie_device *aie, u32 fw_major, u32 fw_minor)
+{
+ const struct amdxdna_fw_feature_tbl *feature;
+ bool found = false;
+
+ for (feature = aie->xdna->dev_info->fw_feature_tbl;
+ feature->major; feature++) {
+ if (feature->major != fw_major)
+ continue;
+ if (fw_minor < feature->min_minor)
+ continue;
+ if (feature->max_minor > 0 && fw_minor > feature->max_minor)
+ continue;
+
+ aie->feature_mask |= feature->features;
+
+ /* firmware version matches one of the driver's supported entries */
+ found = true;
+ }
+
+ return found ? 0 : -EOPNOTSUPP;
+}
diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
new file mode 100644
index 000000000000..1bea14b79c7c
--- /dev/null
+++ b/drivers/accel/amdxdna/aie.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+#ifndef _AIE_H_
+#define _AIE_H_
+
+#include "amdxdna_pci_drv.h"
+#include "amdxdna_mailbox.h"
+
+struct aie_device {
+ struct amdxdna_dev *xdna;
+ struct mailbox_channel *mgmt_chann;
+ struct xdna_mailbox_chann_res mgmt_x2i;
+ struct xdna_mailbox_chann_res mgmt_i2x;
+ u32 mgmt_chan_idx;
+ u32 mgmt_prot_major;
+ u32 mgmt_prot_minor;
+ unsigned long feature_mask;
+};
+
+#define DECLARE_AIE_MSG(name, op) \
+ DECLARE_XDNA_MSG_COMMON(name, op, -1)
+#define AIE_FEATURE_ON(aie, feature) test_bit(feature, &(aie)->feature_mask)
+
+void aie_dump_mgmt_chann_debug(struct aie_device *aie);
+void aie_destroy_chann(struct aie_device *aie, struct mailbox_channel **chann);
+int aie_send_mgmt_msg_wait(struct aie_device *aie, struct xdna_mailbox_msg *msg);
+int aie_check_protocol(struct aie_device *aie, u32 fw_major, u32 fw_minor);
+
+#endif /* _AIE_H_ */
diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index 66dbbfd322a2..a942ac626d07 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -525,7 +525,7 @@ static int aie2_alloc_resource(struct amdxdna_hwctx *hwctx)
struct alloc_requests *xrs_req;
int ret;
- if (AIE2_FEATURE_ON(xdna->dev_handle, AIE2_TEMPORAL_ONLY)) {
+ if (AIE_FEATURE_ON(&xdna->dev_handle->aie, AIE2_TEMPORAL_ONLY)) {
hwctx->num_unused_col = xdna->dev_handle->total_col - hwctx->num_col;
hwctx->num_col = xdna->dev_handle->total_col;
return aie2_create_context(xdna->dev_handle, hwctx);
@@ -562,7 +562,7 @@ static void aie2_release_resource(struct amdxdna_hwctx *hwctx)
struct amdxdna_dev *xdna = hwctx->client->xdna;
int ret;
- if (AIE2_FEATURE_ON(xdna->dev_handle, AIE2_TEMPORAL_ONLY)) {
+ if (AIE_FEATURE_ON(&xdna->dev_handle->aie, AIE2_TEMPORAL_ONLY)) {
ret = aie2_destroy_context(xdna->dev_handle, hwctx);
if (ret && ret != -ENODEV)
XDNA_ERR(xdna, "Destroy temporal only context failed, ret %d", ret);
diff --git a/drivers/accel/amdxdna/aie2_error.c b/drivers/accel/amdxdna/aie2_error.c
index 58abb59b6153..9d20e956c020 100644
--- a/drivers/accel/amdxdna/aie2_error.c
+++ b/drivers/accel/amdxdna/aie2_error.c
@@ -249,12 +249,12 @@ static u32 aie2_error_backtrack(struct amdxdna_dev_hdl *ndev, void *err_info, u3
enum aie_error_category cat;
cat = aie_get_error_category(err->row, err->event_id, err->mod_type);
- XDNA_ERR(ndev->xdna, "Row: %d, Col: %d, module %d, event ID %d, category %d",
+ XDNA_ERR(ndev->aie.xdna, "Row: %d, Col: %d, module %d, event ID %d, category %d",
err->row, err->col, err->mod_type,
err->event_id, cat);
if (err->col >= 32) {
- XDNA_WARN(ndev->xdna, "Invalid column number");
+ XDNA_WARN(ndev->aie.xdna, "Invalid column number");
break;
}
@@ -294,7 +294,7 @@ static void aie2_error_worker(struct work_struct *err_work)
e = container_of(err_work, struct async_event, work);
- xdna = e->ndev->xdna;
+ xdna = e->ndev->aie.xdna;
if (e->resp.status == MAX_AIE2_STATUS_CODE)
return;
@@ -329,7 +329,7 @@ static void aie2_error_worker(struct work_struct *err_work)
void aie2_error_async_events_free(struct amdxdna_dev_hdl *ndev)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
struct async_events *events;
events = ndev->async_events;
@@ -344,7 +344,7 @@ void aie2_error_async_events_free(struct amdxdna_dev_hdl *ndev)
int aie2_error_async_events_alloc(struct amdxdna_dev_hdl *ndev)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
u32 total_col = ndev->total_col;
u32 total_size = ASYNC_BUF_SIZE * total_col;
struct async_events *events;
@@ -402,7 +402,7 @@ int aie2_error_async_events_alloc(struct amdxdna_dev_hdl *ndev)
int aie2_get_array_async_error(struct amdxdna_dev_hdl *ndev, struct amdxdna_drm_get_array *args)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
diff --git a/drivers/accel/amdxdna/aie2_message.c b/drivers/accel/amdxdna/aie2_message.c
index a1c546c3e81c..ccf87b1aa1cc 100644
--- a/drivers/accel/amdxdna/aie2_message.c
+++ b/drivers/accel/amdxdna/aie2_message.c
@@ -16,6 +16,7 @@
#include <linux/types.h>
#include <linux/xarray.h>
+#include "aie.h"
#include "aie2_msg_priv.h"
#include "aie2_pci.h"
#include "amdxdna_ctx.h"
@@ -24,38 +25,12 @@
#include "amdxdna_mailbox_helper.h"
#include "amdxdna_pci_drv.h"
-#define DECLARE_AIE2_MSG(name, op) \
- DECLARE_XDNA_MSG_COMMON(name, op, MAX_AIE2_STATUS_CODE)
-
#define EXEC_MSG_OPS(xdna) ((xdna)->dev_handle->exec_msg_ops)
-static int aie2_send_mgmt_msg_wait(struct amdxdna_dev_hdl *ndev,
- struct xdna_mailbox_msg *msg)
-{
- struct amdxdna_dev *xdna = ndev->xdna;
- struct xdna_notify *hdl = msg->handle;
- int ret;
-
- if (!ndev->mgmt_chann)
- return -ENODEV;
-
- ret = xdna_send_msg_wait(xdna, ndev->mgmt_chann, msg);
- if (ret == -ETIME)
- aie2_destroy_mgmt_chann(ndev);
-
- if (!ret && *hdl->status != AIE2_STATUS_SUCCESS) {
- XDNA_ERR(xdna, "command opcode 0x%x failed, status 0x%x",
- msg->opcode, *hdl->data);
- ret = -EINVAL;
- }
-
- return ret;
-}
-
void *aie2_alloc_msg_buffer(struct amdxdna_dev_hdl *ndev, u32 *size,
dma_addr_t *dma_addr)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
void *vaddr;
int order;
@@ -79,7 +54,7 @@ void *aie2_alloc_msg_buffer(struct amdxdna_dev_hdl *ndev, u32 *size,
void aie2_free_msg_buffer(struct amdxdna_dev_hdl *ndev, size_t size,
void *cpu_addr, dma_addr_t dma_addr)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
if (amdxdna_iova_on(xdna)) {
amdxdna_iommu_free(xdna, size, cpu_addr, dma_addr);
@@ -91,12 +66,12 @@ void aie2_free_msg_buffer(struct amdxdna_dev_hdl *ndev, size_t size,
int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev)
{
- DECLARE_AIE2_MSG(suspend, MSG_OP_SUSPEND);
+ DECLARE_AIE_MSG(suspend, MSG_OP_SUSPEND);
int ret;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
- XDNA_ERR(ndev->xdna, "Failed to suspend fw, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Failed to suspend fw, ret %d", ret);
return ret;
}
@@ -105,22 +80,22 @@ int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev)
int aie2_resume_fw(struct amdxdna_dev_hdl *ndev)
{
- DECLARE_AIE2_MSG(suspend, MSG_OP_RESUME);
+ DECLARE_AIE_MSG(suspend, MSG_OP_RESUME);
- return aie2_send_mgmt_msg_wait(ndev, &msg);
+ return aie_send_mgmt_msg_wait(&ndev->aie, &msg);
}
int aie2_set_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 value)
{
- DECLARE_AIE2_MSG(set_runtime_cfg, MSG_OP_SET_RUNTIME_CONFIG);
+ DECLARE_AIE_MSG(set_runtime_cfg, MSG_OP_SET_RUNTIME_CONFIG);
int ret;
req.type = type;
req.value = value;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
- XDNA_ERR(ndev->xdna, "Failed to set runtime config, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Failed to set runtime config, ret %d", ret);
return ret;
}
@@ -129,13 +104,13 @@ int aie2_set_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 value)
int aie2_get_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 *value)
{
- DECLARE_AIE2_MSG(get_runtime_cfg, MSG_OP_GET_RUNTIME_CONFIG);
+ DECLARE_AIE_MSG(get_runtime_cfg, MSG_OP_GET_RUNTIME_CONFIG);
int ret;
req.type = type;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
- XDNA_ERR(ndev->xdna, "Failed to get runtime config, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Failed to get runtime config, ret %d", ret);
return ret;
}
@@ -145,20 +120,20 @@ int aie2_get_runtime_cfg(struct amdxdna_dev_hdl *ndev, u32 type, u64 *value)
int aie2_assign_mgmt_pasid(struct amdxdna_dev_hdl *ndev, u16 pasid)
{
- DECLARE_AIE2_MSG(assign_mgmt_pasid, MSG_OP_ASSIGN_MGMT_PASID);
+ DECLARE_AIE_MSG(assign_mgmt_pasid, MSG_OP_ASSIGN_MGMT_PASID);
req.pasid = pasid;
- return aie2_send_mgmt_msg_wait(ndev, &msg);
+ return aie_send_mgmt_msg_wait(&ndev->aie, &msg);
}
int aie2_query_aie_version(struct amdxdna_dev_hdl *ndev, struct aie_version *version)
{
- DECLARE_AIE2_MSG(aie_version_info, MSG_OP_QUERY_AIE_VERSION);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(aie_version_info, MSG_OP_QUERY_AIE_VERSION);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
int ret;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
return ret;
@@ -173,10 +148,10 @@ int aie2_query_aie_version(struct amdxdna_dev_hdl *ndev, struct aie_version *ver
int aie2_query_aie_metadata(struct amdxdna_dev_hdl *ndev, struct aie_metadata *metadata)
{
- DECLARE_AIE2_MSG(aie_tile_info, MSG_OP_QUERY_AIE_TILE_INFO);
+ DECLARE_AIE_MSG(aie_tile_info, MSG_OP_QUERY_AIE_TILE_INFO);
int ret;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
return ret;
@@ -211,10 +186,10 @@ int aie2_query_aie_metadata(struct amdxdna_dev_hdl *ndev, struct aie_metadata *m
int aie2_query_firmware_version(struct amdxdna_dev_hdl *ndev,
struct amdxdna_fw_ver *fw_ver)
{
- DECLARE_AIE2_MSG(firmware_version, MSG_OP_GET_FIRMWARE_VERSION);
+ DECLARE_AIE_MSG(firmware_version, MSG_OP_GET_FIRMWARE_VERSION);
int ret;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
return ret;
@@ -228,12 +203,12 @@ int aie2_query_firmware_version(struct amdxdna_dev_hdl *ndev,
static int aie2_destroy_context_req(struct amdxdna_dev_hdl *ndev, u32 id)
{
- DECLARE_AIE2_MSG(destroy_ctx, MSG_OP_DESTROY_CONTEXT);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(destroy_ctx, MSG_OP_DESTROY_CONTEXT);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
int ret;
req.context_id = id;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret && ret != -ENODEV)
XDNA_WARN(xdna, "Destroy context failed, ret %d", ret);
else if (ret == -ENODEV)
@@ -245,7 +220,7 @@ static int aie2_destroy_context_req(struct amdxdna_dev_hdl *ndev, u32 id)
static u32 aie2_get_context_priority(struct amdxdna_dev_hdl *ndev,
struct amdxdna_hwctx *hwctx)
{
- if (!AIE2_FEATURE_ON(ndev, AIE2_PREEMPT))
+ if (!AIE_FEATURE_ON(&ndev->aie, AIE2_PREEMPT))
return PRIORITY_HIGH;
switch (hwctx->qos.priority) {
@@ -264,8 +239,8 @@ static u32 aie2_get_context_priority(struct amdxdna_dev_hdl *ndev,
int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx)
{
- DECLARE_AIE2_MSG(create_ctx, MSG_OP_CREATE_CONTEXT);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(create_ctx, MSG_OP_CREATE_CONTEXT);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
struct xdna_mailbox_chann_res x2i;
struct xdna_mailbox_chann_res i2x;
struct cq_pair *cq_pair;
@@ -280,7 +255,7 @@ int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwct
req.pasid = amdxdna_pasid_on(hwctx->client) ? hwctx->client->pasid : 0;
req.context_priority = aie2_get_context_priority(ndev, hwctx);
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
return ret;
@@ -344,7 +319,7 @@ int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwct
int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwctx)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
int ret;
if (!hwctx->priv->mbox_chann)
@@ -363,14 +338,14 @@ int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx *hwc
int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 addr, u64 size)
{
- DECLARE_AIE2_MSG(map_host_buffer, MSG_OP_MAP_HOST_BUFFER);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(map_host_buffer, MSG_OP_MAP_HOST_BUFFER);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
int ret;
req.context_id = context_id;
req.buf_addr = addr;
req.buf_size = size;
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret)
return ret;
@@ -392,8 +367,8 @@ static int amdxdna_hwctx_col_map(struct amdxdna_hwctx *hwctx, void *arg)
int aie2_query_status(struct amdxdna_dev_hdl *ndev, char __user *buf,
u32 size, u32 *cols_filled)
{
- DECLARE_AIE2_MSG(aie_column_info, MSG_OP_QUERY_COL_STATUS);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(aie_column_info, MSG_OP_QUERY_COL_STATUS);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
u32 buf_sz = size, aie_bitmap = 0;
struct amdxdna_client *client;
dma_addr_t dma_addr;
@@ -415,7 +390,7 @@ int aie2_query_status(struct amdxdna_dev_hdl *ndev, char __user *buf,
req.aie_bitmap = aie_bitmap;
drm_clflush_virt_range(buff_addr, size); /* device can access */
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
XDNA_ERR(xdna, "Error during NPU query, status %d", ret);
goto fail;
@@ -446,8 +421,8 @@ int aie2_query_telemetry(struct amdxdna_dev_hdl *ndev,
char __user *buf, u32 size,
struct amdxdna_drm_query_telemetry_header *header)
{
- DECLARE_AIE2_MSG(get_telemetry, MSG_OP_GET_TELEMETRY);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(get_telemetry, MSG_OP_GET_TELEMETRY);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
dma_addr_t dma_addr;
u32 buf_sz = size;
u8 *addr;
@@ -465,7 +440,7 @@ int aie2_query_telemetry(struct amdxdna_dev_hdl *ndev,
req.type = header->type;
drm_clflush_virt_range(addr, size); /* device can access */
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
XDNA_ERR(xdna, "Query telemetry failed, status %d", ret);
goto free_buf;
@@ -506,8 +481,8 @@ int aie2_register_asyn_event_msg(struct amdxdna_dev_hdl *ndev, dma_addr_t addr,
req.buf_addr = addr;
req.buf_size = size;
- XDNA_DBG(ndev->xdna, "Register addr 0x%llx size 0x%x", addr, size);
- return xdna_mailbox_send_msg(ndev->mgmt_chann, &msg, TX_TIMEOUT);
+ XDNA_DBG(ndev->aie.xdna, "Register addr 0x%llx size 0x%x", addr, size);
+ return xdna_mailbox_send_msg(ndev->aie.mgmt_chann, &msg, TX_TIMEOUT);
}
int aie2_config_cu(struct amdxdna_hwctx *hwctx,
@@ -866,7 +841,6 @@ static int aie2_init_exec_req(void *req, struct amdxdna_gem_obj *cmd_abo,
int ret;
u32 op;
-
op = amdxdna_cmd_get_op(cmd_abo);
switch (op) {
case ERT_START_CU:
@@ -915,12 +889,12 @@ aie2_cmdlist_fill_slot(void *slot, struct amdxdna_gem_obj *cmd_abo,
ret = EXEC_MSG_OPS(xdna)->fill_dpu_slot(cmd_abo, slot, size);
break;
case ERT_START_NPU_PREEMPT:
- if (!AIE2_FEATURE_ON(xdna->dev_handle, AIE2_PREEMPT))
+ if (!AIE_FEATURE_ON(&xdna->dev_handle->aie, AIE2_PREEMPT))
return -EOPNOTSUPP;
ret = EXEC_MSG_OPS(xdna)->fill_preempt_slot(cmd_abo, slot, size);
break;
case ERT_START_NPU_PREEMPT_ELF:
- if (!AIE2_FEATURE_ON(xdna->dev_handle, AIE2_PREEMPT))
+ if (!AIE_FEATURE_ON(&xdna->dev_handle->aie, AIE2_PREEMPT))
return -EOPNOTSUPP;
ret = EXEC_MSG_OPS(xdna)->fill_elf_slot(cmd_abo, slot, size);
break;
@@ -935,26 +909,12 @@ aie2_cmdlist_fill_slot(void *slot, struct amdxdna_gem_obj *cmd_abo,
void aie2_msg_init(struct amdxdna_dev_hdl *ndev)
{
- if (AIE2_FEATURE_ON(ndev, AIE2_NPU_COMMAND))
+ if (AIE_FEATURE_ON(&ndev->aie, AIE2_NPU_COMMAND))
ndev->exec_msg_ops = &npu_exec_message_ops;
else
ndev->exec_msg_ops = &legacy_exec_message_ops;
}
-void aie2_destroy_mgmt_chann(struct amdxdna_dev_hdl *ndev)
-{
- struct amdxdna_dev *xdna = ndev->xdna;
-
- drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
-
- if (!ndev->mgmt_chann)
- return;
-
- xdna_mailbox_stop_channel(ndev->mgmt_chann);
- xdna_mailbox_free_channel(ndev->mgmt_chann);
- ndev->mgmt_chann = NULL;
-}
-
static inline struct amdxdna_gem_obj *
aie2_cmdlist_get_cmd_buf(struct amdxdna_sched_job *job)
{
@@ -1199,14 +1159,14 @@ int aie2_config_debug_bo(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *
int aie2_query_app_health(struct amdxdna_dev_hdl *ndev, u32 context_id,
struct app_health_report *report)
{
- DECLARE_AIE2_MSG(get_app_health, MSG_OP_GET_APP_HEALTH);
- struct amdxdna_dev *xdna = ndev->xdna;
+ DECLARE_AIE_MSG(get_app_health, MSG_OP_GET_APP_HEALTH);
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
struct app_health_report *buf;
dma_addr_t dma_addr;
u32 buf_size;
int ret;
- if (!AIE2_FEATURE_ON(ndev, AIE2_APP_HEALTH)) {
+ if (!AIE_FEATURE_ON(&ndev->aie, AIE2_APP_HEALTH)) {
XDNA_DBG(xdna, "App health feature not supported");
return -EOPNOTSUPP;
}
@@ -1223,7 +1183,7 @@ int aie2_query_app_health(struct amdxdna_dev_hdl *ndev, u32 context_id,
req.buf_size = buf_size;
drm_clflush_virt_range(buf, sizeof(*report));
- ret = aie2_send_mgmt_msg_wait(ndev, &msg);
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
if (ret) {
XDNA_ERR(xdna, "Get app health failed, ret %d status 0x%x", ret, resp.status);
goto free_buf;
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index f1ac4e00bd9f..03bac963516d 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -60,45 +60,6 @@ struct mgmt_mbox_chann_info {
__u32 rsvd[4];
};
-static int aie2_check_protocol(struct amdxdna_dev_hdl *ndev, u32 fw_major, u32 fw_minor)
-{
- const struct aie2_fw_feature_tbl *feature;
- bool found = false;
-
- for (feature = ndev->priv->fw_feature_tbl; feature->major; feature++) {
- if (feature->major != fw_major)
- continue;
- if (fw_minor < feature->min_minor)
- continue;
- if (feature->max_minor > 0 && fw_minor > feature->max_minor)
- continue;
-
- ndev->feature_mask |= feature->features;
-
- /* firmware version matches one of the driver support entry */
- found = true;
- }
-
- return found ? 0 : -EOPNOTSUPP;
-}
-
-static void aie2_dump_chann_info_debug(struct amdxdna_dev_hdl *ndev)
-{
- struct amdxdna_dev *xdna = ndev->xdna;
-
- XDNA_DBG(xdna, "i2x tail 0x%x", ndev->mgmt_i2x.mb_tail_ptr_reg);
- XDNA_DBG(xdna, "i2x head 0x%x", ndev->mgmt_i2x.mb_head_ptr_reg);
- XDNA_DBG(xdna, "i2x ringbuf 0x%x", ndev->mgmt_i2x.rb_start_addr);
- XDNA_DBG(xdna, "i2x rsize 0x%x", ndev->mgmt_i2x.rb_size);
- XDNA_DBG(xdna, "x2i tail 0x%x", ndev->mgmt_x2i.mb_tail_ptr_reg);
- XDNA_DBG(xdna, "x2i head 0x%x", ndev->mgmt_x2i.mb_head_ptr_reg);
- XDNA_DBG(xdna, "x2i ringbuf 0x%x", ndev->mgmt_x2i.rb_start_addr);
- XDNA_DBG(xdna, "x2i rsize 0x%x", ndev->mgmt_x2i.rb_size);
- XDNA_DBG(xdna, "x2i chann index 0x%x", ndev->mgmt_chan_idx);
- XDNA_DBG(xdna, "mailbox protocol major 0x%x", ndev->mgmt_prot_major);
- XDNA_DBG(xdna, "mailbox protocol minor 0x%x", ndev->mgmt_prot_minor);
-}
-
static int aie2_get_mgmt_chann_info(struct amdxdna_dev_hdl *ndev)
{
struct mgmt_mbox_chann_info info_regs;
@@ -128,13 +89,13 @@ static int aie2_get_mgmt_chann_info(struct amdxdna_dev_hdl *ndev)
reg[i] = readl(ndev->sram_base + off + i * sizeof(u32));
if (info_regs.magic != MGMT_MBOX_MAGIC) {
- XDNA_ERR(ndev->xdna, "Invalid mbox magic 0x%x", info_regs.magic);
+ XDNA_ERR(ndev->aie.xdna, "Invalid mbox magic 0x%x", info_regs.magic);
ret = -EINVAL;
goto done;
}
- i2x = &ndev->mgmt_i2x;
- x2i = &ndev->mgmt_x2i;
+ i2x = &ndev->aie.mgmt_i2x;
+ x2i = &ndev->aie.mgmt_x2i;
i2x->mb_head_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.i2x_head);
i2x->mb_tail_ptr_reg = AIE2_MBOX_OFF(ndev, info_regs.i2x_tail);
@@ -146,14 +107,15 @@ static int aie2_get_mgmt_chann_info(struct amdxdna_dev_hdl *ndev)
x2i->rb_start_addr = AIE2_SRAM_OFF(ndev, info_regs.x2i_buf);
x2i->rb_size = info_regs.x2i_buf_sz;
- ndev->mgmt_chan_idx = info_regs.msi_id;
- ndev->mgmt_prot_major = info_regs.prot_major;
- ndev->mgmt_prot_minor = info_regs.prot_minor;
+ ndev->aie.mgmt_chan_idx = info_regs.msi_id;
+ ndev->aie.mgmt_prot_major = info_regs.prot_major;
+ ndev->aie.mgmt_prot_minor = info_regs.prot_minor;
- ret = aie2_check_protocol(ndev, ndev->mgmt_prot_major, ndev->mgmt_prot_minor);
+ ret = aie_check_protocol(&ndev->aie, ndev->aie.mgmt_prot_major,
+ ndev->aie.mgmt_prot_minor);
done:
- aie2_dump_chann_info_debug(ndev);
+ aie_dump_mgmt_chann_debug(&ndev->aie);
/* Must clear address at FW_ALIVE_OFF */
writel(0, SRAM_GET_ADDR(ndev, FW_ALIVE_OFF));
@@ -173,13 +135,14 @@ int aie2_runtime_cfg(struct amdxdna_dev_hdl *ndev,
continue;
if (cfg->feature_mask &&
- bitmap_subset(&cfg->feature_mask, &ndev->feature_mask, AIE2_FEATURE_MAX))
+ bitmap_subset(&cfg->feature_mask, &ndev->aie.feature_mask,
+ AIE2_FEATURE_MAX))
continue;
value = val ? *val : cfg->value;
ret = aie2_set_runtime_cfg(ndev, cfg->type, value);
if (ret) {
- XDNA_ERR(ndev->xdna, "Set type %d value %d failed",
+ XDNA_ERR(ndev->aie.xdna, "Set type %d value %d failed",
cfg->type, value);
return ret;
}
@@ -194,13 +157,13 @@ static int aie2_xdna_reset(struct amdxdna_dev_hdl *ndev)
ret = aie2_suspend_fw(ndev);
if (ret) {
- XDNA_ERR(ndev->xdna, "Suspend firmware failed");
+ XDNA_ERR(ndev->aie.xdna, "Suspend firmware failed");
return ret;
}
ret = aie2_resume_fw(ndev);
if (ret) {
- XDNA_ERR(ndev->xdna, "Resume firmware failed");
+ XDNA_ERR(ndev->aie.xdna, "Resume firmware failed");
return ret;
}
@@ -213,19 +176,19 @@ static int aie2_mgmt_fw_init(struct amdxdna_dev_hdl *ndev)
ret = aie2_runtime_cfg(ndev, AIE2_RT_CFG_INIT, NULL);
if (ret) {
- XDNA_ERR(ndev->xdna, "Runtime config failed");
+ XDNA_ERR(ndev->aie.xdna, "Runtime config failed");
return ret;
}
ret = aie2_assign_mgmt_pasid(ndev, 0);
if (ret) {
- XDNA_ERR(ndev->xdna, "Can not assign PASID");
+ XDNA_ERR(ndev->aie.xdna, "Can not assign PASID");
return ret;
}
ret = aie2_xdna_reset(ndev);
if (ret) {
- XDNA_ERR(ndev->xdna, "Reset firmware failed");
+ XDNA_ERR(ndev->aie.xdna, "Reset firmware failed");
return ret;
}
@@ -236,21 +199,21 @@ static int aie2_mgmt_fw_query(struct amdxdna_dev_hdl *ndev)
{
int ret;
- ret = aie2_query_firmware_version(ndev, &ndev->xdna->fw_ver);
+ ret = aie2_query_firmware_version(ndev, &ndev->aie.xdna->fw_ver);
if (ret) {
- XDNA_ERR(ndev->xdna, "query firmware version failed");
+ XDNA_ERR(ndev->aie.xdna, "query firmware version failed");
return ret;
}
ret = aie2_query_aie_version(ndev, &ndev->version);
if (ret) {
- XDNA_ERR(ndev->xdna, "Query AIE version failed");
+ XDNA_ERR(ndev->aie.xdna, "Query AIE version failed");
return ret;
}
ret = aie2_query_aie_metadata(ndev, &ndev->metadata);
if (ret) {
- XDNA_ERR(ndev->xdna, "Query AIE metadata failed");
+ XDNA_ERR(ndev->aie.xdna, "Query AIE metadata failed");
return ret;
}
@@ -262,8 +225,8 @@ static int aie2_mgmt_fw_query(struct amdxdna_dev_hdl *ndev)
static void aie2_mgmt_fw_fini(struct amdxdna_dev_hdl *ndev)
{
if (aie2_suspend_fw(ndev))
- XDNA_ERR(ndev->xdna, "Suspend_fw failed");
- XDNA_DBG(ndev->xdna, "Firmware suspended");
+ XDNA_ERR(ndev->aie.xdna, "Suspend_fw failed");
+ XDNA_DBG(ndev->aie.xdna, "Firmware suspended");
}
static int aie2_xrs_load(void *cb_arg, struct xrs_action_load *action)
@@ -331,7 +294,7 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
aie2_runtime_cfg(ndev, AIE2_RT_CFG_CLK_GATING, NULL);
aie2_mgmt_fw_fini(ndev);
- aie2_destroy_mgmt_chann(ndev);
+ aie_destroy_chann(&ndev->aie, &ndev->aie.mgmt_chann);
drmm_kfree(&xdna->ddev, ndev->mbox);
ndev->mbox = NULL;
aie2_psp_stop(ndev->psp_hdl);
@@ -374,8 +337,8 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto disable_dev;
}
- ndev->mgmt_chann = xdna_mailbox_alloc_channel(ndev->mbox);
- if (!ndev->mgmt_chann) {
+ ndev->aie.mgmt_chann = xdna_mailbox_alloc_channel(ndev->mbox);
+ if (!ndev->aie.mgmt_chann) {
XDNA_ERR(xdna, "failed to alloc channel");
ret = -ENODEV;
goto disable_dev;
@@ -399,17 +362,17 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto stop_psp;
}
- mgmt_mb_irq = pci_irq_vector(pdev, ndev->mgmt_chan_idx);
+ mgmt_mb_irq = pci_irq_vector(pdev, ndev->aie.mgmt_chan_idx);
if (mgmt_mb_irq < 0) {
ret = mgmt_mb_irq;
XDNA_ERR(xdna, "failed to alloc irq vector, ret %d", ret);
goto stop_psp;
}
- xdna_mailbox_intr_reg = ndev->mgmt_i2x.mb_head_ptr_reg + 4;
- ret = xdna_mailbox_start_channel(ndev->mgmt_chann,
- &ndev->mgmt_x2i,
- &ndev->mgmt_i2x,
+ xdna_mailbox_intr_reg = ndev->aie.mgmt_i2x.mb_head_ptr_reg + 4;
+ ret = xdna_mailbox_start_channel(ndev->aie.mgmt_chann,
+ &ndev->aie.mgmt_x2i,
+ &ndev->aie.mgmt_i2x,
xdna_mailbox_intr_reg,
mgmt_mb_irq);
if (ret) {
@@ -448,14 +411,14 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
stop_fw:
aie2_suspend_fw(ndev);
- xdna_mailbox_stop_channel(ndev->mgmt_chann);
+ xdna_mailbox_stop_channel(ndev->aie.mgmt_chann);
stop_psp:
aie2_psp_stop(ndev->psp_hdl);
fini_smu:
aie2_smu_fini(ndev);
free_channel:
- xdna_mailbox_free_channel(ndev->mgmt_chann);
- ndev->mgmt_chann = NULL;
+ xdna_mailbox_free_channel(ndev->aie.mgmt_chann);
+ ndev->aie.mgmt_chann = NULL;
disable_dev:
pci_disable_device(pdev);
@@ -516,7 +479,7 @@ static int aie2_init(struct amdxdna_dev *xdna)
return -ENOMEM;
ndev->priv = xdna->dev_info->dev_priv;
- ndev->xdna = xdna;
+ ndev->aie.xdna = xdna;
for (i = 0; i < ARRAY_SIZE(npu_fw); i++) {
fw_full_path = kasprintf(GFP_KERNEL, "%s%s", ndev->priv->fw_path, npu_fw[i]);
diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_pci.h
index efcf4be035f0..90fb0aafaf40 100644
--- a/drivers/accel/amdxdna/aie2_pci.h
+++ b/drivers/accel/amdxdna/aie2_pci.h
@@ -10,6 +10,7 @@
#include <linux/limits.h>
#include <linux/semaphore.h>
+#include "aie.h"
#include "aie2_msg_priv.h"
#include "amdxdna_mailbox.h"
@@ -20,7 +21,7 @@
#define AIE2_DEVM_BASE 0x4000000
#define AIE2_DEVM_SIZE SZ_64M
-#define NDEV2PDEV(ndev) (to_pci_dev((ndev)->xdna->ddev.dev))
+#define NDEV2PDEV(ndev) (to_pci_dev((ndev)->aie.xdna->ddev.dev))
#define AIE2_SRAM_OFF(ndev, addr) ((addr) - (ndev)->priv->sram_dev_addr)
#define AIE2_MBOX_OFF(ndev, addr) ((addr) - (ndev)->priv->mbox_dev_addr)
@@ -45,7 +46,7 @@
({ \
typeof(ndev) _ndev = (ndev); \
((_ndev)->priv->mbox_size) ? (_ndev)->priv->mbox_size : \
- pci_resource_len(NDEV2PDEV(_ndev), (_ndev)->xdna->dev_info->mbox_bar); \
+ pci_resource_len(NDEV2PDEV(_ndev), (_ndev)->aie.xdna->dev_info->mbox_bar); \
})
#if IS_ENABLED(CONFIG_AMD_PMF)
@@ -203,23 +204,16 @@ struct aie2_exec_msg_ops {
};
struct amdxdna_dev_hdl {
- struct amdxdna_dev *xdna;
+ struct aie_device aie;
const struct amdxdna_dev_priv *priv;
void __iomem *sram_base;
void __iomem *smu_base;
void __iomem *mbox_base;
struct psp_device *psp_hdl;
- struct xdna_mailbox_chann_res mgmt_x2i;
- struct xdna_mailbox_chann_res mgmt_i2x;
- u32 mgmt_chan_idx;
- u32 mgmt_prot_major;
- u32 mgmt_prot_minor;
-
u32 total_col;
struct aie_version version;
struct aie_metadata metadata;
- unsigned long feature_mask;
struct aie2_exec_msg_ops *exec_msg_ops;
/* power management and clock*/
@@ -237,7 +231,6 @@ struct amdxdna_dev_hdl {
/* Mailbox and the management channel */
struct mailbox *mbox;
- struct mailbox_channel *mgmt_chann;
struct async_events *async_events;
enum aie2_dev_status dev_status;
@@ -266,21 +259,12 @@ enum aie2_fw_feature {
AIE2_FEATURE_MAX
};
-struct aie2_fw_feature_tbl {
- u64 features;
- u32 major;
- u32 max_minor;
- u32 min_minor;
-};
-
#define AIE2_ALL_FEATURES GENMASK_ULL(AIE2_FEATURE_MAX - 1, AIE2_NPU_COMMAND)
-#define AIE2_FEATURE_ON(ndev, feature) test_bit(feature, &(ndev)->feature_mask)
struct amdxdna_dev_priv {
const char *fw_path;
const struct rt_config *rt_config;
const struct dpm_clk_freq *dpm_clk_tbl;
- const struct aie2_fw_feature_tbl *fw_feature_tbl;
#define COL_ALIGN_NONE 0
#define COL_ALIGN_NATURE 1
@@ -306,7 +290,7 @@ extern const struct dpm_clk_freq npu1_dpm_clk_table[];
extern const struct dpm_clk_freq npu4_dpm_clk_table[];
extern const struct rt_config npu1_default_rt_cfg[];
extern const struct rt_config npu4_default_rt_cfg[];
-extern const struct aie2_fw_feature_tbl npu4_fw_feature_table[];
+extern const struct amdxdna_fw_feature_tbl npu4_fw_feature_table[];
/* aie2_smu.c */
int aie2_smu_init(struct amdxdna_dev_hdl *ndev);
diff --git a/drivers/accel/amdxdna/aie2_pm.c b/drivers/accel/amdxdna/aie2_pm.c
index 29bd4403a94d..5ec6728d04fd 100644
--- a/drivers/accel/amdxdna/aie2_pm.c
+++ b/drivers/accel/amdxdna/aie2_pm.c
@@ -31,14 +31,14 @@ int aie2_pm_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
{
int ret;
- ret = amdxdna_pm_resume_get_locked(ndev->xdna);
+ ret = amdxdna_pm_resume_get_locked(ndev->aie.xdna);
if (ret)
return ret;
ret = ndev->priv->hw_ops.set_dpm(ndev, dpm_level);
if (!ret)
ndev->dpm_level = dpm_level;
- amdxdna_pm_suspend_put(ndev->xdna);
+ amdxdna_pm_suspend_put(ndev->aie.xdna);
return ret;
}
@@ -81,7 +81,7 @@ int aie2_pm_init(struct amdxdna_dev_hdl *ndev)
int aie2_pm_set_mode(struct amdxdna_dev_hdl *ndev, enum amdxdna_power_mode_type target)
{
- struct amdxdna_dev *xdna = ndev->xdna;
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
u32 clk_gating, dpm_level;
int ret;
diff --git a/drivers/accel/amdxdna/aie2_smu.c b/drivers/accel/amdxdna/aie2_smu.c
index d8c31924e501..727637dac3a8 100644
--- a/drivers/accel/amdxdna/aie2_smu.c
+++ b/drivers/accel/amdxdna/aie2_smu.c
@@ -46,7 +46,7 @@ static int aie2_smu_exec(struct amdxdna_dev_hdl *ndev, u32 reg_cmd,
ret = readx_poll_timeout(readl, SMU_REG(ndev, SMU_RESP_REG), resp,
resp, AIE2_INTERVAL, AIE2_TIMEOUT);
if (ret) {
- XDNA_ERR(ndev->xdna, "smu cmd %d timed out", reg_cmd);
+ XDNA_ERR(ndev->aie.xdna, "smu cmd %d timed out", reg_cmd);
return ret;
}
@@ -54,7 +54,7 @@ static int aie2_smu_exec(struct amdxdna_dev_hdl *ndev, u32 reg_cmd,
*out = readl(SMU_REG(ndev, SMU_OUT_REG));
if (resp != SMU_RESULT_OK) {
- XDNA_ERR(ndev->xdna, "smu cmd %d failed, 0x%x", reg_cmd, resp);
+ XDNA_ERR(ndev->aie.xdna, "smu cmd %d failed, 0x%x", reg_cmd, resp);
return -EINVAL;
}
@@ -69,7 +69,7 @@ int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
ret = aie2_smu_exec(ndev, AIE2_SMU_SET_MPNPUCLK_FREQ,
ndev->priv->dpm_clk_tbl[dpm_level].npuclk, &freq);
if (ret) {
- XDNA_ERR(ndev->xdna, "Set npu clock to %d failed, ret %d\n",
+ XDNA_ERR(ndev->aie.xdna, "Set npu clock to %d failed, ret %d\n",
ndev->priv->dpm_clk_tbl[dpm_level].npuclk, ret);
return ret;
}
@@ -78,7 +78,7 @@ int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HCLK_FREQ,
ndev->priv->dpm_clk_tbl[dpm_level].hclk, &freq);
if (ret) {
- XDNA_ERR(ndev->xdna, "Set h clock to %d failed, ret %d\n",
+ XDNA_ERR(ndev->aie.xdna, "Set h clock to %d failed, ret %d\n",
ndev->priv->dpm_clk_tbl[dpm_level].hclk, ret);
return ret;
}
@@ -87,7 +87,7 @@ int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
ndev->max_tops = 2 * ndev->total_col;
ndev->curr_tops = ndev->max_tops * freq / 1028;
- XDNA_DBG(ndev->xdna, "MP-NPU clock %d, H clock %d\n",
+ XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
ndev->npuclk_freq, ndev->hclk_freq);
return 0;
@@ -99,14 +99,14 @@ int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HARD_DPMLEVEL, dpm_level, NULL);
if (ret) {
- XDNA_ERR(ndev->xdna, "Set hard dpm level %d failed, ret %d ",
+ XDNA_ERR(ndev->aie.xdna, "Set hard dpm level %d failed, ret %d ",
dpm_level, ret);
return ret;
}
ret = aie2_smu_exec(ndev, AIE2_SMU_SET_SOFT_DPMLEVEL, dpm_level, NULL);
if (ret) {
- XDNA_ERR(ndev->xdna, "Set soft dpm level %d failed, ret %d",
+ XDNA_ERR(ndev->aie.xdna, "Set soft dpm level %d failed, ret %d",
dpm_level, ret);
return ret;
}
@@ -116,7 +116,7 @@ int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
ndev->max_tops = NPU4_DPM_TOPS(ndev, ndev->max_dpm_level);
ndev->curr_tops = NPU4_DPM_TOPS(ndev, dpm_level);
- XDNA_DBG(ndev->xdna, "MP-NPU clock %d, H clock %d\n",
+ XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
ndev->npuclk_freq, ndev->hclk_freq);
return 0;
@@ -132,13 +132,13 @@ int aie2_smu_init(struct amdxdna_dev_hdl *ndev)
*/
ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
if (ret) {
- XDNA_ERR(ndev->xdna, "Access power failed, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Access power failed, ret %d", ret);
return ret;
}
ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_ON, 0, NULL);
if (ret) {
- XDNA_ERR(ndev->xdna, "Power on failed, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Power on failed, ret %d", ret);
return ret;
}
@@ -152,5 +152,5 @@ void aie2_smu_fini(struct amdxdna_dev_hdl *ndev)
ndev->priv->hw_ops.set_dpm(ndev, 0);
ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
if (ret)
- XDNA_ERR(ndev->xdna, "Power off failed, ret %d", ret);
+ XDNA_ERR(ndev->aie.xdna, "Power off failed, ret %d", ret);
}
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdna/amdxdna_pci_drv.h
index 0661749917d6..5e0bf565a1ae 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.h
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h
@@ -66,6 +66,13 @@ struct amdxdna_dev_ops {
int (*get_array)(struct amdxdna_client *client, struct amdxdna_drm_get_array *args);
};
+struct amdxdna_fw_feature_tbl {
+ u64 features;
+ u32 major;
+ u32 max_minor;
+ u32 min_minor;
+};
+
/*
* struct amdxdna_dev_info - Device hardware information
* Record device static information, like reg, mbox, PSP, SMU bar index
@@ -83,6 +90,7 @@ struct amdxdna_dev_info {
size_t dev_mem_size;
char *vbnv;
const struct amdxdna_dev_priv *dev_priv;
+ const struct amdxdna_fw_feature_tbl *fw_feature_tbl;
const struct amdxdna_dev_ops *ops;
};
diff --git a/drivers/accel/amdxdna/npu1_regs.c b/drivers/accel/amdxdna/npu1_regs.c
index 1320e924e548..2ea7568a2e99 100644
--- a/drivers/accel/amdxdna/npu1_regs.c
+++ b/drivers/accel/amdxdna/npu1_regs.c
@@ -65,7 +65,7 @@ const struct dpm_clk_freq npu1_dpm_clk_table[] = {
{ 0 }
};
-static const struct aie2_fw_feature_tbl npu1_fw_feature_table[] = {
+static const struct amdxdna_fw_feature_tbl npu1_fw_feature_table[] = {
{ .major = 5, .min_minor = 7 },
{ .features = BIT_U64(AIE2_NPU_COMMAND), .major = 5, .min_minor = 8 },
{ 0 }
@@ -75,7 +75,6 @@ static const struct amdxdna_dev_priv npu1_dev_priv = {
.fw_path = "amdnpu/1502_00/",
.rt_config = npu1_default_rt_cfg,
.dpm_clk_tbl = npu1_dpm_clk_table,
- .fw_feature_tbl = npu1_fw_feature_table,
.col_align = COL_ALIGN_NONE,
.mbox_dev_addr = NPU1_MBOX_BAR_BASE,
.mbox_size = 0, /* Use BAR size */
@@ -120,5 +119,6 @@ const struct amdxdna_dev_info dev_npu1_info = {
.vbnv = "RyzenAI-npu1",
.device_type = AMDXDNA_DEV_TYPE_KMQ,
.dev_priv = &npu1_dev_priv,
+ .fw_feature_tbl = npu1_fw_feature_table,
.ops = &aie2_ops,
};
diff --git a/drivers/accel/amdxdna/npu4_regs.c b/drivers/accel/amdxdna/npu4_regs.c
index 619bff042e52..9689c56c83be 100644
--- a/drivers/accel/amdxdna/npu4_regs.c
+++ b/drivers/accel/amdxdna/npu4_regs.c
@@ -88,7 +88,7 @@ const struct dpm_clk_freq npu4_dpm_clk_table[] = {
{ 0 }
};
-const struct aie2_fw_feature_tbl npu4_fw_feature_table[] = {
+const struct amdxdna_fw_feature_tbl npu4_fw_feature_table[] = {
{ .major = 6, .min_minor = 12 },
{ .features = BIT_U64(AIE2_NPU_COMMAND), .major = 6, .min_minor = 15 },
{ .features = BIT_U64(AIE2_PREEMPT), .major = 6, .min_minor = 12 },
@@ -102,7 +102,6 @@ static const struct amdxdna_dev_priv npu4_dev_priv = {
.fw_path = "amdnpu/17f0_10/",
.rt_config = npu4_default_rt_cfg,
.dpm_clk_tbl = npu4_dpm_clk_table,
- .fw_feature_tbl = npu4_fw_feature_table,
.col_align = COL_ALIGN_NATURE,
.mbox_dev_addr = NPU4_MBOX_BAR_BASE,
.mbox_size = 0, /* Use BAR size */
@@ -147,5 +146,6 @@ const struct amdxdna_dev_info dev_npu4_info = {
.vbnv = "RyzenAI-npu4",
.device_type = AMDXDNA_DEV_TYPE_KMQ,
.dev_priv = &npu4_dev_priv,
+ .fw_feature_tbl = npu4_fw_feature_table,
.ops = &aie2_ops, /* NPU4 can share NPU1's callback */
};
diff --git a/drivers/accel/amdxdna/npu5_regs.c b/drivers/accel/amdxdna/npu5_regs.c
index c0ac5daf32ee..98ee8780f3f5 100644
--- a/drivers/accel/amdxdna/npu5_regs.c
+++ b/drivers/accel/amdxdna/npu5_regs.c
@@ -66,7 +66,6 @@ static const struct amdxdna_dev_priv npu5_dev_priv = {
.fw_path = "amdnpu/17f0_11/",
.rt_config = npu4_default_rt_cfg,
.dpm_clk_tbl = npu4_dpm_clk_table,
- .fw_feature_tbl = npu4_fw_feature_table,
.col_align = COL_ALIGN_NATURE,
.mbox_dev_addr = NPU5_MBOX_BAR_BASE,
.mbox_size = 0, /* Use BAR size */
@@ -111,5 +110,6 @@ const struct amdxdna_dev_info dev_npu5_info = {
.vbnv = "RyzenAI-npu5",
.device_type = AMDXDNA_DEV_TYPE_KMQ,
.dev_priv = &npu5_dev_priv,
+ .fw_feature_tbl = npu4_fw_feature_table,
.ops = &aie2_ops,
};
diff --git a/drivers/accel/amdxdna/npu6_regs.c b/drivers/accel/amdxdna/npu6_regs.c
index ce591ed0d483..31400cca5ec4 100644
--- a/drivers/accel/amdxdna/npu6_regs.c
+++ b/drivers/accel/amdxdna/npu6_regs.c
@@ -66,7 +66,6 @@ static const struct amdxdna_dev_priv npu6_dev_priv = {
.fw_path = "amdnpu/17f0_10/",
.rt_config = npu4_default_rt_cfg,
.dpm_clk_tbl = npu4_dpm_clk_table,
- .fw_feature_tbl = npu4_fw_feature_table,
.col_align = COL_ALIGN_NATURE,
.mbox_dev_addr = NPU6_MBOX_BAR_BASE,
.mbox_size = 0, /* Use BAR size */
@@ -112,5 +111,6 @@ const struct amdxdna_dev_info dev_npu6_info = {
.vbnv = "RyzenAI-npu6",
.device_type = AMDXDNA_DEV_TYPE_KMQ,
.dev_priv = &npu6_dev_priv,
+ .fw_feature_tbl = npu4_fw_feature_table,
.ops = &aie2_ops,
};
--
2.34.1
* [PATCH V1 2/6] accel/amdxdna: Add basic support for AIE4 devices
From: Lizhi Hou @ 2026-03-30 16:37 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue, Lizhi Hou
From: David Zhang <yidong.zhang@amd.com>
Add initial support for AIE4 devices (PCI device IDs 0x17F2 and 0x1B0B),
including:
  - Device initialization
  - Basic mailbox communication
  - SR-IOV enablement
This lays the groundwork for full AIE4 support.
Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: David Zhang <yidong.zhang@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
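Note on the SR-IOV ordering in aie4_sriov.c: the start path asks the
firmware to create the VFs before calling pci_enable_sriov() and destroys
them again if the PCI step fails. The stop path refuses to run while VFs
are assigned, then disables SR-IOV before destroying the firmware VFs.
The block below is only a user-space sketch of that ordering. struct
fake_pf and the fw_*/pci_enable helpers are hypothetical stand-ins for
the firmware mailbox calls and the PCI core, not driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of PF state; none of this is real driver code. */
struct fake_pf {
	int fw_vfs;            /* VFs the firmware has created */
	int pci_vfs;           /* VFs enabled in the PCI core */
	bool vfs_assigned;     /* models pci_vfs_assigned() != 0 */
	bool pci_enable_fails; /* force the PCI enable step to fail */
};

/* Stand-ins for the CREATE_VFS / DESTROY_VFS mailbox messages. */
static int fw_create_vfs(struct fake_pf *pf, int n) { pf->fw_vfs = n; return 0; }
static int fw_destroy_vfs(struct fake_pf *pf) { pf->fw_vfs = 0; return 0; }

/* Stand-in for pci_enable_sriov(). */
static int pci_enable(struct fake_pf *pf, int n)
{
	if (pf->pci_enable_fails)
		return -ENOMEM;
	pf->pci_vfs = n;
	return 0;
}

/* Firmware first, then PCI; undo the firmware step if PCI enable fails. */
static int sriov_start(struct fake_pf *pf, int num_vfs)
{
	int ret = fw_create_vfs(pf, num_vfs);

	if (ret)
		return ret;

	ret = pci_enable(pf, num_vfs);
	if (ret) {
		fw_destroy_vfs(pf);
		return ret;
	}
	return num_vfs; /* sriov_configure convention: number of VFs enabled */
}

/* Reverse order on teardown; refuse while any VF is still assigned. */
static int sriov_stop(struct fake_pf *pf)
{
	if (!pf->pci_vfs)
		return 0;
	if (pf->vfs_assigned)
		return -EPERM;
	pf->pci_vfs = 0;
	return fw_destroy_vfs(pf);
}

/* Zero VFs means stop, anything else means start. */
static int sriov_configure(struct fake_pf *pf, int num_vfs)
{
	return num_vfs ? sriov_start(pf, num_vfs) : sriov_stop(pf);
}
```

Creating the VFs in firmware first means the PCI core never exposes a VF
that the firmware does not know about, and the rollback keeps firmware
and PCI state in step when the enable step fails.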
drivers/accel/amdxdna/Makefile | 5 +
drivers/accel/amdxdna/aie.h | 3 +
drivers/accel/amdxdna/aie2_pci.c | 2 +-
drivers/accel/amdxdna/aie2_pci.h | 3 -
drivers/accel/amdxdna/aie2_smu.c | 2 +-
drivers/accel/amdxdna/aie4_message.c | 27 ++
drivers/accel/amdxdna/aie4_msg_priv.h | 49 ++++
drivers/accel/amdxdna/aie4_pci.c | 364 ++++++++++++++++++++++++
drivers/accel/amdxdna/aie4_pci.h | 48 ++++
drivers/accel/amdxdna/aie4_sriov.c | 88 ++++++
drivers/accel/amdxdna/amdxdna_mailbox.c | 19 +-
drivers/accel/amdxdna/amdxdna_mailbox.h | 8 +-
drivers/accel/amdxdna/amdxdna_pci_drv.c | 19 +-
drivers/accel/amdxdna/amdxdna_pci_drv.h | 2 +
drivers/accel/amdxdna/npu3_regs.c | 39 +++
include/uapi/drm/amdxdna_accel.h | 3 +-
16 files changed, 666 insertions(+), 15 deletions(-)
create mode 100644 drivers/accel/amdxdna/aie4_message.c
create mode 100644 drivers/accel/amdxdna/aie4_msg_priv.h
create mode 100644 drivers/accel/amdxdna/aie4_pci.c
create mode 100644 drivers/accel/amdxdna/aie4_pci.h
create mode 100644 drivers/accel/amdxdna/aie4_sriov.c
create mode 100644 drivers/accel/amdxdna/npu3_regs.c
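Note on the mailbox change: AIE4 passes NO_IOHUB (0) as the interrupt
register when starting the management channel, and the new
mailbox_irq_acknowledge()/mailbox_irq_status() helpers skip the register
access when that address is zero. The block below is a user-space sketch
of that convention only. struct fake_chann and its register array are
hypothetical and stand in for the real channel and BAR mapping.

```c
#include <assert.h>
#include <stdint.h>

#define NO_IOHUB 0

/* Hypothetical channel model: a small register file plus the iohub offset. */
struct fake_chann {
	uint32_t regs[16];
	uint32_t iohub_int_addr; /* byte offset into regs; 0 means no iohub */
};

/* With no iohub there is no status register, so report "nothing pending". */
static uint32_t irq_status(const struct fake_chann *c)
{
	return c->iohub_int_addr ? c->regs[c->iohub_int_addr / 4] : 0;
}

/* Acknowledge by clearing the iohub register; a no-op without an iohub. */
static void irq_acknowledge(struct fake_chann *c)
{
	if (c->iohub_int_addr)
		c->regs[c->iohub_int_addr / 4] = 0;
}
```

Using zero as the "no iohub" marker lets AIE2 and AIE4 share the same
interrupt path, at the cost of assuming offset 0 is never a valid iohub
interrupt register.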
diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile
index 5c7911554c46..a61cd6c0db30 100644
--- a/drivers/accel/amdxdna/Makefile
+++ b/drivers/accel/amdxdna/Makefile
@@ -10,6 +10,8 @@ amdxdna-y := \
aie2_psp.o \
aie2_smu.o \
aie2_solver.o \
+ aie4_message.o \
+ aie4_pci.o \
amdxdna_ctx.o \
amdxdna_gem.o \
amdxdna_iommu.o \
@@ -20,7 +22,10 @@ amdxdna-y := \
amdxdna_sysfs.o \
amdxdna_ubuf.o \
npu1_regs.o \
+ npu3_regs.o \
npu4_regs.o \
npu5_regs.o \
npu6_regs.o
+
+amdxdna-$(CONFIG_PCI_IOV) += aie4_sriov.o
obj-$(CONFIG_DRM_ACCEL_AMDXDNA) = amdxdna.o
diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
index 1bea14b79c7c..6c53870d0098 100644
--- a/drivers/accel/amdxdna/aie.h
+++ b/drivers/accel/amdxdna/aie.h
@@ -8,6 +8,9 @@
#include "amdxdna_pci_drv.h"
#include "amdxdna_mailbox.h"
+#define AIE_INTERVAL 20000 /* us */
+#define AIE_TIMEOUT 1000000 /* us */
+
struct aie_device {
struct amdxdna_dev *xdna;
struct mailbox_channel *mgmt_chann;
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 03bac963516d..708d0b7fd2e3 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -79,7 +79,7 @@ static int aie2_get_mgmt_chann_info(struct amdxdna_dev_hdl *ndev)
* is alive.
*/
ret = readx_poll_timeout(readl, SRAM_GET_ADDR(ndev, FW_ALIVE_OFF),
- addr, addr, AIE2_INTERVAL, AIE2_TIMEOUT);
+ addr, addr, AIE_INTERVAL, AIE_TIMEOUT);
if (ret || !addr)
return -ETIME;
diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_pci.h
index 90fb0aafaf40..96960a2219a4 100644
--- a/drivers/accel/amdxdna/aie2_pci.h
+++ b/drivers/accel/amdxdna/aie2_pci.h
@@ -14,9 +14,6 @@
#include "aie2_msg_priv.h"
#include "amdxdna_mailbox.h"
-#define AIE2_INTERVAL 20000 /* us */
-#define AIE2_TIMEOUT 1000000 /* us */
-
/* Firmware determines device memory base address and size */
#define AIE2_DEVM_BASE 0x4000000
#define AIE2_DEVM_SIZE SZ_64M
diff --git a/drivers/accel/amdxdna/aie2_smu.c b/drivers/accel/amdxdna/aie2_smu.c
index 727637dac3a8..1b966bbef2e5 100644
--- a/drivers/accel/amdxdna/aie2_smu.c
+++ b/drivers/accel/amdxdna/aie2_smu.c
@@ -44,7 +44,7 @@ static int aie2_smu_exec(struct amdxdna_dev_hdl *ndev, u32 reg_cmd,
writel(1, SMU_REG(ndev, SMU_INTR_REG));
ret = readx_poll_timeout(readl, SMU_REG(ndev, SMU_RESP_REG), resp,
- resp, AIE2_INTERVAL, AIE2_TIMEOUT);
+ resp, AIE_INTERVAL, AIE_TIMEOUT);
if (ret) {
XDNA_ERR(ndev->aie.xdna, "smu cmd %d timed out", reg_cmd);
return ret;
diff --git a/drivers/accel/amdxdna/aie4_message.c b/drivers/accel/amdxdna/aie4_message.c
new file mode 100644
index 000000000000..d621dd32ac40
--- /dev/null
+++ b/drivers/accel/amdxdna/aie4_message.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <drm/amdxdna_accel.h>
+#include <drm/drm_print.h>
+#include <linux/mutex.h>
+
+#include "aie.h"
+#include "aie4_msg_priv.h"
+#include "aie4_pci.h"
+#include "amdxdna_mailbox.h"
+#include "amdxdna_mailbox_helper.h"
+#include "amdxdna_pci_drv.h"
+
+int aie4_suspend_fw(struct amdxdna_dev_hdl *ndev)
+{
+ DECLARE_AIE_MSG(aie4_msg_suspend, AIE4_MSG_OP_SUSPEND);
+ int ret;
+
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
+ if (ret)
+ XDNA_ERR(ndev->aie.xdna, "Failed to suspend fw, ret %d", ret);
+
+ return ret;
+}
diff --git a/drivers/accel/amdxdna/aie4_msg_priv.h b/drivers/accel/amdxdna/aie4_msg_priv.h
new file mode 100644
index 000000000000..88463cc3a98a
--- /dev/null
+++ b/drivers/accel/amdxdna/aie4_msg_priv.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#ifndef _AIE4_MSG_PRIV_H_
+#define _AIE4_MSG_PRIV_H_
+
+#include <linux/types.h>
+
+enum aie4_msg_opcode {
+ AIE4_MSG_OP_SUSPEND = 0x10003,
+
+ AIE4_MSG_OP_CREATE_VFS = 0x20001,
+ AIE4_MSG_OP_DESTROY_VFS = 0x20002,
+};
+
+enum aie4_msg_status {
+ AIE4_MSG_STATUS_SUCCESS = 0x0,
+ AIE4_MSG_STATUS_ERROR = 0x1,
+ AIE4_MSG_STATUS_NOTSUPP = 0x2,
+ MAX_AIE4_MSG_STATUS_CODE = 0x4,
+};
+
+struct aie4_msg_suspend_req {
+ __u32 rsvd;
+} __packed;
+
+struct aie4_msg_suspend_resp {
+ enum aie4_msg_status status;
+} __packed;
+
+struct aie4_msg_create_vfs_req {
+ __u32 vf_cnt;
+} __packed;
+
+struct aie4_msg_create_vfs_resp {
+ enum aie4_msg_status status;
+} __packed;
+
+struct aie4_msg_destroy_vfs_req {
+ __u32 rsvd;
+} __packed;
+
+struct aie4_msg_destroy_vfs_resp {
+ enum aie4_msg_status status;
+} __packed;
+
+#endif /* _AIE4_MSG_PRIV_H_ */
diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
new file mode 100644
index 000000000000..0f360c1ccebd
--- /dev/null
+++ b/drivers/accel/amdxdna/aie4_pci.c
@@ -0,0 +1,364 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <drm/amdxdna_accel.h>
+#include <drm/drm_managed.h>
+#include <drm/drm_print.h>
+
+#include "aie4_pci.h"
+#include "amdxdna_pci_drv.h"
+
+#define NO_IOHUB 0
+
+/*
+ * The management mailbox channel is allocated by firmware.
+ * The related register and ring buffer information is on SRAM BAR.
+ * This struct is the register layout.
+ */
+struct mailbox_info {
+ __u32 valid;
+ __u32 protocol_major;
+ __u32 protocol_minor;
+ __u32 x2i_tail_offset;
+ __u32 x2i_head_offset;
+ __u32 x2i_buffer_addr;
+ __u32 x2i_buffer_size;
+ __u32 i2x_tail_offset;
+ __u32 i2x_head_offset;
+ __u32 i2x_buffer_addr;
+ __u32 i2x_buffer_size;
+ __u32 i2x_msi_idx;
+ __u32 reserved[4];
+};
+
+static int aie4_fw_is_alive(struct amdxdna_dev *xdna)
+{
+ const struct amdxdna_dev_priv *npriv = xdna->dev_info->dev_priv;
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+ u32 __iomem *src;
+ u32 fw_is_valid;
+ int ret;
+
+ src = ndev->rbuf_base + npriv->mbox_info_off;
+
+ ret = readx_poll_timeout(readl, src + offsetof(struct mailbox_info, valid),
+ fw_is_valid, (fw_is_valid == 0x1),
+ AIE_INTERVAL, AIE_TIMEOUT);
+ if (ret)
+ XDNA_ERR(xdna, "fw_is_valid=%u after %d ms",
+ fw_is_valid, DIV_ROUND_CLOSEST(AIE_TIMEOUT, 1000));
+
+ return ret;
+}
+
+static void aie4_read_mbox_info(struct amdxdna_dev *xdna,
+ struct mailbox_info *mbox_info)
+{
+ const struct amdxdna_dev_priv *npriv = xdna->dev_info->dev_priv;
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+ u32 *dst = (u32 *)mbox_info;
+ u32 __iomem *src;
+ int i;
+
+ src = ndev->rbuf_base + npriv->mbox_info_off;
+
+ for (i = 0; i < sizeof(*mbox_info) / sizeof(u32); i++)
+ dst[i] = readl(&src[i]);
+}
+
+static int aie4_mailbox_info(struct amdxdna_dev *xdna,
+ struct mailbox_info *mbox_info)
+{
+ int ret;
+
+ ret = aie4_fw_is_alive(xdna);
+ if (ret)
+ return ret;
+
+ aie4_read_mbox_info(xdna, mbox_info);
+
+ ret = aie_check_protocol(&xdna->dev_handle->aie,
+ mbox_info->protocol_major,
+ mbox_info->protocol_minor);
+ if (ret)
+ XDNA_ERR(xdna, "mailbox major.minor %d.%d is not supported",
+ mbox_info->protocol_major, mbox_info->protocol_minor);
+
+ return ret;
+}
+
+static void aie4_mailbox_fini(struct amdxdna_dev_hdl *ndev)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+
+ aie_destroy_chann(&ndev->aie, &ndev->aie.mgmt_chann);
+ drmm_kfree(&xdna->ddev, ndev->mbox);
+ ndev->mbox = NULL;
+}
+
+static int aie4_irq_init(struct amdxdna_dev *xdna)
+{
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ int ret, nvec;
+
+ nvec = pci_msix_vec_count(pdev);
+ XDNA_DBG(xdna, "irq vectors:%d", nvec);
+ if (nvec <= 0) {
+ XDNA_ERR(xdna, "failed to get number of interrupt vectors");
+ return -EINVAL;
+ }
+
+ ret = pci_alloc_irq_vectors(pdev, nvec, nvec, PCI_IRQ_MSIX);
+ if (ret < 0) {
+ XDNA_ERR(xdna, "failed to alloc irq vector, ret: %d", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int aie4_mailbox_start(struct amdxdna_dev *xdna,
+ struct mailbox_info *mbi)
+{
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+ const struct amdxdna_dev_priv *npriv = xdna->dev_info->dev_priv;
+ struct xdna_mailbox_chann_res *i2x;
+ struct xdna_mailbox_chann_res *x2i;
+ int mgmt_mb_irq;
+ int ret;
+
+ struct xdna_mailbox_res mbox_res = {
+ .ringbuf_base = ndev->rbuf_base,
+ .ringbuf_size = pci_resource_len(pdev, npriv->mbox_rbuf_bar),
+ .mbox_base = ndev->mbox_base,
+ .mbox_size = pci_resource_len(pdev, npriv->mbox_bar),
+ .name = "xdna_aie4_mailbox",
+ };
+
+ i2x = &ndev->aie.mgmt_i2x;
+ x2i = &ndev->aie.mgmt_x2i;
+
+ x2i->mb_head_ptr_reg = mbi->x2i_head_offset;
+ x2i->mb_tail_ptr_reg = mbi->x2i_tail_offset;
+ x2i->rb_start_addr = mbi->x2i_buffer_addr;
+ x2i->rb_size = mbi->x2i_buffer_size;
+
+ i2x->rb_start_addr = mbi->i2x_buffer_addr;
+ i2x->rb_size = mbi->i2x_buffer_size;
+ i2x->mb_head_ptr_reg = mbi->i2x_head_offset;
+ i2x->mb_tail_ptr_reg = mbi->i2x_tail_offset;
+
+ ndev->aie.mgmt_chan_idx = mbi->i2x_msi_idx;
+ aie_dump_mgmt_chann_debug(&ndev->aie);
+
+ ndev->mbox = xdnam_mailbox_create(&xdna->ddev, &mbox_res);
+ if (!ndev->mbox) {
+ XDNA_ERR(xdna, "failed to create mailbox device");
+ return -ENODEV;
+ }
+
+ ndev->aie.mgmt_chann = xdna_mailbox_alloc_channel(ndev->mbox);
+ if (!ndev->aie.mgmt_chann) {
+ XDNA_ERR(xdna, "failed to alloc mailbox channel");
+ return -ENODEV;
+ }
+
+ mgmt_mb_irq = pci_irq_vector(pdev, ndev->aie.mgmt_chan_idx);
+ if (mgmt_mb_irq < 0) {
+ XDNA_ERR(xdna, "failed to alloc irq vector, return %d", mgmt_mb_irq);
+ ret = mgmt_mb_irq;
+ goto free_channel;
+ }
+
+ ret = xdna_mailbox_start_channel(ndev->aie.mgmt_chann,
+ &ndev->aie.mgmt_x2i,
+ &ndev->aie.mgmt_i2x,
+ NO_IOHUB,
+ mgmt_mb_irq);
+ if (ret) {
+ XDNA_ERR(xdna, "failed to start management mailbox channel");
+ ret = -EINVAL;
+ goto free_channel;
+ }
+
+ XDNA_DBG(xdna, "Mailbox management channel created");
+ return 0;
+
+free_channel:
+ xdna_mailbox_free_channel(ndev->aie.mgmt_chann);
+ ndev->aie.mgmt_chann = NULL;
+ return ret;
+}
+
+static int aie4_mailbox_init(struct amdxdna_dev *xdna)
+{
+ struct mailbox_info mbox_info;
+ int ret;
+
+ ret = aie4_mailbox_info(xdna, &mbox_info);
+ if (ret)
+ return ret;
+
+ return aie4_mailbox_start(xdna, &mbox_info);
+}
+
+static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
+{
+ /* TODO */
+}
+
+static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
+{
+ /* TODO */
+ return 0;
+}
+
+static int aie4_hw_start(struct amdxdna_dev *xdna)
+{
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+ int ret;
+
+ ret = aie4_fw_load(ndev);
+ if (ret)
+ return ret;
+
+ ret = aie4_mailbox_init(xdna);
+ if (ret)
+ goto fw_unload;
+
+ return 0;
+
+fw_unload:
+ aie4_fw_unload(ndev);
+
+ return ret;
+}
+
+static void aie4_mgmt_fw_fini(struct amdxdna_dev_hdl *ndev)
+{
+ int ret;
+
+ /* No paired resume needed, fw is stateless */
+ ret = aie4_suspend_fw(ndev);
+ if (ret)
+ XDNA_ERR(ndev->aie.xdna, "suspend_fw failed, ret %d", ret);
+ else
+ XDNA_DBG(ndev->aie.xdna, "npu firmware suspended");
+}
+
+static void aie4_hw_stop(struct amdxdna_dev *xdna)
+{
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+
+ drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
+
+ aie4_mgmt_fw_fini(ndev);
+ aie4_mailbox_fini(ndev);
+
+ aie4_fw_unload(ndev);
+}
+
+static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
+ unsigned long bars = 0;
+ int ret, i;
+
+ /* Enable managed PCI device */
+ ret = pcim_enable_device(pdev);
+ if (ret) {
+ XDNA_ERR(xdna, "pcim enable device failed, ret %d", ret);
+ return ret;
+ }
+
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+ if (ret) {
+ XDNA_ERR(xdna, "failed to set DMA mask to 64:%d", ret);
+ return ret;
+ }
+
+ set_bit(xdna->dev_info->mbox_bar, &bars);
+ set_bit(xdna->dev_info->sram_bar, &bars);
+
+ for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+ if (!test_bit(i, &bars))
+ continue;
+ tbl[i] = pcim_iomap(pdev, i, 0);
+ if (!tbl[i]) {
+ XDNA_ERR(xdna, "map bar %d failed", i);
+ return -ENOMEM;
+ }
+ }
+
+ ndev->mbox_base = tbl[xdna->dev_info->mbox_bar];
+ ndev->rbuf_base = tbl[xdna->dev_info->sram_bar];
+
+ pci_set_master(pdev);
+
+ ret = aie4_irq_init(xdna);
+ if (ret)
+ goto clear_master;
+
+ ret = aie4_hw_start(xdna);
+ if (ret)
+ goto clear_master;
+
+ return 0;
+
+clear_master:
+ pci_clear_master(pdev);
+
+ return ret;
+}
+
+static void aie4_pcidev_fini(struct amdxdna_dev_hdl *ndev)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+
+ aie4_hw_stop(xdna);
+
+ pci_clear_master(pdev);
+}
+
+static void aie4_fini(struct amdxdna_dev *xdna)
+{
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+
+ aie4_sriov_stop(ndev);
+ aie4_pcidev_fini(ndev);
+}
+
+static int aie4_init(struct amdxdna_dev *xdna)
+{
+ struct amdxdna_dev_hdl *ndev;
+ int ret;
+
+ ndev = drmm_kzalloc(&xdna->ddev, sizeof(*ndev), GFP_KERNEL);
+ if (!ndev)
+ return -ENOMEM;
+
+ ndev->priv = xdna->dev_info->dev_priv;
+ ndev->aie.xdna = xdna;
+ xdna->dev_handle = ndev;
+
+ ret = aie4_pcidev_init(ndev);
+ if (ret) {
+ XDNA_ERR(xdna, "Setup PCI device failed, ret %d", ret);
+ return ret;
+ }
+
+ XDNA_DBG(xdna, "aie4 init finished");
+ return 0;
+}
+
+const struct amdxdna_dev_ops aie4_ops = {
+ .init = aie4_init,
+ .fini = aie4_fini,
+ .sriov_configure = aie4_sriov_configure,
+};
diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
new file mode 100644
index 000000000000..f3810a969431
--- /dev/null
+++ b/drivers/accel/amdxdna/aie4_pci.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#ifndef _AIE4_PCI_H_
+#define _AIE4_PCI_H_
+
+#include <linux/device.h>
+#include <linux/iopoll.h>
+#include <linux/pci.h>
+
+#include "aie.h"
+#include "amdxdna_mailbox.h"
+
+struct amdxdna_dev_priv {
+ u32 mbox_bar;
+ u32 mbox_rbuf_bar;
+ u64 mbox_info_off;
+};
+
+struct amdxdna_dev_hdl {
+ struct aie_device aie;
+ const struct amdxdna_dev_priv *priv;
+ void __iomem *mbox_base;
+ void __iomem *rbuf_base;
+
+ struct mailbox *mbox;
+};
+
+/* aie4_message.c */
+int aie4_suspend_fw(struct amdxdna_dev_hdl *ndev);
+
+/* aie4_sriov.c */
+#if IS_ENABLED(CONFIG_PCI_IOV)
+int aie4_sriov_configure(struct amdxdna_dev *xdna, int num_vfs);
+int aie4_sriov_stop(struct amdxdna_dev_hdl *ndev);
+#else
+#define aie4_sriov_configure NULL
+static inline int aie4_sriov_stop(struct amdxdna_dev_hdl *ndev)
+{
+ return 0;
+}
+#endif
+
+extern const struct amdxdna_dev_ops aie4_ops;
+
+#endif /* _AIE4_PCI_H_ */
diff --git a/drivers/accel/amdxdna/aie4_sriov.c b/drivers/accel/amdxdna/aie4_sriov.c
new file mode 100644
index 000000000000..e1ce633768a5
--- /dev/null
+++ b/drivers/accel/amdxdna/aie4_sriov.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <drm/amdxdna_accel.h>
+#include <drm/drm_print.h>
+#include <linux/pci.h>
+
+#include "aie.h"
+#include "aie4_msg_priv.h"
+#include "aie4_pci.h"
+#include "amdxdna_mailbox.h"
+#include "amdxdna_mailbox_helper.h"
+#include "amdxdna_pci_drv.h"
+
+static int aie4_destroy_vfs(struct amdxdna_dev_hdl *ndev)
+{
+ DECLARE_AIE_MSG(aie4_msg_destroy_vfs, AIE4_MSG_OP_DESTROY_VFS);
+ int ret;
+
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
+ if (ret)
+ XDNA_ERR(ndev->aie.xdna, "destroy vfs op failed: %d", ret);
+
+ return ret;
+}
+
+static int aie4_create_vfs(struct amdxdna_dev_hdl *ndev, int num_vfs)
+{
+ DECLARE_AIE_MSG(aie4_msg_create_vfs, AIE4_MSG_OP_CREATE_VFS);
+ int ret;
+
+ req.vf_cnt = num_vfs;
+ ret = aie_send_mgmt_msg_wait(&ndev->aie, &msg);
+ if (ret)
+ XDNA_ERR(ndev->aie.xdna, "create vfs op failed: %d", ret);
+
+ return ret;
+}
+
+int aie4_sriov_stop(struct amdxdna_dev_hdl *ndev)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ int ret;
+
+ if (!pci_num_vf(pdev))
+ return 0;
+
+ ret = pci_vfs_assigned(pdev);
+ if (ret) {
+ XDNA_ERR(xdna, "VFs are still assigned to VMs");
+ return -EPERM;
+ }
+
+ pci_disable_sriov(pdev);
+ return aie4_destroy_vfs(ndev);
+}
+
+static int aie4_sriov_start(struct amdxdna_dev_hdl *ndev, int num_vfs)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ int ret;
+
+ ret = aie4_create_vfs(ndev, num_vfs);
+ if (ret)
+ return ret;
+
+ ret = pci_enable_sriov(pdev, num_vfs);
+ if (ret) {
+ XDNA_ERR(xdna, "configure VFs failed, ret: %d", ret);
+ aie4_destroy_vfs(ndev);
+ return ret;
+ }
+
+ return num_vfs;
+}
+
+int aie4_sriov_configure(struct amdxdna_dev *xdna, int num_vfs)
+{
+ struct amdxdna_dev_hdl *ndev = xdna->dev_handle;
+
+ drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock));
+
+ return (num_vfs) ? aie4_sriov_start(ndev, num_vfs) : aie4_sriov_stop(ndev);
+}
diff --git a/drivers/accel/amdxdna/amdxdna_mailbox.c b/drivers/accel/amdxdna/amdxdna_mailbox.c
index e681a090752d..84a7e92562ad 100644
--- a/drivers/accel/amdxdna/amdxdna_mailbox.c
+++ b/drivers/accel/amdxdna/amdxdna_mailbox.c
@@ -112,6 +112,18 @@ static u32 mailbox_reg_read(struct mailbox_channel *mb_chann, u32 mbox_reg)
return readl(ringbuf_addr);
}
+static inline void mailbox_irq_acknowledge(struct mailbox_channel *mb_chann)
+{
+ if (mb_chann->iohub_int_addr)
+ mailbox_reg_write(mb_chann, mb_chann->iohub_int_addr, 0);
+}
+
+static inline u32 mailbox_irq_status(struct mailbox_channel *mb_chann)
+{
+ return (mb_chann->iohub_int_addr) ?
+ mailbox_reg_read(mb_chann, mb_chann->iohub_int_addr) : 0;
+}
+
static inline void
mailbox_set_headptr(struct mailbox_channel *mb_chann, u32 headptr_val)
{
@@ -199,7 +211,6 @@ mailbox_send_msg(struct mailbox_channel *mb_chann, struct mailbox_msg *mb_msg)
start_addr = mb_chann->res[CHAN_RES_X2I].rb_start_addr;
tmp_tail = tail + mb_msg->pkg_size;
-
check_again:
if (tail >= head && tmp_tail > ringbuf_size) {
write_addr = mb_chann->mb->res.ringbuf_base + start_addr + tail;
@@ -357,7 +368,7 @@ static void mailbox_rx_worker(struct work_struct *rx_work)
}
again:
- mailbox_reg_write(mb_chann, mb_chann->iohub_int_addr, 0);
+ mailbox_irq_acknowledge(mb_chann);
while (1) {
/*
@@ -382,7 +393,7 @@ static void mailbox_rx_worker(struct work_struct *rx_work)
* the interrupt register to make sure there is not any new response
* before exiting.
*/
- if (mailbox_reg_read(mb_chann, mb_chann->iohub_int_addr))
+ if (mailbox_irq_status(mb_chann))
goto again;
}
@@ -520,7 +531,7 @@ xdna_mailbox_start_channel(struct mailbox_channel *mb_chann,
}
mb_chann->bad_state = false;
- mailbox_reg_write(mb_chann, mb_chann->iohub_int_addr, 0);
+ mailbox_irq_acknowledge(mb_chann);
MB_DBG(mb_chann, "Mailbox channel started (irq: %d)", mb_chann->msix_irq);
return 0;
diff --git a/drivers/accel/amdxdna/amdxdna_mailbox.h b/drivers/accel/amdxdna/amdxdna_mailbox.h
index 8b1e00945da4..2908404303ae 100644
--- a/drivers/accel/amdxdna/amdxdna_mailbox.h
+++ b/drivers/accel/amdxdna/amdxdna_mailbox.h
@@ -1,10 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
- * Copyright (C) 2022-2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2022-2026, Advanced Micro Devices, Inc.
*/
-#ifndef _AIE2_MAILBOX_H_
-#define _AIE2_MAILBOX_H_
+#ifndef _AIE_MAILBOX_H_
+#define _AIE_MAILBOX_H_
struct mailbox;
struct mailbox_channel;
@@ -124,4 +124,4 @@ void xdna_mailbox_stop_channel(struct mailbox_channel *mailbox_chann);
int xdna_mailbox_send_msg(struct mailbox_channel *mailbox_chann,
const struct xdna_mailbox_msg *msg, u64 tx_timeout);
-#endif /* _AIE2_MAILBOX_ */
+#endif /* _AIE_MAILBOX_H_ */
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c b/drivers/accel/amdxdna/amdxdna_pci_drv.c
index b50a7d1f8a11..09d7d88bb6f1 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.c
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c
@@ -37,9 +37,10 @@ MODULE_FIRMWARE("amdnpu/17f0_11/npu_7.sbin");
* 0.6: Support preemption
* 0.7: Support getting power and utilization data
* 0.8: Support BO usage query
+ * 0.9: Add new device type AMDXDNA_DEV_TYPE_PF
*/
#define AMDXDNA_DRIVER_MAJOR 0
-#define AMDXDNA_DRIVER_MINOR 8
+#define AMDXDNA_DRIVER_MINOR 9
/*
* Bind the driver base on (vendor_id, device_id) pair and later use the
@@ -49,6 +50,8 @@ MODULE_FIRMWARE("amdnpu/17f0_11/npu_7.sbin");
static const struct pci_device_id pci_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1502) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x17f0) },
+ { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x17f2) },
+ { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1B0B) },
{0}
};
@@ -59,6 +62,8 @@ static const struct amdxdna_device_id amdxdna_ids[] = {
{ 0x17f0, 0x10, &dev_npu4_info },
{ 0x17f0, 0x11, &dev_npu5_info },
{ 0x17f0, 0x20, &dev_npu6_info },
+ { 0x17f2, 0x10, &dev_npu3_pf_info },
+ { 0x1B0B, 0x10, &dev_npu3_pf_info },
{0}
};
@@ -365,12 +370,24 @@ static const struct dev_pm_ops amdxdna_pm_ops = {
RUNTIME_PM_OPS(amdxdna_pm_suspend, amdxdna_pm_resume, NULL)
};
+static int amdxdna_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+ struct amdxdna_dev *xdna = pci_get_drvdata(pdev);
+
+ guard(mutex)(&xdna->dev_lock);
+ if (xdna->dev_info->ops->sriov_configure)
+ return xdna->dev_info->ops->sriov_configure(xdna, num_vfs);
+
+ return -ENOENT;
+}
+
static struct pci_driver amdxdna_pci_driver = {
.name = KBUILD_MODNAME,
.id_table = pci_ids,
.probe = amdxdna_probe,
.remove = amdxdna_remove,
.driver.pm = &amdxdna_pm_ops,
+ .sriov_configure = amdxdna_sriov_configure,
};
module_pci_driver(amdxdna_pci_driver);
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdna/amdxdna_pci_drv.h
index 5e0bf565a1ae..eabbf57f2b38 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.h
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h
@@ -55,6 +55,7 @@ struct amdxdna_dev_ops {
void (*fini)(struct amdxdna_dev *xdna);
int (*resume)(struct amdxdna_dev *xdna);
int (*suspend)(struct amdxdna_dev *xdna);
+ int (*sriov_configure)(struct amdxdna_dev *xdna, int num_vfs);
int (*hwctx_init)(struct amdxdna_hwctx *hwctx);
void (*hwctx_fini)(struct amdxdna_hwctx *hwctx);
int (*hwctx_config)(struct amdxdna_hwctx *hwctx, u32 type, u64 value, void *buf, u32 size);
@@ -157,6 +158,7 @@ struct amdxdna_client {
/* Add device info below */
extern const struct amdxdna_dev_info dev_npu1_info;
+extern const struct amdxdna_dev_info dev_npu3_pf_info;
extern const struct amdxdna_dev_info dev_npu4_info;
extern const struct amdxdna_dev_info dev_npu5_info;
extern const struct amdxdna_dev_info dev_npu6_info;
diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
new file mode 100644
index 000000000000..f6e20f4858db
--- /dev/null
+++ b/drivers/accel/amdxdna/npu3_regs.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <drm/amdxdna_accel.h>
+#include <drm/drm_device.h>
+
+#include "aie4_pci.h"
+#include "amdxdna_pci_drv.h"
+
+#define NPU3_MBOX_BAR 0
+
+#define NPU3_MBOX_BUFFER_BAR 2
+#define NPU3_MBOX_INFO_OFF 0x0
+
+/* PCIe BAR Index for NPU3 */
+#define NPU3_REG_BAR_INDEX 0
+
+static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
+ { .major = 5, .min_minor = 10 },
+ { 0 }
+};
+
+static const struct amdxdna_dev_priv npu3_dev_priv = {
+ .mbox_bar = NPU3_MBOX_BAR,
+ .mbox_rbuf_bar = NPU3_MBOX_BUFFER_BAR,
+ .mbox_info_off = NPU3_MBOX_INFO_OFF,
+};
+
+const struct amdxdna_dev_info dev_npu3_pf_info = {
+ .mbox_bar = NPU3_MBOX_BAR,
+ .sram_bar = NPU3_MBOX_BUFFER_BAR,
+ .vbnv = "RyzenAI-npu3-pf",
+ .device_type = AMDXDNA_DEV_TYPE_PF,
+ .dev_priv = &npu3_dev_priv,
+ .fw_feature_tbl = npu3_fw_feature_table,
+ .ops = &aie4_ops,
+};
diff --git a/include/uapi/drm/amdxdna_accel.h b/include/uapi/drm/amdxdna_accel.h
index 61d3686fa3b1..0b11e8e3ea5d 100644
--- a/include/uapi/drm/amdxdna_accel.h
+++ b/include/uapi/drm/amdxdna_accel.h
@@ -29,7 +29,8 @@ extern "C" {
enum amdxdna_device_type {
AMDXDNA_DEV_TYPE_UNKNOWN = -1,
- AMDXDNA_DEV_TYPE_KMQ,
+ AMDXDNA_DEV_TYPE_KMQ = 0,
+ AMDXDNA_DEV_TYPE_PF = 2,
};
enum amdxdna_drm_ioctl_id {
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH V1 3/6] accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
2026-03-30 16:37 ` [PATCH V1 1/6] accel/amdxdna: Create shared functions for AIE2 and AIE4 Lizhi Hou
2026-03-30 16:37 ` [PATCH V1 2/6] accel/amdxdna: Add basic support for AIE4 devices Lizhi Hou
@ 2026-03-30 16:37 ` Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
` (3 subsequent siblings)
6 siblings, 1 reply; 17+ messages in thread
From: Lizhi Hou @ 2026-03-30 16:37 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue, Lizhi Hou
From: David Zhang <yidong.zhang@amd.com>
AIE2 and AIE4 use a similar interface to the PSP (Platform Security
Processor). Move the PSP implementation into aie_psp.c so that both
platforms share the same code path and future AIE4 PSP work can build on it.
Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: David Zhang <yidong.zhang@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
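Note (illustration only, not part of the patch): the PSP register table
that this patch moves into aie.h is built with DEFINE_BAR_OFFSET(), which
uses a designated initializer to store a (BAR index, offset) pair per
register. The userspace sketch below mirrors that macro and the
PSP_REG_BAR()/PSP_REG_OFF() style of lookup. The DEMO_PSP_* names, the
two-entry enum, the demo_* helpers and the sample addresses are
assumptions made for the example, not driver definitions.

```c
#include <stdint.h>

/* Hypothetical BAR index and aperture base, chosen only for this sketch. */
#define DEMO_PSP_BAR_INDEX 4
#define DEMO_PSP_BAR_BASE  0x3810000u

/* Shortened register enum; the driver's enum has more entries. */
enum psp_reg_idx {
	PSP_CMD_REG = 0,
	PSP_ARG0_REG,
	PSP_MAX_REGS
};

struct aie_bar_off_pair {
	int bar_idx;
	uint32_t offset;
};

/* Same shape as the macro in aie.h: slot [reg] = {BAR index, addr - base}. */
#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \
	[reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE}

static const struct aie_bar_off_pair demo_psp_regs_off[PSP_MAX_REGS] = {
	DEFINE_BAR_OFFSET(PSP_CMD_REG, DEMO_PSP, 0x3810AECu),
	DEFINE_BAR_OFFSET(PSP_ARG0_REG, DEMO_PSP, 0x3810B70u),
};

/* Hypothetical helpers standing in for PSP_REG_BAR() and PSP_REG_OFF(). */
int demo_psp_reg_bar(enum psp_reg_idx idx)
{
	return demo_psp_regs_off[idx].bar_idx;
}

uint32_t demo_psp_reg_off(enum psp_reg_idx idx)
{
	return demo_psp_regs_off[idx].offset;
}
```

With these sample values, the lookup for PSP_CMD_REG yields BAR 4 and
offset 0xAEC, which is the pair the init code adds to the mapped BAR
address when it fills psp_conf.psp_regs[].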
drivers/accel/amdxdna/Makefile | 2 +-
drivers/accel/amdxdna/aie.h | 41 +++++++++++++++++
drivers/accel/amdxdna/aie2_message.c | 2 +-
drivers/accel/amdxdna/aie2_pci.c | 12 ++---
drivers/accel/amdxdna/aie2_pci.h | 44 ++-----------------
.../accel/amdxdna/{aie2_psp.c => aie_psp.c} | 17 +++----
6 files changed, 60 insertions(+), 58 deletions(-)
rename drivers/accel/amdxdna/{aie2_psp.c => aie_psp.c} (88%)
diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile
index a61cd6c0db30..d3c0fe765a8b 100644
--- a/drivers/accel/amdxdna/Makefile
+++ b/drivers/accel/amdxdna/Makefile
@@ -2,12 +2,12 @@
amdxdna-y := \
aie.o \
+ aie_psp.o \
aie2_ctx.o \
aie2_error.o \
aie2_message.o \
aie2_pci.o \
aie2_pm.o \
- aie2_psp.o \
aie2_smu.o \
aie2_solver.o \
aie4_message.o \
diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
index 6c53870d0098..124c0f7e9ca0 100644
--- a/drivers/accel/amdxdna/aie.h
+++ b/drivers/accel/amdxdna/aie.h
@@ -11,6 +11,8 @@
#define AIE_INTERVAL 20000 /* us */
#define AIE_TIMEOUT 1000000 /* us */
+struct psp_device;
+
struct aie_device {
struct amdxdna_dev *xdna;
struct mailbox_channel *mgmt_chann;
@@ -20,15 +22,54 @@ struct aie_device {
u32 mgmt_prot_major;
u32 mgmt_prot_minor;
unsigned long feature_mask;
+
+ struct psp_device *psp_hdl;
};
#define DECLARE_AIE_MSG(name, op) \
DECLARE_XDNA_MSG_COMMON(name, op, -1)
#define AIE_FEATURE_ON(aie, feature) test_bit(feature, &(aie)->feature_mask)
+#define PSP_REG_BAR(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].bar_idx)
+#define PSP_REG_OFF(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].offset)
+
+#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \
+ [reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE}
+
+enum psp_reg_idx {
+ PSP_CMD_REG = 0,
+ PSP_ARG0_REG,
+ PSP_ARG1_REG,
+ PSP_ARG2_REG,
+ PSP_NUM_IN_REGS, /* number of input registers */
+ PSP_INTR_REG = PSP_NUM_IN_REGS,
+ PSP_STATUS_REG,
+ PSP_RESP_REG,
+ PSP_PWAITMODE_REG,
+ PSP_MAX_REGS /* Keep this at the end */
+};
+
+struct aie_bar_off_pair {
+ int bar_idx;
+ u32 offset;
+};
+
+struct psp_config {
+ const void *fw_buf;
+ u32 fw_size;
+ void __iomem *psp_regs[PSP_MAX_REGS];
+};
+
+/* aie.c */
void aie_dump_mgmt_chann_debug(struct aie_device *aie);
void aie_destroy_chann(struct aie_device *aie, struct mailbox_channel **chann);
int aie_send_mgmt_msg_wait(struct aie_device *aie, struct xdna_mailbox_msg *msg);
int aie_check_protocol(struct aie_device *aie, u32 fw_major, u32 fw_minor);
+/* aie_psp.c */
+struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf);
+int aie_psp_start(struct psp_device *psp);
+void aie_psp_stop(struct psp_device *psp);
+int aie_psp_waitmode_poll(struct psp_device *psp);
+
#endif /* _AIE_H_ */
diff --git a/drivers/accel/amdxdna/aie2_message.c b/drivers/accel/amdxdna/aie2_message.c
index ccf87b1aa1cc..e5e7da7a8f40 100644
--- a/drivers/accel/amdxdna/aie2_message.c
+++ b/drivers/accel/amdxdna/aie2_message.c
@@ -75,7 +75,7 @@ int aie2_suspend_fw(struct amdxdna_dev_hdl *ndev)
return ret;
}
- return aie2_psp_waitmode_poll(ndev->psp_hdl);
+ return aie_psp_waitmode_poll(ndev->aie.psp_hdl);
}
int aie2_resume_fw(struct amdxdna_dev_hdl *ndev)
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 708d0b7fd2e3..e4b7893bd429 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -297,7 +297,7 @@ static void aie2_hw_stop(struct amdxdna_dev *xdna)
aie_destroy_chann(&ndev->aie, &ndev->aie.mgmt_chann);
drmm_kfree(&xdna->ddev, ndev->mbox);
ndev->mbox = NULL;
- aie2_psp_stop(ndev->psp_hdl);
+ aie_psp_stop(ndev->aie.psp_hdl);
aie2_smu_fini(ndev);
aie2_error_async_events_free(ndev);
pci_disable_device(pdev);
@@ -350,7 +350,7 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto free_channel;
}
- ret = aie2_psp_start(ndev->psp_hdl);
+ ret = aie_psp_start(ndev->aie.psp_hdl);
if (ret) {
XDNA_ERR(xdna, "failed to start psp, ret %d", ret);
goto fini_smu;
@@ -413,7 +413,7 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
aie2_suspend_fw(ndev);
xdna_mailbox_stop_channel(ndev->aie.mgmt_chann);
stop_psp:
- aie2_psp_stop(ndev->psp_hdl);
+ aie_psp_stop(ndev->aie.psp_hdl);
fini_smu:
aie2_smu_fini(ndev);
free_channel:
@@ -463,7 +463,7 @@ static int aie2_init(struct amdxdna_dev *xdna)
void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
struct init_config xrs_cfg = { 0 };
struct amdxdna_dev_hdl *ndev;
- struct psp_config psp_conf;
+ struct psp_config psp_conf = { 0 };
const struct firmware *fw;
unsigned long bars = 0;
char *fw_full_path;
@@ -551,8 +551,8 @@ static int aie2_init(struct amdxdna_dev *xdna)
psp_conf.fw_buf = fw->data;
for (i = 0; i < PSP_MAX_REGS; i++)
psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
- ndev->psp_hdl = aie2m_psp_create(&xdna->ddev, &psp_conf);
- if (!ndev->psp_hdl) {
+ ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
+ if (!ndev->aie.psp_hdl) {
XDNA_ERR(xdna, "failed to create psp");
ret = -ENOMEM;
goto release_fw;
diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_pci.h
index 96960a2219a4..4f036b9fa096 100644
--- a/drivers/accel/amdxdna/aie2_pci.h
+++ b/drivers/accel/amdxdna/aie2_pci.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
- * Copyright (C) 2023-2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2023-2026, Advanced Micro Devices, Inc.
*/
#ifndef _AIE2_PCI_H_
@@ -23,8 +23,6 @@
#define AIE2_SRAM_OFF(ndev, addr) ((addr) - (ndev)->priv->sram_dev_addr)
#define AIE2_MBOX_OFF(ndev, addr) ((addr) - (ndev)->priv->mbox_dev_addr)
-#define PSP_REG_BAR(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].bar_idx)
-#define PSP_REG_OFF(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].offset)
#define SRAM_REG_OFF(ndev, idx) ((ndev)->priv->sram_offs[(idx)].offset)
#define SMU_REG(ndev, idx) \
@@ -88,30 +86,11 @@ enum aie2_sram_reg_idx {
SRAM_MAX_INDEX /* Keep this at the end */
};
-enum psp_reg_idx {
- PSP_CMD_REG = 0,
- PSP_ARG0_REG,
- PSP_ARG1_REG,
- PSP_ARG2_REG,
- PSP_NUM_IN_REGS, /* number of input registers */
- PSP_INTR_REG = PSP_NUM_IN_REGS,
- PSP_STATUS_REG,
- PSP_RESP_REG,
- PSP_PWAITMODE_REG,
- PSP_MAX_REGS /* Keep this at the end */
-};
-
struct amdxdna_client;
struct amdxdna_fw_ver;
struct amdxdna_hwctx;
struct amdxdna_sched_job;
-struct psp_config {
- const void *fw_buf;
- u32 fw_size;
- void __iomem *psp_regs[PSP_MAX_REGS];
-};
-
struct aie_version {
u16 major;
u16 minor;
@@ -206,7 +185,6 @@ struct amdxdna_dev_hdl {
void __iomem *sram_base;
void __iomem *smu_base;
void __iomem *mbox_base;
- struct psp_device *psp_hdl;
u32 total_col;
struct aie_version version;
@@ -236,14 +214,6 @@ struct amdxdna_dev_hdl {
struct amdxdna_async_error last_async_err;
};
-#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \
- [reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE}
-
-struct aie2_bar_off_pair {
- int bar_idx;
- u32 offset;
-};
-
struct aie2_hw_ops {
int (*set_dpm)(struct amdxdna_dev_hdl *ndev, u32 dpm_level);
};
@@ -271,9 +241,9 @@ struct amdxdna_dev_priv {
u32 mbox_size;
u32 hwctx_limit;
u32 sram_dev_addr;
- struct aie2_bar_off_pair sram_offs[SRAM_MAX_INDEX];
- struct aie2_bar_off_pair psp_regs_off[PSP_MAX_REGS];
- struct aie2_bar_off_pair smu_regs_off[SMU_MAX_REGS];
+ struct aie_bar_off_pair sram_offs[SRAM_MAX_INDEX];
+ struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
+ struct aie_bar_off_pair smu_regs_off[SMU_MAX_REGS];
struct aie2_hw_ops hw_ops;
};
@@ -300,12 +270,6 @@ int aie2_pm_init(struct amdxdna_dev_hdl *ndev);
int aie2_pm_set_mode(struct amdxdna_dev_hdl *ndev, enum amdxdna_power_mode_type target);
int aie2_pm_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level);
-/* aie2_psp.c */
-struct psp_device *aie2m_psp_create(struct drm_device *ddev, struct psp_config *conf);
-int aie2_psp_start(struct psp_device *psp);
-void aie2_psp_stop(struct psp_device *psp);
-int aie2_psp_waitmode_poll(struct psp_device *psp);
-
/* aie2_error.c */
int aie2_error_async_events_alloc(struct amdxdna_dev_hdl *ndev);
void aie2_error_async_events_free(struct amdxdna_dev_hdl *ndev);
diff --git a/drivers/accel/amdxdna/aie2_psp.c b/drivers/accel/amdxdna/aie_psp.c
similarity index 88%
rename from drivers/accel/amdxdna/aie2_psp.c
rename to drivers/accel/amdxdna/aie_psp.c
index 3a7130577e3e..8743b812a449 100644
--- a/drivers/accel/amdxdna/aie2_psp.c
+++ b/drivers/accel/amdxdna/aie_psp.c
@@ -1,19 +1,16 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Copyright (C) 2022-2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
*/
#include <drm/drm_device.h>
-#include <drm/drm_gem_shmem_helper.h>
#include <drm/drm_managed.h>
#include <drm/drm_print.h>
-#include <drm/gpu_scheduler.h>
#include <linux/bitfield.h>
#include <linux/iopoll.h>
+#include <linux/slab.h>
-#include "aie2_pci.h"
-#include "amdxdna_mailbox.h"
-#include "amdxdna_pci_drv.h"
+#include "aie.h"
#define PSP_STATUS_READY BIT(31)
@@ -76,7 +73,7 @@ static int psp_exec(struct psp_device *psp, u32 *reg_vals)
return 0;
}
-int aie2_psp_waitmode_poll(struct psp_device *psp)
+int aie_psp_waitmode_poll(struct psp_device *psp)
{
struct amdxdna_dev *xdna = to_xdna_dev(psp->ddev);
u32 mode_reg;
@@ -91,7 +88,7 @@ int aie2_psp_waitmode_poll(struct psp_device *psp)
return ret;
}
-void aie2_psp_stop(struct psp_device *psp)
+void aie_psp_stop(struct psp_device *psp)
{
u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, };
int ret;
@@ -101,7 +98,7 @@ void aie2_psp_stop(struct psp_device *psp)
drm_err(psp->ddev, "release tmr failed, ret %d", ret);
}
-int aie2_psp_start(struct psp_device *psp)
+int aie_psp_start(struct psp_device *psp)
{
u32 reg_vals[PSP_NUM_IN_REGS];
int ret;
@@ -129,7 +126,7 @@ int aie2_psp_start(struct psp_device *psp)
return 0;
}
-struct psp_device *aie2m_psp_create(struct drm_device *ddev, struct psp_config *conf)
+struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf)
{
struct psp_device *psp;
u64 offset;
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
` (2 preceding siblings ...)
2026-03-30 16:37 ` [PATCH V1 3/6] accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4 Lizhi Hou
@ 2026-03-30 16:37 ` Lizhi Hou
2026-03-30 20:17 ` Mario Limonciello
` (2 more replies)
2026-03-30 16:37 ` [PATCH V1 5/6] accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4 Lizhi Hou
` (2 subsequent siblings)
6 siblings, 3 replies; 17+ messages in thread
From: Lizhi Hou @ 2026-03-30 16:37 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue, Lizhi Hou
From: David Zhang <yidong.zhang@amd.com>
Add support for loading AIE4 firmware through the common PSP
interfaces.
Compared to AIE2, AIE4 introduces an additional CERT firmware image.
aiem_psp_create() performs CERT setup when the CERT image size is
non-zero.
Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: David Zhang <yidong.zhang@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
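Note (illustration only, not part of the patch): PSP_SET_CMD() folds the
command into bits 31:24 of the ARG2 value and then applies
conf.arg2_mask. With the AIE2 mask GENMASK(23, 0) the command bits are
stripped again, so ARG2 carries only the size; with the AIE4 mask ~0 they
are kept. This appears to matter on NPU3 because PSP_CMD_REG and
PSP_ARG2_REG are both mapped to MPASP_C2PMSG_123_ALT_1 in npu3_regs.c.
The sketch below reproduces only the arithmetic in userspace; the
demo_* names and DEMO_MASK_* constants are assumptions for the example.

```c
#include <stdint.h>

#define PSP_NUM_IN_REGS 4

/* Stand-ins for the masks set in aie2_init() and aie4_prepare_firmware(). */
#define DEMO_MASK_AIE2 0x00FFFFFFu /* GENMASK(23, 0) */
#define DEMO_MASK_AIE4 0xFFFFFFFFu /* ~0 */

/* Function form of PSP_SET_CMD(): fill the four input register values. */
void demo_psp_set_cmd(uint32_t regs[PSP_NUM_IN_REGS], uint32_t arg2_mask,
		      uint32_t cmd, uint32_t arg0, uint32_t arg1,
		      uint32_t arg2)
{
	regs[0] = cmd;
	regs[1] = arg0;
	regs[2] = arg1;
	/* Command goes into the top byte; the mask decides whether it stays. */
	regs[3] = (arg2 | (cmd << 24)) & arg2_mask;
}

/* Convenience wrapper: return only the encoded ARG2 value. */
uint32_t demo_arg2(uint32_t arg2_mask, uint32_t cmd, uint32_t size)
{
	uint32_t regs[PSP_NUM_IN_REGS];

	demo_psp_set_cmd(regs, arg2_mask, cmd, 0, 0, size);
	return regs[3];
}
```

For example, PSP_VALIDATE (1) with a 0x10000-byte buffer encodes to
0x10000 under the AIE2 mask, while PSP_VALIDATE_CERT (4) with a
0x8000-byte buffer encodes to 0x04008000 under the AIE4 mask.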
drivers/accel/amdxdna/aie.h | 4 +
drivers/accel/amdxdna/aie2_pci.c | 2 +
drivers/accel/amdxdna/aie4_pci.c | 109 ++++++++++++++++++++++-
drivers/accel/amdxdna/aie4_pci.h | 4 +
drivers/accel/amdxdna/aie_psp.c | 141 +++++++++++++++++++++++-------
drivers/accel/amdxdna/npu3_regs.c | 23 +++++
6 files changed, 247 insertions(+), 36 deletions(-)
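Note (illustration only, not part of the patch): psp_alloc_fw_buf()
rounds the image size up to the alignment, over-allocates by one
alignment unit, and then copies the image to the first aligned address
inside the allocation. The userspace sketch below follows the same
steps, assuming a power-of-two alignment and using the virtual address
from malloc() in place of virt_to_phys(); struct demo_fw_buf and the
demo_* functions are hypothetical names for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round x up to a multiple of a; a must be a power of two. */
static uintptr_t demo_align_up(uintptr_t x, uintptr_t a)
{
	return (x + a - 1) & ~(a - 1);
}

struct demo_fw_buf {
	void *raw;     /* pointer returned by malloc(), needed for free() */
	uint8_t *data; /* aligned start, where the image is copied */
	size_t buf_sz; /* image size rounded up to the alignment */
};

/* Returns 0 on success, -1 if the allocation fails. */
int demo_alloc_fw_buf(struct demo_fw_buf *b, const void *fw,
		      size_t fw_size, size_t align)
{
	uintptr_t addr;

	b->buf_sz = demo_align_up(fw_size, align);
	/* One extra alignment unit leaves room to slide to an aligned start. */
	b->raw = malloc(b->buf_sz + align);
	if (!b->raw)
		return -1;

	addr = (uintptr_t)b->raw;
	b->data = (uint8_t *)b->raw + (demo_align_up(addr, align) - addr);
	memcpy(b->data, fw, fw_size);
	return 0;
}
```

The caller keeps both pointers: data is what would be handed to the
PSP (after translation to a physical address in the driver), and raw is
what must eventually be freed.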
diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
index 124c0f7e9ca0..423ed34af9ee 100644
--- a/drivers/accel/amdxdna/aie.h
+++ b/drivers/accel/amdxdna/aie.h
@@ -57,7 +57,11 @@ struct aie_bar_off_pair {
struct psp_config {
const void *fw_buf;
u32 fw_size;
+ const void *certfw_buf;
+ u32 certfw_size;
void __iomem *psp_regs[PSP_MAX_REGS];
+ u32 arg2_mask;
+ u32 notify_val;
};
/* aie.c */
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index e4b7893bd429..0489e668cd73 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -549,6 +549,8 @@ static int aie2_init(struct amdxdna_dev *xdna)
psp_conf.fw_size = fw->size;
psp_conf.fw_buf = fw->data;
+ psp_conf.arg2_mask = GENMASK(23, 0);
+ psp_conf.notify_val = 1;
for (i = 0; i < PSP_MAX_REGS; i++)
psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
index 0f360c1ccebd..e7993b315996 100644
--- a/drivers/accel/amdxdna/aie4_pci.c
+++ b/drivers/accel/amdxdna/aie4_pci.c
@@ -6,11 +6,15 @@
#include <drm/amdxdna_accel.h>
#include <drm/drm_managed.h>
#include <drm/drm_print.h>
+#include <linux/firmware.h>
+#include <linux/sizes.h>
#include "aie4_pci.h"
#include "amdxdna_pci_drv.h"
-#define NO_IOHUB 0
+#define NO_IOHUB 0
+#define CERTFW_MAX_SIZE (SZ_32K + SZ_256)
+#define PSP_NOTIFY_INTR 0xD007BE11
/*
* The management mailbox channel is allocated by firmware.
@@ -207,13 +211,12 @@ static int aie4_mailbox_init(struct amdxdna_dev *xdna)
static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
{
- /* TODO */
+ aie_psp_stop(ndev->aie.psp_hdl);
}
static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
{
- /* TODO */
- return 0;
+ return aie_psp_start(ndev->aie.psp_hdl);
}
static int aie4_hw_start(struct amdxdna_dev *xdna)
@@ -261,11 +264,98 @@ static void aie4_hw_stop(struct amdxdna_dev *xdna)
aie4_fw_unload(ndev);
}
+static int aie4_request_firmware(struct amdxdna_dev_hdl *ndev,
+ const struct firmware **npufw,
+ const struct firmware **certfw)
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
+ char fw_name[128];
+ int ret;
+
+ ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
+ pdev->device, pdev->revision, ndev->priv->npufw_path);
+ if (ret >= sizeof(fw_name)) {
+ XDNA_ERR(xdna, "npu firmware path is truncated");
+ return -EINVAL;
+ }
+
+ ret = request_firmware(npufw, fw_name, &pdev->dev);
+ if (ret) {
+ XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
+ return ret;
+ }
+
+ ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
+ pdev->device, pdev->revision, ndev->priv->certfw_path);
+ if (ret >= sizeof(fw_name)) {
+ XDNA_ERR(xdna, "cert firmware path is truncated");
+ ret = -EINVAL;
+ goto release_npufw;
+ }
+
+ ret = request_firmware(certfw, fw_name, &pdev->dev);
+ if (ret) {
+ XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
+ goto release_npufw;
+ }
+
+ if ((*certfw)->size > CERTFW_MAX_SIZE) {
+ XDNA_ERR(xdna, "CERTFW over maximum size of 32 KB + 256 B");
+ ret = -EINVAL;
+ goto release_certfw;
+ }
+
+ return 0;
+
+release_certfw:
+ release_firmware(*certfw);
+release_npufw:
+ release_firmware(*npufw);
+
+ return ret;
+}
+
+static void aie4_release_firmware(struct amdxdna_dev_hdl *ndev,
+ const struct firmware *npufw,
+ const struct firmware *certfw)
+{
+ release_firmware(certfw);
+ release_firmware(npufw);
+}
+
+static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
+ const struct firmware *npufw,
+ const struct firmware *certfw,
+ void __iomem *tbl[PCI_NUM_RESOURCES])
+{
+ struct amdxdna_dev *xdna = ndev->aie.xdna;
+ struct psp_config psp_conf;
+ int i;
+
+ psp_conf.fw_size = npufw->size;
+ psp_conf.fw_buf = npufw->data;
+ psp_conf.certfw_size = certfw->size;
+ psp_conf.certfw_buf = certfw->data;
+ psp_conf.arg2_mask = ~0;
+ psp_conf.notify_val = PSP_NOTIFY_INTR;
+ for (i = 0; i < PSP_MAX_REGS; i++)
+ psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
+ ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
+ if (!ndev->aie.psp_hdl) {
+ XDNA_ERR(xdna, "failed to create psp");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
{
struct amdxdna_dev *xdna = ndev->aie.xdna;
struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
+ const struct firmware *npufw, *certfw;
unsigned long bars = 0;
int ret, i;
@@ -282,6 +372,8 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
return ret;
}
+ for (i = 0; i < PSP_MAX_REGS; i++)
+ set_bit(PSP_REG_BAR(ndev, i), &bars);
set_bit(xdna->dev_info->mbox_bar, &bars);
set_bit(xdna->dev_info->sram_bar, &bars);
@@ -300,6 +392,15 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
pci_set_master(pdev);
+ ret = aie4_request_firmware(ndev, &npufw, &certfw);
+ if (ret)
+ goto clear_master;
+
+ ret = aie4_prepare_firmware(ndev, npufw, certfw, tbl);
+ aie4_release_firmware(ndev, npufw, certfw);
+ if (ret)
+ goto clear_master;
+
ret = aie4_irq_init(xdna);
if (ret)
goto clear_master;
diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
index f3810a969431..ee388ccf7196 100644
--- a/drivers/accel/amdxdna/aie4_pci.h
+++ b/drivers/accel/amdxdna/aie4_pci.h
@@ -14,9 +14,13 @@
#include "amdxdna_mailbox.h"
struct amdxdna_dev_priv {
+ const char *npufw_path;
+ const char *certfw_path;
u32 mbox_bar;
u32 mbox_rbuf_bar;
u64 mbox_info_off;
+
+ struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
};
struct amdxdna_dev_hdl {
diff --git a/drivers/accel/amdxdna/aie_psp.c b/drivers/accel/amdxdna/aie_psp.c
index 8743b812a449..458dca7cc5a0 100644
--- a/drivers/accel/amdxdna/aie_psp.c
+++ b/drivers/accel/amdxdna/aie_psp.c
@@ -18,6 +18,7 @@
#define PSP_VALIDATE 1
#define PSP_START 2
#define PSP_RELEASE_TMR 3
+#define PSP_VALIDATE_CERT 4
/* PSP special arguments */
#define PSP_START_COPY_FW 1
@@ -27,10 +28,20 @@
#define PSP_ERROR_BAD_STATE 0xFFFF0007
#define PSP_FW_ALIGN 0x10000
+#define PSP_CFW_ALIGN 0x8000
#define PSP_POLL_INTERVAL 20000 /* us */
#define PSP_POLL_TIMEOUT 1000000 /* us */
-#define PSP_REG(p, reg) ((p)->psp_regs[reg])
+#define PSP_REG(p, reg) ((p)->conf.psp_regs[reg])
+#define PSP_SET_CMD(psp, reg_vals, cmd, arg0, arg1, arg2) \
+({ \
+ u32 *_regs = reg_vals; \
+ u32 _cmd = cmd; \
+ _regs[0] = _cmd; \
+ _regs[1] = arg0; \
+ _regs[2] = arg1; \
+ _regs[3] = ((arg2) | ((_cmd) << 24)) & (psp)->conf.arg2_mask; \
+})
struct psp_device {
struct drm_device *ddev;
@@ -38,7 +49,9 @@ struct psp_device {
u32 fw_buf_sz;
u64 fw_paddr;
void *fw_buffer;
- void __iomem *psp_regs[PSP_MAX_REGS];
+ u32 certfw_buf_sz;
+ u64 certfw_paddr;
+ void *certfw_buffer;
};
static int psp_exec(struct psp_device *psp, u32 *reg_vals)
@@ -47,13 +60,22 @@ static int psp_exec(struct psp_device *psp, u32 *reg_vals)
int ret, i;
u32 ready;
+ /* Check for PSP ready before any write */
+ ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
+ FIELD_GET(PSP_STATUS_READY, ready),
+ PSP_POLL_INTERVAL, PSP_POLL_TIMEOUT);
+ if (ret) {
+ drm_err(psp->ddev, "PSP is not ready, ret %d", ret);
+ return ret;
+ }
+
/* Write command and argument registers */
for (i = 0; i < PSP_NUM_IN_REGS; i++)
writel(reg_vals[i], PSP_REG(psp, i));
/* clear and set PSP INTR register to kick off */
writel(0, PSP_REG(psp, PSP_INTR_REG));
- writel(1, PSP_REG(psp, PSP_INTR_REG));
+ writel(psp->conf.notify_val, PSP_REG(psp, PSP_INTR_REG));
/* PSP should be busy. Wait for ready, so we know task is done. */
ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
@@ -90,69 +112,124 @@ int aie_psp_waitmode_poll(struct psp_device *psp)
void aie_psp_stop(struct psp_device *psp)
{
- u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, };
+ u32 reg_vals[PSP_NUM_IN_REGS];
int ret;
+ PSP_SET_CMD(psp, reg_vals, PSP_RELEASE_TMR, 0, 0, 0);
+
ret = psp_exec(psp, reg_vals);
if (ret)
drm_err(psp->ddev, "release tmr failed, ret %d", ret);
}
-int aie_psp_start(struct psp_device *psp)
+static int psp_validate_fw(struct psp_device *psp, u8 cmd, u64 paddr, u32 buf_sz)
{
u32 reg_vals[PSP_NUM_IN_REGS];
int ret;
- reg_vals[0] = PSP_VALIDATE;
- reg_vals[1] = lower_32_bits(psp->fw_paddr);
- reg_vals[2] = upper_32_bits(psp->fw_paddr);
- reg_vals[3] = psp->fw_buf_sz;
+ PSP_SET_CMD(psp, reg_vals, cmd, lower_32_bits(paddr),
+ upper_32_bits(paddr), buf_sz);
ret = psp_exec(psp, reg_vals);
- if (ret) {
+ if (ret)
drm_err(psp->ddev, "failed to validate fw, ret %d", ret);
- return ret;
- }
- memset(reg_vals, 0, sizeof(reg_vals));
- reg_vals[0] = PSP_START;
- reg_vals[1] = PSP_START_COPY_FW;
+ return ret;
+}
+
+static int psp_start(struct psp_device *psp)
+{
+ u32 reg_vals[PSP_NUM_IN_REGS];
+ int ret;
+
+ PSP_SET_CMD(psp, reg_vals, PSP_START, PSP_START_COPY_FW, 0, 0);
+
ret = psp_exec(psp, reg_vals);
- if (ret) {
+ if (ret)
drm_err(psp->ddev, "failed to start fw, ret %d", ret);
+
+ return ret;
+}
+
+int aie_psp_start(struct psp_device *psp)
+{
+ int ret;
+
+ ret = psp_validate_fw(psp, PSP_VALIDATE,
+ psp->fw_paddr, psp->fw_buf_sz);
+ if (ret)
return ret;
- }
- return 0;
+ if (!psp->certfw_buf_sz)
+ goto psp_start;
+
+ ret = psp_validate_fw(psp, PSP_VALIDATE_CERT,
+ psp->certfw_paddr, psp->certfw_buf_sz);
+ if (ret)
+ return ret;
+psp_start:
+ return psp_start(psp);
+}
+
+/*
+ * PSP requires host physical address to load firmware.
+ * Allocate a buffer, obtain its physical address, align, and copy data in.
+ */
+static void *psp_alloc_fw_buf(struct psp_device *psp, const void *fw_data,
+ u32 fw_size, u32 align, u32 *buf_sz,
+ u64 *paddr)
+{
+ u32 alloc_sz;
+ void *buffer;
+ u64 offset;
+
+ *buf_sz = ALIGN(fw_size, align);
+ alloc_sz = *buf_sz + align;
+
+ buffer = drmm_kmalloc(psp->ddev, alloc_sz, GFP_KERNEL);
+ if (!buffer)
+ return NULL;
+
+ *paddr = virt_to_phys(buffer);
+ offset = ALIGN(*paddr, align) - *paddr;
+ *paddr += offset;
+ memcpy(buffer + offset, fw_data, fw_size);
+
+ return buffer;
}
struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf)
{
struct psp_device *psp;
- u64 offset;
psp = drmm_kzalloc(ddev, sizeof(*psp), GFP_KERNEL);
if (!psp)
return NULL;
psp->ddev = ddev;
- memcpy(psp->psp_regs, conf->psp_regs, sizeof(psp->psp_regs));
+ psp->fw_buffer = psp_alloc_fw_buf(psp, conf->fw_buf, conf->fw_size,
+ PSP_FW_ALIGN, &psp->fw_buf_sz,
+ &psp->fw_paddr);
+ if (!psp->fw_buffer)
+ return NULL;
+
+ if (!conf->certfw_size) {
+ drm_dbg(ddev, "no cert fw");
+ goto done;
+ }
- psp->fw_buf_sz = ALIGN(conf->fw_size, PSP_FW_ALIGN);
- psp->fw_buffer = drmm_kmalloc(ddev, psp->fw_buf_sz + PSP_FW_ALIGN, GFP_KERNEL);
- if (!psp->fw_buffer) {
- drm_err(ddev, "no memory for fw buffer");
+ /* CERT firmware */
+ psp->certfw_buffer = psp_alloc_fw_buf(psp, conf->certfw_buf,
+ conf->certfw_size, PSP_CFW_ALIGN,
+ &psp->certfw_buf_sz,
+ &psp->certfw_paddr);
+ if (!psp->certfw_buffer) {
+ drm_err(ddev, "no memory for cert fw buffer");
return NULL;
}
- /*
- * AMD Platform Security Processor(PSP) requires host physical
- * address to load NPU firmware.
- */
- psp->fw_paddr = virt_to_phys(psp->fw_buffer);
- offset = ALIGN(psp->fw_paddr, PSP_FW_ALIGN) - psp->fw_paddr;
- psp->fw_paddr += offset;
- memcpy(psp->fw_buffer + offset, conf->fw_buf, conf->fw_size);
+done:
+ memcpy(&psp->conf, conf, sizeof(psp->conf));
return psp;
}
diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
index f6e20f4858db..fb2bd60b8f00 100644
--- a/drivers/accel/amdxdna/npu3_regs.c
+++ b/drivers/accel/amdxdna/npu3_regs.c
@@ -16,6 +16,15 @@
/* PCIe BAR Index for NPU3 */
#define NPU3_REG_BAR_INDEX 0
+#define NPU3_PSP_BAR_INDEX 4
+
+#define MMNPU_APERTURE3_BASE 0x3810000
+#define NPU3_PSP_BAR_BASE MMNPU_APERTURE3_BASE
+
+#define MPASP_C2PMSG_123_ALT_1 0x3810AEC
+#define MPASP_C2PMSG_156_ALT_1 0x3810B70
+#define MPASP_C2PMSG_157_ALT_1 0x3810B74
+#define MPASP_C2PMSG_73_ALT_1 0x3810A24
static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
{ .major = 5, .min_minor = 10 },
@@ -23,14 +32,28 @@ static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
};
static const struct amdxdna_dev_priv npu3_dev_priv = {
+ .npufw_path = "npu.dev.sbin",
+ .certfw_path = "cert.dev.sbin",
.mbox_bar = NPU3_MBOX_BAR,
.mbox_rbuf_bar = NPU3_MBOX_BUFFER_BAR,
.mbox_info_off = NPU3_MBOX_INFO_OFF,
+ .psp_regs_off = {
+ DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU3_PSP, MPASP_C2PMSG_157_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU3_PSP, MPASP_C2PMSG_73_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
+ DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
+ /* npu3 doesn't use 8th pwaitmode register */
+ },
+
};
const struct amdxdna_dev_info dev_npu3_pf_info = {
.mbox_bar = NPU3_MBOX_BAR,
.sram_bar = NPU3_MBOX_BUFFER_BAR,
+ .psp_bar = NPU3_PSP_BAR_INDEX,
.vbnv = "RyzenAI-npu3-pf",
.device_type = AMDXDNA_DEV_TYPE_PF,
.dev_priv = &npu3_dev_priv,
--
2.34.1
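
The over-allocate-then-align step in psp_alloc_fw_buf() above can be sketched in user space. This is only an illustration under stated assumptions: malloc() and the pointer value stand in for drmm_kmalloc() and virt_to_phys(), and the names fw_buf, fw_buf_alloc and ALIGN_UP are hypothetical, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((uint64_t)(x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Hypothetical stand-in for the fw_buffer/fw_paddr/fw_buf_sz triple. */
struct fw_buf {
	void *base;      /* pointer returned by the allocator, needed to free */
	void *data;      /* aligned address where the firmware copy starts */
	uint32_t buf_sz; /* firmware size rounded up to the alignment */
};

/*
 * Allocate buf_sz + align bytes so that an aligned window of buf_sz bytes
 * always fits, wherever the allocator places the block, then copy the
 * image to the aligned address. Returns 0 on success, -1 on failure.
 */
static int fw_buf_alloc(struct fw_buf *fb, const void *fw,
			uint32_t fw_size, uint32_t align)
{
	uint64_t addr, offset;

	assert(align && (align & (align - 1)) == 0);

	fb->buf_sz = (uint32_t)ALIGN_UP(fw_size, align);
	fb->base = malloc((size_t)fb->buf_sz + align);
	if (!fb->base)
		return -1;

	/* Assumption: the virtual address plays the role of the physical one. */
	addr = (uint64_t)(uintptr_t)fb->base;
	offset = ALIGN_UP(addr, align) - addr;
	fb->data = (char *)fb->base + offset;
	memcpy(fb->data, fw, fw_size);

	return 0;
}
```

The extra align bytes are the price of not needing an aligned allocator: the worst-case offset is align - 1, so the aligned window never runs past the end of the allocation.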
* [PATCH V1 5/6] accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
` (3 preceding siblings ...)
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
@ 2026-03-30 16:37 ` Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 6/6] accel/amdxdna: Add AIE4 power on and off support Lizhi Hou
2026-03-31 7:05 ` Claude review: accel/amdxdna: Initial support for AIE4 platform Claude Code Review Bot
6 siblings, 1 reply; 17+ messages in thread
From: Lizhi Hou @ 2026-03-30 16:37 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue, Lizhi Hou
From: David Zhang <yidong.zhang@amd.com>
AIE2 and AIE4 use similar interfaces to the SMU (System Management
Unit). Move the SMU implementation into aie_smu.c and provide common
interfaces for both platforms.
This allows AIE2 and AIE4 to share the same implementation and reduces
code duplication.
Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: David Zhang <yidong.zhang@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
drivers/accel/amdxdna/Makefile | 2 +-
drivers/accel/amdxdna/aie.h | 25 +++++
drivers/accel/amdxdna/aie2_pci.c | 22 ++++-
drivers/accel/amdxdna/aie2_pci.h | 20 ----
drivers/accel/amdxdna/aie2_smu.c | 156 ------------------------------
drivers/accel/amdxdna/aie_smu.c | 153 +++++++++++++++++++++++++++++
drivers/accel/amdxdna/npu1_regs.c | 21 ++++
drivers/accel/amdxdna/npu4_regs.c | 26 +++++
8 files changed, 245 insertions(+), 180 deletions(-)
delete mode 100644 drivers/accel/amdxdna/aie2_smu.c
create mode 100644 drivers/accel/amdxdna/aie_smu.c
diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile
index d3c0fe765a8b..79369e497540 100644
--- a/drivers/accel/amdxdna/Makefile
+++ b/drivers/accel/amdxdna/Makefile
@@ -3,12 +3,12 @@
amdxdna-y := \
aie.o \
aie_psp.o \
+ aie_smu.o \
aie2_ctx.o \
aie2_error.o \
aie2_message.o \
aie2_pci.o \
aie2_pm.o \
- aie2_smu.o \
aie2_solver.o \
aie4_message.o \
aie4_pci.o \
diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
index 423ed34af9ee..ba4c9ee21823 100644
--- a/drivers/accel/amdxdna/aie.h
+++ b/drivers/accel/amdxdna/aie.h
@@ -12,6 +12,7 @@
#define AIE_TIMEOUT 1000000 /* us */
struct psp_device;
+struct smu_device;
struct aie_device {
struct amdxdna_dev *xdna;
@@ -24,6 +25,7 @@ struct aie_device {
unsigned long feature_mask;
struct psp_device *psp_hdl;
+ struct smu_device *smu_hdl;
};
#define DECLARE_AIE_MSG(name, op) \
@@ -33,9 +35,21 @@ struct aie_device {
#define PSP_REG_BAR(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].bar_idx)
#define PSP_REG_OFF(ndev, idx) ((ndev)->priv->psp_regs_off[(idx)].offset)
+#define SMU_REG_BAR(ndev, idx) ((ndev)->priv->smu_regs_off[(idx)].bar_idx)
+#define SMU_REG_OFF(ndev, idx) ((ndev)->priv->smu_regs_off[(idx)].offset)
+
#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \
[reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE}
+enum smu_reg_idx {
+ SMU_CMD_REG = 0,
+ SMU_ARG_REG,
+ SMU_INTR_REG,
+ SMU_RESP_REG,
+ SMU_OUT_REG,
+ SMU_MAX_REGS /* Keep this at the end */
+};
+
enum psp_reg_idx {
PSP_CMD_REG = 0,
PSP_ARG0_REG,
@@ -54,6 +68,10 @@ struct aie_bar_off_pair {
u32 offset;
};
+struct smu_config {
+ void __iomem *smu_regs[SMU_MAX_REGS];
+};
+
struct psp_config {
const void *fw_buf;
u32 fw_size;
@@ -76,4 +94,11 @@ int aie_psp_start(struct psp_device *psp);
void aie_psp_stop(struct psp_device *psp);
int aie_psp_waitmode_poll(struct psp_device *psp);
+/* aie_smu.c */
+struct smu_device *aiem_smu_create(struct drm_device *ddev, struct smu_config *conf);
+int aie_smu_init(struct smu_device *smu);
+void aie_smu_fini(struct smu_device *smu);
+int aie_smu_set_clocks(struct smu_device *smu, u32 *npuclk, u32 *hclk);
+int aie_smu_set_dpm(struct smu_device *smu, u32 dpm_level);
+
#endif /* _AIE_H_ */
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 0489e668cd73..164e188ba501 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -282,6 +282,12 @@ static struct xrs_action_ops aie2_xrs_actions = {
.set_dft_dpm_level = aie2_xrs_set_dft_dpm_level,
};
+static void aie2_smu_fini(struct amdxdna_dev_hdl *ndev)
+{
+ ndev->priv->hw_ops.set_dpm(ndev, 0);
+ aie_smu_fini(ndev->aie.smu_hdl);
+}
+
static void aie2_hw_stop(struct amdxdna_dev *xdna)
{
struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
@@ -344,7 +350,7 @@ static int aie2_hw_start(struct amdxdna_dev *xdna)
goto disable_dev;
}
- ret = aie2_smu_init(ndev);
+ ret = aie_smu_init(ndev->aie.smu_hdl);
if (ret) {
XDNA_ERR(xdna, "failed to init smu, ret %d", ret);
goto free_channel;
@@ -464,6 +470,7 @@ static int aie2_init(struct amdxdna_dev *xdna)
struct init_config xrs_cfg = { 0 };
struct amdxdna_dev_hdl *ndev;
struct psp_config psp_conf = { 0 };
+ struct smu_config smu_conf;
const struct firmware *fw;
unsigned long bars = 0;
char *fw_full_path;
@@ -508,9 +515,10 @@ static int aie2_init(struct amdxdna_dev *xdna)
for (i = 0; i < PSP_MAX_REGS; i++)
set_bit(PSP_REG_BAR(ndev, i), &bars);
+ for (i = 0; i < SMU_MAX_REGS; i++)
+ set_bit(SMU_REG_BAR(ndev, i), &bars);
set_bit(xdna->dev_info->sram_bar, &bars);
- set_bit(xdna->dev_info->smu_bar, &bars);
set_bit(xdna->dev_info->mbox_bar, &bars);
for (i = 0; i < PCI_NUM_RESOURCES; i++) {
@@ -525,7 +533,6 @@ static int aie2_init(struct amdxdna_dev *xdna)
}
ndev->sram_base = tbl[xdna->dev_info->sram_bar];
- ndev->smu_base = tbl[xdna->dev_info->smu_bar];
ndev->mbox_base = tbl[xdna->dev_info->mbox_bar];
ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
@@ -559,6 +566,15 @@ static int aie2_init(struct amdxdna_dev *xdna)
ret = -ENOMEM;
goto release_fw;
}
+
+ for (i = 0; i < SMU_MAX_REGS; i++)
+ smu_conf.smu_regs[i] = tbl[SMU_REG_BAR(ndev, i)] + SMU_REG_OFF(ndev, i);
+ ndev->aie.smu_hdl = aiem_smu_create(&xdna->ddev, &smu_conf);
+ if (!ndev->aie.smu_hdl) {
+ XDNA_ERR(xdna, "failed to create smu");
+ ret = -ENOMEM;
+ goto release_fw;
+ }
xdna->dev_handle = ndev;
ret = aie2_hw_start(xdna);
diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_pci.h
index 4f036b9fa096..7c308672b5fe 100644
--- a/drivers/accel/amdxdna/aie2_pci.h
+++ b/drivers/accel/amdxdna/aie2_pci.h
@@ -25,11 +25,6 @@
#define SRAM_REG_OFF(ndev, idx) ((ndev)->priv->sram_offs[(idx)].offset)
-#define SMU_REG(ndev, idx) \
-({ \
- typeof(ndev) _ndev = ndev; \
- ((_ndev)->smu_base + (_ndev)->priv->smu_regs_off[(idx)].offset); \
-})
#define SRAM_GET_ADDR(ndev, idx) \
({ \
typeof(ndev) _ndev = ndev; \
@@ -71,15 +66,6 @@
})
#endif
-enum aie2_smu_reg_idx {
- SMU_CMD_REG = 0,
- SMU_ARG_REG,
- SMU_INTR_REG,
- SMU_RESP_REG,
- SMU_OUT_REG,
- SMU_MAX_REGS /* Keep this at the end */
-};
-
enum aie2_sram_reg_idx {
MBOX_CHANN_OFF = 0,
FW_ALIVE_OFF,
@@ -183,7 +169,6 @@ struct amdxdna_dev_hdl {
struct aie_device aie;
const struct amdxdna_dev_priv *priv;
void __iomem *sram_base;
- void __iomem *smu_base;
void __iomem *mbox_base;
u32 total_col;
@@ -258,11 +243,6 @@ extern const struct dpm_clk_freq npu4_dpm_clk_table[];
extern const struct rt_config npu1_default_rt_cfg[];
extern const struct rt_config npu4_default_rt_cfg[];
extern const struct amdxdna_fw_feature_tbl npu4_fw_feature_table[];
-
-/* aie2_smu.c */
-int aie2_smu_init(struct amdxdna_dev_hdl *ndev);
-void aie2_smu_fini(struct amdxdna_dev_hdl *ndev);
-int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level);
int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level);
/* aie2_pm.c */
diff --git a/drivers/accel/amdxdna/aie2_smu.c b/drivers/accel/amdxdna/aie2_smu.c
deleted file mode 100644
index 1b966bbef2e5..000000000000
--- a/drivers/accel/amdxdna/aie2_smu.c
+++ /dev/null
@@ -1,156 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2022-2024, Advanced Micro Devices, Inc.
- */
-
-#include <drm/drm_device.h>
-#include <drm/drm_gem_shmem_helper.h>
-#include <drm/drm_print.h>
-#include <drm/gpu_scheduler.h>
-#include <linux/iopoll.h>
-
-#include "aie2_pci.h"
-#include "amdxdna_pci_drv.h"
-
-#define SMU_RESULT_OK 1
-
-/* SMU commands */
-#define AIE2_SMU_POWER_ON 0x3
-#define AIE2_SMU_POWER_OFF 0x4
-#define AIE2_SMU_SET_MPNPUCLK_FREQ 0x5
-#define AIE2_SMU_SET_HCLK_FREQ 0x6
-#define AIE2_SMU_SET_SOFT_DPMLEVEL 0x7
-#define AIE2_SMU_SET_HARD_DPMLEVEL 0x8
-
-#define NPU4_DPM_TOPS(ndev, dpm_level) \
-({ \
- typeof(ndev) _ndev = ndev; \
- (4096 * (_ndev)->total_col * \
- (_ndev)->priv->dpm_clk_tbl[dpm_level].hclk / 1000000); \
-})
-
-static int aie2_smu_exec(struct amdxdna_dev_hdl *ndev, u32 reg_cmd,
- u32 reg_arg, u32 *out)
-{
- u32 resp;
- int ret;
-
- writel(0, SMU_REG(ndev, SMU_RESP_REG));
- writel(reg_arg, SMU_REG(ndev, SMU_ARG_REG));
- writel(reg_cmd, SMU_REG(ndev, SMU_CMD_REG));
-
- /* Clear and set SMU_INTR_REG to kick off */
- writel(0, SMU_REG(ndev, SMU_INTR_REG));
- writel(1, SMU_REG(ndev, SMU_INTR_REG));
-
- ret = readx_poll_timeout(readl, SMU_REG(ndev, SMU_RESP_REG), resp,
- resp, AIE_INTERVAL, AIE_TIMEOUT);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "smu cmd %d timed out", reg_cmd);
- return ret;
- }
-
- if (out)
- *out = readl(SMU_REG(ndev, SMU_OUT_REG));
-
- if (resp != SMU_RESULT_OK) {
- XDNA_ERR(ndev->aie.xdna, "smu cmd %d failed, 0x%x", reg_cmd, resp);
- return -EINVAL;
- }
-
- return 0;
-}
-
-int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
-{
- u32 freq;
- int ret;
-
- ret = aie2_smu_exec(ndev, AIE2_SMU_SET_MPNPUCLK_FREQ,
- ndev->priv->dpm_clk_tbl[dpm_level].npuclk, &freq);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Set npu clock to %d failed, ret %d\n",
- ndev->priv->dpm_clk_tbl[dpm_level].npuclk, ret);
- return ret;
- }
- ndev->npuclk_freq = freq;
-
- ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HCLK_FREQ,
- ndev->priv->dpm_clk_tbl[dpm_level].hclk, &freq);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Set h clock to %d failed, ret %d\n",
- ndev->priv->dpm_clk_tbl[dpm_level].hclk, ret);
- return ret;
- }
-
- ndev->hclk_freq = freq;
- ndev->max_tops = 2 * ndev->total_col;
- ndev->curr_tops = ndev->max_tops * freq / 1028;
-
- XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
- ndev->npuclk_freq, ndev->hclk_freq);
-
- return 0;
-}
-
-int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
-{
- int ret;
-
- ret = aie2_smu_exec(ndev, AIE2_SMU_SET_HARD_DPMLEVEL, dpm_level, NULL);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Set hard dpm level %d failed, ret %d ",
- dpm_level, ret);
- return ret;
- }
-
- ret = aie2_smu_exec(ndev, AIE2_SMU_SET_SOFT_DPMLEVEL, dpm_level, NULL);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Set soft dpm level %d failed, ret %d",
- dpm_level, ret);
- return ret;
- }
-
- ndev->npuclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].npuclk;
- ndev->hclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].hclk;
- ndev->max_tops = NPU4_DPM_TOPS(ndev, ndev->max_dpm_level);
- ndev->curr_tops = NPU4_DPM_TOPS(ndev, dpm_level);
-
- XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
- ndev->npuclk_freq, ndev->hclk_freq);
-
- return 0;
-}
-
-int aie2_smu_init(struct amdxdna_dev_hdl *ndev)
-{
- int ret;
-
- /*
- * Failing to set power off indicates an unrecoverable hardware or
- * firmware error.
- */
- ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Access power failed, ret %d", ret);
- return ret;
- }
-
- ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_ON, 0, NULL);
- if (ret) {
- XDNA_ERR(ndev->aie.xdna, "Power on failed, ret %d", ret);
- return ret;
- }
-
- return 0;
-}
-
-void aie2_smu_fini(struct amdxdna_dev_hdl *ndev)
-{
- int ret;
-
- ndev->priv->hw_ops.set_dpm(ndev, 0);
- ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
- if (ret)
- XDNA_ERR(ndev->aie.xdna, "Power off failed, ret %d", ret);
-}
diff --git a/drivers/accel/amdxdna/aie_smu.c b/drivers/accel/amdxdna/aie_smu.c
new file mode 100644
index 000000000000..62aea550aabc
--- /dev/null
+++ b/drivers/accel/amdxdna/aie_smu.c
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Advanced Micro Devices, Inc.
+ */
+
+#include <drm/amdxdna_accel.h>
+#include <drm/drm_device.h>
+#include <drm/drm_managed.h>
+#include <drm/drm_print.h>
+#include <drm/gpu_scheduler.h>
+#include <linux/iopoll.h>
+
+#include "aie.h"
+
+#define SMU_RESULT_OK 1
+
+/* SMU commands */
+#define AIE_SMU_POWER_ON 0x3
+#define AIE_SMU_POWER_OFF 0x4
+#define AIE_SMU_SET_MPNPUCLK_FREQ 0x5
+#define AIE_SMU_SET_HCLK_FREQ 0x6
+#define AIE_SMU_SET_SOFT_DPMLEVEL 0x7
+#define AIE_SMU_SET_HARD_DPMLEVEL 0x8
+
+#define SMU_REG(s, reg) ((s)->smu_regs[reg])
+
+struct smu_device {
+ struct drm_device *ddev;
+ struct smu_config conf;
+ void __iomem *smu_regs[SMU_MAX_REGS];
+};
+
+static int aie_smu_exec(struct smu_device *smu, u32 reg_cmd, u32 reg_arg, u32 *out)
+{
+ u32 resp;
+ int ret;
+
+ writel(0, SMU_REG(smu, SMU_RESP_REG));
+ writel(reg_arg, SMU_REG(smu, SMU_ARG_REG));
+ writel(reg_cmd, SMU_REG(smu, SMU_CMD_REG));
+
+ /* Clear and set SMU_INTR_REG to kick off */
+ writel(0, SMU_REG(smu, SMU_INTR_REG));
+ writel(1, SMU_REG(smu, SMU_INTR_REG));
+
+ ret = readx_poll_timeout(readl, SMU_REG(smu, SMU_RESP_REG), resp,
+ resp, AIE_INTERVAL, AIE_TIMEOUT);
+ if (ret) {
+ drm_err(smu->ddev, "smu cmd %d timed out", reg_cmd);
+ return ret;
+ }
+
+ if (out)
+ *out = readl(SMU_REG(smu, SMU_OUT_REG));
+
+ if (resp != SMU_RESULT_OK) {
+ drm_err(smu->ddev, "smu cmd %d failed, 0x%x", reg_cmd, resp);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int aie_smu_init(struct smu_device *smu)
+{
+ int ret;
+
+ /*
+ * Failing to set power off indicates an unrecoverable hardware or
+ * firmware error.
+ */
+ ret = aie_smu_exec(smu, AIE_SMU_POWER_OFF, 0, NULL);
+ if (ret) {
+ drm_err(smu->ddev, "Access power failed, ret %d", ret);
+ return ret;
+ }
+
+ ret = aie_smu_exec(smu, AIE_SMU_POWER_ON, 0, NULL);
+ if (ret) {
+ drm_err(smu->ddev, "Power on failed, ret %d", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+void aie_smu_fini(struct smu_device *smu)
+{
+ int ret;
+
+ ret = aie_smu_exec(smu, AIE_SMU_POWER_OFF, 0, NULL);
+ if (ret)
+ drm_err(smu->ddev, "Power off failed, ret %d", ret);
+}
+
+int aie_smu_set_clocks(struct smu_device *smu, u32 *npuclk, u32 *hclk)
+{
+ int ret;
+
+ if (npuclk) {
+ ret = aie_smu_exec(smu, AIE_SMU_SET_MPNPUCLK_FREQ, *npuclk, npuclk);
+ if (ret) {
+ drm_err(smu->ddev, "Set mpnpu clock to %d failed, ret %d", *npuclk, ret);
+ return ret;
+ }
+ }
+
+ if (hclk) {
+ ret = aie_smu_exec(smu, AIE_SMU_SET_HCLK_FREQ, *hclk, hclk);
+ if (ret) {
+ drm_err(smu->ddev, "Set hclock to %d failed, ret %d",
+ *hclk, ret);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+int aie_smu_set_dpm(struct smu_device *smu, u32 dpm_level)
+{
+ int ret;
+
+ ret = aie_smu_exec(smu, AIE_SMU_SET_HARD_DPMLEVEL, dpm_level, NULL);
+ if (ret) {
+ drm_err(smu->ddev, "Set hard dpm level %d failed, ret %d",
+ dpm_level, ret);
+ return ret;
+ }
+
+ ret = aie_smu_exec(smu, AIE_SMU_SET_SOFT_DPMLEVEL, dpm_level, NULL);
+ if (ret) {
+ drm_err(smu->ddev, "Set soft dpm level %d failed, ret %d",
+ dpm_level, ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+struct smu_device *aiem_smu_create(struct drm_device *ddev, struct smu_config *conf)
+{
+ struct smu_device *smu;
+
+ smu = drmm_kzalloc(ddev, sizeof(*smu), GFP_KERNEL);
+ if (!smu)
+ return NULL;
+
+ smu->ddev = ddev;
+ memcpy(smu->smu_regs, conf->smu_regs, sizeof(smu->smu_regs));
+
+ return smu;
+}
diff --git a/drivers/accel/amdxdna/npu1_regs.c b/drivers/accel/amdxdna/npu1_regs.c
index 2ea7568a2e99..a83e44f378ad 100644
--- a/drivers/accel/amdxdna/npu1_regs.c
+++ b/drivers/accel/amdxdna/npu1_regs.c
@@ -71,6 +71,27 @@ static const struct amdxdna_fw_feature_tbl npu1_fw_feature_table[] = {
{ 0 }
};
+static int npu1_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
+{
+ u32 npuclk, hclk;
+ int ret;
+
+ npuclk = ndev->priv->dpm_clk_tbl[dpm_level].npuclk;
+ hclk = ndev->priv->dpm_clk_tbl[dpm_level].hclk;
+ ret = aie_smu_set_clocks(ndev->aie.smu_hdl, &npuclk, &hclk);
+ if (ret)
+ return ret;
+
+ ndev->npuclk_freq = npuclk;
+ ndev->hclk_freq = hclk;
+ ndev->max_tops = 2 * ndev->total_col;
+ ndev->curr_tops = ndev->max_tops * hclk / 1028;
+
+ XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
+ ndev->npuclk_freq, ndev->hclk_freq);
+ return 0;
+}
+
static const struct amdxdna_dev_priv npu1_dev_priv = {
.fw_path = "amdnpu/1502_00/",
.rt_config = npu1_default_rt_cfg,
diff --git a/drivers/accel/amdxdna/npu4_regs.c b/drivers/accel/amdxdna/npu4_regs.c
index 9689c56c83be..5d68171f4ec2 100644
--- a/drivers/accel/amdxdna/npu4_regs.c
+++ b/drivers/accel/amdxdna/npu4_regs.c
@@ -63,6 +63,13 @@
#define NPU4_SMU_BAR_BASE MMNPU_APERTURE4_BASE
#define NPU4_SRAM_BAR_BASE MMNPU_APERTURE1_BASE
+#define NPU4_DPM_TOPS(ndev, dpm_level) \
+({ \
+ typeof(ndev) _ndev = ndev; \
+ (4096 * (_ndev)->total_col * \
+ (_ndev)->priv->dpm_clk_tbl[dpm_level].hclk / 1000000); \
+})
+
const struct rt_config npu4_default_rt_cfg[] = {
{ 5, 1, AIE2_RT_CFG_INIT }, /* PDI APP LOAD MODE */
{ 10, 1, AIE2_RT_CFG_INIT }, /* DEBUG BUF */
@@ -98,6 +105,25 @@ const struct amdxdna_fw_feature_tbl npu4_fw_feature_table[] = {
{ 0 }
};
+int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level)
+{
+ int ret;
+
+ ret = aie_smu_set_dpm(ndev->aie.smu_hdl, dpm_level);
+ if (ret)
+ return ret;
+
+ ndev->npuclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].npuclk;
+ ndev->hclk_freq = ndev->priv->dpm_clk_tbl[dpm_level].hclk;
+ ndev->max_tops = NPU4_DPM_TOPS(ndev, ndev->max_dpm_level);
+ ndev->curr_tops = NPU4_DPM_TOPS(ndev, dpm_level);
+
+ XDNA_DBG(ndev->aie.xdna, "MP-NPU clock %d, H clock %d\n",
+ ndev->npuclk_freq, ndev->hclk_freq);
+
+ return 0;
+}
+
static const struct amdxdna_dev_priv npu4_dev_priv = {
.fw_path = "amdnpu/17f0_10/",
.rt_config = npu4_default_rt_cfg,
--
2.34.1
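
The command handshake in aie_smu_exec() above (clear response, write argument and command, toggle the interrupt register, poll the response) can be sketched against a fake register file. This is a user-space illustration only: struct fake_smu, fake_smu_service(), the echo behaviour of the fake device, the poll-count bound and the SMU_ERR_* codes are all assumptions made for the sketch, standing in for MMIO, real SMU firmware, readx_poll_timeout() and kernel errnos.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum smu_reg_idx {
	SMU_CMD_REG = 0,
	SMU_ARG_REG,
	SMU_INTR_REG,
	SMU_RESP_REG,
	SMU_OUT_REG,
	SMU_MAX_REGS
};

#define SMU_RESULT_OK    1
#define SMU_ERR_TIMEOUT  (-1) /* hypothetical, stands in for the poll timeout */
#define SMU_ERR_FAILED   (-2) /* hypothetical, stands in for -EINVAL */

/* Fake register file replacing the __iomem pointers of the driver. */
struct fake_smu {
	volatile uint32_t regs[SMU_MAX_REGS];
};

/* Hypothetical device model: once the doorbell is set, echo ARG to OUT. */
static void fake_smu_service(struct fake_smu *smu)
{
	if (smu->regs[SMU_INTR_REG] != 1)
		return;
	smu->regs[SMU_OUT_REG] = smu->regs[SMU_ARG_REG];
	smu->regs[SMU_RESP_REG] = SMU_RESULT_OK;
}

static int smu_exec(struct fake_smu *smu, uint32_t cmd, uint32_t arg,
		    uint32_t *out, int max_polls)
{
	uint32_t resp = 0;
	int i;

	assert(smu);

	/* Same write order as the driver: response, argument, command. */
	smu->regs[SMU_RESP_REG] = 0;
	smu->regs[SMU_ARG_REG] = arg;
	smu->regs[SMU_CMD_REG] = cmd;

	/* Clear and set the interrupt register to kick off the command. */
	smu->regs[SMU_INTR_REG] = 0;
	smu->regs[SMU_INTR_REG] = 1;

	/* Bounded poll until the response register becomes non-zero. */
	for (i = 0; i < max_polls; i++) {
		fake_smu_service(smu);
		resp = smu->regs[SMU_RESP_REG];
		if (resp)
			break;
	}
	if (!resp)
		return SMU_ERR_TIMEOUT;

	if (out)
		*out = smu->regs[SMU_OUT_REG];

	return resp == SMU_RESULT_OK ? 0 : SMU_ERR_FAILED;
}
```

As in the patch, the output register is read before the response is checked, so a caller that passes out may see a value even when the command is reported as failed.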
* [PATCH V1 6/6] accel/amdxdna: Add AIE4 power on and off support
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
` (4 preceding siblings ...)
2026-03-30 16:37 ` [PATCH V1 5/6] accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4 Lizhi Hou
@ 2026-03-30 16:37 ` Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-31 7:05 ` Claude review: accel/amdxdna: Initial support for AIE4 platform Claude Code Review Bot
6 siblings, 1 reply; 17+ messages in thread
From: Lizhi Hou @ 2026-03-30 16:37 UTC (permalink / raw)
To: ogabbay, quic_jhugo, dri-devel, mario.limonciello,
maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue, Lizhi Hou
From: David Zhang <yidong.zhang@amd.com>
Implement AIE4 power on and off control using the common SMU interfaces.
Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
Signed-off-by: David Zhang <yidong.zhang@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
drivers/accel/amdxdna/aie4_pci.c | 28 +++++++++++++++++++++++++++-
drivers/accel/amdxdna/aie4_pci.h | 1 +
drivers/accel/amdxdna/npu3_regs.c | 17 ++++++++++++++++-
3 files changed, 44 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
index e7993b315996..2249b2c9398d 100644
--- a/drivers/accel/amdxdna/aie4_pci.c
+++ b/drivers/accel/amdxdna/aie4_pci.c
@@ -212,11 +212,26 @@ static int aie4_mailbox_init(struct amdxdna_dev *xdna)
static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
{
aie_psp_stop(ndev->aie.psp_hdl);
+ aie_smu_fini(ndev->aie.smu_hdl);
}
static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
{
- return aie_psp_start(ndev->aie.psp_hdl);
+ int ret;
+
+ ret = aie_smu_init(ndev->aie.smu_hdl);
+ if (ret) {
+ XDNA_ERR(ndev->aie.xdna, "failed to init smu, ret %d", ret);
+ return ret;
+ }
+
+ ret = aie_psp_start(ndev->aie.psp_hdl);
+ if (ret) {
+ XDNA_ERR(ndev->aie.xdna, "failed to start psp, ret %d", ret);
+ aie_smu_fini(ndev->aie.smu_hdl);
+ }
+
+ return ret;
}
static int aie4_hw_start(struct amdxdna_dev *xdna)
@@ -331,6 +346,7 @@ static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
{
struct amdxdna_dev *xdna = ndev->aie.xdna;
struct psp_config psp_conf;
+ struct smu_config smu_conf;
int i;
psp_conf.fw_size = npufw->size;
@@ -347,6 +363,14 @@ static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
return -ENOMEM;
}
+ for (i = 0; i < SMU_MAX_REGS; i++)
+ smu_conf.smu_regs[i] = tbl[SMU_REG_BAR(ndev, i)] + SMU_REG_OFF(ndev, i);
+ ndev->aie.smu_hdl = aiem_smu_create(&xdna->ddev, &smu_conf);
+ if (!ndev->aie.smu_hdl) {
+ XDNA_ERR(xdna, "failed to create smu");
+ return -ENOMEM;
+ }
+
return 0;
}
@@ -374,6 +398,8 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
for (i = 0; i < PSP_MAX_REGS; i++)
set_bit(PSP_REG_BAR(ndev, i), &bars);
+ for (i = 0; i < SMU_MAX_REGS; i++)
+ set_bit(SMU_REG_BAR(ndev, i), &bars);
set_bit(xdna->dev_info->mbox_bar, &bars);
set_bit(xdna->dev_info->sram_bar, &bars);
diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
index ee388ccf7196..aa1495c3370b 100644
--- a/drivers/accel/amdxdna/aie4_pci.h
+++ b/drivers/accel/amdxdna/aie4_pci.h
@@ -21,6 +21,7 @@ struct amdxdna_dev_priv {
u64 mbox_info_off;
struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
+ struct aie_bar_off_pair smu_regs_off[SMU_MAX_REGS];
};
struct amdxdna_dev_hdl {
diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
index fb2bd60b8f00..5a0bbc916094 100644
--- a/drivers/accel/amdxdna/npu3_regs.c
+++ b/drivers/accel/amdxdna/npu3_regs.c
@@ -17,15 +17,23 @@
/* PCIe BAR Index for NPU3 */
#define NPU3_REG_BAR_INDEX 0
#define NPU3_PSP_BAR_INDEX 4
+#define NPU3_SMU_BAR_INDEX 5
#define MMNPU_APERTURE3_BASE 0x3810000
+#define MMNPU_APERTURE4_BASE 0x3B10000
+
#define NPU3_PSP_BAR_BASE MMNPU_APERTURE3_BASE
+#define NPU3_SMU_BAR_BASE MMNPU_APERTURE4_BASE
#define MPASP_C2PMSG_123_ALT_1 0x3810AEC
#define MPASP_C2PMSG_156_ALT_1 0x3810B70
#define MPASP_C2PMSG_157_ALT_1 0x3810B74
#define MPASP_C2PMSG_73_ALT_1 0x3810A24
+#define MP1_C2PMSG_59_ALT_1 0x3B109EC
+#define MP1_C2PMSG_61_ALT_1 0x3B109F4
+#define MP1_C2PMSG_60_ALT_1 0x3B109F0
+
static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
{ .major = 5, .min_minor = 10 },
{ 0 }
@@ -47,13 +55,20 @@ static const struct amdxdna_dev_priv npu3_dev_priv = {
DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
/* npu3 doesn't use 8th pwaitmode register */
},
-
+ .smu_regs_off = {
+ DEFINE_BAR_OFFSET(SMU_CMD_REG, NPU3_SMU, MP1_C2PMSG_59_ALT_1),
+ DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU3_SMU, MP1_C2PMSG_61_ALT_1),
+ DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU3_SMU, MMNPU_APERTURE4_BASE),
+ DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU3_SMU, MP1_C2PMSG_60_ALT_1),
+ DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU3_SMU, MP1_C2PMSG_61_ALT_1),
+ },
};
const struct amdxdna_dev_info dev_npu3_pf_info = {
.mbox_bar = NPU3_MBOX_BAR,
.sram_bar = NPU3_MBOX_BUFFER_BAR,
.psp_bar = NPU3_PSP_BAR_INDEX,
+ .smu_bar = NPU3_SMU_BAR_INDEX,
.vbnv = "RyzenAI-npu3-pf",
.device_type = AMDXDNA_DEV_TYPE_PF,
.dev_priv = &npu3_dev_priv,
--
2.34.1
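
How the smu_regs_off table above turns into register addresses can be sketched as follows. The macro, enum and register constants are copied from the patches; the field types of struct aie_bar_off_pair, the helper smu_reg_addr() and the use of plain byte pointers instead of void __iomem * are assumptions made so the sketch builds in user space.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the series only shows the offset member explicitly. */
struct aie_bar_off_pair {
	int bar_idx;
	uint32_t offset;
};

enum smu_reg_idx {
	SMU_CMD_REG = 0,
	SMU_ARG_REG,
	SMU_INTR_REG,
	SMU_RESP_REG,
	SMU_OUT_REG,
	SMU_MAX_REGS
};

/* Token pasting selects <bar>_BAR_INDEX and <bar>_BAR_BASE for the entry. */
#define DEFINE_BAR_OFFSET(reg_name, bar, reg_addr) \
	[reg_name] = {bar##_BAR_INDEX, (reg_addr) - bar##_BAR_BASE}

#define NPU3_SMU_BAR_INDEX	5
#define MMNPU_APERTURE4_BASE	0x3B10000
#define NPU3_SMU_BAR_BASE	MMNPU_APERTURE4_BASE

#define MP1_C2PMSG_59_ALT_1	0x3B109EC
#define MP1_C2PMSG_61_ALT_1	0x3B109F4
#define MP1_C2PMSG_60_ALT_1	0x3B109F0

static const struct aie_bar_off_pair smu_regs_off[SMU_MAX_REGS] = {
	DEFINE_BAR_OFFSET(SMU_CMD_REG,  NPU3_SMU, MP1_C2PMSG_59_ALT_1),
	DEFINE_BAR_OFFSET(SMU_ARG_REG,  NPU3_SMU, MP1_C2PMSG_61_ALT_1),
	DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU3_SMU, MMNPU_APERTURE4_BASE),
	DEFINE_BAR_OFFSET(SMU_RESP_REG, NPU3_SMU, MP1_C2PMSG_60_ALT_1),
	DEFINE_BAR_OFFSET(SMU_OUT_REG,  NPU3_SMU, MP1_C2PMSG_61_ALT_1),
};

/*
 * Hypothetical helper: mapped BAR base plus offset, mirroring
 * tbl[SMU_REG_BAR(ndev, i)] + SMU_REG_OFF(ndev, i) in the driver.
 */
static uint8_t *smu_reg_addr(uint8_t *const tbl[], enum smu_reg_idx idx)
{
	assert(idx < SMU_MAX_REGS);
	return tbl[smu_regs_off[idx].bar_idx] + smu_regs_off[idx].offset;
}
```

With these constants the command register resolves to offset 0x9EC in BAR 5, and the interrupt register, defined against the aperture base itself, resolves to offset 0.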
* Re: [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
@ 2026-03-30 20:17 ` Mario Limonciello
2026-03-30 20:30 ` yidong Zhang
2026-03-31 2:45 ` Mario Limonciello
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2 siblings, 1 reply; 17+ messages in thread
From: Mario Limonciello @ 2026-03-30 20:17 UTC (permalink / raw)
To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue
On 3/30/26 11:37, Lizhi Hou wrote:
> From: David Zhang <yidong.zhang@amd.com>
>
> Add support for loading AIE4 firmware through the common PSP
> interfaces.
>
> Compared to AIE2, AIE4 introduces an additional CERT firmware image.
> aiem_psp_create() performs CERT setup when the CERT image size is
> non-zero.
>
> Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
> Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
> Signed-off-by: David Zhang <yidong.zhang@amd.com>
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> ---
> drivers/accel/amdxdna/aie.h | 4 +
> drivers/accel/amdxdna/aie2_pci.c | 2 +
> drivers/accel/amdxdna/aie4_pci.c | 109 ++++++++++++++++++++++-
> drivers/accel/amdxdna/aie4_pci.h | 4 +
> drivers/accel/amdxdna/aie_psp.c | 141 +++++++++++++++++++++++-------
> drivers/accel/amdxdna/npu3_regs.c | 23 +++++
> 6 files changed, 247 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
> index 124c0f7e9ca0..423ed34af9ee 100644
> --- a/drivers/accel/amdxdna/aie.h
> +++ b/drivers/accel/amdxdna/aie.h
> @@ -57,7 +57,11 @@ struct aie_bar_off_pair {
> struct psp_config {
> const void *fw_buf;
> u32 fw_size;
> + const void *certfw_buf;
> + u32 certfw_size;
> void __iomem *psp_regs[PSP_MAX_REGS];
> + u32 arg2_mask;
> + u32 notify_val;
> };
>
> /* aie.c */
> diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
> index e4b7893bd429..0489e668cd73 100644
> --- a/drivers/accel/amdxdna/aie2_pci.c
> +++ b/drivers/accel/amdxdna/aie2_pci.c
> @@ -549,6 +549,8 @@ static int aie2_init(struct amdxdna_dev *xdna)
>
> psp_conf.fw_size = fw->size;
> psp_conf.fw_buf = fw->data;
> + psp_conf.arg2_mask = GENMASK(23, 0);
> + psp_conf.notify_val = 1;
> for (i = 0; i < PSP_MAX_REGS; i++)
> psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
> ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
> diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
> index 0f360c1ccebd..e7993b315996 100644
> --- a/drivers/accel/amdxdna/aie4_pci.c
> +++ b/drivers/accel/amdxdna/aie4_pci.c
> @@ -6,11 +6,15 @@
> #include <drm/amdxdna_accel.h>
> #include <drm/drm_managed.h>
> #include <drm/drm_print.h>
> +#include <linux/firmware.h>
> +#include <linux/sizes.h>
>
> #include "aie4_pci.h"
> #include "amdxdna_pci_drv.h"
>
> -#define NO_IOHUB 0
> +#define NO_IOHUB 0
> +#define CERTFW_MAX_SIZE (SZ_32K + SZ_256)
> +#define PSP_NOTIFY_INTR 0xD007BE11
>
> /*
> * The management mailbox channel is allocated by firmware.
> @@ -207,13 +211,12 @@ static int aie4_mailbox_init(struct amdxdna_dev *xdna)
>
> static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
> {
> - /* TODO */
> + aie_psp_stop(ndev->aie.psp_hdl);
> }
>
> static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
> {
> - /* TODO */
> - return 0;
> + return aie_psp_start(ndev->aie.psp_hdl);
> }
>
> static int aie4_hw_start(struct amdxdna_dev *xdna)
> @@ -261,11 +264,98 @@ static void aie4_hw_stop(struct amdxdna_dev *xdna)
> aie4_fw_unload(ndev);
> }
>
> +static int aie4_request_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware **npufw,
> + const struct firmware **certfw)
> +{
> + struct amdxdna_dev *xdna = ndev->aie.xdna;
> + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
> + char fw_name[128];
> + int ret;
> +
> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
> + pdev->device, pdev->revision, ndev->priv->npufw_path);
> + if (ret >= sizeof(fw_name)) {
> + XDNA_ERR(xdna, "npu firmware path is truncated");
> + return -EINVAL;
> + }
> +
> + ret = request_firmware(npufw, fw_name, &pdev->dev);
> + if (ret) {
> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
> + return ret;
> + }
> +
> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
> + pdev->device, pdev->revision, ndev->priv->certfw_path);
> + if (ret >= sizeof(fw_name)) {
> + XDNA_ERR(xdna, "cert firmware path is truncated");
> + ret = -EINVAL;
> + goto release_npufw;
> + }
> +
> + ret = request_firmware(certfw, fw_name, &pdev->dev);
> + if (ret) {
> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
> + goto release_npufw;
> + }
> +
> + if ((*certfw)->size > CERTFW_MAX_SIZE) {
> + XDNA_ERR(xdna, "CERTFW over maximum size of 32 KB + 256 B");
> + ret = -EINVAL;
> + goto release_certfw;
> + }
> +
> + return 0;
> +
> +release_certfw:
> + release_firmware(*certfw);
> +release_npufw:
> + release_firmware(*npufw);
> +
> + return ret;
> +}
> +
> +static void aie4_release_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware *npufw,
> + const struct firmware *certfw)
> +{
> + release_firmware(certfw);
> + release_firmware(npufw);
> +}
> +
> +static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware *npufw,
> + const struct firmware *certfw,
> + void __iomem *tbl[PCI_NUM_RESOURCES])
> +{
> + struct amdxdna_dev *xdna = ndev->aie.xdna;
> + struct psp_config psp_conf;
> + int i;
> +
> + psp_conf.fw_size = npufw->size;
> + psp_conf.fw_buf = npufw->data;
> + psp_conf.certfw_size = certfw->size;
> + psp_conf.certfw_buf = certfw->data;
> + psp_conf.arg2_mask = ~0;
> + psp_conf.notify_val = PSP_NOTIFY_INTR;
> + for (i = 0; i < PSP_MAX_REGS; i++)
> + psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
> + ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
> + if (!ndev->aie.psp_hdl) {
> + XDNA_ERR(xdna, "failed to create psp");
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
> {
> struct amdxdna_dev *xdna = ndev->aie.xdna;
> struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
> void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
> + const struct firmware *npufw, *certfw;
> unsigned long bars = 0;
> int ret, i;
>
> @@ -282,6 +372,8 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
> return ret;
> }
>
> + for (i = 0; i < PSP_MAX_REGS; i++)
> + set_bit(PSP_REG_BAR(ndev, i), &bars);
> set_bit(xdna->dev_info->mbox_bar, &bars);
> set_bit(xdna->dev_info->sram_bar, &bars);
>
> @@ -300,6 +392,15 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
>
> pci_set_master(pdev);
>
> + ret = aie4_request_firmware(ndev, &npufw, &certfw);
> + if (ret)
> + goto clear_master;
> +
> + ret = aie4_prepare_firmware(ndev, npufw, certfw, tbl);
> + aie4_release_firmware(ndev, npufw, certfw);
> + if (ret)
> + goto clear_master;
> +
> ret = aie4_irq_init(xdna);
> if (ret)
> goto clear_master;
> diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
> index f3810a969431..ee388ccf7196 100644
> --- a/drivers/accel/amdxdna/aie4_pci.h
> +++ b/drivers/accel/amdxdna/aie4_pci.h
> @@ -14,9 +14,13 @@
> #include "amdxdna_mailbox.h"
>
> struct amdxdna_dev_priv {
> + const char *npufw_path;
> + const char *certfw_path;
> u32 mbox_bar;
> u32 mbox_rbuf_bar;
> u64 mbox_info_off;
> +
> + struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
> };
>
> struct amdxdna_dev_hdl {
> diff --git a/drivers/accel/amdxdna/aie_psp.c b/drivers/accel/amdxdna/aie_psp.c
> index 8743b812a449..458dca7cc5a0 100644
> --- a/drivers/accel/amdxdna/aie_psp.c
> +++ b/drivers/accel/amdxdna/aie_psp.c
> @@ -18,6 +18,7 @@
> #define PSP_VALIDATE 1
> #define PSP_START 2
> #define PSP_RELEASE_TMR 3
> +#define PSP_VALIDATE_CERT 4
>
> /* PSP special arguments */
> #define PSP_START_COPY_FW 1
> @@ -27,10 +28,20 @@
> #define PSP_ERROR_BAD_STATE 0xFFFF0007
>
> #define PSP_FW_ALIGN 0x10000
> +#define PSP_CFW_ALIGN 0x8000
> #define PSP_POLL_INTERVAL 20000 /* us */
> #define PSP_POLL_TIMEOUT 1000000 /* us */
>
> -#define PSP_REG(p, reg) ((p)->psp_regs[reg])
> +#define PSP_REG(p, reg) ((p)->conf.psp_regs[reg])
> +#define PSP_SET_CMD(psp, reg_vals, cmd, arg0, arg1, arg2) \
> +({ \
> + u32 *_regs = reg_vals; \
> + u32 _cmd = cmd; \
> + _regs[0] = _cmd; \
> + _regs[1] = arg0; \
> + _regs[2] = arg1; \
> + _regs[3] = ((arg2) | ((_cmd) << 24)) & (psp)->conf.arg2_mask; \
> +})
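
To make the effect of arg2_mask concrete, here is a small userspace sketch of the _regs[3] expression above. It is only a model under stated assumptions: psp_arg2() and the two ARG2_MASK_* names are hypothetical, and the mask values are copied from the aie2_pci.c and aie4_pci.c hunks of this patch.

```c
#include <stdint.h>

#define ARG2_MASK_AIE2 0x00FFFFFFu /* GENMASK(23, 0), as set in the aie2_pci.c hunk */
#define ARG2_MASK_AIE4 0xFFFFFFFFu /* ~0, as set in the aie4_pci.c hunk */

/* Hypothetical helper: mirrors the _regs[3] expression in PSP_SET_CMD. */
static uint32_t psp_arg2(uint32_t cmd, uint32_t arg2, uint32_t mask)
{
	/* The command ID is folded into bits 31:24, then the mask decides
	 * whether those bits survive.
	 */
	return (arg2 | (cmd << 24)) & mask;
}
```

With the AIE2 mask the command bits are stripped and arg2 passes through unchanged as long as it fits in 24 bits; with the AIE4 mask the command ID stays in the top byte. By the same expression, an arg2 value that needs more than 24 bits would be cut on AIE2 and would overlap the command ID on AIE4.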
>
> struct psp_device {
> struct drm_device *ddev;
> @@ -38,7 +49,9 @@ struct psp_device {
> u32 fw_buf_sz;
> u64 fw_paddr;
> void *fw_buffer;
> - void __iomem *psp_regs[PSP_MAX_REGS];
> + u32 certfw_buf_sz;
> + u64 certfw_paddr;
> + void *certfw_buffer;
> };
>
> static int psp_exec(struct psp_device *psp, u32 *reg_vals)
> @@ -47,13 +60,22 @@ static int psp_exec(struct psp_device *psp, u32 *reg_vals)
> int ret, i;
> u32 ready;
>
> + /* Check for PSP ready before any write */
> + ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
> + FIELD_GET(PSP_STATUS_READY, ready),
> + PSP_POLL_INTERVAL, PSP_POLL_TIMEOUT);
> + if (ret) {
> + drm_err(psp->ddev, "PSP is not ready, ret 0x%x", ret);
> + return ret;
> + }
> +
> /* Write command and argument registers */
> for (i = 0; i < PSP_NUM_IN_REGS; i++)
> writel(reg_vals[i], PSP_REG(psp, i));
>
> /* clear and set PSP INTR register to kick off */
> writel(0, PSP_REG(psp, PSP_INTR_REG));
> - writel(1, PSP_REG(psp, PSP_INTR_REG));
> + writel(psp->conf.notify_val, PSP_REG(psp, PSP_INTR_REG));
>
> /* PSP should be busy. Wait for ready, so we know task is done. */
> ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
> @@ -90,69 +112,124 @@ int aie_psp_waitmode_poll(struct psp_device *psp)
>
> void aie_psp_stop(struct psp_device *psp)
> {
> - u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, };
> + u32 reg_vals[PSP_NUM_IN_REGS];
> int ret;
>
> + PSP_SET_CMD(psp, reg_vals, PSP_RELEASE_TMR, 0, 0, 0);
> +
> ret = psp_exec(psp, reg_vals);
> if (ret)
> drm_err(psp->ddev, "release tmr failed, ret %d", ret);
> }
>
> -int aie_psp_start(struct psp_device *psp)
> +static int psp_validate_fw(struct psp_device *psp, u8 cmd, u64 paddr, u32 buf_sz)
> {
> u32 reg_vals[PSP_NUM_IN_REGS];
> int ret;
>
> - reg_vals[0] = PSP_VALIDATE;
> - reg_vals[1] = lower_32_bits(psp->fw_paddr);
> - reg_vals[2] = upper_32_bits(psp->fw_paddr);
> - reg_vals[3] = psp->fw_buf_sz;
> + PSP_SET_CMD(psp, reg_vals, cmd, lower_32_bits(paddr),
> + upper_32_bits(paddr), buf_sz);
>
> ret = psp_exec(psp, reg_vals);
> - if (ret) {
> + if (ret)
> drm_err(psp->ddev, "failed to validate fw, ret %d", ret);
> - return ret;
> - }
>
> - memset(reg_vals, 0, sizeof(reg_vals));
> - reg_vals[0] = PSP_START;
> - reg_vals[1] = PSP_START_COPY_FW;
> + return ret;
> +}
> +
> +static int psp_start(struct psp_device *psp)
> +{
> + u32 reg_vals[PSP_NUM_IN_REGS];
> + int ret;
> +
> + PSP_SET_CMD(psp, reg_vals, PSP_START, PSP_START_COPY_FW, 0, 0);
> +
> ret = psp_exec(psp, reg_vals);
> - if (ret) {
> + if (ret)
> drm_err(psp->ddev, "failed to start fw, ret %d", ret);
> +
> + return ret;
> +}
> +
> +int aie_psp_start(struct psp_device *psp)
> +{
> + int ret;
> +
> + ret = psp_validate_fw(psp, PSP_VALIDATE,
> + psp->fw_paddr, psp->fw_buf_sz);
> + if (ret)
> return ret;
> - }
>
> - return 0;
> + if (!psp->certfw_buf_sz)
> + goto psp_start;
> +
> + ret = psp_validate_fw(psp, PSP_VALIDATE_CERT,
> + psp->certfw_paddr, psp->certfw_buf_sz);
> + if (ret)
> + return ret;
> +psp_start:
> + return psp_start(psp);
> +}
> +
> +/*
> + * PSP requires host physical address to load firmware.
> + * Allocate a buffer, obtain its physical address, align, and copy data in.
> + */
> +static void *psp_alloc_fw_buf(struct psp_device *psp, const void *fw_data,
> + u32 fw_size, u32 align, u32 *buf_sz,
> + u64 *paddr)
> +{
> + u32 alloc_sz;
> + void *buffer;
> + u64 offset;
> +
> + *buf_sz = ALIGN(fw_size, align);
> + alloc_sz = *buf_sz + align;
> +
> + buffer = drmm_kmalloc(psp->ddev, alloc_sz, GFP_KERNEL);
> + if (!buffer)
> + return NULL;
> +
> + *paddr = virt_to_phys(buffer);
> + offset = ALIGN(*paddr, align) - *paddr;
> + *paddr += offset;
> + memcpy(buffer + offset, fw_data, fw_size);
> +
> + return buffer;
> }
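
The over-allocation in psp_alloc_fw_buf() can be checked with a little arithmetic. The sketch below is a userspace model only, assuming a power-of-two alignment; fw_layout, fw_layout_calc() and ALIGN_UP() are hypothetical names, and the base address stands in for what virt_to_phys() would return.

```c
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((uint64_t)(x) + ((uint64_t)(a) - 1)) & ~((uint64_t)(a) - 1))

/* Hypothetical struct collecting the values the helper computes. */
struct fw_layout {
	uint32_t buf_sz;   /* aligned size handed to the PSP */
	uint32_t alloc_sz; /* bytes actually allocated */
	uint64_t offset;   /* bytes skipped to reach the aligned address */
	uint64_t paddr;    /* aligned address of the firmware copy */
};

static struct fw_layout fw_layout_calc(uint64_t base, uint32_t fw_size,
				       uint32_t align)
{
	struct fw_layout l;

	l.buf_sz = (uint32_t)ALIGN_UP(fw_size, align);
	/* One extra 'align' worth of slack covers the worst-case offset. */
	l.alloc_sz = l.buf_sz + align;
	l.offset = ALIGN_UP(base, align) - base;
	l.paddr = base + l.offset;

	return l;
}
```

Since offset is always smaller than align, offset + buf_sz never exceeds alloc_sz, so the aligned copy stays inside the allocation.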
>
> struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf)
> {
> struct psp_device *psp;
> - u64 offset;
>
> psp = drmm_kzalloc(ddev, sizeof(*psp), GFP_KERNEL);
> if (!psp)
> return NULL;
>
> psp->ddev = ddev;
> - memcpy(psp->psp_regs, conf->psp_regs, sizeof(psp->psp_regs));
> + psp->fw_buffer = psp_alloc_fw_buf(psp, conf->fw_buf, conf->fw_size,
> + PSP_FW_ALIGN, &psp->fw_buf_sz,
> + &psp->fw_paddr);
> + if (!psp->fw_buffer)
> + return NULL;
> +
> + if (!conf->certfw_size) {
> + drm_dbg(ddev, "no cert fw");
> + goto done;
> + }
>
> - psp->fw_buf_sz = ALIGN(conf->fw_size, PSP_FW_ALIGN);
> - psp->fw_buffer = drmm_kmalloc(ddev, psp->fw_buf_sz + PSP_FW_ALIGN, GFP_KERNEL);
> - if (!psp->fw_buffer) {
> - drm_err(ddev, "no memory for fw buffer");
> + /* CERT firmware */
> + psp->certfw_buffer = psp_alloc_fw_buf(psp, conf->certfw_buf,
> + conf->certfw_size, PSP_CFW_ALIGN,
> + &psp->certfw_buf_sz,
> + &psp->certfw_paddr);
> + if (!psp->certfw_buffer) {
> + drm_err(ddev, "no memory for cert fw buffer");
> return NULL;
> }
>
> - /*
> - * AMD Platform Security Processor(PSP) requires host physical
> - * address to load NPU firmware.
> - */
> - psp->fw_paddr = virt_to_phys(psp->fw_buffer);
> - offset = ALIGN(psp->fw_paddr, PSP_FW_ALIGN) - psp->fw_paddr;
> - psp->fw_paddr += offset;
> - memcpy(psp->fw_buffer + offset, conf->fw_buf, conf->fw_size);
> +done:
> + memcpy(&psp->conf, conf, sizeof(psp->conf));
>
> return psp;
> }
> diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
> index f6e20f4858db..fb2bd60b8f00 100644
> --- a/drivers/accel/amdxdna/npu3_regs.c
> +++ b/drivers/accel/amdxdna/npu3_regs.c
> @@ -16,6 +16,15 @@
>
> /* PCIe BAR Index for NPU3 */
> #define NPU3_REG_BAR_INDEX 0
> +#define NPU3_PSP_BAR_INDEX 4
> +
> +#define MMNPU_APERTURE3_BASE 0x3810000
> +#define NPU3_PSP_BAR_BASE MMNPU_APERTURE3_BASE
> +
> +#define MPASP_C2PMSG_123_ALT_1 0x3810AEC
> +#define MPASP_C2PMSG_156_ALT_1 0x3810B70
> +#define MPASP_C2PMSG_157_ALT_1 0x3810B74
> +#define MPASP_C2PMSG_73_ALT_1 0x3810A24
>
> static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
> { .major = 5, .min_minor = 10 },
> @@ -23,14 +32,28 @@ static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
> };
>
> static const struct amdxdna_dev_priv npu3_dev_priv = {
> + .npufw_path = "npu.dev.sbin",
> + .certfw_path = "cert.dev.sbin",
> .mbox_bar = NPU3_MBOX_BAR,
> .mbox_rbuf_bar = NPU3_MBOX_BUFFER_BAR,
> .mbox_info_off = NPU3_MBOX_INFO_OFF,
> + .psp_regs_off = {
> + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU3_PSP, MPASP_C2PMSG_157_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU3_PSP, MPASP_C2PMSG_73_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
> + /* npu3 doesn't use 8th pwaitmode register */
> + },
> +
Spurious whitespace here that you ping pong in the later patches.
> };
>
> const struct amdxdna_dev_info dev_npu3_pf_info = {
> .mbox_bar = NPU3_MBOX_BAR,
> .sram_bar = NPU3_MBOX_BUFFER_BAR,
> + .psp_bar = NPU3_PSP_BAR_INDEX,
> .vbnv = "RyzenAI-npu3-pf",
> .device_type = AMDXDNA_DEV_TYPE_PF,
> .dev_priv = &npu3_dev_priv,
* Re: [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading
2026-03-30 20:17 ` Mario Limonciello
@ 2026-03-30 20:30 ` yidong Zhang
0 siblings, 0 replies; 17+ messages in thread
From: yidong Zhang @ 2026-03-30 20:30 UTC (permalink / raw)
To: Mario Limonciello, Lizhi Hou, ogabbay, quic_jhugo, dri-devel,
maciej.falkowski
Cc: linux-kernel, max.zhen, sonal.santan, Hayden Laccabue
On 3/30/26 13:17, Mario Limonciello wrote:
>
>
> On 3/30/26 11:37, Lizhi Hou wrote:
>> From: David Zhang <yidong.zhang@amd.com>
>>
>> Add support for loading AIE4 firmware through the common PSP
>> interfaces.
>>
>> Compared to AIE2, AIE4 introduces an additional CERT firmware image.
>> aiem_psp_create() performs CERT setup when the CERT image size is
>> non-zero.
>>
>> Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
>> Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
>> Signed-off-by: David Zhang <yidong.zhang@amd.com>
>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
>> ---
>> drivers/accel/amdxdna/aie.h | 4 +
>> drivers/accel/amdxdna/aie2_pci.c | 2 +
>> drivers/accel/amdxdna/aie4_pci.c | 109 ++++++++++++++++++++++-
>> drivers/accel/amdxdna/aie4_pci.h | 4 +
>> drivers/accel/amdxdna/aie_psp.c | 141 +++++++++++++++++++++++-------
>> drivers/accel/amdxdna/npu3_regs.c | 23 +++++
>> 6 files changed, 247 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
>> index 124c0f7e9ca0..423ed34af9ee 100644
>> --- a/drivers/accel/amdxdna/aie.h
>> +++ b/drivers/accel/amdxdna/aie.h
>> @@ -57,7 +57,11 @@ struct aie_bar_off_pair {
>> struct psp_config {
>> const void *fw_buf;
>> u32 fw_size;
>> + const void *certfw_buf;
>> + u32 certfw_size;
>> void __iomem *psp_regs[PSP_MAX_REGS];
>> + u32 arg2_mask;
>> + u32 notify_val;
>> };
>> /* aie.c */
>> diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
>> index e4b7893bd429..0489e668cd73 100644
>> --- a/drivers/accel/amdxdna/aie2_pci.c
>> +++ b/drivers/accel/amdxdna/aie2_pci.c
>> @@ -549,6 +549,8 @@ static int aie2_init(struct amdxdna_dev *xdna)
>> psp_conf.fw_size = fw->size;
>> psp_conf.fw_buf = fw->data;
>> + psp_conf.arg2_mask = GENMASK(23, 0);
>> + psp_conf.notify_val = 1;
>> for (i = 0; i < PSP_MAX_REGS; i++)
>> psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
>> ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
>> diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
>> index 0f360c1ccebd..e7993b315996 100644
>> --- a/drivers/accel/amdxdna/aie4_pci.c
>> +++ b/drivers/accel/amdxdna/aie4_pci.c
>> @@ -6,11 +6,15 @@
>> #include <drm/amdxdna_accel.h>
>> #include <drm/drm_managed.h>
>> #include <drm/drm_print.h>
>> +#include <linux/firmware.h>
>> +#include <linux/sizes.h>
>> #include "aie4_pci.h"
>> #include "amdxdna_pci_drv.h"
>> -#define NO_IOHUB 0
>> +#define NO_IOHUB 0
>> +#define CERTFW_MAX_SIZE (SZ_32K + SZ_256)
>> +#define PSP_NOTIFY_INTR 0xD007BE11
>> /*
>> * The management mailbox channel is allocated by firmware.
>> @@ -207,13 +211,12 @@ static int aie4_mailbox_init(struct amdxdna_dev *xdna)
>> static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
>> {
>> - /* TODO */
>> + aie_psp_stop(ndev->aie.psp_hdl);
>> }
>> static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
>> {
>> - /* TODO */
>> - return 0;
>> + return aie_psp_start(ndev->aie.psp_hdl);
>> }
>> static int aie4_hw_start(struct amdxdna_dev *xdna)
>> @@ -261,11 +264,98 @@ static void aie4_hw_stop(struct amdxdna_dev *xdna)
>> aie4_fw_unload(ndev);
>> }
>> +static int aie4_request_firmware(struct amdxdna_dev_hdl *ndev,
>> + const struct firmware **npufw,
>> + const struct firmware **certfw)
>> +{
>> + struct amdxdna_dev *xdna = ndev->aie.xdna;
>> + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
>> + char fw_name[128];
>> + int ret;
>> +
>> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
>> + pdev->device, pdev->revision, ndev->priv->npufw_path);
>> + if (ret >= sizeof(fw_name)) {
>> + XDNA_ERR(xdna, "npu firmware path is truncated");
>> + return -EINVAL;
>> + }
>> +
>> + ret = request_firmware(npufw, fw_name, &pdev->dev);
>> + if (ret) {
>> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
>> + return ret;
>> + }
>> +
>> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
>> + pdev->device, pdev->revision, ndev->priv->certfw_path);
>> + if (ret >= sizeof(fw_name)) {
>> + XDNA_ERR(xdna, "cert firmware path is truncated");
>> + ret = -EINVAL;
>> + goto release_npufw;
>> + }
>> +
>> + ret = request_firmware(certfw, fw_name, &pdev->dev);
>> + if (ret) {
>> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
>> + goto release_npufw;
>> + }
>> +
>> + if ((*certfw)->size > CERTFW_MAX_SIZE) {
>> + XDNA_ERR(xdna, "CERTFW over maximum size of 32 KB + 256 B");
>> + ret = -EINVAL;
>> + goto release_certfw;
>> + }
>> +
>> + return 0;
>> +
>> +release_certfw:
>> + release_firmware(*certfw);
>> +release_npufw:
>> + release_firmware(*npufw);
>> +
>> + return ret;
>> +}
>> +
>> +static void aie4_release_firmware(struct amdxdna_dev_hdl *ndev,
>> + const struct firmware *npufw,
>> + const struct firmware *certfw)
>> +{
>> + release_firmware(certfw);
>> + release_firmware(npufw);
>> +}
>> +
>> +static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
>> + const struct firmware *npufw,
>> + const struct firmware *certfw,
>> + void __iomem *tbl[PCI_NUM_RESOURCES])
>> +{
>> + struct amdxdna_dev *xdna = ndev->aie.xdna;
>> + struct psp_config psp_conf;
>> + int i;
>> +
>> + psp_conf.fw_size = npufw->size;
>> + psp_conf.fw_buf = npufw->data;
>> + psp_conf.certfw_size = certfw->size;
>> + psp_conf.certfw_buf = certfw->data;
>> + psp_conf.arg2_mask = ~0;
>> + psp_conf.notify_val = PSP_NOTIFY_INTR;
>> + for (i = 0; i < PSP_MAX_REGS; i++)
>> + psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
>> + ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
>> + if (!ndev->aie.psp_hdl) {
>> + XDNA_ERR(xdna, "failed to create psp");
>> + return -ENOMEM;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
>> {
>> struct amdxdna_dev *xdna = ndev->aie.xdna;
>> struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
>> void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
>> + const struct firmware *npufw, *certfw;
>> unsigned long bars = 0;
>> int ret, i;
>> @@ -282,6 +372,8 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
>> return ret;
>> }
>> + for (i = 0; i < PSP_MAX_REGS; i++)
>> + set_bit(PSP_REG_BAR(ndev, i), &bars);
>> set_bit(xdna->dev_info->mbox_bar, &bars);
>> set_bit(xdna->dev_info->sram_bar, &bars);
>> @@ -300,6 +392,15 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
>> pci_set_master(pdev);
>> + ret = aie4_request_firmware(ndev, &npufw, &certfw);
>> + if (ret)
>> + goto clear_master;
>> +
>> + ret = aie4_prepare_firmware(ndev, npufw, certfw, tbl);
>> + aie4_release_firmware(ndev, npufw, certfw);
>> + if (ret)
>> + goto clear_master;
>> +
>> ret = aie4_irq_init(xdna);
>> if (ret)
>> goto clear_master;
>> diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
>> index f3810a969431..ee388ccf7196 100644
>> --- a/drivers/accel/amdxdna/aie4_pci.h
>> +++ b/drivers/accel/amdxdna/aie4_pci.h
>> @@ -14,9 +14,13 @@
>> #include "amdxdna_mailbox.h"
>> struct amdxdna_dev_priv {
>> + const char *npufw_path;
>> + const char *certfw_path;
>> u32 mbox_bar;
>> u32 mbox_rbuf_bar;
>> u64 mbox_info_off;
>> +
>> + struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
>> };
>> struct amdxdna_dev_hdl {
>> diff --git a/drivers/accel/amdxdna/aie_psp.c b/drivers/accel/amdxdna/aie_psp.c
>> index 8743b812a449..458dca7cc5a0 100644
>> --- a/drivers/accel/amdxdna/aie_psp.c
>> +++ b/drivers/accel/amdxdna/aie_psp.c
>> @@ -18,6 +18,7 @@
>> #define PSP_VALIDATE 1
>> #define PSP_START 2
>> #define PSP_RELEASE_TMR 3
>> +#define PSP_VALIDATE_CERT 4
>> /* PSP special arguments */
>> #define PSP_START_COPY_FW 1
>> @@ -27,10 +28,20 @@
>> #define PSP_ERROR_BAD_STATE 0xFFFF0007
>> #define PSP_FW_ALIGN 0x10000
>> +#define PSP_CFW_ALIGN 0x8000
>> #define PSP_POLL_INTERVAL 20000 /* us */
>> #define PSP_POLL_TIMEOUT 1000000 /* us */
>> -#define PSP_REG(p, reg) ((p)->psp_regs[reg])
>> +#define PSP_REG(p, reg) ((p)->conf.psp_regs[reg])
>> +#define PSP_SET_CMD(psp, reg_vals, cmd, arg0, arg1, arg2) \
>> +({ \
>> + u32 *_regs = reg_vals; \
>> + u32 _cmd = cmd; \
>> + _regs[0] = _cmd; \
>> + _regs[1] = arg0; \
>> + _regs[2] = arg1; \
>> + _regs[3] = ((arg2) | ((_cmd) << 24)) & (psp)->conf.arg2_mask; \
>> +})
>> struct psp_device {
>> struct drm_device *ddev;
>> @@ -38,7 +49,9 @@ struct psp_device {
>> u32 fw_buf_sz;
>> u64 fw_paddr;
>> void *fw_buffer;
>> - void __iomem *psp_regs[PSP_MAX_REGS];
>> + u32 certfw_buf_sz;
>> + u64 certfw_paddr;
>> + void *certfw_buffer;
>> };
>> static int psp_exec(struct psp_device *psp, u32 *reg_vals)
>> @@ -47,13 +60,22 @@ static int psp_exec(struct psp_device *psp, u32 *reg_vals)
>> int ret, i;
>> u32 ready;
>> + /* Check for PSP ready before any write */
>> + ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
>> + FIELD_GET(PSP_STATUS_READY, ready),
>> + PSP_POLL_INTERVAL, PSP_POLL_TIMEOUT);
>> + if (ret) {
>> + drm_err(psp->ddev, "PSP is not ready, ret 0x%x", ret);
>> + return ret;
>> + }
>> +
>> /* Write command and argument registers */
>> for (i = 0; i < PSP_NUM_IN_REGS; i++)
>> writel(reg_vals[i], PSP_REG(psp, i));
>> /* clear and set PSP INTR register to kick off */
>> writel(0, PSP_REG(psp, PSP_INTR_REG));
>> - writel(1, PSP_REG(psp, PSP_INTR_REG));
>> + writel(psp->conf.notify_val, PSP_REG(psp, PSP_INTR_REG));
>> /* PSP should be busy. Wait for ready, so we know task is done. */
>> ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
>> @@ -90,69 +112,124 @@ int aie_psp_waitmode_poll(struct psp_device *psp)
>> void aie_psp_stop(struct psp_device *psp)
>> {
>> - u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, };
>> + u32 reg_vals[PSP_NUM_IN_REGS];
>> int ret;
>> + PSP_SET_CMD(psp, reg_vals, PSP_RELEASE_TMR, 0, 0, 0);
>> +
>> ret = psp_exec(psp, reg_vals);
>> if (ret)
>> drm_err(psp->ddev, "release tmr failed, ret %d", ret);
>> }
>> -int aie_psp_start(struct psp_device *psp)
>> +static int psp_validate_fw(struct psp_device *psp, u8 cmd, u64 paddr, u32 buf_sz)
>> {
>> u32 reg_vals[PSP_NUM_IN_REGS];
>> int ret;
>> - reg_vals[0] = PSP_VALIDATE;
>> - reg_vals[1] = lower_32_bits(psp->fw_paddr);
>> - reg_vals[2] = upper_32_bits(psp->fw_paddr);
>> - reg_vals[3] = psp->fw_buf_sz;
>> + PSP_SET_CMD(psp, reg_vals, cmd, lower_32_bits(paddr),
>> + upper_32_bits(paddr), buf_sz);
>> ret = psp_exec(psp, reg_vals);
>> - if (ret) {
>> + if (ret)
>> drm_err(psp->ddev, "failed to validate fw, ret %d", ret);
>> - return ret;
>> - }
>> - memset(reg_vals, 0, sizeof(reg_vals));
>> - reg_vals[0] = PSP_START;
>> - reg_vals[1] = PSP_START_COPY_FW;
>> + return ret;
>> +}
>> +
>> +static int psp_start(struct psp_device *psp)
>> +{
>> + u32 reg_vals[PSP_NUM_IN_REGS];
>> + int ret;
>> +
>> + PSP_SET_CMD(psp, reg_vals, PSP_START, PSP_START_COPY_FW, 0, 0);
>> +
>> ret = psp_exec(psp, reg_vals);
>> - if (ret) {
>> + if (ret)
>> drm_err(psp->ddev, "failed to start fw, ret %d", ret);
>> +
>> + return ret;
>> +}
>> +
>> +int aie_psp_start(struct psp_device *psp)
>> +{
>> + int ret;
>> +
>> + ret = psp_validate_fw(psp, PSP_VALIDATE,
>> + psp->fw_paddr, psp->fw_buf_sz);
>> + if (ret)
>> return ret;
>> - }
>> - return 0;
>> + if (!psp->certfw_buf_sz)
>> + goto psp_start;
>> +
>> + ret = psp_validate_fw(psp, PSP_VALIDATE_CERT,
>> + psp->certfw_paddr, psp->certfw_buf_sz);
>> + if (ret)
>> + return ret;
>> +psp_start:
>> + return psp_start(psp);
>> +}
>> +
>> +/*
>> + * PSP requires host physical address to load firmware.
>> + * Allocate a buffer, obtain its physical address, align, and copy data in.
>> + */
>> +static void *psp_alloc_fw_buf(struct psp_device *psp, const void *fw_data,
>> + u32 fw_size, u32 align, u32 *buf_sz,
>> + u64 *paddr)
>> +{
>> + u32 alloc_sz;
>> + void *buffer;
>> + u64 offset;
>> +
>> + *buf_sz = ALIGN(fw_size, align);
>> + alloc_sz = *buf_sz + align;
>> +
>> + buffer = drmm_kmalloc(psp->ddev, alloc_sz, GFP_KERNEL);
>> + if (!buffer)
>> + return NULL;
>> +
>> + *paddr = virt_to_phys(buffer);
>> + offset = ALIGN(*paddr, align) - *paddr;
>> + *paddr += offset;
>> + memcpy(buffer + offset, fw_data, fw_size);
>> +
>> + return buffer;
>> }
>> struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf)
>> {
>> struct psp_device *psp;
>> - u64 offset;
>> psp = drmm_kzalloc(ddev, sizeof(*psp), GFP_KERNEL);
>> if (!psp)
>> return NULL;
>> psp->ddev = ddev;
>> - memcpy(psp->psp_regs, conf->psp_regs, sizeof(psp->psp_regs));
>> + psp->fw_buffer = psp_alloc_fw_buf(psp, conf->fw_buf, conf->fw_size,
>> + PSP_FW_ALIGN, &psp->fw_buf_sz,
>> + &psp->fw_paddr);
>> + if (!psp->fw_buffer)
>> + return NULL;
>> +
>> + if (!conf->certfw_size) {
>> + drm_dbg(ddev, "no cert fw");
>> + goto done;
>> + }
>> - psp->fw_buf_sz = ALIGN(conf->fw_size, PSP_FW_ALIGN);
>> - psp->fw_buffer = drmm_kmalloc(ddev, psp->fw_buf_sz + PSP_FW_ALIGN, GFP_KERNEL);
>> - if (!psp->fw_buffer) {
>> - drm_err(ddev, "no memory for fw buffer");
>> + /* CERT firmware */
>> + psp->certfw_buffer = psp_alloc_fw_buf(psp, conf->certfw_buf,
>> + conf->certfw_size, PSP_CFW_ALIGN,
>> + &psp->certfw_buf_sz,
>> + &psp->certfw_paddr);
>> + if (!psp->certfw_buffer) {
>> + drm_err(ddev, "no memory for cert fw buffer");
>> return NULL;
>> }
>> - /*
>> - * AMD Platform Security Processor(PSP) requires host physical
>> - * address to load NPU firmware.
>> - */
>> - psp->fw_paddr = virt_to_phys(psp->fw_buffer);
>> - offset = ALIGN(psp->fw_paddr, PSP_FW_ALIGN) - psp->fw_paddr;
>> - psp->fw_paddr += offset;
>> - memcpy(psp->fw_buffer + offset, conf->fw_buf, conf->fw_size);
>> +done:
>> + memcpy(&psp->conf, conf, sizeof(psp->conf));
>> return psp;
>> }
>> diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
>> index f6e20f4858db..fb2bd60b8f00 100644
>> --- a/drivers/accel/amdxdna/npu3_regs.c
>> +++ b/drivers/accel/amdxdna/npu3_regs.c
>> @@ -16,6 +16,15 @@
>> /* PCIe BAR Index for NPU3 */
>> #define NPU3_REG_BAR_INDEX 0
>> +#define NPU3_PSP_BAR_INDEX 4
>> +
>> +#define MMNPU_APERTURE3_BASE 0x3810000
>> +#define NPU3_PSP_BAR_BASE MMNPU_APERTURE3_BASE
>> +
>> +#define MPASP_C2PMSG_123_ALT_1 0x3810AEC
>> +#define MPASP_C2PMSG_156_ALT_1 0x3810B70
>> +#define MPASP_C2PMSG_157_ALT_1 0x3810B74
>> +#define MPASP_C2PMSG_73_ALT_1 0x3810A24
>> static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
>> { .major = 5, .min_minor = 10 },
>> @@ -23,14 +32,28 @@ static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
>> };
>> static const struct amdxdna_dev_priv npu3_dev_priv = {
>> + .npufw_path = "npu.dev.sbin",
>> + .certfw_path = "cert.dev.sbin",
>> .mbox_bar = NPU3_MBOX_BAR,
>> .mbox_rbuf_bar = NPU3_MBOX_BUFFER_BAR,
>> .mbox_info_off = NPU3_MBOX_INFO_OFF,
>> + .psp_regs_off = {
>> + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU3_PSP, MPASP_C2PMSG_157_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU3_PSP, MPASP_C2PMSG_73_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
>> + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
>> + /* npu3 doesn't use 8th pwaitmode register */
>> + },
>> +
>
> Spurious whitespace here that you ping pong in the later patches.
Thank you so much! I will fix this.
/David
>
>> };
>> const struct amdxdna_dev_info dev_npu3_pf_info = {
>> .mbox_bar = NPU3_MBOX_BAR,
>> .sram_bar = NPU3_MBOX_BUFFER_BAR,
>> + .psp_bar = NPU3_PSP_BAR_INDEX,
>> .vbnv = "RyzenAI-npu3-pf",
>> .device_type = AMDXDNA_DEV_TYPE_PF,
>> .dev_priv = &npu3_dev_priv,
>
* Re: [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
2026-03-30 20:17 ` Mario Limonciello
@ 2026-03-31 2:45 ` Mario Limonciello
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2 siblings, 0 replies; 17+ messages in thread
From: Mario Limonciello @ 2026-03-31 2:45 UTC (permalink / raw)
To: Lizhi Hou, ogabbay, quic_jhugo, dri-devel, maciej.falkowski
Cc: David Zhang, linux-kernel, max.zhen, sonal.santan,
Hayden Laccabue
On 3/30/26 11:37, Lizhi Hou wrote:
> From: David Zhang <yidong.zhang@amd.com>
>
> Add support for loading AIE4 firmware through the common PSP
> interfaces.
>
> Compared to AIE2, AIE4 introduces an additional CERT firmware image.
> aiem_psp_create() performs CERT setup when the CERT image size is
> non-zero.
>
> Co-developed-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
> Signed-off-by: Hayden Laccabue <Hayden.Laccabue@amd.com>
> Signed-off-by: David Zhang <yidong.zhang@amd.com>
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> ---
> drivers/accel/amdxdna/aie.h | 4 +
> drivers/accel/amdxdna/aie2_pci.c | 2 +
> drivers/accel/amdxdna/aie4_pci.c | 109 ++++++++++++++++++++++-
> drivers/accel/amdxdna/aie4_pci.h | 4 +
> drivers/accel/amdxdna/aie_psp.c | 141 +++++++++++++++++++++++-------
> drivers/accel/amdxdna/npu3_regs.c | 23 +++++
> 6 files changed, 247 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/accel/amdxdna/aie.h b/drivers/accel/amdxdna/aie.h
> index 124c0f7e9ca0..423ed34af9ee 100644
> --- a/drivers/accel/amdxdna/aie.h
> +++ b/drivers/accel/amdxdna/aie.h
> @@ -57,7 +57,11 @@ struct aie_bar_off_pair {
> struct psp_config {
> const void *fw_buf;
> u32 fw_size;
> + const void *certfw_buf;
> + u32 certfw_size;
> void __iomem *psp_regs[PSP_MAX_REGS];
> + u32 arg2_mask;
> + u32 notify_val;
> };
>
> /* aie.c */
> diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
> index e4b7893bd429..0489e668cd73 100644
> --- a/drivers/accel/amdxdna/aie2_pci.c
> +++ b/drivers/accel/amdxdna/aie2_pci.c
> @@ -549,6 +549,8 @@ static int aie2_init(struct amdxdna_dev *xdna)
>
> psp_conf.fw_size = fw->size;
> psp_conf.fw_buf = fw->data;
> + psp_conf.arg2_mask = GENMASK(23, 0);
> + psp_conf.notify_val = 1;
> for (i = 0; i < PSP_MAX_REGS; i++)
> psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
> ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
> diff --git a/drivers/accel/amdxdna/aie4_pci.c b/drivers/accel/amdxdna/aie4_pci.c
> index 0f360c1ccebd..e7993b315996 100644
> --- a/drivers/accel/amdxdna/aie4_pci.c
> +++ b/drivers/accel/amdxdna/aie4_pci.c
> @@ -6,11 +6,15 @@
> #include <drm/amdxdna_accel.h>
> #include <drm/drm_managed.h>
> #include <drm/drm_print.h>
> +#include <linux/firmware.h>
> +#include <linux/sizes.h>
>
> #include "aie4_pci.h"
> #include "amdxdna_pci_drv.h"
>
> -#define NO_IOHUB 0
> +#define NO_IOHUB 0
> +#define CERTFW_MAX_SIZE (SZ_32K + SZ_256)
> +#define PSP_NOTIFY_INTR 0xD007BE11
>
> /*
> * The management mailbox channel is allocated by firmware.
> @@ -207,13 +211,12 @@ static int aie4_mailbox_init(struct amdxdna_dev *xdna)
>
> static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
> {
> - /* TODO */
> + aie_psp_stop(ndev->aie.psp_hdl);
> }
>
> static int aie4_fw_load(struct amdxdna_dev_hdl *ndev)
> {
> - /* TODO */
> - return 0;
> + return aie_psp_start(ndev->aie.psp_hdl);
> }
>
> static int aie4_hw_start(struct amdxdna_dev *xdna)
> @@ -261,11 +264,98 @@ static void aie4_hw_stop(struct amdxdna_dev *xdna)
> aie4_fw_unload(ndev);
> }
>
> +static int aie4_request_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware **npufw,
> + const struct firmware **certfw)
> +{
> + struct amdxdna_dev *xdna = ndev->aie.xdna;
> + struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
> + char fw_name[128];
> + int ret;
> +
> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
> + pdev->device, pdev->revision, ndev->priv->npufw_path);
> + if (ret >= sizeof(fw_name)) {
> + XDNA_ERR(xdna, "npu firmware path is truncated");
> + return -EINVAL;
> + }
> +
> + ret = request_firmware(npufw, fw_name, &pdev->dev);
> + if (ret) {
> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
> + return ret;
> + }
> +
> + ret = snprintf(fw_name, sizeof(fw_name), "amdnpu/%04x_%02x/%s",
> + pdev->device, pdev->revision, ndev->priv->certfw_path);
> + if (ret >= sizeof(fw_name)) {
> + XDNA_ERR(xdna, "cert firmware path is truncated");
> + ret = -EINVAL;
> + goto release_npufw;
> + }
> +
> + ret = request_firmware(certfw, fw_name, &pdev->dev);
> + if (ret) {
> + XDNA_ERR(xdna, "failed to request_firmware %s, ret %d", fw_name, ret);
> + goto release_npufw;
> + }
> +
> + if ((*certfw)->size > CERTFW_MAX_SIZE) {
> + XDNA_ERR(xdna, "CERTFW over maximum size of 32 KB + 256 B");
> + ret = -EINVAL;
> + goto release_certfw;
> + }
Should there be a similar size check for NPU FW? Not sure why it would
only be done for Cert FW.
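
For illustration, a shared check might look roughly like the sketch below. This is a userspace model, not driver code: check_fw_size() and NPUFW_MAX_SIZE are hypothetical names, and the 16 MB bound is only a placeholder, since the real NPU firmware limit is not stated in this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* CERT limit from the patch: SZ_32K + SZ_256 */
#define CERTFW_MAX_SIZE (32 * 1024 + 256)
/* Placeholder only; the actual NPU firmware limit is not given here */
#define NPUFW_MAX_SIZE  (16 * 1024 * 1024)

/* Hypothetical helper: return 0 if the image fits, -EINVAL otherwise. */
static int check_fw_size(const char *name, size_t size, size_t max)
{
	if (size > max) {
		fprintf(stderr, "%s over maximum size of %zu bytes\n", name, max);
		return -EINVAL;
	}

	return 0;
}
```

In the driver this would presumably be called once per image right after each request_firmware(), reusing the existing release_npufw/release_certfw unwind labels.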
> +
> + return 0;
> +
> +release_certfw:
> + release_firmware(*certfw);
> +release_npufw:
> + release_firmware(*npufw);
> +
> + return ret;
> +}
> +
> +static void aie4_release_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware *npufw,
> + const struct firmware *certfw)
> +{
> + release_firmware(certfw);
> + release_firmware(npufw);
> +}
> +
> +static int aie4_prepare_firmware(struct amdxdna_dev_hdl *ndev,
> + const struct firmware *npufw,
> + const struct firmware *certfw,
> + void __iomem *tbl[PCI_NUM_RESOURCES])
> +{
> + struct amdxdna_dev *xdna = ndev->aie.xdna;
> + struct psp_config psp_conf;
> + int i;
> +
> + psp_conf.fw_size = npufw->size;
> + psp_conf.fw_buf = npufw->data;
> + psp_conf.certfw_size = certfw->size;
> + psp_conf.certfw_buf = certfw->data;
> + psp_conf.arg2_mask = ~0;
> + psp_conf.notify_val = PSP_NOTIFY_INTR;
> + for (i = 0; i < PSP_MAX_REGS; i++)
> + psp_conf.psp_regs[i] = tbl[PSP_REG_BAR(ndev, i)] + PSP_REG_OFF(ndev, i);
> + ndev->aie.psp_hdl = aiem_psp_create(&xdna->ddev, &psp_conf);
> + if (!ndev->aie.psp_hdl) {
> + XDNA_ERR(xdna, "failed to create psp");
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
> {
> struct amdxdna_dev *xdna = ndev->aie.xdna;
> struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
> void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
> + const struct firmware *npufw, *certfw;
> unsigned long bars = 0;
> int ret, i;
>
> @@ -282,6 +372,8 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
> return ret;
> }
>
> + for (i = 0; i < PSP_MAX_REGS; i++)
> + set_bit(PSP_REG_BAR(ndev, i), &bars);
> set_bit(xdna->dev_info->mbox_bar, &bars);
> set_bit(xdna->dev_info->sram_bar, &bars);
>
> @@ -300,6 +392,15 @@ static int aie4_pcidev_init(struct amdxdna_dev_hdl *ndev)
>
> pci_set_master(pdev);
>
> + ret = aie4_request_firmware(ndev, &npufw, &certfw);
> + if (ret)
> + goto clear_master;
> +
> + ret = aie4_prepare_firmware(ndev, npufw, certfw, tbl);
> + aie4_release_firmware(ndev, npufw, certfw);
> + if (ret)
> + goto clear_master;
> +
> ret = aie4_irq_init(xdna);
> if (ret)
> goto clear_master;
> diff --git a/drivers/accel/amdxdna/aie4_pci.h b/drivers/accel/amdxdna/aie4_pci.h
> index f3810a969431..ee388ccf7196 100644
> --- a/drivers/accel/amdxdna/aie4_pci.h
> +++ b/drivers/accel/amdxdna/aie4_pci.h
> @@ -14,9 +14,13 @@
> #include "amdxdna_mailbox.h"
>
> struct amdxdna_dev_priv {
> + const char *npufw_path;
> + const char *certfw_path;
> u32 mbox_bar;
> u32 mbox_rbuf_bar;
> u64 mbox_info_off;
> +
> + struct aie_bar_off_pair psp_regs_off[PSP_MAX_REGS];
> };
>
> struct amdxdna_dev_hdl {
> diff --git a/drivers/accel/amdxdna/aie_psp.c b/drivers/accel/amdxdna/aie_psp.c
> index 8743b812a449..458dca7cc5a0 100644
> --- a/drivers/accel/amdxdna/aie_psp.c
> +++ b/drivers/accel/amdxdna/aie_psp.c
> @@ -18,6 +18,7 @@
> #define PSP_VALIDATE 1
> #define PSP_START 2
> #define PSP_RELEASE_TMR 3
> +#define PSP_VALIDATE_CERT 4
>
> /* PSP special arguments */
> #define PSP_START_COPY_FW 1
> @@ -27,10 +28,20 @@
> #define PSP_ERROR_BAD_STATE 0xFFFF0007
>
> #define PSP_FW_ALIGN 0x10000
> +#define PSP_CFW_ALIGN 0x8000
> #define PSP_POLL_INTERVAL 20000 /* us */
> #define PSP_POLL_TIMEOUT 1000000 /* us */
>
> -#define PSP_REG(p, reg) ((p)->psp_regs[reg])
> +#define PSP_REG(p, reg) ((p)->conf.psp_regs[reg])
> +#define PSP_SET_CMD(psp, reg_vals, cmd, arg0, arg1, arg2) \
> +({ \
> + u32 *_regs = reg_vals; \
> + u32 _cmd = cmd; \
> + _regs[0] = _cmd; \
> + _regs[1] = arg0; \
> + _regs[2] = arg1; \
> + _regs[3] = ((arg2) | ((_cmd) << 24)) & (psp)->conf.arg2_mask; \
> +})
>
For AIE4, arg2_mask is set to ~0 (0xFFFFFFFF), which means the full
32-bit value including cmd<<24 is preserved.
If arg2 uses bits 24-31, the OR operation could corrupt the cmd field.
For example:
arg2 = 0x02000000 (32MB firmware size, bit 25 set)
cmd = 1 (PSP_VALIDATE)
_regs[3] = (0x02000000 | 0x01000000) & 0xFFFFFFFF
= 0x03000000
This puts cmd=3 instead of cmd=1 in bits 24-31, while the size field
in bits 0-23 becomes 0 instead of the intended value.
Should arg2 be masked before the OR to ensure it only uses bits 0-23?
_regs[3] = ((arg2 & 0x00FFFFFF) | (_cmd << 24)) & (psp)->conf.arg2_mask;
This would prevent arg2 from corrupting the cmd field on AIE4 while
maintaining backward compatibility with AIE2 (which masks out the cmd
bits anyway).
> struct psp_device {
> struct drm_device *ddev;
> @@ -38,7 +49,9 @@ struct psp_device {
> u32 fw_buf_sz;
> u64 fw_paddr;
> void *fw_buffer;
> - void __iomem *psp_regs[PSP_MAX_REGS];
> + u32 certfw_buf_sz;
> + u64 certfw_paddr;
> + void *certfw_buffer;
> };
>
> static int psp_exec(struct psp_device *psp, u32 *reg_vals)
> @@ -47,13 +60,22 @@ static int psp_exec(struct psp_device *psp, u32 *reg_vals)
> int ret, i;
> u32 ready;
>
> + /* Check for PSP ready before any write */
> + ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
> + FIELD_GET(PSP_STATUS_READY, ready),
> + PSP_POLL_INTERVAL, PSP_POLL_TIMEOUT);
> + if (ret) {
> + drm_err(psp->ddev, "PSP is not ready, ret 0x%x", ret);
> + return ret;
> + }
> +
> /* Write command and argument registers */
> for (i = 0; i < PSP_NUM_IN_REGS; i++)
> writel(reg_vals[i], PSP_REG(psp, i));
>
> /* clear and set PSP INTR register to kick off */
> writel(0, PSP_REG(psp, PSP_INTR_REG));
> - writel(1, PSP_REG(psp, PSP_INTR_REG));
> + writel(psp->conf.notify_val, PSP_REG(psp, PSP_INTR_REG));
>
> /* PSP should be busy. Wait for ready, so we know task is done. */
> ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
> @@ -90,69 +112,124 @@ int aie_psp_waitmode_poll(struct psp_device *psp)
>
> void aie_psp_stop(struct psp_device *psp)
> {
> - u32 reg_vals[PSP_NUM_IN_REGS] = { PSP_RELEASE_TMR, };
> + u32 reg_vals[PSP_NUM_IN_REGS];
> int ret;
>
> + PSP_SET_CMD(psp, reg_vals, PSP_RELEASE_TMR, 0, 0, 0);
> +
> ret = psp_exec(psp, reg_vals);
> if (ret)
> drm_err(psp->ddev, "release tmr failed, ret %d", ret);
> }
>
> -int aie_psp_start(struct psp_device *psp)
> +static int psp_validate_fw(struct psp_device *psp, u8 cmd, u64 paddr, u32 buf_sz)
> {
> u32 reg_vals[PSP_NUM_IN_REGS];
> int ret;
>
> - reg_vals[0] = PSP_VALIDATE;
> - reg_vals[1] = lower_32_bits(psp->fw_paddr);
> - reg_vals[2] = upper_32_bits(psp->fw_paddr);
> - reg_vals[3] = psp->fw_buf_sz;
> + PSP_SET_CMD(psp, reg_vals, cmd, lower_32_bits(paddr),
> + upper_32_bits(paddr), buf_sz);
>
> ret = psp_exec(psp, reg_vals);
> - if (ret) {
> + if (ret)
> drm_err(psp->ddev, "failed to validate fw, ret %d", ret);
> - return ret;
> - }
>
> - memset(reg_vals, 0, sizeof(reg_vals));
> - reg_vals[0] = PSP_START;
> - reg_vals[1] = PSP_START_COPY_FW;
> + return ret;
> +}
> +
> +static int psp_start(struct psp_device *psp)
> +{
> + u32 reg_vals[PSP_NUM_IN_REGS];
> + int ret;
> +
> + PSP_SET_CMD(psp, reg_vals, PSP_START, PSP_START_COPY_FW, 0, 0);
> +
> ret = psp_exec(psp, reg_vals);
> - if (ret) {
> + if (ret)
> drm_err(psp->ddev, "failed to start fw, ret %d", ret);
> +
> + return ret;
> +}
> +
> +int aie_psp_start(struct psp_device *psp)
> +{
> + int ret;
> +
> + ret = psp_validate_fw(psp, PSP_VALIDATE,
> + psp->fw_paddr, psp->fw_buf_sz);
> + if (ret)
> return ret;
> - }
>
> - return 0;
> + if (!psp->certfw_buf_sz)
> + goto psp_start;
> +
> + ret = psp_validate_fw(psp, PSP_VALIDATE_CERT,
> + psp->certfw_paddr, psp->certfw_buf_sz);
> + if (ret)
> + return ret;
> +psp_start:
> + return psp_start(psp);
> +}
> +
> +/*
> + * PSP requires host physical address to load firmware.
> + * Allocate a buffer, obtain its physical address, align, and copy data in.
> + */
> +static void *psp_alloc_fw_buf(struct psp_device *psp, const void *fw_data,
> + u32 fw_size, u32 align, u32 *buf_sz,
> + u64 *paddr)
> +{
> + u32 alloc_sz;
> + void *buffer;
> + u64 offset;
> +
> + *buf_sz = ALIGN(fw_size, align);
> + alloc_sz = *buf_sz + align;
> +
> + buffer = drmm_kmalloc(psp->ddev, alloc_sz, GFP_KERNEL);
> + if (!buffer)
> + return NULL;
> +
> + *paddr = virt_to_phys(buffer);
> + offset = ALIGN(*paddr, align) - *paddr;
> + *paddr += offset;
> + memcpy(buffer + offset, fw_data, fw_size);
> +
> + return buffer;
> }
>
Two comments:
1) Can an integer overflow check be added here? If fw_size is very large
(close to UINT_MAX), ALIGN(fw_size, align) could overflow:
fw_size = 0xFFFF0001 (4GB - 64KB + 1)
align = 0x10000 (64KB)
*buf_sz = ALIGN(0xFFFF0001, 0x10000) = 0x0 (overflow)
alloc_sz = 0x0 + 0x10000 = 0x10000
The memcpy() of fw_size bytes would then overrun the 64KB allocation.
2) virt_to_phys() on drmm_kmalloc() allocated memory relies on the buffer
being physically contiguous. I'm not sure how large this FW is.
kmalloc() memory is physically contiguous, but requests larger than a
few MB are likely to fail on a fragmented system. Would
dma_alloc_coherent() be more appropriate?
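A user-space sketch of the kind of overflow check asked for in (1). The helper name psp_fw_buf_sizes() is hypothetical and __builtin_add_overflow() (GCC/Clang) stands in for the kernel's check_add_overflow() from <linux/overflow.h>; only the size math (round up to align, then add align as slack) is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * User-space stand-in for the size math in psp_alloc_fw_buf().
 * Returns 0 and fills *buf_sz / *alloc_sz, or -EOVERFLOW if either
 * the round-up or the extra alignment slack would wrap a 32-bit value.
 */
static int psp_fw_buf_sizes(uint32_t fw_size, uint32_t align,
			    uint32_t *buf_sz, uint32_t *alloc_sz)
{
	uint32_t rounded;

	/* align must be a non-zero power of two, as with the kernel's ALIGN() */
	assert(align && (align & (align - 1)) == 0);

	/* fw_size + (align - 1) is the step that can wrap inside ALIGN() */
	if (__builtin_add_overflow(fw_size, align - 1, &rounded))
		return -EOVERFLOW;
	rounded &= ~(align - 1);

	/* the buffer is over-allocated by 'align' so the address can be aligned */
	if (__builtin_add_overflow(rounded, align, alloc_sz))
		return -EOVERFLOW;

	*buf_sz = rounded;
	return 0;
}
```

With this shape, both a size that wraps during the round-up and an already aligned size whose slack wraps are rejected before any allocation happens.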
> struct psp_device *aiem_psp_create(struct drm_device *ddev, struct psp_config *conf)
> {
> struct psp_device *psp;
> - u64 offset;
>
> psp = drmm_kzalloc(ddev, sizeof(*psp), GFP_KERNEL);
> if (!psp)
> return NULL;
>
> psp->ddev = ddev;
> - memcpy(psp->psp_regs, conf->psp_regs, sizeof(psp->psp_regs));
> + psp->fw_buffer = psp_alloc_fw_buf(psp, conf->fw_buf, conf->fw_size,
> + PSP_FW_ALIGN, &psp->fw_buf_sz,
> + &psp->fw_paddr);
> + if (!psp->fw_buffer)
> + return NULL;
> +
> + if (!conf->certfw_size) {
> + drm_dbg(ddev, "no cert fw");
> + goto done;
> + }
>
> - psp->fw_buf_sz = ALIGN(conf->fw_size, PSP_FW_ALIGN);
> - psp->fw_buffer = drmm_kmalloc(ddev, psp->fw_buf_sz + PSP_FW_ALIGN, GFP_KERNEL);
> - if (!psp->fw_buffer) {
> - drm_err(ddev, "no memory for fw buffer");
> + /* CERT firmware */
> + psp->certfw_buffer = psp_alloc_fw_buf(psp, conf->certfw_buf,
> + conf->certfw_size, PSP_CFW_ALIGN,
> + &psp->certfw_buf_sz,
> + &psp->certfw_paddr);
> + if (!psp->certfw_buffer) {
> + drm_err(ddev, "no memory for cert fw buffer");
> return NULL;
> }
>
> - /*
> - * AMD Platform Security Processor(PSP) requires host physical
> - * address to load NPU firmware.
> - */
> - psp->fw_paddr = virt_to_phys(psp->fw_buffer);
> - offset = ALIGN(psp->fw_paddr, PSP_FW_ALIGN) - psp->fw_paddr;
> - psp->fw_paddr += offset;
> - memcpy(psp->fw_buffer + offset, conf->fw_buf, conf->fw_size);
> +done:
> + memcpy(&psp->conf, conf, sizeof(psp->conf));
>
> return psp;
> }
> diff --git a/drivers/accel/amdxdna/npu3_regs.c b/drivers/accel/amdxdna/npu3_regs.c
> index f6e20f4858db..fb2bd60b8f00 100644
> --- a/drivers/accel/amdxdna/npu3_regs.c
> +++ b/drivers/accel/amdxdna/npu3_regs.c
> @@ -16,6 +16,15 @@
>
> /* PCIe BAR Index for NPU3 */
> #define NPU3_REG_BAR_INDEX 0
> +#define NPU3_PSP_BAR_INDEX 4
> +
> +#define MMNPU_APERTURE3_BASE 0x3810000
> +#define NPU3_PSP_BAR_BASE MMNPU_APERTURE3_BASE
> +
> +#define MPASP_C2PMSG_123_ALT_1 0x3810AEC
> +#define MPASP_C2PMSG_156_ALT_1 0x3810B70
> +#define MPASP_C2PMSG_157_ALT_1 0x3810B74
> +#define MPASP_C2PMSG_73_ALT_1 0x3810A24
>
> static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
> { .major = 5, .min_minor = 10 },
> @@ -23,14 +32,28 @@ static const struct amdxdna_fw_feature_tbl npu3_fw_feature_table[] = {
> };
>
> static const struct amdxdna_dev_priv npu3_dev_priv = {
> + .npufw_path = "npu.dev.sbin",
> + .certfw_path = "cert.dev.sbin",
> .mbox_bar = NPU3_MBOX_BAR,
> .mbox_rbuf_bar = NPU3_MBOX_BUFFER_BAR,
> .mbox_info_off = NPU3_MBOX_INFO_OFF,
> + .psp_regs_off = {
> + DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG0_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG1_REG, NPU3_PSP, MPASP_C2PMSG_157_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_INTR_REG, NPU3_PSP, MPASP_C2PMSG_73_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
> + DEFINE_BAR_OFFSET(PSP_RESP_REG, NPU3_PSP, MPASP_C2PMSG_156_ALT_1),
> + /* npu3 doesn't use 8th pwaitmode register */
> + },
> +
> };
>
> const struct amdxdna_dev_info dev_npu3_pf_info = {
> .mbox_bar = NPU3_MBOX_BAR,
> .sram_bar = NPU3_MBOX_BUFFER_BAR,
> + .psp_bar = NPU3_PSP_BAR_INDEX,
> .vbnv = "RyzenAI-npu3-pf",
> .device_type = AMDXDNA_DEV_TYPE_PF,
> .dev_priv = &npu3_dev_priv,
* Claude review: accel/amdxdna: Initial support for AIE4 platform
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
` (5 preceding siblings ...)
2026-03-30 16:37 ` [PATCH V1 6/6] accel/amdxdna: Add AIE4 power on and off support Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
6 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: accel/amdxdna: Initial support for AIE4 platform
Author: Lizhi Hou <lizhi.hou@amd.com>
Patches: 10
Reviewed: 2026-03-31T17:05:08.529572
---
This 6-patch series adds initial AIE4 (NPU3) Physical Function support with SR-IOV to the amdxdna accel driver (PCI IDs 0x17F2 and 0x1B0B). The approach is to refactor existing AIE2-specific mailbox, PSP, and SMU code into shared "aie_" common code, then build AIE4 support on top.
**Overall assessment: The refactoring approach is sound**, and the series is structured logically (common code first, then AIE4 specifics). However, there are several issues:
1. **Two `struct amdxdna_dev_hdl` definitions** — AIE2 (`aie2_pci.h`) and AIE4 (`aie4_pci.h`) define completely different structs with the same name. This works because they're compiled in separate translation units, but it's fragile and makes the codebase confusing. A more scalable approach would use different names or a proper inheritance pattern.
2. **Two `struct amdxdna_dev_priv` definitions** — same problem. Both `aie2_pci.h` and `aie4_pci.h` define this struct differently.
3. **Behavioral change in status checking** — The old `DECLARE_AIE2_MSG` used `MAX_AIE2_STATUS_CODE` for the status sentinel, the new `DECLARE_AIE_MSG` uses `-1`. This changes how success/failure is determined in `aie_send_mgmt_msg_wait()`.
4. **UAPI gap** — `AMDXDNA_DEV_TYPE_PF = 2` skips value 1, which should be explicitly documented.
5. **Missing `sriov_configure` in existing ops** — NULL dereference is avoided by a check, but the ops struct lacks the field declaration in the same patch that adds it.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Create shared functions for AIE2 and AIE4
2026-03-30 16:37 ` [PATCH V1 1/6] accel/amdxdna: Create shared functions for AIE2 and AIE4 Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
0 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
This is the refactoring foundation. Moves mailbox management channel, protocol checking, and feature mask into a new `struct aie_device` embedded in `amdxdna_dev_hdl`.
**Issue 1 — Status check semantic change:**
```c
// Old (aie2_message.c):
#define DECLARE_AIE2_MSG(name, op) \
DECLARE_XDNA_MSG_COMMON(name, op, MAX_AIE2_STATUS_CODE)
...
if (!ret && *hdl->status != AIE2_STATUS_SUCCESS) {
// New (aie.h / aie.c):
#define DECLARE_AIE_MSG(name, op) \
DECLARE_XDNA_MSG_COMMON(name, op, -1)
...
if (!ret && *hdl->status) {
```
The old code checked against `AIE2_STATUS_SUCCESS` (0) explicitly with a `MAX_AIE2_STATUS_CODE` sentinel for the notify callback. The new code uses `-1` as sentinel and checks `*hdl->status` (truthy = nonzero = error). This happens to be functionally equivalent if `AIE2_STATUS_SUCCESS == 0`, but merging the AIE2 path into a generic check that uses `-1` is a subtle behavioral change that should be called out in the commit message.
**Issue 2 — `fw_feature_tbl` moved to `amdxdna_dev_info`:**
Moving `fw_feature_tbl` from `amdxdna_dev_priv` to `amdxdna_dev_info` is a layering change. The code in `aie_check_protocol()` accesses it via `aie->xdna->dev_info->fw_feature_tbl`. This is fine but the commit message doesn't mention this relocation.
**Issue 3 — Extra blank line removal:**
```c
-
op = amdxdna_cmd_get_op(cmd_abo);
```
Unrelated whitespace cleanup snuck into this patch (line ~891 area). Should be in a separate patch or at least noted.
**Minor:** `aie_destroy_chann` (with its `drm_WARN_ON`) takes a `**chann` double pointer, which is a reasonable pattern for NULL-out-on-destroy but differs from the original, which operated directly on `ndev->mgmt_chann`.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Add basic support for AIE4 devices
2026-03-30 16:37 ` [PATCH V1 2/6] accel/amdxdna: Add basic support for AIE4 devices Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
0 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
This is the largest patch — adds aie4_pci.c, aie4_sriov.c, aie4_message.c, npu3_regs.c, mailbox changes, PCI driver changes, and UAPI change.
**Issue 1 — Duplicate struct names:**
```c
// aie4_pci.h:
struct amdxdna_dev_priv {
u32 mbox_bar;
u32 mbox_rbuf_bar;
u64 mbox_info_off;
};
struct amdxdna_dev_hdl {
struct aie_device aie;
const struct amdxdna_dev_priv *priv;
...
};
```
These shadow the identically-named structs in `aie2_pci.h`. This means any `.c` file can only include one of these headers. This is a maintenance hazard — if someone adds a shared helper that needs `ndev->xdna`, it will silently pick the wrong struct depending on includes. Consider `aie2_dev_hdl` / `aie4_dev_hdl` or a union/opaque pattern.
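A minimal user-space sketch of the "different names plus embedded common struct" alternative. The names `aie2_dev_hdl`, `aie4_dev_hdl`, `to_aie2_hdl`, `to_aie4_hdl` and all fields are hypothetical; the macro imitates the kernel's `container_of()` and is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Common state shared by both generations (field is a placeholder). */
struct aie_device {
	int mgmt_prot_major;
};

/* Hypothetical per-generation handles with distinct names. */
struct aie2_dev_hdl {
	struct aie_device aie;
	int aie2_only_state;
};

struct aie4_dev_hdl {
	int aie4_only_state;
	struct aie_device aie;	/* need not be the first member */
};

/* Same idea as the kernel's container_of(), spelled out for user space. */
#define aie_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct aie2_dev_hdl *to_aie2_hdl(struct aie_device *aie)
{
	assert(aie);
	return aie_container_of(aie, struct aie2_dev_hdl, aie);
}

static struct aie4_dev_hdl *to_aie4_hdl(struct aie_device *aie)
{
	assert(aie);
	return aie_container_of(aie, struct aie4_dev_hdl, aie);
}
```

Shared helpers would then take `struct aie_device *` only, and each generation converts back to its own handle explicitly, so both headers could be included in the same file without a name clash.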
**Issue 2 — UAPI gap in device type enum:**
```c
enum amdxdna_device_type {
AMDXDNA_DEV_TYPE_UNKNOWN = -1,
AMDXDNA_DEV_TYPE_KMQ = 0,
AMDXDNA_DEV_TYPE_PF = 2,
};
```
Value 1 is skipped. This is ABI — once merged it can't change. If value 1 was intentionally reserved, document it. If not, use value 1.
**Issue 3 — PCI ID case inconsistency:**
```c
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x17f2) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x1B0B) },
```
`0x17f2` uses lowercase hex digits, `0x1B0B` uses uppercase. Should be consistent (kernel convention is lowercase hex).
**Issue 4 — `aie4_sriov_stop` returning error but continuing:**
```c
int aie4_sriov_stop(struct amdxdna_dev_hdl *ndev)
{
...
pci_disable_sriov(pdev);
return aie4_destroy_vfs(ndev);
}
```
`pci_disable_sriov()` returns void, so by the time `aie4_destroy_vfs` can fail the VFs are already disabled at the PCI level while the firmware still thinks they exist. The caller then gets an error for a teardown that is already half done. The error handling needs consideration here.
**Issue 5 — `readx_poll_timeout` with pointer arithmetic on `__iomem`:**
```c
src = ndev->rbuf_base + npriv->mbox_info_off;
ret = readx_poll_timeout(readl, src + offsetof(struct mailbox_info, valid),
```
`src` is `u32 __iomem *`, so `+ offsetof(...)` does pointer arithmetic in units of `u32`, not bytes. `offsetof` returns bytes. This would read the wrong offset. Either `src` should be `void __iomem *` or the offset should be divided by `sizeof(u32)`.
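The scaling effect can be reproduced in user space. This is only an illustration of C pointer arithmetic: `struct mailbox_info_example` is a made-up layout (the real `struct mailbox_info` is not shown in this thread), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real struct mailbox_info is not shown here. */
struct mailbox_info_example {
	uint32_t fields[4];
	uint32_t valid;
};

/* Byte distance reached by "u32 pointer + n": n is scaled by sizeof(u32). */
static size_t bytes_advanced_u32(const uint32_t *base, size_t n)
{
	assert(base);
	return (size_t)((const char *)(base + n) - (const char *)base);
}

/* Byte distance reached by "byte pointer + n": n is taken literally. */
static size_t bytes_advanced_u8(const uint8_t *base, size_t n)
{
	assert(base);
	return (size_t)((base + n) - base);
}
```

For a field at byte offset 16, the `u32` pointer version lands 64 bytes past the base, while the byte pointer version lands at 16. In the kernel, `void __iomem *` arithmetic behaves like the byte pointer case (a GNU C extension).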
**Issue 6 — `aie4_sriov_configure` return value for enable:**
```c
static int aie4_sriov_start(...)
{
...
return num_vfs;
}
```
Returning `num_vfs` on success is correct per the `sriov_configure` API, good.
**Issue 7 — Mailbox iohub helpers:**
```c
static inline u32 mailbox_irq_status(struct mailbox_channel *mb_chann)
{
return (mb_chann->iohub_int_addr) ?
mailbox_reg_read(mb_chann, mb_chann->iohub_int_addr) : 0;
}
```
Returning 0 when there's no iohub means the "check again" loop in `mailbox_rx_worker` will never re-enter via the `mailbox_irq_status` path. This seems intentional for AIE4 (no iohub), but the worker would then potentially miss messages that arrive during the window between processing and the status check. Worth a comment explaining why this is safe.
**Issue 8 — Missing `MODULE_FIRMWARE` for NPU3:**
New firmware paths are introduced but no `MODULE_FIRMWARE()` declarations are added for the NPU3 firmware files.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4
2026-03-30 16:37 ` [PATCH V1 3/6] accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4 Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
0 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Renames `aie2_psp.c` → `aie_psp.c` and moves PSP types/declarations into `aie.h`.
**Issue 1 — Copyright year change removes history:**
```c
-// Copyright (C) 2022-2024, Advanced Micro Devices, Inc.
+// Copyright (C) 2026, Advanced Micro Devices, Inc.
```
The original copyright range (2022-2024) is replaced with just 2026. This loses the attribution history. Should be `2022-2026`.
**Issue 2 — `psp_conf` zero-initialization:**
```c
- struct psp_config psp_conf;
+ struct psp_config psp_conf = { 0 };
```
Good: this is needed because `psp_config` now has more fields (`certfw_buf`, `certfw_size`, `arg2_mask`, `notify_val`). But `arg2_mask` and `notify_val` are not set in this patch (they are added in patch 4), so `arg2_mask = 0` and `notify_val = 0` would be used if PSP runs before patch 4 is applied. This means the series has a bisection issue: patch 3 alone would break PSP, because `notify_val = 0` means the interrupt never fires.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Add AIE4 firmware loading
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
2026-03-30 20:17 ` Mario Limonciello
2026-03-31 2:45 ` Mario Limonciello
@ 2026-03-31 7:05 ` Claude Code Review Bot
2 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Adds CERT firmware support to PSP, AIE4 firmware loading paths, and NPU3 PSP register definitions.
**Issue 1 — Bisection fix for patch 3:**
```c
+ psp_conf.arg2_mask = GENMASK(23, 0);
+ psp_conf.notify_val = 1;
```
This is added in `aie2_pci.c`, fixing the bisection issue from patch 3. These two lines should have been in patch 3.
**Issue 2 — `PSP_SET_CMD` macro readability:**
```c
#define PSP_SET_CMD(psp, reg_vals, cmd, arg0, arg1, arg2) \
({ \
...
_regs[3] = ((arg2) | ((_cmd) << 24)) & (psp)->conf.arg2_mask; \
})
```
The `arg2` register now encodes both `arg2` data AND the command shifted left by 24 bits, masked by a per-device mask. For AIE2, `arg2_mask = GENMASK(23, 0)` strips the command bits (so `arg2` register = just the size, backwards compatible). For AIE4, `arg2_mask = ~0` keeps command bits. This is clever but should be documented in a comment explaining the protocol difference.
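A small user-space sketch of how the same expression yields different register values per device. The mask values come from the series (`GENMASK(23, 0)` for AIE2, `~0` for AIE4); the macro names `AIE2_ARG2_MASK` / `AIE4_ARG2_MASK` and the helper `psp_arg2_reg` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the series: AIE2 uses GENMASK(23, 0), AIE4 uses ~0. */
#define AIE2_ARG2_MASK 0x00FFFFFFu
#define AIE4_ARG2_MASK 0xFFFFFFFFu

/* Same expression as the last line of PSP_SET_CMD. */
static uint32_t psp_arg2_reg(uint32_t cmd, uint32_t arg2, uint32_t arg2_mask)
{
	assert(cmd <= 0xFF);	/* the command must fit in bits 24-31 */
	return (arg2 | (cmd << 24)) & arg2_mask;
}
```

For `PSP_VALIDATE` (1) with a 64 KB size, the AIE2 mask leaves `0x00010000` (size only), while the AIE4 mask leaves `0x01010000` (command in the top byte, size in the low 24 bits).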
**Issue 3 — PSP register aliasing for NPU3:**
```c
DEFINE_BAR_OFFSET(PSP_CMD_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
...
DEFINE_BAR_OFFSET(PSP_ARG2_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
...
DEFINE_BAR_OFFSET(PSP_STATUS_REG, NPU3_PSP, MPASP_C2PMSG_123_ALT_1),
```
Three registers (`PSP_CMD_REG`, `PSP_ARG2_REG`, `PSP_STATUS_REG`) all map to the same physical address `MPASP_C2PMSG_123_ALT_1`. Similarly `PSP_ARG0_REG` and `PSP_RESP_REG` share `MPASP_C2PMSG_156_ALT_1`. This means writes to CMD, ARG2, and reads from STATUS all hit the same MMIO register. This is a fundamental protocol difference from AIE2 and should be documented, as it means the register serves different purposes depending on direction (write for CMD/ARG2, read for STATUS).
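The aliasing can be made explicit with a short user-space sketch. The addresses and BAR base are the ones defined in `npu3_regs.c` in this patch; the enum, the table name `npu3_psp_off`, and the helper `psp_alias_count` are illustrative, and the enum order is an assumption that does not affect the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses and BAR base as defined in npu3_regs.c in this patch. */
#define NPU3_PSP_BAR_BASE	0x3810000u
#define MPASP_C2PMSG_123_ALT_1	0x3810AECu
#define MPASP_C2PMSG_156_ALT_1	0x3810B70u
#define MPASP_C2PMSG_157_ALT_1	0x3810B74u
#define MPASP_C2PMSG_73_ALT_1	0x3810A24u

/* Illustrative logical register indices. */
enum psp_reg {
	PSP_CMD, PSP_ARG0, PSP_ARG1, PSP_ARG2,
	PSP_INTR, PSP_STATUS, PSP_RESP, PSP_NREGS
};

/* BAR-relative offsets: register address minus BAR base. */
static const uint32_t npu3_psp_off[PSP_NREGS] = {
	[PSP_CMD]    = MPASP_C2PMSG_123_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_ARG0]   = MPASP_C2PMSG_156_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_ARG1]   = MPASP_C2PMSG_157_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_ARG2]   = MPASP_C2PMSG_123_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_INTR]   = MPASP_C2PMSG_73_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_STATUS] = MPASP_C2PMSG_123_ALT_1 - NPU3_PSP_BAR_BASE,
	[PSP_RESP]   = MPASP_C2PMSG_156_ALT_1 - NPU3_PSP_BAR_BASE,
};

/* Count logical registers sharing an offset with 'reg' (including itself). */
static size_t psp_alias_count(enum psp_reg reg)
{
	size_t i, n = 0;

	assert(reg < PSP_NREGS);
	for (i = 0; i < PSP_NREGS; i++)
		if (npu3_psp_off[i] == npu3_psp_off[reg])
			n++;
	return n;
}
```

A table like this, or a comment derived from it, would make it obvious that `psp_exec()` writes CMD and ARG2 to the same location and later polls that same location for STATUS.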
**Issue 4 — `virt_to_phys` usage:**
```c
*paddr = virt_to_phys(buffer);
```
`virt_to_phys` on `kmalloc`/`drmm_kmalloc` memory is technically valid but fragile. The existing AIE2 code already does this, so this isn't new, but it's worth noting that this won't work on systems with IOMMU remapping if the PSP can't bypass it. This appears to be an existing design choice.
**Issue 5 — psp_conf stored by value:**
```c
memcpy(&psp->conf, conf, sizeof(psp->conf));
```
`psp->conf` stores `fw_buf` and `certfw_buf` pointers. After `aie4_release_firmware()` is called, these pointers become dangling. This is fine if the firmware data was already copied into the aligned buffers, but the pointers in `conf` should not be accessed after this point. The code doesn't seem to use them after `aiem_psp_create`, but storing dangling pointers is a latent bug.
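One way to avoid keeping the stale pointers, sketched in user space under stated assumptions: the structs are trimmed, hypothetical stand-ins for `psp_config` / `psp_device`, and `malloc`/`calloc` replace the DRM-managed allocators. The only point illustrated is clearing the borrowed pointer after the image has been copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Trimmed, hypothetical version of struct psp_config. */
struct psp_config_example {
	const void *fw_buf;	/* borrowed: owned by the firmware loader */
	uint32_t fw_size;
	uint32_t arg2_mask;
};

struct psp_device_example {
	void *fw_copy;		/* owned: private copy of the image */
	struct psp_config_example conf;
};

/* Copy the image, keep the config, but drop the borrowed pointer. */
static struct psp_device_example *
psp_create_example(const struct psp_config_example *conf)
{
	struct psp_device_example *psp;

	assert(conf && conf->fw_buf && conf->fw_size);

	psp = calloc(1, sizeof(*psp));
	if (!psp)
		return NULL;

	psp->fw_copy = malloc(conf->fw_size);
	if (!psp->fw_copy) {
		free(psp);
		return NULL;
	}
	memcpy(psp->fw_copy, conf->fw_buf, conf->fw_size);

	psp->conf = *conf;
	psp->conf.fw_buf = NULL;	/* caller may release the firmware now */
	return psp;
}

static void psp_destroy_example(struct psp_device_example *psp)
{
	if (!psp)
		return;
	free(psp->fw_copy);
	free(psp);
}
```

An alternative with the same effect would be to store only the fields that are needed later (register table, `arg2_mask`, `notify_val`) instead of the whole config.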
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4
2026-03-30 16:37 ` [PATCH V1 5/6] accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4 Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
0 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Moves SMU code from `aie2_smu.c` to `aie_smu.c` with a new `smu_device` abstraction.
**Issue 1 — Non-angle-bracket include:**
```c
+#include "drm/amdxdna_accel.h"
```
Should be `<drm/amdxdna_accel.h>` (angle brackets for installed headers).
**Issue 2 — Unnecessary include:**
```c
+#include <drm/gpu_scheduler.h>
```
`aie_smu.c` doesn't use any gpu_scheduler APIs. This was carried over from the old `aie2_smu.c`.
**Issue 3 — `aie2_smu_fini` wraps two operations but `aie_smu_fini` only does one:**
```c
// Old:
void aie2_smu_fini(struct amdxdna_dev_hdl *ndev)
{
ndev->priv->hw_ops.set_dpm(ndev, 0); // Reset DPM first
aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, ...);
}
// New wrapper in aie2_pci.c:
static void aie2_smu_fini(struct amdxdna_dev_hdl *ndev)
{
ndev->priv->hw_ops.set_dpm(ndev, 0);
aie_smu_fini(ndev->aie.smu_hdl);
}
```
The DPM reset is kept in the wrapper. This is fine but means AIE4 callers must remember to reset DPM themselves before calling `aie_smu_fini()` or the power-off happens at whatever DPM level was active. The patch 6 AIE4 path calls `aie_smu_fini()` directly without a DPM reset — need to confirm this is intentional.
**Issue 4 — `npu1_set_dpm` moved to `npu1_regs.c`:**
The `npu1_set_dpm` function was `extern` declared in `aie2_pci.h` but is now a `static` function in `npu1_regs.c`. However, `npu4_set_dpm` remains extern. The header still has:
```c
int npu4_set_dpm(struct amdxdna_dev_hdl *ndev, u32 dpm_level);
```
But `npu1_set_dpm` declaration is removed. This works because `npu1_set_dpm` is referenced only via function pointer in `npu1_regs.c`, but the asymmetry is confusing.
---
Generated by Claude Code Patch Reviewer
* Claude review: accel/amdxdna: Add AIE4 power on and off support
2026-03-30 16:37 ` [PATCH V1 6/6] accel/amdxdna: Add AIE4 power on and off support Lizhi Hou
@ 2026-03-31 7:05 ` Claude Code Review Bot
0 siblings, 0 replies; 17+ messages in thread
From: Claude Code Review Bot @ 2026-03-31 7:05 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Adds SMU init/fini to the AIE4 firmware load/unload path and NPU3 SMU register definitions.
**Issue 1 — `SMU_ARG_REG` and `SMU_OUT_REG` aliased:**
```c
DEFINE_BAR_OFFSET(SMU_ARG_REG, NPU3_SMU, MP1_C2PMSG_61_ALT_1),
...
DEFINE_BAR_OFFSET(SMU_OUT_REG, NPU3_SMU, MP1_C2PMSG_61_ALT_1),
```
Same register for input argument and output. This is the same aliasing pattern as PSP. Appears intentional for the NPU3 hardware protocol.
**Issue 2 — `SMU_INTR_REG` maps to bar base:**
```c
DEFINE_BAR_OFFSET(SMU_INTR_REG, NPU3_SMU, MMNPU_APERTURE4_BASE),
```
`DEFINE_BAR_OFFSET` computes `(reg_addr) - bar##_BAR_BASE`, and here `NPU3_SMU_BAR_BASE = MMNPU_APERTURE4_BASE`, so the offset is 0. The interrupt register is at offset 0 of the SMU BAR. This seems intentional.
**Issue 3 — No DPM reset before `aie_smu_fini` in AIE4:**
```c
static void aie4_fw_unload(struct amdxdna_dev_hdl *ndev)
{
aie_psp_stop(ndev->aie.psp_hdl);
aie_smu_fini(ndev->aie.smu_hdl);
}
```
No DPM level reset before power off. If this is intentional (AIE4 doesn't need it), it should be documented. If not, it's a bug.
**Overall the series is a reasonable first step.** The main concerns are: the duplicate struct name pattern, the PSP bisection issue between patches 3 and 4, the `readx_poll_timeout` pointer arithmetic bug in patch 2, and the UAPI device type gap. The register aliasing patterns should be documented for future maintainers.
---
Generated by Claude Code Patch Reviewer
Thread overview: 17+ messages
2026-03-30 16:36 [PATCH V1 0/6] accel/amdxdna: Initial support for AIE4 platform Lizhi Hou
2026-03-30 16:37 ` [PATCH V1 1/6] accel/amdxdna: Create shared functions for AIE2 and AIE4 Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 2/6] accel/amdxdna: Add basic support for AIE4 devices Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 3/6] accel/amdxdna: Create common PSP interfaces for AIE2 and AIE4 Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 4/6] accel/amdxdna: Add AIE4 firmware loading Lizhi Hou
2026-03-30 20:17 ` Mario Limonciello
2026-03-30 20:30 ` yidong Zhang
2026-03-31 2:45 ` Mario Limonciello
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 5/6] accel/amdxdna: Create common SMU interfaces for AIE2 and AIE4 Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-30 16:37 ` [PATCH V1 6/6] accel/amdxdna: Add AIE4 power on and off support Lizhi Hou
2026-03-31 7:05 ` Claude review: " Claude Code Review Bot
2026-03-31 7:05 ` Claude review: accel/amdxdna: Initial support for AIE4 platform Claude Code Review Bot