* [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
` (8 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
There is no particular order required here and keeping them alphabetical
will help preventing future mistakes.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/fw.rs | 67 +++++++++++++++++++++--------------------
1 file changed, 35 insertions(+), 32 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index f1797e1f0d9d..4b998485360b 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -191,34 +191,34 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
#[repr(u32)]
pub(crate) enum MsgFunction {
// Common function codes
- Nop = bindings::NV_VGPU_MSG_FUNCTION_NOP,
- SetGuestSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO,
- AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
+ AllocChannelDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA,
+ AllocCtxDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA,
AllocDevice = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_DEVICE,
AllocMemory = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY,
- AllocCtxDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA,
- AllocChannelDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA,
- MapMemory = bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY,
- BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
AllocObject = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT,
+ AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
+ BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
Free = bindings::NV_VGPU_MSG_FUNCTION_FREE,
- Log = bindings::NV_VGPU_MSG_FUNCTION_LOG,
GetGspStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO,
- SetRegistry = bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY,
- GspSetSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO,
+ GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
GspInitPostObjGpu = bindings::NV_VGPU_MSG_FUNCTION_GSP_INIT_POST_OBJGPU,
GspRmControl = bindings::NV_VGPU_MSG_FUNCTION_GSP_RM_CONTROL,
- GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
+ GspSetSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO,
+ Log = bindings::NV_VGPU_MSG_FUNCTION_LOG,
+ MapMemory = bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY,
+ Nop = bindings::NV_VGPU_MSG_FUNCTION_NOP,
+ SetGuestSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO,
+ SetRegistry = bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY,
// Event codes
GspInitDone = bindings::NV_VGPU_MSG_EVENT_GSP_INIT_DONE,
+ GspLockdownNotice = bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE,
+ GspPostNoCat = bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD,
GspRunCpuSequencer = bindings::NV_VGPU_MSG_EVENT_GSP_RUN_CPU_SEQUENCER,
- PostEvent = bindings::NV_VGPU_MSG_EVENT_POST_EVENT,
- RcTriggered = bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED,
MmuFaultQueued = bindings::NV_VGPU_MSG_EVENT_MMU_FAULT_QUEUED,
OsErrorLog = bindings::NV_VGPU_MSG_EVENT_OS_ERROR_LOG,
- GspPostNoCat = bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD,
- GspLockdownNotice = bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE,
+ PostEvent = bindings::NV_VGPU_MSG_EVENT_POST_EVENT,
+ RcTriggered = bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED,
UcodeLibOsPrint = bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT,
}
@@ -227,38 +227,41 @@ impl TryFrom<u32> for MsgFunction {
fn try_from(value: u32) -> Result<MsgFunction> {
match value {
- bindings::NV_VGPU_MSG_FUNCTION_NOP => Ok(MsgFunction::Nop),
- bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO => {
- Ok(MsgFunction::SetGuestSystemInfo)
- }
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
+ // Common function codes
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA => Ok(MsgFunction::AllocChannelDma),
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA => Ok(MsgFunction::AllocCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_DEVICE => Ok(MsgFunction::AllocDevice),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY => Ok(MsgFunction::AllocMemory),
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA => Ok(MsgFunction::AllocCtxDma),
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA => Ok(MsgFunction::AllocChannelDma),
- bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY => Ok(MsgFunction::MapMemory),
- bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT => Ok(MsgFunction::AllocObject),
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
+ bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_FREE => Ok(MsgFunction::Free),
- bindings::NV_VGPU_MSG_FUNCTION_LOG => Ok(MsgFunction::Log),
bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO => Ok(MsgFunction::GetGspStaticInfo),
- bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY => Ok(MsgFunction::SetRegistry),
- bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO => Ok(MsgFunction::GspSetSystemInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
bindings::NV_VGPU_MSG_FUNCTION_GSP_INIT_POST_OBJGPU => {
Ok(MsgFunction::GspInitPostObjGpu)
}
bindings::NV_VGPU_MSG_FUNCTION_GSP_RM_CONTROL => Ok(MsgFunction::GspRmControl),
- bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO => Ok(MsgFunction::GspSetSystemInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_LOG => Ok(MsgFunction::Log),
+ bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY => Ok(MsgFunction::MapMemory),
+ bindings::NV_VGPU_MSG_FUNCTION_NOP => Ok(MsgFunction::Nop),
+ bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO => {
+ Ok(MsgFunction::SetGuestSystemInfo)
+ }
+ bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY => Ok(MsgFunction::SetRegistry),
+
+ // Event codes
bindings::NV_VGPU_MSG_EVENT_GSP_INIT_DONE => Ok(MsgFunction::GspInitDone),
+ bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE => Ok(MsgFunction::GspLockdownNotice),
+ bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD => Ok(MsgFunction::GspPostNoCat),
bindings::NV_VGPU_MSG_EVENT_GSP_RUN_CPU_SEQUENCER => {
Ok(MsgFunction::GspRunCpuSequencer)
}
- bindings::NV_VGPU_MSG_EVENT_POST_EVENT => Ok(MsgFunction::PostEvent),
- bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED => Ok(MsgFunction::RcTriggered),
bindings::NV_VGPU_MSG_EVENT_MMU_FAULT_QUEUED => Ok(MsgFunction::MmuFaultQueued),
bindings::NV_VGPU_MSG_EVENT_OS_ERROR_LOG => Ok(MsgFunction::OsErrorLog),
- bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD => Ok(MsgFunction::GspPostNoCat),
- bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE => Ok(MsgFunction::GspLockdownNotice),
+ bindings::NV_VGPU_MSG_EVENT_POST_EVENT => Ok(MsgFunction::PostEvent),
+ bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED => Ok(MsgFunction::RcTriggered),
bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT => Ok(MsgFunction::UcodeLibOsPrint),
_ => Err(EINVAL),
}
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 11:39 ` Danilo Krummrich
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 3/9] rust: add EMSGSIZE error code Eliot Courtney
` (7 subsequent siblings)
9 siblings, 2 replies; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add a timeout to `allocate_command` which waits for space on the GSP
command queue. It uses a similar timeout to nouveau.
This lets `send_command` wait for space to free up in the command queue.
This is required to support continuation records which can fill up the
queue.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 42 ++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 87dbbd6d1be9..5cb845bae123 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -250,6 +250,19 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
}
}
+ /// Returns the size of the region of the CPU message queue that the driver is currently allowed
+ /// to write to, in bytes.
+ fn driver_write_area_size(&self) -> usize {
+ let tx = self.cpu_write_ptr();
+ let rx = self.gsp_read_ptr();
+
+ // `rx` and `tx` are both in `0..MSGQ_NUM_PAGES` per the invariants of `gsp_read_ptr` and
+ // `cpu_write_ptr`. The minimum value case is where `rx == 0` and `tx == MSGQ_NUM_PAGES -
+ // 1`, which gives `0 + MSGQ_NUM_PAGES - (MSGQ_NUM_PAGES - 1) - 1 == 0`.
+ let slots = (rx + MSGQ_NUM_PAGES - tx - 1) % MSGQ_NUM_PAGES;
+ num::u32_as_usize(slots) * GSP_PAGE_SIZE
+ }
+
/// Returns the region of the GSP message queue that the driver is currently allowed to read
/// from.
///
@@ -281,15 +294,22 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
}
/// Allocates a region on the command queue that is large enough to send a command of `size`
- /// bytes.
+ /// bytes, waiting for space to become available based on the provided timeout.
///
/// This returns a [`GspCommand`] ready to be written to by the caller.
///
/// # Errors
///
- /// - `EAGAIN` if the driver area is too small to hold the requested command.
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the command header is not properly aligned.
- fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
+ fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
+ read_poll_timeout(
+ || Ok(self.driver_write_area_size()),
+ |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
+ Delta::ZERO,
+ timeout,
+ )?;
+
// Get the current writable area as an array of bytes.
let (slice_1, slice_2) = {
let (slice_1, slice_2) = self.driver_write_area();
@@ -298,13 +318,6 @@ fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
(slice_1.as_flattened_mut(), slice_2.as_flattened_mut())
};
- // If the GSP is still processing previous messages the shared region
- // may be full in which case we will have to retry once the GSP has
- // processed the existing commands.
- if size_of::<GspMsgElement>() + size > slice_1.len() + slice_2.len() {
- return Err(EAGAIN);
- }
-
// Extract area for the `GspMsgElement`.
let (header, slice_1) = GspMsgElement::from_bytes_mut_prefix(slice_1).ok_or(EIO)?;
@@ -462,6 +475,9 @@ impl Cmdq {
/// Number of page table entries for the GSP shared region.
pub(crate) const NUM_PTES: usize = size_of::<GspMem>() >> GSP_PAGE_SHIFT;
+ /// Timeout for waiting for space on the command queue.
+ const ALLOCATE_TIMEOUT: Delta = Delta::from_secs(1);
+
/// Creates a new command queue for `dev`.
pub(crate) fn new(dev: &device::Device<device::Bound>) -> Result<Cmdq> {
let gsp_mem = DmaGspMem::new(dev)?;
@@ -497,7 +513,7 @@ fn notify_gsp(bar: &Bar0) {
///
/// # Errors
///
- /// - `EAGAIN` if there was not enough space in the command queue to send the command.
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the variable payload requested by the command has not been entirely
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
@@ -509,7 +525,9 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
Error: From<M::InitError>,
{
let command_size = size_of::<M::Command>() + command.variable_payload_len();
- let dst = self.gsp_mem.allocate_command(command_size)?;
+ let dst = self
+ .gsp_mem
+ .allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
// Extract area for the command itself.
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
@ 2026-03-04 11:39 ` Danilo Krummrich
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
1 sibling, 0 replies; 27+ messages in thread
From: Danilo Krummrich @ 2026-03-04 11:39 UTC (permalink / raw)
To: Eliot Courtney
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux
On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
> + read_poll_timeout(
> + || Ok(self.driver_write_area_size()),
> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
> + Delta::ZERO,
Isn't this either creating unneccessary thrashing of the memory controller or
unnecessary contention at the cache-coherency level?
I think we should probably add at least a small delay of something around 1us.
> + timeout,
> + )?;
^ permalink raw reply [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
2026-03-04 11:39 ` Danilo Krummrich
@ 2026-03-05 3:55 ` Claude Code Review Bot
1 sibling, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
This replaces the immediate `EAGAIN` return with a `read_poll_timeout` loop, which is the right approach for supporting continuation records that can fill the queue.
The `driver_write_area_size` method is a good addition that cleanly computes the available space without needing to materialize the slices. The comment explaining the minimum value is helpful.
One observation: the `read_poll_timeout` uses `Delta::ZERO` for the poll interval:
```rust
read_poll_timeout(
|| Ok(self.driver_write_area_size()),
|available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
Delta::ZERO,
timeout,
)?;
```
A zero poll interval means busy-waiting. Other callers in the same tree (e.g., `wait_for_msg` at cmdq.rs:572) use `Delta::from_millis(1)`. This could be intentional since command queue space should free up very quickly after GSP processes a command, but it might be worth adding a brief comment explaining the choice or using a small interval to avoid CPU spinning under load.
The timeout of 1 second (`ALLOCATE_TIMEOUT`) matches nouveau's approach, which is good.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 3/9] rust: add EMSGSIZE error code
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands Eliot Courtney
` (6 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add the EMSGSIZE error code, which indicates that a message is too
long.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
rust/kernel/error.rs | 1 +
1 file changed, 1 insertion(+)
diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 258b12afdcba..10fcf1f0404d 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -67,6 +67,7 @@ macro_rules! declare_err {
declare_err!(EDOM, "Math argument out of domain of func.");
declare_err!(ERANGE, "Math result not representable.");
declare_err!(EOVERFLOW, "Value too large for defined data type.");
+ declare_err!(EMSGSIZE, "Message too long.");
declare_err!(ETIMEDOUT, "Connection timed out.");
declare_err!(ERESTARTSYS, "Restart the system call.");
declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted.");
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (2 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 3/9] rust: add EMSGSIZE error code Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue Eliot Courtney
` (5 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
The limit is 16 pages for a single command sent to the GSP. Return an
error if `allocate_command` is called with a too large size.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 7 ++++++-
drivers/gpu/nova-core/gsp/fw.rs | 4 ++++
drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 1 +
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 5cb845bae123..73012da436e5 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -32,7 +32,8 @@
GspMsgElement,
MsgFunction,
MsgqRxHeader,
- MsgqTxHeader, //
+ MsgqTxHeader,
+ GSP_MSG_QUEUE_ELEMENT_SIZE_MAX, //
},
PteArray,
GSP_PAGE_SHIFT,
@@ -300,9 +301,13 @@ fn driver_write_area_size(&self) -> usize {
///
/// # Errors
///
+ /// - `EMSGSIZE` if the command is larger than [`GSP_MSG_QUEUE_ELEMENT_SIZE_MAX`].
/// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the command header is not properly aligned.
fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
+ if size_of::<GspMsgElement>() + size > GSP_MSG_QUEUE_ELEMENT_SIZE_MAX {
+ return Err(EMSGSIZE);
+ }
read_poll_timeout(
|| Ok(self.driver_write_area_size()),
|available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4b998485360b..6005362450cb 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -39,6 +39,10 @@
},
};
+/// Maximum size of a single GSP message queue element in bytes.
+pub(crate) const GSP_MSG_QUEUE_ELEMENT_SIZE_MAX: usize =
+ num::u32_as_usize(bindings::GSP_MSG_QUEUE_ELEMENT_SIZE_MAX);
+
/// Empty type to group methods related to heap parameters for running the GSP firmware.
enum GspFwHeapParams {}
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
index 6d25fe0bffa9..334e8be5fde8 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -43,6 +43,7 @@ fn fmt(&self, fmt: &mut ::core::fmt::Formatter<'_>) -> ::core::fmt::Result {
pub const GSP_FW_WPR_META_REVISION: u32 = 1;
pub const GSP_FW_WPR_META_MAGIC: i64 = -2577556379034558285;
pub const REGISTRY_TABLE_ENTRY_TYPE_DWORD: u32 = 1;
+pub const GSP_MSG_QUEUE_ELEMENT_SIZE_MAX: u32 = 65536;
pub type __u8 = ffi::c_uchar;
pub type __u16 = ffi::c_ushort;
pub type __u32 = ffi::c_uint;
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: add checking oversized commands
2026-03-04 1:42 ` [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
0 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Adds the `EMSGSIZE` check before attempting allocation. The binding for `GSP_MSG_QUEUE_ELEMENT_SIZE_MAX` (65536 = 16 * 4096) is consistent with the 16-page limit mentioned in the cover letter.
The check:
```rust
if size_of::<GspMsgElement>() + size > GSP_MSG_QUEUE_ELEMENT_SIZE_MAX {
return Err(EMSGSIZE);
}
```
This is correct - after patch 8 lands, this check becomes effectively dead code for the `allocate_command` path because `SplitState::new` will split commands before they reach `allocate_command` (renamed to `send_single_command`). However, keeping it as a defensive check in `allocate_command` is reasonable as it documents the invariant and catches programming errors.
Minor: the trailing `//` comment on the import line (line 1131 in the mbox) is a `rustfmt` artifact for forcing multi-line formatting. It's consistent with the codebase style.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (3 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling Eliot Courtney
` (4 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Clarify why using only the first returned slice from allocate_command
for the message headers is okay.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 73012da436e5..c91da368206d 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -534,7 +534,9 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
.gsp_mem
.allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
- // Extract area for the command itself.
+ // Extract area for the command itself. The GSP message header and the command header
+ // together are guaranteed to fit entirely into a single page, so it's ok to only look
+ // at `dst.contents.0` here.
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
// Fill the header and command in-place.
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: clarify invariant on command queue
2026-03-04 1:42 ` [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
0 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Adds a comment explaining why only `dst.contents.0` is used for extracting the command header. This is a documentation-only change.
The comment:
```rust
// Extract area for the command itself. The GSP message header and the command header
// together are guaranteed to fit entirely into a single page, so it's ok to only look
// at `dst.contents.0` here.
```
This relies on the fact that `GspMsgElement` + any `Command` struct together fit in `GSP_PAGE_SIZE` (4096 bytes). This is a reasonable assumption for the fixed-size command headers currently in use, but worth noting this would break if a `Command` struct ever exceeded ~4KB. The invariant is implicitly enforced by the fact that the queue element starts page-aligned and `GspMsgElement::from_bytes_mut_prefix` splits at the element boundary.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (4 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
` (3 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Unconditionally call the variable length payload code, which is a no-op
if there is no such payload but could defensively catch some coding
errors by e.g. checking that the allocated size is completely filled.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index c91da368206d..48cf28b41f39 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -548,16 +548,14 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
command.init().__init(core::ptr::from_mut(cmd))?;
}
- // Fill the variable-length payload.
- if command_size > size_of::<M::Command>() {
- let mut sbuffer =
- SBufferIter::new_writer([&mut payload_1[..], &mut dst.contents.1[..]]);
- command.init_variable_payload(&mut sbuffer)?;
+ // Fill the variable-length payload, which may be empty.
+ let mut sbuffer = SBufferIter::new_writer([&mut payload_1[..], &mut dst.contents.1[..]]);
+ command.init_variable_payload(&mut sbuffer)?;
- if !sbuffer.is_empty() {
- return Err(EIO);
- }
+ if !sbuffer.is_empty() {
+ return Err(EIO);
}
+ drop(sbuffer);
// Compute checksum now that the whole message is ready.
dst.header
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: unconditionally call variable payload handling
2026-03-04 1:42 ` [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
0 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Removes the `if command_size > size_of::<M::Command>()` guard around the variable payload handling. This is a good defensive change - calling `init_variable_payload` on zero-length payloads is a no-op by default, and the `sbuffer.is_empty()` check will catch any mismatch between `variable_payload_len()` and actual bytes written.
The explicit `drop(sbuffer)` is needed because `sbuffer` borrows `payload_1` and `dst.contents.1` mutably, and the subsequent checksum computation needs to read them. Good.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (5 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 2:53 ` Alexandre Courbot
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record Eliot Courtney
` (2 subsequent siblings)
9 siblings, 2 replies; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add a default method to `CommandToGsp` which computes the size of a
command.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 48cf28b41f39..3424be4e15f8 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -94,6 +94,12 @@ fn init_variable_payload(
) -> Result {
Ok(())
}
+
+ /// Total size of the command (including its variable-length payload) without the
+ /// [`GspMsgElement`] header.
+ fn size_in_bytes(&self) -> usize {
+ size_of::<Self::Command>() + self.variable_payload_len()
+ }
}
/// Trait representing messages received from the GSP.
@@ -529,10 +535,10 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
Error: From<M::InitError>,
{
- let command_size = size_of::<M::Command>() + command.variable_payload_len();
+ let size_in_bytes = command.size_in_bytes();
let dst = self
.gsp_mem
- .allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
+ .allocate_command(size_in_bytes, Self::ALLOCATE_TIMEOUT)?;
// Extract area for the command itself. The GSP message header and the command header
// together are guaranteed to fit entirely into a single page, so it's ok to only look
@@ -540,7 +546,7 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
// Fill the header and command in-place.
- let msg_element = GspMsgElement::init(self.seq, command_size, M::FUNCTION);
+ let msg_element = GspMsgElement::init(self.seq, size_in_bytes, M::FUNCTION);
// SAFETY: `msg_header` and `cmd` are valid references, and not touched if the initializer
// fails.
unsafe {
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
@ 2026-03-04 2:53 ` Alexandre Courbot
2026-03-04 3:10 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
1 sibling, 1 reply; 27+ messages in thread
From: Alexandre Courbot @ 2026-03-04 2:53 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, nouveau, dri-devel, linux-kernel,
rust-for-linux
On Wed Mar 4, 2026 at 10:42 AM JST, Eliot Courtney wrote:
> Add a default method to `CommandToGsp` which computes the size of a
> command.
>
> Tested-by: Zhi Wang <zhiw@nvidia.com>
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 48cf28b41f39..3424be4e15f8 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -94,6 +94,12 @@ fn init_variable_payload(
> ) -> Result {
> Ok(())
> }
> +
> + /// Total size of the command (including its variable-length payload) without the
> + /// [`GspMsgElement`] header.
> + fn size_in_bytes(&self) -> usize {
We usually don't specify the unit in the method name (the doccomment is
a better place for that). I can fix this when applying.
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 2:53 ` Alexandre Courbot
@ 2026-03-04 3:10 ` Eliot Courtney
0 siblings, 0 replies; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 3:10 UTC (permalink / raw)
To: Alexandre Courbot, Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, nouveau, dri-devel, linux-kernel,
rust-for-linux, dri-devel
On Wed Mar 4, 2026 at 11:53 AM JST, Alexandre Courbot wrote:
> On Wed Mar 4, 2026 at 10:42 AM JST, Eliot Courtney wrote:
>> Add a default method to `CommandToGsp` which computes the size of a
>> command.
>>
>> Tested-by: Zhi Wang <zhiw@nvidia.com>
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>> drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 48cf28b41f39..3424be4e15f8 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -94,6 +94,12 @@ fn init_variable_payload(
>> ) -> Result {
>> Ok(())
>> }
>> +
>> + /// Total size of the command (including its variable-length payload) without the
>> + /// [`GspMsgElement`] header.
>> + fn size_in_bytes(&self) -> usize {
>
> We usually don't specify the unit in the method name (the doccomment is
> a better place for that). I can fix this when applying.
Thanks, I will make a note of this so I don't make this mistake again. I
wanted to clarify this since size for a command could mean a few things.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Claude review: gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
2026-03-04 2:53 ` Alexandre Courbot
@ 2026-03-05 3:55 ` Claude Code Review Bot
1 sibling, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Adds a default method computing total command size. Clean refactor. The method is later used by `SplitState::new` in patch 8, making it a necessary abstraction.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (6 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-04 1:42 ` [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records Eliot Courtney
2026-03-05 3:55 ` Claude review: gpu: nova-core: gsp: add continuation record support Claude Code Review Bot
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Splits large RPCs if necessary and sends the remaining parts using
continuation records. RPCs that do not need continuation records
continue to write directly into the command buffer. Ones that do write
into a staging buffer first, so there is one copy.
Continuation record for receive is not necessary to support at the
moment because those replies do not need to be read and are currently
drained by retrying `receive_msg` on `ERANGE`.
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 42 ++++++-
drivers/gpu/nova-core/gsp/cmdq/continuation.rs | 163 +++++++++++++++++++++++++
drivers/gpu/nova-core/gsp/fw.rs | 4 +
3 files changed, 207 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 3424be4e15f8..492e9489e808 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
+mod continuation;
+
use core::{
mem,
sync::atomic::{
@@ -25,6 +27,11 @@
},
};
+use continuation::{
+ ContinuationRecord,
+ SplitState, //
+};
+
use crate::{
driver::Bar0,
gsp::{
@@ -520,7 +527,7 @@ fn notify_gsp(bar: &Bar0) {
.write(bar);
}
- /// Sends `command` to the GSP.
+ /// Sends `command` to the GSP, without splitting it.
///
/// # Errors
///
@@ -529,7 +536,7 @@ fn notify_gsp(bar: &Bar0) {
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
/// Error codes returned by the command initializers are propagated as-is.
- pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
+ fn send_single_command<M>(&mut self, bar: &Bar0, command: M) -> Result
where
M: CommandToGsp,
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
@@ -588,6 +595,37 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
Ok(())
}
+ /// Sends `command` to the GSP.
+ ///
+ /// The command may be split into multiple messages if it is large.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command initializers are propagated as-is.
+ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
+ where
+ M: CommandToGsp,
+ Error: From<M::InitError>,
+ {
+ match SplitState::new(command)? {
+ SplitState::Single(command) => self.send_single_command(bar, command),
+ SplitState::Split(command, mut continuations) => {
+ self.send_single_command(bar, command)?;
+
+ while let Some(continuation) = continuations.next() {
+ // Turbofish needed because the compiler cannot infer M here.
+ self.send_single_command::<ContinuationRecord<'_>>(bar, continuation)?;
+ }
+
+ Ok(())
+ }
+ }
+ }
+
/// Wait for a message to become available on the message queue.
///
/// This works purely at the transport layer and does not interpret or validate the message
diff --git a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
new file mode 100644
index 000000000000..3b83cdcbb39e
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Support for splitting large GSP commands across continuation records.
+
+use core::convert::Infallible;
+
+use kernel::prelude::*;
+
+use super::CommandToGsp;
+
+use crate::{
+ gsp::fw::{
+ GspMsgElement,
+ MsgFunction,
+ GSP_MSG_QUEUE_ELEMENT_SIZE_MAX, //
+ },
+ sbuffer::SBufferIter,
+};
+
+/// Maximum command size that fits in a single queue element.
+const MAX_CMD_SIZE: usize = GSP_MSG_QUEUE_ELEMENT_SIZE_MAX - size_of::<GspMsgElement>();
+
+/// Acts as an iterator over the continuation records for a split command.
+pub(super) struct ContinuationRecords {
+ payload: KVVec<u8>,
+ offset: usize,
+}
+
+impl ContinuationRecords {
+ /// Creates a new iterator over continuation records for the given payload.
+ fn new(payload: KVVec<u8>) -> Self {
+ Self { payload, offset: 0 }
+ }
+
+ /// Returns the next continuation record, or [`None`] if there are no more.
+ pub(super) fn next(&mut self) -> Option<ContinuationRecord<'_>> {
+ let remaining = self.payload.len() - self.offset;
+
+ if remaining > 0 {
+ let chunk_size = remaining.min(MAX_CMD_SIZE);
+ let record =
+ ContinuationRecord::new(&self.payload[self.offset..(self.offset + chunk_size)]);
+ self.offset += chunk_size;
+ Some(record)
+ } else {
+ None
+ }
+ }
+}
+
+/// The [`ContinuationRecord`] command.
+pub(super) struct ContinuationRecord<'a> {
+ data: &'a [u8],
+}
+
+impl<'a> ContinuationRecord<'a> {
+ /// Creates a new [`ContinuationRecord`] command with the given data.
+ fn new(data: &'a [u8]) -> Self {
+ Self { data }
+ }
+}
+
+impl<'a> CommandToGsp for ContinuationRecord<'a> {
+ const FUNCTION: MsgFunction = MsgFunction::ContinuationRecord;
+ type Command = ();
+ type InitError = Infallible;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ <()>::init_zeroed()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.data.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(self.data)
+ }
+}
+
+/// Whether a command needs to be split across continuation records or not.
+pub(super) enum SplitState<C: CommandToGsp> {
+ /// A command that fits in a single queue element.
+ Single(C),
+ /// A command split across continuation records.
+ Split(SplitCommand<C>, ContinuationRecords),
+}
+
+impl<C: CommandToGsp> SplitState<C> {
+ /// Maximum variable payload size that fits in the first command alongside the command header.
+ const MAX_FIRST_PAYLOAD: usize = MAX_CMD_SIZE - size_of::<C::Command>();
+
+ /// Creates a new [`SplitState`] for the given command.
+ ///
+ /// If the command is too large, it will be split into a main command and some number of
+ /// continuation records.
+ pub(super) fn new(command: C) -> Result<Self> {
+ let payload_len = command.variable_payload_len();
+
+ if command.size_in_bytes() > MAX_CMD_SIZE {
+ let mut command_payload =
+ KVVec::<u8>::from_elem(0u8, payload_len.min(Self::MAX_FIRST_PAYLOAD), GFP_KERNEL)?;
+ let mut continuation_payload =
+ KVVec::<u8>::from_elem(0u8, payload_len - command_payload.len(), GFP_KERNEL)?;
+ let mut sbuffer = SBufferIter::new_writer([
+ command_payload.as_mut_slice(),
+ continuation_payload.as_mut_slice(),
+ ]);
+
+ command.init_variable_payload(&mut sbuffer)?;
+ if !sbuffer.is_empty() {
+ return Err(EIO);
+ }
+ drop(sbuffer);
+
+ Ok(Self::Split(
+ SplitCommand::new(command, command_payload),
+ ContinuationRecords::new(continuation_payload),
+ ))
+ } else {
+ Ok(Self::Single(command))
+ }
+ }
+}
+
+/// A command that has been truncated to maximum accepted length of the command queue.
+///
+/// The remainder of its payload is expected to be sent using [`ContinuationRecords`].
+pub(super) struct SplitCommand<C: CommandToGsp> {
+ command: C,
+ payload: KVVec<u8>,
+}
+
+impl<C: CommandToGsp> SplitCommand<C> {
+ /// Creates a new [`SplitCommand`] wrapping `command` with the given truncated payload.
+ fn new(command: C, payload: KVVec<u8>) -> Self {
+ Self { command, payload }
+ }
+}
+
+impl<C: CommandToGsp> CommandToGsp for SplitCommand<C> {
+ const FUNCTION: MsgFunction = C::FUNCTION;
+ type Command = C::Command;
+ type InitError = C::InitError;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ self.command.init()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.payload.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(&self.payload)
+ }
+}
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 6005362450cb..25fca1f6db2c 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -202,6 +202,7 @@ pub(crate) enum MsgFunction {
AllocObject = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT,
AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
+ ContinuationRecord = bindings::NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD,
Free = bindings::NV_VGPU_MSG_FUNCTION_FREE,
GetGspStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO,
GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
@@ -239,6 +240,9 @@ fn try_from(value: u32) -> Result<MsgFunction> {
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT => Ok(MsgFunction::AllocObject),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
+ bindings::NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD => {
+ Ok(MsgFunction::ContinuationRecord)
+ }
bindings::NV_VGPU_MSG_FUNCTION_FREE => Ok(MsgFunction::Free),
bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO => Ok(MsgFunction::GetGspStaticInfo),
bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: support large RPCs via continuation record
2026-03-04 1:42 ` [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
0 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
This is the main patch. The design is solid:
1. `SplitState::new()` checks if splitting is needed and stages the payload into kernel-allocated buffers
2. `SplitCommand` wraps the original command with truncated payload for the first message
3. `ContinuationRecords` iterates over remaining payload chunks
A few observations:
**`ContinuationRecords` doesn't implement `Iterator`**: The `next()` method signature matches `Iterator` but it doesn't implement the trait. This is likely intentional since `Iterator` would require `&mut self` lifetime complications with the returned `ContinuationRecord<'_>` borrowing from `self`. The usage in `send_command` with `while let Some(...)` is fine. Consider a brief comment explaining this design choice.
**Missing `Tested-by` tag**: This is the only non-trivial patch without a `Tested-by: Zhi Wang` tag (patches 1-7 and 9 all have it). This may be intentional since the patch changed substantively in v4/v5.
**The `SplitState::MAX_FIRST_PAYLOAD` associated constant**:
```rust
const MAX_FIRST_PAYLOAD: usize = MAX_CMD_SIZE - size_of::<C::Command>();
```
This is defined as an associated constant on the generic `SplitState<C>`, which is elegant since it depends on `C::Command`. It's also exposed to tests in patch 9 via `SplitState::<TestPayload>::MAX_FIRST_PAYLOAD`, which is a nice pattern.
**Allocation strategy**: The staging buffer uses `KVVec::from_elem(0u8, ...)` which zero-initializes. The data is immediately overwritten by `init_variable_payload`, so `KVVec::with_capacity` + `resize` would save the zero-fill, but this is a minor optimization concern for a path that's only taken for large (>64KB) RPCs.
**`ContinuationRecord` uses `Command = ()`**: This means the GspMsgElement header is written but the command body is entirely variable-length payload. This matches the continuation record protocol where only the raw data matters.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (7 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-05 3:55 ` Claude review: " Claude Code Review Bot
2026-03-05 3:55 ` Claude review: gpu: nova-core: gsp: add continuation record support Claude Code Review Bot
9 siblings, 1 reply; 27+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add tests for continuation record splitting. They cover boundary
conditions at the split points to make sure the right number of
continuation records are made. They also check that the data
concatenated is correct.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq/continuation.rs | 138 +++++++++++++++++++++++++
1 file changed, 138 insertions(+)
diff --git a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
index 3b83cdcbb39e..637942917237 100644
--- a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
@@ -161,3 +161,141 @@ fn init_variable_payload(
dst.write_all(&self.payload)
}
}
+
+#[kunit_tests(nova_core_gsp_continuation)]
+mod tests {
+ use super::*;
+
+ use kernel::transmute::{
+ AsBytes,
+ FromBytes, //
+ };
+
+ /// Non-zero-sized command header for testing.
+ #[repr(C)]
+ #[derive(Clone, Copy, Zeroable)]
+ struct TestHeader([u8; 64]);
+
+ // SAFETY: `TestHeader` is a plain array of bytes for which all bit patterns are valid.
+ unsafe impl FromBytes for TestHeader {}
+
+ // SAFETY: `TestHeader` is a plain array of bytes for which all bit patterns are valid.
+ unsafe impl AsBytes for TestHeader {}
+
+ struct TestPayload {
+ data: KVVec<u8>,
+ }
+
+ impl TestPayload {
+ fn generate_pattern(len: usize) -> Result<KVVec<u8>> {
+ let mut data = KVVec::with_capacity(len, GFP_KERNEL)?;
+ for i in 0..len {
+ // Mix in higher bits so the pattern does not repeat every 256 bytes.
+ data.push((i ^ (i >> 8)) as u8, GFP_KERNEL)?;
+ }
+ Ok(data)
+ }
+
+ fn new(len: usize) -> Result<Self> {
+ Ok(Self {
+ data: Self::generate_pattern(len)?,
+ })
+ }
+ }
+
+ impl CommandToGsp for TestPayload {
+ const FUNCTION: MsgFunction = MsgFunction::Nop;
+ type Command = TestHeader;
+ type InitError = Infallible;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ TestHeader::init_zeroed()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.data.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(self.data.as_slice())
+ }
+ }
+
+ /// Maximum variable payload size that fits in the first command alongside the header.
+ const MAX_FIRST_PAYLOAD: usize = SplitState::<TestPayload>::MAX_FIRST_PAYLOAD;
+
+ fn read_payload(cmd: impl CommandToGsp) -> Result<KVVec<u8>> {
+ let len = cmd.variable_payload_len();
+ let mut buf = KVVec::from_elem(0u8, len, GFP_KERNEL)?;
+ let mut sbuf = SBufferIter::new_writer([buf.as_mut_slice(), &mut []]);
+ cmd.init_variable_payload(&mut sbuf)?;
+ drop(sbuf);
+ Ok(buf)
+ }
+
+ struct SplitTest {
+ payload_size: usize,
+ num_continuations: usize,
+ }
+
+ fn check_split(t: SplitTest) -> Result {
+ let payload = TestPayload::new(t.payload_size)?;
+ let mut num_continuations = 0;
+
+ let buf = match SplitState::new(payload)? {
+ SplitState::Single(cmd) => read_payload(cmd)?,
+ SplitState::Split(cmd, mut continuations) => {
+ let mut buf = read_payload(cmd)?;
+ assert!(size_of::<TestHeader>() + buf.len() <= MAX_CMD_SIZE);
+
+ while let Some(cont) = continuations.next() {
+ let payload = read_payload(cont)?;
+ assert!(payload.len() <= MAX_CMD_SIZE);
+ buf.extend_from_slice(&payload, GFP_KERNEL)?;
+ num_continuations += 1;
+ }
+
+ buf
+ }
+ };
+
+ assert_eq!(num_continuations, t.num_continuations);
+ assert_eq!(
+ buf.as_slice(),
+ TestPayload::generate_pattern(t.payload_size)?.as_slice()
+ );
+ Ok(())
+ }
+
+ #[test]
+ fn split_command() -> Result {
+ check_split(SplitTest {
+ payload_size: 0,
+ num_continuations: 0,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD,
+ num_continuations: 0,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + 1,
+ num_continuations: 1,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE,
+ num_continuations: 1,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE + 1,
+ num_continuations: 2,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE * 3 + MAX_CMD_SIZE / 2,
+ num_continuations: 4,
+ })?;
+ Ok(())
+ }
+}
--
2.53.0
^ permalink raw reply related [flat|nested] 27+ messages in thread* Claude review: gpu: nova-core: gsp: add tests for continuation records
2026-03-04 1:42 ` [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
0 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
Good test coverage with boundary conditions:
- 0 bytes payload (no split)
- Exactly `MAX_FIRST_PAYLOAD` (no split, boundary)
- `MAX_FIRST_PAYLOAD + 1` (first split)
- `MAX_FIRST_PAYLOAD + MAX_CMD_SIZE` (single continuation, boundary)
- `MAX_FIRST_PAYLOAD + MAX_CMD_SIZE + 1` (two continuations)
- `MAX_FIRST_PAYLOAD + 3.5 * MAX_CMD_SIZE` (multiple continuations)
The test pattern generation with `(i ^ (i >> 8)) as u8` ensures non-repeating patterns across 256-byte boundaries, which is a good approach for catching off-by-one copy errors.
The `read_payload` helper correctly verifies payload content by reading back through the `CommandToGsp` interface.
One minor style note: the `// SAFETY` comments on `TestHeader`'s `FromBytes`/`AsBytes` impls are straightforward since it's `[u8; 64]`. No issues.
Overall this is a well-crafted series ready for merging, with the only substantive note being the zero poll interval choice in patch 2 which may deserve a comment or small delay.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread
* Claude review: gpu: nova-core: gsp: add continuation record support
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (8 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records Eliot Courtney
@ 2026-03-05 3:55 ` Claude Code Review Bot
9 siblings, 0 replies; 27+ messages in thread
From: Claude Code Review Bot @ 2026-03-05 3:55 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: gpu: nova-core: gsp: add continuation record support
Author: Eliot Courtney <ecourtney@nvidia.com>
Patches: 13
Reviewed: 2026-03-05T13:55:43.016540
---
This is a well-structured 9-patch series for the nova-core (Rust NVIDIA GPU driver) that adds continuation record support to the GSP command queue. This allows large RPCs (>16 pages / 64KB) to be split across multiple command queue entries. The series follows a clean incremental approach: preparatory cleanups, infrastructure additions, and then the core feature with tests.
The code quality is generally high, with good documentation, proper error handling, and thorough test coverage. The design with `SplitState` / `SplitCommand` / `ContinuationRecord` is clean and maintains backward compatibility for small commands. I have a few observations but no blocking issues.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 27+ messages in thread