From: "Gary Guo" <gary@garyguo.net>
To: "Alexandre Courbot" <acourbot@nvidia.com>,
"Tim Kovalenko via B4 Relay"
<devnull+tim.kovalenko.proton.me@kernel.org>
Cc: <tim.kovalenko@proton.me>, "Danilo Krummrich" <dakr@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Miguel Ojeda" <ojeda@kernel.org>, "Gary Guo" <gary@garyguo.net>,
Björn Roy Baron <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>,
"Trevor Gross" <tmgross@umich.edu>,
"Boqun Feng" <boqun@kernel.org>,
"Nathan Chancellor" <nathan@kernel.org>,
"Nicolas Schier" <nsc@kernel.org>,
"Abdiel Janulgue" <abdiel.janulgue@gmail.com>,
"Daniel Almeida" <daniel.almeida@collabora.com>,
"Robin Murphy" <robin.murphy@arm.com>,
<nouveau@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
<rust-for-linux@vger.kernel.org>, <linux-kbuild@vger.kernel.org>,
<driver-core@lists.linux.dev>
Subject: Re: [PATCH v4 4/4] gpu: nova-core: fix stack overflow in GSP memory allocation
Date: Tue, 10 Mar 2026 01:51:36 +0000 [thread overview]
Message-ID: <DGYQ5EYS1LB0.TP93SPR5Q3BX@garyguo.net> (raw)
In-Reply-To: <DGYPX7TT8A4E.3KTO5Z5RS17B4@nvidia.com>
On Tue Mar 10, 2026 at 1:40 AM GMT, Alexandre Courbot wrote:
> On Tue Mar 10, 2026 at 1:34 AM JST, Tim Kovalenko via B4 Relay wrote:
>> From: Tim Kovalenko <tim.kovalenko@proton.me>
>>
>> The `Cmdq::new` function was allocating a `PteArray` struct on the stack
>> and was causing a stack overflow with 8216 bytes.
>>
>> Modify the `PteArray` to calculate and write the Page Table Entries
>> directly into the coherent DMA buffer one-by-one. This reduces the stack
>> usage quite a lot.
>>
>> Signed-off-by: Tim Kovalenko <tim.kovalenko@proton.me>
>> ---
>> drivers/gpu/nova-core/gsp.rs | 34 +++++++++++++++++++---------------
>> drivers/gpu/nova-core/gsp/cmdq.rs | 15 ++++++++++++++-
>> 2 files changed, 33 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
>> index 25cd48514c777cb405a2af0acf57196b2e2e7837..20170e483e04c476efce8997b3916b0ad829ed38 100644
>> --- a/drivers/gpu/nova-core/gsp.rs
>> +++ b/drivers/gpu/nova-core/gsp.rs
>> @@ -47,16 +47,11 @@
>> unsafe impl<const NUM_ENTRIES: usize> AsBytes for PteArray<NUM_ENTRIES> {}
>>
>> impl<const NUM_PAGES: usize> PteArray<NUM_PAGES> {
>> - /// Creates a new page table array mapping `NUM_PAGES` GSP pages starting at address `start`.
>> - fn new(start: DmaAddress) -> Result<Self> {
>> - let mut ptes = [0u64; NUM_PAGES];
>> - for (i, pte) in ptes.iter_mut().enumerate() {
>> - *pte = start
>> - .checked_add(num::usize_as_u64(i) << GSP_PAGE_SHIFT)
>> - .ok_or(EOVERFLOW)?;
>> - }
>> -
>> - Ok(Self(ptes))
>> + /// Returns the page table entry for `index`, for a mapping starting at `start` DmaAddress.
>> + fn entry(start: DmaAddress, index: usize) -> Result<u64> {
>> + start
>> + .checked_add(num::usize_as_u64(index) << GSP_PAGE_SHIFT)
>> + .ok_or(EOVERFLOW)
>> }
>> }
>>
>> @@ -86,16 +81,25 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>> NUM_PAGES * GSP_PAGE_SIZE,
>> GFP_KERNEL | __GFP_ZERO,
>> )?);
>> - let ptes = PteArray::<NUM_PAGES>::new(obj.0.dma_handle())?;
>> +
>> + let start_addr = obj.0.dma_handle();
>>
>> // SAFETY: `obj` has just been created and we are its sole user.
>> - unsafe {
>> - // Copy the self-mapping PTE at the expected location.
>> + let pte_region = unsafe {
>> obj.0
>> - .as_slice_mut(size_of::<u64>(), size_of_val(&ptes))?
>> - .copy_from_slice(ptes.as_bytes())
>> + .as_slice_mut(size_of::<u64>(), NUM_PAGES * size_of::<u64>())?
>> };
>>
>> + // This is a one by one GSP Page write to the memory
>> + // to avoid stack overflow when allocating the whole array at once.
>> + for (i, chunk) in pte_region.chunks_exact_mut(size_of::<u64>()).enumerate() {
>> + let pte_value = start_addr
>> + .checked_add(num::usize_as_u64(i) << GSP_PAGE_SHIFT)
>> + .ok_or(EOVERFLOW)?;
>> +
>> + chunk.copy_from_slice(&pte_value.to_ne_bytes());
>> + }
>> +
>> Ok(obj)
>> }
>> }
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 0056bfbf0a44cfbc5a0ca08d069f881b877e1edc..c8327d3098f73f9b880eee99038ad10a16e1e32d 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -202,7 +202,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>>
>> let gsp_mem =
>> CoherentAllocation::<GspMem>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
>> - dma_write!(gsp_mem, [0]?.ptes, PteArray::new(gsp_mem.dma_handle())?);
>> +
>> + const NUM_PTES: usize = GSP_PAGE_SIZE / size_of::<u64>();
>> +
>> + let start = gsp_mem.dma_handle();
>> + // One by one GSP Page write to the memory to avoid stack overflow when allocating
>> + // the whole array at once.
>> + for i in 0..NUM_PTES {
>> + dma_write!(
>> + gsp_mem,
>> + [0]?.ptes.0[i],
>> + PteArray::<NUM_PTES>::entry(start, i)?
>
> Does `::<NUM_PTES>` need to be mentioned here, or is the compiler able
> to infer it?
The function signature doesn't mention NUM_PTES at all, so no, it cannot be
inferred. In fact, perhaps `entry` shouldn't be an associated method at all
(and even if it is, it should probably be called via `PteArray::<0>` or
something).
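To illustrate the point, here is a minimal standalone sketch (hypothetical
names, and a placeholder 12-bit page shift standing in for GSP_PAGE_SHIFT):
because the const parameter never appears in `entry`'s signature, the
compiler cannot infer it, any value typed in the turbofish is accepted, and
a free function expresses the same thing without the meaningless parameter:

```rust
// Const-generic type whose associated fn never uses the parameter.
struct PteArray<const NUM_PAGES: usize>([u64; NUM_PAGES]);

impl<const NUM_PAGES: usize> PteArray<NUM_PAGES> {
    // NUM_PAGES does not occur in this signature, so call sites must
    // spell out an arbitrary value: `PteArray::entry(..)` is ambiguous,
    // while `PteArray::<1>::entry(..)` and `PteArray::<999>::entry(..)`
    // compile to identical code.
    fn entry(start: u64, index: usize) -> Option<u64> {
        start.checked_add((index as u64) << 12)
    }
}

// A free function avoids the dangling type parameter entirely.
fn pte_entry(start: u64, index: usize) -> Option<u64> {
    start.checked_add((index as u64) << 12)
}

fn main() {
    // The turbofish argument is arbitrary; both forms compute the same PTE.
    assert_eq!(PteArray::<1>::entry(0x1000, 2), Some(0x3000));
    assert_eq!(pte_entry(0x1000, 2), Some(0x3000));
}
```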
Best,
Gary
>
> In any case, the updated patch
>
> Acked-by: Alexandre Courbot <acourbot@nvidia.com>
>
> Thanks!