public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Alex Deucher <alexdeucher@gmail.com>
To: Christian König <christian.koenig@amd.com>
Cc: arjan@linux.intel.com, amd-gfx@lists.freedesktop.org,
	Alex Deucher <alexander.deucher@amd.com>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] drm/amdgpu: fix zero-size GDS range init on RDNA4
Date: Tue, 21 Apr 2026 09:42:22 -0400
Message-ID: <CADnq5_McEDYBcU8B+T4MeRKoST10EhA=LWXju1y2BL48kJPBNA@mail.gmail.com>
In-Reply-To: <34718f21-712a-4161-98e0-079dd9390ae6@amd.com>

On Tue, Apr 21, 2026 at 2:59 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 4/20/26 23:57, arjan@linux.intel.com wrote:
> >
> > RDNA4 (GFX 12) hardware removes the GDS, GWS, and OA on-chip memory
> > resources. The gfx_v12_0 initialisation code correctly leaves
> > adev->gds.gds_size, adev->gds.gws_size, and adev->gds.oa_size at
> > zero to reflect this.
> >
> > amdgpu_ttm_init() unconditionally calls amdgpu_ttm_init_on_chip() for
> > each of these resources regardless of size. When the size is zero,
> > amdgpu_ttm_init_on_chip() forwards the call to ttm_range_man_init(),
> > which calls drm_mm_init(mm, 0, 0). drm_mm_init() immediately fires
> > DRM_MM_BUG_ON(start + size <= start) -- trivially true when size is
> > zero -- crashing the kernel during modprobe of amdgpu on an RX 9070 XT.
>
> Hmm, in general not a bad idea, but we have tons of GFX 12 systems
> among our test machines and nothing is crashing there.
>
> We are clearly missing something here. Is that on an upstream kernel
> or something backported?

Looks like that check only fires if CONFIG_DRM_DEBUG_MM is set in
the user's kernel config.  I guess no one uses that option.  These
chips have been on the market for over a year and no one has reported
this until now.  Applied with a note about this in the commit message.

Thanks!

Alex

>
> Regards,
> Christian.
>
> >
> > Guard against this by returning 0 early from
> > amdgpu_ttm_init_on_chip() when size_in_page is zero. This skips TTM
> > resource manager registration for hardware resources that are absent,
> > without affecting any other GPU type.
> >
> > Link: https://lore.kernel.org/all/bug-221376-2300@https.bugzilla.kernel.org%2F/
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=221376
> > Oops-Analysis: http://oops.fenrus.org/reports/bugzilla.korg/221376/report.html
> > Assisted-by: GitHub Copilot:Claude Sonnet 4.6 linux-kernel-oops-x86.
> > Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: linux-kernel@vger.kernel.org
> >
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |    3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index afaaab6496def..8075ac735321e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -75,6 +75,9 @@ static int amdgpu_ttm_init_on_chip(struct amdgpu_device *adev,
> >                                     unsigned int type,
> >                                     uint64_t size_in_page)
> >  {
> > +       if (!size_in_page)
> > +               return 0;
> > +
> >         return ttm_range_man_init(&adev->mman.bdev, type,
> >                                   false, size_in_page);
> >  }
>
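
The interaction discussed above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the kernel
code: every sketch_* name and the SKETCH_DEBUG_MM switch are
hypothetical stand-ins (for ttm_range_man_init(), the DRM_MM_BUG_ON()
condition, and CONFIG_DRM_DEBUG_MM respectively), and the stand-in
returns -1 at the point where a debug kernel would crash.

```c
#include <stdint.h>

/* Hypothetical switch modelling CONFIG_DRM_DEBUG_MM; build with
 * -DSKETCH_DEBUG_MM=0 to model a kernel without the debug check. */
#ifndef SKETCH_DEBUG_MM
#define SKETCH_DEBUG_MM 1
#endif

/* Counts how many range managers the stand-in has registered. */
static int sketch_registered;

/* The condition quoted in the patch description: start + size <= start.
 * With size == 0 this is true for any start. */
static int sketch_range_is_bad(uint64_t start, uint64_t size)
{
	return start + size <= start;
}

/* Hypothetical stand-in for ttm_range_man_init(). Returns -1 where a
 * debug-enabled kernel would hit the assertion; otherwise registers. */
static int sketch_range_man_init(uint64_t size_in_page)
{
	if (SKETCH_DEBUG_MM && sketch_range_is_bad(0, size_in_page))
		return -1;
	sketch_registered++;
	return 0;
}

/* Behaviour before the patch: every size is forwarded. */
static int sketch_init_unguarded(uint64_t size_in_page)
{
	return sketch_range_man_init(size_in_page);
}

/* Behaviour after the patch: absent resources are skipped early. */
static int sketch_init_guarded(uint64_t size_in_page)
{
	if (!size_in_page)
		return 0;
	return sketch_range_man_init(size_in_page);
}
```

With the switch enabled, sketch_init_unguarded(0) reports the failure,
while sketch_init_guarded(0) returns 0 without registering anything.
With the switch disabled the unguarded path also returns 0, which
would be consistent with the crash not showing up on kernels built
without the debug option.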

Thread overview: 7+ messages
2026-04-17  9:23 [Bug 221376] New: AMD RADEON RX 9070 XT - modprobe amdgpu is fail bugzilla-daemon
2026-04-17 16:08 ` [Bug 221376] " bugzilla-daemon
2026-04-20 21:57 ` [PATCH] drm/amdgpu: fix zero-size GDS range init on RDNA4 arjan
2026-04-21  6:42   ` Christian König
2026-04-21 11:54     ` Arjan van de Ven
2026-04-21 13:42     ` Alex Deucher [this message]
2026-04-22 23:16   ` Claude review: " Claude Code Review Bot