public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Calvin Owens <calvin@wbinvd.org>
To: Nathan Chancellor <nathan@kernel.org>
Cc: linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org,
	Charlene Liu <charlene.liu@amd.com>,
	Ovidiu Bunea <ovidiu.bunea@amd.com>,
	Alex Hung <alex.hung@amd.com>,
	Dan Wheeler <daniel.wheeler@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Harry Wentland <harry.wentland@amd.com>,
	Leo Li <sunpeng.li@amd.com>,
	Rodrigo Siqueira <siqueira@igalia.com>,
	Christian Koenig <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	llvm@lists.linux.dev
Subject: Re: [REGRESSION][PATCH] drm/amd/display: Fix uninitialized variable which breaks full LTO
Date: Thu, 12 Mar 2026 09:15:30 -0700
Message-ID: <abLmonL-8QubcCpA@mozart.vkv.me>
In-Reply-To: <20260312080245.GA3988095@ax162>

On Thursday 03/12 at 01:02 -0700, Nathan Chancellor wrote:
> Hi Calvin,
> 
> On Mon, Mar 09, 2026 at 09:24:57PM -0700, Calvin Owens wrote:
> > Commit e1b385726f7f ("drm/amd/display: Add additional checks for PSP
> > footer size") introduced a use of an uninitialized stack variable
> > in dm_dmub_sw_init() (region_params.bss_data_size).
> > 
> > Interestingly, this seems to cause no issue on normal kernels. But when
> > full LTO is enabled, it causes the compiler to "optimize" out huge
> > swaths of amdgpu initialization code, and the driver is unusable:
> 
> Yeah, this appears to be a very unfortunate case of "clang encountered known
> undefined behavior and stopped code generation", which we would like to
> avoid but figuring out a proper upstreamable solution is hard. The most
> recent attempt:
> 
>   https://github.com/llvm/llvm-project/pull/146791
> 
> My guess is that LTO allows inlining of
> dmub_srv_get_fw_meta_info_from_raw_fw() into dm_dmub_sw_init(), at which
> point it can see that the result of accessing an uninitialized
> region_params.bss_data_size will be used through
> fw_meta_info_params.fw_bss_data and gives up generating the rest of the
> function.

Thanks for looking, Nathan. I'll keep an eye on that PR and see if it's
able to catch this example. I've tried to come up with a minimal
reproducer, but I haven't had any luck yet (so far I always get the
warning). Would a reproducer be helpful at all?

I put the full W=2 output for the one file here in case anyone else
wants to look:

   https://github.com/jcalvinowens/lkml-debug/blob/main/amdgpu-lto/gcc-warns.txt
   https://github.com/jcalvinowens/lkml-debug/blob/main/amdgpu-lto/llvm-warns.txt

Somehow 'make drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.o' doesn't
work. I want to look at that later, because it was mildly annoying while
digging into this.

> >     amdgpu 0000:03:00.0: [drm] Loading DMUB firmware via PSP: version=0x07002F00
> >     amdgpu 0000:03:00.0: sw_init of IP block <dm> failed 5
> >     amdgpu 0000:03:00.0: amdgpu_device_ip_init failed
> >     amdgpu 0000:03:00.0: Fatal error during GPU init
> > 
> > It surprises me that neither gcc nor clang emit a warning about this: I
> > only found it by bisecting the LTO breakage.
> 
> gcc's -Wmaybe-uninitialized is disabled by default for the kernel but
> even enabling it with KCFLAGS does not show an instance here, which I
> find quite surprising... for clang, it is harder because the warning
> happens early in the frontend where it might not be able to track a
> value that well.

GCC does emit what seems to me to be a real but benign warning, in the
same file, about an ERR_PTR check that doesn't handle NULL:

    https://lore.kernel.org/lkml/6aaf2cf4bd19363a85f35e649685d7bdae400253.1773157137.git.calvin@wbinvd.org/

I'm also trying to find a minimal reproducer for GCC, but no luck yet.
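
For anyone unfamiliar with that class of warning, the general shape is
roughly the following. This is only a userspace sketch: the sketch_*
helpers are hypothetical approximations of ERR_PTR(), IS_ERR() and
IS_ERR_OR_NULL() from include/linux/err.h, and I'm assuming the usual
"IS_ERR() is false for NULL" gap rather than quoting the code in
question:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers.
 * The names and this exact implementation are assumptions for illustration. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top SKETCH_MAX_ERRNO addresses. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Also treats NULL as a failure, which sketch_is_err() does not. */
static inline bool sketch_is_err_or_null(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}
```

A caller that only tests sketch_is_err() will happily dereference a NULL
return, which is the gap the warning points at.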

> > Fix by using the old value for region_params.bss_data_size in place of
> > the uninitialized reference, which makes amdgpu work with LTO again.
> > 
> > Fixes: e1b385726f7f ("drm/amd/display: Add additional checks for PSP footer size")
> > Signed-off-by: Calvin Owens <calvin@wbinvd.org>
> > ---
> >  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index b3d6f2cd8ab6..e69e61163ae9 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -2554,7 +2554,7 @@ static int dm_dmub_sw_init(struct amdgpu_device *adev)
> >  	fw_meta_info_params.fw_inst_const = adev->dm.dmub_fw->data +
> >  					    le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
> >  					    PSP_HEADER_BYTES_256;
> > -	fw_meta_info_params.fw_bss_data = region_params.bss_data_size ? adev->dm.dmub_fw->data +
> > +	fw_meta_info_params.fw_bss_data = le32_to_cpu(hdr->bss_data_bytes) ? adev->dm.dmub_fw->data +
> 
> Maybe it would be better to use fw_meta_info_params.bss_data_size
> instead of le32_to_cpu(hdr->bss_data_bytes)? Obviously it is the same
> value but it would result in a smaller change. It seems likely that this
> was just a copy and paste failure.

Agreed. That ends up being almost self-evidently correct if I force git
to add an extra context line with the assignment (I always forget I can
do that):

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b3d6f2cd8ab6..0d1c772ef713 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2553,9 +2553,9 @@ static int dm_dmub_sw_init(struct amdgpu_device *adev)
 	fw_meta_info_params.bss_data_size = le32_to_cpu(hdr->bss_data_bytes);
 	fw_meta_info_params.fw_inst_const = adev->dm.dmub_fw->data +
 					    le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
 					    PSP_HEADER_BYTES_256;
-	fw_meta_info_params.fw_bss_data = region_params.bss_data_size ? adev->dm.dmub_fw->data +
+	fw_meta_info_params.fw_bss_data = fw_meta_info_params.bss_data_size ? adev->dm.dmub_fw->data +
 					  le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
 					  le32_to_cpu(hdr->inst_const_bytes) : NULL;
 	fw_meta_info_params.custom_psp_footer_size = 0;
 

I'll send a v2 in a little bit.
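
To spell out the pattern for anyone skimming the thread, here is a rough
userspace sketch. The struct and function names are hypothetical
stand-ins, not the driver code, and this is not a reproducer for the LTO
behavior. It only shows the condition reading the field that was just
assigned instead of one that never was:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for fw_meta_info_params; only the two fields
 * needed to show the pattern are included. */
struct meta_info_params_sketch {
	uint32_t bss_data_size;
	const uint8_t *fw_bss_data;
};

/* Returns where the BSS data would start inside the firmware blob, or
 * NULL when the header reports no BSS data. */
static const uint8_t *pick_fw_bss_data(const uint8_t *fw_data,
				       uint32_t ucode_offset,
				       uint32_t inst_const_bytes,
				       uint32_t bss_data_bytes)
{
	struct meta_info_params_sketch meta;

	assert(fw_data != NULL);

	meta.bss_data_size = bss_data_bytes;

	/* The buggy form tested a field of a different, never-initialized
	 * stack struct here, which is undefined behavior. The fixed form
	 * tests the field assigned just above. */
	meta.fw_bss_data = meta.bss_data_size ?
		fw_data + ucode_offset + inst_const_bytes : NULL;

	return meta.fw_bss_data;
}
```

With a zero bss_data_bytes this returns NULL, otherwise it returns the
blob pointer advanced by the two offsets.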

Thanks,
Calvin

> >  					  le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
> >  					  le32_to_cpu(hdr->inst_const_bytes) : NULL;
> >  	fw_meta_info_params.custom_psp_footer_size = 0;
> > -- 
> > 2.47.3
> > 

Thread overview: 9+ messages
2026-03-10  4:24 [REGRESSION][PATCH] drm/amd/display: Fix uninitialized variable which breaks full LTO Calvin Owens
2026-03-10  5:54 ` Calvin Owens
2026-03-12  8:02 ` Nathan Chancellor
2026-03-12 16:15   ` Calvin Owens [this message]
2026-03-12 17:13 ` [PATCH v2] drm/amd/display: Fix uninitialized variable use " Calvin Owens
2026-03-12 20:31   ` Harry Wentland
2026-03-12 20:37   ` Nathan Chancellor
2026-03-13  3:56 ` Claude review: [PATCH] drm/amd/display: Fix uninitialized variable " Claude Code Review Bot
2026-03-13  3:56 ` Claude Code Review Bot
