public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Alex Deucher <alexdeucher@gmail.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: harry.wentland@amd.com, sunpeng.li@amd.com,
	alexander.deucher@amd.com, christian.koenig@amd.com,
	siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch,
	ardb@kernel.org, hamza.mahfooz@amd.com, aurabindo.pillai@amd.com,
	Roman.Li@amd.com, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH] drm/amd/display: Wrap DCN32 phantom-plane allocation in DC_RUN_WITH_PREEMPTION_ENABLED
Date: Mon, 4 May 2026 17:01:10 -0400
Message-ID: <CADnq5_MiPNMMz3aE59bXdx11e_MBqS5SnkcC_YMUYvRtwiEokQ@mail.gmail.com>
In-Reply-To: <20260504201905.90667-1-mikhail.v.gavrilov@gmail.com>

On Mon, May 4, 2026 at 4:29 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> dcn32_validate_bandwidth() wraps dcn32_internal_validate_bw() with
> DC_FP_START()/DC_FP_END(). On x86 non-RT, DC_FP_START expands into
> kernel_fpu_begin() which takes fpregs_lock(), i.e. local_bh_disable().
> Allocations done inside this region must therefore not sleep.
>
> The legacy DML1 path through dcn32_full_validate_bw_helper() ->
> dcn32_add_phantom_pipes() -> dcn32_enable_phantom_plane() unconditionally
> calls dc_state_create_phantom_plane() -> dc_create_plane_state(), which
> performs kvzalloc(sizeof(struct dc_plane_state)). On a recent kernel
> sizeof(struct dc_plane_state) is 343736 bytes (335 KiB), well above the
> PAGE_ALLOC_COSTLY_ORDER threshold, so __kvmalloc_node() takes the vmalloc
> path. __get_vm_area_node() then trips its BUG_ON(in_interrupt()) because
> SOFTIRQ_DISABLE_OFFSET is set in preempt_count:
>
>   kernel BUG at mm/vmalloc.c:3206!
>   RIP: __get_vm_area_node+0x257/0x2d0
>   Workqueue: events_unbound commit_work
>   Call Trace:
>    __vmalloc_node_range_noprof+0x22b/0x570
>    __kvmalloc_node_noprof+0x3d0/0xb40
>    dc_create_plane_state+0x35/0x290 [amdgpu]
>    dc_state_create_phantom_plane+0x1a/0x120 [amdgpu]
>    dcn32_enable_phantom_plane+0x101/0x780 [amdgpu]
>    dcn32_add_phantom_pipes+0x47/0x460 [amdgpu]
>    dcn32_full_validate_bw_helper.constprop.0+0xa46/0x1d70 [amdgpu]
>    dcn32_internal_validate_bw+0x49c/0x1600 [amdgpu]
>    dml1_validate+0x20f/0x800 [amdgpu]
>    dcn32_validate_bandwidth+0x317/0x540 [amdgpu]
>    dc_validate_with_context+0xd34/0x1d30 [amdgpu]
>    dc_commit_streams+0x7ca/0x1810 [amdgpu]
>    amdgpu_dm_commit_streams+0xfd4/0x1e60 [amdgpu]
>    amdgpu_dm_atomic_commit_tail+0x29e/0x3520 [amdgpu]
>    commit_tail+0x204/0x4b0
>    process_one_work+0x8fd/0x16a0
>
> Per-CPU __preempt_count on the crashing CPU at panic time was 0x202:
> SOFTIRQ_DISABLE_OFFSET (0x200) from fpregs_lock() plus two preempt holds
> from dc_fpu_begin() and kernel_fpu_begin().
>
> The DML2 paths already wrap their large vzalloc()s in
> DC_RUN_WITH_PREEMPTION_ENABLED() to handle this case (see
> drivers/gpu/drm/amd/display/dc/dml2_0/dml21/dml21_wrapper.c:26 and
> drivers/gpu/drm/amd/display/dc/dml2_0/dml2_wrapper.c:24). Apply the same
> guard to the DML1 phantom-plane allocation in dcn32_enable_phantom_plane().
>
> This is a separate class of issue from "drm/amd/display: Fix unsafe uses
> of kernel mode FPU" by Ard Biesheuvel, which addressed callers entering
> DC FP compilation units without DC_FP_START. The bug fixed here is the
> inverse: a sleeping allocator invoked from within an active DC_FP_START
> region.
>
> Reproducer (RX 7900 XTX, single 4K HDMI display, DCN 3.2): launch any
> workload that produces rapid atomic modeset commits. The most reliable
> trigger observed is launching Rise of the Tomb Raider via Proton and
> repeatedly pressing the Super key during the level loading screen;
> the crash occurs within ~4 minutes of uptime. Random crashes are also observed
> during routine fullscreen toggles (image viewers, chat applications).
>
> Hardware verified clean: memtest86+ 4 passes, stressapptest -W -m 32
> 4 hours, both pass with 0 errors. KASAN active, no reports under load.
>
> Fixes: 235c67634230 ("drm/amd/display: add DCN32/321 specific files for Display Core")
> Cc: stable@vger.kernel.org # v6.0+
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4470

Alex

> ---
>  .../drm/amd/display/dc/resource/dcn32/dcn32_resource.c    | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> index 82f81b586986..3751f7a94a05 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> @@ -92,9 +92,14 @@
>  #include "dml/dcn32/dcn32_fpu.h"
>
>  #include "dc_state_priv.h"
> +#include "dc_fpu.h"
>
>  #include "dml2_0/dml2_wrapper.h"
>
> +#if !defined(DC_RUN_WITH_PREEMPTION_ENABLED)
> +#define DC_RUN_WITH_PREEMPTION_ENABLED(code) code
> +#endif
> +
>  #define DC_LOGGER_INIT(logger)
>
>  enum dcn32_clk_src_array_id {
> @@ -1684,7 +1689,8 @@ static void dcn32_enable_phantom_plane(struct dc *dc,
>                 if (curr_pipe->top_pipe && curr_pipe->top_pipe->plane_state == curr_pipe->plane_state)
>                         phantom_plane = prev_phantom_plane;
>                 else
> -                       phantom_plane = dc_state_create_phantom_plane(dc, context, curr_pipe->plane_state);
> +                       DC_RUN_WITH_PREEMPTION_ENABLED(phantom_plane =
> +                               dc_state_create_phantom_plane(dc, context, curr_pipe->plane_state));
>
>                 if (!phantom_plane)
>                         continue;
> --
> 2.54.0
>
