From: Alex Deucher <alexdeucher@gmail.com>
To: Rafael Passos <rafael@rcpassos.me>,
"Wentland, Harry" <Harry.Wentland@amd.com>,
"Leo (Sunpeng) Li" <Sunpeng.Li@amd.com>,
Bhuvana Chandra Pinninti <BhuvanaChandra.Pinninti@amd.com>
Cc: amd-gfx@lists.freedesktop.org, siqueira@igalia.com,
linux-kernel@vger.kernel.org, Martin Leung <Martin.Leung@amd.com>,
Ray Wu <ray.wu@amd.com>, Daniel Wheeler <daniel.wheeler@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
Rafael Passos <rcpassos@ime.usp.br>,
davidbtadokoro@ime.usp.br, dri-devel@lists.freedesktop.org
Subject: Re: [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression)
Date: Wed, 4 Mar 2026 11:32:57 -0500
Message-ID: <CADnq5_Msmohg3T5KLrqPwvJGbXPOMKeNN-ZcqgTS2pHb+GjuYQ@mail.gmail.com>
In-Reply-To: <DGTZWNGLPVI3.108P2EKGFCO2R@rcpassos.me>
+ Harry, Leo, Bhuvana
On Wed, Mar 4, 2026 at 8:42 AM Rafael Passos <rafael@rcpassos.me> wrote:
>
> I found the issue, but I'm still not sure how to proceed.
> I would like some guidance in fixing this regression.
>
> The issue is where a register is being read from.
> Before this change, the MICROSECOND_TIME_BASE_DIV register was read from
> dce_hwseq_registers (dce_hwseq.h); now it is read from dccg_registers (dcn20_dccg.h).
>
> The bisection led me to this commit: 4c595e75110ece20af3a68c1ebef8ed4c1b69afe
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> After a lot of debugging, I traced the issue to this file:
> drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> This card is dcn21, but it uses most of the dcn20 implementation.
> For easy comparison, the following block contains the function with the original path
> commented out (from dcn21), and the function it calls from dcn20:
>
> ```
> bool dcn21_s0i3_golden_init_wa(struct dc *dc)
> {
> 	if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done) {
>
> 		printk(KERN_CRIT "AUYER in %s", __func__);
> 		return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
> 	}
>
> 	printk(KERN_CRIT "AUYER in %s", __func__);
>
> 	return false;
>
> 	// original flow:
> 	// struct dce_hwseq *hws = dc->hwseq;
> 	// uint32_t value = 0;
> 	// value = REG_READ(MICROSECOND_TIME_BASE_DIV);
>
> 	// return value != 0x00120464;
> }
>
> // is_s0i3_golden_init_wa_done -> dccg2_is_s0i3_golden_init_wa_done
> bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
> {
> 	struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
> 	return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
> }
> ```
>
> I instrumented this code to compare the values.
> On boot, both code paths return the same result. When resuming from S3 sleep, they differ.
> If I use the result of the code path from before this commit, the screen works.
> At the end of this email is my "debugging patch", along with logs comparing what shows
> up on boot vs. on resume from sleep.
>
> I am attempting to implement a `dccg21_is_s0i3_golden_init_wa_done` to
> replace the `dccg2_is_s0i3_golden_init_wa_done` that is used in dcn21_dccg.c.
> Maybe dcn21 needs a separate register page (instead of using dcn20_dccg.h)?
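
To make the "separate register page" question concrete, here is a userspace toy model, not driver code. It assumes the two offsets printed by the debug patch in the logs quoted further down (0x13b through hwseq, 0 through dccg) are what each register table actually holds; whether the 0 really means an unpopulated dccg entry is only a guess. `fake_mmio`, `block_regs`, `reg_read`, `golden_init_done` and `load_resume_state` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GOLDEN_TIME_BASE_DIV 0x00120464u
#define FAKE_MMIO_WORDS 0x200u

/* Hypothetical stand-in for the MMIO space; not the real hardware layout. */
static uint32_t fake_mmio[FAKE_MMIO_WORDS];

/* Each block carries its own offset table, so one name can map to two offsets. */
struct block_regs {
	uint32_t MICROSECOND_TIME_BASE_DIV;
};

/* Read one word from the fake register space. */
static uint32_t reg_read(uint32_t offset)
{
	assert(offset < FAKE_MMIO_WORDS);
	return fake_mmio[offset];
}

/* Same check as the quoted callback: is the golden value already programmed? */
static bool golden_init_done(const struct block_regs *regs)
{
	return reg_read(regs->MICROSECOND_TIME_BASE_DIV) == GOLDEN_TIME_BASE_DIV;
}

/* Offsets as printed by the debug patch: 0x13b via hwseq, 0 via dccg. */
static const struct block_regs hwseq_regs = { .MICROSECOND_TIME_BASE_DIV = 0x13b };
static const struct block_regs dccg_regs = { .MICROSECOND_TIME_BASE_DIV = 0 };

/* Load the values the S3-resume log reports for each read. */
static void load_resume_state(void)
{
	fake_mmio[0x13b] = 1180772u;	/* equals 0x00120464 */
	fake_mmio[0] = 1148576u;	/* what the dccg path read back */
}
```

After `load_resume_state()`, `golden_init_done(&hwseq_regs)` is true while `golden_init_done(&dccg_regs)` is false, so the same register name gives opposite answers depending only on which table supplies the offset. If that is what happens in the driver, a dcn21 register list that fills in this entry may be enough.
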
>
>
> Note the difference between log lines 2 and 5:
> [ 4.956404] [ T316] AUYER PATCHED in dcn21_s0i3_golden_init_wa, values compared to 0x00120464
> [ 4.956407] [ T316] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, bool: 1
> [ 4.956411] [ T316] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> [ 4.956412] [ T316] AUYER in dccg21_is_s0i3_golden_init_wa_done
> [ 4.956415] [ T316] AUYER in dccg21_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> [ 4.956418] [ T316] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as bool 1
>
>
> 1 [ 4.942660] [ T343] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [ 4.942662] [ T343] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, comparing to 0x00120464 bool: 1
> 3 [ 4.942665] [ T343] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> 4 [ 4.942668] [ T343] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [ 4.942671] [ T343] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
>
> On wake from S3:
>
> 1 [ 279.431636] [ T5497] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [ 279.431638] [ T5497] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180772, comparing to 0x00120464 bool: 0
> 3 [ 279.431640] [ T5497] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180772
> 4 [ 279.431641] [ T5497] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [ 279.431642] [ T5497] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
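
The debug patch prints the register values with %d, which makes them hard to compare against 0x00120464 by eye. A minimal sketch that decodes the logged numbers: the values are copied from the logs above, while `wa_sample`, `wa_needed` and `print_samples` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Golden value both flows compare against (from the quoted code). */
#define GOLDEN_TIME_BASE_DIV 0x00120464u

/* The golden value is 1180772 in decimal, which is how the logs print it. */
static_assert(GOLDEN_TIME_BASE_DIV == 1180772u, "golden value in decimal");

/* One pair of reads copied from the debug logs. */
struct wa_sample {
	const char *when;
	uint32_t hwseq_read;	/* old flow, offset logged as 0x13b */
	uint32_t dccg_read;	/* new flow, offset logged as 0 */
};

static const struct wa_sample samples[] = {
	{ "boot", 1180208u, 1148576u },
	{ "S3 resume", 1180772u, 1148576u },
};

/* Both flows reduce to this predicate: true means the workaround is needed. */
static bool wa_needed(uint32_t value)
{
	return value != GOLDEN_TIME_BASE_DIV;
}

/* Print each logged value in hex next to the decision each flow makes. */
static void print_samples(void)
{
	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		const struct wa_sample *s = &samples[i];

		printf("%-9s old: 0x%08x -> %d  new: 0x%08x -> %d\n",
		       s->when,
		       (unsigned int)s->hwseq_read, wa_needed(s->hwseq_read),
		       (unsigned int)s->dccg_read, wa_needed(s->dccg_read));
	}
}
```

In hex, the old flow reads 0x00120230 on boot and 0x00120464 on resume, while the new flow reads 0x001186a0 both times. So only the old flow ever sees the golden value and returns false after S3, which matches the 0 vs 1 difference between log lines 2 and 5.
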
>
>
> The "patch" (just a test lab), to understand where these logs came from.
> It applies cleanly to amddrm drm-next and mainline.
>
> ---
> .../amd/display/dc/dccg/dcn20/dcn20_dccg.c | 3 +++
> .../amd/display/dc/hwss/dcn21/dcn21_hwseq.c | 25 ++++++++++++++++---
> 2 files changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> index 13ba7f5ce13e..0ba20c7969ed 100644
> --- a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> +++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> @@ -158,6 +158,9 @@ bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
> {
> struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
> + printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x, value: %d",
> + __func__, dccg_dcn->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
> return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
> }
>
> diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> index 062745389d9a..143c552e0fa9 100644
> --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> @@ -88,10 +88,28 @@ int dcn21_init_sys_ctx(struct dce_hwseq *hws, struct dc *dc, struct dc_phy_addr_
>
> bool dcn21_s0i3_golden_init_wa(struct dc *dc)
> {
> - if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done)
> - return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
>
> - return false;
> + printk(KERN_CRIT "AUYER PATCHED in %s, values compared to 0x00120464", __func__);
> +
> + // original flow
> + struct dce_hwseq *hws = dc->hwseq;
> + uint32_t value = 0;
> + value = REG_READ(MICROSECOND_TIME_BASE_DIV);
> +
> + printk(KERN_CRIT "AUYER in %s, original flow value: %d, bool: %d",
> + __func__, value, value != 0x00120464);
> +
> + printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x value: %d",
> + __func__, hws->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
> + if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done) {
> + // new flow
> + bool v2 = 0;
> + v2 = !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
> + printk(KERN_CRIT "AUYER in %s, NEW flow value as bool %d", __func__, v2);
> + }
> +
> + return value != 0x00120464;
> }
>
> void dcn21_exit_optimized_pwr_state(
> @@ -298,4 +316,3 @@ bool dcn21_is_abm_supported(struct dc *dc,
> }
> return false;
> }
> -
> --
> 2.53.0
>
>
Thread overview: 7+ messages
2026-02-27 23:01 [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression) Rafael Passos
2026-03-04 12:27 ` Rafael Passos
2026-03-04 16:32 ` Alex Deucher [this message]
2026-03-08 0:04 ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
2026-03-08 8:19 ` kernel test robot
2026-03-08 22:01 ` Claude review: " Claude Code Review Bot
2026-03-08 16:23 ` kernel test robot