From: <w15303746062@163.com>
To: tzimmermann@suse.de, airlied@redhat.com, jfalempe@redhat.com
Cc: maarten.lankhorst@linux.intel.com, mripard@kernel.org,
airlied@gmail.com, simona@ffwll.ch,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
Mingyu Wang <25181214217@stu.xidian.edu.cn>
Subject: [PATCH] drm/ast: Add timeouts to AHB/SCU polling loops to prevent soft lockups
Date: Wed, 13 May 2026 19:39:49 +0800
Message-ID: <20260513113949.356537-1-w15303746062@163.com>
In-Reply-To: <b98e1013-25ae-44e1-8905-88f104cc0608@suse.de>
From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
While validating the driver using DevGen (a framework that synthesizes
virtual device models directly from driver source code via LLM guidance),
a severe soft lockup was observed.
The hardware polling loops in `__ast_mindwm`, `__ast_moutdwm`, and
`ast_2500_patch_ahb` lack a timeout mechanism. On bare-metal systems,
if the ASPEED chip becomes unresponsive or a PCIe bus fault occurs,
the CPU will spin indefinitely in these loops. This results in a system
hang, triggering the watchdog soft lockup and causing subsequent I/O
starvation (e.g., blocking jbd2).
Fix this by bounding each loop to 10000 iterations with `udelay(10)`
between polls, which gives roughly 100 ms of delay per loop. Using
`udelay()` keeps the fix safe even if these accessors are called from
atomic context or while holding a spinlock. If the hardware fails to
respond, the loop exits and emits a `WARN_ONCE`, so the kernel can
degrade gracefully instead of hanging.
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
Hi Thomas,
Thanks for the prompt response and confirmation!
Instead of just waiting for threshold suggestions, I have drafted this
patch to address the soft lockup. Changing the return types of
`__ast_mindwm` and `__ast_moutdwm` to propagate error codes (e.g.,
`-ETIMEDOUT`) would require an intrusive refactoring across the entire
AST driver, so I took a more defensive, minimally invasive approach.
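For comparison, the error-propagating alternative mentioned above could
look roughly like the following user-space model. This is only a sketch
under stated assumptions: `struct fake_ast`, `fake_read_seg()` and
`model_mindwm()` are hypothetical names invented for illustration and are
not part of the ast driver, and the fake device merely imitates a segment
register that latches after some number of reads.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the device (not driver code): the segment
 * register only reflects the written value after `latency` reads. A very
 * large latency models hardware that never responds.
 */
struct fake_ast {
	uint32_t seg_pending;	/* value last written to the segment register */
	uint32_t seg_latched;	/* value currently read back */
	unsigned int latency;	/* reads remaining until the write latches */
	uint32_t window_value;	/* what the data window would return */
};

static uint32_t fake_read_seg(struct fake_ast *dev)
{
	if (dev->latency == 0)
		dev->seg_latched = dev->seg_pending;
	else
		dev->latency--;
	return dev->seg_latched;
}

/*
 * Model of an error-propagating read accessor: returns 0 and stores the
 * value in *val on success, or -ETIMEDOUT if the segment never latches.
 */
static int model_mindwm(struct fake_ast *dev, uint32_t r, uint32_t *val)
{
	int retries = 10000;	/* same bound as the patch */

	assert(val != NULL);
	dev->seg_pending = r & 0xffff0000;

	do {
		if ((fake_read_seg(dev) & 0xffff0000) == (r & 0xffff0000)) {
			*val = dev->window_value;
			return 0;
		}
		/* the kernel version would call udelay(10) here */
	} while (--retries);

	return -ETIMEDOUT;	/* *val is left untouched */
}
```

With this shape every caller has to check the return value and decide
what to do on failure, which is the driver-wide refactoring the patch
avoids.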
To avoid the risk of sleeping in atomic context (in case these
low-level I/O accessors are ever called under a spinlock), I used
a bounded loop with `udelay(10)` and a maximum of 10000 iterations
(approx. 100 ms total per loop). If the ASPEED hardware completely
fails to respond, the loop exits, a `WARN_ONCE` is emitted, and the
CPU no longer spins indefinitely.
Please let me know if you think the 100ms threshold and the `udelay`
approach are appropriate for this specific AHB/SCU hardware sequence.
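As a quick check of the threshold, the delay budget can be worked out
from the constants in the diff. The sketch below counts only `udelay()`
time and ignores MMIO access latency; the function names are hypothetical
helpers for the arithmetic. It also assumes, from reading the diff, that
each iteration of the loop in `ast_2500_patch_ahb` calls `__ast_moutdwm`
and `__ast_mindwm` once, so both of their bounded loops could time out
inside every outer iteration.

```c
#include <assert.h>
#include <stdint.h>

#define POLL_RETRIES	10000u	/* iterations per loop, from the patch */
#define POLL_DELAY_US	10u	/* udelay(10) per iteration */

/* One accessor loop budgets about 100 ms of delay. */
static_assert(POLL_RETRIES * POLL_DELAY_US == 100000u,
	      "10000 iterations of udelay(10) is 100000 us");

/* Upper bound on delay time inside one __ast_mindwm/__ast_moutdwm call. */
static uint64_t accessor_budget_us(void)
{
	return (uint64_t)POLL_RETRIES * POLL_DELAY_US;
}

/*
 * Upper bound for the loop in ast_2500_patch_ahb if every nested accessor
 * call also runs into its own timeout: each outer iteration then costs
 * two accessor budgets plus the outer udelay(10).
 */
static uint64_t patch_ahb_worst_case_us(void)
{
	uint64_t per_iteration = 2 * accessor_budget_us() + POLL_DELAY_US;

	return (uint64_t)POLL_RETRIES * per_iteration;
}
```

Under these assumptions a single accessor is bounded at about 100 ms,
while the nested worst case for the `ast_2500_patch_ahb` loop comes to
2000100000 us, or roughly 2000 seconds. That may be worth keeping in mind
when judging whether the outer loop should share the same retry count.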
drivers/gpu/drm/ast/ast_2500.c | 8 +++++++-
drivers/gpu/drm/ast/ast_post.c | 16 ++++++++++++++--
2 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ast/ast_2500.c b/drivers/gpu/drm/ast/ast_2500.c
index 2a52af0ded56..08d18f90201a 100644
--- a/drivers/gpu/drm/ast/ast_2500.c
+++ b/drivers/gpu/drm/ast/ast_2500.c
@@ -107,6 +107,7 @@ static const u32 ast2500_ddr4_1600_timing_table[REGTBL_NUM] = {
void ast_2500_patch_ahb(void __iomem *regs)
{
u32 data;
+ int retries = 10000; /* ~100ms timeout */
/* Clear bus lock condition */
__ast_moutdwm(regs, 0x1e600000, 0xAEED1A03);
@@ -136,7 +137,12 @@ void ast_2500_patch_ahb(void __iomem *regs)
do {
__ast_moutdwm(regs, 0x1e6e2000, 0x1688A8A8);
data = __ast_mindwm(regs, 0x1e6e2000);
- } while (data != 1);
+ if (data == 1)
+ break;
+ udelay(10);
+ } while (--retries);
+
+ WARN_ONCE(!retries, "ast: timeout waiting for AHB patch\n");
__ast_moutdwm(regs, 0x1e6e207c, 0x08000000); /* clear fast reset */
}
diff --git a/drivers/gpu/drm/ast/ast_post.c b/drivers/gpu/drm/ast/ast_post.c
index b72914dbed38..66eb80925e27 100644
--- a/drivers/gpu/drm/ast/ast_post.c
+++ b/drivers/gpu/drm/ast/ast_post.c
@@ -37,13 +37,19 @@
u32 __ast_mindwm(void __iomem *regs, u32 r)
{
u32 data;
+ int retries = 10000; /* ~100ms timeout */
__ast_write32(regs, 0xf004, r & 0xffff0000);
__ast_write32(regs, 0xf000, 0x1);
do {
data = __ast_read32(regs, 0xf004) & 0xffff0000;
- } while (data != (r & 0xffff0000));
+ if (data == (r & 0xffff0000))
+ break;
+ udelay(10);
+ } while (--retries);
+
+ WARN_ONCE(!retries, "ast: timeout reading from AHB/SCU\n");
return __ast_read32(regs, 0x10000 + (r & 0x0000ffff));
}
@@ -51,13 +57,19 @@ u32 __ast_mindwm(void __iomem *regs, u32 r)
void __ast_moutdwm(void __iomem *regs, u32 r, u32 v)
{
u32 data;
+ int retries = 10000; /* ~100ms timeout */
__ast_write32(regs, 0xf004, r & 0xffff0000);
__ast_write32(regs, 0xf000, 0x1);
do {
data = __ast_read32(regs, 0xf004) & 0xffff0000;
- } while (data != (r & 0xffff0000));
+ if (data == (r & 0xffff0000))
+ break;
+ udelay(10);
+ } while (--retries);
+
+ WARN_ONCE(!retries, "ast: timeout writing to AHB/SCU\n");
__ast_write32(regs, 0x10000 + (r & 0x0000ffff), v);
}
--
2.34.1
Thread overview (4+ messages):
2026-05-13 7:36 BUG: drm/ast: soft lockup due to missing timeout in hardware polling (ast_2500_patch_ahb) w15303746062
2026-05-13 7:50 ` Thomas Zimmermann
2026-05-13 11:39 ` w15303746062 [this message]
2026-05-16 2:09 ` Claude review: drm/ast: Add timeouts to AHB/SCU polling loops to prevent soft lockups Claude Code Review Bot