* [PATCH 0/2] 2 bugfixes found in DRM-CI
@ 2026-03-08 22:35 Jim Cromie
2026-03-08 22:35 ` [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs Jim Cromie
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Jim Cromie @ 2026-03-08 22:35 UTC (permalink / raw)
To: linux-kernel
Cc: Jim Cromie, Matthew Auld, Arun Pravin, Christian Koenig,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Luca Coelho, Jani Nikula, dri-devel
1st fixes drm-buddy's use of long-int power-of-2 fns on u64 var, on 32
bit arches (like arm32).
2nd is drm_print.c __drm_printfn_dbg() using __drm_print_enabled(),
which short-circuits the drm.debg bit tests because its optimized for
use behind dynamic-debug's per-callsite static-key guard.
__drm_printfn_dbg() doesnt have the guard, so under
CONFIG_DRM_USE_DYNAMIC_DEBUG=Y it floods the log, causing DRM-CI to
timeout on runners with a serial console.
Jim Cromie (2):
drm_buddy: fix power-of-2 rounding errs
drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass
drivers/gpu/drm/drm_buddy.c | 4 ++--
drivers/gpu/drm/drm_print.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs
2026-03-08 22:35 [PATCH 0/2] 2 bugfixes found in DRM-CI Jim Cromie
@ 2026-03-08 22:35 ` Jim Cromie
2026-03-10 2:46 ` Claude review: " Claude Code Review Bot
2026-03-08 22:35 ` [PATCH 2/2] drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass Jim Cromie
2026-03-10 2:46 ` Claude review: 2 bugfixes found in DRM-CI Claude Code Review Bot
2 siblings, 1 reply; 6+ messages in thread
From: Jim Cromie @ 2026-03-08 22:35 UTC (permalink / raw)
To: linux-kernel
Cc: Jim Cromie, Matthew Auld, Arun Pravin, Christian Koenig,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jani Nikula, Luca Coelho, dri-devel
On DRM-CI, I encountered this kunit:arm32 failure.
[23:19:40] [PASSED] drm_test_buddy_alloc_clear
[23:19:40] [PASSED] drm_test_buddy_alloc_range_bias
[23:19:41] [PASSED] drm_test_buddy_fragmentation_performance
[23:19:41] # drm_test_buddy_alloc_exceeds_max_order: EXPECTATION FAILED at drivers/gpu/drm/tests/drm_buddy_test.c:889
[23:19:41] Expected err == -22, but
[23:19:41] err == 0 (0x0)
[23:19:41] ------------[ cut here ]------------
[23:19:41] WARNING: drivers/gpu/drm/drm_buddy.c:405 at drm_buddy_fini+0x114/0x1b8, CPU#0: kunit_try_catch/74
[23:19:41] CPU: 0 UID: 0 PID: 74 Comm: kunit_try_catch Tainted: G N 7.0.0-rc1-gdfb0bcedd08a #1 VOLUNTARY
[23:19:41] Tainted: [N]=TEST
[23:19:41] Hardware name: Generic DT based system
[23:19:41] Call trace:
[23:19:41] unwind_backtrace from show_stack+0x10/0x14
[23:19:41] show_stack from dump_stack_lvl+0x3c/0x4c
[23:19:41] dump_stack_lvl from __warn+0xe8/0x1c4
[23:19:41] __warn from warn_slowpath_fmt+0xa4/0xc0
[23:19:41] warn_slowpath_fmt from drm_buddy_fini+0x114/0x1b8
[23:19:41] drm_buddy_fini from drm_test_buddy_alloc_exceeds_max_order+0x1c8/0x36c
[23:19:41] drm_test_buddy_alloc_exceeds_max_order from kunit_try_run_case+0x78/0x1c8
[23:19:41] kunit_try_run_case from kunit_generic_run_threadfn_adapter+0x1c/0x34
[23:19:41] kunit_generic_run_threadfn_adapter from kthread+0x108/0x134
[23:19:41] kthread from ret_from_fork+0x14/0x28
[23:19:41] Exception stack(0xf0bd5fb0 to 0xf0bd5ff8)
[23:19:41] 5fa0: 00000000 00000000 00000000 00000000
[23:19:41] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[23:19:41] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[23:19:41] ---[ end trace 0000000000000000 ]---
[23:19:41] # drm_test_buddy_alloc_exceeds_max_order: drivers/gpu/drm/drm_buddy.c:406: buddy_fini() root
[23:19:41] ------------[ cut here ]------------
[23:19:41] WARNING: drivers/gpu/drm/drm_buddy.c:414 at drm_buddy_fini+0x1b4/0x1b8, CPU#0: kunit_try_catch/74
[23:19:41] CPU: 0 UID: 0 PID: 74 Comm: kunit_try_catch Tainted: G W N 7.0.0-rc1-gdfb0bcedd08a #1 VOLUNTARY
[23:19:41] Tainted: [W]=WARN, [N]=TEST
[23:19:41] Hardware name: Generic DT based system
[23:19:41] Call trace:
[23:19:41] unwind_backtrace from show_stack+0x10/0x14
[23:19:41] show_stack from dump_stack_lvl+0x3c/0x4c
[23:19:41] dump_stack_lvl from __warn+0xe8/0x1c4
[23:19:41] __warn from warn_slowpath_fmt+0xa4/0xc0
[23:19:41] warn_slowpath_fmt from drm_buddy_fini+0x1b4/0x1b8
[23:19:41] drm_buddy_fini from drm_test_buddy_alloc_exceeds_max_order+0x1c8/0x36c
[23:19:41] drm_test_buddy_alloc_exceeds_max_order from kunit_try_run_case+0x78/0x1c8
[23:19:41] kunit_try_run_case from kunit_generic_run_threadfn_adapter+0x1c/0x34
[23:19:41] kunit_generic_run_threadfn_adapter from kthread+0x108/0x134
[23:19:41] kthread from ret_from_fork+0x14/0x28
[23:19:41] Exception stack(0xf0bd5fb0 to 0xf0bd5ff8)
[23:19:41] 5fa0: 00000000 00000000 00000000 00000000
[23:19:41] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[23:19:41] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[23:19:41] ---[ end trace 0000000000000000 ]---
[23:19:41] [FAILED] drm_test_buddy_alloc_exceeds_max_order
[23:19:41] # drm_buddy: Testing DRM buddy manager, with random_seed=0xacce106c
[23:19:41] # module: drm_buddy_test
[23:19:41] # drm_buddy: pass:8 fail:1 skip:0 total:9
This error in drm_test_buddy_alloc_exceeds_max_order (Expected -22,
got 0) is caused by a 64-bit truncation bug in the DRM buddy
allocator's core logic.
On 32-bit architectures (like arm32), the standard
roundup_pow_of_two() and rounddown_pow_of_two() macros use unsigned
long internally. When these are called with u64 memory sizes (as
drm_buddy.c does), any size above 4GB is truncated.
The test tries to allocate ~9GB on a ~10GB manager. Due to
truncation, it actually asks for ~1GB on a ~2GB manager, which
unexpectedly succeeds. This then causes a mm->avail mismatch warning
when the test finishes.
Use 64-bit safe power-of-two calculations. Since ilog2() is already
safe for u64, I'll use it to implement safe rounding.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
drivers/gpu/drm/drm_buddy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index dbf984f8e301..8f23fb615d47 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -919,7 +919,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
u64 modify_size;
int err;
- modify_size = rounddown_pow_of_two(size);
+ modify_size = 1ULL << ilog2(size);
pages = modify_size >> ilog2(mm->chunk_size);
order = fls(pages) - 1;
if (order == 0)
@@ -1140,7 +1140,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
/* Roundup the size to power of 2 */
if (flags & DRM_BUDDY_CONTIGUOUS_ALLOCATION) {
- size = roundup_pow_of_two(size);
+ size = 1ULL << (ilog2(size - 1) + 1);
min_block_size = size;
/* Align size value to min_block_size */
} else if (!IS_ALIGNED(size, min_block_size)) {
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass
2026-03-08 22:35 [PATCH 0/2] 2 bugfixes found in DRM-CI Jim Cromie
2026-03-08 22:35 ` [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs Jim Cromie
@ 2026-03-08 22:35 ` Jim Cromie
2026-03-10 2:46 ` Claude review: " Claude Code Review Bot
2026-03-10 2:46 ` Claude review: 2 bugfixes found in DRM-CI Claude Code Review Bot
2 siblings, 1 reply; 6+ messages in thread
From: Jim Cromie @ 2026-03-08 22:35 UTC (permalink / raw)
To: linux-kernel
Cc: Jim Cromie, Matthew Auld, Arun Pravin, Christian Koenig,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Luca Coelho, Jani Nikula, dri-devel
drm_debug_enabled() is the canonical bit-test for drm.debug.
Commit 6ce6fae84536 ("drm_print: optimize drm_debug_enabled for
jump-label") renamed the original bit-test to drm_debug_enabled_raw()
and introduced an internal bypass for dyndbg. When [1]=y, it defined
__drm_debug_enabled() to evaluate to 'true', allowing dyndbg's
static-key to handle the filtering at the callsite. It also provided
drm_debug_enabled() for cases where an explicit bit-mask check is
still required.
Later, commit 9fd6f61a297e ("drm/print: add drm_dbg_printer() for drm
device specific printer") added __drm_printfn_dbg(), but mistakenly
used the internal bypass __drm_debug_enabled() instead of the
canonical drm_debug_enabled(). This went unnoticed because at the
time, [1]=y was marked BROKEN.
Because __drm_printfn_dbg() is a shared helper where the callsite is
not directly guarded by dyndbg's static-key, this caused it to hit
the 'true' bypass and always print, ignoring the drm.debug bit-mask.
This results in a flood of messages in environments with slow serial
consoles, as seen in DRM-CI on i915 CML devices. When IGT causes a
mismatch in intel_pipe_config_compare(), the resulting UART storm
causes a hard timeout after 20 minutes.
To fix this, change __drm_printfn_dbg() to use
drm_debug_enabled_instrumented() instead. This ensures the bit-test
is performed at runtime even when dyndbg is enabled. It also adds a
pr_debug() to help track the frequency of this bit-test.
Additionally, update __drm_dev_dbg() to use the canonical
drm_debug_enabled() instead of the internal __drm_debug_enabled().
While __drm_dev_dbg() is wrapped by a dyndbg factory and thus safe,
this change ensures consistency and clarifies the intended usage.
[1] CONFIG_DRM_USE_DYNAMIC_DEBUG
Fixes: 9fd6f61a297e ("drm/print: add drm_dbg_printer() for drm device specific printer")
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
---
drivers/gpu/drm/drm_print.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index ded9461df5f2..9b622345e2eb 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -218,7 +218,7 @@ void __drm_printfn_dbg(struct drm_printer *p, struct va_format *vaf)
const struct device *dev = drm ? drm->dev : NULL;
enum drm_debug_category category = p->category;
- if (!__drm_debug_enabled(category))
+ if (!drm_debug_enabled_instrumented(category))
return;
__drm_dev_vprintk(dev, KERN_DEBUG, p->origin, p->prefix, vaf);
@@ -335,7 +335,7 @@ void __drm_dev_dbg(struct _ddebug *desc, const struct device *dev,
struct va_format vaf;
va_list args;
- if (!__drm_debug_enabled(category))
+ if (!drm_debug_enabled(category))
return;
/* we know we are printing for either syslog, tracefs, or both */
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Claude review: 2 bugfixes found in DRM-CI
2026-03-08 22:35 [PATCH 0/2] 2 bugfixes found in DRM-CI Jim Cromie
2026-03-08 22:35 ` [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs Jim Cromie
2026-03-08 22:35 ` [PATCH 2/2] drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass Jim Cromie
@ 2026-03-10 2:46 ` Claude Code Review Bot
2 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-10 2:46 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: 2 bugfixes found in DRM-CI
Author: Jim Cromie <jim.cromie@gmail.com>
Patches: 3
Reviewed: 2026-03-10T12:46:56.389302
---
This is a 2-patch bugfix series from Jim Cromie addressing issues found in DRM-CI. Both patches fix real bugs:
1. **Patch 1** fixes a 64-bit truncation bug in `drm_buddy.c` where `roundup_pow_of_two()` / `rounddown_pow_of_two()` use `unsigned long` internally, which truncates `u64` values on 32-bit architectures.
2. **Patch 2** fixes a debug message flood in `drm_print.c` where `__drm_printfn_dbg()` uses the dyndbg-optimized `__drm_debug_enabled()` (which evaluates to `true` under `CONFIG_DRM_USE_DYNAMIC_DEBUG=Y`) instead of performing the actual bitmask check.
Both are legitimate fixes with clear problem descriptions. However, each patch has issues worth discussing.
Note: The kernel tree at `linux/` has been significantly refactored (drm_buddy → gpu_buddy), so the patches don't apply. The review is based on the patch content and the original code as visible from the diff context.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread
* Claude review: drm_buddy: fix power-of-2 rounding errs
2026-03-08 22:35 ` [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs Jim Cromie
@ 2026-03-10 2:46 ` Claude Code Review Bot
0 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-10 2:46 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Problem analysis**: Correct. `roundup_pow_of_two()` and `rounddown_pow_of_two()` both use `unsigned long` internally (confirmed at `linux/include/linux/log2.h:55-67` — `1UL << fls_long(n - 1)` and `1UL << (fls_long(n) - 1)`). On 32-bit arches, `unsigned long` is 32 bits, so any `u64` value > 4GB is silently truncated.
**Rounddown fix** (line 306):
```c
- modify_size = rounddown_pow_of_two(size);
+ modify_size = 1ULL << ilog2(size);
```
This is correct. `ilog2()` handles `u64` properly, and `1ULL << ilog2(size)` is equivalent to `rounddown_pow_of_two(size)` for non-zero, non-power-of-two inputs. For inputs that are already powers of two, both return the same value.
**Roundup fix** (line 315):
```c
- size = roundup_pow_of_two(size);
+ size = 1ULL << (ilog2(size - 1) + 1);
```
**Bug**: This has an edge case problem. When `size` is already a power of two (e.g., `size = 4`), `ilog2(size - 1)` = `ilog2(3)` = 1, so the result is `1ULL << 2` = 4, which is correct. When `size = 1`, `ilog2(0)` is undefined behavior — `ilog2(0)` returns `-1` on most implementations, so `1ULL << 0` = 1, which happens to work but relies on undefined behavior for `ilog2(0)`.
However, given the calling context (this is inside `drm_buddy_alloc_blocks` with contiguous allocation), `size` should never be 0 or 1 in practice (it's a memory allocation size that's been validated), so this is likely fine in practice.
**Suggestion**: The commit message mentions "I'll use it to implement safe rounding" which is a bit informal for a commit message — first person and future tense. Consider also adding a `Fixes:` tag — the original `rounddown_pow_of_two()` / `roundup_pow_of_two()` calls have been present since the initial drm_buddy code was merged, and this is a legitimate bug on 32-bit architectures. Also, the kernel now has `roundup_pow_of_two_u64()` / `rounddown_pow_of_two_u64()` in some trees — it might be worth checking if those helpers exist or are more appropriate than hand-rolling the `ilog2` pattern.
**Missing Cc: stable**: For a genuine 32-bit architecture bug, a `Cc: stable@vger.kernel.org` tag would be appropriate.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread
* Claude review: drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass
2026-03-08 22:35 ` [PATCH 2/2] drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass Jim Cromie
@ 2026-03-10 2:46 ` Claude Code Review Bot
0 siblings, 0 replies; 6+ messages in thread
From: Claude Code Review Bot @ 2026-03-10 2:46 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Problem analysis**: Correct and well-explained. Under `CONFIG_DRM_USE_DYNAMIC_DEBUG=Y`, `__drm_debug_enabled(category)` expands to `true` (confirmed at `linux/include/drm/drm_print.h:161`). Since `__drm_printfn_dbg()` is a shared helper not behind a dyndbg static-key guard, using `__drm_debug_enabled()` means the check is always bypassed, causing all debug output to print.
**First hunk** (line 474-475):
```c
- if (!__drm_debug_enabled(category))
+ if (!drm_debug_enabled_instrumented(category))
```
**Concern**: `drm_debug_enabled_instrumented()` includes a `pr_debug("todo: is this frequent enough to optimize ?\n")` call (confirmed at `linux/include/drm/drm_print.h:149-153`). This is a debugging/instrumentation macro that was clearly intended for development purposes — it emits a `pr_debug` on every invocation. While the commit message acknowledges this ("It also adds a pr_debug() to help track the frequency of this bit-test"), using an instrumentation/development macro in production code is questionable. This `pr_debug` could itself become a performance issue if someone enables it, and the "todo:" text in the pr_debug string suggests it's not meant for permanent use.
**Recommendation**: Use `drm_debug_enabled()` here instead. Under `CONFIG_DRM_USE_DYNAMIC_DEBUG=Y`, `drm_debug_enabled()` is already defined as `drm_debug_enabled_instrumented()` (line 162), but under the `!CONFIG_DRM_USE_DYNAMIC_DEBUG` path, it maps to `drm_debug_enabled_raw()` which is the simple bitmask check without the pr_debug overhead. However, the real fix for `__drm_printfn_dbg()` is just to use `drm_debug_enabled()` — both paths do a real bitmask check, which is exactly what's needed here.
Actually, looking more carefully: `drm_debug_enabled()` under `CONFIG_DRM_USE_DYNAMIC_DEBUG=Y` already equals `drm_debug_enabled_instrumented()`, so using `drm_debug_enabled()` would give the same result but is the canonical API. Using `drm_debug_enabled_instrumented()` directly bypasses the abstraction and couples the code to an internal implementation detail.
**Second hunk** (line 483-484):
```c
- if (!__drm_debug_enabled(category))
+ if (!drm_debug_enabled(category))
```
This change to `__drm_dev_dbg()` is correct and harmless. As the commit message notes, `__drm_dev_dbg()` is wrapped by dyndbg so the `__drm_debug_enabled()` → `true` shortcut is actually valid there. But switching to `drm_debug_enabled()` adds a redundant (but cheap) bitmask check and is cleaner. Fine as-is.
**Overall for patch 2**: The fix is correct in principle but the first hunk should use `drm_debug_enabled()` rather than `drm_debug_enabled_instrumented()`. The `_instrumented` variant with its embedded `pr_debug("todo: ...")` is clearly a temporary diagnostic tool, not something that should be baked into production code paths.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-03-10 2:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-08 22:35 [PATCH 0/2] 2 bugfixes found in DRM-CI Jim Cromie
2026-03-08 22:35 ` [PATCH 1/2] drm_buddy: fix power-of-2 rounding errs Jim Cromie
2026-03-10 2:46 ` Claude review: " Claude Code Review Bot
2026-03-08 22:35 ` [PATCH 2/2] drivers/gpu/drm/drm_print: fix drm_printer dynamic debug bypass Jim Cromie
2026-03-10 2:46 ` Claude review: " Claude Code Review Bot
2026-03-10 2:46 ` Claude review: 2 bugfixes found in DRM-CI Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox