From: Claude Code Review Bot <claude-review@example.com>
To: dri-devel-reviews@example.com
Subject: Claude review: cgroup/dmem: implement dmem.high soft limit and throttling
Date: Mon, 25 May 2026 22:14:45 +1000
Message-ID: <review-patch1-20260520-feature-dmem-high-v1-1-97ca0cb7f95a@gmail.com>
In-Reply-To: <20260520-feature-dmem-high-v1-1-97ca0cb7f95a@gmail.com>
Patch Review
**Critical: Sleeping inside RCU read-side critical section**
The `__dmem_cgroup_handle_over_high()` function calls `schedule_timeout_killable()` while holding `rcu_read_lock()`. This is illegal: RCU read-side critical sections must not sleep. On kernels built with debug options it will trigger a `BUG: sleeping function called from invalid context` or similar splat. On production kernels it can cause RCU stalls, and where RCU is not preemptible the context switch counts as a quiescent state, so entries on the list may be freed while the loop is still using them.
```c
rcu_read_lock();
list_for_each_entry_rcu(pool, &dmemcs->pools, css_node) {
	unsigned long usage, high;
	usage = page_counter_read(&pool->cnt);
	high = READ_ONCE(pool->cnt.high);
	if (usage > high)
		schedule_timeout_killable(HZ / 10);
}
rcu_read_unlock();
```
Either record the over-high state under RCU and sleep only after `rcu_read_unlock()`, or use a different locking strategy. `mem_cgroup_handle_over_high` is a useful reference for structuring this work: it does not hold RCU across the sleep.
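The first option can be sketched as a two-phase function. The block below is a userspace model, not kernel code: `model_rcu_read_lock()`, `model_rcu_read_unlock()`, `model_sleep()` and `struct model_pool` are hypothetical stand-ins for `rcu_read_lock()`, `rcu_read_unlock()`, `schedule_timeout_killable()` and the pool state, used only to show the ordering.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel primitives; every name here is illustrative. */
static int rcu_depth;     /* > 0 while "inside" the read-side critical section */
static int sleeps_total;  /* number of sleeps performed */
static int sleeps_in_rcu; /* number of sleeps attempted with the lock held */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

static void model_sleep(void)
{
	sleeps_total++;
	if (rcu_depth > 0)
		sleeps_in_rcu++; /* this is the bug the review describes */
}

struct model_pool {
	unsigned long usage;
	unsigned long high;
	struct model_pool *next;
};

/* Phase 1 reads counters under the lock; phase 2 sleeps after unlocking. */
static void model_handle_over_high(const struct model_pool *pools)
{
	const struct model_pool *pool;
	bool over_high = false;

	model_rcu_read_lock();
	for (pool = pools; pool; pool = pool->next) {
		if (pool->usage > pool->high) {
			over_high = true;
			break;
		}
	}
	model_rcu_read_unlock();

	if (over_high)
		model_sleep();
}
```

Note that this model sleeps at most once per call rather than once per over-high pool, which differs from the quoted loop. Whether the delay should scale with the number of pools or the size of the overage is a policy decision for the patch author.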
**Bug: `get_resource_high` returns 0 for NULL pool**
```c
static u64 get_resource_high(struct dmem_cgroup_pool_state *pool)
{
	return pool ? READ_ONCE(pool->cnt.high) : 0;
}
```
When a cgroup has no pool for a particular region, the default should be `PAGE_COUNTER_MAX` (no limit), not `0`. Returning 0 means `dmem.high` will show `0` for regions where no pool exists, which is semantically wrong. Compare with `get_resource_max()` which correctly returns `PAGE_COUNTER_MAX` for NULL pools.
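A minimal userspace sketch of the suggested default follows. `MODEL_PAGE_COUNTER_MAX` is a placeholder value and not the kernel's actual `PAGE_COUNTER_MAX` definition, and `struct hi_pool` and `model_get_resource_high()` are hypothetical names standing in for the pool state and `get_resource_high()`.

```c
#include <limits.h>
#include <stddef.h>

/* Placeholder for PAGE_COUNTER_MAX; the kernel's real value is defined elsewhere. */
#define MODEL_PAGE_COUNTER_MAX ((unsigned long)LONG_MAX)

/* Hypothetical reduced pool state holding only the high limit. */
struct hi_pool {
	unsigned long high;
};

/* A missing pool means "no limit configured", so report the maximum, not 0. */
static unsigned long long model_get_resource_high(const struct hi_pool *pool)
{
	return pool ? pool->high : MODEL_PAGE_COUNTER_MAX;
}
```

With this default, a region without a pool reports the same "no limit" value for the high limit as it does for the max limit.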
**Design issue: No fast-path guard in the inline wrapper**
The memcg version guards the expensive path with a per-task flag:
```c
static inline void mem_cgroup_handle_over_high(gfp_t gfp_mask)
{
	if (unlikely(current->memcg_nr_pages_over_high))
		__mem_cgroup_handle_over_high(gfp_mask);
}
```
This patch's inline wrapper is unconditional:
```c
static inline void dmem_cgroup_handle_over_high(void)
{
	__dmem_cgroup_handle_over_high();
}
```
This means **every** return-to-userspace path will call `task_get_css()` + iterate pools + `css_put()`, even for tasks that have never touched device memory. This is a hot path, so it should have a cheap early exit. Consider adding a per-task flag (similar to `memcg_nr_pages_over_high`) or at minimum a static key that is only enabled when any dmem region is registered.
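The per-task flag variant might look roughly like the userspace model below. The field `dmem_over_high`, `struct model_task`, and both function names are hypothetical; in the kernel the flag would live in `task_struct`, be set at charge time, and the check would likely be wrapped in `unlikely()`.

```c
/* Hypothetical per-task state; in the kernel this would be a task_struct field. */
struct model_task {
	unsigned int dmem_over_high; /* set at charge time when a limit is exceeded */
};

static int slow_path_calls; /* counts entries into the expensive path */

/* Stand-in for the expensive work: css lookup, pool iteration, throttling. */
static void model_slow_path(struct model_task *t)
{
	slow_path_calls++;
	t->dmem_over_high = 0; /* consume the pending request */
}

/* Cheap inline guard: tasks that never exceeded a limit pay one load and branch. */
static inline void model_dmem_resume_hook(struct model_task *t)
{
	if (t->dmem_over_high)
		model_slow_path(t);
}
```

The design point is that the flag is written only on the charge path, so the return-to-userspace path stays cheap for tasks that never use device memory.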
**Design issue: High limit check does not walk the hierarchy**
In `dmem_cgroup_try_charge`, the high check only looks at the leaf pool:
```c
if (page_counter_read(&pool->cnt) > READ_ONCE(pool->cnt.high))
	set_notify_resume(current);
```
If a parent cgroup sets a high limit but the child does not, the child's `pool->cnt.high` remains at `PAGE_COUNTER_MAX` and the parent's limit is never checked. The memcg implementation walks the hierarchy for high limit enforcement. You should walk from the leaf to the root, checking each ancestor's `high` against its own `usage`.
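The leaf-to-root walk can be illustrated with a small userspace model. `struct hier_pool`, its `parent` pointer, `MODEL_NO_LIMIT` and `model_any_ancestor_over_high()` are all hypothetical; the real code would follow whatever parent linkage the dmem pools or their page counters already provide.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder for "no high limit set"; not the kernel's PAGE_COUNTER_MAX. */
#define MODEL_NO_LIMIT ULONG_MAX

/* Hypothetical pool node with a link to its parent cgroup's pool. */
struct hier_pool {
	unsigned long usage;
	unsigned long high;
	struct hier_pool *parent; /* NULL at the root */
};

/* Check each level's usage against that level's own high limit. */
static bool model_any_ancestor_over_high(const struct hier_pool *pool)
{
	for (; pool; pool = pool->parent) {
		if (pool->usage > pool->high)
			return true;
	}
	return false;
}
```

In this model a child with no limit of its own is still reported as over-high when its parent's usage exceeds the parent's limit, which is the case the leaf-only check misses.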
**Nit: Gratuitous whitespace change**
The patch re-aligns `dmem_cgroup_region_max_write` parameter indentation without any functional change:
```c
static ssize_t dmem_cgroup_region_max_write(struct kernfs_open_file *of,
- char *buf, size_t nbytes, loff_t off)
+ char *buf, size_t nbytes, loff_t off)
```
This should be dropped; unrelated formatting changes in functional patches add review noise.
**Minor: EXPORT_SYMBOL_GPL may be unnecessary**
`__dmem_cgroup_handle_over_high` is called only from the inline wrapper in the header, which in turn is called from `resume_user_mode_work()` on the return-to-user path. That code is always built in and never part of a module, so the export appears unnecessary.
**Missing: No event counter or statistics**
The memcg high limit has associated event counters (`MEMCG_HIGH`) that can be observed via `memory.events`. Without analogous accounting, users have no way to observe how often throttling fires, making the feature difficult to tune in practice. This isn't a blocker for v1, but worth considering.
---
Generated by Claude Code Patch Reviewer
Thread overview:
2026-05-20 6:07 [PATCH] cgroup/dmem: implement dmem.high soft limit and throttling Qiliang Yuan
2026-05-20 9:52 ` Tejun Heo
2026-05-21 11:28 ` Qiliang Yuan
2026-05-21 9:45 ` Maarten Lankhorst
2026-05-21 11:28 ` Qiliang Yuan
2026-05-21 10:52 ` Natalie Vock
2026-05-21 11:28 ` Qiliang Yuan
2026-05-25 12:14 ` Claude review: " Claude Code Review Bot
2026-05-25 12:14 ` Claude Code Review Bot [this message]