* Claude review: cgroup/dmem: introduce a peak file
2026-05-06 11:58 [PATCH 0/2] " Thadeu Lima de Souza Cascardo
2026-05-06 11:58 ` [PATCH 2/2] " Thadeu Lima de Souza Cascardo
@ 2026-05-07 3:37 ` Claude Code Review Bot
1 sibling, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-07 3:37 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: cgroup/dmem: introduce a peak file
Author: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Patches: 6
Reviewed: 2026-05-07T13:37:55.340313
---
This 2-patch series adds a `dmem.peak` cgroup file, mirroring the existing `memory.peak` functionality for the DRM device-memory cgroup controller. Patch 1 extracts the peak-reset logic from memcontrol into a shared helper; Patch 2 uses it to implement per-region peak tracking in dmem.
**The overall approach is sound and the core logic is correct.** The reference counting is carefully handled, concurrent read/write races are safe due to appropriate use of `READ_ONCE`/`WRITE_ONCE` and fallback to global watermark, and the per-fd peak semantics correctly mirror the existing memory.peak behavior.
There are a few design concerns worth discussing, mainly around the addition of a dmem-specific `pool` pointer into the generic `cgroup_of_peak` struct, and the unused `seq_file *` parameter added to four getter functions. Neither is a correctness issue, but both are layering concerns that reviewers from the cgroup/mm side may push back on.
No bugs found. The series is in good shape for a v1.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: cgroup/dmem: introduce a peak file
2026-05-06 11:58 ` [PATCH 2/2] " Thadeu Lima de Souza Cascardo
@ 2026-05-07 3:37 ` Claude Code Review Bot
0 siblings, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-07 3:37 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
This is the main patch, adding `dmem.peak` file support with per-region, per-fd peak tracking.
**Correctness: Good.** The peak tracking semantics are correct:
- Global watermark (`pool->cnt.watermark`) is reported for regions the fd has not reset.
- Per-fd peak (`max(ofp->value, pool->cnt.local_watermark)`) is reported for the region the fd has reset.
- Writing a different region name correctly "detaches" from the old region (reverting to global watermark) and "attaches" to the new one.
**Reference counting is correct:** `get_cg_pool_unlocked()` acquires a pool ref that is transferred to `ofp->pool`. `dmem_cgroup_region_peak_remove()` drops it. The region ref from `dmemcg_get_region_by_name()` is dropped at `out_put`. The release path handles both the fast path (no writes) and the watcher-cleanup path.
**Design concerns:**
1. **`pool` pointer in generic `cgroup_of_peak` struct:**
```c
struct cgroup_of_peak {
unsigned long value;
struct list_head list;
+ struct dmem_cgroup_pool_state *pool;
};
```
This adds a DRM-subsystem-specific pointer to a struct used by the generic cgroup infrastructure and the memory controller. Every `memory.peak` open file now carries an unused `struct dmem_cgroup_pool_state *pool`. While harmless (it's zero-initialized via `kzalloc` of the containing `cgroup_file_ctx`), this is a layering violation that cgroup/mm maintainers may object to. Consider using a wrapper struct in dmem.c or adding a `void *private` to `cgroup_of_peak` instead.
2. **Unused `struct seq_file *sf` parameter added to four functions:**
```c
-static u64 get_resource_low(struct dmem_cgroup_pool_state *pool)
+static u64 get_resource_low(struct seq_file *sf, struct dmem_cgroup_pool_state *pool)
```
`get_resource_low()`, `get_resource_min()`, `get_resource_max()`, and `get_resource_current()` all gain an `sf` parameter they never use, solely to match the callback signature needed by `get_resource_peak()`. This is a code smell. An alternative would be to have the show function pass a context struct or to handle the peak case with a separate callback type. At minimum, marking the parameter `__always_unused` would signal the intent.
3. **`get_resource_peak()` reads `ofp` without any lock:**
```c
static u64 get_resource_peak(struct seq_file *sf, struct dmem_cgroup_pool_state *pool)
{
struct cgroup_of_peak *ofp = of_peak(sf->private);
...
of_pool = READ_ONCE(ofp->pool);
fd_peak = READ_ONCE(ofp->value);
if (of_pool != pool || fd_peak == OFP_PEAK_UNSET)
peak = pool->cnt.watermark;
else
peak = max(fd_peak, READ_ONCE(pool->cnt.local_watermark));
```
This is intentionally lockless for the read path, relying on `READ_ONCE` for individual fields. There is a window where `of_pool` could point to a freed pool (between `dmem_cgroup_region_peak_remove()` setting it to NULL and a concurrent read having already loaded the old value). However, since the pointer is only used for an equality comparison against the valid `pool` from the iterator — never dereferenced — this is safe. The fallback to `pool->cnt.watermark` on mismatch makes this robust. This is fine but worth a brief comment in the code explaining why the lockless read is safe.
4. **The `dmem_cgroup_region_peak_remove()` function does not check `ofp->value == OFP_PEAK_UNSET`:**
```c
static void dmem_cgroup_region_peak_remove(struct cgroup_of_peak *ofp)
{
struct dmem_cgroup_pool_state *pool;
...
pool = xchg(&ofp->pool, NULL);
if (!pool)
return;
...
```
It relies on `ofp->pool == NULL` for the "no prior write" case, which is correct since `pool` is only set in the write path. But `dmem_cgroup_region_peak_release()` additionally checks `ofp->value == OFP_PEAK_UNSET` as a fast path before calling `remove`. This is slightly redundant but harmless — the two checks are equivalent for the no-write case.
5. **Documentation is good:**
```
+ dmem.peak
+ A readwrite nested-keyed file that exists on non-root cgroups.
+ ...
```
Clear and follows the existing documentation style. The semantics are well-described.
6. **Initialization in `alloc_pool_single()`:**
```c
+ INIT_LIST_HEAD(&pool->peaks);
```
Correctly initializes the peaks watcher list for new pools. The `spin_lock_init(&dmemcs->peaks_lock)` in `dmemcs_alloc()` is also correct.
**One question for the author:** When multiple fds reset the same region, the `local_watermark` is lowered to current usage on each reset. This means fd1's peak could temporarily appear lower than its actual peak-since-reset if fd2 resets afterward at a lower usage point — until new allocations push `local_watermark` back up. This is the same behavior as `memory.peak` so it's consistent, but worth confirming that this is the intended semantic for dmem as well, since device memory allocation patterns may differ from system memory.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3] cgroup/dmem: introduce a peak file
@ 2026-05-14 17:36 Thadeu Lima de Souza Cascardo
2026-05-16 0:32 ` Claude review: " Claude Code Review Bot
2026-05-16 0:32 ` Claude Code Review Bot
0 siblings, 2 replies; 7+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2026-05-14 17:36 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Michal Hocko,
Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
Jonathan Corbet, Shuah Khan, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tvrtko Ursulin
Cc: cgroups, linux-kernel, linux-mm, linux-doc, dri-devel, kernel-dev,
Thadeu Lima de Souza Cascardo
Just like we have memory.peak, introduce a dmem.peak, which uses the
page_counter support for that.
For now, make it read-only.
This allows for memory usage monitoring without polling dmem.current when
the information needed is the maximum device memory used. That can be used
for capacity planning, such that dmem.max can be properly setup for a given
workload. It can also be used for debugging to determine whether a given
workload would have caused eviction or system memory use.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
---
Changes in v3:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
- Link to v2: https://patch.msgid.link/20260513-dmem_peak-v2-1-dac06999db9e@igalia.com
Changes in v2:
- Make it read-only for now and adjust documentation accordingly.
- Link to v1: https://patch.msgid.link/20260506-dmem_peak-v1-0-8d803eb3449c@igalia.com
---
Documentation/admin-guide/cgroup-v2.rst | 6 ++++++
kernel/cgroup/dmem.c | 15 +++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 6efd0095ed99..d103623b2be4 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2808,6 +2808,12 @@ DMEM Interface Files
The semantics are the same as for the memory cgroup controller, and are
calculated in the same way.
+ dmem.peak
+ A read-only nested-keyed file that exists on non-root cgroups.
+
+ The max device memory usage recorded for the cgroup and its
+ descendants since the creation of the cgroup for each region.
+
dmem.capacity
A read-only file that describes maximum region capacity.
It only exists on the root cgroup. Not all memory can be
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 4753a67d0f0f..6430c7ce1e03 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -182,6 +182,11 @@ static u64 get_resource_current(struct dmem_cgroup_pool_state *pool)
return pool ? page_counter_read(&pool->cnt) : 0;
}
+static u64 get_resource_peak(struct dmem_cgroup_pool_state *pool)
+{
+ return pool ? READ_ONCE(pool->cnt.watermark) : 0;
+}
+
static void reset_all_resource_limits(struct dmem_cgroup_pool_state *rpool)
{
set_resource_min(rpool, 0);
@@ -808,6 +813,11 @@ static int dmemcg_limit_show(struct seq_file *sf, void *v,
return 0;
}
+static int dmem_cgroup_region_peak_show(struct seq_file *sf, void *v)
+{
+ return dmemcg_limit_show(sf, v, get_resource_peak);
+}
+
static int dmem_cgroup_region_current_show(struct seq_file *sf, void *v)
{
return dmemcg_limit_show(sf, v, get_resource_current);
@@ -856,6 +866,11 @@ static struct cftype files[] = {
.name = "current",
.seq_show = dmem_cgroup_region_current_show,
},
+ {
+ .name = "peak",
+ .seq_show = dmem_cgroup_region_peak_show,
+ .flags = CFTYPE_NOT_ON_ROOT,
+ },
{
.name = "min",
.write = dmem_cgroup_region_min_write,
---
base-commit: d3b0a7f21119f5a66cb76aa28fb8cc13206aaf7d
change-id: 20260409-dmem_peak-3abc1be95072
Best regards,
--
Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Claude review: cgroup/dmem: introduce a peak file
2026-05-14 17:36 [PATCH v3] cgroup/dmem: introduce a peak file Thadeu Lima de Souza Cascardo
2026-05-16 0:32 ` Claude review: " Claude Code Review Bot
@ 2026-05-16 0:32 ` Claude Code Review Bot
1 sibling, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-16 0:32 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: cgroup/dmem: introduce a peak file
Author: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Patches: 1
Reviewed: 2026-05-16T10:32:53.178730
---
This is a single-patch series (v3) that adds a read-only `dmem.peak` file to the dmem cgroup controller, mirroring the existing `memory.peak` concept. The implementation is clean and minimal — it reuses the existing `page_counter` watermark infrastructure and the `dmemcg_limit_show` callback pattern already established for `dmem.current`.
The patch is **correct and safe to merge** with minor nits. The watermark is already maintained hierarchically by `page_counter_try_charge()` / `page_counter_charge()`, so no additional charge-path changes are needed. The `READ_ONCE` usage is appropriate and matches how the memory controller reads the watermark.
**Issues worth noting:**
1. **Incomplete changelog**: The v3 changelog still contains `EDITME` placeholders — this should be fixed before merge, though it's metadata only.
2. **No selftest coverage**: There's no selftest for this new file, and no existing dmem selftest infrastructure was found. A basic test would strengthen the patch.
3. **Read-only vs. read-write**: The commit message says "for now, make it read-only." The memory controller's `memory.peak` is read-write (writing resets the peak per-FD). This is fine as stated — reset support can come later — but the documentation should probably mention this difference more explicitly or note that reset support is planned.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: cgroup/dmem: introduce a peak file
2026-05-14 17:36 [PATCH v3] cgroup/dmem: introduce a peak file Thadeu Lima de Souza Cascardo
@ 2026-05-16 0:32 ` Claude Code Review Bot
2026-05-16 0:32 ` Claude Code Review Bot
1 sibling, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-16 0:32 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Correctness: Good**
The core implementation is sound:
```c
static u64 get_resource_peak(struct dmem_cgroup_pool_state *pool)
{
return pool ? READ_ONCE(pool->cnt.watermark) : 0;
}
```
This correctly handles the NULL pool case (returns 0, matching the `get_resource_current` pattern) and uses `READ_ONCE` to read the racy watermark, which is the established pattern in `mm/page_counter.c` (line 103: `if (new > READ_ONCE(c->watermark))`).
The watermark is updated hierarchically in `page_counter_try_charge()` and `page_counter_charge()`, walking the parent chain, so the "cgroup and its descendants" claim in the documentation is correct — charging to a leaf updates the watermark on all ancestors.
**Flag choice: Correct**
```c
{
.name = "peak",
.seq_show = dmem_cgroup_region_peak_show,
.flags = CFTYPE_NOT_ON_ROOT,
},
```
Using `CFTYPE_NOT_ON_ROOT` is consistent with `min`, `low`, and `max` in this controller. The root cgroup's peak is not meaningful (it would just track total system device memory usage, which `capacity` already covers from a different angle).
**Nit — v3 changelog has EDITME placeholders:**
```
Changes in v3:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
```
This is a b4 template artifact that wasn't filled in. It doesn't affect the code, but it's sloppy and should be fixed before the patch is applied. The maintainer will likely ask for a resend.
**Nit — Documentation placement:**
The `dmem.peak` documentation is placed between `dmem.current` and `dmem.capacity`, which is a good logical grouping (current/peak are related usage metrics). The text is clear and mirrors the `memory.peak` description style.
**Observation — No selftest:**
The patch doesn't include a selftest. While there's no existing dmem selftest infrastructure to extend, a simple test (create cgroup, allocate device memory, read peak, verify peak >= current) would be valuable, especially since peak tracking is the kind of feature that could silently regress.
**Observation — Future write-to-reset:**
The commit message mentions "for now, make it read-only." When write-to-reset is added later, it will likely need to use `local_watermark` (like `memory.peak` does for per-FD resets via `page_counter_reset_watermark()`). The current implementation reading `watermark` (the global one) is correct for a read-only peak file, so no changes are needed now.
**Overall: The patch is a clean, minimal, and correct addition. Fix the EDITME changelog and consider adding a selftest, otherwise this is ready.**
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: cgroup/dmem: introduce a peak file
2026-05-13 18:58 [PATCH v2] " Thadeu Lima de Souza Cascardo
2026-05-16 1:20 ` Claude review: " Claude Code Review Bot
@ 2026-05-16 1:20 ` Claude Code Review Bot
1 sibling, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-16 1:20 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: cgroup/dmem: introduce a peak file
Author: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Patches: 2
Reviewed: 2026-05-16T11:20:49.360654
---
This is a clean, well-scoped single patch that adds a `dmem.peak` cgroup file to the device memory (dmem) cgroup controller, mirroring the existing `memory.peak` pattern. The implementation correctly reuses the existing `dmemcg_limit_show` callback infrastructure and the `page_counter` watermark tracking that is already in place. The patch is minimal and follows established patterns in the codebase.
**Verdict: Looks good overall.** A couple of minor observations below but nothing blocking.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
* Claude review: cgroup/dmem: introduce a peak file
2026-05-13 18:58 [PATCH v2] " Thadeu Lima de Souza Cascardo
@ 2026-05-16 1:20 ` Claude Code Review Bot
2026-05-16 1:20 ` Claude Code Review Bot
1 sibling, 0 replies; 7+ messages in thread
From: Claude Code Review Bot @ 2026-05-16 1:20 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**Approach:** The patch adds a `get_resource_peak()` getter that reads `pool->cnt.watermark`, a `dmem_cgroup_region_peak_show()` display function, and a cftype entry — all following the exact same pattern as `get_resource_current()` / `dmem_cgroup_region_current_show()`.
**Positive:**
1. The `READ_ONCE` usage in `get_resource_peak()` is correct and consistent with how `page_counter_charge()` in `mm/page_counter.c` uses `WRITE_ONCE` to update the watermark. This is the same deliberate-racy-but-acceptable pattern documented in the page counter code.
2. The null check (`pool ? ... : 0`) mirrors `get_resource_current()` exactly.
3. `CFTYPE_NOT_ON_ROOT` is the correct flag — this matches `memory.peak` in the memory controller and is consistent with how `dmem.min`, `dmem.low`, and `dmem.max` are configured.
**Observations:**
1. **`current` lacks `CFTYPE_NOT_ON_ROOT` but `peak` has it.** Looking at the existing tree, `dmem.current` does *not* set `CFTYPE_NOT_ON_ROOT`, so it appears on root. The new `dmem.peak` does set `CFTYPE_NOT_ON_ROOT`, so it won't appear on root. This is consistent with how `memory.current` (no flag) and `memory.peak` (has `CFTYPE_NOT_ON_ROOT`) work in `mm/memcontrol.c`, so the asymmetry is intentional and correct.
2. **No write/reset support.** The commit message and documentation say "read-only for now," which is fine for an initial version. The `memory.peak` controller has grown `peak_open`/`peak_release`/`memory_peak_write` for per-fd local watermark tracking and reset support. If that's planned as a follow-up, it might be worth mentioning in the commit message for future reference, but this is a nit — the v1→v2 changelog already documents the deliberate decision to start read-only.
3. **Documentation placement is correct.** The `dmem.peak` entry is inserted between `dmem.max/min/low` and `dmem.capacity`, which is a natural location grouping the per-cgroup files together before the root-only `dmem.capacity`.
4. **Watermark initialization.** The `page_counter_init()` function zeroes `usage` and sets `max` but does not explicitly initialize `watermark` or `local_watermark`. Since `page_counter` structs in `dmem_cgroup_pool_state` are allocated with `kzalloc` (which zero-initializes), the watermark starts at 0, which is correct for a "max seen so far" semantic. This is fine but worth noting for anyone reading the patch.
**No functional issues found.** The patch is a straightforward, correct extension of the existing dmem cgroup interface.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-05-16 1:20 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-14 17:36 [PATCH v3] cgroup/dmem: introduce a peak file Thadeu Lima de Souza Cascardo
2026-05-16 0:32 ` Claude review: " Claude Code Review Bot
2026-05-16 0:32 ` Claude Code Review Bot
-- strict thread matches above, loose matches on Subject: below --
2026-05-13 18:58 [PATCH v2] " Thadeu Lima de Souza Cascardo
2026-05-16 1:20 ` Claude review: " Claude Code Review Bot
2026-05-16 1:20 ` Claude Code Review Bot
2026-05-06 11:58 [PATCH 0/2] " Thadeu Lima de Souza Cascardo
2026-05-06 11:58 ` [PATCH 2/2] " Thadeu Lima de Souza Cascardo
2026-05-07 3:37 ` Claude review: " Claude Code Review Bot
2026-05-07 3:37 ` Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox