From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: Re: [PATCH v2] cgroup/dmem: implement dmem.high soft limit via prioritized eviction Date: Mon, 25 May 2026 19:12:32 +1000 Message-ID: In-Reply-To: References: X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review **Design concern: `try_high` bypasses `emin`/`elow` protections** In `dmem_cgroup_state_evict_valuable()`, when `try_high` is true, the funct= ion short-circuits before the `emin`/`elow` protection checks: ```c + if (try_high) + return used > READ_ONCE(ctest->high); + dmem_cgroup_calculate_protection(limit_pool, test_pool); - used =3D page_counter_read(ctest); min =3D READ_ONCE(ctest->emin); if (used <=3D min) return false; ``` This means pass 1 will target a cgroup that is above `high` even if it is w= ithin its `emin` guarantee. In the memory cgroup controller, reclaim trigge= red by `memory.high` still respects `memory.min` protection =E2=80=94 `min`= is an absolute guarantee. The dmem controller documents `min` the same way= (via `page_counter` semantics). If the intent is to make `high` override `= min`, that should be explicitly documented and justified; otherwise the `tr= y_high` path should check `emin` too. Consider: ```c if (try_high) { if (used <=3D READ_ONCE(ctest->emin)) return false; return used > READ_ONCE(ctest->high); } ``` Note that if `high >=3D min` is always true in practice (which is the expec= ted configuration), this wouldn't change behavior, but it preserves the `mi= n` guarantee contract for correctness. **Design concern: no `memory.events` style tracking for high limit hits** The memory cgroup controller tracks `high` limit events via `memory.events`= . There is no equivalent tracking here (e.g., counting how many times BOs w= ere evicted due to a cgroup being above its `high` limit). This would be us= eful for user-space monitoring of whether the `high` limit is having an eff= ect. **Missing documentation** There is no update to `Documentation/admin-guide/cgroup-v2.rst` or similar = documentation file describing the new `dmem.high` control file, its semanti= cs, default value, and interaction with `min`/`low`/`max`. Kernel cgroup in= terfaces require documentation for new control knobs. **Missing selftests** No selftest or test case is provided. At minimum, a test verifying that BOs= from a cgroup over `high` are evicted before BOs from a cgroup under `high= ` would demonstrate the feature works. **Typo in commit message** > Pass 2 falls back to the standard above-elow eviction. "above-elow" should be "above effective-low" or "above-elow (effective low)= ". **Pass 1 early exit catches errors too** ```c + evict_walk.try_high =3D true; lret =3D ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1); + evict_walk.try_high =3D false; + if (lret) + goto out; ``` `lret` is negative on error, positive on success. The `if (lret)` check exi= ts on both. If pass 1 encounters an error (e.g., `-ENOSPC` mapped to `-EBUS= Y` in the callback), this skips the standard passes 2 and 3, which differ i= n what they're willing to evict. In the original code, an error in the tryl= ock pass would still fall through to `if (lret || !ticket) goto out;`, so t= his is a minor behavioral change: an error that previously reached the `hit= _low` retry no longer does. In practice this is unlikely to matter since er= rors in LRU walks are typically terminal, but it's worth noting. **`used` read hoisted above protection calculation =E2=80=94 correct but wo= rth noting** ```c ctest =3D &test_pool->cnt; + used =3D page_counter_read(ctest); + ... + if (try_high) + return used > READ_ONCE(ctest->high); + dmem_cgroup_calculate_protection(limit_pool, test_pool); - used =3D page_counter_read(ctest); ``` Moving `page_counter_read()` above `dmem_cgroup_calculate_protection()` is = safe since the read doesn't depend on the protection calculation. Both the = `try_high` and non-`try_high` paths now use the same `used` value, which is= consistent. **Cgroupfs file registration looks correct** The `high` entry is placed between `low` and `max` in the `cftype files[]` = array, matching the cgroup v2 convention: ```c + { + .name =3D "high", + .write =3D dmem_cgroup_region_high_write, + .seq_show =3D dmem_cgroup_region_high_show, + .flags =3D CFTYPE_NOT_ON_ROOT, + }, ``` **Initialization is correct** `page_counter_init()` already sets `high =3D PAGE_COUNTER_MAX`, and `reset_= all_resource_limits()` now also resets it: ```c + set_resource_high(rpool, PAGE_COUNTER_MAX); ``` This means existing pools and newly created pools will both default to unli= mited, so the feature is opt-in. Correct. **Stub function for !CONFIG_CGROUP_DMEM is correct** ```c bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit= _pool, struct dmem_cgroup_pool_state *test_pool, - bool ignore_low, bool *ret_hit_low) + bool try_high, bool ignore_low, bool *ret_hit_low) { return true; } ``` When dmem cgroup is disabled, the stub ignores `try_high` and returns `true= `, so pass 1 behaves identically to pass 2 =E2=80=94 first eligible BO gets= evicted. This adds one extra (no-op) LRU walk but is functionally correct. **Summary**: The approach is sound and a significant improvement over v1. T= he main concerns are: (1) the `try_high` path should respect `emin` protect= ion to maintain the `min` guarantee contract, (2) documentation and selftes= ts are needed, and (3) consider adding event tracking. The code is otherwis= e clean and correctly uses existing infrastructure. --- Generated by Claude Code Patch Reviewer