From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: cgroup/dmem: implement dmem.high soft limit via prioritized eviction Date: Thu, 04 Jun 2026 14:56:52 +1000 Message-ID: In-Reply-To: <20260531-feature-dmem-high-v6-1-20563ecd6dc7@gmail.com> References: <20260531-feature-dmem-high-v6-1-20563ecd6dc7@gmail.com> <20260531-feature-dmem-high-v6-1-20563ecd6dc7@gmail.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit MIME-Version: 1.0 Patch Review **TTM changes (drivers/gpu/drm/ttm/ttm_bo.c):** **Issue 1 (medium): Pass 1 blocking lock without ticket is a no-op, not a trylock concern** ```c + evict_walk.walk.arg.trylock_only = !ticket; + evict_walk.try_high = true; lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1); + evict_walk.try_high = false; + if (lret) + goto out; ``` When `ticket` is available, Pass 1 uses a blocking lock (`trylock_only = false`). When `ticket` is NULL, it falls through to trylock. This is fine for deadlock safety. However, the `if (lret) goto out` means if Pass 1 succeeds (returns positive), we skip Passes 2 and 3 entirely. This is correct behavior -- if we evicted a high-limit BO and got a resource, we're done. **Issue 2 (minor): hit_low is not reset between Pass 1 and Pass 2** Pass 1 could set `evict_walk.hit_low` to true (via `dmem_cgroup_state_evict_valuable` returning false with `ret_hit_low` set). This `hit_low` state then carries into Pass 2. In the original code, `hit_low` was only set during what is now Pass 2, so it was clean. Now there's potential for Pass 1 to set `hit_low`, causing Pass 2 to immediately retry with `try_low = true` even if Pass 2 itself didn't hit the low limit. This is likely harmless (an extra attempt isn't wrong, just unnecessary) but is a subtle behavioral change. Consider resetting `hit_low` before Pass 2: ```c + evict_walk.hit_low = false; /* Reset from Pass 1 */ evict_walk.walk.arg.trylock_only = true; lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1); ``` **Issue 3 (nit): Comment says "above-elow" (typo in changelog too)** The commit message and V1->V2 changelog both contain "above-elow" which should be "above-elow" (effective low). This appears to be a pre-existing typo that's been carried forward through versions. The in-code comment at Pass 2 says: ```c + * Pass 2 (trylock): Evict BOs above the effective low watermark. ``` This is clear enough. **dmem cgroup header changes (include/linux/cgroup_dmem.h):** **The CONFIG_CGROUP_DMEM=n stub looks correct:** ```c + if (try_high) + return false; return true; ``` When cgroup support is disabled, there's no high limit to check, so the `try_high` pass should be a no-op (returning false means "don't evict this one in the high pass"). This was the v6 fix and is correct. **dmem cgroup core changes (kernel/cgroup/dmem.c):** **Issue 4 (correctness concern): Asymmetric min protection in limit_pool == test_pool path** ```c + if (limit_pool == test_pool) { + if (try_high && test_pool) { + ctest = &test_pool->cnt; + used = page_counter_read(ctest); + return used > READ_ONCE(ctest->high); + } return true; + } ``` When `limit_pool == test_pool` and `try_high` is set, the code checks `used > high` but does NOT check `dmem.min` protection. In contrast, the general `try_high` path later in the function does: ```c + dmem_cgroup_calculate_protection(limit_pool, test_pool); + min = READ_ONCE(ctest->emin); + return used > min; ``` This asymmetry means: for the cgroup that hit the limit, even if its usage is below `emin`, it will still be evicted in the high pass as long as `used > high`. This might be intentional (the cgroup that caused the limit hit should be penalized regardless of min protection), but it's undocumented and could surprise users who set both `min` and `high`. **Issue 5 (minor concern): try_high path ignores low watermark** The general `try_high` code block checks `dmem.min` but not `dmem.low`: ```c + dmem_cgroup_calculate_protection(limit_pool, test_pool); + min = READ_ONCE(ctest->emin); + return used > min; ``` The non-try_high path checks both `min` and `low`. This means the high-priority eviction pass is more aggressive than the normal pass -- it will evict BOs that are between `emin` and `elow`. This is probably intentional (the point is to aggressively reclaim from over-high cgroups), but worth noting. **Issue 6 (correctness): Parent chain walk uses page_counter parents, not cgroup parents** ```c + if (used <= READ_ONCE(ctest->high)) { + for (c = ctest->parent; c; c = c->parent) { + if (page_counter_read(c) > READ_ONCE(c->high)) + break; + } + if (!c) + return false; + } ``` This walks `page_counter` parents, which correspond to dmem pool parents, which follow the cgroup hierarchy. This is correct -- it checks if any ancestor cgroup exceeds its `high` limit. If the test_pool itself is under its high limit AND no ancestor is over their high limit, return false (don't evict). If any ancestor is over, continue to the min check. This logic is sound. However, there's a subtlety: if `test_pool->cnt.parent` is NULL (root cgroup), the loop body never executes, `c` is NULL, and the function returns false. For a BO owned by the root cgroup during `try_high`, this means it won't be evicted in Pass 1 (unless its own usage exceeds high). This seems correct -- the root cgroup shouldn't be penalized by the high pass. **Issue 7 (nit): `set_resource_high` doesn't validate against current usage** `page_counter_set_high()` is a simple `WRITE_ONCE` with no validation. Unlike `page_counter_set_max()` which can fail, setting `high` below current usage silently succeeds and immediately makes the cgroup eligible for eviction. This is consistent with how memory cgroup `memory.high` works, but the commit message doesn't mention this behavior. **Issue 8 (style): The cftype entry placement is good** ```c + { + .name = "high", + .write = dmem_cgroup_region_high_write, + .seq_show = dmem_cgroup_region_high_show, + .flags = CFTYPE_NOT_ON_ROOT, + }, ``` Placed between `low` and `max` which matches the natural ordering (min < low < high < max). Correct flags (`CFTYPE_NOT_ON_ROOT` matches the other limit files). **Issue 9 (documentation): No documentation update** There's no update to Documentation/ for the new `dmem.high` control file. The cgroup v2 documentation should describe the semantics of this new knob, especially since it has a non-trivial interaction with `min`, `low`, and `max`. **Overall assessment:** The patch is functionally sound and addresses a real need. The v6 NULL-check fix is correct. The main concerns are: - `hit_low` not being reset between Pass 1 and Pass 2 (minor but could cause unnecessary retries) - Asymmetric `min` protection in the `limit_pool == test_pool` path (design question, should be documented/justified) - Missing documentation for the new cgroup control file - The patch should consider whether it needs an `events` file or similar for observability (how does an admin know evictions happened due to `high`?) --- Generated by Claude Code Patch Reviewer