From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: drm/xe: Avoid shrinker reclaim from kswapd under fragmentation Date: Thu, 23 Apr 2026 09:01:59 +1000 Message-ID: In-Reply-To: <20260421012608.1474950-4-matthew.brost@intel.com> References: <20260421012608.1474950-1-matthew.brost@intel.com> <20260421012608.1474950-4-matthew.brost@intel.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review This adds a fragmentation detection heuristic in `xe_shrinker_scan()`: ```c + if (current_is_kswapd()) { + struct zone *zone =3D &NODE_DATA(sc->nid)->node_zones[ZONE_NORMAL]; + unsigned long free_pages =3D zone_page_state(zone, NR_FREE_PAGES); + unsigned long high_wmark =3D high_wmark_pages(zone); + + /* + * If we have 2x the high watermark free, this is definitely + * fragmentation + */ + if (free_pages > (high_wmark * 2)) + goto out; + } ``` **Placement is good:** The check is positioned after the purgeable-objects = walk but before the expensive eviction/writeback walk. This means cheap-to-= free purgeable objects are still reclaimed, but expensive eviction of activ= e BOs is avoided under fragmentation. This is the right trade-off. **Concerns:** 1. **Hardcoded ZONE_NORMAL:** This assumes ZONE_NORMAL is the relevant zone= to check. On x86_64 systems with limited memory (e.g., all RAM < 4GB), ZON= E_NORMAL could be empty (0 managed pages, 0 free pages, 0 high watermark). = In that case `free_pages > (high_wmark * 2)` evaluates to `0 > 0` which is = false, so the heuristic wouldn't trigger =E2=80=94 this is a safe failure m= ode (no false bailout). However, the heuristic would also never provide its= intended benefit on such systems. Consider checking the zone that actually= has managed pages, or checking `managed_zone(zone)` as a guard. Since Xe t= argets Intel iGPUs which are almost always on x86_64 systems with >4GB RAM,= this is unlikely to matter in practice but is still a fragile assumption. 2. **The 2x multiplier is a magic number:** The comment says "this is defin= itely fragmentation" but 2x is an arbitrary threshold. It would be good to = document why 2x was chosen rather than, say, 1.5x or 3x. If the high waterm= ark is, say, 128MB and we have 260MB free, that's "definitely fragmentation= " per this heuristic =E2=80=94 but it's not an overwhelming amount of free = memory either. Some justification or at least a named constant would improv= e readability. 3. **NUMA considerations:** On multi-node systems, `sc->nid` identifies the= node being reclaimed. Using `NODE_DATA(sc->nid)->node_zones[ZONE_NORMAL]` = checks the right node's ZONE_NORMAL. This is correct for NUMA-aware reclaim. 4. **Return value semantics:** When bailing out via `goto out`, if `nr_scan= ned` is 0, the function returns `SHRINK_STOP`. This tells the VM that the s= hrinker can't free anything, which should prevent repeated calls. If some p= urgeable pages were already freed (`nr_scanned > 0`), it returns `freed`, c= orrectly reporting partial work. Both cases seem correct. 5. **Comment accuracy:** The comment says "this is definitely fragmentation= " but it's a heuristic, not a guarantee. Having lots of free memory in ZONE= _NORMAL while kswapd is running *could* also indicate pressure in other zon= es (DMA, DMA32, MOVABLE). Consider softening to "this is likely fragmentati= on" or "this is a strong signal of fragmentation" (which matches the commit= message wording). **Overall:** The heuristic is a pragmatic solution to a real problem. The Z= ONE_NORMAL assumption and magic constant are defensible for the Intel iGPU = target but should be documented better. --- Generated by Claude Code Patch Reviewer