From: Natalie Vock <natalie.vock@gmx.de>
To: Maarten Lankhorst <dev@lankhorst.se>,
Maxime Ripard <mripard@kernel.org>, Tejun Heo <tj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Koutný <mkoutny@suse.com>,
Christian Koenig <christian.koenig@amd.com>,
Huang Rui <ray.huang@amd.com>,
Matthew Auld <matthew.auld@intel.com>,
Matthew Brost <matthew.brost@intel.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Tvrtko Ursulin <tursulin@ursulin.net>
Cc: cgroups@vger.kernel.org, dri-devel@lists.freedesktop.org,
Natalie Vock <natalie.vock@gmx.de>,
Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Subject: [PATCH v5 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases
Date: Mon, 02 Mar 2026 13:37:02 +0100
Message-ID: <20260302-dmemcg-aggressive-protect-v5-0-ffd3a2602309@gmx.de>
Hi all,
I've been looking into some cases where dmem protection fails to prevent
allocations from ending up in GTT when VRAM gets scarce and apps start
competing hard.
In short, this is because other (unprotected) applications end up
filling VRAM before protected applications do. This causes TTM to back
off and try allocating in GTT before anything else, and that is where
the allocation is placed in the end. The existing eviction protection
cannot prevent this, because no attempt at evicting is ever made
(although you could consider the backing-off as an immediate eviction to
GTT).
This series tries to alleviate this by adding a special case when the
allocation is protected by cgroups: Instead of backing off immediately,
TTM will try evicting unprotected buffers from the domain to make space
for the protected one. This ensures that applications can actually use
all the memory protection awarded to them by the system, without being
prone to ping-ponging (only protected allocations can evict unprotected
ones, never the other way around).
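As a rough illustration of the rule above, here is a toy userspace model
(not the TTM code; every name in it is made up for this example, and
"protected" is reduced to a single flag per buffer). A protected request
may push unprotected buffers out of VRAM, while an unprotected request
falls back to GTT without evicting anything:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_place { TOY_VRAM, TOY_GTT };

/* Hypothetical buffer record, not a TTM structure. */
struct toy_buf {
	size_t size;
	bool is_protected;	/* assumed: covered by min/low protection */
	enum toy_place place;
};

/* Sum the sizes of buffers in VRAM, optionally only the unprotected ones. */
size_t toy_vram_bytes(const struct toy_buf *bufs, size_t n,
		      bool unprotected_only)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (bufs[i].place != TOY_VRAM)
			continue;
		if (unprotected_only && bufs[i].is_protected)
			continue;
		total += bufs[i].size;
	}
	return total;
}

/* Place req in VRAM if possible, evicting only on behalf of protected requests. */
enum toy_place toy_place_buf(struct toy_buf *bufs, size_t n, size_t vram_size,
			     struct toy_buf *req)
{
	size_t used = toy_vram_bytes(bufs, n, false);

	assert(used <= vram_size);
	size_t free_bytes = vram_size - used;

	/* Only evict if the request is protected and eviction can succeed. */
	if (req->size > free_bytes && req->is_protected &&
	    req->size <= free_bytes + toy_vram_bytes(bufs, n, true)) {
		for (size_t i = 0; i < n && req->size > free_bytes; i++) {
			if (bufs[i].place == TOY_VRAM && !bufs[i].is_protected) {
				bufs[i].place = TOY_GTT;
				free_bytes += bufs[i].size;
			}
		}
	}

	req->place = req->size <= free_bytes ? TOY_VRAM : TOY_GTT;
	return req->place;
}
```

Because eviction only ever runs for protected requests and only ever
targets unprotected buffers, two buffers cannot keep evicting each other
in this model.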
The first two patches just add a few small utilities to the dmem
controller that are needed to implement this. The other patches are the
TTM implementation:
"drm/ttm: Split cgroup charge..." decouples cgroup charging from resource
allocation to allow us to hold on to the charge even if allocation fails
on the first try, and "drm/ttm: Be more aggressive..." adds a path to
call ttm_bo_evict_alloc when the charged allocation falls within min/low
protection limits.
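The control flow described above can be sketched roughly as follows. This
is a simplified userspace model, not the actual TTM or dmem code: the
toy_* names, the structures and the error codes chosen here are
assumptions made for the example. The point is only that the charge is
taken once, kept across the eviction retry, and dropped on final failure:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not the dmem/TTM structures. */
struct toy_pool {
	size_t usage;	/* bytes currently charged */
	size_t max;	/* hard limit */
};

struct toy_domain {
	size_t size;		/* total bytes in the domain */
	size_t used;		/* bytes currently placed */
	size_t evictable;	/* part of used held by unprotected buffers */
};

int toy_charge(struct toy_pool *pool, size_t size)
{
	assert(pool->usage <= pool->max);
	if (size > pool->max - pool->usage)
		return -EAGAIN;
	pool->usage += size;
	return 0;
}

void toy_uncharge(struct toy_pool *pool, size_t size)
{
	assert(pool->usage >= size);
	pool->usage -= size;
}

int toy_place(struct toy_domain *dom, size_t size)
{
	if (size > dom->size - dom->used)
		return -ENOSPC;
	dom->used += size;
	return 0;
}

/* Evict up to want bytes of unprotected memory; returns bytes freed. */
size_t toy_evict(struct toy_domain *dom, size_t want)
{
	size_t freed = want < dom->evictable ? want : dom->evictable;

	dom->evictable -= freed;
	dom->used -= freed;
	return freed;
}

int toy_alloc(struct toy_pool *pool, struct toy_domain *dom, size_t size,
	      bool below_protection)
{
	int ret = toy_charge(pool, size);

	if (ret)
		return ret;

	ret = toy_place(dom, size);
	if (ret == -ENOSPC && below_protection) {
		/* The charge is still held while space is being freed. */
		toy_evict(dom, size - (dom->size - dom->used));
		ret = toy_place(dom, size);
	}

	/* Drop the charge only once the allocation has definitely failed. */
	if (ret)
		toy_uncharge(pool, size);
	return ret;
}
```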
"drm/ttm: Use common ancestor..." is a more general improvement in
correctly implementing cgroup protection semantics. With recursive
protection rules, unused memory protection afforded to a parent node is
transferred to children recursively, which helps protect entire
subtrees from stealing each other's memory without needing to protect
each cgroup individually. This doesn't apply when considering direct
siblings inside the same subtree, so in order to not break
prioritization between these siblings, we need to consider the
relationship of evictor and evictee when calculating protection.
In practice, this fixes cases where a protected cgroup cannot steal
memory from unprotected siblings (which, in turn, leads to eviction
failures and new allocations being placed in GTT).
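For readers unfamiliar with the idea, finding the common ancestor of the
evictor and the evictee can be sketched as below. This is a toy model
with a hypothetical node type, not the helper added by this series and
not the kernel's struct cgroup; it also ignores the reference counting
that the real code has to do:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node; not the kernel's struct cgroup. */
struct toy_cgroup {
	struct toy_cgroup *parent;	/* NULL for the root */
	int level;			/* depth in the tree, root is 0 */
};

/*
 * Bring the deeper node up to the level of the other one, then walk both
 * up in lockstep until they meet. Returns NULL if the nodes do not share
 * a root.
 */
struct toy_cgroup *toy_common_ancestor(struct toy_cgroup *a,
				       struct toy_cgroup *b)
{
	assert(a && b);

	while (a->level > b->level)
		a = a->parent;
	while (b->level > a->level)
		b = b->parent;

	while (a != b) {
		a = a->parent;
		b = b->parent;
		if (!a || !b)
			return NULL;
	}
	return a;
}
```

For two direct siblings this returns their shared parent, so protection
would be evaluated relative to that parent instead of the root of the
hierarchy.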
Thanks,
Natalie
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
---
Changes in v5:
- Added cgroup_common_ancestor helper to use with
  dmem_cgroup_common_ancestor (Tejun)
  - Note: "drm/ttm: Use common ancestor..." needed minor changes since
    dmem_cgroup_common_ancestor now grabs a reference to the ancestor
    pool, which needs to be dropped after use
- Removed extraneous whitespace in "drm/ttm: Split cgroup charge..."
  and unnecessary changes done in "drm/ttm: Extract code..." (Tvrtko)
- Applied a comment from v3 about below_low not needing to be
  initialized in "drm/ttm: Be more aggressive..." (Tvrtko)
- Fixed uncharging the cgroup on allocation failure (Tvrtko)
- Fixed a typo in the message of "drm/ttm: Split cgroup charge..."
  (Tvrtko)
- Added case in ttm_bo_evict_cb for when charging fails, since we need
  to retry the charge (found this myself)
- Link to v4: https://lore.kernel.org/r/20260225-dmemcg-aggressive-protect-v4-0-de847ab35184@gmx.de

Changes in v4:
- Split cgroup charge decoupling and eviction logic changes into
  separate commits (Tvrtko)
- Fix two cases of errno handling in ttm_bo_alloc_place and its caller
  (Tvrtko)
- Improve commit message/description of "drm/ttm: Make a helper..." (now
  "drm/ttm: Extract code...") (Tvrtko)
- Documentation improvements for new TTM eviction logic (Tvrtko)
- Formatting fixes (Tvrtko)
- Link to v3: https://lore.kernel.org/r/20251110-dmemcg-aggressive-protect-v3-0-219ffcfc54e9@gmx.de

Changes in v3:
- Improved documentation around cgroup queries and TTM eviction helpers
  (Maarten)
- Fixed up ttm_alloc_at_place charge failure logic to return either
  -EBUSY or -ENOSPC, not -EAGAIN (found this myself)
- Link to v2: https://lore.kernel.org/r/20251015-dmemcg-aggressive-protect-v2-0-36644fb4e37f@gmx.de

Changes in v2:
- Factored out the ttm logic for charging/allocating/evicting into a
  separate helper to keep things simpler
- Link to v1: https://lore.kernel.org/r/20250915-dmemcg-aggressive-protect-v1-0-2f3353bfcdac@gmx.de
---
Natalie Vock (6):
      cgroup/dmem: Add queries for protection values
      cgroup,cgroup/dmem: Add (dmem_)cgroup_common_ancestor helper
      drm/ttm: Extract code for attempting allocation in a place
      drm/ttm: Split cgroup charge and resource allocation
      drm/ttm: Be more aggressive when allocating below protection limit
      drm/ttm: Use common ancestor of evictor and evictee as limit pool

 drivers/gpu/drm/ttm/ttm_bo.c       | 214 ++++++++++++++++++++++++++++++++-----
 drivers/gpu/drm/ttm/ttm_resource.c |  48 ++++++---
 include/drm/ttm/ttm_resource.h     |   6 +-
 include/linux/cgroup.h             |  21 ++++
 include/linux/cgroup_dmem.h        |  25 +++++
 kernel/cgroup/dmem.c               | 105 +++++++++++++++++-
 6 files changed, 374 insertions(+), 45 deletions(-)
---
base-commit: 61c0f69a2ff79c8f388a9e973abb4853be467127
change-id: 20250915-dmemcg-aggressive-protect-5cf37f717cdb
Best regards,
--
Natalie Vock <natalie.vock@gmx.de>