public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Tvrtko Ursulin <tursulin@ursulin.net>
To: Natalie Vock <natalie.vock@gmx.de>,
	Maarten Lankhorst <dev@lankhorst.se>,
	Maxime Ripard <mripard@kernel.org>, Tejun Heo <tj@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Koutný <mkoutny@suse.com>,
	Christian Koenig <christian.koenig@amd.com>,
	Huang Rui <ray.huang@amd.com>,
	Matthew Auld <matthew.auld@intel.com>,
	Matthew Brost <matthew.brost@intel.com>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>
Cc: cgroups@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v5 5/6] drm/ttm: Be more aggressive when allocating below protection limit
Date: Mon, 2 Mar 2026 17:02:00 +0000
Message-ID: <86ef0e02-ac40-4bd4-bfcb-173d4312acb2@ursulin.net>
In-Reply-To: <20260302-dmemcg-aggressive-protect-v5-5-ffd3a2602309@gmx.de>


On 02/03/2026 12:37, Natalie Vock wrote:
> When the cgroup's memory usage is below the low/min limit and allocation
> fails, try evicting some unprotected buffers to make space. Otherwise,
> application buffers may be forced to go into GTT even though usage is
> below the corresponding low/min limit, if other applications filled VRAM
> with their allocations first.
> 
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c | 52 +++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 47 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 53c4de4bcc1e3..86f99237f6490 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -494,6 +494,10 @@ struct ttm_bo_alloc_state {
>   	struct dmem_cgroup_pool_state *charge_pool;
>   	/** @limit_pool: Which pool limit we should test against */
>   	struct dmem_cgroup_pool_state *limit_pool;
> +	/** @only_evict_unprotected: If only unprotected BOs, i.e. BOs whose cgroup
> +	 *  is exceeding its dmem low/min protection, should be considered for eviction
> +	 */
> +	bool only_evict_unprotected;
>   };
>   
>   /**
> @@ -598,8 +602,12 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>   	evict_walk.walk.arg.trylock_only = true;
>   	lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>   
> -	/* One more attempt if we hit low limit? */
> -	if (!lret && evict_walk.hit_low) {
> +	/* If we failed to find enough BOs to evict, but we skipped over
> +	 * some BOs because they were covered by dmem low protection, retry
> +	 * evicting these protected BOs too, except if we're told not to
> +	 * consider protected BOs at all.
> +	 */
> +	if (!lret && evict_walk.hit_low && !state->only_evict_unprotected) {
>   		evict_walk.try_low = true;
>   		lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>   	}
> @@ -620,7 +628,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
>   	} while (!lret && evict_walk.evicted);
>   
>   	/* We hit the low limit? Try once more */
> -	if (!lret && evict_walk.hit_low && !evict_walk.try_low) {
> +	if (!lret && evict_walk.hit_low && !evict_walk.try_low &&
> +			!state->only_evict_unprotected) {
>   		evict_walk.try_low = true;
>   		goto retry;
>   	}
> @@ -730,7 +739,7 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>   				 struct ttm_resource **res,
>   				 struct ttm_bo_alloc_state *alloc_state)
>   {
> -	bool may_evict;
> +	bool may_evict, below_low;
>   	int ret;
>   
>   	may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM);
> @@ -749,9 +758,42 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
>   		return ret;
>   	}
>   
> +	/*
> +	 * cgroup protection plays a special role in eviction.
> +	 * Conceptually, protection of memory via the dmem cgroup controller
> +	 * entitles the protected cgroup to use a certain amount of memory.
> +	 * There are two types of protection - the 'low' limit is a
> +	 * "best-effort" protection, whereas the 'min' limit provides a hard
> +	 * guarantee that memory within the cgroup's allowance will not be
> +	 * evicted under any circumstance.
> +	 *
> +	 * To faithfully model this concept in TTM, we also need to take cgroup
> +	 * protection into account when allocating. When allocation in one
> +	 * place fails, TTM will default to trying other places first before
> +	 * evicting.
> +	 * If the allocation is covered by dmem cgroup protection, however,
> +	 * this prevents the allocation from using the memory it is "entitled"
> +	 * to. To make sure unprotected allocations cannot push new protected
> +	 * allocations out of places they are "entitled" to use, we should
> +	 * evict buffers not covered by any cgroup protection, if this
> +	 * allocation is covered by cgroup protection.
> +	 *
> +	 * Buffers covered by 'min' protection are a special case - the 'min'
> +	 * limit is a stronger guarantee than 'low', and thus buffers protected
> +	 * by 'low' but not 'min' should also be considered for eviction.
> +	 * Buffers protected by 'min' will never be considered for eviction
> +	 * anyway, so the regular eviction path should be triggered here.
> +	 * Buffers protected by 'low' but not 'min' will take a special
> +	 * eviction path that only evicts buffers covered by neither 'low' or
> +	 * 'min' protections.
> +	 */
> +	may_evict |= dmem_cgroup_below_min(NULL, alloc_state->charge_pool);

It may make sense to group the two lines which "calculate" may_evict
together, which would probably also mean pulling the two lines below up
to before the try charge, so that the whole logical block is not split.

> +	below_low = dmem_cgroup_below_low(NULL, alloc_state->charge_pool);
> +	alloc_state->only_evict_unprotected = !may_evict && below_low;

Would it work to enable may_evict also if below_low is true, and assign
below_low directly to only_evict_unprotected? I mean along the lines of:

	may_evict = force_space && place->mem_type != TTM_PL_SYSTEM;
	may_evict |= dmem_cgroup_below_min(NULL, alloc_state->charge_pool);
	alloc_state->only_evict_unprotected =
		dmem_cgroup_below_low(NULL, alloc_state->charge_pool);

It would allow the if condition below to be simpler. The evict callback
would remain the same, I guess.

And maybe only_evict_unprotected could be renamed to "try_low" to align 
with the naming in there? Then in the callback the condition would be like:

	/* We hit the low limit? Try once more */
	if (!lret && evict_walk.hit_low &&
	    !(evict_walk.try_low || state->try_low)) {
		evict_walk.try_low = true;
		goto retry;
	}

Give or take. Would that be more readable, i.e. more obvious? Although I
am endlessly confused by how !try_low ends up being try_low = true in
this condition, so maybe I am mixing something up. You get my gist
though? In essence, unify the naming and logic for easier understanding.
If you can find some workable way in this spirit, I think it is worth
thinking about.

Regards,

Tvrtko

> +
>   	ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
>   	if (ret) {
> -		if (ret == -ENOSPC && may_evict)
> +		if (ret == -ENOSPC && (may_evict || below_low))
>   			ret = -EBUSY;
>   		return ret;
>   	}
> 



Thread overview: 18+ messages
2026-03-02 12:37 [PATCH v5 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
2026-03-02 12:37 ` [PATCH v5 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-02 12:37 ` [PATCH v5 2/6] cgroup,cgroup/dmem: Add (dmem_)cgroup_common_ancestor helper Natalie Vock
2026-03-02 14:38   ` Maarten Lankhorst
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-02 12:37 ` [PATCH v5 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
2026-03-02 15:08   ` Tvrtko Ursulin
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-02 12:37 ` [PATCH v5 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
2026-03-02 15:25   ` Tvrtko Ursulin
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-02 12:37 ` [PATCH v5 5/6] drm/ttm: Be more aggressive when allocating below protection limit Natalie Vock
2026-03-02 17:02   ` Tvrtko Ursulin [this message]
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-02 12:37 ` [PATCH v5 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool Natalie Vock
2026-03-03  3:29   ` Claude review: " Claude Code Review Bot
2026-03-03  3:29 ` Claude review: cgroup/dmem,drm/ttm: Improve protection in contended cases Claude Code Review Bot
