From: Tvrtko Ursulin <tursulin@ursulin.net>
To: Natalie Vock <natalie.vock@gmx.de>,
Maarten Lankhorst <dev@lankhorst.se>,
Maxime Ripard <mripard@kernel.org>, Tejun Heo <tj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Koutný <mkoutny@suse.com>,
Christian Koenig <christian.koenig@amd.com>,
Huang Rui <ray.huang@amd.com>,
Matthew Auld <matthew.auld@intel.com>,
Matthew Brost <matthew.brost@intel.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>
Cc: cgroups@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v6 4/6] drm/ttm: Split cgroup charge and resource allocation
Date: Fri, 13 Mar 2026 12:53:02 +0000 [thread overview]
Message-ID: <fd77dce5-39f9-446e-b78e-3304ca3de4ac@ursulin.net> (raw)
In-Reply-To: <20260313-dmemcg-aggressive-protect-v6-4-7c71cc1492db@gmx.de>
On 13/03/2026 11:40, Natalie Vock wrote:
> Coupling resource allocation and cgroup charging is racy when charging
> succeeds, but subsequent resource allocation fails. Certain eviction
> decisions are made on the basis of whether the allocating cgroup is
> protected, i.e. within its min/low limits, but with the charge being
> tied to resource allocation (and uncharged when the resource allocation
> fails), this check is done at a point where the allocation is not actually
> charged to the cgroup.
>
> This is subtly wrong if the allocation were to cause the cgroup to exceed
> the min/low protection, but it's even more wrong if the same cgroup tries
> allocating multiple buffers concurrently: In this case, the min/low
> protection may pass for all allocation attempts when the real min/low
> protection covers only some, or potentially none of the allocated
> buffers.
>
> Instead, charge the allocation to the cgroup once and keep the charge
> for as long as we try to allocate a ttm_resource, and only undo the charge
> if allocating the resource is ultimately unsuccessful and we move on to
> a different ttm_place.
>
> Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 66 ++++++++++++++++++++++++++------------
> drivers/gpu/drm/ttm/ttm_resource.c | 48 +++++++++++++++++++--------
> include/drm/ttm/ttm_resource.h | 6 +++-
> 3 files changed, 85 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 5cca0d6edbaf6..4adc9b80cba4a 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -490,8 +490,12 @@ int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man
> }
>
> struct ttm_bo_alloc_state {
> + /** @charge_pool: The memory pool the resource is charged to */
> + struct dmem_cgroup_pool_state *charge_pool;
> /** @limit_pool: Which pool limit we should test against */
> struct dmem_cgroup_pool_state *limit_pool;
> + /** @in_evict: Whether we are currently evicting buffers */
> + bool in_evict;
> };
>
> /**
> @@ -520,28 +524,39 @@ static int ttm_bo_alloc_at_place(struct ttm_buffer_object *bo,
> bool may_evict;
> int ret;
>
> - may_evict = force_space && place->mem_type != TTM_PL_SYSTEM;
> -
> - ret = ttm_resource_alloc(bo, place, res,
> - force_space ? &alloc_state->limit_pool : NULL);
> + may_evict = !alloc_state->in_evict && force_space &&
> + place->mem_type != TTM_PL_SYSTEM;
> + if (!alloc_state->charge_pool) {
> + ret = ttm_resource_try_charge(bo, place, &alloc_state->charge_pool,
> + force_space ? &alloc_state->limit_pool
> + : NULL);
> + if (ret) {
> + /*
> + * -EAGAIN means the charge failed, which we treat
> + * like an allocation failure. Therefore, return an
> + * error code indicating the allocation failed -
> + * either -EBUSY if the allocation should be
> + * retried with eviction, or -ENOSPC if there should
> + * be no second attempt.
> + */
> + if (ret == -EAGAIN)
> + ret = may_evict ? -EBUSY : -ENOSPC;
> + return ret;
> + }
> + }
>
> + ret = ttm_resource_alloc(bo, place, res, alloc_state->charge_pool);
> if (ret) {
> - /*
> - * -EAGAIN means the charge failed, which we treat like an
> - * allocation failure. Therefore, return an error code indicating
> - * the allocation failed - either -EBUSY if the allocation should
> - * be retried with eviction, or -ENOSPC if there should be no second
> - * attempt.
> - */
> - if (ret == -EAGAIN)
> - return may_evict ? -EBUSY : -ENOSPC;
> -
> if (ret == -ENOSPC && may_evict)
> - return -EBUSY;
> -
> + ret = -EBUSY;
> return ret;
> }
>
> + /*
> + * Ownership of charge_pool has been transferred to the TTM resource,
> + * don't make the caller think we still hold a reference to it.
> + */
> + alloc_state->charge_pool = NULL;
> return 0;
> }
>
> @@ -596,8 +611,9 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
>
> evict_walk->evicted++;
> if (evict_walk->res)
> - lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
> - evict_walk->res, NULL);
> + lret = ttm_bo_alloc_at_place(evict_walk->evictor, evict_walk->place,
> + walk->arg.ctx, false, evict_walk->res,
> + evict_walk->alloc_state);
Oh I am glad this worked out. I will not go as far to say TTM eviction
logic is now easy to follow but at least the new state machine logic is
consolidated. Anyway, I went back and forth many many times and it all
looks good to me.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Btw have you looked at the TTM kunit tests? Does any cover the new paths
or it would make sense to add some coverage with this series, or as a
follow up?
Regards,
Tvrtko
> if (lret == 0)
> return 1;
> out:
> @@ -636,6 +652,8 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> };
> s64 lret;
>
> + state->in_evict = true;
> +
> evict_walk.walk.arg.trylock_only = true;
> lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, 1);
>
> @@ -666,6 +684,7 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
> goto retry;
> }
> out:
> + state->in_evict = false;
> if (lret < 0)
> return lret;
> if (lret == 0)
> @@ -798,6 +817,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
> res, &alloc_state);
>
> if (ret == -ENOSPC) {
> + dmem_cgroup_uncharge(alloc_state.charge_pool, bo->base.size);
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> continue;
> } else if (ret == -EBUSY) {
> @@ -806,11 +826,15 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,
>
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
>
> - if (ret == -EBUSY)
> - continue;
> - else if (ret)
> + if (ret) {
> + dmem_cgroup_uncharge(alloc_state.charge_pool,
> + bo->base.size);
> + if (ret == -EBUSY)
> + continue;
> return ret;
> + }
> } else if (ret) {
> + dmem_cgroup_uncharge(alloc_state.charge_pool, bo->base.size);
> dmem_cgroup_pool_state_put(alloc_state.limit_pool);
> return ret;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 192fca24f37e4..a8a836f6e376a 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -373,30 +373,52 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
> }
> EXPORT_SYMBOL(ttm_resource_fini);
>
> +/**
> + * ttm_resource_try_charge - charge a resource manager's cgroup pool
> + * @bo: buffer for which an allocation should be charged
> + * @place: where the allocation is attempted to be placed
> + * @ret_pool: on charge success, the pool that was charged
> + * @ret_limit_pool: on charge failure, the pool responsible for the failure
> + *
> + * Should be used to charge cgroups before attempting resource allocation.
> + * When charging succeeds, the value of ret_pool should be passed to
> + * ttm_resource_alloc.
> + *
> + * Returns: 0 on charge success, negative errno on failure.
> + */
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool)
> +{
> + struct ttm_resource_manager *man =
> + ttm_manager_type(bo->bdev, place->mem_type);
> +
> + if (!man->cg) {
> + *ret_pool = NULL;
> + if (ret_limit_pool)
> + *ret_limit_pool = NULL;
> + return 0;
> + }
> +
> + return dmem_cgroup_try_charge(man->cg, bo->base.size, ret_pool,
> + ret_limit_pool);
> +}
> +
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res_ptr,
> - struct dmem_cgroup_pool_state **ret_limit_pool)
> + struct dmem_cgroup_pool_state *charge_pool)
> {
> struct ttm_resource_manager *man =
> ttm_manager_type(bo->bdev, place->mem_type);
> - struct dmem_cgroup_pool_state *pool = NULL;
> int ret;
>
> - if (man->cg) {
> - ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool);
> - if (ret)
> - return ret;
> - }
> -
> ret = man->func->alloc(man, bo, place, res_ptr);
> - if (ret) {
> - if (pool)
> - dmem_cgroup_uncharge(pool, bo->base.size);
> + if (ret)
> return ret;
> - }
>
> - (*res_ptr)->css = pool;
> + (*res_ptr)->css = charge_pool;
>
> spin_lock(&bo->bdev->lru_lock);
> ttm_resource_add_bulk_move(*res_ptr, bo);
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 33e80f30b8b82..549b5b796884d 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -456,10 +456,14 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> void ttm_resource_fini(struct ttm_resource_manager *man,
> struct ttm_resource *res);
>
> +int ttm_resource_try_charge(struct ttm_buffer_object *bo,
> + const struct ttm_place *place,
> + struct dmem_cgroup_pool_state **ret_pool,
> + struct dmem_cgroup_pool_state **ret_limit_pool);
> int ttm_resource_alloc(struct ttm_buffer_object *bo,
> const struct ttm_place *place,
> struct ttm_resource **res,
> - struct dmem_cgroup_pool_state **ret_limit_pool);
> + struct dmem_cgroup_pool_state *charge_pool);
> void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
> bool ttm_resource_intersects(struct ttm_device *bdev,
> struct ttm_resource *res,
>
next prev parent reply other threads:[~2026-03-13 12:53 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-13 11:39 [PATCH v6 0/6] cgroup/dmem,drm/ttm: Improve protection in contended cases Natalie Vock
2026-03-13 11:40 ` [PATCH v6 1/6] cgroup/dmem: Add queries for protection values Natalie Vock
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 11:40 ` [PATCH v6 2/6] cgroup,cgroup/dmem: Add (dmem_)cgroup_common_ancestor helper Natalie Vock
2026-03-13 14:16 ` Michal Koutný
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 11:40 ` [PATCH v6 3/6] drm/ttm: Extract code for attempting allocation in a place Natalie Vock
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 11:40 ` [PATCH v6 4/6] drm/ttm: Split cgroup charge and resource allocation Natalie Vock
2026-03-13 12:53 ` Tvrtko Ursulin [this message]
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 11:40 ` [PATCH v6 5/6] drm/ttm: Be more aggressive when allocating below protection limit Natalie Vock
2026-03-13 13:29 ` Tvrtko Ursulin
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 11:40 ` [PATCH v6 6/6] drm/ttm: Use common ancestor of evictor and evictee as limit pool Natalie Vock
2026-03-13 14:11 ` Tvrtko Ursulin
2026-03-13 14:16 ` Michal Koutný
2026-03-13 21:10 ` Claude review: " Claude Code Review Bot
2026-03-13 21:10 ` Claude review: cgroup/dmem,drm/ttm: Improve protection in contended cases Claude Code Review Bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fd77dce5-39f9-446e-b78e-3304ca3de4ac@ursulin.net \
--to=tursulin@ursulin.net \
--cc=airlied@gmail.com \
--cc=cgroups@vger.kernel.org \
--cc=christian.koenig@amd.com \
--cc=dev@lankhorst.se \
--cc=dri-devel@lists.freedesktop.org \
--cc=hannes@cmpxchg.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=mkoutny@suse.com \
--cc=mripard@kernel.org \
--cc=natalie.vock@gmx.de \
--cc=ray.huang@amd.com \
--cc=simona@ffwll.ch \
--cc=tj@kernel.org \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox