public inbox for drm-ai-reviews@public-inbox.freedesktop.org
 help / color / mirror / Atom feed
From: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	Steven Price <steven.price@arm.com>,
	Liviu Dudau <liviu.dudau@arm.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Tvrtko Ursulin <tursulin@ursulin.net>
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	kernel@collabora.com, linux-doc@vger.kernel.org
Subject: Re: [PATCH v4 1/3] drm/fdinfo: Add "evicted" memory accounting
Date: Wed, 20 May 2026 21:38:41 +0200	[thread overview]
Message-ID: <J0t78_M-TDCxz-OeUDvFdA@collabora.com> (raw)
In-Reply-To: <7c7242b8-eb22-41b1-8f04-f7abda62bb28@ursulin.net>

Hello Tvrtko,

On Wednesday, 20 May 2026 16:19:12 Central European Summer Time Tvrtko Ursulin wrote:
> 
> On 20/05/2026 14:04, Nicolas Frattaroli wrote:
> > Currently, there's no way to know for certain how much GPU memory was
> > swapped out. The difference between total and resident memory would
> > include newly allocated pages, which are not resident, but also aren't
> > swapped out.
> > 
> > Add a new drm_gem_object_status so drivers can signal when an object has
> > been evicted to swap, and add a new "evicted" counter to
> > drm_memory_stats.
> > 
> > Due to how the supported_flags bitmask is determined, the "evicted"
> > count won't be printed to fdinfo if there's no swapped out pages.
> > 
> > Reviewed-by: Steven Price <steven.price@arm.com>
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> >   Documentation/gpu/drm-usage-stats.rst | 6 ++++++
> >   drivers/gpu/drm/drm_file.c            | 8 ++++++++
> >   include/drm/drm_file.h                | 2 ++
> >   include/drm/drm_gem.h                 | 2 ++
> >   4 files changed, 18 insertions(+)
> > 
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > index 70b7cfcc194f..ac1dbf52d96d 100644
> > --- a/Documentation/gpu/drm-usage-stats.rst
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -202,6 +202,12 @@ One practical example of this could be the presence of unsignaled fences in a
> >   GEM buffer reservation object. Therefore, the active category is a subset of the
> >   resident category.
> >   
> > +- drm-evicted-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that have been evicted and are no longer pinned by the
> > +device. Only present if there are buffers that are currently evicted, and if the
> > +driver implements reporting of this type of memory.
> 
> The semantics as tricky to make work in an obvious way.
> 
> On one hand the text above is almost exactly the semantics of 'total' - 
> 'resident'. Almost meaning it was resident at some point, but isn't any 
> more. Whereas raw 'total' - 'resident' can also mean it never has been 
> instantiated.

Yes, that is the difference. You cannot tell them apart otherwise.

> You could even have a "workaround" where you report a 'swap' memory 
> region and then don't need to add anything new to the spec.

I get the idea that technically, swap is its own memory region, but
evicted is counting memory that panthor knows is currently evicted,
not necessarily memory that is in swap. Counting pages that would
*actually* be in swap would probably involve breaking several
abstractions that shouldn't be broken.

> 
> Next problem - on paper evicted could be useful to replace driver legacy 
> keys such as 'amd-evicted-ram'. But that "evicted" is defined as "not in 
> a the preferred placement". While your evicted is more like "no current 
> placement" (as in, no GPU accessible backing storage).
> 
> Is it possible to find a definition of this new category which makes 
> sense for different GPUs/drivers, be it integrated or discrete.

Sure, we can make this definition as loose as you need it to be to use
it in a different driver. I think the difference between "not in a
preferred placement" and "no current placement (but had a placement in
the past)" is not a big one for the users of this information; the goal
is to see how much of the GPU memory of a process has been made non-
resident by a shrinker.

> Or would simply going for 'drm-total-swap:' (or resident?) work for 
> panthor? Advantage being it would also work unambiguously for discrete 
> drivers.

Panthor itself doesn't really know whether something is in swap or
has just been made non-resident by the drm shrinker. It could be
somewhere between swap and resident, as Steven Price pointed out.

> 
> Like the ones which support multiple TTM placements, for example VRAM + 
> SYSTEM and then next step is swapping out so an extreme example on a 
> 16GiB GPU + 16GiB RAM machine with a 32GiB gfx workload could be like:
> 
> drm-total-vram:		32GiB
> drm-resident-vram:	16GiB
> drm-resident-system:	15GiB
> drm-total-swap:		1GiB
> 
> Does this look clear enough? Whereas with the "evicted" category it 
> would be:
> 
> drm-total-vram:		32GiB
> drm-resident-vram:	16GiB
> drm-evicted-vram:	16GiB # portion which got demoted to system RAM
> drm-resident-system:	15GiB
> drm-evicted-system:	1GiB  # portion which got demoted to swap
> 
> Where drm-evicted-vram is redundant to "total - resident". And it is 
> overloaded semantics as it where does evicted go depending on the 
> GPU/driver/region.

"drm-evicted-vram" is only redundant to "total - resident" if objects
that have never been packed by any pages aren't counted in total. This
is not the case, so I'm trying to fix it by adding evicted to it for
pages that were backed at some stage, but now aren't backed anymore.

I think "evicted" solves this problem generally in your second example,
without me having to worry about whether a page is in swap or AMD's
memory model.

So, to summarise:
- Panthor does not know how much of the memory that was reclaimed by the
  shrinker is actually in swap space, so "drm-total-swap" wouldn't work
  here.
- "total - resident" measures the wrong thing. Objects that have never
  been backed are not evicted.
- I am completely fine with AMD not using this fdinfo memory type due
  to having more complex eviction handling, but I don't see why it
  could not be used in this form.

> 
> Thoughts, opinions?
> 
> Regards,
> 
> Tvrtko
> 
> > +
> >   Implementation Details
> >   ======================
> >   
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index ec820686b302..5078172976c0 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -868,6 +868,7 @@ int drm_memory_stats_is_zero(const struct drm_memory_stats *stats)
> >   		stats->private == 0 &&
> >   		stats->resident == 0 &&
> >   		stats->purgeable == 0 &&
> > +		stats->evicted == 0 &&
> >   		stats->active == 0);
> >   }
> >   EXPORT_SYMBOL(drm_memory_stats_is_zero);
> > @@ -901,6 +902,10 @@ void drm_print_memory_stats(struct drm_printer *p,
> >   	if (supported_status & DRM_GEM_OBJECT_PURGEABLE)
> >   		drm_fdinfo_print_size(p, prefix, "purgeable", region,
> >   				      stats->purgeable);
> > +
> > +	if (supported_status & DRM_GEM_OBJECT_EVICTED)
> > +		drm_fdinfo_print_size(p, prefix, "evicted", region,
> > +				      stats->evicted);
> >   }
> >   EXPORT_SYMBOL(drm_print_memory_stats);
> >   
> > @@ -954,6 +959,9 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
> >   
> >   		if (s & DRM_GEM_OBJECT_PURGEABLE)
> >   			status.purgeable += add_size;
> > +
> > +		if (s & DRM_GEM_OBJECT_EVICTED)
> > +			status.evicted += add_size;
> >   	}
> >   	spin_unlock(&file->table_lock);
> >   
> > diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> > index 6ee70ad65e1f..7e4cb45a52c3 100644
> > --- a/include/drm/drm_file.h
> > +++ b/include/drm/drm_file.h
> > @@ -500,6 +500,7 @@ void drm_send_event_timestamp_locked(struct drm_device *dev,
> >    * @resident: Total size of GEM objects backing pages
> >    * @purgeable: Total size of GEM objects that can be purged (resident and not active)
> >    * @active: Total size of GEM objects active on one or more engines
> > + * @evicted: Total size of GEM objects that have been evicted
> >    *
> >    * Used by drm_print_memory_stats()
> >    */
> > @@ -509,6 +510,7 @@ struct drm_memory_stats {
> >   	u64 resident;
> >   	u64 purgeable;
> >   	u64 active;
> > +	u64 evicted;
> >   };
> >   
> >   enum drm_gem_object_status;
> > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> > index 86f5846154f7..799588a2762a 100644
> > --- a/include/drm/drm_gem.h
> > +++ b/include/drm/drm_gem.h
> > @@ -53,6 +53,7 @@ struct drm_gem_object;
> >    * @DRM_GEM_OBJECT_RESIDENT: object is resident in memory (ie. not unpinned)
> >    * @DRM_GEM_OBJECT_PURGEABLE: object marked as purgeable by userspace
> >    * @DRM_GEM_OBJECT_ACTIVE: object is currently used by an active submission
> > + * @DRM_GEM_OBJECT_EVICTED: object is evicted and no longer pinned by driver
> >    *
> >    * Bitmask of status used for fdinfo memory stats, see &drm_gem_object_funcs.status
> >    * and drm_show_fdinfo().  Note that an object can report DRM_GEM_OBJECT_PURGEABLE
> > @@ -67,6 +68,7 @@ enum drm_gem_object_status {
> >   	DRM_GEM_OBJECT_RESIDENT  = BIT(0),
> >   	DRM_GEM_OBJECT_PURGEABLE = BIT(1),
> >   	DRM_GEM_OBJECT_ACTIVE    = BIT(2),
> > +	DRM_GEM_OBJECT_EVICTED   = BIT(3),
> >   };
> >   
> >   /**
> > 
> 
> 





  reply	other threads:[~2026-05-20 19:39 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20 13:04 [PATCH v4 0/3] Let userspace know about swapped out panthor GEM objects Nicolas Frattaroli
2026-05-20 13:04 ` [PATCH v4 1/3] drm/fdinfo: Add "evicted" memory accounting Nicolas Frattaroli
2026-05-20 14:19   ` Tvrtko Ursulin
2026-05-20 19:38     ` Nicolas Frattaroli [this message]
2026-05-25 11:52   ` Claude review: " Claude Code Review Bot
2026-05-20 13:04 ` [PATCH v4 2/3] drm/panthor: Implement evicted status for GEM objects Nicolas Frattaroli
2026-05-20 13:24   ` Boris Brezillon
2026-05-20 14:33   ` Boris Brezillon
2026-05-25 11:52   ` Claude review: " Claude Code Review Bot
2026-05-20 13:04 ` [PATCH v4 3/3] drm/panthor: Reduce padding in gems debugfs for refcount Nicolas Frattaroli
2026-05-25 11:52   ` Claude review: " Claude Code Review Bot
2026-05-25 11:52 ` Claude review: Let userspace know about swapped out panthor GEM objects Claude Code Review Bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=J0t78_M-TDCxz-OeUDvFdA@collabora.com \
    --to=nicolas.frattaroli@collabora.com \
    --cc=airlied@gmail.com \
    --cc=boris.brezillon@collabora.com \
    --cc=corbet@lwn.net \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel@collabora.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liviu.dudau@arm.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mripard@kernel.org \
    --cc=simona@ffwll.ch \
    --cc=skhan@linuxfoundation.org \
    --cc=steven.price@arm.com \
    --cc=tursulin@ursulin.net \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox