* [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg
@ 2026-05-19 15:59 Eric Chanudet
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
Shuah Khan
Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc, Eric Chanudet
Following suggestion[1], offer a cgroupfs entry to allow an
administrator to request that a dmem controlled region also charges to
the memory controller.
Add mem_cgroup_dmem_charge/uncharge helpers to resolve the effective
cgroup from a dmem pool's cgroup, perform the charge and update a
MEMCG_DMEM stat counter.
Add a "dmem.memcg" control file at the root level to configure memcg
charging per region. The setting is disabled by default and locked on
first charge attempt.
[1] https://lore.kernel.org/all/a446b598-5041-450b-aaa9-3c39a09ff6a0@amd.com/
Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
Changes in v2:
- Use mem_cgroup_dmem_{,un}charge to account for memcg pages instead of
exposing raw nr_pages functions. Use it to centralize where to find
the effective cgroup from the pool's cgroup (Johannes)
- Set depends_on for cgrp_memory if CONFIG_MEMCG by having a memory
controller in children cgroup (Michal)
- Move dmem.memcg to the root level as it applies by region for all
cgroups
- Add a dmem memory.stats entry for reporting memcg charges for dmem
allocations.
- Wrap the memcg enable/disable/lock configuration under a single state
to avoid toctou races and simplify transitions.
- Link to v1: https://lore.kernel.org/r/20260403-cgroup-dmem-memcg-double-charge-v1-0-c371d155de2a@redhat.com
---
Eric Chanudet (2):
mm/memcontrol: add dmem charge/uncharge functions
cgroup/dmem: add dmem.memcg control file for double-charging to memcg
Documentation/admin-guide/cgroup-v2.rst | 23 +++++
include/linux/memcontrol.h | 16 ++++
kernel/cgroup/dmem.c | 158 +++++++++++++++++++++++++++++++-
mm/memcontrol.c | 65 +++++++++++++
4 files changed, 259 insertions(+), 3 deletions(-)
---
base-commit: d989f135f71699294bb2ffd4726b526456e2db68
change-id: 20260327-cgroup-dmem-memcg-double-charge-0f100a9ffbf2
Best regards,
--
Eric Chanudet <echanude@redhat.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
@ 2026-05-19 15:59 ` Eric Chanudet
2026-05-20 7:22 ` Albert Esteve
` (2 more replies)
2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
2026-05-25 12:42 ` Claude review: cgroup/dmem: allow double-charging dmem allocations " Claude Code Review Bot
2 siblings, 3 replies; 11+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
Shuah Khan
Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc, Eric Chanudet
Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
dmem pool allocations to optionally be double-charged against the memory
controller. Take the struct cgroup from the dmem pool's css as there is
no convenient object exported to represent these allocations. These will
resolve the effective memory css from that cgroup and perform the
charge.
Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
dmem charge visible.
Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
include/linux/memcontrol.h | 16 ++++++++++++
mm/memcontrol.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 81 insertions(+)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -39,6 +39,7 @@ enum memcg_stat_item {
MEMCG_ZSWAP_B,
MEMCG_ZSWAPPED,
MEMCG_ZSWAP_INCOMP,
+ MEMCG_DMEM,
MEMCG_NR_STAT,
};
@@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
}
#endif
+#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
+bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
+ gfp_t gfp_mask);
+void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
+#else
+static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
+ unsigned int nr_pages, gfp_t gfp_mask)
+{
+ return true;
+}
+static inline void mem_cgroup_dmem_uncharge(struct cgroup *cgrp,
+ unsigned int nr_pages)
+{
+}
+#endif
/* Cgroup v1-related declarations */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c03d4787d466803db49cdaa90e6d6ba426b7afe2..91a7ac16b6eac2d6c3700b6885a068bf8b640706 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -433,6 +433,7 @@ static const unsigned int memcg_stat_items[] = {
MEMCG_ZSWAP_B,
MEMCG_ZSWAPPED,
MEMCG_ZSWAP_INCOMP,
+ MEMCG_DMEM,
};
#define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
@@ -1606,6 +1607,9 @@ static const struct memory_stat memory_stats[] = {
#ifdef CONFIG_NUMA_BALANCING
{ "pgpromote_success", PGPROMOTE_SUCCESS },
#endif
+#ifdef CONFIG_CGROUP_DMEM
+ { "dmem", MEMCG_DMEM },
+#endif
};
/* The actual unit of the state item, not the same as the output unit */
@@ -5909,6 +5913,67 @@ static struct cftype zswap_files[] = {
};
#endif /* CONFIG_ZSWAP */
+#ifdef CONFIG_CGROUP_DMEM
+/**
+ * mem_cgroup_dmem_charge - charge memcg for a dmem pool allocation
+ * @cgrp: cgroup of the dmem pool
+ * @nr_pages: number of pages to charge
+ * @gfp_mask: reclaim mode
+ *
+ * Charges @nr_pages to @memcg. Returns %true if the charge fit within
+ * @memcg's configured limit, %false if it doesn't.
+ */
+bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
+ gfp_t gfp_mask)
+{
+ struct cgroup_subsys_state *mem_css;
+ struct mem_cgroup *memcg;
+
+ /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
+ mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+
+ /* Use the memcg, if any, of the dmem cgroup. */
+ memcg = mem_cgroup_from_css(mem_css);
+ if (!memcg || mem_cgroup_is_root(memcg)) {
+ css_put(mem_css);
+ return false;
+ }
+
+ if (try_charge_memcg(memcg, gfp_mask, nr_pages)) {
+ css_put(mem_css);
+ return false;
+ }
+
+ mod_memcg_state(memcg, MEMCG_DMEM, nr_pages);
+ css_put(mem_css);
+ return true;
+}
+
+/**
+ * mem_cgroup_dmem_uncharge - uncharge memcg from a dmem pool allocation
+ * @cgrp: cgroup of the dmem pool
+ * @nr_pages: number of pages to uncharge
+ */
+void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages)
+{
+ struct cgroup_subsys_state *mem_css;
+ struct mem_cgroup *memcg;
+
+ /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
+ mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
+
+ memcg = mem_cgroup_from_css(mem_css);
+ if (!memcg || mem_cgroup_is_root(memcg)) {
+ css_put(mem_css);
+ return;
+ }
+
+ mod_memcg_state(memcg, MEMCG_DMEM, -nr_pages);
+ refill_stock(memcg, nr_pages);
+ css_put(mem_css);
+}
+#endif /* CONFIG_CGROUP_DMEM */
+
static int __init mem_cgroup_swap_init(void)
{
if (mem_cgroup_disabled())
--
2.52.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
@ 2026-05-19 15:59 ` Eric Chanudet
2026-05-22 15:26 ` Michal Koutný
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
2026-05-25 12:42 ` Claude review: cgroup/dmem: allow double-charging dmem allocations " Claude Code Review Bot
2 siblings, 2 replies; 11+ messages in thread
From: Eric Chanudet @ 2026-05-19 15:59 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
Shuah Khan
Cc: cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc, Eric Chanudet
Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
configure whether allocations in a dmem region should also be charged to
the memory controller.
To handle inheritance, dmem adds a depends_on the memory controller,
unless MEMCG isn't configured in.
Double-charging is disabled by default. Once a charge is attempted, the
setting is locked to prevent inconsistent accounting by a small 4-state
machine (off, on, locked off, locked on).
The memcg to charge is derived from the pool's cgroup, since the pool
holds a reference to the dmem cgroup state that keeps the cgroup alive
until it gets uncharged.
Signed-off-by: Eric Chanudet <echanude@redhat.com>
---
Documentation/admin-guide/cgroup-v2.rst | 23 +++++
kernel/cgroup/dmem.c | 158 +++++++++++++++++++++++++++++++-
2 files changed, 178 insertions(+), 3 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 6efd0095ed995b1550317662bc1b56c7a7f3db23..1d2fa55ddf0faa17baa916a8914d3033e8e42359 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2828,6 +2828,29 @@ DMEM Interface Files
drm/0000:03:00.0/vram0 12550144
drm/0000:03:00.0/stolen 8650752
+ dmem.memcg
+ A readwrite nested-keyed file that exists only on the root
+ cgroup. It configures whether allocations in a dmem region
+ should also be charged to the memory controller.
+
+ Upon the first charge to a region, its setting can no longer be changed
+ and is reported as "[true|false] (locked)".
+
+ Charges to the memory controller are visible in ``memory.stat`` as the
+ ``dmem`` entry, reported in bytes.
+
+ An example read output follows::
+
+ drm/0000:03:00.0/vram0 false
+ drm/0000:03:00.0/stolen false (locked)
+
+ Writing uses the same nested-keyed format::
+
+ echo "drm/0000:03:00.0/vram0 true" > dmem.memcg
+
+ This file is only available when the kernel is built with
+ ``CONFIG_MEMCG``.
+
HugeTLB
-------
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 1ab1fb47f2711ecc60dd13e611a8a4920b48f3e9..e07b20b8025c528f190f84c76b088cb8a32a7f5e 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -17,6 +17,14 @@
#include <linux/refcount.h>
#include <linux/rculist.h>
#include <linux/slab.h>
+#include <linux/memcontrol.h>
+
+enum dmem_memcg_status {
+ DMEM_MEMCG_OFF,
+ DMEM_MEMCG_ON,
+ DMEM_MEMCG_LOCKED_OFF,
+ DMEM_MEMCG_LOCKED_ON,
+};
struct dmem_cgroup_region {
/**
@@ -51,6 +59,14 @@ struct dmem_cgroup_region {
* No new pools should be added to the region afterwards.
*/
bool unregistered;
+
+ /**
+ * @memcg_status: Whether allocation in this region should charge memcg.
+ * DMEM_MEMCG_OFF/DMEM_MEMCG_ON or
+ * DMEM_MEMCG_LOCKED_OFF/DMEM_MEMCG_LOCKED_ON, frozen after first allocation.
+ * Transitions to a locked state are one-way.
+ */
+ atomic_t memcg_status;
};
struct dmemcg_state {
@@ -609,6 +625,34 @@ get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region)
return pool;
}
+static bool apply_memcg_charge(atomic_t *status)
+{
+ int state = atomic_read(status);
+
+ for (;;) {
+ switch (state) {
+ case DMEM_MEMCG_OFF:
+ state = atomic_cmpxchg(status, DMEM_MEMCG_OFF,
+ DMEM_MEMCG_LOCKED_OFF);
+ if (state != DMEM_MEMCG_OFF)
+ continue;
+ return false;
+ case DMEM_MEMCG_LOCKED_OFF:
+ return false;
+ case DMEM_MEMCG_ON:
+ state = atomic_cmpxchg(status, DMEM_MEMCG_ON,
+ DMEM_MEMCG_LOCKED_ON);
+ if (state != DMEM_MEMCG_ON)
+ continue;
+ return true;
+ case DMEM_MEMCG_LOCKED_ON:
+ return true;
+ }
+ WARN_ONCE(1, "Invalid memcg_status (%#x).\n", state);
+ return false;
+ }
+}
+
/**
* dmem_cgroup_uncharge() - Uncharge a pool.
* @pool: Pool to uncharge.
@@ -624,6 +668,12 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size)
return;
page_counter_uncharge(&pool->cnt, size);
+
+ if (atomic_read(&pool->region->memcg_status) == DMEM_MEMCG_LOCKED_ON &&
+ !WARN_ON_ONCE(size > (u64)UINT_MAX << PAGE_SHIFT))
+ mem_cgroup_dmem_uncharge(pool->cs->css.cgroup,
+ PAGE_ALIGN(size) >> PAGE_SHIFT);
+
css_put(&pool->cs->css);
dmemcg_pool_put(pool);
}
@@ -655,6 +705,8 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
struct dmemcg_state *cg;
struct dmem_cgroup_pool_state *pool;
struct page_counter *fail;
+ unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+ bool charge_memcg;
int ret;
*ret_pool = NULL;
@@ -670,7 +722,28 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
pool = get_cg_pool_unlocked(cg, region);
if (IS_ERR(pool)) {
ret = PTR_ERR(pool);
- goto err;
+ goto err_css_put;
+ }
+
+ charge_memcg = apply_memcg_charge(®ion->memcg_status);
+ if (charge_memcg) {
+ /* mem_cgroup_dmem_charge limitation from try_charge_memcg */
+ if (size > (u64)UINT_MAX << PAGE_SHIFT) {
+ ret = -EINVAL;
+ dmemcg_pool_put(pool);
+ goto err_css_put;
+ }
+
+ if (!mem_cgroup_dmem_charge(pool->cs->css.cgroup, nr_pages,
+ GFP_KERNEL)) {
+ /*
+ * No dmem_cgroup_state_evict_valuable() could help,
+ * there's no ret_limit_pool to return.
+ */
+ ret = -ENOMEM;
+ dmemcg_pool_put(pool);
+ goto err_css_put;
+ }
}
if (!page_counter_try_charge(&pool->cnt, size, &fail)) {
@@ -681,14 +754,17 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
}
dmemcg_pool_put(pool);
ret = -EAGAIN;
- goto err;
+ goto err_uncharge_memcg;
}
/* On success, reference from get_current_dmemcs is transferred to *ret_pool */
*ret_pool = pool;
return 0;
-err:
+err_uncharge_memcg:
+ if (charge_memcg)
+ mem_cgroup_dmem_uncharge(pool->cs->css.cgroup, nr_pages);
+err_css_put:
css_put(&cg->css);
return ret;
}
@@ -845,6 +921,71 @@ static ssize_t dmem_cgroup_region_max_write(struct kernfs_open_file *of,
return dmemcg_limit_write(of, buf, nbytes, off, set_resource_max);
}
+#ifdef CONFIG_MEMCG
+static int dmem_cgroup_memcg_show(struct seq_file *sf, void *v)
+{
+ struct dmem_cgroup_region *region;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(region, &dmem_cgroup_regions, region_node) {
+ int state = atomic_read(®ion->memcg_status);
+
+ seq_printf(sf, "%s %s\n", region->name,
+ state == DMEM_MEMCG_ON ? "true" :
+ state == DMEM_MEMCG_OFF ? "false" :
+ state == DMEM_MEMCG_LOCKED_ON ? "true (locked)" :
+ state == DMEM_MEMCG_LOCKED_OFF ? "false (locked)" :
+ "(invalid)");
+ }
+ rcu_read_unlock();
+ return 0;
+}
+
+static ssize_t dmem_cgroup_memcg_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off)
+{
+ while (buf) {
+ struct dmem_cgroup_region *region;
+ char *options, *name;
+ bool flag;
+
+ options = buf;
+ buf = strchr(buf, '\n');
+ if (buf)
+ *buf++ = '\0';
+
+ options = strstrip(options);
+ if (!options[0])
+ continue;
+
+ name = strsep(&options, " \t");
+ if (!name[0])
+ continue;
+
+ if (!options || !options[0])
+ return -EINVAL;
+
+ if (kstrtobool(options, &flag))
+ return -EINVAL;
+
+ rcu_read_lock();
+ region = dmemcg_get_region_by_name(name);
+ rcu_read_unlock();
+ if (!region)
+ return -ENODEV;
+
+ atomic_cmpxchg(®ion->memcg_status,
+ flag ? DMEM_MEMCG_OFF : DMEM_MEMCG_ON,
+ flag ? DMEM_MEMCG_ON : DMEM_MEMCG_OFF);
+ /* Continue if a region is already locked. */
+
+ kref_put(®ion->ref, dmemcg_free_region);
+ }
+
+ return nbytes;
+}
+#endif
+
static struct cftype files[] = {
{
.name = "capacity",
@@ -873,6 +1014,14 @@ static struct cftype files[] = {
.seq_show = dmem_cgroup_region_max_show,
.flags = CFTYPE_NOT_ON_ROOT,
},
+#ifdef CONFIG_MEMCG
+ {
+ .name = "memcg",
+ .write = dmem_cgroup_memcg_write,
+ .seq_show = dmem_cgroup_memcg_show,
+ .flags = CFTYPE_ONLY_ON_ROOT,
+ },
+#endif
{ } /* Zero entry terminates. */
};
@@ -882,4 +1031,7 @@ struct cgroup_subsys dmem_cgrp_subsys = {
.css_offline = dmemcs_offline,
.legacy_cftypes = files,
.dfl_cftypes = files,
+#ifdef CONFIG_MEMCG
+ .depends_on = 1 << memory_cgrp_id,
+#endif
};
--
2.52.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
@ 2026-05-20 7:22 ` Albert Esteve
2026-05-22 15:53 ` Shakeel Butt
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
2 siblings, 0 replies; 11+ messages in thread
From: Albert Esteve @ 2026-05-20 7:22 UTC (permalink / raw)
To: Eric Chanudet
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tejun Heo, Michal Koutný, Jonathan Corbet,
Shuah Khan, cgroups, linux-mm, linux-kernel, dri-devel,
T.J. Mercier, Christian König, Maxime Ripard, Dave Airlie,
linux-doc
On Tue, May 19, 2026 at 6:01 PM Eric Chanudet <echanude@redhat.com> wrote:
>
> Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> dmem pool allocations to optionally be double-charged against the memory
> controller. Take the struct cgroup from the dmem pool's css as there is
> no convenient object exported to represent these allocations. These will
> resolve the effective memory css from that cgroup and perform the
> charge.
>
> Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> dmem charge visible.
>
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
> ---
> include/linux/memcontrol.h | 16 ++++++++++++
> mm/memcontrol.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 81 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -39,6 +39,7 @@ enum memcg_stat_item {
> MEMCG_ZSWAP_B,
> MEMCG_ZSWAPPED,
> MEMCG_ZSWAP_INCOMP,
> + MEMCG_DMEM,
> MEMCG_NR_STAT,
> };
>
> @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> }
> #endif
>
> +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> + gfp_t gfp_mask);
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> +#else
> +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> + unsigned int nr_pages, gfp_t gfp_mask)
> +{
> + return true;
> +}
> +static inline void mem_cgroup_dmem_uncharge(struct cgroup *cgrp,
> + unsigned int nr_pages)
> +{
> +}
> +#endif
>
> /* Cgroup v1-related declarations */
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c03d4787d466803db49cdaa90e6d6ba426b7afe2..91a7ac16b6eac2d6c3700b6885a068bf8b640706 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -433,6 +433,7 @@ static const unsigned int memcg_stat_items[] = {
> MEMCG_ZSWAP_B,
> MEMCG_ZSWAPPED,
> MEMCG_ZSWAP_INCOMP,
> + MEMCG_DMEM,
> };
>
> #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> @@ -1606,6 +1607,9 @@ static const struct memory_stat memory_stats[] = {
> #ifdef CONFIG_NUMA_BALANCING
> { "pgpromote_success", PGPROMOTE_SUCCESS },
> #endif
> +#ifdef CONFIG_CGROUP_DMEM
> + { "dmem", MEMCG_DMEM },
> +#endif
> };
>
> /* The actual unit of the state item, not the same as the output unit */
> @@ -5909,6 +5913,67 @@ static struct cftype zswap_files[] = {
> };
> #endif /* CONFIG_ZSWAP */
>
> +#ifdef CONFIG_CGROUP_DMEM
> +/**
> + * mem_cgroup_dmem_charge - charge memcg for a dmem pool allocation
> + * @cgrp: cgroup of the dmem pool
> + * @nr_pages: number of pages to charge
> + * @gfp_mask: reclaim mode
> + *
> + * Charges @nr_pages to @memcg. Returns %true if the charge fit within
> + * @memcg's configured limit, %false if it doesn't.
> + */
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> + gfp_t gfp_mask)
> +{
> + struct cgroup_subsys_state *mem_css;
> + struct mem_cgroup *memcg;
> +
> + /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
> + mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> +
> + /* Use the memcg, if any, of the dmem cgroup. */
> + memcg = mem_cgroup_from_css(mem_css);
> + if (!memcg || mem_cgroup_is_root(memcg)) {
> + css_put(mem_css);
> + return false;
> + }
> +
> + if (try_charge_memcg(memcg, gfp_mask, nr_pages)) {
> + css_put(mem_css);
> + return false;
> + }
> +
> + mod_memcg_state(memcg, MEMCG_DMEM, nr_pages);
> + css_put(mem_css);
> + return true;
> +}
> +
> +/**
> + * mem_cgroup_dmem_uncharge - uncharge memcg from a dmem pool allocation
> + * @cgrp: cgroup of the dmem pool
> + * @nr_pages: number of pages to uncharge
> + */
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages)
> +{
> + struct cgroup_subsys_state *mem_css;
> + struct mem_cgroup *memcg;
> +
> + /* CGROUP_DMEM and MEMCG guarantees this cannot be NULL. */
> + mem_css = cgroup_get_e_css(cgrp, &memory_cgrp_subsys);
> +
> + memcg = mem_cgroup_from_css(mem_css);
> + if (!memcg || mem_cgroup_is_root(memcg)) {
> + css_put(mem_css);
> + return;
> + }
> +
> + mod_memcg_state(memcg, MEMCG_DMEM, -nr_pages);
> + refill_stock(memcg, nr_pages);
> + css_put(mem_css);
> +}
> +#endif /* CONFIG_CGROUP_DMEM */
> +
> static int __init mem_cgroup_swap_init(void)
> {
> if (mem_cgroup_disabled())
>
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
@ 2026-05-22 15:26 ` Michal Koutný
2026-05-22 16:17 ` Tejun Heo
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
1 sibling, 1 reply; 11+ messages in thread
From: Michal Koutný @ 2026-05-22 15:26 UTC (permalink / raw)
To: Eric Chanudet
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Maarten Lankhorst, Maxime Ripard,
Natalie Vock, Tejun Heo, Jonathan Corbet, Shuah Khan, cgroups,
linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc
[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]
Hello Eric.
On Tue, May 19, 2026 at 11:59:02AM -0400, Eric Chanudet <echanude@redhat.com> wrote:
> Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
> configure whether allocations in a dmem region should also be charged to
> the memory controller.
This kinda makes sense as it is not unlike io.cost.* device
configurators.
Just for my better understanding -- will there be a space for userspace
to switch this? (No charged dmem allocations happen before responsible
userspace runs, so that the attribute remains unlocked.)
(I'm rather indifferent about the actual double charging/non-charging
matter.)
>
> To handle inheritance, dmem adds a depends_on the memory controller,
> unless MEMCG isn't configured in.
>
> Double-charging is disabled by default. Once a charge is attempted, the
> setting is locked to prevent inconsistent accounting by a small 4-state
> machine (off, on, locked off, locked on).
>
> The memcg to charge is derived from the pool's cgroup, since the pool
> holds a reference to the dmem cgroup state that keeps the cgroup alive
> until it gets uncharged.
>
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 23 +++++
> kernel/cgroup/dmem.c | 158 +++++++++++++++++++++++++++++++-
> 2 files changed, 178 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 6efd0095ed995b1550317662bc1b56c7a7f3db23..1d2fa55ddf0faa17baa916a8914d3033e8e42359 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -2828,6 +2828,29 @@ DMEM Interface Files
> drm/0000:03:00.0/vram0 12550144
> drm/0000:03:00.0/stolen 8650752
>
> + dmem.memcg
> + A readwrite nested-keyed file that exists only on the root
> + cgroup.
Strictly speaking this is not nested-keyed but flat keyed [1],
which leads me to realization that this is the first instance of a boolean.
All in call, such a composition comes to my mind (latter is RO):
drm/0000:03:00.0/vram0 enable=0|1 locked=0|1
> +static ssize_t dmem_cgroup_memcg_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + while (buf) {
> + struct dmem_cgroup_region *region;
> + char *options, *name;
> + bool flag;
> +
> + options = buf;
> + buf = strchr(buf, '\n');
> + if (buf)
> + *buf++ = '\0';
I recall there was a discussion about accepting only a single device per
write(2) (at the same time I see this idiom is still present in other
dmem.* files, so this is nothing to change in _this_ patch).
Thanks,
Michal
[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#format
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
2026-05-20 7:22 ` Albert Esteve
@ 2026-05-22 15:53 ` Shakeel Butt
2026-05-22 15:55 ` Shakeel Butt
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
2 siblings, 1 reply; 11+ messages in thread
From: Shakeel Butt @ 2026-05-22 15:53 UTC (permalink / raw)
To: Eric Chanudet
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Maarten Lankhorst, Maxime Ripard, Natalie Vock,
Tejun Heo, Michal Koutný, Jonathan Corbet, Shuah Khan,
cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc
On Tue, May 19, 2026 at 11:59:01AM -0400, Eric Chanudet wrote:
> Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> dmem pool allocations to optionally be double-charged against the memory
> controller. Take the struct cgroup from the dmem pool's css as there is
> no convenient object exported to represent these allocations. These will
> resolve the effective memory css from that cgroup and perform the
> charge.
>
> Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> dmem charge visible.
>
> Signed-off-by: Eric Chanudet <echanude@redhat.com>
> ---
> include/linux/memcontrol.h | 16 ++++++++++++
> mm/memcontrol.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 81 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -39,6 +39,7 @@ enum memcg_stat_item {
> MEMCG_ZSWAP_B,
> MEMCG_ZSWAPPED,
> MEMCG_ZSWAP_INCOMP,
> + MEMCG_DMEM,
> MEMCG_NR_STAT,
> };
>
> @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> }
> #endif
>
> +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> + gfp_t gfp_mask);
> +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> +#else
> +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> + unsigned int nr_pages, gfp_t gfp_mask)
Please follow Johannes's request to pass the actually memory object instead of
naked numbers.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions
2026-05-22 15:53 ` Shakeel Butt
@ 2026-05-22 15:55 ` Shakeel Butt
0 siblings, 0 replies; 11+ messages in thread
From: Shakeel Butt @ 2026-05-22 15:55 UTC (permalink / raw)
To: Eric Chanudet
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Maarten Lankhorst, Maxime Ripard, Natalie Vock,
Tejun Heo, Michal Koutný, Jonathan Corbet, Shuah Khan,
cgroups, linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc
On Fri, May 22, 2026 at 08:53:10AM -0700, Shakeel Butt wrote:
> On Tue, May 19, 2026 at 11:59:01AM -0400, Eric Chanudet wrote:
> > Add mem_cgroup_dmem_charge() and mem_cgroup_dmem_uncharge() to allow
> > dmem pool allocations to optionally be double-charged against the memory
> > controller. Take the struct cgroup from the dmem pool's css as there is
> > no convenient object exported to represent these allocations. These will
> > resolve the effective memory css from that cgroup and perform the
> > charge.
> >
> > Introduce a MEMCG_DMEM stat counter to memory.stat to make the cgroup's
> > dmem charge visible.
> >
> > Signed-off-by: Eric Chanudet <echanude@redhat.com>
> > ---
> > include/linux/memcontrol.h | 16 ++++++++++++
> > mm/memcontrol.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 81 insertions(+)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index dc3fa687759b45748b2acee6d7f43da325eb50c1..8e1d49b87fb64e6114f3eb920293e14920290fe7 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -39,6 +39,7 @@ enum memcg_stat_item {
> > MEMCG_ZSWAP_B,
> > MEMCG_ZSWAPPED,
> > MEMCG_ZSWAP_INCOMP,
> > + MEMCG_DMEM,
> > MEMCG_NR_STAT,
> > };
> >
> > @@ -1872,6 +1873,21 @@ static inline bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> > }
> > #endif
> >
> > +#if defined(CONFIG_MEMCG) && defined(CONFIG_CGROUP_DMEM)
> > +bool mem_cgroup_dmem_charge(struct cgroup *cgrp, unsigned int nr_pages,
> > + gfp_t gfp_mask);
> > +void mem_cgroup_dmem_uncharge(struct cgroup *cgrp, unsigned int nr_pages);
> > +#else
> > +static inline bool mem_cgroup_dmem_charge(struct cgroup *cgrp,
> > + unsigned int nr_pages, gfp_t gfp_mask)
>
> Please follow Johannes's request to pass the actually memory object instead of
> naked numbers.
>
Also what exactly is the backing memory here? Is it system memory? If yes, then
you need to pass struct page. For non-system memory, I am not sure memcg is the
right place to charge such memory.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg
2026-05-22 15:26 ` Michal Koutný
@ 2026-05-22 16:17 ` Tejun Heo
0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2026-05-22 16:17 UTC (permalink / raw)
To: Michal Koutný
Cc: Eric Chanudet, Johannes Weiner, Michal Hocko, Roman Gushchin,
Shakeel Butt, Muchun Song, Andrew Morton, Maarten Lankhorst,
Maxime Ripard, Natalie Vock, Jonathan Corbet, Shuah Khan, cgroups,
linux-mm, linux-kernel, dri-devel, T.J. Mercier,
Christian König, Maxime Ripard, Albert Esteve, Dave Airlie,
linux-doc
Hello,
On Fri, May 22, 2026 at 05:26:16PM +0200, Michal Koutný wrote:
> Hello Eric.
>
> On Tue, May 19, 2026 at 11:59:02AM -0400, Eric Chanudet <echanude@redhat.com> wrote:
> > Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
> > configure whether allocations in a dmem region should also be charged to
> > the memory controller.
>
> This kinda makes sense as it is not unlike io.cost.* device
> configurators.
>
> Just for my better understanding -- will there be a space for userspace
> to switch this? (No charged dmem allocations happen before responsible
> userspace runs, so that the attribute remains unlocked.)
>
> (I'm rather indifferent about the actual double charging/non-charging
> matter.)
I wonder whether this would make more sense as a mount flag? What's the use
case for e.g. having different config for different devices? Wouldn't that
be really confusing?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 11+ messages in thread
* Claude review: cgroup/dmem: allow double-charging dmem allocations to memcg
2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
@ 2026-05-25 12:42 ` Claude Code Review Bot
2 siblings, 0 replies; 11+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 12:42 UTC (permalink / raw)
To: dri-devel-reviews
Overall Series Review
Subject: cgroup/dmem: allow double-charging dmem allocations to memcg
Author: Eric Chanudet <echanude@redhat.com>
Patches: 8
Reviewed: 2026-05-25T22:42:50.439377
---
This is a v2 2-patch series by Eric Chanudet that adds optional "double-charging" of dmem (device memory) cgroup allocations to the memory controller (memcg). The motivation is to let administrators account device memory allocations against both the dmem controller and the standard memory controller simultaneously.
**Architecture:** Patch 1 adds `mem_cgroup_dmem_charge()`/`mem_cgroup_dmem_uncharge()` helpers in mm/memcontrol.c. Patch 2 wires them into kernel/cgroup/dmem.c with a root-only `dmem.memcg` control file and a 4-state lock machine.
**Overall assessment:** The design is reasonable — the lock-on-first-charge approach prevents mid-life accounting inconsistencies, and keeping the control at root-only is appropriate. However, there are several correctness issues around charging/uncharging symmetry, the stat output units, error-path behavior in the write handler, and a potential silent failure mode in the charge path that need attention.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 11+ messages in thread
* Claude review: mm/memcontrol: add dmem charge/uncharge functions
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
2026-05-20 7:22 ` Albert Esteve
2026-05-22 15:53 ` Shakeel Butt
@ 2026-05-25 12:42 ` Claude Code Review Bot
2 siblings, 0 replies; 11+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 12:42 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**1. Root memcg charge silently treated as failure**
```c
memcg = mem_cgroup_from_css(mem_css);
if (!memcg || mem_cgroup_is_root(memcg)) {
css_put(mem_css);
return false;
}
```
`mem_cgroup_dmem_charge()` returns `false` when the effective memcg is root. In patch 2's `dmem_cgroup_try_charge()`, `false` means `-ENOMEM` and the allocation fails entirely. This means that if a process in the root cgroup (or a cgroup with no memory controller enabled in its subtree) tries to allocate dmem with memcg charging enabled, the allocation will be unconditionally rejected. This seems wrong — charging to root should probably succeed (return `true`) and just skip the actual charge, since root is not subject to limits. The same issue exists in the uncharge path where it silently returns without uncharging.
**2. MEMCG_DMEM stat output unit is pages, displayed as bytes**
The new `MEMCG_DMEM` stat item stores values in pages (via `mod_memcg_state(memcg, MEMCG_DMEM, nr_pages)`). The `memcg_page_state_unit()` function defaults to `PAGE_SIZE` for unknown items, so `memcg_page_state_output()` will multiply the stored page count by `PAGE_SIZE`, producing a byte value. This is correct for display in `memory.stat`. However, the documentation in patch 2 says the stat is "reported in bytes" — it's worth confirming this is the intended behavior, since the internal accounting is in pages but the output conversion happens to work correctly by default. This is fine, just noting it's implicit rather than explicit.
**3. `try_charge_memcg` is a static function**
```c
if (try_charge_memcg(memcg, gfp_mask, nr_pages)) {
```
`try_charge_memcg()` is declared `static` in `mm/memcontrol.c`. The new `mem_cgroup_dmem_charge()` is also in `mm/memcontrol.c`, so this works. But it's calling a fairly low-level internal function directly rather than going through a higher-level API. This is acceptable for code within the same file, but worth noting that it couples dmem charging to the internal memcg charge path. If `try_charge_memcg` semantics change, this code must be updated too.
**4. Charge/uncharge asymmetry for `css_put`**
In `mem_cgroup_dmem_charge()`, a `css_get` (via `cgroup_get_e_css`) is taken and then `css_put` is called before returning. In `mem_cgroup_dmem_uncharge()`, the same pattern is used. This means the css reference is not held between charge and uncharge. This works because the dmem pool itself holds the css alive via `pool->cs->css`, but it's subtle — the correctness depends on the dmem pool keeping its cgroup alive until uncharge. The commit message mentions this ("the pool holds a reference to the dmem cgroup state that keeps the cgroup alive until it gets uncharged"), so it's intentional, but a brief code comment would help future readers.
**5. Missing `EXPORT_SYMBOL_GPL` for the new functions**
Neither `mem_cgroup_dmem_charge()` nor `mem_cgroup_dmem_uncharge()` has `EXPORT_SYMBOL_GPL`. Since dmem.c is built-in (not a module), this is fine for now. But if dmem ever becomes modular, these would need exports. Minor point.
---
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 11+ messages in thread
* Claude review: cgroup/dmem: add dmem.memcg control file for double-charging to memcg
2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
2026-05-22 15:26 ` Michal Koutný
@ 2026-05-25 12:42 ` Claude Code Review Bot
1 sibling, 0 replies; 11+ messages in thread
From: Claude Code Review Bot @ 2026-05-25 12:42 UTC (permalink / raw)
To: dri-devel-reviews
Patch Review
**1. Charge/uncharge size inconsistency (potential accounting leak)**
In `dmem_cgroup_try_charge()`:
```c
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
...
if (charge_memcg) {
...
mem_cgroup_dmem_charge(pool->cs->css.cgroup, nr_pages, GFP_KERNEL);
}
```
But in `dmem_cgroup_uncharge()`:
```c
if (atomic_read(&pool->region->memcg_status) == DMEM_MEMCG_LOCKED_ON &&
!WARN_ON_ONCE(size > (u64)UINT_MAX << PAGE_SHIFT))
mem_cgroup_dmem_uncharge(pool->cs->css.cgroup,
PAGE_ALIGN(size) >> PAGE_SHIFT);
```
The charge path checks `size > (u64)UINT_MAX << PAGE_SHIFT` and bails out early with `-EINVAL` *before* charging memcg. But the uncharge path uses `WARN_ON_ONCE` on the same condition and *skips the uncharge* if the size is too large. If somehow a large allocation was charged (e.g., due to a code change or different path), the uncharge would be silently skipped, leaking the memcg charge. The WARN_ON_ONCE is also inverted — it triggers when the condition is true (size is too large) and then the `!` causes the uncharge to be skipped. The logic is correct but confusing to read.
**2. `apply_memcg_charge` is called on every charge attempt, even if it's already locked**
```c
charge_memcg = apply_memcg_charge(®ion->memcg_status);
```
This function does an `atomic_cmpxchg` on every allocation to lock the state, even after the first successful lock. For `DMEM_MEMCG_LOCKED_OFF` and `DMEM_MEMCG_LOCKED_ON`, it just reads and returns, which is fine. But for the unlocked states, it'll keep attempting cmpxchg until it succeeds. This is a minor hot-path concern — once locked, the `atomic_read` in the switch is cheap. No bug, just noting the fast path is already O(1) after the first allocation.
**3. Write handler silently ignores locked state**
```c
atomic_cmpxchg(®ion->memcg_status,
flag ? DMEM_MEMCG_OFF : DMEM_MEMCG_ON,
flag ? DMEM_MEMCG_ON : DMEM_MEMCG_OFF);
/* Continue if a region is already locked. */
```
When a write to `dmem.memcg` tries to change a region that is already locked, the `atomic_cmpxchg` silently fails (the old value doesn't match) and the write returns success (`nbytes`). The admin gets no indication that the change was not applied. It would be more user-friendly to return `-EBUSY` or `-EPERM` when the region is locked, rather than silently discarding the write. At minimum, the admin could check by reading the file, but silent success on a no-op write is a usability issue.
**4. Write handler returns `nbytes` on partial processing**
The write handler processes lines in a loop. If parsing fails mid-way (e.g., second line is invalid), it returns `-EINVAL` having already processed the first line and released its region ref. But there's no rollback of the first line's state change. While cgroup write operations are typically single-line, the parser supports multi-line input, so this is a potential issue.
**5. Missing `CFTYPE_NS_DELEGATABLE` check**
The `dmem.memcg` file is `CFTYPE_ONLY_ON_ROOT` but has no `.access` flags to restrict writes. The commit message says "root-only cgroupfs file that lets an administrator configure" — but the write permission is just standard file permissions. Any process with write access to the cgroupfs root can modify the setting, not just uid 0. For a security-sensitive setting that affects accounting, this might want explicit access control. However, this matches the pattern of other root-only cgroup files, so it may be intentional.
**6. `depends_on` declaration**
```c
#ifdef CONFIG_MEMCG
.depends_on = 1 << memory_cgrp_id,
#endif
```
This ensures the memory controller is enabled whenever dmem is enabled. This is correct for the feature, but it changes behavior even when `dmem.memcg` is not used — any system with both CONFIG_MEMCG and CONFIG_CGROUP_DMEM will now force the memory controller to be active whenever dmem is. This could be surprising for users who want dmem without memcg overhead. The tradeoff makes the implementation simpler (no need to handle the case where memcg is disabled at runtime), but should be documented.
**7. `atomic_read` in uncharge path has no acquire barrier**
```c
if (atomic_read(&pool->region->memcg_status) == DMEM_MEMCG_LOCKED_ON &&
```
The uncharge path uses a plain `atomic_read` to check the locked-on state. The corresponding `atomic_cmpxchg` in `apply_memcg_charge` provides full memory ordering, so the transition to LOCKED_ON is properly visible. However, there's a window between when `apply_memcg_charge` returns `true` (transitioning to LOCKED_ON) and when the charge actually completes. If the charge fails after `apply_memcg_charge` locks the state, the state is permanently locked ON but no charge was made. The error path in `dmem_cgroup_try_charge` handles this correctly (it uncharges memcg on `page_counter_try_charge` failure), but it's worth verifying that all error paths are covered.
**8. Documentation placement**
The documentation is well-written and clear. The placement after the existing dmem documentation section is correct. One minor note: the doc says "This file is only available when the kernel is built with CONFIG_MEMCG" — this should probably also mention CONFIG_CGROUP_DMEM since both are required.
---
Generated by Claude Code Patch Reviewer
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-25 12:42 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-19 15:59 [PATCH v2 0/2] cgroup/dmem: allow double-charging dmem allocations to memcg Eric Chanudet
2026-05-19 15:59 ` [PATCH v2 1/2] mm/memcontrol: add dmem charge/uncharge functions Eric Chanudet
2026-05-20 7:22 ` Albert Esteve
2026-05-22 15:53 ` Shakeel Butt
2026-05-22 15:55 ` Shakeel Butt
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
2026-05-19 15:59 ` [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg Eric Chanudet
2026-05-22 15:26 ` Michal Koutný
2026-05-22 16:17 ` Tejun Heo
2026-05-25 12:42 ` Claude review: " Claude Code Review Bot
2026-05-25 12:42 ` Claude review: cgroup/dmem: allow double-charging dmem allocations " Claude Code Review Bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox