* [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
From: Tvrtko Ursulin @ 2026-06-02 15:33 UTC
To: dri-devel
Cc: kernel-dev, Tvrtko Ursulin, Christian König,
Danilo Krummrich, Matthew Brost, Philipp Stanner
Commit
28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
failed to notice clearing of entity->rq in drm_sched_entity_init() is now
redundant and can be removed.
Given that entity->rq can now never be NULL, we also remove two impossible
checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
respectively.
Similarly, we can also remove the !entity->rq check in
drm_sched_job_init(). And for the better, given that the error message, if
it ever triggered, would have dereferenced the yet un-initialized job->
sched (only initialized later in drm_sched_job_arm()). This appears to
have been theoretically broken ever since commit
56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
drivers/gpu/drm/scheduler/sched_main.c | 9 ---------
2 files changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 4ebb513255ed..c51101ec70c1 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
return -ENOMEM;
INIT_LIST_HEAD(&entity->list);
- entity->rq = NULL;
entity->guilty = guilty;
entity->priority = priority;
entity->last_user = current->group_leader;
@@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
struct drm_sched_job *job;
struct dma_fence *prev;
- if (!entity->rq)
- return;
-
spin_lock(&entity->lock);
entity->stopped = true;
drm_sched_rq_remove_entity(entity->rq, entity);
@@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
*/
long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
{
- struct drm_gpu_scheduler *sched;
+ struct drm_gpu_scheduler *sched =
+ container_of(entity->rq, typeof(*sched), rq);
struct task_struct *last_user;
long ret = timeout;
- if (!entity->rq)
- return 0;
-
- sched = container_of(entity->rq, typeof(*sched), rq);
/*
* The client will not queue more jobs during this fini - consume
* existing queued ones, or discard them on SIGKILL.
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 818d3d4434b5..d2ca01b31ee4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
u32 credits, void *owner,
uint64_t drm_client_id)
{
- if (!entity->rq) {
- /* This will most likely be followed by missing frames
- * or worse--a blank screen--leave a trail in the
- * logs, so this can be debugged easier.
- */
- dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
- return -ENOENT;
- }
-
if (unlikely(!credits)) {
pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
return -EINVAL;
--
2.54.0
* Re: [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
From: Philipp Stanner @ 2026-06-03 8:37 UTC
To: Tvrtko Ursulin, dri-devel
Cc: kernel-dev, Christian König, Danilo Krummrich, Matthew Brost,
Philipp Stanner
On Tue, 2026-06-02 at 16:33 +0100, Tvrtko Ursulin wrote:
> Commit
> 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
> failed to notice clearing of entity->rq in drm_sched_entity_init() is now
By clearing you also mean the setting to NULL?
I'd just use "initialization" consistently, like in the commit title.
> redundant and can be removed.
>
> Given that entity->rq can now never be NULL, we also remove two impossible
> checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
> respectively.
>
> Similarly, we can also remove the !entity->rq check in
> drm_sched_job_init(). And for the better, given that the error message, if
> it ever triggered, would have dereferenced the yet un-initialized job->
> sched (only initialized later in drm_sched_job_arm()). This appears to
> have been theoretically broken ever since commit
> 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
> .
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <phasta@kernel.org>
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
> drivers/gpu/drm/scheduler/sched_main.c | 9 ---------
> 2 files changed, 2 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 4ebb513255ed..c51101ec70c1 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> return -ENOMEM;
>
> INIT_LIST_HEAD(&entity->list);
> - entity->rq = NULL;
It would seem that has always been redundant because of the memset(0)
directly above.
> entity->guilty = guilty;
> entity->priority = priority;
> entity->last_user = current->group_leader;
> @@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
> struct drm_sched_job *job;
> struct dma_fence *prev;
>
> - if (!entity->rq)
> - return;
> -
> spin_lock(&entity->lock);
> entity->stopped = true;
> drm_sched_rq_remove_entity(entity->rq, entity);
> @@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
> */
> long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> {
> - struct drm_gpu_scheduler *sched;
> + struct drm_gpu_scheduler *sched =
> + container_of(entity->rq, typeof(*sched), rq);
> struct task_struct *last_user;
> long ret = timeout;
>
> - if (!entity->rq)
> - return 0;
> -
> - sched = container_of(entity->rq, typeof(*sched), rq);
> /*
> * The client will not queue more jobs during this fini - consume
> * existing queued ones, or discard them on SIGKILL.
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 818d3d4434b5..d2ca01b31ee4 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
> u32 credits, void *owner,
> uint64_t drm_client_id)
> {
> - if (!entity->rq) {
> - /* This will most likely be followed by missing frames
> - * or worse--a blank screen--leave a trail in the
> - * logs, so this can be debugged easier.
> - */
> - dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
> - return -ENOENT;
> - }
> -
> if (unlikely(!credits)) {
> pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
> return -EINVAL;
But overall a very nice cleanup
P.
* Re: [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
From: Tvrtko Ursulin @ 2026-06-03 9:14 UTC
To: phasta, dri-devel
Cc: kernel-dev, Christian König, Danilo Krummrich, Matthew Brost
On 03/06/2026 09:37, Philipp Stanner wrote:
> On Tue, 2026-06-02 at 16:33 +0100, Tvrtko Ursulin wrote:
>> Commit
>> 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
>> failed to notice clearing of entity->rq in drm_sched_entity_init() is now
>
> By clearing you also mean the setting to NULL?
>
> I'd just use "initialization" consistently, like in the commit title.
It is initialized properly a bit lower down, so I think "clearing" is the
more accurate term.
>> redundant and can be removed.
>>
>> Given that entity->rq can now never be NULL, we also remove two impossible
>> checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
>> respectively.
>>
>> Similarly, we can also remove the !entity->rq check in
>> drm_sched_job_init(). And for the better, given that the error message, if
>> it ever triggered, would have dereferenced the yet un-initialized job->
>> sched (only initialized later in drm_sched_job_arm()). This appears to
>> have been theoretically broken ever since commit
>> 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
>> .
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Philipp Stanner <phasta@kernel.org>
>> ---
>> drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
>> drivers/gpu/drm/scheduler/sched_main.c | 9 ---------
>> 2 files changed, 2 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index 4ebb513255ed..c51101ec70c1 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>> return -ENOMEM;
>>
>> INIT_LIST_HEAD(&entity->list);
>> - entity->rq = NULL;
>
> It would seem that has always been redundant because of the memset(0)
> directly above.
True, ever since 1decbf6bb0b4 ("drm/sched: Fix entities with 0 rqs.").
It is just that in this patch "redundant" refers to the back-to-back pair:
  entity->rq = NULL;
  ...
  entity->rq = &sched_list[0]->rq;
Is that good enough, or do you want a respin, and if so in what flavour?
Regards,
Tvrtko
>
>> entity->guilty = guilty;
>> entity->priority = priority;
>> entity->last_user = current->group_leader;
>> @@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
>> struct drm_sched_job *job;
>> struct dma_fence *prev;
>>
>> - if (!entity->rq)
>> - return;
>> -
>> spin_lock(&entity->lock);
>> entity->stopped = true;
>> drm_sched_rq_remove_entity(entity->rq, entity);
>> @@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
>> */
>> long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>> {
>> - struct drm_gpu_scheduler *sched;
>> + struct drm_gpu_scheduler *sched =
>> + container_of(entity->rq, typeof(*sched), rq);
>> struct task_struct *last_user;
>> long ret = timeout;
>>
>> - if (!entity->rq)
>> - return 0;
>> -
>> - sched = container_of(entity->rq, typeof(*sched), rq);
>> /*
>> * The client will not queue more jobs during this fini - consume
>> * existing queued ones, or discard them on SIGKILL.
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 818d3d4434b5..d2ca01b31ee4 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
>> u32 credits, void *owner,
>> uint64_t drm_client_id)
>> {
>> - if (!entity->rq) {
>> - /* This will most likely be followed by missing frames
>> - * or worse--a blank screen--leave a trail in the
>> - * logs, so this can be debugged easier.
>> - */
>> - dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
>> - return -ENOENT;
>> - }
>> -
>> if (unlikely(!credits)) {
>> pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
>> return -EINVAL;
>
> But overall a very nice cleanup
>
>
> P.
* Claude review: drm/sched: Remove redundant entity->rq initialization and checks
From: Claude Code Review Bot @ 2026-06-04 2:41 UTC
To: dri-devel-reviews
Overall Series Review
Subject: drm/sched: Remove redundant entity->rq initialization and checks
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Patches: 3
Reviewed: 2026-06-04T12:41:19.912219
---
This is a single-patch cleanup that removes dead code from the DRM scheduler. The analysis is correct: after commit 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers"), `drm_sched_entity_init()` validates that `sched_list` is non-empty and non-NULL, then unconditionally sets `entity->rq = &sched_list[0]->rq`. No code path in the scheduler ever sets `entity->rq` back to NULL after init. Therefore, the NULL checks in `kill`, `flush`, and `job_init` are dead code.
The patch also correctly identifies a latent bug in the removed dead code: the `dev_err(job->sched->dev, ...)` in `drm_sched_job_init()` would dereference `job->sched` before it has been initialized — `job->sched` is only assigned later in `drm_sched_job_arm()`, and the `memset(job, 0, ...)` hasn't even run yet at that point, so `job->sched` is whatever garbage was in the caller's allocation.
The patch is correct and safe to apply, with one minor issue noted below.
---
Generated by Claude Code Patch Reviewer
* Claude review: drm/sched: Remove redundant entity->rq initialization and checks
From: Claude Code Review Bot @ 2026-06-04 2:41 UTC
To: dri-devel-reviews
Patch Review
**sched_entity.c — removal of `entity->rq = NULL` in init:**
Correct. The function does `memset(entity, 0, sizeof(struct drm_sched_entity))` at line 125 (which already zeroes `rq`), and then immediately sets `entity->rq = &sched_list[0]->rq` at line 138. The explicit NULL assignment is doubly redundant.
**sched_entity.c — removal of `!entity->rq` check in `drm_sched_entity_kill()`:**
Correct. After successful init, `entity->rq` is always non-NULL, and no code path sets it to NULL afterwards (confirmed by grep — the only `entity->rq = NULL` in the scheduler was the one being removed in init).
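The init-time invariant behind the two points above can be modelled in a few lines of userspace C. This is only a sketch: the `model_*` types and functions are hypothetical stand-ins rather than the kernel definitions, and the exact validation condition and the `-EINVAL` return are assumptions about what `drm_sched_entity_init()` does after commit 28c5bf28763d.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_rq { int unused; };
struct model_sched { struct model_rq rq; };
struct model_entity { struct model_rq *rq; };

/*
 * Mirrors the shape of the init path described above: reject an empty
 * scheduler list, zero the entity, then point rq at the first scheduler.
 */
static int model_entity_init(struct model_entity *entity,
			     struct model_sched **sched_list,
			     unsigned int num_sched_list)
{
	if (!entity || !sched_list || num_sched_list == 0 || !sched_list[0])
		return -EINVAL;

	memset(entity, 0, sizeof(*entity));	/* already clears rq */
	entity->rq = &sched_list[0]->rq;	/* always non-NULL from here on */
	return 0;
}

/* With the invariant in place, users of rq need no NULL check. */
static struct model_rq *model_entity_rq(struct model_entity *entity)
{
	assert(entity->rq);	/* documents the invariant, never expected to fire */
	return entity->rq;
}
```

In this model every successful init leaves `rq` pointing into a scheduler, so a later `if (!entity->rq)` branch can only be reached if some other code clears the pointer, which the grep above suggests does not happen.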
**sched_entity.c — removal of `!entity->rq` check in `drm_sched_entity_flush()`:**
Correct, and the refactoring to move the `sched` variable initialization to the declaration is clean:
```c
- struct drm_gpu_scheduler *sched;
+ struct drm_gpu_scheduler *sched =
+ container_of(entity->rq, typeof(*sched), rq);
```
This is safe since the early-return that guarded it is no longer needed.
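For readers less familiar with the idiom, here is a minimal userspace sketch of what the `container_of()` initializer does. The macro below is a simplified re-implementation and the `model_*` names are hypothetical; the kernel's own macro adds type checking that is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its parent struct. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins: the run-queue is embedded in the scheduler. */
struct model_rq { int unused; };
struct model_sched {
	const char *name;
	struct model_rq rq;
};

/* Recover the owning scheduler from a pointer to its embedded rq. */
static struct model_sched *model_rq_to_sched(struct model_rq *rq)
{
	assert(rq);	/* meaningful only because rq is guaranteed non-NULL */
	return model_container_of(rq, struct model_sched, rq);
}
```

The macro is plain pointer arithmetic and does not dereference its argument, but its result is only meaningful for a valid member pointer, which is why computing it at the declaration relies on `entity->rq` never being NULL.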
**sched_main.c — removal of `!entity->rq` check in `drm_sched_job_init()`:**
Correct. As noted in the commit message, the error path's `dev_err(job->sched->dev, ...)` was buggy — `job->sched` is uninitialized at that point (only set in `drm_sched_job_arm()`), and even the `memset(job, 0, sizeof(*job))` runs *after* this check. If this dead code ever executed, it would dereference garbage memory rather than printing a useful diagnostic.
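The ordering problem can be illustrated with a small userspace model. This is a sketch under stated assumptions: `model_job`, `model_sched` and `model_job_init()` are hypothetical, and the only properties taken from the thread are that the job is zeroed after the argument checks and that `sched` is filled in by a later step.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types. */
struct model_sched { const char *name; };
struct model_job {
	struct model_sched *sched;	/* assigned by a later "arm" step */
	unsigned int credits;
};

static int model_job_init(struct model_job *job, unsigned int credits)
{
	assert(job);

	/*
	 * Do not read job->sched (or any other job field) here: the caller's
	 * allocation has not been zeroed yet, so it may hold stale data.
	 * Error reporting must rely only on the arguments.
	 */
	if (!credits) {
		fprintf(stderr, "%s: credits cannot be 0!\n", __func__);
		return -EINVAL;
	}

	memset(job, 0, sizeof(*job));	/* fields become well-defined here */
	job->credits = credits;
	return 0;
}
```

The remaining `credits` check in the real function follows the same pattern by using `pr_err()` without a device pointer, so it does not depend on any field of `job`.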
**Minor issue — stale kdoc:** The function comment for `drm_sched_job_init()` (lines 579–582 in the current tree) contains:
```c
* WARNING: amdgpu abuses &drm_sched.ready to signal when the hardware
* has died, which can mean that there's no valid runqueue for a @entity.
* This function returns -ENOENT in this case (which probably should be -EIO as
* a more meanigful return value).
```
After this patch, `drm_sched_job_init()` no longer returns `-ENOENT` at all. This WARNING should be removed or updated in the same patch to avoid a misleading kdoc comment.
**Verdict:** Patch is correct. The one piece of feedback is to also clean up the now-stale WARNING in the `drm_sched_job_init()` kdoc.
---
Generated by Claude Code Patch Reviewer