public inbox for drm-ai-reviews@public-inbox.freedesktop.org
* [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
@ 2026-06-02 15:33 Tvrtko Ursulin
  2026-06-03  8:37 ` Philipp Stanner
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2026-06-02 15:33 UTC (permalink / raw)
  To: dri-devel
  Cc: kernel-dev, Tvrtko Ursulin, Christian König,
	Danilo Krummrich, Matthew Brost, Philipp Stanner

Commit
28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
failed to notice that the clearing of entity->rq in drm_sched_entity_init()
is now redundant and can be removed.

Given that entity->rq can now never be NULL, we also remove two impossible
checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
respectively.

Similarly, we can also remove the !entity->rq check in
drm_sched_job_init(). And for the better, given that the error message, if
it ever triggered, would have dereferenced the yet un-initialized
job->sched (only initialized later in drm_sched_job_arm()). This appears to
have been theoretically broken ever since commit
56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues").

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <phasta@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
 drivers/gpu/drm/scheduler/sched_main.c   |  9 ---------
 2 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 4ebb513255ed..c51101ec70c1 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 		return -ENOMEM;
 
 	INIT_LIST_HEAD(&entity->list);
-	entity->rq = NULL;
 	entity->guilty = guilty;
 	entity->priority = priority;
 	entity->last_user = current->group_leader;
@@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
 	struct drm_sched_job *job;
 	struct dma_fence *prev;
 
-	if (!entity->rq)
-		return;
-
 	spin_lock(&entity->lock);
 	entity->stopped = true;
 	drm_sched_rq_remove_entity(entity->rq, entity);
@@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
  */
 long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 {
-	struct drm_gpu_scheduler *sched;
+	struct drm_gpu_scheduler *sched =
+		container_of(entity->rq, typeof(*sched), rq);
 	struct task_struct *last_user;
 	long ret = timeout;
 
-	if (!entity->rq)
-		return 0;
-
-	sched = container_of(entity->rq, typeof(*sched), rq);
 	/*
 	 * The client will not queue more jobs during this fini - consume
 	 * existing queued ones, or discard them on SIGKILL.
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 818d3d4434b5..d2ca01b31ee4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		       u32 credits, void *owner,
 		       uint64_t drm_client_id)
 {
-	if (!entity->rq) {
-		/* This will most likely be followed by missing frames
-		 * or worse--a blank screen--leave a trail in the
-		 * logs, so this can be debugged easier.
-		 */
-		dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
-		return -ENOENT;
-	}
-
 	if (unlikely(!credits)) {
 		pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
 		return -EINVAL;
-- 
2.54.0



* Re: [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
  2026-06-02 15:33 [PATCH] drm/sched: Remove redundant entity->rq initialization and checks Tvrtko Ursulin
@ 2026-06-03  8:37 ` Philipp Stanner
  2026-06-03  9:14   ` Tvrtko Ursulin
  2026-06-04  2:41 ` Claude review: " Claude Code Review Bot
  2026-06-04  2:41 ` Claude Code Review Bot
  2 siblings, 1 reply; 5+ messages in thread
From: Philipp Stanner @ 2026-06-03  8:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, dri-devel
  Cc: kernel-dev, Christian König, Danilo Krummrich, Matthew Brost,
	Philipp Stanner

On Tue, 2026-06-02 at 16:33 +0100, Tvrtko Ursulin wrote:
> Commit
> 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
> failed to notice clearing of entity->rq in drm_sched_entity_init() is now

By clearing you also mean the setting to NULL?

I'd just use "initialization" consistently, like in the commit title.

> redundant and can be removed.
> 
> Given that entity->rq can now never be NULL, we also remove two impossible
> checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
> respectively.
> 
> Similarly, we can also remove the !entity->rq check in
> drm_sched_job_init(). And for the better, given that the error message, if
> it ever triggered, would have dereferenced the yet un-initialized job->
> sched (only initialized later in drm_sched_job_arm()). This appears to
> have been theoretically broken ever since commit
> 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
> .
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <phasta@kernel.org>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
>  drivers/gpu/drm/scheduler/sched_main.c   |  9 ---------
>  2 files changed, 2 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 4ebb513255ed..c51101ec70c1 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>  		return -ENOMEM;
>  
>  	INIT_LIST_HEAD(&entity->list);
> -	entity->rq = NULL;

It would seem that has always been redundant because of the memset(0)
directly above.

>  	entity->guilty = guilty;
>  	entity->priority = priority;
>  	entity->last_user = current->group_leader;
> @@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
>  	struct drm_sched_job *job;
>  	struct dma_fence *prev;
>  
> -	if (!entity->rq)
> -		return;
> -
>  	spin_lock(&entity->lock);
>  	entity->stopped = true;
>  	drm_sched_rq_remove_entity(entity->rq, entity);
> @@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
>   */
>  long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>  {
> -	struct drm_gpu_scheduler *sched;
> +	struct drm_gpu_scheduler *sched =
> +		container_of(entity->rq, typeof(*sched), rq);
>  	struct task_struct *last_user;
>  	long ret = timeout;
>  
> -	if (!entity->rq)
> -		return 0;
> -
> -	sched = container_of(entity->rq, typeof(*sched), rq);
>  	/*
>  	 * The client will not queue more jobs during this fini - consume
>  	 * existing queued ones, or discard them on SIGKILL.
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 818d3d4434b5..d2ca01b31ee4 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  		       u32 credits, void *owner,
>  		       uint64_t drm_client_id)
>  {
> -	if (!entity->rq) {
> -		/* This will most likely be followed by missing frames
> -		 * or worse--a blank screen--leave a trail in the
> -		 * logs, so this can be debugged easier.
> -		 */
> -		dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
> -		return -ENOENT;
> -	}
> -
>  	if (unlikely(!credits)) {
>  		pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
>  		return -EINVAL;

But overall a very nice cleanup


P.


* Re: [PATCH] drm/sched: Remove redundant entity->rq initialization and checks
  2026-06-03  8:37 ` Philipp Stanner
@ 2026-06-03  9:14   ` Tvrtko Ursulin
  0 siblings, 0 replies; 5+ messages in thread
From: Tvrtko Ursulin @ 2026-06-03  9:14 UTC (permalink / raw)
  To: phasta, dri-devel
  Cc: kernel-dev, Christian König, Danilo Krummrich, Matthew Brost


On 03/06/2026 09:37, Philipp Stanner wrote:
> On Tue, 2026-06-02 at 16:33 +0100, Tvrtko Ursulin wrote:
>> Commit
>> 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers")
>> failed to notice clearing of entity->rq in drm_sched_entity_init() is now
> 
> By clearing you also mean the setting to NULL?
> 
> I'd just use "initialization" consistently, like in the commit title.

It is initialized properly a bit lower down, so I think "clearing" is
more accurate.

>> redundant and can be removed.
>>
>> Given that entity->rq can now never be NULL, we also remove two impossible
>> checks, from drm_sched_entity_kill() and drm_sched_entity_flush()
>> respectively.
>>
>> Similarly, we can also remove the !entity->rq check in
>> drm_sched_job_init(). And for the better, given that the error message, if
>> it ever triggered, would have dereferenced the yet un-initialized job->
>> sched (only initialized later in drm_sched_job_arm()). This appears to
>> have been theoretically broken ever since commit
>> 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
>> .
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Philipp Stanner <phasta@kernel.org>
>> ---
>>   drivers/gpu/drm/scheduler/sched_entity.c | 11 ++---------
>>   drivers/gpu/drm/scheduler/sched_main.c   |  9 ---------
>>   2 files changed, 2 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index 4ebb513255ed..c51101ec70c1 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -129,7 +129,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>>   		return -ENOMEM;
>>   
>>   	INIT_LIST_HEAD(&entity->list);
>> -	entity->rq = NULL;
> 
> It would seem that has always been redundant because of the memset(0)
> directly above.

True, ever since 1decbf6bb0b4 ("drm/sched: Fix entities with 0 rqs.").
It is just that in this patch "redundant" refers to this pair of assignments:

   entity->rq = NULL;
...
   entity->rq = &sched_list[0]->rq;

Is that good enough, or do you want a respin, and if so in what flavour?

Regards,

Tvrtko

> 
>>   	entity->guilty = guilty;
>>   	entity->priority = priority;
>>   	entity->last_user = current->group_leader;
>> @@ -280,9 +279,6 @@ void drm_sched_entity_kill(struct drm_sched_entity *entity)
>>   	struct drm_sched_job *job;
>>   	struct dma_fence *prev;
>>   
>> -	if (!entity->rq)
>> -		return;
>> -
>>   	spin_lock(&entity->lock);
>>   	entity->stopped = true;
>>   	drm_sched_rq_remove_entity(entity->rq, entity);
>> @@ -329,14 +325,11 @@ EXPORT_SYMBOL(drm_sched_entity_kill);
>>    */
>>   long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
>>   {
>> -	struct drm_gpu_scheduler *sched;
>> +	struct drm_gpu_scheduler *sched =
>> +		container_of(entity->rq, typeof(*sched), rq);
>>   	struct task_struct *last_user;
>>   	long ret = timeout;
>>   
>> -	if (!entity->rq)
>> -		return 0;
>> -
>> -	sched = container_of(entity->rq, typeof(*sched), rq);
>>   	/*
>>   	 * The client will not queue more jobs during this fini - consume
>>   	 * existing queued ones, or discard them on SIGKILL.
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 818d3d4434b5..d2ca01b31ee4 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -588,15 +588,6 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   		       u32 credits, void *owner,
>>   		       uint64_t drm_client_id)
>>   {
>> -	if (!entity->rq) {
>> -		/* This will most likely be followed by missing frames
>> -		 * or worse--a blank screen--leave a trail in the
>> -		 * logs, so this can be debugged easier.
>> -		 */
>> -		dev_err(job->sched->dev, "%s: entity has no rq!\n", __func__);
>> -		return -ENOENT;
>> -	}
>> -
>>   	if (unlikely(!credits)) {
>>   		pr_err("*ERROR* %s: credits cannot be 0!\n", __func__);
>>   		return -EINVAL;
> 
> But overall a very nice cleanup
> 
> 
> P.



* Claude review: drm/sched: Remove redundant entity->rq initialization and checks
  2026-06-02 15:33 [PATCH] drm/sched: Remove redundant entity->rq initialization and checks Tvrtko Ursulin
  2026-06-03  8:37 ` Philipp Stanner
@ 2026-06-04  2:41 ` Claude Code Review Bot
  2026-06-04  2:41 ` Claude Code Review Bot
  2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-06-04  2:41 UTC (permalink / raw)
  To: dri-devel-reviews

Overall Series Review

Subject: drm/sched: Remove redundant entity->rq initialization and checks
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Patches: 1
Reviewed: 2026-06-04T12:41:19.912219

---

This is a single-patch cleanup that removes dead code from the DRM scheduler. The analysis is correct: after commit 28c5bf28763d ("drm/sched: Disallow initializing entities with no schedulers"), `drm_sched_entity_init()` validates that `sched_list` is non-empty and non-NULL, then unconditionally sets `entity->rq = &sched_list[0]->rq`. No code path in the scheduler ever sets `entity->rq` back to NULL after init. Therefore, the NULL checks in `kill`, `flush`, and `job_init` are dead code.
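
The invariant described above can be sketched in plain userspace C. This is only a model under stated assumptions: the `toy_*` structures and functions are hypothetical stand-ins, not the kernel's definitions, and the `assert()` merely documents the invariant in the model (the patch itself simply drops the checks).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the scheduler structures. */
struct toy_rq { int unused; };
struct toy_sched { struct toy_rq rq; };
struct toy_entity { struct toy_rq *rq; int stopped; };

/*
 * Model of the init-time validation: an empty or NULL scheduler list is
 * rejected up front, so a successful return implies entity->rq != NULL.
 */
int toy_entity_init(struct toy_entity *entity,
		    struct toy_sched **sched_list,
		    unsigned int num_sched_list)
{
	if (!entity || !sched_list || num_sched_list == 0 || !sched_list[0])
		return -EINVAL;

	memset(entity, 0, sizeof(*entity));	/* already zeroes ->rq */
	entity->rq = &sched_list[0]->rq;	/* non-NULL from here on */
	return 0;
}

/* With the invariant in place, later users need no NULL check on ->rq. */
void toy_entity_kill(struct toy_entity *entity)
{
	assert(entity->rq);	/* documents the invariant in this model only */
	entity->stopped = 1;
}
```

In this model a caller that ignores a failed `toy_entity_init()` is the only way to reach `toy_entity_kill()` with a NULL `rq`, which is a caller bug rather than a state the later functions need to handle.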

The patch also correctly identifies a latent bug in the removed dead code: the `dev_err(job->sched->dev, ...)` in `drm_sched_job_init()` would dereference `job->sched` before it has been initialized — `job->sched` is only assigned later in `drm_sched_job_arm()`, and the `memset(job, 0, ...)` hasn't even run yet at that point, so `job->sched` is whatever garbage was in the caller's allocation.

The patch is correct and safe to apply, with one minor issue (a stale kdoc comment on `drm_sched_job_init()`) noted in the per-patch review.

---
Generated by Claude Code Patch Reviewer


* Claude review: drm/sched: Remove redundant entity->rq initialization and checks
  2026-06-02 15:33 [PATCH] drm/sched: Remove redundant entity->rq initialization and checks Tvrtko Ursulin
  2026-06-03  8:37 ` Philipp Stanner
  2026-06-04  2:41 ` Claude review: " Claude Code Review Bot
@ 2026-06-04  2:41 ` Claude Code Review Bot
  2 siblings, 0 replies; 5+ messages in thread
From: Claude Code Review Bot @ 2026-06-04  2:41 UTC (permalink / raw)
  To: dri-devel-reviews

Patch Review

**sched_entity.c — removal of `entity->rq = NULL` in init:**

Correct. The function does `memset(entity, 0, sizeof(struct drm_sched_entity))` at line 125 (which already zeroes `rq`), and then sets `entity->rq = &sched_list[0]->rq` at line 138. The explicit NULL assignment is doubly redundant.

**sched_entity.c — removal of `!entity->rq` check in `drm_sched_entity_kill()`:**

Correct. After successful init, `entity->rq` is always non-NULL, and no code path sets it to NULL afterwards (confirmed by grep — the only `entity->rq = NULL` in the scheduler was the one being removed in init).

**sched_entity.c — removal of `!entity->rq` check in `drm_sched_entity_flush()`:**

Correct, and the refactoring to move the `sched` variable initialization to the declaration is clean:

```c
-	struct drm_gpu_scheduler *sched;
+	struct drm_gpu_scheduler *sched =
+		container_of(entity->rq, typeof(*sched), rq);
```

This is safe since the early-return that guarded it is no longer needed.
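
As a rough illustration of why the lookup can move into the declaration, here is a userspace sketch of the `container_of()` pattern. The `demo_*` names and the `demo_container_of()` macro are hypothetical; the macro is a simplified equivalent without the kernel's type checking, and the explicit type name is used instead of `typeof(*sched)` to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace equivalent of container_of(), minus type checks. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_rq { int unused; };

struct demo_sched {
	const char *name;
	struct demo_rq rq;	/* embedded member, as the patch's container_of() implies */
};

struct demo_entity { struct demo_rq *rq; };

/* Recover the owning scheduler from the entity's run-queue pointer. */
struct demo_sched *demo_entity_to_sched(const struct demo_entity *entity)
{
	assert(entity->rq);	/* invariant established at init time */
	return demo_container_of(entity->rq, struct demo_sched, rq);
}
```

The macro is pure pointer arithmetic and does not dereference its argument, so evaluating it at the declaration is harmless in itself; the result is only meaningful because `entity->rq` is guaranteed to point into a real scheduler.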

**sched_main.c — removal of `!entity->rq` check in `drm_sched_job_init()`:**

Correct. As noted in the commit message, the error path's `dev_err(job->sched->dev, ...)` was buggy — `job->sched` is uninitialized at that point (only set in `drm_sched_job_arm()`), and even the `memset(job, 0, sizeof(*job))` runs *after* this check. If this dead code ever executed, it would dereference garbage memory rather than printing a useful diagnostic.
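
The ordering problem can be modelled in a few lines of userspace C. This is a sketch, not the kernel code: the `model_*` names are hypothetical, and the model only captures the point that `->sched` is not valid until the arm step.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct model_sched { int id; };

struct model_job {
	struct model_sched *sched;	/* valid only after model_job_arm() */
	unsigned int credits;
};

/*
 * Init validates its arguments first and zeroes the job afterwards, so any
 * diagnostic emitted by the validation must not go through job->sched.
 */
int model_job_init(struct model_job *job, unsigned int credits)
{
	if (!credits)
		return -EINVAL;	/* job is still untouched caller memory here */

	memset(job, 0, sizeof(*job));
	job->credits = credits;
	return 0;
}

/* Arm is the first point at which ->sched becomes usable. */
void model_job_arm(struct model_job *job, struct model_sched *sched)
{
	job->sched = sched;
}
```

This matches the surviving credits check in the diff, which reports through `pr_err()` rather than through a device pointer reached via the job.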

**Minor issue — stale kdoc:** The function comment for `drm_sched_job_init()` (lines 579–582 in the current tree) contains:

```c
 * WARNING: amdgpu abuses &drm_sched.ready to signal when the hardware
 * has died, which can mean that there's no valid runqueue for a @entity.
 * This function returns -ENOENT in this case (which probably should be -EIO as
 * a more meanigful return value).
```

After this patch, `drm_sched_job_init()` no longer returns `-ENOENT` at all. This WARNING should be removed or updated in the same patch to avoid a misleading kdoc comment.

**Verdict:** Patch is correct. The one piece of feedback is to also clean up the now-stale WARNING in the `drm_sched_job_init()` kdoc.

---
Generated by Claude Code Patch Reviewer
