public inbox for drm-ai-reviews@public-inbox.freedesktop.org
From: Liviu Dudau <liviu.dudau@arm.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>,
	Adrián Larumbe <adrian.larumbe@collabora.com>,
	dri-devel@lists.freedesktop.org, kernel@collabora.com,
	Nicolas Frattaroli <nicolas.frattaroli@collabora.com>,
	Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	Philipp Stanner <phasta@kernel.org>,
	Christian König <christian.koenig@amd.com>
Subject: Re: [PATCH] drm/panthor: Fix the "done_fence is initialized" detection logic
Date: Mon, 9 Mar 2026 11:05:06 +0000
Message-ID: <aa6pYsoS6Ahdi8nu@e142607>
In-Reply-To: <20260309103053.211415-1-boris.brezillon@collabora.com>

On Mon, Mar 09, 2026 at 11:30:53AM +0100, Boris Brezillon wrote:
> After commit 541c8f2468b9 ("dma-buf: detach fence ops on signal v3"),
> dma_fence::ops == NULL can't be used to check if the fence is initialized
> or not. We could turn this into an "is_signaled() || ops == NULL" test,
> but that's fragile, since it's still subject to dma_fence internal
> changes. So let's have the "is_initialized" state encoded directly in
> the pointer through the lowest bit which is guaranteed to be unused
> because of the dma_fence alignment constraint.

I'm confused! The only place where we care whether the fence has been
initialized is job_release(). I don't see why checking for "ops != NULL"
before calling dma_fence_put() is not enough or, even better, why we don't
call dma_fence_put() unconditionally, since the core code should cope with
an uninitialized dma_fence AFAICT.
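
For reference, the low-bit tagging scheme the patch proposes can be sketched
in plain userspace C as below. This is only an illustration under stated
assumptions: struct fake_fence, struct fake_job and the job_alloc_fence(),
job_init_fence() and job_release_fence() helpers are hypothetical stand-ins,
not panthor or dma_fence code; only the two accessor names are taken from
the patch. It relies on the allocator returning memory aligned to at least
2 bytes, so bit 0 of the pointer is free to carry the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence; only its alignment matters. */
struct fake_fence {
	uint64_t seqno;
	const void *ops;
};

/* Hypothetical stand-in for struct panthor_job. */
struct fake_job {
	uintptr_t done_fence;	/* pointer value, bit 0 = "initialized" */
};

#define DONE_FENCE_INITIALIZED ((uintptr_t)1)

/* Strip the tag bit to recover the real pointer. */
static struct fake_fence *job_done_fence(const struct fake_job *job)
{
	return (struct fake_fence *)(job->done_fence & ~DONE_FENCE_INITIALIZED);
}

/* The "initialized" state lives in the pointer, not in the fence itself. */
static bool job_done_fence_initialized(const struct fake_job *job)
{
	return job->done_fence & DONE_FENCE_INITIALIZED;
}

/* Allocation stores the pointer untagged (mirrors panthor_job_create()). */
static int job_alloc_fence(struct fake_job *job)
{
	struct fake_fence *f = calloc(1, sizeof(*f));

	if (!f)
		return -1;

	/* Assumption: the allocator never hands out odd addresses. */
	assert(((uintptr_t)f & DONE_FENCE_INITIALIZED) == 0);
	job->done_fence = (uintptr_t)f;
	return 0;
}

/* Initialization sets the tag (mirrors the queue_run_job() hunk). */
static void job_init_fence(struct fake_job *job, uint64_t seqno)
{
	job_done_fence(job)->seqno = seqno;
	job->done_fence |= DONE_FENCE_INITIALIZED;
}

/*
 * Release: the patch calls dma_fence_put() when the tag is set and
 * dma_fence_free() otherwise. Here both paths collapse to free(); the
 * return value only reports which path would have been taken.
 */
static bool job_release_fence(struct fake_job *job)
{
	bool was_initialized = job_done_fence_initialized(job);

	free(job_done_fence(job));
	job->done_fence = 0;
	return was_initialized;
}
```

The point of the sketch is that the release path never has to look inside
the fence (ops, flags, ...) to decide what to do, so it stays independent
of later dma_fence internal changes.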

Best regards,
Liviu

> 
> Cc: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Philipp Stanner <phasta@kernel.org>
> Cc: Christian König <christian.koenig@amd.com>
> Reported-by: Steven Price <steven.price@arm.com>
> Reported-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> Fixes: 541c8f2468b9 ("dma-buf: detach fence ops on signal v3")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 69 ++++++++++++++++++-------
>  1 file changed, 50 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index bd703a2904a1..31589add86f5 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -835,8 +835,15 @@ struct panthor_job {
>  	 */
>  	struct list_head node;
>  
> -	/** @done_fence: Fence signaled when the job is finished or cancelled. */
> -	struct dma_fence *done_fence;
> +	/**
> +	 * @done_fence: Fence signaled when the job is finished or cancelled.
> +	 *
> +	 * This is a uintptr_t because we use the lower bit to encode whether
> +	 * the fence has been initialized or not, and we don't want code to dereference
> +	 * this field directly (job_done_fence()/job_done_fence_initialized() should be used
> +	 * instead).
> +	 */
> +	uintptr_t done_fence;
>  
>  	/** @profiling: Job profiling information. */
>  	struct {
> @@ -1518,6 +1525,18 @@ cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
>  		 info);
>  }
>  
> +#define DONE_FENCE_INITIALIZED BIT(0)
> +
> +static struct dma_fence *job_done_fence(struct panthor_job *job)
> +{
> +	return (void *)(job->done_fence & ~(uintptr_t)DONE_FENCE_INITIALIZED);
> +}
> +
> +static bool job_done_fence_initialized(struct panthor_job *job)
> +{
> +	return job->done_fence & DONE_FENCE_INITIALIZED;
> +}
> +
>  static void
>  cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
>  				   u32 csg_id, u32 cs_id)
> @@ -1549,7 +1568,7 @@ cs_slot_process_fault_event_locked(struct panthor_device *ptdev,
>  			if (cs_extract < job->ringbuf.start)
>  				break;
>  
> -			dma_fence_set_error(job->done_fence, -EINVAL);
> +			dma_fence_set_error(job_done_fence(job), -EINVAL);
>  		}
>  		spin_unlock(&queue->fence_ctx.lock);
>  	}
> @@ -2182,9 +2201,11 @@ group_term_post_processing(struct panthor_group *group)
>  
>  		spin_lock(&queue->fence_ctx.lock);
>  		list_for_each_entry_safe(job, tmp, &queue->fence_ctx.in_flight_jobs, node) {
> +			struct dma_fence *done_fence = job_done_fence(job);
> +
>  			list_move_tail(&job->node, &faulty_jobs);
> -			dma_fence_set_error(job->done_fence, err);
> -			dma_fence_signal_locked(job->done_fence);
> +			dma_fence_set_error(done_fence, err);
> +			dma_fence_signal_locked(done_fence);
>  		}
>  		spin_unlock(&queue->fence_ctx.lock);
>  
> @@ -2734,7 +2755,7 @@ static void queue_start(struct panthor_queue *queue)
>  
>  	/* Re-assign the parent fences. */
>  	list_for_each_entry(job, &queue->scheduler.pending_list, base.list)
> -		job->base.s_fence->parent = dma_fence_get(job->done_fence);
> +		job->base.s_fence->parent = dma_fence_get(job_done_fence(job));
>  
>  	enable_delayed_work(&queue->timeout.work);
>  	drm_sched_start(&queue->scheduler, 0);
> @@ -3047,6 +3068,8 @@ static bool queue_check_job_completion(struct panthor_queue *queue)
>  	cookie = dma_fence_begin_signalling();
>  	spin_lock(&queue->fence_ctx.lock);
>  	list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
> +		struct dma_fence *done_fence = job_done_fence(job);
> +
>  		if (!syncobj) {
>  			struct panthor_group *group = job->group;
>  
> @@ -3054,11 +3077,11 @@ static bool queue_check_job_completion(struct panthor_queue *queue)
>  				  (job->queue_idx * sizeof(*syncobj));
>  		}
>  
> -		if (syncobj->seqno < job->done_fence->seqno)
> +		if (syncobj->seqno < done_fence->seqno)
>  			break;
>  
>  		list_move_tail(&job->node, &done_jobs);
> -		dma_fence_signal_locked(job->done_fence);
> +		dma_fence_signal_locked(done_fence);
>  	}
>  
>  	if (list_empty(&queue->fence_ctx.in_flight_jobs)) {
> @@ -3309,8 +3332,10 @@ queue_run_job(struct drm_sched_job *sched_job)
>  	 * drm_sched_job::s_fence::finished fence.
>  	 */
>  	if (!job->call_info.size) {
> -		job->done_fence = dma_fence_get(queue->fence_ctx.last_fence);
> -		return dma_fence_get(job->done_fence);
> +		done_fence = dma_fence_get(queue->fence_ctx.last_fence);
> +
> +		job->done_fence = (uintptr_t)done_fence | DONE_FENCE_INITIALIZED;
> +		return dma_fence_get(done_fence);
>  	}
>  
>  	ret = panthor_device_resume_and_get(ptdev);
> @@ -3323,11 +3348,13 @@ queue_run_job(struct drm_sched_job *sched_job)
>  		goto out_unlock;
>  	}
>  
> -	dma_fence_init(job->done_fence,
> +	done_fence = job_done_fence(job);
> +	dma_fence_init(done_fence,
>  		       &panthor_queue_fence_ops,
>  		       &queue->fence_ctx.lock,
>  		       queue->fence_ctx.id,
>  		       atomic64_inc_return(&queue->fence_ctx.seqno));
> +	job->done_fence |= DONE_FENCE_INITIALIZED;
>  
>  	job->profiling.slot = queue->profiling.seqno++;
>  	if (queue->profiling.seqno == queue->profiling.slot_count)
> @@ -3381,9 +3408,9 @@ queue_run_job(struct drm_sched_job *sched_job)
>  
>  	/* Update the last fence. */
>  	dma_fence_put(queue->fence_ctx.last_fence);
> -	queue->fence_ctx.last_fence = dma_fence_get(job->done_fence);
> +	queue->fence_ctx.last_fence = dma_fence_get(done_fence);
>  
> -	done_fence = dma_fence_get(job->done_fence);
> +	done_fence = dma_fence_get(done_fence);
>  
>  out_unlock:
>  	mutex_unlock(&sched->lock);
> @@ -3403,7 +3430,7 @@ queue_timedout_job(struct drm_sched_job *sched_job)
>  	struct panthor_queue *queue = group->queues[job->queue_idx];
>  
>  	drm_warn(&ptdev->base, "job timeout: pid=%d, comm=%s, seqno=%llu\n",
> -		 group->task_info.pid, group->task_info.comm, job->done_fence->seqno);
> +		 group->task_info.pid, group->task_info.comm, job_done_fence(job)->seqno);
>  
>  	drm_WARN_ON(&ptdev->base, atomic_read(&sched->reset.in_progress));
>  
> @@ -3915,10 +3942,10 @@ static void job_release(struct kref *ref)
>  	if (job->base.s_fence)
>  		drm_sched_job_cleanup(&job->base);
>  
> -	if (job->done_fence && job->done_fence->ops)
> -		dma_fence_put(job->done_fence);
> +	if (job_done_fence_initialized(job))
> +		dma_fence_put(job_done_fence(job));
>  	else
> -		dma_fence_free(job->done_fence);
> +		dma_fence_free(job_done_fence(job));
>  
>  	group_put(job->group);
>  
> @@ -4011,11 +4038,15 @@ panthor_job_create(struct panthor_file *pfile,
>  	 * the previously submitted job.
>  	 */
>  	if (job->call_info.size) {
> -		job->done_fence = kzalloc_obj(*job->done_fence);
> -		if (!job->done_fence) {
> +		struct dma_fence *done_fence;
> +
> +		done_fence = kzalloc_obj(*done_fence);
> +		if (!done_fence) {
>  			ret = -ENOMEM;
>  			goto err_put_job;
>  		}
> +
> +		job->done_fence = (uintptr_t)done_fence;
>  	}
>  
>  	job->profiling.mask = pfile->ptdev->profile_mask;
> -- 
> 2.53.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯
