From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: accel: ethosu: Fix job submit error clean-up refcount underflows Date: Thu, 19 Feb 2026 09:57:30 +1000 Message-ID: In-Reply-To: <20260218-ethos-fixes-v1-1-be3fa3ea9a30@kernel.org> References: <20260218-ethos-fixes-v1-0-be3fa3ea9a30@kernel.org> <20260218-ethos-fixes-v1-1-be3fa3ea9a30@kernel.org> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Patch Review The commit message accurately describes the problem: when job submit fails = before the job is queued (e.g. during GEM buffer bounds checks), calling `e= thosu_job_put` leads to `ethosu_job_cleanup` which calls `pm_runtime_put_au= tosuspend` without a corresponding `pm_runtime_resume_and_get`. The fix cor= rectly splits the cleanup into `ethosu_job_err_cleanup` (for pre-queue fail= ures) and `ethosu_job_cleanup` (for post-queue teardown via kref). However, there is a memory leak. `done_fence` is separately allocated: > + ejob->done_fence =3D kzalloc(sizeof(*ejob->done_fence), GFP_KERNEL); > + if (!ejob->done_fence) { > + ret =3D -ENOMEM; > + goto out_cleanup_job; > + } But `ethosu_job_err_cleanup` does not free it: > +static void ethosu_job_err_cleanup(struct ethosu_job *job) > +{ > + unsigned int i; > + > + for (i =3D 0; i < job->region_cnt; i++) > + drm_gem_object_put(job->region_bo[i]); > + > + drm_gem_object_put(job->cmd_bo); > + > + kfree(job); > +} Every error path through `out_cleanup_job` or `out_put_job` will leak the `= done_fence` allocation. A `kfree(job->done_fence)` should be added to `etho= su_job_err_cleanup` before `kfree(job)`. (This was also leaked in the origi= nal code since `dma_fence_put` on a never-initialized fence doesn't actuall= y free it, but this patch should fix it while it's restructuring the cleanu= p.) A second concern is the `out_cleanup_job` label when reached from the `done= _fence` allocation failure: > out_cleanup_job: > if (ret) > drm_sched_job_cleanup(&ejob->base); When `done_fence` allocation fails, we jump to `out_cleanup_job` and call `= drm_sched_job_cleanup(&ejob->base)` even though `drm_sched_job_init` has no= t yet been called. The job was kzalloc'd so `ejob->base` is zeroed, but cal= ling `drm_sched_job_cleanup` on an uninitialized scheduler job is at best r= elying on implementation details. This is a pre-existing issue not introduc= ed by this patch, but it might be worth fixing while restructuring the erro= r paths =E2=80=94 perhaps by jumping to `out_put_job` instead. The success path change looks correct: > ret =3D ethosu_job_push(ejob); > + if (!ret) { > + ethosu_job_put(ejob); > + return 0; > + } On success, `ethosu_job_push` has already called `pm_runtime_resume_and_get= ` and taken a kref, so `ethosu_job_put` drops the initial reference correct= ly. --- Generated by Claude Code Patch Reviewer