From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
To: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Cc: Oded Gabbay <ogabbay@kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Sumit Semwal <sumit.semwal@linaro.org>,
Christian König <christian.koenig@amd.com>,
dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
iommu@lists.linux.dev, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org,
Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>,
Bharath Kumar <quic_bkumar@quicinc.com>,
Chenna Kesava Raju <quic_chennak@quicinc.com>
Subject: Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
Date: Wed, 25 Feb 2026 19:27:47 +0530
Message-ID: <ceba8973-4fb7-4497-aebf-dd41f2d2eaa5@oss.qualcomm.com>
In-Reply-To: <jyd3ufisoz4xcfe2dvu26odesaz2czj22jn46qswkzz6ocg4zu@6krzvyvirkmo>
On 2/24/2026 2:47 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:55AM +0530, Ekansh Gupta wrote:
>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
>> integrated in the DRM accel subsystem.
>>
>> The new docs introduce QDA as a DRM/accel-based implementation of
>> Hexagon DSP offload that is intended as a modern alternative to the
>> legacy FastRPC driver in drivers/misc. The text describes the driver
>> motivation, high-level architecture and interaction with IOMMU context
>> banks, GEM-based buffer management and the RPMsg transport.
>>
>> The user-space facing section documents the main QDA IOCTLs used to
>> establish DSP sessions, manage GEM buffer objects and invoke remote
>> procedures using the FastRPC protocol, along with a typical lifecycle
>> example for applications.
>>
>> Finally, the driver is wired into the Compute Accelerators
>> documentation index under Documentation/accel, and a brief debugging
>> section shows how to enable dynamic debug for the QDA implementation.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> Documentation/accel/index.rst | 1 +
>> Documentation/accel/qda/index.rst | 14 +++++
>> Documentation/accel/qda/qda.rst | 129 ++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 144 insertions(+)
>>
>> diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
>> index cbc7d4c3876a..5901ea7f784c 100644
>> --- a/Documentation/accel/index.rst
>> +++ b/Documentation/accel/index.rst
>> @@ -10,4 +10,5 @@ Compute Accelerators
>> introduction
>> amdxdna/index
>> qaic/index
>> + qda/index
>> rocket/index
>> diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
>> new file mode 100644
>> index 000000000000..bce188f21117
>> --- /dev/null
>> +++ b/Documentation/accel/qda/index.rst
>> @@ -0,0 +1,14 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +==============================
>> + accel/qda Qualcomm DSP Driver
>> +==============================
>> +
>> +The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
>> +Signal Processors) within the DRM accelerator framework. It serves as a modern
>> +replacement for the legacy FastRPC driver, offering improved resource management
>> +and standard subsystem integration.
>> +
>> +.. toctree::
>> +
>> + qda
>> diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
>> new file mode 100644
>> index 000000000000..742159841b95
>> --- /dev/null
>> +++ b/Documentation/accel/qda/qda.rst
>> @@ -0,0 +1,129 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +==================================
>> +Qualcomm Hexagon DSP (QDA) Driver
>> +==================================
>> +
>> +Introduction
>> +============
>> +
>> +The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
>> +accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
>> +interface for user-space applications to offload computational tasks ranging
>> +from audio processing and sensor offload to computer vision and AI
>> +inference to the Hexagon DSPs found on Qualcomm SoCs.
>> +
>> +This driver is designed to align with the Linux kernel's modern **Compute
>> +Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
>> +alternative to the legacy FastRPC driver in `drivers/misc/`, offering
>> +improved resource management and better integration with standard kernel
>> +subsystems.
>> +
>> +Motivation
>> +==========
>> +
>> +The existing FastRPC implementation in the kernel utilizes a custom character
>> +device and lacks integration with modern kernel memory management frameworks.
>> +The QDA driver addresses these limitations by:
>> +
>> +1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
>> + management, job submission, and synchronization.
>> +2. **Utilizing GEM for Memory**: Providing proper buffer object management,
>> + including DMA-BUF import/export capabilities.
>> +3. **Improving Isolation**: Using IOMMU context banks to enforce memory
>> + isolation between different DSP user sessions.
>> +
>> +Key Features
>> +============
>> +
>> +* **Standard Accelerator Interface**: Exposes a standard character device
>> + node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
>> +* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
>> + GDSP) via a single driver architecture.
>> +* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
>> + (FastRPC) protocol for communication between the application processor
>> + and DSP.
>> +* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
>> + and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
>> +* **Modular Design**: Clean separation between the core DRM logic, the memory
>> + manager, and the RPMsg-based transport layer.
>> +
>> +Architecture
>> +============
>> +
>> +The QDA driver is composed of several modular components:
>> +
>> +1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
>> + and bridges the driver with the DRM accelerator subsystem.
>> +2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
>> + layer that handles IOMMU context banks. It supports pluggable backends
>> + (such as DMA-coherent) to adapt to different SoC memory architectures.
>> +3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
>> +
>> + * **`qda_gem`**: Core GEM object management, including allocation, mmap
>> + operations, and buffer lifecycle management.
>> + * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
>> + enabling seamless buffer sharing with other kernel subsystems.
>> +
>> +4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
>> + to handle low-level message passing with the DSP firmware.
>> +5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
>> + enumerate and manage the specific compute context banks defined in the
>> + device tree.
> I'm really not sure if it's a bonus or not. I'm waiting for iommu-map
> improvements to land to send patches reworking FastRPC CB from using
> probe into being created by the main driver: it would remove some of the
> possible race conditions between main driver finishing probe and the CB
> devices probing in the background.
>
> What's the actual benefit of the CB bus?
I tried to follow the Tegra host1x logic here, as discussed in [1]. My understanding is that
this makes the CBs more manageable and reduces the scope of the races that exist in the
current fastrpc driver.
That said, I'm not fully aware of the iommu-map improvements. Are they the ones
being discussed for this patch [2]? If they let the main driver create the CB devices directly,
I would be happy to adapt the design.
[1] https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
[2] https://lore.kernel.org/all/20260126-kaanapali-iris-v1-3-e2646246bfc1@oss.qualcomm.com/
>
>> +6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
>> + marshalling arguments and handling remote invocations.
>> +
>> +User-Space API
>> +==============
>> +
>> +The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
>> +to be familiar to existing FastRPC users while adhering to DRM standards.
>> +
>> +* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
>> + and capabilities.
>> +* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
>> + domain.
>> +* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
> You need to explain the difference between these two.
Ack.
>
>> +* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
>> + execution unit).
>> +* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
>> +* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
>> +* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
>> + DSP's virtual address space.
> Do we need to make this separate? Can we map/unmap buffers on their
> usage? Or when they are created? I'm thinking about that the
> virtualization.
The library provides APIs (fastrpc_mmap/remote_mmap64) that let users map/unmap
buffers on the DSP as the process requires. The ioctls are added to support them.
> An alternative approach would be to merge
> GET_MMAP_OFFSET with _MAP: once you map it to the DSP memory, you will
> get the offset.
_MAP is not needed for all buffers. Most of the remote call buffers that are passed to the DSP
are automatically mapped by DSP before invoking the DSP implementation, so user space
does not need to call _MAP for them.
Some buffers (e.g., shared persistent buffers) do require explicit mapping, which is why
MAP/MUNMAP exists in FastRPC.
Because of this behavioral difference, merging GET_MMAP_OFFSET with MAP would not be accurate.
GET_MMAP_OFFSET is for CPU-side mmap via GEM, whereas MAP is specifically for DSP
virtual address assignment.
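To make the distinction concrete, here is a rough toy model in plain C. It is only
a sketch: it does not use the QDA UAPI, and every name in it (toy_buf,
toy_get_mmap_offset, toy_dsp_map, toy_dsp_unmap, toy_invoke, the address values) is
hypothetical. It assumes that invoke arguments get a DSP address only for the duration
of the call, while an explicit MAP gives a buffer a DSP address that persists until MUNMAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names come from the QDA UAPI. */
struct toy_buf {
	uint32_t handle;      /* GEM-style handle */
	uint64_t mmap_offset; /* fake offset used for CPU-side mmap */
	uint64_t dsp_vaddr;   /* 0 means "not mapped on the DSP" */
	bool explicit_map;    /* true while an explicit MAP is active */
};

static uint64_t toy_next_dsp_vaddr = 0x10000000ull;

/* Hand out a fresh fake DSP virtual address. */
static uint64_t toy_alloc_dsp_vaddr(void)
{
	uint64_t va = toy_next_dsp_vaddr;

	toy_next_dsp_vaddr += 0x100000ull;
	return va;
}

/* Models GET_MMAP_OFFSET: CPU-side only, never touches dsp_vaddr. */
static uint64_t toy_get_mmap_offset(const struct toy_buf *buf)
{
	return buf->mmap_offset;
}

/* Models MAP: explicit DSP address assignment that persists. */
static uint64_t toy_dsp_map(struct toy_buf *buf)
{
	if (!buf->explicit_map) {
		buf->dsp_vaddr = toy_alloc_dsp_vaddr();
		buf->explicit_map = true;
	}
	return buf->dsp_vaddr;
}

/* Models MUNMAP: drops the persistent DSP mapping. */
static void toy_dsp_unmap(struct toy_buf *buf)
{
	buf->dsp_vaddr = 0;
	buf->explicit_map = false;
}

/*
 * Models INVOKE: an argument buffer without an explicit mapping is
 * mapped for the call only. Returns the DSP address seen by the call.
 */
static uint64_t toy_invoke(struct toy_buf *arg)
{
	bool transient = !arg->explicit_map;
	uint64_t seen;

	if (transient)
		arg->dsp_vaddr = toy_alloc_dsp_vaddr();
	seen = arg->dsp_vaddr; /* the remote method would run here */
	if (transient)
		arg->dsp_vaddr = 0;
	return seen;
}
```

In this model a buffer can be CPU-mapped via its offset without ever having a DSP
address, and an argument buffer is left with no DSP address once toy_invoke() returns,
unless toy_dsp_map() was called on it first.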
>
>> +
>> +Usage Example
>> +=============
>> +
>> +A typical lifecycle for a user-space application:
>> +
>> +1. **Discovery**: Open `/dev/accel/accel*` and check
>> + `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
>> + compute workloads).
>> +2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
>> + `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
>> +3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
>> + DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
>> +4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
>> + functions on the DSP.
>> +5. **Cleanup**: Close file descriptors to automatically release resources and
>> + detach the session.
>> +
>> +Internal Implementation
>> +=======================
>> +
>> +Memory Management
>> +-----------------
>> +The driver's memory manager creates virtual "IOMMU devices" that map to
>> +hardware context banks. This allows the driver to manage multiple isolated
>> +address spaces. The implementation currently uses a **DMA-coherent backend**
>> +to ensure data consistency between the CPU and DSP without manual cache
>> +maintenance in most cases.
>> +
>> +Debugging
>> +=========
>> +The driver includes extensive dynamic debug support. Enable it via the
>> +kernel's dynamic debug control:
>> +
>> +.. code-block:: bash
>> +
>> + echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
> Please add documentation on how to build the test apps and how to load
> them to the DSP.
Ack.
>
>> --
>> 2.34.1
>>