Support Required for Running QAI Hub Models on QCS6490 NPU (Radxa Dragon Q6A)

Lokesh_Peddimsetti · June 1, 2026, 6:54am

Dear Support Team,

We are deploying ML inference workloads on the Radxa Dragon Q6A platform, which is based on the Qualcomm QCS6490 SoC, using the Qualcomm QAIRT/QNN SDK and Qualcomm AI Hub models.

During development and deployment, we have run into multiple issues with model conversion, runtime compatibility, tensor output mismatches, and execution on the HTP/NPU backend.

Current Platform Details:

Board: Radxa Dragon Q6A
SoC: Qualcomm QCS6490
OS: Ubuntu 24.04
QAIRT SDK available on board: 2.42.0.251225
Target accelerator: HTP V68 NPU
Models tested:
- YOLOv8n
- FootTrackNet
- Other QAI Hub reference models

Issues Observed:

Model conversion and runtime compatibility problems

Context binaries and other artifacts generated by QAI Hub are compiled with QAIRT/QNN 2.45.x by default.
Our board BSP/runtime currently supports QAIRT 2.42 only.
Because of this runtime version mismatch, the generated binaries frequently fail or raise incompatibility concerns.
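
To make the version concern concrete, here is a minimal sketch of a pre-deployment check that compares the toolchain version of an artifact against the runtime on the board. The helper names are hypothetical, and the rule that only an identical major.minor counts as compatible is our conservative assumption for illustration, not a documented Qualcomm compatibility policy.

```python
def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '2.42.0.251225'."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'major.minor', got {version!r}")
    return int(parts[0]), int(parts[1])


def versions_match(artifact_version: str, runtime_version: str) -> bool:
    """Conservative check: only identical major.minor is treated as compatible.

    Assumption for illustration only; the real compatibility rules may be
    looser or stricter than this.
    """
    return parse_major_minor(artifact_version) == parse_major_minor(runtime_version)


if __name__ == "__main__":
    runtime = "2.42.0.251225"  # QAIRT SDK available on the board
    artifact = "2.45.x"        # toolchain version reported for AI Hub artifacts
    if not versions_match(artifact, runtime):
        print(f"Version mismatch: artifact {artifact} vs runtime {runtime}")
```

A check like this only flags the mismatch early; it does not tell us whether a given 2.45.x binary can actually be loaded by a 2.42 runtime, which is the clarification we are asking for.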

Input/output tensor mismatch issues

We observed mismatches between:
- ONNX outputs
- QNN outputs
- Quantized outputs
- Runtime inference outputs
Tensor values and detections differ significantly after quantization or after deployment on the HTP backend.
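
A comparison of this kind can be sketched as follows. This is an illustrative example rather than our exact tooling: it assumes both outputs have already been dequantized to float and flattened to the same length, and the function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def compare_outputs(reference: Sequence[float], candidate: Sequence[float]) -> Dict[str, float]:
    """Compare two flattened float tensors element by element.

    Returns the maximum absolute difference and the cosine similarity.
    Cosine similarity is NaN when either tensor is all zeros.
    """
    if len(reference) != len(candidate):
        raise ValueError(
            f"length mismatch: {len(reference)} vs {len(candidate)} elements"
        )
    if len(reference) == 0:
        raise ValueError("tensors must not be empty")

    max_abs_diff = max(abs(r - c) for r, c in zip(reference, candidate))
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = math.sqrt(sum(r * r for r in reference))
    norm_cand = math.sqrt(sum(c * c for c in candidate))
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_cand)
    return {"max_abs_diff": max_abs_diff, "cosine_similarity": cosine}
```

A length mismatch raised here usually points to a shape or layout problem rather than a numerical one, so it is worth separating the two cases before looking at quantization accuracy.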

Context binary/runtime compatibility concerns

Generated .bin context binaries appear to be tightly coupled to the:
- QAIRT version
- backend serializer version
- DSP/HTP runtime stack
We are unsure whether binaries generated on Qualcomm AI Hub are officially supported on Ubuntu-based community BSP images running on Radxa Dragon Q6A.

BSP/runtime uncertainty

The QAI Hub hosted QCS6490 reference device uses:
- Dragonwing RB3 Gen 2 Vision Kit
- Qc_Linux 1.6
Our board uses:
- Ubuntu 24.04 community BSP
We would like clarification regarding:
- compatibility expectations
- supported deployment workflows
- whether the Qc_Linux BSP is required for reliable QNN/HTP deployment

Difficulties during deployment

We encountered multiple runtime-related issues, including:
- model conversion failures
- graph finalize failures
- inference failures
- tensor layout mismatches
- quantization inconsistencies
- possible serializer/runtime incompatibilities
- uncertainty regarding supported SDK/runtime combinations
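
As an example of the tensor layout issue, ONNX exports typically use NCHW ordering, while a converted model may expect a different ordering such as NHWC. The sketch below reorders a flattened NCHW buffer into NHWC using plain Python; the function name is hypothetical, and whether a given converted model actually expects NHWC is an assumption that has to be checked against the model's tensor metadata.

```python
from typing import List, Sequence


def nchw_to_nhwc(flat: Sequence[float], n: int, c: int, h: int, w: int) -> List[float]:
    """Reorder a flattened NCHW tensor into flattened NHWC order."""
    if len(flat) != n * c * h * w:
        raise ValueError(
            f"buffer has {len(flat)} elements, expected {n * c * h * w}"
        )
    out = list(flat)  # same length; every slot is overwritten below
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi  # NCHW offset
                    dst = ((ni * h + hi) * w + wi) * c + ci  # NHWC offset
                    out[dst] = flat[src]
    return out


if __name__ == "__main__":
    # 1 image, 2 channels, 1 row, 2 columns: channel 0 = [0, 1], channel 1 = [2, 3]
    print(nchw_to_nhwc([0, 1, 2, 3], n=1, c=2, h=1, w=2))  # prints [0, 2, 1, 3]
```

Feeding a buffer in the wrong layout produces outputs that look like a quantization failure, so ruling this out first narrows down the remaining mismatches.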

Observations:

AI Hub successfully compiles and profiles models for QCS6490/Qc_Linux targets.
However, the generated artifacts indicate that QAIRT 2.45.x is used internally, whereas our local deployment environment currently uses QAIRT 2.42.
This raises concerns about whether AI Hub-generated artifacts are officially compatible with older runtime stacks.

We would appreciate guidance regarding the officially recommended deployment workflow for Radxa Dragon Q6A/QCS6490 platforms.

Specifically, we would like clarification on:

Is the Ubuntu 24.04 community BSP officially supported for QNN/HTP deployment on QCS6490?
Are AI Hub-generated QNN context binaries expected to work on Ubuntu-based Radxa Dragon Q6A boards?
Is the Qc_Linux 1.6 BSP required for reliable compatibility with AI Hub-generated binaries?
What is the officially recommended workflow on QCS6490 devices for the following steps?
- model conversion
- quantization
- context binary generation
- deployment
- inference execution
Is there a recommended method to generate QNN context binaries using QAIRT 2.42 specifically?
Are there any reference guides or examples available for the following?
- YOLOv8 deployment
- RTSP/camera inference
- HTP runtime execution
- QNN deployment on Radxa Dragon Q6A
Is there an officially supported BSP image/runtime stack recommended for AI inference workloads on this platform?

We would greatly appreciate:

- technical guidance
- reference deployment examples
- supported software stack recommendations
- compatibility clarification
- debugging recommendations for QNN/HTP inference on QCS6490

Thank you for your support.

Best Regards,
Lokesh Srinivas Bhaskar P