Dear Support Team,
We are deploying ML inference workloads on the Radxa Dragon Q6A platform (Qualcomm QCS6490 SoC) using the Qualcomm QAIRT/QNN SDK and Qualcomm AI Hub models.
During development and deployment, we have been facing multiple issues related to model conversion, runtime compatibility, tensor output mismatches, and execution on the HTP/NPU backend.
Current Platform Details:
- Board: Radxa Dragon Q6A
- SoC: Qualcomm QCS6490
- OS: Ubuntu 24.04
- QAIRT SDK available on board: 2.42.0.251225
- Target accelerator: HTP V68 NPU
- Models tested:
  - YOLOv8n
  - FootTrackNet
  - Other QAI Hub reference models
Issues Observed:
- Model conversion and runtime compatibility problems
  - QAI Hub generated context binaries and artifacts are being compiled using QAIRT/QNN 2.45.x by default.
  - Our board BSP/runtime currently supports QAIRT 2.42 only.
  - Generated binaries frequently fail or show incompatibility concerns due to runtime version mismatch.
- Input/output tensor mismatch issues
  - We observed mismatches between:
    - ONNX outputs
    - QNN outputs
    - Quantized outputs
    - Runtime inference outputs
  - Tensor values and detections differ significantly after quantization or deployment on the HTP backend.
- Context binary/runtime compatibility concerns
  - Generated .bin context binaries appear tightly coupled with:
    - QAIRT version
    - backend serializer version
    - DSP/HTP runtime stack
  - We are unsure whether binaries generated on Qualcomm AI Hub are officially supported on Ubuntu-based community BSP images running on Radxa Dragon Q6A.
- BSP/runtime uncertainty
  - The QAI Hub hosted QCS6490 reference device uses:
    - Dragonwing RB3 Gen 2 Vision Kit
    - Qc_Linux 1.6
  - Our board uses:
    - Ubuntu 24.04 community BSP
  - We would like clarification regarding:
    - compatibility expectations
    - supported deployment workflows
    - whether Qc_Linux BSP is required for reliable QNN/HTP deployment
- Difficulties during deployment
  - We encountered multiple runtime-related issues including:
    - model conversion failures
    - graph finalize failures
    - inference failures
    - tensor layout mismatches
    - quantization inconsistencies
    - possible serializer/runtime incompatibilities
    - uncertainty regarding supported SDK/runtime combinations
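To make the tensor mismatches above easier to quantify, the following minimal Python sketch shows the kind of check involved: dequantize a raw quantized output, reorder it from NHWC to NCHW, and compare it against a float reference. It is not taken from any SDK. The function names, the toy values, and the affine convention real = scale * (q + offset) are assumptions for illustration and should be checked against the encoding reported for the actual tensors.

```python
import math


def dequantize(q_values, scale, offset):
    """Map quantized integers to floats.

    Assumption: real = scale * (q + offset). Verify the sign convention
    of the offset against the encoding reported for your tensor.
    """
    return [scale * (q + offset) for q in q_values]


def nhwc_to_nchw(flat, n, h, w, c):
    """Reorder a flat NHWC buffer into flat NCHW order."""
    out = [0.0] * len(flat)
    for ni in range(n):
        for hi in range(h):
            for wi in range(w):
                for ci in range(c):
                    src = ((ni * h + hi) * w + wi) * c + ci
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = flat[src]
    return out


def compare(reference, candidate):
    """Return (max absolute difference, cosine similarity)."""
    if len(reference) != len(candidate):
        raise ValueError("length mismatch between reference and candidate")
    max_abs = max(abs(r - c) for r, c in zip(reference, candidate))
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(
        sum(c * c for c in candidate)
    )
    cosine = dot / norm if norm else float("nan")
    return max_abs, cosine


# Toy data (hypothetical): a 1x2x1x2 NCHW float reference and the same
# tensor as the runtime might emit it, quantized and laid out as NHWC.
reference_nchw = [0.0, 1.0, 2.0, 3.0]
runtime_nhwc_q = [2, 6, 4, 8]

runtime_nchw = nhwc_to_nchw(
    dequantize(runtime_nhwc_q, scale=0.5, offset=-2), n=1, h=1, w=2, c=2
)
max_abs, cosine = compare(reference_nchw, runtime_nchw)
print(max_abs, cosine)
```

Skipping the layout step, or using the wrong offset sign, yields a large maximum difference on otherwise correct outputs, which is one way a layout or encoding mismatch can look like a quantization accuracy problem.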
Observations:
- AI Hub successfully compiles and profiles models for QCS6490/Qc_Linux targets.
- However, generated artifacts indicate usage of QAIRT 2.45.x internally.
- Our local deployment environment currently uses QAIRT 2.42.
- This raises concerns regarding official compatibility between AI Hub generated artifacts and older runtime stacks.
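The version gap described above can be flagged before deployment with a small check like the sketch below. The rule it encodes (matching major and minor version) is our own assumption, not a documented compatibility guarantee, and the helper names are hypothetical.

```python
def parse_version(text):
    """Split a dotted version into integers, stopping at a wildcard such as 'x'."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def same_major_minor(build_version, runtime_version):
    """Assumed rule: artifacts are usable only if major.minor match."""
    return parse_version(build_version)[:2] == parse_version(runtime_version)[:2]


# Versions taken from this report: AI Hub artifacts vs. the on-board SDK.
print(same_major_minor("2.45.x", "2.42.0.251225"))  # prints False
```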
We would appreciate guidance regarding the officially recommended deployment workflow for Radxa Dragon Q6A/QCS6490 platforms.
Specifically, we would like clarification on:
- Is Ubuntu 24.04 community BSP officially supported for QNN/HTP deployment on QCS6490?
- Are AI Hub generated QNN context binaries expected to work on Ubuntu-based Radxa Dragon Q6A boards?
- Is Qc_Linux 1.6 BSP required for reliable compatibility with AI Hub generated binaries?
- What is the officially recommended workflow on QCS6490 devices for:
  - model conversion
  - quantization
  - context binary generation
  - deployment
  - inference execution
- Is there a recommended method to generate QNN context binaries using QAIRT 2.42 specifically?
- Are there any reference guides/examples available for:
  - YOLOv8 deployment
  - RTSP/camera inference
  - HTP runtime execution
  - QNN deployment on Radxa Dragon Q6A?
- Is there an officially supported BSP image/runtime stack recommended for AI inference workloads on this platform?
We would greatly appreciate:
- technical guidance
- reference deployment examples
- supported software stack recommendations
- compatibility clarification
- debugging recommendations for QNN/HTP inference on QCS6490
Thank you for your support.
Best Regards,
Lokesh Srinivas Bhaskar P