Zero 2 Pro NPU - Any Experiences?

I just noticed that they are advertising the Zero 2 Pro, and I've been looking out for this for a while. I have a project I've been working on with AI inference and have been using a Rock 5A in my prototype. I'm quite pleased with it and probably should just stick with it, but I'm interested to know how the Zero 2 Pro compares, as I would prefer the form factor, lower TDP and especially the price. I'm using YOLOv8 models, but I'm happy to hear any experiences, such as how difficult it was to get working and what speeds you saw. How does the 5 TOPS NPU compare with the RK3588's 6 TOPS NPU? Has anyone even got the NPU working, either on the Zero 2 Pro or another device? I will be getting one when I see them next available.

@RadxaYuntian Could you please advise?

The NPU is currently only available in Android, since it requires the Amlogic SDK. Our Linux image is based on the upstream Linux kernel and Debian packages, which do not have support for it.

Furthermore, we currently have no experience developing NPU-related functionality on this SoC ourselves.

I have no idea but would this work?

I'm keeping an eye on this as well; I would like to buy the Zero 2 Pro because of the competitive price and form factor. However, if the NPU doesn't work, then it's useless for me. That said, the Khadas VIM3 is based on the same SoC, they supported the NPU early on, and the software support seems to be mature now. They have their own SDK based on the Amlogic SDK, which seems OK, although it may only work on their image. The easiest option seems to be what Stuart linked to: an extension/delegate that adds support for inference on the NPU using more or less standard TFLite. That implementation is backed by VeriSilicon, the company that made the NPU. (See https://docs.khadas.com/products/sbc/vim3/npu/start) A minimal sketch of what running through such a delegate would look like is below.
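To make that concrete, here is a minimal sketch of driving a TFLite model through an external delegate, assuming the vx_delegate shared library has already been built for the board. The library path and model filename are assumptions for illustration, not documented locations.

```python
# Minimal sketch: run a quantized TFLite model through an external delegate.
# The delegate library path below is an assumption; adjust to where you
# actually installed the built libvx_delegate.so.
import numpy as np
import tflite_runtime.interpreter as tflite

delegate = tflite.load_delegate("/usr/lib/libvx_delegate.so")  # assumed path
interpreter = tflite.Interpreter(
    model_path="model_int8.tflite",           # placeholder model file
    experimental_delegates=[delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy frame just to confirm the delegate path executes.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
```

If the delegate can't handle an operator, TFLite falls back to the CPU for that node, so the same script should still run (just slower) even when NPU support is incomplete.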

I would very much like to know if tflite_vx_delegate can be made to work on the Zero 2 Pro. Getting the delegate working correctly probably requires the right kernel version and some dependencies in place. I have no idea how difficult that would be, but it would be great if Radxa could make this a smooth process; it would also relieve them of the burden of making a unique SDK for the Zero 2 Pro.

That said, going from the RK3588 to the A311D you would need to make some changes to your code and workflow. If tflite_vx_delegate works, it seems pretty straightforward: instead of converting your model with RKNN, you would quantize the model in TensorFlow, save it as TFLite, and then it's pretty much business as usual for running a TFLite model. A rough sketch of that conversion step is shown below.
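For reference, this is roughly what the post-training integer quantization step looks like in TensorFlow. The SavedModel path, input shape and output filename are placeholders for your own model, not anything specific to the Zero 2 Pro.

```python
# Sketch of the conversion step: post-training full-integer quantization
# in TensorFlow, producing a .tflite file that a delegate can consume.
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("yolov8_saved_model")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    # A handful of calibration samples shaped like the model input
    # (here a 640x640 RGB image, as a YOLOv8 export would typically expect).
    for _ in range(100):
        yield [np.random.rand(1, 640, 640, 3).astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("yolov8_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

In practice you would feed real calibration images rather than random data, since the quantization ranges are derived from the representative dataset.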

The Khadas VIM3 image with working NPU support runs a 4.9 kernel, which is far too old and already end of life.

https://9to5linux.com/linux-kernel-4-9-reaches-end-of-life-after-6-years-of-support

I was told someone is working on the NPU driver for the mainline kernel, so I think we will have to wait for the upstream driver to become available, hopefully by the end of December.


Tomeu did a nice writeup on performance improvements: https://blog.tomeuvizoso.net/2024/02/etnaviv-npu-update-16-nice-performance.html

Has the Zero 2 Pro been launched yet?