The 24Q4 release ships with cixbuild version 6.1.2958, while the 25Q1 release ships with cixbuild version 6.1.3119.
Following the YOLOv8 example from the Model Zoo, we downloaded the ONNX model from ModelScope.
With cixbuild version 6.1.2958 we compiled the ONNX model to .cix format using the vendor-provided cfg file yolov8_lbuild.cfg.
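For reference, the compile step is just a single cixbuild invocation against that cfg file; a minimal sketch of how we script it, assuming cixbuild takes the cfg path as its only argument (which is how we ran it):

```python
import subprocess

# Compile the ONNX model to .cix format. cixbuild reads the model path,
# quantization settings and output location from the cfg file itself.
subprocess.run(["cixbuild", "yolov8_lbuild.cfg"], check=True)
```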
We modified the vendor's inference_npu.py script to add simple benchmarking and to output the object detection results (see the sketch below). With the 6.1.2958 build, inference on the input image averages 110 ms.
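The benchmarking change is nothing exotic; here is a minimal sketch of the timing loop we added, where `run_inference` is a stand-in for the vendor's NPU inference call and the warm-up/iteration counts are arbitrary choices:

```python
import time
import numpy as np

def benchmark(run_inference, input_data, warmup=5, iters=50):
    # Warm-up runs so one-time setup cost doesn't skew the average.
    for _ in range(warmup):
        run_inference(input_data)
    # Timed runs: average wall-clock time per inference in milliseconds.
    start = time.perf_counter()
    for _ in range(iters):
        run_inference(input_data)
    return (time.perf_counter() - start) * 1000.0 / iters

# Example with a dummy 640x640 input in YOLOv8's expected NCHW layout:
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
# avg_ms = benchmark(npu_run, dummy)  # npu_run: the vendor API call
```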
We then repeated the process with cixbuild version 6.1.3119, compiling the same ONNX model with the same cfg file. However, the average inference time is three times slower, at 340 ms.
All of these tests were run on Radxa's Debian b3 image.
The precompiled .cix model from ModelScope also runs at an average of 110 ms, matching our 6.1.2958 build.
So what causes cixbuild version 6.1.3119 to produce a model that runs three times slower?
Secondly, in the yolov8_lbuild.cfg file the optimizer section has these settings:
```
trigger_float_op = disable & <[(258, 272)]:float16_preferred!>
weight_bits = 8& <[(273,274)]:16>
activation_bits = 8& <[(273,274)]:16>
bias_bits = 32& <[(273,274)]:48>
```
How are those magic numbers (258, 272) and (273, 274) determined?
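Our working assumption, not something we found in the documentation, is that these tuples are node index ranges in the graph, selecting which layers are kept in float16 or given wider quantization. A quick way to list ONNX nodes by index for comparison looks like this (the model filename here is a placeholder, and it is unclear whether the compiler's indices match this ordering):

```python
import onnx

# Model filename assumed; use the ONNX file fed to cixbuild.
model = onnx.load("yolov8_l.onnx")
for idx, node in enumerate(model.graph.node):
    # Print index, op type and name so ranges like (258, 272) and
    # (273, 274) can be compared against the graph order.
    if 258 <= idx <= 274:
        print(idx, node.op_type, node.name)
```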