Use YOLOv8 on the RK3588 NPU

First things first:
On your PC:
The test.py is for RKNN's customized yolov5.
Please check the link above and rewrite the post-processing code in your test.py, e.g. the code after outputs = model.forward().
For inference on your hardware:

You need to remove the node /model.24/Add_2 in your yolov8.onnx.
Long story short, this node is related to the bounding box generation, which is not supported by the rknpu.

Here is a screenshot of the onnx model, please take a look, thanks!

Could you please share a version of rknn-toolkit2/examples/onnx/yolov5/test.py adapted for yolov8, including the pre-/post-processing code needed to fit the v8 model onto the RK3588 NPU hardware?
thanks again!
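(While waiting for a full example, here is a minimal pre-processing sketch for the 640×640 NHWC input the RKNN runtime expects; the nearest-neighbour resize and grey padding value 114 follow the usual yolov5/yolov8 letterbox convention, but treat them as assumptions to check against your own pipeline:)

```python
import numpy as np

def letterbox(img, size=640, pad_value=114):
    # img: HxWx3 uint8. Scale the longer side to `size` (nearest-neighbour
    # resize via index arithmetic, so no OpenCV dependency), then pad the
    # shorter side with grey so the output is size x size x 3, NHWC-ready.
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.full((size, size, 3), pad_value, dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas[None]  # add batch dim -> (1, size, size, 3)

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in webcam frame
batch = letterbox(frame)
print(batch.shape)  # (1, 640, 640, 3)
```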

Thank you for your time and answers

How do I obtain an onnx model (the same as the one in the rknn-toolkit2 repository) for rknn?
Because I need to run my own network on an RK3588 board, not the standard one.

I see the difference in netron, but I don't understand how to fix it and where (look at the png: _work is from rockchip http://joxi.ru/krDjRz0hKN0Xz2, _off is from the yolov5 repo http://joxi.ru/vAWLOzpfgbkL4r)

Also, I tried this method https://github.com/rockchip-linux/rknpu2/issues/57
netron http://joxi.ru/xAevBgOIRQYqnr

clone this repo https://github.com/airockchip/yolov5
then,
$ python3 export.py --weights yolov5s.pt --rknpu "RK3588" --include "onnx"

It behaves very strangely: after test.py http://joxi.ru/krDjRz0hKN0472 and on the RK3588 board I see a million fake detections http://joxi.ru/ZrJdkz0hwE1WE2

So it doesn't work like the onnx from the rockchip github.

Inference from test.py (yolov5s)
working onnx from rockchip github http://joxi.ru/4Ak8lg9foOMEvr (looks fine)
onnx from export.py with opset=12 http://joxi.ru/ZrJdkz0hwE1l12 (out of range)
onnx from export.py with --rknpu RK3588 http://joxi.ru/E2pBpE9h7WByVr (million fake recognitions)

Thanks for the reply, niccliu973 :slight_smile:
I searched the web and found this pdf. In the end, the best solution was to build the model on AMD64 Linux and then test it on this board. After that, I confirmed it works on Armbian.

thx :slight_smile:

Finally it works

Follow these instructions:

https://blog.csdn.net/m0_57315535/article/details/128250096?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_baidulandingword~default-0-128250096-blog-125578222.235^v32^pc_relevant_default_base&spm=1001.2101.3001.4242.1&utm_relevant_index=3

You need to install older versions of these libraries:

pip install torch==1.10.0
pip install torchvision==0.11.1

I found the issues with my model export, and I have it matching yours now:

rknn_api/rknnrt version: 1.4.0 (a10f100eb@2022-09-09T09:07:14), driver version: 0.8.2
total weight size: 23487936, total internal size: 22118400
total dma used size: 63889408
model input num: 1, output num: 1
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=2457600, w_stride = 640, size_with_stride=2457600, fmt=NHWC,type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
index=0, name=output0, n_dims=4, dims=[1, 84, 8400, 1], n_elems=705600, size=1411200, w_stride = 0, size_with_stride=1411200, fmt=NCHW, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000

Now this is the error:

custom string:
Warmup ...
E RKNN: [14:40:02.997] failed to submit!, op id: 171, op name: Add:/model.22/Add_1, flags: 0x5, task start: 9781, task number: 3, run task counter: 0, int status: 0
rknn run error -1

Which I suppose is this:

Is there an easy way to remove that node automatically? :thinking:

Thanks, Luke!!
That’s a great finding:) Cannot wait to try the npu in armbian:)

2 Likes

Sorry for the late reply, @1117.
That's a great finding!!
The link specified the correct version of YoloV5 to convert to an rknn model.
But the approach also seems feasible for YoloV8.
Have you tried it on YoloV8?

Hi, Milas,

My modified onnx model looks like the following:

@1117 found a great post that shows a potentially easier solution: modifying the Detect class in modules.py

Follow the instructions :slight_smile:

Your web browser or OS probably needs the “Chinese simplified” language.

I used google translate in chrome and it worked fine

1 Like

I got it working. I had installed ultralytics via pip; I then cloned the repo into the working directory, and now it's working fine.

Now I'm dealing with very poor frame rates due to inference times. Just using the webcam with no detection, I'm getting around 100 fps. With detection, that drops to around 2 fps.

I see where I'm going wrong now, but I'm having an issue installing rknn_toolkit_lite2. I'm at the point where I need to install the package, but I am only met with the error “rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl is not a supported wheel on this platform.”

I am using Armbian.

I know this is going to end up a little off topic, but I've done a good bit of research today, as I'm doing this in an attempt to set up facial recognition for work running on Rock 5Bs. I have it working without the NPU, just with abysmal fps.

Since I'm currently working on just the 5B, I can't convert my .pt model to rknn. So I've been digging elsewhere and came across Arm NN. I'm going to see how I can make use of it; if it performs poorly, I will just have to convert to rknn on my home computer and transfer the model to the 5B.

Hello, I want to know how to do post-processing for yolov8n, because the output shape of yolov8n in rknn format is [1, 84, 8400, 1]. Can you give me your email? I have some questions to ask. Thanks!
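(For reference, a minimal NumPy sketch of decoding that [1, 84, 8400, 1] layout, assuming the 84 channels are 4 box values (cx, cy, w, h in pixels) followed by 80 class scores, as in the standard yolov8 head; confidence thresholding only, NMS omitted:)

```python
import numpy as np

def decode_yolov8(raw, conf_thres=0.25):
    # raw: (1, 84, 8400, 1) -> (8400, 84): one row per anchor point.
    preds = raw.reshape(84, -1).T
    boxes = preds[:, :4]          # cx, cy, w, h
    scores = preds[:, 4:]         # 80 class scores
    confs = scores.max(axis=1)
    cls_ids = scores.argmax(axis=1)
    keep = confs > conf_thres
    cx, cy, w, h = boxes[keep].T
    # convert centre/size to corner coordinates (x1, y1, x2, y2)
    xyxy = np.stack([cx - w / 2, cy - h / 2,
                     cx + w / 2, cy + h / 2], axis=1)
    return xyxy, cls_ids[keep], confs[keep]

# Tiny demo: one synthetic detection at anchor 0, class 7, score 0.9.
raw = np.zeros((1, 84, 8400, 1), dtype=np.float32)
raw[0, :4, 0, 0] = [320, 320, 100, 50]
raw[0, 4 + 7, 0, 0] = 0.9
boxes, classes, confs = decode_yolov8(raw)
print(boxes)    # [[270. 295. 370. 345.]]
print(classes)  # [7]
```

Whether the class scores still need a sigmoid depends on how the model was exported, so check your own graph before relying on the raw values.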

Hi, @Dbenton,
Sorry for the late reply. I was occupied by work…
It's great to hear you got opencv working.
This issue is because the rknn_toolkit requires a specific version of the python environment.
The .whl file you're using requires python 3.9.
You may want to use mini-conda to create a virtual python==3.9 environment to proceed:)
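(Side note: that “not a supported wheel” message just means the wheel's cpXY tag doesn't match the running interpreter; a quick sketch of reading the required version out of the filename above:)

```python
import sys

wheel = "rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl"
tag = wheel.split("-")[2]           # python tag, e.g. 'cp39'
need = (int(tag[2]), int(tag[3:]))  # -> (3, 9)
print("wheel needs python %d.%d, running %d.%d"
      % (need + tuple(sys.version_info[:2])))
```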

Bests.

This link already includes the post-processing.

Yes, that is how I was able to install it.

Currently I still have not resorted to using a separate machine to create the .rknn file. Using opencv's dnn module on an .onnx file has sped up inference times by about 5x, however. Still very poor frame rates (about 5-6 fps), but I have just gotten OpenCL installed on the Rock 5B using the Mali blob driver.

Likely I'll be converting to a .rknn file tonight and getting it onto the 5B tomorrow.

Hi, Dbenton,

Good to know you have solved the problem.
There is another multi-threaded rknn project that may be worth a try.
The profiling results showed a ~3x speedup going from 1 thread to 6 threads.
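The multi-thread idea can be sketched with a small worker pool; `infer` below is a stand-in for a per-thread RKNNLite session (on real hardware each worker would load the .rknn model once and run its own inferences), so this only illustrates the threading shape, not the NPU API:

```python
from concurrent.futures import ThreadPoolExecutor

def infer(frame_id):
    # placeholder for real NPU inference on one frame
    return frame_id * frame_id

# Dispatch 6 "frames" across 3 concurrent workers; map preserves order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(infer, range(6)))
print(results)  # [0, 1, 4, 9, 16, 25]
```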

Bests.

1 Like