You can get a noe SDK release from support.cixtech.com (apply through EBP); it includes a document explaining how to use the NPU for inference.
For the C++ API, set the following environment variables:
$ export LD_LIBRARY_PATH=/usr/share/cix/lib/onnxruntime:$LD_LIBRARY_PATH
$ export OPERATOR_PATH=/usr/share/cix/lib/onnxruntime/operator/
For the Python API, first install the onnxruntime_zhouyi wheel with pip:
$ pip3 install /usr/share/cix/pypi/onnxruntime_zhouyi-xxxx-linux_aarch64.whl
then set the environment variable:
$ export OPERATOR_PATH=/usr/share/cix/lib/onnxruntime/operator/
and add the following import to your Python file:
from ZhouyiOperators import operators
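
If the import or the inference later fails, it may help to first confirm that the variables above are actually visible to the process. The following is only a small sketch of such a check: the helper name check_npu_env is hypothetical (it is not part of the SDK), and the paths are simply the ones quoted above, so adjust them if your image differs.

```python
import os

# Library directory quoted in the steps above; change it if your install differs.
ORT_LIB_DIR = "/usr/share/cix/lib/onnxruntime"


def check_npu_env(environ=None, need_ld_path=False):
    """Hypothetical helper: return a list of problems found (empty if none)."""
    env = os.environ if environ is None else environ
    problems = []

    # OPERATOR_PATH is needed for both the C++ and the Python setup.
    operator_path = env.get("OPERATOR_PATH", "")
    if not operator_path:
        problems.append("OPERATOR_PATH is not set")
    elif not os.path.isdir(operator_path):
        problems.append(f"OPERATOR_PATH is not a directory: {operator_path}")

    # LD_LIBRARY_PATH is only listed for the C++ setup, so this check is optional.
    if need_ld_path:
        raw = env.get("LD_LIBRARY_PATH", "")
        entries = [p.rstrip("/") for p in raw.split(":") if p]
        if ORT_LIB_DIR not in entries:
            problems.append(f"LD_LIBRARY_PATH does not contain {ORT_LIB_DIR}")

    return problems


if __name__ == "__main__":
    for problem in check_npu_env():
        print("warning:", problem)
```

Call check_npu_env() before the ZhouyiOperators import in a Python script, or check_npu_env(need_ld_path=True) to also cover the C++ setup. It only inspects the environment and does not touch the NPU or the SDK itself.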