I was able to run the ssd_mobilenet_v1 demo code (ssd.py) successfully.
With do_quantization=True the inference time is impressive, around 22 to 24 ms per frame. However, the “Post Process” step (getting the valid candidate boxes) can take almost 700 ms per frame!
Questions:
- Is there a way to speed up this “Post Process” by using the NPU? If so, how?
- Or by using the GPU? If so, how?
- Or is there another way? Would porting this “Post Process” to C make it faster, and by how much?
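
On the "another way" option, one thing that might be worth trying before a C port is replacing any per-anchor Python loops with vectorized NumPy calls. This is only a minimal sketch: I have not confirmed how ssd.py implements the step, and the function name `filter_candidates`, the array shapes (1917 anchors, 91 classes), the background column at index 0, and the 0.5 threshold are all assumptions for illustration.

```python
import numpy as np


def filter_candidates(scores, boxes, score_threshold=0.5):
    """Keep (anchor, class) pairs whose score exceeds the threshold.

    scores: (num_anchors, num_classes) array; column 0 is ASSUMED to be background.
    boxes:  (num_anchors, 4) array of already-decoded boxes.
    Hypothetical helper, not taken from ssd.py.
    """
    fg = scores[:, 1:]  # drop the assumed background column
    # One vectorized comparison instead of a nested Python loop.
    anchor_idx, class_idx = np.nonzero(fg > score_threshold)
    # class_idx is relative to fg, so shift by 1 to get the original class id.
    return boxes[anchor_idx], class_idx + 1, fg[anchor_idx, class_idx]


# Synthetic data with assumed shapes: 1917 anchors, 91 classes.
rng = np.random.default_rng(0)
scores = rng.random((1917, 91)) * 0.4  # every score stays below the threshold
scores[10, 3] = 0.9                    # plant two confident detections
scores[500, 17] = 0.8
boxes = rng.random((1917, 4))

kept_boxes, kept_classes, kept_scores = filter_candidates(scores, boxes)
print(kept_boxes.shape, kept_classes, kept_scores)
```

Only the few boxes that survive this filter would then need to go through non-maximum suppression, so the remaining work should be small. Whether this helps depends on where the 700 ms is actually spent, which I have not profiled.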
Thank you very much for your help.