I very much like the hardware design of the Pi N10 and plan to buy one, if it can be delivered.
But I cannot find a detailed description of the NPU that explains why it is so fast. To get 3 TOPS out of a graphics card, at least 500 CUDA cores would be necessary. So how many arithmetic units does the NPU have? Is there any technical documentation available?
Many thanks for any hint.
Best regards Walter
There is some information on page 836 of the RK3399Pro TRM: http://rockchip.fr/Rockchip%20RK3399Pro%20TRM%20V1.0%20Part1.pdf
1920 Int8 MAC operations per cycle
192 Int16 MAC operations per cycle
64 FP16 MAC operations per cycle
And more details on that page about special operations.
The trick is that the 3 TOPS number is the absolute best-case performance, and it applies strictly to INT8 values.
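For a rough sanity check of how those per-cycle figures relate to 3 TOPS, here is a small back-of-the-envelope sketch. It assumes that one MAC counts as two operations (a multiply plus an add) and that the NPU clock is simply whatever value makes 1920 INT8 MACs per cycle come out to 3 TOPS. Neither assumption is taken from the TRM page, and the function names are made up for illustration.

```python
# MAC rates per cycle as quoted from the TRM in this thread.
MACS_PER_CYCLE = {"int8": 1920, "int16": 192, "fp16": 64}

# Assumption: one MAC is counted as two operations (multiply + add).
OPS_PER_MAC = 2

# Advertised best-case figure for INT8.
ADVERTISED_INT8_TOPS = 3.0


def implied_clock_hz(tops, macs_per_cycle, ops_per_mac=OPS_PER_MAC):
    """Clock frequency needed to reach `tops` at the given MAC rate."""
    return tops * 1e12 / (macs_per_cycle * ops_per_mac)


def peak_tops(macs_per_cycle, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Theoretical peak throughput in tera-operations per second."""
    return macs_per_cycle * ops_per_mac * clock_hz / 1e12


# Clock implied by the INT8 numbers (an estimate, not a datasheet value).
clock = implied_clock_hz(ADVERTISED_INT8_TOPS, MACS_PER_CYCLE["int8"])

for dtype, macs in MACS_PER_CYCLE.items():
    print(f"{dtype}: {peak_tops(macs, clock):.2f} TOPS at {clock / 1e6:.0f} MHz")
```

Under these assumptions the implied clock is a bit under 800 MHz, and the same clock gives only about a tenth of the INT8 figure for INT16 and about a thirtieth for FP16, which is why the headline number only holds for INT8 workloads.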
Many thanks for your interesting replies. It seems that the RK3399Pro is a damn fast device. Must have!