Research Article
An FPGA-Based Hardware Accelerator for CNNs Using On-Chip Memories Only: Design and Benchmarking with Intel Movidius Neural Compute Stick
Table 10
Performance comparison between Xilinx FPGAs, Intel FPGAs, and NCS.
| Device | f_clk (MHz) | Inference time (ms) | Total power (W) | Energy (mJ) |
| --- | --- | --- | --- | --- |
| Xilinx FPGA families | | | | |
| Artix 7 | 47.6 | 0.94 | 1.043 | 0.98 |
| Kintex-7 lv | 48.2 | 0.93 | 0.969 | 0.90 |
| Zynq-7000 | 67.8 | 0.65 | 1.387 | 0.90 |
| Virtex 7 | 63.5 | 0.71 | 1.351 | 0.96 |
| Virtex-US | 78.4 | 0.57 | 1.861 | 1.01 |
| Virtex-US+ | 104.2 | 0.43 | 2.141 | 0.92 |
| Zynq-US+ | 116.4 | 0.39 | 2.259 | 0.88 |
| Intel FPGA families | | | | |
| Cyclone V | 31.4 | 1.43 | 2.301 | 3.29 |
| Stratix V E | 57.4 | 0.78 | 3.757 | 2.9 |
| Stratix V GS | 60.3 | 0.74 | 4.010 | 2.96 |
| Arria 10 | 61 | 0.73 | 1.002 | 0.73 |
| Stratix V GX | 80 | 0.56 | 3.385 | 1.9 |
| Intel Movidius Neural Compute Stick | | | | |
| NCS | 600 | 10 | 0.810 | 8.1 |
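The Energy column of Table 10 appears to be the product of total power and inference time, since watts multiplied by milliseconds gives millijoules. The following is a minimal sketch under that assumption; the helper name `energy_mj` and the choice of three rows are illustrative and do not come from the article.

```python
def energy_mj(power_w: float, time_ms: float) -> float:
    """Energy per inference in millijoules (W x ms = mJ)."""
    return power_w * time_ms


# (total power in W, inference time in ms), copied from Table 10
devices = {
    "Zynq-US+": (2.259, 0.39),
    "Arria 10": (1.002, 0.73),
    "NCS": (0.810, 10.0),
}

# Assumed relation: energy = total power x inference time
energies = {name: energy_mj(p, t) for name, (p, t) in devices.items()}

for name, e in energies.items():
    print(f"{name}: {e:.2f} mJ")
```

For these three rows the computed product agrees with the tabulated Energy value to two decimal places.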