Research Article
Hybrid MPI and CUDA Parallelization for CFD Applications on Multi-GPU HPC Clusters
Table 2
The runtime for one CPU and a single GPU.
| No. | CPU (ms) | GTX 1070 (ms) | Tesla V100 (ms) |
| --- | --- | --- | --- |
| Mesh 1 | 567.29 | 15.69 | 4.19 |
| Mesh 2 | 1,170.6 | 30.61 | 8.27 |
| Mesh 3 | 2,619.41 | 62.75 | 16.9 |
| Mesh 4 | 5,605.29 | 120.25 | 33.18 |
| Mesh 5 | 11,258.58 | 236.29 | 64.88 |
| Mesh 6 | 22,850.69 | 476.12 | 128.98 |
| Mesh 7 | 46,211.38 | - | 256.72 |
| Mesh 8 | 93,322.76 | - | 512.44 |
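The runtimes in Table 2 can be turned into speedup factors with simple arithmetic. The following is a minimal sketch, assuming speedup is defined as CPU runtime divided by GPU runtime. The names `RUNTIME_MS`, `speedup`, and `speedup_table` are hypothetical and not taken from the article. Entries the table leaves blank for the GTX 1070 are represented as `None`.

```python
# Runtimes in milliseconds, copied from Table 2 as (CPU, GTX 1070, Tesla V100).
# None marks entries for which the table reports no value.
RUNTIME_MS = {
    "Mesh 1": (567.29, 15.69, 4.19),
    "Mesh 2": (1170.6, 30.61, 8.27),
    "Mesh 3": (2619.41, 62.75, 16.9),
    "Mesh 4": (5605.29, 120.25, 33.18),
    "Mesh 5": (11258.58, 236.29, 64.88),
    "Mesh 6": (22850.69, 476.12, 128.98),
    "Mesh 7": (46211.38, None, 256.72),
    "Mesh 8": (93322.76, None, 512.44),
}


def speedup(cpu_ms, gpu_ms):
    """Return cpu_ms / gpu_ms, or None when the GPU runtime is missing.

    Assumption: speedup is the plain ratio of CPU to GPU runtime.
    """
    if gpu_ms is None:
        return None
    return cpu_ms / gpu_ms


def speedup_table(runtimes=RUNTIME_MS):
    """Map each mesh to its (GTX 1070 speedup, Tesla V100 speedup) pair."""
    return {
        mesh: (speedup(cpu, gtx), speedup(cpu, v100))
        for mesh, (cpu, gtx, v100) in runtimes.items()
    }


def fmt(value):
    """Format a speedup for printing, using 'n/a' for missing entries."""
    return "n/a" if value is None else f"{value:.1f}x"


if __name__ == "__main__":
    for mesh, (s_gtx, s_v100) in speedup_table().items():
        print(f"{mesh}: GTX 1070 {fmt(s_gtx)}, Tesla V100 {fmt(s_v100)}")
```

Under this definition, the ratio for each GPU can be read off per mesh, and the missing GTX 1070 entries for Mesh 7 and Mesh 8 simply propagate as missing values rather than being estimated.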