Research Article
Hybrid MPI and CUDA Parallelization for CFD Applications on Multi-GPU HPC Clusters
Table 4
The runtime for Tesla V100 multi-GPU clusters.
| No. | Two GPUs (ms) | Three GPUs (ms) | Four GPUs (ms) |
| --- | --- | --- | --- |
| Mesh 1 | 3.85 | 3.41 | 3.01 |
| Mesh 2 | 7.16 | 6.58 | 5.86 |
| Mesh 3 | 13.78 | 11.82 | 9.31 |
| Mesh 4 | 23.99 | 20.43 | 15.8 |
| Mesh 5 | 41.59 | 32.22 | 25.71 |
| Mesh 6 | 78.69 | 58.81 | 45.16 |
| Mesh 7 | 151.78 | 108.85 | 83.92 |
| Mesh 8 | 295.13 | 208.72 | 161.88 |
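
The runtimes in Table 4 can be turned into relative speedup and parallel efficiency figures with a few lines of arithmetic. The sketch below is illustrative only: the two-GPU run is taken as the baseline because the table lists no single-GPU time, and the names `RUNTIME_MS`, `relative_speedup`, and `relative_efficiency` are hypothetical helpers, not part of the article's code.

```python
# Runtimes in ms copied from Table 4: (two, three, four GPUs) per mesh.
RUNTIME_MS = {
    "Mesh 1": (3.85, 3.41, 3.01),
    "Mesh 2": (7.16, 6.58, 5.86),
    "Mesh 3": (13.78, 11.82, 9.31),
    "Mesh 4": (23.99, 20.43, 15.8),
    "Mesh 5": (41.59, 32.22, 25.71),
    "Mesh 6": (78.69, 58.81, 45.16),
    "Mesh 7": (151.78, 108.85, 83.92),
    "Mesh 8": (295.13, 208.72, 161.88),
}
GPU_COUNTS = (2, 3, 4)


def relative_speedup(times):
    """Speedup of each configuration relative to the two-GPU run (assumed baseline)."""
    base = times[0]
    return tuple(base / t for t in times)


def relative_efficiency(times, gpu_counts=GPU_COUNTS):
    """Measured speedup divided by the ideal speedup n / n_base."""
    base_n = gpu_counts[0]
    speedups = relative_speedup(times)
    return tuple(s / (n / base_n) for s, n in zip(speedups, gpu_counts))


if __name__ == "__main__":
    for mesh, times in RUNTIME_MS.items():
        s = relative_speedup(times)
        e = relative_efficiency(times)
        print(f"{mesh}: 4 vs 2 GPUs speedup = {s[2]:.2f}, efficiency = {e[2]:.2f}")
```

Under this baseline, going from two to four GPUs gives a speedup of about 1.28 on Mesh 1 and about 1.82 on Mesh 8, so the larger meshes come closer to the ideal factor of 2.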