Research Article

Hybrid MPI and CUDA Parallelization for CFD Applications on Multi-GPU HPC Clusters

Table 4

The runtime for Tesla V100 multi-GPU clusters.

No.       Two GPUs (ms)    Three GPUs (ms)    Four GPUs (ms)

Mesh 1    3.85             3.41               3.01
Mesh 2    7.16             6.58               5.86
Mesh 3    13.78            11.82              9.31
Mesh 4    23.99            20.43              15.8
Mesh 5    41.59            32.22              25.71
Mesh 6    78.69            58.81              45.16
Mesh 7    151.78           108.85             83.92
Mesh 8    295.13           208.72             161.88
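
The runtimes in Table 4 can be turned into scaling figures with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: it assumes the two-GPU column is used as the baseline (the table reports no single-GPU runtime), and the names `RUNTIME_MS`, `GPU_COUNTS`, and `relative_scaling` are hypothetical helpers introduced only for illustration. Speedup is taken as the baseline runtime divided by the runtime at n GPUs, and relative efficiency as that speedup scaled by 2/n.

```python
# Runtimes in milliseconds, transcribed from Table 4.
# Each tuple holds the values for (two, three, four) GPUs.
RUNTIME_MS = {
    "Mesh 1": (3.85, 3.41, 3.01),
    "Mesh 2": (7.16, 6.58, 5.86),
    "Mesh 3": (13.78, 11.82, 9.31),
    "Mesh 4": (23.99, 20.43, 15.8),
    "Mesh 5": (41.59, 32.22, 25.71),
    "Mesh 6": (78.69, 58.81, 45.16),
    "Mesh 7": (151.78, 108.85, 83.92),
    "Mesh 8": (295.13, 208.72, 161.88),
}

# GPU counts matching the tuple positions above.
GPU_COUNTS = (2, 3, 4)


def relative_scaling(times, gpus=GPU_COUNTS):
    """Return a list of (speedup, efficiency) pairs.

    Both quantities are relative to the first entry (assumed baseline:
    the two-GPU run), since no single-GPU runtime is available.
    """
    base_time, base_gpus = times[0], gpus[0]
    results = []
    for t, n in zip(times, gpus):
        speedup = base_time / t                # how much faster than baseline
        efficiency = speedup * base_gpus / n   # 1.0 means ideal scaling
        results.append((speedup, efficiency))
    return results


if __name__ == "__main__":
    for mesh, times in RUNTIME_MS.items():
        cells = ", ".join(
            f"{n} GPUs: {s:.2f}x ({e:.0%})"
            for n, (s, e) in zip(GPU_COUNTS, relative_scaling(times))
        )
        print(f"{mesh}: {cells}")
```

Running the script prints one line per mesh, which makes it easy to compare how the relative efficiency changes between the smallest and largest meshes.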