Research Article

Hybrid MPI and CUDA Parallelization for CFD Applications on Multi-GPU HPC Clusters

Table 2

The runtime for one CPU and a single GPU.

No.       CPU (ms)     GTX 1070 (ms)   Tesla V100 (ms)

Mesh 1      567.29         15.69             4.19
Mesh 2    1,170.6          30.61             8.27
Mesh 3    2,619.41         62.75            16.9
Mesh 4    5,605.29        120.25            33.18
Mesh 5   11,258.58        236.29            64.88
Mesh 6   22,850.69        476.12           128.98
Mesh 7   46,211.38          -              256.72
Mesh 8   93,322.76          -              512.44
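
As a reading aid, the speedup implied by each row of Table 2 can be computed as the CPU runtime divided by the GPU runtime. The following is a minimal sketch, not code from the article: the names RUNTIMES_MS, speedup, and speedup_table are hypothetical, the values are transcribed from the table, and the two missing GTX 1070 entries are assumed to mean that no runtime was reported.

```python
# Runtimes in ms transcribed from Table 2: (CPU, GTX 1070, Tesla V100).
# None marks an entry the table leaves blank (assumed: no runtime reported).
RUNTIMES_MS = {
    "Mesh 1": (567.29, 15.69, 4.19),
    "Mesh 2": (1170.6, 30.61, 8.27),
    "Mesh 3": (2619.41, 62.75, 16.9),
    "Mesh 4": (5605.29, 120.25, 33.18),
    "Mesh 5": (11258.58, 236.29, 64.88),
    "Mesh 6": (22850.69, 476.12, 128.98),
    "Mesh 7": (46211.38, None, 256.72),
    "Mesh 8": (93322.76, None, 512.44),
}


def speedup(cpu_ms, gpu_ms):
    """Return cpu_ms / gpu_ms, or None when the GPU runtime is missing."""
    if gpu_ms is None:
        return None
    return cpu_ms / gpu_ms


def speedup_table(runtimes=RUNTIMES_MS):
    """Map each mesh to (GTX 1070 speedup, Tesla V100 speedup) over the CPU."""
    return {
        mesh: (speedup(cpu, gtx), speedup(cpu, v100))
        for mesh, (cpu, gtx, v100) in runtimes.items()
    }


if __name__ == "__main__":
    for mesh, (s_gtx, s_v100) in speedup_table().items():
        gtx_text = "-" if s_gtx is None else f"{s_gtx:.1f}x"
        print(f"{mesh}: GTX 1070 {gtx_text}, Tesla V100 {s_v100:.1f}x")
```

Returning None for a missing entry, rather than zero or infinity, keeps the blank cells of the table visibly blank in any derived figure.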