Research Article
Implementation and Optimization of a CFD Solver Using Overlapped Meshes on Multiple MIC Coprocessors
Table 2
Wall clock times for … and … (in seconds).
| NT | Scatter | Balanced | Compact | Scatter | Balanced | Compact |
|---|---|---|---|---|---|---|
| 4 | 1.22 (1.0x) | 1.22 (1.0x) | 2.288 (1.0x) | 2.132 (1.0x) | 2.132 (1.0x) | 6.80 (1.0x) |
| 8 | 0.636 (1.91x) | 0.636 (1.91x) | 1.196 (1.91x) | 1.160 (1.83x) | 1.161 (1.84x) | 3.44 (1.98x) |
| 16 | 0.352 (3.47x) | 0.353 (3.46x) | 0.660 (3.47x) | 0.632 (3.37x) | 0.630 (3.38x) | 1.832 (3.71x) |
| 32 | 0.268 (4.55x) | 0.266 (4.59x) | 0.432 (5.29x) | 0.360 (5.92x) | 0.362 (5.89x) | 0.960 (7.08x) |
| 59 | 0.232 (5.26x) | 0.231 (5.28x) | 0.272 (8.41x) | 0.296 (7.2x) | 0.296 (7.2x) | 0.684 (9.94x) |
| 118 | 0.216 (5.65x) | 0.212 (5.75x) | 0.260 (8.8x) | 0.296 (7.2x) | 0.288 (7.4x) | 0.48 (14.17x) |
| CPU time | 0.676 (3.19x) | | | 0.72 (2.5x) | | |
NT: number of threads.
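The factors in parentheses appear to be ratios of wall clock times: each entry against the 4-thread time in the same column, and each CPU-row entry against the fastest coprocessor time in its group of three columns. This reading is inferred from the numbers and is not stated in the table. The following Python sketch, with illustrative names (`scatter_times`, `speedup`) that are not from the article, recomputes the factors for the first Scatter column and for the first CPU entry. Because the tabulated times are themselves rounded, the recomputed ratios may differ from the printed ones in the last digit.

```python
# Wall clock times (seconds) from the first Scatter column of Table 2,
# keyed by the number of threads NT.
scatter_times = {4: 1.22, 8: 0.636, 16: 0.352, 32: 0.268, 59: 0.232, 118: 0.216}


def speedup(reference_time, time):
    """Return the ratio of a reference wall clock time to a measured one."""
    return reference_time / time


# Assumption: the 4-thread run is the baseline for every column.
baseline = scatter_times[4]
speedups = {nt: speedup(baseline, t) for nt, t in scatter_times.items()}

# Assumption: the CPU-row factor compares the CPU time with the fastest
# coprocessor time in the same group (here Balanced at NT = 118).
cpu_time = 0.676
best_mic_time = 0.212
cpu_vs_mic = speedup(cpu_time, best_mic_time)

for nt, s in speedups.items():
    print(f"NT={nt:3d}  speedup={s:.2f}x")
print(f"CPU vs best MIC: {cpu_vs_mic:.2f}x")
```

The same `speedup` helper applies to any other column by substituting that column's times and its own 4-thread baseline.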