Research Article
Implementation and Optimization of a CFD Solver Using Overlapped Meshes on Multiple MIC Coprocessors
Table 1
Wall clock times for
and
(in seconds).
| NT | Scatter | Balanced | Compact | Scatter | Balanced | Compact |
| 4 | 1.80 (1.0x) | 1.81 (1.0x) | 3.256 (1.0x) | 0.680 (1.0x) | 0.680 (1.0x) | 1.668 (1.0x) |
| 8 | 0.932 (1.93x) | 0.935 (1.94x) | 1.680 (1.94x) | 0.352 (1.93x) | 0.352 (1.93x) | 0.856 (1.95x) |
| 16 | 0.520 (3.46x) | 0.519 (3.49x) | 0.880 (3.7x) | 0.192 (3.54x) | 0.194 (3.51x) | 0.468 (3.56x) |
| 32 | 0.288 (6.25x) | 0.288 (6.28x) | 0.488 (6.67x) | 0.160 (4.25x) | 0.161 (4.22x) | 0.28 (5.96x) |
| 59 | 0.196 (9.18x) | 0.196 (9.23x) | 0.296 (11.0x) | 0.144 (4.72x) | 0.144 (4.72x) | 0.224 (7.45x) |
| 118 | 0.144 (12.5x) | 0.136 (13.3x) | 0.160 (20.35x) | 0.148 (4.59x) | 0.148 (4.59x) | 0.196 (8.51x) |
| CPU time | 0.52 (3.82x) | 0.54 (3.64x) |
NT: number of threads.
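The parenthesized factors in the thread-count rows appear to be speedups relative to the 4-thread run of the same column, that is, the 4-thread time divided by the time at NT threads. The following minimal sketch checks that reading against the first Scatter column of Table 1. The function name `speedups` and the dictionary layout are illustrative assumptions, not part of the solver described in the article.

```python
# Sketch: recompute the speedup factors of one Table 1 column.
# Assumption: speedup(NT) = T(baseline NT) / T(NT), with the 4-thread run as baseline.

def speedups(times, baseline_nt=4):
    """Return {NT: speedup} relative to the run with baseline_nt threads."""
    base = times[baseline_nt]
    return {nt: base / t for nt, t in times.items()}

# Wall clock times (seconds) copied from the first Scatter column of Table 1.
scatter_times = {4: 1.80, 8: 0.932, 16: 0.520, 32: 0.288, 59: 0.196, 118: 0.144}

scatter_speedups = speedups(scatter_times)

if __name__ == "__main__":
    for nt, s in scatter_speedups.items():
        print(f"NT={nt:3d}  time={scatter_times[nt]:.3f} s  speedup={s:.2f}x")
```

Under this assumption the recomputed factors agree with the tabulated ones to the printed precision, for example 1.80 / 0.144 = 12.5 at 118 threads.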