Research Article
Implementation and Optimization of a CFD Solver Using Overlapped Meshes on Multiple MIC Coprocessors
Table 3
Wall clock times for
and
(in seconds).
| NT | Scatter | Balanced | Compact | Scatter | Balanced | Compact |
|---|---|---|---|---|---|---|
| 4 | 0.988 (1.0x) | 0.988 (1.0x) | 2.424 (1.0x) | 1.404 (1.0x) | 1.404 (1.0x) | 2.736 (1.0x) |
| 8 | 0.508 (1.94x) | 0.508 (1.94x) | 1.256 (1.93x) | 0.716 (1.96x) | 0.714 (1.97x) | 1.436 (1.91x) |
| 16 | 0.280 (3.53x) | 0.282 (3.50x) | 0.664 (3.65x) | 0.408 (3.44x) | 0.407 (3.45x) | 0.804 (3.4x) |
| 32 | 0.164 (6.02x) | 0.166 (5.95x) | 0.368 (6.59x) | 0.260 (5.4x) | 0.264 (5.32x) | 0.464 (5.90x) |
| 59 | 0.140 (7.06x) | 0.139 (7.11x) | 0.232 (10.4x) | 0.200 (7.02x) | 0.202 (6.95x) | 0.283 (9.67x) |
| 118 | 0.156 (6.33x) | 0.152 (6.5x) | 0.196 (12.4x) | 0.200 (7.02x) | 0.199 (7.06x) | 0.228 (12.0x) |
| CPU time | 0.56 (3.68x) | | | 0.572 (2.87x) | | |
NT: number of threads.
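The factors in parentheses are consistent with each wall clock time being compared against the 4-thread time in the same column. The minimal Python sketch below reproduces them for the first Scatter column under that assumption. The names `scatter_times` and `speedups` are illustrative and do not come from the article.

```python
# Wall clock times in seconds, copied from the first "Scatter" column of Table 3.
# Keys are thread counts (NT).
scatter_times = {4: 0.988, 8: 0.508, 16: 0.280, 32: 0.164, 59: 0.140, 118: 0.156}


def speedups(times, baseline_threads=4):
    """Return the speedup of every entry relative to the baseline thread count.

    Assumption: the baseline is the 4-thread run of the same column,
    which matches the "(1.0x)" entries in the table.
    """
    base = times[baseline_threads]
    return {nt: base / t for nt, t in times.items()}


if __name__ == "__main__":
    for nt, factor in speedups(scatter_times).items():
        print(f"{nt:>3} threads: {factor:.2f}x")
```

For this column the computed factor rises up to 59 threads and then drops at 118 threads, matching the 7.06x and 6.33x entries in the table.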