Research Article
Performance Optimization of 3D Lattice Boltzmann Flow Solver on a GPU
Table 2
Summary of experiments.
| Experiment | Description |
| --- | --- |
| Serial | Serial implementation on single CPU core |
| Parallel | |
| AoS | AoS scheme |
| SoA_Push_Only | SoA scheme + push data scheme |
| SoA_Pull | |
| SoA_Pull_Only | SoA scheme + pull data scheme |
| SoA_Pull_BR | SoA scheme + pull data scheme + branch divergence removal |
| SoA_Pull_RR | SoA scheme + pull data scheme + register reduction |
| SoA_Pull_Full | SoA scheme + pull data scheme + branch divergence removal + register usage reduction |
| SoA_Pull_Full_Tiling | SoA scheme + pull data scheme + branch divergence removal + register usage reduction + tiling with data layout change |
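
To make the layout and streaming terms in Table 2 concrete, the following host-side C++ sketch contrasts AoS with SoA indexing and push with pull streaming. It is a minimal illustration under stated assumptions, not the solver's implementation: the function names, the linear cell index, the number of directions `Q`, and the one-dimensional periodic domain are all hypothetical and chosen only for brevity.

```cpp
#include <cstddef>
#include <vector>

// All names below are hypothetical; they illustrate the terms in Table 2 only.

// AoS: the Q distribution values of one cell are stored contiguously.
constexpr std::size_t aos_index(std::size_t cell, std::size_t dir, std::size_t Q) {
    return cell * Q + dir;
}

// SoA: the values of all cells for one direction are stored contiguously.
constexpr std::size_t soa_index(std::size_t cell, std::size_t dir, std::size_t n_cells) {
    return dir * n_cells + cell;
}

// Wrap an index into [0, n) for a periodic domain (assumed boundary handling).
inline int wrap(int x, int n) {
    return ((x % n) + n) % n;
}

// Push scheme: each cell reads its own value and writes it to the
// neighbor located at offset c (the lattice velocity along this axis).
std::vector<double> stream_push(const std::vector<double>& f, int c) {
    const int n = static_cast<int>(f.size());
    std::vector<double> out(f.size());
    for (int x = 0; x < n; ++x) {
        out[wrap(x + c, n)] = f[x];
    }
    return out;
}

// Pull scheme: each cell reads the value of the upstream neighbor at
// offset -c and writes it to its own location.
std::vector<double> stream_pull(const std::vector<double>& f, int c) {
    const int n = static_cast<int>(f.size());
    std::vector<double> out(f.size());
    for (int x = 0; x < n; ++x) {
        out[x] = f[wrap(x - c, n)];
    }
    return out;
}
```

Both streaming variants produce the same result; they differ only in whether the shifted access is a write (push) or a read (pull). On a GPU where consecutive threads process consecutive cells, the SoA index makes accesses for a fixed direction contiguous across threads, which is the usual motivation for preferring it over AoS.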