Research Article
Parallel Algorithms of Well-Balanced and Weighted Average Flux for Shallow Water Model Using CUDA
Table 3
Speedup compared to baseline program for various problems.
| Program | Rectangular dam break (16k) | Rectangular dam break (1M) | Circular dam break on wet bed (16k) | Circular dam break on wet bed (1M) | Circular dam break on dry bed (16k) | Circular dam break on dry bed (1M) | Dam break flows over three humps (16k) | Dam break flows over three humps (1M) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Serial: cell-based | Baseline | Baseline | Baseline | Baseline | Baseline | Baseline | Baseline | Baseline |
| Serial: edge-based | 1.90 | 1.92 | 1.87 | 1.24 | 1.92 | 1.52 | 1.74 | 1.81 |
| Parallel V1 | 12.45 | 18.67 | 20.98 | 19.66 | 19.03 | 21.39 | 13.74 | 17.47 |
| Parallel occupancy | 30.18 | 52.20 | 38.04 | 39.71 | 34.23 | 43.82 | 31.87 | 43.58 |
| Parallel memory pattern | 40.24 | 52.56 | 48.19 | 40.22 | 42.55 | 44.12 | 31.49 | 44.02 |
| Parallel unroll | 48.07 | 63.35 | 57.72 | 49.39 | 50.19 | 53.67 | 39.64 | 50.60 |