Research Article
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
| Radius | Type | Dims | Unique Dim | BX/VX | WX | WY | WZ | CX | CY | CZ | Data loading technique |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | thumbtack | 3 | x | 4 | 16 | 16 | 1 | 1 | 1 | 1 | vec |
| 1 | thumbtack | 3 | y | 1 | 256 | 1 | 1 | 1 | 1 | 2 | global |
| 1 | thumbtack | 3 | z | 1 | 64 | 1 | 2 | 2 | 1 | 1 | global |
| 2 | thumbtack | 3 | x | 4 | 16 | 4 | 1 | 1 | 1 | 1 | vec |
| 2 | thumbtack | 3 | y | 1 | 32 | 4 | 2 | 1 | 2 | 16 | local |
| 2 | thumbtack | 3 | z | 1 | 32 | 4 | 2 | 1 | 2 | 32 | local |
| 3 | thumbtack | 3 | x | 4 | 16 | 4 | 2 | 1 | 1 | 1 | vec |
| 3 | thumbtack | 3 | y | 1 | 16 | 8 | 2 | 1 | 2 | 32 | local |
| 3 | thumbtack | 3 | z | 1 | 16 | 8 | 2 | 1 | 2 | 32 | local |
| 4 | thumbtack | 3 | x | 4 | 16 | 4 | 4 | 1 | 1 | 1 | vec |
| 4 | thumbtack | 3 | y | 1 | 8 | 8 | 4 | 2 | 1 | 32 | local |
| 4 | thumbtack | 3 | z | 1 | 8 | 8 | 4 | 2 | 1 | 32 | local |
| 5 | thumbtack | 3 | x | 1 | 16 | 8 | 2 | 1 | 2 | 64 | local |
| 5 | thumbtack | 3 | y | 1 | 8 | 16 | 2 | 2 | 1 | 64 | local |
| 5 | thumbtack | 3 | z | 1 | 64 | 2 | 1 | 1 | 2 | 1 | global |
| 0 | dense | 1 | none | 1 | 128 | 1 | 1 | 2 | 1 | 4 | global |
| 1 | dense | 1 | x | 1 | 128 | 2 | 1 | 1 | 1 | 2 | global |
| 1 | dense | 1 | y | 2 | 32 | 8 | 1 | 1 | 1 | 1 | vec |
| 1 | dense | 1 | z | 1 | 64 | 1 | 4 | 1 | 1 | 8 | global |
| 1 | dense | 2 | x | 4 | 16 | 8 | 1 | 1 | 1 | 1 | vec |
| 1 | dense | 2 | y | 1 | 64 | 1 | 1 | 2 | 1 | 1 | global |
| 1 | dense | 2 | z | 1 | 64 | 2 | 1 | 2 | 1 | 1 | global |
| 1 | dense | 3 | none | 1 | 32 | 4 | 2 | 1 | 4 | 16 | local |
| 2 | dense | 1 | x | 1 | 128 | 2 | 1 | 2 | 1 | 2 | global |
| 2 | dense | 1 | y | 2 | 32 | 2 | 4 | 1 | 1 | 2 | vec |
| 2 | dense | 1 | z | 2 | 32 | 1 | 8 | 1 | 2 | 1 | vec |
| 2 | dense | 2 | x | 4 | 8 | 8 | 2 | 1 | 1 | 32 | vec |
| 2 | dense | 2 | y | 1 | 64 | 1 | 4 | 4 | 1 | 8 | local |
| 2 | dense | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 2 | dense | 3 | none | 1 | 32 | 4 | 2 | 1 | 2 | 8 | local |
| 3 | dense | 1 | x | 1 | 64 | 1 | 1 | 4 | 1 | 1 | global |
| 3 | dense | 1 | y | 4 | 16 | 4 | 1 | 1 | 1 | 2 | vec |
| 3 | dense | 1 | z | 4 | 32 | 1 | 8 | 1 | 2 | 1 | vec |
| 3 | dense | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 1 | vec |
| 3 | dense | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 32 | local |
| 3 | dense | 2 | z | 1 | 32 | 8 | 1 | 4 | 2 | 1 | local |
| 3 | dense | 3 | none | 1 | 32 | 4 | 2 | 1 | 1 | 16 | local |
| 4 | dense | 1 | x | 1 | 64 | 2 | 1 | 2 | 1 | 1 | global |
| 4 | dense | 1 | y | 4 | 32 | 2 | 1 | 1 | 1 | 1 | vec |
| 4 | dense | 1 | z | 4 | 16 | 2 | 8 | 1 | 1 | 2 | vec |
| 4 | dense | 2 | x | 4 | 16 | 4 | 4 | 2 | 1 | 4 | vec |
| 4 | dense | 2 | y | 1 | 32 | 1 | 8 | 4 | 1 | 1 | local |
| 4 | dense | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 4 | dense | 3 | none | 1 | 2 | 16 | 8 | 1 | 1 | 16 | local |
| 5 | dense | 1 | x | 1 | 128 | 2 | 1 | 2 | 1 | 1 | global |
| 5 | dense | 1 | y | 4 | 16 | 16 | 1 | 1 | 1 | 4 | vec |
| 5 | dense | 1 | z | 4 | 32 | 1 | 8 | 1 | 1 | 2 | vec |
| 5 | dense | 2 | x | 4 | 16 | 16 | 1 | 1 | 16 | 1 | vec |
| 5 | dense | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 16 | local |
| 5 | dense | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 5 | dense | 3 | none | 1 | 32 | 2 | 4 | 1 | 1 | 64 | local |
| 1 | star | 2 | x | 2 | 64 | 1 | 4 | 1 | 1 | 1 | vec |
| 1 | star | 2 | y | 1 | 128 | 1 | 1 | 2 | 1 | 1 | global |
| 1 | star | 2 | z | 1 | 64 | 1 | 2 | 4 | 1 | 1 | global |
| 1 | star | 3 | none | 1 | 64 | 1 | 1 | 4 | 1 | 1 | global |
| 2 | star | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 2 | vec |
| 2 | star | 2 | y | 1 | 128 | 1 | 2 | 1 | 1 | 4 | global |
| 2 | star | 2 | z | 1 | 64 | 4 | 1 | 2 | 1 | 1 | global |
| 2 | star | 3 | none | 1 | 64 | 1 | 2 | 4 | 1 | 1 | global |
| 3 | star | 2 | x | 4 | 8 | 4 | 8 | 2 | 1 | 2 | vec |
| 3 | star | 2 | y | 1 | 64 | 1 | 4 | 1 | 1 | 16 | global |
| 3 | star | 2 | z | 1 | 32 | 8 | 1 | 4 | 2 | 2 | local |
| 3 | star | 3 | none | 1 | 64 | 1 | 4 | 1 | 2 | 1 | global |
| 4 | star | 2 | x | 4 | 8 | 4 | 8 | 2 | 1 | 1 | vec |
| 4 | star | 2 | y | 1 | 64 | 1 | 4 | 2 | 1 | 1 | global |
| 4 | star | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 4 | star | 3 | none | 1 | 64 | 1 | 4 | 1 | 1 | 8 | global |
| 5 | star | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 2 | vec |
| 5 | star | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 32 | local |
| 5 | star | 2 | z | 1 | 32 | 8 | 1 | 1 | 8 | 1 | local |
| 5 | star | 3 | none | 1 | 64 | 1 | 2 | 1 | 1 | 8 | global |
| 2 | diamond | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 4 | vec |
| 2 | diamond | 2 | y | 1 | 64 | 1 | 4 | 1 | 1 | 4 | global |
| 2 | diamond | 2 | z | 1 | 64 | 2 | 2 | 2 | 1 | 1 | global |
| 2 | diamond | 3 | none | 1 | 32 | 4 | 2 | 1 | 2 | 32 | local |
| 3 | diamond | 2 | x | 4 | 16 | 8 | 2 | 1 | 32 | 1 | vec |
| 3 | diamond | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 32 | local |
| 3 | diamond | 2 | z | 1 | 16 | 16 | 1 | 4 | 2 | 1 | local |
| 3 | diamond | 3 | none | 1 | 64 | 4 | 1 | 1 | 32 | 1 | global |
| 4 | diamond | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 1 | vec |
| 4 | diamond | 2 | y | 1 | 32 | 1 | 8 | 4 | 1 | 16 | local |
| 4 | diamond | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 4 | diamond | 3 | none | 1 | 8 | 8 | 4 | 2 | 1 | 32 | local |
| 5 | diamond | 2 | x | 4 | 16 | 8 | 2 | 1 | 16 | 1 | vec |
| 5 | diamond | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 32 | local |
| 5 | diamond | 2 | z | 1 | 32 | 8 | 1 | 1 | 8 | 1 | local |
| 5 | diamond | 3 | none | 1 | 16 | 4 | 4 | 1 | 1 | 32 | local |
| 1 | no_corners | 3 | none | 1 | 64 | 1 | 1 | 2 | 1 | 1 | global |
| 2 | no_corners | 2 | x | 4 | 16 | 4 | 4 | 1 | 1 | 16 | vec |
| 2 | no_corners | 2 | y | 1 | 64 | 1 | 4 | 4 | 1 | 16 | local |
| 2 | no_corners | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 2 | no_corners | 3 | none | 1 | 32 | 4 | 2 | 1 | 2 | 8 | local |
| 3 | no_corners | 2 | x | 4 | 16 | 4 | 4 | 2 | 1 | 1 | vec |
| 3 | no_corners | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 32 | local |
| 3 | no_corners | 2 | z | 1 | 32 | 8 | 1 | 4 | 2 | 1 | local |
| 3 | no_corners | 3 | none | 1 | 32 | 4 | 2 | 1 | 1 | 16 | local |
| 4 | no_corners | 2 | x | 4 | 16 | 8 | 2 | 1 | 1 | 1 | vec |
| 4 | no_corners | 2 | y | 1 | 32 | 1 | 8 | 4 | 1 | 1 | local |
| 4 | no_corners | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 4 | no_corners | 3 | none | 1 | 2 | 16 | 8 | 1 | 1 | 16 | local |
| 5 | no_corners | 2 | x | 4 | 16 | 16 | 1 | 1 | 16 | 1 | vec |
| 5 | no_corners | 2 | y | 1 | 128 | 1 | 2 | 2 | 1 | 16 | local |
| 5 | no_corners | 2 | z | 1 | 32 | 8 | 1 | 2 | 4 | 1 | local |
| 5 | no_corners | 3 | none | 1 | 32 | 2 | 4 | 1 | 1 | 64 | local |