Research Article
High Performance Implementation of 3D Convolutional Neural Networks on a GPU
Figure 2
The tile size for the input matrix is 4 here. After the reshape kernel is applied, many small submatrices with new layouts are generated.
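The tiling described in the caption can be illustrated with a minimal CPU-side sketch. It is not the article's GPU reshape kernel: the function name `reshape_into_tiles`, the 8 x 8 example input, and the choice of non-overlapping tiles stored in row-major tile order are all assumptions made for illustration, since the caption does not specify the matrix size or the resulting layout.

```python
def reshape_into_tiles(matrix, tile=4):
    """Split a 2D list into non-overlapping tile x tile submatrices.

    Hypothetical helper: returns the tiles in row-major tile order,
    each tile being a list of `tile` rows. Assumes both matrix
    dimensions are exact multiples of `tile`.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    if rows % tile or cols % tile:
        raise ValueError("matrix dimensions must be multiples of the tile size")

    tiles = []
    for r0 in range(0, rows, tile):        # top row of each tile
        for c0 in range(0, cols, tile):    # left column of each tile
            # Copy the tile so that its elements are stored contiguously.
            tiles.append([row[c0:c0 + tile] for row in matrix[r0:r0 + tile]])
    return tiles


# Assumed 8 x 8 input whose entries encode their own position (r * 8 + c).
matrix = [[r * 8 + c for c in range(8)] for r in range(8)]
tiles = reshape_into_tiles(matrix, tile=4)
```

With a tile size of 4, the assumed 8 x 8 input yields four 4 x 4 submatrices. On a GPU, each thread block would typically handle one or more tiles, but that mapping is outside what this sketch shows.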