Research Article
Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
Algorithm 7
IPCSR with an adaptive number of threads per block.
Input: , , , , , ;
Output: ;
_;
_;
; ;
; ;
; ;
//Assemble into shared memory
for to with += do
    ;
done
();
;
if then
    //Omitted: Perform a scalar-style reduction
else
    //Omitted: Perform a multiple scalar-style reduction
end
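The two phases named in Algorithm 7 (assembling into shared memory, then a scalar-style reduction) can be illustrated with a minimal CPU-side sketch in Python. This is not the GPU kernel and not the authors' implementation. It assumes a standard CSR layout and assumes that the assembled values are the element-wise products of the nonzeros with the matching entries of the input vector. The function name `csr_spmv_two_phase`, its parameter names, and the `tile` size (rows handled per emulated thread block) are hypothetical. The branch between the single and the multiple scalar-style reduction is not reproduced.

```python
def csr_spmv_two_phase(row_ptr, col_idx, values, x, tile=4):
    """Compute y = A @ x for a CSR matrix, one tile of rows at a time.

    Each tile emulates one thread block (an assumption for illustration).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for first_row in range(0, n_rows, tile):
        last_row = min(first_row + tile, n_rows)
        start, end = row_ptr[first_row], row_ptr[last_row]

        # Phase 1: assemble the products into a buffer that stands in
        # for shared memory (assumed content: values[k] * x[col_idx[k]]).
        shared = [values[k] * x[col_idx[k]] for k in range(start, end)]

        # On a GPU, a block-wide barrier would separate the two phases.

        # Phase 2: scalar-style reduction, one emulated thread per row.
        for row in range(first_row, last_row):
            lo = row_ptr[row] - start
            hi = row_ptr[row + 1] - start
            y[row] = sum(shared[lo:hi], 0.0)
    return y


# Example matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]
result = csr_spmv_two_phase(row_ptr, col_idx, values, x, tile=2)
```

Splitting the work this way keeps the product phase fully parallel over the nonzeros, while the reduction phase only reads from the buffer, which is the property that makes a shared-memory staging area attractive on a GPU.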