Research Article
Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
Algorithm 4
Main procedure of generating row blocks.
Input: , , , ;
Output: , ;
; ; ; ;
for to do
    // Compute non-zeros and the total rows
    += ;
    ++;
    if || ( && ) then
        // This row fills up SHARED_SIZE or threads per block
        ; ++; ; ;
    else if then
        // This row is an extra one that is excluded
        ; ++; ; ; --;
    end
done
// Extra case
if != then
    ;
else
    --;
end
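The row-block generation described by Algorithm 4 can be illustrated with a minimal host-side sketch. The symbols of the listing are not legible here, so every identifier below (`row_ptr`, `shared_size`, `threads_per_block`, `blocks`, `nnz_sum`, `row_count`) is a hypothetical name chosen for illustration, and the exact branch conditions are an assumption inferred from the comments in the listing: a block is closed when the accumulated non-zeros fill the shared-memory budget or the row count reaches the threads per block, and a row that would overflow the budget is excluded and reprocessed as the first row of the next block. The handling of a single row longer than the budget is also an assumption added so the loop always terminates.

```python
def generate_row_blocks(row_ptr, shared_size, threads_per_block):
    """Sketch of row-block generation for a CSR matrix.

    Returns a list of boundaries; block k covers rows
    [blocks[k], blocks[k + 1]). All names are hypothetical.
    """
    num_rows = len(row_ptr) - 1
    blocks = [0]      # first block starts at row 0
    nnz_sum = 0       # non-zeros accumulated in the current block
    row_count = 0     # rows accumulated in the current block
    i = 0
    while i < num_rows:
        # Compute non-zeros and the total rows of the current block.
        nnz_sum += row_ptr[i + 1] - row_ptr[i]
        row_count += 1
        if nnz_sum == shared_size or (
            row_count == threads_per_block and nnz_sum <= shared_size
        ):
            # This row fills up the budget or the threads per block:
            # close the block after row i.
            blocks.append(i + 1)
            nnz_sum, row_count = 0, 0
        elif nnz_sum > shared_size:
            if row_count > 1:
                # This row is an extra one: close the block before it
                # and revisit row i (the "--" on the loop index).
                blocks.append(i)
                nnz_sum, row_count = 0, 0
                continue
            # Assumption: a single over-long row gets its own block.
            blocks.append(i + 1)
            nnz_sum, row_count = 0, 0
        i += 1
    # Extra case: rows left over after the loop form the last block.
    if blocks[-1] != num_rows:
        blocks.append(num_rows)
    return blocks


# Rows hold 2, 2, 5, 1, 0, 3 non-zeros; budget 6, at most 4 rows per block.
print(generate_row_blocks([0, 2, 4, 9, 10, 10, 13], 6, 4))  # [0, 2, 4, 6]
```

Because the sketch stores boundaries in a growable list, the final decrement of the block counter in the listing's else branch has no counterpart here.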