Research Article

Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

Algorithm 4: Main procedure of generating row blocks.
Input: , , , ;
Output: , ;
; ; ; ;
for to do
    //Compute non-zeros and the total rows
    +=  ;
    ++;
    if (  &&  ) then
        //This row fills up SHARED_SIZE or threads per block
        ; ++; ; ;
    else if then
        //This row is an extra one that is excluded
        ; ++; ; ; −−;
    end
done
//Extra case
if   !=   then
    ;
else
    −−;
end
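
The row-block generation can be illustrated with a minimal Python sketch. It follows only the comments of Algorithm 4 (accumulate non-zeros and rows, close a block when a row fills it up, exclude a row that overflows it, then handle the trailing rows), not its exact statements. All names here (`generate_row_blocks`, `row_ptr`, `shared_size`, `threads_per_block`) are assumptions made for illustration. So is the treatment of a single row with more than `shared_size` non-zeros, which this sketch places in a block of its own.

```python
def generate_row_blocks(row_ptr, shared_size, threads_per_block):
    """Return boundaries such that block b covers rows [blocks[b], blocks[b+1]).

    row_ptr is the CSR row-pointer array (length = number of rows + 1).
    shared_size and threads_per_block are assumed limits on the non-zeros
    and on the rows that one block may hold.
    """
    n_rows = len(row_ptr) - 1
    blocks = [0]        # first block starts at row 0
    nnz_sum = 0         # non-zeros accumulated in the current block
    row_count = 0       # rows accumulated in the current block
    i = 0
    while i < n_rows:
        # Accumulate the non-zeros and the row count.
        nnz_sum += row_ptr[i + 1] - row_ptr[i]
        row_count += 1
        if nnz_sum <= shared_size and (
            nnz_sum == shared_size or row_count == threads_per_block
        ):
            # This row fills up the block: close the block after row i.
            blocks.append(i + 1)
            nnz_sum, row_count = 0, 0
        elif nnz_sum > shared_size:
            if row_count > 1:
                # This row is an extra one: close the block before it
                # and step back so the row starts the next block.
                blocks.append(i)
                i -= 1
            else:
                # Assumption: a single over-long row gets its own block.
                blocks.append(i + 1)
            nnz_sum, row_count = 0, 0
        i += 1
    # Extra case: rows left over after the last closed block.
    if blocks[-1] != n_rows:
        blocks.append(n_rows)
    return blocks


if __name__ == "__main__":
    # Rows with 2, 2, 2, 5, 1, 1 non-zeros.
    print(generate_row_blocks([0, 2, 4, 6, 11, 12, 13], 4, 3))
```

Returning boundaries rather than (start, length) pairs means that a kernel handling block `b` only needs to read two adjacent entries of the array.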