Research Article

Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

Algorithm 3: Improved PCSR.
Input: , , , , ;
Output: ;
()    ;
()    ; ;
()    ;
()    ;
()    ;
()    ;
    //Assemble into shared memory
()    for to with   +=   do
()       ;
()    done
() ();
    //Perform a scalar-style reduction from temp_
() ;
() if then
()    ;
()    ;
()    ;
()    for to do
()      +=  ;
()    done
()    ;
() end
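
The symbols in the listing above did not survive extraction, so the exact statements cannot be recovered here. As a rough illustration only, the two surviving comments suggest a two-phase pattern: a strided loop that places element-wise products into shared memory, a barrier, and then a scalar-style (one thread per row) reduction over those products. The following is a minimal CPU emulation of that pattern in Python, not the authors' kernel. Every name in it (ptr, indices, data, x, y, block_dim, shared) is an assumption introduced for illustration, as is the choice to give each emulated block block_dim consecutive rows and a buffer large enough for all of their nonzeros.

```python
def spmv_two_phase(ptr, indices, data, x, block_dim=4):
    """Emulate a two-phase CSR SpMV (y = A @ x) on the CPU.

    Hypothetical sketch: ptr/indices/data are the usual CSR arrays,
    and each emulated thread block owns block_dim consecutive rows.
    """
    n_rows = len(ptr) - 1
    y = [0.0] * n_rows

    for row_start in range(0, n_rows, block_dim):
        row_end = min(row_start + block_dim, n_rows)
        base, end = ptr[row_start], ptr[row_end]

        # Stand-in for the block's shared-memory buffer (assumed to fit).
        shared = [0.0] * (end - base)

        # Phase 1: each emulated thread t walks the block's nonzeros with
        # stride block_dim and stores one product per nonzero.
        for t in range(block_dim):
            for i in range(base + t, end, block_dim):
                shared[i - base] = data[i] * x[indices[i]]

        # A GPU kernel would need a block-wide barrier at this point.

        # Phase 2: scalar-style reduction, one emulated thread per row,
        # summing that row's contiguous slice of the buffer.
        for t in range(block_dim):
            row = row_start + t
            if row < row_end:
                acc = 0.0
                for j in range(ptr[row] - base, ptr[row + 1] - base):
                    acc += shared[j]
                y[row] = acc
    return y


# 3x3 example: A = [[1, 0, 2], [0, 3, 0], [4, 0, 5]], x = [1, 2, 3]
y = spmv_two_phase([0, 2, 3, 5], [0, 2, 1, 0, 2], [1, 2, 3, 4, 5],
                   [1, 2, 3], block_dim=2)
print(y)  # [7.0, 6.0, 19.0]
```

The strided first phase mirrors the access pattern that lets neighbouring GPU threads read neighbouring nonzeros, while the second phase keeps the per-row sum simple at the cost of uneven work when row lengths differ widely.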