Research Article

Implementing and Evaluating an Heterogeneous, Scalable, Tridiagonal Linear System Solver with OpenCL to Target FPGAs, GPUs, and CPUs

Table 1

Floating-point operation (FLOP) counts and global memory transactions required by the TDMA and truncated SPIKE FPGA kernels.

Operation | TDMA | Truncated SPIKE

ADD/SUB
MUL
DIV
MEM
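
To illustrate how per-kernel operation counts of this kind can be tallied, the following is a minimal sketch of the textbook Thomas algorithm (TDMA) instrumented with counters for the ADD/SUB, MUL, and DIV categories. It is an assumption-laden illustration, not the kernel evaluated in the article: the function name `thomas_with_counts`, the array layout, and the counting convention are hypothetical, the counts depend on how the recurrences are arranged, and global memory transactions (the MEM row) are not modeled. The resulting totals should therefore not be read as the values of Table 1.

```python
def thomas_with_counts(a, b, c, d):
    """Solve a tridiagonal system with the textbook Thomas algorithm (TDMA).

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[n-1] should be 0), d: right-hand side.
    Returns (x, counts), where counts tallies floating-point operations
    under this particular formulation (an illustrative convention only).
    """
    n = len(b)
    counts = {"ADD/SUB": 0, "MUL": 0, "DIV": 0}
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    x = [0.0] * n

    # Forward elimination: first row needs two divisions.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    counts["DIV"] += 2

    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]            # 1 MUL, 1 SUB
        cp[i] = c[i] / denom                       # 1 DIV
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom  # 1 MUL, 1 SUB, 1 DIV
        counts["MUL"] += 2
        counts["ADD/SUB"] += 2
        counts["DIV"] += 2

    # Back substitution: one multiply and one subtract per remaining row.
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]            # 1 MUL, 1 SUB
        counts["MUL"] += 1
        counts["ADD/SUB"] += 1

    return x, counts


if __name__ == "__main__":
    # Small diagonally dominant test system whose solution is all ones.
    a = [0.0, -1.0, -1.0, -1.0]
    b = [2.0, 2.0, 2.0, 2.0]
    c = [-1.0, -1.0, -1.0, 0.0]
    d = [1.0, 0.0, 0.0, 1.0]
    x, counts = thomas_with_counts(a, b, c, d)
    print(x)
    print(counts)
```

Under this formulation a system of n rows costs 3(n - 1) additions or subtractions, 3(n - 1) multiplications, and 2n divisions. A kernel that precomputes reciprocals or skips the final, unneeded division would report different totals.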