Review Article

High Performance Biological Pairwise Sequence Alignment: FPGA versus GPU versus Cell BE versus GPP

Figure 9

Our parallel-thread GPU implementation of the Smith-Waterman algorithm: store and load operations are performed by the final thread and the first thread of each thread batch (block), so that sequences of any length can be processed.
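
The boundary hand-off described in the caption can be illustrated with a small CPU-side simulation. This is a minimal sketch under stated assumptions, not the kernel shown in the figure: it assumes each thread owns one query residue (one row of the score matrix), that the first thread of a block loads the row computed just above the block and the final thread stores its own row for the next block, and it uses a linear gap penalty with illustrative scores. The names `sw_score_blocked`, `sw_score_reference`, `threads_per_block`, and `boundary` are hypothetical.

```python
def sw_score_reference(query, subject, match=2, mismatch=-1, gap=1):
    """Plain full-matrix Smith-Waterman score (linear gap penalty)."""
    m, n = len(query), len(subject)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if query[i - 1] == subject[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap, H[i][j - 1] - gap)
            best = max(best, H[i][j])
    return best


def sw_score_blocked(query, subject, threads_per_block=4,
                     match=2, mismatch=-1, gap=1):
    """Same score, computed in strips of `threads_per_block` query rows.

    Assumption: `boundary` stands in for a buffer in GPU global memory.
    Threads are simulated sequentially here; a real kernel would run them
    as an anti-diagonal wavefront.
    """
    n = len(subject)
    boundary = [0] * (n + 1)  # row above the first block is all zeros
    best = 0
    for start in range(0, len(query), threads_per_block):
        block = query[start:start + threads_per_block]
        upper = boundary[:]   # first thread LOADS the row above the block
        for residue in block:  # one simulated thread per query residue
            row = [0] * (n + 1)
            for j in range(1, n + 1):
                s = match if residue == subject[j - 1] else mismatch
                row[j] = max(0, upper[j - 1] + s,
                             upper[j] - gap, row[j - 1] - gap)
            best = max(best, max(row))
            upper = row       # handed to the next thread inside the block
        boundary = upper      # final thread STORES its row for the next block
    return best
```

Because only one boundary row is carried between blocks, the query length is not limited by the number of threads in a block; for any `threads_per_block` of at least 1, `sw_score_blocked` should return the same score as `sw_score_reference`.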