On the Usage of GPUs for Efficient Motion Estimation in Medical Image Sequences

<table>The overall architecture of a modern Fermi-based GPU device (a) and the inner details of a multiprocessor (b). Multiprocessors are organized around a shared level-2 cache and register files. Each multiprocessor has a number of computational cores and a level-1 cache. In earlier-generation devices such as the C1060, these two cache levels do not exist, and their absence is compensated for by explicitly managed memories, which include shared, constant, and texture memories. Image adapted from the CUDA Programming Guide [<a href="/journals/ijbi/2011/137604/#B2">2</a>].</table>

International Journal of Biomedical Imaging



Figure 1: On the Usage of GPUs for Efficient Motion Estimation in Medical Image Sequences