Research Article

On the Usage of GPUs for Efficient Motion Estimation in Medical Image Sequences

Figure 1

The overall architecture of a modern Fermi-based GPU device (a) and the inner details of a multiprocessor (b). The multiprocessors are arranged around a shared level-2 cache. Each multiprocessor has its own register file, a number of computational cores, and a level-1 cache. Earlier architectures, such as that of the C1060, lack these two cache levels; their absence is compensated for by explicitly managed memories, namely the shared, constant, and texture memories. Image adapted from the CUDA Programming Guide [2].
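As a rough illustration of two of the explicitly managed memories named in the caption of Figure 1, the following minimal CUDA sketch stages data in shared memory and reads a coefficient from constant memory. It is not taken from the article: the kernel name `scaledBlockSum`, the symbol `d_scale`, the block size, and the input data are all assumptions chosen for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block (assumed; must be a power of two here)

// Constant memory: read-only on the device, written by the host before launch.
__constant__ float d_scale;

// Hypothetical kernel: each block scales its inputs and sums them
// using a tile that lives in on-chip shared memory.
__global__ void scaledBlockSum(const float* in, float* blockSums, int n) {
    __shared__ float tile[BLOCK];  // shared memory, visible to one block only

    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (gid < n) ? d_scale * in[gid] : 0.0f;
    __syncthreads();  // make every thread's write visible to the block

    // Tree reduction inside shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        }
        __syncthreads();
    }

    if (threadIdx.x == 0) {
        blockSums[blockIdx.x] = tile[0];
    }
}

int main() {
    const int n = 1024;
    const int blocks = (n + BLOCK - 1) / BLOCK;

    float h_in[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    // Copy the coefficient into constant memory.
    float h_scale = 2.0f;
    cudaMemcpyToSymbol(d_scale, &h_scale, sizeof(float));

    float *d_in = nullptr, *d_sums = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_sums, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scaledBlockSum<<<blocks, BLOCK>>>(d_in, d_sums, n);

    float h_sums[blocks];
    cudaMemcpy(h_sums, d_sums, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    // Combine the per-block partial sums on the host.
    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) total += h_sums[b];
    printf("total = %.1f\n", total);

    cudaFree(d_in);
    cudaFree(d_sums);
    return 0;
}
```

On devices with the level-1 and level-2 caches shown in the figure, the same code runs unchanged; the shared-memory tile simply remains an explicit, programmer-controlled staging area alongside the hardware-managed caches.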