Research Article

Parallel Solutions for Voxel-Based Simulations of Reaction-Diffusion Systems

Table 4

Evaluation of the execution times on the three different CUDA devices available on the second cluster nodes for the 4096-compartment system. The number of blocks considered is a multiple of the number of SMs available on each device. Speedup values are computed from the TimePerStep values, relative to the CPU run.

Device      Blocks-threads   TransferTime (ms)   TimePerStep (ms)   Speedup
CPU         -                88.9                29.9               -
GTX-580     32-32            177.0               4.9                6.1
GTX-580     64-32            205.0               2.4                12.3
K20         13-128           198.6               2.9                10.2
K20         26-128           182.7               1.5                20.5
K20         52-128           189.9               0.8                37.8
K20         78-128           208.6               0.9                32.8
GTX-Titan   14-128           137.4               2.2                13.5
GTX-Titan   28-128           136.1               1.1                27.6
GTX-Titan   56-128           139.2               0.5                57.4
GTX-Titan   84-128           146.5               0.6                48.1
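
The speedup column can be checked with a short script. The following is a minimal sketch, not the authors' code: it divides the CPU TimePerStep by each GPU TimePerStep, using the rounded values transcribed from Table 4, so the recomputed figures may differ by a few percent from the published ones (which were presumably derived from unrounded timings). The SM counts per device (16, 13, 14) are an assumption taken from the public hardware specifications, not from the table, and the names `ROWS`, `SM_COUNT`, `summarize`, and `best_per_device` are hypothetical.

```python
# Times are in milliseconds and are the rounded values printed in Table 4.
CPU_TIME_PER_STEP_MS = 29.9

# (device, blocks, threads per block, TimePerStep in ms, published speedup)
ROWS = [
    ("GTX-580", 32, 32, 4.9, 6.1),
    ("GTX-580", 64, 32, 2.4, 12.3),
    ("K20", 13, 128, 2.9, 10.2),
    ("K20", 26, 128, 1.5, 20.5),
    ("K20", 52, 128, 0.8, 37.8),
    ("K20", 78, 128, 0.9, 32.8),
    ("GTX-Titan", 14, 128, 2.2, 13.5),
    ("GTX-Titan", 28, 128, 1.1, 27.6),
    ("GTX-Titan", 56, 128, 0.5, 57.4),
    ("GTX-Titan", 84, 128, 0.6, 48.1),
]

# Assumption: SM counts from public hardware specifications, not from Table 4.
SM_COUNT = {"GTX-580": 16, "K20": 13, "GTX-Titan": 14}


def speedup(cpu_ms, gpu_ms):
    """Speedup of a GPU run over the CPU run, from per-step times."""
    return cpu_ms / gpu_ms


def summarize(rows=ROWS):
    """Recompute the speedup and the blocks-per-SM ratio for every row."""
    summary = []
    for device, blocks, threads, t_ms, published in rows:
        summary.append({
            "device": device,
            "blocks": blocks,
            "threads": threads,
            "blocks_per_sm": blocks / SM_COUNT[device],
            "recomputed": round(speedup(CPU_TIME_PER_STEP_MS, t_ms), 1),
            "published": published,
        })
    return summary


def best_per_device(rows=ROWS):
    """Return the (blocks, threads, TimePerStep) with the lowest time per device."""
    best = {}
    for device, blocks, threads, t_ms, _ in rows:
        if device not in best or t_ms < best[device][2]:
            best[device] = (blocks, threads, t_ms)
    return best


if __name__ == "__main__":
    for row in summarize():
        print(row)
    print(best_per_device())
```

In the measurements above, the lowest TimePerStep on the K20 and the GTX-Titan occurs at four blocks per SM (52 and 56 blocks, respectively); moving to six blocks per SM slightly increases the time per step on both devices.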