Research Article

DeepVariant-on-Spark: Small-Scale Genome Analysis Using a Cloud-Based Computing Framework

Figure 2

Wall-clock time and speedup of DeepVariant and DeepVariant-on-Spark with different combinations of CPUs and GPUs. (a) DeepVariant on the pure CPU machine. (b) DeepVariant on the CPU/GPU hybrid machine. (c) DeepVariant-on-Spark on the pure CPU cluster. (d) DeepVariant-on-Spark on the CPU/GPU hybrid cluster. AdamTransform, SelectBAM, Make_Examples, Call_Variants, Postprocess_Variants, and Merge VCF are the individual steps of DeepVariant or DeepVariant-on-Spark. Speedup is the number of times each configuration is faster than DeepVariant running on 16 CPUs. The speed improvement of DeepVariant-on-Spark over DeepVariant is provided above. With 128 CPUs and 8 GPUs, DeepVariant-on-Spark improved the wall-clock time by 11.58x compared with DeepVariant using 16 CPUs.
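The speedup in the caption is a ratio of wall-clock times: the 16-CPU DeepVariant baseline divided by the time of the configuration being compared. The sketch below is a minimal illustration of that arithmetic. The function name `speedup` and the sample timings are hypothetical and are not measurements from the article.

```python
def speedup(baseline_seconds: float, condition_seconds: float) -> float:
    """Return how many times faster a configuration is than the baseline.

    baseline_seconds: wall-clock time of the reference run
                      (in the caption, DeepVariant on 16 CPUs).
    condition_seconds: wall-clock time of the configuration being compared.
    """
    if baseline_seconds <= 0 or condition_seconds <= 0:
        raise ValueError("wall-clock times must be positive")
    return baseline_seconds / condition_seconds


# Illustrative values only; these are NOT measurements from the article.
example = speedup(baseline_seconds=3600.0, condition_seconds=900.0)
print(f"{example:.2f}x")
```

A value above 1 means the configuration finished sooner than the baseline. By this definition, the reported 11.58x corresponds to the 128-CPU, 8-GPU run taking roughly one twelfth of the baseline wall-clock time.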