The Scientific World Journal
Volume 2013 (2013), Article ID 730210, 10 pages
http://dx.doi.org/10.1155/2013/730210
Review Article

Computational and Bioinformatics Frameworks for Next-Generation Whole Exome and Genome Sequencing

Rare Genomics Institute, 4100 Forest Park Avenue, Suite 204, St. Louis, MO 63108, USA

Received 28 October 2012; Accepted 22 November 2012

Academic Editors: R. Jiang, W. Tian, J. Wan, and X. Zhao

Copyright © 2013 Marisa P. Dolled-Filhart et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. E. R. Mardis, “Next-generation DNA sequencing methods,” Annual Review of Genomics and Human Genetics, vol. 9, pp. 387–402, 2008.
  2. H. Li, B. Handsaker, A. Wysoker et al., “The Sequence Alignment/Map format and SAMtools,” Bioinformatics, vol. 25, no. 16, pp. 2078–2079, 2009.
  3. P. Danecek, A. Auton, G. Abecasis et al., “The variant call format and VCFtools,” Bioinformatics, vol. 27, no. 15, Article ID btr330, pp. 2156–2158, 2011.
  4. A. G. Day-Williams and E. Zeggini, “The effect of next-generation sequencing technology on complex trait research,” European Journal of Clinical Investigation, vol. 41, no. 5, pp. 561–567, 2011.
  5. M. Ruffalo, T. LaFramboise, and M. Koyuturk, “Comparative analysis of algorithms for next-generation sequencing read alignment,” Bioinformatics, vol. 27, no. 20, pp. 2790–2796, 2011.
  6. N. Homer and S. F. Nelson, “Improved variant discovery through local re-alignment of short-read next-generation sequencing data using SRMA,” Genome Biology, vol. 11, no. 10, article R99, 2010.
  7. B. Langmead, C. Trapnell, M. Pop, and S. L. Salzberg, “Ultrafast and memory-efficient alignment of short DNA sequences to the human genome,” Genome Biology, vol. 10, no. 3, article R25, 2009.
  8. B. Langmead and S. L. Salzberg, “Fast gapped-read alignment with Bowtie 2,” Nature Methods, vol. 9, no. 4, pp. 357–359, 2012.
  9. H. Li and R. Durbin, “Fast and accurate short read alignment with Burrows-Wheeler transform,” Bioinformatics, vol. 25, no. 14, pp. 1754–1760, 2009.
  10. H. Li, J. Ruan, and R. Durbin, “Mapping short DNA sequencing reads and calling variants using mapping quality scores,” Genome Research, vol. 18, no. 11, pp. 1851–1858, 2008.
  11. H. Li and R. Durbin, “Fast and accurate long-read alignment with Burrows-Wheeler transform,” Bioinformatics, vol. 26, no. 5, Article ID btp698, pp. 589–595, 2010.
  12. C. Alkan, S. Sajjadian, and E. E. Eichler, “Limitations of next-generation genome sequence assembly,” Nature Methods, vol. 8, no. 1, pp. 61–65, 2011.
  13. F. Hach, F. Hormozdiari, C. Alkan et al., “MrsFAST: a cache-oblivious algorithm for short-read mapping,” Nature Methods, vol. 7, no. 8, pp. 576–577, 2010.
  14. I. D. Dinov, F. Torri, F. Macciardi et al., “Applications of the pipeline environment for visual informatics and genomics computations,” BMC Bioinformatics, vol. 12, article 304, 2011.
  15. C. Alkan, J. M. Kidd, T. Marques-Bonet et al., “Personalized copy number and segmental duplication maps using next-generation sequencing,” Nature Genetics, vol. 41, no. 10, pp. 1061–1067, 2009.
  16. S. M. Rumble, P. Lacroute, A. V. Dalca, M. Fiume, A. Sidow, and M. Brudno, “SHRiMP: accurate mapping of short color-space reads,” PLoS Computational Biology, vol. 5, no. 5, Article ID e1000386, 2009.
  17. M. David, M. Dzamba, D. Lister, L. Ilie, and M. Brudno, “SHRiMP2: sensitive yet practical short read mapping,” Bioinformatics, vol. 27, no. 7, Article ID btr046, pp. 1011–1012, 2011.
  18. R. Li, Y. Li, K. Kristiansen, and J. Wang, “SOAP: short oligonucleotide alignment program,” Bioinformatics, vol. 24, no. 5, pp. 713–714, 2008.
  19. R. Li, C. Yu, Y. Li et al., “SOAP2: an improved ultrafast tool for short read alignment,” Bioinformatics, vol. 25, no. 15, pp. 1966–1967, 2009.
  20. C. M. Liu, K. F. Wong, E. M. K. Wu et al., “SOAP3: ultra-fast GPU-based parallel alignment tool for short reads,” Bioinformatics, vol. 28, no. 6, pp. 878–879, 2012.
  21. R. Nielsen, J. S. Paul, A. Albrechtsen, and Y. S. Song, “Genotype and SNP calling from next-generation sequencing data,” Nature Reviews Genetics, vol. 12, no. 6, pp. 443–451, 2011.
  22. A. McKenna, M. Hanna, E. Banks et al., “The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data,” Genome Research, vol. 20, no. 9, pp. 1297–1303, 2010.
  23. Cancer Genome Atlas Network, “Comprehensive molecular portraits of human breast tumours,” Nature, vol. 490, no. 7418, pp. 61–70, 2012.
  24. D. L. Altshuler, R. M. Durbin, G. R. Abecasis et al., “A map of human genome variation from population-scale sequencing,” Nature, vol. 467, no. 7319, pp. 1061–1073, 2010.
  25. D. C. Koboldt, K. Chen, T. Wylie et al., “VarScan: variant detection in massively parallel sequencing of individual and pooled samples,” Bioinformatics, vol. 25, no. 17, pp. 2283–2285, 2009.
  26. D. C. Koboldt, Q. Zhang, D. E. Larson et al., “VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing,” Genome Research, vol. 22, no. 3, pp. 568–576, 2012.
  27. D. Challis, J. Yu, U. S. Evani et al., “An integrative variant analysis suite for whole exome next-generation sequencing data,” BMC Bioinformatics, vol. 13, article 8, 2012.
  28. J. Hillman-Jackson, D. Clements, D. Blankenberg, J. Taylor, and A. Nekrutenko, “Using Galaxy to perform large-scale interactive data analyses,” in Current Protocols in Bioinformatics, chapter 10, unit 10.5, 2012.
  29. H. P. Ji, “Improving bioinformatic pipelines for exome variant calling,” Genome Medicine, vol. 4, no. 1, article 7, 2012.
  30. R. R. Lemos, M. B. Souza, and J. R. Oliveira, “Exploring the implications of INDELs in neuropsychiatric genetics: challenges and perspectives,” Journal of Molecular Neuroscience, vol. 47, no. 3, pp. 419–424, 2012.
  31. S. A. Lee, H. S. Mun, H. Kim et al., “Naturally occurring hepatitis B virus X deletions and insertions among Korean chronic patients,” Journal of Medical Virology, vol. 83, no. 1, pp. 65–70, 2011.
  32. U. Väli, M. Brandström, M. Johansson, and H. Ellegren, “Insertion-deletion polymorphisms (indels) as genetic markers in natural populations,” BMC Genetics, vol. 9, article 8, 2008.
  33. G. Lunter, “Probabilistic whole-genome alignments reveal high indel rates in the human and mouse genomes,” Bioinformatics, vol. 23, no. 13, pp. i289–i296, 2007.
  34. P. Krawitz, C. Rödelsperger, M. Jäger, L. Jostins, S. Bauer, and P. N. Robinson, “Microindel detection in short-read sequence data,” Bioinformatics, vol. 26, no. 6, Article ID btq027, pp. 722–729, 2010.
  35. K. Ye, M. H. Schulz, Q. Long, R. Apweiler, and Z. Ning, “Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads,” Bioinformatics, vol. 25, no. 21, pp. 2865–2871, 2009.
  36. D. G. MacArthur, S. Balasubramanian, A. Frankish et al., “A systematic survey of loss-of-function variants in human protein-coding genes,” Science, vol. 335, no. 6070, pp. 823–828, 2012.
  37. C. A. Albers, G. Lunter, D. G. MacArthur, G. McVean, W. H. Ouwehand, and R. Durbin, “Dindel: accurate indel calls from short-read data,” Genome Research, vol. 21, no. 6, pp. 961–973, 2011.
  38. J. A. Neuman, O. Isakov, and N. Shomron, “Analysis of insertion-deletion from deep-sequencing data: software evaluation for optimal detection,” Briefings in Bioinformatics. In press.
  39. G. M. Cooper and J. Shendure, “Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data,” Nature Reviews Genetics, vol. 12, no. 9, pp. 628–640, 2011.
  40. E. S. Lander, “Initial impact of the sequencing of the human genome,” Nature, vol. 470, no. 7333, pp. 187–197, 2011.
  41. L. A. Hindorff, P. Sethupathy, H. A. Junkins et al., “Potential etiologic and functional implications of genome-wide association loci for human diseases and traits,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106, no. 23, pp. 9362–9367, 2009.
  42. P. Kumar, S. Henikoff, and P. C. Ng, “Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm,” Nature Protocols, vol. 4, no. 7, pp. 1073–1081, 2009.
  43. I. A. Adzhubei, S. Schmidt, L. Peshkin et al., “A method and server for predicting damaging missense mutations,” Nature Methods, vol. 7, no. 4, pp. 248–249, 2010.
  44. P. S. Nair and M. Vihinen, “VariBench: a benchmark database for variations,” Human Mutation. In press.
  45. P. Cingolani, A. Platts, L. L. Wang et al., “A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118, iso-2, iso-3,” Fly, vol. 6, no. 2, pp. 80–92, 2012.
  46. G. De Baets, J. Van Durme, J. Reumers et al., “SNPeffect 4.0: on-line prediction of molecular and structural effects of protein-coding variants,” Nucleic Acids Research, vol. 40, pp. D935–D939, 2012.
  47. K. Wang, M. Li, and H. Hakonarson, “ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data,” Nucleic Acids Research, vol. 38, no. 16, Article ID gkq603, p. e164, 2010.
  48. M. Yandell, C. D. Huff, H. Hu et al., “A probabilistic disease-gene finder for personal genomes,” Genome Research, vol. 21, no. 9, pp. 1529–1542, 2011.
  49. L. Habegger, S. Balasubramanian, D. Z. Chen et al., “VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment,” Bioinformatics, vol. 28, no. 17, pp. 2267–2269, 2012.
  50. D. G. MacArthur, S. Balasubramanian, A. Frankish et al., “A systematic survey of loss-of-function variants in human protein-coding genes,” Science, vol. 335, no. 6070, pp. 823–828, 2012.
  51. M. Sincan, D. R. Simeonov, D. Adams et al., “VAR-MD: a tool to analyze whole exome-genome variants in small human pedigrees with mendelian inheritance,” Human Mutation, vol. 33, no. 4, pp. 593–598, 2012.