ISRN Bioinformatics
Volume 2013 (2013), Article ID 481545, 8 pages
http://dx.doi.org/10.1155/2013/481545
Research Article

Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

1. Systems Pharmacology and Biomarkers, Janssen Research & Development, LLC, 3210 Merryfield Row, San Diego, CA 92121, USA
2. High Performance & Scientific Computing, Janssen Research & Development, LLC, 920 Route 202, Raritan, NJ 08869, USA
3. Translational Informatics IT, Janssen Research & Development, LLC, 3210 Merryfield Row, San Diego, CA 92121, USA

Received 8 July 2013; Accepted 7 August 2013

Academic Editors: N. Lemke, K. Mizuguchi, O. Norberto de Souza, and J. T. L. Wang

Copyright © 2013 Shanrong Zhao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. Z. Wang, M. Gerstein, and M. Snyder, “RNA-Seq: a revolutionary tool for transcriptomics,” Nature Reviews Genetics, vol. 10, no. 1, pp. 57–63, 2009.
  2. S. Marguerat and J. Bähler, “RNA-seq: from technology to biology,” Cellular and Molecular Life Sciences, vol. 67, no. 4, pp. 569–579, 2010.
  3. K. O. Mutz, A. Heilkenbrinker, M. Lönne, J. G. Walter, and F. Stahl, “Transcriptome analysis using next-generation sequencing,” Current Opinion in Biotechnology, vol. 24, no. 1, pp. 22–30, 2013.
  4. P. J. Hurd and C. J. Nelson, “Advantages of next-generation sequencing versus the microarray in epigenetic research,” Briefings in Functional Genomics and Proteomics, vol. 8, no. 3, pp. 174–183, 2009.
  5. J. H. Malone and B. Oliver, “Microarrays, deep sequencing and the true measure of the transcriptome,” BMC Biology, vol. 9, article 34, 2011.
  6. L. M. McIntyre, K. K. Lopiano, A. M. Morse et al., “RNA-seq: technical variability and sampling,” BMC Genomics, vol. 12, article 293, 2011.
  7. J. C. Marioni, C. E. Mason, S. M. Mane, M. Stephens, and Y. Gilad, “RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays,” Genome Research, vol. 18, no. 9, pp. 1509–1517, 2008.
  8. I. Nookaew, M. Papini, N. Pornputtapong et al., “A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae,” Nucleic Acids Research, vol. 40, no. 20, pp. 10084–10097, 2012.
  9. M. A. Stalteri and A. P. Harrison, “Interpretation of multiple probe sets mapping to the same gene in Affymetrix GeneChips,” BMC Bioinformatics, vol. 8, article 13, 2007.
  10. A. Oshlack, M. D. Robinson, and M. D. Young, “From RNA-seq reads to differential expression results,” Genome Biology, vol. 11, no. 12, article 220, 2010.
  11. J. H. Bullard, E. Purdom, K. D. Hansen, and S. Dudoit, “Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments,” BMC Bioinformatics, vol. 11, article 94, 2010.
  12. S. Tarazona, F. García-Alcalde, J. Dopazo, A. Ferrer, and A. Conesa, “Differential expression in RNA-seq: a matter of depth,” Genome Research, vol. 21, no. 12, pp. 2213–2223, 2011.
  13. J. Lee, Y. Ji, S. Liang, G. Cai, and P. Müller, “On differential gene expression using RNA-Seq data,” Cancer Informatics, vol. 10, pp. 205–215, 2011.
  14. M. Baker, “Next-generation sequencing: adjusting to data overload,” Nature Methods, vol. 7, no. 7, pp. 495–499, 2010.
  15. M. C. Schatz, B. Langmead, and S. L. Salzberg, “Cloud computing and the DNA data race,” Nature Biotechnology, vol. 28, no. 7, pp. 691–693, 2010.
  16. U. S. Evani, D. Challis, J. Yu et al., “Atlas2 cloud: a framework for personal genome analysis in the cloud,” BMC Genomics, vol. 13, supplement 6, article S19, 2012.
  17. M. Garber, M. G. Grabherr, M. Guttman, and C. Trapnell, “Computational methods for transcriptome annotation and quantification using RNA-seq,” Nature Methods, vol. 8, no. 6, pp. 469–477, 2011.
  18. J. Chen, F. Qian, W. Yan, and B. Shen, “Translational biomedical informatics in the cloud: present and future,” BioMed Research International, vol. 2013, Article ID 658925, 8 pages, 2013.
  19. L. D. Stein, “The case for cloud computing in genome informatics,” Genome Biology, vol. 11, no. 5, article 207, 2010.
  20. A. Rosenthal, P. Mork, M. H. Li, J. Stanford, D. Koester, and P. Reynolds, “Cloud computing: a new business paradigm for biomedical information sharing,” Journal of Biomedical Informatics, vol. 43, no. 2, pp. 342–353, 2010.
  21. D. P. Wall, P. Kudtarkar, V. A. Fusaro, R. Pivovarov, P. Patil, and P. J. Tonellato, “Cloud computing for comparative genomics,” BMC Bioinformatics, vol. 11, article 259, 2010.
  22. R. S. Thakur, R. Bandopadhyay, B. Chaudhary, and S. Chatterjee, “Now and next-generation sequencing techniques: future of sequence analysis using cloud computing,” Frontiers in Genetics, vol. 3, article 280, 2012.
  23. M. C. Schatz, “CloudBurst: highly sensitive read mapping with MapReduce,” Bioinformatics, vol. 25, no. 11, pp. 1363–1369, 2009.
  24. B. Langmead, M. C. Schatz, J. Lin, M. Pop, and S. L. Salzberg, “Searching for SNPs with cloud computing,” Genome Biology, vol. 10, no. 11, article R134, 2009.
  25. B. Langmead, K. D. Hansen, and J. T. Leek, “Cloud-scale RNA-sequencing differential expression analysis with Myrna,” Genome Biology, vol. 11, no. 8, article R83, 2010.
  26. S. V. Angiuoli, J. R. White, M. Matalka, O. White, and W. F. Fricke, “Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing,” PLoS ONE, vol. 6, no. 10, Article ID e26624, 2011.
  27. T. Nguyen, W. Shi, and D. Ruden, “CloudAligner: a fast and full-featured MapReduce based tool for sequence mapping,” BMC Research Notes, vol. 4, article 171, 2011.
  28. X. Feng, R. Grossman, and L. Stein, “PeakRanger: a cloud-enabled peak caller for ChIP-seq data,” BMC Bioinformatics, vol. 12, article 139, 2011.
  29. S. Anders and W. Huber, “Differential expression analysis for sequence count data,” Genome Biology, vol. 11, no. 10, article R106, 2010.
  30. M. D. Robinson, D. J. McCarthy, and G. K. Smyth, “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data,” Bioinformatics, vol. 26, no. 1, pp. 139–140, 2010.
  31. T. J. Hardcastle and K. A. Kelly, “baySeq: empirical Bayesian methods for identifying differential expression in sequence count data,” BMC Bioinformatics, vol. 11, article 422, 2010.
  32. S. Zhao, K. Prenger, L. Smith et al., “Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing,” BMC Genomics, vol. 14, article 425, 2013.
  33. Amazon Simple Storage Service (Amazon S3), http://aws.amazon.com/s3/.
  34. Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/.
  35. J. Hu, H. Ge, M. Newman, and K. Liu, “OSA: a fast and accurate alignment tool for RNA-Seq,” Bioinformatics, vol. 28, no. 14, pp. 1933–1934, 2012.
  36. C. Trapnell, B. A. Williams, G. Pertea et al., “Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation,” Nature Biotechnology, vol. 28, no. 5, pp. 511–515, 2010.
  37. “OSA website,” http://www.omicsoft.com/osa/#Supplementary.
  38. “Mono-2.10.8,” http://www.mono-project.com/Release_Notes_Mono_2.10.8.
  39. “Omicsoft reference library,” http://www.omicsoft.com/downloads/dreflib/.
  40. “Command line S3 client,” http://s3tools.org/s3cmd.
  41. “Cloud-init,” https://help.ubuntu.com/community/CloudInit.
  42. B. Li and C. N. Dewey, “RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome,” BMC Bioinformatics, vol. 12, article 323, 2011.
  43. “Myrna website,” http://bowtie-bio.sourceforge.net/myrna/index.shtml.
  44. B. Langmead, C. Trapnell, M. Pop, and S. L. Salzberg, “Ultrafast and memory-efficient alignment of short DNA sequences to the human genome,” Genome Biology, vol. 10, no. 3, article R25, 2009.
  45. “Stormbow website,” http://s3.amazonaws.com/jnj_stormbow/index.html.