Sequencing
Volume 2010, Article ID 782465, 12 pages
http://dx.doi.org/10.1155/2010/782465
Research Article

Identification and Quantification of Genomic Repeats and Sample Contamination in Assemblies of 454 Pyrosequencing Reads

Department of Biology, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, P.O. Box 1066 Blindern, 0316 Oslo, Norway

Received 26 May 2009; Revised 28 September 2009; Accepted 5 November 2009

Academic Editor: Nick Loman

Copyright © 2010 Alexander J. Nederbragt et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. T. Wicker, E. Schlagenhauf, A. Graner, T. J. Close, B. Keller, and N. Stein, “454 sequencing put to the test using the complex genome of barley,” BMC Genomics, vol. 7, article 275, 2006.
  2. M. Pop and S. L. Salzberg, “Bioinformatics challenges of new sequencing technology,” Trends in Genetics, vol. 24, no. 3, pp. 142–149, 2008.
  3. Phrap, http://www.phrap.org/.
  4. T. B. Rounge, T. Rohrlack, A. J. Nederbragt, T. Kristensen, and K. S. Jakobsen, “A genome-wide analysis of nonribosomal peptide synthetase gene clusters and their peptides in a Planktothrix rubescens strain,” BMC Genomics, vol. 10, no. 1, article 396, 2009.
  5. 454 Case Study: Genome Coverage of Neurospora crassa, http://www.454.com/downloads/454_CASE_STUDY_genome_coverage.pdf.
  6. K. Swaminathan, K. Varala, and M. E. Hudson, “Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey,” BMC Genomics, vol. 8, article 132, 2007.
  7. D. A. Wheeler, M. Srinivasan, M. Egholm et al., “The complete genome of an individual by massively parallel DNA sequencing,” Nature, vol. 452, no. 7189, pp. 872–876, 2008.
  8. J.-M. Aury, C. Cruaud, V. Barbe et al., “High quality draft sequences for prokaryotic genomes using a mix of new sequencing technologies,” BMC Genomics, vol. 9, article 603, 2008.
  9. M. Riley, T. Abe, M. B. Arnaud et al., “Escherichia coli K-12: a cooperatively developed annotation snapshot—2005,” Nucleic Acids Research, vol. 34, no. 1, pp. 1–9, 2006.
  10. The NCBI Short Read Archive, http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi.
  11. K. E. Nelson, R. D. Fleischmann, R. T. DeBoy et al., “Complete genome sequence of the oral pathogenic bacterium Porphyromonas gingivalis strain W83,” Journal of Bacteriology, vol. 185, no. 18, pp. 5591–5601, 2003.
  12. A. Stüken, A. J. Nederbragt, and K. S. Jakobsen, “Cylindrospermopsin biosynthesis cluster in Aphanizomenon flos-aquae,” submitted for publication.
  13. M. Margulies, M. Egholm, W. E. Altman et al., “Genome sequencing in microfabricated high-density picolitre reactors,” Nature, vol. 437, no. 7057, pp. 376–380, 2005.
  14. R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2008.
  15. Primer3, http://primer3.sourceforge.net/.
  16. S. F. Altschul, T. L. Madden, A. A. Schäffer et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Research, vol. 25, no. 17, pp. 3389–3402, 1997.
  17. D. H. Huson, A. F. Auch, J. Qi, and S. C. Schuster, “MEGAN analysis of metagenomic data,” Genome Research, vol. 17, no. 3, pp. 377–386, 2007.
  18. F. R. Blattner, G. Plunkett III, C. A. Bloch et al., “The complete genome sequence of Escherichia coli K-12,” Science, vol. 277, no. 5331, pp. 1453–1462, 1997.
  19. S. L. Chissoe, M. A. Marra, L. Hillier, R. Brinkman, R. K. Wilson, and R. H. Waterston, “Representation of cloned genomic sequences in two sequencing vectors: correlation of DNA sequence and subclone distribution,” Nucleic Acids Research, vol. 25, no. 15, pp. 2960–2966, 1997.
  20. J. A. Bailey, Z. Gu, R. A. Clark et al., “Recent segmental duplications in the human genome,” Science, vol. 297, no. 5583, pp. 1003–1007, 2002.
  21. P. Siguier, J. Filee, and M. Chandler, “Insertion sequences in prokaryotic genomes,” Current Opinion in Microbiology, vol. 9, no. 5, pp. 526–531, 2006.
  22. Z. M.-P. Lee, C. Bussema III, and T. M. Schmidt, “rrnDB: documenting the number of rRNA and tRNA genes in bacteria and archaea,” Nucleic Acids Research, vol. 37, supplement 1, pp. D489–D493, 2009.
  23. E. S. Lander and M. S. Waterman, “Genomic mapping by fingerprinting random clones: a mathematical analysis,” Genomics, vol. 2, no. 3, pp. 231–239, 1988.
  24. A. Tooming-Klunderud, T. Rohrlack, K. Shalchian-Tabrizi, T. Kristensen, and K. S. Jakobsen, “Structural analysis of a non-ribosomal halogenated cyclic peptide and its putative operon from Microcystis: implications for evolution of cyanopeptolins,” Microbiology, vol. 153, no. 5, pp. 1382–1393, 2007.
  25. B. Mikalsen, G. Boison, O. M. Skulberg et al., “Natural variation in the microcystin synthetase operon mcyABC and impact on microcystin production in Microcystis strains,” Journal of Bacteriology, vol. 185, no. 9, pp. 2774–2785, 2003.
  26. T. Rohrlack, B. Edvardsen, R. Skulberg et al., “Oligopeptide chemotypes of the toxic freshwater cyanobacterium Planktothrix can form subpopulations with dissimilar ecological traits,” Limnology and Oceanography, vol. 53, no. 4, pp. 1279–1293, 2008.
  27. T. B. Rounge, T. Rohrlack, B. Decenciere, B. Edvardsen, T. Kristensen, and K. S. Jakobsen, “Subpopulation differentiation associated with nonribosomal peptide synthetase gene cluster dynamics in the cyanobacterium Planktothrix,” Journal of Phycology, in press.
  28. Y. Tanabe, K. Kaya, and M. M. Watanabe, “Evidence for recombination in the microcystin synthetase (mcy) genes of toxic cyanobacteria Microcystis spp.,” Journal of Molecular Evolution, vol. 58, no. 6, pp. 633–641, 2004.
  29. O. Zhaxybayeva, J. P. Gogarten, R. L. Charlebois, W. F. Doolittle, and R. T. Papke, “Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events,” Genome Research, vol. 16, no. 9, pp. 1099–1108, 2006.
  30. K. Rudi, O. M. Skulberg, and K. S. Jakobsen, “Evolution of cyanobacteria by exchange of genetic material among phyletically related strains,” Journal of Bacteriology, vol. 180, no. 13, pp. 3453–3461, 1998.
  31. J. C. Dohm, C. Lottaz, T. Borodina, and H. Himmelbauer, “Substantial biases in ultra-short read data sets from high-throughput DNA sequencing,” Nucleic Acids Research, vol. 36, no. 16, p. e105, 2008.
  32. D. Y. Chiang, G. Getz, D. B. Jaffe et al., “High-resolution mapping of copy-number alterations with massively parallel sequencing,” Nature Methods, vol. 6, no. 1, pp. 99–103, 2009.
  33. E. Arner, E. Kindlund, D. Nilsson et al., “Database of Trypanosoma cruzi repeated genes: 20 000 additional gene variants,” BMC Genomics, vol. 8, article 391, 2007.
  34. C. Alkan, J. M. Kidd, T. Marques-Bonet et al., “Personalized copy number and segmental duplication maps using next-generation sequencing,” Nature Genetics, vol. 41, no. 10, pp. 1061–1067, 2009.
  35. S. Yoon, Z. Xuan, V. Makarov, K. Ye, and J. Sebat, “Sensitive and accurate detection of copy number variants using read depth of coverage,” Genome Research, vol. 19, no. 9, pp. 1586–1592, 2009.
  36. B. Daines, H. Wang, Y. Li, Y. Han, R. Gibbs, and R. Chen, “High-throughput multiplex sequencing to discover copy number variants in Drosophila,” Genetics, vol. 182, no. 4, pp. 935–941, 2009.
  37. C. Xie and M. T. Tammi, “CNV-seq, a new method to detect copy number variation using high-throughput sequencing,” BMC Bioinformatics, vol. 10, article 80, 2009.
  38. J. Macas, P. Neumann, and A. Navratilova, “Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula,” BMC Genomics, vol. 8, article 427, 2007.
  39. M. J. Ferris and C. F. Hirsch, “Method for isolation and purification of cyanobacteria,” Applied and Environmental Microbiology, vol. 57, no. 5, pp. 1448–1452, 1991.
  40. H. W. Paerl, “Marine plankton,” in The Ecology of Cyanobacteria: Their Diversity in Time and Space, B. Whitton and M. Potts, Eds., pp. 121–148, Springer, New York, NY, USA, 2000.
  41. F. Liu, J. Lu, W. Hu et al., “New perspectives on host-parasite interplay by comparative transcriptomic and proteomic analyses of Schistosoma japonicum,” PLoS Pathogens, vol. 2, no. 4, p. e29, 2006.
  42. J. P. McCutcheon and N. A. Moran, “Parallel genomic evolution and metabolic interdependence in an ancient symbiosis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 49, pp. 19392–19397, 2007.