Research Article

Identification and Quantification of Genomic Repeats and Sample Contamination in Assemblies of 454 Pyrosequencing Reads

Table 1

Annotation of the high-read-depth contigs in the P. rubescens and A. flos-aquae assemblies. For each contig with an estimated copy number of at least 5x, the length, read depth, and estimated copy number ("Est. copy number") with lower and upper confidence interval limits (CL) are shown. BLASTX results are also shown (maximum E value 10^-16). When a contig had hits in multiple different regions, these are separated by a comma. The species to which each BLAST hit belongs is shown in parentheses.
(a) P. rubescens

Contig  Length (bp)  Read depth  Est. copy number  Lower CL  Upper CL  Features

13664   1309         259.7       12.4              9.1       19.1      transposase, IS4 family protein (Nostoc punctiforme PCC 73102)
13972   937          190.3       9.2               6.5       15.1      transposase, IS4 family protein (Cyanothece sp. PCC 8802)
13823   1190         180.2       8.9               6.3       12.9      transposase (Microcystis aeruginosa NIES-843)
13688   1109         176.3       8.8               5.8       12.9      transposase (Trichodesmium erythraeum IMS101)
1363    509          173.3       8.6               5.6       13.7      No hits
13792   9424         173.1       8.7               5.3       13.3      hypothetical protein Npun_R2618 (Nostoc punctiforme PCC 73102), DnaB domain protein helicase domain protein (Cyanothece sp. PCC 7822)
13610   852          163.2       8.0               5.2       12.6      No hits
13711   1051         145.0       7.2               5.3       10.2      No hits
13735   2163         144.0       7.2               4.0       11.5      conserved hypothetical protein (Cyanothece sp. PCC 7425)
13901   611          140.9       7.1               4.7       9.8       transposase, IS605 OrfB family (Cyanothece sp. PCC 8801)
13846   902          132.8       7.2               2.5       9.8       hypothetical protein L8106_22791 (Lyngbya sp. PCC 8106)
14014   770          123.8       6.0               4.3       9.8       Histone-like DNA-binding protein (Lyngbya sp. PCC 8106)
13843   1712         111.4       5.6               3.5       8.5       RNA-directed DNA polymerase (Microcystis aeruginosa NIES-843)
13469   669          104.2       5.2               3.8       7.4       No hits
13858   641          99.5        5.1               3.0       7.2       hypothetical protein L8106_22631 (Lyngbya sp. PCC 8106)
13921   2057         99.3        5.2               1.7       9.4       transposase (Microcystis aeruginosa NIES-843)

13462   1575         76.9        3.7               2.5       6.0       16S rRNA
13463   2891         75.4        3.7               2.3       6.0       23S rRNA

(b) A. flos-aquae

Contig  Length (bp)  Read depth  Est. copy number  Lower CL  Upper CL  Features

13355   911          301.0       14.2              9.6       24.6      transposase (Microcystis aeruginosa NIES-843)
13273   748          257.2       12.5              6.0       25.7      transposase (Nodularia spumigena CCY9414), transposase (Nodularia spumigena CCY9414)
13683   1256         189.6       8.6               5.5       17.7      transposase (Nostoc sp. PCC 7120)
14262   560          174.4       8.2               5.5       14.7      transposase and inactivated derivatives (Syntrophus aciditrophicus)
14128   1185         165.6       8.0               5.0       13.5      unnamed protein product (Microcystis aeruginosa PCC 7806)
12918   1870         163.3       7.7               4.5       14.3      transposase (Microcystis aeruginosa NIES-843)
5251    566          160.6       7.6               4.4       13.7      hypothetical protein AM1_C0013 (Acaryochloris marina MBIC11017), hypothetical protein AM1_C0013 (Acaryochloris marina MBIC11017)
214     567          156.5       7.2               4.9       12.7      conserved hypothetical protein (Microscilla marina ATCC 23134), conserved hypothetical protein (Microscilla marina ATCC 23134)
12934   575          137.3       6.6               3.6       12.0      IS1 transposase subfamily, putative (Synechococcus sp. PCC 7335)
13740   502          115.3       5.4               3.8       9.3       transposase (Cyanothece sp. ATCC 51142)
13286   1782         109.0       6.2               0.2       14.2      No hits

13746   1585         91.0        4.3               2.3       8.1       16S rRNA
13744   2169         89.9        4.2               2.7       7.8       23S rRNA
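
The selection rule stated in the caption (contigs with an estimated copy number of at least 5x) can be illustrated with a minimal Python sketch. It uses four rows copied from part (a) of the table. The names `Contig`, `high_copy`, and `depth_per_copy` are hypothetical and introduced only for this example. The read depth divided by the estimated copy number is computed purely as a descriptive ratio of two table columns; it is an assumption of this sketch and not necessarily the estimator used to produce the table.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    contig: str
    length_bp: int
    read_depth: float
    est_copy: float
    lower_cl: float
    upper_cl: float


# Four rows copied from part (a) of the table; the remaining rows are omitted.
ROWS = [
    Contig("13664", 1309, 259.7, 12.4, 9.1, 19.1),
    Contig("13972", 937, 190.3, 9.2, 6.5, 15.1),
    Contig("13469", 669, 104.2, 5.2, 3.8, 7.4),
    Contig("13462", 1575, 76.9, 3.7, 2.5, 6.0),  # 16S rRNA contig
]


def high_copy(rows, min_copy=5.0):
    """Keep contigs whose estimated copy number is at least min_copy."""
    return [r for r in rows if r.est_copy >= min_copy]


def depth_per_copy(row):
    """Read depth divided by estimated copy number (descriptive ratio only)."""
    return row.read_depth / row.est_copy


selected = high_copy(ROWS)
for r in selected:
    print(r.contig, r.est_copy, round(depth_per_copy(r), 1))
```

With the default threshold, the 16S rRNA contig (estimated copy number 3.7) is excluded, which matches its placement below the main block of rows in the table.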