Research Article

Linking De Novo Assembly Results with Long DNA Reads Using the dnaasm-link Application

Table 2

Evaluation of the dnaasm-link application in comparison with other tools on the datasets described in Table 1. For each dataset, the first row, labeled “no scaffolding”, gives the statistics of the input set (no scaffolding algorithm applied), taken from Table 1. The parameters (No. of contigs, etc.) are described in the first paragraph of the “Results” section.

Dataset        Algorithm             No. of contigs  No. of mis.  N50 [bp]  NA50 [bp]  Max [bp]  Largest algn. [bp]  Avg. mis.  Avg. indels  Avg. N’s

E. coli        no scaffolding                    65            9    176396     164044    398301              360084       2.32         0.17      0.00
E. coli        SSPACE-LongRead                   32           29    398301     211043   1274776              564486       2.47         0.37    570.90
E. coli        LINKS                             23           19    637611     235726   1146701              636452       2.36         0.39    233.43
E. coli        dnaasm-link                       22           20    746714     219242   1128693              636452       2.36         0.37    212.75
E. coli        Fast-SG + OPERA-LG                26           16    349966     342146    659623              658295       2.36         0.30    326.53
E. coli        Fast-SG + BOSS                    60           14    177523     164044    611106              360084       2.32         0.17     64.79
E. coli        Fast-SG + ScaffMatch              55           18    185955     177523    603113              359089       2.41         0.17    139.44

S. cerevisiae  no scaffolding                   430           53     53444      49075    257346              249232      85.77         8.80      0.00
S. cerevisiae  SSPACE-LongRead                  557          105    167867     126607    736874              452023      95.42        11.27   3690.74
S. cerevisiae  LINKS                            202           89    202618     126598    623140              416048      87.04        10.00    850.77
S. cerevisiae  dnaasm-link                      190           92    224004     126353    764024              431875      87.28        10.08    861.19
S. cerevisiae  Fast-SG + OPERA-LG               202           59    180866     155226    736942              451889      85.51         9.72    462.50
S. cerevisiae  Fast-SG + BOSS                   369          113     57097      47994    257346              249232      85.77         8.80    374.16
S. cerevisiae  Fast-SG + ScaffMatch             328          144     80833      51157    434320              249232      85.41         8.82    489.70

The following reference sequences were used to evaluate the results: NC_000913 for E. coli and NC_001133 … NC_001148, NC_001224 for S. cerevisiae.
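
To illustrate how the N50 column can be read, here is a minimal sketch that assumes the common definition of N50: the largest length L such that sequences of length at least L together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and not taken from the paper; the values in Table 2 were not produced by this code. NA50 is not computed here because it requires alignments to the reference sequences.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths (in bp).

    Assumes the common definition: walk the sequences from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one sequence length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Hypothetical toy lengths, for illustration only.
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))  # prints 8
```

Under this definition, joining sequences into longer scaffolds can only keep N50 the same or raise it, which is why N50 is read together with the misassembly count and NA50 rather than on its own.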