Research Article

Linking De Novo Assembly Results with Long DNA Reads Using the dnaasm-link Application

Figure 1

The process of generating and filtering k-mer pairs from long DNA reads. (a) First, a Bloom filter and an array containing the number of occurrences of each k-mer are built from the k-spectrum of the input set of contigs. (b) From each long DNA read, a set of k-mer pairs is generated using a fixed k-mer length, a fixed distance between the beginning of the first k-mer and the end of the second, and a fixed sliding step. (c) The generated k-mer pairs are filtered with the Bloom filter; pairs containing a k-mer that is absent from the contigs are discarded (dotted arrows). (d) The resulting set of k-mer pairs after the second filtering step, in which pairs containing nonunique k-mers (red arrows) are discarded. Note that the resulting set of k-mer pairs (d) is much smaller than the generated set (b) because of errors in the long DNA reads and repetitive regions of the investigated genome.
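The four steps in the caption can be illustrated with a minimal Python sketch. It is not the dnaasm-link implementation: the function names and the parameter names k (k-mer length), d (distance from the start of the first k-mer to the end of the second), and t (sliding step) are hypothetical, chosen here for illustration only. A plain Counter stands in for both the Bloom filter and the occurrence array, and reverse complements are ignored for brevity.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_contig_index(contigs, k):
    """Step (a): count k-mer occurrences over all contigs.

    Assumption: an exact Counter replaces the Bloom filter (membership)
    and the occurrence array (counts) described in the caption.
    """
    counts = Counter()
    for contig in contigs:
        counts.update(kmers(contig, k))
    return counts


def generate_pairs(read, k, d, t):
    """Step (b): generate k-mer pairs from one long read.

    The first k-mer starts at position i and the second k-mer ends at
    position i + d, so the pair spans exactly d bases. The window
    advances by t bases each time.
    """
    pairs = []
    for i in range(0, len(read) - d + 1, t):
        first = read[i:i + k]
        second = read[i + d - k:i + d]
        pairs.append((first, second))
    return pairs


def filter_pairs(pairs, counts):
    """Steps (c) and (d): keep pairs whose k-mers occur in the contigs,
    then keep only pairs in which both k-mers occur exactly once."""
    present = [p for p in pairs if p[0] in counts and p[1] in counts]
    unique = [p for p in present
              if counts[p[0]] == 1 and counts[p[1]] == 1]
    return present, unique


if __name__ == "__main__":
    contigs = ["ACGTTGCAAT", "ACGGG"]  # toy data, not from the article
    counts = build_contig_index(contigs, k=3)
    pairs = generate_pairs("ACGTTGCAAT", k=3, d=6, t=1)
    present, unique = filter_pairs(pairs, counts)
    print(len(pairs), len(present), len(unique))
```

In this toy input the k-mer ACG occurs in both contigs, so the pair that starts with it survives the membership filter but is dropped by the uniqueness filter. A read containing a sequencing error would instead lose pairs already at the membership step, which is the effect the caption attributes to errors in long reads.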