Research Article

Visualization of Genome Signatures of Eukaryote Genomes by Batch-Learning Self-Organizing Map with a Special Emphasis on Drosophila Genomes

Figure 1

BLSOMs for overlapping 100 kb sequences (10 kb sliding step) and nonoverlapping 10 kb and 5 kb sequences from 101 eukaryotic genomes. (a) DegeTetra- and (b) DegePenta-BLSOMs. Each BLSOM was constructed with frequencies for degenerate sets, in which the frequencies of a pair of complementary tetra- or pentanucleotides were summed. Nodes that include sequences from more than one phylogenetic family are indicated in black, those that contain no genomic sequences are indicated in white, and those containing sequences from a single family are indicated in color. Individual phylogenetic families were difficult to distinguish by color because 47 families were analyzed simultaneously, but the rarity of black nodes showed that sequences clustered properly according to phylotype. (c) 100 kb DegePenta-%GC; the %GC was calculated from the vectorial data representing each node in the 100 kb DegePenta-BLSOM and divided into nine categories with an equal number of nodes, as listed at the bottom of this panel; the %GC ranged from 19.5 to 58.1 and the midvalue was 38.8.
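The degenerate-set frequencies described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a "pair of complementary" oligonucleotides means a k-mer and its reverse complement (for example, AAAC and GTTT), that windows containing non-ACGT characters simply skip those k-mers, and that frequencies are normalized by the total k-mer count in the window. All function names are hypothetical. Under these assumptions, the degenerate sets number 136 for tetranucleotides and 512 for pentanucleotides.

```python
from collections import Counter
from itertools import product

# Translation table for complementing A/C/G/T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an uppercase ACGT string."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement
    (assumption: the lexicographically smaller of the two)."""
    return min(kmer, reverse_complement(kmer))


def degenerate_keys(k):
    """List every degenerate set of length k, one representative each."""
    return sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def degenerate_frequencies(seq, k):
    """Sum the counts of each k-mer with its reverse complement and
    normalize by the number of valid k-mers in the sequence."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers with N or other symbols
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    keys = degenerate_keys(k)
    if total == 0:
        return {key: 0.0 for key in keys}
    return {key: counts[key] / total for key in keys}


def sliding_windows(seq, window, step):
    """Yield (start, subsequence) pairs for full-length windows only."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]
```

For the overlapping case in the caption, one would call `sliding_windows(genome, 100_000, 10_000)` and pass each window to `degenerate_frequencies(window, 4)` or `degenerate_frequencies(window, 5)`; setting the step equal to the window length gives the nonoverlapping case. The resulting vectors would then serve as input to the map, which this sketch does not attempt to reproduce.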