K-mer-Based Motif Analysis in Insect Species across Anopheles, Drosophila, and Glossina Genera and Its Application to Species Classification
Table 2
CC statistics for k-mers of lengths 7–9 bp for different combinations of the genera under study.
| Group comparison | Min | Median | Mean | Max | Std. dev. | No. of comparisons |
|---|---|---|---|---|---|---|
| Heptamers | | | | | | |
| Anopheles | 0.913 | 0.957 | 0.955 | 0.999 | 0.022 | 231 |
| Non-Anopheles | 0.590 | 0.833 | 0.837 | 0.999 | 0.087 | 630 |
| Drosophila | 0.590 | 0.874 | 0.869 | 0.999 | 0.072 | 435 |
| Non-Drosophila | 0.677 | 0.938 | 0.882 | 0.999 | 0.104 | 378 |
| Glossina | 0.965 | 0.994 | 0.986 | 0.999 | 0.014 | 15 |
| Non-Glossina | 0.441 | 0.739 | 0.772 | 0.999 | 0.144 | 1326 |
| Anopheles vs. Drosophila | 0.441 | 0.648 | 0.644 | 0.770 | 0.059 | 660 |
| Anopheles vs. Glossina | 0.677 | 0.740 | 0.744 | 0.787 | 0.027 | 132 |
| Drosophila vs. Glossina | 0.642 | 0.749 | 0.745 | 0.812 | 0.033 | 180 |
| C. briggsae vs. Anopheles | 0.528 | 0.559 | 0.562 | 0.643 | 0.030 | 22 |
| C. briggsae vs. Drosophila | 0.266 | 0.620 | 0.573 | 0.667 | 0.102 | 30 |
| C. briggsae vs. Glossina | 0.485 | 0.492 | 0.499 | 0.534 | 0.018 | 6 |
| A. mellifera vs. Anopheles | 0.568 | 0.617 | 0.629 | 0.702 | 0.043 | 22 |
| A. mellifera vs. Drosophila | 0.242 | 0.484 | 0.474 | 0.567 | 0.065 | 30 |
| A. mellifera vs. Glossina | 0.570 | 0.590 | 0.589 | 0.617 | 0.017 | 6 |
| Octamers | | | | | | |
| Anopheles | 0.904 | 0.950 | 0.948 | 0.999 | 0.023 | 231 |
| Non-Anopheles | 0.588 | 0.824 | 0.822 | 0.998 | 0.089 | 630 |
| Drosophila | 0.588 | 0.858 | 0.857 | 0.997 | 0.069 | 435 |
| Non-Drosophila | 0.655 | 0.93 | 0.869 | 0.999 | 0.113 | 378 |
| Glossina | 0.948 | 0.988 | 0.978 | 0.998 | 0.020 | 15 |
| Non-Glossina | 0.443 | 0.723 | 0.761 | 0.999 | 0.143 | 1326 |
| Anopheles vs. Drosophila | 0.443 | 0.637 | 0.633 | 0.760 | 0.055 | 660 |
| Anopheles vs. Glossina | 0.655 | 0.716 | 0.719 | 0.755 | 0.026 | 132 |
| Drosophila vs. Glossina | 0.621 | 0.728 | 0.723 | 0.791 | 0.034 | 180 |
| C. briggsae vs. Anopheles | 0.521 | 0.554 | 0.556 | 0.634 | 0.029 | 22 |
| C. briggsae vs. Drosophila | 0.279 | 0.610 | 0.567 | 0.652 | 0.094 | 30 |
| C. briggsae vs. Glossina | 0.477 | 0.484 | 0.490 | 0.522 | 0.017 | 6 |
| A. mellifera vs. Anopheles | 0.564 | 0.611 | 0.624 | 0.696 | 0.042 | 22 |
| A. mellifera vs. Drosophila | 0.259 | 0.481 | 0.477 | 0.565 | 0.061 | 30 |
| A. mellifera vs. Glossina | 0.564 | 0.585 | 0.583 | 0.608 | 0.016 | 6 |
| Nonamers | | | | | | |
| Anopheles | 0.886 | 0.939 | 0.938 | 0.996 | 0.025 | 231 |
| Non-Anopheles | 0.577 | 0.805 | 0.801 | 0.993 | 0.092 | 630 |
| Drosophila | 0.577 | 0.838 | 0.839 | 0.992 | 0.069 | 435 |
| Non-Drosophila | 0.629 | 0.919 | 0.852 | 0.996 | 0.121 | 378 |
| Glossina | 0.919 | 0.975 | 0.961 | 0.993 | 0.028 | 15 |
| Non-Glossina | 0.439 | 0.705 | 0.747 | 0.996 | 0.143 | 1326 |
| Anopheles vs. Drosophila | 0.439 | 0.624 | 0.619 | 0.746 | 0.053 | 660 |
| Anopheles vs. Glossina | 0.629 | 0.689 | 0.691 | 0.724 | 0.024 | 132 |
| Drosophila vs. Glossina | 0.589 | 0.697 | 0.694 | 0.766 | 0.034 | 180 |
| C. briggsae vs. Anopheles | 0.510 | 0.544 | 0.545 | 0.619 | 0.027 | 22 |
| C. briggsae vs. Drosophila | 0.285 | 0.594 | 0.553 | 0.636 | 0.086 | 30 |
| C. briggsae vs. Glossina | 0.464 | 0.470 | 0.475 | 0.503 | 0.014 | 6 |
| A. mellifera vs. Anopheles | 0.555 | 0.602 | 0.615 | 0.685 | 0.041 | 22 |
| A. mellifera vs. Drosophila | 0.270 | 0.475 | 0.474 | 0.558 | 0.058 | 30 |
| A. mellifera vs. Glossina | 0.551 | 0.572 | 0.570 | 0.592 | 0.014 | 6 |
CC values were calculated within each of the genera Anopheles, Drosophila, and Glossina, between these three genera, and between each of two outlier species, Apis mellifera and Caenorhabditis elegans, and these three genera. For each combination, the minimum, median, mean, and maximum CC values were calculated, as well as the standard deviation and the number of species comparisons.
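The comparison counts in the table follow from simple pair counting, and the per-group statistics are ordinary summary statistics over the pairwise CC values. The Python sketch below illustrates both under stated assumptions. The group sizes (22 Anopheles, 30 Drosophila, 6 Glossina species) are inferred from the "No. of comparisons" column and are not given in the table itself. The species labels and function names are hypothetical. The use of the sample standard deviation is also an assumption, since the table does not say which form was used.

```python
from itertools import combinations, product
from statistics import mean, median, stdev


def within_group_pairs(species):
    """All unordered species pairs inside one group: n * (n - 1) / 2 pairs."""
    return list(combinations(species, 2))


def between_group_pairs(group_a, group_b):
    """All cross-group species pairs: len(group_a) * len(group_b) pairs."""
    return list(product(group_a, group_b))


def summarize(cc_values):
    """Summary statistics in the column order used by the table.

    Assumption: sample standard deviation (statistics.stdev).
    """
    values = list(cc_values)
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
        "std_dev": stdev(values),
        "n_comparisons": len(values),
    }


# Hypothetical placeholder labels; group sizes are inferred from the table.
anopheles = [f"Anopheles_{i}" for i in range(22)]
drosophila = [f"Drosophila_{i}" for i in range(30)]
glossina = [f"Glossina_{i}" for i in range(6)]

print(len(within_group_pairs(anopheles)))                # Anopheles row
print(len(within_group_pairs(drosophila + glossina)))    # Non-Anopheles row
print(len(between_group_pairs(anopheles, drosophila)))   # Anopheles vs. Drosophila row
```

With these assumed group sizes, the pair counts reproduce every value in the "No. of comparisons" column, including the single-outlier rows (22, 30, and 6 comparisons), where one outlier species is compared against every species of a genus.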