Research Article

Distance Variance Score: An Efficient Feature Selection Method in Text Classification

Table 2

Accuracy of classifiers (unit for mean and std: %).
(a) DBWorld data set

          LS                                      DVS
          DT(C5.0)            SVM                 DT(C5.0)            SVM
Features  Mean   Std    Sig.  Mean   Std    Sig.  Mean   Std    Sig.  Mean   Std    Sig.

3721      91.30   7.54  -     85.73   6.73  -     91.30   7.54  -     85.73   6.73  -
2721      82.64   8.17  .004  84.14   7.06  .505  89.50   6.59  .736  85.22   6.10  .807
1721      70.27   8.22  .000  80.73   6.68  .023  89.61   5.57  .779  87.02   8.92  .291
721       58.53  10.52  .000  63.97  11.09  .000  91.74   6.07  .433  84.15   8.67  .129
521       56.49   6.94  .000  62.66   9.16  .000  89.07   7.08  .185  85.10   7.57  .584
321       53.58  10.79  .000  57.75   8.83  .000  92.16   4.58  .454  83.55   7.19  .192
121       52.44  10.34  .000  51.92  10.91  .000  92.74   6.61  .179  90.56   6.16  .098
21        58.03   6.73  .000  57.38   9.53  .000  88.26   7.68  .332  91.40   6.41  .722

(b) CNAE data set

          LS                                      DVS
          DT(C5.0)            SVM                 DT(C5.0)            SVM
Features  Mean   Std    Sig.  Mean   Std    Sig.  Mean   Std    Sig.  Mean   Std    Sig.

856       87.59   2.25  -     92.93   1.58  -     87.59   2.25  -     92.93   1.58  -
656       86.67   2.21  .910  82.19   4.52  .000  87.64   1.77  .924  83.41   2.02  .000
456       86.87   2.62  .394  84.42   2.26  .000  88.25   1.86  .338  85.18   2.73  .000
256       73.11   2.77  .000  66.53   2.09  .000  87.93   2.31  .700  84.53   2.12  .000
156       17.60   1.48  .000  15.08   2.48  .000  86.54   2.46  .146  82.77   2.33  .000
56         8.86   0.93  .000   5.33   0.95  .000  81.99   1.90  .000  77.47   2.33  .000