The Scientific World Journal, 2014, Research Article
Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization

Table 7: Comparison of nine improved and existing feature-selection methods with respect to the micro-F1 measure on WebKB for naïve Bayes (NB) and support vector machines (SVM). Bold values indicate the best performance of each classifier across the feature-selection methods.
| Feature selection | NB 400 | NB 800 | NB 1200 | NB 1600 | NB 2000 | SVM 400 | SVM 800 | SVM 1200 | SVM 1600 | SVM 2000 |
|---|---|---|---|---|---|---|---|---|---|---|
| IG | 71.36 | 74.17 | 75.84 | 76.61 | 77.46 | 83.52 | 85.66 | 86.10 | 86.80 | 86.77 |
| IGX | 71.64 | 73.96 | 75.64 | 76.57 | 77.48 | 84.10 | 86.29 | 86.76 | 87.27 | 87.33 |
| CHI | 71.92 | 74.16 | 74.65 | 76.19 | 76.93 | 85.88 | 86.52 | 86.08 | 86.89 | 86.89 |
| CHIX | 74.98 | 76.28 | 77.39 | 78.57 | 78.83 | 86.69 | 86.97 | 87.26 | 87.63 | 87.34 |
| MI | 33.96 | 34.63 | 36.32 | 40.71 | 49.58 | 44.96 | 48.60 | 57.23 | 61.75 | 64.44 |
| MIX | 33.69 | 38.31 | 44.49 | 51.00 | 54.30 | 38.37 | 46.86 | 50.56 | 56.63 | 60.85 |
| DF | 71.04 | 73.74 | 75.82 | 76.87 | 77.23 | 83.84 | 85.87 | 87.27 | 87.04 | 87.05 |
| DFX | 71.93 | 74.48 | 76.57 | 77.52 | 77.55 | 84.51 | 86.28 | 87.73 | 87.64 | 87.59 |
| GINI | 72.13 | 75.58 | 77.68 | 77.84 | 78.42 | 86.17 | 86.81 | 87.09 | 87.13 | 86.90 |
| GINIX | 73.75 | 76.96 | 78.06 | 78.80 | 78.97 | 86.61 | 87.52 | 88.20 | 87.56 | 87.44 |
| DIA | 47.75 | 48.40 | 46.06 | 48.77 | 51.00 | 62.66 | 69.13 | 74.84 | 76.95 | 79.03 |
| DIAX | 50.11 | 53.23 | 60.99 | 63.86 | 63.68 | 56.42 | 63.42 | 69.97 | 71.73 | 71.36 |
| CMFS | 72.14 | 73.22 | 75.62 | 76.81 | 77.33 | 85.64 | 86.06 | 86.72 | 87.63 | 87.17 |
| CMFSX | 73.37 | 75.89 | 77.63 | 78.06 | 78.70 | 86.04 | 86.66 | 87.29 | 87.54 | 87.75 |
| OCFS | 73.07 | 75.23 | 76.46 | 77.09 | 78.02 | 84.39 | 85.64 | 86.49 | 86.84 | 87.04 |
| OCFSX | 73.79 | 76.52 | 78.05 | 78.27 | 78.53 | 87.19 | 86.86 | 86.85 | 87.44 | 87.78 |
| DFPFS | 65.17 | 64.74 | 64.65 | 64.66 | 65.33 | 80.69 | 81.46 | 81.27 | 81.26 | 80.28 |
| DFPFSX | 67.96 | 68.81 | 68.37 | 68.58 | 68.83 | 81.83 | 82.08 | 82.11 | 82.00 | 82.01 |
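
For readers unfamiliar with the measure behind these scores, the following is a minimal Python sketch of micro-averaged F1, assuming the table reports the standard definition: true positives, false positives, and false negatives are pooled over all categories before precision and recall are computed. The function name `micro_f1` and the toy labels are hypothetical and are not taken from the article; the authors' exact evaluation code is not shown in the source.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1.

    Illustrative helper (hypothetical name), assuming one label per document.
    """
    tp = fp = fn = 0
    labels = set(y_true) | set(y_pred)
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1          # correctly assigned to this class
            elif p == label and t != label:
                fp += 1          # wrongly assigned to this class
            elif p != label and t == label:
                fn += 1          # missed a document of this class
    if tp == 0:
        return 0.0               # avoid division by zero when nothing is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (assumed, not data from the article).
y_true = ["course", "faculty", "student", "student", "project"]
y_pred = ["course", "student", "student", "student", "project"]
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {100 * score:.2f}")
```

In the single-label setting assumed here, every misclassified document counts once as a false positive and once as a false negative, so micro-F1 coincides with plain accuracy.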