The Scientific World Journal, 2014, Research Article
Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization

Table 8: Comparison of nine improved and existing feature-selection methods with respect to AUC on WebKB for NB and SVM. Bold values indicate the best performance of each classifier across the feature-selection methods.
| Feature selection | NB 400 | NB 800 | NB 1200 | NB 1600 | NB 2000 | SVM 400 | SVM 800 | SVM 1200 | SVM 1600 | SVM 2000 |
|---|---|---|---|---|---|---|---|---|---|---|
| IG | 0.8051 | 0.8253 | 0.8369 | 0.8399 | 0.8442 | 0.8933 | 0.9044 | 0.9093 | 0.9128 | 0.9120 |
| IGX | 0.8055 | 0.8242 | 0.8357 | 0.8398 | 0.8444 | 0.8973 | 0.9080 | 0.9134 | 0.9153 | 0.9151 |
| CHI | 0.8214 | 0.8312 | 0.8365 | 0.8402 | 0.8423 | 0.9090 | 0.9114 | 0.9095 | 0.9131 | 0.9133 |
| CHIX | 0.8345 | 0.8416 | 0.8478 | 0.8521 | 0.8520 | 0.9130 | 0.9137 | 0.9158 | 0.9191 | 0.9170 |
| MI | 0.5009 | 0.5030 | 0.5113 | 0.5419 | 0.6082 | 0.6357 | 0.6618 | 0.7137 | 0.7478 | 0.7620 |
| MIX | 0.5237 | 0.5505 | 0.5925 | 0.6359 | 0.6593 | 0.5681 | 0.6359 | 0.6653 | 0.7024 | 0.7310 |
| DF | 0.8041 | 0.8242 | 0.8344 | 0.8404 | 0.8422 | 0.8952 | 0.9078 | 0.9157 | 0.9140 | 0.9143 |
| DFX | 0.8051 | 0.8262 | 0.8382 | 0.8432 | 0.8439 | 0.8981 | 0.9103 | 0.9181 | 0.9174 | 0.9168 |
| GINI | 0.8177 | 0.8358 | 0.8463 | 0.8476 | 0.8492 | 0.9105 | 0.9144 | 0.9171 | 0.9155 | 0.9138 |
| GINIX | 0.8229 | 0.8420 | 0.8480 | 0.8534 | 0.8533 | 0.9123 | 0.9179 | 0.9229 | 0.9184 | 0.9174 |
| DIA | 0.6252 | 0.6372 | 0.6058 | 0.6192 | 0.6465 | 0.7867 | 0.8151 | 0.8420 | 0.8575 | 0.8696 |
| DIAX | 0.5700 | 0.5931 | 0.6111 | 0.6279 | 0.6511 | 0.5791 | 0.6142 | 0.6473 | 0.6842 | 0.7052 |
| CMFS | 0.8132 | 0.8216 | 0.8354 | 0.8397 | 0.8430 | 0.9067 | 0.9098 | 0.9123 | 0.9181 | 0.9152 |
| CMFSX | 0.8183 | 0.8351 | 0.8445 | 0.8475 | 0.8513 | 0.9079 | 0.9123 | 0.9165 | 0.9185 | 0.9190 |
| OCFS | 0.8189 | 0.8349 | 0.8436 | 0.8453 | 0.8490 | 0.9106 | 0.9041 | 0.9119 | 0.9160 | 0.9158 |
| OCFSX | 0.8244 | 0.8401 | 0.8480 | 0.8488 | 0.8508 | 0.9151 | 0.9112 | 0.9141 | 0.9168 | 0.9191 |
| DFPFS | 0.7504 | 0.7507 | 0.7511 | 0.7517 | 0.7525 | 0.8735 | 0.8750 | 0.8743 | 0.8741 | 0.8729 |
| DFPFSX | 0.7723 | 0.7770 | 0.7772 | 0.7782 | 0.7786 | 0.8815 | 0.8824 | 0.8819 | 0.8806 | 0.8809 |