Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling
Table 10
Macro-averaged F-measure results of the compared methods (with BA-LDA (DB)-based representation).
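| Classification algorithm | oh5  | oh10 | oh15 | ohscal | ohsumed |
|--------------------------|------|------|------|--------|---------|
| NB                       | 0.89 | 0.82 | 0.88 | 0.84   | 0.48    |
| SVM                      | 0.90 | 0.83 | 0.89 | 0.86   | 0.51    |
| Bagging+NB               | 0.90 | 0.84 | 0.90 | 0.84   | 0.49    |
| Bagging+SVM              | 0.89 | 0.86 | 0.89 | 0.85   | 0.51    |
| AdaBoost+NB              | 0.91 | 0.84 | 0.88 | 0.87   | 0.52    |
| AdaBoost+SVM             | 0.89 | 0.86 | 0.88 | 0.87   | 0.52    |
| RandomSubspace+NB        | 0.90 | 0.86 | 0.88 | 0.90   | 0.52    |
| RandomSubspace+SVM       | 0.90 | 0.86 | 0.91 | 0.90   | 0.51    |
| Stacking                 | 0.90 | 0.87 | 0.91 | 0.88   | 0.54    |
| ESM                      | 0.90 | 0.88 | 0.92 | 0.90   | 0.53    |
| BES                      | 0.93 | 0.90 | 0.95 | 0.93   | 0.55    |
| LibD3C                   | 0.94 | 0.92 | 0.95 | 0.94   | 0.56    |
| CDM                      | 0.95 | 0.93 | 0.97 | 0.95   | 0.57    |
| Proposed scheme          | 0.97 | 0.95 | 0.98 | 0.96   | 0.61    |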
NB: Naïve Bayes algorithm, SVM: support vector machines, ESM: ensemble selection from libraries of models, BES: Bagging ensemble selection, LibD3C: hybrid ensemble pruning based on k-means and dynamic selection, and CDM: ensemble pruning based on combined diversity measures.
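For readers unfamiliar with the metric reported in Table 10, the following is a minimal sketch of one common definition of the macro-averaged F-measure: the unweighted mean of the per-class F1 scores. Whether the values in the table were computed with exactly this variant is an assumption, since the table itself does not state the formula. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (one common macro-F definition).

    Assumption: this variant averages F1 over classes; another variant
    computes F from macro-averaged precision and recall instead.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels purely for illustration
toy_true = ["a", "a", "b", "b"]
toy_pred = ["a", "b", "b", "b"]
toy_score = macro_f1(toy_true, toy_pred)
```

Because every class contributes equally regardless of its size, a method that performs poorly on rare categories is penalized more under this measure than under micro-averaging.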