Research Article

Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling

Table 9

Macro-averaged F-measure results obtained by the conventional algorithms and the proposed diversity-based ensemble pruning, using the LDA-based representation (k = 50).

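| Classification algorithm | oh5 | oh10 | oh15 | ohscal | ohsumed |
|---|---|---|---|---|---|
| NB | 0.76 | 0.68 | 0.72 | 0.61 | 0.30 |
| SVM | 0.78 | 0.81 | 0.86 | 0.73 | 0.35 |
| Bagging+NB | 0.77 | 0.70 | 0.72 | 0.61 | 0.30 |
| Bagging+SVM | 0.85 | 0.78 | 0.81 | 0.73 | 0.37 |
| AdaBoost+NB | 0.74 | 0.69 | 0.72 | 0.61 | 0.31 |
| AdaBoost+SVM | 0.85 | 0.78 | 0.80 | 0.74 | 0.36 |
| RandomSubspace+NB | 0.76 | 0.68 | 0.70 | 0.59 | 0.29 |
| RandomSubspace+SVM | 0.79 | 0.71 | 0.73 | 0.69 | 0.33 |
| Stacking | 0.84 | 0.80 | 0.81 | 0.72 | 0.38 |
| ESM | 0.80 | 0.81 | 0.81 | 0.74 | 0.39 |
| BES | 0.81 | 0.82 | 0.83 | 0.75 | 0.41 |
| LibD3C | 0.84 | 0.85 | 0.86 | 0.76 | 0.42 |
| CDM | 0.86 | 0.86 | 0.87 | 0.78 | 0.45 |
| DEP (Genetic clustering) | 0.82 | 0.84 | 0.86 | 0.76 | 0.45 |
| DEP (PSO clustering) | 0.82 | 0.83 | 0.85 | 0.75 | 0.47 |
| DEP (Firefly clustering) | 0.87 | 0.88 | 0.88 | 0.79 | 0.49 |
| DEP (Cuckoo clustering) | 0.86 | 0.85 | 0.88 | 0.78 | 0.47 |
| DEP (Bat clustering) | 0.85 | 0.86 | 0.84 | 0.74 | 0.45 |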

NB: Naïve Bayes algorithm, SVM: support vector machines, ESM: ensemble selection from libraries of models, BES: Bagging ensemble selection, LibD3C: hybrid ensemble pruning based on k-means and dynamic selection, CDM: ensemble pruning based on combined diversity measures, and DEP: the proposed diversity-based ensemble pruning.
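
For readers unfamiliar with the metric reported in the table, the following is a minimal sketch of one common definition of the macro-averaged F-measure: the F1 score is computed separately for each class and the per-class scores are then averaged with equal weight. This excerpt does not state which exact variant the authors used, so the definition here is an assumption. The function name `macro_f_measure` and the toy labels are hypothetical and do not come from the article.

```python
from collections import Counter


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the document was not p
            fn[t] += 1  # true class t was missed

    # Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 when the class never occurs.
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)

    # Macro-averaging gives every class the same weight, regardless of its size.
    return sum(scores) / len(scores)


# Toy example with two hypothetical classes "a" and "b".
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
score = macro_f_measure(y_true, y_pred)
```

Because every class contributes equally, a classifier that performs poorly on small classes is penalized more under macro-averaging than under micro-averaging, which matters for collections with many unevenly sized categories.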