Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling
Table 6
Classification results obtained by conventional algorithms and by the proposed diversity-based ensemble pruning, using an LDA-based representation (k = 50).
| Classification algorithm | oh5 | oh10 | oh15 | ohscal | ohsumed |
|---|---|---|---|---|---|
| NB | 75.19 | 67.43 | 70.77 | 60.24 | 29.41 |
| SVM | 77.59 | 80.29 | 84.47 | 71.58 | 34.72 |
| Bagging+NB | 76.08 | 69.77 | 70.94 | 60.21 | 29.21 |
| Bagging+SVM | 84.36 | 77.20 | 79.07 | 71.92 | 35.98 |
| AdaBoost+NB | 73.53 | 68.07 | 70.26 | 60.09 | 29.60 |
| AdaBoost+SVM | 84.06 | 77.19 | 78.88 | 72.08 | 35.03 |
| RandomSubspace+NB | 74.75 | 67.29 | 68.51 | 57.58 | 28.60 |
| RandomSubspace+SVM | 78.02 | 69.89 | 71.22 | 67.65 | 31.80 |
| Stacking | 83.78 | 81.32 | 81.69 | 60.02 | 40.76 |
| ESM | 79.25 | 79.07 | 78.91 | 72.52 | 37.84 |
| BES | 80.11 | 80.61 | 81.08 | 73.02 | 40.04 |
| LibD3C | 82.86 | 82.93 | 84.51 | 74.86 | 41.17 |
| CDM | 84.77 | 84.13 | 85.32 | 76.45 | 43.55 |
| DEP (Genetic clustering) | 81.61 | 81.96 | 84.64 | 74.21 | 43.27 |
| DEP (PSO clustering) | 80.91 | 81.41 | 83.31 | 73.98 | 45.73 |
| DEP (Firefly clustering) | 86.52 | 86.08 | 86.29 | 77.47 | 47.48 |
| DEP (Cuckoo clustering) | 85.06 | 83.00 | 85.84 | 76.81 | 45.43 |
| DEP (Bat clustering) | 84.47 | 84.18 | 82.11 | 72.70 | 44.13 |
NB: Naïve Bayes algorithm, SVM: support vector machines, ESM: ensemble selection from libraries of models, BES: Bagging ensemble selection, LibD3C: hybrid ensemble pruning based on k-means and dynamic selection, CDM: ensemble pruning based on combined diversity measures, and DEP: the proposed diversity-based ensemble pruning.