Computational and Mathematical Methods in Medicine

Research Article

Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling

Table 9

Macro-averaged F-measure results obtained by conventional algorithms and the proposed diversity-based ensemble pruning, using an LDA-based representation (k = 50).


Classification algorithm	oh5	oh10	oh15	ohscal	ohsumed
NB	0.76	0.68	0.72	0.61	0.30
SVM	0.78	0.81	0.86	0.73	0.35
Bagging+NB	0.77	0.70	0.72	0.61	0.30
Bagging+SVM	0.85	0.78	0.81	0.73	0.37
AdaBoost+NB	0.74	0.69	0.72	0.61	0.31
AdaBoost+SVM	0.85	0.78	0.80	0.74	0.36
RandomSubspace+NB	0.76	0.68	0.70	0.59	0.29
RandomSubspace+SVM	0.79	0.71	0.73	0.69	0.33
Stacking	0.84	0.80	0.81	0.72	0.38
ESM	0.80	0.81	0.81	0.74	0.39
BES	0.81	0.82	0.83	0.75	0.41
LibD3C	0.84	0.85	0.86	0.76	0.42
CDM	*0.86*	*0.86*	*0.87*	*0.78*	0.45
DEP (Genetic clustering)	0.82	0.84	0.86	0.76	0.45
DEP (PSO clustering)	0.82	0.83	0.85	0.75	*0.47*
DEP (Firefly clustering)	0.87	0.88	0.88	0.79	0.49
DEP (Cuckoo clustering)	*0.86*	0.85	0.88	*0.78*	*0.47*
DEP (Bat clustering)	0.85	*0.86*	0.84	0.74	0.45

NB: Naïve Bayes algorithm, SVM: support vector machines, ESM: ensemble selection from libraries of models, BES: Bagging ensemble selection, LibD3C: hybrid ensemble pruning based on k-means and dynamic selection, CDM: ensemble pruning based on combined diversity measures, and DEP: the proposed diversity-based ensemble pruning. Values marked with asterisks are the second-highest in their column.
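
For readers unfamiliar with the metric reported in the table, the following is a minimal sketch of one common definition of the macro-averaged F-measure: the F1 score is computed separately for each class and the per-class scores are averaged with equal weight. This is an assumption about the definition; the article may instead average precision and recall first and then combine them. The function name `macro_f_measure` and the toy labels are hypothetical and are not taken from the article.

```python
from collections import Counter


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three hypothetical classes
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
score = macro_f_measure(y_true, y_pred)
```

Because every class contributes equally regardless of its size, a classifier that performs poorly on rare categories is penalized more heavily than it would be under micro-averaging, which may partly explain the low scores in the ohsumed column.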