Research Article

Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling

Table 5

Classification accuracies obtained with different LDA-based configurations.

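NB: Naive Bayes; SVM: Support Vector Machines.

| Configuration | NB: oh5 | NB: oh10 | NB: oh15 | NB: ohscal | NB: Ohsu-med | SVM: oh5 | SVM: oh10 | SVM: oh15 | SVM: ohscal | SVM: Ohsu-med |
|---|---|---|---|---|---|---|---|---|---|---|
| LDA (k=50) | 74.38 | 66.66 | 69.40 | 59.27 | 28.35 | 76.24 | 78.73 | 83.17 | 70.62 | 34.64 |
| LDA (k=100) | 70.85 | 63.64 | 67.44 | 60.05 | 29.56 | 78.28 | 78.25 | 83.23 | 73.23 | 38.82 |
| LDA (k=150) | 69.02 | 65.24 | 65.51 | 59.01 | 29.43 | 76.72 | 79.09 | 84.74 | 73.8 | 41.27 |
| LDA (k=200) | 66.17 | 64.01 | 63.61 | 58.93 | 27.99 | 77.33 | 77.93 | 84 | 74.19 | 41.82 |
| GA-LDA (BIC) | 75.16 | 67.24 | 74.70 | 71.66 | 35.45 | 77.98 | 69.03 | 75.12 | 73.62 | 35.83 |
| PSO-LDA (BIC) | 75.40 | 68.60 | 76.90 | 72.43 | 35.46 | 78.22 | 72.56 | 75.17 | 75.89 | 36.23 |
| FA-LDA (BIC) | 75.48 | 71.26 | 77.48 | 72.80 | 35.60 | 79.50 | 74.73 | 76.63 | 76.90 | 37.69 |
| CSA-LDA (BIC) | 76.66 | 71.96 | 78.77 | 72.94 | 35.65 | 79.56 | 75.97 | 77.96 | 77.02 | 37.94 |
| BA-LDA (BIC) | 78.82 | 72.21 | 79.77 | 73.02 | 36.58 | 79.85 | 76.53 | 78.89 | 77.34 | 38.89 |
| GA-LDA (CH) | 79.02 | 72.88 | 80.11 | 74.53 | 36.85 | 80.62 | 77.72 | 80.31 | 78.17 | 38.96 |
| PSO-LDA (CH) | 80.20 | 72.93 | 80.66 | 74.76 | 37.03 | 81.50 | 77.91 | 80.50 | 78.99 | 39.03 |
| FA-LDA (CH) | 81.20 | 72.99 | 80.72 | 75.13 | 37.75 | 81.80 | 77.99 | 80.55 | 79.09 | 39.03 |
| CSA-LDA (CH) | 81.40 | 73.12 | 81.71 | 76.02 | 38.34 | 82.61 | 78.01 | 80.78 | 79.82 | 39.03 |
| BA-LDA (CH) | 81.46 | 73.49 | 81.82 | 76.21 | 39.24 | 82.87 | 78.93 | 81.01 | 79.89 | 39.52 |
| GA-LDA (DB) | 84.46 | 76.22 | 84.13 | 78.71 | 40.50 | 84.73 | 80.95 | 85.88 | 82.46 | 43.02 |
| PSO-LDA (DB) | 84.60 | 80.07 | 85.14 | 79.21 | 42.57 | 85.13 | 81.11 | 86.17 | 84.22 | 43.51 |
| FA-LDA (DB) | 85.89 | 80.82 | 85.17 | 80.83 | 44.60 | 86.22 | 81.88 | 86.73 | 84.62 | 44.61 |
| CSA-LDA (DB) | 86.42 | 80.97 | 86.10 | 81.69 | 45.21 | 86.79 | 82.00 | 86.96 | 85.07 | 46.67 |
| BA-LDA (DB) | 87.60 | 81.36 | 87.32 | 83.56 | 47.00 | 88.86 | 82.09 | 88.05 | 85.24 | 50.08 |
| GA-LDA (SI) | 81.57 | 73.57 | 82.03 | 76.48 | 39.36 | 83.21 | 79.00 | 82.24 | 79.93 | 40.58 |
| PSO-LDA (SI) | 82.61 | 73.76 | 82.50 | 76.61 | 39.66 | 83.58 | 79.33 | 83.03 | 80.36 | 40.87 |
| FA-LDA (SI) | 83.19 | 74.18 | 82.88 | 77.47 | 39.68 | 83.69 | 79.41 | 83.11 | 80.95 | 40.95 |
| CSA-LDA (SI) | 83.78 | 75.11 | 83.01 | 78.06 | 39.69 | 83.84 | 80.83 | 84.47 | 81.82 | 41.12 |
| BA-LDA (SI) | 84.11 | 76.08 | 83.03 | 78.13 | 40.08 | 84.49 | 80.90 | 85.52 | 81.99 | 42.65 |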

LDA: latent Dirichlet allocation, GA-LDA: genetic algorithm based LDA, PSO-LDA: particle swarm optimization based LDA, FA-LDA: firefly algorithm based LDA, CSA-LDA: cuckoo search algorithm based LDA, BA-LDA: bat algorithm based LDA, BIC: Bayesian information criterion, CH: Calinski-Harabasz index, DB: Davies-Bouldin index, and SI: Silhouette index.