Research Article

Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study

Table 1

Comparison of classification performance with different numbers of top-ranked features.

Misclassification rate
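| Samples | Classifier | Top 10 | Top 100 | Top 1000 |
|---|---|---|---|---|
| Individual | SVM with linear kernel | 0.2191** | 0.3479** | 0.3757 |
| Individual | SVM with radial kernel | 0.2178** | 0.3211** | 0.3712 |
| Individual | RF | 0.2436** | 0.2926** | 0.2975 |
| Individual | k-NN | 0.2515** | 0.3270** | 0.4354 |
| Individual | PLR | 0.2160** | 0.3329** | 0.3761 |
| Individual | PAM | 0.2096** | 0.3185** | 0.2310 |
| Pool size = 5 | SVM with linear kernel | 0.3229** | 0.3817** | 0.4131 |
| Pool size = 5 | SVM with radial kernel | 0.3167** | 0.3771** | 0.4571 |
| Pool size = 5 | RF | 0.3272** | 0.3568** | 0.3841 |
| Pool size = 5 | k-NN | 0.3133** | 0.3779** | 0.4720 |
| Pool size = 5 | PLR | 0.3113** | 0.3772** | 0.3983 |
| Pool size = 5 | PAM | 0.3005** | 0.3799* | 0.3681 |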

Classification performance is presented for different numbers of top-ranked features. The dataset contains a total of 1000 features, including 10 markers, and 90 samples per class. Using the top 1000 features corresponds to no feature selection. The table shows results for individual samples and for a pooled dataset with a pool size of 5. Significance levels P < 0.05 and P 0.05 refer to comparisons with the case where no feature selection is performed, using the Wilcoxon rank sum test.
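
To make the significance comparison concrete, the following is a minimal sketch of a two-sided Wilcoxon rank sum test that compares misclassification rates obtained with feature selection against those obtained without it. The function names `midranks` and `rank_sum_test`, the normal approximation without tie correction, and the per-repetition rates are all assumptions made for illustration. They are not the data or the exact procedure behind Table 1.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, no tie correction).

    Returns (z, p), where z < 0 means x tends to be smaller than y.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0        # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return z, p


# Hypothetical misclassification rates over repeated runs (illustration only).
rates_top10 = [0.20, 0.22, 0.21, 0.23, 0.19, 0.24, 0.22, 0.21]
rates_no_selection = [0.36, 0.38, 0.37, 0.39, 0.35, 0.40, 0.37, 0.38]

z, p = rank_sum_test(rates_top10, rates_no_selection)
print(f"z = {z:.3f}, p = {p:.5f}")
```

For small samples or heavily tied data, an exact test or a tie-corrected variance would be more appropriate than this approximation.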