Research Article

A Robust Supervised Variable Selection for Noisy High-Dimensional Data

Table 3

Leave-one-out cross-validation performance evaluated by classification accuracy for the data of Section 4.1 contaminated by noise of three different types. MRMR is used in version (1) or (20) in the same way as in Table 1 to find 10 variables, while the optimal over all is used.

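| Dimensionality reduction | Measure of relev. / redund. | Classif. method | Accuracy, Noise 1 (normal) | Accuracy, Noise 2 (contam. normal) | Accuracy, Noise 3 (Cauchy) |
|---|---|---|---|---|---|
| MRMR variable selection | Mutual info. / Mutual info. | LDA | 0.79 | 0.88 | 0.92 |
| MRMR variable selection | | LDA | 0.92 | 0.85 | 0.96 |
| MRMR variable selection | | LDA | 0.92 | 0.92 | 0.96 |
| MRMR variable selection | K-S | LDA | 0.92 | 0.83 | 0.89 |
| MRMR variable selection | Sign test | LDA | 0.84 | 0.91 | 0.87 |
| MRMR variable selection | | LDA | 0.90 | 0.86 | 0.94 |
| MRMR variable selection | | LDA | 1.00 | 1.00 | 0.98 |
| MRMR variable selection | | LDA | 1.00 | 1.00 | 0.98 |
| MRMR variable selection | | LDA | 1.00 | 1.00 | 1.00 |
| Unsupervised dimensionality reduction: PCA (with 10 princ. components) | | LDA | 0.79 | 0.74 | 0.78 |
| No dimensionality reduction | | LDA | Infeasible | Infeasible | Infeasible |
| No dimensionality reduction | | PAM | 0.79 | 0.73 | 0.79 |
| No dimensionality reduction | | SCRDA | 1.00 | 1.00 | 1.00 |
| No dimensionality reduction | | lasso-LR | 1.00 | 1.00 | 1.00 |
| No dimensionality reduction | | SVM | 1.00 | 1.00 | 1.00 |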
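
The evaluation protocol behind the table, leave-one-out cross-validation accuracy on data contaminated by normal, contaminated normal, or Cauchy noise, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function names (`add_noise`, `nearest_centroid_predict`, `loocv_accuracy`), the noise scale, the 10% contamination rate, and the toy data. A plain nearest-centroid rule stands in for the classifiers in the table; it is not LDA, PAM, or the authors' MRMR procedure.

```python
import math
import random


def add_noise(X, kind, rng, scale=0.1):
    """Return a noisy copy of X; kind is 'normal', 'contaminated' or 'cauchy'.

    The scale and the 10% contamination rate are illustrative assumptions.
    """
    def draw():
        if kind == "normal":
            return rng.gauss(0.0, scale)
        if kind == "contaminated":
            # Mostly narrow normal noise, occasionally a much wider normal.
            sd = scale if rng.random() < 0.9 else 10 * scale
            return rng.gauss(0.0, sd)
        if kind == "cauchy":
            # Inverse-CDF draw from a Cauchy distribution (heavy tails).
            return scale * math.tan(math.pi * (rng.random() - 0.5))
        raise ValueError(f"unknown noise kind: {kind}")

    return [[x + draw() for x in row] for row in X]


def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class whose mean vector is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, y in zip(X_train, y_train) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, predict=nearest_centroid_predict):
    """Leave-one-out accuracy: each sample is held out once and predicted
    by a classifier trained on all remaining samples."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        hits += predict(X_train, y_train, X[i]) == y[i]
    return hits / len(X)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy two-class data (hypothetical, not the data of Section 4.1).
    X = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
         [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
    y = [0, 0, 0, 1, 1, 1]
    for kind in ("normal", "contaminated", "cauchy"):
        print(kind, loocv_accuracy(add_noise(X, kind, rng), y))
```

Passing a different `predict` function with the same signature would allow another classifier to be evaluated under the same protocol.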