Research Article

A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data

Table 5

Pareto-optimal configurations in Figure 9 whose resulting models use fewer than 500 features on average (above the horizontal line) and accuracy-optimal configurations (below the horizontal line). The ID column references Table 6.

ID  Classifier       Parameters                            Filter    n.feats  Error (train)  Error (test)  Size (train)  Size (test)

1   Random Forest    num.trees = 20577, min.node.size = 2  Variance  397      0.041          0.048         397.0         397.0
2   SVM              ,                                     Variance  355      0.040          0.044         355.0         355.0
3   Lasso Log. Reg.                                        Variance  246      0.041          0.059         246.0         245.9
4   Lasso Log. Reg.                                        Variance  232      0.037          0.044         232.0         232.0
5   Random Forest    num.trees = 4742, min.node.size = 2   Variance  212      0.048          0.044         212.0         212.0
6   Lasso Log. Reg.                                        Variance  174      0.044          0.048         174.0         174.0
7   GLM Boosting                                           Variance  728      0.041          0.041         13.3          15.1
8   GLM Boosting                                           Variance  363      0.048          0.074         9.8           7.2
9   GLM Boosting                                           AUC       232      0.048          0.077         9.7           11.0
-----------------------------------------------------------------------------------------------------------------------------------
10  Random Forest    num.trees = 1062, min.node.size = 3   MRMR      1756     0              0.055         1466.0        1611.9
11  Random Forest    num.trees = 4671, min.node.size = 11  AUC       760      0              0.051         759.7         760.0
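
To illustrate what "Pareto optimal" means for the configurations above the horizontal line, the following is a minimal sketch of a non-dominated filter. It is not the authors' procedure. It rests on two assumptions: only two criteria are used (training error and training model size, both minimized), and the values are copied from rows 1 to 9 of the table. The article's title suggests that the Pareto set in Figure 9 also accounts for feature selection stability, which this table does not report, so the sketch is not expected to reproduce the full set of nine configurations. The names `dominates` and `pareto_front` are hypothetical helpers introduced for this illustration.

```python
# (ID, training error, training model size) copied from rows 1-9 of Table 5.
# Assumption: both criteria are to be minimized; stability is not included.
rows = [
    (1, 0.041, 397.0),
    (2, 0.040, 355.0),
    (3, 0.041, 246.0),
    (4, 0.037, 232.0),
    (5, 0.048, 212.0),
    (6, 0.044, 174.0),
    (7, 0.041, 13.3),
    (8, 0.048, 9.8),
    (9, 0.048, 9.7),
]


def dominates(a, b):
    """Return True if a is no worse than b on every criterion
    and strictly better on at least one (all criteria minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(rows):
    """Return the IDs of all rows that no other row dominates."""
    front = []
    for rid, *crit in rows:
        dominated = any(
            dominates(other, crit) for oid, *other in rows if oid != rid
        )
        if not dominated:
            front.append(rid)
    return front


front_ids = pareto_front(rows)
print(front_ids)
```

On these two criteria alone, only configurations 4, 7, and 9 remain non-dominated. That the table nevertheless lists nine Pareto-optimal configurations is consistent with the assumption that at least one further criterion entered the comparison in Figure 9.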