Research Article

Unbiased Feature Selection in Learning Random Forests for High-Dimensional Data

Table 1

Description of the real-world datasets, grouped into microarray data and real-world datasets and sorted by the number of features within each group.

Dataset          No. of features  No. of training  No. of tests  No. of classes

Colon            2,000            62               -             2
Srbct            2,308            63               -             4
Leukemia         3,051            38               -             2
Lymphoma         4,026            62               -             3
breast.2.class   4,869            78               -             2
breast.3.class   4,869            96               -             3
nci              5,244            61               -             8
Brain            5,597            42               -             5
Prostate         6,033            102              -             2
adenocarcinoma   9,868            76               -             2
Fbis             2,000            1,711            752           17
La2s             12,432           1,855            845           6
La1s             13,195           1,963            887           6