Research Article

Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

Figure 2

Internal validation procedure. The two datasets, AM (blue) and RO (orange), comprising the 283 rank-significant genes and 151 and 286 samples, respectively, were used for internal performance evaluation. Each dataset was first used on its own to rank the features by random forest variable importance (RF ranking). For each classification method, these rankings were then used separately to select the optimal number of features: features were added one at a time in rank order, and each feature set was evaluated with a 10-times repeated 10-fold cross-validation procedure. The AM and RO cross-validation results obtained with the same classification method were combined, and the mean classification performance of each method was compared.
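The selection loop described in this caption can be illustrated with a minimal sketch, under several assumptions. The feature ranking is taken as a precomputed input (in the caption it comes from random forest variable importance), a simple nearest-centroid rule stands in for whichever classification method is being evaluated, accuracy stands in for the performance measure, and the folds are not stratified. The function names `nearest_centroid_predict`, `repeated_kfold_accuracy`, and `select_n_features` are hypothetical and are not taken from the article.

```python
import random
from statistics import mean


def nearest_centroid_predict(train_X, train_y, test_X):
    # Stand-in classifier (an assumption): assign each test row to the class
    # whose training centroid is closest in squared Euclidean distance.
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return [min(centroids, key=lambda lab: dist(x, centroids[lab])) for x in test_X]


def repeated_kfold_accuracy(X, y, n_splits=10, n_repeats=10, seed=0):
    # Mean accuracy over n_repeats independent shuffles, each split into
    # n_splits folds (unstratified; assumes len(y) >= n_splits).
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        folds = [idx[i::n_splits] for i in range(n_splits)]
        for fold in folds:
            if not fold:
                continue
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            preds = nearest_centroid_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
            )
            correct = sum(1 for p, i in zip(preds, fold) if p == y[i])
            scores.append(correct / len(fold))
    return mean(scores)


def select_n_features(X, y, ranking, n_splits=10, n_repeats=10, seed=0):
    # Add features one at a time in rank order and score each prefix of the
    # ranking; ties are resolved in favour of the smaller feature set.
    results = []
    for k in range(1, len(ranking) + 1):
        cols = ranking[:k]
        X_k = [[row[j] for j in cols] for row in X]
        results.append((k, repeated_kfold_accuracy(X_k, y, n_splits, n_repeats, seed)))
    best_k, best_score = max(results, key=lambda r: r[1])
    return best_k, best_score, results
```

In this sketch, `ranking` is a list of column indices ordered from most to least important. Running `select_n_features` once per dataset and per classifier, then pooling the per-fold scores across the two datasets, would mirror the final combination step described in the caption.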