Research Article
Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection
Table 1
Prediction performance of the RF model with different feature sets, evaluated by 10 repetitions of 5-fold cross-validation on the MDset dataset.
| Feature | Accuracy ± SD | Sensitivity ± SD | Specificity ± SD | MCC ± SD |
| --- | --- | --- | --- | --- |
| PSSM-400 | 0.7967 ± 0.0062 | 0.7003 ± 0.0093 | 0.8894 ± 0.0075 | 0.620 ± 0.016 |
| EIPP | 0.8311 ± 0.0105 | 0.7487 ± 0.0071 | 0.9107 ± 0.0129 | 0.662 ± 0.021 |
| CT | 0.7482 ± 0.0092 | 0.6591 ± 0.0067 | 0.8406 ± 0.0153 | 0.5096 ± 0.015 |
| EIPP + BP + NBP | 0.8428 ± 0.0038 | 0.7573 ± 0.0082 | 0.9367 ± 0.0043 | 0.704 ± 0.008 |
| CT + BP + NBP | 0.7661 ± 0.0197 | 0.7034 ± 0.0132 | 0.8587 ± 0.0114 | 0.568 ± 0.026 |
| EIPP + CT | 0.8317 ± 0.0139 | 0.7482 ± 0.0068 | 0.9202 ± 0.0127 | 0.671 ± 0.018 |
| EIPP + BP + NBP + CT | 0.8573 ± 0.0117 | 0.7764 ± 0.0143 | 0.9424 ± 0.0062 | 0.729 ± 0.020 |
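The four columns of Table 1 can be computed from confusion-matrix counts using the standard definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC). The following is a minimal sketch, not the authors' code: the function names `binary_metrics` and `summarize` are hypothetical, the counts in the usage lines are invented for illustration, and the use of the sample standard deviation across repetitions is an assumption, since this excerpt does not say how the SD was computed.

```python
import math
import statistics


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: RNA-binding proteins predicted correctly/incorrectly.
    tn/fp: non-binding proteins predicted correctly/incorrectly.
    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


def summarize(values):
    """Mean and sample SD of one metric over repeated cross-validation runs.

    Assumption: sample SD (n - 1 denominator); the source does not specify.
    """
    return statistics.mean(values), statistics.stdev(values)


# Illustrative counts only; these are not taken from the paper.
example = binary_metrics(tp=40, fn=10, tn=45, fp=5)
mean_acc, sd_acc = summarize([0.84, 0.86])
```

Under this reading, each table entry would be obtained by computing one value of a metric per cross-validation repetition and passing the 10 resulting values to `summarize`.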