Research Article
Comparative Analysis on Alignment-Based and Pretrained Feature Representations for the Identification of DNA-Binding Proteins
Table 3
Results of embedded feature selection (FS) methods using three regularizers of the linear model, evaluated by fivefold cross-validation (5CV) on PDB1616.
| Feature set | FS method | #Features | ACC (%) | MCC | SP (%) | SN (%) |
|---|---|---|---|---|---|---|
| PSSMR_All | NFS | 580 | 71.47 | 0.4306 | 67.82 | 75.12 |
| PSSMR_All | ElasticNet | 188 | 72.77 | 0.4563 | 69.68 | 75.87 |
| PSSMR_All | Lasso | 61 | 72.83 | 0.4577 | 69.43 | 76.24 |
| PSSMR_All | LassoLars | 58 | 73.51 | 0.4707 | 71.53 | 75.50 |
| PSSMS_All | NFS | 580 | 72.77 | 0.4597 | 65.97 | 79.58 |
| PSSMS_All | ElasticNet | 207 | 74.01 | 0.4847 | 67.20 | 80.82 |
| PSSMS_All | Lasso | 54 | 73.82 | 0.4794 | 68.32 | 79.33 |
| PSSMS_All | LassoLars | 38 | 72.77 | 0.4586 | 66.96 | 78.59 |
| ESM_Avg | NFS | 1280 | 79.27 | 0.5908 | 72.52 | 86.01 |
| ESM_Avg | ElasticNet | 430 | 81.93 | 0.6442 | 75.37 | 88.49 |
| ESM_Avg | Lasso | 142 | 83.11 | 0.6656 | 77.97 | 88.24 |
| ESM_Avg | LassoLars | 151 | 82.43 | 0.6514 | 77.72 | 87.13 |
| ESM_All | NFS | 37120 | 78.90 | 0.5843 | 71.53 | 86.26 |
| ESM_All | ElasticNet | 884 | 86.14 | 0.7267 | 80.94 | 91.34 |
| ESM_All | Lasso | 367 | 87.87 | 0.7598 | 83.91 | 91.83 |
| ESM_All | LassoLars | 250 | 86.70 | 0.7353 | 83.66 | 89.73 |
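The four metrics in Table 3 are linked: given SN, SP, and the class balance, ACC and MCC follow from the confusion-matrix fractions. The sketch below recomputes them for one row as a consistency check. It assumes that PDB1616 contains equal numbers of positive and negative samples; this is not stated in the table, although the reported ACC values equal the mean of SN and SP, which is what a balanced set would give. The function name `metrics_from_rates` and the parameter `pos_frac` are illustrative, not taken from the article. Because SN and SP are reported to two decimals, the recomputed MCC may differ from the tabulated value in the last digit.

```python
from math import sqrt


def metrics_from_rates(sn, sp, pos_frac=0.5):
    """Return (ACC, MCC) from sensitivity and specificity given as fractions.

    pos_frac is the assumed share of positive samples (0.5 = balanced set,
    an assumption made here, not a value taken from the article).
    """
    neg_frac = 1.0 - pos_frac
    # Confusion-matrix entries expressed as fractions of the whole data set.
    tp = sn * pos_frac
    fn = (1.0 - sn) * pos_frac
    tn = sp * neg_frac
    fp = (1.0 - sp) * neg_frac

    acc = tp + tn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# PSSMR_All without feature selection (NFS): SN = 75.12 %, SP = 67.82 %
acc, mcc = metrics_from_rates(0.7512, 0.6782)
print(f"ACC = {100 * acc:.2f} %, MCC = {mcc:.4f}")
```

Applying the same function to other rows offers a quick way to spot transcription errors in the table, provided the balanced-class assumption holds.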