Research Article
Handling Imbalance Classification Virtual Screening Big Data Using Machine Learning Algorithms
Table 2
Complete set of experiments and average G-mean results.
| Dataset | Algorithm | Numeric descriptor: No-sample | Numeric descriptor: SMOTE | Numeric descriptor: KSMOTE | Numeric descriptor: Time | Fingerprint: No-sample | Fingerprint: SMOTE | Fingerprint: KSMOTE | Fingerprint: Time |
|---|---|---|---|---|---|---|---|---|---|
| AID 440 | RF | 0.167 | 0.565 | 0.954 | 23 | 0.29 | 0.442 | 0.96 | 12 |
| AID 440 | DT | 0.5 | 0.59 | 0.937 | 9.3 | 0.51 | 0.459 | 0.958 | 4.9 |
| AID 440 | MLP | 0.6 | 0.5 | 0.963 | 20 | 0.477 | 0.498 | 0.964 | 9.6 |
| AID 440 | LG | 0.56 | 0.67 | 0.963 | 11 | 0.413 | 0.512 | 0.96 | 5.6 |
| AID 440 | GBT | 0.23 | 0.56 | 0.963 | 33 | 0.477 | 0.421 | 0.963 | 17.1 |
| AID 624202 | RF | 0.445 | 0.625 | 0.952 | 29.7 | 0.5 | 0.628 | 0.96 | 15.3 |
| AID 624202 | DT | 0.576 | 0.614 | 0.94 | 10 | 0.54 | 0.564 | 0.94 | 5 |
| AID 624202 | MLP | 0.74 | 0.715 | 0.95 | 25.2 | 0.636 | 0.497 | 0.958 | 13.5 |
| AID 624202 | LG | 0.628 | 0.83 | 0.94 | 26.8 | 0.791 | 0.78 | 0.837 | 13.25 |
| AID 624202 | GBT | 0.489 | 0.61 | 0.95 | 45 | 0.495 | 0.482 | 0.954 | 22.36 |
| AID 651820 | RF | 0.722 | 0.792 | 0.956 | 41 | 0.741 | 0.798 | 0.92 | 19.25 |
| AID 651820 | DT | 0.725 | 0.72 | 0.932 | 8.78 | 0.765 | 0.743 | 0.89 | 4.44 |
| AID 651820 | MLP | 0.82 | 0.817 | 0.915 | 35 | 0.788 | 0.8 | 0.91 | 17.3 |
| AID 651820 | LG | 0.779 | 0.8357 | 0.962 | 19 | 0.75 | 0.768 | 0.89 | 9.36 |
| AID 651820 | GBT | 0.714 | 0.742 | 0.9 | 60.5 | 0.762 | 0.766 | 0.905 | 29.9 |
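For readers unfamiliar with the metric reported in the table, the following is a minimal sketch of how a G-mean score is commonly computed for a binary classifier, assuming the standard definition as the geometric mean of sensitivity and specificity. The function name `g_mean`, the label encoding (1 for the active, minority class), and the toy labels are illustrative assumptions and are not taken from the article; how the per-run scores were averaged is not shown in this section.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (illustrative helper)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # active compound correctly flagged
            else:
                fn += 1  # active compound missed
        else:
            if pred == positive:
                fp += 1  # inactive compound wrongly flagged
            else:
                tn += 1  # inactive compound correctly rejected
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels (assumed): 2 actives, 4 inactives.
y_true = [1, 1, 0, 0, 0, 0]

# A classifier that always predicts "inactive" is right on 4 of 6 samples,
# yet it never finds an active compound, so its G-mean collapses to zero.
always_inactive = [0, 0, 0, 0, 0, 0]
print(g_mean(y_true, always_inactive))  # → 0.0

# A classifier that finds one active and raises one false alarm.
mixed = [1, 0, 0, 0, 0, 1]
print(g_mean(y_true, mixed))
```

Because the score is a product of the two per-class rates, it drops sharply whenever either class is poorly recognised, which is why it is a common choice for imbalanced data such as the screening sets in the table.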