Research Article
A Feature Extraction Method of Hybrid Gram for Malicious Behavior Based on Machine Learning
Table 3
Malware detection experiments based on feature selection.
| Feature representation | Number of features | Classification algorithm | TPR (%) | FPR (%) | Accuracy (%) | AUC |
| --- | --- | --- | --- | --- | --- | --- |
| 2-gram | 36 | ID3 | 85.9 | 14.3 | 87.5 | 0.863 |
| | | Random Forest | 86.3 | 12.8 | 86.6 | 0.850 |
| | | AdaBoostM1 | 82.9 | 16.3 | 80.2 | 0.808 |
| | | Bagging | 83.9 | 15.8 | 82.3 | 0.826 |
| 3-gram | 53 | ID3 | 86.3 | 15.7 | 85.0 | 0.841 |
| | | Random Forest | 94.1 | 13.7 | 92.0 | 0.971 |
| | | AdaBoostM1 | 91.2 | 14.7 | 93.0 | 0.956 |
| | | Bagging | 91.2 | 12.8 | 87.5 | 0.931 |
| 4-gram | 65 | ID3 | 87.9 | 14.3 | 95.8 | 0.868 |
| | | Random Forest | 96.8 | 6.2 | 93.1 | 0.98 |
| | | AdaBoostM1 | 90.9 | 9.1 | 87.5 | 0.974 |
| | | Bagging | 93.9 | 7.0 | 92.0 | 0.957 |
| Hybrid n-gram with cross entropy | 28 | ID3 | 96.8 | 6.3 | 92.5 | 0.963 |
| | | Random Forest | 97.8 | 5.1 | 96.8 | 0.983 |
| | | AdaBoostM1 | 97.8 | 5.1 | 96.8 | 0.983 |
| | | Bagging | 97.6 | 5.2 | 96.8 | 0.897 |
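For readers unfamiliar with the column headings, the following minimal sketch shows how TPR, FPR, and accuracy are conventionally derived from a binary confusion matrix, with malware as the positive class. The excerpt does not report the underlying confusion matrices, so the `ConfusionCounts` type, the `rates` helper, and the counts in the example are hypothetical and chosen only for illustration; they do not correspond to any row of Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical binary confusion-matrix counts (malware = positive class)."""
    tp: int  # malicious samples classified as malicious
    fn: int  # malicious samples classified as benign
    fp: int  # benign samples classified as malicious
    tn: int  # benign samples classified as benign


def rates(c: ConfusionCounts) -> tuple[float, float, float]:
    """Return (TPR, FPR, accuracy) as percentages."""
    tpr = 100.0 * c.tp / (c.tp + c.fn)   # share of malware that is detected
    fpr = 100.0 * c.fp / (c.fp + c.tn)   # share of benign samples flagged
    total = c.tp + c.fn + c.fp + c.tn
    acc = 100.0 * (c.tp + c.tn) / total  # share of all samples classified correctly
    return tpr, fpr, acc


# Illustrative counts only; not taken from the article.
example = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
tpr, fpr, acc = rates(example)
print(f"TPR={tpr:.1f}%  FPR={fpr:.1f}%  Accuracy={acc:.1f}%")
```

AUC is not computed here because it depends on the classifier's ranking of samples across all decision thresholds, not on a single confusion matrix.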