Research Article
Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System
Table 5
The performance of the RF model.
| Feature extraction method | Matrix size | Testing accuracy | Testing precision | Testing recall | Testing F1-score | Cross-validation accuracy | Cross-validation precision | Cross-validation recall | Cross-validation F1-score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unigram | 1000 | 83.36 | 84.69 | 83.36 | 82.73 | 86.43 ± 0.48 | 87.41 ± 0.46 | 86.37 ± 0.49 | 86.02 ± 0.57 |
| Unigram | 3000 | 84.71 | 85.8 | 84.71 | 84.06 | 89.56 ± 0.34 | 90.05 ± 0.46 | 89.62 ± 0.34 | 89.3 ± 0.48 |
| Bigram | 1000 | 83.05 | 84.41 | 83.05 | 82.38 | 85.79 ± 0.51 | 86.87 ± 0.52 | 85.81 ± 0.49 | 85.4 ± 0.61 |
| Bigram | 3000 | 84.7 | 85.81 | 84.7 | 84.09 | 89.48 ± 0.36 | 89.79 ± 0.45 | 89.44 ± 0.35 | 89.12 ± 0.45 |
| Trigram | 1000 | 83.11 | 84.49 | 83.11 | 82.45 | 85.81 ± 0.46 | 86.93 ± 0.49 | 85.76 ± 0.41 | 85.4 ± 0.44 |
| Trigram | 3000 | 84.67 | 85.82 | 84.67 | 84.04 | 89.39 ± 0.44 | 89.9 ± 0.35 | 89.48 ± 0.39 | 89.2 ± 0.33 |
| Four-gram | 1000 | 83.07 | 84.49 | 83.07 | 82.46 | 85.29 ± 0.51 | 86.62 ± 0.57 | 85.34 ± 0.57 | 84.96 ± 0.56 |
| Four-gram | 3000 | 84.61 | 85.82 | 84.61 | 84 | 89.41 ± 0.44 | 89.85 ± 0.41 | 89.4 ± 0.39 | 89.11 ± 0.41 |
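
The cross-validation columns in Table 5 are reported as "mean ± spread" over folds. As a minimal sketch of how such an entry could be formed, the snippet below aggregates per-fold scores into that format. The fold scores, the helper name `summarize_folds`, and the use of the sample standard deviation are all assumptions made for illustration; the table does not state the number of folds or which deviation estimator was used.

```python
from statistics import mean, stdev


def summarize_folds(scores, digits=2):
    """Format per-fold scores (in percent) as 'mean ± std'.

    Assumption: the spread is the sample standard deviation (n - 1
    denominator); the source table does not say which variant it reports.
    """
    fold_mean = mean(scores)
    fold_std = stdev(scores)
    return f"{fold_mean:.{digits}f} ± {fold_std:.{digits}f}"


# Hypothetical per-fold accuracies, NOT values taken from the paper.
fold_accuracy = [86.0, 86.5, 87.0, 86.2, 86.3]
summary = summarize_folds(fold_accuracy)
print(summary)
```

The same helper could be applied to per-fold precision, recall, and F1-score to fill the remaining cross-validation columns of a row.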