Research Article
Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System
Table 4
The performance of the LR model.
| Feature extraction method | Matrix size | Testing accuracy | Testing precision | Testing recall | Testing F1-score | Cross-validation accuracy | Cross-validation precision | Cross-validation recall | Cross-validation F1-score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unigram | 1000 | 80.82 | 81.22 | 80.82 | 80.38 | 84.54 ± 0.4 | 85.53 ± 0.48 | 84.54 ± 0.4 | 84.16 ± 0.43 |
| Unigram | 3000 | 82.94 | 83.01 | 82.94 | 82.61 | 89.22 ± 0.4 | 89.36 ± 0.42 | 89.22 ± 0.4 | 89.08 ± 0.41 |
| Bigram | 1000 | 80.8 | 81.33 | 80.8 | 80.33 | 83.98 ± 0.31 | 85.11 ± 0.35 | 83.98 ± 0.31 | 83.56 ± 0.32 |
| Bigram | 3000 | 82.32 | 82.84 | 82.32 | 81.84 | 88.52 ± 0.38 | 88.86 ± 0.43 | 88.52 ± 0.38 | 88.31 ± 0.39 |
| Trigram | 1000 | 80.56 | 81.08 | 80.56 | 80.09 | 83.92 ± 0.32 | 85.04 ± 0.39 | 83.92 ± 0.32 | 83.5 ± 0.33 |
| Trigram | 3000 | 82.32 | 82.84 | 82.32 | 81.84 | 88.52 ± 0.38 | 88.86 ± 0.43 | 88.52 ± 0.38 | 88.31 ± 0.39 |
| Four-gram | 1000 | 80.32 | 80.87 | 80.32 | 79.85 | 83.47 ± 0.31 | 84.69 ± 0.33 | 83.47 ± 0.31 | 83.05 ± 0.33 |
| Four-gram | 3000 | 82.32 | 82.35 | 82.32 | 82.01 | 88.36 ± 0.43 | 88.55 ± 0.46 | 88.36 ± 0.43 | 88.18 ± 0.44 |