Research Article

Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System

Table 2

The performance of the DT model.

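| Feature extraction method | Matrix size | Testing: Accuracy | Testing: Precision | Testing: Recall | Testing: F1-score | Cross-validation: Accuracy | Cross-validation: Precision | Cross-validation: Recall | Cross-validation: F1-score |
|---|---|---|---|---|---|---|---|---|---|
| Unigram | 1000 | 78.91 | 78.91 | 78.91 | 78.48 | 82.85 ± 0.37 | 83.32 ± 0.49 | 82.89 ± 0.35 | 82.38 ± 0.56 |
| Unigram | 3000 | 80.31 | 80.09 | 80.31 | 80.06 | 87.09 ± 0.75 | 87.14 ± 0.66 | 87.15 ± 0.66 | 86.84 ± 0.66 |
| Bigram | 1000 | 78.34 | 78.32 | 78.34 | 77.96 | 82.17 ± 0.22 | 82.72 ± 0.29 | 82.09 ± 0.28 | 81.78 ± 0.26 |
| Bigram | 3000 | 81.13 | 80.91 | 81.13 | 80.88 | 85.76 ± 0.5 | 85.86 ± 0.45 | 85.86 ± 0.47 | 85.59 ± 0.58 |
| Trigram | 1000 | 77.92 | 77.92 | 77.92 | 77.53 | 82.23 ± 0.53 | 82.84 ± 0.52 | 82.25 ± 0.48 | 81.93 ± 0.47 |
| Trigram | 3000 | 80.31 | 80.1 | 80.31 | 80.09 | 86.23 ± 0.87 | 86.13 ± 0.86 | 86.09 ± 0.81 | 85.98 ± 0.87 |
| Four-gram | 1000 | 77.97 | 77.96 | 77.97 | 77.6 | 81.32 ± 0.59 | 81.93 ± 0.43 | 81.37 ± 0.49 | 81.14 ± 0.53 |
| Four-gram | 3000 | 80.37 | 80.15 | 80.37 | 80.09 | 85.73 ± 0.75 | 85.66 ± 0.74 | 85.73 ± 0.82 | 85.45 ± 0.72 |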
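
In every testing row of Table 2 the Accuracy and Recall values coincide. This is what support-weighted averaging of per-class scores would produce, because weighted recall is algebraically equal to accuracy. The table does not state the averaging scheme, so this reading is an assumption. The following minimal sketch illustrates how such metrics could be computed, under that assumption. The function names `weighted_metrics` and `summarize_folds`, the toy labels, and the use of the sample standard deviation for the "±" term are all hypothetical and are not taken from the study.

```python
from collections import Counter
from statistics import mean, stdev


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, F1 (in percent).

    Assumption: per-class scores are averaged with weights proportional to
    the number of true samples in each class.
    """
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": 100 * accuracy,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f1": 100 * f1,
    }


def summarize_folds(scores):
    """Format per-fold scores as 'mean ± std' (sample standard deviation assumed)."""
    return f"{mean(scores):.2f} ± {stdev(scores):.2f}"


# Toy three-class labels, invented for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neg", "neu"]
y_pred = ["pos", "pos", "neg", "neg", "neu", "neu"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
print(summarize_folds([82.0, 83.0, 84.0]))
```

On the toy labels, the weighted recall matches the accuracy while precision and F1-score differ from it, the same pattern seen in the testing columns of the table.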