Research Article

Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System

Table 4

The performance of the LR model.

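| Feature extraction method | Matrix size | Testing accuracy | Testing precision | Testing recall | Testing F1-score | Cross-validation accuracy | Cross-validation precision | Cross-validation recall | Cross-validation F1-score |
|---|---|---|---|---|---|---|---|---|---|
| Unigram | 1000 | 80.82 | 81.22 | 80.82 | 80.38 | 84.54 ± 0.4 | 85.53 ± 0.48 | 84.54 ± 0.4 | 84.16 ± 0.43 |
| Unigram | 3000 | 82.94 | 83.01 | 82.94 | 82.61 | 89.22 ± 0.4 | 89.36 ± 0.42 | 89.22 ± 0.4 | 89.08 ± 0.41 |
| Bigram | 1000 | 80.8 | 81.33 | 80.8 | 80.33 | 83.98 ± 0.31 | 85.11 ± 0.35 | 83.98 ± 0.31 | 83.56 ± 0.32 |
| Bigram | 3000 | 82.32 | 82.84 | 82.32 | 81.84 | 88.52 ± 0.38 | 88.86 ± 0.43 | 88.52 ± 0.38 | 88.31 ± 0.39 |
| Trigram | 1000 | 80.56 | 81.08 | 80.56 | 80.09 | 83.92 ± 0.32 | 85.04 ± 0.39 | 83.92 ± 0.32 | 83.5 ± 0.33 |
| Trigram | 3000 | 82.32 | 82.84 | 82.32 | 81.84 | 88.52 ± 0.38 | 88.86 ± 0.43 | 88.52 ± 0.38 | 88.31 ± 0.39 |
| Four-gram | 1000 | 80.32 | 80.87 | 80.32 | 79.85 | 83.47 ± 0.31 | 84.69 ± 0.33 | 83.47 ± 0.31 | 83.05 ± 0.33 |
| Four-gram | 3000 | 82.32 | 82.35 | 82.32 | 82.01 | 88.36 ± 0.43 | 88.55 ± 0.46 | 88.36 ± 0.43 | 88.18 ± 0.44 |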
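
In every row of Table 4 the accuracy and recall values coincide. This is what one would expect if precision, recall, and F1-score are averaged over classes with weights proportional to class support, because support-weighted recall is algebraically equal to accuracy. The table itself does not state the averaging scheme, so this reading is an assumption. The sketch below illustrates the relationship with a hypothetical helper, `weighted_metrics`, and toy labels invented for illustration (not data from the study); it returns fractions rather than percentages.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1 as fractions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    support = Counter(y_true)    # number of true instances per class
    predicted = Counter(y_pred)  # number of predictions per class
    # Correct predictions (true positives) per class.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = hits[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n  # class weight = share of true instances
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy three-class labels, invented for illustration only.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

On the toy labels, the weighted recall matches the accuracy while precision and F1 differ from both, which mirrors the pattern in the table.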