Research Article

Effective Preprocessing and Normalization Techniques for COVID-19 Twitter Streams with POS Tagging via Lightweight Hidden Markov Model

Table 11

Proposed method on normalizing repeated characters, abbreviations, and misspelled words.

Techniques                                        Abbreviations   Repeated characters   Misspelled words
Ground truth                                      308             672                   1128
Regular expression                                -               462                   -
Replace() function using WordNet                  108             253                   749
Expanding abbreviations by CSV file replacement   247             -                     -
NLTK library                                      210             319                   561
Proposed model                                    281             590                   1036
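
Two of the baseline techniques compared in Table 11, regular-expression handling of repeated characters and abbreviation expansion from a CSV file, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abbreviation entries, the CSV column names, and the function names `load_abbreviations`, `collapse_repeats`, and `normalize` are assumptions made for illustration only.

```python
import csv
import io
import re

# Hypothetical abbreviation table; the CSV file used in the paper is not shown.
ABBREVIATION_CSV = """abbr,expansion
u,you
pls,please
gr8,great
"""


def load_abbreviations(csv_text):
    """Read abbreviation/expansion pairs from CSV text into a dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["abbr"].lower(): row["expansion"] for row in reader}


# Matches any character that occurs three or more times in a row.
REPEATED = re.compile(r"(.)\1{2,}")


def collapse_repeats(token):
    """Reduce each run of 3+ identical characters to a single character."""
    return REPEATED.sub(r"\1", token)


def normalize(text, abbreviations):
    """Collapse repeated characters, then expand known abbreviations."""
    normalized = []
    for token in text.split():
        token = collapse_repeats(token)
        normalized.append(abbreviations.get(token.lower(), token))
    return " ".join(normalized)
```

For example, `normalize("pls stay hooome u are gr8", load_abbreviations(ABBREVIATION_CSV))` yields `please stay home you are great`. A purely regex-based rule of this kind can over-correct words whose correct spelling contains a doubled letter (it maps "coool" to "col"), which may be one reason a dictionary-backed check is listed as a separate technique in the table.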