Research Article
Effective Preprocessing and Normalization Techniques for COVID-19 Twitter Streams with POS Tagging via Lightweight Hidden Markov Model
Table 2
Conventional text normalization methods and techniques.
| Technique | Abbreviations | Repeated characters | Misspelled words |
| --- | --- | --- | --- |
| Regular expression | X | √ | X |
| Replace() function using WordNet | X | √ | √ |
| Expanding abbreviations by CSV file replacement | √ | X | X |
| Probability model using edit distance | X | √ | √ |
| Spell correction using TextBlob | X | X | √ |
| NLTK library | X | √ | √ |
| Phonetic edit distance | X | √ | √ |
| PyEnchant library | X | √ | √ |
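
To make two of the rows in Table 2 concrete, the following is a minimal sketch of regular-expression handling of repeated characters and of abbreviation expansion by CSV lookup. It is an illustration under stated assumptions, not the implementation evaluated in this article: the function names, the rule of collapsing runs of three or more identical characters to two, and the small inline abbreviation table are all hypothetical, and only the Python standard library is used.

```python
import csv
import io
import re

# Hypothetical abbreviation table (assumption); in practice it would be
# loaded from a CSV file maintained for the target corpus.
ABBREVIATIONS_CSV = "abbr,expansion\nu,you\npls,please\ngr8,great\n"


def load_abbreviations(csv_text):
    """Parse CSV text with 'abbr' and 'expansion' columns into a dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["abbr"].lower(): row["expansion"] for row in reader}


def collapse_repeats(text):
    """Reduce any run of three or more identical characters to two."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)


def expand_abbreviations(text, table):
    """Replace whitespace-separated tokens found in the table."""
    return " ".join(table.get(tok.lower(), tok) for tok in text.split())


if __name__ == "__main__":
    table = load_abbreviations(ABBREVIATIONS_CSV)
    tweet = "Pls staaaay safe u all"
    print(expand_abbreviations(collapse_repeats(tweet), table))
```

In this sketch the regular expression shortens "staaaay" to "staay" but cannot restore the dictionary form "stay", which matches the table: the regular-expression row covers repeated characters but not misspelled words, and the CSV replacement row covers abbreviations only.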