Research Article
Validation of Text Data Preprocessing Using a Neural Network Model
Table 1
Text preprocessing techniques.
| Category | Technique | Feature | Pros | Cons |
| --- | --- | --- | --- | --- |
| Normalization | Lowering | Conversion to lowercase | Search accuracy can be improved | Proper nouns composed of capital letters can be incorrectly classified as common nouns |
| Normalization | Stemming | Conversion to stems | Time efficiency can be improved by reducing the size of the text | Dilution of meaning can affect accuracy |
| Normalization | Lemmatization | Conversion to headwords | Part-of-speech information is preserved in the converted form, and search accuracy can be improved | Conversion time is long |
| Punctuation | Splitting | Word splitting | Meaning can be preserved | Different rules should be applied depending on the purpose, and the rules are complicated |
| Punctuation | Merging | Word merging | | |
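
The techniques in Table 1 can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the suffix list in `stem`, the `LEMMAS` lookup table, and all function names are hypothetical and chosen only to make the trade-offs in the table concrete. A practical pipeline would rely on an established stemmer and a full lexicon.

```python
import re

# Hypothetical toy headword dictionary; a real lemmatizer uses a full lexicon.
LEMMAS = {"studies": "study", "ran": "run", "better": "good"}


def lower(tokens):
    """Lowering: convert every token to lowercase."""
    return [t.lower() for t in tokens]


def stem(token):
    """Stemming: crude suffix stripping (illustrative, not the Porter algorithm)."""
    for suffix in ("ing", "ies", "es", "ed", "s"):
        # Keep at least three characters so short words are left untouched.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Lemmatization: look up the headword, falling back to the token itself."""
    return LEMMAS.get(token, token)


def split_punctuation(text):
    """Splitting: separate words from the punctuation marks attached to them."""
    return re.findall(r"\w+|[^\w\s]", text)


def merge_hyphenated(text):
    """Merging: join the parts of a hyphenated word into a single word."""
    return re.sub(r"(?<=\w)-(?=\w)", "", text)


if __name__ == "__main__":
    print(lower(["Apple", "US"]))               # ['apple', 'us']
    print(stem("studies"), lemmatize("studies"))  # stud study
    print(split_punctuation("Hello, world!"))   # ['Hello', ',', 'world', '!']
    print(merge_hyphenated("pre-processing"))   # preprocessing
```

The output reflects the drawbacks listed in the table: lowering turns the proper noun "US" into "us", which is indistinguishable from the pronoun, and stemming reduces "studies" to the non-word "stud", whereas the dictionary lookup returns the headword "study" at the cost of maintaining and searching a lexicon.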