Research Article

Validation of Text Data Preprocessing Using a Neural Network Model

Table 6

Accuracy based on the preprocessing type.

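| Method | No. | Type | Accuracy (1st) | 2nd | 3rd | 4th | 5th | Mean (SD) |
|---|---|---|---|---|---|---|---|---|
| Traditional method | 1 | [2] + [4] | 79.17 | 79.01 | 79.31 | 78.93 | 79.03 | 79.090 (0.15) |
| Traditional method | 2 | [2] + [3] | 79 | 79.14 | 79.25 | 78.73 | 79.09 | 79.042 (0.20) |
| Traditional method | 3 | [3] + [5] | 79.15 | 78.98 | 79.17 | 78.89 | 78.97 | 79.032 (0.12) |
| Traditional method | 4 | [3] + [4] | 78.96 | 79.12 | 78.82 | 79.13 | 79.09 | 79.024 (0.13) |
| Traditional method | 5 | [1] + [2] + [3] + [4] | 79 | 78.96 | 79.12 | 78.93 | 79.01 | 79.004 (0.07) |
| Traditional method | 6 | [1] + [4] | 78.94 | 78.59 | 79.33 | 79.15 | 78.97 | 78.996 (0.28) |
| Traditional method | 7 | [1] + [2] + [3] + [5] | 78.8 | 78.86 | 79.09 | 78.79 | 78.98 | 78.904 (0.13) |
| Traditional method | 8 | [1] + [2] + [3] | 78.54 | 79.22 | 78.93 | 78.95 | 78.87 | 78.902 (0.24) |
| Traditional method | 9 | [3] | 79.02 | 78.97 | 78.75 | 78.73 | 78.91 | 78.876 (0.13) |
| Traditional method | 10 | [4] | 78.94 | 79 | 78.62 | 78.77 | 78.89 | 78.844 (0.15) |
| Traditional method | 11 | [2] | 78.83 | 78.78 | 78.83 | 78.87 | 78.9 | 78.842 (0.05) |
| Traditional method | 12 | [1] + [3] + [4] | 78.44 | 78.94 | 78.8 | 79.15 | 78.85 | 78.836 (0.26) |
| Traditional method | 13 | [1] + [3] + [5] | 78.41 | 78.8 | 78.67 | 79.34 | 78.8 | 78.804 (0.34) |
| Traditional method | 14 | [2] + [5] | 78.79 | 78.77 | 78.85 | 78.68 | 78.78 | 78.774 (0.06) |
| Traditional method | 15 | [5] | 78.34 | 78.67 | 78.86 | 78.94 | 78.7 | 78.702 (0.23) |
| Traditional method | 16 | [1] + [2] + [5] | 78.78 | 78.25 | 78.88 | 78.75 | 78.67 | 78.666 (0.24) |
| Traditional method | 17 | [1] + [5] | 78.51 | 78.38 | 78.72 | 78.96 | 78.59 | 78.632 (0.22) |
| Traditional method | 18 | [1] | 78.7 | 78.28 | 78.46 | 78.51 | 78.51 | 78.492 (0.15) |
| Traditional method | 19 | [1] + [2] | 78.22 | 78.61 | 78.27 | 78.56 | 78.36 | 78.404 (0.17) |
| Traditional method | 20 | [1] + [2] + [4] | 78.26 | 78.45 | 78.35 | 78.17 | 78.31 | 78.308 (0.10) |
| Traditional method | 21 | [0] | 78.42 | 78.36 | 78.28 | 78.11 | 78.34 | 78.302 (0.12) |
| Traditional method | 22 | [1] + [3] | 78.24 | 78.16 | 78.07 | 78.36 | 78.3 | 78.226 (0.11) |
| Developed method | 23 | [7-2] | 78.35 | 78.68 | 79.1 | 78.65 | 78.34 | 78.624 (0.31) |
| Developed method | 24 | [7-1] | 78.39 | 78.15 | 78.19 | 78.37 | 78.33 | 78.286 (0.11) |
| Developed method | 25 | [6] | 73 | 72.96 | 72.43 | 72.59 | 72.8 | 72.756 (0.24) |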

[0]: no applied preprocessing; [1]: special character elimination; [2]: lemmatization; [3]: lowering; [4]: punctuation splitting; [5]: punctuation merging; [6]: essential terminology preprocessing; [7-1]: ascending order based on entropy complexity; [7-2]: descending order based on entropy complexity.
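
The Mean (SD) column can be reproduced from the five per-run accuracies. The following is a minimal sketch, not the authors' code. It assumes the SD is the sample standard deviation (n − 1 denominator), which the source does not state but which matches the reported values for the rows checked here. The helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Five per-run accuracies copied from Table 6.
row_1 = [79.17, 79.01, 79.31, 78.93, 79.03]   # No. 1, type [2] + [4]
row_25 = [73, 72.96, 72.43, 72.59, 72.8]      # No. 25, type [6]


def summarize(values):
    """Return (mean, SD) rounded to the precision used in the table.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    return round(mean(values), 3), round(stdev(values), 2)


for label, runs in [("No. 1", row_1), ("No. 25", row_25)]:
    m, sd = summarize(runs)
    print(f"{label}: {m:.3f} ({sd:.2f})")
```

With the population standard deviation (`statistics.pstdev`) the first row would round to 0.13 rather than the reported 0.15, which is why the sample form is assumed here.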