Validation of Text Data Preprocessing Using a Neural Network Model
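Table 6. Accuracy (%) based on the preprocessing type.

| Method | No. | Type | 1st | 2nd | 3rd | 4th | 5th | Mean (SD) |
|---|---|---|---|---|---|---|---|---|
| Traditional | 1 | [2] + [4] | 79.17 | 79.01 | 79.31 | 78.93 | 79.03 | 79.090 (0.15) |
| Traditional | 2 | [2] + [3] | 79.00 | 79.14 | 79.25 | 78.73 | 79.09 | 79.042 (0.20) |
| Traditional | 3 | [3] + [5] | 79.15 | 78.98 | 79.17 | 78.89 | 78.97 | 79.032 (0.12) |
| Traditional | 4 | [3] + [4] | 78.96 | 79.12 | 78.82 | 79.13 | 79.09 | 79.024 (0.13) |
| Traditional | 5 | [1] + [2] + [3] + [4] | 79.00 | 78.96 | 79.12 | 78.93 | 79.01 | 79.004 (0.07) |
| Traditional | 6 | [1] + [4] | 78.94 | 78.59 | 79.33 | 79.15 | 78.97 | 78.996 (0.28) |
| Traditional | 7 | [1] + [2] + [3] + [5] | 78.80 | 78.86 | 79.09 | 78.79 | 78.98 | 78.904 (0.13) |
| Traditional | 8 | [1] + [2] + [3] | 78.54 | 79.22 | 78.93 | 78.95 | 78.87 | 78.902 (0.24) |
| Traditional | 9 | [3] | 79.02 | 78.97 | 78.75 | 78.73 | 78.91 | 78.876 (0.13) |
| Traditional | 10 | [4] | 78.94 | 79.00 | 78.62 | 78.77 | 78.89 | 78.844 (0.15) |
| Traditional | 11 | [2] | 78.83 | 78.78 | 78.83 | 78.87 | 78.90 | 78.842 (0.05) |
| Traditional | 12 | [1] + [3] + [4] | 78.44 | 78.94 | 78.80 | 79.15 | 78.85 | 78.836 (0.26) |
| Traditional | 13 | [1] + [3] + [5] | 78.41 | 78.80 | 78.67 | 79.34 | 78.80 | 78.804 (0.34) |
| Traditional | 14 | [2] + [5] | 78.79 | 78.77 | 78.85 | 78.68 | 78.78 | 78.774 (0.06) |
| Traditional | 15 | [5] | 78.34 | 78.67 | 78.86 | 78.94 | 78.70 | 78.702 (0.23) |
| Traditional | 16 | [1] + [2] + [5] | 78.78 | 78.25 | 78.88 | 78.75 | 78.67 | 78.666 (0.24) |
| Traditional | 17 | [1] + [5] | 78.51 | 78.38 | 78.72 | 78.96 | 78.59 | 78.632 (0.22) |
| Traditional | 18 | [1] | 78.70 | 78.28 | 78.46 | 78.51 | 78.51 | 78.492 (0.15) |
| Traditional | 19 | [1] + [2] | 78.22 | 78.61 | 78.27 | 78.56 | 78.36 | 78.404 (0.17) |
| Traditional | 20 | [1] + [2] + [4] | 78.26 | 78.45 | 78.35 | 78.17 | 78.31 | 78.308 (0.10) |
| Traditional | 21 | [0] | 78.42 | 78.36 | 78.28 | 78.11 | 78.34 | 78.302 (0.12) |
| Traditional | 22 | [1] + [3] | 78.24 | 78.16 | 78.07 | 78.36 | 78.30 | 78.226 (0.11) |
| Developed | 23 | [7-2] | 78.35 | 78.68 | 79.10 | 78.65 | 78.34 | 78.624 (0.31) |
| Developed | 24 | [7-1] | 78.39 | 78.15 | 78.19 | 78.37 | 78.33 | 78.286 (0.11) |
| Developed | 25 | [6] | 73.00 | 72.96 | 72.43 | 72.59 | 72.80 | 72.756 (0.24) |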
Note: [0]: no preprocessing applied; [1]: special character elimination; [2]: lemmatization; [3]: lowercasing; [4]: punctuation splitting; [5]: punctuation merging; [6]: essential terminology preprocessing; [7-1]: ascending order based on entropy complexity; [7-2]: descending order based on entropy complexity.
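
The Mean (SD.) entries in Table 6 appear consistent with the arithmetic mean and the sample standard deviation (n - 1 denominator) of the five accuracy values in each row. The following is a minimal sketch of that arithmetic for three rows copied from the table. The choice of the sample rather than the population standard deviation is an assumption inferred from the reported values, and the names `trials` and `summarize` are illustrative only.

```python
from statistics import mean, stdev

# Five accuracy values per preprocessing type, copied from Table 6
# (rows No. 1, 21, and 23).
trials = {
    "[2] + [4]": [79.17, 79.01, 79.31, 78.93, 79.03],
    "[0]": [78.42, 78.36, 78.28, 78.11, 78.34],
    "[7-2]": [78.35, 78.68, 79.10, 78.65, 78.34],
}


def summarize(accuracies):
    """Return (mean, sample standard deviation) for one table row.

    Assumption: the table's SD uses the n - 1 denominator, which is
    what statistics.stdev computes.
    """
    return mean(accuracies), stdev(accuracies)


summary = {label: summarize(values) for label, values in trials.items()}

# Print each row in the same "mean (SD)" layout used by the table.
for label, (avg, sd) in summary.items():
    print(f"{label}: {avg:.3f} ({sd:.2f})")
```

Under these assumptions, the printed values should agree with the Mean (SD.) column for the same rows, which gives a quick way to check the table for transcription errors.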