Recognition of the Script in Serbian Documents Using Frequency Occurrence and Co-Occurrence Analysis
Table 7
Frequency analysis of script type occurrence in the documents from the database.
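Printed documents

| Type of script | Doc 1 Latin | Doc 1 Cyrillic | Doc 2 Latin | Doc 2 Cyrillic | Doc 3 Latin | Doc 3 Cyrillic | Doc 4 Latin | Doc 4 Cyrillic | Doc 5 Latin | Doc 5 Cyrillic |
|---|---|---|---|---|---|---|---|---|---|---|
| S | 2243 | 2764 | 11396 | 13593 | 1510 | 1914 | 2217 | 2516 | 2069 | 2542 |
| A | 906 | 53 | 4060 | 306 | 693 | 65 | 598 | 53 | 897 | 64 |
| D | 183 | 468 | 724 | 1933 | 82 | 286 | 261 | 445 | 151 | 461 |
| F | 0 | 7 | 0 | 8 | 0 | 8 | 8 | 26 | 0 | 12 |

Web documents

| Type of script | Doc 6 Latin | Doc 6 Cyrillic | Doc 7 Latin | Doc 7 Cyrillic | Doc 8 Latin | Doc 8 Cyrillic | Doc 9 Latin | Doc 9 Cyrillic | Doc 10 Latin | Doc 10 Cyrillic |
|---|---|---|---|---|---|---|---|---|---|---|
| S | 1486 | 1799 | 1358 | 1682 | 783 | 996 | 1657 | 2078 | 1328 | 1637 |
| A | 598 | 48 | 636 | 46 | 408 | 48 | 750 | 62 | 588 | 68 |
| D | 99 | 304 | 75 | 292 | 58 | 174 | 134 | 344 | 99 | 284 |
| F | 0 | 7 | 0 | 9 | 0 | 13 | 0 | 18 | 0 | 9 |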
These results are then processed to calculate the ratio of script type occurrence between the Latin and Cyrillic documents. The complete results are given in Table 8.
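The ratio calculation can be illustrated with a minimal sketch over the counts of Table 7. The direction of the ratio (Latin count divided by Cyrillic count, per script type and per document) is an assumption made here for illustration; Table 8 may define or normalize it differently. The names `TABLE_7`, `latin_to_cyrillic_ratio`, and `ratios_for` are hypothetical and do not come from the paper.

```python
# Counts transcribed from Table 7: {document: {script_type: (latin, cyrillic)}}.
TABLE_7 = {
    "Doc 1": {"S": (2243, 2764), "A": (906, 53), "D": (183, 468), "F": (0, 7)},
    "Doc 2": {"S": (11396, 13593), "A": (4060, 306), "D": (724, 1933), "F": (0, 8)},
    "Doc 3": {"S": (1510, 1914), "A": (693, 65), "D": (82, 286), "F": (0, 8)},
    "Doc 4": {"S": (2217, 2516), "A": (598, 53), "D": (261, 445), "F": (8, 26)},
    "Doc 5": {"S": (2069, 2542), "A": (897, 64), "D": (151, 461), "F": (0, 12)},
    "Doc 6": {"S": (1486, 1799), "A": (598, 48), "D": (99, 304), "F": (0, 7)},
    "Doc 7": {"S": (1358, 1682), "A": (636, 46), "D": (75, 292), "F": (0, 9)},
    "Doc 8": {"S": (783, 996), "A": (408, 48), "D": (58, 174), "F": (0, 13)},
    "Doc 9": {"S": (1657, 2078), "A": (750, 62), "D": (134, 344), "F": (0, 18)},
    "Doc 10": {"S": (1328, 1637), "A": (588, 68), "D": (99, 284), "F": (0, 9)},
}


def latin_to_cyrillic_ratio(latin, cyrillic):
    """Return latin / cyrillic, or None when the Cyrillic count is zero.

    Assumption: the ratio is taken as Latin over Cyrillic; the paper's
    Table 8 may use a different convention.
    """
    if cyrillic == 0:
        return None
    return latin / cyrillic


def ratios_for(table):
    """Compute the ratio for every document and script type in the table."""
    return {
        doc: {t: latin_to_cyrillic_ratio(*pair) for t, pair in counts.items()}
        for doc, counts in table.items()
    }


ratios = ratios_for(TABLE_7)

# Print one line per document with the ratio for each script type.
for doc, by_type in ratios.items():
    cells = ", ".join(
        f"{t}={'n/a' if r is None else format(r, '.3f')}" for t, r in by_type.items()
    )
    print(f"{doc}: {cells}")
```

Under this assumed definition, script types that occur less often in the Latin text than in the Cyrillic text (for example D and F) give ratios below 1, while types that occur more often in Latin (A) give ratios well above 1.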