Research Article

Recognition of the Script in Serbian Documents Using Frequency Occurrence and Co-Occurrence Analysis

Table 7

Frequency analysis of script type occurrence in the documents from the database.

Printed documents
Type of script  Doc 1             Doc 2             Doc 3             Doc 4             Doc 5
                Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic
S               2243    2764      11396   13593     1510    1914      2217    2516      2069    2542
A               906     53        4060    306       693     65        598     53        897     64
D               183     468       724     1933      82      286       261     445       151     461
F               0       7         0       80        8       8         2       6         0       12

Web documents
Type of script  Doc 6             Doc 7             Doc 8             Doc 9             Doc 10
                Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic  Latin   Cyrillic
S               1486    1799      1358    1682      783     996       1657    2078      1328    1637
A               598     48        636     46        408     48        750     62        588     68
D               99      304       75      292       58      174       134     344       99      284
F               0       7         0       9         0       13        0       18        0       9

The results above are further processed to calculate the ratio of script type occurrence between the Latin and Cyrillic documents. The complete results are given in Table 8.
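As a rough illustration of this processing step, the sketch below computes a per-script-type ratio for Doc 6, using its counts as read from Table 7. It assumes the ratio is the Latin count divided by the Cyrillic count; the exact definition behind Table 8 may differ. The names `occurrence_ratio`, `doc6_latin`, and `doc6_cyrillic` are hypothetical and chosen only for this example.

```python
# Script-type counts for Doc 6 (web documents), as read from Table 7.
doc6_latin = {"S": 1486, "A": 598, "D": 99, "F": 0}
doc6_cyrillic = {"S": 1799, "A": 48, "D": 304, "F": 7}


def occurrence_ratio(latin, cyrillic):
    """Return the Latin-to-Cyrillic count ratio for each script type.

    Assumption: ratio = Latin count / Cyrillic count. A zero Cyrillic
    count yields None instead of raising ZeroDivisionError.
    """
    return {
        script_type: (latin[script_type] / cyrillic[script_type]
                      if cyrillic[script_type] else None)
        for script_type in latin
    }


ratios = occurrence_ratio(doc6_latin, doc6_cyrillic)
for script_type, ratio in ratios.items():
    # Print each ratio rounded to three decimals (or None if undefined).
    print(script_type, None if ratio is None else round(ratio, 3))
```

Under this assumed definition, a ratio well above 1 marks a script type that is far more frequent in the Latin text, and a ratio well below 1 marks one that is more frequent in the Cyrillic text.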