The Scientific World Journal

Research Article

Recognition of the Script in Serbian Documents Using Frequency Occurrence and Co-Occurrence Analysis

Table 10

Five GLCM descriptors of script-type co-occurrence for the documents in the database.


Printed documents
	Doc 1		Doc 2		Doc 3		Doc 4		Doc 5
	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic

Uniformity	0.2885	0.4725	0.2473	0.4167	0.2557	0.4120	0.2759	0.4545	0.2707	0.4498
Entropy	−1.5191	−1.1774	−1.6379	−1.3079	−1.6047	−1.2999	−1.5675	−1.1650	−1.5847	−1.1799
Max. probability	0.4655	0.6636	0.3952	0.6139	0.4120	0.6098	0.4439	0.6457	0.4349	0.6405
Dissimilarity	0.6847	0.5933	0.7469	0.6592	0.7502	0.6427	0.7064	0.6041	0.7117	0.6217
Contrast	1.0324	1.1790	1.1106	1.2859	1.1258	1.2261	1.0577	1.1449	1.0630	1.1949

Web documents
	Doc 6		Doc 7		Doc 8		Doc 9		Doc 10
	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic	Latin	Cyrillic

Uniformity	0.2447	0.3714	0.2754	0.3817	0.2533	0.5005	0.2252	0.3147	0.2522	0.3325
Entropy	−1.6524	−1.3738	−1.5725	−1.3412	−1.5990	−1.0779	−1.6778	−1.5650	−1.6144	−1.5059
Max. probability	0.3964	0.5650	0.4409	0.5753	0.3972	0.6844	0.3195	0.5154	0.4016	0.5318
Dissimilarity	0.7723	0.7320	0.6912	0.7209	0.7294	0.5686	0.8317	0.7667	0.7256	0.7416
Contrast	1.1862	1.3869	1.0287	1.3681	1.0459	1.1158	1.2122	1.4220	1.0641	1.3716

The above results are further processed to calculate the ratio of each script-type co-occurrence descriptor between the Latin and Cyrillic documents. These ratios are shown in Table 11.
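The ratio calculation described above can be illustrated with a minimal sketch that uses the Doc 1 values from Table 10. The direction of the ratio (Cyrillic divided by Latin) is an assumption made here for illustration, since this section does not state it. The names `DOC1` and `descriptor_ratios` are hypothetical and do not come from the paper.

```python
# Descriptor values for Doc 1, copied from Table 10 as (Latin, Cyrillic) pairs.
DOC1 = {
    "Uniformity": (0.2885, 0.4725),
    "Entropy": (-1.5191, -1.1774),
    "Max. probability": (0.4655, 0.6636),
    "Dissimilarity": (0.6847, 0.5933),
    "Contrast": (1.0324, 1.1790),
}


def descriptor_ratios(doc):
    """Return the Cyrillic/Latin ratio for each descriptor.

    Assumption: the ratio is taken as Cyrillic over Latin; the paper's
    Table 11 may use the opposite direction.
    """
    return {name: cyrillic / latin for name, (latin, cyrillic) in doc.items()}


ratios = descriptor_ratios(DOC1)
for name, value in ratios.items():
    print(f"{name:<17} {value:.4f}")
```

Under this assumed direction, a ratio above 1 means the descriptor is larger for the Cyrillic document (for Doc 1: uniformity, maximum probability, and contrast), and a ratio below 1 means it is smaller (dissimilarity, and the magnitude of entropy).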