Applied Computational Intelligence and Soft Computing
Volume 2014, Article ID 896128, 12 pages
http://dx.doi.org/10.1155/2014/896128
Research Article

Script Identification from Printed Indian Document Images and Performance Evaluation Using Different Classifiers

1 Department of Computer Science & Engineering, Aliah University, Kolkata, India
2 Department of Computer Science, West Bengal State University, Barasat, India
3 Department of Computer Science & Engineering, Jadavpur University, Kolkata, India

Received 18 June 2014; Accepted 18 November 2014; Published 7 December 2014

Academic Editor: Erich Peter Klement

Copyright © 2014 Sk Md Obaidullah et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Script identification from document images is an active research area within document image processing, particularly for a multilingual/multiscript country like India. This paper considers the real-life problem of identifying printed scripts in official Indian document images and evaluates the performance of several well-known classifiers. Two evaluation parameters, namely, AAR (average accuracy rate) and MBT (model building time), are computed for this performance analysis. The experiment was carried out on 459 printed document images with 5-fold cross-validation. The Simple Logistic model shows the highest AAR of all, 98.9%. The BayesNet and Random Forest models have AARs of 96.7% and 98.2%, respectively, with the lowest MBT of 0.09 s.
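The evaluation protocol summarized above (k-fold cross-validation reporting AAR and MBT) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: AAR is taken as the mean of the per-fold accuracies, MBT as the mean wall-clock time of the training call, and the classifier is a toy nearest-centroid model standing in for the classifiers actually evaluated. The names `NearestCentroid`, `k_fold_indices`, and `evaluate`, as well as the synthetic data, are hypothetical and used only for illustration.

```python
import random
import time


class NearestCentroid:
    """Toy stand-in classifier (hypothetical; not one of the paper's models)."""

    def fit(self, X, y):
        # Accumulate per-class feature sums and counts, then average them.
        sums, counts = {}, {}
        for features, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(features))
            for i, value in enumerate(features):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def closest(features):
            # Pick the class whose centroid has the smallest squared distance.
            return min(
                self.centroids,
                key=lambda label: sum(
                    (a - b) ** 2 for a, b in zip(features, self.centroids[label])
                ),
            )

        return [closest(features) for features in X]


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def evaluate(make_model, X, y, k=5, seed=0):
    """Return (AAR in percent, mean MBT in seconds) over k folds.

    Assumption: AAR is the mean per-fold accuracy and MBT is the mean
    time spent in fit(); the paper may define either differently.
    """
    accuracies, build_times = [], []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model()
        start = time.perf_counter()
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        build_times.append(time.perf_counter() - start)
        predictions = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return 100.0 * sum(accuracies) / k, sum(build_times) / k


# Synthetic two-class data, made up purely for illustration.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
y = ["script_a"] * 50 + ["script_b"] * 50

aar, mbt = evaluate(NearestCentroid, X, y, k=5)
print(f"AAR = {aar:.1f}%, MBT = {mbt:.6f} s")
```

Timing only the `fit` call keeps prediction cost out of MBT, so a classifier that trains quickly but predicts slowly would still report a low MBT under this assumed definition.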