Applied Computational Intelligence and Soft Computing
Volume 2014, Article ID 896128, 12 pages
http://dx.doi.org/10.1155/2014/896128
Research Article

Script Identification from Printed Indian Document Images and Performance Evaluation Using Different Classifiers

1Department of Computer Science & Engineering, Aliah University, Kolkata, India
2Department of Computer Science, West Bengal State University, Barasat, India
3Department of Computer Science & Engineering, Jadavpur University, Kolkata, India

Received 18 June 2014; Accepted 18 November 2014; Published 7 December 2014

Academic Editor: Erich Peter Klement

Copyright © 2014 Sk Md Obaidullah et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. S. M. Obaidullah, S. K. Das, and K. Roy, “A system for handwritten script identification from Indian document,” Journal of Pattern Recognition Research, vol. 8, no. 1, pp. 1–12, 2013.
  2. D. Ghosh, T. Dube, and A. Shivaprasad, “Script Recognition—a review,” IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 12, pp. 2142–2161, 2010.
  3. A. L. Spitz, “Determination of the script and language content of document images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 3, pp. 235–245, 1997.
  4. L. Lam, J. Ding, and C. Y. Suen, “Differentiating between oriental and European scripts by statistical features,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 12, no. 1, pp. 63–79, 1998.
  5. J. Hochberg, P. Kelly, T. Thomas, and L. Kerns, “Automatic script identification from document images using cluster-based templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 2, pp. 177–181, 1997.
  6. L. Zhou, Y. Lu, and C. L. Tan, “Bangla/English script identification based on analysis of connected component profiles,” in Proceedings of the 7th International Conference on Document Analysis Systems (DAS '06), vol. 3872 of Lecture Notes in Computer Science, pp. 243–254, 2006.
  7. J. R. Prasad, U. V. Kulkarni, and R. S. Prasad, “Template matching algorithm for Gujrati character recognition,” in Proceedings of the 2nd International Conference on Emerging Trends in Engineering and Technology (ICETET '09), pp. 263–268, Nagpur, India, December 2009.
  8. B. Patil and N. V. Subbareddy, “Neural network based system for script identification in Indian documents,” Sadhana, vol. 27, part 1, pp. 83–97, 2002.
  9. A. M. Elgammal and M. A. Ismail, “Techniques for language identification for hybrid Arabic-English document images,” in Proceedings of the IEEE 6th International Conference on Document Analysis and Recognition, pp. 1100–1104, 2001.
  10. B. V. Dhandra, P. Nagabhushan, M. Hangarge, R. Hegadi, and V. S. Malemath, “Script identification based on morphological reconstruction in document images,” in Proceedings of the 18th International Conference on Pattern Recognition (ICPR '06), vol. 2, pp. 950–953, Hong Kong, August 2006.
  11. B. B. Chaudhuri and U. Pal, “An OCR system to read two Indian language scripts: Bangla and Devanagari (Hindi),” in Proceedings of the 4th International Conference on Document Analysis and Recognition (ICDAR '97), vol. 2, pp. 1011–1015, Ulm, Germany, August 1997.
  12. C. L. Tan, P. Y. Leong, and S. He, “Language Identification in Multilingual Documents,” 2003.
  13. S. Chaudhury and R. Sheth, “Trainable script identification strategies for Indian languages,” in Proceedings of the 5th International Conference on Document Analysis and Recognition (ICDAR '99), pp. 657–660, 1999.
  14. M. C. Padma and P. A. Vijaya, “Wavelet packet based texture features for automatic script identification,” International Journal of Image Processing, vol. 4, no. 1, 2010.
  15. G. D. Joshi, S. Garg, and J. Sivaswamy, “Script identification from Indian documents,” in Proceedings of the 7th International Workshop on Document Analysis Systems VII, vol. 3872 of Lecture Notes in Computer Science, pp. 255–267, Nelson, New Zealand, 2006.
  16. D. Dhanya, A. G. Ramakrishnan, and P. B. Pati, “Script identification in printed bilingual documents,” Sadhana, vol. 27, part 1, pp. 73–82, 2002.
  17. http://commons.wikimedia.org/wiki/File:States_of_South_Asia.png.
  18. K. Roy, U. Pal, and A. Banerjee, “A system for word-wise handwritten script identification for Indian postal automation,” in Proceedings of the 1st IEEE INDICON India Annual Conference, pp. 266–271, December 2004.
  19. A. Kaehler and G. R. Bradski, Learning OpenCV, O’Reilly Media, 2008.
  20. V. Singhal, N. Navin, and D. Ghosh, “Script-based classification of hand-written text documents in a multilingual environment,” in Proceedings of the 13th International Workshop on Research Issues in Data Engineering: Multi-lingual Information Management, Research Issues in Data Engineering, pp. 47–54, 2003.
  21. J. Hochberg, K. Bowers, M. Cannon, and P. Kelly, “Script and language identification for handwritten document images,” The International Journal on Document Analysis and Recognition, vol. 2, no. 2-3, pp. 45–52, 1999.
  22. K. Roy, S. Kundu Das, and S. M. Obaidullah, “Script identification from handwritten document,” in Proceedings of the 3rd National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG '11), pp. 66–69, Hubli, Karnataka, India, December 2011.
  23. S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. Kumar Basu, “A novel framework for automatic sorting of postal documents with multi-script address blocks,” Pattern Recognition, vol. 43, no. 10, pp. 3507–3521, 2010.
  24. V. Singhal, N. Navin, and D. Ghosh, “Script-based classification of hand-written text documents in a multilingual environment,” in Proceedings of the 13th International Workshop on Research Issues in Data Engineering: Multi-Lingual Information Management (RIDE-MLIM '03), pp. 47–54, March 2003.
  25. S. B. Moussa, A. Zahour, A. Benabdelhafid, and A. M. Alimi, “Fractal-based system for Arabic/Latin, printed/handwritten script identification,” in Proceedings of the 19th International Conference on Pattern Recognition (ICPR '08), pp. 1–4, IEEE, December 2008.
  26. M. Hangarge, K. C. Santosh, and R. Pardeshi, “Directional discrete cosine transform for handwritten script identification,” in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR '13), pp. 344–348, Washington, DC, USA, August 2013.
  27. R. Rani, R. Dhir, and G. S. Lehal, “Script identification of pre-segmented multi-font characters and digits,” in Proceedings of the 12th International Conference on Document Analysis and Recognition, pp. 1150–1154, August 2013.
  28. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update,” SIGKDD Explorations, vol. 11, pp. 10–18, 2009.
  29. N. Friedman, D. Geiger, and M. Goldszmidt, “Bayesian network classifiers,” Machine Learning, vol. 29, no. 2-3, pp. 131–163, 1997.
  30. R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, “LIBLINEAR: a library for large linear classification,” Journal of Machine Learning Research, vol. 9, pp. 1871–1874, 2008.
  31. M. D. Buhmann, Radial Basis Functions: Theory and Implementations, Cambridge Monographs on Applied and Computational Mathematics (12), Cambridge University Press, Cambridge, UK, 2003.
  32. S. V. Chakravarthy and J. Ghosh, “Scale-based clustering using the radial basis function network,” IEEE Transactions on Neural Networks, vol. 7, no. 5, pp. 1250–1261, 1996.
  33. A. J. Howell and H. Buxton, “RBF network methods for face detection and attentional frames,” Neural Processing Letters, vol. 15, no. 3, pp. 197–211, 2002.
  34. J. Hühn and E. Hüllermeier, “FURIA: an algorithm for unordered fuzzy rule induction,” Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.
  35. L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.