Mathematical Problems in Engineering
Volume 2014, Article ID 537428, 14 pages
http://dx.doi.org/10.1155/2014/537428
Research Article

A New Dataset Size Reduction Approach for PCA-Based Classification in OCR Application

¹Image Processing and Pattern Recognition Research Lab, R&D Center, Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
²Department of Information System, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia

Received 25 August 2013; Revised 14 January 2014; Accepted 19 January 2014; Published 17 April 2014

Academic Editor: Yi-Hung Liu

Copyright © 2014 Mohammad Amin Shayegan and Saeed Aghabozorgi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in mobile cloud computing: taxonomy and open challenges,” IEEE Communications Surveys & Tutorials, vol. 99, pp. 1–24, 2013.
  2. A. Phinyomark, P. Phukpattaranont, and C. Limsakul, “Feature reduction and selection for EMG signal classification,” Expert Systems with Applications, vol. 39, no. 8, pp. 7420–7431, 2012.
  3. H. B. Borges and J. C. Nievola, “Comparing the dimensionality reduction methods in gene expression databases,” Expert Systems with Applications, vol. 39, pp. 10780–10795, 2012.
  4. M. Song, H. Yang, S. H. Siadat, and M. Pechenizkiy, “A comparative study of dimensionality reduction techniques to enhance trace clustering performances,” Expert Systems with Applications, vol. 40, pp. 3722–3737, 2013.
  5. C. Yang, W. Zhang, J. Zou, S. Hu, and J. Qiu, “Feature selection in decision systems: a mean-variance approach,” Mathematical Problems in Engineering, vol. 2013, Article ID 268063, 8 pages, 2013.
  6. G. A. Abandah, K. S. Younis, and M. Z. Khedher, “Handwritten Arabic character recognition using multiple classifiers based on letter form,” in Proceedings of the 5th IASTED International Conference on Signal Processing, Pattern Recognition & Applications (SPPRA '08), pp. 128–133, February 2008.
  7. W. Zhongdong, Y. Jianping, X. Weixin, and G. Xinbo, “Reduction of training datasets via fuzzy entropy for support vector machines,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '04), pp. 2381–2385, October 2004.
  8. S. V. N. Vishwanathan and M. N. Murty, “Use of multi category proximal SVM for data set reduction,” International Journal of Studies in Fuzziness and Soft Computing, vol. 140, pp. 3–20, 2004.
  9. K. Hara and K. Nakayama, “Training data selection method for generalization by multilayer neural networks,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E81-A, no. 3, pp. 374–381, 1998.
  10. C. Ding and X. He, “K-means clustering via principal component analysis,” in Proceedings of the 21st International Conference on Machine Learning (ICML '04), pp. 1–9, July 2004.
  11. S. Abdul Sattar and S. Shahl, “Character recognition of Arabic script languages,” in Proceedings of the 2nd International Conference on Communication and Information Technology, pp. 502–506, 2012.
  12. M. Elzobi, A. Al-Hamadi, L. Dinges, and B. Michaelis, “A structural features based segmentation for off-line handwritten Arabic text,” in Proceedings of the 5th International Symposium on I/V Communications and Mobile Networks (ISIVC '10), pp. 1–4, October 2010.
  13. H.-C. Kim, D. Kim, and S. Yang Bang, “A numeral character recognition using the PCA mixture model,” Pattern Recognition Letters, vol. 23, no. 1-3, pp. 103–111, 2002.
  14. P. Zhang, C. Y. Suen, and T. D. Bui, “Multi-modal nonlinear feature reduction for the recognition of handwritten numerals,” in Proceedings of the 1st Canadian Conference on Computer and Robot Vision, pp. 393–400, May 2004.
  15. V. Deepu, S. Madhvanath, and A. G. Ramakrishnan, “Principal component analysis for online handwritten character recognition,” in Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), vol. 2, pp. 327–330, August 2004.
  16. A. Sharma and K. K. Paliwal, “Fast principal component analysis using fixed-point algorithm,” Pattern Recognition Letters, vol. 28, no. 10, pp. 1151–1155, 2007.
  17. M. T. Parvez and S. A. Mahmoudi, “Offline Arabic handwritten text recognition: a survey,” ACM Computing Surveys, vol. 45, no. 2, article 23, 2013.
  18. R. Azmi, B. Pishgoo, N. Norozi, M. Koohzadi, and F. Baesi, “A hybrid GA and SA algorithms for feature selection in recognition of hand-printed Farsi characters,” in Proceedings of the IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS '10), vol. 3, pp. 384–387, October 2010.
  19. Y. El-Glaly and F. Quek, “Isolated Handwritten Arabic Character Recognition using Multilayer Perceptron and K Nearest Neighbor Classifiers,” 2012, http://filebox.vt.edu/users/yasmineg/index_htm_files/MLSP%20Arabic%20Character%20Recognition.pdf.
  20. A. R. Kheyrkhah and E. Rahmanian, “Optimizing a Farsi handwritten character recognition system by selecting effective features on classifier using genetic algorithm,” in Proceedings of the 1st Joint Congress on Fuzzy and Intelligent Systems, 2007 (Persian).
  21. A. M. Urmanov, A. A. Bougaev, and K. C. Gross, “Reducing the size of a training set for classification,” US Patent no. 7478075 B2, 2007.
  22. I. Javed, M. N. Ayyaz, and W. Mehmood, “Efficient training data reduction for SVM based handwritten digits recognition,” in Proceedings of the International Conference on Electrical Engineering (ICEE '07), pp. 1–4, April 2007.
  23. B. Boucheham, “PLA data reduction for speeding up time series comparison,” International Arab Journal of Information Technology, vol. 9, no. 5, 2012.
  24. J. Cervantes, X. Li, and W. Yu, “Support vector classification for large data sets by reducing training data with change of classes,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 2609–2614, October 2008.
  25. S. Mozaffari, K. Faez, and M. Ziaratban, “Character representation and recognition using quadtree-based fractal encoding scheme,” in Proceedings of the International Conference on Document Analysis and Recognition, pp. 819–823, September 2005.
  26. M. Ziaratban, K. Faez, and M. Ezoji, “Use of legal amount to confirm or correct the courtesy amount on Farsi bank checks,” in Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR '07), pp. 1123–1127, September 2007.
  27. A. Mowlaei and K. Faez, “Recognition of isolated handwritten Persian/Arabic characters and numerals using support vector machines,” in Proceedings of the IEEE International Workshop on Neural Networks for Signal Processing, pp. 547–554, 2003.
  28. J. Sadri, C. Y. Suen, and T. D. Bui, “Application of support vector machines of handwritten Arabic/Persian digits,” in Proceedings of the 2nd Iranian Conference on Machine Vision, Image Processing & Applications (MVIP '03), vol. 1, pp. 300–307, 2003.
  29. S. Mozaffari, K. Faez, and H. R. Kanan, “Feature comparison between fractal codes and wavelet transform in handwritten alphanumeric recognition using SVM classifier,” in Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), vol. 2, pp. 331–334, August 2004.
  30. H. Soltanzadeh and M. Rahmati, “Recognition of Persian handwritten digits using image profiles of multiple orientations,” Pattern Recognition Letters, vol. 25, no. 14, pp. 1569–1576, 2004.
  31. M. Ziaratban, K. Faez, and F. Faradji, “Language-based feature extraction using template-matching in Farsi/Arabic handwritten numeral recognition,” in Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR '07), pp. 297–301, September 2007.
  32. R. Enayatifar and M. Alirezanejad, “Offline handwriting digit recognition by using direction and accumulation of pixels,” in Proceedings of the International Conference on Computer and Software Modeling, vol. 14, pp. 214–220, 2011.
  33. H. Khosravi and E. Kabir, “Introducing a very large dataset of handwritten Farsi digits and a study on their varieties,” Pattern Recognition Letters, vol. 28, no. 10, pp. 1133–1141, 2007.
  34. M. Hanmandlu, K. R. Murali Mohan, S. Chakraborty, S. Goyal, and D. R. Choudhury, “Unconstrained handwritten character recognition based on fuzzy logic,” Pattern Recognition, vol. 36, no. 3, pp. 603–623, 2003.
  35. A. C. Downton, E. Kabir, and D. Guillevic, “Syntactic and contextual post processing of handwritten addresses for optical character recognition,” in Proceedings of the 9th International Conference on Pattern Recognition, pp. 1072–1076, 1988.
  36. T. Sitamahalakshmi, A. Vinay Babu, and M. Jagadeesh, “Character recognition using Dempster-Shafer theory—combining different distance measurement methods,” International Journal of Engineering and Technology, vol. 2, no. 5, pp. 1177–1184, 2010.
  37. V. Curic, J. Lindblad, N. Sladoje, H. Sarve, and G. Borgefors, “A new set distance and its application to shape registration,” Pattern Analysis and Applications, vol. 12, pp. 1–12, 2012.
  38. A. Alaei, P. Nagabhushan, and U. Pal, “A new dataset of Persian handwritten documents and its segmentation,” in Proceedings of the 7th Iranian Conference on Machine Vision and Image Processing (MVIP '11), pp. 1–5, November 2011.
  39. M. Kherallah, A. Elbaati, H. E. Abed, and M. Alimi, “The on/off (LMCA) dual Arabic handwriting database,” in Proceedings of the International Conference on Frontiers in Handwriting Recognition, 2008.
  40. Y.-C. Chim, A. A. Kassim, and Y. Ibrahim, “Dual classifier system for handprinted alphanumeric character recognition,” Pattern Analysis and Applications, vol. 1, no. 3, pp. 155–162, 1998.