Advances in Artificial Intelligence
Volume 2013 (2013), Article ID 176890, 12 pages
http://dx.doi.org/10.1155/2013/176890
Research Article

Imprecise Imputation as a Tool for Solving Classification Problems with Mean Values of Unobserved Features

Lev V. Utkin and Yulia A. Zhuk
Department of Control, Automation and System Analysis, St. Petersburg State Forest Technical University, Institutski per. 5, St. Petersburg 194021, Russia

Received 11 October 2012; Revised 9 February 2013; Accepted 10 March 2013

Academic Editor: Wolfgang Faber

Copyright © 2013 Lev V. Utkin and Yulia A. Zhuk. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
