Advances in Artificial Neural Systems
Volume 2015, Article ID 265637, 10 pages
http://dx.doi.org/10.1155/2015/265637
Research Article

Hybrid Feature Selection Based Weighted Least Squares Twin Support Vector Machine Approach for Diagnosing Breast Cancer, Hepatitis, and Diabetes

Indian Institute of Information Technology, Allahabad 211012, India

Received 30 September 2014; Accepted 23 December 2014

Academic Editor: Chao-Ton Su

Copyright © 2015 Divya Tomar and Sonali Agarwal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. Lemnaru, Strategies for dealing with real world classification problems [Ph. D. thesis], Faculty of Computer Science and Automation, Universitatea Tehnică din Cluj-Napoca, Cluj-Napoca, Romania, 2012.
  2. D. Tomar and S. Agarwal, “A survey on pre-processing and post-processing techniques in data mining,” International Journal of Database Theory & Application, vol. 7, no. 4, 2014.
  3. H.-H. Hsu, C.-W. Hsieh, and M.-D. Lu, “Hybrid feature selection by combining filters and wrappers,” Expert Systems with Applications, vol. 38, no. 7, pp. 8144–8150, 2011.
  4. 2014, http://www.imaginis.com/breast-cancer-resource-center.
  5. A. Jemal, M. M. Center, C. DeSantis, and E. M. Ward, “Global patterns of cancer incidence and mortality rates and trends,” Cancer Epidemiology Biomarkers and Prevention, vol. 19, no. 8, pp. 1893–1907, 2010.
  6. E. L. Mohamed, R. Linder, G. Perriello, N. Di Daniele, S. J. Pöppl, and A. De Lorenzo, “Predicting type 2 diabetes using an electronic nose-based artificial neural network analysis,” Diabetes, Nutrition and Metabolism, vol. 15, no. 4, pp. 215–221, 2002.
  7. D. Tomar and S. Agarwal, “Predictive model for diabetic patients using hybrid twin support vector machine,” in Proceedings of the 5th International Conferences on Advances in Communication Network and Computing (CNC ’14), pp. 1–9, 2014.
  8. http://www.medicalnewstoday.com/articles/145869.php.
  9. D. Tomar and S. Agarwal, “A survey on data mining approaches for healthcare,” International Journal of Bio-Science & Bio-Technology, vol. 5, no. 5, pp. 241–266, 2013.
  10. B. X. Wang, Boosting support vector machine [M.S. thesis], 2005.
  11. B. X. Wang and N. Japkowicz, “Boosting support vector machines for imbalanced data sets,” Knowledge and Information Systems, vol. 25, no. 1, pp. 1–20, 2010.
  12. J. Laurikkala, “Instance-based data reduction for improved identification of difficult small classes,” Intelligent Data Analysis, vol. 6, no. 4, pp. 311–322, 2002.
  13. N. Japkowicz and S. Stephen, “The class imbalance problem: a systematic study,” Intelligent Data Analysis, vol. 6, no. 5, pp. 429–449, 2002.
  14. M. Kubat and S. Matwin, “Addressing the curse of imbalanced training sets: one-sided selection,” in Proceedings of the 14th International Conference on Machine Learning (ICML ’97), vol. 97, pp. 179–186, 1997.
  15. C. Ling and C. Li, “Data mining for direct marketing—specific problems and solutions,” in Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, pp. 73–79, 1998.
  16. N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
  17. P. Domingos, “MetaCost: a general method for making classifiers cost-sensitive,” in Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 155–164, San Diego, Calif, USA, August 1999.
  18. C. Elkan, “The foundations of cost-sensitive learning,” in Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI ’01), pp. 973–978, Seattle, Wash, USA, August 2001.
  19. K. M. Ting, “An instance-weighting method to induce cost-sensitive trees,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 3, pp. 659–665, 2002.
  20. B. Zadrozny, J. Langford, and N. Abe, “Cost-sensitive learning by cost-proportionate example weighting,” in Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM ’03), pp. 435–442, Melbourne, Fla, USA, November 2003.
  21. Z.-H. Zhou and X.-Y. Liu, “Training cost-sensitive neural networks with methods addressing the class imbalance problem,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 1, pp. 63–77, 2006.
  22. X. Yang, Q. Song, and Y. Wang, “A weighted support vector machine for data classification,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 961–976, 2007.
  23. J. A. K. Suykens, J. de Brabanter, L. Lukas, and J. Vandewalle, “Weighted least squares support vector machines: robustness and sparse approximation,” Neurocomputing, vol. 48, pp. 85–105, 2002.
  24. J. R. Quinlan, “Improved use of continuous attributes in C4.5,” Journal of Artificial Intelligence Research, vol. 4, pp. 77–90, 1996.
  25. D. Tomar and S. Agarwal, “Feature selection based least square twin support vector machine for diagnosis of heart disease,” International Journal of Bio-Science and Bio-Technology, vol. 6, no. 2, pp. 69–82, 2014.
  26. H. J. Hamilton, N. Shan, and N. Cercone, RIAC: A Rule Induction Algorithm Based on Approximate Classification, Computer Science Department, University of Regina, 1996.
  27. B. Ster and A. Dobnikar, “Neural networks in medical diagnosis: comparison with other methods,” in Proceedings of the International Conference on Engineering Applications of Neural Networks (EANN ’96), pp. 427–430, 1996.
  28. C. A. Peña-Reyes and M. Sipper, “A fuzzy-genetic approach to breast cancer diagnosis,” Artificial Intelligence in Medicine, vol. 17, no. 2, pp. 131–155, 1999.
  29. M. F. Akay, “Support vector machines combined with feature selection for breast cancer diagnosis,” Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.
  30. C.-L. Huang, H.-C. Liao, and M.-C. Chen, “Prediction model building and feature selection with support vector machines in breast cancer diagnosis,” Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.
  31. H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, “A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis,” Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.
  32. N. Rathore and S. Agarwal, “Predicting the survivability of breast cancer patients using ensemble approach,” in Proceedings of the International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT ’14), pp. 459–464, IEEE, February 2014.
  33. K. Polat and S. Güneş, “Breast cancer diagnosis using least square support vector machine,” Digital Signal Processing, vol. 17, no. 4, pp. 694–701, 2007.
  34. M. Karabatak and M. C. Ince, “An expert system for detection of breast cancer based on association rules and neural network,” Expert Systems with Applications, vol. 36, no. 2, pp. 3465–3469, 2009.
  35. E. D. Übeyli, “Implementing automated diagnostic systems for breast cancer detection,” Expert Systems with Applications, vol. 33, no. 4, pp. 1054–1062, 2007.
  36. H. Temurtas, N. Yumusak, and F. Temurtas, “A comparative study on diabetes disease diagnosis using neural networks,” Expert Systems with Applications, vol. 36, no. 4, pp. 8610–8615, 2009.
  37. X. Liu and H. Fu, “PSO-based support vector machine with Cuckoo search technique for clinical disease diagnoses,” The Scientific World Journal, vol. 2014, Article ID 548483, 7 pages, 2014.
  38. M. F. Ganji and M. S. Abadeh, “A fuzzy classification system based on ant colony optimization for diabetes disease diagnosis,” Expert Systems with Applications, vol. 38, no. 12, pp. 14650–14659, 2011.
  39. M. Ashraf, G. Chetty, D. Tran, and D. Sharma, “Hybrid approach for diagnosing thyroid, hepatitis, and breast cancer based on correlation based feature selection and Naïve bayes,” in Neural Information Processing, Lecture Notes in Computer Science, pp. 272–280, Springer, Berlin, Germany, 2012.
  40. K. Polat and S. Güneş, “Hepatitis disease diagnosis using a new hybrid system based on feature selection (FS) and artificial immune recognition system with fuzzy resource allocation,” Digital Signal Processing, vol. 16, no. 6, pp. 889–901, 2006.
  41. P. Yang, W. Liu, B. B. Zhou, S. Chawla, and A. Y. Zomaya, “Ensemble-based wrapper methods for feature selection and class imbalance learning,” in Advances in Knowledge Discovery and Data Mining, pp. 544–555, Springer, Berlin, Germany, 2013.
  42. A. Al-Shahib, R. Breitling, and D. Gilbert, “Feature selection and the class imbalance problem in predicting protein function from sequence,” Applied Bioinformatics, vol. 4, no. 3, pp. 195–203, 2005.
  43. Jayadeva, R. Khemchandani, and S. Chandra, “Twin support vector machines for pattern classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 905–910, 2007.
  44. M. A. Kumar and M. Gopal, “Least squares twin support vector machines for pattern classification,” Expert Systems with Applications, vol. 36, no. 4, pp. 7535–7543, 2009.
  45. D. Tomar, S. Singhal, and S. Agarwal, “Weighted least square twin support vector machine for imbalanced dataset,” International Journal of Database Theory and Application, vol. 7, no. 2, pp. 25–36, 2014.
  46. Dataset, 2014, http://archive.ics.uci.edu/ml/datasets.html.
  47. http://www.is.umk.pl/projects/datasets.html.
  48. K. Polat, S. Güneş, and A. Arslan, “A cascade learning system for classification of diabetes disease: generalized discriminant analysis and least square support vector machine,” Expert Systems with Applications, vol. 34, no. 1, pp. 482–487, 2008.
  49. H. Kahramanli and N. Allahverdi, “Design of a hybrid system for the diabetes and heart diseases,” Expert Systems with Applications, vol. 35, no. 1, pp. 82–89, 2008.
  50. N. Jankowski, “Controlling the structure of neural networks that grow and shrink,” in Proceedings of the 2nd International Conference on Cognitive and Neural Systems, 1998.
  51. T. Sousa, A. Silva, and A. Neves, “A particle swarm data miner,” in Progress in Artificial Intelligence, vol. 2902 of Lecture Notes in Computer Science, pp. 43–53, Springer, Berlin, Germany, 2003.
  52. J. Abonyi and F. Szeifert, “Supervised fuzzy clustering for the identification of fuzzy classifiers,” Pattern Recognition Letters, vol. 24, no. 14, pp. 2195–2207, 2003.
  53. D. E. Goodman Jr., L. C. Boggess, and A. B. Watkins, “Artificial immune system classification of multiple-class problems,” in Proceedings of the Artificial Neural Networks in Engineering Conference (ANNIE ’02), pp. 179–184, 2002.
  54. K. P. Bennett and J. A. Blue, “Support vector machine approach to decision trees,” in Proceedings of the IEEE International Joint Conference on Neural Networks Proceedings. IEEE World Congress on Computational Intelligence, vol. 3, pp. 2396–2401, Anchorage, Alaska, USA, May 1998.
  55. S. Şahan, K. Polat, H. Kodaz, and S. Güneş, “A new hybrid method based on fuzzy-artificial immune system and k-nn algorithm for breast cancer diagnosis,” Computers in Biology and Medicine, vol. 37, no. 3, pp. 415–423, 2007.