Mathematical Problems in Engineering
Volume 2017, Article ID 6393652, 15 pages
https://doi.org/10.1155/2017/6393652
Research Article

Fast Density Clustering Algorithm for Numerical Data and Categorical Data

¹Zhejiang University of Technology, Zhejiang 310023, China
²Electrical Engineering Department, Ningbo Wanli University, Ningbo 310023, China

Correspondence should be addressed to Chen Jinyin; chenjinyin@zjut.edu.cn

Received 20 August 2016; Revised 2 January 2017; Accepted 15 January 2017; Published 26 March 2017

Academic Editor: Erik Cuevas

Copyright © 2017 Chen Jinyin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. Han and M. Kamber, Data Mining Concepts and Techniques, Morgan Kaufmann, San Francisco, Calif, USA, 2001.
  2. C.-C. Hsu, C.-L. Chen, and Y.-W. Su, “Hierarchical clustering of mixed data based on distance hierarchy,” Information Sciences, vol. 177, no. 20, pp. 4474–4492, 2007.
  3. C.-C. Hsu and Y.-P. Huang, “Incremental clustering of mixed data based on distance hierarchy,” Expert Systems with Applications, vol. 35, no. 3, pp. 1177–1185, 2008.
  4. S. P. Lloyd, “Least squares quantization in PCM,” IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
  5. T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: an efficient data clustering method for very large databases,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 103–114, ACM, Montreal, Canada, June 1996.
  6. M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD '96), Portland, Ore, USA, August 1996.
  7. Z. Huang, “A fast clustering algorithm to cluster very large categorical data sets in data mining,” in Research Issues on Data Mining and Knowledge Discovery, pp. 1–8, ACM Press, Tucson, Ariz, USA, 1997.
  8. Z. Huang and M. K. Ng, “A fuzzy k-modes algorithm for clustering categorical data,” IEEE Transactions on Fuzzy Systems, vol. 7, no. 4, pp. 446–452, 1999.
  9. M.-S. Yang and Y.-C. Tian, “Bias-correction fuzzy clustering algorithms,” Information Sciences, vol. 309, pp. 138–162, 2015.
  10. D. Barbara, J. Couto, and Y. Li, “COOLCAT: an entropy-based algorithm for categorical clustering,” in Proceedings of the 11th International Conference on Information and Knowledge Management, pp. 582–589, ACM Press, McLean, Va, USA, November 2002.
  11. H. He and Y. Tan, “A two-stage genetic algorithm for automatic clustering,” Neurocomputing, vol. 81, no. 1, pp. 49–59, 2012.
  12. A. Saha and S. Das, “Categorical fuzzy k-modes clustering with automated feature weight learning,” Neurocomputing, vol. 166, pp. 422–435, 2015.
  13. S. Zahra, M. A. Ghazanfar, A. Khalid, M. A. Azam, U. Naeem, and A. Prugel-Bennett, “Novel centroid selection approaches for KMeans-clustering based recommender systems,” Information Sciences, vol. 320, pp. 156–189, 2015.
  14. Z. Huang, “Clustering large data sets with mixed numeric and categorical values,” in Proceedings of the 1st Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 21–34, World Scientific Publishing, Singapore, 1997.
  15. S. P. Chatzis, “A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional,” Expert Systems with Applications, vol. 38, no. 7, pp. 8684–8689, 2011.
  16. I. Gath and A. B. Geva, “Unsupervised optimal fuzzy clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 773–780, 1989.
  17. Z. Zheng, M. Gong, J. Ma, L. Jiao, and Q. Wu, “Unsupervised evolutionary clustering algorithm for mixed type data,” in Proceedings of the IEEE Congress on Evolutionary Computation, pp. 1–8, Barcelona, Spain, 2010.
  18. C. Li and G. Biswas, “Unsupervised learning with mixed numeric and nominal data,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 4, pp. 673–690, 2002.
  19. D. W. Goodall, “A new similarity index based on probability,” Biometrics, vol. 22, no. 4, pp. 882–907, 1966.
  20. C.-C. Hsu and Y.-C. Chen, “Mining of mixed data with application to catalog marketing,” Expert Systems with Applications, vol. 32, no. 1, pp. 12–23, 2007.
  21. A. Ahmad and L. Dey, “A k-mean clustering algorithm for mixed numeric and categorical data,” Data & Knowledge Engineering, vol. 63, no. 2, pp. 503–527, 2007.
  22. J. Ji, T. Bai, C. Zhou, C. Ma, and Z. Wang, “An improved k-prototypes clustering algorithm for mixed numeric and categorical data,” Neurocomputing, vol. 120, pp. 590–596, 2013.
  23. J. Ji, W. Pang, C. Zhou, X. Han, and Z. Wang, “A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data,” Knowledge-Based Systems, vol. 30, no. 1, pp. 129–135, 2012.
  24. J.-Y. Chen and H.-H. He, “A fast density-based data stream clustering algorithm with cluster centers self-determined for mixed data,” Information Sciences, vol. 345, no. 1, pp. 271–293, 2016.
  25. B. Everitt, S. Landau, and M. Leese, Cluster Analysis, Arnold, London, UK, 2001.
  26. G. David and A. Averbuch, “SpectralCAT: categorical spectral clustering of numerical and nominal data,” Pattern Recognition, vol. 45, no. 1, pp. 416–433, 2012.
  27. Y.-M. Cheung and H. Jia, “Categorical-and-numerical-attribute data clustering based on a unified similarity metric without knowing cluster number,” Pattern Recognition, vol. 46, no. 8, pp. 2228–2238, 2013.
  28. A. Rodriguez and A. Laio, “Clustering by fast search and find of density peaks,” Science, vol. 344, no. 6191, pp. 1492–1496, 2014.
  29. J.-Y. Chen and H.-H. He, “Research on density-based clustering algorithm for mixed data with determine cluster centers automatically,” Acta Automatica Sinica, vol. 41, no. 10, pp. 1798–1813, 2015.
  30. I. H. Witten, E. Frank, and M. A. Hall, Data Mining, Morgan Kaufmann, 2011.
  31. Z. Xiao, S.-J. Ye, B. Zhong, and C.-X. Sun, “BP neural network with rough set for short term load forecasting,” Expert Systems with Applications, vol. 36, no. 1, pp. 273–279, 2009.