- About this Journal
- Abstracting and Indexing
- Aims and Scope
- Article Processing Charges
- Articles in Press
- Author Guidelines
- Bibliographic Information
- Citations to this Journal
- Contact Information
- Editorial Board
- Editorial Workflow
- Free eTOC Alerts
- Publication Ethics
- Reviewers Acknowledgment
- Submit a Manuscript
- Subscription Information
- Table of Contents
Applied Computational Intelligence and Soft Computing
Volume 2013 (2013), Article ID 863146, 12 pages
Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
1Department of Computer Science and Engineering, Faculty of Engineering and Technology, Mody Institute of Technology and Science, Lakshmangarh, Rajasthan 332311, India
2School of Computer Engineering, KIIT University, Bhubaneswar 751024, India
Received 21 August 2013; Revised 20 October 2013; Accepted 11 November 2013
Academic Editor: Sebastian Ventura
Copyright © 2013 Singh Vijendra and Sahoo Laxman. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
- L. C. Hamilton, Modern Data Analyis: A First Course in Applied Statistics, Brooks/Cole, 1990.
- D. H. Fisher, “Iterative optimization and simplification of hierarchical clustering,” in Proceedings of the 1st International Conference on Knowledge Discovery and Data Mining (KDD ’95), pp. 118–123, Montreal, Canada, 1995.
- J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
- N. Kumar and K. Kummamuru, “Semisupervised clustering with metric learning using relative comparisons,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 4, pp. 496–503, 2008.
- M. H. Dunham, Data Mining: Introductory and Advanced Topics, Prentice Hall, 2003.
- A. Thalamuthu, I. Mukhopadhyay, X. Zheng, and G. C. Tseng, “Evaluation and comparison of gene clustering methods in microarray analysis,” Bioinformatics, vol. 22, no. 19, pp. 2405–2412, 2006.
- K. Wang, Z. Du, Y. Chen, and S. Li, “V3COCA: an effective clustering algorithm for complicated objects and its application in breast cancer research and diagnosis,” Simulation Modelling Practice and Theory, vol. 17, no. 2, pp. 454–470, 2009.
- J. Sander, M. Ester, H. Kriegel, and X. Xu, “Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications,” Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 169–194, 1998.
- B. R. S. A. Young, Efficient algorithms for data mining with federated databases [Ph.D. thesis], University of Cincinnati, 2007.
- G. Sheikholeslami, S. Chatterjee, and A. Zhang, “WaveCluster: a wavelet-based clustering approach for spatial data in very large databases,” Journal of Very Large Data Bases, vol. 8, no. 3-4, pp. 289–304, 2000.
- J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, Calif, USA, 2011.
- J. McQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the 5th Berkeley Symposium on Mathematics, Statistics, and Probabilistics, vol. 1, pp. 281–297, 1967.
- E. M. Voorhees, “Implementing agglomerative hierarchic clustering algorithms for use in document retrieval,” Information Processing and Management, vol. 22, no. 6, pp. 465–576, 1986.
- T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: an efficient data clustering method for very large databases,” in Proceedings of the ACM SIGMOD International Conference on Management of Data of ACM SIGMOD Record, vol. 25, no. 2, pp. 103–114, June 1996.
- S. Guha and R. Rastogi, “CURE: an efficient clustering algorithm for large databases,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 27, no. 2, pp. 73–84, ACM Press, 1998.
- F. Korn, B. Pagel, and C. Faloutsos, “On the “dimensionality curse” and the ‘self-similarity blessing–,” IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 1, pp. 96–111, 2001.
- K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is nearest neighbors meaningful,” in Proceedings of the 7th International Conference on Database Theory (ICDT '99), pp. 217–235, London, UK, 1999.
- B. W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman and Hall, London, UK, 1986.
- M. Ester, H. P. Kriegel, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD '96), pp. 226–223, Portand, Ore, USA, 1996.
- A. Hinneburg and H. Gabriel, “DENCLUE 2. 0: fast clustering based on kernel density estimation,” in Advances in Intelligent Data Analysis VII, pp. 70–80, Springer, Berlin, Germany, 2007.
- A. H. Pilevar and M. Sukumar, “GCHL: a grid-clustering algorithm for high-dimensional very large spatial data bases,” Pattern Recognition Letters, vol. 26, no. 7, pp. 999–1010, 2005.
- W. Wang, J. Yang, and R. Muntz, “STING: a statistical information grid approach to spatial data mining,” in Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97), vol. 97, pp. 186–195, 1997.
- C. C. Aggarwal, C. Procopiuc, J. L. Wolf, and J. S. Park, “Fast algorithms for projected clustering,” in Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, vol. 28, no. 2, pp. 61–72, 1999.
- A. Patrikainen and M. Meila, “Comparing subspace clusterings,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 7, pp. 902–916, 2006.
- R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, “Automatic subspace clustering of high dimensional data for data mining applications,” in Proceedings of the ACM-SIGMOD Conference on the Management of Data, vol. 27, no. 2, pp. 94–105, 1998.
- K. Kailing, H. P. Kriegel, and P. Kroger, “Density-connected subspace clustering for high-dimensional data,” in Proceedings 4th SIAM International Conference on Data Mining (SDM '04), p. 4, 2004.
- C. C. Aggarwal and P. S. Yu, “Finding generalized projected clusters in high dimensional spaces,” in Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, vol. 29, no. 2, pp. 70–81, 2000.
- H. Kriegel, P. Kröger, M. Renz, and S. Wurst, “A generic framework for efficient subspace clustering of high-dimensional data,” in Proceedings of the 5th IEEE International Conference on Data Mining (ICDM '05), pp. 250–257, November 2005.
- M. J. Zaki, M. Peters, I. Assent, and T. Seidl, “CLICKS: an effective algorithm for mining subspace clusters in categorical datasets,” Data and Knowledge Engineering, vol. 60, no. 1, pp. 51–70, 2007.
- J. H. Friedman and J. J. Meulman, “Clustering objects on subsets of attributes,” Journal of the Royal Statistical Society B, vol. 66, no. 4, pp. 815–839, 2004.
- K. Woo, J. Lee, M. Kim, and Y. Lee, “FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting,” Information and Software Technology, vol. 46, no. 4, pp. 255–271, 2004.
- Y. Chu, J. Huang, K. Chuang, D. Yang, and M. Chen, “Density conscious subspace clustering for high-dimensional data,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 1, pp. 16–30, 2010.
- C. H. Cheng, A. W. Fu, C. H. Cheng, A. W. Fu, and Y. Zhang, “Entropy-based subspace clustering for mining numerical data,” in Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 84–93, 1999.
- S. Goil, H. S. Nagesh, and A. Choudhary, “MAFIA: efficient and scalable subspace clustering for very large data sets,” in Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 443–452, 1999.
- H. S. Nagesh, S. Goil, and A. Choudhary, “Adaptive grids for clustering massive data sets,” in Proceedings of the 1st SIAM International Conference on Data Mining (SDM '01), pp. 1–17, 2001.
- C. M. Procopiuc, M. Jones, P. K. Agarwal, and T. M. Murali, “A Monte Carlo algorithm for fast projective clustering,” in Proceedings of the 2002 ACM SIGMOD International Conference on Managment of Data, pp. 418–427, June 2002.
- B. L. Milenova and M. M. Campos, “O-Cluster: scalable clustering of large high dimensional sata sets,” in Oracle Data Mining Technologies, pp. 290–297, Oracle Corporation, 2002.
- L. Jing, M. K. Ng, and J. Z. Huang, “An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data,” IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 8, pp. 1026–1041, 2007.
- E. K. K. Ng, A. W. Fu, and R. C. Wong, “Projective clustering by histograms,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 3, pp. 369–383, 2005.
- A. Elke, C. Böhm, H. P. Kriegel, P. Kröger, I. M. Gorman, and A. Zimek, “Detection and visualization of subspace cluster hierarchies,” in Advances in Databases: Concepts, Systems and Applications, pp. 152–163, springer, Berlin, Germany, 2007.
- S. Robert and P. H. A. Sneath, Principles of Numerical Taxonomy, W. H. Freeman, 1963.
- D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
- E. R. Hruschka, R. J. G. B. Campello, A. A. Freitas, and A. C. P. L. F. de Carvalho, “A survey of evolutionary algorithms for clustering,” IEEE Transactions on Systems, Man and Cybernetics C, vol. 39, no. 2, pp. 133–155, 2009.
- J. Handl and J. Knowles, “An evolutionary approach to multiobjective clustering,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 1, pp. 56–76, 2007.
- T. Kohonen, Self-Organizing Maps, vol. 30, Springer, 2001.
- D. E. Goldberg and K. Deb, “A comparative analysis of selection schemes used in genetic algorithms,” in Foundations of Genetic Algorithms, vol. 51, pp. 61801–62996, Urbana, 1991.
- J. E. Baker, An analysis of the effects of selection in genetic algorithms [Ph.D. thesis], Graduate School of Vanderbilt University, Nashville, Tenn, USA, 1989.
- J. Yeh and J. C. Fu, “A hierarchical genetic algorithm for segmentation of multi-spectral human-brain MRI,” Expert Systems with Applications, vol. 34, no. 2, pp. 1285–1295, 2008.
- J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, Mich, USA, 1975.
- D. Thierens and D. Goldberg, “Elitist recombination: an integrated selection recombination GA,” in Proceedings of the 1st IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence (ICEC '94), pp. 508–512, June 1994.
- J. H. Friedman, L. B. Jon, and A. F. Raphael, “An algorithm for finding best matches in logarithmic expected time,” ACM Transactions on Mathematical Software, vol. 3, no. 3, pp. 209–226, 1977.