Mathematical Problems in Engineering
Volume 2014 (2014), Article ID 761468, 11 pages
http://dx.doi.org/10.1155/2014/761468
Research Article

Adaptive Initialization Method Based on Spatial Local Information for k-Means Algorithm

1 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
2 College of Science, Huazhong Agricultural University, Wuhan 430070, China

Received 26 November 2013; Accepted 20 February 2014; Published 30 March 2014

Academic Editor: Yi-Kuei Lin

Copyright © 2014 Honghong Liao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The k-means algorithm is widely used for clustering in the data mining and machine learning communities. However, its result depends heavily on the initial guess of the cluster centers, so an improper initialization can prevent the algorithm from reaching a desirable clustering result. How to choose suitable initial centers is therefore an important research issue for the k-means algorithm. In this paper, we propose an adaptive initialization framework based on spatial local information (AIF-SLI), which takes advantage of the local density of the data distribution. As it is difficult to estimate density correctly, we develop two approximate estimations: density by t-nearest neighborhoods (t-NN) and density by ε-neighborhoods (ε-Ball), leading to two implementations of the proposed framework. Our empirical study on more than 20 datasets shows promising performance of the proposed framework and indicates that it has several advantages: (1) it can find reasonable candidates for the initial centers effectively; (2) it can significantly reduce the number of iterations of k-means methods; (3) it is robust to outliers; and (4) it is easy to implement.
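
To make the idea of density-guided initialization concrete, the following Python sketch shows one possible reading of it. It is not the authors' AIF-SLI procedure, whose details are not given in the abstract. The two density proxies (inverse mean distance to the nearest neighbors, and the number of points inside a fixed-radius ball), the greedy "density times distance" selection rule, and all function and parameter names (`density_tnn`, `density_eps_ball`, `pick_initial_centers`, `t`, `eps`) are assumptions made only for illustration.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix (fine for small illustrative datasets)."""
    return [[math.dist(p, q) for q in points] for p in points]


def density_tnn(dist, t):
    """Assumed density proxy: inverse of the mean distance to the t nearest neighbors."""
    densities = []
    for i, row in enumerate(dist):
        nearest = sorted(d for j, d in enumerate(row) if j != i)[:t]
        mean_d = sum(nearest) / len(nearest)
        densities.append(1.0 / (mean_d + 1e-12))  # small constant avoids division by zero
    return densities


def density_eps_ball(dist, eps):
    """Assumed density proxy: number of other points within radius eps."""
    return [
        sum(1 for j, d in enumerate(row) if j != i and d <= eps)
        for i, row in enumerate(dist)
    ]


def pick_initial_centers(points, density, dist, k):
    """Hypothetical greedy rule: start from the densest point, then repeatedly
    add the point maximizing density * distance to the nearest chosen center."""
    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            return density[i] * min(dist[i][c] for c in chosen)
        chosen.append(max(range(len(points)), key=score))
    return [points[i] for i in chosen]


if __name__ == "__main__":
    # Two tight groups plus one far-away outlier.
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]
    d = pairwise_distances(data)
    print(pick_initial_centers(data, density_tnn(d, t=2), d, k=2))
    print(pick_initial_centers(data, density_eps_ball(d, eps=1.5), d, k=2))
```

Weighting distance by density is what keeps the isolated point from being selected in this toy example: a purely distance-based rule would favor the outlier, whereas a low-density point receives a small score here. The selected points would then be passed to an ordinary k-means run as its starting centers.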