Mathematical Problems in Engineering
Volume 2014 (2014), Article ID 327142, 9 pages
Research Article

An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

Yu Wang1,2

1School of Economic and Business Administration, Chongqing University, Chongqing 400030, China
2Chongqing Key Laboratory of Logistics, Chongqing University, Chongqing 400044, China

Received 16 December 2013; Accepted 13 January 2014; Published 19 February 2014

Academic Editor: J. J. Judice

Feature space heterogeneity exists in many real-world data sets: features differ in their importance for classification across different subsets of the data. Moreover, the pattern of feature space heterogeneity may change dynamically over time as more data accumulate. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is used to obtain a number of clusters such that the samples in each cluster are from the same class. After outliers are removed, the relevance of each feature in a cluster is calculated from its variation within that cluster. This feature relevance is then incorporated into the distance calculation used for classification. The main advantage of SCCFSH is that it can solve classification problems with feature space heterogeneity incrementally, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and an application to a database marketing problem show the efficiency and effectiveness of the proposed approach.
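
The idea of cluster-specific feature relevance can be illustrated with a minimal sketch. It is not the SCCFSH procedure itself: the abstract gives no formulas, so the inverse-variance weighting, the centroid-based decision rule, and the names `cluster_profile`, `weighted_distance`, and `classify` are assumptions made for illustration. The supervised clustering step, outlier removal, and incremental updating are omitted, and the single-class clusters are supplied by hand.

```python
from statistics import mean, pvariance


def cluster_profile(samples):
    """Return (centroid, weights) for one single-class cluster.

    Assumption (not from the paper): a feature that varies little
    inside the cluster is treated as more relevant to that cluster,
    so weights are normalized inverse variances.
    """
    columns = list(zip(*samples))
    centroid = [mean(col) for col in columns]
    raw = [1.0 / (pvariance(col) + 1e-9) for col in columns]  # epsilon avoids division by zero
    total = sum(raw)
    weights = [r / total for r in raw]
    return centroid, weights


def weighted_distance(x, centroid, weights):
    """Euclidean distance with per-feature weights of one cluster."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, centroid)) ** 0.5


def classify(x, clusters):
    """Assign x the label of the nearest cluster under that cluster's own weights.

    clusters: list of (label, samples) pairs, each cluster holding one class.
    """
    best_label, best_dist = None, float("inf")
    for label, samples in clusters:
        centroid, weights = cluster_profile(samples)
        dist = weighted_distance(x, centroid, weights)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical toy data: feature 0 is stable in cluster "A",
# feature 1 is stable in cluster "B".
clusters = [
    ("A", [(0.0, 1.0), (0.1, 9.0), (-0.1, 5.0)]),
    ("B", [(1.0, 5.0), (9.0, 5.1), (5.0, 4.9)]),
]

label_near_a = classify((0.05, 8.0), clusters)
label_near_b = classify((7.0, 5.0), clusters)
```

In this toy setting each cluster measures distance mainly along the feature that is stable within it, so the same feature can matter strongly for one cluster and hardly at all for another, which is the kind of heterogeneity the abstract describes.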