Research Article  Open Access
Spectral Nonlinearly Embedded Clustering Algorithm
Abstract
As is well known, traditional spectral clustering (SC) methods are developed based on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, for some high-dimensional and sparse data, this assumption might be invalid, and the clustering performance of SC then degrades sharply. To solve this problem, we propose a general spectral embedded framework, which embeds the true cluster assignment matrix for high-dimensional data into a nonlinear space by a predefined embedding function. Based on this framework, several algorithms are presented by using different embedding functions, which aim at simultaneously learning the final cluster assignment matrix and a transformation into a low-dimensional space. More importantly, the proposed method can naturally handle the out-of-sample extension problem. The experimental results on benchmark datasets demonstrate that the proposed method significantly outperforms existing clustering methods.
1. Introduction
As one of the fundamental topics in data mining and machine learning, clustering has been successfully applied in various fields. Generally speaking, the target of clustering is to group the examples into a number of classes, or clusters. Over the past decades, a large family of clustering algorithms has been studied extensively, which is mainly divided into two categories: generative clustering approaches and discriminative clustering models. Generative clustering approaches, for example, mixture models [1, 2], generally integrate Bayesian approaches into their models. However, generative models impose restrictive assumptions on the class-conditional densities, which might lead to unconvincing clustering results when these assumptions do not hold. Discriminative methods, such as spectral clustering (SC) [3] and K-means clustering [4], learn discriminative models based on loss functions from unlabeled data through the low-density separation assumption.
Recently, discriminative clustering methods, such as the variants of kernel-based clustering and spectral clustering, have attracted renewed attention, because they can readily capture nonlinear cluster structures. Motivated by the outstanding performance of the support vector machine (SVM) in supervised learning, maximum margin clustering (MMC) [5–7] methods have been developed to obtain a decision boundary that separates data points into different clusters to the utmost extent. Although these clustering methods are able to exploit nonlinear data structures, they are still sensitive to high-dimensional data points. For example, K-means clustering iteratively computes the distance between each data point and the center of each cluster. Hence, its clustering performance depends heavily on the distance measure. However, for high-dimensional data, such as some image data, similarity computed by the Euclidean distance becomes unreliable, and the performance of K-means clustering degrades dramatically. SC performs clustering by utilizing the spectrum of the similarity matrix to discover the nonlinear and low-dimensional manifold structure of data points. In other words, it relies heavily on the manifold assumption [8, 9], namely, that two nearby data points of a low-dimensional manifold have the same class label. However, for high-dimensional and sparse data, the manifold assumption may not hold due to the bias caused by the curse of dimensionality. Nie et al. [10] have shown that graph-based spectral clustering methods cannot always exploit the low-dimensional manifold structure, which results in performance degradation of SC. Another challenge for traditional SC methods is that they do not solve the out-of-sample extension problem; that is, the discrete cluster assignment vectors for new unseen samples cannot be obtained automatically. The algorithm proposed in [11] takes advantage of the Nyström method to approximate the eigenfunction for the unseen data points. The method described in [12] uses some heuristics to evaluate the implicit eigenfunction for the new data points. However, the performance of these methods relies heavily on the estimated affinity matrix defined between training and new data points.
To further improve the clustering performance of SC for high-dimensional data, in this paper, we first propose a general spectral embedded clustering framework, which incorporates dimensionality reduction methods into the model of SC. Second, by using different low-dimensional embedding functions, we derive the corresponding optimization models and develop the spectral nonlinearly embedded algorithms based on extreme learning machine (ELM) and kernel functions, respectively. Our main contributions include the following:
(1) A general spectral embedded clustering framework is presented by imposing a linearity regularization on the objective function of SC. The proposed framework introduces dimensionality reduction of the training data by controlling the error between the cluster assignment matrix and the low-dimensional embedding of the data.
(2) Based on the proposed general framework, several models can be derived by using different embedding functions, which include the linear embedding functions and the nonlinear functions in Reproducing Kernel Hilbert Space (RKHS) as well as in ELM feature space. The spectral embedded clustering model (SEC) proposed in [10] can be considered as a special case of the general framework.
(3) We prove that the spectral nonlinearly embedded clustering model based on ELM (ESEC) is an approximation of the kernel-based spectral nonlinearly embedded clustering (KSEC) method under some conditions. The fast spectral nonlinearly embedded clustering algorithm is proposed based on ESEC by utilizing the efficient learning ability of ELM.
(4) The out-of-sample extension problem can be naturally solved for the clustering methods under our proposed SEC framework.
(5) Experimental results on benchmark datasets demonstrate that the proposed ESEC outperforms the existing SC methods, K-means clustering, SEC, and KSEC for in-sample clustering. For out-of-sample clustering, ESEC also generalizes better than the Nyström method and performs better than K-means clustering, SEC, and KSEC.
The rest of this paper is organized as follows. Related works are introduced in Section 2. In Section 3, we present the general spectral embedded clustering framework and derive several different models by using different embedding functions. The relationship between ESEC and KSEC is demonstrated and the ESEC clustering algorithm is described in detail. In addition, clustering for out-of-sample data is also discussed. To validate our model, experimental results are reported in Section 4. Finally, we give the related conclusions and a discussion of future work in Section 5. To avoid confusion, a list of the main notations used in this paper is given in the Notations section.
2. Related Works
2.1. Spectral Clustering
Given a dataset $X = \{x_1, x_2, \ldots, x_n\}$, the main task of clustering is to partition $X$ into $c$ clusters. SC aims at finding a cluster assignment matrix $Y$ of the training data by means of a weighted graph whose vertices are the points of $X$. Several SC algorithms have been proposed in [3, 13, 14]. In this paper, we mainly discuss the SC algorithm with k-way normalized cuts [3].
Specifically, denote an undirected weighted graph by $G = (V, A)$, where $V$ is a vertex set and $A \in \mathbb{R}^{n \times n}$ represents an affinity matrix. Each entry $A_{ij}$ of the symmetric matrix $A$ records the edge weight that characterizes the similarity between a pair of vertices of $G$. $A_{ij}$ is commonly defined by
$$A_{ij} = \begin{cases} \exp\left(-\|x_i - x_j\|^{2} / \sigma^{2}\right), & \text{if } x_i \text{ and } x_j \text{ are neighbors}, \\ 0, & \text{otherwise}. \end{cases} \qquad (1)$$
The graph Laplacian is defined by $\mathcal{L} = D - A$, where $D$ is a diagonal matrix with the diagonal elements $D_{ii} = \sum_{j} A_{ij}$. Based on the normalized cut criterion, where the size of a subset of a graph is measured by the weights of its edges and the normalized Laplacian matrix is used, the optimization problem can be transformed into the following trace maximization problem [3]:
$$\max_{F^{T}F = I_c} \operatorname{tr}\left(F^{T} D^{-1/2} A D^{-1/2} F\right), \qquad (2)$$
where $I_c$ denotes the identity matrix of size $c$ by $c$ and $F \in \mathbb{R}^{n \times c}$ represents the cluster assignment matrix with continuous values by relaxation. The optimal solution of (2) can then be obtained by eigenvalue decomposition of the matrix $D^{-1/2} A D^{-1/2}$.
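As an illustration of the procedure in this subsection, the following Python sketch builds a symmetric K-nearest-neighbor Gaussian affinity, forms the normalized Laplacian, and takes the eigenvectors of its smallest eigenvalues as the relaxed assignment matrix. It is a minimal NumPy sketch under stated assumptions, not the authors' MATLAB code: the function names, the fixed bandwidth `sigma`, and the rule that keeps an edge when either endpoint lists the other among its nearest neighbors are illustrative choices.

```python
import numpy as np

def knn_gaussian_affinity(X, k=5, sigma=1.0):
    """Symmetric k-NN Gaussian affinity; X has shape (n, d), one sample per row."""
    n = len(X)
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    A = np.exp(-d2 / sigma ** 2)
    np.fill_diagonal(A, 0.0)
    # Indices of the k nearest neighbors of each point (self excluded via +inf).
    idx = np.argsort(d2 + np.diag(np.full(n, np.inf)), axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), idx.ravel()] = True
    mask |= mask.T  # assumed rule: keep an edge if either endpoint selects it
    return np.where(mask, A, 0.0)

def normalized_laplacian(A):
    """I - D^{-1/2} A D^{-1/2} for a symmetric affinity matrix A."""
    deg = A.sum(axis=1)
    s = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return np.eye(len(A)) - A * s[:, None] * s[None, :]

def relaxed_assignment(A, c):
    """Eigenvectors of the c smallest eigenvalues of the normalized Laplacian."""
    _, vecs = np.linalg.eigh(normalized_laplacian(A))  # eigenvalues ascending
    return vecs[:, :c]
```

Minimizing the trace with the normalized Laplacian and maximizing it with the normalized affinity select the same eigenvectors, which is why the sketch works with the smallest eigenvalues.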
2.2. Extreme Learning Machine
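The output function of ELM for generalized single-hidden-layer feedforward neural networks (SLFNs) in the case of one output node is
$$f(x) = \sum_{i=1}^{L} \beta_i h_i(x) = h(x)\beta, \qquad (3)$$
where $\beta = [\beta_1, \ldots, \beta_L]^{T}$ is the vector of the output weights between the hidden layer of L nodes and the output node and $h(x) = [h_1(x), \ldots, h_L(x)]$ is the output (row) vector of the hidden layer with respect to the input $x$. In fact, $h(x)$ maps the data from the d-dimensional input space to the L-dimensional hidden-layer feature space (ELM feature space) $\mathcal{H}$. ELM minimizes the training error as well as the norm of the output weights [15]:
$$\min_{\beta}\ \frac{1}{2}\|\beta\|^{2} + \frac{C}{2}\|H\beta - T\|^{2}, \qquad (4)$$
where $C$ is a tradeoff parameter between the complexity and fitness of the decision function, $T$ is the vector of training targets, and $H$ is the hidden-layer output matrix denoted by
$$H = \begin{bmatrix} h(x_1) \\ \vdots \\ h(x_n) \end{bmatrix} = \begin{bmatrix} h_1(x_1) & \cdots & h_L(x_1) \\ \vdots & \ddots & \vdots \\ h_1(x_n) & \cdots & h_L(x_n) \end{bmatrix}. \qquad (5)$$
Similar to the support vector machine (SVM), minimizing the norm of the output weights $\|\beta\|$ actually maximizes the distance $2/\|\beta\|$ between the separating margins of the two different classes in the ELM feature space, which controls the complexity of the function in the ELM feature space.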
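The regularized least-squares view of ELM described above has a closed-form solution, which the following sketch illustrates. It assumes the objective is one half of the squared norm of the output weights plus C/2 times the squared training error, uses sigmoid hidden nodes, and introduces hypothetical helper names (`elm_hidden_layer`, `elm_output_weights`); none of these details are taken from the authors' implementation.

```python
import numpy as np

def elm_hidden_layer(X, W_in, b_in):
    """Sigmoid hidden layer with fixed random weights; row i is h(x_i)."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b_in)))

def elm_output_weights(H, T, C):
    """Minimize 0.5*||beta||^2 + 0.5*C*||H beta - T||^2 in closed form.

    Two algebraically equivalent expressions are used so that the matrix
    being inverted is the smaller of (L x L) and (n x n).
    """
    n, L = H.shape
    if L <= n:
        return np.linalg.solve(H.T @ H + np.eye(L) / C, H.T @ T)
    return H.T @ np.linalg.solve(H @ H.T + np.eye(n) / C, T)
```

Only the output weights are learned; the input weights `W_in` and biases `b_in` are drawn once at random and left untouched, which is what makes training a single linear solve.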
3. General Spectral Embedded Clustering Framework
As mentioned above, SC methods depend greatly on the construction of the affinity matrix. Some high-dimensional data, however, might not exhibit an evident low-dimensional manifold structure. In this case, the clustering performance of SC may be inferior to that of K-means clustering.
In the following subsections, we first propose a general spectral embedded clustering framework, which incorporates a linearity regularization into the traditional normalized SC model. By using different embedding functions, this framework can generate a family of spectral embedded clustering algorithms, such as SEC, KSEC, and ESEC. Second, we demonstrate the relationship between ESEC and KSEC. The ESEC algorithm is then proposed for high-dimensional data clustering. Finally, the out-of-sample extension problem is discussed for our proposed ESEC method.
3.1. Formulation
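Generally, the clustering models of traditional SC methods can be transformed into the following minimization problem:
$$\min_{F^{T}F = I} \operatorname{tr}\left(F^{T} \tilde{\mathcal{L}} F\right), \qquad (6)$$
where $F$ is the relaxed cluster assignment matrix and $\tilde{\mathcal{L}}$ is the normalized Laplacian matrix.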
To make use of the underlying dense grouping structure of data in a low-dimensional subspace, the proposed general framework introduces a regularization term into the optimization problem (6), which controls the error between the cluster assignment matrix and the low-dimensional embedding of the data. Specifically, we minimize the following objective function: where and are two regularization parameters and is the low-dimensional embedding of training data. The second term represents the error between the relaxed cluster assignment matrix and the low-dimensional embedding of the data. The third term is the norm penalty of and represents the complexity of functions in a high-dimensional feature space.
In dimensionality reduction, linear embedding functions and nonlinear embedding functions are commonly used to address out-of-sample problems, because they contain few parameters and are therefore inexpensive in computation time and memory. In this paper, we mainly discuss kernel-based and ELM-based nonlinear embedding functions.
If we choose a linear embedding function , , where and , the optimization problem (7) becomes which is equivalent to the SEC method proposed in [10].
If we use a nonlinear embedding function in RKHS, that is, , then , where is a symmetric kernel matrix and ; problem (7) can be rewritten as which is referred to as KSEC.
Alternatively, if we consider an embedding function in ELM feature space, that is, , then , where represents the hiddenlayer output matrix of ELM. Problem (7) can be reformulated as which is referred to as ESEC.
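For orientation, the general framework and its three instantiations can be written side by side. The block below is a hedged sketch rather than the authors' displayed equations: the symbols ($F$ for the relaxed assignment matrix, $\tilde{L}$ for the normalized Laplacian, $\mu$ and $\gamma$ for the two regularization parameters, $W$, $b$, $\alpha$, $\beta$ for the embedding coefficients) and the placement of the two parameters are assumptions chosen to match the verbal description of the error term and the norm penalty.

```latex
\begin{align*}
\text{general framework:}\quad
  & \min_{F^{T}F = I,\ f}\ \operatorname{tr}\!\left(F^{T}\tilde{L}F\right)
    + \mu\,\lVert f(X) - F\rVert_F^{2} + \gamma\,\lVert f\rVert^{2} \\
\text{linear embedding (SEC-like):}\quad
  & f(X) = X^{T}W + \mathbf{1}b^{T}, \qquad \lVert f\rVert^{2} = \operatorname{tr}(W^{T}W) \\
\text{RKHS embedding (KSEC-like):}\quad
  & f(X) = K\alpha, \qquad \lVert f\rVert^{2} = \operatorname{tr}(\alpha^{T}K\alpha) \\
\text{ELM embedding (ESEC-like):}\quad
  & f(X) = H\beta, \qquad \lVert f\rVert^{2} = \operatorname{tr}(\beta^{T}\beta)
\end{align*}
```

Under this reading, the three models differ only in how the embedding of the training data is parameterized and in which norm measures its complexity.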
3.2. Method
First, to solve the optimization problem (9), we transform it into a simpler form and obtain the following theorem.
Theorem 1. The optimization problem (9) can be transformed into the following minimization problem: where and denotes the identity matrix of size n by n.
Proof. Problem (9) is firstly transformed into the following form:where .
By setting the derivatives of the objective function (12) with respect to to zero, we have By substituting in (12) by (13), the optimization problem (12) becomes which can be denoted as follows: where . This completes the proof of Theorem 1.
Based on Theorem 1, the relaxed cluster assignment matrix of KSEC can be obtained by computing the eigenvectors of the matrix defined in Theorem 1 that correspond to its smallest eigenvalues; these eigenvectors form the columns of the relaxed cluster assignment matrix. Finally, the discrete-valued cluster assignment matrix can be obtained by clustering the rows of the relaxed matrix.
To inherit the fast learning speed of ELM, we mainly discuss ESEC based on ELM with multiple outputs, since ELM with a single output can be regarded as a special case of it. We have the following theorem on ESEC, which is the foundation of the proposed ESEC algorithm.
Theorem 2. The optimization problem (10) can be transformed into the following minimization problem: where or . denotes the identity matrix of size by and is the number of hidden-layer nodes in ELM.
Proof. By setting the derivatives of the objective function (10) with respect to to zero, we have By substituting in (10) by (17), the optimization problem (10) becomes Problem (18) can be further transformed into the following objective function: which can be denoted as follows: where . can be transformed into another form as follows: This completes the proof of Theorem 2.
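The elimination step used in this proof can be checked numerically. The sketch below assumes that the ELM-dependent part of the objective has the ridge form mu*||H beta - F||^2 + gamma*||beta||^2; this form, the parameter names `mu` and `gamma`, and the function names are assumptions made for illustration. Under that assumption, substituting the optimal output weights leaves a quadratic form in the relaxed assignment matrix, and the matrix of that quadratic form can be computed with either an L-by-L or an n-by-n inverse, which would explain the two alternative expressions in the theorem.

```python
import numpy as np

def optimal_beta(H, F, mu, gamma):
    """Stationary point of mu*||H beta - F||^2 + gamma*||beta||^2 over beta."""
    L = H.shape[1]
    return np.linalg.solve(H.T @ H + (gamma / mu) * np.eye(L), H.T @ F)

def esec_penalty_matrix(H, mu, gamma):
    """Matrix P such that the minimized penalty equals tr(F^T P F)."""
    n, L = H.shape
    lam = gamma / mu
    if L < n:
        # Form with an (L x L) system: mu * (I - H (H^T H + lam I)^{-1} H^T)
        inner = np.linalg.solve(H.T @ H + lam * np.eye(L), H.T)
        return mu * (np.eye(n) - H @ inner)
    # Equivalent form with an (n x n) inverse: gamma * (H H^T + lam I)^{-1}
    return gamma * np.linalg.inv(H @ H.T + lam * np.eye(n))
```

Adding this matrix to the normalized Laplacian then gives a single symmetric matrix whose bottom eigenvectors solve the relaxed problem, in the same way as for plain SC.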
ESEC makes good use of an embedding function in ELM feature space instead of RKHS. Thus, the form of ESEC is similar to that of KSEC. It can be proved that there is a link between ESEC and KSEC. We have the following theorem.
Theorem 3. If the mapping in ELM is , where denotes any kernel function and L is the number of hidden nodes in ELM and ( is the parameter of kernel function ) are random sampling points from any continuous probability distribution, then ESEC is an approximation of KSEC by discretizing the embedding function .
Proof. Since in RKHS, can be denoted as where . Let and ; then Thus, we approximately derive the embedding function of ESEC from KSEC. This completes the proof of Theorem 3.
The proposed ESEC algorithm is described as follows.
Algorithm 1.
Input. The input is the training dataset and the number of clusters .
Output. The output is the class assignment matrix of cluster .
Step 1. Construct the graph Laplacian from .
Step 2. Randomly generate input weights and initiate an ELM network of hidden neurons; calculate the output matrix of the hidden layer.
Step 3. If ; let .
Else .
Step 4. Compute the matrix .
Step 5. Find the eigenvectors of corresponding to the smallest eigenvalues, which form the optimal .
Step 6. Treat each row of as a new training sample, and use the K-means algorithm to cluster the training samples into clusters. Let be the final discrete class assignment matrix of cluster for training data.
Return the class assignment matrix of cluster .
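Steps 3 to 6 of Algorithm 1 can be sketched as follows, assuming that the normalized Laplacian (Step 1) and the hidden-layer output matrix (Step 2) are already available as NumPy arrays. The ridge-type penalty matrix, the parameter names `mu` and `gamma`, the plain Lloyd-style K-means helper, and both function names are assumptions for illustration; the sketch is not the authors' implementation.

```python
import numpy as np

def kmeans_rows(Z, c, rng, iters=100):
    """Plain Lloyd iterations on the rows of Z; returns integer labels."""
    centers = Z[rng.choice(len(Z), size=c, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(c)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def esec_sketch(L_norm, H, c, mu=1.0, gamma=1.0, seed=0):
    """Steps 3-6: L_norm is the (n x n) normalized Laplacian, H is (n x L)."""
    n, L = H.shape
    lam = gamma / mu
    # Steps 3-4: assumed penalty matrix, inverting the smaller Gram matrix.
    if L < n:
        P = mu * (np.eye(n) - H @ np.linalg.solve(H.T @ H + lam * np.eye(L), H.T))
    else:
        P = gamma * np.linalg.inv(H @ H.T + lam * np.eye(n))
    M = L_norm + P
    # Step 5: eigenvectors of the c smallest eigenvalues (symmetrized for safety).
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    F = vecs[:, :c]
    # Step 6: cluster the rows of the relaxed assignment matrix.
    labels = kmeans_rows(F, c, np.random.default_rng(seed))
    return F, labels
```

Because K-means depends on its random initialization, repeated runs with different seeds can give different labelings, which is consistent with averaging over repeated trials in the experiments.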
3.3. Computational Complexity
From Algorithm 1, we can see that the most costly computation is computing the matrix and carrying out the eigendecomposition of . If , computing needs to obtain the inversion of , whose computational complexity is . In addition, the computational complexity of eigenvalue decomposition of is . Thus, the total computational complexity of ESEC is , where . Correspondingly, for KSEC, computational complexity of calculating is and its total computational complexity is . Consequently, ESEC has lower computational complexity than KSEC.
3.4. Clustering for Out-of-Sample Data
By performing Algorithm 1, we can obtain the cluster assignment matrix for the training data. Thus, can be easily computed by using formula (17). Then, for any new data point , we can obtain the prediction result . In this paper, we use the spectral rotation method to calculate the discrete cluster assignment vector for . Firstly, an orthogonal matrix is computed by the following spectral rotation method: where and denote the and vectors of all 1s, respectively. is an orthogonal matrix and is defined by where represents a diagonal matrix with the same diagonal elements as the square matrix . Secondly, the discrete cluster assignment vector for is calculated as follows: Finally, the class of the data point is where is the ith element in the vector .
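The out-of-sample step can be illustrated with a simplified stand-in. The sketch below does not reproduce the spectral rotation formulas of this subsection; instead it aligns the relaxed training embedding with the discrete training labels through an orthogonal Procrustes fit, then labels a new point by the largest rotated score. The function names and the Procrustes substitution are assumptions made for illustration only.

```python
import numpy as np

def fit_rotation(F, labels, c):
    """Orthogonal R minimizing ||F R - Y||_F, with Y the one-hot label matrix."""
    Y = np.eye(c)[labels]                 # (n, c) one-hot encoding
    U, _, Vt = np.linalg.svd(F.T @ Y)     # orthogonal Procrustes solution
    return U @ Vt

def predict_cluster(h_new, beta, R):
    """Embed new hidden-layer rows with beta, rotate, and take the arg max."""
    scores = h_new @ beta @ R
    return np.argmax(scores, axis=-1)
```

Here `h_new` holds the hidden-layer outputs of the unseen points, computed with the same random input weights as for the training data, and `beta` is the output-weight matrix obtained from the training embedding.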
4. Experiments
To evaluate the in-sample and out-of-sample clustering performance of different clustering methods, we test all algorithms on UCI datasets (Iris, Glass, Wine, WPBC, SpectfHeart, and Isolet (http://archive.ics.uci.edu/ml/datasets.html)), face recognition datasets (Yale (http://vision.ucsd.edu/~leekc/ExtYaleDatabase/Yale%20Face%20Database.htm), ORL (http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html)), digit recognition datasets (USPS (http://wwwi6.informatik.rwthaachen.de/~keysers/usps.html)), and object recognition datasets (COIL20 (http://www.cs.columbia.edu/CAVE/software/softlib/coil20.php)). Some datasets are resized, and the basic information of the datasets is listed in Table 1. All the experiments have been performed in MATLAB R2013a running on a 3.10 GHz Intel Core™ i5-2400 with 4 GB RAM.

In in-sample clustering, we assign a cluster label to each unlabeled in-sample data point. The proposed ESEC algorithm is compared with K-means (KM) clustering, SC [3], SEC [10], and KSEC. For KM, an EM-like algorithm is used to assign cluster labels as in [16]. In out-of-sample clustering, KM assigns each unseen data point to the closest cluster center learned from the in-sample data points. We use the proposed out-of-sample approach to cope with unseen data for ESEC. A similar out-of-sample method is also used for KSEC and SEC with their respective embedding functions. Since the Nyström method [11] can be used to deal with unseen data, we also compare ESEC with the Nyström method for out-of-sample SC.
4.1. Experimental Setup
Each dataset is randomly divided into seen and unseen samples, and we use the seen data to obtain the optimal parameters of different clustering methods by cross-validation. Then, we use the unseen data to test the performance of all algorithms using the obtained optimal parameters. In the experiments, 80% of the data are randomly selected as seen data and the remaining data are used as unseen data.
The self-tuning SC method [17] is used to determine the scale parameter $\sigma$ in (1) for SC, SEC, KSEC, and ESEC. The parameter K of the K-nearest-neighbor graph is set to 5 empirically. For fair comparisons, we set the parameters in SEC, KSEC, and ESEC as 1 and select the parameter in these methods from .
For the Nyström SC method, we set the same , where , with being the mean value of the square distance between the in-sample data as suggested in [18] and . For the in-sample clustering, the best clustering results obtained with the best parameters for SEC, KSEC, and ESEC are reported in Table 2. For ESEC, we use the RBF kernel as the hidden node function, and a grid search over the number of hidden nodes on is conducted to find the optimal result by using fivefold cross-validation. With the optimal parameters from the in-sample setting, the results for the out-of-sample clustering are obtained and reported in Table 4.


It should be noted that the results of all clustering methods rely on the initialization. To get statistical results for different parameters and random partitions, all clustering algorithms are independently repeated 50 times, and we report the mean clustering result and standard deviation using the best parameters on the seen and unseen data. In the experiments, we set the number of clusters as the number of classes in each dataset. The clustering accuracy (ACC) (refer to [19] for its definition) and time cost are used to evaluate the clustering performance.
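The clustering accuracy can be computed as in the following sketch, which assumes the usual definition: the fraction of points whose cluster label agrees with the true class label under the best one-to-one mapping between clusters and classes. The precise definition used in the experiments is the one given in [19]; the function name and the brute-force search over permutations are illustrative assumptions.

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_labels, c):
    """Best-match accuracy; both label lists use integers 0..c-1."""
    counts = [[0] * c for _ in range(c)]  # counts[cluster][class]
    for t, p in zip(true_labels, cluster_labels):
        counts[p][t] += 1
    # Try every one-to-one assignment of clusters to classes.
    best = max(
        sum(counts[p][perm[p]] for p in range(c))
        for perm in permutations(range(c))
    )
    return best / len(true_labels)
```

The exhaustive search grows factorially with the number of clusters, so it is only practical for small c; for datasets with many classes, an assignment solver such as the Hungarian algorithm would be the natural replacement.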
4.2. In-Sample Clustering Experiments
To compare the clustering performance of the various algorithms, we report the in-sample clustering results on all the datasets in Table 2. As can be seen from Table 2, SC outperforms KM for most of the low-dimensional datasets, such as Iris, Glass, Wine, WPBC, and SpectfHeart. However, it can perform worse on the high-dimensional datasets, such as Yale and Isolet. This is because SC prefers datasets that have a clear manifold structure in a low-dimensional space. If this assumption does not hold, it can perform even worse than the KM algorithm. The performance of SEC is better than that of KM and SC on Glass, WPBC, USPS, and Isolet. However, it does not achieve an overwhelming advantage for in-sample clustering on all datasets. One possible explanation is that SEC improves SC by introducing linear embedding functions, which are only applicable to data with linear or approximately linear structures. KSEC and ESEC significantly outperform KM, SC, and SEC in most cases. KSEC and ESEC each achieve 5 best in-sample clustering results among all datasets. It should be noted that KSEC and ESEC also have superior clustering performance for low-dimensional datasets, since they both introduce regularization terms into SC and can be considered as regularized SC. Compared with KSEC, ESEC achieves better or at least comparable results, which demonstrates that the proposed ESEC method is effective on all the datasets and is able to handle datasets that do not have a clear manifold structure in a low-dimensional space. The running time of all algorithms is listed in Table 3. It is shown that ESEC runs much faster than KSEC, which is consistent with the theoretical analysis, and the running time of ESEC and KSEC is lower than that of KM, SC, and SEC for most of the datasets. Overall, compared with other methods, the proposed ESEC method has better or comparable in-sample performance at a much faster training speed.
In Figure 1, we further analyze the sensitivity of the in-sample clustering performance of SEC, KSEC, and ESEC with respect to the parameter . We can see from Figure 1 that ESEC prefers a large value for on Yale and ORL, and its performance on these datasets is relatively stable when is set to a large value. In contrast, ESEC and KSEC favor a small value of for COIL20 and Isolet. We can observe that ESEC outperforms KSEC and SEC over a wide range of ; that is, the clustering accuracy of ESEC is less sensitive to the parameter for most of the datasets when compared with SEC and KSEC.
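[Figure 1, panels (a)-(f): in-sample clustering performance of SEC, KSEC, and ESEC with respect to the regularization parameter.]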
4.3. Out-of-Sample Clustering Experiments
We also study the performance of KM, Nyström SC, SEC, KSEC, and ESEC for the out-of-sample extension. Table 4 shows the clustering accuracies of these methods for the out-of-sample clustering on all the datasets. The optimal parameters of SEC, KSEC, and ESEC are determined by cross-validation from the in-sample clustering. From Table 4, it can be seen that SEC, KSEC, and ESEC significantly outperform the Nyström method for out-of-sample clustering. The reason is that the Nyström method utilizes the Nyström extension to evaluate the similarity matrix between the unseen data, which might be inaccurate or even deviate seriously. In contrast, our proposed framework aims at minimizing the error between the cluster assignment matrix and the low-dimensional embedding of the data, which is feasible for handling real-world data. Thus, ESEC has the natural ability to solve out-of-sample extension problems. In addition, KM is sharply degraded on Yale and ORL compared with the corresponding results in Table 2. This is because the unseen face data vary considerably from the seen data. On the other hand, the clustering accuracies of ESEC are comparable to the in-sample testing results, which validates that ESEC has better generalization performance. ESEC achieves 6 best clustering results among all ten testing results and has comparable results on the rest of the datasets when compared to KSEC. Consequently, the proposed ESEC algorithm provides a new way to cope with out-of-sample data in clustering tasks.

5. Conclusion
In this paper, we propose a general spectral embedded clustering framework based on the objective function of SC, from which SEC, KSEC, and ESEC can all be derived by using different embedding functions. By virtue of ELM, the fast spectral nonlinearly embedded clustering algorithm (ESEC) is proposed, which can naturally solve the out-of-sample extension problem for clustering tasks. Experimental results on benchmark datasets validate the effectiveness and efficiency of the proposed ESEC method for both in-sample and out-of-sample clustering. In the future, we intend to develop a new semi-supervised clustering framework by incorporating pairwise constraints into the present framework and to propose semi-supervised clustering algorithms based on spectral nonlinearly embedded clustering models.
Notations
:  The input d-dimensional Euclidean space 
:  The output 0-1 binary space 
:  The number of total training data points 
:  The number of classes that the samples belong to 
:  is the training data matrix 
:  is the 0-1 class assignment matrix; is the label vector of , and all components of are 0s except one being 1
:  is the embedding vector function 
:  Kernel function of variables and 
:  Kernel matrix 
:  ; Its columns are the coefficients of kernel functions to represent the embedding function 
:  The trace of the matrix , that is, the sum of the diagonal elements of the matrix . 
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61403394) and the Fundamental Research Funds for the Central Universities (no. 2014QNA46).
References
[1] A. P. Benavent, F. E. Ruiz, and J. M. Sáez, “Learning Gaussian mixture models with entropy-based criteria,” IEEE Transactions on Neural Networks, vol. 20, no. 11, pp. 1756–1771, 2009.
[2] K. Zhang and J. T. Kwok, “Simplifying mixture models through function approximation,” IEEE Transactions on Neural Networks, vol. 21, no. 4, pp. 644–658, 2010.
[3] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
[4] M. Filippone, F. Camastra, F. Masulli, and S. Rovetta, “A survey of kernel and spectral methods for clustering,” Pattern Recognition, vol. 41, no. 1, pp. 176–190, 2008.
[5] L. Xu, J. Neufeld, B. Larson, and D. Schuurmans, “Maximum margin clustering,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS '05), pp. 1537–1544, Vancouver, Canada, 2005.
[6] K. Zhang, I. W. Tsang, and J. T. Kwok, “Maximum margin clustering made practical,” IEEE Transactions on Neural Networks, vol. 20, no. 4, pp. 583–596, 2009.
[7] Y. Li, I. W. Tsang, J. T. Kwok, and Z. Zhou, “Tighter and convex maximum margin clustering,” in Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 344–351, Clearwater Beach, Fla, USA, 2009.
[8] M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.
[9] M. Belkin, P. Niyogi, and V. Sindhwani, “Manifold regularization: a geometric framework for learning from labeled and unlabeled examples,” Journal of Machine Learning Research, vol. 7, pp. 2399–2434, 2006.
[10] F. Nie, Z. Zeng, I. W. Tsang, D. Xu, and C. Zhang, “Spectral embedded clustering: a framework for in-sample and out-of-sample spectral clustering,” IEEE Transactions on Neural Networks, vol. 22, no. 11, pp. 1796–1808, 2011.
[11] C. Fowlkes, S. Belongie, F. Chung, and J. Malik, “Spectral grouping using the Nyström method,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 214–225, 2004.
[12] Y. Bengio, J.-F. Paiement, P. Vincent, O. Delalleau, N. L. Roux, and M. Ouimet, “Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering,” in Proceedings of the 17th Annual Conference on Neural Information Processing Systems (NIPS '03), pp. 126–133, Whistler, Canada, December 2003.
[13] S. X. Yu and J. Shi, “Multiclass spectral clustering,” in Proceedings of the 9th IEEE International Conference on Computer Vision, pp. 313–319, Beijing, China, October 2003.
[14] A. Y. Ng, M. I. Jordan, and Y. Weiss, “On spectral clustering: analysis and an algorithm,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS '01), pp. 849–856, Vancouver, Canada, 2001.
[15] G.-B. Huang, H. Zhou, X. Ding, and R. Zhang, “Extreme learning machine for regression and multiclass classification,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 2, pp. 513–529, 2011.
[16] J. Ye, Z. Zhao, and M. Wu, “Discriminative K-means for clustering,” in Proceedings of the Neural Information Processing Systems, pp. 1649–1656, Vancouver, Canada, 2007.
[17] L. Zelnik-Manor and P. Perona, “Self-tuning spectral clustering,” in Proceedings of the 18th Annual Conference on Neural Information Processing Systems (NIPS '04), pp. 1601–1608, Vancouver, Canada, December 2004.
[18] L. Duan, D. Xu, I. W. Tsang, and J. Luo, “Visual event recognition in videos by learning from web data,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 1959–1966, San Francisco, Calif, USA, June 2010.
[19] W. Chen and G. Feng, “Spectral clustering: a semi-supervised approach,” Neurocomputing, vol. 77, no. 1, pp. 229–242, 2012.
Copyright
Copyright © 2016 Mingming Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.