Research Article  Open Access
Low-Rank Kernel-Based Semisupervised Discriminant Analysis
Abstract
Semisupervised Discriminant Analysis (SDA) aims at dimensionality reduction with both limited labeled data and copious unlabeled data, but it may fail to discover the intrinsic geometric structure when the data manifold is highly nonlinear. The kernel trick is widely used to map the original nonlinearly separable problem to a space of intrinsically higher dimensionality where the classes are linearly separable. Inspired by low-rank representation (LRR), we propose a novel kernel SDA method called low-rank kernel-based SDA (LRKSDA), in which LRR is used as the kernel representation. Since LRR can capture the global data structure and obtain the lowest-rank representation in a parameter-free way, the low-rank kernel method is effective and robust for various kinds of data. Extensive experiments on public databases show that the proposed LRKSDA dimensionality reduction algorithm can achieve better performance than other related kernel SDA methods.
1. Introduction
For many real-world data mining and pattern recognition applications, labeled data are very expensive or difficult to obtain, while unlabeled data are often copious and readily available. How to use both labeled and unlabeled data to improve performance has therefore become a significant problem [1, 2]. Recently, semisupervised dimensionality reduction, which can be applied directly to the whole database, has attracted considerable attention [3]. Inspired by semisupervised learning (SSL), many methods have been put forward to relieve the so-called small sample size (SSS) problem of LDA [4, 5]. Semisupervised Discriminant Analysis (SDA) was first proposed by Cai et al. [2]; it can easily resolve the out-of-sample problem [6] and is more suitable for real-world applications. In the SDA algorithm, the labeled samples are used to maximize the separability of the different classes and the unlabeled ones to estimate the intrinsic geometric information of the data.
Semisupervised Discriminant Analysis may fail to discover the intrinsic geometric structure when the data manifold is highly nonlinear [2, 7]. The kernel trick [8] has been widely used to generalize linear dimensionality reduction algorithms to nonlinear ones; it maps the original nonlinearly separable problem to a space of intrinsically higher dimensionality where the classes are linearly separable. Kernel SDA (KSDA) [2, 7] can therefore discover the underlying subspace more exactly in the feature space, which yields a better subspace for the classification task through a nonlinear learning technique. Cai et al. discussed how to perform SDA in a Reproducing Kernel Hilbert Space (RKHS), which gives rise to kernel SDA [2]. You et al. presented a first approach to optimizing the parameters of a kernel, which maps the original class distributions to a space where they are optimally (with respect to Bayes) separated by a hyperplane [7]. A kernel-based nonlinear discriminant analysis algorithm was proposed to address the fundamental limitations of LDA [9]. A KFDA kernel parameter optimization criterion was presented for simultaneously maximizing the uniformity of class-pair separabilities and the class separability in kernel space [10]. To overcome the nonlinear dimensionality reduction problem and the restriction of LFDA in adopting multiple features, Wang and Sun proposed a dimensionality reduction algorithm called multiple kernel local Fisher discriminant analysis (MKLFDA), based on multiple kernel learning [11]. The kernelization of graph embedding applies the kernel trick to the linear graph embedding algorithm to handle data with nonlinear distributions [12]. Weinberger et al. described an algorithm for nonlinear dimensionality reduction based on semidefinite programming and kernel matrix factorization, which learns a kernel matrix for high-dimensional data that lie on or near a low-dimensional manifold [13].
Low-rank matrix decomposition and completion have recently become very popular since Yang et al. and Chen et al. showed that a robust estimate of an underlying subspace can be obtained by decomposing the observations into a low-rank matrix and a sparse error matrix [14, 15]. Recently, Liu et al. proposed a low-rank representation method that is robust to noise and data corruption because of its ability to separate the noise from the data set [14]. More recently, low-rank representation (LRR) [16, 17], as a promising method for capturing the underlying low-dimensional structure of data, has attracted much attention in the pattern analysis and signal processing communities. The LRR method [16–18] seeks the lowest-rank representation of all data jointly, such that each data point can be represented as a linear combination of some bases.
The major problem of kernel methods is finding proper kernel parameters. These kernel methods usually use fixed global parameters to determine the kernel matrix, so they are very sensitive to the parameter setting. In fact, the most suitable kernel parameters may vary greatly across different random distributions of the same data. Moreover, the kernel mapping of KSDA always analyzes the relationship of the data in a one-to-others mode, which emphasizes local information and lacks global constraints on the solution. These shortcomings limit the performance and efficiency of KSDA methods. To overcome the disadvantages of the traditional kernel methods, inspired by LRR, we propose a novel kernel-based Semisupervised Discriminant Analysis called low-rank kernel-based SDA (LRKSDA), in which the low-rank representation is used as the kernel method. Compared with other kernels, the low-rank kernel jointly obtains the representation of all the samples under a global low-rank constraint [19]. It is thus better at capturing the global data structure and is very robust to different random distributions of the data set. In addition, the lowest-rank representation is obtained in a parameter-free way, which is convenient and robust for various kinds of data. Extensive experiments on public databases show that our proposed LRKSDA dimensionality reduction algorithm can achieve better performance than other related methods.
The rest of the paper is organized as follows. We start with a brief overview of SDA in Section 2. We then introduce the low-rank kernel-based SDA framework in Section 3. Section 4 reports the experimental results on real-world databases. In Section 5, we conclude the paper.
2. Overview of SDA
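Given a set of samples $X = [x_1, x_2, \ldots, x_m]$, where $x_i \in \mathbb{R}^n$, the first $l$ samples are labeled as $\{(x_i, y_i)\}_{i=1}^{l}$, and the remaining $m - l$ are unlabeled. They all belong to $c$ classes. SDA [2] seeks a projection matrix $A$, and the prior assumption of consistency is incorporated through a regularizer term. For a projective vector $a$, the objective function is as follows:
$$a_{\mathrm{opt}} = \arg\max_{a} \frac{a^{T} S_b a}{a^{T} S_t a + \alpha J(a)}, \tag{1}$$
where $S_b$ and $S_t$ are the between-class scatter and total scatter matrices, and $S_w$ is defined as the within-class scatter matrix:
$$S_b = \sum_{k=1}^{c} l_k \left(\mu^{(k)} - \mu\right)\left(\mu^{(k)} - \mu\right)^{T}, \quad S_t = \sum_{i=1}^{l} (x_i - \mu)(x_i - \mu)^{T}, \quad S_w = \sum_{k=1}^{c} \sum_{i=1}^{l_k} \left(x_i^{(k)} - \mu^{(k)}\right)\left(x_i^{(k)} - \mu^{(k)}\right)^{T}, \tag{2}$$
where $\mu$ is the mean vector of the total sample, $l_k$ is the number of samples in the $k$th class, $\mu^{(k)}$ is the average vector of the $k$th class, and $x_i^{(k)}$ is the $i$th sample in the $k$th class.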
The parameter $\alpha$ in (1) balances the model complexity and the empirical loss. The regularizer term $J(a)$ gives us the flexibility to incorporate prior knowledge into the application. We aim at constructing a graph that incorporates the manifold structure through the available unlabeled samples [2]. The key to SSL algorithms is the prior assumption of consistency. For classification, it means that nearby samples are likely to have the same label [20]. For dimensionality reduction, it implies that nearby samples have similar embeddings (low-dimensional representations).
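Given a set of samples $\{x_i\}_{i=1}^{m}$, we can construct a graph that represents the relationship between nearby samples by the $k$NN algorithm: an edge is placed between two samples if either is among the $k$ nearest neighbors of the other. The corresponding weight matrix $S$ is defined as follows:
$$S_{ij} = \begin{cases} 1, & \text{if } x_i \in N_k(x_j) \text{ or } x_j \in N_k(x_i), \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$
where $N_k(x_i)$ denotes the set of $k$ nearest neighbors of $x_i$. Then the term $J(a)$ can be defined as follows:
$$J(a) = \sum_{i,j} \left(a^{T} x_i - a^{T} x_j\right)^{2} S_{ij} = 2 a^{T} X (D - S) X^{T} a = 2 a^{T} X L X^{T} a, \tag{4}$$
where $D$ is a diagonal matrix whose entries are the column (or row, since $S$ is symmetric) sums of $S$; that is, $D_{ii} = \sum_{j} S_{ij}$. The Laplacian matrix [21] is $L = D - S$.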
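The graph construction above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: samples are stored as columns, the weights are the symmetric 0/1 $k$NN weights, and the regularizer is taken in the standard SDA form $J(a) = \sum_{i,j}(a^{T}x_i - a^{T}x_j)^2 S_{ij}$. The function names `knn_weight_matrix` and `graph_laplacian` are hypothetical and not from the paper. The last lines check numerically that the pairwise sum equals $2a^{T}XLX^{T}a$.

```python
import numpy as np

def knn_weight_matrix(X, k):
    """Symmetric 0/1 kNN weights; X holds one sample per column (n x m)."""
    m = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * X.T @ X   # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                      # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]           # indices of the k nearest neighbors
    S = np.zeros((m, m))
    S[np.repeat(np.arange(m), k), nearest.ravel()] = 1.0
    return np.maximum(S, S.T)                           # edge if either sample is a neighbor of the other

def graph_laplacian(S):
    """L = D - S, with D the diagonal matrix of row sums of S."""
    return np.diag(S.sum(axis=1)) - S

# Numerical check of the regularizer identity on random data
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 30))
S = knn_weight_matrix(X, k=4)
L = graph_laplacian(S)
a = rng.standard_normal(5)
p = a @ X                                               # projections a^T x_i
J_direct = np.sum((p[:, None] - p[None, :]) ** 2 * S)   # pairwise form
J_matrix = 2.0 * a @ X @ L @ X.T @ a                    # Laplacian form
```

Taking the element-wise maximum of `S` and its transpose is one way to realize the "either is a neighbor of the other" rule; a mutual-neighbor graph would use the minimum instead.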
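We can then obtain the objective function of SDA with the regularizer term [2] (the constant factor of the regularizer is absorbed into $\alpha$):
$$\max_{a} \frac{a^{T} S_b a}{a^{T} \left(S_t + \alpha X L X^{T}\right) a}. \tag{5}$$
The projective vector $a$ is given by the eigenvector associated with the maximum eigenvalue of the generalized eigenvalue problem:
$$S_b a = \lambda \left(S_t + \alpha X L X^{T}\right) a. \tag{6}$$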
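A minimal sketch of solving this generalized eigenvalue problem is given below, assuming the standard SDA form $S_b a = \lambda (S_t + \alpha X L X^{T}) a$. The function name `sda_fit`, the default parameter values, and the small `ridge` term (added so that the right-hand matrix is safely invertible) are assumptions made for illustration; they are not taken from the paper. The labeled samples are assumed to be the first columns of `X`.

```python
import numpy as np

def sda_fit(X, y, L, alpha=0.1, dim=1, ridge=1e-6):
    """X: n x m, one sample per column; y: labels of the first len(y) columns;
    L: m x m graph Laplacian over all samples.
    Returns an n x dim projection and the matching eigenvalues."""
    n = X.shape[0]
    y = np.asarray(y)
    Xl = X[:, :len(y)]                                  # labeled samples
    mu = Xl.mean(axis=1, keepdims=True)
    St = (Xl - mu) @ (Xl - mu).T                        # total scatter
    Sb = np.zeros((n, n))                               # between-class scatter
    for c in np.unique(y):
        Xc = Xl[:, y == c]
        diff = Xc.mean(axis=1, keepdims=True) - mu
        Sb += Xc.shape[1] * (diff @ diff.T)
    # Right-hand matrix; the ridge term is an added assumption for invertibility
    M = St + alpha * (X @ L @ X.T) + ridge * np.eye(n)
    # Whitening: with M = C C^T, Sb a = lambda M a becomes a symmetric eigenproblem
    C_inv = np.linalg.inv(np.linalg.cholesky(M))
    w, V = np.linalg.eigh(C_inv @ Sb @ C_inv.T)
    order = np.argsort(w)[::-1][:dim]                   # largest eigenvalues first
    return C_inv.T @ V[:, order], w[order]

# Usage on synthetic two-class data with 5 labeled samples per class
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, (3, 20))
X1 = rng.normal(4.0, 1.0, (3, 20))
X = np.hstack([X0[:, :5], X1[:, :5], X0[:, 5:], X1[:, 5:]])
y = [0] * 5 + [1] * 5
m = X.shape[1]
L = m * np.eye(m) - np.ones((m, m))                     # Laplacian of a complete graph, for the demo only
A, evals = sda_fit(X, y, L, alpha=0.1, dim=1)
```

The complete-graph Laplacian in the usage example is a stand-in; in SDA the Laplacian would come from the $k$NN graph over labeled and unlabeled samples.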
3. Low-Rank Kernel-Based SDA Framework
3.1. Low-Rank Representation
Yan and Wang [22] proposed using sparse representation (SR) to construct the $\ell_1$-graph [23] by solving an $\ell_1$ optimization problem. However, the $\ell_1$-graph lacks global constraints, which greatly reduces performance when the data are grossly corrupted. To address this drawback, Liu et al. proposed the low-rank representation and used it to construct the affinities of an undirected graph (here called the LR-graph) [19]. It jointly obtains the representation of all the samples under a global low-rank constraint, and thus it is better at capturing the global data structure [24].
Let $X = [x_1, x_2, \ldots, x_m]$ be a set of samples; each column is a sample that can be represented by a linear combination of the atoms of a dictionary [19]. Here, we select the samples themselves as the dictionary:
$$X = XZ,$$
where $Z = [z_1, z_2, \ldots, z_m]$ is the coefficient matrix with each $z_i$ being the representation coefficient of $x_i$. Unlike SR, which may not capture the global structure of the data, LRR seeks the lowest-rank solution by solving the following optimization problem [19]:
$$\min_{Z} \operatorname{rank}(Z) \quad \text{s.t. } X = XZ.$$
The above optimization problem can be relaxed to the following convex optimization [25]:
$$\min_{Z} \|Z\|_{*} \quad \text{s.t. } X = XZ.$$
Here $\|\cdot\|_{*}$ denotes the nuclear norm (or trace norm) [26] of a matrix, that is, the sum of the matrix's singular values. Considering the noise or corruption present in real-world applications, a more reasonable objective function is
$$\min_{Z, E} \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t. } X = XZ + E,$$
where the norm on $E$ can be the $\ell_{2,1}$ norm or the $\ell_1$ norm. In this paper we choose the $\ell_{2,1}$ norm as the error term, which is defined as $\|E\|_{2,1} = \sum_{j} \sqrt{\sum_{i} E_{ij}^{2}}$. The parameter $\lambda$ is used to balance the effect of the low-rank term and the error term. The optimal solution $Z^{*}$ can be obtained via the inexact augmented Lagrange multipliers method [27, 28].
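The inexact augmented Lagrange multiplier iteration mentioned above can be sketched as follows, assuming the $\ell_{2,1}$ error term and the standard splitting with an auxiliary variable $J = Z$. The function names, default step parameters (`rho`, `mu`, `mu_max`), tolerance, and iteration cap are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(Q, tau):
    """Column-wise shrinkage: proximal operator of tau * l2,1 norm."""
    norms = np.linalg.norm(Q, axis=0)
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return Q * scale

def lrr(X, lam=1.0, rho=1.1, mu=1e-2, mu_max=1e8, tol=1e-6, max_iter=1000):
    """Sketch of min ||Z||_* + lam * ||E||_{2,1}  s.t.  X = XZ + E (columns are samples)."""
    n, m = X.shape
    Z = np.zeros((m, m)); J = np.zeros((m, m)); E = np.zeros((n, m))
    Y1 = np.zeros((n, m)); Y2 = np.zeros((m, m))        # Lagrange multipliers
    XtX = X.T @ X
    inv = np.linalg.inv(np.eye(m) + XtX)
    for _ in range(max_iter):
        J = svt(Z + Y2 / mu, 1.0 / mu)                  # low-rank auxiliary variable
        Z = inv @ (XtX - X.T @ E + J + (X.T @ Y1 - Y2) / mu)
        E = l21_shrink(X - X @ Z + Y1 / mu, lam / mu)   # column-sparse error
        R1 = X - X @ Z - E                              # residual of X = XZ + E
        R2 = Z - J                                      # residual of Z = J
        Y1 += mu * R1
        Y2 += mu * R2
        mu = min(rho * mu, mu_max)
        if max(np.abs(R1).max(), np.abs(R2).max()) < tol:
            break
    return Z, E

# Usage: samples drawn from two 2-dimensional subspaces of R^10
rng = np.random.default_rng(0)
B0, B1 = rng.standard_normal((10, 2)), rng.standard_normal((10, 2))
X = np.hstack([B0 @ rng.standard_normal((2, 20)), B1 @ rng.standard_normal((2, 20))])
Z, E = lrr(X, lam=1.0)
```

The linear system for `Z` has a fixed matrix $I + X^{T}X$, so its inverse is computed once outside the loop; for large sample counts a factorization would be the more economical choice.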
3.2. Kernel SDA
Semisupervised Discriminant Analysis may fail to discover the intrinsic geometry structure when the data manifold is highly nonlinear. The kernel trick is a popular technique in machine learning which uses a kernel function to map samples to a high dimensional space [8, 29, 30]. By using the kernel trick, we can nonlinearly map the original data to the kernel feature space.
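Let $\phi$ be a nonlinear mapping from $\mathbb{R}^n$ into a feature space $\mathcal{F}$. For any two points $x_i$ and $x_j$, we use a kernel function $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$ to map the data into a kernel feature space. Some commonly used kernels are the Gaussian radial basis function (RBF) kernel $K(x, y) = \exp\left(-\|x - y\|^{2} / 2\sigma^{2}\right)$, the polynomial kernel $K(x, y) = (1 + x^{T} y)^{p}$, and the sigmoid kernel $K(x, y) = \tanh(x^{T} y + b)$ [2, 31].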
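As a small illustration of the kernels named above, the following sketch computes Gaussian RBF and polynomial kernel matrices with NumPy. The exact parametrizations, $\exp(-\|x-y\|^2/2\sigma^2)$ and $(1 + x^{T}y)^{p}$, are the common textbook forms and are assumed here; the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2)); samples are columns."""
    sq = (np.sum(X ** 2, axis=0)[:, None]
          + np.sum(Y ** 2, axis=0)[None, :]
          - 2.0 * X.T @ Y)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))  # clip rounding negatives

def polynomial_kernel(X, Y, degree):
    """K[i, j] = (1 + x_i^T y_j)^degree; samples are columns."""
    return (1.0 + X.T @ Y) ** degree

rng = np.random.default_rng(0)
Xk = rng.standard_normal((4, 10))
K_rbf = rbf_kernel(Xk, Xk, sigma=1.5)
K_poly = polynomial_kernel(Xk, Xk, degree=2)
```

Both functions take a second sample matrix `Y`, so the same code yields the kernel column between the training set and a new point when embedding out-of-sample data.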
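Let $\Phi$ denote the data matrix in the kernel space: $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_m)]$. The projective vectors $a_1, \ldots, a_d$ are the eigenvectors of the problem in (6), which give the transformation matrix $A = [a_1, \ldots, a_d]$. The number of feature dimensions $d$ can be chosen freely. A data point $x$ can then be embedded into the $d$-dimensional feature space by $x \to y = A^{T} K(:, x)$, where $K(:, x) = [K(x_1, x), \ldots, K(x_m, x)]^{T}$.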
Kernel SDA (KSDA) [2, 7] can discover the underlying subspace more exactly in the feature space. It results in a better subspace for the classification task by a nonlinear learning technique.
3.3. Low-Rank Kernel-Based SDA
The major problem of all these kernel methods is finding proper kernel parameters. They usually use fixed global parameters to determine the kernel matrix, which makes them very sensitive to the parameter setting. In fact, the most suitable kernel parameters may vary greatly across different random distributions, even for the same data. Moreover, the traditional kernel mapping always analyzes the relationship of the data in a one-to-others mode, which emphasizes local information and lacks global constraints on the solution. These shortcomings limit the performance and efficiency of KSDA methods. To overcome them, inspired by low-rank representation, we propose a novel kernel-based Semisupervised Discriminant Analysis (LRKSDA) in which LRR is used as the kernel representation.
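Let $\phi$ be a low-rank mapping from $\mathbb{R}^n$ into a low-rank kernel feature space $\mathcal{F}$. For the database $X$, a reasonable objective function is as follows:
$$\min_{Z, E} \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t. } X = XZ + E.$$
The optimal solution $Z^{*}$ is the coefficient matrix with each $z_i^{*}$ being the low-rank representation coefficient of $x_i$.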
Let $Z^{*}$ denote the data matrix in the kernel space. The projective vectors $a_1, \ldots, a_d$ are the eigenvectors of the problem in (6), and the transformation matrix is $A = [a_1, \ldots, a_d]$. The number of feature dimensions $d$ can be chosen freely. A data point $x$ can then be embedded into the $d$-dimensional feature space by $x \to y = A^{T} z$, where $z$ is the low-rank representation of $x$.
Since the low-rank representation jointly obtains the representation of all the samples under a global low-rank constraint, it captures the global data structure, and the lowest-rank representation is obtained in a parameter-free way, which is convenient and robust for various kinds of data. The low-rank kernel-based SDA algorithm can therefore improve performance considerably. It consists of three steps.
First, map the labeled and unlabeled data to the LR-graph kernel space. Second, execute the SDA algorithm for dimensionality reduction. Finally, execute the nearest neighbor method for the final classification in the derived low-dimensional feature subspace. The procedure of low-rank kernel-based SDA is described as follows.
Algorithm 1 (low-rank kernel-based SDA algorithm).
Input. The whole data set $X = [x_1, x_2, \ldots, x_m]$, where $l$ samples are labeled and $m - l$ are unlabeled.
Output. The classification results.
Step 1. Map the labeled and unlabeled data to the feature space by the LRR algorithm:
$$\min_{Z, E} \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t. } X = XZ + E.$$
Step 2. Implement the SDA algorithm for dimensionality reduction.
Step 3. Execute the nearest neighbor method for final classification.
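The three steps of Algorithm 1 can be sketched end to end as below. This is an illustration under several assumptions rather than the authors' implementation: Step 1 uses the closed-form solution $Z = V_r V_r^{T}$ of the noiseless problem $\min \|Z\|_{*}$ s.t. $X = XZ$ instead of the noise-robust formulation solved by inexact ALM; a small ridge term is added in Step 2 for invertibility; labeled samples are assumed to be the first columns; and all function names and default values are hypothetical.

```python
import numpy as np

def lrr_noiseless(X, rank_tol=1e-10):
    """Step 1 (simplified): minimizer of ||Z||_* s.t. X = XZ, from the skinny SVD of X."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > rank_tol * s[0]))                # numerical rank
    return Vt[:r].T @ Vt[:r]

def knn_laplacian(F, k):
    """Laplacian of the symmetric 0/1 kNN graph over the columns of F."""
    sq = np.sum(F ** 2, axis=0)
    d = sq[:, None] + sq[None, :] - 2.0 * F.T @ F
    np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]
    S = np.zeros_like(d)
    S[np.repeat(np.arange(d.shape[0]), k), idx.ravel()] = 1.0
    S = np.maximum(S, S.T)
    return np.diag(S.sum(axis=1)) - S

def sda_projection(F, y, L, alpha, dim, ridge=1e-6):
    """Step 2: solve Sb a = lambda (St + alpha F L F^T) a on the features F."""
    y = np.asarray(y)
    Fl = F[:, :len(y)]
    mu = Fl.mean(axis=1, keepdims=True)
    St = (Fl - mu) @ (Fl - mu).T
    Sb = np.zeros_like(St)
    for c in np.unique(y):
        Fc = Fl[:, y == c]
        diff = Fc.mean(axis=1, keepdims=True) - mu
        Sb += Fc.shape[1] * (diff @ diff.T)
    M = St + alpha * (F @ L @ F.T) + ridge * np.eye(F.shape[0])
    C_inv = np.linalg.inv(np.linalg.cholesky(M))
    w, V = np.linalg.eigh(C_inv @ Sb @ C_inv.T)
    return C_inv.T @ V[:, np.argsort(w)[::-1][:dim]]

def lrksda_predict(X, y, k=4, alpha=0.1, dim=1):
    """Returns predicted labels for the unlabeled columns X[:, len(y):]."""
    Z = lrr_noiseless(X)                                        # Step 1: low-rank features
    A = sda_projection(Z, y, knn_laplacian(Z, k), alpha, dim)   # Step 2: SDA
    Y = A.T @ Z                                                 # embedded samples
    l = len(y)
    dist = np.linalg.norm(Y[:, l:, None] - Y[:, None, :l], axis=0)
    return np.asarray(y)[np.argmin(dist, axis=1)]               # Step 3: nearest neighbor

# Usage: two classes lying in different subspaces, 6 labeled samples per class
rng = np.random.default_rng(0)
X0 = rng.standard_normal((10, 2)) @ rng.normal(3.0, 0.5, (2, 20))
X1 = rng.standard_normal((10, 2)) @ rng.normal(3.0, 0.5, (2, 20))
X = np.hstack([X0[:, :6], X1[:, :6], X0[:, 6:], X1[:, 6:]])
y = [0] * 6 + [1] * 6
pred = lrksda_predict(X, y)
```

Each column of `Z` plays the role of the kernel-space representation of the corresponding sample, so the SDA step operates on an $m \times m$ feature matrix regardless of the original dimensionality.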
4. Experiments and Analysis
In this section, we conduct extensive experiments to examine the effectiveness of the low-rank kernel-based SDA algorithm. The simulation experiments are conducted in the MATLAB 7.11.0 (R2010b) environment on a computer with an AMD Phenom(tm) II P960 1.79 GHz CPU and 2 GB RAM.
4.1. Experiment Overview
4.1.1. Databases
The proposed LRKSDA is tested on six real-world databases, including three face databases and three University of California Irvine (UCI) databases. In these experiments, we normalize each sample to unit norm.
(1) Extended Yale Face Database B [2]. This database has 38 individuals and around 64 near-frontal images under different illuminations per individual. Each face image is resized to 32 × 32 pixels. We select the first 20 persons and choose 20 samples of each subject.
(2) ORL Database [22]. The ORL database contains 10 different images of each of 40 distinct subjects. The images were taken at different times, varying the lighting, facial expressions, and facial details. Each face image is manually cropped and resized to 32 × 32 pixels, with 256 grey levels per pixel.
(3) CMU PIE Face Database [2]. It contains 68 subjects with 41,368 face images. The face images were captured under varying poses, illuminations, and expressions. Each image is resized to 32 × 32 pixels. We select the first 20 persons and choose 20 samples per subject.
(4) Musk (Version 2) Data Set 2. This database contains 2 classes and 6598 instances with 166 features. Here, we randomly select 300 examples for the experiments.
(5) Seeds Data Set. It contains 210 instances of three different wheat varieties. A soft X-ray technique and the GRAINS package were used to construct all seven real-valued attributes.
(6) SPECT Heart Data Set. The database describes the diagnosis of cardiac Single Photon Emission Computed Tomography (SPECT) images. Each patient is classified into one of two categories: normal and abnormal. The database of 267 SPECT image sets was processed to extract features that summarize the original SPECT images. The pattern was further processed to obtain 22 binary feature patterns.
4.1.2. Compared Algorithms
To demonstrate how the semisupervised dimensionality reduction performance can be improved by low-rank kernel-based SDA, we include the SDA, KSDA1, and KSDA2 algorithms for comparison. In all experiments, the number of nearest neighbors in the $k$NN regularizer graph is set to 4.
(1) KSDA1 Algorithm. KSDA1 algorithm is the KSDA with Gaussian radial basis function (RBF) kernel .
(2) KSDA2 Algorithm. KSDA2 algorithm is the KSDA which uses polynomial kernel . Here, .
The classification accuracy is influenced by the kernel parameters. So after comparing, we choose a proper kernel parameters and for the KSDA1 and KSDA2 algorithm in each database in the following pairs, respectively, where is for Extended Yale Face Database B, is for ORL database, is for CMU PIE database, is for Musk database, is for Seeds Data Set, and is for SPECT Heart Data Set, respectively. Since the most suitable kernel parameters vary greatly at different random distribution even if they are for the same data, these kernel parameters are relatively suitable after comparing by many times’ runs.
4.2. Experiment 1: Different Algorithms Performances
To examine the effectiveness of the proposed LRKSDA algorithm, we conduct experiments on the six public databases. In our experiments, we randomly select 30% of the samples from each class as the labeled samples and evaluate the performance with different numbers of selected features. The evaluations are conducted with 20 independent runs for each algorithm, and the results are averaged. First we use the different kernel methods to obtain the kernel mapping, and then we apply the SDA algorithm for dimensionality reduction. Finally, the nearest neighbor approach is employed for the final classification in the derived low-dimensional feature subspace. For each database, the classification accuracy of the different algorithms is shown in Figure 1. Table 1 shows the performance comparison of the different algorithms. Note that the reported results are the best results over all the numbers of selected features mentioned above. From these results, we can observe the following.

Figure 1: Classification accuracy of the different algorithms on the six databases, panels (a) to (f).
In most cases, our proposed low-rank kernel-based SDA algorithm achieves the highest classification accuracy among the compared algorithms. LRKSDA achieves the best performance once the dimensionality exceeds a certain low value, and its classification accuracy is much higher than that of the other kernel SDA algorithms. It thus improves the classification performance to a large extent, which suggests that the low-rank kernel is more informative and better suited to the SDA algorithm.
Proper kernel parameters are crucial for the traditional algorithms, and because the kernel parameters of KSDA1 and KSDA2 are fixed global parameters, the two algorithms are very sensitive to different data or to different random distributions of the same data. The performance improvement of these KSDA methods is therefore not obvious. More seriously, because the labeled samples are randomly selected, the random distribution in each run may not suit the so-called proper kernel parameters of KSDA1 and KSDA2. Moreover, the traditional kernel mapping always analyzes the relationship of the data in a one-to-others mode, which emphasizes local information and lacks global constraints on the solution. This may result in poor performance in some cases, whereas the low-rank representation is better at capturing the global data structure. In addition, the lowest-rank representation is obtained in a parameter-free way, which is convenient and robust for various kinds of data. Low-rank kernel-based SDA therefore separates the different classes very well compared with the other kernel SDA methods and improves performance to a large extent, which indicates that our proposed low-rank kernel method is effective.
4.3. Experiment 2: Influence of the Label Number
We evaluate the influence of the number of labels in this part. The experiments are conducted with 20 independent runs for each algorithm, and the results are averaged. The procedure is the same as in Experiment 1. For each database, we vary the percentage of labeled samples from 10% to 50%; the recognition accuracy is shown in Tables 2 and 3, from which we observe the following.


In most cases, our proposed low-rank kernel-based SDA algorithm achieves the best results and is robust to variations in the label percentage. Some of the other compared algorithms are not as robust as LRKSDA: their classification accuracy is very poor when the label rate is low. Our proposed method is thus clearly superior to the traditional KSDA and SDA algorithms. These traditional methods may sometimes achieve good performance on some databases when the label rate is high enough, but they are not as stable as our proposed algorithm. Since labeled data are very expensive and difficult to obtain, our proposed algorithm is more robust and better suited to real-world data.
As mentioned in the previous part, since the low-rank kernel method obtains the kernel matrix in a parameter-free way, it is robust for different kinds of data. For traditional kernels such as the Gaussian radial basis function kernel and the polynomial kernel, if the structure of the data does not fit the fixed kernel parameters they use, they cannot obtain a good representation of the original data set. The low-rank kernel method is therefore much more stable across all the data sets we use. In addition, the low-rank representation jointly obtains the representation of all the samples under a global low-rank constraint, which captures the global data structure. It is therefore robust to variations in the label percentage, even when the label rate is low.
4.4. Experiment 3: Robustness to Different Types of Noise
In this test we compare the performance of the different algorithms in noisy environments. Extended Yale Face Database B and the Musk database are randomly selected for this experiment. Gaussian white noise, “salt and pepper” noise, and multiplicative noise are added to the data, respectively. The Gaussian white noise has mean 0 and variance varying from 0 to 0.1. The “salt and pepper” noise is added to the image with noise density varying from 0 to 0.1. The multiplicative noise is added to the data $I$ using the equation $J = I + n \cdot I$, where $I$ and $J$ are the original and noisy data and $n$ is uniformly distributed random noise with mean 0 and variance varying from 0 to 0.1. The percentage of labeled samples in each class is 30%. The experiments are conducted with 20 runs for each algorithm, and the results are averaged. The procedure is the same as in Experiment 1. For each type of noise, we vary the noise parameter. The results are shown in Tables 4 and 5.
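The three noise models described above can be sketched as follows. This is an assumption-laden illustration: data are taken to lie in $[0, 1]$ for the "salt and pepper" case, the multiplicative model is assumed to be $J = I + n \cdot I$ with $n$ uniform and zero-mean, and the function names are hypothetical rather than the routines used in the experiments.

```python
import numpy as np

def add_gaussian_noise(X, variance, rng):
    """Additive zero-mean Gaussian white noise with the given variance."""
    return X + rng.normal(0.0, np.sqrt(variance), X.shape)

def add_salt_and_pepper(X, density, rng, low=0.0, high=1.0):
    """Set roughly a fraction `density` of entries to low or high, half each."""
    out = X.copy()
    u = rng.random(X.shape)
    out[u < density / 2] = low
    out[(u >= density / 2) & (u < density)] = high
    return out

def add_multiplicative_noise(X, variance, rng):
    """J = I + n * I with n uniform on [-a, a]; a uniform law on [-a, a] has variance a^2 / 3."""
    half_width = np.sqrt(3.0 * variance)
    n = rng.uniform(-half_width, half_width, X.shape)
    return X + n * X

rng = np.random.default_rng(0)
X_clean = rng.random((8, 8))
X_gauss = add_gaussian_noise(X_clean, 0.05, rng)
X_sp = add_salt_and_pepper(X_clean, 0.1, rng)
X_mult = add_multiplicative_noise(X_clean, 0.05, rng)
```

With a noise parameter of 0, each function returns the input unchanged, which corresponds to the noise-free end of the ranges used in the experiment.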


As we can see, our proposed low-rank kernel-based SDA algorithm always achieves the best results, which means that our method is stable under Gaussian noise, “salt and pepper” noise, and multiplicative noise. Because of the robustness of the low-rank representation to noise, LRKSDA is much more robust than the other algorithms. As the different kinds of noise gradually increase, the performance of the traditional KSDA and SDA algorithms falls considerably, while the performance of our method is robust to these three types of noise and drops only slightly.
Notice that the noise comes from a model different from the subspaces of the original data. LRR solves the low-rank representation problem well: when the data are corrupted by arbitrary errors, LRR can still approximately recover the original data with theoretical guarantees. In other words, LRR is robust in an efficient way. Our method is therefore much more robust than the other algorithms under the three types of noise mentioned above.
5. Conclusions
In this paper, we propose a novel low-rank kernel-based SDA (LRKSDA) algorithm, which largely improves the performance of KSDA and SDA. Since the low-rank representation is better at capturing the global data structure, the LRKSDA algorithm separates the different classes very well compared with other kernel SDA methods. Our proposed low-rank kernel method is therefore effective. Empirical studies on six real-world databases show that our proposed low-rank kernel-based SDA is more robust and better suited to real-world applications.
Disclosure
Current affiliation for Baokai Zu is Computer Science Department, Worcester Polytechnic Institute, Worcester, MA 01609, USA.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 51208168), Tianjin Natural Science Foundation (no. 13JCYBJC37700), Hebei Province Natural Science Foundation (no. E2016202341), Hebei Province Natural Science Foundation (no. F2013202254 and no. F2013202102), and Hebei Province Foundation for Returned Scholars (no. C2012003038).
References
[1] D. Zhang, Z. H. Zhou, and S. Chen, Semi-Supervised Dimensionality Reduction, SDM, Minneapolis, Minn, USA, 2007.
[2] D. Cai, X. He, and J. Han, “Semisupervised discriminant analysis,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–7, Rio de Janeiro, Brazil, October 2007.
[3] Y. Zhang and D.-Y. Yeung, “Semisupervised discriminant analysis using robust path-based similarity,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
[4] M. Sugiyama, T. Idé, S. Nakajima, and J. Sese, “Semisupervised local Fisher discriminant analysis for dimensionality reduction,” Machine Learning, vol. 78, no. 1-2, pp. 35–61, 2010.
[5] Y. Song, F. Nie, C. Zhang, and S. Xiang, “A unified framework for semisupervised dimensionality reduction,” Pattern Recognition, vol. 41, no. 9, pp. 2789–2799, 2008.
[6] Y. Bengio, J. F. Paiement, P. Vincent et al., “Out-of-sample extensions for LLE, isomap, MDS, eigenmaps, and spectral clustering,” in Advances in Neural Information Processing Systems 16, pp. 177–184, MIT Press, 2004.
[7] D. You, O. C. Hamsici, and A. M. Martinez, “Kernel optimization in discriminant analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 3, pp. 631–638, 2011.
[8] K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf, “An introduction to kernel-based learning algorithms,” IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 181–201, 2001.
[9] W.-J. Zeng, X.-L. Li, X.-D. Zhang, and E. Cheng, “Kernel-based nonlinear discriminant analysis using minimum squared errors criterion for multiclass and undersampled problems,” Signal Processing, vol. 90, no. 8, pp. 2333–2343, 2010.
[10] J. Liu, F. Zhao, and Y. Liu, “Learning kernel parameters for kernel Fisher discriminant analysis,” Pattern Recognition Letters, vol. 34, no. 9, pp. 1026–1031, 2013.
[11] Z. Wang and X. Sun, “Multiple kernel local Fisher discriminant analysis for face recognition,” Signal Processing, vol. 93, no. 6, pp. 1496–1509, 2013.
[12] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin, “Graph embedding and extensions: a general framework for dimensionality reduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, pp. 40–51, 2007.
[13] K. Q. Weinberger, B. D. Packer, and L. K. Saul, “Nonlinear dimensionality reduction by semidefinite programming and kernel matrix factorization,” in Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS '05), pp. 381–388, January 2005.
[14] S. Yang, X. Wang, M. Wang, Y. Han, and L. Jiao, “Semisupervised low-rank representation graph for pattern recognition,” IET Image Processing, vol. 7, no. 2, pp. 131–136, 2013.
[15] C.-F. Chen, C.-P. Wei, and Y.-C. F. Wang, “Low-rank matrix recovery with structural incoherence for robust face recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2618–2625, IEEE, Providence, RI, USA, June 2012.
[16] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, “Robust recovery of subspace structures by low-rank representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171–184, 2013.
[17] G. Liu and S. Yan, “Latent low-rank representation for subspace segmentation and feature extraction,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV '11), pp. 1615–1622, IEEE, Barcelona, Spain, November 2011.
[18] X. Lu, Y. Wang, and Y. Yuan, “Graph-regularized low-rank representation for destriping of hyperspectral images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 51, no. 7, pp. 4009–4018, 2013.
[19] G. Liu, Z. Lin, and Y. Yu, “Robust subspace segmentation by low-rank representation,” in Proceedings of the 27th International Conference on Machine Learning (ICML '10), pp. 663–670, Haifa, Israel, June 2010.
[20] D. Zhou, O. Bousquet, T. N. Lal et al., “Learning with local and global consistency,” Advances in Neural Information Processing Systems, vol. 16, no. 16, pp. 321–328, 2004.
[21] D. M. Cvetkovic and P. Rowlinson, “Spectral graph theory,” in Topics in Algebraic Graph Theory, pp. 88–112, Cambridge University Press, 2004.
[22] S. Yan and H. Wang, “Semisupervised learning by sparse representation,” in Proceedings of the 2009 SIAM International Conference on Data Mining, pp. 792–801, SDM, 2009.
[23] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[24] C. Cortes and M. Mohri, “On transductive regression,” in Advances in Neural Information Processing Systems 19, pp. 305–312, 2007.
[25] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal of the ACM, vol. 58, no. 3, article 11, 2011.
[26] J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[27] Z. Lin, M. Chen, and Y. Ma, “The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices,” https://arxiv.org/abs/1009.5055.
[28] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, “Robust recovery of subspace structures by low-rank representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171–184, 2013.
[29] S. Yang, Z. Feng, Y. Ren, H. Liu, and L. Jiao, “Semisupervised classification via kernel low-rank representation graph,” Knowledge-Based Systems, vol. 69, no. 1, pp. 150–158, 2014.
[30] V. N. Vapnik and V. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, NY, USA, 1998.
[31] H. Nguyen, W. Yang, F. Shen, and C. Sun, “Kernel Low-Rank Representation for face recognition,” Neurocomputing, vol. 155, pp. 32–42, 2015.
Copyright
Copyright © 2016 Baokai Zu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.