Mingxia Chen, Jing Wang, Xueqing Li, Xiaolong Sun, "Robust Semi-Supervised Manifold Learning Algorithm for Classification", Mathematical Problems in Engineering, vol. 2018, Article ID 2382803, 8 pages, 2018. https://doi.org/10.1155/2018/2382803
Robust Semi-Supervised Manifold Learning Algorithm for Classification
Abstract
In recent years, manifold learning methods have been widely used in data classification to tackle the curse of dimensionality, since they can discover the intrinsic low-dimensional structure of high-dimensional data. Given partially labeled data, semi-supervised manifold learning algorithms predict the labels of the unlabeled points by taking the label information into account. However, these algorithms are not robust against noisy points, especially when the labeled data contain noise. In this paper, we propose a framework for robust semi-supervised manifold learning (RSSML) to address this problem. The noise levels of the labeled points are first estimated, and a regularization term is then constructed to reduce the impact of labeled points that contain noise. A new robust semi-supervised optimization model is obtained by adding this regularization term to the traditional semi-supervised optimization model. Numerical experiments show the improvement and efficiency of RSSML on noisy data sets.
1. Introduction
The problem of dimensionality reduction, that is, the transformation of high-dimensional data into meaningful low-dimensional features, has attracted much research interest. Recently, much effort has gone into developing effective and efficient manifold learning algorithms that can discover the intrinsic low-dimensional structure of high-dimensional data. These algorithms include Isometric Mapping (ISOMAP) [1], Locally Linear Embedding (LLE) [2], Laplacian Eigenmaps (LE) [3], and Local Tangent Space Alignment (LTSA) [4].
The above classical manifold learning methods are all unsupervised learning algorithms; that is, they do not consider prior information. In many applications, some prior information about the input data is available. For example, in a classification problem, the class labels of part of the data can be obtained. By taking into account prior information in the form of low-dimensional coordinates of certain sample points, the classical manifold learning methods can be extended to semi-supervised manifold learning methods [5]. The semi-supervised manifold learning algorithms yield low-dimensional coordinates that bear the same meaning as the prior information.
However, these unsupervised and semi-supervised methods may have limited efficiency on real-world data because of large noise or distortion in the data. In practice, each dimensionality reduction method requires certain assumptions on the data manifold to guarantee its expected efficiency. For example, ISOMAP needs a convex embedding domain or a relatively uniform data distribution for estimating geodesic distances. LTSA needs neighbor sets that can approximately recover the local tangent spaces. In LLE, the local geometric structure of the manifold should be well determined by local linear combinations within data neighborhoods.
There have been some efforts to improve the original algorithms. One line of work preprocesses the data set before applying the methods, without modifying the algorithms themselves. Smoothing the data set by weighted SVD, or equivalently weighted PCA, to reduce data noise before performing LTSA is suggested in [6]. In [7], the outliers are first detected by histogram analysis of the neighborhood distances of each point, and the locally smoothed values of the data are then computed using the linear errors-in-variables (EIV) model. A fast outlier detection method for high-dimensional data sets is proposed in [8]. It also employs a local smoothing method and introduces a weighted global function to further reduce the undesirable effect of outliers and noise on the embedding results. These algorithms can also be improved by adaptively selected neighborhoods [9]. The other line of work adjusts details of the algorithms. For example, multiple local weight vectors are used in [10] to solidify the structures determined by neighborhoods. In [11], the influence of noisy points on the reconstruction is greatly reduced by solving a new local optimization model. In [12], a robust DLPP version based on L1-norm maximization is proposed. In [13], short-circuit errors are reduced by automatically selecting the right number and position of landmarks. A robust version of LTSA is proposed in [14]; it further reduces the influence of noise on the embedding results by assigning different weights to clean data points and noisy data points in the local alignment errors. In [15], an out-of-sample extension framework is proposed for a global manifold learning algorithm (ISOMAP); it uses temporal information in out-of-sample points to make the embedding more robust to noise and artifacts.
Although the improved manifold learning algorithms are more robust against noise than the original algorithms, little work has been done on the semi-supervised algorithms [16]. In fact, the undesirable effect caused by noise is more complicated in the semi-supervised problem. Firstly, it is difficult to accurately explore the local geometric structures when the local neighborhoods contain noisy points. Secondly, the provided prior information may be inexact for noisy points, and the low-dimensional coordinates constructed from the inexact prior information may be far from the real on-manifold coordinates of the sample points. The first issue can be addressed by constructing noise-free neighbor sets [7–9] or by constructing robust local geometric structures of the noisy neighbor sets [10, 11, 17]. We do not pursue the first issue further in this paper.
We focus on the second issue in this paper. We estimate the noise levels of the sample points, which reflect the confidence levels in the prior information. Then we construct a new semi-supervised optimization model to reduce the undesirable effect of inexact prior information with low confidence levels. A framework for robust semi-supervised manifold learning (RSSML) is obtained by solving the new semi-supervised optimization model.
The rest of this paper is organized as follows. In Section 2, we give a brief review of semi-supervised manifold learning. In Section 3, we show how to extend the semi-supervised manifold learning algorithms so that they can handle inexact information for noisy points, and we present the framework for robust semi-supervised manifold learning (RSSML). In Section 4, we give numerical experiments to show the effectiveness of RSSML.
2. A Brief Review of Semi-Supervised Manifold Learning
Our work is an extension of semi-supervised manifold learning (SSML). In this section, we give a brief review of SSML [5]. Assume that we are given a data set $X = \{x_1, \dots, x_N\} \subset \mathbb{R}^m$ (possibly with noise) sampled from a $d$-dimensional manifold. Without loss of generality, suppose that the prior information of the first $l$ points is known, and denote by $\hat{y}_1, \dots, \hat{y}_l$ the low-dimensional coordinates constructed from the prior information. The goal of SSML is to calculate the unknown low-dimensional coordinates $Y = [y_1, \dots, y_N]$. SSML proceeds in the following steps.
Step 1 (finding local neighborhoods). Determine the neighbor set $\mathcal{N}_i$, with index set $J_i$, for each $x_i$, $i = 1, \dots, N$.
Step 2 (extracting local geometry). The local geometry of each neighbor set can be extracted by the local optimization models of the classical methods [1–4]. Take LLE as an example: the local geometry is characterized by the linear combination coefficients $w_{ij}$, $j \in J_i$, which are computed by minimizing the least squares model
$$\min_{\sum_{j \in J_i} w_{ij} = 1} \Big\| x_i - \sum_{j \in J_i} w_{ij} x_j \Big\|^2.$$
Step 3 (constructing semi-supervised optimization model). In the unsupervised manifold learning algorithms, the global low-dimensional coordinates are calculated by minimizing embedding cost functions that preserve the extracted local geometries. For example, in LLE, the low-dimensional coordinates are computed by minimizing the embedding cost function
$$\min_{Y} \sum_{i=1}^{N} \Big\| y_i - \sum_{j \in J_i} w_{ij} y_j \Big\|^2$$
subject to a normalization constraint such as $Y Y^T = I$ that excludes the trivial solution. Unlike the embedding cost functions of the unsupervised algorithms, the semi-supervised model adds a regularization term concerning the prior information so that the low-dimensional coordinates obey the prior information. In semi-supervised LLE, the low-dimensional coordinates are obtained by minimizing the semi-supervised optimization model
$$\min_{Y} \sum_{i=1}^{N} \Big\| y_i - \sum_{j \in J_i} w_{ij} y_j \Big\|^2 + \beta \sum_{i=1}^{l} \| y_i - \hat{y}_i \|^2, \tag{3}$$
where $\beta$ is the regularization parameter that reflects the confidence level in the prior information.
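Steps 1 to 3 for semi-supervised LLE can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `lle_weights` and `ss_lle`, the convention of storing one sample per row (so the embedding is returned as an N × d array), and the small ridge term `reg` that stabilizes the local Gram matrix are all introduced here for illustration.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Return an (N, N) matrix whose row i holds the LLE weights of x_i.

    X is (N, m) with one sample per row. `reg` is an assumed ridge term.
    """
    N = X.shape[0]
    W = np.zeros((N, N))
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    for i in range(N):
        idx = np.argsort(D[i])
        idx = idx[idx != i][:k]                      # k nearest neighbors of x_i
        Z = X[idx] - X[i]                            # neighbors centered on x_i
        G = Z @ Z.T                                  # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(len(idx))
        w = np.linalg.solve(G, np.ones(len(idx)))
        W[i, idx] = w / w.sum()                      # weights sum to one
    return W

def ss_lle(X, Y_hat, k, beta):
    """Semi-supervised LLE with a single confidence parameter beta.

    Y_hat is (l, d): prior coordinates of the first l samples.
    Minimizes the embedding cost plus beta * sum_i ||y_i - y_hat_i||^2.
    """
    N, l = X.shape[0], Y_hat.shape[0]
    W = lle_weights(X, k)
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    A = M.copy()
    A[:l, :l] += beta * np.eye(l)                    # regularization on labeled block
    B = np.zeros((N, Y_hat.shape[1]))
    B[:l] = beta * Y_hat
    return np.linalg.solve(A, B)                     # zero-gradient condition
```

Setting the gradient of the regularized cost to zero gives one linear system, which is why no eigenvalue problem is needed once prior coordinates are supplied. The system can be singular if some connected component of the neighborhood graph contains no labeled point.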
Semi-supervised manifold learning has been widely used in many real-life applications such as face recognition [18], remote sensing image classification [19], object tracking [5], and data visualization [18]. Figure 1 illustrates some of these real-life applications. The available prior information and the computed embedding coordinates differ from one application to another. For example, in face recognition and remote sensing image classification, the known low-dimensional coordinates are constructed according to the labels of the training data points, and SSML aims to predict the labels of the remaining data points using the known low-dimensional coordinates (see Remark 2 in the next section for the way of constructing low-dimensional coordinates and predicting the labels). In object tracking, SSML aims to recover the real locations of the object, using the given locations of the object in certain frames. In data visualization, SSML projects the data points to a 2-dimensional or 3-dimensional embedding space to discover the hidden relations of the data points, using the given 2-dimensional or 3-dimensional coordinates of the training points.
Figure 1: Real-life applications of semi-supervised manifold learning, panels (a) to (d).
Generally, semi-supervised manifold learning algorithms work well if the data sets are well sampled from a manifold. In this situation, $\beta \to \infty$, and the minimization of (3) is equivalent to
$$\min_{Y} \sum_{i=1}^{N} \Big\| y_i - \sum_{j \in J_i} w_{ij} y_j \Big\|^2 \quad \text{s.t. } y_i = \hat{y}_i, \ i = 1, \dots, l,$$
where $w_{ij}$ are the local weights, $J_i$ is the neighbor index set of $x_i$, and $\hat{y}_i$ are the prior coordinates of the first $l$ points. When the data sets contain noise, the effectiveness of SSML decreases significantly, because the noise level may differ from one sample point to another. For sample points containing large noise, the prior information may not be trustworthy, and the regularization parameter $\beta$ should be small. For points containing only small noise, we are confident in the provided prior information, and $\beta$ should be large. A fixed $\beta$ cannot reflect the different confidence levels in the prior information of each point. It is therefore desirable to construct a semi-supervised optimization model that is robust against noise.
3. Robust Semi-Supervised Manifold Learning Algorithm
In this section, a framework for the robust semi-supervised manifold learning algorithm (RSSML) is proposed. Since the prior information may be inexact for the noisy points, the major problem is how to deal with inexact prior information according to the different noise levels of the sample points. Note that the noise levels of the sample points are generally unknown. It is therefore desirable to measure the noise levels of the sample points before proposing the robust semi-supervised optimization model.
3.1. Measure the Noise Level
Recently, some work has been done to measure the noise levels of the points [20–23]. In this paper, we measure the noise levels by the outlier detection algorithm based on reconstruction weights (ODBRW), because of its low computational cost, few parameters, and high effectiveness [20]. The ODBRW method is applied only to the training points $x_1, \dots, x_l$ and consists of the following steps.
Step 1 (constructing the edge point sets). First search the $k$ nearest neighbors (KNN) of each $x_i$, $i = 1, \dots, l$, and determine the neighbor set $\mathcal{N}_i$. Then select the edge points from $\mathcal{N}_i$ by
$$E_i = \{ x_j \in \mathcal{N}_i : \langle x_i - x_q, \ x_j - x_q \rangle \ge 0 \ \text{for all } x_q \in \mathcal{N}_i \},$$
which requires that the angle between any adjacent edges $x_q x_i$ and $x_q x_j$ be acute or right. If the angle between the adjacent edges $x_q x_i$ and $x_q x_j$ is obtuse, then $x_q$ separates $x_i$ and $x_j$. A point $x_j$ is said to be an edge point of $x_i$ if there is no other neighbor $x_q$ that separates $x_i$ and $x_j$. The determined edge point sets are very robust to the neighborhood size $k$. See an illustrated example of the edge point set in Figure 2. More explanations about the edge point set can be found in [24].
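Step 2 (calculating reconstruction weights). The reconstruction weights $\widetilde{w}_i$ of the edge point set $E_i$ for $x_i$ can be obtained by solving the least squares problem
$$\min_{e^T \widetilde{w}_i = 1} \| x_i - X_i \widetilde{w}_i \|^2.$$
Denote by $X_i$ the matrix whose columns are the edge points in $E_i$ and by $\widetilde{w}_i$ the corresponding weight vector; the least squares problem can be solved by
$$\widetilde{w}_i = \frac{C_i^{-1} e}{e^T C_i^{-1} e},$$
where $C_i = (X_i - x_i e^T)^T (X_i - x_i e^T)$ and $e$ is a vector with all ones.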
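Step 3 (measuring the noise levels). Form an $l \times l$ matrix $\widetilde{W}$ by $\widetilde{w}_i$; that is,
$$\widetilde{W}(J_i, i) = \widetilde{w}_i, \qquad i = 1, \dots, l,$$
with all other entries zero, where $J_i$ is the index set of $E_i$. The noise level of $x_i$ can be measured by
$$s_i = \sum_{j=1}^{l} \widetilde{W}(i, j),$$
that is, by the total weight that $x_i$ contributes to the reconstruction of the other training points.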
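The three ODBRW steps can be illustrated with the NumPy sketch below. It is a hedged reading of the description above rather than the reference implementation of [20]: the function names `edge_points` and `odbrw_scores`, the ridge term `reg`, the clipping of negative scores, the division by the maximum score to reach [0, 1], and in particular the choice of the score as the total weight a point contributes to reconstructing the other points are assumptions made here.

```python
import numpy as np

def edge_points(X, i, k):
    """Indices of the KNN of X[i] that no other neighbor separates from X[i]."""
    d = np.linalg.norm(X - X[i], axis=1)
    knn = [j for j in np.argsort(d) if j != i][:k]
    edges = []
    for j in knn:
        # x_q separates x_i and x_j when the angle at x_q is obtuse
        separated = any(
            np.dot(X[i] - X[q], X[j] - X[q]) < 0 for q in knn if q != j
        )
        if not separated:
            edges.append(j)
    return edges

def odbrw_scores(X, k, reg=1e-3):
    """Confidence score in [0, 1] for each row of X (small means likely outlier)."""
    l = X.shape[0]
    W = np.zeros((l, l))
    for i in range(l):
        J = edge_points(X, i, k)
        Z = X[J] - X[i]
        G = Z @ Z.T                                   # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(len(J))
        w = np.linalg.solve(G, np.ones(len(J)))
        W[J, i] = w / w.sum()                         # column i rebuilds x_i
    s = np.clip(W.sum(axis=1), 0.0, None)             # weight contributed to others
    return s / s.max()                                # assumed normalization
```

A point that lies far from the rest of the training set is never chosen as an edge point of any other point, so its row of the weight matrix is zero and it receives the lowest score.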
3.2. Robust Semi-Supervised Optimization Model
It is shown in theory that the smaller $s_i$ is, the more likely $x_i$ is an outlier [20]. A small $s_i$ means that the prior information on $x_i$ is not trustworthy. Hence, we hope that the effect of the prior information $\hat{y}_i$ on computing the embedding coordinates can be reduced. For a large $s_i$, the sample point $x_i$ tends to be a clean point, and we are more confident in its prior information. The computed low-dimensional coordinates should then bear a meaning similar to the prior information. Based on the above analysis, we construct a new robust semi-supervised optimization model:
$$\min_{Y} \ \Phi(Y) + \beta \sum_{i=1}^{l} s_i \| y_i - \hat{y}_i \|^2.$$
Here $y_i$ are the low-dimensional coordinates of $x_i$, and $\hat{y}_i$ are the low-dimensional coordinates constructed from the prior information of the training points $x_i$, $i = 1, \dots, l$. $\Phi(Y)$ is the embedding cost function on the manifold, and $s_i$ is the measured noise level, normalized to a real number in $[0, 1]$. Take LLE as an example; the robust semi-supervised optimization model is
$$\min_{Y} \sum_{i=1}^{N} \Big\| y_i - \sum_{j \in J_i} w_{ij} y_j \Big\|^2 + \beta \sum_{i=1}^{l} s_i \| y_i - \hat{y}_i \|^2. \tag{11}$$
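Denote $Y = [y_1, \dots, y_N]$ and $\hat{Y}_L = [\hat{y}_1, \dots, \hat{y}_l]$, and let $S$ be the diagonal matrix whose diagonal elements are $s_1, \dots, s_l$. The optimization model (11) can be expressed as
$$\min_{Y} \ \operatorname{tr}(Y M Y^T) + \beta \operatorname{tr}\big( (Y_L - \hat{Y}_L) S (Y_L - \hat{Y}_L)^T \big), \tag{12}$$
where $Y_L = [y_1, \dots, y_l]$, $M = (I - W)^T (I - W)$, and the elements of $W$ are $W(i, j) = w_{ij}$ for $j \in J_i$ and $W(i, j) = 0$ for $j \notin J_i$.
We can solve the optimization model by setting the gradient of (12) to zero. Partition
$$M = \begin{bmatrix} M_{LL} & M_{LU} \\ M_{UL} & M_{UU} \end{bmatrix}$$
with $M_{LL} \in \mathbb{R}^{l \times l}$ and $M_{UU} \in \mathbb{R}^{(N-l) \times (N-l)}$; it is easy to get that
$$Y M + \beta \big[ (Y_L - \hat{Y}_L) S, \ 0 \big] = 0.$$
Partition $Y = [Y_L, Y_U]$ with $Y_L \in \mathbb{R}^{d \times l}$; the $d$-dimensional embedding can be computed by solving the following linear system of equations:
$$[Y_L, \ Y_U] \begin{bmatrix} M_{LL} + \beta S & M_{LU} \\ M_{UL} & M_{UU} \end{bmatrix} = \big[ \beta \hat{Y}_L S, \ 0 \big]. \tag{15}$$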
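The effect of the per-point weights can be seen in a small numerical sketch of the zero-gradient system. The code below is illustrative only: the name `rssml_embed` and the row-per-sample layout (the transpose of the column convention used in the text) are assumptions, and the weight matrix in the usage example is a toy fully connected graph rather than one computed from data.

```python
import numpy as np

def rssml_embed(W, Y_hat, s, beta):
    """Solve the robust semi-supervised system for the embedding.

    W     : (N, N) local weight matrix, each row summing to one
    Y_hat : (l, d) prior coordinates of the first l points
    s     : (l,) confidence scores in [0, 1]
    Returns Y of shape (N, d), one embedded point per row.
    """
    N, l = W.shape[0], Y_hat.shape[0]
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    A = M.copy()
    A[np.arange(l), np.arange(l)] += beta * s     # adds beta * S to the labeled block
    B = np.zeros((N, Y_hat.shape[1]))
    B[:l] = beta * s[:, None] * Y_hat             # right-hand side beta * S * Y_hat
    return np.linalg.solve(A, B)

# Toy example: four mutually connected points, the third label is wrong (0 instead of 1).
W = (np.ones((4, 4)) - np.eye(4)) / 3.0
Y_hat = np.array([[1.0], [1.0], [0.0]])
plain = rssml_embed(W, Y_hat, np.ones(3), beta=1.0)
robust = rssml_embed(W, Y_hat, np.array([1.0, 1.0, 0.01]), beta=1.0)
```

With equal confidence the unlabeled fourth point is pulled toward the wrong label, whereas a low score on the third point leaves the fourth point close to the two trusted labels. If a labeled point with a zero score is the only labeled point in its connected component, the system becomes singular and a least squares solver would be needed instead.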
Based on the above analysis, we propose a new algorithm called robust semi-supervised manifold learning (RSSML), which is summarized as follows.
RSSML Algorithm
Input. Data set $X = \{x_1, \dots, x_N\}$, low-dimensional coordinates of labeled points $\hat{Y}_L = [\hat{y}_1, \dots, \hat{y}_l]$, parameter $\beta$, neighborhood size $k$.
Output. The low-dimensional coordinates $Y = [y_1, \dots, y_N]$.
Step 1 (selecting local neighborhoods). Determine the neighborhood set $\mathcal{N}_i$ of each point $x_i$; $J_i$ is the index set of the neighbors: $\mathcal{N}_i = \{x_j : j \in J_i\}$.
Step 2 (extracting local geometry). Extract the local geometry by some local optimization model of manifold learning. Matrix $W$ is given by $W = (w_{ij})$ with $w_{ij}$ equal to the computed local weight if $j \in J_i$ or 0 otherwise.
Step 3 (calculating noise levels). Obtain the noise levels $s_i$ for the first $l$ data points (training points) by the ODBRW method and construct the diagonal matrix $S = \operatorname{diag}(s_1, \dots, s_l)$.
Step 4 (embedding global coordinates). Compute the low-dimensional embedding coordinates $Y$ by solving the linear system of equations (15).
Remark 1. Notice that many unsupervised learning methods can be extended to semi-supervised versions by the proposed robust semi-supervised manifold learning (RSSML) algorithm. In this paper, we explore the local geometry by solving the least squares (LS) problem of LLE. The local geometry can also be explored by other local optimization methods such as RLLPE and LTSA. We call the resulting methods RSSML-LLE, RSSML-RLLPE, and RSSML-LTSA.
Remark 2. In a classification problem, we are given the label information of $l$ training points and $c$ classes. Without loss of generality, assume that the first $l$ points are labeled. The low-dimensional coordinates of the labeled points $\hat{Y}_L = [\hat{y}_1, \dots, \hat{y}_l] \in \mathbb{R}^{c \times l}$ can be constructed with $\hat{y}_i(j) = 1$ if $x_i$ has label $j$ and $\hat{y}_i(j) = 0$ otherwise. Then, the label of an unlabeled point $x_i$ can be estimated as $\arg\max_j y_i(j)$.
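The construction in Remark 2 amounts to an indicator (one-hot) encoding followed by an arg-max decoding. A minimal sketch, assuming integer class labels starting at 0 and one point per row (the helper names `labels_to_coords` and `coords_to_labels` are hypothetical):

```python
import numpy as np

def labels_to_coords(labels, n_classes):
    """Build prior coordinates: row i has a 1 in the column of its class."""
    labels = np.asarray(labels)
    Y_hat = np.zeros((len(labels), n_classes))
    Y_hat[np.arange(len(labels)), labels] = 1.0
    return Y_hat

def coords_to_labels(Y):
    """Predict the class of each embedded point as its largest coordinate."""
    return np.argmax(Y, axis=1)
```

With this encoding the embedding dimension equals the number of classes, so no separate classifier is needed for the semi-supervised methods.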
4. Experiment Results
To verify the effectiveness of the proposed algorithm RSSML on real-world data, we perform experiments on the CMU PIE data set [25], the HandwrittenAlpha data set [26], and HAND_SHAPE [27]. For comparison, we apply the unsupervised methods LLE, RLLPE, and LTSA (see [2, 4, 11]), their semi-supervised versions, and the proposed robust semi-supervised versions to the above data sets. In the three real-world examples, some noisy points are also added to test the robustness of RSSML to noisy data sets.
CMU PIE [25]. The original data set contains 11560 samples of 68 individuals as 32 × 32 grayscale images. In the experiment, we randomly selected 160 samples from each of 10 individuals (for a total of 1,600 samples). Some samples are plotted in Figure 3.
HandwrittenAlpha [26]. The data set (HWAlpha) is extracted from the “binaryalphadigs” data set. It consists of 936 images of the 26 handwritten letters. Each class has 36 images of size 20 × 16.
HAND_SHAPE [27]. The original data set (Cambridge Hand Gesture Data) contains 9 gesture classes in 320 × 240 grayscale images, which are defined by 3 primitive hand shapes and 3 primitive motions. In this experiment, the target task is to classify different hand shapes. Therefore, the final data set is divided into five groups (see Figure 4). We randomly selected 350 points from each group to form the experiment set.
In the experiments, the parameters of these methods are set as follows. All of the manifold learning algorithms are involved with the neighborhood size parameter. The neighborhood size is selected from 8 to 36. For the unsupervised methods LLE, RLLPE, and LTSA, the intrinsic dimension is tried. The different regularization parameter is tried in our robust semisupervised version of LLE, LTSA, or RLLPE. We only report the best results.
To perform classification, the data sets are first projected onto a low-dimensional space by the unsupervised methods, and then the Nearest Feature Line (NFL) classifier (see [28, 29]) is applied to the low-dimensional embedding results for recognition. For the semi-supervised methods, the label information of the unlabeled points can be predicted directly.
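For reference, the nearest feature line rule can be sketched as follows: a query point is projected onto the line through every pair of training points of the same class, and it receives the class of the closest line. This is a generic sketch of the NFL idea in [28], not the code used in the experiments; the function name `nfl_predict` is hypothetical, and classes with fewer than two training points are simply skipped.

```python
import numpy as np
from itertools import combinations

def nfl_predict(train_X, train_y, x):
    """Return the class whose nearest feature line is closest to x."""
    best_dist, best_label = np.inf, None
    for c in np.unique(train_y):
        P = train_X[train_y == c]                  # prototypes of class c
        for a, b in combinations(range(len(P)), 2):
            direction = P[b] - P[a]
            denom = direction @ direction
            if denom == 0.0:                       # identical prototypes define no line
                continue
            mu = (x - P[a]) @ direction / denom    # position of the projection on the line
            dist = np.linalg.norm(x - (P[a] + mu * direction))
            if dist < best_dist:
                best_dist, best_label = dist, c
    return best_label
```

Because the feature line extends beyond the two prototypes, a query can be assigned to a class even when it lies far from every individual training point of that class. The cost grows quadratically with the number of prototypes per class.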
In the first experiment, some noisy points are added to the original three data sets. We randomly select 10% sample points from the data sets to generate the noisy points. For each selected image, we randomly chose pixels from the original image and changed the pixel values from to . Some of the noisy images are shown in Figure 3. Half of the sample points are randomly selected as training samples and the remaining are used for testing.
Table 1 lists the classification rates of the manifold learning methods on the three data sets. It is clear that the unsupervised methods are sensitive to noise, especially LTSA. When the data sets contain noisy points, LTSA may fail to find a reasonable local tangent space for each data point. For SS-LLE, SS-RLLPE, and SS-LTSA, the classification accuracies are clearly improved. RSSML-LLE, RSSML-RLLPE, and RSSML-LTSA outperform the semi-supervised manifold learning approaches in the experiments. This shows that the robustness to noisy points can be improved by the proposed robust semi-supervised model.

To better compare the effectiveness of the above manifold algorithms on noisy data sets, we generate the noisy data with different densities of Gaussian and reverse noise. We randomly select 10% samples from CMU PIE data set to generate the noisy images. The noisy images are generated in two ways. One way is to randomly choose 1/6, 1/8, or 1/12 pixels of each selected image and invert the value from to . Another way is to add the Gaussian noise to the selected image with different variances: 0.02, 0.05, or 0.1. The experimental results of the unsupervised, semisupervised, and robust semisupervised versions of LLE, RLLPE, and LTSA methods on the six experimental sets are shown in Table 2.
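The two kinds of corruption can be sketched as below. Several details are assumptions, since the text does not fix them: pixel values are taken to lie in [0, `max_value`], inverting a pixel is taken to mean replacing a value p by `max_value` − p, and the Gaussian noise is zero-mean with the stated variance on images scaled to [0, 1] and clipped back to that range. The function names are hypothetical.

```python
import numpy as np

def add_reverse_noise(img, fraction, rng, max_value=1.0):
    """Invert a randomly chosen fraction of the pixels (assumed: p -> max_value - p)."""
    out = np.array(img, dtype=float)               # work on a copy
    n = int(round(fraction * out.size))
    idx = rng.choice(out.size, size=n, replace=False)
    out.flat[idx] = max_value - out.flat[idx]
    return out

def add_gaussian_noise(img, variance, rng):
    """Add zero-mean Gaussian noise to an image scaled to [0, 1], then clip."""
    noisy = np.asarray(img, dtype=float) + rng.normal(0.0, np.sqrt(variance), size=np.shape(img))
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
img = np.zeros((32, 32))
reversed_img = add_reverse_noise(img, 1 / 8, rng)   # 128 of 1024 pixels inverted
gauss_img = add_gaussian_noise(np.full((32, 32), 0.5), 0.05, rng)
```

Passing the random generator explicitly keeps the corrupted sets reproducible across the six noise settings.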

As can be seen in Table 2, SS-LLE, SS-RLLPE, and SS-LTSA are more sensitive to noisy points than the RSSML versions (of LLE, RLLPE, and LTSA). Under different noise levels, the RSSML methods achieve higher classification accuracies than the other algorithms. This further shows that RSSML handles the noisy points better in the experiments. We notice that the classification rates of RSSML are also higher than those of the other methods on the original data set. This is because the original data set may already be contaminated by noise from the sampling process, and the proposed RSSML methods can also reduce the impact of such real-world noisy points. Interestingly, in some cases the results on the noisy data sets are better than those on the original data set. One explanation is that the linear combination coefficients of heavily corrupted points, as calculated by the local least squares optimization model, tend to be small. When large artificial noise is added to some of the original points, the effect of these points on the reconstruction is therefore greatly reduced. Hence, these outliers do not destroy the low-dimensional structure of the manifold.
To further quantitatively compare the performance of the above algorithms, the noisy HAND_SHAPE data (1/6 reverse noise) are divided into five different proportions of training and testing data (2 : 5, 2 : 3, 1 : 1, 3 : 2, and 5 : 2). As can be seen in Table 3, our robust versions (of LLE, RLLPE, and LTSA) have the best performance for different proportions of training and testing data. It is evident that our method can greatly reduce the classification error on the noisy data set.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported in part by NSFC (61370006), NSF of Fujian Province (2014J01237 and 2015J01256), Program for New Century Excellent Talents in Fujian University (2012FJNCETZR01), and Program for Young and Middle-Aged Teacher in Science and Technology Research of Huaqiao University (ZQNPY116).
References
[1] J. B. Tenenbaum, V. de Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction,” Science, vol. 290, no. 5500, pp. 2319–2323, 2000.
[2] S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
[3] M. Belkin and P. Niyogi, “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.
[4] Z. Zhang and H. Zha, “Principal manifolds and nonlinear dimensionality reduction via tangent space alignment,” SIAM Journal on Scientific Computing, vol. 26, no. 1, pp. 313–338, 2004.
[5] X. Yang, H. Fu, H. Zha, and J. Barlow, “Semi-supervised nonlinear dimensionality reduction,” in Proceedings of the International Conference on Machine Learning, 2006.
[6] Z. Zhang and H. Zha, Local Linear Smoothing for Nonlinear Manifold Learning, Department of Computer Science and Engineering, Pennsylvania State University, PA, USA, 2003, CSE-03-003.
[7] H. Chen, G. Jiang, and K. Yoshihira, “Robust nonlinear dimensionality reduction for manifold learning,” in Proceedings of the 18th International Conference on Pattern Recognition, ICPR 2006, pp. 447–450, China, August 2006.
[8] X. Xing, S. Du, and K. Wang, “Robust Hessian locally linear embedding techniques for high-dimensional data,” Algorithms, vol. 9, no. 2, article no. 36, 2016.
[9] Z. Zhang, J. Wang, and H. Zha, “Adaptive manifold learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 2, pp. 253–265, 2012.
[10] J. Wang and Z. Zhang, “Nonlinear embedding preserving multiple local-linearities,” Pattern Recognition, vol. 43, no. 4, pp. 1257–1268, 2010.
[11] J. Wang, “Real local-linearity preserving embedding,” Neurocomputing, vol. 136, pp. 7–13, 2014.
[12] F. Zhong, J. Zhang, and D. Li, “Discriminant locality preserving projections based on L1-norm maximization,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 11, pp. 2065–2074, 2014.
[13] Q. Gan, F. Shen, and J. Zhao, “An extended isomap for manifold topology learning with SOINN landmarks,” in Proceedings of the 22nd International Conference on Pattern Recognition, ICPR 2014, pp. 1579–1584, Sweden, August 2014.
[14] Y. Zhan and J. Yin, “Robust local tangent space alignment via iterative weighted PCA,” Neurocomputing, vol. 74, no. 11, pp. 1985–1993, 2011.
[15] H. Dadkhahi, M. F. Duarte, and B. Marlin, “Isomap out-of-sample extension for noisy time series data,” in Proceedings of the 25th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2015, USA, September 2015.
[16] N. Gu, M. Fan, and D. Meng, “Robust Semi-Supervised Classification for Noisy Labels Based on Self-Paced Learning,” IEEE Signal Processing Letters, vol. 23, no. 12, pp. 1806–1810, 2016.
[17] K. Kim and J. Lee, “Nonlinear dynamic projection for noise reduction of dispersed manifolds,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 11, pp. 2303–2309, 2014.
[18] Y. Li, Semi-Supervised Manifold Learning and Its Application, Department of Computer Science and Technology, 2010, Xian University.
[19] H. Huang, G.-F. Qin, and H.-L. Feng, “Semi-supervised manifold learning and its application to remote sensing image classification,” Guangxue Jingmi Gongcheng/Optics and Precision Engineering, vol. 19, no. 12, pp. 3025–3033, 2011.
[20] J. Wang, “Outlier detection approach based on reconstruction weights,” Journal of Software, vol. 22, no. 7, pp. 1571–1579, 2011.
[21] H. P. Kriegel, M. Schubert, and A. Zimek, “Angle-based outlier detection in high-dimensional data,” in Proc. of the Intl. Conf. on Knowledge Discovery and Data Mining, pp. 444–459, 2008.
[22] O. Maimon and L. Rokach, Data Mining and Knowledge Discovery Handbook, Kluwer Academic Publishers, Norwell, MA, USA, 2005.
[23] H. Chang and D.-Y. Yeung, “Robust locally linear embedding,” Pattern Recognition, vol. 39, no. 6, pp. 1053–1065, 2006.
[24] T. Lin, H. Zha, and S. U. Lee, “Riemannian Manifold Learning for Nonlinear Dimensionality Reduction,” in European Conference on Computer Vision (ECCV), vol. 3951, pp. 44–55, 2006.
[25] T. Sim, S. Baker, and M. Bsat, “PIE Face Database of Carnegie Mellon University,” http://vasc.ri.cmu.edu/idb/html/face/.
[26] “A database for Binary Alphadigits,” http://www.cs.nyu.edu/roweis/data.
[27] T.-K. Kim, S.-F. Wong, and R. Cipolla, “Tensor canonical correlation analysis for action classification,” in Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07, pp. 1–8, June 2007.
[28] S. Z. Li and J. Lu, “Face recognition using the nearest feature line method,” IEEE Transactions on Neural Networks and Learning Systems, vol. 10, no. 2, pp. 439–443, 1999.
[29] S. Z. Li, K. L. Chan, and C. Wang, “Performance evaluation of the nearest feature line method in image classification and retrieval,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1335–1339, 2000.
Copyright
Copyright © 2018 Mingxia Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.