Fusion of Computational Intelligence Techniques and Their Practical Applications
Research Article | Open Access
Incremental Discriminant Analysis in Tensor Space
Abstract
To study incremental machine learning in tensor space, this paper proposes incremental tensor discriminant analysis. The algorithm employs tensor representation to carry out discriminant analysis and combines it with incremental learning to alleviate the computational cost. The paper proves theoretically that the algorithm can be unified into the graph embedding framework and analyzes its time and space complexity in detail. Experiments on facial image detection show that the algorithm not only achieves sound performance compared with other algorithms but also noticeably reduces the computational cost.
1. Introduction
Nowadays, increasing amounts of data, such as signals, measurements, images, and videos, are becoming available in industrial, economic, medical, and other application areas owing to the development of computer technology. Advanced intelligent tools have been proposed to uncover the hidden information in these data, which implicitly describe underlying processes or structures. However, because the processes and their measurement are stochastic, the data are mostly collected with noise. Consequently, it is reasonable to seek robust and adaptive tools that can cope with this noise.
Computational intelligence techniques have been investigated to answer this need. These techniques are concerned with reproducing the abilities of the human brain. Machine learning techniques imitate the human learning procedure: they construct a learning model from example data and use it to make predictions and decisions. However, because of the noise in the data, it is important to construct an efficient learning model that helps sift useful information from the noise.
In this regard, machine learning algorithms project high-dimensional data into a low-dimensional feature space so that the low-dimensional features are as separable as possible. Generally, they are classified into two categories: supervised learning and unsupervised learning. The essential difference between the two is whether the class information is considered. Generally speaking, the recognition performance of supervised learning is superior to that of unsupervised learning. As a classical machine learning algorithm, linear discriminant analysis (LDA) [1, 2] seeks optimal discriminative vectors that maximize the interclass scatter and minimize the intraclass scatter. A large number of research works have shown the advantage of LDA in various applications.
It is worth noting that traditional LDA is based on the vector model and requires all data to be vectorized before learning. However, high-dimensional image data are structured, and vectorization breaks the correlation between different pixels. Furthermore, vectorization easily leads to the curse of dimensionality. As a result, machine learning algorithms based on tensor algebra have been investigated [3–11]. These algorithms treat a high-dimensional image as a high-order tensor and use tensor algebra to analyze the tensor data. Tensor representation not only helps preserve the structure of a high-dimensional image but also serves as an effective way to avoid the curse of dimensionality. To unify these machine learning algorithms, [12] proposes the graph embedding framework, under which two kinds of projective forms are summarized: the vector-to-vector form and the tensor-to-tensor form.
However, all of these machine learning algorithms have to retrain on all samples whenever new samples are added, which results in heavy computational cost. Consequently, incremental machine learning algorithms have been proposed [13–17]. Most incremental learning algorithms focus on vector-based machine learning, and only a limited number of works study incremental learning in tensor space [18–20]. To investigate incremental tensor learning, this paper develops incremental tensor discriminant analysis (ITDA), which employs supervised learning in tensor space and introduces incremental learning to support online learning. Furthermore, this paper explores the relationship between the proposed method and the graph embedding framework and proves theoretically that the algorithm is a special case of the tensor-to-tensor form under that framework. The paper also analyzes the time and space complexity in detail. Finally, facial image detection experiments are conducted to evaluate the proposed method. The experimental results demonstrate the advantage of the method.
2. Tensor Discriminant Analysis
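For multidimensional image data $\{\mathcal{X}_m\}_{m=1}^{M}$, where $\mathcal{X}_m \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, the corresponding class label is $c_m \in \{1, 2, \ldots, C\}$, where $C$ is the number of classes. Let the number of samples in the $c$th class be $M_c$; then the following definitions are introduced.
Definition 1. The within-class scatter tensor is defined as
$$\Psi_W = \sum_{m=1}^{M} \left\| \mathcal{X}_m - \bar{\mathcal{X}}_{c_m} \right\|^2, \tag{1}$$
where $\bar{\mathcal{X}}_c = \frac{1}{M_c} \sum_{m:\, c_m = c} \mathcal{X}_m$ represents the mean tensor of the $c$th class and $\|\cdot\|$ is the Frobenius norm.
Definition 2. The between-class scatter tensor is defined as
$$\Psi_B = \sum_{c=1}^{C} M_c \left\| \bar{\mathcal{X}}_c - \bar{\mathcal{X}} \right\|^2, \tag{2}$$
where $\bar{\mathcal{X}} = \frac{1}{M} \sum_{m=1}^{M} \mathcal{X}_m$ represents the total mean tensor.
Definition 3. The total scatter tensor is defined as
$$\Psi_T = \sum_{m=1}^{M} \left\| \mathcal{X}_m - \bar{\mathcal{X}} \right\|^2. \tag{3}$$
It is easy to derive that
$$\Psi_T = \Psi_W + \Psi_B. \tag{4}$$
Definition 4. The mode-$n$ within-class scatter matrix is defined as
$$\mathbf{S}_W^{(n)} = \sum_{m=1}^{M} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_{c_m}^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_{c_m}^{(n)} \right)^{T}, \tag{5}$$
where $\mathbf{X}_m^{(n)}$ is the mode-$n$ matrix of the $m$th sample and $\bar{\mathbf{X}}_c^{(n)}$ is the mode-$n$ mean matrix of the $c$th class.
Definition 5. The mode-$n$ between-class scatter matrix is defined as
$$\mathbf{S}_B^{(n)} = \sum_{c=1}^{C} M_c \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}, \tag{6}$$
where $\bar{\mathbf{X}}^{(n)}$ is the mode-$n$ total mean matrix.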
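Definition 6. The mode-$n$ total scatter matrix is defined as
$$\mathbf{S}_T^{(n)} = \sum_{m=1}^{M} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}, \tag{7}$$
where $\mathbf{X}_m^{(n)}$ is the mode-$n$ matrix of the $m$th sample and $\bar{\mathbf{X}}^{(n)}$ is the mode-$n$ total mean matrix.
The basic idea of TDA is to seek projective matrices $\mathbf{U}^{(1)}, \ldots, \mathbf{U}^{(N)}$ that make the within-class scatter tensor smaller and the between-class scatter tensor larger. The objective function (8) is a ratio built from these projected scatters. To solve it, an iterative technique is adopted: assuming that the projective matrices of all modes other than $n$ are known, $\mathbf{U}^{(n)}$ is solved, and the procedure is repeated over the modes. Based on the basic concept of TDA and related matrix knowledge, we can get the following theorems.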
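The mode scatter matrices in Definitions 4 to 6 can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the helper names `unfold` and `mode_scatter_matrices` are hypothetical, the data are random, and the unfolding simply moves the chosen axis to the front and flattens the rest. The final check confirms numerically that the mode total scatter matrix equals the sum of the mode within-class and between-class scatter matrices, which is the matrix-level counterpart of property (4).

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_scatter_matrices(samples, labels, mode):
    """Return (S_W, S_B, S_T) for one mode.

    samples: array of shape (M, I_1, ..., I_N); labels: length-M integer array.
    """
    labels = np.asarray(labels)
    total_mean = unfold(samples.mean(axis=0), mode)
    dim = samples.shape[mode + 1]
    s_w = np.zeros((dim, dim))
    s_b = np.zeros((dim, dim))
    s_t = np.zeros((dim, dim))
    for c in np.unique(labels):
        members = samples[labels == c]
        class_mean = unfold(members.mean(axis=0), mode)
        for x in members:
            d = unfold(x, mode) - class_mean
            s_w += d @ d.T                      # within-class contribution
        d = class_mean - total_mean
        s_b += len(members) * (d @ d.T)         # between-class contribution
    for x in samples:
        d = unfold(x, mode) - total_mean
        s_t += d @ d.T                          # total scatter contribution
    return s_w, s_b, s_t

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4, 5))       # 12 second-order samples of size 4 x 5
y = np.repeat([0, 1, 2], 4)           # three classes, four samples each
S_W, S_B, S_T = mode_scatter_matrices(X, y, mode=0)
identity_holds = np.allclose(S_T, S_W + S_B)
```

Each scatter matrix here is only $I_n \times I_n$ (4 × 4 for mode 0), whereas a vectorized LDA on the same data would work with 20 × 20 matrices.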
Theorem 7. In tensor discriminant analysis, the mode-$n$ intraclass scatter matrix is generally nonsingular.
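Proof. Define the matrix
$$\mathbf{Z}^{(n)} = \left[ \mathbf{X}_1^{(n)} - \bar{\mathbf{X}}_{c_1}^{(n)}, \; \mathbf{X}_2^{(n)} - \bar{\mathbf{X}}_{c_2}^{(n)}, \; \ldots, \; \mathbf{X}_M^{(n)} - \bar{\mathbf{X}}_{c_M}^{(n)} \right], \tag{11}$$
where $M$ is the number of samples and $c_m$ expresses the class label of the $m$th sample. Then the mode-$n$ intraclass scatter matrix is represented as
$$\mathbf{S}_W^{(n)} = \mathbf{Z}^{(n)} \mathbf{Z}^{(n)T}, \tag{12}$$
where $\mathbf{Z}^{(n)} \in \mathbb{R}^{I_n \times (M \prod_{j \neq n} I_j)}$ and $\mathbf{S}_W^{(n)} \in \mathbb{R}^{I_n \times I_n}$. Generally speaking, $I_n$ is much smaller than $(M - C) \prod_{j \neq n} I_j$; then $\mathbf{Z}^{(n)}$ generally has full row rank:
$$\operatorname{rank}\left( \mathbf{Z}^{(n)} \right) = I_n. \tag{13}$$
Further, we can get
$$\operatorname{rank}\left( \mathbf{S}_W^{(n)} \right) = \operatorname{rank}\left( \mathbf{Z}^{(n)} \mathbf{Z}^{(n)T} \right) = \operatorname{rank}\left( \mathbf{Z}^{(n)} \right) = I_n. \tag{14}$$
So the theorem is proved.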
Theorem 8. Equation (8) can be unified into the graph embedding framework [12].
Proof. Based on the basic concepts of tensor algebra, the numerator of (8) can be rewritten as (15) and, after a change of variables, written as (16), with the auxiliary quantity defined in (17). Within the low-dimensional feature space, it is desired to preserve the property demonstrated in (4), so the denominator of (8) is formulated as (18), with its terms defined in (19). Combining (19) with (16), (18) can be written as (20), with the auxiliary quantity defined in (21). Consequently, (8) is expressed as (22). The form of (22) is consistent with the tensor-to-tensor form of the graph embedding framework. Therefore, (8) can be unified into the graph embedding framework.
3. Incremental Tensor Discriminant Analysis
3.1. Incremental Learning Based on a Single Sample
To distinguish the variables that need to be updated during the incremental learning procedure, the paper uses the subscript "old" to mark variables before incremental learning. For example, $\bar{\mathcal{X}}_{\text{old}}$ expresses the total mean tensor before new samples are added.
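When a single sample $\mathcal{X}_{\text{new}}$ is added with class label $l$, the mode-$n$ total mean matrix becomes
$$\bar{\mathbf{X}}^{(n)} = \frac{M_{\text{old}} \bar{\mathbf{X}}_{\text{old}}^{(n)} + \mathbf{X}_{\text{new}}^{(n)}}{M_{\text{old}} + 1}, \tag{23}$$
where $M_{\text{old}}$ is the number of samples before the update. If $l = C_{\text{old}} + 1$, the new sample belongs to a new class. In this case, the total class number is $C = C_{\text{old}} + 1$ and the mode-$n$ interclass scatter matrix is updated as
$$\mathbf{S}_B^{(n)} = \sum_{c=1}^{C} M_c \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}, \tag{24}$$
where $M_c$ is the updated sample number of the $c$th class. The mode-$n$ intraclass scatter matrix is
$$\mathbf{S}_W^{(n)} = \mathbf{S}_{W,\text{old}}^{(n)} + \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right) \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right)^{T}, \tag{25}$$
where $\bar{\mathbf{X}}_l^{(n)}$ is the mode-$n$ mean matrix of the class of the new sample. Because a single sample is added and it belongs to a new class, we can get
$$\bar{\mathbf{X}}_l^{(n)} = \mathbf{X}_{\text{new}}^{(n)}. \tag{26}$$
Then (25) becomes
$$\mathbf{S}_W^{(n)} = \mathbf{S}_{W,\text{old}}^{(n)}. \tag{27}$$
Equation (27) shows that the mode-$n$ intraclass scatter matrix does not change when a single new sample with a new class label is added.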
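When the class label of the new sample satisfies $l \le C_{\text{old}}$, the class label is not new. In this case, the total class number is $C = C_{\text{old}}$; then the mode-$n$ interclass scatter matrix is
$$\mathbf{S}_B^{(n)} = \sum_{c=1}^{C} M_c \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}, \tag{28}$$
with the updated sample numbers $M_c$, class means $\bar{\mathbf{X}}_c^{(n)}$, and total mean $\bar{\mathbf{X}}^{(n)}$. The mode-$n$ intraclass scatter matrix is
$$\mathbf{S}_W^{(n)} = \sum_{c \neq l} \mathbf{S}_{W,c,\text{old}}^{(n)} + \sum_{m:\, c_m = l} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right)^{T} + \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right) \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right)^{T}, \tag{29}$$
where $\mathbf{S}_{W,c,\text{old}}^{(n)}$ is the contribution of the $c$th class to $\mathbf{S}_{W,\text{old}}^{(n)}$ and the middle sum runs over the old samples of the $l$th class. Because the new sample belongs to the $l$th class, the class mean of the $l$th class becomes
$$\bar{\mathbf{X}}_l^{(n)} = \frac{M_{l,\text{old}} \bar{\mathbf{X}}_{l,\text{old}}^{(n)} + \mathbf{X}_{\text{new}}^{(n)}}{M_{l,\text{old}} + 1}. \tag{30}$$
Based on this, and writing $\mathbf{D} = \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_{l,\text{old}}^{(n)}$, we can get
$$\begin{aligned} \sum_{m:\, c_m = l} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right)^{T} &= \mathbf{S}_{W,l,\text{old}}^{(n)} + \frac{M_{l,\text{old}}}{(M_{l,\text{old}} + 1)^2} \mathbf{D} \mathbf{D}^{T}, \\ \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right) \left( \mathbf{X}_{\text{new}}^{(n)} - \bar{\mathbf{X}}_l^{(n)} \right)^{T} &= \frac{M_{l,\text{old}}^2}{(M_{l,\text{old}} + 1)^2} \mathbf{D} \mathbf{D}^{T}. \end{aligned} \tag{31}$$
So (29) is simplified to
$$\mathbf{S}_W^{(n)} = \mathbf{S}_{W,\text{old}}^{(n)} + \frac{M_{l,\text{old}}}{M_{l,\text{old}} + 1} \mathbf{D} \mathbf{D}^{T}. \tag{32}$$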
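The single-sample case can be checked numerically with a minimal sketch. It assumes the standard rank-one update of the within-class scatter (unchanged for a new class, and increased by $\frac{M_l}{M_l + 1}$ times the outer product of the difference between the new sample and the old class mean for an existing class); the function names, the dictionary bookkeeping, and the random data are hypothetical, and the samples are taken to be already unfolded along the mode of interest.

```python
import numpy as np

def within_scatter(mats, labels):
    """Batch mode-n within-class scatter from unfolded samples of shape (M, I_n, K)."""
    s_w = np.zeros((mats.shape[1], mats.shape[1]))
    for c in np.unique(labels):
        centered = mats[labels == c] - mats[labels == c].mean(axis=0)
        s_w += np.einsum("mik,mjk->ij", centered, centered)
    return s_w

def add_single_sample(s_w_old, class_means, class_counts, x_new, label):
    """Return the updated S_W; class_means and class_counts are updated in place."""
    if label not in class_counts:
        # New class: its mean equals the sample, so S_W does not change.
        class_means[label] = x_new.copy()
        class_counts[label] = 1
        return s_w_old.copy()
    m = class_counts[label]
    d = x_new - class_means[label]              # uses the OLD class mean
    s_w_new = s_w_old + (m / (m + 1)) * (d @ d.T)
    class_means[label] = (m * class_means[label] + x_new) / (m + 1)
    class_counts[label] = m + 1
    return s_w_new

rng = np.random.default_rng(1)
mats = rng.normal(size=(10, 4, 6))          # ten unfolded samples, I_n = 4
labels = np.array([0] * 5 + [1] * 5)
means = {c: mats[labels == c].mean(axis=0) for c in (0, 1)}
counts = {c: int((labels == c).sum()) for c in (0, 1)}
s_w = within_scatter(mats, labels)

# Case 1: the new sample joins existing class 1.
x1 = rng.normal(size=(4, 6))
s_w = add_single_sample(s_w, means, counts, x1, 1)
mats, labels = np.concatenate([mats, x1[None]]), np.append(labels, 1)
existing_ok = np.allclose(s_w, within_scatter(mats, labels))

# Case 2: the new sample opens new class 2.
x2 = rng.normal(size=(4, 6))
s_w = add_single_sample(s_w, means, counts, x2, 2)
mats, labels = np.concatenate([mats, x2[None]]), np.append(labels, 2)
new_class_ok = np.allclose(s_w, within_scatter(mats, labels))
```

In both cases the incremental result is compared with a batch recomputation over all samples; the update itself touches only the new sample and the stored class statistics.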
3.2. Incremental Learning Based on Multisamples
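When several samples are added, let the newly added samples be $\mathcal{Y}_j$, $j = 1, \ldots, L$, with corresponding class labels $l_j$. Without loss of generality, it is assumed that $L_c$ of the new samples belong to the $c$th class; then the mean tensor of the $c$th class is updated as
$$\bar{\mathcal{X}}_c = \frac{M_{c,\text{old}} \bar{\mathcal{X}}_{c,\text{old}} + L_c \bar{\mathcal{Y}}_c}{M_{c,\text{old}} + L_c}, \tag{33}$$
where $\bar{\mathcal{Y}}_c$ is the mean tensor of the new samples belonging to the $c$th class. The corresponding mode-$n$ mean matrix of the $c$th class is
$$\bar{\mathbf{X}}_c^{(n)} = \frac{M_{c,\text{old}} \bar{\mathbf{X}}_{c,\text{old}}^{(n)} + L_c \bar{\mathbf{Y}}_c^{(n)}}{M_{c,\text{old}} + L_c}. \tag{34}$$
Then the number of samples in the $c$th class is
$$M_c = M_{c,\text{old}} + L_c. \tag{35}$$
The total mean tensor is updated as
$$\bar{\mathcal{X}} = \frac{M_{\text{old}} \bar{\mathcal{X}}_{\text{old}} + L \bar{\mathcal{Y}}}{M_{\text{old}} + L}, \tag{36}$$
where $\bar{\mathcal{Y}}$ is the mean tensor of all new samples. The interclass scatter tensor is updated as
$$\Psi_B = \sum_{c=1}^{C} M_c \left\| \bar{\mathcal{X}}_c - \bar{\mathcal{X}} \right\|^2. \tag{37}$$
The corresponding mode-$n$ interclass scatter matrix is
$$\mathbf{S}_B^{(n)} = \sum_{c=1}^{C} M_c \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}. \tag{38}$$
The mode-$n$ intraclass scatter matrix is
$$\mathbf{S}_W^{(n)} = \sum_{c=1}^{C} \left[ \sum_{m:\, c_m = c} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right)^{T} + \sum_{j:\, l_j = c} \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right) \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right)^{T} \right], \tag{39}$$
where the first inner sum runs over the old samples of the $c$th class. Substituting (34) into the first inner sum and writing $\mathbf{D}_c = \bar{\mathbf{Y}}_c^{(n)} - \bar{\mathbf{X}}_{c,\text{old}}^{(n)}$, we can get
$$\sum_{m:\, c_m = c} \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right) \left( \mathbf{X}_m^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right)^{T} = \mathbf{S}_{W,c,\text{old}}^{(n)} + \frac{M_{c,\text{old}} L_c^2}{(M_{c,\text{old}} + L_c)^2} \mathbf{D}_c \mathbf{D}_c^{T}, \tag{40}$$
where $\mathbf{S}_{W,c,\text{old}}^{(n)}$ is the contribution of the $c$th class to $\mathbf{S}_{W,\text{old}}^{(n)}$. Similarly, we can get
$$\sum_{j:\, l_j = c} \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right) \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{X}}_c^{(n)} \right)^{T} = \mathbf{S}_{Y,c}^{(n)} + \frac{L_c M_{c,\text{old}}^2}{(M_{c,\text{old}} + L_c)^2} \mathbf{D}_c \mathbf{D}_c^{T}, \tag{41}$$
where $\mathbf{S}_{Y,c}^{(n)} = \sum_{j:\, l_j = c} ( \mathbf{Y}_j^{(n)} - \bar{\mathbf{Y}}_c^{(n)} ) ( \mathbf{Y}_j^{(n)} - \bar{\mathbf{Y}}_c^{(n)} )^{T}$ is the scatter of the new samples of the $c$th class around their own mean. Substituting (40) and (41) into (39), we can obtain
$$\mathbf{S}_W^{(n)} = \mathbf{S}_{W,\text{old}}^{(n)} + \sum_{c:\, L_c > 0} \left[ \frac{M_{c,\text{old}} L_c}{M_{c,\text{old}} + L_c} \mathbf{D}_c \mathbf{D}_c^{T} + \mathbf{S}_{Y,c}^{(n)} \right]. \tag{42}$$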
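Without loss of generality, it is supposed that, among the new samples $\mathcal{Y}_j$ with labels $l_j$, there are $L_{C+1}$ samples belonging to the new class label $C_{\text{old}} + 1$; then the updated mode-$n$ interclass scatter matrix is
$$\mathbf{S}_B^{(n)} = \sum_{c=1}^{C_{\text{old}} + 1} M_c \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right) \left( \bar{\mathbf{X}}_c^{(n)} - \bar{\mathbf{X}}^{(n)} \right)^{T}, \tag{43}$$
and the mode-$n$ intraclass scatter matrix is
$$\mathbf{S}_W^{(n)} = \tilde{\mathbf{S}}_W^{(n)} + \sum_{j:\, l_j = C_{\text{old}} + 1} \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{Y}}_{C_{\text{old}} + 1}^{(n)} \right) \left( \mathbf{Y}_j^{(n)} - \bar{\mathbf{Y}}_{C_{\text{old}} + 1}^{(n)} \right)^{T}, \tag{44}$$
where $\bar{\mathbf{Y}}_{C_{\text{old}} + 1}^{(n)}$ is the mode-$n$ mean matrix of the new class and $\tilde{\mathbf{S}}_W^{(n)}$ is the intraclass scatter matrix after the new samples of existing classes have been absorbed through (42); it equals $\mathbf{S}_{W,\text{old}}^{(n)}$ if every new sample belongs to the new class.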
It is not difficult to see that incremental learning based on a single sample is a special case of incremental learning based on multiple samples.
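The multisample case can be sketched in the same spirit. The code below assumes the standard chunk update of the within-class scatter, in which each class that receives $L_c$ new samples adds the scatter of those samples around their own mean plus $\frac{M_c L_c}{M_c + L_c}$ times the outer product of the difference between the chunk mean and the old class mean; the between-class scatter is rebuilt from the stored class means and counts alone. All function names and the random data are hypothetical, and samples are assumed to be already unfolded along the mode of interest.

```python
import numpy as np

def scatter(mats):
    """Scatter of unfolded samples (M, I_n, K) around their own mean."""
    centered = mats - mats.mean(axis=0)
    return np.einsum("mik,mjk->ij", centered, centered)

def add_chunk(s_w_old, class_means, class_counts, new_mats, new_labels):
    """Return the updated S_W; class_means and class_counts are updated in place."""
    s_w = s_w_old.copy()
    for c in np.unique(new_labels):
        chunk = new_mats[new_labels == c]
        l_c, chunk_mean = len(chunk), chunk.mean(axis=0)
        s_w += scatter(chunk)                   # scatter of the chunk itself
        c = int(c)
        if c in class_counts:                   # existing class: mean-shift term
            m_c = class_counts[c]
            d = chunk_mean - class_means[c]
            s_w += (m_c * l_c / (m_c + l_c)) * (d @ d.T)
            class_means[c] = (m_c * class_means[c] + l_c * chunk_mean) / (m_c + l_c)
            class_counts[c] = m_c + l_c
        else:                                   # new class: only its own scatter
            class_means[c] = chunk_mean
            class_counts[c] = l_c
    return s_w

def between_scatter(class_means, class_counts):
    """Between-class scatter rebuilt from class means and counts only."""
    total = sum(class_counts.values())
    total_mean = sum(class_counts[c] * class_means[c] for c in class_counts) / total
    s_b = np.zeros((total_mean.shape[0], total_mean.shape[0]))
    for c, mean in class_means.items():
        d = mean - total_mean
        s_b += class_counts[c] * (d @ d.T)
    return s_b

rng = np.random.default_rng(2)
old_mats = rng.normal(size=(12, 3, 5))
old_labels = np.repeat([0, 1], 6)
new_mats = rng.normal(size=(7, 3, 5))
new_labels = np.array([0, 0, 1, 1, 2, 2, 2])    # class 2 is new

means = {c: old_mats[old_labels == c].mean(axis=0) for c in (0, 1)}
counts = {0: 6, 1: 6}
s_w_old = sum(scatter(old_mats[old_labels == c]) for c in (0, 1))
s_w = add_chunk(s_w_old, means, counts, new_mats, new_labels)
s_b = between_scatter(means, counts)

# Batch reference computed from all samples at once.
all_mats = np.concatenate([old_mats, new_mats])
all_labels = np.concatenate([old_labels, new_labels])
s_w_batch = sum(scatter(all_mats[all_labels == c]) for c in (0, 1, 2))
within_ok = np.allclose(s_w, s_w_batch)
between_ok = np.allclose(s_b, scatter(all_mats) - s_w_batch)
```

Setting the chunk size to one recovers the single-sample behaviour, which matches the remark that the single-sample case is a special case of the multisample one.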
3.3. The Complexity Analysis
For tensor discriminant analysis, the main computational time is spent on computing the class means, the total mean, the inter- and intraclass scatter tensors, and the eigendecomposition. The cost of computing the inter- and intraclass scatter tensors depends on the number of training samples, so a large number of training samples unavoidably increases the computational time.
For incremental tensor discriminant analysis, the main computational time is spent on updating the inter- and intraclass scatter matrices and the class number.
For eigendecomposition, TDA and ITDA have the same time complexity. The main difference lies in the computation of the inter- and intraclass scatter matrices. For TDA, the cost of this step grows with the number of training samples. For ITDA, it depends only on the class number and the number of new samples and has no relationship with the number of initial training samples. Consequently, ITDA helps reduce the time complexity.
In terms of space complexity, ITDA is also superior to TDA. When new samples are added, TDA must keep all training samples in memory, whereas ITDA only needs to store the newly added samples, the total mean, the class means, and the mode scatter matrices. Hence ITDA saves space.
Compared with incremental learning based on multiple samples, incremental learning based on a single sample has a lower space complexity because it deals with only one sample at a time.
4. Experiments
In this section, a series of experiments is carried out to validate the performance of incremental tensor discriminant analysis (ITDA). The CBCL image dataset is used to conduct facial image detection experiments. The dataset contains two classes of images, facial images and nonfacial images, as shown in Figure 1. It comprises 2988 images in total, of which 2429 are facial images and 559 are nonfacial images. Each image is 19 × 19 pixels. This paper divides the whole dataset into a training dataset with 1215 facial images and 280 nonfacial images and a testing dataset with 1214 facial images and 279 nonfacial images. Furthermore, the training dataset is divided into an initial training dataset with 1015 facial images and 80 nonfacial images and four incremental datasets. Each incremental dataset has 50 facial images and 50 nonfacial images.
ITDA integrates tensor representation and incremental learning, so it is reasonable to expect it to improve the detection performance and reduce the time and space complexity. To verify this, ITDA is compared with LDA [21], ILDA [14], TPCA [22], ITPCA [23], and TDA [9]. LDA is the classical linear discriminant analysis. ILDA is the incremental version of LDA. TPCA, also called MPCA (multilinear principal component analysis), performs principal component analysis on tensor data. ITPCA is the incremental version of principal component analysis for tensor data. TDA also represents data as tensors and conducts multilinear discriminant analysis. At each step of incremental learning, the paper adds one incremental dataset and then extracts low-dimensional features from the testing dataset. The nearest neighbor classifier is employed to classify these low-dimensional features.
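The evaluation step described above (project the samples, then classify the low-dimensional features with a nearest neighbor classifier) can be sketched as follows. This is a toy illustration only: the projection matrices `u1` and `u2` are hypothetical truncated identity matrices rather than matrices learned by TDA or ITDA, the data are synthetic instead of CBCL images, and the function names are assumptions.

```python
import numpy as np

def project(samples, u1, u2):
    """Tensor-to-tensor projection Y_m = U1^T X_m U2 for second-order samples."""
    return np.einsum("ia,mij,jb->mab", u1, samples, u2)

def nearest_neighbor_predict(train_feats, train_labels, test_feats):
    """1-NN with Euclidean (Frobenius) distance on flattened features."""
    a = train_feats.reshape(len(train_feats), -1)
    b = test_feats.reshape(len(test_feats), -1)
    dists = ((b[:, None, :] - a[None, :, :]) ** 2).sum(axis=2)
    return train_labels[dists.argmin(axis=1)]

rng = np.random.default_rng(3)
# Two well-separated synthetic classes of 4 x 5 "images".
train = np.concatenate([rng.normal(0.0, 0.1, size=(5, 4, 5)),
                        rng.normal(3.0, 0.1, size=(5, 4, 5))])
train_labels = np.repeat([0, 1], 5)
test = np.concatenate([rng.normal(0.0, 0.1, size=(3, 4, 5)),
                       rng.normal(3.0, 0.1, size=(3, 4, 5))])
test_labels = np.repeat([0, 1], 3)

# Placeholder projection matrices (NOT learned): keep a 2 x 2 feature block.
u1, u2 = np.eye(4)[:, :2], np.eye(5)[:, :2]
train_feats = project(train, u1, u2)
pred = nearest_neighbor_predict(train_feats, train_labels, project(test, u1, u2))
accuracy = float((pred == test_labels).mean())
```

In an actual experiment the placeholder matrices would be replaced by the projective matrices produced by the method under test, and the accuracy would be reported for each feature dimension.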
The detection performance of the different algorithms under incremental learning is compared in Figures 2, 3, 4, and 5. LDA performs worst, and ILDA is better than LDA; however, the detection results of ILDA drop as the dimension of the low-dimensional features increases. TPCA and ITPCA have similar detection results, and both exceed LDA and ILDA. The probable reason is that TPCA and ITPCA represent data as tensors, which makes full use of the interior structure information to enhance the detection performance. TDA is superior to these four algorithms. When the dimension of the low-dimensional features is low, TDA and ITDA have comparable detection rates, and ITDA begins to surpass TDA as the dimension increases. Figure 6 and Table 1 show the best detection results of the different algorithms. The detection performance of every algorithm improves as the number of incremental learning steps grows, and ITDA always has the best performance. Consequently, increasing the number of incremental learning steps helps improve the detection result. Moreover, as shown in Figures 7 and 8, the incremental learning algorithms ILDA, ITPCA, and ITDA noticeably reduce the time and space complexity compared with their nonincremental counterparts. Furthermore, since ITPCA and ITDA adopt tensor representation, they have lower time and space requirements than LDA.

5. Conclusions
In this paper, incremental tensor discriminant analysis (ITDA) is investigated. It adopts tensor representation to keep the structure information of high-dimensional images and introduces incremental learning to support online learning. The paper also proves theoretically the relationship between ITDA and the graph embedding framework. The facial detection experiments show that ITDA performs better than TDA and noticeably reduces the time and space complexity.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The present work has been funded by the Young Scientist Project of Chengdu University (no. 2013XJZ21), the Science and Technology Support Program of Sichuan Province, China, under Grant no. 2014GZ0013, and the Education Department of Sichuan Province, China, under Grant no. 13ZA0297.
References
[1] M. Murtaza, M. Sharif, M. Raza, and J. Shah, “Face recognition using adaptive margin fisher's criterion and linear discriminant analysis,” International Arab Journal of Information Technology, vol. 11, no. 2, pp. 1–11, 2014.
[2] C. R. Rao, “The utilization of multiple measurements in problems of biological classification,” Journal of the Royal Statistical Society Series B (Methodological), vol. 10, no. 2, pp. 159–203, 1948.
[3] C. Liu, K. He, J.-L. Zhou, and C.-B. Gao, “Discriminant orthogonal rank-one tensor projections for face recognition,” in Intelligent Information and Database Systems, vol. 6592 of Lecture Notes in Computer Science, pp. 203–211, Springer, Berlin, Germany, 2011.
[4] G.-F. Lu, Z. Lin, and Z. Jin, “Face recognition using discriminant locality preserving projections based on maximum margin criterion,” Pattern Recognition, vol. 43, no. 10, pp. 3572–3579, 2010.
[5] H. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, “Uncorrelated multilinear discriminant analysis with regularization and aggregation for tensor object recognition,” IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 103–123, 2009.
[6] J.-L. Minoi, C. E. Thomaz, and D. F. Gillies, “Tensor-based multivariate statistical discriminant methods for face applications,” in Proceedings of the International Conference on Statistics in Science, Business, and Engineering (ICSSBE '12), pp. 1–6, IEEE, Langkawi, Malaysia, September 2012.
[7] F. Nie, S. Xiang, Y. Song, and C. Zhang, “Extracting the optimal dimensionality for local tensor discriminant analysis,” Pattern Recognition, vol. 42, no. 1, pp. 105–114, 2009.
[8] N. Tang, X. Z. Gao, and X. Li, “Tensor subclass discriminant analysis for radar target classification,” Electronics Letters, vol. 48, no. 8, pp. 455–456, 2012.
[9] D. Tao, X. Li, X. Wu, and S. J. Maybank, “General tensor discriminant analysis and Gabor features for gait recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 10, pp. 1700–1715, 2007.
[10] S.-J. Wang, J. Yang, M.-F. Sun, X.-J. Peng, M.-M. Sun, and C.-G. Zhou, “Sparse tensor discriminant color space for face verification,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 876–888, 2012.
[11] Z.-Z. Yu, C.-C. Jia, W. Pang, C.-Y. Zhang, and L.-H. Zhong, “Tensor discriminant analysis with multiscale features for action modeling and categorization,” IEEE Signal Processing Letters, vol. 19, no. 2, pp. 95–98, 2012.
[12] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin, “Graph embedding and extensions: a general framework for dimensionality reduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, pp. 40–51, 2007.
[13] A. Joseph, Y.-M. Jang, S. Ozawa, and M. Lee, “Extension of incremental linear discriminant analysis to online feature extraction under nonstationary environments,” in Neural Information Processing, vol. 7664 of Lecture Notes in Computer Science, pp. 640–647, Springer, Berlin, Germany, 2012.
[14] G.-F. Lu, J. Zou, and Y. Wang, “Incremental complete LDA for face recognition,” Pattern Recognition, vol. 45, no. 7, pp. 2510–2521, 2012.
[15] G.-F. Lu, J. Zou, and Y. Wang, “Incremental learning of complete linear discriminant analysis for face recognition,” Knowledge-Based Systems, vol. 31, pp. 19–27, 2012.
[16] G.-F. Lu, J. Zou, and Y. Wang, “Incremental learning of discriminant common vectors for feature extraction,” Applied Mathematics and Computation, vol. 218, no. 22, pp. 11269–11278, 2012.
[17] Q. Wang and L. Zhang, “Least squares online linear discriminant analysis,” Expert Systems with Applications, vol. 39, no. 1, pp. 1510–1517, 2012.
[18] J. Sun, D. Tao, S. Papadimitriou, P. S. Yu, and C. Faloutsos, “Incremental tensor analysis: theory and applications,” ACM Transactions on Knowledge Discovery from Data, vol. 2, no. 3, article 11, 2008.
[19] J.-G. Wang, E. Sung, and W.-Y. Yau, “Incremental two-dimensional linear discriminant analysis with applications to face recognition,” Journal of Network and Computer Applications, vol. 33, no. 3, pp. 314–322, 2010.
[20] J. Wen, X. Gao, Y. Yuan, D. Tao, and J. Li, “Incremental tensor biased discriminant analysis: a new color-based visual tracking method,” Neurocomputing, vol. 73, no. 4–6, pp. 827–839, 2010.
[21] P. N. Belhumeur, J. P. Hespanha, and D. Kriegman, “Eigenfaces vs. fisherfaces: recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.
[22] H. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, “MPCA: multilinear principal component analysis of tensor objects,” IEEE Transactions on Neural Networks, vol. 19, no. 1, pp. 18–39, 2008.
[23] C. Liu, T. Yan, W. Zhao et al., “Incremental tensor principal component analysis for handwritten digit recognition,” Mathematical Problems in Engineering, vol. 2014, Article ID 819758, 10 pages, 2014.
Copyright
Copyright © 2015 Liu Chang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.