Abstract

A new support vector machine (SVM) multiclass incremental learning algorithm is proposed. For each class of training samples, a hyperellipsoidal classifier that encloses as many samples as possible while pushing outlier samples away is trained in the feature space. When new samples are added to the classification system, the algorithm reuses the old classifiers that are unrelated to the classes of the new samples. To classify a sample, Mahalanobis distances are used to decide its class. If the sample point is not enclosed by any hyperellipsoid, or is enclosed by more than one, a membership function is used to confirm its class. The experimental results show that the algorithm achieves higher classification precision and classification speed.

1. Introduction

SVMs [1], as a machine learning method based on statistical learning theory, deliver state-of-the-art performance in real-world pattern recognition [2, 3] and data mining applications such as text categorization, handwritten character recognition, image classification, and bioinformatics, and even in the control field [4, 5]. However, the general training method of SVMs does not work when the training set is too large to fit into a computer's RAM. To solve this problem and speed up SVM training, incremental learning has become one of the key techniques for training SVMs on large data sets, especially for multiclass problems.

When using SVMs for multiclass classification, the four most popular approaches are one-against-one [6], one-against-rest [7], DAGSVM [8], and binary tree SVM [9–11]. Several incremental learning algorithms have been proposed, such as Batch SVM [12, 13], an online recursive algorithm [14], a divisional training algorithm [15], and a fast incremental learning algorithm [16]. However, these multiclass approaches are based on binary classifiers: when new samples are added to the classification system, the whole classifier model must be retrained. Reference [17] proposed a class-incremental learning algorithm that reuses the old classifier models and trains only one binary classifier when a new class arrives, but it is not suitable for large data sets, and the new sample set cannot contain samples of old classes. Reference [18] proposed a multiclass incremental learning algorithm based on hypersphere SVMs (HSSVMIL) that reuses the old classifier models and supports class-incremental learning and sample-incremental learning of old classes at the same time. However, it requires every class to be approximately hyperspherical in distribution with high sample density in the feature space; otherwise, its precision degrades. To address this disadvantage, [19] proposed a multiclass incremental learning algorithm based on hyperellipsoids (HEIL), but that algorithm does not consider the influence of outlier samples.

In this paper, a Mahalanobis hyperellipsoidal SVM multiclass incremental learning algorithm (MSVMIL) is proposed. For each class, the smallest hyperellipsoid that contains as many samples as possible while pushing outlier samples away is trained in the feature space, and Mahalanobis distances are used to determine the class of a test sample.

This paper is organized as follows. Section 2 reviews the hyperellipsoidal SVM. Section 3 describes the new multiclass incremental learning algorithm in detail. Section 4 gives experimental results on the Reuters-21578 corpus. Finally, Section 5 outlines the conclusions.

2. Hyperellipsoidal Support Vector Machine

Given a training set $T=\{x_1,\ldots,x_l\}$ of one class, where $x_i \in R^d$, let $X$ be the sample matrix. A hyperellipsoid $(c, R)$ is trained in the feature space, where $c$ is the center of the hyperellipsoid and $R$ is its radius. The hyperellipsoid should contain most of the samples while keeping the radius as small as possible. If there are no remote points, the hyperellipsoid contains all samples; if there are remote points, some samples are allowed to lie outside it, and the smallest hyperellipsoid containing most of the samples is trained. Since it is generally unknown whether remote points exist, nonnegative slack variables $\xi_i$ are introduced to allow some samples to fall outside the hyperellipsoid. A method similar to finding the optimal hyperplane is used to obtain the smallest hyperellipsoid [19–21]. The formulation is as follows:

$$\min_{R,c,\xi}\; R^2 + C\sum_{i=1}^{l}\xi_i \quad \text{s.t.}\quad (\phi(x_i)-c)^{T}\Sigma^{-1}(\phi(x_i)-c)\le R^2+\xi_i,\;\; \xi_i\ge 0,\;\; i=1,\ldots,l, \qquad (1)$$

where $C$ trades off the number of samples left outside the hyperellipsoid against the radius of the hyperellipsoid, $\phi(\cdot)$ is the feature mapping, and $\Sigma$ is the covariance matrix of the samples.

To solve the optimization problem above, one can construct the Lagrange function

$$L(R,c,\xi,\alpha,\beta)=R^2+C\sum_{i=1}^{l}\xi_i-\sum_{i=1}^{l}\alpha_i\left[R^2+\xi_i-(\phi(x_i)-c)^{T}\Sigma^{-1}(\phi(x_i)-c)\right]-\sum_{i=1}^{l}\beta_i\xi_i, \qquad (2)$$

where $\alpha_i \ge 0$ and $\beta_i \ge 0$ are the Lagrange multipliers.

According to the Karush-Kuhn-Tucker (KKT) conditions of optimization theory, the optimum satisfies

$$\frac{\partial L}{\partial R}=0 \Rightarrow \sum_{i=1}^{l}\alpha_i=1, \qquad \frac{\partial L}{\partial c}=0 \Rightarrow c=\sum_{i=1}^{l}\alpha_i\phi(x_i), \qquad \frac{\partial L}{\partial \xi_i}=0 \Rightarrow \alpha_i=C-\beta_i. \qquad (3)$$

Substituting (3) into (2), the dual optimization problem is obtained:

$$\max_{\alpha}\;\sum_{i=1}^{l}\alpha_i\,(\phi(x_i))^{T}\Sigma^{-1}\phi(x_i)-\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j\,(\phi(x_i))^{T}\Sigma^{-1}\phi(x_j) \quad \text{s.t.}\quad \sum_{i=1}^{l}\alpha_i=1,\;\; 0\le\alpha_i\le C. \qquad (4)$$

The kernel form of (4) is

$$\max_{\alpha}\;\sum_{i=1}^{l}\alpha_i K(x_i,x_i)-\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i,x_j) \quad \text{s.t.}\quad \sum_{i=1}^{l}\alpha_i=1,\;\; 0\le\alpha_i\le C, \qquad (5)$$

where $K(x_i,x_j)=(\phi(x_i))^{T}\Sigma^{-1}\phi(x_j)$ is the kernel function.

The samples whose Lagrange multipliers $\alpha_i$ are nonzero lie on or outside the boundary of the hyperellipsoid; these samples are called support vectors.
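For concreteness, the dual (5) is a standard quadratic program and can be handed to any QP solver. Below is a minimal sketch in Python using cvxopt, assuming a precomputed kernel (Gram) matrix for one class; the function name and the tolerance are illustrative choices, not part of the original algorithm.

```python
import numpy as np
from cvxopt import matrix, solvers

def train_hyperellipsoid(K, C):
    """Solve dual (5): max sum_i a_i K_ii - sum_ij a_i a_j K_ij
    subject to sum_i a_i = 1 and 0 <= a_i <= C.
    K is the l x l kernel matrix of one class's training samples."""
    l = K.shape[0]
    solvers.options['show_progress'] = False
    # cvxopt minimizes (1/2) a^T P a + q^T a, so negate the objective.
    P = matrix(2.0 * K)
    q = matrix(-np.diag(K))
    # Box constraints 0 <= a_i <= C, stacked as G a <= h.
    G = matrix(np.vstack([-np.eye(l), np.eye(l)]))
    h = matrix(np.hstack([np.zeros(l), C * np.ones(l)]))
    # Equality constraint sum_i a_i = 1.
    A = matrix(np.ones((1, l)))
    b = matrix(np.ones(1))
    sol = solvers.qp(P, q, G, h, A, b)
    alpha = np.array(sol['x']).ravel()
    sv = np.where(alpha > 1e-6)[0]  # indices of the support vectors
    return alpha, sv
```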

The center of the smallest hyperellipsoid is obtained from the optimal multipliers as

$$c=\sum_{x_i\in SV}\alpha_i\phi(x_i). \qquad (6)$$

The Mahalanobis distance from a sample $x$ to the center of the hyperellipsoid is

$$d^{2}(x)=(\phi(x)-c)^{T}\Sigma^{-1}(\phi(x)-c)=K(x,x)-2\sum_{i=1}^{l}\alpha_i K(x_i,x)+\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i,x_j). \qquad (7)$$

The radius of the smallest hyperellipsoid follows from the KKT conditions: for any support vector $x_k$ with $0<\alpha_k<C$,

$$R^{2}=d^{2}(x_k)=K(x_k,x_k)-2\sum_{i=1}^{l}\alpha_i K(x_i,x_k)+\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i,x_j). \qquad (8)$$
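Given the optimal multipliers from (5), both the squared distance (7) and the squared radius (8) can be evaluated purely through kernel values. A minimal sketch continuing the code above (function names are illustrative):

```python
import numpy as np

def squared_distance(K_xx, k_x, alpha, aKa):
    """Eq. (7): d^2(x) = K(x,x) - 2 sum_i a_i K(x_i, x) + a^T K a.
    K_xx = K(x, x); k_x[i] = K(x_i, x); aKa = alpha^T K alpha."""
    return K_xx - 2.0 * alpha @ k_x + aKa

def radius_squared(K, alpha, C, tol=1e-6):
    """Eq. (8): R^2 = d^2(x_k) for any support vector x_k with
    0 < alpha_k < C, i.e., one lying exactly on the boundary."""
    aKa = alpha @ K @ alpha
    on_boundary = np.where((alpha > tol) & (alpha < C - tol))[0]
    k = on_boundary[0]
    return squared_distance(K[k, k], K[:, k], alpha, aKa)
```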

Remark 1. The Mahalanobis distance measures the distance between a data point and the centroid of a multivariate distribution, that is, its overall mean, while accounting for the covariance of the data.

3. Multiclass Incremental Learning Algorithm

Given a training set $T=\{(x_1,y_1),\ldots,(x_l,y_l)\}$ and a kernel function $K(\cdot,\cdot)$, where $l$ is the number of samples, $x_i\in R^d$, $y_i\in\{1,\ldots,k\}$, and $k$ is the number of classes. $K(x_i,x_j)$ corresponds to an inner product in the feature space, namely, $K(x_i,x_j)=(\phi(x_i))^{T}\Sigma^{-1}\phi(x_j)$.

Assume that $T_j$ ($j=1,\ldots,k$) is the subset of $T$ all of whose samples belong to the $j$th class. For every subset $T_j$, train the smallest hyperellipsoid $(c_j,R_j)$ in the feature space, where $c_j$ is the center of the hyperellipsoid and $R_j$ is its radius, and let $SV_j$ denote its support vector set.

Suppose a new sample set $T'$ is added to the old classification system, where $T'_j$ is the subset of $T'$ all of whose samples belong to the $j$th class. The multiclass incremental learning algorithm based on the hyperellipsoidal SVM is described in detail as follows.

Step 1. For every subset $T'_j$ whose class does not exist in the old classification system, let $T_j=T'_j$; train the smallest hyperellipsoid $(c_j,R_j)$ with $T_j$ in the feature space and save the support vectors to $SV_j$.

Step 2. For every subset $T'_j$ whose class already exists in the old classification system, if $T'_j\neq\emptyset$, let $T_j=SV_j\cup T'_j$; retrain the hyperellipsoid $(c_j,R_j)$ with $T_j$ in the feature space and refine $SV_j$. The classifiers of old classes that receive no new samples are reused unchanged.
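Steps 1 and 2 amount to retraining only the ellipsoids of classes that actually receive new data, while every other classifier is kept as-is. The sketch below shows that bookkeeping, reusing train_hyperellipsoid and radius_squared from Section 2; the model layout and the RBF helper are illustrative assumptions, not prescribed by the paper.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Illustrative RBF kernel; the experiments in Section 4 also use an RBF.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def incremental_update(models, new_data, C=1.0):
    """models: dict class -> {'SV': samples, 'alpha': multipliers, 'R2': radius^2}.
    new_data: dict class -> array of new samples. Classes absent from
    new_data keep their old classifiers untouched."""
    for cls, X_new in new_data.items():
        if cls not in models:                      # Step 1: a brand-new class
            X = X_new
        else:                                      # Step 2: old SVs + new samples
            X = np.vstack([models[cls]['SV'], X_new])
        K = rbf_kernel(X, X)
        alpha, sv = train_hyperellipsoid(K, C)
        # Non-SV multipliers are numerically zero, so renormalizing the kept
        # ones preserves sum(alpha) = 1.
        models[cls] = {'SV': X[sv],
                       'alpha': alpha[sv] / alpha[sv].sum(),
                       'R2': radius_squared(K, alpha, C)}
    return models
```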

For a sample $x$ to be classified, compute the Mahalanobis distances $d_j(x)$, $j=1,\ldots,k$, to each hyperellipsoid center according to (7) in the feature space.

If no hyperellipsoid contains the sample point, that is, $d_j(x)>R_j$ for all $j$, then compute the membership of the sample in the $j$th class according to (9) and confirm the class of the sample according to (10).

If the sample point is contained by two or more hyperellipsoids, first compute the membership of the sample in each such class $m$ according to (11) and then confirm the class of the sample according to (10).

For a sample $x$ to be classified, the multiclass classification algorithm is described as follows.

Step 1. Compute $d_j(x)$, $j=1,\ldots,k$, according to (7).

Step 2. If the sample point is contained by exactly one hyperellipsoid $(c_j,R_j)$, the sample belongs to the $j$th class; go to Step 5. Otherwise, go to Step 3.

Step 3. If the sample point is not contained by any hyperellipsoid, compute the membership of the sample in each class according to (9), confirm the class of the sample according to (10), and go to Step 5. Otherwise, go to Step 4.

Step 4. If the sample point is contained by two or more hyperellipsoids $(c_m,R_m)$, compute the membership of the sample in the $m$th class according to (11), confirm the class of the sample according to (10), and go to Step 5.

Step 5. End.
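Steps 1-5 can be sketched as follows, reusing the helpers from the earlier sketches. Because equations (9)-(11) are not reproduced above, the membership used here is a simple normalized distance, $(R_j^2-d_j^2(x))/R_j^2$; it is a stand-in assumption for illustration, not the paper's exact membership function.

```python
def classify(models, x, sigma=1.0):
    """Steps 1-5: distances to every ellipsoid, then a containment test."""
    d2, R2 = {}, {}
    for cls, m in models.items():
        k_x = rbf_kernel(m['SV'], x[None, :], sigma).ravel()
        aKa = m['alpha'] @ rbf_kernel(m['SV'], m['SV'], sigma) @ m['alpha']
        d2[cls] = squared_distance(1.0, k_x, m['alpha'], aKa)  # K(x,x)=1 for RBF
        R2[cls] = m['R2']
    inside = [c for c in d2 if d2[c] <= R2[c]]
    if len(inside) == 1:          # Step 2: exactly one ellipsoid contains x
        return inside[0]
    # Steps 3-4: fall back to a membership score over the candidates.
    # Illustrative form only; the paper uses eqs. (9)-(11).
    cands = inside if inside else list(d2)
    member = {c: (R2[c] - d2[c]) / R2[c] for c in cands}
    return max(member, key=member.get)
```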

4. Experiments

Experiments are performed on the Reuters-21578 corpus, from which five categories comprising 2302 texts are used; 1536 texts serve as the training set and the rest as the testing set (see Table 1). Information gain is used to reduce the feature dimension, and the weight of every word is computed by TF-IDF.

To verify the efficiency of the proposed method, the same task is carried out with HSSVMIL, HEIL, and MSVMIL. The experiments were run on a 1.6 GHz Pentium with 512 MB of memory. The kernel function is the radial basis function (RBF) $K(x_i,x_j)=\exp(-\|x_i-x_j\|^2/2\sigma^2)$. A fixed penalty parameter is used for MSVMIL and a fixed system parameter for HSSVMIL.

The macroaverage precision (MAAP), macroaverage recall (MAAR), and macroaverage $F_1$ (MAAF) are used to evaluate the classification performance of the algorithms. They are defined as

$$\mathrm{MAAP}=\frac{1}{k}\sum_{i=1}^{k}P_i, \qquad \mathrm{MAAR}=\frac{1}{k}\sum_{i=1}^{k}R_i, \qquad \mathrm{MAAF}=\frac{1}{k}\sum_{i=1}^{k}F_i,$$

where $k$ is the number of classes, $P_i$ is the precision on the $i$th class, $R_i$ is the recall on the $i$th class, and $F_i=2P_iR_i/(P_i+R_i)$ is the $F_1$ value of the $i$th class.
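As a check on the definitions above, the three macro averages can be computed directly from per-class counts; a minimal sketch (array layout is an illustrative choice):

```python
import numpy as np

def macro_metrics(y_true, y_pred, classes):
    """MAAP, MAAR, MAAF: unweighted means of per-class P_i, R_i, F_i."""
    P, R, F = [], [], []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        P.append(p); R.append(r)
        F.append(2 * p * r / (p + r) if p + r else 0.0)
    return np.mean(P), np.mean(R), np.mean(F)
```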

In the experiments, the original dataset includes two classes of samples (Acq and Earn), and incremental learning is performed three times. The first increment includes three classes (Acq, Earn, and Grain). The second increment includes four classes (Acq, Earn, Grain, and Crude). The third increment includes four classes (Acq, Earn, Crude, and Trade). The macroaverage precision, macroaverage recall, and macroaverage $F_1$ of the three algorithms are given in Table 2. The training and testing times of the three algorithms are given in Table 3.

The experimental results show that the classification precision and recall of MSVMIL are higher than those of HEIL and HSSVMIL. The main reasons are that MSVMIL reduces the volume enclosing the sample points by pushing outlier samples away and accounts for the distribution of the samples by using the Mahalanobis distance. The classification speed of MSVMIL is faster than that of HSSVMIL and roughly equal to that of HEIL, while its training speed is faster than that of HEIL and roughly equal to that of HSSVMIL.

5. Conclusion

To solve the SVM multiclass incremental learning problem, a novel algorithm based on the Mahalanobis hyperellipsoidal SVM is proposed. In the process of incremental learning, only the new samples and the support vectors of old classes that appear in the new samples take part in training; the historical classifiers that have nothing to do with the new samples are reused. In the process of classification, the Mahalanobis distance is used to determine the class of the sample. The experimental results show that, compared with HEIL and HSSVMIL, the proposed algorithm achieves higher classification precision and classification speed. Data-driven methods [22, 23] are another direction worth considering and will be explored in future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This study was partly supported by the National Natural Science Foundation of China (nos. 61304149 and 11171042), the Natural Science Foundation of Liaoning Province in China (no. 201202003), and the Education Committee Project of Liaoning Province in China (no. L2014444).