Abstract

Support vector machine theory has by now developed into a very mature system. In this paper, the optimization problem solved by the original support vector machine is transformed into a direct formula for computing dividing lines, which gives the model a fixed, explicit time complexity. In our model, weighted learning, the multiclassification problem, and online learning all become direct corollaries, and we have applied the new model to UCI data sets. We hope that in the future this model will be useful in real-world problems, such as stock forecasting, that require nonlinear high-speed algorithms.

1. Introduction

Since its establishment by Vapnik et al. in 1995 ([13]), the support vector machine (SVM) has been a focus of researchers in data mining. The classical SVM handles the classic binary classification problem, where SVM labels unlabeled points by solving for an optimal separating line. One of the basic principles of SVM is the use of kernel technology, so that the specific mapping need not be known explicitly: through a simple inner product, a nonlinear problem in the original space can be solved as a linear problem in the space it is mapped to. Moreover, the model can significantly improve the time complexity of small sample problems through dual theory, and one of its classic ideas, the maximum margin, is also widely used in various models.

In 2007, Jayadeva et al. ([47]) established a model known as the twin support vector machine (TWSVM), which also solves the classic binary classification problem. Unlike SVM, TWSVM is mainly used to solve nonparallel problems: the support surfaces of the SVM are two parallel hyperplanes, whereas TWSVM solves for two nonparallel hyperplanes. This model no longer uses the maximum margin principle of SVM. Instead, TWSVM solves for two straight lines, each as close as possible to one of the two classes of points, and classification is performed by determining which line a new point is closer to.

SVM has a time complexity that grows roughly as $O(n^3)$ when the number of samples is $n$, while the number of features $d$ enters only through inner products, making SVM very suitable for solving high-dimensional small sample problems. Some other algorithms are also suitable for small samples, but they have mutual advantages and disadvantages with SVM. The kernel technique also makes SVM very suitable for solving nonlinear and high-dimensional problems. There are also other algorithms suitable for high-dimensional problems, but again with mutual advantages and disadvantages relative to SVM. However, a common problem with a range of algorithms based on SVM is the inability to solve large-sample problems, due to the limitations of the optimization algorithm. Therefore, we want to provide an algorithm that maintains the good properties of SVM while reducing the time required for large sample sizes. We give a nonoptimization SVM model (one that solves no optimization problem) with a time complexity of $O(pqd)$, where $p$ and $q$ are the numbers of positive and negative points. This improvement goes some way toward circumventing the problem that SVM cannot be applied to large-scale data. Our model can be applied to many high-dimensional or large sample problems and has substantial implications for solving real-world problems with high dimensions, large samples, and high time demands.

In this paper, we consider a kind of nonoptimization machine learning model built from the viewpoint of a single positive point and a single negative point. The model takes one point from the positive class and one from the negative class to train a submodel and then repeats this over all combinations of positive and negative points. Finally, classification is performed by aggregating the multiple submodels.

The basic logic of the model is that each such pair of points can be used to train a small classification model. On this basis, we construct a classification model in which kernel technology can be used. It is interesting to note that in this model, common machine learning problems, such as the classification problem, the weighted problem, and the fitting problem, become straightforward to handle.

The details of our research are shown in Figure 1.

2. Classical Model

Consider the classic binary classification problem: given a training set $T = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{+1, -1\}$ is the label, we have to look for a decision function $f(x)$ to infer, for any new input $x$, the corresponding output $y$. To facilitate the presentation, we use the following notation: $A$ denotes the data set of positive class points, and $B$ denotes the data set of negative class points.

First, we review the classical linear SVM model. The model aims to establish a straight line between the positive and negative samples. One of the principles of SVM is the maximum margin principle, that is, to maximize the distance between the two support planes. We assume that the dividing surface is $w \cdot x + b = 0$ and that the two support surfaces are $w \cdot x + b = 1$ and $w \cdot x + b = -1$. Solving the SVM is then transformed into the following optimization problem:

$$\min_{w,b,\xi} \ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.} \quad y_i(w \cdot x_i + b) \ge 1 - \xi_i, \ \xi_i \ge 0, \ i = 1,\ldots,n.$$

On the other hand, the dual optimization problem is as follows:

$$\max_{\alpha} \ \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j (x_i \cdot x_j) \quad \text{s.t.} \quad \sum_{i=1}^{n}\alpha_i y_i = 0, \ 0 \le \alpha_i \le C.$$

Then, we consider kernel technology in the dual problem and transform $x_i \cdot x_j$ into $K(x_i, x_j)$. The Euclidean space is thereby mapped into another space, and the nonlinear problem is transformed into a linearly separable problem in a high-dimensional space.
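
As a concrete illustration (our own, not part of the original model), this kernel substitution is exactly what off-the-shelf SVM implementations expose as the kernel argument. A minimal sketch using scikit-learn's SVC on an assumed toy data set:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (assumed for illustration): two noisy classes in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

# Linear kernel: the dual uses plain inner products x_i . x_j.
linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)

# RBF kernel: x_i . x_j is replaced by K(x_i, x_j) = exp(-gamma ||x_i - x_j||^2).
rbf_svm = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)

print(linear_svm.predict([[1.5, 1.5]]), rbf_svm.predict([[1.5, 1.5]]))
```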

We look back on another machine learning algorithm: the TWSVM. The TWSVM focuses on solving the nonparallel problem, in which the two types of sample points cluster near two nonparallel lines. The model aims to find two nonparallel straight lines $w_1 \cdot x + b_1 = 0$ and $w_2 \cdot x + b_2 = 0$; classification is performed by checking which line a new point lies closer to. Writing the positive points as the rows of a matrix $A$ and the negative points as the rows of $B$, the optimization problems of the model are as follows:

$$\min_{w_1,b_1,\xi} \ \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^{T}\xi \quad \text{s.t.} \quad -(B w_1 + e_2 b_1) + \xi \ge e_2, \ \xi \ge 0,$$

$$\min_{w_2,b_2,\eta} \ \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_2 e_1^{T}\eta \quad \text{s.t.} \quad (A w_2 + e_1 b_2) + \eta \ge e_1, \ \eta \ge 0,$$

where $e_1$ and $e_2$ are vectors of ones of appropriate dimensions.

The dual problems of the model are as follows:

$$\max_{\alpha} \ e_2^{T}\alpha - \frac{1}{2}\alpha^{T} G (H^{T}H)^{-1} G^{T}\alpha \quad \text{s.t.} \quad 0 \le \alpha \le c_1 e_2,$$

$$\max_{\gamma} \ e_1^{T}\gamma - \frac{1}{2}\gamma^{T} H (G^{T}G)^{-1} H^{T}\gamma \quad \text{s.t.} \quad 0 \le \gamma \le c_2 e_1,$$

where $H = [A \ \ e_1]$ and $G = [B \ \ e_2]$, and the plane parameters are recovered as $[w_1; b_1] = -(H^{T}H)^{-1}G^{T}\alpha$ and $[w_2; b_2] = (G^{T}G)^{-1}H^{T}\gamma$.
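
For concreteness, here is a minimal numpy/scipy sketch of solving the first dual above with a generic box-constrained solver. This is a simplification under our own assumptions (real TWSVM implementations use dedicated QP solvers, and the small ridge term added for invertibility is ours):

```python
import numpy as np
from scipy.optimize import minimize

def twsvm_plane(A, B, c1=1.0, ridge=1e-6):
    """Solve the first TWSVM dual: the plane that hugs class A (rows of A)."""
    H = np.hstack([A, np.ones((A.shape[0], 1))])   # H = [A  e1]
    G = np.hstack([B, np.ones((B.shape[0], 1))])   # G = [B  e2]
    HtH_inv = np.linalg.inv(H.T @ H + ridge * np.eye(H.shape[1]))
    M = G @ HtH_inv @ G.T
    e2 = np.ones(G.shape[0])
    # Minimize the negated dual objective under the box constraint 0 <= alpha <= c1.
    obj = lambda a: 0.5 * a @ M @ a - e2 @ a
    grad = lambda a: M @ a - e2
    res = minimize(obj, np.zeros(len(e2)), jac=grad,
                   bounds=[(0.0, c1)] * len(e2), method="L-BFGS-B")
    u = -HtH_inv @ G.T @ res.x                     # u = [w1; b1]
    return u[:-1], u[-1]                           # the second plane is symmetric
```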

In order to introduce kernel technology, we consider replacing the two straight lines with the kernel-generated surfaces $K(x^{T}, C^{T})u_1 + b_1 = 0$ and $K(x^{T}, C^{T})u_2 + b_2 = 0$, where $C = [A; B]$ is the matrix stacking $A$ and $B$.

The dual problems then take the same form as above, with $H = [K(A, C^{T}) \ \ e_1]$ and $G = [K(B, C^{T}) \ \ e_2]$.

3. New Model

Firstly, we consider the process of learning. Suppose we have only one sample point: for example, our problem is to determine whether the person in a picture is male or female, and the training set is just one picture of a lady. Then we cannot judge whether the character in another picture is male or female; it is difficult to classify when we have only one class of points. By the same token, even though our training set contains ten thousand photographs of women, without a single photo of a man, we are still unable to train a model that can distinguish men from women. In normal human thinking, it is difficult to describe the difference between a man and ten thousand women; we can only compare the difference between one male and one female.

Therefore, we consider one positive point and one negative point; that is, the training set has only one positive point $x_+$ and one negative point $x_-$. Using the maximum margin idea of SVM, we obtain that the optimal line is the perpendicular bisector of the segment joining the two points, with the functional distance between each point and the line equal to 1, as shown in Figure 2.

Obviously, we can get the dividing line as follows:

$$f(x) = \frac{2(x_+ - x_-)\cdot x - \|x_+\|^2 + \|x_-\|^2}{\|x_+ - x_-\|^2},$$

so that $f(x_+) = 1$ and $f(x_-) = -1$.
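
As a sanity check, the two-point dividing line can be computed directly; the following is our own minimal sketch, not code from the paper:

```python
import numpy as np

def bisector(x_pos, x_neg):
    """Decision function of the perpendicular bisector of x_pos and x_neg,
    scaled so that f(x_pos) = +1 and f(x_neg) = -1."""
    diff = x_pos - x_neg
    denom = diff @ diff                      # ||x+ - x-||^2
    w = 2.0 * diff / denom
    b = -(x_pos @ x_pos - x_neg @ x_neg) / denom
    return lambda x: w @ x + b

f = bisector(np.array([2.0, 0.0]), np.array([0.0, 0.0]))
print(f(np.array([2.0, 0.0])), f(np.array([0.0, 0.0])))  # 1.0 -1.0
```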

Then, we consider a very interesting classification problem, as shown in Figure 3.

From Figure 3, we can see that pairing each positive point with each negative point produces a dividing line, and combining all of these lines gives a reasonable classifier. So we consider the following algorithm.

Then, we consider the general situation with arbitrary numbers of positive and negative points (to simplify the discussion, we assume there are $p$ positive points $x_+^{(i)}$ and $q$ negative points $x_-^{(j)}$). Taking one positive point and one negative point, we can get their perpendicular bisector as above.

We compute the bisectors of all $pq$ pairs of points and then use each of these sublines to consider the classification problem.

The core idea of this algorithm is to take each positive point and each negative point to build a subline, and then average the outputs of all the sublines, taking the sign of the average as the classification result:

$$g(x) = \operatorname{sign}\left(\frac{1}{pq}\sum_{i=1}^{p}\sum_{j=1}^{q} f_{ij}(x)\right),$$

where $f_{ij}$ is the dividing line built from $x_+^{(i)}$ and $x_-^{(j)}$.
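
A minimal vectorized sketch of this pairwise-bisector classifier (our own illustration; the variable names are assumptions):

```python
import numpy as np

def predict(X_pos, X_neg, X_new):
    """Average the p*q bisector decision values for each new point, then take the sign.
    X_pos: (p, d) positive points, X_neg: (q, d) negative points, X_new: (m, d)."""
    # Pairwise differences and squared distances, shapes (p, q, d) and (p, q).
    diff = X_pos[:, None, :] - X_neg[None, :, :]
    denom = np.sum(diff ** 2, axis=2)
    # f_ij(x) = (2 (x+ - x-) . x - ||x+||^2 + ||x-||^2) / ||x+ - x-||^2
    sq = np.sum(X_pos ** 2, axis=1)[:, None] - np.sum(X_neg ** 2, axis=1)[None, :]
    scores = (2.0 * diff @ X_new.T - sq[:, :, None]) / denom[:, :, None]
    return np.sign(scores.mean(axis=(0, 1)))     # (m,) labels in {-1, 0, +1}
```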

We now discuss an improvement of the model; consider the sample points in Figure 4.

If we train on these sample points with the model as calculated above, we obtain the result on the left side of the figure, so the trained model is not reasonable. The foundation of our model is the two-point dividing line, so we consider a linear translation, namely, the introduction of a translation parameter $\varepsilon$, making each trained line as follows:

$$f_{ij}(x) = \frac{2(x_+^{(i)} - x_-^{(j)})\cdot x - \|x_+^{(i)}\|^2 + \|x_-^{(j)}\|^2}{\|x_+^{(i)} - x_-^{(j)}\|^2} + \varepsilon.$$

Then, we introduce the kernel into each line. Since our model is composed only of inner products of vectors, we can replace each inner product $x \cdot z$ with $K(x, z)$, which gives:

$$f_{ij}(x) = \frac{2\left(K(x_+^{(i)}, x) - K(x_-^{(j)}, x)\right) - K(x_+^{(i)}, x_+^{(i)}) + K(x_-^{(j)}, x_-^{(j)})}{K(x_+^{(i)}, x_+^{(i)}) - 2K(x_+^{(i)}, x_-^{(j)}) + K(x_-^{(j)}, x_-^{(j)})} + \varepsilon.$$
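
A sketch of the kernelized subline, here with an RBF kernel as one possible choice (the kernel and the offset parameter are our illustrative assumptions):

```python
import numpy as np

def rbf(x, z, gamma=0.5):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_bisector(x_pos, x_neg, kernel=rbf, eps=0.0):
    """Kernelized two-point dividing line: every inner product becomes kernel(., .)."""
    kpp = kernel(x_pos, x_pos)
    knn = kernel(x_neg, x_neg)
    kpn = kernel(x_pos, x_neg)
    denom = kpp - 2.0 * kpn + knn            # ||phi(x+) - phi(x-)||^2 in feature space
    def f(x):
        return (2.0 * (kernel(x_pos, x) - kernel(x_neg, x)) - kpp + knn) / denom + eps
    return f
```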

It is evident that the time complexity of the model is $O(pqd)$. Because the classification result is an average over a large number of sublines, a weighted sum of the sample points is a direct corollary of the model. Similarly, if we want an online learning model, or wish to give up some of the sample points, the additional time cost is very low: adding or removing a point only adds or removes its sublines. Under this premise, the training complexity of the multiclassification problem is also very low.
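
For instance, an online update can be sketched by caching per-subline score sums, so that adding one positive point only touches the $q$ sublines it participates in. This is our own sketch, with assumed names, evaluated on a fixed query set for simplicity:

```python
import numpy as np

class OnlinePairwiseBisector:
    """Keeps the running sum of subline scores for a fixed query set X_eval,
    so adding one point costs O(q d) (or O(p d)) instead of full retraining."""
    def __init__(self, X_eval):
        self.X_eval = X_eval
        self.pos, self.neg = [], []
        self.score_sum = np.zeros(len(X_eval))

    def _subline(self, xp, xn):
        diff = xp - xn
        denom = diff @ diff
        return (2.0 * (self.X_eval @ diff) - xp @ xp + xn @ xn) / denom

    def add_positive(self, xp):
        for xn in self.neg:                   # only q new sublines
            self.score_sum += self._subline(xp, xn)
        self.pos.append(xp)

    def add_negative(self, xn):
        for xp in self.pos:                   # only p new sublines
            self.score_sum += self._subline(xp, xn)
        self.neg.append(xn)

    def predict(self):
        n_pairs = max(len(self.pos) * len(self.neg), 1)
        return np.sign(self.score_sum / n_pairs)
```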

4. Data Testing

First, we compute the linear kernel results on the UCI data sets; the accuracy and variance comparison is shown in Table 1.

We can see that the new model has a clear advantage. We then consider the computation with the nonlinear RBF kernel on the UCI data sets in Table 2.

In theory, we showed that the new algorithm has strong superiority in time complexity. Here, we run experiments to measure it, and it can be seen that the model has great advantages on some data sets. We first measure the computation time of the linear kernel on data sets of different sizes in Table 3.

In addition to the timing analysis of the linear problem, we also measure the time in the nonlinear case, applying the nonlinear kernel in Table 4.

The time complexity of our model is $O(pqd)$, whereas the SVM and the TWSVM must solve quadratic programs whose cost grows roughly cubically with the number of samples. Based on Tables 3 and 4, our model is still faster than the SVM and the TWSVM, even on small sample problems with fewer than 1000 samples. Also, since our algorithm does not need to solve an optimization problem, its running time is stable with respect to the growth of the number of samples. It can be expected that on large-scale samples, our algorithm will be much faster than SVM and TWSVM in computation.
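
A hedged sketch of how such a timing comparison can be reproduced, using scikit-learn's SVC as the SVM baseline and the pairwise-bisector `predict` sketched in Section 3 (the synthetic data and sizes are our assumptions, not the paper's benchmark):

```python
import time
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
for n in (100, 300, 1000):                    # samples per class (assumed sizes)
    X_pos = rng.normal(0.0, 1.0, (n, 10))
    X_neg = rng.normal(1.0, 1.0, (n, 10))
    X = np.vstack([X_pos, X_neg])
    y = np.hstack([np.ones(n), -np.ones(n)])

    t0 = time.perf_counter()
    SVC(kernel="linear").fit(X, y)
    t_svm = time.perf_counter() - t0

    t0 = time.perf_counter()
    predict(X_pos, X_neg, X[:10])             # predict() from the Section 3 sketch
    t_ours = time.perf_counter() - t0

    print(f"n={2*n}: SVC {t_svm:.3f}s, pairwise bisector {t_ours:.3f}s")
```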

5. Conclusion

Based on the point-to-point model, this paper establishes a microlearning support vector machine. The model differs from the traditional SVM in the way it is solved: it is not necessary to solve an optimization problem. This distinguishes the model from traditional machine learning algorithms. The training times of both neural networks and SVM are not known in advance, whereas the microlearning support vector machine runs in a fixed time once the feature dimension and the number of sample points are fixed. This is of great benefit to the stability of designs and practical applications. From the viewpoint of time complexity, the algorithm is better than SVM. Extending the microlearning support vector machine to weighted problems, multiclassification problems, and fitting problems is very simple and straightforward.

Our algorithm outperforms both SVM and TWSVM in terms of model accuracy and computation time, and it also has good nonlinear generalization because we likewise use kernel functions. The computation time of this algorithm is explicit because it does not require solving an optimization problem. Overall, this algorithm is well suited to problems such as stock prediction and face recognition, which involve nonlinear, high-dimensional data and require high computational speed.

Based on [8, 9], we can extend the model to semisupervised problems in the future; we just have not come up with a suitable modelling idea yet. We believe that the ideas used to build our model can also be extended to the field of feature extraction, where they can be applied to many related problems (e.g., [10, 11]).

We likewise believe that our model can be used to solve regression problems after an SVM-to-SVR-like transformation (e.g., [12–17]). Of particular interest is the fact that our algorithm is well suited to applications in financial forecasting (e.g., [18–20]), a field that requires algorithms with controllable computation times and good performance on nonlinear problems.

In the same way as SVM, our algorithm can be used to solve multiclassification problems, and we hope that other researchers will apply it to multiclassification-related problems in the future (e.g., [21–24]). It is worth noting that this algorithm can be used for face recognition (e.g., [25, 26]). Similarly, our model can be used to solve the multilabel problem (e.g., [27, 28]).

Data Availability

The data are from the UCI Machine Learning Repository.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All authors approved the submission.