Research Article  Open Access
Applying Least Squares Support Vector Machines to Mean-Variance Portfolio Analysis
Abstract
The portfolio selection problem introduced by Markowitz has been one of the most important research fields in modern finance. In this paper, we propose a model for portfolio management based on least squares support vector machines (LSSVM), which we call the LSSVM-mean-variance model. To verify the reliability of the LSSVM-mean-variance model, we conduct an empirical study and design an algorithm to illustrate the performance of the model using historical data from the Shanghai stock exchange. The numerical results show that the proposed model is useful when compared with the traditional Markowitz model. Comparing the efficient frontier and total wealth of both models, our model can provide a more measurable standard of judgment when investors make investment decisions.
1. Introduction
A portfolio is a combination of securities such as foreign exchange, stocks, and other market instruments. Stock investment has become very common among household investors. Investors have used many technical methods to minimize risk and optimize return. Among these methods, the model developed by Harry Markowitz in 1952 has serious practical limitations because of the complexity of compiling the variance, covariance, expectation, and standard deviation of each asset relative to the other assets in the portfolio. In recent years, scholars have done much work to make portfolio theory more efficient. In [1], the authors presented the different variants of the goal programming model that have been applied to the financial portfolio selection problem. In [2], the authors optimized the portfolio based on entropy and higher moments by using a polynomial goal programming model, and the results indicated that the proposed method is suited to portfolio models with higher moments. Portfolio optimization techniques also significantly improved the return-risk trade-off using the multiobjective evolutionary model proposed in [3]. The authors in [4] used the multi-period mean-variance model to investigate a defined contribution pension plan investment problem during the accumulation phase. To incorporate social responsibility, a modification of the Markowitz model was proposed in [5]. Konno [6] proposed a mean-absolute deviation portfolio optimization model and applied it to the Tokyo stock market. The results of numerical experiments showed that the model generated a portfolio quite similar to that of the Markowitz model within a fraction of the time required to solve the latter. In [7], the authors used independently estimated possibilistic return rates to deal with a portfolio selection problem.
However, few scholars have used machine learning methods to modify the Markowitz model. As we know, the return rate in the mean-variance model refers to the historical return rate, which can also be called historical volatility. Historical volatility refers to the standard deviation of the underlying asset price changes over a past period of time, and it represents the past behavior of volatility. The actual volatility at the trading point cannot be determined; it can only be predicted from historical volatility and current market information. In this paper, we predict the actual volatility from historical volatility by using machine learning. As one of the important applications of machine learning, LSSVM has been used to deal with various financial problems such as stock price prediction [8, 9] and regression [10]. Mustaffa [11] optimized LSSVM for non-volatile financial prediction. In [12], the authors proposed a time series forecasting model that used LSSVM optimized by the Grey Wolf optimizer algorithm. For time series prediction [13–15], Van Gestel [16] used LSSVM regression within the evidence framework to infer nonlinear models for predicting time series and volatility. By regularizing least squares fuzzy support vector regression, Khemchandani [17] handled financial time series forecasting. In [18], the authors optimized the LSSVM model by weighted particle swarm optimization and forecasted stock returns accurately.
In our study, we apply the LSSVM regression model to the traditional Markowitz model, and the proposed model achieves an efficient result when we compare the efficient frontier and total wealth of both models.
The rest of this paper is organized as follows. In Section 2, we describe the mean-variance, LSSVM, and LSSVM-mean-variance models. In Section 3, we describe the data set and software used for the empirical study. In Section 4, we present and discuss the empirical results. In Section 5, conclusions are given.
2. Model Description
2.1. Mean-Variance Model
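Modern portfolio theory was first introduced by Markowitz [19]. With this model, Markowitz proposed the formulation of an efficient frontier shown in a two-dimensional graph, from which investors can choose a portfolio that maximizes return for a given level of risk, as measured by the variance of returns. Suppose that the investor's wealth is $W_0$, the weights on the assets are $w$, and the return rate in the future is $r$; then the investor's wealth in the future will be $W = W_0(1 + w^{T} r)$. Investors usually determine the proportion of investment in each asset at the initial stage to maximize the expected utility of the investment value, where $U$ denotes the utility function. Then, the process can be formulated as
$$\max_{w} \; E[U(W)]. \tag{1}$$
By Taylor expansion of $U$ around $E[W]$,
$$U(W) = U(E[W]) + U'(E[W])\,(W - E[W]) + \frac{1}{2}\,U''(E[W])\,(W - E[W])^{2} + \cdots. \tag{2}$$
Assuming that the return series follow a normal distribution, the expectation of the above function depends only on the mean and the variance of $W$. Suppose $U$ is a concave function; then we can simplify (1) to
$$\min_{w} \; \frac{1}{2}\, w^{T} \Sigma w \quad \text{subject to} \quad w^{T}\mu = \mu_p, \quad w^{T}\mathbf{1} = 1, \tag{3}$$
where $\mu_p$ represents the portfolio's expected return required by the investor. Problem (3) is a quadratic programming problem that can be solved by the Lagrangian method. We give the first-order condition
$$\Sigma w - \lambda_1 \mu - \lambda_2 \mathbf{1} = 0, \quad w^{T}\mu = \mu_p, \quad w^{T}\mathbf{1} = 1, \tag{4}$$
where $\mu$ is the mean return vector of the $n$ assets, $\mathbf{1}$ is the unit column vector, $\Sigma$ is the covariance matrix of returns, and $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers. Then, the optimal investment proportion is
$$w^{*} = \Sigma^{-1}\left(\lambda_1 \mu + \lambda_2 \mathbf{1}\right), \quad \lambda_1 = \frac{C\mu_p - A}{D}, \quad \lambda_2 = \frac{B - A\mu_p}{D}, \tag{5}$$
with $A = \mathbf{1}^{T}\Sigma^{-1}\mu$, $B = \mu^{T}\Sigma^{-1}\mu$, $C = \mathbf{1}^{T}\Sigma^{-1}\mathbf{1}$, and $D = BC - A^{2}$.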
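The Lagrangian solution of the quadratic program described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the function name `mean_variance_weights` and the input numbers are hypothetical, and short selling is implicitly allowed because the only constraints are the target return and the budget constraint.

```python
import numpy as np

def mean_variance_weights(mu, sigma, target_return):
    """Minimum-variance weights subject to w^T mu = target_return and w^T 1 = 1."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ones = np.ones_like(mu)
    # Solve Sigma x = mu and Sigma x = 1 rather than forming the inverse explicitly.
    sinv_mu = np.linalg.solve(sigma, mu)
    sinv_one = np.linalg.solve(sigma, ones)
    a = ones @ sinv_mu          # A = 1^T Sigma^{-1} mu
    b = mu @ sinv_mu            # B = mu^T Sigma^{-1} mu
    c = ones @ sinv_one         # C = 1^T Sigma^{-1} 1
    d = b * c - a ** 2          # D = BC - A^2
    lam1 = (c * target_return - a) / d
    lam2 = (b - a * target_return) / d
    return lam1 * sinv_mu + lam2 * sinv_one

# Illustrative inputs only (assumed values, not data from the paper).
mu = np.array([0.010, 0.006, 0.008])
sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.020, 0.004],
                  [0.010, 0.004, 0.030]])
w = mean_variance_weights(mu, sigma, target_return=0.008)
print(w, w.sum(), w @ mu)
```

The returned weights satisfy both constraints by construction; a non-negativity constraint, if desired, would require a numerical quadratic programming solver instead of this closed form.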
2.2. Least Squares Support Vector Machine
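Support vector machines (SVM) have been successfully applied to financial problems, especially time series forecasting. LSSVM is the least squares formulation of SVM; a MATLAB/C toolbox for it is described by Pelckmans et al. [20]. LSSVM combines structural risk minimization and VC dimension theory [21] and is used for classification as well as regression, in tasks such as pattern recognition, function fitting [22, 23], and data analysis. The LSSVM algorithm is introduced as follows. The following regression model is constructed by using a nonlinear mapping function $\varphi(\cdot)$, which maps the input data to a higher dimensional feature space:
$$y(x) = w^{T}\varphi(x) + b,$$
where $x \in \mathbb{R}^{n}$, $y \in \mathbb{R}$, $w$ is the weight vector, and $b$ is the bias. Assume a training set
$$\{(x_k, y_k)\}_{k=1}^{N}.$$
The original optimization problem of the LSSVM can be formulated as
$$\min_{w,\,b,\,e} \; J(w, e) = \frac{1}{2}\, w^{T} w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^{2}$$
subject to the constraints
$$y_k = w^{T}\varphi(x_k) + b + e_k, \quad k = 1, \ldots, N,$$
where $\gamma$ is the regularization parameter and $e_k$ are the random errors. Using the Lagrange multiplier method, we have
$$L(w, b, e, \alpha) = J(w, e) - \sum_{k=1}^{N} \alpha_k \left[ w^{T}\varphi(x_k) + b + e_k - y_k \right],$$
where $\alpha_k$ are the Lagrange multipliers. Partially differentiating $L$ with respect to each parameter gives the optimality conditions
$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{k=1}^{N} \alpha_k \varphi(x_k), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{k=1}^{N} \alpha_k = 0, \quad \frac{\partial L}{\partial e_k} = 0 \Rightarrow \alpha_k = \gamma e_k, \quad \frac{\partial L}{\partial \alpha_k} = 0 \Rightarrow w^{T}\varphi(x_k) + b + e_k - y_k = 0.$$
After elimination of the parameters $w$ and $e$, we obtain the following matrix solution:
$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix},$$
where the entries of the matrix $\Omega$ are $\Omega_{kl} = \varphi(x_k)^{T}\varphi(x_l) = K(x_k, x_l)$. Here, $K$ is the radial basis function (RBF) kernel. For regression models, the RBF kernel is often applied because of its effectiveness and speed in the training process [24]. Then, we can obtain the regression function as
$$y(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b, \quad K(x, x_k) = \exp\left( -\frac{\lVert x - x_k \rVert^{2}}{2\sigma^{2}} \right).$$
The parameter $\sigma$ in the above function is the kernel width, which we use to adjust the degree of generalization. To build the LSSVM model, the parameters $\gamma$ and $\sigma$ must be chosen. In this paper, for the comparison test between the traditional mean-variance model and the proposed LSSVM-mean-variance model, we use the same fixed values of $\gamma$ and $\sigma$ unless otherwise specified.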
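The linear system obtained after eliminating the weight vector and the errors can be solved directly. The sketch below is a minimal NumPy illustration under stated assumptions, not the authors' MATLAB code: the function names, the kernel normalization $\exp(-\lVert x - x'\rVert^{2}/(2\sigma^{2}))$, the toy sine data, and the values `gamma=100.0` and `sigma=0.2` are all hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(x, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    lhs = np.zeros((n + 1, n + 1))
    lhs[0, 1:] = 1.0
    lhs[1:, 0] = 1.0
    lhs[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x_train, alpha, b, x_new, sigma):
    """Regression function y(x) = sum_k alpha_k K(x, x_k) + b."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b

# Toy data (assumed for illustration): one input feature, a noiseless sine target.
x = np.linspace(0.0, 1.0, 40)[:, None]
y = np.sin(2.0 * np.pi * x[:, 0])
b, alpha = lssvm_fit(x, y, gamma=100.0, sigma=0.2)
y_hat = lssvm_predict(x, alpha, b, x, sigma=0.2)
print(np.max(np.abs(y_hat - y)))
```

Because the multipliers satisfy $\alpha_k = \gamma e_k$, the training residuals can be read off as `alpha / gamma`, which gives a quick consistency check on the solve.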
2.3. LSSVM for Mean-Variance Model
In this section, we describe how LSSVM is applied to the mean-variance model. We first select a portfolio and then calculate the returns of the assets in the portfolio. In the LSSVM model, we take the matrix of the assets' returns as the training set and, following the process of Section 2.2, obtain a regression matrix of the assets' returns. Then, we apply the steps described in Section 2.1 both to the original returns and to the regression returns. Finally, we compare the efficient frontier and final wealth of the two methods, as shown in Section 4.
3. Data Set and Software
We select a portfolio consisting of three assets chosen from the Shanghai stock market. For the buy-and-sell test, we use the historical data of the stocks “zgyh”, “nyyh”, and “jtyh” from August 09, 2018, to October 26, 2018. We feed the data from August 09, 2018, to October 25, 2018, into the mean-variance model and the LSSVM-mean-variance model. There are 50 data points in total. In the LSSVM model, we divide the data into a training set of 39 points and a test set of 10 points. Then, we compare the performance of the two models on October 26, 2018. For the buy-and-hold test, we use the historical data of the stocks “zgyh”, “nyyh”, and “jtyh” from March 10, 2017, to March 12, 2018. We take the closing price every 5 days, which again gives 50 data points in total. In the LSSVM model, we also divide the data into a training set of 39 points and a test set of 10 points. Then, we compare the performance of the two models on March 19, 2018. All calculations are carried out in MATLAB R2016a.
4. Empirical Results
4.1. Empirical Results of Buy-and-Sell Strategy
The price trends of the stocks chosen in Section 3 are shown in Figure 1.
Figure 1: Price trends (candlestick charts) of the three stocks, panels (a) to (c).
The body of a candlestick spans the opening price and the closing price; the price excursions below or above the body are called the wicks. For a stock during the time interval represented, the wicks mark the lowest and highest prices, while the body marks the opening and closing prices. A red body indicates that the security closed higher than it opened, with the opening price at the bottom and the closing price at the top. A green body indicates that the security closed lower than it opened, with the opening price at the top and the closing price at the bottom.
Now we select the real historical data of the stocks “zgyh”, “nyyh”, and “jtyh” from August 09, 2018, to October 25, 2018. We use the daily closing prices, 50 data points in total, to calculate the daily return rate, which gives 49 return rates for each asset. The return rate of “jtyh” is shown in Figure 2 and reflected by “”. Then, we feed the return rates into the calculation process of the Markowitz model. Table 1 shows the resulting proportion of each asset for the traditional mean-variance model, where we set 10 investment proportion combinations.
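The step from 50 closing prices to 49 return rates, and from there to the inputs of the Markowitz model, can be sketched as follows. This is an illustration under assumptions: the text does not state whether simple or logarithmic returns are used, so simple returns are assumed here, and the price matrix is synthetic rather than the actual stock data.

```python
import numpy as np

def simple_returns(close):
    """Simple returns r_t = P_t / P_{t-1} - 1 along the time axis (axis 0)."""
    close = np.asarray(close, dtype=float)
    return close[1:] / close[:-1] - 1.0

# Synthetic closing prices (assumed): 50 days, 3 assets.
rng = np.random.default_rng(0)
prices = 3.5 * np.cumprod(1.0 + 0.01 * rng.standard_normal((50, 3)), axis=0)

returns = simple_returns(prices)       # 49 rows, one column per asset
mu = returns.mean(axis=0)              # mean return vector
sigma = np.cov(returns, rowvar=False)  # sample covariance matrix of returns
print(returns.shape, mu, sigma)
```

The vector `mu` and matrix `sigma` are exactly the quantities the mean-variance optimization takes as input.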

Figure 2: Return rates and their LSSVM regression, panels (a) to (c).
For comparison, we calculate the proportion of each asset for the LSSVM-mean-variance model by using LSSVM regression. We feed the return rates mentioned above into the LSSVM model described in Section 2.2, dividing the data into a training set of 39 points and a test set of 10 points. Then, we get the regression data, which is shown in Figure 2 and reflected by “”.
Then, we feed the regression return rates into the Markowitz model. Table 2 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.

Each investment proportion combination in the table corresponds to the maximum return for a given level of risk, as measured by the standard deviation. The points formed by the mean and the standard deviation make up an efficient frontier.
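One way to trace such frontier points is to solve the constrained minimum-variance problem for a grid of target returns. The sketch below is illustrative only: the choice of 10 equally spaced targets between the smallest and largest asset mean is an assumption (the text does not say how the 10 combinations are chosen), the inputs are made-up numbers, and the function name `efficient_frontier` is hypothetical.

```python
import numpy as np

def efficient_frontier(mu, sigma, targets):
    """Return (target return, standard deviation, weights) for each target."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = mu.size
    # KKT system of: min 1/2 w^T Sigma w  s.t.  w^T mu = t,  w^T 1 = 1.
    kkt = np.zeros((n + 2, n + 2))
    kkt[:n, :n] = sigma
    kkt[:n, n] = kkt[n, :n] = mu
    kkt[:n, n + 1] = kkt[n + 1, :n] = 1.0
    points = []
    for t in targets:
        rhs = np.zeros(n + 2)
        rhs[n], rhs[n + 1] = t, 1.0
        w = np.linalg.solve(kkt, rhs)[:n]  # drop the two multipliers
        points.append((float(t), float(np.sqrt(w @ sigma @ w)), w))
    return points

# Illustrative inputs only (assumed values, not data from the paper).
mu = np.array([0.010, 0.006, 0.008])
sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.020, 0.004],
                  [0.010, 0.004, 0.030]])
targets = np.linspace(mu.min(), mu.max(), 10)
frontier = efficient_frontier(mu, sigma, targets)
for t, s, w in frontier:
    print(f"return={t:.4f}  std={s:.4f}  weights={np.round(w, 3)}")
```

Plotting the standard deviation against the target return for the historical inputs and for the regression inputs on the same axes gives the kind of comparison the text describes.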
As seen in Figure 3, the efficient frontier of the LSSVM-mean-variance model performs better than that of the traditional model. The portfolios produced by the proposed method achieve a higher return at the same risk. For the same expected return, the LSSVM-mean-variance model reduces the risk for investors.
According to the investment proportion combinations shown in Tables 1 and 2, we perform a simulation test. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on October 25, 2018, and sell the portfolio on October 26, 2018. The total wealth on October 26, 2018, under the two models is shown in Figure 4; each point in the figure represents the performance of one combination. In this simulation, the new model earns investors more than the traditional model under the buy-and-sell strategy.
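The wealth calculation in this simulation reduces to a weighted sum of price ratios. The following sketch assumes no transaction costs and freely divisible shares, neither of which is stated in the text; the weights and prices are made-up numbers, and the function name `final_wealth` is hypothetical.

```python
import numpy as np

def final_wealth(initial_wealth, weights, buy_prices, sell_prices):
    """Wealth after buying at buy_prices and selling at sell_prices."""
    weights = np.asarray(weights, dtype=float)
    gross = (np.asarray(sell_prices, dtype=float)
             / np.asarray(buy_prices, dtype=float))  # per-asset gross return
    return initial_wealth * float(weights @ gross)

# Hypothetical example: wealth in millions, three assets.
wealth = final_wealth(
    initial_wealth=100.0,
    weights=[0.5, 0.3, 0.2],
    buy_prices=[3.50, 3.80, 5.60],
    sell_prices=[3.57, 3.80, 5.46],
)
print(wealth)
```

Evaluating this function once per proportion combination, for each of the two models, yields the points compared in the wealth figure.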
4.2. Empirical Results of Buy-and-Hold for 5-Day Strategy
The buy-and-sell results for the proposed model are satisfactory. However, the data set is small and covers only the period between summer and autumn, which may suggest that the above results are season-specific. To address this concern, we select a longer data set covering all seasons of the year, from March 10, 2017, to March 12, 2018. The candlestick charts for the three stocks are shown in Figure 5. In addition, to show that our model does not only work for the buy-and-sell strategy, we take the closing price every 5 days for a buy-and-hold strategy.
Figure 5: Candlestick charts of the three stocks, panels (a) to (c).
The return rate of “jtyh” for the buy-and-hold for 5-day strategy is shown in Figure 6 and is reflected by “”. Then, we feed the return rates into the calculation process of the Markowitz model. Table 3 shows the proportion of each asset for the traditional mean-variance model, where we set 10 investment proportion combinations. As in the buy-and-sell strategy, each investment proportion combination in the table corresponds to the maximum return for a given level of risk, as measured by the standard deviation. The points formed by the mean and the standard deviation make up an efficient frontier.

Figure 6: Return rates for the buy-and-hold for 5-day strategy, panels (a) to (c).
As seen in Figure 7, under the buy-and-hold strategy, the efficient frontier of the LSSVM-mean-variance model performs better than that of the traditional model. The portfolios produced by the proposed method achieve a higher return at the same risk. For the same expected return, the LSSVM-mean-variance model reduces the risk for investors.
Then, we feed the regression return rates into the Markowitz model. Table 4 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.

According to the investment proportion combinations shown in Tables 3 and 4, we perform a simulation test for the buy-and-hold for 5-day strategy. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on March 12, 2018, and sell the portfolio on March 19, 2018. The total wealth on March 19, 2018, under the two models is shown in Figure 8; each point in the figure represents the performance of one combination. In this simulation, the new model earns investors more than the traditional model under the buy-and-hold strategy.
To illustrate that this result is not caused by the specific period we selected, we calculate the total wealth of the two models for each of the 15 days from August 30, 2018, to September 19, 2018, in each case using the data of the preceding 15 days. The calculation steps are the same as in the above process. We denote the total wealth of the mean-variance model by $W_{\mathrm{MV}}$ and that of the LSSVM-mean-variance model by $W_{\mathrm{LS}}$; the difference between the two models is defined by
$$D = W_{\mathrm{LS}} - W_{\mathrm{MV}}.$$
As shown in Figure 9, almost all the difference values are greater than 0, which indicates that the optimized model has a higher yield on almost every one of the 15 days.
5. Conclusion
Over the last few years, machine learning has given investors an opportunity to invest in the financial market in a smarter and more profitable way. Combining machine learning technology with financial investment can change the way we make investment decisions. This paper shows how the two technologies can be combined into a powerful tool and proposes the LSSVM-mean-variance algorithm, which aims to maximize return for a given level of risk, as measured by the variance of returns. The efficiency of the proposed method is measured on empirical data by two criteria, the efficient frontier and total wealth. When the efficient frontier and total wealth of both models are compared, the curve of the mean-variance model always lies below that of the proposed model. This shows that our model has a higher yield under the same risk and more total wealth under each combination; our model provides a more measurable standard of judgment when investors make investment decisions. We confirm the efficiency under both the buy-and-sell and the buy-and-hold strategies. The encouraging performance shows that the proposed method may become a promising model in this context, and the results indicate a positive opportunity to be explored in the future.
Data Availability
Data and source program codes in this paper are available upon request from the corresponding author.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The first author (Jian Wang) was supported by the China Scholarship Council (201808260026).
References
[1] B. Aouni, C. Colapinto, and D. La Torre, “Financial portfolio management through the goal programming model: current state-of-the-art,” European Journal of Operational Research, vol. 234, no. 2, pp. 536–545, 2014.
[2] M. Aksaraylı and O. Pala, “A polynomial goal programming model for portfolio optimization based on entropy and higher moments,” Expert Systems with Applications, vol. 94, pp. 185–192, 2018.
[3] B. Y. Qu, Q. Zhou, J. M. Xiao, J. J. Liang, and P. N. Suganthan, “Large-scale portfolio optimization using multiobjective evolutionary algorithms and preselection methods,” Mathematical Problems in Engineering, vol. 2017, 14 pages, 2017.
[4] L. Wang and Z. Chen, “Nash equilibrium strategy for a DC pension plan with state-dependent risk aversion: a multi-period mean-variance framework,” Discrete Dynamics in Nature and Society, vol. 2018, 17 pages, 2018.
[5] S. M. Gasser, M. Rammerstorfer, and K. Weinmayer, “Markowitz revisited: Social portfolio engineering,” European Journal of Operational Research, vol. 258, no. 3, pp. 1181–1190, 2017.
[6] H. Konno and H. Yamazaki, “Mean absolute deviation portfolio optimization model and its application to Tokyo stock exchange,” Management Science, vol. 37, pp. 519–531, 1991.
[7] M. Inuiguchi and T. Tanino, “Portfolio selection under independent possibilistic information,” Fuzzy Sets and Systems, vol. 115, no. 1, pp. 83–92, 2000.
[8] X. Zhou, Z. Pan, G. Hu, S. Tang, and C. Zhao, “Stock market prediction on high-frequency data using generative adversarial nets,” Mathematical Problems in Engineering, vol. 2018, 11 pages, 2018.
[9] Z. Wang, J. Hu, and Y. Wu, “A bimodel algorithm with data-divider to predict stock index,” Mathematical Problems in Engineering, vol. 2018, 14 pages, 2018.
[10] D. Prayogo and Y. T. T. Susanto, “Optimizing the prediction accuracy of friction capacity of driven piles in cohesive soil using a novel self-tuning least squares support vector machine,” Advances in Civil Engineering, vol. 2018, Article ID 6490169, 9 pages, 2018.
[11] Z. Mustaffa and Y. Yusof, “Optimizing LSSVM using ABC for non-volatile financial prediction,” Australian Journal of Basic and Applied Sciences, vol. 5, no. 11, pp. 549–556, 2011.
[12] Z. Mustaffa, M. H. Sulaiman, and M. N. M. Kahar, “LS-SVM hyper-parameters optimization based on GWO algorithm for time series forecasting,” in Proceedings of the 4th International Conference on Software Engineering and Computer Systems, ICSECS 2015, pp. 183–188, Malaysia, August 2015.
[13] S. Lahmiri and S. Bekiros, “Chaos, randomness and multi-fractality in Bitcoin market,” Chaos, Solitons & Fractals, vol. 106, pp. 28–34, 2018.
[14] S. Lahmiri, “Asymmetric and persistent responses in price volatility of fertilizers through stable and unstable periods,” Physica A: Statistical Mechanics and its Applications, vol. 466, pp. 405–414, 2017.
[15] S. Lahmiri and S. Bekiros, “Time-varying self-similarity in alternative investments,” Chaos, Solitons & Fractals, vol. 111, pp. 1–5, 2018.
[16] T. Van Gestel, J. A. K. Suykens, D.-E. Baestaens et al., “Financial time series prediction using least squares support vector machines within the evidence framework,” IEEE Transactions on Neural Networks and Learning Systems, vol. 12, no. 4, pp. 809–821, 2001.
[17] R. Khemchandani, Jayadeva, and S. Chandra, “Regularized least squares fuzzy support vector regression for financial time series forecasting,” Expert Systems with Applications, vol. 36, no. 1, pp. 132–138, 2009.
[18] W. Shen, Y. Zhang, and X. Ma, “Stock return forecast with LS-SVM and particle swarm optimization,” in Proceedings of the 2009 International Conference on Business Intelligence and Financial Engineering, BIFE 2009, pp. 143–147, China, 2009.
[19] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
[20] K. Pelckmans, J. A. Suykens, T. Van Gestel et al., “LS-SVMlab: a matlab/c toolbox for least squares support vector machines,” Tutorial. KULeuven-ESAT, vol. 142, pp. 12, 2002.
[21] V. N. Vapnik and A. Y. Chervonenkis, “On the uniform convergence of relative frequencies of events to their probabilities,” in Measures of Complexity, pp. 11–30, Springer, Cham, Switzerland, 2015.
[22] X. Chen, J. Yang, J. Liang, and Q. Ye, “Recursive robust least squares support vector regression based on maximum correntropy criterion,” Neurocomputing, vol. 97, pp. 63–73, 2012.
[23] H. Nie, G. Liu, X. Liu, and Y. Wang, “Hybrid of ARIMA and SVMs for short-term load forecasting,” Energy Procedia, vol. 16, pp. 1455–1460, 2012.
[24] N. Xin, X. Gu, H. Wu, Y. Hu, and Z. Yang, “Application of genetic algorithm-support vector regression (GA-SVR) for quantitative analysis of herbal medicines,” Journal of Chemometrics, vol. 26, pp. 353–360, 2012.
Copyright
Copyright © 2019 Jian Wang and Junseok Kim. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.