Applying Least Squares Support Vector Machines to Mean-Variance Portfolio Analysis
The portfolio selection problem introduced by Markowitz has been one of the most important research fields in modern finance. In this paper, we propose a model for portfolio management based on least squares support vector machines (LSSVM), which we call the LSSVM-mean-variance model. To verify the reliability of the LSSVM-mean-variance model, we conduct an empirical study and design an algorithm that illustrates the performance of the model using historical data from the Shanghai stock exchange. The numerical results show that the proposed model compares favorably with the traditional Markowitz model. In terms of both the efficient frontier and total wealth, our model gives investors a more measurable standard of judgment when they invest.
1. Introduction
A portfolio is a combination of securities such as foreign exchange, stocks, and other market instruments. It has become very common for household investors to participate in the stock market. Investors have used many technical methods to minimize risk and optimize return. Among these methods, the Markowitz model, developed by Harry Markowitz in 1952, has serious practical limitations because of the complexity of compiling the variance, covariance, expectation, and standard deviation of each asset relative to the other assets in the portfolio. In recent years, scholars have done much work to make portfolio theory more efficient. One study presented the different variants of the goal programming model that have been applied to the financial portfolio selection problem. Another optimized the portfolio based on entropy and higher moments using a polynomial goal programming model, and the results indicated that the proposed method is suited to portfolio models with higher moments. Portfolio optimization techniques based on a multiobjective evolutionary model have also significantly improved return-risk trade-off performance. A multiperiod mean-variance model has been used to investigate a defined contribution pension plan investment problem during the accumulation phase. To incorporate social responsibility, a modification of the Markowitz model has been proposed. Konno proposed a mean-absolute deviation portfolio optimization model and applied it to the Tokyo stock market. The numerical experiments showed that the model generated a portfolio quite similar to that of the Markowitz model in a fraction of the time required to solve the latter. Independently estimated possibilistic return rates have also been used to deal with a portfolio selection problem.
However, few scholars have used machine learning methods to modify the Markowitz model. As we know, the return rate in the mean-variance model is the historical return rate, and the risk derived from it is the historical volatility. Historical volatility is the standard deviation of the changes in the underlying asset price over a past period, and it represents the past pattern of volatility. The actual volatility at the time of trading cannot be observed; it can only be predicted from historical volatility and current market information. In this paper, we predict the actual volatility from historical volatility using machine learning. As one of the important applications of machine learning, LSSVM has been used to deal with various financial problems such as stock price prediction [8, 9] and regression. Mustaffa optimized LSSVM for nonvolatile financial prediction. A time series forecasting model that optimizes LSSVM with the Grey Wolf optimizer algorithm has also been proposed. For time series prediction [13–15], Van Gestel used LSSVM regression within the evidence framework to infer nonlinear models for predicting time series and volatility. Khemchandani handled financial time series forecasting by regularizing least squares fuzzy support vector regression. LSSVM has also been optimized by weight particle swarm optimization, and the authors reported accurate stock return forecasts.
In our study, we apply the LSSVM regression model to the traditional Markowitz model, and the proposed model gives better results when we compare the efficient frontier and total wealth of the two models.
The rest of this paper is organized as follows. In Section 2, we describe the mean-variance, LSSVM, and LSSVM-mean-variance models. In Section 3, we describe the data set and software used in the empirical study. In Section 4, we present and discuss the empirical results. In Section 5, conclusions are given.
2. Model Description
2.1. Mean-Variance Model
Modern portfolio theory was first introduced by Markowitz. With this model, Markowitz formulated an efficient frontier, shown in a two-dimensional graph, from which investors can choose a portfolio that maximizes return for a given level of risk as measured by the variance of returns. Suppose that the investor has a given initial wealth, a vector of weights on the assets, and a vector of future return rates; together these determine the investor's future wealth. Investors usually determine the proportion of investment in each asset at the initial stage so as to maximize the expected value of the investment, which gives an expected-utility maximization problem. Expanding the utility function in a Taylor series and assuming that the return series follows a normal distribution, the objective depends only on the mean and the variance of the future wealth. If the utility function is concave, the problem simplifies to minimizing the portfolio variance subject to the portfolio reaching the expected return required by the investor. This is a quadratic programming problem that can be solved by the Lagrangian method. The first-order condition involves the mean vector of the asset returns, the unit column vector, and the covariance matrix of the returns, and solving it gives the optimal investment proportions.
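For reference, the problem described above has a standard textbook form, sketched below. The notation is an assumption and may differ from the authors' own symbols and equation numbering: $W_0$ is the initial wealth, $w$ the weight vector, $r$ the vector of future return rates, $U$ the utility function, $\mu_p$ the target expected return, $\mu$ the mean return vector, $\mathbf{1}$ the unit column vector, $\Sigma$ the covariance matrix, and $\lambda_1, \lambda_2$ the Lagrange multipliers.

```latex
\begin{align*}
W &= W_0\,(1 + w^{\top} r), \\
\max_{w}\ &\mathbb{E}\,[U(W)], \\
U(W) &\approx U(\bar{W}) + U'(\bar{W})\,(W - \bar{W}) + \tfrac{1}{2}\,U''(\bar{W})\,(W - \bar{W})^{2},
  \qquad \bar{W} = \mathbb{E}[W], \\
\min_{w}\ &\tfrac{1}{2}\,w^{\top}\Sigma\,w
  \quad \text{s.t.} \quad w^{\top}\mu = \mu_p, \quad w^{\top}\mathbf{1} = 1, \\
\Sigma\,w &= \lambda_1\,\mu + \lambda_2\,\mathbf{1}, \\
w^{*} &= \Sigma^{-1}\,(\lambda_1\,\mu + \lambda_2\,\mathbf{1}), \\
\lambda_1 &= \frac{A\,\mu_p - B}{A\,C - B^{2}}, \qquad
\lambda_2 = \frac{C - B\,\mu_p}{A\,C - B^{2}}, \\
A &= \mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}, \qquad
B = \mathbf{1}^{\top}\Sigma^{-1}\mu, \qquad
C = \mu^{\top}\Sigma^{-1}\mu .
\end{align*}
```

The factor $\tfrac{1}{2}$ in the objective is only a convenience and does not change the minimizer. Substituting $w^{*}$ into the two constraints gives the stated values of $\lambda_1$ and $\lambda_2$, provided $\mu$ is not proportional to $\mathbf{1}$.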
2.2. Least Squares Support Vector Machine
The support vector machine (SVM) has been applied successfully to financial problems, especially time series forecasting. LSSVM is the least squares formulation of SVM; it was introduced by Suykens and Vandewalle and is implemented in the LS-SVMlab toolbox of Pelckmans et al. LSSVM combines structural risk minimization with VC dimension theory and is used for both classification and regression, in tasks such as pattern recognition, function fitting [22, 23], and data analysis. The algorithm is as follows. The regression model is constructed with a nonlinear mapping function that maps the input data to a higher-dimensional feature space; the model is linear in the mapped inputs, with a weight vector and a bias. Given a training set of input-output pairs, LSSVM minimizes the sum of a weight-norm term and a squared-error term scaled by a regularization parameter, subject to equality constraints in which the random errors are the gaps between the observed outputs and the model. Introducing Lagrange multipliers and setting the partial derivatives of the Lagrangian with respect to each parameter to zero gives the optimality conditions. After eliminating the weight vector and the errors, we obtain a linear matrix system in the bias and the Lagrange multipliers, whose kernel matrix is built from the radial basis function (RBF) kernel. For regression models, the RBF kernel is often applied because of its effectiveness and speed in the training process. Solving the system gives the regression function as a kernel expansion over the training points plus the bias. The kernel width of the RBF kernel adjusts the degree of generalization. To build the LSSVM model, the regularization parameter and the kernel width must be chosen. In this paper, to compare the traditional mean-variance model with the proposed LSSVM-mean-variance model, we keep both parameters fixed unless otherwise specified.
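For reference, the standard LSSVM regression formulation that this description follows can be sketched as below. The notation is an assumption and may differ from the authors' own symbols: $\varphi$ is the feature map, $w$ the weight vector, $b$ the bias, $\{(x_k, y_k)\}_{k=1}^{N}$ the training set, $e_k$ the errors, $\gamma$ the regularization parameter, $\alpha_k$ the Lagrange multipliers, and $\sigma$ the kernel width. The $2\sigma^{2}$ scaling in the RBF kernel is one common convention; some toolboxes use $\sigma^{2}$ instead.

```latex
\begin{align*}
y(x) &= w^{\top}\varphi(x) + b, \\
\min_{w,\,b,\,e}\ J(w, e) &= \tfrac{1}{2}\,w^{\top}w + \tfrac{\gamma}{2}\sum_{k=1}^{N} e_k^{2}
  \quad \text{s.t.} \quad y_k = w^{\top}\varphi(x_k) + b + e_k, \quad k = 1, \dots, N, \\
L(w, b, e; \alpha) &= J(w, e) - \sum_{k=1}^{N} \alpha_k \left( w^{\top}\varphi(x_k) + b + e_k - y_k \right), \\
\frac{\partial L}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{k=1}^{N} \alpha_k\,\varphi(x_k), \qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{k=1}^{N} \alpha_k = 0, \\
\frac{\partial L}{\partial e_k} = 0 &\;\Rightarrow\; \alpha_k = \gamma\,e_k, \qquad
\frac{\partial L}{\partial \alpha_k} = 0 \;\Rightarrow\; w^{\top}\varphi(x_k) + b + e_k = y_k, \\
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
&= \begin{bmatrix} 0 \\ y \end{bmatrix}, \qquad
\Omega_{kl} = \varphi(x_k)^{\top}\varphi(x_l) = K(x_k, x_l), \\
K(x, x_k) &= \exp\!\left( -\frac{\lVert x - x_k \rVert^{2}}{2\sigma^{2}} \right), \\
y(x) &= \sum_{k=1}^{N} \alpha_k\,K(x, x_k) + b .
\end{align*}
```

Because the constraints are equalities, training reduces to solving one linear system of size $N + 1$ rather than a quadratic program, which is the main practical difference from the classical SVM.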
2.3. LSSVM for Mean-Variance Model
In this section, we describe how LSSVM is applied to the mean-variance model. We first select a portfolio and calculate the returns of its assets. We then take the matrix of asset returns as the training set for the LSSVM model and, following the process of Section 2.2, obtain a regression matrix of asset returns. Next, we feed the original returns and the regression returns separately into the steps described in Section 2.1. Finally, we compare the efficient frontier and final wealth of the two methods, as shown in Section 4.
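The regression step can be illustrated with a minimal LSSVM sketch in plain Python. This is not the authors' MATLAB code: the function names, the use of the time index as the only input feature, the synthetic series, and the values of `gamma` and `sigma` are all assumptions made for illustration.

```python
import math

def rbf_kernel(x, z, sigma):
    """RBF kernel exp(-||x - z||^2 / (2 * sigma^2)) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization term on the kernel diagonal
        A.append(row)
    solution = solve_linear(A, [0.0] + list(y))
    return solution[0], solution[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, b, alpha, sigma, x):
    """Evaluate y(x) = sum_k alpha_k K(x, x_k) + b."""
    return b + sum(a * rbf_kernel(x, xk, sigma) for a, xk in zip(alpha, X_train))

# Placeholder data (not market data): a short synthetic "return" series,
# with the time index as the single input feature (an assumption).
returns = [0.01 * math.sin(0.5 * t) for t in range(20)]
X = [[float(t)] for t in range(20)]
gamma, sigma = 10.0, 2.0  # illustrative values, not the paper's settings
b, alpha = lssvm_fit(X, returns, gamma, sigma)
fitted = [lssvm_predict(X, b, alpha, sigma, x) for x in X]
```

In this sketch, `fitted` plays the role of the regression returns for one asset; repeating the fit per asset would give a regression matrix of the same shape as the original return matrix.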
3. Data Set and Software
We select a portfolio consisting of three assets chosen from the Shanghai stock market. For the buy-and-sell test, we use the historical data for the stocks “-zgyh-”, “-nyyh-”, and “-jtyh-” from August 09, 2018, to October 26, 2018. We feed the data from August 09, 2018, to October 25, 2018, into the mean-variance model and the LSSVM-mean-variance model. There are 50 closing prices in total, which give 49 return rates for each asset. In the LSSVM model, we divide the return data into a training set of 39 points and a test set of 10 points. We then compare the performance of the two models on October 26, 2018. For the buy-and-hold test, we use the historical data for the same three stocks from March 10, 2017, to March 12, 2018. We take the closing price every 5 days, which again gives 50 data points in total. In the LSSVM model, we again divide the data into a training set of 39 points and a test set of 10 points. We then compare the performance of the two models on March 19, 2018. All calculations are carried out in MATLAB R2016a.
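The data preparation can be sketched as follows. The source does not state whether simple or logarithmic returns are used, so the simple return below is an assumption, as are the function names and the synthetic prices.

```python
def simple_returns(prices):
    """Simple return rate (p_t - p_{t-1}) / p_{t-1} for consecutive prices."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]

def split_train_test(series, n_train):
    """Chronological split: the first n_train points train, the rest test."""
    return series[:n_train], series[n_train:]

# Placeholder prices (not real market data): 50 synthetic closing prices.
prices = [3.50 + 0.01 * ((7 * t) % 11 - 5) for t in range(50)]
returns = simple_returns(prices)
train, test = split_train_test(returns, 39)
print(len(returns), len(train), len(test))  # 49 39 10
```

Fifty prices give 49 returns, which is why the training and test sets together hold 49 points rather than 50.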
4. Empirical Results
4.1. Empirical Results of Buy-and-Sell Strategy
The body of a candlestick spans the opening price and the closing price; the price excursions above or below the body are called the wicks. For a stock over the time interval represented, the wicks extend to the highest and lowest prices, while the body marks the opening and closing prices. A red body indicates that the security closed higher than it opened, with the opening price at the bottom of the body and the closing price at the top. A green body indicates that the security closed lower than it opened, with the opening price at the top and the closing price at the bottom.
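The candlestick convention described above can be expressed as a small helper. The function name, the returned field names, and the "flat" label for equal open and close are hypothetical; the numbers are placeholders, not market data.

```python
def describe_candle(open_price, high, low, close):
    """Classify one candlestick using the red-up / green-down convention."""
    body_low, body_high = min(open_price, close), max(open_price, close)
    if close > open_price:
        color = "red"      # closed higher than it opened
    elif close < open_price:
        color = "green"    # closed lower than it opened
    else:
        color = "flat"     # open equals close (label chosen for this sketch)
    return {
        "color": color,
        "body": (body_low, body_high),
        "upper_wick": high - body_high,  # excursion above the body
        "lower_wick": body_low - low,    # excursion below the body
    }

candle = describe_candle(open_price=3.5, high=3.75, low=3.25, close=3.625)
```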
Now we select the real historical data of the stocks “-zgyh-”, “-nyyh-”, and “-jtyh-” from August 09, 2018, to October 25, 2018. We use the closing prices to calculate the daily return rate; with 50 data points in total, we obtain 49 return rates for each asset. The return rate of “-jtyh-” is shown in Figure 2. We then feed the return rates into the Markowitz model. Table 1 shows the resulting proportion of each asset for the traditional mean-variance model, for which we set 10 investment proportion combinations.
For comparison, we calculate the proportion of each asset for the LSSVM-mean-variance model using LSSVM regression. We feed the return rates mentioned above into the LSSVM model described in Section 2.2, dividing the data into a training set of 39 points and a test set of 10 points. The resulting regression data are also shown in Figure 2.
Then, we feed the regression return rates into the Markowitz model. Table 2 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.
Each investment proportion combination in the tables corresponds to the maximum return for a given level of risk as measured by the standard deviation. The points defined by the mean and the standard deviation form the efficient frontier.
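A frontier of this kind can be traced with the closed-form Lagrangian solution of the minimum-variance problem. The sketch below is an illustration under assumptions, not the authors' procedure: it allows short selling (no sign constraint on the weights), uses synthetic returns, and all function names are hypothetical.

```python
import math

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def mean_and_covariance(R):
    """Sample mean vector and covariance matrix; R[i] is asset i's return series."""
    n, T = len(R), len(R[0])
    mu = [sum(series) / T for series in R]
    cov = [[sum((R[i][t] - mu[i]) * (R[j][t] - mu[j]) for t in range(T)) / (T - 1)
            for j in range(n)] for i in range(n)]
    return mu, cov

def frontier_point(mu, cov, target):
    """Minimum-variance weights for a target mean return, and their std. deviation."""
    n = len(mu)
    inv_mu = solve_linear(cov, mu)             # Sigma^{-1} mu
    inv_one = solve_linear(cov, [1.0] * n)     # Sigma^{-1} 1
    a = sum(inv_one)                           # 1^T Sigma^{-1} 1
    b = sum(inv_mu)                            # 1^T Sigma^{-1} mu
    c = sum(m * v for m, v in zip(mu, inv_mu)) # mu^T Sigma^{-1} mu
    d = a * c - b * b
    lam1 = (a * target - b) / d
    lam2 = (c - b * target) / d
    w = [lam1 * u + lam2 * v for u, v in zip(inv_mu, inv_one)]
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return w, math.sqrt(var)

# Placeholder returns (not market data) for three assets over 49 periods.
R = [[0.001 * (i + 1) + 0.01 * math.sin(0.7 * (i + 1) * t + i) for t in range(49)]
     for i in range(3)]
mu, cov = mean_and_covariance(R)
lo, hi = min(mu), max(mu)
targets = [lo + (hi - lo) * k / 9 for k in range(10)]  # 10 combinations
frontier = [(m,) + frontier_point(mu, cov, m) for m in targets]
```

Each entry of `frontier` holds a target mean, the weight vector, and the standard deviation. Plotting the mean against the standard deviation gives the minimum-variance curve, whose upper branch is the efficient frontier. Running the same code on the regression returns instead of the original returns would give the second curve for comparison.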
As seen in Figure 3, the efficient frontier of the LSSVM-mean-variance model performs better than that of the traditional model. Portfolios obtained with the proposed method improve the return at the same risk; equivalently, for the same expected return, the LSSVM-mean-variance model reduces the risk for investors.
Using the investment proportion combinations shown in Tables 1 and 2, we perform a simulation test. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on October 25, 2018, and sell the portfolio on October 26, 2018. The total wealth on October 26, 2018, under the two models is shown in Figure 4; each point in the figure represents the performance of one combination. In this simulation, investing according to the new model earns more than investing according to the traditional model under the buy-and-sell strategy.
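The buy-then-sell simulation amounts to scaling each asset's share of the initial wealth by its price ratio. The sketch below assumes fractional shares and no transaction costs, neither of which the source states; the function name, weights, and prices are placeholders.

```python
def wealth_after_sale(initial_wealth, weights, buy_prices, sell_prices):
    """Total wealth after buying at buy_prices and selling at sell_prices."""
    if not (len(weights) == len(buy_prices) == len(sell_prices)):
        raise ValueError("weights and price lists must have the same length")
    return sum(initial_wealth * w * sell / buy
               for w, buy, sell in zip(weights, buy_prices, sell_prices))

# Placeholder inputs (not the paper's tables or market prices).
initial_wealth = 100_000_000.0
weights = [0.5, 0.25, 0.25]
buy_prices = [4.0, 2.0, 8.0]
sell_prices = [5.0, 2.0, 7.0]
final_wealth = wealth_after_sale(initial_wealth, weights, buy_prices, sell_prices)
```

Evaluating this once per row of the weight tables gives one wealth value per combination for each model.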
4.2. Empirical Results of Buy-and-Hold for 5-Day Strategy
With the buy-and-sell strategy, the proposed model gives a satisfactory result. However, the data set we selected is small and spans only summer and autumn, which may suggest that the above results are seasonal. To address this concern, we select a longer data set covering all seasons of the year, from March 10, 2017, to March 12, 2018. The candlestick charts for the three stocks are shown in Figure 5. In addition, to show that our model does not work only for the buy-and-sell strategy, we take the closing price every 5 days for a buy-and-hold strategy.
The return rate of “-jtyh-” under the 5-day buy-and-hold strategy is shown in Figure 6. We then feed the return rates into the Markowitz model. Table 3 shows the proportion of each asset for the traditional mean-variance model, for which we set 10 investment proportion combinations. As with the buy-and-sell strategy, each investment proportion combination in the table corresponds to the maximum return for a given level of risk as measured by the standard deviation. The points defined by the mean and the standard deviation form the efficient frontier.
As seen in Figure 7, under the buy-and-hold strategy the efficient frontier of the LSSVM-mean-variance model again performs better than that of the traditional model. Portfolios obtained with the proposed method improve the return at the same risk; equivalently, for the same expected return, the LSSVM-mean-variance model reduces the risk for investors.
Then, we feed the regression return rates into the Markowitz model. Table 4 shows the proportion of each asset for the LSSVM-mean-variance model. Here, we also set 10 investment proportion combinations.
Using the investment proportion combinations shown in Tables 3 and 4, we perform a simulation test for the 5-day buy-and-hold strategy. We assume that the initial total wealth is 100 million, buy the asset portfolio according to each proportion combination on March 12, 2018, and sell the portfolio on March 19, 2018. The total wealth on March 19, 2018, under the two models is shown in Figure 8; each point in the figure represents the performance of one combination. In this simulation, the new model earns more than the traditional model under the buy-and-hold strategy.
To illustrate that this result is not caused by the specific period we selected, we calculate the total wealth under the two models for each of the 15 trading days from August 30, 2018, to September 19, 2018, in each case using the data of the preceding 15 days. The calculation steps are the same as in the above process. We define the difference between the two models as the total wealth of the LSSVM-mean-variance model minus the total wealth of the mean-variance model. As shown in Figure 9, almost all the difference values are greater than 0, which indicates that the optimized model has a higher yield on almost every one of the 15 days.
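The day-by-day comparison can be sketched as below, taking the difference as the LSSVM-mean-variance wealth minus the mean-variance wealth so that positive values favor the proposed model. The function names and the wealth figures are placeholders, not the paper's results.

```python
def wealth_differences(wealth_mv, wealth_lssvm):
    """Per-day difference: LSSVM-mean-variance wealth minus mean-variance wealth."""
    if len(wealth_mv) != len(wealth_lssvm):
        raise ValueError("both wealth series must cover the same days")
    return [ls - mv for mv, ls in zip(wealth_mv, wealth_lssvm)]

def share_positive(diffs):
    """Fraction of days on which the difference is strictly positive."""
    return sum(1 for d in diffs if d > 0) / len(diffs)

# Placeholder wealth series in millions (illustrative only).
wealth_mv = [100.0, 100.5, 99.5, 101.0]
wealth_lssvm = [100.5, 101.0, 99.0, 101.5]
diffs = wealth_differences(wealth_mv, wealth_lssvm)
positive_share = share_positive(diffs)
```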
5. Conclusions
In recent years, machine learning has opened an opportunity for investors to invest in the financial market in a smarter and more profitable way. Combining machine learning technology with financial investment can change the way we make investment decisions. This paper shows how the two can be combined into a powerful tool and proposes the LSSVM-mean-variance algorithm with the aim of maximizing return for a given level of risk as measured by the variance of returns. The efficiency of the proposed method is assessed on empirical data through the efficient frontier and total wealth. In both comparisons, the curve of the mean-variance model is always below that of the proposed model. This shows that our model has a higher yield under the same risk and more total wealth under each combination; our model gives investors a more measurable standard of judgment when they invest. We confirm the efficiency under both the buy-and-sell and the buy-and-hold strategies. The encouraging performance suggests that the proposed method may become a promising model in this context, and the results indicate a positive opportunity to be explored in the future.
Data Availability
The data and source program codes used in this paper are available upon request from the corresponding author.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The first author (Jian Wang) was supported by the China Scholarship Council (201808260026).
References
B. Y. Qu, Q. Zhou, J. M. Xiao, J. J. Liang, and P. N. Suganthan, “Large-scale portfolio optimization using multiobjective evolutionary algorithms and preselection methods,” Mathematical Problems in Engineering, vol. 2017, 14 pages, 2017.
X. Zhou, Z. Pan, G. Hu, S. Tang, and C. Zhao, “Stock market prediction on high-frequency data using generative adversarial nets,” Mathematical Problems in Engineering, vol. 2018, 11 pages, 2018.
Z. Wang, J. Hu, and Y. Wu, “A bimodel algorithm with data-divider to predict stock index,” Mathematical Problems in Engineering, vol. 2018, 14 pages, 2018.
D. Prayogo and Y. T. T. Susanto, “Optimizing the prediction accuracy of friction capacity of driven piles in cohesive soil using a novel self-tuning least squares support vector machine,” Advances in Civil Engineering, vol. 2018, Article ID 6490169, 9 pages, 2018.
Z. Mustaffa and Y. Yusof, “Optimizing LSSVM using ABC for non-volatile financial prediction,” Australian Journal of Basic and Applied Sciences, vol. 5, no. 11, pp. 549–556, 2011.
Z. Mustaffa, M. H. Sulaiman, and M. N. M. Kahar, “LS-SVM hyper-parameters optimization based on GWO algorithm for time series forecasting,” in Proceedings of the 4th International Conference on Software Engineering and Computer Systems, ICSECS 2015, pp. 183–188, Malaysia, August 2015.
T. Van Gestel, J. A. K. Suykens, D.-E. Baestaens et al., “Financial time series prediction using least squares support vector machines within the evidence framework,” IEEE Transactions on Neural Networks and Learning Systems, vol. 12, no. 4, pp. 809–821, 2001.
W. Shen, Y. Zhang, and X. Ma, “Stock return forecast with LS-SVM and particle swarm optimization,” in Proceedings of the 2009 International Conference on Business Intelligence and Financial Engineering, BIFE 2009, pp. 143–147, China, 2009.
K. Pelckmans, J. A. Suykens, T. Van Gestel et al., “LS-SVMlab: a matlab/c toolbox for least squares support vector machines,” Tutorial. KULeuven-ESAT, vol. 142, pp. 1-2, 2002.
H. Nie, G. Liu, X. Liu, and Y. Wang, “Hybrid of ARIMA and SVMs for short-term load forecasting,” Energy Procedia, vol. 16, pp. 1455–1460, 2012.