A Hybrid Intelligent Method of Predicting Stock Returns
This paper proposes a novel method for predicting stock returns by means of a hybrid intelligent model. Predictions are first obtained from a linear model; the resulting prediction errors are collected and fed into a recurrent neural network, specifically an autoregressive moving reference neural network. The recurrent neural network minimizes the prediction errors thanks to its nonlinear processing and its configuration. These minimized errors are then used to obtain final predictions by a summation method as well as by a multiplication method. The proposed model is thus a hybrid of a linear and a nonlinear model. The model has been tested on stock data obtained from the National Stock Exchange of India. The results indicate that the proposed model can be a promising approach for predicting future stock movements.
1. Introduction
Prediction of stock returns has attracted many researchers in the past and remains an emerging area both in academia and in industry. Mathematically, the techniques for predicting stock returns can be broadly classified into two categories. The first category comprises linear models such as autoregressive moving average models, exponential smoothing, linear trend prediction, the random walk model, generalized autoregressive conditional heteroskedasticity, and the stochastic volatility model. The second category comprises models based on artificial intelligence, such as artificial neural networks (ANNs), support vector machines, genetic algorithms (GA), and particle swarm optimization (PSO). Linear models share a common limitation: their linearity prevents them from detecting nonlinear patterns in data. Owing to stock market instability, stock data is volatile in nature; linear models are therefore unable to detect the nonlinear patterns of such data. Nonlinear models overcome this limitation, as ANNs embody useful nonlinear functions that are able to detect nonlinear patterns in data. As a consequence, prediction performance improves when nonlinear models are used [6, 7].
A great deal of work has been done in this field. For instance, a radial basis neural network was used for stock prediction on the Shanghai Stock Exchange, wherein artificial fish swarm optimization was introduced to optimize the radial basis function. In time series prediction, ANNs have received overwhelming attention from researchers; for instance, Freitas et al., Wang et al., Khashei and Bijari, Chen et al., and Jain and Kumar report the use of ANNs in time series stock prediction. A new approach called the wavelet denoising based backpropagation neural network was proposed for predicting stock prices. In another work, researchers explored the use of activation functions in ANNs and, to improve ANN performance, suggested three new simple functions; financial time series data was used in the experiments. For seasonal time series prediction, an ANN was proposed that considers the seasonal time periods in the series; the purpose of this consideration is to determine the number of input and output neurons. Multilayer perceptron and generalized regression neural networks were used to predict the Kuwait Stock Exchange, and the results showed that the models were useful for predicting stock exchange movements in emerging markets. Different ANN models have been proposed that are able to capture temporal aspects of inputs. For time series prediction, two types of ANN models have proved successful: time-lagged feedforward networks and dynamically driven recurrent (feedback) networks.
ANNs are not always guaranteed to yield the desired results. To address this, researchers have attempted to find a global optimization approach for ANNs to predict the stock price index. With the goal of further improving predictor performance, researchers have also developed hybrid models for predicting stock returns. Such hybridization may include the integration of linear and nonlinear models [10, 20, 21]. One hybrid forecasting model integrates a recurrent neural network based on artificial bee colony optimization with wavelet transforms. In another work, an adaptive network-based fuzzy inference system was used to develop a hybrid prediction model. A hybrid system was proposed by integrating a linear autoregressive integrated moving average model with an ANN. To overcome the prediction problem associated with technical indicators, researchers proposed a hybrid model by merging ANNs and genetic programming; feature selection was also used to improve system performance. In another work, a random walk prediction model was merged with an ANN to form a hybrid model. Researchers have also introduced a hybrid ANN architecture combining particle swarm optimization and an adaptive radial basis function. Various regressive ANN models, such as self-organizing maps and support vector regressions, were combined into a hybrid model to predict foreign exchange currency rates. For business failure prediction, a hybrid model was proposed using particle swarm optimization and an adaptive network-based fuzzy inference system. In a recent work, researchers created a hybrid prediction model by integrating an autoregressive moving average model with differential evolution based training of its feedforward and feedback parameters. This hybrid model has been compared with other similar hybrid models, and the results confirm that it outperforms them.
The rest of the paper is arranged as follows. Section 2 discusses various prediction based models, including those models which are used in this work. The proposed hybrid model is described in detail in Section 3. Section 4 discusses two commonly used error metrics. In Section 5, experiments and results are presented. Finally, Section 6 presents conclusions.
2. Prediction Based Models
Stock return, or simply return, refers to the profit on an investment. The return $r_t$ at time $t$ is calculated using (1), where $p_t$ and $p_{t-1}$ are the prices of the stock at times $t$ and $t-1$, respectively:
$$r_t = \frac{p_t - p_{t-1}}{p_{t-1}}. \quad (1)$$
A few prediction-based models available in the literature, including those used in this work, are described below.
2.1. Exponential Smoothing Model
The exponential smoothing model (ESM) is used to obtain predictions on time series data (Brown). It computes a one-step-ahead prediction by means of a geometric sum of past observations, as shown in the following equation:
$$\hat{y}_{t+1} = \hat{y}_t + \alpha e_t, \qquad e_t = y_t - \hat{y}_t, \quad (2)$$
where $\hat{y}_t$ is the prediction of $y_t$, $\hat{y}_{t+1}$ is the prediction for the future value, $\alpha$ is a smoothing parameter in the range $[0, 1]$, and $e_t$ is the prediction error. Exponential smoothing assigns exponentially decreasing weights over time: the observations are weighted, with more weight given to the most recent ones.
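As an illustration, the one-step-ahead update in (2) can be sketched in a few lines of Python (the function name and the choice of seeding the recursion with the first observation are our own; the paper does not prescribe an implementation):

```python
def exponential_smoothing(series, alpha):
    """One-step-ahead predictions via simple exponential smoothing:
    y_hat[t+1] = y_hat[t] + alpha * (y[t] - y_hat[t])."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    y_hat = series[0]          # seed the recursion with the first observation
    predictions = []
    for y in series:
        predictions.append(y_hat)            # prediction made before seeing y
        y_hat = y_hat + alpha * (y - y_hat)  # update using the prediction error
    return predictions
```

With $\alpha = 1$ the method reduces to the naive forecast (tomorrow equals today), while $\alpha = 0$ never updates; intermediate values trade responsiveness for smoothness.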
2.2. Autoregressive Moving Average Model
The autoregressive moving average model was introduced by Box and Jenkins for time series prediction and was inspired by the early work of Yule and Wold. The model consists of an autoregressive part, AR($p$), and a moving average part, MA($q$), which is why it is referred to as the ARMA($p, q$) model. The AR($p$) process can be generally expressed as shown in the following equation:
$$y_t = \sum_{i=1}^{p} \varphi_i\, y_{t-i} + \varepsilon_t. \quad (3)$$
Similarly, the MA($q$) process can be generally expressed as shown in the following equation:
$$y_t = \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} + \varepsilon_t. \quad (4)$$
Hence, the ARMA($p, q$) model can be expressed as
$$y_t = \sum_{i=1}^{p} \varphi_i\, y_{t-i} + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} + \varepsilon_t, \quad (5)$$
where $p$ and $q$ represent the order of the autoregressive model and of the moving average model, respectively. $\varphi_i$ and $\theta_j$ are coefficients that satisfy stationarity of the series, and $\varepsilon_t$ are random errors (white noise) at time $t$ with zero mean and variance $\sigma^2$.
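As a concrete sketch of the ARMA recursion (a minimal illustration with hypothetical function names, assuming known coefficients rather than fitted ones), an ARMA(1,1) process can be simulated and predicted one step ahead by recursively recovering the innovations:

```python
import random

def simulate_arma11(phi, theta, n, seed=0):
    """Simulate y_t = phi*y_{t-1} + eps_t + theta*eps_{t-1}, eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    series, y_prev, eps_prev = [], 0.0, 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        y = phi * y_prev + eps + theta * eps_prev
        series.append(y)
        y_prev, eps_prev = y, eps
    return series

def arma11_one_step(series, phi, theta):
    """One-step-ahead predictions with known coefficients; the unobserved
    innovations are recovered recursively as eps_t = y_t - y_hat_t."""
    preds, y_prev, eps_prev = [], 0.0, 0.0
    for y in series:
        y_hat = phi * y_prev + theta * eps_prev  # conditional expectation of y_t
        preds.append(y_hat)
        eps_prev = y - y_hat
        y_prev = y
    return preds
```

In practice the coefficients would be estimated from data; the sketch only makes the AR and MA terms of (5) explicit.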
2.3. Autoregressive Moving Reference Regression Model
Let us consider time series data with past returns as shown in the following equation:
$$Y = \{y_1, y_2, \ldots, y_t\}. \quad (6)$$
Based on the available historical data, the prediction of a future return can be defined as the process in which the elements of the past returns are used to obtain an estimate of the future return $y_{t+h}$, where $h \geq 1$. The value of $h$ directly affects the choice of the adopted prediction method.
Input-output pairs generated by this model are given to an ANN in a supervised manner; the model can thus be called an autoregressive moving reference neural network, AR-MRNN($p, k$), where $p$ is the order of regression and $k$ is the delay from the point of reference.
Using an autoregressive predictor, the AR-MRNN implements the prediction system shown in the following equation:
$$\hat{y}_{t+1} - v = \Phi\left(y_t - v,\; y_{t-1} - v,\; \ldots,\; y_{t-p+1} - v\right), \quad (7)$$
where $\hat{y}_{t+1}$ is the prediction for $y_{t+1}$ obtained at time $t$ from the information available in the historical series $Y$, $p$ is the order of regression, and $v$ is the moving reference given in the following equation:
$$v = y_{t-p-k+1}. \quad (8)$$
After training the neural network, the prediction is obtained as shown in the following equation:
$$\hat{y}_{t+1} = \Phi\left(y_t - v,\; y_{t-1} - v,\; \ldots,\; y_{t-p+1} - v\right) + v. \quad (9)$$
According to (7), the inputs to the neural network are given as differences rather than original values; the network therefore requires smaller weights, which improves its ability to generalize. The output obtained from the neural network is not the final prediction; rather, the final prediction is calculated by adding the reference value $v$ back to the output.
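The construction of supervised input-output pairs described above can be sketched as follows (a minimal illustration; the function names are ours, and the values of $p$ and $k$ used below are hypothetical, not the paper's choices):

```python
def ar_mrnn_pairs(series, p, k):
    """Generate supervised (inputs, target, reference) triples for AR-MRNN(p, k).

    Inputs are the p most recent values minus the moving reference
    v = series[t - p - k + 1]; the target is the next value minus v.
    A series of length n yields n - p - k pairs.
    """
    pairs = []
    for t in range(p + k - 1, len(series) - 1):
        v = series[t - p - k + 1]                  # moving reference
        x = [series[t - i] - v for i in range(p)]  # differences, most recent first
        target = series[t + 1] - v                 # value the network learns
        pairs.append((x, target, v))
    return pairs

def reconstruct(network_output, v):
    """Final prediction: add the reference value back to the network output."""
    return network_output + v
```

Differencing against a moving reference keeps the network inputs small even on a trending series, which is the point made above about smaller weights and better generalization.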
3. The Proposed Hybrid Intelligent Prediction Model
This section discusses the proposed hybrid intelligent prediction model (HIPM) in detail. Consider the actual returns of a time series given by $y_t$, $t = 1, 2, \ldots, n$. Let the predictions obtained by any linear model be denoted by $L_t$. The difference between the actual series ($y_t$) and the predicted series ($L_t$) is known as the prediction error, or simply the error, $e_t$. In HIPM, the predictions are obtained via two methods, a summation method and a multiplication method, which are defined below.
3.1. Summation Method
According to the summation method, the actual data equals the predicted linear data plus the error terms, as shown in the following equation:
$$y_t = L_t + e_t. \quad (10)$$
The error terms are thus calculated as shown in the following equation:
$$e_t = y_t - L_t. \quad (11)$$
3.2. Multiplicative Method
For the multiplicative method, the actual data equals the predicted linear data multiplied by the error terms, as shown in the following equation:
$$y_t = L_t \times e_t. \quad (12)$$
The error terms are multiplied back into the linear predictions because they were calculated as shown in the following equation:
$$e_t = \frac{y_t}{L_t}. \quad (13)$$
The series of errors obtained by the above two methods is given to the ANN by means of AR-MRNN($p, k$), where $k \geq 1$. Thus, in terms of AR-MRNN, (7) to (9) are modified as given below:
$$\hat{e}_{t+1} - v = \Phi\left(e_t - v,\; e_{t-1} - v,\; \ldots,\; e_{t-p+1} - v\right), \quad (14)$$
where $\hat{e}_{t+1}$ is the estimate for $e_{t+1}$ obtained at time $t$ from the information available in the previously obtained error series, $p$ is the order of regression, and $v$ is the moving reference given in the following equation:
$$v = e_{t-p-k+1}. \quad (15)$$
After training the neural network, the minimized errors are obtained as shown in (16):
$$\hat{e}_{t+1} = \Phi\left(e_t - v,\; e_{t-1} - v,\; \ldots,\; e_{t-p+1} - v\right) + v. \quad (16)$$
These minimized errors replace the original errors in (10) and (12). Hence, the final predictions from HIPM are obtained by the summation method and the multiplicative method as shown in the following two equations, respectively:
$$\hat{y}_t = L_t + \hat{e}_t, \quad (17)$$
$$\hat{y}_t = L_t \times \hat{e}_t. \quad (18)$$
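The two error definitions and the corresponding recombination steps are simple enough to sketch directly (function names are ours; note that a perfect error estimate recovers the actual series exactly, which the sanity checks below exploit):

```python
def summation_errors(actual, linear):
    """Summation-method errors: e_t = y_t - L_t."""
    return [y - l for y, l in zip(actual, linear)]

def multiplicative_errors(actual, linear):
    """Multiplicative-method errors: e_t = y_t / L_t (assumes L_t != 0)."""
    return [y / l for y, l in zip(actual, linear)]

def hipm_summation(linear, refined_errors):
    """Final prediction: y_hat_t = L_t + refined error."""
    return [l + e for l, e in zip(linear, refined_errors)]

def hipm_multiplicative(linear, refined_errors):
    """Final prediction: y_hat_t = L_t * refined error."""
    return [l * e for l, e in zip(linear, refined_errors)]
```

In the actual model the refined errors come from the trained network rather than from the exact error series, so the recombined predictions approximate, rather than reproduce, the actual returns.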
Figure 1 shows the complete workflow of HIPM. Initially, the actual returns $y_t$ are given as input to the linear prediction model, through which predictions are obtained. In the next step, the errors are calculated. These errors are fed into the ANN, which performs nonlinear processing. The minimized errors $\hat{e}_t$ obtained from the ANN are used to calculate the final predictions via the two methods, that is, the summation method and the multiplicative method.
3.3. Recurrent Neural Network with Autoregressive Moving Reference
The importance of a recurrent neural network (RNN) is that it responds to the same input pattern differently at different times. Figure 2 shows the recurrent neural network with one hidden layer used in this work. In this type of network, the input layer not only feeds the hidden layer; activations are also fed back into the input layer. The network is a supervised AR-MRNN($p, k$) with $p$ inputs, and the desired output is the next value of the series. As shown, the receiving end of the input layer acts as a long-term memory for the network. The function of this memory is to hold the data and pass it to the hidden layer immediately after each pattern is sent from the input layer. In this way, the network is able to see the knowledge it had about previous inputs: the long-term memory remembers the new input data and uses it when the next pattern is processed. The disadvantage of this network is that it takes longer to train.
The input layer has a number of input neurons equal to the chosen regression order $p$, with a linear activation function. The hidden layer contains a number of neurons with some activation function, and the single output neuron also has some activation function. Thus, the network is able to learn to predict complex nonlinear patterns in the data. This network is also known as the Jordan-Elman neural network, and it is trained using the backpropagation algorithm [36, 37].
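The context-feedback behavior described above can be sketched as a minimal Elman-style forward pass (a simplified illustration with our own class and parameter names; training by backpropagation is omitted, and the random weight initialization is an assumption):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ElmanRNN:
    """Minimal Elman-style network: hidden activations are copied into
    context units that are concatenated with the next input, so the same
    input pattern can produce different outputs at different times.
    Forward pass only; in practice the weights are trained with
    backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        n_total = n_in + n_hidden  # external inputs plus context units
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_total)]
                    for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.context = [0.0] * n_hidden

    def step(self, x):
        full = list(x) + self.context
        hidden = [sigmoid(sum(w * v for w, v in zip(row, full)))
                  for row in self.w_h]
        self.context = hidden  # remember for the next pattern
        return sigmoid(sum(w * h for w, h in zip(self.w_o, hidden)))
```

Feeding the same pattern twice yields different outputs, because the context units have changed in between; this is the memory property discussed above.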
4. Error Metrics
The performance of HIPM is evaluated here using two error metrics, mean square error and mean absolute error, briefly discussed in the following subsections.
4.1. Mean Square Error (MSE)
MSE is the arithmetic mean of the squared forecast errors. It is a standard metric for comparing the differences between two time series and is defined as shown in the following equation:
$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2, \quad (19)$$
where $y_t$ and $\hat{y}_t$ are the actual and predicted returns, respectively, and $n$ is the length of the series.
4.2. Mean Absolute Error (MAE)
MAE is a quantity used to measure how close forecasts are to the eventual outcomes: it measures the average magnitude of the errors in a set of forecasts, without considering their direction. The mean absolute error is given in the following equation:
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left|y_t - \hat{y}_t\right|, \quad (20)$$
where $y_t$ and $\hat{y}_t$ are the actual and predicted returns, respectively, and $n$ is the length of the series.
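Both metrics are one-liners in Python (function names are ours):

```python
def mse(actual, predicted):
    """Mean square error between two equal-length series."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error between two equal-length series."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)
```

MSE penalizes large errors more heavily because of the squaring, while MAE weights all errors in proportion to their magnitude.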
5. Application of HIPM on Stock Market Data
In order to verify the performance of the HIPM predictor, real-world stock data has been used in the experiments. The stock data of three information technology companies has been used here; these companies are given in Table 1.
The stock data of the above three companies has been obtained from the Bombay Stock Exchange, India (http://www.bseindia.com/). Daily adjusted closing prices of the three stocks have been taken from 14-05-2013 to 30-12-2013, and the returns of 164 days (15-05-2013 to 30-12-2013) are calculated using (1). Predictions were obtained using ESM and an RNN.
5.1. Predictions Using ESM
Initially, predictions are obtained using a linear prediction model; here, ESM has been chosen for the purpose. The value of the smoothing factor $\alpha$ in ESM has been obtained using the following optimization model:
$$\min_{\alpha} \; \frac{1}{n} \sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2 \quad \text{subject to} \quad 0 \leq \alpha \leq 1, \quad (21)$$
where $y_t$ is the actual return, $\hat{y}_t$ is the prediction obtained from ESM, $\alpha$ is the smoothing factor, and $n$ is the length of the historical series. The smoothing factor $\alpha$ enters the predictions through (2). Thus, (21) is an objective function that minimizes the MSE of the predictions obtained from the exponential smoothing technique, and its constraint guarantees that the value of the smoothing factor lies between 0 and 1. Since ESM is a linear prediction model, it did not produce satisfactory predictions on its own, resulting in high prediction error.
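The paper does not state how the optimization over the smoothing factor was solved; one simple possibility is a grid search over $[0, 1]$, sketched below (function names and grid resolution are our own assumptions):

```python
def ses_predictions(series, alpha):
    """One-step-ahead simple exponential smoothing predictions."""
    y_hat, preds = series[0], []
    for y in series:
        preds.append(y_hat)
        y_hat += alpha * (y - y_hat)
    return preds

def best_alpha(series, grid=101):
    """Grid search over alpha in [0, 1] minimizing in-sample MSE."""
    best_a, best_mse = 0.0, float("inf")
    n = len(series)
    for i in range(grid):
        a = i / (grid - 1)  # candidate smoothing factor
        preds = ses_predictions(series, a)
        cur = sum((y - p) ** 2 for y, p in zip(series, preds)) / n
        if cur < best_mse:
            best_a, best_mse = a, cur
    return best_a, best_mse
```

Because the constraint set is just the unit interval, even a coarse grid gives a reasonable approximation to the minimizer; a solver or ternary search could refine it further.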
5.2. Predictions Using RNN
After the predictions were obtained using ESM and the series of errors calculated, these errors were given to the RNN by means of (14) to (16), and the RNN was used to obtain stock predictions. The regression order was chosen by trial and error, as it was observed that this particular regression order produced the least prediction error. For each stock, the data was divided into two equal parts (50:50). Out of 164 returns, 50% of the data, that is, 82 returns between 15-05-2013 and 05-09-2013, was kept for training the RNN, and the remaining 50%, that is, 82 returns between 06-09-2013 and 30-12-2013, for testing. Sliding windows of 82 returns each were created. For each window, input-output pairs were calculated using the AR-MRNN method, resulting in 76 input-output pairs per window. In total, 83 sliding windows were formed, each giving a prediction for one future period; thus, the 83 windows give 83 future predictions, with the initial window giving the prediction for the first test period. By combining the 83 sliding windows, 6308 input-output pairs were obtained. These sliding windows were given to the RNN and trained in a supervised manner; the procedure was repeated for all stocks.
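The windowing scheme just described can be sketched in one line (a hypothetical helper of our own; the counts it produces match the ones in the text: 164 returns with 82-return windows give 83 windows, and 83 windows of 76 pairs each give 6308 pairs):

```python
def sliding_windows(series, width):
    """All overlapping windows of the given width, shifted one step at a time."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]
```

Each window would then be passed through the AR-MRNN pair-generation step before being fed to the network.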
In the chosen RNN model, there are 16 neurons in the hidden layer with a sigmoid activation function and one output neuron in the output layer, also with a sigmoid activation function. An error threshold was preset for the RNN, meaning the RNN was considered converged only after the average error fell below the threshold. For each stock, the RNN took over 10,000 epochs to reach the preset error.
Figures 3 and 4 show the prediction output of HIPM (between 06-09-2013 and 31-12-2013) via the multiplicative method and the summation method for stock 2 and stock 3, respectively. Actual returns are shown by a blue solid line, whereas predictions are shown by an orange dotted line. The HIPM predictor captures the nonlinear patterns of the data very well: the actual and predicted returns are very close to each other, which implies that the predictions are satisfactory. Six similar figures were obtained in total, of which only two are displayed here.
The performance of the HIPM predictor can be better judged from Table 2, which displays the values of the error metrics obtained. As seen, the multiplicative method outperforms the summation method, yielding lower prediction error.
6. Conclusions
A new and promising approach for the prediction of stock returns has been presented in this paper. A hybrid intelligent prediction model was developed by combining predictions obtained from a linear prediction model and a nonlinear model. The linear model chosen is the exponential smoothing model, while an autoregressive moving reference neural network is chosen as the nonlinear model. This is a new approach wherein errors are fed into a neural network so as to obtain minimized errors. Initially, predictions of stock returns are obtained using the exponential smoothing model and the prediction errors are calculated. The autoregressive moving reference method is used to calculate input-output pairs for these errors, which are then fed into a recurrent neural network; the network learns using the backpropagation algorithm in a supervised manner. Finally, the predictions of the stocks are calculated via two methods, a summation method and a multiplicative method. Based on the results, the proposed model is able to detect the nonlinear patterns of the data very well, and the results are satisfactory. The input to the neural network is given as differences rather than original values; the network thus needs only smaller weights, which improves its prediction performance. The performance of the proposed hybrid model can be further improved and applied in other areas as well; this is certainly an important avenue for future research.
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
References
C. H. Chen, "Neural networks for financial market prediction," in Proceedings of the IEEE International Conference on Neural Networks, pp. 1199-1202, Orlando, Fla, USA, June 1994.
R. Majhi, G. Panda, G. Sahoo, A. Panda, and A. Choubey, "Prediction of S&P 500 and DJIA stock indices using particle swarm optimization technique," in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '08), pp. 1276-1282, Hong Kong, China, June 2008.
S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College, New York, NY, USA, 1994.
S. Samarasinghe, Neural Networks for Applied Sciences and Engineering, Auerbach Publications, Taylor & Francis, New York, NY, USA, 2007.
G. Sermpinis, K. Theofilatos, A. Karathanasopoulos, E. F. Georgopoulos, and C. Dunis, "Forecasting foreign exchange rates with adaptive neural networks using radial-basis functions and particle swarm optimization," European Journal of Operational Research, vol. 225, no. 3, pp. 528-540, 2013.
M. Rout, B. Majhi, R. Majhi, and G. Panda, "Forecasting of currency exchange rates using an adaptive ARMA model with differential evolution based training," Journal of King Saud University—Computer and Information Sciences, vol. 26, pp. 7-18, 2014.
R. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series, Courier Dover Publications, 2004.
G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting and Control, Holden-Day, San Francisco, Calif, USA, 1970.
G. U. Yule, "Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series," Journal of the Royal Statistical Society, vol. 89, pp. 30-41, 1926.
H. O. Wold, A Study in the Analysis of Stationary Time Series, Almqvist & Wiksell, Stockholm, Sweden, 1938.
D. Rumelhart and J. McClelland, Parallel Distributed Processing, MIT Press, Cambridge, Mass, USA, 1986.