Special Issue: Complexity in Financial Markets
Predicting the Direction Movement of Financial Time Series Using Artificial Neural Network and Support Vector Machine
Prediction of financial time series such as stocks and stock indexes has remained a main focus of researchers because of their composite nature and instability in almost all developing and advanced countries. The main objective of this research work is to predict the direction of movement of daily stock price indexes using the artificial neural network (ANN) and support vector machine (SVM). The datasets utilized in this study are the KSE-100 index of the Pakistan stock exchange, the Korea composite stock price index (KOSPI), the Nikkei 225 index of the Tokyo stock exchange, and the Shenzhen stock exchange (SZSE) composite index for the ten years from 2011 to 2020. To build the architecture of a single-layer ANN and of SVM models with linear, radial basis function (RBF), and polynomial kernels, different technical indicators are derived from daily stock trading data, such as closing, opening, daily high, and daily low prices, and used as inputs. Since both the ANN and SVM models are used as classifiers, accuracy and F-score, calculated from the confusion matrix, are used as performance metrics. The results show that the ANN performs better than the SVM model in terms of accuracy and F-score in predicting the direction of movement of the daily closing price of the KSE-100, KOSPI, Nikkei 225, and SZSE composite indexes.
1. Introduction
Prediction of the stock market index and its direction has remained one of the most interesting and difficult tasks for researchers, data scientists, and econometricians because of its volatile nature. Although numerous studies have been conducted to predict stock price indexes using different econometric models, only a few studies are available on the prediction of the direction of movement. This area of research has gained significant importance in the last few years, especially after the emergence of machine learning and deep learning methods. Predicting the direction of stock price index movement accurately is essential for developing precise and valid market planning. Financial time series such as the stock market have a nonlinear, complex, and chaotic structure and are therefore among the most difficult to predict. For example, numerous financial variables such as demand and supply, interest rates, investments, dividends, the economy, and the political climate can affect the arbitrary movements, and sometimes the crash, of the stock market. Therefore, appropriate models and efficient preprocessing of the data are required for accurate predictions. Several classical time series models have been suggested in the literature and used to predict stock market indexes. The autoregressive conditional heteroskedasticity (ARCH) and generalized autoregressive conditional heteroskedasticity (GARCH) models proposed by Engle and Bollerslev have been used by numerous data scientists to forecast financial time series data. The well-known ARMA models (typically known as Box-Jenkins models) combine moving average (MA) and autoregressive (AR) terms and are used under stationary conditions to forecast time series data. In contrast, autoregressive integrated moving average (ARIMA) models were proposed for nonstationary time series datasets.
These classical techniques have been used by various researchers for many years to address the nonlinearity and nonstationarity of stock market indexes.
However, with the expansion of machine learning techniques, many scientists have recently been using these models to predict nonlinear and complex financial time series data. The most common of these models are ANN and SVM, which are widely used to predict the direction of stock market movement. Because of the noise and nonstationary behavior of stock market indexes, however, it is very difficult to train an ANN model that produces efficient results [7–9]. Therefore, extreme care is needed when training models, with appropriate choices of tuning parameters such as the number of hidden layers, the training algorithm, the number of nodes in each hidden layer, the learning rate, and the activation function, to minimize the forecast error. The resilient back-propagation neural network (RBPNN), with and without weight backtracking, was developed to train the feed-forward neural network (FFNN); it converges faster than the classical back-propagation neural network (BPNN). The novelty of this paper is that we use this idea to build the architecture of an FFNN that predicts the direction of movement of the daily closing price of four well-known stock price indexes, by extracting fifteen technical indicators from the historical data and using them as inputs to the ANN and SVM.
The rest of the paper is organized as follows. The related work is introduced in Section 2; the preprocessing of the data and the derivation of the technical indicators used as inputs are described in Section 3; the proposed models are presented in Section 4. Results and discussion are presented in Section 5, followed by the conclusion of the paper in Section 6.
2. Literature Review
Recently, predicting the direction of stock market indexes or stock price movement has become an attractive field of research for both financial experts and theoretical scientists. Classical time series prediction models, machine learning techniques (particularly SVM and ANN), and hybrid models are all widely used to predict stock price index movements. In this section, we consider research studies conducted primarily with these soft computing models and their hybrid versions.
There is rich literature available on ANN and its hybrid versions to predict the random behavior of the stock market. Different procedures have been implemented to extract the important features/characteristics that are used as an input layer to construct the structure of the ANN model. Some of the related literature is discussed as follows.
Abraham et al. proposed a soft computing and automated method for stock market forecasting and trend analysis, for which two years of daily data were taken from the Nasdaq-100 index and other companies listed in the same stock market. To analyze the trend of the stock market, a one-day-ahead forecast was made based on a neural network and a neuro-fuzzy system. Principal component analysis (PCA) was employed for preprocessing of the data. The root mean square error (RMSE) was used as the performance measure to investigate the reliability of the proposed hybrid model, and it was concluded that its forecasting results are promising.
Lahmiri used a resilient back-propagation neural network (RBPNN) to predict the price level of five major international stock markets, that is, S&P500, Nikkei, STSE100, DAX, and CAC40. In this article, the author used different technical analysis measures, for example, indicators, oscillators, stochastics, and indexes, as inputs to the resilient back-propagation neural network for the prediction of the price level. The results of the study show that technical analysis indicators can be used effectively in building the ANN architecture to estimate stock price indexes.
The direction of movement of the Taiwan stock exchange was modeled by Chen et al. with the help of a probabilistic neural network (PNN), which was trained on historical data to forecast the direction of stock returns. The proposed method was compared with the generalized method of moments (GMM) with Kalman filter by using different performance measures. Empirical evidence showed that investment strategies based on the PNN model yield higher profits than buy-and-hold strategies.
Baba and Suto constructed an intelligent decision support system for the prediction of the Tokyo stock market using a neural network (NN) and temporal difference learning (TDL) approach. The performance of the proposed system was simulated using sixteen enterprises listed on the Tokyo stock exchange. The simulation results suggest the overall effectiveness of the combined TDL and NN system. The total relative and maximum errors suggest that the TD-learning method is a good forecasting tool for stock prices.
A three-layer feed-forward neural network (FFNN) has been used to predict the stock price movements of 367 different firms on the Shanghai stock exchange. The proposed method was compared with univariate and multivariate linear models, using statistical measures such as mean absolute deviation (MAD), mean absolute percentage error (MAPE), mean square error (MSE), and standard deviation (SD). These measures suggest that the forecast reliability of the FFNN method is better than that of the two competing models, and the method is hence recommended for predicting the stock price movement of firms listed on the Shanghai stock exchange.
A composite time series adaptive network-based fuzzy inference system (ANFIS) approach has been combined with empirical mode decomposition (EMD) to predict the stock prices of the Taiwan stock exchange. Different classical and machine learning techniques, including the well-known autoregressive (AR) model and the support vector machine (SVM), were compared with the proposed method to assess its prediction reliability. The RMSE results suggest that the hybrid ANFIS model is the best candidate among the classical and hybrid models considered.
Moreover, the direction of movement of stocks and stock price indexes for the Indian stock exchange has been predicted by Patel et al. using four different machine learning techniques, namely ANN, SVM, random forest, and naive Bayes, with two different approaches for the input to these models. The first approach uses ten technical indicators: simple moving average, weighted moving average, momentum, stochastic %K, stochastic %D, relative strength index (RSI), moving average convergence divergence (MACD), Larry Williams %R, A/D (accumulation/distribution) oscillator, and commodity channel index (CCI). In the second approach, the actual values of these technical parameters are transformed into discrete values, which is called the trend deterministic data preparation layer. The performance of these techniques was checked on ten years of historical data of two Indian stocks. The experimental results show that when the actual values of the technical indicators are used as inputs, the random forest performs better than the other models, and that under the second approach, when the technical indicators are used as trend deterministic data, the performance of all four models improves.
To predict the daily Istanbul Stock Exchange 100 index, Kara et al. applied ANN and SVM models as classifiers. The authors used ten technical indicators as inputs to build the architecture of the two models. The choice of these technical parameters is based on the previous literature and expert opinion, which confirm that they influence, and have the potential to bring significant changes in, stock market indexes. The empirical results confirm that the prediction accuracy of the ANN model is 74.74%, which is greater than the 71.52% accuracy of the SVM model.
Since its introduction by Vapnik, the support vector machine has been widely used by numerous researchers, along with its different versions, for the prediction of stock markets throughout the world. For example, the SVM model was used as a classifier to predict the short-term direction of movement of the NIKKEI 225 index. The United States of America (USA) imports one-third of its autos and their parts from Japan, which confirms a close relationship between the economies of Japan and the USA. Therefore, the US S&P-500 index and the exchange rate of the US dollar against the Japanese yen (JPY) were used as inputs to the SVM model. The proposed model was then compared with Elman BPNN, linear discriminant analysis, and quadratic discriminant analysis to check the prediction performance of the SVM. The results show that the hit ratio of the SVM is 73% and that integrating it with other classification methods gives an accuracy of 75%, which is higher than that of the competing models.
Similarly, an attempt was made to predict the daily Korean stock price index (KOSPI) by using the SVM method, where twelve technical indicators were used as inputs to construct the SVM model. After its construction, the model was applied to predict the direction of movement of the KOSPI stock market index. Moreover, the suggested SVM model was compared with the BPNN and case-based reasoning (CBR) models, and it was concluded that the SVM is the best forecasting tool for the prediction of the stock market index.
Random forest and SVM have also been used to predict the S&P CNX NIFTY stock market index of the National Stock Exchange, which is one of the fastest-growing financial exchanges in the developing Asian countries. These two methods were compared with the neural network, discriminant analysis, and logit model. The research shows that the SVM method outperforms random forest, discriminant analysis, and the logit model.
Similarly, different artificial intelligence methods have been hybridized to forecast the stock price index of seven major financial markets. In the preprocessing phase, a self-organizing map (SOM) was deployed to break the data down into different components based on the similarity of their statistical distributions. After dividing the diversified dataset into groups of similar observations, support vector regression (SVR) was applied to forecast the stock price index. The results indicate that the prediction accuracy can be increased by using the two-stage SVR algorithm rather than the classical single-stage SVR. For further details on the applicability of the support vector machine to the prediction of stock markets, readers are referred to the review literature.
It is evident from the literature review that forecasting the direction of movement of the stock price index is of paramount importance, but very few studies have been conducted on the prediction of the KSE-100 index using the support vector machine and artificial neural network with numerous technical indicators as inputs. Therefore, the main objective of this research work is to model the Pakistan KSE-100 index, along with other well-established stock price indexes, using state-of-the-art machine learning models with a higher degree of forecasting accuracy.
3. Materials and Methods
For this research study, we have used ten years of data, ranging from January 1, 2011, to September 27, 2020, for the stock indexes of four Asian countries, namely, the KSE-100 index of Pakistan, the Korean stock price index (KOSPI), the NIKKEI 225 of Japan, and the Shenzhen stock exchange (SZSE) composite index of China. The data points are split into two groups: the first group, consisting of 80% of the data, is used for training the model, and the second group, containing the remaining 20%, is used for testing the model. Since the main objective of this research work is to forecast the direction of movement of the daily closing prices of four stock indexes with the help of ANN and SVM, fifteen technical indicators are used as input to these two well-known machine learning techniques. These indicators, determined from an extensive literature review, are stochastic %K, stochastic %D, ROC, Larry Williams %R, Momentum, Disparity 5, Disparity 15, OSCP, CCI, RSI, Pivot point, S1, S2, R1, and R2 [7, 24]. The details of these indicators and their mathematical structure are presented in Table 1. Descriptive statistics of the four stock price indexes, such as the mean and standard deviation, are presented in Tables 2–5. To predict the direction of movement of the stock market prices, the prediction problem is cast as a classification problem with labels “0” and “1,” where “0” indicates that the daily closing stock price index for the next day is lower than today’s price and “1” means that it is higher. The percentage of increase and decrease cases in each year is also calculated and presented in Tables 6–9. Furthermore, the technical indicators are rescaled to the range from 0 to 1 by using min-max scaling.
This approach is preferred here over the classical normalization method, which has a zero breakdown point against any single outlying observation in the data. Min-max scaling is defined by the following equation:
$$Y' = \frac{Y - Y_{\min}}{Y_{\max} - Y_{\min}}, \qquad (1)$$
where $Y_{\min}$ and $Y_{\max}$ are the lowest and highest values in the data, respectively.
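As a small illustration of min-max scaling, the following Python sketch rescales one indicator column onto the interval from 0 to 1. The function name `min_max_scale` and the sample values are hypothetical and chosen only for illustration; the handling of a constant column is also an assumption, since the paper does not discuss that case.

```python
def min_max_scale(values):
    """Rescale a sequence of numbers linearly onto the interval [0, 1]."""
    y_min, y_max = min(values), max(values)
    if y_max == y_min:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(y - y_min) / (y_max - y_min) for y in values]


# Hypothetical RSI readings, used only to show the mechanics.
rsi = [30.0, 45.0, 70.0, 55.0]
scaled = min_max_scale(rsi)
print(scaled)
```

In a train/test setting, the minimum and maximum would typically be taken from the training portion only and then reused for the test portion; the paper does not state which convention was followed.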
Tables 2–5 confirm that the values of the stochastic oscillator %K and Williams %R vary between “0” and “100.” The four levels of the pivot point, i.e., support 1 (S1), support 2 (S2), resistance 1 (R1), and resistance 2 (R2), have the same minimum value for the KSE-100 index, whereas for the other three indexes the minimum values of these four levels differ, with only a slight variation among their extreme values.
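To make the 0 to 100 range of the two oscillators concrete, here is a minimal Python sketch. Table 1 is not reproduced in this text, so the formulas below are the commonly used definitions and should be read as an assumption about what the table contains; in particular, Williams %R is written without the conventional negative sign so that it lies between 0 and 100, as the tables report. The function names and the sample prices are hypothetical.

```python
def _window_extremes(highs, lows, n):
    """Highest high and lowest low over the last n trading days."""
    highest_high = max(highs[-n:])
    lowest_low = min(lows[-n:])
    if highest_high == lowest_low:
        raise ValueError("flat window: the oscillator is undefined")
    return highest_high, lowest_low


def stochastic_k(highs, lows, closes, n):
    """Stochastic %K for the latest day: position of the close in the n-day range."""
    hh, ll = _window_extremes(highs, lows, n)
    return (closes[-1] - ll) / (hh - ll) * 100.0


def williams_r(highs, lows, closes, n):
    """Williams %R on a 0-100 scale: distance of the close from the n-day high."""
    hh, ll = _window_extremes(highs, lows, n)
    return (hh - closes[-1]) / (hh - ll) * 100.0


# Hypothetical three-day price history.
highs = [12.0, 13.0, 14.0]
lows = [10.0, 11.0, 12.0]
closes = [11.0, 12.0, 13.0]
k = stochastic_k(highs, lows, closes, n=3)
r = williams_r(highs, lows, closes, n=3)
print(k, r)
```

Under these definitions the two indicators always sum to 100 for the same window, which is why both are bounded by 0 and 100.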
It can be seen from the formulas for the pivot point, S1, S2, R1, and R2 presented in Table 1 that the previous day’s high, low, and closing prices are used to calculate their values. Pivot points are an intraday indicator for trading futures, commodities, and stocks. For example, traders know that if the price falls below the pivot point, they will likely be shorting early in the session. Conversely, if the price is above the pivot point, they will be buying. The mathematical structures of the pivot point and its four levels are similar, which is why their minimum and maximum values are almost the same for the four stock price indexes.
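The dependence of the four levels on the pivot point can be sketched as follows. Table 1 is not reproduced in this text, so the formulas are the classic floor-trader definitions and are an assumption about the table's content; the function name `pivot_levels` and the sample prices are hypothetical.

```python
def pivot_levels(prev_high, prev_low, prev_close):
    """Pivot point and its support/resistance levels from the previous day's prices."""
    pivot = (prev_high + prev_low + prev_close) / 3.0
    spread = prev_high - prev_low  # previous day's trading range
    return {
        "P": pivot,
        "S1": 2.0 * pivot - prev_high,
        "R1": 2.0 * pivot - prev_low,
        "S2": pivot - spread,
        "R2": pivot + spread,
    }


# Hypothetical previous-day high, low, and close.
levels = pivot_levels(110.0, 100.0, 105.0)
print(levels)
```

Because every level is the pivot shifted by a multiple of the previous day's range, the five series move together, which is consistent with their near-identical summary statistics.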
The mean and standard deviation of the pivot point and its four levels are approximately identical for all four stock indexes, demonstrating that they are strongly correlated. This is theoretically sound, as the four levels depend upon the pivot point.
The percentages of increase and decrease cases in each year are presented in Tables 6–9. These tables show that the yearly shares of increase and decrease cases follow a nonlinear and disordered pattern for all four stock price indexes.
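The labeling rule and the yearly percentages can be sketched in a few lines of Python. The function names and the sample prices are hypothetical, and treating an unchanged close as “0” is an assumption, since the text only defines the lower and higher cases.

```python
from collections import defaultdict


def direction_labels(closes):
    """Label day t with 1 if the next day's close is higher than today's, else 0."""
    # Assumption: an unchanged close is labeled 0.
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def yearly_up_share(years, labels):
    """Percentage of 'up' labels per calendar year."""
    counts = defaultdict(lambda: [0, 0])  # year -> [up days, labeled days]
    for year, label in zip(years, labels):
        counts[year][0] += label
        counts[year][1] += 1
    return {year: 100.0 * up / total for year, (up, total) in counts.items()}


# Hypothetical closing prices and the calendar year of each trading day.
closes = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
years = [2019, 2019, 2019, 2020, 2020, 2020]
labels = direction_labels(closes)  # the last day has no next-day label
shares = yearly_up_share(years, labels)
print(labels, shares)
```

The last trading day has no label because its next-day close is unknown, so the label series is one element shorter than the price series.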
Asian stock markets have shown considerable resilience during the COVID-19 pandemic. The results presented in Tables 7 and 9 show that, in 2020, the KOSPI and SZSE composite indexes registered their highest percentage of increase cases of the whole sample period, in contrast to the KSE-100 and Nikkei 225 indexes.
4. Prediction Models
In this study, two different soft computing models have been used to predict the KSE-100 index, Nikkei 225 index, KOSPI index, and SZSE composite index daily closing price direction movement. A brief explanation of these models is outlined in the following segment of the paper.
4.1. ANN Model
Neural networks trace back to the work of Warren McCulloch and Walter Pitts in 1943; they are nonlinear mapping structures based on the function of the human brain. The main idea of the ANN is derived from neurons interconnected like a net. The brain consists of billions of neurons that are responsible for processing information travelling to and from the brain. Like the brain, an ANN consists of many artificial neurons (nodes) joined by weighted connections. The input nodes accept data in several forms and compositions and pass it through an internal weighting structure, from which the network learns to produce one output. Just as humans need instructions and directions to arrive at an outcome, ANNs are trained with a learning rule known as back-propagation. A general architecture of the “N” layer feed-forward neural network (FFNN) is presented in Figure 1. In this study, the resilient back-propagation learning procedure is used to train a feed-forward neural network with a single hidden layer. The benefit of resilient back-propagation over the traditional BPNN is that it takes less time to train the network and does not require a learning rate to be specified. Based on the literature, a small threshold value of 0.04, with the logistic function as the activation function, is employed to train the FFNN that predicts the direction of movement of the four stock indexes. The input layer of the network consists of the 15 technical indicators (presented in Table 1), and the single output node represents the direction of movement of the daily closing price, taking the value of either “0” or “1.” If the output value is greater than or equal to 0.5, the predicted direction of movement is upward, and if it is less than 0.5, it is downward.
Table 10 shows all the parameter settings for the proposed FFNN model. All the results were obtained using the “neuralnet” package in R, run in RStudio.
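The core of resilient back-propagation is a sign-based update in which every weight has its own step size. The Python sketch below shows the variant without weight backtracking on a toy two-parameter error surface; it is not the authors' R code. The function names, the toy gradient, the initial step of 0.1, and the step bounds are illustrative assumptions, while the growth and shrink factors of 1.2 and 0.5 are the values commonly used with this method. To our understanding, the threshold in the “neuralnet” package is a stopping criterion on the partial derivatives of the error function, and the sketch uses 0.04 in that sense.

```python
def rprop_minimize(grad_fn, theta, threshold=0.04, max_steps=1000,
                   step_init=0.1, grow=1.2, shrink=0.5,
                   step_min=1e-6, step_max=50.0):
    """Minimize a function with sign-based Rprop updates (no weight backtracking).

    grad_fn(theta) returns the list of partial derivatives at theta.
    Training stops once every partial derivative is smaller than `threshold`.
    """
    theta = list(theta)
    steps = [step_init] * len(theta)
    prev_grad = [0.0] * len(theta)
    for _ in range(max_steps):
        grad = grad_fn(theta)
        if max(abs(g) for g in grad) < threshold:
            break
        for i, g in enumerate(grad):
            if g * prev_grad[i] > 0:      # same sign as before: take a larger step
                steps[i] = min(steps[i] * grow, step_max)
            elif g * prev_grad[i] < 0:    # sign flipped: the minimum was overshot
                steps[i] = max(steps[i] * shrink, step_min)
                g = 0.0                   # skip this weight's update once
            if g > 0:
                theta[i] -= steps[i]      # only the sign of the gradient is used
            elif g < 0:
                theta[i] += steps[i]
            prev_grad[i] = g
    return theta


def to_direction(output, cutoff=0.5):
    """Map a network output in [0, 1] to 1 (upward) or 0 (downward)."""
    return 1 if output >= cutoff else 0


def toy_gradient(theta):
    """Gradient of the toy error (a - 3)^2 + (b + 1)^2, standing in for a network error."""
    a, b = theta
    return [2.0 * (a - 3.0), 2.0 * (b + 1.0)]


solution = rprop_minimize(toy_gradient, [0.0, 0.0])
print(solution)
```

Because only the sign of each partial derivative enters the update, no global learning rate has to be chosen. For a real network, `grad_fn` would return the back-propagated derivatives of the error with respect to all weights.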
4.2. SVM Model
The SVM was developed by Vapnik and remains one of the most widely used supervised machine learning techniques; it is based on statistical learning theory and the structural risk minimization principle. In practice, the SVM can be used for both classification and regression problems, although it is mostly used for classification. The key concept of the SVM is to minimize an upper bound on the generalization error, which differs from other machine learning techniques such as the back-propagation network (BPN) that work on the principle of minimizing the empirical error.
In the SVM algorithm, each data item is plotted as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. Classification is then performed by finding the hyperplane that best separates the two classes. Let $x_i \in \mathbb{R}^n$, $i = 1, 2, \ldots, N$, denote a set of input vectors with corresponding class labels $y_i \in \{+1, -1\}$. The SVM algorithm maps the input vectors into a high-dimensional feature space $\Phi(x_i)$. The SVM kernel is a function that takes a low-dimensional input space and transforms it into a higher-dimensional space, i.e., it converts nonseparable problems into separable problems. It is mostly useful in nonlinear separation problems. To draw a hyperplane in a nonlinear separation problem, a kernel function $K(x_i, x_j)$ is used that accomplishes the mapping $\Phi(\cdot)$. The resulting decision function of the hyperplane is
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{N} y_i \alpha_i K(x, x_i) + b \right). \qquad (2)$$
The values of $\alpha_i$ in (2) can be found by solving the following problem:
$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j). \qquad (3)$$
Equation (3) holds only if $0 \le \alpha_i \le C$ and $\sum_{i=1}^{N} \alpha_i y_i = 0$. There is a trade-off between the misclassification error and maximizing the margin, which is controlled by the parameter “C,” technically known as the regularization parameter. The most well-known and widely used kernels for SVM are the linear, polynomial, and radial basis function (RBF) kernels, and all three are considered here. It is evident from Table 5 that the precision of the SVM with the radial basis kernel is the highest; it is therefore suggested for training the model. The linear and radial basis kernels are defined in equations (4) and (5):
$$K(x_i, x_j) = x_i^{T} x_j, \qquad (4)$$
$$K(x_i, x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^{2} \right), \qquad (5)$$
where $\gamma$ is the kernel parameter used in both the RBF and polynomial kernels.
For the linear kernel, the only hyperparameter that needs to be optimized is “C.” For the RBF kernel, both the regularization parameter C and the kernel parameter γ need to be optimized simultaneously with the help of a grid search, whereas for the polynomial kernel three parameters must be tuned, i.e., C, the degree d, and γ.
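The three kernels and their tunable parameters can be written out directly. The sketch below is illustrative Python, not the authors' code: the linear and RBF kernels follow the usual definitions, while the polynomial form with an optional offset `coef0` is the one used by common SVM libraries and is an assumption here, since the paper does not show it. The parameter C does not appear because it enters the optimization problem rather than the kernel.

```python
import math


def linear_kernel(x, z):
    """Inner product of two feature vectors; no kernel parameter to tune."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel; gamma controls how quickly similarity decays."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, z, gamma, degree, coef0=0.0):
    """Polynomial kernel (assumed form): (gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


# Hypothetical two-feature vectors.
x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))
print(rbf_kernel(x, z, gamma=0.5))
print(polynomial_kernel(x, z, gamma=1.0, degree=2))
```

The RBF kernel returns 1 for identical vectors and decays toward 0 as they move apart, with larger γ giving a faster decay.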
4.2.1. Parameter Optimization Using a Grid Search
A grid search is an exhaustive method for optimizing model parameters over a predefined hyperparameter space, specified by lower and upper bounds and a number of steps. Three different scales, namely linear, quadratic, and logarithmic, are used for this purpose, and their performance is evaluated using performance metrics.
Cross-validation (CV) accuracy can be used as the criterion for optimizing the SVM parameters (e.g., C, degree, etc.), the main objective being to identify a good combination of hyperparameters so that the classifier can make accurate predictions. Cross-validation is a useful technique for avoiding the problem of overfitting. The flowchart of SVM parameter optimization using grid search is presented in Figure 2. To choose the best values of the parameters C and γ using k-fold CV, the data are split into “k” subsets (for example, k = 10), where “k − 1” subsets are used for training and the remaining subset is used for testing. Based on this splitting, the cross-validation error of the SVM classifier is computed for different combinations of hyperparameters, and the combination with the highest accuracy, or equivalently the lowest CV error, is selected for training the SVM model. The results for the analyzed stock indexes are presented in Figures 3–6. There is one parameter, C, to be optimized for the linear kernel; two parameters, C and γ, for the RBF kernel; and three parameters, C, γ, and the degree, for the polynomial kernel. There can be additional parameters; however, a large number of parameters may result in a huge number of combinations. One limitation is that there are no exact ranges for the values of C and γ.
It is believed that the wider the parameter range is, the more possibilities the grid search method has of finding the best combination of parameters. Therefore, for this experiment, the search range for both C and γ was chosen to be from 0.001 to 1000.
The values of γ and C interact: for larger values of γ, the effect of C becomes negligible, whereas if γ is small, C affects the SVM model with the RBF kernel in the same way as it affects a model with a linear kernel. The maximum prediction accuracy obtained with the grid search optimization technique for the linear, RBF, and polynomial kernels on the four stock index time series is reported in Table 11. Furthermore, the results suggest that, for the four stock indexes, the prediction accuracy of the SVM optimized by the grid search method is highest with the linear and RBF kernels, which are hence used in this study.
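The grid search procedure can be sketched as follows: build a logarithmic grid between the chosen bounds, split the sample indices into k folds, and keep the pair (C, γ) with the best mean cross-validated score. This Python sketch is a stand-in and not the authors' implementation. All function names are hypothetical, and `toy_cv_score` is a placeholder that replaces fitting an SVM on the training fold and measuring accuracy on the held-out fold; it is constructed to peak at C = 10 and γ = 0.1 purely so that the example runs without any external library.

```python
import itertools
import math


def log_grid(lower, upper, steps):
    """Return `steps` values evenly spaced on a log10 scale from lower to upper."""
    lo, hi = math.log10(lower), math.log10(upper)
    return [10.0 ** (lo + i * (hi - lo) / (steps - 1)) for i in range(steps)]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def grid_search(fit_and_score, n_samples, c_values, gamma_values, k=10):
    """Return (best mean CV score, best C, best gamma) over the full grid."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        scores = [fit_and_score(train, test, c, gamma)
                  for train, test in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if best is None or mean_score > best[0]:
            best = (mean_score, c, gamma)
    return best


def toy_cv_score(train_idx, test_idx, c, gamma):
    """Placeholder for 'train an SVM on train_idx, return accuracy on test_idx'."""
    return 1.0 / (1.0 + abs(math.log10(c) - 1.0) + abs(math.log10(gamma) + 1.0))


grid = log_grid(0.001, 1000.0, 7)  # seven values per parameter, one per decade
best_score, best_c, best_gamma = grid_search(toy_cv_score, 50, grid, grid, k=10)
print(best_c, best_gamma)
```

With seven values per parameter, the RBF search evaluates 49 combinations, each with ten folds, which shows how quickly the cost grows when more parameters or finer grids are added.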
5. Results and Discussion
Here, the SVM and ANN models are used as classifiers; therefore, accuracy and F-score are used as performance metrics to check the efficiency of the suggested models. Before calculating accuracy and F-score, precision and recall are calculated from the confusion matrix, i.e., true positive (TP), false positive (FP), true negative (TN), and false negative (FN). In their standard form, these performance measures are defined as
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN},$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}, \qquad (11)$$
$$\text{F-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad (12)$$
where precision is the ratio of true positives to the sum of true and false positives. Since the main objective is the prediction of the direction of movement of the closing prices of the four stock indexes, true positive (TP) means that the direction of movement is upward and the suggested model correctly predicted it. Similarly, FP means that the proposed model incorrectly predicted the direction of movement as upward when it is downward. The true negative (TN) means that the proposed model correctly predicted the direction of movement as downward, and the false negative (FN) means that the direction of movement is actually upward and the suggested model predicted it incorrectly as downward. Both the SVM and ANN models are built with the best choices of hyperparameters, as reported in Tables 11 and 12.
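The four confusion-matrix counts and the metrics derived from them can be computed as in the following Python sketch. The function names and the sample labels and counts are hypothetical; returning 0.0 when a denominator is empty is a convention assumed here rather than something the paper specifies.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN with 1 = upward and 0 = downward."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, accuracy, and F-score from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + recall
    f_score = 2.0 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_score": f_score}


# Hypothetical actual and predicted directions for five days.
counts = confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# Hypothetical counts for a 100-day test set.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(counts, metrics)
```

The F-score is the harmonic mean of precision and recall, so it is high only when both are high, which makes it a useful complement to accuracy when upward and downward days are unevenly represented.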
The empirical analysis consists of two steps: training the models, with optimization of the hyperparameters using the grid search approach, followed by checking their forecasting accuracy on the testing data. The optimization of the RBF kernel with two tuning parameters is conducted by using k-fold cross-validation. The results of this cross-validation are presented in Figures 3–6, whereas the performance measures of the two models are presented in Table 12.
Furthermore, it can be observed from Table 11 that the accuracy and F-score for the KSE-100 index, Nikkei 225, and SZSE composite are highest with a linear kernel, whose single parameter is tuned using the grid search approach to C = 964.77, 638.0629, and 324.72, respectively. The accuracy and F-score for these three indexes are (0.8519, 0.8395), (0.8022, 0.7912), and (0.8998, 0.8790), whereas for the KOSPI index, the maximum accuracy and F-score (0.8180, 0.7932) are achieved by using the RBF kernel with C = 150 together with the tuned kernel parameter γ. After the optimization of the hyperparameters of both models, the accuracy metrics defined in equations (7)–(12) are calculated for the testing dataset, and the results are presented in Table 12.
Similarly, the feed-forward ANN model is trained with the resilient back-propagation learning algorithm using the parameter settings presented in Table 10. Here, a single hidden layer with one neuron is used. Because this already gives an accuracy between 84 percent and 95 percent, using more than one hidden layer would probably lead to overfitting. After training the two proposed models with the best parameter selection using grid search, the next step is to compare them by using accuracy and F-score. These two performance metrics are computed on the test data, and the results are presented in Table 12.
It is evident from the results presented in Table 12 that the ANN model predicts the direction of movement of the closing prices of the KSE-100, KOSPI, Nikkei 225, and SZSE composite indexes better than the SVM model with linear and RBF kernels. The accuracy and F-score presented in Table 12 were found using equations (11) and (12). The accuracy of the ANN model for KSE-100, KOSPI, Nikkei 225, and SZSE composite is 0.9011, 0.8576, 0.8456, and 0.9513, and the corresponding F-scores are 0.8898, 0.8314, 0.8277, and 0.9345, respectively. These results are higher than those of the SVM model based on linear and RBF kernels, whose accuracy is 0.8519, 0.8180, 0.8022, and 0.8998 with corresponding F-scores of 0.8395, 0.7932, 0.7912, and 0.8790, respectively, although the SVM remains a good competitor to the ANN model. The average prediction accuracy of the ANN model over the four stock indexes is 88.91% with an average F-score of 87.08%, suggesting that its performance is better than that of the SVM model, which has an average accuracy of 84.29% and an average F-score of 82.57%.
Since it is well-known fact that the behavior of stock market data is complex and nonlinear and that its prediction is not only challenging but a difficult task. The main purpose of this research study is to predict the direction movement of the closing price of the four different stock prices, namely, the KSE-100 index, KOSPI index, Nikkei 225 index, and SZSE composite using the two well-known machine learning techniques, i.e., ANN and SVM models. These two models are trained by using the ten years of historical data (i.e., from 2011 to 2020) of daily closing prices of these four stock price indexes from the yahoo finance official website. Fifteen various technical indicators have been extracted from the historical stock data and used as input layers to train the two models. For all the four stock indexes, the single-layer ANN model prediction accuracy and corresponding F-scores showed to be higher than the SVM model (with three different kernels, i.e., linear, RBF, and polynomials).
Given the accuracy and F-score values obtained, both the ANN and SVM models can be considered useful tools, provided suitable inputs are chosen, for predicting the direction of movement of stocks and stock price indexes such as the KSE-100, KOSPI, Nikkei 225, and SZSE composite indexes. Such predictions can help reduce the risk of loss and improve investors' confidence in investing in these markets. The prediction accuracy of the ANN model is higher than that of the SVM model, and the performance of both is consistent with the choice of many researchers in previous studies (e.g., [7, 8, 20, 21]) to use these two models, which have outperformed other models used in the literature for the same purpose, to predict stock market indexes. A limitation of this study is that macroeconomic variables such as foreign exchange rates, interest rates, and consumer prices were not included as inputs to the proposed models. Nonetheless, the fifteen technical indicators used in this study proved very useful in predicting the direction of movement of the KSE-100, KOSPI, Nikkei 225, and SZSE composite indexes using the ANN and SVM models.
As future work, we are considering predicting stock prices with an ensemble model based on long short-term memory (LSTM) networks with different inputs. Although most stock prices show a daily pattern, as in this study, data from other areas may behave differently; some sales datasets, for example, show a seasonal pattern. Therefore, an attempt will be made to use different lengths of historical data, different lengths of binary features, and other driving features such as exchange rates and other macroeconomic indicators.
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
[1] M. T. Leung, H. Daouk, and A. S. Chen, “Forecasting stock indices: a comparison of classification and level estimation models,” International Journal of Forecasting, vol. 16, no. 2, pp. 173–190, 2000.
[2] Y. S. A. Mostafa and A. F. Atiya, “Introduction to financial forecasting,” Applied Intelligence, vol. 6, no. 3, pp. 205–213, 1996.
[3] T. Z. Tan, C. Quek, and G. S. Ng, “Biological brain-inspired genetic complementary learning for stock market and bank failure prediction,” Computational Intelligence, vol. 23, no. 2, pp. 236–261, 2007.
[4] R. F. Engle, “Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation,” Econometrica, vol. 50, no. 4, pp. 987–1008, 1982.
[5] T. Bollerslev, “Generalized autoregressive conditional heteroskedasticity,” Journal of Econometrics, vol. 31, no. 3, pp. 307–327, 1986.
[6] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Control, Holden Day, San Francisco, CA, USA, 1976.
[7] K. J. Kim and I. Han, “Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index,” Expert Systems with Applications, vol. 19, no. 2, pp. 125–132, 2000.
[8] K. J. Kim, “Financial time series forecasting using support vector machines,” Neurocomputing, vol. 55, no. 1-2, pp. 307–319, 2003.
[9] K. Manish and M. Thenmozhi, “Forecasting stock index movement: a comparison of support vector machines and random forest,” in Proceedings of the Ninth Indian Institute of Capital Markets Conference, Mumbai, India, December 2005.
[10] M. Riedmiller and H. Braun, “Rprop-a fast adaptive learning algorithm,” in Proceedings of the 7th International Symposium on Computer and Information Sciences, ISCIS VII, Antalya, Turkey, 1992.
[11] A. Abraham, B. Nath, and P. K. Mahanti, “Hybrid intelligent systems for stock market analysis,” in Proceedings of the International Conference on Computational Science, pp. 337–345, Springer, San Francisco, CA, USA, May 2001.
[12] S. Lahmiri, “Resilient back-propagation algorithm, technical analysis and the predictability of time series in the financial industry,” Decision Science Letters, vol. 1, no. 2, pp. 47–52, 2012.
[13] A. S. Chen, M. T. Leung, and H. Daouk, “Application of neural networks to an emerging financial market: forecasting and trading the Taiwan Stock Index,” Computers & Operations Research, vol. 30, no. 6, pp. 901–923, 2003.
[14] N. Baba and H. Suto, “Utilization of artificial neural networks and the TD-learning method for constructing intelligent decision support systems,” European Journal of Operational Research, vol. 122, no. 2, pp. 501–508, 2000.
[15] F. Li and C. Liu, “Application study of BP neural network on stock market prediction,” in Proceedings of the 2009 Ninth International Conference on Hybrid Intelligent Systems, pp. 174–178, Shenyang, China, August 2009.
[16] L. Y. Wei, “A hybrid ANFIS model based on empirical mode decomposition for stock time series forecasting,” Applied Soft Computing, vol. 42, pp. 368–376, 2016.
[17] J. Patel, S. Shah, P. Thakkar, and K. Kotecha, “Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques,” Expert Systems with Applications, vol. 42, no. 1, pp. 259–268, 2015.
[18] Y. Kara, M. A. Boyacioglu, and Ö. K. Baykan, “Predicting direction of stock price index movement using artificial neural networks and support vector machines: the sample of the Istanbul Stock Exchange,” Expert Systems with Applications, vol. 38, no. 5, pp. 5311–5319, 2011.
[19] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, Berlin, Germany, 1995.
[20] W. Huang, Y. Nakamori, and S. Y. Wang, “Forecasting stock market movement direction with support vector machine,” Computers & Operations Research, vol. 32, no. 10, pp. 2513–2522, 2005.
[21] M. Kumar and M. Thenmozhi, “Forecasting stock index movement: a comparison of support vector machines and random forest,” in Proceedings of the Indian Institute of Capital Markets 9th Capital Markets Conference Paper, Navi Mumbai, India, 2006.
[22] S. H. Hsu, J. P. A. Hsieh, T. C. Chih, and K. C. Hsu, “A two-stage architecture for stock price forecasting by integrating self-organizing map and support vector regression,” Expert Systems with Applications, vol. 36, no. 4, pp. 7947–7951, 2009.
[23] G. S. Atsalakis and K. P. Valavanis, “Surveying stock market forecasting techniques - Part II: soft computing methods,” Expert Systems with Applications, vol. 36, no. 3, pp. 5932–5941, 2009.
[24] S. B. Achelis, Technical Analysis from A to Z, McGraw Hill, NY, USA, 2001.
[25] X. Xu, C. Zhou, and Z. Wang, “Credit scoring algorithm based on link analysis ranking with support vector machine,” Expert Systems with Applications, vol. 36, pp. 2625–2632, 2009.
[26] S. W. Lin, K. C. Ying, S. C. Chen, and Z. J. Lee, “Particle swarm optimization for parameter determination and feature selection of support vector machines,” Expert Systems with Applications, vol. 35, no. 4, pp. 1817–1824, 2008.
[27] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.