Can Yang, Junjie Zhai, Guihua Tao, "Deep Learning for Price Movement Prediction Using Convolutional Neural Network and Long Short-Term Memory", Mathematical Problems in Engineering, vol. 2020, Article ID 2746845, 13 pages, 2020. https://doi.org/10.1155/2020/2746845
Deep Learning for Price Movement Prediction Using Convolutional Neural Network and Long Short-Term Memory
The prediction of stock price movement direction is significant in financial studies. In recent years, a number of deep learning models have gradually been applied to stock prediction. This paper presents a deep learning framework to predict price movement direction based on historical information in financial time series. The framework combines a convolutional neural network (CNN) for feature extraction and a long short-term memory (LSTM) network for prediction. We specifically use a three-dimensional CNN for data input in the framework, including the information on time series, technical indicators, and the correlation between stock indices. In the three-dimensional input tensor, the technical indicators are converted into deterministic trend signals and the stock indices are ranked by Pearson product-moment correlation coefficient (PPMCC). During training, a fully connected network is used to drive the CNN to learn a feature vector, which acts as the input of the concatenated LSTM. After both the CNN and the LSTM are trained, they are used for prediction on the testing set. The experimental results demonstrate that the framework outperforms state-of-the-art models in predicting stock price movement direction.
Financial time series prediction, particularly stock price movement prediction, has been one of the most difficult problems for investors and researchers. Accurately forecasting the direction of stock price movement plays a key role in deciding when to buy and sell a stock. However, stock prices are easily affected by macro- and microeconomic factors, such as interest rates, exchange rates, and monetary policy, which makes prediction a challenging task. Motivated by great profits in stock market investment, researchers and speculators have focused on stock market prediction research for decades. Traditional statistical methods such as logistic regression, exponential averaging, ARIMA, and GARCH were used to predict stock price movement [1, 2]. However, statistical methods assume that the time series is generated by a linear process and therefore perform poorly in nonlinear stock price movement prediction. Accordingly, owing to their great success on nonlinear problems, machine learning and deep learning methods have gradually been applied to forecasting stock price movement. Most of them perform two-stage prediction: features are first extracted and then used as input to the model to make predictions.
Feature extraction is one of the most important parts of the stock prediction process. Better market features generally contribute to better predictions. Technical analysis is mostly performed to extract features from the original market data. Machine learning methods such as kNN, ANN, SVM, and RF are often utilized to learn the relationship between the features from technical analysis and price movement [3, 4]. Moreover, deep learning methods, especially CNN, which has achieved great success in computer vision and image processing, are also used for feature extraction. Sezer and Ozbayoglu proposed a time series to image conversion approach to help CNN extract useful features from financial variables. Nevertheless, that approach ignored the potential influence of correlated stock markets. To address this problem, Hoseinzade and Haratizadeh designed a three-dimensional input tensor construction approach, which is capable of extracting features from correlated stock markets. Inspired by their idea, this paper also employs the three-dimensional input tensor construction approach for feature extraction. Another important part of the stock prediction process is selecting or enhancing a model. Recent studies have revealed that deep learning models are superior to traditional machine learning models in financial market prediction [7–10]. CNN, RNN, and LSTM are commonly used deep learning models for predicting stock price movement. In addition, constructing hybrid models is a popular way to enhance model performance, such as the SVM-ANN model, the CNN-SVM model, and the CNN-LSTM model [14–20].
In this study, we propose a hybrid model consisting of CNN and LSTM to predict the direction of stock price movement. On the one hand, we improve the three-dimensional input tensor from which the CNN extracts features. There are two differences between our approach and Hoseinzade and Haratizadeh’s approach. First, Hoseinzade and Haratizadeh used a diverse set of financial variables, including stock prices, technical indicators, and stock indices from other markets, to construct a three-dimensional input tensor as the input of a specified CNN model. Their input tensor ignores the transformation of technical indicators and the degree of correlation between stock markets, while in our improved three-dimensional tensor, technical indicators are converted into deterministic trend signals following a certain rule and stock markets are ordered according to PPMCC. Second, the prediction model they used is a specified CNN, while our approach employs a hybrid model consisting of CNN and LSTM, which combines the advantages of CNN in feature extraction with the advantages of LSTM in time series prediction. On the other hand, we propose a CNN-LSTM model for stock price movement forecasting. Compared with other CNN-LSTM models [14–20], the main difference lies in the CNN-based feature extraction module. Their feature extraction modules mainly aim at extracting features from one-dimensional or two-dimensional input variables, while ours targets a three-dimensional input tensor. Different purposes lead to different structures of feature extraction modules. The final experimental results demonstrate that the improvement of the input tensor and the combination of CNN and LSTM can significantly improve the prediction performance of the model.
In brief, the main contributions of this work can be summarized as follows:
(1) We built an improved three-dimensional input tensor for CNN by converting the technical indicators into deterministic trend signals and using PPMCC to order the correlated stock indices.
(2) We designed a CNN-based feature extraction module, which is suitable for extracting features from the three-dimensional input tensor.
(3) Extensive experiments demonstrated that our improvement on the three-dimensional input tensor can significantly improve prediction performance, and our proposed model outperforms several state-of-the-art models in terms of F-measure.
The rest of the paper is organized as follows. The related work is introduced in Section 2. Section 3 proposes our framework and methods. Section 4 provides extensive experiments. Finally, the conclusion is drawn in Section 5.
2. Related Work
In the stock market forecasting domain, previous research approaches are usually categorized into two groups. One focuses on achieving better feature extraction from a series of financial variables. The other attempts to improve prediction performance by enhancing the models.
2.1. Feature Extraction
Extracting useful features from a diverse set of financial variables is one of the most important issues in stock price movement prediction. Better prediction performance can be gained by having better input features. Technical analysis can be used to extract market features from the original financial variables, and stock prediction often uses it to form the input features for the models. As reported by Shynkevich et al., approximately 20% of stock market prediction models use technical indicators as input features. The models used for extracting market features from technical indicators mainly include machine learning models and deep learning models.
ANN and SVM are commonly used machine learning models for feature extraction in stock market prediction. Thenmozhi and Chand used SVM to extract information transmission features from six global markets over the period from 1999 to 2011 to predict stock returns. Patel et al. focused on investigating the effect of feature extraction on the prediction performance of models. They employed four machine learning models, namely ANN, SVM, RF, and NB, to extract features from ten technical indicators that were converted into deterministic trend signals and then made predictions in Indian stock markets. Their results showed that converting technical indicators into deterministic trend signals is beneficial to feature extraction and hence improves prediction performance.
As a typical deep learning model, CNN has exhibited great ability for feature extraction in computer vision and image processing. Recently, it has gradually been applied to extract market features in stock prediction. Persio and Honchar used CNN to extract features from a one-dimensional input variable obtained from the history of close prices. To compensate for the lack of sufficient information in one-dimensional input, researchers attempted to provide richer financial variables for CNN to extract market features. Some researchers directly used the candlestick chart as the input of CNN [23, 24]. Furthermore, instead of directly taking an image as the input of CNN, Sim et al. employed high-frequency close price data to construct the input image for the CNN model. Sezer and Ozbayoglu proposed a time series to image conversion approach, which utilized 15 technical indicators and 15 different intervals of technical indicators to generate an input image. However, that approach ignored the potential influence of correlated stock markets. To address this problem, Hoseinzade and Haratizadeh recently proposed an approach to build a three-dimensional input tensor for CNN to extract market features. Their experimental results showed that the three-dimensional input tensor is effective in extracting features and hence contributes to improving the performance of the model in predicting the direction of stock price movement.
2.2. Model Enhancement
Combining the model with other techniques is a common way to improve prediction performance. In one study, the authors used harmony search and GA to enhance a traditional ANN model and then utilized the enhanced ANN to make predictions. The results showed that the proposed ANN model dominated the other models considered. Besides, Yin and Bai designed an adaptive SVR for stock data at different time scales. Experimental results showed that the improved SVR, with learning parameters dynamically optimized by PSO, can achieve better results than the traditional SVR. However, in recent years, machine learning models have been challenged by deep learning models in stock market prediction. By investigating the Chinese stock market, Chen et al. found that the deep learning model outperforms the backpropagation network, the extreme learning machine, and RBFNN in stock price prediction. Similarly, Yu and Yan designed a DNN model based on PSR and LSTM to predict stock prices. By predicting multiple stock indices for different periods, they found that the proposed DNN model achieves higher prediction accuracy than ARIMA, SVR, and MLP. Other work has drawn a similar conclusion.
Designing a hybrid model is another popular way to enhance the prediction performance of a single-structure model. A two-stage fusion approach has been proposed in which the first stage uses SVR and the second stage involves different models, including ANN, RF, and SVR. Experiments on the Indian stock market demonstrated the effectiveness of the fusion prediction models. Zhou et al. developed a learning architecture by cascading the logistic regression model onto the GBDT for predicting stock indices. Cao and Wang established a hybrid prediction model, which consists of CNN and SVM, to make stock market predictions. Their results illustrated that the combination of CNN and SVM can significantly improve the model’s prediction performance. Long et al. proposed an end-to-end model named MFNN for feature extraction in the stock price movement prediction task. In their model, both convolutional and recurrent neurons were integrated to construct the multifilter structure. Experiments on the Chinese stock market index CSI300 showed the superiority of MFNN over traditional machine learning models, statistical models, CNN, RNN, and LSTM in terms of accuracy, profitability, and stability. A more commonly used hybrid model is the CNN-LSTM model [14–20]. For example, one study found that the CNN-LSTM model is superior to LSTM and CNN in stock price movement prediction. Li et al. added an attention mechanism to the CNN-LSTM model and further improved its scalability and prediction accuracy. Similarly, Zhou et al. developed a generic framework that uses LSTM and CNN for adversarial training to predict stock price direction in the high-frequency stock market and achieved significant results.
3. The Proposed Framework
The architecture of our proposed model is illustrated in Figure 1, which is comprised of three major steps, including input data representation, CNN for feature extraction, and LSTM for prediction.
3.1. Data Representation
3.1.1. Data Labelling
In the field of forecasting stock price movement, the price movement direction is often classified into two classes: up and down [6, 32]. Class labels indicate the movement direction of the stock price. In this paper, the labels are computed by using the daily close price of a stock index. Let $C_t$ be the close price for a stock index on day $t$. The class label for the $t$-th day is defined as
$$y_t = \begin{cases} 1, & C_{t+1} > C_t, \\ 0, & \text{otherwise}. \end{cases} \tag{1}$$
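The labelling step can be illustrated with a minimal Python sketch. It assumes the common convention that day $t$ is labelled "up" (1) when the close of day $t+1$ exceeds the close of day $t$, and "down" (0) otherwise; the function name `movement_labels` and the toy prices are hypothetical and are not taken from the paper.

```python
def movement_labels(close):
    """Label each day 1 if the next close is higher than that day's close, else 0.

    Assumption: ties (equal closes) are treated as "down". The last day has
    no following close, so it receives no label.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(close, close[1:])]


# Toy close prices (made up for illustration).
close = [100.0, 101.5, 101.0, 101.0, 103.2]
labels = movement_labels(close)
print(labels)  # [1, 0, 0, 1]
```

A series of n close prices yields n - 1 labels, because the final day cannot be labelled without the following day's close.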
3.1.2. Transformed Deterministic Signals
Technical indicators are widely used in stock market prediction. In this paper, we employ ten technical indicators and convert them into deterministic trend signals for prediction, since Jigar et al. demonstrated that the trend deterministic values of technical indicators are better than their raw values in stock trend forecasting. Table 1 presents the specific details.
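$C_t$, $L_t$, and $H_t$ denote the close price, low price, and high price at time $t$, respectively; $\mathrm{LL}_t$ and $\mathrm{HH}_t$ represent, respectively, the lowest low and highest high in the last $t$ days; $\mathrm{UP}_t$ means the upward price change while $\mathrm{DW}_t$ is the downward price change at time $t$. EMA refers to the exponential moving average, $\mathrm{EMA}(k)_t = \mathrm{EMA}(k)_{t-1} + \alpha \times (C_t - \mathrm{EMA}(k)_{t-1})$, $\alpha = 2/(1 + k)$, and $k$ denotes the time period of the $k$-day exponential moving average.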
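As an illustration of how an indicator can be turned into a deterministic trend signal, the following Python sketch computes a $k$-day exponential moving average with smoothing factor $2/(k+1)$ and maps it to +1 or -1. The conversion rule used here (close above the average means an up trend) is an assumption in the spirit of trend deterministic data preparation, not the paper's Table 1; seeding the recursion with the first close is also an assumption, and the function names are hypothetical.

```python
def ema(close, k):
    """k-day exponential moving average with smoothing factor 2 / (k + 1)."""
    alpha = 2.0 / (k + 1)
    out = [close[0]]  # assumption: seed the recursion with the first close
    for price in close[1:]:
        out.append(out[-1] + alpha * (price - out[-1]))
    return out


def trend_signal(close, indicator):
    """Return +1 where the close is above the indicator (up trend), else -1."""
    return [1 if c > m else -1 for c, m in zip(close, indicator)]


# Toy close prices (made up for illustration).
close = [10.0, 11.0, 12.0, 11.0, 9.0]
signal = trend_signal(close, ema(close, k=3))
print(signal)  # [-1, 1, 1, -1, -1]
```

The resulting signal discards the magnitude of the indicator and keeps only the direction it implies, which is the property the deterministic representation relies on.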
3.1.3. Input Tensor Building
In previous work, the authors ordered the features in the two-dimensional input matrix according to the correlation between instances and features before presenting them as input to the CNN. Their results showed that a CNN with specifically ordered features outperforms a CNN that utilizes randomly ordered features. Inspired by their idea, we apply this correlation-based ordering to the three-dimensional input tensors for CNN.
In Figure 2, we show the representation of the three-dimensional input tensor. In the proposed framework shown in Figure 1, the input is a three-dimensional tensor whose dimensions correspond to the number of technical indicators, the number of trading days, and the number of correlated stock indices; that is, each market contributes one converted deterministic variable per technical indicator, for each of the days used for prediction, across all correlated market indices. In this paper, these three sizes are 10, 10, and 11, respectively.
Different from Hoseinzade and Haratizadeh’s approach, in our three-dimensional input tensor, the technical indicators are transformed into deterministic trend signals and the stock indices are ranked by PPMCC. PPMCC is one of the most common measures of linear dependence and reflects the degree of linear correlation between two variables [33, 34]. The calculation formula is as follows:
$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{2}$$
where $x_i$ and $y_i$ are the values of the $x$-th and the $y$-th feature on the $i$-th day, $\bar{x}$ and $\bar{y}$ are the average values of the $x$-th and $y$-th feature, and $n$ represents the number of data points. If $r_{xy} > 0$, there is a positive correlation, and if $r_{xy} < 0$, the features are negatively correlated; otherwise, they are linearly uncorrelated.
In detail, we take the calculation of PPMCC between S&P 500 and DJIA as an example. $x_i$ and $y_i$ in equation (2) are the close prices of S&P 500 and DJIA on the $i$-th day, respectively. $\bar{x}$ and $\bar{y}$ are the corresponding averages of the close prices of S&P 500 and DJIA. $n$ is the number of trading days in S&P 500. Following equation (2), we can obtain the PPMCC between S&P 500 and DJIA. The PPMCC between S&P 500 and each of the other stock indices can be obtained in a similar way, and the results can be found in Figure 3. Therefore, the order of stock indices in the three-dimensional input tensor is S&P 500, NASDAQ, DJIA, RUSSELL, NYSE, DAX, N225, FTSE, CAC40, HSI, and SSE.
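The ranking step can be sketched in a few lines of Python. The sketch computes the Pearson product-moment correlation coefficient between a target series and several candidate series and orders the candidates by descending correlation. The descending order, the function names, and the toy series are assumptions made for illustration; the real inputs would be the close price series of the stock indices.

```python
import math


def ppmcc(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(target, others):
    """Order series names by descending PPMCC with the target series."""
    scores = {name: ppmcc(target, series) for name, series in others.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy series (made up; not real index data).
target = [1.0, 2.0, 3.0, 4.0, 5.0]
others = {
    "index_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves exactly with the target
    "index_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # moves exactly against the target
    "index_c": [1.0, 3.0, 2.0, 5.0, 4.0],   # partially correlated
}
order = rank_by_correlation(target, others)
print(order)  # ['index_a', 'index_c', 'index_b']
```

Placing the target index first and the remaining indices in this order gives the index axis of the input tensor a consistent, correlation-based layout.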
3.2. CNN for Feature Extraction
In general, the CNN model includes several layers, such as the input layer, the convolutional layer, the pooling layer, the fully connected layer, and the output layer. In this paper, we do not employ the pooling layer because Yang et al. claimed, in a financial study, that information would probably be lost if a pooling layer were adopted. The convolutional layer performs convolution operations on the input data; each convolution can be regarded as a filter applied to the input, and the size of the filter determines its coverage. Moreover, each filter shares the same weights across all positions of the input, and the weights are updated in training. As in prior work, Figure 2 exhibits how the filter works in the three-dimensional input tensor. Next, a fully connected layer links the flattened layer to the output layer, forming an MLP network that can perform the prediction and classification operations.
Previous work used a parallel convolutional layer to generate multiple time series representations at different time scales and achieved significant results. Inspired by this, the proposed CNN feature extraction module has 5 layers: a parallel convolutional layer, a merge layer, two convolutional layers, and a flattened layer, which are shown in the dashed frame of Figure 4. In the parallel layer, the convolutions of the different branches are independent of each other. In the merge layer, all features extracted by the parallel branches are concatenated. Then, the concatenated features are processed by the remaining two convolutional layers. Finally, the flattened layer produces the feature vector. Notably, the fully connected layer and the output layer are only used for training; in testing, the LSTM network replaces them and takes as input the feature vector generated by the flattened layer.
In the specific configuration of the CNN feature extraction module, the input tensor is a 10 by 11 matrix with a depth of 10. The parallel convolutional layer consists of two branches that perform convolutions of different sizes, each with ten filters; it is followed by one convolutional layer with ten filters and then another convolutional layer with ten filters. In each convolutional layer, the padding method is “same.” Then, a flattened layer is used to generate the feature vector. When training, the flattened layer is connected to a fully connected network consisting of two hidden layers: the first layer has 10 neurons and the second layer has 2 neurons. The loss function is categorical cross entropy, the number of epochs is 24, and the batch size is 32 in our experiments. Finally, the “softmax” activation function is employed in the output layer.
3.3. LSTM for Prediction
In the combination model, the LSTM network, concatenated with the trained CNN, is used for final prediction. Specifically, the feature vector generated from the flattened layer acts as the input for the LSTM network to make a prediction. The LSTM network comprises an input layer, a hidden layer, and an output layer. The hidden layer, which contains the memory cells, is the main characteristic of LSTM networks. Each of the memory cells has three gates designed for maintaining and adjusting its cell state: a forget gate ($f_t$), an input gate ($i_t$), and an output gate ($o_t$). Each of the gates can be considered a filter that fulfills a certain purpose. The forget gate and the input gate define which information to remove from and add to the cell state, respectively. The output gate specifies which information from the cell state will be utilized as output.
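Figure 5 illustrates the structure of a memory cell. We formulate the LSTM model to process time series of stock indices, referring to the literature. During a forward pass, $h_t$ denotes the output of the LSTM at day $t$ and can be calculated as follows:
$$\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t]\right), \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t]\right), \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t]\right), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t]\right), \\
h_t &= o_t \odot \tanh(c_t),
\end{aligned}$$
where $W$ is the weight matrix and $x_t$ is the input vector at time $t$. $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates at time $t$, respectively. $\tilde{c}_t$ and $c_t$ denote the distorted input to the memory cell and the content of the memory cell at time $t$. In addition, $h_t$ represents the value of the hidden node, $\sigma$ is the logistic sigmoid function, and the symbol $\odot$ represents the elementwise product. The corresponding details of backpropagation through time are introduced in the literature.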
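To make the role of the three gates concrete, the following Python sketch runs one forward step of a single-unit LSTM memory cell. It is a simplified illustration under stated assumptions: the cell is scalar, bias terms are omitted, and the weight values and the `lstm_step` helper are hypothetical rather than the configuration used in the experiments.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w):
    """One forward step of a single-unit LSTM cell (biases omitted).

    w maps each name ("f", "i", "c", "o") to a pair
    (weight on h_prev, weight on x_t).
    """
    def pre(name):
        w_h, w_x = w[name]
        return w_h * h_prev + w_x * x_t

    f_t = sigmoid(pre("f"))              # forget gate: what to drop from the cell state
    i_t = sigmoid(pre("i"))              # input gate: what to add to the cell state
    c_tilde = math.tanh(pre("c"))        # candidate (distorted) input
    c_t = f_t * c_prev + i_t * c_tilde   # updated cell content
    o_t = sigmoid(pre("o"))              # output gate: what to expose
    h_t = o_t * math.tanh(c_t)           # hidden value / output
    return h_t, c_t


# Run a short toy sequence through the cell (weights are made up).
weights = {"f": (0.5, 0.1), "i": (0.4, 0.3), "c": (0.2, 0.9), "o": (0.3, 0.6)}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 1.0, 1.0]:
    h, c = lstm_step(x, h, c, weights)
```

Because the output gate lies in (0, 1) and the hyperbolic tangent lies in (-1, 1), the hidden value always stays strictly between -1 and 1, regardless of the weights.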
In terms of the configuration of the LSTM network, the optimizer is the “Adam” optimization algorithm, the loss function is categorical cross entropy, the number of epochs is 12, and the batch size is 64. As for the time steps, the number of hidden neurons, and the dropout rate, we present candidate levels instead of fixed values. Details can be found in Table 2. For each stock index, these parameters are determined according to the prediction performance on the validation set. The proposed model is implemented in Python 3.5.4. We mainly use machine learning libraries such as “Keras” and “NumPy” for various functionalities.
In this study, we use 11 influential international stock market indices, including CAC40, DJIA, S&P 500, NASDAQ, DAX, FTSE, NYSE, HSI, N225, SSE, and RUSSELL. The data are from the period of January 4, 2010, to December 29, 2017. All the data are downloaded from Yahoo Finance (https://finance.yahoo.com/).
In addition, our experimental scheme for the investigation is based on the proposed deep learning framework, called “CNN3D-DR + LSTM,” and the workflow of the proposed model can be seen in Figure 6. The main steps are described as follows:
(1) Data preprocessing: the original dataset is used to generate the labels and the deterministic trend signals, which act as the input within three-dimensional tensors. Then, the stock indices in the three-dimensional tensor are ranked by PPMCC.
(2) Data partitioning: all the labelled data are first divided into 3 parts: the training set for training, the validation set for parameter determination, and the testing set for performance evaluation.
(3) Training: the training dataset is used to train the CNN connected to a fully connected neural network, and then the trained CNN is used to generate a series of feature vectors, which act as the input of the LSTM neural network. Next, we set different parameters and use the obtained feature vectors to train the LSTM network. Then, we use the validation set to evaluate the prediction performance of the hybrid model and obtain the parameters with optimal prediction performance.
(4) Testing: the testing dataset and the trained CNN are used to compute the feature vectors, which are then fed into the LSTM with optimal parameters to predict the direction of stock price movements.
(5) Evaluation: the prediction performance is evaluated by comparing the predicted values with the real ones.
4.1. Evaluation Methodology
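The evaluation scheme is based on the confusion matrix for two-class classification shown in Table 3; here, TP, FP, FN, and TN denote true positive, false positive, false negative, and true negative counts, respectively. Precision, recall, accuracy, and F-measure are commonly used indicators to evaluate the prediction performance, and the corresponding formulas are as follows:
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN},$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, \qquad \text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}.$$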
Accuracy is an important evaluation indicator. However, it may not be suitable for an unbalanced dataset. For a full assessment of the prediction performance, we also take precision, recall, and F-measure into consideration. In order to evaluate the prediction performance for each class, the precision, recall, and F-measure are averaged over the positive and negative classes. The mean of the F-measure values for the positive and negative classes is also called the macroaverage F-measure [6, 32]. Furthermore, we use the ROC curve, created by plotting the TPR against the FPR at different possible thresholds, to visualize the performance of the proposed models. The AUC (area under the ROC curve) is taken as an overall performance measure because it is independent of the cutoff value. The higher the AUC value, the better the prediction performance of the model.
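The macroaverage F-measure can be computed directly from the confusion matrix counts, as in the following Python sketch. The function names and the toy labels are hypothetical; returning 0.0 when a denominator is zero is an assumption made to keep the sketch well defined.

```python
def f_measure(tp, fp, fn):
    """F-measure of one class from its true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f_measure(y_true, y_pred):
    """Mean of the F-measures of the positive (1) and negative (0) classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    f_pos = f_measure(tp, fp, fn)
    # For the negative class the roles swap: TN acts as TP, FN as FP, FP as FN.
    f_neg = f_measure(tn, fn, fp)
    return (f_pos + f_neg) / 2


# Toy labels (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
score = macro_f_measure(y_true, y_pred)
print(score)  # 0.75
```

Averaging the two per-class scores gives both classes equal weight, so a model that always predicts the majority class is penalized even when its accuracy looks acceptable.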
4.2. Experiments on S&P 500
In this section, we conduct extensive experiments on the S&P 500 to investigate the effectiveness of the proposed model. For simplicity of description, we let “D” indicate that the technical indicators have been converted into deterministic trend signals, and we let “R” indicate that the stock indices in the three-dimensional tensor have been ranked by PPMCC. Specifically, we design several models for comparative experiments as follows:
(1) CNN3D: we take a three-dimensional input tensor as the input data for a CNN model to make a prediction. In the input tensor, the value of a technical indicator is normalized and not converted to deterministic trend signals.
(2) CNN3D-DR: the difference from CNN3D is that the technical indicators in the input tensor are transformed into deterministic trend signals and the stock indices in the tensor are ranked by PPMCC.
(3) LSTM-D: to test the LSTM network, the deterministic trend signals transformed from technical indicators are utilized as the input to make predictions.
(4) CNN3D + LSTM: without using the deterministic trend signals, the CNN is used to extract features from the three-dimensional input tensor, while the LSTM network is employed for making predictions. The input tensor used here is the same as in CNN3D.
(5) CNN3D-D + LSTM: in the three-dimensional input tensor, the technical indicators are transformed into deterministic trend signals but the stock indices are not ranked. The CNN is used to extract features, while the LSTM network is used to make predictions.
(6) CNN3D-DR + LSTM: in contrast to CNN3D + LSTM, in the input tensor, the technical indicators are transformed into deterministic trend signals and the stock indices are ranked by PPMCC.
First of all, we divide the dataset into three parts: training set, validation set, and testing set. The validation set is used to determine the optimal parameters of the LSTM network. Here, we define the split ratio as the ratio of the training and validation sets combined to the testing set. For example, a split ratio of 80/20 means that the training and validation sets together make up 80% of the dataset, while the testing set makes up 20%. For simplicity, we set the ratio between the training set and the validation set to 4 : 1. Table 4 shows the macroaverage F-measure of CNN3D-DR + LSTM with different parameters on the validation set for S&P 500 at one fixed split ratio. For this setting, the optimal number of time steps in the LSTM is 6, the optimal number of hidden neurons is 100, and the optimal dropout rate is 0.3.
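The partitioning described above can be sketched as follows. The sketch assumes that the samples are kept in chronological order (earliest data for training, latest for testing), which is a common choice for time series but is an assumption here; the function name and default arguments are hypothetical.

```python
def chronological_split(samples, train_val_share=0.8, train_to_val=4):
    """Split ordered samples into training, validation, and testing parts.

    train_val_share: fraction of samples used for training plus validation.
    train_to_val: training-to-validation ratio (4 means 4 : 1).
    """
    n = len(samples)
    n_train_val = int(round(n * train_val_share))
    n_train = n_train_val * train_to_val // (train_to_val + 1)
    train = samples[:n_train]
    val = samples[n_train:n_train_val]
    test = samples[n_train_val:]
    return train, val, test


# 100 toy samples with an 80/20 split and a 4 : 1 training-to-validation ratio.
train, val, test = chronological_split(list(range(100)))
print(len(train), len(val), len(test))  # 64 16 20
```

Keeping the order intact avoids evaluating the model on days that precede its training data.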
Then, we design a group of experiments with different sizes of the training set and the testing set to assess the suitability and robustness of the proposed framework. We conduct experiments on S&P 500 and report the average prediction results over different split ratios. The split ratio is set to 60/40, 65/35, 70/30, 75/25, and 80/20, and Table 5 presents the corresponding optimal parameters. Furthermore, we show the average performance of the different models under the different split ratios in Table 6. For a clearer visualization, we illustrate the results in Figure 7.
Comparing the results of CNN3D and CNN3D-DR, we find that CNN3D-DR provides better average performance. In particular, CNN3D-DR + LSTM shows significant superiority over CNN3D + LSTM, which demonstrates that the improvement of the three-dimensional input tensor can significantly improve prediction accuracy. Furthermore, neither CNN3D-DR nor LSTM-D outperforms CNN3D-DR + LSTM, indicating that the hybrid model is effective in improving prediction performance. In brief, CNN3D-DR + LSTM outperforms the others in the given setting, which demonstrates that the improvement of the three-dimensional input tensor and the combination of CNN and LSTM can improve prediction performance. To better evaluate the performance of stock price movement direction prediction, Figure 8 shows the ROC curves of the different experiment groups for a split ratio of 80/20; the area under the ROC curve (AUC) of the proposed model is larger than that of the others.
4.3. Comparison with Other Models
In addition, we conduct a group of experiments to evaluate the performance of the proposed model compared with several state-of-the-art models. We apply all the models in predicting stock price movement direction on five different stock indices, which are S&P 500, DJIA, NASDAQ, NYSE, and RUSSELL, respectively. In the comparison with other models, we divide the dataset into 2 parts: the first 80% of the data is used for training, while the remaining 20% acts as the testing data. Accordingly, the performance of these models is compared in terms of the average macroaverage F-measure.
In terms of the proposed hybrid model, Table 7 shows its optimal parameters on different stock indices. For the other models, the same parameter settings reported in the original papers are used. The details of the other models are described as follows:
(1) PCA + ANN: first, the initial data are mapped to a new feature space by using PCA. Then, we use the resulting representation of the data to train a three-layered ANN for stock price direction prediction. In the hidden layer, the number of neurons is set to 10 and a tangent sigmoid function is used. A logistic sigmoid transfer function is used in the output layer.
(2) SVM: ten technical indicators are represented as trend deterministic data and are then fed into SVM to predict stock price index movement. For each stock, the optimal parameters of SVM are obtained from several given parameter levels. The selected ten technical indicators are the same as in this paper.
(3) CNN-cor: the feature set is extracted from different technical indicators, price, and temporal information and then ordered by the correlations between instances and features. Finally, the ordered features are used to build a two-dimensional input matrix for the specified CNN to predict the direction of stock price movement.
(4) CNNpred: a diverse set of financial variables, including technical indicators, stock indices, commodities, future contracts, etc., is used to construct three-dimensional input tensors. Then, the input tensors are fed into a specified CNN model to make predictions.
(5) CNN + LSTM: we implement a common CNN-LSTM model for comparison. In the model, ten technical indicators are used to construct two-dimensional input data for CNN to extract features. The ten technical indicators are the same as those in  and the parameters of CNN are the same as in this paper. Besides, LSTM is utilized for price direction forecasting. The time steps, number of hidden neurons, and dropout rate are 10, 50, and 0.1, respectively.
Table 8 shows the average results, and Table 9 shows the best performance of the models. The experimental results on the different stock indices demonstrate that the proposed model is superior to the other common models, including ANN, SVM, CNN, and CNN + LSTM.
This paper presented a combined deep learning framework with CNN and LSTM neural networks to predict the direction of stock price movement. First, we improved the three-dimensional input tensor by transforming the technical indicators into deterministic trend signals and ranking the correlated stock indices according to PPMCC. Then, we designed a CNN-based module for feature extraction. Finally, we employed an LSTM network to predict the direction of stock price movement.
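The ranking of stock indices by PPMCC can be sketched in a few lines of Python. This is only an illustration under assumptions: the index names and series are made up, and we assume that the indices are sorted in descending order of their correlation with the target index.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant series has zero variance and would raise ZeroDivisionError here.
    return cov / (std_x * std_y)


def rank_by_ppmcc(target: Sequence[float], candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return candidate names ordered from most to least correlated with the target (assumed order)."""
    return sorted(candidates, key=lambda name: pearson(target, candidates[name]), reverse=True)


# Made-up series standing in for a target index and three other indices.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "index_b": [1.0, 3.0, 2.0, 5.0, 4.0],
    "index_c": [5.0, 4.0, 3.0, 2.0, 1.0],
    "index_a": [2.0, 4.0, 6.0, 8.0, 10.0],
}

ranking = rank_by_ppmcc(target, candidates)
```

In the input tensor, the resulting order would determine the position of each index along the stock index dimension, so that strongly correlated indices sit next to the target.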
Extensive experiments demonstrated that the deterministic trend signals and the ranked stock indices in the three-dimensional input tensor play a significant role in improving the prediction performance. Moreover, the comparison with several state-of-the-art models showed the superiority of the proposed model in predicting the direction of stock price movement.
In future work, a core challenge will likely be to design better learning models that extract more valuable features and thereby further improve the prediction performance.
Abbreviations
ANN: Artificial neural network
ARIMA: Autoregressive integrated moving average
CNN: Convolutional neural network
DAX: DAX performance index
DJIA: Dow Jones industrial average
DNN: Deep neural network
FPR: False positive rate
FTSE: FTSE 100 index
GARCH: Generalized autoregressive conditional heteroscedasticity
GBDT: Gradient boosted decision tree
HSI: Hang Seng index
LSTM: Long short-term memory
MFNN: Multifilter neural network
N225: Nikkei 225 index
NASDAQ: NASDAQ composite index
NYSE: New York stock exchange index
PPMCC: Pearson product-moment correlation coefficient
PSO: Particle swarm optimization
RBFNN: Radial basis function neural network
RNN: Recurrent neural network
ROC: Receiver operating characteristic
RUSSELL: RUSSELL 2000 index
S&P 500: S&P 500 index
SSE: SSE composite index
SVM: Support vector machine
SVR: Support vector regression
TPR: True positive rate
Data Availability
The data used to support the findings of this study can be downloaded from Yahoo Finance (https://finance.yahoo.com/).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
- J. Sun, K. Xiao, C. Liu, W. Zhou, and H. Xiong, “Exploiting intra-day patterns for market shock prediction: a machine learning approach,” Expert Systems With Applications, vol. 127, pp. 272–281, 2019.
- Z. Lin, “Modelling and forecasting the stock market volatility of SSE composite index using GARCH models,” Future Generation Computer Systems, vol. 79, pp. 960–972, 2018.
- Y. Shynkevich, T. M. McGinnity, S. A. Coleman, A. Belatreche, and Y. Li, “Forecasting price movements using technical indicators: investigating the impact of varying input window length,” Neurocomputing, vol. 264, pp. 71–88, 2017.
- J. Patel, S. Shah, P. Thakkar, and K. Kotecha, “Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques,” Expert Systems with Applications, vol. 42, no. 1, pp. 259–268, 2015.
- O. B. Sezer and A. M. Ozbayoglu, “Algorithmic financial trading with deep convolutional neural networks: time series to image conversion approach,” Applied Soft Computing, vol. 70, pp. 525–538, 2018.
- E. Hoseinzade and S. Haratizadeh, “CNNpred: CNN-based stock market prediction using a diverse set of variables,” Expert Systems with Applications, vol. 129, pp. 273–285, 2019.
- Y. Chen, W. Lin, and J. Z. Wang, “A dual-attention-based stock price trend prediction model with dual features,” IEEE Access, vol. 7, pp. 148047–148058, 2019.
- L. Chen, Z. Qiao, M. Wang, C. Wang, R. Du, and H. E. Stanley, “Which artificial intelligence algorithm better predicts the Chinese stock market?” IEEE Access, vol. 6, pp. 48625–48633, 2018.
- P. Yu and X. Yan, “Stock price prediction based on deep neural networks,” Neural Computing and Applications, vol. 132, pp. 1–20, 2019.
- H. M, G. E. A., V. K. Menon, and S.K. P., “NSE stock market prediction using deep-learning models,” Procedia Computer Science, vol. 132, pp. 1351–1362, 2018.
- S. Borovkova and I. Tsiamas, “An ensemble of LSTM neural networks for high-frequency stock market classification,” Journal of Forecasting, vol. 38, no. 6, pp. 600–619, 2019.
- J. Patel, S. Shah, P. Thakkar, and K. Kotecha, “Predicting stock market index using fusion of machine learning techniques,” Expert Systems with Applications, vol. 42, no. 4, pp. 2162–2172, 2015.
- J. Cao and J. Wang, “Stock price forecasting model based on modified convolution neural network and financial time series analysis,” International Journal of Communication Systems, vol. 32, no. 12, p. e3987, 2019.
- S. Jain, R. Gupta, and A. A. Moghe, “Stock price prediction on daily stock data using deep neural networks,” in Proceedings of the 2018 International Conference on Advanced Computation and Telecommunication (ICACAT), pp. 1–13, IEEE, New York, NY, USA, 2018.
- J. Eapen, D. Bein, and A. Verma, “Novel deep learning model with CNN and bi-directional LSTM for improved stock market index prediction,” in Proceedings of the 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), pp. 264–270, IEEE, New York, NY, USA, 2019.
- X. Zhan, Y. Li, R. Li, X. Gu, O. Habimana, and H. Wang, “Stock price prediction using time convolution long short-term memory network,” in Proceedings of the International Conference on Knowledge Science, Engineering and Management, pp. 461–468, Springer, Berlin, Germany, 2018.
- C. Li, X. Zhang, M. Qaosar, S. Ahmed, K. M. R. Alam, and Y. Morimoto, “Multi-factor based stock price prediction using hybrid neural networks with attention mechanism,” in Proceedings of the 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 961–966, IEEE, Berlin, Germany, 2019.
- X. Zhou, Z. Pan, G. Hu, S. Tang, and C. Zhao, “Stock market prediction on high-frequency data using generative adversarial nets,” Mathematical Problems in Engineering, vol. 34, 2018.
- J. Liu, Y. Chen, K. Liu, and J. Zhao, “Attention-based event relevance model for stock price movement prediction,” in Proceedings of the China Conference on Knowledge Graph and Semantic Computing, pp. 37–49, Springer, Berlin, Germany, 2017.
- P. Oncharoen and P. Vateekul, “Deep learning using risk-reward function for stock market prediction,” in Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence, pp. 556–561, Berlin, Germany, 2018.
- M. Thenmozhi and G. Sarath Chand, “Forecasting stock returns based on information transmission across global markets using support vector machines,” Neural Computing and Applications, vol. 27, no. 4, pp. 805–824, 2016.
- L. D. Persio and O. Honchar, “Artificial neural networks architectures for stock price prediction: comparisons and applications,” International Journal of Circuits, Systems and Signal Processing, vol. 10, pp. 403–413, 2016.
- S.-J. Guo, F.-C. Hsu, and C.-C. Hung, “Deep candlestick predictor: a framework toward forecasting the price movement from candlestick charts,” in Proceedings of the 2018 9th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 219–226, IEEE, Berlin, Germany, 2018.
- K. Jearanaitanakij and B. Passaya, “Predicting short trend of stocks by using convolutional neural network and candlestick patterns,” in Proceedings of the 2019 4th International Conference on Information Technology (InCIT), pp. 159–162, IEEE, Berlin, Germany, 2019.
- H. S. Sim, H. I. Kim, and J. J. Ahn, “Is deep learning for image recognition applicable to stock market prediction?” Complexity, vol. 10, 2019.
- M. Göçken, M. Özçalıcı, A. Boru, and A. T. Dosdoğru, “Integrating metaheuristics and artificial neural networks for improved stock price prediction,” Expert Systems with Applications, vol. 44, pp. 320–331, 2016.
- Y. Guo, S. Han, C. Shen, Y. Li, X. Yin, and Y. Bai, “An adaptive SVR for high-frequency stock price forecasting,” IEEE Access, vol. 6, pp. 11397–11404, 2018.
- R. Singh and S. Srivastava, “Stock prediction using deep learning,” Multimedia Tools and Applications, vol. 76, no. 18, pp. 18569–18584, 2017.
- Q. Wang, W. Xu, and H. Zheng, “Combining the wisdom of crowds and technical analysis for financial market prediction using deep random subspace ensembles,” Neurocomputing, vol. 299, pp. 51–61, 2018.
- F. Zhou, Q. Zhang, D. Sornette, and L. Jiang, “Cascading logistic regression onto gradient boosted decision trees for forecasting and trading stock indices,” Applied Soft Computing, vol. 84, p. 105747, 2019.
- W. Long, Z. Lu, and L. Cui, “Deep learning-based feature engineering for stock price movement prediction,” Knowledge-Based Systems, vol. 164, pp. 163–173, 2019.
- H. Gunduz, Y. Yaslan, and Z. Cataltepe, “Intraday prediction of Borsa Istanbul using convolutional neural networks and feature correlations,” Knowledge-Based Systems, vol. 137, pp. 138–148, 2017.
- M.-T. Puth, M. Neuhäuser, and G. D. Ruxton, “Effective use of Pearson's product-moment correlation coefficient,” Animal Behaviour, vol. 93, pp. 183–189, 2014.
- J. Guo and X. Li, “Prediction of index trend based on LSTM model for extracting image similarity feature,” in Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science, pp. 335–340, New York, NY, USA, 2019.
- Y. LeCun and Y. Bengio, “Convolutional networks for images, speech, and time series,” The Handbook of Brain Theory and Neural Networks, vol. 3361, no. 10, p. 1995, 1995.
- H. Yang, Y. Zhu, and Q. Huang, “A multi-indicator feature selection for CNN-driven stock index prediction,” in Proceedings of the International Conference on Neural Information Processing, pp. 35–46, Springer, Berlin, Germany, 2018.
- C. Yang, S. Ren, Y. Liu, H. Cao, Q. Yuan, and G. Han, “Personalized channel recommendation deep learning from a switch sequence,” IEEE Access, vol. 6, pp. 50824–50838, 2018.
- K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, “LSTM: a search space odyssey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2222–2232, 2016.
- X. Zhong and D. Enke, “Forecasting daily stock market return using dimensionality reduction,” Expert Systems with Applications, vol. 67, pp. 126–139, 2017.
Copyright © 2020 Can Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.