Stock Index Prices Prediction via Temporal Pattern Attention and Long-Short-Term Memory

Wei, Xiaolu; Lei, Binbin; Ouyang, Hongbing; Wu, Qiufeng

doi:https://doi.org/10.1155/2020/8831893

Advances in Multimedia

On this page

Abstract Introduction Results Conclusions Data Availability Disclosure Conflicts of Interest Authors’ Contributions Acknowledgments References Copyright Related Articles

Research Article | Open Access

Volume 2020 | Article ID 8831893 | https://doi.org/10.1155/2020/8831893

Stock Index Prices Prediction via Temporal Pattern Attention and Long-Short-Term Memory

Xiaolu Wei,¹Binbin Lei,²Hongbing Ouyang,³and Qiufeng Wu⁴

Academic Editor: Constantine Kotropoulos

Received07 Sept 2020

Revised18 Nov 2020

Accepted26 Nov 2020

Published11 Dec 2020

Abstract

This study attempts to predict stock index prices using multivariate time series analysis. The study’s motivation is based on the notion that datasets of stock index prices involve weak periodic patterns, long-term and short-term information, for which traditional approaches and current neural networks such as Autoregressive models and Support Vector Machine (SVM) may fail. This study applied Temporal Pattern Attention and Long-Short-Term Memory (TPA-LSTM) for prediction to overcome the issue. The results show that stock index prices prediction through the TPA-LSTM algorithm could achieve better prediction performance over traditional deep neural networks, such as recurrent neural network (RNN), convolutional neural network (CNN), and long and short-term time series network (LSTNet).

1. Introduction

Stock index prediction is one of the critical topics in financial time series forecasting. However, stock index characteristics, noisy and nonstationary phenomena, make predictions face challenges [1, 2]. “Noisy” indicates a lack of information for investors to detect past stock index behaviors. “Nonstationary” arises in a situation where the stock index may change dramatically in different periods. These characteristics lead to reduced stock index prediction results as predicted by traditional econometric models such as linear model, Autoregressive Integrated Moving Average (ARIMA), and Vector Autoregression (VAR) [3–5]. All such methods are generally based on several assumptions, such as independent and normally distributed variables that are contradicted with real market behavior.

Additionally, recent academic literature evidences that time series prediction based on neural networks have been widespread. Unlike traditional models, deep neural networks have several distinct advantages: nonparametric, self-learning, nonassumption, noise-tolerant, and ability to capture nonlinear interdependencies that are not commonly available in traditional models [6–8]. Hence, deep neural networks are usually more effective in forecasting stock index prices than traditional models [9].

Recent studies recognize deep neural networks as another promising tool in financial time series forecasting [10, 11] due to its ability to model nonlinear patterns, comprehend complicated causal relationships, and learn from colossal history dataset. In the field of financial time series, prediction through deep neural networks uses various approaches, such as long and short-term memory (LSTM) [12] and support vector machine (SVM) [13]. The related studies mainly consist of three categories. In the first category, researchers identify significant events through some templates, such as the stock market bulletin [14, 15]. In the second category, researchers seek inherent structure in the time series and predict future patterns [16, 17]. In the third category, researchers predict the numerical value of financial time series accurately through technical analysis, which investigates past stock prices and volumes [18].

However, the performance of neural network techniques in the stock index scenario is relatively less explored [19–21]. Also, the overall body of literature highlights five problems in stock index prediction through neural networks. First, existing studies mainly focus on predicting a single stock index without considering differences in various industries [22]. Second, previous studies mainly focus on the univariate time series without considering dynamics dependencies among multiple variables [23]. Third, most studies proposed that a single neural network model cannot perform well in combining nonlinear and linear structures inherent in most financial data [19, 20]. Forth, these models are mainly designed for multivariate time series with strong periodic patterns, which are not adaptable to datasets with nonperiodic or weak periodic patterns [24]. Fifth, most literature use RSE, CORR, and t-statistic to evaluate a model’s performance, which is based on strict assumption such as normal distribution.

To solve the problems mentioned above, this paper extends Temporal Pattern Attention and Long Short-Term Memory (TPA-LSTM) method [25] to the financial field and verify the effectiveness of TPA-LSTM in this sector comprehensively. This paper’s contributions mainly include the following three aspects. First, this paper considers the differences among different industries and predicts eight industry stock index prices in Hangseng Composite Index simultaneously with full consideration of interdependencies among them. Due to the macroeconomic environment and other conjunctural factors, industry stock index prices change collaboratively. Therefore, this paper attempts to predict eight industry stock indexes simultaneously, including consumer good manufacturing, consumer service, energy, industry, information technology, integrated industry, raw material, and real estate. Second, this paper models stock index prices’ complicated structure through a temporal pattern attention (TPA) mechanism proposed by the TPA-LSTM method. Datasets of stock index prices represent weak periodic patterns with short-term and long-term memory. TPA mechanism, which includes (Long Short-Term Memory) LSTM module, CNN module, and temporal attention module, is adaptable to various datasets, even multivariate time series data with nonperiodic and weak periodic patterns. LSTM component could capture one pattern with long length by itself, while the CNN module can extract short-term patterns in the time dimension and local dependencies between variables. Additionally, the temporal attention module could select variables that are helpful for forecasting and capturing temporal information. Therefore, we could discover short-term and long-term weak repeating patterns of multivariate industry stock index prices and predict prices more accurately. Third, this paper attempts to test TPA-LSTM’s robustness in the financial field through three evaluation metrics, including two single method’s performance measures and one performance difference test. These evaluation metrics are different from the performance evaluation metrics used in [25] and the traditional statistical significance test, which is based on strict assumptions.

The rest of the paper is structured as follows. Section 2 reviewed the mathematical model on stock index price prediction and detailed introduction to the TPA-LSTM method. Section 3 discussed the experimental preliminaries, which contain experimental data and selection of evaluation criteria. Section 4 presents the application steps of the TPA-LSTM method and the experimental results. Finally, section 5 highlighted the conclusion.

2. Framework

2.1. Mathematical Model on Stock Index Prices Prediction

This study attempts to predict stock index prices in different industries simultaneously. In the given datasets , where represents eight industry indexes’ prices at a time , and is the variable dimension, this paper is interested in the task of forecasting stock index prices in a rolling forecasting step. Instead of looking at a single stock index’s price , this paper predicts with dimensions simultaneously, wherein h is the desirable horizon ahead of the current timestamp and are available. Similarly, this paper forecasts in the next time step and assumes are available. Moreover, this paper uses only to predict stock index prices , where is the window size. This is based on the assumption that there is no useful information before the window , which is set to be 30 in this paper [24]. Therefore, the input matrix of this paper at timestamp T is , and the output matrix is .

2.2. Framework

Neural networks are widely used for financial time series prediction. The prediction process needs to address three difficulties. Firstly, many studies prefer univariate time series prediction rather than multivariate time series prediction, which is considered by the method applied in this paper. Secondly, the ignorance of weak periodic patterns in financial multivariate time series usually yields unsatisfactory outcomes. Finally, with the increase of data size, the time needed to predict time series increases remarkably. Therefore, an algorithm to reduce the total number of data points and time series prediction’s operation time is crucial. These three difficulties are closely related to each other in the financial time series prediction process.

Shun-Yao Shih et al. proposed TPA-LSTM [25]. Compared to other forecasting methods, TPA-LSTM is the first method to predict n-dimensional time series with a mixture of short-term and long-term weak repeating patterns, which could solve the above problems. In this section, we describe the details of the TPA-LSTM algorithm applied in this paper.

TPA-LSTM consists of a nonlinear part and a linear part. The nonlinear part is a temporal attention mechanism that includes a recurrent layer, convolutional layer, and a temporal pattern attention layer, while the linear part uses an autoregressive model (AR) to forecast the result.

2.2.1. Recurrent Layer

The first layer of TPA-LSTM is a long short-term memory network (LSTM). Given the input matrix , wherein , this recurrent layer aims at capturing long-term information; the outputs of the recurrent layer are the hidden states at each time stamp. The hidden states of recurrent layer’s units at time t can be formulated as follows:which is defined by the following equations:where , (n = 8), , is the element-wise product, and is the sigmoid function.

2.2.2. Convolutional Layer

Given previous LSTM hidden states and initial input matrix , this section extracts short-term signal patterns and interdependencies among eight variables. The output in this section can be expressed as follows:where represents the convolutional value of the ith row vector and the jth filter, denotes the k filters we have, T is the maximum length this paper is paying attention which is set to be 30.

2.2.3. Temporal Pattern Attention Layer

The traditional attention mechanism selects relevant information relative to the current time step, which may lead to a failure to ignore noisy variables and detect useful temporal patterns in multivariate time series forecasting. TPA-LSTM develops a new temporal pattern attention mechanism to alleviate this problem, which could select useful variables and capture temporal information for forecasting. Given the previous convolutional value , recurrent value H, and initial input matrix , the output of this temporal pattern attention layer is a nonlinear projection part, which is computed as follows:where is the weighted context of hidden states of the convolutional matrix, is the attention weights which can be expressed as follows:

2.2.4. Autoregressive Layer

Due to the nonlinear property of the proposed attention mechanism, the TPA-LSTM method decomposes the prediction into a nonlinear part and a linear part. The nonlinear part’s prediction is captured by a recurrent layer, a convolutional layer, and a temporal pattern attention layer. In contrast, the linear part’s prediction is solved by the Autoregressive (AR) model in this section. Given the initial input X, we can get the forecasting result of the nonlinear part through AR Layer, which is formulated as follows:

Then, the forecasting result of TPA-LSTM can be expressed as follows:

The pseudocode of TPA-LSTM is described in Algorithm 1.

	Input initial , wherein (n = 8)
	Output a mixed output of a linear part and a nonlinear part
	Initialize best_val = 10000000
	for to epoches do




	if highway >0 do


	if val_loss < best_val
	best_val = val_loss
	model save
	else
	continue
	end

3. Experimental Evaluation

3.1. Data

Industry stock indexes in Hangseng Composite Index collated from the Wind platform are examined in the experiment. Excluding the financial and utilities industries, the datasets in this article include consumer good manufacturing, consumer service, energy, industry, information technology, integrated industry, and raw material, real estate. The data set covers the period from 01/09/2006 up to 01/02/2019.

More specifically, we used the daily closing prices as the datasets of this study and illustrated in Figure 1. The short-term and long-term repeating patterns are not visible due to nonstational time series or patterns with a flexible period. Each sample data of stock index prices is split into a training set (60%), validation set (20%), test set (20%) in chronological order. The study uses a validation set to tune hyperparameters while using a test set to evaluate and compare TPA-LSTM and other models’ forecasting performance. Also, the Null values are instantly dropped due to its small scale.

3.2. Evaluation Criteria

The prediction performance of the TPA-LSTM method is compared with five methods, including Long and Short-term Time series network (LSTNet) with recurrent-skip layer, LSTNet with the attention layer [24], RNN, CNN, and Laplacian echo state network (LAESN) [26]. These five methods can analyze multivariate input and output. To verify the validity of the TPA-LSTM method proposed in this paper, we select three evaluation metrics: Root Relative Squared Error (RSE), Empirical Correlation Coefficient (CORR), and a multistep conditional predictive ability test [27]. The first two evaluation metrics are performance measures, while the last evaluation metric is a performance difference test. The Root Relative Squared Error is in a scaled version, which is designed to make comparisons more efficient and valid. In the multistep conditional predictive ability test, we just sum the RSE and RAE of the 8 output variables at each time point and reject the hypothesis that two models have equal out-of-sample performance when the test statistic is larger than the critical value. The definitions of these criteria are found in Table 1.

Here, are true values and predicted values of industry stock index prices, respectively, n is the number of out-of-sample forecasts, is forecast horizon, T is the total sample size, m is the maximum estimation of window size, is a test function, is out-of-sample forecast loss differences of two methods, , , , where is a weight function.

4. Results

The study’s objective was to predict the industry stock index prices in Hangseng Composite Index, applying the TPA-LSTM method to multivariate time series prediction of stock index prices in different industries included in the Hangseng Composite Index. Typically TPA-LSTM takes into account the time series’s linear and nonlinear structure and uses four components, including a convolutional component, recurrent component, temporal pattern attention component, and an autoregressive component, to extract the time series’s weak repetitive patterns. We repeated the algorithm until we found the lowest validation loss value. The following section discusses the experiment and results.

The study results found that an attention length of 30 produces the best possible results when predicting industry stock indexes’s prices based on TPA-LSTM. Also, the learning rate, the dropout rate, the horizon h, and the optimization algorithm are arbitrarily chosen to be 0.2, , 24, and the Adam algorithm, respectively. Python 3 language is used to construct the program. Table 2 shows the prediction result of industry stock index prices.

Table 2 illustrates the prediction performances of all methods on all test sets (20%) in all metrics, including RSE and CORR of TPA-LSTM, LSTNet-Skip, LSTNet-Attn, RNN, CNN, and LAESN. We set horizon = {3, 6, 9, 12, 15, 18, 21, 24}, respectively. The results show that the larger the horizons, the worse the prediction results in most cases. The best results for five methods and two metrics are highlighted in bold-face in Table 2. The total count of the bold-faced results is 14 for TPA-LSTM, 1 for LSTNet-Attn, 1 for RNN, 0 for LSTNet-Skip, CNN, and LAESN. Moreover, an asterisk sign indicates that the test rejects equal conditional predictive ability at the 1% level and that the TPA-LSTM method outperforms other methods through conditional predictive ability tests on average.

These results show that even though the periodic patterns of industry stock index prices are not clear, the methods proposed by TPA-LSTM still perform much better than other neural network methods on most datasets. More specifically, TPA-LSTM outperforms the neural baseline methods in most cases. When the horizon is 24, TPA-LSTM outperforms LSTNet-Skip, LSTNet-Attn, RNN, CNN, and LAESN by 28.71%, 4.44%, 1.93%, 195.26%, and 44.68% in the RSE metric, respectively. Moreover, TPA-LSTM outperforms LSTNet-Skip, LSTNet-Attn, RNN, and CNN by 4.07%, 2.15%, 1.39%, 11.75%, and 7.66% in the CORR metric, respectively. The TPA-LSTM method has robust performance in different metrics, partly due to its consideration of interdependencies among multiple variables and weak periodic patterns with complex structures.

5. Conclusions

Multivariate time series prediction with neural networks plays a significant role in reducing the risks and uncertainty of financial markets. More specifically, it provides essential support in understanding the trends and behavior of industry stock index prices. The existing literature on multivariate time series prediction through neural networks mainly concentrates on forecasting univariate time series without considering interdependencies among different variables and mostly fails to capture the linear structure and weak periodic patterns. In this study, we applied a different approach, Temporal Pattern Attention and Long Short-Term Memory (TPA-LSTM), to predict stock indexes’ prices in different industries included in the Hangseng Composite Index. TPA-LSTM method is a new prediction model that enables the prediction of multivariate time series simultaneously with a top concern of weak periodic patterns and a mixture of linear and nonlinear structures.

Further, the TPA-LSTM method comprises four components, the Temporal Pattern Attention component, the RNN component, and the Autoregressive component. The experiment results indicate that by combining the strengths of convolutional network, recurrent network, temporal attention component, and autoregressive component, the TPA-LSTM method significantly improves state-of-the-art results in multivariate time series forecasting on the dataset of industry stock index prices. With the empirical results, this paper shows that the applied TPA-LSTM method is a satisfactory alternative for multivariate time series forecasting in stock indices.

There are two possible extensions of multivariate time series prediction in industry stock index prices. In the first extension, stock index price prediction can analyze automatic adjustment of hyperparameters, including window size and horizon h, which are tuned manually in TPA-LSTM. The second extension is to investigate possible profits in different trade strategies with the TPA-LSTM method.

Data Availability

The data used to support the findings of this study have been deposited in Xiaolu Wei’s repository (https://github.com/xiaoluees/TPA-LSTM-data).

Disclosure

This manuscript is the second stage of the three-stage architecture proposed in “Discovery and Prediction of Stock Index Pattern via Three-Stage Architecture of TICC, TPA-LSTM, and Multivariate LSTM-FCNs” published on IEEE Access.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Binbin Lei was responsible for collecting data in this manuscript. The whole architecture of this study was proposed by Xiaolu Wei, Hongbing Ouyang, and Qiufeng Wu based on their earlier study.

Acknowledgments

This work was supported by the 68th batch of general funding from the China Postdoctoral Science Foundation (grant no. 2020M682378) and the second batch of postdoctoral innovation research positions in Hubei Province in 2020 (grant no. 090459).

References

Y. S. Abu-Mostafa and A. F. Atiya, “Introduction to financial forecasting,” Applied Intelligence, vol. 6, no. 3, pp. 205–213, 1996.
View at: Publisher Site | Google Scholar
C. Huang and J. Cao, “Active control strategy for synchronization and anti-synchronization of a fractional chaotic financial system,” Physica A: Statistical Mechanics and its Applications, vol. 473, pp. 262–275, 2017.
View at: Publisher Site | Google Scholar
J. Zhang, Y.-F. Teng, and W. Chen, “Support vector regression with modified firefly algorithm for stock price forecasting,” Applied Intelligence, vol. 49, no. 5, pp. 1658–1674, 2019.
View at: Publisher Site | Google Scholar
H. Takayasu, “Practical fruits of econophysics,” Proceedings of the National Academy of Science, vol. 97, no. 4, pp. 1554–1559, 2006.
View at: Google Scholar
Y. Zuo and E. Kita, “Stock price forecast using Bayesian network,” Expert Systems with Applications, vol. 39, no. 8, pp. 6729–6737, 2012.
View at: Publisher Site | Google Scholar
S. Haykin, Neural Networks: a Comprehensive Foundation, Prentice Hall PTR, Upper Saddle River, NJ, USA, 1994.
L. Di Persio and O. Honchar, “Artificial neural networks approach to the forecast of stock market price movements,” International Journal of Economics and Management Systems, vol. 1, pp. 158–162, 2016.
View at: Google Scholar
H. Yan and H. Ouyang, “Financial time series prediction based on deep learning,” Wireless Personal Communications, vol. 102, no. 2, pp. 683–700, 2018.
View at: Publisher Site | Google Scholar
J. Henríquez and W. Kristjanpoller, “A combined independent component analysis-neural network model for forecasting exchange rate variation,” Applied Soft Computing, vol. 83, Article ID 105654, 2019.
View at: Publisher Site | Google Scholar
H. Ouyang, X. Wei, and Q. Wu, “Discovery and prediction of stock index pattern via three-stage architecture of TICC, TPA-LSTM and multivariate LSTM-FCNs,” IEEE Access, vol. 8, pp. 123683–123700, 2020.
View at: Publisher Site | Google Scholar
M. Adya and F. Collopy, “How effective are neural networks at forecasting and prediction? a review and evaluation,” Journal of Forecasting, vol. 17, no. 5-6, pp. 481–495, 1998.
View at: Publisher Site | Google Scholar
H. Jia, “Investigation into the effectiveness of long short term memory networks for stock price prediction,” 2016, https://arxiv.org/abs/1603.07893.
View at: Google Scholar
A. Altan and S. Karasu, “The effect oF kernel values in support vector machine to forecasting performance oF financial time series,” The Journal of Cognitive Systems, vol. 4, no. 1, pp. 17–21, 2019.
View at: Google Scholar
Y. Gurin, T. Szymanski, and M. T. Keane, “Discovering news events that move markets,” in Proceedings of the 2017 Intelligent Systems Conference, pp. 452–461, London, UK, September 2017.
View at: Publisher Site | Google Scholar
M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed, “DeepAnT: a deep learning approach for unsupervised anomaly detection in time series,” IEEE Access, vol. 7, pp. 1991–2005, 2018.
View at: Publisher Site | Google Scholar
X.-D. Zhang, A. Li, and R. Pan, “Stock trend prediction based on a new status box method and AdaBoost probabilistic support vector machine,” Applied Soft Computing, vol. 49, pp. 385–398, 2016.
View at: Publisher Site | Google Scholar
H. Ouyang, X. Wei, and Q. Wu, “Stock index pattern discovery via toeplitz inverse covariance-based clustering,” Romanian Journal of Economic Forecasting, vol. 23, no. 2, pp. 58–72, 2020.
View at: Google Scholar
O. B. Sezer, M. Ozbayoglu, and E. Dogdu, “A deep neural-network based stock trading system based on evolutionary optimized technical analysis parameters,” Procedia Computer Science, vol. 114, pp. 473–480, 2017.
View at: Publisher Site | Google Scholar
A. H. Moghaddam, M. H. Moghaddam, and M. Esfandyari, “Stock market index prediction using artificial neural network,” Journal of Economics, Finance and Administrative Science, vol. 21, no. 41, pp. 89–93, 2016.
View at: Publisher Site | Google Scholar
H. Ouyang, X. Wei, and Q. Wu, “Agricultural commodity futures prices prediction via long-and short-term time series network,” Journal of Applied Economics, vol. 22, no. 1, pp. 468–483, 2019.
View at: Publisher Site | Google Scholar
J. Cao and J. Wang, “Exploration of stock index change prediction model based on the combination of principal component analysis and artificial neural network,” Soft Computing, vol. 24, no. 11, pp. 7851–7860, 2020.
View at: Publisher Site | Google Scholar
J. Patel, S. Shah, P. Thakkar, and K. Kotecha, “Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques,” Expert Systems with Applications, vol. 42, no. 1, pp. 259–268, 2015.
View at: Publisher Site | Google Scholar
Y. Chen and Y. Hao, “A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction,” Expert Systems with Applications, vol. 80, pp. 340–355, 2017.
View at: Publisher Site | Google Scholar
G. Lai, W. C. Chang, Y. Yang, and H. Liu, “Modeling long-and short-term temporal patterns with deep neural networks,” in Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 95–104, ACM, Santiago, Chile, August 2015.
View at: Google Scholar
S. Y. Shih, F. K. Sun, and H. Y. Lee, “Temporal pattern attention for multivariate time series forecasting,” Machine Learning, vol. 108, no. 8-9, pp. 1421–1441, 2019.
View at: Publisher Site | Google Scholar
M. Han and M. Xu, “Laplacian echo state network for multivariate time series prediction,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 1, pp. 238–244, 2018.
View at: Publisher Site | Google Scholar
R. Giacomini and H. White, “Tests of conditional predictive ability,” Econometrica, vol. 74, no. 6, pp. 1545–1578, 2006.
View at: Publisher Site | Google Scholar

Copyright

Copyright © 2020 Xiaolu Wei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

1596

Downloads

1360

Citations