Abstract

Integrating an autoencoder (AE), long short-term memory (LSTM), and a convolutional neural network (CNN), we propose an interpretable deep learning architecture for Granger causality inference, named deep learning-based Granger causality inference (DLI). The two contributions of the proposed DLI are to reveal the Granger causality between the bitcoin price and the S&P index and to forecast both series with higher accuracy. Experimental results demonstrate a bidirectional but asymmetric Granger causality between the bitcoin price and the S&P index. The DLI achieves superior prediction accuracy by integrating variables that have causal relationships with the target variable into the prediction process.

1. Introduction

A time series is a sequence of observations of a variable arranged in chronological order; in the absence of exogenous variables, it reflects how a phenomenon itself changes over time. Generally speaking, time series analysis focuses more on predicting the future from existing historical data [1–3] than on interpreting the causalities that may exist among variables. Exploring the causalities among financial time series can be important for portfolio management [4]. As a decentralized cryptocurrency, bitcoin has attracted more and more investors and traders in recent years owing to its high investment returns [5]. From January 1, 2014, to December 31, 2018, the bitcoin price rose from $771 to $3742 (USD), which made bitcoin a promising investment cryptocurrency. Interestingly, Yermack [6] asserted that bitcoin was not a currency, as it performs poorly as a unit of account and as a store of value. Corbet et al. [7] supported Yermack's conclusion that bitcoin was a speculative asset rather than a currency. Moreover, Dyhrberg [8] showed that bitcoin can serve as a hedge against the stock market and is a helpful tool for both portfolio diversification and risk management. Therefore, it is of great importance for investors and traders to forecast the bitcoin price and investigate the causes of its volatility.

In most circumstances, causality inference among financial time series is based on the Granger causality [9]. Granger causality is a predictive notion of causality: a time series x is said to Granger-cause y if past values of x provide statistically significant information about future values of y, i.e., predictions of y based on the prior values of both y and x are better than predictions of y based only on its own prior values. Traditional approaches for Granger causality inference mainly include vector autoregression (VAR) [10], the vector error correction model (VECM) [11], and their variants [12, 13]. VAR and VECM are valid mostly when the input data are stationary. However, unit root tests such as ADF [14] show that most economic time series are not stationary, although they may become stationary after preprocessing. Hence, traditional Granger causality inference for nonstationary time series needs to preprocess the input to obtain a stationary sequence, which may introduce pretesting distortions. The Wald test [15] has attracted much attention because it involves no pretesting distortion and is based on a standard asymptotic distribution, irrespective of the unit roots and the cointegrating properties of the data [16]. However, the Wald test may be inefficient since it intentionally overfits the VAR. Moreover, the aforementioned approaches are not good at capturing complex representations of the input data.
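The definition above can be illustrated with a small least-squares comparison. The sketch below is neither the DLI nor a formal Granger test (which would rely on an F or Wald statistic); it only contrasts the in-sample error of a model that uses y's own lag with one that also uses x's lag, on synthetic data. The function name `lagged_rmse`, the lag of 1, and the simulated coefficients are assumptions made for illustration.

```python
import numpy as np

def lagged_rmse(target, predictors, lag=1):
    """Regress target[t] on the `lag` previous values of each predictor
    series (plus an intercept) and return the in-sample RMSE."""
    rows = []
    for t in range(lag, len(target)):
        row = [1.0]                          # intercept term
        for series in predictors:
            row.extend(series[t - lag:t])    # lagged values of this series
        rows.append(row)
    X = np.array(rows)
    y = np.array(target[lag:])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))

# Synthetic data (assumption): y depends on its own past and on x's past.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

restricted = lagged_rmse(y, [y])        # y's own past only
unrestricted = lagged_rmse(y, [y, x])   # y's past plus x's past
print(restricted, unrestricted)
```

Because the two models are nested, the in-sample error of the unrestricted model can never exceed that of the restricted one; evidence of Granger causality requires the reduction to be statistically significant or, as in this paper, to show up in out-of-sample prediction error.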

Deep learning-based architectures can learn more abstract representations from the input data without requiring data stationarity. Chong et al. [17] proposed a deep learning-based stock market forecasting model to examine the ability of three unsupervised feature extraction methods to predict future market behaviour. Based on a deep learning model, Chen et al. [18] built a computer-aided diagnosis and decision-making system for medical data from MR images. Long et al. [19] proposed a multifilter neural network that integrated convolutional and recurrent neurons for feature extraction on economic time series samples and price volatility prediction. These deep learning-based forecasting models achieved promising forecasting performance. Lahmiri and Bekiros [20] employed LSTM for cryptocurrency prediction and showed that deep learning was highly efficient in predicting the inherent chaotic dynamics of cryptocurrency markets. These deep learning-based models tend to perform better than traditional econometric methods, which suggests that deep learning-based architectures are more effective in dealing with financial time series data.

In this paper, we construct a deep learning-based Granger causality inference architecture, named DLI, which consists of AE, CNN, and LSTM. The two contributions of our work are exploring the Granger causality between the bitcoin price and the S&P index and predicting both series with higher accuracy.

The remainder of this paper is organized as follows. The datasets we employed are presented in Section 2. The proposed DLI is described in Section 3. Experiments and results are presented in Section 4. Our contributions and future work are summarized in Section 5.

2. Data

We take the bitcoin price and the S&P index as experimental datasets. Both can be downloaded from the Yahoo website, and their prices are quoted in US dollars. Without loss of generality, we take the daily closing price as the day's price. The descriptive statistics for the bitcoin price and the S&P index covering the period from January 1, 2014, to December 31, 2018, can be found in Table 1. The samples of the bitcoin price and the S&P index contain 1,826 and 1,258 data points, respectively. Since stock markets are usually closed for holidays or other reasons, the S&P index has no observations on such days, and we employed AE to remove the data noise caused by the resulting default values.
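The 1,826 bitcoin observations match the number of calendar days in the sample period, whereas the S&P index has only 1,258 closes. The following is a minimal sketch of aligning a market series to the full daily calendar, using only the Python standard library. The `sp_close` dictionary and its constant price are hypothetical placeholders, and weekends stand in for all market closures (actual holidays are not modeled).

```python
from datetime import date, timedelta

# Every calendar day in the sample period described above.
start, end = date(2014, 1, 1), date(2018, 12, 31)
calendar = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Hypothetical S&P closes keyed by date: weekdays only, with a made-up price.
sp_close = {d: 1800.0 for d in calendar if d.weekday() < 5}

# Align to the daily calendar; None marks days without a market close.
sp_aligned = [sp_close.get(d) for d in calendar]
n_missing = sum(value is None for value in sp_aligned)

print(len(calendar), n_missing)
```

How the gaps are encoded before the AE stage is not specified in the text; the `None` markers here only make the missing days explicit.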

To obtain a desirable model, we divide the experimental data into three parts: a 70% training dataset, a 10% validation dataset, and a 20% test dataset. The training dataset is used to fit the model, the validation dataset to further determine the parameters of the whole network, and the test dataset to assess the generalization ability of the model.
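The 70/10/20 division can be sketched as below. The text does not state whether the split is chronological; this sketch assumes it is, which is the usual choice for time series because it keeps future observations out of the training set. The function name `chronological_split` is hypothetical.

```python
def chronological_split(series, train_frac=0.7, val_frac=0.1):
    """Split a sequence into train/validation/test parts in time order.
    The test part receives whatever remains after the first two."""
    n = len(series)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Placeholder data: indices 0..1825 stand in for 1,826 daily prices.
prices = list(range(1826))
train, val, test = chronological_split(prices)
print(len(train), len(val), len(test))
```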

3. Model Development

An autoencoder is a simple but powerful unsupervised deep learning model. A typical AE consists of three layers: an input layer, a hidden layer, and an output layer, as shown in Figure 1. Its output layer is an approximate reconstruction of the input layer, which can be used for filtering and representation learning. In the proposed DLI, we adopt AE as a filter to denoise the original input, which helps improve prediction accuracy.

Long short-term memory is a widely used deep learning model for processing sequence data, such as time series and speech. It extends the recurrent neural network with a gate mechanism, which yields better performance in long-term prediction. In the proposed DLI, we introduce the LSTM model with the aim of achieving accurate long-term prediction.
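The gate mechanism mentioned above can be made concrete with one step of a standard LSTM cell. This is a generic textbook formulation written with NumPy, not the configuration used in the DLI; the hidden size, the weight initialization, and the scaled input values are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell state
    c_t = f * c_prev + i * g       # keep part of the old memory, add new
    h_t = o * np.tanh(c_t)         # expose a gated view of the memory
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 1, 4                        # one input feature, four hidden units
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for price in [0.10, 0.20, 0.15, 0.30]:   # hypothetical scaled prices
    h, c = lstm_step(np.array([price]), h, c, W, U, b)
print(h)
```

The forget gate `f` controls how much of the previous cell state survives each step, which is what allows information to persist over long horizons.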

The convolutional neural network is also a widely used deep learning model [21], which focuses on processing time series data (1D CNN), images (2D CNN), and video or medical images (3D CNN). A CNN includes convolution layers and pooling layers, as shown in Figure 1. Through local receptive fields and shared weights, it can greatly reduce the number of parameters and speed up training. Moreover, LeCun and Bengio [22] showed that time series have a strong 1D structure: variables that are spatially or temporally nearby are highly correlated, and CNN can effectively extract the spatial features of time series. Therefore, CNN is introduced into the proposed DLI to extract spatial features and to speed up training.
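Local receptive fields, shared weights, and pooling can be illustrated with a few lines of plain Python. The kernel (a first difference), the pooling size, and the input values are hypothetical; as in most deep learning libraries, the "convolution" here is implemented as a sliding dot product without flipping the kernel.

```python
def conv1d(series, kernel):
    """Slide one shared kernel over the series (valid positions only)."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def max_pool1d(features, size):
    """Non-overlapping max pooling with window and stride equal to `size`."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

series = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]
kernel = [-1.0, 1.0]                 # hypothetical kernel: first difference

features = conv1d(series, kernel)    # each output sees only 2 neighbours
pooled = max_pool1d(features, 2)
print(features)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(pooled)    # [2.0, 4.0, 6.0]
```

The same two kernel weights are reused at every position, whereas a fully connected layer mapping these 7 inputs to 6 outputs would need 42 weights; this is the parameter saving referred to above.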

Figure 1 shows a graphic illustration of the DLI, which consists of AE, CNN, and LSTM. We assume that both the S&P index ($X$) and the bitcoin price ($Y$) are time series of length $T$, where $X = \{x_1, x_2, \ldots, x_T\}$ and $Y = \{y_1, y_2, \ldots, y_T\}$. Let $x_t$ be the S&P index at time $t$ and $y_t$ be the bitcoin price at time $t$.

The DLI consists of three processing stages: denoising, feature extraction, and forecasting. As described in Section 2, since stock markets are usually closed for holidays or other reasons, the S&P index time series has many default values. Therefore, at the denoising stage, AE is first used to filter the data and remove the noise in the S&P index. At the feature extraction stage, the denoised S&P index and the bitcoin price are taken as the inputs of CNN and LSTM, respectively, to extract deep representations. At the forecasting stage, we obtain the bitcoin price prediction through a fully connected layer.

The optimization of the DLI model minimizes both the reconstruction error of AE and the training error of the whole model. At the denoising stage, the output of AE is an approximate copy of the input. Therefore, we minimize the reconstruction error between the input and the output, which maintains the economic significance of the S&P index. The reconstruction error of AE is defined as follows:

$$L_{\mathrm{AE}} = \frac{1}{T} \sum_{t=1}^{T} \left\| x_t - g\left(W_2 f(W_1 x_t + b_1) + b_2\right) \right\|^2,$$

where $x_t$ is the AE input at time $t$, $T$ is the length of the series, $f$ and $g$ are activation functions, $W_1$ and $W_2$ are weights, and $b_1$ and $b_2$ are biases.
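A reconstruction error of this kind can be sketched for a single-hidden-layer AE as follows. The sigmoid encoder activation, the identity decoder activation, the mean squared error form, the layer sizes, and the random inputs are all assumptions made for illustration; the text specifies none of them. No training is performed here, only a forward pass and the error computation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_reconstruct(x, W1, b1, W2, b2):
    """Forward pass: encode to the hidden layer, then decode."""
    hidden = sigmoid(x @ W1 + b1)    # encoder activation (assumed sigmoid)
    return hidden @ W2 + b2          # decoder activation (assumed identity)

def reconstruction_error(x, x_hat):
    """Mean squared difference between the input and its reconstruction."""
    return float(np.mean((x - x_hat) ** 2))

rng = np.random.default_rng(0)
windows = rng.normal(size=(32, 8))   # 32 hypothetical windows of 8 scaled prices
W1 = rng.normal(scale=0.1, size=(8, 4))
b1 = np.zeros(4)
W2 = rng.normal(scale=0.1, size=(4, 8))
b2 = np.zeros(8)

x_hat = autoencoder_reconstruct(windows, W1, b1, W2, b2)
err = reconstruction_error(windows, x_hat)
print(err)
```

Training would adjust `W1`, `b1`, `W2`, and `b2` to reduce `err`; the hidden layer being narrower than the input is what forces the AE to discard noise rather than copy the input exactly.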

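Minimizing the training error of the whole model is necessary for obtaining a sound model. The objective function of the whole model can be described as

$$L = \frac{1}{T} \sum_{t=1}^{T} \left( y_t - \hat{y}_t \right)^2,$$

where $y_t$ is the observed target value at time $t$, $\hat{y}_t$ denotes the predicted value, and $T$ is the number of time steps.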

4. Empirical Results

In this section, we explore the Granger causality between the bitcoin price and the S&P index. To investigate whether the S&P index Granger-causes the bitcoin price, we first predict the bitcoin price without considering the S&P index, as shown in Figure 2. Then, for comparison, we take the S&P index as auxiliary information to predict the bitcoin price, as shown in Figure 3. In the same way, to investigate whether the bitcoin price Granger-causes the S&P index, we first predict the S&P index without considering the bitcoin price, as shown in Figure 4. Then, for comparison, we take the bitcoin price as auxiliary information to predict the S&P index, as shown in Figure 5. In addition, we employ the traditional ARIMA approach as a baseline to demonstrate the superiority of the proposed model. Because the task is continuous value prediction, we employ the root mean squared error (RMSE) as the forecasting performance indicator: the smaller the RMSE, the better the prediction performance. The corresponding prediction RMSEs are shown in Table 2.
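For reference, the RMSE indicator can be computed as in the following minimal sketch. The price values are hypothetical and are not taken from the experiments.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

actual = [100.0, 102.0, 101.0, 105.0]      # hypothetical observed prices
predicted = [101.0, 101.0, 103.0, 105.0]   # hypothetical forecasts
score = rmse(actual, predicted)
print(score)
```

Because the RMSE is expressed in the units of the target series, RMSE values for the bitcoin price and the S&P index are not directly comparable with each other, only across models for the same series.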

From Table 2, we can see that the bitcoin price prediction RMSE of the DLI is 92.10% and 23.32% lower than that of the ARIMA and LSTM, respectively, and the S&P index prediction RMSE of the DLI is 98.06% and 50.96% lower than that of the ARIMA and LSTM, respectively. These results demonstrate that the bitcoin price prediction improves when the S&P index is taken into account, and the S&P index prediction improves when the bitcoin price is taken into account. Moreover, the improvement is larger for the S&P index than for the bitcoin price. Therefore, we can conclude that there is a bidirectional but asymmetric Granger causality between the bitcoin price and the S&P index.
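The percentage decreases quoted above are presumably relative reductions with respect to the baseline RMSE. A minimal sketch of that calculation and of the comparison rule used in this section follows; the RMSE values are hypothetical and are not those of Table 2, and both function names are illustrative.

```python
def relative_decrease(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse

def auxiliary_series_helps(rmse_without, rmse_with):
    """True if adding the auxiliary series lowered the prediction RMSE,
    which this section reads as evidence of Granger causality."""
    return rmse_with < rmse_without

# Hypothetical RMSE values for one target series.
rmse_without_aux = 200.0   # target predicted from its own history only
rmse_with_aux = 150.0      # target predicted with the auxiliary series

drop = relative_decrease(rmse_without_aux, rmse_with_aux)
helps = auxiliary_series_helps(rmse_without_aux, rmse_with_aux)
print(drop, helps)  # 25.0 True
```

Applying the same comparison in both directions and obtaining reductions of different sizes is what the text describes as a bidirectional but asymmetric relationship.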

5. Conclusions

In this paper, we proposed an interpretable deep learning-based Granger causality inference architecture, named DLI, by integrating AE, CNN, and LSTM. As a deep learning-based model, one advantage of the DLI over traditional econometric models is that it can process big data efficiently while retaining the original economic significance of the variables after data preprocessing.

Our two contributions are exploring the Granger causality between the bitcoin price and the S&P index and predicting both series with higher accuracy. Our experiments reveal a bidirectional but asymmetric Granger causality between the bitcoin price and the S&P index. The DLI achieves superior prediction accuracy by integrating variables that have causal relationships with the target variable into the prediction process.

In future work, the proposed DLI can be extended to other economic variables to provide a reasonable reference for portfolio management, or it can be used for prediction in other scientific fields. Moreover, the DLI can be extended from two variables to multiple variables to determine causalities among multiple time series.

Data Availability

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The author acknowledges the National Natural Science Foundation of China (Grant no. 11801060) and the Innovation Program of Shandong University of Science and Technology (no. SDKDYC190114).