Abstract

Descriptive statistics for ten samples of high-frequency data on the CSI 300 index futures show that intraday returns exhibit no significant leptokurtosis or fat tails. The Granger causality test shows that there is not only two-way Granger causality between returns and trading volume but also instantaneous causality. Therefore, A-type SVAR models are identified and estimated after imposing constraints, and all the models are found to be stable. The variance decomposition results show that more than 99.9% of the residual disturbance of returns is explained by its own lagged terms, whereas the shares of the residual disturbance of trading volume explained by its own lagged terms and by returns differ considerably across samples and span a wide range. The impulse response results show that the market responds very quickly to new information: after a shock, the market reaches a new equilibrium within about three observation periods. This indicates that the market digests new information quickly and that arbitrage trading is very difficult in this market.

1. Introduction

Returns and trading volume are two important indicators of the capital market, and research on the relation between them matters to financial markets for several reasons. First, it can provide information about market structure. Second, the nature of the relationship can suggest whether technical or fundamental analysis should be used in developing trading strategies. Third, it helps determine whether a futures contract is successful. Finally, it can help explain the informational efficiency of the futures market. The relation between returns and trading volume for equities as well as futures has been the subject of a large number of studies. Previous studies have shown that the relationship is very complex: it differs not only across investment instruments but also in the order in which the two variables influence each other. There are three main research perspectives.

The first is the sequential information arrival hypothesis [1], which posits that market information spreads outwards gradually and changes returns and trading volume as it is transmitted. As new information keeps arriving, returns and trading volume increase together. Under this hypothesis, information spreads step by step and traders receive it step by step, so there are many intermediate equilibrium points before the price reaches its final equilibrium; the market reaches that final equilibrium only when all traders have received the relevant information. This implies a bidirectional relationship between returns and trading volume: past price fluctuations can help predict future trading volume, and past trading volume can likewise help predict future price fluctuations. Researchers have examined the relationship between trading volume and returns in a variety of contexts using a range of analytical methods, and many empirical studies support a positive contemporaneous relationship between returns and trading volume [2–5].

The second is the mixture distribution hypothesis [6], which holds that the returns and trading volume of financial assets are jointly driven by a latent, unobservable information flow. When the information flow arrives in the market, both returns and trading volume change. Under its influence, shifts in market demand break the original balance and cause the price to fluctuate. Trades are executed while the price is fluctuating, so trading volume and price react to new information instantaneously. Regardless of the direction of the price movement, trading volume increases with the size of the price fluctuation. Thus, there is a positive correlation between returns and trading volume [7–10].

The third is the noise trader model [11], which suggests that noise traders, who use past information about price changes to make investment decisions, cause a positive causal relationship between trading volume and returns in either direction.

As stated above, previous studies have not reached agreement on which theory holds in the futures market. Some researchers study the dynamic interaction between returns and trading volume with the linear vector autoregressive model [5, 12]. In that model, every current variable is regressed on a number of lagged variables, so no contemporaneous relationship between the variables is allowed. However, results for the CSI 300 index futures [13] show a significant positive contemporaneous relationship between returns and trading volume. Wen et al. [14] found that, in the stock markets of China, upward price adjustments impose larger effects on stock market volatility than downward price adjustments, primarily leading to a significant reduction in stock market volatility. Bouri et al. [15] used the cross-correlation function (CCF) approach to test the linkages between the international oil market and the Chinese stock market and found strong asymmetry in the dependence between China’s stock market and the world’s crude oil market. It is therefore difficult to determine the relationship between volume and returns across different traded instruments. Moreover, because of the strong liquidity of the CSI 300 index futures, high-frequency and even ultrahigh-frequency trading has become possible, with some trading intervals as short as a millisecond. However, previous research on the CSI 300 index futures mainly used high-frequency samples at one-minute or longer intervals, which not only loses a great deal of information but also is of very limited use for ultrahigh-frequency trading. Therefore, the main purpose of this paper is to investigate the dynamic interactions between intraday returns and trading volume for the CSI 300 index futures.

The VAR model is often used to study lead-lag relationships among multiple variables. It treats every endogenous variable in the system as a function of the lagged values of all endogenous variables, thereby extending the univariate autoregressive model to a vector autoregressive model of multivariate time series. The VAR model estimates the dynamic relationships among jointly endogenous variables without imposing prior restrictions. The SVAR model additionally captures causal relationships between variables within the current period, that is, instantaneous causality, and it requires identifying constraints. When studying the relationship between returns and volume, whether instantaneous causality exists between them is therefore the key to choosing between the two models: the SVAR model is appropriate when there is causality within the current period, and the VAR model otherwise. In this paper, we find instantaneous causality between intraday high-frequency trading volume and returns, so we choose the SVAR model to study the relationship between them.

Three important issues are addressed in this paper. First, using samples with a shorter time interval than past research, is there a positive relationship between returns and trading volume in the CSI 300 index futures? Second, what is the quantitative relationship between volatility and trading volume, that is, how much do they affect each other? Third, does trading volume contain information that can be used to forecast future returns?

The rest of this paper is structured as follows. In Section 2, we give the structure of the model and set up its constraints. Section 3 preprocesses the high-frequency data and describes the intraday features of the CSI 300 stock index futures. Section 4 presents the empirical tests and analyzes the results, and Section 5 concludes the study.

2. Methodology

In order to identify the impact of trading volume and returns on the CSI 300 index futures, the decomposition method suggested by Quah and Vahey [16] is applied to a two-variable VAR system. The VAR(p) model with K endogenous variables $y_t = (y_{1t}, \ldots, y_{Kt})^{\top}$ can be expressed as follows:

$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + u_t, \qquad (1)$$

where $A_i$ are $(K \times K)$ coefficient matrices for $i = 1, \ldots, p$ and $u_t$ is a K-dimensional process with $E(u_t) = 0$ and time-invariant positive definite covariance matrix $E(u_t u_t^{\top}) = \Sigma_u$. A VAR can be interpreted as a reduced form of the SVAR model. The structural form of the SVAR model can be defined as [17]

$$A y_t = A_1^{*} y_{t-1} + \cdots + A_p^{*} y_{t-p} + B \varepsilon_t.$$

It is assumed that the structural errors $\varepsilon_t$ are white noise and that the coefficient matrices $A_i^{*}$ for $i = 1, \ldots, p$ are structural coefficients. Multiplying the structural form on the left by the inverse of A gives a reduced form of the same shape as formula (1):

$$y_t = A^{-1} A_1^{*} y_{t-1} + \cdots + A^{-1} A_p^{*} y_{t-p} + A^{-1} B \varepsilon_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + u_t,$$

where $u_t = A^{-1} B \varepsilon_t$ and its variance-covariance matrix is $\Sigma_u = A^{-1} B B^{\top} (A^{-1})^{\top}$. SVAR models can be distinguished into three types depending on the imposed restrictions: the A model, in which the matrix B is set to $I_K$; the B model, in which the matrix A is set to $I_K$; and the AB model, in which restrictions can be placed on both matrices. For the reduced form above, if the disturbance vector follows a multivariate Gaussian distribution, the system can be estimated by OLS or maximum likelihood. After the VAR model is estimated, the original SVAR model can be recovered through this link between the SVAR and the corresponding VAR model. The structural parameters are estimated by minimizing the negative of the concentrated log-likelihood function:

$$\ln L_c(A, B) = -\frac{KT}{2} \ln(2\pi) + \frac{T}{2} \ln |A|^{2} - \frac{T}{2} \ln |B|^{2} - \frac{T}{2} \operatorname{tr}\left(A^{\top} (B^{-1})^{\top} B^{-1} A \tilde{\Sigma}_u\right),$$

where $\tilde{\Sigma}_u$ signifies an estimate of the reduced form variance-covariance matrix for the error process.
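To make the link between the structural and reduced forms concrete, the following minimal sketch builds a hypothetical two-variable A model and maps it to its reduced form. The numerical values of `A` and `A1_star` are illustrative assumptions, not estimates from this paper, and a single lag is used for brevity.

```python
import numpy as np

# Hypothetical structural matrices (illustrative values only).
A = np.array([[1.0, -0.2],
              [0.5,  1.0]])            # contemporaneous relations
B = np.eye(2)                           # A model: B is fixed to the identity
A1_star = np.array([[0.3, 0.1],
                    [0.2, 0.4]])        # structural lag-1 coefficients

A_inv = np.linalg.inv(A)
A1 = A_inv @ A1_star                    # reduced-form lag-1 coefficients
sigma_u = A_inv @ B @ B.T @ A_inv.T     # reduced-form error covariance

# Mapping back: A * Sigma_u * A' should return B B' (the identity here).
recovered = A @ sigma_u @ A.T
print(np.round(recovered, 10))
```

The last step is the relation that estimation exploits in the opposite direction: the reduced-form covariance is estimated first, and A is then chosen so that this identity holds as closely as possible.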

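Stability is an important property of a VAR or SVAR process. In practice, the stability of an empirical VAR or SVAR process can be analyzed by considering the companion form and calculating the eigenvalues of its coefficient matrix. Because the SVAR model can be rewritten as a VAR model, we only give the stability condition for the VAR model here. First, a VAR(p) process can be written as a VAR(1) process:

$$\xi_t = \mathbf{A} \xi_{t-1} + v_t,$$

with

$$\xi_t = \begin{bmatrix} y_t \\ \vdots \\ y_{t-p+1} \end{bmatrix}, \quad \mathbf{A} = \begin{bmatrix} A_1 & A_2 & \cdots & A_{p-1} & A_p \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & I & 0 \end{bmatrix}, \quad v_t = \begin{bmatrix} u_t \\ 0 \\ \vdots \\ 0 \end{bmatrix},$$

where the dimension of the stacked vectors $\xi_t$ and $v_t$ is $(Kp \times 1)$ and the dimension of the companion matrix $\mathbf{A}$ is $(Kp \times Kp)$. If the moduli of all eigenvalues of $\mathbf{A}$ are less than one, then the VAR(p) process is stable.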
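The stability check described here can be sketched in a few lines. The helper names `companion_matrix` and `is_stable` and the example coefficients are assumptions made for illustration; they are not taken from the paper or from any particular package.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack the K x K lag matrices A_1..A_p into the (Kp x Kp) companion matrix."""
    p = len(coefs)
    k = coefs[0].shape[0]
    top = np.hstack(coefs)
    if p == 1:
        return top
    # Identity blocks below the first block row shift the stacked lags down.
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coefs):
    """Return (stable?, moduli of the companion eigenvalues)."""
    moduli = np.abs(np.linalg.eigvals(companion_matrix(coefs)))
    return bool(np.all(moduli < 1.0)), moduli

# Hypothetical VAR(2) with K = 2: stable.
stable, moduli = is_stable([0.5 * np.eye(2), 0.2 * np.eye(2)])
# Hypothetical VAR(1) with an explosive root: not stable.
unstable, _ = is_stable([1.1 * np.eye(2)])
print(stable, unstable)
```

Working with the companion matrix means one eigenvalue computation covers any lag order, which is convenient when the ten samples use different lag lengths.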

In this paper, for the high-frequency intraday data of the CSI 300 stock index futures, it is assumed that the trading volume and the returns of the structural VAR model can be estimated as follows:where are the structural errors, are the regression coefficients, and are the constants. Then, we continue to transform the following formula:

Converting to the vector form:

This is the form of the SVAR model which we need to estimate.
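The equations of this section are not reproduced in the text above. As a hedged sketch only, one common way to write a two-variable SVAR for returns $r_t$ and trading volume $v_t$ is given below. Every symbol in it ($a_{12}$, $a_{21}$, $c_j$, $\beta_{ji}$, $\gamma_{ji}$, $\varepsilon_{rt}$, $\varepsilon_{vt}$) is an assumption introduced for illustration and may differ from the authors' notation.

```latex
% Each variable depends on the current value of the other variable,
% on p lags of both variables, on a constant, and on a structural error.
\begin{aligned}
r_t &= c_1 + a_{12} v_t + \sum_{i=1}^{p} \beta_{1i} r_{t-i} + \sum_{i=1}^{p} \gamma_{1i} v_{t-i} + \varepsilon_{rt}, \\
v_t &= c_2 + a_{21} r_t + \sum_{i=1}^{p} \beta_{2i} r_{t-i} + \sum_{i=1}^{p} \gamma_{2i} v_{t-i} + \varepsilon_{vt}.
\end{aligned}

% Moving the current-period terms to the left-hand side gives the vector form
% A y_t = c + \sum_i A_i^{*} y_{t-i} + \varepsilon_t with B = I:
\begin{bmatrix} 1 & -a_{12} \\ -a_{21} & 1 \end{bmatrix}
\begin{bmatrix} r_t \\ v_t \end{bmatrix}
=
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
+ \sum_{i=1}^{p}
\begin{bmatrix} \beta_{1i} & \gamma_{1i} \\ \beta_{2i} & \gamma_{2i} \end{bmatrix}
\begin{bmatrix} r_{t-i} \\ v_{t-i} \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{rt} \\ \varepsilon_{vt} \end{bmatrix}.
```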

3. Data Description

The main purpose of this paper is to study the relationship between intraday returns and trading volume of the CSI 300 index futures, so intraday high-frequency data on these futures are chosen as the object of study. There is a further reason for using intraday data: because of overnight information and weekend effects, overnight returns have a far greater impact on the SVAR model than intraday returns, which is unfavorable for estimating intraday models. In this paper, intraday high-frequency data from 2 July to 13 July 2012 for the IF1208 contract of the Shanghai and Shenzhen 300 (CSI 300) stock index futures are selected as the study sample. The sample is recorded at two observations per second and is used to describe the intraday pattern of the CSI 300 index futures. After excluding the data from the opening minute, the data from the closing minute, and the records with no change in turnover, we obtain a total of 300997 sample records. All the data are taken from the China Financial Futures Exchange. The principles for selecting the samples are as follows:

(1) The research sample needs to exclude not only the unstable period just after stock index futures were listed (October 2010) but also the influence of external information such as Chinese holidays. If the sample falls in a period when the market is digesting major external policy information, the conclusions of the study may not be representative. China has no holidays in July, so the sample is less affected by external market information.

(2) In order to make the research sample representative and the research more general, we selected two complete trading weeks, each running from Monday to Friday, as the research object.

(3) The research period is ordinary, with no special incidents, so it is reasonably representative. We could of course study another period, but we cannot study the relationship between volume and returns over all available samples because the amount of data would be too large.

(4) Furthermore, in order to eliminate the influence of overnight information, we also removed the observations in the first minute after the opening, the observations in the last minute before the closing, and the observations with no volume.

Suppose $P_t$ is the CSI 300 index futures’ price at time t; the returns can be defined as follows:
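The defining formula is not reproduced in the text above. A common choice, assumed here and not confirmed by the source, is the log return $r_t = \ln P_t - \ln P_{t-1}$, possibly multiplied by a scale factor such as 100. The function name `log_returns`, the `scale` parameter, and the prices are hypothetical.

```python
import math

def log_returns(prices, scale=1.0):
    """Return scale * (ln P_t - ln P_{t-1}) for consecutive prices (assumed definition)."""
    return [scale * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

# Made-up half-second prices, for illustration only.
prices = [2400.0, 2400.2, 2400.2, 2399.8]
r = log_returns(prices)
print(r)
```

Log returns add up over time, so the sum of the series equals the log of the last price over the first, which is a convenient sanity check on preprocessed data.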

In order to find the intraday pattern of the CSI 300 index futures, we use the R language to compute the descriptive statistics shown in Table 1.

Table 1 shows that the mean of the CSI 300 index futures’ intraday returns is 0 in all ten observational samples and that the standard error is 0.01. Skewness is present in some of the samples, but its values are small, so the distributions are only mildly asymmetric. The kurtosis of every sample is significantly less than 3, which means that there is no significant leptokurtosis or fat-tail phenomenon; this differs from previous results for high-frequency data and is closely related to the sampling frequency. The samples do not conform to the normal distribution, and a skew-normal distribution fits them better. Returns and trading volume of the observed samples are plotted in Figure 1.
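The skewness and kurtosis figures discussed here can be computed from central moments, as in the sketch below. It uses plain population moments and reports raw kurtosis, for which the normal distribution gives 3. Whether Table 1 was produced with this exact convention is an assumption, since statistical packages differ in bias corrections and some report excess kurtosis instead. The function name and data are hypothetical.

```python
def sample_moments(x):
    """Return (mean, standard deviation, skewness, kurtosis) from population moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return mean, m2 ** 0.5, m3 / m2 ** 1.5, m4 / m2 ** 2

# Made-up symmetric data, for illustration only.
returns = [-1.0, 0.0, 0.0, 1.0]
mean, sd, skew, kurt = sample_moments(returns)
print(mean, skew, kurt)  # 0.0 0.0 2.0
```

A kurtosis below 3, as in this toy example, indicates thinner tails than the normal distribution, which is the pattern reported for the ten samples.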

Figure 1 shows that the returns and trading volume of the CSI 300 index futures have no obvious intraday pattern but are clearly correlated. Generally, there are two or more trading volume peaks in each trading day, and at some points the trading volume exceeds 200 lots within a single observation interval. This shows that the market, whose fluctuations are significantly clustered, has not only strong liquidity but also strong price impact. In comparison, the returns series is stable, and its variation is strongly correlated with changes in trading volume.

In addition to the clustering of returns, we can clearly see that, at moments when the volume is much higher than average, returns change dramatically. This phenomenon reflects very active intraday trading in the stock index futures market and fierce competition between buyers and sellers over the market price. As a result, a large number of limit orders accumulate at the best bid and best ask, forming a price protection barrier. Breaking through such a barrier requires greater market energy; that is, it must be accompanied by larger volume.

4. Empirical Test

The vector autoregressive (VAR) model is regularly used to forecast systems of time series and to analyze their dynamic effects. It imposes no prior constraints and requires no assumptions about which variables are exogenous or endogenous; all variables in the model are treated as endogenous. The SVAR or VAR model can be established by the process shown in Figure 2.

4.1. Establishment of the VAR Model

A stationarity condition should be satisfied before we establish a VAR or SVAR model; otherwise, the OLS regression is prone to be a spurious regression. Therefore, the first task is to check the stationarity of the time series. In general, the unit root test is used for this purpose. However, in this paper, each of the ten samples contains about 30,000 observations, and we take such large samples to be weakly stationary, so we do not test stationarity separately. The VAR model can be established directly.

After making sure that the variables of the model are stationary, we need to determine the lag order of the model. The lag order of the VAR model is usually determined by the Akaike information criterion (AIC) and the Schwarz criterion (SC), which balance the complexity of the model against its goodness of fit:

$$\mathrm{AIC} = -\frac{2l}{T} + \frac{2n}{T}, \qquad \mathrm{SC} = -\frac{2l}{T} + \frac{n \ln T}{T},$$

where n is the number of estimated parameters, T is the sample size, and $l$ is the value of the log-likelihood function. When the AIC and SC do not select the same lag order, we can choose one of them as the criterion. The results of the AIC and SC for the ten observation samples are shown in Table 2.
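As an illustration of lag selection, the sketch below fits VAR(p) models by OLS on simulated data and evaluates both criteria in the per-observation form $-2l/T + \text{penalty}$. The simulated series, the function name `var_information_criteria`, and the candidate lag range are assumptions made for this example; the paper's own computations were done in R.

```python
import numpy as np

def var_information_criteria(y, p):
    """Fit a VAR(p) with intercept by OLS and return (AIC, SC)."""
    t_full, k = y.shape
    # Each regressor row: a constant followed by p lags of every variable.
    x = np.array([np.concatenate([[1.0]] + [y[t - i] for i in range(1, p + 1)])
                  for t in range(p, t_full)])
    y_dep = y[p:]
    t_eff = y_dep.shape[0]
    beta, *_ = np.linalg.lstsq(x, y_dep, rcond=None)
    resid = y_dep - x @ beta
    sigma = resid.T @ resid / t_eff
    # Gaussian log-likelihood evaluated at the OLS estimates.
    loglik = -0.5 * t_eff * (k * (1.0 + np.log(2.0 * np.pi))
                             + np.log(np.linalg.det(sigma)))
    n = k * (k * p + 1)                      # number of estimated coefficients
    aic = -2.0 * loglik / t_eff + 2.0 * n / t_eff
    sc = -2.0 * loglik / t_eff + n * np.log(t_eff) / t_eff
    return aic, sc

# Simulated bivariate VAR(1), for illustration only.
rng = np.random.default_rng(0)
a1 = np.array([[0.5, 0.1], [0.2, 0.3]])
y = np.zeros((500, 2))
for t in range(1, 500):
    y[t] = a1 @ y[t - 1] + rng.standard_normal(2)

criteria = {p: var_information_criteria(y, p) for p in range(1, 5)}
best_sc = min(criteria, key=lambda p: criteria[p][1])
print(criteria, best_sc)
```

Because the SC penalty grows with $\ln T$, it exceeds the AIC penalty for any realistic sample size and therefore tends to pick the shorter lag. Note that this simple version lets the effective sample size vary slightly with p.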

As expected, the SC and AIC results are not always consistent, but the gap between them is very small. Here, we follow the SC because a lower lag order reduces the complexity of the model and the number of coefficients to be estimated. From the results in Table 2, we establish ten initial VAR models, VAR(7), VAR(6), VAR(8), VAR(9), VAR(8), VAR(9), VAR(8), VAR(9), VAR(7), and VAR(8), for the samples D02–D13, which are then used for the Granger causality test.

4.2. Granger Causality Test

In order to find out the dynamic relationship between trading volume and returns, the Granger causality test is needed. Table 3 shows the results of the Granger causality test, computed with the R package vars, for the CSI 300 index futures’ intraday trading volume and returns.

The null hypothesis of the Granger causality test is that there is no causality between the variables being tested, and the alternative hypothesis is that there is causality. In Table 3, R or A indicates rejection or acceptance of the null hypothesis at the 0.1 significance level.

In our study, both the Granger causality test and the instantaneous causality test are applied to returns and trading volume. The results show that the relationship between returns and trading volume is complex: seven of the ten samples support that trading volume Granger-causes returns, nine of the ten samples support that returns Granger-cause trading volume, and eight of the ten samples support instantaneous causality between returns and trading volume. In general, the results show that not only past trading volume and returns but also current trading volume affects the forecast of current returns. This accords with practice: when new information flows into the market, the increase in trading volume does cause fluctuations in price, and vice versa. The SVAR model is therefore more appropriate than the VAR model in this situation.
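The idea behind the Granger test used in this section can be sketched as a comparison of two regressions: one that predicts a variable from its own lags only, and one that adds lags of the other variable. The code below computes the resulting F statistic on simulated data. It is a textbook single-equation version written for illustration; the test implemented in the vars package may differ in detail, and the function name and data are assumptions.

```python
import numpy as np

def granger_f_stat(y, x, p):
    """F statistic for H0: the p lags of x do not help predict y."""
    t_full = len(y)
    y_dep = y[p:]
    const = np.ones((t_full - p, 1))
    lags_y = np.column_stack([y[p - i:t_full - i] for i in range(1, p + 1)])
    lags_x = np.column_stack([x[p - i:t_full - i] for i in range(1, p + 1)])

    def rss(design):
        beta, *_ = np.linalg.lstsq(design, y_dep, rcond=None)
        err = y_dep - design @ beta
        return float(err @ err)

    rss_restricted = rss(np.hstack([const, lags_y]))
    rss_unrestricted = rss(np.hstack([const, lags_y, lags_x]))
    dof = (t_full - p) - (2 * p + 1)
    return ((rss_restricted - rss_unrestricted) / p) / (rss_unrestricted / dof)

# Simulated data in which x leads y by one period, but not the reverse.
rng = np.random.default_rng(1)
n_obs = 1000
x = rng.standard_normal(n_obs)
y = np.zeros(n_obs)
for t in range(1, n_obs):
    y[t] = 0.5 * x[t - 1] + rng.standard_normal()

f_x_to_y = granger_f_stat(y, x, p=2)
f_y_to_x = granger_f_stat(x, y, p=2)
print(f_x_to_y, f_y_to_x)
```

A large F statistic leads to rejecting the null of no Granger causality. In this simulation the statistic for x leading y is far larger than the one for the reverse direction, as the data-generating process implies.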

4.3. Establishment of the SVAR Model

The explanatory variables of the VAR model are lagged terms only, so the model contains no contemporaneous relationship between variables. Here, however, current trading volume is an important variable for explaining returns and must be included in the model, so the SVAR model is needed. In Section 2 of this article, we identify the SVAR model as follows:

In the model, trading volume is interpreted as a shock to returns. According to SVAR theory, identifying a SVAR model of n variables requires imposing $n(n-1)/2$ restrictions, so for this two-variable model we need to impose only one restriction. As mentioned above, $\varepsilon_t$ is a white noise vector of structural disturbances, which represents the mutual impact of trading volume and returns. In fact, the structural disturbances are not strongly correlated with each other. Therefore, we can apply a constraint to their variance-covariance matrix by setting it to the identity matrix. After setting up the restriction, the next step is to estimate the parameters of the SVAR model. The VAR model can be regarded as a reduced form of the SVAR model, so we can estimate the parameters of the SVAR model by first estimating the parameters of the VAR model. Table 4 gives the regression results of the VAR model in which returns are the explained variable. In the regression results, we found that (1) the more recent the lag, the more significant the estimated coefficient, (2) the constant term is generally not significant, which is related to the sample mean of zero, and (3) past trading volume contributes little to the forecast of returns, as reflected in estimated coefficients that are insignificant or very small in absolute value. Next, we need to estimate the coefficients of matrix A. Using the coefficients of matrix A and the estimated results of the VAR model, we can calculate the coefficients of the SVAR model. A scoring algorithm is used to estimate the structural parameters. The estimated parameters of the A matrix are shown in Table 5.
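The restriction count quoted in this section follows from a standard counting argument, sketched here under the assumption of an A model with $B = I_n$ and structural errors normalized to unit variance.

```latex
% Reduced-form covariance implied by the A model:
\Sigma_u = A^{-1} (A^{-1})^{\top}

% \Sigma_u is symmetric, so it supplies n(n+1)/2 distinct equations,
% while A contains n^2 unknown elements. The shortfall is
n^2 - \frac{n(n+1)}{2} = \frac{n(n-1)}{2},

% which for the two-variable model (n = 2) gives
\frac{2 \cdot 1}{2} = 1 \ \text{restriction.}
```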

The estimated coefficient matrices in Table 5 show that the estimated coefficient A tends to be a small negative value and the estimated coefficient B tends to be a large positive value. There are two reasons for this. First, the difference in the order of magnitude between trading volume and returns explains part of it. Second, when returns are the explained variable, the impact of current trading volume is very small and returns are mainly explained by their own lagged terms, whereas when trading volume is the explained variable, the impact of current returns is large. Once again, this result shows that the stock index futures market has strong liquidity: a small returns shock can cause a large change in trading volume, while a large trading volume shock causes only a small change in returns. This is broadly consistent with the sequential information arrival hypothesis, which is reflected in the existence of two-way causality between returns and trading volume.

Before employing impulse response functions (IRFs) and forecasting error variance decomposition (FEVD), the stability of the model is tested first. This means that the model generates stationary time series with time invariant means, variances, and covariance structure, given sufficient starting values. By evaluating the characteristic polynomial, we can test whether the models are stable or not.

The test results in Figure 3 show that the moduli of all eigenvalues of the companion coefficient matrix are less than one, so the models are stable. We can therefore use the models to identify shocks and trace out their effects by employing impulse response functions and forecasting error variance decomposition.

4.4. Impulse Responses

The Granger causality test only reflects the causal relationship between variables, but it cannot reflect the degree of interaction between variables of the model system. Therefore, impulse response analysis is needed to understand the impact of trading volume on returns.

The impulse response function measures the influence of a one-standard-deviation shock to one variable on the current and future values of the variables in the system; it describes the dynamic interaction between variables. Returns are one of the most important indicators in futures trading and directly determine whether a trading strategy succeeds. Therefore, we focus on the impact of trading volume on returns and on the dynamic characteristics between the two; that is, we calculate the response of returns to a one-standard-deviation shock to trading volume. The impulse responses to a trading volume shock are shown in Figure 4.

The horizontal axis in Figure 4 represents the number of periods after the shock, set from one to ten, the vertical axis represents the response of the variable, and the solid line shows the impulse response function.

As shown in Figure 4, in all ten observation samples the impulse responses to a one-standard-deviation shock to variable V tend to 0 after three periods. This shows that the system is very stable: after a shock from market innovations, the price returns to equilibrium within three observation periods. For a positive shock, the largest positive response occurs in the first observation period. The response in the second observation period is negative and roughly half the size of the first. The response in the third observation period is positive and about half the size of the second. The response thus decays exponentially and eventually tends to 0.
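The alternating, roughly halving pattern described here is what a negative own-lag coefficient of about -0.5 on returns would produce. The sketch below computes structural impulse responses for a hypothetical VAR(1) chosen to mimic that pattern. The coefficient matrix, the impact matrix, and the function name are all assumptions for illustration and are not the estimates behind Figure 4.

```python
import numpy as np

def impulse_responses(coefs, impact, horizon):
    """Structural impulse responses Theta_h = Phi_h @ impact for h = 0..horizon."""
    k = coefs[0].shape[0]
    phi = [np.eye(k)]
    for h in range(1, horizon + 1):
        # Phi_h = sum over j of Phi_{h-j} A_j, for j up to min(h, p).
        phi.append(sum(phi[h - j] @ coefs[j - 1]
                       for j in range(1, min(h, len(coefs)) + 1)))
    return [ph @ impact for ph in phi]

# Hypothetical system ordered as (returns, volume).
a1 = np.array([[-0.5, 0.0],
               [0.0, 0.3]])
impact = np.array([[1.0, 0.2],     # plays the role of the inverse of A
                   [0.0, 1.0]])

theta = impulse_responses([a1], impact, horizon=5)
response_r_to_v = [th[0, 1] for th in theta]
print(np.round(response_r_to_v, 4))
```

In this toy system the response of returns to a volume shock changes sign every period and halves in size, so it is already small after three periods.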

The impact of trading volume generally reflects the impact costs of market transactions. The results indicate that the liquidity of the market is very strong and that it has a strong ability to absorb shocks, which means the market is close to an efficient market. This suggests that there are a large number of arbitrageurs, so arbitrage trading in the market is becoming more and more difficult. The room for arbitrage is shrinking further, and transaction speed is becoming the most important factor in arbitrage trading in this market.

4.5. Forecasting Error Variance Decomposition (FEVD)

By analyzing the contribution of each structural shock to the variation of the endogenous variables, variance decomposition provides a way to describe the dynamics of the system. It gives information about the relative importance of each random disturbance affecting the variables in the SVAR model. The basic idea is to decompose the fluctuation of each endogenous variable, according to its origin, into components associated with the innovations of each equation. By comparing the contributions of these components, we can find out the relative importance of each innovation to the endogenous variables of the model. As the horizon lengthens, the influence of the innovations on each variable tends to stabilize, so we can quantify the relationship between variables. The variance decomposition technique is applied to the two variables so that we can calculate the relative importance of each shock. Figure 5 shows the results of the variance decomposition of the SVAR model.

As shown in Figure 5, when returns are the explained variable, more than 99.9% of the residual disturbance is explained by their own lagged terms, and the impact of trading volume is basically negligible. The system becomes stable in the first period, and this holds for all ten samples. When trading volume is the explained variable, the shares of the residual disturbance explained by its own lagged terms and by returns differ considerably across samples and span a wide range; generally, the system becomes stable by the third period. Therefore, in predicting returns, especially with ultrahigh-frequency data, the own lagged terms of returns play a decisive role and the impact of trading volume can be ignored.
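The variance shares discussed here are obtained by squaring the structural impulse responses, summing them over the forecast horizon, and normalizing each row. A minimal sketch follows, using a hypothetical VAR(1) whose coefficients and impact matrix are assumptions for illustration, not the estimates behind Figure 5.

```python
import numpy as np

def fevd(theta, horizon):
    """Share of the forecast error variance of variable i (row) due to shock j (column)."""
    squared = sum(th ** 2 for th in theta[:horizon])   # elementwise squares, summed over h
    return squared / squared.sum(axis=1, keepdims=True)

# Hypothetical system ordered as (returns, volume).
a1 = np.array([[-0.5, 0.0],
               [0.0, 0.3]])
impact = np.array([[1.0, 0.2],
                   [0.5, 1.0]])
theta = [np.linalg.matrix_power(a1, h) @ impact for h in range(10)]

shares_h1 = fevd(theta, 1)
shares_h10 = fevd(theta, 10)
print(np.round(shares_h1, 4))
print(np.round(shares_h10, 4))
```

Each row sums to one, so the entries can be read directly as percentages. In the paper's results, the returns row is dominated by its own shock, while the volume row varies widely across samples.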

5. Conclusions

The purpose of this paper is to find out the dynamic relationship between intraday returns and trading volume. High-frequency data are used to describe the intraday characteristics of the CSI 300 stock index futures, and the results show no significant leptokurtosis or fat-tail phenomenon, which differs from previous results for high-frequency data. We then established the initial VAR models.

The Granger causality test shows that there is not only two-way Granger causality between returns and trading volume but also instantaneous causality. This mode of information diffusion is similar to the sequential information arrival hypothesis, which means that there is a contemporaneous relationship between trading volume and returns. However, the established VAR model does not contain current-period variables. Therefore, an SVAR model with current-period variables is needed to better explain the relationship between returns and trading volume. According to the characteristics of the high-frequency data, we set up the restriction for the SVAR model and obtained the A-type SVAR model. After estimating the parameters of the coefficient matrix, we found that the liquidity of the stock index futures market is very strong: a small returns shock can cause a large change in trading volume, while a large trading volume shock causes only a small change in returns.

The stability of the models is tested through the eigenvalues of the companion coefficient matrix, and the results show that the models are stable, so the impulse response functions and forecasting error variance decomposition can be employed. The variance decomposition results show that, when returns are the explained variable, more than 99.9% of the residual disturbance is explained by their own lagged terms; when volume is the explained variable, the shares explained by its own lagged terms and by returns differ considerably and span a wide range. The impulse response results show that the market responds very quickly to new information: after a shock, the market reaches a new equilibrium within about three observation periods. This indicates that the market digests new information quickly and that arbitrage trading is very difficult in this market. Therefore, in forecasting the intraday high-frequency returns of the stock index futures, we only need to use past returns as explanatory variables.

In this dynamic study of intraday returns and volume, the influence of external factors on these variables is very low, so we only consider the influence of different information in the market. As Bouri and Qian et al. [15] mentioned, future studies of asset prices may take into account the extent to which a variety of external factors affect price volatility.

Data Availability

The CSV data used to support the findings of this study are included within the supplementary information file.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the 13th Five-Year Plan of Philosophy and Social Sciences in Guangdong Province-Discipline Co-Construction (GD18XYJ36).

Supplementary Materials

The study sample contains two observations per second for ten trading days (2 July to 13 July 2012). Date: date of the transaction; code: contract code; tip: time label; num: transaction serial number; price: stock index futures price; r: rate of return; volume: cumulative turnover; v: trading volume during the observation period; hold: open interest; h: change in position; bid_ask: bid-ask spread.