#### Abstract

The main purpose of this paper is to use the multivariate GARCH (MGARCH) framework to model the volatility of a multivariate process exhibiting long-term dependence in stock returns. More precisely, the long-term dependence is examined in the first conditional moment of US stock returns through a multivariate ARFIMA process, and the time-varying feature of volatility is captured by MGARCH models. An empirical application to the returns series is carried out to illustrate the usefulness of our approach. The main results confirm the presence of the long memory property in the conditional mean of all stock returns.

#### 1. Introduction

Long memory has in recent years become a widespread phenomenon in the modelling of economic and financial time series. It can be defined in terms of the persistence of the autocorrelations, which decay at a very slow hyperbolic rate. A large number of papers demonstrate the existence of long memory in financial economics. Peters [1] and Greene and Fielitz [2] found evidence of long-term positive dependence in stock returns by applying the rescaled range statistic (the range of partial sums of deviations of a time series from its mean, rescaled by its standard deviation) proposed by Hurst [3] and modified by Lo [4]. Similar evidence for German stock returns is given by Lux [5]. However, Jegadeesh [6] challenged the notion of mean reversion for stock returns: he reported negative first-order serial correlation and significant positive serial correlation at longer lags using monthly returns for individual stocks. Kim et al. [7] also challenged the findings of mean reversion; their findings for postwar data showed persistence in returns. Christodoulou-Volos and Siokis [8] examined the presence of long-range dependence in a sample of 34 stock index returns using the procedures of Geweke and Porter-Hudak [9] and Robinson [10]; their results provided significant and robust evidence of fractional dynamics in most major and small stock markets over the sample periods. Goetzmann [11] applied tests which provided some evidence that London Stock Exchange and New York Stock Exchange prices may exhibit long-term memory. Some authors found significant and robust evidence of positive long-term persistence in the Greek stock market [12], the Brazilian stock market [13, 14], and Finnish stock market return data [15, 16]. Sadique and Silvapulle [17] examined the presence of long memory in the weekly stock returns of seven countries and found evidence of long-term dependence in four of them.
Moreover, Cajueiro and Tabak [18] found that the markets of Hong Kong, Singapore, and China exhibit long-range dependence, while Mills [19] and Zhuang et al. [20] investigated British stock returns and found little evidence of long-range dependence. Limam [21] analyzed stock index returns in 14 markets and concluded that long memory tends to be associated with thin markets, and Huang and Yang [22] applied the modified rescaled range technique to intraday data and provided evidence of the long-memory phenomenon in both the New York Stock Exchange and Nasdaq indices.

In fact, all these empirical works are based on univariate models. However, for many important questions in the empirical literature, multivariate settings are preferable. For example, suppose that one is considering a portfolio of many assets; the return of the portfolio can be directly computed if one knows the asset shares and the return of each asset [23]. Granger and Joyeux [24] proposed the univariate fractionally integrated autoregressive moving average (ARFIMA) model to explain the long memory property that exists in the conditional mean. Therefore, we consider in this study the multivariate ARFIMA model (for more details, see Sowell [25] and Gil-Alana [26, 27]), in which the fractional integration parameters determine the long memory properties of the data. Apart from the presence of long-term dependence in the conditional mean, we take into account the volatility properties of the series (see Ding et al. [28] and Doukhan et al. [29]). More precisely, the time-varying feature of volatility is explained by multivariate generalized autoregressive conditionally heteroscedastic (MGARCH) models.

Although univariate descriptions are useful and important, various financial operations require a multivariate framework, since high volatilities are often observed in the same time periods across different assets. The development of MGARCH models from the original univariate specifications represents a major step forward in the modelling of time series, as these models permit time-varying conditional covariances as well as variances. The main contribution of this paper is thus to incorporate the MGARCH framework to model the volatility of a multivariate process exhibiting long-term dependence and slow decay in stock returns. More precisely, we examine the long memory property in the first conditional moment of daily stock returns; the robustness of the results is also investigated by allowing the innovations to be generated by an MGARCH process. We find that the long memory property exists in the conditional mean of the Nasdaq 100, New York Stock Exchange (NYSE) composite, and Russell 3000 stock returns.

The rest of this paper is organized as follows. We briefly review the multivariate models in the next section. Section 3 outlines the quasi-maximum likelihood estimation and testing procedures for the models, which are applied to US stock returns. Section 4 describes multistep forecasting with some MGARCH models, and Section 5 presents the data used and provides the empirical results. The paper ends with a short concluding section.

#### 2. Econometric Framework

The following is a brief description of the time series models used in this study. The vector ARFIMA models are discussed in detail in Sowell [25] and Luceño [30]. Tsay [31] provided a more detailed description for the estimation of the vector ARFIMA models. He suggested a conditional likelihood Durbin-Levinson algorithm to efficiently evaluate the conditional likelihood function of the vector ARFIMA processes. Hosoya [32] and Nielsen [33] proposed a class of maximum likelihood estimators and tests. Lobato [34] analyzed a two-step estimator of the long memory parameters of a vector process by using a semiparametric version of the multivariate Gaussian likelihood function in the frequency domain.

Consider a vector ARFIMA process

$$\Phi(L)\,D(L)\,(Y_t-\mu)=\Theta(L)\,\varepsilon_t,\qquad t=1,\ldots,T,$$

where $Y_t$ is an $N$-dimensional vector of observations, $\mu$ is the conditional mean vector of the process, and $\Phi(L)$ and $\Theta(L)$ are matrix polynomials in the lag operator $L$ satisfying the usual stationarity and invertibility conditions, respectively; that is, the roots of $|\Phi(z)|=0$ and $|\Theta(z)|=0$ (where $|\cdot|$ denotes the determinant of the matrix) lie outside the unit circle. $D(L)=\operatorname{diag}\{(1-L)^{d_1},\ldots,(1-L)^{d_N}\}$ collects the fractional differencing filters. The fractional differencing operator is defined by the binomial expansion

$$(1-L)^{d}=\sum_{k=0}^{\infty}\frac{\Gamma(k-d)}{\Gamma(k+1)\,\Gamma(-d)}\,L^{k},$$

where $\Gamma(\cdot)$ is the Gamma function. (The Gamma function is defined as $\Gamma(z)=\int_{0}^{\infty}t^{z-1}e^{-t}\,dt$.) Hence, the long-range dependence between observations is ultimately determined only by the fractional differencing parameters. These characteristics can be seen in the shapes of the spectral density and the autocorrelation function. Indeed, if $d_i\in(-0.5,0.5)$ for $i=1,\ldots,N$, the multivariate process is stationary and invertible. If $d_i\in(0,0.5)$, the process is characterized by strong positive dependence between observations. In the frequency domain, this is reflected in a spectral density that increases to infinity at the zero frequency. In the time domain, the persistence is indicated by the slow decay of the autocorrelation functions, which are not absolutely summable. In this case, $Y_t$ is said to have long memory. (See Beran [35] for an overview of long memory processes.) If $d_i\in(-0.5,0)$, the process exhibits negative dependence between observations. In the frequency domain, this is indicated by the decline of the spectral density to zero as the frequency approaches zero; in the time domain, antipersistence is indicated by the rapid decay of the autocorrelation functions.
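As a concrete illustration (not part of the paper's estimation code; the function names are hypothetical), the binomial-expansion weights of the fractional difference operator $(1-L)^d$ can be computed with a simple recursion equivalent to the Gamma-function ratio above:

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients of (1 - L)^d, via the recursion
    pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k, which is equivalent to
    pi_k = Gamma(k - d) / (Gamma(k + 1) * Gamma(-d))."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d):
    """Truncated fractional difference of a series x (pre-sample terms dropped)."""
    w = frac_diff_weights(d, len(x))
    return np.array([w[: t + 1][::-1] @ x[: t + 1] for t in range(len(x))])

# For 0 < d < 0.5 the weights decay hyperbolically (long memory);
# d = 0.4 gives 1, -0.4, -0.12, -0.064, ...
```

For $d=0$ the weights reduce to $(1,0,0,\ldots)$, so the series is returned unchanged, consistent with the short-memory special case.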

So, in this paper, we consider a vector ARFIMA model, which generates the long memory property in the first conditional moment and which allows its innovations to be generated by an MGARCH process. As an illustration, the proposed model is applied to the daily stock returns of the Nasdaq, New York Stock Exchange, and Russell indices.

The most commonly employed distribution in the literature is the multivariate normal. Thus, we assume that the innovation vector $\varepsilon_t$ is conditionally multivariate normal with zero mean and covariance matrix $H_t$, so that $\varepsilon_t=H_t^{1/2}\eta_t$. We denote by $\mathcal{F}_{t-1}$ the information set generated by the observed series up to time $t-1$. $H_t$ is the $N\times N$ conditional covariance matrix of $\varepsilon_t$, $H_t=\operatorname{Var}(\varepsilon_t\mid\mathcal{F}_{t-1})$, and $\eta_t$ is an independent and identically distributed random vector error process such that $E(\eta_t)=0$ and $\operatorname{Var}(\eta_t)=I_N$.

As noted by Silvennoinen and Teräsvirta [36] (see also Silvennoinen [37]), the specification of an MGARCH should be flexible enough to be able to represent the dynamics of the conditional variances and covariances, and, as the number of parameters in an MGARCH model often increases rapidly with the dimension of the model, the specification should be parsimonious enough to allow for easy estimation of the model. Another feature that needs to be taken into account in the specification is that the conditional covariance matrices should be positive definite.

##### 2.1. Generalizations of the Univariate GARCH Model: VEC and BEKK Models

The model of Bollerslev et al. [38] is the first multivariate GARCH model. It is a generalization of the univariate GARCH model, where each element of $H_t$ is a linear function of the lagged squared errors and cross-products of errors as well as lagged values of the elements of $H_t$. The VEC model is given by

$$\operatorname{vech}(H_t)=c+\sum_{i=1}^{q}A_i\operatorname{vech}(\varepsilon_{t-i}\varepsilon_{t-i}')+\sum_{j=1}^{p}B_j\operatorname{vech}(H_{t-j}),$$

where vech is the operator that stacks the lower triangular part of an $N\times N$ matrix into an $N(N+1)/2\times 1$ column vector, $H_t$ is the conditional covariance matrix, $c$ is an $N(N+1)/2\times 1$ parameter vector, and $A_i$ and $B_j$ are $N(N+1)/2\times N(N+1)/2$ parameter matrices.
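A minimal numerical sketch of one VEC(1,1) update (the helper names are my own, not from the paper):

```python
import numpy as np

def vech(M):
    """Stack the lower-triangular part (including the diagonal) of M into a vector."""
    i, j = np.tril_indices(M.shape[0])
    return M[i, j]

def unvech(v, n):
    """Rebuild the n x n symmetric matrix whose vech is v."""
    M = np.zeros((n, n))
    i, j = np.tril_indices(n)
    M[i, j] = v
    M[j, i] = v
    return M

def vec_garch_step(c, A, B, eps_prev, H_prev):
    """One VEC(1,1) update: vech(H_t) = c + A vech(eps eps') + B vech(H_{t-1})."""
    h = c + A @ vech(np.outer(eps_prev, eps_prev)) + B @ vech(H_prev)
    return unvech(h, H_prev.shape[0])
```

Note that nothing in this recursion constrains the rebuilt $H_t$ to be positive definite, which is exactly the weakness of the VEC formulation discussed above.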

The main disadvantage of this model is its number of parameters, $(p+q)\left(N(N+1)/2\right)^2+N(N+1)/2$, which grows very rapidly as the number of variables increases; the estimation of the parameters therefore becomes difficult. Furthermore, the positivity of $H_t$ is not guaranteed. To overcome this problem, Bollerslev et al. [38] suggest the diagonal VEC (DVEC) model (see also Bauwens et al. [39] and Silvennoinen and Teräsvirta [36]), in which $A_i$ and $B_j$ are assumed to be diagonal, each element of $H_t$ depending only on its own lag and on the previous value of the corresponding cross-product of errors. This restriction reduces the number of parameters to $(1+p+q)\,N(N+1)/2$. But even under this diagonality assumption, large-scale systems are still highly parameterized and difficult to work with in practice.

One of the most general forms, proposed by Engle and Kroner [40], is the BEKK (Baba-Engle-Kraft-Kroner) representation. This formulation specifies a quadratic form for the conditional covariance equation, which eliminates the problem of assuring the positive definiteness of the conditional covariance matrix. The conditional covariance matrix takes the form

$$H_t=C'C+\sum_{k=1}^{K}\sum_{i=1}^{q}A_{ik}'\varepsilon_{t-i}\varepsilon_{t-i}'A_{ik}+\sum_{k=1}^{K}\sum_{j=1}^{p}B_{jk}'H_{t-j}B_{jk},$$

where the summation limit $K$ determines the generality of the process, $C$ is an upper triangular matrix, and $A_{ik}$ and $B_{jk}$ are both $N\times N$ parameter matrices. This representation guarantees that $H_t$ is positive definite. Although the form of the model is quite general, especially when $K$ is reasonably large, it suffers from overparameterization. (See Engle and Kroner [40] for more discussion of the identification problem of this model.)

Unlike in the VEC model, the parameters of the BEKK model do not directly represent the impact of the different lagged terms on the elements of $H_t$.

The number of parameters in the BEKK(1,1,1) model is $N(5N+1)/2$, which is still quite large. Thus, as already mentioned, a problem with these models is that the number of parameters increases very rapidly with the dimension of the process, which creates difficulties in the estimation of the models due to several matrix inversions; it is therefore typically assumed that $K=1$ in applications of this model. A further simplified version of (5), in which $A$ and $B$ are diagonal matrices, has sometimes appeared in applications. This is the diagonal BEKK model (Bauwens et al. [39] also propose a scalar BEKK model, where $A$ and $B$ are equal to a scalar times a matrix of ones, to further reduce the number of parameters), where the number of parameters is $N(N+5)/2$. This model is also a DVEC model, but it is less general, although it is guaranteed to be positive definite while the DVEC is not.
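To see the positive-definiteness guarantee at work, here is a small simulation of a diagonal BEKK(1,1,1) process with made-up parameter values (chosen only so that each series satisfies $a_i^2+b_i^2<1$; nothing here comes from the paper's estimates):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 3, 200

# Illustrative (hypothetical) diagonal BEKK(1,1,1) parameters.
C = np.triu(rng.uniform(0.1, 0.3, (N, N)))   # upper-triangular intercept
A = np.diag([0.3, 0.25, 0.2])                # diagonal ARCH loadings
B = np.diag([0.9, 0.92, 0.94])               # diagonal GARCH loadings

H = np.eye(N)
eps = np.zeros(N)
for _ in range(T):
    # H_t = C'C + A' eps_{t-1} eps_{t-1}' A + B' H_{t-1} B
    H = C.T @ C + A.T @ np.outer(eps, eps) @ A + B.T @ H @ B
    eps = np.linalg.cholesky(H) @ rng.standard_normal(N)

# The quadratic form keeps H_t symmetric positive definite at every step.
assert np.all(np.linalg.eigvalsh(H) > 0)
```

The Cholesky factorization inside the loop would fail if $H_t$ ever lost positive definiteness, so the simulation itself exercises the guarantee.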

##### 2.2. Nonlinear Combinations of Univariate GARCH Models: CCC and DCC Models

Bollerslev [41] proposed a class of MGARCH models in which the conditional correlation matrix is time invariant, so that the conditional covariances are proportional to the products of the corresponding conditional standard deviations. This is the so-called constant conditional correlation (CCC) model. The restriction greatly reduces the number of unknown parameters and thus simplifies estimation. The conditional covariance matrix may then be decomposed as

$$H_t=D_tRD_t,\qquad D_t=\operatorname{diag}\{h_{11t}^{1/2},\ldots,h_{NNt}^{1/2}\},$$

where each $h_{iit}$ can be defined by any univariate GARCH model, and $R$ is a symmetric positive definite matrix of conditional correlations with typical element $\rho_{ij}$, with $\rho_{ii}=1$ for all $i$. The CCC-GARCH model assumes that the conditional correlations are constant, $\rho_{ijt}=\rho_{ij}$, so that the temporal variation in $H_t$ is determined solely by the time-varying conditional variances of each of the elements in $\varepsilon_t$. As long as each conditional variance is positive (see Nelson and Cao [42] for a discussion of positivity conditions in univariate models), the CCC model guarantees that the resulting conditional covariance matrices are positive definite.
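The CCC construction $H_t=D_tRD_t$ is easy to verify numerically; the correlation matrix and variances below are illustrative only:

```python
import numpy as np

def ccc_covariance(h, R):
    """CCC: H_t = D_t R D_t, with D_t = diag(sqrt(h_11t), ..., sqrt(h_NNt))
    and R a constant correlation matrix."""
    D = np.diag(np.sqrt(h))
    return D @ R @ D

R = np.array([[1.0, 0.8, 0.6],
              [0.8, 1.0, 0.7],
              [0.6, 0.7, 1.0]])
h = np.array([0.5, 1.2, 2.0])             # conditional variances from univariate fits
H = ccc_covariance(h, R)

assert np.allclose(np.diag(H), h)         # diagonal reproduces the variances
assert np.all(np.linalg.eigvalsh(H) > 0)  # PD whenever R is PD and h > 0
```

Each off-diagonal element is $\rho_{ij}\,h_{iit}^{1/2}h_{jjt}^{1/2}$, so only the variances (not the correlations) move over time.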

Despite the simplicity of this model, the assumption of constant conditional correlations may seem too restrictive and unrealistic. Engle [43] proposed a new class of estimators that preserves the ease of estimation of Bollerslev's constant correlation model yet allows for nonconstant correlations. The dynamic conditional correlation (DCC)-GARCH model (various generalizations of the DCC-GARCH model are proposed in the literature [44–47]) preserves the parsimony of univariate GARCH models of individual assets' volatility with a simple GARCH-type process for the correlations. Further, the number of parameters estimated by maximum likelihood is a considerable improvement over both the VEC and the BEKK models. Tse and Tsui [48] also proposed a dynamic correlation MGARCH model; however, it makes no attempt to allow for separate estimation of the univariate GARCH processes and the dynamic correlation estimator.

The *DCC* model of Engle [43] computes the time-varying conditional correlation matrix from the standardized residuals series as

$$R_t=\operatorname{diag}(Q_t)^{-1/2}\,Q_t\,\operatorname{diag}(Q_t)^{-1/2},$$

where $Q_t$, the symmetric positive definite matrix driving the dynamics, is given by

$$Q_t=(1-a-b)\bar{Q}+a\,u_{t-1}u_{t-1}'+b\,Q_{t-1},$$

and $a$ and $b$ are nonnegative scalar parameters satisfying $a+b<1$. $\bar{Q}$ is the unconditional covariance matrix of the standardized residuals resulting from the first-step estimation, where $u_t$ is the standardized residuals vector ($u_{it}=\varepsilon_{it}/h_{iit}^{1/2}$, for $i=1,\ldots,N$). The typical element of $R_t$ is of the form

$$\rho_{ijt}=\frac{q_{ijt}}{\sqrt{q_{iit}\,q_{jjt}}}.$$

A slightly different formulation was suggested by Tse and Tsui [48]:

$$R_t=(1-\theta_1-\theta_2)R+\theta_1\Psi_{t-1}+\theta_2R_{t-1},$$

where $\theta_1$ and $\theta_2$ are nonnegative scalar parameters such that $\theta_1+\theta_2<1$. Here, $R$ is a time-invariant symmetric positive definite parameter matrix of conditional correlations with ones on the diagonal, and $\Psi_{t-1}$ is the sample correlation matrix of the past $M$ standardized residuals $u_{t-1},\ldots,u_{t-M}$. The positive definiteness of $R_t$ is ensured by construction if $R$ and $\Psi_{t-1}$ are positive definite, which requires $M\geq N$. The typical element of $\Psi_{t-1}$ is of the form

$$\psi_{ij,t-1}=\frac{\sum_{m=1}^{M}u_{i,t-m}u_{j,t-m}}{\sqrt{\left(\sum_{m=1}^{M}u_{i,t-m}^{2}\right)\left(\sum_{m=1}^{M}u_{j,t-m}^{2}\right)}},$$

with $u_{it}=\varepsilon_{it}/h_{iit}^{1/2}$, for $i=1,\ldots,N$.

With GARCH(1,1) conditional variances, the number of parameters in both DCC models is $(N+1)(N+4)/2$. To check whether the assumption of constant conditional correlations is empirically relevant, one can test $H_0: a=b=0$ (or $\theta_1=\theta_2=0$). A drawback of the DCC models is that $a$ and $b$ are scalars, so that all the conditional correlations obey the same dynamics. This restriction is necessary to ensure that $R_t$ is positive definite through sufficient conditions on the parameters.
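As a sketch (with arbitrary parameter values, not the paper's estimates), Engle's DCC recursion and the normalization of $Q_t$ to a correlation matrix can be written as:

```python
import numpy as np

def dcc_correlations(std_resid, a, b):
    """Engle's DCC recursion on standardized residuals u_t:
      Q_t = (1 - a - b) Qbar + a u_{t-1} u_{t-1}' + b Q_{t-1},
      R_t = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2}.
    Returns the sequence of conditional correlation matrices R_t."""
    T, N = std_resid.shape
    Qbar = std_resid.T @ std_resid / T      # unconditional covariance of u_t
    Q = Qbar.copy()
    Rs = []
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        Rs.append(d[:, None] * Q * d[None, :])
        u = std_resid[t]
        Q = (1 - a - b) * Qbar + a * np.outer(u, u) + b * Q
    return np.array(Rs)

rng = np.random.default_rng(1)
u = rng.standard_normal((500, 3))
R = dcc_correlations(u, a=0.05, b=0.9)     # a + b < 1 for mean reversion
assert np.allclose(R[:, range(3), range(3)], 1.0)  # unit diagonal by construction
```

Because each $Q_t$ is a positive combination of positive (semi)definite matrices, every normalized $R_t$ is a valid correlation matrix with unit diagonal.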

#### 3. Estimation of MGARCH Model

Bollerslev [41] introduced the CCC-GARCH specification, where univariate GARCH models are estimated for each asset, and the correlation matrix is then estimated with the standard closed-form MLE correlation estimator applied to the residuals transformed by their estimated conditional standard deviations. The assumption of constant correlation makes estimating a large model feasible and ensures that the estimator is positive definite, requiring only that each univariate conditional variance be nonzero and that the correlation matrix be of full rank. However, the constant correlation estimator, as proposed, does not provide a method to construct consistent standard errors under the multistage estimation process. Bollerslev [41] argued that the notion of constant correlation is plausible, but Tse and Tsui [48] and Tse [49] found that it can be rejected for some assets.

For the maximum likelihood estimation (MLE) of the parameters, we assume conditional normality of the errors. The log-likelihood function of the model has the following form:

$$L(\theta)=-\frac{1}{2}\sum_{t=1}^{T}\left(N\log(2\pi)+\log|H_t|+\varepsilon_t'H_t^{-1}\varepsilon_t\right),$$

where $\theta$ is the vector of all the parameters in the model.

Estimating the conditional mean parameters simultaneously with the conditional variance parameters would increase efficiency, at least in large samples, but is computationally more demanding. For this reason, we first estimate the fractionally integrated model for the conditional mean and then treat the resulting ARFIMA residuals $\hat{\varepsilon}_t$ as the data for fitting the MGARCH model.

Engle and Sheppard [50] (see also Sheppard [51]) showed that the log likelihood can be written as the sum of a mean and volatility part, depending on a set of unknown parameters $\theta$, and a correlation part, depending on $\phi$:

$$L(\theta,\phi)=L_V(\theta)+L_C(\theta,\phi).$$

The conditional variance matrix of a DCC model can be expressed as $H_t=D_tR_tD_t$. The DCC model was designed to allow for two-stage estimation: in the first stage, univariate GARCH models are estimated for each residual series, and in the second stage, the residuals, transformed by the standard deviations estimated during the first stage, are used to estimate the parameters of the dynamic correlation. The likelihood used in the first stage involves replacing $R_t$ with the identity matrix in (19). Let $\theta=(\theta_1,\ldots,\theta_N)$, where the elements of $\theta_i$ correspond to the parameters of the univariate GARCH model for the $i$th asset's returns, $i=1,\ldots,N$. The resulting first-stage quasi-likelihood function is

$$L_V(\theta)=-\frac{1}{2}\sum_{t=1}^{T}\sum_{i=1}^{N}\left(\log(2\pi)+\log h_{iit}+\frac{\varepsilon_{it}^{2}}{h_{iit}}\right),$$

which is simply the sum of the log likelihoods of the individual GARCH models for each of the asset returns. Once the first stage has been estimated, the second stage is estimated using the correctly specified likelihood, conditional on the parameters estimated in the first stage:

$$L_C(\hat{\theta},\phi)=-\frac{1}{2}\sum_{t=1}^{T}\left(N\log(2\pi)+\log|R_t|+u_t'R_t^{-1}u_t\right),$$

where $u_t=D_t^{-1}\varepsilon_t$ are the univariate GARCH standardized residuals.

Since we are conditioning on $\hat{\theta}$, the only portion of the log likelihood that influences the parameter selection is $\log|R_t|+u_t'R_t^{-1}u_t$, and, in the estimation of the DCC parameters, it is often easier to exclude the constant terms and simply maximize

$$L_C^{*}(\phi)=-\frac{1}{2}\sum_{t=1}^{T}\left(\log|R_t|+u_t'R_t^{-1}u_t\right).$$
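The second-stage objective can be sketched as follows: a quasi-likelihood in the DCC parameters $a$ and $b$, conditioning on first-stage standardized residuals (the function name is hypothetical, and constants are dropped as discussed above):

```python
import numpy as np

def dcc_stage2_loglik(params, std_resid):
    """Second-stage DCC quasi-log-likelihood (constants dropped), conditional
    on the first-stage GARCH fits: -1/2 sum_t [log|R_t| + u_t' R_t^{-1} u_t]."""
    a, b = params
    if a < 0 or b < 0 or a + b >= 1:
        return -np.inf                      # outside the admissible region
    T, N = std_resid.shape
    Qbar = std_resid.T @ std_resid / T
    Q = Qbar.copy()
    ll = 0.0
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R = d[:, None] * Q * d[None, :]
        u = std_resid[t]
        ll -= 0.5 * (np.linalg.slogdet(R)[1] + u @ np.linalg.solve(R, u))
        Q = (1 - a - b) * Qbar + a * np.outer(u, u) + b * Q
    return ll
```

In practice this function would be handed to a numerical optimizer (or a coarse grid search) over $(a,b)$ subject to $a+b<1$.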

*Test of Constant Conditional Correlations*

The first step in modelling time-varying conditional correlations is to test the hypothesis of constant correlations. Testing data for constant correlation has proven to be a difficult problem: testing for dynamic correlation with data that have time-varying volatilities can produce misleading conclusions and can lead to rejecting constant correlation when it is true, owing to misspecified volatility models. Bera and Kim [52] provided a test of a null of constant correlation against an alternative of a dynamic correlation structure; it is an information matrix-type test that, besides constant correlations, simultaneously examines various features of the specified model. An alternative test was proposed by Longin and Solnik [53]. We are interested in testing the null of constant correlation against the alternative of dynamic conditional correlation via the Lagrange multiplier (LM) approach suggested by Tse [49], which tests a null of constant conditional correlation against an ARCH-in-correlation alternative. A rejection of the null hypothesis supports the hypothesis of time-varying correlations. Rewriting the DCC model as

$$Q_t=(1-a-b)\bar{Q}+a\,u_{t-1}u_{t-1}'+b\,Q_{t-1},$$

we test

$$H_0:\ a=b=0$$

against the alternative

$$H_1:\ a\neq 0\ \text{or}\ b\neq 0,$$

where the conditional variances are GARCH(1,1). Under $H_0$, the LM statistic is asymptotically $\chi^2$ with degrees of freedom equal to the number of restrictions. Under the normality assumption, the conditional log likelihood of the observation at time $t$ is given by

$$l_t=-\frac{1}{2}\left(N\log(2\pi)+\log|H_t|+\varepsilon_t'H_t^{-1}\varepsilon_t\right),$$

and the log-likelihood function is given by $L=\sum_{t=1}^{T}l_t$.

Engle and Sheppard [50] proposed another test of the constant correlation hypothesis in the DCC models. The null hypothesis $H_0: R_t=\bar{R}$ for all $t$ is tested against the alternative $H_1: \operatorname{vech}^{u}(R_t)\neq\operatorname{vech}^{u}(\bar{R})$. The test is easy to implement, since $H_0$ implies that all coefficients in the autoregression

$$Y_t=\alpha+\beta_1Y_{t-1}+\cdots+\beta_sY_{t-s}+\zeta_t$$

are equal to zero, where $Y_t=\operatorname{vech}^{u}\big(v_t v_t'-I_N\big)$, $\operatorname{vech}^{u}$ is like the vech operator but selects only the elements under the main diagonal, and $v_t=\hat{\bar{R}}^{-1/2}\hat{D}_t^{-1}\hat{\varepsilon}_t$ is the vector of jointly standardized residuals (under $H_0$).

*Portmanteau Statistics*

It is crucial to check the adequacy of the MGARCH specification. Bollerslev [41] suggested some diagnostics for the constant correlation MGARCH model: he computed the Ljung-Box portmanteau statistic on the cross-products of the standardized residuals across different equations, with critical values based on the $\chi^2$ distribution. As mentioned by Tse [49], the diagnostics for conditional heteroscedasticity models applied in the literature can be divided into three categories: portmanteau tests of the Ljung-Box type, residual-based diagnostics, and Lagrange multiplier tests. To check the overall significance of the residual autocorrelations, we consider the Ljung-Box portmanteau statistic. This type of test was introduced by Box and Pierce [54] for goodness-of-fit checking of univariate strong ARMA models; Ljung and Box [55] proposed a slightly different portmanteau test which is nowadays one of the most popular diagnostic checking tools in ARMA modelling of time series. Following Hosking [56, 57], a multivariate version of the Ljung-Box portmanteau statistic (for more details see Francq and Raïssi [58]) is given by

$$HM(M)=T^{2}\sum_{j=1}^{M}\frac{1}{T-j}\operatorname{tr}\left(\hat{C}_j'\hat{C}_0^{-1}\hat{C}_j\hat{C}_0^{-1}\right),$$

where $\operatorname{tr}(\cdot)$ denotes the trace of a matrix and $\hat{C}_j$ is the sample autocovariance matrix of order $j$. Under the null hypothesis, $HM(M)$ is distributed asymptotically as $\chi^2(N^2M)$. Duchesne and Lalancette [59] generalized this statistic using a spectral approach and obtained higher asymptotic power by using a different kernel than the truncated uniform kernel implicit in $HM(M)$. The test is also used to detect misspecification in the conditional variance matrix $H_t$, by applying it to the squared standardized residuals. Ling and Li [60] proposed an alternative portmanteau statistic for multivariate conditional heteroscedasticity. They defined the sample lag-$j$ (transformed) residual autocorrelation as

$$\tilde{\rho}_j=\frac{\sum_{t=j+1}^{T}\left(\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t-N\right)\left(\hat{\varepsilon}_{t-j}'\hat{H}_{t-j}^{-1}\hat{\varepsilon}_{t-j}-N\right)}{\sum_{t=1}^{T}\left(\hat{\varepsilon}_t'\hat{H}_t^{-1}\hat{\varepsilon}_t-N\right)^{2}}.$$

Their test statistic is given by $T\sum_{j=1}^{M}\tilde{\rho}_j^{2}$ and is asymptotically distributed as $\chi^2(M)$ under the null hypothesis. In the derivation of the asymptotic results, normality of the innovation process is not assumed; the statistic is thus robust to the distribution choice. Tse and Tsui [48] showed, however, that there is a loss of information in the transformation of the residuals, and the test may suffer from a power reduction.
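Hosking's multivariate portmanteau statistic can be computed directly from the sample autocovariances; a compact sketch (function name is my own):

```python
import numpy as np

def multivariate_ljung_box(X, M):
    """Hosking's multivariate Ljung-Box statistic
      Q(M) = T^2 sum_{j=1}^{M} (T - j)^{-1} tr(C_j' C_0^{-1} C_j C_0^{-1}),
    asymptotically chi-square with N^2 * M degrees of freedom under the null
    of no serial correlation in the N-dimensional series X (T x N)."""
    T, N = X.shape
    Xc = X - X.mean(axis=0)
    C0 = Xc.T @ Xc / T
    C0inv = np.linalg.inv(C0)
    Q = 0.0
    for j in range(1, M + 1):
        # lag-j sample autocovariance: pairs x_t with x_{t-j}
        Cj = Xc[j:].T @ Xc[:-j] / T
        Q += np.trace(Cj.T @ C0inv @ Cj @ C0inv) / (T - j)
    return T * T * Q
```

Applied to the squared standardized residuals, the same statistic serves as the check on the conditional variance specification described above.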

#### 4. Multivariate GARCH Prediction

Forecasting is one of the main objectives of multivariate time series analysis. Predictions from multivariate GARCH models can be generated in a fashion similar to predictions from univariate GARCH models. (For more details see Moon et al. [61] and Hlouskova et al. [62].) Indeed, for MGARCH models built from univariate GARCH specifications, such as the CCC model and the principal component model, the predictions are generated from the underlying univariate GARCH models and then converted to the scale of the original multivariate time series by the appropriate transformation. This section focuses on prediction from the diagonal BEKK and DCC models.

To illustrate the prediction of the conditional covariance matrix for multivariate GARCH models, consider the conditional variance equation for the diagonal BEKK(1,1,1) model,

$$H_t=C'C+A'\varepsilon_{t-1}\varepsilon_{t-1}'A+B'H_{t-1}B,$$

where $C$, $A$, and $B$ are $N\times N$ matrices, $C$ is upper triangular, and $A$ and $B$ are diagonal matrices. The model (26) is estimated over the time period $t=1,\ldots,T$. Given the information at time $T$, the one-step-ahead prediction ($k=1$) of the conditional covariance matrix at time $T+1$ is given by

$$H_{T+1|T}=C'C+A'\varepsilon_T\varepsilon_T'A+B'H_TB.$$

When $k\geq 2$, it can be shown that

$$H_{T+k|T}=C'C+A'H_{T+k-1|T}A+B'H_{T+k-1|T}B,$$

where $H_{T+k-1|T}$ is obtained in the previous step. This procedure can be iterated to obtain $H_{T+k|T}$ for any horizon $k$.
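The iteration above translates directly into code (the parameter values in the usage example are illustrative only):

```python
import numpy as np

def bekk_forecast(C, A, B, eps_T, H_T, k):
    """k-step-ahead covariance forecasts from a diagonal BEKK(1,1,1):
      H_{T+1|T} = C'C + A' eps_T eps_T' A + B' H_T B,
      H_{T+s|T} = C'C + A' H_{T+s-1|T} A + B' H_{T+s-1|T} B,  s > 1,
    using E_T[eps_{T+s-1} eps_{T+s-1}'] = H_{T+s-1|T}."""
    H = C.T @ C + A.T @ np.outer(eps_T, eps_T) @ A + B.T @ H_T @ B
    out = [H]
    for _ in range(k - 1):
        H = C.T @ C + A.T @ H @ A + B.T @ H @ B
        out.append(H)
    return out

# Hypothetical 2-dimensional example, not the paper's estimates.
C = np.array([[0.2, 0.1], [0.0, 0.2]])
A = np.diag([0.3, 0.3])
B = np.diag([0.9, 0.9])
fc = bekk_forecast(C, A, B, np.array([0.5, -0.2]), np.eye(2), 5)
```

Since every term in the recursion is a quadratic form around a positive definite intercept $C'C$, each forecast $H_{T+k|T}$ remains symmetric positive definite.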

Let us consider the DCC model given by (18), which can be written as

$$Q_t=(1-a-b)\bar{Q}+a\,u_{t-1}u_{t-1}'+b\,Q_{t-1},$$

where $Q_t$ is the symmetric positive definite matrix driving the correlation dynamics, $a$ and $b$ are nonnegative scalar parameters satisfying $a+b<1$, and $\bar{Q}$ is the unconditional covariance matrix of the standardized residuals resulting from the first-step estimation. The $k$-step-ahead forecasts of a standard GARCH(1,1) variance and of the DCC evolution process are given by

$$h_{T+k|T}=\omega\sum_{i=0}^{k-2}(\alpha+\beta)^{i}+(\alpha+\beta)^{k-1}h_{T+1|T},$$

$$E_T[Q_{T+k}]=\sum_{i=0}^{k-2}(1-a-b)\bar{Q}(a+b)^{i}+(a+b)^{k-1}Q_{T+1},$$

where $E_T[\cdot]$ denotes the expectation conditional on the information at time $T$. Because $R_t$ is a nonlinear transformation of $Q_t$, the $k$-step-ahead forecast of the correlation cannot be directly solved forward to provide a convenient closed form for forecasting. In examining methods to overcome this difficulty, two forecasts seem the most natural, each requiring a different set of approximations. The first technique is to generate the $k$-step-ahead forecast of $Q_{T+k}$ by making the approximation $E_T[u_{T+i}u_{T+i}']\approx E_T[Q_{T+i}]$; using this approximation, the $k$-step-ahead forecast of the correlation is

$$E_T[R_{T+k}]\approx\operatorname{diag}\left(E_T[Q_{T+k}]\right)^{-1/2}E_T[Q_{T+k}]\operatorname{diag}\left(E_T[Q_{T+k}]\right)^{-1/2}.$$

An alternative approximation is that $\bar{Q}\approx\bar{R}$ and $E_T[Q_{T+i}]\approx E_T[R_{T+i}]$; under this approximation, $R_{T+k}$ can be forecast directly using the relationship

$$E_T[R_{T+k}]\approx\sum_{i=0}^{k-2}(1-a-b)\bar{R}(a+b)^{i}+(a+b)^{k-1}R_{T+1}.$$

To test which of these approximations performs better, Engle and Sheppard [50] conducted a Monte Carlo experiment. They concluded that the forecast produced by solving forward for $Q_{T+k}$ was more biased than the method of solving forward for $R_{T+k}$, which had better bias properties for almost all correlations and horizons. Also of interest is that both forecasts appear to be unbiased, and make essentially the same forecast, when the unconditional correlation is zero.

While neither of the two techniques significantly outperformed the other, a logical choice for forecasting would be the method that solves forward for $R_{T+k}$ directly, which also appears easier to implement. We therefore choose this second technique in what follows.
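The chosen approximation solves the correlation recursion forward directly; with $a+b<1$ the forecast reverts to the unconditional correlation $\bar{R}$ as the horizon grows. A minimal sketch (function name is my own):

```python
import numpy as np

def dcc_corr_forecast(R_T1, Rbar, a, b, k):
    """k-step-ahead DCC correlation forecast, solving the recursion forward
    in R directly (the second approximation above):
      R_{T+s|T} = (1 - a - b) Rbar + (a + b) R_{T+s-1|T},
    starting from the one-step forecast R_{T+1|T}."""
    R = R_T1.copy()
    for _ in range(k - 1):
        R = (1 - a - b) * Rbar + (a + b) * R
    return R
```

Since $R_{T+k|T}=\bar{R}+(a+b)^{k-1}(R_{T+1|T}-\bar{R})$, the forecast decays geometrically toward $\bar{R}$, the mean-reversion property noted in the empirical section.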

#### 5. Empirical Application

##### 5.1. The Data

The data employed in this study are taken from Datastream and consist of 4530 daily observations on the Nasdaq 100 (NAS), New York Stock Exchange composite (NYA), and Russell 3000 (RUA) stock indices over the period from January 4, 1988 to December 21, 2005. The returns series, denoted $r_t$, are calculated as the first differences of the logarithm of the price index $P_t$, $r_t=\ln P_t-\ln P_{t-1}$. The models used are the full BEKK(1,1), diagonal BEKK(1,1), CCC(1,1)-GARCH, and DCC(1,1)-GARCH, where each of the univariate GARCH models estimated is a GARCH(1,1), and we focus our attention on the covariance matrix modelling.
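A small helper for the returns transformation (the optional scaling by 100 to percentage returns is a common convention, assumed here rather than taken from the paper):

```python
import numpy as np

def log_returns(prices, scale=100.0):
    """Log returns r_t = scale * ln(P_t / P_{t-1}); scale=100 gives
    percentage returns (the scaling convention is an assumption)."""
    p = np.asarray(prices, dtype=float)
    return scale * np.diff(np.log(p))
```

Applied to a price series of length $T$, this produces $T-1$ return observations.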

Figures 1 and 2 plot the daily price indices and daily stock returns. The market volatility clearly changes over time, which suggests that a suitable model for the data should have a time-varying volatility structure, as provided by the GARCH family.


##### 5.2. Estimation Results

For the above-mentioned indices, the sample mean and standard deviation of returns, the skewness and kurtosis coefficients, and the Jarque-Bera and Ljung-Box (univariate and multivariate versions) tests are reported in Table 1. Skewness describes asymmetry relative to the normal distribution: the coefficient is negative when data points are skewed to the left of the average and positive when they are skewed to the right. Negative skewness means that there is a substantial probability of a large negative return, whereas positive skewness means that there is a greater than normal probability of a large positive return, indicating that the right tail is longer than the left and that the bulk of the values lie to the left of the mean. Kurtosis measures the peakedness of a distribution and the fatness of its tails; a fat-tailed distribution has higher than normal chances of a large positive or negative realization. For all series, the returns distributions display positive skewness. Moreover, the data indicate a high degree of excess kurtosis (leptokurtosis), since the kurtosis coefficients are significantly larger than the normal value of three. The returns series appear strongly nonnormal according to the Jarque-Bera test. The Ljung-Box test applied to the series and squared series provides clear evidence against the hypothesis of serial independence of the observations and indicates the existence of ARCH effects. The results of the unit root and stationarity tests of Phillips and Perron [63], Kwiatkowski et al. [64], and the augmented Dickey-Fuller test [65] are consistent with stationarity of the returns series at both the 5% and 1% significance levels. (The results are not reported here to conserve space but are available from the authors upon request.)

The estimates of the fractional integration parameters based on the GPH procedure, as well as those of the CCC- and DCC-GARCH models, are given in Tables 2 and 3, respectively. The results reveal clear evidence of long-range dependence for all stock returns, since the estimated fractional differencing parameters are significantly positive while remaining in the region implying covariance stationarity of the process. These results are statistically significant and contrast with most studies of long memory in asset returns, which have generally found weak or no evidence of long memory. Henry [66], for instance, investigated long-range dependence in nine international stock index returns and found evidence of long memory in four of them, the German, Japanese, South Korean, and Taiwanese markets, but not for the markets of the UK, USA, Hong Kong, Singapore, and Australia. (See also Aydogan and Booth [67].) Furthermore, Serletis and Rosenberg [68] analyzed daily data on four US stock market indices and concluded that US stock market returns display antipersistence. Such long-range dependence implies that the behaviour of stock returns is inconsistent with the efficient market hypothesis, which asserts that the returns of a stock market are unpredictable from previous price changes [69, 70].

Moreover, we observe that the correlations across the three stock returns give strong evidence of time variation. The last column of Table 3 presents the estimates of the DCC parameters. We note that the estimates are statistically significant at the 1% significance level, meaning that the correlation is significantly time varying. The DCC parameter estimates imply highly persistent correlations; nevertheless, they satisfy the stationarity condition $a+b<1$. Thus, the model is mean reverting, and the conditional correlation matrix is positive semidefinite. Apart from the tables, we compute the Lagrange multiplier statistic proposed by Tse [49] for the constant conditional correlation test in the trivariate model, which is significant with a *P*-value of 0.0001. There is therefore evidence against time-invariant correlations among the selected stock returns. In Figure 3 (R11, R12, and R13 are, respectively, the returns series of NAS, NYA, and RUA), we observe a slow decay of the autocorrelation functions, which indicates the presence of long memory behaviour. The plots of the conditional variances from the BEKK-GARCH and CCC-GARCH models are shown in Figures 4 and 5, respectively.


To check the goodness of fit of our models, we consider several diagnostic tests on the standardized residuals: the Ljung-Box test for 12th-order serial correlation, applied to both the standardized and the squared standardized residuals (the latter as a check for remaining heteroscedasticity), as well as the Jarque-Bera test and the skewness and kurtosis coefficients to assess the normality of the standardized residuals. From Table 3, the multivariate portmanteau test shows that the hypothesis of no residual autocorrelation is rejected only for the full and diagonal BEKK(1,1,1) models at the 5% level of significance. From Table 4, we can see that, for most series, the hypothesis of uncorrelated standardized and squared standardized residuals is well supported, indicating no statistically significant evidence of misspecification. The skewness and kurtosis coefficients indicate that the standardized residuals are still not normally distributed, which is confirmed by the Jarque-Bera test. Finally, the hypothesis of no conditional heteroscedasticity is not rejected for any of the series at the 5% level of significance.

##### 5.3. Forecasting Performance of Estimated Models

The DCC model performs better than the other models in terms of information criteria. In order to assess the out-of-sample forecasting performance of the diagonal BEKK, CCC, and DCC models, we use the root mean square error (RMSE) and the mean absolute error (MAE) as the two comparison criteria. We selected an out-of-sample forecast data set consisting of the last 1000 observations of the original data, reestimated the models each time a new observation was added, and obtained the $k$-day-ahead forecasts for $k=1$, 3, and 5. The results are shown in Table 5. They indicate that the forecasting errors grow with the forecast horizon. Moreover, the forecasts are more accurate for the DCC model than for the other models, regardless of which criterion is adopted; this holds for both the RMSE and the MAE (the DCC model has the lowest RMSE and MAE). In addition, the predictions of the CCC model perform even worse than those of the diagonal BEKK. The results thus provide evidence in favour of the predictive superiority of the DCC model over the diagonal BEKK and CCC models; we conclude that the forecasts from the DCC model are significantly better than those of the other models.

#### 6. Conclusion

The aim of the present paper was to study the dynamic modelling of US stock returns. We considered the multivariate GARCH framework to model the time-varying covariance matrices of a process exhibiting long-term dependence and used it to produce out-of-sample forecasts. In particular, we examined the persistence phenomenon in the first conditional moment of daily stock returns; the robustness of the results was also investigated by allowing the innovations to be generated by a multivariate GARCH process. As an illustration, we applied our models to a trivariate system. The estimated parameters show that the returns series are characterized by long memory behaviour and time-varying correlations. These results are statistically significant and contrast with most studies of long memory in returns series, which report weak or no evidence of long-term dependence. Using daily returns of the Nasdaq 100, New York Stock Exchange composite, and Russell 3000, the results provide evidence that the DCC model outperforms the other models in estimating and forecasting covariance matrices in the out-of-sample analysis.

#### Acknowledgments

I would like to thank Anne Peguin and Mohamed Boutahar for their useful comments on an earlier version of this paper. I'm especially grateful to the anonymous reviewer and the editor (P. K. Narayan) for their detailed comments and numerous suggestions that helped to improve the paper. Any errors and/or omissions are, however, my own.