Abstract

Seasonal Autoregressive Fractionally Integrated Moving Average (SARFIMA) models are used to analyze time series that exhibit seasonal long memory. Two methods have been proposed to estimate the parameters of SARFIMA models: the conditional sum of squares (CSS) method and the two-staged method introduced by Hosking (1984). However, no simulation study of these methods has been conducted in the literature, so it is not known how they behave under different parameter settings and sample sizes in SARFIMA models. The aim of this study is to show the behavior of these methods by a simulation study. Based on the simulation results, the advantages and disadvantages of both methods under different parameter settings and sample sizes are discussed by comparing the root mean square error (RMSE) obtained by the CSS and two-staged methods. The comparison shows that the CSS method produces better results than the two-staged method.

1. Introduction

In recent years, there have been many studies of Autoregressive Fractionally Integrated Moving Average (ARFIMA) models in the literature. However, most real-life time series may exhibit seasonality in addition to long-term structure. Therefore, SARFIMA models have been introduced to model such time series. In general, a SARFIMA $(p,d,q)(P,D,Q)_s$ process is given in the following form:

$$\phi(B)\Phi(B)(1-B)^d(1-B^s)^D X_t = \Theta(B)\theta(B)e_t, \tag{1.1}$$

where $X_t$ is a time series, $B$ is the backshift operator such that $B^i X_t = X_{t-i}$, $s$ is the seasonal lag, $d$ and $D$ are the nonseasonal and seasonal fractional differencing parameters, respectively, $e_t$ is a white noise process with normal distribution $N(0,\sigma_e^2)$, and $\phi(B)$, $\Phi(B)$, $\theta(B)$, and $\Theta(B)$ are given by

$$\begin{aligned}
\phi(B) &= 1-\phi_1 B-\cdots-\phi_p B^{p}, & \theta(B) &= 1+\theta_1 B+\cdots+\theta_q B^{q},\\
\Phi(B) &= 1-\Phi_1 B^{s}-\cdots-\Phi_P B^{Ps}, & \Theta(B) &= 1+\Theta_1 B^{s}+\cdots+\Theta_Q B^{Qs},
\end{aligned} \tag{1.2}$$

where $p,q$ and $P,Q$ are the orders of the nonseasonal and seasonal parameters, respectively.

Baillie [1] and Hassler and Wolters [2] examined the basic characteristics of ARFIMA models, while significant contributions to SARFIMA models were presented by Giraitis and Leipus [3], Arteche and Robinson [4], Chung [5], Velasco and Robinson [6], Giraitis et al. [7], and Haye [8]. Different parameter estimation methods have been compared by simulation studies in the literature [9–11], both when all parameters in (1.1) are different from zero and when some of the orders $p,q,P,Q$ are equal to zero.

Seasonal long-term structure exists in time series from various fields, such as the cumulative money series in Porter-Hudak [12], the IBM input series in Ray [13], and the Nile River data in Montanari et al. [14]. Candelon and Gil-Alana [15] forecasted the industrial production index of countries in South America by employing SARFIMA models. Gil-Alana [16] found that the GDP series of Germany, Italy, and Denmark had a structure suitable for SARFIMA models.

Brietzke et al. [17] utilized the Durbin-Levinson algorithm for the model with $p=q=P=Q=d=0$. Ray [13] modified the method proposed by Hosking [18] and used this modified method for a special SARFIMA process having two different seasonal difference parameters. Darné et al. [19] adapted the method proposed for ARFIMA by Chung and Baillie [20] to SARFIMA models. However, the properties of the CSS method employed in Darné et al. [19] have not yet been examined by a simulation study.

Arteche and Robinson [4] introduced a semiparametric method based on spectral density functions for estimating the parameters of the SARFIMA model in the case of $d=0$. The GPH method used in ARFIMA was extended to SARFIMA models with $p=q=P=Q=d=0$ by Porter-Hudak [12], and the GPH estimator was modified by Ooms and Hassler [21]. Also, simulation studies for different values of $d$, $D$, $s$, and sample size have been conducted using GPH, Whittle, and Exact Maximum Likelihood (EML) by Reisen et al. [9, 10] and Palma and Chan [11]. In addition to these studies, many methods for determining seasonal long-term structure have been proposed by Hassler and Wolters [22], Gil-Alaña and Robinson [23], Arteche [24], and Gil-Alana [25, 26].

We examine the properties of the CSS and two-staged estimation methods by a simulation study in which both methods are compared under various parameter settings and sample sizes. In the simulation study, a specific form of model (1.1) in which $p$, $d$, and $q$ are equal to zero is examined using both the CSS and two-staged estimation methods. This model can also be expressed as SARFIMA $(P,D,Q)_s$. After the simulation study was conducted, the results obtained from the CSS and two-staged estimation methods were compared, and it was observed that better results are obtained when the CSS method is employed.

The outline of this study is as follows. Section 2 contains brief information related to SARFIMA models. The two-staged method and the CSS method are explained in Sections 3 and 4, respectively. The outline of the simulation study and the results are given in Section 5. Finally, the results obtained from the simulation study are summarized in the last section.

2. SARFIMA Models

When $p$, $q$, $d$, $P$, and $Q$ are set to zero in model (1.1), the model is called the Seasonal Fractionally Integrated (SFI) model. The SFI model was first introduced by Arteche and Robinson [4], and basic information about the model can be found in Baillie [1]. The SFI model is given by

$$(1-B^s)^D X_t = e_t. \tag{2.1}$$

The infinite moving average representation of model (2.1) is as follows:

$$X_t = \Psi(B^s)e_t = \sum_{k=0}^{\infty}\psi_k e_{t-sk}, \tag{2.2}$$

where $\psi_k = \Gamma(k+D)/(\Gamma(D)\Gamma(k+1))$ and $\psi_k \sim k^{D-1}/\Gamma(D)$ as $k\to\infty$.

The infinite autoregressive representation of model (2.1) is as follows:

$$\Pi(B^s)X_t = \sum_{k=0}^{\infty}\pi_k X_{t-sk} = e_t, \tag{2.3}$$

where $\pi_k = \Gamma(k-D)/(\Gamma(-D)\Gamma(k+1))$ and $\pi_k \sim k^{-D-1}/\Gamma(-D)$ as $k\to\infty$.
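The weights in (2.2) and (2.3) can be computed without evaluating gamma functions, because the ratio of consecutive terms is elementary: $\psi_k = \psi_{k-1}(k-1+D)/k$ and $\pi_k = \pi_{k-1}(k-1-D)/k$, with $\psi_0=\pi_0=1$. The following is a minimal sketch of that recursion; the function name `frac_weights` and its `kind` argument are illustrative choices, not part of the original study.

```python
def frac_weights(D, n_terms, kind="ma"):
    """First n_terms weights of the seasonal fractional filter.

    kind="ma" gives psi_k, the coefficients of (1 - z)^(-D) in (2.2);
    kind="ar" gives pi_k, the coefficients of (1 - z)^(D) in (2.3).
    The k-th weight multiplies the observation at lag s*k.
    """
    if kind not in ("ma", "ar"):
        raise ValueError("kind must be 'ma' or 'ar'")
    sign = 1.0 if kind == "ma" else -1.0
    weights = [1.0]  # psi_0 = pi_0 = 1
    for k in range(1, n_terms):
        # Ratio of consecutive gamma-function expressions.
        weights.append(weights[-1] * (k - 1 + sign * D) / k)
    return weights


psi = frac_weights(0.3, 8, "ma")
pi = frac_weights(0.3, 8, "ar")
```

Because the two filters are inverses of each other, the convolution of `psi` and `pi` is 1 at lag 0 and 0 at every later lag, which gives a quick sanity check on either sequence.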

For model (2.1), the autocovariance and autocorrelation functions can be written, respectively, as follows:

$$\gamma(sk) = \frac{(-1)^k\,\Gamma(1-2D)}{\Gamma(k-D+1)\,\Gamma(1-k-D)}\,\sigma_e^2, \quad k=1,2,\ldots, \tag{2.4}$$

$$\rho(sk) = \frac{\Gamma(1-D)\,\Gamma(k+D)}{\Gamma(D)\,\Gamma(k-D+1)}, \quad k=1,2,\ldots, \tag{2.5}$$

and, as $k\to\infty$,

$$\rho(sk) \sim \frac{\Gamma(1-D)}{\Gamma(D)}\,k^{2D-1}. \tag{2.6}$$

For model (2.1), the spectral density function is as follows:

$$f(\omega) = \frac{\sigma_e^2}{2\pi}\left[2\sin\left(\frac{s\omega}{2}\right)\right]^{-2D}, \quad 0<\omega\le\pi. \tag{2.7}$$

Note that the spectral density function is infinite at the frequencies $2\pi\nu/s$, $\nu=1,\ldots,[s/2]$.
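A direct evaluation of (2.4) involves gamma functions at negative arguments, so a recursion is often more convenient. The sketch below starts from $\gamma(0)=\sigma_e^2\,\Gamma(1-2D)/\Gamma(1-D)^2$ and uses the ratio implied by (2.5), $\gamma(sk)=\gamma(s(k-1))\,(k-1+D)/(k-D)$. It sets the autocovariance to zero at lags that are not multiples of $s$, which follows from (2.2) because $X_t$ only involves innovations at lags $0, s, 2s, \ldots$. The function name `sfi_autocov` is hypothetical, and $D<0.5$ is assumed.

```python
import math


def sfi_autocov(D, s, max_lag, sigma2=1.0):
    """Autocovariances gamma(0..max_lag) of (1 - B^s)^D X_t = e_t.

    Assumes D < 0.5 so that the variance is finite. Entries at lags
    that are not multiples of s are left at zero.
    """
    gam = [0.0] * (max_lag + 1)
    g = sigma2 * math.gamma(1 - 2 * D) / math.gamma(1 - D) ** 2  # gamma(0)
    k = 0
    while s * k <= max_lag:
        gam[s * k] = g
        # Step from gamma(s*k) to gamma(s*(k+1)).
        g *= (k + D) / (k + 1 - D)
        k += 1
    return gam


gam = sfi_autocov(D=0.3, s=4, max_lag=12)
rho = [g / gam[0] for g in gam]  # autocorrelations, as in (2.5)
```

For example, the first seasonal autocorrelation `rho[4]` equals $D/(1-D)$, the value (2.5) gives at $k=1$.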

When $p$, $q$, $d$, $D$, $P$, and $Q$ are different from zero in model (1.1), a closed form for the autocovariances cannot be determined. However, methods employed to calculate the autocovariances of ARFIMA models, such as the splitting method presented by Bertelli and Caporin [27], can also be used for SARFIMA models.

Let $\gamma_1(\cdot)$ denote the autocovariance function of SARFIMA $(p,d,q)(P,D,Q)_s$ models. With the splitting method, the autocovariances are calculated as follows:

$$\gamma_1(k) = \sum_{h=-m}^{m}\gamma_2(h)\,\gamma_3(k-h), \tag{2.8}$$

where $\gamma_2(\cdot)$ and $\gamma_3(\cdot)$ are the autocovariance functions of the SARFIMA $(p,0,q)(P,0,Q)_s$ and SARFIMA $(0,d,0)(0,D,0)_s$ models, respectively. $\gamma_3(\cdot)$ is in turn calculated by the splitting method as

$$\gamma_3(k) = \sum_{h=-m}^{m}\gamma_4(h)\,\gamma_5(k-h), \tag{2.9}$$

where $\gamma_4(\cdot)$ and $\gamma_5(\cdot)$ are the autocovariance functions of the SARFIMA $(0,0,0)(0,D,0)_s$ and SARFIMA $(0,d,0)\times(0,0,0)_s$ models, respectively. The closed form for $\gamma_4(\cdot)$ is given in (2.4). $\gamma_5(\cdot)$ is the autocovariance function of a fractionally integrated process, and its closed form is given by [28] as follows:

$$\gamma_5(k) = \frac{(-1)^k\,\Gamma(1-2d)}{\Gamma(k-d+1)\,\Gamma(1-k-d)}\,\sigma_e^2. \tag{2.10}$$

To generate series that follow SARFIMA $(P,D,Q)_s$ models, the following algorithm is applied.

Step 1. Generate a random vector $\mathbf{Z}=(z_1,\ldots,z_n)^T$ with standard normal distribution.

Step 2. Obtain the matrix $\boldsymbol{\Sigma}_n=[\gamma(i-j)]$, $i,j=1,\ldots,n$, by using expression (2.4).

Step 3. Split the covariance matrix as $\boldsymbol{\Sigma}=\mathbf{L}\mathbf{L}^T$, where $\mathbf{L}$ is a lower triangular matrix.

This factorization is called the Cholesky decomposition. It can be obtained for any symmetric positive definite matrix. Note that the matrix $\boldsymbol{\Sigma}_n$ is symmetric and positive definite.

Step 4. Obtain the series $\mathbf{X}=(X_1,\ldots,X_n)^T$ from the formula $\mathbf{X}=\mathbf{L}\mathbf{Z}$. The series $\mathbf{X}$ follows the SARFIMA $(0,D,0)_s$ model.

Step 5. Generate a series according to the SARMA $(P,Q)_s$ model by taking $\mathbf{X}=(X_1,\ldots,X_n)$ as the error series. In this way, the newly generated series has the structure of SARFIMA $(P,D,Q)_s$. This algorithm is easily extended to the SARFIMA $(p,d,q)(P,D,Q)_s$ model.
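Steps 1 to 5 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the code used in the study: it handles only $P, Q \le 1$ (a single $\Phi$ and a single $\Theta$), takes $\sigma_e^2=1$, starts the SARMA recursion from zero presample values, and uses hypothetical function names (`sfi_autocov`, `cholesky`, `simulate_sarfima`). The Cholesky routine is a textbook $O(n^3)$ version, so a numerical library would be preferable for long series.

```python
import math
import random


def sfi_autocov(D, s, max_lag, sigma2=1.0):
    """Autocovariances of (1 - B^s)^D X_t = e_t at lags 0..max_lag."""
    gam = [0.0] * (max_lag + 1)
    g = sigma2 * math.gamma(1 - 2 * D) / math.gamma(1 - D) ** 2
    k = 0
    while s * k <= max_lag:
        gam[s * k] = g
        g *= (k + D) / (k + 1 - D)
        k += 1
    return gam


def cholesky(a):
    """Lower triangular L with a = L L^T (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def simulate_sarfima(n, D, s, phi=0.0, theta=0.0, seed=None):
    """Simulate SARFIMA (P,D,Q)_s with P, Q <= 1 (illustrative sketch)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]                      # Step 1
    gam = sfi_autocov(D, s, n - 1)
    sigma = [[gam[abs(i - j)] for j in range(n)] for i in range(n)]  # Step 2
    low = cholesky(sigma)                                            # Step 3
    x = [sum(low[i][k] * z[k] for k in range(i + 1))                 # Step 4
         for i in range(n)]
    y = []
    for t in range(n):                                               # Step 5
        # (1 - phi B^s) Y_t = (1 + theta B^s) X_t, zero presample values.
        ar = phi * y[t - s] if t >= s else 0.0
        ma = theta * x[t - s] if t >= s else 0.0
        y.append(ar + x[t] + ma)
    return y


series = simulate_sarfima(n=48, D=0.2, s=4, phi=0.3, seed=0)
```

Because the SARMA recursion starts from zeros, the first few seasons carry a start-up transient; discarding an initial stretch of the output is one common way to reduce its effect.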

3. The Two-Staged Method

The two-staged method can be used to estimate the parameters of the SARFIMA $(P,D,Q)_s$ model. In the first phase of this method, it is assumed that the time series follows the SARFIMA $(0,D,0)_s$ model, and the seasonal fractional differencing parameter $D$ is estimated. The EML method given below can be employed to estimate this parameter.

The theoretical autocovariance and autocorrelation functions of the SARFIMA $(0,D,0)_s$ model are shown in (2.4) and (2.5), respectively. Let the time series $X_t$ have $n$ observations $(x_1,\ldots,x_n)$, and let $\boldsymbol{\Omega}$ represent the autocorrelation matrix of $x_1,\ldots,x_n$. The likelihood function of $x_1,\ldots,x_n$ is then as follows:

$$L(D) = (2\pi)^{-n/2}\,|\boldsymbol{\Omega}|^{-1/2}\exp\left\{-\frac{1}{2}\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{X}\right\}. \tag{3.1}$$

In calculating the likelihood function, the Cholesky decomposition is used to write the matrix $\boldsymbol{\Omega}$ as the product of a lower and an upper triangular matrix. Instead of inverting the $n\times n$ matrix $\boldsymbol{\Omega}$ directly, only the triangular factors are inverted, which reduces the computational difficulty and the calculation time. The Cholesky decomposition of the matrix $\boldsymbol{\Omega}$ is written as follows:

$$\boldsymbol{\Omega} = \mathbf{L}\mathbf{L}'. \tag{3.2}$$

Let $\mathbf{W}=\mathbf{L}^{-1}\mathbf{X}$. Then

$$\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{X} = \mathbf{X}'(\mathbf{L}\mathbf{L}')^{-1}\mathbf{X} = (\mathbf{L}^{-1}\mathbf{X})'(\mathbf{L}^{-1}\mathbf{X}) = \mathbf{W}'\mathbf{W}, \qquad |\boldsymbol{\Omega}|^{-1/2} = |\mathbf{L}\mathbf{L}'|^{-1/2} = |\mathbf{L}|^{-1}. \tag{3.3}$$

Thus, (3.1) can be rewritten as

$$L(D) = (2\pi)^{-n/2}\,|\mathbf{L}|^{-1}\exp\left\{-\frac{1}{2}\mathbf{W}'\mathbf{W}\right\}. \tag{3.4}$$

The likelihood function given in (3.4) is maximized with respect to the seasonal fractional differencing parameter by using an optimization algorithm. After the seasonal fractional differencing parameter has been estimated by EML, the remaining parameters of the SARMA $(P,Q)_s$ model are estimated in the second phase by using the classic method. In the second phase, the order of the seasonal model can be determined by using the Box-Jenkins approach. The two-staged method can therefore be summarized as follows.
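The logarithm of (3.4) can be evaluated in a single pass: each row of $\mathbf{L}$ is computed, used for one step of forward substitution to obtain $\mathbf{W}$, and its diagonal entry is added to $\log|\mathbf{L}|$. The sketch below makes several assumptions that are not in the text: $\boldsymbol{\Omega}$ is filled with the autocovariances (2.4) for $\sigma_e^2=1$, a coarse grid over $0<D<0.5$ stands in for the optimization algorithm, and the names `sfi_loglik` and `estimate_D` are hypothetical.

```python
import math


def sfi_autocov(D, s, max_lag, sigma2=1.0):
    """Autocovariances of (1 - B^s)^D X_t = e_t at lags 0..max_lag."""
    gam = [0.0] * (max_lag + 1)
    g = sigma2 * math.gamma(1 - 2 * D) / math.gamma(1 - D) ** 2
    k = 0
    while s * k <= max_lag:
        gam[s * k] = g
        g *= (k + D) / (k + 1 - D)
        k += 1
    return gam


def sfi_loglik(x, D, s):
    """Gaussian log-likelihood log L(D) computed as in (3.2)-(3.4)."""
    n = len(x)
    gam = sfi_autocov(D, s, n - 1)
    low = [[0.0] * n for _ in range(n)]
    w = [0.0] * n
    log_det_l = 0.0
    for i in range(n):
        # Row i of the Cholesky factor of Omega = [gamma(i - j)].
        for j in range(i + 1):
            acc = gam[i - j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
        # Forward substitution: row i of L w = x, so w = L^{-1} x.
        w[i] = (x[i] - sum(low[i][k] * w[k] for k in range(i))) / low[i][i]
        log_det_l += math.log(low[i][i])
    return (-0.5 * n * math.log(2 * math.pi) - log_det_l
            - 0.5 * sum(v * v for v in w))


def estimate_D(x, s, grid=None):
    """Phase 1 stand-in: pick the grid value of D with the largest log L."""
    if grid is None:
        grid = [i / 100 for i in range(1, 50)]  # 0.01, ..., 0.49
    return max(grid, key=lambda D: sfi_loglik(x, D, s))
```

Working on the log scale avoids underflow of the exponential for long series, and no explicit inverse of $\mathbf{L}$ is ever formed.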

Phase 1. Estimate the parameter $D$ by assuming that the time series follows the SARFIMA $(0,D,0)_s$ model.

Phase 2. Estimate seasonal autoregressive and moving average parameters by using the Box-Jenkins methodology.

4. The CSS Method

Chung and Baillie [20] proposed a method based on minimization of the conditional sum of squares. This method can be used for SARFIMA $(p,d,q)(P,D,Q)_s$ models. The conditional sum of squares criterion for the SARFIMA model is as follows:

$$S = \frac{1}{2}\log\left(\sigma_e^2\right) + \frac{1}{2\sigma_e^2}\sum_{t=1}^{n}\left\{\theta^{-1}(B)\,\Theta^{-1}(B)\,\phi(B)\,\Phi(B)\,(1-B)^d(1-B^s)^D X_t\right\}^2. \tag{4.1}$$

In the CSS method, seasonal fractional differencing is first applied to $X_t$. Second, fractional differencing is applied to $(1-B^s)^D X_t$. Third, SARMA filtering is applied to $(1-B)^d(1-B^s)^D X_t$. The conditional sum of squares is then obtained, for fixed values of $\sigma_e^2$ and $D$, as the sum of squares of the resulting series $\theta^{-1}(B)\,\Theta^{-1}(B)\,\phi(B)\,\Phi(B)\,(1-B)^d(1-B^s)^D X_t$. Chung and Baillie [20] also emphasize that the parameter estimates obtained by the CSS method have less bias when the mean of the series is known. The CSS method is easy to use because it does not require the calculation of autocovariances. In the literature, the CSS method for the SARFIMA $(P,D,Q)_s$ model has been used only by Darné et al. [19].
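For the SARFIMA $(P,D,Q)_s$ case with $P,Q\le 1$, the filtering steps above reduce to two short recursions. The sketch below is an illustration under assumptions that are not stated in the text: presample values of the series and of the residuals are set to zero, the series is taken to have known mean zero, and the function name `css_sum_of_squares` is hypothetical. It returns only the sum of squares in (4.1), which is the part that depends on $D$, $\Phi$, and $\Theta$ for a fixed $\sigma_e^2$.

```python
def css_sum_of_squares(x, D, s, phi=0.0, theta=0.0):
    """Conditional sum of squares for SARFIMA (P,D,Q)_s with P, Q <= 1."""
    n = len(x)
    # AR weights pi_k of (1 - B^s)^D, as in (2.3), truncated at the sample start.
    pi = [1.0]
    for k in range(1, (n - 1) // s + 1):
        pi.append(pi[-1] * (k - 1 - D) / k)
    # Seasonal fractional differencing: u_t = sum_k pi_k x_{t - s k}.
    u = [sum(pi[k] * x[t - s * k] for k in range(t // s + 1)) for t in range(n)]
    # SARMA filtering: (1 + theta B^s) e_t = (1 - phi B^s) u_t.
    e = []
    for t in range(n):
        u_lag = u[t - s] if t >= s else 0.0
        e_lag = e[t - s] if t >= s else 0.0
        e.append(u[t] - phi * u_lag - theta * e_lag)
    return sum(v * v for v in e)


ss = css_sum_of_squares([1.0, 2.0, 3.0, 4.0], D=0.0, s=2, phi=0.5)
```

Estimates would then be obtained by passing this function to a numerical minimizer over $(D,\Phi,\Theta)$. No autocovariance matrix is built, which is why the cost per evaluation is much lower than for the likelihood in Section 3.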

5. Simulation Study

In this section, the parameters of the SARFIMA $(P,D,Q)_s$ model are estimated by using the CSS and the two-staged methods separately under different parameter settings and sample sizes. The advantages and disadvantages of both methods are also discussed.

The algorithm whose steps are given in Section 2 is used to generate various SARFIMA $(P,D,Q)_s$ models. The simulation study focuses on the SARFIMA $(1,D,0)_s$ and SARFIMA $(0,D,1)_s$ models. For the SARFIMA $(1,D,0)_s$ model, 36 different cases are examined, combining the seasonal fractional differencing parameter $D=0.1, 0.2, 0.3$, the seasonal autoregressive parameter $\Phi=0.3, 0.7, -0.3, -0.7$, the sample sizes $n=120, 240, 360$, and the period $s=4$. The same settings are used for the SARFIMA $(0,D,1)_s$ model with $\Theta=0.3, 0.7, -0.3, -0.7$. For each case, 1000 time series are generated, so 72000 time series are generated in total. The parameters of the generated time series are estimated by using both the CSS and two-staged methods, and the results are summarized in Tables 1 and 2. For each set of 1000 time series, the mean, standard deviation, and root mean square error (RMSE) of the estimated parameters are reported in these tables. The RMSE values are computed by

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{1000}\left(\beta_i-\hat{\beta}_i\right)^2}{1000}}, \tag{5.1}$$

where $\beta_i$ and $\hat{\beta}_i$ denote the true and estimated values of the parameter, respectively.
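The design grid and the summary statistics can be written down compactly. The sketch below enumerates the 36 cases per model and computes the mean, standard deviation, and RMSE of (5.1) for one case. The function name `summarize` is illustrative, and the use of the sample standard deviation (divisor $n-1$) is an assumption, since the text does not say which divisor the tables use.

```python
import math
import statistics
from itertools import product

# Design of the study: D x (Phi or Theta) x n, with period s = 4.
D_VALUES = (0.1, 0.2, 0.3)
COEF_VALUES = (0.3, 0.7, -0.3, -0.7)
SAMPLE_SIZES = (120, 240, 360)
cases = list(product(D_VALUES, COEF_VALUES, SAMPLE_SIZES))


def summarize(true_value, estimates):
    """Mean, sample standard deviation, and RMSE (5.1) of the estimates."""
    mean = statistics.mean(estimates)
    sd = statistics.stdev(estimates)
    rmse = math.sqrt(
        sum((true_value - b) ** 2 for b in estimates) / len(estimates)
    )
    return mean, sd, rmse


mean, sd, rmse = summarize(0.2, [0.1, 0.3])
```

With 36 cases for each of the two models and 1000 replications per case, the loop over `cases` accounts for the 72000 generated series.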

Table 1 shows the simulation results of the CSS and two-staged methods for the SARFIMA $(1,D,0)_s$ model with different values of $\Phi$ and sample size $n$. For the CSS method, the RMSE values of the estimated seasonal fractional differencing and seasonal autoregressive parameters decrease sharply as the sample size increases. The RMSE values do not change much with the sign of the seasonal autoregressive parameter. When the absolute value of the seasonal autoregressive parameter is larger, the RMSE values of the seasonal autoregressive parameter (RMSE($\Phi$)) are smaller. When $D=0.1$ and $D=0.2$ are compared, the values of RMSE($D$) for $D=0.1$ are smaller than those for $D=0.2$, whereas the values of RMSE($\Phi$) for $D=0.1$, $D=0.2$, and $D=0.3$ are close to each other. Note that the values of RMSE($D$) for $D=0.3$ are larger than those for $D=0.1$.

According to Table 1, when the two-staged method is used, the sample size does not significantly affect the RMSE values, especially RMSE($\Phi$) when $\Phi=-0.7$. However, when the absolute value of the seasonal autoregressive parameter increases, the values of RMSE($\Phi$) increase dramatically for $D=0.1$ and $D=0.2$. The values of RMSE($D$) are affected by neither the sign nor the magnitude of the seasonal autoregressive parameter, especially for $D=0.1$. It is worth pointing out that the values of RMSE($D$) are considerably larger for negative values of the seasonal autoregressive parameter for both $D=0.2$ and $D=0.3$. The comparison between $D=0.1$ and $D=0.2$ suggests that, for negative values of the seasonal autoregressive parameter, both RMSE($D$) and RMSE($\Phi$) increase gradually as $D$ increases. In particular, RMSE($\Phi$) takes its largest values for $D=0.3$ when the seasonal autoregressive parameter is negative. Therefore, for negative values of the seasonal autoregressive parameter, the estimation error grows as the order of seasonal fractional differencing increases.

Table 2 shows the simulation results of the CSS and two-staged methods for the SARFIMA $(0,D,1)_s$ model with different values of $\Theta$ and sample size $n$. For the CSS method, the RMSE values of the estimated seasonal fractional differencing and seasonal moving average parameters decrease as the sample size increases. The RMSE values do not change much with the sign of the seasonal moving average parameter. When the seasonal moving average parameter is negative and larger, the RMSE values of the seasonal fractional differencing parameter (RMSE($D$)) are smaller. When $D=0.1$ is compared with $D=0.2$, the values of RMSE($D$) for $D=0.1$ are smaller than those for $D=0.2$, whereas the values of RMSE($\Theta$) for $D=0.1$, $D=0.2$, and $D=0.3$ are close to each other. Note that the values of RMSE($D$) for $D=0.3$ are larger than those for $D=0.1$.

Table 2 also shows that, for the two-staged method, the values of RMSE($\Theta$) decrease as the sample size increases. However, there is no positive or negative relationship between the value of the seasonal moving average parameter and the values of RMSE($D$) and RMSE($\Theta$) when the two-staged method is used. We remark that RMSE($\Theta$) takes its smallest value at each sample size for $\Theta=0.7$, and that the values of RMSE($D$) are considerably larger for negative values of the seasonal moving average parameter than for positive values when $D=0.2$ and $D=0.3$.

6. Discussions

In the literature, the two-staged method is widely used to estimate the parameters of SARFIMA models. Although there is another method called CSS, it has been used for the SARFIMA model only by Darné et al. [19]. In this study, the CSS and the two-staged methods are employed to estimate the parameters of SARFIMA models in a simulation study, and in this way the properties of the two methods are examined under different parameter settings and sample sizes.

From the results of the simulation, we deduce that the CSS method gives more accurate estimates as the sample size increases. We can also infer that when the seasonal autoregressive parameter in the SARFIMA $(1,D,0)_4$ model gets close to 1 or $-1$, the parameter estimates of the CSS method have less error. The CSS method produces quite good estimates of $D$ when the seasonal autoregressive parameter in the SARFIMA $(1,D,0)_4$ model and the seasonal moving average parameter in the SARFIMA $(0,D,1)_4$ model are positive.

When the CSS method is compared with the two-staged method, the CSS method has lower RMSE values under different parameter settings and sample sizes, especially in autoregressive models. The two-staged method generates misleading results when $\Phi$ is chosen near $-1$ ($\Phi=-0.7$). This is not the case for the CSS method. Based on the results obtained and the simplicity of the method, we suggest that the CSS method be preferred to the two-staged method for parameter estimation in SARFIMA models in future studies.