Abstract

This paper provides an application of generalized space-time autoregressive (GSTAR) model on GDP data in West European countries. Preliminary model is identified by space-time ACF and space-time PACF of the sample, and model parameters are estimated using the least square method. The forecast performance is evaluated using the mean of squared forecast errors (MSFEs) based on the last ten actual data. It is found that the preliminary model is GSTAR(2;1,1). As a comparison, the estimation and the forecast performance are also applied to the GSTAR(1;1) model which has fewer parameter. The results showed that the ASFE of GSTAR(2;1,1) is smaller than that of the order (1;1). However, the t-test value shows that the performance is significantly indifferent. Thus, due to the parsimony principle, the GSTAR(1;1) model might be considered as a forecasting model.

1. Introduction

Space-time data are frequently found in many areas of research, for example, monthly tea production from some plants, yearly housing price at capital cities, and yearly per capita GDP (gross domestic product) of several countries in some region. The generalized space-time autoregressive model of order (𝑝;𝜆1,,𝜆𝑝), shortened by GSTAR(𝑝;𝜆1,…,𝜆𝑝), is one of space-time models characterized by autoregressive terms lagged in the 𝑝th order in time and the order of (𝑝;𝜆1,,𝜆𝑝) in space [1].

The term of generalization is associated with the model parameters. When a parameter matrix is diagonal, the GSTAR model is the same as space time autoregressive (STAR) model given by Martin and Oeppen [2] and Pfeifer and Deutsch [3]. The notion of generalization has also been used by Terzi [4] who also generalizes STAR models but in a different context. He generalized the STAR(1;1) by adding the contemporaneous spatial correlation but still preserved the scalar parameters.

When 𝑝=1 and 𝜆1=1, GSTAR(1;1) is called the first order of GSTAR model. The model has interpretation that the current observation in a certain location only depends on the immediate past observations recorded at the location of interest and at its nearest neighbourhood [5]. The order (1;1) is the simplest natural assumption if one wants to forecast future observations in a certain location. The STAR(1;1) model is another simple space time model which also has the same interpretation as GSTAR(1;1) model. However, contrary to the generalization model, its parameters of each spatial order are assumed to be the same for all location though; there is no a priori justification for this assumption [1]. Parameters of GSTAR model can be estimated by the method of least square. This method has been used to model the monthly oil production [5] and to model the monthly tea production [1]. However, when the model was applied to their data, none of the papers included a description about how to assess the model based on forecasting performance which is an important step when the modelling purpose is to build a forecasting model. In this paper, we attempt to put in the idea to optimize the goodness of fit in model selection.

This paper is presented as follows. In Section 2, the GSTAR model and the parameters least squares estimation is reviewed with the example given for GSTAR(1;1) and GSTAR(2;1,1). To illustrate the estimator properties for finite sample size, simulation study is discussed in Section 3. In the last section, the model is applied to the per capita GDP ratio data in West European countries for the period 1956–1996. The one step ahead forecasting is performed for each model for the period 1997–2006. As a comparison performance measure, it is used the empirical mean of squared forecast error (MSFE) where forecast error is defined as the difference between the actual value and the forecast value.

2. The Model

Let 𝐙(𝑡)=(𝑍1(𝑡),,𝑍𝑁𝑡) be an 𝑁-dimensional vector process with zero mean with 𝑁 as is a fixed positive integer. GSTAR(𝑝;𝜆1,,𝜆𝑝) process is a space-time process 𝐙(𝑡) which satisfies𝐙(𝑡)=𝑝𝜆𝑘=1𝑘=0𝚽𝑘𝐖()𝐙(𝑡𝑘)+𝐞(𝑡),(2.1) where 𝑝 is the autoregressive order, 𝜆𝑘 is the spatial order of the 𝑘th autoregressive term, 𝐖()=(𝑤()𝑖𝑗) is an 𝑁×𝑁 matrix of spatial weight for the spatial order which has a zero diagonal, sum of each row is equal to one, and matrix 𝐖(0) is defined as the identity matrix I. An 𝑁×𝑁 matrix Φ𝑘 is a diagonal parameter matrix of temporal lag 𝑘 and spatial lag with the diagonal element(𝜙(1)𝑘,,𝜙(𝑁)𝑘). Finally, 𝐞(𝑡) is an error vector at time 𝑡 which is assumed to be independent normal with zero mean and constant variance.

Model parameters 𝜙(1)𝑘,,𝜙(𝑁)𝑘 for 𝑘=1,2, and =1,2, can be estimated by the least squares (LS) estimation. The procedure estimation and the asymptotic properties of LS estimators have been discussed extensively in Borovkova et al. [1]. The following examples are an illustration on how to find the LS estimator for GSTAR(1;1) and GSTAR(2;1,1) models, respectively.

Example 2.1 (LS estimation for GSTAR(1;1) model). From (2.1), GSTAR(1;1) model can be expressed as 𝐙(𝑡)=𝚽10𝐙(𝑡1)+𝚽11𝐖(1)𝐙(𝑡1)+𝐞(𝑡).(2.2) Suppose observation 𝐙(𝑡)=(𝑍1(𝑡),,𝑍𝑁(𝑡)) is given for 𝑡=1,2,,𝑇. By rearranging the component of 𝐙(𝑡)for 𝑡=1,2,,𝑇, by location then by time, model (2.2) can be expressed as a linear model 𝐙=𝐗𝚽+𝐞,(2.3) where 𝐙=(𝑍1(1),,𝑍1(𝑇),,𝑍𝑁(1),,𝑍𝑁(𝑇)), 𝐗=diag(𝐗1,,𝐗𝑁) with 𝐗𝑖=𝑍𝑖(0)𝑗𝑖𝑤𝑖𝑗𝑍𝑗𝑍(0)𝑖(1)𝑗𝑖𝑤𝑖𝑗𝑍𝑗𝑍(1)𝑖(𝑇1)𝑗𝑖𝑤𝑖𝑗𝑍𝑗(𝑇1),(2.4)Φ=(𝜙(1)10,𝜙(1)11,,𝜙(𝑁)10,𝜙(𝑁)11), and 𝐞(𝑒1(1),,𝑒1(𝑇),,𝑒𝑁(1),,𝑒𝑁(𝑇)).
Then, the LS estimator for parameter matrix Φ is a solution of 𝐗𝐗𝐗𝚽=𝐙.(2.5)

Example 2.2 (LS estimation for GSTAR(2;1,1) model). For order 𝑝=2, 𝜆1 = 1 and 𝜆2 = 1, model (2.1) is called GSTAR(2;1,1) model which can be written as 𝐙(𝑡)=𝚽10𝐙𝚽(𝑡1)+11𝐖(1)𝐙(𝑡1)+𝚽20𝐙(𝑡2)+𝚽21𝐖(2)𝐙(𝑡2).(2.6) By rearranging the component of 𝐙(𝑡)for 𝑡=2,3,,𝑇, by location then by time, model (2.6) can also be expressed as a linear model 𝐙=𝐗Φ+𝐞,(2.7) where Z and e are defined as in Example 2.1, 𝐗=diag(𝐗1,,𝐗𝑁) with 𝐗𝑖=𝑍𝑖(1)𝑗𝑖𝑤(1)𝑖𝑗𝑍𝑗(1)𝑍𝑖(0)𝑗𝑖𝑤(2)𝑖𝑗𝑍𝑗𝑍(0)𝑖(2)𝑗𝑖𝑤(1)𝑖𝑗𝑍𝑗(2)𝑍𝑖(1)𝑗𝑖𝑤(2)𝑖𝑗𝑍𝑗𝑍(1)𝑖(𝑇1)𝑗𝑖𝑤(1)𝑖𝑗𝑍𝑗(𝑇1)𝑍𝑖(𝑇2)𝑗𝑖𝑤(2)𝑖𝑗𝑍𝑗(𝑇2)(2.8) and Φ=(𝜙(1)10,𝜙(1)11,𝜙(1)20,𝜙(1)21,,𝜙(𝑁)10,𝜙(𝑁)11,𝜙(𝑁)20,𝜙(𝑁)21).
Then, the LS estimator for parameter matrix Φ is a solution of 𝐗𝐗𝐗𝚽=𝐙.(2.9)

3. Simulation Study

Under the stationary assumption, the LS estimator for the GSTAR parameters is a consistent estimator [1]. To get insight into the LS properties for finite sample we performed a Monte Carlo simulation with 1000 replications. Artificial data were generated from GSTAR(1;1) model where the error 𝐞(𝑡) is normally distributed with mean 0 and covariance matrix I4. Spatial weight matrix and model parameters that used in the simulation, respectively, were 𝐖(1)=,𝜙00.50.500.5000.50.5000.500.50.50(1)10=0.2,𝜙(2)10=0.5,𝜙(3)10=0.3,𝜙(4)10𝜙=0.2,(1)11=0.4,𝜙(2)11=0.3,𝜙(3)11=0.5,𝜙(4)11=0.7(3.1) such that the parameter matrix for the generated model was Φ10=diag(0.2,0.5,0.3,0.2),Φ11=diag(0.4,0.3,0.5,0.7)(3.2)

The LS estimator vector 𝜙𝜙=((1)10,𝜙(2)10,𝜙(3)10,𝜙(4)10,𝜙(1)11,𝜙(2)11,𝜙(3)11,𝜙(4)11) was calculated using R functions in Algorithm 1. Simulation was carried out for sample size 𝑇= 40, 50, 100, 500, 1000, 10000, and for each 𝑇, simulations were repeated 1000 times to obtain the estimates average𝜙𝑇=110001000𝑟=1𝜙𝑇(𝑟)(3.3)

# =============================================================
# [FUNCTION]: OLS estimation for GSTAR(p;L1,…,Lp) models
# =============================================================
# 3 dimension zeros matrix
# –––––––––––––––-
zeros <-function(m,n,p){
W<-rep(0, m*n*p)
dim(W)<-c(m, n, p)
W}
# “vec” operator
vec<-function(X){
a<-dim(X)
Y<- t(X[1, ])
for (i in 2:a[1]){
Y<- cbind(Y,t(X[i, ]))}
t(Y)}
#––––––––––
# Inverse of matrix
#––––––––––
inv<-function(X){
if(dim(X)[1]! = dim(X)[2]) stop(“THE MATRIX MUST BE SYMMETRIC!!!”)
else{
if (det(X)==0) stop("THE MATRIX IS SINGULAR!!!")
else { n<-dim(X)[1]
solve(X) } } }
# –––––––––––––––––––––––––––––––
# Construction of vector Zi, for each i=1,…,N
# –––––––––––––––––––––––––––
# Construction of vector Zi, for each i=1,…,N
# –––––––––––––––––––––––––––
# suppose x = c(p,L1,..., Lp) represent the model order
Zi<-function(Zt, x){
N<-dim(Zt)[1] #number of sites
T<-dim(Zt)[2]-1 #number of time periods
p<-x [1]
Zi<-matrix(0, T-p+1, N)
for (i in 1: N)
Zi[,i]<-Zt[i,(p+1):(T+1)]
Zi}
# –––––––––––––––––––––––––––––––-
# Construction of matrix Xi, for each i = 1,…,N
# ––––––––––––––––––––––-
Xi<-function(Zt,x)
{N<-dim(Zt)[1] #number of sites
T<-dim(Zt)[2]-1 #number of time periods
p<-x[1]
La<-x[2: length(x)]
r<-lmd+1 # where lmd = the greatest order for weight matrices
WZ<-zeros(N, T, r)
for (k in 1: r)
WZ[,, k]<-W [,, k]%*%Zt[, 1: T]
Xi<-zeros((T-p+1),sum(La+1), N)
if (p==1)
{ for (i in 1:N){
  TR<-WZ[i,p:T,1:(La[1]+1)]
   Xi[,, i]<-TR } }
if (p>=2){
  for (i in 1:N){
 TR<-WZ[i,p:T,1:(La[1]+1)]
 for (s in 2:p)
  TR<-cbind(TR,WZ[i,(p-s+1):(T-s+1),1:(La[s]+1)])
Xi[,, i]<-TR } }
Xi}
# –––––––––––––––––––––––––––––––-
# OLS parameter of GSTAR model
# ––––––––––––––
gstar<-function(Zt,x){
p<-x[1]
La<-x[2:length(x)]
r<-lmd+1 # where lmd = the greatest order for weight matrices
N<-dim(Zt)[1] #number of sites
T<-dim(Zt)[2]-1 #number of time periods
Xi<-Xi(Zt,x)
Zi<-Zi(Zt,x)
coef.OLS<-matrix(0,sum(La+1),N)
col.name<-array(0,N)
for (i in 1:N){
  coef.OLS[, i]<-inv(t(Xi[,, i])%*%Xi[,,i])%*%t(Xi[,, i])%*%Zi[,i]
  col.name[i]<-paste("site",i)}
colnames(coef.OLS)<-col.name
round(coef.OLS,4)}
# –––––––––––––––––––––––––––––––-
# Residuals of GSTAR model
# ––––––––––––––
# (1). To find the LS estimates only, for example GSTAR(2;1,1), use
# the command:
#  > gstar(Zt,c(2,1,1))
# where Zt is data matrix.
# (2). To find the estimates, prediction values, and residuals
# vector respectively, call the function by the following
# commands:
#    > as.2<-res(Zt,c(2,1,1))
#   > as.2$coef
#   > as.2$pred
#   > as.2$res
# –––––––––––––––––––––––––––––––-
res<-function(Zt,x){
coef<-gstar(Zt,x)
Xi<-Xi(Zt,x)
p<-x[1]
La<-x[2:length(x)]
N<-dim(Zt)[1] #number of sites
T<-dim(Zt)[2]-1 #number of time periods
Z.OLS<-matrix(0,T-p+1,N)
res.OLS<-matrix(0,N,T-p+1)
if (p==1){
for (i in 1:N){
if (La[1]!=0)Z.OLS[,i]<-Xi[,,i]%*%coef[,i]
else Z.OLS[,i]<-Xi[,,i]*coef[,i]
res.OLS[i,]<-t(Z.OLS[,i]-Zt[i,(p+1):(T+1)])}
}
if (p!=1){
for (i in 1:N){
Z.OLS[,i]<-Xi[,,i]%*%coef[,i]
res.OLS[i,]<-t(Z.OLS[,i]-Zt[i,(p+1):(T+1)]) } }
az<-new.env()
az$Xi<-Xi # matrix Xi
az$coef<-coef
az$pred<-t(Z.OLS)
az$res<-res.OLS
ax<-as.list(az)}
# –––––––––––––––––––––––––––––––-

and empirical mean squared error (MSE)MSE𝑇=110001000𝑟=1𝜙𝑇(𝑟)𝜙2,(3.4)

where 𝜙𝑇(𝑟)𝜙2=4𝑖=1(𝜙(𝑖)10𝜙(𝑖)10)2𝜙+((𝑖)11𝜙(𝑖)11)2.

The result is presented in Table 1. It can be seen that the parameters estimates (in average) approaches the true parameters as 𝑇 increases while the empirical MSE is getting smaller and approaching 0 as 𝑇 increasing. It means that behavior of the LS estimator in the simulation exhibits the consistent property. In general, we can notice that the LS estimation could give fair estimates even for moderate sample size such as 𝑇=40 and 𝑇=50.

4. Application of GSTAR Model to the Ratio of Per Capita GDP Data

In this section, we apply the GSTAR model to the ratio of per capita GDP data in 16 West European countries. The data was kindly given by Maddison [6] from Faculty of Economics, University of Groningen, the Netherland, who passed away on April 24 2010. The per capita GDP of a country is the country GDP value divided by its population, and the per capita GDP of total West Europe is the sum of each West European country GDP divided by the total population in West Europe. The ratio of the per capita GDP of a country is the country per capita GDP value divided by the per capita GDP of total West Europe, multiplied by 100. Hence, the unit of the per capita GDP ratio is the percentage. For the data analysis in the following subsection, we will use the ratio of the per capita GDP data and for simplicity it will be called the GDP ratio data.

4.1. Dataset and Preliminary Model Building

The dataset is the GDP ratio data for periods 1955–2006. It consists of 52 observations of 16 dimensional vectors. For the purpose of forecasting the data was grouped into the training data set and test data set. The training data is the first 42 observations that will be used for model building and the test data is the last ten data that will be used in forecasting performance comparison.

Clearly, the 42 observations in the training data, depicted in Figure 1(a), are not stationary though they tend to converge to the value between 50% and 150%. Therefore, to achieve the zero mean stationary data, the first difference transformation and data centralization must be applied.

Suppose 𝑌𝑖(𝑡) represents the per capita GDP ratio for country 𝑖,𝑖=1,2,,16 at time 𝑡,𝑡=1,0,1,,50. The first difference transformation of 𝑌𝑖(𝑡) denoted by 𝐷𝑖(𝑡) is𝐷𝑖(𝑡)=𝑌𝑖(𝑡)𝑌𝑖(𝑡1)(4.1)

and the centralization of the differenced data is𝑍𝑖(𝑡)=𝐷𝑖(𝑡)𝐷𝑖,(4.2)

where 𝐷𝑖 is the average of 𝑍𝑖(t) at location 𝑖,for𝑖=1,2,,16. Plot of the 16 series of this transformation is displayed in Figure 1(b) and their behaviour has seemed to represent stationary series.

As a preliminary model building, we set some notations used in model (2.1). The length of time period is 𝑇= 40, and the number of sites is 𝑁=16. Time period 𝑡=0 will correspond to 1956, 𝑡=1 to 1957, 𝑡=2 to 1958, and so on. Since the time dimension is one, the time lag can be ordered naturally by the sequence of 𝑘=1,2,. On the other hand, the spatial order may be defined in a different ways because in a two-dimensional space there is no specific order just as in one-dimensional space. For the GDP data, there are 16 countries. The first order and the second order neighbours of the countries are given in Table 2. These are defined based on the geographical location of the countries. The first order neighbours of a country are those which have a common border with the country or within a close distance along a sea route. A second order neighbour of a country is the union of all first order neighbourhood countries of its first order neighbours, excluding itself.

Suppose 𝑛𝑖(),=1,2 represents the number of countries which are the th neighbours of a country 𝑖. The spatial weight of order between countries 𝑖 and 𝑗 can be defined as𝑤()𝑖𝑗=1𝑛𝑖(),if𝑗isthethneighboursof𝑖,0,if𝑗isnotthethneighboursof𝑖.(4.3) For example, from Table 2 it can be seen that Austria has 3 the first order neighbours, Germany, Netherland and Switzerland, and has 6 second order neighbours, Belgium, Denmark, France, Greece, Netherland, and Sweden. Then, the first order of spatial weight between Austria and each nearby country is𝑤(1)16=𝑤(1)19=𝑤(1)1,15=13(4.4)

and the second order of spatial weight is 𝑤(2)12=𝑤(2)13=𝑤(2)15=𝑤(2)17=𝑤(2)1,10=𝑤(2)1,14=1/6.

4.2. GSTAR Model Building for the GDP Data

After transforming the data and constructing spatial weight matrix, the next step is identification of the model order. In STAR model-building, [3] used the sample space time autocorrelation function (STACF) and space time partial autocorrelation function (STPACF) as the primary tools for model identification. The order (𝑝;𝜆1,,𝜆𝑝) model can be characterized by the tail-off behaviour of the autocorrelations and the cut-off behaviour of the partial after 𝑝 time lag and 𝜆𝑝 spatial lag. Since STAR model is a special case of GSTAR model, these autocorrelation functions will be adopted to identify the order of GSTAR model.

Figure 2 presents sample STACF and STPACF for the differenced data 𝐙(𝑡) up to time lag 10 and spatial lags 0, 1, and 2. The pattern is not clearly suggested an exact order. However, since the sample STPACF cut off after time lag 2 and spatial lag 1, order (2;1,1) can be considered as the space-time order candidate. In addition, the space-time partials also cut off at time lag 5 and spatial 2, but applying this GSTAR model to the data will result too many estimated parameters because there will be at least 160 parameters which have to be estimated.

The 64 parameters and the error variance in this model were estimated using the least square method and the result is presented in Table 3. The empirical MSE for GSTAR(2;1,1) model is 2.735 counted based on 16×39 or 624 values. The residuals histogram and normal probability plot on Figures 4(a) and 4(b) show that the GSTAR(2;1,1) residuals are approximately normal distributed with zero mean and constant variance. Meanwhile, fitted value versus residuals plot in Figure 4(c) exhibits that the residuals do not show a significant pattern. From Figure 3 we can observe that the STACF of the residuals is significantly almost zero except for time lag 5 and 10, and spatial lag 2. The exception points at time lags 5 and 10, suggesting that the seasonal difference of order 5 might be useful for further model analysis. But the seasonal model analysis is not discussed here because it is out of the research scope.

For forecasting purpose, the estimated parameters in Table 3 can be used to predict the -step-ahead forecast value at time forecast origin 𝑇 defined by𝐙𝑇𝚽()=10𝐙𝑇𝚽(1)+11𝐖𝑇(1)𝐙𝑇𝚽(1)+20𝐙𝑇𝚽(2)+21𝐖𝑇(2)𝐙𝑇(2).(4.5)

5. GSTAR(1;1) Modeling

GSTAR(1;1) model is the simplest model of GSTAR(𝑝;𝜆1,,𝜆𝑝) model class defined in (2.1) because it is only characterized by the autoregressive terms lagged in time and spatial of order one. From (2.1) we can write GSTAR(1;1) as 𝑍𝑖(𝑡)=𝜙0𝑖𝑍𝑖(𝑡1)+𝜙𝑁1𝑖𝑗=1𝑤𝑖𝑗𝑍𝑗(𝑡1)+𝑒𝑖(𝑡),(5.1) or in matrix notation 𝐙(𝑡)=𝚽10𝐙(𝑡1)+𝚽11𝐖(1)𝐙(𝑡1)+𝐞(𝑡),(5.2) where 𝐙(𝑡)=(𝑍1(𝑡),,𝑍N(𝑡)), Φ10=diag(𝜙(1)10,,𝜙(𝑁)10), and Φ11=diag(𝜙(1)11,,𝜙(𝑁)11). To simplify the notation, Φ10 and Φ11 are frequently written, respectively, by Φ0 and Φ1.

The model has an interpretation that the current observation in a certain location only depends on the immediate past observations recorded at the location of interest and at the nearest locations. The GSTAR(1;1) model has 2𝑁 parameters since it is assumed that the parameter for each location is allowed to be unequal. For the per capita GDP data the least square estimator for GSTAR(1;1) model is presented in Table 4. The residuals distribution of the model, presented in Figure 5, give the conclusion that the residuals are approximately normally distributed with zero mean and the variance is nearly constant. In general, the STACF plot in Figure 6 shows that the residuals are significantly uncorrelated. The insignificant value was only found at the last time lag.

For the GDP data case, GSTAR(2;1,1) model has 64 parameters while GSTAR(1;1) has 32 parameters. Hence, it is not a surprise if the empirical MSE of GSTAR(2;1,1) is less than that of GSTAR(1;1). However, though the number of parameters of GSTAR(1,1) is half of the other one, the empirical MSE of GSTAR(1,1) is only decreasing 0.257 compared to the GSTAR(2;1,1).

The distribution of the MSE difference for each country is presented in a bubble plot (Figure 7). Center of the bubble is the value of MSE difference and the radius is its absolute value. The bubble placed under the zero axis indicates that the GSTAR(2,1,1) MSE for the associated country is smaller than that of the GSTAR(1;1) MSE. From the figure, we can see that the decrease of MSE values is mostly contributed by five countries, Belgium, Denmark, Ireland, Sweden, and Switzerland. Though the MSE value seems different, the paired 𝑡-test for the result gives the 𝑃 values < 0.678. It means that the MSE values for both models are significantly indifferent.

6. Comparison of Forecast Performance

For the purpose of forecasting model, it would be useful if we also consider their forecast performance. Therefore, in this section we will examine the one-step-ahead forecasting performance for each model candidate using the last ten actual data points of the per capita GDP ratio data set. Result of this section is expected to become a supplementary reference in finding the most parsimony space-time model for the case of per capita GDP ratio.

In (4.5) we have defined the -step-ahead forecasting for GSTAR(2;1,1) model. For GSTAR(1;1) model, it can be estimated by 𝐙𝑇()=𝚽10𝐙𝑇(1)+𝚽11𝐖(1)𝐙𝑇𝐙(1),(6.1)𝑇𝐙()=0.1674𝑇(1)+0.0453𝐖(1)𝐙𝑇(1),(6.2) where Φ10=diag(𝜙(1)10,,𝜙(𝑁)10), Φ11=diag(𝜙(1)11,,𝜙(𝑁)11), and for 0, 𝐙𝑇()=𝐙(𝑇).

The one-step-ahead forecast 𝐙𝑇+𝑗1(1) of each data points can be calculated using (4.5), (6.1), and (6.2) for =1. Note that calculation of the forecast value for each time 𝑗=1,2,,10, is performed without updating the parameter estimates and average data. The one-step-ahead forecast for GSTAR(2;1,1) model is 𝐙𝑇+𝑗1(1)=𝚽10𝐙(𝑇+𝑗1)+𝚽11𝐖(1)𝐙(𝑇+𝑗1)+𝚽20𝐙(𝑇+𝑗2)+𝚽12𝐖(1)𝐙(𝑇+𝑗1),(6.3) where 𝑗=1,2,,10. Meanwhile the one-step-ahead forecast for GSTAR(1;1) model is 𝐙𝑇+𝑗1(1)=𝚽10𝐙(𝑇+𝑗1)+𝚽11𝐖(1)𝐙(𝑇+𝑗1).(6.4)

To compare the forecast performance, we use mean of square forecast error (MSFE) which is defined by 1MSFE=𝑁𝑁𝑖=1𝐸𝑖,(6.5) where 𝐸𝑖=11010𝑗=1𝑌𝑖𝑌(𝑇+𝑗)𝑖,𝑇+𝑗1(1)2,𝑌𝑖,𝑇+𝑗1𝑍(1)=𝑖,𝑇+𝑗1(1)+𝐷𝑖+𝑌𝑖(𝑇+𝑗1),(6.6) and 𝑍𝑖,𝑇+𝑗1(1) is the element of Z𝑖,𝑇+𝑗1(1).

To measure the performance closeness between two models, we also calculate the MSFE difference between model 𝑀1 and model 𝑀2 for country 𝑖 which is defined by 𝐶𝑖=𝐸𝑖𝑀1𝐸𝑖𝑀2.(6.7)

The performance of 𝑀1 and 𝑀2 model is said to be close if the value of 𝐶𝑖 is near to zero. The negative value of 𝐶𝑖 indicates that for location 𝑖, the 𝑀1 model is better than 𝑀2 model. Since the residual is approximately normally distributed, the paired 𝑡-test can also be applied to test the null hypothesis that model 𝑀1 and model 𝑀2 have the same forecast performance.

Suppose 𝑀1 and 𝑀2 in (6.7) are represented by GSTAR(2;1,1) and GSTAR(1;1) model, respectively. The MSFE difference values are displayed in a bubble plot (Figure 8). Center of the bubble is the value of MSFE difference and the radius is its absolute value. The bubble placed under the zero axis indicates that the MSFE value for model 𝑀1 is smaller than that of the model 𝑀2. From the figure, it can be seen that most of the countries have the MSFE value which is scattered around the zero axis. The high value of the 𝑃 value < 0.9858 for the paired 𝑡-test gives the evidence that the forecast performance of each model is not significantly different.

The behavior of both differences for each time is displayed in Figure 9. From the figure, it can be observed that most points in Ireland (site 8) have a negative value. It means that in Ireland GSTAR(2;1,1) model has a better performance than GSTAR(1;1). Conversely, the most points in Norway (site 11) and Switzerland (site 15) are positive when the difference is counted for the pair of GSTAR(2;1,1) and GSTAR(1;1). On the other hand, Figure 7 also shows that in Norway there is a point with high jump difference value. It means that the worst performance of GSTAR(2;1,1) occurred at this point. In addition, the ratio of the negative value and the positive value is almost the same. It gives the interpretation that forecast performance of GSTAR(2;1,1) and GSTAR(1;1) model is almost similar.

7. Conclusions

GSTAR modeling has been built to the ratio of per capita GDP data in West European countries. The model of order (2;1,1) model has been identified as the candidate model. However, when the forecast performance is compared with GSTAR(1;1), it is found that the performance is significantly indifferent. Due to parsimony principle, we recommend that the GSTAR(1;1) might be considered as a forecasting model.

Acknowledgments

The research of the first author is supported by the Scholarship of Sandwich-Like Program no. 1995.1/D4.4/2008 of the Ministry of National Education of Republic of Indonesia c/q Directorate General of Higher Education (DIKTI). The authors are very grateful to Professor Gopalan Nair from the School of Mathematics and Statistics, the University of Western Australia, for his supervision during the first author’s visit in 2008. Helpful comments from earlier manuscript’s reviewer are also thankfully acknowledged.