#### Abstract

This paper analyses the temperatures in several locations in Spain during the last 50 years. We focus on the degree of persistence of the series, measured through a fractional differencing parameter. This is crucial to properly estimate the time trend coefficients and thus to determine the degree of warming in the area. The results indicate that all series are fractionally integrated with orders of integration ranging between 0 and 0.5. Moreover, the time trend coefficients are all positive though statistically insignificant, which is in contrast with the results based on nonfractional integration.

#### 1. Introduction

Modelling climatological time series remains a controversial issue. Although the evidence supporting the existence of climate change is now well established throughout the world, the statistical methods employed to determine the precise magnitude of the change are an unresolved topic. The standard way of modelling temperatures is to assume a linear function of time of the form

$$y_t = \alpha + \beta t + x_t, \quad t = 1, 2, \ldots, \tag{1}$$

where $y_t$ is the observed temperature at time period $t$, and $x_t$ is the deviation term, which is supposed to be relatively stable across time. The parameter $\beta$ measures the average change in $y_t$ per time period. Thus, in the context of temperature time series, long-run warming may be occurring if $\beta$ is positive, in which case there is an increasing trend in temperature.

On the other hand, it is well known that temperatures are highly time-dependent. Mathematically, there are different ways of modelling that behaviour, and a key issue is to determine whether the deviation term $x_t$ is stationary or not. The most standard approach is to assume that $x_t$ in (1) is stationary $I(0)$. Rigorously speaking, a process integrated of order 0 (denoted by $I(0)$) is defined as a covariance stationary process with a spectral density function that is positive and finite at the zero frequency. In plain words, the process must be relatively stable across time. This class includes a wide range of models, such as white noise, stationary autoregressive (AR), moving average (MA), and stationary ARMA processes. Of all these models, the classic AR(1) model

$$x_t = \phi x_{t-1} + \varepsilon_t, \tag{2}$$

where $|\phi| < 1$ and $\varepsilon_t$ is white noise, has been widely employed in the climatological community (e.g., [1]) because of its relation with the first-order stochastic differential equation [2]. If the detrended series is not $I(0)$, the classic alternative is the unit root, also called integration of order 1 ($I(1)$) and described as

$$x_t = x_{t-1} + u_t, \tag{3}$$

where $u_t$ is $I(0)$; first differences are then taken to recover the $I(0)$ error $u_t$. (Note that in this context $u_t$ may be white noise but also a weakly autocorrelated process such as an AR(1).) These two approaches have been widely employed in the climatological community to describe climate change and the warming effect in temperatures throughout the world. Thus, for example, Bloomfield and Nychka [3] and Woodward and Gray [4], among others, assume that the error term is $I(0)$, while unit roots or $I(1)$ models have been employed for temperature time series in Woodward and Gray [5], Stern and Kaufmann [6], Kaufmann and Stern [7], and Kaufmann et al. [8].

However, the $I(0)$ and the $I(1)$ models described above are merely two particular cases of a much more general class of processes called $I(d)$, where $d$ (the number of differences required to get $I(0)$) is not necessarily an integer value (usually 1) but may be a real value between 0 and 1 or even above 1. In such a case, the process is said to be fractionally integrated or integrated of order $d$ (denoted by $I(d)$). That is, $x_t$ is said to be $I(d)$ if

$$(1 - L)^d x_t = u_t, \quad t = 1, 2, \ldots, \tag{4}$$

where $L$ is the lag operator (i.e., $L x_t = x_{t-1}$), and $u_t$ is $I(0)$. Note that the polynomial on the left-hand side of (4) can be expressed in terms of its binomial expansion, such that, for all real $d$,

$$(1 - L)^d = \sum_{j=0}^{\infty} \binom{d}{j} (-1)^j L^j = 1 - dL + \frac{d(d-1)}{2} L^2 - \cdots, \tag{5}$$

and (4) can be written as

$$x_t = d\, x_{t-1} - \frac{d(d-1)}{2}\, x_{t-2} + \cdots + u_t. \tag{6}$$

Thus, $d$ is a parameter that indicates the degree of temporal dependence in the data: the higher the value of $d$, the higher the level of association (or correlation) between the observations. Also, note that if $d$ is an integer value, $x_t$ depends only on a finite number of previous values, while if it is a noninteger value, $x_t$ depends on all its past history.

Comprehensive surveys of fractionally integrated models can be found in Robinson [9, 10], Beran [11], Baillie [12], Doukhan et al. [13], and Gil-Alana and Hualde [14], and examples of this type of model in meteorological time series data are the papers of Bloomfield [15], Smith et al. [16], Lewis and Ray [17], Pethkar and Selvam [18], Koscielny-Bunde et al. [19], Pelletier and Turcotte [20], Percival et al. [21], Maraun et al. [22], and Gil-Alana [23, 24]. Most of these authors agree that the order of integration in regional and global temperature time series (and even in those for specific locations) is positive though smaller than 0.5.
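The binomial expansion of $(1 - L)^d$ described above can be illustrated numerically. The following is a minimal sketch, not part of the original analysis; the function name `frac_diff_weights` and the truncation length are assumptions made for illustration. It computes the coefficients of $L^j$ with the standard recursion $\pi_0 = 1$, $\pi_j = \pi_{j-1}(j - 1 - d)/j$.

```python
import numpy as np

def frac_diff_weights(d, n_terms):
    """Coefficients pi_j of (1 - L)^d = sum_j pi_j L^j for j = 0..n_terms-1.

    Hypothetical helper for illustration only.
    """
    weights = np.empty(n_terms)
    weights[0] = 1.0
    for j in range(1, n_terms):
        # Ratio of consecutive binomial terms: pi_j / pi_{j-1} = (j - 1 - d) / j
        weights[j] = weights[j - 1] * (j - 1 - d) / j
    return weights

if __name__ == "__main__":
    # Integer d: only finitely many nonzero coefficients (first difference).
    print(frac_diff_weights(1.0, 5))
    # Fractional d: the coefficients never vanish, so x_t depends on its
    # whole past, with slowly decaying weights.
    print(frac_diff_weights(0.4, 5))
```

With $d = 1$ the weights reduce to the first-difference filter, while for a fractional $d$ such as 0.4 every lag receives a nonzero weight, which is the sense in which the series depends on all its past history.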

In this paper, we focus on the Spanish case. The reason for this choice is that, among the European countries, Spain is one of those most affected by climate change. According to a report from the European Environment Agency (EEA, 2004), Spain and Portugal will be the countries most affected by climate change within the European Union. The report also notes that the average temperature over the last 100 years has risen slightly more in Spain than in the rest of the EU.

Spain's climate is determined by its geographical position, on the south-western edge of Eurasia and just 13.4 kilometers from Africa at its southernmost point, with an ocean to the west and a sea to the east, and by its continental land mass and high mountainous terrain, producing a mosaic of climates, one of the most varied in Europe. In a recent paper, Brunet et al. [25] examined long-term (1850–2003) Spanish temperatures by employing a new adjusted record of daily mean temperature collected by the European Community (EC)-funded project EMULATE. Their results indicate a highly significant warming over the entire period, with a pronounced increase in temperature since 1973 [26]. (Other papers dealing with temperatures in Spain are Jones and Moberg [27] and Sigró et al. [28].)

The outline of the article is as follows. In Section 2, we present the statistical model. Section 3 describes the data and presents the test results, while Section 4 contains some concluding comments and extensions.

#### 2. The Statistical Framework

Throughout this paper, we consider a model given by (1) and (4), that is,

$$y_t = \alpha + \beta t + x_t, \quad (1 - L)^d x_t = u_t, \quad t = 1, 2, \ldots, \tag{7}$$

and given the seasonal (monthly) structure of the series under analysis, we suppose that the disturbance term, $u_t$, follows a seasonal AR(1) process of the form

$$u_t = \rho\, u_{t-12} + \varepsilon_t, \quad t = 1, 2, \ldots, \tag{8}$$

where $\varepsilon_t$ is white noise. In other words, the value of the disturbance at time $t$ depends not on the previous value (as is the case in the nonseasonal AR(1) model described in Section 1) but on the value in the same month of the previous year plus an error term. Thus, we assume that all the seasonal dependence is captured through the seasonal AR polynomial. Alternatively, we could have employed seasonal dummy variables. However, when we did so in the empirical work carried out in the following section, the coefficients were found to be statistically insignificant in the majority of the cases.
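To make the structure of this model concrete, the sketch below simulates a monthly series with a linear trend, a fractionally integrated deviation term, and a seasonal AR(1) disturbance at lag 12. It is an illustration under stated assumptions, not the procedure used in the paper: the function name, the parameter names (`rho` for the seasonal AR coefficient in particular), the Gaussian errors, and the zero pre-sample values are all choices made here.

```python
import numpy as np

def simulate_trend_fi_sar(n, alpha, beta, d, rho, sigma=1.0, seed=0):
    """Simulate y_t = alpha + beta*t + x_t with (1-L)^d x_t = u_t and
    u_t = rho*u_{t-12} + eps_t. Hypothetical helper; pre-sample values are 0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n)

    # Seasonal AR(1): dependence on the same month of the previous year.
    u = np.zeros(n)
    for t in range(n):
        u[t] = eps[t] + (rho * u[t - 12] if t >= 12 else 0.0)

    # Coefficients of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1}*(k-1+d)/k.
    psi = np.ones(n)
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k

    # x_t = sum_{k=0}^{t} psi_k * u_{t-k} (truncated at the sample start).
    x = np.convolve(psi, u)[:n]

    t_index = np.arange(1, n + 1)
    return alpha + beta * t_index + x

if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    y = simulate_trend_fi_sar(n=600, alpha=200.0, beta=0.05, d=0.3, rho=0.9)
    print(y[:12].round(2))
```

Setting `d=0` and `rho=0` reduces the simulated series to a linear trend plus white noise, and `d=1` gives a trend plus a random walk, which correspond to the two integer cases discussed in the Introduction.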

We estimate the above model using a Whittle function in the frequency domain along with a procedure proposed by Robinson [9] that permits us to test any real value for the differencing parameter $d$. (The Whittle function is an approximation to the likelihood function that is more suitable for applications in many cases [29].) That is, we test the null hypothesis

$$H_0: d = d_o \tag{9}$$

in (7) and (8), where $d_o$ can be any real value. This is a very general specification since it allows us to consider many interesting cases. Thus, if we cannot reject the null hypothesis (9) with $d_o = 0$, we support the trend stationary representation advocated by many authors, including Brunet et al. [25] for the Spanish case. On the other hand, if $d_o = 1$ cannot be rejected, we support the unit root or $I(1)$ representation. However, the differencing parameter $d$ can also be a value between 0 and 1, or even above 1.
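The idea of a frequency-domain Whittle estimate of $d$ can be sketched in a simplified setting. The code below is not Robinson's [9] testing procedure and it ignores both the deterministic terms and the seasonal AR component: it assumes a pure fractional noise, $(1 - L)^d x_t = \varepsilon_t$ with white noise $\varepsilon_t$, and minimises the concentrated Whittle objective over a grid of candidate values. The function name `whittle_d` and the grid are assumptions made for illustration.

```python
import numpy as np

def whittle_d(x, grid=None):
    """Grid-search Whittle estimate of d, assuming (1-L)^d x_t = white noise.

    Simplified illustration: no trend, no seasonal AR term.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    m = (n - 1) // 2

    # Fourier frequencies 2*pi*j/n, j = 1..m, and the periodogram there.
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    periodogram = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2.0 * np.pi * n)

    # Spectral shape of fractional noise: g(lam; d) = (2 sin(lam/2))^(-2d),
    # so log g = d * log_g_unit with log_g_unit defined below.
    log_g_unit = -2.0 * np.log(2.0 * np.sin(lam / 2.0))

    if grid is None:
        grid = np.arange(-0.49, 1.0, 0.01)

    best_d, best_value = None, np.inf
    for d in grid:
        log_g = d * log_g_unit
        # Concentrated Whittle objective (innovation variance profiled out).
        value = np.log(np.mean(periodogram / np.exp(log_g))) + np.mean(log_g)
        if value < best_value:
            best_d, best_value = float(d), value
    return best_d

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n, d_true = 2000, 0.3
    eps = rng.normal(size=n)
    psi = np.ones(n)
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d_true) / k
    x = np.convolve(psi, eps)[:n]  # simulated fractional noise
    print("estimated d:", whittle_d(x))
```

A grid search is used instead of a numerical optimiser only to keep the sketch dependent on NumPy alone; testing a specific value $d_o$, as in the null hypothesis above, additionally requires the test statistic and its asymptotic distribution given in Robinson [9].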

#### 3. Data and Results

The data employed in the article correspond to the daily maximum ($T_{\max}$) and minimum ($T_{\min}$) temperatures (in units of 0.1 °C) in six locations in Spain, obtained from the European Climate Assessment & Dataset (ECA&D), collected by Tank et al. [30]. (Mean temperature data were also considered and the results, though not reported, were completely in line with those presented here for the maximum and minimum temperatures.) The specific locations are Badajoz (Talavera): LAT: +38:53:00, LON: −06:48:15; Madrid: LAT: +40:24:40, LON: −03:39:19; Malaga: LAT: +36:40:00, LON: −04:28:43; San Sebastian-Donostia: LAT: +43:18:24, LON: −02:01:38; Valencia: LAT: +39:28:48, LON: −00:21:08; Zaragoza: LAT: +41:39:43, LON: −00:59:31. We focus on these stations since they are those with the longest available data without periods of missing observations. The starting dates are 27/10/1936 for San Sebastian-Donostia; 01/01/1950 for Madrid and Valencia; 01/01/1955 for Badajoz; 24/10/1959 for Zaragoza; and 01/08/1980 for Malaga. All the series end on 31/07/2008. The dataset was built from the data available at ECA&D, and the starting dates were chosen in such a way that there was no more than one consecutive observation missing in the sample. For each such observation, the arithmetic mean of the previous and the following observations was adopted.

The daily temperatures are displayed in Figure 1, and we observe a clear seasonal pattern in all of them, which is more visible when we compute the monthly mean temperatures in Figure 2. Figure 3 displays the annual means of $T_{\max}$, which show a slight trend in the temperatures in all cases. This implies that a deterministic trend may also be present in the monthly and daily data though it may be obscured by the presence of the seasonal structures. Though not reported, similar plots were obtained for the minimum temperatures.
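The data preparation described above (replacing an isolated missing daily value by the mean of its two neighbours and then forming monthly means) can be sketched as follows. This is an illustration only: the function names are hypothetical, the input is assumed to be a pandas Series of daily values in units of 0.1 °C with a daily `DatetimeIndex`, and the actual ECA&D file format and missing-value codes are not handled here.

```python
import numpy as np
import pandas as pd

def fill_single_gaps(daily):
    """Replace each isolated missing value by the mean of its two neighbours.

    Gaps of two or more consecutive days are left as NaN, because at least
    one neighbour is then missing as well.
    """
    neighbour_mean = (daily.shift(1) + daily.shift(-1)) / 2.0
    return daily.where(daily.notna(), neighbour_mean)

def monthly_means_celsius(daily_tenths):
    """Monthly means in degrees Celsius from daily data in units of 0.1 degC."""
    filled = fill_single_gaps(daily_tenths)
    months = filled.index.to_period("M")
    return filled.groupby(months).mean() / 10.0

if __name__ == "__main__":
    # Synthetic example data, not taken from ECA&D.
    index = pd.date_range("1980-08-01", periods=61, freq="D")
    daily = pd.Series(250.0, index=index)
    daily.iloc[10] = np.nan  # one isolated missing day
    print(monthly_means_celsius(daily))
```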

In this section, we focus on the monthly mean temperatures and estimate the model given by (7) and (8). Table 1 focuses on $T_{\max}$ and displays the estimates of the intercepts and the time trend coefficients, the fractional differencing parameters (with the 95% confidence band), and the seasonal AR coefficients. The first thing we observe in this table is that the orders of integration are in all cases constrained between 0 and 1, and the two integer orders of differentiation (i.e., $d = 0$ and $d = 1$) are decisively rejected in favour of fractional integration. The highest estimate corresponds to Badajoz, followed by Madrid (0.380) and Valencia and Zaragoza (0.267 and 0.255, resp.), while Malaga and San Sebastian-Donostia present the lowest degrees of integration. In general, the degree of seasonal dependence is high in all cases, with seasonal AR coefficients ranging from 0.8430 (San Sebastian-Donostia) to 0.9506 (Malaga).

If we focus on the deterministic terms, we first notice that the intercepts are statistically significant in all cases, while the time trends are all insignificantly different from zero. Thus, according to this specification, the evidence of warming may be questionable though the coefficients are positive in all cases. The highest values correspond to Badajoz (0.065), Madrid (0.051), and Zaragoza (0.046). Since the data are recorded in units of 0.1 °C, these coefficients imply that temperatures have increased by about 0.0065, 0.0051, and 0.0046 degrees Celsius per month, respectively. These values represent an annual increase of about 0.078, 0.061, and 0.055 °C, respectively, for the three locations.
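The conversion from the reported monthly trend coefficients to annual increases can be checked with a few lines of arithmetic. The sketch below assumes, as the text indicates, that the coefficients are expressed in units of 0.1 °C per month; the function name is hypothetical.

```python
def annual_increase_celsius(monthly_coef_tenths):
    """Convert a trend coefficient in 0.1 degC per month to degC per year."""
    per_month_celsius = monthly_coef_tenths / 10.0  # 0.1 degC units -> degC
    return per_month_celsius * 12                   # 12 months per year

if __name__ == "__main__":
    # Trend coefficients quoted in the text for the maximum temperatures.
    coefficients = {"Badajoz": 0.065, "Madrid": 0.051, "Zaragoza": 0.046}
    for station, coef in coefficients.items():
        print(f"{station}: {annual_increase_celsius(coef):.3f} degC per year")
```

Rounded to three decimals, this reproduces the annual figures of 0.078, 0.061, and 0.055 quoted for the three locations.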

Table 2 is similar to Table 1 but refers to the minimum temperatures ($T_{\min}$). The same conclusions as in Table 1 are obtained here. Thus, fractional degrees of differentiation are observed in all cases, with values ranging from 0.155 (San Sebastian-Donostia) to 0.384 (Badajoz); the intercepts are all statistically significant, while the time trends, though positive, are statistically insignificant. The highest trend coefficients are those corresponding to Malaga (0.05101) and Valencia (0.05046), while the lowest value is obtained at San Sebastian-Donostia.

In Tables 3 and 4, we present the results for $T_{\max}$ and $T_{\min}$, respectively, following the standard practice of assuming that the error term in (7) is stationary $I(0)$. In other words, we impose $d = 0$ a priori in (7). Though this hypothesis was strongly rejected in Tables 1 and 2, it is interesting to check how the time trend coefficients change in this case. The most noticeable fact here is that the time trend coefficients, though smaller in magnitude than in Tables 1 and 2, are statistically significant in the majority of the cases: Badajoz, Madrid, Valencia, and Zaragoza in the case of $T_{\max}$, and all the locations except Badajoz in the case of $T_{\min}$. The degree of seasonal dependence is again high in all cases.

Table 5 reports the annual increases in $T_{\max}$ according to the time trend coefficients obtained in Tables 1 and 3, that is, with $d$ estimated from the data and with $d = 0$ imposed a priori. We see that the values are much higher in the cases of Badajoz, Madrid, Valencia, and Zaragoza if $d$ is estimated rather than imposed. The same happens with $T_{\min}$ (see Table 6), where higher values are observed if $d$ is estimated rather than imposed in all locations with the exception of Malaga.

In the final part of this paper, we examine whether the differences in the results across locations may be due to the different sample sizes employed in each case. Thus, in Tables 7 and 8, we report the same estimates as in Tables 5 and 6 (i.e., the annual increases) but using the same sample size in all cases, with the starting date in August 1980. The values are now similar in the two cases, especially for $T_{\max}$ (Table 7), though slightly smaller when using the estimated values of $d$. In the case of $T_{\max}$, the highest warming effects correspond to San Sebastian-Donostia (with an increase of about 0.048 per year), Valencia (0.041), and Zaragoza (0.035), while for $T_{\min}$, the highest increases are in San Sebastian-Donostia (0.049), Valencia (0.039), and Badajoz (0.036).

These results are partially consistent with those obtained in Brunet et al. [25]. They used a daily adjusted dataset composed of the 22 longest Spanish temperature records, and, on average, they found an annual increase of about 0.054 for the time period 1973–2003 employing a trend stationary representation based on $I(0)$ errors. (Note, however, that they do not work directly with observed data but with reconstructed data generated in the framework of the EC EMULATE project.) This is in line with our results displayed in the right columns of Tables 7 and 8, though our values are slightly smaller than theirs, which may be explained by the lack of autocorrelation in the model of Brunet et al. [25].