#### Abstract

Temporal variability of three different temperature time series was compared by the use of statistical modeling of time series. The three temperature time series represent the same physical process, but are at different levels of spatial averaging: temperatures from point measurements, from regional Baltan65+, and from global ERA-40 reanalyses. The first order integrated average model IMA(0, 1, 1) is used to compare the temporal variability of the time series. The applied IMA(0, 1, 1) model is divisible into a sum of random walk and white noise component, where the variances for both white noises (one of them serving as a generator of the random walk) are computable from the parameters of the fitted model. This approach enables us to compare the models fitted independently to the original and restored series using two new parameters. This operation adds a certain new method to the analysis of nonstationary series.

#### 1. Introduction

Atmospheric reanalyses are widely used in meteorological and climatological research, as it makes available long time series of gridded meteorological variables, to explore climatic trends and low-frequency variations. Traditional analysis of air temperature time series, focused on trend detection and fitting of statistical models, has shown itself a useful tool (e.g., [1, 2] and references wherein). A use of structural time series models (i.e., linear trend plus red noise) has been popular in recent years (e.g., [3, 4]).

Our paper differs from cited above approach because no a priori model structure is supposed. We use the structure and correlation functions of temperature time series to select an applicable model type according to the scheme introduced by [5] inside the family of autoregressive integrated moving average (ARIMA) models. The scheme has been tested by means of various daily series before by [6, 7], and the earlier analysis shows that the first order integrated moving average model IMA(0, 1, 1) is applicable for daily temperature anomaly time series.

Nonstationary nature of the fitted IMA(0, 1, 1) model indicates that the traditional characterization of temperature variability on the basis of sample moments is unjustified. Due to nonstationarity, the moments do not converge if the sample size increases. This means that in order to compare temporal variability of air temperatures from different sources (e.g., measured and reanalysed), some other characteristic parameters are needed.

A nonstationary IMA(0, 1, 1) model is divisible into a sum of white noise (WN) and random walk (RW) component [5]. The variances for both white noises (the other of them serving as the random walk generator) are computable from two parameters of the fitted model. This approach enables us to compare the coincidence of original and restored series by means of two separated components. One of them is stationary (WN) and the other nonstationary (RW).

The aim of our paper is to show that time series analysis offers a tool with which it is possible to estimate the accuracy of reanalysis method to restore statistical characteristics of the initial time series. Our characterization also introduces different parameters for the comparison of the variability in time series. Two climate scale characteristics are the mean annual cycle and the tolerance. Two proposed weather scale characteristics are the ranges for lower and upper outliers, respectively.

Applicability of the idea is demonstrated on the basis of three daily air temperature time series representing the same territory but produced using different procedures. The results enable us to distinguish between recoverable and nonrecoverable parts of the IMA(0, 1, 1) model. This operation introduces a new method for the analysis of nonstationary series.

#### 2. Initial Data

The temperatures we used were measured at Tartu-Ülenurme station (58°18′N, 26°41′E), 7 km out of Tartu, where the station was moved in 1950. Since 1997, it is not official SYNOP station for Tartu any more, nevertheless due to the airport the meteorological measurements are still carried on. We have analyzed the temperatures measured at 6, 12, and 18 GMT for the period from 1966 to 2005.

We used two different reanalysis outputs: one of them is well-known ERA-40 (the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis) [8] and the other Baltan65+—a regional reanalysis for the Baltic Sea region [9]. ERA-40 has a spatial resolution of 1 degree and Baltan65+ of 11 km. Data used to represent Tartu were from the closest grid point (see Figure 1). In Baltan65+ database, the development of the climate system is calculated with High Resolution Limited Area Model (HIRLAM) (version 7.1.4.), for the period 01.01.1965−31.12.2005. Every 6 h modeling cycle begins with providing new observational data and boundary conditions. Atmospheric pressure, air temperature, wind speed, specific humidity, and sea surface temperature are applied as boundary fields. These fields are created by interpolation of corresponding ERA-40 fields to the HIRLAM grid. As ERA-40 is calculated for the period 1957–2002, then after 2002, the ECMWF operational model is used for boundary data. Data assimilation contains also quality control for both observations and background fields. Observational data for Baltan65+ project is acquired from the ERA-40 database where possible and from Finnish Meteorological Institute’s operational HIRLAM observation archive for the period after 2002. Measurements from Tartu SYNOP station are also used in the data assimilation.

We denote the different temperatures in the following way: for Tartu, for ERA-40, and for Baltan. Here, stands for GMT observation time. From ERA-40 temperatures we used 12 GMT ones only. Unfortunately, not all three time-series are of the same length and for exactly the same period. The lengths of the original temperature time series are 14610 days for the measured, 14975 for Baltan65+, and 16436 for ERA-40.

#### 3. Day to Day Comparison

We fit an acceptable model to anomalies with respect to annual mean cycle. Therefore, the mean annual cycles for available periods are computed for , , and series. One or two year difference in the length of initial series has no considerable influence for that characteristic. An example showing all three cycles corresponding to the noon observations (12 GMT) is shown in Figure 2. The curves are raw, that is, averaged over each calendar day (February 29 included), that avoids generation of approximation errors for the anomalies. Since the cycles are very similar to each other, then the curves are shifted for better overview. The maximum difference between any two cycles does not exceed 0.5 centigrades.

The daily anomalies are compared over the common interval 1966–2005 in order to describe the distribution of restoration accuracy. The frequencies of temperature difference are collected in three groups and shown in Table 1. The maximum frequency corresponds to the group containing the occasions with temperature difference less than one degree. About one percent of observations shows the occasions with the corresponding difference exceeding 3 degrees.

The annual cycles are similar but still show a systematic shift towards warming for the mean values as a result of reproduction. Forty years mean shift for the measurements at 6, 12, and 18 GMT is 0.037, 0.035, and 0.028, respectively. This difference is considered to be unimportant for future analysis.

Anomalies (deviations) with respect to the annual cycles of measured , Baltan65+, and ERA-40 values are further used for modeling the temporal variability.

#### 4. Model Fitting Summary

Before fitting any statistical model to experimental time series, an initial exploration of the series temporal variability is necessary. Experience shows that several local daily air temperature series appeared to be the series with stationary daily increments [10]. This means that instead of the customary assumption about stationarity, it is reasonable to apply more general theory that is capable of analyzing the series with stationary increments. This leads to the use of structure function (instead of the correlation function) to examine the series variability.

The structure function for a series , where , is the second moment of time series increments as a function of the increment interval (see [12] for details): This function enables us to distinguish between short and long range variabilities in the initial series by means of its different scaling exponent, depending on the increment interval (e.g. [6, 7]). The growth of for three temperature series (corresponding to the observation time 12 GMT) over the increment range from 1 to 4096 days is shown in Figure 3(a).

**(a)**

**(b)**

There appears to be a remarkable growth of the structure function as the increment interval grows up to two weeks. The growth rate decreases considerably as the increment interval grows beyond. The further growth is very slow for daily temperature anomaly series. The same behaviour is well-known on the basis of several earlier works (e.g., [10, 11]).

If grows beyond a month, the growth becomes proportional to where is a small number (approximately 0.1 or less) for these series. The property is called scaling [12]. For the current analysis, this means that the correlation between consecutive increments as a function of the increment interval is capable of suggesting an applicable autoregressive integrated moving average (ARIMA) model to represent the series long range temporal variability. More precisely, the named correlation is the lag one autocorrelation of the series (e.g., [6]), Here, because the original series where is divisible into just different increment series. The necessary correlation between the consecutive increments is computed as the mean over values of the corresponding lag one autocorrelations for time series (2) at each . The results as functions of for three temperature anomaly series are shown in Figure 3(b).

Figure 3(b) shows that the correlation between consecutive increments for these series saturates at a negative level near as the increment interval grows. This indicates that the first order moving average MA model is applicable to represent temporal variability of the subseries of long range increments [6, 11]. The saturation means that the applicable model type does not change if the increment interval stays larger than 30 days.

In this case, the increment interval days will be used for modeling. The interval is chosen earlier [6] in order to match with total solar irradiance variability. This means that the model has been fitted to 56 subseries.

Let us consider subseries from the initial daily series over the time interval days. Fixing this time interval, we can write the MA model for the corresponding increments formally as where is white noise (WN) and is a fitted coefficient. Here, we use (3) to stand for the mean model for original series obtained by averaging the fitted coefficients over 56 subseries models.

For each subseries model, a diagnostic test has been carried out using the modified portmanteau statistic Ljung-Box test statistic [13]:

Here, is the length of subseries, , and is the autocorrelation of residual series . Critical values for in our case (for 23 degrees of freedom) are 35.2 and 41.6 at 95% and 99% significance levels, respectively. The test is generally passed at 95% level. In a few occasions the value of appeared to be between 35.2 and 41.6. (not shown).

Adding both sides of (3) over , we obtain the model for temperature deviations (i.e., IMA(0, 1, 1)): where .

As a result, we obtained a simple model which describes long range temperature anomaly variability by means of two parameters, and the variance of residuals .

#### 5. Climatological Interpretation of the Model

Integrated moving average model IMA(0, 1, 1) can be presented as a sum of two components (see [5] for details): where is a nonstationary random walk (RW) with the generator which is white noise. The other component is also white noise (WN), but independent of . It is possible to compute the variances for both components knowing and [5]:

Now, we have four parameters: , , , and to compare the reproduced Baltan65+ and ERA-40 series with the original one.

##### 5.1. Revealed Difference in Model Parameters

Every subseries model depends on one fitted parameter , that in our case fits all into a narrow interval from 0.04 to 0.14. The average values for both variables and are taken to represent the model in the following analysis. These values of at three observation times are shown in the second, third, and fourth columns of Table 2. The other nine columns show mean standard deviations for the series of , , and , respectively.

The most significant difference between the produced model parameters occurs in and values. They both indicate weakening of the influence of RW component as the modeling object changes from the original to the reproduced variable of or . This means that the reproducing process essentially smooths the temporal variability of temperature. Using as a measure of relative restoration error, two of them appear to be reasonably small 0.03 (for the series ) and 0.01 (for ) but the third is huge 0.7 (for ). This is an indication of the fact that a random walk path length is not predictable as well as reproducible (see [5] for details). But the mean over 56 models appears to be approximately equal in the models for the original and reproduced variables.

#### 6. About the Air Temperature Tolerance

The standard deviation is about ten times larger than as shown in Table 2. This means that the fitted model appears to be nearly stationary. It is natural to describe the range of its variability by the aid of standard deviation of the stationary component of the fitted model (i.e., ). In case of normal stationary series, the interval would contain approximately 95% of the observed anomalies over the available sample. Without loss of generality, we can use the same yardstick to compute the frequency of outliers to compare the behaviour of the current empirical series. Since the temperature anomaly fluctuations are generally asymmetric, thus the interval should be shifted a bit towards lower values in order to reconcile it with the 0.025 and 0.975 percentile values for the whole anomaly histogram from the available time period. We call the interval between 0.025 and 0.975 percentiles of the total histogram of the temperature anomalies the local temperature (anomaly) tolerance. Its width is initiated by due to the general tendency that 95% of the anomaly observations fit into it. Individual deviations due to asymmetry lead to specification of the interval by means of the percentiles. The results in centigrades for two variables are shown in Table 3.

Using the tolerance for comparison of time series behaviour is reasonable. It helps to get rid of the traditional statistical way to describe the air temperature variability by means of sample moments whereas they are useless due to nonstationarity. The new approach partly satisfies the assumption by North and Cahalan [14] about stationarity of the marginal distribution. The only concession is the use of selected 95% of the sample instead of the whole 100% sample. The use of 100% of the sample would be justified only if in the fitted model. Then, the IMA(0, 1, 1) model would be reduced to IMA(0, 1, 0) model, which shows that the temperature anomaly series behaves as WN. In our model there is a small but still nonzero indicating evident difference from any pure WN case. This means that the RW component is important in the description of the temperature series variability. But due to the appeared small frequency of the outliers from the tolerance, their influence can be treated as a result of extremal weather events not influential to the climate.

Table 3 shows that the reproducing is connected to some narrowing down of the regions. This is in good accord with the difference between the variances . More informative action is to compute the change of outliers frequency if the reproduced tolerance is used in order to handle the original temperature anomalies. The new frequencies are expected to be slightly higher than 0.025 for both edges. For lower/higher edge, they are 0.027/0.026 (for 6 GMT), 0.028/0.028 (for 12 GMT), and 0.027/0.026 (for 18 GMT). This means that the summary number of outliers increased less than one percent.

Random walk sample paths are unpredictable, and their influence to our climate scale characteristic (tolerance) is small. Thus, an analysis of outliers ranges is not important for our main problem of estimating the climate scale influence of time series reproduction by means of HIRLAM-based reanalysis.

But this analysis becomes more important if a determination of climate variability and/or change signals arise [7].

#### 7. Conclusions

We fitted statistical models to three 40-year long temperature anomalies time series; one was measured at Tartu, and two others were from different reanalyses—ERA-40 and Baltan65+. Modeling of daily air temperature series showed that a nonstationary first order integrated average IMA(0, 1, 1) model was applicable.

Despite that the daily time series themselves may differ essentially, the comparison of the three series models shows that the white noise (climatological) component of the series barely differ. This means that the traditional comparison by means of statistical moments is inapplicable, as they remain because of nonstationarity sample dependent and, thus, are unable to produce a reliable basis for any comparison. Applying IMA(0, 1, 1) enabled us to overcome this obstacle and organize the comparison of measured and reanalysed datasets on the basis of stationary components picked out from the series.(1)The calculations with anomaly series show that the separation of white noise component is crucial for successful comparison—only the white noise component variance is reproducible by means of a reanalysis (the mean values remain zero during both operations).(2)The comparison shows that a nonstationary IMA(0, 1, 1) model enables us to extract the standard deviation () belonging to its stationary white noise component. This standard deviation enables us to produce a quantitative scheme to estimate fitness of the reanalyzed variables.(3)The separated random walk components of measured and reanalysed data differ essentially. This is self-evident because random walk is not predictable (and thus, also not reproducible). Its presence in a stochastic process enhances its nonstationarity. This emphasizes the necessity of using the white noise components variance in climate analysis.(4)The described small comparison shows that the regional Baltan65+ reanalysis is capable of producing a reliable dataset for future climate studies provided that a separation of stationary component will be extracted.(5)The same procedure is applicable if one needs to compare temperature variability between two closely situated stations or between outputs of numerical models which pretend to describe the temperature variability over the same area.

#### Acknowledgments

The authors would like to thank Andres Luhamaa for providing Baltan65+ temperature data, Marili Kangur for preparing the measured temperature data, and COST733 colleagues for preparing ERA-40 data. The study was supported by the Estonian Ministry of Education and Research (Grant SF0180038s08 and Grant ETF9134) and by the EU Regional Development Foundation, Environmental Conservation and Environmental Technology R & D Program Project no. 3.2.0801.12-0044.