Abstract

Initial errors in weather prediction grow in time and, as they become larger, their growth slows down and then stops at an asymptotic value. The time of reaching this saturation point represents the limit of predictability. This paper studies the asymptotic values and time limits in a chaotic atmospheric model for five initial errors, using the ensemble prediction method (the model's data) as well as error approximations by the quadratic and logarithmic hypotheses and their modifications. We show that the modified hypotheses approximate the model's time limits better, but not without serious disadvantages. We demonstrate how the hypotheses can be further improved to achieve a better match of time limits with the model. We also show that the quadratic hypothesis approximates the model's asymptotic value best and that, after improvement, it also approximates the model's time limits better for almost all initial errors and time lengths.

1. Introduction

Forecast errors in numerical weather prediction models (NWPMs) grow in time because of the inaccuracy of the initial state, the chaotic nature of the weather system itself, and model imperfections. Due to the nonlinear terms in the governing equations, the forecast error saturates after some time. The time of saturation, or the limit of predictability of a deterministic forecast in an NWPM, is defined by [1] as the time when the prediction state diverges as much from the verifying state as a randomly chosen but dynamically and statistically possible state. Forecasters also use other time limits to measure the error growth. The forecast-error doubling time is the time when the initial error doubles its size. τ95, τ71, τ50, and τ25 are the times when the forecast error reaches 95%, 71%, 50%, and 25% of the saturated error value. The time limit τ71 is the time when the forecast error exceeds 71% of the saturation or asymptotic value (AV) and, by [2], it corresponds to the level of climatic variability. Lorenz [3] calculated the forecast error growth of an NWPM by comparing integrations of the model starting from slightly different initial states. Present-day calculations use the approach developed by Lorenz [4], from which we can obtain two types of error growth. The first, called the lower bound, is calculated as the root-mean-square error (RMSE) between forecast data of increasing lead times and analysis data valid at the same time. The second, called the upper bound, is calculated as the root-mean-square (RMS) difference between pairs of forecasts valid at the same time but issued at times differing by some fixed interval. For example, if this interval is one day, the analysis for a given day is compared with the one-day forecast valid for the same day; this one-day forecast is then compared with the two-day forecast valid for the same day, and so on. This second method compares only the model equations and therefore represents growth without model error.
A variant of the upper bound, also in use, is calculated as the RMS difference between a forecast and a control forecast run at a higher model resolution (perfect-model framework).
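The lower- and upper-bound calculations described above can be sketched as follows (a minimal illustration with our own array layout and function names, not code from the cited papers):

```python
import numpy as np

def lower_bound_rmse(forecasts, analyses):
    """Lower bound: RMSE between forecasts of increasing lead time and the
    analyses valid at the same time.
    forecasts[d, l] : forecast issued on day d with lead time l (days),
    analyses[d]     : analysis valid on day d.
    Shapes: (days, leads, npoints) and (days, npoints)."""
    days, leads, _ = forecasts.shape
    rmse = np.zeros(leads)
    for l in range(leads):
        diff = forecasts[:days - l, l] - analyses[l:days]
        rmse[l] = np.sqrt(np.mean(diff ** 2))
    return rmse

def upper_bound_rms(forecasts):
    """Upper bound: RMS difference between pairs of forecasts valid at the
    same time but issued one day apart, so only the model equations (not
    the analysis) enter the comparison."""
    days, leads, _ = forecasts.shape
    rms = np.zeros(leads - 1)
    for l in range(leads - 1):
        # lead l+1 issued on day d vs lead l issued on day d+1,
        # both valid on day d + l + 1
        diff = forecasts[:days - 1, l + 1] - forecasts[1:, l]
        rms[l] = np.sqrt(np.mean(diff ** 2))
    return rms
```

Because the upper bound differences only ever involve pairs of model forecasts, the analysis (and hence the model error) drops out of the comparison.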

The quadratic hypothesis (QH) was the first attempt, made by Lorenz [3], to quantify the error growth. The QH is based on the assumption that, if the principal nonlinear terms in the atmospheric equations are quadratic, then the nonlinear terms in the equations governing the field of errors are also quadratic. Dalcher and Kalnay [5] added a model error to Lorenz's QH. The version used by recent researchers is the modification by Simmons et al. [6] of [5]. Lorenz's QH is therefore suitable for the upper bound of error growth and the Simmons et al. modification for the lower bound. Trevisan et al. [7] proposed the idea that a logarithmic term is more valid than the quadratic and linear terms in the equations governing the field of errors, but this logarithmic hypothesis (LH) has never been used in NWPM computations.

Ensemble prediction systems (EPS) are used to estimate forecast uncertainties. They consist of a given number of deterministic forecasts, where each individual forecast starts from a slightly different initial state. An EPS also includes a stochastic scheme designed to simulate the random model errors due to parameterized physical processes. Recent studies of predictability and forecast error growth (e.g., [8–11]) are mostly done with the models of the European Centre for Medium-Range Weather Forecasts (ECMWF) and the Global Ensemble Forecast System (GEFS) of the National Centers for Environmental Prediction (NCEP). They include deterministic and ensemble forecasts with 1 to 70 members. The operational model of ECMWF uses 50 members plus a control forecast. A more detailed study [10] uses 5 members plus a control forecast. The initial conditions of the ensemble members are defined by a linear combination of the fastest-growing singular vectors. The horizontal resolution with spectral truncation varies from T95 to T1279 and the number of vertical levels varies from 19 to 91 (analyses use a higher resolution than forecasts). The output data are interpolated to a 1° latitude × 1° longitude or 2.5° latitude × 2.5° longitude resolution, separately for the Northern Hemisphere (20°, 90°) and the Southern Hemisphere (−90°, −20°). The forecast is usually run for a 90-day winter (DJF) or summer (JJA) season with 0 (analysis) to 10, 15 (ECMWF), or 16 days (NCEP) of forecast length (FL), at 6- or 12-hour intervals. The variable most often used for analyzing the forecast error is the geopotential height at the 500 hPa level (Z500); others are the geopotential height at the 1000 hPa level (Z1000) and the 850 hPa temperature (T850). To describe the forecast error growth over the calculated forecast length, Simmons et al.'s modification [6] of Lorenz's QH [3] is used.

The questions that have arisen from studies of predictability and forecast error growth, and that represent the key issues addressed in this work, are the following: Is the LH [7] a better approximation of initial error growth than the QH [3]? Is there a possible modification of the LH and QH that better approximates the model data? If so, how much difference does it create in the time limits that measure the forecast error growth? How precisely do the approximations describe the forecast error growth over the FL (10, 15, or 16 days)? How do the approximations obtained from model values with various numbers of ensemble members differ from each other? Lorenz's chaotic atmospheric model (L05II) [12] will be used. For a more comprehensive introduction to the problem of weather predictability, we refer the reader to the book by Palmer and Hagedorn [13]. After this introduction, Section 2 describes the model, Section 3 describes the ensemble prediction method, Section 4 introduces the quadratic and logarithmic hypotheses, and Section 5 sets out the experimental design. Section 6 presents the results and their discussion, and Section 7 summarizes the conclusions.

2. Model

Because of the limitations of NWPMs and because we want to isolate the impact of the initial error (perfect-model framework), we use a modification [12] of the low-dimensional atmospheric model L96. L96 [14] is a nonlinear model with N variables x_1, ..., x_N connected by the governing equations

dx_n/dt = (x_{n+1} − x_{n−2}) x_{n−1} − x_n + F,  n = 1, ..., N, (1)

where the x_n are unspecified (i.e., unrelated to actual physical variables) scalar meteorological quantities, F is a constant representing external forcing, and t is time. The index n is cyclic, so that x_{n−N} = x_{n+N} = x_n, and the variables can be viewed as existing around a circle. The nonlinear terms of (1) simulate advection; the linear terms represent mechanical and thermal dissipation. The model quantitatively, to a certain extent, describes weather systems, but, unlike the well-known Lorenz model of atmospheric convection [15], it cannot be derived from any atmospheric dynamic equations. The motivation was to formulate the simplest possible set of dissipative, chaotically behaving differential equations that share some properties with the “real” atmosphere. NWPMs interpolate the output data mostly to a 1° latitude × 1° longitude grid; in L96, this means N = 360. Such a high resolution would create a large number of waves with similar maxima (“pressure highs”) and minima (“pressure lows”); however, to share some properties with the “real” atmosphere, we would rather have 5 to 7 main highs and lows, corresponding to planetary (Rossby) waves, and a number of smaller waves, corresponding to synoptic-scale waves. Therefore, we introduce the spatial continuity modification (L05II) [12] of L96. Equation (1) is rewritten in the form

dx_n/dt = [x, x]_{K,n} − x_n + F, (2)

where

[x, x]_{K,n} = Σ′_j Σ′_i (−x_{n−2K−i} x_{n−K−j} + x_{n−K+j−i} x_{n+K+j}) / K², (3)

with both sums running from −J to J.

If K is even, Σ′ denotes a modified summation, in which the first and last terms are divided by 2. If K is odd, Σ′ denotes an ordinary summation. Generally, J is much smaller than N; J = K/2 if K is even and J = (K − 1)/2 if K is odd. For our computation, we choose N = 360, so each sector covers 1° of longitude. To keep a desirable number of main pressure highs and lows, Lorenz suggested keeping the ratio N/K = 30, and therefore K = 12. The forcing F is selected as a compromise between too long a doubling time (smaller F) and undesirable shorter waves (larger F). We first choose arbitrary values of the variables x_n and, using a fourth-order Runge–Kutta method with a time step Δt = 0.05, or 6 hours, we integrate forward for 14400 steps, or 10 years. We then use the final values, which should be free of transient effects. Figure 1 shows the values of the model variables for the selected parameters. For this setting, the global largest Lyapunov exponent, computed by the method of numerical calculation presented in [16], is positive. The definition of a chaotic system according to [3] states that a bounded dynamical system with a positive Lyapunov exponent is chaotic. Because the value of the largest Lyapunov exponent is positive and the system under study is bounded, it is chaotic. Strictly speaking, we would also need to exclude asymptotically periodic behavior, but such a task is impossible to fulfill in a numerical simulation. The choice of the parameters and of the time unit (= 5 days) is made to obtain a value of the largest Lyapunov exponent similar to that of state-of-the-art NWPMs.
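A minimal numerical sketch of the model and its integration follows, under our reading of the L05II bracket (the weighting details should be checked against Lorenz [12]); N = 360 and K = 12 follow the text, while F = 15 is chosen here only for illustration:

```python
import numpy as np

def l05ii_tendency(x, K, F):
    """Tendency of Lorenz's (2005) model II,
        dx_n/dt = [x, x]_{K,n} - x_n + F,
    with the bracket written via the smoothing W_n = sum'_i x_{n-i} / K:
        [x, x]_{K,n} = -W_{n-2K} W_{n-K} + (1/K) sum'_j W_{n-K+j} x_{n+K+j},
    where sum' runs over -J..J and halves the end terms when K is even
    (this formulation is our assumption, to be checked against Lorenz 2005)."""
    N = x.size
    J = K // 2                       # equals K/2 (even K) or (K-1)/2 (odd K)
    w = np.ones(2 * J + 1)
    if K % 2 == 0:
        w[0] = w[-1] = 0.5           # modified summation for even K
    idx = np.arange(N)
    W = np.zeros(N)
    for i, wi in zip(range(-J, J + 1), w):
        W += wi * x[(idx - i) % N]   # cyclic index: variables live on a circle
    W /= K
    adv = -W[(idx - 2 * K) % N] * W[(idx - K) % N]
    tmp = np.zeros(N)
    for j, wj in zip(range(-J, J + 1), w):
        tmp += wj * W[(idx - K + j) % N] * x[(idx + K + j) % N]
    return adv + tmp / K - x + F

def rk4_step(x, dt, K, F):
    """One fourth-order Runge-Kutta step with time step dt (0.05 = 6 h)."""
    k1 = l05ii_tendency(x, K, F)
    k2 = l05ii_tendency(x + 0.5 * dt * k1, K, F)
    k3 = l05ii_tendency(x + 0.5 * dt * k2, K, F)
    k4 = l05ii_tendency(x + dt * k3, K, F)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
```

Note that the uniform state x_n = F is a fixed point of (2), so a small perturbation must be added before integrating to reach the chaotic attractor.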

3. Ensemble Prediction Method

The ensemble prediction method (EPM) employed is similar to [14] and is used to calculate the average initial error growth. We make an initial “run” by choosing an error e_n and letting x′_n = x_n + e_n be the “observed” initial values of the variables. We then integrate forward from the true and from the observed initial state for between 25 and 37.5 days (100 to 150 steps). This time length covers the initial error growth up to the limit of predictability. We obtain sequences x_n(t) and x′_n(t), after which we let e_n(t) = x′_n(t) − x_n(t) for all values of n and t. In an NWPM, the forecast error growth is obtained as an average of values over 90 days and over various numbers of ensemble members. To simulate that, we repeat runs in the above-described manner with several different totals of runs (the settings are given in Section 5). In each run, the new values of x_n are set to the old values of x′_n. Finally, we let log E(t) be the average over all runs of log e(t), where e(t) is the root-mean-square of e_n(t) over n. The logarithmic average is chosen because of its suitability for comparison with growth governed by the largest Lyapunov exponent. For further information, see [17–19].
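The chained true/observed procedure can be sketched as follows; for brevity the stand-in dynamics is the plain L96 system (1) rather than L05II, and the function names are our own, not the authors' code:

```python
import numpy as np

def l96_tendency(x, F=8.0):
    # classic L96 tendency (1), used here as a cheap stand-in for L05II
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4(x, dt, F=8.0):
    # one fourth-order Runge-Kutta step
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def average_error_growth(x0, e0_size, n_runs, n_steps, dt=0.05, seed=0):
    """Log-averaged error curve E(t) from chained true/observed runs
    (a sketch of the EPM described in the text)."""
    rng = np.random.default_rng(seed)
    sum_log_e = np.zeros(n_steps + 1)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_runs):
        xt = x.copy()                                      # "true" state
        xo = x + rng.normal(0.0, e0_size, size=x.size)     # "observed" state
        for k in range(n_steps + 1):
            err = np.sqrt(np.mean((xo - xt) ** 2))         # RMS error e(t)
            sum_log_e[k] += np.log(err)
            xt, xo = rk4(xt, dt), rk4(xo, dt)
        x = xo.copy()   # next run starts from the old observed trajectory
    return np.exp(sum_log_e / n_runs)  # geometric mean over runs
```

The returned curve is the geometric mean of the run-wise error magnitudes, which is the discrete counterpart of averaging log e(t).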

4. Error Growth Hypotheses

According to Lorenz [14], there is an eventual cessation of the exponential error growth, due to processes represented by the nonlinear terms in the weather-governing equations. Most important are the quadratic terms, which represent the advection of the temperature and velocity fields. Under the assumption that the principal nonlinear terms in the atmospheric equations are quadratic, the nonlinear terms in the equations governing the field of errors are also quadratic. To describe this, Lorenz [14] defined the QH

dE/dt = aE − bE², (4)

where E(t) is the distance at time t between two originally nearby trajectories and a and b are constants. As an alternative, Trevisan et al. [7] introduced the LH

dE/dt = E(c − d ln E), (5)

where c and d are constants. The explanation is as follows: if we consider the relative growth rate (1/E)(dE/dt) as a function of the normalized error, then a linear dependence on the error itself represents the QH, whereas [7] assumes that a linear dependence on its logarithm is superior to the QH.

As a modification of the QH (4), we use Simmons et al.'s version [6] (QHM),

dE/dt = (aE + s)(1 − E/E_lim), (6)

which is used to approximate growth with a combination of initial and model error, and where a, s, and E_lim are constants. We also add a constant term s to the LH (5) (LHM),

dE/dt = E(c − d ln E) + s, (7)

where c, d, and s are constants. The reason for choosing (6) and (7) is based on the assumption that, at E = 0, the growth rate of (6) and (7) equals s, whereas (4) and (5) give a zero growth rate there, in disagreement with the model data. By adding the constant term, we may resolve this difference.

In [7, 20, 21], it is shown for low-dimensional models that, if the initial error is sufficiently small and therefore the early stages of error growth are exponential, the QH is superior. If the initial error is not small enough, it is better to use the LH. Generally, whether an error is small enough to guarantee exponential growth depends on the specific meteorological conditions and/or the model under study.

5. Experimental Design

We want to achieve conditions as similar to an NWPM as possible. The size of the initial error for an NWPM (perfect-model framework) is, by [9], approximately between 0.01% and 2% of the AV of the forecast error for the control forecast and between 3% and 10% of the AV for the ensemble members. The different values of the AV fraction are a result of varying resolution and of calculation for different variables (Z500, Z1000, and T850). In our case, the AV E_lim is calculated by four independent methods with the same results. The first method is the numerical computation of the ensemble prediction approach. The second and third methods are based on the formula

E_lim² = mean(x_f²) + mean(x_v²) − 2 mean(x_f) mean(x_v), (8)

where x_f is the “forecast” started from the observed state x′_n, x_v is the verification started from the true state x_n, and mean(·) denotes the average value (written with overbars in [6, 10], where the explanation of (8) can be found). The fourth method is based on the assumption [2] that the variability of the model is 71% of E_lim.
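The logic behind (8) and the 71% rule can be illustrated with two synthetic, mutually independent “climate” series: for such series the mean squared difference reduces to the right-hand side of (8), and the standard deviation is about 71% of the saturated RMS difference (a synthetic consistency check, not model data):

```python
import numpy as np

# Two long, independent series drawn from the same "climate" distribution.
rng = np.random.default_rng(2)
xf = 3.0 + rng.standard_normal(200000)   # stand-in "forecast" values
xv = 3.0 + rng.standard_normal(200000)   # independent "verification" values

# Saturated squared error between the two series ...
lhs = np.mean((xf - xv) ** 2)
# ... and the right-hand side of (8); they agree because, for independent
# series, mean(xf * xv) factorizes into mean(xf) * mean(xv).
rhs = np.mean(xf ** 2) + np.mean(xv ** 2) - 2 * np.mean(xf) * np.mean(xv)
```

For these series the climatic variability (the standard deviation) is 1, while the saturated RMS difference is sqrt(2), i.e., the variability is about 71% of the asymptotic value, matching the assumption of [2].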

Recalculating the initial errors to the L05II model leads to sizes of e between 0.001 and 0.84. For the initial error sizes, we therefore choose e randomly from five normal distributions N(μ, σ), where μ is the mean and σ is the standard deviation, with means spanning this range. These choices are made because [20, 21] show that the change-over between the QH and the LH takes place within this range of initial error sizes for L96. NWPMs routinely define the initial conditions of the ensemble members by the fastest-growing singular vectors. We do not adopt this, because, by [10, 22], it affects the early stages of forecast error growth and, in our case, we want to have the model data as close as possible to the tested hypotheses. From these initial conditions, the average initial error growth is calculated by the ensemble prediction method with the fourth-order Runge–Kutta integration scheme with a time step Δt = 0.05, or 6 hours, for the chosen numbers of runs. Because we want to study the agreement of (4)–(7) with the model data, we evaluate the model data over three time lengths, two fixed lengths (given in days) and the limit of predictability, and we calculate the parameters of (4)–(7) from fits over these lengths. The choice of the first two values is made because we want to keep the ratio of fitting length to forecast length the same for the NWPM and L05II.

With E_0 = E(0) denoting the initial error, the solutions of (4)–(6) are

E(t) = (a/b) / [1 + (a/(bE_0) − 1) e^{−at}], (9)

E(t) = exp[c/d + (ln E_0 − c/d) e^{−dt}], (10)

(aE + s)/(1 − E/E_lim) = [(aE_0 + s)/(1 − E_0/E_lim)] e^{(a + s/E_lim)t}, (11)

where (11) can be solved explicitly for E(t).
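Assuming the QH takes the logistic form dE/dt = aE - bE^2, its closed-form solution can be cross-checked against a direct numerical integration (a quick consistency sketch, not part of the original computation):

```python
import numpy as np

def qh_logistic(t, E0, a, b):
    # closed-form (logistic) solution of dE/dt = a*E - b*E^2
    return (a / b) / (1.0 + (a / (b * E0) - 1.0) * np.exp(-a * t))

def qh_numeric(E0, a, b, dt, n_steps):
    # reference value: fourth-order Runge-Kutta integration of the same ODE
    f = lambda E: a * E - b * E * E
    E = E0
    for _ in range(n_steps):
        k1 = f(E)
        k2 = f(E + 0.5 * dt * k1)
        k3 = f(E + 0.5 * dt * k2)
        k4 = f(E + dt * k3)
        E += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return E
```

The closed form also makes the asymptotic value explicit: as t grows, E(t) tends to a/b regardless of the initial error E_0.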

The LHM (7) cannot be solved analytically, and it is therefore solved numerically by the fourth-order Runge–Kutta integration scheme with a time step Δt = 0.05. We have five types of normal distribution for the sizes of the initial error e, five settings of the EPM (numbers of runs), three forecast lengths FL, and five ways of obtaining the initial error growth data: the EPM, the solutions (9)–(11), and the numerical solution of (7) (NS (7)). To answer the key questions, we compute the time limits for all combinations. We take the EPM data as the most reliable dataset in our experiment for all initial error sizes, and we calculate the differences of the other combinations from it at the same time limits.
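A numerical solution of the LHM along these lines can be sketched as follows (the functional form of the LHM here is our assumption, and the parameter values are purely illustrative):

```python
import numpy as np

def lhm_rhs(E, c, d, s):
    # assumed LHM form: dE/dt = E*(c - d*ln E) + s
    return E * (c - d * np.log(E)) + s

def solve_lhm(E0, c, d, s, dt, n_steps):
    """Fourth-order Runge-Kutta integration of the LHM, which has no
    closed-form solution once the constant term s is added."""
    E = np.empty(n_steps + 1)
    E[0] = E0
    for k in range(n_steps):
        y = E[k]
        k1 = lhm_rhs(y, c, d, s)
        k2 = lhm_rhs(y + 0.5 * dt * k1, c, d, s)
        k3 = lhm_rhs(y + 0.5 * dt * k2, c, d, s)
        k4 = lhm_rhs(y + dt * k3, c, d, s)
        E[k + 1] = y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return E
```

With s = 0 the curve relaxes to exp(c/d), the Gompertz asymptote of the plain LH; a small positive s shifts the equilibrium slightly above that value.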

6. Results and Discussion

Tables 1 and 2 show, in the darker grey rows, the resulting EPM values for all time limits and for all initial error sizes. The average over the EPM settings is used, because the differences between the settings are of the order of 0.1 day and the larger ensembles do not give values systematically closer to the reference than the smaller ones. The rows in Tables 1 and 2 marked (EPM) − (X), where X successively represents (9), (10), (11), and NS (7), show the difference between the most reliable data and the data from the other combinations. The columns marked σ display the standard deviations of these differences. The last third of Table 2 shows the average values of the differences with their standard deviations. From Tables 1 and 2, we can see that the QH with solution (9) has an almost constant difference from the EPM, equal on average to a value between 2.3 and 2.5 days, with σ between 0.1 and 0.3 days; the difference is higher only in a few cases. The LH with solution (10) does not give a good fit to the model values and is the worst choice among the approximations. The QHM with solution (11) has an almost constant difference from the EPM, equal on average to a value between 1.5 and 1.9 days, with σ between 0 and 0.2 days, with exceptions for some combinations of initial error size and time limit. Some values of the difference for (11) are not displayed, because the asymptotic value of (11) is smaller than the corresponding model value (Figures 2, 3, and 4). The solution NS (7) of the LHM does not exist for the two smallest initial error sizes (Figure 2). For the other combinations, there is also an almost constant difference from the EPM, equal on average to a value between 1.3 and 1.5 days, with σ between 0 and 0.2 days, again with exceptions for some combinations. Let us first take a look at the constant differences and their source. Figures 2 to 4 display the time variations of the average prediction error.
In these figures, we can see that, in contrast to the approximations, the model values show a negative growth rate during the first day, turning into an increase thereafter. At around two days, the model values reach the same value that they had initially. NWPMs also show this type of behavior [23], and the approximations cannot capture it.

To summarize our findings: even though solutions (11) and NS (7) give better approximations to the model data than (9) for all initial error sizes, they have major disadvantages. Solution (11) underestimates the model data in some cases, and therefore we cannot calculate the time when this approximation reaches 95% of the asymptotic value. Solution NS (7) does not exist for the two smallest initial error sizes. If we subtract two days (the time when the model values regain the value that they had initially) from the differences between the model data and (9), the QH would become superior, because we would get results similar to (11) and NS (7) but without the above-mentioned disadvantages. One may argue that, because of the subtraction of two days, we should recalculate the approximations. We did that, and the results are close to the ones with the subtraction. It is also worth mentioning that the fitted asymptotic values show a systematic ordering relative to the model value. For the values computed over the two shorter time lengths, the largest departures from the almost constant differences occur, and the best results are given by the QH with solution (9).

Table 3 focuses on the average, over the initial error sizes, of the asymptotic value for the three forecast lengths. This value is, for example, used to find out whether the variability of the model is equal to the variability of the atmosphere [10]. The differences from the model values (Table 3, Figures 5, 6, and 7) indicate a really poor approximation by (5) and (7) and usable approximations by (4) and (6), but with the already mentioned disadvantage of underestimation by (6). For (4), the difference lies between −0.1 and 0 (relative to the model value, between −1.2% and 0%) for the first forecast length, between −0.1 and 0 (between −1.2% and 0%) for the second, and between −0.4 and 0.4 (between −4.8% and 4.8%) for the third.

7. Conclusion

This paper studies the errors of estimations of the time limits and the asymptotic value of initial error growth in the chaotic atmospheric model L05II, introduced by Lorenz [12], with parameters as close to NWPMs as possible. Five types of initial conditions are represented by five normal distributions. The five settings of the EPM showed differences of the order of 0.1 day, and therefore their average value was taken as the model data. The quadratic hypothesis shows the best agreement with the model's asymptotic value and good agreement with the model's time limits. The approximation can be further improved by subtraction of a constant value, after which the quadratic hypothesis is the closest of all hypotheses to the model data. The purpose and size of this constant are explained. The logarithmic hypothesis has the lowest agreement with the model data for the time limits and the asymptotic value. The modified quadratic hypothesis approximates the model's asymptotic value well, but it is not the best. For the time limits, it is the best choice for approximation as long as we do not use the subtraction of the constant. Its disadvantage is that, in some cases, it underestimates the model data and therefore some time limits are not available. The modified logarithmic hypothesis does not give good agreement with the model's asymptotic value, but gives similar agreement with the model's time limits as the modified quadratic hypothesis. Its disadvantage is that, for the first two initial conditions, it is not solvable, and it is therefore usable only for larger initial errors. The quadratic hypothesis after subtraction of the constant value overestimates the model data by 0.5 days on average. A higher value, 1.9 days on average, is shown only for the shortest prediction time length at one particular time limit. The relative difference between the model's asymptotic value and the asymptotic value calculated from the quadratic hypothesis is between 0 and 1.2% for the two shorter prediction times and between −4.8% and 4.8% for the longest prediction time. Therefore, only for this last prediction time should we take this difference into account.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors were supported by Project no. SVV-2013-267308.