International Scholarly Research Notices


Research Article | Open Access

Volume 2011 |Article ID 321683 | https://doi.org/10.5402/2011/321683

M. Brackstone, A. S. Deakin, "Approximations of Time Series", International Scholarly Research Notices, vol. 2011, Article ID 321683, 10 pages, 2011. https://doi.org/10.5402/2011/321683

Approximations of Time Series

Academic Editor: M. Tabata
Received 14 Jun 2011
Accepted 18 Jul 2011
Published 22 Sep 2011

Abstract

A method is proposed to approximate the main features or patterns including interventions that may occur in a time series. Collision data from the Ontario Ministry of Transportation illustrate the approach using monthly collision counts from police reports over a 10-year period from 1990 to 1999. The domain of the time series is partitioned into nonoverlapping subdomains. The major condition on the approximation requires that the series and the approximation have the same average value over each subdomain. To obtain a smooth approximation, based on the second difference of the series, a few iterations are necessary since an iteration over one subdomain is affected by the previous iteration over the adjacent subdomains.

1. Introduction

The graduated licensing system (GLS) is a method of gradually exposing young novice drivers to the driving environment: they first gain experience driving under supervision and then progress to more independent driving under higher-risk circumstances [1]. This model was widely incorporated into driver licensing programs across the US and Canada, as well as other countries, during the 1990s. Most of these programs have incorporated similar restrictions into their initial phases [2]. These include driving with supervision, restricted driving at night, limited teenage passengers, and zero blood alcohol level while driving. The method has had limited long-term evaluation in North America, but long-term follow-up in New Zealand suggested a smaller but persistent reduction in young driver collisions as a result of its implementation [3]. The collision data for Ontario drivers around the time of the introduction of the GLS ([2], p. 126) illustrate the variety of approximations of time series that are possible.

There are many practical techniques for smoothing a time series [4]. The smoothed value at a point is a weighted average involving the elements in the series that are within a local window about the point. One way to generate the weights in a moving average involves a local polynomial approximation of order 3 (or 5) where the window includes 5 (or 7 et cetera) points. The weights are then defined by regression. Another approach defines the weights in terms of an appropriate kernel, and this method applies more generally to bivariate data [5]. The advantages of local estimates compared with global estimates are discussed in [6].
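As an illustration of the regression-based weights mentioned above, the following minimal sketch derives moving-average weights from a local least-squares polynomial fit. The function names `solve` and `local_polynomial_weights` are hypothetical, and exact rational arithmetic is an illustrative choice; this is not taken from [4].

```python
from fractions import Fraction

def solve(a, b):
    """Solve the small dense system a x = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def local_polynomial_weights(half_width, degree):
    """Weights w_j such that sum_j w_j * z[t + j] is the value at the window
    centre of the least-squares polynomial of the given degree."""
    offsets = range(-half_width, half_width + 1)
    m = degree + 1
    # Gram matrix V'V of the design matrix V[j][p] = j**p.
    gram = [[Fraction(sum(j ** (p + q) for j in offsets)) for q in range(m)]
            for p in range(m)]
    # The fitted centre value is the constant coefficient, so w = V (V'V)^{-1} e_0.
    x = solve(gram, [Fraction(1)] + [Fraction(0)] * (m - 1))
    return [sum(x[p] * j ** p for p in range(m)) for j in offsets]

weights = local_polynomial_weights(half_width=2, degree=3)
print([str(w) for w in weights])  # ['-3/35', '12/35', '17/35', '12/35', '-3/35']
```

For a cubic fit over a 5-point window the weights are (-3, 12, 17, 12, -3)/35, which sum to 1 and are symmetric about the centre.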

The first step in the computational process involves the partition of the domain of the time series into subdomains. The subdomains are then labelled as odd-numbered or even-numbered. Iterations are performed over the odd-numbered subdomains, followed by iterations over the even-numbered subdomains. This process is numerically efficient since the iterations over one set of subdomains update the boundary conditions for the iterations over the remaining subdomains [7]. To determine a smooth and accurate approximation, this set of iterations is repeated a few times.

This paper is organized as follows. Section 2 describes the form of the approximations, the partition of the time series, and the minimization over the subdomains of the partition. In Section 3, the equations for the approximation are derived, and some computational details are given in Section 4. Under certain assumptions, approximations of time series with variable spacing are possible. The time series and their approximations are presented in Section 5, and an example outlines the approach for step level changes, missing data, and outliers in Section 6. In Section 7, the approximation over a subdomain is determined by a fourth-order polynomial and a straight line. Finally, guidelines for the application of the proposed approach to a time series are outlined in “Concluding remarks.”

2. The General Model

The general model for the approximation of the time series $\{Z_t \mid t = 1, \dots, N\}$ is
$$Z_t = \sum_{k=0}^{n-1} \left(O_t^k + Q_t^k\right) + R_t^n, \tag{2.1}$$
where $Q_t^0 = T_t + M_t$ and $R_t^n = A_t^n W_t^n$. In these equations, $O_t^k$ is the term for outliers (if present); $Q_t^k$ is the $k$th approximation; $T_t$ is a trend and includes the level changes; $M_t$ is a nonperiodic oscillatory function; $R_t^n$ is the remainder; $A_t^n$ is a measure of the variation of the remainder. A restriction on the approximations requires that the root-mean-square (RMS) value of the remainder decreases as $n$ increases. The form of (2.1) is similar to the asymptotic expansion of a function that contains a small positive parameter ([8], p. 1–4).

The partition of the domain of the time series $\{Z_t\}$, henceforth denoted by $Z_t$, is chosen in order to approximate accurately the patterns that may occur in the time series. Let $P = \{E_k \mid k = 1, \dots, M\}$ be a partition of $[1, N]$ where the nonoverlapping subdomains are $E_k = \{t_k - n_k + 1, \dots, t_k\}$, $n_k \ge 1$, and $n_k$ is the number of elements in the $k$th subdomain $E_k$. The overlapping subdomains, over which the iterations are computed, are defined as $E_k^o = \{t_k - n_k, E_k, t_k + 1\}$. For $k = 1$, $E_1^o = \{0, E_1, n_1 + 1\}$, and for $k = M$, $E_M^o = \{N - n_M, E_M, N + 1\}$, so that the overlapping subdomains are defined over the interval $[0, N + 1]$.
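The index bookkeeping of the partition can be sketched as follows. The helper name `subdomains` is hypothetical; it takes the list of subdomain sizes and returns, for each subdomain, the 1-based time indices of the nonoverlapping subdomain and of the overlapping subdomain, which adds one boundary point on each side.

```python
def subdomains(sizes):
    """Return a list of (E_k, E_k_overlapping) pairs of 1-based time indices."""
    result, t_k = [], 0
    for n_k in sizes:
        t_k += n_k                                    # last index of the subdomain
        e_k = list(range(t_k - n_k + 1, t_k + 1))     # nonoverlapping subdomain
        e_k_o = [t_k - n_k] + e_k + [t_k + 1]         # plus the two boundary points
        result.append((e_k, e_k_o))
    return result

parts = subdomains([1, 3, 4])
for e_k, e_k_o in parts:
    print(e_k, e_k_o)
```

With sizes 1, 3, 4 (so N = 8) the overlapping subdomains run from 0 to 9, that is, over [0, N + 1].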

An approximation $Q_t$ of a time series $Z_t$, where $Z_t = Q_t + R_t$, is determined by a few iterations $I_t^n$ starting with $I_t^0 = Z_t$. Once the desired accuracy is obtained, the last iteration is defined as $Q_t$. All iterations and the approximation $Q_t$, along with the remainder $R_t$, satisfy the following properties.

(1) An iteration over $E_k$ has the same average value $a_k$ as the time series,
$$\frac{1}{n_k}\sum_{t \in E_k} Z_t = \frac{1}{n_k}\sum_{t \in E_k} I_t^n = a_k, \tag{2.2}$$
and, hence, the average value of the remainder $R_t$ is zero over this subdomain. If $Z_t$ is a measure of “energy” in the process, then $Q_t$ conserves energy over each subdomain in the partition. In the particular case $n_k = 1$, $E_k = \{t_k\}$ and $t = t_k$ is a fixed point for the approximation so that $Q_t := Z_t$ at $t = t_k$.

(2) The measure of smoothness of the iterations at time $t$ for the $n$th iteration is defined by $\delta_t^n = I_{t+1}^n + I_{t-1}^n - 2I_t^n$, which is the second difference of $I_t^n$ at time $t$. The norm on $E_k^o$ is defined as the RMS value
$$\Delta_k^n = \left(\frac{\sum_{t \in E_k} \left(\delta_t^n\right)^2}{n_k}\right)^{1/2}. \tag{2.3}$$
Provided that the number of elements in $E_k$ is greater than 1, the condition that $\Delta_k^n$ has a minimum value is imposed. $I_t^n$ is required for $t = t_k - n_k$ and $t = t_k + 1$ in $E_k^o$ to determine $\delta_t^n$ at the endpoints of $E_k$. These two values of $I_t^n$ are the boundary conditions for the minimization on $E_k$.

(3) For most of the examples presented in this paper, $t = 1$ and $t = N$ are fixed points so that $Q_1 = Z_1$ ($n_1 = 1$) and $Q_N = Z_N$ ($n_M = 1$). These values provide the boundary conditions for the minimization over the adjacent subdomain. For the general case where $n_1 > 1$ in $E_1$ ($n_M > 1$ in $E_M$), one of the boundary conditions is missing in $E_1^o$ ($E_M^o$) so that an external boundary condition is required, as described in Section 3.1.

(4) In some cases there are two or more approximations over one or more subdomains, and a criterion is required to choose the best approximation. From (2.1), $R_t^k = Q_t^k + R_t^{k+1}$ ($R_t^0 := Z_t$), where $Q_t^k$ is determined from $R_t^k$ over a partition $P_k$. For the example in Section 5, the simplest case occurs when $P_k$ is a refinement of $P_{k-1}$; that is, $P_k = \cup_\ell P_k^\ell$ where $P_k^\ell$ covers $E_\ell$ in $P_{k-1}$. Then one approximation over $E_\ell$ is $Q_t^k = 0$ and the other is defined by $P_k^\ell$. Let $S_{k+1}$ denote the RMS value of the remainder $R_t^{k+1}$ over $P_k^\ell$, and let $S_k$ denote the RMS value of $R_t^k$ over $E_\ell$. The approximation defined by $P_k^\ell$ is a significant improvement if the ratio $S_{k+1}/S_k \le \epsilon$ for a chosen value of $\epsilon$. As shown in Section 7, an upper bound for $\epsilon$ takes on values between 0.75 and 0.9. For the example in Section 6 involving an outlier, there are two approximations for $Q_t^0$.

(5) The magnitude $A_t^n$ of the remainder $R_t^n$ is defined to be the RMS value of the remainder over each subdomain in a partition, and this definition implies that the RMS value of the series $W_t^n$ in (2.1) is equal to 1. For the example presented in Section 5, the subdomains are uniform with 12 elements.
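Properties (4) and (5) reduce to RMS computations over subdomains. The sketch below uses hypothetical helper names and a made-up remainder purely for illustration: it splits a remainder into its magnitude (the RMS value over each subdomain) and a series with unit RMS over each subdomain, and it applies the ratio criterion for a refinement.

```python
import math

def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def magnitude_and_unit_series(remainder, sizes):
    """Split a remainder R_t into A_t (RMS over each subdomain) and W_t = R_t / A_t."""
    magnitude, unit, start = [], [], 0
    for n_k in sizes:
        block = remainder[start:start + n_k]
        a = rms(block)
        magnitude += [a] * n_k
        unit += [r / a if a else 0.0 for r in block]  # guard against a zero block
        start += n_k
    return magnitude, unit

def is_significant(s_next, s_prev, eps):
    """Criterion of item (4): accept the refinement if S_{k+1} / S_k <= eps."""
    return s_next / s_prev <= eps

A, W = magnitude_and_unit_series([1.0, -1.0, 2.0, -2.0], sizes=[2, 2])
print(A, W)
print(is_significant(0.49, 1.0, eps=0.75))
```

Here each subdomain of `W` has RMS value 1, and a ratio of 0.49 passes a threshold of 0.75.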

3. Mathematical Details

Given the iterates $I_t^{n-1}$ and $\delta_t^{n-1}$ on $E_k^o$, the iterates $I_t^n$ for $t \in E_k$ are computed such that the sum of squares of $\delta_t^n$ has a minimum value. For the moment, the first and last intervals $E_1$ and $E_M$ are excluded. The following variables are required to set up the equations for the minimization over the subdomain:
$$X_k^n = \left[\delta^n_{t_k - n_k + 1}, \dots, \delta^n_{t_k}\right]', \quad Y_k^n = \left[I^n_{t_k - n_k + 1}, \dots, I^n_{t_k}\right]', \quad B_k^n = \left[I^n_{t_k - n_k}, 0, \dots, 0, I^n_{t_k + 1}\right]'. \tag{3.1}$$
These are $n_k \times 1$ matrices, and a prime on a matrix indicates the transposed matrix. From the definition of $\delta_t^n$ in Section 2, these matrices are related by $X_k^n - AY_k^n = B_k^n$, where $A$ is an $n_k \times n_k$ tridiagonal symmetric matrix with rows
$$\{(-2, 1); (1, -2, 1); \dots; (1, -2, 1); (1, -2)\}, \tag{3.2}$$
where $-2$ is on the main diagonal. The equations for the iterations are obtained by replacing $B_k^n$ with $B_k^{n-1}$. Since the sum of $I_t^n$ for $t \in E_k$ is a constant for all $n$,
$$H'Y_k^n = \sum_{t \in E_k} Z_t = a_k n_k, \tag{3.3}$$
where $H = [1, 1, \dots, 1]'$, and $a_k$ is the average value of $Z_t$ over $E_k$. The condition on the sum in terms of $X_k^n$ is $E'(X_k^n - B_k^{n-1}) = a_k n_k$, where $E$ is the solution of $AE = H$. Thus, $E'X_k^n = n_k \rho_k^{n-1}$ where
$$\rho_k^{n-1} = a_k - \frac{I^{n-1}_{t_{k-1}} + I^{n-1}_{t_k + 1}}{2}, \tag{3.4}$$
with $t_{k-1} = t_k - n_k$, and $E[1] = E[n_k] = -n_k/2$ (Section 7). The solution for $X_k^n$ such that $(X_k^n)'X_k^n$ has a minimum is $X_k^n = \rho_k^{n-1} G^2(n_k) E$, where $G^2(n_k) = n_k/(E'E)$. Finally, $\Delta_k^n = |\rho_k^{n-1}| G(n_k)$, and the solution $Y_k^n$ is
$$Y_k^n = I^{n-1}_{t_k - n_k} V_1 + I^{n-1}_{t_k + 1} V_2 + \rho_k^{n-1} V_3, \tag{3.5}$$
$$V_1[i] = 1 - \frac{i}{n_k + 1}, \quad V_2[i] = \frac{i}{n_k + 1}, \quad AV_3 = G^2(n_k) E. \tag{3.6}$$
The iteration is $I_t^n = Y_k^n[i]$, where $i = 1, \dots, n_k$ and $t = t_k - n_k + 1, \dots, t_k$ in $E_k$, respectively. For a given $n_k$, $E$ and then $V_3$ are uniquely determined, and $V_3$ is computed in advance for all of the subdomains that occur in a time series.
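The update over a single interior subdomain can be sketched as follows, under the reading of (3.1) to (3.6) in which $E$ solves $AE = H$ and $V_3$ solves $AV_3 = G^2(n_k)E$. The function names are hypothetical, and exact rational arithmetic stands in for the exact Maple solve; the tridiagonal systems are solved with the standard Thomas algorithm.

```python
from fractions import Fraction

def solve_tridiagonal(rhs):
    """Solve A x = rhs exactly, where A has -2 on the diagonal and 1 beside it."""
    n = len(rhs)
    c = [Fraction(0)] * n            # modified superdiagonal
    d = [Fraction(0)] * n            # modified right-hand side
    c[0] = Fraction(-1, 2)
    d[0] = Fraction(rhs[0]) / -2
    for i in range(1, n):
        denom = -2 - c[i - 1]
        c[i] = 1 / denom
        d[i] = (Fraction(rhs[i]) - d[i - 1]) / denom
    x = [Fraction(0)] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def update_subdomain(z_block, left, right):
    """Return (Y, rho, G2, E): the new iterate over E_k given boundary values."""
    n = len(z_block)
    a = sum(z_block, Fraction(0)) / n                 # average a_k over E_k
    E = solve_tridiagonal([1] * n)                    # A E = H
    G2 = Fraction(n) / sum(e * e for e in E)          # G^2 = n_k / (E'E)
    V3 = solve_tridiagonal([G2 * e for e in E])       # A V3 = G^2 E
    rho = a - (Fraction(left) + Fraction(right)) / 2  # equation (3.4)
    Y = []
    for i in range(1, n + 1):
        w = Fraction(i, n + 1)
        Y.append(left * (1 - w) + right * w + rho * V3[i - 1])   # equation (3.5)
    return Y, rho, G2, E

Y, rho, G2, E = update_subdomain([1, 4, 2, 8], left=5, right=3)
padded = [Fraction(5)] + Y + [Fraction(3)]
X = [padded[i + 1] + padded[i - 1] - 2 * padded[i] for i in range(1, 5)]
print([str(y) for y in Y], [str(x) for x in X])
```

In this example the sum of `Y` equals the sum of the data (15), and the second differences `X` equal $\rho G^2 E$, the minimum-norm solution stated above.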

3.1. External Boundary Conditions

If $t = 1$ is not a fixed point, then the subdomain $E_1^o$ requires a boundary condition. Here are three possible external boundary conditions to impose at $t = 0$ that can be used to reflect the possible behavior of the time series near the endpoint.
(1) $I_0^{n-1} = I_1^{n-1}$. The slope of the tangent is zero at $t = 1$.
(2) $I_0^{n-1}$ is defined so that $\rho_1^{n-1} = 0$ in (3.4). This condition implies that the approximation over $E_1$ is a segment of a straight line.
(3) An iterative process is used to obtain $Z_0$ such that the RMS value of the remainder over the adjacent subdomain(s) has a minimum value.

Similarly, if $t = N$ is not a fixed point, then the external boundary conditions are obtained by replacing $I_0^{n-1}$, $I_1^{n-1}$, $\rho_1^{n-1}$, $Z_0$ with $I_{N+1}^{n-1}$, $I_N^{n-1}$, $\rho_M^{n-1}$, $Z_{N+1}$, respectively.
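The first two external boundary conditions can be written down directly. In the sketch below the function names are hypothetical. For condition (2), setting $\rho_1^{n-1} = 0$ in (3.4) gives $I_0^{n-1} = 2a_1 - I_{n_1+1}^{n-1}$, and (3.5) then reduces to the two straight-line terms.

```python
from fractions import Fraction

def ghost_zero_slope(i_1):
    """Condition (1): the value at t = 0 equals the previous iterate at t = 1."""
    return i_1

def ghost_straight_line(a_1, i_right):
    """Condition (2): choose the value at t = 0 so that rho_1 = 0 in (3.4)."""
    return 2 * a_1 - i_right

def first_subdomain_line(a_1, i_right, n_1):
    """Iterate over E_1 under condition (2): only the linear terms of (3.5) remain."""
    i_0 = ghost_straight_line(a_1, i_right)
    return [i_0 * (1 - Fraction(i, n_1 + 1)) + i_right * Fraction(i, n_1 + 1)
            for i in range(1, n_1 + 1)]

line = first_subdomain_line(a_1=3, i_right=5, n_1=4)
print([str(v) for v in line])  # ['9/5', '13/5', '17/5', '21/5']
```

The resulting values lie on a straight line through the ghost value 1 at $t = 0$ and the boundary value 5 at $t = 5$, and their average is the prescribed $a_1 = 3$.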

4. Computational Aspects

For a time series $Z_t$ and a partition $P$, there is a related averaged series defined by $\bar{Z}_t = a_k$ for $t \in E_k$, where $a_k$ is the average value of $Z_t$ for $t \in E_k$. The following property holds for all of the approximations in this paper:

The approximations for a time series $Z_t$ and the averaged time series $\bar{Z}_t$ are the same to the desired accuracy provided that the same partition and the same external boundary conditions (if any) are applied.

Consequently, any time series with variable spacing can be approximated provided that the estimates of the average values of the time series over the subdomains are adequate.

The approximation for the averaged series is employed especially for larger subdomains ($n_k \approx 12$ or more). The efficiency of the computations is increased if the boundary conditions in the first set of iterations (even and odd) are the average of the four values of the series that straddle the subdomains $E_{k-1}$ and $E_k$. The averaged series was used in all computations, although the approximation obtained from $Z_t$ may be more efficient in special cases.

It is convenient to introduce another notation to represent a partition: $P = \{n_1, n_2, \dots; \dots; \dots, n_M\}$, where the numbers of elements in the subdomains in the first block are $\{n_1, n_2, \dots\}$ and in the last block are $\{\dots, n_M\}$. These blocks are a convenient way to separate the seasons or a set of months. Also, the approximation $Q_t^k$ obtained by iterating the time series $n$ times, using the partition $P_k$, is denoted by $\mathcal{P}_k^n\{R_t^k\}$ ($R_t^0 := Z_t$). The number of iterations $n$ is determined from the difference
$$D_t^n = \mathcal{P}_k^n\{R_t^k\} - \mathcal{P}_k^{2n}\{R_t^k\} \tag{4.1}$$
by imposing the condition that $\max(|D_t^n|) < L$. The values used are $L = 1$ in Figures 1 and 2, $L = 0.04$ in Figures 3 and 4, and $L = 0.01$ in Figure 5. All calculations in this paper were performed using Maple software [9].
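The complete iteration can be sketched as below. Several choices are assumptions rather than details taken from the paper: one "iteration" is read as one pass over the odd-numbered subdomains followed by one pass over the even-numbered ones, the starting iterate is the series itself rather than the averaged series, the first and last subdomains are single fixed points (so no external boundary condition is needed), and $V_3$ is evaluated from the closed-form quartic given in Section 7. All function names are hypothetical.

```python
from fractions import Fraction

def v3(n):
    """V3 for a subdomain of n points, from the quartic polynomial of Section 7."""
    so2 = Fraction(n + 1, 2) ** 2
    g2 = Fraction(n) / Fraction((n + 1) ** 5 - (n + 1), 120)   # G^2 = n / (E'E)
    out = []
    for i in range(1, n + 1):
        s2 = Fraction(2 * i - n - 1, 2) ** 2
        out.append(g2 * (s2 - so2) * (s2 - 5 * so2 - 1) / 24)
    return out

def approximate(z, sizes, sweeps):
    """Apply `sweeps` odd/even passes of (3.5) to z over the given partition."""
    assert sum(sizes) == len(z) and sizes[0] == 1 and sizes[-1] == 1
    it = [Fraction(str(v)) for v in z]                 # I^0 = Z
    blocks, start = [], 0
    for n in sizes:
        blocks.append((start, start + n))              # 0-based, half-open
        start += n
    means = [sum(it[b:e], Fraction(0)) / (e - b) for b, e in blocks]
    for _ in range(sweeps):
        for parity in (0, 1):                          # E_1, E_3, ... then E_2, E_4, ...
            for k in range(parity, len(blocks), 2):
                b, e = blocks[k]
                if b == 0 or e == len(it):
                    continue                           # fixed endpoints
                n, left, right = e - b, it[b - 1], it[e]
                rho = means[k] - (left + right) / 2    # equation (3.4)
                for i, u in enumerate(v3(n), start=1):
                    w = Fraction(i, n + 1)
                    it[b + i - 1] = left * (1 - w) + right * w + rho * u
    return it

def approximate_to_tolerance(z, sizes, L, max_sweeps=64):
    """Double n until max |P^n - P^2n| < L, as in (4.1); return (P^n, n)."""
    n = 1
    while n <= max_sweeps:
        qn, q2n = approximate(z, sizes, n), approximate(z, sizes, 2 * n)
        if max(abs(x - y) for x, y in zip(qn, q2n)) < L:
            return qn, n
        n *= 2
    raise RuntimeError("no convergence within max_sweeps")

z = [5, 1, 4, 2, 8, 3, 7, 6, 9, 2]
q, n_used = approximate_to_tolerance(z, [1, 4, 4, 1], L=Fraction(1, 100))
print(n_used, [round(float(v), 3) for v in q])
```

Whatever number of passes is used, the sums over the two interior subdomains stay equal to those of the data (15 and 25), and the end points remain fixed, in line with property (1) of Section 2.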

5. Applications

Two time series, provided by the Ontario Ministry of Transportation ([2], p. 126), illustrate the approximations. The graph of the time series for the monthly accidents for young novice drivers is given in Figure 1 where the main feature here is the intervention that occurs at 52 months owing to the introduction of the GLS on April 1, 1994. The corresponding graph for all drivers is shown in Figure 3 where the sharp drop in the graph from the maximum in December/January to April, except for the last 2 years, is a strong feature of the series.

In Figure 1, the partition $P_0 = \{1, 3; 4; \dots; 4; 3, 1\}$, which is uniform except for the first and last blocks, provides a smooth approximation $Q_t^0 = \mathcal{P}_0^4\{Z_t\}$ and captures the intervention well. In this case, the first and last elements are fixed ($Q_1^0 = Z_1$ and $Q_{120}^0 = Z_{120}$). Since the sum of the elements in the remainder over $E_k$ is zero, $\mathcal{P}_0\{R_t^1\} = \{0\}$. Other uniform partitions are possible where there are 3 or 6 elements in each subdomain. The approximation in the former case is slightly less smooth than $Q_t^0$ in Figure 1, and the RMS value of the remainder is 26. For the case of 6 elements, the RMS value of the remainder is 30. A more accurate approximation is obtained if the subdomains have two elements; however, this approximation has an angular appearance since it more closely approximates the time series.

In Figure 2, the approximation $Q_t^0$ of Figure 1 is expressed as a sum of a trend and an oscillatory series. The partition for the trend $T_t = \mathcal{P}_T^6\{Q_t^0\}$ is $P_T = \{12, 12, 12, 12; 6, 6, 6, 6; 12, 12, 12, 12\}$. The external boundary condition implies that the tangent is horizontal at the endpoints of the series. The trend in this example is defined as a seasonal approximation of the time series where the subdomains contain 6 elements over the domain of the intervention. The remainder is the oscillatory series $M_t = Q_t^0 - T_t$.

In Figure 3, the points for January or December (plus one November) and April are fixed points for the approximation $Q_t^0 = \mathcal{P}_0^3\{Z_t\}$. The partition is $P_0 = \{1,2,1,7,1; 3,1,7,1; 3,1,7,1; 3,1,8; 1,2,1,7,1; 3,1,6,2; 1,2,1,8; 1,2,1,6,1; 2,2,1,8; 1,2,1,7,1\}$. The second approximation captures the increase in the number of accidents that occur in the summer months by approximating the remainder $R_t^1$ in $Z_t = Q_t^0 + R_t^1$ to obtain $R_t^1 = Q_t^1 + R_t^2$, where $Q_t^1 = \mathcal{P}_1^3\{R_t^1\}$. The partition $P_1$ is a refinement of $P_0$ where 7 is replaced with 3,4; 6 with 3,3; and 8 with 3,5. Consequently, the approximations $Q_t^0$ and $Q_t^0 + Q_t^1$ have the same average value over the subdomains of $P_0$.

The subdomains that are the same in the two partitions $P_0$ and $P_1$ are indicated by the intervals over which the approximation is zero in Figure 3. For the remaining intervals, the ratio of the RMS value of the remainder $R_t^2$ in Figure 4 to the RMS value of $R_t^1$ is equal to 0.49, so that this second approximation is significant. Furthermore, for each of the ten segments of $Q_t^1$, excluding the segments in which the approximation is identically equal to zero, this ratio has a value between 0.40 and 0.52. The approximation $Q_t^0 + Q_t^1$ of $Z_t$ and the remainder $R_t^2$ appear in Figure 4.

6. Level Changes, Missing Data, and Outliers

For a step level change between $t = \tau$ and $t = \tau + 1$, a single approximation may not be adequate for the time series in the subdomains on both sides of the step. For $t > \tau$, an external boundary condition is applied at $t = \tau$ such that the remainder of the approximation has a minimum RMS value. The same approach is applied to the series for $t < \tau + 1$. These ideas are illustrated in Figure 5, where the partition for the approximation $Q_t = \mathcal{P}^6\{T_t\}$ over $[1, 120]$ is $P = \{1, 11; 12; \dots; 12; 11, 1\}$. The approximation exhibits a phenomenon that is similar to a Fourier series near a discontinuity in that the approximation overshoots on the right and undershoots to the left of the jump. The maximum and minimum of $Q_t$ are 1.176 and −0.173. Moving away from the jump in either direction, the oscillations of $Q_t - T_t$ occur with rapidly decreasing amplitude. The details in item 4 of Section 2 are applied over the subdomains adjacent to $t = 60$ to choose between the two approximations of the trend. Interventions and level shifts, from an autoregressive moving average point of view, are presented in [10] and [11].

A simple example indicates the approach for a series that has a missing value or a possible outlier at $t = 6$. The series is $\{Z_t\} = \{0.6, -0.3, -0.5, -0.2, 0.4, Z_6, 1, 0.8, 0.9, 1.1, 1.0, 1.2\}$, where $Z_6$ is not defined in the case of a missing value. For both cases, the approximation $Q_t^0$ is determined for the partition $P = \{1; 4; 1; 5; 1\}$, where the value of the series is $X$ at $t = 6$, such that the RMS value of the remainder is a minimum. A good initial estimate for $X$ is the average value of the time series in a window about $t = 6$ [5]. Then an iterative process is started to obtain $X \approx 1.25$, as shown in Figure 6, and the RMS value of the remainder is 0.104. The smoothest approximation over $P$ occurs for $X \approx 0.12$, where $\sum_{t=2}^{11} \delta_t^2$ has a minimum value. For the case of a possible outlier, $O_t^0 = 0$ for $t \ne 6$ in (2.1) and $O_6^0 = Z_6 - 1.25$, provided that the ratio of the RMS value of the approximation with $X = 1.25$ to the RMS value of the approximation under the assumption that $Z_6$ is not an outlier satisfies the condition in item 4 of Section 2.
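The missing-value example can be reproduced with a short sketch. Two simplifications are assumptions of this sketch, not of the paper: because both multi-point subdomains of $P = \{1; 4; 1; 5; 1\}$ are bounded by fixed points, a single application of (3.5) is taken as the approximation, and because the remainder and the second differences are then affine in $X$, each minimizing $X$ is obtained in closed form instead of by the iterative search described above. Function names are hypothetical, and $V_3$ comes from the quartic of Section 7.

```python
from fractions import Fraction
import math

LEFT = ["0.6", "-0.3", "-0.5", "-0.2", "0.4"]        # Z_1 .. Z_5
RIGHT = ["1", "0.8", "0.9", "1.1", "1.0", "1.2"]     # Z_7 .. Z_12

def v3(n):
    """V3 for a subdomain of n points, from the quartic polynomial of Section 7."""
    so2 = Fraction(n + 1, 2) ** 2
    g2 = Fraction(n) / Fraction((n + 1) ** 5 - (n + 1), 120)
    return [g2 * (Fraction(2 * i - n - 1, 2) ** 2 - so2)
            * (Fraction(2 * i - n - 1, 2) ** 2 - 5 * so2 - 1) / 24
            for i in range(1, n + 1)]

def fit(x):
    """Series with Z_6 = x and its approximation over the partition {1;4;1;5;1}."""
    z = [Fraction(s) for s in LEFT] + [Fraction(x)] + [Fraction(s) for s in RIGHT]
    q = list(z)
    for b, e in ((1, 5), (6, 11)):                   # t = 2..5 and t = 7..11 (0-based)
        n, left, right = e - b, z[b - 1], z[e]
        rho = sum(z[b:e], Fraction(0)) / n - (left + right) / 2
        for i, u in enumerate(v3(n), start=1):
            w = Fraction(i, n + 1)
            q[b + i - 1] = left * (1 - w) + right * w + rho * u
    return z, q

def remainder(x):
    z, q = fit(x)
    return [zi - qi for zi, qi in zip(z, q)]

def second_differences(x):
    _, q = fit(x)
    return [q[t + 1] + q[t - 1] - 2 * q[t] for t in range(1, 11)]   # t = 2..11

def argmin_sum_of_squares(f):
    """Minimizer of sum(f(x)**2) when every component of f is affine in x."""
    c = f(Fraction(0))
    d = [u - v for u, v in zip(f(Fraction(1)), c)]
    return -sum(ci * di for ci, di in zip(c, d)) / sum(di * di for di in d)

x_rms = argmin_sum_of_squares(remainder)
x_smooth = argmin_sum_of_squares(second_differences)
rms_value = math.sqrt(float(sum(r * r for r in remainder(x_rms))) / 12)
print(float(x_rms), rms_value, float(x_smooth))
```

Under these assumptions the sketch gives values that agree with the reported $X \approx 1.25$, RMS remainder 0.104 (taken over all 12 points), and $X \approx 0.12$ for the smoothest approximation.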

7. Properties of Approximations

Approximations of Random Samples
The point of this exercise is to determine the $\epsilon$ in item 4 of Section 2 such that the only reasonable approximation for a series of random samples is $Q_t^0$ equal to the mean of the series. Using Maple, 12,000 random samples from the normal distribution with a mean of 0 and a standard deviation of 1 were generated to form 100 time series with 120 elements in each series. For each series, five approximations were determined where the subdomains of the uniform partition contained 3, 4, 6, 12, and 24 elements. The external boundary condition for the approximation is the condition of zero slope of the tangent. For each series, the ratio of the RMS value of the remainder for the approximation to the RMS value of the series was calculated, and the results are given in Table 1. The approximations corresponding to 24 and 12 elements are smooth and appear to reflect an underlying pattern in the series, whereas, for the cases 3 and 4, the approximations are contorted. An upper bound for $\epsilon$ is less than the minimum values in the range.


Table 1

Subdomain   Mean (StE)      Range
24          0.986 (0.013)   [0.94, 1.01]
12          0.967 (0.018)   [0.91, 1.00]
6           0.930 (0.023)   [0.87, 0.98]
4           0.888 (0.027)   [0.81, 0.96]
3           0.843 (0.035)   [0.76, 0.92]

Quartic Polynomial
The terms in equation (3.5) for the approximation over $E_k^o$ have a simple interpretation. For the first two terms in (3.5), $(i, V_1[i])$ and $(i, V_2[i])$ are points on straight lines. For the last term, Maple solves $AE = H$ for $E$ and (3.6) for $V_3$ exactly; consequently, an accurate computation shows that $(i, E[i])$ and $(i, V_3[i])$ are points on the graphs of a quadratic and a quartic polynomial, respectively.
To describe the properties of $E$ and $V_3$, it is necessary to change variables from $t$ to $s$, where $s = t - [t_k - (n_k - 1)/2]$ and $s = 0$ is the central point of the interval $E_k^o$. In the $s$ variable, the boundary conditions are applied at the points $s = \pm s_o$, where $s_o = (n_k + 1)/2$. The quadratic polynomial $v(s, n_k)$ for $E$ is defined by $d^2v/ds^2 = 1$ and $v = 0$ at $s = \pm s_o$, so that $v = (s^2 - s_o^2)/2$. For any integer $i$, $E[i]$ is equal to $v$ at the corresponding value of $s$, and $E'E = ((2s_o)^5 - 2s_o)/120$. The quartic polynomial $u(s, n_k)$ for $V_3$, provided $n_k \ge 3$, is determined by $d^4u/ds^4 = G^2(n_k)$, where the roots of $u$ are $s = \pm s_o$ and $s^2 = 5s_o^2 + 1$. Thus, $u = G^2(n_k)(s^2 - s_o^2)(s^2 - 5s_o^2 - 1)/24$. $G(n_k)$ is a measure of the smoothness of the approximation over $E_k^o$: $G(2) = 1.0$, $G(3) = 0.59$, $G(4) = 0.39$, $G(6) = 0.21$, $G(12) = 0.062$, and $G(24) = 0.017$.
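The closed forms above can be checked directly against the defining difference equations. The sketch below uses hypothetical function names and exact rational arithmetic: it evaluates $v$ and $u$ at the grid points (including the two boundary points, where both vanish), confirms that the second difference of $E$ is 1 and that of $V_3$ is $G^2(n_k)E$, and recomputes the listed values of $G$.

```python
from fractions import Fraction
import math

def closed_forms(n):
    """E and V3 on the n interior points plus the two boundary points, and G^2."""
    so = Fraction(n + 1, 2)
    s = [Fraction(2 * i - n - 1, 2) for i in range(0, n + 2)]   # s = -so, ..., +so
    g2 = Fraction(n) / Fraction((n + 1) ** 5 - (n + 1), 120)    # n / (E'E)
    E = [(x * x - so * so) / 2 for x in s]
    V3 = [g2 * (x * x - so * so) * (x * x - 5 * so * so - 1) / 24 for x in s]
    return E, V3, g2

def second_difference(v):
    """Second differences at the interior points of v."""
    return [v[i + 1] + v[i - 1] - 2 * v[i] for i in range(1, len(v) - 1)]

for n in (2, 3, 4, 6, 12, 24):
    E, V3, g2 = closed_forms(n)
    assert second_difference(E) == [1] * n                       # A E = H
    assert second_difference(V3) == [g2 * e for e in E[1:-1]]    # A V3 = G^2 E
    assert sum(e * e for e in E[1:-1]) == Fraction(n) / g2       # E'E formula
    print(n, round(math.sqrt(g2), 3))

G = {n: math.sqrt(closed_forms(n)[2]) for n in (2, 3, 4, 6, 12, 24)}
```

The printed values round to the figures quoted for $G(2)$ through $G(24)$. The identity for $V_3$ holds here for $n_k = 2$ as well, although the text states the quartic for $n_k \ge 3$.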

Concluding Remarks
The major input for the approximation of a time series involves the partition of the domain. Initially a uniform partition is chosen and, if seasonal behavior is present in the series, a subset of the partitions cover the domain for the seasons. In general, as the length of the subintervals decreases, the approximation is less smooth and the accuracy of the approximation increases. The best approximation occurs at the point at which the approximation is acceptably smooth. The subintervals can be enlarged to determine a much smoother approximation that can be labelled as a trend while still respecting the seasonal aspects of the series; however, if an intervention is present, then some adjustment of the partition may be necessary in the region of the intervention. For time series with a well-defined local maximum or minimum, the approximation can be assigned the same value as the series by taking the partition to be a single point of the domain. For series with jumps and other complexities, examples are provided to suggest how to proceed in these cases.
An approach in the literature, as indicated in the introduction, defines the approximation at a point as a weighted average of the values of the time series in a window about the point. This approach may smooth out interesting features in the time series and, if applied over smaller intervals, the approximation will not be smooth. Since the proposed model is not based on regression, a comparison of the two approaches has not been considered.

References

  1. A. F. Williams and R. A. Shults, “Graduated driver licensing research, 2007-present: a review and commentary,” Journal of Safety Research, vol. 41, no. 2, pp. 77–84, 2010.
  2. M. Brackstone, Proposal for impact evaluation of graduated licensing system on young drivers in Ontario, M.S. thesis, Department of Epidemiology and Biostatistics, University of Western Ontario, Canada, 2008.
  3. J. D. Langley, A. C. Wagenaar, and D. J. Begg, “An evaluation of the New Zealand graduated driver licensing system,” Accident Analysis and Prevention, vol. 28, no. 2, pp. 139–146, 1996.
  4. G. Janacek, Practical Time Series, Oxford University Press, New York, NY, USA, 2001.
  5. W. Härdle, Applied Nonparametric Regression, vol. 19 of Econometric Society Monographs, Cambridge University Press, New York, NY, USA, 1990.
  6. L. Keele, Semiparametric Regression for the Social Sciences, Wiley, Hoboken, NJ, USA, 2008.
  7. B. F. Smith, P. E. Bjørstad, and W. D. Gropp, Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, New York, NY, USA, 1996.
  8. J. Kevorkian and J. D. Cole, Perturbation Methods in Applied Mathematics, vol. 34 of Applied Mathematical Sciences, Springer, New York, NY, USA, 1981.
  9. Maple software, version 13, Maplesoft, Waterloo, Canada.
  10. W. W. S. Wei, Time Series Analysis: Univariate and Multivariate Methods, Addison Wesley/Pearson, Boston, Mass, USA, 2nd edition, 2006.
  11. R. S. Tsay, “Outliers, level shifts, and variance changes in time series,” Journal of Forecasting, vol. 7, pp. 1–20, 1988.

Copyright © 2011 M. Brackstone and A. S. Deakin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

