Abstract

This paper discusses the estimation of the autocorrelation function (ACF) of fractional Gaussian noise (fGn) with long-range dependence (LRD). A variance bound of the ACF estimation of one block of fGn with LRD for a given value of the Hurst parameter ($H$) is given. The bound provides a guideline for choosing the block size so that the variance of the ACF estimate of one block of fGn with LRD, for a given $H$ value, does not exceed a predetermined variance bound regardless of the start point of the block. In addition, the result implies that the error of ACF estimation of a block of fGn with LRD depends only on the number of data points within the sample and not on the actual sample length in time. For a given block size, the error is found to be larger for fGn with stronger LRD than for fGn with weaker LRD.

1. Introduction

ACF analysis, or equivalently spectral analysis according to the Wiener-Khintchine theorem, plays a role in many areas of science and technology (see, e.g., [1, 2]), such as structural engineering [3-7]. In engineering, the ACF or its Fourier transform (the power spectrum density function (PSD)) can only be estimated from a record of given length in measurement. Note that the random load simulated in a laboratory test may be generated based on a predetermined ACF or PSD; see, for example, [8-13]. Thus, the quality of ACF or PSD estimation has great impact on structure analysis and design.

The literature on error analysis (mainly bias and variance) of ACF/PSD estimation of an ordinary random process is quite rich; see, for example, [1, 2, 14-19]. By ordinary random processes, we mean that the ACF and PSD of a process are ordinary functions, except for the Dirac delta function that is the ACF of white noise.

Note that processes with LRD or long memory substantially differ from ordinary processes [20]. By LRD, we mean that the ACF of a process is nonsummable in the discrete case or nonintegrable in the continuous case [20]. Hence, its PSD should be considered in the sense of generalized functions over the Schwartz space of test functions. FGn, introduced in [21], is a widely used model of stationary fractal time series, which has found increasingly wide applications in many fields of science and technology, ranging from hydrology to network traffic; see, for example, [22-45]. Note that the statistics of a zero-mean Gaussian process are completely determined by its ACF. Therefore, when using fGn-type load in structural engineering, a method to assure the quality of its ACF estimation is desired. In passing, we mention that, in the field of the Internet, ACF estimation of fGn-type teletraffic is utilized for detection of distributed denial-of-service flood attacks [32].

In the field, [46] discussed the statistical error of the structure function of Gaussian random fractals, and [47] studied the bias of the sample autocorrelations of fractional noise. This paper aims at providing a variance bound of the ACF estimation of one block of fGn.

An ACF is usually estimated on a block-by-block basis [1, 10, 13], where block size means the number of data points in a block of the sample. Note that the ACF estimates of different blocks may differ, resulting in an estimation error caused by sectioning. The error resulting from sectioning can be reduced by averaging [1]. Unlike conventional methods, which reduce errors by averaging, this research studies how to determine the size of one block according to a given degree of accuracy of ACF estimation of fGn with LRD.

Intuitively, if the size of one block is large enough, the ACF estimation will be independent of the start point for sectioning the block. Let N be the block size of fGn with LRD. The aim of this paper is to provide a formula to calculate the variance bound of ACF estimation of fGn with LRD for a given N and a given value of H.

The remainder of the article is organized as follows. Section 2 presents an error bound of ACF estimation of one block of fGn with LRD. Discussions are given in Section 3. Finally, Section 4 concludes the paper.

2. Variance Bound of ACF Estimation of One Block of fGn with LRD

2.1. Preliminaries

Let $B(t)$ be ordinary Brownian motion (Bm) for $t \ge 0$ with $B(0) = 0$ [48]. The stationary white noise can be taken as $B'(t)$, which is the derivative of $B(t)$ in the domain of generalized functions. Let ${}_0D_t^{-v}$ be the Riemann-Liouville integral operator of order $v$ [49, 50]. Then,

$$ {}_0D_t^{-v} B'(t) = \frac{1}{\Gamma(v)} \int_0^t (t-u)^{v-1}\, dB(u), \tag{2.1} $$

where $\Gamma$ is the Gamma function. Replacing $v$ with $H + 1/2$ in (2.1) yields

$$ B_H(t) = {}_0D_t^{-(H+1/2)} B'(t) = \frac{1}{\Gamma(H+1/2)} \int_0^t (t-u)^{H-1/2}\, dB(u). \tag{2.2} $$

In the above expression, $B_H(t)$ is termed the Riemann-Liouville fractional Brownian motion (fBm) and $0 < H < 1$. This fBm is self-similar but does not have stationary increments. In passing, it is noted that the fBm described in the sense of the Riemann-Liouville fractional integral can be explained as the response of a fractional system, the impulse response of which is $t^{H-1/2}/\Gamma(H+1/2)$, under the excitation of white noise, from the viewpoint of the theory of linear fractional systems discussed in [51, 52].

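Following Mandelbrot and van Ness [21], the fBm that is self-similar and has stationary increments is defined for $t > 0$ by

$$ B_H(t) - B_H(0) = \frac{1}{\Gamma(H+1/2)} \left\{ \int_{-\infty}^{0} \left[(t-u)^{H-1/2} - (-u)^{H-1/2}\right] dB(u) + \int_0^t (t-u)^{H-1/2}\, dB(u) \right\}, \tag{2.3} $$

where $B_H(0)$ is the starting value at time 0. If $B_H(0) = 0$ and $H = 1/2$, then $B_H(t) = B(t)$. Hence, fBm generalizes Bm. The fBm expressed by (2.3) is the fractional integral of $B'(t)$ in the sense of Weyl (see [49, 50] for the details of the fractional Weyl integral operator).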

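FGn is the increment process of fBm. It is stationary and self-affine with parameter $H$. Let $x(t)$ be fGn in the continuous case. Then, the ACF of $x(t)$ is given by

$$ r(\tau) = \frac{\sigma^2 \varepsilon^{2H-2}}{2} \left[ \left( \frac{|\tau|}{\varepsilon} + 1 \right)^{2H} + \left| \frac{|\tau|}{\varepsilon} - 1 \right|^{2H} - 2 \left| \frac{\tau}{\varepsilon} \right|^{2H} \right], \tag{2.4} $$

where $0 < H < 1$, $\sigma^2 = (H\pi)^{-1}\Gamma(1-2H)\cos(H\pi)$ is the intensity of fGn, and $\varepsilon > 0$ is used by regularizing fBm so that the regularized fBm is differentiable [21, pages 427-428]. The PSD of $x(t)$ is given by (Li and Lim [53])

$$ S(\omega) = \sigma^2 \sin(H\pi)\, \Gamma(2H+1)\, |\omega|^{1-2H}. \tag{2.5} $$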

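Letting $\varepsilon = 1$ and replacing $\tau$ by $k$ in (2.4) yields the ACF of the discrete fGn (dfGn):

$$ r(k) = \frac{\sigma^2}{2} \left[ (|k|+1)^{2H} + \big| |k| - 1 \big|^{2H} - 2|k|^{2H} \right]. \tag{2.6} $$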
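The dfGn ACF in (2.6), that is, half the intensity times the second-order difference of $|k|^{2H}$, is easy to evaluate numerically. The following is a minimal sketch; the function name `dfgn_acf` and the default unit intensity `sigma2 = 1.0` are illustrative assumptions, not part of the original text.

```python
def dfgn_acf(k: int, hurst: float, sigma2: float = 1.0) -> float:
    """ACF of discrete fGn at integer lag k, in the form of (2.6).

    sigma2 is the intensity of fGn; unit intensity is an assumption
    made here for illustration only.
    """
    if not 0.0 < hurst < 1.0:
        raise ValueError("hurst must lie in (0, 1)")
    k = abs(k)  # the ACF is an even function of the lag
    two_h = 2.0 * hurst
    return 0.5 * sigma2 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


# Print the first few lags for an SRD case, white noise, and an LRD case.
for hurst in (0.3, 0.5, 0.8):
    print(hurst, [round(dfgn_acf(k, hurst), 4) for k in range(4)])
```

At lag zero the function returns the intensity itself, and for $H = 0.5$ every nonzero lag vanishes, which matches the white-noise case.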

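Recall that a stationary Gaussian process with ACF $r(k)$ is of LRD if [20]

$$ \sum_{k=0}^{\infty} r(k) = \infty; \tag{2.7} $$

otherwise it is of short-range dependence (SRD). Another definition of LRD is given as follows. For asymptotically large time scales, if

$$ r(k) \sim c\,k^{-\beta} \quad (k \to \infty), \qquad 0 < \beta < 1, \tag{2.8} $$

where $c > 0$ is a constant, then the process is of LRD.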

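Note that the expression $0.5[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}]$ described in (2.6) is the finite second-order difference of $0.5k^{2H}$. Approximating it with the second-order differential of $0.5k^{2H}$ yields

$$ 0.5\left[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}\right] \approx H(2H-1)\,k^{2H-2}. \tag{2.9} $$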

Expressing $\beta$ in (2.8) by the Hurst parameter gives $\beta = 2 - 2H$, or

$$ H = 1 - \beta/2. $$

The LRD condition expressed by H therefore is 0.5 < H < 1. The larger the H value, the stronger the long-range persistence.
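How good the second-order-differential approximation $H(2H-1)k^{2H-2}$ is can be checked numerically against the exact second-order difference. The sketch below does this for $H = 0.8$; the function names and the chosen lags are illustrative assumptions.

```python
def exact_term(k: int, hurst: float) -> float:
    """Finite second-order difference of 0.5 * k**(2H), valid for k >= 1."""
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + (k - 1) ** two_h)


def power_law_term(k: int, hurst: float) -> float:
    """Second-order differential of 0.5 * k**(2H): H(2H-1) k**(2H-2)."""
    return hurst * (2.0 * hurst - 1.0) * k ** (2.0 * hurst - 2.0)


def relative_error(k: int, hurst: float) -> float:
    """Relative gap between the exact term and its power-law approximation."""
    exact = exact_term(k, hurst)
    return abs(exact - power_law_term(k, hurst)) / abs(exact)


for k in (1, 10, 100):
    print(k, exact_term(k, 0.8), power_law_term(k, 0.8), relative_error(k, 0.8))
```

The relative gap shrinks as the lag grows, so the power law describes the tail of the ACF rather than its first few lags.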

FGn contains three subclasses of time series. In the case of $0.5 < H < 1$, $r(\tau)$ is positive and finite for all $\tau$. It is monotonically decreasing but nonintegrable. In fact, from the ACF of dfGn described by (2.9), one immediately has $\sum_{k} r(k) = \infty$. Thus, for $0.5 < H < 1$, the corresponding fGn is of LRD. For $0 < H < 0.5$, the integral of $r(\tau)$ is zero. Hence, fGn is of SRD in this case. Moreover, $r(\tau)$ changes its sign and becomes negative for some $\tau$ proportional to $\varepsilon$ in the parameter domain $0 < H < 0.5$ [21, page 434]. FGn reduces to white noise when $H = 0.5$.
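The three subclasses can be seen in the partial sums of the dfGn ACF. With unit intensity (an assumption made here for illustration), the sum of $0.5[(|k|+1)^{2H} - 2|k|^{2H} + ||k|-1|^{2H}]$ over $k = -n, \ldots, n$ telescopes to $(n+1)^{2H} - n^{2H}$, which grows without bound for $H > 0.5$, equals 1 for $H = 0.5$, and tends to zero for $H < 0.5$. A minimal sketch, with illustrative function names:

```python
def dfgn_acf(k: int, hurst: float) -> float:
    """Unit-intensity dfGn ACF at integer lag k."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def acf_partial_sum(n: int, hurst: float) -> float:
    """Sum of r(k) over k = -n, ..., n, computed term by term."""
    return dfgn_acf(0, hurst) + 2.0 * sum(dfgn_acf(k, hurst) for k in range(1, n + 1))


def acf_partial_sum_closed(n: int, hurst: float) -> float:
    """Telescoped form of the same sum: (n+1)^(2H) - n^(2H)."""
    return (n + 1) ** (2.0 * hurst) - n ** (2.0 * hurst)


for hurst in (0.3, 0.5, 0.8):
    print(hurst, [round(acf_partial_sum(n, hurst), 4) for n in (10, 100, 1000)])
```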

Note that if $r(\tau)$ is sufficiently smooth on $(0, \infty)$ and if

$$ r(0) - r(\tau) \sim c\,|\tau|^{\alpha} \quad \text{for } |\tau| \to 0, $$

where $c$ is a constant, then one has the fractal dimension of $x(t)$ as

$$ D = 2 - \frac{\alpha}{2}; $$

see, for example, [54-57]. The local irregularity of the sample paths is measured by $\alpha$, which can be regarded as the fractal index of the process. Thus, the behaviour of $r(\tau)$ near the origin determines the local irregularity or the local self-similarity of the sample paths. The larger the value of $D$, the higher the local irregularity.

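Now, in the case of $|\tau| \to 0$, we apply the binomial series to $r(\tau)$ in (2.4). Then, one has

$$ r(\tau) = \sigma^2 \varepsilon^{2H-2} \left[ 1 - \left| \frac{\tau}{\varepsilon} \right|^{2H} + H(2H-1) \left| \frac{\tau}{\varepsilon} \right|^{2} + \cdots \right], \qquad r(0) - r(\tau) \sim \sigma^2 \varepsilon^{-2} |\tau|^{2H}. $$

Therefore, $\alpha = 2H$ and one immediately gets

$$ D = 2 - H. $$

Hence, $H$ measures both LRD and self-similarity of fGn. In other words, the local properties of fGn are reflected in the global ones, as remarked by Mandelbrot [58, page 27].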

Figures 1(a) and 1(b) give the plots of the ACFs of fGn with LRD and SRD in the case of , respectively.

2.2. Variance Bound

In practical terms, the number of measured data points within a sample of fGn is finite. Let a positive integer $N$ be the number of data points of a measured sample of the dfGn sequence $x(n)$. Then, the ACF of $x(n)$ is estimated by

Usually, for

Therefore, $R(k)$ is a random variable.
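The dependence of the estimate on the block can be made concrete. The following is a minimal sketch of a one-block ACF estimate, assuming the common sample estimator that sums the lagged products available inside the block and divides by $N$; the function name `acf_estimate`, the start-point argument `n0`, and the toy record are illustrative assumptions.

```python
from typing import Sequence


def acf_estimate(x: Sequence[float], k: int, n0: int, block_size: int) -> float:
    """Estimate the ACF at lag k from the block x[n0 : n0 + block_size]."""
    if not 0 <= k < block_size:
        raise ValueError("lag must satisfy 0 <= k < block_size")
    if n0 < 0 or n0 + block_size > len(x):
        raise ValueError("block does not fit inside the record")
    block = x[n0:n0 + block_size]
    # Only products with both factors inside the block are used; dividing by
    # block_size rather than block_size - k biases the estimate for k > 0.
    return sum(block[n] * block[n + k] for n in range(block_size - k)) / block_size


# Two blocks of the same record generally give two different estimates.
record = [1.0, -1.0, 2.0, 0.5, -0.5, 1.5, -2.0, 0.0]
print(acf_estimate(record, 1, 0, 4), acf_estimate(record, 1, 4, 4))
```

Because the estimate changes with the start point of the block, it has to be treated as a random variable whose spread is what the variance bound controls.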

Let $M^2 = E\{[R(k) - r(k)]^2\}$ be the mean square error of $R(k)$. Denote $R(k)$ by $R(k; H, N)$. The aim of the statistical error analysis in this research is to derive a relationship between $M^2$ and $N$ as well as $H$, so as to establish a reference guideline for choosing $N$ when the bound of $M^2$ and the value of $H$ are given.

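Theorem 2.1. Let $x(n)$ be a dfGn series with LRD. Let $r(k)$ be the true ACF of $x(n)$. Let $N$ be the number of data points of a sample sequence. Let $R(k)$ be an estimate of $r(k)$. Let $\mathrm{Var}[R(k)]$ be the variance of $R(k)$. Then,

$$ \mathrm{Var}[R(k)] \le \frac{\sigma^4}{N} \sum_{i=0}^{N-1} \left[ (i+1)^{2H} - 2i^{2H} + |i-1|^{2H} \right]^2. \tag{2.17} $$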

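Proof. Mathematically, $r(k)$ is computed over an infinite interval [1, 2, 59]:

$$ r(k) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} x(n)\,x(n+k). \tag{P-1} $$

In practice, $r(k)$ can only be estimated with a finite sequence. Therefore,

$$ R(k) = \frac{1}{N} \sum_{n=n_0}^{n_0+N-1} x(n)\,x(n+k), \tag{P-2} $$

where $n_0$ is the start point.
Let $b$ be the bias of $R(k)$. Then, the mean square error is $M^2 = \mathrm{Var}(R) + b^2$. Since

$$ E[R(k)] = \frac{1}{N} \sum_{n=n_0}^{n_0+N-1} E[x(n)\,x(n+k)] = r(k), \tag{P-3} $$

$R(k)$ is the unbiased estimate of $r(k)$ and $M^2 = \mathrm{Var}(R)$ accordingly. We need to express $\mathrm{Var}(R)$ by the following proposition to prove the theorem.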

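Proposition 2.2. Let $x(n)$ be dfGn with LRD. Let $r(k)$ be the true ACF of $x(n)$. Let $N$ be the number of data points of a sample sequence. Let $R(k)$ be an estimate of $r(k)$. Let $\mathrm{Var}(R)$ be the variance of $R(k)$. Suppose that $r(k)$ is monotonically decreasing and $r(k) \ge 0$. Then,

$$ \mathrm{Var}[R(k)] \le \frac{4}{N} \sum_{i=0}^{N-1} r^2(i). \tag{P-4} $$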

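Proof. As $\mathrm{Var}(R) = E\{[R - E(R)]^2\} = E[(R - r)^2]$ according to (P-3), one has

$$ \mathrm{Var}[R(k)] = E[R^2(k)] - r^2(k). \tag{P-5} $$

Expanding $R^2(k)$ yields

$$ R^2(k) = \frac{1}{N^2} \sum_{n_1=n_0}^{n_0+N-1} \sum_{n_2=n_0}^{n_0+N-1} x(n_1)\,x(n_1+k)\,x(n_2)\,x(n_2+k). $$

Thus,

$$ \mathrm{Var}[R(k)] = \frac{1}{N^2} \sum_{n_1=n_0}^{n_0+N-1} \sum_{n_2=n_0}^{n_0+N-1} E[x(n_1)\,x(n_1+k)\,x(n_2)\,x(n_2+k)] - r^2(k). \tag{P-6} $$

Let $X_1 = x(n_1)$, $X_2 = x(n_1+k)$, $X_3 = x(n_2)$, and $X_4 = x(n_2+k)$. Then, the expectation in (P-6) is $E(X_1X_2X_3X_4)$. Since $x$ is Gaussian with zero mean, the random variables $X_1$, $X_2$, $X_3$, and $X_4$ have a joint-normal distribution and

$$ E(X_1X_2X_3X_4) = m_{12}m_{34} + m_{13}m_{24} + m_{14}m_{23}, $$

where $m_{ij} = E(X_iX_j)$, that is, $m_{12} = m_{34} = r(k)$, $m_{13} = m_{24} = r(n_2 - n_1)$, $m_{14} = r(n_2 - n_1 + k)$, and $m_{23} = r(n_2 - n_1 - k)$. Therefore,

$$ E(X_1X_2X_3X_4) = r^2(k) + r^2(n_2 - n_1) + r(n_2 - n_1 + k)\,r(n_2 - n_1 - k). $$

According to (P-6), the variance is expressed as

$$ \mathrm{Var}[R(k)] = \frac{1}{N^2} \sum_{n_1=n_0}^{n_0+N-1} \sum_{n_2=n_0}^{n_0+N-1} \left[ r^2(n_2 - n_1) + r(n_2 - n_1 + k)\,r(n_2 - n_1 - k) \right]. $$

Replacing $(n_2 - n_1)$ with $i$ in the above expression yields

$$ \mathrm{Var}[R(k)] = \frac{1}{N^2} \sum_{n_1=n_0}^{n_0+N-1} \sum_{i=i_0}^{i_0+N-1} \left[ r^2(i) + r(i+k)\,r(i-k) \right], $$

where $i_0 = n_0 - n_1$. Without losing generality, let $n_0 = 0$. Then, the above becomes

$$ \mathrm{Var}[R(k)] = \frac{1}{N} \sum_{i=-(N-1)}^{N-1} \left( 1 - \frac{|i|}{N} \right) \left[ r^2(i) + r(i+k)\,r(i-k) \right]. $$

Since the ACF is an even function, the above expression is written as

$$ \mathrm{Var}[R(k)] = \frac{1}{N} \left[ r^2(0) + r^2(k) \right] + \frac{2}{N} \sum_{i=1}^{N-1} \left( 1 - \frac{i}{N} \right) \left[ r^2(i) + r(i+k)\,r(i-k) \right]. $$

Therefore, Proposition 2.2 holds.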
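Now, replacing $r(i)$ in Proposition 2.2 with (2.6) yields

$$ \frac{4}{N} \sum_{i=0}^{N-1} r^2(i) = \frac{\sigma^4}{N} \sum_{i=0}^{N-1} \left[ (i+1)^{2H} - 2i^{2H} + |i-1|^{2H} \right]^2. $$

The right-hand side of the above expression is the bound in (2.17). Hence, the theorem results.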

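The above formula represents an upper bound of $\mathrm{Var}(R)$. Denote by $s(N, H)$ the bound of the standard deviation. Then,

$$ s(N, H) = \sqrt{ \frac{\sigma^4}{N} \sum_{i=0}^{N-1} \left[ (i+1)^{2H} - 2i^{2H} + |i-1|^{2H} \right]^2 }. $$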

We illustrate s(N, H) in terms of N by Figure 2 for H = 0.60, 0.70, 0.80, and 0.90.
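The behaviour of $s(N, H)$ can also be tabulated directly. The following is a minimal sketch, assuming the variance bound takes the form $(\sigma^4/N)\sum_{i=0}^{N-1}[(i+1)^{2H} - 2i^{2H} + |i-1|^{2H}]^2$ obtained by inserting the dfGn ACF into $(4/N)\sum_{i=0}^{N-1} r^2(i)$; the function names, the unit intensity, and the grid of block sizes are illustrative assumptions.

```python
import math


def variance_bound(block_size: int, hurst: float, sigma2: float = 1.0) -> float:
    """Assumed bound: (sigma^4 / N) * sum_i [(i+1)^2H - 2 i^2H + |i-1|^2H]^2."""
    if block_size < 1:
        raise ValueError("block_size must be a positive integer")
    two_h = 2.0 * hurst
    total = 0.0
    for i in range(block_size):
        term = (i + 1) ** two_h - 2.0 * i ** two_h + abs(i - 1) ** two_h
        total += term * term
    return sigma2 ** 2 * total / block_size


def std_bound(block_size: int, hurst: float, sigma2: float = 1.0) -> float:
    """s(N, H): square root of the variance bound."""
    return math.sqrt(variance_bound(block_size, hurst, sigma2))


# Tabulate s(N, H) for the four Hurst values used in the illustration.
for hurst in (0.60, 0.70, 0.80, 0.90):
    print(hurst, [round(std_bound(n, hurst), 4) for n in (64, 256, 1024, 4096)])
```

Under these assumptions the tabulated values fall as $N$ grows and rise as $H$ grows, which is the qualitative pattern described for Figure 2.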

From Figure 2, we see that $s(N, H_1) > s(N, H_2)$ for $H_1 > H_2$, meaning that the error of ACF estimation of fGn is larger with stronger LRD than that with weaker LRD.

3. Discussions

3.1. To Avoid Misleading Result of ACF Estimation

Recall that processes with LRD substantially differ from those with SRD [20]. Therefore, signs of SRD in an ACF estimate of an fGn series that is in fact of LRD should be regarded as a misleading result of ACF estimation.

Suppose that we have a block of fGn with H = 0.75. Hence, this series is of LRD. Figure 3(a) shows an fGn series with H = 0.75, which is synthesized with the method given in [60].

Assume the block size N = 256. Then, the bound $s(256, 0.75)$ follows from Theorem 2.1. The dotted line in Figure 3(b) indicates the ACF estimate with N = 256, and the solid line in Figure 3(b) shows the theoretical ACF of fGn with H = 0.75. We note that the error of the ACF estimate reflected by the dotted line in Figure 3(b) is severe because many points of the dotted line are negative. Thus, it may obscure the positive correlation (i.e., LRD) of the data being processed. Consequently, by the dotted line in Figure 3(b), one might be misled to take the data being processed (Figure 3(a)) as SRD. Figures 3(c) and 3(d) show the first 64 and 128 points of Figure 3(b), respectively. They again show the possible confusion caused by severe estimation error.

Now we increase the block size such that N = 2048. Then, one has the bound $s(2048, 0.75)$ according to (2.17). In this case, the ACF estimate is indicated by the dotted line in Figure 4(a). Comparing Figure 4(a) to Figure 3(b), we see that the error of ACF estimation is considerably reduced when N increases to 2048 because most data points of the ACF estimate are positive. Figures 4(b), 4(c), and 4(d) give the plots of the first 64, 128, and 256 points of Figure 4(a), respectively. They clearly show the improvement of the quality of ACF estimation of one block of fGn with LRD obtained by increasing the block size.
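An experiment of this kind can be imitated in a few lines. The following is a minimal sketch under stated assumptions: the series is synthesized exactly from the unit-intensity dfGn ACF with the Durbin-Levinson recursion (a swapped-in technique, not the method of [60]), and the ACF is estimated with the common biased sample estimator. The function names, the seed, and the choice of 64 lags are illustrative.

```python
import math
import random


def dfgn_acf(k: int, hurst: float) -> float:
    """Unit-intensity dfGn ACF at integer lag k."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def synthesize_dfgn(n: int, hurst: float, rng: random.Random) -> list:
    """Draw n points of dfGn via the Durbin-Levinson recursion (O(n^2))."""
    r = [dfgn_acf(k, hurst) for k in range(n)]
    x = [rng.gauss(0.0, math.sqrt(r[0]))]
    phi = []   # phi[j] weights x[t - 1 - j] in the one-step predictor
    v = r[0]   # variance of the one-step prediction error
    for t in range(1, n):
        kappa = (r[t] - sum(phi[j] * r[t - 1 - j] for j in range(t - 1))) / v
        phi = [phi[j] - kappa * phi[t - 2 - j] for j in range(t - 1)] + [kappa]
        v *= 1.0 - kappa * kappa
        mean = sum(phi[j] * x[t - 1 - j] for j in range(t))
        x.append(mean + math.sqrt(v) * rng.gauss(0.0, 1.0))
    return x


def acf_estimate(block: list, k: int) -> float:
    """Biased sample ACF of one block at lag k (divides by the block size)."""
    n = len(block)
    return sum(block[i] * block[i + k] for i in range(n - k)) / n


rng = random.Random(2010)          # fixed seed so that a run is repeatable
series = synthesize_dfgn(2048, 0.75, rng)
for block_size in (256, 2048):
    block = series[:block_size]
    estimate = [acf_estimate(block, k) for k in range(64)]
    negatives = sum(1 for value in estimate if value < 0.0)
    error = sum(abs(estimate[k] - dfgn_acf(k, 0.75)) for k in range(64)) / 64
    print(block_size, negatives, round(error, 4))
```

The printed pair per block size is the number of negative estimates among the first 64 lags and the mean absolute deviation from the theoretical ACF. A single realization is only indicative; repeating the run with several seeds gives a fairer picture of how the estimate behaves for the two block sizes.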

From the above, one sees that the accuracy of the ACF estimate of fGn with LRD increases as the block size increases. Therefore, in addition to the direct way of increasing the record length, increasing the sampling rate in the measurement of the fGn to be processed may be another way to increase the accuracy of the ACF estimation when the record length in time is fixed, because a higher sampling rate yields more data points within the same record.

3.2. One Block Estimation

The previous discussions regarding ACF estimation of fGn with LRD do not relate to averaging. In fact, once the block size $N$ meets the required accuracy according to Theorem 2.1, the quality of the ACF estimation is independent of the start point of the block. That is, a block starting at any point $n_0$ yields an ACF estimate, the error of which is bounded based on Theorem 2.1. Further, we note that the discussed ACF estimation does not relate to sectioning. As a matter of fact, each block of $N$ data points yields an ACF estimate, the error of which is bounded by (2.17).
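Choosing a block size that meets a required accuracy amounts to inverting the bound. The following is a minimal sketch, assuming the standard-deviation bound $s(N, H) = \sigma^2 \sqrt{(1/N)\sum_{i=0}^{N-1}[(i+1)^{2H} - 2i^{2H} + |i-1|^{2H}]^2}$; the function name `required_block_size`, the unit intensity, and the search cap `max_n` are illustrative assumptions.

```python
import math
from typing import Optional


def required_block_size(target_std: float, hurst: float,
                        sigma2: float = 1.0,
                        max_n: int = 10_000_000) -> Optional[int]:
    """Smallest N with s(N, H) <= target_std, or None if max_n is reached."""
    if not 0.5 < hurst < 1.0:
        raise ValueError("the LRD case requires 0.5 < hurst < 1")
    if target_std <= 0.0:
        raise ValueError("target_std must be positive")
    two_h = 2.0 * hurst
    total = 0.0
    for i in range(max_n):
        # Accumulate the sum once, so each candidate N costs one extra term.
        term = (i + 1) ** two_h - 2.0 * i ** two_h + abs(i - 1) ** two_h
        total += term * term
        n = i + 1
        if sigma2 * math.sqrt(total / n) <= target_std:
            return n
    return None


for hurst in (0.60, 0.75):
    print(hurst, required_block_size(0.1, hurst))
```

For $0.5 < H < 1$ the summands decrease with $i$, so the assumed bound decreases with $N$ and the first block size that meets the target is also sufficient for every larger block. The search grows expensive as $H$ approaches 1, where the bound decays slowly with $N$.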

3.3. Remarks

In the field of fractional order signal processing (see, e.g., [61]), [62] recently introduced a method to obtain a reliable estimate of $H$ based on the fractional Fourier transform for processing very long experimental time series locally. It is worth noting that the error bound presented in this paper may help explain why the reliable estimation of $H$ discussed in [62] requires long series.

Finally, we note that the ACF estimate expressed by (2.15) is a biased one. However, that does not matter because the present variance bound relates to the fluctuation of the ACF estimate regardless of whether it is biased or not.

4. Conclusions

We have established an error bound of ACF estimation of one block of fGn with LRD. It has been shown that the error does not depend on the absolute length of the sample but only on the number of data points, that is, the block size $N$, of the sample. The error of an ACF estimate of fGn with stronger LRD is larger than that with weaker LRD for a given $N$. The discussed ACF estimation is not related to averaging. The accuracy of an ACF estimate of a block of fGn with LRD can be guaranteed once the block size is selected according to (2.17), without regard to sectioning.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under the project Grant nos. 60573125, 60873264, and 60873102.