`Advances in Decision Sciences, Volume 2012 (2012), Article ID 572919, 15 pages, http://dx.doi.org/10.1155/2012/572919`
Research Article

## Large-Deviation Results for Discriminant Statistics of Gaussian Locally Stationary Processes

Faculty of Science, Niigata University, 8050 Ikarashi 2-no-cho, Nishi-ku, Niigata 950-2181, Japan

Received 16 February 2012; Accepted 9 April 2012

Copyright © 2012 Junichi Hirukawa. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

This paper discusses the large-deviation principle of discriminant statistics for Gaussian locally stationary processes. First, large-deviation theorems for quadratic forms and the log-likelihood ratio for a Gaussian locally stationary process with a mean function are proved. Their asymptotics are described by the large-deviation rate functions. Second, we consider the situations where processes are misspecified to be stationary. In these misspecified cases, we formally construct the log-likelihood ratio discriminant statistics and derive large-deviation theorems for them. Since the resulting rate functions are complicated, they are evaluated and illustrated by numerical examples. We find that misspecifying the process as stationary seriously affects discrimination.

#### 1. Introduction

Consider a sequence of random variables $\{S_T\}$ converging (in probability) to a real constant $c$. By this we mean that $P(|S_T - c| > \varepsilon) \to 0$ as $T \to \infty$ for all $\varepsilon > 0$. The simplest setting in which to obtain large-deviation results is that of sums of independent identically distributed (i.i.d.) random variables on the real line. For example, we would like to consider the large excursion probabilities of sums such as the sample average

$$S_T = \frac{1}{T}\sum_{t=1}^{T} X_t,$$

where the $X_t$, $t = 1, \ldots, T$, are i.i.d., and $T$ approaches infinity. Suppose that $E(X_t) = m$ exists and is finite. By the law of large numbers, we know that $S_T$ should be converging to $m$. Hence, $c$ is merely the expected value of the random process. It is often the case that not only does $P(|S_T - c| > \varepsilon)$ go to zero, but it does so exponentially fast. That is,

$$P(|S_T - c| > \varepsilon) \approx K(\varepsilon, c, T)\exp\{-T\, I(\varepsilon, c)\},$$

where $K(\varepsilon, c, T)$ is a slowly varying function of $T$ (relative to the exponential), and $I(\varepsilon, c)$ is a positive quantity. Loosely, if such a relationship is satisfied, we will say that the sequence $\{S_T\}$ satisfies a large-deviation principle. Large-deviation theory is concerned primarily with determining the quantities $I(\varepsilon, c)$ and (to a lesser extent) $K(\varepsilon, c, T)$. The reason for the nomenclature is that for a fixed $\varepsilon > 0$ and a large index $T$, a large deviation from the nominal value occurs if $|S_T - c| > \varepsilon$. Large-deviation theory can rightly be considered as a generalization or extension of the law of large numbers. The law of large numbers says that certain probabilities converge to zero. Large-deviation theory is concerned with the rate of convergence. Bucklew [1] describes the historical development of large-deviation theory in detail.
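The exponential decay described above can be checked in the one case where the tail probability is available in closed form. The following is a minimal sketch, assuming i.i.d. standard normal observations (an illustrative choice, not a model used in this paper); the function names are hypothetical. For this case the sample mean is exactly $N(0, 1/T)$, and the normalized log-probability should approach $\varepsilon^2/2$.

```python
import math

def tail_prob(sample_size, eps):
    """Exact P(|S_T| > eps) for the mean S_T of sample_size iid N(0, 1) variables."""
    # S_T ~ N(0, 1/T), so P(|S_T| > eps) = erfc(eps * sqrt(T / 2)).
    return math.erfc(eps * math.sqrt(sample_size / 2.0))

def empirical_rate(sample_size, eps):
    """-(1/T) log P(|S_T| > eps); for N(0, 1) data the limit is eps**2 / 2."""
    return -math.log(tail_prob(sample_size, eps)) / sample_size

eps = 0.5
for sample_size in (10, 100, 1000):
    # The second column approaches the third as the sample size grows.
    print(sample_size, empirical_rate(sample_size, eps), eps ** 2 / 2)
```

The gap between the second and third columns is the contribution of the slowly varying factor, which vanishes after division by the sample size.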

There have been a few works on the large-deviation theory for time series data. Sato et al. [2] discussed the large-deviation theory of several statistics for short- and long-memory stationary processes. However, it is still hard to find large-deviation results for nonstationary processes. Recently, Dahlhaus [3, 4] has formulated an important class of nonstationary processes with a rigorous asymptotic theory, which he calls locally stationary. A locally stationary process has a time-varying spectral density whose spectral structure changes smoothly with time. There are several papers which discuss discriminant analysis for locally stationary processes (e.g., Chandler and Polonik [5], Sakiyama and Taniguchi [6], and Hirukawa [7]). In this paper, we discuss the large-deviation theory of discriminant statistics of Gaussian locally stationary processes. In Section 2 we present the Gärtner-Ellis theorem, which establishes a large-deviation principle of random variables based only upon convergence properties of the associated sequence of cumulant generating functions. Since no assumptions are made about the dependency structure of the random variables, we can apply this theorem to non-stationary time series data. In Section 3, we deal with a Gaussian locally stationary process with a mean function. First, we prove the large-deviation principle for a general quadratic form of the observed stretch. We also give the large-deviation principle for the log-likelihood ratio and the misspecified log-likelihood ratio between two hypotheses. These fundamental statistics are important not only in statistical estimation and testing theory but also in discriminant problems. The above asymptotics are described by the large-deviation rate functions. In our stochastic models, the rate functions are very complicated. Thus, in Section 4, we evaluate them numerically. The results demonstrate that misspecifying a non-stationary process as stationary has serious effects. All the proofs of the theorems presented in Section 3 are given in the Appendix.

#### 2. Gärtner-Ellis Theorem

Cramér's theorem (e.g., Bucklew [1]) is usually credited with being the first large-deviation result. It gives the large-deviation principle for sums of independent identically distributed random variables. One of the most useful and surprising generalizations of this theorem is the one due to Gärtner [8] and, more recently, Ellis [9]. These authors established a large-deviation principle of random variables based only upon convergence properties of the associated sequence of moment generating functions. Their methods thus allow large-deviation results to be derived for dependent random processes such as Markov chains and functionals of Gaussian random processes. Gärtner [8] assumed throughout that the limit $\varphi(\theta)$ of the normalized logarithms of the moment generating functions is finite for all $\theta \in \mathbb{R}$. By extensive use of convexity theory, Ellis [9] relaxed this fairly stringent condition.

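Suppose that we are given an infinite sequence of random variables $\{Z_T\}_{T \in \mathbb{N}}$. No assumptions are made about the dependency structure of this sequence. Define

$$\varphi_T(\theta) = \frac{1}{T}\log E\{\exp(\theta Z_T)\}, \quad \theta \in \mathbb{R}.$$

Now let us list two assumptions.

Assumption 2.1. $\varphi(\theta) = \lim_{T \to \infty}\varphi_T(\theta)$ exists for all $\theta \in \mathbb{R}$, where we allow $\infty$ both as a limit value and as an element of the sequence $\{\varphi_T(\theta)\}$.

Assumption 2.2. $\varphi(\theta)$ is differentiable on $\mathcal{D}_{\varphi} = \{\theta : \varphi(\theta) < \infty\}$.

Define the large-deviation rate function by

$$I(x) = \sup_{\theta \in \mathbb{R}}\{\theta x - \varphi(\theta)\};$$

this function plays a crucial role in the development of the theory. Furthermore, define

$$\mathcal{D}' = \{\varphi'(\theta) : \theta \in \mathcal{D}_{\varphi}\},$$

where $\varphi'$ indicates the derivative of $\varphi$. Before proceeding to the main theorem, we first state some properties of this rate function.

Property 1. $I(x)$ is convex.

We remark that a convex function $I(\cdot)$ on the real line is continuous everywhere on $\mathcal{D}_I = \{x : I(x) < \infty\}$, the domain of $I(\cdot)$.

Property 2. $I(x)$ has its minimum value at $m = \lim_{T \to \infty} T^{-1}E(Z_T)$, and $I(m) = 0$.

We now state a simple form of a general large-deviation theorem which is known as the Gärtner-Ellis theorem (e.g., Bucklew [1]).

Lemma 2.3 (Gärtner-Ellis). Let $(a, b)$ be an interval with $[a, b] \cap \mathcal{D}_I \neq \emptyset$. If Assumption 2.1 holds and $a < b$, then

$$\limsup_{T \to \infty}\frac{1}{T}\log P\left(\frac{Z_T}{T} \in [a, b]\right) \leq -\inf_{x \in [a, b]} I(x).$$

If Assumptions 2.1 and 2.2 hold and $(a, b) \subset \mathcal{D}'$, then

$$\liminf_{T \to \infty}\frac{1}{T}\log P\left(\frac{Z_T}{T} \in (a, b)\right) \geq -\inf_{x \in (a, b)} I(x).$$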

Large-deviation theorems are usually expressed as two separate limit theorems: an upper bound for closed sets and a lower bound for open sets. In the case of interval subsets of $\mathbb{R}$, it can be guaranteed that the upper bound equals the lower bound by the continuity of the rate function $I(x)$. For the applications that we have in mind, the interval subsets will be sufficient.
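To make the role of the rate function concrete, the following sketch approximates the Legendre-type transform $\sup_{\theta}\{\theta x - \varphi(\theta)\}$ by a plain grid search. It assumes the limiting cumulant generating function $\varphi(\theta) = m\theta + \theta^2/2$, which arises for sums of i.i.d. $N(m, 1)$ variables; this choice, the grid bounds, and the function names are illustrative assumptions rather than anything used in the paper. In this case the rate function is $(x - m)^2/2$, which is convex and vanishes at $x = m$, in line with Properties 1 and 2.

```python
import math

def rate_function(phi, x, theta_lo=-20.0, theta_hi=20.0, steps=40001):
    """Approximate sup over theta of {theta * x - phi(theta)} on a uniform grid."""
    best = -math.inf
    for k in range(steps):
        theta = theta_lo + (theta_hi - theta_lo) * k / (steps - 1)
        best = max(best, theta * x - phi(theta))
    return best

def phi_gauss(theta, m=1.0):
    """Limiting cumulant generating function for sums of iid N(m, 1) variables."""
    return m * theta + 0.5 * theta ** 2

for x in (0.0, 1.0, 2.5):
    # Numerical value next to the closed form (x - m)^2 / 2 with m = 1.
    print(x, rate_function(phi_gauss, x), 0.5 * (x - 1.0) ** 2)
```

A grid search is crude but makes no smoothness assumption; when $\varphi$ is differentiable, solving $\varphi'(\theta) = x$ for $\theta$ would be a more efficient alternative.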

#### 3. Large-Deviation Results for Locally Stationary Processes

In this section, using the Gärtner-Ellis theorem, we develop the large-deviation principle for some non-stationary time series statistics. When we deal with non-stationary processes, one of the difficult problems is how to set up an adequate asymptotic theory. To overcome this problem, Dahlhaus [3, 4] formulated an important class of non-stationary processes, called locally stationary processes, in a rigorous asymptotic framework. Locally stationary processes have time-varying spectral densities whose spectral structures change smoothly in time. We give the precise definition of locally stationary processes, which is due to Dahlhaus [3, 4].

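Definition 3.1. A sequence of stochastic processes $X_{t,T}$ ($t = 1, \ldots, T$; $T \geq 1$) is called locally stationary with transfer function $A^{\circ}$ and trend $\mu$ if there exists a representation

$$X_{t,T} = \mu\left(\frac{t}{T}\right) + \int_{-\pi}^{\pi}\exp(i\lambda t)\, A^{\circ}_{t,T}(\lambda)\, d\xi(\lambda),$$

where

(i) $\xi(\lambda)$ is a stochastic process on $[-\pi, \pi]$ with $\overline{\xi(\lambda)} = \xi(-\lambda)$ and

$$\operatorname{cum}\{d\xi(\lambda_1), \ldots, d\xi(\lambda_k)\} = \eta\left(\sum_{j=1}^{k}\lambda_j\right) h_k(\lambda_1, \ldots, \lambda_{k-1})\, d\lambda_1 \cdots d\lambda_k,$$

where $\operatorname{cum}\{\cdots\}$ denotes the cumulant of $k$-th order, $h_1 = 0$, $h_2(\lambda) = 1$, $|h_k(\lambda_1, \ldots, \lambda_{k-1})| \leq \mathrm{const}_k$ for all $k$, and $\eta(\lambda) = \sum_{j=-\infty}^{\infty}\delta(\lambda + 2\pi j)$ is the period $2\pi$ extension of the Dirac delta function. To simplify the problem, we assume in this paper that the process is Gaussian, namely, we assume that $h_k = 0$ for all $k \geq 3$;

(ii) there exists a constant $K$ and a $2\pi$-periodic function $A : [0, 1] \times \mathbb{R} \to \mathbb{C}$ with $A(u, -\lambda) = \overline{A(u, \lambda)}$ and

$$\sup_{t, \lambda}\left|A^{\circ}_{t,T}(\lambda) - A\left(\frac{t}{T}, \lambda\right)\right| \leq K T^{-1}$$

for all $T$. $A(u, \lambda)$ and $\mu(u)$ are assumed to be continuous in $u$.

The function $f(u, \lambda) := |A(u, \lambda)|^2$ is called the time-varying spectral density of the process. In the following, we will always denote by $s$ and $t$ time points in the interval $[1, T]$, while $u$ and $v$ will denote time points in the rescaled interval $[0, 1]$, that is, $u = t/T$.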
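As an illustration of a time-varying spectral density, the sketch below evaluates the squared modulus of a transfer function for a time-varying AR(1) model $X_{t,T} = a(t/T)X_{t-1,T} + \sigma\varepsilon_t$ with unit-variance white noise $\varepsilon_t$. The model, the coefficient curve $a(u) = 0.8\cos(\pi u)$, and the $\sigma/\sqrt{2\pi}$ normalization are assumptions made here for illustration; they are not the hypotheses used in Section 4.

```python
import cmath
import math

def coefficient(u):
    """Hypothetical smooth AR coefficient a(u) on rescaled time u in [0, 1]."""
    return 0.8 * math.cos(math.pi * u)

def transfer_function(u, lam, sigma=1.0):
    """Assumed transfer function of the tvAR(1) model at rescaled time u, frequency lam."""
    return sigma / math.sqrt(2.0 * math.pi) / (1.0 - coefficient(u) * cmath.exp(-1j * lam))

def tv_spectral_density(u, lam, sigma=1.0):
    """Time-varying spectral density: squared modulus of the transfer function."""
    return abs(transfer_function(u, lam, sigma)) ** 2

for u in (0.0, 0.5, 1.0):
    # Low, middle, and high frequency at three rescaled time points.
    print(u, [round(tv_spectral_density(u, lam), 4) for lam in (0.0, math.pi / 2, math.pi)])
```

With this coefficient curve the spectral mass moves from low frequencies near $u = 0$ to high frequencies near $u = 1$, and the spectrum is flat at $u = 0.5$, which is the kind of smooth change in spectral structure that the definition is designed to capture.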

We discuss the asymptotics away from the expectation of some statistics used for the problem of discriminating between two Gaussian locally stationary processes with specified mean functions. Suppose that $\{X_{t,T}\}$ is a Gaussian locally stationary process which under the hypothesis $\Pi_j$ has mean function $\mu^{(j)}(u)$ and time-varying spectral density $f^{(j)}(u, \lambda)$ for $j = 1, 2$. Let $\mathbf{X}_T = (X_{1,T}, \ldots, X_{T,T})'$ be a stretch of the series $\{X_{t,T}\}$, and let $p^{(j)}(\cdot)$ be the probability density function of $\mathbf{X}_T$ under $\Pi_j$. The problem is to classify $\mathbf{X}_T$ into one of two categories $\Pi_1$ and $\Pi_2$ in the case that we do not have any information on the prior probabilities of $\Pi_1$ and $\Pi_2$.

Set and , where

Initially, we make the following assumption.

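Assumption 3.2. (i) We observe a realisation $X_{1,T}, \ldots, X_{T,T}$ of a Gaussian locally stationary process with mean function $\mu^{(j)}$ and transfer function $A^{(j)\circ}$, under $\Pi_j$, $j = 1, 2$;
(ii) the $A^{(j)}(u, \lambda)$ are uniformly bounded from above and below, and are differentiable in $u$ and $\lambda$ with uniformly continuous derivatives $\frac{\partial}{\partial u}\frac{\partial}{\partial \lambda}A^{(j)}$;
(iii) the $\mu^{(j)}(u)$ are differentiable in $u$ with uniformly continuous derivatives.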

In time series analysis, the class of statistics which are quadratic forms of is fundamental and important. This class of statistics includes the first-order terms (in the expansion with respect to ) of quasi-Gaussian maximum likelihood estimator (QMLE), tests and discriminant statistics, and so forth. Assume that is the transfer function of a locally stationary process, where the corresponding satisfies Assumption 3.2 (ii) and is a continuous function of which satisfies Assumption 3.2 (iii), if we replace by and by , respectively. And set , , and . Henceforth, stands for the expectation with respect to . Set for . We first prove the large-deviation theorem for this quadratic form of . All the proofs of the theorems are in the Appendix.

Theorem 3.3. Let Assumption 3.2 hold. Then under , and under , where for , equals

Next, one considers the log-likelihood ratio statistics. It is well known that the log-likelihood ratio criterion: gives the optimal discrimination rule in the sense that it minimizes the probability of misdiscrimination (Anderson [10]). Set for . For discrimination problem one gives the large-deviation principle for .

Theorem 3.4. Let Assumption 3.2 hold. Then under , where equals Similarly, under ,

In practice, misspecification occurs in many statistical problems. We consider the following three situations. Although actually has the time-varying mean functions and the time-varying spectral densities , under , , respectively, (i) the mean functions are misspecified to , ; (ii) the spectral densities are misspecified to , ; (iii) the mean functions and the spectral densities are misspecified to and , . Namely, is misspecified as stationary.

In each misspecified case, one can formally make the log-likelihood ratio in the form: where Set for and . The next result is a large-deviation theorem for the misspecified log-likelihood ratios . It is useful in investigating the effect of misspecification.

Theorem 3.5. Let Assumption 3.2 hold. Then under , where equals equals and equals Similarly, under ,

Now, we turn to the discussion of our discriminant problem of classifying into one of two categories described by two hypotheses as follows: We use as the discriminant statistic for the problem (3.19), namely, if we assign into , and otherwise into . Taking in (3.9), we can evaluate the probability of misdiscrimination of from into as follows:

Thus, we see that the rate functions play an important role in the discriminant problem.
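The link between a rate function and the misdiscrimination probability can be seen in a deliberately simple stationary toy problem, sketched below. It assumes $n$ i.i.d. observations that are $N(0, 1)$ under the first hypothesis and $N(m, 1)$ under the second, and the rule that assigns the sample to the second category when the log-likelihood ratio $\log(p_2/p_1)$ is positive. This setting, the sign convention of the rule, and the function names are illustrative assumptions and do not reproduce the locally stationary statistics of Theorems 3.4 and 3.5. Here the misdiscrimination probability is available exactly, and its exponential rate is $m^2/8$.

```python
import math

def misdiscrimination_prob(n, m):
    """P(assign to hypothesis 2 | hypothesis 1 true) for iid N(0,1) versus N(m,1) data."""
    # log(p2/p1) = m * sum(x) - n * m^2 / 2 is positive iff mean(x) > m / 2,
    # and under hypothesis 1 the sample mean is N(0, 1/n).
    return 0.5 * math.erfc(0.5 * m * math.sqrt(n / 2.0))

def misdiscrimination_rate(n, m):
    """-(1/n) log of the misdiscrimination probability; the limit is m**2 / 8."""
    return -math.log(misdiscrimination_prob(n, m)) / n

m = 1.0
for n in (20, 200, 2000):
    # Finite-sample rate next to the limiting value m^2 / 8.
    print(n, misdiscrimination_rate(n, m), m ** 2 / 8)
```

A larger limiting rate means that the error probability vanishes faster, which is why the rate functions in the theorems above can be used to compare correctly specified and misspecified discriminant statistics.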

#### 4. Numerical Illustration for Nonstationary Processes

We illustrate the implications of Theorems 3.4 and 3.5 by numerically evaluating the large-deviation probabilities of the statistics and , for the following hypotheses: where , and , , respectively. Figure 1 plots the mean function (the solid line), the coefficient functions (the dashed line), and (the dotted line). The time-varying spectral density is plotted in Figure 2.

Figure 1: The mean function (the solid line), the coefficient functions (the dashed line), and (the dotted line).
Figure 2: The time-varying spectral density .

From these figures, we see that the magnitude of the mean function is large at close to , while the magnitude of the time-varying spectral density is large at close to .

Specifically, we use the formulae in those theorems concerning to evaluate the limits of the large-deviation probabilities: Although the theory is asymptotic, the evaluation is carried out with a limited sample size. For fairness, we therefore use several levels of , that is, we take . The results are listed in Table 1.

Table 1: The limits of the large-deviation probabilities of and , .

For each value , the large-deviation rate of is the largest and that of is the smallest. Namely, the correctly specified case is the best, whereas the case misspecified as stationary is the worst. Furthermore, the large-deviation rates and are significantly smaller than . This fact implies that misspecifying the spectral density as constant in time seriously affects the large-deviation rate.

Figures 3, 4, 5, and 6 show the large-deviation probabilities of and , , for , at each time and frequency .

Figure 3: The time-frequency plot of the large-deviation probabilities of .
Figure 4: The time-frequency plot of the large-deviation probabilities of .
Figure 5: The time-frequency plot of the large-deviation probabilities of .
Figure 6: The time-frequency plot of the large-deviation probabilities of .

We see that the large-deviation rate of stays almost constant over all times and frequencies . On the other hand, that of is small at close to 0 and those of and are small at close to and close to . That is, the large-deviation probability of is adversely affected by the large magnitude of the mean function, while those of and are adversely affected by that of the time-varying spectral density. Hence, we can conclude that the misspecifications seriously affect our discrimination.

#### Appendix

We sketch the proofs of Theorems 3.3–3.5. First, we summarize the assumptions used in this paper.

Assumption A.1. (i) Suppose that is a -periodic function with which is differentiable in and with uniformly bounded derivative . denotes the time-varying spectral density. are -periodic functions with
(ii) suppose that is differentiable with uniformly bounded derivative.

We introduce the following matrices (see Dahlhaus [4], p. 154, for the detailed definition): where and . According to Lemmata 4.4 and 4.7 of Dahlhaus [4], we can see that and and are the approximations of and , respectively. We need the following lemmata which are due to Dahlhaus [3, 4]. Lemma A.2 is Lemma of Dahlhaus [3] and Lemma A.3 is Theorem 3.2 (ii) of Dahlhaus [4].

Lemma A.2. Let , , fulfill Assumption A.1 (i) and , fulfill Assumption A.1 (ii). Let or . Furthermore, let , or . Then we have

Lemma A.3. Let be the transfer function of a locally stationary process , where the corresponding is bounded from below and has uniformly bounded derivative . denotes the time-varying spectral density of . Then, for , we have

We also remark that if and are real nonnegative symmetric matrices, then

Proof of Theorems 3.3–3.5. We need the cumulant generating function of the quadratic form in normal variables . It is known that the quadratic form has cumulant generating function equal to (see Mathai and Provost [11], Theorem 3.2a.3). Theorems 3.3, 3.4, and 3.5 correspond to the respective cases: We prove Theorem 3.5 for (under ) only. Theorems 3.3 and 3.4 are similarly obtained. In order to use the Gärtner-Ellis theorem, consider Setting and in (A.8), we have under the following: Using the inequality (A.7), we then replace by , where denote , , that is, we obtain the approximation
In view of Lemmas A.2 and A.3, the above converges to , given in Theorem 3.5. Clearly, exists for and is convex and continuously differentiable with respect to . For a sequence as , we can show that Hence, for every . Application of the Gärtner-Ellis theorem completes the proof.
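The cumulant generating function of a Gaussian quadratic form, which drives the proof above, can be written down explicitly in the simplest case. The sketch below assumes independent coordinates $X_i \sim N(\mu_i, \sigma_i^2)$ and a diagonal quadratic form $\sum_i a_i X_i^2$, for which the cumulant generating function is $\sum_i\{-\tfrac{1}{2}\log(1 - 2\theta a_i\sigma_i^2) + \theta a_i\mu_i^2/(1 - 2\theta a_i\sigma_i^2)\}$ wherever every $1 - 2\theta a_i\sigma_i^2$ is positive. The diagonal restriction and the function names are assumptions made for illustration; the proof works with general covariance matrices and their approximations.

```python
import math

def quad_form_cgf(theta, a, mu, var):
    """log E[exp(theta * sum_i a_i X_i^2)] for independent X_i ~ N(mu_i, var_i)."""
    total = 0.0
    for a_i, mu_i, var_i in zip(a, mu, var):
        d = 1.0 - 2.0 * theta * a_i * var_i
        if d <= 0.0:
            return math.inf  # the moment generating function diverges here
        total += -0.5 * math.log(d) + theta * a_i * mu_i ** 2 / d
    return total

def quad_form_mean(a, mu, var):
    """E[sum_i a_i X_i^2] = sum_i a_i (var_i + mu_i^2), the slope of the cgf at 0."""
    return sum(a_i * (var_i + mu_i ** 2) for a_i, mu_i, var_i in zip(a, mu, var))

a, mu, var = [1.0, -0.5, 2.0], [0.3, 1.0, -2.0], [1.0, 2.0, 0.5]
h = 1e-5
slope = (quad_form_cgf(h, a, mu, var) - quad_form_cgf(-h, a, mu, var)) / (2.0 * h)
print(slope, quad_form_mean(a, mu, var))  # the two values should nearly coincide
```

Returning infinity outside the region of finiteness mirrors Assumption 2.1, which explicitly allows infinite values of the normalized cumulant generating functions.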

#### Acknowledgments

The author would like to thank the referees for their many insightful comments, which improved the original version of this paper. The author would also like to thank Professor Masanobu Taniguchi who is the lead guest editor of this special issue for his efforts and celebrate his sixtieth birthday.

#### References

1. J. A. Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation, Wiley, New York, NY, USA, 1990.
2. T. Sato, Y. Kakizawa, and M. Taniguchi, “Large deviation results for statistics of short- and long-memory Gaussian processes,” Australian & New Zealand Journal of Statistics, vol. 40, no. 1, pp. 17–29, 1998.
3. R. Dahlhaus, “Maximum likelihood estimation and model selection for locally stationary processes,” Journal of Nonparametric Statistics, vol. 6, no. 2-3, pp. 171–191, 1996.
4. R. Dahlhaus, “On the Kullback-Leibler information divergence of locally stationary processes,” Stochastic Processes and their Applications, vol. 62, no. 1, pp. 139–168, 1996.
5. G. Chandler and W. Polonik, “Discrimination of locally stationary time series based on the excess mass functional,” Journal of the American Statistical Association, vol. 101, no. 473, pp. 240–253, 2006.
6. K. Sakiyama and M. Taniguchi, “Discriminant analysis for locally stationary processes,” Journal of Multivariate Analysis, vol. 90, no. 2, pp. 282–300, 2004.
7. J. Hirukawa, “Discriminant analysis for multivariate non-Gaussian locally stationary processes,” Scientiae Mathematicae Japonicae, vol. 60, no. 2, pp. 357–380, 2004.
8. J. Gärtner, “On large deviations from an invariant measure,” Theory of Probability and its Applications, vol. 22, no. 1, pp. 24–39, 1977.
9. R. S. Ellis, “Large deviations for a general class of random vectors,” The Annals of Probability, vol. 12, no. 1, pp. 1–12, 1984.
10. T. W. Anderson, An Introduction to Multivariate Statistical Analysis, Wiley, New York, NY, USA, 2nd edition, 1984.
11. A. M. Mathai and S. B. Provost, Quadratic Forms in Random Variables: Theory and Applications, Marcel Dekker, New York, NY, USA, 1992.