Abstract

Diffusion models have been used extensively in many applications. These models, such as those used in financial engineering, usually contain unknown parameters which we wish to determine. One way is to use the maximum likelihood method with discrete samplings to devise statistics for the unknown parameters. In general, the likelihood functions for diffusion models are not available in closed form, hence it is difficult to derive the exact maximum likelihood estimator (MLE). Many different approaches have been proposed by various authors over the past years; see, for example, the excellent books by Kutoyants (2004), Liptser and Shiryayev (1977), Kushner and Dupuis (2002), and Prakasa Rao (1999), and also the recent works by Aït-Sahalia (1999, 2002, 2004), and so forth. Shoji and Ozaki (1998; see also Shoji and Ozaki (1995) and Shoji and Ozaki (1997)) proposed a simple local linear approximation. In this paper, among other things, we show that Shoji's local linear Gaussian approximation indeed yields a good MLE.

1. Introduction

Diffusion processes are used as theoretical models in analyzing random phenomena evolving in continuous time. These models may be described in terms of Itô-type stochastic differential equations where is a Brownian motion, with some unknown parameters to be determined in rational ways.

It is, however, difficult to derive the maximum likelihood estimator for if the diffusion coefficient (i.e., the volatility) is unknown. On the other hand, in practice, the volatility is determined first by using the fact that when is a constant. Therefore we will restrict ourselves to diffusion models with constant volatility: Since there is not much difference at the technical level, we will consider one-dimensional models only. That is, we will assume throughout the paper that is a one-dimensional Brownian motion, and is real valued.

The distribution of over a finite time interval has a density with respect to the Wiener measure (the law of the Brownian motion ), given by the Cameron-Martin formula: which is in turn the likelihood function with continuous observation. In practice, only discrete values may be observed over the duration , where and . The corresponding likelihood function is the conditional expectation under Wiener measure: where is the conditional probability density function of given , and is the Gaussian density (see [1]). Since the denominator of (1.5) does not depend on , we may simply consider the numerator as a likelihood function. Therefore, the MLE for under a discrete observation may be found by solving either explicitly if possible or numerically the likelihood equation

The difficulty with this approach is that, except for very special drift vector fields , an explicit formula for is not known. To overcome this difficulty, many approximation methods have been proposed in the literature by various authors. The idea is to replace the diffusion model (1.3) by an approximation model for which an explicit formula for the likelihood function is available. One possible candidate is of course the Euler-Maruyama approximation where is an i.i.d. sequence with standard normal distribution and . However, the likelihood function for this model is not, in general, close enough to that of the diffusion model if measured in terms of the ratio of their corresponding likelihood functions
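The Euler-Maruyama candidate can be sketched in a few lines. This is a minimal illustration, assuming the constant-volatility model is written as dX = b(X) dt + sigma dW with step size `delta`; the function names and the argument layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def euler_maruyama_path(b, x0, sigma, delta, n, rng):
    """Simulate n Euler-Maruyama steps of dX = b(X) dt + sigma dW (assumed form)."""
    x = np.empty(n + 1)
    x[0] = x0
    xi = rng.standard_normal(n)  # i.i.d. standard normal innovations
    for i in range(n):
        x[i + 1] = x[i] + b(x[i]) * delta + sigma * np.sqrt(delta) * xi[i]
    return x


def euler_log_likelihood(b, x, sigma, delta):
    """Gaussian log-likelihood of the Euler-Maruyama chain, conditional on x[0]."""
    mean = x[:-1] + b(x[:-1]) * delta   # one-step conditional mean
    var = sigma**2 * delta              # one-step conditional variance
    resid = x[1:] - mean
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - resid**2 / (2 * var)))
```

Maximizing `euler_log_likelihood` over the drift parameters gives the Euler-Maruyama pseudo-MLE; the point made above is that this likelihood need not be close, in ratio, to the true one.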

The second approach is to discretize the likelihood function for continuous observations. In order to utilize this likelihood function, we need to handle the Itô integral which is defined only in the probability sense. If (where is a -function) is a gradient field, then, according to Itô's formula, here the right-hand side involves only the sample . This idea of eliminating the Itô integral and replacing it by an ordinary one has far-reaching consequences; see the interesting paper [2] for some applications.
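The replacement of the Itô integral by ordinary quantities can be checked numerically. The sketch below assumes a one-dimensional model dX = b(X) dt + sigma dW with b = U' for a smooth potential U (here the hypothetical choice U(x) = -x^2/2, so b(x) = -x), and compares the left-point Itô sum with the expression given by Itô's formula. All names are illustrative.

```python
import numpy as np


def ito_sum(b, x):
    """Left-point (Ito) Riemann sum approximating the integral of b(X_t) dX_t."""
    return float(np.sum(b(x[:-1]) * np.diff(x)))


def ito_formula_integral(U, db, x, sigma, delta):
    """Same integral via Ito's formula when b = U':
    int b(X) dX = U(X_T) - U(X_0) - (sigma^2 / 2) * int b'(X_t) dt."""
    correction = 0.5 * sigma**2 * np.sum(db(x[:-1])) * delta
    return float(U(x[-1]) - U(x[0]) - correction)


# Assumed example: Ornstein-Uhlenbeck-type drift b(x) = -x with U(x) = -x^2 / 2.
rng = np.random.default_rng(1)
sigma, delta, n = 1.0, 1e-4, 10_000
b = lambda y: -y
U = lambda y: -0.5 * y**2
db = lambda y: -np.ones_like(y)

# Simulate a fine Euler-Maruyama path as a stand-in for a continuous observation.
x = np.empty(n + 1)
x[0] = 0.5
noise = sigma * np.sqrt(delta) * rng.standard_normal(n)
for i in range(n):
    x[i + 1] = x[i] + b(x[i]) * delta + noise[i]

lhs = ito_sum(b, x)                               # stochastic-integral form
rhs = ito_formula_integral(U, db, x, sigma, delta)  # ordinary-integral form
```

The two values agree up to the fluctuation of the realized quadratic variation around sigma^2 T, which shrinks as the step size decreases.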

One can also use approximations to the probability density function and construct functions which are close to the maximum likelihood function. There are a great number of articles devoted to this approach, such as [3–5], for example. The difficulty, however, is that even if is a uniform approximation of , there is no guarantee that the approximate likelihood function would tend to when .

In this paper we consider the linear diffusion approximation proposed by Shoji and Ozaki [6] to the diffusion model (1.3), which leads to the following approximation of the likelihood function : where so that is a sample with fixed duration over , and is the probability transition density of the following linear diffusion model when and .

The approximation (1.12) is called the local linearization of the diffusion model (1.3), which has been studied in Shoji and Ozaki [6]. Shoji has shown numerically that the local linearizations do yield better estimates. Shoji's approximation was revisited in Prakasa Rao [7], without a definite conclusion.
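A minimal sketch of a local linear likelihood follows. It assumes the model dX = b(X) dt + sigma dW and, on each sampling interval, replaces the drift by its first-order expansion b(x0) + b'(x0)(y - x0) around the left endpoint x0, so that each transition is Gaussian. The exact form of (1.12) in the paper may differ in detail; the function names and signatures are hypothetical.

```python
import numpy as np


def _phi(z):
    """(exp(z) - 1) / z, extended by its limit 1 at z = 0."""
    z = np.asarray(z, dtype=float)
    safe = np.where(z == 0.0, 1.0, z)  # avoid division by zero
    return np.where(z == 0.0, 1.0, np.expm1(safe) / safe)


def local_linear_log_likelihood(b, db, x, sigma, delta):
    """Log-likelihood of the locally linearized model, conditional on x[0].

    For the linear SDE dY = (b(x0) + b'(x0) (Y - x0)) dt + sigma dW, Y_0 = x0,
    the law of Y_delta is Gaussian with
        mean = x0 + b(x0) * (exp(L * delta) - 1) / L,
        var  = sigma^2 * (exp(2 * L * delta) - 1) / (2 * L),
    where L = b'(x0); both are written through _phi to cover L = 0.
    """
    x0, x1 = x[:-1], x[1:]
    slope = db(x0)
    mean = x0 + b(x0) * delta * _phi(slope * delta)
    var = sigma**2 * delta * _phi(2.0 * slope * delta)
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (x1 - mean) ** 2 / (2 * var)))
```

When the drift is itself linear, this coincides with the exact Gaussian transition likelihood; when b' vanishes, it reduces to the Euler-Maruyama likelihood.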

The main goal of the paper is to prove Theorem 3.1, which implies that the local linear approximation (1.12) is efficient for the purpose of deriving the MLE with discrete samples.

The paper is organized as follows. In Section 2, we present the MLE for linear models such as (1.12). In Section 3, we state our main result for Shoji's local linear approximation, and give some comments about the conditions on the sampling data. Our main theorem provides a deterministic convergence rate for the likelihood functions. In Section 4, we prove that the likelihood function for the local linear approximation converges to the Cameron-Martin density, but only in probability. Sections 5, 6, and 7 are devoted to the proof of our main result. In Section 5, we state the main tool, a representation formula for diffusions, established by Qian and Zheng [8]. In Section 6, we develop the main technical estimates in order to prove Theorem 3.1, whose proof is completed in Section 7. Section 8 contains a discussion about the Euler-Maruyama approximation, which concludes the paper.

2. Linear Diffusions

Let us begin with the MLE of parameters , , and for the linear diffusion model (Mishra and Bishwal [9] discussed a similar model): whose finite-dimensional distributions are Gaussian, determined through the probability transition function . Fortunately we have an explicit formula for . Indeed the linear equation (2.1) may be solved explicitly and its solution is given by the formula (formula (6.8) of Karatzas and Shreve [10], page 354), and therefore
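For orientation, the explicit solution and Gaussian transition law of a linear diffusion can be written out in an assumed notation. The parametrization below, with drift \alpha + \beta x and constant volatility \sigma, is a hypothetical choice made for illustration and need not match the symbols of (2.1).

```latex
% Assumed model (illustrative notation): dX_t = (\alpha + \beta X_t)\,dt + \sigma\,dW_t, \beta \neq 0.
\begin{align*}
X_t &= e^{\beta t} X_0 + \frac{\alpha}{\beta}\bigl(e^{\beta t} - 1\bigr)
       + \sigma \int_0^t e^{\beta (t-s)}\, dW_s, \\
X_t \mid X_0 = x &\sim \mathcal{N}\!\left( e^{\beta t} x + \frac{\alpha}{\beta}\bigl(e^{\beta t} - 1\bigr),\;
       \sigma^2\, \frac{e^{2\beta t} - 1}{2\beta} \right).
\end{align*}
```

The variance follows from the Itô isometry applied to the stochastic integral, and the case \beta = 0 is recovered by taking limits.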

Suppose we have a discrete sample observed over the equal time scale during the period , , . According to the Markov property, their joint distribution, or the maximum likelihood function where , and is the probability density function of the initial distribution. Therefore the logarithm of the maximum likelihood function The maximum likelihood estimates for , , and are the stationary points of , that is, solutions to the equation . Set . Then and .

Proposition 2.1. The maximum likelihood estimates for the linear diffusion model (2.1) with discrete observations are given by

As an interesting consequence we have the following.

Corollary 2.2. The maximum likelihood estimators for the linear diffusion model (2.1) are not sufficient statistics while are sufficient.
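The estimation problem of this section can be sketched numerically. The code below assumes the linear model is parametrized as dX = (alpha + beta X) dt + sigma dW (a hypothetical notation), works conditionally on the first observation, and uses the fact that equally spaced samples of such a model form a Gaussian AR(1) sequence. It is an illustration under these assumptions, not a transcription of Proposition 2.1.

```python
import numpy as np


def linear_diffusion_mle(x, delta):
    """Conditional MLE of (alpha, beta, sigma) for dX = (alpha + beta X) dt + sigma dW.

    Equally spaced samples satisfy x[i+1] = c + rho * x[i] + eps with
        rho = exp(beta * delta),
        c   = alpha * (rho - 1) / beta,
        Var(eps) = sigma^2 * (rho^2 - 1) / (2 * beta).
    The Gaussian MLE of (c, rho, Var(eps)) is ordinary least squares, and the
    diffusion parameters follow by inverting this map.
    """
    x0, x1 = x[:-1], x[1:]
    m0, m1 = x0.mean(), x1.mean()
    rho = np.sum((x0 - m0) * (x1 - m1)) / np.sum((x0 - m0) ** 2)
    c = m1 - rho * m0
    v = np.mean((x1 - c - rho * x0) ** 2)  # MLE of the innovation variance
    if rho <= 0.0 or rho == 1.0:
        raise ValueError("AR(1) slope outside the range of exp(beta * delta), beta != 0")
    beta = np.log(rho) / delta
    alpha = c * beta / (rho - 1.0)
    sigma2 = 2.0 * beta * v / (rho**2 - 1.0)
    return float(alpha), float(beta), float(np.sqrt(sigma2))
```

The estimates depend on the data only through sums of x[i], x[i]^2, and x[i] x[i+1], which is the kind of summary statistic relevant to the sufficiency statement above.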

3. Diffusion Models

We consider the diffusion model (1.3). Our approach and our conclusions are applicable to multidimensional cases as long as the diffusion coefficients are constant. For simplicity, we only consider the one-dimensional case. The question is to estimate under a discrete observation over the time scale in the time interval . Then, up to a constant factor, its maximum likelihood function where is the transition probability density of (we have dropped the subscript for simplicity). The approximate maximum likelihood function, proposed in [6], is given by where is the transition density function of the linear diffusion model which is the first-order approximation to (1.3).

In what follows we assume that has bounded first and second derivatives and for some constant independent of parameters .

The main result of the paper is as follows.

Theorem 3.1. Assume that and are bounded uniformly in . Let be a fixed time and be a constant. Suppose () is a family of discrete samples such that for all pairs such that , , where . Then where and are defined in (3.1) and (3.2) with .

The convergence in (3.6) happens in a deterministic sense, and therefore conditions such as and are reasonable. The first condition, that is , just says the “variance” of the sample cannot be too big. Since so that on average we should have . Since has continuous sample paths, so that for a fixed sample point is bounded. Since are sampled from the fixed duration , we can assume that is bounded, though here we have countably many samples. It is possible to relax this constraint; for example, we may impose that with , but for simplicity we only consider the bounded case. This condition is placed as a kind of “integrability” condition on the samples.

From the asymptotics of the transition density function , it is easy to see that for each , while, as our observation happens over a fixed time interval , the ratio (3.6) as is really an infinite product; its behavior thus depends on the global behavior of . Although there are many results about bounds of in the literature (see, e.g., [2, 11]), the best we could find are those which yield (3.8) uniformly in ; none of them yields the precise limit (3.6). In fact, the proof of (3.6) depends on careful estimates on through a representation formula established in [8].

4. Linear Diffusion Approximations

Without loss of generality, we may assume that . Let be a discrete observation of the diffusion model (1.3) at (). For simplicity, write as if no confusion may arise. Consider the family of linear diffusions with . Let Then so that where . The approximating likelihood function is We need to compare this function to the likelihood function with continuous observation, the Cameron-Martin density, which, however, should be discounted with respect to the Wiener measure. Thus we have to renormalize against the discrete version of Brownian motion, which is given by where Hence its logarithm

Proposition 4.1. One has uniformly in , in probability with respect to the Wiener measure, where is the log of the Cameron-Martin density (1.4).

Proof. Let . Then Since and , so that However, in probability. The claim thus follows immediately.

5. A Representation Formula

From this section on, we develop the necessary estimates in order to prove Theorem 3.1. In this section, we recall the main tool in our proof, a representation formula proved by Qian and Zheng [8]. Based on this formula, we prove the main estimate (6.65), which is of independent interest, in the next section. We conclude the proof of Theorem 3.1 in Section 7.

Let . Consider the linear diffusion whose probability transition function is also denoted by . Recall that is the probability transition function of the diffusion defined by (1.3). The strong solution of (5.1) is given by so that where , and .

Observe that for any , is increasing, and We will also use the fact that

Lemma 5.1. For , and . Then

Our main tool is a representation formula (5.7) discovered in [8]. Let be the solution to the linear stochastic differential equation (5.1).

Proposition 5.2. For and one has where which is a martingale under the probability .

To prove (3.6), we need to estimate the double integral appearing on the right-hand side of (5.7), which requires a precise estimate for which can be achieved since we know the precise form . Of course, if we knew the joint distribution of , our task would be easy, but unfortunately it is rarely the case. Our arguments are based on the fact that is a martingale under , together with some delicate estimates for the functional integral which will be done in the next section.

6. Main Estimates

We use the notations established in the previous section. Let , , and . Then and therefore where

For and we set for simplicity.

Lemma 6.1. For any one has for all .

Proof. The two inequalities follow from the fact that assumes its maximum 1 and minimum .

Since so that which yield, together with (5.7), the following.

Lemma 6.2. One has where

Let

Lemma 6.3. Choose such that Then for any and , such that one has

Proof. Let Then and therefore On the other hand so that is a martingale with Thus is a time change of a standard Brownian motion, and for some standard Brownian motion . Since so we have

Corollary 6.4. For and to be such that one has

In what follows, we always assume that is chosen such that the condition (6.23) is satisfied. Next we estimate , which is provided in the following.

Lemma 6.5. Let . Then where the positive constant depends only on , , and .

Proof. Let Then, by the Hölder inequality Next we estimate the expectation . Since so that On the other hand so that

Lemma 6.6. Let satisfy condition (6.23), , and and such that . Then

Proof. Since is a martingale under , so that By the Hölder inequality we deduce that
Equation (6.32) follows from the representation (6.9).

Lemma 6.7. Let . Then

Proof. We have Under the probability , has a centered normal distribution with variance In terms of and Making the change of variable Then, under , has the standard normal distribution , so that Let us simplify the last integral. Indeed, set Then we rewrite the term appearing in the exponential in the last line of (6.42) together with the inequality (6.42) may be rewritten as follows: Making the change of variable in the last integral so that Thus (6.46) yields that where Therefore which is equivalent to the required inequality.

Lemma 6.8. Let Then

Proof. Indeed, by Lemma 6.7 we have On the other hand thus

Let Then (6.53) and (6.54) imply that

Lemma 6.9. One has In particular

Proof. Let Then Since which implies the required estimate.

By collecting all estimates we have established, we may obtain the following.

Proposition 6.10. There is a constant depending only on and such that where

Proof. Indeed While, and it thus yields our key estimate (6.65).

Similarly we have a lower bound

where depends only on and .

7. Proof of Theorem 3.1

We are now in a position to prove Theorem 3.1. We may assume that , so that . Let () be discrete samplings with time scale on . By our assumptions, , and for all pairs such that and . For simplicity we write for if no confusion may arise.

In the proof below, we will use to denote nonnegative constants which may depend on and the bounds of and appearing in our diffusion model (1.3), but are independent of .

Recall that is the probability transition density function of the diffusion (3.3), that is, where

According to (6.65) we have where Since and are bounded, so that Thus Therefore where we have used (7.6). It follows that Similarly we have Therefore and the proof of Theorem 3.1 is complete.

8. The Euler-Maruyama Approximation

Recall that the Euler-Maruyama approximation to (1.3) is a Markov chain given by where is an i.i.d. random sequence, with standard normal . The conditional distribution of given is Gaussian with mean and variance so that the likelihood function is given as Applying the representation formula (5.7) we have the following.

Proposition 8.1. It holds that where is the standard Brownian motion, , , and

From this we may deduce the following estimate.

Proposition 8.2. If is bounded and Lipschitz continuous, uniformly in , then the maximum likelihood function with discrete sampling is stable, in the sense that for some constant and , where .

However, this estimate does not lead to the same result as for the local linear approximation. It is not known (to the best of our knowledge) whether holds or not under conditions similar to those in Theorem 3.1.

Acknowledgment

The authors thank Dr. Zeng for providing a list of errors contained in the first version of the paper.