International Journal of Stochastic Analysis
Volume 2011 (2011), Article ID 247329, 89 pages
Multiresolution Hilbert Approach to Multidimensional Gauss-Markov Processes
1Lewis-Sigler Institute, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544, USA
2Laboratory of Mathematical Physics, The Rockefeller University, New York, NY 10065, USA
3Mathematical Neuroscience Laboratory, Collège de France, CIRB, 11 Place Marcelin Berthelot, CNRS UMR 7241 and INSERM U 1050, Université Pierre et Marie Curie ED, 158 and Memolife PSL, 75005 Paris, France
4INRIA BANG Laboratory, Paris, France
Received 28 April 2011; Accepted 6 October 2011
Academic Editor: Agnès Sulem
Copyright © 2011 Thibaud Taillefumier and Jonathan Touboul. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The study of the multidimensional stochastic processes involves complex computations in intricate functional spaces. In particular, the diffusion processes, which include the practically important Gauss-Markov processes, are ordinarily defined through the theory of stochastic integration. Here, inspired by the Lévy-Ciesielski construction of the Wiener process, we propose an alternative representation of multidimensional Gauss-Markov processes as expansions on well-chosen Schauder bases, with independent random coefficients of normal law with zero mean and unit variance. We thereby offer a natural multiresolution description of the Gauss-Markov processes as limits of finite-dimensional partial sums of the expansion, that are strongly almost-surely convergent. Moreover, such finite-dimensional random processes constitute an optimal approximation of the process, in the sense of minimizing the associated Dirichlet energy under interpolating constraints. This approach allows for a simpler treatment of problems in many applied and theoretical fields, and we provide a short overview of applications we are currently developing.
Intuitively, multidimensional continuous stochastic processes are easily conceived as solutions to randomly perturbed differential equations of the form where the perturbative term implicitly defines a probability space and satisfies some ad hoc regularity conditions. Although the existence of such processes is well established for a wide range of equations through the standard Itô integration theory (see, e.g., ), studying their properties proves surprisingly challenging, even for the simplest multidimensional processes. Indeed, the high dimensionality of the ambient space and the nowhere differentiability of the sample paths conspire to heighten the intricacy of the sample path spaces. In this regard, such spaces have been chiefly studied for multidimensional diffusion processes, and more recently, the development of rough paths theory has attracted renewed interest in the field (see [3–7]). However, aside from these remarkable theoretical works, little emphasis is put on the sample paths, since most of the available results only make sense in distribution. This is particularly true in the Itô integration theory, where the sample path is completely neglected because the Itô map is only defined up to null sets of paths.
To overcome the difficulty of working in complex multidimensional spaces, it would be advantageous to have a discrete construction of a continuous stochastic process through finite-dimensional distributions. Since we put emphasis on the description of the sample path space, the goal is to write a process $X$ as an almost surely pathwise convergent series of random functions, $X_t = \lim_{N \to \infty} \sum_{n=0}^{N} f_n(t)\, \Xi_n$, where $f_n$ is a deterministic function and $\Xi_n$ is a given random variable.
The Lévy-Ciesielski construction of the $d$-dimensional Brownian motion $W$ (also referred to as the Wiener process) provides us with an example of a discrete representation for a continuous stochastic process. Noticing the simple form of the probability density of a Brownian bridge, it is based on completing sample paths by interpolation according to the conditional probabilities of the Wiener process. More specifically, the coefficients $\Xi_n$ are Gaussian and independent, and the elements $f_n$, called the Schauder elements and denoted by $\mathbf{s}_{n,k}$, are obtained by time-dependent integration of the Haar basis elements: $\mathbf{s}_{0,0}(t) = t\, I_d$ and $\mathbf{s}_{n,k}(t) = s_{n,k}(t)\, I_d$, with, for all $n \geq 1$ and $0 \leq k < 2^{n-1}$, $$s_{n,k}(t) = \begin{cases} 2^{(n-1)/2}\,(t - k\,2^{-n+1}), & k\,2^{-n+1} \leq t \leq (2k+1)\,2^{-n}, \\ 2^{(n-1)/2}\,((k+1)\,2^{-n+1} - t), & (2k+1)\,2^{-n} \leq t \leq (k+1)\,2^{-n+1}, \\ 0, & \text{otherwise}. \end{cases}$$ This latter point is of relevance since, the Haar basis being a Hilbert system, its introduction greatly simplifies the demonstration of the existence of the Wiener process. From another perspective, fundamental among discrete representations is the Karhunen-Loève decomposition, which represents a stochastic process by expanding it on a basis of orthogonal functions [10, 11]. The definition of the basis elements depends only on the second-order statistics of the considered process, and the coefficients are pairwise uncorrelated random variables. Incidentally, such a decomposition is especially suited to the study of Gaussian processes because the coefficients of the representation are then Gaussian and independent. For these reasons, the Karhunen-Loève decomposition is of primary importance in exploratory data analysis, leading to methods referred to as "principal component analysis," "Hotelling transform," or "proper orthogonal decomposition" according to the field of application. In particular, it was directly applied to the study of the stationary Gaussian Markov processes in the theory of random noise in radio receivers.
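As an illustration of the construction just described, the following is a minimal sketch for the scalar Wiener process on $[0,1]$. It assumes the dyadic indexing in which level $n \geq 1$ carries $2^{n-1}$ tent functions of height $2^{-(n+1)/2}$; the function names (`schauder`, `draw_coefficients`, `partial_sum`) are hypothetical and not taken from the paper.

```python
import numpy as np


def schauder(n, k, t):
    """Scalar Schauder element s_{n,k} on [0, 1] (dyadic indexing assumed)."""
    t = np.asarray(t, dtype=float)
    if n == 0:
        return t.copy()  # s_{0,0}(t) = t
    left = k * 2.0 ** (1 - n)
    mid = (2 * k + 1) * 2.0 ** (-n)
    right = (k + 1) * 2.0 ** (1 - n)
    scale = 2.0 ** ((n - 1) / 2)
    rising = (t >= left) & (t <= mid)
    falling = (t > mid) & (t <= right)
    return np.where(rising, scale * (t - left),
                    np.where(falling, scale * (right - t), 0.0))


def draw_coefficients(depth, rng):
    """One standard normal coefficient per element, for levels 0..depth."""
    return [rng.standard_normal(1)] + [
        rng.standard_normal(2 ** (n - 1)) for n in range(1, depth + 1)
    ]


def partial_sum(coeffs, t):
    """Finite Schauder expansion using every level present in `coeffs`."""
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for n, level in enumerate(coeffs):
        for k, xi in enumerate(level):
            total = total + xi * schauder(n, k, t)
    return total


rng = np.random.default_rng(0)
coeffs = draw_coefficients(6, rng)
grid = np.linspace(0.0, 1.0, 5)         # endpoints reached after level 2
coarse = partial_sum(coeffs[:3], grid)  # levels 0, 1, 2 only
fine = partial_sum(coeffs, grid)        # levels 0 through 6
```

Under these assumptions, `coarse` and `fine` coincide on `grid`: elements of level 3 and above vanish at the points $k/4$, so adding finer levels leaves the values at coarser dyadic points unchanged.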
It is also important for our purpose to realize that the Schauder elements $s_{n,k}$ have compact supports that exhibit a nested structure: this fact entails that the finite sums are processes that interpolate the limit process on the endpoints of the supports, that is, on the dyadic points $k\,2^{-N}$, $0 \leq k \leq 2^N$. One of the specific goals of our construction is to maintain such a property in the construction of all multidimensional Gauss-Markov processes (i.e., processes that are both Gaussian and satisfy the Markov property) of the form $X_t = g(t) \int_0^t f(s)\, dW_s$ (covering all one-dimensional Gauss-Markov processes thanks to Doob's representation of Gauss-Markov processes), which are successively approximated by finite-dimensional processes that interpolate $X$ at ever finer resolution. It is only in that sense that we refer to our framework as a multiresolution approach, as opposed to the wavelet multiresolution theory. Other multiresolution approaches have been developed for certain Gaussian processes, most notably for the fractional Brownian motion.
In view of this, we propose a construction of the multidimensional Gaussian Markov processes using a multiresolution Schauder basis of functions. As for the Lévy-Ciesielski construction, and in contrast with the Karhunen-Loève decomposition, our basis is not made of orthogonal functions, but the elements have nested compact supports and the random coefficients are always independent and Gaussian (for convenience with law $\mathcal{N}(0, I_d)$, i.e., with zero mean and unit variance). We first develop a heuristic approach for the construction of stochastic processes reminiscent of the midpoint displacement technique [8, 9], before rigorously deriving the multiresolution basis that we will be using throughout the paper. This set of functions is then studied as a multiresolution Schauder basis of functions: in particular, we derive explicitly from the multiresolution basis a Haar-like Hilbert basis, which is the underlying structure explaining the dual relationship between basis elements and coefficients. Based on these results, we study the construction application and its inverse, the coefficient application, which relate coefficients on the Schauder basis to sample paths. We follow up by proving the almost-sure and strong convergence of the process having independent standard normal coefficients on the Schauder basis to a Gauss-Markov process. We also show that our decomposition is optimal in a sense that is strongly evocative of spline interpolation theory: the construction yields successive interpolations of the process at the interval endpoints that minimize the Dirichlet energy induced by the differential operator associated with the Gauss-Markov process [18, 19].
We also provide a series of examples for which the proposed Schauder framework yields bases of functions that have simple closed-form formulae: in addition to the simple one-dimensional Markov processes, we make our framework explicit for two classes of multidimensional processes, the Gauss-Markov rotations and the iteratively integrated Wiener processes (see, e.g., [20–22]).
The ideas underlying this work can be directly traced back to the original work of Lévy. Here, we intend to develop a self-contained Schauder dual framework to further the description of multidimensional Gauss-Markov processes, and, in doing so, we extend some well-known results of interpolation theory in signal processing [23–25]. To our knowledge, such an approach is yet to be proposed. By restricting our attention to the Gauss-Markov processes, we obviously do not claim generality. However, we hope our construction proves of interest for a number of reasons, which we tentatively list in the following. First, the almost-sure pathwise convergence of our construction, together with the interpolation property of the finite sums, allows one to reformulate results of stochastic integration in terms of the geometry of finite-dimensional sample paths. In this regard, we found it appropriate to illustrate how, in our framework, the Girsanov theorem for the Gauss-Markov processes appears as a direct consequence of the finite-dimensional change of variable formula. Second, the characterization of our Schauder elements as the minimizers of a Dirichlet form paves the way to the construction of infinite-dimensional Gauss-Markov processes, that is, processes whose sample points themselves are infinite-dimensional [26, 27]. Third, our construction shows that approximating a Gaussian process by a sequence of interpolating processes relies entirely on the existence of a regular triangularization of the covariance operator, which suggests investigating this property further for non-Markov Gaussian processes.
Finally, there are a number of practical applications where applying the Schauder basis framework clearly provides an advantage compared to standard stochastic calculus methods, among which are first-hitting times of stochastic processes, pricing of multidimensional path-dependent options [29–32], regularization techniques for support vector machine learning, and more theoretical work on uncovering the differential geometry structure of the space of the Gauss-Markov stochastic processes. We conclude our exposition by developing in more detail some of these direct implications, which will be the subjects of forthcoming papers.
2. Heuristic Approach to the Construction
In order to provide a discrete multiresolution description of the Gauss-Markov processes, we first establish basic results about the law of the Gauss-Markov bridges in the multidimensional setting. We then use them to infer the candidate expressions for our desired bases of functions, while requiring their elements to be compactly supported on a nested sequence of segments. Throughout this paper, we work in a complete probability space $(\Omega, \mathcal{F}, \mathbf{P})$.
2.1. Multidimensional Gauss-Markov Processes
After recalling the definition of the multidimensional Gauss-Markov processes in terms of stochastic integral, we use the well-known conditioning formula for the Gaussian vectors to characterize the law of the Gauss-Markov bridge processes.
2.1.1. Notations and Definitions
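Let $(W_t, \mathcal{F}_t)_{t \in [0,1]}$ be an $m$-dimensional Wiener process, consider the continuous functions $\alpha : [0,1] \to \mathbb{R}^{d \times d}$ and $\sqrt{\Gamma} : [0,1] \to \mathbb{R}^{d \times m}$, and define the positive bounded continuous function $\Gamma = \sqrt{\Gamma} \cdot \sqrt{\Gamma}^T : [0,1] \to \mathbb{R}^{d \times d}$. The $d$-dimensional Ornstein-Uhlenbeck process associated with these parameters is the solution of the equation $$dX_t = \alpha(t)\, X_t\, dt + \sqrt{\Gamma(t)}\, dW_t, \quad (2.1)$$ and, with initial condition $X_{t_0}$ at time $t_0$, it reads $$X_t = F(t_0, t)\, X_{t_0} + F(t_0, t) \int_{t_0}^{t} F(s, t_0)\, \sqrt{\Gamma(s)}\, dW_s, \quad (2.2)$$ where $F(t_0, t)$ is the flow of the equation, namely, the solution in $\mathbb{R}^{d \times d}$ of the linear equation $$\frac{\partial F(t_0, t)}{\partial t} = \alpha(t)\, F(t_0, t), \qquad F(t_0, t_0) = I_d. \quad (2.3)$$ Note that the flow enjoys the chain rule property $$F(t_0, t) = F(t_1, t)\, F(t_0, t_1). \quad (2.4)$$ For all $s, t$ such that $t_0 < s, t$, the vectors $X_s$ and $X_t$ admit the covariance $$C_{t_0}(s, t) = F(t_0, s) \left( \int_{t_0}^{s \wedge t} F(w, t_0)\, \Gamma(w)\, F(w, t_0)^T\, dw \right) F(t_0, t)^T = F(t_0, s)\, h_{t_0}(t_0, s \wedge t)\, F(t_0, t)^T, \quad (2.5)$$ where we further defined the function $$h_u(s, t) = \int_{s}^{t} F(w, u)\, \Gamma(w)\, F(w, u)^T\, dw, \quad (2.6)$$ which will be of particular interest in the sequel. Note that because of the chain rule property of the flow, we have $$h_v(s, t) = F(u, v)\, h_u(s, t)\, F(u, v)^T. \quad (2.7)$$ We suppose that the process is never degenerate, that is, for all $u < v$, all the components of the vector $X_v$ given $X_u$ are nondeterministic random variables, which is equivalent to saying that the covariance matrix of $X_v$ given $X_u$, denoted by $C_u(v, v)$, is symmetric positive definite for any $u < v$. Therefore, assuming the initial condition $X_0 = 0$, the multidimensional centered process $X$ has a representation (similar to Doob's representation for one-dimensional processes, see ) of the form $$X_t = g(t) \int_0^t f(s)\, dW_s, \quad (2.8)$$ with $g(t) = F(0, t)$ and $f(t) = F(t, 0)\, \sqrt{\Gamma(t)}$.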
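To make the objects of this subsection concrete, here is a small sketch for the scalar case with constant coefficients, where everything is available in closed form. It assumes a one-dimensional process $dX_t = \alpha X_t\, dt + \sqrt{\Gamma}\, dW_t$ with $X_0 = 0$ and $\alpha \neq 0$, so that the flow from 0 to $t$ is $e^{\alpha t}$; the function name `ou_covariance` is hypothetical.

```python
import numpy as np


def ou_covariance(s, t, alpha, gamma):
    """Cov(X_s, X_t) for dX = alpha * X dt + sqrt(gamma) dW with X_0 = 0.

    Built from the Doob-type factorisation g(s) * h(min(s, t)) * g(t),
    assuming constant scalar coefficients and alpha != 0.
    """
    def g(u):
        # Flow from time 0 to time u.
        return np.exp(alpha * u)

    def h(u):
        # Integral over [0, u] of f(w)^2 with f(w) = exp(-alpha * w) * sqrt(gamma).
        return gamma * (1.0 - np.exp(-2.0 * alpha * u)) / (2.0 * alpha)

    return g(s) * h(min(s, t)) * g(t)


variance_half = ou_covariance(0.5, 0.5, alpha=-1.0, gamma=2.0)
cross = ou_covariance(0.25, 0.75, alpha=-1.0, gamma=2.0)
```

In this setting the Markov property shows up as the factorisation `cross == exp(alpha * (t - s)) * ou_covariance(s, s, ...)` for `s < t`, which is a convenient sanity check on any implementation of the covariance.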
Note that the processes considered in this paper are defined on the time interval $[0,1]$. However, because of the time-rescaling property of these processes, this entails no loss of generality: considering the processes on $[0,1]$ is equivalent to considering them on any other bounded interval.
2.1.2. Conditional Law and Gauss-Markov Bridges
As stated in the introduction, we aim at defining a multiresolution description of Gauss-Markov processes. Such a description can be seen as a multiresolution interpolation of the process that is getting increasingly finer. This principle, in addition to the Markov property, prescribes to characterize the law of the corresponding Gauss-Markov bridge, that is, the Gauss-Markov process under consideration, conditioned on its initial and final values. The bridge process of the Gauss process is still a Gauss process and, for a Markov process, its law can be computed as follows.
Proposition 2.1. Let two times in the interval . For any , the random variable conditioned on and is a Gaussian variable with covariance matrix and mean vector given by where the continuous matrix functions and of are given by
Note that the functions and have the property that and ensuring that the process is indeed equal to at time and at time .
Proof. Let be two times of the interval such that , and let . We consider the Gaussian random variable conditioned on the fact that . Its mean can be easily computed from expression (2.2) and reads and its covariance matrix, from (2.5), reads From there, we apply the conditioning formula for the Gaussian vectors (see, e.g., ) to infer the law of conditioned on and , that is the law of where denotes the bridge process obtained by pinning in and . The covariance matrix is given by and the mean reads where we have used the fact that . The regularity of the thus-defined functions and directly stems from the regularity of the flow operator . Moreover, since for any , we observe that and ; we clearly have and .
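The conditioning step used in this proof can be checked numerically. The sketch below applies the standard conditioning formula for Gaussian vectors to a centered scalar Gaussian process described only by its covariance function; the name `bridge_law` and its arguments are hypothetical, and the two pinning times are assumed to be distinct and to give an invertible covariance matrix (for the Wiener process, this excludes pinning at time 0).

```python
import numpy as np


def bridge_law(cov, t_left, t, t_right, x, z):
    """Mean and variance of X_t given X_{t_left} = x and X_{t_right} = z.

    `cov(a, b)` is the covariance function of a centred scalar Gaussian process.
    """
    pins = [t_left, t_right]
    c_pp = np.array([[cov(a, b) for b in pins] for a in pins])  # pinned block
    c_tp = np.array([cov(t, a) for a in pins])                  # cross block
    weights = np.linalg.solve(c_pp, c_tp)   # c_pp is symmetric
    mean = float(weights @ np.array([x, z]))
    var = float(cov(t, t) - weights @ c_tp)
    return mean, var


def wiener_cov(a, b):
    return min(a, b)


mean_mid, var_mid = bridge_law(wiener_cov, 0.25, 0.5, 0.75, x=1.0, z=3.0)
mean_pin, var_pin = bridge_law(wiener_cov, 0.25, 0.25, 0.75, x=1.0, z=3.0)
```

For the Wiener process this recovers the familiar Brownian bridge: linear interpolation for the mean, variance $(t - t_{\mathrm{left}})(t_{\mathrm{right}} - t)/(t_{\mathrm{right}} - t_{\mathrm{left}})$, and a degenerate law at the pinning times.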
Remark 2.2. Note that these laws can also be computed using the expression of the density of the processes, but this involves more intricate calculations. An alternative approach also provides a representation of Gauss-Markov bridges with the use of integral and anticipative representations. These approaches allow one to compute the probability distribution of the Gauss-Markov bridge as a process (i.e., they allow one to compute the covariances), but since this will be of no use in the sequel, we do not provide the expressions.
2.2. The Multiresolution Description of Gauss-Markov Processes
Recognizing the Gauss property and the Markov property as the two crucial elements for a stochastic process to be expanded à la Lévy-Ciesielski, our approach first proposes to exhibit bases of deterministic functions that would play the role of the Schauder bases for the Wiener process. In this regard, we first expect such functions to be continuous and compactly supported on increasingly finer supports (i.e., subintervals of the definition interval $[0,1]$) in a similar nested binary tree structure. Then, as in the Lévy-Ciesielski construction, we envision that, at each resolution (i.e., on each support), the partially constructed process (up to the resolution of the support) has the same conditional expectation as the Gauss-Markov process when conditioned on the endpoints of the supports. The partial sums obtained with independent Gaussian coefficients of law $\mathcal{N}(0, I_d)$ will thus approximate the targeted Gauss-Markov process in a multiresolution fashion, in the sense that, at every resolution, considering these two processes on the interval endpoints yields finite-dimensional Gaussian vectors of the same law.
2.2.1. Nested Structure of the Sequence of Supports
Here, we define the nested sequence of segments that constitute the supports of the multiresolution basis. We construct such a sequence by recursively partitioning the interval .
More precisely, starting from with and , we iteratively apply the following operation. Suppose that, at the th step, the interval is decomposed into intervals , called supports, such that for . Each of these intervals is then subdivided into two child intervals, a left-child and a right-child , and the subdivision point is denoted by . Therefore, we have defined three sequences of real numbers , , and for and satisfying and with the convention and and . The resulting sequence of supports clearly has a binary tree structure.
For the sake of compactness of notations, we define the set of indices and for , we define , the set of endpoints of the intervals . We additionally require that there exists such that for all which in particular implies that and ensures that the set of endpoints is everywhere dense in . The simplest case of such partitions is the dyadic partition of , where the endpoints for read in which case the endpoints are simply the dyadic points . Figure 1 represents the global architecture of the nested sequence of intervals.
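For the dyadic case, the nested supports can be generated explicitly. The sketch below is one possible indexing, assumed here for illustration: level $n \geq 1$ holds $2^{n-1}$ supports with left endpoint $k\,2^{-n+1}$, midpoint $(2k+1)\,2^{-n}$, and right endpoint $(k+1)\,2^{-n+1}$. The names `dyadic_supports` and `endpoints` are hypothetical; exact rational arithmetic is used so that set operations on endpoints are reliable.

```python
from fractions import Fraction


def dyadic_supports(depth):
    """Map (n, k) to (left, mid, right) for levels n = 1..depth."""
    supports = {}
    for n in range(1, depth + 1):
        for k in range(2 ** (n - 1)):
            left = Fraction(k, 2 ** (n - 1))
            mid = Fraction(2 * k + 1, 2 ** n)
            right = Fraction(k + 1, 2 ** (n - 1))
            supports[(n, k)] = (left, mid, right)
    return supports


def endpoints(supports, level):
    """Sorted set of all endpoints and midpoints of supports up to `level`."""
    points = set()
    for (n, _k), triple in supports.items():
        if n <= level:
            points.update(triple)
    return sorted(points)


supports = dyadic_supports(4)
d3 = endpoints(supports, 3)
```

With this indexing, the left child of support $(n, k)$ is $(n+1, 2k)$ and its right child is $(n+1, 2k+1)$, and `d3` is exactly the set of dyadic points $k/8$.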
The nested structure of the supports, together with the constraint of continuity of the bases elements, implies that only a finite number of coefficients are needed to construct the exact value of the process at a given endpoint, thus providing us with an exact schema to simulate the sample values of the process on the endpoint up to an arbitrary resolution, as we will further explore.
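The exact simulation scheme alluded to here can be sketched for the simplest case, the scalar Wiener process on a dyadic partition, where the conditional law at a midpoint is that of a Brownian bridge. This is an illustrative midpoint-refinement routine under those assumptions, not the general Gauss-Markov scheme; the names `refine` and `sample_on_dyadic_grid` are hypothetical.

```python
import numpy as np


def refine(times, values, rng):
    """Insert midpoints drawn from the Brownian bridge law on each interval."""
    gaps = np.diff(times)
    mid_times = times[:-1] + 0.5 * gaps
    mid_mean = 0.5 * (values[:-1] + values[1:])  # bridge mean at the midpoint
    mid_std = 0.5 * np.sqrt(gaps)                # bridge variance is gap / 4
    mid_values = mid_mean + mid_std * rng.standard_normal(mid_times.shape)
    new_times = np.empty(times.size + mid_times.size)
    new_values = np.empty_like(new_times)
    new_times[0::2], new_times[1::2] = times, mid_times
    new_values[0::2], new_values[1::2] = values, mid_values
    return new_times, new_values


def sample_on_dyadic_grid(depth, rng):
    """Return the list of (times, values) after 0, 1, ..., depth refinements."""
    times = np.array([0.0, 1.0])
    values = np.array([0.0, rng.standard_normal()])  # W_0 = 0, W_1 ~ N(0, 1)
    history = [(times, values)]
    for _ in range(depth):
        times, values = refine(times, values, rng)
        history.append((times, values))
    return history


history = sample_on_dyadic_grid(4, np.random.default_rng(1))
```

Each refinement only adds new sample values; those already drawn at coarser endpoints are never modified, which is the interpolation property in simulation form.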
2.2.2. Innovation Processes for Gauss-Markov Processes
For , a multidimensional Gauss-Markov process, we call the multiresolution description of a process the sequence of conditional expectations on the nested sets of endpoints . In detail, if we denote by the filtration generated by given the values of the process at the endpoints of the partition, we introduce the sequence of the Gaussian processes defined by: These processes can be naturally viewed as an interpolation of the process sampled at the increasingly finer times since for all we have . The innovation process is defined as the update transforming the process into , that is, It corresponds to the difference that the additional knowledge of the process at the points makes to the conditional expectation of the process. This process satisfies the following important properties, on which our multiresolution construction is founded.
Proposition 2.3. The innovation process is a centered Gaussian process independent of the processes for any . For and with , the covariance of the innovation process reads where with , and as defined in Proposition 2.1.
Proof. Because of the Markov property of the process , the law of the process can be computed from the bridge formula derived in Proposition 2.1 and we have Therefore, the innovation process can be written for as where is a measurable process, a deterministic matrix function, and The expressions of and are quite complex but are greatly simplified when noting that directly implying that and yielding the remarkably compact expression This process is a centered Gaussian process. Moreover, observing that it is -measurable, it can be written as and the process appears as the Gauss-Markov bridge conditioned at times and , whose covariance is given by Proposition 2.1 and has the expression Let , and assume that and . If , then, because of the Markov property of the process , the two bridges are independent and therefore the covariance is zero. If , we have Finally, the independence property stems from the simple properties of the conditional expectation. Indeed, let . We have and the fact that a zero covariance between two Gaussian processes implies the independence of these processes concludes the proof.
2.2.3. Derivation of the Candidate Multiresolution Bases of Functions
We deduce from the previous proposition the following fundamental theorem of this paper.
Theorem 2.4. For all , there exists a collection of that are zero outside the subinterval such that in distribution one has: where are independent -dimensional standard normal random variables (i.e., of law ). This basis of functions is unique up to an orthogonal transformation.
Proof. The two processes and are two Gaussian processes of mean zero. Therefore, we are searching for functions vanishing outside and ensuring that the two processes have the same probability distribution. A necessary and sufficient condition for the two processes to have the same probability distribution is to have the same covariance function (see, e.g., ). We therefore need to show the existence of a collection of functions that vanish outside the subinterval and that ensure that the covariance of the process is equal to the covariance of . Let such that and . If , the fact that the functions vanish outside implies that If , the covariance reads which needs to be equal to the covariance of , namely, Therefore, since , we have We can hence now define as a square root of the symmetric positive matrix , by fixing in (2.35) Finally, since by assumption we have that is invertible, so is , and the functions can be written as with being a square root of . Square roots of positive symmetric matrices are uniquely defined up to an orthogonal transformation. Therefore, all square roots of are related by orthogonal transformations , where . This property immediately extends to the functions we are studying: two different functions and satisfying the theorem differ by an orthogonal transformation . We proved that, for to have the same law as in the interval , the functions with support in are necessarily of the form . It is straightforward to show the sufficient condition that, provided such a set of functions, the processes and are equal in law, which ends the proof of the theorem.
Using the expressions obtained in Proposition 2.1, we can make completely explicit the form of the basis in terms of the functions , and : and satisfies Note that can be defined uniquely as the symmetric positive square root, or as the lower triangular matrix resulting from the Cholesky decomposition of .
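The two canonical choices of square root mentioned here, and the fact that they differ by an orthogonal transformation, can be illustrated numerically. The sketch below uses an arbitrary symmetric positive definite matrix chosen purely for illustration (it is not taken from the paper), and follows the convention that a square root $\sigma$ of $\Sigma$ satisfies $\sigma \sigma^T = \Sigma$.

```python
import numpy as np


def symmetric_sqrt(sigma):
    """Symmetric positive definite square root via the eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(sigma)
    return (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.T


def cholesky_sqrt(sigma):
    """Lower triangular factor L with L @ L.T == sigma."""
    return np.linalg.cholesky(sigma)


sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])           # illustrative SPD matrix
root_sym = symmetric_sqrt(sigma)
root_chol = cholesky_sqrt(sigma)
# The matrix relating the two roots: root_chol = root_sym @ rotation.
rotation = np.linalg.solve(root_sym, root_chol)
```

Both factors reproduce `sigma`, and `rotation` is orthogonal, which is the sense in which the basis is unique up to an orthogonal transformation.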
Let us now define the function such that the process has the same covariance as , which is computed using exactly the same technique as that developed in the proof of Theorem 2.4 and that has the expression for , a square root of the covariance matrix of which from (2.5) reads
We are now in position to show the following corollary of Theorem 2.4.
Corollary 2.5. The Gauss-Markov process is equal in law to the process where are independent standard normal random variables .
Proof. We have
We therefore identified a collection of functions that allows a simple construction of the Gauss-Markov process iteratively conditioned on increasingly finer partitions of the interval . We will show that this sequence converges almost surely towards the Gauss-Markov process used to construct the basis, proving that these finite-dimensional continuous functions form an asymptotically accurate description of the initial process. Beforehand, we rigorously study the Hilbertian properties of the collection of functions we just defined.
3. The Multiresolution Schauder Basis Framework
The above analysis motivates the introduction of a set of functions that we now study in detail. In particular, we elucidate the structure of the collection of functions as a Schauder basis in a certain space of continuous functions from to . The Schauder structure was defined in [38, 39], and its essential characterization is the unique decomposition property: namely, that every element in can be written as a well-formed linear combination and that the coefficients satisfying the previous relation are unique.
3.1. System of Dual Bases
To complete this program, we need to introduce some quantities that will play a crucial role in expressing the family as a Schauder basis for some given space. In (2.39), two constant matrices appear that will have a particular importance in the sequel for in with : where stands for . We further define the matrix and we recall that is a square root of , the covariance matrix of , conditionally on and , given in (2.29). We stress that the matrices , , and are all invertible and satisfy the following important properties.
Proposition 3.1. For all in , , one has:(i)(ii).
To prove this proposition, we first establish the following simple lemma of linear algebra.
Lemma 3.2. Given two invertible matrices and in such that is also invertible, if one defines , one has the following properties:(i)(ii).
Let us define . With this notation, we define the functions in a compact form as follows.
Definition 3.3. For every in with , the continuous functions are defined on their support as and the basis element is given on by
The definition implies that are continuous functions in the space of piecewise derivable functions with piecewise continuous derivative which takes value zero at zero. We denote such a space by .
Before studying the property of the functions , it is worth remembering that their definitions include the choice of a square root of . Properly speaking, there is thus a class of bases and all the points we develop in the sequel are valid for this class. However, for the sake of simplicity, we consider from now on that the basis under scrutiny results from choosing the unique square root that is lower triangular with positive diagonal entries (the Cholesky decomposition).
3.1.1. Underlying System of Orthonormal Functions
We first introduce a family of functions and show that it constitutes an orthogonal basis of a certain Hilbert space. The choice of this basis can seem arbitrary at first sight, but the definition of these functions will appear natural through its relationship with the functions and , which is made explicit in the sequel; the mathematical rigor of the argument leads us to choose this apparently artificial introduction.
Definition 3.4. For every in with , we define a continuous function which is zero outside its support and has the expressions: The basis element is defined on by
Note that the definitions make apparent the fact that these two families of functions are linked for all in through the simple relation Moreover, this collection of functions constitutes an orthogonal basis of functions, in the following sense.
Proposition 3.5. Let be the closure of equipped with the natural norm of . It is a Hilbert space, and moreover, for all , the family of functions defined as the columns of , namely forms a complete orthonormal basis of .
Proof. The space is clearly a Hilbert space, as a closed subspace of the larger Hilbert space equipped with the standard scalar product:
We now proceed to demonstrate that the columns of form an orthonormal family which generates a dense subspace of . To this end, we define as the space of functions
that is, the space of functions that take values in the set of -matrices whose columns are in . This definition allows us to define the bilinear function as
and we observe that the columns of form an orthonormal system if and only if
where is the Kronecker delta function, whose value is 1 if and , and 0 otherwise.
First of all, since the functions are zero outside the interval , the matrix is nonzero only if . In such cases, assuming that and, for example, that , we necessarily have strictly included in : more precisely, is either included in the left-child support or in the right-child support of . In both cases, writing the matrix shows that it is expressed as a matrix product whose factors include . We then show that which entails that if . If , we remark that , and we conclude that from the preceding case. For , we directly compute for the only nonzero term Using the passage relationship between the symmetric functions and given in (2.7), we can then write Proposition 3.1 implies that which directly implies that . For , a computation of the exact same flavor yields that . Hence, we have proved that the collection of columns of forms an orthonormal family of functions in (the definition of clearly states that its columns can be written in the form of elements of ).
The proof now amounts to showing the density of the family of functions we consider. Before showing this density property, we introduce for all in the functions with support on defined by Showing that the family of columns of is dense in is equivalent to showing that the column vectors of the matrices seen as a function of are dense in . It is enough to show that the span of such functions contains the family of piecewise continuous -valued functions that are constant on , in (the density of the endpoints of the partition entails that the latter family generates ).
In fact, we show that the span of functions is exactly equal to the space of piecewise continuous functions from to that are constant on the supports , for any in . The fact that is included in is clear from the fact that the matrix-valued functions are defined constant on the support , for in .
We prove that is included in by induction on . The property is clearly true at rank since is then equal to the constant invertible matrix . Assuming that the proposition is true at rank for a given , let us consider a piecewise continuous function in . Note that, for every in , the function can only take two values on and can have a discontinuity jump at : let us denote these jumps as Now, note that for every in , the matrix-valued functions take only two matrix values on , namely, and . From Proposition 3.1, we know that is invertible. This fact directly entails that there exist vectors , for any in , such that . We then necessarily have that the function is piecewise constant on the supports , in . By the induction hypothesis, belongs to , so that belongs to , and we have proved that . Therefore, the space generated by the column vectors is dense in , which completes the proof that the functions form a complete orthonormal family of .
The fact that the column functions of form a complete orthonormal system of directly entails the following decomposition of the identity on .
Corollary 3.6. If is the real Dirac delta function, one has
Proof. Indeed, it is easy to verify that, for all in , we have for all where denotes the inner product in between and the -column of . Therefore, by the Parseval identity, we have in the sense
From now on, abusing language, we will say that the family of -valued functions is an orthonormal family of functions to refer to the fact that the columns of such matrices form an orthonormal set of . We now make explicit the relationship between this orthonormal basis and the functions derived in our analysis of the multidimensional Gauss-Markov processes.
3.1.2. Generalized Dual Operators
The Integral Operator
The basis is of great interest in this paper for its relationship to the functions that naturally arise in the decomposition of the Gauss-Markov processes. Indeed, the collection can be generated from the orthonormal basis through the action of the integral operator defined on into by where is an open set and, for any set denotes the indicator function of . Indeed, realizing that acts on into through where denotes the th -valued column function of , we easily see that for all in , , It is worth noticing that the introduction of the operator can be considered natural since it characterizes the centered Gauss-Markov process through loosely writing .
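The relationship between the orthonormal family and the Schauder-type elements can be illustrated in the simplest special case, the scalar Wiener process, where the integral operator is assumed to reduce to plain integration from 0 and the orthonormal family to the classical Haar system. The sketch below checks orthonormality and recovers tent functions by integration. The dyadic indexing and the names `haar` and `integrate_from_zero` are assumptions made for illustration. Because every Haar function used here is constant on each grid cell, the midpoint sums are exact.

```python
import numpy as np


def haar(n, k, t):
    """Haar function: 1 for n = 0; otherwise +/- 2^{(n-1)/2} on the two halves."""
    t = np.asarray(t, dtype=float)
    if n == 0:
        return np.ones_like(t)
    left = k * 2.0 ** (1 - n)
    mid = (2 * k + 1) * 2.0 ** (-n)
    right = (k + 1) * 2.0 ** (1 - n)
    scale = 2.0 ** ((n - 1) / 2)
    plus = ((t >= left) & (t < mid)).astype(float)
    minus = ((t >= mid) & (t < right)).astype(float)
    return scale * (plus - minus)


def integrate_from_zero(cell_values, step):
    """Integral from 0 up to the right edge of each grid cell."""
    return np.cumsum(cell_values) * step


cells = 2 ** 6
centres = (np.arange(cells) + 0.5) / cells   # one sample per grid cell
index = [(0, 0)] + [(n, k) for n in range(1, 5) for k in range(2 ** (n - 1))]
basis = np.array([haar(n, k, centres) for n, k in index])
gram = basis @ basis.T / cells               # L2([0, 1]) inner products
tent_10 = integrate_from_zero(basis[1], 1.0 / cells)   # from Haar (1, 0)
tent_21 = integrate_from_zero(basis[3], 1.0 / cells)   # from Haar (2, 1)
```

Here `gram` is the identity matrix, and the integrated functions are tents supported on the corresponding dyadic interval, peaking at its midpoint and returning to zero at its right endpoint.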
In order to exhibit a dual family of functions to the basis , we further investigate the property of the integral operator . In particular, we study the existence of an inverse operator , whose action on the orthonormal basis will conveniently provide us with a dual basis to . Such an operator does not always exist; nevertheless, under special assumptions, it can be straightforwardly expressed as a generalized differential operator.
The Differential Operator
Here, we make the assumptions that , that, for all , is invertible in , and that and have continuous derivatives, which especially implies that . In this setting, we define the space of functions in that are zero at zero and denote by its dual in the space of distributions (or generalized functions). Under the assumptions just made, the operator admits the differential operator defined by as its inverse, that is, when restricted to , we have on . The dual operators of and are expressed, for any in , as