Abstract
The study of multidimensional stochastic processes involves complex computations in intricate functional spaces. In particular, diffusion processes, which include the practically important Gauss-Markov processes, are ordinarily defined through the theory of stochastic integration. Here, inspired by the Lévy-Ciesielski construction of the Wiener process, we propose an alternative representation of multidimensional Gauss-Markov processes as expansions on well-chosen Schauder bases, with independent random coefficients of normal law with zero mean and unit variance. We thereby obtain a natural multiresolution description of Gauss-Markov processes as limits of finite-dimensional partial sums of the expansion, which are almost surely strongly convergent. Moreover, these finite-dimensional random processes constitute an optimal approximation of the process, in the sense of minimizing the associated Dirichlet energy under interpolating constraints. This approach allows for a simpler treatment of problems in many applied and theoretical fields, and we provide a short overview of applications we are currently developing.
1. Introduction
Intuitively, multidimensional continuous stochastic processes are easily conceived as solutions to randomly perturbed differential equations of the form , where the perturbative term implicitly defines a probability space and satisfies some ad hoc regularity conditions. Although the existence of such processes is well established for a wide range of equations through the standard Itô integration theory (see, e.g., [1]), studying their properties proves surprisingly challenging, even for the simplest multidimensional processes. Indeed, the high dimensionality of the ambient space and the nowhere differentiability of the sample paths conspire to heighten the intricacy of the sample path spaces. In this regard, such spaces have been chiefly studied for multidimensional diffusion processes [2], and more recently, the development of rough paths theory has attracted renewed interest in the field (see [3–7]). However, aside from these remarkable theoretical works, little emphasis is put on the sample paths, since most of the available results only make sense in distribution. This is particularly true in the Itô integration theory, where the sample path is completely neglected, as the Itô map is only defined up to null sets of paths.
To overcome the difficulty of working in complex multidimensional spaces, it would be advantageous to have a discrete construction of a continuous stochastic process through finite-dimensional distributions. Since we put emphasis on the description of the sample path space, what is at stake is to write a process as an almost surely pathwise convergent series of random functions, each term being the product of a deterministic function and a given random variable.
The Lévy-Ciesielski construction of the -dimensional Brownian motion (also referred to as the Wiener process) provides us with an example of a discrete representation of a continuous stochastic process. Noticing the simple form of the probability density of a Brownian bridge, it is based on completing sample paths by interpolation according to the conditional probabilities of the Wiener process [8]. More specifically, the coefficients are independent Gaussian random variables, and the elements, called the Schauder elements and denoted by , are obtained by time-dependent integration of the Haar basis elements. This latter point is of relevance since, the Haar family being a Hilbert system, its introduction greatly simplifies the proof of the existence of the Wiener process [9]. From another perspective, fundamental among discrete representations is the Karhunen-Loève decomposition, which represents a stochastic process by expanding it on a basis of orthogonal functions [10, 11]. The definition of the basis elements depends only on the second-order statistics of the considered process, and the coefficients are pairwise uncorrelated random variables. Incidentally, such a decomposition is especially suited to the study of Gaussian processes because the coefficients of the representation are then Gaussian and independent. For these reasons, the Karhunen-Loève decomposition is of primary importance in exploratory data analysis, leading to methods referred to as “principal component analysis,” “Hotelling transform” [12], or “proper orthogonal decomposition” [13] according to the field of application. In particular, it was directly applied to the study of stationary Gaussian Markov processes in the theory of random noise in radio receivers [14].
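To make the Lévy-Ciesielski construction concrete, the following sketch (a minimal one-dimensional illustration, not the multidimensional construction developed in this paper) samples Brownian motion on the dyadic points by midpoint displacement: each bridge midpoint is the average of the endpoint values plus an independent centered Gaussian displacement whose variance is a quarter of the support length, which is exactly the Schauder-element update with a standard normal coefficient.

```python
import numpy as np

def brownian_levy_ciesielski(n_levels, rng):
    """Sample Brownian motion on [0, 1] at the dyadic points k / 2**n_levels
    by midpoint displacement (the Levy-Ciesielski construction)."""
    N = 2 ** n_levels
    w = np.zeros(N + 1)
    w[-1] = rng.standard_normal()  # W(1) ~ N(0, 1): coefficient of s(t) = t
    step = N
    while step > 1:
        half = step // 2
        mids = np.arange(half, N, step)
        length = step / N  # current support length
        # bridge midpoint: mean of endpoints plus N(0, length / 4) displacement
        w[mids] = 0.5 * (w[mids - half] + w[mids + half]) \
                  + np.sqrt(length / 4.0) * rng.standard_normal(mids.size)
        step = half
    return w

rng = np.random.default_rng(0)
paths = np.array([brownian_levy_ciesielski(8, rng) for _ in range(4000)])
# empirical variance at t = 1/2 should be close to the exact value 1/2
print(paths[:, 128].var())
```

Each pass of the loop fills in one resolution level, so the partial construction interpolates the limit process at the dyadic endpoints already filled in, as discussed below.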
It is also important for our purpose to realize that the Schauder elements have compact supports that exhibit a nested structure: this fact entails that the finite sums are processes that interpolate the limit process at the endpoints of the supports, that is, at the dyadic points. One of the specific goals of our construction is to maintain such a property in the construction of all multidimensional Gauss-Markov processes (i.e., processes that are both Gaussian and satisfy the Markov property) of the form (covering all one-dimensional Gauss-Markov processes thanks to Doob’s representation of Gauss-Markov processes), which are successively approximated by finite-dimensional processes that interpolate at ever finer resolution. It is only in that sense that we refer to our framework as a multiresolution approach, as opposed to the wavelet multiresolution theory [15]. Other multiresolution approaches have been developed for certain Gaussian processes, most notably for the fractional Brownian motion [16].
In view of this, we propose a construction of multidimensional Gauss-Markov processes using a multiresolution Schauder basis of functions. As in the Lévy-Ciesielski construction, and in contrast with the Karhunen-Loève decomposition, our basis is not made of orthogonal functions; rather, the elements have nested compact supports and the random coefficients are always independent and Gaussian (for convenience with law , i.e., with zero mean and unit variance). We first develop a heuristic approach to the construction of stochastic processes, reminiscent of the midpoint displacement technique [8, 9], before rigorously deriving the multiresolution basis that we will be using throughout the paper. This set of functions is then studied as a multiresolution Schauder basis of functions: in particular, we derive explicitly from the multiresolution basis a Haar-like Hilbert basis, which is the underlying structure explaining the dual relationship between basis elements and coefficients. Based on these results, we study the construction map and its inverse, the coefficient map, which relate coefficients on the Schauder basis to sample paths. We follow up by proving the almost sure and strong convergence to a Gauss-Markov process of the process having independent standard normal coefficients on the Schauder basis. We also show that our decomposition is optimal in a sense that is strongly evocative of spline interpolation theory [17]: the construction yields successive interpolations of the process at the interval endpoints that minimize the Dirichlet energy induced by the differential operator associated with the Gauss-Markov process [18, 19].
We also provide a series of examples for which the proposed Schauder framework yields bases of functions with simple closed-form formulae: in addition to simple one-dimensional Markov processes, we make our framework explicit for two classes of multidimensional processes, the Gauss-Markov rotations and the iteratively integrated Wiener processes (see, e.g., [20–22]).
The ideas underlying this work can be directly traced back to the original work of Lévy. Here, we intend to develop a self-contained Schauder dual framework to further the description of multidimensional Gauss-Markov processes and, in doing so, we extend some well-known results of interpolation theory in signal processing [23–25]. To our knowledge, such an approach is yet to be proposed. By restricting our attention to Gauss-Markov processes, we obviously do not claim generality. However, we hope our construction proves of interest on a number of points, which we tentatively list in the following. First, the almost sure pathwise convergence of our construction, together with the interpolation property of the finite sums, allows us to reformulate results of stochastic integration in terms of the geometry of finite-dimensional sample paths. In this regard, we found it appropriate to illustrate how, in our framework, the Girsanov theorem for Gauss-Markov processes appears as a direct consequence of the finite-dimensional change of variable formula. Second, the characterization of our Schauder elements as minimizers of a Dirichlet form paves the way to the construction of infinite-dimensional Gauss-Markov processes, that is, processes whose sample points are themselves infinite-dimensional [26, 27]. Third, our construction shows that approximating a Gaussian process by a sequence of interpolating processes relies entirely on the existence of a regular triangularization of the covariance operator, suggesting further investigation of this property for non-Markov Gaussian processes [28].
Finally, there are a number of practical applications where the Schauder basis framework clearly provides an advantage over standard stochastic calculus methods, among which are first-hitting times of stochastic processes, pricing of multidimensional path-dependent options [29–32], regularization techniques for support vector machine learning [33], and more theoretical work on uncovering the differential-geometric structure of the space of Gauss-Markov stochastic processes [34]. We conclude our exposition by developing in more detail some of these direct implications, which will be the subjects of forthcoming papers.
2. Heuristic Approach to the Construction
In order to provide a discrete multiresolution description of Gauss-Markov processes, we first establish basic results about the law of Gauss-Markov bridges in the multidimensional setting. We then use them to infer candidate expressions for our desired bases of functions, while requiring their elements to be compactly supported on a nested sequence of segments. Throughout this paper, we work in a complete probability space .
2.1. Multidimensional Gauss-Markov Processes
After recalling the definition of multidimensional Gauss-Markov processes in terms of stochastic integrals, we use the well-known conditioning formula for Gaussian vectors to characterize the law of the Gauss-Markov bridge processes.
2.1.1. Notations and Definitions
Let be an -dimensional Wiener process, consider the continuous functions , and define the positive bounded continuous function . The -dimensional Ornstein-Uhlenbeck process associated with these parameters is the solution of the equation , which, with initial condition in , reads , where is the flow of the equation, namely, the solution in of the linear equation: Note that the flow enjoys the chain rule property: For all such that , the vectors and admit the covariance , where we have further defined the function , which will be of particular interest in the sequel. Note that, because of the chain rule property of the flow, we have We suppose that the process is never degenerate, that is, for all , all the components of the vector conditioned on are nondeterministic random variables, which is equivalent to saying that the covariance matrix of conditioned on , denoted by , is symmetric positive definite for any . Therefore, assuming the initial condition , the multidimensional centered process has a representation (similar to Doob’s representation for one-dimensional processes, see [35]) of the form with and .
Note that the processes considered in this paper are defined on the time interval . However, because of the time-rescaling property of these processes, considering the processes on this time interval is equivalent to considering the process on any other bounded interval without loss of generality.
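As an elementary illustration of the exponential flow and of the covariance function just introduced, the following sketch simulates a one-dimensional Ornstein-Uhlenbeck process by exact transition sampling; the parameters theta and sigma are illustrative choices, not quantities from the paper.

```python
import numpy as np

# One-dimensional Ornstein-Uhlenbeck process dX_t = -theta X_t dt + sigma dW_t,
# whose flow is F(t0, t1) = exp(-theta (t1 - t0)).  theta, sigma are arbitrary.
theta, sigma = 1.5, 0.8
rng = np.random.default_rng(1)

def ou_sample_path(x0, times, rng):
    x = np.empty(len(times))
    x[0] = x0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        mean = np.exp(-theta * dt) * x[i - 1]  # flow applied to previous value
        var = sigma**2 * (1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta)
        x[i] = mean + np.sqrt(var) * rng.standard_normal()
    return x

times = np.linspace(0.0, 1.0, 11)
paths = np.array([ou_sample_path(0.0, times, rng) for _ in range(5000)])
# with X_0 = 0, the variance at t = 1 is (sigma^2 / (2 theta)) (1 - e^{-2 theta})
# which evaluates to about 0.2027 for these parameters
print(paths[:, -1].var())
```

The transition mean is the flow applied to the current value, and the transition variance is the integrated covariance between the two times, mirroring the structure of the multidimensional representation above.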
2.1.2. Conditional Law and Gauss-Markov Bridges
As stated in the introduction, we aim at defining a multiresolution description of Gauss-Markov processes. Such a description can be seen as a multiresolution interpolation of the process at increasingly fine resolutions. This principle, in addition to the Markov property, prescribes characterizing the law of the corresponding Gauss-Markov bridge, that is, the Gauss-Markov process under consideration conditioned on its initial and final values. The bridge of a Gaussian process is still a Gaussian process and, for a Markov process, its law can be computed as follows.
Proposition 2.1. Let be two times in the interval . For any , the random variable conditioned on and is a Gaussian variable with covariance matrix and mean vector given by where the continuous matrix functions and of are given by
Note that the functions and have the property that and ensuring that the process is indeed equal to at time and at time .
Proof. Let be two times of the interval such that , and let . We consider the Gaussian random variable conditioned on the fact that . Its mean can be easily computed from expression (2.2) and reads and its covariance matrix, from (2.5), reads From there, we apply the conditioning formula for the Gaussian vectors (see, e.g., [36]) to infer the law of conditioned on and , that is the law of where denotes the bridge process obtained by pinning in and . The covariance matrix is given by and the mean reads where we have used the fact that . The regularity of the thus-defined functions and directly stems from the regularity of the flow operator . Moreover, since for any , we observe that and ; we clearly have and .
Remark 2.2. Note that these laws can also be computed using the expression of the density of the processes, but this involves more intricate calculations. An alternative approach also provides a representation of Gauss-Markov bridges with the use of integral and anticipative representations [37]. These approaches allow one to compute the probability distribution of the Gauss-Markov bridge as a process (i.e., to compute its covariance), but since this will be of no use in the sequel, we do not provide the expressions.
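The conditioning formula for Gaussian vectors invoked in the proof can be sketched numerically as follows; the Wiener covariance and the pinning times are illustrative choices used only to check the result against the classical Brownian bridge formulas.

```python
import numpy as np

def condition_gaussian(mu, Sigma, idx_obs, values):
    """Law of the unobserved block of a Gaussian vector N(mu, Sigma)
    given that the components listed in idx_obs equal `values`."""
    n = len(mu)
    idx_free = [i for i in range(n) if i not in idx_obs]
    S_ff = Sigma[np.ix_(idx_free, idx_free)]
    S_fo = Sigma[np.ix_(idx_free, idx_obs)]
    S_oo = Sigma[np.ix_(idx_obs, idx_obs)]
    K = S_fo @ np.linalg.inv(S_oo)  # regression (gain) matrix
    mean = mu[idx_free] + K @ (values - mu[idx_obs])
    cov = S_ff - K @ S_fo.T
    return mean, cov

# Wiener case: Cov(W_s, W_t) = min(s, t); pin W at t = 0.2 and t = 0.8
ts = np.array([0.2, 0.5, 0.8])
Sigma = np.minimum.outer(ts, ts)
mean, cov = condition_gaussian(np.zeros(3), Sigma, [0, 2], np.array([1.0, -1.0]))
# bridge formulas give mean a + (t-s)/(u-s)(b-a) = 0 and
# variance (t-s)(u-t)/(u-s) = 0.15 at t = 0.5
print(mean, cov)
```

The matrix K plays the role of the continuous matrix functions of Proposition 2.1 evaluated at the pinned times.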
2.2. The Multiresolution Description of Gauss-Markov Processes
Recognizing the Gauss property and the Markov property as the two crucial elements for a stochastic process to admit a Lévy-Ciesielski-type expansion, our approach first proposes to exhibit bases of deterministic functions that would play the role of the Schauder bases for the Wiener process. In this regard, we first expect such functions to be continuous and compactly supported on increasingly finer supports (i.e., subintervals of the definition interval ), in a similarly nested binary tree structure. Then, as in the Lévy-Ciesielski construction, we envision that, at each resolution (i.e., on each support), the partially constructed process (up to the resolution of the support) has the same conditional expectation as the Gauss-Markov process when conditioned on the endpoints of the supports. The partial sums obtained with independent Gaussian coefficients of law will thus approximate the targeted Gauss-Markov process in a multiresolution fashion, in the sense that, at every resolution, considering these two processes at the interval endpoints yields finite-dimensional Gaussian vectors of the same law.
2.2.1. Nested Structure of the Sequence of Supports
Here, we define the nested sequence of segments that constitute the supports of the multiresolution basis. We construct such a sequence by recursively partitioning the interval .
More precisely, starting from with and , we iteratively apply the following operation. Suppose that, at the th step, the interval is decomposed into intervals , called supports, such that for . Each of these intervals is then subdivided into two child intervals, a left child and a right child , and the subdivision point is denoted by . We have therefore defined three sequences of reals , , and for , satisfying and , with the conventions , , and . The resulting sequence of supports clearly has a binary tree structure.
For the sake of compactness of notation, we define the set of indices and, for , the set of endpoints of the intervals . We additionally require that there exists such that for all , which in particular implies that and ensures that the set of endpoints is everywhere dense in . The simplest case of such a partition is the dyadic partition of , in which case the endpoints are simply the dyadic points . Figure 1 represents the global architecture of the nested sequence of intervals.
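The dyadic special case of the nested supports can be sketched directly; the tuples below record the left endpoint, the subdivision midpoint, and the right endpoint of each support at a given level, and the assertion checks the binary-tree relation between consecutive levels.

```python
# Dyadic partition of [0, 1]: at step n the supports are
# S_{n,k} = [k 2^{-n}, (k+1) 2^{-n}] with subdivision point (2k+1) 2^{-(n+1)}.
def dyadic_supports(n):
    return [(k / 2**n, (2 * k + 1) / 2**(n + 1), (k + 1) / 2**n)
            for k in range(2**n)]

for n in range(3):
    supports = dyadic_supports(n)
    # children of S_{n,k}: left child [l, m] and right child [m, r]
    children = [iv for (l, m, r) in supports for iv in ((l, m), (m, r))]
    next_level = dyadic_supports(n + 1)
    assert [c[0] for c in children] == [s[0] for s in next_level]
    assert [c[1] for c in children] == [s[2] for s in next_level]

print(dyadic_supports(1))
```

Running the sketch prints the two level-1 supports [(0.0, 0.25, 0.5), (0.5, 0.75, 1.0)], whose endpoints are the dyadic points of the text.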
The nested structure of the supports, together with the continuity constraint on the basis elements, implies that only a finite number of coefficients are needed to construct the exact value of the process at a given endpoint, thus providing us with an exact scheme to simulate the sample values of the process at the endpoints up to an arbitrary resolution, as we will further explore.
2.2.2. Innovation Processes for Gauss-Markov Processes
For a multidimensional Gauss-Markov process , we call the multiresolution description of the process the sequence of conditional expectations on the nested sets of endpoints . In detail, if we denote by the filtration generated by the values of the process at the endpoints of the partition, we introduce the sequence of Gaussian processes defined by: These processes can be naturally viewed as interpolations of the process sampled at the increasingly finer times , since for all we have . The innovation process is defined as the update transforming the process into , that is, It captures the difference that the additional knowledge of the process at the points makes on the conditional expectation of the process. This process satisfies the following important properties, on which our multiresolution construction is founded.
Proposition 2.3. The innovation process is a centered Gaussian process independent of the processes for any . For and with , the covariance of the innovation process reads where with , and as defined in Proposition 2.1.
Proof. Because of the Markov property of the process , the law of the process can be computed from the bridge formula derived in Proposition 2.1, and we have Therefore, the innovation process can be written for as , where is a measurable process, a deterministic matrix function, and The expressions of and are quite complex but simplify considerably when noting that , directly implying that and yielding the remarkably compact expression This process is a centered Gaussian process. Moreover, observing that it is -measurable, it can be written as , and the process appears as the Gauss-Markov bridge conditioned at times and , whose covariance is given by Proposition 2.1 and has the expression Let , and assume that and . If , then, because of the Markov property of the process , the two bridges are independent and therefore the covariance is zero. If , we have Finally, the independence property stems from simple properties of the conditional expectation. Indeed, let . We have , and the fact that a zero covariance between two Gaussian processes implies the independence of these processes concludes the proof.
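The two key properties of the innovation process can be checked by Monte Carlo in the scalar Wiener case (a simplified illustration, not the general Gauss-Markov setting): the innovation at a new subdivision point is centered, has the bridge variance, and is uncorrelated with the coarser interpolation.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 20000
# Wiener process sampled at the dyadic points t = 0, 1/4, 1/2, 3/4, 1
dW = np.sqrt(0.25) * rng.standard_normal((M, 4))
W = np.hstack([np.zeros((M, 1)), dW.cumsum(axis=1)])

# coarse interpolation X_1 = E[W | W_0, W_{1/2}, W_1] evaluated at t = 1/4
X1_quarter = 0.5 * (W[:, 0] + W[:, 2])
# innovation at t = 1/4: the update brought by observing W_{1/4}
delta = W[:, 1] - X1_quarter
# the innovation is centered, has the bridge variance (1/2)/4 = 1/8,
# and is uncorrelated with the coarser observation W_{1/2}
print(delta.mean(), delta.var(), np.cov(delta, W[:, 2])[0, 1])
```

The vanishing covariance with W at t = 1/2 is the scalar counterpart of the independence statement of Proposition 2.3.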
2.2.3. Derivation of the Candidate Multiresolution Bases of Functions
We deduce from the previous proposition the following fundamental theorem of this paper.
Theorem 2.4. For all , there exists a collection of functions that are zero outside the subinterval such that, in distribution, one has: where are independent -dimensional standard normal random variables (i.e., of law ). This basis of functions is unique up to an orthogonal transformation.
Proof. The two processes and are two Gaussian processes of mean zero. Therefore, we are searching for functions vanishing outside and ensuring that the two processes have the same probability distribution. A necessary and sufficient condition for the two processes to have the same probability distribution is to have the same covariance function (see, e.g., [36]). We therefore need to show the existence of a collection of functions that vanish outside the subinterval and that ensure that the covariance of the process is equal to the covariance of . Let such that and . If , the fact that the functions vanish outside implies that If , the covariance reads which needs to be equal to the covariance of , namely, Therefore, since , we have We can now define as a square root of the symmetric positive matrix by fixing in (2.35) Finally, since is invertible by assumption, so is , and the functions can be written as with being a square root of . Square roots of positive symmetric matrices are defined uniquely up to an orthogonal transformation: all square roots of are related by orthogonal transformations , where . This property immediately extends to the functions we are studying: two different functions and satisfying the theorem differ by an orthogonal transformation . We have thus proved that, for to have the same law as on the interval , the functions with support in are necessarily of the form . Conversely, it is straightforward to show that, given such a set of functions, the processes and are equal in law, which ends the proof of the theorem.
Using the expressions obtained in Proposition 2.1, we can make completely explicit the form of the basis in terms of the functions , and : and satisfies Note that can be defined uniquely as the symmetric positive square root, or as the lower triangular matrix resulting from the Cholesky decomposition of .
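The freedom in choosing the square root, and the uniqueness up to an orthogonal transformation invoked above, can be sketched on a small numerical example; the matrix Sigma below is an arbitrary symmetric positive-definite stand-in for the conditional covariance.

```python
import numpy as np

# Any two square roots R1, R2 of a symmetric positive-definite matrix
# (in the sense R @ R.T = Sigma) differ by an orthogonal factor R2 = R1 @ Q.
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

L = np.linalg.cholesky(Sigma)          # the lower-triangular (Cholesky) root
w, V = np.linalg.eigh(Sigma)
S = V @ np.diag(np.sqrt(w)) @ V.T      # the symmetric positive root
Q = np.linalg.solve(L, S)              # candidate orthogonal factor: L Q = S

print(np.allclose(L @ L.T, Sigma),
      np.allclose(S @ S.T, Sigma),
      np.allclose(Q @ Q.T, np.eye(2)))
```

Indeed, Q Qᵀ = L⁻¹ S Sᵀ L⁻ᵀ = L⁻¹ Σ L⁻ᵀ = I, so the Cholesky and symmetric roots generate the same class of bases, as stated in the text.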
Let us now define the function such that the process has the same covariance as , which is computed using exactly the same technique as that developed in the proof of Theorem 2.4 and that has the expression for , a square root of the covariance matrix of which from (2.5) reads
We are now in position to show the following corollary of Theorem 2.4.
Corollary 2.5. The Gauss-Markov process is equal in law to the process where are independent standard normal random variables .
Proof. We have
We have therefore identified a collection of functions that allows a simple construction of the Gauss-Markov process iteratively conditioned on increasingly finer partitions of the interval . We will show that this sequence converges almost surely towards the Gauss-Markov process used to construct the basis, proving that these finite-dimensional continuous processes form an asymptotically accurate description of the initial process. Beforehand, we rigorously study the Hilbertian properties of the collection of functions we have just defined.
3. The Multiresolution Schauder Basis Framework
The above analysis motivates the introduction of a set of functions that we now study in detail. In particular, we elucidate the structure of the collection of functions as a Schauder basis in a certain space of continuous functions from to . The Schauder structure was defined in [38, 39], and its essential characterization is the unique decomposition property: namely, every element in can be written as a well-formed linear combination, and the coefficients satisfying this relation are unique.
3.1. System of Dual Bases
To complete this program, we need to introduce some quantities that will play a crucial role in expressing the family as a Schauder basis for a given space. In (2.39), two constant matrices appear that will be of particular importance in the sequel, for in with : where stands for . We further define the matrix , and we recall that is a square root of , the covariance matrix of conditional on and , given in (2.29). We stress that the matrices , , and are all invertible and satisfy the following important properties.
Proposition 3.1. For all in , , one has:(i)(ii).
To prove this proposition, we first establish the following simple lemma of linear algebra.
Lemma 3.2. Given two invertible matrices and in such that is also invertible, if one defines , one has the following properties:(i)(ii).
Proof. (i).(ii).
Proof of Proposition 3.1. (ii) stems directly from Lemma 3.2, item (ii), by setting , , and . Indeed, the lemma implies that (i) We have which ends the proof of the proposition.
Let us define . With these notations, we define the functions in a compact form as follows.
Definition 3.3. For every in with , the continuous functions are defined on their support as and the basis element is given on by
The definition implies that the are continuous functions in the space of piecewise differentiable functions with piecewise continuous derivative that take the value zero at zero. We denote such a space by .
Before studying the property of the functions , it is worth remembering that their definitions include the choice of a square root of . Properly speaking, there is thus a class of bases and all the points we develop in the sequel are valid for this class. However, for the sake of simplicity, we consider from now on that the basis under scrutiny results from choosing the unique square root that is lower triangular with positive diagonal entries (the Cholesky decomposition).
3.1.1. Underlying System of Orthonormal Functions
We first introduce a family of functions and show that it constitutes an orthogonal basis of a certain Hilbert space. The choice of this basis can seem arbitrary at first sight, but the definition of these functions will appear natural through their relationship with the functions and , made explicit in the sequel, and the mathematical rigor of the argument leads us to choose this apparently artificial introduction.
Definition 3.4. For every in with , we define a continuous function which is zero outside its support and has the expressions: The basis element is defined on by
Remark that the definitions make apparent the fact that these two families of functions are linked for all in through the simple relation Moreover, this collection of functions constitutes an orthogonal basis of functions, in the following sense.
Proposition 3.5. Let be the closure of equipped with the natural norm of . It is a Hilbert space, and moreover, for all , the family of functions defined as the columns of , namely forms a complete orthonormal basis of .
Proof. The space is clearly a Hilbert space, as a closed subspace of the larger Hilbert space equipped with the standard scalar product:
We now proceed to demonstrate that the columns of form an orthonormal family which generates a dense subspace of . To this end, we define as the space of functions
that is, the space of functions that take values in the set of -matrices whose columns are in . This definition allows us to define the bilinear function as
and we observe that the columns of form an orthonormal system if and only if
where is the Kronecker delta function, whose value is 1 if and , and 0 otherwise.
First of all, since the functions are zero outside the interval , the matrix is nonzero only if . In such cases, assuming that and, for example, that , we necessarily have strictly included in : more precisely, is either included in the left-child support or in the right-child support of . In both cases, writing the matrix shows that it is expressed as a matrix product whose factors include . We then show that
which entails that if . If , we remark that , and we conclude that from the preceding case. For , we directly compute for the only nonzero term
Using the passage relationship between the symmetric functions and given in (2.7), we can then write
Proposition 3.1 implies that which directly implies that . For , a computation of the exact same flavor yields that . Hence, we have proved that the collection of columns of forms an orthonormal family of functions in (the definition of clearly states that its columns can be written in the form of elements of ).
The proof now amounts to showing the density of the family of functions under consideration. Before showing this density property, we introduce, for all in , the functions with support on defined by
Showing that the family of columns of is dense in is equivalent to showing that the column vectors of the matrices , seen as functions of , are dense in . It is enough to show that the span of such functions contains the family of piecewise continuous -valued functions that are constant on , in (the density of the endpoints of the partition entails that the latter family generates ).
In fact, we show that the span of functions
is exactly equal to the space of piecewise continuous functions from to that are constant on the supports , for any in . The fact that is included in is clear, since the matrix-valued functions are constant on the supports , for in .
We prove that is included in by induction on . The property is clearly true at rank , since is then equal to the constant invertible matrix . Assuming the proposition true at rank for a given , let us consider a piecewise continuous function in . Remark that, for every in , the function can take only two values on and can have a discontinuity jump at : let us denote these jumps as
Now, remark that, for every in , the matrix-valued functions take only two matrix values on , namely and . From Proposition 3.1, we know that is invertible. This fact directly entails that there exist vectors , for any in , such that . We then necessarily have that the function is piecewise constant on the supports , in . By the induction hypothesis, belongs to , so that belongs to , and we have proved that . Therefore, the space generated by the column vectors is dense in , which completes the proof that the functions form a complete orthonormal family of .
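In the scalar Wiener case, the orthonormal family of Proposition 3.5 reduces to the classical Haar system on [0, 1]; the following sketch (an illustration under that simplification, with the level cutoff and quadrature grid chosen arbitrarily) verifies the orthonormality numerically.

```python
import numpy as np

def haar(n, k, t):
    """Haar element supported on [k 2^{-n}, (k+1) 2^{-n}], L2-normalized."""
    t = np.asarray(t)
    l = k / 2**n
    m = (2 * k + 1) / 2**(n + 1)
    r = (k + 1) / 2**n
    return 2**(n / 2) * (((l <= t) & (t < m)).astype(float)
                         - ((m <= t) & (t < r)).astype(float))

N = 4096
t = (np.arange(N) + 0.5) / N  # midpoint quadrature nodes on [0, 1]
# constant function plus Haar elements up to level 3
family = [np.ones(N)] + [haar(n, k, t) for n in range(4) for k in range(2**n)]
# Gram matrix of the family under the L2([0,1]) inner product
G = np.array([[np.dot(f, g) / N for g in family] for f in family])
print(np.allclose(G, np.eye(len(family)), atol=1e-9))
```

Because every product of two family members is piecewise constant on dyadic intervals much coarser than the grid, the midpoint quadrature is exact here, and the Gram matrix is the identity up to floating-point error.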
The fact that the column functions of form a complete orthonormal system of directly entails the following decomposition of the identity on .
Corollary 3.6. If is the real Dirac delta function, one has
Proof. Indeed, it is easy to verify that, for all in , we have, for all , where denotes the inner product in between and the th column of . Therefore, by the Parseval identity, we have in the sense
From now on, abusing language, we will say that the family of -valued functions is an orthonormal family of functions to refer to the fact that the columns of such matrices form an orthonormal set of . We now make explicit the relationship between this orthonormal basis and the functions derived in our analysis of multidimensional Gauss-Markov processes.
3.1.2. Generalized Dual Operators
The Integral Operator
The basis is of great interest in this paper for its relationship to the functions that naturally arise in the decomposition of the Gauss-Markov processes. Indeed, the collection can be generated from the orthonormal basis through the action of the integral operator defined on into by
where is an open set and, for any set denotes the indicator function of . Indeed, realizing that acts on into through
where denotes the th -valued column function of , we easily see that for all in , ,
It is worth noticing that the introduction of the operator can be considered natural since it characterizes the centered Gauss-Markov process through loosely writing .
In order to exhibit a dual family of functions to the basis , we further investigate the property of the integral operator . In particular, we study the existence of an inverse operator , whose action on the orthonormal basis will conveniently provide us with a dual basis to . Such an operator does not always exist; nevertheless, under special assumptions, it can be straightforwardly expressed as a generalized differential operator.
The Differential Operator
Here, we make the assumptions that , that, for all , is invertible in , and that and have continuous derivatives, which especially implies that . In this setting, we define the space of functions in that are zero at zero and denote by its dual in the space of distributions (or generalized functions). Under the assumptions just made, the operator admits the differential operator defined by
as its inverse, that is, when restricted to , we have on . The dual operators of and are expressed, for any in , as
They satisfy (from the properties of and ) on . By dual pairing, we extend the definition of the operators , as well as their dual operators, to the space of generalized functions . In detail, for any distribution in and test function in , we define and by
and reciprocally for the dual operators and .
Candidate Dual Basis
We are now in a position to use the orthonormality of to infer a dual family of the basis . For any function in , the generalized function belongs to , the space of continuous functions that are zero at zero. We equip this space with the uniform norm and denote its topological dual , the set of -dimensional Radon measures with . Consequently, operating in the Gelfand triple
we can write, for any function , in ,
The first equality stems from the fact that, when and are seen as generalized functions, they are still inverses of each other, so that in particular on . The dual pairing associated with the Gelfand triple (3.32) entails the second equality, where is the generalized operator defined on and where is in .
As a consequence, defining the functions in , the -dimensional space of Radon measures, by
provides us with a family of -generalized functions which are dual to the family in the sense that, for all in , we have
where the definition of has been extended through dual pairing: given any in and any in , we have
with denoting the dual pairing between the th column of , taking values in , and the th column of , taking values in . Under the favorable hypotheses of this section, the -generalized functions can actually be computed easily: the definition of shows that the functions have support and are constant on and in . Only the discontinuous jumps in , , and intervene, leading, for in , to the expression
and , where denotes the standard Dirac delta function (centered at 0). These functions can be extended to the general setting of the paper since their expressions do not involve the assumptions made on the invertibility and smoothness of . We now show that these functions, when defined in the general setting, still provide a dual basis of the functions .
3.1.3. Dual Basis of Generalized Functions
The expression of the basis that has been found under favorable assumptions makes no explicit reference to these assumptions. It suggests defining functions formally as linear combinations of Dirac functions acting by duality on .
Definition 3.7. For in , the family of generalized functions in is given by () and , where is the standard Dirac distribution.
Notice that the basis is defined for the open set . For the sake of consistency, we extend the definition of the families and on by setting them to zero on , except for , which is continued for by a continuous function that is compactly supported in for a given in , and satisfies .
We can now formulate the following.
Proposition 3.8. Given the dual pairing in where is a bounded open set of containing , the family of continuous functions in admits, as a dual family in , the set of distributions .
Proof. We have to demonstrate that, for all in ,
Suppose first that . If , can only be nonzero if the support is strictly included in . We then have
Assume that is to the left of , that is, is a left child of in the nested binary tree of supports and write
Using the fact that and that the function , like any integral between and , satisfies the chain rule for all , we obtain
The same result is true if is a right child of in the nested binary tree of supports. If , necessarily the only nonzero term is for , that is,
If , can only be nonzero if the support is included in , but then is zero in , , so that .
Otherwise, if and , we directly have
Finally, if , given the simple form of with a single Dirac function centered in , we clearly have , and if
and using the fact that (by definition of ) we have , this last expression is equal to
which completes the proof.
This proposition directly implies the main result of the section.
Theorem 3.9. The collection of functions constitutes a Schauder basis of functions on , that is, any element of can be written in a unique way as a sum of coefficients multiplied by .
This theorem provides us with a complementary view of stochastic processes: in addition to the standard sample paths viewpoint, this structure allows us to see the Gauss-Markov processes through their coefficients on the computed basis. This duality is developed in the sequel.
3.2. The Sample Paths Space
3.2.1. The Construction Application
The Schauder basis of compactly supported functions constructed above allows us to define functions by considering the coefficients on this basis, which constitute sequences of real numbers in the space We equip with the uniform norm , where we write . We denote by the Borelian sets of the topology induced by the uniform norm, and we recall that , the cylinder sets of , form a generative family of Borelian sets. Note that not every sequence of coefficients yields a continuous function: one needs to assume a certain decrease of the coefficients to obtain convergence. A sufficient condition to obtain convergent sequences is to consider coefficients in the space This set is clearly a Borelian set of since it can be written as a countable intersection and union of cylinders, namely, denoting by the set of finite subsets of and , , It is also easy to verify that it forms a vector subspace of .
After these definitions, we are in a position to introduce the following useful function.
Definition 3.10. One denotes by the partial construction application: where is the -dimensional Wiener space, which is complete under the uniform norm .
This sequence of partial construction applications is shown to converge to the construction application in the following.
Proposition 3.11. For every in , converges uniformly toward a continuous function in . One denotes this function , defined as , and refers to this application as the construction application.
This proposition is proved in Appendix D. The image of this function constitutes a subset of the Wiener space of continuous functions . Let us now define the vector subspace of on which appears as a bijection.
It is important to realize that, in the multidimensional case, the space depends on and in a nontrivial way. For instance, assuming that , the space obviously depends crucially on the rank of . To fix ideas, for a given constant in , we expect the space to only include sample paths of for which the first components are constant. Obviously, a process with such sample paths is degenerate in the sense that its covariance matrix is not invertible.
Yet, if we additionally relax the hypothesis that , the space can be dramatically altered: if we take the space will represent the sample space of the -integrated Wiener process, a nondegenerate -dimensional process we fully develop in the example section.
However, the situation is much simpler in the one-dimensional case: because the uniform convergence of the sample paths is preserved as long as is continuous and is nonzero through (D.8), the definition does not depend on or . Moreover, in this case, the space is large enough to contain reasonably regular functions as proved in Appendix D, Proposition 3.
In the case of the -integrated Wiener process, the space clearly contains the functions .
Note, however, that the space does not depend on as long as is continuous, because the uniform convergence of the sample paths is preserved through the change of expansion basis via (D.8).
We equip the space with the topology induced by the uniform norm on . As usual, we denote the corresponding Borelian sets. We prove in Appendix D the following.
Proposition 3.12. The function is a bounded continuous bijection.
We therefore conclude that we have at our disposal a continuous bijection mapping the coefficients onto the sample paths, . We now turn to the study of its inverse, the coefficient application, mapping sample paths onto coefficients over the Schauder basis.
3.2.2. The Coefficient Application
In this section, we introduce and study the properties of the following function.
Definition 3.13. One calls coefficient application and denotes by the function defined by
Should a function admit a uniformly convergent decomposition in terms of the basis of elements , the function gives its coefficients in such a representation. More precisely, we have the following.
Theorem 3.14. The function is a measurable linear bijection whose inverse is .
The proof of this theorem is provided in Appendix D.
4. Representation of Gauss-Markov Processes
4.1. Inductive Construction of Gauss-Markov Processes
Up to this point, we have rigorously defined the dual spaces of sample paths and coefficients . Through the use of the Schauder basis and its dual family of generalized functions , we have defined the inverse measurable bijections and transforming one space into the other. In doing so, we have unraveled the fundamental role played by the underlying orthonormal basis . We now use this framework to formulate a pathwise construction of the Gauss-Markov processes, in the same spirit as the Lévy-Ciesielski construction of the Wiener process.
4.1.1. Finite-Dimensional Approximations
Considering the infinite-dimensional subspace of , let us introduce the equivalence relation as We can use the functions to carry through the structure of on the infinite-dimensional space of coefficients : which clearly entails that if and only if . We denote the sets of equivalence classes of and , which are both clearly isomorphic . For every , we define the finite-dimensional operators and , with the help of the canonical projections , and the inclusion map , .
The results of the preceding sections straightforwardly extend to the equivalence classes, and in particular we see that the functions and are linear finite-dimensional bijections satisfying . We write (resp., ) for the canonical basis of (resp., ) when listed in the recursive dyadic order. In these bases, the matrices and are lower block triangular. Indeed, denoting in the natural bases and by , where is a matrix, the structure of the nested supports entails the block-triangular structure (where only possibly nonzero coefficients are written):
Similarly, the matrix representation of in the natural bases and proves to have the following triangular form: The duality property, Proposition 3.8, simply reads for all and , that is, . But because we are now in a finite-dimensional setting, we also have : Realizing that represents the class of functions in whose values are zero on every dyadic point of except for , the clearly appear as the coefficients of the decomposition of such functions in the bases for in .
Denoting , a set of independent Gaussian variables of law on , and for all , we form the finite-dimensional Gauss-Markov vector as which, from Corollary 2.5, has the same law as , the finite-dimensional random vector obtained from sampling on (modulo a permutation of the indices). We then prove the following lemma, which sheds light on the meaning of the construction.
Lemma 4.1. The Cholesky decomposition of the finite-dimensional covariance block matrix is given by .
Proof. For every , we compute the covariance of the finite-dimensional process as From there, we write the finite-dimensional covariance block matrix in the recursively ordered basis for , , as We have already established that the matrix is triangular with positive diagonal coefficients, which entails that the preceding equality provides us with the Cholesky decomposition of .
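For the Wiener process, Lemma 4.1 can be observed numerically: listing the dyadic points of (0, 1] in the recursive dyadic order, the Cholesky factor of the covariance matrix min(s, t) is lower triangular, and its entries reproduce the values of the Schauder basis elements at the dyadic points. A minimal sketch (the depth N = 3 is an arbitrary choice of ours):

```python
import numpy as np

# Dyadic points of (0, 1] in the recursive dyadic order:
# 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8  (depth N = 3)
N = 3
points = [1.0] + [(2 * k + 1) / 2 ** (n + 1)
                  for n in range(N) for k in range(2 ** n)]

# Covariance of the Wiener process sampled at these points
K = np.array([[min(s, t) for t in points] for s in points])

# Its Cholesky factor is lower triangular in this ordering; for instance,
# the diagonal entry at the point 1/2 equals 1/2, the peak value of the
# first triangular Schauder function, and the entry below the first
# column equals phi_0(1/2) = 1/2 for phi_0(t) = t.
L = np.linalg.cholesky(K)
```

Multiplying `L` by a standard Gaussian vector then samples the Wiener process at the dyadic points, in the spirit of the finite-dimensional construction above.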
In the finite-dimensional case, the inverse covariance or potential matrix is a well-defined quantity and we straightforwardly have the following corollary.
Corollary 4.2. The Cholesky decomposition of the finite-dimensional inverse covariance matrix is given by .
Proof. The result stems from the equalities .
4.1.2. The Lévy-Ciesielski Expansion
We now show that, asymptotically, the bases allow us to faithfully build the Gauss-Markov process from which they were derived. In this perspective, we consider , a set of independent Gaussian variables of law on , and, for all , we form the finite-dimensional continuous Gaussian process , defined for by which, from the result of Theorem 2.4, has the same law as . We prove the following lemma.
Lemma 4.3. The sequence of processes almost surely converges towards a continuous Gaussian process denoted by .
Proof. For all fixed and for any in , we know that is continuous. Moreover, we have established that, for every in , converges uniformly in toward a continuous limit denoted by . Therefore, in order to prove that almost surely defines a process with continuous paths, it is sufficient to show that , where is the -induced measure on ; this follows from a classical Borel-Cantelli argument. For a random variable of normal law and , we have Then, for any Since the series is convergent, the Borel-Cantelli argument implies that . Eventually, the continuous almost-sure limit process is Gaussian as a countable sum of Gaussian processes.
Now that these preliminary remarks have been made, we can evaluate, for any and in , the covariance of as the limit of the covariance of .
Lemma 4.4. For any , the covariance of is
Proof. As the are independent Gaussian random variables of normal law , we see that the covariance of is given by To compute the limit of the right-hand side, we need to remember that the elements of the bases and the functions are linked by the following relation: from which we deduce Defining the auxiliary -valued function , we observe that the -coefficient function reads , where is the real function that is one if and zero otherwise. As we can write , we see that the function belongs to , so that we can write as a scalar product in the Hilbert space : We then specify the -coefficient of , writing , and, remembering that the family of functions forms a complete orthonormal system of , we can use the Parseval identity, which reads Thanks to this relation, we can conclude the evaluation of the covariance since
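In the Wiener case, the Parseval argument of this proof can be illustrated numerically: summing the products of the Schauder elements up to level N reproduces the covariance min(s, t) exactly at the dyadic points of resolution 2^(-N), since the partial sum is the linear interpolation of the process at those points. A sketch under that assumption (function names are ours):

```python
import numpy as np

def schauder_basis(N):
    """phi_0(t) = t (integral of the constant Haar function), followed by
    the triangular Schauder functions of levels n = 0, ..., N - 1."""
    funcs = [lambda t: np.asarray(t, dtype=float)]
    for n in range(N):
        for k in range(2 ** n):
            def tri(t, n=n, k=k):
                left = k / 2 ** n
                mid = (2 * k + 1) / 2 ** (n + 1)
                right = (k + 1) / 2 ** n
                peak = 2 ** (-n / 2 - 1)      # height of the triangle
                t = np.asarray(t, dtype=float)
                out = np.zeros_like(t)
                up = (t >= left) & (t < mid)
                down = (t >= mid) & (t <= right)
                out[up] = peak * (t[up] - left) / (mid - left)
                out[down] = peak * (right - t[down]) / (right - mid)
                return out
            funcs.append(tri)
    return funcs

def partial_covariance(funcs, s, t):
    """Covariance of the level-N partial sum: sum_i phi_i(s) phi_i(t)."""
    return sum(float(f(np.array([s]))[0]) * float(f(np.array([t]))[0])
               for f in funcs)
```

At the dyadic points of level N, `partial_covariance` agrees with min(s, t) to machine precision, which is the discrete counterpart of the Parseval identity used above.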
We stress the fact that the relation provides us with a continuous version of the Cholesky decomposition of the covariance kernel . Indeed, if we choose as the Cholesky square root of , we remark that the operators are triangular in the following sense: consider the chain of nested vectorial spaces with ; then, for every in , the operator transforms the chain into the chain with .
The fact that this covariance is equal to the covariance of the process , solution of (2.1) implies that we have the following fundamental result.
Theorem 4.5. The process is equal in law to the initial Gauss-Markov process used to construct the basis of functions.
Remark 4.6. Our multiresolution representation of the Gauss-Markov processes appears to be the direct consequence of the fact that, because of the Markov property, the Cholesky decomposition of the finite-dimensional covariance admits a simple inductive continuous limit. More generally, triangularization of the kernel operators has been studied in depth [28, 40–42], and it would be interesting to investigate if these results make possible a similar multiresolution approach for non-Markov Gaussian processes. In this regard, we naturally expect to lose the compactness of the supports of a putative basis.
Remark 4.7. We eventually underline the fact that large deviations related to this convergence can be derived through the use of the Baldi and Caramellino good rate function related to the Gaussian pinned processes [43, 44].
4.2. Optimality Criterion of the Decomposition
In the following, we draw from the theory of interpolating splines to further characterize the nature of our proposed basis for the construction of the Gauss-Markov processes. Essentially adapting the results of the previous works [45, 46], we first show that the finite-dimensional sample paths of our construction induce a nested sequence of reproducing kernel Hilbert spaces (RKHSs). In turn, the finite-dimensional process naturally appears as the orthogonal projection of the infinite-dimensional process onto . We then show that such an RKHS structure allows us to define a uniqueness criterion for the finite-dimensional sample paths as the only functions of that minimize a functional, called the Dirichlet energy, under the constraint of interpolation on (equivalent to conditioning on the times ). In this respect, we point out that the close relation between Markov processes and Dirichlet forms is the subject of a vast literature, largely beyond the scope of the present paper (see, e.g., [18]).
4.2.1. Sample Paths Space as a Reproducing Kernel Hilbert Space
In order to define the finite-dimensional sample paths as a nested sequence of RKHSs, let us first define the infinite-dimensional operator Since we know that the column functions of form a complete orthonormal system of , the operator is an isometry and its inverse satisfies , which reads, for all in , Equipped with this infinite-dimensional isometry, we then consider the linear operator suitably defined on the set with , the norm of . The set forms an infinite-dimensional vector space that is naturally equipped with the inner product Moreover, since , such an inner product is positive definite and, consequently, forms a Hilbert space.
Remark 4.8. Two straightforward remarks are worth making. First, the space is strictly included in the infinite-dimensional sample paths space . Second, notice that, in the favorable case , if is everywhere invertible with continuously differentiable inverse, we have . More relevantly, the operator can actually be considered a first-order differential operator from to , acting as a generalized left inverse of the integral operator . Indeed, realizing that on can be expressed as , we clearly have
We now motivate the introduction of the Hilbert space by the following claim.
Proposition 4.9. The Hilbert space is a reproducing kernel Hilbert space (RKHS) with -valued reproducing kernel , the covariance function of the process .
Proof. Consider the problem of finding all elements of solution of the equation for in . The operator provides us with a continuous -valued kernel function : which is clearly the Green function for our differential equation. This entails that the following equality holds for every in : Moreover, we can decompose the kernel in the sense as since we have with . Then, we clearly have where we recognize the covariance function of , which implies Eventually, for all in , we have where we have introduced the -operator associated with the inner product : for all -valued functions and defined on such that the columns and , , are in , we define the matrix in by By the Moore-Aronszajn theorem [47], we deduce that there is a unique reproducing kernel Hilbert space associated with a given covariance kernel. Thus, is the reproducing subspace of corresponding to the kernel , with respect to the inner product .
Remark 4.10. From a more abstract point of view, it is well known that the covariance operator of a Gaussian measure defines an associated Hilbert structure [48, 49].
In the sequel, we will use the space as the ambient Hilbert space to define the finite-dimensional sample paths spaces as a nested sequence of RKHS. More precisely, let us write for the finite-dimensional subspace of with the space being defined as
We refer to such spaces as finite-dimensional approximation spaces since we remark that which means that the space is made of the sample paths of the finite-dimensional process . The previous definition makes the nested structure obvious, and it is easy to characterize each space as a reproducing kernel Hilbert space.
Proposition 4.11. The Hilbert spaces are reproducing kernel Hilbert spaces (RKHSs) with -valued reproducing kernel , the covariance function of the process .
Proof. The proof of this proposition follows exactly the same argument as in the case of , but with the introduction of the finite-dimensional kernels and the corresponding covariance function
4.2.2. Finite-Dimensional Processes as Orthogonal Projections
The framework set in the previous section offers a new interpretation of our construction. Indeed, for all , the columns of form an orthonormal basis of the space : This leads to defining the finite-dimensional approximation of a sample path of as the orthogonal projection of on with respect to the inner product . At this point, it is worth remembering that the space is strictly contained in and does not coincide with : actually, one can easily show that . We devote the rest of this section to defining the finite-dimensional processes resulting from the conditioning on as pathwise orthogonal projections of the original process on the sample space .
Proposition 4.12. For any , the conditioned processes can be written as the orthogonal projection of on with respect to :
The only hurdle in proving Proposition 4.12 is purely technical: the process lives in a larger space than , and we need to find a way to extend the definition of so that the expression is meaningful. Before addressing this point, quite straightforwardly, we need to establish the following lemma.
Lemma 4.13. Writing the Gauss-Markov process , for all , the conditioned process is expressed as the stochastic integral
Proof. In the previous section, we have noticed that the kernel converges toward the kernel (the Green function) in the sense This implies that the process , as the stochastic integral , can also be written as Specifying the decomposition of , we can then naturally express as the convergent sum where the orthonormality property of the with respect to makes the vectors appear as independent -dimensional Gaussian variables of law . It is then easy to see that, by definition of the elements , for almost every in , we have and we finally recognize in the previous expression that, for all ,
We can now proceed to justify the main result of Proposition 4.12.
Proof. The finite-dimensional processes defined through Lemma 4.13 have sample paths belonging to . Moreover, for almost every in and for all in , because of the orthonormality property of with respect to . As the previous equalities hold for every , the applications can naturally be extended to by continuity. Therefore, it makes sense to write, for all in , even if is defined on a larger sample space than . In other words, we have and we can thus express the conditioned process as the orthogonal projection of onto the finite-dimensional sample paths space by writing
4.2.3. Optimality Criterion of the Sample Paths
Proposition 4.12 elucidates the structure of the conditioned processes as pathwise orthogonal projections of on the finite-dimensional RKHS . It allows us to cast the finite-dimensional sample paths in a geometric setting and, incidentally, to characterize them as the minimizers of certain functionals. In doing so, we shed new light on well-known results of interpolation theory [50–52] and extend them to the multidimensional case.
The central point of this section reads as follows.
Proposition 4.14. Given a function in , the function belongs to and is defined by the following optimal criterion: is the only function in interpolating on such that the functional takes its unique minimal value over in .
Proof. The space has been defined as so that, for all in , clearly belongs to . Moreover, interpolates on : indeed, we know that the finite-dimensional operators and are inverses of each other, which entails that for all in
where we use the fact that, for any in and for all in , (recall that if and belongs to ).
Let us now show that is determined in by the announced optimal criterion. Suppose belongs to and interpolates on , and remark that we can write
since is an isometry. Then, consider in and remark that, since for all in , are Dirac measures supported by , we have
This entails
Since, by definition of , if . Moreover, the minimum is only attained for such that if and if , which defines uniquely. This establishes that, for all in such that for all in , , we have and the equality case holds if and only if .
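In the simplest Wiener case, where the operator reduces to the derivative and the Dirichlet energy to the integral of the squared derivative, the optimality criterion of Proposition 4.14 can be observed numerically: the finite-dimensional sample path is piecewise linear, and perturbing it by any function vanishing at the interpolation knots strictly increases a discretized energy. A small sketch, with hypothetical knots and data:

```python
import numpy as np

def dirichlet_energy(f_vals, dt):
    """Discrete Dirichlet energy ~ integral of f'(t)^2 on a uniform grid."""
    return np.sum(np.diff(f_vals) ** 2) / dt

t = np.linspace(0, 1, 257)
dt = t[1] - t[0]

knots = [0.0, 0.25, 0.5, 0.75, 1.0]       # hypothetical interpolation times
vals  = [0.0, 0.3, -0.1, 0.4, 0.2]        # hypothetical interpolated values
f_lin = np.interp(t, knots, vals)         # piecewise-linear interpolant

# A perturbation vanishing at every knot k/4: the perturbed function still
# interpolates the data, but its Dirichlet energy is strictly larger.
bump = np.sin(4 * np.pi * t)
```

The cross term between the piecewise-linear interpolant and the perturbation vanishes (the derivative of `f_lin` is constant between knots and `bump` is zero at the knots), so the energy increases by exactly the energy of the perturbation, mirroring the equality case in the proof above.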
Remark 4.15. When represents a regular differential operator of order , , where , that is, for with , the finite-dimensional sample paths coincide exactly with the spline interpolations of order , which are well known to satisfy the previous criterion [46]. This example will be further explored in the example section.
The Dirichlet energy simply appears as the squared norm induced on by the inner product , which in turn can be characterized as a Dirichlet quadratic form on . Actually, such a Dirichlet form can be used to define the Gauss-Markov process, extending the Gauss-Markov property to processes indexed by multidimensional parameter spaces [19]. In particular, for an -dimensional parameter space, we can condition such Gauss-Markov processes on a smooth -dimensional boundary. Within the boundary, the sample paths of the resulting conditioned process (the solution to the prediction problem in [19]) are the solutions to the corresponding Dirichlet problems for the elliptic operator associated with the Dirichlet form.
The characterization of the basis as the minimizer of such a Dirichlet energy (4.59) gives rise to an alternative method to compute the basis as the solution of a Dirichlet boundary value problem for an elliptic differential operator.
Proposition 4.16. Let us assume that and are continuously differentiable and that is invertible. Then, the functions are defined as where and are the unique solutions of the second-order -dimensional linear differential equation with the following boundary value conditions:
Proof. By Proposition 4.14, we know that minimizes the convex functional over , being equal to zero outside the interval and equal to one at the point . Because of the hypotheses on and , we have , and we can additionally restrict our search to functions that are twice continuously differentiable. Accordingly, we only need to minimize separately the contributions on the intervals and . On both intervals, this is a classical Euler-Lagrange problem (see, e.g., [53]) and is solved using basic principles of the calculus of variations. We easily identify the Lagrangian of our problem as From there, after some simple matrix calculations, the Euler-Lagrange equations can be expressed in the form which ends the proof.
Remark 4.17. It is a simple matter of calculus to check that the expression of given in Proposition 2.1 satisfies (4.67). Notice also that, in the case , the differential equation becomes which is further simplified for constant .
Under the hypotheses of Proposition 4.16, we can thus define as the unique solution to the second-order linear differential equation (4.67) with the appropriate boundary value conditions. From this definition, it is then easy to derive the bases by completing the following program: (1) compute the by solving the linear ordinary differential problem; (2) apply the differential operator to get the functions ; (3) orthonormalize the column functions by the Gram-Schmidt process; (4) apply the integral operator to get the desired functions (or, equivalently, multiply the original functions by the corresponding Gram-Schmidt triangular matrix).
Notice finally that each of these points is easily implemented numerically.
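For the one-dimensional Wiener case, where the solutions of the boundary value problem are tent functions and the Gram-Schmidt step is trivial (a single column), the program above reduces to elementary grid operations. A sketch under those assumptions; the support endpoints are a hypothetical dyadic choice:

```python
import numpy as np

M = 1024                              # grid resolution
t = np.linspace(0, 1, M + 1)
l, m, r = 0.25, 0.375, 0.5            # hypothetical dyadic support (l, m, r)

# (1) Solve the boundary value problem: for the Wiener case u'' = 0 on
#     (l, m) and (m, r) with u(l) = u(r) = 0, u(m) = 1, i.e., a tent.
u = np.interp(t, [0, l, m, r, 1], [0, 0, 1, 0, 0])

# (2) Apply the differential operator d/dt (forward differences).
f = np.diff(u) * M

# (3) Normalize in L^2(0, 1) (Gram-Schmidt is trivial for one column).
f /= np.sqrt(np.sum(f ** 2) / M)

# (4) Apply the integral operator (cumulative sum): the Schauder element.
psi = np.concatenate([[0.0], np.cumsum(f) / M])
```

The resulting `psi` is the triangular Schauder element supported on (l, r), peaking at the midpoint with the value 2^(-n/2 - 1) for a support of length 2^(-n) (here 1/4 for n = 2).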
5. Examples: Derivation of the Bases for Some Classical Processes
5.1. One-Dimensional Case
In the one-dimensional case, the construction of the Gauss-Markov process is considerably simplified since we do not have to consider the potential degeneracy of matrix-valued functions. Indeed, in this situation, the centered Gauss-Markov process is the solution of the one-dimensional stochastic equation with homogeneously Hölder continuous and positive continuous. We then have the Doob representation Writing the function as , the covariance of the process reads, for any , The variance of the Gauss-Markov bridge pinned at and yields These simple relations entail that the functions are defined on their supports by with
This reads on and on as and therefore we have with As for the first element, it simply results from the conditional expectation of the one-dimensional bridge pinned at and : In this class of processes, two paradigmatic processes are the Wiener process and the Ornstein-Uhlenbeck process with constant coefficients. In the case of the Wiener process, and , which yields the classical triangular-shaped Schauder functions used by Lévy [8]. As for the Ornstein-Uhlenbeck process with constant coefficients and , we have , and , yielding for the construction basis the expressions which were already evidenced in [54].
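For the Ornstein-Uhlenbeck process with constant coefficients, the Doob representation invoked above can be checked numerically. For the sign convention dX = -θX dt + σ dW with X_0 = 0 (our choice here, with hypothetical parameter values), the representation X_t = g(t) W(h(t)) with g(t) = exp(-θt) and h(t) = σ²(exp(2θt) - 1)/(2θ) reproduces the usual OU covariance:

```python
import math

def ou_cov(s, t, theta, sigma):
    """Covariance of the centered OU process dX = -theta X dt + sigma dW,
    started at X_0 = 0."""
    return sigma ** 2 / (2 * theta) * (
        math.exp(-theta * abs(t - s)) - math.exp(-theta * (t + s)))

def doob_cov(s, t, theta, sigma):
    """Same covariance through the Doob representation X_t = g(t) W(h(t))."""
    g = lambda u: math.exp(-theta * u)
    h = lambda u: sigma ** 2 * (math.exp(2 * theta * u) - 1) / (2 * theta)
    return g(s) * g(t) * h(min(s, t))
```

The two kernels agree identically, which is the one-dimensional mechanism behind the covariance factorization used throughout the construction.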
5.2. Multidimensional Case
In the multidimensional case, the explicit expressions for the basis functions make fundamental use of the flow of the underlying linear equation (2.3) for a given function . For commutative forms of (i.e., such that for all ), the flow can be formally expressed as an exponential operator. It is, however, a notoriously difficult problem to find a tractable expression for general . As a consequence, it is only possible to provide closed-form formulae for our basis functions in very specific cases.
5.2.1. Multidimensional Gauss-Markov Rotations
We consider in this section that is antisymmetric and constant and such that . For antisymmetric, since , we have that is, the flow is unitary. This property implies that which yields by definition of The square root is then uniquely defined (by choosing both Cholesky and symmetrical square roots) by and reads
Recognizing the element of the Schauder basis for the construction of the one-dimensional Wiener process we obtain the following formula:
This form shows that the Schauder basis for multidimensional rotations results from the multiplication of the triangular-shaped elementary function used for the Lévy-Ciesielski construction of the Wiener process with the flow of the equation, that is, the elementary rotation.
The simplest example of this kind is the stochastic sine and cosine process corresponding to In that case, has the expression Interestingly, the different basis functions have the structure of the solutions of the nonstochastic oscillator equation: one of them perturbs the trajectory in the radial direction of the deterministic solution, and the other in the tangential direction. We represent such a construction in Figure 2, with the additional conditioning that , that is, imposing that the trajectory forms a loop between times 0 and 1.
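The unitarity of the flow used in this section (the exponential of an antisymmetric matrix is a rotation) can be checked directly; the rotation rate below is a hypothetical value:

```python
import numpy as np

theta = 0.7                                    # hypothetical rotation rate
A = np.array([[0.0, -theta],
              [theta, 0.0]])                   # antisymmetric generator

# The closed-form flow: a plane rotation by the angle theta.
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Truncated exponential series of A (converges quickly for this size).
E = np.eye(2)
P = np.eye(2)
for k in range(1, 30):
    P = P @ A / k
    E = E + P
```

The series `E` matches the rotation `R` to machine precision, and `E` is orthogonal, which is the finite-dimensional counterpart of the unitarity of the flow asserted above.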
5.2.2. The Successive Primitives of the Wiener Process
In applications, it often occurs that smooth stochastic processes are used to model the integration of noisy signals. This is for instance the case for a particle subject to Brownian forcing, or for the synaptic integration of noisy inputs [55]. Such smooth processes generally involve integrated martingales, and the simplest examples are the successive primitives of a standard Wiener process.
Let , and denote by the th-order primitive of the Wiener process. This process can be defined from the lower-order primitives for through the relations where is a standard real Wiener process. These equations can be written in our formalism as with In particular, though none of the integrated processes for is Markov by itself, the -uplet is a Gauss-Markov process.
Furthermore, because of the simplicity and the sparsity of the matrices involved, we can identify in a compact form all the variables used in the computation of the construction basis for these processes. In particular, the flow of the equation is the exponential of the matrix , and since is nilpotent, it is easy to show that has the expression , where the only nonzero entry of the matrix is a one at position . Using this expression and the highly simple expression of , we can compute the general element of the matrix , which reads Eventually, we observe that the functions , yielding the multiresolution description of the integrated Wiener processes, are directly deduced from the matrix-valued function whose components are further expressed as for and as for . The final computation of the involves the computation of and , which in the general case can become very complex. However, this expression is highly simplified when assuming that is the middle of the interval . Indeed, in that case, we observe that, for any such that is odd, , which induces the same property on the covariance matrix and on the polynomials . This property therefore favors the dyadic partition, which provides simple expressions for the basis elements in any dimension and allows simple computations of the basis.
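Since the generator of the n-times integrated Wiener process is nilpotent, its exponential series terminates, and the flow exp(tA) has the polynomial entries t^(i-j)/(i-j)! below the diagonal. A sketch with our own helper name:

```python
import numpy as np

def flow(n, t):
    """exp(t A) for the nilpotent generator A of the n-times integrated
    Wiener process (A[i, i-1] = 1, zeros elsewhere). The exponential
    series terminates after n + 1 terms since A is nilpotent."""
    A = np.zeros((n + 1, n + 1))
    for i in range(1, n + 1):
        A[i, i - 1] = 1.0
    E = np.eye(n + 1)
    P = np.eye(n + 1)
    for k in range(1, n + 1):
        P = P @ (t * A) / k
        E = E + P
    return E
```

For the doubly-integrated case (n = 2), the bottom-left entry of `flow(2, t)` is t²/2, as expected from the double integration.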
Remark 5.1. Observe that, for all , we have As and are constant, we immediately deduce the important relation that, for all . This indicates that each finite-dimensional sample path of our construction has components satisfying the nondeterministic equation associated with the iteratively integrated Wiener process. Actually, this fact is better stated by recalling that the Schauder basis and the corresponding orthonormal basis are linked through (3.10), which reads Additionally, we realize that the orthonormal basis is entirely determined by the one-dimensional families , which are mutually orthogonal functions satisfying .
We study in more detail the case of the integrated and doubly-integrated Wiener process ( and ), for which closed-form expressions are provided in Appendices A and B. As expected, the first row of the basis functions for the integrated Wiener process turns out to be the well-known cubic Hermite splines [56]. These functions have been widely used in numerical analysis and actually constitute the lowest-degree basis in a wider family of bases known as the natural bases of polynomial splines of interpolation [25]. Such bases are used to interpolate data points with smoothness constraints of different degrees (e.g., the cubic Hermite splines ensure that the resulting interpolation is in ). The next family of splines of interpolation (corresponding to the constraint) is naturally retrieved by considering the construction of the doubly-integrated Wiener process: we obtain a family of three 3-dimensional functions that constitute the columns of a matrix that we denote by . The top row is made of polynomials of degree five, which again have simple expressions when is the midpoint of the interval .
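For concreteness, here is a small sketch of the four standard cubic Hermite basis polynomials on [0, 1] mentioned above; the function name is ours, and the snippet only illustrates the interpolation property of these splines, not the paper's specific basis matrices.

```python
import numpy as np

def hermite_basis(t):
    """The four cubic Hermite basis polynomials on [0, 1]:
    h00, h01 interpolate endpoint values; h10, h11 endpoint derivatives."""
    t = np.asarray(t, dtype=float)
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00, h10, h01, h11
```

At the endpoints, each polynomial matches exactly one of the four interpolation constraints (value or derivative, at 0 or at 1), which is the C1-interpolation property used in the text.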
6. Stochastic Calculus from the Hilbert Point of View
Thus far, all calculations, propositions, and theorems are valid for any finite-dimensional Gauss-Markov process, and all the results hold pathwise, that is, for each sample path. The analysis provides a Hilbert-space description of the processes as series of standard Gaussian random variables multiplied by specific functions that form a Schauder basis of the suitable spaces. This new description of Gauss-Markov processes provides a new way of treating problems arising in the study of stochastic processes. As examples, we derive the Itô formula and the Girsanov theorem from the Hilbertian viewpoint. Note that these results are equalities in law, that is, statements about the distributions of the stochastic processes, which is a weaker notion than the pathwise analysis. In this section, we restrict our analysis to the one-dimensional case for technical simplicity.
The closed-form expressions of the basis functions in the one-dimensional case are given in Section 5.1. The associated differential and integral operators, introduced in Section 3.1.2, are highly simplified in the one-dimensional case. Let be a bounded open neighborhood of ; we denote by the space of continuous real functions on and recall that the topological dual of is , the space of Radon measures on . We also introduce , the space of test functions in that vanish at zero, whose dual space satisfies . We consider the Gelfand triple
The integral operator is defined (and extended by dual pairing) by and the inverse differential operator reads
Now that we have at our disposal all the explicit forms of the basis functions and related operators, we are in a position to complete our program. We start by proving the very important Itô formula and its finite-dimensional counterpart before turning to the Girsanov theorem.
6.1. Itô’s Formula
A very useful theorem in the theory of stochastic processes is the Itô formula. We show here that this formula is consistent with the Hilbert framework introduced above. Most of the proofs can be found in Appendix E. The proof of the Itô formula is based on demonstrating the integration by parts property.
Proposition 6.1 (Integration by parts). Let and be two one-dimensional Gauss-Markov processes starting from zero. Then one has the following equality in law: where, for two stochastic processes and , denotes the Stratonovich integral. In terms of the Itô integral, this formula is written as where the brackets denote the mean quadratic variation.
The proof of this proposition is quite technical and is provided in Appendix E. It is based on a thorough analysis of the finite-dimensional processes and . From this integration by parts formula, and using a density argument, one can recover the more general Itô formula.
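The simplest instance of the integration by parts formula can be checked pathwise on a discretized Wiener path: taking both processes equal to W, the discrete identity W_T² = 2 Σᵢ Wᵢ dWᵢ + Σᵢ (dWᵢ)² holds exactly for the Riemann sums. The sketch below, with names of our choosing, merely illustrates this elementary case.

```python
import numpy as np

def ito_parts_check(steps=10_000, T=1.0, seed=0):
    """Pathwise check of the discrete integration-by-parts identity
    W_T^2 = 2 * sum_i W_i dW_i + sum_i (dW_i)^2  (case X = Y = W)."""
    rng = np.random.default_rng(seed)
    dW = rng.normal(0.0, np.sqrt(T / steps), steps)
    W = np.concatenate(([0.0], np.cumsum(dW)))
    ito_sum = np.sum(W[:-1] * dW)   # left-point (Ito) Riemann sum
    quad_var = np.sum(dW ** 2)      # discrete quadratic variation -> T
    return W[-1] ** 2, 2.0 * ito_sum + quad_var
```

The identity is algebraically exact for every discretized path (telescoping of W²), while the quadratic-variation term converges to T as the partition is refined, recovering the bracket term of the proposition.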
Theorem 6.2 (Itô). Let be a Gauss-Markov process and . The process is a Markov process and satisfies the following relation in law:
This theorem is proved in Appendix E.
The Itô formula implies in particular that the multiresolution description developed in the paper is valid for every smooth functional of a Gauss-Markov process. In particular, it allows a simple description of exponential functionals of Gaussian Markovian processes, which are of particular interest in mathematics and have many applications, notably in economics (see, e.g., [57]).
Therefore, we observe that, in the view developed in this paper, the Itô formula stems from the nonorthogonal projections of the basis elements. For multidimensional processes, the proof of the Itô formula is deduced from the one-dimensional proof and would involve the study of the multidimensional bridge formula for and .
We eventually remark that this section provides us with a finite-dimensional counterpart of the Itô formula for discretized processes, which has important potential applications and further supports the suitability of the finite-resolution representation developed in this paper. Indeed, the present framework allows considering finite-resolution processes and their images under smooth nonlinear transformations in a way that is consistent with standard stochastic calculus, since the equation on the transformed process converges towards its Itô representation as the resolution increases.
6.2. Girsanov Formula: A Geometric Viewpoint
In the framework we developed, transforming a process into a process is equivalent to substituting the Schauder construction basis related to for the basis related to . Such an operation provides a pathwise mapping of each sample path of onto a sample path of having the same probability density in . This fact sheds new light on the geometry of multidimensional Gauss-Markov processes, since the relationship between two processes is seen as a linear change of basis. In our framework, this relationship between processes is straightforwardly studied in the finite-rank approximations of the processes up to a certain resolution. Technical difficulties nevertheless arise when dealing with the representation of the process itself in the infinite-dimensional Hilbert spaces. We solve these technical issues here and show that, in the limit, one recovers the Girsanov theorem as a limit of linear transformations between Gauss-Markov processes.
The general problem consists therefore in studying the relationship between two real Gauss-Markov processes and that are defined by
We have noticed that the spaces are the same in the one-dimensional case as long as both and never vanish, and we therefore make this assumption here. In order to further simplify the problem, we assume that is continuously differentiable. This assumption allows us to introduce the process that satisfies the stochastic differential equation with . Moreover, if and are the bases of functions that describe the processes and , respectively, we have .
The previous remarks allow us to restrict, without loss of generality, our study to processes defined for the same function , thus reducing the parameterization of the Gauss-Markov processes to the linear coefficient . Observe that, in classical stochastic calculus theory, it is well known that such hypotheses are necessary for the process to be absolutely continuous with respect to (through the use of the Girsanov theorem).
Let now , and be three Hölder-continuous real functions, and introduce and , solutions of the equations All the functions and tools related to the process (resp., ) will be indexed by () in the sequel.
6.2.1. Lift Operators
Depending on the space we are considering (either coefficients or trajectories), we define the two following operators mapping the process on .
(1) The coefficients lift operator is the linear operator mapping in the process on the process : For any , the operator maps a sample path of on a sample path of .
(2) The process lift operator is the linear operator mapping in the process on the process :
We now summarize the properties of these operators.
Proposition 6.3. The operators and satisfy the following properties.
(i) They are linear measurable bijections.
(ii) For every , the function (resp., ) is a finite-dimensional linear operator, whose matrix representation is triangular in the natural basis of (resp., ) and whose eigenvalues are given by (resp., ).
(iii) and are bounded operators for the spectral norm, with and .
(iv) The determinants of (denoted by ) and admit a limit when tends to infinity:
The proof of these properties stems directly from the analysis of the functions and performed previously; the details are given in Appendix C.
6.2.2. Radon-Nikodym Derivatives
From the properties proved on the lift operators, we are in a position to further analyze the relationship between the probability distributions of and . We first consider the finite-dimensional processes and . We emphasize that, throughout this section, all equalities are true pathwise.
Lemma 6.4. Given the finite-dimensional measures and , the Radon-Nikodym derivative of with respect to satisfies with and the equality is true pathwise.
Proof. In the finite-dimensional case, for all , the measures , , and the Lebesgue measure are mutually absolutely continuous: we denote by and the Gaussian densities of and with respect to the Lebesgue measure on . Therefore, in the finite-dimensional case, the Radon-Nikodym derivative of with respect to is defined pathwise and is simply given by the quotient of the density of the vector by the density of the vector for , , that is, We first make explicit Then, we rearrange the exponent using the Cholesky decomposition: so that we can write the exponent of (6.16) as We finally reformulate (6.16) as
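To make the finite-dimensional statement concrete, the Radon-Nikodym derivative of one centered Gaussian law with respect to another can be evaluated numerically as a quotient of densities, here through Cholesky factors in the spirit of the rearrangement used in the proof. This is a generic sketch with names of our choosing, not the paper's specific covariance matrices.

```python
import numpy as np

def log_rn_derivative(x, cov0, cov1):
    """log dP1/dP0 (x) for two centered Gaussian vectors on R^n,
    computed via Cholesky factors of the covariance matrices."""
    def log_density(x, cov):
        L = np.linalg.cholesky(cov)          # cov = L L^T, L lower triangular
        z = np.linalg.solve(L, x)            # whitened coordinates
        return (-0.5 * z @ z
                - np.sum(np.log(np.diag(L)))
                - 0.5 * len(x) * np.log(2 * np.pi))
    return log_density(x, cov1) - log_density(x, cov0)
```

The determinant factor appears through the diagonal of the Cholesky factor, and the quadratic form through the whitened coordinates, mirroring the two ingredients of the quotient of densities in the proof.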
Let us now justify from a geometrical point of view why this formula is a direct consequence of the finite-dimensional change of variable formula on the model space . If we introduce , the coefficient application related to , we know that follows a normal law . We denote by its standard Gaussian density with respect to the Lebesgue measure on . We also know that
Since is linear, the change of variables formula directly entails that admits on as density with respect to the Lebesgue measure. Consider now a measurable set of ; then we have from which we immediately conclude.
6.2.3. The Trace Class Operator
The pathwise expression of the Radon-Nikodym derivative extends to the infinite-dimensional representation of and . This extension involves technical analysis on the infinite-dimensional Hilbert space . We have shown in Proposition 6.3 that the application was bounded for the spectral norm. Therefore, we have, for any in , the inequality implying that maps into . We can then define the adjoint operator from to , which is given by Let us now consider the self-adjoint operator . This operator is the infinite-dimensional counterpart of the matrix .
Lemma 6.5. Considering the coefficients of the matrix representation of in the natural basis of given as , one has
Proof. Assume that and that and . With the notations used previously, with an open neighbourhood of , we have Since we have by definition , , , on the space , we have From there, using (6.28), we can write the integration by parts formula in the sense of generalized functions to get We now compute using (6.29): Specifying , , , and and recalling the relations we rewrite the function (6.31) in as Now, since the family forms a complete orthonormal system of and vanishes outside , by the Parseval identity, expression (6.27) can be written as the scalar product: which ends the proof of the lemma.
Notice that we can further simplify expression (6.26): We are now in a position to show that the operator can be seen as the limit of the finite-dimensional operators , in the following sense.
Theorem 6.6. The operator is a trace class operator, whose trace is given by
We prove this essential point in Appendix F. The proof consists in showing that the operator is isometric to a Hilbert-Schmidt operator whose trace can be computed straightforwardly.
6.2.4. The Girsanov Theorem
We now proceed to prove the Girsanov theorem by extending the domain of the quadratic form associated with to the space , which can only be done in law.
Theorem 6.7. In the infinite-dimensional case, the Radon-Nikodym derivative of with respect to reads which in terms of the Itô stochastic integral reads
In order to demonstrate the Girsanov theorem from our geometrical point of view, we need to establish the following result.
Lemma 6.8. The positive definite quadratic form on associated with operator is well defined on . Moreover, for all , where and refers to the Stratonovich integral and the equality is true in law.
Proof of Theorem 6.7. We start by writing the finite-dimensional Radon-Nikodym derivative
By Proposition 6.3, we have
If, as usual, denotes a recursively indexed infinite-dimensional vector of independent variables with law and , writing , we have
We know that is almost surely in , and, by Lemma 6.8, we also know that on
so that we can effectively write the infinite-dimensional Radon-Nikodym derivative on as the point-wise limit of the finite-dimensional one on through the projectors :
which directly yields formula (6.37).
The derivation of the Girsanov formula (6.40) from (6.37) comes from the relationship between the Stratonovich and Itô integrals, since the quadratic variation of and reads
Therefore, the expression of the Radon-Nikodym derivative in Lemma 6.8 can be written in terms of the Itô integrals as
and the Radon-Nikodym derivative as
Observe that, if , we recover the familiar expression
Conclusion and Perspectives
The discrete construction we present is of both analytical and numerical interest for further applications. From the analytical viewpoint, even if the basis does not exhibit the same orthogonality properties as the Karhunen-Loève decomposition, it has the important advantage of preserving the structure of the sample paths, through its strong pathwise convergence, and of providing a multiscale representation of the processes, which contrasts with the convergence in the mean of the Karhunen-Loève decomposition. From the numerical viewpoint, three Haar-like properties make our decomposition particularly suitable for certain numerical computations: (i) all basis elements have compact support on an open interval with dyadic rational endpoints, (ii) these intervals are nested and become smaller for larger indices of the basis elements, and (iii) at any interval endpoint, only a finite number of basis elements are nonzero. Thus the expansion in our basis, when evaluated at an interval endpoint (e.g., a dyadic rational), terminates in a finite number of steps. Moreover, the very nature of the construction, based on an increasingly refined description of the sample paths, paves the way to coarse-graining approaches similar to wavelet decompositions in signal processing. In view of this, our framework offers promising applications.
Dichotomic Search of First-Hitting Times
The first application we envisage concerns the problem of first-hitting times. Because of its manifold applications, finding the time at which a process first exits a given region is a central question of stochastic calculus. However, closed-form theoretical results are scarce and one often has to resort to numerical algorithms [59]. In this regard, the multiresolution property suggests an exact scheme to simulate sample paths of a Gauss-Markov process in an iterative “top-down’’ fashion. Assuming the intervals have dyadic rational endpoints and that we have a conditional knowledge of a sample path on the dyadic points of , one can decide to further the simulation of this sample path at any time in by drawing a point according to the conditional law of given , which is simply expressed in the framework of our construction. This property can be used to great advantage in numerical computations such as dichotomic search algorithms for first-passage times: the key element is an estimate of the true conditional probability that a hitting time has occurred, given the values of the process at two times, one in the past and one in the future. With such an estimate, an efficient strategy to look for passage times consists in refining the sample path when, and only when, its trajectory is deemed likely to actually cross the barrier. Thus the sample path of the process is represented at coarse temporal resolution when it is far from the boundary and at increasingly higher resolution closer to the boundary. Such an algorithmic principle achieves a high level of precision in the computation of the first-hitting time, while demanding far fewer operations than the usual stochastic Runge-Kutta schemes. This approach has been successfully implemented in the one-dimensional case [58]; see Figure 4. In that article, the precision of the algorithm is controlled, as well as the probability of evaluating a first-hitting time substantially different from the actual value.
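A minimal sketch of the dichotomic refinement principle, for a standard Wiener path and a constant barrier: intervals are bisected using the Brownian-bridge conditional law (the midpoint is drawn from a normal with the bridge mean and variance) only where the path comes close to the barrier. The function name, the proximity rule, and the depth cap are our own simplifications and not the algorithm of [58].

```python
import numpy as np

def refine_near_barrier(times, path, barrier, tol, max_depth, rng):
    """Recursively add midpoints (Brownian-bridge conditional law)
    on intervals whose endpoints come within `tol` of the barrier."""
    out_t, out_w = [times[0]], [path[0]]
    for (s, t), (ws, wt) in zip(zip(times[:-1], times[1:]),
                                zip(path[:-1], path[1:])):
        if max_depth > 0 and min(barrier - ws, barrier - wt) < tol:
            m = 0.5 * (s + t)
            # bridge law: mean is the linear interpolant, variance (t-s)/4
            wm = rng.normal(0.5 * (ws + wt), np.sqrt((t - s) / 4))
            lt, lw = refine_near_barrier([s, m, t], [ws, wm, wt],
                                         barrier, tol, max_depth - 1, rng)
            out_t += lt[1:]; out_w += lw[1:]
        else:
            out_t.append(t); out_w.append(wt)
    return out_t, out_w
```

Far from the barrier the path is kept at its coarse resolution, while near the barrier the dyadic description is refined, which is exactly the resolution-on-demand behavior described above.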
The approach proves extremely efficient compared to customary methods. The general multidimensional approach proposed in the present paper allows a direct generalization of these results to the computation of exit times in any dimension and for general smooth sets [30–32].
Gaussian Deformation Modes in Nonlinear Diffusions
The present study is developed for Gauss-Markov systems. However, many models arising in applied science present nonlinearities, in which case a construction based on a sum of Gaussian random variables will not generalize. The Gaussian case treated here can nevertheless be applied to perturbations of nonlinear differential equations with small noise. Let be a nonlinear time-varying vector field, and let us assume that is a stable (attractive) solution of the dynamical system:
This solution can for instance be a fixed point (in which case it is constant), a cycle (in which case it is periodic), or a general attractive orbit of the system. In the deterministic case, any solution with initial condition in a given neighbourhood of the solution will asymptotically converge towards it, and therefore perturbations of the solution remain bounded. Let us now consider that the system is subject to a small amount of noise, and define as the solution of the stochastic nonlinear differential equation:
Assuming that the noise is small (i.e., is a small parameter), because of the attractivity of the solution , the process will remain very close to (at least on a bounded time interval). In this region, we define . This stochastic variable is the solution of the equation
The solution at the first order in is therefore the multidimensional Gaussian process with nonconstant coefficients:
and our theory describes the solutions in a multiresolution framework. Notice that, in that perspective, our basis functions can be seen as increasingly finer modes of deformation of a deterministic trajectory. This approach appears particularly relevant to the theory of weakly interconnected neural oscillators in computational neuroscience [60]. Indeed, one of the most popular approaches of this field, the phase model theory, formally consists in studying how perturbations are integrated in the neighborhood of an attracting cycle [61, 62].
All these instances are exemplary of how our multiresolution description of the Gauss-Markov processes offers a simple yet rigorous tool to broach a large number of open problems and promises fascinating applications both in theoretical and in applied science.
Appendices
A. Formulae of the Basis for the Integrated Wiener Process
In the case of the primitive of the Wiener process, straightforward linear algebra computations lead to the two bases of functions and having the following expressions where are the diagonal components of the (diagonal) matrix . As expected, we notice that the differential structure of the process is conserved at any finite rank since we have
B. Formulae of the Basis for the Doubly-Integrated Wiener Process
For the doubly-integrated Wiener process, the construction of the three-dimensional process involves a family of three 3-dimensional functions, which constitute the columns of a matrix that we denote by . This basis again has a simple expression when is the midpoint of the interval : Notice again that the basis functions satisfy the relationships for and . These functions also form a triorthogonal basis of functions, which makes it easy to simulate sample paths of the doubly-integrated Wiener process, as shown in Figure 5.
C. Properties of the Lift Operators
This appendix is devoted to the proofs of the properties of the lift operator enumerated in Proposition 6.3. The proposition is split into three lemmas for the sake of clarity.
Lemma C.1. The operator is a linear measurable bijection. Moreover, for every , the function is a finite-dimensional linear operator, whose matrix representation is triangular in the natural basis of and whose eigenvalues are given by
Eventually, is a bounded operator for the spectral norm with
and the determinant of denoted by admits a limit when tends to infinity:
Proof. All these properties are deduced from the properties of the functions and derived previously.
(i) is a linear measurable bijection of , being the composition of two linear bijective measurable functions and .
(ii) Since we have the expressions of the matrices of the finite-dimensional linear transformations, it is easy to write the linear transformation of on the natural basis as leading to the coefficient expression where we have dropped the index since the expression of the coefficients does not depend on it. We deduce from the form of the matrices and that the application has a matrix representation in the basis of the form where we only represent the nonzero terms. The eigenvalues of the operator are therefore the diagonal elements, which are easily computed from the expression of the general term of the matrix:
(iii) From the expression of , we deduce the inequalities from which follows the given upper bound on the singular values.
(iv) is a finite-dimensional triangular matrix in the basis . Its determinant is simply given by the product where we notice that the eigenvalues are of the form Since, for every , we have , which entails , it is enough to show that Now, writing for every the quantity we have so that is a telescoping product that can be written as If is Hölder continuous, there exist and such that and, introducing, for any , the quantity , we have that with After expanding the exponential in the preceding definitions, we have now, from we can directly conclude that
Notice that, if , is the identity and as expected.
Similar properties are now proved for the process lift operator .
Lemma C.2. The function is a linear measurable bijection.
Moreover, for every , the function is a finite-dimensional linear operator, whose matrix representation is triangular in the natural basis of and whose eigenvalues are given by
Eventually, is a bounded operator for the spectral norm with
and the determinant of admits a limit when tends to infinity:
Proof. (i) The function is a linear measurable bijection of onto because and are linear bijective measurable functions.
(ii) We write the linear transformation of for in as If we denote the class of in by , , we can write (C.24) as from which we deduce the expression of the coefficients of the matrix : where as usual we drop the index . Because of the form of the matrices and , the matrix in the basis has the following triangular form: From the matrix representations and , the diagonal terms of read
(iii) The upper bound directly follows from the fact that .
(iv) Since , the determinant of is clearly the inverse of the determinant of , so that .
Note that Lemma C.2 directly follows from the fact that and are inverses of each other and admit triangular matrix representations. More precisely, restricted to the finite-dimensional case, we have the following properties.
Properties C.3. We have the following set of properties in terms of matrix operations:
(i) and ,
(ii) and ,
(iii) and ,
(iv) and ,
(v) and .
Proof. (i) Let us write and . Since and project the flag onto the flag , and since conversely and project the flag onto the flag , we can write
(ii) We have
(iii) We have
(iv) We have
(v) We have
D. Construction and Coefficient Applications
In this appendix, we provide the proofs of the main properties used in the paper regarding the construction and the coefficient applications.
D.1. The Construction Application
We start by addressing the case of the construction application introduced in Section 3.2.1.
We start by proving Proposition 3.11.
Proof. For the sake of simplicity, for any function , we will denote the uniform norm by , where is the operator norm induced by the uniform norms. We will also denote the th line of by (a -valued function) and the th column of by .
Let be fixed. These coefficients induce a sequence of continuous functions through the action of the sequence of partial construction applications. To prove that this sequence converges towards a continuous function, we show that it converges uniformly, which implies the result of the proposition, since a uniform limit of continuous functions is continuous. Moreover, since the functions take values in , which is a complete space, it is enough to show that, for any sequence of coefficients , the sequence of functions is a Cauchy sequence for the uniform norm.
By definition of , for every in , there exist and such that, for every , we have
which implies that, for , we have
We therefore need to upper-bound the uniform norm of the function . To this purpose, we use the definition of given by (3.28):
The coefficient in position of the integral term in the right-hand side of the previous inequality can be written as a function of the lines and columns of and , and can be upper-bounded using the Cauchy-Schwarz inequality on as follows:
Since the columns of form an orthogonal basis of functions for the standard scalar product in (see Proposition 3.5), we have . Moreover, since is bounded continuous on , we can define constants and write
Setting , for all in , the -valued functions
satisfy .
Moreover, since is also bounded continuous on , there exists such that , and we finally have, for all ,
Now using this bound and (D.2), we have
and since , the continuous functions form a uniformly convergent sequence for the -dimensional uniform norm. This sequence therefore converges towards a continuous function, and is well defined on and takes values in .
This proposition being proved, we now have the map at our disposal. We turn to proving several useful properties of this function. We denote by the Borelian sets of the -dimensional Wiener space .
Lemma D.2. The function is a linear injection.
Proof. The application is clearly linear. The injectivity simply results from the existence of the dual family of distributions . Indeed, for every in , we have , which entails that for all .
In the one-dimensional case, as mentioned in the main text, because the uniform convergence of the sample paths is preserved as long as is continuous and is nonzero through (D.8), the definition does not depend on or and the space is large enough to contain reasonably regular functions.
Proposition D.3. In the one-dimensional case, the space contains the space of uniformly Hölder continuous functions defined as
Remark D.3. This point can be seen as a direct consequence of the characterization of the local Hölder exponent of a continuous real function in terms of the asymptotic behavior of its coefficients in the decomposition on the Schauder basis [63].
Proof. To underline that we place ourselves in the one-dimensional case, we drop the bold notation indicating multidimensional quantities. Supposing that is uniformly Hölder continuous for a given , there always exists such that coincides with on : it is enough to take such that, for all in , . We can further write, for , For a given function , setting , we have Moreover, if is in , it is straightforward to see that has a continuous derivative. Then, since is -Hölder, for any , there exists such that implies from which we directly deduce This demonstrates that belongs to and ends the proof.
We equip the space with the topology induced by the uniform norm on . As usual, we denote by the corresponding Borelian sets. We now show Proposition 3.12.
Proposition D.4. The function is a bounded continuous bijection.
Proof. Consider an open ball of of radius . Taking as defined in (D.8), we can choose a real such that . Let us consider in such that . Then, by (D.8), we immediately have that for all in the ball of radius of . This shows that is open and that is continuous for the -dimensional uniform norm topology.
D.2. The Coefficient Application
In this section of the appendix, we show some useful properties of the coefficient application introduced in Section 3.2.2.
Lemma D.5. The function is a measurable linear injection.
Proof. (i) The function is clearly linear.
(ii) To prove that is injective, we show that, for and in , implies . To this end, we fix in , equipped with the uniform norm, and consider the continuous function This function coincides with on every dyadic number in and has zero value if . Since , there exists in such that , and by continuity of , there exists such that on the ball . But, for large enough, there exists such that . We then necessarily have , since otherwise we would have , which would contradict the choice of .
(iii) Before proving the measurability of , we need the following observation. Consider, for , the finite-dimensional linear function Since for all the matrices , , are bounded, the function is a continuous linear application. To show that is measurable, it is enough to show that the preimage by of the generating cylinder sets of belongs to . For any , take an arbitrary Borel set and define the cylinder set as and write the collection of cylinder sets as We proceed by induction on to show that the preimage by of any cylinder set in is in . For , a cylinder set of is of the form in , , which is measurable, being a cylinder set of . Suppose now that, for , for any set in , the set is measurable. Then, considering a set in , there exists in such that . Define in such that , where , and remark that . Clearly, we have , where we have defined the cylinder set as Having defined the function , we now have . By the continuity of , is a Borel set of . Since, by the induction hypothesis, is in , is also in as the intersection of two Borel sets. The proof of the measurability of is complete.
We now demonstrate Theorem 3.14.
Proposition D.6. The function is a measurable linear bijection whose inverse is .
Proof. Let be a continuous function. We have This function is equal to for any in , the set of dyadic numbers. Since is dense in and both and are continuous, the two functions, coinciding on the dyadic numbers, are equal for the uniform distance; hence .
E. Itô Formula
In this section, we provide rigorous proofs of Proposition 6.1 and Theorem 6.2 related to the Itô formula.
Proposition E.1 (integration by parts). Let and be two one-dimensional Gauss-Markov processes starting from zero. Then one has the following equality in law: where, for two stochastic processes and , denotes the Stratonovich integral. In terms of the Itô integral, this formula reads where the brackets denote the quadratic covariation.
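The display equations of this statement do not survive extraction here; the classical identities they correspond to can be recalled as a sketch in standard notation, assuming the two processes are written $X$ and $Y$ (the symbol names are ours, not the original's):

```latex
% Stratonovich form: the ordinary chain rule survives.
X_t Y_t = \int_0^t X_s \circ \mathrm{d}Y_s + \int_0^t Y_s \circ \mathrm{d}X_s .
% Itô form: a quadratic covariation correction term appears.
X_t Y_t = \int_0^t X_s \,\mathrm{d}Y_s + \int_0^t Y_s \,\mathrm{d}X_s
        + \langle X, Y \rangle_t .
```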
Proof. We assume that and satisfy the equations:
and we introduce the functions , , , and such that and .
We define and , the construction bases of the processes and . Therefore, using Theorem 4.5, there exist and , independent standard normal variables, such that and , and we know that the processes and are almost surely uniform limits, as , of the processes and defined by the partial sums:
Using the fact that the functions and have piecewise continuous derivatives, we have
Therefore, we need to evaluate the piecewise derivative of the functions and . We know that
which entails that
and similarly so for the process . Therefore, we have
with
We easily compute
For , as is the case here, and are both almost surely finite for all in . For almost all and drawn with respect to the law of the Gaussian infinite vector , we therefore have, by the Lebesgue dominated convergence theorem, that this integral converges almost surely towards
The other two terms and require a more thorough analysis, and we treat them as follows. Let us start with the first of these terms:
Let us now have a closer look at the process for where for such that . Because of the structure of our construction, we have
We therefore have
with
Let us denote by the time step of the partition , which is smaller than with , by the assumption made in Section 3.2.1. Moreover, we know that the functions , and are continuously differentiable, and since and are -Hölder, so are and . When (i.e., when ), using Taylor and Hölder expansions for the differentiable functions, we can further evaluate the integrals under consideration. Let us first assume that . We have
Similarly, we show that when . If , we have and for in we have
We then finally have
Moreover, we observe that the process is a martingale, and by definition of the Stratonovich integral for martingale processes, we have
where is used to denote the Stratonovich stochastic integral and the limit is taken in distribution. Notice that the convergence of the sum towards the Stratonovich integral does not depend on the particular sequence of partitions chosen, which may differ from the dyadic partition. Putting all these results together, we obtain the equality in law:
which is exactly the integration by parts formula we sought. The integration by parts formula for the Itô stochastic integral follows directly from the relationship between the Stratonovich and Itô stochastic integrals.
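The conversion rule invoked in this last step reads, in standard notation (again with assumed symbols $X$ and $Y$, valid e.g. for continuous semimartingales):

```latex
\int_0^t X_s \circ \mathrm{d}Y_s
  = \int_0^t X_s \,\mathrm{d}Y_s + \tfrac{1}{2}\,\langle X, Y \rangle_t .
```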
Theorem E.1 (Itô). Let be a Gauss-Markov process and in . The process is a Markov process and satisfies the relation
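For reference, the classical one-dimensional Itô formula that this relation instantiates reads, for an Itô process $\mathrm{d}X_t = \mu_t\,\mathrm{d}t + \sigma_t\,\mathrm{d}W_t$ and $f \in C^2$ (a hedged sketch; the symbols here are illustrative, not the original's):

```latex
f(X_t) = f(X_0)
  + \int_0^t f'(X_s)\,\mathrm{d}X_s
  + \frac{1}{2}\int_0^t f''(X_s)\,\sigma_s^2\,\mathrm{d}s .
```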
Proof. The integration by parts formula directly implies the Itô formula through a density argument, as follows. Let be the set of functions such that (E.22) holds. It is clear that is a vector space. Moreover, by Proposition 6.1, the space is an algebra. Since all constant functions and the identity function trivially belong to , the algebra contains all polynomial functions.
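The density step that follows rests on a Weierstrass-type approximation in the $C^2$ topology; the fact being used can be stated, for $f \in C^2$ and a compact interval $[-K,K]$ (notation ours):

```latex
\forall \varepsilon > 0,\ \exists\, p \ \text{polynomial such that}\quad
\sup_{|x| \le K} \Big( |f(x)-p(x)| + |f'(x)-p'(x)| + |f''(x)-p''(x)| \Big)
  < \varepsilon ,
```

obtained, for instance, by approximating $f''$ uniformly by a polynomial and integrating twice.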
Let now . There exists a sequence of polynomials such that (resp., ) uniformly converges towards (resp., ). Let us denote by the sequence of stopping times:
This sequence increases to infinity. We have
On the interval , we have , which allows us to use the Lebesgue dominated convergence theorem on each term of the equality. We have
which converges towards zero by the Lebesgue theorem for Stieltjes integration. The same argument applies directly to the other term. Therefore, letting , we prove the Itô formula for , and finally, letting , we obtain the desired formula.
F. Trace Class Operator
In this section, we demonstrate Theorem 6.6, which proves instrumental in extending the finite-dimensional change-of-variable formula to the infinite-dimensional case. The proof relies on the following lemma.
Lemma F.1. The operator is isometric to the operator defined by with the kernel
Proof. Notice first that which leads to writing in , for any and in with This proves that . Therefore, if we denote the isometric linear operator we clearly have with .
We now proceed to demonstrate that is a trace class operator.
Proof of Theorem 6.6. Since the kernel is integrable in , the integral operator is a Hilbert-Schmidt operator and is thus compact. Moreover, it is a trace class operator, since we have Since and are isometric through , the compactness of is equivalent to the compactness of . In addition, the traces of both operators coincide: using the result of Corollary 3.6.
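The two operator-theoretic facts used above are standard for integral operators; a sketch, assuming a continuous symmetric kernel $k$ on $[0,1]^2$ and $(Tu)(t) = \int_0^1 k(t,s)\,u(s)\,\mathrm{d}s$ (the interval and symbols are illustrative):

```latex
\|T\|_{\mathrm{HS}}^2
  = \int_0^1\!\!\int_0^1 |k(t,s)|^2 \,\mathrm{d}s\,\mathrm{d}t < \infty
  \ \Longrightarrow\ T \ \text{is Hilbert-Schmidt, hence compact};
\qquad
\operatorname{Tr} T = \int_0^1 k(t,t)\,\mathrm{d}t ,
```

the trace identity holding, for example, for continuous symmetric nonnegative-definite kernels by Mercer's theorem.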
G. Girsanov Formula
In this section, we provide the rather technical proof of Lemma 6.8, which is used in proving the Girsanov formula.
Lemma G.1. The positive definite quadratic form on associated with the operator is well defined on . Moreover, for all , where and refers to the Stratonovich integral, and the equality holds in law.
Proof. The proof of this lemma relies on arguments quite similar to those used in the proof of the Itô theorem. However, since this result is central to understanding how our geometric considerations relate to the Girsanov theorem, we provide the detailed proof here.
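For orientation, recall the form that the Girsanov density takes in the standard setting; a hedged sketch, assuming a drift $\theta$ satisfying Novikov's condition (symbols ours, not the original's):

```latex
\left.\frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}}\right|_{\mathcal{F}_t}
  = \exp\!\left( \int_0^t \theta_s \,\mathrm{d}W_s
    - \frac{1}{2}\int_0^t \theta_s^2 \,\mathrm{d}s \right) .
```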
Consider in , denote , and write
where we have posited
It is easy to see, using arguments similar to those in the proof of the integration by parts formula (Proposition 6.1), that
Because of the uniform convergence property of towards and the fact that it has almost surely bounded sample paths, the latter sum converges towards
Now, writing the quantity as the sum of elementary integrals between the points of discontinuity , ,
and using the identities of (E.13) and (E.14), we then have
where we denote
Let us define the function in by
If and are uniformly -Hölder continuous, so is . Therefore, there exist an integer and a real such that if , for all , we have
which shows that , and similarly . As a consequence, the expression in Lemma G.1 converges, as tends to infinity, toward the desired Stratonovich integral.
Acknowledgments
The authors wish to thank Professor Marcelo Magnasco for many illuminating discussions. This work was partially supported by NSF Grant EF-0928723 and ERC Grant NERVI-227747.