Abstract

We study the optimal control problem of a controlled time-symmetric forward-backward doubly stochastic differential equation with initial-terminal state constraints. Applying the terminal perturbation method and Ekeland's variational principle, we derive a necessary condition for the stochastic optimal control, that is, a stochastic maximum principle. Applications to backward doubly stochastic linear-quadratic control models are also investigated.

1. Introduction

It is well known that general coupled forward-backward stochastic differential equations (FBSDEs) consist of a forward SDE of Itô's type and a backward SDE of Pardoux-Peng's type (see [1, 2]). Since Antonelli [3] first studied FBSDEs in the early 1990s, they have been investigated widely in many papers (see [4–7]). FBSDEs often arise in optimization problems when the stochastic maximum principle is applied (see [8, 9]). In finance, FBSDEs are used when considering problems with large investors; see [6, 10, 11]. Such equations are also used in potential theory (see [12]). Moreover, one can apply FBSDEs to study the homogenization and singular perturbation of certain quasilinear parabolic PDEs with periodic structures (see [13, 14]).

In order to produce a probabilistic representation of certain quasilinear stochastic partial differential equations (SPDEs), Pardoux and Peng [15] first introduced backward doubly stochastic differential equations (BDSDEs) and proved an existence and uniqueness theorem for them. Using such BDSDEs, they proved the existence and uniqueness of solutions of those quasilinear SPDEs and thus significantly extended the famous Feynman-Kac formula to such SPDEs.

Peng and Shi [16] studied time-symmetric forward-backward doubly stochastic differential equations (FBDSDEs), which generalize the general FBSDEs. Here the forward equation is "forward" with respect to the standard stochastic integral $dW_t$, as well as "backward" with respect to the backward stochastic integral $d\overleftarrow{B}_t$; the coupled "backward equation" is "forward" under the backward stochastic integral and "backward" under the forward one. In other words, both the forward equation and the backward one are BDSDEs with opposite directions of stochastic integration. Under certain monotonicity conditions, Peng and Shi proved an existence and uniqueness theorem for these equations. In [17], when deriving the stochastic maximum principle for backward doubly stochastic optimal control problems, Han et al. showed that equations of exactly this type serve as the state equation and the adjoint equation of their optimal control problem.
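To fix ideas, a system of this type can be written schematically (in the spirit of [16]; the precise coupling and measurability conditions are stated there) as
\[
\begin{cases}
dy_t = f(t, y_t, Y_t, z_t, Z_t)\, dt + g(t, y_t, Y_t, z_t, Z_t)\, dW_t - z_t\, d\overleftarrow{B}_t, \\
-dY_t = F(t, y_t, Y_t, z_t, Z_t)\, dt + G(t, y_t, Y_t, z_t, Z_t)\, d\overleftarrow{B}_t - Z_t\, dW_t,
\end{cases}
\]
with a prescribed initial value for $y$ and terminal value for $Y$: the first equation is an Itô SDE in $dW_t$ whose unknown $z_t$ integrates against the backward integral $d\overleftarrow{B}_t$, while the second is a BDSDE in the sense of [15], and the two directions of integration are exactly exchanged.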

In this paper, we study a stochastic optimal control problem with initial-terminal state constraints where the controlled system is described by the above time-symmetric FBDSDEs. We suppose that the initial state and the terminal state fall in two convex sets, respectively, and that the corresponding states satisfy given constraints; we then minimize a cost function of these data (specified in Section 2.2). It is well known that the maximum principle is an important approach to optimal control problems; a systematic account of this theory can be found in [9, 18]. When the controlled system is subject to state constraints, especially sample-wise constraints, the corresponding stochastic optimal control problems are difficult to solve. A sample-wise constraint requires that the state be in a given set with probability one, for example, a nonnegativity constraint on the wealth process, that is, bankruptcy prohibition in financial markets. In order to deal with such optimal control problems, an approach named the "terminal perturbation method" was introduced and applied to financial optimization problems recently (see [19–22]). This method is based on the dual method or martingale method introduced by Bielecki et al. in [23] and El Karoui et al. in [24]. It mainly applies Ekeland's variational principle to tackle the state constraints and derive a stochastic maximum principle which characterizes the optimal solution. For other works on optimization problems with state constraints, the reader may refer to [25, 26]. In this paper, a stochastic maximum principle is obtained for the controlled time-symmetric FBDSDEs with initial-terminal state constraints by using Ekeland's variational principle.

We give three specific applications to illustrate our theoretical results. In the first application, the controlled state equations consist of a standard forward SDE and a BDSDE. By introducing a backward formulation of the controlled system (inspired by [21]), we present the stochastic maximum principle for the optimal control. In the second application, as a special case, we consider a single BDSDE as the state equation. In the last application, we show that our results apply to forward-backward doubly stochastic linear-quadratic (LQ) optimal control problems, and we derive the explicit expression of the optimal control. Since the control system of an SPDE can be transformed into a corresponding control system of FBDSDEs, our results can also be used to solve the optimal control problem of one class of SPDEs.

This paper is organized as follows. In Section 2.1, we recall some preliminaries, and we formulate our control problem in Section 2.2. In Section 2.3, by applying Ekeland's variational principle, we obtain a stochastic maximum principle for the controlled time-symmetric FBDSDE with initial-terminal state constraints. Some applications are given in the last section.

2. The Main Problem

2.1. Preliminaries

Let us first recall the existence and uniqueness result for BDSDEs, which were introduced by Pardoux and Peng [15], and an extension of the well-known Itô's formula that will be used frequently in this paper.

Let $(\Omega, \mathcal{F}, P)$ be a complete probability space, and let $T > 0$ be fixed throughout this paper. Let $\{W_t;\ 0 \le t \le T\}$ and $\{B_t;\ 0 \le t \le T\}$ be two mutually independent standard Brownian motions, with values in $\mathbb{R}^d$ and $\mathbb{R}^l$, respectively, defined on $(\Omega, \mathcal{F}, P)$. Let $\mathcal{N}$ denote the class of $P$-null sets of $\mathcal{F}$. For each $t \in [0, T]$, we define $\mathcal{F}_t \triangleq \mathcal{F}_t^W \vee \mathcal{F}_{t,T}^B$, where $\mathcal{F}_t^W = \mathcal{N} \vee \sigma\{W_r - W_0;\ 0 \le r \le t\}$ and $\mathcal{F}_{t,T}^B = \mathcal{N} \vee \sigma\{B_r - B_t;\ t \le r \le T\}$. Note that the collection $\{\mathcal{F}_t,\ t \in [0, T]\}$ is neither increasing nor decreasing: as $t$ grows, $\mathcal{F}_t^W$ increases while $\mathcal{F}_{t,T}^B$ decreases. Hence it does not constitute a filtration.

For any Euclidean space $H$, we denote by $\langle \cdot, \cdot \rangle$ the scalar product of $H$. The Euclidean norm of a vector $y$ will be denoted by $|y|$, and for a matrix $A$, we define $\|A\| = \sqrt{\operatorname{tr}(AA^{*})}$.

For any $n \in \mathbb{N}$, let $M^2(0, T; \mathbb{R}^n)$ denote the set of (classes of $dP \times dt$ a.e. equal) $n$-dimensional jointly measurable stochastic processes $\{\varphi_t;\ t \in [0, T]\}$ which satisfy

(i) $E \int_0^T |\varphi_t|^2\, dt < \infty$; (ii) $\varphi_t$ is $\mathcal{F}_t$-measurable, for a.e. $t \in [0, T]$.

We denote by $S^2([0, T]; \mathbb{R}^n)$ the set of continuous $n$-dimensional stochastic processes $\{\varphi_t;\ t \in [0, T]\}$ which satisfy:

(i) $E(\sup_{0 \le t \le T} |\varphi_t|^2) < \infty$; (ii) $\varphi_t$ is $\mathcal{F}_t$-measurable, for any $t \in [0, T]$.

Let $f : \Omega \times [0, T] \times \mathbb{R}^k \times \mathbb{R}^{k \times d} \to \mathbb{R}^k$ and $g : \Omega \times [0, T] \times \mathbb{R}^k \times \mathbb{R}^{k \times d} \to \mathbb{R}^{k \times l}$ be jointly measurable such that for any $(y, z) \in \mathbb{R}^k \times \mathbb{R}^{k \times d}$, $f(\cdot, y, z) \in M^2(0, T; \mathbb{R}^k)$ and $g(\cdot, y, z) \in M^2(0, T; \mathbb{R}^{k \times l})$.

Moreover, we assume that there exist constants $c > 0$ and $0 < \alpha < 1$ such that for any $(\omega, t) \in \Omega \times [0, T]$ and $(y_1, z_1), (y_2, z_2) \in \mathbb{R}^k \times \mathbb{R}^{k \times d}$,
\[
|f(t, y_1, z_1) - f(t, y_2, z_2)|^2 \le c\,\big(|y_1 - y_2|^2 + \|z_1 - z_2\|^2\big), \qquad \|g(t, y_1, z_1) - g(t, y_2, z_2)\|^2 \le c\,|y_1 - y_2|^2 + \alpha\,\|z_1 - z_2\|^2.
\]
Given $\xi \in L^2(\Omega, \mathcal{F}_T, P; \mathbb{R}^k)$, we consider the following BDSDE:
\[
Y_t = \xi + \int_t^T f(s, Y_s, Z_s)\, ds + \int_t^T g(s, Y_s, Z_s)\, d\overleftarrow{B}_s - \int_t^T Z_s\, dW_s, \quad 0 \le t \le T. \tag{2.3}
\]

We note that the integral with respect to $\{B_t\}$ is a "backward Itô integral" and the integral with respect to $\{W_t\}$ is a standard forward Itô integral. These two types of integrals are particular cases of the Itô-Skorohod integral; see Nualart and Pardoux [27].
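For orientation, we recall informally the standard characterization of the two integrals (see [27]): for a suitably regular integrand $h$ and a partition $\pi : t = t_0 < t_1 < \cdots < t_n = T$, one has, with limits taken in $L^2(\Omega)$,
\[
\int_t^T h_s\, d\overleftarrow{B}_s = \lim_{|\pi| \to 0} \sum_{i=0}^{n-1} h_{t_{i+1}}\big( B_{t_{i+1}} - B_{t_i} \big), \qquad \int_t^T h_s\, dW_s = \lim_{|\pi| \to 0} \sum_{i=0}^{n-1} h_{t_i}\big( W_{t_{i+1}} - W_{t_i} \big);
\]
the backward integral evaluates the integrand at the right endpoint of each subinterval, the forward integral at the left endpoint.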

By Theorem 1.1 in [15], (2.3) has a unique solution $(Y, Z) \in S^2([0, T]; \mathbb{R}^k) \times M^2(0, T; \mathbb{R}^{k \times d})$.

Next let us recall the extension of the well-known Itô's formula given in [21], which will be used frequently in this paper.

Lemma 2.1. Let $\alpha \in S^2([0, T]; \mathbb{R}^k)$, $\beta \in M^2(0, T; \mathbb{R}^k)$, $\gamma \in M^2(0, T; \mathbb{R}^{k \times l})$, and $\delta \in M^2(0, T; \mathbb{R}^{k \times d})$ be such that
\[
\alpha_t = \alpha_0 + \int_0^t \beta_s\, ds + \int_0^t \gamma_s\, d\overleftarrow{B}_s + \int_0^t \delta_s\, dW_s, \quad 0 \le t \le T.
\]
Then
\[
|\alpha_t|^2 = |\alpha_0|^2 + 2 \int_0^t \langle \alpha_s, \beta_s \rangle\, ds + 2 \int_0^t \langle \alpha_s, \gamma_s\, d\overleftarrow{B}_s \rangle + 2 \int_0^t \langle \alpha_s, \delta_s\, dW_s \rangle - \int_0^t \|\gamma_s\|^2\, ds + \int_0^t \|\delta_s\|^2\, ds.
\]
More generally, for $\varphi \in C^2(\mathbb{R}^k)$,
\[
\varphi(\alpha_t) = \varphi(\alpha_0) + \int_0^t \langle \varphi'(\alpha_s), \beta_s \rangle\, ds + \int_0^t \langle \varphi'(\alpha_s), \gamma_s\, d\overleftarrow{B}_s \rangle + \int_0^t \langle \varphi'(\alpha_s), \delta_s\, dW_s \rangle - \frac{1}{2} \int_0^t \operatorname{tr}\big[\varphi''(\alpha_s)\, \gamma_s \gamma_s^{*}\big]\, ds + \frac{1}{2} \int_0^t \operatorname{tr}\big[\varphi''(\alpha_s)\, \delta_s \delta_s^{*}\big]\, ds.
\]

2.2. Problem Formulation

Let $U$ be a nonempty convex subset of a Euclidean space. A jointly measurable, square integrable process taking values in $U$ is called an admissible control. Now let the coefficients of the controlled system be jointly measurable and square integrable in the sense of Section 2.1 for any fixed values of their state and control arguments. We assume the following.

(H1) There exists a constant such that the coefficients satisfy, in the state variables, a monotonicity condition of the type used in [16].

(H2) There exist constants $c > 0$ and $0 < \alpha < 1$ such that the coefficients satisfy, in the state variables, Lipschitz conditions of the same type as in Section 2.1.

(H3) The coefficients and the cost functions are continuous in their arguments and continuously differentiable in the state and control variables, and these derivatives are bounded, the bounds for the derivatives of the coefficients of the backward integral being expressed through $c$ and $\alpha$.

Given an initial datum $x$, a terminal datum $\xi$, and an admissible control $v(\cdot)$, let us consider the controlled time-symmetric FBDSDE (2.13). Recalling Theorem 2.2 in [16], we have the following.

Theorem 2.2. For given $x$, $\xi$, and $v(\cdot)$, assume (H1)~(H3); then (2.13) admits a unique $\mathcal{F}_t$-adapted solution.

In (2.13), we regard the triple $(x, \xi, v(\cdot))$ as the control. It can be chosen from the admissible set described above, where the two constraint sets are convex.

Remark 2.3. A main assumption in this paper is that the control domains are convex. For the terminal perturbation method, it is difficult to weaken or completely remove this assumption; until now, this remains an interesting and challenging open problem.

We also impose the state constraints introduced above. For each admissible triple, consider the cost function (2.16). Our optimization problem (2.17) is to minimize (2.16) over the admissible set subject to these constraints.

Definition 2.4. A triple $(x, \xi, v(\cdot))$ of random variables is called feasible for the given constraint sets if the solution of (2.13) satisfies the initial and terminal state constraints above. The collection of all feasible triples for the given constraint sets will be referred to as the feasible set.

A feasible triple is called optimal if it attains the minimum of the cost function (2.16) over the feasible set.

The aim of this paper is to obtain a characterization of the optimal triple, that is, a stochastic maximum principle.

2.3. Stochastic Maximum Principle

Using Ekeland's variational principle, we derive a maximum principle for the optimization problem (2.17) in this section. For simplicity, we first study a special case of the problem in Sections 2.3.1–2.3.3 and then present the results for the general case in Section 2.3.4.

2.3.1. Variational Equations

For admissible triples $(x_i, \xi_i, v_i(\cdot))$, $i = 1, 2$, we define a metric by
\[
d\big((x_1, \xi_1, v_1), (x_2, \xi_2, v_2)\big) \triangleq \Big( E|x_1 - x_2|^2 + E|\xi_1 - \xi_2|^2 + E \int_0^T |v_1(t) - v_2(t)|^2\, dt \Big)^{1/2}.
\]
It is obvious that the admissible set equipped with $d$ is a complete metric space.

Let $(x^*, \xi^*, v^*(\cdot))$ be optimal and let the corresponding state processes of (2.13) be given. For any admissible triple $(x, \xi, v(\cdot))$ and any $\rho \in (0, 1)$, set the convex perturbation $(x^\rho, \xi^\rho, v^\rho) = (x^*, \xi^*, v^*) + \rho\,\big((x, \xi, v) - (x^*, \xi^*, v^*)\big)$, and let the perturbed state processes of (2.13) be associated with $(x^\rho, \xi^\rho, v^\rho)$.

To derive the first-order necessary condition, we let the variational processes be the solution of the following time-symmetric FBDSDE (2.20), obtained by linearizing (2.13) along the optimal trajectory in the direction of the perturbation, with coefficients given by the derivatives of the original coefficients evaluated along the optimal trajectory. Equation (2.20) is called the variational equation.

Set, for each state component, the rescaled difference between the perturbed and the optimal states, divided by $\rho$, minus the corresponding solution of (2.20). We have the following convergence.

Lemma 2.5. Assume (H1)~(H3); then the processes defined above converge to $0$, in the norms of Section 2.1, as $\rho \to 0$.

Proof. From (2.13) and (2.20), we obtain the time-symmetric FBDSDE satisfied by the rescaled differences. Applying Lemma 2.1 to the squared norm of its forward component, we get an estimate whose constants depend only on the bounds in (H2) and (H3).
A similar analysis applies to the backward component, with the corresponding constants and remainder terms defined as before.
Combining the two estimates yields an integral inequality whose constants again depend only on (H2) and (H3). Since the perturbations are square integrable, the remainder terms are finite, and the Lebesgue dominated convergence theorem implies that they tend to $0$ as $\rho \to 0$; we then obtain the result by Gronwall's inequality.
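For the reader's convenience, we recall the form of Gronwall's inequality used in this last step (a standard statement, with generic constants $a, b \ge 0$ and an integrable function $\varphi \ge 0$): if
\[
\varphi(t) \le a + b \int_0^t \varphi(s)\, ds, \quad t \in [0, T], \qquad \text{then} \qquad \varphi(t) \le a\, e^{bt}, \quad t \in [0, T].
\]
Here $a$ is proportional to the remainder terms above, which vanish as $\rho \to 0$, so the rescaled differences converge to $0$ as well.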

2.3.2. Variational Inequality

In this subsection, we apply Ekeland's variational principle [28] to deal with the initial-terminal state constraints. To this end, define a penalized functional built from the cost function and the distances of the initial and terminal states to the given initial and terminal constraint sets, where the penalization parameter is an arbitrary positive constant.
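We shall use Ekeland's variational principle in the following standard form (see [28]). Let $(V, d)$ be a complete metric space and let $F : V \to \mathbb{R} \cup \{+\infty\}$ be proper, lower semicontinuous, and bounded from below. If $u \in V$ satisfies $F(u) \le \inf_{v \in V} F(v) + \varepsilon$ for some $\varepsilon > 0$, then for every $\lambda > 0$ there exists $u_\lambda \in V$ such that
\[
F(u_\lambda) \le F(u), \qquad d(u_\lambda, u) \le \lambda, \qquad F(w) > F(u_\lambda) - \frac{\varepsilon}{\lambda}\, d(w, u_\lambda) \quad \text{for all } w \ne u_\lambda.
\]
Properties (i)-(iii) invoked in the proof of Theorem 2.6 below are these three conclusions, with a suitable choice of $\lambda$ (typically $\lambda = \sqrt{\varepsilon}$).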

It is easy to check that the mappings defined above are all continuous functionals on the underlying $L^2$ spaces.

Theorem 2.6. Suppose (H1)~(H3), and let the triple be an optimal solution to (2.17). Then there exist multipliers, not all vanishing, such that the variational inequality (2.32) holds, where the variational processes entering (2.32) are the solutions of (2.20) evaluated at the terminal time $T$ and at the initial time $0$, respectively.

Proof. It is easy to check that the penalized functional is continuous and bounded from below on the complete metric space introduced above. Thus, from Ekeland's variational principle [28], there exists a triple such that (i) it is $\varepsilon$-optimal for the penalized functional, (ii) it is close to the optimal triple in the metric $d$, and (iii) it minimizes the functional perturbed by a multiple of the distance. For any admissible triple and $\rho \in (0, 1)$, set the convex perturbation as before, let the corresponding solutions of (2.13) be given, and let the solutions of (2.20) be taken with the perturbation direction substituted accordingly.
From (iii), we know that the penalized functional decreases at the $\varepsilon$-optimal point by at most a multiple of the metric distance. On the other hand, similarly to Lemma 2.5, the perturbed states admit first-order expansions in $\rho$; applying the linearization technique, we obtain the corresponding expansions of the penalized functional. For the given $\varepsilon$, we consider the following four cases.
Case 1. There exists $\rho_0 > 0$ such that the penalization term is nonzero for all $\rho \in (0, \rho_0]$.
In this case, dividing (2.34) by $\rho$ and sending $\rho$ to $0$, we obtain the variational inequality with multipliers defined accordingly. Case 2. There exists a positive sequence $\{\rho_n\}$ with $\rho_n \to 0$ such that the penalization term vanishes along it. Then, for sufficiently large $n$, since the penalized functional is continuous, we can identify the vanishing terms; similarly to Case 1, we obtain the inequality with analogously defined multipliers. Case 3 and Case 4 concern positive sequences along which only one of the two penalization terms vanishes, and similar techniques can be used for both.
In summary, in all cases we obtain multipliers which, by the definition of the penalized functional, are of unit norm. Then there exists a convergent subsequence of the multipliers, whose limit we retain. On the other hand, it is easy to check that the $\varepsilon$-optimal triples converge to the optimal one as $\varepsilon \to 0$. Thus (2.32) holds.

2.3.3. Maximum Principle

In this subsection we derive the maximum principle for the special case introduced above, and then present the results for the general case in Section 2.3.4. To this end, we introduce the adjoint processes associated with the optimal solution to (2.13), defined as the solution of the adjoint time-symmetric FBDSDE whose coefficients are defined as in (2.20). It is easy to check that there exist unique processes which solve these equations.

Theorem 2.7. We assume (H1)~(H3). Let the triple be optimal and let the corresponding optimal trajectory be given. Then, for arbitrary admissible controls and any $t \in [0, T]$, the maximum principle inequality (2.52) holds.

Proof. For any admissible perturbation, let the variational processes solve (2.20). Applying Lemma 2.1 to the pairing of the adjoint processes with the variational processes, we obtain a duality relation between the variational inequality (2.32) and the adjoint equations. Combining these identities for an arbitrary admissible choice, it is easy to see that (2.52) holds.

2.3.4. The General Case

Define the Hamiltonian associated with problem (2.17). Now we consider the general case of the cost function (2.16).

Since the proof of the maximum principle is essentially similar to that in the preceding subsection, we only present the result without proof.

Let the triple be optimal for (2.17), with the corresponding optimal trajectory of (2.13). We define the adjoint equations, whose coefficients are the derivatives of the Hamiltonian evaluated along the optimal trajectory.

Theorem 2.8. We assume (H1)~(H3). Let the triple be optimal and let the corresponding optimal trajectory be given. Then, for arbitrary admissible controls, the maximum principle inequalities hold for any $t \in [0, T]$.

Remark 2.9. Denote by $\partial K$ the boundary of one of the constraint sets. On $\partial K$ a corresponding refinement of the multipliers holds, and a similar analysis applies to the boundaries of the other two constraint sets.

3. Applications

In this section, we give three specific cases to illustrate applications of our results.

3.1. System Composed of a Forward SDE and a BDSDE

Classical Formulation
For given data, we consider the following controlled system (3.1), composed of a forward SDE and a BDSDE, where the initial state is given and the control process takes values, a.s., in a given nonempty convex subset of a Euclidean space.
Set the mappings accordingly. In this case, we regard the terminal datum and the control process as the control variables, and we define the corresponding cost function. We assume the following.

(H1) The coefficients are continuous in their arguments and continuously differentiable in the state and control variables.

(H2) There exist constants $c > 0$ and $0 < \alpha < 1$ such that the coefficients satisfy Lipschitz conditions analogous to those of Section 2.1.

(H3) The derivatives of the coefficients in the state and control variables are bounded, the bounds for the derivatives of the diffusion coefficients being expressed through $c$ and $\alpha$.

Then, for the given data, there exists a unique triple which solves (3.1).

We assume an additional terminal state constraint: the terminal state lies, a.s., in a given nonempty convex subset. Our stochastic control problem is then (3.7).

Backward Formulation
From now on, we give an equivalent backward formulation of the previously mentioned stochastic optimal control problem (3.7). To do so we need an additional assumption: (H4) there exists a positive constant such that the diffusion coefficient of (3.1) is uniformly monotone in its last argument, for all times and states.
Note that (H1) and (H4) imply that the mapping from the last argument to the diffusion coefficient is a bijection from the underlying Euclidean space onto itself for any fixed time and state.
Denote the inverse of this mapping accordingly. Then system (3.1) can be rewritten in terms of the transformed variable, with coefficients composed with the inverse mapping.
A key observation that inspires our approach to solving problem (3.7) is that, since the diffusion coefficient is a bijection in its last argument, the transformed variable can be regarded as the control; moreover, by the BSDE theory, selecting this control is equivalent to selecting the terminal value. Hence we introduce the following "controlled" system (3.9), where the control variables are the random variables to be chosen from the set (3.10). For each admissible choice, consider the cost (3.11).
This gives rise to the following auxiliary optimization problem (3.12), where the constraint involves the solution of (3.9) at time $0$ under the chosen control variables.
It is clear that the original problem (3.7) is equivalent to the auxiliary one (3.12).
Hence, hereafter we focus on solving (3.12). The advantage of doing this is that, since the terminal value and the transformed process are now the control variables, the state constraint in (3.7) becomes a control constraint in (3.12), whereas it is well known in control theory that a control constraint is much easier to deal with than a state constraint. There is, nonetheless, a cost of doing so: the original initial condition now becomes a constraint, as shown in (3.12).
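A familiar illustration of this backward formulation (a schematic example in the spirit of [21], not the system (3.1) itself; the bounded processes $r$ and $\theta$ are generic data) is the wealth equation
\[
dy_t = \big( r_t\, y_t + \theta_t\, z_t \big)\, dt + z_t\, dW_t, \qquad y_0 = x,
\]
with portfolio process $z$. By classical BSDE theory, prescribing the terminal wealth $\xi = y_T \in L^2$ determines $(y, z)$ uniquely as the solution of
\[
y_t = \xi - \int_t^T \big( r_s\, y_s + \theta_s\, z_s \big)\, ds - \int_t^T z_s\, dW_s,
\]
so choosing the control $z$ is equivalent to choosing the terminal value $\xi$. A sample-wise state constraint such as $y_T \in K$ a.s. then becomes the control constraint $\xi \in K$, while the original initial condition turns into the constraint that the solution satisfy $y_0 = x$.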
From now on, we denote the solution of (3.9) in a way that shows its dependence on the control variables, whenever necessary, and we write the composed coefficients accordingly. Finally, it is easy to check that these coefficients satisfy conditions similar to those in Assumptions (H1)~(H3).
We note that this is a special case of (2.17), so by the same method we have the following result.

Stochastic Maximum Principle
We define the adjoint processes as the solutions of the corresponding adjoint equations, whose coefficients are defined as in (2.32). It is easy to check that there exist unique processes which solve the above equations.

Theorem 3.1. Assume (H1)~(H4). Let the pair be optimal for (3.12), and let the corresponding optimal trajectory be given. Then, for arbitrary admissible controls, the following inequalities hold:

3.2. System Composed of a BDSDE with State Constraints

Although this case describes a controlled BDSDE system with state constraints, it is comparatively simple; thus, we give only a brief illustration. Given the data, consider the following BDSDE (3.15). For coefficients satisfying (H2) and (H3), by Theorem 1.1 in [15] it is easy to check that there exists a unique solution of (3.15).

Note that the initial value of the solution is a measurable random variable. Now we regard the terminal value and the control process as the control variables to be chosen from the set (3.17). For each admissible pair, consider the cost function (3.18), which gives rise to the optimization problem (3.19).

Maximum Principle
Let the Hamiltonian be defined as above. Then the adjoint equation is (3.21), whose coefficients are the derivatives of the Hamiltonian evaluated along the optimal trajectory.

Theorem 3.2. Assume (H1)~(H3). Let the optimal controls and the corresponding optimal trajectory be given. Then the following maximum principle inequalities hold:

3.3. Backward Doubly Stochastic LQ Problem without State Constraints

Consider the following linear system (3.23), where the initial and terminal data are given constants and the coefficients are matrices of compatible dimensions.

The cost function (2.16) becomes a quadratic functional, where all coefficient functions of $t$ are bounded, the state-weighting matrices are symmetric nonnegative definite, and the control-weighting matrices are symmetric and uniformly positive definite.
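For orientation, a typical cost of this kind has the form (the weighting processes $Q, R, N$ and the endpoint weights $M_1, M_2$ below are generic placeholders rather than the notation of (2.16))
\[
J(v(\cdot)) = \frac{1}{2}\, E\left[ \int_0^T \Big( \langle Q_t\, y_t, y_t \rangle + \langle R_t\, Y_t, Y_t \rangle + \langle N_t\, v_t, v_t \rangle \Big)\, dt + \langle M_1\, y_T, y_T \rangle + \langle M_2\, Y_0, Y_0 \rangle \right],
\]
with $Q_t, R_t, M_1, M_2$ symmetric nonnegative definite and $N_t$ symmetric and uniformly positive definite; the uniform positivity of the control weight provides the strict convexity used in the uniqueness part of Theorem 3.3 below.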

Then, from (2.51), the adjoint equations become linear time-symmetric FBDSDEs. Define the corresponding Hamiltonian, and suppose that the control domain is an open set. Then the maximum principle yields the first-order condition for the Hamiltonian, and thus an explicit candidate optimal control. However, the maximum principle gives only a necessary condition for optimality, so we also have the following theorem.
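The passage from the maximum principle to the explicit control can be sketched as follows, again in placeholder notation: write $S_t$ for the coefficient of $v_t$ in the state equation, $p_t$ for the relevant adjoint process, and $N_t$ for the control weight. When the control domain is open, the variational inequality $\langle H_v(t), v - u_t \rangle \ge 0$ for all admissible $v$ forces $H_v(t) = 0$; if the Hamiltonian is quadratic in $v$ with $H_v(t) = N_t\, u_t + S_t^{*}\, p_t$, this gives
\[
u_t = -\, N_t^{-1} S_t^{*}\, p_t,
\]
which is the type of explicit feedback expression asserted in Theorem 3.3.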

Theorem 3.3. The control given above is the unique optimal control for the backward doubly stochastic LQ problem, where the adjoint processes are the solutions of the above equations.

Proof. First let us prove that the candidate is indeed optimal. For any admissible control, let the corresponding trajectory of (3.23) be given. Applying Lemma 2.1 to the pairing of the adjoint processes with the difference of the trajectories, we obtain a duality relation. Then, by the definition of the candidate control and the convexity of the cost, the difference of the costs is nonnegative. From the arbitrariness of the admissible control, we deduce that the candidate is the optimal control.
The proof of the uniqueness of the optimal control is classical. Assume that two optimal controls are given, with corresponding trajectories. By the uniqueness of solutions of (3.23), the trajectory corresponding to their midpoint is the average of the two trajectories. Since the control weights are uniformly positive and the state weights nonnegative, the cost functional is strictly convex in the control, so the cost of the midpoint is strictly smaller than the common optimal value unless the two controls coincide; as both attain the infimum, they must coincide.

Example 3.4. Consider the following backward doubly stochastic LQ problem with the data given below. We want to minimize the corresponding cost function. From (3.34), we obtain the candidate control for each $t$. Substituting the resulting processes into the cost function, we evaluate it explicitly and thus identify the optimal control together with the optimal state trajectory. The adjoint equations follow, and it is obvious that the indicated process is the unique solution of the above equation.

Acknowledgments

The authors would like to thank Professor Shige Peng for some useful conversations. This work was supported by the National Natural Science Foundation of China (nos. 11171187, 10871118, and 10921101), by the Programme of Introducing Talents of Discipline to Universities of China (no. B12023), and by the Program for New Century Excellent Talents in University of China.