Abstract

We consider a stochastic recursive optimal control problem in which the control variable has two components: the regular control and the impulse control. The control variable does not enter the diffusion coefficient, and the domain of the regular controls is not necessarily convex. We establish necessary optimality conditions, of the Pontryagin maximum principle type, for this stochastic optimal control problem. Sufficient optimality conditions are also given. The optimal control is obtained explicitly for an example of a linear-quadratic optimization problem to illustrate the applications of the theoretical results.

1. Introduction

Nonlinear backward stochastic differential equations (BSDEs for short) were first introduced by Pardoux and Peng [1]. Independently, Duffie and Epstein [2] introduced BSDEs in an economic context. In [2], they presented a stochastic recursive utility, which is an extension of the standard additive utility in which the instantaneous utility depends not only on the instantaneous consumption rate but also on the future utility. It corresponds to the solution of a particular BSDE whose generator does not depend on the variable $z$. Later, El Karoui et al. [3] gave the formulation of recursive utilities from the BSDE point of view. The problem in which the cost functional of the control system is described by the solution of a BSDE is called the stochastic recursive optimal control problem. In this case, the control systems become forward-backward stochastic differential equations (FBSDEs).

One fundamental research direction for optimal control problems is to establish necessary optimality conditions, that is, the Pontryagin maximum principle. The stochastic maximum principle for forward, backward, and forward-backward systems has been studied by many authors, including Peng [4, 5], Tang and Li [6], Wang and Yu [7], Wu [8], and Xu [9] in the full information case, and Huang et al. [10], Wang and Wu [11], Wang and Yu [12], and Wu [13] in the partial information case. However, in these papers only regular controls appear in the control systems; impulse controls are not included.

Stochastic impulse control problems have received considerable research attention in recent years due to their wide applicability in a number of different areas, especially in mathematical finance; see, for example, [14–17]. In most cases, the optimal impulse control problem has been studied through the dynamic programming principle. In particular, it was shown that the value function is a solution of some quasi-variational inequalities.

The first result on the stochastic maximum principle for singular control problems was obtained by Cadenillas and Haussmann [18], under the assumptions of linear dynamics, a convex cost criterion, and a convex state constraint. Bahlali and Chala [19] generalized [18] to nonlinear dynamics with a convex state constraint. Bahlali and Mezerdi [20] considered a stochastic singular control problem in which the control system is governed by a stochastic differential equation where the regular control enters the diffusion coefficient and the control domain is not necessarily convex; the stochastic maximum principle was obtained there with the approach developed by Peng [4]. Dufour and Miller [21] studied a stochastic singular control problem in which the admissible control is of bounded variation. It is worth pointing out that the control systems in these works are stochastic differential equations with singular control, and few examples are given to illustrate the theoretical results. Wu and Zhang [22] were the first to study stochastic optimal control problems of forward-backward systems involving impulse controls, and they obtained both the maximum principle and sufficient optimality conditions for the optimal control problem.

In this paper, we continue the study of stochastic optimal control problems involving impulse controls, in which the control system is described by a forward-backward stochastic differential equation and the control variable consists of a regular control and an impulse control. Different from [22], it is assumed in this paper that the domain of the regular controls is not necessarily convex and that the control variable does not enter the diffusion coefficient. Hence neither the result of this paper nor that of [22] contains the other. We obtain the stochastic maximum principle by using a spike variation on the regular part of the control and a convex perturbation on the impulsive part. Sufficient optimality conditions are also obtained, which can help to find the optimal control in applications.

The rest of this paper is organized as follows. In Section 2 we give some preliminary results and the formulation of our stochastic optimal control problem. In Section 3 we establish the maximum principle for this problem. Sufficient optimality conditions are established in Section 4, where an example of a linear-quadratic optimization problem is also given to illustrate the applications of our theoretical results.

2. Formulation of the Stochastic Optimal Control Problem

Firstly we introduce some notation. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and $\mathbb{E}$ the expectation with respect to $\mathbb{P}$. Let $T$ be a finite time horizon and $\{\mathcal{F}_t\}$ the natural filtration of a $d$-dimensional standard Brownian motion $\{B_t,\,0\le t\le T\}$, augmented by the $\mathbb{P}$-null sets of $\mathcal{F}$. For $n\in\mathbb{N}$ and $p>1$, denote by $S^p(\mathbb{R}^n)$ the set of $n$-dimensional adapted processes $\{\varphi_t,\,0\le t\le T\}$ such that $\mathbb{E}[\sup_{0\le t\le T}|\varphi_t|^p]<\infty$, and denote by $H^p(\mathbb{R}^n)$ the set of $n$-dimensional adapted processes $\{\psi_t,\,0\le t\le T\}$ such that $\mathbb{E}[(\int_0^T|\psi_t|^2dt)^{p/2}]<\infty$.

Let $U$ be a nonempty subset of $\mathbb{R}^k$ and $K$ a nonempty convex subset of $\mathbb{R}^n$. Let $\{\tau_i\}$ be a given sequence of increasing $\mathcal{F}_t$-stopping times such that $\tau_i\uparrow+\infty$ as $i\to\infty$. We denote by $\mathcal{I}$ the class of right-continuous processes $\eta(\cdot)=\sum_{i\ge1}\eta_i\mathbf{1}_{[\tau_i,T]}(\cdot)$ such that each $\eta_i$ is an $\mathcal{F}_{\tau_i}$-measurable random variable. It is worth noting that the assumption $\tau_i\uparrow+\infty$ implies that at most finitely many impulses may occur on $[0,T]$. Denote by $\mathcal{U}$ the class of adapted processes $v:[0,T]\times\Omega\to U$ such that $\mathbb{E}[\sup_{0\le t\le T}|v_t|^3]<\infty$, and denote by $\mathcal{K}$ the class of $K$-valued impulse processes $\eta(\cdot)\in\mathcal{I}$ such that $\mathbb{E}[(\sum_{i\ge1}|\eta_i|)^3]<\infty$. We call $\mathcal{A}:=\mathcal{U}\times\mathcal{K}$ the admissible control set. In what follows, for a continuous function $l(\cdot)$, the integral $\int_0^T l(t)\,d\eta_t$ is understood as
$$\int_0^T l(t)\,d\eta_t=\sum_{0\le\tau_i\le T}l(\tau_i)\eta_i.\tag{2.1}$$
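A purely illustrative remark on (2.1): since $\tau_i\uparrow+\infty$, only finitely many impulse times fall in $[0,T]$, so the integral against $d\eta_t$ is a finite sum. The following minimal Python sketch (the function name `impulse_integral` and the sample data are our own, not part of the paper) makes this reduction explicit.

```python
import numpy as np

# Minimal sketch: an impulse process eta(.) is described by its impulse times tau_i
# and impulse sizes eta_i; the integral (2.1) of a continuous function l against
# d(eta) is just a finite sum over the impulse times falling inside [0, T].
def impulse_integral(l, taus, etas, T):
    """Return sum_{0 <= tau_i <= T} l(tau_i) * eta_i."""
    return sum(l(tau) * eta for tau, eta in zip(taus, etas) if tau <= T)

# Example: three impulses, only the first two occur before T = 1.5.
taus = [0.3, 0.9, 1.8]
etas = [1.0, -0.5, 2.0]
print(impulse_integral(np.cos, taus, etas, T=1.5))
```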

Given $\eta(\cdot)\in\mathcal{I}$ and $x\in\mathbb{R}^n$, we consider the following SDE with impulses:
$$dX_t=b(t,X_t)\,dt+\sigma(t,X_t)\,dB_t+C_t\,d\eta_t,\qquad X_0=x,\tag{2.2}$$
where $b:[0,T]\times\Omega\times\mathbb{R}^n\to\mathbb{R}^n$, $\sigma:[0,T]\times\Omega\times\mathbb{R}^n\to\mathbb{R}^{n\times d}$, and $C:[0,T]\to\mathbb{R}^{n\times n}$ are measurable mappings. Similar to [22, Proposition 2.1], we have the following.

Proposition 2.1. Let $C$ be continuous and $b$, $\sigma$ uniformly Lipschitz in $x$. Assume that $b(\cdot,0)\in H^p(\mathbb{R}^n)$, $\sigma(\cdot,0)\in H^p(\mathbb{R}^{n\times d})$, and $\mathbb{E}[(\sum_{i\ge1}|\eta_i|)^p]<\infty$ for some $p\ge2$. Then SDE (2.2) admits a unique solution $X(\cdot)\in S^p(\mathbb{R}^n)$.
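For intuition about the dynamics (2.2), the impulse term $C_t\,d\eta_t$ simply adds a jump $C_{\tau_i}\eta_i$ to the state at each impulse time. The following Euler-type simulation is a purely illustrative sketch in the scalar case; the specific coefficients $b(t,x)=-x$, $\sigma(t,x)=0.2$, $C_t\equiv1$ and all names are our own choices and play no role in the results above.

```python
import numpy as np

# Illustrative Euler scheme for the impulse SDE (2.2), scalar case.
# Assumed coefficients (not from the paper): b(t,x) = -x, sigma(t,x) = 0.2, C_t = 1.
def simulate_impulse_sde(x0, T, n_steps, taus, etas, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t_grid = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        t = t_grid[k]
        dB = rng.normal(0.0, np.sqrt(dt))
        x_next = x[k] + (-x[k]) * dt + 0.2 * dB       # drift and diffusion part
        for tau, eta in zip(taus, etas):              # impulses in (t, t + dt]
            if t < tau <= t + dt:
                x_next += 1.0 * eta                   # jump C_{tau_i} * eta_i
        x[k + 1] = x_next
    return t_grid, x

t_grid, path = simulate_impulse_sde(x0=1.0, T=1.0, n_steps=200,
                                    taus=[0.25, 0.7], etas=[0.5, -0.3])
print(path[-1])
```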

For $\eta(\cdot)\in\mathcal{I}$, let us consider the following BSDE with impulses:
$$dY_t=-f(t,Y_t,Z_t)\,dt+Z_t\,dB_t-D_t\,d\eta_t,\qquad Y_T=\zeta,\tag{2.3}$$
where $\zeta$ is an $\mathcal{F}_T$-measurable random variable, and $f:[0,T]\times\Omega\times\mathbb{R}^m\times\mathbb{R}^{m\times d}\to\mathbb{R}^m$ and $D:[0,T]\to\mathbb{R}^{m\times n}$ are measurable mappings. Similar to [22, Proposition 2.2], we have the following.

Proposition 2.2. Let $D$ be continuous and $f$ Lipschitz in $(y,z)$. Assume that $\mathbb{E}|\zeta|^p<\infty$, $\mathbb{E}[(\sum_{i\ge1}|\eta_i|)^p]<\infty$, and $f(\cdot,0,0)\in H^p(\mathbb{R}^m)$ for some $p\ge2$. Then BSDE (2.3) admits a unique solution $(Y(\cdot),Z(\cdot))\in S^p(\mathbb{R}^m)\times H^p(\mathbb{R}^{m\times d})$.

The control system of our stochastic optimal control problem is the following FBSDE:
$$\begin{aligned}
dx_t^{v,\eta}&=b\big(t,x_t^{v,\eta},v_t\big)dt+\sigma\big(t,x_t^{v,\eta}\big)dB_t+C_t\,d\eta_t,\\
dy_t^{v,\eta}&=-f\big(t,x_t^{v,\eta},y_t^{v,\eta},z_t^{v,\eta},v_t\big)dt+z_t^{v,\eta}\,dB_t-D_t\,d\eta_t,\\
x_0^{v,\eta}&=a\in\mathbb{R}^n,\qquad y_T^{v,\eta}=g\big(x_T^{v,\eta}\big),
\end{aligned}\tag{2.4}$$
where $b:[0,T]\times\mathbb{R}^n\times U\to\mathbb{R}^n$, $\sigma:[0,T]\times\mathbb{R}^n\to\mathbb{R}^{n\times d}$, $f:[0,T]\times\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^{m\times d}\times U\to\mathbb{R}^m$, and $g:\mathbb{R}^n\to\mathbb{R}^m$ are measurable mappings, and $C:[0,T]\to\mathbb{R}^{n\times n}$, $D:[0,T]\to\mathbb{R}^{m\times n}$ are continuous functions. The objective is to minimize over the class $\mathcal{A}$ the cost functional
$$J(v(\cdot),\eta(\cdot))=\mathbb{E}\Big[\phi\big(x_T^{v,\eta}\big)+\gamma\big(y_0^{v,\eta}\big)+\int_0^T h\big(t,x_t^{v,\eta},y_t^{v,\eta},v_t\big)dt+\sum_{i\ge1}l\big(\tau_i,\eta_i\big)\Big],\tag{2.5}$$
where $\phi:\mathbb{R}^n\to\mathbb{R}$, $\gamma:\mathbb{R}^m\to\mathbb{R}$, $h:[0,T]\times\mathbb{R}^n\times\mathbb{R}^m\times U\to\mathbb{R}$, and $l:[0,T]\times\mathbb{R}^n\to\mathbb{R}$ are measurable mappings.
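As a purely illustrative aside, once sample trajectories of $(x^{v,\eta},y^{v,\eta})$ under a fixed admissible control are available (for instance from some numerical FBSDE solver, which is outside the scope of this paper), the cost (2.5) is estimated by averaging its four terms over the sample paths. The sketch below assumes deterministic impulse data and vectorized scalar functions `phi`, `gamma`, `h`, `l`; all names and the dummy data are our own.

```python
import numpy as np

# Illustrative Monte Carlo assembly of the cost functional (2.5).
# x_paths, y_paths, v_paths: arrays of shape (n_paths, n_steps + 1), assumed given.
def cost_functional(t_grid, x_paths, y_paths, v_paths, taus, etas, phi, gamma, h, l):
    dt = t_grid[1] - t_grid[0]
    terminal = phi(x_paths[:, -1])                    # phi(x_T)
    initial = gamma(y_paths[:, 0])                    # gamma(y_0)
    running = np.sum(h(t_grid[:-1], x_paths[:, :-1],  # left-endpoint Riemann sum
                       y_paths[:, :-1], v_paths[:, :-1]) * dt, axis=1)
    impulse_cost = sum(l(tau, eta) for tau, eta in zip(taus, etas))
    return np.mean(terminal + initial + running + impulse_cost)

# Tiny usage with dummy data (2 paths, 4 time steps):
rng = np.random.default_rng(0)
t_grid = np.linspace(0.0, 1.0, 5)
x_paths, y_paths = rng.normal(size=(2, 5)), rng.normal(size=(2, 5))
v_paths = np.ones((2, 5))
print(cost_functional(t_grid, x_paths, y_paths, v_paths, taus=[0.5], etas=[0.2],
                      phi=lambda x: x**2, gamma=lambda y: y**2,
                      h=lambda t, x, y, v: x**2 + y**2 + v**2,
                      l=lambda tau, eta: eta**2))
```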

In what follows we assume the following.

(H1) $b$, $\sigma$, $f$, $g$ are continuous and continuously differentiable in $(x,y,z)$, with continuous and uniformly bounded derivatives. Moreover, $b$ and $f$ have linear growth in $(x,y,z,v)$.

(H2) $\phi$, $\gamma$, $h$, $l$ are continuous and continuously differentiable in $(x,y,\eta)$, with continuous derivatives bounded by $c(1+|x|)$, $c(1+|y|)$, $c(1+|x|+|y|+|v|)$, and $c(1+|\eta|)$, respectively. Moreover, we assume $|h(t,0,0,v)|\le c(1+|v|^3)$ for any $(t,v)$.

From Propositions 2.1 and 2.2, it follows that FBSDE (2.4) admits a unique solution $(x^{v,\eta}(\cdot),y^{v,\eta}(\cdot),z^{v,\eta}(\cdot))\in S^3(\mathbb{R}^n)\times S^3(\mathbb{R}^m)\times H^3(\mathbb{R}^{m\times d})$ for any $(v(\cdot),\eta(\cdot))\in\mathcal{A}$, so the functional $J$ is well defined.

3. Stochastic Maximum Principle for the Optimal Control Problem

Let $\big(u(\cdot),\xi(\cdot)=\sum_{i\ge1}\xi_i\mathbf{1}_{[\tau_i,T]}(\cdot)\big)\in\mathcal{A}$ be an optimal control and $(x^{u,\xi}(\cdot),y^{u,\xi}(\cdot),z^{u,\xi}(\cdot))$ the corresponding trajectory. We introduce the spike variation with respect to $u(\cdot)$ as follows:
$$u_t^{\varepsilon}=\begin{cases}v,&\text{if }\tau\le t\le\tau+\varepsilon,\\u_t,&\text{otherwise},\end{cases}\tag{3.1}$$
where $\tau\in[0,T)$ is an arbitrarily fixed time, $\varepsilon>0$ is a sufficiently small constant, and $v$ is an arbitrary $U$-valued $\mathcal{F}_\tau$-measurable random variable such that $\mathbb{E}|v|^3<\infty$. Let $\eta(\cdot)\in\mathcal{I}$ be such that $\xi(\cdot)+\eta(\cdot)\in\mathcal{K}$. Then it is easy to check that $\xi^{\varepsilon}(\cdot):=\xi(\cdot)+\varepsilon\eta(\cdot)$, $0\le\varepsilon\le1$, is also an element of $\mathcal{K}$. Let us denote by $(x^{\varepsilon}(\cdot),y^{\varepsilon}(\cdot),z^{\varepsilon}(\cdot))$ the trajectory associated with $(u^{\varepsilon}(\cdot),\xi^{\varepsilon}(\cdot))$. For convenience, denote $\varphi(t)=\varphi\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t\big)$ and $\varphi\big(u_t^{\varepsilon}\big)=\varphi\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t^{\varepsilon}\big)$ for $\varphi=b,\sigma,f,h,b_x,\sigma_x,f_x,f_y,f_z,h_x,h_y$. In what follows, we use $c$ to denote a positive constant which may vary from line to line.
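On a discrete time grid, the spike variation (3.1) simply overwrites the reference control on the small window $[\tau,\tau+\varepsilon]$, while the impulse part is perturbed convexly. The following Python fragment is only an illustrative discretization of (3.1) (the names and the sample reference control are our own); it plays no role in the proofs.

```python
import numpy as np

# Illustrative discretization of the spike variation (3.1):
# replace u by the value v on the window [tau, tau + eps].
def spike_variation(t_grid, u_path, tau, eps, v):
    u_eps = u_path.copy()
    window = (t_grid >= tau) & (t_grid <= tau + eps)
    u_eps[window] = v
    return u_eps

t_grid = np.linspace(0.0, 1.0, 101)
u_path = np.sin(2 * np.pi * t_grid)         # some reference control path
u_eps = spike_variation(t_grid, u_path, tau=0.4, eps=0.05, v=1.0)
print(np.count_nonzero(u_eps != u_path))    # number of perturbed grid points
```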

Let us introduce the following FBSDE (called the variational equation):
$$\begin{aligned}
dx_t^1&=\big[b_x(t)x_t^1+b\big(u_t^{\varepsilon}\big)-b(t)\big]dt+\sigma_x(t)x_t^1\,dB_t+\varepsilon C_t\,d\eta_t,\\
dy_t^1&=-\big[f_x(t)x_t^1+f_y(t)y_t^1+f_z(t)z_t^1+f\big(u_t^{\varepsilon}\big)-f(t)\big]dt+z_t^1\,dB_t-\varepsilon D_t\,d\eta_t,\\
x_0^1&=0,\qquad y_T^1=g_x\big(x_T^{u,\xi}\big)x_T^1.
\end{aligned}\tag{3.2}$$
By Propositions 2.1 and 2.2, FBSDE (3.2) admits a unique solution $(x^1(\cdot),y^1(\cdot),z^1(\cdot))\in S^3(\mathbb{R}^n)\times S^3(\mathbb{R}^m)\times H^3(\mathbb{R}^{m\times d})$.

Similar to [9, Lemma 1], we can easily obtain the following.

Lemma 3.1. We have
$$\sup_{0\le t\le T}\mathbb{E}\big[|x_t^1|^3\big]+\sup_{0\le t\le T}\mathbb{E}\big[|y_t^1|^3\big]+\mathbb{E}\Big[\Big(\int_0^T|z_t^1|^2dt\Big)^{3/2}\Big]\le c\varepsilon^3.\tag{3.3}$$

We proceed to give the following lemma.

Lemma 3.2. The following estimates hold:
$$\sup_{0\le t\le T}\mathbb{E}\big[|x_t^{\varepsilon}-x_t^{u,\xi}-x_t^1|^2\big]\le C_{\varepsilon}\varepsilon^2,\tag{3.4}$$
$$\sup_{0\le t\le T}\mathbb{E}\big[|y_t^{\varepsilon}-y_t^{u,\xi}-y_t^1|^2\big]\le C_{\varepsilon}\varepsilon^2,\tag{3.5}$$
$$\mathbb{E}\Big[\int_0^T|z_t^{\varepsilon}-z_t^{u,\xi}-z_t^1|^2dt\Big]\le C_{\varepsilon}\varepsilon^2,\tag{3.6}$$
where $C_{\varepsilon}\to0$ as $\varepsilon\to0$.

Proof. It is easy to check that
$$x_t^{\varepsilon}-x_t^{u,\xi}-x_t^1=\int_0^t\Big[C_s^{\varepsilon}\big(x_s^{\varepsilon}-x_s^{u,\xi}-x_s^1\big)+A_s^{\varepsilon}\Big]ds+\int_0^t\Big[D_s^{\varepsilon}\big(x_s^{\varepsilon}-x_s^{u,\xi}-x_s^1\big)+B_s^{\varepsilon}\Big]dB_s,\tag{3.7}$$
where
$$\begin{aligned}
A_s^{\varepsilon}&=\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,u_s^{\varepsilon}\big)-b_x(s)\big]d\lambda\,x_s^1,\qquad
B_s^{\varepsilon}=\int_0^1\big[\sigma_x\big(s,x_s^{u,\xi}+\lambda x_s^1\big)-\sigma_x(s)\big]d\lambda\,x_s^1,\\
C_s^{\varepsilon}&=\int_0^1 b_x\big(s,x_s^{u,\xi}+x_s^1+\lambda(x_s^{\varepsilon}-x_s^{u,\xi}-x_s^1),u_s^{\varepsilon}\big)d\lambda,\qquad
D_s^{\varepsilon}=\int_0^1\sigma_x\big(s,x_s^{u,\xi}+x_s^1+\lambda(x_s^{\varepsilon}-x_s^{u,\xi}-x_s^1)\big)d\lambda.
\end{aligned}\tag{3.8}$$
Since $b_x$, $\sigma_x$ are uniformly bounded, we have $\sup_{0\le s\le T}(|C_s^{\varepsilon}|+|D_s^{\varepsilon}|)\le c$. Hence, if we can obtain
$$\sup_{0\le t\le T}\mathbb{E}\Big[\int_0^t A_s^{\varepsilon}\,ds+\int_0^t B_s^{\varepsilon}\,dB_s\Big]^2\le C_{\varepsilon}\varepsilon^2,\tag{3.9}$$
then estimate (3.4) follows from Gronwall's lemma and (3.7). Let us take the $A^{\varepsilon}$ term for example. By the definition of $u^{\varepsilon}$ and Hölder's inequality, we have
$$\begin{aligned}
\sup_{0\le t\le T}\mathbb{E}\Big[\int_0^t A_s^{\varepsilon}\,ds\Big]^2
\le{}&2\,\mathbb{E}\Big[\int_0^T\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,u_s\big)-b_x(s)\big]d\lambda\,x_s^1\Big|\,ds\Big]^2\\
&+2\,\mathbb{E}\Big[\int_{\tau}^{\tau+\varepsilon}\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,v\big)-b_x(s)\big]d\lambda\,x_s^1\Big|\,ds\Big]^2=:2I+2II.
\end{aligned}\tag{3.10}$$
From Hölder's inequality, Lemma 3.1, and the dominated convergence theorem, it follows that
$$\begin{aligned}
I&\le T\,\mathbb{E}\Big[\int_0^T\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,u_s\big)-b_x(s)\big]d\lambda\,x_s^1\Big|^2 ds\Big]\\
&\le T\int_0^T\Big(\mathbb{E}\big[|x_s^1|^3\big]\Big)^{2/3}\Big(\mathbb{E}\Big[\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,u_s\big)-b_x(s)\big]d\lambda\Big|^6\Big]\Big)^{1/3}ds\\
&\le T^{5/3}\Big(\sup_{0\le s\le T}\mathbb{E}\big[|x_s^1|^3\big]\Big)^{2/3}\Big(\int_0^T\mathbb{E}\Big[\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,u_s\big)-b_x(s)\big]d\lambda\Big|^6\Big]ds\Big)^{1/3}\le C_{\varepsilon}\varepsilon^2.
\end{aligned}\tag{3.11}$$
Since $b_x$ is uniformly bounded, by Lemma 3.1 we get
$$II\le\varepsilon\int_{\tau}^{\tau+\varepsilon}\mathbb{E}\Big[\Big|\int_0^1\big[b_x\big(s,x_s^{u,\xi}+\lambda x_s^1,v\big)-b_x(s)\big]d\lambda\,x_s^1\Big|^2\Big]ds\le c\varepsilon^2\sup_{0\le s\le T}\mathbb{E}\big[|x_s^1|^2\big]\le c\varepsilon^4.\tag{3.12}$$
Thus we obtain $\sup_{0\le t\le T}\mathbb{E}\big[\int_0^t A_s^{\varepsilon}\,ds\big]^2\le C_{\varepsilon}\varepsilon^2$. In the same way we can get
$$\sup_{0\le t\le T}\mathbb{E}\Big[\int_0^t B_s^{\varepsilon}\,dB_s\Big]^2\le C_{\varepsilon}\varepsilon^2.\tag{3.13}$$
Hence, estimate (3.4) is proved.

Now we prove (3.5) and (3.6). Set
$$\begin{aligned}
X_s^{\varepsilon}&=x_s^{\varepsilon}-x_s^{u,\xi}-x_s^1,\qquad Y_s^{\varepsilon}=y_s^{\varepsilon}-y_s^{u,\xi}-y_s^1,\qquad Z_s^{\varepsilon}=z_s^{\varepsilon}-z_s^{u,\xi}-z_s^1,\\
\Pi_s^{\varepsilon}&=\big(s,\,x_s^{u,\xi}+x_s^1+\lambda X_s^{\varepsilon},\,y_s^{u,\xi}+y_s^1+\lambda Y_s^{\varepsilon},\,z_s^{u,\xi}+z_s^1+\lambda Z_s^{\varepsilon},\,u_s^{\varepsilon}\big),\\
\Lambda_s^{\varepsilon}&=\big(s,\,x_s^{u,\xi}+\lambda x_s^1,\,y_s^{u,\xi}+\lambda y_s^1,\,z_s^{u,\xi}+\lambda z_s^1,\,u_s^{\varepsilon}\big).
\end{aligned}\tag{3.14}$$
It is easy to obtain
$$\begin{aligned}
Y_t^{\varepsilon}={}&g\big(x_T^{\varepsilon}\big)-g\big(x_T^{u,\xi}\big)-g_x\big(x_T^{u,\xi}\big)x_T^1-\int_t^T Z_s^{\varepsilon}\,dB_s\\
&+\int_t^T\big(E_s^{1,\varepsilon}x_s^1+E_s^{2,\varepsilon}y_s^1+E_s^{3,\varepsilon}z_s^1\big)ds+\int_t^T\big(F_s^{1,\varepsilon}X_s^{\varepsilon}+F_s^{2,\varepsilon}Y_s^{\varepsilon}+F_s^{3,\varepsilon}Z_s^{\varepsilon}\big)ds,
\end{aligned}\tag{3.15}$$
where
$$\begin{aligned}
E_s^{1,\varepsilon}&=\int_0^1\big[f_x\big(\Lambda_s^{\varepsilon}\big)-f_x(s)\big]d\lambda,\qquad
E_s^{2,\varepsilon}=\int_0^1\big[f_y\big(\Lambda_s^{\varepsilon}\big)-f_y(s)\big]d\lambda,\qquad
E_s^{3,\varepsilon}=\int_0^1\big[f_z\big(\Lambda_s^{\varepsilon}\big)-f_z(s)\big]d\lambda,\\
F_s^{1,\varepsilon}&=\int_0^1 f_x\big(\Pi_s^{\varepsilon}\big)d\lambda,\qquad
F_s^{2,\varepsilon}=\int_0^1 f_y\big(\Pi_s^{\varepsilon}\big)d\lambda,\qquad
F_s^{3,\varepsilon}=\int_0^1 f_z\big(\Pi_s^{\varepsilon}\big)d\lambda.
\end{aligned}\tag{3.16}$$
We have
$$\begin{aligned}
g\big(x_T^{\varepsilon}\big)-g\big(x_T^{u,\xi}\big)-g_x\big(x_T^{u,\xi}\big)x_T^1
&=g\big(x_T^{\varepsilon}\big)-g\big(x_T^{u,\xi}+x_T^1\big)+g\big(x_T^{u,\xi}+x_T^1\big)-g\big(x_T^{u,\xi}\big)-g_x\big(x_T^{u,\xi}\big)x_T^1\\
&=\int_0^1 g_x\big(x_T^{u,\xi}+x_T^1+\lambda X_T^{\varepsilon}\big)d\lambda\,X_T^{\varepsilon}+\int_0^1\big[g_x\big(x_T^{u,\xi}+\lambda x_T^1\big)-g_x\big(x_T^{u,\xi}\big)\big]d\lambda\,x_T^1\\
&=:I+II.
\end{aligned}\tag{3.17}$$
Since $g_x$ is uniformly bounded, it follows from (3.4) that $\mathbb{E}|I|^2\le c\,\mathbb{E}|X_T^{\varepsilon}|^2\le C_{\varepsilon}\varepsilon^2$. Since $g_x$ is continuous and uniformly bounded, from Lemma 3.1 and the dominated convergence theorem it follows that
$$\mathbb{E}|II|^2\le\Big(\sup_{0\le t\le T}\mathbb{E}\big[|x_t^1|^3\big]\Big)^{2/3}\Big(\mathbb{E}\Big[\Big|\int_0^1\big[g_x\big(x_T^{u,\xi}+\lambda x_T^1\big)-g_x\big(x_T^{u,\xi}\big)\big]d\lambda\Big|^6\Big]\Big)^{1/3}\le C_{\varepsilon}\varepsilon^2.\tag{3.18}$$
Consequently,
$$\mathbb{E}\Big[\big|g\big(x_T^{\varepsilon}\big)-g\big(x_T^{u,\xi}\big)-g_x\big(x_T^{u,\xi}\big)x_T^1\big|^2\Big]\le2\,\mathbb{E}|I|^2+2\,\mathbb{E}|II|^2\le C_{\varepsilon}\varepsilon^2.\tag{3.19}$$
From Lemma 3.1 and the dominated convergence theorem, it follows that
$$\sup_{0\le t\le T}\mathbb{E}\Big[\int_t^T\big(E_s^{1,\varepsilon}x_s^1+E_s^{2,\varepsilon}y_s^1+E_s^{3,\varepsilon}z_s^1\big)ds\Big]^2\le C_{\varepsilon}\varepsilon^2.\tag{3.20}$$
Since $f_x$, $f_y$, and $f_z$ are uniformly bounded, we have
$$\sup_{0\le s\le T}\big(\big|F_s^{1,\varepsilon}\big|+\big|F_s^{2,\varepsilon}\big|+\big|F_s^{3,\varepsilon}\big|\big)\le c.\tag{3.21}$$
Similar to the proof of Lemma 1 in [9] for the BSDE part, we can obtain (3.5) and (3.6) by the iterative method.

We are now ready to state the variational inequality.

Lemma 3.3. The following variational inequality holds:
$$\mathbb{E}\Big[\phi_x\big(x_T^{u,\xi}\big)x_T^1+\gamma_y\big(y_0^{u,\xi}\big)y_0^1+\varepsilon\sum_{i\ge1}l_{\xi}\big(\tau_i,\xi_i\big)\eta_i\Big]+\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1+h\big(u_t^{\varepsilon}\big)-h(t)\big)dt\Big]\ge o(\varepsilon).\tag{3.22}$$

Proof. From the optimality of $(u(\cdot),\xi(\cdot))$, we have
$$J\big(u^{\varepsilon}(\cdot),\xi^{\varepsilon}(\cdot)\big)-J\big(u(\cdot),\xi(\cdot)\big)\ge0.\tag{3.23}$$
From Lemmas 3.1 and 3.2, it follows that
$$\mathbb{E}\big[\phi\big(x_T^{\varepsilon}\big)-\phi\big(x_T^{u,\xi}+x_T^1\big)\big]=o(\varepsilon),\qquad
\mathbb{E}\big[\phi\big(x_T^{u,\xi}+x_T^1\big)-\phi\big(x_T^{u,\xi}\big)\big]=\mathbb{E}\big[\phi_x\big(x_T^{u,\xi}\big)x_T^1\big]+o(\varepsilon).\tag{3.24}$$
Hence,
$$\mathbb{E}\big[\phi\big(x_T^{\varepsilon}\big)-\phi\big(x_T^{u,\xi}\big)\big]=\mathbb{E}\big[\phi_x\big(x_T^{u,\xi}\big)x_T^1\big]+o(\varepsilon).\tag{3.25}$$
Similarly, we get
$$\mathbb{E}\big[\gamma\big(y_0^{\varepsilon}\big)-\gamma\big(y_0^{u,\xi}\big)\big]=\mathbb{E}\big[\gamma_y\big(y_0^{u,\xi}\big)y_0^1\big]+o(\varepsilon),\qquad
\mathbb{E}\Big[\sum_{i\ge1}l\big(\tau_i,\xi_i+\varepsilon\eta_i\big)-\sum_{i\ge1}l\big(\tau_i,\xi_i\big)\Big]=\varepsilon\,\mathbb{E}\Big[\sum_{i\ge1}l_{\xi}\big(\tau_i,\xi_i\big)\eta_i\Big]+o(\varepsilon),\tag{3.26}$$
while
$$\begin{aligned}
\mathbb{E}\Big[\int_0^T\big(h\big(t,x_t^{\varepsilon},y_t^{\varepsilon},u_t^{\varepsilon}\big)-h(t)\big)dt\Big]
={}&\mathbb{E}\Big[\int_0^T\big(h\big(t,x_t^{\varepsilon},y_t^{\varepsilon},u_t^{\varepsilon}\big)-h\big(t,x_t^{u,\xi}+x_t^1,y_t^{u,\xi}+y_t^1,u_t^{\varepsilon}\big)\big)dt\Big]\\
&+\mathbb{E}\Big[\int_0^T\big(h\big(t,x_t^{u,\xi}+x_t^1,y_t^{u,\xi}+y_t^1,u_t^{\varepsilon}\big)-h\big(u_t^{\varepsilon}\big)\big)dt\Big]+\mathbb{E}\Big[\int_0^T\big(h\big(u_t^{\varepsilon}\big)-h(t)\big)dt\Big]\\
=:{}&I+II+\mathbb{E}\Big[\int_0^T\big(h\big(u_t^{\varepsilon}\big)-h(t)\big)dt\Big].
\end{aligned}\tag{3.27}$$
Since $h_x$ and $h_y$ have linear growth, it follows from Lemma 3.2 and Hölder's inequality that
$$I=\mathbb{E}\Big[\int_0^T\int_0^1\big(h_x\big(\Pi_t^{\varepsilon}\big)X_t^{\varepsilon}+h_y\big(\Pi_t^{\varepsilon}\big)Y_t^{\varepsilon}\big)d\lambda\,dt\Big]=o(\varepsilon).\tag{3.28}$$
By Lemma 3.1 and the dominated convergence theorem, we have
$$\begin{aligned}
II&=\mathbb{E}\Big[\int_0^T\int_0^1\big(h_x\big(\Lambda_t^{\varepsilon}\big)x_t^1+h_y\big(\Lambda_t^{\varepsilon}\big)y_t^1\big)d\lambda\,dt\Big]
=\mathbb{E}\Big[\int_0^T\big(h_x\big(u_t^{\varepsilon}\big)x_t^1+h_y\big(u_t^{\varepsilon}\big)y_t^1\big)dt\Big]+o(\varepsilon)\\
&=\mathbb{E}\Big[\int_0^T\big(\big(h_x\big(u_t^{\varepsilon}\big)-h_x(t)\big)x_t^1+\big(h_y\big(u_t^{\varepsilon}\big)-h_y(t)\big)y_t^1\big)dt\Big]+\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1\big)dt\Big]+o(\varepsilon)\\
&=\mathbb{E}\Big[\int_{\tau}^{\tau+\varepsilon}\big(\big(h_x(t,v)-h_x(t)\big)x_t^1+\big(h_y(t,v)-h_y(t)\big)y_t^1\big)dt\Big]+\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1\big)dt\Big]+o(\varepsilon),
\end{aligned}\tag{3.29}$$
where $\varphi(t,v)=\varphi\big(t,x_t^{u,\xi},y_t^{u,\xi},v\big)$ for $\varphi=h_x,h_y$. It follows from Hölder's inequality that
$$\begin{aligned}
II\le{}&\Big(\mathbb{E}\int_{\tau}^{\tau+\varepsilon}\big|h_x(t,v)-h_x(t)\big|^2dt\Big)^{1/2}\Big(\mathbb{E}\int_0^T\big|x_t^1\big|^2dt\Big)^{1/2}
+\Big(\mathbb{E}\int_{\tau}^{\tau+\varepsilon}\big|h_y(t,v)-h_y(t)\big|^2dt\Big)^{1/2}\Big(\mathbb{E}\int_0^T\big|y_t^1\big|^2dt\Big)^{1/2}\\
&+\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1\big)dt\Big]+o(\varepsilon).
\end{aligned}\tag{3.30}$$
Using Lemma 3.1 again, we get
$$II\le\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1\big)dt\Big]+o(\varepsilon).\tag{3.31}$$
Consequently,
$$\mathbb{E}\Big[\int_0^T\big(h\big(t,x_t^{\varepsilon},y_t^{\varepsilon},u_t^{\varepsilon}\big)-h(t)\big)dt\Big]=\mathbb{E}\Big[\int_0^T\big(h_x(t)x_t^1+h_y(t)y_t^1+h\big(u_t^{\varepsilon}\big)-h(t)\big)dt\Big]+o(\varepsilon).\tag{3.32}$$
The variational inequality follows from (3.25)–(3.32).

Now we introduce the following FBSDE (called the adjoint equation):
$$\begin{aligned}
dp_t&=\big[f_y^{*}(t)p_t-h_y^{*}(t)\big]dt+f_z^{*}(t)p_t\,dB_t,\\
dq_t&=\big[f_x^{*}(t)p_t-b_x^{*}(t)q_t-\sigma_x^{*}(t)k_t-h_x^{*}(t)\big]dt+k_t\,dB_t,\\
p_0&=-\gamma_y^{*}\big(y_0^{u,\xi}\big),\qquad q_T=-g_x^{*}\big(x_T^{u,\xi}\big)p_T+\phi_x^{*}\big(x_T^{u,\xi}\big).
\end{aligned}\tag{3.33}$$

It is easy to check that the adjoint equation admits a unique solution $(p(\cdot),q(\cdot),k(\cdot))\in S^3(\mathbb{R}^m)\times S^3(\mathbb{R}^n)\times H^3(\mathbb{R}^{n\times d})$.

We are now in a position to state the stochastic maximum principle.

Theorem 3.4. Let $(u(\cdot),\xi(\cdot))$ be an optimal control, $(x^{u,\xi}(\cdot),y^{u,\xi}(\cdot),z^{u,\xi}(\cdot))$ the corresponding trajectory, and $(p(\cdot),q(\cdot),k(\cdot))$ the solution of the adjoint equation. Then for any $v\in U$ and $\eta(\cdot)\in\mathcal{K}$ it holds that
$$H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},v,p_t,q_t,k_t\big)-H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t,p_t,q_t,k_t\big)\ge0,\qquad\text{a.e., a.s.},\tag{3.34}$$
$$\mathbb{E}\Big[\sum_{i\ge1}\big(l_{\xi}\big(\tau_i,\xi_i\big)+q_{\tau_i}^{*}C_{\tau_i}-p_{\tau_i}^{*}D_{\tau_i}\big)\big(\eta_i-\xi_i\big)\Big]\ge0,\tag{3.35}$$
where $H:[0,T]\times\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^{m\times d}\times U\times\mathbb{R}^m\times\mathbb{R}^n\times\mathbb{R}^{n\times d}\to\mathbb{R}$ is defined by
$$H(t,x,y,z,v,p,q,k)=-\langle p,f(t,x,y,z,v)\rangle+\langle q,b(t,x,v)\rangle+\langle k,\sigma(t,x)\rangle+h(t,x,y,v).\tag{3.36}$$
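As a purely illustrative aside, the Hamiltonian (3.36) is a scalar combination of the coefficients, and in the scalar case condition (3.34) amounts to a pointwise minimization of $v\mapsto H(t,x_t,y_t,z_t,v,p_t,q_t,k_t)$ over $U$. A small sketch (all function names and the sample coefficients are our own, and only the scalar case is written) could read as follows.

```python
# Illustrative scalar-case evaluator of the Hamiltonian (3.36):
# H = -p*f(t,x,y,z,v) + q*b(t,x,v) + k*sigma(t,x) + h(t,x,y,v).
def hamiltonian(t, x, y, z, v, p, q, k, b, sigma, f, h):
    return -p * f(t, x, y, z, v) + q * b(t, x, v) + k * sigma(t, x) + h(t, x, y, v)

# For a finite control set U (as in Example 4.2 below), the pointwise
# minimization required by (3.34) can be carried out by enumeration.
def minimize_over_U(t, x, y, z, p, q, k, U, b, sigma, f, h):
    return min(U, key=lambda v: hamiltonian(t, x, y, z, v, p, q, k, b, sigma, f, h))

# Tiny usage with made-up scalar coefficients:
b = lambda t, x, v: -x + v
sigma = lambda t, x: 0.2 * x
f = lambda t, x, y, z, v: x + 0.5 * y + v
h = lambda t, x, y, v: 0.5 * (x**2 + y**2 + v**2)
print(minimize_over_U(0.0, 1.0, 0.5, 0.1, p=0.3, q=-0.2, k=0.1,
                      U=[-1.0, 1.0], b=b, sigma=sigma, f=f, h=h))
```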

Proof. Applying Itô's formula to $\langle p_t,y_t^1\rangle+\langle q_t,x_t^1\rangle$, by Lemma 3.3 we derive
$$\begin{aligned}
&\mathbb{E}\Big[\int_0^T\Big(H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t^{\varepsilon},p_t,q_t,k_t\big)-H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t,p_t,q_t,k_t\big)\Big)dt\Big]\\
&\qquad+\varepsilon\,\mathbb{E}\Big[\sum_{i\ge1}\big(l_{\xi}\big(\tau_i,\xi_i\big)+q_{\tau_i}^{*}C_{\tau_i}-p_{\tau_i}^{*}D_{\tau_i}\big)\eta_i\Big]\ge o(\varepsilon),
\end{aligned}\tag{3.37}$$
where $\eta(\cdot)\in\mathcal{I}$ satisfies $\xi(\cdot)+\eta(\cdot)\in\mathcal{K}$. Dividing (3.37) by $\varepsilon$ and letting $\varepsilon$ go to 0, we obtain
$$\begin{aligned}
&\mathbb{E}\Big[H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},v,p_\tau,q_\tau,k_\tau\big)-H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},u_\tau,p_\tau,q_\tau,k_\tau\big)\Big]\\
&\qquad+\mathbb{E}\Big[\sum_{i\ge1}\big(l_{\xi}\big(\tau_i,\xi_i\big)+q_{\tau_i}^{*}C_{\tau_i}-p_{\tau_i}^{*}D_{\tau_i}\big)\eta_i\Big]\ge0,\qquad\text{a.e. }\tau\in[0,T].
\end{aligned}\tag{3.38}$$
By choosing $v=u_\tau$ in (3.38) we obtain conclusion (3.35). If we choose $\eta(\cdot)\equiv0$, then for every $U$-valued $\mathcal{F}_\tau$-measurable random variable $v$ satisfying $\mathbb{E}|v|^3<\infty$ we have
$$\mathbb{E}\Big[H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},v,p_\tau,q_\tau,k_\tau\big)-H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},u_\tau,p_\tau,q_\tau,k_\tau\big)\Big]\ge0.\tag{3.39}$$
Now let us set $v_\tau=v\mathbf{1}_A+u_\tau\mathbf{1}_{A^c}$ for any $v\in U$ and $A\in\mathcal{F}_\tau$. Then it is obvious that $v_\tau$ is $\mathcal{F}_\tau$-measurable and $\mathbb{E}|v_\tau|^3<\infty$. So from (3.39) it follows that, for any $A\in\mathcal{F}_\tau$,
$$\mathbb{E}\Big[\mathbf{1}_A\Big(H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},v,p_\tau,q_\tau,k_\tau\big)-H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},u_\tau,p_\tau,q_\tau,k_\tau\big)\Big)\Big]\ge0.\tag{3.40}$$
Hence,
$$\mathbb{E}\Big[H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},v,p_\tau,q_\tau,k_\tau\big)-H\big(\tau,x_\tau^{u,\xi},y_\tau^{u,\xi},z_\tau^{u,\xi},u_\tau,p_\tau,q_\tau,k_\tau\big)\,\Big|\,\mathcal{F}_\tau\Big]\ge0,\qquad\forall v\in U.\tag{3.41}$$
Since the quantity inside the conditional expectation is $\mathcal{F}_\tau$-measurable, conclusion (3.34) follows easily.

Similar to [22, Corollary 3.1], by Theorem 3.4 we can easily obtain the following corollary.

Corollary 3.5. Assume $K=\mathbb{R}^n$. Then for the optimal control $(u(\cdot),\xi(\cdot))$ it holds that
$$\begin{aligned}
&H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},v,p_t,q_t,k_t\big)-H\big(t,x_t^{u,\xi},y_t^{u,\xi},z_t^{u,\xi},u_t,p_t,q_t,k_t\big)\ge0,\qquad\forall v\in U,\ \text{a.e., a.s.},\\
&l_{\xi}\big(\tau_i,\xi_i\big)+q_{\tau_i}^{*}C_{\tau_i}-p_{\tau_i}^{*}D_{\tau_i}=0,\qquad i\ge1,\ \text{a.s.}
\end{aligned}\tag{3.42}$$

Remark 3.6. We can still obtain the stochastic maximum principle if the assumptions are relaxed in the following way.
(i) The regular control process $v(\cdot)$ and the impulse control process $\eta(\cdot)$ are assumed to satisfy $\mathbb{E}[\sup_{0\le t\le T}|v_t|^p]<\infty$ and $\mathbb{E}[(\sum_{i\ge1}|\eta_i|)^p]<\infty$ for some $p\in(2,3)$.
(ii) The assumption $|h(t,0,0,v)|\le c(1+|v|^3)$ in hypothesis (H2) can be weakened to $|h(t,0,0,v)|\le c(1+|v|^p)$.
(iii) In the spike variation setting, the random variable $v$ is assumed to satisfy $\mathbb{E}|v|^p<\infty$.
In fact, under these new assumptions both the solution of the control system (2.4) and the solution of the variational equation (3.2) belong to $S^p(\mathbb{R}^n)\times S^p(\mathbb{R}^m)\times H^p(\mathbb{R}^{m\times d})$. The conclusion of Lemma 3.1 becomes
$$\sup_{0\le t\le T}\mathbb{E}\big[|x_t^1|^p\big]+\sup_{0\le t\le T}\mathbb{E}\big[|y_t^1|^p\big]+\mathbb{E}\Big[\Big(\int_0^T|z_t^1|^2dt\Big)^{p/2}\Big]\le c\varepsilon^p,\tag{3.43}$$
and Lemmas 3.2 and 3.3 still hold true.

4. Sufficient Optimality Conditions for Optimal Controls

We still denote by $(x^{v,\eta}(\cdot),y^{v,\eta}(\cdot),z^{v,\eta}(\cdot))$ the trajectory corresponding to $(v(\cdot),\eta(\cdot))\in\mathcal{A}$. Let us first introduce an additional assumption.

(H3) The control domain $U$ is a convex body in $\mathbb{R}^k$. The maps $b$, $f$, and $h$ are locally Lipschitz in the regular control variable $v$.

Theorem 4.1. Let (H1)–(H3) hold. Assume that the functions $\phi$, $\gamma$, $\eta\mapsto l(t,\eta)$, and $(x,y,z,v)\mapsto H(t,x,y,z,v,p,q,k)$ are convex, and that $y_T^{v,\eta}$ has the particular form $y_T^{v,\eta}=Kx_T^{v,\eta}+\zeta$ with $K\in\mathbb{R}^{m\times n}$ and $\zeta\in L^3(\Omega,\mathcal{F}_T,\mathbb{P};\mathbb{R}^m)$. Let $(p^{u,\xi},q^{u,\xi},k^{u,\xi})$ be the solution of the adjoint equation associated with $(u,\xi)\in\mathcal{A}$. Then $(u,\xi)$ is an optimal control of the stochastic optimal control problem if it satisfies (3.34) and (3.35).

Proof. Set $\widetilde{J}=J(v(\cdot),\eta(\cdot))-J(u(\cdot),\xi(\cdot))$. Since $\phi$, $\gamma$, and $\eta\mapsto l(t,\eta)$ are convex, we have
$$\begin{aligned}
\phi\big(x_T^{v,\eta}\big)-\phi\big(x_T^{u,\xi}\big)&\ge\phi_x\big(x_T^{u,\xi}\big)\big(x_T^{v,\eta}-x_T^{u,\xi}\big),\\
\gamma\big(y_0^{v,\eta}\big)-\gamma\big(y_0^{u,\xi}\big)&\ge\gamma_y\big(y_0^{u,\xi}\big)\big(y_0^{v,\eta}-y_0^{u,\xi}\big),\\
\sum_{i\ge1}l\big(\tau_i,\eta_i\big)-\sum_{i\ge1}l\big(\tau_i,\xi_i\big)&\ge\sum_{i\ge1}l_{\xi}\big(\tau_i,\xi_i\big)\big(\eta_i-\xi_i\big).
\end{aligned}\tag{4.1}$$
Thus,
$$\begin{aligned}
\widetilde{J}\ge{}&\mathbb{E}\Big[\phi_x\big(x_T^{u,\xi}\big)\big(x_T^{v,\eta}-x_T^{u,\xi}\big)+\gamma_y\big(y_0^{u,\xi}\big)\big(y_0^{v,\eta}-y_0^{u,\xi}\big)\Big]\\
&+\mathbb{E}\Big[\int_0^T\big(h\big(t,x_t^{v,\eta},y_t^{v,\eta},v_t\big)-h\big(t,x_t^{u,\xi},y_t^{u,\xi},u_t\big)\big)dt\Big]+\mathbb{E}\Big[\sum_{i\ge1}l_{\xi}\big(\tau_i,\xi_i\big)\big(\eta_i-\xi_i\big)\Big].
\end{aligned}\tag{4.2}$$
Set $\mathcal{H}^{v,\eta}(t):=H\big(t,x_t^{v,\eta},y_t^{v,\eta},z_t^{v,\eta},v_t,p_t^{u,\xi},q_t^{u,\xi},k_t^{u,\xi}\big)$. Then, applying Itô's formula to $\langle q_t^{u,\xi},x_t^{v,\eta}-x_t^{u,\xi}\rangle+\langle p_t^{u,\xi},y_t^{v,\eta}-y_t^{u,\xi}\rangle$, we get $\widetilde{J}\ge\Xi+\Theta$, where
$$\begin{aligned}
\Xi&=\mathbb{E}\Big[\sum_{i\ge1}\big(l_{\xi}\big(\tau_i,\xi_i\big)+q_{\tau_i}^{*}C_{\tau_i}-p_{\tau_i}^{*}D_{\tau_i}\big)\big(\eta_i-\xi_i\big)\Big],\\
\Theta&=\mathbb{E}\Big[\int_0^T\Big(\mathcal{H}^{v,\eta}(t)-\mathcal{H}^{u,\xi}(t)-\mathcal{H}_x^{u,\xi}(t)\big(x_t^{v,\eta}-x_t^{u,\xi}\big)-\mathcal{H}_y^{u,\xi}(t)\big(y_t^{v,\eta}-y_t^{u,\xi}\big)-\mathcal{H}_z^{u,\xi}(t)\big(z_t^{v,\eta}-z_t^{u,\xi}\big)\Big)dt\Big].
\end{aligned}\tag{4.3}$$
From (3.35) we have $\Xi\ge0$. By (3.34) and [23, Lemma 2.3(iii), Chapter 3], we have $0\in\partial_u\mathcal{H}^{u,\xi}(t)$. By [23, Lemma 2.4, Chapter 3], we further conclude that
$$\big(\mathcal{H}_x^{u,\xi}(t),\mathcal{H}_y^{u,\xi}(t),\mathcal{H}_z^{u,\xi}(t),0\big)\in\partial_{x,y,z,u}\mathcal{H}^{u,\xi}(t).\tag{4.4}$$
Then, by [23, Lemma 2.3(v), Chapter 3] and the convexity of $H(t,\cdot,\cdot,\cdot,\cdot,p,q,k)$, we obtain
$$\mathcal{H}^{v,\eta}(t)-\mathcal{H}^{u,\xi}(t)\ge\mathcal{H}_x^{u,\xi}(t)\big(x_t^{v,\eta}-x_t^{u,\xi}\big)+\mathcal{H}_y^{u,\xi}(t)\big(y_t^{v,\eta}-y_t^{u,\xi}\big)+\mathcal{H}_z^{u,\xi}(t)\big(z_t^{v,\eta}-z_t^{u,\xi}\big),\tag{4.5}$$
from which it follows immediately that $\Theta\ge0$. Thus we obtain $\widetilde{J}\ge0$, and the proof is complete.

We now give an example of a linear-quadratic optimal control problem involving impulse controls to illustrate the application of our theoretical results.

Example 4.2. For simplicity, assume that the variables and coefficients are scalar valued. Let us take $U=\{-1,1\}$ and $K=\mathbb{R}$. The set $U$ consists of only the two values $-1$ and $1$, which is a common situation in practice and represents the two control states "on" and "off". For $(v(\cdot),\eta(\cdot))\in\mathcal{A}$, the controlled system is the following linear FBSDE:
$$\begin{aligned}
dx_t&=\big(Ax_t+Bv_t\big)dt+Cx_t\,dB_t+H\,d\eta_t,\\
dy_t&=-\big(Dx_t+Ey_t+Fz_t+Gv_t\big)dt+z_t\,dB_t-R\,d\eta_t,\\
x_0&=a,\qquad y_T=gx_T,
\end{aligned}\tag{4.6}$$
and the cost functional is given by
$$J(v(\cdot),\eta(\cdot))=\frac{1}{2}\,\mathbb{E}\Big[Wx_T^2+\gamma y_0^2+\int_0^T\big(Mx_t^2+Ny_t^2+Qv_t^2\big)dt+L\sum_{i\ge1}\eta_i^2\Big].\tag{4.7}$$
The coefficients are deterministic constants such that $W,\gamma,M,N\ge0$ and $Q,L>0$. By Propositions 2.1 and 2.2 we know that the control system admits a unique solution $(x(\cdot),y(\cdot),z(\cdot))\in S^3(\mathbb{R})\times S^3(\mathbb{R})\times H^3(\mathbb{R})$ for any $(v,\eta)\in\mathcal{A}$, and the functional $J$ is well defined from $\mathcal{A}$ into $\mathbb{R}$.
Let $\big(u(\cdot),\xi(\cdot)=\sum_{i\ge1}\xi_i\mathbf{1}_{[\tau_i,T]}(\cdot)\big)\in\mathcal{A}$ be an optimal control and $(x(\cdot),y(\cdot),z(\cdot))$ the corresponding trajectory. Then the adjoint equation
$$\begin{aligned}
dp_t&=\big(Ep_t-Ny_t\big)dt+Fp_t\,dB_t,\\
dq_t&=\big(Dp_t-Aq_t-Ck_t-Mx_t\big)dt+k_t\,dB_t,\\
p_0&=-\gamma y_0,\qquad q_T=-gp_T+Wx_T
\end{aligned}\tag{4.8}$$
admits a unique solution $(p(\cdot),q(\cdot),k(\cdot))\in S^3(\mathbb{R})\times S^3(\mathbb{R})\times H^3(\mathbb{R})$. The Hamiltonian $H$ is given by
$$H(t,x,y,z,v,p,q,k)=-p\big(Dx+Ey+Fz+Gv\big)+q\big(Ax+Bv\big)+kCx+\frac{1}{2}\big(Mx^2+Ny^2+Qv^2\big).\tag{4.9}$$
Then by Corollary 3.5 we obtain
$$\big(-Gp_t+Bq_t\big)v+\frac{1}{2}Qv^2\ge\big(-Gp_t+Bq_t\big)u_t+\frac{1}{2}Qu_t^2,\qquad\forall v\in U,\ \text{a.e., a.s.},\tag{4.10}$$
$$L\xi_i+Hq_{\tau_i}-Rp_{\tau_i}=0,\qquad i\ge1,\ \text{a.s.}\tag{4.11}$$
From (4.10) we get
$$u_t=\begin{cases}1,&\text{if }Gp_t-Bq_t\ge0,\\-1,&\text{otherwise}.\end{cases}\tag{4.12}$$
From (4.11) we obtain
$$\xi_i=L^{-1}\big(Rp_{\tau_i}-Hq_{\tau_i}\big),\qquad i\ge1,\ \text{a.s.}\tag{4.13}$$
Hence, if $(u,\xi)\in\mathcal{A}$ is an optimal control of this linear-quadratic control problem, then it satisfies (4.12) and (4.13).
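The optimality conditions (4.12) and (4.13) are explicit pointwise rules in the adjoint processes $(p,q)$; they remain implicit as a whole, since $(p,q)$ solves the coupled adjoint equation along the optimal trajectory. Purely for illustration (scalar case, names ours), the two rules can be coded directly once values of $p$ and $q$ are available.

```python
# Illustrative pointwise rules from Example 4.2 (scalar case).
# Candidate regular control (4.12): bang-bang in the adjoint values p_t, q_t.
def optimal_regular_control(p_t, q_t, G, B):
    return 1.0 if G * p_t - B * q_t >= 0 else -1.0

# Candidate impulse sizes (4.13): xi_i = (R * p_{tau_i} - H * q_{tau_i}) / L.
def optimal_impulse(p_tau, q_tau, R, H, L):
    return (R * p_tau - H * q_tau) / L

# Example evaluation at hypothetical adjoint values:
print(optimal_regular_control(p_t=0.8, q_t=0.1, G=1.0, B=2.0))     # 1.0
print(optimal_impulse(p_tau=0.8, q_tau=0.1, R=1.0, H=0.5, L=2.0))  # 0.375
```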
We can prove that the $(u(\cdot),\xi(\cdot))$ obtained in (4.12) and (4.13) is indeed an optimal control of this linear-quadratic optimization problem. Note that Theorem 4.1 does not apply directly here, since $U$ is not convex in this example. In what follows, we use the same notation as in the proof of Theorem 4.1. In fact, as in the proof of Theorem 4.1, we can still derive $J(v(\cdot),\eta(\cdot))-J(u(\cdot),\xi(\cdot))\ge\Xi+\Theta$. On the one hand, it follows from (4.13) that $\Xi=0$. On the other hand, we have
$$\Theta=\mathbb{E}\Big[\int_0^T\Big(\mathcal{H}^{v,\eta}(t)-H\big(t,x_t^{v,\eta},y_t^{v,\eta},z_t^{v,\eta},u_t,p_t^{u,\xi},q_t^{u,\xi},k_t^{u,\xi}\big)+\Phi_t\Big)dt\Big],\tag{4.14}$$
where
$$\begin{aligned}
\Phi_t={}&H\big(t,x_t^{v,\eta},y_t^{v,\eta},z_t^{v,\eta},u_t,p_t^{u,\xi},q_t^{u,\xi},k_t^{u,\xi}\big)-\mathcal{H}^{u,\xi}(t)\\
&-\mathcal{H}_x^{u,\xi}(t)\big(x_t^{v,\eta}-x_t^{u,\xi}\big)-\mathcal{H}_y^{u,\xi}(t)\big(y_t^{v,\eta}-y_t^{u,\xi}\big)-\mathcal{H}_z^{u,\xi}(t)\big(z_t^{v,\eta}-z_t^{u,\xi}\big).
\end{aligned}\tag{4.15}$$
From (4.12) and the definition of $H$, it is easy to get
$$\mathcal{H}^{v,\eta}(t)-H\big(t,x_t^{v,\eta},y_t^{v,\eta},z_t^{v,\eta},u_t,p_t^{u,\xi},q_t^{u,\xi},k_t^{u,\xi}\big)=\Big[\big(-Gp_t+Bq_t\big)v_t+\frac{1}{2}Qv_t^2\Big]-\Big[\big(-Gp_t+Bq_t\big)u_t+\frac{1}{2}Qu_t^2\Big]\ge0.\tag{4.16}$$
Since $M,N\ge0$, $H$ is convex in $(x,y,z)$, and thus $\Phi_t\ge0$; hence $\Theta\ge0$. Consequently, it follows that $J(v(\cdot),\eta(\cdot))-J(u(\cdot),\xi(\cdot))\ge0$, and the optimality of $(u(\cdot),\xi(\cdot))$ is proved.

Remark 4.3. For the classical linear-quadratic optimal control problem, one can usually obtain an optimal control in linear state feedback form by means of the so-called Riccati equation; along this line, the solvability of the Riccati equation leads to that of the linear-quadratic problem. However, it is difficult to obtain a state feedback optimal control in terms of a Riccati equation in Example 4.2, mainly due to the particular form of the regular control domain and the presence of the impulse control in the control system.

Acknowledgments

The authors would like to thank the referees for valuable suggestions which helped to improve the first version of this paper. Z. Wu acknowledges the financial support from the National Natural Science Foundation of China (10921101 and 61174092) and the Natural Science Fund for Distinguished Young Scholars of China (11125102). F. Zhang acknowledges the financial support from the Natural Science Foundation of Shandong Province, China (ZR2011AQ018), and the Foundation of Doctoral Research Program, Shandong University of Finance and Economics.