Abstract

This paper is concerned with a mean-variance hedging problem with partial information, where the initial endowment of an agent may be a decision variable and the contingent claim is a random variable. The problem is solved explicitly by studying a linear-quadratic optimal control problem for non-Markovian control systems under partial information. We then combine this result with filtering to solve several examples in stochastic control and finance. In addition, we establish backward and forward-backward stochastic differential filtering equations, which differ from the classical filtering theory introduced by Liptser and Shiryayev (1977), Xiong (2008), and others.

1. Introduction and Problem Formulation

We begin with a finite time horizon $[0,T]$ for $T>0$ and a complete filtered probability space $(\Omega,\mathcal{F},(\mathcal{F}_t),P)$ on which an $\mathbf{R}^m$-valued standard Brownian motion $(W(\cdot))=(W_1(\cdot),\dots,W_m(\cdot))$ is defined. Moreover, we let $\mathcal{F}_t=\sigma\{W(s);\,0\le s\le t\}$, $0\le t\le T$, be the natural filtration and set $\mathcal{F}=\mathcal{F}_T$.

Suppose there is a financial market in which $m+1$ securities can be traded continuously. One of them is a bond whose price $B(\cdot)$ satisfies
$$dB(t)=r(t)B(t)\,dt, \qquad (1.1)$$
where $r(t)$ is the interest rate of the bond at time $t$. The other $m$ assets are stocks whose price dynamics are subject to the stochastic differential equations (SDEs)
$$dS_i(t)=\mu_i(t)S_i(t)\,dt+\sigma_i(t)S_i(t)\,dW_i(t),\quad 1\le i\le m, \qquad (1.2)$$
where $\mu_i(t)$ and $\sigma_i(t)$ are the appreciation rate of return and the volatility coefficient of the $i$th stock, respectively.
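For intuition, the asset models (1.1)-(1.2) are easy to simulate. The following is a minimal sketch, assuming constant coefficients $r$, $\mu_i$, $\sigma_i$ (the parameter values are illustrative, not taken from the paper); it generates one path of the bond and of each stock by the exact log-Euler scheme.

```python
import numpy as np

def simulate_market(r=0.06, mu=(0.12, 0.18), sigma=(0.12, 0.24),
                    T=1.0, n_steps=250, seed=0):
    """Simulate the bond (1.1) and m stocks (1.2) on [0, T].

    Constant coefficients are assumed for simplicity; the paper only
    requires them to be bounded deterministic functions.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    m = len(mu)
    B = np.empty(n_steps + 1)
    S = np.ones((m, n_steps + 1))
    B[0] = 1.0
    dW = rng.normal(0.0, np.sqrt(dt), size=(m, n_steps))  # independent W_i
    for k in range(n_steps):
        B[k + 1] = B[k] * np.exp(r * dt)
        for i in range(m):
            # exact scheme for geometric Brownian motion
            S[i, k + 1] = S[i, k] * np.exp((mu[i] - 0.5 * sigma[i] ** 2) * dt
                                           + sigma[i] * dW[i, k])
    return B, S

B, S = simulate_market()
print(B[-1], S[:, -1])
```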

Suppose there is an agent who invests in the bond and the stocks and whose decisions cannot influence the prices in the financial market. We assume that the trading of the agent is self-financed, that is, there is no infusion or withdrawal of funds over $[0,T]$. We denote by $\pi_i(t)$ the amount that the agent invests in the $i$th stock and by $x^\pi(t)$ the wealth of the agent with an initial endowment $x_0>0$. Then the agent keeps $x^\pi(t)-\sum_{i=1}^m\pi_i(t)$ as savings in a bank. Under the foregoing notation and interpretation, the wealth $x^\pi(\cdot)$ is modeled by
$$dx^\pi(t)=\Big[r(t)x^\pi(t)+\sum_{i=1}^m\big(\mu_i(t)-r(t)\big)\pi_i(t)\Big]dt+\sum_{i=1}^m\sigma_i(t)\pi_i(t)\,dW_i(t),\quad x^\pi(0)=x_0. \qquad (1.3)$$

Generally speaking, it is impossible for the agent to know all the events occurring in the financial market. For instance, if the agent does not have enough time or energy to observe all the prices of the $m+1$ assets, then only part of the price data will be observed. We denote by $\mathcal{Z}_t$ the information available to the agent at time $t$, which is a subfiltration of $\mathcal{F}_t$; a process adapted to $\mathcal{Z}_t$ is called observable. The agent therefore has to choose a portfolio strategy adapted to the observable filtration $\mathcal{Z}_t$. A portfolio strategy $\pi(\cdot)=(\pi_1(\cdot),\dots,\pi_m(\cdot))$ is called admissible if each $\pi_i(t)$ is a $\mathcal{Z}_t$-adapted, square-integrable process with values in $\mathbf{R}$. The set of admissible portfolio strategies is denoted by $\mathcal{U}_{ad}$.

We give the following hypothesis.

(H1) The coefficients $r(\cdot)$, $\mu_i(\cdot)$, $\sigma_i(\cdot)$, and $\sigma_i(\cdot)^{-1}$ are uniformly bounded, deterministic functions with values in $\mathbf{R}$.

For any $\pi(\cdot)\in\mathcal{U}_{ad}$, (1.3) admits a unique solution under Hypothesis (H1). If we define $v_i(\cdot)=\sigma_i(\cdot)\pi_i(\cdot)$, then (1.3) can be rewritten as
$$dx^v(t)=\Big[r(t)x^v(t)+\sum_{i=1}^m\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\,v_i(t)\Big]dt+\sum_{i=1}^m v_i(t)\,dW_i(t),\quad x^v(0)=x_0. \qquad (1.4)$$
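To make the controlled wealth dynamics concrete, here is a minimal simulation sketch for (1.3) under a simple constant-amount strategy; the constant market coefficients and the strategy itself are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_wealth(pi, x0=1.0, r=0.06, mu=(0.12, 0.18), sigma=(0.12, 0.24),
                    T=1.0, n_steps=250, n_paths=10_000, seed=1):
    """Euler scheme for the wealth equation (1.3); pi(t, x) returns the
    amounts invested in each stock, shape (m, n_paths)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    m = len(mu)
    x = np.full(n_paths, x0)
    for k in range(n_steps):
        p = pi(k * dt, x)
        dW = rng.normal(0.0, np.sqrt(dt), size=(m, n_paths))
        drift = r * x + sum((mu[i] - r) * p[i] for i in range(m))
        diffusion = sum(sigma[i] * p[i] * dW[i] for i in range(m))
        x = x + drift * dt + diffusion
    return x

# example: keep a constant amount 0.3 invested in each stock
xT = simulate_wealth(lambda t, x: np.tile(0.3, (2, len(x))))
print(xT.mean(), xT.std())
```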

Let $\xi>0$ be a given contingent claim, which is a $\mathcal{Z}_T$-measurable, square-integrable random variable. Furthermore, we suppose that $\xi\ge x_0e^{\int_0^T r(t)\,dt}$, where $x_0e^{\int_0^T r(t)\,dt}$ is the amount the agent would earn if the initial wealth $x_0$ were invested in the bond at the interest rate $r(\cdot)$ for the entire investment period.

Define a cost functional
$$J\big(v(\cdot);x_0\big)=\tfrac12\,\mathbb{E}\big|x^v(T)-\xi\big|^2. \qquad (1.5)$$
Note that the above $\xi$ can contain $\mathbb{E}[x^v(T)\mid\mathcal{Z}_T]$ as a special case. For an a priori given initial wealth $x_0$, (1.5) measures the risk that the contingent claim $\xi$ cannot be reached. The agent's objective is
$$\min J\big(v(\cdot);x_0\big)\quad\text{subject to}\quad v(\cdot)\in\mathcal{U}_{ad},\ \ (x^v(\cdot),v(\cdot))\ \text{satisfies (1.3) or (1.4)}. \qquad \text{(PIMV)}$$

The above problem formulates a mean-variance hedging problem with partial information. For simplicity, hereinafter we denote it by "Problem (PIMV)", short for the "partial information mean-variance hedging problem". In particular, if $\mathcal{F}_t=\mathcal{Z}_t$, $0\le t\le T$, then Problem (PIMV) reduces to the case of complete information; see, for example, Kohlmann and Zhou [1] for more details.

Because the contingent claim $\xi$ in (1.5) is random and the initial endowment $x_0$ in (1.3) may be a decision variable, our Problem (PIMV) is distinguished from the existing literature; see, for example, Pham [2], Xiong and Zhou [3], Hu and Øksendal [4], and so forth. Motivated by Problem (PIMV), we study a general linear-quadratic (LQ) optimal control problem with partial information in Section 2. By a combination of the martingale representation theorem, the technique of completing the square, and conditional expectation, we derive the corresponding optimal control, expressed through the optimal state equation, a Riccati differential equation, and a backward stochastic differential equation (BSDE). To demonstrate the applications of our results, we work out some partial information LQ examples and obtain explicitly observable optimal controls by filtering for BSDEs. We also establish backward and forward-backward stochastic differential filtering equations, which differ from the classical ones.

In Section 3, we use the result established in Section 2 to derive an optimal portfolio strategy for Problem (PIMV), expressed as the sum of a replicating portfolio strategy for the contingent claim $\xi$ and a Merton portfolio strategy. To illustrate Problem (PIMV) explicitly, we provide a special but nontrivial example in this section and, by means of filtering theory, derive the corresponding risk measure. Furthermore, we use numerical simulations and three figures to illustrate the risk measure and the optimal portfolio strategy.

In Section 4, we compare our results with the existing ones.

Finally, for the convenience of the reader, we state a classical filtering equation for SDEs which is used in Section 3 of this paper.

2. An LQ Optimal Control Problem with Partial Information

In this section, we study a partial information LQ optimal control problem, which is a generalization of Problem (PIMV).

Let us now formulate the LQ problem. Consider the stochastic control system
$$dx^v(t)=\Big[A(t)x^v(t)+\sum_{i=1}^m B_i(t)v_i(t)+g(t)\Big]dt+\sum_{i=1}^m v_i(t)\,dW_i(t),\quad x^v(0)=x_0. \qquad (2.1)$$
Here $x^v(t)$, $x_0$, $v_i(t)$, $g(t)\in\mathbf{R}^n$; $A(t)$ and $B_i(t)\in\mathbf{R}^{n\times n}$; $v(\cdot)=(v_1(\cdot),\dots,v_m(\cdot))$ is a control (process) with values in $\mathbf{R}^{n\times m}$. We suppose that $v(t)$ is $\mathcal{Z}_t$-adapted, where $\mathcal{Z}_t$ is a given subfiltration of $\mathcal{F}_t$ representing the information available to a policymaker at time $t$. We say that the control $v(\cdot)$ is admissible and write $v(\cdot)\in\mathcal{U}_{ad}$ if $v(\cdot)\in L^2_{\mathcal{Z}}(0,T;\mathbf{R}^{n\times m})$, that is, $v(t)$ is a $\mathcal{Z}_t$-adapted process with values in $\mathbf{R}^{n\times m}$ satisfying
$$\mathbb{E}\int_0^T|v(t)|^2\,dt<+\infty. \qquad (2.2)$$

The following basic hypothesis will be in force throughout this section.

(H2) $A(\cdot)$ and $B_i(\cdot)$ are uniformly bounded, deterministic functions, $x_0$ is $\mathcal{F}_0$-measurable, and $g(\cdot)\in L^2_{\mathcal{F}}(0,T;\mathbf{R}^n)$.

For any $v(\cdot)\in\mathcal{U}_{ad}$, the control system (2.1) admits a unique solution under Hypothesis (H2). The associated cost functional is
$$J\big(v(\cdot);x_0\big)=\tfrac12\,\mathbb{E}\big|x^v(T)-\xi\big|^2, \qquad (2.3)$$
where $\xi$ is a given $\mathcal{F}_T$-measurable, square-integrable random variable. The LQ optimal control problem with partial information is
$$\min J\big(v(\cdot);x_0\big)\quad\text{subject to}\quad v(\cdot)\in\mathcal{U}_{ad},\ \ (x^v(\cdot),v(\cdot))\ \text{satisfies (2.1)}. \qquad \text{(PILQ)}$$
An admissible control $u(\cdot)$ is called optimal if it satisfies
$$J\big(u(\cdot);x_0\big)=\min_{v(\cdot)\in\mathcal{U}_{ad}}J\big(v(\cdot);x_0\big). \qquad (2.4)$$
The corresponding solution $x(\cdot)$ and the cost functional (2.3) evaluated at $u(\cdot)$ are called the optimal state and the value function, respectively.

Problem (PILQ) is related to the recent work by Hu and Øksendal [4], where an LQ control problem for jump diffusions with partial information is investigated. Owing to its particular setup, our Problem (PILQ) is not covered by [4]; see Section 4 of this paper for detailed comments. Since the nonhomogeneous term in the drift of (2.1) is random and the observable filtration $\mathcal{Z}_t$ is very general, it is not easy to solve Problem (PILQ). To overcome the resulting difficulty, we adopt a method combining the martingale representation theorem, the technique of completing the square, and conditional expectation. This method is inspired by Kohlmann and Zhou [1], where an LQ control problem with complete information is studied.

To simplify the cost functional (2.3), we define
$$y^v(t)=x^v(t)-\mathbb{E}\big[\xi\mid\mathcal{F}_t\big]. \qquad (2.5)$$
Since $\mathbb{E}[\xi\mid\mathcal{F}_t]$ is an $\mathcal{F}_t$-martingale, by the martingale representation theorem (see, e.g., Liptser and Shiryayev [5]) there is a unique $z_i(\cdot)\in L^2_{\mathcal{F}}(0,T;\mathbf{R}^n)$ such that
$$\mathbb{E}\big[\xi\mid\mathcal{F}_t\big]=\mathbb{E}\xi+\sum_{i=1}^m\int_0^t z_i(s)\,dW_i(s). \qquad (2.6)$$
Applying Itô's formula to (2.1) and (2.5)-(2.6), we have
$$dy^v(t)=\Big[A(t)y^v(t)+\sum_{i=1}^m B_i(t)v_i(t)+h(t)\Big]dt+\sum_{i=1}^m\big(v_i(t)-z_i(t)\big)\,dW_i(t),\quad y^v(0)=y_0=x_0-\mathbb{E}\xi, \qquad (2.7)$$
with
$$h(t)=g(t)+A(t)\,\mathbb{E}\big[\xi\mid\mathcal{F}_t\big], \qquad (2.8)$$
and the cost functional (2.3) reduces to
$$J\big(v(\cdot);y_0\big)=\tfrac12\,\mathbb{E}\big|y^v(T)\big|^2. \qquad (2.9)$$

Then Problem (PILQ) is equivalent to minimizing (2.9) subject to (2.6)-(2.7) and $\mathcal{U}_{ad}$. To solve the resulting problem, we first introduce a Riccati differential equation on $\mathbf{R}^{n\times n}$:
$$\dot P(t)+P(t)A(t)+A(t)^\tau P(t)-\sum_{i=1}^m P(t)B_i(t)P(t)^{-1}B_i(t)^\tau P(t)=0,\quad P(T)=I,\quad P(t)>0,\ 0\le t\le T. \qquad (2.10)$$
Note that (2.7) contains the nonhomogeneous term $h(\cdot)$. For this reason, we also introduce a BSDE on $\mathbf{R}^n$:
$$-d\alpha(t)=\Big\{\Big[A(t)^\tau-\sum_{i=1}^m P(t)B_i(t)P(t)^{-1}B_i(t)^\tau\Big]\alpha(t)-\sum_{i=1}^m P(t)B_i(t)P(t)^{-1}\beta_i(t)+P(t)h(t)+\sum_{i=1}^m P(t)B_i(t)z_i(t)\Big\}dt-\sum_{i=1}^m\beta_i(t)\,dW_i(t),\quad \alpha(T)=0. \qquad (2.11)$$
Assume that the following hypothesis holds.

(H3) For any $0\le t\le T$,
$$A(t)+A(t)^\tau\ \ge\ \sum_{i=1}^m B_i(t)B_i(t)^\tau. \qquad (2.12)$$

Under Hypotheses (H2) and (H3), according to [1, Theorem 4.2], (2.10) admits a unique solution, and consequently (2.11) admits a unique $\mathcal{F}_t$-adapted solution $(\alpha(\cdot),\beta_1(\cdot),\dots,\beta_m(\cdot))$.
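In the scalar case ($n=1$), (2.10) is in fact a linear ODE with the closed-form solution $P(t)=\exp\{\int_t^T(2A(s)-\sum_{i=1}^m B_i(s)^2)\,ds\}$. As a sanity check, the following minimal numerical sketch (with illustrative constant coefficients, not from the paper) integrates (2.10) backwards and compares the result with this closed form.

```python
import numpy as np
from scipy.integrate import solve_ivp

A, B, T = 0.1, (0.5, 0.5), 1.0   # illustrative constants, n = 1

def riccati_rhs(t, P):
    # dP/dt = -(PA + AP) + sum_i P B_i P^{-1} B_i P  (scalar case)
    return -2.0 * A * P + sum(b * b for b in B) * P

# integrate backwards from the terminal condition P(T) = 1
sol = solve_ivp(riccati_rhs, (T, 0.0), [1.0], rtol=1e-10)
P0_numeric = sol.y[0, -1]
P0_closed = np.exp((2.0 * A - sum(b * b for b in B)) * T)
print(P0_numeric, P0_closed)   # the two values agree
```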

For any admissible pair $(v(\cdot),y^v(\cdot))$, applying Itô's formula to $\tfrac12 y^v(\cdot)^\tau P(\cdot)y^v(\cdot)+\alpha(\cdot)^\tau y^v(\cdot)$, integrating from $0$ to $T$, taking expectations, and completing the square, we have
$$J\big(v(\cdot);y_0\big)=J(y_0)+\frac12\,\mathbb{E}\int_0^T\sum_{i=1}^m\big(v_i(t)+L_i^v(t)\big)^\tau P(t)\big(v_i(t)+L_i^v(t)\big)\,dt=J(y_0)+\frac12\,\mathbb{E}\int_0^T\sum_{i=1}^m\mathbb{E}\Big[\big(v_i(t)+L_i^v(t)\big)^\tau P(t)\big(v_i(t)+L_i^v(t)\big)\,\Big|\,\mathcal{Z}_t\Big]dt, \qquad (2.13)$$
where
$$J(y_0)=\frac12 y_0^\tau P(0)y_0+y_0^\tau\alpha(0)+\frac12\,\mathbb{E}\int_0^T\Big\{2\alpha(t)^\tau h(t)-2\sum_{i=1}^m\beta_i(t)^\tau z_i(t)+\sum_{i=1}^m z_i(t)^\tau P(t)z_i(t)-\sum_{i=1}^m\big[P(t)^{-1}B_i(t)^\tau\alpha(t)+P(t)^{-1}\beta_i(t)-z_i(t)\big]^\tau P(t)\big[P(t)^{-1}B_i(t)^\tau\alpha(t)+P(t)^{-1}\beta_i(t)-z_i(t)\big]\Big\}dt, \qquad (2.14)$$
$$L_i^v(t)=P(t)^{-1}B_i(t)^\tau P(t)y^v(t)+P(t)^{-1}B_i(t)^\tau\alpha(t)+P(t)^{-1}\beta_i(t)-z_i(t). \qquad (2.15)$$
Since $J(y_0)$ is independent of $v_i(\cdot)$, the integrand in (2.13) is quadratic with respect to $v_i(\cdot)$, and $P(\cdot)>0$, it follows from the properties of conditional expectation that the minimum of
$$\frac12\,\mathbb{E}\int_0^T\sum_{i=1}^m\mathbb{E}\Big[\big(v_i(t)+L_i^v(t)\big)^\tau P(t)\big(v_i(t)+L_i^v(t)\big)\,\Big|\,\mathcal{Z}_t\Big]dt \qquad (2.16)$$
over all $\mathcal{Z}_t$-adapted $v_i(t)$ is attained at
$$u_i(t)=-\mathbb{E}\big[L_i(t)\mid\mathcal{Z}_t\big]=-P(t)^{-1}B_i(t)^\tau\Big\{P(t)\,\mathbb{E}\big[y(t)\mid\mathcal{Z}_t\big]+\mathbb{E}\big[\alpha(t)\mid\mathcal{Z}_t\big]\Big\}-P(t)^{-1}\mathbb{E}\big[\beta_i(t)\mid\mathcal{Z}_t\big]+\mathbb{E}\big[z_i(t)\mid\mathcal{Z}_t\big], \qquad (2.17)$$
where
$$L_i(t)=P(t)^{-1}B_i(t)^\tau P(t)y(t)+P(t)^{-1}B_i(t)^\tau\alpha(t)+P(t)^{-1}\beta_i(t)-z_i(t) \qquad (2.18)$$
and $y(\cdot)$ is the solution of the SDE obtained by substituting (2.17) into (2.7):
$$dy(t)=\Big[A(t)y(t)+\sum_{i=1}^m B_i(t)u_i(t)+h(t)\Big]dt+\sum_{i=1}^m\big(u_i(t)-z_i(t)\big)\,dW_i(t),\quad y(0)=y_0=x_0-\mathbb{E}\xi. \qquad (2.19)$$

We are now in a position to derive an optimal feedback control in terms of the original optimal state variable $x(\cdot)$. Substituting (2.5) into (2.17), we get
$$u_i(t)=-\mathbb{E}\big[L_i(t)\mid\mathcal{Z}_t\big]=-P(t)^{-1}B_i(t)^\tau\Big\{P(t)\,\mathbb{E}\big[x(t)\mid\mathcal{Z}_t\big]-P(t)\,\mathbb{E}\big[\xi\mid\mathcal{Z}_t\big]+\mathbb{E}\big[\alpha(t)\mid\mathcal{Z}_t\big]\Big\}-P(t)^{-1}\mathbb{E}\big[\beta_i(t)\mid\mathcal{Z}_t\big]+\mathbb{E}\big[z_i(t)\mid\mathcal{Z}_t\big], \qquad (2.20)$$
where $x(\cdot)$ satisfies the SDE obtained by substituting (2.20) into (2.1):
$$dx(t)=\Big[A(t)x(t)+\sum_{i=1}^m B_i(t)u_i(t)+g(t)\Big]dt+\sum_{i=1}^m u_i(t)\,dW_i(t),\quad x(0)=x_0. \qquad (2.21)$$

Furthermore, we define, for any $0\le t\le T$,
$$p(t)=\mathbb{E}\big[\xi\mid\mathcal{F}_t\big]-P(t)^{-1}\alpha(t),\qquad q_i(t)=z_i(t)-P(t)^{-1}\beta_i(t). \qquad (2.22)$$
Applying Itô's formula to (2.6) and (2.10)-(2.11), we can check that $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ is the unique solution of the BSDE
$$dp(t)=\Big[A(t)p(t)+\sum_{i=1}^m B_i(t)q_i(t)+g(t)\Big]dt+\sum_{i=1}^m q_i(t)\,dW_i(t),\quad p(T)=\xi. \qquad (2.23)$$

Finally, substituting (2.17) into (2.13), we get the value function
$$J_{\mathcal{Z}}(y_0)=J(y_0)+\frac12\,\mathbb{E}\int_0^T\sum_{i=1}^m\Big\{\mathbb{E}\big[L_i(t)^\tau P(t)L_i(t)\mid\mathcal{Z}_t\big]-\mathbb{E}\big[L_i(t)\mid\mathcal{Z}_t\big]^\tau P(t)\,\mathbb{E}\big[L_i(t)\mid\mathcal{Z}_t\big]\Big\}dt, \qquad (2.24)$$
where $J(y_0)$ and $L_i(\cdot)$ are defined by (2.14) and (2.18), respectively.

Theorem 2.1. Let Hypotheses (H2) and (H3) hold. Then the optimal control of Problem (PILQ) is
$$u_i(t)=-P(t)^{-1}B_i(t)^\tau P(t)\Big\{\mathbb{E}\big[x(t)\mid\mathcal{Z}_t\big]-\mathbb{E}\big[p(t)\mid\mathcal{Z}_t\big]\Big\}+\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big], \qquad (2.25)$$
where $x(\cdot)$ and $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ are the solutions of (2.21) and (2.23), respectively; the corresponding value function is given by (2.24).

Remark 2.2. Note that the dynamics of the BSDE (2.23) is similar to that of the control system (2.1), except for the terminal state constraint, which reveals a close relationship between stochastic control and BSDEs. To the best of our knowledge, this interesting phenomenon was first observed in [1]. Moreover, [1] shows that the solution $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ of (2.23) can be regarded as the optimal state-control pair $(x(\cdot),u(\cdot))$ of an LQ control problem with complete information in which the initial state $x_0$ is an additional decision variable; that is, $p(\cdot)=x(\cdot)$ and $q_i(\cdot)=u_i(\cdot)$ with $u(\cdot)=(u_1(\cdot),\dots,u_m(\cdot))$. However, this conclusion does not hold in our partial information case. For clarity, we illustrate this by the following example.

Example 2.3. Without loss of generality, we let Hypothesis (H2) hold and take $n=1$ in Problem (PILQ).

Since $P(\cdot)$ defined by (2.10) is now scalar-valued, (2.10) clearly admits a unique solution, and consequently so does (2.11). Note that Hypothesis (H3) is not needed in this setup. Define
$$\Delta(\cdot)=x(\cdot)-p(\cdot). \qquad (2.26)$$
From Itô's formula, (2.21)-(2.23), and (2.25), we get
$$d\Delta(t)=\Big\{A(t)\Delta(t)-\sum_{i=1}^m B_i(t)^2\,\mathbb{E}\big[\Delta(t)\mid\mathcal{Z}_t\big]+\sum_{i=1}^m B_i(t)\Big(\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big]-q_i(t)\Big)\Big\}dt+\sum_{i=1}^m\Big\{\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big]-q_i(t)-B_i(t)\,\mathbb{E}\big[\Delta(t)\mid\mathcal{Z}_t\big]\Big\}dW_i(t),\quad \Delta(0)=x_0-p(0). \qquad (2.27)$$
Hereinafter, we set
$$\hat Y(t)=\mathbb{E}\big[Y(t)\mid\mathcal{Z}_t\big],\qquad Y(\cdot)=x(\cdot),\,p(\cdot),\,q_1(\cdot),\,q_2(\cdot),\,g(\cdot),\,X(\cdot)\ \text{or}\ \Delta(\cdot),\quad 0\le t\le T, \qquad (2.28)$$
where the signal $Y(t)$ is an $\mathcal{F}_t$-adapted, square-integrable stochastic process, while the observation is a component of the $m$-dimensional Brownian motion $(W(\cdot))$. Without loss of generality, we let the observable filtration $\mathcal{Z}_t$ be
$$\mathcal{Z}_t=\sigma\big\{W_1(s),\dots,W_l(s);\,0\le s\le t\big\},\quad 0\le t\le T,\ 1\le l\le m-1. \qquad (2.29)$$
In this setting, we call (2.28) the optimal filter of the signal $Y(t)$ with respect to the observable filtration $\mathcal{Z}_t$ in the sense of square error. See, for example, [5, 6] for more details.

Note that $(W_1(\cdot),\dots,W_l(\cdot))$ is independent of $(W_{l+1}(\cdot),\dots,W_m(\cdot))$, and $x_0$ and $p(0)$ are deterministic. Taking conditional expectations on both sides of (2.27), we get the optimal filtering equation of $\Delta(t)$ with respect to $\mathcal{Z}_t$:
$$d\hat\Delta(t)=\Big[A(t)-\sum_{i=1}^m B_i(t)^2\Big]\hat\Delta(t)\,dt-\sum_{i=1}^l B_i(t)\hat\Delta(t)\,dW_i(t),\quad \hat\Delta(0)=x_0-p(0). \qquad (2.30)$$
Note that $\hat\Delta(\cdot)$ satisfies a homogeneous linear SDE and hence vanishes identically if $x_0=p(0)$.
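Indeed, since (2.30) is a homogeneous linear SDE, it can be solved explicitly by the stochastic exponential:
$$\hat\Delta(t)=\big(x_0-p(0)\big)\exp\Big\{\int_0^t\Big[A(s)-\sum_{i=1}^m B_i(s)^2-\frac12\sum_{i=1}^l B_i(s)^2\Big]ds-\sum_{i=1}^l\int_0^t B_i(s)\,dW_i(s)\Big\},$$
so $\hat\Delta(\cdot)\equiv 0$ as soon as the factor $x_0-p(0)$ vanishes.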

Thus, if the decision $x_0$ takes the value $p(0)$ in Example 2.3, then the next corollary follows from Theorem 2.1.

Corollary 2.4. The optimal control of Example 2.3 is
$$u_i(t)=\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big],\quad 0\le t\le T. \qquad (2.31)$$
In particular, if $\mathcal{Z}_t=\mathcal{F}_t$, $0\le t\le T$, then it reduces to the case of [1], that is, $u_i(\cdot)=q_i(\cdot)$.

From Theorem 2.1 and Corollary 2.4, we notice that the optimal control depends strongly on the conditional expectation of $(p(t),q_1(t),\dots,q_m(t))$ with respect to $\mathcal{Z}_t$, $0\le t\le T$, where $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ is the solution of the BSDE (2.23). Since $\mathcal{Z}_t$ is very general, this conditional expectation is, in general, infinite-dimensional, and it is very hard to find an explicitly observable optimal control by the usual methods. Such explicit controls are nevertheless important both in theory and in applications, so we seek a new technique for the rest of this section. Recently, Wang and Wu [7] investigated the filtering of BSDEs and used a backward separation technique to explicitly solve an LQ optimal control problem with partial information; see Wang and Wu [8] and Huang et al. [9] for more details about BSDEs with partial information. Inspired by [7, 9], we apply the filtering of BSDEs to study the conditional expectation mentioned above. Note that there is no general filtering result for BSDEs in the published literature. In the rest of this section, we present two examples of such filtering problems. Combining Theorem 2.1 with a property of conditional expectation, we obtain explicitly observable optimal controls. As a byproduct, we establish two new kinds of filtering equations, which we call backward and forward-backward stochastic differential filtering equations. These results enrich and develop the classical filtering-control theory (see, e.g., Liptser and Shiryayev [5], Bensoussan [10], Xiong [6]).

Example 2.5. Let Hypothesis (H2) hold, $n=1$, and $m=2$ in Problem (PILQ). Suppose the observable filtration is
$$\mathcal{Z}_t=\sigma\big\{W_1(s);\,0\le s\le t\big\},\quad 0\le t\le T. \qquad (2.32)$$

From Theorem 2.1, the optimal control is
$$u_i(t)=-B_i(t)\Big\{\mathbb{E}\big[x(t)\mid\mathcal{Z}_t\big]-\mathbb{E}\big[p(t)\mid\mathcal{Z}_t\big]\Big\}+\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big]=-B_i(t)\big[\hat x(t)-\hat p(t)\big]+\hat q_i(t), \qquad (2.33)$$
where $(p(\cdot),q_1(\cdot),q_2(\cdot))$ is the unique solution of
$$dp(t)=\Big[A(t)p(t)+\sum_{i=1}^2 B_i(t)q_i(t)+g(t)\Big]dt+\sum_{i=1}^2 q_i(t)\,dW_i(t),\quad p(T)=\xi, \qquad (2.34)$$
and $x(\cdot)$ satisfies (2.21) with $m=2$.

Similar to Example 2.3, the optimal filtering equation of $x(\cdot)$ is
$$d\hat x(t)=\Big[A(t)\hat x(t)+\sum_{i=1}^2 B_i(t)u_i(t)+\hat g(t)\Big]dt+u_1(t)\,dW_1(t),\quad \hat x(0)=x_0. \qquad (2.35)$$
We proceed to calculate the optimal filter of $(p(\cdot),q_1(\cdot),q_2(\cdot))$. Recalling the BSDE (2.34) and noting that the observable filtration is $\mathcal{Z}_t$, it follows that
$$d\hat p(t)=\Big[A(t)\hat p(t)+\sum_{i=1}^2 B_i(t)\hat q_i(t)+\hat g(t)\Big]dt+\hat q_1(t)\,dW_1(t),\quad \hat p(T)=\mathbb{E}\big[\xi\mid\mathcal{Z}_T\big]. \qquad (2.36)$$
As (2.36) is a (non-Markovian) BSDE, we call it a backward stochastic differential filtering equation; it differs from the classical filtering equation for SDEs. Since $\hat q_2(\cdot)$ is absent from the diffusion term in (2.36), we cannot be certain that (2.36) admits a unique solution $(\hat p(\cdot),\hat q_1(\cdot),\hat q_2(\cdot))$. It is, however, true in some special cases; see the following example, in which we establish a forward-backward stochastic differential filtering equation and obtain a unique solution of that equation.

Example 2.6. Let all the assumptions of Example 2.5 hold, with $g(\cdot)\equiv 0$. For simplicity, we take the random variable $\xi=X(T)$, where $X(\cdot)$ is the solution of
$$dX(t)=K(t)X(t)\,dt+M_1(t)\,dW_1(t)+M_2(t)\,dW_2(t),\quad X(0)=X_0. \qquad (2.37)$$
Assume that $K(\cdot)$, $M_1(\cdot)$, and $M_2(\cdot)$ are bounded, deterministic functions with values in $\mathbf{R}$, and $X_0$ is a constant.

Similar to Example 2.5, the optimal control is
$$u_i(t)=-B_i(t)\big[\hat x(t)-\hat p(t)\big]+\hat q_i(t), \qquad (2.38)$$
where $(\hat x(\cdot),\hat p(\cdot),\hat q_1(\cdot),\hat q_2(\cdot),\hat X(\cdot))$ is the solution of
$$d\hat x(t)=\Big[A(t)\hat x(t)+\sum_{i=1}^2 B_i(t)u_i(t)\Big]dt+u_1(t)\,dW_1(t),\quad \hat x(0)=x_0, \qquad (2.39)$$
$$d\hat p(t)=\Big[A(t)\hat p(t)+\sum_{i=1}^2 B_i(t)\hat q_i(t)\Big]dt+\hat q_1(t)\,dW_1(t),\quad \hat p(T)=\hat X(T), \qquad (2.40)$$

$$d\hat X(t)=K(t)\hat X(t)\,dt+M_1(t)\,dW_1(t),\quad \hat X(0)=X_0. \qquad (2.41)$$
It is remarkable that (2.39) together with (2.40)-(2.41) constitutes a forward-backward stochastic differential filtering equation. To the best of our knowledge, this is also a new kind of filtering equation.

We now derive a more explicitly observable representation of $u_i(\cdot)$. In view of the terminal condition of (2.40), Itô's formula and the method of undetermined coefficients yield
$$\hat p(\cdot)=\Phi(\cdot)\hat X(\cdot)+\Psi(\cdot),\qquad \hat q_i(\cdot)=\Phi(\cdot)M_i(\cdot). \qquad (2.42)$$
Here $\hat X(\cdot)$ is the solution of (2.41), and
$$\Psi(t)=-\int_t^T\sum_{i=1}^2 e^{-\int_t^s A(r)\,dr}\,\Phi(s)B_i(s)M_i(s)\,ds,\qquad \Phi(t)=e^{\int_t^T(K(s)-A(s))\,ds}. \qquad (2.43)$$
Thus, the optimal control is
$$u_i(t)=-B_i(t)\big[\hat x(t)-\Phi(t)\hat X(t)-\Psi(t)\big]+\Phi(t)M_i(t), \qquad (2.44)$$
where $\hat X(\cdot)$ satisfies (2.41) and $\hat x(\cdot)$ is the solution of
$$d\hat x(t)=\Big\{\Big[A(t)-\sum_{i=1}^2 B_i(t)^2\Big]\hat x(t)+\sum_{i=1}^2 B_i(t)\Big[\Phi(t)M_i(t)+B_i(t)\big(\Phi(t)\hat X(t)+\Psi(t)\big)\Big]\Big\}dt+\Big\{B_1(t)\big[\Phi(t)\hat X(t)-\hat x(t)+\Psi(t)\big]+\Phi(t)M_1(t)\Big\}dW_1(t),\quad \hat x(0)=x_0. \qquad (2.45)$$
Since $\hat X(\cdot)$ is the solution of (2.41), it is easy to see that the above equation admits a unique solution $\hat x(\cdot)$. Then $u_i(t)$, $0\le t\le T$, defined by (2.44) is an explicitly observable optimal control.
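As a numerical illustration of (2.41)-(2.45), the following sketch evaluates $\Phi$ and $\Psi$ by quadrature and runs an Euler scheme for $(\hat X(\cdot),\hat x(\cdot))$ and the control (2.44). All constants ($A$, $B_i$, $K$, $M_i$, $x_0$, $X_0$) are illustrative assumptions, not values from the paper.

```python
import numpy as np

A, B = 0.1, (0.4, 0.3)
K, M = 0.05, (0.2, 0.25)
T, n, x0, X0 = 1.0, 500, 1.0, 1.0
dt = T / n
t = np.linspace(0.0, T, n + 1)

Phi = np.exp((K - A) * (T - t))                       # (2.43), constant coefficients
# Psi(t) = -\int_t^T e^{-A(s-t)} Phi(s) sum_i B_i M_i ds, right-point quadrature
integrand = Phi * sum(b * m for b, m in zip(B, M))
Psi = np.array([-np.sum(np.exp(-A * (t[k + 1:] - t[k])) * integrand[k + 1:]) * dt
                for k in range(n)] + [0.0])

rng = np.random.default_rng(2)
Xh = np.empty(n + 1); xh = np.empty(n + 1)
Xh[0], xh[0] = X0, x0
u = np.zeros((2, n + 1))
for k in range(n):
    for i in range(2):                                # control (2.44)
        u[i, k] = -B[i] * (xh[k] - Phi[k] * Xh[k] - Psi[k]) + Phi[k] * M[i]
    dW1 = rng.normal(0.0, np.sqrt(dt))
    Xh[k + 1] = Xh[k] + K * Xh[k] * dt + M[0] * dW1   # filter (2.41)
    xh[k + 1] = xh[k] + (A * xh[k] + B[0] * u[0, k] + B[1] * u[1, k]) * dt \
                + u[0, k] * dW1                        # filter (2.39)
print(u[:, 0], xh[-1], Xh[-1])
```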

Remark 2.7. BSDE theory plays an important role in many different fields, so one often needs to treat backward stochastic systems with partial information. For instance, to get an explicitly observable optimal control in Theorem 2.1, it is necessary to estimate $(p(t),q_1(t),\dots,q_m(t))$ given the observable filtration $\mathcal{Z}_t$; however, effective methods for such estimates are lacking. In this situation, although the filtering of BSDEs is very restricted, it can serve as an alternative technique (as in Examples 2.5 and 2.6). Incidentally, the study of Problem (PILQ) motivates us to establish a general filtering theory for BSDEs in future work; to the best of our knowledge, this is a new and unexplored research field.

3. Solution to Problem (PIMV)

We now regard Problem (PIMV) as a special case of Problem (PILQ) and apply the results established there to solve it. From Theorem 2.1, we get the optimal portfolio strategy
$$\pi_i(t)=\pi_{i1}(t)+\pi_{i2}(t) \qquad (3.1)$$
with
$$\pi_{i1}(t)=-\frac{\mu_i(t)-r(t)}{\sigma_i(t)^2}\Big\{\mathbb{E}\big[x(t)\mid\mathcal{Z}_t\big]-\mathbb{E}\big[p(t)\mid\mathcal{Z}_t\big]\Big\},\qquad \pi_{i2}(t)=\frac{1}{\sigma_i(t)}\,\mathbb{E}\big[q_i(t)\mid\mathcal{Z}_t\big]. \qquad (3.2)$$
Here $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ and $x(\cdot)$ are the solutions of
$$dp(t)=\Big[r(t)p(t)+\sum_{i=1}^m\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\,q_i(t)\Big]dt+\sum_{i=1}^m q_i(t)\,dW_i(t),\quad p(T)=\xi, \qquad (3.3)$$
$$dx(t)=\Big[r(t)x(t)+\sum_{i=1}^m\big(\mu_i(t)-r(t)\big)\pi_i(t)\Big]dt+\sum_{i=1}^m\sigma_i(t)\pi_i(t)\,dW_i(t),\quad x(0)=x_0. \qquad (3.4)$$
So we have the following theorem.

Theorem 3.1. If Hypothesis (H1) holds, then the optimal portfolio strategy of Problem (PIMV) is given by (3.1).

We now give a straightforward economic interpretation of (3.1). Introduce the adjoint equation
$$d\theta(s)=-r(s)\theta(s)\,ds-\sum_{i=1}^m\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\,\theta(s)\,dW_i(s),\quad \theta(t)=1,\ 0\le t\le s\le T. \qquad (3.5)$$
Applying Itô's formula to $p(\cdot)\theta(\cdot)$,
$$p(t)=e^{-\int_t^T r(s)\,ds}\,\mathbb{E}\Big[\xi\,e^{-\int_t^T\sum_{i=1}^m\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\,dW_i(s)-\frac12\int_t^T\sum_{i=1}^m\big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\big)^2 ds}\,\Big|\,\mathcal{F}_t\Big]. \qquad (3.6)$$
Since $\mathcal{Z}_t$ is a subfiltration of $\mathcal{F}_t$, $0\le t\le T$, we have
$$\mathbb{E}\big[p(t)\mid\mathcal{Z}_t\big]=e^{-\int_t^T r(s)\,ds}\,\mathbb{E}\Big[\xi\,e^{-\int_t^T\sum_{i=1}^m\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\,dW_i(s)-\frac12\int_t^T\sum_{i=1}^m\big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\big)^2 ds}\,\Big|\,\mathcal{Z}_t\Big], \qquad (3.7)$$
which is the partial information option price for the contingent claim $\xi$. According to Corollary 2.4, $\pi_{i2}(\cdot)$ is the partial information replicating portfolio strategy for the contingent claim $\xi$ when the initial endowment $x_0$ equals the initial option price $p(0)$, while $\pi_{i1}(\cdot)$ defined by (3.2) is exactly the partial information Merton portfolio strategy for the terminal utility function $U(x)=-x^2$ (see, e.g., Merton [11]). That is, the optimal portfolio strategy (3.1) is the sum of the partial information replicating portfolio strategy for the contingent claim $\xi$ and the partial information Merton portfolio strategy. Consequently, if the initial endowment $x_0$ differs from the initial option price $p(0)$ necessary to hedge the contingent claim $\xi$, then the difference $x_0-p(0)$ should be invested according to Merton's portfolio strategy.
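Formula (3.6) is straightforward to evaluate by Monte Carlo. The sketch below estimates $p(0)$ for constant coefficients; as a sanity check it takes a constant claim $\xi$, for which the exponential factor has mean one and hence $p(0)=e^{-rT}\xi$ exactly. All parameter values are illustrative assumptions.

```python
import numpy as np

# Monte Carlo sketch of the option price formula (3.6) at t = 0.
r, mu, sigma, T, xi = 0.06, (0.12, 0.18), (0.12, 0.24), 1.0, 1.2
theta = [(m - r) / s for m, s in zip(mu, sigma)]   # market prices of risk

rng = np.random.default_rng(3)
n_paths = 1_000_000
WT = rng.normal(0.0, np.sqrt(T), size=(2, n_paths))
weight = np.exp(-sum(th * WT[i] for i, th in enumerate(theta))
                - 0.5 * sum(th ** 2 for th in theta) * T)
p0_mc = np.exp(-r * T) * np.mean(xi * weight)
print(p0_mc, np.exp(-r * T) * xi)   # the two values should be close
```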

In particular, suppose the contingent claim $\xi$ is a constant. In this case, it is easy to see that the solution $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ of (3.3) is
$$p(t)=e^{-\int_t^T r(s)\,ds}\,\xi,\qquad q_i(t)=0,\quad 0\le t\le T. \qquad (3.8)$$
So we have the following corollary.

Corollary 3.2. Let Hypothesis (H1) hold and let $\xi$ be a constant. Then the optimal portfolio strategy of Problem (PIMV) is
$$\pi_i(t)=-\frac{\mu_i(t)-r(t)}{\sigma_i(t)^2}\Big\{\mathbb{E}\big[x(t)\mid\mathcal{Z}_t\big]-e^{-\int_t^T r(s)\,ds}\,\xi\Big\},\quad 0\le t\le T. \qquad (3.9)$$

Remark 3.3. The solution $(p(\cdot),q_1(\cdot),\dots,q_m(\cdot))$ given by (3.8) has a straightforward financial interpretation: to achieve a deterministic wealth level $\xi$ at the terminal time $T$, the agent should invest only in the risk-free asset (the bond) and not in any risky assets (the stocks). Therefore, the optimal portfolio strategy obtained in Corollary 3.2 is just the partial information Merton portfolio strategy.

The remainder of this section focuses on a special mean-variance hedging problem with partial information. By virtue of filtering theory, we obtain an explicitly observable optimal portfolio strategy as well as the associated risk measure. We also plot three figures and give numerical simulations to illustrate the theoretical results.

Example 3.4. Let $m=2$ and let all the conditions in Corollary 3.2 hold. Suppose the observable filtration of the agent is
$$\mathcal{Z}_t=\sigma\big\{S_1(s);\,0\le s\le t\big\},\quad 0\le t\le T. \qquad (3.10)$$
This means that the agent can observe all the past prices of $S_1(\cdot)$ but, due to some limiting factors (e.g., bad behavior of the stock $S_2(\cdot)$, or the time and energy of the investor), cannot (or does not want to) observe $S_2(\cdot)$.

Set $\mathcal{S}_1(\cdot)=\log S_1(\cdot)$. It follows from Itô's formula that
$$d\mathcal{S}_1(t)=\Big[\mu_1(t)-\frac12\sigma_1(t)^2\Big]dt+\sigma_1(t)\,dW_1(t). \qquad (3.11)$$
Since $\mu_1(\cdot)$ and $\sigma_1(\cdot)$ are deterministic functions (see Hypothesis (H1)), the above filtration $\mathcal{Z}_t$ can be equivalently rewritten as
$$\mathcal{Z}_t=\sigma\big\{\mathcal{S}_1(s);\,0\le s\le t\big\}=\sigma\big\{W_1(s);\,0\le s\le t\big\}. \qquad (3.12)$$

Similar to Example 2.5, we get from Corollary 3.2
$$\pi_i(t)=-\frac{\mu_i(t)-r(t)}{\sigma_i(t)^2}\Big[\hat x(t)-e^{-\int_t^T r(s)\,ds}\,\xi\Big], \qquad (3.13)$$
where $\hat x(\cdot)$ is the solution of
$$d\hat x(t)=\Big\{\Big[r(t)-\sum_{i=1}^2\Big(\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\Big)^2\Big]\hat x(t)+\sum_{i=1}^2\Big(\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\Big)^2 e^{-\int_t^T r(s)\,ds}\,\xi\Big\}dt-\frac{\mu_1(t)-r(t)}{\sigma_1(t)}\Big[\hat x(t)-e^{-\int_t^T r(s)\,ds}\,\xi\Big]dW_1(t),\quad \hat x(0)=x_0. \qquad (3.14)$$
Now $\pi_i(t)$, $0\le t\le T$, defined by (3.13) is an observable optimal portfolio strategy.
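A minimal Euler sketch of the filter (3.14) together with the observable strategy (3.13); the constant parameters match the numerical illustration below and are otherwise arbitrary.

```python
import numpy as np

r, mu, sigma = 0.06, (0.12, 0.18), (0.12, 0.24)
T, xi, x0, n = 1.0, 1.2, 1.0, 250
theta2 = sum(((m - r) / s) ** 2 for m, s in zip(mu, sigma))  # = 0.5 here
dt = T / n

rng = np.random.default_rng(4)
xh = x0
for k in range(n):
    t = k * dt
    D = np.exp(-r * (T - t)) * xi                 # discounted target
    pi = [-(m - r) / s ** 2 * (xh - D) for m, s in zip(mu, sigma)]  # (3.13)
    dW1 = rng.normal(0.0, np.sqrt(dt))
    # Euler step of the filter (3.14)
    xh += ((r - theta2) * xh + theta2 * D) * dt \
          - (mu[0] - r) / sigma[0] * (xh - D) * dW1
print(xh, pi)
```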

We now calculate the risk measure (i.e., the value function) of the agent's goal, defined by
$$\mathrm{RM}^2=2J\big(u(\cdot);x_0\big)=\min_{v(\cdot)\in\mathcal{U}_{ad}}2J\big(v(\cdot);x_0\big). \qquad (3.15)$$
From (3.14), we derive
$$\mathbb{E}\hat x(t)=x_0\,e^{\int_0^t\big[r(s)-\sum_{i=1}^2\big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\big)^2\big]ds}+\xi\int_0^t\sum_{i=1}^2\Big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\Big)^2 e^{\int_s^t\big[r(\nu)-\sum_{i=1}^2\big(\frac{\mu_i(\nu)-r(\nu)}{\sigma_i(\nu)}\big)^2\big]d\nu-\int_s^T r(\nu)\,d\nu}\,ds,$$
$$\mathbb{E}\hat x(t)^2=x_0^2\,e^{\int_0^t\big[2r(s)-\sum_{i=1}^2\big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\big)^2-\big(\frac{\mu_2(s)-r(s)}{\sigma_2(s)}\big)^2\big]ds}+\int_0^t e^{\int_s^t\big[2r(\nu)-\sum_{i=1}^2\big(\frac{\mu_i(\nu)-r(\nu)}{\sigma_i(\nu)}\big)^2-\big(\frac{\mu_2(\nu)-r(\nu)}{\sigma_2(\nu)}\big)^2\big]d\nu}\times\Big[2\Big(\frac{\mu_2(s)-r(s)}{\sigma_2(s)}\Big)^2 e^{-\int_s^T r(\nu)\,d\nu}\,\xi\,\mathbb{E}\hat x(s)+\Big(\frac{\mu_1(s)-r(s)}{\sigma_1(s)}\Big)^2\Big(e^{-\int_s^T r(\nu)\,d\nu}\,\xi\Big)^2\Big]ds. \qquad (3.16)$$
Combining (3.4) and (3.14) with Itô's formula and Lemma A.1, the filter $\widehat{x^2}(t)=\mathbb{E}[x(t)^2\mid\mathcal{Z}_t]$ satisfies
$$d\widehat{x^2}(t)=\Big\{2r(t)\widehat{x^2}(t)+\Big[e^{-2\int_t^T r(s)\,ds}\,\xi^2-\hat x(t)^2\Big]\sum_{i=1}^2\Big(\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\Big)^2\Big\}dt+2\,\frac{\mu_1(t)-r(t)}{\sigma_1(t)}\Big[e^{-\int_t^T r(s)\,ds}\,\xi-\hat x(t)\Big]\hat x(t)\,dW_1(t),\quad \widehat{x^2}(0)=x_0^2. \qquad (3.17)$$
Solving the above equation,
$$\mathbb{E}\,x(T)^2=x_0^2\,e^{2\int_0^T r(s)\,ds}+\int_0^T\sum_{i=1}^2\Big(\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\Big)^2\Big[\xi^2-e^{2\int_t^T r(s)\,ds}\,\mathbb{E}\hat x(t)^2\Big]dt. \qquad (3.18)$$
Applying a property of conditional expectation and integration by parts throughout, we get
$$\mathrm{RM}^2=\mathbb{E}\,x(T)^2-2\xi\,\mathbb{E}\hat x(T)+\xi^2=\Big(\xi-x_0\,e^{\int_0^T r(t)\,dt}\Big)^2\Big(1-\int_0^T\rho(t)\,dt\Big) \qquad (3.19)$$
with
$$\rho(t)=\sum_{i=1}^2\Big(\frac{\mu_i(t)-r(t)}{\sigma_i(t)}\Big)^2 e^{-\int_0^t\big[\sum_{i=1}^2\big(\frac{\mu_i(s)-r(s)}{\sigma_i(s)}\big)^2+\big(\frac{\mu_2(s)-r(s)}{\sigma_2(s)}\big)^2\big]ds}. \qquad (3.20)$$
So we have the following proposition.

Proposition 3.5. The optimal portfolio strategy and the risk measure are given by (3.13) and (3.19), respectively.

To further illustrate the optimal portfolio strategy (3.13) and the risk measure (3.19), we plot three figures and give some numerical results. Suppose $r=0.06$, $\mu_1=0.12$, $\mu_2=0.18$, $\sigma_1=0.12$, and $\sigma_2=0.24$. Taking $T=1$ year, we get from (3.19)
$$\xi=x_0\,e^{0.06}+\sqrt{\frac{3}{1+2e^{-3/4}}}\,\mathrm{RM}. \qquad (3.21)$$
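The coefficient in (3.21) and the risk figures quoted below are easy to reproduce; the following sketch evaluates $\int_0^T\rho(t)\,dt$ from (3.20) for the constant parameters above.

```python
import numpy as np

r, mu, sigma, T = 0.06, (0.12, 0.18), (0.12, 0.24), 1.0
theta2 = [((m - r) / s) ** 2 for m, s in zip(mu, sigma)]   # both equal 0.25
a = sum(theta2) + theta2[1]                                 # exponent rate 0.75
rho_int = sum(theta2) * (1.0 - np.exp(-a * T)) / a          # \int_0^T rho(t) dt
coef = 1.0 / np.sqrt(1.0 - rho_int)                         # RM multiplier in (3.21)
print(coef)                                                 # about 1.2420

# Figure 3 below: x0 = 1, xi = 1.2  ->  RM about 0.1112
x0, xi = 1.0, 1.2
RM = (xi - x0 * np.exp(r * T)) * np.sqrt(1.0 - rho_int)
print(RM)
```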

In Figure 1, we let $x_0$ range over $[0,2]$ million dollars and $\mathrm{RM}$ over $[0,1]$ million dollars. The plane explicitly describes the relationship among $\xi$, $x_0$, and $\mathrm{RM}$: the larger $x_0$ and $\mathrm{RM}$, the larger $\xi$.

In Figure 2, we let the investment goal of the agent be $\xi=\$1.2$ million. The straight line shows that $\mathrm{RM}$ is a decreasing function of the initial endowment $x_0$. In particular, when $x_0=\$1.13$ million, we have $\mathrm{RM}=\$0$. This means that to achieve the investment goal of \$1.2 million at the end of one year, the agent only needs to invest \$1.13 million in the bond at the interest rate 6%; moreover, there is no risk in this investment strategy.

In Figure 3, we let $x_0=\$1$ million. The straight line implies that $\mathrm{RM}$ is an increasing function of the investment goal $\xi$. Consider now an agent who has an initial endowment $x_0=\$1$ million and wishes to obtain an expected return rate of 20% in one year. Taking $x_0=\$1$ million and $\xi=\$1.2$ million, we get $\mathrm{RM}=\$0.1112$ million, meaning that the risk of the investment goal is as high as 11.12%.

Furthermore, we calculate Merton's portfolio strategy. Let $r=0.06$, $T=1$, $x_0=\$1$ million, and $\xi=\$1.2$ million.

(1) Set $\mu_1=0.12$, $\mu_2=0.18$, $\sigma_1=0.12$, and $\sigma_2=0.24$. By (3.13), the amount of money the agent should invest in the $i$th stock is
$$\pi_i(t)=\frac{\mu_i-0.06}{\sigma_i^2}\Big[1.2\,e^{-0.06(1-t)}-\hat x(t)\Big]. \qquad (3.22)$$

In particular, at the initial time $t=0$, $\pi_1(0)=\$0.5422$ million and $\pi_2(0)=\$0.2711$ million, which implies that the agent needs to invest \$0.5422 million and \$0.2711 million in the stocks $S_1(\cdot)$ and $S_2(\cdot)$, respectively, and to invest in the bond an amount of
$$1-(0.5422+0.2711)=\$0.1867\ \text{million}. \qquad (3.23)$$

(2) Set $\mu_1=0.12$, $\mu_2=0.18$, $\sigma_1=0.12$, and $\sigma_2=0.17$. Similarly, we have $\pi_1(0)=\$0.5422$ million, $\pi_2(0)=\$0.5403$ million, and
$$(0.5422+0.5403)-1=\$0.0825\ \text{million}, \qquad (3.24)$$

which implies that the agent needs to borrow \$0.0825 million from the bank and invest the total amount of \$1.0825 million in the two stocks $S_1(\cdot)$ and $S_2(\cdot)$. This is indeed an aggressive policy.
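These initial allocations follow directly from (3.22) at $t=0$ (with $\hat x(0)=x_0$); the short check below, a sketch under the same parameter assumptions, reproduces the figures above.

```python
import numpy as np

def pi0(mu_i, sigma_i, r=0.06, T=1.0, x0=1.0, xi=1.2):
    # initial allocation from (3.22) with x_hat(0) = x0
    return (mu_i - r) / sigma_i ** 2 * (xi * np.exp(-r * T) - x0)

print(pi0(0.12, 0.12), pi0(0.18, 0.24))   # case (1): 0.5422, 0.2711
print(pi0(0.12, 0.12), pi0(0.18, 0.17))   # case (2): 0.5422, 0.5403
```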

4. Comparison with Existing Results

The subject of stochastic control with partial information has been discussed by many researchers, for instance, Bensoussan [10] and Xiong and Zhou [3]. Usually, one of the following two assumptions is made: (i) the filtration $\mathcal{Z}_t$ is a sigma algebra generated by some observable process; (ii) the control systems are Markovian. From this viewpoint, our work is not covered by their results.

Note that our work is related to the recent paper by Hu and Øksendal [4]. In what follows, we give some detailed comparisons with it.

(1) Compared with [4], the distinctive features of our Problem (PIMV) are the following four points. First, since $\xi$ in our cost functional (1.5) is a random variable, it partly generalizes the cost functional of [4]; moreover, (1.5) measures the risk that the contingent claim cannot be reached. Second, we give a possible formulation of partial information in the setting of finance and interpret the economic meaning of the optimal portfolio with partial information. Third, in terms of filtering theory, we explicitly compute the observable optimal portfolio strategy and the risk measure of the agent in Example 3.4. Last but not least, we present numerical results and figures to illustrate the optimal portfolio strategy and the risk measure. Although the example is special, it is nontrivial and involves several filtering techniques. These results show the practical relevance of our paper.

(2) Since the initial state $x_0$ in the control system (2.1) may be a decision variable and $\xi$ in the cost functional (2.3) is a random variable, our Problem (PILQ) is different from that of [4]. In particular, if the foregoing $x_0$ is indeed a decision variable, then the partial information optimal control can be expressed as a conditional expectation of the solution of the corresponding adjoint equation (recall Corollary 2.4). Obviously, this differs from [1, 4].

(3) Filtering theory plays an important role in optimal control with partial information. To get an explicitly observable optimal control, it is necessary to compute the conditional expectation of the solution of a BSDE, which is not feasible in most cases. In this paper, we solve some stochastic control problems of this kind by means of filtering theory. As a byproduct, we establish backward and forward-backward stochastic differential filtering equations, which differ from the existing filtering literature. Although the technique applies only to restricted cases and the filtering equations are linear, we regard them as a contribution to filtering-control theory. Incidentally, the study of Problem (PILQ) motivates us to establish a general filtering theory for BSDEs in future work.

In [4], Hu and Øksendal obtain optimal controls and value functions with complete and partial information. Note, however, that they represent the optimal controls only as conditional expectations, and no filtering results appear in [4]. This is different from our work.

Appendix

A Classical Filtering Equation for SDEs

For the readers' convenience, we present here the classical filtering result employed in Section 3 of this paper. For a detailed discussion of filtering, we refer to the books [5, 6].

Consider the following one-dimensional state and observation equations:
$$\theta(t)=\theta_0+\int_0^t h(s)\,ds+x(t),\qquad \xi(t)=\xi_0+\int_0^t a(s,\omega)\,ds+\int_0^t b(s,\xi)\,dW(s). \qquad (\mathrm{A.1})$$
Here $(W(\cdot))$ is a one-dimensional standard Brownian motion defined on the complete filtered probability space $(\Omega,\mathcal{F},(\mathcal{F}_t),P)$ equipped with the natural filtration $\mathcal{F}_t=\sigma\{W(s);\,0\le s\le t\}$, $\mathcal{F}=\mathcal{F}_T$, $0\le t\le T$; $x(t)$ is an $\mathcal{F}_t$-martingale; $h(t)$ is an $\mathcal{F}_t$-adapted process with $\int_0^T|h(s)|\,ds<+\infty$; and the functional $b(t,y)$, $y\in C_T$, $0\le t\le T$, is $\mathcal{B}_t$-measurable.

We need the following hypothesis.

(H) For any $y,y'\in C_T$, $0\le t\le T$, the functional $b(t,\cdot)$ satisfies
$$\big|b(t,y)-b(t,y')\big|^2\le L_1\int_0^t\big|y(s)-y'(s)\big|^2\,dK(s)+L_2\big|y(t)-y'(t)\big|^2,\qquad b(t,y)^2\le L_1\int_0^t\big(1+y(s)^2\big)\,dK(s)+L_2\big(1+y(t)^2\big), \qquad (\mathrm{A.2})$$

where $L_1$ and $L_2$ are two nonnegative constants and $0\le K(\cdot)\le 1$ is a nondecreasing, right-continuous function. Moreover, $\sup_{0\le t\le T}\mathbb{E}\,\theta(t)^2<+\infty$, $\int_0^T\mathbb{E}\,h(t)^2\,dt<+\infty$, $\int_0^T\mathbb{E}\,a(t,\omega)^2\,dt<+\infty$, and $b(t,y)^2\ge C>0$.

The following result is due to [5, Theorem 8.1].

Lemma A.1. Define $\hat\vartheta(t)=\mathbb{E}[\vartheta(t)\mid\mathcal{F}^\xi_t]$, where $\vartheta(t)$ may stand for $\theta(t)$, $h(t)$, $D(t)$, $a(t,\omega)$, or $\theta(t)a(t,\omega)$, and $\mathcal{F}^\xi_t=\sigma\{\xi(s);\,0\le s\le t\}$, $0\le t\le T$. If Hypothesis (H) holds, then the optimal nonlinear filtering equation is
$$\hat\theta(t)=\hat\theta_0+\int_0^t\hat h(s)\,ds+\int_0^t\Big[\hat D(s)+\widehat{\theta a}(s,\omega)-\hat\theta(s)\hat a(s,\omega)\Big]b(s,\xi)^{-1}\,d\overline W(s), \qquad (\mathrm{A.3})$$
where
$$\overline W(t)=\int_0^t\frac{d\xi(s)-\hat a(s,\omega)\,ds}{b(s,\xi)} \qquad (\mathrm{A.4})$$
is a standard Brownian motion (with respect to $\mathcal{F}^\xi_t$), and $D(t)$ is an $\mathcal{F}_t$-adapted process with
$$D(t)=\frac{d\langle x,W\rangle(t)}{dt}. \qquad (\mathrm{A.5})$$
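Lemma A.1 covers, in particular, the linear-Gaussian case, where it reduces to the classical Kalman-Bucy filter. The following minimal sketch simulates that special case with illustrative constants (state $d\theta=\alpha\theta\,dt+c\,dV$, observation $d\xi=A\theta\,dt+B\,dW$; all parameter choices are our own assumptions); $\gamma(t)$ denotes the filtering error variance.

```python
import numpy as np

# Kalman-Bucy filter: state  d(theta) = alpha*theta dt + c dV,
# observation d(xi) = A*theta dt + B dW  (illustrative constants).
alpha, c, A, B = -0.5, 0.3, 1.0, 0.2
T, n = 1.0, 1000
dt = T / n
rng = np.random.default_rng(5)

theta, theta_hat, gamma = 1.0, 1.0, 0.0   # known initial state => zero variance
for _ in range(n):
    dV, dW = rng.normal(0.0, np.sqrt(dt), size=2)
    dxi = A * theta * dt + B * dW                   # observation increment
    theta = theta + alpha * theta * dt + c * dV     # true (hidden) state
    # innovation form of (A.3): gain = gamma * A / B**2
    theta_hat += alpha * theta_hat * dt + gamma * A / B ** 2 * (dxi - A * theta_hat * dt)
    gamma += (2 * alpha * gamma + c ** 2 - (gamma * A / B) ** 2) * dt
print(theta, theta_hat, gamma)
```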

Acknowledgments

The authors would like to thank the anonymous referee and the Editor for their insightful comments, which improved the quality of this work. G. Wang acknowledges the financial support from the National Natural Science Foundation of China (11001156), the Natural Science Foundation of Shandong Province, China (ZR2009AQ017), and the Independent Innovation Foundation of Shandong University, China (2010GN063). Z. Wu acknowledges the financial support from the National Natural Science Foundation of China (10921101), the National Basic Research Program of China (973 Program, no. 2007CB814904), the Natural Science Foundation of Shandong Province, China (JQ200801 and 2008BS01024), and the Science Foundation for Distinguished Young Scholars of Shandong University, China (2009JQ004). The material in this paper was partially presented at the 27th Chinese Control Conference, July 16-18, 2008, Kunming, Yunnan, China.