Abstract
This paper is concerned with a new kind of Stackelberg differential game of mean-field backward stochastic differential equations (MF-BSDEs). By means of four Riccati equations (REs), the follower first solves a backward mean-field stochastic LQ optimal control problem and obtains the corresponding open-loop optimal control with a feedback representation. Then the leader turns to an optimization problem for a mean-field forward-backward stochastic differential system. By virtue of some high-dimensional and complicated REs, we obtain the open-loop Stackelberg equilibrium, which admits a state feedback representation. Finally, as an application, a class of stochastic pension fund optimization problems, which can be viewed as a special case of our formulation, is studied and the open-loop Stackelberg strategy is obtained.
1. Introduction
Stochastic differential games are important in various fields such as biology, engineering, economics, management, and particularly financial investment, and they are useful in modeling dynamic systems involving noise in which more than one decision maker is involved. Among various differential games, the Stackelberg game, a concept of hierarchical solution for markets where some firms have power of domination over others, was first introduced by H. von Stackelberg in 1934 [1]. Since then, a large literature has dealt with the deterministic Stackelberg game, such as Basar and Olsder [2] and Long [3]. For the stochastic case, Bagchi and Basar [4] studied an LQ stochastic Stackelberg differential game in which the state and control variables do not enter the diffusion coefficient of the state equation. Yong [5] obtained a more general result, with random coefficients, control-dependent diffusion, and weight matrices for the controls in the cost functionals that are not necessarily positive definite. Bensoussan et al. [6] investigated a stochastic Stackelberg differential game under various information structures (e.g., adapted open-loop), where the diffusion coefficient does not contain the control variables. Furthermore, for the open-loop and closed-loop memoryless information structures, they gave the corresponding two types of optimal strategies for the forward stochastic Stackelberg differential game. In Øksendal et al. [7], a time-dependent newsvendor problem with time-delayed information was solved by a stochastic Stackelberg differential game approach. Shi et al. [8] solved a stochastic Stackelberg differential game with asymmetric information. Since the theory of mean-field forward stochastic differential equations (MF-SDEs) was studied in [9], related topics and applications (particularly in financial engineering) have been investigated by many authors (see [10–13]).
Based on the above, [14] studied the open-loop LQ Stackelberg game for mean-field stochastic systems on a finite horizon and obtained the feedback representation of the open-loop Stackelberg equilibrium involving the new state and its mean.
Here, we point out that all the references mentioned above focus on Stackelberg games with a forward state equation, in which the initial condition is specified at the initial time. However, in financial investment, one frequently encounters problems with future conditions (random variables at the terminal time) specified. Thus, by contrast, this paper introduces a stochastic Stackelberg differential game governed by a linear MF-BSDE. General backward stochastic differential equations (BSDEs) were initially studied in [15, 16] and extended to the mean-field case in [17]. BSDEs are well-formulated stochastic systems and have found various applications. El Karoui et al. [18] gave some important properties of BSDEs and their applications to optimal control and financial mathematics. Kohlmann and Zhou [19] studied the relationship between a BSDE and a forward LQ optimal control problem, and based on it, Lim and Zhou [20] discussed a backward LQ optimal control problem. Li et al. [21] studied the backward LQ optimal control problem in the mean-field case. Huang et al. [22] studied backward LQ optimal control under partial information and gave some applications to pension fund optimization problems. Furthermore, some recent literature can be found in [23–26] on games governed by backward stochastic differential systems and in [21] on control problems governed by mean-field backward stochastic differential systems.
Inspired by the above motivations, this paper studies a new kind of Stackelberg differential game for mean-field backward stochastic differential systems. Specifically, we consider stochastic dynamic games involving two agents, a leader and a follower, whose states satisfy linear MF-BSDE systems. Our work distinguishes itself from the literature mentioned above in the following aspects.
(i) An important class of Stackelberg differential games of MF-BSDEs is introduced, which consists of two stochastic optimal control problems (i.e., a stochastic optimal control problem of MF-BSDEs for the follower and a stochastic optimal control problem of mean-field forward-backward stochastic differential equations (MF-FBSDEs) for the leader). Unlike forward MF-SDEs, the solution of an MF-BSDE consists of an adapted solution pair (see (1)), where the second component is naturally introduced to ensure adaptedness when propagating backward from the terminal time to the initial time.
(ii) For the follower, the open-loop optimal control of his LQ control problem is characterized in terms of the MF-FBSDE (10)-(11). To get the state feedback representation for the optimal control of the follower, we introduce some new equations: four Riccati equations, a mean-field stochastic differential equation, and a mean-field backward stochastic differential equation.
(iii) For the leader's optimal control problem of a mean-field forward-backward stochastic differential system, under standard conditions, we conclude the existence and uniqueness of an optimal control from the fact that the cost functional is strictly convex and coercive with respect to the control variable. By virtue of the maximum principle method, the open-loop optimal control can be represented via the Hamiltonian system and adjoint process.
Moreover, the state feedback representation for the optimal control of the leader is explicitly given with the help of some new high-dimensional and coupled Riccati equations.
(iv) Last but not least, we study a class of stochastic pension fund optimization problems with two representative members. Applying the aforementioned conclusions, we derive the open-loop optimal contribution policy in feedback representation.
Some remarks on the above points are in order. As to (ii), unlike the standard Hamiltonian system for MF-SDE control, which is an MF-FBSDE coupled in its terminal condition, the Hamiltonian system in the BSDE control setup becomes a mean-field forward-backward stochastic differential equation (MF-FBSDE) coupled in its initial condition. Therefore, to decouple it and get the feedback representation, we should introduce some additional REs and MF-SDEs. As to (iii), since more equations are introduced to decouple the Hamiltonian system of the follower, the dynamic process of the leader becomes an MF-FBSDE consisting of one forward and two backward mean-field stochastic differential equations, which differs from the forward case.
The structure of this paper is as follows. Section 1 gives the introduction and specifies some standard notations and terminology. Section 2 formulates the LQ Stackelberg game for MF-BSDEs. The corresponding optimal control problem of MF-BSDEs for the follower is studied in Section 3. Section 4 studies the stochastic optimal control problem of MF-FBSDEs for the leader, and the open-loop Stackelberg strategy in feedback representation is obtained. The stochastic pension fund optimization problem with two representative members is studied in Section 5. Section 6 concludes our work and presents some future research directions.
1.1. Notation and Terminology
The following notations will be used throughout the paper. We let denote the -dimensional Euclidean space, the space of matrices, and the space of symmetric matrices. and denote the scalar product and norm in Euclidean space, respectively. The transpose of a vector (or matrix) is denoted by . If is positive semidefinite, we write . Consider a finite time horizon for a fixed . Let be a given Hilbert space. The set of -valued continuous functions is denoted by . If and for every , we say that is positive semidefinite, which is denoted by .
We suppose is a complete filtered probability space on which a standard one-dimensional Brownian motion is defined, where is the natural filtration of augmented by all the -null sets in . Suppose is an -random variable. We write if is square integrable (i.e., ). Consider an adapted process . If is square integrable (i.e., ), we shall write ; if is uniformly bounded (i.e., ), then . These definitions generalize in the obvious way to the case when is (or ) valued. Furthermore, when we restrict ourselves to deterministic Borel measurable functions , we shall drop the subscript in the notation; for example, . Finally, we denote , for all and .
2. Problem Formulation
In this paper, we consider the following controlled linear MF-BSDE: Here is the state process. Note that is also part of the solution of (1), introduced to ensure the adaptedness of . is the control process of the follower, is the control process of the leader, and the admissible control sets are given by respectively. are given deterministic matrix-valued functions; is the terminal condition. Now, we introduce the following assumption, which will be in force throughout this paper.
(H1): The coefficients of the state equation satisfy the following: Under (H1), for all , MF-BSDE (1) has a unique adapted solution belonging to (see [21, Theorem 2.1]). Furthermore, we define the cost functionals of the two players as For the coefficients of the cost functionals, we impose the following assumption throughout this paper:
(H2): For all , the weighting coefficients in the cost functionals satisfy and there exists a constant such that for a.e. . For notational simplicity, the time argument is suppressed in the cost functionals above and in the sequel wherever convenient.
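To build intuition for the backward dynamics in (1), the following minimal sketch illustrates the simplest possible case. All scalar coefficients (a, abar, b) and the terminal value g are hypothetical, not taken from the paper: when the terminal condition and coefficients are deterministic, the adapted solution of a linear MF-BSDE has a vanishing second component, the mean coincides with the state, and the first component solves an ODE backward in time.

```python
# Minimal sketch (hypothetical scalar coefficients, not the paper's data):
# for the linear MF-BSDE
#     dY = -(a*Y + abar*E[Y] + b*Z) dt + Z dW,   Y(T) = g,
# with deterministic g and coefficients, the adapted solution has Z = 0 and
# E[Y] = Y, so Y solves the ODE Y'(t) = -(a + abar)*Y(t) backward from g.
def mf_bsde_deterministic_terminal(a=0.3, abar=0.1, g=2.0, T=1.0, n=10000):
    dt, Y = T / n, g
    for _ in range(n):              # march from t = T back to t = 0
        Y = Y + dt * (a + abar) * Y
    return Y                        # approximates Y(0) = g * exp((a + abar) * T)

Y0 = mf_bsde_deterministic_terminal()
```

The backward march makes the role of the terminal condition explicit: the trajectory is propagated from T back to 0, which is exactly the direction in which the second solution component is needed (to restore adaptedness) once the terminal condition becomes random.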
Let us now explain the mean-field backward stochastic Stackelberg differential game. In the game, Player is the follower and Player is the leader. In addition, for any , we assume that is the cost functional of Player . At the initial time, for any given terminal target , the leader announces his strategy over the whole planning horizon . Then, with the knowledge of the leader's strategy, the follower determines his response strategy over the entire horizon to minimize . Since the follower's optimal response can be determined by the leader, the leader can take it into account in finding and announcing his optimal strategy, which minimizes over . In a slightly more rigorous way, we give the following definition of the Stackelberg game.
Definition 1. The pair is called an open-loop solution to the above Stackelberg game if it satisfies the following conditions:
(i) There exists a map such that
(ii) There exists a unique such that
(iii) The optimal strategy of the follower is .
The aim of this paper is to find the feedback representation of the open-loop Stackelberg strategy for the mean-field backward stochastic differential game (1) and (4).
3. Optimization for the Follower
In this section, we consider the optimization problem of the follower. For given , the follower solves the following LQ optimal control problem for a mean-field backward stochastic differential system.
Problem (BMF-LQ). For given , find a such that By using a method similar to that in [21], we obtain the following result.
Proposition 2. Under (H1)-(H2), let the terminal state and the leader’s strategy be given.
(i) Problem (BMF-LQ) is uniquely solvable with being the unique optimal pair if and only if there exists a unique 4-tuple satisfying the MF-FBSDE: and the following stationarity condition holds:
(ii) The following relations are satisfied: where , and are the solutions of the following three equations, respectively,
In the above proposition, we obtain the optimal strategy of the follower for any given leader's strategy through the adjoint equation (one part of the Hamiltonian system (10)-(11)). Next, we intend to obtain the state feedback form of . We first have the following lemma.
Lemma 3. Under (H1)-(H2), let be the solution of the Hamiltonian system (10)-(11). Thenwhere and satisfy the following two Riccati equations, respectively,and is given by the following MF-SDE:where
Proof. Since is the solution of the Hamiltonian system (10)-(11), by noticing relations (12), we can get that the initial value of satisfies By using the fact that where we assume that and are invertible, a straightforward computation shows that which implies Therefore, noting (11) and (21), we can rewrite the backward stochastic differential equation of FBSDE (10) in the following MF-SDE form: Then, we conjecture that and are related by where are absolutely continuous with initial values , , respectively, and satisfies for some -progressively measurable processes and . Note that and satisfy the following two ordinary differential equations, respectively, Applying Itô's formula to the second equation of (27), we get which implies that the Riccati equation is given by (19) and satisfies and, similarly, applying Itô's formula to the third equation of (27), we obtain the following equation. Hence should be a solution of Riccati equation (18), and should satisfy Finally, substituting (12) into (27), it is easy to show the following. Thus, noticing (31), (33), and (34), we get that is given by MF-SDE (20).
Remark 4. In Lemma 3, to obtain the relation between and , we introduce two more Riccati equations, each of which has a unique solution (see [20, Section 4]). Since (20) is a linear MF-SDE with bounded coefficients and square-integrable nonhomogeneous terms, it has a unique solution (see [13, Section 2]).
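Although the paper's Riccati equations (18)-(19) are stated abstractly, their numerical treatment follows a standard pattern. The sketch below, with hypothetical coefficient matrices A, B, Q, R, G that are not the paper's, integrates a generic symmetric matrix Riccati ODE backward in time by explicit Euler, the same kind of scheme used for the simulations in Section 5.

```python
import numpy as np

# Backward Euler for a generic symmetric matrix Riccati ODE
#     P'(t) = -(A^T P + P A - P B R^{-1} B^T P + Q),   P(T) = G.
# All coefficient matrices below are hypothetical, for illustration only.
def riccati_backward(A, B, Q, R, G, T=1.0, n=2000):
    dt = T / n
    P = G.astype(float).copy()
    Rinv = np.linalg.inv(R)
    for _ in range(n):
        dP = -(A.T @ P + P @ A - P @ B @ Rinv @ B.T @ P + Q)
        P = P - dt * dP            # step from t down to t - dt
        P = 0.5 * (P + P.T)        # re-symmetrize the iterate
    return P                       # approximation of P(0)

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double-integrator drift
B = np.array([[0.0], [1.0]])
Q, R, G = np.eye(2), np.eye(1), np.zeros((2, 2))
P0 = riccati_backward(A, B, Q, R, G)
```

With Q positive semidefinite and R positive definite, the iterate stays symmetric and positive semidefinite on the whole horizon, which is consistent with the solvability discussion in Remark 4.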
Based on the above lemma, we obtain the main conclusion of problem (BMF-LQ).
Theorem 5. Under (H1)-(H2), Problem (BMF-LQ) is solvable, with the optimal open-loop strategy given by the feedback representation where , and are the solutions of (18), (19), and (20), respectively. The optimal state trajectory is the unique solution of the MF-BSDE Moreover, the optimal cost of the follower is Here is the solution of MF-BSDE (15).
Proof. The first assertion is a direct consequence of Proposition 2 and Lemma 3. For the second assertion, since given by (36) is the optimal strategy of the follower for the given terminal condition and leader's strategy , the optimal cost of the follower is as follows. Noting (12), (17), and (36), we get and substituting the above into the cost expression completes the proof.
Remark 6. If , i.e., there is only one player in the game, the problem reduces to the one studied in [21]. However, in [21], the open-loop optimal control was not obtained in feedback form; this form is given by Theorem 5 of our paper.
Remark 7. In Theorem 5, we obtain the feedback-form optimal control of the follower by (36), and it is easy to see that this optimal control is a functional of the leader's control . Furthermore, once the leader announces his control , the follower should choose his map in the form of (36) to minimize the cost functional .
4. Optimization for the Leader
In the above section, we obtained the open-loop optimal control of Problem (BMF-LQ) in feedback form for any given and . Now, let Problem (BMF-LQ) be uniquely solvable for any given . Since the follower's optimal response of form (36) can be determined by the leader, the leader can take it into account in finding and announcing his optimal strategy. Consequently, the leader has the following state equation with the coefficients given by (21). It should be mentioned that the "state" in (41) is the quintuple . Since (41) is a decoupled MF-FBSDE (one forward and two backward mean-field-type stochastic differential equations), the solvability for an adapted solution can be easily guaranteed. The leader would like to choose his control such that his cost functional, is minimized. The optimal control problem for the leader can be stated as follows.
Problem (FBMF-LQ). For given , find a such that
Remark 8. In the traditional leader-follower game for (mean-field) forward stochastic differential equations, the state processes of the leaders are all given by (MF-)FBSDEs. However, in the backward case, the state process of the leader becomes an MF-FBSDE with one forward and two backward equations. Thus, the problem we study is more complex and technically challenging.
Under (H1)-(H2), by noting [23, Proposition 1] and [21, Theorem 2.2], we get that the cost functional is strictly convex and coercive, which means that Problem (FBMF-LQ) has a unique optimal control. We now use the variational method to solve Problem (FBMF-LQ).
Proposition 9. Let (H1)-(H2) hold. Let be the optimal sextuplet for the terminal state . Then the solution to the MF-FBSDE,satisfies
Proof. For any and any , let be the solution of Let be the solution to the perturbed state equation; then it is clear that , and hence Noting that and applying Itô's formula to , we have Noting (47) and the fact that , we have which implies (45).
Here, we denote From the above result, we see that if happens to be the optimal control of Problem (FBMF-LQ) for terminal state , then the following MF-FBSDE admits an adapted solution : And the following stationarity condition holds: We now use the idea of the four-step scheme (see [21, 27, 28]) to study the solvability of the above MF-FBSDE (52).
Remark 10. Another thing to keep in mind is that, different from the traditional forward LQ leader-follower game, whose Hamiltonian system is an FBSDE (or MF-FBSDE) coupled through its terminal conditions, in our backward case the Hamiltonian system is an MF-FBSDE coupled through its initial conditions. Therefore, to decouple the corresponding Hamiltonian system and obtain the optimal control in feedback form, we introduce four Riccati-type equations of higher dimension and with more complex coupling.
Suppose we have the relation Namely, Here are absolutely continuous with terminal condition , and satisfies the following MF-BSDE: where is some -progressively measurable process to be determined. Here, we should point out that, even though there are no nonhomogeneous terms in the MF-FBSDE (52), owing to the coupling through the initial conditions, we must also introduce a nonhomogeneous term in the conjectured relation (54) between and ; this is an essential difference from the traditional case. Note that (taking the mathematical expectation in (52)-(53)) Thus, Applying Itô's formula to the first equation of (55) and noting (55) and (58), we have This implies (assuming that and are invertible) Substitution of (60) into (59) now gives Similarly, applying Itô's formula to the second equation of (55) and noting (55) and (57), we have This implies (noting (60)) Noticing (61) and (64), we obtain Then, to get the feedback representation of the leader, we should give another connection between and . Namely, suppose we have the following relation i.e., Here are absolutely continuous and satisfies the following MF-SDE: where and are some -progressively measurable processes to be determined. In addition, noting (55) and (68), we have Thus, by noticing (60), the process is given by where Then, by the method used in proving Lemma 3, we can get that the initial value of is given by Thus, noting (60), the process satisfies the following MF-SDE: and, applying Itô's formula to the first equation of (68) and noting (57) and (71), we have This implies (noting (68)) Similarly, applying Itô's formula to the second equation of (68) and noting (58) and (71), we obtain Therefore, satisfies the following MF-SDE: where , , and are given by (79), (77), and (80), respectively. Then, we get the main result of this section.
Theorem 11. Under (H1)-(H2), suppose that Riccati equations (62), (65), (78), and (76) admit differentiable solutions , respectively. Then Problem (FBMF-LQ) is solvable, with the optimal open-loop strategy given by the feedback representation where , and are the solutions of (78), (76), and (81), respectively. The optimal state trajectory is the unique solution of the MF-BSDE Moreover, the optimal cost of the leader is where and .
Proof. Firstly, by noting (45) and (67), we get the feedback representation (82) of the leader's optimal strategy with "state" . For the second assertion, the optimal cost of the leader is Noting (70) and (55), we get the following. Substituting (71), (73), (82), and (86) into (85) completes the proof.
Here, we want to emphasize that, for generality, we only assume that the REs used in the above theorem admit unique solutions. Some sufficient conditions for their solvability are presented in the appendix.
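As a rough illustration of why coupled REs are harder to handle than single ones, the following sketch integrates a pair of scalar Riccati-type equations jointly and backward in time. The system and all its coefficients are hypothetical and only mimic the coupling structure; they are not the paper's equations (62), (65), (76), (78).

```python
# Hypothetical coupled scalar Riccati-type pair (illustrative only):
#     P1'(t) = -(2*a*P1 - s*P1**2 + q1 + c*P2),   P1(T) = g1,
#     P2'(t) = -(2*a*P2 - s*P1*P2 + q2),          P2(T) = g2.
# The equations must be stepped together, since P1' depends on P2
# and P2' depends on P1.
def solve_coupled_riccati(a=0.1, s=0.5, c=0.2, q1=1.0, q2=0.5,
                          g1=0.0, g2=0.0, T=1.0, n=10000):
    dt, P1, P2 = T / n, g1, g2
    for _ in range(n):
        d1 = -(2 * a * P1 - s * P1 ** 2 + q1 + c * P2)
        d2 = -(2 * a * P2 - s * P1 * P2 + q2)
        P1, P2 = P1 - dt * d1, P2 - dt * d2   # joint backward Euler step
    return P1, P2

P1_0, P2_0 = solve_coupled_riccati()
```

In the paper's setting the coupling additionally runs through the initial conditions of the Hamiltonian system, which is why the decoupling relations (54) and (67) carry nonhomogeneous terms.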
Remark 12. If we let , , then , , , , and the game problem studied in this paper reduces to the LQ leader-follower game for BSDEs. To our knowledge, this problem has not been studied before.
Likewise, noting (36) and (68), the optimal control of the follower can also be represented in a similar way as in (82). On the existence and uniqueness of the open-loop Stackelberg strategy, we have the following conclusion.
Theorem 13. Under (H1)-(H2), if (62), (65), (78), and (76) admit a tetrad of solutions , then the open-loop Stackelberg strategy exists and is unique. In this case, the unique open-loop Stackelberg strategy in feedback representation is given by (87) and (82). In addition, the optimal costs of the follower and the leader are given by (38) and (84), respectively.
5. Application to Pension Fund Problem and Simulation
In this section, we present an LQ Stackelberg game for MF-BSDEs arising from a defined benefit (DB) pension fund. It is well known that the DB pension scheme is one of the two main categories of pension schemes. In a DB scheme, there are two corresponding representative members who make contributions continuously over time to the pension fund in . One of the members is the leader (i.e., the supervisor, government, or company) with regular premium proportion , and the other is the follower (i.e., an individual producer or retail investor) with regular premium proportion . The premiums decided by the two members are payable regularly; i.e., premiums are a proportion of salary continuously deposited into the plan member's individual account. See [8, 22, 29–31] for more details.
Now, consider the one-dimensional investment problem (i.e., ). We can invest in two tradable assets: a risk-free asset given by the following ODE where is the interest rate at time ; the other is a stock with price satisfying the linear SDE where is its instantaneous rate of return and is its instantaneous volatility. Then, the value process of the pension fund plan member's account is governed by the dynamics where is the portfolio process. On the one hand, if the pension fund manager wants to achieve the wealth level at the terminal time to fulfill his/her obligations and, on the other hand, if we set , then the above equation is equivalent to where are two control variables corresponding to the regular premium proportions of the two agents. For both agents, it is natural to hope that the variance of the wealth process is as small as possible. Therefore, the cost functionals to be minimized are In addition, if we let , we can get the values of the corresponding Riccati equations used in our paper. Moreover, by using Euler's method, we plot the solution curves of all the Riccati equations, which are shown in Figures 1, 2, 3, and 4.
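The Euler discretization behind Figures 1–4 can be sketched as follows. The scalar equation and the market parameters r0, mu, sigma, g below are hypothetical placeholders, not the paper's calibration.

```python
import numpy as np

# Backward Euler for an illustrative scalar Riccati equation of the kind
# arising in the pension problem (hypothetical coefficients):
#     P'(t) = -((2*r0 - theta2) * P(t) + 1),   P(T) = g,
# where theta2 = ((mu - r0)/sigma)**2 is the squared market price of risk.
def euler_riccati_curve(r0=0.03, mu=0.08, sigma=0.2, g=1.0, T=1.0, n=1000):
    dt = T / n
    theta2 = ((mu - r0) / sigma) ** 2
    P = np.empty(n + 1)
    P[n] = g                                  # terminal condition
    for k in range(n, 0, -1):                 # march backward on the grid
        dP = -((2 * r0 - theta2) * P[k] + 1.0)
        P[k - 1] = P[k] - dt * dP
    return P                                  # P[k] approximates P(k*dt)

P = euler_riccati_curve()
```

Plotting P against the uniform time grid produces the kind of monotone solution curves shown in Figures 1–4.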
Here we should point out that the Riccati equations are one-dimensional, and their solutions are given in Figure 1. Furthermore, for any , since is a symmetric matrix function, we have , where denotes the entry in the row and the column of . In addition, in Figure 4, . Therefore, there are only four lines in Figure 4. Next, by noting (76), (66), and (81), we have that , , and are given by the following equations and shown in Figure 5, respectively.
Applying Theorem 13, we can obtain the open-loop Stackelberg strategy with feedback representation of two agents in the pension fund problem studied in this section.
6. Conclusion
We have studied the open-loop Stackelberg strategy for an LQ mean-field backward stochastic differential game. The corresponding two mean-field stochastic LQ optimal control problems, for the follower and the leader, have been discussed. By virtue of eight REs, two MF-SDEs, and two MF-BSDEs, the Stackelberg equilibrium has been represented in feedback form involving the state as well as its mean. Based on that, as an application of the LQ Stackelberg game of mean-field backward stochastic differential systems in financial mathematics, a class of stochastic pension fund optimization problems with two representative members has been discussed. Our present work suggests various future research directions, for example, (i) to study the backward Stackelberg game with indefinite control weights (this would formulate the mean-variance analysis with relative performance in our setting) and (ii) to study the backward mean-field game with Stackelberg strategies (this would involve many followers rather than one in the game). We plan to study these issues in our future works.
Appendix
In this section, we concentrate on the solvability of the REs , which are used in Theorem 11. For simplicity, we consider only the constant-coefficient case. We first consider the solvability of (the solution of (78)). Noting that the RE (78) is specified through its initial value, by making the time-reversal transformation we obtain that the RE (78) is equivalent to Then, we introduce the following Riccati equation: where and . Furthermore, it is easy to see that the solution of (78) and that of (A.3) are related by Next, we let Then, according to [27, Chapter 2], we have the following conclusion and representation of .
Proposition 14. Let (H1)-(H2) hold and let Then (A.3) admits a unique solution which has the following representation: Moreover, (A.4) gives the solution of the Riccati equation (78).
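For the constant-coefficient case treated in this appendix, a representation of the kind in Proposition 14 can be cross-checked numerically. The sketch below uses the standard Hamiltonian-matrix/matrix-exponential representation of a constant-coefficient Riccati solution and compares it with direct backward Euler integration; all matrices are hypothetical, not the paper's coefficients, and SciPy is assumed to be available.

```python
import numpy as np
from scipy.linalg import expm  # assumes SciPy is available

# For the constant-coefficient Riccati ODE
#     -P'(t) = A^T P + P A - P S P + Q,   P(T) = G,
# the solution can be written P(t) = Y(t) X(t)^{-1}, where
#     [X(t); Y(t)] = expm((t - T) * H) [I; G],
#     H = [[A, -S], [-Q, -A^T]]  (the associated Hamiltonian matrix).
def riccati_via_expm(A, S, Q, G, t, T):
    n = A.shape[0]
    H = np.block([[A, -S], [-Q, -A.T]])
    Phi = expm((t - T) * H) @ np.vstack([np.eye(n), G])
    X, Y = Phi[:n], Phi[n:]
    return Y @ np.linalg.inv(X)

# Crude backward Euler on the same equation, for comparison.
def riccati_via_euler(A, S, Q, G, T, steps=20000):
    dt, P = T / steps, G.astype(float).copy()
    for _ in range(steps):                    # backward from T to 0
        P = P + dt * (A.T @ P + P @ A - P @ S @ P + Q)
    return P

A = np.array([[0.0, 1.0], [-1.0, -0.5]])      # hypothetical data
S = np.diag([0.0, 1.0])
Q, G = np.eye(2), np.zeros((2, 2))
P_exp = riccati_via_expm(A, S, Q, G, 0.0, 1.0)
P_eul = riccati_via_euler(A, S, Q, G, 1.0)
```

The two computations agree up to the Euler discretization error, which illustrates how a closed-form representation of the type in Proposition 14 can serve as a reference solution for the numerical schemes of Section 5.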
In addition, using the same method, we can get a similar conclusion about the solution of (76).
Proposition 15. Let (H1)-(H2) hold and let . Then (76) admits a unique solution which has the following representation: Here , , and .
In the rest of this section, we concentrate on the solvability of and in the case . By noting the definitions in (21), it is easy to get that . Then, similar to the previous proofs, we get the following proposition.
Proposition 16. Let (H1)-(H2) hold and . In addition, let . Then, (62) and (65) admit unique solutions and which have the following representation: respectively, where
Data Availability
All the numerical calculated data used to support the findings of this study can be obtained by calculating the equations in the paper, and the codes used in this paper are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work is supported by the Natural Science Foundation of China (no. 11601285, no. 61573217, and no. 11831010), the National High-Level Personnel of Special Support Program, and the Chang Jiang Scholar Program of Chinese Education Ministry.