Research Article | Open Access
Time-Consistent Strategies for a Multiperiod Mean-Variance Portfolio Selection Problem
In recent years, research on Markowitz's mean-variance portfolio optimization problems has concentrated on precommitment strategies, while much less is known about the corresponding time-consistent strategies. This paper investigates the time-consistent Nash equilibrium strategies for a multiperiod mean-variance portfolio selection problem. Under the assumptions that the risk aversion is, respectively, a constant and a function of the current wealth level, we obtain explicit expressions for the time-consistent Nash equilibrium strategy and the equilibrium value function. Several properties of the time-consistent results are identified through numerical sensitivity analysis and by comparison with the classical precommitment solutions.
Since Markowitz's pioneering work in a single-period setting, the mean-variance formulation has been one of the central topics in portfolio selection and has stimulated hundreds of extensions and applications. Interested readers can refer to [2, 3] for detailed information. Many existing mean-variance portfolio selection models seek an optimal strategy that maximizes a mean-variance utility of the terminal wealth. It is well known, however, that the mean-variance criterion lacks the iterated-expectation property, which gives rise to time-inconsistent investment strategies in the sense that the Bellman optimality principle no longer applies. Time inconsistency means that the optimal strategy obtained at one time does not agree with the optimal strategy derived at a later time. Therefore, the optimal strategy in the classical time-inconsistent models is optimal only from the viewpoint of the initial time, and decision makers at any later time must commit themselves to the initial optimal strategy even if it is no longer optimal from their own viewpoint. For this reason, the time-inconsistent optimal strategy in the classical mean-variance model is called the precommitment strategy. Precommitment has been criticized for lacking rationality. As a simple example, investment psychology and tastes often change over time, and a decision maker at a later time may not commit to a strategy that is not optimal at that time. The incentives that induce an investor to revise her optimal strategy at subsequent dates under the mean-variance criterion have also been analyzed in the literature.
For this reason, we want to find an optimal strategy that is time consistent, a property that is necessary for a rational individual. The analysis of inconsistency can be traced back to Strotz, who pointed out that “optimal plan of the present moment is generally one which will not be obeyed” and that the time-inconsistent problem can be solved by a precommitment strategy or, alternatively, by a time-consistent strategy. The authors of [5, 6] devoted themselves to identifying an intertemporal consumption programme that would be “the best plan that an agent would actually follow.” Later work questioned the generality of the existence of the Strotz-Pollak equilibrium and gave an alternative criterion based on Nash equilibrium. More recently, time-inconsistent problems have attracted renewed interest. The work of [8, 9] investigated a time-consistent strategy for a consumption and investment problem with nonexponential discounting. A general approach to time-inconsistent problems was then developed by placing them in a game-theoretic framework and looking for subgame perfect Nash equilibrium points; this line of work formally defined the continuous-time equilibrium concept and derived an extension of the HJB equation, together with its verification theorem, for a very general class of objective functions. Building on that extended HJB equation, a continuous-time mean-variance portfolio optimization model in which the risk aversion factor depends dynamically on the current wealth was studied, and the time-consistent equilibrium control and equilibrium value function were obtained. Other work provided explicit solutions for a series of cases, including the mean-standard deviation criterion in a continuous-time setting; investigated optimal time-consistent mean-variance investment and reinsurance policies for an insurer in continuous time; and developed a fully numerical scheme that determines the time-consistent mean-variance strategy using a piecewise constant policy technique.
As for discrete-time mean-variance models, earlier work derived a complicated backward recursion for the time-consistent investment strategy but did not obtain an analytical expression for the strategy.
To the best of our knowledge, no existing literature has given the time-consistent equilibrium strategy and equilibrium value function in closed form for discrete-time mean-variance asset allocation. Our research fills this gap. We view the decision-making process as a noncooperative game and suppose that there is one decision maker for each point in time. This assumption is reasonable. On the one hand, in the real world, quite different persons often take part in the decision-making process, especially when the investment horizon is long; on the other hand, we can imagine the decision maker at a later time as a future incarnation of the initial decision maker, considering that a decision maker's tastes change over time. The contributions of this paper are as follows: (a) we derive analytical expressions for the time-consistent equilibrium strategy and equilibrium value function when the risk aversion is assumed to be a constant and a function of current wealth, respectively; (b) when the risk aversion factor is a constant, we compare our time-consistent results with the existing precommitment ones and present the particular properties of the time-consistent results; (c) we study the problem in a discrete-time setting with nonconstant risk aversion that is a function of current wealth and identify the properties of the investment proportion by numerical analysis.
The rest of the paper is organized as follows. In Section 2, the problem formulation is presented, and the recursive formula of the equilibrium value function is derived. In Section 3, the equilibrium strategy and equilibrium value function for the mean-variance model with constant risk aversion are obtained, and our time-consistent results are compared with the existing precommitment ones. In Section 4, the equilibrium results are investigated under the assumption that the risk aversion depends dynamically on the current wealth, and numerical analysis is given to demonstrate the properties of the investment proportion. Section 5 presents our conclusions.
2. Problem Formulation
In this paper, we assume that investors join the market at time with an initial wealth and plan to process the investment in consecutive time periods. There are one risk-free asset and risky assets in the market. The risky assets have random returns at period (time interval [)) where denotes the random return of the th asset at period and superscript “” stands for the transpose of a matrix or vector. Denote by the return of the risk-free asset at period , and denote by and the wealth available for investment and the amounts invested in risky asset at time , respectively. Then the wealth dynamics is where for .
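The wealth dynamics described above can be illustrated with a minimal sketch. It assumes the standard self-financing budget equation for one risk-free and one risky asset, x_{k+1} = s_k x_k + (e_k - s_k) u_k, where s_k is the risk-free return, e_k the realized risky return, and u_k the amount held in the risky asset. The symbol names, the function name `simulate_wealth`, and the sample numbers are illustrative assumptions, not taken from the paper.

```python
def simulate_wealth(x0, risk_free, risky_returns, amounts):
    """Propagate wealth under the assumed budget equation
    x_{k+1} = s_k * x_k + (e_k - s_k) * u_k (one risky asset)."""
    if not (len(risk_free) == len(risky_returns) == len(amounts)):
        raise ValueError("all inputs must cover the same number of periods")
    path = [x0]
    for s_k, e_k, u_k in zip(risk_free, risky_returns, amounts):
        # u_k earns the risky return e_k; the remainder x_k - u_k earns s_k,
        # which rearranges to s_k * x_k + (e_k - s_k) * u_k.
        path.append(s_k * path[-1] + (e_k - s_k) * u_k)
    return path


if __name__ == "__main__":
    # Hypothetical inputs: three periods, half a unit held in the risky asset.
    print(simulate_wealth(1.0, [1.04] * 3, [1.10, 0.95, 1.20], [0.5] * 3))
```

Because short selling and unlimited borrowing are allowed under (A3), the sketch places no sign or size restriction on the amounts.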
As we know, the classic mean-variance optimization problem is as follows:
which results in a time-inconsistent strategy, that is, precommitment strategy. Therefore, as mentioned in Section 1, this paper aims to solve this problem from another perspective and to look for the time-consistent Nash equilibrium investment strategy. To this end, we first give the definition of Nash equilibrium strategy according to  and the references therein.
Let be the policy made at time and where is the terminal wealth corresponding to the investment strategy , and is the information at time , such as wealth level. A natural assumption is that risk aversion is a function of .
Definition 1. Let be a fixed control law. For an arbitrary point , one selects an arbitrary control value and defines the strategy .
Then is said to be a subgame perfect Nash equilibrium strategy (or simply equilibrium strategy) if for all , it satisfies
In addition, if equilibrium strategy exists, the equilibrium value function is defined as
Let be the Nash equilibrium strategy at time ; then Definition 1 makes it possible to solve the problem by the following procedure: (a) (b) given that the decision maker will use , is the optimal control that optimizes the objective function ; (c) generally, is obtained by letting decision maker choose to maximize given that the forthcoming decision makers will choose the strategy ; that is,
Now we try to derive the time-consistent Nash equilibrium strategy and value function, but first we need to give the following notations and assumptions, which are used throughout this paper: (N1); (N2); (N3); (N4); (N5).
(A1) The distribution function of the random returns is , and is assumed to be statistically independent.
(A2) is assumed to be positive definite.
(A3) Short selling is allowed for all risky assets in all periods. Unlimited borrowing and lending are permitted. Transaction costs are not taken into account.
(A4) Capital additions or withdrawals are forbidden for all assets in all periods.
With the notations above, we can obtain the recursions of and . For the sake of convenience, we define and , then
3. Nash Equilibrium Strategy with Constant Risk Aversion
3.1. Equilibrium Strategy and Equilibrium Value Function
When the risk aversion is a constant, is of the form
According to (8), the recursion of equilibrium value function is simplified as
Recursion (12) indicates that does not depend on , and then we only need to find the explicit expression of by the following recursion:
The following theorem gives the explicit expressions of and
Theorem 2. When the risk aversion is a constant, the Nash equilibrium strategy is given by
The corresponding equilibrium value function is
Proof. Obviously (17) and (18) hold true for . Then for ,
Since is positive definite, the optimal solution exists and is given by
Substituting (20) into (19) yields
Hence (16), (17), and (18) hold true for . Now we assume that (17) and (18) are true for , then for ,
It is obvious that the optimal solution of (22) exists and is given by
Substituting (23) into (22), we obtain
and according to (14),
Equations (23), (24), and (25) mean that (16), (17), and (18) hold true for . By induction, the proof of Theorem 2 is complete.
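The backward induction used in the proof can be sketched numerically for a single risky asset. The sketch rests on assumptions that should be checked against (16): the objective is taken to be E[x_T] - ω Var[x_T], the wealth dynamics are taken to be x_{k+1} = s_k x_k + P_k u_k with independent excess returns P_k of mean μ_k and variance σ_k², and later decision makers use deterministic amounts. Under these assumptions, the part of decision maker k's objective that depends on u_k is c_k μ_k u_k - ω c_k² σ_k² u_k², with c_k the product of the risk-free returns after period k, so the maximizer is u_k = μ_k / (2 ω σ_k² c_k). The function name and arguments are hypothetical.

```python
def equilibrium_amounts(omega, risk_free, excess_mean, excess_var):
    """Equilibrium risky-asset amounts for one risky asset, ASSUMING the
    objective E[x_T] - omega * Var[x_T] and independent excess returns.

    Returns u_k = mu_k / (2 * omega * sigma_k^2 * c_k), where c_k is the
    product of risk_free[j] for j > k (c_k = 1 in the last period).
    """
    if omega <= 0:
        raise ValueError("risk aversion must be positive")
    if any(v <= 0 for v in excess_var):
        raise ValueError("variances must be positive")
    periods = len(risk_free)
    amounts = [0.0] * periods
    growth = 1.0  # c_k: compounding factor from period k + 1 to the horizon
    for k in range(periods - 1, -1, -1):  # solve the last decision maker first
        amounts[k] = excess_mean[k] / (2.0 * omega * excess_var[k] * growth)
        growth *= risk_free[k]
    return amounts


if __name__ == "__main__":
    # Hypothetical parameters: two periods with identical return statistics.
    print(equilibrium_amounts(1.0, [1.04, 1.04], [0.06, 0.06], [0.04, 0.04]))
```

In this sketch the amounts depend neither on the initial wealth nor on the current wealth, and they shrink as more periods of risk-free compounding remain.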
3.2. Comparison to the Precommitment Results
3.2.1. About the Value Function
In view of (17), we know that the equilibrium value function at initial time is
and the precommitment value function of  is
Lemma 3. Consider the following:
Proof. First of all, we have
This proof gives an important inequality needed in the later analysis as
Lemma 3 shows that the precommitment value function is greater than the equilibrium value function. This is the price of time consistency: the Nash equilibrium strategy gains time consistency but at the same time reduces the welfare of the whole decision procedure because of the inability to precommit. Referring to the proof of Lemma 3, we also see that the distance between these two value functions grows with the time horizon. In particular, when , these two value functions coincide with each other. When , the time-consistent results should be, and indeed are, the same as the precommitment ones.
3.2.2. About the Investment Strategy
The time-consistent strategy at each period is
Referring to , the precommitment strategy at each period is
The significant differences between and are as follows. (a) Since the time-consistent strategy at time is not affected by the initial information, it does not depend on the initial wealth, in contrast with the precommitment strategy. (b) The time-consistent strategy is a deterministic function of time, whereas the precommitment strategy depends stochastically on the current wealth.
3.2.3. About the Efficient Frontier
Lemma 4. Under the time-consistent strategy (16),
Proof. Substituting (16) into (1) yields
and for ,
and for ,
By repeatedly using recursive equation (39), we can obtain (35). Substituting (35) into (40) yields
Repeatedly using the above recursive equation yields (36).
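The two recursions used in this proof can be mimicked numerically. The following is a minimal sketch for one risky asset, assuming deterministic amounts u_k, independent excess returns with mean μ_k and variance σ_k², and the dynamics x_{k+1} = s_k x_k + P_k u_k. Under these assumptions E[x_{k+1}] = s_k E[x_k] + μ_k u_k and Var[x_{k+1}] = s_k² Var[x_k] + σ_k² u_k². The function name and inputs are hypothetical.

```python
def terminal_moments(x0, risk_free, excess_mean, excess_var, amounts):
    """Mean and variance of terminal wealth under deterministic amounts,
    ASSUMING independent excess returns and one risky asset."""
    mean, var = x0, 0.0  # the initial wealth is known, so its variance is zero
    for s_k, mu_k, var_k, u_k in zip(risk_free, excess_mean, excess_var, amounts):
        mean = s_k * mean + mu_k * u_k            # mean recursion
        var = s_k * s_k * var + var_k * u_k * u_k  # variance recursion
    return mean, var


if __name__ == "__main__":
    # Hypothetical two-period example with amounts 0.75 / 1.04 and 0.75.
    print(terminal_moments(1.0, [1.04, 1.04], [0.06, 0.06], [0.04, 0.04],
                           [0.75 / 1.04, 0.75]))
```

The variance recursion has no cross term only because the amounts are deterministic; for a wealth-dependent strategy such as the precommitment one, this simplification does not hold.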
Theorem 5. The efficient frontier under the time-consistent strategy (16) is
Proof. When , (35) becomes
Equation (43), as we expect, coincides with the expression of in (18).
When , (36) becomes
Equation (44) together with (43) yields
Referring to (43) and (45), we get the efficient frontier as follows:
Efficient frontier in  is
In view of (32), we can find that
Therefore, we have given a mathematical proof to the fact that the efficient frontier for the time-consistent strategy is never above the efficient frontier for the precommitment strategy. Moreover, the shorter the investment horizon is, the closer these two efficient frontiers are.
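For orientation, the shape of the time-consistent frontier can be worked out under explicit assumptions; the result should be checked against (46) rather than read as a restatement of it. Assume the objective E[x_T] - ω Var[x_T], independent excess-return vectors P_k with mean μ_k and positive definite covariance Σ_k, and the deterministic amounts u_k = Σ_k^{-1} μ_k / (2 ω c_k), where c_k is the product of the risk-free returns s_j after period k. All symbols are notation chosen for this sketch.

```latex
\begin{align*}
  h_k &:= \mu_k^{\top} \Sigma_k^{-1} \mu_k, \qquad k = 0, \dots, T-1, \\
  \mathrm{E}[x_T] &= x_0 \prod_{k=0}^{T-1} s_k + \frac{1}{2\omega} \sum_{k=0}^{T-1} h_k, \\
  \mathrm{Var}[x_T] &= \frac{1}{4\omega^{2}} \sum_{k=0}^{T-1} h_k, \\
  \mathrm{Var}[x_T] &= \frac{\bigl(\mathrm{E}[x_T] - x_0 \prod_{k=0}^{T-1} s_k\bigr)^{2}}{\sum_{k=0}^{T-1} h_k},
  \qquad \mathrm{E}[x_T] \ge x_0 \prod_{k=0}^{T-1} s_k .
\end{align*}
```

The last line follows by eliminating ω from the two preceding lines. Under these assumptions the frontier is a parabola in the mean-variance plane whose vertex is the risk-free terminal wealth.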
3.3. Numerical Analysis
In this part, we compare the expected terminal wealth and the efficient frontier under the time-consistent framework with the corresponding ones in . Let initial wealth . For convenience, we assume that there is only one risk-free asset and one risky asset in the market. Furthermore, we suppose that the risk-free return is a constant during all the investment periods and that the risky returns have the same distribution function with the same expectation and variance .
Let risk aversion . Here we fix the other parameters and let increase from to to show the effect of the investment horizon on the distance between the time-consistent expected terminal wealth and the precommitment one. Referring to (43) and formula (56) in , we obtain Figure 1, which shows that the precommitment expected terminal wealth is higher than the time-consistent one and that the gap between these two expectations widens as increases. Specifically, the gaps are 0, 0.0506, 0.1747, 0.4051, 0.7899, 1.3985, 2.3316, and 3.7352 as is changed from to . This shows that for short time horizons, time inconsistency has only a slight effect on the results. For long time horizons, however, noncooperation among the decision makers destroys the welfare of the whole game, and the longer is, the larger the loss is.
Let . Here we fix the other parameters and let the risk aversion increase from to . The effect of risk aversion on the distance between the time-consistent expected terminal wealth and the precommitment one is shown in Figure 2, which indicates that the distance between these two kinds of expected final wealth is minor when the risk aversion is large enough. This is because, by (43) and formula (56) in , the distance between these two expectations is affected only by the risky investment income. When is large enough, both the time-consistent investor and the precommitment one prefer to invest less wealth in the risky asset, which leads to the conclusion in Figure 2.
We now compare the precommitment efficient frontier with the time-consistent one when and . The conclusions shown in Figure 3 coincide with those derived by mathematical analysis in Section 3.2.3, and the reason behind Figure 3 is similar to that given for Figure 1.
4. Nash Equilibrium Strategy with Wealth-Dependent Risk Aversion
In this section, we will consider an optimization problem
under the framework of Nash equilibrium, where the risk aversion is a function of current wealth . We study only the tractable case in which the risk aversion is inversely proportional to wealth, in line with some psychological results; that is, where constant is called the coefficient of risk aversion. The form of the time-consistent strategy for other forms of remains an open question.
4.1. Nash Equilibrium Strategy and Equilibrium Value Function
Similarly, referring to (8), the recursion of is
with terminal condition .
The recursions of and are, respectively,
Before giving the time-consistent results, we need to introduce the following sequences and and their properties:
Lemma 6. For , and is positive definite.
Proof. is obviously positive definite. Then,
and hence . Now for any column vector , we have
and then is positive definite.
Now we assume that for and is positive definite, and then for ,
Similar to the proof of which is positive definite, we can immediately prove that is positive definite given . By induction, we complete the proof of Lemma 6.
Theorem 7. If , the time-consistent strategy is
and the equilibrium strategy is
If and the model is meaningless.
Remark 8. If the initial wealth is big enough and risky assets have steady returns, then at each period wealth level is often greater than in the real-world situation. Therefore, the condition is not a severe requirement.
Proof. By (50), we obtain
It is obvious that when ,
Substituting (61) into yields
Equations (61)–(64) mean that (56)–(59) are true for .
We assume the results in Theorem 7 hold true for , and then according to (50) and (57),
By Lemma 6, we know that is positive definite. Given that , we can see that the optimal solution of (66) exists and is