#### Abstract

We consider a dynamic contract model of the principal-agent problem with time-inconsistent preferences to study the influence of time inconsistency on the optimal effort and the optimal reward mechanism. We show that when both the principal and the agent are time-consistent, the optimal effort and the optimal reward are decreasing functions of the uncertainty factor. When the agent is time-inconsistent, the agent’s impatience has a negative impact on the optimal contract: the higher the agent’s discount rate, the lower the effort provided, since agents tend toward immediate enjoyment. In addition, when both the principal and the agent are time-inconsistent, in a special case their impatience can offset the impact of the uncertainty factor on the optimal contract, but their impatience in turn affects the contract.

#### 1. Introduction

The principal-agent problem is a classic problem in optimal contract theory and is widely used in finance and economics. The principal and the agent, as the two parties to the contract, interact with each other. Under the constraints of the contract, the agent creates profits for the principal, and the principal pays the agent a salary as an incentive. In this paper, we introduce an optimal contract in which both the principal and the agent are time-inconsistent, in order to solve the principal-agent problem under moral hazard in a dynamic environment. In our setting, the agent is regarded as risk-neutral and the principal as risk-averse.

In solving for the optimal contract in the time-inconsistent principal-agent problem, we face three problems. The first is the solution of the principal-agent problem in continuous time. A continuous-time model in which the agent controls the drift rate of a Brownian motion over the time interval is studied in [1]. Later, [2, 3] use martingale methods to develop the first-order approach to principal-agent problems under moral hazard with exponential utility. Reference [4] shows that the first-best sharing rule is also linear in output in the continuous-time principal-agent model with exponential utility. References [5, 6] use the stochastic maximum principle to extend Holmstrom’s model and discuss the optimal solution for an agent with private information in the continuous-time model. References [7, 8] use forward-backward stochastic differential equations to study the optimal contract under moral hazard. Reference [9] gives a systematic treatment of the continuous-time principal-agent problem. However, the above methods solve the principal-agent problem in continuous time under time-consistent preferences, which is too simple for practical situations. Therefore, it is natural to consider time inconsistency in the principal-agent model.

The second problem is how to find the optimal strategy when time preferences are inconsistent. Reference [10] proposes optimal contracts for a principal who contracts with dynamically inconsistent agents in a discrete-time setting. That study includes exploitative contracts designed for naive agents, to better explain actual contractual arrangements. Building on that, [11, 12] take the neutral agent as a benchmark to study the possibility that the principal manipulates the naive agent. The result shows that the naivety of the agent brings no benefit to the principal: the principal’s maximum utility is the same when facing a neutral agent as when facing a naive agent. The definitions of naive and neutral agents were first given in [13]. Reference [14] takes quasi-hyperbolic discounting as the agent’s discount function and then discusses the optimal contract and the principal’s profit level when the agent is neutral or naive. The above work mainly deals with time-inconsistent agents in discrete time. Less attention has been paid to continuous time, because it is complicated to obtain a closed-loop solution under nonconstant discounting. Reference [15] proposes optimal contract models based on the Pontryagin maximum principle for forward-backward stochastic differential equations to study a general continuous-time principal-agent problem in which the utility function is time-inconsistent.

The third problem is how to find the exact solution of the Hamilton-Jacobi-Bellman (HJB) equation. Solving the HJB equation that arises in stochastic control is a complex mathematical process; as the number of control variables increases, the resulting nonlinear partial differential equations become even more complex. By the Legendre transform, the problem can be converted into a dual problem that is more convenient to analyze, which makes some otherwise intractable models solvable. Reference [16] studies the portfolio problem under a general utility function and proves the effectiveness of the Legendre dual method for solving the HJB equation. Reference [17] uses Legendre transform-dual theory to solve the optimal investment problem with hyperbolic absolute risk aversion (HARA) preferences under the constant elasticity of variance model. References [18, 19] study investment-consumption problems with HARA utility by the Legendre method.

In this paper, we study the optimal incentive contract under moral hazard in the framework of the continuous-time principal-agent problem with time inconsistency. We assume that both the principal and the agent are time-inconsistent, where the principal is risk-averse with an exponential utility function and the agent is risk-neutral with a linear utility function. To describe the time inconsistency of the participants, we assume that their discount rate is a function of time (not a constant) but that discounting still takes the exponential form, because the principal’s utility function is exponential. By the properties of the exponential function, we can divide the discount function into two parts: one part is the traditional discount function (with a constant discount rate) and the other part is the uncertainty. Under moral hazard, the principal can observe the output process but cannot observe the agent’s effort or the random perturbations. Thus the principal treats part of the discount function as an unknown factor that affects the output. Reference [20] proposes that the principal can continually learn about the unknown factor (the uncertain part of the discount function) and update his belief using current and historical information from the production process. Therefore, we transform the time-inconsistent principal into a time-consistent principal with a learning process. For time-inconsistent agents, we employ the Markov subgame perfect Nash equilibrium method [21] to obtain a time-consistent strategy.

Under the above assumptions, we solve for the optimal contract in two cases, in which the principal is time-consistent and time-inconsistent, respectively. When the principal is time-consistent, we use stochastic optimal control to derive the nonlinear partial differential equation (HJB equation) for the principal’s optimal value function. This partial differential equation is hard to solve for an exact closed-loop solution; however, in some cases the original problem can be transformed into a dual problem by applying the Legendre transform. We therefore use Legendre transform-dual theory to obtain explicit expressions for the optimal solution and the optimal contract. When the principal is time-inconsistent, we obtain a value function that takes time, the agent’s private information, and the agent’s utility as variables, and derive a three-dimensional nonlinear second-order HJB equation. In this situation, we solve the HJB equation by guessing the form of the solution.

The paper is organized as follows. Section 2 presents the model. Section 3 provides the incentive compatibility conditions and their proof. Section 4 studies the optimal contract for a time-consistent and a time-inconsistent principal. Section 5 provides a numerical simulation of the optimal strategy. Section 6 concludes.

#### 2. The Model

##### 2.1. The Agent

Suppose a principal signs a contract with a time-inconsistent agent to manage a production process (or invest in a risky project), and the initial time of the contracting period is recorded as 0. Consider an infinite-horizon stochastic environment; let be a standard Brownian motion on the probability space . The risk process that pays a cumulative process evolves on period as follows: assumes that is a compact set and is the agent’s effort choice. is the salary of the agent (or his consumption). is the project’s volatility (constant) where . The path of is observable by both the principal and the agent, but the path of is observable only by the agent, and the effort choice is unobservable by the principal.

At the initial time , the principal provides the agent with a contract (and pays the salary according to the contract). Assume that the salary is composed of two parts, a continuous payment and a terminal payment . Moreover, we assume that the agent is risk-neutral, and (we will give the explicit functional forms for and in the specific question (see (26) and (27))) are utility functions, and and are concave and twice continuously differentiable. The agent has a discount function , where (in this paper, for convenience, sometimes is written as ) is a general discount function; see [22]. The agent is then time-inconsistent if is time-dependent.
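The link between a time-dependent discount rate and time inconsistency can be illustrated with a minimal Python sketch. The function names and the hyperbolic form `1 / (1 + k t)` are illustrative assumptions and are not the discount function used in this paper. The sketch only shows that, under exponential discounting, the relative weight of two future dates does not depend on when it is evaluated, whereas under a non-exponential form it does, so a plan made today may be revised later.

```python
import math


def exponential(t, rho=0.1):
    """Exponential discount factor with a constant rate rho (illustrative value)."""
    return math.exp(-rho * t)


def hyperbolic(t, k=0.1):
    """Hyperbolic discount factor; an assumed example of nonconstant discounting."""
    return 1.0 / (1.0 + k * t)


def relative_discount(h, t, s):
    """Weight of date t + s relative to date t under the discount function h."""
    return h(t + s) / h(t)


if __name__ == "__main__":
    for name, h in (("exponential", exponential), ("hyperbolic", hyperbolic)):
        near = relative_discount(h, 0.0, 1.0)
        far = relative_discount(h, 5.0, 1.0)
        print(f"{name}: near={near:.4f}, far={far:.4f}")
```

For the exponential case the two printed ratios coincide; for the hyperbolic case the later ratio is larger, which is the source of preference reversals.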

The agent’s preferences as of time read

##### 2.2. The Principal

In this paper, we assume that the principal is of the partially naive type ([23], in a discrete time-inconsistent model, classifies participants into mature, naive, and partially naive types based on differences in their awareness of their own future preferences). This means that the principal knows he is time-inconsistent (his discount function is time-varying), but his current perception of the future discount rate is biased relative to its true value (at time , he cannot be sure of the value of the discount rate (we can further assume that ) when ). Therefore, the principal continuously updates his belief about the future discount rate based on past information. The detailed analysis is as follows.

The principal’s preferences as of time will be where and are the utility functions of the principal and is the discount function. Assume that the utility function over salary (consumption) and effort and , where is an absolute risk aversion coefficient. Hence, we rewrite (3) as follows: where that .

From (4), we can split the principal’s discount factor into two parts under the condition of the exponential utility function: one part is a constant discount rate and another part is . The purpose of the above operation is that the principal estimates a suitable constant discount rate instead of a time-varying discount rate , and the principal does not know the exact value of this constant. Therefore, the principal will constantly update the recognition of based on past information. is the subjective choice of the principal, but indicates an objective reality, reflects the type of principal (time-inconsistent or time-consistent), and does not depend on the subjective choice of the principal. So we can set as a part of the output (investment) process. In this way, we can turn the principal’s time inconsistency problem into an unknown constant discount rate problem. If the principal is time-consistent, namely, the principal’s discount rate is a constant , he can choose as a constant discount rate ; hence ( or ). Under the probability measure of the principal, we can regard that is an intrinsic influence of the risk item and is not subject to the control of the principal but must be considered. Hence the risk process (1) becomes

As discussed above, we know that the process and the path of are observable by the agent; therefore the measure for the agent is , which means that the agent does not learn secretly, so the agent’s beliefs will not be a hidden-state variable ([24]) (this does not mean that the agent cannot mislead the principal by choosing an effort, just that such actions do not cause persistent hidden states according to the agent’s beliefs). From (1) we have

Equation (5) expresses the principal’s beliefs about the project and (6) expresses the agent’s beliefs. The disagreement between the principal and the agent is caused by the principal’s nonexponential discounting.

At time , the principal knows the exact value of , because is his discount rate, but he does not know the exact value of ; therefore we use a one-sided Bayesian learning model after the contract is signed, and we assume that the prior about at time is normally distributed with mean and variance . The agent does not update his beliefs because he has perfect information. If the agent follows the recommended effort choice , the principal’s posterior beliefs about depend on and on cumulative effort .

According to the Kalman-Bucy filter (see [25]), the conditional expectation and the precision of filtering satisfy the system of equations where and and is a standard Brownian motion under the measure induced by the effort sequence as
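The learning step can be sketched numerically. The following Python sketch assumes, purely for illustration, that output follows `dY = (a + eta) dt + sigma dB` with a constant effort `a` and an unknown constant `eta` whose prior is normal with mean `m0` and precision `h0`; these names and the omission of the salary term are assumptions, not the paper's notation. Under these assumptions the posterior depends only on cumulative output and cumulative effort, and the precision grows deterministically in time.

```python
import math
import random


def posterior(y_t, cum_effort, t, m0, h0, sigma):
    """Posterior mean and precision of the unknown constant drift eta.

    Assumes dY = (a_t + eta) dt + sigma dB and a normal prior with
    mean m0 and precision h0 (all names are illustrative assumptions).
    """
    h_t = h0 + t / sigma**2
    m_t = (h0 * m0 + (y_t - cum_effort) / sigma**2) / h_t
    return m_t, h_t


def simulate(eta, effort, sigma, horizon, steps, seed=0):
    """Euler simulation of cumulative output and cumulative effort."""
    rng = random.Random(seed)
    dt = horizon / steps
    y = 0.0
    cum = 0.0
    for _ in range(steps):
        y += (effort + eta) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        cum += effort * dt
    return y, cum


if __name__ == "__main__":
    y, cum = simulate(eta=0.2, effort=1.0, sigma=0.5, horizon=200.0, steps=20000)
    m, h = posterior(y, cum, 200.0, m0=0.0, h0=1.0, sigma=0.5)
    print(f"posterior mean {m:.3f}, posterior precision {h:.1f}")
```

Because the precision does not depend on the realized path, it can be indexed by time alone, which is consistent with treating it as a deterministic belief state.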

#### 3. Incentive Compatible Conditions

In this section, we focus on the agent’s problem. Since the agent’s objective function relies on the consumption process , that is, on the whole history of output, it is non-Markovian (for the proof, see [6]). Thus, the agent’s optimization problem cannot be solved by the standard dynamic programming principle. We employ the stochastic maximum principle in the weak formulation of the agent’s problem. The main idea is to apply stochastic variational methods; the related papers [8, 20, 26] use a similar approach.

Define the agent’s continuation value (promised utility) as the expected discounted utility for remaining in the contract from date forward where is the output history. We use to denote the expectation operator under the measure (because the agent’s objective function depends on the consumption process , which is non-Markovian since it depends on the whole output path , the optimization problem (9) cannot be analyzed with standard methods, so we use a martingale approach. Given a contract , the agent controls the distribution of salary through his choice of effort. For the specific technical treatment, see Appendix A). The agent’s objective function can be recast as where represents the salary paid by the principal based on output history.

After the change of measure, the time-inconsistent agent’s problem has only one control. We apply a stochastic maximum principle to characterize the agent’s optimality condition, and we also use the dynamic programming equation to derive a stochastic maximum principle under general time inconsistency.

The agent’s problem is to find an admissible control to maximize the expected reward . In other words, the agent needs to solve the problem

Given , for all , subject to

Next we define the optimal effort for the time-inconsistent agent. Let and be a measurable set whose Lebesgue measure is . Let be an arbitrary effort choice. We define the following: with , and it is clear that . We refer to as a needle variation of the effort choice . Then, we have the following definition.

*Definition 1.* An effort choice is an optimal effort choice for the time-inconsistent agent, for , if The optimal density process is a solution of the stochastic differential equation.

Through the above technical treatment, we convert the agent’s time-inconsistent strategy into a time-consistent optimal strategy.

Next, we analyze the conditions for implementing the incentive contract. According to the previous analysis, the agent controls the distribution of salary by choosing his effort. The idea of using the distribution of salary as the control to solve the principal-agent problem goes back to [27] and is extended in [5, 20]. The learning process of the principal complicates our problem, as past effort affects not only the current salary but also the future expectations of the agent and the principal. Therefore, we have to deal with a principal-agent problem with both time inconsistency and a learning process. In Appendix B, we show how this difficulty can be handled through an extension of the proof in [8, 20]. The conclusion is as follows.

Proposition 2. *The agent’s continuation value can be uniquely represented by the following differential form: where is a square integrable predictable process. The necessary and sufficient conditions for to be the optimal effort choice read: (i) If is the optimal effort choice, then for every there exists a solution of (16) which satisfies (in this paper, and represent function taking the first-order partial derivative and second-order partial derivative of , respectively) where (ii) For almost all , if the following inequality holds where is the predictable process defined by then is the optimal effort choice.*

According to (18), we say that is a stochastic process capturing the value of private information and then obtain the solutionfor all .

In the following, at any time , the process for reads the coefficients is chosen by the principal to maximize his expected utility (the proof is given by [20])

According to (19), the process is the random fluctuation in the discounted sum of marginal utilities evaluated from time 0. Based on the stochastic differential equation of , we can obtain that . Besides that, , , , and are endogenous, which implies that we need to get a contract to satisfy the necessary conditions and then prove that it also meets the sufficient conditions. If the contract has no explicit solution, it will be difficult to prove that the contract also satisfies the sufficient condition. In this paper, the utility function for the principal is exponential function and the utility function for the agent is linear function. In the next section we will employ the exponential utility function to get the closed-loop solution of the contract.

#### 4. The Optimal Contract

This section explains in detail how to solve the principal’s problem and derive the optimal contract in closed form when the principal’s utility is exponential.

**Eliminating from the list of states.** For a given contract , the principal’s expected utility from date forward reads and defines and we have the following result.

Proposition 3. *Proof in Appendix C.*

Assume that Propositions 2 and 3 hold, so that the necessary condition is also sufficient. The principal’s problem consists of solving for subject to the two promise-keeping constraints (9) and (18) and to the incentive constraint (17). Given that the posterior mean does not enter directly into any of the constraints, it can be dropped as a state, leaving only the precision as a belief state. Furthermore, since is deterministic, we may index the precision by . The fact that the expected value of is immaterial to the principal’s objective illustrates that incentives are designed to reward effort, not ability.

**The Agent’s utility function.** To obtain the solution of the optimal contract, according to our assumption in Section 2.1, the agent is risk-neutral and the utility functions of the agent are linear, i.e., Moreover, we make a particular assumption about the terminal utility for the agent, setting where is a constant. This assumption describes a situation in which an infinitely lived agent retires at the termination date T of the contract and, after retirement, consumes a permanent annuity derived from . We always concentrate on problems where the contracting horizon goes to infinity , so this particular assumption is not critical.

**Incentive providing contracts.** Restate the principal’s optimization problem as subject to

Since the state variables and are Markovian processes, we can use the HJB equation to analyze the principal’s optimal control problem. Take as the principal’s value function; this value function satisfies the HJB equation for :

##### 4.1. Second Best Contracts for the Time-Consistent Principal

In this section, we consider the model with a time-consistent principal. The hidden-action setting means that the principal can observe the process but does not know the type of agent and cannot observe the agent’s effort . For incentive contracts , for any , the necessary condition for incentive compatibility (17) becomes . When the principal is time-consistent, for all ; hence we say . There is no need for learning and there is no influence of belief manipulation, which indicates that the information value is equal to zero.

###### 4.1.1. The HJB Equation

When the time tends to infinity (), the agent’s continuation value is the only state variable, and the principal’s HJB equation is written as follows: with the terminal condition , where . Taking the first-order conditions for , we have

Under full information, the principal can observe the agent’s effort and consumption and there is no private information in this case. Hence, the principal can freely choose as parts of the contract; i.e., is independent of and , and then we have the following proposition.

Proposition 4. *Under full information, the optimal effort for the principal is a constant . We say is the first best effort level.*

In the hidden-action case, the optimal effort and consumption derived from (34) are

Putting (35) into (33), the HJB equation (33) for the value function is rewritten as

Since the principal has an exponential utility function, the HJB equation is a complex nonlinear partial differential equation. It is difficult to solve directly by the classical separation of variables method. Next, we employ the Legendre transform to turn the problem into a dual problem, and solve the dual problem to obtain the optimal solution of the original problem.

###### 4.1.2. Legendre Transform

The dual function of is defined by where is the dual variable of . The function is closely related to and can be used as a dual function of the function . In this paper, is defined as the dual function of and satisfies

According to the definition of the dual function, we have and

Based on the conclusion in [28], the following transformation rules are obtained:

Define the dual function of the utility function as

Following the analysis in [29], the functions and can be related through the Legendre transform

The relationship between the optimal values and is

According to equation (39) and rules (40), the HJB equation for the dual problem is

Taking the derivative of and combining (35), we have
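The Legendre transform used in this subsection can be illustrated numerically. The sketch below assumes the common convention that the dual of a concave function `V` is `sup_w { V(w) - z w }`, and uses an exponential function `-exp(-gamma w) / gamma` as a stand-in for the value function; the function names, the convention, and the parameter values are assumptions made for illustration and may differ from the paper's definitions. For this choice the supremum is attained where `exp(-gamma w) = z`, which gives the closed form `(z log z - z) / gamma`.

```python
import math


def exp_utility(w, gamma):
    """Concave exponential function used as an illustrative value function."""
    return -math.exp(-gamma * w) / gamma


def dual_closed_form(z, gamma):
    """Dual sup_w { exp_utility(w) - z w }, attained where exp(-gamma w) = z."""
    return (z * math.log(z) - z) / gamma


def dual_numeric(z, gamma, lo=-10.0, hi=10.0, n=200001):
    """Brute-force approximation of the same supremum on a uniform grid."""
    step = (hi - lo) / (n - 1)
    return max(
        exp_utility(lo + i * step, gamma) - z * (lo + i * step)
        for i in range(n)
    )


if __name__ == "__main__":
    z, gamma = 0.5, 2.0
    print(dual_closed_form(z, gamma), dual_numeric(z, gamma))
```

The dual variable `z` plays the role of the derivative of the original function at the maximizer, which is why the dual problem is linear in quantities that were nonlinear in the original HJB equation.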

###### 4.1.3. Solution of the HJB Equation

According to the form of the principal’s utility function, we have

We can assume that HJB equation (36) has the following form of solution: with . Plug in (45) and separate the variables; then we have

Thus, the following two ordinary differential equations are established:

Proposition 5. *Assume that (i) the principal is time-consistent and the agent is time-inconsistent, (ii) and are as defined in (26) and (27), and (iii) for all , so that the incentive constraint (17) binds for almost all . Then the recommended effort and the agent’s consumption are given by where *

##### 4.2. Second Best Contracts for the Time-Inconsistent Principal

In this section, we discuss the case when the principal is time-inconsistent, which means the discount rate is not a constant. In this case, the principal still cannot observe the agent’s efforts and consumption (moral hazard). Hence, the value of private information is not zero. As described in Section 3, the HJB equation is as follows:

Now we need to solve the above equation by guessing the solution. Under the first-order conditions for , we have

Substituting (55) into (54), denoting that , we have where .

In particular, we suppose the value function has the following form: with and .

Hence, for some functions and , the expressions of optimal effort and consumption are Substituting the optimal effort and consumption into (54), we deduce where .

The following two differential equations can be obtained by eliminating the dependence on : According to (60), we can obtain From the above analysis, we need to know the specific expressions of to get the explicit expression of effort and consumption .

Let us expand at , Here we consider a simple situation; according to the principal’s utility function and , suppose the structure of the solution of equation (61) is with the terminal condition and .

Lemma 6. *Suppose the structure of solution of (61) is , with the terminal condition and , and then and are, respectively, the solutions of the following differential equations:where .*

The proof is as follows.

Substituting into (61), we calculate to get The following two differential equations can be obtained by eliminating the dependence on : The conclusion of Lemma 6 follows by solving the above two differential equations.

Proposition 7. *When the principal is time-inconsistent, the expressions of optimal effort and consumption for the agent are where , , and are given by (62) and (64), respectively.*

It can be seen from the proposition that the second-best effort is less than the first-best effort. The optimal consumption is a linear function of the agent’s promised value and private information.

#### 5. Numerical Simulation

In this section, we provide a numerical simulation to characterize the dynamic behavior of the optimal strategy derived in the previous section. First, we simulate the optimal effort when the principal is time-consistent.

As shown in Figure 1, the discount rate of the agent is taken as a constant discount rate, namely, . The optimal effort decreases as volatility increases; that is, the greater the uncertainty, the lower the effort of the agent. In addition, all three curves are generally declining, indicating that effort is a decreasing function of time.

If we take the discount rate of the agent as , i.e., , where and ; ; , respectively, the curves of effort variation are as drawn in Figure 2.

By analogy with Figure 1, although the discount function is different, we reach a similar conclusion: the greater the uncertainty, the less effort the agent provides, whatever his type (time-consistent or time-inconsistent). The reason is that, under moral hazard, the principal cannot distinguish the influence of the agent’s effort from that of uncertainty on the risky project’s return.

Next, we simulate the optimal consumption (salary) under the specific parameters. Let , , and . According to the expression of , we have Since is a stochastic process, the mathematical expectation of can be expressed as where within a constant . Then we substitute the expressions of and into the above expression; the relationship between and time can be simulated with the terminal condition .

Consumption initially decreases because effort is a function that decreases monotonically over time. After falling to a certain value, consumption bottoms out and then rises; this trend can be explained by the value set for the terminal condition of consumption. The overall consumption trend is shown in Figure 3.

As shown in Figure 3, in the period just after the contract starts (for example ), the greater the volatility, the lower the consumption. The reason is that greater volatility leads to lower effort. In the latter part of the contract, the situation is just the opposite.

Finally, we simulate the optimal effort when the principal is time-inconsistent. Assuming that the agent’s discount rate is constant, the two effort curves are both horizontal lines, as Figure 4 shows. This indicates that, for a given discount rate, the optimal effort does not change over time. The reason is that we posit an exponential form for the principal’s value function under private information. In addition, effort is a decreasing function of the agent’s discount rate: the more weight the agent places on the present (immediate enjoyment), the higher his discount rate and the less effort he provides.

#### 6. Conclusion

In this paper, we studied a time-inconsistent principal-agent problem under full information and under moral hazard. In particular, the optimal contracts we discussed in detail assume that the principal is risk-averse and the agent is risk-neutral. The paper makes two main contributions. First, we gave a technical treatment of the time-inconsistent principal and agent, respectively. We transformed the time-inconsistent principal into a time-consistent one by recasting his time-varying discount rate, and we used the Markov subgame perfect Nash equilibrium method to obtain a time-consistent strategy for the time-inconsistent agent. Second, we used Legendre transform-dual theory to transform the HJB equation into its dual equation, from which we obtained the solution of the original HJB equation and hence explicit expressions for the optimal effort and optimal consumption. Under moral hazard, we also obtained the exact solution of the original HJB equation by guessing the form of the solution. We found that the optimal consumption of the agent is a linear function of the promised value and private information . The optimal effort is a function of the agent’s discount rate. Finally, we note that we considered the contractual relationship between the principal and the agent only in special circumstances; more general settings for time-inconsistent contracts should be considered in future research.

#### Appendix

#### A. Details of the Change of Measure

Consider the Brownian motion under a probability space with probability measure . And let so that is also a Brownian motion under . Given a contract , we define the drift of output as Since expected output is linear in cumulative output, then we define a predictable process with an effort : for , and is an martingale with for all . By Girsanov theorem, the defined new measure is as and the process defined by is a Brownian motion under , and the triple is a weak solution of the following SDE:

Hence each effort choice results in a different Brownian motion. defined above satisfies which is the relative process for the change of measure.
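The change of measure can be checked with a small Monte Carlo sketch in Python. It assumes, for illustration only, a constant drift shift `theta` over a horizon `T`, so that the density is `exp(theta B_T - theta^2 T / 2)`; in the appendix the shift depends on the effort choice, and the names and parameter values here are hypothetical. Two facts are checked: the density has expectation one, and reweighting by it shifts the mean of the Brownian motion from zero to `theta T`.

```python
import math
import random


def girsanov_check(theta=0.5, horizon=1.0, n_paths=20000, seed=1):
    """Monte Carlo estimates of E[density] and E[density * B_T].

    Assumes a constant drift shift theta (an illustrative simplification).
    Theory gives E[density] = 1 and E[density * B_T] = theta * horizon.
    """
    rng = random.Random(seed)
    sum_density = 0.0
    sum_weighted = 0.0
    for _ in range(n_paths):
        # Terminal value of a standard Brownian motion under the original measure.
        b_T = math.sqrt(horizon) * rng.gauss(0.0, 1.0)
        # Exponential martingale evaluated at the horizon.
        density = math.exp(theta * b_T - 0.5 * theta**2 * horizon)
        sum_density += density
        sum_weighted += density * b_T
    return sum_density / n_paths, sum_weighted / n_paths


if __name__ == "__main__":
    mean_density, mean_shifted = girsanov_check()
    print(f"E[density] ~ {mean_density:.3f}, E[density * B_T] ~ {mean_shifted:.3f}")
```

Viewing effort as a choice of density rather than a change to the path itself is what allows the agent's problem to be treated with a single control.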

#### B. The Proof of Proposition 2

Consider the agent’s problem and suppose is given; in general we assume to ease the presentation. Let be the density process corresponding to , i.e., Processes and are defined by the SDE and can be regarded as the first-order and the second-order variation of the density process .

By Theorem 4.4 in Chapter 3 of [30], the following expansion holds: Define the adjoint variables as follows: Applying integration by parts with the adjoint processes, we can remove in the above equation.

Now we introduce the Hamiltonian function by Since the control variable enters into the volatility of the density process, we have to introduce a second pair of adjoint processes by

By Lemmas 4.5 and 4.6 in Chapter 3 of [30], we have

We claim that the process is nonpositive. As a matter of fact ; hence where

Therefore a sufficient condition for to be an optimal effort strategy is that

Let and ; we have Hence, (16) is satisfied.

There is no drift in ; in addition, since , we can define the conditions on the Hamiltonian rather than . Hence, the first-order condition for is Finally, the necessary condition for to be an optimal effort choice is , namely, On the other hand, from (9) the expected utility form is given by , Then, for an arbitrary effort choice , we define and ; the following holds: The second term on the right-hand side uses the fact that the stochastic integral is a martingale. The last term on the right-hand side can be written as where the last equality follows from the definition of and . Hence,