Research Article  Open Access
Zhigang Chen, Rongwei Xu, Yongxi Yi, "Dynamic Optimal Control of Transboundary Pollution Abatement under Learning-by-Doing Depreciation", Complexity, vol. 2020, Article ID 3763684, 17 pages, 2020. https://doi.org/10.1155/2020/3763684
Dynamic Optimal Control of Transboundary Pollution Abatement under Learning-by-Doing Depreciation
Abstract
This paper analyzes a dynamic Stackelberg differential game model of transboundary water pollution abatement in a watershed and discusses the optimal decision-making problem under noncooperative and cooperative differential games, taking into account the accumulation and depreciation effects of learning by doing in pollution abatement investment. We use dynamic optimization theory to derive the equilibrium solutions of the models. Through numerical simulation, we trace and analyze the optimal trajectories of each variable under a finite-planning horizon and in the long-term steady state. Under the finite-planning horizon, the longer the planning period is, the lower the optimal emission rate is in equilibrium. In the long-term steady state, the game under cooperative decision making can effectively reduce pollution emissions. The investment intensity of pollution abatement in the noncooperative game is higher than that in the cooperative game. In the long-term steady state, the pollution abatement investment trajectory of the cooperative game is relatively stable and shows no obvious crowding-out effect: investment continues to rise, and its steady-state equilibrium level is higher than that under noncooperative decision making. The decline in the pollution stock under the finite-planning horizon is not significant. In the long-term steady state, the trajectories of upstream and downstream pollution in the noncooperative and cooperative models are similar, but the cooperative decision-making model is superior to the noncooperative model in terms of the time needed to stabilize and the steady-state level.
1. Introduction
In recent years, a growing number of scholars have used dynamic optimal control theory to study complex pollution control problems. The differential game method based on dynamic optimal control theory provides an effective research tool for transboundary pollution abatement, because it can analyze both the interaction between the participants' strategies and the dynamic trajectory of the pollution stock. Kilgour et al. [1] first used game theory to analyze total pollution control in river basins, taking the total control of COD (chemical oxygen demand) as an example. Stimming [2] studied a differential game among participating enterprises under two kinds of environmental policy, a pollution tax and emission permits, and concluded that total investment and emissions under the feedback strategy were higher than under the open-loop strategy. List and Mason [3] applied differential game theory to analyze the optimal institutional arrangement of transboundary pollution abatement under suboptimal conditions and established differential game models under cooperative and noncooperative decision making, respectively. The results showed that if the emission yields of the two regions are asymmetric and the initial emission level is low, noncooperative decision making is better than cooperative decision making in terms of total income. Breton et al. [4] used a two-party finite-time differential game model to analyze the cooperative implementation of environmental projects. The key assumptions of the model are that pollution abatement investments in one country can reduce the stock of pollution in other countries and that the cost of pollution damage is linear. Löfgren et al. [5] analyzed the impact of environmental pollution taxes with future uncertainties on manufacturers' pollution control investment decisions.
Yeung [6] assumed that different governments adopt cooperative and noncooperative decision making to control environmental pollution and levy a pollution tax, with the ultimate goal of maximizing social welfare, while the production sector decides the output of the enterprise on its own to maximize its own profits. The cooperative differential game of transboundary pollution is analyzed, the dynamic consistency of the cooperative strategy is studied, and the analysis is extended to a stochastic differential game. Antelo and Loureiro [7] established a three-stage game model of pollution abatement investment under oligopolistic competition and found that manufacturers make different pollution investment decisions depending on the symmetry of information about pollution control technology. In [8], a cooperative stochastic differential game of transboundary industrial pollution is presented, and a payment distribution mechanism is derived to maintain subgame consistency. Additionally, several published studies examine transboundary pollution problems from other perspectives, such as renewable resources, clean technologies, harmonization of international and domestic law, abatement cost, and R&D spillovers (for instance, [9–12]). The papers [8, 13, 14] took emission permit trading into account in studying transboundary pollution. This literature shows that coordinating countries' emission strategies leads to a lower total level of pollution and higher total welfare than noncooperative emission strategies. Chang et al. [15] obtained optimal emission levels and abatement expenditures in a finite-horizon transboundary pollution game with emission trading between two regions and found that cooperation between the regions leads to increased abatement and lower emissions, resulting in a lower pollution stock. In another paper [16], Chang et al. presented a stochastic differential game to study this kind of problem. More generally, the emission permit price is assumed to be stochastic and to follow a geometric Brownian motion (GBM). The results demonstrate that stochastic emission permit prices can motivate the players to make more flexible strategic decisions in the games.
Theoretical research and empirical analysis have shown that technology acquired through learning by doing may not last indefinitely but may depreciate, and that learning-by-doing "forgetting" or "depreciation" is of great significance to production planning and scheduling [17–21]. When using economics to analyze environmental problems, scholars usually assume that technology is constant or that technological progress is exogenous. In real production, by contrast, technological advancement is an endogenous and dynamic process influenced by factors such as regional government policies and knowledge acquisition. Jaffe et al. [22] argued that the relationship between technological change and the environment has an important impact on environmental problems and that the positive interaction between technology accumulation and good pollution abatement policies should be better understood. Kline and Rosenberg [23] studied a variety of technology industries and found that, in some cases, learning by doing in the production process contributed more to technological progress than the original development itself. Bramoullé and Olson [24] argued that the government's Pigouvian tax, an environmental regulation policy, could put pollution abatement on the right technological track. By using the effect of learning by doing over time on the distribution of pollution abatement among heterogeneous technologies, the best conditions for sharing all technologies could be determined. At the same time, it is found that the more mature pollution abatement technology is more inclined to adopt infant technology.
It is learning that accumulates pollution abatement experience and thus reduces pollution control investment costs. Due to the accumulation of pollution control experience, the cost of governance should be reduced, which is measured by cumulative governance. In fact, “learning by doing” from the experience of pollution control can promote the transformation of governance processes and technologies. These changes can reduce costs. However, participants’ net present value income includes not only production income and emission trading income but also pollution abatement costs. Some scholars use the value function of production to measure the learning value of learning by doing. It has been widely used in some operation research literature, such as [25–27]. Learning by doing is an important source of technological progress. Similarly, the technology of pollution abatement investment needs to be developed continuously in practice. In recent years, many authors have discussed the technology of pollution abatement investment [25] and the technology of investment accumulation [28]. Some studies have mentioned the environmental policy and abatement cost in the presence of learning by doing [24, 25, 29].
However, the existing literature rarely uses the dynamic effects of accumulation and depreciation of learning by doing to analyze the changing trajectory of investment in transboundary pollution abatement. Our work is in the spirit of the studies by Li and Pan [25] and Zhong and Zhang [30], which were the first to investigate the effect of experience measured by the cumulative abatement investment from time 0 to t. In recent works [31–35], the investment cost decreases with accumulated experience. In this paper, it is assumed that the accumulation of pollution abatement investment increases at a constant rate. As knowledge is forgotten, depreciated, or replaced by new knowledge, it depreciates at a certain rate. Similarly, in the actual production and application of pollution abatement technologies, what has been learned by doing is forgotten or replaced by new practice. To explore the strength of this forgetting or substitution effect for polluting enterprises in the process of pollution abatement investment, and its impact on the instantaneous emission rate and the pollution stock, this article builds models, solves for the optimal values of the related variables, and analyzes them. Our main purpose is to study the pollution abatement investment decision under the accumulation and depreciation of knowledge. We therefore use a mature dynamic differential game model to simplify the problem and analyze the changes of the variables more intuitively by numerical simulation.
The rest of this paper is organized as follows. In Section 2, we establish our basic dynamic general equilibrium model. The optimal Nash equilibrium solutions for the instantaneous emission rate, the investment intensity of pollution abatement, and the pollution stock in the upstream and downstream regions under the noncooperative game and the cooperative game are presented in Sections 3 and 4, respectively. In Section 5, the optimal trajectories of each variable under the finite-planning horizon and the long-term steady-state equilibrium are simulated and analyzed numerically. Some discussion and further analysis are provided in Section 6. Finally, Section 7 concludes the paper.
2. The Basic Model
There are two adjacent regions in a river basin, which we call upstream region 1 and downstream region 2 in our transboundary pollution model. It is assumed that both regions discharge organic pollutants into the river basin. For each region, production always generates a quantity of by-products, namely, emissions. Each region has an industrial output at every time, and its instantaneous emissions of pollutants arise from that output. It is assumed that pollution emissions in each region are positively related to industrial production, and the instantaneous linear production function can be expressed as
According to the literature [3, 4, 36], the regional industrial income function is assumed to take the following quadratic functional form in terms of emissions:
where the coefficient is a positive constant. Thus, the profit function is a quadratic concave function of emissions.
Following [4], the environmental pollution damage cost is a linear function of the pollution stock, and its coefficient indicates the degree of environmental damage per unit of pollution stock to the region.
Pollution abatement can be realized only when technology and labor are invested, so each region faces an abatement cost that decreases its net revenue. Following [3], we assume that the upstream abatement investment cost can be described by the following quadratic form:
Equation (3) means that the marginal cost is increasing in the level of pollution abatement. The investment cost of water pollution abatement in downstream region 2 and the investment cost of water pollution control in upstream region 1 can then be expressed as follows:
where the two investment variables indicate, respectively, the investment intensity of downstream region 2 in its local area and the investment intensity of the downstream region in upstream region 1. This functional form captures the idea that the cost of region 2's investment in region 1 depends on what the downstream region is investing locally, because investors acquire "learning by doing" for the upstream region only after they have collected downstream investment experience in region 2.
Following [24, 35], the experience of applying pollution abatement technology is measured by the cumulative abatement from time 0 to t, that is,
where the constant term denotes the initial experience level of applying pollution abatement technology. The positive parameter in this expression represents the difference between the two regions' abilities to accumulate experience. According to learning-by-doing theory, cumulative experience leads to a decline in the unit cost.
With the rapid development of society and technology, the accumulation of pollution abatement investment increases at a constant rate. At the same time, as knowledge is forgotten, depreciated, or replaced by new knowledge, it depreciates at a certain rate. According to Jorgenson [37] and Griliches [38], a proportional or geometric depreciation rule seems to be a good choice to represent the depreciation of the aggregate stock of knowledge.
To simplify, we apply the proportional, linear approach and use a single depreciation rate for the aggregate stock of knowledge as it evolves over time. Correspondingly, the learning-by-doing function changes once knowledge depreciation is taken into consideration:
Here, one parameter is the rate of knowledge accumulation under investment in pollution abatement technology, and the other is the depreciation rate of learning by doing. The remaining variable is the investment intensity of pollution abatement in the upstream region.
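The interplay between accumulation and depreciation described above can be illustrated numerically. The following is a minimal sketch, assuming the learning-by-doing stock Z obeys dZ/dt = mu * a - delta * Z with a constant abatement investment a. The names `mu`, `delta`, `a`, and `z0` are illustrative labels chosen for this example, not the paper's notation, and the numbers in the usage note are hypothetical.

```python
import math


def experience_stock(t, a, mu, delta, z0=0.0):
    """Experience stock at time t under constant investment.

    Closed-form solution of dZ/dt = mu*a - delta*Z with Z(0) = z0
    (an assumed proportional-depreciation rule, not the paper's equation).
    """
    if delta == 0:
        # Without depreciation, experience simply accumulates linearly.
        return z0 + mu * a * t
    steady = mu * a / delta
    # The gap to the steady state decays exponentially at rate delta.
    return steady + (z0 - steady) * math.exp(-delta * t)


def steady_state_experience(a, mu, delta):
    """Long-run experience level where accumulation balances depreciation."""
    return mu * a / delta
```

For example, with `a = 2.0`, `mu = 0.5`, and `delta = 0.1`, the stock converges to `steady_state_experience(2.0, 0.5, 0.1)`, so a higher depreciation rate lowers the experience that a given investment level can sustain.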
In this model, each of the upstream and downstream regions of the basin has a stock of pollution at each time, and the two stocks can be expressed by the following two differential equations:
where the decay terms represent the pollution removed by self-purification. Natural water has a certain self-purification ability that varies with time and temperature. Without loss of generality, it is assumed that the basin has a purification rate of water pollution, namely, the coefficient of the self-purification ability of water. The transfer term is the amount of pollution transferred from upstream to downstream, and it is governed by a transfer coefficient.
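The stock dynamics described above can be sketched as a pair of coupled ordinary differential equations. The following is a minimal illustration under stated assumptions, not the paper's equations: net emissions (emissions minus abatement) are held constant, both stocks decay at a self-purification rate `theta`, and a share `phi` of the upstream stock is transferred downstream per unit of time. All names and numbers are hypothetical.

```python
def simulate_pollution(e1, e2, a1, a2, theta, phi, p1_0, p2_0, horizon, dt=0.01):
    """Explicit Euler integration of an assumed two-region stock model:

        dP1/dt = e1 - a1 - (theta + phi) * P1
        dP2/dt = e2 - a2 + phi * P1 - theta * P2
    """
    steps = int(round(horizon / dt))
    p1, p2 = p1_0, p2_0
    for _ in range(steps):
        dp1 = e1 - a1 - (theta + phi) * p1
        dp2 = e2 - a2 + phi * p1 - theta * p2
        p1 += dt * dp1
        p2 += dt * dp2
    return p1, p2


def steady_state(e1, e2, a1, a2, theta, phi):
    """Stocks at which both time derivatives of the assumed model vanish."""
    p1 = (e1 - a1) / (theta + phi)
    p2 = (e2 - a2 + phi * p1) / theta
    return p1, p2
```

Calling `simulate_pollution(10, 8, 2, 1, 0.1, 0.05, 0.0, 0.0, horizon=200)` should land close to `steady_state(10, 8, 2, 1, 0.1, 0.05)`. In this sketch the downstream steady state rises with the transfer coefficient, which is the transboundary externality the model is built around.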
Each region holds an initial number of emission permits. It is assumed that the emission trading market is fully competitive and that the price of emission trading rights is constant. If a region's emissions exceed its initial allocation, it can purchase emission rights in the permit market. Conversely, if it has a surplus of emission rights, the surplus can be reserved for the next year or sold in the emission permit trading market.
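The trading rule above amounts to a simple signed cash flow. The following is a minimal sketch, assuming any surplus is sold immediately rather than banked for the next year; the function and argument names are illustrative and do not come from the paper.

```python
def permit_trading_income(price, quota, emission):
    """Instantaneous income from permit trading at a constant price.

    Positive when emissions are below the initial quota (surplus sold),
    negative when emissions exceed it (shortfall purchased).
    """
    return price * (quota - emission)
```

With a hypothetical price of 2 and a quota of 10, emitting 12 costs the region 4 in permit purchases, while emitting 7 earns it 6 from selling the surplus.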
Given the above assumptions, we can get the concrete expression of the income function of the two regions:
We define a conversion rate of learning by doing for each investment. The corresponding terms in the income functions represent the cost savings brought by learning by doing, and they can also be considered the income from learning-by-doing conversion.
3. Non-Cooperative Game
The amount of pollution abatement investment provided by the downstream region in the same basin will affect the investment enthusiasm of the upstream region in pollution control and the amount of pollutant discharge in the region. In turn, the amount of pollutant discharge in the upstream region will affect the amount of pollutant discharge in the downstream region, which constitutes a dynamic game relationship between the two sides. Considering the decision making in continuous time, this constitutes a dynamic differential game relationship.
Under this model, both regions aim to maximize the net present value of their own long-term income. The pollution discharge in the upstream region affects the income of the downstream region through the pollution stock in the shared water basin. The independent discharge decisions of the two regions therefore constitute a differential game problem in which each region chooses its control variables, subject to the evolution of the state variables, to maximize the net present value of its own income.
The current goal of region 1 is to maximize the expected present flow of instantaneous net revenue in terms of the emission path and the abatement level. We describe this issue as
Likewise, the current goal of downstream region 2 is to maximize the expected present flow of instantaneous net revenue in terms of the emission path and the abatement level. Similarly, the optimization problem of the downstream region can be given as
The current-value Hamiltonian function of equation (10) is
The current-value Hamiltonian function of equation (12) is
Here, the dynamic adjoint variables are associated with the state equations. These dual variables, also called shadow prices or costate variables, are Lagrange multipliers; they are the derivatives of the two players' value functions, i.e., revenues, with respect to the pollution stock.
To maximize (14) and (15), the first-order conditions of the current-value Hamiltonian functions are the following:
The current-value costate equations, together with their transversality conditions, are
Solving equations (16)–(20), we have
According to the actual situation, here we focus on the analysis of and , namely, .
In the long run, as T ⟶ ∞, the system tends to a steady state. We use a superscript to identify the noncooperative equilibrium results. Equations (21)–(26) are standard first-order differential equations. Substituting into equations (29)–(31) and solving under the steady-state equilibrium conditions, we get
Further, substituting equations (32)–(34) into equation (8), collating, and solving, we obtain the optimal trajectory of the pollution stock in the upstream and downstream regions under the noncooperative game as follows:
4. Cooperative Game
Now consider another model in which the upstream and downstream regions of the basin reach an agreement, either by setting up a joint decision-making department or by accepting unified decision making from a higher management department, and jointly coordinate the pollution discharge strategies of the two regions with the goal of maximizing the total net present value of long-term income. The joint pollution discharge decision of the upstream and downstream regions then constitutes a differential game problem that takes the instantaneous emission discharge (emission discharge rate) as the control variable and the pollution stock in the water area, the investment in pollution abatement, and learning by doing as state variables, with the objective of maximizing the total net present value of the benefit of the whole basin. We describe this issue as
The current-value Hamiltonian function of equation (37) is
Here, the dynamic adjoint variables are associated with the state equations. These dual variables, also called shadow prices or costate variables, are Lagrange multipliers; they are the derivatives of the two players' value functions, i.e., revenues, with respect to the pollution stock.
To maximize (39), the first-order conditions of the current-value Hamiltonian function are the following:
The current-value costate equations, together with their transversality conditions, are
Solving equations (40)–(44), we have
Similarly, in the long run, as T ⟶ ∞, the system tends to a steady state. We use a superscript to identify the cooperative equilibrium results. Equations (45)–(49) are standard first-order differential equations; solving them and then substituting the results into equations (52) and (53), we get
Further, substituting equations (54) and (55) into equation (8), collating, and solving, we obtain the optimal trajectory of the pollution stock in the upstream and downstream regions under the cooperative game as follows:
5. Numerical Simulation
Through the analysis mentioned above, we have obtained the optimal emission, pollution abatement investment, and pollution stock under noncooperative game and cooperative game. In this section, we will analyze their trajectories of each variable and find the difference between the noncooperative model and cooperative model and simulate the optimal emission decision making and investment level of pollution abatement. The parameters used in the numerical examples are presented in Table 1, and we use the version 7.0 of Wolfram Mathematical Matlab to obtain the numerical solutions. The parameters are as follows [39].

5.1. Comparative Analysis of Optimal Emission Level
Drawing on [15], in this section, we simulate the dynamic trajectory of optimal emission in the basin over a finite-planning horizon, i.e., T = 10, as shown in Figures 1–3.
Figure 1 simulates the dynamic trajectory of upstream and downstream pollution emissions in the noncooperative decision-making model. Figure 2 shows the dynamic trajectory of pollution emissions when the upstream and downstream regions adopt the cooperative decision-making model. The figures show that, excluding other interference factors, over a finite-planning horizon the pollution emission level in the upstream region of the basin is higher than that in the downstream region in both the noncooperative and the cooperative decision-making models. With a planning period of 10, game participants adjust their decisions over time. In the initial stage, because of the pressure of environmental regulation, the pollution level of the upstream and downstream regions of the basin is relatively low. As the finitely repeated game proceeds, the participants gradually adopt a noncooperative Nash equilibrium strategy and increase their pollution emissions. Figure 3 compares the dynamic trajectories of pollution emission under the cooperative and noncooperative decision-making models. Both curves trend upward and converge at the end of the planning period, T = 10. This shows that under the finite-planning horizon, whether the cooperative or the noncooperative decision is adopted at the beginning, the final outcome of the game is the noncooperative Nash equilibrium. This also verifies the previous conclusions.
When T ⟶ ∞, we call it the long-term steady state. In a differential game, the information that participants have at the beginning is similar to that under the finite-planning horizon: they know only the initial state of the system. At different times t, participants in the game take different decisions, so the optimal strategy is a time-dependent dynamic strategy. Next, we simulate the dynamic trajectory of optimal emission in the basin in the long-term steady state, as shown in Figures 4–6. Under the assumption of the long-term steady state, Figure 4 simulates the dynamic trajectory of pollutant emission in the upstream and downstream regions of the basin in the noncooperative game model. Figure 5 depicts the dynamic trajectory of pollution emissions in the cooperative game model.
Unlike the finite-planning horizon case, the pollution emission decisions of the game participants in the long-term steady state gradually stabilize over time, and the emission level tends to a stable value. Compared with the initial stage, total emission shows a downward trend. Figure 6 summarizes the pollution emission level under the cooperative and noncooperative decision-making models. Over time, the pollution emission level in the cooperative game shows a significant downward trend, and the cooperative emission in the stable state is lower than that in the noncooperative decision-making model.
5.2. Comparative Analysis of Optimal Pollution Abatement Investment
As in the comparative analysis of the optimal pollution emission level, we simulate the dynamic trajectory of the optimal pollution abatement investment under the finite-planning horizon.
Figures 7 and 8 show the dynamic trajectory of optimal pollution abatement investment in the noncooperative and cooperative decision-making models, respectively. In Figure 7, the investment intensity of downstream pollution abatement in the noncooperative model is higher than that in the upstream region. The investment level of both regions increases at first and then decreases, which shows that the noncooperative investment game over a finite-planning horizon is unstable and that the later stage of the finitely repeated game tends to the noncooperative Nash equilibrium. Even under the cooperative investment decision model, upstream and downstream pollution abatement investment tends to the noncooperative Nash equilibrium, and the downward trend of the investment level is more obvious, as can be seen from Figure 8. However, comparing the two investment decision-making methods overall, Figure 9 shows that the pollution abatement investment intensity in the noncooperative decision-making model is significantly higher than that in the cooperative decision-making model and that the change of investment intensity is unstable.
From the perspective of "learning by doing," the accumulation effect of knowledge plays a positive role during most of the early stage of the noncooperative investment decision-making model, and the depreciation effect becomes prominent at the end of the planning period. This can be seen from the trend of the curves in Figure 7. Similarly, Figure 8 shows that in the cooperative investment decision-making model the depreciation effect inhibits pollution abatement investment more strongly than the accumulation effect promotes it. The process and degree of change of the accumulation and depreciation effects under the two investment decision-making models are reflected in Figure 9. Furthermore, the investment level of noncooperative pollution abatement under the finite-planning horizon is significantly higher than that of cooperative pollution abatement, that is, the accumulation effect of learning by doing is more obvious than the depreciation effect.
Unlike the finite-planning horizon case, the pollution abatement investment trajectory in the long-term steady state becomes more stable over time, that is, it tends to a stable value. As shown in Figure 10, in the short term, pollution abatement investment under the noncooperative decision-making model increases. After rising to a certain level, it falls sharply and finally stabilizes. The pollution abatement investment trajectory of the cooperative decision-making model is very different: investment grows smoothly until it reaches a stable value (Figure 11). Figure 12 shows the significant difference between the investment trajectories of the two decision models. It is worth noting that in the long-term steady state, the stable level of cooperative investment is higher than that of noncooperative investment, that is, the accumulation effect of pollution abatement investment is stronger than the depreciation effect.
5.3. Comparative Analysis of Optimal Pollution Stock
To study the change of the pollution stock in the upstream and downstream regions of the basin and to understand the movement of the stock level more clearly, we simulate the optimal pollution stock trajectories under noncooperative and cooperative decision making, respectively, as shown in Figures 13 and 14. The change in the level of the pollution stock under the two decision-making models is relatively similar. Under the finite-planning horizon, when the noncooperative decision-making model (Figure 13) is adopted, the pollution stock in the upstream region decreases significantly but rebounds at the end of the planning period, while the pollution stock in the downstream region increases at the beginning of the plan and decreases in the later stage. In the cooperative model (Figure 14), the pollution stock in the upstream region also decreases significantly and tends to be stable in the later stage, while the pollution stock in the downstream region shows an upward trend, that is, the amount of pollutants increases. This shows that the regional pollution abatement decision under the finite-planning horizon does more to improve water environment quality upstream, whereas the opposite holds for the downstream region.
The change of the water pollution stock in the basin also reflects the fact that the dynamic game under the finite-planning horizon ends in a noncooperative Nash equilibrium strategy. Figure 15 also shows that the dynamic trajectory of the pollution stock under the two decision-making models is U-shaped, that is, the trend first declines and then rises. On the whole, the noncooperative decision-making model reduces the water pollution stock in the basin, while the cooperative model does the opposite.
Because of the limited planning period and the dynamic game decision making, the fluctuation of pollutants in the basin under the finite-planning horizon is small, and it is difficult to reach the ultimate goal of regional water pollution control. As in the previous analysis, we also simulate the dynamic trajectory of the pollution stock in the long-term steady state, as shown in Figures 16–18. Figures 16 and 17 show the dynamic trajectory of the pollution stock under noncooperative and cooperative game decisions, respectively (when T ⟶ ∞). The trajectories in the two figures are very similar. The upstream pollution stock declines sharply in both the noncooperative and the cooperative decision-making models, while the downstream pollution stock first rises and then falls. Both eventually stabilize, but at different stationary values.
To compare which decision-making model reduces the pollution stock in the river basin more effectively, we simulate the stock trajectories under the noncooperative and cooperative decision-making models together (as shown in Figure 18). The cooperative decision-making model is found to reduce the stock of water pollution in the basin effectively. Compared with the stock change under the finite-planning horizon, the long-term steady state is a more optimized dynamic game model. That is to say, the cooperative decision-making model in the long-term steady state can effectively reduce the stock of water pollution, thus improving the water environment of the basin.
6. Discussion
6.1. Further Analysis of the Optimal Emission Rate
Under the finite-planning horizon, i.e., finite T, the instantaneous emission difference of water pollution in the basin is
In this case, the difference in instantaneous emission rates depends on the time t and the length of the planning period T. indicates that the longer the planning period is, the lower the equilibrium emission rate is. When T ⟶ ∞, the equilibrium emission rate approaches the equilibrium solution of the long-term steady-state differential game.
Figures 19 and 20 depict the trajectory curve of optimal instantaneous emission rate under two decision models (noncooperative decision model and cooperative decision model) during different planning periods (T = 5, T = 10, and T = 15).
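The dependence of the equilibrium emission rate on the planning period T can be illustrated with a deliberately simplified single-region sketch. It is not the model of this paper: it assumes a quadratic emission benefit e(A − e/2), a linear damage cost D·P, natural decay of the stock at rate δ, discount rate r, and zero scrap value at T. Under these assumptions the shadow cost of pollution is D/(r + δ)·(1 − exp(−(r + δ)(T − t))), and the emission rate is A minus that shadow cost. All names and parameter values below are hypothetical.

```python
import math


def shadow_cost(t, T, damage=1.0, discount=0.05, decay=0.1):
    """Shadow cost of one unit of pollution stock at time t for horizon T.

    Assumed setting (not the paper's model): linear damage `damage * P`,
    discount rate `discount`, self-purification rate `decay`, and zero
    scrap value at the end of the planning period.
    """
    k = discount + decay
    return damage / k * (1.0 - math.exp(-k * (T - t)))


def emission_rate(t, T, benefit_intercept=10.0, **params):
    """Emission rate under the assumed quadratic benefit e * (A - e / 2)."""
    return max(benefit_intercept - shadow_cost(t, T, **params), 0.0)


def steady_state_emission(benefit_intercept=10.0, damage=1.0,
                          discount=0.05, decay=0.1):
    """Limit of the emission rate as the planning period T goes to infinity."""
    return max(benefit_intercept - damage / (discount + decay), 0.0)


if __name__ == "__main__":
    # Same planning periods as in Figures 19 and 20, with hypothetical parameters.
    for T in (5, 10, 15):
        print(f"T = {T:2d}: emission rate at t = 0 is {emission_rate(0.0, T):.4f}")
    print(f"steady state: {steady_state_emission():.4f}")
```

In this sketch the emission rate at any given time falls as T grows and converges to the steady-state value, while within each finite horizon it rises toward the end of the planning period because the shadow cost of pollution shrinks to zero at T.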
When T ⟶ ∞, , the instantaneous emission difference in the basin area is
The results show that under cooperative decision making, pollution emissions in both regions are smaller than under noncooperative decision making. In the long-term steady state, the reduction in pollution emissions depends on the unit damage cost, the discount rate, and the self-purification capacity of the water body.
6.2. Further Analysis of the Optimal Pollution Abatement Investment Level
From the perspective of the river basin, in the upstream region, when , the investment intensity of pollution abatement has , indicating that under noncooperative game conditions within a finite-planning horizon, the higher the market price of emission rights is, the greater the optimal pollution abatement investment is. Conversely, when , there is , indicating that the higher the market price of emission rights is, the smaller the optimal pollution abatement investment will be. In the downstream region, the investment in pollution abatement is , indicating that the higher the market price of emission rights is, the greater the optimal pollution abatement investment is. Under the cooperative game investment decision-making model, there are , which shows that the higher the market price of emission rights is, the greater the optimal investment intensity of pollution abatement is.
Under both the long-term steady state and the finite-planning horizon, both decision models have , indicating that the smaller the cost of pollution abatement investment is, the greater the optimal investment intensity of pollution abatement is; , and the difference of pollution abatement cost under the noncooperative decision-making model is inversely proportional to the investment intensity, which shows that the higher the investment cost of the downstream participants compared with upstream, the greater the optimal investment intensity of pollution abatement.
Similarly, under both the finite-planning horizon and the long-term steady state, both game decision models have . This also verifies the accumulation effect and depreciation effect of pollution abatement investment. It is worth noting that under the finite-planning horizon, there are , indicating that downstream pollution abatement investment crowds out upstream investment. Further analysis found that , , and ; that is, the lower the discount rate, the higher the environmental damage cost, and the smaller the self-purification capacity of the water body, the greater the optimal pollution abatement investment.
The investment difference under the infinite-horizon condition is
which indicates that the investment in pollution abatement in both regions is greater under cooperative decision making than under noncooperative decision making. In the long-term steady state, the increase in investment intensity depends on the unit damage cost of pollution, the discount rate, the difference of investment cost in the upstream region, and the self-purification coefficient of the water body.
6.3. Further Analysis of Optimal Pollution Stock
The change in the pollution stock level depends on the initial emission and on the investment intensity of pollution abatement during the game period. Under both the finite-planning horizon and the long-term steady state, both decision models can reduce the level of the pollution stock. In the long-term steady state, whether the decision is cooperative or noncooperative, there are , . This shows that the accumulation effect of pollution abatement investment reduces the pollution stock, while the depreciation effect increases it. Further analysis shows that there are , which means that the accumulation effect of upstream pollution abatement investment also reduces the downstream pollution stock, that is, the downstream environment benefits. Conversely, the depreciation effect hinders the improvement of the downstream water environment.
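The opposite signs of the accumulation and depreciation effects can be illustrated with a minimal two-state sketch. It is an assumed toy system, not the state equations of this paper: abatement experience K grows with a constant investment and depreciates at a constant rate, and the pollution stock P grows with a constant emission rate, is reduced in proportion to K, and decays through self-purification. All names and parameter values are hypothetical.

```python
def simulate_stock(depreciation, effectiveness=0.5, investment=1.0,
                   emission=5.0, purification=0.2, p0=20.0, k0=0.0,
                   horizon=200.0, dt=0.01):
    """Forward-Euler integration of an assumed two-state system.

    dK/dt = investment - depreciation * K
    dP/dt = emission - effectiveness * K - purification * P

    Returns the pollution stock P at the end of the simulated horizon.
    """
    k, p = k0, p0
    for _ in range(int(horizon / dt)):
        dk = investment - depreciation * k
        dp = emission - effectiveness * k - purification * p
        k += dk * dt
        p += dp * dt
    return p


def steady_state_stock(depreciation, effectiveness=0.5, investment=1.0,
                       emission=5.0, purification=0.2):
    """Closed-form steady state of the assumed system (dK/dt = dP/dt = 0)."""
    experience = investment / depreciation
    return (emission - effectiveness * experience) / purification


if __name__ == "__main__":
    for theta in (0.2, 0.4):
        print(f"depreciation = {theta}: "
              f"simulated stock {simulate_stock(theta):.4f}, "
              f"steady state {steady_state_stock(theta):.4f}")
```

In this toy system a larger `effectiveness` (stronger accumulation effect) lowers the steady-state stock, while a larger `depreciation` raises it, which mirrors the qualitative signs described above.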
Figures 21 and 22 depict the trajectory curves of the optimal pollution stock under the two decision models (noncooperative and cooperative) for different planning periods (T = 5, T = 10, and T = 15). The longer the planning period, that is, the larger T is, the lower the level of water pollution in the river basin. When T ⟶ ∞, the equilibrium pollution stock approaches that of the long-term steady-state differential game.
7. Conclusion
This paper analyzes a dynamic differential game model of watershed transboundary water pollution abatement and discusses the optimal decision-making problem under noncooperative and cooperative differential games, taking into account the accumulation effect and depreciation effect of learning-by-doing pollution abatement investment. By solving the dynamic equations of the models, we obtain the Nash equilibrium solutions for the instantaneous emission rate, the investment intensity of pollution abatement, and the pollution stock in the upstream and downstream regions. Based on these results, the optimal trajectories of each variable under the long-term steady state and the finite-planning horizon were simulated and analyzed numerically. The results show that the trajectories of the instantaneous emission rate, pollution abatement investment, and pollution stock under the finite-planning horizon differ considerably from those under the long-term steady state:
(i) Under the finite-planning horizon, the game participants in the upstream and downstream regions choose the noncooperative strategy at the end of the planning period, which leads to a significant increase in the instantaneous emission rate of each region. In terms of the amount of pollution emission, the longer the planning period is, the lower the optimal emission rate is in equilibrium. In the long-term steady state, whether the game is noncooperative or cooperative, the instantaneous emission rate of each region tends to a stable value after multiple games, that is, the steady-state level. The long-term steady-state game under cooperative decision making can effectively reduce the amount of pollution emission.
(ii) The dynamics of water pollution abatement investment under the finite-planning horizon and the long-term steady state are quite different. Specifically, under the finite-planning horizon, the investment intensity of pollution abatement is higher in the noncooperative game than in the cooperative game. At the end of the independent investment decision, the game decision tends to a noncooperative Nash equilibrium state. At this time, the crowding-out effect of learning-by-doing pollution abatement investment is also relatively large. In the long-term steady state, the pollution abatement investment trajectory of the cooperative game is relatively stable and there is no obvious crowding-out effect. Therefore, investment continues to rise, and the optimal equilibrium level at the steady state is higher than that under noncooperative decision making.
(iii) Because water pollution flows strongly from upstream to downstream, the trajectories of the upstream and downstream pollution stocks have opposite trends, and this difference is particularly evident under the finite-planning horizon. The level of pollution in the upstream region decreases markedly, while the downstream stock first increases and then tends to be stable. As a result, the total pollution stock level is relatively high. In the long-term steady state, the trajectories of upstream and downstream pollution in the noncooperative and cooperative decision-making models are similar, but the cooperative decision-making model is superior to the noncooperative model in terms of the period of stabilization and the steady state.
Transboundary water pollution control in river basins is a long-term and complex process. Environmental regulation policies that apply only for a short time or within a fixed planning period can play only a local role. Cooperative decision making within the region can obtain the optimal solution of the game.
Data Availability
The variable data used to support the findings of this study are included within the article (Table 1).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This study was supported by the National Social Science Foundation of China (grant no. 18ZDA040), the Humanities and Social Science Foundation of the Ministry of Education of China (grant no. 17JJD790017), and the Evaluation Commission of Social Science Achievements of Hunan Province of China (grant no. XSP20ZDA007).
Copyright
Copyright © 2020 Zhigang Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.