Abstract

This paper analyzes a dynamic Stackelberg differential game model of transboundary water pollution abatement in a watershed and discusses the optimal decision-making problem under non-cooperative and cooperative differential games, taking into account the accumulation and depreciation effects of learning by doing in pollution abatement investment. We use dynamic optimization theory to derive the equilibrium solutions of the models. Numerical simulation is then used to trace and compare the optimal trajectories of each variable under a finite planning horizon and in the long-term steady state. Under the finite planning horizon, the longer the planning period, the lower the equilibrium emission rate, and the investment intensity of pollution abatement is higher in the non-cooperative game than in the cooperative game. In the long-term steady state, cooperative decision making effectively reduces pollution emissions. The pollution abatement investment trajectory of the cooperative game is then relatively stable, with no obvious crowding-out effect: investment rises continuously, and its steady-state equilibrium level is higher than under non-cooperative decision making. The decline in pollution stock under the finite planning horizon is not significant. In the long-term steady state, the trajectories of upstream and downstream pollution are similar in the non-cooperative and cooperative models, but the cooperative model is superior in terms of the time needed to stabilize and the steady state reached.

1. Introduction

In recent years, more and more experts and scholars use dynamic optimal control theory to study various complex pollution control problems. The differential game method based on dynamic optimal control theory provides an effective research tool for the treatment of transboundary pollution abatement. And it can analyze the interaction between the strategies of the participants and the dynamic change trajectory of pollution stock. Kilgour et al. [1] first used game theory to analyze the total amount control of pollution in river basins, taking total amount control of COD (chemical oxygen demand) as an example. Stimming [2] studied a differential game problem of participating enterprises under two kinds of environmental policies: pollution tax and emission permit. It was concluded that the total investment and emission amount under feedback strategy decision making were higher than that in open-loop strategy. List and Mason [3] applied differential game theory to analyze the optimal institutional arrangement of transboundary pollution abatement under suboptimal conditions and established a differential game model under two systems of cooperative and non-cooperative decision making, respectively. The results showed that if the emission yields of the two regions are asymmetric and the initial emission level is low, non-cooperative decision making will be better than cooperative from the perspective of total income. Breton et al. [4] used a two-party finite-time differential game model to analyze the cooperative implementation of environmental projects. The key assumption of the model is that pollution abatement investments in one country can reduce the stock of pollution in other countries, and the cost of pollution damage is linear. Löfgren et al. [5] analyzed the impact of environmental pollution taxes with future uncertainties on manufacturers' pollution control investment decisions. 
Yeung [6] assumed that different governments adopt cooperative and non-cooperative decision making to control environmental pollution and levy a pollution tax, with the ultimate goal of maximizing social welfare. The production sector decides enterprise output on its own, aiming to maximize its own profits. The cooperative differential game of transboundary pollution is analyzed, the dynamic consistency of the cooperative strategy is studied, and the analysis is extended to a stochastic differential game. Antelo and Loureiro [7] established a three-stage game model of pollution abatement investment under oligopolistic competition and found that manufacturers make different pollution investment decisions depending on the symmetry of information about pollution control technology. In [8], a cooperative stochastic differential game of transboundary industrial pollution is presented, and a payment distribution mechanism is derived to maintain subgame consistency. Additionally, several published studies examine transboundary pollution problems from other perspectives, such as renewable resources, clean technologies, harmonization of international and domestic law, abatement cost, and R&D spillovers (for instance, [9–12]). The papers [8, 13, 14] took emission permit trading into account in studying transboundary pollution. This literature has shown that coordinating countries’ emission strategies leads to a lower total level of pollution and a higher total welfare than non-cooperative emission strategies. Chang et al. [15] obtained optimal emission levels and abatement expenditures in a finite-horizon transboundary pollution game with emission trading between two regions and found that cooperation between the regions leads to increased abatement and lower emissions, resulting in a lower pollution stock.
In another paper [16], Chang et al. presented a stochastic differential game to study this kind of problem. More generally, the process of emission permit price is assumed to be stochastic and to follow a geometric Brownian motion (GBM). All the results demonstrate that the stochastic emission permit prices can motivate the players to make more flexible strategic decisions in the games.

The results of theoretical research and empirical analysis have shown that technology acquired through learning by doing may not last indefinitely but may depreciate, and that the “forgetting” or “depreciation” of learning by doing is of great significance to production planning and scheduling [17–21]. When using economics to analyze environmental problems, scholars usually assume that technology is constant or that technological progress is exogenous. In real production, by contrast, technological advancement is an endogenous and dynamic process, influenced by factors such as regional government policies and knowledge acquisition. Jaffe et al. [22] argued that the relationship between technological change and the environment has an important impact on environmental problems. The positive interaction between technology accumulation and good environmental pollution abatement policies should be better understood. Kline and Rosenberg [23] studied a variety of technology industries and found that in some cases, learning by doing in the production process contributed more to technological progress than the original development itself. Bramoullé and Olson [24] argued that the government's Pigouvian tax, an environmental regulation policy, could put pollution abatement on the right technological track. By using the effect of learning by doing over time on the distribution of pollution abatement among heterogeneous technologies, the best conditions for sharing all technologies could be determined. At the same time, it is found that the more mature pollution abatement technology is more inclined to adopt infant technology.

Learning accumulates pollution abatement experience and thus reduces pollution control investment costs. As pollution control experience accumulates, measured by cumulative governance, the cost of governance should fall. In fact, “learning by doing” from the experience of pollution control can promote the transformation of governance processes and technologies, and these changes can reduce costs. However, participants’ net present value income includes not only production income and emission trading income but also pollution abatement costs. Some scholars use the value function of production to measure the value of learning by doing, an approach widely used in the operations research literature, such as [25–27]. Learning by doing is an important source of technological progress. Similarly, the technology of pollution abatement investment needs to be developed continuously in practice. In recent years, many authors have discussed the technology of pollution abatement investment [25] and the technology of investment accumulation [28]. Some studies have addressed environmental policy and abatement cost in the presence of learning by doing [24, 25, 29].

However, the existing literature rarely uses the dynamic effects of accumulation and depreciation of learning by doing to analyze the changing trajectory of investment in transboundary pollution abatement. Our work is in the spirit of the studies by Li and Pan [25] and Zhong and Zhang [30], which first investigated the effect of experience measured by the cumulative abatement investment from time 0 to t. In recent works [31–35], the investment cost falls with accumulated experience. In this paper, it is assumed that the accumulation of pollution abatement investment increases at a constant rate. As knowledge is forgotten, depreciated, or replaced by new knowledge, it depreciates at a certain rate. Similarly, in the actual production and application of pollution abatement technologies, what has been learned by doing is forgotten or replaced by new knowledge. In order to explore the strength of this forgetting or substitution effect for polluting enterprises in the process of pollution abatement investment, and its impact on the instantaneous emission rate and pollution stock, this article builds models, solves for the optimal values of the related variables, and analyzes them. Our main purpose is to study the pollution abatement investment decision under the accumulation and depreciation of knowledge. We therefore use a mature dynamic differential game model to simplify the problem and analyze the changes of the variables more intuitively by numerical simulation.

The rest of this paper is organized as follows. In Section 2, we establish our basic dynamic general equilibrium model. The optimal Nash equilibrium solutions for the instantaneous emission rate, the investment intensity of pollution abatement, and the pollution stock in the upstream and downstream regions under the non-cooperative game and the cooperative game are presented in Sections 3 and 4, respectively. In Section 5, the optimal trajectories of each variable under the finite planning horizon and in the long-term steady-state equilibrium are simulated and analyzed numerically. Some discussion and further analysis are provided in Section 6. Finally, Section 7 concludes the paper.

2. The Basic Model

There are two adjacent regions () in a river basin, which we call upstream region 1 and downstream region 2, in our transboundary pollution model. It is assumed that both regions discharge organic pollutants into the river basin. For region (), production always leads to a quantity of by-products, namely, emissions (). We assume that is the industrial output of upstream and downstream region, which indicates that at time , the industrial production of region 1 and region 2 is and . The instantaneous emissions of pollutants from industrial output are (), that is, at time, the pollution emissions of region 1 and region 2 are and , respectively. It is assumed that pollution emissions in various regions are positively related to industrial production. And the instantaneous linear production function can be expressed as

According to the literature [3, 4, 36], it is assumed that the regional industrial income function is , which can be expressed by the following quadratic functional form in terms of emissions:

where is a positive constant. Thus, the profit function is expressed as a quadratic concave function of .

Environmental pollution damage cost is a linear function of pollution stock ; following [4], the cost function is . Among them, indicates the degree of environmental damage per unit pollution stock to region.

The investment intensity function of pollution abatement is . It is known that pollution abatement can be realized only when technique and labor are invested. So, we should face the abatement cost which could decrease the net revenue. Following [3], we assume that upstream investment abatement cost can be described by following the quadratic form:

Equation (3) means that the marginal cost is increasing in the level of pollution abatement. So, the investment cost of water pollution abatement in downstream region 2 and the investment cost of water pollution control in upstream region 1 can be expressed as follows:

where indicates the investment intensity of downstream region 2 in its local area and indicates the investment intensity of the downstream region in upstream region 1. This functional form captures the idea that the cost to region 2 of investing in region 1 depends on what the downstream region is investing locally, because investors acquire “learning by doing” for the upstream region only after they have collected investment experience in downstream region 2.

Following [24, 35], the experience of applying pollution abatement technology is measured by the cumulative abatement from time 0 to t, that is,

where denotes the initial experience level of applying pollution abatement technology. Similar to the above, is a positive parameter and it represents the differences between the two regions’ ability in accumulating experience. According to the learning-by-doing theory, a larger amount of cumulative experience leads to a decline in the unit cost.

With the rapid development of society and technology, the accumulation of pollution abatement investment increases at a constant rate. And the amount of cumulative experience will lead to a decline in the unit cost. As knowledge is forgotten, depreciated, or replaced by new one, knowledge will depreciate at a certain rate. According to Jorgenson [37] and Griliches [38], a proportional or geometric depreciation rule seems to be a good choice to represent the depreciation of aggregate stock of knowledge.

To simplify, we apply the proportional and linear approach using to represent depreciation rate of the aggregate stock of knowledge evolving over time. Correspondingly, the learning-by-doing function has also changed because of taking knowledge depreciation into consideration:

Here, we define parameters as rate of knowledge accumulation under investment in pollution abatement technology and parameters as depreciation rate of learning by doing. The investment intensity of pollution abatement in upstream region is .
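The interplay of accumulation and depreciation can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the experience stock Z(t) obeys dZ/dt = a·I(t) − d·Z(t), where I(t) is the abatement investment intensity, a is the accumulation rate, and d is the proportional depreciation rate. The function name, symbols, and parameter values are hypothetical and are not the notation or calibration of this paper.

```python
def simulate_knowledge_stock(invest, accumulate_rate, depreciate_rate,
                             z0, horizon, steps=10000):
    """Euler integration of dZ/dt = accumulate_rate * invest(t) - depreciate_rate * Z.

    Assumed illustrative dynamics: experience grows in proportion to current
    abatement investment and decays proportionally (geometric depreciation).
    Returns the list of Z values on the time grid, including the initial value.
    """
    dt = horizon / steps
    z = z0
    path = [z]
    for k in range(steps):
        t = k * dt
        z += dt * (accumulate_rate * invest(t) - depreciate_rate * z)
        path.append(z)
    return path


# Hypothetical parameter values, chosen only to show the qualitative behavior.
ACCUMULATE_RATE = 0.5
DEPRECIATE_RATE = 0.1
CONSTANT_INVESTMENT = 2.0

path = simulate_knowledge_stock(lambda t: CONSTANT_INVESTMENT,
                                ACCUMULATE_RATE, DEPRECIATE_RATE,
                                z0=1.0, horizon=100.0)

# With constant investment, the stock settles where accumulation offsets
# depreciation: Z* = accumulate_rate * investment / depreciate_rate.
steady_state = ACCUMULATE_RATE * CONSTANT_INVESTMENT / DEPRECIATE_RATE
```

Under these assumptions, a higher depreciation rate lowers the level at which experience saturates, so the cost savings from learning by doing are bounded even when investment never stops.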

In this model, the stocks of pollution in the upstream and downstream regions of the basin are and at time. The pollution stock in the two regions can be expressed by the following two differential equations:

where and represent the stocks of self-purification pollution. As time and temperature change, natural water bodies have a certain self-purification ability. Without loss of generality, it is assumed that the basin has the purification rate of water pollution, namely, the coefficient of self-purification ability of water . is the amount of pollution transferred from upstream to downstream. is assumed to be the transfer coefficient, and .
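To make the roles of self-purification and upstream-to-downstream transfer concrete, the following sketch integrates a simplified two-region stock system. It assumes one simple possibility: a fixed fraction of upstream emissions is carried downstream, and each stock decays at a common self-purification rate. Abatement investment is left out, and all names and values are hypothetical rather than the state equations of this paper.

```python
def simulate_pollution_stocks(e1, e2, purify_rate, transfer_coef,
                              p1_0, p2_0, horizon, steps=20000):
    """Euler integration of an assumed two-region pollution stock system:

        dP1/dt = (1 - transfer_coef) * e1 - purify_rate * P1   (upstream)
        dP2/dt = e2 + transfer_coef * e1 - purify_rate * P2    (downstream)

    Emissions e1 and e2 are held constant; returns the terminal stocks.
    """
    dt = horizon / steps
    p1, p2 = p1_0, p2_0
    for _ in range(steps):
        transferred = transfer_coef * e1  # pollution carried downstream
        dp1 = (e1 - transferred) - purify_rate * p1
        dp2 = (e2 + transferred) - purify_rate * p2
        p1 += dt * dp1
        p2 += dt * dp2
    return p1, p2


# Hypothetical values for illustration only.
p1_end, p2_end = simulate_pollution_stocks(
    e1=4.0, e2=3.0, purify_rate=0.2, transfer_coef=0.25,
    p1_0=10.0, p2_0=10.0, horizon=100.0)

# Steady states implied by the assumed dynamics.
p1_star = (1 - 0.25) * 4.0 / 0.2
p2_star = (3.0 + 0.25 * 4.0) / 0.2
```

In this simplified setting, the downstream stock settles at a higher level than its own emissions alone would imply, which is the externality that motivates the game between the two regions.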

The number of initial emission permits in region 1 is and region 2 is . It is assumed that the emission trading market is a fully competitive market and the price of emission trading right is constant at . If the amount of emission exceeds the initial allocation, the emission right can be purchased in the permit market. On the contrary, if there is a surplus of the emission right, it can be reserved for the next year or sold in the emission permit trading market.
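The permit-trading rule described above reduces to a one-line calculation, sketched here under the stated assumption of a constant permit price. The function name and numbers are hypothetical, and the sketch ignores the option of banking surplus permits for the next year.

```python
def permit_trading_income(price, initial_permits, emissions):
    """Net income from permit trading at a constant price.

    Positive when the region emits less than its initial allocation and sells
    the surplus; negative when it must buy permits to cover excess emissions.
    """
    return price * (initial_permits - emissions)


# Hypothetical example: a region holding 10 permits at a price of 2 per unit.
buying_case = permit_trading_income(2.0, 10.0, 12.0)   # emits above its quota
selling_case = permit_trading_income(2.0, 10.0, 7.0)   # emits below its quota
```

A region that over-emits by 2 units pays 4 in this example, while one that under-emits by 3 units earns 6, so the permit price acts as the marginal opportunity cost of each unit emitted.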

Given the above assumptions, we can get the concrete expression of the income function of the two regions:

We define as the conversion rates of learning by doing. Then, , , and represent the cost savings brought by learning by doing, and it can also be considered as the income from learning-by-doing conversion.

3. Non-Cooperative Game

The amount of pollution abatement investment provided by the downstream region in the same basin will affect the investment enthusiasm of the upstream region in pollution control and the amount of pollutant discharge in the region. In turn, the amount of pollutant discharge in the upstream region will affect the amount of pollutant discharge in the downstream region, which constitutes a dynamic game relationship between the two sides. Considering the decision making in continuous time, this constitutes a dynamic differential game relationship.

Under this model, both regions aim to maximize the net present value of their own long-term income. The pollution discharge in the upstream region will affect the income of the downstream region by affecting the pollution stock in the same water basin. The decision-making problem of independent discharge from the two regions constitutes a differential game problem with , , , , and as control variables and , , , , and as state variables, aiming at maximizing the net present value of their respective income.

The current goal of region 1 is to maximize the expected present flow of instantaneous net revenue in terms of the emission path and the abatement level. We describe this issue as

Likewise, the current goal of downstream region 2 is to maximize the expected present flow of instantaneous net revenue in terms of the emission path and the abatement level. Similarly, the optimization problem of the downstream region can be given as

The current-value Hamiltonian of equation (10) is

The current-value Hamiltonian of equation (12) is

where are the dynamic adjoint variables associated with the state equation about . Here, the dual variables , also called shadow prices or costate variables, are Lagrange multipliers, which are the derivatives of the two players’ value functions, i.e., revenues, with respect to the pollution stock .

To maximize (14) and (15), the first-order conditions of current Hamiltonian function are the following:

The current-value costate equations are

where the transversality conditions are .
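For readers less familiar with the method, the generic structure of these conditions can be summarized for a single decision maker. The symbols below (state x, control u, payoff F, dynamics f, discount rate r, costate λ) are generic textbook notation and are not the variables of this model; the terminal condition shown is the one for a finite horizon with a free terminal state and no salvage value.

```latex
\begin{aligned}
&\max_{u(\cdot)} \int_0^T e^{-rt}\, F(x,u)\,dt
  \quad \text{s.t.} \quad \dot{x} = f(x,u),\; x(0) = x_0, \\
&H(x,u,\lambda) = F(x,u) + \lambda\, f(x,u), \\
&\frac{\partial H}{\partial u} = 0, \qquad
 \dot{\lambda} = r\lambda - \frac{\partial H}{\partial x}, \qquad
 \lambda(T) = 0 .
\end{aligned}
```

In a non-cooperative game each player writes such a Hamiltonian for its own payoff while taking the other player's strategy as given, so there is one set of costate variables per player.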

Solving equations (16)–(20), we have

According to the actual situation, here we focus on the analysis of and , namely, .

In the long run, , tends to a steady state. We apply the superscript “” to identify the non-cooperative equilibrium results. Equations (21)–(26) are standard first-order differential equations. Substituting into equations (29)–(31) and solving equations under state equilibrium conditions, we get

Further, substituting equations (32)–(34) into equation (8), collating, and solving, we obtain the optimal trajectory of pollution stock in the upstream and downstream regions under the non-cooperative game as follows:

where and .

4. Cooperative Game

Now, consider another model, assuming that the upstream and downstream regions of the basin reach some agreement, set up a joint decision-making department or unified decision making by a higher management department, and jointly coordinate the pollution discharge strategies of the two regions with the goal of maximizing total net present value of long-term income. Then, the decision-making problem of combined pollution discharge in upstream and downstream regions constitutes a differential game problem, which takes instantaneous emission discharge (emission discharge rate) as control variable and pollution stock in water area, investment in pollution abatement, and learning by doing as state variables, and the total net present value of the whole benefit of the basin is maximized. We describe this issue as

The current-value Hamiltonian of equation (37) is

where are the dynamic adjoint variables associated with the state equation about . Here, the dual variables , also called shadow prices or costate variables, are Lagrange multipliers, which are the derivatives of the two players’ value functions, i.e., revenues, with respect to the pollution stock .

To maximize (39), the first-order conditions of current Hamiltonian function are the following:

The current-value costate equations are

where the transversality conditions are .

Solving equations (40)–(44), we have

Similarly, in the long run, , tends to a steady state. We apply the superscript “” to identify the cooperative equilibrium results. Equations (45)–(49) are standard first-order differential equations; solving the above equations and then substituting the results into equations (52) and (53), we get

Further, substituting equations (54) and (55) into equation (8), collating, and solving, we obtain the optimal trajectory of pollution stock in the upstream and downstream regions under the cooperative game as follows:

where

5. Numerical Simulation

Through the analysis mentioned above, we have obtained the optimal emission, pollution abatement investment, and pollution stock under non-cooperative game and cooperative game. In this section, we will analyze their trajectories of each variable and find the difference between the non-cooperative model and cooperative model and simulate the optimal emission decision making and investment level of pollution abatement. The parameters used in the numerical examples are presented in Table 1, and we use the version 7.0 of Wolfram Mathematical Matlab to obtain the numerical solutions. The parameters are as follows [39].

5.1. Comparative Analysis of Optimal Emission Level

Drawing on [15], in this section, we simulate the dynamic trajectory of optimal emission in the basin at a finite-planning horizon, i.e., T = 10, as shown in Figures 1–3.

Figure 1 simulates the dynamic trajectory of upstream and downstream pollution emissions in a non-cooperative decision-making model. Figure 2 shows the dynamic trajectory of pollution emissions when the upstream and downstream regions adopt a cooperative decision-making model. From the figures, we can easily find out that, excluding other interference factors, at a finite-planning horizon, the pollution emission level in the upstream of the basin is higher than that in the downstream, whether in the non-cooperative decision-making model or cooperative decision-making model. Assuming that the planning period is 10, game participants adjust their decision making with the change of time. In the initial stage, due to the pressure of environmental regulation, the pollution level of the upstream and downstream regions of the basin is relatively low. With the continuous promotion of a limited number of repeated games, the participants in the game gradually adopt a non-cooperative Nash equilibrium strategy to increase the level of pollution emissions. Figure 3 simulates the dynamic trajectory of pollution emission under the cooperative and non-cooperative decision-making model. The curve trajectory shows an upward trend and tends to be the same at the planning period T = 10. It shows that under the finite-planning horizon, whether the cooperative decision or the non-cooperative decision is adopted at the beginning stage, the final game result is the non-cooperative Nash equilibrium. This also verifies the previous conclusions.

When , we call it long-term steady state. In a differential game, the information that participants have at the beginning stage is similar to the finite-planning horizon, and they only know the initial state of the system. At different times t, participants in the game take different decisions. Therefore, the optimal strategy is a time-dependent dynamic strategy. Next, we simulate the dynamic trajectory of optimal emission in the basin at a long-term steady state, as shown in Figures 4–6. Under the assumption of long-term steady state, Figure 4 simulates the dynamic trajectory of pollutant emission in the upstream and downstream regions of the basin in a non-cooperative game model. Figure 5 depicts the dynamic trajectory of pollution emissions in the cooperative game model.

Different from the finite-planning horizon conditions, the pollution emission decision of the game participants under the long-term steady state gradually stabilizes with the passage of time, and the emission level or emission amount tends to be stable. Compared with the initial stage, the total emission has a downward trend. Figure 6 summarizes the pollution emission level under the cooperative and non-cooperative decision-making models. It can be seen that with the passage of time, the pollution emission level during the cooperative game shows a significant downward trend, and the cooperative emission in the stable state is lower than that in the non-cooperative decision-making model.

5.2. Comparative Analysis of Optimal Pollution Abatement Investment

Similar to the comparative analysis of the optimal pollution emission level, we simulate the dynamic trajectory of the optimal pollution abatement investment under the condition of finite-planning horizon.

Figures 7 and 8 show the dynamic trajectory of optimal pollution abatement investment in non-cooperative and cooperative decision-making models, respectively. In Figure 7, the investment intensity of downstream pollution abatement in the non-cooperative model is higher than that in upstream region. The investment level of upstream and downstream increases at first and then decreases, which shows that the decision game of non-cooperative investment at finite-planning horizon is unstable, and the later stage of limited repeated game tends to non-cooperative Nash equilibrium. Even under the upstream and downstream cooperative investment decision model, the upstream and downstream pollution abatement investment tends to non-cooperative Nash equilibrium, and the downward trend of investment level is more obvious. This can be seen from Figure 8. However, as far as the overall situation of the two investment decision-making methods is concerned, Figure 9 shows that the pollution abatement investment intensity in the non-cooperative decision-making model is significantly higher than that in the cooperative decision-making model, and the change of investment intensity is unstable.

From the perspective of “learning by doing,” the accumulation effect of knowledge plays a positive role in most of the early stage of the non-cooperative investment decision-making model, and it has a prominent depreciation effect at the end of the planning period. This can be seen from the trend of the curve trace of Figure 7. Similarly, it can be seen from Figure 8 that the depreciation effect has a stronger inhibition effect on pollution abatement investment than the accumulation effect in the cooperative investment decision-making model. The change process and degree of pollution abatement investment accumulation and depreciation effect under the two investment decision-making models can be reflected in Figure 9. Furthermore, we can also find that the investment level of non-cooperative pollution abatement under finite-planning horizon conditions is significantly higher than that of cooperative pollution abatement, that is, the accumulation effect of learning by doing is more obvious than the depreciation effect.

Different from the finite-planning horizon conditions, the pollution abatement investment trajectory under the long-term steady state will show a more stable trend with the passage of time, that is, it tends to a stable value. As shown in Figure 10, in the short term, the pollution abatement investment under the non-cooperative decision-making model will show an increasing trend. After increasing to a certain level, there will be a sharp downward trend and it will finally tend to be stable. However, the pollution abatement investment trajectory of the cooperative decision-making model shows a big difference. The change of investment is a smooth process that continues to grow until it reaches a stable value (Figure 11). Figure 12 shows the significant difference of investment trajectory between the two decision models. It is worth noting that under the condition of long-term steady state, the stable level of cooperative investment is higher than that of non-cooperation, that is, the accumulation effect of pollution abatement investment is stronger than depreciation effect.

5.3. Comparative Analysis of Optimal Pollution Stock

In order to study the change of pollution stock in the upstream and downstream of the basin and to understand the movement track of the stock level more clearly, we simulate the optimal pollution stock trajectory curves under non-cooperative and cooperative decision making, respectively, as shown in Figures 13 and 14. The change in the level of pollution stock under the two decision-making models is relatively similar. Under the finite-planning horizon, when the non-cooperative decision-making model (Figure 13) is adopted, the pollution stock in the upstream region has a significant decrease but rebounds at the end of the planning period; the pollutants in the downstream regions increase at the beginning of the plan and decrease in the later stage. In the cooperative model (Figure 14), the pollution stock in the upstream also has a significant decrease and tends to be stable in the later stage, while the pollution stock in the downstream shows an upward trend, that is, the amount of pollutants is increased. It shows that the regional pollution abatement decision under the finite-planning horizon has a greater effect on the improvement of the upstream water environment quality, but it is opposite to the downstream region.

The change of water pollution stock in the basin also reflects the fact that the dynamic game under finite-planning horizon is a non-cooperative Nash equilibrium strategy. It can also be seen from Figure 15 that the dynamic trajectory of pollution stock under the two decision-making models is U-shaped (convex), that is, it first declines and then rises. On the whole, the non-cooperative decision-making model will reduce the water pollution stock in the basin, while the cooperative model does the opposite.

Due to the limitation of planning period and dynamic game decision making, the fluctuation of pollutants in the basin under the finite-planning horizon is small, and it is difficult to reach the ultimate goal of regional water pollution control. Similar to the previous analysis, we also simulated a dynamic trajectory of pollution stock at long-term steady state, as shown in Figures 16–18. Figures 16 and 17 show the dynamic trajectory of pollution stock under non-cooperative and cooperative game decisions, respectively (when ). The trajectory changes of the two graphs are very similar. The upstream pollution stock declines sharply in both the non-cooperative and cooperative decision-making models, while the downstream pollution stock first rises and then falls. In the end, both tend to be stable, but the stationary values are not the same.

In order to compare which game decision-making model has more significant effect on the reduction of pollution stock in the river basin, we simulate the stock dynamic trajectory under the non-cooperation and cooperation decision-making model (as shown in Figure 18). It was finally found that the cooperative decision-making model can effectively reduce the stock of water pollution in the basin. Compared with the stock change of finite-planning horizon, the long-term steady state is a more optimized dynamic game model. That is to say, the cooperative game decision-making model under the long-term steady state can effectively reduce the stock of water pollution, thus improving the water environment pollution of the basin.

6. Discussion

6.1. Further Analysis of the Optimal Emission Rate

Under the condition of finite-planning horizon, i.e., limited T, the instantaneous emission difference of water pollution in the basin is

Under this condition, the difference in instantaneous emission rates depends on the time t and the length of the planning period T. indicates that the longer the planning period is, the lower the equilibrium emission rate is. When T ⟶ ∞, the equilibrium emission rate approaches the equilibrium solution of the long-term steady-state differential game.
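The dependence of the emission rate on T can be illustrated with a deliberately simplified single-region, linear-state model. This is an assumption made here for illustration only and is not the upstream-downstream model solved in this paper: benefit a·E − E²/2 from emitting at rate E, linear damage D·P on the stock P, stock dynamics dP/dt = E − δP, discount rate r, and no salvage value at T. Under these assumptions the costate is λ(t) = −D/(r + δ)·(1 − e^(−(r+δ)(T−t))), so the emission rate is E(t; T) = a + λ(t). The names a, D, r, and delta and their default values are hypothetical.

```python
import math


def emission_rate(t, T, a=10.0, D=1.0, r=0.05, delta=0.1):
    """Emission rate E(t; T) in the assumed linear-state sketch.

    a, D, r, delta are illustrative values, not parameters from the paper.
    The shadow cost of pollution falls to zero as t approaches T.
    """
    if not 0.0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    shadow_cost = D / (r + delta) * (1.0 - math.exp(-(r + delta) * (T - t)))
    return a - shadow_cost


def steady_state_emission_rate(a=10.0, D=1.0, r=0.05, delta=0.1):
    """Limit of E(t; T) as T goes to infinity in the same sketch."""
    return a - D / (r + delta)


if __name__ == "__main__":
    # Compare the three planning periods used in Figures 19 and 20 at t = 2.
    for T in (5.0, 10.0, 15.0):
        print(f"T = {T:4.1f}: E(2; T) = {emission_rate(2.0, T):.4f}")
    print(f"T -> inf: E = {steady_state_emission_rate():.4f}")
```

In this sketch, a longer planning period gives a lower emission rate at any fixed t, and the rate rises back to a at t = T because the shadow cost of pollution vanishes at the end of the horizon.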

Figures 19 and 20 depict the trajectory curve of optimal instantaneous emission rate under two decision models (non-cooperative decision model and cooperative decision model) during different planning periods (T = 5, T = 10, and T = 15).

When T ⟶ ∞, , the instantaneous emission difference in the basin area is

The results show that in the cooperative decision making, the amount of pollution emission in both regions is smaller than that in non-cooperative decision making. Within a long-term steady state, the reduction of pollution emission is related to unit damage cost, discount rate, and self-purification capacity of the water body.
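The role of the unit damage cost, discount rate, and self-purification capacity in this gap can be seen in a minimal sketch. It assumes, purely for illustration, two regions sharing one pollution stock with linear damage, which is simpler than the upstream-downstream structure of this paper. In such a linear-state setting each region's steady-state emission rate is a − (internalized damage)/(r + δ), where a non-cooperative region internalizes only its own unit damage and a cooperative region internalizes the sum. The function name, parameter names, and numerical values are hypothetical.

```python
def steady_state_emissions(a, damages, r, delta, cooperative):
    """Steady-state emission rates in an assumed shared-stock linear-state game.

    a        : marginal benefit intercept (illustrative)
    damages  : list of unit damage costs, one per region
    r, delta : discount rate and self-purification rate
    Under cooperation every region internalizes the total unit damage;
    otherwise each region internalizes only its own.
    """
    total_damage = sum(damages)
    rates = []
    for own_damage in damages:
        internalized = total_damage if cooperative else own_damage
        rates.append(a - internalized / (r + delta))
    return rates


if __name__ == "__main__":
    params = dict(a=10.0, damages=[0.6, 0.4], r=0.05, delta=0.1)
    non_coop = steady_state_emissions(cooperative=False, **params)
    coop = steady_state_emissions(cooperative=True, **params)
    for i, (n, c) in enumerate(zip(non_coop, coop), start=1):
        print(f"region {i}: non-cooperative {n:.4f}, cooperative {c:.4f}, gap {n - c:.4f}")
```

Under these assumptions the emission reduction achieved by cooperation in one region equals the other region's unit damage cost divided by r + δ, so it grows with the damage cost and shrinks as the discount rate or the self-purification rate rises.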

6.2. Further Analysis of the Optimal Pollution Abatement Investment Level

From the perspective of the river basin, in the upstream region, when , the investment intensity of pollution abatement has , indicating that under non-cooperative game conditions within a finite-planning horizon, the higher the market price of emission rights, the greater the optimal pollution abatement investment. Conversely, when , there is , indicating that the higher the market price of emission rights is, the smaller the optimal pollution abatement investment will be. In the downstream region, the investment in pollution abatement is , indicating that the higher the market price of emission rights is, the greater the optimal pollution abatement investment is. Under the cooperative game investment decision-making model, there are , which shows that the higher the market price of emission rights is, the greater the optimal investment intensity of pollution control is.

Whether in the long-term steady state or under finite-planning horizon, both decision models have , indicating that the smaller the cost of pollution abatement investment is, the greater the optimal investment intensity of pollution abatement is; , and the difference of pollution abatement cost under the non-cooperative decision-making model is inversely proportional to the investment intensity, which shows that the higher the investment cost of the downstream participants to upstream, the greater the investment intensity of the optimal pollution abatement.

Similarly, under the conditions of finite-planning horizon and long-term steady state, both game decision models have . This also verifies the accumulation effect and depreciation effect of pollution abatement investment. It is worth noting that under finite-planning horizon, there are , indicating that downstream pollution abatement investment has a crowding-out effect on upstream investment. Further analysis found that , , and ; that is, the lower the discount rate, the higher the environmental damage cost, and the smaller the self-purification ability of the water body, the greater the optimal pollution abatement investment.

The investment difference under the long-term steady state (T ⟶ ∞) is

which indicates that the investment in pollution abatement in both regions is greater under cooperative decision making than under non-cooperative decision making. In long-term steady state, the increase in investment intensity is related to the unit damage cost of pollution, the discount rate, the difference of investment cost in the upstream region, and the coefficient of self-purification capacity of the water body.

6.3. Further Analysis of Optimal Pollution Stock

The change of pollution stock level depends on the initial emission and the investment intensity of pollution abatement during the game period. Under the finite-planning horizon and long-term steady state, both decision models can reduce the level of pollution stock. Under the condition of long-term steady state, whether it is cooperative game or non-cooperative game decision, there are , . It shows that the accumulation effect of pollution abatement investment reduces the pollution stock, while the depreciation effect increases the pollution stock. Further analysis shows that there are , which also shows that the accumulation effect of upstream pollution abatement investment in the downstream region of the basin will reduce the stock of downstream pollution, that is, the environment in downstream will benefit. Conversely, the depreciation effect is not conducive to the improvement of downstream water environment.

Figures 21 and 22 depict the trajectory curves of the optimal pollution stock for the two decision models (non-cooperative decision model and cooperative decision model) for different planning periods (T = 5, T = 10, and T = 15). It can be seen that the longer the planning period, that is, the larger T is, the lower the level of water pollution in the river basin. When T ⟶ ∞, the equilibrium pollution stock approaches the equilibrium solution of the long-term steady-state differential game.
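The ordering of the stock trajectories by T can be reproduced qualitatively in a minimal numerical sketch. It assumes a single-region linear-state model that is not the model of this paper: the stock follows dP/dt = E − δP, and the emission rate is E(t; T) = a − D/(r + δ)·(1 − e^(−(r+δ)(T−t))), which is the open-loop solution when the benefit is a·E − E²/2 and the damage is D·P. The stock is integrated with a forward Euler scheme. The function name, the initial stock P0, and all parameter values are hypothetical, so only the ordering across T is meaningful here, not the levels.

```python
import math


def simulate_stock(T, t_end=None, P0=20.0, a=10.0, D=1.0, r=0.05,
                   delta=0.1, dt=0.01):
    """Forward-Euler path of the pollution stock on [0, t_end] for horizon T.

    All parameters are illustrative assumptions, not values from the paper.
    Returns the list of stock values at t = 0, dt, 2*dt, ..., t_end.
    """
    t_end = T if t_end is None else t_end
    if t_end > T:
        raise ValueError("t_end must not exceed the planning horizon T")
    n_steps = int(round(t_end / dt))
    P = P0
    path = [P]
    for k in range(n_steps):
        t = k * dt
        # Emission rate implied by the assumed linear-state model.
        E = a - D / (r + delta) * (1.0 - math.exp(-(r + delta) * (T - t)))
        # Stock dynamics: emissions add to the stock, self-purification removes it.
        P += dt * (E - delta * P)
        path.append(P)
    return path


if __name__ == "__main__":
    # Compare the stock at the common date t = 5 for the three planning periods.
    for T in (5.0, 10.0, 15.0):
        print(f"T = {T:4.1f}: P(5) = {simulate_stock(T, t_end=5.0)[-1]:.4f}")
```

In this sketch a longer planning period lowers the emission rate at every date, so the stock at any common date is lower as well, and for very large T the path settles near (a − D/(r + δ))/δ.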

7. Conclusion

This paper analyzes a dynamic differential game model of watershed transboundary water pollution abatement and discusses the optimal decision-making problem under non-cooperative and cooperative differential games, in which the accumulation effect and depreciation effect of learning-by-doing pollution abatement investment are taken into account. By solving the dynamic equations of the models, the Nash equilibrium solutions for the instantaneous emission rate, the investment intensity of pollution abatement, and the pollution stock in the upstream and downstream regions are obtained. Based on these results, the optimal trajectory curves of each variable under long-term steady state and finite-planning horizon were simulated and analyzed numerically. The results show that the trajectories of the instantaneous emission rate, pollution abatement investment, and pollution stock under finite-planning horizon are quite different from those under long-term steady state:

(i) Under the condition of finite-planning horizon, the game participants in the upstream and downstream regions choose the non-cooperative strategy at the end of the planning period, which leads to a significant increase in the instantaneous emission rate of each region. In terms of the amount of pollution emission, the longer the planning period is, the lower the optimal emission rate is in equilibrium. Whether the game is non-cooperative or cooperative, in the long-term steady state, after multiple games, the instantaneous emission rate of each region tends to a stable value, that is, the steady-state level. The long-term steady-state game under cooperative decision can effectively reduce the amount of pollution emission.

(ii) The dynamic changes of water pollution abatement investment under finite-planning horizon and long-term steady state are quite different. Specifically, under the condition of finite-planning horizon, the investment intensity of pollution abatement under the non-cooperative game is higher than that under the cooperative game. At the end of the independent investment decision, the game decision tends to a non-cooperative Nash equilibrium state. At this time, the crowding-out effect of learning-by-doing pollution abatement investment is also relatively large. Under the condition of long-term steady state, the pollution abatement investment trajectory of the cooperative game is relatively stable and there is no obvious crowding-out effect. Therefore, investment continues to rise, and the optimal equilibrium level at steady state is higher than that under non-cooperative decision making.

(iii) Due to the strong flow characteristics of water pollution, the trajectories of pollution stock in the upstream and downstream regions have opposite trends, and this difference is particularly evident under finite-planning horizon. The level of pollution in the upstream region decreased markedly, while the stock in the downstream region increased at first and then tended to be stable, so the total pollution stock level is relatively high. Under the condition of long-term steady state, the trajectories of upstream and downstream pollution in the non-cooperative and cooperative decision-making models are similar, but the cooperative decision-making model is superior to the non-cooperative model in terms of the period of stabilization and the steady state reached.

Transboundary water pollution control in river basins is a long-term and complex process. Environmental regulation policies in a short period of time or within a certain planning period can only play a role in local areas. Cooperative decision making within the region can obtain the optimal solution of the game.

Data Availability

The variable data used to support the findings of this study are included within the article (Table 1).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the National Social Science Foundation of China (grant no. 18ZDA040), the Humanities and Social Science Foundation of the Ministry of Education of China (grant no. 17JJD790017), and the Evaluation Commission of Social Science Achievements of Hunan Province of China (grant no. XSP20ZDA007).