Discrete Dynamics in Nature and Society

Volume 2018, Article ID 3293745, 6 pages

https://doi.org/10.1155/2018/3293745

## A Note on Strategic Stability of Cooperative Solutions for Multistage Games

^{1}School of Mathematics and Statistics, Qingdao University, Qingdao 266071, China

^{2}College of Automation and Electrical Engineering, Qingdao University, Qingdao 266071, China

^{3}Institute of Applied Mathematics of Shandong, Qingdao University, Qingdao 266071, China

Correspondence should be addressed to Hongwei Gao; cmgta2007@163.com

Received 12 January 2018; Accepted 14 October 2018; Published 1 November 2018

Academic Editor: Seenith Sivasundaram

Copyright © 2018 Lei Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The problem of strategic stability of cooperative solutions for multistage games is studied. The sufficient conditions related to discount factors are presented, which guarantee the existence of Nash or strong Nash equilibria in such games and therefore guarantee the strategic stability of cooperative solutions. The deviating payoffs of players are given directly, which are related closely to these conditions and avoid the loss of super-additivity of a class of general characteristic functions. As an illustration, Nash and strong Nash equilibria are found for the repeated infinite stage Prisoner’s dilemma game.

#### 1. Introduction

*Strategic stability* of cooperative solutions in a game means that the outcome of cooperation must be attained in some Nash equilibrium, which is an effective mechanism for guaranteeing the sustainability of a cooperative agreement. The aim is to have the cooperative solution supported by an equilibrium of an associated noncooperative game. See, for example, [1–4].

How a cooperative agreement made at the start can be sustained over time is an important issue in human behavior. Besides strategic stability, *time consistency* and the *irrational-behavior-proof condition* are also effective means of sustaining cooperation.

Time consistency may be described informally as follows: to sustain cooperation over time, each player’s cooperative payoff should belong to the same optimality principle at any time along the cooperative path. The concept of time consistency and the tool used to implement it, the *imputation distribution procedure* (IDP), were initially proposed in [5, 6] and further developed in [7–9].

The irrational-behavior-proof condition requires that the partners in a cooperative agreement be sure that, even in the worst scenario, they will not lose compared with noncooperative behavior. The condition was proposed in [10] and investigated further in [11].

In their seminal paper [12], Grauer and Petrosyan studied the strategic stability of cooperative solutions in an *infinite stage game* (ISG). They introduced a new (general) characteristic function in such a game, which is used both to compute the cooperative solutions and to describe the deviating payoffs of players. This characteristic function is essential for constructing Nash or strong Nash equilibria in the ISG. However, the general characteristic function has a drawback: it may fail to be super-additive, which can lead to the nonexistence of cooperative solutions such as the core.

Motivated by the above observations, in this paper we focus on the problem of strategic stability of cooperative solutions in ISGs. We present sufficient conditions in terms of discount factors and prove that Nash or strong Nash equilibria exist in such games. The deviation payoffs are given directly; they are closely related to the sufficient conditions for strategic stability and avoid the loss of super-additivity of the general characteristic functions. We then study the *repeated infinite stage game* (RISG). As an illustration, Nash and strong Nash equilibria are found for the repeated infinite stage Prisoner’s dilemma game.

The proof technique combines the trigger (penalty) strategy with an appropriate construction of the time-consistent IDP. When some player (coalition) deviates from the cooperative trajectory at some stage, the other players (coalitions) use the trigger strategies from the next stage on. This does not cover the case in which the trigger strategies can be used only after several stages, for instance because the information is delayed or the players in a coalition need time to coordinate their actions (see [13–16]). In a finite stage game, by contrast, constructing Nash or strong Nash equilibria requires stricter conditions, as in the case with perfect information (see [17]).

The theory developed here can be applied to the analysis of dynamic cooperative behavior in society. A typical example is the global pollution control problem, which requires a joint effort of many countries over a long time. See, for example, [18–20].

The paper is organized as follows. In Section 2, the basic model of the ISG is introduced. In Section 3, the sufficient conditions are given and the existence of Nash and strong Nash equilibria in ISGs is proved. In Sections 4 and 5, the RISG and the repeated infinite stage Prisoner’s dilemma game are studied. In Section 6, some concluding discussions are provided.

#### 2. Formal Definitions and Terminology

In this section we introduce the basic model of ISGs (see also [12] for more details).

An *infinite game tree* is an infinite oriented treelike graph with the root , where is the set of vertices and is the set of vertices following after and .

A *single stage game* is a simultaneous -player game corresponding to each vertex in the tree , where is the set of players, is the set of strategies of player and is the payoff of player .

A *transition function* is defined as for each . For each game , the function determines the following stage game .

An *infinite stage game* in the tree is determined by the simultaneous games and the transition function .

Denote a sequence of situations by an -tuple of strategies , where . Define the corresponding sequence of vertices as a *trajectory* (*path*) in the graph , denoted by .

Define as the *payoff of player* in the game . All payoffs in the single stage games are uniformly bounded, which guarantees that the sum (1) exists.

In game , the players possess *complete information*, which means they know the simultaneous game and remember all the strategies in the game history.

Suppose that players in are playing cooperatively with objective

Suppose there exist an -tuple of strategies and a trajectory satisfying (2). Define as an *optimal cooperative trajectory* of .

The *subgame* of the game is played in the subgraph , where is the set of vertices of the subgraph . The payoff of player in the subgame is denoted by .

The *characteristic function* in the subgame is defined in the classical way: , where is the value of the zero-sum game played between coalition acting as player and coalition acting as player , with the payoff of coalition equal to . It is additionally assumed that the values exist for every and . In particular,

Denote the pair of optimal strategies in the game by , where .

Consider a sequence of subgames along an optimal cooperative trajectory . In each subgame , one can obtain the *imputation set* and the *core* :

Suppose the imputation . Define an *imputation distribution procedure* (IDP) as a function , such that

For every , define the *noncooperative infinite stage game* , which differs from the game only in the payoffs along the optimal cooperative trajectory . Suppose that under the situation the path is realized. Denote the payoff in game by where . In the special case when , we have

Let . Game is called a *regularized game* of if IDP is defined in such a way that

In particular, if , is called a *strictly regularized game* of .

From (9) we get

Now suppose that is some optimality principle in the cooperative game , and is the same optimality principle defined in the subgame with initial conditions on the cooperative trajectory. can be the Shapley value, the core, the nucleolus, etc. If , condition (10) gives us the *time consistency* of the chosen imputation (or the IDP ) in game .

An -tuple is a *Nash equilibrium* of game if and only if for all and all .

An -tuple is a *strong Nash equilibrium* of game if and only if for all and all .

#### 3. Existence of Nash and Strong Nash Equilibria

Consider the following inequality with respect to : where is the stage payoff to player if she deviates from her cooperative strategy and plays the best response to her opponents’ cooperative strategies. The above inequality reduces to the following: Suppose the value on the right-hand side of (14) is attained. Let

We can get the following.

Theorem 1. *In the regularized game , for any satisfying , the situation with players’ payoffs as guaranteed by the time-consistent IDP is a Nash equilibrium.*

Now suppose and consider another inequality with respect to : where is the stage payoff to coalition if it deviates from the cooperative strategy and plays the best response to the others’ cooperative strategies. The above inequality reduces to the following: Suppose the value on the right-hand side of (17) is attained. Let

We can also get the following.

Theorem 2. *In the strictly regularized game , for any satisfying , the situation with players’ payoffs as guaranteed by the time-consistent IDP is a strong Nash equilibrium.*

Theorem 1 implies that the cooperative solution (any imputation) can be strategically supported by a specially constructed Nash equilibrium in a regularized game . Theorem 2 implies that the cooperative solution (any imputation from the core) can be strategically supported by a specially constructed strong Nash equilibrium in a strictly regularized game . Since a strong Nash equilibrium is also a Nash equilibrium in the strictly regularized game , the existence of a strong Nash equilibrium implies the existence of a Nash equilibrium. We therefore prove only Theorem 2; the proof of Theorem 1 is similar.

*Proof of Theorem 2. *Consider the situation in the strictly regularized game and define the strategies of player as follows: where is the first vertex along the cooperative trajectory , on which player deviates from and is the -th component of strategy in the zero-sum game .

To prove that the situation is a strong Nash equilibrium in the game , we have to show that for all and all .

It is easy to see that when the -tuple is played, the game develops along the cooperative trajectory . If under the situation the trajectory is also realized, then (20) holds.

Suppose the strategy differs from the strategy in one of the single stage games . Denote the first vertex of path by , on which . In the situation , the deviating coalition cannot obtain more than since, after deviating from , coalition will play against coalition in the zero-sum game , where .

From the time consistency of IDP and condition (17), we then obtain

This completes the proof of Theorem 2.

#### 4. The Case of Repeated Infinite Stage Game

In this section we consider the case when is a *repeated infinite stage game* (RISG), in which a normal-form game is played over infinitely many periods.

In each single stage game , the *characteristic function* is defined by , where is the value of the zero-sum game played between coalition acting as player and coalition acting as player . In each game , one can construct the *imputation set* and the *core* using the characteristic function .
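The definition of the stage characteristic function as a zero-sum value can be made concrete with a small sketch. The payoff numbers below are hypothetical (a standard Prisoner’s dilemma, not taken from the paper), and the helper names `singleton_value` and `grand_coalition_value` are illustrative. The sketch only searches pure strategies and raises an error when no pure saddle point exists, so it is not a general zero-sum solver.

```python
from itertools import product

# Hypothetical stage payoffs (assumption, not from the paper).
# Strategy 0 = cooperate, strategy 1 = defect; entries are (payoff to player 0, payoff to player 1).
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}


def singleton_value(player):
    """Value of the zero-sum game in which `player` maximises her own payoff
    and the opponent minimises it, restricted to pure strategies."""
    def own(mine, theirs):
        profile = (mine, theirs) if player == 0 else (theirs, mine)
        return PAYOFFS[profile][player]

    maxmin = max(min(own(a, b) for b in (0, 1)) for a in (0, 1))
    minmax = min(max(own(a, b) for a in (0, 1)) for b in (0, 1))
    if maxmin != minmax:
        raise ValueError("no pure saddle point; mixed strategies would be needed")
    return maxmin


def grand_coalition_value():
    """Largest total stage payoff the two players can reach together."""
    return max(sum(PAYOFFS[p]) for p in product((0, 1), repeat=2))


v = {
    (0,): singleton_value(0),
    (1,): singleton_value(1),
    (0, 1): grand_coalition_value(),
}
```

With these assumed numbers the function is super-additive, since the grand coalition value exceeds the sum of the two singleton values.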

Consider a sequence of subgames along the cooperative trajectory . Put . The value of game will be equal to

For any imputation or , define the time-consistent IDP as . Then .
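As a numerical illustration of what time consistency means for an IDP, the sketch below assumes that the procedure pays the same stage amount `beta` in every period and that payoffs are discounted geometrically by `delta`. Both symbols and the truncation length `horizon` are assumptions introduced here. Under these assumptions the discounted value of the remaining payments is the same at every stage and equals `beta / (1 - delta)`.

```python
def continuation_value(idp, delta, start):
    """Discounted value, measured at stage `start`, of the payments idp[start:]."""
    return sum(delta ** (m - start) * idp[m] for m in range(start, len(idp)))


delta = 0.9          # assumed discount factor
beta = 3.0           # hypothetical constant stage payment to one player
horizon = 2000       # truncation of the infinite sum (assumption)

idp = [beta] * horizon
target = beta / (1 - delta)   # closed form of the geometric series

# The remaining value is (numerically) the same whether measured at stage 0, 1 or 10.
values = [continuation_value(idp, delta, k) for k in (0, 1, 10)]
```

The truncation error is negligible here because `delta ** horizon` is far below machine precision.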

In the regularized game , the existence of IDP is equivalent to the nonemptiness of the imputation set , i.e., the existence of a solution of the following inequalities:

It can be simplified as

In the strictly regularized game , the existence of IDP is equivalent to the nonemptiness of the core , i.e., the existence of a solution of the following inequalities:

It can be simplified as

Under the cooperative agreement, players will choose their cooperative strategies. But if some player deviates from her cooperative strategy at some stage , she will play her best response to other players’ cooperative strategies and get the payoff at this stage. Suppose other players will choose their trigger strategies from the next stage until the end and the deviator’s future payoff will be . If , player will never deviate from her cooperative strategy. The above inequality can be simplified to when . When , it will be , which always holds. Let

The following theorem can be formulated.

Theorem 3. *If the imputation set defined by (25) is not empty and the discount factor satisfies , then in the repeated infinite stage game , the situation with payoffs guaranteed by the time-consistent IDP is a Nash equilibrium.*
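The deviation argument behind this theorem can be sketched numerically. The code assumes the standard trigger-strategy comparison for a single player: staying on the cooperative path yields `coop / (1 - delta)`, while deviating yields `deviation` once and then `punishment` in every later stage. The parameter names and the function names are illustrative, and the inequality is a textbook reading of the argument rather than a transcription of the paper’s formula. Rearranging it gives the threshold `(deviation - coop) / (deviation - punishment)` whenever `deviation > punishment`.

```python
def critical_discount(coop, deviation, punishment):
    """Smallest discount factor at which a one-shot deviation does not pay.

    coop       -- stage payment on the cooperative path (assumed constant)
    deviation  -- best one-stage payoff from deviating
    punishment -- stage payoff once the others switch to trigger strategies
    """
    if not punishment <= coop <= deviation:
        raise ValueError("expected punishment <= coop <= deviation")
    if deviation == punishment:
        return 0.0  # deviation never pays, any discount factor works
    return (deviation - coop) / (deviation - punishment)


def cooperation_is_stable(coop, deviation, punishment, delta):
    """Direct comparison of the two discounted payoff streams."""
    stay = coop / (1 - delta)
    deviate = deviation + delta * punishment / (1 - delta)
    return stay >= deviate


# Hypothetical numbers: cooperate 3, deviate 5, punished 1.
threshold = critical_discount(3, 5, 1)
```

For these assumed values the threshold is one half, and the direct comparison agrees on either side of it.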

A similar deviation can be considered for coalitions. If some coalition acts individually and deviates from its cooperative strategy at some stage , it gets the payoff at the current stage. From the next stage on, the other players form a coalition and choose the trigger strategies, so the deviating coalition’s future payoff is . Coalition will not deviate from its cooperative strategy if . This can be simplified to when . When , it becomes , which always holds. Let

The following theorem can be formulated.

Theorem 4. *If the core defined by (27) is not empty and the discount factor satisfies , then in the repeated infinite stage game , the situation with payoffs guaranteed by the time-consistent IDP is a strong Nash equilibrium.*

#### 5. Example

To illustrate the theoretical result, we consider a repeated infinite stage game in which the two-person Prisoner’s dilemma game is played at each stage.

For each player , consists of two strategies and . is the payoff of player defined by . Since strategy dominates strategy for each , the situation is a Nash equilibrium in game . However, it is not Pareto optimal, since situation is better for both players.
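The two claims in this paragraph can be checked mechanically for a one-shot Prisoner’s dilemma. The payoff table below is hypothetical (the usual textbook numbers, with "C" for cooperate and "D" for defect) and the helper names are illustrative; the paper’s own payoffs are not reproduced here.

```python
from itertools import product

STRATEGIES = ("C", "D")

# Hypothetical payoffs (assumption): (payoff to player 0, payoff to player 1).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def is_nash(profile):
    """True if no single player gains by changing only her own strategy."""
    for player in (0, 1):
        for alternative in STRATEGIES:
            deviated = list(profile)
            deviated[player] = alternative
            if PAYOFFS[tuple(deviated)][player] > PAYOFFS[profile][player]:
                return False
    return True


def pareto_dominates(a, b):
    """True if profile `a` is at least as good as `b` for everyone and better for someone."""
    pa, pb = PAYOFFS[a], PAYOFFS[b]
    return all(x >= y for x, y in zip(pa, pb)) and any(x > y for x, y in zip(pa, pb))


nash_profiles = [p for p in product(STRATEGIES, repeat=2) if is_nash(p)]
```

Under the assumed table, mutual defection is the only pure Nash equilibrium, and mutual cooperation Pareto dominates it.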

Cooperative behavior for the two players is described by the following strategies: they choose at each stage game. In Table 1, values of the characteristic function and the deviation payoffs in stage games are presented. Using these values, we find the Shapley value as the cooperative solution in each stage game, the imputation from stage , and the time-consistent IDP (see Table 2).
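For readers who want to reproduce a Shapley value computation of this kind, a minimal sketch follows. It averages marginal contributions over all orderings of the players, which is the standard definition. The characteristic function values are hypothetical stand-ins (singletons worth 1, the grand coalition worth 6) and are not the entries of Table 1.

```python
from itertools import permutations


def shapley_value(players, v):
    """Average marginal contribution of each player over all arrival orders.

    players -- iterable of player labels
    v       -- dict mapping frozenset coalitions (including the empty one) to values
    """
    players = tuple(players)
    totals = {i: 0.0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for i in order:
            enlarged = coalition | {i}
            totals[i] += v[enlarged] - v[coalition]
            coalition = enlarged
    return {i: total / len(orders) for i, total in totals.items()}


# Hypothetical stage characteristic function (assumption).
v = {
    frozenset(): 0,
    frozenset({1}): 1,
    frozenset({2}): 1,
    frozenset({1, 2}): 6,
}

phi = shapley_value((1, 2), v)
```

In a symmetric two-player game such as this one, the Shapley value splits the grand coalition value equally.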