Coordinated Control and Estimation of Multiagent Systems with Engineering Applications
Research Article, Open Access
Punishment Effect of Prisoner Dilemma Game Based on a New Evolution Strategy Rule
Abstract
We discuss the effect of punishment in the prisoner’s dilemma game. We propose a new evolution strategy rule that reflects the external factors acting on both players in the evolution game. In general, if punishment exists, the DD (defection-defection) structure (i.e., both players choose strategy D), which is the Nash equilibrium of the game, remains stable and never lets cooperation emerge. However, simulations show that when the new evolution strategy rule is adopted, the DD structure cannot remain stable and its fraction decreases during the game. In fact, the punishment mainly affects the CD (cooperation-defection) structure in the network. Once the fraction of the CD structure reaches a certain level, the punishment keeps the CD structure stable and prevents it from transforming into the CC (cooperation-cooperation) structure. Moreover, considering the stability of the structures and the payoffs that individuals gain, we find that the strategy-change probability, which depends on the payoff, can affect the result of the evolution game.
1. Introduction
Game-theoretic situations are ubiquitous in nature and society [1–23], for example, the invasion of alien species and trade conflicts between two countries. However, how to reconcile the contradiction between the selfish individual and social well-being, and how to maximize the benefit for the whole society, has puzzled scientists for decades. There are two classic models in game theory: the public goods game (PGG) and the prisoner’s dilemma game (PDG). The PGG can be used to study cooperation in games [3]. Wang et al. [5] studied the evolutionary dynamics of the PGG in finite populations. Under these dynamics, players who contributed more could successfully resist invasion and invade others, which helps us understand cooperative behaviors involving contributions in the real world. Furthermore, Wang et al. [6] considered the effect of wealth distribution on the PGG under collective risk to analyze cooperation among rich and poor individuals.
On the other hand, the PDG has been used to study how to eliminate the dilemma between the individual and society [8]. Nowak and May [9] found that spatial structure could help cooperators resist the invasion of defectors, which opened a new field, complex networks, for studying game theory. The vertices represent the individuals and the edges represent the interactions among the players. Tomassini et al. [11] used two kinds of complex network models, regular lattices and random graphs, to study the Hawk-Dove game and found that the fraction of cooperators in the network was related to the gain-to-cost ratio. Heterogeneity, one of the most important properties of complex networks, plays a very important role in the evolution game. Fu et al. [12] found that, in a small-world network, the underlying topological organization of the network could help enhance and sustain cooperative behaviors. Furthermore, Fu et al. [13] presented a punishment strategy with a highly heterogeneous property that could make the cooperators survive and wipe out the defectors. Perc and Szolnoki [15] found that the distribution of wealth and social status could promote cooperation in the evolution game. Ishibuchi and Namikawa [16] studied evolution strategies for the iterated PDG, in which the players were located in the cells of a grid world. They found that the structure could help promote cooperation under random pairing. Roca et al. [18] discussed the effect of spatial structure on the evolution of cooperation. Beyond these results, they offered new insights, such as those concerning the intensity of selection.
In this paper, we focus on the effect of the defection payoff on the evolution game. To simplify the PDG, most of the literature adopts the limit PDG model, in which the punishment is zero. Here, we take the punishment into account in the evolution game. The remainder of this paper is organized as follows. Section 2 gives the model and the strategy rule. Section 3 presents the simulation results and their explanations. Section 4 concludes the paper.
2. The Model and the Strategy Rule of the Evolution Game
There are only two strategies in the PDG: C (cooperation) and D (defection). The two players receive payoffs according to the strategies they select. If one player chooses C and the other chooses D, the individual choosing strategy C gets the lowest payoff . Meanwhile, the individual choosing strategy D gains the highest payoff . If both players choose strategy C or both choose strategy D, they gain the payoff or , with and . We use a matrix to denote the possible strategies and payoffs as follows: In order to study the game concisely, researchers usually take the limited payoff matrix, that is, let element be zero, as introduced by Nowak and May [19]. Here, we use the normal payoff matrix as follows:
Now, we give the strategy rule for the evolution game on the lattice network: if an individual’s payoff is larger than its neighbor’s, it keeps its strategy; otherwise, it randomly imitates the strategy of one of the other individuals that interact with that neighbor. Note that this evolution rule can reflect the external effect on the individuals in the game, as in Figure 1. If individual E’s payoff is larger than F’s, E keeps its strategy unchanged. Otherwise, E randomly imitates the strategy of H, I, or G. At the same time, we can regard H, I, and G as the environment of E, which does not interact with them. In the evolution game, the probability that the strategy changes between individuals and depends on the payoff difference [13]: where characterizes the noise that permits irrational choices.
In the iterated PDG, both players gradually choose strategy D, so the strategy pair (D, D) is the only Nash equilibrium of the PDG. Researchers usually set the punishment benefit to zero to simplify the evolution game. However, in the real world, the punishment benefit is not always zero. When the external effect becomes a very important factor in the evolution game, the effect of punishment cannot be ignored. Moreover, the punishment does not always affect the evolution game negatively.
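The payoff structure and the payoff-dependent probability described above can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses the conventional prisoner's dilemma labels T, R, P, and S (temptation, reward, punishment, and sucker's payoff) with made-up values, and it assumes the Fermi function that is commonly used when the chance of a strategy change depends on a payoff difference and a noise level. The names `payoff` and `adoption_probability`, the noise symbol `K`, and every number are hypothetical and are not taken from the paper.

```python
import math

# Assumed, illustrative payoffs with the usual ordering T > R > P > S and 2R > T + S.
T, R, P, S = 1.5, 1.0, 0.1, 0.0


def payoff(own, other):
    """Payoff of a player using `own` ('C' or 'D') against a player using `other`."""
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return table[(own, other)]


def adoption_probability(own_payoff, model_payoff, K=0.1):
    """Fermi function (assumed form): chance of adopting the model player's strategy.

    K is an assumed noise parameter; a larger K makes the choice more random.
    """
    x = (own_payoff - model_payoff) / K
    # Clamp the exponent so math.exp cannot overflow for extreme payoff gaps.
    x = max(-700.0, min(700.0, x))
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    print(payoff("D", "C"), payoff("C", "D"))
    print(adoption_probability(1.0, 1.0))
```

With this assumed form, equal payoffs give a probability of one half, and the probability of adopting the other strategy grows as the other player's payoff advantage grows.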
3. Main Result and Simulations
In this section, we present simulations on the lattice network with the size ; let the final time step ; and all individuals’ strategies are D in the initial network. We run 100 independent simulations and average the data over the 100 runs for Figures 2–12. From the model, it is easy to see that in the payoff matrix can affect the result of the evolution game. Therefore, we discuss the two different cases for and , respectively. First, we take , , and . For , the game evolves to an equilibrium state in which, under the evolution strategy rule, cooperators and defectors are located alternately on the lattice. When the punishment element exists, an individual will not choose strategy C: because the others choose strategy D, the one selecting strategy C gains no payoff, whereas two players who both select strategy D still get payoffs. In the PDG, both players choosing strategy D is the dilemma of the game. Generally, the punishment helps the DD structure remain stable. From Figure 3, compared with , the number of cooperators in the network increases slowly at the beginning. As the game goes on, the fraction of cooperators for becomes larger than that for . But, by common sense, the result of the game for should be better than that for . Why does this unusual situation happen? If one individual on the lattice selects strategy C, its neighbors get the maximum payoff. However, the individuals connected to those neighbors, whose payoffs are lower than the neighbors’, may select strategy C according to the evolution rule. In addition, the probability of a strategy change depends on the individual’s payoff. Owing to the punishment element , the probability of strategy D transforming into strategy C is large by formula (3). At the same time, the probability of strategy C changing into strategy D is correspondingly small. So, the CD structure formed for is more stable than for , which is why the percentage of cooperators for is higher than that for at the end of the game. We thus find that, under the evolution strategy rule, the DD structure cannot stop cooperation from emerging in the network.
For a fixed , we will see that different can affect the evolution game. Here, let , , , and . From Figure 3, similarly to the above analysis, for , the fraction of cooperators increases more slowly and achieves more profit than for . Therefore, the punishment does not always have a negative effect. Under some particular evolution rules, the punishment can help promote cooperation.
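The payoff comparison around a single cooperator, as discussed above, can be made concrete with a small worked example. The sketch below is not the authors' simulation code. It assumes a square lattice with four nearest neighbors and periodic boundaries, payoffs accumulated over all four interactions, and the conventional labels T, R, P, and S with illustrative values. The function names `lattice_payoffs` and `single_cooperator_gap` are hypothetical.

```python
def lattice_payoffs(grid, T, R, P, S):
    """Accumulated payoff of every site against its four nearest neighbors.

    Assumes a square lattice with periodic boundaries (an assumption of this sketch).
    """
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    n_rows, n_cols = len(grid), len(grid[0])
    totals = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                other = grid[(i + di) % n_rows][(j + dj) % n_cols]
                totals[i][j] += table[(grid[i][j], other)]
    return totals


def single_cooperator_gap(P, T=1.5, R=1.0, S=0.0, size=5):
    """Payoff gap between a defector adjacent to a lone cooperator
    and a defector one step further away (all other sites defect)."""
    grid = [["D"] * size for _ in range(size)]
    mid = size // 2
    grid[mid][mid] = "C"
    totals = lattice_payoffs(grid, T, R, P, S)
    return totals[mid][mid + 1] - totals[mid][mid + 2]


if __name__ == "__main__":
    for P in (0.0, 0.1, 0.3):
        print(f"P={P}: gap={single_cooperator_gap(P):.2f}")
```

Under these assumptions the defector next to the cooperator earns T + 3P and the defector one step further away earns 4P, so the gap between them is T − P and shrinks as the punishment payoff P grows.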
Next, for , the result of the evolution game differs from that for . The cooperators can form triangle clusters to resist the defectors’ invasion efficiently and can expand further in the network [21, 22]. So, the cooperators can break the equilibrium state and clearly gain the advantage by the end of the game. Let , , , and . From Figure 4, for , the fraction of cooperators increases at first; once the percentage of cooperators reaches a certain level, its growth slows down. Furthermore, at the end of the evolution game, the percentage of cooperators is less than .
We illustrate this in Figures 5–7. Under the evolution rule, if an individual chooses strategy C, the neighbors of that individual’s neighbors may also select strategy C. Therefore, the DD structure cannot remain stable: it transforms into the CD structure, and its fraction drops markedly as the game goes on. We can also see in Figure 8 that, for higher , the fraction of the DD structure decreases faster and further. As the cooperators increase in the network, the CD structure increases and, accordingly, the CC structure also increases. Because the cooperators can form the triangle structure to resist the defectors’ invasion and the clusters of cooperators can grow, the fraction of the CC structure can increase. Judging from the rate at which cooperators increase in Figures 6 and 7, the CD structure increases fast. After reaching its peak, the CD structure transforms quickly into the CC structure and then decreases as the evolution game goes on. The effect of the CC clusters is enhanced, which is why the fraction of the CC structure increases as the game goes on. With the effect of the punishment, the CD structure decreases more slowly for than for . For higher , the CD structure remains more stable. This means that the punishment can protect the clusters of cooperators and keep the CD structure stable. Therefore, the fraction of cooperators for is larger than that for at the end of the evolution game. In conclusion, under this particular evolution rule, the punishment affects a certain strategy structure. Table 1 gives the data on the change of the fraction of the CD structure on the lattice network.
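The fractions of the CC, CD, and DD structures tracked in this discussion could be measured as in the sketch below. It is a minimal illustration, assuming that a "structure" is a pair of nearest-neighbor sites on a square lattice with periodic boundaries and that each link is counted once. The function name `structure_fractions` is hypothetical, and the paper's own counting convention may differ.

```python
def structure_fractions(grid):
    """Fraction of nearest-neighbor links that are CC, CD, or DD.

    Assumes a periodic square lattice with at least 3 rows and 3 columns,
    so that the right and down neighbors give every link exactly once.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    counts = {"CC": 0, "CD": 0, "DD": 0}
    for i in range(n_rows):
        for j in range(n_cols):
            neighbors = ((i, (j + 1) % n_cols), ((i + 1) % n_rows, j))
            for ni, nj in neighbors:
                # Sort the two letters so that 'DC' and 'CD' count as the same structure.
                pair = "".join(sorted(grid[i][j] + grid[ni][nj]))
                counts[pair] += 1
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Alternating cooperators and defectors: every link joins a C and a D.
    checkerboard = [["C" if (i + j) % 2 == 0 else "D" for j in range(4)]
                    for i in range(4)]
    print(structure_fractions(checkerboard))
```

Calling this function on the lattice after every time step and averaging over independent runs would give curves of the kind discussed for the CC, CD, and DD structures.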

From the above analysis, we find that a change in the payoff an individual gains can affect the probability. We therefore focus next on the effect of the probability, using the parameter in formula (3). Here, for and , let and . When the parameter becomes small, the probability of a strategy change also becomes small. In Figure 8, for , the fraction of cooperators increases first and is larger than at the end of the game. From Figures 9–11, we can see that the CD and CC structures for increase first and the DD structure decreases first. Then the CD structure for reaches a higher peak; after that, the fraction of the CD structure decreases less than for , and the fraction of the CC structure for in the network is almost always larger than that for . The difference becomes clear on comparison with Figures 4–7. In Figure 8, the fraction of cooperators increases slowly for higher , in contrast to the situation in Figure 4. Why does this difference arise? With a smaller probability, strategy changes are infrequent, which helps the DD structure remain stable. However, under the evolution strategy rule, the DD structure cannot remain stable and transforms into the CD structure. In other words, the smaller probability reduces the effect of the evolution rule to some degree. From Figures 9–11, we can also see that the punishment mainly affects the fraction of the CD structure. For higher , the CD structure gains more and remains more stable. In Figure 12, we find that the larger is, the larger the fraction of cooperators is. Moreover, as increases, the fraction of cooperators in the network increases faster, which implies that the probability can affect the result of the evolution game.
4. Conclusion
In this paper, we have discussed the effect of punishment on the evolution game on a lattice. We proposed an evolution strategy rule that reflects external factors. Under this rule, we find that the punishment can affect the evolution game. The punishment can help the cooperators increase at first, contrary to the common expectation that the DD structure remains stable. In fact, the DD structure cannot remain stable under the evolution rule. Moreover, the punishment affects the result of the evolution game through the CD structure. For higher , once the CD structure reaches its peak, it remains more stable and decreases less. Besides the payoff the players gain, we also find that the probability affects the evolution game. The larger the probability is, the more and the faster the fraction of cooperators increases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work was supported by the National Natural Science Foundation of China under Grant no. 61174116.
References
[1] A. M. Colman, Game Theory and Its Applications in the Social and Biological Sciences, Butterworth-Heinemann, Oxford, UK, 1995.
[2] E. Pennisi, “How did cooperative behavior evolve?” Science, vol. 309, no. 5731, p. 93, 2005.
[3] M. Olson, The Logic of Collective Action: Public Goods and the Theory of Groups, Harvard University Press, Cambridge, Mass, USA, 1965.
[4] H. Su, X. Wang, and Z. Lin, “Flocking of multi-agents with a virtual leader,” IEEE Transactions on Automatic Control, vol. 54, no. 2, pp. 293–307, 2009.
[5] J. Wang, B. Wu, X. Chen, and L. Wang, “Evolutionary dynamics of public goods games with diverse contributions in finite populations,” Physical Review E, vol. 81, no. 5, Article ID 056103, 8 pages, 2010.
[6] J. Wang, F. Fu, and L. Wang, “Effects of heterogeneous wealth distribution on public cooperation with collective risk,” Physical Review E, vol. 82, no. 1, Article ID 016102, 13 pages, 2010.
[7] H. Su, N. Zhang, M. Z. Q. Chen, H. Wang, and X. Wang, “Adaptive flocking with a virtual leader of multiple agents governed by locally Lipschitz nonlinearity,” Nonlinear Analysis: Real World Applications, vol. 14, no. 1, pp. 798–806, 2013.
[8] W. Poundstone, The Prisoner’s Dilemma, Doubleday, New York, NY, USA, 1992.
[9] M. A. Nowak and R. M. May, “Evolutionary games and spatial chaos,” Nature, vol. 359, no. 6398, pp. 826–829, 1992.
[10] H. Su, M. Chen, X. Wang, and J. Lam, “Semi-global observer-based leader-following consensus with input saturation,” IEEE Transactions on Industrial Electronics, vol. 61, no. 6, pp. 2842–2850, 2014.
[11] M. Tomassini, L. Luthi, and M. Giacobini, “Hawks and Doves on small-world networks,” Physical Review E, vol. 73, no. 1, Article ID 016132, 10 pages, 2006.
[12] F. Fu, X. Chen, L. Liu, and L. Wang, “Social dilemmas in an online social network: the structure and evolution of cooperation,” Physics Letters A, vol. 371, no. 1-2, pp. 58–64, 2007.
[13] F. Fu, X. Chen, L. Liu, and L. Wang, “Promotion of cooperation induced by the interplay between structure and game dynamics,” Physica A, vol. 383, no. 2, pp. 651–659, 2007.
[14] H. Su, Z. Rong, M. Chen, X. Wang, G. Chen, and H. Wang, “Decentralized adaptive pinning control for cluster synchronization of complex dynamical networks,” IEEE Transactions on Cybernetics, vol. 43, no. 1, pp. 394–399, 2013.
[15] M. Perc and A. Szolnoki, “Social diversity and promotion of cooperation in the spatial prisoner’s dilemma game,” Physical Review E, vol. 77, no. 1, Article ID 011904, 5 pages, 2008.
[16] H. Ishibuchi and N. Namikawa, “Evolution of iterated prisoner’s dilemma game strategies in structured demes under random pairing in game playing,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 6, pp. 552–561, 2005.
[17] H. Su, M. Z. Q. Chen, J. Lam, and Z. Lin, “Semi-global leader-following consensus of linear multi-agent systems with input saturation via low gain feedback,” IEEE Transactions on Circuits and Systems. I. Regular Papers, vol. 60, no. 7, pp. 1881–1889, 2013.
[18] C. P. Roca, J. A. Cuesta, and A. Sanchez, “Effect of spatial structure on the evolution of cooperation,” Physical Review E, vol. 80, no. 4, Article ID 046106, 16 pages, 2009.
[19] M. A. Nowak and R. M. May, “The spatial dilemmas of evolution,” International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, vol. 3, no. 1, pp. 35–78, 1993.
[20] B. Liu, X. L. Kou, and J. Zhang, “The imitating strategy rule about prisoner’s dilemma game on lattice,” in Proceedings of the 5th International Conference on Intelligent Computation Technology and Automation (ICICTA '12), pp. 719–722, Hunan, China, January 2012.
[21] J. Vukov, G. Szabó, and A. Szolnoki, “Cooperation in the noisy case: Prisoner’s dilemma game on two types of regular random graphs,” Physical Review E, vol. 73, no. 6, Article ID 067103, 4 pages, 2006.
[22] C. K. Chan and K. Y. Szeto, “Decay of invincible clusters of cooperators in the evolutionary prisoner’s dilemma game,” in Applications of Evolutionary Computing, vol. 5484 of Lecture Notes in Computer Science, pp. 243–252, Springer, 2009.
[23] J. Du, B. Wu, P. M. Altrock, and L. Wang, “Aspiration dynamics of multi-player games in finite populations,” Journal of the Royal Society Interface, vol. 11, no. 94, Article ID 20140077, 2014.
Copyright
Copyright © 2014 Dehui Sun and Xiaoliang Kou. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.