Coevolution of Artificial Agents Using Evolutionary Computation in Bargaining Game

Lee, Sangwook

doi:https://doi.org/10.1155/2015/468128

Advances in Multimedia

Special Issue

Advanced Issues on Topic Detection, Tracking, and Trend Analysis for Social Multimedia

Research Article | Open Access

Volume 2015 | Article ID 468128 | https://doi.org/10.1155/2015/468128

Academic Editor: Seungmin Rho

Received 29 Aug 2014

Accepted 23 Oct 2014

Published 03 Aug 2015

Abstract

Analysis of the bargaining game using evolutionary computation is an essential issue in the field of game theory. This paper investigates the interaction and coevolutionary process among heterogeneous artificial agents using evolutionary computation (EC) in the bargaining game. In particular, game performance with regard to payoff through the interaction and coevolution of agents is studied. We present three kinds of EC-based agents (EC-agents) participating in the bargaining game, built on the genetic algorithm (GA), particle swarm optimization (PSO), and differential evolution (DE). The agents' performance under changing conditions is compared. The simulation results show that the PSO-agent is superior to the other agents.

1. Introduction

Current bargaining game research is based on the established theoretical models of Ståhl [1] and Rubinstein [2]. Game theorists, economists, psychologists, and computer scientists have analyzed the underlying bargaining phenomenon, which can be applied in e-commerce applications [3], negotiation problems [4], and dispute resolution [5], to name a few. The game appears to be very simple, but the results are fuzzy and controversial.

Over the past few years, a considerable number of studies have modeled the bargaining game using artificial agents, focusing on interaction within a homogeneous population. However, very few attempts have been made to study interaction within a heterogeneous population. Matwin et al. designed a negotiation support system (NSS) that addresses multiple issues through populations of rules (classifiers) learned by means of GA, thereby supporting a two-party bargaining game [6]. Meanwhile, using an evolution strategy, Page et al. proposed a generalized adaptive dynamic framework that can deal with games in which the payoff is not differentiable [7]. van Bragt and La Poutrè formulated bargaining strategies as finite automata coevolved by a genetic algorithm to discriminate between different opponents without any information about the identity or preferences of their counterparts [8]. Takadama et al. suggested three learning bargaining models based on evolution strategy (ES), learning classifier system (LCS), and reinforcement learning (RL), and evaluated heterogeneous-population interactions [9]. Zhong et al. tried to show that artificial agents with an RL strategy can evolve against fixed rules and rotating rules with better performance [10]. Cooper et al. further utilized the RL strategy to observe the relative speeds of learning by proponents and respondents [11]. Grosskopf studied the combined effect of RL and directional learning (DL) strategies in order to compare the results of the one-shot bargaining game with a proponent and varying respondents and showed that the strategies can coevolve [12].

The above studies have focused on the validity of the artificial agent models and compared the results of homogeneous-population interactions. However, studying only homogeneous-population interactions is a conservative approach, because real-world bargaining involves many behaviors with diverse propensities and tendencies, which characterize many kinds of agents.

In this paper we study interactions of agents in a heterogeneous population. We conducted experiments in which three kinds of evolutionary-computation-based agents play the bargaining game. From the experiments we identify the principal parameters and how much they affect the results of the bargaining game. The patterns of action of the artificial agents are also analyzed according to their strategy. In particular, a bargaining game among EC-agents was conducted to observe their interaction and coevolution.

This paper is organized as follows: Section 2 briefly reviews the sequential bargaining game. Section 3 outlines the design considerations of the artificial agents. In Section 4, the coevolution model among EC-agents is described. The simulation results are presented in Section 5. Finally, the paper concludes with some remarks in Section 6.

2. Sequential Bargaining Game

The sequential bargaining game is a game in which two players divide a fixed sum. According to game theory, the bargaining game has an infinite number of Nash equilibria. In the subgame perfect equilibrium, the last proponent offers the counterpart ε, the lowest nonzero quantity, and the respondent always accepts this minimal proposal, since any ε is better than nothing. Experimental evidence contradicts this strategy: proponents tend to offer the counterpart more than noncooperative game theory predicts, and respondents reject small offers. The rejection of a low offer by the respondent can be seen as punishment. Page et al. reported that “some 60–80% of proponents offer fractions between 0.4 and 0.5, and only 3% offer less than 0.2. They are well advised to do this; indeed, some 50% of respondents reject any split offering them less than one-third of the sum” [7, 13–15]. The discrepancy between game theory and experimental data seems to result from the notion of fairness and the absence of common knowledge of rationality [16–18]. Recently, extensive studies have been carried out on the analysis of the bargaining game through the use of artificial agents [19–21].

A brief review of this work follows. Before that, we define the following terms for clarity.
(i) Payoff: the reward that an agent receives from the game.
(ii) Control parameter: a factor of an EC-agent that can affect the agent's performance in the game.
(iii) Sequential game: a game composed of multiple rounds.

3. Artificial Agent Models

In this section, we discuss the underlying bargaining game phenomenon vis-à-vis simulation models. The game begins by randomly deciding, with equal probability, which player is the proponent and which is the respondent. The proponent chooses a proposal $p_t$, a real number between 0 and 10, which is the amount the proponent is able to pay at round $t$. The respondent chooses a minimal acceptable demand $d_t$, also a real number between 0 and 10, at round $t$. If the proposal meets the demand, that is, if $p_t \ge d_t$, then the proponent earns $10 - p_t$ and the respondent earns $p_t$. If the proposal is not accepted, that is, if $p_t < d_t$, then the two players exchange roles and the game moves to round $t + 1$. Finally, if the deal fails in the last round, that is, $t = 5$ in our experiment, then each player earns nothing.
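The rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `play_game`, the list-based encoding of one value per round, and the use of `>=` as the acceptance test are all assumptions, while the fixed sum of 10 and the five-round limit follow the text.

```python
TOTAL = 10.0      # fixed sum to be divided (from the text)
MAX_ROUNDS = 5    # round limit used in the paper's experiments


def play_game(first, second, total=TOTAL, max_rounds=MAX_ROUNDS):
    """Play one sequential bargaining game.

    `first` holds the per-round values of the player who proposes in round 1,
    `second` those of the player who responds in round 1. An entry is read as
    a proposal when its owner is the proponent in that round and as a minimal
    acceptable demand otherwise. Returns (payoff_first, payoff_second).
    """
    for t in range(max_rounds):
        first_proposes = (t % 2 == 0)   # roles swap after every rejection
        proposal = first[t] if first_proposes else second[t]
        demand = second[t] if first_proposes else first[t]
        if proposal >= demand:          # assumed acceptance rule
            proponent_gain = total - proposal
            respondent_gain = proposal
            if first_proposes:
                return proponent_gain, respondent_gain
            return respondent_gain, proponent_gain
    return 0.0, 0.0                     # no deal by the last round


# Hypothetical strategies, chosen only to exercise both outcomes.
greedy = [2.0, 8.0, 2.0, 8.0, 2.0]
modest = [5.0, 4.0, 5.0, 4.0, 5.0]
print(play_game(modest, greedy))   # deal in round 1
print(play_game(greedy, modest))   # every offer is rejected
```

The second call shows the risk of a hard-line strategy: when every offer falls short of the counterpart's demand, both players end with nothing.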

We introduce three kinds of artificial agents that evolve strategies using genetic algorithms (GA), particle swarm optimization (PSO), and differential evolution (DE). Each of these EC methods starts from an arbitrarily initialized population of trial solutions, which evolves toward better solutions by means of the method's own operators.

Figure 1 shows the structure of an EC-agent, which is variously called a solution, strategy, vector, or position. In a bargaining game, it matters whether a player begins the first transaction as a proponent or as a respondent, and thus each strategy is composed of two vectors. The first vector (the first row in Figure 1) is the strategy used when the EC-agent is the first proponent, and the second vector (the second row) is the strategy used when it is the first respondent.

Figure 1: (a) 1st proponent; (b) 1st respondent.

3.1. GA-Agent Model

Genetic algorithm (GA) is a search algorithm based on the mechanics of natural systems, that is, the law of the survival of the fittest [22]. The GA operators are selection, crossover, and mutation. The fitness value of an individual solution is measured by the payoff that the GA-agent earns in the bargaining game.

In the GA-agent, we use tournament selection, arithmetic crossover, and mutation as GA operators. Tournament selection picks two solutions at random from the current population and chooses the winner between them [23]. Arithmetic crossover sets each gene of the offspring to the average of the two parents' genes. For mutation, we reapply the method used to initialize genes. Figure 2 shows the evolution process of the GA-agent.
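The three GA operators described above can be sketched as follows. This is an illustrative sketch, not the author's code: the function names, the uniform re-draw used for mutation, the way the crossover rate gates crossover, and the generational replacement in `next_generation` are assumptions made for the example.

```python
import random

LOW, HIGH = 0.0, 10.0   # gene range, matching the 0..10 proposals in the text


def tournament_select(population, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    i, j = rng.sample(range(len(population)), 2)
    return population[i] if fitness[i] >= fitness[j] else population[j]


def arithmetic_crossover(parent_a, parent_b):
    """Each offspring gene is the average of the two parents' genes."""
    return [(a + b) / 2.0 for a, b in zip(parent_a, parent_b)]


def mutate(genes, rate, rng):
    """Re-draw each gene uniformly from the gene range with probability `rate`."""
    return [rng.uniform(LOW, HIGH) if rng.random() < rate else g for g in genes]


def next_generation(population, fitness, crossover_rate, mutation_rate, rng):
    """Build a new population of the same size from the current one."""
    children = []
    while len(children) < len(population):
        a = tournament_select(population, fitness, rng)
        b = tournament_select(population, fitness, rng)
        if rng.random() < crossover_rate:
            child = arithmetic_crossover(a, b)
        else:
            child = list(a)   # assumed fallback: copy the first parent
        children.append(mutate(child, mutation_rate, rng))
    return children
```

Because arithmetic crossover averages genes, offspring always lie between their parents; mutation by re-initialization is what reintroduces values outside that range.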

3.2. PSO-Agent Model

Particle swarm optimization (PSO) is a metaheuristic that iteratively improves candidate solutions, called particles, by moving them around the search space according to simple formulae for updating each particle's position and velocity [24]. Each particle's movement is influenced by its own best known position and is also guided toward the best known positions in the search space, which are updated as other particles find better positions.

The PSO algorithm is initialized with a population of individuals placed randomly in the search space and searches for an optimal solution by updating the individuals over generations. In each iteration, the velocity and the position of each particle are updated according to its previous best position ($pbest_i$) and the best position found by the neighbors of the particle ($gbest$). The velocity and position updates are as follows:

$$v_{ij}^{t+1} = w\,v_{ij}^{t} + c_1 r_1 \left(pbest_{ij} - x_{ij}^{t}\right) + c_2 r_2 \left(gbest_{j} - x_{ij}^{t}\right),$$

$$x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1},$$

where $i$ is the index of particles in the swarm, $j$ is the index of positions in the particle, $t$ represents the iteration number, $v_i$ is the velocity vector of the $i$th particle, and $x_i$ is the position vector. Note that $c_1$ and $c_2$ are the positive acceleration constants, $r_1$ and $r_2$ are random numbers uniformly distributed between 0 and 1, and $w$ is the inertia weight.
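A single particle update can be sketched as below, assuming the standard inertia-weight form of PSO: the new velocity is the inertia-weighted old velocity plus two randomly scaled pulls, one toward the particle's own best position and one toward the best position found by its neighbors. The function name `pso_step` and the choice to draw fresh random numbers for every dimension are assumptions for this example.

```python
import random


def pso_step(x, v, pbest, gbest, w, c1, c2, rng):
    """One velocity and position update for a single particle.

    x, v, pbest and gbest are equal-length lists; returns (new_x, new_v).
    """
    new_x, new_v = [], []
    for j in range(len(x)):
        r1, r2 = rng.random(), rng.random()      # uniform in [0, 1)
        vj = (w * v[j]
              + c1 * r1 * (pbest[j] - x[j])      # pull toward own best
              + c2 * r2 * (gbest[j] - x[j]))     # pull toward neighborhood best
        new_v.append(vj)
        new_x.append(x[j] + vj)
    return new_x, new_v
```

When a particle already sits on both best positions, the two attraction terms vanish and only the inertia term moves it, which is a convenient sanity check for an implementation.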

In [25], it was shown that good convergence can be ensured by making the two constants, acceleration and inertia, dependent on each other. Their relation can be expressed using an intermediate parameter $\varphi$:

$$w = \frac{1}{\varphi - 1 + \sqrt{\varphi^{2} - 2\varphi}}, \qquad c_1 = c_2 = \varphi\,w.$$

In the PSO-agent, we use the original version of PSO with the intermediate parameter $\varphi$. Figure 3 shows the evolution process of the PSO-agent.
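As a rough illustration of how one intermediate parameter can fix both constants, the sketch below assumes a relation of this kind that appears in the PSO literature, $w = 1/(\varphi - 1 + \sqrt{\varphi^2 - 2\varphi})$ with the acceleration constant equal to $\varphi w$. The function name `constriction_constants` is hypothetical, and treating this as the exact relation behind the PSO-agent is an assumption.

```python
import math


def constriction_constants(phi):
    """Return (w, c): inertia weight and acceleration constant derived from phi.

    Assumed relation: w = 1 / (phi - 1 + sqrt(phi^2 - 2*phi)), c = phi * w.
    """
    if phi < 2.0:
        raise ValueError("phi must be at least 2 so that the square root is real")
    w = 1.0 / (phi - 1.0 + math.sqrt(phi * phi - 2.0 * phi))
    return w, phi * w


for phi in (2.0, 2.5, 3.0):
    w, c = constriction_constants(phi)
    print(f"phi={phi:.1f}  w={w:.4f}  c={c:.4f}")
```

Under this assumed relation, larger values of the intermediate parameter give a smaller inertia weight, so tuning it trades exploration against convergence speed with a single knob.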

3.3. DE-Agent Model

Differential evolution (DE) is a metaheuristic that iteratively improves candidate solutions with regard to a given measure of quality. In DE, an initial group of solution vectors is first generated at random. The solution vectors are then updated through three processes: making a trial vector, crossover, and replacement. The trial vector is made by combining existing vectors from the population according to the following formula [26]:

$$v = x_{r_1} + F\left(x_{r_2} - x_{r_3}\right),$$

where $x_{r_1}$, $x_{r_2}$, and $x_{r_3}$ are randomly selected solutions in the current population and $F$ is a real positive coefficient. In replacement, if the candidate solution produced by crossover is better than the present solution, the present solution is replaced by the candidate.
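The trial-vector step can be sketched as follows, assuming the common form in which one randomly chosen solution is shifted by the scaled difference of two others. The function name `trial_vector` and the requirement that the three solutions be distinct are assumptions for this example.

```python
import random


def trial_vector(population, f, rng):
    """Combine three distinct, randomly chosen solutions into a trial vector."""
    a, b, c = rng.sample(population, 3)
    # Base vector plus the scaled difference of the other two.
    return [aj + f * (bj - cj) for aj, bj, cj in zip(a, b, c)]
```

Note that the result can fall outside the 0 to 10 range of valid proposals; the text does not say how such values are handled, so no bound handling is applied here.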

In the DE-agent, we use a standard version of DE with uniform crossover. Figure 4 shows the evolution process of the DE-agent. A candidate vector $u$ is generated by a uniform crossover between a randomly selected solution $x$ in the current population and the trial vector $v$ as follows:

$$u_j = \begin{cases} v_j, & \text{if } rand_j \le CF, \\ x_j, & \text{otherwise}, \end{cases}$$

where $rand_j$ is a random number between 0 and 1 and $CF$ is the probability of crossover.
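Uniform crossover and the greedy replacement that follows it can be sketched as below. The function names and the `evaluate` callback are hypothetical, and treating a higher score as better is an assumption based on payoff being the fitness measure.

```python
import random


def uniform_crossover(base, trial, cr, rng):
    """Take each gene from `trial` with probability `cr`, else from `base`."""
    # rng.random() lies in [0, 1), so cr = 1 always takes the trial gene
    # and cr = 0 always keeps the base gene.
    return [t if rng.random() < cr else b for b, t in zip(base, trial)]


def replace_if_better(current, candidate, evaluate):
    """Greedy replacement: keep whichever solution scores higher."""
    return candidate if evaluate(candidate) > evaluate(current) else current
```

Because replacement only ever accepts an improvement, a solution's score never decreases when it is measured against a fixed set of opponents; under coevolution the opponents change, so the same solution may need to be re-evaluated.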

4. Coevolution Model

The coevolution model between two EC-agents in the bargaining game is presented in Figure 5. After the solution groups of the two kinds of EC-agents are randomly generated, each group is evaluated and evolved step by step. When one group of solutions is evaluated, all solutions of the other group serve as counterparts in the bargaining game. Each solution plays the bargaining game twice against each counterpart, once beginning as a proponent and once beginning as a respondent. Finally, the fitness value of the solution is calculated by averaging the payoffs from all of these games. For example, when the counterpart group contains 30 solutions, two games are played against each counterpart, yielding 60 payoffs in total; their sum is divided by 60 to determine the fitness of the solution.
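The fitness calculation described above can be sketched as a short function. The name `fitness`, the representation of a solution as a pair of vectors (opening as proponent, opening as respondent), and the `play` callback that returns both players' payoffs are assumptions for illustration.

```python
def fitness(solution, opponents, play):
    """Average payoff of `solution` over two games against every opponent.

    `solution` and each opponent are (as_first_proponent, as_first_respondent)
    pairs. `play(first, second)` must return (payoff_first, payoff_second).
    """
    total = 0.0
    for opp in opponents:
        # Game 1: the solution opens as proponent, the opponent as respondent.
        total += play(solution[0], opp[1])[0]
        # Game 2: the opponent opens as proponent, the solution as respondent.
        total += play(opp[0], solution[1])[1]
    return total / (2 * len(opponents))
```

With 30 opponents the loop accumulates 60 payoffs and divides by 60, matching the worked example in the text.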

5. Experimental Results

This section presents experimental results based on adaptive EC-agents. Each EC-agent has control parameters that affect its performance. In a GA-agent, the parameters are the probabilities of crossover and mutation; in a PSO-agent, they are the intermediate parameter and the maximum velocity; in a DE-agent, they are the coefficient and the probability of crossover. We examined the impact of varying these parameters on the experimental results.

In order to observe the coevolution among EC-agents in a bargaining game, three experiments on GA-agent versus PSO-agent, GA-agent versus DE-agent, and PSO-agent versus DE-agent were conducted.

5.1. Experimental Environment

In order to create an experimental environment, we set the simulation parameters as follows:
(i) population size: 30;
(ii) maximum iteration: 10,000;
(iii) maximum round in bargaining game: 5;
(iv) number of counterparts: 30 (entire population).
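For readers reproducing the setup, the parameters listed above could be gathered into one configuration object. The class and field names below are hypothetical; only the values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationConfig:
    population_size: int = 30
    max_iterations: int = 10_000
    max_rounds: int = 5
    num_counterparts: int = 30   # the entire opposing population

    @property
    def games_per_evaluation(self) -> int:
        # Each solution plays twice (once per opening role) against every counterpart.
        return 2 * self.num_counterparts


config = SimulationConfig()
print(config.games_per_evaluation)
```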

5.2. Experiment of Single EC-Agent

In this experiment, each EC-agent is tested on the bargaining game against a fixed group of counterpart solutions in order to determine the optimal control parameters of each EC-agent.

5.2.1. GA-Agent

The control parameters of GA-agent are a crossover rate and mutation rate. As shown in Figure 6, the best performance of GA-agent in bargaining game was observed under the crossover rate of 0.9 and mutation rate of 0.05.

Figure 6: (a) crossover rate (mutation rate = 0); (b) mutation rate (crossover rate = 0.9).

5.2.2. PSO-Agent

The control parameters of the PSO-agent are the intermediate parameter $\varphi$ and the maximum velocity $V_{\max}$. As shown in Figure 7, the best performance of the PSO-agent in the bargaining game was observed at the value of $\varphi$ identified in Figure 7 and at $V_{\max}$ = search space/5. Here, the search space (SS) is 10; thus, $V_{\max} = 2$.
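A maximum velocity is typically enforced by clamping each velocity component after the update. The sketch below assumes symmetric clamping to the interval from minus the maximum to plus the maximum, with the maximum set to the search space divided by 5 as in the text; the function name `clamp_velocity` is hypothetical.

```python
def clamp_velocity(v, search_space=10.0, divisor=5.0):
    """Limit every velocity component to [-v_max, v_max], v_max = search_space / divisor."""
    v_max = search_space / divisor   # 10 / 5 = 2 in the paper's setting
    return [max(-v_max, min(v_max, vj)) for vj in v]


print(clamp_velocity([3.5, -0.5, -7.0]))
```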

5.2.3. DE-Agent

The control parameters of the DE-agent are the coefficient $F$ and the crossover rate CF. As shown in Figure 8, the two control parameters make very little difference to the performance of the DE-agent in the bargaining game. Thus, we adopt generally used values for $F$ and CF.

5.3. Experiment of Coevolution between Two EC-Agents

5.3.1. GA-Agent versus PSO-Agent

The result of the bargaining game under coevolution between the GA-agent and the PSO-agent is shown in Figure 9. The GA-agent used the optimal settings determined in Section 5.2.1 and the PSO-agent those in Section 5.2.2. As the figure shows, the PSO-agent is superior to the GA-agent in the coevolution-based bargaining game.

5.3.2. GA-Agent versus DE-Agent

The result of the bargaining game under coevolution between the GA-agent and the DE-agent is shown in Figure 10. The GA-agent used the optimal settings determined in Section 5.2.1 and the DE-agent those in Section 5.2.3. As the figure shows, the GA-agent is superior to the DE-agent in the coevolution-based bargaining game.

5.3.3. PSO-Agent versus DE-Agent

The result of the bargaining game under coevolution between the PSO-agent and the DE-agent is shown in Figure 11. The PSO-agent used the optimal settings determined in Section 5.2.2 and the DE-agent those in Section 5.2.3. As the figure shows, the PSO-agent is superior to the DE-agent in the coevolution-based bargaining game.

5.4. Discussion

First, the performance of the EC-agents with respect to payoff was observed while changing the control parameters. The simulation results show that the control parameters of the GA-agent and the PSO-agent have more influence on performance than those of the DE-agent: the crossover and mutation probabilities of the GA-agent and the intermediate parameter and maximum velocity of the PSO-agent affect performance, whereas the crossover probability and coefficient of the DE-agent have little effect.

Second, the coevolutionary process among the three kinds of EC-agents (GA-agent, PSO-agent, and DE-agent) was tested to observe which shows the best performance in the bargaining game. The simulation results show that the PSO-agent outperforms both the GA-agent and the DE-agent and that the GA-agent outperforms the DE-agent under coevolution in the bargaining game.

To understand why the PSO-agent is the best of the three kinds of EC-agents in the bargaining game, we observed the strategies of the EC-agents after completion of the game. Figure 12 shows the strategies of a GA-agent and a PSO-agent after completion of the game. When the PSO-agent is a proponent, it offers a small quantity to the opponent, and when it is a respondent, it demands a large quantity. In contrast, when the GA-agent is a proponent, it offers a large quantity to the opponent, and when it is a respondent, it demands a small quantity. In the bargaining game between a PSO-agent and a DE-agent, the strategy of the DE-agent is similar to that of the GA-agent in the figure. This indicates that the PSO-agent evolves toward a strategy of gaining as much as possible at the risk of gaining nothing if the transaction fails, while the GA-agent and the DE-agent evolve toward a strategy of completing the transaction regardless of the quantity.

Figure 12: (a) GA-agent strategy; (b) PSO-agent strategy.

6. Conclusion

The interaction and coevolutionary process among heterogeneous EC-agents were studied to observe their performance in the bargaining game. This paper investigated the nature of the interaction and coevolutionary process in order to understand the patterns of action of three kinds of EC-agents and identified the principal parameters that influence the performance of the agents. The simulation results show that the control parameters of the GA-agent and the PSO-agent have more influence on performance than those of the DE-agent. Furthermore, the simulation results show that the PSO-agent is better than the GA-agent and the DE-agent with respect to coevolution in the bargaining game. We expect this analysis of the characteristics of artificial agents to help researchers who study game theory using artificial agents.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

References

I. Ståhl, Bargaining Theory, Stockholm School of Economics, Stockholm, Sweden, 1971.
A. Rubinstein, “Perfect equilibrium in a bargaining model,” Econometrica, vol. 50, no. 1, pp. 97–109, 1982.
T. Omoto, K. Kobayashi, and M. Onishi, “Bargaining model of construction dispute resolution,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 7, pp. 7–12, October 2002.
S. Berninghaus, W. Güth, R. Lechler, and H.-J. Ramser, “Decentralized versus collective bargaining—an experimental study,” International Journal of Game Theory, vol. 30, no. 3, pp. 437–448, 2002.
M. Nakayama, “E-commerce and firm bargaining power shift in grocery marketing channels: a case of wholesalers' structured document exchanges,” Journal of Information Technology, vol. 15, no. 3, pp. 195–210, 2000.
S. Matwin, T. Szapiro, and K. Haigh, “Genetic algorithms approach to a negotiation support system,” IEEE Transactions on Systems, Man and Cybernetics, vol. 21, no. 1, pp. 102–114, 1991.
K. M. Page, M. A. Nowak, and K. Sigmund, “The spatial ultimatum game,” Proceedings of the Royal Society B: Biological Sciences, vol. 267, no. 1458, pp. 2177–2182, 2000.
D. D. B. van Bragt and J. A. La Poutrè, “Co-evolving automata negotiate with a variety of opponents,” in Proceedings of the Congress on Evolutionary Computation (CEC '02), vol. 2, pp. 1426–1431, Honolulu, Hawaii, USA, May 2002.
K. Takadama, Y. L. Suematsu, N. Sugimoto, N. E. Nawa, and K. Shimohara, “Towards verification and validation in multiagent-based systems and simulations: analyzing different learning bargaining agents,” in Proceedings of the 4th Workshop on Multi-Agent Based Simulation, pp. 18–32, 2003.
F. Zhong, S. O. Kimbrough, and D. J. Wu, “Cooperative agent systems: artificial agents play the ultimatum game,” in Proceedings of the 35th Annual Hawaii International Conference on System Sciences, pp. 2169–2177, 2002.
D. J. Cooper, N. Feltovich, A. E. Roth, and R. Zwick, “Relative versus absolute speed of adjustment in strategic environments: responder behavior in ultimatum games,” Experimental Economics, vol. 6, no. 2, pp. 181–207, 2003.
B. Grosskopf, “Reinforcement and directional learning in the ultimatum game with responder competition: experimental economics,” Experimental Economics, vol. 6, no. 2, pp. 141–158, 2003.
A. E. Roth and I. Erev, “Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term,” Games and Economic Behavior, vol. 8, no. 1, pp. 164–212, 1995.
J. H. Kagel, C. Kim, and D. Moser, “Fairness in ultimatum games with asymmetric information and asymmetric payoffs,” Games and Economic Behavior, vol. 13, no. 1, pp. 100–110, 1996.
S. J. Burnell, L. Evans, and S. Yao, “The ultimatum game: optimal strategies without fairness,” Games and Economic Behavior, vol. 26, no. 2, pp. 221–252, 1999.
T. D. Stanley and U. Tran, “Economics students need not be greedy: fairness and the ultimatum game,” Journal of Socio-Economics, vol. 27, no. 6, pp. 657–664, 1998.
R. H. Thaler, “Anomalies: the ultimatum game,” Journal of Economic Perspectives, vol. 2, pp. 195–206, 1988.
R. Suleiman, “Expectations and fairness in a modified Ultimatum game,” Journal of Economic Psychology, vol. 17, no. 5, pp. 531–554, 1996.
S.-C. Chang, J.-I. Yun, J.-S. Lee, S.-U. Lee, N.-P. Mahalik, and B.-H. Ahn, “Analysis on the parameters of the evolving artificial agents in sequential bargaining game,” IEICE Transactions on Information and Systems, vol. 88, no. 9, pp. 2098–2101, 2005.
M. H. Seong and S. Y. Lee, “A bargaining game design using co-evolution analysis between artificial agents,” Advanced Science and Technology Letters, vol. 46, pp. 10–14, 2014.
M.-H. Seong and S.-Y. Lee, “A bargaining game using artificial agents based on genetic algorithms and particle swarm optimization,” International Journal of Software Engineering and Its Applications, vol. 8, no. 5, pp. 205–218, 2014.
J. H. Holland, Adaptation in Natural and Artificial System, University of Michigan Press, Ann Arbor, Mich, USA, 1975.
D. E. Goldberg and K. H. Klösener, “A comparative analysis of selection schemes used in genetic algorithms,” in Foundation of Genetic Algorithms, G. Rawlins, Ed., pp. 69–93, Morgan Kaufmann, San Mateo, Calif, USA, 1991.
R. C. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proceedings of the 6th International Symposium on Micro Machine and Human Science (MHS '95), pp. 39–43, IEEE Service Center, Nagoya, Japan, October 1995.
M. Clerc and J. Kennedy, “The particle swarm-explosion, stability, and convergence in a multidimensional complex space,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 1, pp. 58–73, 2002.
R. Storn and K. Price, “Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces,” Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.

Copyright

Copyright © 2015 Sangwook Lee. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
