Mathematical Problems in Engineering
Volume 2013, Article ID 926267, 8 pages
http://dx.doi.org/10.1155/2013/926267
Research Article

Multiagent Reinforcement Learning with Regret Matching for Robot Soccer

Qiang Liu,1,2 Jiachen Ma,1,2 and Wei Xie1,2

1School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
2School of Information and Electrical Engineering, Harbin Institute of Technology (Weihai), Weihai 264209, China

Received 4 April 2013; Revised 19 July 2013; Accepted 20 July 2013

Academic Editor: Yudong Zhang

Copyright © 2013 Qiang Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
