Research Article

EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents

Table 2

Average success rate for 4-agent/12-vertex box-pushing (evaluation episodes = 50,000).

                    = 100,000    = 500,000    = 1,000,000

EAQR                82.6%        98.6%        99.6%
WoLF-PHC            80.7%        87.1%        91.7%
EMA Q-learning      66.7%        76.6%        78.7%
Single-agent RL     60.2%        91.2%        95.9%