Research Article

EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents

Table 7

Average steps for the DSN problem (evaluation episodes = 5000).

                   = 10,000       = 50,000       = 100,000
EAQR               3.65 ± 0.35    3.22 ± 0.26    3.12 ± 0.20
WoLF-PHC           3.64 ± 0.63    3.58 ± 0.64    3.69 ± 0.61
EMA Q-learning     3.81 ± 0.43    3.94 ± 0.44    3.93 ± 0.45
Single-agent RL    5.57 ± 0.25    5.3 ± 0.29     5.06 ± 0.27