Research Article
EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents
Table 7
Average steps for the DSN problem (evaluation episodes = 5000).
| Algorithm | = 10,000 | = 50,000 | = 100,000 |
| --- | --- | --- | --- |
| EAQR | 3.65 ± 0.35 | 3.22 ± 0.26 | 3.12 ± 0.20 |
| WoLF-PHC | 3.64 ± 0.63 | 3.58 ± 0.64 | 3.69 ± 0.61 |
| EMA Q-learning | 3.81 ± 0.43 | 3.94 ± 0.44 | 3.93 ± 0.45 |
| Single-agent RL | 5.57 ± 0.25 | 5.3 ± 0.29 | 5.06 ± 0.27 |