Research Article
Performance Evaluation of Multiagent Reinforcement Learning Based Training Methods for Swarm Fighting
Table 5
Hyperparameters used for this experiment.
| Type | Parameter | MARL | MARL-BC |
| --- | --- | --- | --- |
| Hyperparameters | Batch size | 1024 | 1024 |
| Hyperparameters | Buffer size | 20480 | 20480 |
| Hyperparameters | Learning rate | 0.0001 | 0.0001 |
| Hyperparameters | Entropy bonus | 0.005 | 0.005 |
| Hyperparameters | Num epoch | 3 | 3 |
| Network settings | Hidden units | 512 | 512 |
| Network settings | Num layers | 3 | 3 |
| Reward signals | Discount factor | 0.99 | 0.99 |
| Reward signals | Strength | 1.0 | 1.0 |
| Behavior cloning | Steps | / | 100 M |
| Behavior cloning | Strength | 0.5 | 0.5 |
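For readers who want to reuse these settings, the values in Table 5 can be captured in a small configuration object. The following is only a minimal sketch: the class name `TrainingConfig`, its field names, and the helper `config_diff` are hypothetical and do not come from the authors' code. It also assumes that "100 M" means 100 million steps and that "/" means the setting does not apply to plain MARL.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the Table 5 values (names are illustrative)."""

    # Hyperparameters
    batch_size: int = 1024
    buffer_size: int = 20480
    learning_rate: float = 1e-4
    entropy_bonus: float = 0.005
    num_epoch: int = 3
    # Network settings
    hidden_units: int = 512
    num_layers: int = 3
    # Reward signals
    discount_factor: float = 0.99
    reward_strength: float = 1.0
    # Behavior cloning; None stands for the "/" entry (assumed "not applicable")
    bc_steps: Optional[int] = None
    bc_strength: float = 0.5


# Plain MARL uses the defaults; MARL-BC differs only in the cloning steps.
MARL = TrainingConfig()
MARL_BC = TrainingConfig(bc_steps=100_000_000)  # assumes "100 M" = 1e8 steps


def config_diff(a: TrainingConfig, b: TrainingConfig) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


if __name__ == "__main__":
    print(config_diff(MARL, MARL_BC))
```

Encoding both columns this way makes it explicit that, according to the table, the two training methods share every listed setting except the number of behavior cloning steps.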