Research Article
Deep Ensemble Reinforcement Learning with Multiple Deep Deterministic Policy Gradient Algorithm
Table 1
Performance comparison of subpolicies and the aggregated policy.
| Policy | Episodes | Total reward | Average reward |
|---|---|---|---|
| Subpolicy1 | 20 | 720.69 | 36.03 |
| Subpolicy2 | 20 | 538.28 | 26.91 |
| Subpolicy3 | 20 | 463.98 | 23.20 |
| Aggregated policy | 20 | 829.17 | 41.46 |
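
The "Average reward" column in Table 1 appears to be the total reward divided by the number of episodes. The short sketch below recomputes it from the table's figures under that assumption and also derives the relative gain of the aggregated policy over the best single subpolicy. The variable and function names (`results`, `average_reward`, `gain_pct`) are illustrative and do not come from the article.

```python
# Figures copied from Table 1: policy -> (episodes, total reward).
# All names here are illustrative, not taken from the article.
results = {
    "Subpolicy1": (20, 720.69),
    "Subpolicy2": (20, 538.28),
    "Subpolicy3": (20, 463.98),
    "Aggregated policy": (20, 829.17),
}


def average_reward(episodes: int, total: float) -> float:
    """Assumed definition: total reward divided by episode count."""
    return total / episodes


averages = {
    name: average_reward(episodes, total)
    for name, (episodes, total) in results.items()
}

# Best-performing single subpolicy by average reward.
best_sub = max(
    (name for name in averages if name.startswith("Subpolicy")),
    key=averages.get,
)

# Relative improvement of the aggregated policy over that subpolicy.
gain_pct = (
    (averages["Aggregated policy"] - averages[best_sub])
    / averages[best_sub]
    * 100
)

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
print(f"Gain over {best_sub}: {gain_pct:.2f}%")
```

Under this reading, the aggregated policy's average reward is about 15% higher than that of the strongest subpolicy (Subpolicy1) over the same 20 episodes.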