Research Article
Deep Ensemble Reinforcement Learning with Multiple Deep Deterministic Policy Gradient Algorithm
Table 4
Performance comparison of the aggregated policy and subpolicies.
| Policy | Steps | Total reward (points) | Complete one lap |
| --- | --- | --- | --- |
| Subpolicy1 | 246 | 16690.60 | No |
| Subpolicy2 | 246 | 15413.12 | No |
| Subpolicy3 | 102 | -1252.46 | No |
| Aggregated policy | 457 | 31603.37 | Yes |