Research Article

Deep Ensemble Reinforcement Learning with Multiple Deep Deterministic Policy Gradient Algorithm

Table 4

Performance comparison of the aggregated policy and subpolicies.

Policy              Steps   Total reward (points)   Complete one lap
Subpolicy 1         246     16690.60                No
Subpolicy 2         246     15413.12                No
Subpolicy 3         102     −1252.46                No
Aggregated policy   457     31603.37                Yes