Research Article
Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning
Figure 5
The average cumulative reward curves. Each point is the cumulative reward averaged over one hundred episodes. The y-axis denotes the average cumulative reward and the x-axis denotes the iteration epoch.
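As an illustration of how the points on such a curve could be computed, the following minimal sketch averages per-episode cumulative rewards over consecutive blocks of one hundred episodes. It assumes non-overlapping blocks and drops any trailing partial block; the caption does not say whether the windows overlap, and the names `block_averages`, `episode_rewards`, and `block_size` are hypothetical, not taken from the article.

```python
from typing import List, Sequence


def block_averages(episode_rewards: Sequence[float], block_size: int = 100) -> List[float]:
    """Average cumulative reward over consecutive, non-overlapping blocks of episodes.

    Assumption: one plotted point per full block of `block_size` episodes.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    averages = []
    # Step through the reward history one full block at a time; a trailing
    # partial block is dropped so every point averages exactly block_size episodes.
    for start in range(0, len(episode_rewards) - block_size + 1, block_size):
        block = episode_rewards[start:start + block_size]
        averages.append(sum(block) / block_size)
    return averages
```

Each returned value would correspond to one point on the curve, in training order.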