Research Article

Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning

Figure 5

The average cumulative reward curves. Each point is the average cumulative reward achieved per hundred episodes. The y-axis denotes the average cumulative reward and the x-axis denotes the iteration epoch.
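As an illustration of how each plotted point could be obtained, the following minimal sketch averages per-episode cumulative rewards over consecutive blocks of one hundred episodes. The function name `block_average`, the `block_size` parameter, and the choice to discard a trailing incomplete block are assumptions made for this example; they are not taken from the article.

```python
import numpy as np


def block_average(episode_rewards, block_size=100):
    """Average cumulative rewards over consecutive blocks of episodes.

    episode_rewards: one cumulative reward per episode (hypothetical input).
    block_size: episodes per plotted point (100 in the caption).
    Episodes left over after the last full block are dropped (an assumption).
    """
    rewards = np.asarray(episode_rewards, dtype=float)
    n_blocks = len(rewards) // block_size
    # Keep only the episodes that fill complete blocks.
    trimmed = rewards[: n_blocks * block_size]
    # One row per block, then the mean of each row.
    return trimmed.reshape(n_blocks, block_size).mean(axis=1)
```

Calling `block_average(rewards)` on a list of per-episode rewards would return one value per hundred episodes, which could then be plotted against the block index.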