Research Article
Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning
Figure 4
Training curves of the loss function of the Q target network. Each point is the loss value averaged over ten epochs. The y-axis denotes the value of the loss function and the x-axis denotes the iteration epoch.