Research Article

Dynamic Path Planning of Unknown Environment Based on Deep Reinforcement Learning

Figure 4

Training curves of the loss function of the target Q network. Each point is the average loss value over ten epochs. The y-axis denotes the loss value and the x-axis denotes the iteration epoch.
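
The ten-epoch averaging described in the caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `average_per_block`, the non-overlapping block scheme, and the sample loss values are all assumptions made for the example.

```python
def average_per_block(losses, block=10):
    """Average consecutive, non-overlapping blocks of `block` loss values.

    Assumption: blocks do not overlap, and an incomplete trailing
    block is dropped rather than averaged over fewer values.
    """
    n_blocks = len(losses) // block
    return [
        sum(losses[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


# Hypothetical per-epoch loss values, for illustration only.
example_losses = [float(v) for v in range(30)]

# One plotted point per ten epochs.
points = average_per_block(example_losses)
```

With 30 per-epoch values this yields three points, one for each block of ten epochs, which would be plotted against the epoch index on the horizontal axis.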