Research Article
A UAV Pursuit-Evasion Strategy Based on DDPG and Imitation Learning
Table 1
Training hyperparameters.
| Training hyperparameter | Symbol | Value |
| --- | --- | --- |
| Discounting factor | | 0.9 |
| Inertial update rate | | 0.01 |
| Memory size | | 30000 |
| Size of batch experience | Batch size | 64 |
| Simulation time step | | 0.1 |
| Learning rate of Critic network | | 0.002 |
| Learning rate of Actor network | | 0.001 |
| Number of episodes | MaxEpisode | 4000 |
| Number of steps in one episode | MaxStep | 300 |
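As a rough illustration of how the values in Table 1 might be carried into a training script, the sketch below collects them in one configuration object. The field names, the `TrainingConfig` class, and the `soft_update` helper are hypothetical and do not come from the article. The sketch also assumes that the "inertial update rate" plays the role of the soft target-network update rate commonly used in DDPG; the table itself does not say so.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from Table 1; the field names are illustrative only."""
    discount_factor: float = 0.9
    inertial_update_rate: float = 0.01  # assumed: DDPG soft-update rate
    memory_size: int = 30000            # replay memory capacity
    batch_size: int = 64
    time_step: float = 0.1              # unit is not stated in the table
    critic_lr: float = 0.002
    actor_lr: float = 0.001
    max_episode: int = 4000
    max_step: int = 300


def soft_update(target, online, rate):
    """Return rate * online + (1 - rate) * target, element by element."""
    return [rate * w + (1.0 - rate) * t for t, w in zip(target, online)]


cfg = TrainingConfig()

# Upper bound on environment steps if every episode runs to MaxStep.
max_total_steps = cfg.max_episode * cfg.max_step

# Toy weights: the target moves 1% of the way toward the online values.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], cfg.inertial_update_rate)
```

Under that assumption, a rate of 0.01 means the target weights move only 1% of the way toward the online weights per update, which is why such a rate is usually described as slow or inertial.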