Research Article
A Novel Motion-Intelligence-Based Control Algorithm for Object Tracking by Controlling PAN-Tilt Automatically
Algorithm 2
Memory replay for Q-value network.
Get the memory pool D
Initialize the Q-value network with random weights
For epoch = 1 to 1,000,000 do
  Randomly sample a minibatch of 50 samples from the memory pool D
  Perform a gradient descent step on equation (21) with respect to the network parameters
  Every 100 steps, clone the Q-value network to obtain the target Q-value network
End For
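
The loop in Algorithm 2 can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the authors' implementation: equation (21) is not reproduced in this section, so the sketch assumes the usual squared temporal-difference loss; a small linear Q-function stands in for the Q-value network; and all names (`replay_train`, `q_values`, `init_params`, `gamma`, `lr`) as well as the transition layout `(state, action, reward, next_state, done)` are hypothetical. Only the minibatch size of 50, the cloning interval of 100 steps, and the 1,000,000 epochs come from the algorithm.

```python
import copy
import random


def q_values(params, state):
    """Linear stand-in for the Q-value network: one weight row and bias per action."""
    return [sum(w * x for w, x in zip(row, state)) + b
            for row, b in zip(params["w"], params["b"])]


def init_params(n_actions, state_dim, rng):
    """Random initialization of the (assumed linear) Q-value network."""
    return {
        "w": [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
              for _ in range(n_actions)],
        "b": [0.0] * n_actions,
    }


def replay_train(memory, n_actions, state_dim, epochs=1_000_000,
                 batch_size=50, sync_every=100, gamma=0.9, lr=0.01, seed=0):
    """Memory replay over a pool of (state, action, reward, next_state, done)."""
    rng = random.Random(seed)
    q_net = init_params(n_actions, state_dim, rng)
    target_net = copy.deepcopy(q_net)
    syncs = 0
    for step in range(1, epochs + 1):
        # Sample a random minibatch from the memory pool D.
        batch = rng.sample(memory, batch_size)
        grad_w = [[0.0] * state_dim for _ in range(n_actions)]
        grad_b = [0.0] * n_actions
        for s, a, r, s_next, done in batch:
            # Assumed loss: 0.5 * (Q(s, a) - y)^2 with the target network in y.
            y = r if done else r + gamma * max(q_values(target_net, s_next))
            td_error = q_values(q_net, s)[a] - y
            for i, x in enumerate(s):
                grad_w[a][i] += td_error * x / batch_size
            grad_b[a] += td_error / batch_size
        # One gradient descent step on the minibatch loss.
        for a in range(n_actions):
            for i in range(state_dim):
                q_net["w"][a][i] -= lr * grad_w[a][i]
            q_net["b"][a] -= lr * grad_b[a]
        # Every `sync_every` steps, clone the Q-value network into the target.
        if step % sync_every == 0:
            target_net = copy.deepcopy(q_net)
            syncs += 1
    return q_net, target_net, syncs
```

In this sketch the memory pool must hold at least `batch_size` transitions, because `random.Random.sample` draws without replacement. A far smaller `epochs` value than the 1,000,000 of Algorithm 2 is enough to try it on toy data.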