Research Article

Finite-Horizon Optimal Tracking Guidance for Aircraft Based on Approximate Dynamic Programming

Algorithm 1

Actor-critic learning procedure of tracking guidance.
Input:
Perturbation equations at every downrange step.
Cost function along the trajectory and at final state.
Output:
Optimal control weights for tracking the reference trajectory.
(1) Randomly select sets of , calculate and , obtain by (23).
(2) for to do
(3) Initialize , , and actor training step .
(4) repeat
(5) Randomly select , apply previous control to calculate .
(6) Substitute to (32) gives to .
(7) Get the error of actor network from and .
(8) Calculate the gradient of weights and update , and by (35).
(9) Push training step .
(10) until
(11) Randomly select sets of , apply actor network to get .
(12) Calculate , and according to Eqs. (26) and (27).
(13) Apply a least-squares estimate to get .
(14) end for
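
The loop structure of Algorithm 1 can be illustrated with a minimal sketch. It is not the article's implementation: Eqs. (23), (26), (27), (32), and (35) are not reproduced in this listing, so the sketch replaces them with stand-ins. All of the following are assumptions made only for illustration: a toy linear perturbation model with matrices `A` and `B`, quadratic costs `Q`, `R`, `QF`, a quadratic critic basis, a linear actor gain `K`, a backward sweep over `N_STEPS` steps, and the hyperparameters `lr`, `tol`, `batch`, and `max_steps`. The gradient-norm stopping test stands in for the "until" condition of step (10).

```python
import numpy as np

# ASSUMPTION: toy linear perturbation model x_{k+1} = A x_k + B u_k with
# quadratic costs. None of these values come from the article.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)            # running state cost
R = np.array([[1.0]])    # running control cost
QF = np.eye(2)           # final-state cost
N_STEPS = 10             # number of downrange steps


def features(X):
    """Hypothetical critic basis [x1^2, x1*x2, x2^2] for each row of X."""
    return np.column_stack([X[:, 0] ** 2, X[:, 0] * X[:, 1], X[:, 1] ** 2])


def weights_to_matrix(w):
    """Symmetric P such that features(x) @ w equals x' P x."""
    return np.array([[w[0], w[1] / 2.0], [w[1] / 2.0, w[2]]])


def quad(X, M):
    """Row-wise quadratic form x' M x."""
    return np.einsum("ni,ij,nj->n", X, M, X)


def train(n_samples=200, batch=32, lr=0.3, tol=1e-10, max_steps=2000, seed=0):
    rng = np.random.default_rng(seed)

    # Step (1): sample states and fit the final-step critic by least squares.
    X = rng.uniform(-1.0, 1.0, (n_samples, 2))
    W = np.linalg.lstsq(features(X), quad(X, QF), rcond=None)[0]

    K = np.zeros((1, 2))          # linear actor weights, u = K x
    gains = [None] * N_STEPS

    # Step (2): sweep backward over the steps (assumed direction).
    for k in range(N_STEPS - 1, -1, -1):
        P_next = weights_to_matrix(W)
        step = 0                  # Step (3); K is warm-started from the last step
        while True:               # Steps (4)-(10): actor training
            Xb = rng.uniform(-1.0, 1.0, (batch, 2))
            U = Xb @ K.T                              # apply current control
            X_next = Xb @ A.T + U @ B.T               # propagate one step
            # Gradient of (running cost + next-step critic) w.r.t. the control.
            dJ_du = 2.0 * U @ R + 2.0 * X_next @ P_next @ B
            grad = dJ_du.T @ Xb / batch               # chain rule through u = K x
            K = K - lr * grad
            step += 1
            if np.linalg.norm(grad) < tol or step >= max_steps:
                break
        gains[k] = K.copy()

        # Steps (11)-(13): sample states, apply the actor, refit the critic.
        X = rng.uniform(-1.0, 1.0, (n_samples, 2))
        U = X @ K.T
        X_next = X @ A.T + U @ B.T
        target = quad(X, Q) + quad(U, R) + quad(X_next, P_next)
        W = np.linalg.lstsq(features(X), target, rcond=None)[0]

    return gains, W


if __name__ == "__main__":
    gains, W0 = train()
    print("first-step actor gain:", gains[0])
    print("first-step critic weights:", W0)
```

Because this toy problem is linear-quadratic, the gains returned by `train()` can be checked against a backward Riccati recursion. That check applies only to the sketch, not to the nonlinear aircraft model treated in the article.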