Research Article
Finite-Horizon Optimal Tracking Guidance for Aircraft Based on Approximate Dynamic Programming
Algorithm 1
Actor-critic learning procedure of tracking guidance.
Input:
    Perturbation equations at every downrange step.
    Cost function along the trajectory and at final state.
Output:
    Optimal control weights for tracking reference trajectory.
(1) Randomly select sets of , calculate and , obtain by (23).
(2) for to do
(3)     Initialize , , and actor training step .
(4)     repeat
(5)         Randomly select , apply previous control to calculate .
(6)         Substitute to (32) gives to .
(7)         Get the error of actor network from and .
(8)         Calculate the gradient of weights and update , and by (35).
(9)         Push training step .
(10)    until
(11)    Randomly select sets of , apply actor network to get .
(12)    Calculate , and according to Eqs. (26) and (27).
(13)    Apply least square estimate to get .
(14) end for
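The structure of Algorithm 1 (terminal critic fit, backward sweep over downrange steps, inner actor training loop, least-squares critic update) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the constant perturbation matrices `A` and `B`, the quadratic weights `Q`, `R`, `S`, the horizon `N`, the quadratic critic basis, the linear actor `u = -K x`, and the target control `u* = -0.5 R^{-1} B^T dV/dx`. These stand in for Eqs. (23), (26), (27), (32), and (35), which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical perturbation model x[k+1] = A x[k] + B u[k] (assumed constant).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)           # assumed running state cost
R = np.array([[1.0]])   # assumed running control cost
S = np.eye(2)           # assumed terminal cost
N = 5                   # assumed number of downrange steps


def features(X):
    """Quadratic critic basis [x1^2, x1*x2, x2^2] for each row of X."""
    return np.column_stack([X[:, 0] ** 2, X[:, 0] * X[:, 1], X[:, 1] ** 2])


def weights_to_matrix(w):
    """Symmetric P such that features(x) @ w equals x^T P x."""
    return np.array([[w[0], w[1] / 2.0], [w[1] / 2.0, w[2]]])


def fit_critic(X, targets):
    """Least-squares estimate of the critic weights."""
    w, *_ = np.linalg.lstsq(features(X), targets, rcond=None)
    return w


def quad(X, M):
    """Row-wise quadratic form x^T M x."""
    return np.einsum("ni,ij,nj->n", X, M, X)


def sample_states(n):
    """Random perturbation states (assumed sampling region)."""
    return rng.uniform(-1.0, 1.0, size=(n, 2))


def train(batch=64, lr=1.0, tol=1e-12, max_steps=2000):
    # Step (1): fit the terminal critic to the final-state cost.
    X = sample_states(batch)
    w_next = fit_critic(X, quad(X, S))
    R_inv = np.linalg.inv(R)
    gains = [None] * N
    # Step (2): sweep backward over the downrange steps.
    for k in range(N - 1, -1, -1):
        P_next = weights_to_matrix(w_next)
        K = np.zeros((1, 2))                      # step (3): initialize actor
        for _ in range(max_steps):                # steps (4)-(10)
            X = sample_states(batch)
            U = -X @ K.T                          # apply previous control
            X_next = X @ A.T + U @ B.T
            grad_V = 2.0 * X_next @ P_next        # critic gradient at x[k+1]
            U_target = -0.5 * grad_V @ B @ R_inv.T
            E = U - U_target                      # actor error
            if np.mean(E ** 2) < tol:
                break
            K += lr * (E.T @ X) / batch           # descent step on 0.5*mean(E^2)
        # Steps (11)-(13): evaluate the trained actor, refit the critic.
        X = sample_states(batch)
        U = -X @ K.T
        X_next = X @ A.T + U @ B.T
        targets = quad(X, Q) + quad(U, R) + features(X_next) @ w_next
        w_next = fit_critic(X, targets)
        gains[k] = K
    return gains, w_next


gains, w0 = train()
```

For this linear-quadratic stand-in, the fixed point of the actor update coincides with the gain of the discrete-time Riccati recursion, which offers a convenient correctness check. The article's nonlinear setting would replace the linear actor and quadratic critic with its own network structures.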