Research Article

A Case Study on Air Combat Decision Using Approximated Dynamic Programming

Algorithm 2

Rollout decision procedure using the approximated utility function.
Rollout_ADP_Policy()
Input variables:
(1) : the current state;
(2): the number of rollout steps;
(3): the utility function approximated by the ADP_Learn algorithm;
(4): the red plane’s policy derived from , see (8).
Output variables:
(1): the best action with respect to the current state .
Local variables:
(1): caches the maximum utility found among the candidate actions;
(2): caches the next state computed by the system equation.
Code:
(1) ;
(2) ;
(3) FOR , DO:
(4)   ;
(5)   FOR , DO:
(6)     ;
(7)   END
(8)   IF , THEN:
(9)     ;
(10)     ;
(11)   END
(12) END
(13) RETURN ;
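
As an illustration, the rollout procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The parameter names `actions` and `step`, and the toy one-dimensional model used in the example, are hypothetical. The sketch also assumes that each candidate action is held fixed over all rollout steps while the red plane follows its own policy, and that the approximated utility is evaluated only at the final rolled-out state.

```python
import math


def rollout_adp_policy(x, n_steps, utility, red_policy, actions, step):
    """Return the candidate action whose rollout ends in the highest utility.

    x          -- current state
    n_steps    -- number of rollout steps
    utility    -- approximated utility function, state -> float
    red_policy -- red plane's policy, state -> red action
    actions    -- iterable of candidate blue actions (hypothetical name)
    step       -- system equation, (state, blue action, red action) -> next state
                  (hypothetical name)
    """
    best_utility = -math.inf  # caches the maximum utility found so far
    best_action = None
    for u in actions:
        x_next = x  # caches the next state computed by the system equation
        # Assumption: the same candidate action is applied at every rollout step.
        for _ in range(n_steps):
            x_next = step(x_next, u, red_policy(x_next))
        value = utility(x_next)
        if value > best_utility:  # strict comparison: ties keep the earlier action
            best_utility = value
            best_action = u
    return best_action


# Toy model (hypothetical): both planes move along a line and the blue plane
# is rewarded for closing the distance to a stationary red plane.
def toy_step(state, blue_action, red_action):
    blue_pos, red_pos = state
    return (blue_pos + blue_action, red_pos + red_action)


def toy_red_policy(state):
    return 0  # the red plane holds its position


def toy_utility(state):
    blue_pos, red_pos = state
    return -abs(blue_pos - red_pos)


chosen = rollout_adp_policy((0, 3), 2, toy_utility, toy_red_policy, [-1, 0, 1], toy_step)
print(chosen)
```

In the toy model the blue plane starts three units from the red plane, so moving toward it for two steps leaves the smallest remaining distance and that action is selected. With `n_steps` set to 0 no state transition takes place, every action scores the same, and the first candidate is returned.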