International Journal of Aerospace Engineering

Research Article

Intelligent Online Multiconstrained Reentry Guidance Based on Hindsight Experience Replay

Training of multiconstrained reentry guidance based on HER.

Randomly initialize the parameters of the actor network and the critic network.
Initialize the target actor and target critic with the same parameters as the actor and critic.
Initialize basic replay buffer
for each episode do
Initialize an HER replay buffer and a random noise process for exploration
Randomly initialize the goal
Run the Scenario Initialization Function and sample a basic state
for each time step do
Combine the basic state and the goal into the expanded state
Sample an action from the actor output plus exploration noise
Execute the action in the Policy Step Function and observe a new state
Combine the new basic state and the goal into the new expanded state
Store the transition in both the basic replay buffer and the HER replay buffer
if the episode is done
Judge whether the HER condition is met and record it
end if
end for
if the HER condition is met
for each transition in the HER replay buffer do
Calculate and , ,
Recalculate reward according to Equations (30)–(33)
Combine the basic states and the goal into expanded states
Store the transition in the basic replay buffer
end for
Clear the data in the HER replay buffer
end if
for each training step do
Sample a minibatch from the basic replay buffer
Update critic by Equation (24), update actor by Equation (25)
Update the target networks periodically
end for
end for
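
The two-buffer bookkeeping in the procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `expand`, `toy_reward`, and `relabel_episode` are hypothetical, `toy_reward` is only a placeholder for the reward of Equations (30)–(33), the HER condition is assumed to be "the episode missed its original goal", and the new goal is assumed to be the final state actually reached.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")

def expand(basic_state, goal):
    """Concatenate a basic state and a goal into an expanded state."""
    return tuple(basic_state) + tuple(goal)

def toy_reward(achieved, goal, tol=0.05):
    """Placeholder reward (assumption): 0 if within tolerance of the goal, else -1."""
    dist = max(abs(a - g) for a, g in zip(achieved, goal))
    return 0.0 if dist <= tol else -1.0

def relabel_episode(episode, new_goal, reward_fn):
    """Rebuild every stored transition of one episode against a new goal."""
    relabeled = []
    for basic_s, action, basic_s_next, done in episode:
        reward = reward_fn(basic_s_next, new_goal)  # recalculated reward
        relabeled.append(Transition(expand(basic_s, new_goal), action, reward,
                                    expand(basic_s_next, new_goal), done))
    return relabeled

basic_buffer = deque(maxlen=10_000)  # persists across episodes
her_buffer = []                      # holds raw transitions of the current episode

# One hand-written episode standing in for the Policy Step Function rollout.
goal = (1.0, 1.0)
trajectory = [(0.0, 0.0), (0.2, 0.1), (0.4, 0.3)]
actions = [0.5, -0.2]
for t, action in enumerate(actions):
    s, s_next = trajectory[t], trajectory[t + 1]
    done = t == len(actions) - 1
    basic_buffer.append(Transition(expand(s, goal), action,
                                   toy_reward(s_next, goal),
                                   expand(s_next, goal), done))
    her_buffer.append((s, action, s_next, done))

# Assumed HER condition: the episode ended without reaching the original goal.
her_condition_met = toy_reward(trajectory[-1], goal) < 0.0
if her_condition_met:
    new_goal = trajectory[-1]  # assumed relabeling strategy: final achieved state
    basic_buffer.extend(relabel_episode(her_buffer, new_goal, toy_reward))
    her_buffer.clear()

# Training would then sample minibatches from the basic buffer only.
minibatch = random.sample(list(basic_buffer), k=2)
```

Keeping the raw, goal-free transitions in the per-episode buffer is what makes the relabeling cheap: the expanded states and rewards are rebuilt once the episode outcome is known, and only the basic buffer is ever sampled for the critic and actor updates.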