Research Article
Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
Figure 5
Prediction of the next state and reward according to the global model.
(a) Prediction of the angle at the next state
(b) Prediction of the angular velocity at the next state
(c) Prediction of the reward