Research Article

Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

Figure 5

Prediction of the next state and reward according to the global model.
(a) Prediction of the angle at the next state
(b) Prediction of the angular velocity at the next state
(c) Prediction of the reward