Research Article

Deep Q-Network with Predictive State Models in Partially Observable Domains

Figure 4

Comparison of RPSR-DQN with the policy-based method. Plots show the performance of three methods on all tasks: (a) CartPole-v1, (b) Swimmer-v1, and (c) Reacher-v1.