Research Article
Deep Q-Network with Predictive State Models in Partially Observable Domains
Figure 4
Comparison of RPSR-DQN to the policy-based method. Plots show the performance of the three methods on each task: (a) CartPole-v1, (b) Swimmer-v1, and (c) Reacher-v1.