Research Article
Deep Q-Network with Predictive State Models in Partially Observable Domains
Table 1
Best reward achieved by each method in each environment.
| Method | CartPole-v1 | Swimmer-v1 | Reacher-v1 |
| --- | --- | --- | --- |
| DRQN | 200 | 56 | −1.15 |
| DQN-1frame | 54 | 40.58 | −6.43 |
| RPSR-DQN | 200 | 59.52 | −0.02 |
| RPSP | 158 | 38.96 | −57.78 |