Research Article

Investigating the Effects of Hyperparameters in Quantum-Enhanced Deep Reinforcement Learning

Table 7

The reward and timestep of the agent for the last 200 episodes.

Episode      799      849      899      949      999      Alpha

Reward       −0.24    −0.24    −0.26    0.95     0.95     0.1
             0.95     0.95     0.95     0.95     0.95     0.2
             0.95     0.95     0.95     0.95     0.95     0.3
             0.95     0.95     0.95     0.95     0.95     0.4

Timesteps    8        8        7        6        6        0.1
             6        6        6        6        6        0.2
             6        6        6        6        6        0.3
             6        6        6        6        6        0.4