Research Article

Investigating the Effects of Hyperparameters in Quantum-Enhanced Deep Reinforcement Learning

Algorithm 1

Quantum-enhanced deep Q-learning.
Initialize replay memory M with capacity N
Initialize action-value function quantum circuit Q with arbitrary parameters θ
for episode e = 1, 2, …, E do
 Initialize state s_1 from the state set S and encode it into a quantum state using basis encoding
 for time step t = 1, 2, …, T do
  With probability ε, select a random action a_t
  otherwise, select the greedy action a_t = argmax_a Q(s_t, a; θ) from the output of the quantum circuit
  Execute the selected action a_t and observe the reward r_t and the next state s_{t+1}
  Store the transition (s_t, a_t, r_t, s_{t+1}) in replay memory M
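  Sample a random minibatch of transitions (s_j, a_j, r_j, s_{j+1}) from the replay memory M
  Set y_j = r_j if s_{j+1} is terminal, otherwise y_j = r_j + γ max_{a'} Q(s_{j+1}, a'; θ)
  Perform a gradient descent step on (y_j − Q(s_j, a_j; θ))^2 with respect to θ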
 end for
end for
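
The classical bookkeeping in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the helper names (basis_encode, select_action, td_target, ReplayMemory) and the discount factor gamma are assumptions introduced here, the target follows the standard Q-learning form, and the quantum circuit is represented only by the list of Q-values it would return for a state.

```python
import random
from collections import deque


def basis_encode(state: int, n_qubits: int) -> list[int]:
    """Return the bits of `state`, most significant bit first.

    In basis encoding, bit k == 1 means an X gate is applied to qubit k,
    which prepares the computational basis state |state>.
    """
    if not 0 <= state < 2 ** n_qubits:
        raise ValueError("state does not fit in n_qubits")
    return [(state >> (n_qubits - 1 - k)) & 1 for k in range(n_qubits)]


def select_action(q_values: list[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy: random action with probability epsilon, else argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward: float, next_q_values: list[float], done: bool, gamma: float) -> float:
    """Q-learning target: r if terminal, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class ReplayMemory:
    """Fixed-capacity buffer; the oldest transition is dropped when full."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)

    def store(self, transition: tuple) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> list[tuple]:
        # Sample without replacement, capped at the current buffer size.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```

In a full training loop, the Q-values passed to select_action and td_target would come from measuring the parameterized circuit on the basis-encoded state, and the squared difference between td_target and the circuit's output for the chosen action would drive the update of θ.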