Wireless Communications and Mobile Computing

Research Article

Decentralized and Dynamic Band Selection in Uplink Enhanced Licensed-Assisted Access: Deep Reinforcement Learning Approach

Algorithm 2. DQN training algorithm for dynamic band selection.

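for each agent do
    Initialize replay buffer $D$
    Initialize action value function $Q$ with parameter $\theta$
    Initialize target action value function $\hat{Q}$ with parameter $\theta^{-} = \theta$
    Generate initial state $s_1$ from the environment simulator
end for
for $t = 1, \ldots, T$ do
    for each agent do
        Execute action $a_t$ from $Q$ using $\epsilon$-greedy policy
        Collect reward $r_t$ and observation $o_t$
        Observe the next state $s_{t+1}$ from the environment simulator
        Store the transition $(s_t, a_t, r_t, s_{t+1})$ into $D$
        Sample random minibatch of transitions $(s_j, a_j, r_j, s_{j+1})$ from $D$
        Evaluate the target $y_j = r_j + \gamma \max_{a'} \hat{Q}(s_{j+1}, a'; \theta^{-})$
        Perform a gradient descent step on $(y_j - Q(s_j, a_j; \theta))^2$ with respect to $\theta$
        Every $C$ steps, update the target network according to $\theta^{-} \leftarrow \theta$
    end for
end for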
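
The training loop of Algorithm 2 can be illustrated with a minimal single-agent sketch in Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-action band set (licensed vs. unlicensed), the `ToyEnv` simulator with a binary busy/idle state, the reward values, the linear one-hot Q-function standing in for a deep network, and all hyperparameters. In the decentralized setting, each agent would hold its own copy of the buffer and both parameter sets.

```python
import random
from collections import deque

N_ACTIONS = 2  # hypothetical action set: 0 = licensed band, 1 = unlicensed band


def features(state):
    # One-hot features for a binary state (0 = unlicensed idle, 1 = busy).
    return [1.0, 0.0] if state == 0 else [0.0, 1.0]


def q_values(theta, state):
    # Linear action-value function: Q(s, a) = theta[a] . phi(s).
    phi = features(state)
    return [sum(w * x for w, x in zip(theta[a], phi)) for a in range(N_ACTIONS)]


class ToyEnv:
    """Hypothetical environment simulator, not the paper's eLAA model."""

    def __init__(self, rng):
        self.rng = rng

    def reset(self):
        return self.rng.randint(0, 1)

    def step(self, state, action):
        # Assumed rewards: licensed band always yields 0.5;
        # unlicensed yields 1.0 when idle and 0.0 when busy.
        if action == 0:
            reward = 0.5
        else:
            reward = 1.0 if state == 0 else 0.0
        return reward, self.rng.randint(0, 1)


def train_agent(steps=2000, gamma=0.9, lr=0.05, eps=0.2,
                batch=16, C=50, seed=0):
    rng = random.Random(seed)
    env = ToyEnv(rng)
    buffer = deque(maxlen=500)                       # replay buffer
    theta = [[0.0, 0.0] for _ in range(N_ACTIONS)]   # online parameters
    theta_target = [row[:] for row in theta]         # target parameters
    state = env.reset()                              # initial state

    for t in range(1, steps + 1):
        # Epsilon-greedy action selection from the online Q-function.
        if rng.random() < eps:
            action = rng.randrange(N_ACTIONS)
        else:
            q = q_values(theta, state)
            action = q.index(max(q))

        # Collect the reward, observe the next state, store the transition.
        reward, next_state = env.step(state, action)
        buffer.append((state, action, reward, next_state))

        # Sample a random minibatch and take one SGD step per sample.
        minibatch = rng.sample(list(buffer), min(batch, len(buffer)))
        for s, a, r, s2 in minibatch:
            y = r + gamma * max(q_values(theta_target, s2))  # target value
            td_error = y - q_values(theta, s)[a]
            phi = features(s)
            for i in range(len(phi)):
                # Descent step on (y - Q)^2, constant factor folded into lr.
                theta[a][i] += lr * td_error * phi[i]

        # Every C steps, copy the online parameters into the target network.
        if t % C == 0:
            theta_target = [row[:] for row in theta]

        state = next_state

    return theta
```

Calling `theta = train_agent()` returns the learned weights, and `q_values(theta, state)` then gives the per-band value estimates for a given state. A deep network and a multi-agent simulator would replace the linear model and `ToyEnv` in a realistic setup.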