Research Article

A Dueling Deep Recurrent Q-Network Framework for Dynamic Multichannel Access in Heterogeneous Wireless Networks

Algorithm 1

Training process of Dueling DRQN.
Initialize ,,,,,,,,.
Initialize experience pool and mini-batches .
Initialize the parameter of the estimation network as .
Initialize the parameter of the target network .
1: For episode do
2: For time-slot do
3:  Input into the estimation network and output the ;
4:  Select the action using the adaptive policy algorithm
    and update according to equation (19);
5:  Execute action and generate the observation and ;
6:  Compute from ,;
7:  Store into the experience-replay pool .
8:  If then
9:   Randomly generate an index subset ;
10:   Sample from ;
11:   For each sample in do
12:    Compute the and obtain .
13:   End for
14:   Calculate the loss function according to the equation (16) and update according to the equation (17);
15:   Minimize the loss function with learning rate .
17:  End if
18:  Every time slots: Update by setting .
19: End for
20: End for
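
The bookkeeping in steps 7 to 18 (storing transitions, drawing a random index subset, forming per-sample targets, and periodically refreshing the target network) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen for illustration: ReplayPool, dueling_q_values, td_target, should_sync_target, and the parameters gamma and sync_period do not come from the article. The sketch also assumes the commonly used mean-subtracted dueling aggregation and the standard one-step Q-learning target; the article's own equations (16), (17), and (19) are not reproduced in this listing and may differ. Neural networks and the adaptive action-selection policy are omitted.

```python
import random
from collections import deque


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a).

    Assumption: the mean-subtracted dueling form
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def td_target(reward, next_q_values, gamma):
    """Assumed one-step target: r + gamma * max over next-state Q-values."""
    return reward + gamma * max(next_q_values)


class ReplayPool:
    """Fixed-capacity experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        # Draw a random index subset, then gather the matching transitions.
        indices = rng.sample(range(len(self.buffer)), batch_size)
        return [self.buffer[i] for i in indices]

    def __len__(self):
        return len(self.buffer)


def should_sync_target(time_slot, sync_period):
    """True on slots where the target network would copy the estimation weights."""
    return time_slot % sync_period == 0


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ReplayPool(capacity=4)
    for t in range(6):
        # Hypothetical transition layout: (observation, action, reward, next observation).
        pool.store((t, 0, 1.0, t + 1))
    batch = pool.sample(2, rng)
    q_next = dueling_q_values(1.0, [0.5, -0.5, 0.0])
    targets = [td_target(reward, q_next, gamma=0.9) for (_, _, reward, _) in batch]
    print(len(pool), q_next, targets)
```

In a full implementation the values passed to dueling_q_values would come from the value and advantage streams of the recurrent network, and the targets would feed the loss that the estimation network minimizes; the sketch only fixes the data flow around those steps.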