Research Article

Dynamic Task Assignment Framework for Mobile Crowdsensing with Deep Reinforcement Learning

Algorithm 1

Q-network training algorithm based on the DDQN model.
Input: Historical data set H, replay memory D, maximum number of training episodes N, a constant Z, initialized evaluation network Q and target network Q'
Output: Trained evaluation network Q
 1: for episode = 1 to N do
 2:   Initialize the worker state s = s_0
 3:   while s is not the termination state do
 4:     if s is the initial state then
 5:       Take the tasks whose period equals the initial period in the historical data set H as the action set of the current state s
 6:     else
 7:       Obtain the action set of the current state s from the historical data set H according to the spatiotemporal constraints of Equations (6) and (7)
 8:     end if
 9:     if the action set of s is empty then
10:       The worker executes the virtual task a_v; the state transitions to s', and the reward r is 0
11:       Store (s, a_v, r, s', done) in the replay memory D, where s' is the next state, r is the reward, and done indicates whether s' is the termination state
12:     else
13:       Take s as input to the evaluation network Q to obtain the Q value of each state-action pair
14:       Use the ε-greedy method to select the corresponding action a from the output Q values
15:       Get s', r, and done according to action a, and store (s, a, r, s', done) in the replay memory D
16:     end if
17:     if the replay memory D is full then
18:       Overwrite the oldest piece of data in D and randomly sample a mini-batch for learning
19:       Calculate the target value y according to Equation (12)
20:       Update the parameters of the evaluation network Q by gradient descent on the loss function of Equation (11)
21:       Update the parameters of the target network Q' every Z steps
22:     end if
23:     s = s'
24:   end while
25: end for
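
To make the listing concrete, the following is a minimal PyTorch sketch of the two computational pieces of the loop body: ε-greedy action selection over a constrained candidate set (steps 13-14) and the DDQN update with periodic target synchronization (steps 18-21). Everything not fixed by the listing is an assumption here: the network architecture, the hyperparameters, the integer encoding of actions, the MSE form of the loss in Equation (11), and reading Equation (12) as the standard DDQN target y = r + γ · Q'(s', argmax_a Q(s', a)).

    import random
    from collections import deque

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Hypothetical sizes and hyperparameters -- not fixed by the listing above.
    STATE_DIM, N_ACTIONS = 8, 16
    GAMMA, EPSILON, Z = 0.99, 0.1, 100

    def make_q_net() -> nn.Module:
        # A small MLP standing in for the evaluation/target networks.
        return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                             nn.Linear(64, N_ACTIONS))

    q_net = make_q_net()                       # evaluation network Q
    target_net = make_q_net()                  # target network Q'
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    replay = deque(maxlen=10_000)              # replay memory D; appending when
                                               # full overwrites the oldest entry

    def select_action(state, candidate_actions):
        # Epsilon-greedy selection restricted to the candidate action set
        # (steps 13-14); candidate_actions is assumed to come from applying
        # the spatiotemporal constraints of Equations (6)-(7) to H.
        if not candidate_actions:              # empty set -> virtual task (step 10)
            return None
        if random.random() < EPSILON:
            return random.choice(candidate_actions)
        with torch.no_grad():
            q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
        return max(candidate_actions, key=lambda a: q_values[a].item())

    def learn(step: int, batch_size: int = 32):
        # One gradient update (steps 18-21) from a random mini-batch of
        # (s, a, r, s', done) tuples drawn from the replay memory.
        if len(replay) < batch_size:
            return
        batch = random.sample(list(replay), batch_size)
        s, a, r, s2, done = zip(*batch)
        s = torch.as_tensor(s, dtype=torch.float32)
        s2 = torch.as_tensor(s2, dtype=torch.float32)
        a = torch.as_tensor(a, dtype=torch.int64).unsqueeze(1)
        r = torch.as_tensor(r, dtype=torch.float32)
        done = torch.as_tensor(done, dtype=torch.float32)
        with torch.no_grad():
            # DDQN decoupling: Q picks the argmax action, Q' evaluates it
            # (assumed reading of Equation (12)).
            best_a = q_net(s2).argmax(dim=1, keepdim=True)
            y = r + GAMMA * target_net(s2).gather(1, best_a).squeeze(1) * (1 - done)
        q_sa = q_net(s).gather(1, a).squeeze(1)
        loss = F.mse_loss(q_sa, y)             # Equation (11), assumed MSE
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step % Z == 0:                      # step 21: periodic hard sync
            target_net.load_state_dict(q_net.state_dict())

Separating action selection (by the evaluation network Q) from action evaluation (by the target network Q') is the defining trait of DDQN: it damps the overestimation bias of vanilla Q-learning targets, which is why step 19 does not simply take max_a Q'(s', a).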