Research Article

Task Migration Based on Reinforcement Learning in Vehicular Edge Computing

Algorithm 1: Deep Q-learning method.
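Initialize main DQN $Q$ with random weights $\theta$
Initialize target DQN $\hat{Q}$ with weights $\theta^{-} = \theta$
Initialize replay memory $D$ to capacity $N$
For each episode do
  Initialize initial state $s_1$, reward $r_1$
  For time slot $t = 1, \dots, T$, do
   The controller acquires information about vehicles, tasks, and VECS by interacting
   with the environment
   If the random number < $\varepsilon$:
    Select the greedy action $a_t = \arg\max_{a} Q(s_t, a; \theta)$
   Else:
    Randomly select action $a_t$
   Execute action $a_t$ at the controller, observe reward $r_t$ and next state $s_{t+1}$
   Store the tuple $(s_t, a_t, r_t, s_{t+1})$ in $D$
   Randomly sample a minibatch of tuples $(s_j, a_j, r_j, s_{j+1})$ from $D$
   If the episode terminates at step $j+1$, then
    $y_j = r_j$
   Else:
    $y_j = r_j + \gamma \max_{a'} \hat{Q}(s_{j+1}, a'; \theta^{-})$
   Perform a gradient descent step on $L(\theta)$ with respect to the network parameters $\theta$,
    where the loss function is $L(\theta) = (y_j - Q(s_j, a_j; \theta))^2$, and update $\theta^{-} \leftarrow \theta$
   Terminate when all the vehicles are out of the simulation region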
  End for
End for
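
The steps of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The linear approximator `LinearQ` stands in for the deep network, and the one-step toy task (action 1 pays reward 1, action 0 pays 0) stands in for the vehicular environment. The names `select_action`, `td_target`, `train`, and `sync_every`, and all hyperparameter values, are assumptions made for illustration only. The action rule follows the branch order of the listing: greedy when the random draw is below the threshold, random otherwise. How often the target weights are copied is not stated in the listing, so `sync_every` is a guess.

```python
import random
from collections import deque


def select_action(q_values, eps, rng):
    """Branch order as in the listing: greedy if the draw is below eps, else random."""
    if rng.random() < eps:
        return max(range(len(q_values)), key=lambda a: q_values[a])
    return rng.randrange(len(q_values))


def td_target(reward, next_q_values, done, gamma):
    """y_j = r_j at episode end, else r_j + gamma * max_a' Q_hat(s_{j+1}, a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class LinearQ:
    """Hypothetical stand-in for the DQN: Q(s, a; theta) = theta[a] . s"""

    def __init__(self, n_features, n_actions, rng):
        self.theta = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
                      for _ in range(n_actions)]

    def q_values(self, state):
        return [sum(w * x for w, x in zip(row, state)) for row in self.theta]

    def sgd_step(self, state, action, target, lr):
        """One gradient step on (target - Q(s, a; theta))^2 for the taken action."""
        error = target - self.q_values(state)[action]
        # The gradient is -2 * error * state; the factor 2 is folded into lr.
        self.theta[action] = [w + lr * error * x
                              for w, x in zip(self.theta[action], state)]

    def copy_from(self, other):
        self.theta = [row[:] for row in other.theta]


def train(episodes=500, eps=0.5, gamma=0.9, lr=0.1, capacity=100,
          batch_size=4, sync_every=10, seed=0):
    rng = random.Random(seed)
    main, target = LinearQ(1, 2, rng), LinearQ(1, 2, rng)
    target.copy_from(main)              # target weights start equal to main weights
    memory = deque(maxlen=capacity)     # replay memory with fixed capacity
    state = [1.0]                       # toy task: a single state, one step per episode
    for episode in range(episodes):
        action = select_action(main.q_values(state), eps, rng)
        reward = float(action)          # toy reward: action 1 pays 1, action 0 pays 0
        memory.append((state, action, reward, state, True))
        batch = rng.sample(list(memory), min(batch_size, len(memory)))
        for s, a, r, s_next, done in batch:
            y = td_target(r, target.q_values(s_next), done, gamma)
            main.sgd_step(s, a, y, lr)
        if (episode + 1) % sync_every == 0:
            target.copy_from(main)      # assumed periodic target update
    return main


if __name__ == "__main__":
    print(train().q_values([1.0]))
```

On this toy task the learned value of action 1 should approach 1 and that of action 0 should approach 0, since every transition is terminal and the target reduces to the reward. In a multi-step environment the non-terminal branch of `td_target` would carry the discounted estimate from the target network.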