Research Article
Task Migration Based on Reinforcement Learning in Vehicular Edge Computing
Initialize main DQN Q with random weights θ
Initialize target DQN Q̂ with weights θ⁻ = θ
Initialize replay memory D to capacity N
For each episode do
    Initialize initial state s_1 and reward r = 0
    For time slot t = 1, 2, ..., T do
        The controller acquires information about vehicles, tasks, and VECSs by interacting with the environment
        If the random number < ε:
            Select action a_t = argmax_a Q(s_t, a; θ)
        Else:
            Randomly select action a_t
        Execute action a_t at the controller, observe reward r_t and next state s_{t+1}
        Store the tuple (s_t, a_t, r_t, s_{t+1}) in D
        Randomly sample a minibatch of tuples (s_j, a_j, r_j, s_{j+1}) from D
        If the episode terminates at step j+1, then
            y_j = r_j
        Else:
            y_j = r_j + γ max_{a'} Q̂(s_{j+1}, a'; θ⁻)
        Perform a gradient descent step on (y_j − Q(s_j, a_j; θ))² with respect to the network parameters θ, where the loss function is L(θ) = (y_j − Q(s_j, a_j; θ))², and periodically update the target network weights θ⁻ ← θ
        Terminate when all the vehicles are out of the simulation region
    End for
End for
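The listing above follows the standard DQN training pattern. A minimal sketch in Python/PyTorch is given below; the VEC environment interface (env.reset()/env.step()), the state and action dimensions, and the hyperparameter values are illustrative assumptions rather than values taken from this article.

```python
# Minimal DQN training-loop sketch of the algorithm above.
# The environment object is a hypothetical stand-in for the vehicular edge
# computing simulation; sizes and hyperparameters are illustrative only.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

STATE_DIM, N_ACTIONS = 16, 5           # assumed state/action sizes
GAMMA, EPSILON, LR = 0.9, 0.9, 1e-3    # assumed hyperparameters
MEMORY_CAPACITY, BATCH_SIZE, TARGET_UPDATE = 10_000, 32, 100


def build_q_net():
    """Small fully connected Q-network: state -> Q-value per migration action."""
    return nn.Sequential(
        nn.Linear(STATE_DIM, 64), nn.ReLU(),
        nn.Linear(64, N_ACTIONS),
    )


def train(env, num_episodes=200, max_slots=500):
    q_net = build_q_net()                         # main DQN with random weights
    target_net = build_q_net()                    # target DQN
    target_net.load_state_dict(q_net.state_dict())
    optimizer = optim.Adam(q_net.parameters(), lr=LR)
    memory = deque(maxlen=MEMORY_CAPACITY)        # replay memory D
    step = 0

    for _ in range(num_episodes):
        state = env.reset()                       # initial state s_1
        for _ in range(max_slots):
            # Epsilon-greedy selection over migration actions,
            # matching the "random number < epsilon -> greedy" rule above.
            if random.random() < EPSILON:
                with torch.no_grad():
                    q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
                action = int(q_values.argmax())
            else:
                action = random.randrange(N_ACTIONS)

            next_state, reward, done = env.step(action)   # execute migration action
            memory.append((state, action, reward, next_state, done))
            state = next_state
            step += 1

            if len(memory) >= BATCH_SIZE:
                batch = random.sample(memory, BATCH_SIZE)
                s, a, r, s2, d = zip(*batch)
                s = torch.as_tensor(s, dtype=torch.float32)
                a = torch.as_tensor(a, dtype=torch.int64).unsqueeze(1)
                r = torch.as_tensor(r, dtype=torch.float32)
                s2 = torch.as_tensor(s2, dtype=torch.float32)
                d = torch.as_tensor(d, dtype=torch.float32)

                # y_j = r_j                                   if the episode terminates
                #     = r_j + gamma * max_a' Q_target(s', a')  otherwise
                with torch.no_grad():
                    y = r + GAMMA * target_net(s2).max(dim=1).values * (1.0 - d)
                q = q_net(s).gather(1, a).squeeze(1)
                loss = nn.functional.mse_loss(q, y)

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

            # Periodically copy the main network weights into the target network.
            if step % TARGET_UPDATE == 0:
                target_net.load_state_dict(q_net.state_dict())

            if done:   # e.g. all vehicles have left the simulation region
                break
```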