Research Article
Task Migration Based on Reinforcement Learning in Vehicular Edge Computing
Initialize main DQN Q with random weights θ
Initialize target DQN Q̂ with weights θ⁻ = θ
Initialize replay memory D to capacity N
For each episode do
    Initialize initial state s_1 and reward r = 0
    For time slot t = 1, 2, ..., T do
        The controller acquires information about vehicles, tasks, and VECSs by interacting with the environment
        If the random number < ε:
            Select action a_t = argmax_a Q(s_t, a; θ)
        Else:
            Randomly select action a_t
        Execute action a_t at the controller, observe reward r_t and next state s_{t+1}
        Store the tuple (s_t, a_t, r_t, s_{t+1}) in D
        Randomly sample a minibatch of tuples (s_j, a_j, r_j, s_{j+1}) from D
        If the episode terminates at step j+1, then
            y_j = r_j
        Else:
            y_j = r_j + γ max_{a'} Q̂(s_{j+1}, a'; θ⁻)
        Perform a gradient descent step on (y_j − Q(s_j, a_j; θ))² with respect to the network parameters θ, where the loss function is L(θ) = (y_j − Q(s_j, a_j; θ))², and periodically update the target network weights θ⁻ ← θ
        Terminate when all the vehicles are out of the simulation region
    End for
End for
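The listing above follows the standard DQN training pattern. A minimal sketch in Python/PyTorch is given below; the VEC environment interface (env.reset()/env.step()), the state and action dimensions, and the hyperparameter values are illustrative assumptions rather than values taken from this article.

```python
# Minimal DQN training-loop sketch of the algorithm above.
# The environment object is a hypothetical stand-in for the vehicular edge
# computing simulation; sizes and hyperparameters are illustrative only.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

STATE_DIM, N_ACTIONS = 16, 5           # assumed state/action sizes
GAMMA, EPSILON, LR = 0.9, 0.9, 1e-3    # assumed hyperparameters
MEMORY_CAPACITY, BATCH_SIZE, TARGET_UPDATE = 10_000, 32, 100


def build_q_net():
    """Small fully connected Q-network: state -> Q-value per migration action."""
    return nn.Sequential(
        nn.Linear(STATE_DIM, 64), nn.ReLU(),
        nn.Linear(64, N_ACTIONS),
    )


def train(env, num_episodes=200, max_slots=500):
    q_net = build_q_net()                         # main DQN with random weights
    target_net = build_q_net()                    # target DQN
    target_net.load_state_dict(q_net.state_dict())
    optimizer = optim.Adam(q_net.parameters(), lr=LR)
    memory = deque(maxlen=MEMORY_CAPACITY)        # replay memory D
    step = 0

    for _ in range(num_episodes):
        state = env.reset()                       # initial state s_1
        for _ in range(max_slots):
            # Epsilon-greedy selection over migration actions,
            # matching the "random number < epsilon -> greedy" rule above.
            if random.random() < EPSILON:
                with torch.no_grad():
                    q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
                action = int(q_values.argmax())
            else:
                action = random.randrange(N_ACTIONS)

            next_state, reward, done = env.step(action)   # execute migration action
            memory.append((state, action, reward, next_state, done))
            state = next_state
            step += 1

            if len(memory) >= BATCH_SIZE:
                batch = random.sample(memory, BATCH_SIZE)
                s, a, r, s2, d = zip(*batch)
                s = torch.as_tensor(s, dtype=torch.float32)
                a = torch.as_tensor(a, dtype=torch.int64).unsqueeze(1)
                r = torch.as_tensor(r, dtype=torch.float32)
                s2 = torch.as_tensor(s2, dtype=torch.float32)
                d = torch.as_tensor(d, dtype=torch.float32)

                # y_j = r_j                                   if the episode terminates
                #     = r_j + gamma * max_a' Q_target(s', a')  otherwise
                with torch.no_grad():
                    y = r + GAMMA * target_net(s2).max(dim=1).values * (1.0 - d)
                q = q_net(s).gather(1, a).squeeze(1)
                loss = nn.functional.mse_loss(q, y)

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

            # Periodically copy the main network weights into the target network.
            if step % TARGET_UPDATE == 0:
                target_net.load_state_dict(q_net.state_dict())

            if done:   # e.g. all vehicles have left the simulation region
                break
```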