Research Article
Deep Reinforcement Learning for Collaborative Computation Offloading on Internet of Vehicles
Algorithm 2
DQN-based joint computation offloading and resource allocation algorithm.
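1: Initialize replay memory set $D$
2: Initialize action-value function $Q$ with random weights $\theta$
3: Initialize target action-value function $\hat{Q}$ with weights $\theta^{-} = \theta$
4: for episode $= 1, \dots, M$ do
5:   Initialize sequence $s_1$ and preprocessed sequence $\phi_1 = \phi(s_1)$
6:   for $t = 1, 2, \dots, T$ do
7:     With probability $\varepsilon$ select a random action $a_t$
8:     Otherwise select $a_t = \arg\max_{a} Q(\phi(s_t), a; \theta)$
9:     Execute action $a_t$, observe the reward $r_t$ and the next state $s_{t+1}$
10:    Set $s_{t+1}$ and preprocess $\phi_{t+1} = \phi(s_{t+1})$
11:    Store experience $(\phi_t, a_t, r_t, \phi_{t+1})$ in $D$
12:    Sample a random minibatch of experience $(\phi_j, a_j, r_j, \phi_{j+1})$ from $D$
13:    Set $y_j = r_j$ if the episode terminates at step $j+1$
14:    Otherwise set $y_j = r_j + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^{-})$
15:    Perform a gradient descent step on $(y_j - Q(\phi_j, a_j; \theta))^2$ with respect to the network parameters $\theta$
16:    Every $C$ steps reset $\hat{Q} = Q$
17:  end for
18: end for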
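As an illustration of the control flow in Algorithm 2, the loop can be sketched in a few lines of Python. This is a minimal sketch under strong assumptions and not the implementation used in this work: a lookup table stands in for the deep Q-network, preprocessing is the identity, and the environment `toy_step` is a hypothetical four-state chain rather than the vehicular offloading environment. All constants (`GAMMA`, `EPSILON`, `LR`, `BATCH`, `SYNC_EVERY`, `CAPACITY`) are illustrative values chosen for the toy problem.

```python
import random
from collections import deque

# Hypothetical toy sizes and hyperparameters (not taken from the paper).
N_STATES, N_ACTIONS = 4, 2
GAMMA, EPSILON, LR = 0.9, 0.3, 0.1
BATCH, SYNC_EVERY, CAPACITY = 8, 20, 500


def toy_step(state, action):
    """Hypothetical chain environment: action 1 moves right, action 0 stays.
    Reaching the last state yields reward 1 and ends the episode."""
    next_state = min(state + action, N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(q, state):
    """Return the action with the largest estimated value in `state`."""
    return max(range(N_ACTIONS), key=lambda a: q[state][a])


def train(episodes=500, max_steps=20, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=CAPACITY)                       # step 1: replay memory
    q = [[rng.uniform(-0.01, 0.01) for _ in range(N_ACTIONS)]
         for _ in range(N_STATES)]                        # step 2: random weights
    q_target = [row[:] for row in q]                      # step 3: target copy
    step_count = 0
    for _ in range(episodes):                             # step 4
        state = 0                                         # step 5: initial state
        for _ in range(max_steps):                        # step 6
            if rng.random() < EPSILON:                    # step 7: explore
                action = rng.randrange(N_ACTIONS)
            else:                                         # step 8: exploit
                action = greedy(q, state)
            next_state, reward, done = toy_step(state, action)  # step 9
            memory.append((state, action, reward, next_state, done))  # step 11
            batch = rng.sample(list(memory), min(BATCH, len(memory)))  # step 12
            for s, a, r, s2, d in batch:
                # steps 13-14: target uses the frozen target table
                y = r if d else r + GAMMA * max(q_target[s2])
                # step 15: gradient step on 0.5 * (y - Q)^2 for a table entry
                q[s][a] += LR * (y - q[s][a])
            step_count += 1
            if step_count % SYNC_EVERY == 0:              # step 16: sync target
                q_target = [row[:] for row in q]
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    learned = train()
    print([greedy(learned, s) for s in range(N_STATES - 1)])
```

Keeping `q_target` frozen between synchronizations is what separates steps 3, 14, and 16 from plain Q-learning: the regression target does not move with every update, which is the stabilizing device the algorithm relies on. In a full DQN the table update in step 15 would be replaced by a backpropagation step through the network.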