Wireless Communications and Mobile Computing

Research Article

Deep Reinforcement Learning for Collaborative Computation Offloading on Internet of Vehicles

Q-learning-based joint computation offloading and resource allocation algorithm.

Input: state space $\mathcal{S}$, action space $\mathcal{A}$, learning rate $\alpha$, discount factor $\gamma$
Output: the Q-values $Q(s, a)$ for every state-action pair
1: Initialize $Q(s, a)$ arbitrarily for $s \in \mathcal{S}$, $a \in \mathcal{A}$
2: for each episode do
3:   for each step of episode do
4:     In the current state $s$, choose an action $a$ with a random probability $p$
5:     if $p < \epsilon$ then
6:       randomly select an action $a$
7:     else
8:       select $a = \arg\max_{a'} Q(s, a')$
9:     end if
10:    Execute action $a$, observe the reward $r$ and the next state $s'$
11:    Update $Q(s, a)$ according to eq. (29)
12:    Update state $s \leftarrow s'$
13:  end for
14: end for
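
The loop above can be sketched in Python as a minimal tabular example. Eq. (29) is not reproduced in this excerpt, so the sketch assumes it is the standard Q-learning update $Q(s, a) \leftarrow Q(s, a) + \alpha\,[r + \gamma \max_{a'} Q(s', a') - Q(s, a)]$. The five-state chain environment (`chain_reset`, `chain_step`), the exploration rate `epsilon`, the random tie-breaking, and all default values are hypothetical choices made for illustration; they are not the offloading environment or the parameters of the paper.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Steps 4-9: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    # Break ties at random (an assumption; the algorithm box does not say how).
    return rng.choice([a for a in actions if q[(state, a)] == best])


def q_learning(env_reset, env_step, actions, alpha=0.1, gamma=0.9,
               epsilon=0.1, episodes=500, max_steps=50, seed=0):
    """Tabular Q-learning; returns a dict mapping (state, action) to Q-value."""
    rng = random.Random(seed)
    q = defaultdict(float)  # step 1: "arbitrary" initialization, zeros here
    for _ in range(episodes):                      # step 2
        state = env_reset()
        for _ in range(max_steps):                 # step 3
            action = epsilon_greedy(q, state, actions, epsilon, rng)
            next_state, reward, done = env_step(state, action)  # step 10
            # Step 11: assumed standard update (stand-in for eq. (29)).
            best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
            td_error = reward + gamma * best_next - q[(state, action)]
            q[(state, action)] += alpha * td_error
            state = next_state                     # step 12
            if done:
                break
    return q


# Hypothetical toy environment: states 0..4 on a chain, action 0 moves left,
# action 1 moves right, and reaching state 4 yields reward 1 and ends the episode.
def chain_reset():
    return 0


def chain_step(state, action):
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


if __name__ == "__main__":
    q_table = q_learning(chain_reset, chain_step, actions=[0, 1])
    for s in range(4):
        greedy = max([0, 1], key=lambda a: q_table[(s, a)])
        print(f"state {s}: greedy action {greedy}")
```

In the joint offloading and resource allocation setting, `env_step` would instead apply the chosen offloading and allocation decision and return the resulting reward; the table-based update itself would stay the same under the assumption stated above.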