Research Article

Deep Reinforcement Learning for Collaborative Computation Offloading on Internet of Vehicles

Algorithm 1

Q-learning-based joint computation offloading and resource allocation algorithm.
Input: state space $\mathcal{S}$, action space $\mathcal{A}$, learning rate $\alpha$, discount factor $\gamma$
Output: the Q-values for every state-action pair
1:  Initialize $Q(s, a)$ arbitrarily for all $s \in \mathcal{S}$, $a \in \mathcal{A}$
2: for each episode do
3:  for each step of episode do
4:   In the current state $s$, draw a random probability $p \in [0, 1]$ to choose an action $a$
5:   if $p < \varepsilon$ then
6:    randomly select an action
7:   else
8:    select $a = \arg\max_{a'} Q(s, a')$
9:   end if
10:   Execute action $a$, observe the reward $r$ and the next state $s'$
11:   Update $Q(s, a)$ according to eq. (29)
12:   Update state $s \leftarrow s'$
13:  end for
14: end for
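
The loop in Algorithm 1 can be illustrated with a minimal tabular sketch in Python. Several parts are assumptions made for illustration rather than details of the article: the update rule is taken to be the standard Q-learning update, Q(s, a) ← Q(s, a) + α[r + γ max Q(s', a') − Q(s, a)], which eq. (29) is presumed to match; the exploration rate `epsilon`, the episode and step limits, and random tie-breaking are hypothetical choices; and `chain_step` is a toy five-state environment, not the offloading and resource allocation model.

```python
import random


def q_learning(states, actions, step, alpha=0.1, gamma=0.9, epsilon=0.1,
               episodes=500, max_steps=50, start_state=0, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy.

    `step(s, a)` is a hypothetical environment function that returns
    (reward, next_state, done). All default values are assumptions.
    """
    rng = random.Random(seed)
    # Step 1: initialize Q(s, a) (here to zero) for every state-action pair.
    q = {(s, a): 0.0 for s in states for a in actions}

    for _ in range(episodes):                      # step 2
        s = start_state
        for _ in range(max_steps):                 # step 3
            # Steps 4-9: epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(actions)            # explore
            else:
                best = max(q[(s, b)] for b in actions)
                # Exploit; ties are broken at random (an assumption).
                a = rng.choice([b for b in actions if q[(s, b)] == best])

            # Step 10: execute the action, observe reward and next state.
            r, s_next, done = step(s, a)

            # Step 11: standard Q-learning update (assumed form of eq. (29)).
            best_next = 0.0 if done else max(q[(s_next, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

            # Step 12: move to the next state.
            s = s_next
            if done:
                break
    return q


def chain_step(s, a):
    """Toy environment: states 0..4 on a line, reward 1 on reaching state 4."""
    s_next = min(s + 1, 4) if a == "right" else max(s - 1, 0)
    done = s_next == 4
    return (1.0 if done else 0.0), s_next, done


if __name__ == "__main__":
    actions = ["left", "right"]
    q_table = q_learning(list(range(5)), actions, chain_step)
    # Greedy policy read off the learned Q-values for the non-terminal states.
    policy = {s: max(actions, key=lambda b: q_table[(s, b)]) for s in range(4)}
    print(policy)
```

In the article's setting, the state and action would instead encode the offloading decision and the resource allocation, and `step` would return the reward defined by the system model; the table-based loop itself would stay the same.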