Research Article

A Resource Allocation Scheme for Intelligent Tasks in Vehicular Networks

Algorithm 2

Multiagent DQN algorithm.
1: Input: action space, state space, learning factor, discount factor, and other system parameters;
2: Output: target Q-network.
3: Initialization Process:
4: for each agent do
5:   Initialize the replay buffer;
6:   Initialize the weights of the Q-network;
7:   Initialize the weights of the target Q-network.
8: end for
9: Learning Process:
10: for each agent do
11:   repeat
12:    Initialize the state;
13:    for each time step do
14:      Generate a random number;
15:      if the random number is below the exploration probability then
16:        Select a random action from the action space;
17:      else
18:        Select the action with the largest Q-value in the current state.
19:      end if
20:      Perform the selected action;
21:      Get the new state and the reward by Equation (11);
22:      Store the transition in the replay buffer;
23:      Sample random transitions from the replay buffer;
24:      Compute the target Q-value of each sampled transition with the target Q-network;
25:      Use the loss between the target and predicted Q-values to train the Q-network;
26:      Update the weights of the Q-network;
27:      ;
28:      Copy the Q-network weights to the target Q-network after a fixed number of steps.
29:    end for
30:   until convergence.
31: end for
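
The per-agent loop of Algorithm 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The five-state chain environment, the function names (step, greedy, train_agent, train_all), and every hyperparameter value are hypothetical. The reward here does not come from Equation (11). The Q-network is replaced by a linear Q-function over one-hot states, which is equivalent to a table, so that the example runs with the standard library only. A fixed episode budget stands in for the convergence test, and each agent trains in its own copy of the toy environment.

```python
import random

N_STATES, N_ACTIONS = 5, 2   # hypothetical toy chain; the last state is the goal


def step(state, action):
    """Toy dynamics (assumed): action 1 moves right, action 0 moves left."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def greedy(q, state, rng):
    """Action with the largest Q-value; ties are broken at random."""
    best = max(q[state])
    return rng.choice([a for a in range(N_ACTIONS) if q[state][a] == best])


def train_agent(rng, episodes=200, horizon=20, eps=0.2, lr=0.1,
                gamma=0.9, batch=8, sync_every=10):
    buffer = []                                        # replay buffer
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]   # stand-in for Q-network weights
    q_target = [row[:] for row in q]                   # target weights
    steps = 0
    for _ in range(episodes):      # fixed budget stands in for "until convergence"
        state = 0
        for _ in range(horizon):
            if rng.random() < eps:                     # explore
                action = rng.randrange(N_ACTIONS)
            else:                                      # exploit
                action = greedy(q, state, rng)
            nxt, reward, done = step(state, action)
            buffer.append((state, action, reward, nxt, done))
            for s, a, r, s2, d in rng.sample(buffer, min(batch, len(buffer))):
                target = r if d else r + gamma * max(q_target[s2])
                # Gradient step on 0.5 * (target - Q)^2 for a one-hot linear Q.
                q[s][a] += lr * (target - q[s][a])
            steps += 1
            if steps % sync_every == 0:                # periodic target sync
                q_target = [row[:] for row in q]
            if done:
                break
            state = nxt
    return q_target


def train_all(n_agents=2, seed=0):
    """Train each agent independently, one replay buffer and Q-function per agent."""
    return [train_agent(random.Random(seed + i)) for i in range(n_agents)]
```

Calling train_all() returns one target Q-table per agent. In this toy chain the greedy policy read from each table should move right in every non-goal state. Swapping the table for a neural network changes only the two lines that compute the target and apply the gradient step.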