Wireless Communications and Mobile Computing

Research Article

A Resource Allocation Scheme for Intelligent Tasks in Vehicular Networks

Multiagent DQN algorithm.

1: Input: action space $\mathcal{A}$, state space $\mathcal{S}$, learning factor $\alpha$, discount factor $\gamma$, and other system parameters;
2: Output: target Q-network $Q(s, a; \theta^{-})$.
3: Initialization Process:
4: for each agent do
5: Initialize replay buffer $\mathcal{D}$;
6: Initialize the weights $\theta$ of the Q-network;
7: Initialize the weights $\theta^{-}$ of the target Q-network.
8: end for
9: Learning Process:
10: for each agent do
11: repeat
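12: Initialize state $s_0$;
13: for each time step $t$ do
14: Generate a random number $x \in [0, 1]$;
15: if $x < \varepsilon$ then
16: Select a random action $a_t$ from $\mathcal{A}$;
17: else
18: Select the action $a_t = \arg\max_{a \in \mathcal{A}} Q(s_t, a; \theta)$.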
19: end if
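20: Perform action $a_t$;
21: Get new state $s_{t+1}$ and reward $r_t$ by Equation (11);
22: Store transition $(s_t, a_t, r_t, s_{t+1})$ in $\mathcal{D}$;
23: Sample random transitions $(s_j, a_j, r_j, s_{j+1})$ from $\mathcal{D}$;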
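24: Compute the target $y_j = r_j + \gamma \max_{a'} Q(s_{j+1}, a'; \theta^{-})$ for each sampled transition $(s_j, a_j, r_j, s_{j+1})$;
25: Use $L(\theta) = \left(y_j - Q(s_j, a_j; \theta)\right)^2$ as a loss function to train Q-network;
26: Update the weights: $\theta \leftarrow \theta - \alpha \nabla_{\theta} L(\theta)$;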
27: ;
28: $\theta^{-} \leftarrow \theta$, after $C$ steps.
29: end for
30: until the Q-network converges.
31: end for
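
The learning process above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the Q-network is replaced by a linear function over one-hot states (equivalent to a table), so the gradient step reduces to a per-entry update; the reward of Equation (11) is replaced by a hypothetical toy reward (`toy_step`); the "until converges" test is replaced by a fixed number of episodes; and all names and hyperparameter values (`Agent`, `sync_every`, `epsilon`, and so on) are illustrative only.

```python
import random
from collections import deque


class Agent:
    """One independent learner with its own replay buffer and two weight sets."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.5,
                 epsilon=0.3, buffer_size=500, batch_size=8, sync_every=20):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.batch_size, self.sync_every = batch_size, sync_every
        self.buffer = deque(maxlen=buffer_size)  # replay buffer
        # Assumption: linear Q over one-hot states stands in for a neural network.
        self.theta = [[0.0] * n_actions for _ in range(n_states)]
        self.theta_target = [row[:] for row in self.theta]  # target weights
        self.steps = 0

    def act(self, state, rng):
        """Epsilon-greedy action selection (steps 14-19)."""
        if rng.random() < self.epsilon:
            return rng.randrange(self.n_actions)
        q = self.theta[state]
        return max(range(self.n_actions), key=lambda a: q[a])

    def learn(self, transition, rng):
        """Store, sample, and update (steps 22-28)."""
        self.buffer.append(transition)
        k = min(self.batch_size, len(self.buffer))
        for s, a, r, s_next in rng.sample(list(self.buffer), k):
            # TD target computed from the target weights.
            y = r + self.gamma * max(self.theta_target[s_next])
            # Gradient step on 0.5 * (y - Q)^2 for the one-hot linear model.
            self.theta[s][a] += self.alpha * (y - self.theta[s][a])
        self.steps += 1
        if self.steps % self.sync_every == 0:
            # Periodically copy the online weights into the target weights.
            self.theta_target = [row[:] for row in self.theta]


def toy_step(state, action, n_states, rng):
    """Hypothetical environment; NOT the reward of Equation (11)."""
    reward = 1.0 if action == state % 2 else 0.0
    return rng.randrange(n_states), reward


def train(n_agents=2, n_states=4, n_actions=2, episodes=50, horizon=100, seed=0):
    rng = random.Random(seed)
    agents = [Agent(n_states, n_actions) for _ in range(n_agents)]
    for agent in agents:                      # "for each agent do"
        for _ in range(episodes):             # fixed count replaces "until converges"
            state = rng.randrange(n_states)   # initialize state
            for _ in range(horizon):
                action = agent.act(state, rng)
                next_state, reward = toy_step(state, action, n_states, rng)
                agent.learn((state, action, reward, next_state), rng)
                state = next_state
    return agents


agents = train()
# Greedy policy of each agent: best action per state under the learned weights.
policies = [[max(range(2), key=lambda a: ag.theta[s][a]) for s in range(4)]
            for ag in agents]
```

In this sketch each agent learns independently, as in the outer loop of the listing. A practical implementation would replace the linear model with a neural network and the toy environment with the vehicular network simulation.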