Research Article

Joint Optimization for MEC Computation Offloading and Resource Allocation in IoV Based on Deep Reinforcement Learning

Algorithm 1

Decentralized multiagent DDPG optimization method.
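Randomly initialize critic network $Q(s, a \mid \theta^{Q})$ and actor $\mu(s \mid \theta^{\mu})$ with weights $\theta^{Q}$ and $\theta^{\mu}$
Initialize target networks $Q'$ and $\mu'$ with weights $\theta^{Q'} \leftarrow \theta^{Q}$, $\theta^{\mu'} \leftarrow \theta^{\mu}$
Initialize replay buffer $R$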
for episode $= 1, \ldots, M$ do
 Initialize a random process $\mathcal{N}$ for action exploration
 Receive initial observation state $s_1$
  for $t = 1, \ldots, T$ do
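   Select action $a_t = \mu(s_t \mid \theta^{\mu}) + \mathcal{N}_t$ according to the current policy and exploration noise
   Execute action $a_t$, then observe reward $r_t$ and the next state $s_{t+1}$
   Store all transitions $(s_t, a_t, r_t, s_{t+1})$ in $R$
   Sample a random mini-batch of $N$ transitions $(s_j, a_j, r_j, s_{j+1})$ from $R$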
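   Set
    $y_j = r_j + \gamma Q'(s_{j+1}, \mu'(s_{j+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$
   Update critic network by minimizing the loss over the $N$ sampled transitions
    $L = \frac{1}{N} \sum_{j} \left( y_j - Q(s_j, a_j \mid \theta^{Q}) \right)^2$
   Update the actor policy by using the sampled policy gradient
    $\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{j} \nabla_{a} Q(s, a \mid \theta^{Q}) \big|_{s = s_j,\, a = \mu(s_j)} \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \big|_{s = s_j}$
   Update the target networks for each agent:
    $\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau) \theta^{Q'}$
    $\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau) \theta^{\mu'}$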
  end for
end for
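
The bookkeeping steps of Algorithm 1 (storing and sampling transitions, forming the critic target, measuring the critic loss, and softly updating the target networks) can be sketched for a single agent as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ReplayBuffer`, `td_targets`, `critic_loss`, and `soft_update` are hypothetical, the linear `target_actor` and `target_critic` are toy stand-ins for the neural networks, and the values of `gamma` and `tau` are arbitrary. The actor's policy-gradient step is omitted because it requires an automatic differentiation library.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next) transitions for one agent."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # oldest transitions are dropped first

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement, stacked into batch arrays.
        batch = rng.sample(list(self.data), batch_size)
        s, a, r, s_next = map(np.array, zip(*batch))
        return s, a, r, s_next


def td_targets(r, s_next, target_actor, target_critic, gamma):
    """Critic target: y_j = r_j + gamma * Q'(s_{j+1}, mu'(s_{j+1}))."""
    return r + gamma * target_critic(s_next, target_actor(s_next))


def critic_loss(y, q_values):
    """Mean squared difference between targets y_j and Q(s_j, a_j)."""
    return float(np.mean((y - q_values) ** 2))


def soft_update(target, online, tau):
    """theta' <- tau * theta + (1 - tau) * theta', per parameter array."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


# Toy linear stand-ins for the target networks (assumptions for illustration).
W_mu = np.array([[0.5], [-0.5]])  # actor: a = s @ W_mu (state dim 2, action dim 1)
w_q = np.array([1.0, 2.0, 3.0])   # critic: Q = [s, a] @ w_q


def target_actor(s):
    return s @ W_mu


def target_critic(s, a):
    return np.concatenate([s, a], axis=1) @ w_q


rng = random.Random(0)
buffer = ReplayBuffer(capacity=100)
buffer.store(np.array([0.0, 1.0]), np.array([0.2]), 1.0, np.array([1.0, 0.0]))
s, a, r, s_next = buffer.sample(1, rng)
y = td_targets(r, s_next, target_actor, target_critic, gamma=0.9)
new_target = soft_update([np.zeros(2)], [np.ones(2)], tau=0.01)
```

In a decentralized multiagent setting, each agent would hold its own buffer and its own online and target parameters and run these steps independently. A small `tau` keeps the target networks changing slowly, which is what stabilizes the critic targets.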