Research Article

Multiagent Reinforcement Learning for Task Offloading of Space/Aerial-Assisted Edge Computing

Algorithm 1: MADDPG algorithm for task offloading in SAGIN.
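(1)Initialization:
(2)Randomly initialize the critic network $Q_i(x, a \mid \theta_i^{Q})$ and the actor network $\mu_i(o_i \mid \theta_i^{\mu})$ of each agent $i$ with weights $\theta_i^{Q}$ and $\theta_i^{\mu}$
(3)Initialize the target networks $Q_i'$ and $\mu_i'$ with weights $\theta_i^{Q'} \leftarrow \theta_i^{Q}$ and $\theta_i^{\mu'} \leftarrow \theta_i^{\mu}$
(4)Empty the replay buffer $\mathcal{D}$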
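(5)for episode $= 1, \dots, M$ do
(6)   Initialize a Gaussian noise $\mathcal{N}$ with mean = 0;
(7)   Receive the initial observation state $o_i$;
(8)   for time slot $t = 1, \dots, T$ do
(9)    Select action $a_i = \mu_i(o_i \mid \theta_i^{\mu}) + \mathcal{N}$ according to the current policy and exploration noise
(10)    Execute action $a_i$ and observe the reward $r_i$ and the next state $o_i'$
(11)    Collect the global state $x = (o_1, \dots, o_N)$ and the joint action $a = (a_1, \dots, a_N)$;
(12)    Store transition $(x, a, r, x')$ in $\mathcal{D}$;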
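(13)    Sample a random mini-batch of $S$ transitions $(x^{j}, a^{j}, r^{j}, x^{\prime j})$ from $\mathcal{D}$;
(14)    Set $y^{j} = r_i^{j} + \gamma\, Q_i'(x^{\prime j}, a_1', \dots, a_N')\big|_{a_k' = \mu_k'(o_k^{\prime j})}$;
(15)    Update the critic network by minimizing the loss
    $L(\theta_i^{Q}) = \frac{1}{S} \sum_{j} \left( y^{j} - Q_i(x^{j}, a_1^{j}, \dots, a_N^{j}) \right)^2$;
(16)    Update the actor policy by using the sampled policy gradient
    $\nabla_{\theta_i^{\mu}} J \approx \frac{1}{S} \sum_{j} \nabla_{\theta_i^{\mu}} \mu_i(o_i^{j})\, \nabla_{a_i} Q_i(x^{j}, a_1^{j}, \dots, a_i, \dots, a_N^{j})\big|_{a_i = \mu_i(o_i^{j})}$;
(17)    Update the target networks for each agent $i$:
     $\theta_i^{Q'} \leftarrow \tau \theta_i^{Q} + (1 - \tau)\, \theta_i^{Q'}$ and $\theta_i^{\mu'} \leftarrow \tau \theta_i^{\mu} + (1 - \tau)\, \theta_i^{\mu'}$;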
(18)   end
(19)end
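
The bookkeeping in steps (12) to (15) and (17) can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' implementation: the names `ReplayBuffer`, `td_target`, `critic_loss`, and `soft_update` are hypothetical, the critic outputs are passed in as plain numbers instead of being computed by a neural network, and network weights are modeled as flat lists of floats. The actor gradient of step (16) is omitted because it requires an automatic-differentiation framework.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity buffer of joint transitions (x, a, r, x_next). Hypothetical helper."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.storage = deque(maxlen=capacity)

    def store(self, x, a, r, x_next):
        # Step (12): store the joint transition.
        self.storage.append((x, a, r, x_next))

    def sample(self, batch_size):
        # Step (13): uniform sampling without replacement.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def td_target(reward, gamma, q_next):
    """Step (14): y = r + gamma * Q'(x', a'), with Q' supplied as a number."""
    return reward + gamma * q_next


def critic_loss(targets, q_values):
    """Step (15): mean squared TD error over the mini-batch."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(target_weights, online_weights, tau):
    """Step (17): theta' <- tau * theta + (1 - tau) * theta', element by element."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_weights, target_weights)]


# Usage with made-up numbers (assumed values, for illustration only).
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.store(x=t, a=0, r=1.0, x_next=t + 1)

batch = buffer.sample(2)                       # two of the three newest transitions
y = td_target(1.0, 0.5, 4.0)                   # 1.0 + 0.5 * 4.0 = 3.0
loss = critic_loss([3.0, 1.0], [1.0, 1.0])     # (4.0 + 0.0) / 2 = 2.0
new_target = soft_update([0.0, 2.0], [1.0, 4.0], 0.5)  # [0.5, 3.0]
```

A small soft-update rate keeps the target networks close to their previous values, which is the usual reason for updating them gradually instead of copying the online weights outright.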