Research Article

Deep Reinforcement Learning-Based UAV Data Collection and Offloading in NOMA-Enabled Marine IoT Systems

Algorithm 3

Joint TD3-based trajectory optimization, power control, and buoy-UAV association relationship scheme (TTO-PCAR).
Input: The UAV’s initial position , the buoys’ position , the OBS’s position ;
Output:, , , ;
1. for episode = 0 to do
2.  for epoch to do
3.   /* Lines 11-15 of Algorithm 1 */
4.   if then
5.    Let , and obtain current channel gain .
6.    Set .
7.    if then
8.     Let UAV-OBS association relationship .
9.    end if
10.   else
11.    Let , and obtain current channel gain .
12.    Update and with given by performing Algorithm 2.
13.    /* Lines 7-10 of Algorithm 1 */
14.    Update with given transmit power and association relationship.
15.   end if
16.  end for
17. end for
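
The nested loop and two-way branch of Algorithm 3 can be sketched in Python as follows. This is only a minimal illustration of the control flow, not the authors' implementation. The conditions in steps 4 and 7 did not survive in the text above, so they appear here as hypothetical predicates (`is_offloading`, `can_associate_obs`). Reading the first branch as the UAV-to-OBS offloading phase is an assumption based on step 8. Algorithm 1 and Algorithm 2 are represented by caller-supplied placeholder callables, and all function and parameter names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepLog:
    episode: int
    epoch: int
    branch: str               # "offload" (steps 5-9) or "collect" (steps 11-14)
    uav_obs_associated: bool  # result of the inner test in steps 7-8


def run_control_flow(
    num_episodes: int,
    num_epochs: int,
    is_offloading: Callable[[int, int], bool],       # hypothetical test for step 4
    can_associate_obs: Callable[[int, int], bool],   # hypothetical test for step 7
    allocate: Callable[[int, int], Tuple[float, int]],  # stands in for Algorithm 2
    update_policy: Callable[[float, int], None],     # stands in for the TD3 update
) -> List[StepLog]:
    """Walk the episode/epoch loops and record which branch each step takes."""
    log: List[StepLog] = []
    for episode in range(num_episodes):
        for epoch in range(num_epochs):
            # Step 3 (action selection and environment step from Algorithm 1)
            # is omitted in this sketch.
            if is_offloading(episode, epoch):
                # Steps 5-9: fix the offloading settings, then associate the
                # UAV with the OBS only if the inner condition holds.
                associated = can_associate_obs(episode, epoch)
                log.append(StepLog(episode, epoch, "offload", associated))
            else:
                # Steps 11-14: obtain power and association (Algorithm 2),
                # then update the policy with those values held fixed.
                power, association = allocate(episode, epoch)
                update_policy(power, association)
                log.append(StepLog(episode, epoch, "collect", False))
    return log


if __name__ == "__main__":
    updates = []
    steps = run_control_flow(
        num_episodes=1,
        num_epochs=3,
        is_offloading=lambda ep, t: t >= 2,   # placeholder rule
        can_associate_obs=lambda ep, t: True, # placeholder rule
        allocate=lambda ep, t: (0.5, t),      # placeholder power and buoy index
        update_policy=lambda p, a: updates.append((p, a)),
    )
    print([s.branch for s in steps])
```

Passing the predicates in as callables keeps the sketch independent of the missing conditions. Once the actual tests in steps 4 and 7 are known, they can be substituted without changing the loop structure.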