Wireless Communications and Mobile Computing

Research Article

Deep Reinforcement Learning-Based UAV Data Collection and Offloading in NOMA-Enabled Marine IoT Systems

Algorithm 3

Joint TD3-based trajectory optimization, power control, and buoy-UAV association relationship scheme (TTO-PCAR).

Input: The UAV’s initial position , the buoys’ positions , the OBS’s position ;
Output: , , , ;
1. for episode = 0 to do
2. for epoch to do
3. /* Lines 11-15 of Algorithm 1 */
4. if then
5. Let , and obtain current channel gain .
6. Set .
7. if then
8. Let UAV-OBS association relationship .
9. end if
10. else
11. Let , and obtain current channel gain .
12. Update and with given by performing Algorithm 2.
13. /* Lines 7-10 of Algorithm 1 */
14. Update with the given transmit power and association relationship.
15. end if
16. end for
17. end for
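
The control flow of Algorithm 3 can be illustrated with a minimal Python sketch. It reproduces only the loop and branch structure; the loop bounds and the two branch conditions are not legible in the listing above, so `num_episodes`, `num_epochs`, `outer_cond`, and `inner_cond` are hypothetical placeholders supplied by the caller, and the half-open loop ranges are likewise an assumption. The function name and the branch labels are illustrative and do not come from the paper. No TD3 update, channel model, or power control is implemented.

```python
from typing import Callable, List, Tuple

# Hypothetical type for the branch conditions, which are missing in the listing.
Cond = Callable[[int, int], bool]


def tto_pcar_skeleton(
    num_episodes: int,
    num_epochs: int,
    outer_cond: Cond,
    inner_cond: Cond,
) -> List[Tuple[int, int, str]]:
    """Return a trace of the branch taken at each (episode, epoch) pair."""
    trace: List[Tuple[int, int, str]] = []
    for episode in range(num_episodes):          # step 1
        for epoch in range(num_epochs):          # step 2
            # Step 3: the steps reused from Algorithm 1 would run here.
            if outer_cond(episode, epoch):       # step 4 (condition assumed)
                # Steps 5-6: obtain the channel gain and set the related quantity.
                branch = "first"
                if inner_cond(episode, epoch):   # step 7 (condition assumed)
                    # Step 8: set the UAV-OBS association relationship.
                    branch = "first+uav_obs_association"
            else:                                # step 10
                # Step 11: obtain the channel gain.
                # Step 12: run Algorithm 2 to update two quantities.
                # Step 13: reuse further steps of Algorithm 1.
                # Step 14: update with the transmit power and association.
                branch = "second"
            trace.append((episode, epoch, branch))
    return trace
```

Calling `tto_pcar_skeleton(1, 3, lambda e, k: k < 2, lambda e, k: k == 1)` returns one trace entry per epoch, which makes it easy to check which steps of the listing a given pair of conditions would trigger.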