Research Article
Deep Reinforcement Learning-Based UAV Data Collection and Offloading in NOMA-Enabled Marine IoT Systems
Algorithm 3
Joint TD3-based trajectory optimization, power control, and buoy-UAV association relationship scheme (TTO-PCAR).
Input: The UAV’s initial position [...], the buoys’ positions [...], the OBS’s position [...].
Output: [...], [...], [...], [...].
1. for episode = 0 to [...] do
2.   for epoch [...] to [...] do
3.     /* 11-15 of Algorithm 1 */
4.     if [...] then
5.       Let [...], and obtain the current channel gain [...].
6.       Set [...].
7.       if [...] then
8.         Let the UAV-OBS association relationship [...].
9.       end if
10.    else
11.      Let [...], and obtain the current channel gain [...].
12.      Update [...] and [...] with given [...] by performing Algorithm 2.
13.      /* 7-10 of Algorithm 1 */
14.      Update [...] with the given transmit power and association relationship.
15.    end if
16.  end for
17. end for
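The nesting of Algorithm 3 can be easier to follow as code. The following is a minimal Python sketch of the control flow only, not the authors' implementation. The tests in steps 4 and 7 did not survive in the listing, so they are passed in as hypothetical callables (`first_condition`, `second_condition`). The branch labels, the `StepLog` class, and the function name are also illustrative assumptions, and the loop bounds are assumed to be exclusive upper limits.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepLog:
    """Records which branch each epoch took (illustration only)."""
    branches: List[str] = field(default_factory=list)


def run_tto_pcar_skeleton(
    num_episodes: int,
    num_epochs: int,
    first_condition: Callable[[int, int], bool],   # hypothetical test for step 4
    second_condition: Callable[[int, int], bool],  # hypothetical test for step 7
) -> StepLog:
    """Walk the loop and branch structure of Algorithm 3 without any learning."""
    log = StepLog()
    for episode in range(num_episodes):              # step 1
        for epoch in range(num_epochs):              # step 2
            # Step 3: the referenced part of Algorithm 1 would run here.
            if first_condition(episode, epoch):      # step 4
                # Steps 5-6: obtain the channel gain and set the step-6 quantity.
                if second_condition(episode, epoch):  # step 7
                    # Step 8: set the UAV-OBS association relationship.
                    log.branches.append("uav_obs_association")
                else:
                    log.branches.append("first_branch_only")
            else:                                    # step 10
                # Steps 11-14: obtain the channel gain, run Algorithm 2,
                # run the referenced part of Algorithm 1, then update
                # with the resulting transmit power and association.
                log.branches.append("algorithm2_update")
    return log


if __name__ == "__main__":
    log = run_tto_pcar_skeleton(
        num_episodes=1,
        num_epochs=4,
        first_condition=lambda episode, epoch: epoch >= 2,
        second_condition=lambda episode, epoch: epoch == 3,
    )
    print(log.branches)
```

With the example predicates above, the first two epochs take the else branch (steps 11-14), the third enters the first branch without setting the UAV-OBS association, and the fourth reaches step 8.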