Research Article
Reinforcement Learning-Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions
Algorithm 1
UANOA algorithm for navigation and obstacle avoidance of the USV.
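Initialize the replay memory
Initialize the evaluate function of the USV with random weights
Initialize the target function of the USV with random weights
for episode = 1, 2, …, M do
    for t = 1, …, T do
        With the exploration probability, select a random USV rudder action;
        otherwise select the rudder action with the highest evaluate-function value
        Get the reward and the next state by executing the rudder action
        Store the experience in the replay memory, where the state is processed by the LSTM network
        Sample a random minibatch of experience from the replay memory
        Set the target value for each sampled experience from its reward and the target function
        Perform a gradient descent step on the squared error between the target value and the evaluate function, with respect to the weights of the evaluate function
        At a fixed interval of steps, reset the target function to the evaluate function
    end
end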
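The loop structure of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the one-dimensional heading-error environment, the three rudder commands, the lookup table standing in for the evaluate and target networks, all hyperparameter values, and the names `encode`, `step`, `train`, and `greedy_action` are assumptions introduced here for illustration. In particular, `encode` is only a pass-through placeholder for the LSTM processing step, and the toy task has no terminal states.

```python
import random
from collections import deque

LIMIT = 2                          # assumed bound on the discretized heading error
STATES = range(-LIMIT, LIMIT + 1)  # hypothetical state space
ACTIONS = (-1, 0, 1)               # hypothetical rudder commands

def encode(observation):
    """Placeholder for the LSTM step (assumption): passes the observation through."""
    return observation

def step(state, action):
    """Toy dynamics (assumption): the rudder action shifts the heading error."""
    next_obs = max(-LIMIT, min(LIMIT, state + action))
    reward = -abs(next_obs)        # a smaller heading error earns a higher reward
    return reward, next_obs

def train(episodes=300, horizon=20, capacity=200, batch_size=32,
          epsilon=0.2, gamma=0.9, lr=0.5, sync_every=25, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)                    # replay memory
    pairs = [(s, a) for s in STATES for a in ACTIONS]
    q_eval = {p: rng.uniform(-0.01, 0.01) for p in pairs}    # evaluate function
    q_target = {p: rng.uniform(-0.01, 0.01) for p in pairs}  # target function
    steps = 0
    for _ in range(episodes):
        state = encode(rng.choice(list(STATES)))
        for _ in range(horizon):
            # Epsilon-greedy choice of the rudder action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q_eval[(state, a)])
            reward, next_obs = step(state, action)
            next_state = encode(next_obs)
            memory.append((state, action, reward, next_state))
            # Sample a minibatch and take one step on the squared error per sample.
            batch = rng.sample(list(memory), min(batch_size, len(memory)))
            for s, a, r, s2 in batch:
                y = r + gamma * max(q_target[(s2, b)] for b in ACTIONS)
                q_eval[(s, a)] += lr * (y - q_eval[(s, a)])
            steps += 1
            if steps % sync_every == 0:                # periodic target reset
                q_target = dict(q_eval)
            state = next_state
    return q_eval

def greedy_action(q, state):
    """Return the rudder command with the highest learned value in `state`."""
    return max(ACTIONS, key=lambda a: q[(state, a)])
```

Calling `q = train()` and then `greedy_action(q, 2)` queries the learned table for the preferred rudder command at the largest positive heading error. Replacing the table with a neural network and `encode` with a recurrent encoder would bring the sketch closer to the setting described in Algorithm 1, although the details of that architecture are not specified here.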