Mathematical Problems in Engineering

Research Article

Reinforcement Learning-Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions

Algorithm 1

UANOA algorithm for navigation and obstacle avoidance of the USV.

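Initialize replay memory $D$
Initialize evaluate function $Q$ of the USV with random weights $\theta$
Initialize target function $\hat{Q}$ of the USV with random weights $\theta^{-}$
for episode = 1, 2, …, M do
    for t = 1, …, T do
        With probability $\varepsilon$ select a random USV rudder action $a_t$,
        otherwise select $a_t = \arg\max_{a} Q(s_t, a; \theta)$
        Get reward $r_t$ and next state $s_{t+1}$ by executing rudder action $a_t$
        Store experience $(s_t, a_t, r_t, s_{t+1})$ in $D$, where $s_t$ is processed by the LSTM network
        Sample random minibatch of experience $(s_j, a_j, r_j, s_{j+1})$ from $D$
        Set $y_j = r_j + \gamma \max_{a'} \hat{Q}(s_{j+1}, a'; \theta^{-})$
        Perform a gradient descent step on $(y_j - Q(s_j, a_j; \theta))^2$ with respect to the weights $\theta$
        Every $C$ steps reset $\hat{Q} = Q$
    end
end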
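
The loop in Algorithm 1 follows the familiar deep Q-learning pattern: epsilon-greedy action selection, experience replay, a bootstrapped target from a separate target function, and a periodic target reset. The following is a minimal sketch of that pattern, not the authors' implementation. It assumes a linear Q-function in place of the LSTM-based network, and every name and value in it (`N_ACTIONS`, `N_FEATURES`, `GAMMA`, `env_step`, `toy_env_step`, the learning rate, the buffer size) is hypothetical and chosen only for illustration.

```python
import random
from collections import deque

GAMMA = 0.9       # discount factor (assumed value)
N_ACTIONS = 3     # hypothetical rudder actions: port, hold, starboard
N_FEATURES = 4    # hypothetical state size (stand-in for the LSTM output)


def q_values(weights, state):
    """Linear stand-in for the Q-network: one weight row per action."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def select_action(weights, state, epsilon, rng):
    """Epsilon-greedy choice: random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    q = q_values(weights, state)
    return max(range(N_ACTIONS), key=lambda a: q[a])


def td_target(target_weights, reward, next_state, gamma=GAMMA):
    """Bootstrapped target: reward plus discounted best target-Q value."""
    return reward + gamma * max(q_values(target_weights, next_state))


def sgd_step(weights, state, action, y, lr=0.01):
    """One gradient step on (y - Q(s, a))^2; the factor 2 is folded into lr."""
    error = y - q_values(weights, state)[action]
    weights[action] = [w + lr * error * x
                       for w, x in zip(weights[action], state)]
    return error


def random_weights(rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES)]
            for _ in range(N_ACTIONS)]


def train(env_step, initial_state, episodes=5, horizon=20, batch_size=8,
          sync_every=10, epsilon=0.1, seed=0):
    """Run the training loop; env_step(state, action) -> (reward, next_state)."""
    rng = random.Random(seed)
    memory = deque(maxlen=1000)           # replay memory
    weights = random_weights(rng)         # evaluate function
    target_weights = random_weights(rng)  # target function
    steps = 0
    for _ in range(episodes):
        state = initial_state
        for _ in range(horizon):
            action = select_action(weights, state, epsilon, rng)
            reward, next_state = env_step(state, action)
            memory.append((state, action, reward, next_state))
            batch = rng.sample(list(memory), min(batch_size, len(memory)))
            for s, a, r, s_next in batch:
                sgd_step(weights, s, a, td_target(target_weights, r, s_next))
            steps += 1
            if steps % sync_every == 0:   # periodic target reset
                target_weights = [row[:] for row in weights]
            state = next_state
    return weights, target_weights, memory


def toy_env_step(state, action):
    """Hypothetical environment: rewards holding course, state is unchanged."""
    return (1.0 if action == 1 else -1.0), state
```

A call such as `train(toy_env_step, [1.0, 0.0, 0.0, 0.0])` runs the loop end to end on the toy environment. In a real setting the linear `q_values` would be replaced by the recurrent network, and `env_step` by the USV simulator.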