Abstract

In this paper, we present online obstacle avoidance planning methods for unmanned underwater vehicles (UUVs) based on the clockwork recurrent neural network (CW-RNN) and long short-term memory (LSTM), respectively. In essence, UUV online obstacle avoidance planning is a spatiotemporal sequence planning problem that takes the spatiotemporal data sequence of the sensors as input and produces control instructions for the motion controller of the UUV as output. Recurrent neural networks (RNNs) have proven to give state-of-the-art performance on many sequence labeling and sequence prediction tasks. To train the networks, a UUV obstacle avoidance dataset is generated, and offline training and testing are adopted. The two proposed types of RNN-based online obstacle avoidance planners are then compared in terms of path cost, obstacle avoidance planning success rate, training time, time consumption, learning ability, and generalization ability. The good performance of the proposed methods is demonstrated through a series of simulation experiments in different environments.

1. Introduction

The online obstacle avoidance planner is one of the important modules of a UUV and reflects its level of intelligence: it requires the UUV to plan a collision-free trajectory autonomously when navigating over long ranges in unknown environments. At present, the main obstacle avoidance methods include traditional methods [1–6], bionic algorithms [7–11], and reinforcement learning methods [12–15]. Traditionally, a bottleneck restricting the development of UUV obstacle avoidance technology has been the uncertainty of underwater sensing equipment. Moreover, the performance of obstacle avoidance in complex environments, and especially in maze environments, is not satisfactory.

LSTM is an RNN architecture that employs three special gating schemes to address the vanishing and exploding gradient problems. It is able to process complex sequential information, learning features from long-term input data, and has proven to give state-of-the-art performance on many challenging problems, including precipitation nowcasting, water table depth prediction, traffic forecasting, object tracking, and punctuation prediction.

The aim of precipitation nowcasting is to forecast the rainfall intensity in a local region over a relatively short period of time (e.g., 0–6 hours). Shi et al. formulated precipitation nowcasting as a spatiotemporal sequence forecasting problem, with the sequence of past radar maps as input and a fixed number of future radar maps as output, and proposed the convolutional LSTM model, in which convolutional structures are used to extract features and LSTM is used for forecasting [16].

Long-term predictions of water table depth in agricultural areas face enormous challenges because of the complex, heterogeneous hydrogeological characteristics, boundary conditions, and human activities involved. In addition, there are nonlinear interactions among these factors. Zhang et al. proposed a time series model based on LSTM as an alternative to computationally expensive physical models, especially in areas where hydrogeological data are difficult to obtain [17].

Accurate, real-time traffic flow prediction, especially of short-term traffic flow, is an important part of intelligent transportation systems. However, due to the stochastic and nonlinear nature of traffic flow, accurately predicting the traffic state is a challenging task. LSTM is able to learn time series with long time dependencies and automatically determine the optimal time lags. Ma et al. found this feature especially desirable for traffic prediction problems, where future traffic conditions are commonly related to previous events over long time spans, and proposed an LSTM-based traffic flow prediction method to capture nonlinear traffic dynamics effectively [18]. Duan et al. explored an LSTM neural network model, which automatically retains historical sequence information in its model structure, for travel time prediction [19]. Chen et al. trained an LSTM model to learn the patterns in traffic condition sequences, using historical traffic conditions obtained from AMAP, a web map service provider in China, to predict future traffic conditions [20].

Object tracking is a fundamental problem in computer vision with a wide range of applications. The goal of a tracking system is to estimate the state sequence of the object from the observation sequence. LSTM has been introduced in object tracking to learn object representations via sequence learning. Li et al. employed LSTM units to directly learn temporally correlated representations of objects in long sequences [21]. Zhou et al. introduced a bidirectional LSTM-based appearance model to learn the spatial contextual dependency [22]. Wang et al. proposed a 3D fish tracking method and a multifish tracking method in which an LSTM network is employed to model the motion process of the fish [23, 24].

In other fields, Chen et al. modeled and predicted China stock returns using LSTM and greatly improved the accuracy of stock return prediction [25]. Sak et al. demonstrated the state-of-the-art performance of LSTM networks on speech recognition tasks compared with RNN and deep neural network (DNN) models [26]. Wu et al. used LSTM to solve the remaining useful life estimation problem and obtained good prediction accuracy [27]. Chherawala et al. presented a handwriting recognition model based on an LSTM network that automatically learns features from the input image in a supervised fashion [28].

In 2014, Koutník et al. introduced CW-RNN which simplifies the RNN architecture, improves the performance of network, and speeds up the network evaluation [29]. Achanta et al. showed that CW-RNN is equivalent to the standard RNN architecture with a time-varying leaky integration [30].

Two end-to-end online obstacle avoidance planners, based on LSTM and CW-RNN, respectively, are presented in this paper. The obstacle avoidance planners take the information obtained by a multibeam forward looking sonar (FLS) as input and directly output control instructions to the motion controller of the UUV. The RNN-based obstacle avoidance planners maintain robust performance even when the effects of measurement noise are considered. Owing to the strong learning ability of RNNs, the planners are also capable of obstacle avoidance in environments that are much more complex than those in the training samples.

2. UUV System Modeling

Obstacle avoidance planning in the vertical plane is usually achieved through depth adjustment, but a depth adjustment strategy often causes large pitch adjustments, which affect the attitude control of the UUV. Therefore, this paper gives priority to obstacle avoidance in the horizontal plane and defines a horizontal 3 degrees-of-freedom (DOF) control model for the UUV, which not only guarantees the safety of UUV collision avoidance planning but also facilitates UUV motion control.

The North East-fixed reference frame and the body-fixed reference frame are shown in Figure 1. The 3 DOF control model of the UUV is described as follows [31]:

$$\dot{\eta} = R(\psi)\,\nu, \qquad M\dot{\nu} + C(\nu)\,\nu + D(\nu)\,\nu = \tau,$$

where $\eta = [x, y, \psi]^{T}$ is the position vector, whose components correspond to the position of the UUV in the North East-fixed reference frame and the heading of the UUV, respectively; $R(\psi)$ is the transformation matrix between the North East-fixed reference frame and the body-fixed reference frame; $\nu = [u, v, r]^{T}$ denotes the velocity vector including the surge, sway, and yaw velocities of the UUV in the body-fixed reference frame; the actuator input is denoted by $\tau$; and $M$, $C(\nu)$, and $D(\nu)$ denote the system inertia matrix, Coriolis-centripetal matrix, and damping matrix, respectively. Specifically,

A constant current is assumed in this paper which is expressed as a vector in body-fixed reference frame. And then the kinematic and dynamic equations of UUV can be described aswhere , , and

Assume there are two propellers distribute in the horizontal plane of UUV. And the force vector is modeled aswhere and denote the speeds of propellers of UUV, respectively, is the distance between the propeller and central axis of UUV, and and denote propeller coefficients.
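As a rough illustration of the kinematic part of this model, the sketch below integrates the horizontal-plane relation between body-fixed velocities (surge, sway, yaw rate) and the North East-fixed position and heading with a forward-Euler step. It is a minimal sketch under stated assumptions: the function names, the step size, and the numerical values are hypothetical, the rotation matrix is the standard planar one, and the current, the propeller model, and the dynamics (inertia, Coriolis, damping) are deliberately left out.

```python
import math


def rotation(psi):
    """Planar rotation taking body-fixed velocities to the North East-fixed frame."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def kinematics_step(eta, nu, dt):
    """One forward-Euler step of eta_dot = R(psi) * nu.

    eta = [north, east, heading in rad]; nu = [surge, sway, yaw rate].
    """
    R = rotation(eta[2])
    eta_dot = [sum(R[i][j] * nu[j] for j in range(3)) for i in range(3)]
    return [eta[i] + dt * eta_dot[i] for i in range(3)]


# Hypothetical example: heading due east, pure surge motion, no turning.
eta = [0.0, 0.0, math.pi / 2]
nu = [2.0, 0.0, 0.0]
for _ in range(10):
    eta = kinematics_step(eta, nu, dt=0.1)
```

With a heading of 90 degrees and pure surge velocity, the vehicle in this sketch moves along the east axis only, which is a quick sanity check on the sign conventions.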

3. Simulation Model of Sonar

The input data of obstacle avoidance planners proposed by this paper are obtained by multibeam forward looking sonar. A 2D simulation model of multibeam FLS based on SeaBat 8125 is established in this section. SeaBat 8125 is a state-of-the-art high-resolution multibeam echosounder [32]. It has a field of view sector, 80 beams with width of , and the maximum scan radius of . To simplify the input information of network, define the distance vector , where is the distance information detected by ray of sonar at time step and if , then set . The precision of sonar is set as . And taking the uncertainty of sonar detection into account, this paper sets the false alarm rate as 10%.
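The construction of the distance vector can be sketched as follows. This is only an illustration under several assumptions that are not taken from the source: the maximum scan radius is a placeholder value, readings beyond the scan radius (or missing echoes) are saturated at that radius, and a false alarm is modeled as a spurious echo at a uniformly random range. Only the number of beams (80) and the 10% false alarm rate come from the text.

```python
import random

NUM_BEAMS = 80            # number of beams, as stated in the text
MAX_RANGE = 120.0         # hypothetical maximum scan radius (placeholder value)
FALSE_ALARM_RATE = 0.10   # false alarm rate, as stated in the text


def distance_vector(raw_ranges, max_range=MAX_RANGE,
                    false_alarm_rate=FALSE_ALARM_RATE, rng=None):
    """Build the planner's distance vector from raw per-beam ranges.

    raw_ranges holds one entry per beam; None means "no echo on this beam".
    """
    rng = rng or random.Random()
    distances = []
    for d in raw_ranges:
        # Assumption: no return, or a return beyond the scan radius,
        # is saturated at the maximum range.
        if d is None or d > max_range:
            d = max_range
        # Assumption: a false alarm replaces the reading with a spurious
        # echo at a uniformly random range.
        if rng.random() < false_alarm_rate:
            d = rng.uniform(0.0, max_range)
        distances.append(d)
    return distances


# Hypothetical scan of open water with simulated false alarms.
rng = random.Random(0)
noisy_scan = distance_vector([None] * NUM_BEAMS, rng=rng)
```

Passing a seeded `random.Random` instance keeps a simulated scan reproducible, which is convenient when comparing planners on the same noisy input.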

4. The Structures of Obstacle Avoidance Planners

4.1. The Structure of CW-RNN

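The forward propagation of a standard RNN is as follows:

$$h_t = f_H\left(W_I x_t + W_H h_{t-1} + b_h\right), \qquad y_t = f_O\left(W_O h_t + b_o\right),$$

where $W_I$ and $W_H$ denote the weight matrices from the input layer and the hidden layer to the hidden layer, respectively; $W_O$ is the weight matrix between the hidden layer and the output layer; $x_t$, $h_t$, and $y_t$ are the input vector, hidden state vector, and output vector at time step $t$, respectively; $b_h$ and $b_o$ are the biases of the hidden layer and output layer, respectively; and $f_H$ and $f_O$ denote the activation functions of the hidden layer and output layer.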
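The recurrence of a standard RNN can be illustrated with a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the names `W_I`, `W_H`, `W_O`, `b_h`, and `b_o` are chosen here for the input-to-hidden, hidden-to-hidden, and hidden-to-output weights and the two biases, the hidden activation is assumed to be tanh, the output activation is assumed to be the identity, and the hidden size and random weights are arbitrary.

```python
import numpy as np


def rnn_step(x_t, h_prev, W_I, W_H, b_h, W_O, b_o):
    """One forward step of a standard RNN."""
    # Hidden state: combine the current input with the previous hidden state.
    h_t = np.tanh(W_I @ x_t + W_H @ h_prev + b_h)
    # Output: linear read-out of the hidden state (assumed identity activation).
    y_t = W_O @ h_t + b_o
    return h_t, y_t


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 81, 16, 2   # hidden size 16 is arbitrary
W_I = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_H = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_O = rng.normal(scale=0.1, size=(n_out, n_hidden))
b_h, b_o = np.zeros(n_hidden), np.zeros(n_out)

h = np.zeros(n_hidden)
for x in rng.random((5, n_in)):      # a short sequence of 5 input vectors
    h, y = rnn_step(x, h, W_I, W_H, b_h, W_O, b_o)
```

The hidden state `h` is carried from one call to the next, which is what lets the network condition its output on earlier inputs in the sequence.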

As shown in Figure 2, in the forward propagation of the CW-RNN the neurons in the hidden layer are grouped into $g$ modules of size $k$. Each module $i$ is assigned an explicit clock period $T_i$. Recurrent connections from module $j$ to module $i$ exist only if $T_i < T_j$, and the state of module $i$ is updated at time step $t$ only if $(t \bmod T_i) = 0$. Long-term memory is stored in the modules with long periods, while local information obtained from the input data is processed by the modules with short periods.

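Therefore, the hidden-to-hidden weight matrix $W_H$ and the input-to-hidden weight matrix $W_I$ are partitioned into $g$ block-rows corresponding to the $g$ modules, and $W_H$ is a block-upper triangular matrix:

$$W_H = \begin{pmatrix} W_{H_1} \\ \vdots \\ W_{H_g} \end{pmatrix}, \qquad W_I = \begin{pmatrix} W_{I_1} \\ \vdots \\ W_{I_g} \end{pmatrix},$$

where each block-row $W_{H_i}$ is partitioned into block-columns $\{0_1, \ldots, 0_{i-1}, W_{H_{i,i}}, \ldots, W_{H_{i,g}}\}$ and, at time step $t$, only the block-rows of the active modules are used:

$$W_{H_i} = \begin{cases} W_{H_i}, & (t \bmod T_i) = 0, \\ 0, & \text{otherwise}, \end{cases}$$

with $T_i$ denoting the clock period of module $i$.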
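The clocked update can be sketched in NumPy as follows. This is a hedged illustration rather than the authors' code: modules are assumed to be ordered by increasing clock period, the periods (1, 2, 4, 8), the module size, the tanh activation, and all names are assumptions made for the example, and inactive modules simply keep their previous state.

```python
import numpy as np


def block_upper_mask(g, k):
    """Mask that keeps block (i, j) of the recurrent matrix only when j >= i."""
    mask = np.zeros((g * k, g * k))
    for i in range(g):
        mask[i * k:(i + 1) * k, i * k:] = 1.0
    return mask


def cwrnn_step(t, x_t, h_prev, W_I, W_H, b_h, periods, k):
    """One CW-RNN hidden-state update; modules are ordered by increasing period."""
    g = len(periods)
    # A module only receives recurrent input from itself and slower modules.
    W_H_masked = W_H * block_upper_mask(g, k)
    candidate = np.tanh(W_I @ x_t + W_H_masked @ h_prev + b_h)
    # Module i is updated only when t is a multiple of its period T_i.
    active = np.repeat([t % T == 0 for T in periods], k)
    return np.where(active, candidate, h_prev)


periods, k = [1, 2, 4, 8], 4          # assumed clock periods and module size
n_in, n_hidden = 81, len(periods) * k
rng = np.random.default_rng(0)
W_I = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_H = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b_h = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for t, x in enumerate(rng.random((8, n_in)), start=1):
    h = cwrnn_step(t, x, h, W_I, W_H, b_h, periods, k)
```

For clarity the sketch computes the candidate state for every module and then discards it for inactive ones; an efficient implementation would skip the inactive block-rows entirely, which is where the reduced evaluation cost of the CW-RNN comes from.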

4.2. The Structure of LSTM

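In LSTM, memory blocks replace the hidden units of the RNN. As shown in Figure 3, such a memory block consists of a cell, an input gate, an output gate, and a forget gate. The current state of the hidden layer is stored in the cell, and the three gate units control the input, output, and forgetting of the cell, respectively. The forward propagation of LSTM is as follows:

$$\begin{aligned}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right), \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right), \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right), \\
h_t &= o_t \odot \tanh\left(c_t\right),
\end{aligned}$$

where $i_t$, $f_t$, $c_t$, $o_t$, and $h_t$ are the outputs of the input gate, forget gate, cell, output gate, and memory block at time step $t$, respectively; $x_t$ is the input vector of the memory block at time step $t$; $h_{t-1}$ is the output vector of the memory block at time step $t-1$; $W_{xi}$, $W_{xf}$, $W_{xc}$, and $W_{xo}$ are the weight matrices from the input vector to the input gate, forget gate, cell, and output gate, respectively; $W_{hi}$, $W_{hf}$, $W_{hc}$, and $W_{ho}$ are the weight matrices from the output of the memory block at the previous time step to the input gate, forget gate, cell, and output gate, respectively; $b_i$, $b_f$, $b_c$, and $b_o$ are the biases of the input gate, forget gate, cell, and output gate, respectively; $\sigma$ is the activation function of the gate units, which is set as the logistic sigmoid function in this paper; and $\odot$ represents the element-wise product.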
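A single forward step of such a memory block can be sketched in NumPy as below. This is an illustrative sketch under assumptions: it uses the common LSTM form without peephole connections, tanh for the cell input and block output, a dict-of-matrices layout keyed by gate name, and an arbitrary hidden size. None of these implementation details are confirmed by the source beyond the sigmoid gates and the element-wise products.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b):
    """One forward step of an LSTM memory block (no peephole connections).

    W_x, W_h and b are dicts keyed by "i", "f", "c", "o".
    """
    i_t = sigmoid(W_x["i"] @ x_t + W_h["i"] @ h_prev + b["i"])   # input gate
    f_t = sigmoid(W_x["f"] @ x_t + W_h["f"] @ h_prev + b["f"])   # forget gate
    g_t = np.tanh(W_x["c"] @ x_t + W_h["c"] @ h_prev + b["c"])   # candidate cell input
    c_t = f_t * c_prev + i_t * g_t                                # new cell state
    o_t = sigmoid(W_x["o"] @ x_t + W_h["o"] @ h_prev + b["o"])   # output gate
    h_t = o_t * np.tanh(c_t)                                      # memory block output
    return h_t, c_t


n_in, n_hidden = 81, 8      # hidden size 8 is arbitrary
rng = np.random.default_rng(0)
W_x = {g: rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in "ifco"}
W_h = {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in "ifco"}
b = {g: np.zeros(n_hidden) for g in "ifco"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x in rng.random((5, n_in)):
    h, c = lstm_step(x, h, c, W_x, W_h, b)
```

The cell state `c` is updated additively (old content scaled by the forget gate plus gated new content), which is the mechanism that lets gradients survive over long sequences.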

5. Construction of UUV Autonomous Obstacle Avoidance Planning Learning System

The framework of the UUV autonomous obstacle avoidance planning learning system is shown in Figure 4. First, the RNN-based obstacle avoidance planners are trained offline. The fully trained planners are then used to perform obstacle avoidance planning for the UUV in real time, according to the environmental information obtained by the FLS and the vehicle information obtained by the motion and attitude sensors. The motion controller controls the UUV based on the control commands output by the online obstacle avoidance planners.

The procedure of the RNN-based online obstacle avoidance planning system is as follows.

Step 1. Initialize the start position and target position of the UUV, and deploy the UUV at the start position.

Step 2. Acquire data from the sonar, motion, and attitude sensors.

Step 3. The online RNN obstacle avoidance planner outputs the desired yaw and velocity of the UUV according to the sensor data.

Step 4. The UUV adjusts its heading and velocity according to the instruction output by the online RNN obstacle avoidance planner.

Step 5. Determine whether the UUV has reached the target position. If so, stop the obstacle avoidance planning algorithm; otherwise, return to Step 2.
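The five steps above can be sketched as a simple control loop. This is a minimal sketch with stated assumptions: `read_sensors` and `planner` are hypothetical callables standing in for the sensor suite and the trained RNN planner, the vehicle follows commands instantly (no dynamics or motion controller), and the arrival radius, time step, and step limit are arbitrary illustrative values.

```python
import math


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def run_mission(start, target, read_sensors, planner,
                dt=1.0, arrival_radius=5.0, max_steps=2000):
    # Step 1: initialise the UUV at the start position.
    north, east, heading = start[0], start[1], 0.0
    for step in range(1, max_steps + 1):
        # Step 2: acquire sensor data.
        observation = read_sensors(north, east, heading, target)
        # Step 3: the planner outputs a heading adjustment and a speed.
        d_heading, speed = planner(observation)
        # Step 4: apply the command (idealised: no vehicle dynamics).
        heading = wrap_angle(heading + d_heading)
        north += speed * math.cos(heading) * dt
        east += speed * math.sin(heading) * dt
        # Step 5: stop on arrival, otherwise loop back to Step 2.
        if math.hypot(target[0] - north, target[1] - east) <= arrival_radius:
            return True, step
    return False, max_steps


def bearing_only_sensors(north, east, heading, target):
    """Stand-in sensor model: only the bearing error to the target."""
    bearing = math.atan2(target[1] - east, target[0] - north)
    return wrap_angle(bearing - heading)


def go_to_target_planner(bearing_error):
    """Stand-in for a trained RNN planner: turn to the target, constant speed."""
    return bearing_error, 2.0


reached, steps = run_mission((0.0, 0.0), (100.0, 50.0),
                             bearing_only_sensors, go_to_target_planner)
```

The step limit is a practical safeguard added for the sketch: it ends the loop in cases such as disorientation, where the vehicle never reaches the target.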

6. Data Processing and Network Training

The input sequence of obstacle avoidance planners at time step t consists of distance vector and the angle between UUV and target in North East-fixed reference frame . The output vector of obstacle avoidance planners at time step t is constituted by the adjustment of heading and the velocity of UUV. The dataset consists of 120,000 training samples and 4810 test samples. In the dataset, the start point, target point, and obstacles are generated randomly. And Min–Max normalization is used to preprocess input and output data.
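Min–Max normalization rescales each feature to the interval [0, 1] using the minimum and maximum observed for that feature. The sketch below shows one straightforward way to do this per column. The function names and the sample values are hypothetical, and mapping a constant column to 0 is a choice made here to avoid division by zero.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of a list of equal-length rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_scale(rows, lows, highs):
    """Scale every column to [0, 1]; constant columns are mapped to 0."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical samples: [sonar distance, angle to target in rad].
samples = [[10.0, -0.5], [60.0, 0.0], [110.0, 0.5]]
lows, highs = fit_min_max(samples)
scaled = min_max_scale(samples, lows, highs)
```

In practice the minima and maxima would be computed on the training set only and reused unchanged for the test set and for online planning, so that all inputs are scaled consistently.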

The only difference between the two types of obstacle avoidance planners is the structure of the hidden layer, which is composed of CW-RNN or LSTM units, respectively. This setting makes it convenient to compare the performance of CW-RNN and LSTM on UUV obstacle avoidance. Both types of planner consist of an input layer, a hidden layer, a middle layer, and an output layer. There are 81 neurons in the input layer, 23 neurons in the middle layer, and 2 neurons in the output layer. To reduce overfitting, dropout with a keep probability of 0.6 is used during training. The loss function is the mean squared error (MSE); the weights are updated by backpropagation through time with minibatch gradient descent to minimize the MSE, with the batch size set to 10000. The optimizer is Adam, and the maximum number of iterations is 20000. All networks are trained on a Core i3 CPU (2.00 GHz × 4).
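Two of the training ingredients named above, dropout with a keep probability of 0.6 and the MSE loss, can be illustrated in a few lines of NumPy. This is a sketch under an assumption: it uses the common "inverted dropout" variant, in which kept activations are rescaled by 1/keep_prob, and the source does not say which variant was used. All names and sample values are hypothetical.

```python
import numpy as np


def dropout(h, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob and rescale."""
    mask = rng.random(h.shape) < keep_prob
    return h * mask / keep_prob


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


rng = np.random.default_rng(0)
hidden = np.ones(1000)
dropped = dropout(hidden, keep_prob=0.6, rng=rng)   # training-time only
loss = mse(np.array([1.0, 2.0]), np.array([3.0, 2.0]))
```

Dropout is applied only during training; at test time and during online planning the full network is used without masking.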

The parameters of the four networks are shown in Table 1, and the MSE of the four networks on the test dataset is shown in Figure 5. Table 1 and Figure 5 show that, for the same network type, the offline training time increases and convergence slows down as the number of parameters rises, but the best MSE decreases. In the early stage of training, the network with fewer parameters converges faster, while in the later stage the opposite happens. Compared with CW-RNN, LSTM converges faster and obtains better results.

7. Results and Analysis

In this section, a statistical experiment and several illustrative examples are presented to validate the obstacle avoidance algorithms. The size of the map is set to ; the velocity of UUV is set as a constant . Taking environmental factors into consideration, a 10% false alarm rate is added to the sonar data in the simulation test cases.

7.1. Statistical Experiment

To verify the obstacle avoidance planning performance of each network under different environmental disturbances, a statistical experiment is designed. The experiment records the performance of the different networks on 100 random maps at false alarm rates of 5%, 10%, and 15%, respectively. The experimental results are shown in Table 2. Table 2 shows that, for the same network type, the more parameters the network has, the higher the planning success rate and the lower the path cost, but the more time the algorithm takes. Compared with CW-RNN, LSTM has advantages in path cost, success rate, and stability. The reasons for the planning failures of each network are shown in Table 3. Among them, ‘nonarrival’ means that the UUV stops near the target point rather than at it, and ‘disorientation’ means that the UUV drifts through the map after dodging obstacles rather than moving toward the target. Disorientation occurs when the obstacle avoidance planner cannot extract the target information. It can be seen from Table 3 that an increase in the false alarm rate raises the probability of collision and disorientation in the paths planned by CW-RNN but has little effect on LSTM. This indicates that LSTM is superior to CW-RNN in the processing of long-term memory. As shown in Table 3, the CW-RNNs also have a higher probability of both collision and disorientation than the LSTMs, which indicates that LSTM has a better ability to learn and extract detailed features than CW-RNN.

7.2. Simulation Test Case 1

To further analyze the learning ability of the proposed obstacle avoidance algorithms, this simulation test case tests the obstacle avoidance performance of the four structures in two maps with the same complexity as the maps in the training set. The tracks, yaw, and propeller speed of the UUV are shown in Figures 6–9, respectively. As the simulation results show, in maps with the same complexity as the training environment, the four proposed obstacle avoidance algorithms can quickly generate collision-free paths, and the planning results satisfy the UUV kinematics. In this simulation test case, all four obstacle avoidance algorithms show strong learning ability. Compared with the other structures, the path planned by LSTM45 has fewer oscillations.

7.3. Simulation Test Case 2

Assume that the start point is (156, 39) and the target position of the UUV is (630, 1070). Figure 10 shows the tracks of the UUV planned by the four obstacle avoidance algorithms. The simulation results show that all methods effectively control the UUV to avoid the obstacles and reach the target position. All the RNN-based obstacle avoidance planners have learned to adjust the heading of the UUV so that it navigates toward the target position quickly after avoiding obstacles. As Figures 11 and 12 show, the yaw and propeller speed of the UUV planned by the RNN-based obstacle avoidance planners conform to actual practice. In this map with a discrete distribution of obstacles, the four obstacle avoidance algorithms can generate collision-free paths even though the complexity of the map is increased. The simulation results indicate that all four algorithms have a degree of generalization ability and adaptive capability.

7.4. Simulation Test Case 3

To further analyze the ability of the proposed obstacle avoidance planners, this simulation test case adopts a map that is much more complex than those included in the training and test datasets. The tracks, yaw, and propeller speed of the UUV are shown in Figures 13, 14, and 15, respectively. As the simulation results show, in the complex environment with a continuous distribution of obstacles, the path planned by CW-RNN96 makes the UUV avoid obstacles in a roaming mode. This is because CW-RNN96 cannot extract the target point information, which is more detailed than the obstacle information. LSTM18, CW-RNN180, and LSTM45 are still capable of obstacle avoidance and exhibit satisfactory learning and generalization abilities in this problem. Although LSTM18 has fewer parameters than CW-RNN96, it performs better in the complex environment.

7.5. Simulation Test Case 4

In order to test the generalization and exploration ability of the methods, this simulation test case adopts a maze map of continuous obstacles, shown in Figure 16. In the training set, the target is always set on the east side of the map, and the UUV always moves on the west side of the target point, which means the angle between UUV and target in North East-fixed reference frame is . In this map, the target point is in the middle of the map, and the UUV must move around the target to reach it, which means . The simulation results are shown in Figures 16–18. It can be seen from the simulation results that all the methods performed well in the early stage of planning (). As UUV moves, the range of changes to , and a collision occurs in the path planned by CW-RNN96, disorientation occurs in the path planned by CW-RNN180, and nonarrival occurs in the path planned by LSTM18. Only LSTM45 is able to generate a path in this maze environment, and it shows excellent performance. The simulation results show the strong generalization and exploration ability of LSTM45.

7.6. Simulation Test Case 5

The results of the statistical experiment and simulation test cases 2–4 show that, compared with the other three methods, LSTM45 is the best method for UUV obstacle avoidance planning. In this test case, a series of simulations in dynamic environments is used to further test the obstacle avoidance ability of LSTM45. Figures 19 and 20 show the simulation results of LSTM45 in several dynamic environments in which the obstacles have different motions. Figures 21 and 22 show the simulation results of LSTM45 in complex environments with many static and moving obstacles. The dynamic obstacles are assumed to always travel in straight lines with constant velocity. The direction of motion of each obstacle is indicated by the arrow in the obstacle. The velocities of the obstacles are set to 8 kn in Figure 19(a) and 4 kn in the other cases. The simulation results show that LSTM45 drives the UUV toward the target until a collision threat is found. After avoiding the obstacle, the UUV is planned to continue moving toward the target. Although the training set does not contain any dynamic obstacles, LSTM45 still finds a strategy for avoiding them.

It can be seen from the above experiments that, when the number of parameters is similar, CW-RNN and LSTM show similar performance in terms of training time, best loss, and time consumption, but LSTM shows better path cost, success rate, generalization ability, and robustness than CW-RNN. It is worth noting that LSTM18 and CW-RNN180 have comparable learning and generalization abilities, while the number of parameters, training time, and time consumption of LSTM18 are about 1/3, 1/2, and 1/4 of those of CW-RNN180, respectively. LSTM18 also has significant advantages in learning capability, generalization ability, and robustness over CW-RNN96, which has a similar number of parameters. Among the four algorithms, LSTM45 has the best learning ability, generalization and exploration ability, and robustness. It is able to solve the obstacle avoidance problem for the UUV in dynamic and even complex dynamic environments after being trained only in simple, static environments.

8. Conclusion

Inspired by the state-of-the-art performance of CW-RNN and LSTM on many sequence prediction tasks, this paper presented two types of obstacle avoidance algorithms, based on CW-RNN and LSTM, respectively, and compared the performance of CW-RNN and LSTM on the obstacle avoidance task. The proposed algorithms achieved robust performance on the online obstacle avoidance problem of the UUV in unknown environments and maintained it even when the effects of measurement noise were considered. Owing to their strong learning and generalization abilities, the algorithms are capable of obstacle avoidance in environments that are much more complex than those in the training samples. When the number of parameters is similar, CW-RNN and LSTM show similar performance in terms of training time, best loss, and time consumption, but LSTM performs better in terms of path cost, obstacle avoidance planning success rate, generalization ability, and robustness. Among the four proposed methods, LSTM45 obtained the best performance in terms of learning ability, generalization and exploration ability, and robustness. The simulations in dynamic environments further verified its excellent obstacle avoidance planning ability.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research work is supported by China Natural Science Foundation (no. 61633008) and Natural Science Foundation of Heilongjiang Province under Grant F2015035.