Abstract

The blockage or failure of drainage holes endangers the stability of slopes and the traffic safety of highway tunnels. To explore an intelligent method for detecting drainage hole blockage, this paper studies an algorithm for the automatic classification of drainage hole blockage degree based on convolutional neural network (CNN) transfer learning. The model transfer method is adopted: drainage hole image samples are used to retrain a pretrained network so that it can classify new images. Experiments are performed on the collected drainage hole images, and the accuracy of different network models is compared, with ResNet-18 performing best. The performance of ResNet-18 is then compared under different transfer strategies and parameters. The results show that the identification effect on these samples is best when the SGDM gradient optimisation algorithm is used with a learning rate of 0.0001. With layers 1–34 of the ResNet-18 model frozen, the validation accuracy reaches 91.7% and the test accuracy 90.0%, realising effective classification of drainage hole blockage into different degrees. Furthermore, the identification accuracy is expected to improve as the sample set is expanded in the future. The automatic classification system for the blockage degree of drainage holes reduces the cost of manual detection, guides the maintenance of drainage pipes, and helps improve the safety of highway tunnels and slopes.

1. Introduction

With the continued extension of tunnels into mountainous and offshore deepwater areas, a series of complex diseases have emerged in tunnels both under construction and in operation [1]. Severe crystal blockage of tunnel drainage pipes commonly occurs during tunnel construction and operation. Once the drainage pipes are clogged, their depressurisation capacity is greatly weakened, so tunnel groundwater cannot be drained and the external water pressure rises suddenly [2]. This causes structural water leakage, lining cracking and block damage, inverted arch uplift, and other diseases [3–5], all of which seriously affect the safe operation of the tunnel structure, with consequences ranging from road blockage to casualties. To avoid tunnel disasters caused by the blockage of drainage holes, the drainage holes must be inspected rapidly, accurately, and regularly. Currently, tunnel drainage hole blockage detection relies primarily on on-site visual inspection and recording by inspectors, followed by repair by maintenance personnel. The inspection results are affected by the subjective judgement of the inspectors, the inspection tasks in the tunnel are challenging, and the workload is heavy. Because false detections and missed detections often occur, an intelligent detection method is urgently needed.

Computer vision-based methods can detect structural damage intuitively and effectively, and two image processing methods, image binarisation [6] and edge detection [7], have been widely used in vision-based structural health monitoring. Ho et al. [8] proposed a method for dynamic displacement measurement of infrastructure based on multipoint vision. Ye et al. [9, 10] proposed a long-distance noncontact distributed structural displacement monitoring method for large-span bridges based on computer vision, which also builds on digital image processing correlation theory and a multipoint template matching algorithm, and realised kilometre-scale on-site structural displacement monitoring. However, these methods still suffer from limitations due to the effects of illumination and distortion-induced noise [11]. With the rapid advancement of artificial intelligence technology, structural damage detection methods based on deep learning have developed rapidly in the field of civil engineering [12, 13]. Yeum et al. [14] proposed a vision-based bridge crack detection technology through automatic processing of object detection and grouping. Makantasis et al. [15] achieved tunnel crack detection by employing a shallow convolutional neural network (CNN) with two convolutional layers and one fully connected layer. Cha et al. [16] used a deep CNN structure to detect concrete cracks. Xu et al. [17] established a framework for surface crack identification of steel structures based on the restricted Boltzmann machine. Chen [18] proposed a CNN crack detection method based on naive Bayesian data fusion to improve the overall performance and robustness of the recognition algorithm. Bao et al. [19] proposed an abnormal data diagnosis method for structural health monitoring based on computer vision and deep networks. Konovalenk et al. [20] evaluated the application of residual neural networks to the identification of three types of industrial steel defects and developed a model for identifying and classifying rolled steel surface defects based on a deep residual neural network. Xue et al. [21] proposed an improved mask-RCNN (region-based CNN) model to realise automatic detection of tunnel lining leakage. For disease detection in drainage pipeline systems, CNN-based defect classification systems for sewer pipelines using closed-circuit television (CCTV) inspection videos have been proposed [22–24]. However, no research has been conducted on an automatic method for detecting the degree of blockage of slope drainage holes.

In this paper, deep learning is applied to detect blockages in drainage holes. As shown in Figure 1, many drainage hole disease samples were collected and taken as input. By comparing different pretrained CNNs, the CNN model most suitable for these samples was selected. The performance of this model was then compared under different transfer strategies (freezing different layers) and different network hyperparameters, and the CNN model most suitable for the experimental samples was finally obtained. This paper proposes, for the first time, an automatic classification method for the blockage degree of drainage holes. Combining transfer learning with a CNN overcomes the shortcomings of training a traditional CNN, such as insufficient samples and long training time, and realises automatic classification of the blockage degree of drainage holes while avoiding the defects of manual detection, such as high cost, long duration, false detection, and missed detection.

Section 2 introduces the collection of experimental samples and image preprocessing, as well as pretraining models and transfer learning methods. Section 3.1 shows the results of transfer learning for different pretraining models. Section 3.2 compares ResNet-18 under different transfer strategies. Section 3.3 compares the effects of different network hyperparameters on the prediction performance of the model. Section 3.4 presents the visualisation results of the convolutional process of a CNN. Section 4 shows the conclusion and prospects of this paper.

2. Experimental Data and Network Model

In this paper, the drainage hole blockage detection model retrains a pretrained CNN to classify a new set of images by using a model transfer learning method. Compared with training the network from scratch with randomly initialised weights, fine-tuning the network through transfer learning is faster and simpler, owing to the pretrained network's excellent feature extraction and classification capabilities. The learned features can be quickly transferred to the new task with a smaller number of training images, resulting in efficient and accurate classification and detection of drainage hole blockage diseases. Figure 2 shows the detection process of the automatic detection model of drainage hole blockage.

2.1. Experimental Data Preprocessing

The original samples of this experiment were personally shot in the tunnel and slope because there are no public samples on the blockage of drainage holes on the Internet. The proportion of picture target and background is quite different, owing to the influence of shooting angle, drainage hole position and other factors. As a result, the image must be preprocessed. First, the collected image samples are appropriately cropped, and the size is uniformly scaled to adapt to the input size of the network (224 × 224). Additionally, in the process of training the network, image enhancement techniques of translation and rotation are used to enhance the robustness of the network and improve the generalisation ability of the network.
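The cropping and scaling step can be sketched as follows. This is a minimal illustration, not the procedure used in the experiment: the paper only states that images were "appropriately cropped" and scaled to 224 × 224, so the centred square crop and the helper names `center_square_crop_box` and `scale_to_input` are assumptions.

```python
def center_square_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centred square crop.

    Assumption: a centred square crop; the paper does not say how the
    crop region was chosen.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def scale_to_input(side, target=224):
    """Scale factor that maps a square of `side` pixels to the network input."""
    return target / side


# Example with a hypothetical 4000 x 3000 photograph.
box = center_square_crop_box(4000, 3000)
print(box)  # (500, 0, 3500, 3000)
print(scale_to_input(box[2] - box[0]))
```

Cropping to a square before scaling keeps the aspect ratio of the drainage hole unchanged, which is one reason a crop is usually preferred to direct stretching.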

All pictures were manually classified, and qualitative analysis was carried out according to the degree of blockage. Furthermore, the blockage of drainage holes was divided into four categories, namely, slight blockage, moderate blockage, heavy blockage, and no blockage. Slight blockage occurs when the blockage crystallisation of drainage holes is less than about 1/4 of the drainage hole area, moderate blockage occurs when the blockage crystallisation is greater than about 1/4 but less than 3/4, and heavy blockage occurs when the blockage crystallisation is greater than about 3/4. Figure 3 depicts the exact classification.
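The labelling rule above can be written as a small function. This is only a sketch: the images in this experiment were labelled manually and qualitatively, so the function `blockage_category`, the idea of a measured blocked-area fraction, and the handling of values exactly at 1/4 and 3/4 are assumptions made for illustration.

```python
def blockage_category(blocked_fraction):
    """Map the blocked share of the drainage hole area to one of four classes.

    Assumption: boundary values (exactly 0.25 or 0.75) fall into the more
    severe class; the paper gives only approximate thresholds.
    """
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must lie in [0, 1]")
    if blocked_fraction == 0.0:
        return "no blockage"
    if blocked_fraction < 0.25:
        return "slight blockage"
    if blocked_fraction < 0.75:
        return "moderate blockage"
    return "heavy blockage"


for fraction in (0.0, 0.1, 0.5, 0.9):
    print(fraction, blockage_category(fraction))
```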

A total of 943 samples were used in this experiment, 101 of which were taken as the test set. The remaining 842 samples comprised 299 with slight blockage, 230 with moderate blockage, 71 with heavy blockage, and 242 with no blockage. Of these, approximately 90% (758) were randomly selected as the training set and 10% (84) as the validation set. Table 1 presents the specific sample allocation.
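The split arithmetic can be checked with a short sketch. It assumes that 10% of each class is drawn for validation, which is consistent with the per-class validation counts reported in Section 3.1 (30, 23, 7, and 24); whether the original tool stratified the split in this way is an assumption, and the function name `stratified_split` is hypothetical.

```python
import random

# Class counts of the 842 non-test samples, as reported in the text.
CLASS_COUNTS = {"slight": 299, "moderate": 230, "heavy": 71, "none": 242}


def stratified_split(class_counts, val_fraction=0.10, seed=0):
    """Randomly assign sample indices of each class to training or validation."""
    rng = random.Random(seed)
    train, val = {}, {}
    for label, count in class_counts.items():
        indices = list(range(count))
        rng.shuffle(indices)
        n_val = round(count * val_fraction)
        val[label] = indices[:n_val]
        train[label] = indices[n_val:]
    return train, val


train, val = stratified_split(CLASS_COUNTS)
val_sizes = {label: len(idx) for label, idx in val.items()}
n_train = sum(len(idx) for idx in train.values())
n_val = sum(val_sizes.values())
print(val_sizes)       # {'slight': 30, 'moderate': 23, 'heavy': 7, 'none': 24}
print(n_train, n_val)  # 758 84
```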

2.2. Model Transfer Learning Method

The CNN is a feedforward neural network trained with the back-propagation (BP) algorithm [25], which was proposed in 1986. Compared with ordinary neural networks, it has the advantages of local connection and weight sharing and is widely applied in image recognition and target classification. In this paper, eight classical CNN models (AlexNet [26], SqueezeNet [27], VGG-16 [28], VGG-19, ResNet-18 [29], ResNet-50, ResNet-101, and GoogLeNet [30]) were selected as pretrained network models. All of these CNN models are composed of convolutional layers, activation layers, pooling layers, and fully connected layers. However, simply stacking more layers makes it difficult to improve the model's learning ability and causes model degradation and gradient explosion or vanishing [31]. Unlike other networks, ResNet uses shortcut connections to solve the problem of model degradation in deep neural networks, as shown in Figure 4. Adding an identity mapping, whose output equals its input, after the shallow layers allows the deeper model to fall back to the behaviour of the shallow network, so that added depth does not reduce the accuracy of the model.
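The shortcut connection can be illustrated with a minimal sketch on plain lists. It is not the ResNet implementation used in the experiment: `residual_block` and `zero_branch` are hypothetical names, and the convolutional branch is replaced by an arbitrary function for clarity.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vector]


def residual_block(x, residual_fn):
    """Compute relu(F(x) + x), where the added x is the identity shortcut."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("the shortcut requires F(x) and x to have equal size")
    return relu([a + b for a, b in zip(fx, x)])


def zero_branch(x):
    """A residual branch that has learned nothing (all zeros)."""
    return [0.0] * len(x)


# With a zero branch, the block passes a non-negative input through unchanged.
print(residual_block([0.5, 2.0, 1.0], zero_branch))  # [0.5, 2.0, 1.0]
```

Because the block reduces to the identity when the residual branch outputs zero, extra blocks can in principle leave the shallower network's output unchanged, which is the intuition behind avoiding degradation.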

The pretrained network has been trained on over one million images and can classify images into 1000 object categories, including keyboard, mouse, pencil, and many animals, as well as images related to civil engineering. As a result, the network has learned rich feature representations for a wide range of images. Since the last three layers of the pretrained network are configured for 1000 classes, these three layers must be adapted to the new classification problem, as shown in Figure 5. All but the last three layers are extracted from the pretrained network, and the last three layers are replaced with a new fully connected layer, a softmax layer, and a classification layer. The new fully connected layer is set to the same size as the number of classes in the new data, which is four for these experimental samples. In most networks, the last layer with learnable weights is a fully connected layer. In SqueezeNet, the last learnable layer is a 1 × 1 convolutional layer, which is replaced by a new convolutional layer with the same number of filters as the number of classes.
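The role of the new softmax and classification layers can be sketched in a few lines. The four outputs of the new fully connected layer are turned into probabilities, and the most probable class is reported. The class order in `CLASSES`, the function names, and the example scores are assumptions made for illustration only.

```python
import math

# Assumed class order; the order used in the experiment is not stated.
CLASSES = ["slight blockage", "moderate blockage", "heavy blockage", "no blockage"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the maximum for stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical output of the new four-unit fully connected layer.
label, confidence = classify([0.1, 2.3, -1.0, 0.4])
print(label, round(confidence, 3))
```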

3. Analysis of Results

The experimental platforms used in this article are as follows: Microsoft Windows 10 (operating system), i5-1035G1 central processing unit (CPU), NVIDIA GeForce MX350 8 GB graphics processing unit (GPU), 512 GB solid-state drive, and Matlab R2020a.

3.1. Network Model Comparison

All network models use the same network parameters with the specific parameters presented in Table 2. All network layers are trained, and the experimental results are comprehensively evaluated with four evaluation metrics, namely, validation accuracy, precision, recall, and F1-score.
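The four evaluation metrics can be computed from a confusion matrix, as in the sketch below. The matrix values are invented for illustration and are not results from this paper; the function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(confusion):
    """Compute accuracy and per-class (precision, recall, F1).

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return correct / total, metrics


# Made-up 4-class confusion matrix (rows: true class, columns: predicted class).
example = [
    [8, 2, 0, 0],
    [1, 7, 2, 0],
    [0, 1, 4, 0],
    [1, 0, 0, 9],
]
accuracy, metrics = per_class_metrics(example)
print(accuracy)  # 0.8
for precision, recall, f1 in metrics:
    print(round(precision, 3), round(recall, 3), round(f1, 3))
```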

This paper selects the validation set to evaluate the network, which contains 84 images, 30 of which are slight blockage, 23 are moderate blockage, 7 are heavy blockage, and 24 are no blockage. Different network models are compared in terms of validation accuracy, precision, recall, F1-score, test accuracy, and training time. Table 3 presents the performance of each network model. Figure 6 shows the training and validation accuracy, as well as the loss curves, over the training iterations. As can be observed, SqueezeNet has the shortest training time but also the lowest validation accuracy. The accuracy of the VGG-16/19 networks is higher, but training takes a long time, and the test accuracy of VGG-19 is lower than its validation accuracy, indicating overfitting. As the depth of the ResNet series networks increases, so does the training time, but the accuracy decreases. In a comprehensive comparison, ResNet-18 has the highest validation accuracy and the highest precision, recall, and F1-score for each class. Its validation accuracy and test accuracy are 90.5% and 91.0%, respectively. Although its accuracy is slightly lower than that of the VGG series networks, its training time is about 1/50 of theirs. As a result, ResNet-18 performs best in this experiment for the drainage hole classification task.

3.2. Comparison of Transfer Strategies

In Section 3.1, the optimal network model, ResNet-18, was obtained using transfer learning. This section compares the performance of the model under different transfer strategies. The weights of different network layers are frozen by setting their learning rates to zero, so the parameters of the frozen layers are not updated during training. Because the gradients of frozen layers do not need to be calculated, freezing the weights of multiple initial layers can significantly shorten the network training time. If the new sample set is small, freezing the shallower network layers also prevents those layers from overfitting the new samples. To test the advantages of transfer learning, the ResNet-18 network without pretraining was also used in this experiment.
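Freezing by a zero learning rate can be made concrete with a plain gradient-descent update. The sketch below uses one scalar weight per layer and a per-layer learning-rate factor; the function name `sgd_step` and all numerical values are assumptions for illustration.

```python
def sgd_step(weights, grads, base_lr, lr_factors):
    """One gradient step where each layer's rate is base_lr * lr_factor.

    A factor of 0 freezes the layer: its weight is left unchanged.
    """
    return [w - base_lr * f * g for w, g, f in zip(weights, grads, lr_factors)]


weights = [1.0, 2.0, 3.0]     # one illustrative weight per layer
grads = [0.5, 0.5, 0.5]       # illustrative gradients
lr_factors = [0.0, 0.0, 1.0]  # first two layers frozen, last layer trainable

updated = sgd_step(weights, grads, base_lr=0.1, lr_factors=lr_factors)
print(updated)  # the first two weights are unchanged; only the last one moves
```

Since the update of a frozen layer is always zero, its gradient never has to be computed, which is where the reduction in training time comes from.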

ResNet-18 comprises 71 layers, including 8 residual boxes. In this experiment, the layers from the first layer up to the layer after each residual box are frozen in turn, that is, layers 1–11 (after the first residual box), layers 1–18 (after the second residual box), layers 1–27 (after the third residual box), layers 1–34 (after the fourth residual box), layers 1–43 (after the fifth residual box), layers 1–50 (after the sixth residual box), layers 1–59 (after the seventh residual box), and layers 1–66 (after the eighth residual box). After the learning rate of the frozen layers is set to zero, all layers are reconnected in the original order, so the new layer graph contains the same layers as before and differs only in the learning rate of the frozen layers.
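The eight freezing strategies can be enumerated as per-layer learning-rate factors. This is a sketch under the layer counts given above; the function name `lr_factors_for` is hypothetical, and the count of unfrozen layers includes layers without learnable weights, such as activation and pooling layers.

```python
TOTAL_LAYERS = 71
# Last frozen layer for each strategy, as listed in the text.
FREEZE_CUTS = [11, 18, 27, 34, 43, 50, 59, 66]


def lr_factors_for(freeze_upto, total_layers=TOTAL_LAYERS):
    """Return one factor per layer: 0.0 for layers 1..freeze_upto, 1.0 otherwise."""
    if not 0 <= freeze_upto <= total_layers:
        raise ValueError("freeze_upto must lie between 0 and total_layers")
    return [0.0] * freeze_upto + [1.0] * (total_layers - freeze_upto)


for cut in FREEZE_CUTS:
    factors = lr_factors_for(cut)
    unfrozen = sum(1 for f in factors if f > 0)
    print(f"layers 1-{cut} frozen, {unfrozen} of {TOTAL_LAYERS} layers not frozen")
```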

The network models of different transfer strategies use the same network parameters. Table 2 presents the specific parameters. The recognition effects of the ResNet-18 using different transfer strategies are compared by freezing the networks of different layers. Table 4 presents the specific experimental comparison.

The comparison shows that as the network freezing depth increases, the training time becomes progressively shorter because fewer parameters must be trained. When layers 1–11 are frozen, the validation accuracy and test accuracy of the network decrease noticeably. When layers 1–18 are frozen, the validation accuracy is higher than that obtained by training all layers, but the test accuracy is lower. The validation accuracy and test accuracy are significantly lower when layers 1–27 are frozen, whereas the validation accuracy and evaluation indexes of the network are the highest when layers 1–34 are frozen, with a validation accuracy and test accuracy of 91.7% and 90.0%, respectively. Beyond that point, the accuracy of the network model gradually decreases as the number of frozen layers increases. Furthermore, the network without pretraining is compared. The results show that, for the same number of training epochs, the training time of the network without pretraining is comparable to that of the pretrained network with all layers trained. However, its accuracy is far lower than that of the pretrained network.

3.3. Comparison of Model Parameters

In the previous section, different transfer strategies were compared, among which ResNet-18 with layers 1–34 frozen performed best. Based on this, we compared the prediction effects of three different solvers (Adam, SGDM, and RMSProp) at initial learning rates of 0.01, 0.001, and 0.0001. Figure 7 shows the validation accuracy of the models with different solvers when the initial learning rate is 0.0001. As presented in Table 5, the model accuracy improves dramatically as the learning rate decreases, and at the same learning rate, Adam outperforms RMSProp while SGDM outperforms Adam. When the solver is SGDM and the learning rate is 0.0001, the model validation accuracy is the highest (91.7%); the test accuracy ranks third at 90.0%, and the other evaluation indexes are optimal. Figure 8 shows partial test results; the values in parentheses are the prediction confidences.
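The comparison grid and the SGDM update can be sketched as follows. The grid follows the three solvers and three learning rates named above. The momentum value of 0.9 and the function name `sgdm_step` are assumptions, since the settings in Table 2 are not reproduced here.

```python
import itertools

SOLVERS = ("sgdm", "adam", "rmsprop")
LEARNING_RATES = (0.01, 0.001, 0.0001)

# All solver and learning-rate combinations compared in this section.
configurations = list(itertools.product(SOLVERS, LEARNING_RATES))
print(len(configurations))  # 9


def sgdm_step(weight, velocity, grad, lr, momentum=0.9):
    """One step of stochastic gradient descent with momentum (SGDM).

    The velocity accumulates past gradients; momentum=0.9 is an assumed value.
    """
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# Two steps on a single weight with a constant illustrative gradient of 2.0.
w, v = 1.0, 0.0
w, v = sgdm_step(w, v, grad=2.0, lr=0.0001)
w, v = sgdm_step(w, v, grad=2.0, lr=0.0001)
print(w, v)
```

The second step is larger than the first because the velocity carries over part of the previous update, which is the smoothing effect that distinguishes SGDM from plain gradient descent.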

3.4. Visualisation of Features

This section feeds images to the ResNet-18 CNN and shows, in Figure 9, the activation regions of the addition layers (res2a, res2b, res3a, res3b, and res5b) at the end of the first four residual boxes and the last residual box of the network. By observing which regions are activated on the image and comparing them with the corresponding regions of the original image, the features learned by the network are investigated. The comparison shows that the channels in the shallower layers learn simple features such as colour and edges, while the channels in the deeper layers learn complex features.

Each layer of a CNN consists of several two-dimensional arrays called channels. Each block in the activation region grid is the output channel in the conv1 layer. White pixels indicate strong positive activation regions, and black pixels indicate strong negative activation regions. Primarily grey channels are not strongly activated for the input image. The pixel positions in the channel activation regions correspond to the same positions in the original image. A white pixel at a position in a channel indicates that the channel is strongly activated at that position. Figure 10 shows the 64 feature maps obtained for each layer of res2a and res2b. Some of the channel images show the contours of the drainage holes.
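The colour convention described above can be reproduced by mapping activation values to grey levels. The sketch below assumes a symmetric scaling around zero shared by all channels, so that zero activation maps to mid-grey; the normalisation actually used to produce the figures is not stated, and the function name `activations_to_gray` is hypothetical.

```python
def activations_to_gray(channels):
    """Map activations to 0..255 grey levels using one scale for all channels.

    Strong positive values become white (255), strong negative values black (0),
    and values near zero mid-grey (128).
    """
    peak = max(abs(v) for channel in channels for row in channel for v in row)
    if peak == 0:
        return [[[128 for _ in row] for row in channel] for channel in channels]
    return [
        [[round(127.5 + 127.5 * v / peak) for v in row] for row in channel]
        for channel in channels
    ]


# One made-up 2 x 2 channel.
print(activations_to_gray([[[2.0, -2.0], [0.0, 1.0]]]))
# [[[255, 0], [128, 191]]]
```

Using a single scale for all channels means that a weakly activated channel stays predominantly grey instead of being stretched to full contrast.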

Figure 11 shows the feature images of res3a and res3b, with 128 feature images obtained for each layer. Compared with res2a and res2b, the convolution process reduces the resolution of the feature images. Some additional interference was filtered out. The ideal features of the drainage holes were extracted from some of the feature images. Figure 12 shows the feature images for the last residual box addition layer, res5b, with 512 feature images obtained for each layer. At this point, the resolution continues to decrease, the image becomes extremely blurred, and the visual features disappear.
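The loss of resolution with depth follows from the downsampling in the network. The sketch below assumes the standard ResNet-18 layout for a 224 × 224 input, in which the first convolution and the max pooling layer each halve the resolution and each later stage halves it again while doubling the channel count. The function name `resnet18_stage_shapes` is hypothetical.

```python
def resnet18_stage_shapes(input_size=224):
    """Return (height, width, channels) of the feature maps after each stage.

    Assumes the standard ResNet-18 layout: an overall stride of 4 before the
    first residual stage, then a stride of 2 at the start of each later stage.
    """
    size = input_size // 4  # stride-2 convolution followed by stride-2 pooling
    channels = 64
    shapes = {}
    for stage in ("res2", "res3", "res4", "res5"):
        if stage != "res2":
            size //= 2
            channels *= 2
        shapes[stage] = (size, size, channels)
    return shapes


for stage, shape in resnet18_stage_shapes().items():
    print(stage, shape)
# res2 (56, 56, 64)
# res3 (28, 28, 128)
# res4 (14, 14, 256)
# res5 (7, 7, 512)
```

Under this assumption, each res5b feature image has only 7 × 7 values, which is consistent with the blurred appearance described above.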

4. Conclusion and Outlook

Based on the drainage hole image samples, eight different pretrained CNN models, that is, AlexNet, SqueezeNet, VGG-16, VGG-19, GoogLeNet, ResNet-18, ResNet-50, and ResNet-101, were compared. The performance of the same model under different transfer strategies and different hyperparameters was then compared, and the most suitable network model for blockage detection of drainage holes was obtained. Through the analysis of the experiments, the following conclusions are drawn.

(1) Based on the drainage hole blockage samples, eight different pretrained network models were compared, and a relatively accurate (91.5%) drainage hole blockage detection network model based on ResNet-18 was trained, which classifies drainage hole blockage into four types based on degree, that is, slight blockage, moderate blockage, heavy blockage, and no blockage.

(2) Compared with the transfer learning model, it takes much longer to train the network from scratch, and the network accuracy is much lower. Comparing the performance of the network model under different transfer strategies, the training time decreases as the number of frozen layers increases. Freezing an appropriate set of relatively shallow layers reduces the network training time while essentially maintaining accuracy. However, beyond a certain depth, the accuracy of the model decreases as the number of frozen layers increases.

(3) Different solvers and initial learning rates of the network model were compared, and the results show that the model accuracy increases significantly as the learning rate decreases. For these experimental samples, the model accuracy is best when the initial learning rate is 0.0001 and the solver is SGDM.

The blockage classification network model is helpful for the detection and maintenance of drainage holes in public infrastructure. Inspection personnel can consider using an unmanned aerial vehicle to collect drainage hole pictures along a planned route and then input the collected pictures into the drainage hole blockage detection model, which automatically classifies the blockage degree of each drainage hole.

In the actual image acquisition process, the proportion of target to background in the images varies greatly owing to factors such as shooting angle and drainage hole location, which reduces the adaptability of the model. As a result, the accuracy of the model trained in this paper still requires improvement compared with applications of deep learning in other fields. In this paper, only classification of the degree of drainage hole blockage is carried out; detection and localisation of the disease can be carried out on the basis of this research. Using a target detection method, several types of diseases can be identified in a single image. The research on this drainage hole blockage classification network model is of great importance for the maintenance of drainage pipes, and it provides important technical support for the later development of intelligent maintenance management systems for drainage pipes.

Data Availability

The data used to support the findings of this study are included within the article and are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the financial support provided by the Guangdong Provincial Natural Science Foundation of China (Grant no. 2019A1515011397).