Research on Airport Target Recognition under Low-Visibility Condition Based on Transfer Learning

Li, Jiajun; Wang, Yongzhong; Qian, Yuexin; Xu, Tianyi; Wang, Kaiwen; Wan, Liancheng

doi:https://doi.org/10.1155/2021/9979630

International Journal of Aerospace Engineering


Research Article | Open Access

Volume 2021 | Article ID 9979630 | https://doi.org/10.1155/2021/9979630


Jiajun Li,¹ Yongzhong Wang,¹ Yuexin Qian,¹ Tianyi Xu,¹ Kaiwen Wang,¹ and Liancheng Wan¹

Academic Editor: Giovanni Palmerini

Received 16 Mar 2021

Accepted 24 May 2021

Published 09 Jun 2021

Abstract

Operational safety at the airport is a focus of the aviation industry. Target recognition under low visibility plays an essential role in arranging the circulation of objects in the airport field, identifying unpredictable obstacles in time, and monitoring aviation operations to ensure their safety and efficiency. From the perspective of transfer learning, this paper explores the identification of targets (mainly aircraft, humans, ground vehicles, hangars, and birds) in the airport field under low-visibility conditions (caused by bad weather such as fog, rain, and snow). First, a variety of deep transfer learning networks are used to identify airport targets in clearly visible images. The experimental results show that GoogLeNet is the most effective, with a recognition rate of more than 90.84%. However, the recognition rates of this method drop sharply under low visibility; some are even less than 10%. Therefore, the low-visibility images are processed with 11 different fog removal and vision enhancement algorithms, and the GoogLeNet deep neural network is then used to recognize the processed images. As a result, the target recognition rate improves significantly, to more than 60%. According to the results, the dark channel algorithm has the best image defogging and enhancement effect, and the GoogLeNet deep neural network has the highest target recognition rate.

1. Introduction

The safety of aircraft operation is an enduring theme of the aviation industry and the foundation for the survival and development of civil aviation. Aircraft accidents worldwide mainly occur in the four stages of take-off, climb, approach, and landing. The accident rate of the take-off stage is about 16%. According to statistics on major passenger-plane accidents worldwide from 1980 to 1996, 287 of 621 accidents (46%) were major approach and landing accidents. For every 20% increase in airport flights, the incidence of runway incursions increases by 140%. Under low-visibility conditions, the incidence of accidents is higher still. Therefore, as the connecting point of aircraft take-off and landing, the airport must ensure safety, especially in low-visibility conditions: it must identify the targets in the field in time and eliminate as many safety hazards as possible for aircraft operation. On the one hand, the airport must arrange the circulation of objects (personnel, ground vehicles, and aircraft) in the airport in an orderly manner. On the other hand, airport controllers must have sufficient professional knowledge and strong professional skills to identify and track the real-time position and dynamics of the aircraft and issue accurate commands. Furthermore, auxiliary equipment is needed to identify, in time, the aircraft on the runway and taxiway and unpredictable obstacles in the field (such as birds, circulating vehicles, and staff), reducing both the workload of staff and the occurrence of aviation accidents.

In recent years, deep learning methods have developed rapidly, and target recognition technology has been widely used in national defense, the national economy, social life, space technology, and other fields. For example, radio frequency identification (RFID) technology [1], one of the most promising information technologies of the 21st century, stores and exchanges information through radio waves and is widely used in material management, production line automation, vehicle control, and access control systems. Face recognition technology [2], which enjoys high social recognition in daily life, is a kind of biometric recognition technology. It recognizes the identity of each face mainly by comparing the features of the face itself (the size, position, etc., of each organ). In addition, biometric technology [3] also includes fingerprint recognition, iris recognition, speech recognition, retinal recognition, and palm print recognition, which are used in electronic commerce, social welfare security, government, the army, security and defense, and so on. Optical character recognition (OCR) technology [4] uses optical methods to detect the dark and bright patterns on printed paper to determine character shapes and then uses character recognition methods to translate the shapes into computer text, completing the digitization of printed material, which brings great convenience to people’s study and work. Image recognition technology [5], also called visual recognition technology, uses computer technology to process and analyze the captured image and to identify the target object and its category, which helps the system make a correct judgment. It has great application value in navigation, environmental monitoring, weather forecasting, and many other fields.

Visibility has always been a hot topic in the aviation industry, and there are many causes of low visibility, such as rain, snow, fog, and other meteorological conditions. Compared with good weather, under low-visibility conditions people’s ability to recognize targets drops sharply: the view becomes darker, the line of sight becomes shorter, and objects are difficult to distinguish. At some airports with poor navigation equipment, operation is prohibited when visibility is less than 800 m [6], which seriously affects the normal operation of flights. For example, China’s Tengchong Airport is affected by its altitude and by perennial clouds and fog that cannot be accurately predicted; the resulting low visibility causes frequent flight cancellations, which seriously affect the efficiency of airlines and airports. Identifying the target objects in the airport field under low visibility can improve the efficiency of airport operation to a certain extent and ensure the safety of operations in the field.

This article focuses on low-visibility conditions and aims to find a method that improves the recognition rate of targets in the airport and helps the controller understand the state of the objects in the field more clearly, so as to make rapid and accurate decisions and improve the efficiency of airport operations. Five categories (birds, staff, aircraft, ground vehicles, and hangars) are selected as the research objects, and transfer learning is adopted, together with dehazing and enhancement of the low-visibility images, to improve the recognition rate.

Transfer learning [7] was first proposed by Qiang Zhang in 2005. It trains a model on related, similar data so that the model generalizes and can be applied to different fields or problems. The most remarkable feature of transfer learning is that it does not have to rely on big data to learn from scratch in every field, which greatly reduces the workload. For example, video surveillance, social networks, and intelligent transportation generate massive amounts of text, images, and other data at all times. These initial data can be annotated, similar data can be found to build the model, and then the target data can be added. Common transfer learning methods include (1) domain adaptation [8]: using source domain data or features to enhance training in the target domain, thereby improving the model performance; (2) online transfer learning [9]: combining transfer learning with online learning, so that the source domain data and features increase with time and the source data can be fully used for training; (3) life-long transfer learning [10]: life-long uninterrupted learning to improve the transfer ability from the source to the target, a combination of independent cross-domain knowledge transfer and life-long policy gradient reinforcement learning, which is suitable for decision-making tasks with little training data; (4) heterogeneous transfer learning [11]: there is no requirement on the distribution and feature dimension of the training data and test data, which expands the scope of application of transfer learning; (5) deep transfer learning [12]: a fusion of deep learning and transfer learning that combines the data-fitting ability and strong feature expression of deep learning with the domain-independent feature expression of transfer learning, so that the advantages of both can be exploited simultaneously; (6) reinforced transfer learning [13]: the fusion of reinforcement learning and transfer learning, which also makes use of the advantages of both and is suitable for learning; and (7) adversarial transfer learning [14]: transfer learning that realizes a two-way adversarial process. Because data and features are shared, the data and features of the source and target domains can be fully utilized.

There are many methods of image fog removal, which are mainly divided into two categories: one removes fog on the basis of a prior [15], and the other removes fog by image enhancement [16]. The prior-based approach is built on the atmospheric degradation model, and the defogging process amounts to solving this model in reverse. The representative algorithms mainly include the dark channel defogging algorithm [17], single image defogging algorithms [18, 19], the fast image restoration algorithm [20], and the Bayesian algorithm [21]. The fog removal effect of this kind of method is often better. The method based on image enhancement removes fog by removing image noise, improving contrast, and restoring image clarity. The main representative algorithms are histogram equalization (HLE) [22], the Retinex algorithm [23], wavelet transform [24], homomorphic filtering [25], etc. In addition, image blur can usually be divided into two types: motion blur and defocus blur [26]. This article only removes fog and does not perform deblurring, for two reasons:
(1) The speed of targets in the airport is restricted. According to ICAO regulations, the aircraft’s indicated airspeed at the threshold of the runway is defined as 1.3 times the aircraft’s stall speed under the maximum allowable landing weight and landing profile conditions. Motion blur is therefore not significant, and this article does not consider it.
(2) The airport camera and the target scene are relatively fixed, so the focal length of the lens is also fixed, and the defocus blur caused by the depth of field is not considered.

This paper selects several highly rated algorithms to process and compare images under low-visibility conditions and chooses the best fog removal algorithm as the final solution.

The rest of the paper is divided into the following parts: Section 2 is a theoretical introduction and it mainly studies the theory of the deep transfer learning method to be used in this paper. Section 3 describes data sources and data screening bases and results. Section 4 discusses the screening of image processing methods and target recognition methods and experiments, in which the advantages and disadvantages of different methods are discussed in detail. Section 5 shows the results of the experimental part of Section 4. It is concluded that the dark channel algorithm has the best image dehazing and enhancement effect and the GoogLeNet deep neural network has the highest target recognition rate. After that, other image data are used to verify the built model. The last section is a summary of the full text.

2. Related Work

A review of a large body of literature shows that object recognition has been studied very thoroughly worldwide, but very few studies apply it to recognizing targets in the airport field. From the perspective of the airport, this article applies a variety of target recognition methods and, by experimental comparison, identifies the best one as the final solution. The method adopted in this paper is to use a group of low-visibility images or clear images for deep transfer learning training. If the training and test results are poor, the reasons are analyzed, and the training sample images are modified accordingly. Deep transfer learning training is then repeated for verification.

2.1. Deep Learning

Deep learning is one of the most popular machine learning approaches in computer vision [27–29]. Its main motivation is to build neural networks that simulate the human brain for analytical learning. Compared with the traditional neural network, its main advantages are that the network structure is more hierarchical, the functions it can represent are more complex, and the generalization ability is stronger. The earliest neural network idea appeared in the McCulloch-Pitts (MCP) artificial neuron model of 1943 [30], in which the neuron was simplified as a set of input signals that are linearly weighted, summed, and passed through a nonlinear activation to produce the output. However, the birth of the deep neural network has seen twists and turns. In 1986, Hinton invented the BP algorithm [31] and used the sigmoid function to implement nonlinear mapping, making it suitable for multilayer perceptrons (MLP) and solving nonlinear classification and learning problems. But then, because of the “vanishing gradient problem” [32], neural networks were left out in the cold. It was not until 2006 that a solution to the vanishing gradient problem in deep network training appeared: Hinton proposed initializing the weights by unsupervised pretraining and then fine-tuning them by supervised training [33], and the ReLU activation function [34] was also proposed. Only after a deep network won first place in the ImageNet image recognition competition [35] did deep neural networks attract the attention of many researchers. In current computer vision technology, this approach, which realizes target recognition by extracting features, is the most accurate. In addition, deep neural networks have also been widely used in image classification, super resolution, semantic segmentation, and other fields.

2.2. Deep Transfer Learning

Like deep learning, transfer learning has great practicability in object recognition, image classification, and language processing [36–38]. This paper chooses deep transfer learning for several reasons. Deep learning and transfer learning both give good results in object recognition: Uçar et al. [39] applied deep learning object recognition to cars to realize automatic driving. Yang et al. [40] used deep transfer learning to identify military targets with small training sets and obtained excellent recognition results. Sheeny et al. [41] identified objects in 300 GHz radar images based on deep neural networks and transfer learning and then considered detection and classification in multiple object scenes. A deep neural network obtains its weights from many rounds of training on a large data set. Limited by the data set, it is difficult to train a deep neural network with optimal weights from scratch. Because some basic features of image processing are shared across tasks, the weights of the convolutional layers can be retained during deep transfer learning, and an image recognition network with high accuracy can be trained quickly by replacing the fully connected layer. It is worth mentioning that domain adaptation can also realize image target recognition in low-visibility conditions, and it is applicable when labeled samples exist only in the source domain. It tries to narrow the gap between the source domain and the target domain by learning domain-invariant features and frees the target domain from the expensive cost of labeling [42]. However, in the special area of the airport, the target categories are more fixed than in general areas. In this paper, the target samples in the airport under low visibility need to be manually labeled, and the data used in this paper have already been labeled. In addition, given practical requirements, the model is expected to run on a small embedded platform. Therefore, considering the convenience, operability, and portability of the method, deep transfer learning is finally chosen.
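The idea of retaining pretrained feature weights and training only a new classification layer can be illustrated with a deliberately tiny sketch. It is not the GoogLeNet setup used in this paper: the frozen weights `FROZEN_W`, the toy samples, and the logistic-regression head are all hypothetical stand-ins chosen so that the example runs with the standard library alone.

```python
import math

# Hypothetical frozen weights standing in for pretrained convolutional layers.
FROZEN_W = [[1.0, -1.0], [-1.0, 1.0]]


def extract_features(x):
    """Frozen feature extractor: ReLU(W x). These weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new classification head (logistic regression) by SGD."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            z = sum(w * f for w, f in zip(weights, feats)) + bias
            prob = 1.0 / (1.0 + math.exp(-z))
            grad = prob - y  # gradient of the cross-entropy loss w.r.t. z
            weights = [w - lr * grad * f for w, f in zip(weights, feats)]
            bias -= lr * grad
    return weights, bias


def predict(x, weights, bias):
    feats = extract_features(x)
    return int(sum(w * f for w, f in zip(weights, feats)) + bias > 0)


# Toy target-domain data: label 1 when the first coordinate is larger.
samples = [(2.0, 0.0), (3.0, 1.0), (0.0, 2.0), (1.0, 3.0)]
labels = [1, 1, 0, 0]
frozen_before = [row[:] for row in FROZEN_W]
head_w, head_b = train_head(samples, labels)
predictions = [predict(x, head_w, head_b) for x in samples]
```

Only `head_w` and `head_b` change during training; the feature extractor is reused as is, which is why a small labeled set can be enough in this kind of setup.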

3. Data Set

Identifying airport targets under low visibility does not require complex numerical data; frames captured directly from low-visibility airport surveillance video can be studied. The data in this paper are derived from a data set on the Kaggle website. Considering that the most important elements in the airport field are aircraft, people, hangars, birds, and ground vehicles, images containing these five elements are selected from the initial images of different airports around the world. In addition, the transfer learning method we use applies equally to the identification of all five elements, so the key is to introduce the identification method. The selected sample initial images are shown in Figure 1.

There are several reasons why these six low-visibility images are selected as illustrative objects:
(1) The lack of clarity in all of these images has the same cause, fog, and the main subject to be recognized is the aircraft.
(2) The scenes represented by these images are different: they are spread over different locations in the airport (runway, apron) and show different states (take-off, taxi, and stop), which better reflects the universality of the experimental method.

Although these are all low-visibility images, the visibility they display varies, and the last one clearly has the lowest visibility. Selecting original images with this feature makes the experiment more persuasive.

4. Experiments

4.1. Model Screening

Since Alex and Hinton won the ILSVRC competition with AlexNet in 2012 [43], deep learning has once again attracted the attention of scholars in various fields. GoogLeNet, VGGNet, ResNet, SqueezeNet, generative adversarial networks (GAN), and other networks were then developed in quick succession, and all have played an important role in different fields. In this paper, the neural network is selected mainly for image recognition, and several networks including GoogLeNet, SqueezeNet, Xception, DenseNet, and ResNet are used in turn for transfer learning, training on and processing images taken under favorable meteorological conditions. According to the recognition effect, the most appropriate model is selected for the subsequent experiments.

Images under low visibility are not directly selected for training because, when we use low-visibility images that also contain aircraft, people, hangars, birds, and ground vehicle targets for model verification, the recognition accuracy of GoogLeNet and SqueezeNet is less than 10%. For example, when the four pictures in Figure 2, which contain only aircraft, are identified by the model trained on low-visibility pictures, the recognition rates for the aircraft are 9.581%, 98.21%, 15.24%, and 4.025%, respectively. The result may be affected by shooting angle, distance, light intensity, fog concentration, and other factors. The recognition effect differs from image to image, but most low-visibility images have a low recognition rate after direct training, so the results have no reference value in practical application.

4.1.1. SqueezeNet Presentation

SqueezeNet [44], designed by UC Berkeley and Stanford researchers, aims to reduce network complexity while maintaining recognition accuracy. It is a lightweight and efficient CNN model with few parameters but high performance. Compared with large models, it has several advantages:
(i) Distributed training is more efficient, which greatly reduces network traffic.
(ii) The model is small, so it is convenient to update the client program.
(iii) It is convenient to deploy on specific hardware such as FPGA.

The basic unit of the SqueezeNet network is the Fire module, which contains two layers of convolution operations: one is the squeeze layer with $1 \times 1$ convolution kernels; the other is the expand layer with $1 \times 1$ and $3 \times 3$ convolution kernels. Denote the number of convolution kernels in the squeeze layer as $s_{1\times1}$, and the numbers of $1 \times 1$ and $3 \times 3$ convolution kernels in the expand layer as $e_{1\times1}$ and $e_{3\times3}$, respectively. $s_{1\times1}$ is smaller than the number of input feature maps; each of these layers is followed by a ReLU activation layer. After the expand layer, the $1 \times 1$ and $3 \times 3$ convolution output feature maps are concatenated along the channel dimension. In this design model, there are 9 layers of the Fire module, with some pooling layers inserted in between.
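To see why a Fire module saves parameters, the following sketch counts weights and biases for one module and compares them with a plain 3 × 3 convolution producing the same number of output channels. The argument names `s1x1`, `e1x1`, and `e3x3` and the channel sizes (96 inputs, 16 squeeze kernels, 64 + 64 expand kernels) are illustrative assumptions, not values reported in this paper.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution layer."""
    return c_in * c_out * k * k + c_out


def fire_params(c_in, s1x1, e1x1, e3x3):
    """Return (parameter count, output channels) of one Fire module."""
    squeeze = conv_params(c_in, s1x1, 1)
    # Both expand branches read the squeezed feature maps.
    expand = conv_params(s1x1, e1x1, 1) + conv_params(s1x1, e3x3, 3)
    # The two expand outputs are concatenated along the channel dimension.
    return squeeze + expand, e1x1 + e3x3


fire_count, fire_channels = fire_params(c_in=96, s1x1=16, e1x1=64, e3x3=64)
plain_count = conv_params(c_in=96, c_out=fire_channels, k=3)
```

With these assumed sizes the Fire module needs 11,920 parameters against 110,720 for the plain convolution, because the expensive 3 × 3 kernels only see the 16 squeezed channels.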

During operation, the data of the previous stage are used as the learning samples of the next stage, so the model is trained continuously and its accuracy improves over the iterations; finally, verification data are selected to test the whole model. Here, clear airport images selected from the Kaggle website are used as the initial images (the amount of data is not large). The images are classified into five categories (aircraft, people, hangars, birds, and ground vehicles); 70% of the data are used as training samples, and the remaining 30% are used as test data and fed into the model to obtain the result.
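A 70/30 split of this kind can be sketched as follows. The function name `split_dataset`, the fixed random seed, and the placeholder file names are assumptions for illustration; the paper does not state how the split was drawn.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder image names; a real run would list the labeled image files.
images = [f"image_{i:03d}.jpg" for i in range(100)]
train_set, test_set = split_dataset(images)
```

In practice the split would usually be drawn per category so that each of the five classes keeps the same 70/30 proportion.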

4.1.2. GoogLeNet Presentation

GoogLeNet [45] is the champion model of ILSVRC in 2014, with a total of 22 layers. This model points out that the model quality can be improved by increasing the depth or width of the model. Besides, in order to solve the problems of overfitting and gradient vanishing, the model proposes the Inception modular structure, which not only maintains the structural sparsity of the neural network but also makes full use of the high computational performance of dense matrices.

The Inception structure is the core of the whole GoogLeNet network. Its main idea is to approximate an optimal local sparse structure in a convolutional neural network with a series of readily available dense substructures. As shown in Figure 3, the Inception structure is nested layer by layer. A $1 \times 1$ convolution is used to reduce the dimension before the $3 \times 3$ and $5 \times 5$ convolution operations, and ReLU nonlinear activation is introduced at the same time. The advantage of this structure is that when the computational cost would otherwise become too high, a $1 \times 1$ convolution can be used to reduce the number of parameters to be computed.
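The saving from the 1 × 1 reduction can be checked with a short weight count. The channel sizes below (192 input channels, reduction to 16, then 32 output channels for the 5 × 5 branch) are illustrative assumptions, and biases are ignored for simplicity.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights in a k x k convolution, ignoring biases."""
    return c_in * c_out * k * k


# 5 x 5 convolution applied directly to all input channels.
direct = conv_weights(192, 32, 5)

# 1 x 1 reduction to 16 channels first, then the 5 x 5 convolution.
reduced = conv_weights(192, 16, 1) + conv_weights(16, 32, 5)

saving_ratio = direct / reduced
```

Under these assumptions the branch drops from 153,600 to 15,872 weights, a reduction of roughly a factor of ten.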

Similar to the SqueezeNet application, the same image set is used: 70% of the data are fed into the GoogLeNet network as training samples, and 30% are used for testing.

4.1.3. ResNet Presentation

ResNet [46] was proposed by He et al. in 2015 to solve the problem that the feature extraction ability and accuracy of a deep neural network decrease as layers are stacked from shallow to deep. It is mainly constructed from residual building blocks; each residual block contains multiple cascaded convolutional layers and a shortcut connection, and the outputs of both are added and passed through the ReLU activation layer. Multiple residual blocks are connected in series, and identity mapping and residual mapping are used so that performance is maintained as the network becomes deeper. According to the design of the residual blocks, the networks can be divided into ResNet-18/34 and ResNet-50/101/152.
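The addition of the shortcut and the stacked layers can be sketched on plain vectors. This is a simplified illustration under stated assumptions: fully connected layers (`linear`) stand in for the convolutional layers, and the weight matrices are hypothetical.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, values):
    """Matrix-vector product; a stand-in for a convolutional layer."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute ReLU(F(x) + x) with F(x) = W2 * ReLU(W1 * x)."""
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
identity_output = residual_block([1.0, 2.0], zeros, zeros)
```

When the stacked layers contribute nothing (all-zero weights here), the block reduces to the identity mapping for non-negative inputs, which is what lets extra blocks be added without making the output worse than the input.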

4.1.4. DenseNet Presentation

The classic network DenseNet [47], proposed by Huang et al. in 2017, connects layers in a feedforward fashion. Whereas a traditional convolutional neural network with $L$ layers has $L$ connections, DenseNet has $L(L+1)/2$ connections. The network is composed of dense blocks. Each layer gets additional input from all previous layers and passes its own feature map to all subsequent layers; in this way, each layer receives “collective knowledge” from the previous layers through concatenation. This structure gives DenseNet the following advantages: (a) it alleviates vanishing gradients, (b) it strengthens feature transfer, (c) it encourages feature reuse, and (d) it reduces the number of parameters.
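The dense connectivity pattern can be made concrete by enumerating the connections and the channel counts each layer receives. The function names and the sizes used (5 layers, 64 initial channels, growth rate 32) are illustrative assumptions.

```python
def dense_connections(num_layers):
    """Direct connections in a dense block with num_layers layers: L(L+1)/2."""
    return num_layers * (num_layers + 1) // 2


def enumerate_connections(num_layers):
    """List (source, target) pairs; index 0 is the block input."""
    return [(src, dst) for dst in range(1, num_layers + 1) for src in range(dst)]


def layer_input_channels(initial_channels, growth_rate, num_layers):
    """Channels seen by each layer after concatenating all earlier outputs."""
    return [initial_channels + growth_rate * i for i in range(num_layers)]


connections = enumerate_connections(5)
channels = layer_input_channels(64, 32, 4)
```

Each layer adds only `growth_rate` new feature maps, so individual layers stay narrow even though every layer sees all earlier outputs.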

4.1.5. Model Comparison

Running on an AMD Ryzen 7 4700U with Radeon Graphics (2.00 GHz), this paper uses a single CPU to perform deep transfer learning with currently popular deep neural networks. The target recognition effect of each model is shown in Table 1.

Table 1 shows the comparison of target recognition effects after using nine different neural networks. Among them, DenseNet-201 and GoogLeNet have the largest recognition rates, both reaching 90.84%. But from the perspective of model depth and operating speed, although GoogLeNet has only 22 layers, it achieves the same effect as the 201-layer DenseNet model. In addition, the recognition speed of GoogLeNet is more than twice that of DenseNet-201, and the efficiency is significantly higher. Figure 4 shows the results of GoogLeNet.

The resulting figure shows that the average recognition rate of the GoogLeNet network is 90.84%, and the error loss is less than 0.5%. The GoogLeNet network has a stable and significant recognition effect on airport image targets, the network training speed is fast, and the CPU performance requirements are relatively small. Therefore, the GoogLeNet network will be used for modeling below.

4.2. Image Defogging

After a simple experiment, it is found that no matter which deep neural network is used, the target recognition effect is poor when visibility is low. Hence, the main task of this section is to try a variety of image defogging and enhancement methods, screen out the optimal scheme, and use it to defog and enhance the low-visibility airport images as auxiliary processing, so that the GoogLeNet network can then be used for target recognition.

4.2.1. Dark-Channel Fog Removal Algorithm [52]

In most images without fog, at least one value of R, G, and B in any pixel is very low. This is because if the R, G, and B values are all high, it indicates that the pixel has a tendency to transition to white. Collecting the “smaller” values in all pixels in a certain way can form the dark channel image of the entire picture. The dark channel defogging algorithm proposed by Dr. He Kaiming and others is based on this theory.

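At present, the model of a foggy image that is widely used in the field of computer vision is

$$I(x) = J(x)\,t(x) + A\,(1 - t(x)). \tag{1}$$

In the formula, $I(x)$ represents the image to be processed (the foggy image), $J(x)$ represents the real image (the fog-free image), $A$ represents the global atmospheric light value, and $t(x)$ represents the transmittance. The final purpose is to obtain the fog-free image $J(x)$ by calculation, but $A$ and $t(x)$ in the formula are both unknown. To get the result, we use the dark channel theory: for each pixel of the original image, the smallest of the channel values is selected to form a gray image, which is then subjected to minimum filtering [53]. The dark channel is mathematically defined as

$$J^{\mathrm{dark}}(x) = \min_{y \in \Omega(x)} \Big( \min_{c \in \{r,g,b\}} J^{c}(y) \Big). \tag{2}$$

Here $c$ is one of the R, G, and B channels, $x$ is a pixel in the image, $\Omega(x)$ is a small area around the pixel $x$, and $J^{\mathrm{dark}}$ is the dark channel image.

The derivation is as follows, where $\tilde{t}(x)$ denotes the transmittance, assumed constant within $\Omega(x)$:
(1) Take the minimum of formula (1) over $\Omega(x)$:

$$\min_{y \in \Omega(x)} I^{c}(y) = \tilde{t}(x) \min_{y \in \Omega(x)} J^{c}(y) + (1 - \tilde{t}(x))\,A^{c}. \tag{3}$$

(2) Divide by $A^{c}$:

$$\min_{y \in \Omega(x)} \frac{I^{c}(y)}{A^{c}} = \tilde{t}(x) \min_{y \in \Omega(x)} \frac{J^{c}(y)}{A^{c}} + 1 - \tilde{t}(x). \tag{4}$$

(3) Take the minimum of formula (4) again, this time over the channels:

$$\min_{c} \Big( \min_{y \in \Omega(x)} \frac{I^{c}(y)}{A^{c}} \Big) = \tilde{t}(x) \min_{c} \Big( \min_{y \in \Omega(x)} \frac{J^{c}(y)}{A^{c}} \Big) + 1 - \tilde{t}(x). \tag{5}$$

(4) Since most elements of the dark channel are close to zero, we can approximate

$$J^{\mathrm{dark}}(x) = \min_{y \in \Omega(x)} \Big( \min_{c} J^{c}(y) \Big) = 0. \tag{6}$$

(5) From formula (6),

$$\min_{c} \Big( \min_{y \in \Omega(x)} \frac{J^{c}(y)}{A^{c}} \Big) = 0. \tag{7}$$

(6) Substituting formula (7) into formula (5) gives

$$\tilde{t}(x) = 1 - \min_{c} \Big( \min_{y \in \Omega(x)} \frac{I^{c}(y)}{A^{c}} \Big). \tag{8}$$

(7) In order to keep the image realistic, introduce the parameter $\omega$ ($0 < \omega \le 1$) so that a small amount of fog is retained, limit the transmittance to a minimum value $t_0$, and finally get the expressions as follows:

$$\tilde{t}(x) = 1 - \omega \min_{c} \Big( \min_{y \in \Omega(x)} \frac{I^{c}(y)}{A^{c}} \Big), \tag{9}$$

$$J(x) = \frac{I(x) - A}{\max(\tilde{t}(x), t_0)} + A. \tag{10}$$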
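The derivation can be exercised end to end on a tiny synthetic image. The sketch below is an illustration under several assumptions, not the authors' implementation: the atmospheric light `A` is taken as known (in practice it has to be estimated from the image), the names `omega` and `t0` denote the fog-retention parameter and the lower bound on the transmittance, the window is clipped at the image border, and images are nested lists of RGB tuples in [0, 1].

```python
def dark_channel(img, radius=1):
    """Minimum over the RGB channels, then minimum over a square window."""
    h, w = len(img), len(img[0])
    channel_min = [[min(px) for px in row] for row in img]
    dark = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            dark[i][j] = min(
                channel_min[y][x]
                for y in range(max(0, i - radius), min(h, i + radius + 1))
                for x in range(max(0, j - radius), min(w, j + radius + 1))
            )
    return dark


def estimate_transmission(img, A, omega=0.95, radius=1):
    """t(x) = 1 - omega * dark_channel(I / A)."""
    normalized = [[tuple(px[c] / A[c] for c in range(3)) for px in row] for row in img]
    return [[1.0 - omega * d for d in row] for row in dark_channel(normalized, radius)]


def recover(img, A, t, t0=0.1):
    """J(x) = (I(x) - A) / max(t(x), t0) + A, applied per channel."""
    return [
        [tuple((px[c] - A[c]) / max(t[i][j], t0) + A[c] for c in range(3))
         for j, px in enumerate(row)]
        for i, row in enumerate(img)
    ]


# Synthetic check: haze a fog-free image with a known transmittance.
A = (1.0, 1.0, 1.0)
true_t = 0.6
clear = [[(0.0, 0.5, 0.8), (0.3, 0.0, 0.6)],
         [(0.7, 0.2, 0.0), (0.0, 0.9, 0.4)]]
hazy = [[tuple(v * true_t + A[c] * (1.0 - true_t) for c, v in enumerate(px)) for px in row]
        for row in clear]
t_est = estimate_transmission(hazy, A, omega=1.0)
restored = recover(hazy, A, t_est)
```

The check uses `omega=1.0` so that the known transmittance is recovered exactly; every pixel of the synthetic fog-free image has one zero channel, so the dark channel assumption holds by construction. Real images satisfy it only approximately.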

4.2.2. Histogram Equalization [54]

A histogram is a statistical representation of an image that reflects information such as contrast, brightness, gray scale range, and distribution. The basic idea of histogram equalization is to apply a nonlinear transformation to the original image that stretches the gray levels containing many pixels and compresses the gray levels containing few pixels, which increases the contrast of images with a small dynamic range of gray values and makes the whole image more vivid. The main steps are as follows:

(1) Preprocess the original image to obtain the normalized histogram, which is expressed as

$$p(r_k) = \frac{n_k}{N}, \quad k = 0, 1, \ldots, L - 1.$$

In the above formula, $p(r_k)$ is the distribution probability of the gray level $r_k$, $N$ is the total number of image pixels, $n_k$ is the number of pixels at gray level $r_k$, and $L$ is the number of gray levels. After normalization, the sum of all probabilities is 1, that is, $\sum_{k=0}^{L-1} p(r_k) = 1$.

(2) Calculate the gray scale transformation table from the cumulative histogram, which can be expressed as

$$s_k = \mathrm{round}\Big( (L - 1) \sum_{j=0}^{k} p(r_j) \Big).$$

The round function means rounding.

(3) Apply the table lookup: each gray value in the preprocessed original image is mapped through the transformation table, and the new gray value histogram is obtained. It should be noted that each gray value of the input original image should correspond to exactly one equalized output gray value, and the relative order of the pixel gray values remains unchanged.
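The three steps can be sketched on a flat list of gray values. This is a minimal illustration: the function name `equalize`, the 8-level toy image, and the use of `int(v + 0.5)` for rounding half up are assumptions made for the example.

```python
def equalize(pixels, levels=256):
    """Return the equalized pixels and the gray scale transformation table."""
    n = len(pixels)
    counts = [0] * levels
    for v in pixels:                      # step 1: histogram
        counts[v] += 1
    table, cumulative = [], 0
    for k in range(levels):               # step 2: cumulative sum and rounding
        cumulative += counts[k]
        table.append(int((levels - 1) * cumulative / n + 0.5))
    return [table[v] for v in pixels], table  # step 3: table lookup


# Toy image whose gray values are crowded into levels 2..4 out of 0..7.
pixels = [2] + [3] * 3 + [4] * 6
equalized, table = equalize(pixels, levels=8)
```

The input occupies only levels 2 to 4, while the output spreads over levels 1 to 7; the table is non-decreasing, so the relative order of gray values is preserved.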

In addition, histogram equalization can be divided into global and local methods. The global histogram equalization method realizes image processing by improving the visual effect of the whole image, while local histogram equalization first decomposes the image into multiple regions and then equalizes each region with a separate histogram. Finally, the regions are combined into a complete image. When using the histogram equalization method to process images, different color models can be used to achieve different results, such as RGB, HSV, YCbCr, CMY, and HSI.

4.2.3. Image Filtering Algorithm

Filtering is an operation that suppresses interference by filtering out specific frequency bands in the signal. In an image, noise is usually a high-frequency signal, which is removed by low-pass filtering; the resulting image is smoother but also blurrier. If low-pass filtering is used alone, the edge and texture features in the image are also eliminated. Therefore, image processing should balance the smoothing effect against the preservation of sharpness.

There are many filtering algorithms in images, which can be roughly divided into linear filtering and nonlinear filtering. Among them, the common algorithms in linear filtering are mean filtering and Gaussian filtering, and the common algorithms in nonlinear filtering are median filtering and bilateral filtering. No matter which kind of filtering algorithm, their function is the same, which is to suppress the noise in the image on the premise of preserving the original image detail feature as much as possible. The key to showing different effects is the principle difference between algorithms.

The following will explain the above common filtering algorithms for subsequent research.

(1) Mean Filter [55]. Mean filtering takes a target pixel together with its neighboring pixels. These pixels form an $m \times n$ matrix, and the average value of all pixels in the matrix replaces the value of the target pixel at the center. It can be expressed as

$$g(x, y) = \frac{1}{mn} \sum_{(s,t) \in K_{xy}} f(s, t).$$

Here $g$ represents the processed image, $f$ is the original image, and $K_{xy}$ represents the $m \times n$ matrix centered on $(x, y)$, also known as the “kernel.” The kernel size is expressed as a tuple (width, height), and the commonly used kernel sizes are (3,3) and (5,5). Usually, the larger the kernel, the more blurred the image becomes.

However, mean filtering has its own shortcoming: it cannot both remove image noise and protect image details. Usually, processed images become blurred, and noise is not removed well.
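A minimal mean filter on a grayscale image stored as a nested list shows both the averaging and the blurring it causes. The function name `mean_filter` and the border handling (the window is clipped at the image edge) are assumptions made for this sketch.

```python
def mean_filter(img, ksize=3):
    """Replace each pixel by the average of its ksize x ksize neighborhood."""
    h, w = len(img), len(img[0])
    r = ksize // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                img[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            out[i][j] = sum(window) / len(window)
    return out


# A single bright pixel in a dark image.
spike = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
smoothed = mean_filter(spike)
```

The isolated bright pixel is not removed; its value is spread over the whole neighborhood, which is the blurring effect described above.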

(2) Gaussian Filter [56]. To overcome the blurring caused by simple local averaging, researchers have proposed many local averaging methods that preserve image edges and details by adapting the neighborhood size, shape, and direction, the weight coefficient of each point, the weighted average of parameters, and so on. Gaussian filtering is one such image smoothing method based on the idea of neighborhood averaging.

Unlike simple mean filtering, Gaussian filtering assigns different weights to pixels at different positions when averaging the pixel values in the neighborhood. In general, the smaller the Euclidean distance from the center position, the greater the assigned weight, so that pixels nearer the center count more.
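
As a rough illustration of this distance-based weighting, the sketch below builds a normalized Gaussian kernel. The function name `gaussian_kernel` and the parameter `sigma` (the standard deviation controlling how quickly the weights fall off) are assumptions for this example; the paper does not report the values it used.

```python
import math


def gaussian_kernel(ksize=3, sigma=1.0):
    """Build a ksize x ksize Gaussian kernel whose weights sum to 1."""
    r = ksize // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-r, r + 1)]
           for dy in range(-r, r + 1)]
    total = sum(sum(row) for row in raw)
    # Normalize so that filtering preserves the overall brightness.
    return [[v / total for v in row] for row in raw]


kernel = gaussian_kernel(ksize=3, sigma=1.0)
for row in kernel:
    print(" ".join(f"{v:.4f}" for v in row))
```

The center weight is the largest, the four edge-adjacent weights are smaller, and the four corner weights are the smallest, matching the rule that weight decreases with Euclidean distance from the center.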

(3) Median Filter [57]. Median filtering is a nonlinear signal processing technique that suppresses noise on the basis of order statistics. The basic principle is similar to that of mean filtering. The key difference is that the median filter replaces the value of the target pixel with the median of the pixel values in its neighborhood, which brings the pixel close to the true surrounding values and eliminates isolated noise points. Concretely, the pixels covered by the template are sorted by value into a monotonic sequence, and the value in the middle position replaces the value of the target pixel. The template is usually a (3,3) or (5,5) matrix but can also take other shapes, such as circular, linear, or cross-shaped. The output of the median filter at each pixel of the processed image is thus the median of the original image values covered by the template centered on that pixel.

The main advantage of the median filter over the mean filter is that it preserves image boundary information while denoising.
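
A minimal sketch of this sort-and-take-the-middle procedure is given below. The function name `median_filter`, the square template, and the edge-replication border handling are assumptions for illustration only.

```python
import statistics


def median_filter(image, ksize=3):
    """Replace each pixel with the median of its ksize x ksize neighborhood.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption made for this sketch).
    """
    h, w = len(image), len(image[0])
    r = ksize // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(image[yy][xx])
            # statistics.median sorts the window and returns the middle value.
            out[y][x] = statistics.median(window)
    return out


# A vertical edge (10 on the left, 200 on the right) with one noise pixel.
noisy = [[10, 10, 200, 200],
         [10, 255, 200, 200],
         [10, 10, 200, 200]]
cleaned = median_filter(noisy, ksize=3)
for row in cleaned:
    print(row)  # each row: [10, 10, 200, 200]
```

In this toy example the isolated value 255 is removed and the edge between 10 and 200 stays exactly where it was, whereas a mean filter would smear both.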

(4) Bilateral Filter [58]. As a nonlinear filter, bilateral filtering smooths image noise while maintaining the original edges. Its principle builds on Gaussian filtering and also uses a weighted average. The difference is that the bilateral weight considers not only the Euclidean distance between pixels but also their radiometric difference in the range domain, that is, how similar the color intensity of each point is to that of the central pixel. The filter therefore combines two weights, the spatial domain weight and the range domain weight. The output pixel value is the normalized weighted average of the input pixels in the template around the center point, where each weight coefficient depends on both the spatial domain kernel and the range domain kernel and is the product of the two.
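
For reference, a commonly used textbook form of the bilateral filter is sketched below. The symbols are chosen here for illustration and are not necessarily the notation of the original equation: $f$ is the input image, $g$ the output image, $(i,j)$ the template center point, $S(i,j)$ the set of pixels covered by the template, and $\sigma_d$ and $\sigma_r$ the assumed widths of the spatial domain kernel and the range domain kernel.

$$
g(i,j) = \frac{\sum_{(k,l)\in S(i,j)} f(k,l)\, w(i,j,k,l)}{\sum_{(k,l)\in S(i,j)} w(i,j,k,l)}
$$

$$
w(i,j,k,l) = \exp\!\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2}\right)\exp\!\left(-\frac{\lVert f(i,j)-f(k,l)\rVert^2}{2\sigma_r^2}\right)
$$

The first exponential is the spatial domain kernel (the same distance weighting as in Gaussian filtering), and the second is the range domain kernel, which shrinks the weight of pixels whose intensity differs strongly from the center pixel. Across an edge the second factor is close to zero, so pixels on the far side of the edge contribute little and the edge is preserved.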

4.3. Modeling and Implementation

Based on the model screening, the GoogLeNet network is used for modeling. The key implementation choice is the image processing method. Each of the image defogging and noise reduction methods introduced above has advantages and disadvantages, so extensive experiments are needed to find one suited to the low-visibility images in this paper.

In the experiments, a single fog removal and enhancement method can be used, or two or more methods can be combined so that their advantages complement each other and the processing effect improves. However, stacking many different methods on the same image greatly increases the workload and reduces operating efficiency. This paper therefore compares only schemes that use one method or a combination of two methods. The schemes finally used are shown in Figure 5.
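
The workload argument can be made concrete with a small counting sketch. The four method names below are placeholders chosen for illustration, and the sketch simply counts every ordered chain of distinct methods; the 11 schemes actually evaluated in the paper are the ones listed in Figure 5, not this enumeration.

```python
from itertools import permutations

# Placeholder method names (assumptions for this sketch only).
methods = ["dark_channel", "histogram_equalization",
           "guided_filter", "median_filter"]


def candidate_schemes(methods, max_steps):
    """List every ordered chain of distinct methods up to max_steps long."""
    schemes = []
    for k in range(1, max_steps + 1):
        schemes.extend(permutations(methods, k))
    return schemes


for max_steps in (1, 2, 3):
    print(max_steps, len(candidate_schemes(methods, max_steps)))
# 1 -> 4 schemes, 2 -> 16 schemes, 3 -> 40 schemes
```

Even with only four candidate methods, allowing chains of three steps more than doubles the number of schemes to train and evaluate compared with chains of at most two, which is consistent with limiting the comparison to single and paired methods.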

In addition, because the final scheme has not yet been determined, one picture from the sample images is selected as the experimental object to reduce the workload, and the other pictures are identified and verified after the optimal fog removal and enhancement method has been screened out. The last picture of the sample images is selected as the representative: it shows the lowest visibility, and its aircraft recognition accuracy in the preliminary model screening is also the lowest, only 4.025%, which makes it a demanding test case. The images processed by each scheme are input into MATLAB. After learning, training, and iteration, a new recognition result is obtained for each scheme.

5. Results and Validation

5.1. Results

Following the modeling approach in the previous section, the targets in Figure 2 are identified. Table 2 compares the target recognition results of the GoogLeNet deep transfer learning model under the different image defogging and enhancement methods. Since the aircraft is the main subject of the picture, each method is judged by the recognition rate of the aircraft on the runway under low visibility.

Overall, image processing before GoogLeNet modeling improves the target recognition rate. Among the 11 schemes, the dark channel algorithm used alone gives the highest recognition rate, 76.95%, an increase of about 73 percentage points over the 4.025% recognition rate of the original image. Next, the fog filter_dark channel_CDIE, fog filter_dark channel_HE, HSV histeq, RGB histeq, and YCBCR histeq methods each reach a recognition rate above 40%. The guided-filter method performs worst, with a recognition rate of 9.238%, only about five percentage points higher than the aircraft recognition rate of the original image. The experiments therefore indicate that the best method for airport target recognition under low-visibility conditions is to first apply the dark channel algorithm for image defogging and enhancement and then use the GoogLeNet deep neural network, trained on the known images, for target recognition.
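
When reading these figures, it helps to separate the gain in percentage points from the relative change. The short sketch below computes both from the rates quoted in this section; the helper name `compare` is an assumption for illustration.

```python
def compare(before, after):
    """Return (gain in percentage points, relative change in percent)."""
    points = after - before
    relative = (after - before) / before * 100.0
    return points, relative


baseline = 4.025  # aircraft recognition rate of the unprocessed image, in %
for name, rate in [("dark channel", 76.95), ("guided filter", 9.238)]:
    points, relative = compare(baseline, rate)
    print(f"{name}: +{points:.1f} points, +{relative:.0f}% relative")
```

For the dark channel algorithm the gain is about 72.9 percentage points, which corresponds to a recognition rate roughly 19 times the baseline; for the guided filter the gain is about 5.2 percentage points.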

5.2. Model Validation

To rule out chance results and make the conclusion more general and persuasive, the same processing is applied to the other sample images. Figures 6 and 7 compare the recognition results for the second and third images before and after processing.

By comparison, the recognition rate of the aircraft increases from 13% to 62.14% in the first image, from 9.581% to 71.24% in the second image, and from 15.24% to 60.92% in the fourth image, a gain of roughly 50 percentage points in general. Surprisingly, the recognition rate of the third and fifth images drops slightly. The initial recognition rate of these two images is already very high, above 98%, so the target is accurately identified without processing. Image processing changes the image clarity, so the recognition rate may decrease, but the decrease is small and does not affect the overall recognition result. These data show that combining dark channel image processing with the GoogLeNet deep neural network greatly helps the recognition of airport targets. Among the low-visibility images, the lower the initial target recognition rate, the higher the recognition rate after image processing, which shows how important image processing is in this study. In addition, comparing the third and fifth images with the other images shows that the target recognition rate differs from image to image; it is mainly influenced by environmental visibility, light intensity, and shooting distance.
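
The gains quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three before/after pairs reported in this paragraph (the exact rates of the third and fifth images are not given, so they are omitted); the variable names are assumptions for illustration.

```python
# (image, rate before processing in %, rate after processing in %)
results = [("image 1", 13.0, 62.14),
           ("image 2", 9.581, 71.24),
           ("image 4", 15.24, 60.92)]

gains = {name: after - before for name, before, after in results}
mean_gain = sum(gains.values()) / len(gains)
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
print(f"mean gain: {mean_gain:.2f} percentage points")

# Check the trend: sorted by initial rate, do the final rates decrease?
by_initial = sorted(results, key=lambda r: r[1])
after_rates = [r[2] for r in by_initial]
lower_start_higher_end = all(a > b for a, b in zip(after_rates, after_rates[1:]))
print(lower_start_higher_end)  # True for these three images
```

The individual gains are about 49, 62, and 46 percentage points, with a mean of about 52, and for these three images a lower initial rate does go with a higher rate after processing.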

6. Discussion

This article studies the recognition of targets in the airport under low-visibility conditions, and its research methods are relatively novel. Although many scholars have devoted themselves to target recognition, few have applied it to airport targets and achieved practical application value. Starting from the actual operating situation, research on airport target recognition has great practical significance for ensuring the safety of airport operations, improving aviation operation efficiency, and reducing the risk of runway incursion. The “GoogLeNet deep transfer learning network-fog filter_dark channel” method used in this article combines image processing technology with image recognition technology. Compared with other methods, it not only has a high recognition rate for image targets under low-visibility conditions but is also fast to run, easy to operate, and easy to port, which gives it considerable research value.

Based on deep transfer learning, this paper first uses SqueezeNet, GoogLeNet, ResNet, DenseNet, VGG, and other models to recognize low-visibility images. The recognition rate is found to be very low, in some cases less than 10%, which has no application value. Clear images are therefore applied to each deep transfer learning model, and the GoogLeNet transfer learning network is selected as the optimal model according to the target recognition results and used in the subsequent target recognition research under low-visibility conditions. The processing of low-visibility images is then explored. After trying a variety of image dehazing and enhancement methods, such as the dark channel algorithm, histogram equalization, filtering, dark channel combined with histogram equalization, and dark channel combined with filtering, we conclude that the dark channel algorithm gives the best image processing effect.

Finally, the dark channel image processing algorithm is combined with the GoogLeNet network to recognize targets in low-visibility input images and to verify the results. The results show that this method can identify objects in low-visibility airport images, and the lower the initial target recognition rate, the more obvious the improvement after image processing. As shown in Figure 6, when the original recognition rate is lower than 10%, the recognition rate after processing can be improved to more than 60%. This paper also finds that the main factors affecting airport target recognition are environmental visibility, light intensity, and shooting distance.

Data Availability

The data underlying the results of this study relate to aircraft flight safety and are subject to copyright; part of the data can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research is supported by the project of the China Meteorological Administration (0052024), the Special Project of the Civil Aviation Flight University of China (TS-06), the Postgraduate Research and Innovation Fund Project of the Civil Aviation Flight University of China (X2021-27), and the Traffic Engineering Advantages and Characteristic Discipline Construction Project of the Civil Aviation Flight University of China (D202103). We are very grateful for the support of these funds.

References

[1] J. Park, Y. J. Kim, and B. K. Lee, “Passive radio-frequency identification tag-based indoor localization in multi-stacking racks for warehousing,” Applied Sciences, vol. 10, no. 10, p. 3623, 2020.
[2] P. J. Phillips, “Improving face recognition technology,” Computer, vol. 44, no. 3, pp. 84–86, 2011.
[3] C.-M. Kim, E. J. Hong, K. Chung, and R. C. Park, “Driver facial expression analysis using LFA-CRNN-based feature extraction for health-risk decisions,” Applied Sciences, vol. 10, no. 8, p. 2956, 2020.
[4] O. Matei, P. C. Pop, and H. Vălean, “Optical character recognition in real environments using neural networks and k-nearest neighbor,” Applied Intelligence, vol. 39, no. 4, pp. 739–748, 2013.
[5] H. Zhang, W. Wang, L.-j. Xu, H. Qin, and M. Liu, “Application of image recognition technology in electrical equipment on-line monitoring,” Power System Protection and Control, vol. 38, no. 6, pp. 88–91, 2010.
[6] L. Zhu, G. Zhu, L. Han, and N. Wang, “The application of deep learning in airport visibility forecast,” Atmospheric and Climate Sciences, vol. 7, no. 3, p. 314, 2017.
[7] J. Oh, M. Kim, and S.-W. Ban, “Deep learning model with transfer learning to infer personal preferences in images,” Applied Sciences, vol. 10, no. 21, p. 7641, 2020.
[8] S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. W. Vaughan, “A theory of learning from different domains,” Machine Learning, vol. 79, no. 1-2, pp. 151–175, 2010.
[9] Q. Wu, H. Wu, X. Zhou et al., “Online transfer learning with multiple homogeneous or heterogeneous sources,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 7, pp. 1494–1507, 2017.
[10] Q. Yang, “Big data, lifelong machine learning and transfer learning,” in Proceedings of the sixth ACM international conference on Web search and data mining, pp. 505–506, Rome, Italy, 2013.
[11] Y. Zhu, Y. Chen, Z. Lu et al., “Heterogeneous transfer learning for image classification,” in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1304–1309, San Francisco, CA, USA, 2011.
[12] J. Lin, R. Ward, and Z. J. Wang, “Deep transfer learning for hyperspectral image classification,” in 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–5, Vancouver, BC, Canada, 2018.
[13] G. Boutsioukis, I. Partalas, and I. Vlahavas, “Transfer learning in multi-agent reinforcement learning domains,” in European Workshop on Reinforcement Learning, pp. 249–260, Springer, 2011.
[14] G. Cai, Y. Wang, L. He, and M. Zhou, “Adversarial transform networks for unsupervised transfer learning,” in 2020 IEEE International Conference on Networking, Sensing and Control (ICNSC), pp. 1–6, Nanjing, China, 2020.
[15] Z. Chen, D. Zhang, Y. Xu, C. Wang, and B. Yuan, “Research of polarized image defogging technique based on dark channel priori and guided filtering,” Procedia Computer Science, vol. 131, pp. 289–294, 2018.
[16] M. J. Ai, L. Z. Dai, and Q. H. Cao, “A self-adaptation image enhancement method for fog elimination in foggy environment,” Computer Simulation, vol. 7, 2009.
[17] K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.
[18] R. Fattal, “Single image dehazing,” ACM Transactions on Graphics (TOG), vol. 27, no. 3, pp. 1–9, 2008.
[19] F. Fengbing, Z. Hongying, W. U. Bin, and W. U. Yadong, “Fast single image dehazing based on the fog depth,” Journal of Sichuan University of Science & Engineering (Natural Science Edition), vol. 287, 2016.
[20] L.-J. Deng, H. Guo, and T.-Z. Huang, “A fast image recovery algorithm based on splitting deblurring and denoising,” Journal of Computational and Applied Mathematics, vol. 287, pp. 88–97, 2015.
[21] D. Sun, H. Wen, D. Wang, and J. Xu, “A random forest model of landslide susceptibility mapping based on hyperparameter optimization using Bayes algorithm,” Geomorphology, vol. 362, article 107201, 2020.
[22] R. A. D. Ladhake and A. Siddharth, “Combine approach of enhancing images using histogram equalization and Laplacian pyramid,” International Journal of Computer Trends & Technology, vol. 4, no. 4, 2013.
[23] J. W. Park, H. Lee, B. Kim et al., “A low-cost and high-throughput FPGA implementation of the Retinex algorithm for real-time video enhancement,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 1, pp. 101–114, 2019.
[24] G. Xue, P. Xue, and Q. Liu, “A method to improve the Retinex image enhancement algorithm based on wavelet theory,” in 2010 International Symposium on Computational Intelligence and Design, pp. 182–185, Hangzhou, China, 2010.
[25] S. Dong, J. Ma, Z. Su, and C. Li, “Robust circular marker localization under non-uniform illuminations based on homomorphic filtering,” Measurement, vol. 170, article 108700, 2020.
[26] N. Zhang and J. Yan, “Rethinking the defocus blur detection problem and a real-time deep DBD model,” in European Conference on Computer Vision, pp. 617–632, Cham, 2020.
[27] K. Fukushima, “A hierarchical neural network capable of visual pattern recognition,” Neural Network, vol. 1, 1989.
[28] S. T. Krishna and H. K. Kalluri, “Deep learning and transfer learning approaches for image classification,” International Journal of Recent Technology and Engineering (IJRTE), vol. 7, no. 5S4, pp. 427–432, 2019.
[29] A. Magotra and J. Kim, “Improvement of heterogeneous transfer learning efficiency by using hebbian learning principle,” Applied Sciences, vol. 10, no. 16, p. 5631, 2020.
[30] S. A. Emami and A. Roudbari, “Identification of nonlinear time-varying systems using wavelet neural networks,” Advanced Control for Applications: Engineering and Industrial Systems, vol. 2, no. 4, p. e59, 2020.
[31] F. Jiang, Y. Lu, Y. Chen, D. Cai, and G. Li, “Image recognition of four rice leaf diseases based on deep learning and support vector machine,” Computers and Electronics in Agriculture, vol. 179, p. 105824, 2020.
[32] Y.-g. Zhang, J. Tang, R.-p. Liao et al., “Application of an enhanced BP neural network model with water cycle algorithm on landslide prediction,” Stochastic Environmental Research and Risk Assessment, vol. 35, pp. 1–19, 2020.
[33] J. Cao, H. Cui, Q. Zhang, and Z. Zhang, “Ancient mural classification method based on improved AlexNet network,” Studies in Conservation, vol. 65, no. 7, pp. 411–423, 2020.
[34] P. Dhar, S. Dutta, and V. Mukherjee, “Cross-wavelet assisted convolution neural network (AlexNet) approach for phonocardiogram signals classification,” Biomedical Signal Processing and Control, vol. 63, article 102142, 2021.
[35] Y. Gao and K. M. Mosalam, “PEER hub ImageNet: a large-scale multiattribute benchmark data set of structural images,” Journal of Structural Engineering, vol. 146, no. 10, article 04020198, 2020.
[36] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep transfer learning,” in International conference on artificial neural networks, pp. 270–279, Cham, 2018.
[37] M. Wang and W. Deng, “Deep visual domain adaptation: a survey,” Neurocomputing, vol. 312, pp. 135–153, 2018.
[38] K. Saito, K. Watanabe, Y. Ushiku, and T. Harada, “Maximum classifier discrepancy for unsupervised domain adaptation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3723–3732, Salt Lake City, USA, 2018.
[39] A. Uçar, Y. Demir, and C. Güzeliş, “Object recognition and detection with deep learning for autonomous driving applications,” SIMULATION, vol. 93, no. 9, pp. 759–769, 2017.
[40] Z. Yang, W. Yu, P. Liang et al., “Deep transfer learning for military object recognition under small training set condition,” Neural Computing and Applications, vol. 31, no. 10, pp. 6469–6478, 2019.
[41] M. Sheeny, A. Wallace, and S. Wang, “300 GHz radar object recognition based on deep neural networks and transfer learning,” IET Radar, Sonar & Navigation, vol. 14, no. 10, pp. 1483–1493, 2020.
[42] J. Yan, Z. Jing, and H. Leung, “Discriminative partial domain adversarial network,” in European Conference on Computer Vision, Glasgow, UK, 2020.
[43] Z. Wu and S. He, “Improvement of the AlexNet networks for large-scale recognition applications,” Iranian Journal of Science and Technology, Transactions of Electrical Engineering, vol. 45, no. 2, pp. 493–503, 2020.
[44] Y. Xu, G. Yang, J. Luo, and J. He, “An electronic component recognition algorithm based on deep learning with a faster SqueezeNet,” Mathematical Problems in Engineering, vol. 2020, 11 pages, 2020.
[45] C. Li, H. Zhang, P. Wu, Y. Yin, and S. Liu, “A complex junction recognition method based on GoogLeNet model,” Transactions in GIS, vol. 24, no. 6, pp. 1756–1778, 2020.
[46] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016.
[47] J. Dolz, I. B. Ayed, Y. Jing, and C. Desrosiers, “HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation,” IEEE Transactions on Medical Imaging, vol. 38, no. 5, pp. 1116–1126, 2018.
[48] A. Flachot and K. R. Gegenfurtner, “Color for object recognition: hue and chroma sensitivity in the deep features of convolutional neural networks,” Vision Research, vol. 182, pp. 89–100, 2021.
[49] K. M. Hosny, M. A. Kassem, and M. M. Foaud, “Classification of skin lesions into seven classes using transfer learning with AlexNet,” Journal of Digital Imaging, vol. 33, no. 5, pp. 1325–1334, 2020.
[50] L. Zhang, P. Wang, H. Li, Z. Li, and Y. Zhang, “A robust attentional framework for license plate recognition in the wild,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, pp. 1–10, 2020.
[51] H. Wang, F. Zhang, and L. Wang, “Fruit classification model based on improved Darknet53 convolutional neural network,” in 2020 International conference on intelligent transportation, big Data & Smart City (ICITBS), Vientiane, Laos, 2020.
[52] L. Yun, Y. Gao, J.-s. Shi, and L.-z. Xu, “Enhancement dark channel algorithm of color fog image based on the local segmentation,” in Selected Papers from Conferences of the Photoelectronic Technology Committee of the Chinese Society of Astronautics 2014, Part II, no. article 952217, International Society for Optics and Photonics, 2015.
[53] Z. Li and J. Zheng, “Single image de-hazing using globally guided image filtering,” IEEE Transactions on Image Processing, vol. 27, no. 1, pp. 442–450, 2017.
[54] Z. Huang, Z. Wang, J. Zhang, Q. Li, and Y. Shi, “Image enhancement with the preservation of brightness and structures by employing contrast limited dynamic quadri-histogram equalization,” Optik - International Journal for Light and Electron Optics, vol. 226, no. 2, article 165877, 2021.
[55] C.-T. Lu, L.-L. Wang, J.-H. Shen, and J.-A. Lin, “Image enhancement using deep-learning fully connected neural network mean filter,” The Journal of Supercomputing, vol. 77, no. 3, pp. 3144–3164, 2020.
[56] V. Le, T. Kim, Y. Kim, and D. Aspnes, “Extended Gaussian filtering for noise reduction in spectral analysis,” Journal of the Korean Physical Society, vol. 77, no. 10, pp. 819–823, 2020.
[57] S. Anwar and G. Rajamohan, “Improved image enhancement algorithms based on the switching median filtering technique,” Arabian Journal for Science and Engineering, vol. 45, no. 12, pp. 11103–11114, 2020.
[58] S.-J. Jang and Y. Hwang, “Noise-aware and light-weight VLSI design of bilateral filter for robust and fast image denoising in mobile systems,” Sensors, vol. 20, no. 17, article 4722, 2020.

Copyright

Copyright © 2021 Jiajun Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
