Abstract

Extreme images are low-quality images taken under extreme environmental conditions such as haze, heavy rain, strong light, and camera shake, which can cause a visual system to fail to recognize its target. Most existing extreme image restoration algorithms handle only one type of degradation; how to effectively recognize all kinds of extreme images is still a challenge. This paper therefore proposes a classification and restoration algorithm for extreme images. Because features differ greatly across extreme images, existing models such as DenseNet have difficulty extracting effective depth features. To solve the classification problem in the algorithm, we propose a Multicore Dense Connection Network (MDCNet). MDCNet is composed of a dense part, an attention part, and a classification part. The dense part uses two dense blocks with different convolution kernel sizes to extract features of different sizes; the attention part uses a channel attention mechanism and a spatial attention mechanism to amplify the effective information in the feature map; the classification part consists mainly of two convolutional layers and two fully connected layers that extract and classify the feature maps. Experiments show that the recall of MDCNet reaches 92.75% on the extreme image dataset, and that the mAP of target detection improves by about 16% after images are processed by the classification and restoration algorithm.

1. Introduction

With the rapid development of computer technology, artificial intelligence can now handle most outdoor monitoring and inspection work. A monitoring camera captures images of the equipment in real time, and the status of the equipment is analyzed through image recognition to realize intelligent monitoring. However, this technology relies heavily on the collected images, and image quality has a great impact on the monitoring results. Because the natural environment is unpredictable, bad weather such as haze, heavy rain, strong winds, and strong light occurs frequently. The extreme images taken in these environments can cause the monitoring system to fail to identify and detect its targets.

In view of the various extreme environments that may be encountered in engineering, we sort out five types of common extreme images: haze, rain streak, low illumination, blur, and raindrop, as shown in Figure 1. There has been much research on how these images can be restored, such as the density-aware single image deraining using a multistream dense network (DID-MDN) [1], the dark channel prior dehazing algorithm [2], and DeblurGAN [3, 4]. Algorithms of this type solve the problem of restoring an extreme image to a clear image; however, they do not solve the problem of recognizing extreme images in a database. Because depth features differ greatly across extreme images, existing models such as DenseNet [5] and ResNet [6] have low classification accuracy on them. Therefore, this paper proposes an extreme image classification and restoration algorithm based on a multicore dense connection network, which can classify extreme images effectively.

The main work of this paper is as follows:
(i) An extreme image dataset is established which consists of clear, haze, rain streak, low illumination, blur, and raindrop images and is the first known dataset about extreme images.
(ii) This paper proposes an algorithm for the classification and restoration of extreme images, which can recognize clear images and extreme images, and adopts the corresponding algorithm to restore the extreme images according to their categories. The algorithm can improve the working ability of outdoor monitoring, inspection, and other systems under various unconventional weather conditions and improve the robustness of the system.
(iii) In order to solve the image classification problem in the extreme image classification restoration algorithm, a Multicore Dense Connection Network (MDCNet) is proposed. The dense blocks with two convolution kernels of different sizes can be used to more effectively extract the depth features of different sizes in the extreme image, which can achieve a better classification effect.

2. Related Work

For the above five types of extreme images, existing algorithms can restore them to a certain extent. For haze images, the commonly used restoration algorithms fall into three groups. Image enhancement methods, such as the contrast-enhancing Retinex algorithm and the histogram equalization algorithm, highlight image details and improve contrast. Image restoration methods, such as the dark channel prior haze removal algorithm [2] and the color attenuation prior algorithm, start from the atmospheric scattering model and estimate certain parameters through priors and other means, so as to infer the original clear image. Deep learning methods, such as the DehazeNet algorithm [7] and the AOD-Net algorithm [8], achieve end-to-end haze removal by building a neural network model. For rain streak images, the DID-MDN rain streak removal algorithm [1] uses density labels to guide rain removal, while the GCANet algorithm [9] aggregates context information through skip connections, making the processed image clearer. For low-illumination images, the low-illumination image enhancement algorithm based on the camera response model [10] obtains enhanced images by studying the physical model of image formation. The RetinexNet algorithm [11] and the EnlightenGAN algorithm [12] use neural networks such as GANs to repair low-illumination images after training on large datasets. For blurred images, the DeblurGAN algorithm [3, 4] removes blur without estimating a blur kernel, using multiple residual blocks and a global residual connection. The SRN-DeBlurNet algorithm [13] builds a multilayer network model and reconstructs clear images through coarse-to-fine optimization. For raindrop images, the attentive GAN for raindrop removal algorithm uses an attention mechanism to make the network pay more attention to the raindrop regions.

As for the classification of extreme images, Yu et al. [14] proposed an algorithm that classifies clear, haze, rain, snow, and dust images according to the color cast factor and the chrominance component. First, the image is converted to the CIE-Lab color space. When the color cast factor D of the image is less than 1.4, it is a clear image; otherwise, it is an extreme image. An extreme image is then classified as a dust image or a haze image based on the value of its chrominance component. This algorithm relies purely on hand-set physical criteria, so it recognizes few image categories, has poor robustness, and is prone to misjudgment.

In the existing work, weather image classification is the task most similar to the extreme image classification in this paper. The weather image classification task first needs to identify the weather category to which the image belongs, such as sunny, cloudy, rainy, snowy, and haze. At present, weather image classification methods can be roughly divided into two categories. The first category uses semantic segmentation, target detection, and other methods to identify auxiliary classification information in the image. For example, there are shadows in a sunny image, clouds in a cloudy image, and wet ground in a rainy image. Based on this, Zhao et al. [15] and Lin et al. [16] proposed multitask models that classify and semantically segment weather images, in which the internal connection between the two tasks improves the classification accuracy. Shi et al. [17] divided the weather image into several parts through Mask R-CNN and then used VGG16 for feature extraction. Lin et al. [18] used semantic segmentation to divide images and extract auxiliary classification information.

The second category focuses on color differences such as contrast and brightness in different weather images and classifies them by extracting artificial features. Zhang et al. [19] designed a new feature extraction method for weather images. The features of sky, shadow, rain bands, snowflake, and dark channel are taken as local features of the image so that each feature can indicate specific weather. The contrast and saturation features are taken as global features that can indicate a variety of weather conditions, and the extracted features are learned and classified by dictionary learning. Roser and Moosmann [20] extracted the features of image contrast, minimum brightness, sharpness, hue, and saturation and classified the image by SVM. The method in [21] uses both artificial features and depth features: features such as the contrast and saturation of the image serve as artificial features, and the AlexNet model extracts the depth features of the image. The two types of features are fused, and a linear classifier is used for classification.

3. Construction of Extreme Image Dataset

In order to effectively solve the problem of extreme image restoration, it is necessary to construct a dataset composed of various extreme images. In the extreme image dataset constructed in this paper, there are six types of images, including five types of common extreme images (haze, rain streak, low illumination, blur, and raindrop) and one type of clear images, with 1200 images for each type, as shown in Figure 2.

3.1. Haze Images

In haze weather, the large amount of suspended particulate matter in the air scatters light, which attenuates the light reflected from objects. At the same time, the scattered light mixes with the light received directly by the observer (camera), which changes the contrast and clarity of the image and causes a large number of details to be lost. Haze images are relatively easy to collect, because there are many such images on various image websites. Our haze weather images are mainly derived from various image search websites and existing weather image datasets [19, 22].

3.2. Rain Streak Images

Under rainy conditions, due to the relatively long exposure time of the camera, the falling tracks of raindrops (namely, rain streaks) may appear on the image. Rain streaks occlude some details of the image, and their effect is similar to noise, which affects the subsequent recognition and detection of the image. Such images are relatively few and difficult to collect. In this paper, these images are partly derived from the synthetic dataset published with the DID-MDN algorithm [1]. Because images synthesized with Photoshop do not fit the natural environment well, we supplemented them with real rain streak images collected from various websites and published weather image datasets.

3.3. Raindrop Images

In a rainy environment, the lens of the camera may be splashed with water, which will result in a lot of clumpy blurs on the captured image, and the information in the image cannot be extracted effectively. Such images are difficult to collect. The images of this paper come from the raindrop image dataset published by Qian et al. [23], which contains clear images and raindrop images taken under the same background environment.

3.4. Low-Illumination Images

Even in weather such as sunny days, the lighting environment, the shooting angle, and other factors may produce local shadows or low illumination on the captured images. We filtered existing sunny-day images so that low illumination is present in every selected image.

3.5. Blur Images

When the camera is taking pictures, shaking caused by external forces or an object moving too fast leads to motion blur on the image. The background of this kind of blurred image is complex and changeable. We collected blurred images from various websites and deblurring task datasets; however, the number of such images is small and cannot meet the training requirements. Therefore, we carry out a convolution operation on the clear images and synthesize motion blur images by randomly adjusting the parameters of the convolution kernel.
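The synthesis step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear motion kernel, the choice of random kernel size and angle as the "parameters", the edge replication at image borders, and the function names `motion_blur_kernel`, `filter2d`, and `random_motion_blur` are all assumptions. Images are grayscale lists of rows so that the example needs only the standard library.

```python
import math
import random


def motion_blur_kernel(size, angle_deg):
    """Return a size x size kernel holding a normalized line through its center."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    kernel = [[0.0] * size for _ in range(size)]
    c = size // 2
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), math.sin(theta)
    for step in range(-c, c + 1):
        col = c + round(step * dx)
        row = c - round(step * dy)  # image rows grow downward
        kernel[row][col] = 1.0
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def filter2d(image, kernel):
    """Same-size filtering with edge replication.

    The line kernel is symmetric about its center, so correlation and
    convolution give the same result here.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                yy = min(max(y + i - r, 0), h - 1)
                for j in range(k):
                    xx = min(max(x + j - r, 0), w - 1)
                    acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def random_motion_blur(image, rng=random):
    """Blur with a randomly chosen kernel size and direction (assumed parameters)."""
    size = rng.choice([3, 5, 7, 9])
    angle = rng.uniform(0.0, 180.0)
    return filter2d(image, motion_blur_kernel(size, angle))
```

Because the kernel is normalized to sum to one, the overall brightness of the image is preserved and only the sharpness changes.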

3.6. Clear Images

Such images are relatively easy to collect; they come mainly from sunny-day image datasets, the rain streak synthesis dataset [1], and the raindrop image dataset [10]. The images are then screened again to ensure that they contain no extreme conditions and that their details are clearly visible.

Because the sources of the above image types are complex, the dataset may contain duplicates. To remove them, we compared the structural similarity (SSIM) of the images pair by pair and deleted duplicate images, that is, those with an SSIM value of 1. The formula of SSIM is as follows:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)},$$

where $x$ and $y$ are the sample images to be compared, $\mu_x$ and $\mu_y$ are the mean values of $x$ and $y$, respectively, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$, respectively, $\sigma_{xy}$ is the covariance of $x$ and $y$, and $c_1$ and $c_2$ are two constants that avoid division by zero.
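As an illustration of this deduplication step, the sketch below evaluates SSIM once over the whole image (means, variances, and covariance taken over all pixels) and drops any image whose SSIM with an already kept image is 1. Several details are assumptions rather than the authors' procedure: the single global window, the constants $c_1 = (0.01 \cdot 255)^2$ and $c_2 = (0.03 \cdot 255)^2$, the small tolerance used for the comparison with 1, and the names `ssim_global` and `remove_duplicates`. Images are flat lists of pixel intensities of equal length.

```python
def ssim_global(x, y, data_range=255.0):
    """SSIM computed once over all pixels of two equally sized images."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # assumed stabilizing constants
    c2 = (0.03 * data_range) ** 2
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return num / den


def remove_duplicates(images, tol=1e-12):
    """Keep the first copy of each image; drop later ones with SSIM equal to 1."""
    kept = []
    for img in images:
        if all(ssim_global(img, other) < 1.0 - tol for other in kept):
            kept.append(img)
    return kept
```

A pairwise comparison like this is quadratic in the number of images, which is acceptable for a dataset of a few thousand images but would need hashing or indexing at larger scale.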

4. Extreme Image Classification and Restoration Algorithm

The flowchart of the extreme image classification and restoration algorithm is shown in Figure 3, and the pseudocode of the algorithm is shown in Algorithm 1. For the input image, the pretrained MDCNet is used to recognize the category of the image. If it is a clear image, it is output directly; if it is an extreme image, the restoration algorithm corresponding to its category label is applied. Because several extreme conditions may be present on one image, the processed image is classified again. If it is recognized as a clear image, it is output; if it is still an extreme image, the restoration operation is repeated. This could cause the algorithm to enter an endless loop; thus, we attach an enhancement count to the input image. The count of the original input image is set to 0 and is increased by 1 after each enhancement. When the count is greater than or equal to 2, the image is output directly as a clear image.

Inputs: OriginalImg: original image
   DeHaze: AOD-Net
   DeRainStreak: DID-MDN
   DeRaindrop: Attentive GAN for raindrop removal
   DeLowIllumination: HDR
   DeBlur: DeblurGAN-v2
Outputs: ProcessedImage

MAX_PASSES = 2  # upper bound on enhancement passes; prevents an endless loop

def ImageRestoration(Img):
    # Classify the image, then apply the restoration algorithm for its class.
    ClassNum = MDCNet(Img)
    if ClassNum == 0:      # clear image: output unchanged
        OutImg = Img
    elif ClassNum == 1:    # haze
        OutImg = DeHaze(Img)
    elif ClassNum == 2:    # rain streak
        OutImg = DeRainStreak(Img)
    elif ClassNum == 3:    # raindrop
        OutImg = DeRaindrop(Img)
    elif ClassNum == 4:    # low illumination
        OutImg = DeLowIllumination(Img)
    elif ClassNum == 5:    # blur
        OutImg = DeBlur(Img)
    return ClassNum, OutImg

ReImage = OriginalImg
for _ in range(MAX_PASSES):
    # Feed the restored image back in so that a second degradation can be handled.
    Num, ReImage = ImageRestoration(ReImage)
    if Num == 0:           # recognized as clear: stop early
        break
ProcessedImage = ReImage
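The control flow of Algorithm 1 can also be written as a small, self-contained function with a dispatch table. The sketch below is only an illustration of the loop: `restore_image`, `toy_classify`, and `TOY_RESTORERS` are hypothetical names, and the toy classifier and restorers are stand-ins for MDCNet and the restoration networks. An "image" is modeled as a frozenset of degradation names so that the behavior can be checked without any model.

```python
LABELS = {
    "clear": 0,
    "haze": 1,
    "rain_streak": 2,
    "raindrop": 3,
    "low_illumination": 4,
    "blur": 5,
}


def restore_image(image, classify, restorers, max_passes=2):
    """Classify, restore, and re-classify for at most max_passes passes."""
    for _ in range(max_passes):
        label = classify(image)
        if label == 0:  # recognized as clear: stop early
            break
        image = restorers[label](image)
    return image


def toy_classify(image):
    """Stand-in for MDCNet: report the lowest-numbered degradation present."""
    return min((LABELS[name] for name in image), default=0)


def make_toy_restorer(name):
    """Stand-in for a restoration network: remove one degradation."""
    return lambda image: image - {name}


TOY_RESTORERS = {
    LABELS[name]: make_toy_restorer(name) for name in LABELS if name != "clear"
}
```

With two degradations present, two passes remove both; with three, the loop still terminates after two passes and returns the partially restored image, which mirrors the enhancement-count limit described in the text.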

In the extreme image classification and restoration algorithm, we apply existing extreme image restoration algorithms [24]. The AOD-Net proposed by Li et al. [8] is used for haze images, the DID-MDN algorithm proposed by Zhang et al. [1] is applied to remove rain streaks from images [25], the Attentive GAN for raindrop removal algorithm is used for raindrop images, the HDR algorithm [10, 26–28] is employed for low-illumination images, and the DeblurGAN-v2 algorithm proposed by Kupyn et al. [4] is used for blurred images.

5. Multicore Dense Connection Network (MDCNet)

In the extreme image classification and restoration algorithm, the key challenge is how to effectively recognize and classify all kinds of images. With the rapid development of artificial intelligence technology, CNNs can now handle classification tasks for all kinds of images, and experimental results show that they can recognize various extreme images and clear images. However, their classification accuracy is low and cannot meet the needs of the application. Extreme images contain many features of different sizes, which makes them difficult to label, and their environmental backgrounds are relatively complex, which makes color rules difficult to find; as a result, neither of the two weather image classification approaches works well for this task. Therefore, we propose a Multicore Dense Connection Network (MDCNet). Two dense blocks with different convolution kernel sizes are used to extract features of different scales from the original image, the extracted feature maps are fused, and finally two convolutional layers are used to extract features from the fused feature map. Experimental results show that the recall of MDCNet is 92.75% on the extreme image dataset and that it can effectively recognize various extreme images and clear images.

MDCNet is mainly composed of dense part, attention part, and classification part, as shown in Figure 4. Due to the large differences in features such as rain streaks, haze, blur, and low illumination, Dense Part uses two dense blocks with different convolution kernel sizes to extract depth features from the original image. For smaller size features such as rain streaks, the feature extraction is performed by the dense block of the small convolution kernel; for the larger size features such as haze and shadows, the feature is extracted by the dense block of the large convolution kernel. In this way, the depth features of different sizes can be effectively extracted from extreme images, the overall features on the images can be grasped while paying attention to the small feature differences, and the extreme images can be effectively recognized and classified.

Each dense block is mainly composed of 6 bottleneck blocks, 6 transition layers, and a channel attention module. Each bottleneck block is connected by a transition layer, as shown in Figure 5. In the dense block, the input image passes through each bottleneck block in turn, in which the input image of bottleneck block 5 is the image after stitching by the outputs of transition layers 1 and 4, and the input image of bottleneck block 6 is the image after stitching of the outputs of transition layers 2 and 5. Finally, the output image of transition layers 1–6 is concatenated with the input image of the dense block after upsampling, and the spliced-up feature image is sent to the attention module to assign different weights to each channel. Then, after one convolution, the final output feature image of the dense block is obtained. In dense part, the difference between each dense block lies in the size of the convolution kernel of the second convolutional layer used in the bottleneck block. The size of the convolution kernel of dense block 1 is 3 × 3, and the size of the convolution kernel of dense block 2 is 7 × 7.

In the attention part, as shown in Figure 6, the input is the concatenation of the output feature maps of the two dense blocks, that is, Concat [DB1, DB2]. The spatial attention module assigns weights in the spatial dimensions to the concatenated feature map to better highlight the more important regions. In the process of spatial attention, we found that too many channels lead to information redundancy, so that important information cannot be amplified effectively in the spatial dimensions. Therefore, we place a convolutional layer in front of the spatial attention mechanism to compress the feature map into 3 channels and remove some invalid information. For the classification part, we refer to the VGG16 model with the Batch Normalization (BN) layer; it is mainly composed of two convolutional layers and two fully connected layers. The specific structure is as follows: the first convolution layer has 3 input channels, 64 output channels, and a 3 × 3 convolution kernel; the second convolution layer has 64 input channels, 24 output channels, and a 3 × 3 convolution kernel; the Average Pooling layer has a pooling kernel size of 5; the first fully connected layer has an input size of 249696 and an output size of 512; the second fully connected layer has an input size of 512 and an output size of 6. The BN layer speeds up the training and convergence of the network, mitigates exploding and vanishing gradients, and reduces overfitting.
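The input size of the first fully connected layer, 249696, can be checked with a short calculation. The sketch assumes details that the text does not state: the feature map entering the classification part keeps the 512 × 512 input resolution given in Section 6.2, both 3 × 3 convolutions use stride 1 and padding 1, and the average pooling uses a stride equal to its kernel size of 5. Under these assumptions the numbers agree with the paper.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 512                                      # assumed input resolution
side = output_size(side, kernel=3, padding=1)   # conv 3 -> 64 channels, stays 512
side = output_size(side, kernel=3, padding=1)   # conv 64 -> 24 channels, stays 512
side = output_size(side, kernel=5, stride=5)    # average pooling, becomes 102

channels = 24
fc_input = channels * side * side
print(side, fc_input)  # prints "102 249696"
```

The match suggests that the flattened 24 × 102 × 102 feature map is what feeds the first fully connected layer.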

6. Experimental Results and Discussion

In this section, we present the restoration algorithms used in the extreme image classification and restoration algorithm and verify their effectiveness. In addition, various models are compared on the extreme image dataset, and MDCNet ablation experiments are performed; the results are evaluated by recall and precision.

6.1. Extreme Image Restoration Algorithm

Among the various extreme image restoration algorithms currently available, several algorithms with better results have been screened out to handle the extreme images, as shown in Table 1. At the same time, some representative images were selected from the extreme image dataset to verify various algorithms. The comparison results of images are shown in Figure 7 and the test results are compared using SSIM indicators, as shown in Table 2.

In Table 1, for haze images, AOD-Net [8] is based on a CNN and directly generates haze-free images from haze images to achieve end-to-end dehazing; for rain streak images, the DID-MDN algorithm [1] uses the density of rain on the image to guide rain streak removal. For raindrop images, the raindrop removal algorithm [10] is based on a GAN, and its encoder can effectively repair the blur caused by raindrops on the image. For low-illumination images, the HDR algorithm [26] supplements the brightness of low-illumination areas to restore lost information. For blurred images, the DeblurGAN-v2 algorithm [4], an improved version of DeblurGAN [3], removes motion blur from the image more flexibly and efficiently.

As shown in Figure 7 and Table 2, the abovementioned extreme image restoration algorithms give good results. Compared with the original image, the SSIM value of each restored image is above 0.6. Visually, the restored images have higher contrast and clearer details. After processing, the contrast of the haze image is improved and its details are more visible; the rain streaks on the rainy image are removed without leaving noise; the dark areas on the low-illumination image become clear; and the blurred objects on the blurred image regain their original outlines.

6.2. Training Details of MDCNet

MDCNet is trained on a Tesla P100 GPU under the PyTorch framework, and each input image is flipped horizontally or vertically with a random probability of 0.5. The input image size is 512 × 512, and the batch size is 8. The Adam algorithm is used for optimization, and the cross-entropy loss is used as the loss function. The formula is as follows:

$$L = -\sum_{i} y_i \log(\hat{y}_i),$$

where $y_i$ represents the true value and $\hat{y}_i$ is the predicted value.
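For a single image, the cross-entropy loss over the six classes can be illustrated as below. This is a sketch under stated assumptions, not the training code: the true label is taken to be one-hot, the predicted distribution is taken to be a softmax over the network outputs, and the function names and the clamping constant `eps` are illustrative.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -sum_i y_i * log(y_hat_i) for one sample."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


one_hot_haze = [0, 1, 0, 0, 0, 0]               # class 1 of 6
uncertain = softmax([0.0] * 6)                  # uniform prediction
confident = softmax([0.0, 10.0, 0.0, 0.0, 0.0, 0.0])

loss_uncertain = cross_entropy(one_hot_haze, uncertain)   # equals log(6)
loss_confident = cross_entropy(one_hot_haze, confident)   # close to 0
```

A uniform prediction over six classes gives a loss of log 6, which is a useful reference point when reading early training curves.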

The extreme image dataset is used for training, and the model with the best recall on the validation set is saved as the final model. The BN layer [29] is used in MDCNet, and the formula for the BN layer is as follows:

$$\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}, \qquad y = \gamma \hat{x} + \beta,$$

where $\mu$ is the mean value of $x$ in a batch, $\sigma$ is the standard deviation of $x$ in a batch, $\epsilon$ is a small constant that prevents division by zero, and $\gamma$ and $\beta$ are learnable parameters. Because the output of the BN layer depends on the batch, the BN layer must be switched to inference mode, in which it no longer uses batch statistics, during validation and testing.
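The dependence of the BN output on the batch can be seen in a small scalar example. The sketch is an illustration only: it normalizes a list of scalars rather than feature maps, and the `stats` argument, which stands for fixed statistics used at validation and test time, is an assumed simplification of the running averages that frameworks maintain.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5, stats=None):
    """Normalize a batch of scalars.

    With stats=None the mean and variance come from the batch itself
    (training behavior); otherwise the fixed (mean, variance) pair is used
    (inference behavior).
    """
    if stats is None:
        mu = sum(batch) / len(batch)
        var = sum((v - mu) ** 2 for v in batch) / len(batch)
    else:
        mu, var = stats
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in batch]


# The same input value 1.0 is normalized differently in different batches...
train_a = batch_norm([1.0, 2.0, 3.0])[0]
train_b = batch_norm([1.0, 10.0, 100.0])[0]

# ...but identically once the statistics are fixed.
fixed = (2.0, 1.0)
eval_a = batch_norm([1.0, 2.0, 3.0], stats=fixed)[0]
eval_b = batch_norm([1.0, 10.0, 100.0], stats=fixed)[0]
```

In PyTorch, which the paper uses, calling `model.eval()` switches BN layers to their stored running statistics in the same spirit.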

6.3. Results on Extreme Image Dataset

In the experiments, we used the following models for testing (Table 3).

In order to measure the performance of each model, we use the recall (R) and precision (P) as well as the overall average recall (AR) and average precision (AP) indicators for evaluation. Recall and precision are calculated as follows:

$$R = \frac{TP}{TP + FN}, \qquad P = \frac{TP}{TP + FP},$$

where TP (true positive) means that the image is labeled as a positive sample and is classified as a positive sample; FN (false negative) means that the image is labeled as a positive sample but is classified as a negative sample; and FP (false positive) means that the image is labeled as a negative sample but is classified as a positive sample.
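For a multiclass problem, these quantities are usually read off a confusion matrix, treating each class in turn as the positive class. The sketch below assumes that rows are true labels and columns are predicted labels, and that AR and AP are unweighted means over the classes; the paper does not state how the averages are formed, so the averaging and the function names are assumptions.

```python
def per_class_metrics(confusion):
    """Return (recalls, precisions), one entry per class.

    confusion[t][p] counts images with true class t predicted as class p.
    """
    n = len(confusion)
    recalls, precisions = [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # class k, predicted otherwise
        fp = sum(confusion[r][k] for r in range(n)) - tp   # other classes, predicted k
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
    return recalls, precisions


def average(values):
    """Unweighted mean over classes (assumed definition of AR and AP)."""
    return sum(values) / len(values)


# Toy two-class example: 10 images per class.
toy_confusion = [[8, 2],
                 [1, 9]]
toy_recalls, toy_precisions = per_class_metrics(toy_confusion)
toy_ar, toy_ap = average(toy_recalls), average(toy_precisions)
```

In the toy example the recalls are 0.8 and 0.9, while the precisions are 8/9 and 9/11, which shows that the two indicators can rank classes differently.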

According to the experimental results in Table 4, models such as DenseNet, ResNet, and VGG can recognize extreme images, but their accuracy is limited; among them, VGG performs better in this task because it has better translation invariance. Since the environmental background of extreme images is more complicated than that of weather images, the fusion model for weather image classification proposed by Qiang G et al. cannot achieve good results in this task. Especially for images with relatively complicated backgrounds, such as rain streak and blurred images, extracted features such as sky and color space are not very helpful for classification. The performance of MDCNet on the extreme image dataset is better than that of DenseNet, ResNet, and VGG. Because it has two dense blocks with different convolution kernel sizes, it retains small features while also capturing large-scale features. As a result, its average recall reaches 92.75%. In the ablation experiment, the average recall and average precision of MDCNet with a single dense block exceed 90%, but because it cannot extract the full range of depth features, it shows good classification performance on only certain types of image, and its performance is still lower than that of MDCNet with two dense blocks.

6.4. Target Detection Experiment

In order to demonstrate the effect of the extreme image classification and restoration algorithm on target detection, several commonly used target detection algorithms are applied to the clear images, the synthesized extreme images, and the restored images. The test set of the VOC2007 dataset was used for detection; it contains 4950 clear images and 20 categories of targets. On this basis, four kinds of extreme images are synthesized: rain streak images are produced by adding rain streaks to the image with Photoshop [30]; blurred images are obtained by convolving the image with a blur kernel; haze images are generated based on the atmospheric scattering model [31]; and low-illumination images are created by reducing the contrast and brightness of clear images. The resulting images are shown in Figure 8. The abovementioned extreme images and clear images are randomly mixed in equal amounts to obtain a mixed image dataset of 4950 images, and the target detection algorithms (YoloV3 [32], SSD [33], EfficientDet [34], and Faster-RCNN [35]) are used to detect the mixed image dataset, the original clear dataset, and the dataset after classification and restoration. The mAP value of each algorithm is shown in Figure 9. As can be seen from the detection results in Table 4, extreme images have a great impact on the target detection algorithms. On the synthesized extreme image dataset, mAP decreased by 25.98% on average, while the mAP value of each target detection algorithm increased by about 16% after the images were processed by the classification and restoration algorithm.
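Two of the synthesis steps can be sketched briefly. The haze step uses the standard form of the atmospheric scattering model, I = J·t + A·(1 − t), where J is the clear image, t the transmission, and A the atmospheric light; using a single constant t and A for the whole image is a simplifying assumption. The low-illumination step is a hypothetical linear contrast and brightness reduction, since the paper does not give its formula. Images are grayscale lists of rows with intensities in [0, 1], and all names and default values are illustrative.

```python
def add_haze(image, transmission=0.6, airlight=0.9):
    """Apply I = J * t + A * (1 - t) with constant t and A (assumed)."""
    if not 0.0 <= transmission <= 1.0:
        raise ValueError("transmission must lie in [0, 1]")
    return [
        [j * transmission + airlight * (1.0 - transmission) for j in row]
        for row in image
    ]


def reduce_illumination(image, contrast=0.5, brightness=-0.1):
    """Scale contrast around mid-gray, shift brightness, clip to [0, 1]."""
    return [
        [min(1.0, max(0.0, contrast * (j - 0.5) + 0.5 + brightness)) for j in row]
        for row in image
    ]


clear = [[0.2, 0.8], [0.5, 1.0]]
hazy = add_haze(clear, transmission=0.5, airlight=1.0)
dark = reduce_illumination(clear)
```

With t = 1 the haze step returns the clear image unchanged, and with t = 0 every pixel becomes the atmospheric light, which matches the two limiting cases of the model.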

7. Conclusion

Current extreme image restoration algorithms solve only the problem of image restoration, not that of recognizing extreme images in a dataset. In view of this, we propose an extreme image classification and restoration algorithm, which recognizes the category of each image and applies the corresponding restoration algorithm to it. To solve the classification problem in the algorithm, and in view of the differences in feature scale across extreme images, we propose a Multicore Dense Connection Network (MDCNet) for extreme image classification. Experiments show that, compared with ResNet, DenseNet, VGG, and other models, MDCNet performs better on the extreme image dataset and can meet the needs of the extreme image classification and restoration algorithm. In addition, the algorithm greatly improves target detection on extreme images and can effectively improve the accuracy of target detection in extreme environments.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant nos. 61502297, 41975152, and 51707113.