Abstract

An imaging device is susceptible to factors such as subject motion and the shooting environment, so complex, spatially variant blur often appears in the final image. In most cases we have neither the opportunity to re-shoot a clear image nor advance knowledge of the blur parameters. The purpose of this study is therefore to propose a blind motion blur removal algorithm for character images based on the gradient domain and deep learning. Deep learning discovers the inherent regularities and representation levels of sample data, and the information obtained in the process greatly aids the interpretation of data such as text, images, and sound. The algorithm preprocesses the image with guided filtering and L0 filtering and feeds the preprocessed gradient-domain image blocks into a designed convolutional neural network for training. The trained model parameters are then extracted to estimate the blur kernel, and the image is deblurred using a TV regularization term during restoration. Experiments show that the algorithm effectively suppresses the ringing effect, reduces noise, and removes motion blur well. The MLP method, the edge detection method, and the proposed method are compared: the PSNR values of the three motion blur removal methods are 26.49, 27.51, and 29.18, respectively. It can be seen that the motion blur removal method proposed in this study effectively removes image motion blur.

1. Introduction

With the rapid development of science and technology, photographic equipment has found its way into every aspect of human life and become a commonplace part of daily work and living. Journalists cannot do without professional cameras, and almost everyone now carries a smartphone; electronic surveillance cameras are ubiquitous, satellites provide large numbers of remote-sensing images, and drones are becoming more and more popular [1]. With highly developed network technologies, these imaging devices produce large numbers of images at all times. These images not only enrich and facilitate people's lives but also play a pivotal role in improving production quality and safeguarding property and personal safety [2]. However, in the actual imaging process, factors such as focal length, jitter, target motion, and atmospheric turbulence often leave the obtained images blurred to some degree [3]. The image quality degradation and loss of detail caused by blurring not only inconvenience people's production and life but, in serious cases, also cause economic losses and security loopholes [4]. In most cases, people do not have the opportunity to re-shoot clear images: first, the cost of replacing hardware equipment is high; second, the objective conditions are harsh, as with images captured at news events, images of suspects caught on surveillance, or images of illegally driven vehicles [5]. In such cases, image blur must be eliminated by technical means to restore a clear image and recover image quality, which has broad market prospects [6]. At the same time, in blind image deblurring, the blur kernel must be estimated before a clear image can be restored. Blur kernel estimation is one of the core problems of blind image deblurring and also one of its difficulties.

In recent years, thanks to the continuous improvement of computer hardware, advances in computer vision theory, and the rapid development of machine learning, especially deep learning, image deblurring technology has made considerable breakthroughs in theory and application and plays an important role in medicine, transportation, astronomy, criminal investigation, and the military [7]. With the continuous development of related technologies, image deblurring is bound to be more widely applied and to undertake more important work to meet ever stricter technical requirements [8]. The blurring process of an image is usually modeled as the convolution of a blur kernel with a clear image [9], so single image deblurring can be seen as the inverse of the blurring process, namely, deconvolution [10]. Solving for the corresponding clear image from a blurred image is a typical ill-posed problem: the deconvolution equation does not have a unique solution, which is the most difficult aspect of image restoration [11]. Early image deblurring studies often assumed that the degradation model or blur kernel was known and focused mainly on mathematical models and algorithms; restoring the image under the premise that the blur kernel is known is called nonblind deblurring [12]. In most cases, however, the blur kernel is not provided in advance, and the only information available is an image with unknown blur. Compared with deblurring under a known blur kernel, blind deblurring is a severely ill-posed and more challenging practical problem with a wider range of applications [13], so blind deblurring of a single image has received more and more attention from the academic community in recent years.
In addition, early image deblurring studies focused more on uniform blur, whereas the blur encountered in actual production and life is usually variable, or nonuniform [14]. In other words, each pixel of a variably blurred image is blurred differently. Furthermore, when the focal length is set incorrectly during shooting, out-of-focus blur appears in parts of the image [15]. Compared with invariant blur, variable blur is more common, and since a single blur model cannot be adopted for the whole image, recovery is more difficult [16]. Exploring image deblurring technology, especially blind blur removal under the two conditions above, an unknown blur kernel and a spatially variable blur kernel, has great theoretical research significance in academia as well as high practical application value in society [17]. Image restoration technology enables lost or degraded image content to be recovered from the degraded image.

The innovations of this paper are as follows. (1) A blind image motion blur removal algorithm based on gradient domain and deep learning is proposed. (2) The algorithm combines image gradient domain preprocessing and CNN network methods, avoiding the shortcomings of directly using deep learning methods and ignoring image prior information. (3) Compared with other algorithms, this algorithm can effectively suppress the ringing effect and reduce noise and has better motion blur effect.

This study is organized as follows. Section 1 summarizes the research background and significance of the variable blur blind removal problem, briefly states the innovations of this study, and explains the main research content and arrangement. Section 2 first illustrates the current research status of variable blur blind removal in detail, then briefly introduces the principle of deblurring and image preprocessing, describes the gradient domain method and the gradient-domain-based blind deblurring algorithm in detail, and finally introduces the deep-learning-based blind deblurring algorithm. The three common blur types studied here, namely, Gaussian blur, motion blur, and out-of-focus blur, are briefly introduced; the convolutional neural network method is described in detail; and an image motion blur blind removal algorithm based on the gradient domain and deep learning is proposed, supporting the methods that follow. Section 3 first introduces the dataset of this study, then the hardware configuration and software environment of the experiments, and finally the objective image quality evaluation index used. Section 4 first shows the effect of image preprocessing on noisy blurred images, then compares the influence of filter size and filter count on the experimental results, presents the deblurring results, and finally compares the PSNR values of images processed by different methods. Section 5 summarizes the main research work of this study together with the advantages and disadvantages of the method.

2. Proposed Method

2.1. Related Work

Zhang et al. proposed a novel two-channel convolutional neural network (DC-CNN) framework for accurate spectral-spatial classification of hyperspectral images (HSI). In this framework, a one-dimensional CNN automatically extracts hierarchical spectral features, a two-dimensional CNN extracts spatially related hierarchical features, and a softmax regression classifier then combines the spectral and spatial features to predict the classification result. To overcome the limited training samples available for HSI, they proposed a simple data augmentation method that effectively improved HSI classification accuracy. For comparison and verification, they tested the proposed method and three other deep-learning-based HSI classification methods on two real-world HSI datasets. The experimental results show that their DC-CNN-based method clearly outperforms the state-of-the-art methods [18].

Kou et al. noted that the guided image filter (GIF) is a well-known local filter because of its edge-preserving property and low computational complexity. Unfortunately, GIF may suffer from halo artifacts because the local linear model used in GIF cannot adequately represent the image near certain edges. They proposed a gradient domain GIF that incorporates an explicit first-order edge-aware constraint, which allows edges to be better preserved. To illustrate the efficiency of the proposed filter, their gradient domain GIF was applied to single image detail enhancement, tone mapping of high dynamic range images, and image saliency detection. Theoretical analysis and experimental results show that their gradient domain GIF produces better composite images, especially near edges where halos appear in the original GIF [19].

Navarro et al. regarded motion blur as a fundamental cue in the perception of moving objects. The phenomenon appears as a visible trail along the trajectory of the object and results from the combination of relative motion and light integration that occurs in film and electronic cameras. In their work, they analyzed the mechanisms that generate motion blur in recording devices and the methods for simulating it in computer-generated images. Light integration over time is one of the most expensive processes to simulate in high-quality rendering, so they reviewed existing algorithms in depth and categorized them within a formal model to highlight their differences, strengths, and limitations. They concluded the report by presenting a number of alternative classifications intended to help the reader determine the best technique for a particular situation [20].

2.2. Image Deblurring
2.2.1. Image Deblurring Principle

The image is blindly deblurred; that is, the blur kernel is unknown, and the clear image x and the blur kernel k must both be recovered from the blurred image y. Once the blur kernel is solved, the blind deblurring problem turns into a nonblind deblurring problem. There are many causes of image blurring, including optical, atmospheric, artificial, and technical factors, so deblurring images is of great significance in daily production and life. Image restoration is a very important technology in the field of image processing. Like other basic image processing technologies such as image enhancement, it aims to improve visual quality to a certain degree. The difference is that the image restoration process is actually an estimation process: the degraded image must be restored according to a specific image degradation model. The blurring process is usually modeled as

y = k ⊗ x + n,

where ⊗ denotes the convolution operation and n is additive noise, and the restored image is obtained by minimizing an energy of the form

x̂ = arg min_x ||y − k ⊗ x||₂² + λρ(x).

The first part is the data fidelity term, which is constrained by the L2 norm; the second part ρ(x) is the a priori regularization term, which is related to the original clear natural image; and λ is the parameter that balances the data fidelity term against the regularizer. The selection of the regularization term depends on the selected image prior.
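As a hedged illustration of this formulation, the closed-form frequency-domain solution under a simple L2 prior (ρ(x) = ||x||²) can be sketched in a few lines of numpy. This is a minimal stand-in for the TV-regularized restoration used later in the paper, it assumes circular boundary conditions, and the function names are our own:

```python
import numpy as np

def blur(x, k):
    """Degradation model y = k ⊗ x: circular convolution via FFT (noise omitted)."""
    K = np.fft.fft2(k, s=x.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(x) * K))

def deconv_l2(y, k, lam=0.01):
    """Non-blind restoration: minimize ||y - k⊗x||^2 + lam*||x||^2.

    The minimizer has a closed form in the frequency domain:
    X = conj(K) * Y / (|K|^2 + lam).
    """
    K = np.fft.fft2(k, s=y.shape)
    X = np.conj(K) * np.fft.fft2(y) / (np.abs(K) ** 2 + lam)
    return np.real(np.fft.ifft2(X))
```

With a known kernel and a small λ, the restored image is much closer to the original than the blurred input is, which is the nonblind setting described above; blind deblurring must first estimate k.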

2.2.2. Image Preprocessing

Image preprocessing separates out each text image and hands it to the recognition module for recognition. In this study, guided filtering is applied first, and the filtered image serves as a base image to reduce the effects of noise and unwanted detail. Then the L0-filtered gradient-domain image and the corresponding clear image are taken as samples to train the designed CNN. Both the guided filtering and the L0 filtering are applied to the clear image, which suppresses irrelevant details, enhances the sharp edges of the image, and improves the robustness of the blur kernel estimation. Enhancing the useful information in an image, which can be a distortion process, aims to improve the visual effect of the image.
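As a sketch of the first preprocessing step, a self-guided filter (guide image equal to the input, following He et al.'s guided-filter formulation) can be written in plain numpy. The L0 filtering step is omitted here, and the window radius r and regularization eps are illustrative values, not the paper's settings:

```python
import numpy as np

def box_mean(a, r):
    """Mean over a (2r+1)x(2r+1) window with edge padding, via separable cumulative sums."""
    k = 2 * r + 1
    pad = np.pad(a, r, mode='edge')
    c = np.cumsum(pad, axis=0)
    rows = (c[k - 1:] - np.vstack([np.zeros((1, c.shape[1])), c[:-k]])) / k
    c = np.cumsum(rows, axis=1)
    return (c[:, k - 1:] - np.hstack([np.zeros((c.shape[0], 1)), c[:, :-k]])) / k

def guided_filter(I, p, r=2, eps=1e-2):
    """Edge-preserving smoothing of p with guide I (self-guided when I is p).

    Fits a local linear model p ≈ a*I + b in each window, then averages the
    coefficients; eps controls how strongly flat regions are smoothed.
    """
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    corr_Ip, corr_II = box_mean(I * p, r), box_mean(I * I, r)
    a = (corr_Ip - mean_I * mean_p) / (corr_II - mean_I ** 2 + eps)
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)
```

On a constant image the filter is an identity, while on textured regions it suppresses small details, which is the property the preprocessing stage relies on.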

2.3. Image Blind Deblurring Algorithm Based on Gradient Domain
2.3.1. Blind Deblurring Algorithm Based on Gradient Domain Image

The gradient domain method is essentially a transform domain method and performs three basic steps: image space conversion, gradient field modification, and image reconstruction. This section describes each part of the proposed gradient-domain natural image enhancement method in detail.

(1) Image space conversion: the first step in the gradient domain method is to transform the image from the original space to the gradient space. In order to better detect edges and suppress noise, the Sobel gradient operator is used to calculate the gradient of the original image f(x, y).

(2) Gradient field modification: the key to obtaining a processing effect with a transform domain method is to process the information in the transform space. Just as the frequency domain method multiplies the spectrum by a transfer function to process frequency information, the gradient domain method multiplies the gradient field by a transformation function to obtain the desired gradient field. If the gradient field of the original image f(x, y) is G(x, y) and the desired gradient field is Gnew(x, y), the processing can be expressed as

Gnew(x, y) = T(||G(x, y)||) · G(x, y),

where T is the transformation function and ||G(x, y)|| is the gradient magnitude at pixel (x, y).

Modifying the gradient field requires choosing a transformation function suited to the desired result. A generic choice does not necessarily achieve the best effect for a specific problem, so a corresponding transformation strategy must be formulated according to the actual situation and the gradient-domain enhancement method applied flexibly. The transformation function adopted here is characterized by the fact that its amplitude at and near zero does not exceed 1; as its argument increases, its value rises sharply in one region to well above 1 and then decreases rapidly toward zero. Values of T not exceeding 1 at the two ends of the axis weaken the corresponding gradients, while values greater than 1 in the intermediate range enhance them, and the enhancement interval is small. Considering the processing effect and the complexity of the function, a function of this type is used as the transformation function of this study. To make the attenuation more pronounced, the enhancement magnification larger, and the rises and falls faster, the function is self-multiplied, following the principle that repeated products of values less than 1 become smaller while repeated products of values greater than 1 become larger. The transformation function is thus parameterized by constants a, b, and c with a > 1, b > 1, and c > 1.

The function T has two thresholds N1 < N2 such that T ≤ 1 on [0, N1], T > 1 on (N1, N2), and T < 1 on (N2, +∞). The interval [0, N1] is called the smooth interval, (N1, N2) the enhancement interval, and (N2, +∞) the attenuation interval. The enhancement magnitude and the size of the enhancement interval can be controlled by the constants a, b, and c. Because acquisition equipment differs in performance, the gradient distributions of the acquired character images also differ, and the constants should be tuned accordingly.

(3) Image reconstruction: after the original gradient field is processed, the image must be reconstructed from the new gradient field. In theory, reconstructing the image is as simple as integrating the new gradient function Gnew(x, y). However, the modified gradient field may not be conservative, in which case no image corresponds to this gradient field exactly and direct integration does not give the result. We therefore use the principle of least squares to reconstruct the image from the gradient field. Image reconstruction technology was initially used in radiological medical equipment to display images of parts of the human body, namely, computed tomography, or CT for short, and it has gradually been applied in many fields.

Let I(x, y) be the image whose gradient field is closest to Gnew(x, y); then, I can be obtained by minimizing the following integral:

E(I) = ∬ ||∇I(x, y) − Gnew(x, y)||² dx dy.

The Poisson equation can be obtained by substituting this functional into the Euler–Lagrange equation:

∇²I = div(Gnew),

where ∇² is the Laplacian operator and div is the divergence.
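A minimal numpy sketch of this least-squares reconstruction solves the Poisson equation in the frequency domain. Periodic boundary conditions are a simplifying assumption of this sketch (the paper does not specify the boundary handling), and the function name is our own:

```python
import numpy as np

def poisson_reconstruct(gx, gy):
    """Least-squares image from a gradient field: solve lap(I) = div(G) via FFT.

    gx, gy: target horizontal/vertical gradient fields (forward differences).
    The divergence uses backward differences (the adjoint of the forward
    gradient), so the discrete Laplacian has Fourier multiplier
    (2cos(wx)-2) + (2cos(wy)-2) under periodic boundaries.
    """
    H, W = gx.shape
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    fx = np.fft.fftfreq(W)[None, :]
    fy = np.fft.fftfreq(H)[:, None]
    denom = (2 * np.cos(2 * np.pi * fx) - 2) + (2 * np.cos(2 * np.pi * fy) - 2)
    denom[0, 0] = 1.0           # DC term is unconstrained by gradients
    F = np.fft.fft2(div) / denom
    F[0, 0] = 0.0               # fix the free constant (mean) to zero
    return np.real(np.fft.ifft2(F))
```

Because a gradient field only determines the image up to an additive constant, the reconstruction is compared after subtracting the mean.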

2.4. Blind Deblurring Algorithm Based on Deep Learning

In the field of blind image deblurring, more and more methods are based on deep neural networks (DNN). A DNN is used to classify the blur type of a blurred image, and according to the classification result, a corresponding generalized regression neural network (GRNN) performs blur kernel parameter regression.

2.4.1. Description of the Blur Types

(1) Gaussian blur: the Gaussian blur PSF is a two-dimensional Gaussian function that adjusts the image pixel values to follow a two-dimensional normal distribution. Gaussian blur is generally used to simulate atmospheric turbulence. From a mathematical point of view, Gaussian blurring is the convolution of the image with a normal distribution; since the normal distribution is also known as the Gaussian distribution, this technique is called Gaussian blur.

(2) Motion blur: the linear motion blur PSF describes relative linear motion between the subject and the camera.

(3) Defocus blur: the defocus blur PSF can be represented by a disc-shaped mean filter; it generally models subjects at different depths of field photographed with the same focal length, which results in different degrees of blurring.
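The three PSFs above can be sketched in numpy as follows; the sizes, lengths, angles, and radii are illustrative parameters, and the line-drawing used for the motion kernel is a simple rounding scheme rather than the paper's exact construction:

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Gaussian blur kernel (simulates, e.g., atmospheric turbulence)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def motion_psf(length, angle_deg, size):
    """Linear motion blur kernel: a line of the given length and angle."""
    k = np.zeros((size, size))
    c = (size - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    for t in np.linspace(-(length - 1) / 2.0, (length - 1) / 2.0, 4 * length):
        r = int(round(c + t * np.sin(theta)))
        cc = int(round(c + t * np.cos(theta)))
        if 0 <= r < size and 0 <= cc < size:
            k[r, cc] = 1.0
    return k / k.sum()

def defocus_psf(radius, size):
    """Out-of-focus blur kernel: a disc-shaped mean filter."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = ((ax[:, None] ** 2 + ax[None, :] ** 2) <= radius ** 2).astype(float)
    return k / k.sum()
```

Each kernel is normalized to sum to one so that convolution preserves the overall image brightness.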
2.4.2. Fuzzy Kernel Estimation Based on Deep Network

The algorithm divides the image into overlapping patches and trains the network to estimate the blur parameters of each patch. In the first stage, a deep belief network is used for pretraining, and the DNN is fine-tuned to identify the blur type. In the second stage, multiple GRNNs are trained to predict the parameters of the corresponding blur types. In the blur classification and estimation part, the algorithm is divided into three steps: blurred-block feature extraction, DNN training, and GRNN training. The flowchart of the algorithm is shown in Figure 1:

Among them, B1, B2, and B3 are Gaussian, motion, and defocus blur, respectively, and P1, P2, and P3 represent the corresponding predicted values.

2.5. Deep Convolutional Network Structure

The method is divided into three processes. The first step uses a CNN to estimate the motion distribution at the image block level and expands the candidate set of blur kernels by computing rotated versions of the image. The second step smooths the numerically discretized motion blur parameter field based on a Markov random field model. The third step removes the variable motion blur from the whole image using an image block prior.

The variable motion blur in the algorithm is defined as follows. The motion blur length l(p) and angle θ(p) of the blurred image at pixel position p ∈ Ω (Ω is the image domain) are summarized by the motion vector m(p) = (l(p), θ(p)). Each motion vector defines a blur kernel that is nonzero only along the motion trajectory. The blurred image is generated as y = k(M) ⊗ x, obtained by convolving the original image x with the variable blur kernel k(M), where k(M) is determined by the motion parameter field M = {m(p) | p ∈ Ω}.

The algorithm is divided into three steps: block-level motion distribution estimation, motion parameter field construction, and variable blur removal.

The specific description of the steps is as follows.

2.5.1. Block-Level Motion Distribution Estimation
(1) Training a CNN to estimate the motion distribution: the small block centered at pixel p is denoted Ψp. The task of blur kernel identification is to estimate the following conditional probability distribution, called the "motion distribution":

P(m(p) = (l, θ) | Ψp),

where l ∈ Sl and θ ∈ Sθ, and Sl and Sθ are the sets of candidate values of the length and angle of the motion blur kernel, respectively. For the case l = 1, all values of θ correspond to the same blur kernel, that is, the unit kernel.

(2) Expanding the candidate blur kernels: denote by yθ the version of the blurred image rotated clockwise by θ (the rotation operator is Rθ); then crop small blocks Ψp and Ψpθ of the same size from the blurred image and the rotated blurred image, respectively. The two have the following relationship: if the blur kernel recognition result of Ψpθ is (l, θ′), it can be inferred that the blur kernel corresponding to Ψp is (l, θ′ + θ).
2.5.2. Motion Parameter Field Construction

In the previous section, the block-level motion distribution was estimated by the CNN. To refine the parameter estimation accuracy, the block-level motion distributions are merged into a dense motion blur parameter field according to the MRF model.

When the CNN classifies a patch, all pixel locations in the patch are considered to have the same motion distribution. Correspondingly, a pixel p has multiple motion distribution estimates (because the small blocks overlap), so a weighted average of the motion distributions is taken, and the confidence of the motion vector at point p is defined as

C(m(p) = (l, θ)) = (1/Z) Σ_{q : p ∈ Ψq} Gσ(||xp − xq||) P(m = (l, θ) | Ψq),

where q ranges over the centers of the patches Ψq containing p, xp denotes the coordinates of pixel p, and Z is a normalization constant. For a p that is closer to the center point q, the Gaussian function Gσ provides a larger weight.
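The weighted-average fusion just described can be sketched as follows. This is a simplified numpy version with our own function and variable names; the Gaussian bandwidth sigma and the dense (per-pixel, per-patch) weighting are illustrative choices:

```python
import numpy as np

def fuse_patch_distributions(patch_probs, centers, shape, sigma=8.0):
    """Fuse per-patch motion-distribution estimates into a per-pixel distribution.

    patch_probs: (n_patches, n_candidates) softmax outputs, one row per patch.
    centers:     (n_patches, 2) patch-center coordinates (row, col).
    Each pixel averages the overlapping patch predictions with Gaussian
    weights that favor patches whose center lies near the pixel.
    """
    H, W = shape
    n, m = patch_probs.shape
    rows, cols = np.mgrid[0:H, 0:W]
    conf = np.zeros((H, W, m))
    wsum = np.zeros((H, W))
    for i in range(n):
        d2 = (rows - centers[i, 0]) ** 2 + (cols - centers[i, 1]) ** 2
        w = np.exp(-d2 / (2 * sigma ** 2))       # Gaussian spatial weight
        conf += w[..., None] * patch_probs[i]     # weighted vote per candidate
        wsum += w
    return conf / wsum[..., None]                 # normalize to a distribution
```

Because each patch contributes a normalized distribution and the spatial weights are normalized per pixel, the fused result is again a valid distribution at every pixel.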

2.5.3. Variable Blur Removal

The gradient domain is now used to train the network. The network input is the gradient-domain image obtained after guided filtering and L0 filtering, together with its corresponding blurred image, and the output is the edge information of the image. If the input picture is rotated by 90° before edge extraction, the extracted edge information is the same as that obtained without rotation. Therefore, the method trains the network using only the vertical gradient and shares the weights between the horizontal and vertical directions.

The structure of the CNN is divided into two parts: the first four layers of the network mainly extract the main structure from the blurred image and suppress irrelevant details; the last four layers enhance the extracted structure and better restore the edges of the image. The output of CNN layer 5 is denoted p5. The training target of the first stage of network training is the gradient-domain image of the original image after guided filtering; the training target of the second stage is the gradient-domain image of the original image after L0 filtering. Both have only one channel. At the 5th layer, a single channel, p5, is used to connect the two parts of the network. To solve the problem that the middle layer has only one channel and the training efficiency is therefore low, the method takes a weighted average of the layer-5 feature maps as the output value of the first stage, that is,

p5 = Σi wi fi,

where fi is the ith feature map of layer 5 and wi is a learnable coefficient.

The loss over the entire network training process is

L = Σi ||Ei − Si||₂² + λ||w||₂²,

where λ, which takes the value 10⁻⁶, is the weight of the regularization term, Si is the training target of stage i, and Ei is the edge information obtained by network training.
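Reading the description above as a squared-error data term plus an L2 weight regularizer scaled by 10⁻⁶ (an assumption; the original formula is not given explicitly), the loss can be sketched as:

```python
import numpy as np

def training_loss(pred_edges, true_grads, weights, lam=1e-6):
    """Edge-prediction loss: squared error between the network's edge map and
    the ground-truth gradient image, plus an L2 penalty on the weights scaled
    by lam (10^-6 in the text)."""
    data = np.mean((pred_edges - true_grads) ** 2)
    reg = sum(np.sum(w ** 2) for w in weights)
    return data + lam * reg
```

In the actual training this quantity would be minimized by backpropagation in PyTorch; the numpy form here only makes the structure of the objective concrete.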

3. Experiments

3.1. Dataset

The network training process uses the COCO-2014 training set, a large-scale image library released by Microsoft, which provides a total of 83,000 images (13 GB) with segmentation annotations for 60,000 instances in 80 categories, indexed by image number. The COCO image library is widely used in the field of image deep learning. Its content covers people, natural scenery, plants and flowers, and vehicles. We randomly select character images as clear training images.

3.2. Experimental Environment

The algorithms were implemented in the Python scripting language, version 3.6.1, and the deep network was built with the PyTorch 0.3.1 deep learning framework. Network training was performed on a desktop computer equipped with a 12 GB Nvidia GeForce GTX 1080 Ti GPU and the Ubuntu 16.04.3 operating system (memory: 64 GB; clock: 2.40 GHz; CPU: Intel(R) Xeon(R) E5-2620). The remaining comparison experiments were done on a Windows 10 desktop (memory: 8 GB; clock: 3.30 GHz; CPU: Intel(R) Core(TM) i5-4590). The image quality evaluation indicators were calculated using Matlab source code and the Matlab toolkit provided by Eero.

3.3. Objective Image Quality Assessment

When an existing image deblurring technique is used to restore an image, it is in fact difficult for the restoration result to be completely consistent with the real image. Therefore, a standard is needed to evaluate the quality of the restored image, which requires image quality assessment (IQA). A good IQA method can evaluate an image processing technique relatively comprehensively and has a certain guiding significance. Image quality evaluation methods are divided into subjective evaluation methods and objective evaluation methods.

The subjective evaluation method has human observers score images subjectively and is considered the most reliable method, because humans are the end users of images. However, observers are susceptible to the external environment, their own preferences, and their cognitive level during observation, so subjective evaluation is extremely unstable when the number of observers is not large enough. Conversely, hiring a large number of observers for observation and scoring is too expensive and time consuming, making it an unrealistic choice for real-time systems. Therefore, this paper adopts an objective evaluation method.
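The objective index reported later in this paper is PSNR. For an 8-bit image with peak value 255 it can be computed as:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a restored image."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return 10 * np.log10(peak ** 2 / mse)
```

A higher PSNR indicates a restoration closer to the reference, which is how the 26.49/27.51/29.18 comparison in Section 4 should be read.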

4. Discussion

4.1. Motion Image Blind Deblurring
4.1.1. Image Preprocessing

Before image deblurring is performed, the image is preprocessed with guided filtering, because guided filtering can suppress the irrelevant details in the image while preserving its main structure, as shown in Figure 2:

4.1.2. Influence of Filter Size on Training Results

The entire model was trained for 100,000 iterations. The network has 8 layers in total, and each layer has 128 filters. In Figure 3, the green curve is obtained using filters of different sizes across the layers, the blue curve using a 5 × 5 filter throughout the network, and the red curve using a 7 × 7 filter throughout the network. The effect of filter size on the training results is shown in Figure 3.

4.1.3. The Effect of the Number of Filters on the Training Results

The network in this experiment has 8 layers in total; the size of each layer's filters is shown in Figure 4. Networks with 64 and 32 filters per layer are compared with the 128 filters per layer used in this study. In Figure 4, the green curve is obtained with 128 filters per layer (this study), the red curve with 64 filters per layer, and the blue curve with 32 filters per layer. The effect of the number of filters on the training results is shown in Figure 4.

4.1.4. Deblurring Effect

The deblurring effect of the algorithm proposed in this study is shown in Figure 5.

4.2. Comparative Analysis Based on Natural Image Blur Removal Algorithm

Research on image deblurring algorithms requires the support of theoretical knowledge in mathematics, numerical analysis, and signal processing, and the complex operations involved depend on the computing power of the computer. With the rapid development of computers and the continuous improvement of mathematical theory, new breakthroughs have been made in the study of image deblurring. The deblurring effect of the proposed method is compared with the multilayer perceptron (MLP) method and the edge detection method. The MLP method uses a multilayer perceptron to achieve image deblurring; the edge detection method combines the image structure with an optimization algorithm to achieve image deblurring. Different algorithms have different deblurring effects and, correspondingly, different PSNR values. Table 1 shows the PSNR values of the different methods:

It can be seen from Table 1 that the MLP method and the edge detection method differ little in the PSNR value for motion blur removal, but the MLP method amplifies noise, and the edge detection method may introduce a ringing effect into the image; the method proposed in this study achieves the highest PSNR.

5. Conclusions

(1) In this study, an image motion blur blind removal algorithm based on the gradient domain and deep learning is proposed. A CNN can learn image features useful for deblurring, and an effective CNN structure is designed. Combining image gradient-domain preprocessing with the CNN avoids the disadvantage of directly using a CNN while ignoring image prior information. Experiments show that the proposed method not only effectively removes image motion blur but also preserves image details well; noise amplification is not obvious, and the ringing effect is well suppressed.

(2) This study investigates and summarizes previous work. The research background and significance of variable blur blind removal are expounded; the current research status of prior-based deblurring and deep-learning-based deblurring is introduced; and common blur models and common quality evaluation indicators are briefly reviewed. Two deep-learning-based methods for variable blur blind removal are studied in depth, and three defects are found: they rely on traditional image restoration methods, so the running speed is slow; for different blur types, separate blur parameter estimation networks must be designed; and estimating a variable blur kernel requires block-wise processing and fusion, which affects the restoration quality and increases the complexity of the algorithm.

(3) This study provides an effective basis for image processing by comparing and analyzing the effects of different blind blur removal algorithms.

The algorithm in this study can deal with blurred images with edges, but if the image is a text image, it is difficult for the algorithm in this study to restore it effectively. How to deblur the text-like image needs to be further studied in the future.

Data Availability

This study does not cover data research. No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Science and Technique Project of Henan (212102310397), the Special Project on Teaching Development Research and Practice of Zhoukou Normal University (JF2021001), the Teaching Quality Assurance Project of Zhoukou Normal University (J2021086), and the Key Scientific and Technological Project of Henan Province (22A520052).