Abstract

Image matching is important for vision-based navigation. However, most image matching approaches do not consider real-world degradation such as image blur, so their performance often drops sharply in practice. Recent methods address this problem with a two-stage framework that first deblurs the image and then performs matching; this is effective but depends heavily on the quality of the deblurring. An emerging alternative is to perform image deblurring and matching jointly, utilizing a sparse representation prior to exploit the correlation between deblurring and matching. However, these approaches obtain the sparse representation prior in the original pixel space, which does not adequately account for the influence of image blurring and may therefore lead to an inaccurate estimate of the prior. Fortunately, we can extract the pseudo-Zernike moment with blurred invariant from images and obtain a reliable sparse representation prior in the blurred invariant space. Motivated by this observation, we propose a joint image deblurring and matching method with blurred invariant-based sparse representation prior (JDM-BISR), which obtains the sparse representation prior in the robust blurred invariant space rather than the original pixel space and thus can effectively improve the quality of image deblurring and the accuracy of image matching. Moreover, since the dimension of the pseudo-Zernike moment is much lower than that of the original image feature, our model also improves computational efficiency. Extensive experimental results demonstrate that the proposed method performs favorably against the state-of-the-art blurred image matching approach.

1. Introduction

Image matching has been an active research area in computer vision, with applications such as image mosaicing [1, 2], object tracking [3, 4], and character recognition [5–7]. Recent years have witnessed great progress in this task [8–14]. However, these methods generally assume that the input is ideal, whereas in practical applications the image is inevitably blurred by camera shake or defocus. To deal with this problem, a two-stage method has been proposed; it first performs image deblurring [15, 16] to obtain a latent sharp image and then performs image matching on the recovered image. Unfortunately, this straightforward approach depends heavily on the quality of the recovered image, and many deblurring methods are designed to improve human visual perception rather than machine perception; thus, there is no guarantee that matching accuracy will improve. Since the purpose of image deblurring here is to improve the accuracy of image matching, some works propose to explore the correlation between image deblurring and matching [17, 18]. Shao et al. [18] proposed a joint image restoration and matching method based on distance-weighted sparse representation (JRM-DSR), which utilizes the sparse representation prior to exploit the correlation between restoration and matching and performs image restoration and matching simultaneously. The prior assumes that the blurry image, if correctly restored, can be well represented as a sparse linear combination of the atoms of the dictionary constructed from the reference image. The key to this method is to obtain reliable representation coefficients that help image restoration and, in turn, improve the matching accuracy. However, the JRM-DSR method obtains the sparse representation coefficients in the original pixel space and does not adequately consider the influence of image blurring. Due to image blurring, the so-obtained sparse representation coefficients in pixel space may not accurately reflect the similarity between the real-time image and the reference image. Therefore, it is difficult to obtain a reliable sparse representation prior. Fortunately, we can extract the pseudo-Zernike moment with blurred invariant [19] from images and calculate the sparse representation coefficients in the blurred invariant space.

The pseudo-Zernike moment blur invariant is derived from the pseudo-Zernike moments of the blurred image; it is invariant to convolution with a circularly symmetric point spread function. Thus, it can efficiently alleviate the influence of image blurring and improve the accuracy of the sparse representation coefficients.

Motivated by the above analysis, we propose a joint image deblurring and matching method with blurred invariant-based sparse representation prior (JDM-BISR). The framework of our JDM-BISR is shown in Figure 1. Inspired by JRM-DSR [18], our JDM-BISR also assumes that if the blurry image is correctly restored, it admits a sparse representation over the dictionary constructed from the reference image. Different from JRM-DSR, we obtain the sparse representation coefficients in the blurred invariant space rather than the original pixel space, which improves the accuracy of the sparse representation prior and thereby facilitates the subsequent deblurring and matching tasks. Moreover, since the dimension of the blur invariant is much lower than that of the original pixel vector, our method also reduces the computation time of sparse representation and speeds up matching. We adopt the alternating minimization algorithm to solve the JDM-BISR model. The experimental results demonstrate that our JDM-BISR method performs favorably against the state-of-the-art blurred image matching approaches.

The main contributions of this paper are as follows:
(i) We propose a joint image deblurring and matching method with blurred invariant-based sparse representation prior, to deal with the problem of blurred image matching.
(ii) We extract pseudo-Zernike moment with blurred invariants from images and obtain the sparse representation coefficients in blurred invariant space, which alleviates the influence of image blurring and improves the reliability of the sparse representation prior.

The remainder of the paper is organized as follows. We will review the related works of pseudo-Zernike moment with blurred invariants and image matching in Section 2. In Section 3, we will detail the model of joint image deblurring and matching method with blurred invariant-based sparse representation prior. Experimental results and analysis will be presented in Section 4. Finally, we will conclude our work in Section 5.

2. Related Works

In this section, we first introduce the definition of pseudo-Zernike blurred invariants, which is utilized in the paper, and then review the methods of image matching.

2.1. Pseudo-Zernike Blurred Invariants

Pseudo-Zernike blurred invariants are based on orthogonal pseudo-Zernike moments and are suitable for blur point spread functions with circular symmetry; they offer blur invariance and noise robustness. Computing the blur invariants requires first computing the pseudo-Zernike moments and then generating invariants of different orders iteratively. Specifically, for an image $f(r, \theta)$ in polar coordinates, the pseudo-Zernike moments of order p with repetition q are defined as follows [20]:

$$Z_{pq} = \frac{p+1}{\pi} \int_{0}^{2\pi} \int_{0}^{1} R_{pq}(r)\, e^{-jq\theta} f(r, \theta)\, r \, dr \, d\theta, \qquad (1)$$

where $p \ge 0$, $|q| \le p$, and $R_{pq}(r)$ is the radial polynomial. Since $R_{pq}(r)$ is symmetrical to q, we only consider the case where $q \ge 0$.

Assuming , equation (1) can be reformulated aswhere

According to [20], we can obtainwhere

Generally speaking, the blurred image can be regarded as the convolution of the original image and the blur kernel point spread function . Considering the rotation invariance of pseudo-Zernike moment, we can obtain

According to [21], the relationship between the radial moment of the blurred image and original image is as follows:

By substituting equations (4) and (7) into equation (2), we can obtainwhere

According to above insights, Dai et al. [19] gave the definition of pseudo-Zernike blur invariant for blur point spread functions with circular symmetry:where denotes the order of pseudo-Zernike blur invariants.
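As an illustration of the first step described in this subsection, the following minimal sketch computes pseudo-Zernike moments of a square image on the inscribed unit disk. It uses the standard radial polynomial $R_{pq}(r) = \sum_{s=0}^{p-|q|} (-1)^s \frac{(2p+1-s)!}{s!\,(p-|q|-s)!\,(p+|q|+1-s)!} r^{p-s}$. The function names, the pixel-to-disk mapping, and the simple summation rule are assumptions made for this example, and the sketch stops at the moments themselves; it does not implement the blur invariants of Dai et al. [19].

```python
import numpy as np
from math import factorial, pi


def radial_poly(p, q, r):
    """Standard pseudo-Zernike radial polynomial R_pq(r), valid for |q| <= p."""
    q = abs(q)
    out = np.zeros_like(r, dtype=float)
    for s in range(p - q + 1):
        coeff = ((-1) ** s * factorial(2 * p + 1 - s)
                 / (factorial(s) * factorial(p - q - s) * factorial(p + q + 1 - s)))
        out += coeff * r ** (p - s)
    return out


def pseudo_zernike_moment(img, p, q):
    """Discrete approximation of Z_pq over the unit disk inscribed in a square image."""
    n = img.shape[0]  # assumes a square n x n image
    # Map pixel centres to the square [-1, 1] x [-1, 1] (hypothetical mapping).
    coords = (2.0 * np.arange(n) + 1.0 - n) / n
    xx, yy = np.meshgrid(coords, coords)
    r = np.hypot(xx, yy)
    theta = np.arctan2(yy, xx)
    inside = r <= 1.0  # only pixels inside the unit disk contribute
    basis = radial_poly(p, q, r) * np.exp(-1j * q * theta)
    pixel_area = (2.0 / n) ** 2
    return (p + 1) / pi * np.sum(img[inside] * basis[inside]) * pixel_area
```

Stacking a fixed set of such values (or invariants derived from them) over several orders gives a feature vector far shorter than the raw pixel vector, which is the property the proposed method relies on.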

2.2. Image Matching

Image matching has been intensively studied over the past decade due to its crucial role in computer vision. Traditional image matching methods can be classified into two classes [22]: feature-based methods and pixel-based methods. Feature-based methods first extract feature vectors from the real-time image and the reference image and then measure the similarity among the feature vectors, thereby obtaining the position of the real-time image. Feature-based methods include the Canny operator [23], Harris operator [24], SUSAN operator [25], SIFT feature descriptor [26], SURF operator [27], and ridgelet transform [28]. However, these methods perform poorly when the input image is blurred, since it is hard to extract robust feature vectors from degraded images. Since pixel-based approaches utilize all of the pixels in the local window, they can achieve better performance than feature-based approaches under occlusion conditions. Many pixel-based methods have also been proposed, e.g., template matching (TM) [29], increment sign correlation [10], binary coding and phase correlation [30], and selective correlation coefficient [9]. Recently, some cross-correlation-based methods [8, 31, 32] have also been proposed to improve the matching performance. Yoo and Ahn [8] utilized the correlation coefficient of occlusion-free matching to determine the position of the real-time image. Bilal and Masud [31] reduced the search time by applying a monotonically increasing cross-correlation function. Zhu and Deng [32] proposed a gradient direction selection cross-correlation method for image matching. However, the above methods cannot efficiently deal with the problem of blurred image matching.
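To make the pixel-based family concrete, here is a minimal sketch of exhaustive template matching with the normalized correlation coefficient. It is a generic illustration, not the implementation of any cited method; the function name `ncc_match` and the top-left position convention are assumptions for this example.

```python
import numpy as np


def ncc_match(reference, template):
    """Return the top-left position and score of the best NCC match."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    best_score, best_pos = -np.inf, (0, 0)
    for i in range(reference.shape[0] - th + 1):
        for j in range(reference.shape[1] - tw + 1):
            window = reference[i:i + th, j:j + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt(np.sum(w ** 2))
            if denom == 0:
                continue  # flat window or flat template: correlation undefined
            score = np.sum(w * t) / denom
            if score > best_score:
                best_score, best_pos = score, (i, j)
    return best_pos, best_score
```

Because the score compares raw intensities, blurring the template changes every term of the sum, which is one way to see why such methods degrade on blurred inputs.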

An intuitive idea to solve this problem is to first resort to image restoration [33–35] and then to perform image matching. Unfortunately, this straightforward approach depends heavily on the quality of image restoration, and many restoration methods are designed to improve human visual perception rather than machine perception; thus, there is no guarantee that matching accuracy will improve. Therefore, some works attempt to explore the correlation between image deblurring and matching [17, 18]. Yang et al. [17] utilized the sparse representation prior to achieve joint face image restoration and recognition. However, to obtain sparsity, the sparse representation may choose different images to represent the input images, which results in inaccurate recognition and thus cannot give meaningful guidance for restoration. Since local information can ensure that similar samples have similar representation coefficients, Shao et al. [18] proposed a joint image restoration and matching method based on distance-weighted sparse representation (JRM-DSR); they considered both local and sparse information, adopting distance-weighted sparse representation to obtain better representation coefficients.

However, both methods obtain the sparse representation coefficients in the original pixel space, which does not adequately consider the influence of image blurring and thus leads to an inaccurate estimate of the sparse representation prior. In this paper, we obtain the sparse representation coefficients in the blurred invariant space rather than the original pixel space, which improves the accuracy of the sparse representation prior and thereby facilitates the subsequent deblurring and matching tasks.

3. The Proposed Method

In this section, we will present our JDM-BISR model for blurred image matching. For completeness, we first give a brief overview of JRM-DSR.

3.1. JRM-DSR: An Overview

The JRM-DSR method aims to solve the problem of blurred image matching by fully exploiting the correlation between restoration and matching. Given the blurred input image $y$ and the dictionary $D$, which is constructed by using a sliding window with step size 1 to extract small image blocks from the reference image, the JRM-DSR method seeks the recovered clear image $x$, the sparse representation coefficient $\alpha$, and the blur kernel $k$ by solving the following optimization problem:

$$\{\hat{x}, \hat{\alpha}, \hat{k}\} = \arg\min_{x, \alpha, k} \|k * x - y\|_2^2 + \eta \|x - D\alpha\|_2^2 + \lambda \|d \odot \alpha\|_1 + \tau \sum_{t} \|e_t * x\|_s^s + \gamma \|k\|_2^2, \qquad (11)$$

where $d$ represents the Euclidean distance between the restored image $x$ and the atoms of the dictionary $D$, $\odot$ indicates point multiplication, and $s$ denotes the sparse exponential of the responses of derivative filters. Then, we can obtain the matching position of the blurred image according to the sparse representation coefficient $\alpha$. The first term is the image reconstruction constraint. The second one states that the blurred image, if correctly restored, should be represented as a linear combination of a few atoms in the dictionary. The third term, a sparse regularization, emphasizes that the representation coefficient should be sparse, and it also enforces that similar images should have similar representation coefficients. The fourth term represents the sparse prior of natural images, where $e_t$ are the derivative filters. The last term is the regularization for the blur kernel $k$, of which the $\ell_2$ norm is required to be as small as possible. The parameters η, λ, τ, and γ control the effects of the last four regularization terms.

The basic idea of the JRM-DSR is that the blurred image, if correctly recovered, should be represented as a sparse linear combination of the dictionary. Meanwhile, a better restored image can lead to more accurate representation coefficients, which in turn can also improve the quality of image restoration. The JRM-DSR method iteratively recovers the input image by seeking the sparsest representation, thus correcting the initial mismatch and improving the confidence of image matching.
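The dictionary construction and the link between coefficients and position can be sketched as follows. The sliding window with step size 1 comes from the description above; reading the position off the largest-magnitude coefficient is an assumption made for this example, and both function names are hypothetical.

```python
import numpy as np


def build_dictionary(reference, patch_h, patch_w):
    """Extract every patch with step size 1; return (D, centers).

    Each column of D is one vectorised patch, and centers[i] holds the
    (row, col) centre of the patch stored in column i.
    """
    height, width = reference.shape
    columns, centers = [], []
    for i in range(height - patch_h + 1):
        for j in range(width - patch_w + 1):
            columns.append(reference[i:i + patch_h, j:j + patch_w].ravel())
            centers.append((i + patch_h // 2, j + patch_w // 2))
    return np.stack(columns, axis=1).astype(float), np.array(centers)


def position_from_coefficients(alpha, centers):
    """Assumed rule: take the centre of the atom with the largest |alpha|."""
    row, col = centers[int(np.argmax(np.abs(alpha)))]
    return int(row), int(col)
```

With a step size of 1 the number of atoms grows with the reference image area, so the per-atom dimension has a direct effect on the cost of each sparse coding step.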

However, in the real application, there always exists some blur in the recovered image; thus, the so-obtained sparse representation coefficients in pixel space may not accurately reflect the similarity between the real-time image and the reference image. To overcome this problem and improve the performance of the image matching, we next propose a joint image deblurring and matching method by obtaining sparse representation prior in a blurred invariant space rather than original pixel space.

3.2. The Proposed JDM-BISR Model

In this section, we compute the sparse representation coefficients in the blurred invariant space and propose a joint image deblurring and matching method with blurred invariant-based sparse representation prior (JDM-BISR). The key idea of JDM-BISR is to obtain the sparse representation prior in the blurred invariant space rather than the original pixel space. The JRM-DSR approach has achieved good performance by obtaining the sparse representation prior in the original pixel space. However, in practical applications, the restored image often retains some blur, so the sparse representation coefficient obtained in the pixel space may not accurately reflect the similarity between the real-time image and the reference image. A blurred invariant [21, 36, 37] is a special image feature that remains largely unchanged when the image is blurred. Generally, blurred invariants can be divided into orthogonal and nonorthogonal ones, and the former are superior to the latter. Therefore, we extract the pseudo-Zernike moment with blurred invariant [19] from images and perform sparse representation in this blurred invariant space. We formulate our JDM-BISR model as follows:where denotes the sparse representation coefficients, which is obtained in the blurred invariant space. Given the image dictionary , we can obtain the blur invariant dictionary via extracting pseudo-Zernike moment with blurred invariant from all image patches in the dictionary . Similarly, we can also extract pseudo-Zernike moment with blurred invariant from the blurred real-time image. Therefore, we can obtain sparse representation coefficients in this blurred invariant space in each iteration and utilize this prior to help image deblurring and matching. As we can see from (12), similar to JRM-DSR, JDM-BISR also iteratively recovers the input image by seeking the sparsest representation among the small image blocks in the reference image.

3.3. Optimization

In this section, we adopt the alternating minimization algorithm [38, 39] to solve the proposed model. It divides the original problem into three subproblems and solves each one in turn while keeping the other variables fixed. By alternately optimizing the subproblems, our model converges to a solution and outputs the results of image deblurring and matching.

Firstly, according to reference [19], we extract the pseudo-Zernike moment with blurred invariant feature and dictionary from blurred image and dictionary , respectively. Then, we initialize the sparse representation coefficient α by solving the sparse representation of w.r.t , and the restored image as . In the following, we will update k, x, and α iteratively.

Updating k. For updating the blur kernel k, we fix all other variables and solve the following objective function:Given the restored image x and the blurred image , the above equation has a closed-form solution, so we update k bywhere denotes fast Fourier transform, denotes inverse fast Fourier transform, denotes the complex conjugate of , and indicates the elementwise product.

Updating x. We update x byIn order to solve the above equation, we introduce an auxiliary variable h:With the blur kernel k, the blurred template image , and the representation coefficient α, we decompose the above equation into x-subproblem and h-subproblem. In order to update the recovered template image x, we first fix the auxiliary variable h and solve x-subproblem byThe solution of the above problem isSecondly, we fix x and update each dimension of h separately by

Updating α. At last, we update α as

Since the sparse representation coefficient obtained by solving the above equation in the original pixel space is inaccurate, our method utilizes the sparse representation prior in the robust invariant space as follows:

More specifically, we extract the pseudo-Zernike moment blur invariants and according to equation (10), where the order and repetition of each blur invariants are the same, and obtain the weight via calculating the Euclidean distance between and . Then, the SPAMS toolbox [40] is applied to solve this weighted sparse representation. Finally, the matching position of the real-time image in the reference image is obtained bywhere is the set of central coordinates of each small image block on the reference image. Algorithm 1 summarizes the procedure of our joint image deblurring and matching method with blurred invariant-based sparse representation prior.
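The weighted sparse representation step can be illustrated with a small solver. The sketch below swaps in plain ISTA (iterative soft thresholding) for the SPAMS toolbox used in the paper and assumes the objective $\frac{1}{2}\|b - B\alpha\|_2^2 + \lambda \|w \odot \alpha\|_1$, where $b$ is the feature vector of the real-time image, the columns of $B$ are the feature vectors of the dictionary atoms, and $w_i$ is the Euclidean distance between $b$ and the i-th column. The exact objective, the symbols, and the function name are assumptions made for this example.

```python
import numpy as np


def weighted_sparse_code(b, B, lam=0.1, n_iter=500):
    """ISTA for 0.5 * ||b - B a||^2 + lam * sum_i w_i * |a_i| (assumed form)."""
    # Distance weights: atoms far from b are penalised more strongly.
    w = np.linalg.norm(B - b[:, None], axis=0)
    # Step size 1 / L, where L is the largest eigenvalue of B^T B.
    step = 1.0 / (np.linalg.norm(B, 2) ** 2)
    alpha = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ alpha - b)
        z = alpha - step * grad
        # Soft thresholding with a per-coefficient threshold.
        alpha = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
    return alpha
```

The cost of each iteration is dominated by the two products with $B$, so it scales with the feature dimension; this is where a low-dimensional invariant feature pays off compared with a raw pixel vector.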

Input: a blurred real-time image and a clear reference image
Output: the predicted matching position , the recovered image , and the estimated blur kernel
Preparation: construct the image dictionary from the reference image, the coordinate dictionary , the blur invariant feature ,  and dictionary ;
Initialization: initialize by solving sparse representation of w.r.t , and the restored image as ;
for do
 Updating the blur kernel by solving equation (13);
 Updating the recovered image by solving equation (15);
 Updating the sparse coefficient by solving equation (21);
end
Predicting the matching position by equation (23);
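The kernel update in the loop above is described as having a closed-form solution in the Fourier domain. A minimal sketch of such an update is given below, assuming the subproblem $\min_k \|k * x - y\|_2^2 + \gamma \|k\|_2^2$ with circular convolution and a kernel stored at the same size as the image; cropping to a small kernel support and any normalization are left out. The function name and these simplifications are assumptions for this example.

```python
import numpy as np


def update_kernel(x, y, gamma=1e-3):
    """Closed-form ridge solution for the blur kernel in the Fourier domain.

    x: current restored image, y: blurred image (same shape, 2-D arrays).
    Returns a kernel of the same size as the images (circular model).
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    # Elementwise: conj(F(x)) * F(y) / (conj(F(x)) * F(x) + gamma)
    K = np.conj(X) * Y / (np.conj(X) * X + gamma)
    return np.real(np.fft.ifft2(K))
```

The regularization weight keeps the division stable at frequencies where the restored image has little energy.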

4. Experiments and Analysis

In this section, we conduct extensive experiments on six aerial images to demonstrate the efficiency of the proposed JDM-BISR method. In the experiments, we set the size of the reference image as . Firstly, we generate the blurry image of each reference image using Gaussian blur kernel and then we randomly select 100 small images from each blurry reference image as the blurred real-time images, the size of which is set as . Next, we construct the dictionary by using a sliding window with step size 1 to extract image blocks from the reference image; the size of each image block is the same as the blurred real-time image.

We empirically set the parameters , , , , , and the number of iterations . We evaluate the performance of our JDM-BISR against the state-of-the-art image matching methods including template matching based on normalized correlation coefficient (NCC) [41], sparse representation-based classification (SRC) [22, 42], deblur + NCC (DNCC), and JRM-DSR [18]. For image matching, we adopt the position deviation (PD), which is represented by the Manhattan distance between the central coordinates of the image localization and the real position, to evaluate the performance of image matching:

$$\mathrm{PD} = |u - u_0| + |v - v_0|,$$

where $(u, v)$ denotes the central coordinates of the image localization and $(u_0, v_0)$ is the central coordinates of the real position. For image deblurring, we utilize the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) between the recovered template image and the latent template image to evaluate the performance of image deblurring.
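For reference, two of the evaluation measures can be written in a few lines. The sketch below follows the definitions given above (Manhattan distance for PD) and the usual PSNR formula; the function names and the default peak value of 255 are assumptions for this example, and SSIM is omitted.

```python
import numpy as np


def position_deviation(predicted, truth):
    """Manhattan distance between predicted and true centre coordinates."""
    return abs(predicted[0] - truth[0]) + abs(predicted[1] - truth[1])


def psnr(recovered, latent, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(recovered, dtype=float) - np.asarray(latent, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A PD of 0 means the predicted centre coincides with the true centre, while a higher PSNR indicates a recovered image closer to the latent one.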

4.1. An Illustrative Example

Firstly, we illustrate the proposed JDM-BISR method with a simple example in Figure 2. Given a reference image and a blurry image, we jointly estimate the blur kernel and recover the latent sharp image and the matching position in an iterative way. Figure 2 shows the restored images and matching results on the reference image in each iteration, and Figure 3 shows the image matching deviation and image deblurring result of the example. For image matching, we can observe that the position deviation becomes smaller and smaller as the number of optimization iterations increases, which means that the underlying position of the blurry image can be determined with increasing confidence. Meanwhile, the restored image increasingly resembles the clear image, as indicated by the increase of the PSNR and SSIM. In the initialization stage, the distance between the predicted position and the ground truth is 3 pixels. After two iterations, with a better restored image, the approach finds the accurate position. This implies that our approach can effectively regularize image deblurring while seeking the sparsest representation for image matching. On the one hand, a better recovered image yields better sparse representation coefficients for image matching; on the other hand, the sparse representation coefficients, tightly connected with image matching, provide a powerful regularization for image deblurring.

4.2. Efficiency Analysis of Sparse Representation Prior in Blurred Invariant Space

In this section, we analyze the efficiency of sparse representation prior in blurred invariant space. Specifically, we compare two sparse representation-based image matching methods on above six aerial images: one is to obtain sparse representation in the original pixel space (SR-PIXEL) [22, 42] and the other is to obtain sparse representation in the blurred invariant space (SR-BI). These two methods utilize sparse representation to solve the matching problem, but the SR-BI method extracts pseudo-Zernike moment with blurred invariant and obtains sparse representation in this blurred invariant space rather than the pixel space.

The matching results of the above two methods are listed in Tables 1 and 2. In the experiments, σ is the standard deviation of the Gaussian blur kernel and it ranges from 1 to 5. The dimension of the pixel vector in the SR-PIXEL method is 2500, and the dimension of the blur invariant in the SR-BI method is 50. The results show that the matching accuracy of the two methods is similar for , but the matching accuracy of the SR-BI method is higher than that of the SR-PIXEL method as σ increases. More specifically, the accuracy of the SR-PIXEL method is 31.67 for and , while the SR-BI method achieves 42.17 under the same conditions. This can be explained by the fact that the SR-BI method extracts blurred invariant features from the image, thus alleviating the influence of image blurring on matching. From these observations, we can conclude that the sparse representation prior obtained in blurred invariant space is more accurate than that obtained in pixel space, especially when the image is seriously blurred.

4.3. Results of Experiments

In this section, we conduct experiments on joint image deblurring and matching under different degradation settings. In our JDM-BISR algorithm, image deblurring and matching are tightly coupled. Thus, we present the results for image matching and deblurring separately. In addition, we also give a comparison of matching speed.

4.3.1. Image Matching Results Comparison

Tables 3 and 4 present the image matching accuracy for 600 blurry images on six reference images, where the standard deviation of the Gaussian blur kernel is set as 3 and 4, respectively. From these tables, we can observe that the performance of the DNCC method is very poor: the image is severely blurred, and the poor quality of image deblurring seriously affects image matching performance. In addition, we can also observe that our JDM-BISR algorithm performs the best among all the methods in all cases, which indicates that the sparse representation obtained in blurred invariant space is more reliable than that obtained in original pixel space, thus improving the quality of image deblurring. In turn, a better restored image leads to better matching results.

To visually demonstrate the effectiveness of the proposed JDM-BISR method, we choose a blurry image and its corresponding reference image as an illustrative example, where the standard deviation of the Gaussian blur kernel is set as 3. Figure 4 shows the image matching and restoration results of our JDM-BISR method and the other four methods. From these figures, we can observe that only our method obtains the correct matching position, and it also yields a better restored image.

4.3.2. Image Deblurring Results Comparison

For image deblurring, we randomly select 600 blurry images for each blur kernel size to verify the efficiency of image deblurring, and the standard deviation of Gaussian blur kernel ranges from 1 to 5. Then, we utilize PSNR and SSIM to evaluate the performance of image deblurring between our JDM-BISR method and JRM-DSR method. Table 5 summarizes the average PSNR of two methods as the standard deviation of the Gaussian blur kernel σ ranges from 1 to 5. From the table, we can observe that the image deblurring performance of JDM-BISR is better than that of JRM-DSR in all cases, which is also in accordance with Table 6. This implies that the sparse representation prior obtained in blurred invariant space is more accurate than that obtained in pixel space, thus improving the quality of image deblurring effectively.

4.3.3. Matching Speed Comparison

In practical applications, we should consider not only the matching accuracy but also the matching speed. Therefore, we carry out experiments to compare the computing time of the JRM-DSR and JDM-BISR methods; the experimental results are listed in Table 7. In the experiment, the size of the blurry input image is set as ; thus, the dimension of the pixel vector in the JRM-DSR method is 2500. As shown in Table 7, the JRM-DSR method takes 43.65 seconds for joint image deblurring and matching, while JDM-BISR takes only 5.6 seconds, since the dimension of the blur invariant vector is much lower than that of the pixel vector. We can see that our method is much faster than the JRM-DSR method and can meet the requirements of practical application.

4.4. Robust Analysis of the Proposed Approach

In this section, we analyze the influence of blur kernel size and scale variation on image matching.

4.4.1. Influence of Blur Kernel Size

To verify the robustness of our method to the kernel size, we utilize images with different degrees of blur for image matching, in which σ is set as 1, 2, 3, 4, and 5, respectively, and the kernel size corresponds to , , , , and . For each kernel size, 600 corresponding blurry images are adopted in the experiments. The matching results of NCC, SRC, DNCC, JRM-DSR, and JDM-BISR are compared in Table 8. From Table 8, we can observe that the matching accuracy of all methods decreases as σ increases, which means that image blurring brings great challenges to image matching. However, our JDM-BISR method achieves higher matching accuracy than the other methods in all cases. For example, our JDM-BISR method achieves 59.33 when , but the highest accuracy of the other methods is only 48.50. From these results, we can conclude that our JDM-BISR method is more robust to kernel size variation than the other methods.

4.4.2. Influence of Scale Variation

To verify the robustness of our method to scale variation, we conduct image matching experiments on blurry input images of different sizes. In the experiment, we set the size of the blurry input image as and , respectively, and the standard deviation of the Gaussian blur kernel is set as 3. The matching results of the NCC, SRC, DNCC, JRM-DSR, and JDM-BISR methods are listed in Tables 9 and 10. From these tables, we can observe that the matching accuracy of the JDM-BISR method is higher than that of the other methods in all cases, especially when the size of the blurry input image is . Besides, we can also see that as the blurry input image becomes smaller, the matching accuracy decreases. This is because, with the same blur kernel size, the smaller the blurry input image is, the more blurred the image is. Nevertheless, our JDM-BISR method still outperforms the other methods in matching accuracy when the blurry input image becomes smaller.

5. Conclusions

In this paper, we propose a joint image deblurring and matching method with blurred invariant-based sparse representation prior (JDM-BISR). Our method obtains the sparse representation prior in the robust blurred invariant space rather than the original pixel space, thus improving the accuracy of the sparse representation prior, thereby facilitating the following image deblurring and matching tasks. Moreover, since the dimension of the pseudo-Zernike moment is much lower than the original image feature, our model also increases the computational efficiency. Extensive experimental results demonstrate that the proposed method outperforms the state-of-the-art blurred image matching approach in terms of both deblurring and matching.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This study was supported by the project of the National Natural Science Foundation of China (nos. 61433007 and 61901184).