Abstract

Low-resolution (LR) license plate images or videos are often captured in practical applications. In this paper, a distribution-estimation-based superresolution (SR) algorithm is proposed to reconstruct the license plate image. Unlike previous work, the high-resolution (HR) image is estimated from its posterior probability distribution, which is obtained within the variational Bayesian framework. To regularize the estimated HR image, a feature-specific prior model is proposed that exploits the most significant characteristic of license plate images: the target has high contrast with the background. To make the SR reconstruction robust, models representing smoothness constraints on images are used together with the proposed feature-specific prior model to regularize the estimated HR image. Experiments under a challenging 7 × 7 blur and zero-mean Gaussian white noise with variances 0.2 and 0.5 show that the proposed method achieves a peak signal-to-noise ratio (PSNR) of 22.69 dB and a structural similarity (SSIM) of 0.9022 at noise variance 0.2, and a PSNR of 19.89 dB and an SSIM of 0.8582 even at noise variance 0.5, which are improvements of 1.84 dB and 0.04 over the other methods.

1. Introduction

Nowadays, intelligent transport systems (ITS) are increasingly used to address traffic problems. An ITS applies advanced information technology, data communication and transmission technology, electronic sensor technology, control technology, and computer technology to the whole transportation management system effectively and efficiently. The vehicle license plate character recognition (VLPCR) system is one of the most important parts of the ITS and is widely used in traffic monitoring and control. However, if the license plate image is captured at low resolution, the plate may be unreadable, and the ITS cannot work well. Many factors degrade the captured license plate images, such as downsampling, blurring, warping, and noise. Thus, the problem addressed in this paper is the use of the multiframe superresolution (SR) technique [1–3] to reconstruct license plate images with better quality.

The objective of multiframe SR is to fuse a sequence of low-resolution (LR) images of the same scene into a single high-resolution (HR) image. Such SR techniques can be classified into three classes: (i) frequency domain approaches [4, 5], (ii) interpolation approaches [6, 7], and (iii) regularization approaches [8–10]. Among these, the regularization approach is the most widely studied. Because SR reconstruction is ill-posed, the basic idea of the regularization approach is to incorporate prior knowledge of the unknown HR image into the reconstruction process. Regularization approaches are either deterministic or stochastic. The former use prior models as regularization terms, and the latter use prior models to establish prior probability distributions.

The most widely used prior models are the Tikhonov model [11], the total variation (TV) type model [12], and the Markov random field (MRF) model [13]. The Tikhonov model is based on the L2 norm. It penalizes noise strongly but may blur edges. The well-known TV type model penalizes the total amount of change in the image by measuring the gradient with the L1 norm. However, the TV type model cannot completely remove heavy noise, which may produce artifacts in the estimated HR images. To combine the advantages of the L1 and L2 norms, Suresh et al. proposed a discontinuity-adaptive Markov random field (DAMRF) prior model to reconstruct license plate images [14]. In [15], a generalized DAMRF prior model was used to make the license plate more legible. In [16], the authors proposed a bimodal prior model for text images and combined it with the Huber prior model. In these methods, the estimation of the motion parameters and the reconstruction of the HR image are conducted separately and independently, which is well known to be suboptimal. Moreover, only translational motion was considered in [14, 15], which does not hold in many real situations. In [15], the authors proposed a method to estimate the regularization parameter; however, they did not estimate the noise variance, which is also important for reconstructing the HR image.

In this paper, a new method based on the variational Bayesian inference (VBI) estimator is proposed to perform SR of vehicle license plate images, in which the HR image, the motion parameters, and the hyperparameters are estimated jointly. The VBI estimator is a distribution estimation algorithm that can handle nonlinear, high-dimensional problems effectively [17].

The VBI framework used in this paper is similar to the ones used in [12, 18]. The main difference between the proposed method and these methods is the image model. An accurate and comprehensive image model is very useful for improving the quality of the reconstructed images. The image models proposed in [12, 18] have proved effective; however, they do not consider a significant feature of license plate images: the gray-level distribution is bimodal.

The most significant characteristic of a license plate is that the target has high contrast with the background, which makes it readable. Thus, in the gray-level image of a license plate, the target pixels tend to cluster around one center, while the background pixels tend to cluster around another. From this point of view, we propose a feature-specific model for license plate images based on the fact that the gray-level distribution has two peaks. This feature-specific prior information is introduced explicitly as a constraint into the SR reconstruction of the vehicle license plate. Moreover, to make the reconstruction robust, a smoothing prior is combined with the feature-specific prior model to regularize the estimated license plate image. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, so the gradient information can be estimated more accurately. The smoothing prior, in turn, helps separate the target from the background. Experimental results demonstrate that combining the feature-specific and smoothing prior models reduces artifacts effectively.

The paper is organized as follows. The mathematical model for image degradation and VBI estimator are provided in Section 2. In Section 3, the prior probability distributions for the HR image, motion parameters, and hyperparameters are presented, and the corresponding optimization procedure is described in Section 4 in detail. In Section 5, experimental results are illustrated. The conclusion then follows.

2. Bayesian Framework

2.1. Degradation Model

Before attempting to solve the SR problem, it is necessary to know how the LR images are generated. We assume that a set of LR observations is obtained from a single corresponding HR image. The sizes of each LR image and of the HR image are $M_1 \times M_2$ and $rM_1 \times rM_2$, respectively, where $r$ is the downsampling factor in the horizontal and vertical directions, and let $M = M_1 M_2$ and $N = r^2 M$ denote the numbers of LR and HR pixels. It is usually assumed that the LR observations are generated from the HR image through a sequence of operations that includes (i) geometrical warping, (ii) blurring, (iii) downsampling, and (iv) additive zero-mean white Gaussian noise. The SR degradation model for the $k$th LR image derived from the HR image is given by

$$y_k = D H W(s_k)\, x + n_k, \quad k = 1, \ldots, L, \tag{1}$$

where $L$ is the total number of LR observations, $x$ and $y_k$ are the vectorized versions of the HR and LR images, respectively, $D$ is a downsampling operator, $H$ is a blurring operator, $W(s_k)$ is a warp operator that represents the subpixel shift between the $k$th LR image and the reference frame, and $n_k$ is a noise vector.

In this paper, we assume that the downsampling matrix $D$ and the blurring matrix $H$ are known and are the same for all LR images. The warp matrix $W(s_k)$ represents the motion that occurs during image acquisition. We consider only global motion, and the motion model contains a global translation and rotation; that is, $s_k = (\theta_k, c_k, d_k)^T$, where $\theta_k$ is the rotation angle and $c_k$ and $d_k$ are the horizontal and vertical translations of the $k$th HR image with respect to the reference frame. The noise in the LR observations is assumed to be white Gaussian noise.
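As an illustration of the degradation model, the following minimal Python sketch generates one LR observation from an HR image by warping, blurring, downsampling, and adding noise. It is not the implementation used in this paper: the function names (`warp`, `box_blur`, `degrade`), the bilinear interpolation, the border replication, and the rotation about the image centre are all assumptions made for the example.

```python
import math
import random

def warp(img, theta, c, d):
    """Rotate by theta (radians) about the image centre and translate by
    (c, d) pixels, using inverse mapping with bilinear interpolation.
    Border handling (edge replication) is an assumption of this sketch."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Source location that lands on output pixel (i, j).
            xr, yr = j - cx - c, i - cy - d
            xs = cos_t * xr + sin_t * yr + cx
            ys = -sin_t * xr + cos_t * yr + cy
            x0, y0 = math.floor(xs), math.floor(ys)
            fx, fy = xs - x0, ys - y0
            val = 0.0
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    yc = min(max(yy, 0), h - 1)  # replicate border pixels
                    xc = min(max(xx, 0), w - 1)
                    val += wy * wx * img[yc][xc]
            out[i][j] = val
    return out

def box_blur(img, k):
    """Uniform k x k point spread function (k odd), replicated borders."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    s += img[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
            out[i][j] = s / (k * k)
    return out

def degrade(x, theta, c, d, blur_size, r, sigma, rng):
    """One LR observation: warp, blur, keep every r-th pixel, add
    zero-mean Gaussian noise with standard deviation sigma."""
    z = box_blur(warp(x, theta, c, d), blur_size)
    z = [row[::r] for row in z[::r]]
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in z]

if __name__ == "__main__":
    hr = [[float((i // 4 + j // 4) % 2) for j in range(16)] for i in range(16)]
    lr = degrade(hr, theta=0.02, c=0.5, d=-0.5, blur_size=3, r=2,
                 sigma=0.05, rng=random.Random(0))
    print(len(lr), len(lr[0]))
```

Calling `degrade` several times with different motion parameters yields a set of LR observations of the same scene, which is the input expected by a multiframe SR method.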

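Supposing that the noise $n_k$ ($k = 1, \ldots, L$) is white Gaussian noise with precision (inverse variance) $\beta_k$, that is, $n_k \sim \mathcal{N}(0, \beta_k^{-1} I)$, we can get

$$p(n_k \mid \beta_k) \propto \beta_k^{M/2} \exp\left(-\frac{\beta_k}{2} \|n_k\|_2^2\right), \tag{2}$$

where $M$ is the number of pixels in each LR image. Then, the following equation is obtained by using (1):

$$p(y_k \mid x, s_k, \beta_k) \propto \beta_k^{M/2} \exp\left(-\frac{\beta_k}{2} \|y_k - D H W(s_k)\, x\|_2^2\right). \tag{3}$$

Since the noise among the LR images is mutually independent, we can obtain

$$p(y \mid x, s, \beta) = \prod_{k=1}^{L} p(y_k \mid x, s_k, \beta_k), \tag{4}$$

where $y = \{y_k\}_{k=1}^{L}$, $s = \{s_k\}_{k=1}^{L}$, and $\beta = \{\beta_k\}_{k=1}^{L}$.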

2.2. The VBI Estimator

By using the VBI estimator, the variables including the HR image, motion parameters, and hyperparameters are estimated through their corresponding posterior probability distributions. Usually, the mean value of the obtained posterior probability distribution is used as the estimation of the corresponding variable.

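The following conditional distribution is obtained by using the Bayes rule:

$$p(x, s, \alpha, \beta \mid y) = \frac{p(y \mid x, s, \beta)\, p(x \mid \alpha)\, p(s)\, p(\alpha)\, p(\beta)}{p(y)}, \tag{5}$$

where $p(y \mid x, s, \beta)$ represents the conditional distribution of the LR images $y$; $p(s)$, $p(\alpha)$, and $p(\beta)$ are the prior distributions of $s$, $\alpha$, and $\beta$, respectively; $p(x \mid \alpha)$ is the prior distribution for the unknown HR image $x$; and $\alpha$ is the hyperparameter of the prior distribution $p(x \mid \alpha)$. For convenience, we denote $\Theta = \{x, s, \alpha, \beta\}$.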

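By using the VBI estimator, the posterior $p(\Theta \mid y)$ of the unknowns $\Theta$ is approximated by a tractable distribution $q(\Theta)$. This approximating distribution is found by minimizing the Kullback-Leibler (KL) divergence, which measures the difference between the two distributions $q(\Theta)$ and $p(\Theta \mid y)$. The KL divergence is defined as follows:

$$C_{\mathrm{KL}}\big(q(\Theta)\,\|\,p(\Theta \mid y)\big) = \int q(\Theta) \log \frac{q(\Theta)}{p(\Theta \mid y)}\, d\Theta. \tag{6}$$

Since $x$, $s$, $\alpha$, and $\beta$ are assumed to be mutually independent, $q(\Theta) = q(x)\, q(s)\, q(\alpha)\, q(\beta)$. $q(x)$ and $q(s)$ are the posterior probability distributions of $x$ and $s$, respectively. $q(\alpha)$ and $q(\beta)$ are the posterior probability distributions of the hyperparameters $\alpha$ and $\beta$, respectively.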
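To make the KL divergence concrete, the sketch below evaluates it for two one-dimensional Gaussian distributions, once with the standard closed-form expression and once by numerical integration of $\int q \log(q/p)$. The one-dimensional setting and the function names are assumptions chosen for illustration only; the distributions in this paper are high-dimensional.

```python
import math

def kl_gauss(m_q, s_q, m_p, s_p):
    """Closed-form KL(q || p) for q = N(m_q, s_q^2) and p = N(m_p, s_p^2)."""
    return (math.log(s_p / s_q)
            + (s_q ** 2 + (m_q - m_p) ** 2) / (2.0 * s_p ** 2)
            - 0.5)

def kl_numeric(m_q, s_q, m_p, s_p, lo=-30.0, hi=30.0, n=20000):
    """Midpoint-rule approximation of the integral of q * log(q / p)."""
    def log_pdf(t, m, s):
        return -0.5 * math.log(2.0 * math.pi * s * s) - (t - m) ** 2 / (2.0 * s * s)
    dt = (hi - lo) / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * dt
        lq = log_pdf(t, m_q, s_q)
        total += math.exp(lq) * (lq - log_pdf(t, m_p, s_p)) * dt
    return total

if __name__ == "__main__":
    print(kl_gauss(0.0, 1.0, 1.0, 2.0), kl_numeric(0.0, 1.0, 1.0, 2.0))
```

The divergence is zero only when the two distributions coincide and is not symmetric in its arguments, which is why the order $q \,\|\, p$ matters in the minimization.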

3. The Prior Probability Distributions

3.1. Image Prior Probability Distribution

In the Bayesian method, the prior information about the original HR image, represented by the prior model, plays an important role. A significant property of a license plate is that the target has high contrast with the background (see Figure 1). The pixels in license plate images can therefore be divided into two classes, and the original HR license plate image can be represented as

$$x = x_T \cup x_B, \tag{7}$$

where $x_T$ represents the set of pixels belonging to the target regions (i.e., the white regions in Figure 1) and $x_B$ represents the set of pixels belonging to the background regions (i.e., the black regions in Figure 1).

Therefore, we assume that, for license plate images, the target pixels tend to cluster around one center, while the background pixels tend to cluster around another center. In order to make use of this characteristic, we propose the following model to regularize the estimated HR license plate image:

$$\|x - \mu\|_2^2, \tag{8}$$

where $\mu$ is a vector with size $N \times 1$, $N$ being the number of HR pixels. In the following, we will demonstrate how to obtain the vector $\mu$.

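In (8), the elements of the vector $\mu$ are the mean values of $x_T$ and $x_B$, respectively. That is to say, if $x_i \in x_T$ (or $x_i \in x_B$), then $\mu_i = \mu_1$ (or $\mu_i = \mu_2$), where $\mu_1$ and $\mu_2$ are the mean values of $x_T$ and $x_B$, respectively. Then, $\mu_1$ and $\mu_2$ can be obtained as follows:

$$\mu_1 = \frac{1}{N_T} \sum_{x_i \in x_T} x_i, \qquad \mu_2 = \frac{1}{N_B} \sum_{x_i \in x_B} x_i, \tag{9}$$

where $N_T$ and $N_B$ represent the total number of pixels in the target regions and background regions, respectively.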

In order to obtain , first, the following expressions are defined:where and are vectors with size , and , for . is a diagonal matrix with size , whose elements are 0s and 1s, and is a matrix with size , whose elements are 1s.

Then, for obtaining the matrix , we use Otsu's method to partition the estimated HR image into target regions and background regions. After the partition, the element values of corresponding to the pixels in the target regions are set to 1; the others are set to 0. We can obtainThus, Finally, the nonzero elements in and are extracted and positioned in at the corresponding locations.
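The construction of the class-mean vector can be sketched in a few lines of Python. The sketch below partitions the pixels with a textbook Otsu threshold, computes the two class means, and assembles a vector that holds, for every pixel, the mean of its class. It works directly with lists instead of the selection matrices used in the text; the function names, the convention that bright pixels form the target class, and the squared-distance penalty in `prior_energy` are assumptions of this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing the between-class variance.
    Pixels <= t form the dark class, pixels > t the bright class."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * n for g, n in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t

def class_mean_vector(pixels):
    """Return (mu, mu1, mu2): per-pixel class means, target mean, background mean.
    Assumption: bright pixels (> threshold) are the target class."""
    t = otsu_threshold(pixels)
    target = [p for p in pixels if p > t]
    background = [p for p in pixels if p <= t]
    if not target or not background:  # flat image: a single class
        m = sum(pixels) / len(pixels)
        return [m] * len(pixels), m, m
    mu1 = sum(target) / len(target)
    mu2 = sum(background) / len(background)
    return [mu1 if p > t else mu2 for p in pixels], mu1, mu2

def prior_energy(pixels, mu):
    """Squared distance between the image and its class-mean vector."""
    return sum((p - m) ** 2 for p, m in zip(pixels, mu))

if __name__ == "__main__":
    x = [10, 12, 11, 200, 205, 210]  # vectorized toy image, integer gray levels
    mu, mu1, mu2 = class_mean_vector(x)
    print(mu1, mu2, prior_energy(x, mu))
```

Because the partition depends on the current estimate of the HR image, the vector has to be recomputed whenever the estimate changes, as is done in each iteration of the optimization.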

Based on the prior model (8), the following prior probability distribution for the estimated HR license plate image is proposed:where is the hyperparameter of this prior distribution.

In order to make the reconstruction robust, we adopt the total variation (TV) prior model and the simultaneous autoregressive (SAR) prior model [18] to regularize the estimated HR image. The TV and the SAR prior models are defined as follows:

$$\mathrm{TV}(x) = \sum_{i=1}^{N} \sqrt{\big(\Delta_i^h(x)\big)^2 + \big(\Delta_i^v(x)\big)^2},$$

where $\Delta_i^h(x)$ and $\Delta_i^v(x)$ represent the horizontal and vertical gradient components at the $i$th element of $x$, respectively, and

$$\mathrm{SAR}(x) = \|Cx\|_2^2,$$

where $C$ denotes the Laplacian operator.
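The two smoothness energies can be evaluated directly on a small image, as in the following sketch. The backward differences used for the gradient, the 4-neighbour Laplacian stencil, and the border handling are assumptions of this example; other discretizations are equally common.

```python
import math

def tv_energy(img):
    """Isotropic total variation with backward differences
    (differences across the first row/column are taken as zero)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            dh = img[i][j] - img[i][j - 1] if j > 0 else 0.0
            dv = img[i][j] - img[i - 1][j] if i > 0 else 0.0
            total += math.sqrt(dh * dh + dv * dv)
    return total

def sar_energy(img):
    """Squared norm of a 4-neighbour Laplacian with replicated borders."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            up = img[max(i - 1, 0)][j]
            down = img[min(i + 1, h - 1)][j]
            left = img[i][max(j - 1, 0)]
            right = img[i][min(j + 1, w - 1)]
            lap = 4.0 * img[i][j] - up - down - left - right
            total += lap * lap
    return total

if __name__ == "__main__":
    step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # one vertical edge
    print(tv_energy(step), sar_energy(step))
```

On a sharp edge the TV energy grows linearly with the edge height while the SAR energy grows quadratically, which is why TV tends to preserve edges and SAR tends to smooth them.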

The TV model’s and the SAR model’s corresponding prior distributions are defined as

3.2. Motion Prior Probability Distribution

The motion parameters are modeled as stochastic variables following Gaussian distributions, similar to [12, 18]:

$$p(s_k) = \mathcal{N}\big(s_k \mid \bar{s}_k^{\,p}, \Lambda_k^{p}\big),$$

where $\bar{s}_k^{\,p}$ is the a priori mean vector and $\Lambda_k^{p}$ is the a priori covariance matrix. These two parameters can incorporate prior knowledge about the motion parameters into the estimation process. Setting $\bar{s}_k^{\,p}$ and $(\Lambda_k^{p})^{-1}$ equal to zero represents the fact that no such knowledge is available, which makes only the observed LR images responsible for the estimation process.

In this work, the parameters $\bar{s}_k^{\,p}$ are obtained by using the Lucas-Kanade method [19], and the inverse covariance matrices $(\Lambda_k^{p})^{-1}$ are set equal to zero matrices. They are used as the initial values in the following SR method.

3.3. Hyperparameter Prior Probability Distribution

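The prior information about the hyperparameters is usually expressed using conjugate prior distributions, which are convenient to compute with. Moreover, the corresponding posterior distribution has the same functional form as the prior distribution, and hence an analytic solution can be obtained. It is well known that the inverse Gamma distribution is the conjugate prior for the variance of a Gaussian distribution whose mean value is known; equivalently, the Gamma distribution is the conjugate prior for its precision (inverse variance). Thus, we assume that the hyperparameters obey Gamma distributions; that is,

$$p(\omega) = \Gamma(\omega \mid a_\omega, b_\omega) = \frac{(b_\omega)^{a_\omega}}{\Gamma(a_\omega)}\, \omega^{a_\omega - 1} \exp(-b_\omega\, \omega),$$

where $\omega$ denotes any one of the hyperparameters, with the shape parameter $a_\omega$ and rate parameter $b_\omega$.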

These hyperparameters can incorporate prior knowledge about the variances of the HR image and of the noise in the observed LR images into the estimation process. In the following SR method, $a_\omega$ and $b_\omega$ will be used as the initial values. $a_\omega$ is set equal to 1 and $b_\omega$ is set equal to 0, which corresponds to utilizing flat prior distributions for the hyperparameters; in this case, only the observed LR images are responsible for the estimation process.
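The effect of the conjugate Gamma prior can be seen in a simplified setting. For zero-mean Gaussian residuals with unknown precision, a Gamma prior with shape $a$ and rate $b$ yields a Gamma posterior with shape $a + m/2$ and rate $b + \tfrac{1}{2}\sum_i r_i^2$, where $m$ is the number of residuals. The sketch below computes the posterior mean; it is a textbook conjugate update with fixed residuals, not the variational update derived in this paper, and the function name is hypothetical.

```python
def precision_posterior_mean(residuals, a=1.0, b=0.0):
    """Posterior mean of a Gaussian precision under a Gamma(a, b) prior
    (shape a, rate b), given zero-mean residuals.
    Simplified illustration: residuals are treated as known values."""
    m = len(residuals)
    half_sum_sq = 0.5 * sum(r * r for r in residuals)
    rate = b + half_sum_sq
    if rate == 0.0:
        raise ValueError("posterior rate is zero; precision is unbounded")
    return (a + 0.5 * m) / rate

if __name__ == "__main__":
    # Flat prior (a = 1, b = 0): the estimate depends on the data only.
    print(precision_posterior_mean([1.0, -1.0, 1.0, -1.0]))
```

With the flat setting $a = 1$, $b = 0$, the estimate reduces to $(m + 2) / \sum_i r_i^2$, which approaches the reciprocal of the sample variance as the number of residuals grows.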

4. Optimization

In this method, three prior models (i.e., the proposed prior model, the TV prior model, and the SAR prior model) are used to regularize the estimated HR image; however, establishing a prior probability distribution that includes these prior models is difficult. Here, the following linear combination of three KL divergences is used to combine the proposed prior model, the TV prior model, and the SAR prior model:where , , , , , , and denote the different sets of all the variables corresponding to the prior probability distributions based on the proposed prior model, the TV prior model, and SAR prior model, respectively, for , and .

Then, is approximated by minimizing the following function:

Because the TV model is not quadratic, (20) is difficult to solve directly. In this work, this difficulty is overcome by resorting to the majorization-minimization (MM) approach [12]. Thus, a lower bound of is found by using the MM approach:where and the auxiliary variables need to be calculated by using the following formula:
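The idea behind the MM step can be checked numerically. A bound commonly used for TV terms is $\sqrt{w} \le (w + u) / (2\sqrt{u})$ for $w \ge 0$ and $u > 0$, with equality when $u = w$; it follows from $(\sqrt{w} - \sqrt{u})^2 \ge 0$ and replaces the square root by a function that is linear in $w$, that is, quadratic in the image gradients. The sketch below verifies the inequality on a grid; the function name is hypothetical, and the exact bound used in this paper is the one given in its own equations.

```python
import math

def tv_majorizer(w, u):
    """Upper bound on sqrt(w) for w >= 0, u > 0; tight when u == w."""
    return (w + u) / (2.0 * math.sqrt(u))

if __name__ == "__main__":
    for w in (0.0, 0.5, 2.0, 9.0):
        for u in (0.1, 1.0, 4.0, 9.0):
            assert tv_majorizer(w, u) >= math.sqrt(w) - 1e-12
    print(tv_majorizer(9.0, 9.0), math.sqrt(9.0))
```

Setting the auxiliary variable equal to the current squared gradient magnitude makes the bound tight at the current estimate, which is what allows the iteration to decrease the original objective.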

Then, posterior probability distributions, , , , , , and , can be obtained. The solving process is described in the Appendix in detail. Consequently, the following explicit expressions are obtained to calculate the HR image, the motion parameters, and hyperparameters.

The formula for calculating the HR image is given by

The formula for calculating the motion parameters is given byIn (25) and (26), is an identity matrix with size , and the formulas for , , , , , , and are given in the Appendix.

The hyperparameters can be calculated by using the following formulas:

The optimization procedure is summarized in Algorithm 1.

Input: The LR observations ; The initial HR image ; The auxiliary variables , , ,
, ; The initial parameters, , , and , .
;
repeat
 Compute by using (25);
 Compute by using (11);
 Compute and by using (9);
 Compute by using (13);
 Compute z n by using (24);
 Compute by using (26);
 Compute by using (27);
 Compute , , and by using (28)–(30);
;
until convergence criterion is met
Output:  .

5. Experimental Results

We have conducted several experiments to evaluate the effectiveness of the proposed SR method for the license plate images. In the experiments, five observations were generated from each of the original images through translation, rotation, blurring, and downsampling. For translation and rotation, the following parameters are used for each image: , , , , and . For the blurring, a uniform PSF is used. The downsampling factor is . The additive white Gaussian noise with and is added to the LR observations.

In our experiments, ( and are the estimated images at the th and th iterations, resp.) is used as the convergence criterion. is the interpolated image of the first LR image , and . The motion parameters are obtained using the Lucas-Kanade method. The inverse covariance matrices are set equal to zero matrices. And , , and . Thus, we can obtain , , , and . Due to , we just present the values of and . In the simulation experiments, when the noise variance is set with , the values of and are set with 5 and 0.1, respectively. The values of and are set with 0.1 and 0.01, respectively, under . In the experiments on real data, the settings of and are used.

Our method is compared with the bicubic interpolation method, the TV-SAR method, and the l1-SAR method. The MATLAB code provided in [18] was used for testing. For the test images, the performance of the reconstruction methods is evaluated by the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index.
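For reference, the PSNR between a reference image and an estimate can be computed as in the following sketch, which uses the standard definition $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$. The function name and the default peak value of 255 (8-bit gray levels) are assumptions; the SSIM index is more involved and is not shown.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized 2-D images
    given as lists of rows. Returns infinity for identical images."""
    n = 0
    sq_err = 0.0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            sq_err += (r - e) ** 2
            n += 1
    mse = sq_err / n
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)

if __name__ == "__main__":
    ref = [[10.0, 20.0], [30.0, 40.0]]
    est = [[11.0, 19.0], [31.0, 39.0]]  # every pixel off by one gray level
    print(round(psnr(ref, est), 2))
```

A gain of about 3 dB corresponds to halving the mean squared error, which gives a sense of scale for the PSNR differences reported in the tables.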

5.1. Simulation Experiments

In this subsection, we show the experimental results obtained by using the pictures presented in Figure 2 as test images. The results illustrate the effectiveness of the proposed model compared with the bicubic interpolation method, the TV-SAR method, and the l1-SAR method. Figures 3, 5, 7, and 9 show the reconstructed images obtained by the different SR methods. Results obtained by applying the different approaches to LR images generated from Figures 2(a) and 2(c) are presented in Figures 3 and 7, respectively. These LR images are corrupted with average blur and white Gaussian noise of . Results obtained by applying the different approaches to LR images generated from Figures 2(b) and 2(d), corrupted with average blur and white Gaussian noise of , are presented in Figures 5 and 9, respectively. From these figures, we can see that the HR images reconstructed by bicubic interpolation are blurred and that there are obvious artifacts in the HR images estimated by the TV-SAR method and the l1-SAR method. In China, the Chinese character is an important component of the license plate; the corresponding results are shown in Figures 7 and 9. As shown in Figure 7, the Chinese character obtained by our proposed method is less severely corrupted than those obtained by the other methods. Moreover, the Chinese character obtained by our proposed method in Figure 9 is more clearly visible.

To make the visual comparison clearer, the corresponding binary results are presented in Figures 4, 6, 8, and 10. A binarization step is also usually included in vehicle license plate recognition. Although the images reconstructed by our proposed method are somewhat blurry, the corresponding binary results are better than those obtained by the other methods and are closer to the binary results of the original HR image. For example, there are fewer spurious points in the binary results presented in Figures 6 and 10.

The PSNR and SSIM values of each SR reconstruction method are presented in Tables 1 and 2. From these two tables, we see that our proposed method produces the reconstructed HR image with the highest PSNR and SSIM values. Take the HR image presented in Figure 2(a) as an example. The PSNR values achieved by the different methods at noise variances 0.2 and 0.5, respectively, are as follows: the bicubic method (PSNR (dB): 16.32, 15.15), the l1-SAR method (PSNR (dB): 20.23, 16.36), the TV-SAR method (PSNR (dB): 20.23, 16.36), and our proposed method (PSNR (dB): 20.76, 18.14). At noise variance 0.2, the PSNR value of our proposed method (20.76 dB) exceeds that of the bicubic interpolation method by more than 4.0 dB and is slightly better than those of the l1-SAR method and the TV-SAR method. Even under the stronger noise with variance 0.5, the PSNR value of our proposed method (18.14 dB) is at least 1.5 dB larger than those of the bicubic interpolation method, the TV-SAR method, and the l1-SAR method.

5.2. Discussion

For fairness, the proposed method and the comparison methods in this paper are all based on the variational Bayesian framework. The computational complexity of this kind of method has been analyzed in [12]. Most of the computation is spent on estimating the HR image and the motion parameters. The HR image is calculated by using the conjugate gradient method [12], and the motion parameters are calculated by inverting a 3 × 3 matrix for each observed LR image. Note that the matrix multiplications can be performed very efficiently by implementing the corresponding operators rather than storing full matrices.
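The remark about implementing operators instead of storing full matrices can be illustrated with a matrix-free conjugate gradient solver. In the sketch below, the system matrix is never formed; the solver only calls a function that applies it to a vector. The toy operator (a tridiagonal matrix with 3 on the diagonal and -1 on the off-diagonals) and all names are assumptions for the example and stand in for the much larger symmetric positive definite system that arises for the HR image.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=200):
    """Solve A x = b for a symmetric positive definite A that is available
    only through apply_a(v) = A v. Starts from the zero vector."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for x = 0
    p = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def apply_toy_operator(v):
    """Applies tridiag(-1, 3, -1) without building the matrix."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(3.0 * v[i] - left - right)
    return out

if __name__ == "__main__":
    x_true = [1.0, 2.0, 3.0, 4.0]
    b = apply_toy_operator(x_true)
    print(conjugate_gradient(apply_toy_operator, b))
```

For an image with $N$ pixels, a dense $N \times N$ matrix would need $N^2$ entries, whereas applying blur, warp, and downsampling as operators costs time and memory roughly proportional to $N$ per iteration.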

A comparison of the computation time is listed in Table 3, under the average blur with size 7 × 7 and the zero-mean Gaussian white noise with . Table 3 shows that the proposed method takes the least time among the iterative methods. In the proposed method, the gray-level distribution characteristic is introduced as a constraint, together with the TV-SAR model, into the reconstruction. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, which helps reduce the computation time.

In Section 5.1, the experimental results under challenging zero-mean Gaussian white noise with variances 0.2 and 0.5 are presented. In practical applications, the noise is sometimes weaker. In order to verify the performance of the proposed Bayesian framework under Gaussian white noise of various intensities, several simulated experiments are conducted by using Figure 2(b) as a test image, as shown in Figure 11. It can be seen that the PSNR value increases as the noise becomes smaller.

In the experiments under different noise variances, the parameters and are tuned to obtain the best results. Taking the average blur with size 7 × 7 and zero-mean noise with as an example, the relationship between the parameters and and the PSNR is shown in Figure 12. From Figure 12, we see that the highest PSNR value is obtained when is near 4 and is near 0.1, and that the PSNR value decreases as the parameters move away from these values. It is noted that when nears 7 and nears 0.9, the PSNR value drops relatively sharply.

5.3. Experiments on Real Data

In this subsection, we test our proposed method on several vehicle plate image sequences. The continuous shooting mode of a commercial digital camera was used to capture the vehicle image sequences. In this experiment, we use four sequences as examples, and each sequence has ten images. Figure 13 shows one of the ten LR images obtained with the camera far from the vehicle, and some plate image sequences of very poor quality captured under different conditions are also used, as shown in Figure 16. In this experiment, the license plate is selected as the region of interest.

Figures 14, 15, 17, and 18 show a comparison between the bicubic interpolation method, the l1-SAR method, the TV-SAR method, and our proposed method. In Figure 14(a), the number “5” could be misinterpreted as “S,” and there are obvious artifacts in the images reconstructed by l1-SAR and TV-SAR. In Figure 15, the result obtained by the bicubic interpolation method is quite blurred and hard to read. In Figures 15(b) and 15(c), the number “6” and the letter “P” are unreadable. However, the Chinese character is still not readable in Figures 14 and 15, and the letter “A” might be confusing. In Figures 17(b) and 17(c), the letter “C” could be misinterpreted as “0.” In Figure 18, the result reconstructed by the proposed method has a higher visual contrast between the target and the background and is therefore easy to read. In the proposed method, the gray-level distribution characteristic is introduced as a constraint into the reconstruction. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, which makes it easier to use the gradient information to preserve the image edges and suppress noise. Meanwhile, the smoothing prior helps separate the target from the background. Thus, by using the prior information of the license plate image more fully, the proposed method obtains better reconstructed results.

These experimental results show that the proposed method achieves the best visual quality. The experiments also demonstrate the practical utility and potential of the proposed method for enhancing LR frames captured from real traffic.

6. Conclusion

Existing reconstruction methods for license plate images use image models based on gradient information. However, they do not perform well, especially under heavy noise, which leads to poor recognition results. In this paper, we proposed a new SR method to reconstruct license plate images. Given the significant characteristic of the license plate, a feature-specific prior model has been proposed and combined with the TV-SAR prior model. The target and the background are separated as far as possible during the reconstruction process, which makes it easier to use the gradient information to preserve the image edges and suppress noise. The HR image, the motion parameters, and the hyperparameters are estimated jointly by using the variational Bayesian inference estimator, and hence an unsupervised SR method for reconstructing license plate images is established.

In future work, we will focus on estimating the parameters , , and and finding the relationship between these parameters.

Appendix

By using the lower bound of the prior model , (17), the upper bound of is given byReplacing with , the following expression formulates the minimization problem for obtaining :

Due to , the following mathematical expressions can be derived by minimizing :

In order to obtain the explicit forms of the distributions and , and need to be calculated. To calculate them, is first approximated using its first-order Taylor expansion, because is nonlinear with respect to . Thus,where is the mean value of .

Then, is approximated as follows:where ,

By using (A.10), can be approximated aswhere , = 1, 2, 3, are the elements of the covariance of .

Substituting (A.11) into (A.3), the explicit form of the distribution is obtained as multivariate Gaussian; that is,withwhere is an diagonal matrix, in which the elements on the diagonal are . The following formula is obtained from (22) to calculate :

Use (A.11) again, and can be approximated aswhere , , and for .

Substituting (A.10) into (A.1), the explicit form of the distribution is obtained as multivariate Gaussian; that is,with

Obviously, the distributions for hyperparameters, (A.5)–(A.8), are Gamma distributions. Their corresponding mean values are given as follows:

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This research is supported by NSFC Grant no. 61370179, the Independent Innovation Research Funds of HUST no. 2013YLQX001, and the Fundamental Research Funds for the Central Universities, HUST, no. 2015YGYL012.