#### Abstract

Edge-preserving Bayesian restorations using nonquadratic priors are often inefficient in restoring continuous variations and tend to produce block artifacts around edges in ill-posed inverse image restorations. To overcome this, we have proposed a spatial adaptive (SA) prior with improved performance. However, this SA prior restoration suffers from high computational cost and the unguaranteed convergence problem. Concerning these issues, this paper proposes a Large-scale Total Patch Variation (LS-TPV) Prior model for Bayesian image restoration. In this model, the prior for each pixel is defined as a singleton conditional probability, which is in a mixture prior form of one patch similarity prior and one weight entropy prior. A joint MAP estimation is thus built to ensure the iteration monotonicity. The intensive calculation of patch distances is greatly alleviated by the parallelization of Compute Unified Device Architecture(CUDA). Experiments with both simulated and real data validate the good performance of the proposed restoration.

#### 1. Introduction

Image restorations have wide applications in remote sensing, radar imaging, tomographic imaging, microscopic imaging, astronomic imaging, digital photography, and so forth [1β4]. For linear and shift invariant imaging systems, the transformation from to is well described by the following additive linear degradation model [3, 4]: where , , and represent the degraded observed image, the original true image, and the corrupting white Gaussian noise with variance , respectively. is the PSF (Point Spread function) of the imaging system and is the linear convolution operator. Throughout this paper, we assume that the PSF is known or could be numerically determined or estimated.

Bayesian or maximum a posteriori (MAP) approach, within Markov Random Fields (MRFs) framework, can provide a stable solution of the ill-posed inverse image restorations through incorporating a priori information about the geometrical properties of an image [3β8]. Via modelling the unknown parameters in the prior probability density functions, prior information can also be interpreted as the regularization metrics, which measures the extent to which the contextual constraint assumption is violated. With the incorporated prior information, Bayesian approach is able to distinguish solutions from less desirable ones by transforming the original ill-posed problem into a well-posed one. Generally, we can build the following posterior probability for image restorations, Here, and denote the likelihood distribution and the prior distribution, respectively. and denote the corresponding likelihood energy and the prior energy function. is the notation of the energy function evaluated at pixel index . The partition function is a normalizing constant. is the global parameter controlling the balance between the likelihood energy and the prior energy. We can build the posterior energy function in the logarithm form The restored or deconvolved image can be obtained through the maximization of function .

The widely used quadratic membrane (QM) prior or the Tikhonov L2 regularization tend to smooth both noise and edge details and often lead to unfavorable oversmoothing in restorations [5]. On the other side, some edge-preserving Bayesian restoration methods have been proposed, which can be classified into three main categories: wavelet relevant regularization restorations, Bayesian restorations with nonquadratic prior energies, and Bayesian restorations with total variation (TV) prior. Wavelet relevant regularization restorations, with multiscale stochastic prior models, have been proposed in the research of edge-preserving restorations [8β11]. In [10], priors based on wavelet decompositions and heavy-tailed pdfs were incorporated into the EM restoration algorithm. In [11], Neelamani et al. proposed a Fourier-Wavelet hybrid restoration, which exploits the potentials of the sparse representation of noise for Fourier transform and the sparse representation of the coefficients in wavelet transform. All these wavelet regularization methods are based on the distribution modeling and thresholding of the decomposed wavelet coefficients [11]. Edge-preserving priors with nonquadratic energies were also proposed to preserve the edge details by using nonlinear inverse-proportional functions between edge existences and intensity differences [5, 12, 13]. The weighting matrices in nonquadratic prior Bayesian restorations preserve the edges by turning off or suppressing smoothing at appropriate locations. Another impressive research direction in this area is the total variation (TV)-based image restorations [7, 14, 15]. This kind of approach can also be viewed as TV priors with *the half-quadratic regularization* prior energies. With no enforced global image continuity, TV prior restorations often demonstrate good property of edge preserving.

The nonquadratic priors preserve the edges by determining the pixel-wise regularizations based on the intensity-difference information within local fixed neighborhoods. The regularizations on TV priors also come from the information of the local intensity gradients. For all these nonquadratic and TV priors, the local information of intensity difference can by no means provide effective regularizations to discriminate noise from image textures in different scales. So these edge-preserving approaches often fail to cope well with some complicated restorations with high noise contaminations, and tend to produce block artifacts around the continuous edges in those situations [16β21].

Objects in the images are reasonably assumed to be a composition of edges and backgrounds with intensities changing coherently over different scales. Thus, for the traditional prior model, the intensity differences between individual pixels within local neighborhoods are often insufficient to characterize objects [19, 20]. This limit elicited our previous work in [21] which introduced a spatial adaptive (SA) prior restoration approach. This SA prior works by adaptively including the relevant neighboring pixels and excluding the negative irrelevant ones within a large neighborhood. Though showing expressive results, the iterative restoration in [21] updates the prior weight using the latest image estimate in a one step late (OSL) way, which leads to inconsistent posterior energy function and nonconvergent iterations. In addition, the approach in [21] is greatly limited by the high computational cost in calculating all the patch distances for all the pixel pairs in each prior neighborhood. In this paper, to overcome the first problem, we applied a large-scale total patch variation (LS-TPV) prior model, which is built through the specification of the conditional probability for each pixel [22β25]. The joint maximization estimation of this constrained entropy function will lead to convergent image/weight restorations. To overcome the second problem, we introduce the parallel computation structure of CUDA to calculate the patch distances [26β28]. Furthermore, another benefit from the CUDA-accelerated parallelization is that it allows further performance enhancement for the LS-TPV prior restoration by using larger search neighborhoods.

In Section 2, a review of some previous prior models is illustrated, and after that we introduce the proposed LS-TPV prior model. In Section 3, we give a joint estimation algorithm for the proposed restoration. In Section 4, we perform comparative experiments with both simulated and real data. Relevant visual/quantitative results are also presented. Conclusion and discussion are given in Section 5.

#### 2. Prior Model

Conventionally, the value of is commonly computed through a weighted sum of potential functions of the differences between the pixels in the neighborhood :

Generally, different choices of the potential function lead to different priors. For the simple space-invariant QM prior, the potential function has the form . Some Edge-preserving nonquadratic priors could be chosen by adopting a nonquadratic potential function , such as the Huber potential function where is the threshold parameter [5, 13]. Such edge-preserving nonquadratic priors preserve structure information by choosing nonquadratic potential functions that increase less as the differences between the adjacent pixels become bigger. Weight is a positive value that denotes the interaction degree between pixel and pixel . In traditional prior models, it is simply considered to be inversely proportional to the distance between pixels and . So on a square lattice of image , in which the 2D positions of pixels and ( are, respectively, and , is usually calculated by the geometric distance .

Restorations using total variation (TV) prior take the prior energy as the following expression: where and are linear operators corresponding to the first-order horizontal and vertical intensity differences at pixel , respectively [7, 14, 15]. Since both the smooth and sharp edges tend to have similar prior energies, this TV prior energy given by (2.3) does not penalize weak discontinuities and often favors images with bounded variations.

Traditional quadratic priors, nonquadratic priors and TV prior, only provide local intensity information for Bayesian restorations. And the edge-preserving nonquadratic and TV priors, though being able to preserve edge information, tend to produce unfavorable block artifacts or false edges in high-noise situations.

To improve restoration quality, in [21] we proposed a spatial adaptive (SA) prior model, in which a spatial adaptive prior neighborhood is implemented by setting the weight to an 1-0 binary function to classify the pixels in neighborhood into the pixels relevant to the center pixel and those not. The building of the SA prior can be formalized as follows: Here, is the energy function for the SA prior. , which represents the classification of the neighboring pixels in the search neighborhood , can be computed via (2.5) and (2.6). , the number of the neighboring pixels with nonzero in the neighborhood , is the normalization factor for the different sizes of non-1 neighboring pixels in each . Parameter in (2.5) is the threshold parameter. The distance is determined by the distance measure (2.6) between the two translated comparing patches and , whose centers are, respectively, located at pixel and pixel .

Though proved to be effective in suppressing both noise and block artifacts, all the weights in SA prior need to be heuristically inferred using current image estimate in the way of one step late (OSL), which often leads to inconsistent posterior energy function with no guaranteed convergence [8]. To overcome this problem, in this paper, we proposed an LS-TPV prior model, in which regularization takes effect through penalizing the weighted distances/variations between the patches surrounding each neighboring pixel pair. The weights in each are considered variables with a nonlinear dependence on image . For the central pixel and its neighboring pixel in each prior neighborhood , is considered the similarity probability between the two patches and surrounding pixels and , respectively [24]. An entropy prior of all the weights for each is introduced into our model in the form of . Here as the βFrameβ model in [25], the constraint is . Then, based on joint MAP theorem [12], we can obtain Here, is the introduced pixel-wise Lagrange-multiplier parameter. For each pixel , with denoting the current image estimate, the prior energy of the LS-TPV prior is in the combined form of a weighted patch similarity energy, an entropy energy β and a normalization constraint . We can thus obtain the following posterior energy function: Here, as (1.4), is the global parameter controlling the balance between the likelihood energy and the prior energy. The patch distance is calculated as , in which denotes the Gaussian convolution kernel with standard deviation . is the value of Lagrange-multiplier parameter at pixel . The parameter controls the maximum entropy constraint on , which is routinely modulated with respect to the noise levels (large is routinely set to suppress more noise for high-noise situations).

#### 3. Restoration Algorithm

To obtain tractable maximization of the posterior probability using the proposed LS-TPV prior energy in (2.8), we use the algorithm of iterative coordinate descent (ICD), in which each element is sequentially updated by maximizing the log-style local conditional probability function [29]. In this way, the objective image and the weight are jointly estimated through alternative maximization of the with respect to and .

*(1) .*

denotes the fixed current image estimation, and each weight for each pixel is updated by solving the following singleton optimization problem:

With Lagrange-multiplier introduced, by joint solving (3.1) with respect to each and , we can then obtain :
where can be easily calculated using currently available . Each is determined not only by the similarity metrics between patches and , but also by the sum of all the similarity metrics between patches with the central pixels located in . So is different from .

*(2) .*

Here denotes the fixed current weight estimate. Omitting the terms with no , the becomes

To make a tractable maximization of (2.8), based on the theories of pseudolikelihood and iterated conditional mode (ICM) in [29], we can restore each by solving the following factorized problem:

Considering the convexity for and , we conclude that, given the fixed weight , the Hessian matrix of each in (3.4) is a negative definite. In this step of image updating, we applied the restoration algorithm in [4] to solve (3.4) to obtain the image estimate of each pixel . The in (2.8) is separately convex with respect to each and . Denoting and as the th and th iterated estimates, we can obtain following relation: , which implies convergence to one local maximum and can be obtained for this joint updating strategy [30].

In each iteration in the proposed restoration, it is very expensive to perform the pixel-wise calculations of all the and for each pair of translated patches in each over the whole image region. In the experiments, to save computation cost, we do not extend the sizes of over the whole image region, and set them to some appropriate sizes. We achieve significant acceleration by using parallel Compute Unified Device Architecture (CUDA) [26β28]. For each iteration in the restoration using the proposed LS-TPV prior model, all threads in this block grid structure execute simultaneously to perform all the involved pixel-wise operations. Furthermore, considering the symmetry property that distance equals to distance , we can further save one half computation cost by only calculating one of the two distances and .

#### 4. Experiments

Both simulated and real data are used in the experiments. To provide comparative results, we also performed Wiener deconvolution, ForWaRD restoration, Bayesian restoration using TV prior, Huber prior, and the SA prior in [21]. The Wiener deconvolution was performed by using the degraded image to estimate the power spectrum in Fourier domain. The TWIST algorithm in [7] and Newton-Raphson algorithm in [12] are used in the restorations using TV prior and Huber prior, respectively. Restored images from above Wiener filter method are taken as the initial images for all the Bayesian restorations. In experiment, the parameters needed to be preset include the for all Bayesian restorations, the threshold for Huber prior, the wavelet threshold, and decomposition level in the ForWaRD method, the for the SA prior, the *h* and the and settings for the proposed LS-TPV prior. In the simulation experiment, all the parameters are set based on SNR maximization criteria with SNR calculated as
Here , , , and denote the total pixel number in image, the restored image, the original true image, and the mean of the original true image, respectively. As to the real data experiment, which has no available true image for SNR calculations, all the parameters are set based on the tradeoff between edge preservation and noise suppression. and of sizes ( and ) is used in our work. A full study of the effects of the sizes of and on the resulting restorations is performed in Section 4.5.

Generally, the solution to the restoration is considered optimized when the iteration goes stable. So for the Bayesian restorations using TV prior, Huber prior, and SA prior in which the energy functions are maximized with respect only to image, we stop the iterations when the current image estimates satisfy the condition . For the restorations using the proposed LS-TPV prior in which the energy functions are maximized with respect to both image and weights, we stop the iterations when the current image and weight estimates satisfy the condition . In practical experiments, to reach the above iteration conditions, it is found that 411, 433, 475, and 486 iterations are required for the Bayesian restorations using TV prior, Huber prior, SA prior, and the proposed LS-TPV prior, respectively.

##### 4.1. Simulation Experiment

In this section, simulated experiments with βLenaβ image (intensity range [β98ββββ117]) are performed. In the simulation, a uniform PSF is assumed, and Gaussian noise with variance = 0.1663 is added. The PSF used in this experiment is normalized to 1. Figure 1(b) is the degraded image simulated by (1.1) using the PSF with the white Gaussian noise. The parameters for different restorations are listed in Table 1. All the restored images are illustrated in Figures 1(c)β1(h), respectively.

**(a) Original image**

**(b) Degraded image**

**(c) Wiener restoration**

**(d) TV prior restoration**

**(e) Huber prior restoration**

**(f) ForWaRD restoration [11]**

**(g) SA prior restoration [21]**

**(h) LS-TPV prior restoration**

The restored images from TV prior restoration and Huber prior restoration are, respectively, shown in Figures 1(d) and 1(e). Figure 1(f) shows the restored image using the ForWaRD method. The result of Bayesian restoration using the SA prior is illustrated in Figure 1(g). For the Bayesian restoration using the proposed LS-TPV prior, the algorithm in Section 3 is used, and the corresponding restored image is shown in Figure 1(h).

Through the results, we find the ForWaRD method and all the Bayesian restorations (Figures 1(d)β1(h)) can overcome the ring effects in Wiener deconvolution (Figure 1(c)). The block artifacts (the left arrows), which are observed in the ForWaRD approaches and the restorations using Huber prior and TV prior, are effectively suppressed using the restorations using the SA prior and the proposed LS-TPV prior. The LS-TPV prior (Figure 1(h)) presents a further visual enhancement over the SA prior (Figure 1(g)) in block suppressing (the left arrows) and edge preserving (the right arrows). Figure 2 plots the profiles along one specified track (the red line in the hair region in βLenaβ image in Figure 2) for different restorations, and we can see that the profile from the proposed restoration (the blue profile) has the closest match with the profile of the true βLenaβ image (the red profile) of those from other restorations. Table 2 shows that, compared to other methods, the proposed LS-TPV prior restoration can lead to restored images with a higher SNR.

##### 4.2. Real Data Experiment

In this section, a real blurred astronomical βmoonβ image with intensity range [β77ββββ170] is obtained from βhttp://www.iceinspace.com.au/β. Here we use a normalized Gaussian PSF, which was tested to be effective. The parameters for different restoration approaches are listed in Table 3.

Figure 3(b) is the restored result from Wiener deconvolution. The restored images from TV prior restoration and Huber prior restoration are, respectively, illustrated in Figures 3(c) and 3(d). Figures 3(e) and 3(f) show the restored images from the ForWaRD method and the Bayesian restoration using SA prior. As to the Bayesian restoration using the LS-TPV prior, the restored image is shown in Figure 3(g). Also, we can note that the ForWaRD method and all the Bayesian restorations (Figures 3(c)β3(g)) are free of the ring effects in Wiener deconvolution (Figure 3(b)). Also, the block artifacts observed in the restorations using Huber prior and TV prior are effectively suppressed in the restorations using SA prior and the LS-TPV prior (the lower red arrows in Figures 3(f) and 3(g)). The oversmoothing in the ForWaRD restoration can not be observed in the restorations using the SA prior and the proposed LS-TPV prior (the upper arrows in Figures 3(e), 3(f), and 3(g)). Compared to the restoration using the SA prior, the restoration using the LS-TPV prior also shows a better visual performance in preserving edges (the upper arrows in Figures 3(f) and 3(g)).

**(a) Degraded image**

**(b) Wiener restoration**

**(c) TV prior restoration**

**(d) Huber prior restoration**

**(e) ForWaRD restoration [11]**

**(f) SA prior restoration [21]**

**(g) LS-TPV prior restoration**

##### 4.3. Computation Cost Comparisons

Table 4 lists the recorded CPU times (in seconds) of different restorations for the experiments with simulated data and real data, respectively. Please note that the notations *simu* and *real* in Table 4 correspond to experiments with simulated data and real data, respectively. We can see that the restorations using SA prior and the LS-TPV prior are much computationally intensive than other methods. We define LS-TPV prior res**1**, LS-TPV prior res**2,** and LS-TPV prior res**3** as the original serial restorations, the restorations after optimization, and the restorations after both optimization and parallelization. Table 4 shows that, with respect to the original serial version, the proposed restorations after optimization are about 3 times faster, and the restorations after both optimization and parallelization are about 50 times faster. But we can not claim that the LS-TPV prior restoration needs less computation costs than other methods since that we did not design the parallelized versions for them.

##### 4.4. Algorithm Monotonicity

Here, to analyze the monotonicity of the joint image/weight algorithm proposed in Section 3, we calculated the posterior function energy and SNR with respect to the iteration numbers for the proposed restorations. The total iteration number is set to 500. As illustrated in Figure 4, we can see that both the calculated posterior function energy and the SNR increase monotonically during the whole iteration.

**(a)**

**(b)**

##### 4.5. Study on Neighborhood and of Different Sizes

To study the roles of the sizes of and in restorations, we perform restorations using the proposed prior with and of different sizes in above simulation experiment with βLenaβ image. To characterize local image structures, we do not set larger than . Considering the fact that the total pixel numbers are different for different comparing patches (49 for , 9 for , and 1 for ), parameter should also change with the restorations using different patches . The hyperparameter and parameter for the proposed prior are chosen by hand to produce the best stable images in terms of SNR maximization. Combinations of and with different sizes and the corresponding computed SNR and the recorded CPU time cost (with CUDA parallelization) are all listed in Table 5.

We can see that higher SNR can be obtained when enlarging and from to and . We also note in Table 5 that more CPU time is needed for the restorations using the proposed prior with larger and . We should set the sizes of and based on the tradeoff between performance and computation cost. We find in Table 5 that no significantly quantitative SNR differences are made between and larger . In fact, as the iteration proceeds, more global information beyond the will be incorporated into the regularization of the pixels.

#### 5. Conclusions and Future Work Plan

In this paper on image restoration, we proposed an LS-TPV prior which penalizes the total distances between neighboring patches through a constrained mixture prior mode. A convergent joint image/weight updating algorithm, which estimates image and weights sequentially, is proposed to overcome the heuristical OSL weight determination for the SA prior restoration in [21]. We can see that, in addition to providing effective regularization, the proposed approach can lead to stable iteration with convergence guaranteed.

The application of the proposed prior model needs a pixel-wise computation of the distances between neighboring patches over a large region, which implies high computation cost of restorations. In this paper, remarkable advance in shortening computation time is achieved by optimizing patch distance computation and replacing the original pixel-wise calculation of patch distances by CUDA framework.

Further work includes applying the proposed approach in blind image deconvolutions, further justifying the proposed approach by using more image-quality measures based on human visual system, and further accelerating the computation by parallelize the PSF convolution in restorations.

#### Acknowledgments

This research was supported by National Basic Research Program of China under Grant (2010CB732503), National Natural Science Foundation under Grant (81000636), and the Project supported by Natural Science Foundation of Jiangsu Province (BK2009012).