Abstract
Super-resolution is a fusion process that reconstructs a high-resolution image from a set of low-resolution images. This paper proposes a novel approach to image super-resolution based on total variation (TV) regularization. We apply the Douglas-Rachford splitting technique to the constrained TV-based variational SR model, separating it into three subproblems that are easy to solve. We then derive an efficient and effective iterative scheme that includes a fast iterative shrinkage/thresholding algorithm for the denoising problem, a very simple noniterative algorithm for the fusion part, and linear systems of equations for the deblurring process. Moreover, to speed up convergence, we provide an accelerated scheme based on a preconditioned design of the initial guess and the forward-backward splitting technique, which yields linear systems of equations with a nice structure. The proposed algorithm shares a remarkable simplicity together with a proven global rate of convergence that is significantly better than the currently known lagged diffusivity fixed-point iteration algorithm and the fast decoupling algorithm that exploits the alternating minimization approach. Experimental results are presented to illustrate the effectiveness of the proposed algorithm.
1. Introduction
Multiframe image super-resolution (SR) is one of the most promising techniques in the image processing community, since it enables us to obtain an image with a resolution that exceeds the hardware limitation, for example, the number of pixels in a charge-coupled device (CCD). Super-resolution is the process of combining a sequence of low-resolution (LR) noisy blurred images to produce a high-resolution (HR) image of the scene. It overcomes the inherent resolution limitation by bringing together the additional information from each image.
Generally, SR techniques can be divided into two broad categories: frequency domain methods and spatial domain methods. Most of the earlier SR work was developed in the frequency domain using the discrete Fourier transform (DFT), such as the work of Tsai and Huang [1] and Kim et al. [2, 3], where high-frequency information is extracted from low-frequency data in the given LR frames. Many other popular frequency domain methods were proposed in the discrete cosine transform (DCT) domain and the wavelet domain [4–7]. Although the frequency domain methods are intuitively simple and computationally cheap, they are extremely sensitive to model error [8], limiting their use. As a spatial domain alternative, Patti et al. [9] and Stark and Oskoui [10] proposed the projection onto convex sets (POCS) algorithm. A related method, iterative back projection, was developed in [11, 12]. Although projection-based algorithms are usually robust to noise and allow some modeling flexibility, they are also known for their low rate of convergence. A hybrid maximum likelihood (ML)-maximum a posteriori (MAP)-POCS method was proposed in [13]. Based on these basic reconstruction methods, researchers have produced many extended algorithms, such as the nonlocal-means (NLM) based approach [14], the multidimensional kernel regression-based approach [15], the joint formulation of reconstruction and registration [16–19], and algorithms for multispectral and color [20, 21], hyperspectral [22], and compressed [23, 24] sequences.
The spatial domain methods discussed so far are generally confronted with the problems of slow convergence and expensive computation. To apply SR algorithms in practical situations, many novel and powerful algebraic techniques have been proposed to reduce the computational complexity. For example, the authors of [25–29] proposed efficient preconditioners to accelerate the convergence of a conjugate gradient minimization algorithm. Xiao et al. [30] proposed an efficient SR reconstruction algorithm that employs the Armijo rule to identify the step length instead of an exact line search and replaces the numerical approximation of the gradient of the MAP objective function with an analytic approximation. For a pure translational motion model and a common space-invariant blur, Elad and Hel-Or [31] proposed a novel SR algorithm that separates fusion and deblurring. The fusion method is a very simple noniterative algorithm, while preserving optimality in the ML sense. Farsiu et al. [32] proposed an efficient two-stage method for minimizing a novel framework combining a robust-norm fidelity term and a bilateral prior, leading to an initial median shift-and-add operation on Bayer-filtered LR data followed by a deblurring and interpolating stage. Huang et al. [33] proposed a fast decoupling algorithm by exploiting the alternating minimization approach.
In this paper, we propose a general framework for multiple shifted and linear space-invariant noisy blurred LR image frames that subsumes several well-known SR models. The proposed model uses TV regularization to formulate SR image reconstruction as an optimization problem.
The purpose of this paper is to study an efficient TV-based SR reconstruction algorithm. There are two major contributions in this paper. As the first contribution, we propose an efficient algorithm that takes full advantage of the problem structures; that is, the geometric motion matrices, the blur matrix, and the first-order finite-difference matrix all have block-circulant-circulant-block (BCCB) structure under the periodic boundary condition. As such, we propose to compute the minimizer of our SR model by applying Douglas-Rachford splitting (DRS) techniques, respectively, the alternating direction method of multipliers (ADMM), which separates the SR model into three subproblems that are easy to solve. As the second contribution, we provide an accelerated scheme based on a preconditioned design of the initial value and forward-backward splitting (FBS) to speed up convergence. Our method separates the SR treatment into measurement fusion, denoising, and deblurring. The fusion part is shown to be a very simple noniterative algorithm. The denoising problem can be solved by a linear-time shrinkage operation. The fast Fourier transform (FFT) is employed to solve the deblurring problem. Finally, experimental results are presented to illustrate the effectiveness of the proposed algorithm.
The outline of this paper is as follows. In Section 2, we present the image observation model of the SR problem and then propose a TV-based SR model. In Section 3, we present an efficient SR reconstruction algorithm. Experimental results are provided in Section 4. Finally, concluding remarks are given in Section 5.
2. Problem Formulation
2.1. Observation Model
The image observation model is employed to relate the desired referenced HR image to all the observed LR images. Consider the desired HR image of size written in lexicographical notation as the vector , where . Let the parameters and be the subsampling factors in the horizontal and vertical directions, respectively; each observed LR image then has size . Thus, the LR image can be represented as , for and . A popular model assumes that LR images are generated from the HR image through a sequence of operations that includes (i) geometric motions , (ii) a linear space-invariant blur , (iii) a subsampling step represented by , and finally (iv) an additive white Gaussian noise with zero mean that represents both measurement noise and model mismatch [32]. All of these are linear operators, each represented by a matrix multiplying the image it operates on. We assume hereafter that and are identical for all images in the sequence. This model leads to the following set of equations, where all images are ordered lexicographically: where represents the imaging system.
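As a concrete illustration of this observation model, the following sketch generates one LR frame from an HR image via a circular integer shift (the motion operator), periodic convolution with a PSF (the blur), subsampling (the decimation), and additive Gaussian noise. The function name `observe` and all parameter values are illustrative, not from the paper; periodic boundary handling matches the BCCB assumption used later.

```python
import numpy as np

def observe(x, shift, psf, factor, noise_std=0.0, rng=None):
    """Simulate one LR frame: integer shift (motion), circular blur (PSF),
    decimation by `factor`, and additive white Gaussian noise."""
    # Motion: translational, modeled here as a circular integer shift
    shifted = np.roll(x, shift, axis=(0, 1))
    # Blur: space-invariant, implemented as periodic convolution via FFT
    psf_pad = np.zeros_like(x, dtype=float)
    psf_pad[:psf.shape[0], :psf.shape[1]] = psf
    blurred = np.real(np.fft.ifft2(np.fft.fft2(shifted) * np.fft.fft2(psf_pad)))
    # Decimation: subsampling by the resolution factor in each direction
    low = blurred[::factor, ::factor]
    # Noise: additive white Gaussian with zero mean
    if noise_std > 0:
        rng = rng or np.random.default_rng(0)
        low = low + rng.normal(0.0, noise_std, low.shape)
    return low

x = np.arange(64, dtype=float).reshape(8, 8)
# Delta PSF (identity blur), shift down by one row, decimate by 2
y = observe(x, shift=(1, 0), psf=np.array([[1.0]]), factor=2)
```

With a delta PSF the blur is the identity, so the LR frame is just the shifted image sampled on the coarse grid.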
The recovery of from is thus an inverse problem, combining motion compensation, denoising, deblurring, the scaling-up operation, and fusion of the different images, all merged into one. The quality of the desired SR image depends on the assumption that , and are known, or on the accuracy in estimating the degradation operators. Throughout this paper, we assume that , and are known. The decimation depends on the resolution scale factor that we aim to achieve and, as such, is easily fixed. In this work, we assume that this resolution factor is an integer on both axes. In most cases, the blur refers to the camera point spread function (PSF) and is therefore also accessible. Even if this is not the case, the blur typically depends on a few parameters, which, in the worst case, can be set manually. To be consistent with the work of Elad and Farsiu [31, 32], we focus on the simplest of the motion models, namely, the translational model. Reference [34] details several reasons for this choice. We believe that an in-depth study of this simple case allows much insight to be gained about the problems inherent to SR image reconstruction.
2.2. TV Regularization-Based SR Model
In general, SR is an ill-posed problem either because the information contained in the observed LR images is not sufficient or because it is highly sensitive to noise. Procedures adopted to stabilize the inversion of an ill-posed problem are called regularization. In such a stabilization scheme, we reconstruct the original HR image by finding the minimizer of some appropriate functional, where is a regularization term that includes prior information about the original image, denotes the data fitting term depending on the given LR image , and is a positive parameter that controls the tradeoff between the two terms in the minimization.
In general, the data fitting term can be deduced from a MAP estimation. If is corrupted by additive white Gaussian noise, the MAP estimation leads to the data fitting term , where and denote bounded and open domains of the continuous LR image and HR image in , respectively. For the regularization term, the popular choice is the total variation seminorm , first proposed for image denoising [35], because the TV norm better preserves sharp edges and object boundaries, which are usually the most important features to recover.
As stated previously, to invert the degradation process in (2.1), we formulate a TV regularization model that requires solving the variational problem: where is the space of functions of bounded variation. Note that the fitting term in (2.3) is strictly convex and coercive, and the TV regularization term is also convex (though not strictly so) and lower semicontinuous. The objective function is therefore globally strictly convex and possesses a unique minimizer. In terms of optimization, these are desirable properties.
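Since the displayed formula did not survive extraction, the following is a standard-form sketch of a TV-regularized SR model consistent with the description above; the symbols $u$ (HR image), $y_k$ (LR frames), $D$, $H$, $F_k$ (decimation, blur, motion from Section 2.1), and the weight $\lambda$ are assumptions, not the paper's exact notation:

```latex
\min_{u \in \mathrm{BV}(\Omega)} \;
\frac{\lambda}{2} \sum_{k=1}^{K} \bigl\| D H F_k u - y_k \bigr\|_2^2
\;+\; \int_\Omega |\nabla u| \, dx
```

The first term is the quadratic data fit over all $K$ LR frames; the second is the total variation seminorm acting as the regularizer.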
3. An Operator Splitting Approach to TV-Based Super-Resolution
To solve for the desired HR image in (2.3), a commonly used method is gradient descent [19–32, 36]. Although this approach is simple, the nonlinearity and poor conditioning of the problem make it very slow. A more efficient class of solvers is based on a linearized gradient method that solves the associated Euler-Lagrange equation via a lagged diffusivity fixed-point iteration [28, 29]. In each iteration of the linearized gradient method, a linear system needs to be solved, which becomes more and more difficult as becomes more ill-conditioned. Another group of algorithms is based on the well-known variable-splitting and penalty techniques in optimization. These ideas have found wide application in image processing, such as the works in [37–41]. Recently, Huang et al. [33] modified the SR model (2.3) by adding a quadratic term to obtain a simpler alternating minimization algorithm. The drawback of this method is the same as that of the lagged diffusivity fixed-point method.
To efficiently solve the SR problem (2.3), in this section we show that the operator splitting method can be used to divide problem (2.3) into subproblems that can be solved in sequence, each of which permits a closed-form solution. Among current splitting methods, the most prominent schemes are forward-backward splitting, double-backward splitting, Peaceman-Rachford splitting, and Douglas-Rachford splitting. In this paper, we focus on forward-backward splitting and Douglas-Rachford splitting. One may refer to [42, 43] and the references therein for more details.
3.1. FBS and DRS
Let be a real Hilbert space, and let and be two set-valued operators. We assume that and are maximal monotone; that is, their resolvents and exist and are firmly nonexpansive. The fundamental problem we consider can be written in the form of a common zero inclusion problem
The idea of the forward-backward splitting algorithm is that, for any constant , we have This leads to the following result.
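In standard notation, the fixed-point relation behind FBS can be sketched as follows; this is a hedged reconstruction consistent with the surrounding discussion (the paper's display is elided), with $A$, $B$ maximal monotone, step size $\gamma > 0$, and $J_{\gamma A} = (I + \gamma A)^{-1}$ the resolvent:

```latex
0 \in A(x) + B(x)
\;\Longleftrightarrow\;
x - \gamma B(x) \in x + \gamma A(x)
\;\Longleftrightarrow\;
x = J_{\gamma A}\bigl(x - \gamma B(x)\bigr),
\qquad
x_{n+1} = J_{\gamma A}\bigl(x_n - \gamma B(x_n)\bigr).
```

The iteration takes a forward (explicit) step on $B$ followed by a backward (resolvent) step on $A$.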
Theorem 3.1 (FBS [43, Theorem ]). Suppose that is maximal monotone and is a monotone operator such that is firmly nonexpansive for some . Furthermore, assume that a solution of (3.1) exists. Then, for every starting element and step size , the forward-backward splitting algorithm converges weakly to an element of the set of solutions .
We now describe the Douglas-Rachford splitting scheme, which does exhibit general convergence, at least when used with a constant step size in finite-dimensional spaces. To introduce it, we can rewrite the fixed point relation (3.2) as follows: This leads to the following result.
Theorem 3.2 (DRS [43, Theorem ]). Let be maximal monotone, and assume that a solution of (3.1) exists. Then, for any elements and and any step size , the following Douglas-Rachford splitting algorithm converges weakly to an element : Furthermore, it holds that satisfies . If is finite-dimensional, then the sequence converges to .
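The DRS iteration referenced in Theorem 3.2 can be sketched in standard notation as follows; this is a hedged reconstruction of the elided display, using the same $A$, $B$, $\gamma$, and resolvent $J_{\gamma \cdot} = (I + \gamma \cdot)^{-1}$ as above:

```latex
t_{n+1} = J_{\gamma A}\bigl(2\, J_{\gamma B}(t_n) - t_n\bigr) + t_n - J_{\gamma B}(t_n),
\qquad
x_n = J_{\gamma B}(t_n).
```

The auxiliary sequence $(t_n)$ converges weakly to a fixed point $\hat t$, and $\hat x = J_{\gamma B}(\hat t)$ solves the inclusion $0 \in A(\hat x) + B(\hat x)$.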
3.2. Proposed Algorithm
In this subsection, we apply the DRS to the dual problem of the minimization functional (2.3). We first rewrite the energy functional (2.3) in a discrete form with , where is the discrete total variation of . Here, denotes the Euclidean norm, and is given by where and are forward difference operators with periodic boundary conditions and . Therefore, is a BCCB matrix.
Consider the equivalent problem of (3.6): The Lagrangian for problem (3.8) is where the dual variable can be thought of as a vector of Lagrange multipliers. Therefore, the dual problem of (3.8) is where (resp., ) denotes the conjugate function of (resp., ).
Define operators and by By Fermat's rule, solving the dual problem (3.10) is equivalent to finding such that
By formally applying DRS (3.5) to (3.12) with as the step size, the ADMM iterations [44–51] are given by As pointed out by Setzer in [43, 49], this iteration scheme coincides with the DRS algorithm (3.5) with , , and . Since the operators and are proper, lower semicontinuous, and convex, the operators and are maximal monotone; see [43]. According to Theorem 3.2, the sequence converges to a solution of (3.6). We also note that the above ADMM algorithm coincides with the alternating split Bregman algorithm proposed by Goldstein and Osher [40] with and .
3.2.1. -Subproblem
It is not difficult to show that the minimization of (3.13) with respect to is equivalent to solving a two-dimensional problem of the form for which the unique minimizer is given by the following two-dimensional shrinkage formula: where the convention is followed. Here, the shrinkage formula (3.17), which serves as a nonlinear low-pass filter on the restored HR image, is tantamount to a denoising treatment.
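A minimal sketch of the two-dimensional shrinkage operation described above, applied pixel-wise to a gradient field: each 2-vector is shortened by the threshold (standing in for the paper's $1/\beta$) or set to zero, with the convention $0\cdot(0/0)=0$. The function name `shrink2` is illustrative.

```python
import numpy as np

def shrink2(v1, v2, thresh):
    """Two-dimensional shrinkage: for each pixel, shorten the vector
    (v1, v2) by `thresh` along its own direction, clipping at zero.
    The 0 * (0/0) = 0 convention is enforced by the `where` guard."""
    norm = np.sqrt(v1**2 + v2**2)
    scale = np.maximum(norm - thresh, 0.0) / np.where(norm > 0, norm, 1.0)
    return scale * v1, scale * v2

# A vector of length 5 shrunk by 1 keeps its direction at length 4
w1, w2 = shrink2(np.array([3.0]), np.array([4.0]), 1.0)
```

Each component is evaluated in linear time per pixel, which is why the denoising subproblem is cheap.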
3.2.2. -Subproblem
Subproblem (3.14) is quadratic in , and the minimizer is given by the normal equations where denotes the transpose of .
Note that in (3.18) is a BCCB matrix and can be diagonalized by the FFT. Moreover, with periodic boundary conditions, both and are BCCB matrices. Exploiting the fact that the product of two BCCB matrices commutes, we get that , . Then, ( is a diagonal matrix [31]). However, does not have BCCB structure. Therefore, we cannot apply an FFT implementation to (3.18) directly, as is done in [37, 38]. The quadratic term in (3.14) that couples the variable through the matrix makes the algorithm computationally expensive. There are some techniques to overcome this problem. In [50], the authors introduced the three constraints , , and and used the alternating split Bregman technique to maximally decouple the variables. In [51], the linear system was solved noniteratively by using the Sherman-Morrison-Woodbury (SMW) matrix inversion formula and the FFT to diagonalize the Hessian matrix of the energy functional. In this paper, we use the forward-backward splitting method (3.3) to efficiently solve the -subproblem (3.14), which decouples the variable from the constraint matrix and gives the Hessian matrix of the energy functional a BCCB structure.
Let and . Then, with . Define operators and ; the FBS method (3.3) with as the step size applied to (3.14) leads to the iterative scheme The minimizer is given by the normal equations Under periodic boundary conditions, (3.20) can be solved by two FFTs, which simultaneously perform the LR measurement fusion and the deblurring treatment. In the next subsection, the fusion part will be shown to be a simple noniterative algorithm.
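The FFT diagonalization used to solve such normal equations under periodic boundary conditions can be sketched as follows. The specific system $(H^\top H + \mu I)f = r$, with $H$ a circular convolution, is a simplified stand-in for (3.20), whose exact coefficients are elided in the text; the BCCB matrix is never formed, since its eigenvalues are the Fourier transform of the PSF.

```python
import numpy as np

def solve_bccb(rhs, psf_pad, mu):
    """Solve (H^T H + mu I) f = rhs in O(N log N), where H is circular
    convolution with the (zero-padded) PSF, hence BCCB under periodic
    boundary conditions. Eigenvalues of H^T H are |fft2(psf)|^2."""
    Hf = np.fft.fft2(psf_pad)
    f = np.fft.ifft2(np.fft.fft2(rhs) / (np.abs(Hf)**2 + mu))
    return np.real(f)

r = np.ones((4, 4))
psf = np.zeros((4, 4)); psf[0, 0] = 1.0   # delta PSF: identity blur
f = solve_bccb(r, psf, mu=1.0)            # reduces to rhs / 2 here
```

Only a forward and an inverse FFT are needed, matching the "two FFTs" cost noted above.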
Notice that is Lipschitz continuous with Lipschitz constant , where (resp., ) is the matrix norm of (resp., ). By [41, Theorem ], is firmly nonexpansive. Then, the convergence of is ensured by Theorem 3.1, while .
3.2.3. Optimal Initial Guess
First, borrowing the idea of [31], we take the initial data from the maximum likelihood (ML) estimation; that is, is the minimizer of the following optimization problem: The gradient descent algorithm suggests the following iterative equation for the solution of (3.21): where denotes the step size. Let us define the blurred super-resolution image by . Multiplying both sides of (3.22) by , we get where and . According to [52], the steady-state solution of (3.23) is given by We note that copies the values from the LR grid to the HR grid after proper shifting and zero filling, and copies a selected set of pixels in the HR grid back onto the LR grid (Figure 1 illustrates the effect of the upsampling and downsampling matrices and ). Neither of these two operations changes the pixel values. It is easy to show that is a diagonal matrix. Each diagonal entry in corresponds to one pixel in the super-resolution image. Its value is a nonnegative integer, counting the number of measurements contributing to that pixel. The fusion image is simply the addition of the measurements after proper zero-filling interpolation and motion compensation. Thus, is none other than the pixel-wise average of the measurements. Therefore, the noise in is reduced due to the averaging.
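The noniterative fusion just described can be sketched as a shift-and-add procedure: each LR frame is zero-filled onto the HR grid, its shift is undone, and the accumulated sum is divided by the per-pixel measurement count (the diagonal matrix above). The function name `shift_and_add` and the shift convention (a frame observed with shift `s` is undone by rolling back by `-s`) are illustrative assumptions.

```python
import numpy as np

def shift_and_add(frames, shifts, factor, hr_shape):
    """Noniterative fusion: zero-fill each LR frame onto the HR grid,
    undo its (circular, integer) shift, and divide by the per-pixel
    measurement count; HR pixels with no measurement stay zero."""
    acc = np.zeros(hr_shape)
    cnt = np.zeros(hr_shape)
    for y, s in zip(frames, shifts):
        up = np.zeros(hr_shape)
        up[::factor, ::factor] = y                       # zero-filling upsample
        up = np.roll(up, (-s[0], -s[1]), axis=(0, 1))    # motion compensation
        mask = np.zeros(hr_shape)
        mask[::factor, ::factor] = 1.0
        mask = np.roll(mask, (-s[0], -s[1]), axis=(0, 1))
        acc += up                                        # add measurements
        cnt += mask                                      # count contributions
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)

hr = np.arange(16.0).reshape(4, 4)
# Two noise-free frames: unshifted, and shifted down by one HR pixel
frames = [hr[::2, ::2], np.roll(hr, (1, 0), axis=(0, 1))[::2, ::2]]
fused = shift_and_add(frames, [(0, 0), (1, 0)], 2, (4, 4))
```

In this noise-free example every covered HR pixel is recovered exactly, since each is the average of identical measurements.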
Because , a Wiener filter is applied to . The restored image is then taken as the initial data .
As stated previously, the precondition design procedure for the initial data is summarized in Algorithm 1 as follows:
3.2.4. Algorithm Description
To sum up the above arguments, the complete resulting algorithm is summarized in Algorithm 2 as follows.
3.2.5. Some Complexity Notes
It is clear that the complexity of the proposed algorithm mainly comprises three parts. The calculations in (3.17) and (3.15) have linear-time complexity of order for an image. Hence, the -subproblem (3.16) and can be solved quickly. The solution of the -subproblem (3.20) requires two FFTs (including one inverse FFT), which has a total complexity of the order of .
4. Experimental Results
In this section, we present some experimental examples to demonstrate the performance of our method. We use three test images ("Lena," Figure 2(b); "Cameraman," Figure 4(b); and "Fingerprint," Figure 8(b)) for the synthetic tests. A sequence of LR frames of pixels is generated from the original image as follows. First, the original image is shifted by one pixel in the vertical direction. Then, to simulate the effect of the camera PSF, the shifted image is convolved with a Gaussian blur kernel. The resulting image is subsampled by a factor of 4 in each direction. The same approach with different motion vectors in the vertical and horizontal directions is used to produce eight LR images from the original scene. The resulting LR frames are corrupted with white Gaussian noise. All experiments were performed under Windows XP and MATLAB v7.1 running on a desktop with an Intel Core dual processor at 3.00 GHz and 4.00 GB of memory.
For an objective comparison between the original HR and SR reconstructed images, we measure the peak signal-to-noise ratio (PSNR) and the relative error (ReErr), defined as where and are the original and SR reconstructed images, respectively, and represents the image size.
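Since the paper's exact formulas are elided, the following sketch uses the common definitions of PSNR (in dB, with peak value 255) and relative error (Frobenius norm ratio); these are standard choices assumed to match the paper's metrics.

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 log10(peak^2 / MSE).
    Common definition; the paper's exact formula is elided in the text."""
    mse = np.mean((ref - rec)**2)
    return 10.0 * np.log10(peak**2 / mse)

def rel_err(ref, rec):
    """Relative error ||rec - ref||_F / ||ref||_F."""
    return np.linalg.norm(rec - ref) / np.linalg.norm(ref)

# A constant image perturbed by 1 gray level everywhere: MSE = 1
a = np.full((8, 8), 100.0)
b = a + 1.0
```

For this perturbation the PSNR equals $20\log_{10}(255) \approx 48.13$ dB and the relative error is exactly $0.01$.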
We compare the proposed method (operator splitting method, OSM) with the lagged diffusivity fixed-point iteration (LDFP) [28, 29] and the alternating minimization method (AMM) [33]. In all the tests, we set the initial guess . The parameters of all three methods are chosen based on the tradeoff between reconstruction quality and computing time. In the proposed method, the values of and are fixed at 1.5 and 4, respectively. The stopping criterion for all the methods is that the relative difference (ReDiff) between successive iterates of the SR reconstructed image satisfies the following inequality:
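The relative-difference stopping rule can be sketched as follows; the tolerance value is a placeholder, since the paper's threshold is elided in the text.

```python
import numpy as np

def converged(x_new, x_old, tol=1e-4):
    """Stop when the relative difference between successive iterates
    falls below `tol` (placeholder value; the paper's is elided)."""
    return np.linalg.norm(x_new - x_old) / np.linalg.norm(x_old) < tol

a = np.ones((4, 4))
```

A relative change of 1e-5 satisfies the default tolerance, while a 10% change does not.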
In the first test, we apply a Gaussian kernel with window size , standard deviation 0.5, and different noise levels (noise variance ). One of the LR frames is presented in Figures 2–5 and 8(a), respectively. The corresponding reconstructed images for the three methods are shown in Figures 2, 4, 8(c)–8(e) and Figures 3, 5(b)–5(d), respectively. Our second test uses a Gaussian kernel with support size , standard deviation 1.7, and noise variance 5. One of the LR images is presented in Figures 6 and 7(a), respectively. The corresponding reconstructed images from the three methods are shown in Figures 6, 7(b)–7(d). We can see that the SR images reconstructed by the different methods are very similar in visual perception. In Table 1, we compare their reconstruction performance in terms of PSNR and ReErr. On one hand, we see from the table that both the PSNRs and ReErrs of the images reconstructed by the proposed method are better than those of the LDFP and AMM methods. On the other hand, it is clear from Table 1 that the proposed method is more efficient (in iterations and computation times) than the other two methods.
In Figure 9, we show that the convergence of the proposed method is faster than that of the LDFP and AMM methods. The -axis refers to the number of iterations. The -axis in Figures 9(a) and 9(b) refers to the PSNR, and the -axis in Figures 9(c) and 9(d) refers to the relative difference between successive iterates of the reconstructed image. These figures show that the proposed method provides good-quality reconstructed images in an efficient manner.
5. Conclusion
This paper proposes a general framework for multiple shifted and linear space-invariant blurred LR image frames that subsumes several well-known SR models. The proposed model uses total variation (TV) regularization to formulate SR image reconstruction as an optimization problem. We then propose an efficient algorithm that takes full advantage of the problem structures. Specifically, we compute the minimizer of our SR model by applying DRS techniques (resp., ADMM), which separate the SR model into three subproblems that can be easily solved. Moreover, to speed up convergence, we provide an accelerated scheme based on a preconditioned design of the initial value and FBS. The proposed algorithm reduces the computational complexity. The good performance of the proposed algorithm has been tested on synthetic data sets of several images degraded with Gaussian blur and contaminated with white Gaussian noise. Numerical results indicate that the algorithm recovers edges and small features that are not apparent in the degraded input images. The experimental results indicate that the proposed algorithm is effective in terms of both objective measurements and visual evaluation.
Acknowledgments
This work was supported in part by the Natural Science Foundation of China under Grant no. 60802039, by the Doctoral Foundation of Ministry of Education of China under Grant no. 20070288050, NUST Research Funding under Grant no. 2010ZDJH07, and also sponsored by Natural Science Foundation of Jiangsu under Grant no. BK2010488 and βQing Lan Projectβ of Jiangsu Province. The authors would like to express their gratitude to the anonymous referees for making helpful and constructive suggestions.