Abstract
In image superresolution reconstruction (SRR), a prior can be employed to solve the ill-posed problem. However, the prior model is usually selected empirically and characterizes the entire image, so the local features of the image cannot be represented accurately. This paper proposes a feature-driven prior model that relies on the features of the image and introduces a block-based maximum a posteriori (MAP) framework under which the image is split into several blocks to perform SRR. The local features of the image can therefore be characterized more accurately, which results in a better SRR. For the process of recombining the superresolved blocks, we also design a border-expansion strategy to remove a byproduct, namely, cross artifacts. Experimental results show that the proposed method is effective.
1. Introduction
Superresolution reconstruction (SRR) is a technique that constructs one high-resolution image from degraded low-resolution images [1]. SRR plays a preprocessing role in image engineering, and it is important for providing accurate data to subsequent image analysis, testing, understanding, and application [2, 3]. Many SRR algorithms have been proposed, and they can be categorized into frequency domain methods and spatial domain methods. Frequency domain methods rely on the properties of the space-frequency transformation and on the aliasing in the low-resolution images [4, 5]. However, because of their motion restrictions and the lack of a prior, frequency domain methods have not been developed widely, and spatial domain methods are preferred.
The major spatial domain methods include maximum likelihood (ML) [6], projection onto convex sets (POCS) [7], and maximum a posteriori (MAP) [8, 9]. In the MAP method, the SR task is cast as the inverse problem of recovering the original high-resolution image by fusing the low-resolution images, based on the observation model and a prior on the recovered image [10]. Because the MAP method introduces an image prior and follows rigorous reasoning, it has become one of the most frequently used SRR methods.
To improve the performance of the MAP method, many image prior models have been proposed. The Gaussian Markov random field (GMRF) prior model [9, 11, 12] assumes a Gaussian distribution over the image; it not only penalizes the gradient of the SRR solution but also preserves edges. The Tikhonov prior model [13] uses a high-pass operator to restrict the total energy of the image grayscale and make the image smooth. The total variation (TV) prior model [14, 15] adopts the idea of heavy tails, which avoids oversmoothing at discontinuities. The bilateral TV (BTV) prior model [16, 17] is an updated version of the TV prior model that considers not only the intensity difference between pixels but also the distance between them, and it can remove more artifacts. Notably, the prior models above can be reduced to two fundamental models according to the exponent of the norm [18]. Specifically, when the exponent is two, the prior model is a Gaussian model, and when the exponent is one, the prior model is a Laplacian model. In this paper, these two models are referred to as the constant prior.
The conventional MAP method with a constant prior is widely used, but the constant prior has two main drawbacks. First, although many papers discuss learning the prior parameters from the observed data, the choice between the two models of constant prior is determined entirely by experience and may even be arbitrary. The recent example-based method [19] tries to overcome this drawback, but it still depends on an external training set, which is not reliable evidence. Second, the constant prior characterizes the pixel statistics of the entire image, so the differences between regions of the image are not considered.
To overcome the first drawback, and inspired by the learning mechanism of Baker’s hallucination algorithm [20], this paper proposes a variable prior model that is essentially a mixture of the two models of constant prior. The formulation of this variable prior model is similar to the elastic net prior model of Mohammad-Djafari [21]. In the prior model proposed in this paper, however, the weight assigned to each component model is obtained by learning, and the formulation depends entirely on the pixel statistics of the image itself, so we also call the proposed prior model the feature-driven prior model.
To overcome the second drawback, this paper introduces a block-based MAP framework under which the image is split into many blocks. A proper feature-driven prior is assigned to each block according to the features of that block. Once each block has its own feature-driven prior, MAP SRR is performed block by block, and finally all reconstructed blocks are recombined into a whole reconstructed image. This framework takes the local differences in image features into account, so it can improve the reconstruction result. It is worth noting, however, that an image reconstructed by this method usually contains some cross artifacts. To eliminate these byproducts, this paper designs a border-expansion strategy for each block that achieves a satisfactory result.
The rest of the paper is organized as follows. Section 2 briefly reviews the MAP framework for SRR and introduces the formulation of the constant prior model. Section 3 analyzes the drawbacks of the constant prior, proposes the variable prior model called the feature-driven prior model, presents a block-based MAP SRR framework, and designs a border-expansion strategy for each block to eliminate cross artifacts. In Section 4, experiments are conducted to validate the effectiveness of the proposed algorithm. Finally, conclusions are given in Section 5.
2. MAP Framework for SRR
In order to formulate the MAP framework for SRR, observed images are denoted by and the desired high-resolution image is represented by . Then generative model of low-resolution images can be expressed as where represents the degraded matrix which includes warp, blur, and decimation, and represents zero-mean Gaussian white noise with variance .
In fact, MAP solution is derived from Bayes’ theorem. That is, given and , the posterior distribution of can be formulated as As is a constant value, maximizing (2) is equal to maximizing the logarithm of numerator of (2) with respect to . So the MAP solution for SRR is calculated iteratively by minimizing the following cost function [18]:
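For orientation, the standard textbook form of this derivation can be written as follows. The symbols are assumptions chosen for this sketch (y for the stacked low-resolution observations, x for the high-resolution image, W for the degradation matrix, n for the noise, and sigma squared for its variance) and need not match the notation of (1)–(3).

```latex
\begin{aligned}
y &= W x + n, \qquad n \sim \mathcal{N}\!\left(0, \sigma^{2} I\right), \\
p(x \mid y, W) &= \frac{p(y \mid x, W)\, p(x)}{p(y \mid W)}, \\
\hat{x} &= \arg\min_{x} \left\{ \frac{1}{2\sigma^{2}} \lVert y - W x \rVert_2^{2} - \log p(x) \right\}.
\end{aligned}
```

The last line follows from taking the negative logarithm of the numerator in the second line and dropping the terms that do not depend on x.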
Equation (3) is fusion of ML cost function and regularization which, respectively, correspond to the first term and second term of (3). The multivariate probability distribution in regularization is the exact image prior which is generally formulated from the following two models or their modifications: where is the size of image and is the operator matrix which computes the gradient of the image in multidirection. In the past work [9–17], either the models above or their modifications were used to serve as the constant prior named in this paper.
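The structure of such a cost function, a data term plus a negative log-prior, can be illustrated with a minimal sketch. It assumes a toy one-dimensional signal, a degradation that averages and decimates by 2 : 1 (the same operator for every observation), and first differences as the gradient operator; the function names are hypothetical, and normalization constants and scale factors of the priors are dropped.

```python
def degrade_1d(x, factor=2):
    """Toy degradation: average each run of `factor` samples (blur + decimation)."""
    return [sum(x[i:i + factor]) / factor
            for i in range(0, len(x) - factor + 1, factor)]


def gradient_1d(x):
    """Toy gradient operator: first-order forward differences."""
    return [b - a for a, b in zip(x, x[1:])]


def map_cost(x, observations, sigma, prior="gaussian"):
    """Data term plus negative log-prior of the gradient (constants dropped)."""
    predicted = degrade_1d(x)
    data = 0.0
    for y in observations:
        data += sum((yi - pi) ** 2 for yi, pi in zip(y, predicted)) / (2.0 * sigma ** 2)
    z = gradient_1d(x)
    if prior == "gaussian":
        penalty = 0.5 * sum(v * v for v in z)   # norm exponent two
    elif prior == "laplacian":
        penalty = sum(abs(v) for v in z)        # norm exponent one
    else:
        raise ValueError("prior must be 'gaussian' or 'laplacian'")
    return data + penalty
```

For a step signal such as `[0.0, 0.0, 4.0, 4.0]`, the Gaussian penalty charges the single jump quadratically while the Laplacian penalty charges it linearly, which is the usual reason the latter is gentler on edges.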
3. Block MAP SRR with Feature-Driven Prior
3.1. The Drawback of Constant Prior Model
In fact, it is difficult to characterize images accurately with a constant prior because the choice between the two formulated models is made by experience. In addition, the formulation of the constant prior is rigid, which limits its applicability, so it still has some flaws.
Through comprehensive analysis, we find that above two models (4) and (5) not only are multivariate distributions of , but also correspond to multivariate distributions in which the random vector is the gradient (i.e., difference in multidirection) of image, denoted by (i.e., ) with size of . Elements of are approximately independent identically distributed with zero-mean and standard deviation (Gaussian model) or (Laplacian model). Thus, with the theorem of random variables function, the distributions of can be obtained from (4) and (5) as follows:
The flaw of the two models is that the standard deviations of the gradient elements, the Gaussian one in (6) and the Laplacian one in (7), are identical and equal to 1. These fixed values therefore cannot reflect the pixel statistics of the image. Admittedly, in conventional MAP-based SRR the prior can be adjusted manually by adding a regularization parameter until a satisfactory result is achieved, so the flaw is not exposed. However, when (6) or (7) is strictly substituted into (3), both standard deviations are equal to 1 and no regularization parameter is available. In this situation, the prior model cannot characterize the features of the image, and a poor superresolved image is likely to result.
In fact, for real images neither standard deviation is always equal to one, and it is not reasonable to assume that the Gaussian and Laplacian standard deviations are always equal to each other. Hence, a more practical prior on the image gradient is given by
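As a hedged illustration, the textbook densities of M independent zero-mean gradient elements with free standard deviations take the following form. The symbols z, M, sigma_G, and sigma_L are assumptions of this sketch and are not necessarily those of (6)–(8); setting both standard deviations to 1 gives the fixed-deviation case criticized above.

```latex
\begin{aligned}
p_G(z) &= \left(2\pi\sigma_G^{2}\right)^{-M/2}
          \exp\!\left(-\frac{\lVert z \rVert_2^{2}}{2\sigma_G^{2}}\right), \\
p_L(z) &= \left(\sqrt{2}\,\sigma_L\right)^{-M}
          \exp\!\left(-\frac{\sqrt{2}\,\lVert z \rVert_1}{\sigma_L}\right).
\end{aligned}
```

The Laplacian line uses the fact that a Laplace density with scale b has standard deviation equal to the square root of two times b.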
3.2. Feature-Driven Prior
Now it is confirmed that the standard deviation ( or ) of variable is just the core feature of prior which can be used later to determine the model of prior. Here, we firstly denote the samples of and by and , respectively. Then, given (i.e., ), maximum likelihood estimations of the standard deviations ( and ) in the models of (8) are as follows:
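The ML estimates of the two standard deviations can be sketched as follows. This is a minimal illustration, assuming zero-mean gradient samples taken as horizontal and vertical first differences of a 2-D list image (the multidirectional operator of the paper may include more directions); all function names are hypothetical.

```python
import math


def gradient_samples(image):
    """Collect horizontal and vertical first differences of a 2-D list image."""
    samples = []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if c + 1 < len(row):
                samples.append(row[c + 1] - value)       # horizontal difference
            if r + 1 < len(image):
                samples.append(image[r + 1][c] - value)  # vertical difference
    return samples


def gaussian_std_mle(z):
    """ML standard deviation of a zero-mean Gaussian: root mean square of z."""
    return math.sqrt(sum(v * v for v in z) / len(z))


def laplacian_std_mle(z):
    """ML standard deviation of a zero-mean Laplacian.

    The ML scale is the mean absolute value; the standard deviation of a
    Laplace distribution is sqrt(2) times its scale.
    """
    return math.sqrt(2.0) * sum(abs(v) for v in z) / len(z)
```

Both estimates come from the same sample, so they can be computed in one pass over the initial estimate of the image.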
In this paper, the initial estimate of the SRR iteration is selected as the image sample, from which the gradient sample is obtained. This choice of sample is considered better than an image with similar texture [19], because the initial estimate is theoretically related to its true image and is visually more similar to it than an image with similar texture.
To identify which model is better for prior, the model of prior is defined as , where represents Gaussian prior model and represents Laplacian prior model. Then the posterior probability of is where is a normalization constant and is prior of without any additional condition. Simple and reasonable hypothesis is that follows the uniform distribution. Thus With (10) and (11), the better prior model is obtained by
As only has two options, (12) can be converted into a criterion in the form of ratio where and . Combining (8)–(9), (13) can be simplified as
When the ratio in (14) is greater than 1, that is, when the Gaussian model is more probable than the Laplacian model, the Gaussian model fits the image better than the Laplacian model, and vice versa. Additionally, the ratio not only indicates which model fits the image better but can also be used to construct a more flexible variable prior model (i.e., the feature-driven prior model).
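Under the uniform model prior assumed above, the posterior ratio reduces to a likelihood ratio. The following minimal sketch evaluates each model at its own ML parameter and treats the gradient elements as independent; it works in the log domain for numerical safety, so the threshold is 0 rather than 1. The simplified expression (14) may differ in form, and the function names are hypothetical.

```python
import math


def log_model_ratio(z):
    """Log of p(z | Gaussian) / p(z | Laplacian), each at its ML parameter."""
    m = len(z)
    variance = sum(v * v for v in z) / m   # Gaussian ML variance
    scale = sum(abs(v) for v in z) / m     # Laplacian ML scale
    if variance == 0.0:
        raise ValueError("gradient sample must not be identically zero")
    # Maximized log-likelihoods of the two zero-mean models.
    log_gaussian = -0.5 * m * (math.log(2.0 * math.pi * variance) + 1.0)
    log_laplacian = -m * (math.log(2.0 * scale) + 1.0)
    return log_gaussian - log_laplacian


def better_model(z):
    """Name of the model with the larger posterior under a uniform model prior."""
    return "gaussian" if log_model_ratio(z) > 0.0 else "laplacian"
```

A sample with uniformly sized differences favors the Gaussian model, whereas a sample dominated by a few large differences (sharp edges on a flat background) favors the heavier-tailed Laplacian model.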
The formulation of the feature-driven prior model is a proportional mixture of the Gaussian model and the Laplacian model, which is similar to the elastic net prior model [21]. (Inspired by elastic net regression [22], the elastic net prior model is a combination of a Gaussian model and a Laplacian model.)
The difference from the elastic net prior model is that the feature-driven prior model assigns the weights to the constant prior models (i.e., the Gaussian model and the Laplacian model) according to the probability and applicability of each model, which makes this mixture prior characterize the desired image more accurately.
The weight (i.e., proportion) assigned to the Laplacian model in the feature-driven prior can be formulated as And the weight (i.e., proportion) assigned to the Gaussian model in the feature-driven prior can be formulated as
Referring to (3) and (8), a new cost function is constructed as follows:
In (17), the last two terms correspond to the variable prior model, which is flexible and is named the feature-driven prior. When the ratio in (14) approaches 0 the prior is Laplacian, when it approaches positive infinity the prior is Gaussian, and when it takes a moderate value the prior is a mixture.
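The limiting behavior described above can be reproduced with a simple weighting scheme. The sketch below is an assumption, not a transcription of (15)–(17): it takes the Gaussian weight as r / (1 + r) and the Laplacian weight as 1 / (1 + r), where r is the model ratio, and combines the two negative log-priors (constants dropped) with those weights. The function names are hypothetical.

```python
import math


def mixture_weights(log_ratio):
    """Return (w_gaussian, w_laplacian) for r = exp(log_ratio).

    Assumed form: w_gaussian = r / (1 + r), w_laplacian = 1 / (1 + r).
    The branches only ever exponentiate a non-positive number, so no overflow.
    """
    if log_ratio >= 0.0:
        inverse = math.exp(-log_ratio)          # 1 / r
        w_laplacian = inverse / (1.0 + inverse)
    else:
        w_laplacian = 1.0 / (1.0 + math.exp(log_ratio))
    return 1.0 - w_laplacian, w_laplacian


def feature_driven_penalty(z, sigma_g, sigma_l, log_ratio):
    """Weighted sum of Gaussian and Laplacian negative log-priors of z."""
    w_gaussian, w_laplacian = mixture_weights(log_ratio)
    gaussian_term = sum(v * v for v in z) / (2.0 * sigma_g ** 2)
    laplacian_term = math.sqrt(2.0) * sum(abs(v) for v in z) / sigma_l
    return w_gaussian * gaussian_term + w_laplacian * laplacian_term
```

With a ratio of 1 both models contribute equally; as the ratio tends to 0 or to infinity, the penalty reduces to the pure Laplacian or pure Gaussian term, respectively.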
3.3. Block-Based MAP Framework for SRR
In Section 3.2, the feature-driven prior characterizes the entire image and ignores the differences between regions, because the prior is derived from the entire image data. In order to take the features of different regions into account, a block-based MAP framework is proposed for SRR in this subsection, in which the image is split into a tessellation of many blocks and an exclusive prior is assigned to each block.
Applying the above block-based framework to SRR, the block-based MAP superresolution using feature-driven prior (BMSFP) can be constructed. The procedure of BMSFP is as follows:
(1) the initial image for the SRR iteration is estimated by the shift and add algorithm [16]. (The shift and add algorithm, proposed by Farsiu et al. [16], is a simple and effective initial estimation method for SRR. After motion compensation of all LR images, there may be several measurements at each pixel of the high-resolution grid; the algorithm computes the pixelwise median of all measurements.);
(2) this initial image is split into a tessellation of blocks arranged in rows and columns;
(3) the statistics of each block are calculated as in Section 3.2, which determine the feature-driven prior model for each block. An illustration of the tessellation and its block images with their assigned feature-driven priors is shown in Figure 1;
(4) the cost function of each block is formulated by (17) based on its statistics, giving (18), in which the degradation matrix is a compressed version of the full matrix with the columns unrelated to the block deleted;
(5) under the block-based MAP framework described in the steps above, a block superresolution is conducted for each block by optimizing (18);
(6) the entire superresolution image is obtained by recombining the superresolution results of all blocks.
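The splitting and recombining steps of the procedure above can be sketched as follows. This is a minimal illustration on 2-D list images, assuming the image size is divisible by the block grid; the per-block statistics and optimization are left out, and the function names are hypothetical.

```python
def split_blocks(image, rows, cols):
    """Split a 2-D list image into a rows x cols grid of equally sized blocks."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the block grid")
    bh, bw = height // rows, width // cols
    return [[[row[c * bw:(c + 1) * bw] for row in image[r * bh:(r + 1) * bh]]
             for c in range(cols)]
            for r in range(rows)]


def recombine_blocks(blocks):
    """Inverse of split_blocks: stitch a grid of blocks back into one image."""
    image = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            line = []
            for block in block_row:
                line.extend(block[i])   # concatenate row i of each block
            image.append(line)
    return image
```

In a full pipeline, each element of the grid returned by `split_blocks` would be given its own prior and optimized before `recombine_blocks` is called.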

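[Figure 1: panels (a) and (b), illustration of the tessellation and its block images with their assigned feature-driven priors.]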
3.4. The Cross-Artifact Problem and Solution
Theoretically, the method proposed in Section 3.3 should perform well in SRR. In practice, however, the image reconstructed with this method shows obvious and annoying cross-shaped seams between adjacent blocks. We call this phenomenon the cross artifact; it sometimes reduces the quality of the final image even below that of the initial image. To visualize this phenomenon, we conduct an experiment whose result is shown in Figure 2.

[Figure 2: (a) initial image; (b) reconstructed image with cross artifact.]
From Figure 2 we can see that the cross artifact in (b) is very obvious, which makes the reconstructed image worse than the initial image in (a), both visually and in terms of the performance index. Here, the performance index is the mean square error (MSE) between the original image and its estimate, that is, the sum of the squared differences between corresponding pixels divided by the number of pixels in the image.
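The MSE used as the performance index can be computed as in the following minimal sketch, which assumes both images are 2-D lists of the same size; the function name is hypothetical.

```python
def mse(original, estimate):
    """Mean square error between two equally sized 2-D list images."""
    if len(original) != len(estimate) or any(
            len(a) != len(b) for a, b in zip(original, estimate)):
        raise ValueError("images must have the same size")
    count = sum(len(row) for row in original)
    total = sum((a - b) ** 2
                for row_a, row_b in zip(original, estimate)
                for a, b in zip(row_a, row_b))
    return total / count
```

A lower value indicates an estimate closer to the original image.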
It is necessary to remove this byproduct, that is, the cross artifact. But where does this artifact come from? On closer examination, we find that, for any SRR result, the border of the image usually differs from the other areas of the image in mean gray level, which we call the border effect. This is because fewer neighboring pixels (from only two or three orientations) participate in the estimation of border pixels, whereas for the other regions the neighboring pixels cover all four orientations. Each block in Figure 2 is no exception; that is, each block has its own border effect. Consequently, when all reconstructed blocks are recombined, the border effect produces the cross artifact between adjacent blocks in the final superresolved image. The cross artifact, in fact, comes from the border effect of each block.
Now that the origin of the cross artifact has been identified, the way to remove it is clear. The key is to modify the size of the blocks. When splitting the initial image, we expand each block by a border several pixels wide, so that adjacent blocks overlap by twice the border width. These expanded blocks are then superresolved. Before recombining, we cut the expanded border of the same width from each block. Thus, the border effect of each block is removed effectively, which prevents the cross artifact from emerging in the final superresolved image after recombining.
For cases in which the SRR result may exhibit the cross artifact, the complete artifact-removal BMSFP (AR-BMSFP) algorithm is given in Algorithm 1.
[Algorithm 1: the AR-BMSFP algorithm.]
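The border-expansion strategy described above (expand, process, crop, recombine) can be sketched as follows. This is a minimal illustration, not the listing of Algorithm 1: the callback `process` stands in for the per-block MAP optimization and is assumed to return a block of the same size as its input, expanded blocks are clipped at the image boundary, and all names are hypothetical.

```python
def superresolve_blockwise(image, rows, cols, border, process):
    """Expand each block by `border` pixels, process it, crop the border, recombine."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the block grid")
    bh, bw = height // rows, width // cols
    out = [[None] * width for _ in range(height)]
    for r in range(rows):
        for c in range(cols):
            # Bounds of the expanded block, clipped at the image boundary.
            r0, r1 = max(r * bh - border, 0), min((r + 1) * bh + border, height)
            c0, c1 = max(c * bw - border, 0), min((c + 1) * bw + border, width)
            expanded = [row[c0:c1] for row in image[r0:r1]]
            result = process(expanded)
            # Keep only the core of the block; the expanded border is discarded.
            for i in range(r * bh, (r + 1) * bh):
                for j in range(c * bw, (c + 1) * bw):
                    out[i][j] = result[i - r0][j - c0]
    return out
```

With `border=0` this reduces to plain block-wise processing, so any distortion that `process` introduces along block borders ends up on the seams between blocks; with a sufficiently wide border, the distorted rim lies in the discarded region, and only the outer border of the whole image remains affected.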
4. Experimental Results
In this section, three tests are conducted to validate the utility of the proposed prior and algorithm. The performance of algorithm is evaluated by mean square error (MSE).
4.1. Comparison Experiments for Different Prior Models
In the first test, the flexibility and superiority of the feature-driven prior are demonstrated. First, the degraded low-resolution (LR) image sequence is generated from an original remote sensing image, which is shown in Figure 3(a). In the degradation process, the original image is warped by a homography and blurred by a Gaussian point spread function with a standard deviation of 0.8 high-resolution (HR) pixels; the warped and blurred image is then decimated by a factor of 2 : 1 in the horizontal and vertical directions; and finally white Gaussian noise with a standard deviation of 5 is added. Note that the intensity range of the image is from 0 to 255. Two of the five degraded LR images are shown in Figures 3(b) and 3(c), enlarged to the same size as the original image for observation (all LR images in the following tests are shown in the same way).
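The blur, decimation, and noise steps of this degradation can be sketched as follows. This is a simplified illustration under stated assumptions: the homography warp is omitted, the Gaussian point spread function is applied as a separable kernel truncated at three standard deviations with replicated edges, and the function names and the fixed random seed are hypothetical choices.

```python
import math
import random


def gaussian_kernel(std):
    """Normalized 1-D Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(math.ceil(3.0 * std)))
    kernel = [math.exp(-0.5 * (i / std) ** 2) for i in range(-radius, radius + 1)]
    total = sum(kernel)
    return [v / total for v in kernel]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        n = len(row)
        blurred.append([
            sum(kernel[k] * row[min(max(j + k - radius, 0), n - 1)]
                for k in range(len(kernel)))
            for j in range(n)])
    return blurred


def transpose(image):
    return [list(column) for column in zip(*image)]


def simulate_lr(image, blur_std=0.8, factor=2, noise_std=5.0, rng=None):
    """Blur, decimate, and add white Gaussian noise (the warp step is omitted)."""
    if rng is None:
        rng = random.Random(0)
    kernel = gaussian_kernel(blur_std)
    # Separable blur: rows first, then columns via transposition.
    blurred = transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))
    decimated = [row[::factor] for row in blurred[::factor]]
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in decimated]
```

Calling `simulate_lr` several times with different random generators would produce differently noised LR images of the same scene; a faithful reproduction of the experiment would also apply a distinct warp to each image.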

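[Figure 3: (a) original remote sensing image; (b), (c) two of the degraded LR images; (d)–(i) SRR results with, in order, Tikhonov prior, Farsiu BTV prior, Gaussian prior, Laplacian prior, elastic net prior, and feature-driven prior.]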
In the reconstruction process, conventional preparation such as motion estimation [23] is omitted in order to focus on our algorithm. The information about warp, blurring, and decimation is assumed to be available; that is, the degradation matrix used in degradation is also employed for reconstruction. For comparison with the feature-driven prior, we first reconstruct the image from the LR sequence using the Tikhonov prior and the Farsiu BTV prior, respectively. The results are shown in Figures 3(d) and 3(e). Then two cost functions are formed by substituting the Gaussian prior model (4) and the Laplacian prior model (5), respectively, into (3). Optimizing these two cost functions by scaled conjugate gradients (SCG) [24] yields the two superresolution results shown in Figures 3(f) and 3(g). To show that the feature-driven prior performs somewhat better than the state-of-the-art elastic net prior model, we also reconstruct the image with the elastic net prior; the result is shown in Figure 3(h). Finally, another cost function is formed with the feature-driven prior based on (17). Optimizing this cost function by SCG yields the superresolution result shown in Figure 3(i).
In order to demonstrate the applicability of the proposed prior model for different images, another standard image Lena is used to repeat this experiment. Figure 4(a) is the part of the original Lena image. Figures 4(b) and 4(c) are two of the five degraded LR images. Figures 4(d)–4(i) are results of SRR with, in order, Tikhonov prior, Farsiu BTV prior, Gaussian prior, Laplacian prior, elastic net prior, and feature-driven prior.

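[Figure 4: (a) part of the original Lena image; (b), (c) two of the degraded LR images; (d)–(i) SRR results with, in order, Tikhonov prior, Farsiu BTV prior, Gaussian prior, Laplacian prior, elastic net prior, and feature-driven prior.]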
It is important to note that the goal of Section 4.1 is to compare different prior models, so the image is not split into blocks to perform SRR (i.e., the image-splitting scheme and border expansion are not required). All the experiments are performed for entire image. Therefore, there is no block artifact, that is, the cross artifact, in any results.
The MSE results for remote sensing image and Lena image are listed in Table 1. Figures 3 and 4 and Table 1 illustrate the advantage of the proposed prior compared with the other priors.
4.2. Comparison Experiments for Different Image-Splitting Schemes
In order to show that image-splitting scheme proposed in Section 3.3 is effective, the second test is conducted. Certainly, the detailed algorithm to be validated is AR-BMSFP proposed in Section 3.4 because the cross artifact must be removed. So the border expansion must be utilized in image-splitting experiments. In the test, AR-BMSFP with different image-splitting schemes and several other algorithms without image-splitting scheme are compared with each other. The second test also includes two groups of experiments which use Peppers image and remote sensing image, respectively.
The original Peppers image is shown in Figure 5(a). To show the robustness and applicability of the algorithm, we adjust the degradation process used in Section 4.1. Specifically, the standard deviation of the Gaussian point spread function is changed to 1.0 HR pixels, the decimation factor is changed to 4 : 1, and the standard deviation of the white Gaussian noise is increased to 10.

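[Figure 5: (a) original Peppers image; (b)–(d) MAP SRR without image splitting, based on the Tikhonov, Farsiu BTV, and feature-driven prior models; (e)–(g) AR-BMSFP results for the three block tessellations.]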
Then the AR-BMSFP SRR proposed in Section 3.4 is compared with the method without image-splitting scheme to validate its superiority. Figure 5(b) shows the result of MAP SRR based on Tikhonov prior model without image-splitting scheme. Figure 5(c) shows the result of MAP SRR based on Farsiu BTV prior model without image-splitting scheme. Figure 5(d) shows the result of MAP SRR based on feature-driven prior model without image-splitting scheme.
For the AR-BMSFP algorithm, three experiments are conducted with three different block tessellations, and the results are shown in Figures 5(e), 5(f), and 5(g). It is worth noting that each of these three results is the best one selected from several trials with different border expansions. Therefore, the block artifact, that is, the cross artifact, does not appear in Figure 5.
To prove the robustness of proposed method, the second group of experiments is conducted using remote sensing image. The degradation process is also repeated as in the first test, based on which the reconstruction process of first group of experiments is repeated. The results are shown in Figure 6. It is also worth noting that the cross artifact has been removed in results through several border-expansion trials. So Figures 6(e), 6(f), and 6(g) have no artifacts.

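[Figure 6: panels (a)–(g), results of the second group of experiments on the remote sensing image; (e)–(g) are the AR-BMSFP results.]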
The MSEs of two groups of experiments above are also listed in Table 2. Both visual quality and MSE show that the AR-BMSFP algorithm is better than algorithm without image-splitting framework. Furthermore, MSE improves as the number of blocks increases.
4.3. Comparison Experiments for Different Border Expansions in Cross-Artifact Removal
The third and final test validates the capability of our solution to remove the cross artifact that emerges in the result. Three groups of experiments are conducted in this test to show its robustness.
As the cross artifact is only about 3 to 5 pixels wide, the first two groups of experiments use small images to make the cross artifact more visible. In the first group of experiments, a remote sensing image shown in Figure 7(a) is used. In the second group of experiments, a magnetic resonance imaging (MRI) image shown in Figure 8(a) is used.

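[Figure 7: (a) remote sensing image; (b) one of the degraded LR images; (c)–(g) AR-BMSFP results with 0-, 1-, 2-, 3-, and 4-pixel border expansions.]
[Figure 8: (a) MRI image; (b) one of the degraded LR images; (c)–(g) AR-BMSFP results with 0-, 1-, 3-, 4-, and 5-pixel border expansions.]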
Both images are degraded by the same method as for Figure 5. Figures 7(b) and 8(b) each show one of the degraded LR images. Following Algorithm 1, we reconstruct the image from the LR sequence. For comparison, the same image-splitting strategy is used throughout; that is, the splitting index is fixed.
To illustrate the effect of different border-expansion widths, we repeat the reconstruction with 0-, 1-, 2-, 3-, and 4-pixel border expansions for the blocks in the remote sensing image experiments. The results are shown in Figures 7(c)–7(g). The cross artifact in Figure 7(c) is severe because there is no expansion, and the cross artifact in Figures 7(d) and 7(e) is still obvious because the border expansion is thin. With the thicker border expansions, Figures 7(f) and 7(g) show no cross artifact. Visually, Figure 7(g) is slightly better than Figure 7(f), especially for the river in the image. The MSEs confirm these visual judgments. For clarity, the MSE values are plotted in Figure 9(a), which shows that the MSE decreases as the border becomes wider.

[Figure 9: MSE versus border-expansion width for (a) the remote sensing image and (b) the MRI image.]
Keeping all the conditions and procedures of the first group of experiments, we carry out the second group of experiments on an MRI image. The results and the graph are shown in Figures 8 and 9(b). The difference from the first group is that we repeat the reconstruction with 0-, 1-, 2-, 3-, 4-, and 5-pixel border expansions. Note that, because of layout constraints, Figure 8 includes only the results with 0-, 1-, 3-, 4-, and 5-pixel border expansions, whereas Figure 9(b) shows all results. The reconstruction in Figure 8 is not as good as that in Figure 7, perhaps because the image is smaller and its texture is more monotonous than in Figure 7. Even so, the results and the graph lead to the same conclusion as for Figure 7, which also supports the robustness of our algorithm.
Repeating the above experiment on a larger image is undoubtedly time consuming, but the capability to remove the cross artifact remains effective for larger images. To support this claim, a larger image shown in Figure 10(a) is selected to repeat the above experiment.

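[Figure 10: (a) original image; (b) one of the degraded LR images; (c)–(f) AR-BMSFP results with 0-, 1-, 2-, and 3-pixel border expansions.]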
The degradation of the image is the same as the method of the first two groups of experiments. A degraded LR image is shown in Figure 10(b). The AR-BMSFP reconstruction is conducted with 0-, 1-, 2-, and 3-pixel border expansions. The reconstructed results are shown in Figures 10(c)–10(f). Figure 10 verifies that the AR-BMSFP algorithm is effective for bigger size image. Additionally, the computed MSEs of all results are listed in Table 3 to prove that the MSE decreases as the border is expanded.
5. Conclusion
This paper introduces an effective algorithm to achieve superresolution. To make the prior model more accurate, the proposed algorithm supposes that the image prior is a variable prior based on feature instead of constant prior. Additionally, the proposed block-based framework enables the prior to characterize the local image rather than the entire image. By analysis and experiment we find that for each block the result has border effect which leads to the cross artifact as a result of superresolution. In order to remove this byproduct, we add a strategy in proposed algorithm, which expands the block several pixels before optimization and cuts the expanded border when recombining all blocks. This strategy is simple but effective for our algorithm.
Experimental results validate the effectiveness of the proposed algorithm. In future, the focus of research will be on speeding up the algorithm and combining registration with reconstruction in SRR.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (nos. 61374019, 61263029), the Natural Science Foundation of Jiangsu Province, China, (BK20130851), Natural Science Foundation of the Higher Education Institutions of Jiangsu Province, China, (no. 13KJB520009), and College Industrialization Project of Jiangsu, China, (no. JHB2012-4).