Abstract

Image interpolation, as a method of obtaining a high-resolution image from the corresponding low-resolution image, is a classical problem in image processing. In this paper, we propose a novel energy-driven interpolation algorithm employing Gaussian process regression. In our algorithm, each interpolated pixel is predicted by a combination of two information sources: first is a statistical model adopted to mine underlying information, and second is an energy computation technique used to acquire information on pixel properties. We further demonstrate that our algorithm can not only achieve image interpolation, but also reduce noise in the original image. Our experiments show that the proposed algorithm can achieve encouraging performance in terms of image visualization and quantitative measures.

1. Introduction

Image interpolation is a very important aspect of image processing and involves the use of a known pixel set to produce an unknown pixel set, resulting in an image of higher resolution [1, 2]. This technique is widely used in remote sensing, aerospace, infrared imaging, low-light level night imagery, and other fields [35]. However, maintaining image quality during image interpolation is still a difficult issue [6]. To address this, many image interpolation methods have been proposed. For example, traditional bilinear interpolation computes the unknown pixel value using the location information between the adjacent pixels. This technique does not consider the contents of the image, so edge blurring will occur in the interpolated image [7, 8]. In order to capture image details more clearly, an artifact-free image upscaling method called ICBI [9] has recently been proposed, which uses iterative curvature-based interpolation to obtain a high image quality, but does not take into account underlying local information between image patches. Local image information can be mined according to its structural redundancy characteristic, as proposed by Glasner et al. [10]. This characteristic can lay the foundations for the training and predicting of a statistical model [11, 12]. A statistical model known as Gaussian process regression (GPR) was first applied in the reconstruction of high-resolution images in 2011 and has been shown to be capable of generating an image with sharp edges by extracting the necessary information from a low-resolution image [13]. However, it should be noted that this method only uses the local structural information for each pixel’s neighborhood, so it can still generate unexpected details. To develop the above techniques, we propose here a novel energy-driven interpolation algorithm employing Gaussian process regression (EGPR) (Figure 1). This algorithm not only emphasizes the influence of adjacent pixel properties on interpolated values, but also brings into full play the role of the statistical model.

Our contribution is twofold. Firstly, we propose a framework for both magnification and deblurring in order to fulfill the interpolation task for low-resolution images with low noise. Secondly, we demonstrate an energy-driven approach based on the properties of adjacent pixels within this framework. In addition, we define the processing unit and its properties for better implementation of the EGPR algorithm.

The rest of the paper is structured as follows. Section 2 discusses GPR. Section 3 illustrates the proposed EGPR algorithm. Section 4 presents experimental work carried out to demonstrate the effectiveness of our algorithm. Section 5 concludes the paper.

2. Gaussian Process Regression

In recent years, GPR has become a hot issue in the field of machine learning and has attracted great academic interest [1416]. It has many advantages, including its rigorous underlying statistical learning theory, easy regression process implementation, few parameters, and improved model interpretability [1719]. As a result of these benefits, it has been used in many areas [2023]; however, to the best of our knowledge, it has not yet been fully utilized in image interpolation. Rasmussen and Williams [24] defined the Gaussian process and noted in particular that a Gaussian process is completely specified by its mean and covariance functions ((2.1) and (2.2), resp.): [],𝜇(𝑥)=𝐸𝑌(𝑥)(2.1)COV𝑥,𝑥𝑌𝑥=𝐸(𝑌(𝑥)𝜇(𝑥))𝑥𝜇,(2.2) where 𝑥 and 𝑥 are any random variables. In particular, they could represent n-dimensional input or output vectors. The Gaussian process can be written as follows: 𝑔𝜇(𝑥)𝐺𝑃(𝑥),COV𝑥,𝑥.(2.3)

There are a variety of covariance functions, of which one of the most commonly used is the squared exponential (SE) covariance function 𝑔𝑥COV𝑝𝑥,𝑔𝑞1=exp2||𝑥𝑝𝑥𝑞||2.(2.4) In Gaussian processes, the marginal likelihood 𝑝(𝑦𝑋) at a point is very useful and is the integral of the likelihood multiplied by the prior probability 𝑝(𝑦𝑋)=𝑝(𝑦𝑔,𝑋)𝑝(𝑔𝑋)𝑑𝑔.(2.5) We can rewrite (2.5) as follows: 1log𝑝(𝑔𝑋)=2𝑔𝑇COV11𝑔2||||𝑛logCOV2log2𝜋.(2.6)

We can make use of Gaussian identities to obtain (2.7), in order to compute the log marginal likelihood. The conjugate gradients method has been applied to solve this equation. Using this approach, we can obtain the hyperparameters of the covariance function. Further details of GPR can be found in [24] as follows: 1log𝑝(𝑔𝑋)=2𝑦𝑇COV(𝑋,𝑋)+𝜎2𝑛𝐼11𝑦2||logCOV(𝑋,𝑋)+𝜎2𝑛𝐼||𝑛2log2𝜋.(2.7)

3. The Proposed Algorithm

In this paper, we combine the energy-driven approach with GPR to accomplish the task of image interpolation. The proposed algorithm models low-resolution image data as a function of a probability distribution that satisfies a local static Gaussian process. This algorithm framework is shown in Figure 2 and is broadly divided into the training process and prediction process. Firstly, the GPR model can be established using the low-resolution image data. Next, this model is used to predict the unknown pixel values of a high-resolution image by adopting an energy computation approach. Through the above two steps, we produce a high-quality enlarged image. The process is further clarified by the following: 𝐿=𝑑𝐿,𝐻=𝐿𝑠,𝐻=𝑓𝐻,(3.1) where 𝐿 and 𝐻 denote the input low-resolution image with a little noise and the output high-resolution image, respectively, 𝐿  denotes the noise-free low-resolution image, 𝐻 denotes the initial high-resolution image, 𝑠denotes the upsampling factor, and 𝑑 and 𝑓 denote the clear transfer function and energy transfer function, respectively.

3.1. Training

The following definitions are used in the EGPR algorithm.

Definition 3.1. A given image 𝐿 is divided into many regions of equal size, and each region is defined as a processing unit (PU). Each PU is also divided into 3×3 overlapping image patches (the total number is 𝑀). The center of each patch is defined as an output vector 𝑌𝑇𝑅 of PU, where 𝑌𝑇𝑅=(𝑦1,𝑦2,𝑦𝑀)𝑇, while the nearest eight values are defined as an input vector 𝑋𝑇𝑅 of PU, where 𝑋𝑇𝑅=𝑥11,𝑥12,𝑥18𝑥21,𝑥22,𝑥28𝑥.𝑀1,𝑥𝑀2,𝑥𝑀8.(3.2)

Definition 3.2. Given a total of 𝑁 pixels in each PU, the pixels are sorted and denoted as 𝐼1,𝐼2𝐼𝑁. 𝐼maxave, 𝐼minave, and 𝐼ave are defined using the following formulae: 𝐼maxave=Top𝑖=1𝐼𝑖Top,𝐼minave=Below𝑖=1𝐼𝑖Below,𝐼ave=𝑁𝑖=1𝐼𝑖𝑁,(3.3) where top represents the number of the largest pixels used and below the number of the smallest pixels used. Then 𝐼maxave, 𝐼minave, and 𝐼ave are called basic properties of PU.

To facilitate the operation of the PU, it is necessary to introduce some properties in advance.

Property 1. Given a number 𝑁 in each PU, if 𝐼maxave=0, then pixel value 𝐼𝑖=0, where 𝑖𝑁.

Property 2. Given 𝑥𝑖𝑗, if 𝑥𝑖𝑗=𝑎, then its corresponding output vector value is 𝑦𝑖𝑗=𝑎, where 𝑦𝑖𝑗𝑌𝑇𝑅,𝑖𝑀,𝑗=1.

Denoising is the first step in the EGPR algorithm, and we use the following formula (3.4) to obtain noise-free images: 𝐼𝑖=𝐼neighbor,𝐼maxave𝐼minave𝜃&𝐼𝑖𝐼ave,𝐼+𝐵𝑖,otherwise,(3.4) where 𝜃 and 𝐵 represent empirical values, and 𝐼neighbor represents the adjacent pixel value.

Before applying GPR, we can obtain the particular relationship between the input and output vectors of PU according to Properties 1 and 2. Pixels with this relationship need not be included in the following GPR training, so the predicted values can be directly obtained, saving time and speeding up the EGPR algorithm.

Training plays an important role in the EGPR algorithm, and we adopt a different approach from that used in [13]. Our algorithm contains two processes: training domain establishment and GPR model foundation. In the first stage, we search possible training domains along the four directions of each specific PU. Next, we compute the structural similarity between directions to determine the definite training domain. Inspired by the concept of image SSIM, we define the PU structural similarity as follows.

Definition 3.3 (PU structural similarity). Given two processing units 𝑃 and 𝑄, their structural similarity is defined as 𝑆(𝑃,𝑄)=2𝑚𝑃𝑚𝑄+𝐶12𝜓𝑃𝑄+𝐶2𝑚2𝑃+𝑚2𝑄+𝐶1𝜓2𝑃+𝜓2𝑄+𝐶2,(3.5) where 𝐶1 and 𝐶2 are constants, and the other components are calculated as follows: 𝑚𝑃=1𝑁𝑁𝑖=1𝑝𝑖,𝑝𝑖𝑃,𝑚𝑄=1𝑁𝑁𝑖=1𝑞𝑖,𝑞𝑖𝑄,𝜓𝑃=1𝑁1𝑁𝑖=1𝑝𝑖𝑚𝑝21/2,𝜓𝑄=1𝑁1𝑁𝑖=1𝑞𝑖𝑚𝑞21/2,𝜓𝑃𝑄=1𝑁1𝑁𝑖=1𝑝𝑖𝑚𝑝𝑞𝑖𝑚𝑞.(3.6)

When the search step count reaches the predefined number, or if the PU structure similarity falls below a certain value, the first stage is complete. In the second stage, we apply a Gaussian process prior probability and establish the GPR model with Gaussian noise 𝛾 (see (3.7) below) using the image data from training domains. In (3.7), “𝐺𝑃” denotes a Gaussian process 𝑦=𝑔(𝑋)+𝛾,𝛾𝐺𝑃0,𝜎2𝑛.(3.7)

When aiming to achieve high-quality images, the conjugate gradients method is chosen to obtain the model hyperparameters, including mean, variance, and log marginal likelihood. Notice that different iteration numbers in the conjugate gradients method may lead to different prediction accuracies. Figure 3 shows the interpolation images obtained after 50 iterations and 100 iterations, where it can be seen that the latter is better than the former.

3.2. Prediction

Inspired by the ICBI algorithm, we firstly compute the initial pixel value 𝐻(2𝑥+1,2𝑦+1) according to the following formulae: 𝜏1𝜏=𝐼(2𝑥+4,2𝑦)+𝐼(2𝑥+2,2𝑦2)+𝐼(2𝑥,2𝑦+4)+𝐼(2𝑥2,2𝑦+2),2𝑑=𝐼(2𝑥+4,2𝑦+2)+𝐼(2𝑥+2,2𝑦+4)+𝐼(2𝑥,2𝑦2)+𝐼(2𝑥2,2𝑦),(3.8)1(2𝑥+1,2𝑦+1)=𝜏1𝑑+𝐼(2𝑥+2,2𝑦+2)+𝐼(2𝑥,2𝑦)3(𝐼(2𝑥,2𝑦+2)+𝐼(2𝑥+2,2𝑦)),2(2𝑥+1,2𝑦+1)=𝜏2𝐻+𝐼(2𝑥+2,2𝑦)+𝐼(2𝑥,2𝑦+2)3(𝐼(2𝑥,2𝑦)+𝐼(2𝑥+2,2𝑦+2)),(3.9)=1(2𝑥+1,2𝑦+1)2(𝐼(2𝑥,2𝑦)+𝐼(2𝑥+2,2𝑦+2)),𝑑1(2𝑥+1,2𝑦+1)<𝑑21(2𝑥+1,2𝑦+1),2(𝐼(2𝑥+2,2𝑦)+𝐼(2𝑥,2𝑦+2)),𝑑1(2𝑥+1,2𝑦+1)𝑑2(2𝑥+1,2𝑦+1).(3.10)

However, the pixel value obtained is only a roughly estimated value and needs further refinement. Following [9], we establish (3.11) to calculate the energy of each interpolated pixel, and the initial estimate can be modified accordingly, 𝐸(2𝑥+1,2𝑦+1)=𝑐𝐸𝑐(2𝑥+1,2𝑦+1)+𝑒𝐸𝑒(2𝑥+1,2𝑦+1)+𝑖𝐸𝑖(2𝑥+1,2𝑦+1),(3.11) where 𝑐, 𝑒, and 𝑖 are chosen to adjust the energy contributions from the three parts.

𝐸𝑐 represents the curvature continuity energy and can be computed with the following formulae: 𝜎1=||𝑑1(2𝑥,2𝑦)𝑑1||+||𝑑(2𝑥+1,2𝑦+1)2(2𝑥,2𝑦)𝑑2||,𝜎(2𝑥+1,2𝑦+1)2=||𝑑1(2𝑥,2𝑦)𝑑1||+||𝑑(2𝑥+1,2𝑦1)2(2𝑥,2𝑦)𝑑2||,𝜎(2𝑥+1,2𝑦1)3=||𝑑1(2𝑥,2𝑦)𝑑1||+||𝑑(2𝑥1,2𝑦+1)2(2𝑥,2𝑦)𝑑2||,𝜎(2𝑥1,2𝑦+1)4=||𝑑1(2𝑥,2𝑦)𝑑1||+||𝑑(2𝑥1,2𝑦1)2(2𝑥,2𝑦)𝑑2||,𝐸(2𝑥1,2𝑦1)𝑐(2𝑥+1,2𝑦+1)=𝛼1𝜎1+𝛼2𝜎2+𝛼3𝜎3+𝛼4𝜎4,(3.12) where 𝛼𝑖(𝑖=14) are weight values (see (3.13)), and 𝑑1 and 𝑑2 have the same meanings as above. 𝜃 is set as the threshold 𝛼𝑖=1if𝜎𝑖<𝜃,0,otherwise.(3.13) The second energy term 𝐸𝑏 represents the curvature enhancement energy and can be computed by 𝐸𝑒||𝑑(2𝑥+1,2𝑦+1)=2||||𝑑(2𝑥+1,2𝑦+1)1||(2𝑥+1,2𝑦+1).(3.14) The third energy term, 𝐸𝑖, represents the isolevel curves smoothing energy and can be computed with 𝐸𝑖(2𝑥+1,2𝑦+1)=𝐷(2𝑥+1,2𝑦+1)𝐼(2𝑥+1,2𝑦+1),(3.15) where 𝐷(2𝑥+1,2𝑦+1) can be computed as follows: 𝐷(2𝑥+1,2𝑦+1)=2𝑑3(2𝑥+1,2𝑦+1)𝑑4(2𝑥+1,2𝑦+1)𝑑5(2𝑥+1,2𝑦+1)𝑑3(2𝑥+1,2𝑦+1)2+𝑑4(2𝑥+1,2𝑦+1)2+𝑑3(2𝑥+1,2𝑦+1)2𝑑2𝑑22𝑑3(2𝑥+1,2𝑦+1)𝑑3(2𝑥+1,2𝑦+1)2+𝑑4(2𝑥+1,2𝑦+1)2,𝑑31(2𝑥+1,2𝑦+1)=2𝑑(𝐼(2𝑥,2𝑦)𝐼(2𝑥+2,2𝑦+2)),41(2𝑥+1,2𝑦+1)=2𝑑(𝐼(2𝑥,2𝑦+2)𝐼(2𝑥+2,2𝑦)),51(2𝑥+1,2𝑦+1)=2(𝐼(2𝑥+1,2𝑦1)+𝐼(2𝑥+1,2𝑦+3)𝐼(2𝑥1,2𝑦+1)𝐼(2𝑥+3,2𝑦+1)).(3.16)

Suppose that the low-resolution image 𝐿𝑖𝑗 is of size 𝑚×𝑛 and that it is changed to the corresponding interpolated image 𝐻𝑖𝑗 of size ((𝑚×2scale)(2scale1))×((𝑛×2scale)(2scale1)), where “scale” denotes the magnification factor. Then we use the nearest interpolation algorithm for the missing pixels in order to obtain the image 𝐻𝑖𝑗 of size (𝑚×2scale)×(𝑛×2scale).

Similarly, we partition 𝐻𝑖𝑗 into overlapping processing units. The eight adjacent pixels of each pixel are treated as GPR model test input data based on the model 𝑀 which was obtained from the training process. Note that if PU has Property 1 or Property 2, then we can directly obtain the corresponding pixel values. Otherwise, we capture the prediction distribution of unknown pixels in the initial high-resolution image. The joint distribution of the training domain output 𝑦 and the test output 𝑓 is given by the following equation: 𝑦𝑔𝐺𝑃0,COV(𝑋,𝑋)+𝜎2𝑛𝐼,COV𝑋,𝑋𝑋COV𝑋,𝑋,COV,𝑋,(3.17) where 𝑋 denotes the GPR training data matrix, 𝑋 is the test matrix, and COV(𝑋,𝑋) is the 𝑛×𝑛 matrix of covariances. Therefore, we can derive the predictive distribution based on the obtained model 𝑀: 𝑔𝑋,𝑦,𝑋𝐺𝑃𝑔,𝑉𝑔,𝑔𝑋=COV,𝑋COV(𝑋,𝑋)+𝜎2𝑛𝐼1𝑉𝑦,𝑔𝑋=COV,𝑋𝑋COV,𝑋COV(𝑋,𝑋)+𝜎2𝑛𝐼1COV𝑋,𝑋.(3.18)

During the prediction of high-resolution image pixels, two rules should be obeyed. Firstly, the PU divided by the initial high-resolution image should correspond to that divided by the low-resolution image. Secondly, the gradient algorithm should satisfy the common positive definite matrix. If not, it will lead to a zero prediction, and the prediction value will need modifying. The modification method can be utilized to maintain the original interpolated pixel value. Finally, we combine all the processing units together in a smooth manner to obtain the high-resolution images without noise.

4. Experimental Results and Discussion

In this section, we compare the experimental results obtained using the proposed algorithm with those obtained using the bilinear algorithm, GPR algorithm [13], and ICBI algorithm [9]. Each algorithm was run in MATLAB. In order to evaluate algorithm performance, we first downsampled original high-quality images to acquire low-resolution images. Then we enlarged these low-resolution images by utilizing the different interpolation algorithms and compared the enlarged images with the original high-quality images. In all experiments, we set the PU size to 30×30, but this may be increased according to the magnification factor. At the same time, we used zero mean and square exponential functions as the respective mean and covariance functions in the EGPR. The covariance function required two hyperparameters: a characteristic length scale, the default value of which was 0.21, and the standard deviation of the signal, the default value of which was 0.08. In addition, to achieve color image interpolation, we trained and predicted the GPR model separately for each of the R, G, and B channels.

Figure 4 shows the interpolation results from the four algorithms when “scale” was set as 1. Figures 4(a)4(d) are comparisons of image 1, and Figures 4(e)4(h) are comparisons of image 2. In the enlarged red-bordered region, it can be seen that the bilinear method introduces jaggy effects, the GPR method reduces these jaggy effects, and the ICBI method achieves a clear edge but is still a little blurry. By employing the energy computation based on properties of adjacent pixels, our new method generates a clearer image without noise.

Similarly, Figures 5 and 6 demonstrate the interpolation results with scales of 2 and 3, respectively. From these figures, it can be seen that our method achieved the clearest and smoothest enlarged image of the four methods tested, for example, along edges on the root hand in Figure 6(h). Moreover, the advantages of our proposed algorithm become more enhanced at greater enlargement factors.

To further validate our algorithm, we also provide objective measurements. Peak signal-to-noise ratio (PSNR) and root mean square (RMS) error are traditional quantitative measures of accuracy, and by comparing their values for the above images, we can conclude that the proposed EGPR algorithm yields interpolated pixel values that are much closer to their original high-quality values than those obtained with the bilinear algorithm, GPR algorithm, and ICBI algorithm. Tables 1 and 2 summarize the PSNR and RMS values for each algorithm at different magnification factors and for each image. It can be observed that the PSNR values for images obtained using the EGPR algorithm are the highest, and those using the bilinear algorithm are the lowest. Further, RMS values for images obtained using the EGPR algorithm are the lowest, and those using the bilinear algorithm are the highest. Overall, it can be clearly demonstrated that our new method outperforms the other three algorithms.

MSSIM [25] is an image quality assessment index which assesses the image visibility quality from an image formation point of view under the assumption of the correlation between human visual perception and image structural information. We compared the MSSIM obtained using the EGPR algorithm at different scale values with the corresponding values obtained using the bilinear, GPR, and ICBI algorithms, as shown in Table 3. It is noted that our new algorithm achieves a greater MSSIM than the other three algorithms, and the results show that the images obtained using our algorithm are closer to the original high-resolution images in terms of image structure similarity.

In addition, Figure 7 clearly demonstrates the quantitative assessment results for each image at different magnification levels. In this figure, the blue dots represent the quality scores of the images obtained using the comparison algorithms, and the red dots represent those obtained using our algorithm. Our interpolation algorithm is notably superior to the other algorithms, according to all three objective measurements. The proposed algorithm therefore yielded encouraging performance in terms of image visualization and quantitative quality assessment, making it a competitive image interpolation algorithm.

5. Conclusions

In this paper, we have presented a novel EGPR method for image interpolation. The main feature of this new algorithm is its ability to obtain relatively high prediction accuracy of the unknown pixels by fully utilizing underlying image patch information. The implementation process involves two steps: training and prediction. The former creates a GPR model using only single-image data as the training set, and the latter combines energy computation with the acquired model to produce a high-resolution image. Experiments have shown that our algorithm can yield encouraging performance not only in terms of image visualization but also in terms of PSNR, RMS, and MSSIM quality measures. However, better image interpolation comes at the expense of greater algorithm complexity. Methods of improving the algorithm efficiency need further investigation. In future, we can improve this algorithm to address the problem of the interpolation of image sequences. Images in the same sequence are also subject to the recurrence phenomenon, whereby images contain spatial-temporal correlation [26]. We believe that this problem can be addressed using the improved EGPR algorithm by finding an appropriate energy-driven computation and training mode.

Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program) 2012CB821200 (2012CB821206), the National Natural Science Foundation of China (no. 91024001, no. 61070142), and the Beijing Natural Science Foundation (no. 4111002).