Abstract

A selfie is typically a self-portrait captured using the front camera of a smartphone. Most state-of-the-art smartphones are equipped with a high-resolution (HR) rear camera and a low-resolution (LR) front camera. As selfies are captured by the front camera with limited pixel resolution, their fine details are inherently lost. This paper aims to improve the resolution of selfies by exploiting the fine details in HR images captured by the rear camera using an example-based super-resolution (SR) algorithm. HR images captured by the rear camera carry significant fine details and are used as exemplars to train an optimal matrix-value regression (MVR) operator. The MVR operator serves as an image-pair prior which learns the correspondence between LR-HR patch-pairs and is effectively used to super-resolve LR selfie images. The proposed MVR algorithm avoids vectorization of image patch-pairs and preserves image-level information during both the learning and recovery processes. The proposed algorithm is evaluated for its efficiency and effectiveness, both qualitatively and quantitatively, against other state-of-the-art SR algorithms. The results validate that the proposed algorithm is efficient, requiring less than 3 seconds to super-resolve an LR selfie, and effective, preserving sharp details without introducing counterfeit fine details.

1. Introduction

With the advent of smartphones having sophisticated camera technologies and integrated online social networking services, selfies have gained popularity among social media users. A selfie is typically a photograph that one has taken of oneself using the front camera of a smartphone. Most conventional smartphones have two cameras, a primary rear camera and a secondary front camera. As the front camera is mainly intended for video conferencing, it has limited pixel resolution compared with the rear camera. For instance, Apple's iPhone 6 has a 1.2-megapixel (MP) front camera, which is very limited compared with the primary 8 MP rear camera in terms of pixel resolution. Though the front camera is designed for video conferencing, it is often used to capture selfies. Selfies are low-resolution (LR) images, as their fine details are lost due to the hardware limitations of the front camera. While selfies are self-portraits that essentially comprise the facial information of the user, the background information in them is equally important. This vital background information can be an interesting scene, an astounding location, or a group of friends. Selfies are widely shared via social media; hence the volume of such images is burgeoning and there is a need to improve their quality.

Super-resolution (SR) algorithms [1] aim to generate a high-resolution (HR) image from a single LR image or an ensemble of LR images. Example-based SR algorithms [2–4] enhance the resolution of an LR image by learning the high frequency (HF) details from LR-HR training examples. A prior which defines the relation between the LR and HR images can be learned from the training image-pairs. The learned image-pair prior [5] can then be used to generate an HR image from the observed LR image. Conventional example-based SR algorithms can be classified into two categories with respect to the way the image-pair prior is learned from the training set, namely, implicit-prior and explicit-prior based methods. Implicit-prior based algorithms [6–8] represent the prior directly by the training image-pairs. Most of the traditional k-nearest neighbor algorithms [6, 9] are implicit and are computationally expensive, as they must search the k-nearest neighbors to estimate the HR image. Explicit-prior based algorithms either use a dictionary [10–12] or a regression function [13, 14] to map the correspondence between the LR-HR image-pairs. Dictionary based algorithms [15, 16] represent the prior between LR and HR image-pairs by an LR-HR dictionary pair. In regression based approaches, the regression function which maps the LR and HR image-pairs can be learned by either a supervised [13] or semisupervised [17] learning process. The time required to train an explicit image-pair prior is generally high. Therefore, conventional example-based SR algorithms are not suitable for super-resolving selfies.

The main challenge in super-resolving an LR selfie is to learn an image-pair prior which maps the LR to HR image-level correspondence with minimum computational complexity. As HR images captured by the rear camera preserve fine details, they can be used to learn a prior to super-resolve selfies. Most conventional example-based SR algorithms are implemented by vectorizing the training image-pairs [9, 15]. Due to vectorization, the image-level information between image-pairs is lost owing to structural disparity. Hence the vector-based prior which relates the LR-HR image-pairs is not effective [18]. To overcome this difficulty, a novel matrix-based prior was proposed by Tang and Yuan [18] to model the image-pair relationship. However, the matrix-based prior is derived under the assumption that most of the image patches extracted from natural training images are full rank [18]. Though this assumption is valid for natural images, patches extracted from real-life images with facial information and smooth textures are often rank deficient.

This paper endeavors to improve the spatial resolution of selfies by efficiently learning an optimal matrix-value regression (MVR) operator from LR-HR image patch-pairs extracted from training samples captured by the rear camera of the smartphone. The training image patch-pairs are factorized by singular value decomposition (SVD) to accommodate rank-deficient patch-pairs in the learning process. The MVR operator explicitly models the correspondence between the LR and HR training image patches to super-resolve LR selfies. As the proposed MVR algorithm avoids vectorization, it preserves the structural similarity of training image patches and exploits the image-level information within them. The computational cost of the proposed algorithm is greatly reduced by optimally selecting a larger patch-size in both the training and recovery phases, as larger patches carry significant image-level information. The main contributions of this paper are as follows:
(i) A fast selfie SR algorithm: LR selfies are super-resolved by a fast example-based algorithm using an optimal MVR operator learned from HR training images captured by the rear camera of the smartphone.
(ii) An effective and efficient MVR operator: The computational cost to learn the MVR operator is minimal. Also, it faithfully preserves the structural similarity between training image patch-pairs, which makes the MVR operator effective and efficient.
The remainder of the paper is organized as follows. A brief description of image-pair analysis methods is given in Section 2. In Section 3, the proposed SR methodology for selfie images is explained in detail. In Section 4, experimental evaluations are reported to compare the performance of the proposed method, and finally Section 5 concludes the paper.

2. Brief Description on Image-Pair Analysis Methods

Example-based super-resolution algorithms estimate the fine details that are missing in LR images by learning the correspondence between training image-pairs. The process of example-based super-resolution is summarized in Figure 1. Example-based SR algorithms require effective image-pair analysis methods to learn an image-pair regression operator, which defines a relation between LR-HR image-pairs. A training image-pair typically consists of an HR image and its corresponding synthetically generated LR image. A well-learned image-pair regression operator provides a precise correspondence between LR-HR patch-pairs and can be effectively used as a global prior in many inverse image processing tasks [19, 20]. Example-based SR is an ill-posed problem and requires sophisticated image-pair analysis methods [18] to learn a suitable regression operator from training examples.

Image-pair analysis methods are classified as vector-based and matrix-based methods. In vector-based image-pair analysis methods [15, 16], LR-HR image patch-pairs are represented as feature vectors and their correspondence is learned with an explicit vector-based regression operator. Though image patch-pairs are faithfully represented as vectors in vector-based methods, their image-level structural information is lost due to vectorization [21, 22]. Thus the problem of image-pair analysis is converted into a problem of vector-pair analysis. To avoid structural disparity and preserve image-level information within patch-pairs, a few matrix-based image-pair analysis methods have been suggested [18, 23]. In these methods [18, 23], a linear matrix-based regression operator is learned to map the global dependency [18] between LR-HR patch-pairs.

2.1. Matrix-Value Regression (MVR) Operator

An image patch-pair, denoted as $(X_i, Y_i)$, defines a linear matrix-value regression (MVR) operator $W_i$ such that
$$Y_i = W_i X_i. \quad (1)$$
If the image patch-pairs are assumed to be full rank matrices, then the MVR operator can be obtained as
$$W_i = Y_i X_i^{-1}, \quad (2)$$
where $X_i^{-1}$ refers to the matrix inverse of $X_i$.

The MVR operator profoundly depends on the full rank condition of its constituent patch-pairs to compute the matrix inverse. For rank-deficient matrices, computing the inverse is not stable. Hence, recent matrix-based image-pair analysis methods [18, 23, 24] assume the patch-pairs to be full rank matrices. However, the main difference between a selfie image and a general image lies in its information content. A typical selfie essentially carries the facial information of the user: a foreground with vivid facial features of similar texture and a background with less complex information. General images, in contrast, may contain arbitrary natural content with more complex structures, random patterns, and textures [25]. Image patches extracted from random natural images are intuitively assumed to be full rank [18] due to the complex structures in them. Though this assumption is valid (producing about 5% rank deficiency) for natural images, image patches extracted from selfie images are frequently rank deficient. To validate this, an experiment was carried out with 100,000 patches extracted from training images, and it was observed that approximately 50% of the patches are rank deficient, as shown in Table 1. This is attributed to the similar texture details present in the training samples. Furthermore, this percentage increases for larger patch-sizes as the patch coherence becomes higher. To accommodate rank-deficient patch-pairs in the image-pair prior, the matrix inverse is computed by factorizing the patch-pairs with singular value decomposition.
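To make the rank-deficiency issue and the SVD-based remedy concrete, the following minimal sketch (an illustration in Python/NumPy, not the authors' code; the patch list and tolerance are assumptions) estimates the fraction of rank-deficient patches and computes a stable SVD-based pseudo-inverse:

import numpy as np

def rank_deficient_fraction(patches, tol=1e-8):
    # patches: list of d x d grayscale patches (NumPy arrays)
    deficient = sum(1 for p in patches
                    if np.linalg.matrix_rank(p, tol=tol) < p.shape[0])
    return deficient / max(len(patches), 1)

def svd_inverse(X, tol=1e-8):
    # X = U S V^T  ->  X^+ = V S^+ U^T, inverting only non-negligible
    # singular values, so rank-deficient patches remain well-behaved.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_inv = np.where(s > tol * s.max(), 1.0 / s, 0.0)
    return Vt.T @ np.diag(s_inv) @ U.T

For a full rank patch, svd_inverse coincides with the ordinary matrix inverse; for a rank-deficient patch it returns the Moore-Penrose pseudo-inverse instead of failing.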

2.2. Similarity Measure via MVR Operator

The linear MVR operator $W_i$ precisely models the correspondence between the image patch-pair $(X_i, Y_i)$. Therefore, for an LR test patch $X$, from (1) we get
$$\hat{Y} = W_i X = Y_i X_i^{-1} X. \quad (3)$$

If the LR test patch $X$ is identical to the $i$th patch $X_i$ in the training set, then $X_i^{-1} X$ becomes an identity matrix. The term $X_i^{-1} X$ can therefore be viewed as a patch-similarity measure which defines the mutual information between $X_i$ and $X$. From (3), the HR estimation $\hat{Y}$ of the LR test image can be found effectively using the MVR operator.

2.3. Computational Efficiency via MVR Operator

The MVR operator significantly reduces the computational complexity by reducing the number of variables required to represent the operator. As the image patch-pairs are matrices of size $d \times d$, the image-pair regression operator is also a matrix of size $d \times d$; therefore only $d^2$ variables are required to represent the matrix-based regression operator. In vector-based approaches, however, each image patch is a column vector of size $d^2 \times 1$, so the regression operator that maps the two vectors must be a matrix of size $d^2 \times d^2$ and hence requires $d^4$ variables. For example, with $d = 32$ the matrix-based operator needs $32^2 = 1024$ variables, whereas the vector-based operator needs $32^4 \approx 1.05 \times 10^6$ variables.

3. The Proposed Selfie Super-Resolution Methodology

The overview of the proposed selfie SR methodology is illustrated in Figure 2. The example-based selfie SR algorithm consists of a training phase (performed offline), in which an optimal MVR operator is learned from a set of image patch-pairs extracted from the training image set, and a reconstruction phase, in which the test selfie image is super-resolved using the matrix-value regression (MVR) operator learned in the previous phase.

3.1. Training Set Construction

The training phase begins by collecting a few HR images $\{H^{(k)}\}$, captured by the rear camera of the smartphone, which are considered as HR examples. Each of these HR images is downscaled by a scale-factor $s$; the downscaled images form the corresponding LR images $\{L^{(k)}\}$. To avoid resolution disparity, the LR images are upscaled to the size of the target HR image by an interpolation operator and are denoted by $\{\tilde{L}^{(k)}\}$. The sets $\{H^{(k)}\}$ and $\{\tilde{L}^{(k)}\}$ form the training image-pairs. Let $Y_i$ and $X_i$ denote image patches of size $d \times d$ extracted from $H^{(k)}$ and $\tilde{L}^{(k)}$, respectively. For every image patch $Y_i$ extracted from the HR image $H^{(k)}$ at a given origin, there exists a self-similar example patch [25] $X_i$ around the same origin in the interpolated LR image $\tilde{L}^{(k)}$. The correspondence between $X_i$ and $Y_i$ is learned by an optimal MVR operator.
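As a rough illustration of this construction (a sketch under assumed names; cubic-order spline interpolation from SciPy stands in for whatever interpolation operator is actually used, and the scale-factor and patch-size values are placeholders), co-located LR-HR patch-pairs can be extracted as follows:

import numpy as np
from scipy.ndimage import zoom

def build_patch_pairs(hr_images, scale=3, patch=24):
    # hr_images: list of grayscale HR training images as float NumPy arrays
    pairs = []
    for hr in hr_images:
        # Crop so that the image dimensions are divisible by the scale-factor
        hr = hr[:hr.shape[0] - hr.shape[0] % scale,
                :hr.shape[1] - hr.shape[1] % scale]
        lr = zoom(hr, 1.0 / scale, order=3)   # downscale by the scale-factor
        lr_up = zoom(lr, scale, order=3)      # upscale back to the HR size
        for r in range(0, hr.shape[0] - patch + 1, patch):
            for c in range(0, hr.shape[1] - patch + 1, patch):
                X = lr_up[r:r + patch, c:c + patch]   # LR patch (interpolated)
                Y = hr[r:r + patch, c:c + patch]      # HR patch at the same origin
                pairs.append((X, Y))
    return pairs

Each returned pair (X, Y) corresponds to a patch-pair $(X_i, Y_i)$ used in the next subsection.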

3.2. Algorithm to Learn Optimal MVR Operator

Let the training patch-pairs be denoted as $\{(X_i, Y_i)\}_{i=1}^{N}$, where $X_i$ and $Y_i$ are the low- and high-resolution patches of size $d \times d$ and $N$ is the number of training patch-pairs. Let $W$ be a MVR operator mapping the low-resolution image space to the high-resolution image space.

The optimal MVR operator is subsequently learned from the training set using the least square regression model given by
$$W^{*} = \arg\min_{W} \sum_{i=1}^{N} \left\| W X_i - Y_i \right\|_F^2, \quad (4)$$
where $\|\cdot\|_F$ is the Frobenius norm. Let $J(W)$ be the cost function such that (4) becomes
$$W^{*} = \arg\min_{W} J(W), \quad (5)$$
where
$$J(W) = \sum_{i=1}^{N} \left\| W X_i - Y_i \right\|_F^2. \quad (6)$$

To obtain the optimal MVR operator, the target function is expanded as
$$J(W) = \operatorname{tr}\left( W A W^{T} \right) - 2 \operatorname{tr}\left( B W^{T} \right) + \operatorname{tr}\left( C \right), \quad (7)$$
where $A = \sum_{i=1}^{N} X_i X_i^{T}$, $B = \sum_{i=1}^{N} Y_i X_i^{T}$, and $C = \sum_{i=1}^{N} Y_i Y_i^{T}$ are the auxiliary matrices.

The optimal MVR operator can be deduced by imposing the condition for minimization on (7); hence
$$\frac{\partial J(W)}{\partial W} = 2 W A - 2 B = 0. \quad (8)$$

Therefore, the optimal MVR operator is given by
$$W^{*} = B A^{-1} = \left( \sum_{i=1}^{N} Y_i X_i^{T} \right) \left( \sum_{i=1}^{N} X_i X_i^{T} \right)^{-1}. \quad (9)$$

The inverse of the auxiliary matrix $A$ is computed by factorizing it with SVD; thus $A = U \Sigma V^{T}$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix with the singular values. Thus
$$W^{*} = B V \Sigma^{-1} U^{T}. \quad (10)$$

The optimal MVR operator shown in (10) explicitly represents the image-level correspondence between the low- and high-resolution image patch-pairs. The MVR operator resulting from the training phase is used to reconstruct the fine details from the low-resolution selfie images. The procedure to deduce the optimal MVR operator is summarized in Algorithm 1.

Input: Training image patch-pairs $\{(X_i, Y_i)\}_{i=1}^{N}$
Output: Optimal matrix-value operator $W^{*}$
Steps:
   (1) Calculate the auxiliary matrices $A$ and $B$:
      $A = \sum_{i=1}^{N} X_i X_i^{T}$
      $B = \sum_{i=1}^{N} Y_i X_i^{T}$
   (2) Factorize the auxiliary matrix $A$ using SVD:
      $A = U \Sigma V^{T}$
   (3) Find the inverse of the auxiliary matrix:
      $A^{-1} = V \Sigma^{-1} U^{T}$
   (4) Find the optimal matrix-value operator:
      $W^{*} = B A^{-1} = B V \Sigma^{-1} U^{T}$
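A compact Python/NumPy sketch of Algorithm 1 under the notation above (an illustration only, not the authors' implementation; the tolerance used to discard negligible singular values is an assumption):

import numpy as np

def learn_mvr_operator(pairs, tol=1e-8):
    # pairs: list of (X_i, Y_i) patch-pairs, each a d x d NumPy array
    d = pairs[0][0].shape[0]
    A = np.zeros((d, d))          # auxiliary matrix A = sum_i X_i X_i^T
    B = np.zeros((d, d))          # auxiliary matrix B = sum_i Y_i X_i^T
    for X, Y in pairs:
        A += X @ X.T
        B += Y @ X.T
    # SVD-based inverse of A; a rank-deficient A is handled by zeroing
    # the reciprocals of negligible singular values (pseudo-inverse).
    U, s, Vt = np.linalg.svd(A)
    s_inv = np.where(s > tol * s.max(), 1.0 / s, 0.0)
    A_inv = Vt.T @ np.diag(s_inv) @ U.T
    return B @ A_inv              # W* = B A^{-1}, cf. (9) and (10)

The operator returned here plays the role of $W^{*}$ in the reconstruction phase described next.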
3.3. Algorithm for SR Reconstruction

In the reconstruction phase, LR selfies captured by the front camera are super-resolved using the MVR operator learned from Algorithm 1. In addition, the MVR operator is adapted to the test selfie itself by a bootstrapping approach [16]: the given test selfie is treated as the HR image and its scaled-down version as the LR counterpart, and the correspondence between the LR-HR patch-pairs extracted from these bootstrapped image-pairs is used to update the optimal MVR operator. The test selfie is interpolated by the scale-factor $s$ with the interpolation operator used in training. Nonoverlapping image patches of size $d \times d$ are extracted from the interpolated test image; this collection of low-resolution patches is represented as $\{\tilde{X}_j\}_{j=1}^{M}$. Every test LR image patch in the set is super-resolved using the optimal MVR operator $W^{*}$, such that
$$\hat{Y}_j = W^{*} \tilde{X}_j. \quad (11)$$

The super-resolved test image patches $\hat{Y}_j$ are merged to form the super-resolved high-resolution image $\hat{Y}$. The steps involved in the reconstruction phase are summarized in Algorithm 2.

Input: Optimal matrix-value operator $W^{*}$, LR selfie test image
Output: Super-resolved selfie image $\hat{Y}$
Steps:
   (1) Construct nonoverlapping patches of size $d \times d$ from the interpolated test selfie image:
      $\{\tilde{X}_j\}_{j=1}^{M}$
   (2) For every test image patch $\tilde{X}_j$, find the super-resolved patch:
      $\hat{Y}_j = W^{*} \tilde{X}_j$
   (3) Merge the super-resolved patches:
      $\hat{Y} = \text{merge}\left( \hat{Y}_1, \ldots, \hat{Y}_M \right)$
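The reconstruction phase then reduces to a patch-wise matrix product, sketched below in Python/NumPy (interpolation, scale-factor, and patch-size follow the assumptions made in the training sketch):

import numpy as np
from scipy.ndimage import zoom

def super_resolve(lr_selfie, W, scale=3, patch=24):
    # lr_selfie: grayscale LR selfie as a float NumPy array; W: learned MVR operator
    up = zoom(lr_selfie, scale, order=3)       # interpolate to the target size
    out = up.copy()                            # border pixels keep interpolated values
    for r in range(0, up.shape[0] - patch + 1, patch):   # non-overlapping patches
        for c in range(0, up.shape[1] - patch + 1, patch):
            X = up[r:r + patch, c:c + patch]
            out[r:r + patch, c:c + patch] = W @ X        # Y_hat = W* X, cf. (11)
    return out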

4. Results and Discussions

The proposed algorithm is evaluated for its effectiveness and efficiency by conducting both qualitative and quantitative experiments on the various test images shown in Figures 3 and 4. The test images are super-resolved using state-of-the-art approaches such as Yang et al.'s sparse representation based algorithm [15], Kim et al.'s sparse regression algorithm [9], Dong et al.'s nonlocal autoregressive modeling (NARM) algorithm [26], and He et al.'s Gaussian process regression algorithm [27], and their performance metrics are estimated and compared. Among the algorithms chosen for comparison, Yang et al.'s, Kim et al.'s, and the proposed algorithm are training-based algorithms, whereas Dong et al.'s and He et al.'s algorithms are training-free. The results of the aforementioned algorithms are obtained using the source codes available on the authors' homepages.

4.1. Experimental Setup

In the experiments carried out, the test images shown in Figures 3 and 4 are used as LR images. Though the algorithm is proposed to super-resolve LR selfie images, a few standard test images (shown in Figure 3) such as Barbara, Girl, and Lena are used to fairly compare the performance of the proposed algorithm with other state-of-the-art SR algorithms.

To evaluate the effectiveness of the proposed algorithm on selfies, various test selfies captured by different smartphones such as the iPhone 4s, iPhone 6, and Nexus 5, with diverse specifications, were collected. Figure 4 shows the selfie test images used for comparison, in which images (#1) and (#2) are captured by a Nexus 5 with a resolution of 2 MP, and (#3) and (#4) are captured by an iPhone 4s with a pixel resolution of 1 MP. Images (#5) and (#6) are captured by an iPhone 6 with a spatial resolution of 1.2 MP, and (#7) shows the famous Oscar selfie image (courtesy: Google Images). The training dataset is generated offline from a collection of HR images captured by the rear camera of the smartphone. The training dataset is limited to 50 HR images with different poses and exposures, extracted from the root directory of the smartphone. However, the number of training examples can be extended by adding more examples to the training set. The HR images captured by the Nexus 5 have a pixel resolution of 12 MP, and those captured by the iPhone 6 and iPhone 4s have a pixel resolution of 8 MP. Sample training HR images are shown in Figure 5.

The training and testing color images are converted to the YCbCr color space and only the luminance channel is considered for super-resolution, as the human eye is most sensitive to it. The LR images are synthetically generated by downsampling the test images shown in Figures 3 and 4 using a bicubic interpolator. The downsampled LR images are resized to the size of the target HR image and are contiguously blocked into nonoverlapping patches of size $d \times d$. The LR test images are super-resolved by different scale-factors, and LR-HR training image-pairs are generated with the same scale-factor as used for testing. All the experiments were carried out using Matlab R2012 on an Intel Core [email protected] GHz processor with 4 GB RAM.
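The luminance-only processing described above can be reproduced roughly as follows (a sketch, not the authors' code; it reuses the hypothetical super_resolve function from the earlier sketch and scikit-image's color conversions, and simply interpolates the chroma channels):

import numpy as np
from scipy.ndimage import zoom
from skimage.color import rgb2ycbcr, ycbcr2rgb

def super_resolve_color(lr_rgb, W, scale=3, patch=24):
    # lr_rgb: LR selfie as an RGB float image in [0, 1]
    ycbcr = rgb2ycbcr(lr_rgb)                              # Y in [16, 235], Cb/Cr in [16, 240]
    y_sr = super_resolve(ycbcr[..., 0], W, scale, patch)   # SR on luminance only
    cb = zoom(ycbcr[..., 1], scale, order=3)               # chroma: plain interpolation
    cr = zoom(ycbcr[..., 2], scale, order=3)
    sr = ycbcr2rgb(np.dstack([y_sr, cb, cr]))
    return np.clip(sr, 0.0, 1.0)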

4.2. Experimental Analysis

Effectiveness. Qualitative and quantitative evaluations are carried out to assess the effectiveness of the proposed algorithm. Qualitative evaluation of SR methods relies on a few attributes of the reconstructed image such as sharpness, naturalness, and granularity [28]. The sharpness of an image is assessed based on the HF details it preserves. The naturalness of an image is affected by the artifacts present in it; ghosting, ringing, jagging, and staircase artifacts generally degrade image quality. A visual comparison is made to assess the fidelity of the proposed algorithm qualitatively. The effectiveness of the proposed method is quantitatively evaluated using objective performance metrics such as root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and the structural similarity (SSIM) index [29]. A high PSNR score indicates that the scaled-up image is free from distortions and effectively reconstructs the HF details. Similarly, a high SSIM value (ideally 1) implies that the scaled-up image has a structure very similar to its ground truth. For a fair comparison, the standard test images shown in Figure 3 are super-resolved using the proposed method and compared with the aforementioned algorithms. Table 2 summarizes the quantitative comparison of various SR algorithms on the test images for 3x magnification.
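For reference, the objective metrics used in this comparison can be computed as follows (a sketch assuming 8-bit grayscale ground-truth and reconstructed images; structural_similarity is scikit-image's SSIM implementation, used here as a stand-in for the SSIM index of [29]):

import numpy as np
from skimage.metrics import structural_similarity

def rmse(ref, est):
    diff = ref.astype(np.float64) - est.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(ref, est, peak=255.0):
    e = rmse(ref, est)
    return float('inf') if e == 0 else 20.0 * np.log10(peak / e)

def ssim(ref, est):
    return structural_similarity(ref, est, data_range=255)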

Figure 6 shows the 2x visual comparison for the standard test image Barbara. Figure 6(a) shows the ground truth and its corresponding scaled-up local image, and Figure 6(b) shows the LR image and its corresponding local image. Figures 6(c)–6(e) depict the super-resolved images and their local images produced by Yang et al.'s, Kim et al.'s, and Dong et al.'s algorithms. Figure 6(f) shows the SR image and its corresponding local image super-resolved by the proposed MVR algorithm. In Figure 6(c), the texture on the table cloth is blurred compared with the ground truth. Though the stripes in the table cloth are sharp in Figure 6(d), the pattern does not match the ground truth, as the fine details in the table cloth are not well preserved. Dong et al.'s method reconstructs the texture as in the ground truth; however, it introduces ringing and jagging artifacts, as observed in Figure 6(e), and accordingly has a low PSNR value. As observed from Figure 6(f), it is evident that the proposed algorithm preserves sharp texture details as in the ground truth and is free from artifacts.

For visual comparison on test selfies, 3x magnification is carried out. Figure 7 depicts the qualitative visual comparison for five test selfie images. Figure 7(a) depicts the test LR selfie images. Figures 7(b)–7(e) depict the SR images reconstructed by Yang et al.'s, Kim et al.'s, Dong et al.'s, and He et al.'s algorithms, and Figure 7(f) shows the proposed SR images. The local region of interest (ROI) is highlighted in red boxes and is presented in the bottom left corner of each image. In Yang et al.'s SR model based on sparse representation [15], two coupled dictionaries are trained simultaneously from random raw image patches. Based on a dictionary pretrained from thousands of natural images, Yang et al.'s method produces natural-looking results. Though it faithfully reconstructs natural-looking images, Table 2 shows that its objective measures are not the best among the compared algorithms, because the fine details in the image are not well preserved: the universal dictionary used in this method fails to represent complex structures accurately. For instance, the spectacle frame in the ROI of test image (#1) in Figure 7(b) looks sharp and natural, but for the ROI of test image (#4) in Figure 7(b), the structure of the letters is not preserved. Because a natural image prior is used to postprocess the SR image, Kim et al.'s [9] method reproduces more visually appealing images and preserves minute details (the eyelash in the ROI of test image (#3) in Figure 7(c)). The PSNR and SSIM values for Kim et al.'s method are better than those of the other comparative algorithms, as postprocessing with an image edge prior is carried out on the reconstructed image. Nevertheless, for images with complicated edges, the edge prior tends to introduce ringing artifacts along the corners of edges, which reduces the PSNR and SSIM values; for example, artifacts can be seen in the fan rails of the ROI of test image (#5) in Figure 7(c). In Dong et al.'s [26] method, overly smooth HF details are recovered, as in the ROI of test image (#2) in Figure 7(d), and artifacts are introduced, as can be seen in the ROI of test image (#3) in Figure 7(d). Due to these artifacts, the average PSNR and SSIM values are lower for Dong et al.'s method. The characters in the ROI of test image (#4) in Figure 7(e) are not faithfully reconstructed by He et al.'s [27] Gaussian process regression method. In contrast, the proposed method preserves the sharp details and fine textures in most of the images without affecting their naturalness. It also provides more photorealistic details, as it does not introduce any counterfeit fine details. The effectiveness of the proposed algorithm is quantitatively validated by the PSNR and SSIM values in Table 2: the proposed method achieves the best PSNR and SSIM values, indicating that it reconstructs the LR image with minimal distortions, and the high SSIM value corroborates that structural similarity is preserved by the proposed matrix-based regression algorithm.
The proposed method performs better than other state-of-the-art SR approaches because it avoids vectorization of image patch-pairs during the training phase of the MVR operator, which preserves structural similarity and image-level information within patch-pairs. Moreover, as the MVR operator is trained with HR images captured by the rear camera of the smartphone, it effectively captures the relation between LR-HR patch-pairs, thereby improving the performance of the proposed algorithm. For instance, in the highlighted ROI of test image (#1) shown in Figure 7(f), the fine details in the frame of the spectacles are well preserved. Similarly, the shadow of the pole in the ROI of test image (#2) is very clear. In the ROI of test image (#3) shown in Figure 7(f), very fine details in the eye such as the eyebrow and eyelashes are sharp and the HF details are preserved. Also, the structure of the letters in the ROI of test image (#4) in Figure 7(f) is preserved when compared with other state-of-the-art approaches.

Efficiency. The efficiency of the proposed matrix-based SR algorithm is compared with the aforementioned algorithms on a personal computer with an Intel Core [email protected] GHz processor and 4 GB RAM.

The computation time required to train and recover the images is reported in Table 3. Among the training-free algorithms (Dong et al.'s and He et al.'s), the average CPU time taken to recover the SR image by He et al.'s Gaussian process regression algorithm [27] is significantly high, as the source code available on the author's homepage is not optimized. The NARM based SR algorithm by Dong et al. [26] takes approximately 3-6 minutes to recover the HR image at the tested magnification factor. It is evident from Table 3 that the training time required by training-based SR algorithms such as Yang et al.'s and Kim et al.'s is significantly high, as they have to extract training image patches from an extensive dataset to train a universal dictionary. Owing to the fact that image patches are represented as matrices and large patches are used in the proposed MVR algorithm, its computational time is significantly lower (less than a minute), thereby outperforming the other state-of-the-art approaches. The experimental results presented in Table 3 reveal that the proposed MVR algorithm can be efficiently applied to super-resolve LR selfie images with minimum computational expense.

4.3. Influence of Patch-Size

The size of the image patch used in the training and recovery phases significantly influences the performance of the algorithm. Intuitively, selecting a larger patch-size may produce overly smooth results, whereas a small patch tends to produce undesired artifacts in smooth areas of the image. In addition, the computational cost of the algorithm is influenced by the patch-size. Hence a performance evaluation based on variation in patch-size for the proposed algorithm is carried out and depicted in Figure 8. The magnified ROI highlighted in the red box is compared for visual fidelity. In addition, a quantitative analysis based on PSNR for different patch-sizes is reported in Table 4. The training patch-size is varied with a step size of 8 pixels. For the smallest patch-size, as in Figure 8, the freckles near the eye are relatively blurred, which is quantitatively confirmed in Table 4. The qualitative and quantitative performance of the proposed algorithm increases as the patch-size is increased and reaches a maximum at a particular patch-size, as shown in Figure 8 and Table 4, respectively. For instance, the freckles near the eyes are comparatively crisper, and hence the eyes look sharper and more natural, for the patch-size used in Figure 8(e). Because the image-level information between patches is preserved by the proposed matrix-based regression algorithm, the performance of the proposed algorithm is better for larger patch-sizes. However, a patch-size that is too large reduces the performance of the algorithm, as it becomes more difficult to exploit the image-level information within the patches.

4.4. Influence of Scale-Factor

The test images are magnified by different scale-factors and the performance is evaluated. For visual comparison, test image (#6) is upscaled by factors of 2x, 3x, 4x, and 5x by the proposed algorithm, as depicted in Figure 9. The ROI considered for visual evaluation is the texture of the shirt. It is observed from Figure 9 that the texture details are well preserved for 2x magnification. For 3x magnification, the proposed algorithm is still able to preserve the fine texture details, as the interleaved pattern in the shirt is clearly visible. For 4x magnification, though the pattern in the ROI is visible, the fine details in it are lost, and ringing artifacts along the edges affect the quality of the image. Furthermore, the texture details are lost for a magnification factor of 5x. The results are quantified by the PSNR values tabulated in Table 5.

4.5. Influence of Training Dataset

Training images captured by the rear camera of the smartphone serve as fine exemplars to train the MVR operator, and the performance of the proposed algorithm can be influenced by the training dataset used. To validate this, a performance evaluation based on variation of the dataset is carried out. It is observed that training images from the same device as the test image lead to better results than training and testing images taken from different devices. For visual comparison, test selfie (#2) is super-resolved by a factor of 3x using the MVR operator trained on four different datasets, as depicted in Figure 10. Training dataset TR1 is a collection of random natural images, while training datasets TR2, TR3, and TR4 contain example images captured by the rear cameras of the iPhone 4s, iPhone 6, and Nexus 5, respectively. From Figure 10(e), it is observed that the freckles beneath the eye are sharp and crisp in the image super-resolved using the MVR operator trained with TR4. This is because both the training examples in TR4 and the test selfie (#2) are captured by the same smartphone. As the rear camera is used by the same user, example images captured with it tend to possess low-level image features such as texture, granularity, and exposure similar to those in the selfie captured by the front camera. In addition, the facial information contained in the selfies can reoccur in the training set, as it is captured by the same user. This self-similarity improves the interdependency between the images and results in a more robust and efficient MVR operator. The results are quantified by the PSNR values tabulated in Table 6.

5. Conclusion

In this paper, a fast example-based SR algorithm for super-resolving LR selfie images is presented. The proposed SR algorithm learns an optimal matrix-value regression (MVR) operator from a set of training samples captured by the rear camera of a smartphone. The relation between LR-HR training patch-pairs is established by the optimal MVR operator, which preserves structural similarity across training patch-pairs and effectively represents the image-level information of the training image patches. The operator is used to super-resolve clean LR selfie images captured by the front camera of the smartphone, and it is observed that the fine details in the super-resolved test selfies are preserved. Qualitative and quantitative experiments have validated the efficiency and effectiveness of the proposed algorithm over other state-of-the-art SR algorithms. In the future, the proposed algorithm will be extended to super-resolve distorted selfie images.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.