Xuan Zhu, Xianxian Wang, Jun Wang, Peng Jin, Li Liu, Dongfeng Mei, "Image Super-Resolution Based on Sparse Representation via Direction and Edge Dictionaries", Mathematical Problems in Engineering, vol. 2017, Article ID 3259357, 11 pages, 2017. https://doi.org/10.1155/2017/3259357
Image Super-Resolution Based on Sparse Representation via Direction and Edge Dictionaries
Sparse representation has recently attracted considerable interest in the field of image super-resolution. Sparsity-based methods usually train a single pair of global dictionaries. However, one pair of global dictionaries cannot sparsely represent all kinds of image patches well, as it neglects two of the most important image features: edge and direction. In this paper, we propose to train two novel pairs of Direction and Edge dictionaries for super-resolution. For single-image super-resolution, the training image patches are divided into two clusters by two new templates representing direction and edge features. For each cluster, a pair of Direction and Edge dictionaries is learned. Sparse coding is then combined with the Direction and Edge dictionaries to realize super-resolution. This single-image method restores faithful high-frequency details, while the projection onto convex sets (POCS) method conveniently incorporates many kinds of constraints or priors. We therefore combine the two methods to realize multiframe super-resolution. Extensive experiments on image super-resolution are carried out to validate the generality, effectiveness, and robustness of the proposed method. Experimental results demonstrate that our method can recover better edge structure and details.
1. Introduction
In video surveillance, medical imaging, satellite observation, and other applications, limitations of the imaging equipment, hardware storage, and natural environment mean that we usually obtain low-resolution (LR) images. However, high-resolution (HR) images are often needed for subsequent image processing and analysis in most practical applications. As an effective approach to this problem, the super-resolution (SR) technique estimates an HR image from one LR image or a sequence of LR images. SR restores high-frequency components and removes resolution degradation, blur, noise, and other undesirable effects by making full use of the information in the existing data.
As an active research direction in image processing, SR has been studied for more than three decades, and many SR approaches have been proposed. According to the number of input LR images, SR approaches can be broadly classified into two categories: single-image SR and multiframe SR. According to the processing method, they fall into three groups: interpolation-based methods, reconstruction-based methods, and learning-based methods.
Interpolation-based methods compute the value of each interpolated point as a weighted combination of its surrounding pixels. The classical interpolation methods include nearest-neighbor, bilinear, and bicubic interpolation. Although such methods are simple in principle and low in complexity, they tend to produce considerable blurring and jagged artifacts.
The reconstruction-based methods [7–10] are usually used for multiframe SR. These methods usually incorporate reconstruction constraints or prior knowledge to build a regularized cost function with a data-fidelity term. They are able to recover better edges and suppress aliasing artifacts. However, they cannot restore fine structures when the upscaling factor is large, as their performance depends heavily on the nonredundant complementary information among the input LR images.
The learning-based methods have become a research hotspot in recent years. These methods exploit the information in training images to establish the relationship between HR and LR image patches. As this relationship reflects the inherent similarity among natural images, the learning-based methods can restore high-frequency information effectively. Typical examples include the Example-Based method, the Neighbor Embedding method, the Sparse Coding method [14–16], and the Anchored Neighborhood Regression method. In 2010, Yang et al. [18] proposed an image SR method via sparse representation, which provides better reconstruction results. In 2012, Zeyde et al. [19] improved the efficiency of Yang's method by reducing the dimension of the training samples and using the K-SVD algorithm to train the dictionaries. In 2014, Farhadifard et al. [20] presented a single-image SR method based on sparse representation via directionally structured dictionaries. It avoids a problem shared by Yang et al. [18] and Zeyde et al. [19], namely, that using the same dictionary for the sparse representation of all image patches cannot reflect the differences in their structural characteristics. Usually, learning-based methods need a large and representative database, which leads to high computational cost when training the dictionaries.
Inspired by the work of [18, 20] and considering the importance of the learned dictionary, we present a novel Direction and Edge dictionary model for image SR. First, a pair of Direction and Edge templates is built to classify the training image patches into two clusters. Then each cluster is trained to obtain a pair of HR and LR overcomplete dictionaries, giving two pairs of Direction and Edge dictionaries in total. Finally, sparse coding and the Direction and Edge dictionaries are combined to realize single-image SR. Because the performance of reconstruction-based methods degrades rapidly when the upscaling factor is large, we also combine the above single-image SR method with POCS to realize multiframe SR. Experimental results show that our method is feasible and effective and that it preserves edges and texture better.
The rest of this paper is organized as follows. Section 2 introduces sparse representation and the Direction and Edge learning dictionaries. Section 3 describes the proposed sparse-representation-based image SR using Direction and Edge dictionaries. The experimental results of single-image and multiframe SR and their evaluation are given in Section 4. Section 5 concludes the paper.
2. Sparse Representation and Direction and Edge Learning Dictionaries
2.1. Sparse Representation
After blurring $H$ and downsampling $S$, the HR image $X$ is degraded into the LR image $Y$:
$$Y = SHX, \tag{1}$$
where, writing $L = SH$, we have $Y = LX$. If $x$ is an image patch taken from $X$, then $y$ is the image patch taken from $Y$ at the same location as $x$. The sparse representation model is as follows:
$$x \approx D_h \alpha, \tag{2}$$
where $\alpha$ is the sparse representation coefficient of $x$ and $D_h$ is the HR overcomplete dictionary. Assuming an LR overcomplete dictionary $D_l = L D_h$, then $y = Lx \approx L D_h \alpha = D_l \alpha$, so the HR and LR image patches have the same sparse representation coefficient. As a result, given a pair of HR and LR dictionaries as prior knowledge, we are able to rebuild the corresponding HR image patch as long as we acquire the sparse representation coefficient of the LR image patch.
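The shared-coefficient idea can be illustrated with a minimal sketch. It assumes a toy pair of coupled dictionaries (a low-coherence LR dictionary built from identity and Hadamard atoms, and a random HR dictionary) and a small orthogonal matching pursuit helper named `omp`; the sizes, the names, and the choice of pursuit algorithm are assumptions made for the example, not the settings of the paper.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y using n_nonzero atoms (columns) of D."""
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # Refit all selected atoms by least squares and update the residual.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

# Toy LR dictionary (16-dim features, 32 unit-norm atoms): identity plus scaled Hadamard.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
D_l = np.hstack([np.eye(16), H / 4.0])

# Toy HR dictionary coupled atom-by-atom with D_l (random values, purely illustrative).
rng = np.random.default_rng(0)
D_h = rng.standard_normal((36, 32))

# One sparse code generates both the HR patch and the LR feature vector.
alpha = np.zeros(32)
alpha[[3, 20]] = [1.5, -2.0]
x_true = D_h @ alpha
y = D_l @ alpha

# Recover the code from the LR side only, then reuse it with the HR dictionary.
alpha_hat = omp(D_l, y, n_nonzero=2)
x_hat = D_h @ alpha_hat
print(np.allclose(alpha_hat, alpha), np.allclose(x_hat, x_true))
```

The LR dictionary is deliberately chosen with low mutual coherence so that the greedy pursuit recovers a 2-sparse code exactly; with learned dictionaries and real patches the recovery is only approximate.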
2.2. Direction and Edge Learning Dictionaries
The quality of the reconstructed image depends largely on the expressive ability of the overcomplete dictionary. In Yang et al. [18], the dictionary training scheme is as follows:
$$\min_{D_h, D_l, Z} \|X^h - D_h Z\|_2^2 + \|Y^l - D_l Z\|_2^2 + \lambda \|Z\|_1, \tag{3}$$
where $X^h$ is the set of sampled HR training image patches, $Y^l$ is the set of corresponding LR training image patches, $Z$ is the sparse representation coefficient, and $\lambda$ is a balance parameter.
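Based on the same sparse representation model (2), Zeyde et al. [19] modify the above dictionary training method: the LR dictionary $D_l$ is trained from the LR set $\{p_l^k\}_k$ by applying the K-SVD algorithm to solve the following minimization problem:
$$D_l, \{q^k\} = \arg\min_{D_l, \{q^k\}} \sum_k \|p_l^k - D_l q^k\|_2^2 \quad \text{s.t. } \|q^k\|_0 \le T \text{ for all } k, \tag{4}$$
where $T$ denotes the sparsity constraint. The obtained sparse representation matrix $Q = [q^1, q^2, \ldots]$ is used to infer the dictionary $D_h$ as follows:
$$D_h = P_h Q^{+} = P_h Q^{T} (Q Q^{T})^{-1}, \tag{5}$$
where the columns of $P_h$ are the HR training patches.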
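As a minimal sketch of the second step, the HR dictionary can be obtained as the least-squares solution for a fixed code matrix by means of a pseudo-inverse. The names `P_h` (HR training vectors stored as columns), `Q` (sparse codes stored as columns), and `infer_hr_dictionary` are assumptions made for this illustration, and the data are synthetic.

```python
import numpy as np

def infer_hr_dictionary(P_h, Q):
    """Least-squares D_h minimizing ||P_h - D_h Q||_F for a fixed code matrix Q."""
    return P_h @ np.linalg.pinv(Q)

rng = np.random.default_rng(1)

# Synthetic ground truth: 8 atoms of dimension 36 and 200 sparse code vectors.
D_true = rng.standard_normal((36, 8))
mask = rng.random((8, 200)) < 0.3           # keep roughly 30% of the entries
Q = rng.standard_normal((8, 200)) * mask    # sparse representation matrix
P_h = D_true @ Q                            # HR training vectors (noise-free)

D_h = infer_hr_dictionary(P_h, Q)
print(np.allclose(D_h, D_true))
```

Because the synthetic patches are generated without noise and `Q` has full row rank, the inferred dictionary matches the ground truth; on real data the result is only the best fit in the least-squares sense.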
The dictionary training schemes of Yang et al. [18] and Zeyde et al. [19] share two drawbacks: (i) the large scale of the training sample sets leads to a heavy computational burden in the training process; (ii) a single pair of global dictionaries, whose representation ability is limited, ignores the differences between image patches.
It has been shown that designing multiple dictionaries is more beneficial than designing a single one. Furthermore, it has been pointed out that using clustering to design several dictionaries improves quality and reduces computational complexity. In 2014, Farhadifard et al. [20] trained eight pairs of directionally structured dictionaries for directional patches and one pair of dictionaries for nondirectional patches. First, the two-dimensional space is divided into eight fixed directions. Then eight template sets are designed, each containing several templates. Finally, these templates are applied to classify the training sets into eight directional clusters and one nondirectional cluster, and a pair of dictionaries is learned for each cluster.
Edges represent the large-scale structure of an image and are characteristically smooth, so the human visual system is particularly sensitive to them. Besides, image content is highly directional. In short, edge and direction are the most important features of an image. In order to better capture the intrinsic direction and edge characteristics of an image, we design Direction and Edge dictionaries for different clusters of patches, instead of a global dictionary for all the patches.
Considering the significant difference between edge pixels and their neighboring pixels and the strong directionality of images, we design a new pair of Direction and Edge templates, as shown in Figure 1. Template A represents the vertical direction and edge, while template B represents the horizontal direction and edge.
Direction and Edge templates are used to guide the clustering of image patches and thereby obtain the Direction and Edge dictionaries. First, the training image patches are classified into two clusters using Euclidean distance as the criterion: the distances between each image patch and the two templates are computed, and the smaller one determines the cluster to which the patch belongs. Then the two clusters are trained separately to obtain two pairs of HR and LR dictionaries, which are referred to as the Direction and Edge dictionaries.
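The clustering rule can be sketched as follows. The actual templates are defined in Figure 1 of the paper and are not reproduced in the text, so the two 6 by 6 step templates below, the min-max normalization, and the names `TEMPLATE_A`, `TEMPLATE_B`, `normalize`, and `assign_cluster` are hypothetical stand-ins chosen only to make the example concrete.

```python
import numpy as np

# Hypothetical 6x6 step templates (assumed here; the paper's templates are in its Figure 1).
TEMPLATE_A = np.hstack([np.zeros((6, 3)), np.ones((6, 3))])  # vertical direction and edge
TEMPLATE_B = TEMPLATE_A.T                                    # horizontal direction and edge

def normalize(patch):
    """Scale a patch to [0, 1]; flat patches map to all zeros (an assumed convention)."""
    patch = patch.astype(float)
    span = patch.max() - patch.min()
    if span == 0:
        return np.zeros_like(patch)
    return (patch - patch.min()) / span

def assign_cluster(patch):
    """Return 0 if the patch is at least as close to template A as to B, otherwise 1."""
    p = normalize(patch)
    dist_a = np.linalg.norm(p - TEMPLATE_A)
    dist_b = np.linalg.norm(p - TEMPLATE_B)
    return 0 if dist_a <= dist_b else 1

# Two synthetic patches: one with a vertical edge, one with a horizontal edge.
vertical_patch = 10 + 200 * TEMPLATE_A
horizontal_patch = 10 + 200 * TEMPLATE_B
print(assign_cluster(vertical_patch), assign_cluster(horizontal_patch))
```

Ties are sent to the first cluster in this sketch; the source does not say how ties are handled.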
The Direction and Edge dictionaries have several advantages: (i) they are expected to better represent the intrinsic direction and edge characteristics of natural images; (ii) the HR image reconstructed with them inherits the large-scale information of natural images and contains more high-frequency information, which is the most important part for SR; (iii) they reduce computational complexity, because structural dictionaries can be smaller than a global dictionary.
In order to improve the efficiency of the algorithm, our templates have a size of 6 by 6. Compared with Farhadifard et al. [20], our method uses only two templates, which consider not only the direction but also the edge features. In addition, there is no need to set a specific threshold, which is required for clustering nondirectional patches in [20]. Other classification templates are of course possible.
3. Image SR Based on Direction and Edge Dictionaries
3.1. Single-Image SR
The single-image SR based on Direction and Edge dictionaries includes three steps: training set construction, Direction and Edge dictionary training, and image reconstruction, as shown in Figures 2, 3, and 4. In the training set construction phase, overlapping patches are taken from the training images, and all patches are classified into two clusters according to the Euclidean distances. In the Direction and Edge dictionary training phase, we obtain the LR dictionary for each cluster of the LR training set using the K-SVD algorithm and then obtain the corresponding HR dictionary by (5). In the reconstruction phase, after the sparse representation coefficient of an LR patch is computed, the HR patch is obtained by multiplying the HR dictionary of the corresponding class by this coefficient.
3.1.2. Algorithm Implementation
Step 1 (training set construction). (a) Take 91 natural images as the HR image library; the LR image library consists of the LR images obtained by downsampling the HR images. To match the HR image dimensions, the LR images are scaled up to the size of the HR images via bicubic interpolation; the results are termed medium-resolution (MR) images.
(b) Take patches with five-pixel overlap from the HR images, and then calculate the Euclidean distances between each normalized patch and the two templates. Classify the patches into two classes according to these distances, and record the positions of the first-class and second-class patches.
(c) Take patches of the same size from the MR images at the same positions as in the HR images, and use the first- and second-order gradients of these patches as feature vectors. Build the first-class LR (LR1) training set and the second-class LR (LR2) training set by collecting the feature vectors of the corresponding class.
(d) Extract patches from the difference images (HR minus MR) and arrange each as a column feature vector, so as to build the first-class HR training set (HR1) and the second-class HR training set (HR2) by collecting the feature vectors of the corresponding class.
Step 2 (Direction and Edge dictionary training). For the first class, train on the LR1 training set with the K-SVD algorithm to obtain the first-class LR dictionary and the corresponding sparse coefficients. According to (5), obtain the first-class HR dictionary from these sparse coefficients and the HR1 training set. Similarly, obtain the second-class LR dictionary and HR dictionary.
Step 3 (image reconstruction). (a) Obtain the MR image by interpolating the input LR image up to the target size. Take patches with five-pixel overlap from the MR image and classify them into two clusters by the same method as above. Then obtain feature vectors by extracting the first- and second-order gradients of the patches. Finally, calculate the sparse coefficient of each column feature vector over the LR dictionary of the corresponding class.
(b) Calculate the high-frequency information of each patch from its sparse coefficient and the HR dictionary of the corresponding class. Add the high-frequency information to the corresponding MR image patch, and then remove the patch effect to obtain the final HR image.
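Three of the building blocks used in the steps above can be sketched in a few lines: taking overlapping patches, forming first- and second-order gradient features, and averaging overlapping patches back into an image to suppress the patch effect. The central-difference filters of `np.gradient`, the 6 by 6 patch size with a step of one pixel, and the function names are assumptions made for this illustration; the source does not specify the gradient filters or the exact aggregation rule.

```python
import numpy as np

def extract_patches(img, size=6, step=1):
    """Return overlapping size x size patches and their top-left positions."""
    rows, cols = img.shape
    positions = [(i, j)
                 for i in range(0, rows - size + 1, step)
                 for j in range(0, cols - size + 1, step)]
    patches = np.stack([img[i:i + size, j:j + size] for i, j in positions])
    return patches, positions

def gradient_features(patch):
    """Concatenate first- and second-order gradients (central differences) into one vector."""
    p = patch.astype(float)
    gy, gx = np.gradient(p)            # first-order gradients along rows and columns
    gyy = np.gradient(gy, axis=0)      # second-order gradient along rows
    gxx = np.gradient(gx, axis=1)      # second-order gradient along columns
    return np.concatenate([gx.ravel(), gy.ravel(), gxx.ravel(), gyy.ravel()])

def aggregate(patches, positions, shape):
    """Average overlapping patches back into an image of the given shape."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    size = patches.shape[1]
    for patch, (i, j) in zip(patches, positions):
        acc[i:i + size, j:j + size] += patch
        weight[i:i + size, j:j + size] += 1
    return acc / np.maximum(weight, 1)

# Round trip on a small synthetic image: averaging unchanged patches restores the image.
image = np.arange(100, dtype=float).reshape(10, 10)
patches, positions = extract_patches(image)
restored = aggregate(patches, positions, image.shape)
features = gradient_features(patches[0])
print(patches.shape, features.shape, np.allclose(restored, image))
```

In a full pipeline, each patch would be replaced by the MR patch plus the reconstructed high-frequency patch before aggregation; the round trip here only checks that the bookkeeping is consistent.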
The results of single-image SR are shown in Section 4.
3.2. Multiframe SR
The POCS method is widely used for multiframe SR and makes it easy to introduce prior knowledge. However, its reconstructed results usually show jagged edges when the upscaling factor is large. Our method based on Direction and Edge dictionaries can recover more high-frequency information and preserve smooth edges. Therefore, we combine the POCS method with our single-image SR method to realize multiframe SR. It includes three steps: multiframe registration, POCS reconstruction, and single-image SR based on Direction and Edge dictionaries, as shown in Figure 5.
In the multiframe registration stage, feature points are first extracted from the input images by the SURF algorithm [28] and matched. Mismatched points are then removed by the RANSAC algorithm [29]. Finally, the registered images are obtained from the parameters of the estimated affine transformation matrix.
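The outlier-rejection part of this stage can be sketched as a small RANSAC loop that fits an affine model to point correspondences. Feature extraction and matching are not shown: the matched coordinates are assumed to be given as two arrays, and the function names, iteration count, and inlier tolerance are assumptions made for the example.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A such that dst is approximately [src, 1] @ A.T."""
    X = np.hstack([src, np.ones((len(src), 1))])
    coeffs, *_ = np.linalg.lstsq(X, dst, rcond=None)   # shape (3, 2)
    return coeffs.T

def ransac_affine(src, dst, n_iter=200, tol=1.0, seed=0):
    """Estimate an affine model while rejecting mismatched point pairs."""
    rng = np.random.default_rng(seed)
    X = np.hstack([src, np.ones((len(src), 1))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Fit a candidate model to a minimal sample of three correspondences.
        idx = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        # Count the correspondences that the candidate explains within the tolerance.
        err = np.linalg.norm(X @ A.T - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best candidate.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic matches: a 1-degree rotation plus a small shift, with 8 gross mismatches.
rng = np.random.default_rng(42)
theta = np.deg2rad(1.0)
A_true = np.array([[np.cos(theta), -np.sin(theta), 1.5],
                   [np.sin(theta),  np.cos(theta), -0.8]])
src = rng.uniform(0, 100, size=(40, 2))
dst = np.hstack([src, np.ones((40, 1))]) @ A_true.T
dst[:8] += rng.uniform(20, 60, size=(8, 2))   # simulated mismatches

A_est, inliers = ransac_affine(src, dst)
print(np.allclose(A_est, A_true, atol=1e-6), int(inliers.sum()))
```

The estimated matrix can then be used to warp each floating frame onto the reference frame.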
3.2.2. Algorithm Implementation
Step 1 (multiframe image registration). (a) Obtain the LR image sequence via geometric distortion and downsampling of the HR image. Select the first frame as the reference frame and the other frames as floating frames. Use the SURF algorithm to extract feature points and the RANSAC algorithm to remove false matches.
(b) Compute the registered images on the basis of the affine transformation model estimated from the matching points.
Step 2. Use the POCS method to reconstruct the registered images with an upscaling factor $s_1$.
Step 3. Magnify the POCS result with our single-image method by a factor of $s_2$. The overall upscaling factor is $s_1 s_2$.
4. The Experimental Results and Evaluation
In this section, we report extensive experiments that verify the performance of our method. All the experiments are run in MATLAB 8.3.0.
4.1. Single-Image SR
The experimental setting in this paper refers to Yang et al. . The same 91 training images are adopted, and the dictionaries have 256 atoms, and patch size is with the overlap width equal to 5 between the adjacent patches. LR training and testing images are generated by resizing the ground truth image by bicubic interpolation. Since human visual system presents more sensitivity to the luminance changes, we only apply the SR method to the luminance component, while applying the simple bicubic interpolation to the chromatic components.
We compare the proposed single-image SR method based on Direction and Edge dictionaries with the bicubic interpolation method and several state-of-the-art SR methods, including Yang et al. [18], Zeyde et al. [19], NCSR [16], ANR [17], and CSC [15]. The source codes of the competing methods are downloaded from the authors' websites, and we use the parameters recommended by the authors.
Visual Quality. We perform experiments on 16 widely used test images with an upscaling factor of 2. In Figures 6, 7, and 8, we show the single-image SR results of the competing methods on the images Plant, Parrot, and Comic. For a clearer comparison, a local region is magnified four times and shown in the upper-left corner of each figure. As highlighted in the small window, the SR results of our method recover more high-frequency information and contain fewer artifacts.
PSNR and SSIM. The peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) values of the competing methods are shown in Tables 1 and 2. Our method achieves much better PSNR and SSIM values than bicubic interpolation and NCSR. The average values are only slightly inferior to those of Yang's method, Zeyde's method, and ANR. In terms of PSNR, our method is better than Yang's method on Raccoon, better than Zeyde's method on Hat, Lena, and Bike, and better than ANR on Hat, Parrot, and Raccoon. The PSNR of CSC, which is based on convolutional sparse coding, is higher than that of our method, but CSC has a long running time and needs at least 3 GB of memory. In short, the results verify not only the validity of our method but also its robustness to different kinds of input.
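For reference, PSNR can be computed as in the following minimal sketch, which assumes 8-bit images with a peak value of 255; the function name `psnr` and the toy inputs are illustrative. SSIM is more involved and is not shown.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(float) - estimate.astype(float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: an image that differs from the reference by one gray level everywhere.
reference = np.zeros((8, 8), dtype=np.uint8)
estimate = np.ones((8, 8), dtype=np.uint8)
print(round(psnr(reference, estimate), 2))
```

Consistent with the experimental setting described above, the metric would be evaluated on the luminance component only.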
4.2. Multiframe Image SR
The experiments aim to obtain an HR image (512 × 512) from 10 LR frames (128 × 128) with an overall upscaling factor of 4. To simulate the imaging process in a real scene, we obtain the 10 LR images from the original HR image via downsampling by a factor of 2, random jitter of about 1 to 2 pixels, and clockwise rotation of −1 to +1 degree.
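A degradation of this kind can be simulated along the lines of the sketch below, which warps the HR image with a small rotation and integer shift by nearest-neighbor inverse mapping and then decimates it. The function name, the nearest-neighbor sampling, the absence of blur, and the parameter ranges are assumptions made for the illustration and may differ from the authors' simulation.

```python
import numpy as np

def simulate_lr_frame(hr, factor, shift, angle_deg):
    """Rotate about the image centre, shift, and decimate (nearest-neighbour sampling)."""
    rows, cols = hr.shape
    t = np.deg2rad(angle_deg)
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    yy, xx = np.mgrid[0:rows, 0:cols]
    # Inverse mapping: for every output pixel, find the source pixel in the HR image.
    ys = np.cos(t) * (yy - cy) - np.sin(t) * (xx - cx) + cy - shift[0]
    xs = np.sin(t) * (yy - cy) + np.cos(t) * (xx - cx) + cx - shift[1]
    ys = np.clip(np.rint(ys), 0, rows - 1).astype(int)
    xs = np.clip(np.rint(xs), 0, cols - 1).astype(int)
    warped = hr[ys, xs]
    return warped[::factor, ::factor]

# Ten jittered and slightly rotated LR frames from one synthetic HR image.
rng = np.random.default_rng(0)
hr = rng.uniform(0, 255, size=(64, 64))
frames = [
    simulate_lr_frame(hr, 2, rng.integers(-2, 3, size=2), rng.uniform(-1.0, 1.0))
    for _ in range(10)
]
print(len(frames), frames[0].shape)
```

With zero shift and zero rotation the function reduces to plain decimation, which gives a simple sanity check.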
In this part, we perform SR experiments on multiframe images with an upscaling factor of 4. However, most state-of-the-art SR methods are designed for single-image SR with an upscaling factor of 2 or 3. We therefore compare our method with the bicubic interpolation method and POCS. For the bicubic interpolation method, we directly magnify the second frame by a factor of 4.
To verify the robustness of our method for different kinds of images, Table 3 lists the PSNR and SSIM values of multiframe SR. Compared with the other methods, our method achieves higher PSNR and SSIM values.
Figures 9–11 show the multiframe SR results of the competing methods on the images Lena, Monarch, and Pepper. For ease of presentation, only four input LR images are shown, and the reconstructed images are cropped because they are too large. As the figures show, the edges produced by our method are smoother and more natural, and the results have more details and fewer artifacts.
5. Conclusion
In this paper, we present a novel approach to image super-resolution based on sparse representation with Direction and Edge dictionaries. The key idea is to classify image patches by their direction and edge features and to code each patch with the more appropriate dictionary. According to the Euclidean distances between each image patch and two new templates, the image patches are divided into two clusters, which are then trained to obtain two pairs of Direction and Edge dictionaries. Single-image experimental results indicate the usefulness of the proposed Direction and Edge dictionaries. Furthermore, we combine POCS with our single-image SR method to realize multiframe SR, which is especially useful when the upscaling factor is large, and the experiments show equally satisfactory results. In short, our proposed method achieves not only competitive PSNR and SSIM values but also more pleasing visual quality of image edge structures and texture.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
- J. Li, Research on sparse representation based image super-resolution reconstruction method [D. E. thesis], Chongqing University, 2015.
- X. Zhu, B. Li, J. Tao, and B. Jiang, “Super-resolution image reconstruction via patch haar wavelet feature extraction combined with sparse coding,” in Proceedings of the 2015 IEEE International Conference on Information and Automation, ICIA 2015, pp. 770–775, August 2015.
- Z. Wei and K.-K. Ma, “Contrast-guided image interpolation,” IEEE Transactions on Image Processing, vol. 22, no. 11, pp. 4271–4285, 2013.
- S. S. Panda, M. S. R. S. Prasad, and G. Jena, “POCS based super-resolution image reconstruction using an adaptive regularization parameter,” International Journal of Computer Science Issues, vol. 8, no. 5, 2011.
- D. Glasner, S. Bagon, and M. Irani, “Super-resolution from a single image,” in Proceedings of the 12th International Conference on Computer Vision (ICCV '09), pp. 349–356, October 2009.
- F. Zhou, W. Yang, and Q. Liao, “Interpolation-based image super-resolution using multisurface fitting,” IEEE Transactions on Image Processing, vol. 21, no. 7, pp. 3312–3318, 2012.
- H. Stark and P. Oskoui, “High-resolution image recovery from image-plane arrays, using convex projections,” Journal of the Optical Society of America A: Optics and Image Science, vol. 6, no. 11, pp. 1715–1726, 1989.
- M. Irani and S. Peleg, “Improving resolution by image registration,” CVGIP: Graphical Models and Image Processing, vol. 53, no. 3, pp. 231–239, 1991.
- R. R. Schultz and R. L. Stevenson, “A Bayesian approach to image expansion for improved definition,” IEEE Transactions on Image Processing, vol. 3, no. 3, pp. 233–242, 1994.
- S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar, “Fast and robust multiframe super resolution,” IEEE Transactions on Image Processing, vol. 13, no. 10, pp. 1327–1344, 2004.
- Y. Zhang, J. Liu, W. Yang, and Z. Guo, “Image super-resolution based on structure-modulated sparse representation,” IEEE Transactions on Image Processing, vol. 24, no. 9, pp. 2797–2810, 2015.
- W. T. Freeman, T. R. Jones, and E. C. Pasztor, “Example-based super-resolution,” IEEE Computer Graphics and Applications, vol. 22, no. 2, pp. 56–65, 2002.
- H. Chang, D.-Y. Yeung, and Y. Xiong, “Super-resolution through neighbor embedding,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), pp. 275–282, IEEE, Washington, DC, USA, July 2004.
- J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution as sparse representation of raw image patches,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, June 2008.
- S. Gu, W. Zuo, Q. Xie, D. Meng, X. Feng, and L. Zhang, “Convolutional sparse coding for image super-resolution,” in Proceedings of the 15th IEEE International Conference on Computer Vision, ICCV 2015, pp. 1823–1831, December 2015.
- W. Dong, L. Zhang, G. Shi, and X. Li, “Nonlocally centralized sparse representation for image restoration,” IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1620–1630, 2013.
- R. Timofte, V. De Smet, and L. Van Gool, “Anchored neighborhood regression for fast example-based super-resolution,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV '13), pp. 1920–1927, December 2013.
- J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861–2873, 2010.
- R. Zeyde, M. Elad, and M. Protter, “On single image scale-up using sparse-representations,” in Curves and Surfaces 2010, J. D. Boissonnat et al., Ed., vol. 6920 of Lecture Notes in Computer Science, pp. 711–730, Springer, Berlin, Germany, 2012.
- F. Farhadifard, E. Abar, M. Nazzal, and H. Ozkaramanli, “Single image super resolution based on sparse representation via directionally structured dictionaries,” in Proceedings of the 2014 22nd Signal Processing and Communications Applications Conference, SIU 2014, pp. 1718–1721, April 2014.
- Q.-S. Lian and W. Zhang, “Image super-resolution algorithms based on sparse representation of classified image patches,” Acta Electronica Sinica, vol. 40, no. 5, pp. 920–925, 2012.
- D. L. Donoho, “For most large underdetermined systems of equations, the minimal l1-norm near-solution approximates the sparsest near-solution,” Communications on Pure and Applied Mathematics, vol. 59, no. 7, pp. 907–934, 2006.
- R. Rubinstein, A. M. Bruckstein, and M. Elad, “Dictionaries for sparse representation modeling,” Proceedings of the IEEE, vol. 98, no. 6, pp. 1045–1057, 2010.
- N. Ai, J. Peng, X. Zhu, and X. Feng, “Single image super-resolution by combining self-learning and example-based learning methods,” Multimedia Tools and Applications, vol. 75, no. 11, pp. 6647–6662, 2016.
- M. Elad and I. Yavneh, “A plurality of sparse representations is better than the sparsest one alone,” IEEE Transactions on Information Theory, vol. 55, no. 10, pp. 4701–4714, 2009.
- W. Dong, L. Zhang, G. Shi, and X. Wu, “Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization,” IEEE Transactions on Image Processing, vol. 20, no. 7, pp. 1838–1857, 2011.
- F. Farhadifard, Single image super resolution based on sparse representation via structurally directional dictionaries [M. S. thesis], Eastern Mediterranean University (EMU), 2013.
- H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, “Speeded-up robust features (SURF),” Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
- M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the Association for Computing Machinery, vol. 24, no. 6, pp. 381–395, 1981.
Copyright © 2017 Xuan Zhu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.