Abstract

Training an overcomplete dictionary pair is a critical step in mainstream superresolution methods. Because dictionary training has high time complexity and is susceptible to corruption in the example images, an improved method based on the lifting wavelet transform and robust principal component analysis is reported. The high-frequency components of the example images are estimated from the wavelet coefficients of a 3-tier lifting wavelet transform decomposition. Because sparse coefficients are similar across multiframe images, the inexact augmented Lagrange multiplier method is employed to perform robust principal component analysis while imposing global constraints. Experiments reveal that the new algorithm not only reduces the time complexity while preserving clarity but also improves robustness to corrupted example images.

1. Introduction

Superresolution (SR) reconstruction techniques produce high-resolution (HR) images from one or more low-resolution (LR) images, thereby increasing the high-frequency image detail and correcting degradation caused by LR sensors. SR technology is employed in numerous applications including medical imaging systems, face hallucination, remote sensing image processing, and military target reconnaissance and surveillance. There are three principal approaches to achieving SR. Interpolation-based and reconstruction-based methods are traditional; examples include Bicubic [1], IBP [2], POCS [3], and mixed ML/MAP/POCS methods [4]. In contrast, learning-based methods construct optimally weighted constraints inferred from a trained overcomplete dictionary pair. Learning-based methods are capable of extrapolating high-frequency image features that are not apparent in LR images. Instances of learning methods are found in the example-based method [5], in the support vector regression (SVR) approach [6], in neighbor embedding methods (NESR) [7], and in sparse representation superresolution (SRSR) methods [8, 9], which are among the most effective SR algorithms.

In SR algorithms based on sparse representation, an overcomplete dictionary pair, one dictionary for high-resolution patches and one for low-resolution patches, is trained to estimate the sparse coefficients. The accuracy of the trained dictionary directly affects the performance of the SR algorithm. However, SR algorithms are often affected by corruption in the acquired digital images, caused by lack of camera focus, faulty illumination, or missing data.

Principal component analysis (PCA) has been widely applied to traditional denoising SR algorithms by solving a constrained optimization problem. This method works well in practice as long as the noise power is small. However, it breaks down when corruption is severe, even if only very few of the observations are affected. For example, two PCA simulation results are shown in Figure 1. For comparison, 100 sampled 2-dimensional points are represented by circles, with some small noise in Figure 1(a) and with one gross error in Figure 1(b). At the bottom of Figure 1(a), the circles can be arranged in order of size; in Figure 1(b), however, every circle has the same size. The results verify that PCA fails to estimate the correct order when the data are corrupted by large errors. Furthermore, dictionary training has high time complexity.
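This failure mode is easy to reproduce numerically. The following NumPy sketch (illustrative data and names, not the actual setup of Figure 1) estimates the principal direction of 100 two-dimensional points by SVD, once under small Gaussian noise and once with a single gross error:

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 points lying near a 1-D subspace of R^2, spanned by (1, 1)/sqrt(2).
t = rng.uniform(-1.0, 1.0, size=100)
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
clean = np.outer(t, direction)                    # shape (100, 2), rank 1

small_noise = clean + 0.01 * rng.standard_normal(clean.shape)
one_error = clean.copy()
one_error[0] += np.array([50.0, -50.0])           # a single gross error

def principal_direction(X):
    """Leading principal direction of the centered data (classical PCA)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

# Alignment with the true direction: 1 = perfect, 0 = orthogonal.
align_noise = abs(principal_direction(small_noise) @ direction)
align_error = abs(principal_direction(one_error) @ direction)
print(align_noise, align_error)
```

Under small noise the estimated direction stays aligned with the truth, while the single large error pulls it toward the outlier, mirroring the breakdown illustrated in Figure 1(b).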

To remedy these shortcomings, this paper presents an improved dictionary-training algorithm based on the lifting wavelet transform (LWT) and robust principal component analysis (RPCA). High-frequency wavelet coefficients are estimated by a 3-tier LWT decomposition process, thereby reducing the two-dimensional image data storage by about 75%. For the low-frequency image components of the LWT decomposition, scale coefficients are determined through RPCA instead of PCA. RPCA preserves image detail and edge information while simultaneously recovering the broken data. The next section explains the dictionary-training strategy of the SR algorithm.

2. Training Dictionary Based on Sparse Representation

2.1. Sparse Representation Overview

Suppose that a signal $x \in \mathbb{R}^n$ can be represented as a linear combination of elements of an overcomplete dictionary $D \in \mathbb{R}^{n \times K}$, where $K$ denotes the number of atoms in the dictionary. Then an observation of the signal $x$, say $y$, can be expressed as follows:

$$y = Lx = LD\alpha, \tag{1}$$

in which $\alpha$ is a sparse vector and $L$ is the matrix used to downsample $x$. A similar model can be applied to digital images. Let $X$ denote an original HR image and $Y$ denote a degraded LR version of $X$.

Now suppose that $X$ is partitioned into submatrices, called patches. Each patch in $X$ is denoted by a lowercase $x$ with a single integer index, say $x_i$. Similarly, $Y$ is partitioned into LR patches $y_i$. For the purposes of this development, the relation between the index $i$ on patch $x_i$ and its position in $X$ is arbitrary, and similarly for the index of $y_i$ and its position in $Y$. However, there must be a one-to-one correspondence between patches $x_i$ and $y_i$ for a given $i$. The image degradation between $x_i$ and $y_i$ is modeled by

$$y_i = LHx_i, \tag{2}$$

in which $H$ represents the blurring filter. The model is shown in Figure 2.

The process of image degradation (per patch) is viewed as a projection from a high-dimensional space to a low-dimensional space. According to the theory of manifolds, the local characteristics of the image patches are essentially unchanged by projection [10]. Given sufficient sparsity of $\alpha_i$, HR image patches can be perfectly recovered from the sparse representation of LR image patches with high probability. Similar to (1), the LR and HR image patches are represented as linear combinations of dictionary elements as follows:

$$\min_{\alpha_i} \|\alpha_i\|_0 \quad \text{s.t.} \quad x_i = D_h\alpha_i, \; y_i = D_l\alpha_i, \tag{3}$$

where $\|\cdot\|_0$ indicates the $\ell_0$-norm. $\alpha_i$ denotes the sparse basis vector which is used to represent patch $i$ in both the HR and LR images. $D_h$ and $D_l$ are the overcomplete joint dictionary pair corresponding to the HR and LR patches, respectively.

Since (3) is underdetermined, we use a Lagrange multiplier to solve the ill-posed problem [8] as follows:

$$\min_{\alpha} \lambda\|\alpha\|_1 + \frac{1}{2}\|FD_l\alpha - Fy\|_2^2, \tag{4}$$

in which $\lambda$ balances sparsity of the solution and fidelity of the approximation to $y$. Consider

$$\min_{\alpha} \lambda\|\alpha\|_1 + \frac{1}{2}\left\|\begin{bmatrix} FD_l \\ \beta PD_h \end{bmatrix}\alpha - \begin{bmatrix} Fy \\ \beta w \end{bmatrix}\right\|_2^2. \tag{5}$$

Here $F$ indicates the feature extraction operator which retains the high-frequency details in the image, $P$ extracts the region of overlap between the current target patch and the previously reconstructed HR image, and $w$ contains the values of the previously reconstructed HR image on the overlap. Finally, HR image patches are reconstructed using the optimal $\alpha^{*}$ determined in the following equation:

$$x = D_h\alpha^{*}. \tag{6}$$
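The paper does not state which solver is used for the $\ell_1$-regularized problem (4); iterative soft-thresholding (ISTA) is one standard choice for this form. A minimal NumPy sketch with an illustrative random dictionary (all names and parameters are assumptions):

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft-shrinkage operator."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(D, y, lam, n_iter=1000):
    """Minimize 0.5*||D a - y||^2 + lam*||a||_1 by iterative
    soft-thresholding with a fixed step 1/L."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a - D.T @ (D @ a - y) / L, lam / L)
    return a

# Toy check: recover a 2-sparse code against a random unit-norm dictionary.
rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(50)
a_true[3], a_true[17] = 1.5, -2.0
y = D @ a_true
a_hat = ista(D, y, lam=0.01)
print(a_hat[3], a_hat[17])
```

With a small weight `lam`, the recovered coefficients closely match the true 2-sparse code, illustrating why the relaxed problem (4) is a practical surrogate for (3).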

2.2. Training Overcomplete Dictionary Pair

The joint dictionary pair $D_h$ and $D_l$ is trained using known example image patches. Given $X_h = \{x_1, x_2, \ldots, x_N\}$ as HR image patches and $Y_l = \{y_1, y_2, \ldots, y_N\}$ as LR image patches, the optimization problem becomes the following:

$$\min_{D_h, D_l, \alpha} \frac{1}{N}\|X_h - D_h\alpha\|_2^2 + \frac{1}{N}\|Y_l - D_l\alpha\|_2^2 + \lambda\|\alpha\|_1. \tag{7}$$

The trained dictionary pair $D_h$ and $D_l$ is then used to solve for the optimal sparse vector $\alpha$ in (4).
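To make the joint-training idea concrete, the sketch below stacks HR and LR patches so that both dictionaries share one sparse code per patch, then alternates between an ISTA coding step and a least-squares dictionary update. This is a simplified stand-in for the training of (7), not the authors' implementation; the function names, parameters, and toy data are assumptions:

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def train_joint_dictionary(Xh, Xl, n_atoms=32, lam=0.1, n_outer=15, n_inner=40):
    """Alternate between ISTA sparse coding (shared codes A for the
    stacked HR/LR patches) and a least-squares dictionary update."""
    X = np.vstack([Xh, Xl])                        # joint feature space
    rng = np.random.default_rng(0)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_outer):
        L = np.linalg.norm(D, 2) ** 2
        for _ in range(n_inner):                   # sparse coding step
            A = soft_threshold(A - D.T @ (D @ A - X) / L, lam / L)
        D = X @ np.linalg.pinv(A)                  # dictionary update
        norms = np.maximum(np.linalg.norm(D, axis=0), 1e-12)
        D /= norms                                 # unit-norm atoms ...
        A *= norms[:, None]                        # ... with codes rescaled
    kh = Xh.shape[0]
    return D[:kh], D[kh:], A

rng = np.random.default_rng(7)
Xh = rng.standard_normal((9, 200))                 # toy HR patch features
Xl = rng.standard_normal((4, 200))                 # toy LR patch features
Dh, Dl, A = train_joint_dictionary(Xh, Xl)
print(Dh.shape, Dl.shape)
```

Because the two dictionaries are fit against a single code matrix, an LR patch coded against `Dl` can be reconstructed in HR by multiplying its code with `Dh`, which is exactly the mechanism (4) and (6) rely on.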

2.3. LWT Strategies for Example Images

To reduce the time complexity of the SR algorithm, the LWT is applied to the example images, that is, to the patch sets $X_h$ and $Y_l$ in (7), before training. By adopting separate training strategies for the high- and low-frequency components of the example images, the LWT can differentially preserve the critical features carried by these separate bandwidths.

The example images are converted by the LWT into low-frequency coefficients and horizontal, vertical, and diagonal high-frequency coefficients. For the three high-frequency components, the wavelet coefficients of one layer can be accurately estimated from the next layer's wavelet coefficients because of their strong correlation. Therefore, in the 3-tier LWT process, the first layer's high-frequency wavelet coefficients can be estimated from the second and third layers' high-frequency wavelet coefficients. The number of example image pixels involved in dictionary training is thus only 25% of the original, which greatly reduces the iteration time while preserving edge information.
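The 25% figure can be verified with a minimal lifting implementation. The sketch below applies the Haar lifting steps (split, predict, update) for three tiers on a 64 × 64 image; the paper does not specify which lifting wavelet is used, so Haar is an assumption:

```python
import numpy as np

def lift_axis(a, axis):
    """One Haar lifting step along an axis: split, predict, update."""
    even = np.take(a, np.arange(0, a.shape[axis], 2), axis=axis)
    odd = np.take(a, np.arange(1, a.shape[axis], 2), axis=axis)
    d = odd - even          # predict: detail (high-frequency) coefficients
    s = even + d / 2.0      # update: approximation (low-frequency) coefficients
    return s, d

def haar_lwt2(img):
    """One 2-D LWT level: lift along rows, then along columns."""
    s, d = lift_axis(img, axis=1)
    ll, lh = lift_axis(s, axis=0)     # low-frequency band and one detail band
    hl, hh = lift_axis(d, axis=0)     # remaining two detail bands
    return ll, (lh, hl, hh)

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
ll, details = img, []
for _ in range(3):                    # 3-tier decomposition
    ll, highs = haar_lwt2(ll)
    details.append(highs)

# Discarding the first tier's three detail subbands (to be estimated from
# tiers 2 and 3) keeps exactly 25% of the original coefficients.
kept = ll.size + sum(b.size for tier in details[1:] for b in tier)
print(kept / img.size)
```

Each tier halves both dimensions, so the tier-3 low-frequency band plus the tier-2 and tier-3 detail bands hold exactly 1024 of the 4096 coefficients.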

For the low-frequency components, RPCA, discussed in the next section, is employed to recover the corrupted data.

3. Matrix Completion and RPCA

3.1. Overview

The matrix completion problem has been the subject of intense research in recent years. In 2009, Candès and Recht [11] demonstrated exact matrix completion using convex optimization. As the rank of a matrix is not a convex function, the nuclear norm of the matrix can be used to approximate its rank, yielding a convex minimization problem for which there are numerous efficient solutions [11].

In 2010, Lin et al. [12] published a fast, scalable algorithm for solving the robust PCA problem. The method is based on recovering a low-rank matrix with an unknown fraction of its entries corrupted. The setting is as follows: given a rank-$r$ matrix $A$, where $r$ is the target dimension of the subspace, the observation matrix is $M$. We can use $M = P_\Omega(A + E)$ to model these sets of linear measurements, where $P_\Omega$ is a subsampling projection operator and $E$ represents the matrix of perturbations, including complex noise and large errors; $E$ should be relatively sparse compared to $A$.

3.2. Matrix Completion

The objective of matrix completion is to recover the low-dimensional subspace, that is, the truly low-rank matrix $A$, from $M$ under the assumption that $E$ is zero:

$$\min_{A} \|A\|_* \quad \text{s.t.} \quad P_\Omega(A) = M,$$

in which $\|\cdot\|_*$ denotes the nuclear norm of a matrix. It has been shown that the solution to this convex relaxation can exactly recover the true low-rank matrix under quite general conditions [11]. Furthermore, the recovery is stable to small bounded noise [12], that is, when the entries of $E$ are nonzero but small and bounded.
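The computational building block shared by these nuclear-norm programs is singular value thresholding, the proximal operator of the nuclear norm. A short NumPy sketch (the names and toy data are illustrative):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink each singular value by tau,
    the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# A rank-2 matrix plus small dense noise: thresholding zeroes out the
# noise-level singular values while keeping the dominant structure.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
noisy = A + 0.01 * rng.standard_normal(A.shape)
recovered = svt(noisy, tau=0.5)
print(np.linalg.matrix_rank(recovered, tol=1e-6))
```

Since the noise singular values lie well below the threshold while the two dominant ones lie well above it, the output is exactly rank 2.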

3.3. Robust Principal Component Analysis

The conventional PCA method is used to efficiently estimate a low-dimensional subspace from high-dimensional data. It can be written as follows:

$$\min_{A} \|M - A\|_F \quad \text{s.t.} \quad \operatorname{rank}(A) \le r,$$

where $\|\cdot\|_F$ is the Frobenius norm. That is, the input matrix $A$ is corrupted by i.i.d. Gaussian noise to generate the observation matrix $M$. To apply PCA, the singular value decomposition (SVD) of $M$ is computed to project the columns of $M$ onto the subspace spanned by the $r$ principal left singular vectors of $M$.

Robust principal component analysis (RPCA) differs from matrix completion and from PCA in that the projection operator is the identity and the error matrix $E$ is sparse. Wright [8, 13] has shown that a low-rank matrix $A$ can be recovered exactly from the observation matrix $M$ by solving the following convex optimization problem, as long as the error matrix $E$ is sufficiently sparse. Consider

$$\min_{A,E} \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.} \quad M = A + E, \tag{8}$$

where $\lambda$ is a positive weighting parameter. RPCA has been used for background modeling, removal of shadows from face images, alignment of human faces, and image denoising.

3.4. Training Dictionary Based on RPCA

Because the low-frequency part of an image carries most of the energy, the LWT is applied to the example images before dictionary training. RPCA is then used to improve the effectiveness and robustness of the SR algorithms.

There is little difference among the coefficient values of the low-frequency subimages after an LWT of different images of the same scene. RPCA coefficients are used to represent the low frequencies in an attempt to preserve fidelity and coherency between the subbands. Algorithms have been developed to solve the RPCA problem, recovering the low-rank matrix $A$ and the sparse matrix $E$ from the observation matrix $M$. In this paper, we employ the IALM method to compute the low-frequency subband coefficients of the dictionary images.

Suppose $\{I_1, I_2, \ldots, I_n\}$ denotes a corrupted multisensor image set. Corresponding to the $n$ ready-to-be-fused images, low-frequency subimages are computed using the LWT. Let $L_1, L_2, \ldots, L_n$ be the low-frequency subimages, where $L_k(i, j)$ denotes the gray value of the pixel whose coordinate is $(i, j)$. Representing each $L_k$ as a vector $v_k$ by concatenating all columns of the subimage, we define the observation matrix $M = [v_1, v_2, \ldots, v_n]$. The data are standardized.

Then we can express the subimages in an equation similar to (8):

$$M = A + E, \tag{9}$$

where $A$ denotes the clean and integrated low-frequency subimage sequence matrix and $E$ denotes the sparse matrix of errors and noise, even though the low-pass filter will attenuate most of the noise power. The coefficient values of the low-frequency subimages of the same scene are similar after an LWT. If the recovered matrix $A$ were free of noise and the corrupted elements were fixed successfully, all column vectors in $A$ would have similar underlying image structures, and the rank of $A$ would be low, which is the prerequisite for matrix completion and RPCA. In such an ideal case, a good estimate of $A$ may be found by matrix completion and RPCA as indicated by the following equation:

$$\min_{A,E} \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.} \quad M = A + E, \tag{10}$$

where the augmented Lagrange multiplier function is

$$L(A, E, Y, \mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, M - A - E\rangle + \frac{\mu}{2}\|M - A - E\|_F^2. \tag{11}$$

In this equation, $\lambda$ is a positive weighting parameter representing the ratio of the sparse matrix $E$ to the low-rank matrix $A$, and $\mu$ is a positive penalty parameter. $\langle Y, M - A - E\rangle$ is the trace of $Y^{T}(M - A - E)$, and $Y$ is the iterated Lagrange multiplier.

The augmented Lagrange multiplier method has excellent convergence and solution accuracy. In the proposed method of this paper, RPCA is coupled with the inexact augmented Lagrange multiplier (IALM) method to determine the low-frequency coefficients of the LWT for corrupted example images. First, we define notation for the variables involved, which is listed in the Notations section; $k$ denotes the iteration index.

Then the results $A_{k+1}$ and $E_{k+1}$ are computed by the following equations:

$$(U, \Sigma, V) = \operatorname{svd}(M - E_k + \mu_k^{-1}Y_k), \quad A_{k+1} = U S_{1/\mu_k}(\Sigma) V^{T},$$
$$E_{k+1} = S_{\lambda/\mu_k}(M - A_{k+1} + \mu_k^{-1}Y_k), \quad Y_{k+1} = Y_k + \mu_k(M - A_{k+1} - E_{k+1}). \tag{12}$$
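A compact NumPy sketch of the IALM iteration for RPCA, in the style of Lin et al. [12], is given below; the initialization constants follow common public implementations and are assumptions rather than the paper's exact settings:

```python
import numpy as np

def soft_threshold(x, tau):
    """The soft-shrinkage operator S_tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ialm_rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for RPCA: alternate the nuclear-norm and l1
    proximal steps on the augmented Lagrangian, then update Y."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))     # common default weighting
    norm_two = np.linalg.norm(M, 2)
    Y = M / max(norm_two, np.abs(M).max() / lam)   # dual initialization
    mu = 1.25 / norm_two                   # initial penalty
    rho = 1.5                              # penalty growth factor
    E = np.zeros_like(M)
    norm_M = np.linalg.norm(M, 'fro')
    for _ in range(max_iter):
        # A-step: singular value thresholding at level 1/mu
        U, s, Vt = np.linalg.svd(M - E + Y / mu, full_matrices=False)
        A = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-step: elementwise soft-shrinkage at level lam/mu
        E = soft_threshold(M - A + Y / mu, lam / mu)
        R = M - A - E                      # primal residual
        Y = Y + mu * R                     # dual ascent
        mu = rho * mu
        if np.linalg.norm(R, 'fro') / norm_M < tol:
            break
    return A, E

# Toy check: a rank-3 matrix with about 5% gross sparse corruption.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 50))
S0 = (rng.random((50, 50)) < 0.05) * rng.uniform(-10.0, 10.0, (50, 50))
A, E = ialm_rpca(L0 + S0)
print(np.linalg.norm(A - L0) / np.linalg.norm(L0))
```

In this regime (low rank, sparse gross errors) the low-rank part should be recovered to high accuracy, which is the behavior the dictionary-training step relies on.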

In summary, IALM is used to determine the low-frequency components to be fused, and self-adapting regional variance is employed to estimate the high-frequency contribution. The fused wavelet coefficients are combined by the inverse LWT (ILWT) to create the final result.

4. Experimental Results and Analysis

4.1. Time Complexity Experiments

To assess the accuracy and robustness of training the overcomplete dictionary pair, 100000 standard LR and HR image patches were partitioned and trained from image libraries (http://decsai.ugr.es/cvg/dbimagenes/). To validate the new procedure, four settings of the number of atoms $K$ were tested; the size of each atom (in pixels) was held fixed. Different from the SRSR algorithm, the proposed method applies the 3-tier LWT to the example images. By adopting the inverse lifting wavelet transform (ILWT), the reconstructed HR image patches preserve the critical high-frequency features. The number of example image pixels involved in training the dictionary is reduced by 75%. Experimental results are shown in Figure 3.

The dictionary-training times of the proposed method are 61.32%, 62.00%, 60.40%, and 62.67% shorter than those of SRSR as the number of atoms $K$ increases. A larger number of atoms yields a more accurate overcomplete dictionary pair but requires more training time. In this study, balancing training accuracy against time complexity, the value of $K$ is set to 1024.

4.2. Robustness Experiments

As mentioned in Section 1, PCA fails to estimate the original data when the data are severely corrupted. To solve this problem, sparse coefficients are determined through RPCA instead of PCA.

To verify the reported method's robustness to missing data and errors, 10% random errors and 10% missing data were applied to the HR image patches used for training the dictionary; the overall proportion of corrupted pixels in the original images thus reaches 20%. An overcomplete dictionary pair $D_h$ and $D_l$ was estimated to perform SR. Test images called "bookshelf," "girl face," "lena," "flower," and "building" were processed by four SR algorithms: Bicubic, NESR, SRSR, and the proposed method.

The peak signal-to-noise ratio (PSNR) expresses the ratio between the maximum possible power of a signal and the power of the distorting noise that affects the quality of its representation. This objective metric is used to compare the robustness of the algorithms by measuring the proximity of the SR reconstructed image to the original image.
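For reference, PSNR is straightforward to compute. A small sketch for 8-bit images (the function name is illustrative):

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak = 255 for 8-bit images."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float('inf')                # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform error of 1 gray level gives MSE = 1,
# so PSNR = 10*log10(255^2) ≈ 48.13 dB.
ref = np.full((16, 16), 100, dtype=np.uint8)
rec = ref + 1
print(round(psnr(ref, rec), 2))
```

Higher values indicate a reconstruction closer to the original; differences of a few dB between the algorithms in Table 1 are therefore meaningful.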

Table 1 shows the PSNR (dB) performance evaluation measures for the four SR algorithms. The results reveal that Bicubic has the worst reconstruction performance. NESR and SRSR are severely affected by broken data in dictionary training, with PSNR ranging from 21.3871 dB to 31.4007 dB. Relative to the other three algorithms, the proposed method is the most robust to broken data in the dictionary-training process.

The results of the four comparative SR algorithms are shown in Figures 4, 5, 6, 7, and 8. The required HR resolution is four times that of the LR images.

5. Summary

To address the computational complexity of training the overcomplete dictionary pair, this paper proposes a 3-tier LWT to decompose the training example images into low-frequency and horizontal, vertical, and diagonal high-frequency components. The first layer's high-frequency components of the example images can be accurately estimated from the second and third layers' wavelet coefficients.

PCA fails to estimate the gray values of broken pixels in the example images when corruption is severe, even if only very few of the observations are affected. Exploiting the similarity of multiframe observation images, this research uses RPCA coupled with the IALM method instead of PCA to solve the robustness problem.

Experimental results show that the new algorithm not only shortens the training time of the overcomplete dictionary pair by about 60% but also provides significantly improved clarity in the reconstructed SR images.

Notations

$M$: Observation matrix based on low-frequency subimages
$E_k$: Computed $k$th iterated error (sparse) matrix
$A_k$: Computed $k$th iterated low-rank matrix
$Y_k$: Computed $k$th iterated Lagrange multiplier
$\varepsilon$: Tolerated value of the normalized mean squared error
$\operatorname{svd}(\cdot)$: Singular value decomposition of a matrix
$U, V$: Orthogonal matrices of the singular value decomposition
$\Sigma$: Diagonal matrix of the singular value decomposition
$S_\tau(\cdot)$: The soft-shrinkage operator.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research is supported in part by the National Natural Science Foundation of China (Grant no. 30970780) and by the General Program of Science and Technology Development Project of Beijing Municipal Education Commission of China (Grant no. KM201110005033). John R. Deller’s effort was supported in part by the U.S. National Science Foundation under Cooperative Agreement DBI-0939454. Any opinions, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect the views of the NSF. This work was undertaken in part while Zhuozheng Wang was a Visiting Research Scholar at Michigan State University. The authors thank the Beijing University of Technology’s Multimedia Information Processing Lab for assistance.