Abstract

We consider sparse signal inversion with impulsive noise. There are three major ingredients. The first is the regularizing properties; we discuss convergence rates of the regularized solutions. The second is devoted to numerical solutions. This is challenging because both the ℓ¹ fidelity term and the ℓ¹ regularization term lack differentiability; moreover, for ill-conditioned problems, sparsity regularization is often unstable. We propose a novel dual spectral projected gradient (DSPG) method, which combines the dual problem of multiparameter regularization with the spectral projected gradient method to solve the nonsmooth optimization functional. We show that one can overcome the nondifferentiability and instability by adding a smooth ℓ² regularization term to the original optimization functional. The advantage of the proposed functional is that its convex dual reduces to a constrained smooth functional; moreover, it is stable even for ill-conditioned problems. The spectral projected gradient algorithm is used to compute the minimizers, and we prove its convergence. The third ingredient is numerical simulation. Experiments on compressed sensing and image inpainting demonstrate the efficiency of the proposed approach.

1. Introduction

In the present manuscript we are concerned with ill-posed linear operator equations of the form (1), in which the solution is sparse with respect to an orthonormal basis and the forward map is a bounded linear operator. In practice, the exact data are not known precisely; only an approximation within a known noise level is available. We call this approximation the noisy data and the corresponding bound the noise level. It is well known that a conventional method for solving (1) is sparsity regularization, which extracts the essential features of sparse solutions efficiently, in contrast to the oversmoothing of classical Tikhonov regularization.

In the past ten years, sparsity regularization has become an important concept in inverse problems. The theory of sparse recovery has largely been driven by the needs of applications in compressed sensing [1, 2], bioluminescence tomography [3], seismic tomography [4], parameter identification [5], and so forth. For accounts of the regularizing properties and computational techniques in sparsity regularization we refer the reader to [5–7] and the references given there. In general, sparsity regularization is given by (3), where the regularization parameter balances the fidelity term and the regularization term. The functional in (3) is not convex when the sparsity exponent p is smaller than 1, which makes it challenging to investigate the regularizing properties and to compute minimizers numerically. Limited work has been done for p < 1; we refer the reader to [8–11] for a recent account of the theory. In this paper, we focus our attention on the case p = 1.
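For orientation, the following is a sketch of the sparsity regularization functional described above, in notation of our own choosing (the symbols A, u, f^δ, α, w_i, φ_i, and p are ours and may differ from the paper's display (3)): A is the forward operator, f^δ the noisy data, {φ_i} the orthonormal basis, w_i > 0 weights, and p the sparsity exponent.

```latex
\min_{u}\; \frac{1}{2}\,\bigl\|Au - f^{\delta}\bigr\|_{2}^{2}
      \;+\; \alpha \sum_{i\in\Lambda} w_i\,\bigl|\langle u,\phi_i\rangle\bigr|^{p},
\qquad \alpha > 0,\; 0 < p \le 2 .
```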

The aim of this paper is to consider a regularization functional of the form (4), which combines an ℓ¹ fidelity term with an ℓ¹ sparsity penalty; we call (4) the ℓ¹-ℓ¹ problem. A main motivation for investigating the ℓ¹-ℓ¹ problem is that noisy data often contain impulsive noise. For Gaussian noise, the ℓ² fidelity is a natural choice. However, a typical nondifferentiable fidelity used in applications involving impulsive noise is the ℓ¹ fidelity, which is more robust than the ℓ² fidelity [12].
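In the same (assumed) notation as above, a plausible form of the ℓ¹-ℓ¹ functional (4) replaces the quadratic misfit by an ℓ¹ misfit and fixes p = 1:

```latex
\min_{u}\; \bigl\|Au - f^{\delta}\bigr\|_{1}
      \;+\; \alpha \sum_{i\in\Lambda} w_i\,\bigl|\langle u,\phi_i\rangle\bigr| .
```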

The ℓ¹ fidelity technique was motivated by [13] for signal processing and later attracted considerable attention in image processing, especially in image denoising. In image processing, the ℓ¹ fidelity is typically combined with a TV regularization term; for details of such ℓ¹-TV problems, we refer the reader to [14–17]. Nowadays the ℓ¹ fidelity has received growing interest in inverse problems whose solutions are sparse with respect to an orthonormal basis, and minimizers of cost functionals combining the ℓ¹ fidelity with sparsity regularization have been studied; we refer the reader to [18–23] and the references given there.

For the ℓ¹ fidelity, the regularizing properties must be handled specifically. Burger and Osher [24] proved a convergence rate when the regularization parameter is fixed but sufficiently small; the authors call this phenomenon exact penalization. A curious situation arises in [25], where the optimal convergence rate requires the regularization parameter to equal a fixed value. In [26], Grasmair et al. proved that the source condition together with the finite basis injectivity (FBI) condition is weaker than the restricted isometry property and obtained a linear convergence rate. Flemming and Hegland discussed convergence rates in ℓ¹-regularization when the basis is not smooth enough [27]. König et al. obtained high-order polynomial rates of convergence in a special corrupted-domain setting, even though the underlying problem is exponentially ill-posed in the classical sense [28].

Though the ℓ¹ fidelity is robust, many researchers prefer the ℓ² fidelity because of its differentiability. Hence a key issue for the ℓ¹ fidelity is the numerical computation. In the past few years, numerous algorithms have been systematically proposed for ℓ²-fidelity problems. On the other hand, despite growing interest in the ℓ¹ fidelity, limited work has been done on numerical methods for ℓ¹-ℓ¹ problems. For sparsity regularization, the popular algorithms, for example, the homotopy (LARS) method [29], the iteratively reweighted least squares (IRLS) method [30], and iterative thresholding algorithms [31, 32], cannot be applied directly to the ℓ¹-ℓ¹ problem because both the fidelity and the regularization term lack differentiability. Only a few papers have discussed numerical algorithms for ℓ¹-ℓ¹ problems systematically. References [20, 22] proposed new models based on statistical methods. Reference [21] discussed a prior sparse representation together with the ℓ¹ data-fidelity term and proposed a two-phase approach. In [23], the authors propose a robust bisparsity model (RBSM) to exploit prior knowledge about the similarities and distinctions of signals effectively. However, these works are mostly devoted to compressive sensing problems, where random matrices are well-conditioned; for ill-conditioned problems, such methods are often unstable [33, Chap. 5]. Moreover, they assume that the solution itself is sparse, which differs from the general assumption that the solution is sparse with respect to an orthonormal basis. In [19], Yang and Zhang proposed a primal-dual interior point method (PD-IPM) for the EIT problem, which deals efficiently with the nondifferentiability; however, they did not give a convergence proof. Yang and Zhang also reformulated the problem as a basis pursuit model, which can be solved effectively by the alternating direction method (ADM) [18]; this is a competitive method compared with other algorithms for compressive sensing. In [34], Xiao et al. applied the ADM method to the ℓ¹-ℓ¹ problem directly; numerical results illustrated that their algorithm performs better than YALL1 [18].

In this paper, we investigate the regularizing properties of ℓ¹-ℓ¹ problems. Convergence rates can be shown for both a priori and a posteriori parameter choices under the additional source condition and FBI condition. As mentioned above, duality is a conventional technique for Tikhonov regularization with a nondifferentiable fidelity; however, this approach has limitations for ℓ¹-ℓ¹ problems because it is difficult to obtain their dual formulation directly. Inspired by [14] and multiparameter regularization theory [35–37], a smooth ℓ² term is added to the original ℓ¹-ℓ¹ functional. The dual problem of this new cost function reduces to a constrained smooth functional; moreover, the smooth regularization term improves stability. A conventional method for the constrained smooth functional is a projected gradient algorithm; we use the spectral projected gradient (SPG) method to compute the minimizers of the constrained functional and prove convergence.

An outline of this paper is as follows. Section 2 discusses regularizing properties, including well-posedness and convergence rates for both a priori and a posteriori parameter choices. In Section 3, inspired by the theory of multiparameter regularization, we construct a new functional whose dual, by Fenchel duality, is a constrained smooth functional. In Section 4, the spectral projected gradient method is applied to compute the minimizers. Section 5 provides a detailed exposition of the multiparameter choice rule based on the balancing principle. Numerical experiments involving compressed sensing and image inpainting are presented in Section 6, showing that the proposed approach is robust and efficient.

2. Regularization Properties

2.1. Notation and Assumptions

For approximate solutions of (1), we consider minimization of the regularization functional (4). Throughout this paper, the preimage space is a Hilbert space with the usual inner product, the image space is a Banach space, and the forward map is a bounded linear operator between them. The solution is expanded with respect to an orthonormal basis indexed by a countable index set, and we use the corresponding coefficient notation in what follows.

In order to prove convergence rate results, we denote the minimizer of the regularization functional for given data and regularization parameter, and we use the following definition of a minimum norm solution.

Definition 1. An element is called a minimum norm solution of the linear problem (1) if it solves (1) with the exact data and minimizes the penalty term among all such solutions.

We define sparsity as follows.

Definition 2. An element is sparse with respect to the basis if only finitely many of its coefficients are nonzero. If the number of nonzero coefficients is at most s for some natural number s, the element is called s-sparse.

For a subset of indices, we denote the corresponding restriction of the coefficient sequence, and we write the complement of the index set in the usual way. In addition, we split an element into its parts supported on the index set and on its complement.

A source condition or a variational inequality is needed for the analysis of convergence rates. With these conditions one can obtain the classical convergence rates up to the usual saturation rate. For sparsity regularization, however, in order to obtain faster rates, and even linear convergence in the noise level, one needs the following assumptions.

Assumption 3. Assume that the following hold: (i) Source condition: there exists a subgradient of the penalty at the minimum norm solution that lies in the range of the adjoint of the forward operator. (ii) Finite basis injectivity (FBI) condition: for every finite set of indices, the restriction of the forward operator to the span of the corresponding basis elements is injective.

The FBI condition was originally introduced in [8]. In [26], Grasmair et al. used a slightly weaker condition based on the FBI condition and showed that Assumption 3 is in some sense the weakest possible condition that guarantees the linear convergence rate. In fact, if the minimum norm solution is the unique ℓ¹-minimizing solution, Assumption 3 is satisfied [26]. For exponents p < 1, being the unique ℓ^p-minimizing solution cannot guarantee that Assumption 3 is satisfied. Even if Assumption 3 is satisfied for p < 1, it is not known whether linear convergence can be obtained, because the traditional proof of linear convergence requires convexity of the penalty.

2.2. Well-Posedness and Convergence Rate

In this subsection, well-posedness and convergence rates of the ℓ¹-ℓ¹ regularization method are given.

Proposition 4 (well-posedness). For every positive regularization parameter and all admissible data, minimization of the functional (4) is well-defined, stable, and convergent.

Proof. In our setting, the forward map is a bounded linear operator, the preimage space is a Hilbert space, and the penalty is sequentially lower semicontinuous. It is easy to verify Assumption 3.13 of [38]; the proof then follows along the lines of the proofs of Theorems 3.22 (existence), 3.23 (stability), and 3.26 (convergence) in [38].

We now turn to convergence rates. For the ℓ²-fidelity problem with ℓ¹ penalty, Lorenz proved a convergence rate under the source condition and the FBI condition [7]. A linear convergence rate was extended by Grasmair et al. to nonlinear equations under two additional nonlinearity conditions [8]. In [26], Grasmair et al. proved that the source condition together with the FBI condition is weaker than the restricted isometry property and obtained a linear convergence rate. Flemming discussed convergence rates in ℓ¹-regularization when the basis is not smooth enough [27].

Remark 5. We note that the Bregman distance cannot be used as an error measure in this section because the ℓ¹ penalty fails to be strictly convex; we refer the reader to [8, 39] for details. In this section we use the norm as the error measure.

The next results show a convergence rate for the a priori parameter choice rule.

Lemma 6. Assume that Assumption 3 holds. Then there exists a constant such that for all .

Proof. Since the source condition holds, we see that, together with the definition of sparsity, (10) implies that the relevant index set is finite. From Assumption 3, the restriction of the operator to every such finite set is injective, so the stated bound holds with some constant. By the definitions involved, the corresponding estimates hold for indices inside and outside this set. From (11), it is easy to show the required inequality; then we conclude the assertion of the lemma, and the proof is completed.

Lemma 7. Assume there exist such thatfor all . If , then

Proof. Since the regularized solution minimizes the functional, the basic minimality inequality holds. Together with condition (14), this implies the asserted inequality, which proves Lemma 7.

Theorem 8. Assume that Assumption 3 holds. Then for the choice , we deduce that

Proof. We begin with a direct estimate of the error. From Lemma 6, we obtain the stated bound. According to Lemma 7, it then follows that the error estimate holds. The assertion follows from the a priori choice of the regularization parameter.

Next we turn our attention to an a posteriori parameter choice, namely the discrepancy principle due to Morozov [40]. The regularization parameter is defined via the discrepancy principle in (24) for some fixed constant. The Morozov discrepancy principle is widely used and easy to implement numerically. Furthermore, for the ℓ¹ fidelity, (4) is equivalent to a constrained minimization problem [26] which is widely used in compressive sensing. A sketch of the rule in our own notation is given below. The next results show a convergence rate under the Morozov discrepancy principle.
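The following is a sketch of the discrepancy rule referred to here, in notation of our own choosing; the constant τ and the exact form of the paper's display (24) are assumptions on our part.

```latex
\alpha = \alpha(\delta, f^{\delta}) \ \text{ chosen such that }\
\bigl\|A u_{\alpha}^{\delta} - f^{\delta}\bigr\|_{1} \;=\; \tau\,\delta,
\qquad \tau \ge 1 .
```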

Theorem 9. Assume that Assumption 3 holds and that the regularization parameter is determined by (24). Then

Proof. Since the regularized solution is a minimizer of the functional, the minimality inequality holds. Together with the discrepancy principle, this implies a bound on the residual. Proceeding as in the proof of Theorem 8 and using Lemma 6, we obtain the asserted rate. The proof is completed.

Remark 10. A linear convergence rate can also be obtained when the a priori choice and the Morozov discrepancy principle are applied to the ℓ² fidelity; however, in the a priori parameter choice rule the regularization parameter must then tend to zero with the noise level.

3. Dual Problem

In this section, we let all the weights in (4) take the same value; this is reasonable because convergence can already be obtained in that case [6]. With this normalization, (4) is equivalent to (31). In addition, we use a dictionary, that is, a synthesis operator whose left inverse is the corresponding analysis operator; for example, in the field of wavelet transforms the analysis operator is the wavelet decomposition and the synthesis operator is the wavelet reconstruction [41, 42]. Writing the problem in terms of the dictionary coefficients, (4) is equivalent to (32). Duality is a conventional method for solving Tikhonov regularization with a nondifferentiable fidelity; however, there are limitations to this approach for (32). The main difficulty is that both the fidelity and the regularization term are nondifferentiable; moreover, for ill-conditioned problems, sparsity regularization is often unstable. We therefore add a smooth ℓ² penalty to (32) and obtain the functional (33). The advantage of (33) over (32) is that the dual problem of (33) is a constrained smooth functional, so a projected gradient algorithm can be used to compute minimizers. Moreover, the regularization effect of the ℓ² penalty is weak, while it improves the stability of (32). A sketch of (32) and (33) in our own notation is given below.
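In the notation assumed above (writing x for the coefficient representation, A for the composite of the forward map and the synthesis operator, and β > 0 for the additional smooth parameter), a plausible reading of (32) and its smoothed version (33) is:

```latex
\text{(32)}\quad \min_{x}\; \|Ax - f^{\delta}\|_{1} + \alpha\,\|x\|_{1},
\qquad\qquad
\text{(33)}\quad \min_{x}\; \|Ax - f^{\delta}\|_{1} + \alpha\,\|x\|_{1} + \frac{\beta}{2}\,\|x\|_{2}^{2}.
```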

Next we investigate the convergence of the minimizers of the augmented functional (33) as the ℓ² penalty parameter tends to zero.

Theorem 11. Let the ℓ¹ regularization parameter be fixed and let the ℓ² penalty parameters form a sequence tending to zero. Then the corresponding minimizers of the functional (33) have a subsequence converging to a minimizer of the functional (32).

Proof. By the definition of the minimizers, the value of (33) at a minimizer is bounded by its value at any fixed element, and hence the functional values of the minimizers are uniformly bounded. Since the ℓ¹ parameter is fixed and the ℓ² parameters tend to zero, there exists a constant bounding the minimizers uniformly. Therefore a weakly convergent subsequence of the minimizers exists. By the weak lower semicontinuity of the norm, the value of (32) at the limit does not exceed its value at any competitor; therefore the limit is a minimizer of the functional (32), and the proof is completed.

Next we consider the dual problem of (33). We will show that a constrained smooth minimization problem is the dual problem of (33).

Theorem 12. A box-constrained smooth minimization problem is the dual problem of (33). The solution of (33) and the solution of the dual problem are related componentwise through the soft threshold function defined below.

Proof. Problem (33) can be rewritten as the sum of two proper, convex, lower semicontinuous functionals, one acting on the image of the operator and one on the coefficients. Denote their convex conjugates accordingly. By Fenchel duality [43, Chap. 3, Chap. 10], the dual problem is obtained from these conjugates, and by the Fenchel duality theorem [43, Chap. 3, Prop. 2.4, Prop. 4.1, Rem. 4.2] strong duality holds. Since the primal and dual solutions are minimizers of the respective problems, the extremality condition (48) holds, and it is equivalent to the Kuhn-Tucker conditions (49). By the definition of the subgradient, the conditions in (49) yield a componentwise relation between the primal coefficients and the dual variable: when the relevant dual quantity does not exceed the threshold, the corresponding primal coefficient must vanish; otherwise the primal coefficient is obtained by shrinking the dual quantity by the threshold. Summarizing, the primal solution is obtained from the dual solution by applying the soft threshold function defined below, and the proof is completed.
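For completeness, the soft threshold function referred to in the proof, written in its standard componentwise form with a generic threshold α > 0 (the precise threshold used in the paper's display may involve the regularization parameters):

```latex
S_{\alpha}(t) \;=\;
\begin{cases}
t - \alpha, & t > \alpha,\\
0, & |t| \le \alpha,\\
t + \alpha, & t < -\alpha,
\end{cases}
\qquad\text{equivalently}\qquad
S_{\alpha}(t) = \operatorname{sign}(t)\,\max\{|t| - \alpha,\, 0\}.
```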

Remark 13. In [39, Chap. 1], numerical experiments showed that choosing the sparsity exponent slightly larger than 1 gives even better results than the exponent 1. It is therefore worthwhile to discuss the functional (55) with sparsity exponent strictly between 1 and 2; we call (55) the ℓ¹-ℓ^p problem. We do not have to add an additional ℓ² penalty to (55); its dual problem can be obtained by Fenchel duality directly.

4. Computation of Minimizers

As mentioned above, the functional (33) can be transformed by Fenchel duality into a smooth functional with a box constraint, which can be solved effectively using projected gradient-type methods. The spectral projected gradient (SPG) method [44] has been proved effective for the minimization of differentiable functionals with box constraints. The dual objective and its gradient follow from Theorem 12, and the feasible set is the box on which the dual variable is constrained; we use the orthogonal projection onto this box. Given an initial value, lower and upper bounds on the spectral step length, a nonmonotone line search memory, and a sufficient-decrease parameter, the method proceeds as in Algorithm 1, with the convergence criterion stated there. A sketch of the iteration is given after the listings. We have the following convergence result.

(1) Set , , , , , , , ; choose ,
(2)
(3) Select the step length using the line search in Algorithm 2
(4) compute , , ,
(5)
if    then
Set
Else
Set
end if
(6)
Until the convergence criterion is satisfied or the maximum number of iterations is reached
(1) Set
(2) Compute
(3)
if
then
Let
Else
Let , and go to (2)
end if
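The following is a minimal Python sketch of a spectral projected gradient iteration of the kind listed above, for a generic smooth objective with box constraints. The function and parameter names are ours; the dual objective and gradient from Theorem 12 are not reproduced here and must be supplied by the caller, so this is a sketch under those assumptions rather than the paper's exact Algorithm 1.

```python
import numpy as np

def spg_box(fun, grad, p0, lower, upper, alpha_min=1e-10, alpha_max=1e10,
            memory=10, gamma=1e-4, max_iter=500, tol=1e-6):
    """Spectral projected gradient (SPG) sketch for
        min f(p)  subject to  lower <= p <= upper  (componentwise).
    `fun`/`grad` evaluate the smooth objective and its gradient; in the DSPG
    setting they would come from the dual functional of Theorem 12."""
    project = lambda q: np.clip(q, lower, upper)      # projection onto the box
    p = project(np.asarray(p0, dtype=float))
    g = grad(p)
    f_hist = [fun(p)]
    alpha = 1.0                                       # initial spectral step length
    for _ in range(max_iter):
        d = project(p - alpha * g) - p                # projected gradient direction
        if np.linalg.norm(d, np.inf) < tol:           # stationarity measure
            break
        f_ref = max(f_hist[-memory:])                 # nonmonotone reference value
        lam, gtd = 1.0, g.dot(d)
        for _ in range(60):                           # backtracking (nonmonotone Armijo)
            if fun(p + lam * d) <= f_ref + gamma * lam * gtd:
                break
            lam *= 0.5
        p_new = p + lam * d
        g_new = grad(p_new)
        s, y = p_new - p, g_new - g
        sy = s.dot(y)
        # Barzilai-Borwein (spectral) step, safeguarded to [alpha_min, alpha_max]
        alpha = np.clip(s.dot(s) / sy, alpha_min, alpha_max) if sy > 0 else alpha_max
        p, g = p_new, g_new
        f_hist.append(fun(p))
    return p
```

As a quick sanity check one could minimize a simple quadratic, for example fun = lambda p: 0.5*np.sum((p - 2.0)**2) with grad = lambda p: p - 2.0 over the box [-1, 1]^n; the iterates should converge to the upper bound 1 in every component.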

Proposition 14. For admissible choices of the algorithmic parameters, the SPG algorithm is well-defined, and the sequence generated by SPG converges to a stationary point of the dual functional.

Proof. Since the dual functional is convex and has continuous derivatives, the convergence follows from standard results (see the convergence theorem in [44]).

5. Choice of the Regularization Parameters

The solution of the augmented problem (33) converges theoretically to the solution of (32) as the ℓ² penalty parameter tends to zero, so a smaller ℓ² parameter appears preferable. However, certain quantities in the dual problem tend to infinity as this parameter tends to zero; moreover, a smaller parameter weakens the regularization effect of the ℓ² penalty, which leads to instability. If the solution is sparse, we may regard the nonzero coefficients as the impulsive part and the zero coefficients as the smooth part. Multiparameter regularization is a conventional method for ill-posed problems whose solutions contain several different structures. In [35, 37], numerical experiments show that multiparameter regularization can effectively recover the different parts of the solution, and it also performs well when the solution contains only a single structure. Hence the ℓ² parameter does not necessarily have to tend to zero when we choose the two parameters. A conventional approach for choosing them is a multiparameter regularization choice rule, for example, the discrepancy principle [35] or the balancing principle [37]. In this paper, we use the multiparameter balancing principle to choose the two regularization parameters.

The balancing principle computes minimizers of the auxiliary function (62), which depends on a fixed constant, and by [37] minimizing (62) is equivalent to the balancing equations (63). A fixed point algorithm for the two parameters based on (63) is given in Algorithm 3; a generic sketch of such an iteration follows the listing.

(1) Choose and , and let
(2)  
(3)   
Until a stopping criterion is satisfied.
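Because the displays (62)-(63) did not survive in this version of the text, the following is only a generic Python sketch of a fixed-point iteration in the spirit of the multiparameter balancing principle: each parameter is updated so that its penalty term balances the data fidelity, scaled by a fixed constant γ. The exact update in Algorithm 3 may differ; all function names below are placeholders.

```python
import numpy as np

def balancing_fixed_point(solve_reg, fidelity, pen1, pen2,
                          alpha0=1.0, beta0=1.0, gamma=1.0,
                          max_iter=20, tol=1e-3):
    """Generic multiparameter balancing iteration (a sketch, not the paper's
    exact Algorithm 3). `solve_reg(alpha, beta)` returns the regularized
    solution for the current parameters; `fidelity`, `pen1`, `pen2` evaluate
    the data misfit, the l1 penalty, and the (1/2)*l2^2 penalty of a solution."""
    alpha, beta = alpha0, beta0
    for _ in range(max_iter):
        x = solve_reg(alpha, beta)
        # balance: fidelity ~ gamma * alpha * pen1 ~ gamma * beta * pen2
        alpha_new = fidelity(x) / (gamma * max(pen1(x), 1e-15))
        beta_new = fidelity(x) / (gamma * max(pen2(x), 1e-15))
        converged = (abs(alpha_new - alpha) < tol * alpha and
                     abs(beta_new - beta) < tol * beta)
        alpha, beta = alpha_new, beta_new
        if converged:
            break
    return alpha, beta
```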

6. Numerical Simulations

In this section, we present numerical experiments to illustrate the efficiency of the proposed method. In Section 6.1, the experiments involve compressive sensing; we aim to demonstrate that the ℓ¹ fidelity is more stable than the ℓ² fidelity and is capable of handling both impulsive and Gaussian noise. In Section 6.2, we first compare the performance of the DSPG method with the alternating direction method (ADM) and the TNIP method on well-conditioned compressive sensing problems; in a second example, we discuss an ill-posed problem in which the condition number of the linear operator is 255, to demonstrate that the proposed method is stable even for large condition numbers. In Section 6.3, we discuss image inpainting, where the images are sparse with respect to Daubechies wavelets; for image inpainting, the linear operator is moderately ill-conditioned, with a condition number around 4000. In order to compare the restoration results, the quality of the computed solution is measured by the relative error and the PSNR; a sketch of both definitions is given below. All experiments were performed under Windows 7 and Matlab R2010a on an HP ProBook 4431s with an Intel Core i5 2410M CPU at 2.30 GHz and 4 GB of memory.
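Since the displayed definitions did not survive in this version of the text, the following is a minimal sketch of the two quality measures in their standard form; the paper's exact normalization (for example, the peak value used in the PSNR) may differ.

```python
import numpy as np

def relative_error(x_rec, x_true):
    """Standard relative error ||x_rec - x_true|| / ||x_true||."""
    return np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)

def psnr(x_rec, x_true, peak=None):
    """Standard PSNR in dB; `peak` defaults to the maximum of the true image."""
    peak = np.max(np.abs(x_true)) if peak is None else peak
    mse = np.mean((x_rec - x_true) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```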

6.1. Comparison of the ℓ¹ and ℓ² Fidelities

This example involves a compressive sensing problem in which the matrix is random Gaussian and the observed data contain white noise or impulsive noise. The true solution is 16-sparse with respect to the natural basis. White noise is generated so that the data attain a desired SNR, and the impulsive noise is measured by its relative error; a sketch of such a test problem is given at the end of this subsection. Figure 1 presents the restoration results for the ℓ¹ and ℓ² fidelities. The left column shows the data at different white noise levels; the SNRs of the data are 30 dB, 20 dB, 10 dB, and 5 dB. The right column shows the restorations, with the regularization parameter chosen separately for each noise level. As can be seen from Figure 1, the quality of restoration by the ℓ¹ and ℓ² fidelities is similar when the data are contaminated with low noise levels; if the data contain high noise levels, the ℓ¹ fidelity performs better, although neither fidelity recovers a high-quality solution at the highest noise levels. In Figure 2, the left column shows the data contaminated by different impulsive noise levels; the relative errors of the noise are 3%, 7%, 15%, and 22%. The impulsive noise takes the values ±1 at random positions and 0 elsewhere. The right column contains the restoration results for each noise level, again with the regularization parameter chosen per noise level. As can be seen from Figure 2, the ℓ¹ fidelity is more stable with respect to impulsive noise and always offers a high-quality restoration, even for poor data; in contrast, the quality of the restorations obtained with the ℓ² fidelity is consistently poor.
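A minimal sketch of how such a test problem can be generated follows; the problem sizes m and n below are placeholders, since the original values did not survive extraction, and the corruption fraction is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 100, 256, 16          # placeholder sizes; the signal is 16-sparse
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian sensing matrix

x_true = np.zeros(n)            # 16-sparse signal in the natural basis
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k)

y_clean = A @ x_true

# White (Gaussian) noise scaled to a prescribed SNR in dB
snr_db = 20.0
noise = rng.standard_normal(m)
noise *= np.linalg.norm(y_clean) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
y_gauss = y_clean + noise

# Impulsive noise: values +-1 at a few random positions, 0 elsewhere
frac = 0.07                     # illustrative corruption fraction
idx = rng.choice(m, size=max(1, int(frac * m)), replace=False)
impulse = np.zeros(m)
impulse[idx] = rng.choice([-1.0, 1.0], size=idx.size)
y_impulse = y_clean + impulse
```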

6.2. Comparison of DSPG with ADM-ℓ¹, ADM-ℓ², and TNIP

In this subsection, we present the comparison of DSPG with ADM-ℓ¹, ADM-ℓ², and TNIP. ADM-ℓ¹ [18] is an efficient alternating direction method for the ℓ¹-fidelity problem, while ADM-ℓ² [18] and TNIP [45] (truncated Newton interior point) are designed for the ℓ²-fidelity problem. In the first experiment, we use a random Gaussian matrix, with sampling length smaller than the signal length; the condition number of the random Gaussian matrix is around 5. The signal is sparse, and we add impulsive noise to the true signal. For each fixed pair of sparsity and noise levels we take 100 iterations; one of the two parameters takes the values 0.4, 0.3, 0.2, and 0.1, while the other is fixed at 0.2. Figure 3 presents the comparison of the four iterative algorithms: each column shows the convergence of the relative error for one of the two settings, and the varied parameter increases from top to bottom. As can be seen from Figure 3, ADM-ℓ¹ converges faster than DSPG, and its accuracy is better at the larger parameter values; as the parameter decreases, the DSPG method becomes more competitive, and its accuracy is even better than ADM-ℓ¹ at the smallest values. Although the optimal relative error of ADM-ℓ¹ is better than that of the DSPG method, the corresponding optimal iteration number or stopping tolerance of ADM-ℓ¹ is difficult to estimate in practice. The ℓ²-fidelity algorithms, especially the TNIP method, converge noticeably faster than ADM-ℓ¹ and DSPG; however, their accuracy is worse than that of ADM-ℓ¹ and DSPG regardless of the parameter value.

Next, in order to test the stability of the DSPG method for ill-conditioned problems, we use a matrix whose condition number is 255. This problem was discussed by Lorenz in [8], where the ill-conditioned matrix is generated by the Matlab code tril(ones(200)); a sketch of this setup is given at the end of this subsection. The signal is sparse, with two sparsity levels, the larger being 0.2, and we add impulsive noise to the data. As can be seen from Figure 4, the DSPG method converges noticeably faster than the ADM-ℓ¹ method, and its relative error is also smaller; this shows that the DSPG method is stable even for matrices with large condition numbers. In Table 1, the data contain impulsive noise with corruption levels (measured by relative error) of 0.3%, 0.5%, 1%, 3%, 5%, and 10%. As can be seen from Table 1, the quality of restoration improves as the noise level decreases. Theoretically, the ADM-ℓ¹ and DSPG methods are well suited to impulsive noise. However, the ADM-ℓ¹ method is sensitive to noise when the operator is ill-conditioned, and in this case it cannot obtain a reasonable restoration. The ADM-ℓ² and DSPG methods are more stable with respect to the noise level even when the matrix has a large condition number; for low noise levels they have an advantage over the other two methods. For high noise levels, the restoration results of the ADM methods are very close to those of TNIP, while the restoration results of the DSPG method are clearly better than those of the other three methods.
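A sketch of the ill-conditioned test setup described in the text follows; the sparsity and corruption levels other than the stated 0.2 are illustrative.

```python
import numpy as np

n = 200
A = np.tril(np.ones((n, n)))          # Matlab: tril(ones(200))
print(np.linalg.cond(A))              # roughly 255, as reported in the text

rng = np.random.default_rng(1)
k = int(0.2 * n)                      # sparsity level 0.2 (one of the levels mentioned)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], size=k)

y = A @ x_true
idx = rng.choice(n, size=max(1, int(0.05 * n)), replace=False)   # 5% impulsive corruption (illustrative)
y[idx] += rng.choice([-1.0, 1.0], size=idx.size)
```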

6.3. Image Inpainting

We consider 2D image inpainting problems. The test image is Barbara (cf. Figure 5). We randomly remove pixels of Barbara to create an incomplete image; in this case, image inpainting is a moderately ill-posed problem, and the condition number of the matrix is around 4000. For our purpose, we use the Daubechies 4 wavelet basis as a dictionary, with four decomposition scales (cf. Figure 6). As seen from Figure 6, the representation of the image with respect to the Daubechies 4 basis is sparse. We add salt-and-pepper noise with the Matlab command imnoise(image, 'salt & pepper', d) for a fixed density d. The restoration result is shown in Figure 5(d), and restoration results for four images with different noise levels are given in Table 2. These results show that if images have a sparse representation with respect to an orthogonal basis, the DSPG method can obtain satisfactory results even though image inpainting is a moderately ill-posed problem. A sketch of this setup is given below.
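The following is a minimal sketch of the inpainting setup using PyWavelets; the image stand-in, its size, the removal fraction, and the noise density d are placeholders, since the paper's specific values did not survive extraction.

```python
import numpy as np
import pywt

rng = np.random.default_rng(2)
# Stand-in for the Barbara image (load the real image in practice); size is a placeholder.
img = rng.random((256, 256))

# Daubechies 4 analysis/synthesis with four scales (the dictionary and its left inverse)
coeffs = pywt.wavedec2(img, 'db4', level=4)
rec = pywt.waverec2(coeffs, 'db4')             # reconstruction, rec is approximately img

# Inpainting operator: randomly remove a fraction of the pixels
frac_missing = 0.3                             # placeholder removal fraction
mask = rng.random(img.shape) > frac_missing    # True where a pixel is observed

# Salt-and-pepper noise (Matlab analogue: imnoise(image, 'salt & pepper', d))
d = 0.05                                       # placeholder noise density
noisy = img.copy()
flip = rng.random(img.shape) < d
noisy[flip] = rng.choice([0.0, 1.0], size=int(flip.sum()))
observed = np.where(mask, noisy, 0.0)          # incomplete, noisy data
```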

7. Conclusion

With the source and FBI conditions, we have proved that the ℓ¹-ℓ¹ regularization method yields convergence rates for both a priori and a posteriori parameter choices. For the numerical solution, we have proposed a novel DSPG approach for sparsity regularization with ℓ¹ fidelity. Numerical results indicate that the proposed DSPG algorithm performs competitively with several state-of-the-art algorithms such as the ADM method. On various classes of test problems with different condition numbers, the proposed DSPG method has exhibited the following advantages: (i) compared with other algorithms, the dual problem of our method is simpler, namely a box-constrained smooth functional that can be solved effectively by projected gradient methods; (ii) for ill-conditioned problems, the DSPG method is more stable with respect to the noise level. We remark that for well-conditioned problems, for example, compressive sensing, the optimal accuracy of ADM-ℓ¹ is better than that of the DSPG method; however, the optimal accuracy of the ADM-ℓ¹ method is strongly dependent on the stopping tolerance, which can be difficult to estimate in practice. To the best of the authors' knowledge, this is the first time multiparameter regularization has been used to obtain the dual formulation of ℓ¹-ℓ¹ problems. One can also try to use the multiparameter regularization strategy to solve the dual problems of other nonsmooth functionals.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (Grant nos. 41304093 and 11301119), the Fundamental Research Funds for the Central Universities (Grant no. 2572015CB19), and the Heilongjiang Postdoctoral Research Developmental Fund (no. LBH-Q16008). The authors are grateful to Professor Wen Song of Harbin Normal University for many useful discussions and suggestions.