Abstract
We consider sparse signal inversion with impulsive noise. There are three major ingredients. The first is regularizing properties; we discuss convergence rate of regularized solutions. The second is devoted to the numerical solutions. It is challenging due to the fact that both fidelity and regularization term lack differentiability. Moreover, for illconditioned problems, sparsity regularization is often unstable. We propose a novel dual spectral projected gradient (DSPG) method which combines the dual problem of multiparameter regularization with spectral projection gradient method to solve the nonsmooth optimization functional. We show that one can overcome the nondifferentiability and instability by adding a smooth regularization term to the original optimization functional. The advantage of the proposed functional is that its convex duality reduced to a constraint smooth functional. Moreover, it is stable even for illconditioned problems. Spectral projected gradient algorithm is used to compute the minimizers and we prove the convergence. The third is numerical simulation. Some experiments are performed, using compressed sensing and image inpainting, to demonstrate the efficiency of the proposed approach.
1. Introduction
In the present manuscript we are concerned with illposed linear operator equation: where is sparse with respect to an orthonormal basis and is a bounded linear operator. In practice, exact data are not known precisely, but only an approximation with is available. We call the noisy data and the noise level. It is well known that the conventional method for solving (1) is sparsity regularization, which provides an efficient way to extract the essential features of sparse solutions compared with oversmoothed classical Tikhonov regularization.
In the past ten years, sparsity regularization has certainly become an important concept in inverse problems. The theory of sparse recovery has largely been driven by the needs of applications in compressed sensing [1, 2], bioluminescence tomography [3], seismic tomography [4], parameter identification [5], and so forth. For accounts of the regularizing properties and computational techniques in sparsity regularization we refer the reader to [5–7] and the references given there. In general, sparsity regularization is given bywhere , is the regularization parameter balancing the fidelity and regularization term . The functional in (3) is not convex if , it is challenging to investigate the regularizing properties and numerical computing method of minimizers. Limited work has been done for ; we refer the reader to [8–11] for a recent account of the theory. In this paper, we will focus our main attention on the situation of .
The aim of this paper is to consider a regularization functional of the formWe call (4) problem. A main motivation to investigate the problem is that noisy data often contain impulsive noise. For Gaussian noise, fidelity is a natural choice. However, a typical nondifferentiable fidelity used in application involving impulsive noise is the fidelity, which is more robust than fidelity [12].
fidelity technique was motivated by [13] for signal processing and later had attracted considerable attention in image processing, especially in image denoising. In image processing, fidelity typically combines with TV regularization term. For details of problems, we refer the reader to [14–17]. Nowadays fidelity has received growing interest in the inverse problems where solutions are sparse with respect to an orthonormal basis. Minimizers of cost functions involving fidelity combined with sparsity regularization have been studied. We refer the reader to [18–23] and the references given there.
For fidelity, regularizing properties must be handled specifically. Burger and Osher [24] proved convergence rate when the regularization parameter is chosen arbitrarily fixed but sufficiently small; the authors call this phenomenon exact penalization. The most curious situation is that, in [25], the optimal convergence rate requires that regularization parameter must be equal to a fixed value. In [26], Grasmair et al. proved that source condition together with finite basis injectivity (FBI) condition is weaker than the restricted isometry property and obtained linear convergence . Flemming and Hegland discussed convergence rates in regularization when the basis is not smooth enough [27]. König et al. obtained high order polynomial rates of convergence in the special corrupted domain, even though the underlying problem is exponentially illposed in the classical sense [28].
Though fidelity is robust, more researchers prefer to use fidelity because of its differentiability. Hence a key issue for the fidelity is the numerical computing methods. In the past few years, numerous algorithms have been systematically proposed for the problems. On the other hand, in spite of growing interests in the fidelity, we can indicate limited work has been done for numerical methods of problems. For sparsity regularization, the popular algorithms, for example, homotopy (LARS) method [29], iteratively reweighted least squares (IRLS) method [30], and iterative thresholding algorithm [31, 32], cannot be directly applied to problem due to the fact that both fidelity and regularization term lack differentiability. There are only a few papers, in which numerical algorithms for problems have been discussed systematically. References [20, 22] proposed the new model by statistics method. Reference [21] discussed the prior sparse representation and the datafidelity term and proposed the twophase approach. In [23], authors propose a robust bisparsity model (RBSM) to effectively exploit the prior knowledge about the similarities and the distinctions of signals. However, the researchers often devoted them to compressive sensing problem, where random matrices are wellconditioned. For illconditioned problems, these methods are often unstable [33] [Chap. 5]. Moreover, the researchers assume that the solution is sparse itself, which is different from the general assumption that the solution is sparse with respect to an orthonormal basis. In [19], Yang and Zhang proposed a Primal DualInterior Point Methods (PDIPM) for EIT problem, which is efficient at dealing with the nondifferentiability. However, they did not give the convergence proof. Yang and Zhang reformulated the problem into the basis pursuit model which can be solved effectively by ADM method [18]. It is a competitive method compared with other algorithms for compressive sensing. In [34], Xiao et al. applied ADM method to problem directly. Numerical results illustrated that the proposed algorithm performs better than Yall1 [18].
In this paper, we investigate regularizing properties of problems. The convergence rates of can be shown to be and by a priori and a posteriori choice with the additional source condition and FBI condition. As mentioned above, dual is a conventional technique to solve the Tikhonov regularization with fidelity. However, there are some limitations to this approach to problems due to the fact that it is difficult to obtain the dual formulation of problems. Inspired by [14] and multiregularization theory [35–37], a smooth term is added to original functional of regularization. The dual problem of this new cost function is reduced to a constraint smooth functional. Moreover, the smooth regularization term can improve the stability. The conventional method to solve the constraint smooth functional is projected gradient algorithm. We use spectral projected gradient (SPG) method to seek for the minimizers of the constraint functional and prove the convergence.
An outline of this paper is as follows. We devote Section 2 to a discussion of regularizing properties, including wellposedness and convergence rate. The convergence rate for a priori and a posteriori choice is established. In Section 3, inspired by the theory of multiregularization, we construct a new functional which is equal to a constraint smooth functional according to Frechel duality. In Section 4, the spectral projected gradient method is applied to compute the minimizers. Section 5 provides a detailed exposition of multiparameter choice rule based on the balancing principle. Numerical experiments involving compressed sensing and image inpainting are presented in Section 6, showing that our proposed approaches are robust and efficient.
2. Regularization Properties
2.1. Notation and Assumptions
For the approximate solutions of , we consider the minimization of the regularization functional where and the subdifferential of at is denoted by . All along this paper, denotes a Hilbert space which is a subspace of space and denotes the inner product; denotes a Banach space which is a subspace of space. is a bounded linear operator and . is an orthonormal basis where is some countable index set. From now, we denote
In order to prove convergence rate results we denote by the minimizer of the regularization functional for every and use the following definition of minimum norm solution.
Definition 1. An element is called a minimum norm solution of linear problem if
We define the sparsity as follows.
Definition 2. is sparse with respect to in the sense that is finite. If for some , the is called sparse.
For a subset of indices , we denote bywhere . We denote by the complement of . In addition, let and .
Source condition or variational inequality are necessary for analysis of convergence rate. With these conditions one can obtain convergence rate , even saturation convergence rate . But for sparsity regularization, in order to obtain convergence rate , even linear convergence , one needs the following assumptions.
Assumption 3. Assume that the following hold:(i)Source condition: there exists and some satisfying .(ii)Finite basis injectivity (FBI) condition: for every finite set , the restriction of to is injective.
The FBI condition was originally introduced in [8]. In [26], Grasmair et al. used a slightly weaker condition based on FBI condition and showed that Assumption 3 is in some sense the weakest possible condition that guarantees the linear convergence rate. Actually, if is the unique minimizing solution, Assumption 3 is satisfied [26]. For , being the unique minimizing solution can not guarantee that Assumption 3 is satisfied. Even if Assumption 3 is satisfied for , we do not know if linear convergence can be obtained due to fact that traditional proof for linear convergence needs the convexity of .
2.2. WellPosedness and Convergence Rate
In this subsection, wellposedness and convergence rate of the regularization method are given.
Proposition 4 (wellposedness). For every and , minimizing is welldefined, stable, and convergent.
Proof. In , is a linear bounded operator, is Hilbert space, and is sequentially lower semicontinuous. It is easy to verify Assumption 3.13 [38]; then proof is along the lines of the proofs of Theorem 3.22 (existence), 3.23 (stablity), and 3.26 (convergence) in [38].
We now turn to convergence rate. For problem, Lorenz proved convergence rate with source condition and FBI condition [7]. Linear convergence rate was improved by Grasmair et al. for nonlinear equation with additional two nonlinear conditions [8]. In [26], Grasmair et al. proved that source condition together with FBI condition is weaker than the restricted isometry property and obtained linear convergence . Flemming discussed convergence rates in regularization when the basis is not smooth enough [27].
Remark 5. We note that Bregman distance cannot be used as an error measure in this section due to the fact that fails to be strictly convex. We refer reader to [8, 39] for details. In this section we use norm as error measure.
The next results show a convergence rate for a priori parameter choice rule.
Lemma 6. Assume that Assumption 3 holds. Then there exists a constant such that for all .
Proof. Since , we see that Together with the definition of , (10) implies that is a finite set. From Assumption 3, the restriction to every is injective. Then for some constant for all . By the definition of and we have that for every and . From (11), it is easy to show thatThen we conclude that and the proof is completed.
Lemma 7. Assume there exist such thatfor all . If , then
Proof. Since minimize , thenTogether with condition (14) this implies the inequality which proves Lemma 7.
Theorem 8. Assume that Assumption 3 holds. Then for the choice , we deduce that
Proof. Let ; we can estimate From Lemma 6, we obtain that is,LetAccording to Lemma 7, it follows that The assertion follows from .
Next we turn our attention to a posteriori parameter choice which is socalled discrepancy principle due to Morozov [40]. The regularization parameter defined via the discrepancy principle is for some . Morozov discrepancy principle is a widely used and easy numerical implementation rule. Furthermore, for , (4) is equivalent to a constrained minimization problem [26] which is widely used in compressive sensing. The next results show a convergence rate by the Morozov discrepancy principle.
Theorem 9. Assume that Assumption 3 holds and that the regularization parameter is determined by (24). Then
Proof. Since is a minimizer of , the inequality holds. This together with implies that Proceeding as in the proof of Theorem 8, we obtain that From Lemma 6, it follows that The proof is completed.
Remark 10. Linear convergence rate can also be obtained when the a priori choice and the Morozov discrepancy principle are applied to fidelity. However, one must let in the a priori parameter choice rule.
3. Dual Problem
In this section, let in (4) take the same value; that is, for all . It is reasonable because convergence can be obtained when [6]. Let ; then (4) is equivalent to Let , where . In addition, we denote by a dictionary satisfied with and . For example, in the field of wavelet transform, is a wavelet decomposition operator and is a wavelet reconstruction operator [41, 42]. Let ; then (4) is equivalent to Dual is a conventional method for solving Tikhonov regularization with fidelity. However, there are some limitations to this approach to solve (32). The main difficulty is that both the fidelity and the regularization term are nondifferentiable. Moreover, for illconditioned problems, sparsity regularization is often unstable. We add the smooth penalty to (32) to construct the following functional: The advantage of problem (33) in place of (32) is that the dual problem of (33) is a constraint smooth functional and projected gradient algorithm can be used to compute minimizers. Moreover, the regularization effect of penalty is weak and penalty can improve the stability of (32).
Next we will investigate the convergence of the minimizers to the functional as tending to zero.
Theorem 11. Let be fixed and be a sequence tending to zero. Then the minimizers to the functional have a subsequence converging to being a minimizer of the functional .
Proof. By the definition of , we have and hence Since is a fixed value and , there exists a constant such that It follows that is uniformly bounded. Therefore, a subsequence of and exist such that By the weak lower semicontinuity of norm, we obtain that and hence that for all . Therefore, is a minimizer of the functional and the proof is completed.
Next we consider the dual problem of . We will show that the constraint smooth minimization problems are the dual problem of .
Theorem 12. is the dual problem of . The solutions of and of have the following relation: for all with , whereis the soft threshold function.
Proof. Letthen problem can be rewritten as Let us denote by and the conjugate function of and . By the Fenchel duality [43] [Chap 3, Chap 10], it follows that In , the functionals and are proper, convex lower semicontinuous. By the Fenchel duality theorem [43](Chap 3, Prop. 2.4, Prop. 4.1, Rem. 4.2), it is easy to show that that is,Since and are minimizers of and , we have thatMoreover, the extremality condition (48) is equivalent to the KuhnTucker conditionsBy the definition of the subgradient and in (49), it follows that Then we obtain By the definition of the subgradient and in (49), we conclude that For , this leads to ; we must impose in this case that . For , this leads to , valid only when . When , we put . Summarizing where is soft threshold function which is defined by the proof is completed.
Remark 13. In [39] (chap 1), numerical experiments showed that the choice of slightly larger than 1 gives even better results than . It is necessary to discuss the functional where ; we call (55) problem. We do not have to add additional penalty to (55); its dual problem can be obtained by Fenchel dual theory directly.
4. Computation of Minimizers
As mentioned above, the functional equation (33) can be transformed using Fenchel duality into a smooth functional with a box constraint, which can be solved effectively using projected gradienttype methods. The spectral projected gradient (SPG) method [44] has been proved to be effective for the minimization of differentiable functional with box constraints. The function is defined by then By setting we denote by the orthogonal projection on the ball Given initial value and a constraint on step length , that is, , let parameter , and , . We use the convergence criteria given by (see Algorithm 1). We have the following convergence results.


Proposition 14. For and , algorithm SPG is welldefined; the sequence generated by algorithm SPG converges to a stationary point of the functional .
Proof. Since is convex and has continuous derivatives, the convergence follows from standard results (refer to Theorem in [44]).
5. Choice of Parameter and
The solution of converges theoretically to the solution of as . Obviously smaller is better. However, tends to infinite as . Moreover, the smaller weakens the regularization effect of the penalty, which leads to instability. If the solution is sparse, we can say that the nonzero coefficients are impulsive parts and the zero coefficients are smooth parts. Multiparameter regularization is a conventional method for illposed problems if the solutions have a number of different structures. In [35, 37], numerical experiments show that multiparameter regularization can effectively recover the different part of solutions. If the solutions contain only a single structure, multiparameter regularization also has better performance. Hence it does not necessarily require the parameter tending to zero when we choose the parameters and . A conventional method for the choice of parameters and is multiparameter regularization choice principle, for example, discrepancy principle [35] and balance principle [37]. In this paper, we use multiparameter balance principle for the choice of regularization parameters and .
Balance principle is to compute minimizers of the function where is a fixed constant, and (62) is equivalent to [37] A fixed point algorithm for and according to (63) is as in Algorithm 3.

6. Numerical Simulations
In this section, we present some numerical experiments to illustrate the efficiency of the proposed method. In Section 6.1, numerical experiments involve compressive sensing. We aim to demonstrate that fidelity is more stable than the fidelity and is capable of handing both impulsive and Gaussian noises. In Section 6.2, we first compare the performance of the DSPG method with the alternating direction method (ADM) and TNIP method by wellconditioned compressive sensing problems. In the second example, we discuss an illposed problem where the condition number of linear operator is 255; we aim to demonstrate that the proposed method is stable even with large condition numbers. In Section 6.3, we discuss the image inpainting where images are sparse with respect to the Daubechies wavelets. For image inpainting, the linear operator is moderate illcondition and the condition number is around 4000. In order to compare the restoration results, the quality of the computed solution is measured by relative error and PSNR which are, respectively, defined by All experiments were performed under Windows 7 and Matlab R2010a on HP ProBook 4431s with Intel Core i5 2410M CPU 2.30 GHz 2.30 GHz and 4 GB of memory.
6.1. Comparison with and Fidelity
This example involves compressive sensing problem , where matrix is random Gaussian and is the observed data containing white noise or impulsive noise. The true solution is 16sparse with respect to natural basis of space which is defined by White noise is generated such that data attains a desired SNR, which is defined by The impulsive noise is measured by relative error, which is defined by Figure 1 presents the restoration results by and fidelities. The left column describes the data with different white noise levels. The SNR of data are 30 dB, 20 dB, 10 dB, and 5 dB. The right column is restoration results; the regularization parameters are , , , and . As can be seen from Figure 1, the quality of restoration by and fidelities is similar if the data are contaminated with low noise levels. If data contain high noise levels, the performance of fidelity is better. However, we can not recover highquality solutions with high noise levels. In Figure 2, the left column describes the data which are contaminated by different impulsive noise levels. Rerr() of noise level are 3%, 7%, 15%, and 22%. The value of impulsive noise is ±1 at random positions and 0 at other positions. The right column contains restoration results according to different noise levels. The regularization parameters are , , , and . As can be seen from Figure 2, the fidelity is more stable for impulsive noise and always offers highquality restoration even with poor data. In contrast to the fidelity, the quality of restoration results by the fidelity is always poor.
6.2. Comparison of DSPG with ADM, ADM, and TNIP
In this subsection, we present the comparison results of DSPG with ADM, ADM, and TNIP. ADM [18] is an efficient alternating direction method for problem. ADM [18] and TNIP [45] (truncated Newton interior point) are for problem. In the first experiment, we use random Gaussian matrix , where sampling length is and signal length . The condition number of random Gaussian matrix is around 5. The signal is sparsity. We add impulsive noise to true signal. For each fixed pair , we take 100 iterations, where , 0.4, 0.3, 0.2 and 0.1,