Method of Lagrange Multipliers for Normalized Zero Norm Minimization
We present a normalization of the ℓp-norm. A compressive sensing criterion is proposed using the normalized zero norm. Based on the method of Lagrange multipliers, we derive the solution of the proposed optimization framework. It turns out that the new solution is a limit case of the least fractional norm solution for p → 0, where its fixed-point iteration algorithm can readily follow an existing algorithm. The derivation of the minimal normalized zero norm solution herein relates, from the viewpoint of the Lagrange multiplier method, to existing works that invoke the least fractional norm and least pseudo zero norm criteria.
Various applications in science and engineering need to recover a desired signal x ∈ R^N from a set of observed or measured data b ∈ R^M based on a modeling or measurement matrix A ∈ R^{M×N}, which either depends on the model or can be chosen beforehand. A linear system as in Figure 1 can be represented by b = Ax + n, where n is the amount of perturbation hidden in the output b.
The signal can be recovered by solving an optimization problem related to linear least squares (LLS), i.e., minimize_x ||x||_2 subject to ||b − Ax||_2 <= ε, where ε is the square root of the maximal allowable noise power ε^2. Usually, the matrix A is of full rank, i.e., rank(A) = min(M, N), where rank(·) is the rank of a matrix. Depending on whether the number of provided data M is greater than, equal to, or less than the number of unknown variables N, i.e., the size of the matrix A, the LLS problem can be classified into the overdetermined (M > N), determined (M = N), and underdetermined (M < N) cases.
The solution of (4) is well known as the LLS estimate x̂ = (A^T A)^{-1} A^T b, where (·)^T is the transpose of a vector or matrix, and (·)^{-1} is the inverse of a square matrix. The LLS estimate exists provided that the matrix A, A^T A, or A A^T is invertible, i.e., the measurement matrix A needs to be of full rank. The LLS estimate provides all nonzero entries to the desired signal x̂.
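As a quick numerical sketch of the LLS estimate x̂ = (A^T A)^{-1} A^T b (assuming an overdetermined, full-rank A and noiseless observations for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # full-rank measurement matrix, M > N
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true                    # noiseless observations b = A x

# LLS estimate: x_hat = (A^T A)^{-1} A^T b
x_hat = np.linalg.inv(A.T @ A) @ A.T @ b
```

In the noiseless full-rank case the estimate recovers x exactly; with noise, it returns the minimizer of the residual 2-norm, and all of its entries are generally nonzero.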
Many works indicate that the desired signal x is often subject to sparsity, i.e., the situation in which a number of elements in x are zeros. Even though the signal sometimes does not strictly entail sparsity, it is efficient to keep an approximate value of the signal that contains only a sufficient number of its largest components; such a signal is called a compressible signal. The sparsity nature of the signal is usually hidden and can be exposed by discovering a sparse basis and its associated spanning coefficients. The decomposition of the coefficient vector can be seen as a superposition of dictionary elements with a remaining term. Compressed sensing is an emerging field that spans many applications in science and engineering, e.g., imaging and vision , photonic mixer devices , electronic defense , security and cryptosystems , radar [7, 8], earth observation , wireless networks [10, 11], biometric watermarking , and healthcare .
In compressive sensing, the ℓ0-norm is originally adopted to impose zero elements in the solution. The optimization of the ℓ0-norm criterion, however, is a combinatorial nondeterministic polynomial-time hard (NP-hard) problem, which appears to be prohibitive. The performance of the above optimization problem can nevertheless be analyzed, e.g., in .
Instead of the ℓ0-norm, the ℓ1-norm is often of interest because it is more convenient than ℓ0-norm optimization in terms of computability while its accuracy is comparable (see, e.g., [15, 16]). A widely considered method designed for norm minimization of dictionary coefficients is known as matching pursuit . Its variations are presented in terms of basis pursuit denoising , orthogonal matching pursuit , compressive sampling matching pursuit , stagewise orthogonal matching pursuit , gradient pursuits , etc. Most approaches based on matching pursuit involve the ℓ2-norm, except for basis pursuit denoising, which considers the ℓ1-norm.
In this work, we point out that the zero norm mostly adopted in the compressed sensing literature is not the actual zero norm, but rather a pseudo zero norm. We also show that the actual zero norm is unbounded and thus trivial. Later we present a normalized ℓp-norm and apply its special case for p → 0 as a new objective function. By using the method of Lagrange multipliers, the proposed constrained optimization is solved, and the emerging solution is equal to the limit case of that given by the least fractional norm for p → 0.
This paper is organized as follows. In Section 2, we point out that the 0-norm diverges or is undefined, whereas the so-called zero norm adopted in compressive sensing is actually not a proper norm, i.e., it is only a pseudonorm. In Section 3, we propose a normalized ℓp-norm. It is later shown that for p → 0 the normalized zero norm is approximately a geometric mean and unfortunately does not satisfy the triangle inequality of a proper norm. In Section 4, we consider the compressive sensing model. The fractional norm for 0 < p < 1 and its criterion presented in the past are revisited herein. The corresponding solution is found by the method of Lagrange multipliers. In Section 5, we propose an alternative criterion based on the normalized zero norm. We then derive its solution through the method of Lagrange multipliers. The solution is found in a closed form and turns out to be a limiting case of that of the least fractional norm criterion in the former works. In Section 6, numerical examples of the solution are provided in conjunction with other works. Concluding remarks are provided in Section 7.
2. Conventional Zero Norm
Let x ∈ C^N be a complex-valued vector, expressed as x = [x_1, x_2, ..., x_N]^T.
Let p be a positive integer. The ℓp-norm, or simply the p-norm, ||x||_p, is given by ||x||_p = (sum_{n=1}^{N} |x_n|^p)^{1/p}, where |x_n| is the absolute value of x_n. Basic properties of the p-norm in (7) are as follows: (i) The p-norm is positive-definite, i.e., ||x||_p >= 0. (ii) The p-norm is zero if and only if the vector is zero, i.e., ||x||_p = 0 iff x = 0_N, where 0_N is a zero vector whose elements are all zeros. (iii) The p-norm satisfies the triangle inequality, i.e., ||x + y||_p <= ||x||_p + ||y||_p for x, y ∈ C^N. (iv) Furthermore, the p-norm has scalability or homogeneity, which can be shown as ||a x||_p = |a| ||x||_p for a ∈ C and x ∈ C^N.
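The properties above can be checked numerically. The following minimal sketch evaluates the p-norm and verifies the triangle inequality and homogeneity for p = 2 on example vectors:

```python
import numpy as np

def p_norm(x, p):
    """l_p-norm: (sum |x_n|^p)^(1/p), for p >= 1."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
y = np.array([1.0, 2.0])

# Triangle inequality: ||x + y||_p <= ||x||_p + ||y||_p
triangle = p_norm(x + y, 2) <= p_norm(x, 2) + p_norm(y, 2)

# Homogeneity: ||a x||_p = |a| ||x||_p
homogeneous = np.isclose(p_norm(2.0 * x, 2), 2.0 * p_norm(x, 2))
```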
2.1. Zero Norm
When p tends to zero, the 0-norm of a vector x can be determined from ||x||_0 = lim_{p→0} (sum_{n=1}^{N} |x_n|^p)^{1/p} = lim_{p→0} (sum_{n=1}^{N} e^{p ln|x_n|})^{1/p}, where e^{(·)} is the exponential function. From the Taylor's series expansion, we can express e^{p ln|x_n|} = 1 + p ln|x_n| + (p ln|x_n|)^2/2! + ⋯.
For a small value of p, i.e., p → 0, we can approximate e^{p ln|x_n|} ≈ 1 + p ln|x_n| + O(p^2), where O(·) is the big 'oh' notation of the Bachmann–Landau symbols, i.e., f(p) = O(p^2) means that |f(p)| <= c p^2 for some constant c as p → 0.
By using a property of the logarithm of a power, we can show that ||x||_0 ≈ lim_{p→0} (N + p sum_{n=1}^{N} ln|x_n|)^{1/p} = lim_{p→0} e^{(1/p) ln(N + p sum_{n=1}^{N} ln|x_n|)} = ∞, since the exponent behaves like (ln N)/p → ∞ for N >= 2.
It is obvious that the 0-norm is unbounded and thus trivial.
2.2. Pseudo Zero Norm
Most works, however, instead consider ||x||_0 = sum_{n=1}^{N} |x_n|^0.
By assigning 0^0 to 0, we have a convenient relation ||x||_0 = card{n : x_n ≠ 0}, where card(·) is the cardinality of a set, or herein the number of nonzero members in the set. The pseudo zero norm counts the number of nonzero elements in x. It is important to note that the pseudo zero norm is actually not the zero norm and is not a proper norm, because it does not preserve homogeneity. However, it is widely used to replace the zero norm due to its simple computability.
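The counting behavior and the failure of homogeneity can be sketched as follows:

```python
import numpy as np

def pseudo_zero_norm(x):
    """The 'l0-norm' of the CS literature: counts nonzero entries."""
    return np.count_nonzero(x)

x = np.array([0.0, 3.0, 0.0, -1.5, 2.0])
count = pseudo_zero_norm(x)        # 3 nonzero entries

# Not homogeneous: scaling x leaves the count unchanged,
# whereas a proper norm would scale by |a|.
scaled = pseudo_zero_norm(5.0 * x)
```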
3. Normalized Zero Norm
Let us introduce a normalization of the p-norm, denoted herein by m_p(x), as m_p(x) = ((1/N) sum_{n=1}^{N} |x_n|^p)^{1/p}.
In statistical analysis (, Ch. 3), the above quantity is known as the p-mean, a generalized mean, the Hölder mean, the mean of degree p, the power mean, etc. We can represent the relation between the normalized p-norm and the conventional p-norm by m_p(x) = N^{-1/p} ||x||_p.
When p tends to zero, we can derive m_0(x) = lim_{p→0} ((1/N) sum_{n=1}^{N} |x_n|^p)^{1/p} = lim_{p→0} ((1/N) sum_{n=1}^{N} e^{p ln|x_n|})^{1/p}.
Using the Taylor's series expansion of the exponential function, we have (1/N) sum_{n=1}^{N} e^{p ln|x_n|} ≈ 1 + (p/N) sum_{n=1}^{N} ln|x_n| + O(p^2).
Under the same manipulation, we can show that m_0(x) = lim_{p→0} (1 + (p/N) sum_{n=1}^{N} ln|x_n|)^{1/p} = e^{(1/N) sum_{n=1}^{N} ln|x_n|} = (prod_{n=1}^{N} |x_n|)^{1/N}.
We can see that, due to the first-order Taylor series approximation, m_0(x) can be approximated by the geometric mean of |x_1|, ..., |x_N|. The geometric mean has the following properties: (i) It is positive-semidefinite for any x, i.e., m_0(x) >= 0. (ii) It can simply be zero when only one of all entries in x is zero, i.e., m_0(x) = 0 if x_n = 0 for some n. (iii) It does not satisfy the triangle inequality; e.g., for x = [1, 0]^T and y = [0, 1]^T, we have m_0(x) = m_0(y) = 0 but m_0(x + y) = 1, which means m_0(x + y) > m_0(x) + m_0(y). (iv) It is homogeneous, i.e., m_0(a x) = |a| m_0(x). (v) It is concave (see, e.g., ) and a monotonically increasing function.
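The limiting relationship can be checked numerically: as p shrinks, the normalized p-norm (the p-mean) approaches the geometric mean of the magnitudes. A minimal sketch:

```python
import numpy as np

def geometric_mean(x):
    """Geometric mean of |x_n|: the p -> 0 limit of the normalized p-norm."""
    x = np.abs(np.asarray(x, dtype=float))
    return float(np.exp(np.mean(np.log(x))))

def normalized_p_norm(x, p):
    """p-mean (Holder mean): ((1/N) * sum |x_n|^p)^(1/p)."""
    x = np.abs(np.asarray(x, dtype=float))
    return float(np.mean(x ** p) ** (1.0 / p))

x = [1.0, 2.0, 4.0]
gm = geometric_mean(x)                 # (1 * 2 * 4)^(1/3) = 2
approx = normalized_p_norm(x, 1e-6)    # close to the geometric mean
```

Note that the sketch assumes all entries are nonzero; a single zero entry sends the geometric mean to zero, as stated in property (ii).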
4. Compressive Sensing
Let us consider an underdetermined system where there are more unknown signal components than equations, i.e., M < N. In this case, there is an infinite number of solutions x to b = Ax. Recent works indicate that the desired signal x is often subject to sparsity, i.e., the situation in which a number of elements in x are zeros. Even though the signal sometimes does not strictly entail sparsity, it is efficient to keep an approximate version that contains only a sufficient number of its largest components; such a signal is called a compressible signal. The sparsity nature of the signal is hidden and can be imposed by discovering a sparse basis and its associated spanning coefficients. The decomposition of the coefficient vector can be seen as a superposition of dictionary elements with a remaining term. Let K be the number of nonzero elements in x. If x is the true value of the signal, there is a relation K = card(S), where card(·) is the cardinality of a set, and S = {n : x_n ≠ 0} is the support set or the sparsity pattern of x. The number K is often known as the sparsity degree. Compressive sensing can be seen as a problem of finding a K-sparse signal x. Unfortunately, the solution in (5) does not preserve the inherent sparsity of the signal. Different kinds of vector norms can be used to explore the signal sparsity. The signal recovery can be formulated as an optimization problem, i.e., minimize_x ||x||_0 subject to b = Ax.
Note that we do not express the norm in (31) in the same notation as most works, because the zero norm in most works is equal to the pseudo zero norm in this paper. In general, the objective function in terms of the modified zero norm is nonconvex. If A is an identity matrix, an exact solution of (31) is a hard shrinkage of b. For an arbitrary matrix A, one may resort to combinatorial optimization. Even approximating the true minimum of the problem in (31) is nondeterministic polynomial-time hard (NP-hard), which appears to be prohibitive. The performance of the above optimization problem can, however, be analyzed, e.g., in .
4.1. Compressive Sensing by Fractional Norm
An alternative way is the consideration of the ℓp-norm for 0 < p < 1 [24–26]. When p lies in the range 0 < p < 1, (i) the power function t^p accepts only a nonnegative real-valued argument, i.e., t >= 0, (ii) it is a concave function, and (iii) the fractional norm does not satisfy the triangle property; e.g., x = [1, 0]^T and y = [0, 1]^T mean ||x + y||_p = 2^{1/p} > 2 = ||x||_p + ||y||_p.
The last property implies that the fractional norm is not a proper norm, i.e., it is only a quasinorm. For 0 < p < 1, the compressive sensing problem, minimize_x ||x||_p subject to b = Ax, is nonconvex, nonsmooth, and non-Lipschitz. The fractional norm gives a closer approximation to the pseudo zero norm than the 1-norm, since the smaller the norm index p, the sparser the solution. It is shown in  that, although a local minimum is found, exact reconstruction is possible with a much sparser solution than that required by the 1-norm reconstruction. The case of p → 0 provides the sparsest solution, while the compressive sensing solutions for different values of p in this range show no significant difference [28, 29].
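The quasinorm behavior is easy to verify numerically. The sketch below reproduces the triangle-inequality counterexample from above for p = 1/2:

```python
import numpy as np

def frac_norm(x, p):
    """l_p 'norm' for 0 < p < 1: a quasinorm, not a proper norm."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

p = 0.5
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

# Triangle inequality fails: ||x + y||_p > ||x||_p + ||y||_p
lhs = frac_norm(x + y, p)               # (1 + 1)^(1/0.5) = 4
rhs = frac_norm(x, p) + frac_norm(y, p) # 1 + 1 = 2
```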
4.2. Method of Lagrange Multipliers
We can see that when p is equal to 2, the least fractional norm criterion provides the same result as that shown in (5), i.e., the LLS estimate.
Unlike the solution by the LLS criterion in (5), the solution by the fractional norm in (38) depends on the unknown variable x itself, which therefore requires an iterative computation. The iterative computation procedure, known as the FOCal Underdetermined System Solver (FOCUSS), is summarized in .
Algorithm 1 can suffer from many local minima. An alternative tries to avoid the NP-hard problem by sequentially minimizing a smooth function.
Sometimes, the least fractional exponent norm criterion is called iteratively reweighted least squares (IRLS) in compressive sensing [31–33]. When the vector x in Algorithm 1 converges to the true value of the signal, a number of elements in x may be close to zero, which can cause an ill condition of the matrix to be inverted. It is suggested in  that the weighting be regularized by a quantity ε that is large at the beginning of the iteration and gradually smaller as the iteration converges. In , the regularization quantity in (41) is replaced by its square ε^2. Let {r_1, r_2, ..., r_N} be a nonincreasing arrangement of all elements of {|x_1|, |x_2|, ..., |x_N|}, which can be represented by r_1 >= r_2 >= ... >= r_N.
Let r_k be the kth element of the nonincreasing arrangement, i.e., the kth largest magnitude among the elements of x.
Algorithm 2 is different from  in two aspects. First, the regularization parameter in  is computed from the value of x updated at the iteration being solved, which is unavailable. Second, the procedure addressed in  considers only a restricted setting.
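The fixed-point iteration underlying both algorithms can be sketched as follows. This is a minimal, simplified FOCUSS-style loop with the update x ← Π A^T (A Π A^T)^{-1} b, Π = diag(|x_n|^{2−p}); the constant floor eps and the iteration count are assumed illustration values, not the decreasing regularization schedule of the cited works.

```python
import numpy as np

def focuss_sketch(A, b, p=0.5, iters=50, eps=1e-8):
    """Simplified FOCUSS-style fixed point for min ||x||_p s.t. b = A x."""
    # Minimum 2-norm starting point: x = A^T (A A^T)^{-1} b
    x = A.T @ np.linalg.solve(A @ A.T, b)
    for _ in range(iters):
        w = np.abs(x) ** (2.0 - p) + eps   # eps guards near-zero entries
        Pi = np.diag(w)
        x = Pi @ A.T @ np.linalg.solve(A @ Pi @ A.T, b)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 20))          # underdetermined, M < N
x_true = np.zeros(20)
x_true[[2, 7, 15]] = [1.0, -2.0, 0.5]      # 3-sparse signal
b = A @ x_true
x_hat = focuss_sketch(A, b)
```

Every iterate satisfies the constraint b = Ax by construction; the reweighting progressively concentrates the energy of x on few entries.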
5. Compressive Sensing by Normalized Zero Norm
We propose a new criterion using the normalized zero norm m_0(x). Under a similar idea to (32), the usual compressive sensing problem can alternatively be formulated as minimize_x m_0(x) subject to b = Ax.
Fortunately, when approximated as the geometric mean in (25), the normalized zero norm renders the optimization problem nonconvex with a concave objective function and an affine/convex constraint.
The Lagrange function can be expressed as L(x, λ) = m_0(x) + λ^T (b − Ax), where λ ∈ R^M is a vector of Lagrange multipliers.
The derivatives of the Lagrange function with respect to x and λ can be written as ∂L/∂x = ∂m_0(x)/∂x − A^T λ and ∂L/∂λ = b − Ax.
We can show that ∂|x_n|/∂x_n = sgn(x_n) = x_n/|x_n|.
Using the chain rule for a real-valued variable x_n, the derivative can be written as ∂m_0(x)/∂x_n = (1/N) (prod_{m=1}^{N} |x_m|)^{1/N − 1} (prod_{m≠n} |x_m|) ∂|x_n|/∂x_n.
By using the result from (47), we can derive ∂m_0(x)/∂x_n = (1/N) m_0(x) sgn(x_n)/|x_n| = m_0(x)/(N x_n).
At the critical point ∂L/∂x = 0_N, we have m_0(x)/(N x_n) = (A^T λ)_n for all n, or equivalently x = (N/m_0(x)) Π A^T λ, where Π = diag(x_1^2, x_2^2, ..., x_N^2).
At the critical point ∂L/∂λ = 0_M, we have b = Ax = (N/m_0(x)) A Π A^T λ, where Π = diag(x_1^2, x_2^2, ..., x_N^2). Solving for λ and substituting it back into the expression of x yields x = Π A^T (A Π A^T)^{-1} b.
It should be noted that the result in (51) is equal to (38) with p → 0, in which Π = diag(|x_1|^{2−p}, ..., |x_N|^{2−p}) → diag(x_1^2, ..., x_N^2). Thus, the compressive sensing problem using the geometric mean is the limit case of the fractional norm problem for 0 < p < 1 in (35), i.e., the limit as p → 0 of minimize_x ||x||_p subject to b = Ax, and tends to the desired but complicated problem with the pseudo zero norm in (31). Although the solution of the minimization of the normalized zero norm by the Lagrange multiplier method appears to be the same as in the former works, it relates, from the Lagrange multiplier point of view, to existing works that invoke the least fractional norm and least pseudo zero norm criteria.
6. Numerical Examples
All computer simulations in this work are conducted using the Python language. The root-mean-squared relative error (RMSRE), denoted by RMSRE = (E[ ||x̂ − x||_2^2 / ||x||_2^2 ])^{1/2}, is the index for evaluating the performance of each algorithm. It is calculated as the square root of the probabilistic average of the square of the normalized estimation error, where the expectation is taken with respect to the randomization caused by (i) the true value of x, which is assumed to follow an independent and identically distributed real-valued Gaussian distribution with zero mean and unit variance, and (ii) the sparsity pattern of all nonzero elements in x, which is assumed to have an equal probability for locations on all possible positions (N choose K for a K-sparse signal vector).
The algorithms intended for comparison include (i) the 1-norm, the problem in (4) whose Euclidean norm objective is replaced by the 1-norm, i.e., ||x||_1, (ii) the 2-norm, the problem in (4), (iii) the FOCUSS, Algorithm 1 with a fixed norm exponent and a fixed maximum number of iterations, (iv) the IRLS, Algorithm 2 with the same settings, and (v) the theoretical approximate normalized zero norm (TANZN), the best possibility of the first iteration of the fixed-point iteration in (52), calculated by substituting the estimate with the theoretical or true value of x, i.e., x̂ = Π A^T (A Π A^T)^{-1} b with Π = diag(x_1^2, ..., x_N^2) formed from the true x.
The minimizations of the 1-norm and the 2-norm subject to the noise-power constraint are conducted by an interior-point solver for convex optimization . For both fixed-point iteration methods, i.e., the FOCUSS and the IRLS, the same norm exponent is assumed. It should be noted that the solution in (55) is the ideal case of (51), because the true value of x is unknown. Realistic implementations of (55) were addressed in the past, e.g., in terms of the FOCUSS, the IRLS, etc. The design of a more accurate algorithm for the fixed-point iteration required by (51) remains open for future work. The matrix inverse in (55) is usually subject to a large condition number, which can cause a numerical failure. One has to resolve this numerical instability by adding a tiny amount to each diagonal element of the matrix before its inversion.
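The one-step TANZN substitution with the diagonal regularization can be sketched as follows. Since the exact value of the diagonal load was lost in this text, 1e-12 below is an assumed illustration value; with a K-sparse true x, the matrix A Π A^T has rank K < M, so this load is what makes the inversion possible.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 10, 20
A = rng.standard_normal((M, N))
x_true = np.zeros(N)
x_true[[2, 7, 15]] = [1.0, -2.0, 0.5]   # 3-sparse true signal
b = A @ x_true

# One-step TANZN-style substitution: plug the true signal into the
# fixed point x = Pi A^T (A Pi A^T)^{-1} b with Pi = diag(x_true^2).
Pi = np.diag(x_true ** 2)
G = A @ Pi @ A.T                # rank K only: singular without a load
G += 1e-12 * np.eye(M)          # tiny diagonal load (assumed value)
x_hat = Pi @ A.T @ np.linalg.solve(G, b)
```

With the true signal substituted, the fixed point reproduces x itself up to the perturbation introduced by the load, which illustrates why the TANZN curve serves as an ideal baseline.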
In Figure 2, numerical computation is done from independent runs for each value of the sparsity degree K. One can see that the 1-norm approach is very precise up to a critical sparsity level. Beyond this critical region, the error rises abruptly, probably because the problem is beyond the capability of the interior-point method in convex optimization. The 2-norm method does not explore the sparsity nature of the signal vector and thus performs worst. The IRLS performs almost identically to the FOCUSS, except for slightly better performance in the transition region. The ideal TANZN method stays constant for any value of K. It is worse than its actual implementations, such as the FOCUSS and the IRLS, over part of the range of K. However, it is worth noting that both fixed-point iteration techniques each involve multiple iterations, up to the maximal number of iterations in the FOCUSS, while the TANZN represents the best case for a single substitution, i.e., the first iteration.
In Figure 3, we assume that the length N of the input vector is fixed and the number of nonzero elements K is set to a fraction of the number of output elements M, rounded to the nearest lower integer (the floor operation). One can see that when more observed data are available, the RMSRE decreases, i.e., the signal acquisition is more precise, for all the abovementioned methods. The FOCUSS approach performs identically to the IRLS algorithm. The 1-norm minimization is the realistic method that provides the least signal recovery error. The TANZN technique indicates that if the estimate of the desired signal reaches its true value, the possible acquisition error can be lower than that of the 1-norm minimization.
A normalization of the ℓp-norm, denoted by m_p(x), has been presented. A compressive sensing criterion using the normalized zero norm has been proposed. Based on the method of Lagrange multipliers, the solution of the proposed optimization framework, i.e., x = Π A^T (A Π A^T)^{-1} b with Π = diag(x_1^2, ..., x_N^2), has been derived. It turns out that the new solution is a limit case of the fractional norm solution for p → 0, where its fixed-point algorithm can readily follow the FOCUSS algorithm in . In our companion works, we find that the minimization of the normalized zero norm by the Tikhonov regularization method provides a different solution from that of the fractional norm [28, 29].
Data Availability
The data used to support the findings of this study are available upon request.
Conflicts of Interest
The author declares no conflicts of interest.
V. M. Patel and R. Chellappa, Sparse Representations and Compressive Sensing for Imaging and Vision, Springer Science, New York, NY, USA, 2013.
M. H. Conde, Compressive Sensing for the Photonic Mixer Device: Fundamentals, Methods and Results, Springer Fachmedien, Wiesbaden, Germany, 2017.
A. K. Mishra and R. S. Verster, Compressive Sensing Based Algorithms for Electronic Defence, Ser. Signals and Communication Technology, Springer International, Cham, Switzerland, 2017.
M. Testa, D. Valsesia, T. Bianchi, and E. Magli, Compressed Sensing for Privacy-Preserving Data Processing, Ser. Signal Processing, Springer Nature, Singapore, 2019.
M. Amin, Ed., Compressive Sensing for Urban Radar, CRC Press, Boca Raton, FL, USA, 2015.
A. D. Maio, Y. C. Eldar, and A. M. Haimovich, Eds., Compressed Sensing in Radar Signal Processing, Cambridge University Press, Cambridge, UK, 2020.
C. Chen, Compressive Sensing of Earth Observations, Ser. Signal and Image Processing of Earth Observations, CRC Press, Boca Raton, FL, USA, 2017.
Z. Han, H. Li, and W. Yin, Eds., Compressive Sensing for Wireless Networks, Cambridge University Press, Cambridge, UK, 2013.
L. Kong, B. Wang, and G. Chen, When Compressive Sensing Meets Mobile Crowdsensing, Springer Nature, Singapore, 2019.
R. M. Thanki, V. J. Dwivedi, and K. R. Borisagar, Multibiometric Watermarking with Compressive Sensing Theory: Techniques and Applications, Ser. Signals and Communication Technology, Springer International, Cham, Switzerland, 2018.
M. Khosravy, N. Dey, and C. A. Duque, Eds., Compressive Sensing in Healthcare, Ser. Advances in Ubiquitous Sensing Applications for Healthcare, Academic Press, London, UK, 2020.
H. Shen, X. Li, L. Zhang, D. Tao, and C. Zeng, “Compressed sensing-based inpainting of aqua moderate resolution imaging spectroradiometer band 6 using adaptive spectrum-weighted sparse Bayesian dictionary learning,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 2, pp. 894–906, 2014.
P. S. Bullen, Means and Their Inequalities, Mathematics and Its Applications, vol. 560, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2nd edition, 2004.
S. Boyd and L. Vandenberghe, Convex Optimization, Seventh Printing with Corrections 2009, Cambridge University Press, New York, NY, USA, 2004.
S. Foucart and M.-J. Lai, “Sparsest solutions of underdetermined linear systems via lq-minimization for 0 < q <= 1,” Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 395–407, 2009.
R. Chartrand and W. Yin, “Iteratively reweighted algorithms for compressive sensing,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing 2008 (ICASSP 2008), pp. 3869–3872, Las Vegas, NV, USA, April 2008.
S. Diamond and S. Boyd, “CVXPY: a Python-embedded modeling language for convex optimization,” Journal of Machine Learning Research, vol. 17, no. 83, pp. 2909–2913, 2016.