Special Issue: Fractional Mathematical Modelling and Optimal Control Problems of Differential Equations
Research Article | Open Access
Ye Li, Jun Sun, Biao Qu, "A Gradient Projection Algorithm with a New Stepsize for Nonnegative Sparsity-Constrained Optimization", Mathematical Problems in Engineering, vol. 2020, Article ID 6489190, 7 pages, 2020. https://doi.org/10.1155/2020/6489190
A Gradient Projection Algorithm with a New Stepsize for Nonnegative Sparsity-Constrained Optimization
Abstract
The nonnegative sparsity-constrained optimization problem arises in many fields, such as linear compressed sensing and the regularized logistic regression cost function. In this paper, we introduce a new stepsize rule and establish a gradient projection algorithm. We also obtain some convergence results under milder conditions.
1. Introduction

In this paper, we are mainly concerned with the nonnegative sparsity-constrained optimization problem (NN-SCO):
$$\min_{x} f(x) \quad \text{s.t.} \quad x \in S \cap \mathbb{R}^n_+, \qquad (1)$$
where $f:\mathbb{R}^n\to\mathbb{R}$ is a continuously differentiable function with a lower bound, $S=\{x\in\mathbb{R}^n : \|x\|_0 \le s\}$ is the sparse set, $s$ is a given integer regulating the sparsity level in $x$, and $\mathbb{R}^n_+$ is the nonnegative orthant in $\mathbb{R}^n$. Here, $\|x\|_0$ is the $\ell_0$ norm of $x$, counting the number of nonzero elements of $x$. Many application problems can be translated into problem (1), such as the widely studied linear compressed sensing problem $\min \|Ax-b\|_2^2$, with $A$ being a sensing matrix, $b$ the observation vector, and $\|\cdot\|_2$ the Euclidean norm. Problem (1) has also been applied to the regularized logistic regression cost function.
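The feasible set of (1) admits a simple projection: clip negative entries to zero, then keep the $s$ largest remaining entries. A minimal sketch (the function name and tie-breaking rule are our own choices):

```python
import numpy as np

def project_sparse_nonneg(x, s):
    """Projection onto S ∩ R^n_+ = {z >= 0 : ||z||_0 <= s}:
    clip negative entries to zero, then keep the s largest ones.
    Assumes s >= 1; ties are broken arbitrarily (the projection
    is set-valued in general)."""
    z = np.maximum(x, 0.0)
    if np.count_nonzero(z) > s:
        z[np.argsort(z)[:-s]] = 0.0  # zero out all but the s largest entries
    return z

print(project_sparse_nonneg(np.array([3.0, -1.0, 2.0, 0.5]), 2))  # [3. 0. 2. 0.]
```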
Recently, a great deal of work has been devoted to algorithms for the sparsity-constrained optimization problem. Beck and Eldar [3] established the IHT algorithm, which converges to an L-stationary point under Lipschitz continuity of the gradient of the objective function. Beck and Hallak [4] generalized these results to sparse symmetric sets. Lu [5] designed a nonmonotone algorithm for symmetric set constrained problems. Pan, Xiu, and Zhou [6, 7] established B-stationarity, C-stationarity, and $\alpha$-stationarity based on the Bouligand and Clarke tangent cones. Recently, Pan, Zhou, and Xiu [8] established an improved IHT algorithm (IIHT) for problem (1) by using an Armijo line search. They proved that any accumulation point is an $\alpha$-stationary point under restricted strong smoothness of the objective function, which is weaker than Lipschitz continuity of the gradient.
Inspired by the above works, in this paper we establish a gradient projection algorithm with a new stepsize. The new algorithm removes the restricted strong smoothness condition on the objective function, which makes it more widely applicable. Meanwhile, we prove the convergence of the algorithm.
2. Preliminaries

To make the paper easier to read, we first list some notation used in what follows:
Definition 2 (see [9]). A function $f$ is called 2s-restricted strongly smooth (2s-RSS) with parameter $L_{2s} > 0$ if for any $x, y$ satisfying $\|x-y\|_0 \le 2s$, it holds that
$$f(y) \le f(x) + \langle \nabla f(x), y-x\rangle + \frac{L_{2s}}{2}\|y-x\|^2.$$
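The restricted smoothness inequality can be illustrated numerically. For the least-squares objective $f(x)=\frac{1}{2}\|Ax-b\|^2$ (a toy instance of our own), $f$ is 2s-RSS with any $L_{2s}\ge\lambda_{\max}(A^{T}A)$; a sketch that samples sparse pairs and checks the upper bound:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 8, 20, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)
# Global smoothness constant of f; L_{2s} is at most this value.
L = np.linalg.eigvalsh(A.T @ A).max()

def random_sparse():
    # random vector supported on s coordinates, so ||x - y||_0 <= 2s
    v = np.zeros(n)
    v[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
    return v

ok = all(
    f(y) <= f(x) + grad(x) @ (y - x) + 0.5 * L * np.linalg.norm(y - x) ** 2 + 1e-9
    for x, y in ((random_sparse(), random_sparse()) for _ in range(100))
)
print(ok)  # the 2s-RSS upper bound holds for every sampled pair
```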
Definition 3 (see [9]). A function $f$ is called 2s-restricted strongly convex (2s-RSC) with parameter $\ell_{2s} > 0$ if for any $x, y$ satisfying $\|x-y\|_0 \le 2s$, it holds that
$$f(y) \ge f(x) + \langle \nabla f(x), y-x\rangle + \frac{\ell_{2s}}{2}\|y-x\|^2.$$
Equivalently, for any such $x, y$ and any $\lambda \in [0,1]$, we have
$$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda) f(y) - \frac{\ell_{2s}}{2}\lambda(1-\lambda)\|x-y\|^2. \qquad (5)$$
In particular, if $\ell_{2s} = 0$ in (5), the function is called 2s-restricted convex (2s-RC).
Definition 4 (see [10]). The projected gradient of $f$ at $x$ is defined by
Lemma 1 (see [8]). For $\alpha > 0$, a vector $x^* \in S \cap \mathbb{R}^n_+$ is an $\alpha$-stationary point if and only if
$$x^* \in P_{S \cap \mathbb{R}^n_+}\bigl(x^* - \alpha \nabla f(x^*)\bigr).$$
In particular, when $\|x^*\|_0 < s$, this holds if and only if $\nabla_i f(x^*) = 0$ for $i \in \operatorname{supp}(x^*)$ and $\nabla_i f(x^*) \ge 0$ for $i \notin \operatorname{supp}(x^*)$.
When $\|x^*\|_0 = s$, it holds if and only if $\nabla_i f(x^*) = 0$ for $i \in \operatorname{supp}(x^*)$ and $\nabla_i f(x^*) \ge -\tfrac{1}{\alpha}\min_{j \in \operatorname{supp}(x^*)} x^*_j$ for $i \notin \operatorname{supp}(x^*)$.
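Lemma 1 reads $\alpha$-stationarity as a fixed point of the projected gradient step. A small numerical check under this reading (the helper names and the quadratic test instance are our own):

```python
import numpy as np

def project_sparse_nonneg(x, s):
    """Keep the s largest nonnegative entries; assumes s >= 1."""
    z = np.maximum(x, 0.0)
    if np.count_nonzero(z) > s:
        z[np.argsort(z)[:-s]] = 0.0
    return z

def is_alpha_stationary(x, grad_x, s, alpha, tol=1e-10):
    """Fixed-point test: x is alpha-stationary iff x belongs to
    P_{S ∩ R^n_+}(x - alpha * grad f(x))."""
    return np.linalg.norm(x - project_sparse_nonneg(x - alpha * grad_x, s)) <= tol

# f(x) = 0.5*||x - c||^2 with c = (1, -2, 0.5); over s = 1, the minimizer is (1, 0, 0).
c = np.array([1.0, -2.0, 0.5])
x_star = np.array([1.0, 0.0, 0.0])
print(is_alpha_stationary(x_star, x_star - c, 1, 0.5))  # True
print(is_alpha_stationary(np.array([0.0, 1.0, 0.0]),
                          np.array([0.0, 1.0, 0.0]) - c, 1, 0.5))  # False
```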
Lemma 2 (see ). .
Lemma 3 (see [8]). For any $x, y$, we have
where $e_i$ is the vector whose $i$th component is one and whose other components are zero.
3. Main Results
In this section, we establish a new algorithm that improves the IIHT algorithm for (1), and then we analyze its convergence properties. First, let us present the gradient projection algorithm with its new stepsize rule.
Algorithm 1.
Step 1. Initialize $x^0$, , and , and set $k = 0$.
Step 2. Compute , where
Step 3. Compute $x^{k+1} \in P_{S \cap \mathbb{R}^n_+}(x^k - \alpha_k \nabla f(x^k))$, where $\alpha_k$ satisfies .
Step 4. If , then stop; otherwise, set $k = k + 1$ and go to Step 2.
Next, let us list the following assumptions for convenience:
(1) For any $k$, .
(2) $f$ is bounded below on $S \cap \mathbb{R}^n_+$.
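As a hedged sketch of the overall scheme, the following code substitutes a standard Armijo backtracking search (as in the IIHT scheme of [8]) for the paper's new stepsize rule in Step 2; the projection, all function names, and the toy compressed sensing instance are our own assumptions:

```python
import numpy as np

def project_sparse_nonneg(x, s):
    """Keep the s largest nonnegative entries; assumes s >= 1."""
    z = np.maximum(x, 0.0)
    if np.count_nonzero(z) > s:
        z[np.argsort(z)[:-s]] = 0.0
    return z

def gradient_projection(f, grad, x0, s, alpha0=1.0, beta=0.5, sigma=1e-4,
                        tol=1e-8, max_iter=500):
    """Gradient projection for min f(x) s.t. ||x||_0 <= s, x >= 0.
    Armijo backtracking stands in for the paper's stepsize rule."""
    x = project_sparse_nonneg(np.asarray(x0, dtype=float), s)
    for _ in range(max_iter):
        g = grad(x)
        alpha = alpha0
        while True:  # backtrack until sufficient decrease
            x_new = project_sparse_nonneg(x - alpha * g, s)
            if f(x_new) <= f(x) - sigma * np.linalg.norm(x_new - x) ** 2:
                break
            alpha *= beta
            if alpha < 1e-12:
                return x  # no acceptable step; treat x as stationary
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# Toy compressed sensing instance: recover a nonnegative 3-sparse signal.
rng = np.random.default_rng(1)
A = rng.standard_normal((15, 30))
x_true = np.zeros(30)
x_true[[2, 7, 19]] = [1.0, 2.0, 0.5]
b = A @ x_true
f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)
x_hat = gradient_projection(f, grad, np.zeros(30), s=3)
```

With 15 Gaussian measurements of a 3-sparse signal, the iteration typically drives the residual near zero, but, like all hard-thresholding methods, it is only guaranteed to reach an $\alpha$-stationary point, not a global minimizer.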
Lemma 4. Let the sequence $\{x^k\}$ be generated by Algorithm 1, and set . Then, we have
where .
Proof. Let
Then,
Thus,
Hence, (11) holds.
Lemma 5. We suppose . For and , we havewhere .
Proof. Since , by the definition of projection, we getMoreover,Because is a constant independent of , we can getTherefore,By Lemma 4, we getHence,Let . We get
Lemma 6. Let the sequence $\{x^k\}$ be generated by Algorithm 1. Then:
(1)
(2) is an increasing sequence, and when , converges.
(3)
(4) For any , if , we have
Proof. (1) Since , we get
Setting in (15), item (1) is obtained.
(2) By (15), we easily see that is an increasing sequence. Moreover, by the assumption that , we get that converges.
(3) Let in (1). We get
Summing over both sides of this inequality, we get
Since is bounded below, we get
(4) This follows directly from (2).
Lemma 7. Let the sequence $\{x^k\}$ be generated by Algorithm 1. Suppose that the function $f$ is 2s-RC. Then, we have
Theorem 1. Let the sequence $\{x^k\}$ be generated by Algorithm 1. Then, the following results hold:
(1) Any accumulation point of the sequence is an $\alpha$-stationary point.
(2) If $f$ is 2s-RC, the projected gradient sequence converges to zero, i.e.,
Proof. (1) Suppose that $x^*$ is an accumulation point of the sequence $\{x^k\}$. Then, there exists a subsequence that converges to $x^*$. Because , we get
Moreover,
We consider the following two cases:
Case 1. For , there must exist a sufficiently large index and a constant such that By and (33), we can get Since without loss of generality, we can suppose . Let . We get i.e.,
Case 2. For , we consider two subcases.
Subcase 1. When , we get Due to the property of the projections and , we have Thus, Taking limits on both sides, we obtain
Subcase 2. When , suppose , and we have
For all sufficiently large , we have
Since , for all sufficiently large , we have
which contradicts . Thus, . Summarizing the two cases, we obtain
Thus, $x^*$ is an $\alpha$-stationary point of (1).
(2) Set . By Lemma 3, we have
By Definition 4, we have
Moreover, the maximum is attained at . For any , there exist and satisfying
Because and , we get
i.e.,
Thus, for any , we get
Taking , we get
By the Cauchy–Schwarz inequality, we get
i.e.,
By Lemma 7, we get
Taking limits on both sides and using Lemma 6, we have
By (32), we get
Theorem 2. Let the sequence $\{x^k\}$ be generated by Algorithm 1, and let $x^*$ be an accumulation point of the sequence. Suppose $f$ is 2s-RSC. Then, the following results hold:
(1) If $\|x^*\|_0 < s$, then $x^*$ is a global minimizer of (1).
(2) If $\|x^*\|_0 = s$, then $x^*$ is a local minimizer of (1).
Proof. (1) For , we have . Since $f$ is 2s-RSC, by Definition 3, we have
Because $x^*$ is an accumulation point of the sequence $\{x^k\}$, by Theorem 1, $x^*$ is an $\alpha$-stationary point. By Lemma 1, we get
Thus, $x^*$ is a global minimizer of (1).
(2) If , then . In fact, for all sufficiently large , taking , we get
For any , we have
Thus,
By and , we have
For any satisfying , we have . Since $f$ is 2s-RSC, by Definition 3, Theorem 1, and Lemma 1, we have
Thus, $x^*$ is a local minimizer of (1).
Theorem 3. Let the sequence $\{x^k\}$ be generated by Algorithm 1, and let $x^*$ be its limit. Suppose $f$ is 2s-RSC with parameter $\ell_{2s}$ and for all sufficiently large $k$. Then, we have
where .
Proof. By Theorem 2, we get . As $f$ is 2s-RSC with parameter $\ell_{2s}$, for any and , we have
Set . By Theorem 2, we get . For all sufficiently large , we have
For all sufficiently large , we have
Because , we get
Since and , we have . Thus,
Setting , we get
Thus,
By and , we get .
From and , we have .
Thus,
where . Hence, the sequence $\{x^k\}$ converges Q-linearly to $x^*$.
4. Conclusions

In this paper, we have been mainly concerned with the nonnegative sparsity-constrained optimization problem. We introduced a new stepsize rule and proposed a new gradient projection algorithm to solve this problem. The new algorithm removes the restricted strong smoothness condition on the objective function, which makes it more widely applicable. Meanwhile, we proved the convergence of the algorithm.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors' Contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Acknowledgments
This project was supported by the Natural Science Foundation of Shandong Province (no. ZR2018MA019).
- M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, Berlin, Germany, 2010.
- S. Bahmani, B. Raj, and P. Boufounos, “Greedy sparsity-constrained optimization,” The Journal of Machine Learning Research, vol. 14, pp. 807–841, 2013.
- A. Beck and Y. C. Eldar, “Sparsity constrained nonlinear optimization: optimality conditions and algorithms,” SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.
- A. Beck and N. Hallak, “On the minimization over sparse symmetric sets: projections, optimality conditions, and algorithms,” Mathematics of Operations Research, vol. 41, no. 1, pp. 196–223, 2016.
- Z. Lu, “Optimization over sparse symmetric sets via a non-monotone projected gradient method,” 2015, https://arxiv.org/abs/1509.08581.
- L.-L. Pan, N.-H. Xiu, and S.-L. Zhou, “On solutions of sparsity constrained optimization,” Journal of the Operations Research Society of China, vol. 3, no. 4, pp. 421–439, 2015.
- L. Pan, N. Xiu, and S. Zhou, “Gradient support projection algorithm for affine feasibility problem with sparsity and nonnegativity,” 2014, https://arxiv.org/abs/1406.7178.
- L. Pan, S. Zhou, N. Xiu, and H.-D. Qi, “A convergent iterative hard thresholding for nonnegative sparsity optimization,” Pacific Journal of Optimization, vol. 13, pp. 325–353, 2017.
- S. Negahban, P. Ravikumar, M. Wainwright, and B. Yu, “A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, December 2009.
- C. Wang and B. Qu, “Convergence of the gradient projection method with a new stepsize rule,” OR Transactions, vol. 6, pp. 36–44, 2002.
Copyright © 2020 Ye Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.