A Gradient Projection Algorithm with a New Stepsize for Nonnegative Sparsity-Constrained Optimization

Li, Ye; Sun, Jun; Qu, Biao

doi:https://doi.org/10.1155/2020/6489190

Mathematical Problems in Engineering

Special Issue

Fractional Mathematical Modelling and Optimal Control Problems of Differential Equations

Research Article | Open Access

Volume 2020 | Article ID 6489190 | https://doi.org/10.1155/2020/6489190

Ye Li,¹,² Jun Sun,³ and Biao Qu¹

Guest Editor: Chuanjun Chen

Received 26 May 2020

Revised 30 Jul 2020

Accepted 08 Aug 2020

Published 26 Aug 2020

Abstract

The nonnegative sparsity-constrained optimization problem arises in many fields, for example in linear compressed sensing and in minimizing the regularized logistic regression cost function. In this paper, we introduce a new stepsize rule and establish a gradient projection algorithm based on it. We also obtain convergence results under milder conditions.

1. Introduction

In this paper, we are mainly concerned with the nonnegative sparsity-constrained optimization problem (NN-SCO):

$$\min f(x) \quad \text{s.t.} \quad x \in S \cap \mathbb{R}^n_+, \tag{1}$$

where $f:\mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function that is bounded below, $S = \{x \in \mathbb{R}^n : \|x\|_0 \le s\}$ is a sparse set, $s$ is a given integer regulating the sparsity level in $x$, and $\mathbb{R}^n_+$ is the nonnegative orthant in $\mathbb{R}^n$. Here $\|x\|_0$ is the $\ell_0$ norm of $x$, counting the number of nonzero elements in $x$. Many application problems can be cast as problem (1), such as the widely studied linear compressed sensing problem with $f(x) = \frac{1}{2}\|Ax - b\|^2$, where $A$ is a sensing matrix, $b$ is the observation vector, and $\|\cdot\|$ is the Euclidean norm [1]. Problem (1) has also been applied to the regularized logistic regression cost function [2].
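As an illustration of the problem data, the following minimal sketch evaluates a least-squares objective and checks feasibility for the constraint "nonnegative with at most s nonzero entries". It assumes the compressed sensing objective has the form 0.5 * ||Ax - b||^2; the function names and the toy data are hypothetical and do not come from the paper.

```python
def l0_norm(x):
    """Count the nonzero entries of x."""
    return sum(1 for xi in x if xi != 0)


def is_feasible(x, s):
    """Check that x is nonnegative and has at most s nonzero entries."""
    return all(xi >= 0 for xi in x) and l0_norm(x) <= s


def least_squares_objective(A, b, x):
    """Return 0.5 * ||A x - b||^2, with A given as a list of rows (assumed form)."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual)


# Hypothetical toy data: 2 observations, 3 unknowns, sparsity level s = 1.
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
b = [2.0, 1.0]
x = [0.0, 0.0, 1.0]

print(is_feasible(x, s=1))               # feasibility of x for s = 1
print(least_squares_objective(A, b, x))  # objective value at x
```

Here the third column of A reproduces b exactly, so a 1-sparse nonnegative vector fits the observations with zero residual.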

Recently, a great deal of work has been devoted to algorithms for sparsity-constrained optimization problems. Beck and Eldar [3] established the IHT algorithm, which converges to an L-stationary point under Lipschitz continuity of the gradient of the objective function. Beck and Hallak [4] generalized these results to sparse symmetric sets. Lu [5] designed a nonmonotone algorithm for problems with symmetric set constraints. Pan, Xiu, and Zhou [6, 7] introduced the notions of B-stationarity, C-stationarity, and $\alpha$-stationarity based on the Bouligand tangent cone and the Clarke tangent cone. More recently, Pan, Zhou, and Xiu [8] established the improved IHT algorithm (IIHT) for problem (1) by using an Armijo line search. They proved that any accumulation point of the iterates is an $\alpha$-stationary point under restricted strong smoothness of the objective function, which is weaker than Lipschitz continuity of the gradient.

Inspired by the works above, in this paper we establish a gradient projection algorithm with a new stepsize. The new algorithm does not require restricted strong smoothness of the objective function, which makes it more widely applicable. We also prove the convergence of the algorithm.

The rest of this paper is organized as follows. In Section 2, we present some notation, definitions, and lemmas. In Section 3, we give the algorithm for problem (1) and prove its convergence properties.

2. Preliminaries

2.1. Notations

For ease of reading, we list the notation used in this paper:

2.2. Definitions

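Definition 1 (see [8]). Let $x^*$ be a given feasible point of (1). We say that $x^*$ is an $\alpha$-stationary point if there exists $\alpha > 0$ such that

$$x^* \in P_{S \cap \mathbb{R}^n_+}\left(x^* - \alpha \nabla f(x^*)\right), \tag{2}$$

where $P_{S \cap \mathbb{R}^n_+}$ denotes the projection onto $S \cap \mathbb{R}^n_+$.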
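The stationarity condition can be checked numerically. The sketch below assumes the form of $\alpha$-stationarity used in [8], namely that a feasible point x belongs to the projection of x - alpha * grad f(x) onto the set of nonnegative vectors with at most s nonzero entries. One such projection is obtained by clipping negative entries to zero and keeping the s largest remaining entries. Because the projection can be set-valued when entries tie, membership is tested by comparing distances rather than by comparing vectors. All function names and the toy objective are hypothetical.

```python
def project_nonneg_sparse(y, s):
    """Return one Euclidean projection of y onto {z >= 0, ||z||_0 <= s}."""
    clipped = [max(yi, 0.0) for yi in y]
    # Indices sorted by clipped value, largest first (ties keep the lower index).
    order = sorted(range(len(y)), key=lambda i: clipped[i], reverse=True)
    keep = set(order[:s])
    return [clipped[i] if i in keep else 0.0 for i in range(len(y))]


def is_alpha_stationary(x, grad, alpha, s, tol=1e-10):
    """Test whether x is a projection of x - alpha * grad onto the feasible set."""
    feasible = all(xi >= 0 for xi in x) and sum(1 for xi in x if xi != 0) <= s
    y = [xi - alpha * gi for xi, gi in zip(x, grad)]
    p = project_nonneg_sparse(y, s)
    dist_x = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    dist_p = sum((pi - yi) ** 2 for pi, yi in zip(p, y))
    # x is a projection of y exactly when it is feasible and as close to y as p is.
    return feasible and dist_x <= dist_p + tol


# Hypothetical objective f(x) = 0.5 * ||x - c||^2, so grad f(x) = x - c.
c = [3.0, -1.0, 2.0]
x = [3.0, 0.0, 0.0]
g = [xi - ci for xi, ci in zip(x, c)]

print(is_alpha_stationary(x, g, alpha=0.5, s=1))
print(is_alpha_stationary(x, g, alpha=2.0, s=1))
```

In this toy case the same point passes the test for alpha = 0.5 and fails it for alpha = 2.0, which shows that the condition depends on the chosen alpha.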

Definition 2 (see [9]). A function is called 2s-restricted strongly smooth (2s-RSS) with parameter , and if for any satisfying , it holds that

Definition 3 (see [9]). A function is called 2s-restricted strongly convex (2s-RSC) with parameter , and if for any satisfying , it holds that if and only if for any and , we have In particular, in (5), if , the function is called 2s-restricted convex (2s-RC).

Definition 4 (see [10]). The projected gradient of is defined by

2.3. Lemmas

Lemma 1 (see [8]). For , a vector is an $\alpha$-stationary point if and only if

In particular, when , .

When , .

Lemma 2 (see [8]). .

Lemma 3 (see [8]). For any , we have where $e_i$ is the vector whose $i$th component is one and whose other components are zero.

3. Main Results

In this section, we present a new algorithm that improves on the IIHT algorithm for problem (1), and then we analyze its convergence properties. First, we develop the gradient projection algorithm with a new stepsize rule.

Algorithm 1.

Step 1. Initialize , , and , and set.

Step 2. Compute , where

Step 3. Compute where satisfies .

Step 4. If , then stop; otherwise, set and go to Step 2.

Next, we list the following assumptions for convenience:

(1) For any ,

(2) is bounded below on
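The overall shape of a gradient projection iteration for problem (1) can be sketched as follows. This is not the paper's Algorithm 1: the new stepsize rule is not reproduced here, and a constant stepsize alpha is swapped in purely for illustration. The stopping test on the step length, the function names, and the toy objective are likewise assumptions.

```python
def project_nonneg_sparse(y, s):
    """Return one Euclidean projection of y onto {z >= 0, ||z||_0 <= s}."""
    clipped = [max(yi, 0.0) for yi in y]
    order = sorted(range(len(y)), key=lambda i: clipped[i], reverse=True)
    keep = set(order[:s])
    return [clipped[i] if i in keep else 0.0 for i in range(len(y))]


def gradient_projection(grad, x0, s, alpha, eps=1e-8, max_iter=1000):
    """Iterate x <- P(x - alpha * grad(x)) with a constant stepsize (assumed)."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        x_new = project_nonneg_sparse(
            [xi - alpha * gi for xi, gi in zip(x, g)], s)
        step = sum((a - b) ** 2 for a, b in zip(x_new, x)) ** 0.5
        x = x_new
        if step <= eps:  # assumed stopping test: the iterate barely moved
            break
    return x


# Hypothetical objective f(x) = 0.5 * ||x - c||^2, so grad f(x) = x - c.
c = [3.0, -1.0, 2.0]


def grad(x):
    return [xi - ci for xi, ci in zip(x, c)]


x_final = gradient_projection(grad, [0.0, 0.0, 0.0], s=1, alpha=0.5)
print(x_final)  # approaches the 1-sparse nonnegative point closest to c
```

For this toy objective the best 1-sparse nonnegative point keeps only the largest positive entry of c, and the iterates approach it by halving the remaining error at each step.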

Lemma 4. Let the sequence be generated by Algorithm 1, and set . Then, we have where .

Proof. Let Then, Thus, Hence, (11) holds.

Lemma 5. Suppose . For and , we have where .

Proof. Since , by the definition of projection, we get Moreover, Because is a constant independent of , we get Therefore, By Lemma 4, we get Hence, Let . We get

Lemma 6. Let the sequence be generated by Algorithm 1. Then,

(1)

(2) is an increasing sequence, and when , converges

(3)

(4) for any , if , we have

Proof.

(1) Since , we get Setting in (15), formula (1) is obtained.

(2) It follows easily from (15) that is an increasing sequence. Moreover, by the assumptions , we get that converges.

(3) Let in (1). We get Summing both sides of this inequality, we get Since is bounded below, we get

(4) This follows directly from (2).

Lemma 7. Let the sequence be generated by Algorithm 1. Suppose that the function is 2s-RC. We have

Proof. Because the sequence is generated by Algorithm 1, we get . By Lemmas 4 and 5 in [8], we get

Theorem 1. Let the sequence be generated by Algorithm 1. Then, the following results hold:

(1) Any accumulation point of the sequence is an $\alpha$-stationary point.

(2) If is 2s-RC, the projected gradient sequence converges to zero, i.e.,

Proof. (1) Suppose that is an accumulation point of the sequence . Then, there exists a subsequence that converges to . Because we get Moreover, We consider the following two cases:

Case 1. For , there must exist a sufficiently large index and a constant such that By and (33), we can get Since without loss of generality, we can suppose . Let . We get i.e.,

Case 2. For , we consider two subcases.

Subcase 1. When , we get Due to the property of the projections and , we have Thus, Taking limits on both sides, we obtain

Subcase 2. When , suppose , and we have For all sufficiently large , we have Since , for all sufficiently large , we have which contradicts . Thus, . Summarizing the two cases, we obtain Thus, is an $\alpha$-stationary point of (1).

(2) Set . By Lemma 3, we have By Definition 4, we have Moreover, the maximum value is taken at . For any , there exists and satisfies Because and , we get i.e., Thus, for any , we get Taking , we get By the Cauchy–Schwarz inequality, we get i.e., By Lemma 7, we get Taking limits on both sides and using Lemma 6, we have By (32), we get

Theorem 2. Let the sequence be generated by Algorithm 1, and let be an accumulation point of the sequence . Suppose is . Then the following results hold:

(1) If , then is a global minimizer of (1).

(2) If , then is a local minimizer of (1).

Proof. (1) For , we have . Since is , by Definition 3, we have Because is an accumulation point of the sequence , by Theorem 1, is an $\alpha$-stationary point. By Lemma 1, we get Thus, is a global minimizer of (1).

(2) If , then . In fact, for all sufficiently large , taking , we get For any , we have Thus, By and , we have For any satisfying , we have . Since is , by Definition 3, Theorem 1, and Lemma 1, we have Thus, is a local minimizer of (1).

Theorem 3. Let the sequence be generated by Algorithm 1, and let be the limit of the sequence . Suppose is with parameter and for all sufficiently large , and we have where .

Proof. By Theorem 2, we get . As is with parameter , for any and , we have Set . By Theorem 2, we get . For all sufficiently large , we have For all sufficiently large , we have Because , we get Since and , we have . Thus, Setting , we get Thus, By and we get . From and , we have . Thus, where . Thus, the sequence converges Q-linearly to .
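Q-linear convergence of a nonnegative error sequence e_k means that e_{k+1} <= rho * e_k for some constant rho in (0, 1) and all sufficiently large k. The sketch below estimates this ratio from a recorded error sequence. Which quantity the theorem bounds (for example an objective gap or a distance to the limit) is not specified in this sketch, so the error values are hypothetical and the helper name is an assumption.

```python
def q_linear_ratios(errors):
    """Return successive ratios e_{k+1} / e_k, skipping zero denominators."""
    return [e_next / e for e, e_next in zip(errors, errors[1:]) if e > 0]


# Hypothetical error sequence that halves at every iteration.
errors = [3.0, 1.5, 0.75, 0.375]
ratios = q_linear_ratios(errors)
print(ratios)
print(max(ratios) < 1.0)  # ratios bounded below 1 are consistent with Q-linear decay
```

Ratios that settle at a value strictly below one are consistent with a Q-linear rate, while ratios that drift toward one suggest sublinear behavior.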

4. Conclusions

In this paper, we studied the nonnegative sparsity-constrained optimization problem. We introduced a new stepsize rule and proposed a new gradient projection algorithm to solve this problem. The new algorithm does not require restricted strong smoothness of the objective function, which makes it more widely applicable. We also proved the convergence of the algorithm.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

Acknowledgments

This project was supported by the Natural Science Foundation of Shandong Province (no. ZR2018MA019).

References

[1] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, Berlin, Germany, 2010.
[2] S. Bahmani, B. Raj, and P. Boufounos, “Greedy sparsity-constrained optimization,” The Journal of Machine Learning Research, vol. 14, pp. 807–841, 2013.
[3] A. Beck and Y. C. Eldar, “Sparsity constrained nonlinear optimization: optimality conditions and algorithms,” SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.
[4] A. Beck and N. Hallak, “On the minimization over sparse symmetric sets: projections, optimality conditions, and algorithms,” Mathematics of Operations Research, vol. 41, no. 1, pp. 196–223, 2016.
[5] Z. Lu, “Optimization over sparse symmetric sets via a non-monotone projected gradient method,” 2015, https://arxiv.org/abs/1509.08581.
[6] L.-L. Pan, N.-H. Xiu, and S.-L. Zhou, “On solutions of sparsity constrained optimization,” Journal of the Operations Research Society of China, vol. 3, no. 4, pp. 421–439, 2015.
[7] L. Pan, N. Xiu, and S. Zhou, “Gradient support projection algorithm for affine feasibility problem with sparsity and nonnegativity,” 2014, https://arxiv.org/abs/1406.7178.
[8] L. Pan, S. Zhou, N. Xiu, and H.-D. Qi, “A convergent iterative hard thresholding for nonnegative sparsity optimization,” Pacific Journal of Optimization, vol. 13, pp. 325–353, 2017.
[9] S. Negahban, P. Ravikumar, M. Wainwright, and B. Yu, “A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, December 2009.
[10] C. Wang and B. Qu, “Convergence of the gradient projection method with a new stepsize rule,” OR Transactions, vol. 6, pp. 36–44, 2002.

Copyright

Copyright © 2020 Ye Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
