Abstract
The nonnegative sparsity-constrained optimization problem arises in many fields, for example in linear compressed sensing and in regularized logistic regression. In this paper, we introduce a new stepsize rule and establish a gradient projection algorithm based on it. We also establish convergence results without requiring restricted strong smoothness of the objective function.
1. Introduction
In this paper, we are mainly concerned with the nonnegative sparsity-constrained optimization problem (NN-SCO):
$$\min f(x) \quad \text{s.t.} \quad x \in S \cap \mathbb{R}^n_+, \tag{1}$$
where $f:\mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function with a lower bound, $S = \{x \in \mathbb{R}^n : \|x\|_0 \le s\}$ is a sparse set, $s$ is a given integer regulating the sparsity level in $x$, and $\mathbb{R}^n_+$ is the nonnegative orthant in $\mathbb{R}^n$. Here $\|x\|_0$ is the $\ell_0$ norm of $x$, counting the number of nonzero elements in $x$. Many application problems can be cast as problem (1), such as the widely studied linear compressed sensing problem with $f(x) = \|Ax - b\|^2$, where $A \in \mathbb{R}^{m \times n}$ is a sensing matrix, $b \in \mathbb{R}^m$ is the observation vector, and $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^m$ [1]. Problem (1) has also been applied to the regularized logistic regression cost function [2].
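To make the constraint set concrete, the following minimal sketch checks feasibility for NN-SCO and evaluates a least-squares objective. The function names and the small data set are hypothetical, and taking the squared Euclidean residual as the compressed sensing objective is an assumption made for illustration.

```python
def l0_norm(x):
    """Number of nonzero entries of x."""
    return sum(1 for v in x if v != 0)


def is_feasible(x, s):
    """True if x is nonnegative and has at most s nonzero entries."""
    return all(v >= 0 for v in x) and l0_norm(x) <= s


def least_squares(A, b, x):
    """Squared Euclidean residual ||Ax - b||^2 (assumed objective)."""
    residual = [sum(a * v for a, v in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return sum(r * r for r in residual)


# Hypothetical 2 x 3 sensing matrix and observation vector.
A = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
b = [2.0, 1.0]
x = [0.0, 0.0, 1.0]  # nonnegative, one nonzero entry

print(is_feasible(x, s=1), least_squares(A, b, x))
```

Here the candidate point reproduces the observations exactly, so it is both feasible for sparsity level 1 and a minimizer of this particular objective.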
Recently, a great deal of work has been devoted to algorithms for sparsity-constrained optimization problems. Beck and Eldar [3] established the iterative hard thresholding (IHT) algorithm, which converges to an L-stationary point under Lipschitz continuity of the gradient of the objective function. Beck and Hallak [4] generalized these results to sparse symmetric sets. Lu [5] designed a nonmonotone algorithm for problems with symmetric set constraints. Pan, Xiu, and Zhou [6, 7] established B-stationarity, C-stationarity, and $\alpha$-stationarity based on the Bouligand tangent cone and the Clarke tangent cone. More recently, Pan, Zhou, and Xiu [8] established the improved IHT algorithm (IIHT) for problem (1) by using an Armijo line search. They proved that any accumulation point of the generated sequence is an $\alpha$-stationary point under restricted strong smoothness of the objective function, which is weaker than Lipschitz continuity of the gradient.
Inspired by the above works, in this paper we establish a gradient projection algorithm with a new stepsize rule. The new algorithm does not require restricted strong smoothness of the objective function, which makes it more widely applicable. We also prove the convergence of the algorithm.
The rest of this paper is organized as follows. In Section 2, we present some notation, definitions, and lemmas. In Section 3, we give the algorithm for problem (1) and prove its convergence properties. Section 4 concludes the paper.
2. Preliminaries
2.1. Notations
For ease of reading, we list the notation used in this paper as follows:
2.2. Definitions
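Definition 1 (see [8]). Let $x^*$ be a given feasible point of (1). We say that $x^*$ is an $\alpha$-stationary point if there exists $\alpha > 0$ such that
$$x^* \in P_{S \cap \mathbb{R}^n_+}\left(x^* - \alpha \nabla f(x^*)\right), \tag{2}$$
where $P_{S \cap \mathbb{R}^n_+}$ denotes the projection onto the feasible set of (1).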
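The fixed-point characterization of stationarity can be tested numerically. The sketch below assumes the usual form of the condition, namely that the point belongs to the projection of a gradient step onto the feasible set, and it assumes that this projection keeps the s largest positive entries and zeroes the rest. The helper names and the test objective $f(x) = \|x - c\|^2$ are hypothetical. Because the projection can be set-valued when entries tie, membership is tested by comparing distances instead of comparing vectors.

```python
def project(z, s):
    """One projection of z onto {x >= 0, ||x||_0 <= s}:
    clip negatives to zero, then keep the s largest entries."""
    clipped = [max(v, 0.0) for v in z]
    keep = sorted(range(len(z)), key=lambda i: clipped[i], reverse=True)[:s]
    out = [0.0] * len(z)
    for i in keep:
        out[i] = clipped[i]
    return out


def is_alpha_stationary(x, grad, alpha, s, tol=1e-9):
    """Check whether feasible x is as close to z = x - alpha * grad
    as a projection of z is, i.e. whether x lies in the projection set."""
    if any(v < 0 for v in x) or sum(1 for v in x if v != 0) > s:
        return False
    z = [xi - alpha * gi for xi, gi in zip(x, grad)]
    p = project(z, s)
    dist_x = sum((a - b) ** 2 for a, b in zip(x, z))
    dist_p = sum((a - b) ** 2 for a, b in zip(p, z))
    return dist_x <= dist_p + tol


# Hypothetical objective f(x) = ||x - c||^2 with c = (3, -1, 2), s = 1.
c = [3.0, -1.0, 2.0]
x = [0.0, 0.0, 2.0]
grad = [2.0 * (xi - ci) for xi, ci in zip(x, c)]

small = is_alpha_stationary(x, grad, alpha=0.1, s=1)
large = is_alpha_stationary(x, grad, alpha=0.5, s=1)
print(small, large)
```

In this toy example the point (0, 0, 2) passes the test for the small stepsize but fails it for the larger one, which illustrates that the condition depends on the value of the parameter.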
Definition 2 (see [9]). A function is called 2s-restricted strongly smooth (2s-RSS) with parameter , and if for any satisfying , it holds that
Definition 3 (see [9]). A function is called 2s-restricted strongly convex (2s-RSC) with parameter , and if for any satisfying , it holds thatIf and only if for any and , we haveIn particular, in (5), if , the function is called 2s-restricted convex (2s-RC).
Definition 4 (see [10]). The projected gradient of is defined by
2.3. Lemmas
Lemma 1 (see [8]). For, vector is an -stationary point if and only if
In particular, when , .
When , .
Lemma 2 (see [8]). .
Lemma 3 (see [8]). For any , we havewhere is a vector whose th component is one and others are zeros.
3. Main Results
In this section, we establish a new algorithm that improves on the IIHT algorithm for problem (1), and then we analyze its convergence properties. First, we develop the gradient projection algorithm with a new stepsize rule.
Algorithm 1.
Step 1. Initialize , , and , and set.
Step 2. Compute , where
Step 3. Compute where satisfies .
Step 4. If , then stop; otherwise, set and go to Step 2.
Next, let us list the following assumptions for convenience:
(1) For any ,
(2) is bounded below on
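The overall shape of a gradient projection iteration for problem (1) can be sketched as follows. This is not the stepsize rule of Algorithm 1: the sketch substitutes a constant stepsize `alpha`, and the stopping test on the distance between successive iterates, the function names, and the test objective are all assumptions made for illustration.

```python
def project(z, s):
    """One projection of z onto {x >= 0, ||x||_0 <= s}:
    clip negatives to zero, then keep the s largest entries."""
    clipped = [max(v, 0.0) for v in z]
    keep = sorted(range(len(z)), key=lambda i: clipped[i], reverse=True)[:s]
    out = [0.0] * len(z)
    for i in keep:
        out[i] = clipped[i]
    return out


def gradient_projection(grad_f, x0, s, alpha, eps=1e-8, max_iter=1000):
    """Projected gradient loop with a constant stepsize (assumed rule)."""
    x = project(x0, s)
    for _ in range(max_iter):
        g = grad_f(x)
        x_new = project([xi - alpha * gi for xi, gi in zip(x, g)], s)
        # Assumed stopping test: successive iterates are close.
        step = sum((a - b) ** 2 for a, b in zip(x_new, x)) ** 0.5
        x = x_new
        if step <= eps:
            break
    return x


# Hypothetical objective f(x) = ||x - c||^2 with gradient 2 (x - c).
c = [3.0, -1.0, 2.0]


def grad_f(x):
    return [2.0 * (xi - ci) for xi, ci in zip(x, c)]


solution = gradient_projection(grad_f, x0=[0.0, 0.0, 0.0], s=1, alpha=0.5)
print(solution)
```

For this separable objective the best nonnegative vector with one nonzero entry keeps the largest positive component of c, which is the point the loop returns.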
Lemma 4. Let the sequence be generated by Algorithm 1, and set . Then, we havewhere .
Proof. Let Then, Thus, Hence, (11) holds.
Lemma 5. We suppose . For and , we havewhere .
Proof. Since , by the definition of projection, we getMoreover,Because is a constant independent of , we can getTherefore,By Lemma 4, we getHence,Let . We get
Lemma 6. Let the sequence be generated by Algorithm 1. Then,(1)(2) is an increasing sequence, and when , converges(3)(4)for any , if , we have
Proof. (1) Since , we get Setting in (15), we obtain (1).
(2) It follows easily from (15) that is an increasing sequence. Moreover, by the assumptions , we can get that converges.
(3) Let in (1). We can get Summing both sides of this inequality, we get Since is bounded below, we get
(4) This follows easily from (2).
Lemma 7. Let the sequence be generated by Algorithm 1. Suppose that the function is 2s-RC. We have
Proof. Because the sequence is generated by Algorithm 1, we get . By Lemmas 4 and 5 in [8], we can get
Theorem 1. Let the sequence $\{x^k\}$ be generated by Algorithm 1. Then, the following results hold:
(1) Any accumulation point of the sequence $\{x^k\}$ is an $\alpha$-stationary point.
(2) If $f$ is 2s-RC, the projected gradient sequence converges to zero, i.e.,
Proof. (1) Suppose that is an accumulation point of the sequence . Then, there exists a subsequence converging to . Because we get Moreover, We consider the following two cases:
Case 1. For , there must exist a sufficiently large index and a constant such that By and (33), we can get Since without loss of generality, we can suppose . Let . We get i.e.,
Case 2. For , we consider two subcases.
Subcase 1. When , we get Due to the property of the projections and , we have Thus, Taking limits on both sides, we obtain
Subcase 2. When , suppose , and we have For all sufficiently large , we have Since , for all sufficiently large , we have which contradicts . Thus, . Summarizing the two cases, we obtain Thus, is an $\alpha$-stationary point of (1).
(2) Set . By Lemma 3, we have By Definition 4, we have Moreover, the maximum value is attained at . For any , there exists and satisfies Because and , we get i.e., Thus, for any , we get Taking , we get By the Cauchy–Schwarz inequality, we get i.e., By Lemma 7, we get Taking limits on both sides and using Lemma 6, we have By (32), we get
Theorem 2. Let the sequence be generated by Algorithm 1, and let be an accumulation point of the sequence . Suppose is . Then the following results hold:
(1) If , then is a global minimizer of (1).
(2) If , then is a local minimizer of (1).
Proof. (1) For , we have . Since is , by Definition 3, we have Because is an accumulation point of the sequence , by Theorem 1, is an $\alpha$-stationary point. By Lemma 1, we can get Thus, is a global minimizer of (1).
(2) If , then . In fact, for all sufficiently large , taking , we get For any , we have Thus, By and , we have For any satisfying , we have . Since is , by Definition 3, Theorem 1, and Lemma 1, we have Thus, is a local minimizer of (1).
Theorem 3. Let the sequence be generated by Algorithm 1. is a limit of the sequence . Suppose is with parameter and for all sufficiently large , and we havewhere .
Proof. By Theorem 2, we get . As is with parameter , for any and , we haveSet . By Theorem 2, we get . For all sufficiently large , we haveFor all sufficiently large , we haveBecause , we getSince and , we have . Thus,Setting , we getThus,By and we get .
From and , we have .
Thus, where . Hence, the sequence $\{x^k\}$ converges Q-linearly to $x^*$.
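Q-linear convergence means that, for all sufficiently large iteration indices, the error at the next iterate is at most a fixed fraction, strictly below one, of the current error. A minimal sketch for inspecting this numerically is given below; the function name and the synthetic iterates are hypothetical and are not produced by Algorithm 1.

```python
import math


def q_linear_ratios(iterates, x_star):
    """Ratios ||x_{k+1} - x*|| / ||x_k - x*|| along a sequence of iterates."""
    errors = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_star)))
              for x in iterates]
    return [e_next / e for e, e_next in zip(errors, errors[1:]) if e > 0]


# Synthetic iterates whose error halves at every step.
x_star = [0.0, 0.0]
iterates = [[1.0, 0.0], [0.5, 0.0], [0.25, 0.0]]

ratios = q_linear_ratios(iterates, x_star)
print(ratios)
```

Ratios that stay below a constant smaller than one from some index onward are consistent with Q-linear convergence, although a finite run can only suggest the rate and cannot prove it.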
4. Conclusions
In this paper, we study the nonnegative sparsity-constrained optimization problem. We introduce a new stepsize rule and propose a new gradient projection algorithm to solve this problem. The new algorithm does not require restricted strong smoothness of the objective function, which makes it more widely applicable. We also prove the convergence of the algorithm.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Acknowledgments
This project was supported by the Natural Science Foundation of Shandong Province (no. ZR2018MA019).