Nonlinear Analysis: Algorithm, Convergence, and Applications 2014
A Cutting Plane and Level Stabilization Bundle Method with Inexact Data for Minimizing Nonsmooth Nonconvex Functions
Under the condition that the values of the objective function and its subgradient are computed only approximately, we introduce a cutting plane and level bundle method for minimizing nonsmooth nonconvex functions by combining the cutting plane method with the ideas of proximity control and level constraint. The proposed algorithm constructs both a lower and an upper polyhedral approximation model of the objective function and calculates new iterates by solving a subproblem in which the model appears not only in the objective function but also in the constraints. Compared with other proximal bundle methods, the new variant updates a lower bound of the optimal value, which provides an additional useful stopping test based on the optimality gap. Another merit is that our algorithm distinguishes between affine pieces that exhibit a convex or a concave behavior relative to the current iterate. Convergence to a point satisfying an approximate stationarity condition is proved under looser conditions than those required in related work.
The family of bundle methods is based on the cutting plane method, first described in [1, 2], where convexity of the objective function is the fundamental assumption. The extension of bundle methods to the nonconvex case is not straightforward; however, a number of ideas valid in the convex framework are also valuable in the treatment of the nonconvex case. In , by combining the cutting plane method with proximity control, the authors propose an iterative algorithm for minimizing nonsmooth nonconvex functions that distinguishes between affine pieces exhibiting a convex or a concave behavior relative to the current point of the iterative procedure; however, exact values of the objective function and its subgradient are required.
The level bundle method is another approach, which is easier to implement and has shown encouraging numerical results [4, 5]. Some evidence shows that constrained level bundle methods are preferable under certain conditions. Also, strategies for updating the level parameter are readily available .
In various real-world applications, the objective function and/or its subgradient can be costly (sometimes impossible) to compute. This is particularly true when the objective function is itself given by some optimization problem; for example, . In such situations, approximate values must be used. Various inexact bundle methods that use approximate function values and subgradient evaluations have been studied .
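To make the notion of an inexact oracle concrete, the following minimal sketch uses a hypothetical toy function that is not taken from the paper: f(x) = max over u in [0, 1] of (u*x - u^2), with the inner maximization carried out only on a coarse grid. The names `inexact_max_oracle`, `exact_value`, and `grid_size` are assumptions made for illustration. For 0 <= x <= 2 the exact maximizer is u = x/2, so with m grid intervals the value error is at most 1/(4 m^2).

```python
from typing import Tuple


def inexact_max_oracle(x: float, grid_size: int = 10) -> Tuple[float, float]:
    """Approximate f(x) = max_{u in [0, 1]} (u*x - u**2) on a uniform grid.

    Returns (approximate value, approximate subgradient). Restricting the
    maximization to a grid can only underestimate the true maximum.
    """
    best_u, best_val = 0.0, 0.0  # u = 0 gives the value 0
    for j in range(1, grid_size + 1):
        u = j / grid_size
        val = u * x - u * u
        if val > best_val:
            best_u, best_val = u, val
    # For fixed u the inner function is linear in x with slope u, so the
    # slope at the approximate maximizer is used as approximate subgradient.
    return best_val, best_u


def exact_value(x: float) -> float:
    """Closed form of the toy function for 0 <= x <= 2 (maximizer u = x/2)."""
    return x * x / 4.0
```

At x = 0.73 the grid maximizer is u = 0.4, whereas the exact derivative is 0.365; both the value and the subgradient are therefore only approximate, with errors controlled by the grid spacing.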
In this work, we combine the level bundle method with the cutting plane method and the idea of proximal control, employing inexact objective function information, to minimize a nonconvex function. The proposed algorithm distinguishes between affine pieces that exhibit a convex or a concave behavior relative to the current iteration point, so that the downward shifting of the affine pieces is not arbitrary. The rest of this paper is organized as follows. Section 2 gives some preliminary results and introduces the basic idea of the proposed algorithm. Section 3 formally describes the cutting plane and level stabilization bundle method with inexact data for minimizing nonsmooth nonconvex functions. Section 4 is devoted to the convergence analysis of the algorithm under looser conditions than . Section 5 contains some conclusions on the proposed method.
2. The Construction of the Model
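In this paper, we consider the following unconstrained minimization problem:
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is not necessarily differentiable. We assume that $f$ is locally Lipschitz; then $f$ is differentiable almost everywhere. It is well known that, in , under the above hypotheses, there is defined at each point $x$ the generalized gradient (Clarke's gradient):
$$\partial f(x) = \operatorname{conv}\{ g \in \mathbb{R}^n : \nabla f(x_k) \to g,\ x_k \to x,\ x_k \notin \Omega_f \}, \tag{2}$$
where $\Omega_f$ is the set where $f$ is not differentiable, and $\partial f(\cdot)$ is locally bounded. An extension of the generalized gradient is the Goldstein $\epsilon$-subdifferential, defined as
$$\partial_\epsilon^G f(x) = \operatorname{conv}\{ \partial f(y) : \|y - x\| \le \epsilon \}, \tag{3}$$
which was first described by Goldstein in . The Goldstein $\epsilon$-subdifferential is both outer and inner semicontinuous as a multifunction of $x$ and $\epsilon$.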
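As a small illustration of how the Goldstein subdifferential enlarges Clarke's gradient, the sketch below evaluates both for the one-dimensional function |x|, chosen here only as a familiar example. The function names and the interval representation `(lower, upper)` are assumptions made for this illustration.

```python
from typing import Tuple

Interval = Tuple[float, float]  # closed interval [lower, upper]


def clarke_abs(x: float) -> Interval:
    """Clarke's generalized gradient of |x|, returned as an interval."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)  # at the kink: convex hull of the slopes -1 and +1


def goldstein_abs(x: float, eps: float) -> Interval:
    """Goldstein eps-subdifferential of |x|: the convex hull of the Clarke
    gradients over all points y with |y - x| <= eps."""
    if x - eps <= 0.0 <= x + eps:
        # The ball reaches the kink at 0, so both slopes are collected.
        return (-1.0, 1.0)
    # Otherwise every point in the ball has the same sign as x.
    return clarke_abs(x)
```

For instance, at x = 0.05 Clarke's gradient is the single slope 1, while the Goldstein subdifferential with eps = 0.1 is already the full interval [-1, 1], which signals the nearby kink.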
Throughout the paper, we assume that, given , the oracle provides some approximate values and of the objective function and its subgradient, respectively, such that where and are some unknown errors. Specifically, is uniformly bounded; that is, there exists such that Let be the current stability center; the bundle of available information is the set of elements: where is the linearization error defined by and . With this information, we create the linearization At iteration , a polyhedral cutting plane model of is available: Suppose there exists such that , , and is uniformly bounded; that is, there exists such that , . We recall that the classical cutting plane methods [1, 2] minimize at each iteration, and the minimization of can be written in linear programming form: which is equivalent to solving where . We divide the set into two sets and defined as follows: We observe that is never empty since . By using and , we define two piecewise affine functions: In fact, can be regarded as an approximation of the difference function . Because and , thus ; it is reasonable to consider the approximation significant as far as . Therefore we introduce a kind of trust model: Let be a nonnegative scalar representing how much we aim to reduce the value at the current iteration. Define the corresponding level parameter . Since , . Then the level set associated with and is given by We obtain the search direction by solving the following convex quadratic subproblem, parameterized in the nonnegative scalar , where the first two constraints ensure and the last constraint represents the idea of level bundle method: Observe that since is feasible. We have consequently that the optimal value cannot be positive.
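The bundle, the linearization errors, and the polyhedral cutting plane model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact construction: the linearization error is taken in its usual form (value at the stability center minus the value of the linearization there), the model is the center value plus the maximum of the shifted affine pieces, and the two index sets are split by the sign of the linearization error. All names (`BundleElement`, `split_bundle`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


@dataclass
class BundleElement:
    point: Vector    # trial point
    value: float     # (possibly inexact) function value at the point
    subgrad: Vector  # (possibly inexact) subgradient at the point


def linearization_error(center: Vector, f_center: float,
                        elem: BundleElement) -> float:
    """Center value minus the value of elem's linearization at the center."""
    shift = [c - p for c, p in zip(center, elem.point)]
    return f_center - (elem.value + dot(elem.subgrad, shift))


def model_value(center: Vector, f_center: float,
                bundle: List[BundleElement], d: Vector) -> float:
    """Cutting plane model at center + d: maximum of all affine pieces."""
    return f_center + max(
        dot(e.subgrad, d) - linearization_error(center, f_center, e)
        for e in bundle
    )


def split_bundle(center: Vector, f_center: float,
                 bundle: List[BundleElement]) -> Tuple[List[int], List[int]]:
    """Assumed split: nonnegative errors (convex-like behavior) versus
    negative errors (concave-like behavior) relative to the center."""
    plus: List[int] = []
    minus: List[int] = []
    for i, e in enumerate(bundle):
        if linearization_error(center, f_center, e) >= 0.0:
            plus.append(i)
        else:
            minus.append(i)
    return plus, minus
```

With the concave toy function -x^2, center 0, and a second bundle point at 1 (value -1, gradient -2), the second element has linearization error -1 and lands in the "minus" set. The plain maximum model then gives 0 at d = 1 although the function value there is -1, which shows why affine pieces with negative error cannot be treated like ordinary cutting planes.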
The dual problem of can be written in the form where and are matrices whose columns are, respectively, the vectors , and , and and are vectors whose components are , and , , respectively. and are the vectors with components , , , , and , . The primal optimal solution is related to the dual optimal solution by the following formulae:
Before giving a description of the algorithm, we state some simple properties of .
Lemma 1. Let ; then the following conclusions hold:(i);(ii);(iii).
Proof. We omit the proof since the conclusions can be obtained by imitating the proof of Lemma 2.1 .
Lemma 2. For any , the following conclusions hold:(i);(ii);(iii).
Proof. The conclusions can be obtained by imitating the proof of Lemma 2.2 .
In this section, we state the algorithm in full detail and give some comments on it.
Algorithm 3. We have the following.
Step 0 (initialization). The following global parameters are to be set: the stopping tolerances , , the proximity measure , the descent parameter , the cut parameter , the reduction parameter , the increase parameter , and the level measure parameter . Choose a starting point ; set ; the oracle provides and satisfying Assumption (4). The initial bundle is made up of just one element so that is empty, while is a singleton. Since , a lower bound for is available; set . Set the iteration counter .
Step 1 (first stopping test). If , then terminate.
Step 2 (second stopping test). Set . If , then stop.
Step 3 (level feasibility checking). Set the level parameter . If the level set defined by (15) is detected to be empty, then set , and go back to Step 2. Otherwise set , , and .
Step 4 (direction finding). Find the solution of for increasing value of , such that where equals the minimum value of if such does exist; otherwise set .
Step 5 (bundle updating). Set Calculate If , then terminate.
Else set ; go to Step 4.
Step 6 (trial point calculating). Set , calculate , , and set .
Step 7 (index insertion). (a) If and , then insert the element into the bundle for an appropriate value of and set .
(b) Else, if , then insert the element , into the bundle for an appropriate value of .
(c) Else find a scalar such that satisfies the condition and insert the element into the bundle for an appropriate value of , where .
Step 8 (descent test). If , go to Step 5. Choose .
If set the new stability center , , set , and go to Step 1 (serious step).
Otherwise set , , set , and go to Step 9 (null step).
Step 9 (resolving or ). Solve or, equivalently, , to obtain the primal and dual optimal solutions and and go to Step 6.
That is the end of the algorithm.
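The interplay of the second stopping test (Step 2) and the level feasibility check (Step 3) can be sketched as a small loop. This is only an illustration under assumptions: the level is placed a fixed fraction `kappa` of the current gap below the upper bound, the emptiness test is supplied by the caller (in practice it would require solving a feasibility problem over the model), and all names are hypothetical.

```python
from typing import Callable, Optional, Tuple


def manage_level(
    f_up: float,
    f_low: float,
    kappa: float,
    gap_tol: float,
    level_set_is_empty: Callable[[float], bool],
) -> Tuple[str, float, Optional[float]]:
    """Repeat the gap test and the level check until either the gap is
    small enough ("stop") or a nonempty level set is found ("continue").

    Returns (status, updated lower bound, level parameter or None).
    """
    while True:
        gap = f_up - f_low
        if gap <= gap_tol:
            return "stop", f_low, None
        f_lev = f_up - kappa * gap  # assumed level rule, with 0 < kappa < 1
        if level_set_is_empty(f_lev):
            # No model point lies below the level, so the level itself is
            # taken as the new lower bound; the gap shrinks to kappa * gap.
            f_low = f_lev
            continue
        return "continue", f_low, f_lev
```

For example, with an upper bound of 10, a lower bound of 0, `kappa = 0.5`, and a model whose minimum value is 8, the levels 5 and 7.5 are rejected and the loop returns with lower bound 7.5 and level 8.75. Each rejection multiplies the gap by `kappa`, which is the mechanism behind the gap-based stopping test.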
A few comments on the algorithm are in order.
(a) Step 1 is justified by the optimality estimate.
(b) Step 2 is another stopping criterion. means that , , and since the update sets , it holds that for all . If Algorithm 3 stops at Step 2, we have that which means that is a -approximate solution to problem (1). If at Step 2 the rule is replaced by for all , Algorithm 3 becomes an approximate proximal bundle algorithm.
(c) Step 4 may use the dual quadratic programming method of , which can solve efficiently sequences of related subproblems with varying . The construction of at Step 4 may be discretized by repeatedly solving for increasing values of or by adopting techniques of the type described in .
(d) is never a consequence of the choice of too small . In fact, we note that if , it holds that The right-hand side of the above inequality is bigger than , so too small cannot lead to ; therefore we need to increase the value of .
(e) We remark that the insertion of a bundle index into or at Step 7 is not simply based on the sign of .
(f) The level constraint provides an additional useful stopping test based on a certain optimality gap, something not present in cutting plane methods. This additional stopping test is useful because standard proximal bundle methods sometimes “take time” to accumulate enough information to recognize that an acceptable approximate solution has already been computed.
(g) To keep the size of problem (16) manageable, the number of elements in bundle should be kept bounded, without impairing convergence. For this, the usual aggregation techniques of proximal bundle methods can be employed here. Specifically, the model can be composed of as few as only two cutting planes, corresponding to the new linearization and the aggregate linearization.
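The aggregation mentioned in comment (g) can be illustrated by the classical construction from proximal bundle methods, in which a convex combination of bundle subgradients and linearization errors replaces the pieces it summarizes. The sketch below shows only this generic step; how the weights are obtained from the dual solution of the subproblem, and how aggregation interacts with the two index sets of this method, is not specified here. The function name and arguments are hypothetical.

```python
from typing import List, Sequence, Tuple


def aggregate(
    subgrads: Sequence[Sequence[float]],
    errors: Sequence[float],
    weights: Sequence[float],
) -> Tuple[List[float], float]:
    """Form the aggregate subgradient and aggregate linearization error as
    convex combinations of the bundle data."""
    if not (len(subgrads) == len(errors) == len(weights)):
        raise ValueError("inputs must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    n = len(subgrads[0])
    g_agg = [sum(w * g[j] for w, g in zip(weights, subgrads)) for j in range(n)]
    a_agg = sum(w * a for w, a in zip(weights, errors))
    return g_agg, a_agg
```

With two pieces having subgradients (1, 0) and (0, 1), errors 0 and 2, and weights 0.25 and 0.75, the aggregate piece has subgradient (0.25, 0.75) and error 1.5; keeping this single piece plus the newest linearization gives the two-piece model mentioned in the comment.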
4. Convergence Analysis
Convergence analysis of Algorithm 3 has to account for all the following cases: For the first case (26), if the stopping tolerance is positive, then the method terminates with an approximate solution in a finite number of iterations.
Lemma 4. Suppose that the level sets are empty infinitely many times; then and every cluster point of the sequence (if any exists) is a -approximate solution to problem (1).
Proof. Since , . Also, by Steps 3 and 8, ; thus which means that if is empty at iteration , then the update decreases the optimality gap by a factor of at least . Hence if this happens infinitely many times, we have as . Moreover, we have . As is decreasing and bounded below , we conclude that which gives (29). Let be any cluster point of , and let be a subsequence converging to as . Then which establishes the last conclusion.
From now on, we consider the case when for large enough. Without loss of generality, we can simply assume for all . For cases (27) and (28) we make the following assumptions:
(A1) the set is compact;
(A2) for any , the directional derivative of at exists and where .
Remark 5. Assumption is a looser condition than the one emerging in , where the condition that is weakly semismooth is required.
Next, we introduce Lemma 6 to see what happens if no new stability center is generated; that is, there exists a last stability center, and from then on only null steps are generated.
Lemma 6. Suppose is the last stability center followed by a null step sequence such that and with the algorithm looping between Steps 6 and 9. Then the following conclusions hold.(i)There exists an index such that, for each , every new bundle index is inserted into and remains unchanged.(ii)Step 7(c) is well defined.(iii)Whenever a new bundle index is inserted into , the condition holds, where is the subgradient corresponding to the new bundle element.
Proof. (i) The conclusion can be obtained by noting that no bundle index can be inserted into as soon as falls below the threshold .
(ii) Since the sufficient decrease condition (23) is not satisfied, it follows that According to the inexact data in Assumption (4), , by the mean value theorem there exists a scalar such that Thus the conclusion follows from Assumption .
(iii) Observe that the condition is ensured either by construction or by the fact that whenever .
Now we consider the case of infinitely many descent steps; we prove that either an approximate solution is achieved at Step 5 or we obtain a really new stability center .
Lemma 7. For Algorithm 3, if infinitely many descent steps are generated one either obtains an approximate solution at Step 5 or one executes descent step at Step 8.
Proof. Firstly we prove that the algorithm cannot pass through Step 5 for infinitely many times. Just like the result of Lemma 4.2 in , we can obtain that the indices of the new bundle elements are inserted into and are never removed. Moreover, when a passage at Step 5 occurs all the elements with index are removed. Taking into account (18) and the constraint in the dual problem , there exists an index such that, for all , can be expressed in the form (say )
where . But since and , we have
which leads to a contradiction.
Next we show that it is impossible to have for infinitely many times and the descent condition (23) is not satisfied with the algorithm looping between Steps 6 and 9. Indexing by the th passage through such a loop, we observe that, by Lemma 6 (i), there exists an index such that, for every , the sequence is nondecreasing, bounded, and hence convergent. Moreover, is bounded; it admits a convergent subsequence, say . The above consideration implies also that is convergent to a nonpositive limit, say . Next, we can imitate the proof of Lemma 4.2 in  to show that , which, by Lemma 2 (iii), contradicts the fact that .
Finally, we show that, after a finite number of descent steps, Algorithm 3 stops at a point satisfying an approximate stationarity condition.
Theorem 8. For any and , Algorithm 3 stops in a finite number of iterations at a point satisfying the approximate stationarity condition with .
Proof. Suppose that the conclusion does not hold. It follows from Lemma 7 that the descent condition (23) is satisfied infinitely many times. Let be the stability center at the th passage; then and Since and , we have , where is the Lipschitz constant of on . It follows from that is bounded away from zero. Then, from Lemma 2(iii), is bounded away from zero as well. Therefore, by passing to the limit, we obtain Since , , we have , which contradicts the fact that the value of at is positive.
In this paper, we propose a new algorithm for nonsmooth nonconvex minimization that employs approximate values of the objective function and its subgradient. It combines the cutting plane method, the level bundle method, and the idea of proximal control. The aim is to take advantage of the good properties of all of these methods, thus speeding up the optimization process. In addition, the algorithm provides a useful stopping test based on the optimality gap, something not present in proximal bundle methods. In other bundle methods for nonsmooth nonconvex functions, the amount by which affine pieces are shifted downward appears somewhat arbitrary; in our method, downward shifting is restricted to some particular cases.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
The authors would like to acknowledge the valuable suggestions and helpful comments from the referees. This research was supported by the National Natural Science Foundation of China (Grants 11301246, 11171049, and 11171138).
U. Brannlund, On relaxation methods for nonsmooth convex optimization [Ph.D. thesis], Department of Mathematics, Royal Institute of Technology, Stockholm, Sweden, 1993.
W. de Oliveira and M. Solodov, “A doubly stabilized bundle method for nonsmooth convex optimization,” Tech. Rep., 2013, http://www.optimization-online.org/DB_HTML/2013/04/3828.html.
F. H. Clarke, Optimization and Nonsmooth Analysis, John Wiley & Sons, New York, NY, USA, 1983.
A. Fuduli and M. Gaudioso, “The proximal trajectory algorithm for convex minimization,” Tech. Rep. 7/98, Laboratorio di Logistica, Dipartimento di Elettronica Informatica e Sistemistica, Universita della Calabria, Rende Cosenza, Italy, 1998.