Abstract

We present an approximate nonsmooth algorithm for a minimization problem in which the objective function is the sum of a maximum eigenvalue function of matrices and a convex function. The essential idea is close to that of the proximal bundle method; the difference is that we use approximate subgradients and approximate function values to build an approximate cutting-plane model of the objective function. An important advantage of this approximate cutting-plane model is that it is more stable than the exact cutting-plane model. We then state the resulting approximate proximal bundle algorithm and prove that the sequences it generates converge to an optimal solution of the original problem.

1. Introduction

Bundle methods are among the most efficient and promising methods for solving nonsmooth optimization problems. For example, a class of maximum eigenvalue functions can be minimized by a bundle method [1], a bundle-filter method can deal with nonsmooth convex constrained optimization problems [2], and a penalized bundle method was proposed for nonsmooth optimization by Bonnans et al. [3]. Recently, a minimization problem for a class of constrained maximum eigenvalue functions was solved in [4] with the help of the penalized bundle method. However, when the cutting-plane model for the objective function is constructed in [4], the subdifferential $\partial\lambda_{\max}(X)$ of the maximum eigenvalue function is involved. Note that $\partial\lambda_{\max}(X)$ is the face of $\mathcal{C}$ exposed by $X$, where $\mathcal{C} := \{W \in \mathbb{S}^m : W \succeq 0,\ \operatorname{tr} W = 1\}$. So $\partial\lambda_{\max}(X)$ changes drastically when the multiplicity of $\lambda_{\max}(X)$ changes [5]. Thus $\partial\lambda_{\max}$ is unstable, and this leads to the instability of the cutting-plane model in [4]. In this paper, to avoid this drawback, we give a more stable approximate cutting-plane model for the objective function.

We consider a nonsmooth optimization problem of the form
\[
\min_{x \in \mathbb{R}^n} f(x) := (\lambda_{\max} \circ A)(x) + g(x), \tag{1}
\]
where $\lambda_{\max} \circ A$ is the composition of the maximum eigenvalue function $\lambda_{\max}$ and an affine mapping $A$; specifically, $A(x) = A_0 + \mathcal{A}x$ is affine, with $A_0 \in \mathbb{S}^m$ and $\mathcal{A}$ a linear operator from $\mathbb{R}^n$ to $\mathbb{S}^m$, and $g : \mathbb{R}^n \to \mathbb{R}$ is a nonsmooth convex function. We modify the elements in the bundle and give an approximate proximal bundle method algorithm for (1). Our algorithm rests on the basic assumption that at least one approximate function value and one approximate subgradient are available at each point; the accuracy required of these approximations is made precise in Section 2.
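To make the black box assumption concrete, here is a minimal sketch in Python/NumPy of an oracle for $f(x) = \lambda_{\max}(A(x)) + g(x)$; the representation $A(x) = A_0 + \sum_j x_j A_j$ and the oracles g_val and g_subgrad for $g$ are illustrative assumptions of ours, not prescribed by the paper.

    import numpy as np

    def oracle(x, A0, As, g_val, g_subgrad):
        """Black box for f(x) = lambda_max(A0 + sum_j x_j*A_j) + g(x).

        A0 is a symmetric (m, m) matrix, As a list of symmetric (m, m)
        matrices A_1, ..., A_n; g_val/g_subgrad evaluate the convex g.
        Returns a function value and one subgradient of f at x.
        """
        X = A0 + sum(xj * Aj for xj, Aj in zip(x, As))
        w, V = np.linalg.eigh(X)         # eigenvalues in increasing order
        lam_max, q = w[-1], V[:, -1]     # top eigenpair of A(x)
        # Chain rule through the affine map: the j-th component of a
        # subgradient of lambda_max(A(x)) is <A_j, q q^T> = q^T A_j q.
        s = np.array([q @ Aj @ q for Aj in As])
        return lam_max + g_val(x), s + g_subgrad(x)

In an inexact setting, lam_max and s would be replaced by approximations; the sketch returns the exact quantities simply for concreteness.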

The initial motivation for the present work lies in the following facts. The whole idea of so-called bundle methods is to construct a good approximation of the objective function. Reference [4] solved this class of optimization problems with the help of the cutting-plane model and the penalized bundle method; however, the cutting-plane model in [4] is unstable. Our motivation is to construct a more stable approximate model for the objective function, and to this end we use the approximate subdifferential of the objective function $f$. Note that in the equation
\[
\partial_{\varepsilon}\lambda_{\max}(X) = \{ W \in \mathcal{C} : \langle W, X \rangle \ge \lambda_{\max}(X) - \varepsilon \},
\]
the whole set $\mathcal{C}$ is obviously involved, and $\partial_{\varepsilon}\lambda_{\max}(X)$ is no longer a face of $\mathcal{C}$; it is the intersection of $\mathcal{C}$ with the half-space $\{ W : \langle W, X \rangle \ge \lambda_{\max}(X) - \varepsilon \}$. In particular, almost all matrices in $\partial_{\varepsilon}\lambda_{\max}(X)$ have rank $m$ (for $\varepsilon > 0$), whereas the rank of matrices in $\partial\lambda_{\max}(X)$ is at most the multiplicity $r$ of $\lambda_{\max}(X)$, which is 1 for almost all $X$ (so $\partial_{\varepsilon}\lambda_{\max}$ is relatively stable); this also gives an idea of the big gap existing between these two convex sets. Therefore we introduce an enlarged subdifferential [6] of $\lambda_{\max}$, which can be regarded as an outer approximation of $\partial\lambda_{\max}(X)$ and an inner approximation of $\partial_{\varepsilon}\lambda_{\max}(X)$. Through the enlarged subdifferential of $\lambda_{\max}$, we easily obtain an enlarged subdifferential of the objective function $f$. Using approximate subgradients from the enlarged subdifferential to construct the approximate cutting-plane model has two advantages: on one hand, it avoids the instability of the cutting-plane model constructed in [4]; on the other hand, it avoids the excess of elements in $\partial_{\varepsilon}\lambda_{\max}(X)$. After constructing this more stable approximate cutting-plane model, we propose our algorithm and prove its convergence.

The rest of this paper is organized as follows. Section 2 contains the approximate cutting-plane model of the objective function: we first introduce an enlarged subdifferential which is an outer approximation of $\partial\lambda_{\max}(X)$ and, simultaneously, an inner approximation of $\partial_{\varepsilon}\lambda_{\max}(X)$, and we then use approximate subgradients from this enlarged subdifferential to construct an approximate cutting-plane model for the objective function. Section 3 gives the approximate proximal bundle method algorithm together with a corresponding compression mechanism. Section 4 is devoted to the convergence analysis of the algorithm of Section 3. Section 5 gives the conclusions.

In this paper, the inner product and the norm are those of the underlying Hilbert space and are denoted by $\langle \cdot, \cdot \rangle$ and $\| \cdot \|$, respectively.

2. The Approximate Model of the Objective Function

In this section, we give the approximate model of the objective function. It is known that [4] uses the following cutting-plane model of the objective function $f$:
\[
\varphi_k(y) = \max_{i \in B_k} \{ f(x_i) + \langle s_i, y - x_i \rangle \},
\]
where $s_i \in \partial f(x_i)$ and $B_k$ denotes the index set of the current bundle.

It is known that $\partial\lambda_{\max}(X)$ is the face of $\mathcal{C}$ exposed by $X$, where $\mathcal{C} = \{ W \in \mathbb{S}^m : W \succeq 0,\ \operatorname{tr} W = 1 \}$. If the multiplicity of $\lambda_{\max}(X)$ changes, the subdifferential $\partial\lambda_{\max}(X)$ changes drastically. Hence the subdifferential of the function $\lambda_{\max} \circ A$ is unstable, and this causes the instability of $\varphi_k$. In order to construct a stable approximate model, we take the approximate subdifferential of the function $f$ into account. Firstly, we consider the approximate subdifferential of $\lambda_{\max}$.

Definition 1. For all $X \in \mathbb{S}^m$ and $\varepsilon \ge 0$, with the eigenvalues $\lambda_1(X) \ge \lambda_2(X) \ge \cdots \ge \lambda_m(X)$ arranged in decreasing order (in our setting, $X = A(x)$ with the affine mapping $A$), one defines the following:
(1) the set of indices of the $\varepsilon$-largest eigenvalues:
\[
I_{\varepsilon}(X) := \{ i \in \{1, \dots, m\} : \lambda_i(X) > \lambda_{\max}(X) - \varepsilon \};
\]
(2) the $\varepsilon$-multiplicity of $\lambda_{\max}(X)$:
\[
r_{\varepsilon}(X) := \max I_{\varepsilon}(X);
\]
(3) the $\varepsilon$-first eigenspace:
\[
E_{\varepsilon}(X) := \bigoplus_{i \in I_{\varepsilon}(X)} E_i(X),
\]
where $E_i(X)$ is the eigenspace of $X$ associated with its eigenvalue $\lambda_i(X)$.
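For illustration, Definition 1 can be computed numerically as follows (a sketch under our naming conventions, with eigenvalues sorted in decreasing order so that $I_{\varepsilon}(X) = \{1, \dots, r_{\varepsilon}(X)\}$):

    import numpy as np

    def eps_eigenspace(X, eps):
        """Return I_eps(X), the eps-multiplicity r_eps(X), and a matrix
        Q_eps whose columns form an orthonormal basis of E_eps(X)."""
        w, V = np.linalg.eigh(X)
        w, V = w[::-1], V[:, ::-1]             # decreasing eigenvalue order
        I_eps = np.where(w > w[0] - eps)[0]    # lambda_i > lambda_max - eps
        r_eps = I_eps.size                     # eps-multiplicity
        Q_eps = V[:, :r_eps]                   # orthonormal eigenvectors
        return I_eps, r_eps, Q_eps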

For the composite maximum eigenvalue function $\lambda_{\max} \circ A$, the approximate subdifferential of $\lambda_{\max}$ at $A(x)$ is
\[
\partial_{\varepsilon}\lambda_{\max}(A(x)) = \{ W \in \mathcal{C} : \langle W, A(x) \rangle \ge \lambda_{\max}(A(x)) - \varepsilon \},
\]
where $\mathcal{C}$ is as above. From this, one sees that $\partial_{\varepsilon}\lambda_{\max}(A(x))$ is the intersection of $\mathcal{C}$ with the half-space $\{ W : \langle W, A(x) \rangle \ge \lambda_{\max}(A(x)) - \varepsilon \}$, instead of being a face of $\mathcal{C}$. In particular, almost all matrices in $\partial_{\varepsilon}\lambda_{\max}(A(x))$ have rank $m$ (for $\varepsilon > 0$), whereas the rank of matrices in $\partial\lambda_{\max}(A(x))$ is at most the multiplicity $r$ of $\lambda_{\max}(A(x))$, which is 1 for almost all $x$; this indicates both that $\partial_{\varepsilon}\lambda_{\max}$ is more stable than $\partial\lambda_{\max}$ and that there is a big gap between these two convex sets.

Introduce the compact convex set
\[
\delta_{\varepsilon}\lambda_{\max}(A(x)) := \{ Q_{\varepsilon} Z Q_{\varepsilon}^{\top} : Z \in \mathbb{S}^{r_{\varepsilon}},\ Z \succeq 0,\ \operatorname{tr} Z = 1 \},
\]
where $Q_{\varepsilon}$ is a matrix whose columns form an orthonormal basis of $E_{\varepsilon}(A(x))$ and $r_{\varepsilon} = r_{\varepsilon}(A(x))$.

Proposition 2. Let $A$ be an affine mapping. Then, for all $x \in \mathbb{R}^n$ and $\varepsilon \ge 0$, one has
\[
\partial\lambda_{\max}(A(x)) \subseteq \delta_{\varepsilon}\lambda_{\max}(A(x)) \subseteq \partial_{\varepsilon}\lambda_{\max}(A(x)).
\]

Proof. The left inclusion follows from the known description of $\partial\lambda_{\max}$ and the fact that the first eigenspace $E_1(A(x))$ is contained in $E_{\varepsilon}(A(x))$. To see the other inclusion, take $W = Q_{\varepsilon} Z Q_{\varepsilon}^{\top} \in \delta_{\varepsilon}\lambda_{\max}(A(x))$. Then $W \succeq 0$ and
\[
\operatorname{tr} W = \operatorname{tr}(Q_{\varepsilon}^{\top} Q_{\varepsilon} Z) = \operatorname{tr} Z = 1,
\]
since $Q_{\varepsilon}^{\top} Q_{\varepsilon} = I$, $Z \succeq 0$, and $\operatorname{tr} Z = 1$. Thus it follows that $W \in \mathcal{C}$. Together with the definition of $E_{\varepsilon}(A(x))$, we have
\[
\langle W, A(x) \rangle = \operatorname{tr}(Q_{\varepsilon}^{\top} A(x) Q_{\varepsilon} Z) \ge \big( \lambda_{\max}(A(x)) - \varepsilon \big) \operatorname{tr} Z = \lambda_{\max}(A(x)) - \varepsilon;
\]
this means that $W \in \partial_{\varepsilon}\lambda_{\max}(A(x))$. Accordingly, the proof of the right inclusion is completed.
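As a quick numerical sanity check of the right inclusion (purely illustrative; it reuses the eps_eigenspace sketch above and picks $Z = \frac{1}{r} I$):

    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 6))
    X = (X + X.T) / 2                       # random symmetric matrix
    eps = 0.5
    _, r, Q = eps_eigenspace(X, eps)
    W = Q @ (np.eye(r) / r) @ Q.T           # an element of delta_eps
    lam_max = np.linalg.eigvalsh(X)[-1]
    assert abs(np.trace(W) - 1.0) < 1e-10               # tr W = 1
    assert np.linalg.eigvalsh(W).min() > -1e-10         # W is PSD
    assert np.tensordot(W, X) >= lam_max - eps - 1e-10  # <W, X> >= lam_max - eps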

Note that the compact convex set $\delta_{\varepsilon}\lambda_{\max}(A(x))$ can be regarded as an outer approximation of $\partial\lambda_{\max}(A(x))$ and an inner approximation of $\partial_{\varepsilon}\lambda_{\max}(A(x))$. This set avoids both the weakness of $\partial\lambda_{\max}(A(x))$ (instability) and the drawback of $\partial_{\varepsilon}\lambda_{\max}(A(x))$ (too many elements). Applying the adjoint $\mathcal{A}^*$ of the linear mapping $\mathcal{A}$ to the inclusions of Proposition 2, we get
\[
\mathcal{A}^* \partial\lambda_{\max}(A(x)) \subseteq \mathcal{A}^* \delta_{\varepsilon}\lambda_{\max}(A(x)) \subseteq \mathcal{A}^* \partial_{\varepsilon}\lambda_{\max}(A(x)).
\]
Set $F := \lambda_{\max} \circ A$. Then, by the chain rule for the affine mapping $A$, we obtain
\[
\partial F(x) \subseteq \mathcal{A}^* \delta_{\varepsilon}\lambda_{\max}(A(x)) \subseteq \partial_{\varepsilon} F(x).
\]
Adding $\partial g(x)$ simultaneously, since $\partial f(x) = \partial F(x) + \partial g(x)$ and $\partial_{\varepsilon} F(x) + \partial g(x) \subseteq \partial_{\varepsilon} f(x)$, we have
\[
\partial f(x) \subseteq \mathcal{A}^* \delta_{\varepsilon}\lambda_{\max}(A(x)) + \partial g(x) \subseteq \partial_{\varepsilon} f(x).
\]
Set $\delta_{\varepsilon} f(x) := \mathcal{A}^* \delta_{\varepsilon}\lambda_{\max}(A(x)) + \partial g(x)$; thus $\partial f(x) \subseteq \delta_{\varepsilon} f(x) \subseteq \partial_{\varepsilon} f(x)$. Finally, $\delta_{\varepsilon} f(x)$ is the enlarged subdifferential of the objective function $f$, and it can be regarded as an outer approximation of $\partial f(x)$ and an inner approximation of $\partial_{\varepsilon} f(x)$. The following work is based on the enlarged subdifferential $\delta_{\varepsilon} f$.
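A sketch of how one element of $\delta_{\varepsilon} f(x)$ could be assembled in practice (the choice of the weight matrix $Z$ and the oracle g_subgrad for $\partial g$ are assumptions of ours):

    def enlarged_subgradient(x, A0, As, eps, g_subgrad, Z=None):
        """Return one element of
        delta_eps f(x) = A* delta_eps lambda_max(A(x)) + dg(x)."""
        X = A0 + sum(xj * Aj for xj, Aj in zip(x, As))
        _, r, Q = eps_eigenspace(X, eps)
        if Z is None:
            Z = np.eye(r) / r          # any Z >= 0 with tr Z = 1 is admissible
        W = Q @ Z @ Q.T                # W in delta_eps lambda_max(A(x))
        # Adjoint of the linear map: (A* W)_j = <A_j, W>.
        AW = np.array([np.tensordot(Aj, W) for Aj in As])
        return AW + g_subgrad(x)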

Choose points $x_i$ and $g_i \in \delta_{\varepsilon} f(x_i)$; then, since $g_i \in \partial_{\varepsilon} f(x_i)$,
\[
f(y) \ge f(x_i) + \langle g_i, y - x_i \rangle - \varepsilon \quad \text{for all } y \in \mathbb{R}^n.
\]
Next, at each point $x_i$, compute an approximate function value $f_{x_i}$ satisfying
\[
f(x_i) - \varepsilon \le f_{x_i} \le f(x_i).
\]
Thus, using this approximate information of $f$, the approximate model becomes
\[
\check f_k(y) = \max_{i \in B_k} \{ f_{x_i} + \langle g_i, y - x_i \rangle \}.
\]
Let the linearization error at the stability center $\hat x_k$ be denoted by $e_i$:
\[
e_i := f_{\hat x_k} - f_{x_i} - \langle g_i, \hat x_k - x_i \rangle.
\]
With this notation, we obtain the equivalent form of the approximate model
\[
\check f_k(y) = \max_{i \in B_k} \{ f_{\hat x_k} - e_i + \langle g_i, y - \hat x_k \rangle \},
\]
where $B_k$ denotes the index set of the current bundle.
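In code, the model and the linearization errors are one-liners over a bundle of triples $(x_i, f_{x_i}, g_i)$ (the naming is ours):

    def model_value(y, bundle):
        """Evaluate the approximate cutting-plane model
        max_i { f_{x_i} + <g_i, y - x_i> } over the bundle."""
        return max(fxi + gi @ (y - xi) for xi, fxi, gi in bundle)

    def linearization_errors(bundle, x_hat, f_hat):
        """e_i = f_hat - f_{x_i} - <g_i, x_hat - x_i>, one per element."""
        return [f_hat - fxi - gi @ (x_hat - xi) for xi, fxi, gi in bundle]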

Observe that $\check f_k$ is a more stable approximate model than $\varphi_k$. In what follows, based on the approximate model $\check f_k$, we give the approximate proximal bundle method algorithm.

3. The Approximate Proximal Bundle Method Algorithm

Algorithm 3 (the approximate proximal bundle method).
Step 0. Let the descent parameter $m \in (0, 1)$, the stopping tolerance $\mathrm{tol} \ge 0$, and the prox-parameter $\mu_1 > 0$ be given. Choose $x_1 \in \mathbb{R}^n$; we can obtain an approximate function value $f_{x_1}$ and an approximate subgradient $g_1 \in \delta_{\varepsilon} f(x_1)$. Then construct the approximate model $\check f_1$ and set $\hat x_1 = x_1$, $k = 1$.
Step 1. If $k > 1$ and $\delta_{k-1} \le \mathrm{tol}$, stop.
Step 2. Solve the quadratic program
\[
y_{k+1} = \operatorname*{arg\,min}_{y \in \mathbb{R}^n} \Big\{ \check f_k(y) + \frac{\mu_k}{2} \| y - \hat x_k \|^2 \Big\}. \tag{QP}
\]
Define $\delta_k$ as the nominal decrease:
\[
\delta_k := f_{\hat x_k} - \Big[ \check f_k(y_{k+1}) + \frac{\mu_k}{2} \| y_{k+1} - \hat x_k \|^2 \Big].
\]
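Step 2 is usually carried out on the dual of (QP) (cf. Remark 4 and Theorem 5 below). The following sketch solves the dual over the unit simplex with SciPy's SLSQP solver and recovers the candidate point; this is one possible realization under our naming, not the paper's prescribed solver.

    import numpy as np
    from scipy.optimize import minimize

    def solve_prox_qp(G, e, mu, x_hat):
        """Minimize (1/(2 mu)) ||sum_i a_i g_i||^2 + sum_i a_i e_i over the
        unit simplex; the rows of G are the bundle subgradients g_i and e
        holds the linearization errors e_i.  Returns the candidate point,
        the aggregate couple, and the nominal decrease of Theorem 5."""
        p = len(e)
        dual = lambda a: (G.T @ a) @ (G.T @ a) / (2.0 * mu) + e @ a
        cons = [{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}]
        res = minimize(dual, np.full(p, 1.0 / p), method="SLSQP",
                       bounds=[(0.0, 1.0)] * p, constraints=cons)
        alpha = res.x
        g_hat = G.T @ alpha                            # aggregate subgradient
        e_hat = float(e @ alpha)                       # aggregate error
        y = x_hat - g_hat / mu                         # Theorem 5(i)
        delta = e_hat + (g_hat @ g_hat) / (2.0 * mu)   # Theorem 5(ii)
        return y, g_hat, e_hat, delta, alpha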
Step 3. Call the black box again at $y_{k+1}$. If
\[
f_{y_{k+1}} \le f_{\hat x_k} - m \, \delta_k,
\]
then set $\hat x_{k+1} = y_{k+1}$; otherwise, set $\hat x_{k+1} = \hat x_k$. Correspondingly, the former case is called a descent step and the latter a null step.
Step 4. Append $(y_{k+1}, f_{y_{k+1}}, g_{k+1})$, with $g_{k+1} \in \delta_{\varepsilon} f(y_{k+1})$, to the bundle and construct $\check f_{k+1}$. Change $k$ to $k + 1$ and go to Step 1.
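Putting the pieces together, an illustrative driver for Algorithm 3 might look as follows; it reuses the hypothetical oracle and solve_prox_qp sketches above, and the parameter names are ours.

    def proximal_bundle(oracle_f, x1, mu=1.0, m=0.1, tol=1e-6, max_iter=200):
        """Illustrative sketch of Algorithm 3 (no bundle compression)."""
        fx1, g1 = oracle_f(x1)
        bundle = [(x1, fx1, g1)]                   # Step 0: initial bundle
        x_hat, f_hat = x1, fx1
        for _ in range(max_iter):
            G = np.array([gi for _, _, gi in bundle])
            e = np.array([f_hat - fxi - gi @ (x_hat - xi)
                          for xi, fxi, gi in bundle])
            y, _, _, delta, _ = solve_prox_qp(G, e, mu, x_hat)   # Step 2
            if delta <= tol:                       # Step 1: stopping test
                break
            fy, gy = oracle_f(y)                   # Step 3: call the black box
            if fy <= f_hat - m * delta:            # descent step
                x_hat, f_hat = y, fy
            bundle.append((y, fy, gy))             # Step 4: enrich the bundle
        return x_hat, f_hat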

Remark 4. For a prox-parameter $\mu_k > 0$, the candidate point $y_{k+1}$ can be obtained by solving the dual problem of (QP), as made precise in Theorem 5.

Theorem 5. Let $y_{k+1}$ be the unique solution to (QP) and suppose that $\alpha \in \Delta_k := \{ \alpha \in \mathbb{R}^{|B_k|} : \alpha \ge 0,\ \sum_{i \in B_k} \alpha_i = 1 \}$ is an associated optimal multiplier vector. Then $\hat g_k := \sum_{i \in B_k} \alpha_i g_i$ is well defined and $\alpha$ is a solution to
\[
\min_{\alpha \in \Delta_k} \ \frac{1}{2\mu_k} \Big\| \sum_{i \in B_k} \alpha_i g_i \Big\|^2 + \sum_{i \in B_k} \alpha_i e_i. \tag{D}
\]
In addition, one can also have the following relations:
(i) $y_{k+1} = \hat x_k - \frac{1}{\mu_k} \hat g_k$;
(ii) $\delta_k = \hat e_k + \frac{1}{2\mu_k} \| \hat g_k \|^2$, where $\hat e_k := \sum_{i \in B_k} \alpha_i e_i$;
(iii) $\hat g_k \in \partial_{\hat e_k} \check f_k(\hat x_k)$.

Proof. Turn (QP) into a quadratic programming problem with an extra scalar variable $t$:
\[
\min_{(y, t) \in \mathbb{R}^n \times \mathbb{R}} \ t + \frac{\mu_k}{2} \| y - \hat x_k \|^2 \quad \text{s.t.} \quad f_{\hat x_k} - e_i + \langle g_i, y - \hat x_k \rangle \le t, \quad i \in B_k.
\]
The corresponding Lagrangian is, for $\alpha \ge 0$,
\[
L(y, t, \alpha) = t + \frac{\mu_k}{2} \| y - \hat x_k \|^2 + \sum_{i \in B_k} \alpha_i \big( f_{\hat x_k} - e_i + \langle g_i, y - \hat x_k \rangle - t \big).
\]
By strong convexity, the dual problem of (QP) is equivalent to problem (D).
Minimizing the Lagrangian over $t$ forces $\sum_{i \in B_k} \alpha_i = 1$, and minimizing over $y$ gives $y_{k+1} = \hat x_k - \frac{1}{\mu_k} \sum_{i \in B_k} \alpha_i g_i = \hat x_k - \frac{1}{\mu_k} \hat g_k$; since $\alpha$ is the solution to the dual quadratic programming problem (D), (i) holds.
To see (ii) and (iii), note that, since there is no duality gap, the primal optimal value of (QP) equals the dual optimal value of (D); substituting (i) into the definition of $\delta_k$ then yields (ii). The relation $\check f_k(y) \ge f_{\hat x_k} - \hat e_k + \langle \hat g_k, y - \hat x_k \rangle$ for all $y$, obtained as the $\alpha$-weighted combination of the linearizations in the model, implies that (iii) holds.

Theorem 5 ensures that the candidate point $y_{k+1}$ appearing in Step 2 can indeed be computed.

Remark 6. As the iterations proceed, the bundle accumulates more and more elements. When the size of the bundle becomes too big, it is necessary to compress it. So, at Step 4, one should append the following compression subalgorithm.

When the current size of the bundle exceeds the maximal size, that is, $|B_k| \ge k_{\max}$, Step 4 turns out to be as follows.

Step 4′. Let $B_k^{\mathrm{act}} := \{ i \in B_k : \alpha_i > 0 \}$ be the set of active indices.

If $|B_k^{\mathrm{act}}| < k_{\max}$, then keep the active couples $(e_i, g_i)$, $i \in B_k^{\mathrm{act}}$, and delete all inactive couples from the bundle. Set $B_{k+1} = B_k^{\mathrm{act}} \cup \{ k+1 \}$ and recompute the linearization errors with respect to the new stability center $\hat x_{k+1}$. Then append the new element $(e_{k+1}, g_{k+1})$ to the bundle, construct the model $\check f_{k+1}$, let $k = k+1$, and go to Step 1. Note that $e_{k+1}$ in the new element is:
(i) when it is a descent step, $e_{k+1} = 0$;
(ii) when it is a null step, $e_{k+1} = f_{\hat x_k} - f_{y_{k+1}} - \langle g_{k+1}, \hat x_k - y_{k+1} \rangle$.

If $|B_k^{\mathrm{act}}| \ge k_{\max}$, then delete two or more couples of elements and insert, in addition, the aggregate couple $(\hat e_k, \hat g_k)$. Recompute the linearization errors with respect to $\hat x_{k+1}$, then append the new element $(e_{k+1}, g_{k+1})$ to the bundle, construct the model $\check f_{k+1}$, let $k = k+1$, and go to Step 1. Note that $e_{k+1}$ in the new element is:
(i) when it is a descent step, $e_{k+1} = 0$;
(ii) when it is a null step, $e_{k+1} = f_{\hat x_k} - f_{y_{k+1}} - \langle g_{k+1}, \hat x_k - y_{k+1} \rangle$.

Remark 7. In the circumstance $|B_k^{\mathrm{act}}| \ge k_{\max}$, if the remaining couples are still too many after discarding all inactive couples from the bundle, one synthesizes the indispensable information of the active elements in the bundle into the aggregate couple $(\hat e_k, \hat g_k)$. The corresponding affine function is called the aggregate linearization and is denoted by
\[
\check f_{-k}(y) := f_{\hat x_k} - \hat e_k + \langle \hat g_k, y - \hat x_k \rangle.
\]
For the aggregate linearization, being a convex combination of the linearizations in the model, it holds that
\[
\check f_{-k}(y) \le \check f_k(y) \quad \text{for all } y.
\]
When the maximum capacity is reached, for instance if $|B_k| = k_{\max}$, one may discard two elements from the bundle and append the aggregate couple. The resulting model will be
\[
\check f_{k+1}(y) = \max \Big\{ \max_{i \in \tilde B_k} \{ f_{\hat x_k} - e_i + \langle g_i, y - \hat x_k \rangle \},\ \check f_{-k}(y),\ f_{y_{k+1}} + \langle g_{k+1}, y - y_{k+1} \rangle \Big\},
\]
where $\tilde B_k$ is the set of retained indices. Note that, for all $k$ and $y$, in any case one has
\[
\max \{ \check f_{-k}(y),\ f_{y_{k+1}} + \langle g_{k+1}, y - y_{k+1} \rangle \} \le \check f_{k+1}(y) \le f(y) + \varepsilon.
\]
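A sketch of this compression step (the aggregate couple is stored as a synthetic bundle entry anchored at the stability center, which reproduces exactly the aggregate linearization; thresholds and names are ours):

    def compress_bundle(bundle, alpha, x_hat, f_hat, g_hat, e_hat, k_max):
        """Keep the active couples (alpha_i > 0); if they are still too
        many, replace them all by the aggregate linearization
        y -> f_hat - e_hat + <g_hat, y - x_hat>."""
        active = [el for el, ai in zip(bundle, alpha) if ai > 1e-12]
        if len(active) >= k_max:
            # the entry (x_hat, f_hat - e_hat, g_hat) generates the cut
            # f_hat - e_hat + <g_hat, y - x_hat> in the model
            active = [(x_hat, f_hat - e_hat, g_hat)]
        return active

In the driver sketched after Algorithm 3, this routine would be called between Steps 3 and 4 whenever the bundle reaches the size k_max, before the new element is appended.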

4. Convergence Analysis

To show convergence of the algorithm, we have to refer to the stopping tolerance $\mathrm{tol}$. So we consider two situations, namely, $\mathrm{tol} > 0$ and $\mathrm{tol} = 0$.

Firstly, when the parameter $\mathrm{tol} > 0$, we use the notation $K$ to denote the set of indices $k$ at which a descent step is done.

Theorem 8. Consider Algorithm 3 with $\mathrm{tol} > 0$ and let $K$ denote the set of descent indices. Assume that the algorithm never stops and that $K$ is infinite. Then
\[
\lim_{k \to \infty} f_{\hat x_k} = -\infty.
\]

Proof. Since $\mathrm{tol} > 0$ and the algorithm never stops, the nominal decrease must satisfy $\delta_k > \mathrm{tol}$ for all $k$. Note that $K$ is the descent index set; for $k \in K$ we have $\hat x_{k+1} = y_{k+1}$ and $f_{\hat x_{k+1}} \le f_{\hat x_k} - m \delta_k$. Let $l$ be the index following $k$ in $K$. Between $k$ and $l$, the algorithm makes null steps only, without moving the stability center, so $\hat x_j = \hat x_{k+1}$ for all $k < j \le l$. The descent test at $l$ then gives
\[
f_{\hat x_{l+1}} \le f_{\hat x_l} - m \delta_l = f_{\hat x_{k+1}} - m \delta_l.
\]
Thus, for all $k \in K$, it holds that
\[
f_{\hat x_{k+1}} \le f_{\hat x_1} - m \sum_{j \in K,\, j \le k} \delta_j \le f_{\hat x_1} - m \, \mathrm{tol} \, \big| \{ j \in K : j \le k \} \big|;
\]
when we let $k \to \infty$ in $K$, the right-hand side tends to $-\infty$, and we obtain the desired result.

When $\mathrm{tol}$ is taken strictly positive, by Theorem 8 the algorithm must stop at some index $k$ whenever problem (1) has minimizers (so that $f$ is bounded below). By Theorem 5(ii), at this index both $\hat e_k$ and $\| \hat g_k \|$ are small. Therefore, $\hat x_k$ is an approximate minimizer.

Secondly, when the stopping tolerance $\mathrm{tol} = 0$, the algorithm either stops by having found a solution to (1) or it never stops. In the latter case, there are two possibilities for the sequence of descent steps: either it has infinitely many elements, or there is an iteration $\bar k$ at which a last descent step is done, that is, $\hat x_k = \hat x_{\bar k}$ for all $k \ge \bar k$. We prove these two cases separately.

Case 1. There are infinitely many elements in $K$.

Theorem 9. Suppose that the algorithm with $\mathrm{tol} = 0$ generates infinitely many descent steps, and let $\varepsilon_k$ denote the accuracy used at iteration $k$. One has the following:
(i) if (1) has no solution, then $f_{\hat x_k} \downarrow \inf f$;
(ii) if (1) has minimizers and, with $0 < \mu_{\min} \le \mu_k \le \bar\mu$ for all $k$ and $\sum_k \varepsilon_k < +\infty$, the sequence $\{\hat x_k\}$ is minimizing for (1) and converges to a minimizer of $f$.

Proof. (i) Note that the algorithm loops forever and $\delta_k > 0$ for all $k$; thus $f_{\hat x_{k+1}} \le f_{\hat x_k} - m \delta_k < f_{\hat x_k}$ holds for all $k \in K$. Then the infinite sequence of objective values $\{ f_{\hat x_k} \}$ is strictly decreasing along the descent steps. If (1) has no solution, the sequence decreases to $\inf f$. Therefore the proof of (i) is completed.
(ii) We first show that $\{\hat x_k\}$ is a minimizing sequence. Suppose, for contradiction purposes, that there exist $\bar y$ and $\rho > 0$ such that $f(\bar y) \le f_{\hat x_k} - \rho$ for all $k$. By Theorem 5(iii) and the construction of the model, we have, for all $y$,
\[
f(y) \ge f_{\hat x_k} - \hat e_k + \langle \hat g_k, y - \hat x_k \rangle - \varepsilon_k.
\]
By Theorem 5(i) and (ii), $\hat g_k = \mu_k (\hat x_k - \hat x_{k+1})$ and $\hat e_k \le \delta_k$ for $k \in K$; hence, using the identity $2 \langle \hat x_k - \hat x_{k+1}, y - \hat x_k \rangle = \| y - \hat x_{k+1} \|^2 - \| y - \hat x_k \|^2 - \| \hat x_k - \hat x_{k+1} \|^2$ and the bound $\frac{\mu_k}{2} \| \hat x_k - \hat x_{k+1} \|^2 = \frac{1}{2\mu_k} \| \hat g_k \|^2 \le \delta_k$, we get
\[
\| \hat x_{k+1} - y \|^2 \le \| \hat x_k - y \|^2 + \frac{2}{\mu_k} \big( f(y) - f_{\hat x_k} + 2\delta_k + \varepsilon_k \big), \quad k \in K.
\]
Writing this relation for $y = \bar y$, we obtain
\[
\| \hat x_{k+1} - \bar y \|^2 \le \| \hat x_k - \bar y \|^2 + \frac{2}{\mu_k} \big( 2\delta_k + \varepsilon_k - \rho \big), \quad k \in K.
\]
Since $f_{\hat x_k} \ge f(\bar y) + \rho$ is bounded below, summing the descent tests yields $\sum_{k \in K} \delta_k < +\infty$, so $\delta_k \to 0$ on $K$; together with $\varepsilon_k \to 0$, there exists $\bar k$ such that $2\delta_k + \varepsilon_k \le \rho / 2$ for all $k \in K$ with $k \ge \bar k$. Thus the relation above becomes
\[
\| \hat x_{k+1} - \bar y \|^2 \le \| \hat x_k - \bar y \|^2 - \frac{\rho}{\mu_k}, \quad k \in K,\ k \ge \bar k.
\]
Summing these inequalities over $k \in K$, $k \ge \bar k$, yields
\[
\rho \sum_{k \in K,\, k \ge \bar k} \frac{1}{\mu_k} \le \| \hat x_{\bar k} - \bar y \|^2 < +\infty,
\]
which contradicts $\sum_{k \in K} 1/\mu_k \ge |K| / \bar\mu = +\infty$. Therefore, the sequence $\{\hat x_k\}$ is a minimizing sequence.
To see that $\{\hat x_k\}$ converges to a minimizer of (1), take $y = x^*$ in the telescoping relation above, where $x^*$ is a solution to (1), and note that $f(x^*) - f_{\hat x_k} \le 0$; summing over $k \in K$ gives, for all $k$,
\[
\| \hat x_{k+1} - x^* \|^2 \le \| \hat x_1 - x^* \|^2 + \frac{2}{\mu_{\min}} \Big( 2 \sum_{j \in K} \delta_j + \sum_j \varepsilon_j \Big) < +\infty.
\]
We see that the sequence $\{\hat x_k\}$ is bounded. Take a subsequence $\{\hat x_{k_j}\}$ converging to some $\bar x$ as $j \to \infty$; since $\{\hat x_k\}$ is minimizing and $f$ is lower semicontinuous, $\bar x$ is a minimizer of (1). Given $\sigma > 0$, take $k_j$ big enough such that
\[
\| \hat x_{k_j} - \bar x \|^2 \le \frac{\sigma}{2}, \qquad \sum_{k \in K,\, k \ge k_j} \frac{2}{\mu_k} ( 2\delta_k + \varepsilon_k ) \le \frac{\sigma}{2}.
\]
Writing the telescoping relation with $y = \bar x$ and summing it over $k \in K$ from $k_j$ to an arbitrary index yields $\| \hat x_k - \bar x \|^2 \le \sigma$ for all $k \ge k_j$, which implies that $\{\hat x_k\}$ converges to the minimizer $\bar x$ of (1).

Case 2. There are finitely many descent steps. This means that the last descent step, at iteration $\bar k$ say, is followed by an infinite number of null steps; that is, $\hat x_k = \hat x_{\bar k} =: \hat x$ for all $k \ge \bar k$.

Theorem 10. Assume that Algorithm 3 generates a last descent step followed by infinitely many null steps, on which the prox-parameter is kept fixed, $\mu_k = \mu$, and the accuracies satisfy $\varepsilon_k \downarrow 0$. Then the sequence $\{y_k\}$ converges to $\hat x$, and $\hat x$ minimizes the function $f$.

Proof. Let $y_{k+1}$ be the solution to (QP) at iteration $k \ge \bar k$. Set
\[
\theta_k := \check f_k(y_{k+1}) + \frac{\mu}{2} \| y_{k+1} - \hat x \|^2,
\]
the optimal value of (QP). Then
\[
\theta_k \le \check f_k(\hat x) \le f(\hat x) + \varepsilon_{\bar k}.
\]
By Remark 7, the model $\check f_{k+1}$ majorizes the aggregate linearization $\check f_{-k}$, and the prox-sum $\check f_{-k}(\cdot) + \frac{\mu}{2} \| \cdot - \hat x \|^2$ is strongly convex with minimizer $y_{k+1}$ and minimum value $\theta_k$; hence we have
\[
\theta_{k+1} \ge \theta_k + \frac{\mu}{2} \| y_{k+2} - y_{k+1} \|^2.
\]
Thus the sequence $\{\theta_k\}$ is nondecreasing and bounded above, hence convergent; by the relation above, $\| y_{k+2} - y_{k+1} \| \to 0$, and the strong convexity of the prox-sum implies that the sequence $\{y_k\}$ is bounded. It is obvious from the convexity of the model that the newest linearization enters $\check f_{k+1}$, so
\[
\check f_{k+1}(y_{k+2}) \ge f_{y_{k+1}} + \langle g_{k+1}, y_{k+2} - y_{k+1} \rangle.
\]
So it holds that
\[
f_{y_{k+1}} - \check f_k(y_{k+1}) \le (\theta_{k+1} - \theta_k) + \| g_{k+1} \| \, \| y_{k+2} - y_{k+1} \| + \frac{\mu}{2} \big( \| y_{k+1} - \hat x \|^2 - \| y_{k+2} - \hat x \|^2 \big).
\]
Since $\{y_k\}$ is bounded (so that the approximate subgradients $g_{k+1}$ are bounded as well) and $\| y_{k+2} - y_{k+1} \| \to 0$, the right-hand side tends to $0$, and thus $f_{y_{k+1}} - \check f_k(y_{k+1}) \to 0$. When $k \ge \bar k$, the descent test is never satisfied, that is, $f_{y_{k+1}} > f_{\hat x} - m \delta_k$; adding $\delta_k - f_{\hat x}$ to both sides of this inequality and using the definitions of $\delta_k$ and $e_{k+1}$, together with $\check f_k(y_{k+1}) \le \theta_k = f_{\hat x} - \delta_k$, we can obtain
\[
(1 - m) \delta_k \le f_{y_{k+1}} - \check f_k(y_{k+1}) \to 0,
\]
so $\delta_k \to 0$. In addition, as $\delta_k = \hat e_k + \frac{1}{2\mu} \| \hat g_k \|^2$ by Theorem 5(ii), we get $\hat e_k \to 0$ and $\hat g_k \to 0$. By Theorem 5(iii), it holds that, for all $y$,
\[
f(y) \ge f_{\hat x} - \hat e_k + \langle \hat g_k, y - \hat x \rangle - \varepsilon_k.
\]
Passing to the limit as $k \to \infty$, the inequality shows that $\hat x$ minimizes the function $f$.
It remains to show that $y_k \to \hat x$. Since $y_{k+1} = \hat x - \frac{1}{\mu} \hat g_k$ by Theorem 5(i), passing to the limit as $k \to \infty$ we obtain $\| y_{k+1} - \hat x \| = \frac{1}{\mu} \| \hat g_k \| \to 0$. Thus $y_k \to \hat x$, and $\hat x$ is the minimizer.

5. Conclusions

In this paper, we have introduced an enlarged subdifferential and constructed a more stable approximate cutting-plane model for the objective function. We have then proposed an approximate proximal bundle algorithm and shown how to use it to minimize a class of maximum eigenvalue functions. It may be possible to extend this method to an even larger class of constrained maximum eigenvalue functions.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank the referees for their valuable discussion and recommendations for improving the paper. This work is supported by the National Natural Science Foundation of China under Project no. 11171138.