A Decomposition Method with Redistributed Subroutine for Constrained Nonconvex Optimization

Lu, Yuan; Wang, Wei; Pang, Li-Ping; Li, Dan

doi:https://doi.org/10.1155/2013/376403

Abstract and Applied Analysis

Research Article | Open Access

Volume 2013 | Article ID 376403 | https://doi.org/10.1155/2013/376403

Yuan Lu,¹ Wei Wang,² Li-Ping Pang,³ and Dan Li³

Academic Editor: Jean M. Combes

Received 06 Sept 2012

Revised 08 Dec 2012

Accepted 13 Dec 2012

Published 26 Feb 2013

Abstract

A class of constrained nonsmooth nonconvex optimization problems, namely piecewise $C^2$ objectives with smooth inequality constraints, is discussed in this paper. Based on the $\mathcal{VU}$-theory, a superlinearly convergent $\mathcal{VU}$-algorithm, which uses a nonconvex redistributed proximal bundle subroutine, is designed to solve these optimization problems. An illustrative example shows how this method works on a second-order cone programming problem.

1. Introduction

Consider the following constrained nonsmooth convex program: where is convex and piecewise $C^2$ and , are convex of class $C^2$.

Many approaches have been proposed for solving this program. For example, in [1] we converted it into an unconstrained nonsmooth convex program via the exact penalty function and showed that the objective function of this unconstrained problem is a particular case of a function with a primal-dual gradient structure, a notion related to the $\mathcal{VU}$-space decomposition. Based on the $\mathcal{VU}$-theory, we designed an algorithm frame that converges at a local superlinear rate.

Yet, very little systematic research has been performed on extending this convex program to a nonconvex framework. The purpose of this paper is to study the following nonconvex program: where is piecewise $C^2$ and are of class $C^2$. Based on the $\mathcal{VU}$-decomposition theory, which was first introduced in [2] for convex functions and further studied in [3–13], we give a $\mathcal{VU}$-algorithm that uses a redistributed proximal bundle subroutine to generate a sequence of approximate proximal points. When a primal-dual track exists, these points approximate the primal track points and give the algorithm's $\mathcal{V}$-steps. The subroutine also approximates dual track points that are $\mathcal{U}$-gradients needed for the algorithm's $\mathcal{U}$-Newton steps. The interest in devising a $\mathcal{VU}$-algorithm for (2) lies in the “smoothing” effect of the $\mathcal{U}$-subspace and its potential to speed up the algorithm's convergence under certain conditions.

The rest of the paper is organized as follows. Section 2 is divided into two subsections. In the first, the nonconvex program (2) is transformed into an unconstrained problem by means of the exact penalty function. Based on the Clarke subdifferential of the objective function of this unconstrained problem, we obtain the $\mathcal{VU}$-space decomposition. The second part of Section 2 is devoted to the primal-dual function and its second-order properties. Section 3 presents the conceptual Algorithm 10 and gives its convergence theorem. When a primal-dual track exists, we replace the $\mathcal{V}$-step in Algorithm 10 with the redistributed proximal bundle subroutine. In the final section, this algorithm is applied to a second-order cone programming problem to illustrate the theoretical findings.

2. The $\mathcal{VU}$-Decomposition Results

2.1. The $\mathcal{VU}$-Space Decomposition

In program (2), is piecewise $C^2$. Specifically, for all , is continuous and there exists a finite collection of $C^2$ functions , such that We refer to the functions , , as structure functions.

The Clarke subdifferential of at a point , denoted by , can be computed in terms of the gradients of the structure functions that are active at ; see [14, Lemma 1]. More precisely, where is the set of active indices at and

Let be a solution of (2). By continuity of the structure functions, there exists a ball such that For convenience, we assume that the cardinality of is and reorder the structure functions, so that From now on, we consider that

Let denote the exact penalty function of (2) with and , where is a penalty parameter. More precisely, where

Call the set of indices realizing the max at .
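The exact penalty construction can be sketched numerically. The minimal Python example below assumes a max-type penalty, that is, the objective plus a penalty parameter times max{0, g_1(x), ..., g_m(x)}. The names `exact_penalty`, `active_indices`, `rho`, and the toy problem are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Func = Callable[[Vector], float]


def exact_penalty(objective: Func, constraints: List[Func], rho: float) -> Func:
    """Return x -> objective(x) + rho * max{0, g_1(x), ..., g_m(x)} (assumed form)."""
    def penalized(x: Vector) -> float:
        violation = max([0.0] + [g(x) for g in constraints])
        return objective(x) + rho * violation
    return penalized


def active_indices(constraints: List[Func], x: Vector, tol: float = 1e-9) -> List[int]:
    """Indices realizing the max; index 0 stands for the constant function 0."""
    values = [0.0] + [g(x) for g in constraints]
    top = max(values)
    return [i for i, v in enumerate(values) if top - v <= tol]


# Hypothetical toy problem: minimize |x1| - x2 subject to x2 - 1 <= 0.
f = lambda x: abs(x[0]) - x[1]
g1 = lambda x: x[1] - 1.0
F = exact_penalty(f, [g1], rho=2.0)

print(F((0.0, 1.0)), F((0.0, 2.0)))        # penalized values at two points
print(active_indices([g1], (0.0, 1.0)))    # both 0 and g1 are active here
```

In this toy case the penalty parameter exceeds the constraint multiplier, so the constrained minimizer (0, 1) also minimizes the penalized function.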

The following assumptions and definitions will be used in the rest of this paper.

Assumption 1. The set is linearly independent.

Assumption 2. Given and there exists an open bounded set and a function such that, , is lower-$C^2$ on satisfying on .

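Definition 1 (see [15, Definition 10.29]). The function $h$ is lower-$C^2$ on an open set $O$ if for each $\bar{x} \in O$ there is a neighbourhood $N$ of $\bar{x}$ upon which a representation $h(x) = \max_{t \in T} h_t(x)$ holds, where $T$ is a compact set and the functions $h_t$ are of class $C^2$ on $N$ such that $h_t$, $\nabla h_t$, and $\nabla^2 h_t$ depend continuously not just on $x \in N$ but jointly on $(t, x) \in T \times N$.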

Lemma 2 (see [19, Proposition 1]). If Assumption 2 holds, then is bounded below and prox-bounded.

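Definition 3 (see [16, Definition 1]). Given a lower semicontinuous function $h$, a point $\bar{x} \in \mathbb{R}^n$ where $h$ is finite and $\partial h(\bar{x})$ is nonempty, and an arbitrary subgradient $g \in \partial h(\bar{x})$, the orthogonal subspaces $\mathcal{V} := \operatorname{lin}(\partial h(\bar{x}) - g)$ and $\mathcal{U} := \mathcal{V}^{\perp}$ define the $\mathcal{VU}$-space decomposition, and $\mathbb{R}^n = \mathcal{U} \oplus \mathcal{V}$, where $\oplus$ denotes the direct sum.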
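For a max-type function whose subdifferential at a point is the convex hull of finitely many active gradients, the decomposition in Definition 3 can be computed directly: the V-subspace is spanned by differences of active gradients, and the U-subspace is its orthogonal complement. The sketch below is an illustration only; the helper names `orthonormalize` and `vu_bases` and the toy function |x1| + x2² are assumptions, not part of the paper.

```python
from typing import List, Optional, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors: List[Vector],
                   against: Optional[List[Vector]] = None,
                   tol: float = 1e-10) -> List[Vector]:
    """Modified Gram-Schmidt; returns only the new orthonormal vectors."""
    basis = list(against or [])
    start = len(basis)
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis[start:]


def vu_bases(active_gradients: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """Orthonormal bases of U and V from the active gradients at a point."""
    g0 = active_gradients[0]
    n = len(g0)
    diffs = [[gi - g0i for gi, g0i in zip(g, g0)] for g in active_gradients[1:]]
    v_basis = orthonormalize(diffs)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    u_basis = orthonormalize(identity, against=v_basis)
    return u_basis, v_basis


# Toy example: |x1| + x2^2 = max(x1 + x2^2, -x1 + x2^2) at the origin,
# where the two active gradients are (1, 0) and (-1, 0).
u_basis, v_basis = vu_bases([[1.0, 0.0], [-1.0, 0.0]])
print("U basis:", u_basis)
print("V basis:", v_basis)
```

Here the function is nonsmooth along the first coordinate (the V-direction) and smooth along the second (the U-direction), which is the situation the decomposition is meant to expose.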

Theorem 4. Suppose Assumption 1 holds. Then one has the following results at :
(i) the Clarke subdifferential of has the following expression: where ; , and ;
(ii) let denote the subspace generated by the Clarke subdifferential . Then

Proof. Since defined in (2) belongs to the PDG-structured family and by Lemma 2.1 in [16] the Clarke subdifferential of at can be formulated by where ; , , and .
Together with , there exists where , and .
Letting ; and ; , , we have . Then it follows from the definition of space in Definition 3 and means that the second formula holds.

Remark 5. (i) Since the subspaces and generate the whole space , every vector can be decomposed along its $\mathcal{VU}$-components at . In particular, any can be expressed as where and .
(ii) For any , we have From Theorem 4(ii), the $\mathcal{U}$-component of a subgradient is the same as that of any other subgradient at , that is, .

2.2. Primal-Dual Function and Its Second-Order Properties

In order to obtain a fast algorithm for (2), we will define an intermediate function. This function is called primal-dual function which is about .

Definition 6 (see [8, Definition 1]). We say that is a primal-dual track leading to , a minimizer of and zero subgradient pair, if for all small enough satisfy the following:
(i) is a $C^2$ function satisfying for all ,
(ii) the Jacobian is a basis matrix for ,
(iii) the particular $\mathcal{U}$-Lagrangian is a $C^2$-function.
When we write we implicitly assume that . If we define the primal-dual track to be the point . If then for all in a ball about .

Theorem 7. Suppose Assumption 1 holds. Then for all small enough, the following hold:
(i) the nonlinear system, with variable and the parameter , has a unique solution and is a function;
(ii) primal track is , with and in (i) is , with where In particular, , , and ;
(iii) , and .

Proof. Items (i) and (ii) follow from the assumption that , are along the lines of [5, Theorem 5.1] and applying a Second-Order Implicit Function Theorem; see [17, Theorem 2.1]. The conclusion of (iii) can be obtained in terms of (i) and the definitions of and .

Lemma 8 (see [7, Theorem 4.5]). Given , the system with has a unique solution. In particular, , and , .

The following theorem gives the definition and properties of primal-dual function.

Theorem 9. Given and supposing Assumption 1 holds, consider the primal-dual function:
Then for small enough, the following assertions are true:
(i) is a function of ;
(ii) the gradient of is given by where In particular, when , one has where
(iii) the $\mathcal{U}$-Hessian of is given by where In particular, when , one has where

Proof. (i) From Theorem 7(iii), we have Since and are , (i) holds.
(ii) In view of the chain rule, differentiating the following system with respect to : we have Multiplying each equation by the appropriate and , respectively, summing the results, and using the fact that yields where Using the transpose of the expression of , we get which together with (6.11) in [5] yields the desired result.
In particular, if , then and . It follows from Remark 5(ii) that where
(iii) Differentiating (ii) with respect to , we obtain where
According to the proof of Theorem 6.3 in [5], we get Then when ,

3. Algorithm and Convergence Analysis

Supposing , we give an algorithm frame which can solve (2). This algorithm makes a step in the $\mathcal{V}$-subspace, followed by a $\mathcal{U}$-Newton step, in order to obtain a superlinear convergence rate.

Algorithm 10 (algorithm frame).
Step 0 (Initialization). Given , choose a starting point close enough to and a Clarke subgradient , and set .
Step 1. Stop if
Step 2. Find the active index set and .
Step 3. Construct the $\mathcal{VU}$-decomposition at , that is, . Compute where
Step 4. Perform the $\mathcal{V}$-step. Compute which denotes in (23) and set .
Step 5. Perform the $\mathcal{U}$-step. Compute from the system where is such that . Compute .
Step 6 (update). Set , and return to Step 1.

Theorem 11. Suppose the starting point is close enough to and , . Then the iteration points generated by the algorithm converge and satisfy

Proof. Let , . It follows from Theorem 7 that Since exists and , we have from the definition of -Hessian matrix that By virtue of (53), we have . It follows from the hypothesis that is invertible and hence . In consequence, one has The proof is completed by combining (56) and (58).

Since Algorithm 10 relies on knowing the subspaces and and converges only locally, it needs significant modification to be implementable. Our $\mathcal{VU}$-algorithm defined below finds the $\mathcal{V}$-step by approximating equivalent proximal points.

Given a positive scalar parameter , the proximal point function depending on is defined by If Assumption 2 holds, then the proximal point is single-valued; see [18, Theorem 1].
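To make the proximal point concrete, the sketch below computes it in one dimension, assuming the standard form: the minimizer over y of f(y) plus (mu/2)(y - x)². The toy function |y| - y²/2, the ternary search, and the names `proximal_point`, `radius`, and `iters` are illustrative assumptions. The function is nonconvex, but for mu > 1 the regularized objective is strongly convex, so the proximal point is unique.

```python
from typing import Callable


def proximal_point(f: Callable[[float], float], x: float, mu: float,
                   radius: float = 10.0, iters: int = 200) -> float:
    """Minimize f(y) + (mu/2) * (y - x)^2 over [x - radius, x + radius].

    Ternary search is valid here only because the regularized objective
    is assumed to be strictly convex on the search interval.
    """
    def objective(y: float) -> float:
        return f(y) + 0.5 * mu * (y - x) ** 2

    lo, hi = x - radius, x + radius
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def toy(y: float) -> float:
    """Hypothetical nonconvex test function: convex kink minus a quadratic."""
    return abs(y) - 0.5 * y * y


p_far = proximal_point(toy, x=1.0, mu=3.0)    # lands on the smooth branch
p_near = proximal_point(toy, x=0.2, mu=3.0)   # snaps to the kink at 0
print(p_far, p_near)
```

The second call shows the identification behaviour that motivates proximal steps: starting near the kink, the proximal point lands exactly on it.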

Corresponding to the primal track, the dual track is defined by For its properties, one can refer to [16].

The next theorem shows that the $\mathcal{V}$-steps in Algorithm 10 can be replaced by proximal steps, at least in a neighborhood of a minimizer, if Assumptions 1 and 2 hold.

Theorem 12. Suppose that Assumptions 1 and 2 hold, and that . Then for all sufficiently large and for any sequence , one has for all large , where .

Proof. Since are $C^2$, and are lower-$C^2$. Functions defined by sums and maxima are lower-$C^2$ [15, Example 10.35]; therefore is lower-$C^2$. From Lemma 2 and [15, Proposition 13.33], we have is prox-bounded and -regular. Applying the definition of and the fact , are , respectively, we have that is subdifferentially regular. So is a function with pdg structure satisfying strong transversality and prox-regular at , and , , by [16, Theorem 5.3] we get the result.

In order to define a nonconvex $\mathcal{VU}$-algorithm for problem (2), we will use a nonconvex bundle method to approximate proximal points. Many practical nonconvex bundle algorithms are modifications of some convex forerunner, with a fixed model function. Basically, such fixes consist in redefining linearization errors to enforce nonnegativity. However, the redistributed proximal bundle method for nonconvex optimization [19], based on [18], takes a different approach. It generates cutting-plane models not of the objective function, as most bundle methods do, but of a local convexification of the objective function. It deals with the augmented functions at : where denotes the convexification parameter; in the following, is the model prox-parameter and stands for the prox-parameter, which satisfies .

The bundle subroutine accumulates information from past points in the form where is some index set containing an index such that , , , , and . This information is used at each iteration to define a $\mathcal{V}$-model underestimating via the cutting-plane function. To approximate a proximal point we solve a first quadratic programming subproblem , which has the following form and properties.
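The idea of modelling a local convexification can be illustrated in one dimension. The sketch below assumes, following the general scheme of [19], that each bundle element is shifted by the convexification term (eta/2)(y - center)² before the cutting planes are built, and that the prox-parameter splits as eta + mu. The names `cutting_plane_model`, `argmin_convex_1d`, `eta`, `mu`, `center`, and the toy function are hypothetical, and the quadratic subproblem is solved by a simple ternary search instead of a QP solver.

```python
from typing import Callable, List, Tuple

# Each bundle element: (point y_i, value f(y_i), one subgradient g_i).
Bundle = List[Tuple[float, float, float]]


def cutting_plane_model(bundle: Bundle, center: float,
                        eta: float) -> Callable[[float], float]:
    """Cutting-plane model of f + (eta/2) * (. - center)^2 (assumed form)."""
    planes = []
    for y_i, f_i, g_i in bundle:
        value = f_i + 0.5 * eta * (y_i - center) ** 2   # augmented value
        slope = g_i + eta * (y_i - center)              # augmented subgradient
        planes.append((y_i, value, slope))

    def model(y: float) -> float:
        return max(v + s * (y - y_i) for y_i, v, s in planes)

    return model


def argmin_convex_1d(phi: Callable[[float], float], lo: float, hi: float,
                     iters: int = 200) -> float:
    """Ternary search; valid only for a strictly convex phi."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1) <= phi(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def f(y: float) -> float:
    return abs(y) - 0.5 * y * y            # hypothetical nonconvex function


def subgrad(y: float) -> float:
    sign = 1.0 if y > 0 else -1.0 if y < 0 else 0.0
    return sign - y                        # one generalized subgradient of f


center, eta, mu = 0.5, 1.0, 2.0
bundle = [(y, f(y), subgrad(y)) for y in (-1.0, 0.5, 2.0)]
model = cutting_plane_model(bundle, center, eta)
prox_estimate = argmin_convex_1d(
    lambda y: model(y) + 0.5 * mu * (y - center) ** 2, center - 5.0, center + 5.0)
print(prox_estimate)
```

With eta = 1 the augmented function is convex, so every shifted plane lies below it. In this toy case the three planes already reproduce the augmented function exactly, so the estimate coincides with the proximal point for the combined parameter eta + mu = 3.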

The problem has a dual Their respective solutions, denoted by and , satisfy In addition, for all such that For convenience, in the sequel we denote the output of these calculations by The vector is an estimate of a proximal point and, hence, approximates a primal track point when the latter exists. To proceed further we define new data, corresponding to a new index , by letting and computing An approximate dual track point, denoted by , is constructed by solving a second quadratic problem, which depends on a new index set: The second quadratic programming problem, denoted by , has a dual problem similar to (66), Similar to (67), the respective solutions, denoted by and , satisfy Let an active index set be defined by Then, from (74), , so for all such and for a fixed . Define a full column rank matrix by choosing the largest number of indices satisfying (76) such that the corresponding vectors are linearly independent and by letting these vectors be the columns of . Then let be a matrix whose columns form an orthonormal basis for the null-space of . And let if is vacuous.

For convenience, in the sequel we denote the output from these calculations by The bundle subprocedure is terminated and is declared to be an approximation of if Otherwise, above is replaced by and new iterate data are computed by solving the updated two quadratic programming problems above.

Now we consider a heuristic algorithm depending on the $\mathcal{VU}$-theory and the primal-dual track point approximations above.

Algorithm 13 (nonconvex $\mathcal{VU}$-algorithm for (2)).
Step 0. Select an initial starting point and a positive parameter , a convexification growth parameter . Compute the oracle values and , and the additional bundle information , with . Also, let be a matrix with orthonormal $n$-dimensional columns estimating an optimal $\mathcal{U}$-basis. Set and .
Step 1. Stop if .
Step 2. Choose an positive definite matrix , where is the number of columns of .
Step 3. Compute a $\mathcal{U}$-Newton step by solving the linear system Set .
Step 4. Initialize and run the bundle subprocedure with . Compute recursively, until satisfaction of (78). Then set .
Step 5. If then set And apply rule where Otherwise, execute a line search on the line determined by and to find thereon satisfying ; reinitialize and restart the bundle subroutine with , and set , , , to find new values for ; then set .
Step 6. Replace by and go to Step 1.

4. An Illustrative Numerical Example

Now we report numerical results to illustrate Algorithm 13. Our numerical experiment is carried out in Matlab 7.8.0 running on a PC with an Intel Core 2 Duo CPU (2.93 GHz) and 2.00 GB of memory.

We consider the following second-order cone programming problem (SOCP): where is a symmetric indefinite matrix and with .

This (SOCP) can be formulated in the form: equivalently, Let Then (SOCP) problem is equivalent to the nonlinear programming problem:
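One common way to turn a second-order cone constraint into smooth inequalities is sketched below. It assumes the cone is the set of vectors (x1, x̄) with ‖x̄‖ ≤ x1 and replaces it by the two smooth constraints ‖x̄‖² − x1² ≤ 0 and −x1 ≤ 0. Whether this matches the paper's exact reformulation is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def in_second_order_cone(x: Sequence[float]) -> bool:
    """Direct (nonsmooth) membership test: ||x_bar|| <= x_1."""
    return math.sqrt(sum(t * t for t in x[1:])) <= x[0]


def smooth_constraints(x: Sequence[float]) -> Tuple[float, float]:
    """Smooth reformulation (assumed): both values must be <= 0."""
    g1 = sum(t * t for t in x[1:]) - x[0] ** 2   # ||x_bar||^2 - x_1^2
    g2 = -x[0]                                   # x_1 >= 0
    return g1, g2


def feasible_smooth(x: Sequence[float]) -> bool:
    return all(g <= 0.0 for g in smooth_constraints(x))


samples = [
    (1.0, 0.5, 0.5),    # strictly inside the cone
    (-1.0, 0.5, 0.5),   # satisfies g1 <= 0 but violates x_1 >= 0
    (1.0, 1.0, 1.0),    # outside the cone
]
for point in samples:
    print(point, in_second_order_cone(point), feasible_smooth(point))
```

The second sample shows why the sign constraint is needed: the squared inequality alone would also accept the reflected cone.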

Let , , then the exact penalty function of this nonlinear programming problem is with .

In the implementation, the initial starting point is chosen arbitrarily, and the parameters have values and . Optimality is declared when the stopping criterion is satisfied.

Numerical results are summarized in Table 1, in which denotes the number of variables and denotes the number of evaluations of the function and one subgradient.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Project nos. 11226230, 11171138, 11171049, and 11226238 and by the General Project of the Education Department of Liaoning Province under no. L2012427.

References

1. Y. Lu, L.-P. Pang, F.-F. Guo, and Z.-Q. Xia, “A superlinear space decomposition algorithm for constrained nonsmooth convex program,” Journal of Computational and Applied Mathematics, vol. 234, no. 1, pp. 224–232, 2010.
2. C. Lemaréchal, F. Oustry, and C. Sagastizábal, “The $\mathcal{U}$-Lagrangian of a convex function,” Transactions of the American Mathematical Society, vol. 352, no. 2, pp. 711–729, 2000.
3. R. Mifflin and C. Sagastizábal, “$\mathcal{VU}$-decomposition derivatives for convex max-functions,” in Ill-Posed Variational Problems and Regularization Techniques, R. Tichatschke and M. A. Théra, Eds., vol. 477 of Lecture Notes in Economics and Mathematical Systems, pp. 167–186, Springer, Berlin, Germany, 1999.
4. C. Lemaréchal and C. Sagastizábal, “More than first-order developments of convex functions: primal-dual relations,” Journal of Convex Analysis, vol. 3, no. 2, pp. 255–268, 1996.
5. R. Mifflin and C. Sagastizábal, “On $\mathcal{VU}$-theory for functions with primal-dual gradient structure,” SIAM Journal on Optimization, vol. 11, no. 2, pp. 547–571, 2000.
6. R. Mifflin and C. Sagastizábal, “Functions with primal-dual gradient structure and $\mathcal{U}$-Hessians,” in Nonlinear Optimization and Related Topics, G. Pillo and F. Giannessi, Eds., vol. 36 of Applied Optimization, pp. 219–233, Kluwer Academic Publishers, 2000.
7. R. Mifflin and C. Sagastizábal, “Primal-dual gradient structured functions: second-order results; links to epi-derivatives and partly smooth functions,” SIAM Journal on Optimization, vol. 13, no. 4, pp. 1174–1194, 2003.
8. R. Mifflin and C. Sagastizábal, “A $\mathcal{VU}$-algorithm for convex minimization,” Mathematical Programming B, vol. 104, no. 2-3, pp. 583–608, 2005.
9. F. Shan, L.-P. Pang, L.-M. Zhu, and Z.-Q. Xia, “A $\mathcal{UV}$-decomposed method for solving an MPEC problem,” Applied Mathematics and Mechanics, vol. 29, no. 4, pp. 535–540, 2008.
10. Y. Lu, L.-P. Pang, J. Shen, and X.-J. Liang, “A decomposition algorithm for convex nondifferentiable minimization with errors,” Journal of Applied Mathematics, vol. 2012, Article ID 215160, 15 pages, 2012.
11. A. Daniilidis, C. Sagastizábal, and M. Solodov, “Identifying structure of nonsmooth convex functions by the bundle technique,” SIAM Journal on Optimization, vol. 20, no. 2, pp. 820–840, 2009.
12. W. L. Hare, “A proximal method for identifying active manifolds,” Computational Optimization and Applications, vol. 43, no. 2, pp. 295–306, 2009.
13. W. L. Hare, “Functions and sets of smooth substructure: relationships and examples,” Computational Optimization and Applications, vol. 33, no. 2-3, pp. 249–270, 2006.
14. R. Mifflin, L. Qi, and D. Sun, “Properties of the Moreau-Yosida regularization of a piecewise $C^{2}$ convex function,” Mathematical Programming A, vol. 84, no. 2, pp. 269–281, 1999.
15. R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, vol. 317 of Fundamental Principles of Mathematical Sciences, Springer, Berlin, Germany, 1998.
16. R. Mifflin and C. Sagastizábal, “$\mathcal{VU}$-smoothness and proximal point results for some nonconvex functions,” Optimization Methods & Software, vol. 19, no. 5, pp. 463–478, 2004.
17. S. Lang, Real and Functional Analysis, Springer, New York, NY, USA, 3rd edition, 1993.
18. W. Hare and C. Sagastizábal, “Computing proximal points of nonconvex functions,” Mathematical Programming B, vol. 116, no. 1-2, pp. 221–258, 2009.
19. W. L. Hare and C. Sagastizábal, “A redistributed proximal bundle method for nonconvex optimization,” SIAM Journal on Optimization, vol. 20, no. 5, pp. 2442–2473, 2010.

Copyright

Copyright © 2013 Yuan Lu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
