Abstract

We present a convergence analysis of a new decomposition method for solving structured optimization problems. The proposed scheme is based on a class of modified Lagrangians combined with the allocation of resources decomposition algorithm. Under mild assumptions, we show that the method generates convergent primal-dual sequences.

1. Introduction

Interest in large-scale optimization problems [1] has grown over the past twenty years and is likely to keep increasing during the upcoming decades: on the one hand, the systemic approach to modeling real systems results in more and more complicated models; on the other hand, the increasing computing power of microprocessors, together with recent advances in parallel architectures, keeps pushing back the limits of the practical treatment of these models. Decomposition methods (including splitting, partitioning, and parallel methods) are practical candidates for solving large problems whose internal structure makes it possible to identify “weakly coupled subsystems.” It should be clear that reducing the dimension of the original problem is not the only motivation for decomposing it into subproblems. Other important motivations are as follows:
(i) partitioning heterogeneous models when it is the juxtaposition of various parts of the model that makes its numerical treatment difficult (as in mixed models with continuous and discrete variables) [2];
(ii) decentralizing the decision-making, so that the decomposition procedure can lead to autonomous subsystems capable of computing their own part of the global solution independently, without the need for a centralized decision level [2, 3];
(iii) parallelizing the computation among different processors [1, 2].

Our objective in this paper is to present a family of decomposition algorithms based on proximal-like techniques, which are suitable for decentralized and parallelized computations. The algorithm can be seen as a separable version of the nonlinear rescaling principle method [4] and is closely related to three known techniques: the Partial Inverse method proposed by Spingarn in 1985 [5, 6], the Alternating Direction Method of Multipliers [7, 8], originally proposed by Gabay and Mercier [9] for the splitting of variational inequalities, and the Separable Augmented Lagrangian Algorithm (SALA) developed by Hamdi [10–12].

We will use a simplified framework to present the algorithm, called the φ-SALA. It is tailored to minimizing a convex separable function subject to separable constraints. A nice feature of the φ-SALA is that it preserves separability, as each iteration splits into a proximal step on the dual function and a projection step onto the subspace. Moreover, it yields the proximal decomposition on the graph of a maximal monotone operator introduced by Mahey et al. [13] for a quadratic choice of the auxiliary functions.

The first drawback associated with the classical quadratic multiplier method (augmented Lagrangian [14, 15]) and/or the nonlinear rescaling principle algorithm is that the augmented Lagrangian function, as well as the rescaled Lagrangian function, is no longer separable even when the original problem is separable. In other words, when applied to separable constraints like , the terms or are no longer separable. However, a careful reformulation of the problem (e.g., by introducing additional variables) may preserve some of the given separable structure, thus giving the augmented Lagrangian framework a chance to again play an important role in the development of efficient decomposition schemes.
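
To make the loss of separability concrete, consider a coupling constraint of the generic form $\sum_i g_i(x_i)\le 0$ (notation chosen here only for illustration). Squaring it, as the quadratic augmented Lagrangian does, produces cross terms,
$$\Bigl(\sum_i g_i(x_i)\Bigr)^{2}=\sum_i g_i(x_i)^{2}+\sum_{i\neq j} g_i(x_i)\,g_j(x_j),$$
and the products $g_i(x_i)\,g_j(x_j)$ couple the blocks, so the penalized function no longer splits over $i$.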

The second drawback associated with multiplier methods is that the augmented Lagrangian is only once differentiable even when the problem data allow for higher differentiability, which prevents the application of efficient Newton-type methods. Such a lack of continuity in the second derivative can significantly slow down the rate of convergence of these methods and may even cause algorithmic failure. One way of coping with this difficulty is to use the recently developed nonquadratic multiplier methods based on entropy-like proximal methods (see [3, 16–20]), thereby leading to multiplier methods that are twice continuously differentiable, as opposed to the classical quadratic multiplier (provided the original problem data are smooth enough). This is an important advantage, since Newton-type methods can then be applied.

With respect to the first drawback, Hamdi proposed in [10, 11] a separable augmented Lagrangian algorithm (SALA), which can be derived from the resource-directive subproblems associated with the coupling constraints (SALA was developed for both nonconvex and convex separable problems). In this paper, we alleviate the second drawback by using the nonlinear rescaling (NR) method, so as to obtain a separable augmented Lagrangian function that is at least twice continuously differentiable.

It is worth citing here some recent works, to the best of our knowledge, in which decomposition methods related to our subject were developed. The most recent one is a modification of the original algorithm (SALA) [10–12] in which Guèye et al. [21] replaced the scalar penalty parameter of [10] by a diagonal positive definite matrix of scaling factors; their convergence analysis was carried out only for affine constraints. Auslender and Teboulle [22] proposed an entropic proximal decomposition method, inducing a class of Lagrangians, for solving structured convex minimization problems and variational inequalities, based on the combination of their logarithmic-quadratic proximal point theory [23, 24] with the Chen-Teboulle decomposition scheme [25]. Kyono and Fukushima [26] proposed an extension of the Chen-Teboulle decomposition scheme [25] combined with the Bregman-based proximal point algorithm; their method was developed for solving large-scale variational inequality problems (VIPs) (see [26]; for more developments on VIPs, see [27–29] and the references therein). Hamdi and Mahey [11] proposed a stabilized version of the original SALA by using a primal proximal regularization, which yields better numerical stability, especially for nonconvex minimization problems. For other references on decomposition methods, one may refer to [5, 7, 11, 12, 25, 26, 30–34], to Hamdi’s survey on decomposition methods based on augmented Lagrangian functions [35], and to the references therein.

The remainder of this paper is organized as follows. In Section 2, we present the nonlinear rescaling principle of Polyak. In Section 3, we describe the application of the nonlinear rescaling method in conjunction with SALA to a general structured convex program, yielding the decomposition method presented in [30], which can be seen as a separable augmented Lagrangian method. Section 4 is dedicated to the extended convergence analysis of the proposed algorithm.

2. Nonlinear Rescaling Principle

Let the objective be a convex real-valued function and the constraint functions be finite concave real-valued functions, and consider the convex programming problem (). Recently, nonquadratic augmented Lagrangians have received much attention in the literature; see, for example, [3, 4, 16–20, 36] and the references therein. These methods basically rely on applying a new class of proximal-like maps to the dual of (); see [20]. In turn, they are equivalent to using the nonlinear rescaling method, as shown in [37]. In this paper we use the nonlinear rescaling principle to construct smooth Lagrangian decomposition methods. Thus, we begin by summarizing this approach; for details see [4, 18, 37] and the references therein.

The main idea of the nonlinear rescaling principle (called here the NR method) is to consider a class of strictly concave, sufficiently smooth scalar functions with particular properties and to use them to transform the constraint terms of the classical Lagrangian. At each step, the NR method alternates the unconstrained minimization of the classical Lagrangian of the equivalent problem with the Lagrange multiplier update. It allows a wide class of augmented Lagrangian methods to be generated (for instance, the exponential multiplier method, the modified log-barrier method, and the log-sigmoid multiplier method [18]).

Let us consider the following class of functions, defined on and satisfying the following properties:
(P1) , for all ;
(P2) , for all ;
(P3) ;
(P4) ;
(P5) , for all and , for all , where ;
(P6) ;
(P7) .

Note that, from (P2) and (P3), the functions and are one-to-one and is strictly concave. Let ; then, for any , we have
The nonlinear rescaling principle is based on the idea of transforming the original problem () into an equivalent problem, namely, one which has the same set of optimal solutions as (). To this end, let us consider a parameterized transformed problem written as follows:
Clearly, problem ([Ck,φ]) is also convex and has the same feasible set as (). The nonlinear rescaling iterations are based on the classical Lagrangian function, denoted here by and associated with problem ([Ck,φ]), that is,
and can be summarized as follows.

Algorithm 2.1. Given , , , generate the sequence :
Find .
Update , .
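
For orientation, we sketch the generic form these objects take in Polyak's NR framework [4, 37]; the notation below (objective $f$, concave constraints $c_i(x)\ge 0$, transformation $\varphi$, penalty parameter $k$, multipliers $\lambda$) is ours and is meant only to illustrate the construction, not to restate the exact formulas above:
$$[\mathcal{C}_k,\varphi]\colon\quad \min_x\ f(x)\quad\text{s.t.}\quad \tfrac{1}{k}\,\varphi\bigl(k\,c_i(x)\bigr)\ \ge\ 0,\quad i=1,\dots,m,$$
$$\mathcal{L}_k(x,\lambda)\ =\ f(x)\ -\ \frac{1}{k}\sum_{i=1}^{m}\lambda_i\,\varphi\bigl(k\,c_i(x)\bigr),\qquad \lambda_i^{k+1}\ =\ \lambda_i^{k}\,\varphi'\bigl(k\,c_i(x^{k+1})\bigr).$$
In this sketch, $\varphi(0)=0$ and $\varphi$ is strictly increasing, so $c_i(x)\ge 0$ if and only if $\tfrac{1}{k}\varphi(k\,c_i(x))\ge 0$, which is why the transformed problem has the same feasible set as the original one.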

Remark 2.2. Note that the multipliers are nonnegative for all by (P1). It is also worth mentioning that the penalty parameter may be changed at each iteration . In [37], the authors propose a dynamic scaling parameter update as follows: . This update will be used in our decomposition scheme in the next section.
The NR algorithm generates a wide class of augmented Lagrangians. Typical examples include the choices leading to the exponential multiplier method or to the modified log-barrier function method. The convergence analysis of Algorithm 2.1 is given in [4, 18, 37, 38].
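
As a concrete illustration, the following minimal numerical sketch runs Algorithm 2.1 with the exponential transformation $\varphi(t)=1-e^{-t}$, one of the standard NR choices mentioned above, on a small convex toy problem of our own. The toy data, the fixed penalty parameter, and the use of a generic quasi-Newton solver are assumptions made only for this example.

```python
# Minimal sketch of Algorithm 2.1 with the exponential transformation
# phi(t) = 1 - exp(-t); the toy problem, the solver, and the fixed penalty
# parameter k are illustrative assumptions, not the paper's setup.
import numpy as np
from scipy.optimize import minimize

def f(x):                      # convex objective
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

def c(x):                      # concave constraint; feasibility means c(x) >= 0
    return 1.0 - x[0] - x[1]

def phi(t):                    # exponential NR transformation
    return 1.0 - np.exp(-t)

def dphi(t):
    return np.exp(-t)

def rescaled_lagrangian(x, lam, k):
    # classical Lagrangian of the rescaled problem: f(x) - (lam/k) * phi(k * c(x))
    return f(x) - (lam / k) * phi(k * c(x))

lam, k, x = 1.0, 5.0, np.zeros(2)
for it in range(30):
    # Step 1: unconstrained minimization of the rescaled Lagrangian
    x = minimize(rescaled_lagrangian, x, args=(lam, k), method="BFGS").x
    # Step 2: multiplier update  lam <- lam * phi'(k * c(x))
    lam = lam * dphi(k * c(x))

print(x, lam)   # expected to approach x* = (0, 1) and lam* = 2 on this toy problem
```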

3. Separable Modified Lagrangian Algorithm

In this section, we recall the generalization of the separable augmented Lagrangian algorithm (SALA) (see [10, 11]), called φ-SALA and proposed in [30], to solve large-scale convex inequality-constrained programs with separable structure. We are concerned here with block-separable nonlinear constrained optimization problems:
where are all convex functions, and is the convex set on which the are defined, from , for , , . Throughout this work, all the functions , are , and we assume the following.
(A1) The optimal set of () is nonempty and bounded.
(A2) Slater's condition holds, that is,

Now, to construct our decomposition algorithm, we use allocation vectors to obtain the equivalent problem (SPy) (if is an optimal solution to (SPy), then is an optimal solution to ()), to which we apply the nonlinear rescaling principle with partial elimination of the constraints; we mean that only the constraints are replaced by the equivalent ones . Then, for any , the following minimization problem:
is equivalent to (SPy), and according to Algorithm 2.1, for all , where , we have the following iterative scheme:
where denotes the classical Lagrangian of ([SPy,φ]) given by
where .
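
For readers unfamiliar with the resource-allocation device used above, a generic version of this reformulation (written in our own illustrative notation, with $N$ blocks and one vector of coupling constraints) reads
$$\min_{x,\,y}\ \sum_{i=1}^{N} f_i(x_i)\quad\text{s.t.}\quad g_i(x_i)\ \le\ y_i,\ \ i=1,\dots,N,\qquad y\in\mathcal{A}:=\Bigl\{y:\ \sum_{i=1}^{N} y_i=0\Bigr\},$$
so that, once the allocation $y$ is fixed, each block sees only its own constraint, and the coupling between blocks is carried entirely by the subspace constraint $y\in\mathcal{A}$.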

The minimization in (3.3) is carried out by alternating the minimization with respect to and the minimization with respect to the allocation variable ; that is, we fix and find
Then we can split the above minimization into independent subproblems of low dimension, that is,
Now we fix and solve for
The following lemma gives an important link between the allocation variable and the Lagrange dual variable.

Lemma 3.1. According to (3.9), and are orthogonal and satisfy (3.10) and (3.11), where .

Proof. For any , by writing the classical Lagrangian of (3.9), where , , and using the optimality conditions together with (3.4), we show that , which means that does not depend on ; that is, , for all , , and can be replaced by for all .
Now, according to (P1) and after straightforward calculations, we reach (3.10). Equation (3.11) is obtained directly from (3.4) and (3.10). The orthogonality of the vectors and is immediate.

One can also observe that the penalty parameters belong to the set , and finally our algorithm (φ-SALA) can be stated as follows.

Algorithm 3.2. We have the following steps.
Step 1. Select , where , , , , where , and where , , , .
Step 2. Determine, for any :
Step 3. If , , where
then stop; else, go to Step 4.
Step 4. Update and go back to Step 2:
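
To fix ideas, the following structural sketch mimics the loop of Algorithm 3.2 on a two-block toy problem of our own, again with the exponential transformation. The toy data, the fixed penalty parameter, the inner solver, and the particular allocation and multiplier updates written below are illustrative assumptions: in this special setting (equal multipliers across blocks, as in Lemma 3.1, and this particular $\varphi$), centering the constraint values happens to be the exact minimizer of the allocation step over the subspace, but the algorithm's actual formulas are the ones given above.

```python
# Structural sketch of the loop of Algorithm 3.2 with phi(t) = 1 - exp(-t)
# on a two-block toy problem: min (x1-2)^2 + (x2-4)^2 s.t. x1 + x2 <= 2,
# rewritten as g_i(x_i) = x_i - 1 <= y_i with y1 + y2 = 0 (illustrative data).
import numpy as np
from scipy.optimize import minimize_scalar

f = [lambda x: (x - 2.0) ** 2, lambda x: (x - 4.0) ** 2]   # block objectives
g = [lambda x: x - 1.0, lambda x: x - 1.0]                  # block constraint maps

def phi(t):                 # exponential NR transformation
    return 1.0 - np.exp(-t)

def dphi(t):
    return np.exp(-t)

N, k = 2, 5.0
lam = np.ones(N)            # one multiplier per block (they remain equal, cf. Lemma 3.1)
y = np.zeros(N)             # allocation vector, kept on the subspace sum(y) = 0
x = np.zeros(N)

for it in range(100):
    # (a) block step: each subproblem min_x f_i(x) - (lam_i/k) * phi(k*(y_i - g_i(x)))
    #     involves only block i and could be solved in parallel
    for i in range(N):
        sub = lambda xi, i=i: f[i](xi) - (lam[i] / k) * phi(k * (y[i] - g[i](xi)))
        x[i] = minimize_scalar(sub).x
    # (b) allocation step: with equal multipliers and this phi, minimizing over
    #     {y : sum(y) = 0} amounts to centering the block constraint values
    gx = np.array([g[i](x[i]) for i in range(N)])
    y = gx - gx.mean()
    # (c) multiplier step: lam_i <- lam_i * phi'(k*(y_i - g_i(x_i)))
    lam = lam * dphi(k * (y - gx))

print(x, lam)   # should approach x ≈ (0, 2) and lam ≈ (4, 4) on this toy instance
```

Step (a) is the part that can be distributed across processors, which is the separability feature emphasized in the Introduction.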

The following proposition gives us some properties of .

Proposition 3.3. (1) is strictly convex in for any , .
(2) For any KKT point of (SPy), one has (i) , (ii) , (iii) , where and .

Proof. (1) Since , , , are concave and strictly concave, respectively, and is increasing, is strictly convex in for any , , .
(2) Let be any KKT point of (SPy).
(i) From the complementarity condition we have
and then we get
(ii) We have
Similarly, if , then , and therefore . This means that
(iii) From (ii) we can calculate . At , we have
Let ; then
that is, the first column is the gradient , and so on. Let
that is, the diagonal is . Then
and then

Remark 3.4. The analysis presented in this paper differs from the short one given in [30]; here, our analysis is made possible by the powerful tool of recession functions.
Assumption (A1) can be written in terms of recession functions ( , for all , for convex, proper, and lower semicontinuous functions; in our case the functions are convex, proper, and continuous).

Further, since is not identically , [39, Theorem 9.3, page 77] allows us to reformulate (A1) as follows:
The next proposition shows that the minimization subproblems are solvable.

Proposition 3.5. If is nonempty and bounded, then for any , and

Proof. To this end, we need to show that for any . According to Proposition 2.1 in [40] and the corresponding result in [41], we can express the recession function of as follows:
where .
If we denote , , then , and by using we have
Now, since , the above relation becomes
and finally, using (3.31), the proof is complete.

4. Convergence Analysis

In this section, we present the convergence analysis of the sequence for a wide class of constraint transformations under some assumptions on the input data. To this end, we give the following two main propositions.

Proposition 4.1. Under   and , the dual sequence is bounded.

Proof. Let and let the vector be such that , for all , . Then it is easy to see that
where denotes the perturbation function associated with ([SPy,φ]) and defined by
By adding and subtracting the term , for with the same structure as , and setting , we obtain
that is,
which can also be rewritten as follows:
and then we have
Using (P1), (P3), and (P4), and taking , , we can show that when is not feasible; it is easy to see that the minimum in this case equals zero and we have , for all . Thus, using the fact that is concave, is bounded, and belongs to the dual level set, the sequence is bounded.

Proposition 4.2. Under , , the primal sequence is bounded.

Proof. Let , and fix and . It is clear that , and using the feasibility of , we obtain directly
Now, let us assume that the primal sequence is unbounded; then the sequence is bounded and .
Let and such that and take such that
By dividing both sides of (4.8) by , we get , and according to (4.11), the monotonicity of , and denoting , (4.12) becomes
Since the dual sequence is bounded, in the limit we have
Since , and using , we can rewrite (4.14) as follows:
Since , (4.15) is equivalent to
If , then , and then .
If then .
Now, by letting and , we deduce that , and since is nonempty and is bounded, then , which contradicts the fact that . Thus, our assumption is false, and the primal sequence is bounded.

Proposition 4.3. Consider the sequences generated by SALA, and assume that there exists a primal solution to the original problem (). Then the following inequality holds:
where

Proof. See [30].

Proposition 4.4. Consider the sequences generated by SALA, and assume that there exists a saddle point of the Lagrangian associated with problem (). Then the following inequality holds:

Proof. See [30].

Proposition 4.5. All three sequences , , and generated by Algorithm 3.2 are bounded.

Theorem 4.6. Let and be the respective limit points of the bounded sequences and generated by SALA. Then one has the following properties: (i) ; (ii) ; (iii) .

Proof. See [30].

Theorem 4.7. If the assumptions , , and are satisfied, then one has the following.
(1) Any limit point () of the sequence is in the set .
(2) The sequences and are convergent, and

Proof. Let be any limit point of the sequence ; then there exists a subsequence converging to . Without loss of generality, we may assume . For such that , we have
and since , , , it follows that
Since , we have . Therefore, . Then
and then
therefore
Now, we will prove that the set is bounded. Since is bounded, there exists such that . Let be the closed ball with center zero and radius ; then
Since is continuous for all , the set is closed and bounded for all . Then there exists such that
Let = max ; then
Therefore, , for all , for all , which means that is a bounded set.
Thus, for such that , and using the above result, we get
that is,
Also, if is a limit point of , and keeping in mind (see the proof given by Polyak in [38]) that , then
On the other hand, from (4.32) we get
Since the Lagrangian of problem () is , and it is convex, from (4.34) we obtain
and then
Also, we know that , for all , and for any , which implies
and then , for all , for all , and by the saddle point theorem we have
Since is an increasing sequence bounded above by , it is convergent. Let be any convergent subsequence of . Then, from (3.4),
Since
it follows that converges to . The rest of the proof is similar to the one given in [30].

In the previous theorem, we proved the boundedness of the primal and dual sequences; for the sequence of allocation vectors, the boundedness can be obtained directly by proving that the set is bounded, in the same way as we proved that the set is bounded.