Abstract

The alternating direction method of multipliers (ADMM) is one of the most powerful and successful methods for solving various nonconvex consensus problems. The convergence of the conventional (i.e., 2-block) ADMM for convex objective functions has been established for a long time. As an acceleration technique, the inertial effect has been used by many authors to solve 2-block convex optimization problems. This paper combines ADMM and the inertial effect to construct an inertial alternating direction method of multipliers (IADMM) for solving the multiblock nonconvex consensus problem and shows its convergence under suitable conditions. A simulation experiment verifies the effectiveness and feasibility of the proposed method.

1. Introduction

The nonconvex global consensus problem with regularization [1] has the following form:
$$\min_{x} \ \sum_{i=1}^{N} f_i(x) + g(x), \quad \text{s.t.} \ x \in X, \tag{1}$$
where the $f_i$ are smooth, possibly nonconvex functions, $g$ is a convex nonsmooth regularization term, and $X$ is a closed convex set. This problem is related to the heavily studied convex global consensus problem [2], but here the $f_i$ may be nonconvex.

In many practical applications, each $f_i$ needs to be handled by a single agent, such as a thread or a processor. We therefore transform problem (1) into the following equivalent linearly constrained problem with the help of the new variables $x_1, \dots, x_N$:
$$\min_{x_0, x_1, \dots, x_N} \ \sum_{i=1}^{N} f_i(x_i) + g(x_0), \quad \text{s.t.} \ x_i = x_0, \ i = 1, \dots, N, \ x_0 \in X. \tag{2}$$

Note that problem (2) has $N$ blocks with different variables $x_1, \dots, x_N$ and one global variable $x_0$. Then, each distributed agent can handle a single local variable $x_i$ and a local function $f_i$, respectively.

The augmented Lagrangian function of problem (2), with multipliers $\lambda = (\lambda_1, \dots, \lambda_N)$, is defined as follows:
$$L_{\beta}(x_1, \dots, x_N, x_0, \lambda) = \sum_{i=1}^{N} \Big[ f_i(x_i) + \langle \lambda_i, x_i - x_0 \rangle + \frac{\beta}{2} \| x_i - x_0 \|^2 \Big] + g(x_0), \tag{3}$$
where $\beta > 0$ is a penalty parameter, and problem (2) can be solved distributively by the following classical ADMM procedure:
$$\begin{cases} x_i^{k+1} = \arg\min_{x_i} L_{\beta}(x_1^k, \dots, x_i, \dots, x_N^k, x_0^k, \lambda^k), & i = 1, \dots, N, \\ x_0^{k+1} = \arg\min_{x_0 \in X} L_{\beta}(x_1^{k+1}, \dots, x_N^{k+1}, x_0, \lambda^k), & \\ \lambda_i^{k+1} = \lambda_i^k + \beta (x_i^{k+1} - x_0^{k+1}), & i = 1, \dots, N. \end{cases} \tag{4}$$
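To make the structure of scheme (4) concrete, the following Python sketch runs the classical consensus ADMM loop for generic smooth local terms and a proximable regularizer. It is a minimal sketch, assuming user-supplied callables grad_f[i] and prox_g; the inner gradient loop and its step size are illustrative choices, not the paper's exact subproblem solver.

```python
import numpy as np

def consensus_admm(grad_f, prox_g, n, beta=1.0, iters=200, tol=1e-6):
    """Sketch of the classical consensus ADMM, scheme (4).

    grad_f : list of callables, grad_f[i](x) = gradient of f_i at x   (assumed)
    prox_g : callable, prox_g(v, t) = argmin_u g(u) + ||u - v||^2/(2t) (assumed)
    """
    N = len(grad_f)
    x = np.zeros((N, n))      # local blocks x_1, ..., x_N
    x0 = np.zeros(n)          # global consensus variable
    lam = np.zeros((N, n))    # multipliers lambda_1, ..., lambda_N
    for _ in range(iters):
        # Local updates: minimize f_i(x_i) + <lam_i, x_i - x0> + beta/2 ||x_i - x0||^2,
        # approximated here by a few gradient steps (illustrative choice).
        for i in range(N):
            for _ in range(20):
                x[i] -= 0.1 / beta * (grad_f[i](x[i]) + lam[i] + beta * (x[i] - x0))
        # Global update: the x0-subproblem reduces to a prox of g at the
        # multiplier-corrected average of the local blocks.
        x0 = prox_g(np.mean(x + lam / beta, axis=0), 1.0 / (N * beta))
        # Dual ascent on the consensus constraints x_i = x0.
        lam += beta * (x - x0)
        if np.linalg.norm(x - x0) < tol:   # primal residual as stopping test
            break
    return x0, x, lam
```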

ADMM was initially introduced in the 1970s [3, 4], and its convergence properties for the convex case have been extensively studied. However, ADMM or its directly extended version may not converge when there is a nonconvex function in the objective. Yang et al. [5] studied the convergence of ADMM for a nonconvex optimization model arising from background/foreground extraction. Hong et al. [6] analyzed the convergence of the alternating direction method of multipliers for a family of nonconvex problems. Guo et al. [7] studied the convergence of ADMM for multiblock nonconvex separable optimization models.

Recently, some scholars have studied inertial variants of ADMM for convex optimization. For example, Chen et al. [8] analyzed a class of inertial ADMM for linearly constrained separable convex optimization, and Moudafi and Elissabeth [9] extended the inertial technique to solve the maximal monotone operator inclusion problem. Research interest in the nonconvex case has been increasing in recent years; e.g., Chao et al. [10] proposed and analyzed an inertial proximal ADMM for a class of nonconvex optimization problems. However, all the above inertial ADMM algorithms were presented for two-block optimization problems only, not for the multiblock case. Whether the convergence of inertial ADMM is still guaranteed when the number of blocks is more than two is an important question to study.

The purpose of the present study is to examine the convergence of a multiblock inertial ADMM for the nonconvex consensus problem under the assumption that the potential function satisfies the Kurdyka–Łojasiewicz property. Preliminary numerical results show the effectiveness of the proposed algorithm.

The rest of this paper is organized as follows. In Section 2, some necessary preliminaries for further analysis are summarized. Section 3 proposes a multiblock nonconvex inertial ADMM algorithm and analyzes its convergence under suitable conditions. In Section 4, we verify the effectiveness of the algorithm through a numerical experiment. Finally, some conclusions are drawn in Section 5.

2. Preliminaries

Let $\mathbb{R}^n$ denote the $n$-dimensional Euclidean space, $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ denote the extended real number set, and $\mathbb{N}$ denote the natural number set. $\|\cdot\|$ represents the Euclidean norm. Let $\operatorname{dom} f = \{x : f(x) < +\infty\}$ denote the domain of a function $f$ and $\langle \cdot, \cdot \rangle$ denote the inner product. For a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, if $\liminf_{x \to \bar{x}} f(x) \ge f(\bar{x})$, we say that $f$ is lower semicontinuous at $\bar{x}$. If $f$ is lower semicontinuous at every point $\bar{x} \in \mathbb{R}^n$, we say that $f$ is a lower semicontinuous function.

For a set $S \subseteq \mathbb{R}^n$ and a point $x \in \mathbb{R}^n$, let $\operatorname{dist}(x, S) = \inf \{ \|x - y\| : y \in S \}$. If $S = \emptyset$, we set $\operatorname{dist}(x, S) = +\infty$ for all $x$.

The Lagrangian function of (2), with multiplier $\lambda = (\lambda_1, \dots, \lambda_N)$, is defined as
$$L(x_1, \dots, x_N, x_0, \lambda) = \sum_{i=1}^{N} \big[ f_i(x_i) + \langle \lambda_i, x_i - x_0 \rangle \big] + g(x_0). \tag{5}$$

Definition 1. If $(x_1^*, \dots, x_N^*, x_0^*, \lambda^*)$ is such that
$$\nabla f_i(x_i^*) + \lambda_i^* = 0, \quad \sum_{i=1}^{N} \lambda_i^* \in \partial g(x_0^*), \quad x_i^* = x_0^*, \quad i = 1, \dots, N, \tag{6}$$
then $(x_1^*, \dots, x_N^*, x_0^*, \lambda^*)$ is called a critical point or stationary point of the Lagrange function $L$.
A very important technique for proving the convergence of ADMM for nonconvex optimization problems relies on the assumption that the potential function satisfies the following Kurdyka–Łojasiewicz property (KL property) [11–14].
For notational simplicity, we use $\Phi_\eta$ to denote the set of concave functions $\varphi : [0, \eta) \to [0, +\infty)$ such that (i) $\varphi(0) = 0$, $\varphi$ is continuously differentiable on $(0, \eta)$ and continuous at $0$; (ii) $\varphi'(s) > 0$ for all $s \in (0, \eta)$.

Definition 2 (see [14]) (KL property). Let $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ be a proper lower semicontinuous function. If there exist $\eta \in (0, +\infty]$, a neighborhood $U$ of $\bar{x}$, and a function $\varphi \in \Phi_\eta$, such that for all $x \in U \cap \{ x : f(\bar{x}) < f(x) < f(\bar{x}) + \eta \}$, it holds that
$$\varphi' \big( f(x) - f(\bar{x}) \big) \operatorname{dist} \big( 0, \partial f(x) \big) \ge 1, \tag{7}$$
then $f$ is said to have the KL property at $\bar{x}$.
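As a simple illustration (an example we add here, not taken from the original text), the smooth function $f(x) = x^2$ has the KL property at $\bar{x} = 0$ with $\eta = +\infty$ and $\varphi(s) = 2\sqrt{s}$:

```latex
% f(x) = x^2, \bar{x} = 0, \varphi(s) = 2\sqrt{s}, so \varphi'(s) = 1/\sqrt{s}:
\varphi'\bigl(f(x) - f(0)\bigr)\,\operatorname{dist}\bigl(0, \partial f(x)\bigr)
  = \frac{1}{\sqrt{x^{2}}}\,\lvert 2x \rvert
  = \frac{2\lvert x \rvert}{\lvert x \rvert}
  = 2 \ge 1
  \qquad \text{for all } x \neq 0.
```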

3. Algorithm and Convergence Analysis

For convenience, we fix the following notation: $x = (x_1, \dots, x_N)$ and $w^k = (x^k, x_0^k, \lambda^k)$. Based on (4), we propose the following algorithm for solving problem (2).

Algorithm 1. Inertial ADMM (IADMM). Choose $\beta > 0$ and the inertial parameters. For the given points $w^k$ and $w^{k-1}$, consider the iterative scheme (8): the local blocks $x_i$ are updated as in (8) (a) and the global variable $x_0$ as in (8) (b), each subproblem being augmented by an inertial term built from the extrapolation point associated with the two most recent iterates, followed by the multiplier update $\lambda_i^{k+1} = \lambda_i^k + \beta (x_i^{k+1} - x_0^{k+1})$ in (8) (c).
From the optimality conditions of (8) (a) and (8) (b), we obtain the first-order stationarity relations of the two subproblems, which are used repeatedly in the analysis below.
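Since the displayed form of scheme (8) does not survive in this version of the text, the following Python sketch shows one plausible realization of a per-subproblem inertial ADMM step. It is a minimal sketch, assuming the common extrapolation form $\hat{x} = x + \gamma (x - x_{\text{prev}})$ and an inertial proximal term in each subproblem; the inner gradient loop, the coupling alpha = gamma * beta, and all parameter values are illustrative assumptions, not the paper's exact updates.

```python
import numpy as np

def iadmm_step(x, x_prev, x0, x0_prev, lam, grad_f, prox_g, beta=1.0, gamma=0.3):
    """One inertial ADMM pass (a sketch of one plausible form of scheme (8)).

    x, x_prev   : (N, n) current and previous local blocks
    x0, x0_prev : (n,)   current and previous global variable
    lam         : (N, n) multipliers; grad_f, prox_g as in the ADMM sketch above
    """
    N, n = x.shape
    alpha = gamma * beta            # weight of the inertial proximal term (assumed)
    x_new = np.empty_like(x)
    # (a) local updates with inertial term (alpha/2)||x_i - x_hat_i||^2,
    #     solved approximately by gradient steps (illustrative choice).
    for i in range(N):
        x_hat = x[i] + gamma * (x[i] - x_prev[i])   # extrapolation point
        z = x_hat.copy()
        for _ in range(20):
            g = grad_f[i](z) + lam[i] + beta * (z - x0) + alpha * (z - x_hat)
            z -= 0.1 / (beta + alpha) * g
        x_new[i] = z
    # (b) global update: the quadratic plus inertial terms shift the prox
    #     center toward the extrapolated point x0_hat.
    x0_hat = x0 + gamma * (x0 - x0_prev)
    v = np.mean(x_new + lam / beta, axis=0)
    center = (N * beta * v + alpha * x0_hat) / (N * beta + alpha)
    x0_new = prox_g(center, 1.0 / (N * beta + alpha))
    # (c) multiplier update on the consensus constraints x_i = x0.
    lam_new = lam + beta * (x_new - x0_new)
    return x_new, x0_new, lam_new
```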

Remark 1. Compared with the inertial ADMM in [10], each subproblem in our algorithm has an inertial term, and we handle the multiblock case here.
Subsequently, we will discuss the convergence of Algorithm 1 under the following assumptions.

Assumption 1. (i) $g$ is proper lower semicontinuous, and $\nabla f_i$ is Lipschitz continuous; i.e., there exists $L_i > 0$ such that $\| \nabla f_i(x) - \nabla f_i(y) \| \le L_i \| x - y \|$ for all $x, y$. (ii) $\beta$ is large enough that the descent constant appearing in Lemma 2 below is positive.

Lemma 1. For each $k \in \mathbb{N}$, with the quantity defined from the optimality condition of (8) (a), the difference of successive multipliers can be bounded by the differences of the successive primal iterates.

Proof. From (11), one has the stated expression; substituting it into the definition above and rearranging yields the claimed bound. Hence, the result is obtained.

Lemma 2. Select $\beta$ large enough and suppose that Assumption 1 holds. Then, for each $k \in \mathbb{N}$, the regularized augmented Lagrangian values satisfy a sufficient-descent inequality, with a descent constant determined by $\beta$, the Lipschitz constants, and the inertial parameters.

Proof. By the definition of the augmented Lagrangian function, (8) (c), and (15), we obtain a first estimate on the change of the augmented Lagrangian caused by the multiplier update. From (8) (a) and (8) (b), we obtain descent estimates for the $x_i$-updates and the $x_0$-update, respectively; combining them is then straightforward. Adding up (17) and (20) and invoking Assumption 1 (ii), we arrive at the claimed descent inequality. Then, the results are obtained.

Remark 2. From Assumption 1 (ii), we know that the descent constant in Lemma 2 is positive. Define the potential regularized augmented Lagrangian function as the augmented Lagrangian plus a quadratic regularization term built from two successive iterates.
If we take the parameters accordingly, then, from Lemma 2, we have that the whole sequence of potential function values is monotonically nonincreasing. This is important for our convergence analysis.

Lemma 3. If the sequence $\{w^k\}$ is bounded, then $\lim_{k \to \infty} \| w^{k+1} - w^k \| = 0$.

Proof. Since the sequence $\{w^k\}$ is bounded, there exists a subsequence $\{w^{k_j}\}$ such that $w^{k_j} \to w^*$. Since $g$ is lower semicontinuous and each $f_i$ is Lipschitz differentiable, the potential function is lower semicontinuous, which implies that it is bounded from below along the sequence. From Lemma 2, we know that the potential sequence is nonincreasing; thus, it is convergent.
Summing the descent inequality of Lemma 2 over $k$ shows that the squared differences of successive iterates are summable. Consequently, $\lim_{k \to \infty} \| w^{k+1} - w^k \| = 0$.

Lemma 4. There exists $\xi > 0$ such that, for each $k \in \mathbb{N}$, $\operatorname{dist} \big( 0, \partial \hat{L}(w^{k+1}) \big) \le \xi \, \| w^{k+1} - w^k \|$, where $\hat{L}$ denotes the potential function of Remark 2.

Proof. From the definition of $\hat{L}$, we compute its subdifferential at $w^{k+1}$. From Lemma 1 and the optimality conditions of the subproblems, each component of the subdifferential contains an element bounded by a multiple of the differences of successive iterates. From (29) and (30), we obtain the combined estimate, where the constant collects the penalty parameter, the Lipschitz constants, and the inertial parameters. It follows from Assumption 1 and Lemma 1 that there exists $\xi > 0$ such that the claimed bound holds for each $k$.

Lemma 5. Let $\Omega$ denote the cluster point set of $\{w^k\}$. Then, $\Omega$ is a nonempty compact set, and $\operatorname{dist}(w^k, \Omega) \to 0$ as $k \to \infty$.

Moreover, if $w^* \in \Omega$, then $w^*$ is a critical point of the Lagrangian function of problem (2), and $\hat{L}$ is finite and constant on $\Omega$, with $\lim_{k \to \infty} \hat{L}(w^k)$ equal to this constant value.

Proof. In view of the definition of $\Omega$, it is true that $\Omega$ is nonempty and compact, and $\operatorname{dist}(w^k, \Omega) \to 0$.
Let $w^* \in \Omega$. Then, there exists a subsequence $\{w^{k_j}\}$ of $\{w^k\}$ converging to $w^*$. Since $\| w^{k+1} - w^k \| \to 0$ by Lemma 3, the shifted subsequence $\{w^{k_j + 1}\}$ converges to $w^*$ as well.
From Lemma 2, the function values along the subsequence converge. Since $g$ is proper lower semicontinuous, the limit inferior of the values is bounded below by the value at $w^*$; combining the two inequalities, the values converge exactly to the value at $w^*$. Together with the continuity of $\nabla f_i$ and the closedness of $\partial g$, we can pass to the limit in the optimality conditions of the subproblems, and we obtain that $w^*$ is a critical point of the Lagrange function $L$ of problem (2).
From (37) and the argument above, the whole sequence of potential values converges; therefore, from (39) and the descent of $\hat{L}$, we obtain that $\hat{L}$ is constant on $\Omega$. Moreover, its value on $\Omega$ equals $\lim_{k \to \infty} \hat{L}(w^k)$.

Theorem 1. Let $\hat{L}$ have the KL property at each point of $\Omega$. Then, the bounded sequence $\{w^k\}$ converges to a critical point of $L$. Moreover, the sequence has finite length; that is, $\sum_{k=0}^{\infty} \| w^{k+1} - w^k \| < +\infty$.

Proof. By Lemma 5, we have $\operatorname{dist}(w^k, \Omega) \to 0$ and $\hat{L}(w^k) \to \hat{L}^*$, the constant value of $\hat{L}$ on $\Omega$. We consider the following two cases: (i) If there exists an integer $k_0$ such that $\hat{L}(w^{k_0}) = \hat{L}^*$, then, from Lemma 2, for all $k > k_0$, the descent inequality forces $\| w^{k+1} - w^k \| = 0$. Thus, for any $k > k_0$, we have $w^{k+1} = w^{k_0}$; therefore, the sequence is eventually constant, and the assertion holds. (ii) Assume that $\hat{L}(w^k) > \hat{L}^*$ for all $k$. Since $\operatorname{dist}(w^k, \Omega) \to 0$, for any given $\varepsilon > 0$, there exists $k_1$ such that $\operatorname{dist}(w^k, \Omega) < \varepsilon$ for all $k > k_1$. Again, since $\hat{L}(w^k) \to \hat{L}^*$, for given $\eta > 0$, there exists $k_2$ such that $\hat{L}^* < \hat{L}(w^k) < \hat{L}^* + \eta$ for all $k > k_2$. Thus, when $k > \bar{k} = \max\{k_1, k_2\}$, the iterates lie in the region where the KL inequality applies. In view of the fact that $\Omega$ is a nonempty compact set and $\hat{L}$ is constant on $\Omega$, by Definition 2, the KL inequality holds at $w^k$ for all $k > \bar{k}$. From the concavity of $\varphi$, the decrease of $\hat{L}$ can be transferred to the decrease of $\varphi(\hat{L}(w^k) - \hat{L}^*)$; combining this with Lemma 2 gives a bound on $\| w^{k+1} - w^k \|^2$ in terms of successive values of $\varphi$, for all $k > \bar{k}$.
Applying the inequality between the geometric and arithmetic means, from (48) and (49), we obtain an estimate in which $\| w^{k+1} - w^k \|$ is dominated by a telescoping term. Summing up this formula over $k$ and noticing that $\varphi \ge 0$, it is easy to get that $\sum_{k} \| w^{k+1} - w^k \| < +\infty$. From Lemma 1, the multiplier differences are summable as well. By Lemma 5, we conclude that the sequence $\{w^k\}$ converges to a critical point of $L$.

4. Numerical Experiment

In this section, we present the results of a simple numerical example to verify the effectiveness of Algorithm 1. We consider the compressive sensing problem, which takes the following form:
$$\min_{x} \ \frac{1}{2} \| Ax - b \|^2 + \mu \| x \|_0, \tag{56}$$
where $A \in \mathbb{R}^{m \times n}$ is a feature matrix, $b \in \mathbb{R}^m$ is a response vector, and $\mu > 0$ is a regularization parameter. In general, problem (56) is NP-hard. In order to overcome this difficulty, one may relax the $\ell_0$ norm to the $\ell_{1/2}$ quasinorm, considering the following nonconvex problem:
$$\min_{x} \ \frac{1}{2} \| Ax - b \|^2 + \mu \| x \|_{1/2}^{1/2}. \tag{57}$$

Let the local functions $f_i$ be the least-squares terms associated with a partition of the data $(A, b)$ into $N$ row blocks, and let $g$ be the regularization term in (57). We now focus on applying Algorithm 1 to solve problem (57) with suitable parameters. The iterative processes are as follows:

Simplifying the procedures in (58), we obtain closed-form iterative formulas, in which one update is given by the half shrinkage operator [16] and another by the soft shrinkage operator imposed on the entries of its argument.
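For concreteness, both shrinkage operators admit standard entrywise implementations, sketched below in Python. The soft shrinkage formula is the usual $\ell_1$ proximal map; the half shrinkage formula follows the half thresholding form of Xu et al., which is our reading of reference [16].

```python
import numpy as np

def soft_shrink(v, t):
    """Soft shrinkage: entrywise prox of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def half_shrink(v, lam):
    """Half shrinkage for l_{1/2} regularization (half thresholding form,
    our reading of [16]); applied entrywise, zero below the threshold."""
    v = np.asarray(v, dtype=float)
    out = np.zeros_like(v)
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    mask = np.abs(v) > thresh
    vm = v[mask]
    phi = np.arccos((lam / 8.0) * (np.abs(vm) / 3.0) ** (-1.5))
    out[mask] = (2.0 / 3.0) * vm * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out
```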

The experimental data are generated as follows. We use the distributed computing toolbox in MATLAB, with the purpose of achieving simple distributed computing. The entries of the feature matrix $A \in \mathbb{R}^{m \times n}$ are drawn from the standard normal distribution $N(0, 1)$. The nonzero entries of the sparse ground-truth vector are also drawn from the $N(0, 1)$ distribution, and the response vector $b$ is obtained from the linear model perturbed by a small Gaussian noise vector. The variables were initialized to zero. The primal residual measures the violation of the consensus constraints, and we stop once it falls below a prescribed tolerance. The numerical results are reported in Table 1. We report the number of iterations ("Iter.") and the computing time in seconds ("Time") for the algorithms with different parameters under the dimensions m = 2500 and n = 1000.
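The following Python sketch shows one common way to generate test data of this type; the sparsity level, noise scale, and the model b = A @ x_true + noise are illustrative assumptions, since the exact values do not survive in this version of the text.

```python
import numpy as np

def make_cs_data(m=2500, n=1000, sparsity=0.05, noise_std=0.01, seed=0):
    """Generate compressive-sensing test data (a sketch under assumed settings)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))                 # feature matrix, entries ~ N(0, 1)
    x_true = np.zeros(n)                            # sparse ground-truth vector
    support = rng.choice(n, size=int(sparsity * n), replace=False)
    x_true[support] = rng.standard_normal(support.size)
    b = A @ x_true + noise_std * rng.standard_normal(m)  # noisy responses
    return A, b, x_true
```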

The corresponding values along the iterations are plotted in Figures 1 and 2.


From Table 1 and Figures 1 and 2, we can see that ADMM converges more slowly than IADMM, since the iteration count ("Iter.") of ADMM is bigger than that of IADMM under the same conditions. Finally, the numerical results show that the algorithm is feasible and effective.

5. Conclusion

In this paper, inspired by applications of the nonconvex global consensus problem with regularization, we propose a multiblock inertial ADMM algorithm for solving certain nonconvex global consensus problems. We have proven its convergence under suitable conditions, and it turns out that any cluster point of the sequence generated by the proposed algorithm is a critical point. A numerical experiment is conducted to illustrate the effectiveness of the multiblock inertial ADMM (IADMM) algorithm. The potential of the flexible multiblock inertial ADMM for analyzing and designing other types of nonconvex problems, as well as a more thorough computational study, are topics of our future research.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (72071130; 71901145); the National Social Science Fund Major Project of China (21&ZD200; 20&ZD199); the Humanities and Social Sciences Research Project of the Ministry of Education (20YJC820030); and the China Postdoctoral Science Foundation (2021M692047).