Abstract

The alternating direction method of multipliers (ADMM) is an effective method for solving two-block separable convex problems, and its convergence is well understood. When the number of blocks exceeds two, or a nonconvex function is involved, or the structure is nonseparable, ADMM or its directly extended version may fail to converge. In this paper, we propose an ADMM-based algorithm for nonconvex multiblock optimization problems with a nonseparable structure. We show that, under mild conditions, any cluster point of the iterative sequence generated by the proposed algorithm is a critical point. Furthermore, we establish the strong convergence of the whole sequence under the condition that the potential function satisfies the Kurdyka–Łojasiewicz property. This provides a theoretical basis for applying the proposed ADMM in practice. Finally, we give some preliminary numerical results to show the effectiveness of the proposed algorithm.

1. Introduction

In this paper, we consider the following possibly nonconvex and nonsmooth optimization problem:where the variables are vectors, is differentiable, each is proper and lower semicontinuous, are given matrices, and .

The alternating direction method of multipliers (ADMM) is a very effective method for solving the convex two-block optimization problem [1, 2]. A natural idea is to extend ADMM to solve problem (1). However, ADMM or its directly extended version may not converge when either the number of blocks exceeds two, or a nonconvex function is involved, or the structure is nonseparable. Recently, there have been a few developments on this topic, e.g., [3–13].

Hong et al. [6] considered the sharing and consensus problem and showed that the classical ADMM converges to the set of stationary solutions, provided that the penalty parameter in the augmented Lagrangian is chosen sufficiently large. Li and Pong [8] studied the convergence of ADMM for some special two-block nonconvex models, where one of the matrices A and B is an identity matrix. Wang et al. [9, 10] studied the convergence of the nonconvex Bregman ADMM algorithm, which includes ADMM as a special case. Wang et al. [11] studied the convergence of ADMM for nonconvex nonsmooth optimization with a nonseparable structure. Guo et al. [4, 5] studied the convergence of classical ADMM for two-block and multiblock nonconvex models where one of the matrices is an identity matrix. Yang et al. [13] studied the convergence of ADMM for a nonconvex optimization model which comes from background/foreground extraction.

The purpose and main contribution of this paper is to propose a new variant of ADMM for the nonconvex coupled problem (1) and to prove its convergence. The novelty of this paper can be summarized as follows:(1)Compared to the existing literature, the model in this paper is more general. There is no nonseparable structure in the models considered by [4–10, 12, 13]. Wang et al. [11] considered two scenarios. If , then (1) is scenario 1 in [11]. If , for , then (1) becomes scenario 2 in [11]. Furthermore, in this paper, the matrices and B need not have full column or row rank.(2)The proposed algorithm combines linearization with regularization. These two techniques can effectively reduce the difficulty of solving the subproblems.

The rest of this paper is organized as follows. In Section 2, some basic concepts and necessary preliminaries for further analysis are summarized. In Section 3, we propose the algorithm and analyze its convergence for 3-block nonconvex and nonsmooth coupled problems. Finally, some conclusions are drawn in Section 4.

2. Preliminaries

denotes the n-dimensional Euclidean space, denotes the extended real number set, and denotes the natural number set. The image space of a matrix is defined as . denotes the Euclidean projection onto . If matrix , let denote the smallest positive singular value of the matrix . represents the Euclidean norm. is the domain of a function . . . For a set and a point , let If , we set for all . For a point-to-set mapping F, its graph is defined by

Definition 1 (see [14]). Let f be a proper function. If there exists δ > 0 such that f(y) ≥ f(x) + ⟨ξ, y − x⟩ + (δ/2)‖y − x‖² for all x, y and ξ ∈ ∂f(x), then f is called strongly convex with modulus δ.

Definition 2 (see [15]). For a convex differentiable function ϕ, the associated Bregman distance is defined as D_ϕ(x, y) = ϕ(x) − ϕ(y) − ⟨∇ϕ(y), x − y⟩. The Bregman distance plays an important role in iterative algorithms. It shares many of the nice properties of the Euclidean distance. However, the Bregman distance is not a metric, since it satisfies neither the triangle inequality nor symmetry. Some examples of Bregman distances include [16](i)Classical Euclidean distance: if ϕ(x) = ‖x‖², then D_ϕ(x, y) = ‖x − y‖²(ii)Itakura–Saito distance: if ϕ(x) = −∑ᵢ log xᵢ, then D_ϕ(x, y) = ∑ᵢ (xᵢ/yᵢ − log(xᵢ/yᵢ) − 1)(iii)Mahalanobis distance: if ϕ(x) = ⟨Qx, x⟩ with Q a symmetric positive definite matrix, then D_ϕ(x, y) = ⟨Q(x − y), x − y⟩ Let us now collect some useful properties of the Bregman distance.
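As a quick concrete check of the definition and the first two examples above, here is a small plain-Python sketch; the helper names are ours, and vectors are plain lists:

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman distance D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Classical Euclidean distance: phi(x) = ||x||^2 gives D(x, y) = ||x - y||^2.
sq_norm = lambda v: sum(t * t for t in v)
grad_sq = lambda v: [2 * t for t in v]

# Itakura-Saito distance: phi(x) = -sum_i log x_i (on the positive orthant)
# gives D(x, y) = sum_i (x_i / y_i - log(x_i / y_i) - 1).
neg_log = lambda v: -sum(math.log(t) for t in v)
grad_neg_log = lambda v: [-1.0 / t for t in v]

x, y = [1.0, 2.0], [3.0, 1.0]
d_euclid = bregman(sq_norm, grad_sq, x, y)    # equals (1-3)^2 + (2-1)^2 = 5
d_is = bregman(neg_log, grad_neg_log, x, y)   # equals 1/3 + log(3/2)
```

Note that `d_is` is positive but `bregman(neg_log, grad_neg_log, y, x)` would generally differ, illustrating the lack of symmetry mentioned above.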

Proposition 1 (see [15]). Let ϕ be a differentiable, strongly convex function with modulus δ, then(i)D_ϕ(x, y) ≥ (δ/2)‖x − y‖², and D_ϕ(x, y) = 0 if and only if x = y(ii) for all The following notations and definitions are quite standard and can be found in [14, 17].

Definition 3. Let be a proper lower semicontinuous function.(i)The subdifferential, or regular subdifferential, of f at isWhen , we set .(ii)The limiting subdifferential, or simply the subdifferential, of f at , written , is defined as(iii)A point that satisfies is called a critical point or a stationary point of the function f. The set of critical points of f is denoted by .The following proposition collects some properties of the subdifferential.

Proposition 2 (see [17]). Let and be proper lower semicontinuous functions. Then, the following holds:(i) for each . Moreover, the first set is closed and convex, while the second is closed and not necessarily convex.(ii)Let be a sequence such that it converges to . If , then .(iii)If is a local minimizer of f then .(iv)If is continuous differentiable, then .

The Lagrangian function of (1), with multiplier , is defined as

Definition 4. If such thatthen is called a critical point or stationary point of the Lagrange function .
A very important technique for proving the strong convergence of ADMM for nonconvex optimization problems relies on the assumption that the potential function satisfies the Kurdyka–Łojasiewicz (KL) property [18–21]. Many functions satisfy this inequality. In particular, when the function belongs to certain function classes, e.g., semialgebraic, real subanalytic, and log-exp functions (see [22–24]), it is often elementary to check that such an inequality holds.
For notational simplicity, we use to denote the set of concave functions such that(i)(ii) is continuously differentiable on and continuous at 0(iii)The KL property can be described as follows.

Definition 5 (see [18–21]) (KL property). Let f be a proper lower semicontinuous function. If there exist , a neighborhood U of , and a function , such that for all , it holds thatthen f is said to have the KL property at .
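In the notation of [18–21], the KL inequality in Definition 5 is usually written in the following standard form (a sketch: here x̄ is the reference point, U its neighborhood, η > 0, and φ is the concave desingularizing function from the class defined above):

```latex
\varphi'\bigl(f(x) - f(\bar{x})\bigr)\,\operatorname{dist}\bigl(0, \partial f(x)\bigr) \ge 1
\quad \text{for all } x \in U \cap \{x : f(\bar{x}) < f(x) < f(\bar{x}) + \eta\}.
```

Intuitively, the inequality says that f can be "desingularized" by φ near x̄, so that the subgradients of φ∘(f − f(x̄)) stay bounded away from zero.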

Lemma 1 (see [22]) (uniformized KL property). Suppose that f is a proper lower semicontinuous function and Ω is a compact set. If f is constant on Ω and satisfies the KL property at each point of Ω, then there exist , and such thatfor all

Lemma 2 (see [25]) (descent lemma). Let f be a continuously differentiable function whose gradient is Lipschitz continuous with modulus L, then for any x, y, we have f(y) ≤ f(x) + ⟨∇f(x), y − x⟩ + (L/2)‖y − x‖².
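The descent lemma can be checked numerically on a simple instance. The choice f = sin (whose gradient cos is 1-Lipschitz, since |sin''| ≤ 1) and the sample points below are our assumptions for the illustration:

```python
import math

def descent_bound_holds(x, y, L=1.0):
    """Check f(y) <= f(x) + f'(x)(y - x) + (L/2)(y - x)^2 for f = sin,
    whose derivative cos is Lipschitz with modulus L = 1."""
    lhs = math.sin(y)
    rhs = math.sin(x) + math.cos(x) * (y - x) + 0.5 * L * (y - x) ** 2
    return lhs <= rhs + 1e-12  # tiny tolerance for floating-point rounding

# The bound holds at every pair of points, including far-apart ones.
pairs = [(0.0, 1.0), (2.0, -3.0), (0.5, 0.50001), (-1.0, 4.0)]
all_ok = all(descent_bound_holds(x, y) for x, y in pairs)
```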

Lemma 3 (see [26]). Let M be a nonzero matrix and let λ_M denote the smallest positive eigenvalue of MMᵀ. Then, for every u ∈ Im(M), there holds ‖Mᵀu‖² ≥ λ_M‖u‖².

3. Algorithm and Convergence

For the convenience of analysis, we only consider the case of . The obtained results could naturally be generalized to the case of . Thus, in the rest of this paper, we consider the following nonconvex and nonsmooth 3-block optimization problem:where is proper and lower semicontinuous but possibly nonconvex, is differentiable, , and .

In this paper, we present the following algorithm for (12).

Algorithm 1. LBADMM: start with and . With the given iteration point , the new iteration point is given as follows:where , , and are the Bregman distances associated with , , and ϕ, respectively.
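The update pattern of LBADMM — sequential (Gauss–Seidel) minimization of the augmented Lagrangian with a proximal Bregman-type term on each block, followed by a dual ascent step on the multiplier — can be sketched on a toy instance. Everything concrete below is our assumption for the illustration, not the authors' exact scheme: the strongly convex model min ½(x−a)² + ½(y−b)² + ½(z−c)² s.t. x + y + z = d, the Euclidean proximal term with weight μ, the penalty β, and the closed-form scalar subproblem solutions.

```python
def lbadmm_toy(a, b, c, d, beta=1.0, mu=1.0, iters=2000):
    """Gauss-Seidel ADMM with a proximal (Euclidean-Bregman) term on
    min 0.5(x-a)^2 + 0.5(y-b)^2 + 0.5(z-c)^2  s.t.  x + y + z = d.
    Each subproblem is a scalar quadratic, solved in closed form."""
    x = y = z = lam = 0.0
    for _ in range(iters):
        # x-step: argmin 0.5(x-a)^2 + lam*x + beta/2*(x+y+z-d)^2 + mu/2*(x-x_old)^2;
        # the RHS uses the previous x as the proximal center.
        x = (a - lam - beta * (y + z - d) + mu * x) / (1 + beta + mu)
        y = (b - lam - beta * (x + z - d) + mu * y) / (1 + beta + mu)
        z = (c - lam - beta * (x + y - d) + mu * z) / (1 + beta + mu)
        lam += beta * (x + y + z - d)  # dual ascent on the multiplier
    return x, y, z, lam

x, y, z, lam = lbadmm_toy(1.0, 2.0, 3.0, 3.0)
residual = abs(x + y + z - 3.0)  # feasibility residual
```

For this instance the KKT point is (x, y, z) = (0, 1, 2) with multiplier λ = 1, and the iterates approach it while the feasibility residual decays.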

Remark 1. Due to the different structure of the problem, the algorithm in this paper differs from the existing algorithms. In order to exploit the properties of the differentiable blocks and simplify the computation in each iteration, we linearize the differentiable part in the and subproblems. If the function depends only on the variable y, that is, , then the algorithm LBADMM reduces to the Bregman ADMM in [9, 10]. Unlike [9, 10], we do not assume that B has full row rank.
In this section, we always assume that the sequence is generated by algorithm LBADMM. Let , where denotes the smallest positive eigenvalue of .

Assumption 1. (i) is -Lipschitz continuous, i.e., for all (ii) and (iii) are Lipschitz continuous with moduli , respectively(iv) is strongly convex with modulus , and (v)The following lemma establishes the relationship between the dual variable and the original variables.

Lemma 4. For each ,

Proof. By Assumption 1 (ii) and Lemma 3, we haveThe optimality condition of the y-subproblem in (14) yieldsTaking into account , one hasThus,It follows from the abovementioned formula and (17) thatThe proof is completed.
The augmented Lagrangian function with multiplier of (12) is defined aswhere is the Lagrangian function of (12). LetLetThe following lemma implies the monotonicity of the sequence .

Lemma 5. For each ,where

Proof. From (17), we haveAdding up the abovementioned three formulas, we haveand hencethat is,From Lemma 2, Assumption 1 (iv), and Proposition 1, we obtainRecall thatAdding the two formulas above, we haveTogether with (14), we obtainwhich implies thatThat is, (23) holds.

Remark 2. From Assumption 1 (iv), we have and . Furthermore, from Assumption 1 (v), we have .

Lemma 6. If the sequence is bounded, then

Proof. Since is bounded, the sequence is bounded and there exists a subsequence such that Since are lower semicontinuous and is Lipschitz differentiable, the function is lower semicontinuous, which leads toThus, is bounded from below. From Lemma 5, is nonincreasing. Thus, is convergent. Furthermore, is also convergent and for each k. By Lemma 5, we haveFrom the abovementioned formula, we obtainNote that and the arbitrariness of t, we obtainIn view of (14), we haveThus,

Lemma 7. There exists such thatwhere

Proof. From the definition of , we haveFrom (14) and the optimality conditions, one hasThat is,From (43) and (45), we havewhereThus,It follows from Assumption 1 and Lemma 4 that there exists a such thatThe following theorem shows that the algorithm LBADMM has global convergence.

Theorem 1. Let denote the cluster point set of , then(i) is a nonempty compact set, and (ii)If , then (iii) is finite and constant on and equal to

Proof. (i)By the definition of , it is trivial.(ii)Let , then there exists a subsequence of converging to . Since , Since and , LetIt follows from (14) that and . Thus,Noting that and are lower semicontinuous, we haveIt follows from the abovementioned four formulas thatTogether with the continuity of and the closedness of , taking the limit in (45) along the subsequence yieldsThat is, is a critical point of the Lagrange function L of (12).(iii)From (53) and Lemma 5, we haveFrom (55) and the descent of , we obtainTherefore, is constant on . Moreover,
The following theorem is the main result of this paper.

Theorem 2 (strong convergence). Suppose that Assumption 1 holds, satisfies the property at each point of , then(i)(ii) converges to a critical point of

Proof. From Theorem 1, we have for all . We consider two cases.(i)If there exists an integer such that . From Lemma 5, we haveThus, for any we have Hence, for any it follows that and the assertion holds.(ii)Assume that for all . Since , it follows that for any given there exists such that , for all . Since , for given there exists such that , for all . Consequently, when ,Since is a nonempty compact set and is constant on , applying Lemma 1, we haveFrom Lemma 7, one hasFrom the concavity of , we haveThus, associating with Lemma 5 and , we haveFor convenience, we set Thus,That is,By the fact , we obtainwhich along with (64) yieldsSumming up the abovementioned formula for yieldsNotice that ; thus,whereThus,By Lemma 4, one has . Furthermore, . Consequently is a Cauchy sequence. The assertion then follows immediately from Theorem 1.

Remark 3. In this section, the main conclusions are based on the boundedness assumption of the sequences . The following conclusion shows that we only need to assume that the sequence is bounded.

Proposition 3. If and the sequence is bounded, then is bounded.

Proof. From (17), one hasSince is bounded, is bounded. By Assumption 1 (ii) and Lemma 3, we haveThus, is bounded.
Next, we present a sufficient condition for the boundedness of the sequence , which is similar to Lemma 8 in [4].

Lemma 8. Let be the sequence generated by Algorithm 1. Suppose that and there exists such that

Ifthen is bounded.

Proof. From Lemma 5, we know thatThen, combining with , we obtainNoting that , we haveUnder the assumptions, one can easily observe that , and are all bounded. The boundedness of follows from Proposition 3. Therefore, is bounded.

4. Numerical Results

In compressed sensing, a fundamental problem is recovering an n-dimensional sparse signal x from a set of m incomplete measurements. In such a case, one needs to find the sparsest solution of a linear system, which can be modeled aswhere is the measurement matrix, is the observed data, is a regularization parameter, and denotes the number of nonzero elements of x. In general, the abovementioned model is NP-hard. To overcome this difficulty, one can relax the regularization to regularization, and some scholars solve the following problem instead of problem (78) [10, 27]:where

Based on (79), we construct the following problems:

In order to verify the effectiveness of the algorithm LBADMM, we now apply it to solve the nonconvex optimization problem (81). Applying the algorithm LBADMM to problem (81) with and , we havewhere is the half shrinkage operator [27] defined as withwith
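The half shrinkage operator is the proximal mapping of the ℓ₁/₂ penalty. Rather than reproduce the closed-form expression from [27], the sketch below evaluates the underlying scalar proximal problem argminₛ ½(s − t)² + λ|s|^{1/2} by brute-force grid search; the function name, grid size, and search interval are our assumptions for the illustration.

```python
def half_prox_scalar(t, lam, grid=200001, span=None):
    """Numerically evaluate argmin_s 0.5*(s - t)**2 + lam*abs(s)**0.5
    over a fine symmetric grid (an illustrative stand-in for the
    closed-form half shrinkage operator)."""
    span = span if span is not None else max(1.0, 2 * abs(t))
    best_s, best_v = 0.0, 0.5 * t * t  # candidate s = 0 (always feasible)
    for i in range(grid):
        s = -span + 2 * span * i / (grid - 1)
        v = 0.5 * (s - t) ** 2 + lam * abs(s) ** 0.5
        if v < best_v:
            best_s, best_v = v and s, v
        if v < best_v:
            best_s, best_v = s, v
    return best_s

z_small = half_prox_scalar(0.1, 1.0)  # below the threshold: mapped to zero
z_big = half_prox_scalar(10.0, 1.0)   # large input: only slightly shrunk
```

The two sample evaluations illustrate the thresholding behavior that makes the operator useful for sparse recovery: small inputs are set exactly to zero, while large inputs are shrunk only slightly toward zero.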

In the experiment, we choose , and normalize the columns to have unit norm. The variable is generated with 100 nonzero entries, each sampled from a Gaussian distribution. The variables , , , and were initialized to zero. The vector , where . We set , , , and , and the regularization parameter . It is easy to verify that these parameters satisfy Assumption 1 (v). Defining the residual at iteration k as , a reasonable termination criterion is that the residual be small, so we choose the stopping criterion as
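The data generation described above can be sketched as follows (plain Python; the dimensions m = 30, n = 60 and sparsity k = 5 are scaled down from the experiment, and all names are our assumptions):

```python
import random

def make_instance(m=30, n=60, k=5, seed=0):
    """Generate a toy compressed-sensing instance: a Gaussian measurement
    matrix with columns normalized to unit norm, a k-sparse Gaussian
    signal, and the observation b = A @ x."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    for j in range(n):  # normalize column j to unit Euclidean norm
        nrm = sum(A[i][j] ** 2 for i in range(m)) ** 0.5
        for i in range(m):
            A[i][j] /= nrm
    x = [0.0] * n
    for j in rng.sample(range(n), k):  # place k nonzero Gaussian entries
        x[j] = rng.gauss(0.0, 1.0)
    b = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    return A, x, b

A, x, b = make_instance()
col0 = sum(A[i][0] ** 2 for i in range(30)) ** 0.5  # unit column norm
nnz = sum(1 for t in x if t != 0.0)                 # sparsity of the signal
```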

The numerical results are reported in Table 1. The codes were written in MATLAB R2016a, and the program was run on a computer with Windows 10, an Intel(R) Core(TM) i7-6500U 2.5 GHz CPU, and 8 GB of memory. We report the number of iterations (“Iter.”), the computing time in seconds (“Time”), and the objective function value (“f-val”). The numerical results show that the algorithm LBADMM is stable and effective.

A part of the computational results is presented in Figures 1–3. In each figure, we plot the trend of the objective value (“objective-value”) and the trend of the residual defined by (“”).

5. Conclusions

We propose a new algorithm, called linearized Bregman ADMM, for the three-block optimization problem with a nonseparable structure. The proposed algorithm integrates linearization and regularization techniques. We show that any cluster point of the sequence generated by the proposed algorithm is a critical point. Under the condition that the potential function satisfies the Kurdyka–Łojasiewicz property and the penalty parameter is larger than a constant, the strong convergence of the algorithm is proved. Preliminary numerical results show that the algorithm LBADMM is stable and effective.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Natural Science Foundation of China (nos. 11601095 and 11771383) and Natural Science Foundation of Guangxi Province (nos. 2016GXNSFBA380185 and 2016GXNSFDA380019).