Abstract

In this paper, we propose an accelerated proximal point algorithm for the difference of convex (DC) optimization problem by combining the extrapolation technique with the proximal difference of convex algorithm. By making full use of the special structure of the DC decomposition and the information of the stepsize, we prove that the proposed algorithm converges at a rate of $O(1/k^2)$ under mild conditions. Numerical experiments show the superiority of the proposed algorithm over some existing algorithms.

1. Introduction

The difference of convex problem (DCP) is an important class of nonlinear programming problems in which the objective function is expressed as the difference of convex (DC) functions. It finds numerous applications in digital communication systems [1], assignment and power allocation [2], compressed sensing [3–6], and so on [7–13].

It is well known that the classical method for solving the DCP is the so-called difference of convex algorithm (DCA) [14], in which the concave part of the objective function is replaced by a linear majorant and a convex optimization subproblem is solved at each iteration. Note that the difficulty of the involved subproblem relies heavily on the DC decomposition of the objective function, and the subproblem can be solved easily when the objective function can be written as the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function, and a continuous concave function [15]. Motivated by this, Gotoh et al. [16] proposed the so-called proximal difference of convex algorithm (PDCA) for solving the DCP, in which the concave part is replaced by a linear majorant and the smooth convex part is replaced by a quadratic majorant at each iteration. Furthermore, if the proximal mapping of the proper closed convex function can be computed easily, then the subproblem involved in the PDCA can be solved efficiently. However, when the concave part of the objective is void, the PDCA reduces to the proximal gradient algorithm, which may be slow in practice [17]. In fact, since the convergence rate of the PDCA depends heavily on the Lojasiewicz exponent of the objective function, the PDCA converges only linearly in general [18, 19]. To accelerate the convergence of the PDCA, researchers have turned to the well-known extrapolation technique to design efficient algorithms [20–24]. This technique has been used extensively to accelerate proximal-type algorithms for convex programming [25, 26], improving the convergence rate from $O(1/k)$ to $O(1/k^2)$. Motivated by this, Wen et al. [27] proposed the proximal difference of convex algorithm with extrapolation (PDCAE) for solving the DCP. The numerical experiments in [27] show that the PDCAE performs well in practice, although it converges only linearly in theory [27]. A question thus arises naturally: can we design a new variant of the PDCA whose convergence rate can be improved in theory? This constitutes the motivation of the paper.

In this paper, inspired by the work in [20–23, 27], we establish an accelerated proximal DC programming algorithm (APDCA) for the DCP by combining the extrapolation technique with the PDCA. In the algorithm, the current iteration point is replaced by a linear combination of the previous two iterates, and the extrapolation technique is incorporated into the stepsize. By making full use of the special structure of the DC decomposition and the information of the stepsize, we prove that the APDCA converges at a rate of $O(1/k^2)$ under mild conditions. Numerical experiments show its superiority over some existing algorithms.

The remainder of the paper is organized as follows. In Section 2, we describe the DC optimization problem considered in this paper and present the newly designed algorithm. In Section 3, we establish the global convergence of the algorithm and its $O(1/k^2)$ convergence rate. Numerical experiments are provided in Section 4. Conclusions are drawn in Section 5.

To end this section, we recall some definitions used in the subsequent analysis [28–30].

For an extended real-valued function $f:\mathbb{R}^n \to (-\infty, +\infty]$, we denote its domain by $\operatorname{dom} f = \{x \in \mathbb{R}^n : f(x) < +\infty\}$. The function $f$ is said to be strongly convex if there exists a $\sigma > 0$ such that
$$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y) - \frac{\sigma}{2}\lambda(1-\lambda)(x-y)^{T}I(x-y)$$
for all $x, y \in C$ and $\lambda \in (0,1)$, where $C$ is a convex set and $I$ is the identity matrix. The function $f$ is said to be proper if it never equals $-\infty$ and $\operatorname{dom} f \neq \emptyset$. Moreover, a proper function is closed if it is lower semicontinuous. A proper closed function $f$ is said to be level-bounded if the lower level sets of $f$ are bounded; that is, the sets $\{x \in \mathbb{R}^n : f(x) \le r\}$ are bounded for any $r \in \mathbb{R}$. Given a proper closed function $f$, the limiting subdifferential of $f$ at $x \in \operatorname{dom} f$ is given as follows:
$$\partial f(x) = \Big\{\xi \in \mathbb{R}^n : \exists\, x^k \xrightarrow{f} x \text{ and } \xi^k \to \xi \text{ with } \liminf_{y \to x^k,\, y \neq x^k} \frac{f(y) - f(x^k) - \langle \xi^k, y - x^k\rangle}{\|y - x^k\|} \ge 0 \text{ for each } k\Big\},$$
where $x^k \xrightarrow{f} x$ means $x^k \to x$ and $f(x^k) \to f(x)$. Note that $\operatorname{dom} \partial f \subseteq \operatorname{dom} f$. It is well known that the (limiting) subdifferential reduces to the classical subdifferential in convex analysis when $f$ is a convex function; that is,
$$\partial f(x) = \{\xi \in \mathbb{R}^n : f(y) \ge f(x) + \langle \xi, y - x\rangle \text{ for all } y \in \mathbb{R}^n\}.$$
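As a simple illustration of the classical subdifferential (a standard example, not specific to this paper), consider the $\ell_1$ norm $f(x) = \|x\|_1$, a typical nonsmooth convex function in sparse optimization; its subdifferential can be computed coordinatewise:
$$\big(\partial \|x\|_1\big)_i = \begin{cases} \{\operatorname{sign}(x_i)\}, & x_i \neq 0,\\ [-1,\, 1], & x_i = 0, \end{cases} \qquad i = 1, \dots, n.$$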

Furthermore, if $f$ is continuously differentiable, then the (limiting) subdifferential reduces to the gradient of $f$, denoted by $\nabla f$.

2. Algorithms for DC Programming

Consider the following difference of convex program:
$$\min_{x \in \mathbb{R}^n} F(x) := f(x) + g(x) - h(x), \qquad (3)$$
where $f$ is a strongly convex function with modulus $\sigma > 0$, $g$ is a smooth convex function whose gradient $\nabla g$ is Lipschitz continuous with constant $L_g > 0$, and $h$ is a continuous convex function that is Lipschitz continuous with constant $L_h > 0$.

For the DCP, the classical DCA takes the following iterative scheme [14]:
$$x^{k+1} \in \arg\min_{x \in \mathbb{R}^n} \big\{ f(x) + g(x) - \langle \xi^k, x \rangle \big\}, \qquad \xi^k \in \partial h(x^k).$$

By replacing the concave part in the objective function by a linear majorant and replacing the smooth convex part by a quadratic majorant, Gotoh et al. [16] proposed a proximal DCA for the DCP. For the sake of completeness, we list Algorithm 1 as follows.

Initial step. Take an initial point $x^0 \in \mathbb{R}^n$, the constant $L_g > 0$, and set $k := 0$.
Iterative step. Compute the new iterate $x^{k+1}$ by the following iterative scheme:
$$x^{k+1} = \arg\min_{x \in \mathbb{R}^n} \Big\{ f(x) + \langle \nabla g(x^k) - \xi^k,\, x - x^k \rangle + \frac{L_g}{2}\|x - x^k\|^2 \Big\}, \qquad \xi^k \in \partial h(x^k),$$
until the stopping criterion is satisfied, where $L_g$ is the Lipschitz constant of $\nabla g$.
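To make the update in Algorithm 1 concrete, the following Python sketch implements one possible realization of the PDCA iteration under the decomposition $F = f + g - h$ used above; the oracle names (`prox_f`, `grad_g`, `subgrad_h`), the stopping rule, and the closed-form proximal step are illustrative assumptions rather than part of the original algorithm description.

```python
import numpy as np

def pdca(prox_f, grad_g, subgrad_h, L_g, x0, max_iter=5000, tol=1e-6):
    """Minimal sketch of the proximal DCA for F(x) = f(x) + g(x) - h(x).

    prox_f(v, t) : proximal mapping of t*f at v (assumed available in closed form)
    grad_g(x)    : gradient of the smooth convex part g
    subgrad_h(x) : a subgradient of the continuous convex part h
    L_g          : Lipschitz constant of grad_g
    """
    x = x0.copy()
    for _ in range(max_iter):
        xi = subgrad_h(x)                       # linearize the concave part -h at x^k
        v = x - (grad_g(x) - xi) / L_g          # gradient-type step on g - <xi, .>
        x_new = prox_f(v, 1.0 / L_g)            # proximal step on the strongly convex part f
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x)):
            return x_new
        x = x_new
    return x
```

Each iteration requires only one gradient evaluation of $g$, one subgradient of $h$, and one proximal step on $f$, which is why the subproblem is easy whenever the proximal mapping of $f$ is available in closed form.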

Although only a simple subproblem is involved at each iteration, the PDCA is potentially slow [19, 27]. To accelerate its convergence, we incorporate the extrapolation technique into the PDCA and obtain the following algorithm (Algorithm 2).

Initial step. Take $x^1 = x^0 \in \mathbb{R}^n$, with $t_0 = t_1 = 1$, the constant $L_g > 0$, a tolerance $\epsilon > 0$, and set $k := 1$.
Iterative step. Compute the new iterate $x^{k+1}$ by the following iterative scheme:
$$y^k = x^k + \frac{t_{k-1} - 1}{t_k}\big(x^k - x^{k-1}\big), \qquad \xi^k \in \partial h(y^k),$$
$$x^{k+1} = \arg\min_{x \in \mathbb{R}^n} \Big\{ f(x) + \langle \nabla g(y^k) - \xi^k,\, x - y^k \rangle + \frac{L_g}{2}\|x - y^k\|^2 \Big\},$$
$$t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2},$$
until the stopping criterion is satisfied.
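Analogously, the following is a minimal Python sketch of the APDCA, assuming the same oracles as in the PDCA sketch above and the FISTA-type update of $t_k$; the stopping rule and parameter defaults are illustrative assumptions.

```python
import numpy as np

def apdca(prox_f, grad_g, subgrad_h, L_g, x0, max_iter=5000, tol=1e-6):
    """Minimal sketch of the APDCA (PDCA with extrapolation) for
    F(x) = f(x) + g(x) - h(x); oracles and stopping rule are assumptions."""
    x_prev = x0.copy()
    x = x0.copy()
    t_prev, t = 1.0, 1.0
    for _ in range(max_iter):
        y = x + (t_prev - 1.0) / t * (x - x_prev)    # extrapolated point y^k
        xi = subgrad_h(y)                            # linearize -h at y^k
        v = y - (grad_g(y) - xi) / L_g
        x_new = prox_f(v, 1.0 / L_g)                 # proximal step on f
        t_prev, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x)):
            return x_new
        x_prev, x = x, x_new
    return x
```

Compared with the PDCA sketch, the only changes are the extrapolated point $y^k$ and the update of the extrapolation parameter $t_k$.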

3. Convergence Analysis of the APDCA

In this section, we establish the global convergence of the algorithm and its convergence rate. To continue, we first recall the following conclusions.

Lemma 1 (see [25]). Let $\varphi$ be a continuously differentiable function whose gradient $\nabla \varphi$ is Lipschitz continuous with constant $L_\varphi > 0$. Then, for any $x, y \in \mathbb{R}^n$, it holds that
$$\varphi(y) \le \varphi(x) + \langle \nabla \varphi(x), y - x \rangle + \frac{L_\varphi}{2}\|y - x\|^2.$$

Lemma 2. Let . For the sequence generated by the APDCA, it holds that

Proof. Since is a strongly convex function, there exists a constant such that
where .
Connecting the fact that is Lipschitz continuous with constant with Lemma 1, we have
where , which means that
It follows from the convexity of that
Connecting (7) and (9) with (10), we have
On the other hand, since is convex, it follows that
which means that
Connecting the fact that is Lipschitz continuous with constant with Lemma 1, we have
where .
Summing (13) and (14), we have
Adding to both sides of (15) yields
By taking , (16) yields that
By the optimality conditions of (8), one has
that is,
Then, for , it follows from (11) and (17) that
where the first equality follows from (19), the second equality follows from the fact that , and the last inequality follows from . This proves conclusion (6).
Before proceeding further, we need the following conclusions.

Lemma 3 (see [25, 31]). Let $t_1 = 1$ and let $\{t_k\}$ be the sequence generated by (6). Then, $\{t_k\}$ is increasing, and $t_k \ge \frac{k+1}{2}$ for all $k \ge 1$.
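For completeness, a short induction (assuming the standard update $t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2}$ with $t_1 = 1$, as in Algorithm 2) gives the stated lower bound:
$$t_1 = 1 = \frac{1+1}{2}, \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2} \ge \frac{1 + 2t_k}{2} = \frac{1}{2} + t_k \ge \frac{1}{2} + \frac{k+1}{2} = \frac{k+2}{2},$$
so $t_k \ge \frac{k+1}{2}$ for all $k \ge 1$; monotonicity also follows since $t_{k+1} \ge t_k + \frac{1}{2} > t_k$.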

Lemma 4. Let $\{x^k\}$ be a sequence generated by the APDCA. Then,
where , , and is a critical point of problem (3).

Proof. From (7) and (6), we have . Then, it follows that
Hence, to show the assertion, we only need to show that
In fact, by taking , one has from Lemma 2 that
Hence,
Using Lemma 2 again, one has from that
that is,
Multiplying (25) by and (27) by , respectively, and summing them yields
where the first equality follows from the fact that and the last equality follows by some manipulation. The desired result follows.
Now, we are ready to show the convergence rate of the APDCA.

Theorem 1. For the sequence generated by the APDCA, it holds that
where is a stationary point of (3).

Proof. Using the notation of Lemma 4, let , and it follows from (27) that
Hence,
Then, from Lemma 4, we know that the sequence is nonincreasing. Therefore,
where the second inequality follows from and , and the last equality follows from .
Then, it follows from Lemma 3 that
The desired result follows.

4. Numerical Experiments

In this section, we evaluate the performance of the APDCA by applying it to DC regularized least squares problems. We compare the performance of the APDCA with the algorithm in [15] (PDCA) and the GIST algorithm in [32].

For the APDCA and the PDCA, we set and . For GIST, we set . We initialize the three algorithms at the origin and terminate them when

Furthermore, we terminate the PDCA when the number of iterations exceeds 5000 (denoted by “max” in the tables).

Example 1. Least squares problems with a DC regularizer are as follows:
where , and is the regularization parameter.
This problem takes the form of (3) with , , and . Note that the purpose of adding is to ensure strong convexity of .
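As an illustration of how a problem of this type can be cast in the form (3), the following Python sketch builds the oracles required by the APDCA sketch in Section 2 for the $\ell_{1\text{-}2}$ regularized least squares problem $\min_x \frac{1}{2}\|Ax-b\|^2 + \lambda(\|x\|_1 - \|x\|_2)$; both the choice of regularizer and the particular DC decomposition (including the parameter `mu` used to induce strong convexity) are assumptions made for illustration and are not taken from the reported experiments.

```python
import numpy as np

# Illustrative oracles for a DC regularized least squares problem of the form (3):
#   F(x) = 0.5*||Ax - b||^2 + lam*||x||_1 - lam*||x||_2   (l1-2 penalty, example only)
# with the (assumed) decomposition
#   f(x) = lam*||x||_1 + (mu/2)*||x||^2   (strongly convex, prox-friendly)
#   g(x) = 0.5*||Ax - b||^2               (smooth, L_g = ||A||_2^2)
#   h(x) = lam*||x||_2 + (mu/2)*||x||^2   (continuous convex)

def make_oracles(A, b, lam, mu=1e-4):
    L_g = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of grad g

    def grad_g(x):
        return A.T @ (A @ x - b)

    def prox_f(v, t):
        # prox of t*(lam*||.||_1 + (mu/2)*||.||^2): scaled soft-thresholding
        return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0) / (1.0 + t * mu)

    def subgrad_h(x):
        nrm = np.linalg.norm(x)
        s = lam * x / nrm if nrm > 0 else np.zeros_like(x)   # subgradient of lam*||.||_2
        return s + mu * x

    return prox_f, grad_g, subgrad_h, L_g
```

With these oracles, a call such as `apdca(*make_oracles(A, b, lam), np.zeros(A.shape[1]))` would run the accelerated scheme on a given instance.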
To compare the performance of the three algorithms, we report the number of iterations (denoted by Iter), the CPU time in seconds (denoted by CPU time), the sparsity of the solution (denoted by sparsity), and the function value at termination (denoted by fval), averaged over 30 random instances. The numerical results are reported in Tables 1 and 2, from which we can see that the APDCA always outperforms the PDCA and GIST. Specifically, from Table 1, we can see that the APDCA is about 2.5 times faster than GIST and about 5.2 times faster than the PDCA. From Table 2, we can see that the APDCA is about 2.1 times faster than GIST and about 8.4 times faster than the PDCA. Tables 1 and 2 also show that the APDCA requires fewer iterations than the other two algorithms; in particular, the number of iterations of the APDCA is only a fraction of that of GIST in both tables. Meanwhile, Tables 1 and 2 show that the solution given by the APDCA is sparser than those given by GIST and the PDCA.

Example 2. Least squares problems with a logarithmic regularizer are as follows:
where is a constant, and is the regularization parameter.
This problem takes the form of (3) with , , and . Note that the purpose of adding is to ensure strong convexity of . For this example, we set .
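Similarly, a possible instantiation of the oracles for a logarithmic penalty of the form $\lambda \sum_i \log(1 + |x_i|/\varepsilon)$ is sketched below; again, the penalty form, the DC decomposition, and the parameters `eps` and `mu` are illustrative assumptions rather than the exact setting of this example.

```python
import numpy as np

# Illustrative oracles for least squares with a logarithmic penalty
#   F(x) = 0.5*||Ax - b||^2 + lam * sum_i log(1 + |x_i|/eps)
# written in the form (3) via the (assumed) decomposition
#   f(x) = (lam/eps)*||x||_1 + (mu/2)*||x||^2
#   g(x) = 0.5*||Ax - b||^2
#   h(x) = sum_i [ (lam/eps)*|x_i| - lam*log(1 + |x_i|/eps) ] + (mu/2)*||x||^2

def make_log_oracles(A, b, lam, eps=0.5, mu=1e-4):
    L_g = np.linalg.norm(A, 2) ** 2

    def grad_g(x):
        return A.T @ (A @ x - b)

    def prox_f(v, t):
        # prox of t*((lam/eps)*||.||_1 + (mu/2)*||.||^2)
        return np.sign(v) * np.maximum(np.abs(v) - t * lam / eps, 0.0) / (1.0 + t * mu)

    def subgrad_h(x):
        # the first part of h is differentiable, with i-th partial derivative
        # lam * x_i / (eps * (eps + |x_i|))
        return lam * x / (eps * (eps + np.abs(x))) + mu * x

    return prox_f, grad_g, subgrad_h, L_g
```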
To compare the performance of the three algorithms, we report the number of iterations (denoted by Iter), the CPU time in seconds (denoted by CPU time), the sparsity of the solution (denoted by sparsity), and the function value at termination (denoted by fval), averaged over 30 random instances. The numerical results are reported in Tables 3 and 4, from which we can see that the APDCA always outperforms the PDCA and GIST. Specifically, from Table 3, we can see that the APDCA is about 1.9 times faster than GIST and about 8.3 times faster than the PDCA. From Table 4, we can see that the APDCA is about 1.6 times faster than GIST and about 11.3 times faster than the PDCA. Tables 3 and 4 also show that the APDCA requires fewer iterations than the other two algorithms; in particular, the number of iterations of the APDCA is only a fraction of those of GIST and the PDCA. Meanwhile, Tables 3 and 4 show that the solution given by the APDCA is sparser than those given by GIST and the PDCA.

5. Conclusions

In this paper, we propose an accelerated proximal point algorithm for the difference of convex optimization problem by combining the extrapolation technique with the proximal difference of convex algorithm. By making full use of the special structure of the DC decomposition and the information of the stepsize, we prove that the proposed algorithm converges at a rate of $O(1/k^2)$ under mild conditions. The numerical experiments show the superiority of the proposed algorithm over some existing algorithms.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

The authors equally contributed to this paper and read and approved the final manuscript.

Acknowledgments

This project was supported by the Natural Science Foundation of China (grant nos. 11801309, 11901343, and 12071249).