Multivariate Spectral Gradient Algorithm for Nonsmooth Convex
Optimization Problems

Hu, Yaping

doi:https://doi.org/10.1155/2015/145323

Mathematical Problems in Engineering


Research Article | Open Access

Volume 2015 | Article ID 145323 | https://doi.org/10.1155/2015/145323

Academic Editor: Dapeng P. Du

Received 20 Apr 2015

Revised 04 Jul 2015

Accepted 05 Jul 2015

Published 16 Jul 2015

Abstract

We propose an extended multivariate spectral gradient algorithm to solve the nonsmooth convex optimization problem. First, by using Moreau-Yosida regularization, we convert the original objective function to a continuously differentiable function; then we use approximate function and gradient values of the Moreau-Yosida regularization to substitute the corresponding exact values in the algorithm. The global convergence is proved under suitable assumptions. Numerical experiments are presented to show the effectiveness of this algorithm.

1. Introduction

Consider the unconstrained minimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a nonsmooth convex function. The Moreau-Yosida regularization [1] of $f$ at $x$ associated with $\lambda$ is defined by
$$F(x) = \min_{z \in \mathbb{R}^n} \left\{ f(z) + \frac{1}{2\lambda} \|z - x\|^2 \right\}, \tag{2}$$
where $\|\cdot\|$ is the Euclidean norm and $\lambda$ is a positive parameter. The function minimized on the right-hand side is strongly convex, so it has a unique minimizer for every $x$. Under some reasonable conditions, the gradient function of $F$ can be proved to be semismooth [2, 3], though generally $F$ is not twice differentiable. It is widely known that the problem
$$\min_{x \in \mathbb{R}^n} F(x) \tag{3}$$
and the original problem (1) are equivalent in the sense that their solution sets coincide. The following proposition shows some properties of the Moreau-Yosida regularization function $F$.

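Proposition 1 (see Chapter XV in [1]). The Moreau-Yosida regularization function $F$ is convex, finite-valued, and differentiable everywhere with gradient
$$g(x) \equiv \nabla F(x) = \frac{x - p(x)}{\lambda}, \tag{4}$$
where
$$p(x) = \arg\min_{z \in \mathbb{R}^n} \left\{ f(z) + \frac{1}{2\lambda} \|z - x\|^2 \right\} \tag{5}$$
is the unique minimizer in (2). Moreover, for all $x, y \in \mathbb{R}^n$, one has
$$\|g(x) - g(y)\| \le \frac{1}{\lambda} \|x - y\|. \tag{6}$$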
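As a small illustration of Proposition 1, the following sketch evaluates the Moreau-Yosida regularization for the one-dimensional function f(z) = |z|, which is an example chosen here and does not appear in the paper. For this f the unique minimizer p(x) is the soft-thresholding operator, so F and its gradient (x - p(x))/λ are available in closed form. The function names `prox_abs`, `moreau_env`, and `moreau_grad` are hypothetical.

```python
def prox_abs(x, lam):
    """Unique minimizer p(x) of |z| + (z - x)**2 / (2*lam) (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_env(x, lam):
    """Moreau-Yosida regularization F(x) of f(z) = |z| (illustrative choice of f)."""
    p = prox_abs(x, lam)
    return abs(p) + (p - x) ** 2 / (2.0 * lam)


def moreau_grad(x, lam):
    """Gradient of F, computed as (x - p(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam


if __name__ == "__main__":
    lam = 0.5
    for x in (-2.0, -0.2, 0.0, 0.3, 1.5):
        print(x, moreau_env(x, lam), moreau_grad(x, lam))
```

In this example F is the Huber function: quadratic near the kink of |z| and linear away from it, and both f and F attain their minimum at 0, which matches the equivalence of the two minimization problems.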

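This proposition shows that the gradient function $g$ is Lipschitz continuous with modulus $1/\lambda$. In this case, $g$ is differentiable almost everywhere by the Rademacher theorem; the B-subdifferential [4] of $g$ at $x$ is then defined by
$$\partial_B g(x) = \left\{ V \in \mathbb{R}^{n \times n} : V = \lim_{x_k \to x} \nabla g(x_k),\ x_k \in D_g \right\}, \tag{7}$$
where $D_g = \{x \in \mathbb{R}^n : g \text{ is differentiable at } x\}$, and the next property of BD-regularity holds [4–6].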

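Proposition 2. If $g$ is BD-regular at $x$, then (i) all matrices $V \in \partial_B g(x)$ are nonsingular; (ii) there exist a neighborhood $N$ of $x$, $\kappa_1 > 0$, and $\kappa_2 > 0$ such that, for all $y \in N$, one has
$$d^T V d \ge \kappa_1 \|d\|^2, \qquad \|V^{-1}\| \le \kappa_2, \qquad \forall d \in \mathbb{R}^n,\ V \in \partial_B g(y). \tag{8}$$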

In practical computation, we often use approximate values of the function $F$ and the gradient $g$ instead of the exact values, because $p(x)$ is difficult and sometimes impossible to compute precisely. Suppose that, for any $\varepsilon > 0$ and for each $x \in \mathbb{R}^n$, there exists an approximate vector $p^a(x, \varepsilon)$ of the unique minimizer $p(x)$ in (2) such that
$$f(p^a(x, \varepsilon)) + \frac{1}{2\lambda} \|p^a(x, \varepsilon) - x\|^2 \le F(x) + \varepsilon. \tag{9}$$
Implementable algorithms to find such an approximate vector can be found, for example, in [7, 8]. The existence theorem of the approximate vector is presented as follows.

Proposition 3 (see Lemma in [7]). Let be generated according to the formulawhere is a stepsize and is an approximate subgradient at ; that is,(i) If satisfiesthen (11) holds with(ii) Conversely, if (11) holds with given by (13), then (12) holds: .

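We use the approximate vector $p^a(x, \varepsilon)$ to define approximate function and gradient values of the Moreau-Yosida regularization, respectively, by
$$F^a(x, \varepsilon) = f(p^a(x, \varepsilon)) + \frac{1}{2\lambda} \|p^a(x, \varepsilon) - x\|^2, \tag{14}$$
$$g^a(x, \varepsilon) = \frac{x - p^a(x, \varepsilon)}{\lambda}. \tag{15}$$
The following proposition is crucial in the convergence analysis. Its proof can be found in [2].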

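Proposition 4. Let $\varepsilon$ be an arbitrary positive number and let $p^a(x, \varepsilon)$ be a vector satisfying (9). Then, one gets
$$F(x) \le F^a(x, \varepsilon) \le F(x) + \varepsilon, \tag{16}$$
$$\|p^a(x, \varepsilon) - p(x)\| \le \sqrt{2 \lambda \varepsilon}, \tag{17}$$
$$\|g^a(x, \varepsilon) - g(x)\| \le \sqrt{2 \varepsilon / \lambda}. \tag{18}$$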
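The error bounds of this kind can be checked numerically on a toy case. The sketch below again takes f(z) = |z| (an illustrative assumption, not a test problem from the paper), perturbs the exact minimizer by a hypothetical amount `delta` to mimic an inexact solve, takes ε as the resulting excess in the regularized objective so that condition (9) holds, and compares the errors in p and g with the standard bounds sqrt(2λε) and sqrt(2ε/λ) that follow from strong convexity.

```python
import math


def prox_abs(x, lam):
    """Exact minimizer of |z| + (z - x)**2 / (2*lam)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def theta(z, x, lam):
    """Objective minimized in the Moreau-Yosida regularization, for f(z) = |z|."""
    return abs(z) + (z - x) ** 2 / (2.0 * lam)


def inexact_values(x, lam, delta):
    """Compare exact and perturbed quantities; delta is a hypothetical solver error."""
    p = prox_abs(x, lam)
    p_a = p + delta                      # stand-in for an approximate minimizer
    F = theta(p, x, lam)                 # exact regularized value
    F_a = theta(p_a, x, lam)             # approximate value
    eps = max(F_a - F, 0.0)              # smallest eps for which (9) holds
    g = (x - p) / lam
    g_a = (x - p_a) / lam
    return {
        "F": F, "F_a": F_a, "eps": eps,
        "p_err": abs(p_a - p), "p_bound": math.sqrt(2.0 * lam * eps),
        "g_err": abs(g_a - g), "g_bound": math.sqrt(2.0 * eps / lam),
    }


if __name__ == "__main__":
    for x, delta in ((0.2, 0.1), (1.5, 0.1), (-3.0, -0.05)):
        print(x, delta, inexact_values(x, 0.5, delta))
```

When the perturbation stays on one side of the kink, the bounds hold with equality up to rounding, which suggests they cannot be tightened in general.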

Algorithms that combine proximal techniques with the Moreau-Yosida regularization for solving the nonsmooth problem (1) have proved to be effective [7, 9, 10], and trust region algorithms for solving (1) have been proposed in [5, 11, 12]. Recently, Yuan et al. [13, 14] and Li [15] extended the spectral gradient method and conjugate gradient-type methods, respectively, to solve (1).

The multivariate spectral gradient (MSG) method was first proposed by Han et al. [16] for optimization problems. This method has the nice property that it converges quadratically for an objective function with a positive definite diagonal Hessian matrix [16]. Further studies on this method for nonlinear equations and bound constrained optimization can be found, for instance, in [17, 18]. By using a nonmonotone technique, some effective spectral gradient methods are presented in [13, 16, 17, 19]. In this paper, we extend the multivariate spectral gradient method by combining it with a nonmonotone line search technique and the Moreau-Yosida regularization function to solve the nonsmooth problem (1), and we carry out numerical experiments to test its efficiency.

The rest of this paper is organized as follows. In Section 2, we propose a multivariate spectral gradient algorithm to solve (1). In Section 3, we prove the global convergence of the proposed algorithm. Numerical results are presented in Section 4, and Section 5 concludes the paper.

2. Algorithm

In this section, we present the multivariate spectral gradient algorithm to solve the nonsmooth convex unconstrained optimization problem (1). Our approach uses the Moreau-Yosida regularization to smooth the nonsmooth function and then uses approximate function and gradient values in the multivariate spectral gradient algorithm.

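We first recall the multivariate spectral gradient algorithm [16] for the smooth optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{19}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient is denoted by $g$. Let $x_k$ be the current iterate; the multivariate spectral gradient algorithm is defined by
$$x_{k+1} = x_k - \operatorname{diag}\left\{ \frac{1}{\lambda_k^1}, \frac{1}{\lambda_k^2}, \ldots, \frac{1}{\lambda_k^n} \right\} g_k, \tag{20}$$
where $g_k$ is the gradient vector of $f$ at $x_k$ and $\operatorname{diag}\{\lambda_k^1, \lambda_k^2, \ldots, \lambda_k^n\}$ is solved by minimizing
$$\left\| \operatorname{diag}\{\lambda^1, \lambda^2, \ldots, \lambda^n\}\, s_{k-1} - y_{k-1} \right\|^2 \tag{21}$$
with respect to $\{\lambda^i\}_{i=1}^n$, where $s_{k-1} = x_k - x_{k-1}$, $y_{k-1} = g_k - g_{k-1}$.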
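To make the smooth MSG iteration concrete, here is a minimal sketch, assuming no line search and unit initial coefficients. The least-squares problem for the diagonal matrix separates by component, so each coefficient is the ratio of the corresponding entries of the gradient difference and the step. The fallback to the scalar Barzilai-Borwein quotient when a ratio is not positive, and the names `msg_lambdas` and `msg_minimize`, are assumptions made for this illustration.

```python
def msg_lambdas(s, y):
    """Componentwise minimizer of ||diag(lam) s - y||^2, with a scalar BB fallback."""
    sty = sum(si * yi for si, yi in zip(s, y))
    sts = sum(si * si for si in s)
    bb = sty / sts if sts > 0.0 and sty > 0.0 else 1.0
    lams = []
    for si, yi in zip(s, y):
        if si != 0.0 and yi / si > 0.0:
            lams.append(yi / si)      # exact curvature along coordinate i
        else:
            lams.append(bb)           # safeguard (assumed for this sketch)
    return lams


def msg_minimize(grad, x0, iters=10, tol=1e-12):
    """Plain MSG iteration x+ = x - diag(1/lam) g, without any line search."""
    x = list(x0)
    g = grad(x)
    lams = [1.0] * len(x)             # assumed initial coefficients
    for _ in range(iters):
        if max(abs(gi) for gi in g) <= tol:
            break
        x_new = [xi - gi / li for xi, gi, li in zip(x, g, lams)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        lams = msg_lambdas(s, y)
        x, g = x_new, g_new
    return x


if __name__ == "__main__":
    a = [1.0, 10.0, 100.0]            # diagonal Hessian of f(x) = 0.5 * sum(a_i x_i^2)
    print(msg_minimize(lambda x: [ai * xi for ai, xi in zip(a, x)], [1.0, -2.0, 3.0]))
```

On this diagonal quadratic the coefficients recover the Hessian diagonal after one step, so the second step lands on the minimizer. This is consistent with the fast convergence reported in [16] for objectives with positive definite diagonal Hessians.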

Denote the $i$th element of $s_k$ and $y_k$ by $s_k^i$ and $y_k^i$, respectively. We present the following multivariate spectral gradient (MSG) algorithm.

Algorithm 5. Set , , , , , , , , , and ; is a strictly decreasing sequence with , .

Step 1. Set . Calculate by (14) as well as by (15). Let , .

Step 2. Stop if . Otherwise, go to Step 3.

Step 3. Choose satisfying ; find which satisfies where and is the smallest nonnegative integer such that (22) holds.

Step 4. Let . Stop if .

Step 5. Update by the following formula:

Step 6. Compute the search direction by the following: (a) If , then set ; otherwise set for , where , . (b) If or , then set for . Let .

Step 7. Set ; go back to Step 2.
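The overall flow of Steps 1 to 7 can be sketched in code under several loudly stated assumptions. The test function f(x) = Σ|x_i - c_i| is chosen here because its proximal point is available exactly, so the sketch uses exact values in place of the approximate ones (that is, it takes the accuracy parameter to be zero). The line search is a Zhang-Hager type nonmonotone backtracking rule in the spirit of [20]. All parameter names and default values (`sigma`, `rho`, `eta`, `lam_min`, `lam_max`) and the reset value 1.0 for out-of-range coefficients are hypothetical and are not taken from the paper.

```python
def soft(v, t):
    """Soft-thresholding: minimizer of |z| + (z - v)**2 / (2*t)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def make_oracle(c, lam):
    """Return F and g for f(x) = sum_i |x_i - c_i|, using the exact proximal point."""
    def prox(x):
        return [ci + soft(xi - ci, lam) for xi, ci in zip(x, c)]

    def F(x):
        p = prox(x)
        return sum(abs(pi - ci) + (pi - xi) ** 2 / (2.0 * lam)
                   for pi, xi, ci in zip(p, x, c))

    def g(x):
        return [(xi - pi) / lam for xi, pi in zip(x, prox(x))]

    return F, g


def msg_nonsmooth(F, g, x0, sigma=1e-4, rho=0.5, eta=0.85,
                  lam_min=1e-10, lam_max=1e10, tol=1e-10, max_iter=200):
    """MSG with nonmonotone backtracking on the regularized function (sketch)."""
    x = list(x0)
    gx = g(x)
    d = [-gi for gi in gx]                      # first direction: steepest descent
    J, Q = F(x), 1.0                            # nonmonotone reference value
    for _ in range(max_iter):
        if sum(gi * gi for gi in gx) ** 0.5 <= tol:
            break
        gd = sum(gi * di for gi, di in zip(gx, d))
        alpha = 1.0
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        while F(x_new) > J + sigma * alpha * gd:  # backtrack until accepted
            alpha *= rho
            x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = g(x_new)
        Q_new = eta * Q + 1.0                   # Zhang-Hager style update
        J = (eta * Q * J + F(x_new)) / Q_new
        Q = Q_new
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, gx)]
        sty = sum(si * yi for si, yi in zip(s, y))
        sts = sum(si * si for si in s)
        bb = sty / sts if sty > 0.0 and sts > 0.0 else 1.0
        d = []
        for si, yi, gi in zip(s, y, g_new):
            li = yi / si if si != 0.0 and yi / si > 0.0 else bb
            if not lam_min <= li <= lam_max:    # safeguard on the coefficients
                li = 1.0
            d.append(-gi / li)
        x, gx = x_new, g_new
    return x


if __name__ == "__main__":
    F, g = make_oracle([1.0, -2.0], 1.0)
    print(msg_nonsmooth(F, g, [5.0, 0.5]))
```

Because every coefficient is kept positive and bounded, each direction is a descent direction for the regularized function, so the backtracking loop terminates. An implementation closer to the paper would replace the exact proximal point with an inexact inner solve and a decreasing accuracy sequence.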

Remarks. (i) The definition of in Algorithm 5, together with (15) and Proposition 3, implies that then, with the decreasing property of , the assumed condition in Lemma 7 holds.

(ii) From the nonmonotone line search technique (22), we can see that is a convex combination of the function value and . Also is a convex combination of the function values , , , as . is a positive value that plays an important role in manipulating the degree of nonmonotonicity in the nonmonotone line search technique, with yielding a strictly monotone scheme and with yielding , whereis the average function value.
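The two limiting cases in remark (ii) can be reproduced with the reference-value recursion of Zhang and Hager [20], which is assumed here to be the update the remark describes. In the sketch below, the names `J`, `Q`, and `eta` are hypothetical labels for the reference value, its weight, and the nonmonotonicity parameter.

```python
def reference_values(f_values, eta):
    """Zhang-Hager style recursion: Q+ = eta*Q + 1, J+ = (eta*Q*J + f+) / Q+."""
    Q = 1.0
    J = f_values[0]
    out = [J]
    for f_next in f_values[1:]:
        Q_next = eta * Q + 1.0
        J = (eta * Q * J + f_next) / Q_next   # convex combination of J and f_next
        Q = Q_next
        out.append(J)
    return out


if __name__ == "__main__":
    history = [4.0, 2.0, 3.0]
    print(reference_values(history, 0.0))   # reference equals the latest value
    print(reference_values(history, 1.0))   # reference equals the running average
    print(reference_values(history, 0.5))   # something in between
```

With the parameter at 0 the reference value is the most recent function value, which gives a monotone line search. With the parameter at 1 it is the average of all function values so far, which is the most permissive choice.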

(iii) From Step 6, we can obtain that then there is a positive constant such that, for all , which shows that the proposed multivariate spectral gradient algorithm possesses the sufficient descent property.

3. Global Convergence

In this section, we provide a global convergence analysis for the multivariate spectral gradient algorithm. To begin with, we make the following assumptions which have been given in [5, 12–14].

Assumption A. (i) is bounded from below.
(ii) The sequence , , is bounded; that is, there exists a constant such that, for all ,

The following two lemmas play crucial roles in establishing the convergence theorem for the proposed algorithm. By using (26) and (27) and Assumption A, similar to Lemma in [20], we can get the next lemma which shows that Algorithm 5 is well defined. The proof ideas of this lemma and Lemma in [20] are similar, hence omitted.

Lemma 6. Let be the sequence generated by Algorithm 5. Suppose that Assumption A holds and is defined by (25). Then one has for all . Also, there exists a stepsize satisfying the nonmonotone line search condition.

Lemma 7. Let be the sequence generated by Algorithm 5. Suppose that Assumption A and hold. Then, for all , one has where is a constant.

Proof (by contradiction). Let satisfy the nonmonotone Armijo-type line search (22). Assume on the contrary that does hold; then there exists a subsequence such that as . From the nonmonotone line search rule (22), satisfies together with in Lemma 6, we have By (28) and (31) and Proposition 4 and using Taylor's formula, there is where . From (32) and Proposition 4, we have where the second inequality follows from (26), Part 3 in Proposition 4, and , the equality follows from , and the last inequality follows from (27). Dividing each side by and letting in the above inequality, we can deduce that which is impossible, so the conclusion is obtained.

By using the above lemmas, we are now ready to prove the global convergence of Algorithm 5.

Theorem 8. Let $\{x_k\}$ be generated by Algorithm 5 and suppose that the conditions of Lemma 7 hold. Then one has $\lim_{k \to \infty} \|g(x_k)\| = 0$; the sequence $\{x_k\}$ has an accumulation point, and every accumulation point of $\{x_k\}$ is an optimal solution of problem (1).

Proof. Suppose that there exist and such that From (22), (26), and (29), we get Therefore, it follows from the definition of and (23) that By Assumption A, is bounded from below. Further by Proposition 4, for all , we see that is bounded from below. Together with for all from Lemma 6, it shows that is also bounded from below. By (38), we obtain On the other hand, the definition of implies that , and it follows that This is a contradiction. Therefore, we should have From (17) in Proposition 4 together with as , which comes from the definition of and in Algorithm 5, we obtain Set as an accumulation point of sequence ; there is a convergent subsequence such that From (4) we know that . Consequently, (42) and (43) show that . Hence, is an optimal solution of problem (1).

4. Numerical Results

This section presents numerical results from experiments using our multivariate spectral gradient algorithm on nonsmooth test problems taken from [21]. We also list the results of [14] (modified Polak-Ribière-Polyak gradient method, MPRP) and [22] (proximal bundle method, PBL) for comparison with the results of Algorithm 5. All codes were written in MATLAB R2010a and run on a PC with a 2.8 GHz CPU, 2 GB of memory, and Windows 8. We set , , , and , and the parameter is chosen as then we adopt the termination condition . Subproblem (5) is solved by the classical PRP CG method (called the subalgorithm); the subalgorithm stops if or holds, where is the subgradient of at the point . The subalgorithm also stops if the iteration number is larger than fifteen. Its line search uses the Armijo technique, and the step length is accepted if the search number is larger than five. Table 1 contains problem names, problem dimensions, and the optimal values.

The summary of the test results is presented in Tables 2-3, where “Nr.” denotes the name of the tested problem, “NF” denotes the number of function evaluations, “NI” denotes the number of iterations, and “” denotes the function value at the final iteration.

The value of controls the nonmonotonicity of line search which may affect the performance of the MSG algorithm. Table 2 shows the results for different parameter , as well as different values of the parameter ranging from to on problem Rosenbrock, respectively. We can conclude from the table that the proposed algorithm works reasonably well for all the test cases. This table also illustrates that the value of can influence the performance of the algorithm significantly if the value of is within a certain range, and the choice is better than .

Then, we compare the performance of MSG to that of the algorithms MPRP and PBL. In this test, we fix and . To illustrate the performance of each algorithm more specifically, we present three comparison results in terms of number of iterations, number of function evaluations, and the final objective function value in Table 3.

The numerical results indicate that Algorithm 5 can successfully solve the test problems. From the number of iterations in Table 3, we see that Algorithm 5 performs best among these three methods, and the final function value obtained by Algorithm 5 is closer to the optimal function value than those obtained by MPRP and PBL. In summary, the numerical experiments show that the proposed algorithm provides an efficient approach to solving nonsmooth problems.

5. Conclusions

We extend the multivariate spectral gradient algorithm to solve nonsmooth convex optimization problems. The proposed algorithm combines a nonmonotone line search technique and the idea of Moreau-Yosida regularization. The algorithm satisfies the sufficient descent property and its global convergence can be established. Numerical results show the efficiency of the proposed algorithm.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the anonymous referees for their valuable comments and suggestions, which greatly improved the paper. The author also thanks Professor Gong-lin Yuan for kindly providing the source BB codes on nonsmooth problems. This work is supported by the National Natural Science Foundation of China (Grant no. 11161003).

References

[1] J. B. Hiriart-Urruty and C. Lemaréchal, Convex Analysis and Minimization Algorithms, Springer, Berlin, Germany, 1993.
[2] M. Fukushima and L. Qi, “A globally and superlinearly convergent algorithm for nonsmooth convex minimization,” SIAM Journal on Optimization, vol. 6, no. 4, pp. 1106–1120, 1996.
[3] L. Q. Qi and J. Sun, “A nonsmooth version of Newton's method,” Mathematical Programming, vol. 58, no. 3, pp. 353–367, 1993.
[4] F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, NY, USA, 1983.
[5] S. Lu, Z. Wei, and L. Li, “A trust region algorithm with adaptive cubic regularization methods for nonsmooth convex minimization,” Computational Optimization and Applications, vol. 51, no. 2, pp. 551–573, 2012.
[6] L. Q. Qi, “Convergence analysis of some algorithms for solving nonsmooth equations,” Mathematics of Operations Research, vol. 18, no. 1, pp. 227–244, 1993.
[7] R. Correa and C. Lemaréchal, “Convergence of some algorithms for convex minimization,” Mathematical Programming, vol. 62, no. 1–3, pp. 261–275, 1993.
[8] M. Fukushima, “A descent algorithm for nonsmooth convex optimization,” Mathematical Programming, vol. 30, no. 2, pp. 163–175, 1984.
[9] J. R. Birge, L. Qi, and Z. Wei, “Convergence analysis of some methods for minimizing a nonsmooth convex function,” Journal of Optimization Theory and Applications, vol. 97, no. 2, pp. 357–383, 1998.
[10] Z. Wei, L. Qi, and J. R. Birge, “A new method for nonsmooth convex optimization,” Journal of Inequalities and Applications, vol. 2, no. 2, pp. 157–179, 1998.
[11] N. Sagara and M. Fukushima, “A trust region method for nonsmooth convex optimization,” Journal of Industrial and Management Optimization, vol. 1, no. 2, pp. 171–180, 2005.
[12] G. Yuan, Z. Wei, and Z. Wang, “Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization,” Computational Optimization and Applications, vol. 54, no. 1, pp. 45–64, 2013.
[13] G. Yuan and Z. Wei, “The Barzilai and Borwein gradient method with nonmonotone line search for nonsmooth convex optimization problems,” Mathematical Modelling and Analysis, vol. 17, no. 2, pp. 203–216, 2012.
[14] G. Yuan, Z. Wei, and G. Li, “A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs,” Journal of Computational and Applied Mathematics, vol. 255, pp. 86–96, 2014.
[15] Q. Li, “Conjugate gradient type methods for the nondifferentiable convex minimization,” Optimization Letters, vol. 7, no. 3, pp. 533–545, 2013.
[16] L. Han, G. Yu, and L. Guan, “Multivariate spectral gradient method for unconstrained optimization,” Applied Mathematics and Computation, vol. 201, no. 1-2, pp. 621–630, 2008.
[17] G. Yu, S. Niu, and J. Ma, “Multivariate spectral gradient projection method for nonlinear monotone equations with convex constraints,” Journal of Industrial and Management Optimization, vol. 9, no. 1, pp. 117–129, 2013.
[18] Z. Yu, J. Sun, and Y. Qin, “A multivariate spectral projected gradient method for bound constrained optimization,” Journal of Computational and Applied Mathematics, vol. 235, no. 8, pp. 2263–2269, 2011.
[19] Y. Xiao and Q. Hu, “Subspace Barzilai-Borwein gradient method for large-scale bound constrained optimization,” Applied Mathematics and Optimization, vol. 58, no. 2, pp. 275–290, 2008.
[20] H. Zhang and W. W. Hager, “A nonmonotone line search technique and its application to unconstrained optimization,” SIAM Journal on Optimization, vol. 14, no. 4, pp. 1043–1056, 2004.
[21] L. Lukšan and J. Vlček, “Test problems for nonsmooth unconstrained and linearly constrained optimization,” Tech. Rep. 798, Institute of Computer Science, Academy of Sciences of the Czech Republic, Praha, Czech Republic, 2000.
[22] L. Lukšan and J. Vlček, “A bundle-Newton method for nonsmooth unconstrained minimization,” Mathematical Programming, vol. 83, no. 3, pp. 373–391, 1998.

Copyright

Copyright © 2015 Yaping Hu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
