Abstract

In this paper, we propose a new hybrid direct search method in which a frame-based PRP conjugate gradients direct search algorithm is combined with a radial basis function interpolation model. In addition, a rotational minimal positive basis is used to reduce the computational work at each iteration. Numerical results for solving the CUTEr test problems show that the proposed method is promising.

1. Introduction

In this paper, we consider the following problem:
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where the function $f$ is assumed to be continuously differentiable from $\mathbb{R}^n$ into $\mathbb{R}$, and derivative information is unavailable or untrustworthy, for example, because of noise or the use of finite differences. Problem (1) has numerous applications in engineering, such as helicopter rotor blade design [1, 2], aeroacoustic shape design [3], groundwater community problems [4], and medical image registration problems [5].

There are two main classes of methods for solving (1). The first class is the model based methods, which are constructed by means of multivariate interpolation, including underdetermined and overdetermined variants. These methods were introduced by Powell [6] and Winfield [7] and were further developed in [8–11]. The second class is the direct search methods, which are based on comparison rules for objective function values. These methods were pioneered by Hooke and Jeeves [12]. The convergence theory was established by Torczon [13, 14]. Audet and Dennis [15] proposed a general framework for direct search methods. Coope and Price [16] extended the PRP method [17, 18] to solve (1) and presented a frame-based conjugate gradients direct search algorithm (Max-PRP for short). In each iteration, Max-PRP employed a fixed maximal positive basis to estimate first- and second-order derivative information, and the search direction was then determined by the PRP formula. Numerical tests showed that Max-PRP was effective on a wide variety of unconstrained optimization problems. In addition, some classical and modern direct search methods were surveyed by Kolda et al. [19].

Generally, model based methods are more efficient than direct search methods in that they are able to exploit structure inherent in the problem, but direct search methods are simpler to code and to parallelize. It is therefore natural to try to combine both approaches. In 2010, Custódio et al. [20] proposed a hybrid method integrating minimum Frobenius norm quadratic interpolation models in a direct search framework, and numerical results showed that the addition of quadratic interpolation models improved the performance of the direct search method. In 2013, Conn and Le Digabel [21] showed that the use of quadratic interpolation models can improve the efficiency of the mesh adaptive direct search method.

The above hybrid algorithms were based on quadratic interpolation models. In 2008, Wild et al. [22] presented a new derivative-free algorithm (ORBIT for short), which employed radial basis function (RBF) interpolation models. The RBF interpolation models allowed ORBIT to interpolate nonlinear functions using fewer function evaluations than quadratic interpolation models require. In 2013, Wild and Shoemaker [23] proved the global convergence of ORBIT under some mild assumptions. Numerical results showed that the method using RBF interpolation models outperformed methods using quadratic interpolation models.

Motivated by the efficiency of ORBIT, we propose a new hybrid direct search method, which combines frame-based conjugate gradients strategies with RBF interpolation models. In each iteration, a minimal positive basis is used to construct the frame. With a maximal positive basis, $2n$ function values are computed per frame, while with a minimal positive basis only $n+1$ function values are evaluated, so the computational work in the new hybrid direct search method is reduced. In addition, when the trial point of the RBF interpolation model does not satisfy the decrease condition, we employ the PRP formula to obtain the search direction, which is similar to Max-PRP. Furthermore, we rotate the minimal positive basis according to the local topography of the objective function, making our method more effective in practice. Convergence is established under some mild conditions. Numerical results show that the proposed method is promising.

This paper is organized as follows. In Section 2, we present some basic notions for positive bases and frames and describe our method. In Section 3, we prove the convergence of the proposed method. In Section 4, numerical results show the efficiency of the method derived in this paper compared to Max-PRP [16]. Concluding remarks are given in Section 5. The default norm used in this paper is the Euclidean norm.

2. The New Hybrid Direct Search Method

We first state the definition of a positive basis, which can be found in [24].

Definition 1. A positive basis $V$ in $\mathbb{R}^n$ is a set of vectors with the following two properties: (i) every vector in $\mathbb{R}^n$ is a nonnegative linear combination of the members of $V$; (ii) no proper subset of $V$ satisfies (i).

It is easy to show that the cardinality of any positive basis $V$ satisfies $n + 1 \le |V| \le 2n$. Two famous and simple examples of positive bases are
$$V_{\min} = \Big\{ v_1, \ldots, v_n, -\sum_{i=1}^{n} v_i \Big\}, \qquad V_{\max} = \{ v_1, \ldots, v_n, -v_1, \ldots, -v_n \}, \tag{2}$$
where $\{v_1, \ldots, v_n\}$ is a basis for $\mathbb{R}^n$; $V_{\min}$ represents the minimal positive basis, and $V_{\max}$ represents the maximal positive basis.
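As a concrete illustration, the following minimal sketch (in Python with NumPy; the helper names are ours, not the paper's) builds the two positive bases of (2) from the coordinate basis, one vector per column:

```python
import numpy as np

def minimal_positive_basis(n):
    """Minimal positive basis {e_1, ..., e_n, -(e_1 + ... + e_n)}: n + 1 vectors."""
    I = np.eye(n)
    return np.column_stack([I, -I.sum(axis=1)])

def maximal_positive_basis(n):
    """Maximal positive basis {e_1, ..., e_n, -e_1, ..., -e_n}: 2n vectors."""
    I = np.eye(n)
    return np.column_stack([I, -I])
```

The column counts ($n+1$ versus $2n$) are exactly the function-evaluation counts per frame discussed in Section 1.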

In addition, we give some concepts about frames, which were proposed by Coope and Price [25, 26].

Definition 2. A frame can be defined as
$$\Phi(x, h, V) = \{ x + hv : v \in V \}, \tag{3}$$
where $x$ is the central point of the frame, $h > 0$ is the frame size, and $V$ is a positive basis in $\mathbb{R}^n$.

Definition 3. A frame $\Phi(x, h, V)$ is a minimal frame if and only if
$$f(x) \le f(x + hv) \quad \text{for all } v \in V. \tag{4}$$

Definition 4. A frame $\Phi(x, h, V)$ is a quasi minimal frame if and only if
$$f(x) < f(x + hv) + \epsilon \quad \text{for all } v \in V, \tag{5}$$
where $\epsilon = N h^2$ and $N$ is a positive constant; the corresponding central point $x$ is called a quasi minimal point.
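For intuition, a direct (and deliberately naive) check of Definition 4 can be sketched as follows, using the tolerance form $\epsilon = N h^2$ from (5); this is our illustration, not code from the paper:

```python
def is_quasi_minimal(f, x, h, V, N=1.0):
    """Check Definition 4 with eps = N * h**2: the frame around x is
    quasi-minimal if f(x) < f(x + h*v) + eps for every column v of V."""
    eps = N * h**2
    return all(f(x) < f(x + h * V[:, i]) + eps for i in range(V.shape[1]))
```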

Let $x_k$ be the $k$th iterate. We discuss the RBF interpolation model, the search direction, and the rotation of the positive basis in detail below.

2.1. RBF Interpolation Model

Choose a positive basis $V_k$ and obtain a set of interpolation data points $\mathcal{Y} = \{y^1, \ldots, y^m\}$, where the frame points $x_k + h_k v$, $v \in V_k$, are included, $h_k$ is the frame size, and the other points of the set $\mathcal{Y}$ are chosen from the set of previously evaluated points.

The RBF interpolation model is a popular model for optimization, and some theory and implementations can be found in [27]. Corresponding to the set of interpolation data points $\mathcal{Y}$, we get the following RBF interpolation model:
$$m_k(x) = \sum_{i=1}^{m} \lambda_i \, \phi(\|x - y^i\|) + p(x), \tag{6}$$
where $\phi$ is a radial basis function and $\lambda_1, \ldots, \lambda_m$ are parameters to be determined. The term $p(x)$ is a polynomial tail used in the context of RBF interpolation models, which most frequently is linear, $p(x) = c_0 + g^T x$.

In addition, the coefficients $\lambda = (\lambda_1, \ldots, \lambda_m)^T$ are required to satisfy
$$\sum_{i=1}^{m} \lambda_i \, q(y^i) = 0 \quad \text{for every polynomial } q \text{ of degree at most one}. \tag{7}$$

These conditions, in conjunction with the interpolation conditions $m_k(y^i) = f(y^i)$, $i = 1, \ldots, m$, determine the model parameters.

We define the linear system:
$$\begin{bmatrix} \Phi & P \\ P^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ c \end{bmatrix} = \begin{bmatrix} F \\ 0 \end{bmatrix}, \tag{8}$$
where $\Phi_{ij} = \phi(\|y^i - y^j\|)$ for $i, j = 1, \ldots, m$, the $i$th row of $P$ is $[1 \;\; (y^i)^T]$, $c$ collects the coefficients of the linear tail $p(x)$, and $F_i = f(y^i)$ for $i = 1, \ldots, m$. We employ a null-space method to solve system (8), which is similar to the approach of [22].
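To make the construction concrete, the sketch below (our illustration: the cubic radial function is an assumed choice, and a dense solve replaces the paper's null-space method from [22]) assembles and solves system (8) for a linear tail:

```python
import numpy as np

def fit_rbf(Y, fvals, phi=lambda r: r**3):
    """Fit m(x) = sum_i lam_i * phi(||x - y_i||) + c0 + g^T x through the
    points in Y (one point per row) by solving the saddle-point system (8)."""
    m, n = Y.shape
    dist = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)
    Phi = phi(dist)                          # Phi[i, j] = phi(||y_i - y_j||)
    P = np.hstack([np.ones((m, 1)), Y])      # rows [1, y_i^T] for the linear tail
    A = np.block([[Phi, P], [P.T, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([fvals, np.zeros(n + 1)])
    sol = np.linalg.solve(A, rhs)            # dense solve, for clarity only
    lam, c = sol[:m], sol[m:]                # c[0] = c0, c[1:] = g
    def model(x):
        x = np.asarray(x, dtype=float)
        r = np.linalg.norm(x - Y, axis=1)
        return lam @ phi(r) + c[0] + c[1:] @ x
    return model
```

Solvability of (8) requires the points in $\mathcal{Y}$ to be distinct and poised for linear interpolation ($P$ of full column rank); ORBIT [22] enforces such conditions when selecting points.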

Then, we minimize the RBF interpolation model by solving the following subproblem:
$$\min_{x \in \mathbb{R}^n} \; m_k(x) \quad \text{subject to} \quad \|x - x_k\| \le \Delta_k, \tag{9}$$
where $x_k$ is the current iterate, $\Delta_k = \theta h_k$, and $\theta$ is the radius factor parameter.
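One simple way to approximate a solution of subproblem (9) in practice is a generic constrained solver; the following sketch (ours, assuming SciPy is available, not the paper's implementation) uses SLSQP with the ball constraint written as an inequality:

```python
import numpy as np
from scipy.optimize import minimize

def minimize_model(model, x_k, radius):
    """Approximately solve (9): min m(x) subject to ||x - x_k|| <= radius."""
    ball = {"type": "ineq", "fun": lambda x: radius - np.linalg.norm(x - x_k)}
    res = minimize(model, x_k, method="SLSQP", constraints=[ball])
    return res.x
```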

2.2. PRP Direction

Consider the following linear model:
$$l_k(x) = f(x_k) + g_k^T (x - x_k), \tag{10}$$
where $g_k \in \mathbb{R}^n$. The coefficients can be determined by the regression interpolation conditions:
$$l_k(x_k + h_k v) = f(x_k + h_k v), \quad v \in V_k. \tag{11}$$
Then, we have that
$$h_k \, g_k^T v = f(x_k + h_k v) - f(x_k), \quad v \in V_k. \tag{12}$$
This system can be solved by the method of least squares. For example, if we choose the positive basis as $V_k = \{e_1, \ldots, e_n, -\sum_{i=1}^{n} e_i\}$, where $e_i$ is the $i$th unit vector, then $g_k$ is calculated according to the following formula:
$$g_k = \frac{1}{h_k} \, (W W^T)^{-1} W \delta_k, \tag{13}$$
where the columns of $W$ are the vectors of $V_k$ and $(\delta_k)_i = f(x_k + h_k v_i) - f(x_k)$. The PRP direction is obtained by
$$d_k = -g_k + \beta_k d_{k-1}, \qquad \beta_k = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}. \tag{14}$$
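The following sketch (our illustration) estimates $g_k$ by solving (12) in the least-squares sense, which is equivalent to (13), and then forms the PRP direction (14):

```python
import numpy as np

def estimate_gradient(f, x, h, V, f_x=None):
    """Least-squares solution of (12): h * V^T g ~ f(x + h*v_i) - f(x),
    one equation per column v_i of V."""
    f_x = f(x) if f_x is None else f_x
    rhs = np.array([f(x + h * V[:, i]) - f_x for i in range(V.shape[1])])
    g, *_ = np.linalg.lstsq(h * V.T, rhs, rcond=None)
    return g

def prp_direction(g, g_prev, d_prev):
    """PRP direction (14): d = -g + beta * d_prev,
    with beta = g^T (g - g_prev) / ||g_prev||^2."""
    beta = g @ (g - g_prev) / (g_prev @ g_prev)
    return -g + beta * d_prev
```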

2.3. Rotation of the Positive Basis

In order to modify the positive basis so that at least one of the new directions conforms more closely to the local behavior of the function, we rotate the positive basis at each step. This idea is similar to that in [28].

Suppose that
$$V_k = \Big\{ v_1, \ldots, v_n, -\sum_{i=1}^{n} v_i \Big\}, \tag{15}$$
where $\{v_1, \ldots, v_n\}$ is a basis for $\mathbb{R}^n$. Denote
$$\Lambda_k = (\lambda_1, \ldots, \lambda_n), \tag{16}$$
where $\lambda_1, \ldots, \lambda_n$ describe the movements performed along the vectors $v_1, \ldots, v_n$ in previous iterations.

We get the positive basis $V_{k+1}$ by rotating $V_k$. Firstly, we obtain $n$ linearly independent vectors $a_1, \ldots, a_n$ according to $\Lambda_k$:
$$a_j = \begin{cases} v_j, & \lambda_j = 0, \\ \displaystyle\sum_{i=j}^{n} \lambda_i v_i, & \lambda_j \ne 0, \end{cases} \qquad j = 1, \ldots, n, \tag{17}$$
where $\sum_{i=j}^{n} \lambda_i v_i$ represents the sum of all the movements made in the directions $v_i$ for $i = j, \ldots, n$. Lemma 8.5.4 of [29] proved that $\{a_1, \ldots, a_n\}$ is linearly independent.

Secondly, we use the Gram-Schmidt orthogonalization method to get an orthonormal basis $\{w_1, \ldots, w_n\}$:
$$b_1 = a_1, \qquad b_j = a_j - \sum_{i=1}^{j-1} \frac{a_j^T b_i}{b_i^T b_i} \, b_i, \qquad w_j = \frac{b_j}{\|b_j\|}, \qquad j = 1, \ldots, n. \tag{18}$$

Finally, we can get
$$V_{k+1} = \Big\{ w_1, \ldots, w_n, -\sum_{i=1}^{n} w_i \Big\}, \tag{19}$$
where the last vector is combined from $w_1, \ldots, w_n$ according to the same combination principle as in $V_k$.

For instance, if $n = 2$, $V_k = \{v_1, v_2, -(v_1 + v_2)\}$, and both movements $\lambda_1, \lambda_2$ are nonzero, we obtain $a_1 = \lambda_1 v_1 + \lambda_2 v_2$ and $a_2 = \lambda_2 v_2$ from (17); (18) and (19) then yield the rotated basis $V_{k+1} = \{w_1, w_2, -(w_1 + w_2)\}$.
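A compact sketch of the whole rotation (15)-(19) follows (our illustration; QR factorization is used for the Gram-Schmidt step (18), which produces the same nested spans up to signs):

```python
import numpy as np

def rotate_basis(V, lam):
    """Rotate a minimal positive basis V = [v_1, ..., v_n, -sum v_i]
    according to the movements lam = (lam_1, ..., lam_n), cf. (15)-(19)."""
    lam = np.asarray(lam, dtype=float)
    n = V.shape[0]
    B = V[:, :n]                                        # basis part v_1, ..., v_n
    A = np.column_stack([
        B[:, j:] @ lam[j:] if lam[j] != 0 else B[:, j]  # formula (17)
        for j in range(n)
    ])
    Q, _ = np.linalg.qr(A)                              # Gram-Schmidt step (18)
    return np.column_stack([Q, -Q.sum(axis=1)])         # rebuild as in (19)
```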

Supposing that $\{x_k\}$ is the sequence of quasi minimal iteration points, the above process can be summarized as the following algorithm.

Algorithm 5.   
Step 0 (initializations). Choose an initial point $x_0$, a positive basis $V_0$, a step length $h_0 > 0$, and a radius factor parameter $\theta > 0$. Choose constants $N > 0$ and $0 < \gamma < 1$. Set $k := 0$.
Step 1 (checking the stopping condition). If the stopping condition is not met, then go to Step 2; otherwise output the lowest known point and stop.
Step 2 (determining the frame). Create a frame $\Phi(x_k, h_k, V_k)$ at the iterate $x_k$ according to the positive basis $V_k$ and step length $h_k$, and calculate the corresponding function values.
Step 3 (building the RBF interpolation model). Evaluate the RBF model parameters according to formula (8), and get the solution $y_k$ of subproblem (9). If $f(y_k) < f(x_k) - N h_k^2$, then set $x_{k+1} = y_k$ and go to Step 6; otherwise go to Step 4.
Step 4 (obtaining the PRP direction). Obtain the search direction $d_k$ using (14), execute the line search process to find a step $\alpha_k$, and set $z_k = x_k + \alpha_k d_k$. If $f(z_k) < f(x_k) - N h_k^2$, then set $x_{k+1} = z_k$ and go to Step 6; otherwise go to Step 5.
Step 5 (updating the current iteration point). Let $x_{k+1}$ be defined by the following rule:
$$x_{k+1} = \arg\min \{ f(x) : x \in \{x_k\} \cup \Phi(x_k, h_k, V_k) \}. \tag{20}$$
Step 6 (rotating the positive basis and updating the parameters). Obtain $V_{k+1}$ according to (15)-(19) and compute $h_{k+1}$: if the frame $\Phi(x_k, h_k, V_k)$ is a quasi minimal frame, then set $h_{k+1} = \gamma h_k$ and record $x_k$ as a quasi minimal point; otherwise set $h_{k+1} = \phi_k h_k$, where $\phi_k \ge 1$ is an expansion factor. In addition, increment $k$ by one and go to Step 1.

Remark 6. In Step 3, we set the polynomial tail $p(x)$ in (8) to be linear.
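Putting the pieces together, here is a minimal sketch of the main loop of Algorithm 5 in Python (our illustration only: the stopping rule, line search, and parameter values are simplified stand-ins, and it reuses the helper sketches given earlier):

```python
import numpy as np

def hybrid_search(f, x0, h0=1.0, theta=2.0, N=1.0, gamma=0.5,
                  h_tol=1e-8, max_iter=500):
    """Simplified sketch of Algorithm 5; not a faithful reimplementation."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    h, V = h0, minimal_positive_basis(n)
    g_prev = d_prev = None
    for _ in range(max_iter):
        if h < h_tol:                                     # Step 1 (crude stop rule)
            break
        frame = x[:, None] + h * V                        # Step 2: frame points
        f_x = f(x)
        f_frame = np.array([f(frame[:, i]) for i in range(V.shape[1])])
        eps = N * h**2
        # Step 3: fit the RBF model on the frame plus its center, minimize it
        model = fit_rbf(np.vstack([x, frame.T]), np.concatenate([[f_x], f_frame]))
        y = minimize_model(model, x, theta * h)
        g = estimate_gradient(f, x, h, V, f_x=f_x)
        d = -g if (g_prev is None or not g_prev.any()) else \
            prp_direction(g, g_prev, d_prev)
        if f(y) < f_x - eps:                              # RBF trial point accepted
            x_new = y
        else:                                             # Step 4: crude backtracking
            t, x_new = 1.0, x
            while t > 1e-12 and f(x + t * d) >= f_x - eps:
                t *= 0.5
            if f(x + t * d) < f_x - eps:
                x_new = x + t * d
            elif f_frame.min() < f_x:                     # Step 5: best frame point
                x_new = frame[:, f_frame.argmin()]
        # Step 6: rotate the basis and update the step length
        lam = np.linalg.lstsq(V[:, :n], x_new - x, rcond=None)[0]
        V = rotate_basis(V, lam)
        if np.all(f_frame > f_x - eps):                   # quasi-minimal frame
            h = gamma * h
        elif np.any(x_new != x):                          # expand after a success
            h = 2.0 * h
        x, g_prev, d_prev = x_new, g, d
    return x
```

For example, `hybrid_search(lambda z: np.sum((z - 1.0)**2), np.zeros(4))` should drive the iterates toward the minimizer $(1, 1, 1, 1)$.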

3. Convergence Analysis

Now we have the following convergence property of Algorithm 5.

Theorem 7. Suppose that the sequence of function values $\{f(x_k)\}$ is bounded. Then the sequence of quasi minimal points is infinite.

Proof. Assume that the sequence of quasi minimal points is finite; let $x_q$ be the final quasi minimal point and consider the iterates $x_k$ with $k > q$.
From Steps 3, 4, and 5 of Algorithm 5, we know that
$$f(x_{k+1}) \le f(x_k) - N h_k^2$$
or
$$f(x_{k+1}) \le \min_{v \in V_k} f(x_k + h_k v),$$
where $N$ is a positive constant and $h_k$, $V_k$ are the frame size and positive basis corresponding to the iterate $x_k$, respectively. For $k > q$, the frame $\Phi(x_k, h_k, V_k)$ is not quasi minimal. From Definition 4, it follows that there exists at least one vector $v \in V_k$ such that
$$f(x_k + h_k v) \le f(x_k) - N h_k^2.$$
Combining the three relations above, we have $f(x_{k+1}) \le f(x_k) - N h_k^2$ for all $k > q$, and hence
$$f(x_{q+j}) \le f(x_q) - N \sum_{i=0}^{j-1} h_{q+i}^2,$$
where $j$ is a positive integer.
Because the frame at $x_q$ is the final quasi minimal frame, by Step 6 of Algorithm 5 the step length is never reduced after iteration $q$; that is, there is a positive constant $\bar{h}$ with
$$h_k \ge \bar{h} > 0 \quad \text{for all } k > q.$$
Combining the last two inequalities, we have
$$f(x_{q+j}) \le f(x_q) - j N \bar{h}^2.$$
If we ignore the stopping condition and let $j \to \infty$, then $f(x_{q+j}) \to -\infty$, which contradicts the condition that $\{f(x_k)\}$ is bounded. The proof of this theorem is complete.

Theorem 8. Assume the following conditions are satisfied:
(A1) $f$ is continuously differentiable.
(A2) $\|v_i^k\| \le M$ for $k = 0, 1, \ldots$ and $i = 1, \ldots, n+1$, where $M$ is a positive constant and $v_i^k$ is the $i$th vector in $V_k$.
Then each cluster point of the sequence of quasi minimal points is a stationary point of $f$.

Proof. Let $\bar{x}$ be an arbitrary cluster point of the sequence of quasi minimal points and let the subsequence $\{x_k\}_{k \in K}$ converge to $\bar{x}$, where $K$ is an infinite subset of the natural numbers. Assume that $h_k$ and $V_k = \{v_1^k, \ldots, v_{n+1}^k\}$ are the frame size and positive basis corresponding to the iteration point $x_k$, and that $V_k \to \bar{V} = \{\bar{v}_1, \ldots, \bar{v}_{n+1}\}$ for $k \in K$. According to the Taylor expansion and (A1), we have
$$f(x_k + h_k v_i^k) = f(x_k) + h_k \nabla f(\xi_i^k)^T v_i^k$$
for all $v_i^k \in V_k$, where $\xi_i^k$ lies on the line segment joining $x_k$ and $x_k + h_k v_i^k$. From Definition 4, we have
$$f(x_k + h_k v_i^k) > f(x_k) - N h_k^2.$$
Combining the two relations above with (A2), we obtain
$$\nabla f(\xi_i^k)^T v_i^k > -N h_k.$$
Let $k \to \infty$ with $k \in K$; according to Step 6 of Algorithm 5, we have $h_k \to 0$, and hence $\xi_i^k \to \bar{x}$. Combining these with the inequality above and (A1), we have
$$\nabla f(\bar{x})^T \bar{v}_i \ge 0, \quad i = 1, \ldots, n+1.$$
Since $\bar{V}$ is a positive basis, there exist nonnegative coefficients $\sigma_1, \ldots, \sigma_{n+1}$ such that
$$-\nabla f(\bar{x}) = \sum_{i=1}^{n+1} \sigma_i \bar{v}_i.$$
Combining the last two relations, we have
$$-\|\nabla f(\bar{x})\|^2 = \sum_{i=1}^{n+1} \sigma_i \nabla f(\bar{x})^T \bar{v}_i \ge 0,$$
which yields $\nabla f(\bar{x}) = 0$. The proof of this theorem is complete.

Remark 9. Although Theorem 8 needs the assumption (A1), in practice we do not solve derivative-free problems that accurately, so we only require that $f$ is continuously differentiable near the stationary point.

4. Numerical Experiments

In this section, we discuss numerical test results for Algorithm 5. Our tests were performed on a PC with an Intel Core CPU (3.60 GHz) and 8 GB RAM, using MATLAB 7.12.0.

To compare our algorithm with Max-PRP, we choose to work with the performance profiles [30] and data profiles [31] for derivative-free optimization. The performance profile of solver $s$ is the following fraction:
$$\rho_s(\alpha) = \frac{1}{|P|} \Big| \Big\{ p \in P : \frac{t_{p,s}}{\min\{t_{p,\tilde{s}} : \tilde{s} \in S\}} \le \alpha \Big\} \Big|,$$
where $P$ is the set of benchmark problems, $S$ is the set of optimization solvers, and $t_{p,s}$ is the number of function evaluations required to satisfy the convergence test for problem $p$ on solver $s$.

The data profile is defined as
$$d_s(\kappa) = \frac{1}{|P|} \big| \{ p \in P : t_{p,s} \le \kappa \,(n_p + 1) \} \big|,$$
where $n_p$ is the number of variables in problem $p$.

We use the following convergence condition:
$$f(x_0) - f(x) \ge (1 - \tau)\,(f(x_0) - f_L),$$
where $x_0$ is the initial point for the test problem, $\tau > 0$ is a tolerance, $f_L$ is the best function value achieved by any solver within $\mu_f$ function evaluations, and $\mu_f$ is a positive integer.
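For reference, both profiles can be computed from a single matrix of evaluation counts; this small sketch (ours, with NumPy) assumes `T[p, s]` holds the number of evaluations solver `s` needed to pass the convergence test on problem `p`, with `np.inf` marking failures:

```python
import numpy as np

def performance_profile(T, alpha):
    """Fraction of problems each solver brings within ratio alpha of the
    best solver's evaluation count."""
    ratios = T / T.min(axis=1, keepdims=True)
    return (ratios <= alpha).mean(axis=0)

def data_profile(T, n_vars, kappa):
    """Fraction of problems each solver passes within a budget of kappa
    simplex gradients, i.e. kappa * (n_p + 1) evaluations."""
    budget = kappa * (np.asarray(n_vars) + 1)[:, None]
    return (T <= budget).mean(axis=0)
```

For example, `performance_profile(T, 2.0)` gives, for each solver, the fraction of problems it solves within twice the evaluation count of the best solver on that problem.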

The benchmark problem set in our experiments is drawn from [32, 33] and the CUTEr test problem set [34]. The problem set includes 78 nonlinear least squares problems and 60 general nonlinear programming problems. Tables 1 and 2 show some information about the test problems, where $n$ is the number of variables and $m$ is the number of components. The problems of Table 1 are nonlinear least squares problems of the form
$$f(x) = \sum_{k=1}^{m} f_k(x)^2,$$
and the problems of Table 2 are general unconstrained minimization problems.

In addition, we define the maximum computational budget as 150 simplex gradients, where the computational budget of one simplex gradient equals $n_p + 1$ function evaluations, so $\mu_f = 150\,(n_p + 1)$.

The parameters of our numerical experiments are set as follows: the initial step length $h_0$, the radius factor parameter $\theta$, the constant $N$, and the reduction factor $\gamma$ are fixed throughout, and the expansion factor $\phi_k$ takes the value 2 if the previous iteration was successful, or 1 otherwise.

In the RBF interpolation model of (8), we set the polynomial tail $p(x)$ to be linear. In addition, we impose a maximum number $p_{\max}$ of points considered in the interpolation data set $\mathcal{Y}$. All the previously evaluated points are used to compute the RBF interpolation model when their number is lower than $p_{\max}$. Similar to [20], whenever there are more previously evaluated points than $p_{\max}$, 80% of the desired points are selected as the ones nearest to the current iterate and the remaining 20% are chosen as the ones farthest from the current iterate. This strategy is adopted in order to preserve the geometry and to diversify the information used in the RBF interpolation model.

In Figure 1, we show the performance profiles for Algorithm 5 and Max-PRP. As we can see, Algorithm 5 outperforms Max-PRP for small values of the performance ratio $\alpha$, and the difference becomes significantly larger as the performance ratio decreases. In addition, Algorithm 5 guarantees better results than Max-PRP for larger ratios: for example, Algorithm 5 can solve about 90% of the test problems, while Max-PRP solves no more than 85%.

The data profiles of Algorithm 5 and Max-PRP are reported in Figure 2. When the number of simplex gradients is larger than 40, Algorithm 5 performs better than Max-PRP, as it solves a higher percentage of problems. For example, with a budget of 400 simplex gradients, Algorithm 5 solves almost 90% of the problems, while Max-PRP solves roughly 85%.

5. Conclusion

The computational results presented in this paper show that Algorithm 5 is quite competitive. The performance profiles and data profiles of the numerical results indicate that Algorithm 5 often reduces the number of function evaluations required to reach a stationary point and is superior to Max-PRP.

Competing Interests

The authors declare that there are no competing interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (11071117 and 11274109), the Natural Science Foundation of Jiangsu Province (BK20141409), and the Natural Science Foundation of Huzhou University (KX21072).