Journal of Optimization


Research Article | Open Access

Volume 2016 | Article ID 3260940 | 14 pages | https://doi.org/10.1155/2016/3260940

Hybridization of Adaptive Differential Evolution with an Expensive Local Search Method

Academic Editor: Manlio Gaudioso
Received: 27 Dec 2015
Revised: 09 Jun 2016
Accepted: 14 Jun 2016
Published: 31 Jul 2016

Abstract

Differential evolution (DE) is an effective and efficient heuristic for global optimization problems. However, it faces difficulty in exploiting the local region around the approximate solution. To handle this issue, local search (LS) techniques can be hybridized with DE to improve its local search capability. In this work, we hybridize an updated version of DE, adaptive differential evolution with optional external archive (JADE), with an expensive LS method, Broyden-Fletcher-Goldfarb-Shanno (BFGS), for solving continuous unconstrained global optimization problems. The new hybrid algorithm is denoted by DEELS. To validate the performance of DEELS, we carried out extensive experiments on the well-known test problem suites CEC2005 and CEC2010. The experimental results, in terms of function error values, success rate, and some other statistics, are compared with some of the state-of-the-art algorithms: self-adaptive control parameters in differential evolution (jDE), sequential DE enhanced by neighborhood search for large-scale global optimization (SDENS), and the differential ant-stigmergy algorithm (DASA). These comparisons reveal that DEELS outperforms jDE and SDENS, but not DASA, on the majority of test instances.

1. Introduction

Optimization is concerned with finding the best solution for an objective function. In general, an unconstrained optimization problem can be stated as follows: find the global optimum of an objective function f(x), where x = (x_1, x_2, ..., x_D) ∈ R^D and D is the dimension of the problem.

Evolutionary algorithms (EAs) are inspired by the Darwinian theory of evolution [1]. They are very efficient at finding the global optimum of many real-world problems, including problems from mathematics, engineering, economics, business, and medicine. The EA family consists of a variety of stochastic algorithms, like Genetic Algorithms (GAs) [2], Particle Swarm Optimization (PSO) [3, 4], Evolutionary Strategies (ES) [5], and the differential evolution algorithm (DE) [6, 7].

Among EAs, DE is the most recent algorithm and is efficient in solving many optimization problems. DE has many advantages. For example, it is simple to understand and implement, has few control parameters, and is robust [8]. There is no doubt that DE is a remarkable optimizer for many optimization problems. But it has a few limitations, like stagnation, premature convergence, and loss of population diversity [9, 10]. Being a global optimizer, DE struggles to search the neighborhood of the approximate solution to the given problem. This makes room for hybridizing DE with other techniques to improve its poor exploitation (exploring the neighborhood of the approximate solutions). On the other hand, the role of LS methods is to stabilize the search, especially in the environs of a local optimum. Thus, they can be combined with global search algorithms to enhance their local searching.

The main aim of this paper is to experiment with and validate the performance of our newly proposed hybrid algorithm, DEELS, which combines JADE [11, 12] and BFGS [13]. As a result, we want to see whether this hybridization will improve the performance of JADE further. In contrast to our published preliminary work [14], this paper presents DEELS in full depth. It also comments on the performance of DEELS for large-scale global optimization problems with dimension 1000. Moreover, in contrast to our previous published comparison with JADE only [14], this time DEELS is compared with jDE [15], SDENS [16], and DASA [17] on problems from the CEC2005 and CEC2010 test suites to further explore the capabilities of DEELS for handling small- and large-dimension problems.

The rest of this paper is organized as follows. Section 2 describes the basic DE, JADE, and BFGS algorithms. Section 3 presents a literature review. Section 4 presents the proposed algorithm. Section 5 gives the experimental results, and finally Section 6 concludes this paper and discusses future research directions.

2. Some Relevant Existing Methods

As mentioned earlier, DEELS depends upon JADE and BFGS. Thus, this section presents the basic operators of DE, JADE, and BFGS.

2.1. Basic DE

Differential evolution (DE) [6, 7] is a bioinspired scheme for finding the global optimum of an optimization problem. This section briefly reviews the DE algorithm. More details about it can be found in [18–22]. The working of DE can be described as follows.

2.1.1. Parent Selection

For each member x_i,G, i = 1, 2, ..., NP, of the current generation G, three other members, x_r1,G, x_r2,G, and x_r3,G, are randomly selected, where r1, r2, and r3 are randomly chosen indices such that r1, r2, r3 ∈ {1, 2, ..., NP} and r1 ≠ r2 ≠ r3 ≠ i. Thus, for each individual, x_i,G, a mating pool of four individuals is formed in which an individual breeds against three individuals and produces an offspring.

2.1.2. Reproduction

To generate an offspring, DE incorporates two genetic operators, mutation and crossover. They are detailed as follows.

(1) Mutation. After selection, mutation is applied to produce a mutant vector v_i,G by adding the scaled difference of two of the chosen vectors to the third; that is,

v_i,G = x_r1,G + F · (x_r2,G − x_r3,G),   (1)

where F is the scaling factor.

(2) Crossover. After mutation, the parameters of the parent vector x_i,G and the mutant vector v_i,G are mixed by a crossover operator and a trial member u_i,G is generated as follows:

u_j,i,G = v_j,i,G if rand_j ≤ CR or j = j_rand, and u_j,i,G = x_j,i,G otherwise,   (2)

where j = 1, 2, ..., D, rand_j is a uniform random number in [0, 1], CR is the crossover probability, and j_rand is a randomly chosen index that guarantees at least one component is inherited from the mutant vector.

2.1.3. Survival Selection

At the end, the trial vector u_i,G generated in (2) is compared with its parent vector x_i,G on the basis of the objective function value. The better of the two gets a chance to become a member of the new generation; that is,

x_i,G+1 = u_i,G if f(u_i,G) ≤ f(x_i,G), and x_i,G+1 = x_i,G otherwise.   (3)
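The DE cycle described above (parent selection, mutation, crossover, and survival selection) can be sketched as follows. This is a minimal illustrative implementation of classic DE/rand/1/bin, not the authors' code; the parameter values (NP = 30, F = 0.5, CR = 0.9) are assumptions for the example.

```python
import random

def de(f, bounds, np_=30, F=0.5, CR=0.9, gens=200, seed=1):
    """Classic DE/rand/1/bin; bounds is a list of (low, high) per dimension."""
    rng = random.Random(seed)
    D = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_)]
    fit = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(np_):
            # parent selection: three distinct indices, all different from i
            r1, r2, r3 = rng.sample([j for j in range(np_) if j != i], 3)
            # mutation (1): v = x_r1 + F * (x_r2 - x_r3)
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(D)]
            # binomial crossover (2) with a guaranteed component j_rand
            j_rand = rng.randrange(D)
            u = [v[j] if (rng.random() <= CR or j == j_rand) else pop[i][j]
                 for j in range(D)]
            # survival selection (3): greedy one-to-one replacement
            fu = f(u)
            if fu <= fit[i]:
                pop[i], fit[i] = u, fu
    best = min(range(np_), key=lambda i: fit[i])
    return pop[best], fit[best]
```

For instance, minimizing a 5-dimensional sphere function with `de(lambda x: sum(t*t for t in x), [(-5, 5)] * 5)` drives the objective close to zero.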

2.2. JADE

JADE [11] is an adaptive version of DE which modifies it in three aspects.

2.2.1. DE/current-to-pbest Strategy

JADE utilizes two mutation strategies: one with an external archive and the other without it. These strategies can be expressed as follows [11]:

v_i,G = x_i,G + F_i · (x^p_best,G − x_i,G) + F_i · (x_r1,G − x~_r2,G),   (4)

v_i,G = x_i,G + F_i · (x^p_best,G − x_i,G) + F_i · (x_r1,G − x_r2,G),   (5)

where x^p_best,G is a vector chosen randomly from the top 100p% individuals, x_i,G, x_r1,G, and x_r2,G are chosen from the current population P, and x~_r2,G is chosen randomly from P ∪ A, where A denotes the archive of JADE and p ∈ (0, 1] is a constant. In DEELS, we will utilize the strategy given in (4).
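The with-archive strategy (4) can be sketched as follows. This is an illustrative helper, not the authors' code; the function name and the value p = 0.05 are assumptions for the example.

```python
import random

def current_to_pbest(pop, fit, i, F, archive, p=0.05, rng=None):
    """DE/current-to-pbest/1 with optional external archive (a sketch).

    pop     : list of vectors (current population P)
    fit     : objective values of pop
    archive : list of failed parents (the archive A)
    p       : fraction defining the 'top 100p%' pool (0.05 assumed here)
    """
    rng = rng or random.Random(0)
    NP, D = len(pop), len(pop[0])
    # x^p_best: a random member of the top 100p% of the population
    top = sorted(range(NP), key=lambda k: fit[k])[:max(1, int(round(p * NP)))]
    xb = pop[rng.choice(top)]
    # x_r1 from P; x~_r2 from the union P ∪ A, distinct from x_i and x_r1
    r1 = rng.choice([k for k in range(NP) if k != i])
    union = pop + archive
    xr2 = rng.choice([v for v in union if v is not pop[i] and v is not pop[r1]])
    xi = pop[i]
    return [xi[j] + F * (xb[j] - xi[j]) + F * (pop[r1][j] - xr2[j])
            for j in range(D)]
```

With F = 0 both difference terms vanish and the mutant reduces to the parent vector, which is a quick sanity check of the formula.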

2.2.2. Control Parameters Adaptation

For each individual x_i, the control parameter F_i and the crossover probability CR_i are generated independently from Cauchy and normal distributions, respectively, as follows [11]:

F_i = rand_c(μ_F, 0.1),   CR_i = rand_n(μ_CR, 0.1).

These are then truncated to (0, 1] and [0, 1], respectively. Initially, both μ_F and μ_CR are set to 0.5. They are then updated at the end of each generation as follows:

μ_F = (1 − c) · μ_F + c · mean_L(S_F),   (6)

μ_CR = (1 − c) · μ_CR + c · mean_A(S_CR),   (7)

where mean_L denotes the Lehmer mean, mean_A denotes the arithmetic mean, c is a constant, and S_F is the set of successful F_i's while S_CR is the set of successful CR_i's at generation G.
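The sampling and update rules above can be sketched as follows. The function names and the choice c = 0.1 are illustrative assumptions, not values fixed by this paper.

```python
import math
import random

def sample_F(mu_F, rng):
    """F_i ~ Cauchy(mu_F, 0.1): truncated to 1 if above 1, resampled if <= 0."""
    while True:
        F = mu_F + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
        if F > 1.0:
            return 1.0
        if F > 0.0:
            return F

def sample_CR(mu_CR, rng):
    """CR_i ~ N(mu_CR, 0.1), clipped to [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mu_CR, 0.1)))

def update_means(mu_F, mu_CR, S_F, S_CR, c=0.1):
    """Update rules (6) and (7): Lehmer mean for mu_F, arithmetic for mu_CR."""
    if S_F:
        lehmer = sum(F * F for F in S_F) / sum(S_F)
        mu_F = (1 - c) * mu_F + c * lehmer
    if S_CR:
        mu_CR = (1 - c) * mu_CR + c * (sum(S_CR) / len(S_CR))
    return mu_F, mu_CR
```

Note that if every successful F equals the current μ_F, the update leaves μ_F unchanged, as expected from the convex combination in (6).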

2.2.3. Optional External Archive

At each generation, the failed parents are sent to the archive A. If the archive size exceeds NP, some solutions are randomly deleted from it to keep its size equal to NP. The archived inferior solutions play a role in JADE's mutation strategy with archive. The archive not only provides information about the search direction but improves the diversity as well.

2.3. BFGS

The BFGS method, a quasi-Newton algorithm, employs the gradient and an approximation of the Hessian to find a suitable search direction. BFGS is considered a good LS method due to its efficiency. The detailed algorithm of BFGS is presented in Algorithm 1.

Input: error: desired accuracy;
   T: number of iterations;
   x_0: the starting vector;
   H_0: the inverse Hessian approximation, initialized as the identity matrix.
(1) t = 0.
(2) while ||∇f(x_t)|| > error and t ≤ T do
(3)   Find the difference s_{t−1} = x_t − x_{t−1};
(4)   Compute the difference of gradients y_{t−1} = ∇f(x_t) − ∇f(x_{t−1});
(5)   if t > 0 and s_{t−1}^T y_{t−1} > 0 then
(6)     ρ_{t−1} = 1 / (y_{t−1}^T s_{t−1});
(7)     V_{t−1} = I − ρ_{t−1} s_{t−1} y_{t−1}^T;
(8)     Revise the Hessian matrix as:
          H_t = V_{t−1} H_{t−1} V_{t−1}^T + ρ_{t−1} s_{t−1} s_{t−1}^T
(9)   end if
(10)  Compute the search direction by using the current Hessian matrix: d_t = −H_t ∇f(x_t);
(11)  Calculate the step length α_t by the golden section method [23];
(12)  x_{t+1} = x_t + α_t d_t; t = t + 1;
(13) end while
Output: x_t is the output of the algorithm.
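A compact executable sketch of Algorithm 1 follows. It is illustrative only: the gradient is approximated by forward differences, and a simple backtracking line search stands in for the golden-section method of step (11).

```python
import math

def grad(f, x, h=1e-6):
    """Forward-difference gradient: one base evaluation plus one per coordinate."""
    fx = f(x)
    return [(f(x[:j] + [x[j] + h] + x[j + 1:]) - fx) / h for j in range(len(x))]

def bfgs(f, x, error=1e-6, T=100):
    """BFGS with inverse-Hessian updates (a sketch of Algorithm 1)."""
    D = len(x)
    H = [[float(i == j) for j in range(D)] for i in range(D)]  # H_0 = I
    g = grad(f, x)
    for _ in range(T):
        if math.sqrt(sum(gi * gi for gi in g)) <= error:
            break
        # search direction: d = -H g
        d = [-sum(H[i][j] * g[j] for j in range(D)) for i in range(D)]
        # backtracking line search (golden section in the paper)
        a, fx = 1.0, f(x)
        while f([x[i] + a * d[i] for i in range(D)]) > fx and a > 1e-12:
            a *= 0.5
        x_new = [x[i] + a * d[i] for i in range(D)]
        g_new = grad(f, x_new)
        s = [x_new[i] - x[i] for i in range(D)]
        y = [g_new[i] - g[i] for i in range(D)]
        sy = sum(s[i] * y[i] for i in range(D))
        if sy > 1e-12:  # curvature condition keeps H positive definite
            rho = 1.0 / sy
            # H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T
            V = [[(1.0 if i == j else 0.0) - rho * s[i] * y[j]
                  for j in range(D)] for i in range(D)]
            HV = [[sum(V[i][k] * H[k][j] for k in range(D))
                   for j in range(D)] for i in range(D)]
            H = [[sum(HV[i][k] * V[j][k] for k in range(D)) + rho * s[i] * s[j]
                  for j in range(D)] for i in range(D)]
        x, g = x_new, g_new
    return x
```

On a convex quadratic such as the sphere function, this sketch converges to the minimizer in a handful of iterations.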

3. Brief Review of Variants of DE and Hybridization of DE with Local Search Methods

To improve the performance of DE, many researchers have devised modifications to the classic DE and proposed different variants. Some researchers modified the selection scheme [24], while others varied the mutation and crossover operators [25]. Recently, in [26], orthogonal crossover was used instead of binomial and exponential crossover. Some have introduced new variants like opposition-based DE (ODE) [27], centroid-based initialization (ciJADE) [28], jDE [15], and genDE [8], while others introduced adaptation and self-adaptation of the control parameters F and CR, as in [29, 30], SaDE [31], JADE [11, 12], SHADE [32], and EWMA-DECrF [33]. Some introduced cooperative coevolution into DE for large-scale optimization [34]. A group of researchers applied it to discrete problems [35, 36], while others take advantage of its global search ability in continuous domains [26, 37–40].

In recent years, the hybridization of DE with LS methods has gained much attraction due to their individual merits. Many hybrid algorithms have shown significant performance improvement. Here, we review some of the methods in this category.

A new differential evolution algorithm with localization around the best point (DELB) is proposed in [41]. In DELB, the initial steps are the same as those in DE except that the mutation scale factor is chosen randomly from an interval for each mutant vector. DELB also modifies the selection step by introducing reflection and contraction. The trial vector is compared with the current best and the parent vector. If the parent is worse than the trial vector, it is replaced by a new contracted or reflected vector. In DELB, the trial vector can be replaced by its parent vector, a reflected vector, or a contracted vector, while in classic DE only the trial vector replaces the parent.

Recently, in [42], DE was hybridized with the nonlinear simplex method. This method is known as NSDE. The authors of [42] applied the nonlinear simplex method with uniform random numbers to initialize the DE population. Initially, a set of individuals is generated uniformly, and further points are then generated from these by applying the Nelder-Mead Simplex (NMS) method. From this enlarged population, the fittest individuals are selected as DE's initial population, and the rest of DE is unaltered in NSDE. Thus, NSDE modifies DE in the population initialization step only. It has shown good performance in reducing function evaluations and CPU time.

In another experiment, Brest et al. [43] hybridized DE with Sequential Quadratic Programming (SQP), an efficient but expensive gradient-based LS method. Their hybrid applies the DE algorithm until the number of function evaluations reaches a fraction of the maximum function evaluations. It then applies SQP for the first time to the best point thus obtained. Afterwards, SQP is applied every 100 generations to the best solution of the current search, with the number of expensive local search iterations kept fixed. In their hybrid, the population size keeps reducing, and the process ends with a minimum population size. DE provides the users with flexible offspring generation strategies [44]. Hence, hybridization of DE will continue to remain an active field of multidisciplinary research in the years to come.

Thus, we present a new algorithm, DEELS, which utilizes an expensive local search for refining the solutions. The details of DEELS are presented in the following section.

4. A New Hybrid Algorithm: DEELS

In this section, we present our new proposed algorithm, DEELS, which is the combination of two methods with contrasting features. First, we will discuss the main features of the algorithm. Then, we will describe it explicitly.

4.1. Main Idea

Though JADE, due to its adaptive parameter control strategy, performs better than classic DE on many optimization problems, its performance worsens as the dimension increases. BFGS is a LS technique with a strong self-correcting ability [45] in searching for the optimal solution, but as a global searcher it is not as good as JADE. The important question is how to reconcile these two different aspects to solve the minimization problem.

A very natural way is to hybridize these two techniques, JADE and BFGS, for solving unconstrained optimization problems. The issue is how to combine them in a way that is easy to understand and implement. Many hybrid approaches incorporate expensive methods only to refine the best solution. Here, however, the new algorithm incorporates the robust but costly method not only for refining good solutions but also for locating them in the population during the search process.

DEELS begins with JADE and allows it to search for a fixed number of generations. It then selects the best individuals from the population and applies to them the expensive LS, that is, BFGS, for the first time. The objective of applying this efficient search is to turn them into potential individuals that produce better offspring and lead the search in promising directions. These refined points are then introduced into the population and the worst solutions are removed from it.

The purpose of calling BFGS after a fixed number of generations is to concentrate the population and add local search ability to the overall scheme, thus helping it avoid getting trapped in local optimal solutions. For these reasons, BFGS is invoked two more times in the evolution, at intervals of a fixed number of generations. If the best function value is less than a threshold, this means that it is in the neighborhood of the value to reach, and the current best solution might lead the search to the desired optimal solution. Hence, it is desirable to apply the efficient LS for more than one iteration to this best solution. Thus, BFGS is applied for more iterations when the best solution is in the vicinity of a local optimum. If the output solution of BFGS is the best known solution, then the algorithm stops; otherwise, it continues until the allowed maximum number of function evaluations is met.

In [43], the population size is reduced dynamically, while in our hybrid algorithm, we keep the population size fixed, since reducing the population size might result in losing population diversity, which is very important for DE. DEELS draws much inspiration from the state-of-the-art paper [46]. We apply an expensive LS in combination with an EA (DE) instead of their inexpensive LS. In [46], both methods are LS methods, while DEELS combines BFGS with JADE to investigate the effect of combining an EA with a LS method. In [46], a restart is also incorporated, which is not necessary in DEELS.

4.2. Algorithmic Framework of DEELS

The details of DEELS are given in Algorithm 2. Here, we explain the different strategies used in DEELS.

(1) Inputs: Generate NP uniform random points from the search space to form population P;
(2) n_b: the number of points selected for LS;
(3) t_c: the number of iterations of LS for concentration;
(4) t_r: the number of iterations of LS for refining the best solution;
(5) NP: population size;
(6) FES: number of function evaluations;
(7) G: generation counter;
(8) G_s: interval between the LS calls;
(9) error: desired accuracy for LS method;
(10) G = 0, FES = 0;
(11) Evaluate the population;
(12) Set FES = NP and G = G + 1;
(13) while FES < MaxFES do
(14)   Run JADE by using (4) for generating the mutant vector, (2) for the trial vector,
     (3) for best solution selection, and (6) and (7) for adaptation of the control parameters;
(15)   Explore the population for G_s generations;
(16)   Sort the objective values;
(17)   Select the n_b best points;
(18)   for k = 1 to n_b do
(19)     Apply t_c iterations of BFGS to point k;
(20)     if f_best − f* ≤ error then
(21)       Break;
(22)     else if k = n_b then
(23)       Update the population by adding the n_b new points to it such that its size becomes NP + n_b;
(24)       Sort the objective values;
(25)       Delete the n_b worst individuals from P;
(26)     end if
(27)   end for
(28)   Apply JADE to this new population for the next G_s generations;
(29)   if FES ≥ MaxFES or f_best − f* ≤ error then
(30)     Break;
(31)   else
(32)     G = G + 1;
(33)   end if
(34) end while
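The overall DEELS schedule, alternating a global phase with periodic local polishing and population updates, can be sketched as follows. This is a structural illustration only: plain DE/rand/1/bin stands in for JADE, a simple coordinate pattern search stands in for BFGS, and all parameter names and values are assumptions for the example.

```python
import random

def local_polish(f, x, fx, iters=1, step=0.1):
    """Cheap stand-in for BFGS: passes of coordinate pattern search."""
    fes = 0
    for _ in range(iters):
        for j in range(len(x)):
            for cand in (x[j] + step, x[j] - step):
                y = x[:]
                y[j] = cand
                fy = f(y); fes += 1
                if fy < fx:
                    x, fx = y, fy
        step *= 0.5
    return x, fx, fes

def deels_skeleton(f, bounds, NP=20, Gs=30, n_best=3, t_c=2,
                   max_fes=15000, f_star=0.0, error=1e-6, seed=3):
    rng = random.Random(seed)
    D = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    fit = [f(x) for x in pop]
    fes = NP
    while fes < max_fes:
        for _ in range(Gs):                      # global phase (DE as stand-in)
            for i in range(NP):
                r1, r2, r3 = rng.sample([j for j in range(NP) if j != i], 3)
                jr = rng.randrange(D)
                u = [pop[r1][j] + 0.5 * (pop[r2][j] - pop[r3][j])
                     if (rng.random() < 0.9 or j == jr) else pop[i][j]
                     for j in range(D)]
                fu = f(u); fes += 1
                if fu <= fit[i]:
                    pop[i], fit[i] = u, fu
        # concentration phase: polish the n_best fittest points,
        # reinsert them, and drop the same number of worst points
        order = sorted(range(NP), key=lambda k: fit[k])
        for k in order[:n_best]:
            x, fx, used = local_polish(f, pop[k], fit[k], iters=t_c)
            fes += used
            worst = max(range(NP), key=lambda m: fit[m])
            pop[worst], fit[worst] = x, fx
        if min(fit) - f_star <= error:           # value-to-reach test
            break
    b = min(range(NP), key=lambda k: fit[k])
    return pop[b], fit[b]
```

The fixed population size (polished points replace the current worst points one for one) mirrors the design choice discussed above, in contrast to the shrinking population of [43].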
4.2.1. Global Search

JADE improves the population of solutions by updating it from generation to generation with the help of the genetic operators, mutation and crossover. These operators help the search by producing promising solutions. JADE also possesses global search ability and thus adds it to DEELS. Moreover, JADE, being a population-based method, can maintain the diversity of the population and thus decreases the chances of DEELS getting trapped in local optima.

4.2.2. LS

The BFGS method has very strong self-correcting properties (when the right line search is used). If, at some iteration, the Hessian matrix contains bad curvature information, it has the ability to correct these inaccuracies within only a few updates [45]. For this reason, BFGS generally performs very well, and once in the neighborhood of a minimizer it can attain superlinear convergence [45]. Though BFGS is efficient, it is a costly method, since it computes the gradient at the given point, which consumes O(D) function evaluations per gradient in DEELS. Further, it approximates the Hessian matrix H, a D × D matrix of second-order partial derivatives [47], at a computational cost of O(D^2) per iteration [47]. BFGS thus needs O(D) function evaluations per iteration [45], and its overall overhead is O(D^2) per iteration.
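The O(D) gradient cost is easy to verify by counting evaluations in a forward-difference gradient; the helper below is purely illustrative and assumes finite-difference gradients.

```python
def fd_gradient_cost(D):
    """Count function evaluations for one forward-difference gradient.

    Forward differences need one base evaluation plus one per coordinate,
    i.e. D + 1 evaluations per gradient; the Hessian-update arithmetic in
    BFGS adds O(D^2) floating-point work per iteration on top of that.
    """
    calls = 0
    def f(x):
        nonlocal calls
        calls += 1
        return sum(t * t for t in x)
    x = [1.0] * D
    fx = f(x)
    h = 1e-6
    _ = [(f(x[:j] + [x[j] + h] + x[j + 1:]) - fx) / h for j in range(D)]
    return calls
```

For a 1000-dimensional CEC2010 instance, each gradient therefore costs about a thousand function evaluations, which is why BFGS is invoked sparingly in DEELS.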

The BFGS method plays two roles in DEELS: first, it is employed for generating promising solutions in the population after specified intervals of evolution; second, it improves the quality of the best solution found so far by JADE and BFGS together.

Next, we explain what we mean by the terms concentration and refinement. As said earlier, the issue is to have an easy-to-understand and easy-to-implement search process. To achieve this, we rely on the fact that the problem is to distinguish between ordinary points, of which we have a lot, good ones (local optima), of which we have relatively few, and the best points (global optima), of which we, potentially, may have only one or none.

Let us draw a diagram (see Figure 1) of the main process, which relies on LS to do a coarse clustering (i.e., bringing the majority of the good points in the population towards the basins of local optima) and a refinement step in which, hopefully, the local optimum will be identified. It is clear that, initially, this process will be rather ineffective because of the sheer randomness of the population of solutions, as shown in Figure 1(a); unless we are very lucky, it is unlikely that good points will be generated in the first population. But the important thing is that the process becomes more and more effective as concentration takes its toll on the population (see Figure 1(c)).

4.2.3. Updating the Population

Adding promising solutions to the population of DEELS and removing the worst points from it can improve the quality of offspring in the next generations. Just as good parents can produce good offspring, worse parents have the chance of producing worse solutions. Hence, their removal can have a good effect on the entire population. New potential solutions can also increase the convergence rate.

4.2.4. Stopping Condition

DEELS stops when one or both of the following conditions are met: (1) the maximum number of function evaluations (MaxFES) is reached; (2) f(x_best) − f* ≤ error, where x_best is the best individual found in a run and f* is the known value to reach of the test instance. The maximum number of function evaluations is set to 3.0 × 10^6 for the CEC2010 test instances with dimension 1000, while for the 30-dimensional problems (CEC2005), it is chosen as 3.0 × 10^5.

5. Comparison Studies

This section reports on two sets of experiments. In Experiment 1, DEELS is compared with jDE, while in Experiment 2, DEELS is compared with SDENS and DASA. For the comparison with SDENS and DASA, the experimental results for the best, median, mean, and standard deviation values are obtained from [17]. Moreover, all the experiments are conducted in the MATLAB environment.

5.1. Experiment  1

In our preliminary results [14], DEELS was compared with JADE only, which is its internal optimization technique. Here, however, we compare DEELS with another state-of-the-art algorithm, jDE [15], a self-adaptive DE variant, on 30-dimensional problems.

5.1.1. Test Instances for Experiment  1

To study the performance of DEELS, we use the CEC2005 test suite (see Table 1). This test suite was especially designed for single-objective unconstrained continuous optimization. Further, it was developed for low dimensions, for example, 30 and 50. That is why we selected these instances for our experimental study. More details about these instances can be found in [48]. The instances of CEC2005 can be divided into the following: (i) unimodal test instances (F1–F5); (ii) multimodal test instances: (1) basic multimodal test instances (F6–F12), (2) expanded multimodal test instances (F13-F14); (iii) hybrid composition test instance (F15). The 15th test instance, F15, is designed by combining ten different benchmark functions, that is, two Rastrigin's functions, two Weierstrass's functions, two Griewank's functions, two Ackley's functions, and two Sphere functions. Its value to reach is 120.


Test instance Description Value to reach

F1 Shifted Sphere Function −450
F2 Shifted Schwefel's Problem 1.2 −450
F3 Shifted Rotated High Conditioned Elliptic Function −450
F4 Shifted Schwefel's Problem 1.2 with Noise in Fitness −450
F5 Schwefel's Problem 2.6 with Global Optimum on Bounds −310
F6 Shifted Rosenbrock's Function 390
F7 Shifted Rotated Griewank's Function without Bounds −180
F8 Shifted Rotated Ackley's Function with Global Optimum on Bounds −140
F9 Shifted Rastrigin's Function −330
F10 Shifted Rotated Rastrigin's Function −330
F11 Shifted Rotated Weierstrass Function 90
F12 Schwefel's Problem 2.13 −460
F13 Shifted Expanded Griewank's plus Rosenbrock's Function (F8F2) −130

In each instance, the argument is z = x − o (rotated by a linear transformation matrix M where applicable), where D is the dimension and o is the shifted global optimum. Full definitions and initialization ranges are given in [48].
Minimize