Research Article | Open Access
Memetic Differential Evolution with an Improved Contraction Criterion
Memetic algorithms with an appropriate trade-off between the exploration and exploitation can obtain very good results in continuous optimization. In this paper, we present an improved memetic differential evolution algorithm for solving global optimization problems. The proposed approach, called memetic DE (MDE), hybridizes differential evolution (DE) with a local search (LS) operator and periodic reinitialization to balance the exploration and exploitation. A new contraction criterion, which is based on the improved maximum distance in objective space, is proposed to decide when the local search starts. The proposed algorithm is compared with six well-known evolutionary algorithms on twenty-one benchmark functions, and the experimental results are analyzed with two kinds of nonparametric statistical tests. Moreover, sensitivity analyses for parameters in MDE are also made. Experimental results have demonstrated the competitive performance of the proposed method with respect to the six compared algorithms.
The name “memetic algorithms” (MAs) was first introduced in 1989. Over the last two decades, MAs have gradually become one of the growing areas of research in evolutionary computation. They combine various evolutionary algorithms (EAs) with different LS methods to balance exploration and exploitation. Existing examples of memetic algorithms are NM-BRO, MA-LSCh-CMA, LBBO, IMMA, and MPSO. In the framework of MAs, LS operators are used to further exploit the individuals generated by common EA operations, which helps enhance the EA’s capacity to solve complicated problems.
Differential evolution was first proposed by Storn and Price in 1995 to solve global numerical optimization problems over continuous search spaces. It shares some similarities with other EAs. For example, DE works with a population of solutions, called vectors; it uses recombination and mutation operators to generate new vectors; and it has a replacement process to discard the less fit vectors. DE represents solutions with real coding. It also differs from other EAs in some respects: DE uses a special mutation operator based on the linear combination of three individuals, together with a uniform crossover operator. DE has several attractive features: it is relatively simple to implement and has been demonstrated to be very effective in a large number of cases. In the past few decades, DE has been successfully used in many real-world applications, such as space trajectory design [8–10], hydrothermal optimization, underwater glider path planning, and the vehicle routing problem.
Despite its successful applications to different classes of problems in different fields, DE was demonstrated to converge to a fixed point, a level set, or a hyperplane not containing the global optimum. Furthermore, in some cases it was shown to have slow local convergence.
In order to overcome these shortcomings, some authors have proposed hybridizing DE with local search heuristics. dos Santos Coelho and Mariani proposed a version of memetic DE which combines DE with a generator of chaos sequences and the sequential quadratic programming technique (DEC-SQP). In this memetic algorithm, DE with chaos sequences is the global optimizer and SQP is applied to the best individual to find the local minimum. Noman and Iba proposed an adaptive hill-climbing crossover-based local search operation for enhancing the performance of standard differential evolution (DEahcSPX). Muelas et al. developed MDE-DC, which combines DE with the multiple trajectory search algorithm (MTS). Neri and Tirronen proposed the scale factor local search differential evolution (SFLSDE). In SFLSDE, golden section search and unidimensional hill-climb local search are applied to detect an optimal value of the scale factor and generate a higher quality offspring. Wang et al. proposed an adaptive MA framework called DE-LS. In DE-LS, self-adaptive differential evolution (SaDE) is the global search method, while covariance matrix adaptation evolution strategy (CMA-ES) and self-adaptive mixed distribution based univariate EDA (MUEDA) are employed as the local search methods. Vasile et al. proposed an inflationary differential evolution algorithm (IDEA), which hybridizes DE with the restarting procedure of Monotonic Basin Hopping (MBH), to solve space trajectory optimization problems. Minisci and Vasile and Di Carlo et al. proposed an adaptive version of inflationary differential evolution algorithm (AIDEA) and a multipopulation version of AIDEA (MP-AIDEA) which automatically adapt the values of four control parameters. Locatelli et al. proposed a memetic differential evolution for disk-packing and sphere-packing problems. In this algorithm, two kinds of local searches (MINOS and SNOPT) are used to detect local minima. Asafuddoula et al. proposed an adaptive hybrid DE algorithm (AH-DEa) which has three features. The first is its use of adaptive crossover rates from a given set of discrete values. The second is an adaptive crossover strategy at different stages of the evolution. The last is the inclusion of a local search strategy to further improve the best solution. Qin et al. proposed an advanced SaDE, which incorporates two different local search chains (Lamarckian and Baldwinian) into SaDE to enhance exploitation capability. Trivedi et al. hybridized DE and GA to solve the unit commitment scheduling problems, in which GA was used to handle the binary unit commitment variables while DE was employed to optimize the continuous power dispatch related variables. In the same year, Li et al. proposed a new hybridization, named DEEP, based on the DE framework and the key features of CMA-ES, which generates a trial vector by first using a DE/rand/1/bin strategy followed by an Evolution Path (EP) mutation of CMA-ES.
The focus of this paper is to combine DE global search operators with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm to improve local search in continuous optimization. A new contraction criterion, which is based on the maximum distances in objective space and decision space, is proposed. When the contraction criterion is satisfied, BFGS starts from the best solution of the current generation. Furthermore, a restart mechanism is employed: if the best solution is not improved during the course of the local search, the population is reinitialized to increase the chance of finding the global optimum.
The paper is organized as follows: DE is briefly introduced in Section 2. The proposed DE algorithm with local search and reinitialization is presented in Section 3. The design of the experiments, the results, and the corresponding discussions are included in Section 4. The last section, Section 5, is devoted to conclusions and the future work.
2. A Short Introduction to Differential Evolution
DE is a population-based stochastic parallel optimization method. Each vector (or individual) of the population at a given generation is called a target vector, and it generates one offspring called the trial vector, so every target vector in the population has exactly one trial vector of its own. Trial vectors are generated by adding weighted difference vectors to the target vector. This process is referred to as the mutation operator, where the target vector is mutated. A crossover step is then applied to produce an offspring, which is accepted only if it improves on the fitness of the parent individual. Many variants of standard DE have been proposed, which use different learning strategies and/or recombination operations in the reproduction stage. A general DE variant may be written as DE/a/b/c, where “a” denotes the mutation strategy, “b” specifies the number of difference vectors used, and “c” specifies the crossover scheme, which may be binomial or exponential. DE/rand/1/exp is described in Algorithm 1.
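The DE/rand/1/exp scheme named above can be sketched as follows. This is a minimal illustration, not a reproduction of Algorithm 1: the function names, the default values of `pop_size`, `scale`, and `cr`, the sphere test function, and the use of `<=` in the replacement step are all assumptions made for the example, and no boundary handling is included.

```python
import random

def exp_crossover(target, mutant, cr, rng):
    """Exponential crossover: copy a contiguous (wrapping) run of mutant genes."""
    d = len(target)
    trial = list(target)
    j = rng.randrange(d)              # random starting position
    copied = 0
    while True:
        trial[j] = mutant[j]
        j = (j + 1) % d
        copied += 1
        # stop after d genes, or as soon as a random draw is not below cr
        if copied >= d or rng.random() >= cr:
            break
    return trial

def de_rand_1_exp(f, bounds, pop_size=20, scale=0.5, cr=0.9,
                  generations=200, seed=0):
    """Minimize f with DE/rand/1/exp (hypothetical default parameters)."""
    rng = random.Random(seed)
    d = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(generations):
        next_pop, next_fit = list(pop), list(fit)
        for i in range(pop_size):
            # "rand/1": three distinct individuals, none equal to the target
            others = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.sample(others, 3)
            mutant = [pop[r1][j] + scale * (pop[r2][j] - pop[r3][j])
                      for j in range(d)]
            trial = exp_crossover(pop[i], mutant, cr, rng)
            f_trial = f(trial)
            if f_trial <= fit[i]:     # one-to-one greedy replacement
                next_pop[i], next_fit[i] = trial, f_trial
        pop, fit = next_pop, next_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

def sphere(x):
    return sum(v * v for v in x)

best_x, best_f = de_rand_1_exp(sphere, [(-5.0, 5.0)] * 5)
```

With `cr` close to 1 the exponential crossover copies long runs of the mutant, so the trial vector is dominated by the mutation step; smaller values keep most of the target vector intact.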
3. Proposed Algorithm
In this section, we describe four major operations of the proposed MDE algorithm in detail, including contraction criterion, BFGS search, reinitialization scheme, and boundary constraint handling. The detailed description of MDE is given in Algorithm 2.
3.1. Contraction Criterion
In order to design an effective and efficient hybrid algorithm for global optimization, we need to take advantage of both the exploration capabilities of EA and the exploitation capabilities of LS and combine them in a well-balanced manner. To incorporate BFGS into DE successfully, a triggering condition, called the contraction criterion, is needed to decide when the local search has to start. There are several ways to define a contraction criterion. Qin and Suganthan apply the local search method after a fixed number of generations (every 200 generations). Sun et al. start the LS if the promising solution is not updated in t consecutive generations. Simon et al. use the minimum fitness in the objective space as the contraction criterion; Di Carlo et al. [8–10] perform LS when the maximum distance in decision space is below a given threshold.
In MDE, we propose a new contraction criterion which combines two criteria: (a) the improved maximum distance in objective space and (b) the maximum distance in decision space. Criterion (a) is derived from an existing measure of the diversity of the population in objective space.
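The distance in decision space is defined as the maximum pairwise distance between individuals of the population, $\max_{i \neq j} \lVert x_i - x_j \rVert$, where $\lVert \cdot \rVert$ is the Euclidean distance. It is a measure of the diversity of the population in decision space.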
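A rough sketch of how such a two-part contraction test might be evaluated is given below. Only the decision-space measure (largest pairwise Euclidean distance) follows directly from the text; the objective-space measure used here (gap between worst and best fitness), the way the two measures are combined (both must fall below a threshold), and all function and parameter names are assumptions for illustration, not the improved criterion proposed in the paper.

```python
import math
from itertools import combinations

def max_decision_distance(population):
    """Largest Euclidean distance between any two individuals."""
    return max((math.dist(a, b) for a, b in combinations(population, 2)),
               default=0.0)

def objective_spread(fitness):
    """Assumed objective-space diversity: gap between worst and best fitness."""
    return max(fitness) - min(fitness)

def contracted(population, fitness, tol_decision, tol_objective):
    """Assumed rule: trigger local search once both measures are small."""
    return (max_decision_distance(population) <= tol_decision
            and objective_spread(fitness) <= tol_objective)

population = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
fitness = [1.0, 1.1, 1.05]
start_local_search = contracted(population, fitness, 6.0, 0.2)
```

The pairwise distance costs a number of operations quadratic in the population size, which is usually negligible next to the objective function evaluations.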
3.2. BFGS Search
In MDE, the local search uses the better solutions obtained by the global search to update the population and thus enhances MDE’s ability to exploit promising regions and find the best solution. We use the BFGS algorithm as the local search method. BFGS is a quasi-Newton method: it does not need the exact Hessian matrix but approximates it from successive gradients. BFGS is widely considered the most effective and popular quasi-Newton method and has been shown to perform well even on nonsmooth optimization problems. The details can be found in .
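To make the quasi-Newton idea concrete, here is a compact textbook-style BFGS sketch in plain Python. It is not the implementation used in MDE: the central-difference gradient, the Armijo backtracking line search, the tolerances, and the quadratic test function are all assumptions chosen to keep the example short.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

def bfgs(f, x0, max_iter=200, tol=1e-8):
    n = len(x0)
    x, fx = list(x0), f(x0)
    # H approximates the inverse Hessian; start from the identity matrix
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = numerical_gradient(f, x)
    for _ in range(max_iter):
        if max(abs(v) for v in g) < tol:
            break
        # quasi-Newton search direction p = -H g
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * pi for gi, pi in zip(g, p))
        # backtracking line search with the Armijo sufficient-decrease test
        step = 1.0
        x_new = [xi + step * pi for xi, pi in zip(x, p)]
        f_new = f(x_new)
        while f_new > fx + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
            x_new = [xi + step * pi for xi, pi in zip(x, p)]
            f_new = f(x_new)
        if f_new >= fx:               # no further improvement: stop
            break
        g_new = numerical_gradient(f, x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:                # curvature condition keeps H positive definite
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + yHy / sy) * s[i] * s[j]
                                - Hy[i] * s[j] - s[i] * Hy[j]) / sy
        x, fx, g = x_new, f_new, g_new
    return x, fx

def quadratic(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

x_local, f_local = bfgs(quadratic, [0.0, 0.0])
```

In a memetic setting, `x0` would be the best individual of the current generation, and the returned point would replace it only if it has a better objective value.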
3.3. Reinitialization Scheme
If the best solution has not been improved after local search, a reinitialization of the whole population is used to give the algorithm more opportunities to find the global optimum. Simon et al. proposed a partial reinitialization of the population: every 20 generations, the algorithm selects the best individuals from a temporary population as the reinitialization pool. Sun et al. chose the individuals that have the largest distances from the local optima, taken from a temporary population, to form the next population. Zamuda et al. proposed a population size reduction method as the reinitialization scheme. In MDE, we apply the simple reinitialization scheme described in Algorithm 3. A counter keeps track of the number of restarts, and the scheme applied depends on how this counter compares with a user-defined threshold. In one case, all individuals are generated randomly in the search space, drawing samples from a uniform distribution. In the other case, some individuals in the population are initialized randomly in the search space, while the rest are initialized from a normal distribution that takes the best individual as its center and uses a prescribed standard deviation. In Algorithm 3, the function randreal draws samples from a uniform distribution, the function Gaussian draws samples from a normal distribution, and the lower and upper boundaries delimit each decision variable.
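The two-mode restart described above might look roughly like the sketch below. It is not Algorithm 3: the parameter names (`max_uniform_restarts`, `n_random`, `sigma`), the direction of the counter comparison, and the clamping of Gaussian samples to the search box are hypothetical choices made for illustration.

```python
import random

def reinitialize(pop_size, bounds, best, restarts, max_uniform_restarts,
                 n_random, sigma, rng):
    """Rebuild the population after an unsuccessful local search."""
    def uniform_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    # Assumed first mode: purely uniform sampling of the search space
    if restarts <= max_uniform_restarts:
        return [uniform_individual() for _ in range(pop_size)]

    # Assumed second mode: some uniform individuals, the rest sampled
    # around the best individual and clamped to the search box
    population = [uniform_individual() for _ in range(min(n_random, pop_size))]
    while len(population) < pop_size:
        population.append([min(max(rng.gauss(b, sigma), lo), hi)
                           for b, (lo, hi) in zip(best, bounds)])
    return population

rng = random.Random(0)
box = [(-1.0, 1.0)] * 3
early = reinitialize(10, box, [0.5] * 3, 0, 2, 4, 0.1, rng)
late = reinitialize(10, box, [0.5] * 3, 5, 2, 4, 0.1, rng)
```

Keeping a few uniformly sampled individuals in the second mode preserves some global exploration while the Gaussian samples concentrate the search near the incumbent best.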
3.4. Boundary Constraint Handling
After mutation and crossover, each generated trial vector undergoes a boundary constraint check. If some variables of the trial vector are out of the boundary, a repair method that relies on a randomly generated real number is applied.
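The exact repair formula is not reproduced in this section, so the sketch below only shows one common choice that fits the description: each out-of-range variable is replaced by a uniform random value inside its bounds. Both the rule and the function name `repair` are assumptions, not the paper's method.

```python
import random

def repair(trial, bounds, rng):
    """Replace each out-of-range variable with a uniform sample in its bounds."""
    return [v if lo <= v <= hi else rng.uniform(lo, hi)
            for v, (lo, hi) in zip(trial, bounds)]

repaired = repair([0.5, 7.0, -3.0], [(0.0, 1.0)] * 3, random.Random(1))
```

Variables that already satisfy the bounds are left unchanged, so the repair does not disturb the feasible part of the trial vector.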
4. Experimental Results
In order to verify the performance of MDE, we select the 21 nonnoisy benchmark functions from CEC2005 special session on real-parameter optimization (excluding noisy functions , , , and ) since MDE has no ability to handle functions with noisy landscapes. The details about these functions can be found in . We compare MDE with six peer algorithms, including CLPSO , GL-25 , CMA-ES , LBBO , SFLSDE , and L-SHADE .
4.1. Experimental Setup
For each algorithm on each benchmark problem, we conduct 25 independent runs and limit each run to max function evaluations, where is the problem dimension (, , and ). The performance of the algorithms is evaluated in terms of function error value , defined as , where is the global optimum of the test function. The mean and the standard deviation of the function error values are recorded. The parameters of MDE are set as , , , , , and ; the mutation and crossover strategies are the same as those in . For the other six algorithms, we use the same parameter settings in their original papers.
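As a small illustration of the reporting described above, the following sketch computes the function error value and its mean and standard deviation over several runs. It assumes the usual CEC 2005 convention that the error is the best objective value found minus the objective value at the global optimum; the use of the sample standard deviation and the numbers in the usage lines are hypothetical.

```python
from statistics import mean, stdev

def error_value(f_found, f_optimum):
    """Function error: best value found minus the known optimal value."""
    return f_found - f_optimum

def summarize(final_values, f_optimum):
    """Mean and sample standard deviation of the error over independent runs."""
    errors = [error_value(v, f_optimum) for v in final_values]
    return mean(errors), stdev(errors)

# Hypothetical final objective values from three runs, optimum value -450
mean_error, std_error = summarize([-449.0, -448.0, -447.0], -450.0)
```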
4.2. Performance Criteria
To analyze the results effectively, two nonparametric statistical tests, as similarly done in [35, 36], are used in the experiments. (i) Wilcoxon’s signed-rank test is performed at a given significance level to test the statistical significance of the experimental results between two algorithms in both single-problem and multiproblem analyses. (ii) Friedman’s test is employed to obtain the average rankings of all the compared algorithms. Wilcoxon’s signed-rank test for the single-problem analysis is calculated with Matlab, while Wilcoxon’s signed-rank test for the multiproblem analysis and Friedman’s test are calculated with the KEEL software .
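The average rankings used in Friedman's test can be illustrated with a few lines of Python. This sketch only computes the rankings (rank 1 for the smallest error on each problem, ties sharing their mean rank) and not the test statistic; the function names and the small error table are hypothetical, and the paper itself relies on KEEL for these computations.

```python
def average_ranks(values):
    """Rank values ascending (1 = best); tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0   # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def friedman_average_rankings(errors_by_problem):
    """errors_by_problem[p][a] is the mean error of algorithm a on problem p."""
    n_alg = len(errors_by_problem[0])
    totals = [0.0] * n_alg
    for row in errors_by_problem:
        for a, r in enumerate(average_ranks(row)):
            totals[a] += r
    return [t / len(errors_by_problem) for t in totals]

# Hypothetical mean errors: 3 problems (rows) x 3 algorithms (columns)
table = [[0.1, 0.5, 0.3],
         [0.2, 0.2, 0.9],
         [0.0, 1.0, 2.0]]
rankings = friedman_average_rankings(table)
```

A lower average ranking indicates an algorithm that tends to achieve smaller errors across the benchmark set.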
4.3. Comparison between the Other Six Algorithms and MDE
Table 1 shows the results of MDE and the other six algorithms on the 10-dimensional benchmarks. MDE performs significantly better than CLPSO, GL-25, CMA-ES, LBBO, SFLSDE, and L-SHADE on 15, 16, 17, 7, 8, and 8 test functions, respectively, while CLPSO, GL-25, CMA-ES, LBBO, SFLSDE, and L-SHADE win on 4, 4, 3, 5, 8, and 8 test functions, respectively. MDE obtains results similar to those of the other six algorithms in 2, 1, 1, 9, 5, and 5 cases, respectively. Additionally, the results of the multiple-problem statistical analysis are shown in Table 4. MDE obtains higher $R^{+}$ values than $R^{-}$ values in all cases, where $R^{+}$ is the sum of ranks for the functions on which MDE outperformed the compared algorithm and $R^{-}$ is the sum of ranks for the opposite. According to Wilcoxon’s test at both significance levels considered, there are significant differences in three cases (MDE versus CLPSO, MDE versus GL-25, and MDE versus CMA-ES), which means that in those cases MDE is significantly better than CLPSO, GL-25, and CMA-ES. In addition, Friedman’s test is employed to evaluate the significant differences among all the compared algorithms. As shown in Figure 1, MDE obtains the second-best average ranking, while L-SHADE obtains the best average ranking on the 10-dimensional problems.
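The two rank sums compared in the multiple-problem Wilcoxon analysis can be computed as in the sketch below. It assumes the common convention in which problems with a zero difference have their ranks split evenly between the two sums; the function names and the error values in the usage lines are hypothetical and unrelated to Table 4.

```python
def rank_abs(diffs):
    """1-based ranks of |d|, with tied values sharing their mean rank."""
    abs_d = [abs(d) for d in diffs]
    sorted_d = sorted(abs_d)

    def mean_rank(v):
        first = sorted_d.index(v)     # 0-based position of the first tie
        count = sorted_d.count(v)
        return first + (count + 1) / 2.0

    return [mean_rank(v) for v in abs_d]

def wilcoxon_rank_sums(errors_a, errors_b):
    """Return (R+, R-) for algorithm A versus B; lower error is better."""
    diffs = [b - a for a, b in zip(errors_a, errors_b)]  # positive: A wins
    ranks = rank_abs(diffs)
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    tied = sum(r for d, r in zip(diffs, ranks) if d == 0)
    return r_plus + tied / 2.0, r_minus + tied / 2.0

# Hypothetical mean errors of two algorithms on four problems
r_plus, r_minus = wilcoxon_rank_sums([1.0, 2.0, 3.0, 4.0],
                                     [1.5, 1.0, 6.0, 4.0])
```

The two sums always add up to n(n + 1)/2 for n problems, so a large gap between them indicates that one algorithm wins both more often and by larger margins.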