Memetic Differential Evolution with an Improved Contraction Criterion
Memetic algorithms with an appropriate trade-off between the exploration and exploitation can obtain very good results in continuous optimization. In this paper, we present an improved memetic differential evolution algorithm for solving global optimization problems. The proposed approach, called memetic DE (MDE), hybridizes differential evolution (DE) with a local search (LS) operator and periodic reinitialization to balance the exploration and exploitation. A new contraction criterion, which is based on the improved maximum distance in objective space, is proposed to decide when the local search starts. The proposed algorithm is compared with six well-known evolutionary algorithms on twenty-one benchmark functions, and the experimental results are analyzed with two kinds of nonparametric statistical tests. Moreover, sensitivity analyses for parameters in MDE are also made. Experimental results have demonstrated the competitive performance of the proposed method with respect to the six compared algorithms.
In 1989, the name of “memetic algorithms” (MAs)  was introduced for the first time. In the last two decades, MAs gradually became one of the recent growing areas of research in evolutionary computation. They combine various evolutionary algorithms (EAs) with different LS methods to balance exploration and exploitation. Existing examples of memetic algorithms are NM-BRO , MA-LSCh-CMA , LBBO , IMMA , and MPSO . In the framework of MAs, LS operators are used to execute further exploitation for the individuals generated by common EA operations, which is helpful to enhance the EA’s capacity of solving complicated problems.
Differential evolution was first proposed by Storn and Price  in 1995 to solve global numerical optimization problems over continuous search spaces. It shares some similarities with other EAs. For example, DE works with a population of solutions, called vectors; it uses recombination and mutation operators to generate new vectors and, finally, it has a replacement process to discard the less fit vectors. DE represents solutions with real coding. Some of the differences with respect to other EAs are as follows: DE uses a special mutation operator based on the linear combination of three individuals and uses a uniform crossover operator. It has several attractive features. DE is relatively simple to implement and was demonstrated to be very effective on a large number of cases. In the past few decades, DE has been successfully used in many real-world applications, such as space trajectory design [8–10], hydrothermal optimization , underwater glider path planning , and vehicle routing problem .
Despite its successful applications to different classes of problems in different fields, DE was demonstrated to converge to a fixed point, a level set , or a hyperplane not containing the global optimum . Furthermore, in some cases it was shown to have slow local convergence.
In order to overcome these shortcomings, some authors have proposed a hybridization of DE with some local search heuristics. dos Santos Coelho and Mariani  proposed a version of memetic DE which combines DE with the generator of chaos sequences and sequential quadratic programming technique (DEC-SQP). In this memetic algorithm, DE with chaos sequences is the global optimizer and SQP is applied to the best individual to find the local minimum. Noman and Iba  proposed an adaptive hill-climbing crossover-based local search operation for enhancing the performance of standard differential evolution (DEahcSPX). Muelas et al.  developed MDE-DC which combines DE with multiple trajectory search algorithm (MTS). Neri and Tirronen  proposed the scale factor local search differential evolution (SFLSDE). In SFLSDE, golden section search and unidimensional hill-climb local search are applied to detect an optimal value of the scale factor and generate a higher quality offspring. Wang et al.  proposed an adaptive MA framework called DE-LS. In DE-LS, self-adaptive differential evolution (SaDE)  is the global search method, while covariance matrix adaptation evolution strategy (CMA-ES)  and self-adaptive mixed distribution based univariate EDA (MUEDA)  are employed as the local search methods. Vasile et al.  proposed an inflationary differential evolution algorithm (IDEA), which hybridizes DE with the restarting procedure of Monotonic Basin Hopping (MBH), to solve space trajectory optimization problems. Minisci and Vasile  and Di Carlo et al.  proposed an adaptive version of inflationary differential evolution algorithm (AIDEA) and a multipopulation version of AIDEA (MP-AIDEA) which automatically adapt the values of four control parameters. Locatelli et al.  proposed a memetic differential evolution for disk-packing and sphere-packing problems. In this algorithm, two kinds of local searches (MINOS and SNOPT) are used to detect local minima. Asafuddoula et al.  proposed an adaptive hybrid DE algorithm (AH-DEa) which has three features. The first is its use of adaptive crossover rates from a given set of discrete values. The second is an adaptive crossover strategy at different stages of the evolution. The last is the inclusion of a local search strategy to further improve the best solution. Qin et al.  proposed an advanced SaDE, which incorporates two different local search chains (Lamarckian and Baldwinian) into SaDE to enhance exploitation capability. Trivedi et al.  hybridized DE and GA to solve the unit commitment scheduling problems, in which GA was used to handle the binary unit commitment variables while DE was employed to optimize the continuous power dispatch related variables. In the same year, Li et al.  proposed a new hybridization, named DEEP, based on DE framework and the key features of CMA-ES, which generates a trial vector by first using a DE/rand/1/bin strategy followed by an Evolution Path (EP) mutation of CMA-ES.
The focus of this paper is to optimally combine DE global search operators with Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm to improve local search in continuous optimization. A new contraction criterion, which is based on the maximum distances in objective space and decision space, is proposed. When the contraction criterion is satisfied, BFGS starts from the best solution at the current generation. Furthermore, a restart mechanism is employed. If the best solution is not improved during the course of the local search, the population is reinitialized to increase the chance to find the global optimum.
The paper is organized as follows: DE is briefly introduced in Section 2. The proposed DE algorithm with local search and reinitialization is presented in Section 3. The design of the experiments, the results, and the corresponding discussions are included in Section 4. The last section, Section 5, is devoted to conclusions and the future work.
2. A Short Introduction to Differential Evolution
DE is a population-based stochastic parallel optimization method. Each vector (or individual) of the population at generation is called the target vector, and it will generate one offspring called the trial vector. For example, the th vector of the population will generate one trial vector . Trial vectors are generated by adding weighted difference vectors to the target vector. This process is referred to as the mutation operator where the target vector is mutated. A crossover step is then applied to produce an offspring which is only accepted if it improves on the fitness of the parent individual. Many variants of standard DE have been proposed, which use different learning strategies and/or recombination operations in the reproduction stage. A general DE variant may be recorded as DE/a/b/c, where “a” denotes the mutation strategy, “b” specifies the number of difference vectors used, and “c” specifies the crossover scheme which may be binomial or exponential. The DE/rand/1/exp is described in Algorithm 1.
3. Proposed Algorithm
In this section, we describe four major operations of the proposed MDE algorithm in detail, including contraction criterion, BFGS search, reinitialization scheme, and boundary constraint handling. The detailed description of MDE is given in Algorithm 2.
3.1. Contraction Criterion
In order to design an effective and efficient hybrid algorithm for global optimization, we need to take advantage of both the exploration capabilities of EA and the exploitation capabilities of LS and combine them in a well-balanced manner. To incorporate BFGS into DE successfully, a triggering condition, called contraction criterion, is needed to decide when the local search has to start. There are several kinds of methods to define a contraction criterion. Qin and Suganthan  applies local search method after a fixed number of generations (every 200 generations). Sun et al.  starts the LS if the promising solution is not updated in t-consecutive generations. Simon et al.  use the minimum fitness in the objective space as the contraction criterion; Di Carlo et al. [8–10] perform LS when the maximum distance in decision space is below a given threshold.
In MDE, we propose a new contraction criterion which combines two criteria: (a) is the improved maximum distance in objective space and (b) is the maximum distance in decision space. The idea of is derived from where is a measure of the diversity of the population in objective space.
The distance in decision space is defined aswhere is the Euclidean distance. is a measure of the diversity of the population in decision space.
3.2. BFGS Search
In MDE, the local search utilizes the better solutions obtained by the global search to update the population of MDE and thus enhances MDE’s exploitation ability to find the best solution. In MDE, we use the BFGS algorithm as the local search method. BFGS is one of the quasi-Newton methods which do not need the precise Hessian matrix and is able to approximate it based on the individual successive gradients. BFGS is considered as the most effective and popular quasi-Newton method and has been proven to have good performance even for nonsmooth optimizations. The details can be found in .
3.3. Reinitialization Scheme
If the best solution has not been improved after local search, a reinitialization of the whole population is used to give the algorithms more opportunities to find the global optimum. Simon et al.  proposed a partial reinitialization of the population. Every 20 generations, the algorithm selects the best individuals from a temporary population of individuals as the reinitialization pool. Sun et al.  chose the individuals, which have the largest distances from the local optima, from a temporary population to form the next population. Zamuda et al.  proposed a population size reduction method as the reinitialization scheme. In MDE, we apply a simple reinitialization scheme described in Algorithm 3. If the result of the local search does not improve the best individual in the population, a reinitialization of the population is triggered. A counter keeps track of the number of restarts. For , where is user-defined, individuals are generated randomly in the search space, drawing samples from a uniform distribution. For , individuals in the population are initialized randomly in the search space, while the rest are initialized by a normal distribution which takes the best individual as the center and as the standard deviation. Algorithm 3 summarises the reinitialization procedure. The function randreal draws samples from a uniform distribution while function Gaussian draws samples from a normal distribution and are the lower and upper boundaries on .
3.4. Boundary Constraint Handling
After mutation and crossover, each generated trial vector undergoes boundary constraint check. If some variables of are out of the boundary, a repair method is applied as follows:where can generate a random real number from .
4. Experimental Results
In order to verify the performance of MDE, we select the 21 nonnoisy benchmark functions from CEC2005 special session on real-parameter optimization (excluding noisy functions , , , and ) since MDE has no ability to handle functions with noisy landscapes. The details about these functions can be found in . We compare MDE with six peer algorithms, including CLPSO , GL-25 , CMA-ES , LBBO , SFLSDE , and L-SHADE .
4.1. Experimental Setup
For each algorithm on each benchmark problem, we conduct 25 independent runs and limit each run to max function evaluations, where is the problem dimension (, , and ). The performance of the algorithms is evaluated in terms of function error value , defined as , where is the global optimum of the test function. The mean and the standard deviation of the function error values are recorded. The parameters of MDE are set as , , , , , and ; the mutation and crossover strategies are the same as those in . For the other six algorithms, we use the same parameter settings in their original papers.
4.2. Performance Criteria
To effectively analyze the results, two nonparametric statistical tests, as similarly done in [35, 36], are used in the experiments. (i) Wilcoxon’s signed-rank test at is performed to test the statistical significance of the experimental results between two algorithms on both single-problem and multiproblem. (ii) Friedman’s test is employed to obtain the average rankings of all the compared algorithms. Wilcoxon’s signed-rank test on single-problem is calculated by Matlab, while Wilcoxon’s signed-rank test on multiproblem and Friedman test are calculated by the software of KEEL .
4.3. Comparison between the Other Six Algorithms and MDE
Table 1 shows the results of MDE and the other six algorithms on the 10-dimensional benchmarks. It can be seen that MDE performs significantly better than CLPSO, GL-25, CMA-ES, LBBO, SFLSDE, and L-SHADE on 15, 16, 17, 7, 8, and 8 test functions. And CLPSO, GL-25, CMA-ES, LBBO, SFLSDE, and L-SHADE win on 4, 4, 3, 5, 8, and 8 test functions, respectively. MDE obtains similar results with the other six algorithms in 2, 1, 1, 9, 5, and 5 cases. Additionally, the results of the multiple-problem statistical analysis are shown in Table 4. It can be seen that MDE can obtain higher values than values in all cases, where is the sum of ranks for the functions on which MDE outperformed the compared algorithm, and is the sum of ranks for the opposite . According to Wilcoxon’s test at and , there are significant differences in three cases (MDE versus CLPSO, MDE versus GL-25, and MDE versus CMA-ES), which means that in those cases MDE is significantly better than CLPSO, GL-25, and CMA-ES. In addition, Friedman’s test is employed to evaluate the significant differences of all the compared algorithms. As shown in Figure 1, MDE gets the second average ranking value, while L-SHADE gets the first average ranking values on the 10-dimensional problems.
Table 2 shows that MDE performs significantly better than the other six compared algorithms in the majority of the test functions. For example, MDE wins in 12 cases, only loses in 3 cases, and ties in 6 cases, compared with SFLSDE. We can also find that MDE obtains much better solutions than LBBO and CLPSO. As for L-SHADE, MDE wins in 8 cases and loses in 10 cases. According to the results of the multiple-problem statistical analysis shown in Table 5, it can be seen that MDE can obtain higher values than values in all cases. According to Wilcoxon’s test at and , there are significant differences in four cases (MDE versus CLPSO, MDE versus GL-25, MDE versus CMA-ES, and MDE versus SFLSDE), which means that in those cases MDE is significantly better than CLPSO, GL-25, CMA-ES, and SFLSDE. And from Table 5, we can find that L-SHADE and MDE have comparable results. Moreover, Figure 2 shows that MDE performs the first average ranking value and L-SHADE obtains the second average ranking values on the 30-dimensional problems by Friedman’s test.
Table 3 shows that MDE also performs significantly better than five compared algorithms in the majority of the test functions except L-SHADE. For example, MDE wins in 15 cases, only loses in 3 cases, and ties in 3 cases, compared with LBBO. When comparing with L-SHADE, MDE wins in 8 cases and loses in 12 cases. Table 6 shows that MDE can perform higher values than values in five cases. According to Wilcoxon’s test at and , there are significant differences in two cases (MDE versus GL-25 and MDE versus LBBO), which means that in those cases MDE is significantly better than GL-25 and LBBO. Figure 3 shows that L-SHADE obtains better average ranking values than the other six algorithms on 50-dimensional problems by Friedman’s test.
In general, according to the analysis above, we can conclude that MDE and L-SHADE have better average rankings among the seven algorithms on 21 benchmark problems for all of three different dimensions. The performance of MDE is comparable to L-SHADE on 10- and 30-dimensional problems, while L-SHADE is better than MDE on 50-dimensional problems because of the larger initial population and linear population size reduction.
4.4. Influence of Contraction Criterion
In the previous experiments the recommended initial and are used. In order to test the influence of different initial contraction criterion values to the enhanced performance of MDE, in this section, MDE is tested with different initial and values. The initial values are set as and . All other parameters are not changed as described in Section 4.1. Nine groups of experiments with different combinations between and are done. means the value of parameters and . The statistical results by Friedman’s test with all initial values are shown in Table 7.
From Table 7, MDE owes the best average ranking value at than the other 8 groups on both 10-dimensional and 30-dimensional test functions. On 50-dimensional test functions, is the better choice to MDE. In general, we can conclude that it is better to set smaller contraction criterion values as the dimension of test functions increases.
4.5. Influence of Parameter
The experiment is to test the influence of parameter in MDE. Friedman’s test results are shown in Table 8, where the values of are set as in Table 8. All other parameters are kept unchanged as described in Section 4.1. In addition, all experiments are conducted for 25 independent runs for each function.
It can be seen from Table 8 that MDE with gets the better average ranking value than the other six cases at . At , is the best choice and is the second best choice to MDE. On 50-dimensional test functions, is the better choice parameter to MDE. Generally speaking, the small value such as 3 or 5 is good to enhance the performance of the MDE algorithm.
4.6. Influence of Population Size
To analyze the influence of the population size , different values of are tested in a set of experiments. Friedman’s test results are shown in Table 9, where the values of are set as . All other parameters are kept unchanged as described in Section 4.1.
From Table 9, MDE with ranks the first both at and , while MDE with performs the best at . From the results, it can be concluded that, as the dimension increases, a properly increased population size can enhance the search capability of MDE.
A memetic differential evolution, MDE, has been introduced in this paper. MDE uses a new contraction criterion to decide when the local search starts. In addition, MDE includes the global and local search operators, along with reinitialization, to improve performance.
To evaluate the performance of MDE, 21 benchmark functions with different characteristics are chosen for test. The results show that (i) MDE can obtain better or at least comparable results, compared with the other six algorithms; (ii) small contraction criterion values and can enhance the performance of MDE in terms of the quality of the final results; (iii) a large population size is good to MDE as the dimension increases.
In this paper, some preliminary experiments have been performed to verify its effect on the performance of MDE. In our future work, MDE will be tested on some real-world applications problems. Moreover, we believe that some other local search algorithms and adaptive population size strategy can also be used in MDE.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The authors sincerely thank Dr. Simon D. for making the source code of LBBO available online, Dr. Hansen N. for making the source code of CMA-ES available online, and Dr. Ryoji Tanabe for making the source code of L-SHADE available online. This work was supported by the National Natural Science Foundation of China under Grant nos. 61103144, 61472375, and 41571403, by the China Postdoctoral Science Foundation under Grant no. 2012M511301, by Joint Funds of Equipment Pre-Research and Ministry of Education of China no. 6141A02022320, by the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) under Grant no. cug160207, and by the Laboratory Opening Fund of China University of Geosciences (Wuhan) under Grant no. SKJ2015222.
P. Moscato, “On evolution, search, optimization, genetic algorithms and martial arts—towards memetic algorithms,” Technical Report Caltech Concurrent Computation Program 826, California Institute of Technology, Pasadena, Calif, USA, 1989.View at: Google Scholar
M. Di Carlo, M. Vasile, and E. Minisci, “Multi-population adaptive inflationary differential evolution,” in Proceedings of the Student Workshop on Bio-inspired Optimization Methods and their Applications (BIOMA '14), pp. 41–54, 2014.View at: Google Scholar
Y. Wang and B. Li, “A self-adaptive mixed distribution based uni-variate estimation of distribution algorithm for large scale global optimization,” in Nature-Inspired Algorithms for Optimisation, vol. 193 of Studies in Computational Intelligence, pp. 171–198, Springer, Berlin, Germany, 2009.View at: Publisher Site | Google Scholar
X. Sun and Z.-Q. Wu, “Fractal and multifractal description of surface topography,” Acta Physica Sinica, vol. 50, no. 11, pp. 2130–2131, 2001.View at: Google Scholar
R. Ros, “Benchmarking the bfgs algorithm on the bbob-2009 function testbed,” in Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, pp. 2409–2414, 2009.View at: Google Scholar
A. Zamuda, J. Brest, and E. Mezura-Montes, “Structured population size reduction differential evolution with multiple mutation strategies on CEC 2013 real parameter optimization,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '13), pp. 1925–1931, IEEE, June 2013.View at: Publisher Site | Google Scholar
P. N. Suganthan, “Problem definitions and evaluation criteria for the cec2005 special session on real-parameter optimization,” Tech. Rep., Nanyang Technological University, Singapore, 2005.View at: Google Scholar
S. García, D. Molina, M. Lozano, and F. Herrera, “A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour: a case study on the CEC'2005 Special Session on Real Parameter Optimization,” Journal of Heuristics, vol. 15, no. 6, pp. 617–644, 2009.View at: Publisher Site | Google Scholar