Research Article | Open Access
A Convergent Differential Evolution Algorithm with Hidden Adaptation Selection for Engineering Optimization
Over more than a decade, many improved differential evolution (DE) algorithms have emerged as a very competitive class of evolutionary computation methods. However, few improved DE algorithms guarantee global convergence in theory. This paper develops a theoretically convergent DE algorithm that employs a self-adaptation scheme for the parameters and two operators, that is, the uniform mutation and hidden adaptation selection (haS) operators. The parameter self-adaptation and the uniform mutation operator enhance the diversity of populations and guarantee ergodicity. The haS operator automatically removes some of the inferior individuals that are produced while population diversity is being enhanced. It does so by making the algorithm break the loop of the current generation with a small probability. The breaking probability is a hidden adaptation: it changes in proportion to the number of inferior individuals. The proposed algorithm is tested on ten engineering optimization problems taken from IEEE CEC2011.
Differential evolution (DE) is a population-based stochastic real-parameter algorithm for continuous optimization problems, first introduced in [1, 2]. Recently, several detailed surveys of advances in DE have been conducted (e.g., [3, 4]). These, along with the competitions in the 1996–2012 IEEE International Conferences on Evolutionary Computation (CEC), showed that DE is one of the most powerful stochastic optimizers. In the last two decades, many outstanding DE variants have been proposed. Few of these algorithms are based on convergence theory; therefore few of them guarantee, as the elitist genetic algorithm does, convergence in theory for any continuous optimization problem regardless of the initial population (in this paper, “converge,” “convergent,” and “convergence” always mean convergence in probability). However, theoretical studies of stochastic algorithms are attracting the attention of a growing number of researchers. IEEE CEC 2013 held a special session focusing on the theoretical foundations of bioinspired computation. Aiding algorithm design is one of the most important purposes of theoretical studies.
In research on the convergence theory of DE, two trends are gradually becoming apparent. One is designing improved DE algorithms based on theoretical foundations. Reference  performed a mathematical modelling and convergence analysis of continuous multiobjective differential evolution (MODE) under certain simplifying assumptions. This work was extended by . Reference  proposed a differential evolution Markov chain algorithm (DE-MC) and proved that its population sequence has a unique joint stationary distribution. Reference  presented a convergent DE using a hybrid optimization strategy and a transform function and proved its convergence by the Markov process. Reference  presented a DE-RW algorithm that applied a random-walk mechanism to the basic DE variants (the convergence of DE-RW was not proved, but it can be easily proved by Theorem 2 in Section 4). The other trend involves the study of convergence theory as an aid to DE algorithm design. Reference  established the asymptotic convergence behaviour of a basic DE (DE/rand/1/bin) by applying the concepts of Lyapunov stability theorems. The analysis is based on the assumption that the objective function has the following two properties: (1) it has a continuous second-order derivative in the search space, and (2) it possesses a unique global optimum within the range of search. These studies show that enhancing population diversity is an efficient route to the development of globally convergent DE algorithms. However, a difficult task associated with increased diversity is how to remove the extra inferior individuals generated during the evolving process.
This paper proposes a self-adaptive convergent DE algorithm with a hidden adaptation selection, named SaCDEhaS, and proves that SaCDEhaS guarantees global convergence in probability. The SaCDEhaS algorithm is formed by integrating the basic DE with two extra operators, that is, a uniform mutation operator and a hidden adaptation selection (haS) operator. The self-adaptation of the parameters and the use of the uniform mutation operator both enhance the diversity of populations. The proposed haS operator automatically eliminates some of the inferior individuals that are generated as a side effect of enhancing the population diversity. Experimental results and comparison studies on all the bound-constrained and unconstrained optimization problems posed by the CEC 2011 continuous benchmark functions for engineering optimization  show that (1) the uniform mutation and haS operators improve algorithm performance, and (2) SaCDEhaS is competitive with the top four DE variants in the CEC 2011 competition.
The rest of this paper is organized as follows. Section 2 briefly introduces the basic DE algorithm. Section 3 describes in detail the proposed algorithm. Section 4 gives a sufficient condition for the convergence of modified DE and proves the global convergence of the proposed algorithm. Numerical experiments are then presented in Section 5, followed by conclusions in Section 6.
2. Basic Differential Evolution Algorithm
DE is arguably one of the most powerful stochastic real-parameter optimization algorithms, and it is used for continuous optimization problems [12, 13]. This paper supposes that the objective function to be minimized is $f(\vec{x})$, $\vec{x} = (x_1, \ldots, x_D) \in \mathbb{R}^D$, and the feasible solution space is $\Psi = \prod_{j=1}^{D} [L_j, U_j]$, where $L_j$ and $U_j$ are the lower and upper boundary values of $x_j$, respectively. After initialization, the basic DE works through a simple cycle of mutation, crossover, and selection operators. There are several main variants of DE [1, 2]. This paper uses the DE/rand/1/bin strategy, which is the one most commonly used in practice. It can be described in detail as follows.
Initialization. Like any other evolutionary algorithm, DE starts by initializing a population of $N$ $D$-dimensional vectors representing the potential solutions (individuals) over the optimization search space. We denote each individual by $\vec{x}_i^t = (x_{i,1}^t, x_{i,2}^t, \ldots, x_{i,D}^t)$, for $i = 1, \ldots, N$, where $t = 0, 1, \ldots, t_{\max}$ is the current generation and $t_{\max}$ is the maximum number of generations. The initial population (at $t = 0$) should cover the search space as much as possible. Generally, the population is initialized by generating uniformly random vectors within the search space. We can initialize the $j$th dimension of the $i$th individual according to
$$x_{i,j}^0 = L_j + \mathrm{rand}(0,1) \cdot (U_j - L_j), \tag{1}$$
where $L_j$ and $U_j$ are the lower and upper bounds of the $j$th dimension and $\mathrm{rand}(0,1)$ is a uniformly distributed random number defined in $[0, 1]$ (the same below).
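The initialization step can be sketched in a few lines of Python. This is only an illustration: the function name `init_population`, the example bounds, and the population size are assumptions made here, not values from the paper.

```python
import random

def init_population(lower, upper, pop_size, rng=random):
    """Sample pop_size vectors uniformly inside the box [lower[j], upper[j]]."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]

# Hypothetical 3-dimensional search space, used only for illustration.
lower, upper = [-5.0, 0.0, 10.0], [5.0, 1.0, 20.0]
population = init_population(lower, upper, pop_size=6, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps runs reproducible, which matters when results are averaged over independent runs.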
Mutation (DE/rand/1/*). For each target vector $\vec{x}_i^t$, DE creates a donor vector $\vec{v}_i^t$ by the mutation operator. The mutation operator of DE/rand/1/* can be formulated as follows:
$$\vec{v}_i^t = \vec{x}_{r_1}^t + F \cdot (\vec{x}_{r_2}^t - \vec{x}_{r_3}^t). \tag{2}$$
Here the indices $r_1$, $r_2$, and $r_3$ are uniformly random integers that are mutually different and distinct from the loop index $i$, and $F$ is a real parameter called the mutation or scaling factor.
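As a minimal sketch of the DE/rand/1 mutation described above, the following Python function builds one donor vector. The function name `mutate_rand_1` and the list-of-lists population layout are assumptions for illustration; the population must hold at least four individuals so that three distinct partners exist.

```python
import random

def mutate_rand_1(population, i, F, rng=random):
    """Donor for target i: x_r1 + F * (x_r2 - x_r3), with r1, r2, r3 distinct and != i."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)  # three mutually different indices
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]
```

With `F = 0` the donor collapses to a copy of one randomly chosen individual, which is a convenient sanity check on the index handling.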
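If an element value of the donor vector $\vec{v}_i^t$ exceeds the prespecified upper bound $U_j$ or lower bound $L_j$, we can change it by the periodic mode rule as follows:
$$v_{i,j}^t = \begin{cases} U_j - \left[(L_j - v_{i,j}^t) \bmod (U_j - L_j)\right], & \text{if } v_{i,j}^t < L_j, \\ L_j + \left[(v_{i,j}^t - U_j) \bmod (U_j - L_j)\right], & \text{if } v_{i,j}^t > U_j. \end{cases} \tag{3}$$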
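One way to read the periodic mode rule is as a wrap-around: the amount by which a value overshoots a bound is reduced modulo the interval width and measured from the opposite bound. The sketch below assumes that reading; the function name `wrap_periodic` is hypothetical, and `hi > lo` is required.

```python
def wrap_periodic(v, lo, hi):
    """Map an out-of-range value back into [lo, hi] by wrapping around the interval."""
    width = hi - lo
    if v < lo:
        return hi - (lo - v) % width   # undershoot re-enters from the upper bound
    if v > hi:
        return lo + (v - hi) % width   # overshoot re-enters from the lower bound
    return v                           # already feasible: unchanged
```

Unlike clipping to the nearest bound, wrapping does not pile repaired values up on the boundary of the search space.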
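Crossover (DE/*/bin). This paper uses the binomial crossover operator, which generates the trial vector $\vec{u}_i^t$ by mixing elements of the donor vector $\vec{v}_i^t$ with the target vector $\vec{x}_i^t$ as follows:
$$u_{i,j}^t = \begin{cases} v_{i,j}^t, & \text{if } \mathrm{rand}(0,1) \le CR \text{ or } j = j_{rand}, \\ x_{i,j}^t, & \text{otherwise.} \end{cases} \tag{4}$$
Here $CR \in [0, 1]$ is the probability of crossover and $j_{rand}$ is a random integer in $[1, D]$.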
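A short Python sketch of binomial crossover follows. The function name `binomial_crossover` is an assumption, indices are 0-based, and the strict comparison `<` is used in place of a non-strict one, which differs only on a probability-zero event.

```python
import random

def binomial_crossover(target, donor, CR, rng=random):
    """Take each element from the donor with probability CR; position j_rand always comes from the donor."""
    D = len(target)
    j_rand = rng.randrange(D)  # guarantees at least one donor element in the trial vector
    return [
        donor[j] if (rng.random() < CR or j == j_rand) else target[j]
        for j in range(D)
    ]
```

The forced position `j_rand` is what prevents the trial vector from being an exact copy of the target when `CR` is small.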
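Selection. The selection operator determines which of the target vector and the trial vector survives to the next generation. The selection operator for minimization problems can be formulated as
$$\vec{x}_i^{t+1} = \begin{cases} \vec{u}_i^t, & \text{if } f(\vec{u}_i^t) < f(\vec{x}_i^t), \\ \vec{x}_i^t, & \text{otherwise.} \end{cases} \tag{5}$$
This means that $\vec{x}_i^{t+1}$ equals the trial vector $\vec{u}_i^t$ if and only if $f(\vec{u}_i^t)$ is a better cost function value than $f(\vec{x}_i^t)$; otherwise, the parent individual $\vec{x}_i^t$ is retained in the next generation.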
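The greedy selection rule amounts to a single comparison. In the sketch below, the function name `greedy_select` and the sphere objective are illustrative assumptions.

```python
def greedy_select(target, trial, f):
    """Keep the trial vector only if it has a strictly better (lower) cost than the target."""
    return trial if f(trial) < f(target) else target

def sphere(x):
    """Illustrative objective, not one of the benchmark problems."""
    return sum(v * v for v in x)

survivor = greedy_select([2.0, 2.0], [1.0, 0.5], sphere)
```

Because the target is never replaced by a worse vector, the best cost in the population cannot increase from one generation to the next.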
3. Convergent DE Algorithm with Hidden Adaptation Selection
The proposed SaCDEhaS algorithm is formed by integrating the basic DE with a self-adaptive parameter control strategy and an extra trial vector generation strategy, that is, the uniform mutation operator, and using the haS operator instead of the selection operator of the basic DE.
3.1. Self-Adaptive Parameter Control Strategy
The parameter control strategies of DE have been extensively investigated over the past decade, and some developments have been reported. Generally, we distinguish between two forms of setting parameter values: parameter tuning and parameter control. The former means that the user must find suitable values for the parameters by tuning the algorithm and then run the algorithm with those fixed parameters. Reference  indicated that a reasonable value for the population size $N$ could be chosen between $5D$ and $10D$ and that the effective range of the scaling factor $F$ is usually between 0.4 and 1. A value of $CR = 0.9$ for the crossover probability has been found to work well across a large range of problems .
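The latter, that is, parameter control, means that the values of the parameters are changed during the run. References [15, 16] categorised the change into three classes: deterministic parameter control, adaptive parameter control, and self-adaptive parameter control. Reference  presented a randomly self-adaptive parameter control strategy; their experimental results showed that the SaDE algorithm with the randomly self-adaptive parameter control strategy was better than, or at least comparable to, the basic DE algorithm and several competitive evolutionary algorithms reported in the literature. In particular, the self-adaptive strategy does not increase the time complexity compared to the basic DE algorithm. The parameter control strategy is formulated as follows:
$$F^{t+1} = \begin{cases} F_l + \mathrm{rand}_1 \cdot F_u, & \text{if } \mathrm{rand}_2 < \tau_1, \\ F^t, & \text{otherwise,} \end{cases} \tag{6}$$
$$CR^{t+1} = \begin{cases} \mathrm{rand}_3, & \text{if } \mathrm{rand}_4 < \tau_2, \\ CR^t, & \text{otherwise.} \end{cases} \tag{7}$$
Here $\mathrm{rand}_1, \ldots, \mathrm{rand}_4$ are uniformly distributed random numbers in $[0, 1]$, and $F_l$ and $F_u$ are the lower and upper limits of the scaling factor $F$; both lie in $[0, 1]$. $\tau_1$ and $\tau_2$ are two new parameters. Reference  used $\tau_1 = \tau_2 = 0.1$, $F_l = 0.1$, and $F_u = 0.9$. Then the parameter $F$ belongs in $[0.1, 1.0]$ while the crossover probability $CR$ belongs in $[0, 1]$.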
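The randomly self-adaptive update can be sketched as follows. The function name `self_adapt` is hypothetical, and the default values `F_l = 0.1`, `F_u = 0.9`, and `tau1 = tau2 = 0.1` are assumed here as commonly used settings for this kind of rule rather than taken from the text.

```python
import random

def self_adapt(F, CR, F_l=0.1, F_u=0.9, tau1=0.1, tau2=0.1, rng=random):
    """With probability tau1 resample F in [F_l, F_l + F_u]; with probability tau2 resample CR in [0, 1]."""
    if rng.random() < tau1:
        F = F_l + rng.random() * F_u
    if rng.random() < tau2:
        CR = rng.random()
    return F, CR
```

Most calls return the parameters unchanged, so good values persist for several generations while poor ones are eventually replaced.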
In short, a good parameter setting or a good parameter control strategy can significantly benefit the performance of DE. Combining the merits of parameter tuning and parameter control, the proposed SaCDEhaS algorithm initializes the scaling factor $F$ and the crossover probability $CR$ with fixed, tuned values and then employs the randomly self-adaptive strategies (6) and (7) to change them, respectively. For SaCDEhaS, a good parameter setting during the initial phase benefits the convergence rate, while applying the randomly self-adaptive strategies during the middle and later phases helps to maintain diversity.
3.2. Trial Vector Generation Strategies
3.2.1. Uniform Mutation Operator
After the classical DE crossover operator, SaCDEhaS creates a variation vector $\vec{w}_i^t$ corresponding to each trial vector $\vec{u}_i^t$ by the uniform mutation operator. Uniform mutation runs independently on each element of each trial vector. Each element is replaced by a randomly generated feasible value with an auxiliary convergence probability ($P_{ac}$), which is a control parameter taking a small value. The description of uniform mutation is as follows:
$$w_{i,j}^t = \begin{cases} L_j + \mathrm{rand}(0,1) \cdot (U_j - L_j), & \text{if } \mathrm{rand}(0,1) \le P_{ac}, \\ u_{i,j}^t, & \text{otherwise.} \end{cases} \tag{8}$$
Here $j = 1, \ldots, D$, and $L_j$ and $U_j$ are the lower and upper boundary values of $x_j$, respectively. From the formula of the uniform mutation, the variation vector $\vec{w}_i^t$ equals the trial vector $\vec{u}_i^t$ with probability $(1 - P_{ac})^D$ and equals a uniformly distributed random vector in the feasible region with probability $P_{ac}^D$.
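The uniform mutation operator can be sketched element by element. The function name `uniform_mutation` and the argument name `p_ac` (standing for the auxiliary convergence probability) are assumptions for illustration.

```python
import random

def uniform_mutation(trial, lower, upper, p_ac, rng=random):
    """Independently replace each element by a uniform draw from [lower[j], upper[j]] with probability p_ac."""
    return [
        lo + rng.random() * (hi - lo) if rng.random() < p_ac else u
        for u, lo, hi in zip(trial, lower, upper)
    ]
```

Since every element can be resampled, every point of the box has a positive chance of being produced, however small `p_ac` is; this is the ergodicity property that the convergence argument relies on.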
3.2.2. Hidden Adaptation Selection Operator
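After the uniform mutation operator, SaCDEhaS applies the following hidden adaptation selection (haS) operator instead of the selection operator of the basic DE. The haS operator either determines which of the target vector $\vec{x}_i^t$ and the variation vector $\vec{w}_i^t$ survives to the next generation or breaks the current loop with the probability $P_{ac}$, the auxiliary convergence probability. We formulate the haS operator as follows:
$$\vec{x}_i^{t+1} = \begin{cases} \vec{w}_i^t, & \text{if } f(\vec{w}_i^t) < f(\vec{x}_i^t), \\ \vec{x}_i^t, & \text{if } f(\vec{w}_i^t) \ge f(\vec{x}_i^t) \text{ and } \mathrm{rand}(0,1) > P_{ac}, \\ \text{break}, & \text{otherwise.} \end{cases} \tag{9}$$
Here “break” denotes an execution breaking out of the loop over $i$ from $1$ to $N$.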
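A single haS decision can be sketched as a function that returns the survivor together with a flag telling the caller to leave the loop over the population. The function name `has_select`, the return layout, and the argument name `p_break` are assumptions for illustration; the sketch also assumes that the break is only ever triggered by an inferior variation vector.

```python
import random

def has_select(target, f_target, variation, f_variation, p_break, rng=random):
    """Return (survivor, survivor_cost, stop); stop=True asks the caller to break the generation loop."""
    if f_variation < f_target:
        return variation, f_variation, False   # better variation vector is accepted greedily
    stop = rng.random() < p_break              # an inferior vector may trigger the break
    return target, f_target, stop              # the target survives in both remaining cases
```

A caller would apply this inside its loop over individuals and `break` when `stop` is true, leaving the remaining targets untouched and unevaluated for that generation.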
3.3. Pseudocode and Observation
The pseudocode of SaCDEhaS algorithm (SaCDEhaS/rand/1/bin) is shown in Algorithm 1.
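For orientation, the overall flow can be sketched in Python. This is a reading of the textual description, not the authors' Algorithm 1. Several details are assumptions made here: the function name `sacdehas`, the synchronous population update, the initial values `F = 0.5` and `CR = 0.9`, the self-adaptation constants 0.1 and 0.9, a wrap-around bound repair, and the use of `p_ac` both as the uniform mutation probability and as the break probability.

```python
import random

def sacdehas(f, lower, upper, pop_size=20, max_fes=3000, p_ac=0.01,
             F=0.5, CR=0.9, seed=None):
    """Sketch of a DE/rand/1/bin loop with uniform mutation and a probabilistic break."""
    rng = random.Random(seed)
    D = len(lower)
    width = [hi - lo for lo, hi in zip(lower, upper)]  # assumes hi > lo in every dimension

    def uniform(j):
        return lower[j] + rng.random() * width[j]

    def wrap(v, j):  # periodic (wrap-around) repair of out-of-range values
        if v < lower[j]:
            return upper[j] - (lower[j] - v) % width[j]
        if v > upper[j]:
            return lower[j] + (v - upper[j]) % width[j]
        return v

    pop = [[uniform(j) for j in range(D)] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    fes = pop_size
    while fes < max_fes:
        # Self-adaptive control: each parameter is regenerated with probability 0.1.
        if rng.random() < 0.1:
            F = 0.1 + rng.random() * 0.9
        if rng.random() < 0.1:
            CR = rng.random()
        new_pop, new_cost = list(pop), list(cost)
        for i in range(pop_size):
            if fes >= max_fes:
                break
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(D)
            w = []
            for j in range(D):
                if rng.random() < CR or j == j_rand:
                    u = wrap(pop[r1][j] + F * (pop[r2][j] - pop[r3][j]), j)
                else:
                    u = pop[i][j]
                # Uniform mutation: resample this element with probability p_ac.
                w.append(uniform(j) if rng.random() < p_ac else u)
            fw = f(w)
            fes += 1
            if fw < cost[i]:
                new_pop[i], new_cost[i] = w, fw
            elif rng.random() < p_ac:
                break  # haS: leave the generation; remaining targets carry over unevaluated
        pop, cost = new_pop, new_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

A call such as `sacdehas(lambda x: sum(v * v for v in x), [-5.0] * 3, [5.0] * 3, seed=1)` returns the best vector found and its cost. The budget is counted in function evaluations, so a generation cut short by the break consumes fewer evaluations than a full one.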
Observations of SaCDEhaS. (i) How to achieve the convergence: the uniform mutation operator makes each generation population ergodic, which assists the algorithm with elitist selection to reach convergence. The theoretical proof of convergence for SaCDEhaS will be shown in the following section.
(ii) How to understand haS operator: the haS operator has the following characteristics.
(1) The “otherwise” in expression (9) means $f(\vec{w}_i^t) \ge f(\vec{x}_i^t)$ and $\mathrm{rand}(0,1) \le P_{ac}$, where $\vec{w}_i^t$ is the variation vector, $\vec{x}_i^t$ is the target vector, and $P_{ac}$ is the auxiliary convergence probability. That is to say, for every inferior variation (trial) solution $\vec{w}_i^t$, the haS operator forces SaCDEhaS to break the current loop with the low probability $P_{ac}$. So the actual probability of breaking the current loop is higher if there are more inferior solutions in the variation population. In other words, the actual probability is a hidden adaptation and is proportional to the number of inferior solutions generated for each target population.
(2) The haS operator retains the current best solution of the target population for the next generation. In fact, the better individuals whose function values have been calculated are kept greedily for the next generation according to the first two formulas in expression (9). The current trial individuals that are abandoned by the “break” strategy in the haS operator incur no computing overhead, because their function values are never calculated during the program’s execution, and the corresponding target individuals are kept for the next generation regardless of their function values. That is to say, the individual that is confirmed as the current best solution by calculating function values must survive to the next generation.
(3) Since the parameters are regenerated by formulas (6) and (7) after the “break” strategy, one can think of the haS as a strategy that triggers breaking the current loop and regenerating new parameters. The triggering time divides the individuals of a target population into two parts, that is, the previous individuals and the later individuals. Obviously, the previous individuals in a population have a greater probability of being updated than the later individuals. In fact, the previous individuals serve two purposes: they provide the learning information that determines whether or not the current loop is stopped, and they are candidate solutions that benefit the algorithm’s search. Unlike a simple strategy of regenerating new parameters, the haS operator does not abandon the previous individuals. This can speed up convergence without extra computing overhead.
(iii) How to achieve the tradeoff between exploration and exploitation: obviously, population diversity is enhanced by using the uniform mutation operator and the self-adaptive technique for the parameters. At the same time, however, more inferior solutions may be generated. SaCDEhaS employs the proposed haS operator to reduce the negative influence of enhancing population diversity, which promotes the balance between exploration and exploitation to some degree. In fact, the “break” strategy in the haS operator is designed around the randomly self-adaptive strategy. If the randomly generated parameters are poor and generate many inferior solutions, the loop has a higher probability of being stopped, and new parameters are randomly generated in the next generation. In this process, the information from the previous individuals of a population is used to determine whether the loop continues or breaks.
(iv) How to estimate the computing overhead: compared with the basic DE, SaCDEhaS has extra computing overhead for generating random values in three operators, that is, in the self-adaptive parameter control strategy, in the uniform mutation operator, and in the haS operator. However, generating random values costs less than an objective function evaluation. For this reason,  suggests that algorithms estimate their computing overhead by the number of function evaluations (FEs). As shown in Table 2, SaCDEhaS converges faster than SaDE within the same FEs.
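The claim in observation (ii) that the actual break probability grows with the number of inferior solutions can be made explicit with a short calculation. The symbols $p$ and $m_t$ are introduced here for illustration only: $p$ is the small per-individual break probability, and $m_t$ is the number of individuals in generation $t$ whose variation vectors would be inferior. The calculation assumes that each inferior vector triggers the break independently.

```latex
P\{\text{break during generation } t\}
  = 1 - (1 - p)^{m_t}
  \approx m_t \, p \qquad (m_t \, p \ll 1)
```

The loop survives a generation only if none of the $m_t$ inferior vectors triggers the break, which gives the factor $(1 - p)^{m_t}$; for small $p$ the result is approximately linear in $m_t$.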
4. Proof of Global Convergence for the Proposed Algorithm
Different definitions of the convergence exist for analysing asymptotic convergence of algorithms. The following definition of convergence, that is, convergence in probability, is used in this paper.
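Definition 1. Let $\{X^t, t = 0, 1, 2, \ldots\}$ be a population sequence associated with a random algorithm. The algorithm has global convergence in probability for a certain optimization problem if and only if
$$\lim_{t \to \infty} P\{X^t \cap B_\delta^* \neq \emptyset\} = 1,$$
where $\delta$ is a small positive real, $B_\delta^*$ denotes an expanded optimal solution set, $B_\delta^* = \{\vec{x} \in \Psi : |f(\vec{x}) - f(\vec{x}^*)| < \delta\}$, and $\vec{x}^*$ is an optimum of the objective function $f(\vec{x})$.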
Several important theorems for the global convergence of evolutionary algorithms (EAs) have been presented. Rudolph  generalized convergence conditions for binary and Euclidean search space to a general search space. Under the convergence condition, the EAs with an elitist selection strategy converge to the global optimum. The measure associated with a Markovian kernel function, which needs to be calculated in the convergence condition, seems not to be very convenient. He and Yu  introduced several convergence conditions for EAs. The convergence conditions are based on certain probability integral of the offspring entering the optimal set. Perhaps, the most convenient theorem for proving the global convergence of DE variants is the one recently presented by Hu et al. . It just needs to check whether or not the probability of the offspring in any subsequence population entering the optimum solution set is big enough. The theorem can be described in detail as follows.
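Theorem 2 (see ). Let $\{X^t, t = 0, 1, 2, \ldots\}$ be a population sequence of a DE variant with a greedy selection operator. If, in the target population $X^{t_k}$, there exists at least one individual $\vec{x}$, which corresponds to the trial individual $\vec{u}$, such that
$$P\{\vec{u} \in B_\delta^*\} \ge \zeta(t_k) > 0$$
and the series $\sum_{k=1}^{\infty} \zeta(t_k)$ diverges, then the DE variant holds global convergence.
Here $\{t_k, k = 1, 2, \ldots\}$ denotes any subsequence of the natural number set, $P\{\vec{u} \in B_\delta^*\}$ is the probability of $\vec{u}$ locating in the optimal solution set $B_\delta^*$, and $\zeta(t_k)$ is a small positive real depending on $t_k$.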
The series diverging means that the probability is large enough. That is to say, if the probability of entering into the optimal set, in a certain subsequence population, is large enough, the DE variant with elitist selection holds global convergence.
Conclusion. SaCDEhaS converges to the global optima set of continuous optimization problems, regardless of the initial population distribution.
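Proof. By Theorem 2, it suffices to prove that SaCDEhaS has the following two characteristics.
(i) The Selection Operator of SaCDEhaS Is Greedy. The SaCDEhaS algorithm uses the haS selection operator. According to characteristic (2) under “How to understand haS operator” in Section 3, the haS selection operator greedily retains the current best solution for the next generation.
(ii) The Probability of Trial Individuals Entering into the Optimal Solution Set Is Large Enough. According to formula (8), the probability that the uniform mutation operator generates a uniformly distributed random vector equals $P_{ac}^D$, where $P_{ac}$ is the auxiliary convergence probability and $D$ is the dimension. So the probability
$$P\{\vec{w}_i^t \in B_\delta^*\} \ge P_{ac}^D \cdot \frac{\mu(B_\delta^*)}{\mu(\Psi)} > 0,$$
where $\vec{w}_i^t$ is the variation vector, $B_\delta^*$ is the expanded optimal solution set, $\Psi$ is the feasible solution space, and $\mu(\cdot)$ denotes the measure of a set. Now we set $\zeta(t) = P_{ac}^D \cdot \mu(B_\delta^*) / \mu(\Psi)$; since $\zeta(t)$ is a positive constant, the series $\sum_{k=1}^{\infty} \zeta(t_k)$ diverges.
So, according to Theorem 2, the conclusion holds.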
Observation of SaCDEhaS’s Convergence. In fact, like the elitist genetic algorithm, SaCDEhaS satisfies the classical convergence model, which is characterized by two aspects: one is the retention of the current best solution, and the other is the ergodicity of the population. In particular, population ergodicity gives the algorithm the ability to escape local optima. The uniform mutation operator makes SaCDEhaS satisfy the second characteristic, while the haS operator makes it meet the first.
5. Numerical Experiment
In this section, the performance of SaCDEhaS is tested on the benchmark function set proposed for the Testing Evolutionary Algorithms on Real-world Numerical Optimization Problems CEC 2011 Special Session . Experiments are conducted on two comparative studies: (1) between SaCDEhaS and a SaDE algorithm which removes the uniform mutation and haS operators, and (2) between SaCDEhaS and the top four DE variants in the CEC 2011 competition. The first comparison study was implemented to show the effect of the uniform mutation and haS operators. The second comparison was to demonstrate the promising performance of the proposed SaCDEhaS algorithm.
5.1. Problem Definitions and Evaluation Criteria
The benchmark consisted of thirteen engineering optimization problems. Of these, T08, T09, and T11 have equality or inequality constraints imposed, and the other 10 problems are related to bound constrained optimization (or unconstrained optimization). These are summarized in Table 1.
“Cur.Best” denotes the best function values obtained within FEs on the CEC 2011 competition.
Marks the fact that the best value is obtained by the proposed algorithm SaCDEhaS.
“—” denotes that the value is larger than . Marks that the current best value is obtained.
According to the evaluation criteria given in the technical report , researchers must report the mean, best, and worst objective function values obtained over 25 independent runs after executing their algorithms for $5 \times 10^4$, $1 \times 10^5$, and $1.5 \times 10^5$ function evaluations (FEs).
5.2. Configuration of Parameters
In our experiments, SaCDEhaS and SaDE used the same population size. Depending on the dimensions of the problems, we set the population sizes for the 10 problems, in order, at 50, 250, 10, 10, 100, 80, 150, 80, 150, and 150. To determine the value of the auxiliary convergence probability , the experiments tested the performance of SaCDEhaS on all ten optimization problems with different values, that is, . For each parameter value and each problem, SaCDEhaS was run 25 times independently within FEs, and the best value and the mean value of the 25 runs were recorded. According to all experimental results, the optimal values were , , , , , , , , , and in order.
As shown in Figure 1, the lines for f3 and f7 are horizontal, while the lines for the other eight problems are not. This indicates that the performance on problems f3 and f7 is insensitive to the auxiliary convergence probability, whereas the performance on the other problems is sensitive to it.
Figure 1 panels: (a) f1, f3, f7; (b) f4, f12, f13; (c) f2, f10; (d) f5, f6.
5.3. Comparison of SaCDEhaS and SaDE
The results over 25 independent runs for the SaCDEhaS and SaDE algorithms are recorded in four tables (Tables 5, 6, 7, and 8) in the appendix. These tables show the best, median, worst, and mean function values and the standard deviations for FEs of , , and , respectively. The comparative aspects extracted in Table 2 include the best and mean values from the tests of SaCDEhaS and SaDE. The Sign Test , which is a popular statistical method for comparing the overall performance of algorithms, was then used to analyze the best values and the mean values in Table 2. As shown in Table 3, the probability value supporting the null hypothesis of the Sign Test equaled 0.011 on the mean values and 0.001 on the best values, both of which are less than the significance level 0.05. So we can reject the null hypothesis; that is to say, the overall performance of SaCDEhaS is better than that of SaDE.
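To show how a Sign Test probability value is obtained from counts of negative and positive differences, the sketch below computes a one-sided p value from the binomial distribution with success probability 0.5. The function name `sign_test_p` and the choice of the one-sided form are assumptions, and the counts in the usage lines are illustrative rather than read from Table 3.

```python
from math import comb

def sign_test_p(n_neg, n_pos):
    """One-sided sign test: probability of seeing at most min(n_neg, n_pos) minority signs under a fair coin."""
    n = n_neg + n_pos              # ties are assumed to have been dropped beforehand
    k = min(n_neg, n_pos)
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

# Illustrative splits over ten paired comparisons (hypothetical counts).
p_nine_one = sign_test_p(9, 1)     # 11/1024
p_ten_zero = sign_test_p(10, 0)    # 1/1024
```

Under this one-sided form, splits of 9/1 and 10/0 over ten problems give about 0.011 and 0.001, respectively; whether Table 3 used exactly these counts and this form of the test is not stated in the text.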
“Neg. dif.” and “Pos. dif.” denote the numbers of negative and positive differences, respectively. “p value” denotes the probability value supporting the null hypothesis.
In addition, the following comments can be made from Table 2 on notable aspects of the performance of the two algorithms.
(i) In problem T04, SaDE began to stagnate from the first stage when FEs are (the mean and best values were always equal to and , resp.). However, that did not happen to SaCDEhaS in any of the problems (in the three stages, the mean values were , , and , resp., and at the second stage, SaCDEhaS achieved the current best value ).
(ii) SaCDEhaS achieved the current best values on six of the 10 problems (T1, T2, T3, T4, T7, and T12), while SaDE achieved them on only four (T1, T2, T3, and T7). The current best values are marked with a star in Table 2. For problem T12 in particular, the minimum value achieved by SaCDEhaS was less than any other value reported in the CEC 2011 competition.
The above analyses indicate that SaCDEhaS outperforms SaDE on the benchmark set. This shows that the employment of the uniform mutation and haS operators benefits the algorithm.
5.4. Comparison of SaCDEhaS with Other Improved DE Variants
The CEC 2011 competition results are available on the homepage of Suganthan . Four of the top six algorithms belong to the DE family. SAMODE  turned out to be the best of these, followed by DE-RHC , Adap. DE , and ED-DE algorithm . In order to fully evaluate the performance of the proposed SaCDEhaS algorithm, we now compare it with the four top algorithms, for all the box-constrained global optimization problems of the benchmark function set for the CEC 2011 Special Session. Table 4 shows the best and mean values in the test instances for the SaCDEhaS, SAMODE, DE-RHC, Adap. DE, and ED-DE algorithm.