Abstract

Many studies have identified the differential evolution algorithm (DE) as one of the most powerful stochastic real-parameter algorithms for global optimization problems. However, a stagnation problem still exists in DE variants. To overcome this disadvantage, two lines of improvement have recently emerged. One is to combine multiple mutation operators to balance the exploration and exploitation abilities. The other is to develop DE variants that are convergent in theory, so as to decrease the probability of stagnation. Against this background, this paper proposes a subspace clustering mutation operator, called SC_qrtop. Five DE variants, which hold global convergence in probability, are then developed by combining the proposed operator with five mutation operators of DE, respectively. SC_qrtop randomly selects an elite individual as the perturbation center and employs the difference between two randomly generated boundary individuals as the perturbation step. Theoretical analyses and numerical simulations demonstrate that SC_qrtop prefers to search in the orthogonal subspace centered on the elite individual. Experimental results on the CEC2005 benchmark functions indicate that all five convergent DE variants with the SC_qrtop mutation outperform the corresponding DE algorithms.

1. Introduction

The classical optimization methods frequently used in scientific applications consist of strategies based on the Hessian matrix [1] and on the gradient [2]. It can be proved that the solution obtained by these classical methods is globally optimal [3]. However, if the derivative of an objective function cannot be calculated, it becomes difficult for classical optimization methods to search for the optimal solution [4]. Therefore, metaheuristic algorithms have become popular in scientific applications involving nondifferentiable nonlinear objective functions. The most widely used metaheuristic algorithms include the genetic algorithm (GA), particle swarm optimization (PSO), the differential evolution algorithm (DE), the artificial bee colony algorithm (ABC), and the cuckoo search (CK) algorithm.

Among these metaheuristic algorithms, DE has been identified as one of the most powerful optimizers. DE, proposed by Storn and Price in 1995 [5], is the only one that has continued to secure competitive rankings in the optimization competitions of the IEEE International Conferences on Evolutionary Computation (CEC) [6–8] since 1996. The competitiveness of DE is also supported by many comparative studies [9–12].

However, the stagnation problem still exists in DE variants [13, 14]. In order to overcome the disadvantage, two kinds of ideas for improving DE algorithms have gradually appeared in the latest studies. One is to develop DE variants based on composite trial vector generation strategies. The other is to develop convergent DE variants in theory.

1.1. DE Variants Based on Composite Trial Vector Generation Strategy

The classical mutation operators of the DE algorithm favor either the exploration ability or the exploitation ability to some extent, which easily results in a blind search over the feasible region or in insufficient population diversity. Consequently, the population is easily trapped in stagnation when a single mutation operator is used to generate trial vectors. To solve this problem, a natural idea is to combine different learning strategies to trade off exploration and exploitation. Wang et al. [10] proposed a composite DE, which generates trial vectors by combining three mutation operators, namely, DE/rand/1/bin, DE/rand/2/bin, and DE/current-to-rand/1. Rahnamayan et al. [15] proposed an opposition-based DE, which combines an opposition-based learning method and the classical mutation operators to generate trial vectors. Mohamed et al. [16] proposed a directed mutation rule based on the weighted difference vector between the best and the worst individuals and then developed an alternative DE by combining the directed mutation with the classical mutation strategies.

1.2. DE Variants Holding Convergence in Probability

A convergent algorithm is expected to have stronger robustness, so its probability of being trapped in stagnation is smaller than that of algorithms which cannot guarantee global convergence. With the progress of theoretical research on DE, some convergent DE algorithms grounded in mathematical theory have been proposed.

In [17], Hu et al. proved that the classical DE cannot converge to the global optimal set with probability 1 and then proposed a convergent DE algorithm. In [18], Hu et al. summarized the theoretical work on DE in three areas, that is, research on time complexity, research on the dynamical behavior of DE's population, and research on the convergence properties of DE. The paper then proved a sufficient condition for the global convergence of DE and proposed a convergent DE algorithm framework. In [19], Hu et al. proposed a self-adaptive DE algorithm and then proved the algorithm's convergence by using the sufficient condition presented in [18]. Ter Braak [20] proposed a differential evolution Markov chain algorithm (DE-MC) and proved that its population sequence has a unique joint stationary distribution. Zhao [21] presented a convergent DE using a hybrid optimization strategy and a transform function and proved its convergence via the Markov process. Zhan and Zhang [22] presented a DE-RW algorithm which applies a random-walk mechanism to the basic DE variants (the convergence of DE-RW was not proved there, but it can easily be proved by Theorem 2 in Section 4 below). Li et al. [23] proposed a convergent DE algorithm by incorporating Gaussian mutation and diversity-triggered reverse sampling into DE/rand/1/bin.

Based on the above two research lines, the main contributions of this paper can be summarized as follows.
(i) Firstly, this paper proposes a subspace clustering mutation operator, called SC_qrtop, which randomly selects an elite individual as the perturbation center and employs the difference between two randomly generated boundary individuals as the perturbation step. Theoretical analyses and numerical simulations demonstrate that the SC_qrtop mutation prefers to search in the orthogonal subspace centered on the elite individual.
(ii) Secondly, this paper presents a convergent DE model that combines the SC_qrtop mutation with the classical mutation operators of DE and gives a theoretical proof of the algorithm's convergence.
(iii) Finally, numerical experiments on the CEC2005 benchmark functions validate that the SC_qrtop mutation operator has a positive effect on the performance of all five classical mutation operators of DE.

The rest of this paper is structured as follows. Section 2 briefly introduces the basic DE algorithm. Section 3 presents and analyzes the subspace clustering mutation operator. Section 4 gives the DE variants based on the subspace clustering mutation and proves their convergence in theory. Numerical experiments on the CEC2005 benchmark functions are then presented in Section 5. Section 6 discusses the theoretical significance of the proposed operator, followed by conclusions and future work in Section 7.

2. Classical Differential Evolution

DE is used for dealing with continuous optimization problems. This paper supposes that the objective function to be minimized is $f(X)$, $X = (x_1, x_2, \ldots, x_D) \in \mathbb{R}^D$, and that the feasible solution space is $S = \prod_{j=1}^{D} [L_j, U_j]$, where $L_j$ and $U_j$ are the lower and upper bounds of the $j$th dimension. The classical DE [19, 24, 25] works through a simple cycle of mutation, crossover, and selection operators after initialization. The classical DE procedures are described in detail as follows.

Initialization. The first step of DE is the initialization of a population of $NP$ $D$-dimensional potential solutions (individuals) over the optimization search space. We symbolize each individual by $X_i^g = (x_{i,1}^g, x_{i,2}^g, \ldots, x_{i,D}^g)$, for $i = 1, 2, \ldots, NP$, where $g = 0, 1, \ldots, g_{\max}$ is the current generation and $g_{\max}$ is the maximum number of generations. For the first generation ($g = 0$), the population should be sufficiently scaled to cover the optimization search space as much as possible. Initialization is implemented by uniformly sampling the potential individuals in the optimization search space. We can initialize the $j$th dimension of the $i$th individual according to
$x_{i,j}^0 = L_j + \mathrm{rand}_{i,j}(0,1) \cdot (U_j - L_j),$
where $\mathrm{rand}_{i,j}(0,1)$ is a uniformly distributed random number confined to the range $[0, 1]$.

Mutation Operators. After initialization, DE creates a donor vector $V_i^g$ corresponding to each individual $X_i^g$ of the $g$th generation through the mutation operator. The most frequently used mutation strategies are listed below:
DE/rand/1: $V_i^g = X_{r_1}^g + F\,(X_{r_2}^g - X_{r_3}^g)$,
DE/best/1: $V_i^g = X_{best}^g + F\,(X_{r_1}^g - X_{r_2}^g)$,
DE/current-to-best/1: $V_i^g = X_i^g + F\,(X_{best}^g - X_i^g) + F\,(X_{r_1}^g - X_{r_2}^g)$,
DE/best/2: $V_i^g = X_{best}^g + F\,(X_{r_1}^g - X_{r_2}^g) + F\,(X_{r_3}^g - X_{r_4}^g)$,
DE/rand/2: $V_i^g = X_{r_1}^g + F\,(X_{r_2}^g - X_{r_3}^g) + F\,(X_{r_4}^g - X_{r_5}^g)$,
where $X_{best}^g$ denotes the best individual of the current generation, the indices $r_1, r_2, r_3, r_4, r_5$ are uniformly random integers, mutually different and distinct from the running index $i$, and $F$ is a real parameter called the mutation or scaling factor.

If the element values of the donor vector exceed the prespecified upper bound or fall below the lower bound, we change the element values by the periodic mode rule as follows:
$v_{i,j}^g = L_j + \bigl(v_{i,j}^g - L_j\bigr) \bmod (U_j - L_j).$
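
For concreteness, a minimal Python sketch of one periodic repair rule is given below; the exact formula used by the authors is not recoverable from the text above, so the wrap-around form and the helper name periodic_repair are assumptions.

import numpy as np

def periodic_repair(v, lower, upper):
    # Map each out-of-bound element of the donor vector back into
    # [lower, upper) periodically (one common form of the "periodic mode"
    # rule; assumed here, not the authors' exact formula).
    width = upper - lower
    return lower + np.mod(v - lower, width)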

Crossover Operator. Following mutation, the crossover operator is applied to further increase the diversity of the population. In crossover, a trial vector, $U_i^g = (u_{i,1}^g, u_{i,2}^g, \ldots, u_{i,D}^g)$, is generated by the binomial crossover, which combines elements of the target vector, $X_i^g$, and the donor vector, $V_i^g$:
$u_{i,j}^g = \begin{cases} v_{i,j}^g, & \text{if } \mathrm{rand}_{i,j}(0,1) \le CR \text{ or } j = j_{\mathrm{rand}}, \\ x_{i,j}^g, & \text{otherwise}, \end{cases}$
where $CR$ is the probability of crossover and $j_{\mathrm{rand}}$ is a random integer on $[1, D]$.

Selection Operator. Finally, the selection operator is employed to retain the most promising trial individuals in the next generation. The classical DE adopts a simple selection scheme: it compares the objective value of the target individual $X_i^g$ with that of the trial individual $U_i^g$. If the trial individual reduces the value of the objective function, it is accepted for the next generation; otherwise, the target individual is retained in the population. The selection operator is defined as
$X_i^{g+1} = \begin{cases} U_i^g, & \text{if } f(U_i^g) \le f(X_i^g), \\ X_i^g, & \text{otherwise}. \end{cases}$

The pseudocode of the classical DE algorithm (DE/rand/1) is illustrated in Pseudocode 1.

P^0 = initial_population(NP), F, CR = initial_parameters
while ! termination_condition do
  for i = 1 to NP
    V_i^g = mutation(X_i^g, F)                 // mutation
    U_i^g = crossover(X_i^g, V_i^g, CR)        // crossover
    if f(U_i^g) <= f(X_i^g) then               // selection
      X_i^{g+1} = U_i^g
    else
      X_i^{g+1} = X_i^g
    end if
  end for
end while
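
For readers who prefer executable code, the following Python sketch mirrors Pseudocode 1 for DE/rand/1 on a generic objective function; the function name de_rand_1, the termination by a fixed budget of function evaluations, and the periodic bound repair are illustrative assumptions rather than the authors' reference implementation.

import numpy as np

def de_rand_1(f, lower, upper, NP=60, F=0.5, CR=0.9, max_fes=150000, seed=None):
    rng = np.random.default_rng(seed)
    D = lower.size
    X = lower + rng.random((NP, D)) * (upper - lower)       # initialization
    fit = np.array([f(x) for x in X])
    fes = NP
    while fes < max_fes:                                     # termination condition
        for i in range(NP):
            r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], 3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])                  # DE/rand/1 mutation
            V = lower + np.mod(V - lower, upper - lower)     # periodic bound repair
            jrand = rng.integers(D)
            mask = rng.random(D) <= CR
            mask[jrand] = True
            U = np.where(mask, V, X[i])                      # binomial crossover
            fu = f(U); fes += 1
            if fu <= fit[i]:                                 # greedy selection
                X[i], fit[i] = U, fu
    best = np.argmin(fit)
    return X[best], fit[best]

For example, de_rand_1(lambda x: float(np.sum(x**2)), np.full(10, -100.0), np.full(10, 100.0)) would minimize the sphere function on [-100, 100]^10.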

3. Subspace Clustering Mutation Operator

DE variants with global convergence are attracting more and more attention. A common convergent model of an evolutionary algorithm (EA) is characterized by two properties. One is that each population is ergodic. The other is that the best solution of each generation is reserved for the next generation. Since the greedy selection strategy of the DE algorithm already reserves the best solution for the next generation, the ergodicity of the population becomes the key issue in developing convergent DE variants. In addition, considering the balance of exploration and exploitation, we propose a subspace clustering mutation operator, called SC_qrtop. It can be formulated as follows:

SC_qrtop: $v_{i,j}^g = x_{qr,j}^g + \mathrm{rand}_j(0,1) \cdot (b_{1,j} - b_{2,j})$, $j = 1, 2, \ldots, D$, where $X_{qr}^g$ is an individual selected by randomly sampling from the top $q\%$ of the $g$th population, and $B_1 = (b_{1,1}, \ldots, b_{1,D})$ and $B_2 = (b_{2,1}, \ldots, b_{2,D})$ are two boundary individuals, each element of which is equal to the upper boundary value, $U_j$, or the lower boundary value, $L_j$, with equal probability.
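
A minimal Python sketch of the operator, as reconstructed from the description above, is given below; in particular, the element-wise rand(0,1) scaling of the boundary difference, the default q = 0.2, and the periodic bound repair are assumptions chosen to be consistent with the ergodicity and uniform-coverage claims made in this section.

import numpy as np

def sc_qrtop(pop, fitness, lower, upper, q=0.2, rng=None):
    # Subspace clustering mutation (SC_qrtop), sketched from the paper's description.
    # Center: an elite individual sampled from the top q% of the population.
    # Step: the difference of two random boundary individuals, scaled element-wise.
    rng = np.random.default_rng() if rng is None else rng
    NP, D = pop.shape
    n_top = max(1, int(np.ceil(q * NP)))
    elite_idx = rng.choice(np.argsort(fitness)[:n_top])      # random top-q% individual
    center = pop[elite_idx]
    B1 = np.where(rng.random(D) < 0.5, lower, upper)          # boundary individual 1
    B2 = np.where(rng.random(D) < 0.5, lower, upper)          # boundary individual 2
    donor = center + rng.random(D) * (B1 - B2)
    return lower + np.mod(donor - lower, upper - lower)       # periodic bound repair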

The characteristics of SC_qrtop can be summarized as follows.
(i) Employing the SC_qrtop mutation makes the population ergodic. In fact, the SC_qrtop mutation makes the probability of the donor individual locating in any small region of the whole search space greater than 0. This paper calls this probability the ergodic probability.
(ii) The SC_qrtop mutation can reproduce the elite individual $X_{qr}^g$ with a small probability, and this probability equals the ergodic probability. This characteristic benefits the balance between exploration and exploitation to some extent.
(iii) The individuals generated by the SC_qrtop mutation prefer to locate in the orthogonal subspaces centered on $X_{qr}^g$. So the mutation operator improves the search capacity in the orthogonal subspaces of the outstanding individuals. For this reason, we call the operator subspace clustering mutation (SC_qrtop for short).
(iv) The implementation of the SC_qrtop mutation is very simple. It is also easy to combine the SC_qrtop mutation with the mutation operators of the classical DE.

The reasons that the SC_qrtop mutation has the above characteristics are analyzed both theoretically and experimentally in the following three subsections, that is, the probability analysis, the statistical experiments, and the implementation tips of the SC_qrtop mutation.

3.1. Probability Analysis of SC_qrtop Mutation

In this subsection, we analyze the probabilities of the individuals generated by SC_qrtop locating in each subspace of the search space.

Let the SC_donor vector denote the individual generated by the SC_qrtop mutation. Let $E_k$, for $k = 1, 2, \ldots, D$, denote the event that the SC_donor vector locates in a $k$-dimensional subspace and does not locate in any $(k-1)$-dimensional subspace, where $D$ is the dimension of the search space. Let $p_k$, for $k = 1, 2, \ldots, D$, denote the probability of the event $E_k$. Let $E_0$ denote the event that an individual locates in a null space, and let $p_0$ denote the probability of the event $E_0$.

(i) Supposing the Search Space Is One Dimensional. In this case ($D = 1$), we establish a coordinate system with the elite individual $X_{qr}^g$ as its origin, and then the search space has two subspaces. One is a null space, and the other is the search space itself. As shown in Figure 1, the region $A_0$ is the null space, which includes just the point $X_{qr}^g$; that is, $A_0 = \{X_{qr}^g\}$. The region $A_1$ is the set including all points except for $X_{qr}^g$.

That the SC_donor vector locates in the subspace $A_0$ means that the SC_donor equals $X_{qr}^g$; that is, the boundary individual $B_1$ equals the other boundary individual $B_2$. So we get $p_0 = P(B_1 = B_2) = 1/2$.

In addition, from the above definition of $E_1$, $p_1$ is the probability of the SC_donor vector locating in the region other than the null space $A_0$; that is to say, $p_1$ is the probability of the SC_donor locating in the region $A_1$. So $p_0 + p_1 = 1$, and we then get $p_1 = 1/2$.

(ii) Supposing the Search Space Is Two Dimensional. In this case ($D = 2$), a rectangular coordinate system with the elite individual $X_{qr}^g$ as its origin is established, as shown in Figure 2. The regions in Figure 2 can be represented as follows: $A_0 = \{X_{qr}^g\}$ is the origin itself, $A_1$ consists of the two coordinate axes excluding the origin, and $A_2$ is the remainder of the search space.

Next, we analyze the probability of SC_donor locating in each subspace.

Firstly, the probability of the SC_donor locating in the region $A_0$ equals the probability that the event $E_0$ occurs simultaneously and independently on the $x$ axis and the $y$ axis. So $p_0 = (1/2) \cdot (1/2) = 1/4$.

Secondly, that the SC_donor locates in the region $A_1$ means the concurrence of the event $E_0$ on the $x$ axis and the event $E_1$ on the $y$ axis, or the concurrence of the event $E_1$ on the $x$ axis and the event $E_0$ on the $y$ axis. So $p_1 = (1/2)(1/2) + (1/2)(1/2) = 1/2$.

Similarly, we can get the probability of the SC_donor locating in the region $A_2$: $p_2 = (1/2)(1/2) = 1/4$.

(iii) Supposing the Search Space Is $D$ Dimensional. According to the same procedure as in the case $D = 2$, we can get the probabilities $p_k$, for $k = 0, 1, \ldots, D$, as follows:
$p_k = C_D^k \left(\tfrac{1}{2}\right)^D.$

Here $C_D^k$, for $k = 0, 1, \ldots, D$, is a combination number, which equals $D!/(k!\,(D-k)!)$. "$!$" denotes the usual factorial.
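
For instance, in a three-dimensional search space ($D = 3$) the formula gives

$p_0 = C_3^0 \left(\tfrac{1}{2}\right)^3 = \tfrac{1}{8}, \quad p_1 = C_3^1 \left(\tfrac{1}{2}\right)^3 = \tfrac{3}{8}, \quad p_2 = C_3^2 \left(\tfrac{1}{2}\right)^3 = \tfrac{3}{8}, \quad p_3 = C_3^3 \left(\tfrac{1}{2}\right)^3 = \tfrac{1}{8},$

and these probabilities sum to 1, as expected.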

3.2. Statistical Experiment of SC_qrtop Mutation

The statistical experiments are conducted to show the distribution landscapes of sampling experiments associated with the SC_qrtop mutation. As shown in Figure 3, three distribution landscapes are given for different cases, that is, a two-dimensional space with one top individual and 100 independent repetitions; a two-dimensional space with three top individuals and 300 independent repetitions; and a three-dimensional space with one top individual and 300 independent repetitions. The figure shows the clustering feature of the SC_donor points in the subspaces. Taking Figure 3(a) as an example, some SC_donor points locate at the origin while some (marked by “+”) locate on the two coordinate axes, and the remainder distribute uniformly in the two-dimensional search space. These three cases correspond to the occurrence of the events $E_0$, $E_1$, and $E_2$, respectively. As shown in Table 1, over 100 independent repetitions, the occurrence counts are 28, 44, and 28, in that order. The experimental proportions are close to the theoretical probabilities $p_0$, $p_1$, and $p_2$, that is, 0.25, 0.50, and 0.25.
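
The proportions reported above can be checked with a short Monte Carlo simulation; the sketch below reuses the SC_qrtop form assumed earlier and classifies each SC_donor by how many of its coordinates differ from the elite center (the event $E_k$ corresponds to exactly $k$ perturbed coordinates).

import numpy as np

# Monte Carlo check of the subspace probabilities for D = 2 (cf. Table 1),
# under the SC_qrtop form assumed in the sketch above.
rng = np.random.default_rng(0)
D, trials = 2, 100000
lower, upper = np.zeros(D), np.ones(D)
center = rng.random(D)                          # a fixed "elite" point
counts = np.zeros(D + 1, dtype=int)
for _ in range(trials):
    B1 = np.where(rng.random(D) < 0.5, lower, upper)
    B2 = np.where(rng.random(D) < 0.5, lower, upper)
    donor = center + rng.random(D) * (B1 - B2)
    donor = lower + np.mod(donor - lower, upper - lower)
    k = int(np.sum(~np.isclose(donor, center)))  # number of perturbed coordinates
    counts[k] += 1
print(counts / trials)    # expected to be close to [0.25, 0.50, 0.25]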

In Table 1, the "Pra_sub" values of the events $E_k$ are their theoretical probabilities $p_k$, which can be calculated by the formula $p_k = C_D^k (1/2)^D$ given above.

3.3. Implementation Tips of SC_qrtop Mutation

It is easy to incorporate SC_qrtop into the classical mutation operators. Taking the DE/rand/1 mutation as an example, we enlarge the range of the random integer $r_1$ from $[1, NP]$ to $[1, \lceil (1 + q\%) \cdot NP \rceil]$. If $r_1 \le NP$, then the classical DE/rand/1 is executed; otherwise the SC_qrtop mutation operator is executed. That is to say, the modified algorithm employs the SC_qrtop mutation operator with a probability of approximately $q\%$ (considering the balance of exploration and exploitation, we suggest setting this probability equal to the proportion $q\%$ used to define the top individuals). For the other classical mutation operators, a random integer also has to be generated, so the application of the SC_qrtop mutation operator can be handled in the same way.
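
In code, the modification amounts to drawing one index over an enlarged range, as sketched below; the ceiling-based range and the helper name mutate_with_sc_qrtop are assumptions consistent with the description above, and the routine reuses the sc_qrtop sketch from earlier in this section.

import numpy as np

def mutate_with_sc_qrtop(pop, fitness, i, F, lower, upper, q=0.2, rng=None):
    # DE/rand/1 mutation extended with SC_qrtop via an enlarged index range.
    rng = np.random.default_rng() if rng is None else rng
    NP = pop.shape[0]
    r1 = rng.integers(1, int(np.ceil((1 + q) * NP)) + 1)    # index on [1, ceil((1+q)*NP)]
    if r1 <= NP:
        # classical DE/rand/1 using three indices distinct from i
        r_a, r_b, r_c = rng.choice([k for k in range(NP) if k != i], 3, replace=False)
        return pop[r_a] + F * (pop[r_b] - pop[r_c])
    # otherwise apply SC_qrtop (roughly a q/(1+q) fraction of the calls)
    return sc_qrtop(pop, fitness, lower, upper, q=q, rng=rng)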

4. Convergent DE Algorithm Based on Subspace Clustering Mutation

In this section, this paper proposes a convergent DE algorithm framework and proves its global convergence in probability.

4.1. Algorithmic Framework

This section proposes a convergent DE algorithm based on SC_qrtop mutation operator, called CDE/SC_qrtop. The pseudocode of CDE/SC_qrtop_rand/1 is demonstrated in Pseudocode 2. In the same way, it is easy to develop the other convergent variants, that is, CDE/SC_qrtop_best/1, CDE/SC_qrtop_current-to-best/1, CDE/SC_qrtop_best/2, and CDE/SC_qrtop_rand/2.

P^0 = initial_population(NP), F, CR = initial_parameters
while ! termination_condition do
  for i = 1 to NP
+   generate a random integer r_1 on [1, ceil((1 + q%) * NP)]
+   if r_1 <= NP, then
      V_i^g = X_{r_1}^g + F (X_{r_2}^g - X_{r_3}^g)        // mutation
+   else
+     V_i^g = SC_qrtop(P^g)                                // SC_qrtop
+   end if
    U_i^g = crossover(X_i^g, V_i^g, CR)                    // crossover
    if f(U_i^g) <= f(X_i^g), then                          // selection
      X_i^{g+1} = U_i^g
    else
      X_i^{g+1} = X_i^g
    end if
  end for
end while
“+” marks the added operations. “q%” is the probability of using the SC_qrtop mutation.
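
Putting the pieces together, a CDE/SC_qrtop_rand/1 run can be sketched in Python as follows; this is an illustrative assembly that reuses the mutate_with_sc_qrtop sketch above, with the parameter values of Section 5 as defaults, and is not the authors' reference code.

import numpy as np

def cde_sc_qrtop_rand_1(f, lower, upper, NP=60, F=0.5, CR=0.9, q=0.2,
                        max_fes=150000, seed=None):
    rng = np.random.default_rng(seed)
    D = lower.size
    X = lower + rng.random((NP, D)) * (upper - lower)        # initialization
    fit = np.array([f(x) for x in X])
    fes = NP
    while fes < max_fes:
        for i in range(NP):
            V = mutate_with_sc_qrtop(X, fit, i, F, lower, upper, q=q, rng=rng)
            V = lower + np.mod(V - lower, upper - lower)     # periodic bound repair
            jrand = rng.integers(D)
            mask = rng.random(D) <= CR
            mask[jrand] = True
            U = np.where(mask, V, X[i])                      # binomial crossover
            fu = f(U); fes += 1
            if fu <= fit[i]:                                 # greedy selection
                X[i], fit[i] = U, fu
    best = np.argmin(fit)
    return X[best], fit[best]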

4.2. Convergence Proof

There are several different definitions of convergence for analyzing the asymptotic convergence of random algorithms. A frequently used definition, convergence in probability, is adopted in this paper.

Definition 1. Let $\{P^g\}_{g=0}^{\infty}$ be a population sequence associated with a random algorithm. The algorithm holds global convergence in probability for a certain optimization problem if and only if
$\lim_{g \to \infty} P\{P^g \cap \mathcal{B}_\varepsilon^* \neq \emptyset\} = 1,$
where $\varepsilon$ is a small positive real and $\mathcal{B}_\varepsilon^*$ denotes an expanded optimal solution set, that is, $\mathcal{B}_\varepsilon^* = \{X \in S : |f(X) - f(X^*)| < \varepsilon\}$, where $X^*$ is an optimum of the objective function $f$.
Several important theorems on the global convergence of evolutionary algorithms have been presented. Rudolph [26] generalized the convergence conditions for binary and Euclidean search spaces to a general search space; under his convergence condition, EAs with an elitist selection rule converge to the global optimum. However, the measure associated with a Markovian kernel function, which needs to be estimated in that convergence condition, is not very convenient. He and Yu [27] presented a theoretical analysis of the convergence conditions for EAs, based on a certain probability integral of the offspring entering the optimal set. Perhaps the most convenient theorem for proving the global convergence of DE variants is the one recently presented by Hu et al. [19]. That theorem builds on the previous two; it only requires checking whether the probability of the offspring in some subsequence of populations entering the optimal solution set is large enough. The theorem can be stated as follows.

Theorem 2 (see [19]). Consider $\{P^g\}_{g=0}^{\infty}$ to be a population sequence of a DE variant with a greedy selection operator. If, in each target population $P^{g_k}$, there exists at least one individual $X_i^{g_k}$, corresponding to the trial individual $U_i^{g_k}$, such that $P\{U_i^{g_k} \in \mathcal{B}_\varepsilon^*\} \ge \varepsilon_{g_k}$ and the series $\sum_{k=1}^{\infty} \varepsilon_{g_k}$ diverges, then the DE variant holds global convergence.
Here $\{g_k\}$ denotes any subsequence of the natural numbers, $P\{U_i^{g_k} \in \mathcal{B}_\varepsilon^*\}$ denotes the probability that $U_i^{g_k}$ belongs to the optimal solution set $\mathcal{B}_\varepsilon^*$, and $\varepsilon_{g_k}$ is a small positive real which may depend on $g_k$.

From Theorem 2, we can conclude that if the probability of entering the optimal set over a certain subsequence of populations is large enough that the corresponding series diverges, then the DE variant holds global convergence.

In fact, for each generation population of CDE/SC_qrtop, the probability of an SC_donor individual locating in any measurable subregion $A \subseteq S$ satisfies
$P\{V_i^g \in A\} \ge p_{sc} \left(\tfrac{1}{2}\right)^D \frac{m(A)}{m(S)},$
where $m(\cdot)$ denotes the measure of a measurable set and $p_{sc}$ is the probability of applying the SC_qrtop mutation (suggested to be $q\%$); the factor $(1/2)^D$ is the probability $p_D$ that all coordinates are perturbed, in which case the donor covers $S$ uniformly. Let $A = \mathcal{B}_\varepsilon^* \cap S$; then, whether or not the classical mutation operator generates an optimum, the probability that a donor individual locates in the optimal set $\mathcal{B}_\varepsilon^*$ can be estimated as
$P\{V_i^g \in \mathcal{B}_\varepsilon^*\} \ge p_{sc} \left(\tfrac{1}{2}\right)^D \frac{m(\mathcal{B}_\varepsilon^* \cap S)}{m(S)} > 0.$
Since the crossover probability $CR > 0$, the trial individual $U_i^g$ inherits the whole donor vector with probability at least $CR^{D-1} > 0$. So if $\varepsilon_g$ takes the constant value
$\varepsilon_g = p_{sc} \left(\tfrac{1}{2}\right)^D \frac{m(\mathcal{B}_\varepsilon^* \cap S)}{m(S)} \, CR^{D-1},$
we get that $\sum_g \varepsilon_g$ diverges. Hence, we draw the conclusion that the CDE/SC_qrtop algorithm holds global convergence.
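
Written out, the divergence step is elementary: denoting the generation-independent lower bound above by $c > 0$,

$\varepsilon_g \ge c > 0 \ \text{for all } g \quad \Longrightarrow \quad \sum_{g=1}^{\infty} \varepsilon_g \ \ge\ \sum_{g=1}^{\infty} c \ =\ \infty,$

so the divergence condition of Theorem 2 is satisfied.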

5. Numerical Experiments

The main purpose of the numerical experiments is to show that the proposed SC_qrtop operator can enhance the search ability of all five classical mutation operators, that is, rand/1, best/1, current-to-best/1, best/2, and rand/2. Therefore, this paper compares the CDE/SC_qrtop algorithms with the corresponding classical DE algorithms using the five classical mutations, respectively. The experiments were conducted on the 25 test instances proposed in the CEC2005 special session on real-parameter optimization [28]. This benchmark function set includes four classes:
(i) 5 unimodal functions 1–5,
(ii) 7 basic multimodal functions 6–12,
(iii) 2 expanded multimodal functions 13-14,
(iv) 11 hybrid composition functions 15–25.

The number of decision variables, $D$, was set to 10 for all 25 test functions. The population size, $NP$, was set to 60 for all the algorithms. The mutation factor, $F$, was set to 0.5, while the crossover probability, $CR$, was set to 0.9. The probability of using the SC_qrtop mutation was set to 20%. For each algorithm and each test function, 25 independent runs were conducted, with 150,000 function evaluations (FEs) as the termination criterion.

Generally speaking, owing to the use of the best solution of the current population, the DE variants with a mutation strategy based on the best solution, that is, DE/best/1, DE/cur-to-best/1, and DE/best/2, have more powerful exploitation ability, while the other, random mutation strategies give the DE variant more powerful exploration ability. Given that, this section analyzes the experimental results from two aspects: the comparison on the three mutation strategies based on the best solution and the comparison on the other two random mutation strategies.

5.1. Comparison on Three Mutation Strategies Based on the Best Solution

The experimental results of CDE/SC_qrtop_best/1 & DE/best/1, CDE/SC_qrtop_cur-best/1 & DE/cur-best/1, and CDE/SC_qrtop_best/2 & DE/best/2 are reported in Tables 2, 3, and 4, respectively. The bottom right corner of each table summarizes the statistical analysis of the experimental results. The comparison criteria are, in order of priority, the best solution, the mean, and the standard deviation.

From Tables 2, 3, and 4, the numbers of benchmark functions on which the three CDE/SC_qrtop variants outperform the corresponding DE algorithms are 14, 21, and 12, respectively. Meanwhile, the numbers of benchmark functions on which the CDE/SC_qrtop variants are inferior to the corresponding DE algorithms are 3, 4, and 6, respectively. Figure 4 shows the evolution curves of the average error of the best function values over 25 runs obtained by all six algorithms on benchmark functions 1–14. The results show that the SC_qrtop mutation can greatly improve the search ability of the three mutation strategies based on the best solution.

Further analysis of the results shows that the improvement of the three CDE/SC_qrtop variants is more pronounced on the basic multimodal functions and the hybrid composition functions. For the unimodal functions, the three classical DE algorithms, especially DE/best/2, already achieve (or approach) the optimal solution, so the three CDE/SC_qrtop variants obtain results similar to those achieved by the corresponding DE algorithms.

5.2. Comparison on the Other Two Random Mutation Strategies

The experimental results of CDE/SC_qrtop_rand/1 & DE/rand/1 and CDE/SC_qrtop_rand/2 & DE/rand/2 are reported in Tables 5 and 6, respectively. As above, the bottom right corner of each table summarizes the statistical analysis of the experimental results, and the comparison criteria are, in order of priority, the best solution, the mean, and the standard deviation.

As shown in Tables 5 and 6, the number of benchmark functions on which the two CDE/SC_qrtop variants outperform the corresponding DE algorithms is 9 in both cases. Meanwhile, the number of benchmark functions on which the CDE/SC_qrtop variants are inferior to the corresponding DE algorithms is 7 in both cases. The results show that the SC_qrtop mutation only weakly improves the search ability of the two random mutation strategies.

In summary, the SC_qrtop mutation can improve the search ability of all the classical mutation operators of DE. The improvement on the three mutation strategies based on the best solution is very significant, while it is small on the two random mutation strategies. The results also indicate that the SC_qrtop mutation contributes more to exploration than to exploitation. The experimental results are consistent with the theoretical conclusion that the CDE/SC_qrtop algorithms guarantee global convergence in probability.

6. Discussion

The previous theoretical analysis proved that a differential evolution algorithm incorporating the SC_qrtop mutation operator holds convergence in probability. The numerical experiments showed that these convergent SC_qrtop DE algorithms are significantly better than, or at least comparable to, the corresponding DE algorithms.

Generally, the populations of a convergent algorithm have more diversity, which can enhance the algorithmic exploration ability and make the algorithm more robust. DE-RW [22] uses a random-walk mechanism to enhance the population diversity until the individuals are ergodic, thereby making the algorithm hold global convergence in probability. Likewise, the convergent DE algorithm presented in [23] utilizes a Gaussian mutation operator to enhance the algorithmic exploration ability. The research on these convergent DE algorithms moves the DE field a significant step forward. However, algorithmic performance depends on the balance between exploration and exploitation, and merely enhancing exploration may decrease the convergence speed of an algorithm. Unlike the random-walk and Gaussian mutation operators, the proposed SC_qrtop mutation operator takes this balance into account. As shown in Table 1, the occurrence probability of the event $E_0$ always equals the occurrence probability of the event $E_D$. That is to say, the probability ($p_0$) of reproducing elite individuals equals the probability ($p_D$) of randomly sampling in the whole solution space. The reproduction of elite individuals helps enhance the exploitation ability, while random sampling in the whole solution space is conducive to enhancing the exploration ability.

In addition, the proposed SC_qrtop mutation operator can be incorporated into any state-of-the-art DE algorithm, thereby yielding DE algorithms that are convergent in theory. Since the SC_qrtop mutation operator takes into account the balance between exploration and exploitation, the performance of such convergent algorithms based on state-of-the-art DE is expected to be promising.

7. Conclusion and Future Work

In much of the recent literature, a significant amount of experimental evidence has indicated that composite trial vector generation strategies are an effective way to balance the exploration and exploitation abilities of DE variants. Taking into account that an algorithm that is convergent in theory has stronger robustness, this paper proposed a subspace clustering mutation operator for DE variants. By combining the proposed mutation with the classical mutation operators, this paper developed five convergent DE variants. The experimental results on the CEC2005 benchmark functions indicated that all five convergent DE variants with the subspace clustering mutation operator outperform the corresponding DE algorithms. They also indicated that the effect of combining the subspace clustering mutation operator with any of the three mutation strategies based on the best solution (i.e., DE/best/1, DE/current-to-best/1, and DE/best/2) is more significant than that of combining it with the other two random mutation strategies (i.e., DE/rand/1 and DE/rand/2).

Two possible directions for future work can be summarized as follows.
(i) Incorporate the subspace clustering operator into other outstanding DE variants, such as differential evolution utilizing proximity-based mutation operators (Pro DE) [29], differential evolution using a neighborhood-based mutation operator [30], and so forth. Many numerical experiments have verified that these algorithms achieve better performance on the majority of benchmark problems. Following the study of this paper, we can incorporate the subspace clustering operator into these outstanding DE variants, and it is easy to prove that the resulting algorithms guarantee global convergence in probability. However, whether the subspace clustering operator can further enhance the performance of these outstanding DE variants remains to be verified by numerical experiments.
(ii) Generalize this work to other similar evolutionary algorithms, such as particle swarm optimization (PSO), the cuckoo search algorithm (CK), and the artificial bee colony algorithm (ABC).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to acknowledge Professor Qingfu Zhang for his guidance. This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61170202 and 11301163, the Fundamental Research Funds for the Central Universities under Grants nos. 2012-YB-19 and 2013-YB-003, and the Key Project of Chinese Ministry of Education under Grant no. 212109.