Research Article | Open Access
An Improved Brain Storm Optimization with Differential Evolution Strategy for Applications of ANNs
The Brain Storm Optimization (BSO) algorithm is a swarm intelligence algorithm inspired by the human behavior of brainstorming. The performance of BSO is maintained by the idea-creating process, but when the algorithm cannot find a better solution for several successive iterations, the search becomes so inefficient that the population may be trapped in local optima. In this paper, we propose an improved BSO algorithm with a differential evolution strategy and a new step size method. Firstly, the differential evolution strategy is incorporated into the idea-creating operator to allow BSO to jump out of stagnation, owing to its strong searching ability. Secondly, we introduce a new step size control method that can better balance exploration and exploitation at different searching generations. Finally, the proposed algorithm is first tested on 14 benchmark functions of CEC 2005 and then applied to train artificial neural networks. Comparative experimental results illustrate that the proposed algorithm performs significantly better than the original BSO.
In the past few decades, swarm intelligence algorithms, which are derived from the concepts, principles, and mechanisms of nature-inspired computation, have attracted more and more attention from researchers. Although swarm intelligence is a newer category of stochastic, population-based optimization algorithms than evolutionary algorithms (EAs) such as Genetic Algorithms (GA), Evolutionary Programming (EP), Evolutionary Strategies (ES), Genetic Programming (GP), and differential evolution (DE), many swarm intelligence optimization algorithms, such as Particle Swarm Optimization (PSO) [6, 7], have already been proposed to tackle challenging complex computational problems in science and industry.
The Brain Storm Optimization (BSO) algorithm [8, 9] is a swarm intelligence algorithm inspired by the human behavior of brainstorming. Derived from the human creative problem-solving process, BSO has attracted more and more attention and has been applied to function optimization, optimal satellite formation reconfiguration, the design of DC brushless motors, economic dispatch considering wind power, multiobjective optimization [13, 14], and so forth. Owing to the flexibility and scalability of the original BSO algorithm, many new versions of BSO have been designed and implemented. To improve the performance and reduce the computational burden of the BSO algorithm, Zhan et al. modified the grouping operator and the creating operator and designed a BSO variant named Modified BSO (MBSO). Cheng et al. used two kinds of partial reinitialization strategies to improve the population diversity in the BSO algorithm. Recently, Yang et al. proposed advanced discussion mechanism-based brain storm optimization (ADMBSO), which incorporates intercluster and intracluster discussions into BSO to control the global and local searching ability. However, optimization problems have become more and more complex, from simple unimodal functions to hybrid rotated shifted multimodal functions. To the best of our knowledge, no variant of BSO has been tested on the complex benchmark functions of CEC2005. Hence, more effective improvements to the original BSO algorithm are still necessary.
Among existing metaheuristics, differential evolution (DE) is a simple yet powerful global optimization technique with successful applications in various areas. To achieve the most satisfactory optimization performance, this paper proposes an improved BSO algorithm with a differential evolution strategy and a new step size method, named BSODE, which can maintain both the exploration ability and the diversity of the population. The improved algorithm is first tested on 14 benchmark functions of CEC 2005 and then applied to train artificial neural networks.
The rest of this paper is organized as follows. Section 2 briefly introduces the original BSO algorithm and the basic DE operator. Section 3 describes the improved BSO with differential evolution strategy (BSODE). Section 4 presents the tests on 14 benchmark functions. The applications for artificial neural network are given in Section 5, followed by conclusions in Section 6.
2. Related Work
2.1. Brain Storm Optimization
The BSO algorithm is motivated by the philosophy of brainstorming, a widely used tool for increasing creativity in organizations that has achieved wide acceptance as a means of facilitating creative thinking. A potential solution in the solution space represents an idea in BSO. BSO follows the rules by which a team interchanges ideas and uses clustering, replacing, and creating operators to approach the global optimum generation by generation. In the procedure of BSO, $N$ ideas are first randomly initialized within the solution space, and each idea is then evaluated according to its fitness function. Next, $M$ cluster center points are randomly selected and initialized in the same way as ideas, where $M$ is less than $N$. As presented in [8, 9], the rest of the process in BSO can be described as follows.
2.1.1. Clustering Individuals
Clustering is a process of grouping similar objects together. During each generation, all $N$ ideas are clustered into $M$ clusters according to idea (or individual) features. The best idea in each cluster is chosen as its cluster center, and the clustering operation can refine a search area. $k$-means is a popular clustering algorithm; herein it is used in the clustering operation.
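As a minimal sketch of the center-selection step only, the snippet below picks the best idea of each cluster once cluster labels are available. It assumes a minimization problem and that the labels come from a separate clustering run; the function name `cluster_centers` and the sample data are hypothetical.

```python
def cluster_centers(labels, fitness):
    """Map each cluster id to the index of its best idea.

    Assumes minimization: the idea with the lowest fitness in a
    cluster becomes that cluster's center.
    """
    centers = {}
    for idx, (cluster_id, value) in enumerate(zip(labels, fitness)):
        best = centers.get(cluster_id)
        if best is None or value < fitness[best]:
            centers[cluster_id] = idx
    return centers


# Five ideas assigned to three clusters (labels are illustrative).
labels = [0, 1, 0, 1, 2]
fitness = [3.0, 0.5, 1.2, 0.9, 4.1]
centers = cluster_centers(labels, fitness)
print(centers)  # {0: 2, 1: 1, 2: 4}
```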
2.1.2. Disrupting Cluster Center
The cluster center disrupting operation, also called the replacing operation, randomly chooses a cluster center and replaces it with a newly generated idea with a probability of p_replace. The value of p_replace thus controls how often a cluster center is replaced by a randomly generated solution. This helps avoid premature convergence and helps individuals “jump out” of local optima.
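The replacing operation can be sketched as follows. This is an illustration under stated assumptions rather than the reference implementation: the new idea is drawn uniformly within hypothetical bounds `lower` and `upper`, and the helper name `disrupt_center` is not from the paper.

```python
import random


def disrupt_center(centers, lower, upper, p_replace, rng=random):
    """Possibly replace one cluster center with a random idea.

    Returns the index of the replaced center, or None if no center
    was replaced in this generation.
    """
    if rng.random() >= p_replace:
        return None  # keep all centers unchanged
    k = rng.randrange(len(centers))  # choose one center uniformly
    centers[k] = [rng.uniform(lower, upper) for _ in centers[k]]
    return k


rng = random.Random(0)
centers = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
replaced = disrupt_center(centers, -5.0, 5.0, p_replace=1.0, rng=rng)
untouched = disrupt_center(centers, -5.0, 5.0, p_replace=0.0, rng=rng)
```

Setting `p_replace` to 1.0 or 0.0 here only forces the two branches for demonstration; in practice it is a small tuning probability.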
2.1.3. Creating Individuals
To maintain the diversity of the population, a new idea (individual) can be generated based on one idea in one cluster or on two ideas in two clusters. In the creating operation, BSO first randomly chooses one cluster or two according to a probability of p_one. Then, depending on whether one cluster or two were chosen, the cluster center or a random idea is selected with a probability of p_one_center or p_two_center, respectively. The selecting operation is defined as

$$X_{\text{selected}} = \begin{cases} X_i, & \text{one cluster}, \\ \text{rand} \times X_{1i} + (1 - \text{rand}) \times X_{2i}, & \text{two clusters}, \end{cases} \tag{1}$$

where $X_i$ is the idea selected from a single cluster, $X_{1i}$ and $X_{2i}$ are the ideas selected from the two clusters, and rand is a random value between 0 and 1. After choosing one idea or two, the selected idea(s) is updated according to

$$X_{\text{new}} = X_{\text{selected}} + \xi \times \text{normrnd}(0, 1), \tag{2}$$

where normrnd is the Gaussian random value with mean 0 and variance 1 and $\xi$ is an adjusting factor that slows the convergence speed down as the evolution goes on, which can be expressed as

$$\xi = \text{logsig}\left(\frac{0.5 \times \text{max\_iteration} - \text{current\_iteration}}{k}\right) \times \text{rand}, \tag{3}$$

where rand is a random value between 0 and 1. The max_iteration and current_iteration denote the maximum number of iterations and the current number of iterations, respectively. The logsig is a logarithmic sigmoid transfer function; this form benefits the global search ability at the beginning of the evolution and enhances the local search ability as the process approaches its end. $k$ is a predefined parameter for changing the slope of the logsig function. The newly created idea is evaluated, and if its fitness value is better than that of the current idea, the old idea is replaced by the new one.
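The creating step described above can be sketched in a few lines. This is a hedged illustration: it assumes a single adjusting factor is drawn per new idea (a per-dimension draw is equally plausible), and the default `k=20.0` and the helper names `combine_two` and `create_idea` are hypothetical choices, not values taken from the paper.

```python
import math
import random


def logsig(x):
    """Numerically stable logistic sigmoid 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def combine_two(idea_1, idea_2, rng=random):
    """Random convex combination of two ideas from two clusters."""
    w = rng.random()
    return [w * a + (1.0 - w) * b for a, b in zip(idea_1, idea_2)]


def create_idea(selected, current_iteration, max_iteration, k=20.0, rng=random):
    """Perturb the selected idea with Gaussian noise scaled by the adjusting factor."""
    xi = logsig((0.5 * max_iteration - current_iteration) / k) * rng.random()
    return [x + xi * rng.gauss(0.0, 1.0) for x in selected]


rng = random.Random(1)
merged = combine_two([1.0, 1.0], [1.0, 1.0], rng=rng)
early = create_idea([0.0, 0.0], current_iteration=0, max_iteration=100, rng=rng)
late = create_idea([1.0, 2.0], current_iteration=100, max_iteration=100, k=1.0, rng=rng)
```

Early in the run the factor is close to the raw random value, so `early` can move far from the selected idea; at the last iteration the sigmoid term is nearly zero, so `late` stays essentially at `[1.0, 2.0]`.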
2.2. Differential Evolution
Differential evolution (DE) is a population-based, stochastic optimization algorithm. Like other EAs, DE begins with an initial population vector containing a number of target individuals. The current generation evolves into the next generation through repeated evolutionary operations until the termination condition is attained. For the original DE, the mutation, crossover, and selection operators are defined as follows.
2.2.1. Mutation Operation
The mutation operation of the classical DE scheme (DE/rand/bin) can be summarized as follows:

$$V_i^G = X_{r_1}^G + F \times \left(X_{r_2}^G - X_{r_3}^G\right), \tag{4}$$

where $G$ is the generation number, the indices $r_1$, $r_2$, and $r_3$ are mutually exclusive integers randomly chosen from the range between 1 and $NP$ (the population size), and $F$ is a mutation scaling factor which affects the differential variation between two individuals.
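A minimal sketch of this mutation follows. It assumes, as is conventional in DE, that the three random indices also differ from the target index `i`; the function name `de_rand_1` and the toy population are illustrative.

```python
import random


def de_rand_1(population, i, F, rng=random):
    """Build a mutant vector from three distinct random individuals.

    The three indices are mutually exclusive and, by the usual DE
    convention (an assumption here), also different from i.
    """
    candidates = [r for r in range(len(population)) if r != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    return [
        a + F * (b - c)
        for a, b, c in zip(population[r1], population[r2], population[r3])
    ]


rng = random.Random(42)
population = [[float(n), float(n) ** 2] for n in range(6)]
mutant = de_rand_1(population, i=0, F=0.5, rng=rng)
base_only = de_rand_1(population, i=0, F=0.0, rng=rng)  # F = 0 returns the base vector
```

The population must hold at least four individuals so that three distinct indices other than `i` exist.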
2.2.2. Crossover Operation
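After the process of mutation, the crossover operation is applied to each pair of target vector $X_i^G$ and mutant vector $V_i^G$ in order to enhance the potential diversity of the population. DE applies a crossover operator on $X_i^G$ and $V_i^G$ to generate the offspring individual $U_i^G$ at the $G$th generation. The crossover operation is defined as

$$U_{i,j}^G = \begin{cases} V_{i,j}^G, & \text{if } \text{rand} \le CR, \\ X_{i,j}^G, & \text{otherwise}, \end{cases} \tag{5}$$

where $j$ indexes the dimensions, rand is a random value between 0 and 1, and CR is a parameter of crossover probability.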
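The crossover can be sketched as below. One detail is an assumption borrowed from standard binomial crossover and is not stated in the text: a randomly chosen dimension `j_rand` always inherits from the mutant, so the trial vector never equals the target exactly. The function name `binomial_crossover` is hypothetical.

```python
import random


def binomial_crossover(target, mutant, CR, rng=random):
    """Mix target and mutant component-wise with crossover probability CR."""
    j_rand = rng.randrange(len(target))  # assumed safeguard from standard DE
    return [
        m if (rng.random() <= CR or j == j_rand) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]


target = [0.0] * 5
mutant = [1.0] * 5
trial_all = binomial_crossover(target, mutant, CR=1.0, rng=random.Random(0))
trial_min = binomial_crossover(target, mutant, CR=0.0, rng=random.Random(0))
```

With `CR=1.0` every component comes from the mutant; with `CR=0.0` only the `j_rand` component does.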
2.2.3. Selection Operation
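Greedy selection is employed by comparing a parent with its corresponding offspring. The selection operation at the $G$th generation is described as

$$X_i^{G+1} = \begin{cases} U_i^G, & \text{if } f\left(U_i^G\right) \le f\left(X_i^G\right), \\ X_i^G, & \text{otherwise}, \end{cases} \tag{6}$$

where $f(U_i^G)$ is the objective function value of the trial vector $U_i^G$.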
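For completeness, the greedy selection reduces to a single comparison. The sketch assumes minimization and uses a sphere function purely as a stand-in objective; both `sphere` and `greedy_select` are illustrative names.

```python
def sphere(x):
    """Stand-in objective: sum of squares (to be minimized)."""
    return sum(v * v for v in x)


def greedy_select(target, trial, f):
    """Keep the trial vector only if it is at least as good as the target."""
    return trial if f(trial) <= f(target) else target


kept = greedy_select([1.0, 1.0], [2.0, 0.0], sphere)   # trial is worse, parent survives
taken = greedy_select([1.0, 1.0], [0.5, 0.5], sphere)  # trial is better, it replaces the parent
```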
3. The Improved Algorithm: BSODE
To achieve better performance, an improved BSO algorithm needs to make use of both global information about the search space and local information about the solutions found so far. The global information can guide the search toward promising areas (exploration), while the local information about solutions is helpful for exploitation. In this paper, we integrate the mutation and crossover scheme of DE into the intracluster and intercluster idea-creating operators of BSO. The key reason for integrating the differential evolution strategy into BSO is that DE relies mainly on distance and direction information and has the advantage of not being biased towards any predefined guide.
Although the differential strategy has already been used to design a BSO variant named Modified BSO (MBSO) and another BSO variant termed Close Loop BSO (CLBSO), our BSODE algorithm differs significantly from MBSO and CLBSO. On one hand, in MBSO and CLBSO, a new idea is created by adding the difference of two distinct random ideas $X_a$ and $X_b$, drawn from all the current ideas, to a current idea $X_{\text{selected}}$ according to

$$X_{\text{new}} = X_{\text{selected}} + \text{rand} \times (X_a - X_b), \tag{7}$$

where rand is a random value between 0 and 1 and the indices $a$ and $b$ are mutually exclusive integers randomly chosen from the range between 1 and $N$. Even though formula (7) is similar to the mutation formula (4) of the DE algorithm, the two differ in essence. In formula (4), the mutation scaling factor $F$ is a real constant between 0 and 2, and the idea behind it is to control the amplification of the differential variation. In formula (7), however, the factor rand represents a random direction scaling, so the scale of the distance is smaller in some cases. In its mutation operator, BSODE follows formula (4) of the DE algorithm strictly. On the other hand, MBSO and CLBSO do not take a crossover operator between ideas into account, whereas BSODE uses the crossover operator of formula (5) to enhance the performance of local exploitation.
Furthermore, BSODE mimics Osborn’s rule that, in human brainstorming, unusual ideas generated from different perspectives and by suspending assumptions are welcome. Following this inspiration, we add the differential evolution strategy to the creation of normal ideas in order to generate more diverse ideas. Accordingly, the differential evolution operators of the BSODE algorithm comprise an intracluster differential evolution operator and an intercluster differential evolution operator, which are described in detail below.
3.1. Differential Evolution Operator
3.1.1. Intracluster Differential Evolution Operator
In the creating operator of BSO, if the selected idea is a normal one in one cluster, which happens according to the probabilities p_one and (1 - p_one_center), we let the idea learn from the differential value of two randomly selected ideas and the cluster center. The differential evolution operator of a normal idea is defined as

$$X_{\text{new}} = \text{Center} + F \times \left(X_{r_1} - X_{r_2}\right), \tag{8}$$

where $F$ is the mutation scaling factor which affects the differential variation between two ideas and the indices $r_1$ and $r_2$ are mutually exclusive integers randomly chosen in the selected cluster. Then, according to formula (5), the crossover operation is used to generate new solutions by shuffling competing vectors and also to increase the diversity of the population.
Figure 1 shows the working process of the intracluster differential evolution operator of a normal idea. The search space of the normal idea includes the area between the two random ideas $X_{r_1}$ and $X_{r_2}$ and the cluster center, and it can be seen that the local exploitation extends within the cluster space.
3.1.2. Intercluster Differential Evolution Operator
In the creating operator of BSO, if the selected ideas are two normal ones in two clusters, which happens according to the probabilities (1 - p_one) and (1 - p_two_center), we let the idea learn from the differential value of the two randomly selected normal ideas in the two clusters and the best idea in all clusters. The differential evolution operator of two normal ideas is defined as

$$X_{\text{new}} = \text{GlobalIdea} + F \times \left(X_{r_1} - X_{r_2}\right), \tag{9}$$

where $F$ is the mutation scaling factor, GlobalIdea is the best idea in all clusters, and $X_{r_1}$ and $X_{r_2}$ denote the normal ideas of cluster 1 and cluster 2. Then, according to formula (5), the crossover operation is used to generate new solutions.
Figure 2 shows the working process of two normal ideas in intercluster differential evolution operator. The search space of the two normal ideas includes the area between global best idea point and two normal ideas, and it can be seen that the global exploration extends between the two clusters.
According to the above analysis, the pseudocode of differential evolution operator about intracluster and intercluster is summarized in Algorithm 1.
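The two operators can be sketched together as follows. This is one reading of the description above, not a transcription of Algorithm 1: it assumes both operators take the DE-style form "base vector plus F times a difference", with the cluster center as the base within a cluster and the global best idea as the base across clusters, and it assumes the crossover partner is the currently selected idea. All function and parameter names are hypothetical.

```python
import random


def de_step(base, x_r1, x_r2, current, F, CR, rng=random):
    """Mutate around `base`, then cross the mutant with `current`."""
    mutant = [b + F * (p - q) for b, p, q in zip(base, x_r1, x_r2)]
    j_rand = rng.randrange(len(current))  # standard DE safeguard (assumption)
    return [
        m if (rng.random() <= CR or j == j_rand) else c
        for j, (c, m) in enumerate(zip(current, mutant))
    ]


def intracluster(cluster, center, current, F, CR, rng=random):
    """Local move: two distinct ideas from one cluster, centered on its center."""
    x_r1, x_r2 = rng.sample(cluster, 2)
    return de_step(center, x_r1, x_r2, current, F, CR, rng)


def intercluster(cluster_a, cluster_b, global_best, current, F, CR, rng=random):
    """Global move: one idea from each cluster, centered on the global best."""
    x_r1 = rng.choice(cluster_a)
    x_r2 = rng.choice(cluster_b)
    return de_step(global_best, x_r1, x_r2, current, F, CR, rng)


rng = random.Random(7)
cluster_a = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]]
cluster_b = [[4.0, 4.0], [5.0, 4.5]]
local_idea = intracluster(cluster_a, cluster_a[0], cluster_a[1], F=0.5, CR=0.9, rng=rng)
global_idea = intercluster(cluster_a, cluster_b, [0.0, 0.0], cluster_a[1], F=0.5, CR=0.9, rng=rng)
```

Because the difference vector in the intracluster case spans a single cluster, its steps stay short, while the intercluster difference spans two clusters and produces longer, more exploratory steps.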
3.2. A New Step Size Method
To adjust the convergence speed as the evolution goes on in idea generation, the original BSO algorithm defines an adjusting factor described by formula (3). Figure 3(a) shows this adjusting factor, which controls the scale of the step. We can observe that at first the adjusting factor stays around 1, whereas once half the number of generations has been reached, it rapidly drops to near 0. This way of controlling the step size can balance exploration and exploitation at different searching generations, but it takes effect only over a very short interval. Hence, we introduce a simple new step size method, which is shown in Figure 3(b). The dynamic mutation function is defined in terms of rand, a random value between 0 and 1, and of max_iteration and current_iteration, which denote the maximum number of iterations and the current number of iterations, respectively.
Figure 3: (a) The curve of the adjusting factor in BSO. (b) The new step size adjusting factor in BSODE.
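To make the "very short interval" concrete, the sketch below evaluates only the deterministic logsig part of the original adjusting factor (without the rand multiplier) over a run and counts the iterations in which it lies strictly between 0.01 and 0.99. The settings `max_iteration = 1000` and `k = 20.0` and the two thresholds are illustrative assumptions, not values reported in the paper.

```python
import math


def logsig(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def envelope(current_iteration, max_iteration, k):
    """Deterministic part of the original BSO adjusting factor."""
    return logsig((0.5 * max_iteration - current_iteration) / k)


max_iteration, k = 1000, 20.0  # assumed settings for illustration
values = [envelope(t, max_iteration, k) for t in range(max_iteration + 1)]

# Iterations where the factor is neither saturated near 1 nor near 0.
window = sum(1 for v in values if 0.01 < v < 0.99)
fraction = window / len(values)
print(f"transition band covers {fraction:.1%} of the run")
```

Under these assumed settings the factor is effectively constant for most of the run, and the transition band shrinks further as `max_iteration` grows relative to `k`.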
3.3. Flowchart and Pseudocode of the BSODE
As previously analyzed, the complete flowchart of the BSODE algorithm is shown in Figure 4.
The intercluster differential evolution operator improves the global search capability and helps avoid convergence to local minima. At the same time, the intracluster differential evolution operator extends the capability of local exploitation and accelerates convergence. The pseudocode of BSODE is summarized in Algorithm 2.
4. Benchmark Functions Optimization Problem
The numerical benchmark functions are used for comparing the performance and accuracy of our proposed BSODE algorithm with the original BSO algorithm and other variants of DE in this section. For a fair comparison, all the experiments are conducted on the same machine with an Intel 3.4 GHz CPU, 4 GB memory. The operating system is Windows 7 with MATLAB 8.0 (R2012b). For the purpose of reducing statistical errors, each function is independently run 25 times.
4.1. Benchmark Functions
We choose 14 widely known rotated or shifted benchmark functions used in CEC2005, which are given in Table 1. All functions are tested on 30 dimensions. The results obtained are compared with those of some other variants of DE algorithms defined in the next subsection. The searching range and true optima for all functions are also given in Table 1. Among the 14 benchmark functions, $F_1$ to $F_5$ are shifted unimodal functions, $F_6$ to $F_{12}$ are shifted or rotated multimodal functions, and $F_{13}$ to $F_{14}$ are expanded shifted or rotated functions.
4.2. Parameter Settings of Comparative Algorithms
Because the BSODE algorithm combines the original BSO and the differential evolution strategy, we specifically compare the performance of BSODE with DE, the original BSO, and PSO. In addition, we compare the performance of BSODE with two other DE variants, namely, CoDE and SaDE. To reduce the influence of statistical errors, each problem function is independently run 25 times, which is a prescribed evaluation criterion in CEC2005. For all approaches, the population size is set to 30. The same stopping criterion is used in all algorithms, that is, reaching a certain number of iterations or function evaluations (FEs). In our experiments, we run all algorithms on the benchmark functions using the same budget of $10^4 \times D$ FEs for a fair comparison, where $D$ is the problem dimension. The parameter settings of all the algorithms are given in Table 2.
4.3. The Contribution of the New Step Size Method to BSODE
To verify the rationality and effectiveness of the new step size in BSODE and to fully understand its effect, we investigate the contribution of the new step size to BSODE. For consistency with the rest of the paper, we compare all 14 benchmark functions with 25 independent runs. We test BSO, BSO with the new step size (BSO-newstep), BSODE, and BSODE with the original adjusting factor, that is, without the new step size (BSODE-nonewstep). The experimental results are presented in Table 3.
Table 3 shows that BSO-newstep performs better than the original BSO and that BSODE performs better than BSODE-nonewstep, from which it can be concluded that the new step size plays a significant role in the BSODE algorithm. This is in good agreement with our analysis in Section 3.2 that the new step size control can balance exploration and exploitation at different searching generations.
4.4. Comparison of Results
4.4.1. Comparisons on Solution Accuracy
The results on solution accuracy are given in Table 4 in terms of the mean optimum solution and the standard deviation of the solutions obtained by each algorithm in the 25 independent runs over 300,000 FEs on the 14 benchmark functions. In all experiments, the dimension of every problem is 30. In each row of the table, the mean value is listed first and the standard deviation last, and the two are separated by the symbol “±”. The best results among the algorithms are displayed in bold.