Research Article | Open Access
Parallel and Cooperative Particle Swarm Optimizer for Multimodal Problems
Although the original particle swarm optimizer (PSO) and its variants are reasonably effective on optimization problems, they are easily trapped in local optima, especially on complex multimodal problems. To address this issue, this paper puts forward a novel method called the parallel and cooperative particle swarm optimizer (PCPSO). When the elements of the $D$-dimensional function vector $X = [x_1, x_2, \ldots, x_d, \ldots, x_D]$ do not interact, the cooperative particle swarm optimizer (CPSO) is used. Building on this, PCPSO is presented to solve real problems. Because the $D$-dimensional search space of a real problem cannot be split into several lower dimensional search spaces, owing to the interaction among the elements, PCPSO exploits the cooperation of two parallel CPSO algorithms through orthogonal experimental design (OED) learning. First, the CPSO algorithm is used to generate two locally optimal vectors separately; then OED is used to learn the merits of these two vectors and to create a better combination of them as the starting point for further search. Experimental studies on a set of test functions show that PCPSO is more robust and converges much closer to the global optimum than several peer algorithms.
Inspired by social and cognitive behavior, Kennedy and Eberhart [1, 2] proposed the particle swarm optimizer (PSO), a population-based iterative learning algorithm that searches for an optimal value. Owing to its simple implementation and effective search ability, PSO has been widely used in feature selection, robot path planning, data processing, and other problems. However, many experiments have shown that PSO is easily trapped in local optima, especially on complex multimodal optimization problems, so better optimization algorithms are still needed for complex real-world engineering problems. In general, the unconstrained optimization problems considered here can be formulated as a $D$-dimensional minimization problem:
$$\min f(X), \quad X = [x_1, x_2, \ldots, x_d, \ldots, x_D], \tag{1}$$
where $X$ is the vector to be optimized and $D$ is the number of parameters.
In PSO, a member of the swarm, called a particle, represents a potential solution, which is a point in the search space. Let $N$ denote the size of the swarm. The current state of particle $i$ is represented by its position vector $X_i = [x_i^1, x_i^2, \ldots, x_i^D]$, and its movement is represented by the velocity vector $V_i = [v_i^1, v_i^2, \ldots, v_i^D]$, where $i = 1, 2, \ldots, N$ is a positive integer indexing the particles in the swarm. With $t$ denoting the iteration number, the velocity and the position are updated as follows:
$$v_i^d(t+1) = \omega v_i^d(t) + c_1 r_1^d \left(pBest_i^d - x_i^d(t)\right) + c_2 r_2^d \left(nBest^d - x_i^d(t)\right),$$
$$x_i^d(t+1) = x_i^d(t) + v_i^d(t+1), \tag{2}$$
where the inertia weight $\omega$ determines how much of the previous velocity is preserved. A large inertia weight favors global exploration and a small one favors local exploitation. $c_1$ and $c_2$ denote the acceleration constants, which are usually set to 2.0 or adaptively controlled. $r_1^d$ and $r_2^d$ are random numbers generated between 0 and 1 for the $d$th dimension of the $i$th particle. $pBest_i$ is the best position found so far by particle $i$, and $nBest$ is the best position found so far in particle $i$'s neighborhood, which is defined by either the global best ($gBest$) model or the local best ($lBest$) model. The $gBest$ model, which is inclined to exploitation, converges faster but has a higher probability of getting stuck in a local optimum than the $lBest$ model. Conversely, the $lBest$ model, which focuses more on exploration, is less vulnerable to the attraction of local optima but converges more slowly than the $gBest$ model. The position vector and the velocity vector are initialized randomly and are updated by (2) generation by generation until some stopping criteria are met.
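The update rule in (2) can be illustrated with a minimal Python sketch. It assumes the global-best neighborhood model, a linearly decreasing inertia weight, and a velocity clamp; the function name `pso_minimize`, the clamp, and all default values are illustrative assumptions rather than settings taken from this paper.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, iters=200,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Global-best PSO sketch; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.5 * (hi - lo)  # velocity clamp: an assumption, not from the text
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(iters):
        # inertia weight decreases linearly from w_start to w_end
        w = w_start - (w_start - w_end) * t / max(iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(x):
    return sum(xi * xi for xi in x)


best_x, best_val = pso_minimize(sphere, dim=5, bounds=(-100.0, 100.0))
```

Replacing `gbest` with the best position in a restricted neighborhood of particle `i` would give the local-best model discussed above.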
Many modified versions of PSO have been reported in the literature. Most studies improve the performance of PSO in one of four ways: population topology [9–11], diversity maintenance [6, 12, 13], hybridization with auxiliary operations [14–16], and adaptive PSO [8, 17]. However, it is difficult to find a robust solution because high dimensional problems are complex and have a large search space. Researchers have therefore tried to split the large search space into smaller ones in order to simplify the problem. In one earlier approach based on a genetic algorithm, the search space is divided by splitting the solution vector into several smaller vectors. Each smaller search space is optimized separately, and the fitness function is evaluated by combining the solutions found in all the smaller search spaces. The same technique is applied to PSO in cooperative PSO (CPSO_H), which uses several subswarms to search the elements of the $D$-dimensional function vector $X = [x_1, x_2, \ldots, x_D]$ separately. The search results are integrated by a global swarm to improve the performance of the original PSO on high dimensional problems. Compared with traditional PSO, CPSO_H performs significantly better in terms of solution quality and robustness, and its advantage grows as the dimension of the problem increases. However, the efficiency of the algorithm is strongly affected by the degree of interaction among the elements of the function vector $X$.
Inspired by these studies, this paper proposes a novel algorithm called parallel and cooperative PSO (PCPSO). PCPSO tries to overcome the influence of the interaction among vector elements while using a similar splitting method. Assuming that there is no interaction among the elements of the vector $X$, CPSO is used. Although CPSO on its own is of little use for real problems, it provides a framework for PCPSO. When the vector elements interact strongly, CPSO can still obtain a local optimum. To jump out of the local optimum, orthogonal experimental design (OED) is used to learn from two locally optimal vectors obtained by CPSO. A better combination of these two vectors can then be obtained to push the search closer to the global optimum.
The rest of this paper is organized as follows. Section 2 describes the PCPSO algorithm. In Section 3, benchmark test functions are used to test the PCPSO algorithm, and the results are compared with those of peer algorithms from the literature to verify its effectiveness and robustness. Final conclusions are given in Section 4.
2. Parallel and Cooperative Particle Swarm Optimizer
2.1. Cooperative Particle Swarm Optimizer
Among the PSO variants in the literature, premature convergence is still the main deficiency, especially on high dimensional problems with highly correlated variables. Particles find it difficult to jump out of a local optimum because doing so requires changing several elements of the vector $X$ simultaneously, whereas PSO variants can usually change only one element at a time when stuck at a local optimum. Although the OED method has been used to combine the best elements of the vector $X$ into an exemplar, the function value usually gets bigger because of the high correlation. Here, we first introduce the CPSO from the literature; PCPSO is then proposed on the basis of CPSO. OED is used to learn from two locally optimal vectors obtained by CPSO and to create a new vector, which is treated as the starting point for further search.
The cooperative PSO literature describes two cooperative PSO algorithms, CPSO_$S_K$ and CPSO_$H_K$. In CPSO_$S_K$, the $D$-dimensional search space is decomposed into $K$ subcomponents, each corresponding to a swarm of $s$ dimensions (where $D = K \times s$). CPSO_$H_K$ is a hybrid method combining a standard PSO with CPSO_$S_K$. The $K$ subcomponents are meant to group interacting elements together. However, the interacting elements are not known in real problems. We therefore simplify the CPSO_$S_K$ algorithm to CPSO, in which the $D$-dimensional search space is decomposed into $D$ one-dimensional subcomponents. Algorithm 1 shows the CPSO algorithm. To evaluate the fitness of a particle in a swarm, the context vector is used, which is a concatenation of the global best particles from all swarms. The $j$th particle in the $d$th swarm is evaluated by calling the function $b(d, \cdot)$, which returns a $D$-dimensional vector equal to the context vector with its $d$th component replaced by the position of that particle.
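The following Python sketch illustrates the simplified CPSO described above: one one-dimensional swarm per dimension, with fitness evaluated through a context vector. It is not a transcription of Algorithm 1. The names `cpso` and `b`, the acceleration constants of 1.49, the velocity clamp, and the choice to re-evaluate each personal best in the current context are assumptions made for illustration.

```python
import random


def cpso(f, dim, bounds, context=None, n_particles=10, iters=80,
         c1=1.49, c2=1.49, seed=0):
    """Simplified CPSO sketch; returns (context_vector, function_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.5 * (hi - lo)  # velocity clamp: an assumption
    # x[d][j] is the scalar position of particle j in the swarm of dimension d
    x = [[rng.uniform(lo, hi) for _ in range(n_particles)] for _ in range(dim)]
    v = [[0.0] * n_particles for _ in range(dim)]
    pbest = [row[:] for row in x]
    # the context vector holds the current best value of every dimension
    context = [x[d][0] for d in range(dim)] if context is None else list(context)
    context_val = f(context)

    def b(d, value):
        """Context vector with its d-th component replaced by `value`."""
        trial = context[:]
        trial[d] = value
        return trial

    for t in range(iters):
        w = 0.9 - 0.5 * t / max(iters - 1, 1)  # 0.9 -> 0.4, linearly
        for d in range(dim):
            for j in range(n_particles):
                # personal bests are compared within the current context
                if f(b(d, x[d][j])) < f(b(d, pbest[d][j])):
                    pbest[d][j] = x[d][j]
                cand = f(b(d, pbest[d][j]))
                if cand < context_val:
                    context[d], context_val = pbest[d][j], cand
            for j in range(n_particles):
                r1, r2 = rng.random(), rng.random()
                v[d][j] = (w * v[d][j]
                           + c1 * r1 * (pbest[d][j] - x[d][j])
                           + c2 * r2 * (context[d] - x[d][j]))
                v[d][j] = max(-v_max, min(v_max, v[d][j]))
                x[d][j] = max(lo, min(hi, x[d][j] + v[d][j]))
    return context, context_val


def sphere(x):
    return sum(xi * xi for xi in x)


ctx, ctx_val = cpso(sphere, dim=5, bounds=(-100.0, 100.0))
```

Because only one component of the context vector changes at a time, this scheme suits separable functions and can stall on functions whose elements interact.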
2.2. Parallel and Cooperative Particle Swarm Optimizer Algorithm
Since the CPSO method uses a one-dimensional swarm to search each dimension, interacting elements of the solution vector cannot be changed simultaneously, and the search falls into local optima more easily. If a CPSO algorithm (called First1D) is used to solve a highly correlated problem, its context vector $gBest_1$, started from random values, will fall into a locally optimal vector. When another CPSO algorithm (called Second1D), which has no relationship with First1D, is used to solve the same problem, another locally optimal vector $gBest_2$ will be obtained. Since the performance of CPSO depends on the numerical distribution of the randomly initialized context vector, the independent vectors $gBest_1$ and $gBest_2$, both containing good information for the optimum search, will fall into different locally optimal vectors. A method is therefore needed to extract the good information from $gBest_1$ and $gBest_2$ and form a new vector $gBest_{new}$ that is closer to the global optimum. Exhaustively testing all combinations of the elements of $gBest_1$ and $gBest_2$ for the new vector $gBest_{new}$ would take $2^D$ trials, which is unrealistic in practice. With the help of OED [15, 16, 23, 24], a relatively good vector $gBest_{new}$ is obtained from $gBest_1$ and $gBest_2$ using only a few experimental tests.
OED, which combines the orthogonal array (OA) with factor analysis, makes it possible to discover the best combination of levels for different factors with only a small number of experimental samples [23, 24]. Here we briefly discuss OED using an example with two vectors $gBest_1$ and $gBest_2$. To simplify the problem, assume $D = 3$ and let the optimization problem be to minimize the Sphere function $f(X) = x_1^2 + x_2^2 + x_3^2$, with $gBest_1$ and $gBest_2$ derived from First1D and Second1D. (In practice, First1D and Second1D would obtain the same globally optimal vector because the test function is so simple; the example is meant only to explain the OED theory.) The whole analysis of using OED for the Sphere function is shown in Table 1.
An OA is a predefined table of numbers arranged in rows and columns. In Table 1, there are three factors $x_1$, $x_2$, and $x_3$ that affect the function value, and two levels, $gBest_1$ and $gBest_2$, to choose from for each factor. Each row gives the levels of the factors in one combination, and each column corresponds to one factor whose level changes from combination to combination [15, 24]. A program for generating the OA can be found in the literature. Factor analysis evaluates the effect of each individual factor from the function values and determines the best level for each factor to form a new vector $gBest_{new}$. Let $f_m$ denote the function value of combination $m$ ($m = 1, 2, \ldots, M$), where $M$ is the total number of combinations. The main effect $S_{dq}$ of factor $d$ with level $q$ is expressed as
$$S_{dq} = \frac{\sum_{m=1}^{M} f_m z_{mdq}}{\sum_{m=1}^{M} z_{mdq}}, \tag{3}$$
where $d = 1, 2, \ldots, D$ and $q = 1, 2$. If factor $d$ takes level $q$ in combination $m$, then $z_{mdq} = 1$; otherwise $z_{mdq} = 0$. In Table 1, we first calculate the effect $S_{11}$ of level 1 in factor $x_1$ and then the effect $S_{12}$ of level 2 in factor $x_1$. Since this is a minimization problem, level 1 is the better choice, because its effect value is less than that of level 2 in Table 1. As the example in Table 1 shows, OED can obtain the best combination even though this new vector is not among the four combinations tested.
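The factor analysis can be traced with a small Python sketch on the three-dimensional Sphere function, using the standard four-row, two-level orthogonal array for three factors. The two input vectors are hypothetical values chosen for illustration and are not the entries of Table 1; the function name `oed_combine` is likewise an assumption.

```python
# L4(2^3) orthogonal array: 4 combinations (rows) x 3 factors (columns).
# Entry 0 selects the element from the first vector, 1 from the second.
L4 = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]


def sphere(x):
    return sum(xi * xi for xi in x)


def oed_combine(f, vec_a, vec_b, oa):
    """Build a new vector from vec_a and vec_b by OA sampling and factor analysis."""
    levels = (vec_a, vec_b)
    # one trial vector per OA row
    combos = [[levels[q][d] for d, q in enumerate(row)] for row in oa]
    values = [f(c) for c in combos]
    new_vec = []
    for d in range(len(vec_a)):
        effects = []
        for q in (0, 1):
            # main effect: mean function value over rows where factor d has level q
            picked = [values[m] for m, row in enumerate(oa) if row[d] == q]
            effects.append(sum(picked) / len(picked))
        best_level = 0 if effects[0] <= effects[1] else 1  # minimization
        new_vec.append(levels[best_level][d])
    return new_vec, combos, values


vec_a = [2.0, 0.0, 1.0]  # hypothetical result of First1D
vec_b = [0.0, 1.0, 3.0]  # hypothetical result of Second1D
new_vec, combos, values = oed_combine(sphere, vec_a, vec_b, L4)
print(values)   # [5.0, 14.0, 9.0, 2.0]
print(new_vec)  # [0.0, 0.0, 1.0]
```

With these inputs the predicted vector has function value 1.0, lower than any of the four tested combinations, even though it was never evaluated during the OA trials.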
OED works well when the interaction among vector elements is small. When the OED method is applied to functions whose vector elements interact strongly, the function value of the new vector $gBest_{new}$ may be bigger than the function values of both $gBest_1$ and $gBest_2$. However, using OED on highly correlated problems still has two advantages. One is that the particle with the new vector $gBest_{new}$ jumps out of the locally optimal vector; the other is that each element of the new vector, chosen between $gBest_1$ and $gBest_2$, is probably closer to the globally optimal vector. Therefore, we use the new vector $gBest_{new}$ as the context vector and repeat the CPSO algorithm.
Based on the analysis above, we propose the PCPSO algorithm to overcome the problem of falling into local optima. Figure 1 shows the flowchart of PCPSO. First, two parallel CPSO algorithms (First1D and Second1D) are run; then OED learning is applied to form the new vector $gBest_{new}$, which is assigned to $gBest_2$, while the better of the vectors $gBest_1$ and $gBest_2$ is assigned to the new $gBest_1$. The new vectors $gBest_1$ and $gBest_2$, which are treated as context vectors in the CPSO algorithm, keep the search going. To illustrate the efficiency of the PCPSO algorithm, Figure 2 shows its convergence characteristic on Rotated Ackley's function, which is described in detail in Section 3. The parameter settings for First1D and Second1D are the same as for the CPSO_H algorithm: $\omega$ decreases linearly from 0.9 to 0.4, $N$ stands for the particle number, and $D$ represents the number of dimensions. Assuming that First1D and Second1D need 80 CPSO iterations to reach locally optimal vectors, each followed by an OED computation, and that there are 6 OED computations, the total number of CPSO iterations is $6 \times 80 = 480$. Figure 2 shows that the function value drops sharply in the first 80 iterations and that First1D and Second1D perform similarly. For the next 80 iterations, the better of the vectors $gBest_1$ and $gBest_2$ is assigned to $gBest_1$ for another First1D run, and the new vector $gBest_{new}$ is assigned to $gBest_2$ for another Second1D run. The performance of First1D and Second1D differs significantly in iterations 80 to 240. First1D gets trapped in a locally optimal vector, whereas Second1D still makes the function value drop sharply even though it starts from a higher value at the 81st iteration. Since First1D has almost lost its search ability after 80 iterations, First1D can also be run only for the first 80 iterations in order to save computational cost.
Vector $gBest_1$ is used to save the best vector found so far and is combined with vector $gBest_2$ by the OED method to form the new vector $gBest_{new}$ after every 80 iterations. These operations are repeated generation by generation until some stopping criteria are met.
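The overall loop can be sketched in Python as follows. This is a compact illustration under stated assumptions, not the implementation of Figure 1: the helper names (`two_level_oa`, `oed_combine`, `cpso`, `pcpso`), the acceleration constants of 1.49, the bitwise construction of the two-level orthogonal array, and the four-dimensional Rosenbrock example are all assumptions.

```python
import random


def two_level_oa(n_factors):
    """Two-level orthogonal array with M = 2^u >= n_factors + 1 rows."""
    m = 2
    while m - 1 < n_factors:
        m *= 2
    return [[bin(r & c).count("1") % 2 for c in range(1, n_factors + 1)]
            for r in range(m)]


def oed_combine(f, vec_a, vec_b):
    """Pick, per dimension, the level with the smaller main effect."""
    levels, oa = (vec_a, vec_b), two_level_oa(len(vec_a))
    values = [f([levels[q][d] for d, q in enumerate(row)]) for row in oa]
    new_vec = []
    for d in range(len(vec_a)):
        mean = []
        for q in (0, 1):
            picked = [val for val, row in zip(values, oa) if row[d] == q]
            mean.append(sum(picked) / len(picked))
        new_vec.append(levels[0][d] if mean[0] <= mean[1] else levels[1][d])
    return new_vec


def cpso(f, context, bounds, rng, n_particles=10, iters=80):
    """Simplified CPSO started from `context`; returns (vector, value)."""
    lo, hi = bounds
    context = list(context)
    dim, best = len(context), f(context)
    x = [[rng.uniform(lo, hi) for _ in range(n_particles)] for _ in range(dim)]
    v = [[0.0] * n_particles for _ in range(dim)]
    pb = [row[:] for row in x]

    def b(d, value):
        trial = context[:]
        trial[d] = value
        return trial

    for t in range(iters):
        w = 0.9 - 0.5 * t / max(iters - 1, 1)
        for d in range(dim):
            for j in range(n_particles):
                if f(b(d, x[d][j])) < f(b(d, pb[d][j])):
                    pb[d][j] = x[d][j]
                cand = f(b(d, pb[d][j]))
                if cand < best:
                    context[d], best = pb[d][j], cand
                r1, r2 = rng.random(), rng.random()
                v[d][j] = (w * v[d][j] + 1.49 * r1 * (pb[d][j] - x[d][j])
                           + 1.49 * r2 * (context[d] - x[d][j]))
                x[d][j] = max(lo, min(hi, x[d][j] + v[d][j]))
    return context, best


def pcpso(f, dim, bounds, rounds=6, seed=0):
    """Two cooperating CPSO runs joined by OED; returns (vector, value, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    g1 = [rng.uniform(lo, hi) for _ in range(dim)]
    g2 = [rng.uniform(lo, hi) for _ in range(dim)]
    history = []
    for _ in range(rounds):
        g1, f1 = cpso(f, g1, bounds, rng)   # First1D
        g2, f2 = cpso(f, g2, bounds, rng)   # Second1D
        new_vec = oed_combine(f, g1, g2)    # learn from both vectors
        if f2 < f1:
            g1, f1 = g2, f2                 # g1 keeps the better vector
        g2 = new_vec                        # OED result restarts Second1D
        history.append(f1)
    return g1, f1, history


def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))


best_vec, best_val, history = pcpso(rosenbrock, dim=4, bounds=(-2.048, 2.048))
```

Here `history` records the value of the saved best vector after each round; it can only stay level or decrease, while the vector restarted from the OED result is allowed to begin a round at a higher function value.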
Compared with other PSO algorithms that use OED [15, 16], PCPSO needs some extra computation to update the vector $gBest_1$. However, the cooperation between the local vectors $gBest_1$ and $gBest_2$ makes the search more effective. In earlier work, the OED result is set as an exemplar for the other particles to learn from. Figure 2 shows that the function value after OED is sometimes even bigger. Another limitation of that approach is that the search is easily trapped in a local optimum because the two vectors used for OED learning are similar. Based on the principle that each element of the function vector moves closer to the globally optimal vector after OED, the new vector jumps out of the local optimum and keeps searching further, even though its function value at the first iteration is higher than that of the corresponding local optimum.
3. Experimental Results and Discussions
3.1. Test Functions
Test functions, including unimodal and multimodal functions, are used to investigate the performance of the PSO algorithms. We choose 15 test functions in 30 dimensions. The properties of these functions are presented below.
Group A: unimodal and simple multimodal functions are as follows:
(1) Sphere function
(2) Rosenbrock's function
(3) Griewank's function
(4) Rastrigin's function
(5) Ackley's function
(6) Weierstrass function
(7) Noncontinuous Rastrigin's function
(8) Schwefel's function
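For orientation, five of the Group A functions can be written in Python in their commonly used textbook forms. These definitions are an assumption: they follow the standard benchmark literature and may differ in constants or scaling from the exact formulas used in the experiments.

```python
import math


def sphere(x):
    return sum(xi * xi for xi in x)


def rosenbrock(x):
    # minimum 0 at x = [1, ..., 1]; neighboring variables interact
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))


def griewank(x):
    total = sum(xi * xi for xi in x) / 4000.0
    prod = 1.0
    for i, xi in enumerate(x, start=1):
        prod *= math.cos(xi / math.sqrt(i))
    return total - prod + 1.0


def rastrigin(x):
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


def ackley(x):
    n = len(x)
    mean_sq = sum(xi * xi for xi in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)


zeros = [0.0] * 30
ones = [1.0] * 30
optimum_values = {
    "sphere": sphere(zeros),
    "rosenbrock": rosenbrock(ones),
    "griewank": griewank(zeros),
    "rastrigin": rastrigin(zeros),
    "ackley": ackley(zeros),
}
```

Each entry of `optimum_values` evaluates the function at its known global minimizer, which gives a quick sanity check before running an optimizer on it.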
Group B: Rotated and shifted multimodal functions are as follows.
In Group A, the functions can be divided into smaller search spaces because the interaction among the elements is low. Therefore, the CPSO algorithm is suitable for optimizing these functions. In Group B, seven rotated multimodal functions are generated with an orthogonal matrix $M$ to test the performance of the PCPSO algorithm. The new rotated vector $y = Mx$, obtained by left-multiplying the original vector $x$ by the orthogonal matrix $M$, behaves like a vector with highly interacting elements, because all elements of $y$ change once one element of $x$ changes. The generation of the orthogonal matrix is described in detail in the literature. One shifted and rotated function is also listed in Group B to test the performance of the PSO algorithms. Table 2 shows the globally optimal vector $X^*$, the corresponding function value $f(X^*)$, the search range, and the initialization range of each function. The Group B functions are as follows:
(9) Rotated Ackley's function
(10) Rotated Rastrigin's function
(11) Rotated Griewank's function
(12) Rotated Weierstrass function
(13) Rotated noncontinuous Rastrigin's function
(14) Rotated Schwefel's function
(15) Rotated and Shifted Rastrigin's function, where $o$ is the shifted globally optimal vector.
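The rotation can be illustrated with a short Python sketch. It builds a random orthogonal matrix by Gram-Schmidt orthonormalization of Gaussian vectors, which is an assumption made for illustration and not necessarily the generation method cited above; the helper names `random_orthogonal`, `rotate`, and `make_rotated` are hypothetical.

```python
import math
import random


def random_orthogonal(dim, seed=0):
    """Random orthogonal matrix (list of orthonormal rows) via Gram-Schmidt."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for u in rows:
            # remove the component of vec along each accepted row
            dot = sum(a * b for a, b in zip(vec, u))
            vec = [a - dot * b for a, b in zip(vec, u)]
        norm = math.sqrt(sum(a * a for a in vec))
        if norm > 1e-10:  # skip the (unlikely) degenerate draw
            rows.append([a / norm for a in vec])
    return rows


def rotate(m, x):
    """Return y = M x."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


def make_rotated(f, m):
    """Wrap f so that it is evaluated on the rotated vector y = M x."""
    return lambda x: f(rotate(m, x))


def rastrigin(x):
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


m = random_orthogonal(5)
rotated_rastrigin = make_rotated(rastrigin, m)
x = [1.0, 0.0, 0.0, 0.0, 0.0]
y = rotate(m, x)  # a change in x[0] alone spreads over every element of y
```

Since $M$ is orthogonal, the rotation preserves the Euclidean norm of $x$ and the location of an optimum at the origin, but the rotated function can no longer be minimized one coordinate at a time.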
3.2. PCPSO’s Performance on 15 Functions
The performance of PCPSO is tested on the 15 functions, which include multimodal, rotated, and shifted functions, in 30 dimensions. The parameter settings for the PCPSO algorithm are as follows: $\omega$ decreases linearly from 0.9 to 0.4, and the particle number is $N = 10$. Since the vectors of the Group A functions (1)–(8) have weakly interacting elements, Figure 3 shows the convergence characteristics of the PCPSO algorithm on these functions over 30 iterations, where the $y$-axis is on a log scale. Figure 3 shows that all the function values drop significantly within 30 iterations except for Rosenbrock's function. To test PCPSO's performance on Rosenbrock's function more fully, we compare PCPSO with the CPSO algorithm in Figure 4. The upper plot of Figure 4 shows the convergence characteristic of PCPSO, in which the two independent algorithms (First1D and Second1D) cooperate through OED learning; the lower plot shows the convergence characteristic of the CPSO algorithm. Figure 4 shows that OED learning makes the function vector jump out of the locally optimal vector and eventually lowers the function value. The convergence characteristics for the remaining six test functions are shown in Figure 5. All six are rotated, multimodal, nonseparable functions. As in Figure 2, although OED learning sometimes increases the function value of Second1D at the starting point, the value drops sharply as the iterations proceed. To sum up, PCPSO works well at escaping locally optimal vectors and lowering the function value sharply. OED learning finds the best combination of the two locally optimal vectors $gBest_1$ and $gBest_2$ at iterations 80, 160, and 240. However, the vector can still fall into a locally optimal vector on some functions; the reason is that once the vectors $gBest_1$ and $gBest_2$ become equal to each other, the OED learning ability is lost.
3.3. Comparison with Other PSOs and Discussions
Seven PSO variants are taken from the literature for comparison on the 15 test functions in 30 dimensions. In the following, we briefly describe the peer algorithms shown in Table 3. The first algorithm is the standard PSO (SPSO), whose performance is improved by using a random topology. The second, fully informed PSO (FIPS), uses all the neighboring particles to influence the flying velocity. The third, orthogonal PSO (OPSO), aims to generate a better position by using OED. The fourth, fitness-distance-ratio PSO (FDR_PSO), addresses premature convergence by using the fitness distance ratio. The fifth, comprehensive learning PSO (CLPSO), is proposed for solving multimodal problems. The sixth, cooperative PSO (CPSO_H), uses one-dimensional swarms to search each dimension separately. The seventh, orthogonal learning PSO (OLPSO), guides particles to fly towards an exemplar that is constructed by the OED method from a particle's own best position and its neighborhood's best position.
The swarm size is set to 40 for the seven PSO variants mentioned above and to 10 for the PCPSO algorithm. The maximum number of function evaluations (FEs) for the eight algorithms is set to 200,000 in each run on each test function. All functions were run 25 times, and the means and standard deviations of the results are presented in Table 4. The best results are shown in boldface. From the results, we observe that CPSO_H, CLPSO, and OLPSO perform well on the first 8 functions. The reason is that CPSO_H is well suited to one-dimensional search, while CLPSO and OLPSO can search further by using their own learning strategies. The learning strategy in CLPSO prevents falling into local optima by random learning, while the strategy in OLPSO is to find a better combination of locally optimal vectors. The PCPSO method can keep searching through the cooperation of the two locally optimal vectors $gBest_1$ and $gBest_2$. With the implementation of Second1D, the algorithm can always find a better vector than the previous best one by using the OED method. The PCPSO method performs well on four of the test functions (see Table 4). However, on two functions the value obtained by this method is not close enough to the globally optimal value, and premature convergence still happens because of a lack of vector diversity. The advantages of this method are its rapid convergence and its robustness: whatever the test function is, the algorithm can find an acceptable optimum, especially on complex functions. The nonparametric Wilcoxon rank sum test is reported in Table 5 to determine whether the results obtained by PCPSO are statistically different from the best results achieved by the other algorithms. A value of one indicates that the performance of the two algorithms is statistically different with 95% certainty, whereas a value of zero implies that the performance is not statistically different. Among the 15 functions, there are 10 on which the best results and the results of PCPSO are statistically different.
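The one/zero indicator described above can be reproduced with a small Python sketch of the two-sided Wilcoxon rank sum test. It uses the normal approximation with average ranks for ties and no tie correction of the variance, which is an assumption about how the test was carried out; the function name `rank_sum_test` and the sample data are hypothetical and are not results from Table 4.

```python
import math


def rank_sum_test(a, b, alpha=0.05):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (h, p): h is 1 if the samples differ at level alpha, else 0.
    """
    pooled = sorted((value, group) for group, sample in enumerate((a, b))
                    for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average = (i + j) / 2 + 1  # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = average
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return (1 if p < alpha else 0), p


# Hypothetical final function values from 25 runs of two algorithms.
runs_a = [0.01 * i for i in range(25)]
runs_b = [1.0 + 0.01 * i for i in range(25)]
h_different, p_different = rank_sum_test(runs_a, runs_b)
h_same, p_same = rank_sum_test(runs_a, runs_a)
```

With 25 runs per algorithm, the normal approximation is usually considered adequate; an exact test would matter mainly for much smaller samples.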
Tables 4 and 5 show that PCPSO obtains the best results on 5 functions and that these results are statistically different from those of the other algorithms. In addition, most of the functions on which PCPSO performs best are nonseparable multimodal problems, which demonstrates the effectiveness and robustness of the PCPSO algorithm.