Abstract

We analyze the convergence time of particle swarm optimization (PSO) on the facet of particle interaction. We first introduce a statistical interpretation of social-only PSO in order to capture the essence of particle interaction, which is one of the key mechanisms of PSO. We then use the statistical model to obtain theoretical results on the convergence time. Since the theoretical analysis is conducted on the social-only model of PSO rather than on the models commonly adopted in practice, numerical experiments are executed on benchmark functions with a regular PSO program to verify the validity of our results.

1. Introduction

Particle swarm optimization (PSO), introduced by [1, 2], is a stochastic population-based algorithm for solving continuous optimization problems. As shown by [3] and by numerous real-world applications, PSO is an efficient and effective optimization framework. Although PSO has been widely applied in many fields [4–7], understanding of PSO from the theoretical point of view is still quite limited. Most previous theoretical results [8–18] are derived under systems that assume a fixed attractor or a swarm consisting of a single particle.

Due to the lack of theoretical analysis of particle interaction in PSO, in this paper we make an attempt to analyze the convergence time of PSO on the facet of particle interaction. In particular, we will first introduce a statistical interpretation of PSO, proposed by [19], to capture the essence of particle interaction. We will then analyze the convergence time based on the statistical model. Finally, numerical experiments will be conducted to confirm, under a normal PSO configuration, the validity of our theoretical results obtained on the simplified, social-only model of PSO.

In the next section, we will briefly introduce the algorithm of PSO and the statistical interpretation of social-only PSO. In Section 3, we will analyze the convergence time of PSO based on the statistical model. The experimental results are presented in Section 4, followed by Section 5 which concludes this work.

2. Particle Swarm Optimization and the Statistical Interpretation

The PSO algorithm can be described as the pseudocode shown in Algorithm 1. In this paper, we will use boldface for vectors, for example, x and v. Without loss of generality, we assume that the goal is to minimize the objective function f.

procedure PSO (objective function f)
   initialize m particles: positions x_i, velocities v_i, personal bests p_i := x_i, and the neighborhood best p_n
   while the stopping criterion is not satisfied do
      for i := 1 to m do
         if f(x_i) < f(p_i) then
            p_i := x_i
            if f(p_i) < f(p_n) then
               p_n := p_i
            end if
         end if
      end for
      for i := 1 to m do
         for each dimension d do
            v_{i,d} := w v_{i,d} + r_1 (p_{i,d} - x_{i,d}) + r_2 (p_{n,d} - x_{i,d}), with r_1 ~ U[0, c_1] and r_2 ~ U[0, c_2]
            x_{i,d} := x_{i,d} + v_{i,d}
         end for
      end for
   end while
end procedure

According to Algorithm 1, in the beginning, m particles are initialized, where m is the swarm size, an algorithmic parameter of PSO. Each particle contains three types of information: its location (x_i), velocity (v_i), and personal best position (p_i). At each generation, each particle updates its personal best position (p_i) and neighborhood best position (p_n) according to its objective value. After updating the personal and neighborhood best positions, each particle updates its velocity according to p_i and p_n. In the velocity update formula, w is the inertia weight, which is usually a constant; r_1 and r_2 are random values sampled from the uniform distributions U[0, c_1] and U[0, c_2], where c_1 and c_2 are called acceleration coefficients. Finally, each particle updates its position according to the velocity and then proceeds to the next generation.
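To make the update concrete, the following is a minimal NumPy sketch of one generation of Algorithm 1. It is only an illustration, not the Standard PSO 2006 code used in Section 4: the best of the personal bests stands in for the neighborhood best p_n (a global-topology assumption), the objective f is assumed to map an (m, D) array of positions to m values, and the default values of w, c_1, and c_2 are merely typical settings, not the parameters used in this paper.

    import numpy as np

    def pso_step(x, v, pbest, f, w=0.72, c1=1.49, c2=1.49, rng=None):
        """One generation of the PSO update in Algorithm 1.

        x, v, pbest: arrays of shape (m, D); f maps an (m, D) array to m values.
        The best personal best plays the role of the neighborhood best p_n
        (a global-topology assumption; any neighborhood could be substituted).
        """
        rng = np.random.default_rng() if rng is None else rng

        # Update personal bests, then the neighborhood (here: global) best.
        improved = f(x) < f(pbest)
        pbest = np.where(improved[:, None], x, pbest)
        p_n = pbest[np.argmin(f(pbest))]

        # Velocity update: r1 ~ U[0, c1] and r2 ~ U[0, c2], drawn per dimension.
        r1 = rng.uniform(0.0, c1, size=x.shape)
        r2 = rng.uniform(0.0, c2, size=x.shape)
        v = w * v + r1 * (pbest - x) + r2 * (p_n - x)

        # Position update.
        return x + v, v, pbest

For instance, the sphere function of Section 4 can be passed as f = lambda x: np.sum(x**2, axis=1).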

From the aforementioned brief description, we can already see that particle interaction is a crucial mechanism in the design of PSO. Although there have been previous studies on particle interaction and the PSO behavior, most of these studies were based on the assumption of fixed attractors, a condition that does not hold for PSO in action. In order to take particle interaction into consideration, we use an alternative view of PSO that regards the whole swarm as a single entity. Instead of tracking the movement of each particle, we consider the overall swarm behavior by transforming the state of the entire swarm into a statistical abstraction. Furthermore, in order to concentrate on particle interaction, we adopt the social-only model of PSO [20], which does not consider personal best positions.

The statistical interpretation of PSO we use in this paper is modified from [19] and summarized in Algorithm 2. In the statistical model, the exact particle locations are not traced but are modeled as a distribution D_t over the search space, and velocities are viewed as random vectors. The swarm size m is considered as the number of samples drawn from D_t. Since the geographic knowledge is embodied in the distribution, the neighborhood attractor can be viewed as the best of the m samples.

procedure STATISTICAL INTERPRETATION OF SOCIAL-ONLY PSO (objective function f)
   Initialize: D_{0,d} and D_{v,d} for each dimension d; t := 0
   while the stopping criterion is not satisfied do
      for i := 1 to m do
         for each dimension d do
            sample x_{i,d} from D_{t,d} and v_{i,d} from D_{v,d}
         end for
      end for
      n := argmin_{x_1, ..., x_m} f(x_i)   (the neighborhood attractor)
      for i := 1 to m do
         for each dimension d do
            x'_{i,d} := x_{i,d} + w v_{i,d} + r_{i,d} (n_d - x_{i,d}), with r_{i,d} ~ U[0, c]
         end for
      end for
      estimate μ_{t+1,d} and σ²_{t+1,d} from x'_{1,d}, ..., x'_{m,d} for each dimension d
      D_{t+1,d} := N(μ_{t+1,d}, σ²_{t+1,d}); t := t + 1
   end while
end procedure

Each particle is considered as a random vector X_i sampled from D_t, and its velocity V_i is sampled from a velocity distribution D_v. The neighborhood attractor can then be defined as n = argmin_{X_i} f(X_i), the best of the m samples. At each generation, X_i is updated as X'_i = X_i + w V_i + R_i (n - X_i), where the product is taken componentwise and each component of R_i is drawn from U[0, c]. The next distribution D_{t+1} is thus the statistical characterization given by functions of the observed values X'_1, ..., X'_m. Since w is a constant, the distribution D_v can be removed because, given two random vectors X and V, we can simply let the position distribution be the distribution of X + wV and sample from it directly.
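A small sketch may help here. It follows the model just described under two assumptions: the per-dimension normal distributions introduced in the next paragraph, and the absorption of the inertia term w V into the position distribution argued in the last sentence above; the single acceleration coefficient c and all names are illustrative.

    import numpy as np

    def sample_and_update(mu, var, f, m=50, c=1.49, rng=None):
        """Draw a swarm from D_t and apply the social-only update.

        mu, var: per-dimension mean and variance of D_t (one normal per
        dimension, as assumed in the next paragraph).  Returns the updated
        positions x', from which D_{t+1} is then estimated.  The w*V term is
        taken to be absorbed into D_t.
        """
        rng = np.random.default_rng() if rng is None else rng
        D = len(mu)
        x = rng.normal(mu, np.sqrt(var), size=(m, D))   # m samples from D_t
        n = x[np.argmin(f(x))]                          # neighborhood attractor
        r = rng.uniform(0.0, c, size=(m, D))            # componentwise r ~ U[0, c]
        return x + r * (n - x)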

For simplicity, in this paper, we consider the position in each dimension of a particle to be independently sampled from the corresponding one-dimensional distribution D_{t,d}. Consider the random variable describing the updated position in one such dimension. If we divide the support of D_{t,d} into disjoint regions, each associated with its own random variable of velocity, then, by picking a representative for each region, the updated position can be viewed as a combination of the contributions of these regions; when the number of samples is sufficiently large, each component of this combination can be approximated with a normal distribution by the central limit theorem. As a consequence, normal distributions are a reasonable choice for describing the behavior of the entire swarm. We let the distribution of the dth dimension at generation t, D_{t,d}, be N(μ_{t,d}, σ²_{t,d}), where N(μ, σ²) denotes the normal distribution with mean μ and variance σ². The update of the distribution then reduces to simply calculating the mean and the variance.

The mean can be calculated by taking the average of the updated positions, and the variance is calculated by maximum likelihood estimation (MLE). Let σ²_{t,d} and μ_{t,d} be the variance and the mean of the dth dimension at the tth generation, and let x'_{1,d}, ..., x'_{m,d} be the updated positions of the dth dimension. To estimate the variance of the dth dimension at the (t+1)th generation, we use MLE: the likelihood function L(σ²) is defined as the joint probability of observing x'_{1,d}, ..., x'_{m,d} under N(μ_{t+1,d}, σ²). To find the σ² that maximizes L, we differentiate L with respect to σ² and set the derivative to zero; the maximizer is the sample variance with the 1/m factor. As a result, in our model of PSO, the result of the MLE is σ²_{t+1,d} = (1/m) Σ_{i=1}^{m} (x'_{i,d} − μ_{t+1,d})², with μ_{t+1,d} = (1/m) Σ_{i=1}^{m} x'_{i,d}, for every dimension d.
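Continuing the sketch above, the characterization of D_{t+1} is then just the per-dimension sample mean and the 1/m variance; note that the maximum likelihood estimator derived here corresponds to NumPy's biased estimator (ddof = 0), not the unbiased 1/(m−1) version.

    import numpy as np

    def characterize(x_updated):
        """Estimate (mu_{t+1,d}, sigma^2_{t+1,d}) for every dimension d.

        x_updated: array of shape (m, D) holding the updated positions x'.
        """
        m = x_updated.shape[0]
        mu_next = x_updated.mean(axis=0)
        var_next = ((x_updated - mu_next) ** 2).sum(axis=0) / m   # MLE: 1/m factor
        # Equivalent to NumPy's biased estimator: x_updated.var(axis=0, ddof=0).
        return mu_next, var_next

One full generation of Algorithm 2 is then mu, var = characterize(sample_and_update(mu, var, f)).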

3. Convergence Time Analysis

In this section, we will analyze the PSO convergence time based on the aforementioned statistical interpretation of the social-only model. As the first step, we must define the state of convergence. Since, in this work, we regard the entire swarm as a distribution, the state of convergence refers to the variance of that distribution: we define the swarm to have converged when the variance of every dimension is less than a given value ε. With this definition, we can now start our analysis of the PSO convergence time. To estimate the variance after a distribution update, we need the following lemma from [21].
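In code, this criterion is a single check on the per-dimension variances, with eps denoting the threshold ε introduced above:

    import numpy as np

    def has_converged(var, eps):
        """State of convergence: the variance of every dimension is below eps."""
        return bool(np.all(np.asarray(var) < eps))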

Lemma 1. Let X_1, X_2, ..., X_n be independent random variables, each distributed as N(μ, σ²). Define S² = (1/n) Σ_{i=1}^{n} (X_i − X̄)², where X̄ = (1/n) Σ_{i=1}^{n} X_i. One has nS²/σ² ~ χ²_{n−1}, where χ²_{n−1} is the chi-square distribution with n − 1 degrees of freedom.
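Since the concrete statement of the lemma is reconstructed here as the classical sample-variance result for normal samples, a quick Monte Carlo check of its first two moments is a reasonable sanity test; the sample size, mean, and variance below are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(1)
    n, mu, sigma2, trials = 20, 3.0, 2.5, 200_000

    # Draw `trials` samples of size n from N(mu, sigma2) and form n*S^2/sigma^2,
    # where S^2 is the (1/n) sample variance.
    x = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))
    stat = n * x.var(axis=1, ddof=0) / sigma2

    # A chi-square distribution with n-1 degrees of freedom has mean n-1 and
    # variance 2(n-1); the empirical moments should match closely.
    print(stat.mean(), n - 1)        # both close to 19
    print(stat.var(), 2 * (n - 1))   # both close to 38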

With this lemma, we can obtain the following.

Lemma 2. Given the swarm size m, the acceleration coefficient c, and the variance σ²_{t,d} of the dth dimension at the tth generation, one has an explicit expression for the expected variance E[σ²_{t+1,d}] of the dth dimension at the (t+1)th generation in terms of m, c, and σ²_{t,d}.

Proof. By the MLE of Section 2, σ²_{t+1,d} = (1/m) Σ_{i=1}^{m} (x'_{i,d} − μ_{t+1,d})². Taking the expectation of this quantity, expressing the updated positions through the sampled positions and the i.i.d. coefficients drawn from U[0, c], and applying Lemma 1 to the resulting sample-variance term yields the stated expression.
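The closed form is not reproduced here, but the quantity Lemma 2 characterizes, the expected one-step variance ratio E[σ²_{t+1,d}]/σ²_{t,d}, can be estimated numerically under the update rule assumed in the earlier sketches (inertia absorbed into the position distribution, a one-dimensional sphere-like objective, and illustrative values of m and c):

    import numpy as np

    rng = np.random.default_rng(2)
    m, c, sigma2, trials = 50, 1.49, 4.0, 20_000

    ratios = np.empty(trials)
    for k in range(trials):
        # One dimension of the statistical model: m positions ~ N(0, sigma2).
        x = rng.normal(0.0, np.sqrt(sigma2), size=m)
        n = x[np.argmin(x**2)]                  # attractor of a sphere-like objective
        r = rng.uniform(0.0, c, size=m)         # i.i.d. coefficients ~ U[0, c]
        x_new = x + r * (n - x)                 # social-only update
        ratios[k] = x_new.var(ddof=0) / sigma2  # MLE variance relative to sigma^2_t

    print("estimated E[sigma^2_{t+1}] / sigma^2_t:", ratios.mean())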

Lemma 2 is derived under the condition that σ²_{t,d} is given. The following lemma derives the corresponding relationship for the expected variances across generations without this conditioning.

Lemma 3. .

Proof.

Now, we can obtain the relationship between the convergence time and the algorithmic parameters of PSO.

Theorem 4. Given swarm size , acceleration coefficient , , and . Let . One has for when and .

Proof. From Lemma 3, we know Since , we have The last inequality holds because

Two corollaries follow immediately from Theorem 4.

Corollary 5. Given swarm size , acceleration coefficient , and level of convergence such that and , one has for for .

Corollary 6. Given swarm size, , , and such that and , there exists a constant such that for , one has for .

Corollary 5 reveals the linear relationship between the level of convergence and the convergence time, and the interpretation of Corollary 6 is that, once the swarm size is sufficiently large, further enlarging the swarm has little effect on the convergence time. In the next section, we will empirically examine the two corollaries with a common, practical PSO configuration.

4. Experiments

In this section, we verify the validity of Corollaries 5 and 6 by running Standard PSO 2006, downloaded from Particle Swarm Central. We use two objective functions in our experiments:
(i) sphere function [22]: f1(x) = Σ_d x_d²;
(ii) Schwefel's problem 1.2 [22]: f2(x) = Σ_d (Σ_{j≤d} x_j)².
The same number of dimensions is used for both f1 and f2 in the following experiments.
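In vectorized form, the two benchmarks, as commonly defined in [22], can be written so that they accept an (m, D) array of positions and therefore plug directly into the earlier sketches:

    import numpy as np

    def sphere(x):
        """Sphere function: f1(x) = sum_d x_d^2, applied row-wise to an (m, D) array."""
        return np.sum(x**2, axis=-1)

    def schwefel_1_2(x):
        """Schwefel's problem 1.2: f2(x) = sum_d (sum_{j<=d} x_j)^2, row-wise."""
        return np.sum(np.cumsum(x, axis=-1)**2, axis=-1)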

We first examine Corollary 5. The PSO algorithmic parameters, namely the inertia weight and the acceleration coefficients, are fixed, and the swarm size is 50. The value of ε is varied over a range of settings. For each value of ε, we perform 100 independent runs. For each run, we count the number of generations from initialization until the variances of all dimensions are smaller than ε, and we calculate the mean number of generations over the 100 runs.
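A simplified harness in the spirit of this measurement is sketched below. It reuses the plain PSO update of Section 2 instead of the actual Standard PSO 2006 code, and the initialization range, parameter values, and dimension are assumptions, so it only illustrates how the generations are counted.

    import numpy as np

    def convergence_time(f, D=30, m=50, eps=1e-6, w=0.72, c1=1.49, c2=1.49,
                         max_gen=100_000, seed=0):
        """Count generations until the variance of every dimension drops below eps."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(-100.0, 100.0, size=(m, D))   # assumed initialization range
        v = np.zeros((m, D))
        pbest = x.copy()
        for t in range(1, max_gen + 1):
            improved = f(x) < f(pbest)
            pbest = np.where(improved[:, None], x, pbest)
            nbest = pbest[np.argmin(f(pbest))]
            r1 = rng.uniform(0.0, c1, size=(m, D))
            r2 = rng.uniform(0.0, c2, size=(m, D))
            v = w * v + r1 * (pbest - x) + r2 * (nbest - x)
            x = x + v
            if np.all(x.var(axis=0) < eps):           # state of convergence
                return t
        return max_gen

    # Example: average over independent runs for one setting of eps.
    # times = [convergence_time(sphere, eps=1e-6, seed=s) for s in range(100)]
    # print(np.mean(times))

Averaging convergence_time over 100 seeds for each value of eps mirrors this experiment; sweeping m instead of eps in the same way gives the second experiment below.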

The comparison of these experimental results and our theoretical results is shown in Figures 1 and 2. From Figure 1, we can see that the experimental results for f1 are very close to the theoretical estimate, and from Figure 2, the experimental results for f2 are likewise very close to the theoretical estimate. The experimental results agree with our estimation in Corollary 5, in which the level of convergence and the PSO convergence time are linearly related.

After Corollary 5 is empirically verified with the standard PSO, we now examine Corollary 6. The PSO parameters other than the swarm size are fixed, and the swarm size ranges from 50 to 1000 with step 5. For each swarm size, we perform 100 independent runs and record the mean number of generations as in the last experiment.

The comparison of the experimental and theoretical results is shown in Figures 3, 4, 5, and 6. From Figures 3 and 4, we can see that the convergence time for f1 stays close to the theoretical estimate, and from Figures 5 and 6, the convergence time for f2 does as well. As we can observe from these figures, when the swarm size becomes large, the increase in convergence time is insignificant, confirming our estimation in Corollary 6.

5. Conclusions

In this paper, a statistical interpretation of a simplified model of PSO was adopted to analyze the PSO convergence time. In order to capture the essence of particle interaction, the statistical model adopted in this paper assumed no fixed attractors, so the effect of particle interaction was included in our analysis. Our theoretical results revealed the relationship between the convergence time and the level of convergence as well as the relationship between the convergence time and the swarm size. Numerical results, obtained under standard PSO settings, empirically verified our theoretical results derived with a simplified PSO configuration. The agreement between the experimental and theoretical results indicated the importance of particle interaction in PSO. Consequently, more research effort should be invested in analyzing the workings of particle interaction in order to better understand particle swarm optimization.

Some future extensions of this study are now ready to be explored. First of all, the relationship between the PSO convergence time and the number of dimensions should be investigated, that is, in the adopted model, the relationship between the convergence time and the number of per-dimension distributions. Second, the theoretical analysis conducted in this study is independent of objective functions, while in Section 4 we verified the analysis with only two objective functions; more functions of various features and properties, which do not violate the settings of the adopted statistical model, should be used to examine the estimation. Third, the distribution used in this paper is the normal distribution. However, there may exist objective functions that force the swarm to follow different distributions, and the analysis presented in this paper will fail on those objective functions. As a result, more sophisticated models should be adopted to provide good descriptions of the PSO macrobehavior and to enable researchers to derive more accurate PSO estimations. Finally, the social-only PSO model adopted in this paper does not take personal experience into consideration; more sophisticated models are also needed to analyze the PSO macrobehavior as influenced by personal experience.

Acknowledgment

The work was supported in part by the National Science Council of Taiwan under Grant NSC 99-2221-E-009-123-MY2.