Research Article | Open Access
Artificial Bee Colony Algorithm Based on -Means Clustering for Multiobjective Optimal Power Flow Problem
An improved multiobjective ABC algorithm based on -means clustering, called CMOABC, is proposed. To fasten the convergence rate of the canonical MOABC, the way of information communication in the employed bees’ phase is modified. For keeping the population diversity, the multiswarm technology based on -means clustering is employed to decompose the population into many clusters. Due to each subcomponent evolving separately, after every specific iteration, the population will be reclustered to facilitate information exchange among different clusters. Application of the new CMOABC on several multiobjective benchmark functions shows a marked improvement in performance over the fast nondominated sorting genetic algorithm (NSGA-II), the multiobjective particle swarm optimizer (MOPSO), and the multiobjective ABC (MOABC). Finally, the CMOABC is applied to solve the real-world optimal power flow (OPF) problem that considers the cost, loss, and emission impacts as the objective functions. The 30-bus IEEE test system is presented to illustrate the application of the proposed algorithm. The simulation results demonstrate that, compared to NSGA-II, MOPSO, and MOABC, the proposed CMOABC is superior for solving OPF problem, in terms of optimization accuracy.
Real-life optimization problems usually have several conflicting objectives. The solution of such problems is very difficult because their objectives tend to be in conflict with each other. So a decision maker (DM) needs to determine an optimal trade-off among them. These problems can be modeled as multiobjective optimization problems (MOPs). Generally, the optimal solution of a multiobjective optimization problem is a set of optimal solutions (largely known as Pareto-optimal solutions). A Pareto-optimal solution to an MOP is a candidate for the optimal trade-off but it is not possible to find a better solution than others from the Pareto-optimal solutions, because one cannot be better than others without any further information. Therefore, it is necessary to find as many Pareto-optimal solutions as possible .
Swarm intelligence (SI) is an innovative artificial intelligence technique for solving complex multiobjective optimization problems, such as Pareto-archived evolution strategy (PAES) , Pareto envelope-based selection algorithm- (PESA-) II , nondominated sorting genetic algorithm II (NSGA-II) , strength Pareto evolutionary algorithm (SPEA2) , indicator-based evolutionary algorithm (IBEA) , multiobjective particle swarm optimization (MOPSO) , multiobjective evolutionary algorithm based on decomposition (MOEA/D) , and two-lbest multiobjective particle swarm optimization (2LB-MOPSO) . As the widely adopted intelligent search method, the artificial bee colony (ABC) algorithm is a powerful search technique that drew inspiration from the biological foraging behaviors observed in bee colony [10, 11]. Due to its simplicity and efficiency, ABC has become a high competitor in solving the MO problems. Many researchers have presented several existing multiobjective ABC (MOABC) algorithms [12, 13]. However, these proposed algorithms still suffer from low convergence rate and lack the diversity of swarm.
To conquer the weakness of initial MOABC, an improved multiobjective ABC algorithm based on -means clustering, named CMOABC, is proposed. In the CMOABC, for accelerating the convergence rate of canonical MOABC, the employed bees’ phase is modified to adjust the strength of information communication. Additionally, the population is partitioned into several subpopulations based on -means clustering. Information communication between the subpopulations depends on reclustering the population after each specific iteration. To further enhance the population diversity, a number of individuals with worse performance regenerate in the reclustering process. With the test against several multiobjective benchmark functions, the results demonstrate that the proposed CMOABC outperforms the MOABC, MOPSO, and NSGA-II on convergence metric and diversity metric.
In the real-world optimization problems, optimal power flow (OPF) is a classical multiobjective problem. Traditionally, the basic objective of optimal power flow (OPF) problem is to schedule the committed generating units to meet the system load demand at minimum operating cost while satisfying the various system equality and inequality constraints . But the passage of clean air act amendments in 1990 forced the utilities to reduce the emission from fossil fuel fired thermal station . Therefore, in addition to fuel cost, emission must also be considered as an objective. On the whole, OPF problem is a nonlinear, constrained optimization problem where many competing objectives are present.
In the early research, many traditional optimization techniques such as linear programming (LP) , nonlinear programming (NLP) [17, 18], quadratic programming (QP) , Newton method, and interior point method (IPM) [20–22] have already been employed to solve the OPF problem. However, these methods suffer from severe limitations in handling the OPF problem, including discrete-continuous functions, and complex constraints . In order to overcome the limitations of classical optimization techniques, recently, a wide variety of the multiobjective heuristic evolutionary algorithms have been proposed to solve OPF problem, mainly including genetic algorithm (GA) [23, 24], enhanced genetic algorithm (EGA) , evolutionary programming (EP) [26, 27], multihive multiobjective bee algorithm (M2OBA) , modified shuffle frog leaping algorithm (MSFLA) , differential evolution (DE) [30, 31], harmony search (HS) algorithm , biogeography based optimization (BBO) [33, 34], multiobjective evolutionary algorithm (MOEA) , multiobjective particle swarm optimization [36, 37], and fuzzy clustering-based particle swarm optimization (FCPSO) . However, all the mentioned mathematical techniques have some drawbacks such as being trapped in local optima and they are only suitable for considering a specific objective function in the OPF problem. Due to its outstanding performance on the test against benchmark functions, CMOABC is suitable for solving the multiobjective problems. Therefore, CMOABC is utilized to solve OPF problem. Compared with MOABC, MOPSO, and NSGA-II, the proposed CMOABC algorithm can accommodate considerable potential for solving OPF problem.
The rest of this paper is organized as follows. Section 2 presents optimal power flow (OPF) formulation, including the objective function and the constraints of the OPF problem. The proposed algorithm CMOABC is detailed in Section 3, and also in this section we test the CMOABC on benchmark functions: four ZDT and two DTLZ. Afterward, in Section 4, the CMOABC is applied to solve the OPF problem. Finally, Section 5 outlines the conclusions.
2. Optimal Power Flow Problem Formulation
The multiobjective optimal power flow (OPF) is a nonlinear, nonconvex optimization problem. The main goal of OPF is to optimize the settings of control variables in terms of one or more objective functions while satisfying several equality and inequality constraints. In general, the OPF problem can be mathematically formulated as follows:where is the objective function to be optimized, is the equality constraints representing nonlinear power flow equations, and is the system operating constraints. Here, is the vector of independent control variables including(1)generator active power output except at slack bus ;(2)generator bus voltage ;(3)transformer tap setting ;(4)shunt VAR compensation .
Hence, can be expressed aswhere , , and denote the number of generating units, number of regulating transformers, and number of shunt compensators, respectively. Generators active powers (except slack bus ) and generators bus voltages are continuous variables, while the tap settings of the tap changing transformers and VAR injections of the shunt capacitors are discrete variables.
And is the vector of dependent variables including(1)slack bus generated active power ;(2)load (PQ) bus voltage ;(3)generator reactive power output ;(4)transmission line loading (line flow) .
Hence, can be expressed aswhere is the number of PQ buses and is the total number of transmission lines.
2.1. Objective Function
In this paper, the OPF problem is to minimize three competing objective functions, namely, the total fuel cost, the total emission cost minimization, and total power losses, while satisfying several equality and inequality constraints. Generally the problem is formulated as follows.
2.1.1. Minimization of Total Fuel Cost
This objective function is to minimize the total fuel cost of the system. The fuel cost curves of the thermal generators are modeled as quadratic cost curves and can be represented as follows:where , , and are the fuel cost coefficients of the th generator and is real power output of the th generator.
2.1.2. Minimization of Total Power Losses
The power flow solution gives all bus voltage magnitudes and angles. Then, the total MW active power loss in a transmission network can be described as follows: where is the number of transmission lines, and are the voltage magnitudes at the th bus and th bus, respectively, and and are the voltage angles at the th bus and the th bus, respectively.
2.1.3. Total Emission Cost Minimization
In this paper, two important types of emission gasses, namely, sulphur oxides and nitrogen oxides , are taken as the pollutant gasses. The emission gasses generated by each generating unit may be approximated by a combination of a quadratic and an exponential function of the generator active power output. Here, the total emission cost is defined as follows :where is the total emission cost (ton/h) and , , and are the emission coefficients of the th unit.
2.2. Equality Constraints
The equality constraints represented by (2) are typical power flow equations which are defined as follows:where is the number of buses in the system. and are the active and reactive power generators at the th bus; and are the active and reactive power loads at the th bus, respectively; and are the real and imaginary parts of the th element of the bus admittance matrix, respectively; and are the voltage magnitudes at the th bus and the th bus, respectively; and are the voltage angles at the th bus and the th bus, respectively.
The equality constraints of power balance equations shown in (9) are forced by unconstrained Newton-Raphson based power flow calculations; therefore, there is no need to integrate them into the objective function.
2.3. Inequality Constraints
The inequality constraints represented by (3) are the power system operating limits including the following.
(1) Generator Constraints. Generator active power , generator reactive power , and generator voltage magnitude are restricted by their lower and upper limits:
(2) Transformer Constraints. Transformer taps have minimum and maximum setting limits:
(3) Switchable VAR Sources. The switchable VAR sources have restrictions as follows:
(4) Security Constraints. These include the limits on the load bus voltage magnitudes and transmission line flows limits:
In this work, the penalty factor method is utilized for handling the inequality constraints. In this regard, each control vector which violates constraints will be fined by these penalty factors. Therefore, in the next step, this control vector will be deleted automatically.
3. Multiobjective Artificial Bee Colony Algorithm Based on -Means Clustering
3.1. Canonical ABC Algorithm
The artificial bee colony (ABC) algorithm, proposed by Karaboga and Basturk  and further developed by Basturk and Akay [11, 40] for real-parameter optimization, which simulates the intelligent foraging behavior of a honeybee swarm, is one of the most recently introduced swarm-based optimization techniques.
The entire bee colony contains three groups of bees: employed bees, onlookers, and scouts. Employed bees explore the specific food sources; meanwhile they pass the food information to onlooker bees. The number of employed bees is equal to that of food sources; in other words, each food source owns only one employed bee. Then onlooker bees choose good food sources based on the received information and then further exploit the food near their selected food source. The food source with higher quality would have a larger opportunity to be selected by onlookers. There is a control parameter called “limit” in the canonical ABC algorithm. If a food source is not improved anymore when limit is exceeded, it is assumed to be abandoned by its employed bee and the employed bee associated with that food source becomes a scout to search for a new food source randomly. The fundamental mathematic representations are listed as follows .
Step 1 (initialization phase). In initialization phase, a group of food sources are generated randomly in the search space using the following equation: where , . is the number of food sources. is the number of variables, that is, problem dimension. and are the lower and upper bounds of the th variable, respectively.
Step 2 (employed bees’ phase). In the employed bees’ phase, the neighbor food source (candidate solution) can be generated from the old food source of each employed bee in its memory using the following expression:where is a randomly selected food source and must be different from ; is randomly chosen indexes; is a random number in range .
Step 3 (onlooker bees’ phase). In the onlooker bees’ phase, an onlooker bee selects a food source depending on the probability value associated with that food source and can be calculated as follows:where is the fitness value of the th solution.
Step 4 (scout bees’ phase). In scout bees’ phase, if a food source cannot be improved further through a predetermined cycle (called “limit” in ABC), the food source is supposed to be abandoned. The employed bee subsequently becomes a scout. A new food source will be produced randomly in the search space using (14).
The employed, onlooker, and scout bees’ phase will recycle until the termination condition is met.
3.2. CMOABC for Optimization
This section presents the detailed description of the proposed multiobjective artificial bee colony algorithm based on -means clustering (named CMOABC). The proposed algorithm makes improvements on two aspects: (1) modifying the method of information communication in employed bees’ phase and (2) partitioning the population using -means clustering. The number of the clusters will be chosen from the predefined set . After each specific iteration, when the defined criteria are met, parts of individuals in clusters will be removed and an equal number of new individuals are regenerated randomly. Then the number of clusters will change into a new value of the set and the population will be reclustered based on -means clustering.
3.2.1. Modified Employed Bees’ Phase of ABC
To increase the exploitation of ABC algorithm and fasten the convergence rate, the employed bees’ phase is modified. The new food source (candidate solution) is generated using the following way:where is the position of the global best solution and the term can drive the new candidate solution to the global best solution, is the current iteration, and is the maximum iteration.
The parameter in (18) plays an important role in balancing the exploration and exploitation of the candidate solution search. If , (18) is identical to (15). With the increase of , the probability of the candidate solution learning to the best solution increases correspondingly. In this way, in the beginning of optimization process, the proposed algorithm operates as the canonical artificial bee colony algorithm, which can well keep the population diversity; in the end of optimization process, the modified ABC algorithm has a considerable improvement on both convergence rate and local search.
Furthermore, in (17), the different learning intensities, which are determined by and , are adopted. We adopt the beta probability distribution  to tune these two parameters, which is based on the similar strategy of tuning parameters in . The beta distribution is flexible for modeling data that are measured in a continuous scale on a truncated interval in range , since its density is a versatile way to represent different shapes depending on the values of the two parameters that index the distribution. The probability density () of the beta distribution in the range is calculated as follows:where is the beta function.
The mean () and variance () of the beta distribution are calculated as follows:
The values of and are tuned by script (see (21)). An example of and values generated using beta distribution is shown in Figure 1. Considerwhere denotes the max iteration and is the current iteration.
3.2.2. Cluster Setting
The -means clustering method is employed to partition the population into subpopulations. The basic concepts of -means clustering are presented firstly, and then we will give a detailed description of its application in our proposed algorithm.
(I) Basic Parameters of -Means. The cluster centers are substituted for center positions of food sources and the formula of computing the centers is shown in (22). If the th cluster contains members and the members are denoted by , then the center () is determined as
The radius () of a cluster is defined as the mean distance (Euclidean) of the members from the center of the cluster. Thus, can be written as
(II) Clusters in the Proposed Algorithm CMOABC. In the proposed algorithm, the stochastically generated population is partitioned into subpopulations based on the widely adopted -means cluster method (Algorithm 1) . The number of clusters is determined by the predefined set , where . Every cluster operates as the modified ABC introduced in the above section. During optimization, it may happen that two or more clusters come close to each other or get overlapped to a high degree. Then, they will practically search the same domain of the functional landscape. To avoid this scenario, the distances between each two clusters are calculated as follows:where is the distance between one cluster and its neighbor and is the center of the th cluster’s neighbor. is the center of the th cluster.
If the distance between one cluster and its neighbors is smaller than the specific distance , one of the clusters will be removed and its nondomination solutions are stored:where is the radius of and is the radius of the neighbors of .
During optimization, information communication is not an inconsiderable aspect. For keeping the good ability of exploration, in the proposed algorithm, there is no information communication among clusters in specific iterations. Therefore, in order to exchange information among individuals, the whole population is repartitioned into clusters based on -means clustering after each iteration, where and are orderly chosen from the predefined set . That is, the individuals in a cluster may be distributed into different new clusters when the number of the clusters is changing. Moreover, to balance the exploration and exploitation, the value of is not a constant. To strengthen exploration, the rate of sharing information should keep a low value in the first half of search process. To improve the ability of exploitation, in the second half, the interval iteration changes to a small value. The formula of is listed as follows:where is the maximum iterations; iter is the current iteration.
After each iteration, a certain number of individuals in each cluster should be regenerated according to this cluster’s contribution to the external archive that is used to store the nondomination solutions. For the th cluster, the number of solutions updating to the external archive during each iteration is recorded in . Then, according to its position in the sort of , the number of individuals needed to regenerate in the th cluster is calculated in (27). The individuals that will be removed in cluster are determined by nondomination sort:where is the number of individuals in the th cluster; indicates the th cluster’s position in the sort of ; and is the current number of clusters.
Additionally, to prevent the size of a cluster () from becoming too large, we set a limit as the maximum of cluster size. If the number of individuals in a cluster exceeds the limit, the best solutions are reserved based on nondomination and others are removed to other clusters stochastically. Setting formula of is shown in where is the number of individuals in the population and is the number of clusters in current.
3.3. External Archive and Crowding Distance
3.3.1. External Archive
In multiobjective optimization, in the absence of preference information, none of the solutions can be said to be better than the others. Therefore, in order to keep the best solutions, a fixed-sized external archive is used by the CMOABC algorithm. The external archive is used to keep a historical record of the nondominated vectors found along the search process [6, 45].
3.3.2. Crowding Distance
In order to prevent the number of nondominated solutions from exceeding the size of EA, the crowding distance  is used to remove the crowded solutions. But in the proposed algorithm, the rules of removing the crowded solutions are modified. In each iteration, the distances between each nondominant solution and adjacent nondominant solution are calculated first, whether the size of external archive reaches the defined size or not. The detailed improvement is described as follows.(1)The size of external archive exceeds the defined size of EA: calculate the crowding distance and remove the crowded solutions .(2)The size of external archive does not outnumber the defined size: the crowding distance will be compared with a metric DIS. DIS is defined in (29); those nondominant solutions in the EA whose distances are all less than DIS will be removed from the EA . Consider
3.3.3. Constraints Handling Rules
To solve constraints, this paper uses a modified way to handle constraints which is introduced by Mohamed and Sabry . Based on the initial constraints handling methods , three modified rules are as follows.(1)The trial vector is feasible and the target vector is not.(2)The trial vector and target vector are both feasible, but the trial vector has fitness value smaller than or equal to the corresponding target vector.(3)The trial vector and target vector are both infeasible, but the trial vector has overall constraint violation smaller than or equal to the corresponding target vector.
The above selection procedure allows the trial vector to be entered in the new population if it has the same amount of constraint violation or objective function value as the target vector. Therefore, this simple modification can help the algorithm to spread out and pass through the search space, so the algorithm can escape from stagnation.
3.4. Benchmark Test
3.4.1. Test Function
To fully evaluate the performance of the CMOABC algorithm without a biased conclusion towards some chosen problems, in this paper, six commonly recognized benchmark functions have been used. Four of these test problems ZDT1, ZDT2, ZDT3, and ZDT6  are of two objectives, while the other two, DLTZ2 and DLTZ6 , are of three objectives.
3.4.2. Performance Measures
In order to facilitate the quantitative assessment of the performance of a multiobjective optimization algorithm, two performance metrics are taken into consideration: (1) convergence metric and (2) diversity metric .
Convergence Metric. This metric measures the extent of convergence to a known set of Pareto-optimal solutions:where is the number of nondominated solutions obtained with an algorithm and is the Euclidean distance between each of the nondominated solutions and the nearest member of the true Pareto-optimal front. To calculate this metric, we find a set of uniformly spaced solutions from the true Pareto-optimal front in the objective space. For each solution obtained with an algorithm, we compute the minimum Euclidean distance of it from chosen solutions on the Pareto-optimal front. The average of these distances is used as the convergence metric . Figure 2 shows the calculation procedure of this metric.
Diversity Metric. This metric measures the extent of spread achieved among the obtained solutions. Here, we are interested in getting a set of solutions that spans the entire Pareto-optimal region. This metric is defined aswhere is the Euclidean distance between consecutive solutions in the obtained nondominated set of solutions and is the number of nondominated solutions obtained by an algorithm. is the average value of these distances. and are the Euclidean distances between the extreme solutions and the boundary solutions of the obtained nondominated set, as depicted in Figure 3.
3.4.3. Experimental Setting
The effectiveness of CMOABC will be demonstrated on various standard test problems in this section. In addition, the proposed algorithm will be compared with the nondominated sorting genetic algorithm II (NSGA-II) , the multiobjective particle swarm optimization (MOPSO) , and the multiobjective artificial bee colony algorithm (MOABC) . In order to compare the different algorithms with a fair time measure, the number of function evaluations (Fes = 10,000) is used for the termination criterion.
For two test examples presented in this study, the CMOABC algorithm parameters are set as follows: the population size is 500 and the set is defined as .
For NSGA-II, the population size, crossover, and mutation probabilities are selected as 500, 0.8, and 0.2, respectively, for the two test examples. For MOPSO, the population size, mutation rate, and divisions for the adaptive grid are selected as 500, 0.5, and 30. For MOABC, the population size is selected as 500, and for other parameter settings refer to .
3.4.4. Two Objective Functions’ Results
The results reported in terms of the best, worst, average, median, and standard deviation of the performance measures are listed in Tables 1–4. Figures 4–7 show the optimal front obtained by four algorithms for each two objective functions, respectively. The continuous lines represent the Pareto-optimal front, while mark spots represent found nondominated solutions by the algorithms.