Research Article  Open Access
Zhiming Song, Maocai Wang, Guangming Dai, Massimiliano Vasile, "A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis", The Scientific World Journal, vol. 2015, Article ID 439307, 10 pages, 2015. https://doi.org/10.1155/2015/439307
A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis
Abstract
As is known, the Pareto set of a continuous multiobjective optimization problem with objective functions is a piecewise continuous ()dimensional manifold in the decision space under some mild conditions. However, how to utilize the regularity to design multiobjective optimization algorithms has become the research focus. In this paper, based on this regularity, a modelbased multiobjective evolutionary algorithm with regression analysis (MMEARA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is ()dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on the nondominated sorting is used to choose the individuals to the next generation. The new algorithm is tested and compared with NSGAII and RMMEDA. The result shows that MMEARA outperforms RMMEDA and NSGAII on the test instances with variable linkages. At the same time, MMEARA has higher efficiency than the other two algorithms. A few shortcomings of MMEARA have also been identified and discussed in this paper.
1. Introduction
Evolutionary algorithm has become an increasingly popular design and optimization tool in the last few years [1]. Although there have been a lot of researches about evolutionary algorithm, there are still many new areas that needed to be explored with sufficient depth. One of them is how to use the evolutionary algorithm to solve multiobjective optimization problems. The first implementation of a multiobjective evolutionary algorithm dates back to the mid1980s [2]. Since then, many researchers have done a considerable amount of works in the area, which is known as multiobjective evolutionary algorithm (MOEA).
Because of the ability to deal with a set of possible solutions simultaneously, evolutionary algorithm seems particularly suitable to solve multiobjective optimization problems. The ability makes it possible to search several members of the Paretooptimal set in a single run of the algorithm [3]. Obviously, evolutionary algorithm is more effective than the traditional mathematical programming methods in solving multiobjective optimization problem because the traditional methods need to perform a series of separate runs [4].
The current MOEA research mainly focuses on some highly related issues [5]. The first issue is the fitness assignment and diversity maintenance. Some techniques such as fitness sharing and crowding have been frequently used to maintain the diversity of the search. The second issue is the external population. The external population is used to record nondominated solutions found during the search. There have been some efforts on how to maintain and utilize such an external population. The last issue is the combination of MOEA and local search. Researches have shown that the combination of evolutionary algorithm and local heuristics search outperforms traditional evolutionary algorithms in a wide variety of scalar objective optimization problems [4, 6].
However, there are little researches focusing on the way to generate new solutions in MOEA. Currently, most MOEAs directly adopt traditional genetic operators such as crossover and mutation. These methods have not fully utilized the characteristics of MOP when generating new solutions. Some researches show that MOEA fails to solve MOPs with variable linkages, and the recombination operators are crucial to the performance of MOEA [7]. It has been noted that under mild smoothness conditions, the Pareto set (PS) of a continuous MOP is a piecewise continuous ()dimensional manifold, where is the number of the objectives. However, as analyzed in [8], this regularity has not been exploited explicitly by most current MOEA.
In 2005, Zhou et al. proposed to extract regularity patterns of the Pareto set by using local principal component analysis (PCA) [9]. They had also studied two naive hybrid MOEAs. In the two MOEAs, some trial solutions were generated by traditional genetic operators and others by sampling from probability models based on regularity patterns in 2006 [10].
In 2007, Zhang et al. conducted a further and thorough investigation along their previous works in [9, 10]. They proposed a regularity modelbased multiobjective estimation of distribution algorithm and named it as RMMEDA [5]. At each generation, the proposed algorithm models a promising area in the decision space by a probability distribution whose centroid is a ()dimensional piecewise continuous manifold. The local principal component analysis algorithm is used to build such a model. Systematic experiments have shown that RMMEDA outperforms some other algorithms on a set of test instances with variable linkages.
In 2008, Zhou et al. proposed a probabilistic model based multiobjective evolutionary algorithm to approximate PS and PF (Pareto front) for a MOP in this class simultaneously and named the algorithm as MMEA [11]. They proposed two typical classes of continuous MOPs as follows. One class is that PS and PF are of the same dimensionality while the other one is that PF is a ()dimensional continuous manifold and PS is a continuous manifold with a higher dimensionality. There is a class of MOPs, in which the dimensionalities of PS and PF are different so that a good approximation to PF might not approximate PS very well. MMEA could promote the population diversity both in the decision spaces and in the objective spaces.
Modeling method is a crucial part for MOEA because it determines the performance of the algorithms. Zhang et al. built such a model by local principal component analysis (PCA) algorithm [5]. The test results show that the method has great performance over some instances with linkage variables. However, there are still some shortcomings about the method. The first shortcoming is that RMMEDA needs extra CPU time for running local PCA at each generation. The second one is that the model is just linear fitting for all types of PS, including the one with nonlinear linkage variables, which enable that the result may be not accurate.
In the paper, we proposed a modelbased multiobjective evolutionary algorithm with regression analysis, which is named as MMEARA. In MMEARA, a new modeling method based on regression analysis is put forward. In the method, least squares method (LSM) is used to fit a 1dimensional manifold in highdimensional space. Because least squares can fit any type of curves through its model, the shortcomings of RMMEDA can be avoided, especially for the instances with nonlinkage variables.
The rest of this paper is organized as follows. After defining the continuous multiobjective optimization problem in Section 2, the new model of multiobjective evolutionary algorithm based on regression analysis is put forward in Section 3. Then, a description of the test cases for MMEARA follows in Section 4. After presenting the results of the tests, the performance of MMEARA is analyzed and some conclusions are given in Section 5.
2. Problem Definition
In this paper, the continuous multiobjective optimization problem is defined as follows [5]:where is the decision space and is the decision vector. consists of realvalued continuous objective functions . is the objective space.
Let and be two vectors, and is said to dominate , denoted by , if for all , and . A point is called (globally) Pareto optimal if there is no such that . The set of all Paretooptimal points, denoted by PS, is called the Pareto set. The set of all Pareto objective vectors is called the Pareto front, denoted by PF.
3. Algorithm
3.1. Basic Idea
Under certain smoothness assumptions, it can be induced from the KarushKuhnTucker condition that the PS of a continuous MOP defines a piecewise continuous ()dimensional manifold in the decision space [12]. Therefore, the PS of a continuous biobjective optimization problem is a piecewise continuous curve in .
The population in the decision space in a MOEA for (1) will hopefully approximate the PS and is uniformly scattered around the PS as the search goes on. Therefore, we can envisage the points in the population as independent observations of a random vector whose centroid is the PS of (1). Since the PS is a ()dimensional piecewise continuous manifold, can be naturally described by where is uniformly distributed over a piecewise continuous ()dimensional manifold, and is an dimensional zeromean noise vector. Figure 1 illustrates the basic idea.
3.2. Algorithm Framework
In this paper, a modelbased multiobjective evolutionary algorithm based on regression analysis is put forward to solve continuous multiobjective optimization problems with variable linkages. The algorithm is named as MMEARA. The algorithm works as follows.
MMEARA
Step 1 (initializing). Set . Generate an initial population and compute the value of each individual solution in .
Step 2 (stopping). If stopping condition is met, the algorithm stops and returns the nondominated solutions in , and their corresponding vectors constitute an approximation to the PF.
Step 3 (modeling). Build the probability model in to fit expression (2),(3.1)to compute the coefficients for by solving the matrix in expression (10);(3.2)to compute the manifold by expression (11);(3.3)to generate a dimensional zeromean noise vector between () randomly based on expressions (13) and (14).
Step 4 (reproducing). Generate a new solution set from expression (2). Evaluate the value of each solution in .
Step 5 (selecting). Select individuals from to create .
Step 6. Set and go to Step 2.
In the following Section 3.3, the implementation of modeling, reproducing, and selecting of the above algorithm will be given in detail.
3.3. Modeling
Fitting expression (2) to the points in is highly related to principal curve analysis, which aims at finding a central curve of a set of points in [13]. However, most current algorithms for principal curve analysis are rather expensive due to the intrinsic complexity of their models. RMMEDA uses the ()dimensional local principal component analysis (PCA) algorithm [14]; it is less complex compared with most algorithms for principal curve analysis. However, it needs much more CPU time compared with the traditional evolutionary algorithms which adopt genetic recombination operators such as crossover and mutation. Moreover, the PS cannot be exactly described by local PCA because it only uses linear curves to approximate the model at one cluster of .
In this implementation, we do not make use of clustering method in the modeling process. We try to find the principal curve of the whole points in , not just the local part of them. As is known, least squares approach is a simple and effective method for linear curve fitting and nonlinear curve fitting, such as polynomial or exponential curve fitting. Then we consider whether this technique could be made use of to describe expression (2).
For the sake of simplicity, it could be assumed that the centroid of is a manifold in formula (2), and is uniformly distributed on . is a ()dimensional hyperrectangle. Particularly, in the case of two objectives, is a curve segment in .
A line in 3dimensional space can be expressed as where , and are the coefficients of the expression.
The geometric meaning of expression (3) is that a 3dimensional line can be seen as the intersecting line of two planes and . Figure 2 illustrates this meaning.
The expression can be seen as the projection of the line in plane, and is the one in the plane.
As a 3dimensional line can be expressed by the intersecting of 2 planes, then a dimensional line can be expressed as the intersecting of () planes as
Expression can be regarded as the projection of the dimensional line in plane.
By expression (4), we further conclude that a dimensional curve can be regarded as the intersecting of () surface, and expression (5) shows this idea:
Expression can be regarded as the approximate projection of the dimensional curve in surface. Each expression is a order polynomial. is a point on the dimensional curve. Then the thing that we need to do is to find out all the coefficients , which could make the curve fit the population in the decision space well, and here we used least squares approach method to help us find out the best coefficients.
Least squares approach is mainly used to fit the curve, that is to say, to capture the trend of the data by assigning a single function across the entire range. Figure 3 shows the idea.
(a)
(b)
In Figure 3, Figure 3(a) looks linear in trend, so we can fit the curve by choosing a general form of the straight line , and then the goal is to identify the coefficients and such that fits the date well, the method to identify the two coefficients is called as linear regression. Figure 3(b) looks nonlinear, we use higher polynomial , and the goal is to find out the coefficients , and such that fits the date well. It is called as nonlinear regression compared with linear regression. In fact, there are a lot of functions with different shapes that depend on the coefficients. The methods to find out the best coefficients are just called as regression analysis (RA).
Consider the general form for a polynomial with order :
How can we choose the coefficients that best fit the curve to the data? The idea of least squares approach is to find a curve that gives minimum error between data and the fitting curve . As is shown in Figure 4, we can firstly add up the length of all the solid and dashed vertical lines and then pick curve with minimum total error. The general expression for any error using the least squares approach is For expression (7), we want to minimize the error err. Replace in expression (7) with the expression (6), and then we havewhere is the number of data points given, is the current data points being summed, and is the polynomial order. To find the best line means to minimize the square of the distance error between line and data points. Find the set of coefficients , that is to say, to minimize expression (8).
In Figure 4, there are four data points and two fitting curves and . Obviously, is better than because there is smaller error between the four points and the fitting curve .
To minimize expression (8), take the derivative with respect to each coefficient for , and set each to zero:
Rewrite these equations, and put into matrix form:
The coefficients for can be solved by matrix computation.
With the above work, we can describe the 1dimensional manifold aswhere is the polynomial order and and are the minimum and maximum values on :
In order to approximate the PS better, is extended by 50% along . Figure 5 shows this idea. In Figure 5, could not approximate the PS very well, but its extension can provide a better approximation.
When we find out the coefficients ) based on least square approach above, we could get in expression (2). is generated over uniformly and randomly.
In expression (2), is a dimensional zeromean noise vector, and it is designed as the following description:where is a random number between (). The noise is changed from big to small as the generation goes on because big noise can accelerate the convergence of the population in the early generation and small noise can maintain the accuracy of the population in the end. Expression (14) shows the implementation:where maxGen is the max generation of the algorithm and is set to be 200 and curGen is the current generation. is set to be 0.2 when the algorithm begins. Then the noise is changed from 0.2 to 0.02. The trends of the noise can be seen in Figure 6. As is shown in Figure 6, the noise decreases as the generation increases, and it will be stable after the 160th generation.
3.4. Reproducing
It is desirable that final solutions are uniformly distributed on the PS. Therefore, in order to maintain the diversity of the solution, in this paper, the new solution is generated uniformly and randomly as follows.
Step 1. Generate a point from uniformly and randomly.
Step 2. Generate a noise vector in expression (13).
Step 3. Return .
In Step 3 of the algorithm framework of MMEARA, new solutions can be produced by repeating above all steps times.
3.5. Selecting
The selection procedure used in this paper is the same in the procedure used in [5], which is based on the nondominated sorting of NSGAII [15]. The selection procedure is called as NDSselection.
The main idea of NDSselection is to divide into different fronts such that the th front contains all the nondominated solutions in . Therefore, there is no solution in that could dominate a solution in . Roughly speaking, is the best nondominated front in , and is the second best nondominated front, and so on. The detailed procedure of NDSselection can be found in [5].
4. Test Case
4.1. Performance Metric
In this paper, the performance metric used to evaluate the solutions is the convergence metric , which is also the common performance metric in multiobjective optimization algorithm [16].
The metric measures that the solutions will be convergent to a known set of Paretooptimal solutions. We find a set of 500 uniformly solutions from the true Paretooptimal front in the objective space. And then to compute the minimum Euclidean distance of each solution from chosen solutions on the Paretooptimal front. The average of theses distances is used as the metric .
4.2. General Experimental Setting
There are three algorithms employed to solve the test instance for a comparison. These three algorithms are RMMEDA, NSGAII, and MMEARA, while MMEARA is the new algorithm proposed in this paper.
The three algorithms are implemented by C++. The machine used in the test is Core 2 Duo (2.4 GHz, 2.00 GB RAM). The experiment setting is as follows. The number of new trial solutions generated at each generation is set to be 100 for all tests. The number of decision variables is set to be 30 for all tests. Parameter setting in RMMEDA: the number of cluster is set to be 5 in local PCA algorithm. Parameter setting in MMEARA: the order is set to be 2.
We run each algorithm independently 10 times for the test instance. The algorithms stop after a given number of generations. The maximal number of generations in three algorithm is 1000.
Table 1 gives the test instance [5]. In the test instance, the feasible decision space is a hyperrectangle. There are nonlinear variable linkages in the test case. Furthermore, the test instance has many local Pareto fronts since its has many locally minimal points. It also has some characteristics such as concave PF, nonlinear variable linkage, and multimodal with Griewank function.

If an element of solution , sampled from MMEARA or RMMEDA, is out of the boundary, we simply reset its value to a randomly selected value inside the boundary.
4.3. Performance Analysis
The evolution of the average metric of the nondominated solutions for the test case is shown in Figure 7. It should be noted that the solutions of all three algorithms are stable when the iteration generation is more than 300. After the solutions are stable, the convergence values of the three algorithms are small than 0.1. Because we adopt the average of the minimum Euclidean distance of each solution from chosen solutions as the metric , the smaller the convergence values, the better the convergence metric . As is shown by Figure 7, among the three algorithms, MEMARA has best convergence performance and NSGAII and RMMEDA follow.
Figure 8 shows the final nondominated solutions and fronts obtained by MMEARA on the test case. Figure 8(a) is the result with the lowest metric obtained in 10 runs while Figure 8(b) is all the 10 fronts in 10 runs. It can be seen that the nondominated fronts with the lowest metric are very close to the Pareto front, especially when tends to 0 and tends to 1. It can also be noted that the nondominated solutions in every run have some small fluctuations around the Pareto front.
(a)
(b)
The final nondominated solutions and fronts obtained by RMMEDA on the test case are shown in Figure 9. Similarly, Figure 9(a) is the result with the lowest metric obtained in 10 runs while Figure 9(b) gives all the 10 fronts in 10 runs. Similar to Figure 8, the nondominated solution(s) in Figures 9(a) and 9(b) are marked with red. The Pareto fronts are marked with blue. The Pareto fronts are given in Figures 9(a) and 9(b) only for comparing the quality of the nondominated solutions. It can be seen that the nondominated front with the lowest metric is very consistent with the Pareto front although there are some differences between them. In particular, it should be noted that all results in 10 runs from RMMEDA match the Pareto front better than MMEARA. But it also should be noted that there is an isolated point in the nondominated solutions for all 10 runs in Figure 9(b), maybe because RMMEDA falls into a local minimum and could not jump out.
(a)
(b)
The final nondominated solutions and fronts obtained by NSGAII on the test case are shown in Figure 10. Again, Figure 10(a) means the result with the lowest metric obtained in 10 runs and Figure 10(b) means all 10 fronts in 10 runs. As is shown in Figure 10(a), the nondominated front with the lowest metric is close to the Pareto front but different to the result obtained by MMEARA. The nondominated front with the lowest metric in NSGAII does not tend to the Pareto front very close. It does also not match the Pareto front as good as the result obtained by REMEDA. Similarly, the nondominated solutions in every run have some small fluctuations around the Pareto front.
(a)
(b)
The running time of the three algorithms are given in Table 2. From the point of the running time, as is shown in Table 2, among the three algorithms, NSGAII is the best, then MMEARA follows, and RMMEDA is the worst. This result is consistent with the main idea of the three algorithms. In RMMEDA, local principal component analysis (PCA) is used to construct the model, and it needs extra CPU time for running local PCA at each generation. In MMEARA, the least squares method is used to construct the model, and it is easy to run the least squares by matrix computation. MMEARA is slower than NSGAII because the selection in MMEARA is based on NSGAII.

Obviously, it can be seen that the nondominated front with the lowest metric obtained by MMEARA is the closest to the Pareto front in the three algorithms, which shows MMEARA is suitable to solve the problem with some characteristics such as concave PF, nonlinear variable linkage, and multimodal with Griewank function. In contrast, the results in 10 runs from RMMEDA mostly match the Pareto front, which shows the performance of RMMEDA is good in common.
5. Conclusion
In this paper, a modelbased multiobjective evolutionary algorithm based on regression analysis (MMEARA) is put forward to solve continuous multiobjective optimization problems with variable linkages. MMEARA models a promising area whose centroid is a complete and continuous curve described by expression (8). Because of this feature, MMEARA does not need to cluster the population. The least squares approach is simple yet enough to describe the nonlinear principal curve using the polynomial model.
The less CPU time of MMEARA does not come without a price. MMEARA samples points uniformly around the PS in the decision variable space, and the centroid of the model is not piecewise but complete curve. This makes it very difficult for MMEARA to approximate the whole PF. The experimental results also reveal that MMEARA may fail in test instances with many local Pareto fronts.
The future research topics along this line should include the following points:(1)designing an accurate model to describe the decision space: as the case of 3 objectives, the PS is a surface, so expression (8) cannot solve the problems with 3 objectives right now;(2)combining MMEARA with traditional genetic algorithms using operators such as crossover and mutation for accelerating the convergence of the algorithm;(3)improving the method to calculate random noise value to make the final population more convergent;(4)considering the distribution of the solutions in the objective space when sampling solutions from the models to improve the performance of MMEARA on the instance;(5)incorporating effective global search techniques for scalar optimization into MMEARA in order to improve its ability for global search.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Maocai Wang thanks the Special Financial Grant from China Postdoctoral Science Foundation (Grant no. 2012T50681), the General Financial Grant from China Postdoctoral Science Foundation (Grant no. 2011M501260), the Grant from China Scholarship Council (Grant no. 201206415018), and the Fundamental Research Funds for the Central Universities at China University of Geosciences (Grant no. CUG120114). Guangming Dai thanks the Grant from Natural Science Foundation of China (Grant nos. 61472375 and 60873107) and the 12th FiveYear Preresearch Project of Civil Aerospace in China.
References
 T. Back, D. B. Fogel, and Z. Michalewicz, Handbook of Evolutionary Computation, Oxford University Press, Oxford, UK, 1997. View at: Publisher Site  MathSciNet
 J. D. Schaffer, Multiple objective optimization with vector evaluated genetic algorithms [Ph.D. thesis], Vanderbilt University, Nashville, Tenn, USA, 1984.
 C. A. C. Coello, D. A. van Veldhuizen, and G. B. Lamont, Evolutionary Algorithms for Solving MultiObjective Problems, Kluwer Academic Publishers, New York, NY, USA, 2002.
 C. A. C. Coello, “A comprehensive survey of evolutionarybased multiobjective optimization techniques,” Knowledge and Information Systems, vol. 1, no. 3, pp. 269–308, 1999. View at: Publisher Site  Google Scholar
 Q. Zhang, A. Zhou, and Y. Jin, “RMMEDA: a regularity modelbased multiobjective estimation of distribution algorithm,” IEEE Transactions on Evolutionary Computation, vol. 12, no. 1, pp. 41–63, 2008. View at: Publisher Site  Google Scholar
 A. Jaszkiewicz, “Genetic local search for multiobjective combinatorial optimization,” European Journal of Operational Research, vol. 137, no. 1, pp. 50–71, 2002. View at: Publisher Site  Google Scholar  MathSciNet
 K. Deb, A. Sinha, and S. Kukkonen, “Multiobjective test problems, linkages, and evolutionary methodologies,” in Proceedings of the 8th Annual Genetic and Evolutionary Computation Conference (GECCO '06), pp. 1141–1148, Seattle, Wash, USA, July 2006. View at: Publisher Site  Google Scholar
 Y. Jin and B. Sendhoff, “Connectedness, regularity and the success of local search in evolutionary multiobjective optimization,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '03), vol. 3, pp. 1910–1917, IEEE, Canberra, Australia, December 2003. View at: Publisher Site  Google Scholar
 A. Zhou, Q. Zhang, Y. Jin, E. Tsang, and T. Okabe, “A modelbased evolutionary algorithm for Biobjective optimization,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '05), pp. 2568–2575, Edinburgh, UK, September 2005. View at: Google Scholar
 A. Zhou, Y. Jin, Q. Zhang, B. Sendhoff, and E. Tsang, “Combining modelbased and geneticsbased offspring generation for multiobjective optimization using a convergence criterion,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '06), pp. 3234–3241, Vancouve, Canada, July 2006. View at: Google Scholar
 A. Zhou, Q. Zhang, and Y. Jin, “Approximating the set of paretooptimal solutions in both the decision and objective spaces by an estimation of distribution algorithm,” IEEE Transactions on Evolutionary Computation, vol. 13, no. 5, pp. 1167–1189, 2009. View at: Publisher Site  Google Scholar
 O. Schutze, S. Mostaghim, M. Dellnitz, and J. Teich, “Covering Pareto sets by multilevel evolutionary subdivision techniques,” in Evolutionary MultiCriterion Optimization: Second International Conference, EMO 2003, Faro, Portugal, April 8–11, 2003. Proceedings, vol. 2632 of Lecture Notes in Computer Science, pp. 118–132, Springer, Berlin, Germany, 2003. View at: Publisher Site  Google Scholar
 T. Hastie and W. Stuetzle, “Principal curves,” Journal of the American Statistical Association, vol. 84, no. 406, pp. 502–516, 1989. View at: Publisher Site  Google Scholar  MathSciNet
 N. Kambhatla and T. K. Leen, “Dimension reduction by local principal component analysis,” Neural Computation, vol. 9, no. 7, pp. 1493–1516, 1997. View at: Publisher Site  Google Scholar
 M. Wang, G. Dai, and H. Hu, “Improved NSGAII algorithm for optimization of constrained functions,” in Proceedings of the International Conference on Machine Vision and HumanMachine Interface, pp. 673–675, Kaifeng, China, April 2010. View at: Publisher Site  Google Scholar
 K. Deb and S. Jain, “Running performance metrics for evolutionary multiobjective optimization,” Tech. Rep. 2002004, KanGAL, India Institute of Technology, Kanpur, India, 2004. View at: Google Scholar
Copyright
Copyright © 2015 Zhiming Song et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.