Abstract
Adaptive operator selection (AOS) and adaptive parameter control are widely used to enhance the search power of multiobjective evolutionary algorithms. This paper proposes a novel bandit-based adaptive selection strategy for the multiobjective evolutionary algorithm based on decomposition (MOEA/D), named latest stored information based adaptive selection (LSIAS). The strategy adopts an improved upper confidence bound (UCB) method, in which the operator usage rate and the abandonment of extreme fitness improvements are introduced to improve the performance of UCB. It uses a sliding window to store recent valuable information about operators, such as factors, probabilities, and efficiency. Four commonly used DE operators are managed by the AOS, and two kinds of assist information on operators are selected to improve the operators' search power. The operator information is updated with the help of LSIAS, and the resulting algorithmic combination is called MOEA/D-LSIAS. Compared with some well-known MOEA/D variants, LSIAS demonstrates superior robustness and fast convergence on various multiobjective optimization problems. The comparative experiments also demonstrate the improved search power of operators with different assist information on different problems.
1. Introduction
Multiobjective optimization, which concerns problems with multiple and often conflicting objectives, is commonly faced by scientists and engineers. In principle, a multiobjective optimization problem (MOP) has no single solution but a set of Pareto-optimal solutions. This paper considers the following continuous MOP:
$$\min F(x) = (f_1(x), \ldots, f_m(x))^T \quad \text{subject to } x \in \Omega, \tag{1}$$
where $x = (x_1, \ldots, x_n)^T$ is an $n$-dimensional decision variable vector, $\Omega \subseteq \mathbb{R}^n$ is the decision space, and $F: \Omega \to \mathbb{R}^m$ consists of $m$ real-valued continuous objective functions [1, 2].
Over the past decades, a number of multiobjective evolutionary algorithms (MOEAs) have been proposed. The first MOEA, the vector evaluated genetic algorithm, has been applied to MOPs since the 1980s [3], and MOEAs have attracted growing attention ever since. The first generation of intelligent algorithms for MOPs, represented by NSGA [4], NPGA [5], and MOGA [6], was presented in the 1990s. The second generation appeared in the 2000s; renowned examples include NSGA-II [7], SPEA-II [8], MOEA/D [9], and some biologically inspired algorithms such as MOPSO [2], MOIA [10], and MOACO [11]. These algorithms are usually classified into three categories: Pareto-based methods [4–8], indicator-based methods [12], and decomposition-based methods [9]. The first category evaluates individuals by nondominated ranking. The second integrates convergence and diversity into a single indicator that guides evolution. The last decomposes an MOP into a set of single-objective subproblems and evaluates solutions with regard to a reference vector.
A major difficulty for most MOEAs is how to improve search efficiency, that is, how to improve the operators so that they are more likely to search a high-dimensional space successfully. There are two approaches: enhancing an operator with adaptive parameter control and using multiple operators with adaptive operator selection (AOS). The simulated binary crossover (SBX) [13] and the differential evolution (DE) operator [14] are widely adopted, usually in combination with the binomial crossover or the polynomial mutation, as these combinations provide powerful search ability [15, 16]. Composite operators [17] and variants [18] still have limitations when searching complex problems, so adaptive control strategies are increasingly considered indispensable. In [19], the best individuals are selected as donor vectors for DE, and the parameters of every mutation vary within the range of the statistical data of each generation. Zhu et al. [17] design a novel recombination operator that combines the advantages of DE and SBX; its adaptive parameter control strategy allocates probabilities to SBX and DE according to the search period. Zhao et al. [20] suggest that, within the MOEA/D framework, different neighborhood sizes have an unavoidable influence on the search power of operators, and their experiments imply that adaptive selection of neighborhood sizes works very well.
The main purpose of adaptive parameter control is to address the exploration versus exploitation (EvE) dilemma. Exploitation means searching the local space deeply, that is, making full use of the current operator with its current parameters. Exploration, by contrast, refers to the search power in unfamiliar areas, which is provided by other operators or by the current operator configured with different parameters. The EvE dilemma can therefore be viewed as the search for a tradeoff between search power in familiar and unfamiliar areas.
The EvE dilemma is the reason AOS methods exist. It has been intensively studied in the game theory community in the context of the multi-armed bandit (MAB) problem, which was first proposed in 1952 [21]. The upper confidence bound (UCB) selection strategy has been used to address the EvE dilemma since 1994 [22]. It has distinctive advantages among AOS methods, and a host of improved UCB versions have appeared since. Because the UCB strategy is widely recognized as a solution to the MAB problem, UCB is referred to as the MAB algorithm in the remainder of this paper for convenience. DMAB [23] combines the MAB problem with two statistical tests and suggests that operator selection is sensitive to any change of the reward distribution. SlMAB [24] uses a sliding time window with a first in first out (FIFO) mechanism to store recent rewards and applied operators. Compared with DMAB, SlMAB requires fewer assessments and factors. Furthermore, two rank-based methods, the Area Under Curve and the Sum of Ranks, are presented for assigning credits [25]. Inspired by SlMAB and rank-based credit assignment, Li et al. [16] suggest that a decay factor is useful for credit assignment and can improve the selection probability of the best operator. Two modified MAB methods, called UCB-Tuned and UCB-V, use the reward variance as a parameter to obtain a better EvE tradeoff [26].
Whether an operator is enhanced with adaptive parameter control or multiple operators are used with AOS, statistical data about the operators are essential. Inspired by the sliding time window, we present a novel adaptive method that stores information about the operators, called the latest stored information based adaptive selection (LSIAS) strategy. The operator information includes operator names, operator efficacies, and operator parameters such as neighborhood sizes, scaling factors, and other parameters (for convenience, the operator parameters are referred to as assist information on operators). In this paper, two kinds of assist information are taken into account, and they are used within MOEA/D with dynamical resource allocation (MOEA/D-DRA) [27], which won the CEC 2009 MOEA contest. A decomposition-based algorithm is chosen mainly because each decomposed subproblem is a single-objective problem, which readily gives an exact value for measuring the performance of operators at every application. To validate the effectiveness and robustness of LSIAS, 22 well-known benchmark problems, including ZDT problems [28], DTLZ problems [29], and UF problems [30], are adopted. In the comparative experiments, MOEA/D-LSIAS is compared with several versions of MOEA/D, namely, MOEA/D-DE [31], MOEA/D-DRA [27], MOEA/D-FRRMAB [16], MOEA/D-STM [32], MOEA/D-UCBT [21], and MOEA/D-AGR [33].
The rest of this paper is organized as follows. The background and some works on AOS and adaptive parameter control are described in Section 2. A detailed description of the LSIAS strategy is given in Section 3, and its use with MOEA/D-DRA is presented in Section 4. Section 5 analyzes the results of the comparison experiments. Finally, conclusions are summarized and further work on adaptive selection is discussed.
2. Related Background
2.1. Tchebycheff Approach in MOEA/D
MOEA/D decomposes an MOP into a series of single-objective subproblems, which makes it well suited to evaluating the performance of operators. There are three common decomposition methods: the weighted sum approach, the Tchebycheff approach, and the boundary intersection method, all of which are described in [9]. This paper employs the Tchebycheff approach, which takes the form
$$\min g^{te}(x \mid \lambda, z^*) = \max_{1 \le i \le m} \lambda_i \, |f_i(x) - z_i^*| \quad \text{subject to } x \in \Omega, \tag{2}$$
where $z^* = (z_1^*, \ldots, z_m^*)^T$ is the reference point and $z_i^* = \min\{f_i(x) \mid x \in \Omega\}$ for each $i = 1, \ldots, m$. For each Pareto optimal point $x^*$ there exists a weight vector $\lambda$ such that $x^*$ is the optimal solution of (2), and each optimal solution of (2) is a Pareto optimal solution of (1). Therefore, one is able to obtain different Pareto optimal solutions by altering the weight vector.
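The Tchebycheff scalarization can be illustrated with a short sketch. The function name `tchebycheff` and the sample vectors are illustrative assumptions, not part of the original algorithm description; the code only shows how one weight vector turns an objective vector into a single value to be minimized.

```python
def tchebycheff(f, weights, z_star):
    """Tchebycheff value: max_i lambda_i * |f_i - z_i*| (smaller is better)."""
    return max(w * abs(f_i - z_i) for f_i, w, z_i in zip(f, weights, z_star))


# Two candidate objective vectors scored against the same reference point.
z_star = [0.0, 0.0]
a, b = [0.2, 0.8], [0.5, 0.5]

# With balanced weights, the balanced vector b scores better than a.
print(tchebycheff(a, [0.5, 0.5], z_star), tchebycheff(b, [0.5, 0.5], z_star))
# With weights that emphasize the first objective, a becomes the better one.
print(tchebycheff(a, [0.9, 0.1], z_star), tchebycheff(b, [0.9, 0.1], z_star))
```

Changing the weight vector changes which candidate wins, which is how different subproblems end up targeting different Pareto optimal solutions.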
2.2. Basic Differential Evolution Operator
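Differential evolution (DE) is a parallel direct search method with many mutation strategies. Several frequently used DE operators [15, 34–36] are given as follows:
(i) DE/rand/1: $v_i = x_{r_1} + F\,(x_{r_2} - x_{r_3})$,
(ii) DE/rand/2: $v_i = x_{r_1} + F\,(x_{r_2} - x_{r_3}) + F\,(x_{r_4} - x_{r_5})$,
(iii) DE/target-to-rand/1: $v_i = x_i + K\,(x_{r_1} - x_i) + F\,(x_{r_2} - x_{r_3})$,
(iv) DE/target-to-rand/2: $v_i = x_i + K\,(x_{r_1} - x_i) + F\,(x_{r_2} - x_{r_3}) + F\,(x_{r_4} - x_{r_5})$,
(v) DE/best/1: $v_i = x_{best} + F\,(x_{r_1} - x_{r_2})$,
(vi) DE/best/2: $v_i = x_{best} + F\,(x_{r_1} - x_{r_2}) + F\,(x_{r_3} - x_{r_4})$,
(vii) DE/target-to-best/1: $v_i = x_i + K\,(x_{best} - x_i) + F\,(x_{r_1} - x_{r_2})$,
where $x_i$ is the target vector, $v_i$ is the mutant vector, and both of them belong to the $i$th subproblem. $x_{best}$ is one of the best individual vectors in the population. The scale factors F and K control the influence of the difference vectors and are kept within a bounded range. The donor vectors $x_{r_1}$, $x_{r_2}$, $x_{r_3}$, $x_{r_4}$, and $x_{r_5}$ are different individuals.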
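A minimal sketch of four of these mutation strategies follows, assuming the common textbook form of each operator; the exact variants used in a given implementation may differ, for example in the sign convention of the K term. The names `de_mutate` and `pick_donors` and the default values F = K = 0.5 are illustrative assumptions.

```python
import random


def de_mutate(name, x_i, donors, F=0.5, K=0.5):
    """Return the mutant vector for one of four assumed textbook DE strategies."""
    r1, r2, r3, r4, r5 = donors
    n = len(x_i)
    if name == "rand/1":
        return [r1[j] + F * (r2[j] - r3[j]) for j in range(n)]
    if name == "rand/2":
        return [r1[j] + F * (r2[j] - r3[j]) + F * (r4[j] - r5[j]) for j in range(n)]
    if name == "target-to-rand/1":
        return [x_i[j] + K * (r1[j] - x_i[j]) + F * (r2[j] - r3[j]) for j in range(n)]
    if name == "target-to-rand/2":
        return [x_i[j] + K * (r1[j] - x_i[j]) + F * (r2[j] - r3[j])
                + F * (r4[j] - r5[j]) for j in range(n)]
    raise ValueError("unknown operator: " + name)


def pick_donors(population, i, rng=random):
    """Pick five mutually different donors, none of which is individual i."""
    candidates = [k for k in range(len(population)) if k != i]
    return [population[k] for k in rng.sample(candidates, 5)]


population = [[float(k), float(k)] for k in range(7)]
donors = pick_donors(population, 0)
print(de_mutate("rand/1", population[0], donors))
```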
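After the differential mutation, the crossover operation and the polynomial mutation are usually applied to the mutant vector $v_i$. The crossover operation decides, for each dimension of the new solution, whether the value is taken from the target vector $x_i$ or from the mutant vector $v_i$. Basically, DE works with the binomial crossover [14], and the trial vector $u_i$ is formed as follows:
$$u_{i,j} = \begin{cases} v_{i,j}, & \text{if } r \le CR \text{ or } j = j_{rand}, \\ x_{i,j}, & \text{otherwise}, \end{cases} \tag{3}$$
where $r$ is a random number in the range $[0, 1]$. CR is the crossover rate and has the same range as $r$. $j = 1, \ldots, n$, and $j_{rand}$ is a random integer within the range $[1, n]$.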
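The binomial crossover can be sketched as follows. The function name `binomial_crossover` is an illustrative assumption, and the code uses zero-based indices and a strict `<` comparison, which differs from `<=` only on a zero-probability event.

```python
import random


def binomial_crossover(x, v, CR, rng=random):
    """Mix target vector x and mutant vector v into a trial vector.

    Each component comes from v with probability CR; component j_rand always
    comes from v so the trial vector differs from x in at least one position.
    """
    n = len(x)
    j_rand = rng.randrange(n)  # zero-based counterpart of j_rand in [1, n]
    return [v[j] if (rng.random() < CR or j == j_rand) else x[j]
            for j in range(n)]


x = [0.0] * 5
v = [1.0] * 5
print(binomial_crossover(x, v, CR=0.5))
```

With CR = 0 exactly one component is inherited from the mutant vector, and with CR = 1 the trial vector equals the mutant vector.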
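The polynomial mutation further perturbs the trial vector $u_i$ to generate a new solution $y_i$. It is defined as follows:
$$y_{i,j} = \begin{cases} u_{i,j} + \delta_j \,(b_j - a_j), & \text{with probability } p_m, \\ u_{i,j}, & \text{with probability } 1 - p_m, \end{cases} \tag{4}$$
where $a_j$ and $b_j$ are the lower and upper bounds of the $j$th variable, respectively, and $p_m$ is the mutation probability. $\delta_j$ is a mutation factor and is obtained by
$$\delta_j = \begin{cases} (2\,rand)^{1/(\eta+1)} - 1, & \text{if } rand < 0.5, \\ 1 - (2 - 2\,rand)^{1/(\eta+1)}, & \text{otherwise}, \end{cases} \tag{5}$$
where $rand$ is a random number within the range $[0, 1]$ and $\eta$ is the distribution index, which is usually set to 20.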
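A minimal sketch of the polynomial mutation in its common form is given below. The function name `polynomial_mutation` and the final clipping to the variable bounds are assumptions added for illustration; the text does not state how out-of-bound values are repaired.

```python
import random


def polynomial_mutation(u, lower, upper, p_m, eta=20.0, rng=random):
    """Perturb each variable with probability p_m by a polynomial step."""
    y = []
    for u_j, a_j, b_j in zip(u, lower, upper):
        if rng.random() < p_m:
            r = rng.random()
            if r < 0.5:
                delta = (2.0 * r) ** (1.0 / (eta + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 - 2.0 * r) ** (1.0 / (eta + 1.0))
            u_j = u_j + delta * (b_j - a_j)
            u_j = min(max(u_j, a_j), b_j)  # assumed repair: clip to the bounds
        y.append(u_j)
    return y


print(polynomial_mutation([0.5, 0.5, 0.5], [0.0] * 3, [1.0] * 3, p_m=1.0 / 3))
```

A large distribution index such as 20 concentrates the step near zero, so most mutated values stay close to the original value.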
The description above is a common DE mutation process. It has been validated that these different DE operators have different search powers. "DE/rand/1" and "DE/rand/2" show strong exploration performance. "DE/target-to-rand/1," "DE/best/1," and "DE/best/2" show strong exploitation performance and are useful for unimodal problems. However, "DE/target-to-rand/1" is more suitable for rotated problems than other DE operators [19]. Accordingly, a combination of multiple DE operators could offer a strong search power to solve most MOPs.
2.3. Adaptive Operator Selection
Every operator has a different search power. Although self-adaptive parameter adjustment improves the search power of an operator, it remains limited by the best performance of that operator. AOS offers stronger search power by choosing different operators in different situations.
There are mainly two kinds of AOS methods: probability-based methods, such as probability matching (PM) [37] and adaptive pursuit (AP) [38], and bandit-based methods [39]. All AOS methods share the same two-step process: credit assignment based on the applied operators and operator selection based on the accumulated credit. They differ in the details of selection and credit accumulation.
2.3.1. Credit Assignment
Two credit assignment methods are often used. One is the dynamic evaluation of statistical information about operators, and the other is the search power evaluation, which uses various complex statistics to detect outlier production. The former takes recent assist information on operators into account. Some recent assist information is employed as rewards that decide the credit assignment of operators [24], and a rank factor is used to increase the usage frequency of better operators. The latter requires a complex calculation that considers two measures, fitness and diversity. In [40], the evaluation criterion depends on the appearance probability of outlier solutions. This criterion does not regard fitness as the only criterion, as the authors argue that infrequent but powerful operators are as significant as frequent but powerless operators. A density estimator is adopted as the evaluation criterion in [41, 42], and a statistical method is added in [42]. This method calculates the normalized relative fitness improvements from successful operators and regards the mean value of the improvements brought by an operator as its credit. In [26], four different credit assignment methods are adopted: Average Absolute Reward, Extreme Absolute Reward, Average Normalized Reward, and Extreme Normalized Reward. All the rewards of each method are evaluated, and the method with the maximum probability is chosen as the credit assignment method for the current generation.
2.3.2. Operator Selection
Assume that there are K different operators, and let $P_t = (p_{1,t}, \ldots, p_{K,t})$ and $q_{i,t}$ be the probability vector and the estimate of the $i$th operator's reward at time $t$, respectively.
Probability Matching.
$$q_{i,t+1} = (1 - \alpha)\, q_{i,t} + \alpha\, r_{i,t}, \tag{6}$$
$$p_{i,t+1} = p_{\min} + (1 - K p_{\min}) \frac{q_{i,t+1}}{\sum_{j=1}^{K} q_{j,t+1}}, \tag{7}$$
where $r$ is the reward of the selected and successfully applied operator and $t$ is the time point. $\alpha$ is a decay factor that alleviates the influence of the rewards accumulated by previously used operators. In this method, the worst probability is $p_{\min}$ and the best probability is $p_{\max} = 1 - (K - 1)\, p_{\min}$, so each operator has a nonzero probability of being chosen during the whole search process. Because every operator performs differently at different phases, selection with nonzero probabilities is well suited to AOS and gives it superior robustness.
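A minimal sketch of one probability matching step follows, assuming the standard formulation in which only the applied operator's reward estimate is updated. The function name `pm_update` and the values of `alpha` and `p_min` are illustrative assumptions.

```python
def pm_update(q, applied, reward, alpha=0.3, p_min=0.05):
    """One probability matching step for K operators.

    q       -- current reward estimates, one per operator
    applied -- index of the operator that was just applied
    reward  -- reward it obtained
    Returns the new estimates and the new selection probabilities.
    """
    K = len(q)
    q = list(q)
    q[applied] = (1 - alpha) * q[applied] + alpha * reward
    total = sum(q)
    if total <= 0:  # no information yet: fall back to uniform probabilities
        return q, [1.0 / K] * K
    p = [p_min + (1 - K * p_min) * q_i / total for q_i in q]
    return q, p


q, p = pm_update([0.0, 0.0, 0.0, 0.0], applied=0, reward=1.0)
print(p)  # operator 0 gets p_max, the others keep p_min
```

Even when one operator collects all the reward, every other operator keeps the floor probability `p_min`, which is the robustness property described above.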
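Adaptive Pursuit.
$$i^* = \arg\max_{i = 1, \ldots, K} q_{i,t+1}, \qquad p_{i,t+1} = \begin{cases} p_{i,t} + \beta\,(p_{\max} - p_{i,t}), & \text{if } i = i^*, \\ p_{i,t} + \beta\,(p_{\min} - p_{i,t}), & \text{otherwise}, \end{cases} \tag{8}$$
where $q_{i,t+1}$ is the same as in (6) and $\beta$ is the learning rate. When the best operator is successfully applied, it gets a relatively better reward. Its accumulated reward is therefore enlarged, which is why this selection strategy always favors the best operator, denoted $i^*$.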
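A minimal sketch of one adaptive pursuit step is shown below, assuming the standard winner-take-all formulation. The function name `ap_update` and the values of `alpha`, `beta`, and `p_min` are illustrative assumptions.

```python
def ap_update(q, p, applied, reward, alpha=0.3, beta=0.8, p_min=0.05):
    """One adaptive pursuit step: push the best operator toward p_max."""
    K = len(q)
    p_max = 1 - (K - 1) * p_min
    q = list(q)
    q[applied] = (1 - alpha) * q[applied] + alpha * reward
    best = max(range(K), key=lambda i: q[i])
    p = [p_i + beta * ((p_max if i == best else p_min) - p_i)
         for i, p_i in enumerate(p)]
    return q, p


q, p = ap_update([0.0] * 4, [0.25] * 4, applied=2, reward=1.0)
print(p)  # operator 2 moves quickly toward p_max = 0.85
```

Compared with probability matching, the probabilities here depend only on which operator is currently best, not on how much better it is.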
Multi-armed Bandit.
$$op_t = \arg\max_{i = 1, \ldots, K} \left( q_{i,t} + C \sqrt{\frac{2 \ln \sum_{j=1}^{K} n_{j,t}}{n_{i,t}}} \right), \tag{9}$$
where $q_{i,t}$ is the same as that in (6), $n_{i,t}$ is the number of successful applications of the $i$th operator up to time point $t$, and $C$ is a scaling factor that controls the tradeoff between different search powers. In the MAB algorithm, each operator corresponds to an arm.
Each operator gets a different credit after credit assignment, which is the key to operator selection. As (9) shows, the selection depends on two factors: the credit value of the operators ($q_{i,t}$) and the usage number of the operators (the term multiplied by the parameter C). The parameter C decides which factor is more important. SlMAB [24] uses a sliding window with a FIFO mechanism to store the latest information about operators, which reflects the operator performance more faithfully. Because accumulated rewards are not timely enough, a decay factor is suggested in [16]. In [21], two modified MAB methods are proposed, namely, MOEA/D-UCB-Tuned and MOEA/D-UCB-V. The two methods use the variance of the rewards to modify (9), and the experimental results show that UCB-Tuned performs better than UCB-V on most test problems.
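The UCB rule can be sketched as follows. The function name `ucb_select` is an illustrative assumption, and trying every unused operator once before applying the formula is a common convention assumed here to avoid division by zero.

```python
import math


def ucb_select(q, n, C=1.0):
    """Pick the operator maximizing q_i + C * sqrt(2 * ln(sum n) / n_i)."""
    for i, n_i in enumerate(n):
        if n_i == 0:  # assumed convention: try every operator at least once
            return i
    total = sum(n)
    scores = [q_i + C * math.sqrt(2.0 * math.log(total) / n_i)
              for q_i, n_i in zip(q, n)]
    return max(range(len(q)), key=lambda i: scores[i])


# A rarely used operator with slightly lower credit is still preferred ...
print(ucb_select([0.50, 0.45], [100, 2], C=1.0))
# ... unless C = 0, in which case only the credit value matters.
print(ucb_select([0.50, 0.45], [100, 2], C=0.0))
```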
Except for the two methods mentioned above, there are some other kinds of adaptive operator selection methods, such as gradient based methods and multiple trial vector comparison based methods. Schütze et al. [43] propose a local search mechanism, Hypervolume Directed Search (HVDS). In HVDS, the gradient information is used to select search behaviors which are greedy search, descent direction, and search along the Pareto front. As these search behaviors are based on gradients, these methods cannot be used if objectives are not differentiable. Lara et al. [2] suggest a novel iterative search procedure, Hill Climber with Sidestep (HCS), in which it is capable of moving both toward and along the Pareto set depending on the distance of the current iterate toward this set. The search direction selection is on the basis of the dominance relation between several trial vectors and the old individual.
3. The Proposed Algorithm
In this section, we present an improved bandit-based method for MOPs, named the latest stored information based adaptive selection strategy. This method pays more attention to the dynamic nature of AOS. It is mainly composed of two parts, credit assignment and operator selection.
3.1. Credit Assignment
Credit assignment contains two main tasks: one is to calculate credit value of applied operators; the other is to assign the credit fairly.
For the first task, the fitness change is adopted as the reward of a successfully applied operator and is measured as the Fitness Improvement Rate (FIR). Because the convergence levels of individuals differ greatly across search stages, the FIR is normalized as follows:
$$FIR_i = \frac{pf_i - cf_i}{pf_i}, \tag{10}$$
where $pf_i$ is the fitness value of the solution of the last generation on subproblem $i$ and $cf_i$ is the fitness value of the current solution on subproblem $i$.
A sliding window with length W is used to store operators, their assist information, and FIRs with a FIFO mechanism. It always stores the latest W configurations of operators and their related information.
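A FIFO sliding window of this kind can be sketched with a bounded deque. The class name `SlidingWindow`, the entry layout (operator, scaling factor, neighborhood, reward), and the `fir` helper, which assumes minimization with a positive previous fitness, are illustrative assumptions.

```python
from collections import deque


def fir(previous_fitness, current_fitness):
    """Normalized fitness improvement (minimization, previous fitness > 0)."""
    return (previous_fitness - current_fitness) / previous_fitness


class SlidingWindow:
    """Keeps only the latest W applied configurations and their rewards."""

    def __init__(self, W):
        self.entries = deque(maxlen=W)  # oldest entry is dropped automatically

    def record(self, operator, F, neighborhood, reward):
        self.entries.append((operator, F, neighborhood, reward))

    def rewards_of(self, operator):
        return [r for op, _, _, r in self.entries if op == operator]

    def usage(self, operator):
        return sum(1 for op, _, _, _ in self.entries if op == operator)


window = SlidingWindow(W=3)
window.record("DE/rand/1", 0.5, 0, fir(2.0, 1.5))
window.record("DE/rand/2", 0.6, 1, fir(1.5, 1.2))
print(window.rewards_of("DE/rand/1"))
```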
Suppose that there are several operators and T types of assist information on operators, each type having its own number of candidate values. The structure of the sliding window is shown in Figure 1. The first layer stores operator names, the last layer stores FIRs, and the middle layers store parameters. It is worth noting that the locations of different types of parameters in the sliding window are ranked according to their significance, and this order is also the computation order of the assist information on operators. Because the most suitable operator and the most suitable configuration of operator and assist information are considered simultaneously, the credit value of the operator is assigned first and the configuration of operator and assist information is considered afterwards. Configurations among different types of assist information are not taken into consideration. The index of FIR is not shown in Figure 1, because it is mainly associated with the credit assignment of different operators and different types of assist information on operators. The details of the index of FIR are illustrated in Figure 2.
Because rewards guide operator selection, unexpected extreme fitness improvements produced by operators can distort it. To guard against this, we discard the 5% best rewards and the 5% worst rewards, and the remaining rewards are denoted as R:
$$FRR_{op_i} = \frac{1}{|R_i|} \sum_{j=1}^{|R_i|} r_{i,j}, \qquad FRR_{i,t,k} = \frac{1}{|R_{i,t,k}|} \sum_{j=1}^{|R_{i,t,k}|} r_{i,t,k,j}, \tag{11}$$
where $FRR_{op_i}$ denotes the credit value of operator $i$ and $FRR_{i,t,k}$ is the credit value of a configuration of operator $i$ and the $k$th parameter of parameter type $t$; $r_{i,t,k,j}$ denotes the $j$th reward of operator $i$ and the $k$th parameter of parameter type $t$, which belongs to $R_{i,t,k}$, and $r_{i,j}$ denotes the $j$th reward of operator $i$, which belongs to $R_i$. $|\cdot|$ denotes the cardinality of a set. If the total usage number of the operator or the assist information is small, discarding extreme rewards is not very useful, so it is applied only when the total usage number is no less than 20.
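The trimming step can be sketched as follows, assuming that the credit is the mean of the retained rewards and that the plain mean is used when fewer than 20 rewards are available; both points are assumptions made for illustration, as is the function name `trimmed_credit`.

```python
def trimmed_credit(rewards, trim_percent=5, min_usage=20):
    """Mean reward after dropping the best and worst trim_percent of values.

    Trimming is applied only when at least min_usage rewards are available;
    otherwise the plain mean is returned (an assumption of this sketch).
    """
    if not rewards:
        return 0.0
    kept = sorted(rewards)
    if len(kept) >= min_usage:
        k = len(kept) * trim_percent // 100  # rewards dropped at each end
        kept = kept[k:len(kept) - k]
    return sum(kept) / len(kept)


# One extreme reward of 100.0 no longer dominates the credit value.
rewards = [0.0] + [1.0] * 18 + [100.0]
print(trimmed_credit(rewards))
```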
3.2. Operator Selection
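Based on the credit assignment described above, each configuration of operator and parameters gets its FRR, and the best configuration is then selected. Following the MAB algorithms, the selection of operators and parameters is defined as
$$op = \arg\max_{i = 1, \ldots, K} \left( FRR_{op_i} + C \sqrt{\frac{2 \ln \sum_{l=1}^{K} n_l}{n_i}} \right), \qquad para_t = \arg\max_{j = 1, \ldots, K} \left( FRR_{i,t,j} + C \sqrt{\frac{2 \ln \sum_{l=1}^{K} n_{t,l}}{n_{t,j}}} \right), \tag{12}$$
where $n_i$ is the number of successful applications of the $i$th operator, $n_{t,j}$ is the number of successful applications of the $j$th parameter of parameter type $t$, and K is the number of operators or of parameters of this type.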
As (12) shows, the selection depends on two factors. One is the credit value of the operators (FRR_op) and the other is the usage number of the operators (the term multiplied by the parameter C). Because the parameter C decides which factor is more important, we redefine it as the difference between the best usage rate and the worst usage rate. The advantage of such a parameter C is that the selection of operators and parameters becomes sensitive to the current usage pattern. If the difference is large, the usage number is the major factor. The configuration of the operator and parameter possessing maximum usage numbers has more possibility of giving full play to its search power. If the difference is small, the credit value is more important, and the configuration with the strongest search ability at this time is more likely to be selected. The parameter C is defined as
$$C = \frac{n_{\max} - n_{\min}}{W}, \tag{13}$$
where $n_{\max}$ and $n_{\min}$ are the maximum and minimum usage numbers, respectively, and W is the length of the sliding window, which can also be regarded as the total usage number.
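A minimal sketch of a usage-rate-based C combined with a UCB-style selection is given below. It assumes that C is the difference between the largest and smallest usage counts divided by the window length W and that the selection rule has the same form as the MAB rule in Section 2.3.2; the function names are illustrative assumptions.

```python
import math


def adaptive_c(usage_counts, W):
    """Difference between the best and the worst usage rate in the window."""
    return (max(usage_counts) - min(usage_counts)) / W


def select(credits, usage_counts, W):
    """UCB-style selection whose exploration weight adapts to usage rates."""
    for i, n_i in enumerate(usage_counts):
        if n_i == 0:  # assumed convention: unused candidates are tried first
            return i
    C = adaptive_c(usage_counts, W)
    total = sum(usage_counts)
    scores = [c + C * math.sqrt(2.0 * math.log(total) / n_i)
              for c, n_i in zip(credits, usage_counts)]
    return max(range(len(credits)), key=lambda i: scores[i])


# Balanced usage: C = 0, so the candidate with the highest credit wins.
print(select([0.5, 0.4, 0.3, 0.2], [10, 10, 10, 10], W=40))
# Unbalanced usage: C = 0.9, so the usage term dominates the decision.
print(select([0.5, 0.45, 0.1, 0.1], [37, 1, 1, 1], W=40))
```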
The pseudocode of operator and parameter selection is given in Algorithm 1.

4. Integration of LSIAS with MOEA/D-DRA
MOEA/D provides a decomposition method that decomposes an MOP into a series of single-objective subproblems. These subproblems give an exact value for measuring the performance of every operator. Therefore, the main reason we choose MOEA/D is that the metrics for evaluating operators are easy to obtain.
Since MOEA/D was presented in 2007, a good deal of research has been devoted to improving its performance and widening its range of application. Any improvement on MOEA/D could be of practical interest. Here we choose a well-known improved version of MOEA/D, MOEA/D-DRA, as the framework, as it is the champion of the CEC 2009 MOEA contest. Consequently, we investigate how to enhance MOEA/D with LSIAS.
4.1. MOEA/D-DRA
In this paper, we use Tchebycheff with objective normalization instead of classic Tchebycheff. It is known that the objective normalization performs better than classic methods especially when the objective space becomes more complex.
MOEA/D minimizes all these objective functions simultaneously in a single run. Neighborhood relations among the single-objective subproblems are defined based on the distances among their weight vectors. Each subproblem is optimized by using information mainly from its neighboring subproblems. In most versions of MOEA/D, all subproblems receive about the same amount of computational effort. These subproblems, however, may have different computational difficulties; therefore, it is reasonable to assign different amounts of computational effort to different subproblems. In MOEA/D-DRA, N/5 subproblems are selected to be optimized based on their utility in a single run. We define and compute a utility $\pi^i$ for each subproblem $i$, and computational effort is distributed to the subproblems based on their utilities. $\Delta^i$ is the relative improvement of subproblem $i$. If the solution of subproblem $i$ has improved sufficiently over the one in the last generation, the value of $\pi^i$ is 1; otherwise, the value of $\pi^i$ is reduced below 1. Subproblems with a larger utility are selected for optimization with a larger probability in the next generation.
Suppose that the MOP is decomposed into N scalar subproblems, and each subproblem has a weight vector $\lambda^j = (\lambda_1^j, \ldots, \lambda_m^j)^T$, with the weight vectors uniformly spread. In $\lambda^j$, each $\lambda_i^j$ satisfies $\lambda_i^j \ge 0$ and $\sum_{i=1}^{m} \lambda_i^j = 1$. The objective function of the $j$th subproblem can be stated as (2). When searching, MOEA/D-DRA maintains
(i) a population of N points $x^1, \ldots, x^N \in \Omega$, where $x^i$ is the current solution to the $i$th subproblem;
(ii) $FV^1, \ldots, FV^N$, where $FV^i$ is the F-value of $x^i$; that is, $FV^i = F(x^i)$ for each $i = 1, \ldots, N$;
(iii) $\pi^1, \ldots, \pi^N$: the utility of the $i$th subproblem, which measures the improvement of the individual between the previous and the current generation. It is defined as
$$\pi^i = \begin{cases} 1, & \text{if } \Delta^i > 0.001, \\ \left(0.95 + 0.05\, \dfrac{\Delta^i}{0.001}\right) \pi^i, & \text{otherwise}, \end{cases} \tag{14}$$
where $\Delta^i$ is the relative improvement of subproblem $i$.
An important advantage of MOEA/D and its improved versions is that a better solution is helpful both for the $i$th subproblem and for the subproblems close to it. There is a one-to-one correspondence between subproblems and weight vectors. Thus, each solution converges along its vector from beginning to end, and the excellent ones can bring valuable information that helps neighboring individuals evolve. Each subproblem has T neighbors, which are selected based on the Euclidean distance. For generation t, the process is as follows:
(1) Select the T neighbors or the whole population as P;
(2) Randomly select several solutions from P;
(3) Generate a new solution by applying genetic operators;
(4) Compare the new solution with old ones selected from P and replace them if the new one is better.
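The utility update can be sketched as follows, assuming the rule commonly given for MOEA/D-DRA with a threshold of 0.001 on the relative improvement. The function name `update_utility` is an illustrative assumption.

```python
def update_utility(pi, delta, threshold=0.001):
    """Utility update for one subproblem.

    pi    -- current utility of the subproblem
    delta -- relative improvement of its solution since the last update
    """
    if delta > threshold:
        return 1.0
    return (0.95 + 0.05 * delta / threshold) * pi


# A subproblem that keeps stagnating slowly loses utility ...
pi = 1.0
for _ in range(3):
    pi = update_utility(pi, 0.0)
print(pi)
# ... and recovers full utility as soon as it improves noticeably.
print(update_utility(pi, 0.01))
```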
4.2. MOEA/D-DRA with LSIAS
To combine LSIAS with MOEA/D-DRA, we set the content of the latest stored information and define the reward calculation. In this paper, we choose four kinds of operator-related information as the content of the latest stored information: an operator pool, the scaling factor F, the neighborhood size, and the FIRs. They are stored in the sliding window as shown in Figure 3. The operator pool consists of four DE operators, denoted $op_i$ ($i = 1, \ldots, 4$):
(i) $op_1$: DE/rand/1
(ii) $op_2$: DE/rand/2
(iii) $op_3$: DE/target-to-rand/1
(iv) $op_4$: DE/target-to-rand/2.
In MOEA/DDRA, each vector employs T nearest vectors as its neighborhood. In this paper, suppose that all vectors are divided into several neighborhoods with the size of T. Choosing three nearest neighborhoods for , it means that each subproblem has three neighborhoods as its neighborhood pool to provide donor vectors for mutation. But just only one neighborhood can be used for every mutation.
The scaling factor F is used for the adaptive parameter control of DE operators. Because of the different search power of the four, four independent scaling factors are set. Every F follows a Cauchy distribution with location parameter and scale parameter (). If a new F is out of the range , it will be regenerated.where is a weight factor. It is validated that a little random perturbation of can bring better effectiveness for the optimization process [17]. As a result, it is randomly generated within the range . is updated as (16) and it is initialized to be 0.5. is the set of the scaling factor of ith operator which is stored in the sliding window. is given as follows:where is the number of in the sliding window. k is set to 1.5 which is proved to be the best in [36].
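A minimal sketch of this kind of scaling factor adaptation follows. It assumes a scale parameter of 0.1, an admissible range of (0, 1] for F, the standard power mean with exponent k = 1.5, and a convex-combination update of the location parameter; all of these specifics, as well as the function names, are assumptions made for illustration.

```python
import math
import random


def sample_F(mu, scale=0.1, low=0.0, high=1.0, rng=random):
    """Draw F from Cauchy(mu, scale); regenerate until low < F <= high."""
    while True:
        F = mu + scale * math.tan(math.pi * (rng.random() - 0.5))
        if low < F <= high:
            return F


def power_mean(values, k=1.5):
    """Power mean with exponent k of the stored scaling factors."""
    return (sum(v ** k for v in values) / len(values)) ** (1.0 / k)


def update_mu(mu, stored_F, w):
    """Move the location parameter toward the power mean of stored factors."""
    return (1.0 - w) * mu + w * power_mean(stored_F)


mu = 0.5  # initial location parameter
stored = [sample_F(mu) for _ in range(10)]  # stands in for the sliding window
mu = update_mu(mu, stored, w=0.1)
print(mu)
```

With k greater than 1, the power mean weights larger scaling factors more heavily than the arithmetic mean does.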
Assume that is a new offspring for ith subproblem and is the old one. If the new one is better than the old one, the former replaces the latter. The reward is the same as the relative improvement in [26], and it is given by
In MOEA/DDRA, each new offspring is compared with solutions selected from the neighborhood; thus the is the sum of brought by with the current operator and the current parameters.
The pseudocode of MOEA/D-LSIAS is given in Algorithm 2.

5. Experimental Studies
In this section, experiments are conducted to analyze the performance of our algorithm. Section 5.1 briefly introduces the 22 well-known benchmark problems used in the experiments. Two performance measures and all parameter settings for the experiments are described in Sections 5.2 and 5.3, respectively. Comparative experiments between MOEA/D-LSIAS and other state-of-the-art MOEA/D variants are given and analyzed in Section 5.4. Additionally, experiments on parameters are conducted in Sections 5.5 and 5.6 to analyze the effectiveness of parameters with different numbers.
5.1. Benchmark Functions
Three types of benchmark functions are adopted here to demonstrate the effectiveness and robustness of MOEA/D-LSIAS. There are 22 well-known benchmark problems in total, including ZDT problems [28], DTLZ problems [29], and UF problems [30]. The solution set of these benchmark problems (the Pareto set) is not given by a single point but forms an $(m-1)$-dimensional object, where m is the number of objectives involved in the MOP.
The most widely used ZDT problems are bi-objective test problems, including ZDT1–4 and ZDT6. Problems of this kind are easy to solve because they lack features such as variable linkage and multimodality. Therefore, UF problems and DTLZ problems, whose various features make up for the deficiency of ZDT problems, are also covered in the experiments. It is worth noting that UF1–7 are all bi-objective test problems and DTLZ1–7 and UF8–10 are three-objective test problems. All three types of benchmark functions are widely used for the evaluation of MOEAs.
5.2. Performance Measure
There are many performance measures for comparisons among algorithms, like Inverted Generational Distance (IGD) [44], Hypervolume (HV) [45], Spread [46], [47], and so on. Because all comparative algorithms in their literature employ either IGD or HV or both of them, they are also chosen to assess the performance of the proposed algorithm and its comparisons. The two metrics possess the ability of convergence assessment and diversity assessment.
IGD. Let $P^*$ be a set of solutions distributed in the true Pareto-optimal front, which consists of all best solutions. Let $P$ be an approximation set obtained by an MOEA. The IGD metric of $P$ is obtained by
$$IGD(P, P^*) = \frac{\sum_{x \in P^*} d(x, P)}{|P^*|}, \tag{19}$$
where $d(x, P)$ is the minimum Euclidean distance between $x$ and its nearest solution in $P$, and $|P^*|$ is the number of elements in $P^*$. The true Pareto-optimal front set is known in advance and is available at http://jmetal.sourceforge.net/problems.html. In this paper, we select 1,000 and 10,000 points uniformly distributed in the true Pareto-optimal front for bi-objective and three-objective test problems, respectively.
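The IGD computation can be sketched in a few lines. The function name `igd` and the tiny example fronts are illustrative assumptions; `math.dist` requires Python 3.8 or later.

```python
import math


def igd(reference_front, approximation):
    """Average distance from each reference point to its nearest obtained point."""
    total = 0.0
    for x in reference_front:
        total += min(math.dist(x, p) for p in approximation)
    return total / len(reference_front)


reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
obtained = [(0.0, 1.0), (1.0, 0.0)]
print(igd(reference, obtained))  # penalized for missing the middle of the front
```

Because the average runs over the reference points, an approximation set that converges well but covers only part of the front still receives a poor score, which is why IGD reflects both convergence and diversity.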
HV. Assume that $z^r = (z_1^r, \ldots, z_m^r)^T$ is the reference point dominated by all the points in $P^*$. The HV metric is the size of the objective space dominated by the approximation set $P$ and bounded by $z^r$:
$$HV(P) = \mathrm{Leb}\left( \bigcup_{x \in P} [f_1(x), z_1^r] \times \cdots \times [f_m(x), z_m^r] \right), \tag{20}$$
where $\mathrm{Leb}$ indicates the Lebesgue measure. Note that if a solution in P does not dominate $z^r$, it is discarded. In this paper, $z^r$ is set to (2.0, 2.0) for ZDT1–ZDT4, ZDT6, and UF1–UF7, (1.0, 1.0, 1.0) for DTLZ1, (2.0, 2.0, 2.0) for DTLZ2–DTLZ6 and UF8–UF10, and (2.0, 2.0, 2.0) for DTLZ7.
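For the bi-objective case, the hypervolume can be computed with a simple sweep, sketched below for minimization problems. The function name `hypervolume_2d` is an illustrative assumption, and the three-objective case requires a more elaborate algorithm that is not shown here.

```python
def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by the reference point ref."""
    # Points that do not dominate the reference point are discarded.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:  # sweep in order of increasing first objective
        if f2 < best_f2:  # the point is nondominated so far: add its new strip
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(4.0, 4.0)))
```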
5.3. Experimental Settings
In this paper, the proposed algorithm, MOEA/D-LSIAS, is compared with various improved versions of MOEA/D, including MOEA/D-DE, MOEA/D-DRA, MOEA/D-FRRMAB, MOEA/D-STM, MOEA/D-UCBT, and MOEA/D-AGR. All the basic experimental settings for these competing algorithms are shown in Table 1. Three kinds of benchmark functions are adopted, and the population size N and the maximum number of function evaluations are set differently for each. N is set to 100 for all ZDT benchmark problems. N is set to 300 for all DTLZ problems and all UF problems that are three-objective problems. The maximum number of function evaluations is set to 25,000 for ZDT problems and 300,000 for UF problems and DTLZ problems. The parameter settings of MOEA/D-LSIAS are as follows. The control parameters CR and C and the sliding window size W are shown in Table 1. The operator parameters are explained in Section 4, and the neighborhood size is set to 20. Every test instance is run 30 independent times, and the mean value and the interquartile range (IQR) are used to measure the results. Furthermore, Wilcoxon's rank sum test is employed at a fixed significance level to assess the statistical significance of the experimental results of the compared algorithms.
5.4. Comparisons of MOEA/D-LSIAS with State-of-the-Art MOEA/D Variants
5.4.1. Comparisons on the ZDT Test Problems
Tables 2 and 3 provide the comparative experiment results of all the competitive algorithms on the ZDT test problems, which show the mean value and the interquartile range of IGD and HV over 30 independent runs, respectively. The best result for each test problem is highlighted in boldface with regard to the metric in Tables 2–7.
In Table 2, MOEA/D-LSIAS performs best on ZDT1 and ZDT2, while MOEA/D-UCBT, MOEA/D-STM, and MOEA/D-DE are the best on ZDT3, ZDT4, and ZDT6, respectively. Moreover, Wilcoxon's rank sum test indicates that MOEA/D-LSIAS is similar to MOEA/D-AGR on ZDT2 and to MOEA/D on ZDT6. The comparison results for the compared algorithms on the ZDT test problems are summarized in the last two rows. The "+/−/≈" row summarizes the competitive results on the ZDT test problems regarding IGD. It can be observed that every compared algorithm except MOEA/D-DRA performs better than MOEA/D-LSIAS at least once. However, the "Rank Sum" row indicates that MOEA/D-LSIAS gets the first rank, and MOEA/D-AGR is the best among the compared algorithms.
Table 3 provides the comparative results using HV. MOEA/D-LSIAS gets the best performance on ZDT1 and ZDT6, while MOEA/D-AGR is the best on ZDT2 and ZDT4. MOEA/D-FRRMAB performs best only on ZDT3. Wilcoxon's rank sum test indicates that MOEA/D-LSIAS is similar to MOEA/D-DE on ZDT6. It is worth noting that MOEA/D-UCBT and MOEA/D-FRRMAB perform similarly on most ZDT test problems. The "+/−/≈" row shows that MOEA/D-LSIAS performs better or similarly on over half of the ZDT test problems. The "Rank Sum" row also shows that MOEA/D-LSIAS is the best. The advantages of MOEA/D-LSIAS are thus further confirmed by HV.
5.4.2. Comparisons on the DTLZ Test Problems
DTLZ test problems are three-objective optimization problems, and they are harder than the ZDT test problems. Tables 4 and 5 suggest that MOEA/DLSIAS achieves the best overall performance.
Table 4 gives comparative results of all the compared algorithms on the DTLZ test problems regarding IGD. MOEA/DLSIAS performs best on 4 (i.e., DTLZ1, DTLZ3, DTLZ5, and DTLZ6) out of 7 DTLZ test problems, that is, on more than half of them. The “+/−/≈” row indicates that MOEA/DLSIAS performs better than MOEA/DDE, MOEA/DDRA, MOEA/DFRRMAB, MOEA/DSTM, MOEA/DUCBT, and MOEA/DARG on 6, 6, 7, 4, 7, and 5 out of 7 DTLZ test problems, respectively. The Wilcoxon’s rank sum test shows that MOEA/DLSIAS performs similarly to MOEA/DSTM on DTLZ5. As observed from the “Rank Sum” row, MOEA/DLSIAS is the best algorithm on the DTLZ test problems with regard to IGD.
In Table 5, the results on the DTLZ test problems with regard to HV are provided. MOEA/DLSIAS gets the best result on DTLZ1, DTLZ4, and DTLZ7, and it is worse than MOEA/DSTM on DTLZ2, worse than MOEA/DARG on DTLZ3 and DTLZ5, and worse than MOEA/DUCBT and MOEA/DFRRMAB on DTLZ6. It achieves statistically similar results to MOEA/DDRA on DTLZ1 and to MOEA/DSTM on DTLZ5 and DTLZ7. It is reasonable to conclude that MOEA/DLSIAS achieves superior performance on most of the DTLZ test problems according to both the IGD and HV metrics.
5.4.3. Comparisons on the UF Test Problems
Table 6 lists the comparative results with regard to IGD on the UF problems, which possess more complicated Pareto-optimal sets than the ZDT and DTLZ test problems. It is observed that MOEA/DLSIAS performs best on 5 (i.e., UF1, UF2, UF4, UF6, and UF10) out of 10 UF test problems, while MOEA/DARG performs best on 3 (i.e., UF3, UF5, and UF7). MOEA/DSTM achieves the best results on UF8 and UF9. The Wilcoxon’s rank sum test indicates that MOEA/DLSIAS performs similarly to MOEA/DSTM on UF1 and to MOEA/DDRA on UF10. The “+/−/≈” and “Rank Sum” rows show that MOEA/DARG and MOEA/DSTM are serious rivals to MOEA/DLSIAS. Nevertheless, MOEA/DLSIAS performs best overall on the UF test problems with regard to IGD.
Table 7 shows the results of the HV metric for MOEA/DLSIAS and the various versions of MOEA/D. MOEA/DLSIAS again performs best on 5 (i.e., UF2–4 and UF6–7) out of 10 UF test problems. In more detail, MOEA/DLSIAS is worse than MOEA/DSTM on UF1, UF5, UF8, and UF9, worse than MOEA/DARG on UF9 and UF10, and worse than MOEA/DFRRMAB on UF8. Moreover, the Wilcoxon’s rank sum test reveals that MOEA/DLSIAS performs similarly to MOEA/DARG on UF4. As summarized in the “Rank Sum,” MOEA/DLSIAS is better than MOEA/DDE and MOEA/DDRA on all the UF test problems. Compared with MOEA/DFRRMAB, MOEA/DSTM, MOEA/DUCBT, and MOEA/DARG, MOEA/DLSIAS performs better or in a similar way on more than half of the UF test problems. Therefore, the HV results confirm that MOEA/DLSIAS performs better overall than the compared algorithms.
5.5. MOEA/DLSIAS versus AOSs with Different Assist Information
To investigate the effectiveness of different types of assist information, a further experiment with different configurations of operator information is conducted. In Section 3, two types of assist information are considered: adaptive selection of neighborhood and adaptive control of factor F. The compared algorithms are the same as MOEA/DLSIAS except for the selection of operator information, and they all employ AOS with four DE operators.
(1) Method 1 (M1): AOS with no assist information,
(2) Method 2 (M2): AOS with adaptive selection of neighborhood,
(3) Method 3 (M3): AOS with adaptive control of factor F.
Figure 4 provides the comparison results of the four algorithms on ZDT1, DTLZ1, and UF1. For the IGD metric, the variants with assist information are better than AOS with no assist information on ZDT1, but this does not hold on DTLZ1 and UF1. M2 and M3 perform worse than M1 on UF1, while MOEA/DLSIAS is better than M1 on all three test problems. It is concluded that combining adaptive control of factor F with adaptive selection of neighborhood helps to improve the performance of AOS regarding IGD.
[Figure 4, panels (a)–(f)]
Regarding the HV metric, M2, M3, and MOEA/DLSIAS work better than M1 on ZDT1. M1 performs better than M2 and M3 on DTLZ1 and UF1, but is clearly outperformed by MOEA/DLSIAS. The results indicate that assist information is not always helpful for the AOS on its own, but that a good configuration of assist information is. The size of the improvement depends on the configuration of assist information, and AOS with different assist information can display different search power on different test problems.
5.6. The Dynamics of Operator and Assist Information Selection
To further investigate the dynamic behavior, we analyze the usage of operators and of operators combined with different assist information. The search process is divided into 50 search phases. Each phase is composed of 500 function evaluations for ZDT1 and of 6000 function evaluations for UF1 and DTLZ1. The number of times each operator is used is then counted for every phase. The usage of operators and of operators with different assist information during the whole search process on ZDT1, DTLZ1, and UF1 is shown in Figures 5–7.
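The phase-wise counting described above can be sketched as follows. The sketch assumes that a log with one entry per function evaluation is available, where each entry is a label such as an operator name or an (operator, neighborhood) pair; the function name and the log format are hypothetical.

```python
from collections import Counter


def usage_per_phase(operator_log, phase_length, n_phases=50):
    """Count how often each label is used in each search phase.

    operator_log[i] is the label applied at function evaluation i.
    Returns a list of n_phases Counter objects. Evaluations beyond
    n_phases * phase_length are folded into the last phase.
    """
    phases = [Counter() for _ in range(n_phases)]
    for evaluation, label in enumerate(operator_log):
        phase = min(evaluation // phase_length, n_phases - 1)
        phases[phase][label] += 1
    return phases
```

With the settings stated above, `phase_length` would be 500 for ZDT1 (25,000 evaluations) and 6000 for DTLZ1 and UF1 (300,000 evaluations). Logging tuples such as `("op1", "ne1")` instead of plain operator names yields the joint usage of operators and assist information.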
[Figure 5, panels (a)–(e)]
[Figure 6, panels (a)–(e)]
In Figure 5, op4 shows effective search power during all the phases, but the impact of the other three operators cannot be ignored. op1 performs better with ne1 than with the other neighborhoods, but it only provides effective search power at the beginning of the search. op3 has two good partners, ne1 and ne3, and plays an important role during the second half of the search. During the whole search, all the assist information works well with op4.
In Figure 6, it is found that at the beginning of the search op3 with ne3 and op4 with ne2 possess powerful search capability. In the first half of the search, op1 provides effective search, but overall the four operators provide effective search in turn. The important partners of op1 are ne1 and ne1. op2 works well with all the neighborhoods, but op4 only works well with ne2.
As observed from Figure 7, all four operators are heavily used at the beginning, but op4 and op3 provide little help during the second half of the search. ne3 has a significant effect for different operators during different search phases, which suggests that the farthest neighborhood set is very helpful for the search. ne1, ne2, and ne3 offer help for op1 and op2 in turn. op3 and op4 perform well with ne2 and ne3. op4 works well in the first half of the search, but it does not perform well later.
[Figure 7, panels (a)–(e)]
6. Conclusions
In this paper, a novel AOS method called LSIAS is introduced. In LSIAS, the best operator usage rate and the worst operator usage rate are used to improve the UCB method. To guard against the influence of unexpectedly large or small fitness improvements, a credit assignment method based on the abandonment of extreme fitness improvements is introduced. LSIAS is also an adaptive selection strategy, which is used to adaptively select operators and the assist information attached to them. Each recently used operator, together with all of its assist information and its efficiency, is dynamically stored in a sliding window. Based on the efficiency, the best configuration of the operator and its assist information is dynamically chosen for each search phase.
Since decomposition-based MOEAs make it easy to evaluate the efficiency of operators with their assist information, a variant of MOEA/D is adopted to investigate the performance of the proposed LSIAS. Four DE mutation operators and two kinds of assist information, namely neighborhoods and scaling factors, are used in this method within the MOEA/D framework. We conduct extensive experimental studies on three kinds of benchmark functions. The experiments show that LSIAS is robust and effective and that its adaptive selection of operators and assist information can significantly improve the performance of MOEA/D.
In future work, the adaptive selection of operators with assist information will be applied to many-objective optimization problems and constrained optimization problems. Furthermore, as indicator-based MOEAs can also provide exact estimation for operators with assist information, the performance of LSIAS within such algorithms can also be studied.
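The interplay of the sliding window, the abandonment of extreme fitness improvements, and UCB-based selection can be illustrated with the sketch below. It is a generic sliding-window UCB written under stated assumptions and is not the LSIAS formula: the class and method names are hypothetical, extreme values are handled by simply dropping the single largest and smallest improvement per operator, fitness improvements are assumed to be non-negative, and the usage-rate correction of LSIAS is omitted because its exact form is not given in this section.

```python
import math
from collections import deque


class SlidingWindowUCB:
    """Generic sliding-window UCB with trimmed rewards (illustrative only)."""

    def __init__(self, n_ops, window_size, c):
        self.n_ops = n_ops
        self.c = c  # exploration weight, playing the role of the control parameter C
        # Each entry is (operator index, fitness improvement); old entries drop out.
        self.window = deque(maxlen=window_size)

    def record(self, op, improvement):
        self.window.append((op, improvement))

    @staticmethod
    def trimmed_mean(values):
        # Abandon the largest and smallest improvement when enough samples exist.
        if len(values) > 2:
            values = sorted(values)[1:-1]
        return sum(values) / len(values)

    def select(self):
        per_op = [[] for _ in range(self.n_ops)]
        for op, improvement in self.window:
            per_op[op].append(improvement)
        # Any operator absent from the window is tried first.
        for op, values in enumerate(per_op):
            if not values:
                return op
        rewards = [self.trimmed_mean(v) for v in per_op]
        total_reward = sum(rewards)
        total_uses = len(self.window)
        scores = []
        for op in range(self.n_ops):
            exploit = rewards[op] / total_reward if total_reward > 0 else 0.0
            explore = self.c * math.sqrt(2 * math.log(total_uses) / len(per_op[op]))
            scores.append(exploit + explore)
        return max(range(self.n_ops), key=scores.__getitem__)
```

In this sketch a single outlier, such as one very large improvement produced by an otherwise weak operator, does not dominate that operator's reward, which is the motivation stated above for abandoning extreme fitness improvements.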
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grants nos. 7170129 and 71771216).