Abstract
The backtracking search optimization algorithm (BSA) is a new nature-inspired method that possesses a memory, which lets it use the experience gained from previous generations to guide the population toward the global optimum. BSA is capable of solving multimodal problems, but it converges slowly and exploits solutions poorly. The differential evolution (DE) algorithm is a robust evolutionary algorithm that converges fast when it uses exploitative mutation strategies, that is, strategies that utilize the information of the best solution found so far. In this paper, we propose a hybrid backtracking search optimization algorithm with differential evolution, called HBD. In HBD, DE with an exploitative strategy is used to accelerate convergence by optimizing one worse individual, selected according to its probability, at each iteration. A suite of 28 benchmark functions is employed to verify the performance of HBD, and the results show that hybridizing BSA and DE improves both effectiveness and efficiency.
1. Introduction
Optimization plays an important role in many fields, for example, decision science and physical systems, and can be abstracted mathematically as the minimization or maximization of objective functions subject to constraints on their variables. Generally speaking, optimization algorithms are employed to find solutions to such problems. Stochastic relaxation optimization algorithms, such as the genetic algorithm (GA) [1], particle swarm optimization (PSO) [2, 3], the ant colony algorithm (ACO) [4], and differential evolution (DE) [5], are among the methods that solve such problems effectively, and almost all of them are nature-inspired optimization techniques. For instance, DE, one of the most powerful stochastic optimization methods, employs mutation, crossover, and selection operators at each generation to drive the population toward the global optimum. In DE, the mutation operator is one of the core components and includes many differential mutation strategies that reveal different characteristics. For example, the strategies that utilize the information of the best solution found so far converge fast and favor exploitation. These strategies are classified as exploitative strategies [6].
Inspired by the success of GA, PSO, ACO, and DE in solving optimization problems, new nature-inspired algorithms have become a hot topic in the development of stochastic relaxation optimization techniques. Examples include artificial bee colony [7], cuckoo search [8], the bat algorithm [9], the firefly algorithm [10], social emotional optimization [11–13], harmony search [14], and biogeography-based optimization [15]. A survey has pointed out that there are about 40 different nature-inspired algorithms [16].
The backtracking search optimization algorithm (BSA) [17] is a new stochastic method for solving real-valued numerical optimization problems. Similar to other evolutionary algorithms, BSA uses mutation, crossover, and selection operators to generate trial solutions. When generating trial solutions, BSA employs a memory to store experiences gained from solutions of previous generations. By taking advantage of this historical information to guide the population toward the global optimum, BSA focuses on exploration and is capable of solving multimodal optimization problems. However, relying on these experiences may make BSA converge slowly and may impair exploitation in the later iteration stages.
On the other hand, researchers have paid more and more attention to combining different search optimization algorithms or machine learning methods to improve performance on real-world optimization problems. Some good surveys of hybrid metaheuristics and machine learning methods can be found in the literature [18–20]. In this paper, we also concentrate on a hybrid metaheuristic algorithm, called HBD, which combines BSA and DE. HBD employs DE with an exploitative mutation strategy to improve convergence speed and to favor exploitation. Furthermore, in HBD, DE is invoked at each iteration to optimize only one worse individual, which is selected according to its probability. We use 28 benchmark functions to verify the performance of HBD, and the results show that hybridizing BSA and DE improves both effectiveness and efficiency. The major advantages of our approach are as follows. (i) DE with exploitative strategies helps HBD converge fast and favor exploitation. (ii) Since DE optimizes one individual, HBD expends only one more function evaluation at each iteration and does not increase the overall complexity of BSA. (iii) DE is embedded behind BSA; therefore, HBD does not destroy the structure of BSA and remains very simple.
The remainder of this paper is organized as follows. Section 2 describes BSA and DE. Section 3 presents the HBD algorithm. Section 4 reports the experimental results. Section 5 concludes this paper.
2. Preliminary
2.1. BSA
The backtracking search optimization algorithm is a stochastic search technique developed recently [17]. BSA has a single control parameter and a simple structure that is effective and capable of solving different optimization problems. Furthermore, BSA is a population-based method and possesses a memory in which it stores a population from a randomly chosen previous generation for generating the search-direction matrix. In addition, BSA is a nature-inspired method employing three basic genetic operators: mutation, crossover, and selection.
BSA employs a random mutation strategy that uses only one direction individual for each target individual, formulated as follows:
$$M = P + F \cdot (oldP - P), \tag{1}$$
where $M$ is the mutant population, $P$ is the current population, $oldP$ is the historical population, and $F$ is a coefficient that controls the amplitude of the search-direction matrix $(oldP - P)$.
BSA also uses a nonuniform and more complex crossover strategy. There are two steps in the crossover process. Firstly, a binary integer-valued matrix (map) of size $N \times D$ ($N$ and $D$ are the population size and the problem dimension) is generated to indicate which elements of the mutant individual are to be manipulated by using the relevant individual. Secondly, the indicated dimensions of the mutant individual are updated by using the relevant individual. This crossover process is summarized in Algorithm 1.
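Because the listing of Algorithm 1 is not reproduced in this text, the mutation and the two-step crossover can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `bsa_mutation`, `bsa_crossover`, and `mix_rate` are hypothetical, and the map convention (an entry of 1 means the trial takes the value of the current individual) follows the usual description of BSA in [17] rather than anything stated here.

```python
import math
import random


def bsa_mutation(pop, old_pop, scale):
    """Mutant = P + F * (oldP - P), applied element by element."""
    return [[x + scale * (h - x) for x, h in zip(ind, old)]
            for ind, old in zip(pop, old_pop)]


def bsa_crossover(pop, mutant, mix_rate=1.0, rng=random):
    """Two-step BSA-style crossover (illustrative sketch).

    Step 1 builds a binary map of size N x D. Step 2 copies the value of
    the current individual into the mutant wherever the map entry is 1.
    """
    n, d = len(pop), len(pop[0])
    cross_map = [[1] * d for _ in range(n)]
    if rng.random() < rng.random():
        # Release a random subset of dimensions in every individual.
        for i in range(n):
            dims = list(range(d))
            rng.shuffle(dims)
            k = math.ceil(mix_rate * rng.random() * d)
            for j in dims[:k]:
                cross_map[i][j] = 0
    else:
        # Release exactly one randomly chosen dimension per individual.
        for i in range(n):
            cross_map[i][rng.randrange(d)] = 0
    trial = [row[:] for row in mutant]
    for i in range(n):
        for j in range(d):
            if cross_map[i][j] == 1:
                trial[i][j] = pop[i][j]
    return trial
```

Every element of a trial individual therefore comes either from the mutant or from the current individual, which is what the map is meant to encode.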

BSA has two types of selection operators. The first type is employed to select the historical population used for calculating the search direction: the historical population is replaced with the current population when one random number is smaller than another. The second type is a greedy selection that determines which better individuals go into the next generation.
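The two selection operators can be illustrated with a short Python sketch. The function names are hypothetical, minimization is assumed, and the shuffling of the historical population reflects the permutation step that BSA applies to its memory.

```python
import random


def select_historical(pop, old_pop, rng=random):
    """First selection type: possibly refresh the memory, then permute it."""
    if rng.random() < rng.random():
        memory = [ind[:] for ind in pop]
    else:
        memory = [ind[:] for ind in old_pop]
    rng.shuffle(memory)
    return memory


def greedy_select(pop, fit, trial, trial_fit):
    """Second selection type: a trial replaces its target only if it is better."""
    new_pop, new_fit = [], []
    for x, fx, t, ft in zip(pop, fit, trial, trial_fit):
        if ft < fx:
            new_pop.append(t)
            new_fit.append(ft)
        else:
            new_pop.append(x)
            new_fit.append(fx)
    return new_pop, new_fit
```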
According to the above descriptions, the pseudocode of BSA is summarized as shown in Algorithm 2.

2.2. DE
DE is a powerful evolutionary algorithm for global optimization over continuous spaces. When used to solve optimization problems, it evolves a population of $N$ candidate solutions, each a $D$-dimensional parameter vector denoted as $X_i = (x_{i,1}, \ldots, x_{i,D})$, $i = 1, \ldots, N$. In DE, the population is initialized by uniform sampling within the prescribed minimum and maximum bounds.
After initialization, DE enters the iteration process, in which the evolutionary operators, namely, mutation, crossover, and selection, are invoked in turn.
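DE employs a mutation strategy to generate a mutant vector $V_i$. So far, several mutation strategies have been proposed, and the most well-known and widely used ones are listed as follows [21, 22]:
“DE/best/1”:
$$V_i = X_{best} + F \cdot (X_{r_1} - X_{r_2}), \tag{2}$$
“DE/current-to-best/1”:
$$V_i = X_i + F \cdot (X_{best} - X_i) + F \cdot (X_{r_1} - X_{r_2}), \tag{3}$$
“DE/best/2”:
$$V_i = X_{best} + F \cdot (X_{r_1} - X_{r_2}) + F \cdot (X_{r_3} - X_{r_4}), \tag{4}$$
“DE/rand/1”:
$$V_i = X_{r_1} + F \cdot (X_{r_2} - X_{r_3}), \tag{5}$$
“DE/current-to-rand/1”:
$$V_i = X_i + F \cdot (X_{r_1} - X_i) + F \cdot (X_{r_2} - X_{r_3}), \tag{6}$$
“DE/rand/2”:
$$V_i = X_{r_1} + F \cdot (X_{r_2} - X_{r_3}) + F \cdot (X_{r_4} - X_{r_5}), \tag{7}$$
where the indices $r_1$, $r_2$, $r_3$, $r_4$, and $r_5$ are uniformly random, mutually different integers from 1 to $N$, $F$ is the mutation factor, $X_{best}$ denotes the best individual obtained so far, and $X_i$ and $V_i$ are the $i$th vector of $X$ and $V$, respectively.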
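The crossover operator is performed to generate a trial vector $U_i$ from each pair of $X_i$ and $V_i$ after the mutant vector is generated. The most popular strategy is the binomial crossover, described as follows:
$$u_{i,j} = \begin{cases} v_{i,j}, & \text{if } rand_j(0,1) \le CR \text{ or } j = j_{rand}, \\ x_{i,j}, & \text{otherwise,} \end{cases} \tag{8}$$
where $CR$ is called the crossover rate, $rand_j(0,1)$ is a uniform random number in $[0, 1]$, $j_{rand}$ is randomly sampled from 1 to $D$, and $u_{i,j}$, $v_{i,j}$, and $x_{i,j}$ are the $j$th element of $U_i$, $V_i$, and $X_i$, respectively.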
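Finally, DE uses a greedy mechanism to select the better vector from each pair of $X_i$ and $U_i$. For a minimization problem with objective function $f$, this can be described as follows:
$$X_i = \begin{cases} U_i, & \text{if } f(U_i) \le f(X_i), \\ X_i, & \text{otherwise.} \end{cases} \tag{9}$$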
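The binomial crossover and the greedy selection can be sketched in a few lines of Python. The function names are hypothetical, and minimization is assumed.

```python
import random


def binomial_crossover(target, mutant, cr, rng=random):
    """Take each element from the mutant with probability `cr`.

    The element at the randomly sampled position `j_rand` always comes
    from the mutant, so the trial differs from the target in at least
    one position whenever the mutant does.
    """
    d = len(target)
    j_rand = rng.randrange(d)
    return [mutant[j] if (rng.random() <= cr or j == j_rand) else target[j]
            for j in range(d)]


def greedy_selection(target, trial, objective):
    """Keep the trial vector only if it is at least as good as the target."""
    return trial if objective(trial) <= objective(target) else target
```

With `cr` close to 1 the trial is almost a copy of the mutant, whereas with `cr` close to 0 it inherits little more than the single forced element.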
3. HBD
In this section, we describe the HBD algorithm in detail. First, the motivations of this paper are given. Second, the framework of HBD is shown.
3.1. Motivations
BSA uses an external archive to store experiences gained from solutions of previous generations and makes use of them to guide the population toward the global optimum. In BSA, randomly permuting the positions of the individuals in the historical population causes the direction individuals to be chosen randomly in the mutation operator; therefore, the algorithm focuses on exploration and is capable of solving multimodal optimization problems. However, precisely because of this random selection, relying on these experiences may cause BSA to converge slowly and may impair exploitation in the later iteration stages. This motivates our approach, which aims to accelerate convergence and to enhance the exploitation of the search space in order to balance the exploration and exploitation capabilities of BSA.
On the other hand, some studies have investigated the exploration and exploitation abilities of different DE mutation strategies and pointed out that the mutation operators that incorporate the best individual (e.g., (2), (3), and (4)) favor exploitation because the mutant individuals are strongly attracted toward the best individual [6, 23]. This motivates us to hybridize these exploitative mutation strategies to enhance the exploitation capability of BSA. In addition, this paper is also motivated by studies showing that combining different optimization methods is an effective way to improve performance on real-world optimization problems [24–27].
3.2. Framework of HBD
Generally speaking, there are many ways to hybridize BSA with DE. In this study, we propose another hybrid schema between BSA and DE. In this schema, HBD employs DE with an exploitative strategy behind BSA at each iteration so that BSA and DE share information. However, the more individuals DE optimizes, the more function evaluations are spent. In that case, HBD would converge prematurely, which impairs exploration. Thus, to keep the exploration capability of HBD, DE is used to optimize only one worse individual, selected according to its probability. In addition, (2) is used as the default mutation strategy in HBD because (3) and (4) have stronger exploration capabilities: (4) introduces more perturbation through additional random individuals [6], and (3) is a modification combining “DE/best/1” and “DE/rand/1” [28]. How different exploitative strategies influence the performance is discussed in Section 4.3.
In order to select one individual for DE, in this work, we assign a probability to each individual according to its fitness. This probability model is given in (10); it depends on the population size and on the ranking value of each individual when the population is sorted from the worst fitness to the best one.
Note that the probability equation is similar to the selection probability in DE with ranking-based mutation operators [29]. In general, the worse individuals are farther away from the best individual than the better ones; thus, they are assigned higher probabilities of being selected and moved toward the best one. The selection strategy is defined in (11), which yields the selected individual that is then optimized by DE.
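The exact forms of the probability model and of the selection rule are not recoverable from this text, so the following Python sketch rests on explicit assumptions: the linear model is taken to be the ranking value divided by the population size, with the worst individual receiving the largest ranking value, and selection is done by drawing random candidates until one is accepted with its probability, in the spirit of [29]. The function names are hypothetical.

```python
import random


def linear_probabilities(fitness):
    """Assumed linear model for minimization: p = R / N.

    The worst individual gets R = N (p = 1) and the best one gets
    R = 1 (p = 1 / N).
    """
    n = len(fitness)
    worst_first = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    probs = [0.0] * n
    for position, i in enumerate(worst_first):
        probs[i] = (n - position) / n
    return probs


def select_individual(probs, rng=random):
    """Assumed selection rule: accept a random candidate with its probability."""
    while True:
        i = rng.randrange(len(probs))
        if rng.random() < probs[i]:
            return i
```

Under these assumptions the worst individual is always accepted when drawn, while the best one is accepted only rarely, so DE mostly spends its single evaluation on a poor solution.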
It is worth pointing out that our previous work [30], called BSADE, splits the whole iteration process into two stages: the first two-thirds and the last one-third. BSA is used in the first stage, and DE is employed in the second stage. In this case, DE does not share the population information with BSA. Moreover, it is difficult to decide where to split the iteration process. Thus, the difference between HBD and BSADE is that HBD shares the population information between BSA and DE, while BSADE does not. The comparison can be found in Section 4.4.
According to the above descriptions, the pseudocode of HBD is described in Algorithm 3.
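Since the listing of Algorithm 3 is not reproduced in this text, one possible reading of the HBD loop is sketched below in Python. It is not the authors' implementation: the function name `hbd_minimize`, the BSA amplitude `3 * N(0, 1)`, the clipping used for boundary handling, and the assumed linear rank probability are all assumptions made for illustration.

```python
import math
import random


def hbd_minimize(objective, low, high, dim, pop_size=30, max_evals=3000,
                 f_de=0.8, cr=0.9, mix_rate=1.0, seed=0):
    """One BSA generation followed by a single DE/best/1 step per iteration."""
    rng = random.Random(seed)

    def rand_ind():
        return [rng.uniform(low, high) for _ in range(dim)]

    def clip(v):  # assumed boundary handling, not the method of [17]
        return min(max(v, low), high)

    pop = [rand_ind() for _ in range(pop_size)]
    old_pop = [rand_ind() for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    evals = pop_size
    while evals + pop_size + 1 <= max_evals:
        # BSA part: memory update, mutation, crossover, greedy selection.
        if rng.random() < rng.random():
            old_pop = [x[:] for x in pop]
        rng.shuffle(old_pop)
        scale = 3.0 * rng.gauss(0.0, 1.0)  # assumed amplitude F
        use_subset = rng.random() < rng.random()
        for i in range(pop_size):
            dims = list(range(dim))
            rng.shuffle(dims)
            k = math.ceil(mix_rate * rng.random() * dim) if use_subset else 1
            trial = pop[i][:]
            for j in dims[:k]:
                trial[j] = clip(pop[i][j] + scale * (old_pop[i][j] - pop[i][j]))
            trial_fit = objective(trial)
            if trial_fit < fit[i]:
                pop[i], fit[i] = trial, trial_fit
        evals += pop_size
        # DE part: select one (probably worse) individual by its rank.
        worst_first = sorted(range(pop_size), key=lambda i: fit[i], reverse=True)
        prob = {i: (pop_size - pos) / pop_size for pos, i in enumerate(worst_first)}
        while True:
            s = rng.randrange(pop_size)
            if rng.random() < prob[s]:
                break
        best = pop[min(range(pop_size), key=lambda i: fit[i])]
        r1, r2 = rng.sample([i for i in range(pop_size) if i != s], 2)
        j_rand = rng.randrange(dim)
        trial = [clip(best[j] + f_de * (pop[r1][j] - pop[r2][j]))
                 if (rng.random() <= cr or j == j_rand) else pop[s][j]
                 for j in range(dim)]
        trial_fit = objective(trial)
        evals += 1  # the only extra evaluation per iteration
        if trial_fit <= fit[s]:
            pop[s], fit[s] = trial, trial_fit
    b = min(range(pop_size), key=lambda i: fit[i])
    return pop[b], fit[b]
```

A call such as `hbd_minimize(lambda x: sum(v * v for v in x), -5.0, 5.0, dim=5)` runs the sketch on a sphere function. The DE step adds exactly one objective evaluation per iteration, which is the property the hybrid schema relies on.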

4. Experimental Verifications
In this section, to verify the performance of HBD, we carry out comprehensive experimental tests on a suite of 28 benchmark functions proposed for the CEC2013 competition [31]. These 28 benchmark functions include 5 unimodal functions ($f_1$–$f_5$), 15 basic multimodal functions ($f_6$–$f_{20}$), and 8 composition functions ($f_{21}$–$f_{28}$). More details about the 28 functions can be found in [31].
To make a fair comparison, we use the same parameters for BSA and HBD, unless a change is mentioned. Each algorithm is run 25 times on each function with the dimensions $D = 10$, 30, and 50, respectively. The population size of each algorithm is when and , while it is 30 in the case of . The maximum function evaluations are . The mutation factor and the crossover factor of HBD are 0.8 and 0.9, respectively. In addition, we use the boundary handling method given in [17].
To evaluate the performance of the algorithms, we first use Error as an evaluation indicator. Error, which is the function error value of the solution $x$ obtained by an algorithm, is defined as $f(x) - f(x^*)$, where $x^*$ is the global optimum of the function. In addition, the average and standard deviation of the best error values, presented as “AVG ± STD,” are used in the different tables. Second, convergence graphs are employed to show the mean error values of the best solutions during the iteration process over all runs. Third, a Wilcoxon signed-rank test at the 5% significance level ($\alpha = 0.05$) is used to show the significant differences between two algorithms. The “+” symbol shows that the null hypothesis is rejected at the 5% significance level and HBD outperforms BSA, the “−” symbol shows that the null hypothesis is rejected at the 5% significance level and BSA outperforms HBD, and the “=” symbol shows that the null hypothesis cannot be rejected at the 5% significance level and HBD ties with BSA. Additionally, we give the total number of statistically significant cases at the bottom of each table.
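The Error indicator and its summary over independent runs can be computed as in the following Python sketch. The function name is hypothetical, and the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import statistics


def summarize_errors(best_values, optimum_value):
    """Return (average, standard deviation) of the error f(x) - f(x*) over runs."""
    errors = [value - optimum_value for value in best_values]
    return statistics.mean(errors), statistics.stdev(errors)
```

For example, best objective values of 1.0, 2.0, and 3.0 over three runs on a function whose optimal value is 0.0 give an average error of 2.0 and a standard deviation of 1.0.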
4.1. The Effect of HBD
To show the effect of the proposed algorithm, Table 1 lists the average error values obtained by BSA and HBD for the 30-dimensional benchmark functions. For the unimodal functions ($f_1$–$f_5$), HBD overall obtains better average error values than BSA does. For instance, HBD reaches the global optimum on some of them and brings high-quality solutions to others in terms of average error values. HBD is slightly inferior to BSA in one case, but the difference between the two approaches is not significant. For the 15 basic multimodal functions ($f_6$–$f_{20}$), in terms of average error values, HBD brings superior solutions to 10 out of 15 functions, equal ones to 2 out of 15 functions, and inferior ones to 3 out of 15 functions. However, according to the results of the Wilcoxon test, the differences between HBD and BSA are not significant for the 3 functions on which HBD obtains lower solution quality. For the composition functions ($f_{21}$–$f_{28}$), HBD and BSA tie on two functions in terms of average error values; however, HBD significantly outperforms BSA according to the results of the Wilcoxon test. Moreover, according to average error values, HBD performs better than BSA on four functions but worse than BSA on two. Nevertheless, the differences between the two algorithms are mostly not significant for these 8 composition functions in terms of the results of the Wilcoxon test. In summary, according to “+/=/−,” HBD wins against and ties with BSA on 12 and 16 out of 28 benchmark functions, respectively.
In order to further show the convergence speed of HBD, the convergence curves of two algorithms for six selected benchmark functions are given in Figure 1.
Figure 1: Convergence curves of BSA and HBD for six selected benchmark functions (panels (a)–(f)).
It is observed that the selected functions can be divided into four groups and that, overall, the convergence performance of HBD is better than that of BSA. For the first group, represented by the functions in Figures 1(c) and 1(f), on which HBD has significantly better average error values than BSA, HBD converges faster than BSA. For the two functions of the second group, on which HBD cannot bring solutions with significantly higher quality, HBD still converges faster than BSA does. Third, for the function on which both algorithms reach the global optimum, the convergence performance of HBD is better than that of BSA. Additionally, HBD outperforms BSA according to the convergence curves in Figure 1(a), although the average error values obtained by HBD are inferior, but not significantly so, to those of BSA.
All in all, HBD overall outperforms BSA in terms of solution quality and convergence speed. This is because DE with an exploitative mutation strategy enhances the exploitation capability of HBD without expending too many function evaluations.
4.2. Scalability of HBD
In this section, to analyze how the performance of HBD is affected by the problem dimensionality, a scalability study is carried out on the 28 functions with $D = 10$ and $D = 50$, because these functions are defined for up to 50 dimensions [31]. The results are tabulated in Table 2.
In the case of $D = 10$, according to the average error values shown in Table 2, HBD is superior on the majority of functions and inferior on a handful of them. Additionally, in terms of the total of “+/=/−,” HBD wins against and ties with BSA on 9 and 19 out of 28 functions, respectively.
When $D = 50$, HBD can still bring solutions with higher quality than BSA does on most of the benchmark functions. Moreover, HBD outperforms and ties with BSA on 13 and 15 out of 28 functions, respectively.
In summary, these results suggest that the advantage of HBD over BSA is stable as the dimensionality of the problems increases.
4.3. The Effect of Mutation Strategy
In HBD, the “DE/best/1” mutation strategy is used by default to enhance the exploitation capability of HBD. To show how the performance of HBD is influenced by other exploitative mutation strategies, experiments are carried out on the benchmark functions, and the results are listed in Table 3, where cHBD and bHBD mean that HBD uses “DE/current-to-best/1” and “DE/best/2,” respectively. The results obtained by cHBD and bHBD that are more accurate than those obtained by HBD are marked in bold.
From Table 3, in terms of the average error values, bHBD shows higher accuracy than HBD for a few functions, since “DE/best/2” usually exhibits better exploration than “DE/best/1” because of one more difference of randomly selected individuals in the former [23]. cHBD also gains higher accuracy than HBD does for a handful of functions because “DE/current-to-best/1,” a modification combining “DE/best/1” and “DE/rand/1” [28], shows better exploration than “DE/best/1.” In other words, for a few functions, “DE/best/2” and “DE/current-to-best/1” can balance the exploration and exploitation capabilities of HBD better. For example, bHBD and cHBD bring solutions with higher quality to six functions; in particular, they reach the global optimum. However, for most of the functions, HBD with “DE/best/1” performs better than cHBD and bHBD.
Additionally, Table 4 reports the results of the multiple-problem Wilcoxon test, which was done similarly in [29, 32], between HBD and its variants for all functions. We can see from Table 4 that HBD is significantly better than bHBD and that, in the comparison with cHBD, HBD gets a higher $R^+$ value than $R^-$ value, although the difference is not significant. Therefore, as a tradeoff, HBD uses “DE/best/1.”
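To make the rank sums used in a multiple-problem Wilcoxon test concrete, the following Python sketch computes them from the average errors of two algorithms over a set of functions. It follows the common convention of ranking absolute differences with average ranks for ties and splitting the ranks of zero differences evenly; that convention and the function name are assumptions, since the procedure is not spelled out in the text.

```python
def signed_rank_sums(errors_a, errors_b):
    """Return (R_plus, R_minus) for algorithm A versus algorithm B.

    R_plus sums the ranks of the problems on which A has the lower error,
    R_minus the ranks of those on which B has the lower error.
    """
    diffs = [b - a for a, b in zip(errors_a, errors_b)]  # > 0 means A is better
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    zero_ranks = sum(r for d, r in zip(diffs, ranks) if d == 0)
    return r_plus + zero_ranks / 2, r_minus + zero_ranks / 2
```

The two sums always add up to n(n + 1)/2 for n functions, so a larger first value means that the first algorithm wins on more, or on more highly ranked, functions.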
4.4. The Effect of Hybrid Schema
In this section, we analyze how the performance of HBD is affected by the hybrid schema. Firstly, to show the effect of optimizing more than one individual by DE, an algorithm called aHBD, which uses DE to optimize the whole population, is compared with HBD. Secondly, we add a probability to aHBD to control the use of DE and propose paHBD. In paHBD, DE is invoked if a random number drawn from the uniform distribution between 0 and 1 is less than a given probability. This probability is defined in (12) in terms of the number of function evaluations that have been spent and the maximum number of function evaluations. Additionally, BSADE is compared with HBD to show their differences.
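The gating rule of paHBD can be sketched in Python as follows. The exact definition of the probability is not recoverable from this text; the sketch assumes it is the fraction of the evaluation budget already spent, which makes DE rare in the early stage. The function names are hypothetical.

```python
import random


def de_invocation_probability(evals_spent, max_evals):
    """Assumed form: the fraction of the evaluation budget already spent."""
    return evals_spent / max_evals


def should_invoke_de(evals_spent, max_evals, rng=random):
    """Invoke DE when a uniform random number falls below the probability."""
    return rng.random() < de_invocation_probability(evals_spent, max_evals)
```

Under this assumption DE is never invoked before any evaluation has been spent and is always invoked once the budget is exhausted.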
Table 5 lists the error values obtained by aHBD, paHBD, BSADE, and HBD for the 28 functions. It can be observed that HBD wins against, ties with, and loses to aHBD on 10, 12, and 6 out of 28 functions in terms of “+/=/−,” respectively. This indicates that optimizing more individuals with DE costs more function evaluations when DE is embedded directly behind BSA, which reduces the number of iterations and leads to poor performance on most functions. Regarding BSADE, since BSA and DE are invoked in different stages in which they cannot exchange population information, this schema cannot balance exploitation and exploration well. Thus, compared with BSADE, HBD brings solutions with higher accuracy for most functions. Moreover, HBD wins against, ties with, and loses to BSADE on 8, 17, and 3 out of 28 functions in terms of “+/=/−,” respectively. In contrast, paHBD uses a probability to control the use of DE, which decreases the cost of function evaluations at the early evolution stage. Thus, paHBD is almost equivalent to HBD according to “+/=/−.”
In addition, we also perform the multiple-problem Wilcoxon test for HBD, aHBD, paHBD, and BSADE on the 28 functions and list the results in Table 6.
It can be found from Table 6 that HBD is not significantly different from aHBD, paHBD, and BSADE. However, HBD gets higher $R^+$ values than $R^-$ values compared with aHBD and BSADE, respectively, and a slightly lower $R^+$ value than $R^-$ value in comparison with paHBD. This is because HBD brings slightly less accurate solutions on four functions, which results in a higher ranking. Nevertheless, the results indicate that the hybrid schema used in HBD is a reasonable choice.
4.5. The Effect of Probability Model
In HBD, the linear model in (10) is used to select one individual to optimize. It is worth pointing out that other models, for example, nonlinear ones, can also be adopted in our algorithm. In this section, we do not seek the optimal probability model but only analyze how different models influence the performance. Thus, two models similar to those used in [29, 33] are employed. They are the quadratic model and the sinusoidal model, formulated in (13) and (14), respectively. The average error values and the results of the multiple-problem Wilcoxon test are reported in Tables 7 and 8, respectively, where qHBD is HBD with the quadratic model and sHBD is HBD with the sinusoidal one.
From Table 7, we can find that qHBD brings higher-quality solutions to 11 out of 28 functions compared with HBD, although the differences are not significant in terms of “+/=/−.” In addition, qHBD gets a lower $R^+$ value than the $R^-$ value gained by HBD, though the difference is not significant at the 5% and 10% significance levels. This indicates that the linear model is a reasonable choice compared with the quadratic model. However, it is not the optimal one compared with the sinusoidal model. For instance, sHBD wins against, ties with, and loses to HBD on 1, 27, and 0 out of 28 functions according to “+/=/−.” Moreover, sHBD has a higher $R^+$ value than the $R^-$ value of HBD, though the difference is not significant at the 5% and 10% significance levels.
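As a hedged illustration of how such probability models differ, the sketch below implements a linear, a quadratic, and a sinusoidal mapping from ranking value to probability. The forms are assumptions modeled on the ranking-based selection probabilities discussed in [29]; they are not necessarily the exact equations used in this paper, and the function names are hypothetical.

```python
import math


def linear_model(rank, n):
    """Assumed linear model: probability grows proportionally with the rank."""
    return rank / n


def quadratic_model(rank, n):
    """Assumed quadratic model: low ranks are penalized more strongly."""
    return (rank / n) ** 2


def sinusoidal_model(rank, n):
    """Assumed sinusoidal model: a smooth S-shaped growth between 0 and 1."""
    return 0.5 * (1.0 - math.cos(math.pi * rank / n))
```

With these assumed forms all three models assign probability 1 to the largest ranking value, but the two nonlinear models give low-ranked individuals a smaller chance of being selected than the linear one.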
4.6. Compared with Other Algorithms
Firstly, HBD is compared with the 6 non-BSA approaches in [17], namely, PSO2011 [34], CMAES [35, 36], ABC [7], JDE [37], CLPSO [38], and SADE [39]. Moreover, to make the comparison fair and convenient, we use the 25 functions and the parameters that are employed and suggested in [17]. More details about these 25 functions can be found in the CEC2005 competition [40]. Table 9 lists the minimal fitness and average fitness of the 7 approaches, where the results of the 6 non-BSA algorithms are taken directly from [17]. In addition, the results of the multiple-problem Wilcoxon test and the Friedman test, done similarly to [29], for the seven algorithms are listed in Tables 10 and 11, respectively.
From Table 9, we find that each algorithm does well on some functions according to its average error value. For instance, PSO2011, CMAES, ABC, JDE, CLPSO, SADE, and HBD perform better on 8, 5, 9, 3, 3, 3, and 7 out of 25 functions, respectively. However, Table 10 shows that HBD gets higher $R^+$ values than $R^-$ values in all cases. This suggests that HBD is better than the other 6 algorithms. Moreover, according to the Wilcoxon test at the 5% and 10% significance levels, there are significant differences in three cases for the CEC2005 functions. Furthermore, with respect to the average rankings of the different algorithms by the Friedman test, it can be seen clearly from Table 11 that HBD offers the best overall performance, while SADE is the second best, followed by ABC, PSO2011, CLPSO, JDE, and CMAES.
Secondly, to appreciate the actual performance of the proposed algorithm, HBD is compared with five other algorithms, namely, NBIPOPaCMA [41], fkPSO [42], SPSO2011 [43], SPSOABC [44], and PVADE [45], which were presented during the CEC2013 Special Session & Competition on Real-Parameter Single Objective Optimization.
Table 12 lists the average error values, which are taken from [46], and the average rankings of the six algorithms by the Friedman test for the CEC2013 functions are given in Table 13. Since NBIPOPaCMA is one of the top three performing algorithms for the CEC2013 functions [47], it shows promising performance on almost all of the functions, as seen from Table 12. The other algorithms bring solutions with higher accuracy on a handful of functions. For example, fkPSO, SPSO2011, SPSOABC, PVADE, and HBD yield the better performance on 3, 2, 6, 4, and 5 out of 28 functions in terms of the average error values. However, according to the average rankings of the different algorithms by the Friedman test in Table 13, NBIPOPaCMA is the best, and HBD offers the second best overall performance, followed by SPSOABC, fkPSO, PVADE, and SPSO2011.
5. Conclusion
In this paper, we presented a hybrid BSA, called HBD, which combines BSA with DE using an exploitative mutation strategy. At each iteration, DE is embedded behind BSA to optimize one individual, selected according to its probability, in order to speed up the convergence of BSA and to bring solutions with higher quality.
Comprehensive experiments have been carried out on the 28 benchmark functions proposed for the CEC2013 competition. The experimental results reveal that the hybridization of BSA and DE provides high effectiveness and efficiency on most of the functions, contributing to solutions with higher accuracy, faster convergence, and more stable scalability. HBD was also compared with other evolutionary algorithms and showed promising performance.
There are several interesting directions for future work. Firstly, the experiments show that the linear probability model used to select one individual to optimize is reasonable but not optimal; thus, comprehensive tests will be performed on various probability models in HBD. Secondly, although the experimental results have shown that HBD scales stably, we plan to investigate HBD on large-scale optimization problems. Last but not least, we plan to apply HBD to some real-world optimization problems for further examination.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors are very grateful to the editor and the anonymous reviewers for their constructive comments and suggestions to this paper. This work was supported by the NSFC Joint Fund with Guangdong of China under Key Project U1201258, the Shandong Natural Science Funds for Distinguished Young Scholar under Grant no. JQ201316, and the Natural Science Foundation of Fujian Province of China under Grant no. 2013J01216.