Abstract

Human Learning Optimization (HLO) is a simple yet highly efficient metaheuristic developed from a simplified human learning model. To further extend the research on HLO, the social reasoning learning operator (SRLO) was introduced. However, the learning abilities of the social imitating learning operator (SILO) and the SRLO remain constant throughout the iterations, which is not true of a real human population, as humans often adopt dynamic learning strategies to solve problems. Inspired by this fact, an improved adaptive human learning optimization algorithm with reasoning learning (AHLORL) is proposed to enhance the global search ability, in which an adaptive ps strategy is carefully designed to fully exploit the roles of the SILO and SRLO and to dynamically adjust the learning efficiency of the algorithm at different stages of the iterations. Then, a comprehensive parameter study is performed to explain why the proposed adaptive strategy can exploit the optimization ability of the SILO and SRLO effectively. Finally, AHLORL is applied to solve the CEC 15 benchmark functions as well as multidimensional knapsack problems (MKPs), and its performance is compared with that of previous HLO variants and other recent metaheuristics. The experimental results show that the proposed AHLORL outperforms the other algorithms in terms of search accuracy and scalability.

1. Introduction

With the increasing complexity of industrial production, traditional optimization algorithms often cannot effectively find the optimal solutions of such problems. To break through the limitations of traditional optimization algorithms, new metaheuristics, e.g., Particle Swarm Optimization (PSO) [1], Ant Colony Optimization (ACO) [2], Artificial Bee Colony (ABC) [3], Monarch Butterfly Optimization (MBO) [4], the Moth Search Algorithm (MSA) [5], Harris Hawk Optimization (HHO) [6], and the Slime Mould Algorithm (SMA) [7], have emerged in the field of intelligent optimization to solve complex optimization problems such as medical analysis [8], fault diagnosis [9], and model design [10]. Compared with other creatures in nature, humans have a higher level of intelligence and a strong learning ability, which allow them to solve complex problems that other creatures cannot. Inspired by this fact, Wang et al. proposed the Human Learning Optimization (HLO) algorithm [11] by modeling the learning ability of human beings. HLO [11] is a highly effective metaheuristic that transforms accumulated knowledge into computational intelligence, in which three learning operators, i.e., the random learning operator (RLO), the individual learning operator (ILO), and the social learning operator (SLO), are developed to yield new candidates in the search for the optimal solution.

To further improve the performance of HLO, several enhanced variants have subsequently been developed. In 2015, an adaptive simplified human learning optimization algorithm (ASHLO) [12] was proposed to achieve a better trade-off between exploration and exploitation, in which pr and pi, the two control parameters determining the probabilities of performing the RLO, ILO, and SLO, follow linearly decreasing and linearly increasing adaptive strategies, respectively, during the whole search process to strengthen the search efficiency of the algorithm and relieve the effort of parameter setting. Later, to dynamically switch between global search and local search, a sine-cosine adaptive human learning optimization algorithm (SCHLO) [13] was developed, in which pr and pi are adjusted by sine and cosine functions to help SCHLO escape from local optima and obtain better results. Recently, an improved adaptive human learning optimization algorithm (IAHLO) [14] was presented to provide further insight into the role of the RLO, in which the control parameter pr is precisely tuned so that IAHLO can efficiently explore interesting solution areas at the early stages of iterations and perform the local search at the later stages of the search process. Inspired by the IQ scores of humans, a diverse human learning optimization algorithm (DHLO) [15] was proposed, in which the values of the control parameter pi are randomly initialized with a Gaussian distribution and dynamically updated based on the pi value of the best individual over the course of the search.

The adaptive strategies of pr and pi provide a significant improvement in HLO. However, the problem of becoming trapped in local optima remains, so the re-learning operation [16] was introduced to help HLO escape from local optima and achieve better performance when an individual's fitness has not improved over a certain number of iterations. To further extend HLO, a hybrid-coded human learning optimization algorithm (HcHLO) [17] was presented to efficiently tackle mixed-variable optimization problems, in which a continuous human learning optimization (CHLO) is proposed to handle real-coded parameters while the other variables are optimized by the standard HLO. Besides, a novel discrete human learning optimization algorithm was presented to tackle scheduling problems [18]. To date, HLO algorithms have been successfully applied to different types of problems, such as financial market forecasting [19], engineering optimization problems [14], knapsack problems [16], optimal power flow calculation [20, 21], extractive text summarization [22], furnace flame recognition [23], scheduling problems [24], intelligent control [25, 26], and image segmentation [27]. In particular, HLO achieved the best-so-far results on two well-studied sets of multidimensional knapsack problems, i.e., 5.100 and 10.100 [16], as well as on a set of mixed-variable optimization problems [17], compared with the other publicly reported metaheuristics, which demonstrates that HLO is a promising metaheuristic with important research significance.

To further improve the global search ability of HLO, a novel social reasoning learning operator (SRLO) was developed and human learning optimization with reasoning learning (HLORL) [28] was presented. In HLORL, the SILO (the original SLO) and the SRLO are inspired by the social imitating learning strategy and the social reasoning learning strategy, respectively. Social imitating learning is an efficient learning strategy [29] that accumulates knowledge by copying optimal individuals in a certain environment. For example, students generally imitate the thinking of their teachers to build a new knowledge framework efficiently [30], and preschool children usually imitate the behavior of their parents to solve problems effectively [31]. Compared with the social imitating learning strategy, social reasoning learning is a powerful learning strategy based on logical thinking [32], which can uncover deeper common characteristics from surface-level information in an uncertain environment. For instance, economists usually use reasoning ability to predict economic changes [33], and police officers often reason from the collected evidence when analyzing a criminal case [34]. Owing to the different characteristics of social imitating learning and social reasoning learning, these two strategies can effectively play different roles at different stages of human learning.

Although the learning efficiency of the SILO and SRLO changes with the environment of human learning, this is not considered in the standard HLORL, which uses two fixed learning probabilities to perform the SILO and SRLO over the course of the search process. Cultural evolution researchers [35, 36] believe that humans often reasonably choose the corresponding learning strategy and perform the optimal behaviors in different learning environments, which helps human beings further improve their learning efficiency and obtain better learning results. Therefore, an improved adaptive human learning optimization algorithm with reasoning learning (AHLORL) is proposed in this paper, in which adaptive learning probabilities of the SILO and SRLO are introduced to dynamically adjust the learning efficiency. Moreover, a thorough analysis and comparison are performed to explain why the proposed adaptive strategy can effectively exploit the optimization ability of the SILO and SRLO and further enhance the global search ability of the algorithm. This paper makes the following contributions:
(1) It proposes an improved adaptive human learning optimization algorithm with reasoning learning (AHLORL).
(2) An adaptive learning probability of the SILO and SRLO is introduced to dynamically adjust the learning efficiency of the algorithm during the iterative search.
(3) A thorough analysis and comparison are performed to explain why the proposed adaptive strategy can effectively exploit the optimization ability of the SILO and SRLO and further enhance the global search ability of the algorithm.
(4) The results demonstrate that the proposed AHLORL has significant advantages over the previous HLO variants.

The rest of this paper is organized as follows: Section 2 describes the proposed AHLORL algorithm in detail. A comprehensive parameter study is performed in Section 3 to analyze and explain the superiority of the proposed AHLORL. Section 4 discusses the inherent search mechanisms of AHLORL in comparison with Genetic Algorithms and Particle Swarm Optimization. Section 5 describes the experimental settings, the CEC 15 benchmark functions, and the MKPs and analyzes the results. Finally, Section 6 concludes the paper and outlines future work.

2. Adaptive Human Learning Optimization with Reasoning Learning

2.1. Initialization

Like HLORL, all computations in AHLORL are discrete, and a binary-coding framework is adopted to represent the population. At the initialization stage, each individual is initialized as a binary string of length M in which every bit is stochastically set to "0" or "1" as in the following equation:

$$x_{ij} = \begin{cases} 0, & 0 \le \mathrm{rand}() \le 0.5,\\ 1, & \text{otherwise}, \end{cases} \quad 1 \le i \le N,\ 1 \le j \le M,$$

where $x_i = [x_{i1}, x_{i2}, \ldots, x_{iM}]$ denotes the i-th individual, $x_{ij}$ indicates the j-th bit of the i-th individual, and N and M are the number of individuals and the dimension of solutions, respectively.
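As a rough illustration of this step, the following Java sketch (the class and method names are ours rather than from the paper, and a plain java.util.Random generator is assumed) builds a population of N binary strings of length M:

```java
import java.util.Random;

public class PopulationInit {
    // Initialize N individuals, each a binary string of length M,
    // with every bit stochastically set to 0 or 1 with equal probability.
    static int[][] initPopulation(int n, int m, Random rand) {
        int[][] population = new int[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                population[i][j] = (rand.nextDouble() <= 0.5) ? 0 : 1;
            }
        }
        return population;
    }
}
```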

2.2. Learning Operators
2.2.1. Random Learning Operator

Random learning [37] always occurs at the early stages of human learning because of the lack of prior knowledge about new problems. As the search progresses, this random learning strategy is retained, which helps AHLORL keep the exploration ability needed to obtain new strategies. Therefore, AHLORL adopts the random learning operator in equation (2) to imitate this phenomenon of human random learning:

$$x_{ij} = \mathrm{RL}(x_{ij}) = \begin{cases} 0, & 0 \le \mathrm{rand}() \le 0.5,\\ 1, & \text{otherwise}, \end{cases}$$

where $\mathrm{rand}()$ is a stochastic number between 0 and 1.
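A minimal sketch of the RLO under the same assumptions (the helper name is hypothetical) simply redraws a bit at random:

```java
import java.util.Random;

public class RandomLearning {
    // RLO: reset the bit to 0 or 1 with equal probability,
    // regardless of any accumulated knowledge.
    static int randomLearn(Random rand) {
        return (rand.nextDouble() <= 0.5) ? 0 : 1;
    }
}
```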

2.2.2. Individual Learning Operator

Individual learning [38] is an important learning ability in the process of human evolution, which builds an individual's knowledge database to help humans avoid repeating mistakes and improve learning efficiency. Inspired by this learning mechanism, the best individual solutions are saved in the individual knowledge database (IKD) as

$$\mathrm{IKD}_i = \begin{bmatrix} ikd_{i1} \\ ikd_{i2} \\ \vdots \\ ikd_{iL} \end{bmatrix} = \begin{bmatrix} ikd_{i11} & ikd_{i12} & \cdots & ikd_{i1M} \\ ikd_{i21} & ikd_{i22} & \cdots & ikd_{i2M} \\ \vdots & \vdots & \ddots & \vdots \\ ikd_{iL1} & ikd_{iL2} & \cdots & ikd_{iLM} \end{bmatrix}, \quad 1 \le i \le N,\ 1 \le p \le L,$$

where $\mathrm{IKD}_i$ stands for the IKD of individual i; N and L are the number of individuals and the size of the IKD, respectively; and $ikd_{ip}$ denotes the p-th best solution of individual i. When AHLORL performs the individual learning operator (ILO), a new candidate solution is generated from the IKD as in the following equation:

$$x_{ij} = \mathrm{IL}(x_{ij}) = ikd_{ipj}.$$
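As an illustrative sketch (assuming the IKD is stored as an array of L solutions per individual; the names and the random choice of p are our assumptions, and with the IKD size of 1 used later in this paper the choice is trivial), the ILO copies the j-th bit of a stored solution:

```java
import java.util.Random;

public class IndividualLearning {
    // ikd[i][p][j]: j-th bit of the p-th best solution stored for individual i.
    // ILO: copy the j-th bit from one of the individual's stored solutions.
    static int individualLearn(int[][][] ikd, int i, int j, Random rand) {
        int p = rand.nextInt(ikd[i].length); // IKD size L; L = 1 in this paper
        return ikd[i][p][j];
    }
}
```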

2.2.3. Social Imitating Learning Operator

Social imitating learning [39] is a potentially cheap way of acquiring valuable information and plays a fundamental role in development, communication, interaction, learning, and culture; it can greatly hasten independent learning by enabling the learner to perform the correct response sooner than others. Human beings usually use the social imitating learning strategy to learn from others' better experience and improve their learning efficiency. To simulate this advanced learning strategy, the social knowledge database (SKD) is adopted to store the best social solutions as follows:

$$\mathrm{SKD} = \begin{bmatrix} skd_1 \\ skd_2 \\ \vdots \\ skd_H \end{bmatrix} = \begin{bmatrix} skd_{11} & skd_{12} & \cdots & skd_{1M} \\ skd_{21} & skd_{22} & \cdots & skd_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ skd_{H1} & skd_{H2} & \cdots & skd_{HM} \end{bmatrix}, \quad 1 \le q \le H,$$

where $skd_q$ means the q-th solution in the SKD, H is the size of the SKD, and $skd_{qj}$ stands for the j-th dimension of the q-th solution in the SKD. The social imitating learning operator (SILO) is performed to generate a new candidate solution as follows:

$$x_{ij} = \mathrm{SIL}(x_{ij}) = skd_{qj}.$$
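A corresponding sketch of the SILO under the same assumptions (the SKD held as an array of H solutions; names are hypothetical):

```java
import java.util.Random;

public class SocialImitatingLearning {
    // skd[q][j]: j-th bit of the q-th solution stored in the social knowledge database.
    // SILO: copy the j-th bit from a solution in the SKD (H = 1 in this paper).
    static int socialImitate(int[][] skd, int j, Random rand) {
        int q = rand.nextInt(skd.length);
        return skd[q][j];
    }
}
```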

2.2.4. Social Reasoning Learning Operator

Social reasoning learning is a hallmark of human intelligence [40], which allows humans to attempt powerful generalizations from sparse data when learning about unobserved properties and causal relationships. Demetriou and Kazi [41] point out that logic in the mind is the culmination of a long developmental process extending into adolescence, and Cesana-Arlotti et al. [42] discovered that infants can also use elementary logical representations to frame and prune hypotheses. Modern psychologists broadly agree [43] that humans usually adopt predictable methods of social reasoning learning to obtain deeper characteristic information. Inspired by the social reasoning learning strategy, a social reasoning learning operator (SRLO) is designed to generate a new candidate solution, written as $x_{ij} = \mathrm{SRL}(x_{ij})$, where $\mathrm{SRL}(\cdot)$ stands for the social reasoning learning operation; $\mathrm{rand}()$ is a stochastic number between 0 and 1; $srl_j$ means the social reasoning learning probability model for the j-th dimension; and $ikd_{i_1 j}$, $ikd_{i_2 j}$, and $ikd_{i_3 j}$ are the j-th dimensions of the best knowledge saved in the IKDs of three randomly chosen individuals, i.e., individuals i1, i2, and i3, with i1 ≠ i2 ≠ i3 ≠ i.
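The exact probability model of the SRLO is defined in [28] and is not reproduced here. Purely as an illustration of one plausible reading of the description above, the sketch below pools the j-th bits of the best IKD solutions of three distinct, randomly chosen peers into a probability and samples the new bit from it; the averaging form and all names are our assumptions.

```java
import java.util.Random;

public class SocialReasoningLearning {
    // ikdBest[i][j]: j-th bit of the best solution stored in individual i's IKD.
    // Assumed probability model: the fraction of 1s among three distinct peers' best bits
    // (requires a population of at least four individuals).
    static int socialReason(int[][] ikdBest, int i, int j, Random rand) {
        int n = ikdBest.length;
        int i1, i2, i3;
        do { i1 = rand.nextInt(n); } while (i1 == i);
        do { i2 = rand.nextInt(n); } while (i2 == i || i2 == i1);
        do { i3 = rand.nextInt(n); } while (i3 == i || i3 == i1 || i3 == i2);
        double srl = (ikdBest[i1][j] + ikdBest[i2][j] + ikdBest[i3][j]) / 3.0;
        return (rand.nextDouble() <= srl) ? 1 : 0; // sample from the assumed model
    }
}
```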

In summary, AHLORL adopts the random learning operator, individual learning operator, social imitating learning operator, and social reasoning learning operator to yield new candidate solutions in the search for the optimum, as follows:

$$x_{ij} = \begin{cases} \mathrm{RL}(x_{ij}), & 0 \le \mathrm{rand}() \le pr,\\ \mathrm{IL}(x_{ij}), & pr < \mathrm{rand}() \le pi,\\ \mathrm{SIL}(x_{ij}), & pi < \mathrm{rand}() \le ps,\\ \mathrm{SRL}(x_{ij}), & \text{otherwise}, \end{cases}$$

where $\mathrm{rand}()$ is a stochastic number between 0 and 1 and pr, (pi − pr), (ps − pi), and (1 − ps) are the probabilities of performing the RLO, ILO, SILO, and SRLO, respectively.
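Putting the four operators together, a hedged sketch of the candidate-generation step (reusing the hypothetical helpers sketched above) selects an operator bit by bit according to pr, pi, and ps:

```java
import java.util.Random;

public class CandidateGeneration {
    // Generate a new candidate for individual i by choosing, for every bit,
    // among RLO, ILO, SILO, and SRLO with probabilities pr, pi-pr, ps-pi, and 1-ps.
    static int[] generate(int i, int m, double pr, double pi, double ps,
                          int[][][] ikd, int[][] skd, int[][] ikdBest, Random rand) {
        int[] candidate = new int[m];
        for (int j = 0; j < m; j++) {
            double r = rand.nextDouble();
            if (r <= pr) {
                candidate[j] = RandomLearning.randomLearn(rand);                          // RLO
            } else if (r <= pi) {
                candidate[j] = IndividualLearning.individualLearn(ikd, i, j, rand);       // ILO
            } else if (r <= ps) {
                candidate[j] = SocialImitatingLearning.socialImitate(skd, j, rand);       // SILO
            } else {
                candidate[j] = SocialReasoningLearning.socialReason(ikdBest, i, j, rand); // SRLO
            }
        }
        return candidate;
    }
}
```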

2.3. Adaptive Strategy of SILO and SRLO

In HLORL, the SILO and SRLO are adopted simultaneously to accumulate useful information from social groups more effectively. The SILO can quickly accumulate knowledge by imitating the current global optimum [44, 45]. However, copying has pitfalls: the acquired knowledge may be outdated [46], misleading, or inappropriate if the knowledge of the imitated individual is inaccurate. The choice of the imitated individual is important during the search, as it directly affects the learning result. Therefore, the SILO plays an accurate and efficient learning role in a low-uncertainty environment. On the other hand, the SRLO is also an efficient learning strategy for the accumulation of human culture, especially in an uncertain environment [47], as it can stimulate the learning ability of human beings to uncover deeper common characteristics by using surface-related knowledge [48, 49]. Reasoning learning [31] is generally used to avoid the drawbacks of imitating learning and further exploit the potential learning ability of human beings. Owing to the different learning mechanisms of the SILO and the SRLO, they can play different learning roles at different stages of the iterations.

Based on this insight into the roles of the SILO and SRLO, we argue that an adaptive strategy for the SILO and SRLO that enhances the optimization ability of AHLORL needs to meet the following requirements:
(1) As the initial population is randomly generated, the global optimal solution has the largest uncertainty at the beginning. Therefore, the value of ps should be small so that the individuals can efficiently use the SRLO to find optimal solutions.
(2) With the progress of the search, more and more optimal solutions are found by the SRLO and the uncertainty of the population gradually decreases. Therefore, the value of ps should be increased so that the SILO can accumulate the found optimal solutions effectively and further guide the whole population to learn the global optimal information better.
(3) As the learning efficiency of the SILO and SRLO is closely related to the reliability of the population, the adaptive strategy of the control parameter ps should be nonlinear so that the SILO quickly accumulates the found optimal solutions, which further helps AHLORL maximize the learning abilities of the SILO and SRLO more effectively.
(4) At the later stage of the iterations, the risk of becoming trapped in a local optimum remains because the greedy strategy is adopted in the SILO. Therefore, the value of ps should remain constant so that AHLORL can keep the exploration ability to explore interesting solution areas more efficiently.

As analyzed above, a novel adaptive strategy is proposed to dynamically adjust the control parameter ps between the SILO and SRLO as follows:

$$ps(t) = \begin{cases} ps_{\min} + (ps_{\max} - ps_{\min}) \times \left(\dfrac{t}{T_p \times iter_{\max}}\right)^{k}, & t \le T_p \times iter_{\max},\\[2mm] ps_{\max}, & t > T_p \times iter_{\max}, \end{cases}$$

where t and itermax denote the current iteration and the maximum number of iterations, respectively; Tp is a predefined turning point; k controls the shape of the curve; and psmin and psmax are the minimum and maximum values of ps over the whole process, respectively.

For the proposed adaptive strategy, psmin should be small to meet the requirement in Point (1). Tp should be set greater than 0.5 so that AHLORL can effectively maximize the learning abilities of the SILO and SRLO, which satisfies the demands in Points (2), (3), and (4). Finally, k should be less than 1 to meet the requirement in Point (3). With the introduction of the adaptive strategy, AHLORL can achieve a better trade-off between exploration and exploitation. Figure 1 illustrates an example of the proposed adaptive strategy.
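A minimal sketch of the adaptive ps schedule, assuming the piecewise form reconstructed above (a power curve with exponent k rising from psmin to psmax until the turning point, then constant); the default values follow the parameter study in Section 3:

```java
public class AdaptivePs {
    // Adaptive ps: rise nonlinearly (k < 1) from psMin to psMax until the turning
    // point Tp * iterMax, then remain constant at psMax.
    static double ps(int t, int iterMax, double psMin, double psMax, double tp, double k) {
        double turn = tp * iterMax;
        if (t >= turn) {
            return psMax;
        }
        return psMin + (psMax - psMin) * Math.pow(t / turn, k);
    }

    public static void main(String[] args) {
        // Default values from the parameter study: psMin = 0.85, psMax = 0.98, Tp = 0.7, k = 2/3.
        for (int t = 0; t <= 3000; t += 500) {
            System.out.printf("t=%d  ps=%.4f%n", t, ps(t, 3000, 0.85, 0.98, 0.7, 2.0 / 3.0));
        }
    }
}
```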

2.4. Updating the IKD and the SKD

For AHLORL, the IKD and the SKD are updated in the same way as in HLORL. After the new candidate solutions of all individuals are generated, the fitness of all individuals is calculated by the predefined fitness function. If the new fitness value is superior to the old one, the new candidate solution is adopted to replace the old one in the IKD; otherwise, the individual's solution in the IKD is not updated. The SKD is updated in the same way. Since AHLORL is not a Pareto algorithm, the sizes of the IKD and the SKD are both set to 1.
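A hedged sketch of this greedy update, assuming IKD and SKD sizes of 1 (as stated above), a maximization objective, and hypothetical array layouts:

```java
public class KnowledgeUpdate {
    // Greedy update with IKD/SKD sizes of 1: a new candidate replaces a stored
    // solution only if its fitness is better (maximization assumed).
    static void update(int[] candidate, double fitness, int i,
                       int[][] ikdBest, double[] ikdFitness,
                       int[] skdBest, double[] skdFitness) {
        if (fitness > ikdFitness[i]) {                                       // individual knowledge
            System.arraycopy(candidate, 0, ikdBest[i], 0, candidate.length);
            ikdFitness[i] = fitness;
        }
        if (fitness > skdFitness[0]) {                                       // social knowledge
            System.arraycopy(candidate, 0, skdBest, 0, candidate.length);
            skdFitness[0] = fitness;
        }
    }
}
```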

2.5. Algorithm Complexity

Like the standard HLO, AHLORL has two phases, i.e., population initialization and the iterative search. The running times of generating the initial population X, the individual knowledge databases (IKDs), and the social knowledge database (SKD) are O(MN), O(MNL), and O(MH), respectively, where M and N are the dimension of solutions and the number of individuals, respectively. So, the total running time of the population initialization is O(MN + MNL + MH). During the iterative search of AHLORL, generating new individuals costs O(MN) time, and updating the IKDs and the SKD costs O(MNL) and O(MNH) time, respectively, where L is the size of the IKD and H denotes the size of the SKD. Therefore, the running time of each iteration is O(MN + MNL + MNH). Assuming that the maximum generation of AHLORL is G, the iterative search phase takes O(GMN(L + H)) time. In general, the maximum generation G is much greater than N, L, and H, so the initialization cost is negligible; with the IKD and SKD sizes both set to 1, the time complexity of AHLORL is O(GMN).

A flowchart illustrating the implementation of AHLORL is presented in Figure 2.

3. Parameter Study of AHLORL

3.1. Analysis of the Control Parameters

To evaluate the proposed adaptive strategy, a parameter study was performed in this section, which mainly considers psmin, psmax, Tp, and k. For simplicity, the values of pr and pi remain unchanged in AHLORL, i.e., pr = 5/M and pi = 0.82, because the roles of the RLO and the ILO in AHLORL are the same as those in HLORL. The orthogonal experimental design method was used, and all the combinations of control parameters are given in Table 1. It should be noted that k is recommended to be less than 1 in the last section; however, for a thorough parameter study, the cases in which k is equal to or greater than 1 are also considered. Two functions, i.e., F1 and F5 chosen from the CEC 15 benchmark functions [50], were adopted to test the influence of these four parameters on the performance of AHLORL. The characteristics of the selected functions, as well as of the other 13 functions used as benchmarks to verify the superiority of AHLORL in the next section, are listed in Table 2. The population size and the maximum number of iterations for the 10-dimensional/30-dimensional functions were set to 50/100 and 3000/5000, respectively. Each decision variable was coded with 30 bits, and each test was run 100 times independently. The mean value (Mean) was used to evaluate the optimization ability, as given in Table 3, where the best values are highlighted in bold.

As mentioned above, psmin, psmax, Tp, and k jointly decide the probabilities of the SILO and SRLO over the course of the search; these probabilities depend on the problem and interact with each other, so AHLORL needs a set of suitable values to obtain better search ability. Table 3 shows that AHLORL obtains the best comprehensive results on F1 and F5 of the CEC 15 benchmark functions when psmin, psmax, Tp, and k are set to 0.85, 0.98, 0.7, and 2/3, respectively, which are chosen as the default values in this work. The influences of the four control parameters can be summarized as follows:
(1) The value of psmin is of great importance for AHLORL, and it should be small so that AHLORL can efficiently utilize the reasoning ability of the SRLO to find optimal solutions at the beginning of the search, which boosts the effectiveness and confidence of the following learning operations and consequently enhances the exploitation ability of AHLORL. However, a psmin that is too small lowers the convergence speed and spoils the performance of the algorithm. According to Table 3, the value of psmin should be no less than 0.84.
(2) The larger psmax is, the more accurate the search that AHLORL performs at the later stage. However, the results show that psmax should not exceed level 7, i.e., 0.98, which indicates that a psmax that is too big would greatly reduce the efficiency of the search and consequently worsen the performance of the algorithm.
(3) It is suggested that Tp should be sufficiently large but not exceed 0.8 so that enough generations can be guaranteed for AHLORL to switch between the SILO and SRLO search abilities efficiently.
(4) The value of k is also important for AHLORL, and it should be smaller than 1 so that AHLORL can efficiently utilize the imitating ability of the SILO to quickly accumulate the found optimal solutions, which boosts the effectiveness and confidence of the following learning operations and consequently enhances the exploitation ability of AHLORL. However, a k that is too small weakens the reasoning ability of the SRLO and spoils the performance of the algorithm. According to Table 3, the value of k should be no less than 1/3.

3.2. Influences of the Adaptive Strategies on SILO and SRLO

Building on the analysis of the control parameters of AHLORL, the influences of the adaptive strategies on the SILO and SRLO are further investigated. AHLORL with the default values and four other versions with modified parameters, i.e., AHLORL2, AHLORL3, AHLORL4, and AHLORL5, are compared with each other to examine the characteristics of the different adaptive strategies. The parameter settings of AHLORL, AHLORL2, AHLORL3, AHLORL4, and AHLORL5 are given in Table 4, and the corresponding ps curves are drawn in Figure 3. All the algorithms were used to solve the 10-dimensional and 30-dimensional CEC 15 functions, and the numerical results, including the mean, the best value (Best), and the standard deviation (Std), are listed in Table 5, where the best results are marked in bold. The Student's t-test (t-test) and the Wilcoxon signed-rank test (W-test) [51] were performed, and the corresponding results are shown in Table 6, in which "1/0/−1" indicates that the numerical result of AHLORL is significantly better than, similar to, or worse than that of the compared algorithm at the 95% confidence level, respectively. The t-test is a parametric test that requires normality and homogeneity of variance, while the W-test is a nonparametric test that does not need to satisfy these assumptions [12]. For convenience, the results of the t-test and W-test are summarized in Tables 7 and 8. Besides, to inspect the influence of the adaptive strategies on the SILO and SRLO more deeply, two evaluating indicators were used as follows:
Definition 1: Index 1 is the percentage of identical bit values obtained by the SILO and SRLO in a generation.
Definition 2: Average distance (AD) is the average Hamming distance between the global optimal solution and the other individual optimal solutions in a generation.
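For reference, the two indicators can be computed roughly as in the sketch below. Index 1 is read here as the fraction of bit positions where the values produced by the SILO and SRLO coincide, and AD as the mean Hamming distance from the global best to each individual best; both readings are our interpretation of the definitions above.

```java
public class Indicators {
    // Index 1: fraction of positions where the SILO-produced bits equal the SRLO-produced bits.
    static double index1(int[] siloBits, int[] srloBits) {
        int same = 0;
        for (int j = 0; j < siloBits.length; j++) {
            if (siloBits[j] == srloBits[j]) same++;
        }
        return (double) same / siloBits.length;
    }

    // AD: average Hamming distance between the global best and each individual best.
    static double averageDistance(int[] globalBest, int[][] individualBests) {
        long totalDistance = 0;
        for (int[] best : individualBests) {
            for (int j = 0; j < globalBest.length; j++) {
                if (best[j] != globalBest[j]) totalDistance++;
            }
        }
        return (double) totalDistance / individualBests.length;
    }
}
```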

Tables 5–8 clearly show that AHLORL outperforms AHLORL2, AHLORL3, AHLORL4, and AHLORL5, especially on the 30-dimensional functions, where it achieves better optimization performance. Specifically, the proposed AHLORL obtains the best numerical results on 12 out of the fifteen 30-dimensional functions. Besides, the summary of the t-test results on the 30-dimensional CEC 15 functions in Table 8 indicates that the proposed AHLORL surpasses AHLORL2, AHLORL3, AHLORL4, and AHLORL5 on 15, 5, 8, and 7 out of 15 functions, respectively, and the W-test results show that AHLORL outperforms these variants on 15, 6, 7, and 8 out of 15 functions, respectively. By systematically analyzing the differences between AHLORL and the other four versions, it becomes easier to understand the influences of the various adaptive strategies on performance and to learn how to achieve the ideal balance between exploration and exploitation.

To analyze the superiority of the proposed AHLORL more clearly, the Index 1 curves of AHLORL and AHLORL2 on F1 and F5 over 100 independent runs are drawn in Figure 4, and the AD curves and corresponding fitness values of AHLORL, AHLORL2, AHLORL3, AHLORL4, and AHLORL5 are drawn in Figures 5 and 6, respectively. Figures 3–6 clearly show that the differences among AHLORL, AHLORL2, AHLORL3, AHLORL4, and AHLORL5 in the evaluating indicators were caused by the change of ps.

Based on the above experiments, the characteristics of AHLORL and the influences of the adaptive strategies can be summarized as follows:
(1) The Index 1 curves of AHLORL in Figure 4 increase gradually. This behavior indicates that AHLORL needs to use a large probability of the SRLO to find the optimal bit values at the beginning of the search, which can effectively reduce the uncertainty caused by the randomly initialized population. As the search progresses, the learning probability of the SILO is increased to accumulate the found optimal bit values, which further boosts the effectiveness and confidence of the following learning operations. Besides, the Index 1 curves of AHLORL2 drop significantly at the later stage of the iterations, which does not match the way the knowledge of the population changes. Therefore, the increasing strategy of the control parameter ps can utilize the learning abilities of the SILO and SRLO more effectively and efficiently.
(2) Figure 5 shows that the AD curves of AHLORL2 converge fast and that an accurate search is performed from the beginning. Although AHLORL2 converges fastest, Figure 6 shows that AHLORL2 is likely to become stuck in local optima because it cannot widely explore the interesting solution areas and sufficiently perform the accurate search at the later stages of the iterations; consequently, its results are the worst among all the algorithms.
(3) Figure 3 shows that the ps curve of AHLORL3 maintains diversity over a long period to sufficiently explore the interesting solution areas and then quickly enhances the local search ability in the middle of the search process. Figures 5 and 6 reveal that the values of AD are almost unchanged in the first Gmax × Tp1 generations because the found useful knowledge cannot be effectively accumulated by the SILO, and the function fitness is not improved. Besides, the ps value of AHLORL3 rises quickly in the middle of the search process, and AHLORL3 then promptly performs an efficient accurate search. Correspondingly, the fitness of AHLORL3 is greatly improved, and meanwhile the AD curves change noticeably. However, the final results of AHLORL3 are not good enough because of the limited remaining resources.
(4) AHLORL4 has a problem similar to that of AHLORL3, as it also cannot effectively accumulate the found useful knowledge. Compared with AHLORL3, AHLORL4 obtains a better result before the Gmax × Tp1 generations because it can perform the SILO with a certain probability. However, the final results of AHLORL4 are worse than those of AHLORL3 because AHLORL4 is relatively slow to enhance the learning ability of the SILO.
(5) Figures 5 and 6 clearly show that the AD curves of AHLORL5 drop quickly, and it obtains a better result before the Gmax × Tp generations. Although the AD curve of AHLORL5 continues to drop after the Gmax × Tp generations, the fitness values are almost unchanged, which indicates that AHLORL5 is likely stuck in local optima. As the greedy strategy is adopted for updating the IKDs and the SKD, the ILO and SILO copy the same bit value from the IKDs and the SKD. Therefore, if the corresponding bit value in the IKDs and the SKD is the same, for example, "1" while the optimal value is "0," AHLORL5 cannot efficiently regain "0" through the ILO and SILO, and the only chance for the algorithm to obtain "0" depends on the RLO. However, the rate of performing the RLO is 5/M; that is, the probability of generating "0" for a certain bit on the 30-dimensional functions is 0.0028, which is quite inefficient.
(6) Compared with AHLORL2, AHLORL3, AHLORL4, and AHLORL5, it is fair to state that AHLORL can effectively utilize the learning abilities of the SILO and SRLO and significantly improve the search results because the proposed adaptive strategy is carefully designed according to the different requirements at different search stages. Figures 5 and 6 indicate that the proposed adaptive strategy brings about noticeable improvements in search performance. The reason is that AHLORL achieves a practically ideal trade-off between exploration and exploitation through the proposed adaptive strategy. Specifically, at the beginning of the search, the efficiency and reliability of the SILO are low because of the random initialization of the population; at this time, the SRLO can efficiently find the optimal bit values by reasoning. With the progress of the search, the learning probability of the SILO is quickly increased to efficiently accumulate the found optimal bit values, which further enhances the effectiveness of the following learning operations and consequently boosts the exploitation ability of AHLORL. At the later stage of the iterations, the risk of becoming trapped in a local optimum remains because the greedy strategy is adopted in the SILO and ILO, and the SRLO efficiently retrieves the optimal bit values lost by the SILO and ILO. Therefore, the global search ability is significantly enhanced.

4. Inherent Search Mechanisms of AHLORL

To further understand the inherent search mechanisms of AHLORL, the structural similarities and differences between AHLORL and two mainstream metaheuristic algorithms, i.e., Genetic Algorithms (GAs) and Particle Swarm Optimization (PSO), are compared and discussed in this section.

In Genetic Algorithms (GAs), the mutation operator has a similar effect to the RLO of AHLORL. However, there are differences in how the operators are executed. In AHLORL, each bit of an individual has an independent mutation probability of pr/2, whereas in binary GAs the simple mutation operator and the boundary mutation operator do not act on every bit of an individual. Besides, the combination of the ILO, SILO, and SRLO of AHLORL can be seen as a complicated, variable multipoint crossover operator of GAs. However, there are still significant differences between this combination and the crossover operator. In AHLORL, the ILO and SILO yield a new candidate by copying the values of the solutions in the IKD and SKD, and the SRLO yields a new candidate based on the best knowledge saved in the IKDs of three randomly chosen individuals. In binary GAs, the crossover operator chooses two solutions from the current population to generate a new offspring. Note that the best solution of an individual, i.e., the knowledge stored in the IKD, may not survive the selection of GAs, since there is no dedicated mechanism to preserve it for the next generation. Therefore, the inherent search mechanisms of GAs and AHLORL are different.

Compared with binary GAs, Particle Swarm Optimization may be more similar to AHLORL in algorithm structure, as the information of the individual best solutions and the global best solution is also used in PSO. However, the underlying search mechanisms of PSO and AHLORL are also different. The standard PSO is a real-coded algorithm inspired by the foraging of birds, whereas the proposed AHLORL is a binary-coded algorithm that mimics the learning mechanisms of humans. In updating the population, PSO and its binary variants generate solutions based on the "velocity," which is updated from its inertia information and the individual/global best information; there is no corresponding notion of "velocity" or inertia in AHLORL. Besides, PSO performs the new search based on the particle's current position, while the next search in AHLORL does not depend on the current solution. Therefore, the forms of the operators of AHLORL and binary PSO are different, and the candidates generated by AHLORL and binary PSO from the same population are also distinct.

5. Experimental Results and Discussion

In this section, the proposed AHLORL as well as seven recent algorithms, i.e., HLORL [28], IAHLO [14], SCHLO [13], Scale-Free Binary Particle Swarm Optimization (SFPSO) [52], the Binary Grey Wolf Optimizer (BGWO) [53], the Binary Artificial Algae Algorithm (BAAA) [54], and Improved Binary Differential Evolution (IBDE) [55], were applied to solve the 10/30-dimensional CEC 15 benchmark functions [50] and multidimensional knapsack problems (MKPs) [56]. For a fair comparison, the parameter settings for all the algorithms adopted the recommended values, which are listed in Table 9. The simulations were implemented in Java on the Eclipse platform under a 64-bit Windows 7 operating system, and the computer was configured with an Intel Xeon E3-1230 v3 @ 3.30 GHz and 16 GB of RAM.

5.1. Results of the CEC 15 Benchmark Functions
5.1.1. Low-Dimensional Benchmark Functions

The optimization results of all algorithms on the 10-dimensional benchmark functions are presented in Table 10, where the best numerical results are highlighted in bold. Besides, the Student's t-test (t-test) and Wilcoxon signed-rank test (W-test) results are summarized in Table 11. As can be seen from Tables 10 and 11, AHLORL is significantly superior to the compared algorithms, obtaining the best numerical results on 13 out of 15 functions. Besides, the t-test results clearly indicate that the proposed AHLORL is substantially better than HLORL, IAHLO, SCHLO, SFPSO, BGWO, BAAA, and IBDE on 9, 12, 13, 15, 15, 15, and 15 out of 15 functions, respectively. The W-test results also show that the proposed AHLORL is obviously superior to HLORL, IAHLO, SCHLO, SFPSO, BGWO, BAAA, and IBDE on 10, 13, 14, 14, 14, 15, and 14 out of 15 functions, respectively.

5.1.2. High-Dimensional Benchmark Functions

The numerical results of all algorithms on the 30-dimensional CEC 15 benchmark functions are listed in Table 12, where the best optimization results are also marked in bold. For convenience, the summary results of the t-test and W-test are given in Table 13. From Tables 12 and 13, AHLORL obtains the best numerical results on all the functions. Besides, the t-test results explicitly show that the proposed AHLORL significantly surpasses HLORL, IAHLO, SCHLO, SFPSO, BGWO, BAAA, and IBDE on 12, 15, 15, 15, 15, 15, and 15 out of 15 functions, respectively. The W-test results also show that the proposed AHLORL is significantly better than HLORL, IAHLO, SCHLO, SFPSO, BGWO, BAAA, and IBDE on 13, 15, 15, 15, 15, 15, and 15 out of 15 functions, respectively.

5.2. Results of the Multidimensional Knapsack Problems (MKPs)

To further verify the optimization ability of AHLORL, a total of 30 multidimensional knapsack problems (MKPs) [56], i.e., the instances 10.500.00-29, were adopted as test problems to evaluate the performance of AHLORL. Each problem was run 100 times independently, and the population size and the maximal generation number were set to 100 and 5000, respectively. The MKP is a multiconstrained problem whose objective is to find an optimal subset of items that maximizes the total profit while satisfying multiple constraints, which can be formulated as follows:

$$\max \ f(x) = \sum_{j=1}^{n} p_j x_j,$$

$$\text{subject to } \sum_{j=1}^{n} r_{ij} x_j \le b_i, \quad i = 1, 2, \ldots, m, \qquad x_j \in \{0, 1\}, \quad j = 1, 2, \ldots, n,$$

where n and m are the number of items and constraints, respectively; $p_j$ stands for the profit of the j-th item; $b_i$ means the capacity of the i-th knapsack; and $r_{ij}$ represents the weight of the j-th item in the i-th knapsack with capacity constraint $b_i$.

At the same time, previous work [47] demonstrates that the penalty function method called pCOR gives the best results when solving MKPs, and therefore pCOR is adopted in this paper. In pCOR, poc is the penalty coefficient used in the penalty function for infeasible solutions, pmax is the maximum profit coefficient, rmin is the minimum resource consumption, and $CV_i$ is the amount of constraint violation for constraint i.
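A hedged sketch of a pCOR-style penalized MKP fitness, built only from the quantities named above: total profit minus a penalty coefficient times the total constraint violation, with the coefficient taken as pmax/rmin. This particular combination is our assumption for illustration, not the exact formulation of [47].

```java
public class MkpFitness {
    // Penalty-based MKP fitness: total profit minus poc times the total constraint violation.
    // profit[j]: profit of item j; weight[i][j]: weight of item j in knapsack i; capacity[i]: b_i.
    static double fitness(int[] x, double[] profit, double[][] weight, double[] capacity,
                          double pMax, double rMin) {
        double poc = pMax / rMin;                            // assumed penalty coefficient
        double totalProfit = 0.0;
        for (int j = 0; j < x.length; j++) {
            totalProfit += profit[j] * x[j];
        }
        double totalViolation = 0.0;
        for (int i = 0; i < capacity.length; i++) {
            double load = 0.0;
            for (int j = 0; j < x.length; j++) {
                load += weight[i][j] * x[j];
            }
            totalViolation += Math.max(0.0, load - capacity[i]);  // CV_i for constraint i
        }
        return totalProfit - poc * totalViolation;
    }
}
```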

The results of all algorithms on the multidimensional knapsack problems (MKPs) are given in Table 14, where the best solutions are highlighted in bold. To analyze the superiority of AHLORL, the summary results of the t-test and W-test are given in Table 15. From Table 14, the proposed AHLORL has the best performance on the MKPs. Specifically, AHLORL obtains the best numerical results on all the problems, and its superiority is also reflected in Table 15, where no algorithm is competitive with it. Therefore, it can be concluded that AHLORL is a promising binary metaheuristic algorithm.

Based on the numerical simulation results on the multidimensional knapsack problems (MKPs) and the CEC 15 benchmark functions, as well as the results of the parameter study, it is fair to claim that AHLORL has significant advantages over the previous HLO variants as well as SFPSO, BGWO, BAAA, and IBDE, because the proposed adaptive strategy can exploit the optimization ability of the SILO and SRLO more effectively. Therefore, the optimization search ability of AHLORL is significantly enhanced.

6. Conclusions and Future Work

The SRLO and SILO are both important learning operators of HLORL, and they play different roles and functions at different stages of the search process. Reasonable execution probabilities for the SRLO and SILO can effectively enhance the learning efficiency of the algorithm, and therefore the learning performance is significantly improved. Inspired by this, an improved adaptive human learning optimization algorithm with reasoning learning is proposed, and a new adaptive strategy is presented based on the search requirements to utilize the optimization abilities of the SILO and SRLO more efficiently and effectively.

A comprehensive parameter study was performed to evaluate the influence of the proposed adaptive strategy. On that basis, an analysis of each parameter is given and deep insights into the roles and functions of the SRLO and SILO are obtained. Then, the necessity of the adaptive ps strategy is established. The comparison results of different adaptive strategies demonstrate the efficiency and superiority of the proposed AHLORL and reveal why the proposed adaptive strategy can achieve a practically ideal trade-off between exploration and exploitation at different search stages of the algorithm. Finally, the experimental results show that the proposed AHLORL outperforms the other algorithms in terms of search accuracy and scalability.

It is well known that humans can adaptively choose and adjust their strategies to solve problems more efficiently and effectively, and the performance of AHLORL is also influenced by the parameter pi, which determines the learning probabilities of operating ILO and SILO. Therefore, our future work will focus on the relationship between pi and ps and try to develop a cooperatively adaptive strategy for both pi and ps to further balance the exploration-exploitation ability and enhance the performance of the algorithm.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by Key Project of Science and Technology Commission of Shanghai Municipality (Nos. 16010500300, 19510750300, 19500712300, and 21190780300), Natural Science Research Programme of Colleges and Universities of Anhui Province (Nos. KJ2020ZD39 and KJ2021A1025), Open Research Fund of AnHui Key Laboratory of Detection Technology and Energy Saving Devices, AnHui Polytechnic University (Nos. DTESD2020A02 and JCKJ2021A05), School-level Scientific Research Project of Chaohu University (No. XLY-202101), 2021 Discipline Construction Quality Improvement Project of Chaohu University (no. kj21gczx02), and 111 Project under Grant no. D18003.