BP Network Model Based on SCLBOA for House Price Forecasting
The butterfly optimization algorithm (BOA) is a recent swarm intelligence algorithm that mimics the behavior of butterflies. However, it still leaves much room for improvement. To enhance the convergence speed and accuracy of BOA, we present an improved algorithm, SCLBOA, which builds on SIBOA and incorporates a logistic chaotic map and a Lévy flight mechanism. The logistic chaotic map is used for population initialization, and the Lévy flight mechanism is then integrated into the position update. To evaluate the performance of SCLBOA, we conducted experiments on standard test functions. The simulation results suggest that SCLBOA achieves high-precision optimization, fast convergence, and effective global optimization, and that it outperforms the compared methods on these mathematical optimization problems. Finally, a BP network is optimized with SCLBOA (SCLBOA-BP) to further verify the effectiveness of the algorithm. Simulation experiments on a Boston housing price prediction model demonstrate the practicability of this method.
Inspired by human intelligence, the social behaviors of biological groups, or the laws of nature, many intelligent optimization algorithms have been developed to solve complex optimization problems, which is one application of artificial intelligence. As a subfield of artificial intelligence, swarm intelligence requires effective computational methods, because the algorithms themselves must adapt to complex and constantly changing situations to find optimal solutions. Metaheuristic methods have proved to be a good solution for such optimization problems.
General-purpose metaheuristic methods are evaluated in eight groups: biology based, swarm based, math based, sport based, chemistry based, social based, music based, and physics based. There are also hybrid methods that combine these. Biology based methods include the genetic algorithm (GA), which solves both constrained and unconstrained optimization problems based on natural selection [2, 3]; differential evolution (DE), which optimizes a problem by iteratively improving a candidate solution through an evolutionary process; and the slime mould algorithm (SMA), an effective optimizer motivated by slime mould behavior. Swarm based methods include particle swarm optimization (PSO), which optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality; cat swarm optimization (CSO), which is inspired by the resting and tracing behaviors of cats [7, 8]; grey wolf optimization (GWO), which simulates the leadership hierarchy and hunting mechanism of grey wolves in nature; and Harris hawks optimization (HHO), a gradient-free optimization algorithm with several active and time-varying phases of exploration and exploitation. The sine cosine algorithm (SCA), which generates various initial random solutions and shifts them towards the best solution using a mathematical model, is math based. War strategy optimization (WSO), which is based on the strategic movement of army troops during war, is sport based. The artificial chemical reaction optimization algorithm (ACROA), which mimics the chemical reaction process, is chemistry based. Teaching-learning-based optimization (TLBO), which is based on the influence of a teacher on the output of learners in a class, is social based. The harmony search algorithm (HS), which is inspired by music to solve computationally involved optimization problems, is music based.
The chaos optimization algorithm (COA), which usually uses a chaotic map such as the logistic map to generate pseudo-random numbers that are mapped to the design variables for global optimization, can be classified as both math based and physics based [16, 17].
These basic intelligent algorithms are simple and easy to implement, with few parameters and short running times. They therefore offer excellent operability and optimization ability on many nonlinear and multimodal real-world optimization problems. BOA was proposed by Arora and Singh in 2018. The method and concept of this algorithm were first presented at the 2015 International Conference on Signal Processing, Computing and Control (2015 ISPCC). After the algorithm was proposed, the authors conducted further research on BOA [20, 21]. In BOA, each butterfly has its own unique sense and individual perception ability. This is also a major feature that differentiates it from other metaheuristics.
As a newly proposed nature-inspired heuristic algorithm, BOA, like other intelligent algorithms, has defects such as slow convergence and a tendency to fall into local optima. In response, many scholars have proposed improvement strategies. Wang and Zhang introduced adaptive inertia weights and multisegment perturbation strategies into the update of the optimal nectar position, added a crazy factor to the position formula to increase population diversity, and proposed a crazy butterfly algorithm based on adaptive perturbation (CIBOA). Gao and Liu first introduced a limit threshold to restrict the number of times BOA falls into a local optimal solution and then combined it with the simplex strategy to optimize poorly positioned butterflies in the later stage of the iteration, which improved the performance of the algorithm. BOA is a bionic swarm intelligence algorithm derived from simulating the foraging and courtship behavior of butterflies in nature. It has been successfully applied to engineering design problems, image segmentation, and data mining. However, these improvements still struggle to balance the exploitation and exploration abilities of the algorithm.
The SCLBOA algorithm differs from the traditional BOA algorithm in that it is developed from the SIBOA algorithm by integrating the logistic chaotic map and the Lévy flight mechanism. This paper aims to balance the exploratory and exploitative capabilities of SCLBOA. The major contributions of this study are summarized as follows:
(i) The individuals of the population are initialized with a chaotic sequence generated by logistic mapping, which exhibits higher uniformity and diversity and helps to improve the efficiency and quality of the solution.
(ii) The Lévy flight mechanism is introduced in the solution search to generate random steps. Frequent small steps are interspersed with occasional large jumps, which prevents the solution search from stagnating.
(iii) To balance the local search ability and global search ability of the algorithm, the Lévy flight state of individuals is adjusted periodically, and a limit range is set, both of which are realized by changing the mathematical function.
The rest of the paper is organized as follows. Section 2 introduces related work and the existing BOA and SIBOA algorithms. Section 3 proposes the new SCLBOA algorithm and describes the improvements in detail. Section 4 examines solution quality and convergence performance on benchmark functions. Section 5 uses SCLBOA as the training algorithm for the BP network; that is, the SCLBOA-BP network predicts Boston housing prices. Section 6 provides an in-depth analysis of the method proposed in this paper. Section 7 summarizes the main work of this study and gives some suggestions for future research.
2. Related Research Works
BOA [18, 19] is a metaheuristic algorithm inspired by natural organisms, which simulates the foraging and mating behavior of butterflies. Butterflies judge the potential direction of mates and food sources by the concentration of aroma in the air. Each butterfly emits a scent whose intensity is tied to its fitness, so the fitness of a butterfly varies as its position changes. In the global search of BOA, the butterflies in the population move towards the target individual that emits fragrance; in the local search, butterflies move randomly when they do not perceive the fragrance of other individuals. The concentration of a butterfly's fragrance depends on three factors: sensory factor, stimulus intensity, and power index. The equation is as follows:

$$f = cI^{a} \quad (1)$$

where $f$ is the scent concentration, $c$ is the sensory factor, usually taking the value of 0.01, $a$ is a power exponent that depends on the scent size, usually taking the value of 0.1, and $I$ is the stimulus intensity, which is related to the optimal fitness.
Before BOA begins a local search and global search, the algorithm will randomly generate the positions of the population individuals and generate their respective fragrances accordingly.
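In the common global search stage, when a butterfly senses the smell of other butterflies, it moves towards the current global optimal position $g^{*}$:

$$x_i^{t+1} = x_i^{t} + \left(r^{2} \times g^{*} - x_i^{t}\right) \times f_i \quad (2)$$

where $x_i^{t}$ is the position vector of the $i$-th butterfly in the $t$-th iteration, which is the part of its own cognitive flight; $f_i$ is the scent size of the $i$-th butterfly; and $r$ is a random number in [0, 1].

In the other case, when a butterfly cannot sense the smell of its nearby environment, it moves randomly. The intensive local search updates the position as in equation (3):

$$x_i^{t+1} = x_i^{t} + \left(r^{2} \times x_j^{t} - x_k^{t}\right) \times f_i \quad (3)$$

where $x_j^{t}$ and $x_k^{t}$ are the $j$-th and $k$-th butterflies randomly selected from the solution space, and $r$ is a random number in [0, 1].

In the butterfly foraging process, the switching probability $p$ controls whether the algorithm is in the intensive local search or the ordinary global search stage. Each iteration compares the switching probability with a random number $rand$ in [0, 1] to decide whether to perform a global search or a local search. The final update formula of the butterfly algorithm is as follows:

$$x_i^{t+1} = \begin{cases} x_i^{t} + \left(r^{2} \times g^{*} - x_i^{t}\right) \times f_i, & rand < p \\ x_i^{t} + \left(r^{2} \times x_j^{t} - x_k^{t}\right) \times f_i, & rand \ge p \end{cases} \quad (4)$$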
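The update rules of basic BOA can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `fragrance` and `boa_step`, the default switching probability `p=0.8`, and the choice of the absolute fitness value as the stimulus intensity are all assumptions made for the example. Bound handling and greedy acceptance of better positions are omitted.

```python
import random


def fragrance(intensity, c=0.01, a=0.1):
    """Fragrance f = c * I**a for a non-negative stimulus intensity I."""
    return c * intensity ** a


def boa_step(positions, fitness, p=0.8, c=0.01, a=0.1, rng=random):
    """Return the new positions after one basic BOA iteration (minimization).

    positions: list of equal-length lists of floats.
    fitness:   callable mapping a position to a float.
    p:         switching probability (0.8 is an assumed default).
    """
    n = len(positions)
    scores = [fitness(x) for x in positions]
    # Current global best position g* (smallest fitness value).
    best = positions[min(range(n), key=scores.__getitem__)]
    new_positions = []
    for i, x in enumerate(positions):
        # Assumption: stimulus intensity I is the absolute fitness value.
        f_i = fragrance(abs(scores[i]), c, a)
        r = rng.random()
        if rng.random() < p:
            # Global search: move relative to the best butterfly.
            new_x = [xd + (r * r * gd - xd) * f_i for xd, gd in zip(x, best)]
        else:
            # Local search: move relative to two random butterflies j and k.
            xj = positions[rng.randrange(n)]
            xk = positions[rng.randrange(n)]
            new_x = [xd + (r * r * jd - kd) * f_i
                     for xd, jd, kd in zip(x, xj, xk)]
        new_positions.append(new_x)
    return new_positions
```

Calling `boa_step` repeatedly on a population, and keeping the best position seen so far, gives a bare-bones BOA loop on which the later improvements can be layered.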
In the butterfly optimization algorithm, the position of the nectar source of the population plays an important role because it guides each butterfly towards the optimal solution. However, if the population falls into a local optimal position, the search of the whole group easily stalls, and a better global optimum cannot be obtained. Therefore, to balance the local search and global search abilities of BOA, SIBOA introduces the sine cosine operator into the cognitive flight part of the position update. In that update, t represents the current number of iterations and Max_iter represents the maximum number of iterations. Several coefficients are random numbers within [0, 1]. The switching probability determines which update formula is chosen. The control parameter determines the region in which the i-th butterfly will lie in the next iteration. A random angle in [0, 2π] defines whether the current solution moves towards the target or away from it. The control parameter and the random angle jointly control the local search and global search of the algorithm. When the resulting sine or cosine term is greater than 1 or less than −1, the algorithm explores different areas (global search). When it is between −1 and 1, the algorithm exploits the desired search space (local search).
In SIBOA, the scent size changes with the degree to which the fragrance is absorbed, which is realized through the power index parameter a. This parameter controls the behavior of individual butterflies and thus affects the convergence speed and optimization accuracy of the entire algorithm. When a = 1, no fragrance is absorbed; that is, the fragrance spreads in an ideal environment, the fragrance emitted by a butterfly can be sensed from anywhere in the search domain, and the global optimal value can be easily obtained. When a = 0, the fragrance of any butterfly cannot be detected by other butterflies. In the original BOA, the power exponent a is fixed at 0.1, which results in poor optimization performance. Therefore, a in SIBOA adopts the following calculation formula:

$$a = a_{\max} - \left(a_{\max} - a_{\min}\right) \times \frac{t}{Max\_iter}$$

where $a_{\max}$ and $a_{\min}$ represent the maximum and minimum power exponent coefficients, respectively. The value of a decreases linearly from 0.3 to 0.01, which effectively adjusts the butterfly's absorption of fragrance, helps individual butterflies perform better local and global searches, and improves the accuracy of convergence.
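As a small illustration of this schedule, the following Python sketch computes the power exponent for a given iteration. It assumes plain linear interpolation between the two stated endpoints (0.3 at the first iteration, 0.01 at the last); the function name `power_exponent` is hypothetical.

```python
def power_exponent(t, max_iter, a_max=0.3, a_min=0.01):
    """Linearly decrease the power exponent from a_max (t = 0) to a_min (t = max_iter)."""
    return a_max - (a_max - a_min) * t / max_iter
```

Early iterations thus use a larger exponent and later iterations a smaller one, for example `power_exponent(0, 100)` gives 0.3 and `power_exponent(100, 100)` gives approximately 0.01.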
3. Fusion of Logistic Chaotic Mapping and Lévy Flight Sine-Cosine Butterfly Optimization Algorithm (SCLBOA)
3.1. Logistic Chaotic Map Initialization
Studies have shown that the initialization of the population affects the performance of the algorithm. Because chaotic sequences are easy to generate, they have attracted significant attention from researchers, and chaotic ensemble optimization has recently been adopted in many studies [28, 29]. Richards et al. were the first to propose such insights. A slight difference in the initial state of a nonlinear system leads to very different state development. The development of the population positions is chaotic, and different initial positions produce very different solutions. A better initial population improves both the efficiency and the quality of the solution, so the individuals should be initialized as uniformly as possible in the solution space.
The initialization process of the basic BOA is random. SCLBOA builds on the sine cosine operator introduced in SIBOA and uses a chaotic sequence generated by logistic mapping to initialize the position and velocity of the butterflies. Because chaotic sequences are ergodic, nonrepeating, and pseudo-random, they help increase the diversity of the population.
Logistic mapping is currently the most widely used discrete chaotic mapping system in nonlinear dynamics. Its mathematical model is defined as follows:

$$x_{n+1} = \alpha x_n \left(1 - x_n\right)$$

where $x_n$ is the input and $x_{n+1}$ is the output, both of which are state values of the logistic mapping; α represents the system control parameter; and the right-hand side is the logistic mapping of the variable $x_n$ under the control parameter α. Logistic mapping is used to generate chaotic variables that describe position and velocity. The chaotic position variables generated in this way have better ergodicity.
Figure 1 shows the bifurcation diagram of logistic mapping.
Figure 1 plots the bifurcation of the logistic map for x in [0, 1] and α in [2, 4]. When α exceeds approximately 3.569945672, the map enters the chaotic state.
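A minimal Python sketch of chaotic initialization is shown below. It is an illustration under stated assumptions rather than the authors' code: the control parameter `alpha=4.0` (fully chaotic regime), the seed `x0=0.7`, and the function names `logistic_sequence` and `chaotic_init` are choices made for the example, and only positions (not velocities) are initialized.

```python
def logistic_sequence(x0, length, alpha=4.0):
    """Iterate x_{n+1} = alpha * x_n * (1 - x_n) and return `length` values."""
    values = []
    x = x0
    for _ in range(length):
        x = alpha * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_init(pop_size, dim, lower, upper, x0=0.7, alpha=4.0):
    """Map one logistic sequence in [0, 1] onto positions in [lower, upper].

    x0 should avoid fixed or eventually fixed points of the map for
    alpha = 4 (such as 0, 0.25, 0.5, 0.75, 1), which yield a degenerate sequence.
    """
    seq = logistic_sequence(x0, pop_size * dim, alpha)
    return [
        [lower + (upper - lower) * seq[i * dim + d] for d in range(dim)]
        for i in range(pop_size)
    ]
```

For example, `chaotic_init(50, 30, -100.0, 100.0)` returns 50 positions of dimension 30 inside the given bounds; the values are deterministic for a fixed `x0`, so varying `x0` between runs is one way to obtain different initial populations.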
3.2. Lévy Flight Mechanism
The SCLBOA proposed in this paper not only uses the logistic chaotic map but also introduces the Lévy flight mechanism [31, 32]. The Lévy flight search strategy is mainly used to generate random steps, similar to a random walk, in which the solution search moves in a completely random direction at every step.
The behavior of the Lévy flight mechanism resembles the local search and global search features of intelligent optimization algorithms. Most of the time, the individual walks randomly with small steps, so that it carefully searches the small area around itself. Occasionally, it makes a sudden jump in a new direction with a large step, so that it can enter another area and search a wider range. Its step size follows a power-law distribution, which can be expressed as follows: where r is a random number in the range of [0, 1] and the value of ξ is set to 1.5.
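The step-size formula of the paper is not reproduced here. As a stand-in, the sketch below draws Lévy-distributed steps with Mantegna's algorithm, a commonly used technique that is swapped in for illustration and is not necessarily the authors' formula. The stability index `beta=1.5` is assumed to play the role of ξ, and the names `mantegna_sigma` and `levy_step` are hypothetical.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator variable in Mantegna's algorithm."""
    numerator = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    denominator = (math.gamma((1.0 + beta) / 2.0) * beta
                   * 2.0 ** ((beta - 1.0) / 2.0))
    return (numerator / denominator) ** (1.0 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step: u / |v|**(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)
```

Most draws are small in magnitude, with occasional very large values, which is the mix of local moves and long jumps described above. In practice the step is usually multiplied by a small scale factor before it is added to a position.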
In this paper, the experiment sets the limit range. When it is less than 1/5 of the population, the Lévy flight strategy is introduced to improve the global search ability; when it is greater than 1/5 of the population, the butterfly updates its position compared with the worst value. The core idea is to change the individual state of the butterfly population through the change of mathematical functions, increase the diversity of the population, and improve the global search ability. The updated position formula after introducing Lévy flight is as follows:
3.3. Algorithm Steps
The specific steps of SCLBOA are as follows:
Step 1. Initialize parameters. Use logistic mapping to chaotically initialize the population positions within the boundary range so that the initial positions are spread throughout the search space.
Step 2. Calculate the initial fitness value. Calculate the fitness value of each butterfly according to the test function.
Step 3. Select the source of nectar. The butterfly position with the best fitness value is selected as the nectar source position, and the fragrance size is calculated according to formula (1).
Step 4. Update the butterfly position. In order to reduce the influence of external environmental factors, formulas (2)–(4) and the random number P are used to judge whether the current iteration performs a global search or a local search. The position update formulas (5) and (8)–(10) use the sine cosine operator and the Lévy flight mechanism.
Step 5. Calculate the fitness value and update the optimal position. Compare the target value of the current butterfly with that of the previous individual and replace it if the new one is superior.
Step 6. Repeat Step 4 and Step 5. If the required accuracy or the specified maximum number of iterations is reached, the algorithm terminates and outputs the global optimal solution.
3.4. SCLBOA Flowchart
When the SCLBOA algorithm runs, initialization is performed first, followed by the iterative search, and the algorithm terminates once the stopping condition is met. In the initialization stage, the algorithm defines the target function and its solution space, initializes the positions of the population through logistic chaotic mapping, calculates the individual fitness values, and records the best individual position. In the iteration stage, individual butterflies in the solution space move and update their positions following the Lévy flight mechanism, and their fitness values are evaluated again. When the maximum number of iterations is completed, the iteration ends, and the algorithm outputs the solution with the best fitness value. The above steps constitute the whole procedure of the SCLBOA algorithm.
4. Experimental and Result Analysis
In this section, the exploratory and exploitative capacity of the proposed method is examined on eight typical standard test functions in different dimensions. Six algorithms (PSO, DE, SCA, BOA, SIBOA, and HHO) are used for comparison with the proposed algorithm. All algorithms start from the same search points, and the simulations are performed under the same conditions.
4.1. Benchmark Functions
The benchmark functions F1∼F8 include functions with different characteristics, both unimodal and multimodal. A unimodal function has a single strict extremum (maximum or minimum) within the defined upper and lower limits and is usually used to test the convergence speed of an algorithm. A multimodal function contains multiple local optima in addition to the global optimum and is often used to test the exploration and exploitation capabilities of an algorithm. The expression, dimension, value range, function type, and theoretical optimal value of each function are shown in Table 1.
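To make the distinction concrete, the sketch below implements one widely used unimodal function (Sphere) and one widely used multimodal function (Rastrigin). These are illustrative examples only and are not claimed to be among F1∼F8 in Table 1.

```python
import math


def sphere(x):
    """Unimodal: a single minimum of 0 at the origin."""
    return sum(v * v for v in x)


def rastrigin(x):
    """Multimodal: global minimum of 0 at the origin plus many local minima."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v)
                               for v in x)
```

On Rastrigin, points near integer coordinates such as (1, 1) are local minima with a value above the global minimum, which is what makes such functions useful for testing whether an optimizer escapes local optima.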
4.2. Algorithm Parameters
The parameter settings of each algorithm are shown in Table 2.
4.3. Development Environment
The numerical simulations were run in MATLAB R2018b on Windows 10, on a machine with a 2.00 GHz processor and 8 GB of memory.
To reduce the effect of chance, each algorithm is run independently 50 times in MATLAB R2018b. The highest, lowest, average, and standard deviation of the results for each function are calculated. The maximum number of iterations is set to 5000. The results are shown in Tables 3 and 4.
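The per-function statistics can be computed as in the following Python sketch. The function name `summarize_runs` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which variant is reported.

```python
import statistics


def summarize_runs(results):
    """Summarize the final objective values of independent runs (minimization)."""
    return {
        "lowest": min(results),
        "highest": max(results),
        "mean": statistics.mean(results),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(results),
    }
```

Here `results` would hold the 50 final objective values of one algorithm on one benchmark function.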
4.4. Experiments to Analyze the Optimality
The bold entries in Tables 3 and 4 mark the best solution obtained by the algorithms under the same experimental conditions. On F2, the optimal solution found by HHO is smaller than that found by SCLBOA. SIBOA shows premature stagnation on F5. Overall, SCLBOA outperforms the other algorithms on the benchmark functions. This experiment uses only unimodal and multimodal benchmark functions for model evaluation; future experiments should also incorporate composite functions.
4.5. Experimental Convergence Analysis
The eight standard test functions in Table 1 are solved by the seven algorithms, and their fitness value curves are shown in Figures 3 and 4. The horizontal axis represents the number of iterations, and the vertical axis represents the corresponding fitness value. The figures show that the SCLBOA algorithm proposed in this paper converges markedly better than the other algorithms, which indicates that its solution accuracy has been significantly improved.
5. SCLBOA-BP Network Predicts Boston Housing Prices
5.1. BP Network
The BP network [33, 34] is a multilayer feedforward neural network whose training involves two processes: forward propagation of the input signal and backpropagation of the error. Its structure generally consists of an input layer, a hidden layer, and an output layer [35, 36]. The neuron state of each layer affects only the neuron state of the next layer. The BP network is widely used in various prediction models. The dimensions of the input and output vectors of the training samples determine the number of nodes in the input and output layers of the network, respectively. A typical BP neural network structure with a single hidden layer and a single output is shown in Figure 5.
Figure 5 labels the set of input vectors of the BP network, the target output value y of the network, the connection weights between the input layer and the hidden layer, the connection weights between the hidden layer and the output layer, and the node thresholds of the hidden layer and the output layer (the latter denoted b). If the number of hidden layer nodes is m, then the hidden node index is j = 1, 2, 3, …, m. In the forward pass, the input signal is transmitted layer by layer from the input layer to the hidden layer and finally to the output layer. Each layer forms the weighted sum of its inputs, adds the threshold, and applies its activation function, which yields the predicted output value Y of the output layer. If there is an error between the predicted value Y and the target value y, the error is propagated backward layer by layer, and the weights and thresholds of the network layers are adjusted in the direction that reduces the error.
5.2. SCLBOA-BP Network
This section explains how we use SCLBOA to train the BP network. In the SCLBOA algorithm, the position of each butterfly represents a set of weights of the BP network in the current iteration, and the dimension of each butterfly equals the number of weights in the network. The output error of the neural network on a given training sample set serves as the fitness function of the training problem, so the fitness value represents the error of the neural network. The smaller the error, the better the search performance. Butterflies move and search in the weight space, which reduces the MSE of the output layer of the network. In this way, SCLBOA optimizes the weights of the neural network to obtain a smaller output error: in each iteration, each butterfly moves to a new position, which is a new set of weights; a new MSE is computed from that set of weights; and the individual with the smallest MSE becomes the current global optimal solution. This process is repeated so that the predicted value of the BP neural network approaches the actual output value.
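The mapping from a butterfly position to a network and its MSE can be sketched as follows. This is an illustration under several assumptions that the text does not fix: the position vector holds the thresholds as well as the weights, the hidden layer uses a tanh activation, the output node is linear, and the parameters are laid out in the order shown. The names `n_params`, `decode`, `predict`, and `mse_fitness` are hypothetical.

```python
import math


def n_params(n_in, n_hidden):
    """Number of weights and thresholds for one hidden layer and one output."""
    return n_in * n_hidden + n_hidden + n_hidden + 1


def decode(position, n_in, n_hidden):
    """Split a flat position vector into (w1, b1, w2, b2)."""
    if len(position) != n_params(n_in, n_hidden):
        raise ValueError("position length does not match network size")
    idx = n_in * n_hidden
    w1 = [position[j * n_in:(j + 1) * n_in] for j in range(n_hidden)]
    b1 = position[idx:idx + n_hidden]
    w2 = position[idx + n_hidden:idx + 2 * n_hidden]
    b2 = position[idx + 2 * n_hidden]
    return w1, b1, w2, b2


def predict(position, x, n_hidden):
    """Forward pass: tanh hidden layer, linear output node."""
    w1, b1, w2, b2 = decode(position, len(x), n_hidden)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def mse_fitness(position, inputs, targets, n_hidden):
    """Fitness of a butterfly: mean squared error on the training samples."""
    errors = [(predict(position, x, n_hidden) - y) ** 2
              for x, y in zip(inputs, targets)]
    return sum(errors) / len(errors)
```

With this encoding, a network with 13 inputs and 13 hidden nodes corresponds to a 196-dimensional search space (182 weights plus 14 thresholds), and an optimizer only needs to minimize `mse_fitness` over that space.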
5.3. Boston Housing Price
The SCLBOA-BP network is trained and tested on the Boston house price dataset, and the model's performance and predictive ability are evaluated. The model is intended for house price estimation, to improve the efficiency of real estate agents. The initial parameters of the SCLBOA-BP network are as follows. The population size is 50; the maximum number of iterations Max_itera is 30; the number of input nodes is 13; the number of output nodes is 1; the number of hidden layer nodes is 13; the number of weights is 13×13+13×1 = 182; and the number of thresholds is 13+1 = 14. The first 50 samples are used as the test set, and the remaining 456 samples are used as the training set for the Boston housing price prediction model. The entire input is therefore a 13 × 506 matrix. The BP, BOA-BP, and SCLBOA-BP prediction models are all trained and used to predict the test set, and their prediction results and performance are compared.
Figure 6 compares the predicted values of the BP network with the true values before and after BOA optimization. Figure 7 compares the predicted values of the BP network with the true values before and after SCLBOA optimization and shows the errors between them. Table 5 presents the numerical comparison of the prediction evaluation indicators, where MAEG, MSEG, RMSEG, and MAPEG represent the mean absolute error, mean square error, root mean square error, and mean absolute percentage error on the test set, respectively, and MSET represents the mean square error on the training set.
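The four test-set indicators can be computed as in the Python sketch below. The function name `regression_metrics` is hypothetical, and reporting MAPE as a percentage (rather than a fraction) is an assumption; the true values must be nonzero for MAPE to be defined.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and MAPE (in percent) for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the true value, so y_true must not contain zeros.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}
```

The same function applied to the training-set predictions gives the training MSE.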
The simulation results show that training the BP network with the SCLBOA algorithm proposed in this paper avoids two problems: the failure to find the optimal solution during BP training caused by the defects of BOA itself, and premature convergence. As a result, a smaller prediction error is obtained.
A major issue in metaheuristic algorithms is becoming stuck in local optima. The solution search performance of the SCLBOA algorithm is tested on different functions, and the results are compared with those of six other well-established metaheuristic algorithms (PSO, DE, SCA, BOA, SIBOA, and HHO). The results confirm the superiority of the proposed algorithm over these algorithms. A real-life optimization problem, house price forecasting, is also solved with the help of the newly proposed algorithm.
The convergence analysis indicates that SCLBOA converges on the tested functions, but the rate of convergence is still influenced by several factors. Because the experiments mainly aim to verify the effectiveness of the algorithm, the computation time of the algorithm has not been tested, and the algorithm can be further optimized and tested in the future. Another direction is to compare the individual strategies of the proposed method and their combinations to see whether there is synergy between them.
The butterfly optimization algorithm (BOA) has gained wide popularity in the research community and is being used to solve optimization problems in various disciplines. The SCLBOA algorithm proposed in this paper combines logistic chaotic mapping and the Lévy flight mechanism on the basis of SIBOA. The simulation results show that the convergence speed of the algorithm is greatly improved and that its tendency to fall into local optima is substantially reduced. With SCLBOA as the training algorithm of the BP network, the SCLBOA-BP network is used to train the Boston housing price prediction model, which verifies the practicability of the SCLBOA-BP network. It is noteworthy that the SCLBOA algorithm works well on the Boston house price dataset. Future research should evaluate the performance of this algorithm on other real-life optimization problems.
The Boston housing dataset can be found at the following website: https://www.kaggle.com/schirmerchad/bostonhoustingmlnd.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This study was supported by the National Key R&D Program (grant no. 2019YFB1312202).
J. Kennedy and R. Eberhart, “Particle swarm optimization,” Proceedings of ICNN’95-international conference on neural networks, vol. 4, pp. 1942–1948, 1995.
P.-W. Tsai, J.-S. Pan, S.-M. Chen, B.-Y. Liao, and S.-P. Hao, “Parallel cat swarm optimization,” International Conference on Machine Learning and Cybernetics, vol. 6, pp. 3328–3333, 2008.
T. Van Tran and Y. N. Wang, “Artificial chemical reaction optimization algorithm and neural network based adaptive control for robot manipulator,” Journal of Control Engineering and Applied Informatics, vol. 19, no. 2, pp. 61–70, 2017.
S. Arora and S. Singh, “Butterfly algorithm with levy flights for global optimization,” in Proceedings of the 2015 International Conference on Signal Processing, Computing and Control (ISPCC), pp. 220–224, Solan, India, September, 2015.
Y. R. Wang and D. M. Zhang, “Crazy butterfly algorithm based on adaptive perturbation,” Application Research of Computers, vol. 37, no. 11, pp. 3276–3280, 2020.
W. X. Gao and S. Liu, “Butterfly optimization algorithm for global optimization,” Application Research of Computers, vol. 37, no. 10, pp. 2966–2970, 2020.
S. M. J. Jalali, S. Ahmadian, P. M. Kebria, A. Khosravi, C. P. Lim, and S. Nahavandi, “Evolving artificial neural networks using butterfly optimization algorithm for data classification,” in Proceedings of the International Conference on Neural Information Processing, pp. 596–607, Bangkok, Thailand, November, 2019.
Y. Wang and D. Zhang, “Butterfly optimization algorithm combining sine cosine and iterative chaotic map with infinite collapses,” Pattern Recognition and Artificial Intelligence, vol. 33, no. 7, pp. 660–669, 2020.
H. Bingol and B. Alatas, “Chaos based optics inspired optimization algorithms as global solution search approach,” Chaos, Solitons & Fractals, vol. 141, Article ID 110434, 2020.
M. Richards and D. Ventura, “Choosing a starting configuration for particle swarm optimization,” Neural Networks, pp. 2309–2312, 2004.
M. Yu and X. Ye, “Prediction for tourist traffic based on improved particle swarm optimization bp neural network,” Microcomputer and Its Applications, vol. 34, no. 21, pp. 51–54, 2015.
T. Jiang, Y. Zhang, and Y. Wang, “A study of application of an improved pso algorithm in bp network,” Computer Science, vol. 33, no. 9, pp. 164-165, 2006.
H. Zhang, “Bp network model based on pso for house price forecasting,” Value Engineering, vol. 14, 2012.
X. Chen, “Application of an improved bp neural network algorithm in intrusion detection,” Journal of Physics: Conf. Ser, vol. 1684, 2020.