Recent Advances on Swarm Intelligence for Solving Complex Engineering Problems
View this Special IssueResearch Article  Open Access
Adaptive Black Hole Algorithm for Solving the Set Covering Problem
Abstract
Evolutionary algorithms have been used to solve several optimization problems, showing an efficient performance. Nevertheless, when these algorithms are applied they present the difficulty to decide on the appropriate values of their parameters. Typically, parameters are specified before the algorithm is run and include population size, selection rate, and operator probabilities. This process is known as offline control and is even considered as an optimization problem in itself. On the other hand, parameter settings or control online is a variation of the algorithm original version. The main idea is to vary the parameters so that the algorithm of interest can provide the best convergence rate and thus may achieve the best performance. In this paper, we propose an adaptive black hole algorithm able to dynamically adapt its population according to solving performance. For that, we use autonomous search which appeared as a new technique that enables the problem solver to control and adapt its own parameters and heuristics during solving in order to be more efficient without the knowledge of an expert user. In order to test this approach, we resolve the set covering problem which is a classical optimization benchmark with many industrial applications such as line balancing production, crew scheduling, service installation, and databases, among several others. We illustrate encouraging experimental results, where the proposed approach is able to reach various global optimums for a wellknown instance set from Beasley’s ORLibrary, while improving various modern metaheuristics.
1. Introduction
Nature inspired algorithms are inspired by natural phenomena and behaviours that are met in nature. These algorithms are a subcategory of evolutionary algorithms, which are algorithmic schemes that evolve the solution in each iteration of the algorithm [1]. Evolutionary algorithms use the intelligence of the previous solution for the new one to be generated [2]. In innumerable works, the efficiency of these techniques has been tested. However, it is well known that the performance of metaheuristics depends largely on their correct parameter settings. In fact, finding the appropriate values for the parameters of an algorithm is considered to be a nontrivial task. Previous research has addressed this task that can be classified into two major forms of parameter setting (see Figure 1): parameter tuning and parameter control. Parameter tuning is known as the process of finding good parameter values before the run of the solver—by using, for instance, training instances—, and then launching the algorithm using these values, which remain static during solving. On the contrary, parameter control launches the algorithm with initial parameter values which are updated during solving time. Now, parameter control is divided into three branches according to the degree of autonomy of the strategies. The control is deterministic when the parameter value is modified by some fixed, predetermined rule using no information from the search. The control is called adaptive, when feedback from the search is employed to determine the variation of a given parameter value, for instance, the population size. Finally, selfadaptive is known as when the parameters are encoded into individuals in order to evolve conjointly with the other variables of the problem. This applies to some metaheuristics; for instance, in evolutionary algorithms the better values of these encoded parameters may lead to better individuals, which in turn have more chances to propagate these good configurations to next generations.
This paper proposes an adaptive approach for parameter control by using autonomous search which is a particular case of adaptive systems that improve their solving performance by modifying and adjusting themselves to the problem at hand, either by adaptation or supervised adaptation [3]. Autonomous Search is applied on the black hole algorithm that is a populationbased metaheuristic inspired by the black hole phenomenon. The behaviour of this algorithm is principally controlled by two elements: number of solutions and event horizon. The main idea is to automatically adapt the population according to performance exhibited by the algorithm; this population variation indirectly influences on the event horizon.
To evaluate the adaptive approach on the black hole algorithm, we test the set covering problem which is one of the wellknown Karp's 21 NPcomplete problems. Interesting experimental results are obtained, where the proposed approach is able to obtain various global optimums for a set of 65 wellknown set covering problem instances, outperforming also several recently reported techniques.
The remainder of this paper is organized as follows: The related work is introduced in the next section. A brief description of the set covering problem is presented in Section 3. Section 4 details the metaheuristic employed in this work, including their adaptive variation, the binary representation, and the repair of unfeasible solutions. Section 5 illustrate the experimental results, and finally, we conclude and suggest some lines of future research in Section 6.
2. Related Work
In the evolutionary computing field, many optimization problems have widely been treated by using evolutionary algorithms, such as [4–6] and also [7–9]. In this research line, the parameter setting approach [10] is one of the most studied challenges [11–17]. However, a more reduced work can be observed in the literature when other metaheuristics are involved. For instance, a parameter adaptation study on ant colony optimization is reported in [18], a firefly algorithm for solving the optimal capacitor placement problem is described [19], in [20] a selfadaptive artificial bee colony for constrained numerical optimization is presented, and a modified cuckoo search algorithm for solving engineering problems can be encountered in [21, 22]. Now, considering more similar works, we find a variation of genetic algorithm that adjusts its population size in [23]. The basic idea is to adapt the actual population size depending on the difficulty of the algorithm in its ultimate goal to generate new child chromosomes that outperform their parents. Similarly, [24] presents a version of cuckoo search algorithm that adapts their parameter reflecting the probability whether the nest will be abandoned or updated. In both cases, the adaptive approach depends of quality solution. Finally, in [25] we can observe a variation of the artificial bee colony that consists in controlling the perturbation frequency.
On the other hand, the set covering problem has widely been explored in the optimization and mathematical programming sphere [26]. First works have used exact techniques, for instance branchandbound and branchandcut algorithms [27–30]. Greedy algorithms have been proposed as a real alternative for solving the set covering problem; nevertheless their deterministic nature hinders the generation of high quality solutions [31]. Under this line, it is well known that exact methods are in general unable to tackle large instances of NPcomplete problems, consequently much research work has been devoted to the study of efficient metaheuristics to solve hard instances of the set covering problem in an amount of bounded time. For instance, ant colony optimization [32–35], simulated annealing [36], tabu search [37], and genetic algorithms [38] have extensively been proposed to tackle the classic set covering problem. In recent years, research has been driven towards solving this problem by using recent bioinspired algorithms, such as teachinglearning based optimization [39] firefly optimization [40], cat swarm optimization [41], shuffled frog leaping [42], artificial bee colony [43], cuckoo search [44], and black hole algorithm [44], among others.
Finally, we can say that an adaptive approach for bioinspired algorithms to solve the set covering problem—or other optimization problem—is an interesting research line that has not yet been highly exploited.
3. Set Covering Problem
In this section we present the set covering problem. The set covering problem consists in finding a set of solutions at the lowest possible cost to cover a set of needs. Line balancing production [45], crew scheduling [46–48], service installation [49, 50], databases [51], production flowlines optimization [52], and Boolean expressions [53] are some practical examples of set covering problem. Formally, we define the problem as follows: Let be a binary matrix with rows × columns, and let be a vector representing the cost of each column , assuming that . Then, we can observe that column cover a row that exists in , if . The corresponding mathematical is stated in the following:subject toThe aim is to minimize the summation of column costs, where if component is part of the solution and otherwise. The constraints of the set covering problem guarantee that every row is covered by at least one column .
4. Black Hole Algorithm
The black hole algorithm is a populationbased algorithm inspired on the black hole phenomenon [54, 55]. A black hole is a zone of space that has so much mass concentrated in it that there is no way for a nearby object to escape its gravitational force. Anything falling into a black hole, including light, cannot escape.
Similar to other populationbased algorithms, the black hole algorithm begins with an initial population of potential solutions, called “stars". At each iteration of the black hole algorithm, the best solution is chosen to be the black hole, which then starts atractting other candidates around it. If the fitness of a star crosses the event horizon of the black hole (Eq. (5)), it will be swallowed and it is gone forever. In such case, a new star (potential solution) is randomly generated and placed in the search space and starts a new search.
The black hole has the ability to absorb the stars that surround it. After initializing the black hole and stars, the black hole begins by absorbing the stars around it and all the stars start moving towards the black hole. The absorption of stars by the black hole is formulated as follows:where and are the components of a solution at iterations and , respectively. is the component of best solution (black hole) in the search space. is a random number in the interval . Finally, represents the number of stars (potential solutions).
While moving towards the black hole, a star may reach a location with lower cost than the black hole. In such case, the black hole moves to the location of that star and vice versa. Then the algorithm will continue with the black hole in the new location and then stars start moving towards this new location.
In addition, there is the probability of crossing the event horizon during moving stars towards the black hole. Every star (or potential solution) that crosses the event horizon of the black hole will be swallowed by the black hole. Every time a candidate is sucked in by the black hole, another potential solution (star) is born and distributed randomly in the search space and starts a new search. This is done to keep the number of potential solutions constant. The next iteration takes place after all the stars have been moved.
The radius of the event horizon in the black hole algorithm is calculated using the following equation:where is the fitness value of the black hole, is the fitness value of the th star, and is the number of stars (potential solutions).
When the distance between a star and the black hole (best candidate) is less than , that candidate is collapsed and a new star is created and distributed randomly in the search space. Based on the above description the main steps in the black hole algorithm are described in detail in Algorithm 1.

The black hole algorithm begins with the loading and processing phases. Then, at Lines 38, an initial population of stars is randomly generated as a vector of binary values that corresponds to one star or potential solution (Line 5). For each star, the fitness value is calculated by evaluating the objective function (Line 7).
Then, while a termination criterion (a maximum number of iterations or a sufficiently good solution was not reached) is met, each fitness of a potential solution is evaluated (Lines 1137). As previously mentioned, the set covering problem is a minimization problem. This evaluation is handled in comparison presented at Line 13. If the new minimum value is less than minimum global, the minimum global is changed by the new minimum value and the best solution is stored in (black hole).
If the fitness of a star crosses the event horizon of the black hole, replace it with a new star in a random location in the search space. This process is described in the for loop statement at Lines 1928.
Finally, the last for loop statement (Lines 2936) generates new solutions according to Equation (4), for each dimension . This value belongs to real domain and it must be brought to a binary domain; thus a function is used again to transform the real value in a binary one.
4.1. Adaptive Black Hole Algorithm
The basic version of the black hole algorithm has not control parameters. If we observe the parameter , we can see that the population size is valuated before the run of the metaheuristic. At each iteration of the algorithm, the solutions are altered updating their position or when they are absorbed by the black hole, in which case a new random solution—or star—is created. However, the total number of solutions is preserved. The second part of the metaheuristic uses an absorption process, called event horizon that is a stochastic procedure and it is influenced by the best solution, discarding bad solutions that seem to lead to poor results. This is not necessarily a correct idea, since often a bad solution can be drive the search to an optimal one.
These concerns are very important for the performance of the algorithm, reason why we have decided to take the event horizon logic to calculate the value of the population size parameter in autonomous way. For that, we use autonomous search which is a particular case of adaptive systems that improve their solving performance by modifying and adjusting themselves to the problem at hand, either by adaptation or supervised adaptation (please see [3, 56] for details). This approach has been successfully applied in constraint programming using bioinspired algorithms for controlling the process resolution of solver tools [57, 58]. The objective of autonomous search is to allow the metaheuristic to selfadapt the value of the parameter during the run, according to the algorithm convergence.
The literature presents several works illustrating how to improve evolutionary algorithms by dynamically controlling parameters during solving time [13–17, 22]. We have decided to modify the original black hole algorithm considering the autonomous search principles for building a procedure that adaptively vary the number of solutions (population size parameter), according to the performance exhibited during search.
We are inspired by this approach to improve the local search process of black hole algorithm. The main objective is to give the algorithm the ability to adjust its population size. When algorithm is stagnant some solutions are improved according to the best solution. In opposition, if solutions are far away from the best solution, the worst stars are eliminated and new ones are randomly generated.
We vary only the population parameter since is the only one available to be modified during solving time, this make the implementation easier and the solving process improved. We believe this approach also provide knowhow for future experiments involving adaptive population. The procedure is described in Algorithm 2.

Inputs of the procedure are the value of population size , current iteration , and number of stagnation that represents the number of iterations where the best solution does not improve, commonly called “local search”.
Reached this point, we calculate the distance between best and worst solutions according to cost of each one and then it divides by sum of all costs (Line 2). This value describes the percentage of maximum separation between solutions. A low percentage value indicates that solutions are homogeneous skewed to the best solution, in which case the procedure creates new solutions (Lines 518). To create these solutions, we reuse the percentage to determinate if they are cloned from best solution (Lines 810) or they are randomly generated (Lines 1216). On the other hand, if means that solutions are heterogeneous; thus the procedure remove the worst solutions (Lines 2022).
In both cases, the percentages calculated can also be seen as a mechanism to support the exploration and exploitation phases. If solution are homogeneous, this procedure allows to explore towards new solutions, while if solutions are heterogeneous, the procedure converges towards a set of similar solutions itself.
For computational experiments we use potential solution and number of generations. Although the approach proposed is adaptive, it is necessary to define an initial value for .
4.2. Binary Approaches
It is known that variables of the set covering problem are limited to binary values, namely, : for this reason, we use a binary representation for each solution given by the black hole algorithm, as shown in Figure 2, where
In this paper, we use the standard black hole algorithm [59], where each star represent a candidate solution.
However, the standard version of the black hole algorithm is designed to solve problems with real domains. This task is resolved by transforming domains, by applying binarization strategies, which are responsible for forcing elements to move in a binary domain. The binarization strategy is composed of a transfer function and a discretization method. In this work, we tested 32 different binarization strategies.
We evaluate different functions, separated into two families [60, 61]: Shape and Shape (see Table 1).

Once a transfer function is applied, the input real number is mapped to a real number belonging to interval. Then, a discretization method is required to produce a binary value from the real one. For achieving this, we test four different methods:(1)Standard: if condition is satisfied, standard method returns 1; otherwise, return 0.(2)Complement: if condition is satisfied, standard method returns the complement value.(3)Static probability: a probability is generated and it is evaluated with a transfer function.(4)Elitist: discretization method Elitist Roulette, also known as Monte Carlo, is to select randomly among the best individuals of the population, with a probability proportional to its fitness.
4.3. Heuristic Feasibility Operator
Generally, metaheuristics may provide solutions that violate the constraints of the problem. For instance, a new set covering problem solution owning uncovered rows, clearly violates a subset of constraints. In order to provide feasible solutions the algorithm needs additional operators. To this end, we employ a heuristic operator that achieves the generation of feasible solutions, and additionally eliminates column redundancy.
For making all feasible solutions, we calculate a percentage based on the cost of column over the sum of all the constraint matrix rows covered by a column , as shown in The unfeasible solutions are repaired by covering the columns of the solution that had the lower ratio. After this, a local optimization step is applied, where column redundancy is eliminated. A column is redundant when it can be deleted and the feasibility of the solution is not affected.
Algorithm 3 starts with the initialization of variables from instance in Lines 15. The recognition of the rows that are not covered are in Lines 6 and 7. Between the statements 8 and 18 is “greedy” heuristic. On the one hand, between the instructions 8 and 12, the columns with lower ratio are added to the solution. On the other hand, between Lines 13 and 18, the redundant columns with higher costs are deleted while the solution is feasible.

5. Experimental Results
After to apply the adaptive approach, we have analyzed the time complexity of the basic algorithm to illustrate that our proposal does not affect its performance. If we observe each flow control, statements and expressions from the basic black hole algorithm, we can determine that time complexity is given by , where is a constant representing the maximum of iterations and it is avoided, value is the number of solutions, and is the dimension of each solution. In worst case, the basic algorithm is upper bounded by .
Now, evaluating time complexity about our adaptive approaches, we determine that the upper bound is given by , where is the new number of solutions and is the dimension of each solution. This procedures are activated when the number of iterations achieve the local search criteria ls, therefore they are independent of the main algorithm and as a consequence its performance and complexity time are not altered.
With respect to space analysis about our approach, we considerate the space used in principal memory. In this context, every time that the adaptive algorithms are lunched, we observe that the usage of the principal memory is not outperformed to 8%. The basic algorithm use a similar percentage of principal memory.
The performance of the adaptive black hole algorithm is experimentally evaluated by using 65 instances of the set covering problem organized in 11 sets taken from the Beasley's ORlibrary [62]. Table 2 describes instance set, number of rows or constraints , number of columns or variables (dimension) , range of costs, and density (percentage of nonzeroes in the matrix).

In order to reduce the instance size of set covering problem, we have used a preprocessed instances set. Different preprocessing methods have particularly been proposed in [63]. We employ two of them, which have been proved to be the most effective ones:(i)Column Domination. The nonunicost set covering problem holds different column costs, then once a set of rows is covered by another column and , we say that column is dominated by , then column is removed from the solution.(ii)Column Inclusion. If a row is covered by only one column after the above domination, it means that there is no better column to cover those rows; consequently this column must be included in optimal solution.
The results are evaluated using the relative percentage deviation (). The value quantifies the deviation of the objective value from that in our case is the best known value for each instance and it is calculated as follows:
The minimum (Min), maximum (Max), and average (Avg) of the solutions obtained were achieved running 30 runs over each one of set the 65 test instances. We test all the combinations of transfer functions and discretization methods over all these instances. Before to test all instances, we perform a sampling phase to evaluate which binarization strategy would show the best performance. In appendix section we include preliminary results in the sampling phase (see Tables 9 and 10 and Figures 7 and 8). The binarization strategy that achieved the best results for the black hole algorithm is .
5.1. Black Hole Algorithms Comparison
In this Section, we compare the proposed adaptive black hole algorithm with the basic black hole one. To the best of our knowledge, no autonomous search implementation on top of black hole has been proposed yet to include in the comparison.
The algorithm implementation has been done using Java SE 7 and the experiments have been launched on a 2.3 Ghz Intel Core i3 with 4 GB RAM machine running Windows 7 (available at https://goo.gl/AEYXJq). The initial parameter setting used is as follows: a population size of 40, 4000 iterations, a local search of 40 and 30 executions for each instance. Bold font and underlined row are used to present when the adaptive black hole algorithm outperforms their basic version.
Table 3 illustrates the results obtained for all instances from groups 4, 5, 6, A, B, C, and D. Regarding instance sets 4 and 6, the proposed adaptive approach in this work (BHAS) improves dramatically the results in 50%, reaching 4 global optimums. Considering instance sets 5 and 6, we can observe that the basic black hole algorithm exhibits great performance achieving 80% of the optimal values. However, the adaptive process of the population size parameter improves results obtaining 2 more optimal values. Now, if we analyze instances from groups A, B, C, and D, we can see that again the basic black hole algorithm shows an outstanding efficiency to solve the set covering problem, solving to optimally 15 of 20 instances. Nevertheless, the adaptive approach shows that it can improve even more, finding one more optimal value for the instance C.1.

Table 4 presents the results obtained for instances from groups NRE, NRF, NRG, and NRH. If we consider that the basic black hole algorithm reaches 3 optimal values from 20, there is a wide margin to improve. Taking into account the fact that the autoadaptive methods outperform the basic BH algorithm in 11 of 17 possible instances and 2 of them are global optima, we can say that BHAS performs significantly better than the basic version.

Figure 3 illustrates convergence charts for the hardest instances of each instance group. Here, we observe that for the 4 group the convergence of the BHAS is clearly faster than the basic algorithm. For the group 5, the BHAS begins with a bad quality solution but at the middle of the process improves its performance outperforming the basic BHA similar behaviour can be seen for the NRE and NRH group. The performance of instances from groups 6, A, B, C, and D is similar keeping all of them an early convergence. Finally, for benchmarks from groups NRF and NRG, the behaviour of BHAS is clearly earlier than its competitor.
Now, if we regard solving times required for reaching the solutions, we may observe that times are very similar for the two algorithms. However, we must consider that BHAS need the computation of the adaptive process and is able to outperform the basic BH in terms of optimum values reached. We can observe also a small difference in terms of solving times in favor of the BHAS with respect to BH.
5.2. BHAS v/s Original Approach
After to analyze the exhibited efficiency for BHAS and the basic algorithm for solving the set covering problem, we have observed that the variation is minimal. It is not possible to state which technique is the best. However, in order to show a significant difference between the adaptive approach and the original black hole algorithm, we perform a contrast statistical test for each instance through the KolmogorovSmirnovLilliefors to determine the independence of samples [64] and Wilcoxon's signed rank [65] to compare statistically the results.
For both tests, we consider a hypothesis evaluation, which is analyzed assuming a of —accuracy—i.e., smaller values than , determines that the corresponding hypothesis cannot be assumed. Both tests were conducted using GNU Octave (available at http://goo.gl/jtHn8i).
The first test allows us to analyze the independence of samples. For that, we run the algorithm 30 times for each instance and then, we apply the test. According to obtained values, we can decide if samples follow a normal distribution or they are independent. To proceed, we propose the following hypotheses:(i) states that follows a normal distribution.(ii) states the opposite.
The conducted test has yielded lower than ; therefore cannot be assumed.
Then, as samples are independent and cannot be assumed that follow a normal distribution, it is not feasible to use the central limit theorem to approximate the distribution of the sample mean as Gaussian. Therefore, we assume the use of a nonparametric test for evaluating the heterogeneity of samples. For that we use the Wilcoxon’s signed rank test. This is a paired test that compare the medians of two distributions. To proceed, we propose the following new hypotheses:(i): achieved by basic black hole algorithm achieved by BHAS.(ii) states the opposite.
Table 5 compares the approaches for all tested instances via the Wilcoxon’s signed rank test. As the significance level is also established to , smaller values that defines that cannot be assumed. Again, bold font and underlined row are used for a winner value of the metaheuristic stated in the column of the table, e.g., for instance 4.1, BHAS is better than basic algorithm as its value is lower than , then cannot be assumed.

According to results, for lower than for black hole optimization are 12 and for BHAS are 40. The rest does not provide significant information. This results illustrating that the performance of BHAS is better than the original approach.
5.3. BHAS v/s Other Optimization Techniques
To evidence the performance of our adaptive approach, we perform a comparison with different approximation techniques: binary cat swarm optimization (BCSO) [41], binary firefly optimization (BFO) [40], binary shuffled frog leaping algorithm (BSFLA) [42], binary artificial bee colony (BABC) [43], and binary electromagnetismlike algorithm (BELA) [66].
We additionally incorporate a comparative by using Mixed Integer Linear Programming (MIP) as exact solving method implemented on MiniZinc G12 MIP. With the solver, the instances are solved to a maximum time of 8 hours. If no solution is found at this point the problem is set to t.o. (timeout).
Tables 6, 7, and 8 illustrate that the proposed approach is able to reach competitive results in contrast to those modern optimization techniques.

