Abstract

Ensemble pruning has been widely applied to improve the performance of multiple-learner systems. Diversity and classification accuracy of the learners are regarded as the two key factors for achieving an ensemble with competitive classification ability. Because the extreme learning machine (ELM) is characterized by an excellent training rate and generalization capability, it is employed as the base classifier. For a multiple-ELM system, increasing the diversity of its constituents inevitably lowers their mean accuracy. Therefore, a compromise between the two ensures that the ELMs retain good diversity and high precision, but finding this compromise imposes a heavy computational burden: it is hard to locate the exact result either by the search of intelligent algorithms or by the pruning of diversity measures alone. On this basis, we propose a hybrid ensemble pruning approach employing coevolution binary glowworm swarm optimization and reduce-error pruning (HEPCBR). Considering the good performance of reduce-error (RE) in selecting ELMs with high diversity and precision, we employ RE to choose satisfactory ELMs from the generated ELMs. The constituents are then further selected via the proposed coevolution binary glowworm swarm optimization (CBGSO) and used to construct the promising ensemble. Experimental results indicate that, compared to other frequently used methods, the proposed HEPCBR achieves significantly superior classification performance.

1. Introduction

Ensemble learning is widely used to improve classification ability in areas such as image recognition [1, 2], intelligent detection [3, 4], and data mining [5, 6]. Its main idea is to aggregate the predictions of multiple learners so that better predictive results can be attained [7, 8]. Given its remarkable improvement in classification over single learners [9, 10], it has been applied to several distinct problems, e.g., image processing [11, 12], medical diagnosis [13, 14], age prediction [15, 16], gene expression data classification [17, 18], and intrusion detection [19]. However, there is no such thing as a free lunch: when massive numbers of learners are employed to attain an ensemble with good generalization ability, a large amount of computing resources is consumed in practical applications [20]. To reduce this consumption, ensemble pruning has been studied [9]. It aims to aggregate only a fraction of the learners, increasing prediction accuracy while consuming fewer resources [10]. It is widely accepted that aggregating multiple identical learners simply reproduces the same results, so the classification capacity cannot be enhanced [20]. In other words, the learners should behave diversely on the data samples in order to obtain an ensemble with higher predictive ability.

Nevertheless, combining diverse learners may result in better or worse performance, because an ensemble composed of members with great diversity but low precision may have low classification capacity [21]. Hence, the performance of an ensemble is dramatically influenced by two fundamental factors, i.e., the diversity and the precision of its learners [22, 23]. Consequently, we should select learners with large diversity and high precision, which can be cast as a combinatorial optimization problem [24]. Assume that there are $N$ learners in an ensemble; the number of its nonempty subsets is $2^N - 1$, which is very large [9, 25]. Hence, most algorithms cannot perform an exhaustive search over such a large space [9]. Many ensemble pruning methods have been proposed to cope with the selection of learners, and the existing works are summarized below.

As a rule, these techniques can be classified into several categories [10, 20]. Firstly, some scholars adopt diversity measures to evaluate the learners, and those that satisfy certain conditions are chosen to build the ensemble via different pruning strategies [26, 27], e.g., kappa [28], reduce-error [9], complementarity [26], contribution [29], margin-based criteria [30], the margin and diversity-based measure [31], and relevancy and complementary measures [32]. Secondly, the pruning of classifiers is a combinatorial optimization problem, and some researchers directly employ heuristic search algorithms to solve it [33, 34], for instance, the well-known pruning method GASEN [33] and GSOEP [34]. Thirdly, clustering techniques are used to select a set of representative learners with good diversity for integration [8]; widely used techniques include k-means [35] and deterministic annealing [36]. Finally, many scholars study other pruning methods for the selection of learners, for example, frequent patterns [7], a randomized greedy selective strategy with ballot [37], greedy randomized dynamic pruning [38], a confidence interval based on the double-fault measure [39], a graph coloring approach [40], cost-sensitive rotation forest [17], induction of decision trees [41], and simple coalitional games [20].

From the analysis above, it is easy to see that existing works based on diversity measures only remove a part of the badly performing learners, so learners with low precision may remain and degrade the performance of the ensemble. To tackle this issue, we combine diversity measures and heuristic algorithms to filter out redundant learners. Their integration can seek the optimal combination of learners: the diversity measure is used to preprune a collection of learners with poor comprehensive capability [9], which notably reduces the computational requirements, and then a heuristic algorithm selects the learners with good comprehensive performance from the remaining learners to build the ensemble [33].

Considering the advantages of ELM [42, 43], we use it as the base learner in this work [39]. Reduce-error (RE) performs better than other diversity measures in filtering out badly performing learners [9]. Hence, RE can select complementary learners with high classification capability, markedly downsize the pool of ELMs, and reduce the computing requirements, so it is an appropriate preselection measure. In addition, glowworm swarm optimization (GSO) offers strong robustness, ease of implementation, and good global search capability [44–48]. Hence, it provides an efficient search strategy for further pruning redundant ELMs.

Therefore, we propose a novel hybrid pruning method, called HEPCBR, that integrates the proposed CBGSO and RE. Existing pruning methods generally search for the subensemble with the best capability using GSO or RE alone, which rarely finds it exactly, so the combination of CBGSO and RE is employed to search for it. HEPCBR is an integrated method for the ensemble pruning of multiple ELMs, and it performs well in searching for the optimal combination of ELMs. We first remove some redundant ELMs using RE, reducing the number of ELMs and alleviating the computational complexity, and the presented CBGSO is then adopted to further seek the optimal subensemble of ELMs. In this way, the combination of CBGSO and RE copes with the selection of ELMs.

This work’s contributions are presented as follows:
(1) An ensemble pruning algorithm, named HEPCBR, which utilizes the integration of RE and the proposed CBGSO.
(2) A modified version of the basic GSO, named CBGSO, which has good convergence accuracy and a high evolution velocity.
(3) RE takes full advantage of the diversity among ELMs and filters out a fraction of the ELMs with poor comprehensive performance.
(4) Experimental results show that the proposed approach obtains a significant enhancement in classification capacity.

2. CBGSO

The original GSO was inspired by the luminous behavior of glowworms [44, 45]. In GSO’s search process, an initial population composed of many glowworms is generated randomly, and each glowworm then moves toward a neighbor with a larger luciferin value. In this way, many individuals gather at the locations of the ones with high luciferin values [49], and the optimal one is obtained. For an optimization problem in a discrete space, the basic GSO cannot be used directly because of its fixed step.

To deal with problems in a discrete space, we propose CBGSO by improving the search processes of the basic GSO. In CBGSO, the fixed step is modified so that the algorithm can efficiently seek the global optimum in a binary space. A crossover operation and a collaborative operation are introduced into the basic GSO; they create new individuals and maintain population diversity, which helps avoid local optima and improves the convergence rate. An escaping operation is also used, which gives individuals that coincide with the locally optimal solution a chance to escape from the current area and avoid being trapped in a local optimum. With these operations, the basic GSO can efficiently find the optimum solution. CBGSO is described in the following subsections.

2.1. Improvement of Moving Way

Given a binary combinatorial optimization problem, GSO cannot directly search with a fixed step, so its movement must be modified. To make GSO move in a binary space, the position of a glowworm is updated with a certain probability [34, 48]. The updating process can be formulated as

$$x_i^k(t+1)=\begin{cases} r, & \text{if } \mathrm{rand}\le p,\\ x_i^k(t), & \text{otherwise},\end{cases} \tag{1}$$

where $x_i$ is the current glowworm; $x_i^k(t)$ is the $k$-th dimension of $x_i$ at the $t$-th iteration; $r$ is a random integer, 0 or 1; $p$ is a predefined probability; and $\mathrm{rand}$ is a random number in $[0, 1]$.
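As an illustration, the following minimal Python sketch implements this probabilistic bit update under the reconstruction of equation (1) given above; the function and parameter names are illustrative only.

```python
import numpy as np

def update_position(x, p, rng):
    """Binary position update sketch: each dimension is reset to a random bit
    with probability p, otherwise it keeps its current value."""
    x_new = x.copy()
    for k in range(len(x_new)):
        if rng.random() <= p:                 # rand <= p: resample this bit
            x_new[k] = rng.integers(0, 2)     # random integer, 0 or 1
    return x_new

rng = np.random.default_rng(0)
glowworm = rng.integers(0, 2, size=25)        # candidate subensemble over 25 ELMs
print(update_position(glowworm, p=0.1, rng=rng))
```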

2.2. Crossover Operation

To enhance the search rate of GSO, a crossover operation inspired by GA [33] is introduced. Two selected glowworms perform a crossover, and two new offspring are created by exchanging part of the elements of their parents. This gives a chance to create offspring that perform better than the individuals currently in the population. If all the glowworms in the population take part in the crossover, the population is likely to become more diverse, after which badly performing glowworms are eliminated; that is, if the newly generated glowworms achieve better results, the current individuals are replaced. The crossover operation is described as

$$x_{c1}=\big[x_i^1,\dots,x_i^c,x_j^{c+1},\dots,x_j^D\big],\qquad x_{c2}=\big[x_j^1,\dots,x_j^c,x_i^{c+1},\dots,x_i^D\big], \tag{2}$$

where $x_{c1}$ and $x_{c2}$ represent the new glowworms created by the crossover operation; $x_i$ and $x_j$ express two different glowworms, respectively; and $c$ declares a random integer, $1\le c< D$, with $D$ the dimension of a glowworm.
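The following single-point crossover sketch mirrors the reconstruction of equation (2); the names are illustrative, and the replace-if-better rule from the text is left to the caller.

```python
import numpy as np

def crossover(x_i, x_j, rng):
    """Single-point crossover: offspring exchange the tails of the two parents
    after a random cut point c, 1 <= c < D."""
    c = int(rng.integers(1, len(x_i)))        # random cut point
    child1 = np.concatenate([x_i[:c], x_j[c:]])
    child2 = np.concatenate([x_j[:c], x_i[c:]])
    return child1, child2
```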

2.3. Collaborative Operation

To avoid being trapped in local optima, a collaborative operation is employed. In GSO, each glowworm keeps strong links only with the glowworms inside its dynamic local-decision domain and has no link with those outside it; to explore more broadly, a glowworm should also exchange information with individuals outside its local-decision domain. In this work, the population of glowworms is therefore partitioned into several subgroups, and each subgroup keeps strong links with the other subgroups. The optimal glowworm in one subgroup performs a crossover with the optimal individual in another, so that new individuals with better performance are gained through the exchange of their elements.

Assume that the population is partitioned into $S$ subgroups, $x_{b,s_1}$ is the optimal glowworm in the $s_1$-th subgroup, $x_{b,s_2}$ is the optimal glowworm in the $s_2$-th one, and $s_1\neq s_2$. The collaborative operation can be expressed as follows:

$$x_{c1}=\big[x_{b,s_1}^1,\dots,x_{b,s_1}^c,x_{b,s_2}^{c+1},\dots,x_{b,s_2}^D\big],\qquad x_{c2}=\big[x_{b,s_2}^1,\dots,x_{b,s_2}^c,x_{b,s_1}^{c+1},\dots,x_{b,s_1}^D\big], \tag{3}$$

where $x_{c1}$ and $x_{c2}$ are the newly generated glowworms, respectively, and $f(x_{c1})$ and $f(x_{c2})$ are their fitness values, which decide whether they replace the parent glowworms.
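A sketch of this collaborative step under the assumptions above (the subgroup selection, the replacement rule, and all names are assumptions of this sketch); it reuses the crossover sketch from Section 2.2.

```python
import numpy as np

def collaborate(subgroups, fitness_fn, rng):
    """Collaborative operation sketch: the best glowworms of two randomly chosen
    subgroups exchange elements via crossover; the offspring are returned as
    candidate replacements for the two parents."""
    s1, s2 = rng.choice(len(subgroups), size=2, replace=False)
    best1 = max(subgroups[s1], key=fitness_fn)   # optimal glowworm of subgroup s1
    best2 = max(subgroups[s2], key=fitness_fn)   # optimal glowworm of subgroup s2
    return crossover(best1, best2, rng)
```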

2.4. Escaping Operation

To increase the probability of escaping from local optima, an escaping operation [50] is adopted. If a glowworm coincides with the locally optimal solution, several randomly chosen elements of the glowworm are changed, so that it moves to another position and escapes from the local optimum.
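A sketch of the escaping operation follows; the number of flipped bits and the names are assumptions of this sketch.

```python
import numpy as np

def escape(x, n_flip, rng):
    """Escaping operation sketch: flip a few randomly chosen bits of a glowworm
    that coincides with the local optimum, so it can leave that region."""
    x_new = x.copy()
    idx = rng.choice(len(x_new), size=n_flip, replace=False)  # bits to change
    x_new[idx] = 1 - x_new[idx]
    return x_new
```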

3. HEPCBR

In this section, HEPCBR is proposed using the combination of CBGSO and RE. First, the ELMs with poor comprehensive performance are prepruned using RE, downsizing the pool of ELMs and notably reducing the computational overhead of the subsequent selection; second, the subensemble of ELMs is chosen from the remaining ELMs using the proposed CBGSO. The proposed HEPCBR is described as follows.

3.1. Initial Pool

To maintain good classification ability when grouping multiple learners, the members should be diverse, and weak learners can be employed to obtain such diversity [40]. ELM is a weak learner [39] with a fast training speed, so multiple diverse ELMs can be produced with high efficiency. Therefore, ELMs are generated via the bootstrap sampling method used in bagging [51], and the initial pool of ELMs is constructed, as sketched below.
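The following minimal Python sketch builds such a pool; the helper train_elm and all names are hypothetical placeholders, and any ELM training routine (such as the one sketched after equation (6) below) could be plugged in.

```python
import numpy as np

def build_pool(X, y, n_elms, train_elm, seed=0):
    """Sketch of the initial pool: each ELM is trained on a bootstrap
    replicate of the training set, as in bagging."""
    rng = np.random.default_rng(seed)
    pool = []
    for _ in range(n_elms):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        pool.append(train_elm(X[idx], y[idx]))
    return pool
```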

Huang et al. [42, 43] published the first version of ELM for dealing with classification and regression problems [52]. When the number of hidden nodes is fixed, the hidden-layer weights and biases are generated randomly, so the unique output weights can be obtained with high efficiency.

For a training set $\{(x_i, t_i)\}_{i=1}^{N}$, the model of ELM is expressed as follows:

$$\sum_{j=1}^{L}\beta_j\, g(w_j\cdot x_i + b_j)=o_i,\quad i=1,\dots,N, \tag{4}$$

where $o_i$ is the network output for $x_i$; $L$ declares the number of hidden nodes; $g(\cdot)$ exhibits an activation function; $w_j$ denotes the input weight vector of the $j$-th hidden node; $\beta_j$ discloses the weight vector of the output layer; and $b_j$ reveals a threshold. Equation (4) is recharacterized in another way:

$$H\beta=T, \tag{5}$$

where $\beta=[\beta_1,\dots,\beta_L]^{\mathrm T}$, $T=[t_1,\dots,t_N]^{\mathrm T}$, and $H$ manifests the hidden-layer output matrix with entries $H_{ij}=g(w_j\cdot x_i + b_j)$. Then, we can get

$$\beta=H^{\dagger}T, \tag{6}$$

where $H^{\dagger}$ intimates the Moore–Penrose generalized inverse of $H$.
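A minimal sketch of this training rule follows, under the assumptions of integer class labels, one-hot encoded targets, and a sigmoid activation; the function names are illustrative only.

```python
import numpy as np

def train_elm(X, y, n_hidden=50, seed=0):
    """ELM sketch: random input weights W and biases b, sigmoid hidden layer H,
    output weights beta = pinv(H) @ T as in equation (6); y holds integer labels."""
    rng = np.random.default_rng(seed)
    T = np.eye(int(y.max()) + 1)[y]                   # one-hot targets
    W = rng.normal(size=(X.shape[1], n_hidden))       # random input weights
    b = rng.normal(size=n_hidden)                     # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                      # Moore-Penrose solution, eq. (6)
    return W, b, beta

def predict_elm(model, X):
    """Predicted class labels: argmax over the output layer."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta).argmax(axis=1)
```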

3.2. Preselection of ELMs

The pruning of learners is an NP-complete problem. To make an ensemble perform at its best, the accuracy and diversity of the ELMs must be balanced, which brings a heavy computational burden, so it is almost impossible to search directly for the optimal subensemble. To find it efficiently, the redundant ELMs should first be downsized to reduce the computational burden of the selection, and the final ensemble can then be obtained using a heuristic algorithm with a high search efficiency.

Reduce-error (RE) [9, 26, 28] selects ELMs with good diversity and high precision by evaluating their contribution to the ensemble, so it is utilized here to remove some poorly performing ELMs. Its basic concept is introduced in the following part.

Consider a training set $D=\{(x_j, y_j)\}_{j=1}^{m}$, whose samples are composed of a feature vector $x_j$ and a label $y_j$. The predictive results are attained by classifying the samples in $D$, and $h_i(x_j)$ denotes the label predicted by the $i$-th ELM $h_i$ for $x_j$. The predictive result of the original ensemble is obtained using unweighted majority voting, as follows [9, 26, 28]:

$$H(x_j)=\arg\max_{y\in Y}\sum_{i=1}^{N} I\big(h_i(x_j)=y\big), \tag{7}$$

where $I(\cdot)$ represents an indicator function ($I(\text{true})=1$ and $I(\text{false})=0$); $h_i(x_j)$ expresses the class label classified by the $i$-th ELM; and $Y$ represents the set of sample labels.
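A small sketch of this voting rule (names are illustrative); predictions is assumed to be an integer label array of shape (n_elms, n_samples).

```python
import numpy as np

def majority_vote(predictions, n_classes=None):
    """Unweighted majority vote of equation (7): for each sample, count the
    votes each class receives and return the most voted class label."""
    if n_classes is None:
        n_classes = int(predictions.max()) + 1
    votes = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return votes.argmax(axis=0)   # ties resolved toward the smaller label
```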

The ELM that most reduces the error rate of the candidate subensemble, among the pool of unselected ELMs, is added to the ensemble [9, 26, 28]. The first selected ELM is the one with the highest classification accuracy on the training set. Assume that $u$ ELMs have been selected to constitute the ensemble $S_u$; then the next ELM $h_{s_{u+1}}$ is packaged into the ensemble $S_{u+1}=S_u\cup\{h_{s_{u+1}}\}$:

$$s_{u+1}=\arg\min_{k\notin S_u}\sum_{j=1}^{m} I\Big(H_{S_u\cup\{h_k\}}(x_j)\neq y_j\Big), \tag{8}$$

where $H_{S_u\cup\{h_k\}}(\cdot)$ is the majority-voting prediction of equation (7) restricted to the candidate subensemble $S_u\cup\{h_k\}$.
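A greedy sketch of this ordering under the assumptions above; pred is an (n_elms, n_samples) array of labels predicted on the selection set, y the true labels, and majority_vote the voting sketch given for equation (7).

```python
import numpy as np

def reduce_error_order(pred, y):
    """Reduce-error ordering sketch (equation (8)): start from the most accurate
    ELM, then repeatedly add the ELM whose inclusion minimises the error of the
    voted subensemble; returns ELM indices in selection order."""
    remaining = list(range(len(pred)))
    selected = []
    while remaining:
        errors = [np.mean(majority_vote(pred[selected + [k]]) != y)
                  for k in remaining]
        best = remaining[int(np.argmin(errors))]
        selected.append(best)
        remaining.remove(best)
    return selected
```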

In this section, RE is used to remove a part of the redundant ELMs, with the goal of significantly alleviating the computational complexity. The ELM with the best classification accuracy is selected first, and the ensemble then grows by joining new ELMs according to equation (8). The size of the preselection is determined through the use of RE: suppose that the first $u^{*}$ ELMs acquired by RE are retained to construct the prepruned ensemble; these members are then submitted to further selection.

Theorem 1. The ensemble composed of the first $u^{*}$ ELMs extracted from the initial pool using RE improves the ensemble accuracy.

Proof. $N$ training subsets are generated by bootstrap sampling, and multiple ELMs are independently trained on the different training subsets. Hence, $N$ ELMs $\{h_1, h_2, \dots, h_N\}$ are obtained. The process of preselection based on RE is as follows.
First, the ELM that does best in classification is selected and added into the new subensemble $S_1$, and its accuracy is $A_1$. Second, the ELM that makes the new subensemble $S_2$ win the highest precision according to equation (8) is extracted from the unselected ELMs, and the corresponding ensemble accuracy is $A_2$. Finally, the remaining unselected ELMs are added into the subensemble one by one according to equation (8) in the same fashion, and their ensemble accuracies are recorded. Thus, an ordered sequence of ELMs $\{h_{s_1}, h_{s_2}, \dots, h_{s_N}\}$ and its sequence of ensemble accuracies $\{A_1, A_2, \dots, A_N\}$ are attained.
Let $u^{*}=\arg\max_{u} A_u$, namely, $A_{u^{*}}$ is the best. The first $u^{*}$ ordered ELMs should be retained, because the ensemble composed of them achieves the highest accuracy. In a word, when the first $u^{*}$ ELMs are utilized to form the prepruned ensemble using RE, its ensemble performance can be improved.
To verify the above theorem, the average ensemble accuracies achieved by ordered bagging under RE with 50, 150, and 250 ELMs are shown in Figure 1 for different datasets (Bupa, Vehicle, and CMC). We can observe from Figure 1 that the accuracy curves obtained by RE first rise and then fall as the number of ELMs grows, attaining a peak at a moderate number of ELMs; after that, as the size of the subensemble increases, the precision gradually decreases. Namely, the ensemble ability can be enhanced by preselection based on RE. At the beginning, the ELMs have low diversity, so the accuracy goes up as more ELMs are added, whereas as the number gets larger, poorly performing ELMs are added into the subensemble, which leads to the decline of the precision. From Figure 1, we can also see that the ordered ensemble using RE reaches its maximum before the subensemble size reaches a certain value. Yang et al. [21] found that heuristic algorithms perform better than other techniques in the selection of learners when the number of learners is less than 25. As a consequence, we select the first 25 well-performing ELMs, which are used for postselection via CBGSO.

3.3. Postselection of ELMs

Ensemble pruning is a combinatorial optimization problem. To evaluate a candidate solution, we take the classification accuracy as the fitness function [33, 34]:

$$f=\frac{1}{m}\sum_{j=1}^{m} I(\hat{y}_j=y_j),$$

where $I(\hat{y}_j=y_j)=1$ if $\hat{y}_j=y_j$ and 0 otherwise; $m$ expresses the number of testing samples; and $\hat{y}_j$ and $y_j$ represent the predictive result of the candidate subensemble and the actual label of the $j$-th testing sample, respectively. The search space consists of the 25 ELMs retained by RE. Each glowworm is characterized as a bit sequence in which 1 indicates that the corresponding ELM is selected and 0 that it is not. In this way, the pruning problem can be handled well by CBGSO.
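A sketch of this fitness evaluation (illustrative names), reusing the majority_vote sketch above; pred holds the labels predicted by the 25 retained ELMs on the selection set.

```python
import numpy as np

def fitness(glowworm, pred, y):
    """Fitness of a glowworm (0/1 mask over the 25 prepruned ELMs):
    voted classification accuracy f of the selected subensemble."""
    chosen = np.flatnonzero(glowworm)
    if chosen.size == 0:          # an empty selection gets the worst fitness
        return 0.0
    return float(np.mean(majority_vote(pred[chosen]) == y))
```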

3.4. The Pseudocode of HEPCBR

The pseudocode of HEPCBR has been exhibited as follows (Algorithm 1).

Inputs: HEPCBR’s parameters.
Outputs: the subensemble $E^{*}$ and its precision $Acc^{*}$.
(1) A pool composed of $N$ ELMs is built, $E=\{h_1, h_2, \dots, h_N\}$.
(2) $S\leftarrow\varnothing$.
(3) The ELM $h_{s_1}$ with the best classification ability is chosen.
(4) $S\leftarrow S\cup\{h_{s_1}\}$.
(5) $E\leftarrow E\setminus\{h_{s_1}\}$.
(6) for $u=1$ to $u^{*}-1$ do
(7)   The ELM $h_{s_{u+1}}$ is chosen via equation (8).
(8)   $S\leftarrow S\cup\{h_{s_{u+1}}\}$.
(9)   $E\leftarrow E\setminus\{h_{s_{u+1}}\}$.
(10) end for
(11) $n$ glowworms are produced randomly over the retained ELMs in $S$, and their fitness values are evaluated.
(12) The best glowworm $x_b$ and its fitness $Acc^{*}$ are recorded.
(13) $t\leftarrow 1$.
(14) while $t\le T_{\max}$ do
(15)   for $s=1$ to $S$ (the number of subgroups) do
(16)     for each glowworm $x_i$ in the $s$-th subgroup do
(17)       if $x_i$ equals the locally optimal solution then
(18)         Perform the escaping operation to generate a new glowworm.
(19)       end if
(20)       Find the objective individual $x_j$.
(21)       Update the position of $x_i$ via equation (1).
(22)       Update the parameters (luciferin and local-decision domain) of $x_i$.
(23)       Perform the crossover operation to produce new individuals.
(24)     end for
(25)     Perform the collaborative operation to produce new individuals.
(26)   end for
(27)   Update $x_b$ and $Acc^{*}$; $t\leftarrow t+1$.
(28) end while
(29) return $E^{*}$ (the subensemble encoded by $x_b$) and $Acc^{*}$.

4. Experiments

To evaluate the classification ability of HEPCBR, we select 25 UCI machine learning benchmark datasets for extensive experiments; these datasets are presented in Table 1. Each experiment is repeated 30 times using different random seeds, and the final classification accuracies are then averaged. The parameters of CBGSO are set following [44, 45]. Each dataset is randomly divided into five equal parts: three of them for training, one for validation, and the remaining one for testing. The ELMs are trained on the training data, the pruning of ELMs is implemented on the validation data, and the final results are obtained on the testing data.
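As a sketch of this protocol (names are illustrative), a 3/1/1 split over five equal folds might look like the following.

```python
import numpy as np

def split_train_val_test(n_samples, seed=0):
    """Five equal folds: three for training the ELMs, one for validation
    (ensemble pruning), and one for testing the final ensemble."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), 5)
    train = np.concatenate(folds[:3])
    return train, folds[3], folds[4]
```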

4.1. Experimental Results

The proposed HEPCBR achieves the final ensemble using the combination of RE and the proposed CBGSO, and we first verify whether HEPCBR performs better than either of them alone. The predictive results of HEPCBR are therefore compared with both RE and CBGSO for different pool sizes, as shown in Tables 2 and 3. It is easy to see from Tables 2 and 3 that HEPCBR, via the integration of RE and CBGSO, wins a higher precision than either CBGSO or RE on most datasets while using fewer ELMs. We can also observe from Tables 2 and 3 that, as the pool size increases, the prediction results achieved by HEPCBR and RE improve, but those attained by CBGSO deteriorate. Diagnosing the cause, we find that the computational complexity of the selection of ELMs grows exponentially with the pool size, so it becomes difficult for CBGSO alone to search exhaustively for the exact subensemble. As a result, the preselection of ELMs significantly lowers the computational complexity of the selection: RE retains well-performing ELMs, and the postselection by CBGSO lets the ensemble gain better results. Moreover, the classification ability of HEPCBR improves only marginally once the initial pool size exceeds 150. Hence, 150 ELMs are used in this work.

4.2. Comparison with Other Methods

To further test the classification ability of HEPCBR, extensive experiments were conducted comparing it with other techniques: bagging [51], kappa [28], AGOB [26], POBE [27], DREP [22], DEELM [39], GASEN [33], GSOEP [34], MOAG [30], RREP [10], DMEP [40], EPSCG [20], DASEP [24], MDOEP [31], RCOA [32], and PEAD [41]. Bagging extracts the training samples with equal probability and constructs an initial pool of multiple learners with good diversity. Kappa pruning builds the ensemble by grouping learners whose pairwise measures fall under a certain value. AGOB, POBE, and MOAG find orderings of the learners such that the resulting ensembles work well. DREP first selects the learner with the best classification ability and gradually increases the size of the ensemble so that it achieves better precision. DEELM chooses the learners whose measures fall into an interval. GASEN assigns a random weight to each learner, adopts GA to evolve these weights, and combines the learners whose weights exceed a predefined value. GSOEP employs GSO to directly filter out the redundant learners. RREP attains the ensemble by utilizing both the retained learners and the removed ones obtained by RE. DMEP employs GA to optimize five pairwise diversity matrices into a combined diversity measure and selects a part of the learners in light of graph coloring theory. EPSCG evaluates the learners’ contribution to the ensemble in terms of the Banzhaf power index. DASEP designs new measures that simultaneously consider the diversity and accuracy of learners and acquires good results in the selection of learners. MDOEP attains the final ensemble by aggregation using the margin and diversity-based measure. RCOA ranks all the learners in descending order based on the relevancy and complementary measures to obtain the final ensemble. PEAD selects a collection of learners extracted from a pool using the induction of decision trees, which are used to form the final ensemble.

The classification precisions of the proposed HEPCBR and the other methods, with 150 ELMs in the pool, are displayed in Tables 4 and 5. “+/=/−” indicates that HEPCBR achieves higher/neutral/lower precision than the other methods, respectively. Tables 4 and 5 clearly show that the proposed HEPCBR outperforms the other state-of-the-art techniques on most classification tasks, which demonstrates its improvement in classification. Table 6 displays the sizes of the final ensembles attained by all the pruning approaches. As shown in Table 6, HEPCBR uses fewer ELMs than the other methods to build the ensemble, except that it uses more ELMs than DMEP; however, HEPCBR gains better results than DMEP. As a whole, the proposed HEPCBR acquires the expected results in classification.

In this section, we estimate the significance of the differences between the predictive accuracies gained by HEPCBR and those attained by the other comparative methods using the Wilcoxon rank sum test [40] at the 0.05 significance level. When the p value between two techniques is lower than 0.05, there exists a significant difference between them. Table 7 exhibits the p values obtained in the test between HEPCBR and the other comparative approaches. Table 7 indicates that all the p values are below 0.05, which manifests the significance and effectiveness of HEPCBR.

The execution times of the proposed HEPCBR and the other comparative techniques are reported in Table 8. The observations from Table 8 show that HEPCBR uses much less time than GASEN, GSOEP, and DMEP, but more time than the remaining comparative methods. In HEPCBR, CBGSO is employed to extract the final ELMs from those remaining after prepruning, which requires calculating the classification accuracy of each candidate subensemble at each iteration. During the search of CBGSO, each candidate subensemble is expressed by a glowworm, and each glowworm needs its fitness value evaluated at least once per iteration; therefore, it requires much more time. The techniques with shorter running times perform the selection of ELMs without a repetitive iteration and assessment process, but they do not attain good classification results. HEPCBR costs less time than the three methods mentioned above because those approaches directly select the ELMs with good comprehensive performance without any preselection, which may not be enough to find the exact solution given the enormous number of candidate subensembles; moreover, the preselection process notably alleviates the computational burden of the selection of ELMs. It is concluded that HEPCBR consumes more time but obtains remarkable improvements in classification.

The aforementioned comparative methods are based on bagging and are obtained using diversity measures with different strategies or heuristic algorithms. We further compare the proposed HEPCBR with more state-of-the-art approaches that consider other factors and are not in the same category as the aforementioned methods: C-RoF [17], CS-D-ELM [18], KPCA-RoF [4], and DyPReVNsGraspEnS [38]. C-RoF introduces misclassification, test, and rejection costs to overcome the fact that some existing works neglect classification costs, and it achieves good results. CS-D-ELM extends D-ELM by employing the misclassification cost of the learner and embeds a rejection cost into it to increase classification stability. KPCA-RoF integrates KPCA and RoF for linearly inseparable classification problems and enhances the classification capacity of RoF on nonlinear datasets. DyPReVNsGraspEnS realizes random multistart search, avoids falling into locally optimal solutions, and possesses a high probability of exactly finding an ensemble with better performance. The classification results achieved by the different approaches are presented in Table 9. We can see from Table 9 that the accuracies gained by HEPCBR are higher than those of the other four state-of-the-art methods. We then verify the significance between HEPCBR and the other four approaches using the Wilcoxon rank sum test, as displayed in Table 10. In Table 10, all the p values are less than 0.05, so we can conclude that the proposed HEPCBR has clear advantages over the other four techniques in classification, and that it is significant and effective.

4.3. Parameter Analysis

In HEPCBR, we employ CBGSO to select the ELMs with good diversity and high accuracy from those remaining after prepruning. To make the proposed CBGSO perform well, we analyze its parameters. In addition, to assess its convergence speed and precision, comparative analyses are conducted against the following algorithms: MGSO [48], MBGSO [47], MDGSO [34], IBFS [50], MBFS [53], MFS [54], and GA [33]. The experiments are implemented on two datasets, and the different methods are adopted to search for the optimal subensemble from the ELMs retained by RE, with a pool size of 150.

It is clearly observed from Figure 2 that, as a whole, CBGSO attains better convergence accuracy and a higher evolution velocity than the other seven comparative algorithms, i.e., CBGSO performs better in terms of both the speed and the precision of convergence. It is also easy to see that the accuracies acquired by CBGSO rise and then stay stable; the results achieved by CBGSO gain only slightly in classification when the number of iterations exceeds 400, so the maximum number of iterations is set to 400. In addition, when CBGSO is utilized to search for the final ensemble in HEPCBR, it costs less time than the other seven heuristic algorithms, as displayed in Table 11. Table 11 demonstrates that CBGSO searches more efficiently than the other heuristic algorithms; hence, CBGSO works well in terms of both searching speed and efficiency. As rendered in Figure 3(a), the accuracies mount and level off as the population size increases; when the population size of glowworms is above 25, the computational complexity greatly increases with little improvement in classification, so the population size is set to 25. Figure 3(b) analyzes the effect of the initial decision range on the performance of CBGSO; it performs at its best when the initial range is 11, so the initial range is set to 11. Figure 3(c) reveals the effect of the maximal decision range, which must be larger than the initial one; the observations from Figure 3(c) disclose that the performance of CBGSO peaks at 15, so we advise setting it to 15.

5. Conclusions

Ensemble pruning extracts a subset of ELMs from a constructed pool of ELMs to achieve better predictive results and efficiency. To perform well, an ensemble must consist of accurate and diverse ELMs; attaining this is a combinatorial optimization problem of high complexity. To address this issue, HEPCBR is presented via the fusion of RE and CBGSO: the prepruning strategy using RE avoids the heavy computational burden of the selection of ELMs, and the ELMs with good comprehensive performance are then efficiently selected by CBGSO. An experimental study on 25 UCI datasets demonstrates that the proposed HEPCBR outperforms bagging and the other algorithms and indicates its effectiveness and significance. Furthermore, the presented CBGSO can find better results than the other search methods. The proposed HEPCBR thus provides a new way of studying the pruning problem for ELM ensembles.

With respect to future research directions, we aim to design a new combined diversity measure that can minimize the number of redundant ELMs and thus provide well-performing ELMs for further pruning.

Data Availability

The data about the UCI datasets used are from the website https://archive.ics.uci.edu/ml/index.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was funded by the Anhui Provincial Natural Science Foundation under grant no. 1908085QG298, the National Nature Science Foundation of China under grant nos. 91546108 and 61806068, the Fundamental Research Funds for the Central Universities under grant nos. JZ2019HGTA0053 and JZ2019HGBZ0128, the Anhui Provincial Science and Technology Major Projects under grant no. 201903a05020020, and the Open Research Fund Program of Key Laboratory of Process Optimization and Intelligent Decision-Making (Hefei University of Technology), Ministry of Education.