Research Article  Open Access
Adeeb Noor, Muhammed Kürşad Uçar, Kemal Polat, Abdullah Assiri, Redhwan Nour, "A Novel Approach to Ensemble Classifiers: FsBoost-Based Subspace Method", Mathematical Problems in Engineering, vol. 2020, Article ID 8571712, 11 pages, 2020. https://doi.org/10.1155/2020/8571712
A Novel Approach to Ensemble Classifiers: FsBoost-Based Subspace Method
Abstract
In this article, an algorithm for creating an ensemble classifier is proposed: the F-score subspace method (FsBoost). In this method, features are selected with the F-score and classified with either the same or different classifiers, and the resulting classifiers are then combined into an ensemble. Two versions, FsBoost.V1 and FsBoost.V2, have been developed, depending on whether the same or different classifiers are used. The results obtained are consistent with the literature, and the accuracy rate is higher than that of many algorithms in the literature. The algorithm is fast because it has few steps. These advantages suggest that the algorithm will be successful.
1. Introduction
An ensemble classifier is a method in which multiple classifiers are used together to improve classification performance [1, 2]. For example, suppose three classifiers are used to classify an object: if the first classifier outputs cat, the second outputs dog, and the third outputs cat, the ensemble classifier generates its result by combining these decisions, so the majority decision (cat) is returned. There are many ways to create an ensemble classifier. Some of the most commonly used ones are (1) adaptive resampling and combining (boosting) [3], (1.1) AdaBoost (adaptive boosting) [4], (2) bagging (bootstrap aggregating) [5], and (3) random subspace [6].
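The cat/dog example can be sketched as a plain majority vote over class labels. This is only an illustration; the function name `majority_vote` is hypothetical and not taken from the article.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the largest number of classifiers."""
    counts = Counter(labels)
    # most_common(1) returns a list with one (label, count) pair
    label, _ = counts.most_common(1)[0]
    return label


# Outputs of three classifiers for the same object
votes = ["cat", "dog", "cat"]
decision = majority_vote(votes)
print(decision)
```

With an odd number of classifiers and two classes, a tie cannot occur, which is one reason three base classifiers are a convenient minimum.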
The boosting method can create powerful classifiers by combining and training weak classifiers [3]. The most commonly used boosting method is AdaBoost [4], which tries to improve performance by focusing on misclassified instances [4]. In the bagging method, classifiers trained on different training sets, randomly sampled from the dataset, are combined [5]. The outputs of the classifiers are combined by majority voting or weighted voting [5]. In the random subspace method, feature subsets are generated by random selection [6]. Each subset is defined by its elements and its features [6]. In other words, a subset of features is created, not a subset of instances [6]. In this way, the training process is accelerated. Classifiers are trained on these subsets and combined to form an ensemble classifier. The outputs of the classifiers are combined by majority voting or weighted voting.
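The difference between sampling instances (bagging) and sampling features (random subspace) can be illustrated with a short sketch. The helper names `random_subspaces` and `project`, the subset size, and the toy data are assumptions made for illustration only.

```python
import random


def random_subspaces(n_features, subset_size, n_subsets, seed=0):
    """Draw feature-index subsets at random (without replacement inside a subset)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_subsets)]


def project(rows, feature_idx):
    """Keep every instance but only the selected feature columns."""
    return [[row[j] for j in feature_idx] for row in rows]


rows = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]

subspaces = random_subspaces(n_features=4, subset_size=2, n_subsets=3)
views = [project(rows, idx) for idx in subspaces]
# Each view has all 3 instances but only 2 of the 4 features;
# one classifier would be trained per view and the outputs voted.
```

Every view keeps all instances, which is the point made in the text: the subset is taken over features, not over instances.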
These algorithms have disadvantages. As the number of training stages in AdaBoost increases, the number of samples decreases and training becomes more difficult, so more samples are needed for training. AdaBoost is also quite slow because it has many training stages [7, 8]. The bagging method involves complex calculations [7]. Both methods require many iterations [1]. As a result, their success rate is usually lower than that of the random forest method [1]. These models cannot explain the dataset by modeling it as decision trees [7]. Given these disadvantages, these methods are still in need of improvement. In this study, a new ensemble algorithm based on the F-score feature selection algorithm has been developed to reduce the processing load of existing ensemble algorithms and to increase the accuracy rate.
Feature selection algorithms are often used in the machine learning field to improve the performance of systems [9–11]. In machine learning, datasets come in a variety of sizes and types [12–15]. Large datasets lengthen the training time of the classifier. Feature selection algorithms have been developed to solve this problem [9, 16, 17]. They do this by removing irrelevant data while retaining relevant data [9]. Thus, data size, processing load, and training time decrease while classification accuracy increases [9, 18]. Many feature selection algorithms have been developed in the literature [1, 16]. In this study, the F-score feature selection algorithm is used because it is fast and performs well [16]. Feature selection algorithms can be used in many areas, such as health [19–22].
In this study, two methods, FsBoost.V1 and FsBoost.V2, have been developed based on the F-score feature selection algorithm to enhance training performance for ensemble classifiers. The FsBoost.V1 method resembles the random subspace method, but the features are chosen with respect to the data labels rather than at random. The selected feature sets are classified with a single classifier, and ensemble classifier 1 is created from the results. This process is repeated for three or more different classifiers. Finally, the ensemble classifiers of the three different classifiers are merged. In this way, unnecessary data are removed from the training process. The procedure can also be stopped at the first ensemble classifier. In the FsBoost.V2 method, all data are first classified with different classifiers. In the second step, the feature subspace is created with the F-score feature selection algorithm, and the data are reclassified. An ensemble classifier is created from the classification results. This process is repeated a second time. Finally, the resulting ensemble classifiers are merged. The use of a single classifier reduces the cost. Because the F-score feature selection algorithm retains only relevant features, the training process is accelerated. The complexity is lower than that of other algorithms.
2. Materials and Methods
The study followed the flow shown in Figure 1. First, the records to be used in the study were collected. Then, features were selected with the F-score feature selection algorithm. Finally, the data were classified with different classifiers, and their performances were calculated. From these results, ensemble classifiers were created, and their performances were calculated at different levels and in different configurations.
2.1. Collection of Data
The data used in the study were downloaded from the Machine Learning Repository website of the University of California, Irvine (UCI) [23, 24]. The data consist of 4 groups (A/B/C/D) belonging to epilepsy patients (Table 1). Records include EEG records of individuals. Each record is 23.6 seconds. 2300 EEG recordings were taken during the epileptic seizure. The other 2300 records (nonepilepsy) were recorded while in a healthy condition. However, the records belong to epileptic patients. The epilepsy data in each set are the same. However, nonepilepsy records are different. The database contains 178 features for each EEG recording.

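2.2. F-Score Feature Selection Algorithm
The F-score is a feature selection algorithm that helps distinguish classes from each other [25]. To select features, an F-score value ($F_i$) is calculated for each feature (equation (1)). The F-score threshold value ($F_T$) is determined by taking the average of all F-score values. For the $i$th feature, if $F_i > F_T$, the $i$th feature is selected. This step is repeated for each feature.
$$F_i = \frac{\left(\bar{x}_i^{(+)} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{(-)} - \bar{x}_i\right)^2}{\dfrac{1}{n_+ - 1}\sum_{k=1}^{n_+}\left(x_{k,i}^{(+)} - \bar{x}_i^{(+)}\right)^2 + \dfrac{1}{n_- - 1}\sum_{k=1}^{n_-}\left(x_{k,i}^{(-)} - \bar{x}_i^{(-)}\right)^2} \quad (1)$$
The variables in equation (1) are as follows: (1) the feature vectors $x_k$, with $k = 1, 2, \ldots, n$; (2) $n_+$ and $n_-$ are the total numbers of elements in the positive (+) and negative (−) classes, respectively, with $n = n_+ + n_-$; (3) $i$ is the feature number; (4) $\bar{x}_i$, $\bar{x}_i^{(-)}$, and $\bar{x}_i^{(+)}$ are the average value, the mean value in the negative class, and the mean value in the positive class of the $i$th feature, respectively; (5) $x_{k,i}^{(+)}$ represents the $k$th positive example of the $i$th feature, and $x_{k,i}^{(-)}$ represents the $k$th negative example of the $i$th feature.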
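The selection rule can be sketched in a few lines. The sketch assumes the commonly used two-class F-score (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances), labels coded as 1 (positive) and 0 (negative), and a strict "greater than the mean" threshold; the function names and toy data are hypothetical.

```python
def f_scores(X, y):
    """F-score of every feature for a two-class problem (labels 1 and 0)."""
    scores = []
    for i in range(len(X[0])):
        pos = [row[i] for row, label in zip(X, y) if label == 1]
        neg = [row[i] for row, label in zip(X, y) if label == 0]
        mean_all = (sum(pos) + sum(neg)) / (len(pos) + len(neg))
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        # Sample variances inside each class (denominator n - 1)
        var_pos = sum((v - mean_pos) ** 2 for v in pos) / (len(pos) - 1)
        var_neg = sum((v - mean_neg) ** 2 for v in neg) / (len(neg) - 1)
        between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
        scores.append(between / (var_pos + var_neg))
    return scores


def select_features(X, y):
    """Keep the features whose F-score exceeds the mean F-score."""
    scores = f_scores(X, y)
    threshold = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


# Feature 0 separates the classes; feature 1 has the same mean in both classes
X = [[0.0, 1.0], [0.2, 3.0], [0.1, 2.0],
     [1.0, 1.1], [1.2, 2.9], [1.1, 2.0]]
y = [0, 0, 0, 1, 1, 1]

selected = select_features(X, y)
print(selected)
```

The sketch does not guard against a zero denominator (a feature that is constant inside both classes); a practical implementation would need to handle that case.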
In the study, the features of datasets A, B, C, and D were selected with the F-score (Table 2). Feature selection was applied twice.

2.3. Ensemble Classifier
The ensemble classifier is a system created by combining different classifiers to produce more reliable and more stable estimates [26]. The system is built with a number of classifiers, which can be odd or even. For each feature vector, every classifier generates an output value. The output values are counted, and the output of the ensemble classifier is determined by the number of votes. If the number of classifiers is even, the average of the classifiers' decision values is rounded to give the decision of the ensemble classifier. This process is applied to all feature vectors. The ensemble classifier was prepared in MATLAB using three different classifiers: kNN, PNN, and SVMs [27].
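The voting rule for binary decisions can be sketched as follows. The sketch assumes decisions coded as 0 and 1 and rounds an average of exactly 0.5 upward; the article does not state how such ties are resolved, so that choice and the function name `combine` are assumptions.

```python
import math


def combine(predictions):
    """Combine binary predictions (one list per classifier) sample by sample."""
    combined = []
    for votes in zip(*predictions):
        mean = sum(votes) / len(votes)
        # Round half up (assumption); Python's built-in round() would
        # round 0.5 down to 0 because it rounds halves to the even value.
        combined.append(math.floor(mean + 0.5))
    return combined


# Three classifiers, three samples: an ordinary majority vote
odd = combine([[1, 0, 1],
               [1, 1, 0],
               [0, 0, 1]])

# Four classifiers, two samples: both samples are 2-2 ties
even = combine([[1, 0],
                [1, 0],
                [0, 1],
                [0, 1]])
print(odd, even)
```

With an odd number of classifiers, rounding the average is identical to majority voting; the rounding convention only matters for ties, which can occur only when the number of classifiers is even.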
The kNN is a supervised machine learning classification method [28, 29]. Based on the structure of the training dataset, a new sample is classified according to its k nearest neighbors. In this study, a value of k was selected, and ten distance functions were used: Spearman, Seuclidean, Minkowski, Mahalanobis, Jaccard, Hamming, Euclidean, Cosine, Correlation, and Cityblock.
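A minimal kNN sketch with an interchangeable distance function is given below. Only two of the ten distance functions are shown, and k = 3 and the toy data are assumptions chosen for illustration; they are not the settings used in the study.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Label the query by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]

# Try each distance function on the same query, as a parameter sweep would
for dist in (euclidean, cityblock):
    print(dist.__name__, knn_predict(train_X, train_y, [0.5, 0.5], 3, dist))
```

Passing the distance as a function makes it straightforward to evaluate several distance definitions on the same data and keep the best-performing one.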
PNN is a kernel-based Bayesian statistical classification algorithm [30]. The method was developed on the basis of feedforward networks [30]. The classifier takes all class elements into account when processing [31]. A radial basis kernel function calculates the distance between class samples. The user can adjust the spread parameter of the PNN classifier. As the spread parameter approaches zero, the network begins to behave like a nearest neighbor classifier [32]. As the value moves farther from zero, the classifier takes several vectors into account when separating the data [32]. In the study, PNN networks were designed with a total of 500 different spread values, ranging from 0.01 to 5 in steps of 0.01. At the end of the study, the best-performing network parameters and the performance criteria were calculated.
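The role of the spread parameter and the 500-value search can be sketched as below. The Gaussian kernel form, the per-class averaging, the validation data, and all names are assumptions for illustration; MATLAB's PNN implementation may scale the kernel differently.

```python
import math


def pnn_predict(train_X, train_y, query, spread):
    """Pick the class with the largest average Gaussian kernel response."""
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        kernel = math.exp(-sq_dist / (2 * spread ** 2))
        sums[label] = sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda label: sums[label] / counts[label])


# 500 spread values: 0.01, 0.02, ..., 5.00
spreads = [round(0.01 * i, 2) for i in range(1, 501)]

train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]
val_X = [[0.5, 0.5], [5.5, 5.5]]
val_y = [0, 1]


def accuracy_for(spread):
    hits = sum(pnn_predict(train_X, train_y, q, spread) == t
               for q, t in zip(val_X, val_y))
    return hits / len(val_y)


# Keep the first spread value that reaches the highest validation accuracy
best_spread = max(spreads, key=accuracy_for)
print(best_spread, accuracy_for(best_spread))
```

For very small spread values the kernel responses can underflow to zero for every class, in which case this sketch falls back to the first class encountered; that is one practical reason to search over a range of spread values rather than fix a tiny one.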
SVMs are among the best machine learning algorithms [33]. They can be used for regression as well as classification [33]. SVMs try to separate the classes in a dataset with a linear or nonlinear boundary. The purpose of the SVM algorithm is to distinguish between the data with the minimum error [34]. A Gaussian (radial basis function, RBF) kernel (rbf) was used in the study. The BoxConstraint parameter was varied between 1 and 100 to obtain the best performance.
2.4. Ensemble Classifier Powered by the F-Score
In this study, two different ensemble classifiers were developed: the classifier-based FsBoost.V1 and the feature-based FsBoost.V2.
2.4.1. Classifier-Based Ensemble Classifier: FsBoost.V1
The implementation steps of this method are shown in detail in Figure 2. First, a dataset (A) is classified with one classifier (kNN). In the second step, the first feature selection is performed, and the data are classified again with the same classifier (kNN). In the third step, the first and second feature selections are both performed, and the data are classified again with the same classifier (kNN). Thus, the data are classified in three different steps but with only one classifier (kNN). These three results are combined to form the kNN ensemble. The same process is repeated with PNN and SVMs. Finally, the kNN ensemble, the PNN ensemble, and the SVM ensemble are combined into a single ensemble classifier.
2.4.2. Feature-Based Ensemble Classifier: FsBoost.V2
The steps of this method are shown in Figure 3. First, a dataset (A) is classified by each classifier (kNN, PNN, and SVMs). These three classifiers are combined to obtain ensemble classifier 1. In the second step, the first feature selection is performed, and the process in the first step is repeated. In the third step, the first and second feature selection steps are performed together, and then the first process is repeated. Ensemble classifiers 1, 2, and 3 are combined to create the final ensemble classifier.
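The two variants described in Sections 2.4.1 and 2.4.2 differ only in the order of combination, which the following sketch makes explicit. It is not the authors' MATLAB implementation: the base classifiers are kNN stand-ins with different k instead of kNN, PNN, and SVMs, feature selection is computed on the training part only, labels are coded 0/1, and all names and data are hypothetical.

```python
def f_score_select(X, y):
    """Indices of features whose two-class F-score exceeds the mean F-score."""
    scores = []
    for i in range(len(X[0])):
        pos = [r[i] for r, t in zip(X, y) if t == 1]
        neg = [r[i] for r, t in zip(X, y) if t == 0]
        m_all = (sum(pos) + sum(neg)) / (len(pos) + len(neg))
        m_pos, m_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        v_pos = sum((v - m_pos) ** 2 for v in pos) / (len(pos) - 1)
        v_neg = sum((v - m_neg) ** 2 for v in neg) / (len(neg) - 1)
        scores.append(((m_pos - m_all) ** 2 + (m_neg - m_all) ** 2) / (v_pos + v_neg))
    mean_score = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s > mean_score]


def keep(X, idx):
    return [[r[i] for i in idx] for r in X]


def vote(predictions):
    """Majority vote per sample over binary predictions (odd count assumed)."""
    return [int(2 * sum(col) > len(col)) for col in zip(*predictions)]


def make_knn(k):
    """Stand-in base classifier: kNN with squared Euclidean distance."""
    def classify(train_X, train_y, test_X):
        out = []
        for q in test_X:
            ranked = sorted(zip(train_X, train_y),
                            key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], q)))
            labels = [t for _, t in ranked[:k]]
            out.append(int(2 * sum(labels) > len(labels)))
        return out
    return classify


def feature_levels(train_X, train_y, test_X):
    """All features, after the first selection, after the second selection."""
    idx1 = f_score_select(train_X, train_y)
    tr1, te1 = keep(train_X, idx1), keep(test_X, idx1)
    idx2 = f_score_select(tr1, train_y)
    tr2, te2 = keep(tr1, idx2), keep(te1, idx2)
    return [(train_X, test_X), (tr1, te1), (tr2, te2)]


def fsboost_v1(classifiers, train_X, train_y, test_X):
    """Level 1: one ensemble per classifier over the feature levels; Level 2: merge."""
    levels = feature_levels(train_X, train_y, test_X)
    per_classifier = [vote([clf(tr, train_y, te) for tr, te in levels])
                      for clf in classifiers]
    return vote(per_classifier)


def fsboost_v2(classifiers, train_X, train_y, test_X):
    """Level 1: one ensemble per feature level over the classifiers; Level 2: merge."""
    levels = feature_levels(train_X, train_y, test_X)
    per_level = [vote([clf(tr, train_y, te) for clf in classifiers])
                 for tr, te in levels]
    return vote(per_level)


train_X = [[0.0, 0.0, 1.0, 5.0], [0.2, 0.2, 3.0, 4.0], [0.1, 0.1, 2.0, 6.0],
           [1.0, 0.9, 1.1, 6.0], [1.2, 1.1, 2.9, 5.0], [1.1, 1.0, 2.0, 4.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[0.1, 0.1, 2.0, 5.0], [1.1, 1.0, 2.0, 5.0]]
classifiers = [make_knn(1), make_knn(3), make_knn(5)]

print(fsboost_v1(classifiers, train_X, train_y, test_X))
print(fsboost_v2(classifiers, train_X, train_y, test_X))
```

Both variants evaluate the same nine classifier and feature-level combinations; FsBoost.V1 groups them by classifier before merging, whereas FsBoost.V2 groups them by feature level. Stopping after Level 1 in FsBoost.V1 yields an ensemble that needs only one type of classifier.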
2.5. Performance Evaluation Criteria and Distribution of Data for Classification
Different performance evaluation criteria were used to test the accuracy of the proposed systems: the accuracy rate, sensitivity, specificity, kappa value, receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), and k-fold (10-fold) cross-validation accuracy.
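Four of these criteria follow directly from the confusion matrix of a two-class problem, as the sketch below shows. It assumes labels coded as 1 (epilepsy, positive) and 0 (nonepilepsy, negative) and uses Cohen's kappa for the kappa value; the function name and the example labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and Cohen's kappa for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    # Agreement expected by chance, from the row and column totals
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Kappa corrects the accuracy for agreement expected by chance, so with balanced classes an accuracy of 0.75 corresponds to a kappa of 0.5.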
While classifying the datasets, they were divided into two groups: training (50%) and test (50%) (Table 3).

3. Results
This work aims to develop a new algorithm that improves ensemble classifier performance. We have developed an algorithm (FsBoost) that is similar to the random subspace method but with a lower workload, faster execution, and better performance. The method, which is based on the F-score feature selection algorithm, has two versions (FsBoost.V1 and FsBoost.V2). The ensemble classifier is created with a single classifier in FsBoost.V1 (Figure 2, Level 1) and with at least three different classifiers in FsBoost.V2 (Figure 3, Level 1). The developed algorithms were tested with four two-class datasets (A, B, C, and D) (Table 3).
According to the FsBoost algorithm, the dataset features were selected twice using the F-score feature selection algorithm. For example, according to FsBoost.V1, the dataset (A) is classified with the same classifier after each feature selection (Figure 2, Level 1: kNN1, kNN2, and kNN3) (Table 4). The kNN ensemble was formed by combining the three kNN classifiers (Figure 2, Level 1) (Table 5). This process was repeated with the other classifiers to create the PNN ensemble and the SVM ensemble (Figure 2, Level 1) (Table 5). Then, the kNN ensemble, PNN ensemble, and SVM ensemble were combined to form the final ensemble classifier (Figure 2, Level 2) (Table 5). This process is repeated for each dataset (Tables 4–8).
 
Abbreviations used in Tables 4–8. DF: distance function, Sen: sensitivity, Spe: specificity, Acc: accuracy (%), NP: network parameters, FS: feature selection, NF: number of features, EC: ensemble classifier, E: epilepsy, and NE: nonepilepsy.
In FsBoost.V2, the dataset (A) is classified with different classifiers after each feature selection (Figure 3, Level 1—kNN1, PNN1, and SVM1) (Table 4). These three classifiers were combined to create ensemble 1 (Figure 3, Level 1—ensemble 1) (Table 9). Then, ensemble 1, ensemble 2, and ensemble 3 were combined to form the final ensemble classifier (Figure 3, Level 2) (Table 9). This process is repeated for each dataset (Tables 4–7 and 9). Finally, the FsBoost ensemble algorithm is also compared with the ensemble algorithms available in the literature (Table 10).
 
Abbreviations used in Tables 9 and 10. Sen: sensitivity, Spe: specificity, Acc: accuracy (%), FS: feature selection, NF: number of features, EC: ensemble classifier, E: epilepsy, and NE: nonepilepsy.
Accuracy rates for FsBoost.V1 and FsBoost.V2 are higher than those for single classifiers (Table 10). The FsBoost algorithm is well ranked compared to other boosting algorithms in the literature (Table 10). FsBoost.V1—Level 1—SVM ensemble method is the best method when compared with the literature (Table 10, Rank).
Three different datasets were used to reconfirm the results obtained. The distribution of datasets is shown in Table 11.
 
NF: number of features. 
In order to compare the FsBoost algorithm with boosting algorithms, three different datasets were reanalyzed. The results obtained from the analysis are summarized in Table 12. According to the results, the algorithm with the average best performance is the FsBoost.V1 Level 2 ensemble algorithm.
 
Acc: accuracy (%). 
4. Discussion and Conclusion
FsBoost is among the best-performing algorithms developed to date [4–7]. The method has very few steps and therefore provides results faster. Its high accuracy rate is a distinct advantage. Algorithms that are both accurate and fast are preferred in medical data classification, and FsBoost may be preferred in this regard.
FsBoost involves fewer calculations and steps than the algorithms in the literature [4–7]. Its accuracy rate compares well with that of other algorithms (Table 10) [4]. Considering these advantages, FsBoost may soon become a commonly used algorithm.
FsBoost algorithms are also suitable for use in biomedical signal processing, deep learning, and communication [35–37].
FsBoost can be used with three or more classifiers. In addition, FsBoost.V1 is a version of FsBoost that can be used with a single classifier. Achieving high performance with a single classifier is a distinct advantage of FsBoost.V1, and the F-score feature selection algorithm creates this advantage: by combining different features, the same data can be interpreted differently. The performance of FsBoost increases when the classifiers are strong. Therefore, it is recommended that the algorithm be used with robust classifiers. Ensemble classifiers usually produce a strong classifier by combining weak classifiers, whereas FsBoost depends on strong base classifiers. This is the weakness of FsBoost.
In conclusion, FsBoost is an alternative method for creating an ensemble classifier. A high-performance ensemble classifier can be created with a powerful classifier and the F-score feature selection algorithm.
Data Availability
The datasets used in this paper can be downloaded from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/index.html). The authors can send all the datasets to readers on request.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This project was funded by the Deanship of Science Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, under grant no. RG261140. The authors, therefore, gratefully acknowledge the DSR for its technical and financial support.
References
[1] L. Rokach, “Ensemble-based classifiers,” Artificial Intelligence Review, vol. 33, no. 1-2, pp. 1–39, 2010.
[2] M. K. Uçar, “Classification performance-based feature selection algorithm for machine learning: P-score,” Innovation and Research in BioMedical Engineering, 2020.
[3] Y. Freund and R. E. Schapire, “Experiments with a new boosting algorithm,” in Proceedings of the ICML ’96: 13th International Conference on Machine Learning, pp. 148–156, Bari, Italy, July 1996.
[4] R. Rojas, AdaBoost and the Super Bowl of Classifiers: A Tutorial Introduction to Adaptive Boosting, 2009.
[5] L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[6] T. K. Ho, “The random subspace method for constructing decision forests,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 832–844, 1998.
[7] P. Bühlmann, “Bagging, boosting and ensemble learning,” in Handbook of Computational Statistics: Concepts and Methods, J. E. Gentle, W. Karl Härdle, and Y. Mori, Eds., pp. 1–38, Springer-Verlag Berlin Heidelberg, Heidelberg, Germany, 2012.
[8] M. Toğaçar, B. Ergen, and Z. Cömert, “A deep feature learning model for pneumonia detection applying a combination of mRMR feature selection and machine learning models,” Innovation and Research in BioMedical Engineering, 2019.
[9] D. Guan, W. Yuan, Y.-K. Lee, K. Najeebullah, and M. K. Rasel, “A review of ensemble learning based feature selection,” IETE Technical Review, vol. 31, no. 3, pp. 190–198, 2014.
[10] N. Daldal, K. Polat, and Y. Guo, “Classification of multi-carrier digital modulation signals using NCM clustering based feature-weighting method,” Computers in Industry, vol. 109, p. 45, 2019.
[11] M. K. Uçar, M. Nour, H. Sindi, and K. Polat, “The effect of training and testing process on machine learning in biomedical datasets,” Mathematical Problems in Engineering, vol. 2020, Article ID 2836236, 17 pages, 2020.
[12] K. Polat, S. Şahan, H. Kodaz, and S. Güneş, “Breast cancer and liver disorders classification using artificial immune recognition system (AIRS) with performance evaluation by fuzzy resource allocation mechanism,” Expert Systems with Applications, vol. 32, no. 1, pp. 172–183, 2007.
[13] S. AlMuhaideb and M. E. B. Menai, “An individualized preprocessing for medical data classification,” Procedia Computer Science, vol. 82, pp. 35–42, 2016.
[14] N. Daldal, M. Nour, and K. Polat, “A novel demodulation structure for quadrate modulation signals using the segmentary neural network modelling,” Applied Acoustics, vol. 164, Article ID 107251, 2020.
[15] N. Daldal, “A novel demodulation method for quadrate type modulations using a hybrid signal processing method,” Physica A: Statistical Mechanics and Its Applications, vol. 540, Article ID 122836, 2020.
[16] K. Polat and S. Güneş, “A new feature selection method on classification of medical datasets: kernel F-score feature selection,” Expert Systems with Applications, vol. 36, no. 7, pp. 10367–10373, 2009.
[17] K. Polat and K. Onur Koc, “Detection of skin diseases from dermoscopy image using the combination of convolutional neural network and one-versus-all,” Journal of Artificial Intelligence and Systems, vol. 2, no. 1, pp. 80–97, 2020.
[18] J. Cai, J. Luo, S. Wang, and S. Yang, “Feature selection in machine learning: a new perspective,” Neurocomputing, vol. 300, pp. 70–79, 2018.
[19] A. Noor, “The utilization of E-health in the kingdom of Saudi Arabia,” International Research Journal of Engineering and Technology (IRJET), vol. 6, no. 9, 2019.
[20] A. Noor, L. Wang, B. Ahmed et al., “D4: deep drug-drug interaction discovery and demystification,” Bioinformatics, 2020.
[21] A. Noor, “Discovering gaps in Saudi education for digital health transformation,” International Journal of Advanced Computer Science and Applications, vol. 10, no. 10, pp. 105–109, 2019.
[22] A. Noor, A. Assiri, S. Ayvaz, C. Clark, and M. Dumontier, “Drug-drug interaction discovery and demystification using Semantic Web technologies,” Journal of the American Medical Informatics Association, vol. 24, no. 3, pp. 556–564, 2017.
[23] R. G. Andrzejak, K. Lehnertz, F. Mormann et al., “Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state,” Physical Review E: Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, vol. 64, no. 6, 2001.
[24] R. G. Elger CE Andrzejak, K. Lehnertz, C. Rieke, F. Mormann, and P. David, UCI Machine Learning Repository: Epileptic Seizure Recognition Data Set, University of California, Oakland, CA, USA, 2001.
[25] M. K. Uçar, M. R. Bozkurt, C. Bilgin, and K. Polat, “Automatic detection of respiratory arrests in OSA patients using PPG and machine learning techniques,” Neural Computing and Applications, vol. 28, no. 10, pp. 2931–2945, 2017.
[26] L. Rokach, A. Schclar, and E. Itach, “Ensemble methods for multi-label classification,” Expert Systems with Applications, vol. 41, no. 16, pp. 7507–7523, 2014.
[27] A. S. D. P. Wallisch, M. E. Lusignan, M. D. Benayoun, T. I. Baker, and N. G. Hatsopoulos, “MATLAB for neuroscientists: an introduction to scientific computing in MATLAB,” Journal of Undergraduate Neuroscience Education, vol. 13, no. 1, 2014.
[28] S. Şahan, K. Polat, H. Kodaz, and S. Güneş, “A new hybrid method based on fuzzy-artificial immune system and k-nn algorithm for breast cancer diagnosis,” Computers in Biology and Medicine, vol. 37, no. 3, pp. 415–423, 2007.
[29] M. Khan, Q. Ding, and W. Perrizo, “k-nearest neighbor classification on spatial data streams using P-trees,” in Advances in Knowledge Discovery and Data Mining, pp. 517–528, Springer Berlin Heidelberg, Heidelberg, Germany, 2002, Chapter Lecture No.
[30] A. Khamis, S. Hussain, A. Mohamed, and E. Bizkevelci, “Islanding detection in a distributed generation integrated power system using phase space technique and probabilistic neural network,” Neurocomputing, vol. 148, pp. 587–599, 2015.
[31] E. Parzen, “On estimation of a probability density function and mode,” The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 1065–1076, 1962.
[32] P. D. Wasserman, Advanced Methods in Neural Computing, Van Nostrand Reinhold, New York, NY, USA, 1993.
[33] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[34] V. N. Mandhala, V. Sujatha, and B. Renuka Devi, “Scene classification using support vector machines,” in Proceedings of the 2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies, pp. 1807–1810, Ramanathapuram, India, May 2014.
[35] M. Arican, K. Polat, and K. Polat, “Binary particle swarm optimization (BPSO) based channel selection in the EEG signals and its application to speller systems,” Journal of Artificial Intelligence and Systems, vol. 2, no. 1, pp. 27–37, 2020.
[36] A. Ozdemir and K. Polat, “Deep learning applications for hyperspectral imaging: a systematic review,” Journal of the Institute of Electronics and Computer, vol. 2, no. 1, pp. 39–56, 2020.
[37] N. Daldal, Z. Cömert, and K. Polat, “Automatic determination of digital modulation types with different noises using convolutional neural network based on time–frequency information,” Applied Soft Computing Journal, vol. 86, Article ID 105834, 2020.
Copyright
Copyright © 2020 Adeeb Noor et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.