Applied Computational Intelligence and Soft Computing

Volume 2016 (2016), Article ID 9569161, 6 pages

http://dx.doi.org/10.1155/2016/9569161

## Angle Modulated Artificial Bee Colony Algorithms for Feature Selection

Computer Engineering Department, Dumlupinar University, 43000 Kütahya, Turkey

Received 6 November 2015; Accepted 1 February 2016

Academic Editor: Thunshun W. Liao

Copyright © 2016 Gürcan Yavuz and Doğan Aydin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Optimal feature subset selection is an important and difficult task for pattern classification, data mining, and machine intelligence applications. The objective of feature subset selection is to eliminate irrelevant and noisy features in order to select optimum feature subsets and increase accuracy. A large number of features in a dataset increases the computational complexity, which leads to performance degradation. In this paper, to overcome this problem, the angle modulation technique is used to reduce the feature subset selection problem to a four-dimensional continuous optimization problem instead of representing the problem as a high-dimensional bit vector. To demonstrate the effectiveness of this representation and to determine the efficiency of the proposed method, six variants of Artificial Bee Colony (ABC) algorithms employ angle modulation for feature selection. Experimental results on six high-dimensional datasets show that Angle Modulated ABC algorithms improved the classification accuracy with smaller feature subsets.

#### 1. Introduction

Many data mining and machine learning applications suffer from the curse of dimensionality, in which a dataset involves a large number of features, often a mix of relevant and irrelevant ones [1]. When the number of features is large, the cost of acquiring the data increases, the performance of the classifier may be reduced, and generalization from the data becomes more difficult [2]. To cope with this problem, feature selection eliminates the redundant, uninformative, and noisy features while preserving the accuracy obtained with the selected feature subset.

Filter methods and wrapper methods are the two main strategies in feature selection [1]. In filter approaches, an algorithm selects the relevant feature subset based on data characteristics. Wrapper approaches, in contrast, use a classifier to evaluate candidate feature sets. Although the wrapper approaches involve the computational overhead of evaluating candidate feature subsets, they outperform filter approaches in terms of classification accuracy [3].

When a wrapper method is used, the problem of optimal feature subset selection can be seen as NP-Hard because the number of possible feature subsets in the search space is $2^n$, where $n$ is the number of features. Evolutionary computation techniques are well-known tools for tackling this kind of problem [2]. One of these techniques is the Artificial Bee Colony (ABC) algorithm, which mimics the foraging behavior of real bee colonies. In recent years, a few studies have proposed feature selection based on ABC algorithms. Palanisamy and Kanmani [4] used ABC for feature selection. However, that paper uses only the original ABC and gives no information about how the bit vector used in feature selection is generated. Another binary ABC algorithm for feature selection is proposed in [5], but its search equation modifies a candidate solution without interacting with the other solutions. Thus, the algorithm becomes a randomized algorithm that generates solutions without any interaction within the population. Moreover, in these approaches, candidate solutions are represented by a bit vector of size $n$. Therefore, for large scale instances, this may take more time and decrease classification accuracy. Besides, there are also some applications which use ABC in the feature selection step. Syarifahadilah et al. [6] proposed a feature selection method for biomarker identification. Uzer et al. [7] developed an ABC-based feature selection algorithm to diagnose liver diseases and diabetes. Akila et al. [8] identify a user based on analysis of human typing rhythm by using an ABC-based feature selection method.

In this study, ABC algorithms employ angle modulation based bit vector generation for feature selection for the first time. In the angle modulation based approach, an ABC algorithm, called the Angle Modulated Artificial Bee Colony (AMABC) algorithm, selects candidate feature sets with a bit vector obtained from a bit string generator that employs a trigonometric function. The main advantage of this approach is that an AMABC algorithm only has to optimize the trigonometric function, which has just four parameters in the continuous domain. Thus, a high-dimensional binary search space can be represented by a 4-dimensional continuous search space for any dataset. Consequently, any ABC algorithm variant applied to continuous optimization problems in the literature can be used for the feature selection problem. To show this, we have applied angle modulation to six ABC variants and examined its effect on finding relevant feature subsets in datasets having many features. The comparison shows that ABC algorithms with Angle Modulated feature selection significantly improve the classification accuracy using fewer features.

This paper is organized as follows. Section 2 briefly reviews the original ABC algorithm and the variants considered here. Section 3 elaborates on the application of angle modulation based ABC algorithms to feature selection. Experimental results are presented in Section 4. Finally, Section 5 concludes the paper.

#### 2. Artificial Bee Colony Algorithm

##### 2.1. The Original Artificial Bee Colony Algorithm

The Artificial Bee Colony (ABC) algorithm, which is inspired by the foraging behaviour of real bee colonies, was proposed for tackling optimization problems. It was first introduced by Karaboga [9] for bound-constrained continuous optimization problems. In the ABC algorithm, each candidate solution is treated as a food source located in the $D$-dimensional search space. The nectar amount of a food source corresponds to the fitness value of a candidate solution.

Colony life is organized by division of labour. The colony comprises three types of bees, employed bees, onlooker bees, and scout bees, which are specialized for different tasks. The employed bees forage outside the hive and communicate with onlooker bees through a series of dances when they return to the hive with news of a discovered food source. The onlooker bees obtain remarkably accurate information about the location and the quality of the discovered food sources. The attractiveness of the dance, which is taken as the selection probability of a food source, recruits the onlooker bees to help find new good food sources in the vicinity of the discovered one. A food source of low quality is eventually abandoned, and its employed bee then becomes a scout bee, which flies around looking for food in desirable spots. Based on this phenomenon, the ABC algorithm is composed of four main steps: Initialization Step, Employed Bees Step, Onlooker Bees Step, and Scout Bees Step. Except for the Initialization Step, the algorithm repeats the other steps until a stopping criterion is satisfied. The detailed description of these steps is as follows.

*(a) Initialization.* A number of initial solutions are created within the bounds of the search space using the following formula [9]:

$$x_{i,j} = x_j^{\min} + \phi_{i,j}\left(x_j^{\max} - x_j^{\min}\right), \tag{1}$$

where $x_j^{\min}$ is the lower bound and $x_j^{\max}$ is the upper bound for each decision variable $j$ of a solution $x_i$, and $\phi_{i,j}$ is a uniformly distributed random number generated between 0 and 1. Furthermore, other control parameters, such as limit, which represents the maximum number of visits for each solution, are initialized in this step.
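The Initialization Step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; it assumes the standard ABC initialization rule (each coordinate drawn uniformly between its lower and upper bound), and the names `init_population`, `lower`, `upper`, and `trials` are hypothetical.

```python
import random


def init_population(n_solutions, lower, upper, rng=None):
    """Create n_solutions random points inside the box [lower, upper].

    lower and upper are per-dimension bounds (assumed names, not from the paper).
    """
    if rng is None:
        rng = random.Random()
    population = []
    for _ in range(n_solutions):
        # x_ij = x_j^min + phi * (x_j^max - x_j^min), with phi ~ U(0, 1)
        solution = [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        population.append(solution)
    # One counter of unsuccessful trials per solution, later compared with limit.
    trials = [0] * n_solutions
    return population, trials
```

For the angle modulated setting described later in the paper, each solution would have four coordinates, so `lower` and `upper` would each hold four values.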

*(b) Employed Bees Step*. At this step, each employed bee visits a solution, $x_i$, to discover a better candidate solution, $v_i$, with the formula [9]

$$v_{i,j} = x_{i,j} + \phi_{i,j}\left(x_{i,j} - x_{k,j}\right), \tag{2}$$

where $x_i$ is the position of the reference solution, $x_k$ is a randomly selected solution with $k \neq i$, $j$ is a randomly selected dimension, and $\phi_{i,j}$ is a random number uniformly distributed in $[-1, 1]$. If the candidate solution, $v_i$, is better than $x_i$, it replaces $x_i$ and becomes the new solution. Otherwise, a counter which holds the total number of trials of $x_i$ is increased.
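A minimal Python sketch of one pass of the Employed Bees Step follows. It assumes the standard ABC search equation $v_{i,j} = x_{i,j} + \phi_{i,j}(x_{i,j} - x_{k,j})$ with $\phi_{i,j}$ uniform in $[-1, 1]$, a minimization problem, and a trial counter that is reset to zero on success (the reset is the usual convention and is an assumption here). The function name and arguments are hypothetical, and bound handling is left out for brevity.

```python
import random


def employed_bee_step(population, trials, objective, rng=None):
    """One pass of the employed bees over all solutions (minimisation assumed)."""
    if rng is None:
        rng = random.Random()
    n = len(population)
    dim = len(population[0])
    for i in range(n):
        # Random neighbour k different from i, and one random dimension j.
        k = rng.choice([s for s in range(n) if s != i])
        j = rng.randrange(dim)
        phi = rng.uniform(-1.0, 1.0)
        candidate = list(population[i])
        candidate[j] = population[i][j] + phi * (population[i][j] - population[k][j])
        # Greedy selection: keep the candidate only if it is strictly better.
        if objective(candidate) < objective(population[i]):
            population[i] = candidate
            trials[i] = 0  # assumption: counter is reset after an improvement
        else:
            trials[i] += 1
```

Because selection is greedy, a pass can never make any individual solution worse, which is the property the trial counter relies on.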

*(c) Onlooker Bees Step*. The onlooker bees also try to discover new solutions around the visited solutions, as the employed bees do. However, in this step, information about the quality of solutions discovered by employed bees is shared with the onlooker bees. Therefore, the solutions are not equally eligible to be visited; instead, each solution $x_i$ has a probability of selection that is defined as follows [9]:

$$p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n},$$

where $SN$ is the number of solutions and $fit_i$ is the fitness value of the solution $x_i$, which is defined as

$$fit_i = \begin{cases} \dfrac{1}{1 + f_i}, & f_i \geq 0, \\ 1 + \left|f_i\right|, & f_i < 0, \end{cases}$$

where $f_i$ is the objective value of solution $x_i$. If a solution has a higher quality, then the probability that an onlooker bee visits it becomes higher.
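The selection probabilities used by the onlooker bees can be computed as in the sketch below. It assumes the standard ABC fitness transformation ($1/(1+f)$ for non-negative objective values, $1+|f|$ otherwise) for a minimization problem; the function names are hypothetical.

```python
def fitness(objective_value):
    """Map an objective value (to be minimised) to a positive fitness value."""
    if objective_value >= 0:
        return 1.0 / (1.0 + objective_value)
    return 1.0 + abs(objective_value)


def selection_probabilities(objective_values):
    """Probability of each solution being visited by an onlooker bee."""
    fits = [fitness(f) for f in objective_values]
    total = sum(fits)
    return [fit / total for fit in fits]
```

An onlooker could then pick a solution index by roulette-wheel selection, for example with `random.choices(range(len(probs)), weights=probs)`.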

*(d) Scout Bees Step*. A solution can be visited several times by employed and onlooker bees to find new solutions. Once the number of unsuccessful trials reaches the limit value, the solution is marked as abandoned. Then, the employed bee that is responsible for the abandoned solution becomes a scout bee. The scout bee generates a new random solution using (1) from the Initialization Step.
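The Scout Bees Step can be illustrated with the short sketch below. It is a simplification under stated assumptions: every solution whose trial counter has reached `limit` is re-initialized in the same pass (some ABC implementations replace at most one solution per cycle), and new solutions are drawn uniformly within the bounds as in the Initialization Step. All names are hypothetical.

```python
import random


def scout_bee_step(population, trials, lower, upper, limit, rng=None):
    """Replace abandoned solutions by fresh random ones; return how many were replaced."""
    if rng is None:
        rng = random.Random()
    replaced = 0
    for i in range(len(population)):
        if trials[i] >= limit:
            # Same rule as initialization: uniform sample inside the bounds.
            population[i] = [lo + rng.random() * (hi - lo)
                             for lo, hi in zip(lower, upper)]
            trials[i] = 0
            replaced += 1
    return replaced
```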

##### 2.2. The Considered Artificial Bee Colony Variants

In this section, we briefly describe five ABC algorithms which we considered here as feature selection methods on various datasets.

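Modified ABC (MABC) was proposed by Akay and Karaboga [10]. The MABC algorithm modifies (2) as follows:

$$v_{i,j} = \begin{cases} x_{i,j} + \phi_{i,j}\left(x_{i,j} - x_{k,j}\right), & R_{i,j} < MR, \\ x_{i,j}, & \text{otherwise}, \end{cases}$$

where $R_{i,j}$ is a uniformly distributed random number in $[0, 1]$, $\phi_{i,j}$ is a uniformly distributed random number in $[-SF, SF]$, $MR$ is the modification ratio, and $SF$ is the scaling factor. While $MR$ controls the proportion of dimensions to be changed, $SF$ adjusts the perturbation range.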
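The MABC candidate generation can be sketched as follows, assuming the usual reading of the modification: every dimension is perturbed independently with probability `mr` (the modification ratio), and the perturbation coefficient is drawn uniformly from `[-sf, sf]` (the scaling factor). Using a single neighbour `k` for all dimensions is an assumption of this sketch, and the function name is hypothetical.

```python
import random


def mabc_candidate(population, i, mr, sf, rng=None):
    """Build an MABC-style candidate for solution i (sketch, not the reference code)."""
    if rng is None:
        rng = random.Random()
    n = len(population)
    k = rng.choice([s for s in range(n) if s != i])  # random neighbour, k != i
    candidate = list(population[i])
    for j in range(len(candidate)):
        # Each dimension is changed with probability mr.
        if rng.random() < mr:
            phi = rng.uniform(-sf, sf)
            candidate[j] = population[i][j] + phi * (population[i][j] - population[k][j])
    return candidate
```

With `mr` close to 0 the candidate stays almost identical to the reference solution, while `mr = 1` perturbs every dimension.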

Gbest guided ABC (GABC) [11, 12] uses information from the best solution found so far ($x_{gbest}$) to enhance the intensification behaviour in the search equation of the Employed Bees and the Onlooker Bees Steps. The modified search equation is as follows:

$$v_{i,j} = x_{i,j} + \phi_{i,j}\left(x_{i,j} - x_{k,j}\right) + \psi_{i,j}\left(x_{gbest,j} - x_{i,j}\right),$$

where $x_{gbest,j}$ is the $j$th dimension of $x_{gbest}$ and $\psi_{i,j}$ is a uniform random number in $[0, C]$. $C$ is a control parameter that adjusts the perturbation [11, 12]. It is a positive constant that is usually set to 1 [11].
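A minimal sketch of the GABC search equation is given below. It assumes the commonly used form in which a third term, weighted by a random number drawn uniformly from `[0, c]`, pulls the candidate towards the best solution found so far. The names `gabc_candidate`, `gbest`, and `c` are hypothetical.

```python
import random


def gabc_candidate(population, i, gbest, c=1.0, rng=None):
    """Build a gbest-guided candidate for solution i by changing one dimension."""
    if rng is None:
        rng = random.Random()
    n = len(population)
    k = rng.choice([s for s in range(n) if s != i])  # random neighbour, k != i
    j = rng.randrange(len(population[i]))            # random dimension
    phi = rng.uniform(-1.0, 1.0)
    psi = rng.uniform(0.0, c)
    x_ij = population[i][j]
    candidate = list(population[i])
    # Standard ABC term plus an attraction term towards the global best.
    candidate[j] = x_ij + phi * (x_ij - population[k][j]) + psi * (gbest[j] - x_ij)
    return candidate
```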

GbestDist guided ABC (GDABC) [11] is an improved variant of GABC. The search equation of GABC is modified so that a neighbour solution, $x_k$, is preferably selected according to a probabilistic selection rule, which is defined as

$$p_k = \frac{1 / d(x_i, x_k)}{\sum_{l \neq i} 1 / d(x_i, x_l)},$$

where $p_k$ is the probability that neighbour $x_k$ is chosen, $x_i$ is the location of a solution, and $d(x_i, x_k)$ is the Euclidean distance between two solution locations, $x_i$ and $x_k$ [11].
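The distance-based neighbour selection can be illustrated as follows. This sketch assumes that the probability of choosing a neighbour is proportional to the inverse of its Euclidean distance from the reference solution, so closer solutions are preferred; that weighting, the small guard against zero distance, and the function name are assumptions of this example.

```python
import math


def neighbour_probabilities(population, i):
    """Return {k: probability} for every neighbour k != i of solution i."""
    inverse = {}
    for k, x_k in enumerate(population):
        if k == i:
            continue
        # Euclidean distance; the floor avoids division by zero for duplicates.
        distance = max(math.dist(population[i], x_k), 1e-12)
        inverse[k] = 1.0 / distance
    total = sum(inverse.values())
    return {k: weight / total for k, weight in inverse.items()}
```

A neighbour index can then be drawn with `random.choices(list(probs), weights=list(probs.values()))`.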

The Chaotic ABC (CABC) algorithm [13] has three variants. In the first variant of CABC, the canonical uniform random number generator is replaced with a chaotic random generator using seven different chaotic maps for the Initialization Step. The second variant of CABC applies a chaotic search in the Onlooker Bees Step after the number of trials of a solution reaches $limit/2$. The details of the chaotic search can be found in [13]. The third version, which we considered in this study, is the combination of the first two variants of the CABC algorithm.

Enhanced ABC (EABC) [14] algorithm proposes two separate search equations for the Employed Bee Step and the Onlooker Bee Step to improve poor convergence performance. The search equation of the Employed Bee Step is defined as follows: where and are random number in the range and , respectively, where is a nonnegative constant and is a random number generated by standard deviation and normal distribution with mean . For the Onlooker Bees Step, in order to enhance exploitation, is used in the third term of the search equation instead of as follows:

#### 3. Angle Modulated Artificial Bee Colony Algorithms

Angle Modulated Artificial Bee Colony (AMABC) algorithms are used for finding an optimal solution of binary optimization problems by reducing the problem to a four-dimensional continuous optimization problem. To do so, the algorithm generates bit strings by employing a trigonometric function derived from the angle modulation technique [15], which is used in telecommunication systems. The trigonometric function is composed of sine and cosine functions as follows:

$$g(x) = \sin\bigl(2\pi (x - a)\, b \cos\left(2\pi (x - a)\, c\right)\bigr) + d,$$

where $x$ is the point at which the function is sampled. $g(x)$ has four coefficients ($a$, $b$, $c$, and $d$) which control the frequency of the sine and cosine functions or shift the function vertically. The coefficient values let the function generate different signals for a given range. Therefore, a number of bits can be generated from the values of $g(x)$ obtained at evenly separated intervals. Figure 1 shows bit string generation by using the trigonometric function for one setting of $a$, $b$, $c$, and $d$. Over a given range and with an interval of 1, a bit string can be generated by sampling the value of $g(x)$ at each point $x_i$ as follows:

$$\text{bit}_i = \begin{cases} 1, & g(x_i) > 0, \\ 0, & \text{otherwise}. \end{cases}$$

When we use angle modulation to generate bit strings, a binary problem can be represented as the task of finding the optimum coefficient values. Thus, the optimum binary vector solution to the original problem can be sampled from the resultant function at the evenly spaced intervals. The advantages of this approach for ABC algorithms are as follows:

(i) ABC algorithms were originally proposed for bound-constrained continuous optimization. They perform well and are competitive with contemporary algorithms for continuous optimization problems. However, the superiority of the binary variants of ABC algorithms has not yet been demonstrated in the literature. With this approach, ABC algorithms try to find appropriate values of the coefficients in continuous space instead of evolving bit strings in binary space.

(ii) This approach decreases the dimension of the problem. For example, a large scale binary problem instance can be represented by a four-dimensional problem instance in continuous space.

(iii) Several ABC variants proposed for continuous optimization can be applied easily to a binary optimization problem without modifying the original implementation of the algorithm.
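The bit string generation can be sketched in Python as follows. The sketch assumes the usual angle modulation function $g(x) = \sin(2\pi(x-a)\,b\cos(2\pi(x-a)\,c)) + d$, sampling points $x = 0, 1, \ldots, n-1$ (one per feature), and the rule that a positive sample yields bit 1. The sampling range and the helper names `angle_modulation`, `generate_bits`, and `selected_features` are assumptions made for illustration.

```python
import math


def angle_modulation(x, a, b, c, d):
    """Value of the angle modulation generating function at point x."""
    shifted = x - a
    return math.sin(2 * math.pi * shifted * b * math.cos(2 * math.pi * shifted * c)) + d


def generate_bits(coefficients, n_bits):
    """Sample the function at x = 0, 1, ..., n_bits - 1 and threshold at zero."""
    a, b, c, d = coefficients
    return [1 if angle_modulation(x, a, b, c, d) > 0 else 0 for x in range(n_bits)]


def selected_features(bits):
    """Indices of the features switched on by the bit string."""
    return [index for index, bit in enumerate(bits) if bit == 1]
```

In a wrapper setting, an ABC variant would search over the four coefficients only; each candidate `(a, b, c, d)` would be decoded with `generate_bits`, and the classifier would then be evaluated on the features returned by `selected_features`.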