Abstract

Clustering is one of the most commonly used approaches in data mining and data analysis. One technique that has attracted considerable attention in clustering research is k-means clustering, in which the observations are grouped into k clusters. However, obstacles such as the dependence of the results on the initial cluster centers and the risk of becoming trapped in local optima hinder overall clustering performance. The purpose of this research is to minimize the dissimilarity of all points of a cluster from the gravity center of that cluster, with respect to capacity constraints in each cluster, such that each element is allocated to only one cluster. This paper presents a new hybrid algorithm for finding the best cluster centers, based on the cluster center initialization algorithm (CCIA), the bees algorithm (BA), and differential evolution (DE), and known as CCIA-BADE-K. The performance of the proposed algorithm is evaluated on standard datasets. The evaluation results and the comparison with alternative algorithms in the literature confirm its superior performance and higher efficiency.

1. Introduction

Data clustering is one of the most important knowledge discovery techniques for extracting structure from a dataset and is widely used in data mining, machine learning, statistical data analysis, vector quantization, and pattern recognition. The aim of clustering is to partition data into clusters, so that the data in each cluster are as similar as possible to one another and as dissimilar as possible to the data in other clusters. Clustering algorithms can be broadly classified into hierarchical, partitioning, model-based, grid-based, and density-based clustering algorithms [1–3].

A hierarchical clustering algorithm divides a dataset into a number of levels of nested partitions. In partitioning algorithms, the observations of a dataset are decomposed into a set of clusters with the greatest similarity among members of the same group and the least similarity among members of different groups [4]. Dissimilarities are evaluated based on attribute values; generally, a distance criterion is used [5].

The k-means algorithm is a partitional clustering algorithm and one of the most popular algorithms, used in many domains. It is easy to implement and often practical. However, its results depend considerably on the initial state; in other words, its efficiency depends heavily on the initial cluster centers [6].

The main purpose of the k-means clustering algorithm is to minimize the dissimilarity of all objects in a cluster from their cluster center. Heuristic algorithms have been applied to the initialization problem of k-means, but they still risk being trapped in local optima. Therefore, a better clustering algorithm requires a way of escaping local optima [7].

Many studies have addressed this problem. For instance, Niknam and Amiri proposed a hybrid approach that combines particle swarm optimization and ant colony optimization with the k-means algorithm for data clustering [8], and Nguyen and Cios proposed a hybrid of k-means, genetic algorithm, logarithmic regression, and expectation maximization [9]. Kao et al. presented a hybrid of particle swarm optimization, Nelder-Mead simplex search, and genetic algorithm [10]. Krishna and Murty proposed an algorithm for cluster analysis called the genetic k-means algorithm [11]. Žalik proposed an approach for clustering without preassigning the number of clusters [12]. Maulik and Bandyopadhyay introduced a genetic-based algorithm to solve this problem and evaluated its performance on real data; they defined a spatial distance-based mutation operator for clustering [13]. Laszlo and Mukherjee proposed another genetic-based approach that exchanges neighboring cluster centers for k-means clustering [14]. Fathian et al. presented a technique for the clustering problem based on honey-bees mating optimization (HBMO) [15–17]. Shelokar et al. presented a solution to the clustering problem based on ant colony optimization [18]. Niknam et al. addressed this problem by combining simulated annealing and ant colony optimization [19]. Ng and Sung introduced a technique based on tabu search to find cluster centers [20, 21]. Niknam et al. introduced a hybrid approach that combines particle swarm optimization and simulated annealing to solve the clustering problem [22, 23].

Bees algorithms can be classified into two main categories: foraging-based honeybee algorithms and marriage-based honeybee algorithms. Each category contains many algorithms. The first category includes the artificial bee colony (ABC) [3, 24, 25], cooperative artificial bee colony (CABC) [26], parallel artificial bee colony (PABC) [27], bee colony optimization (BCO) [28, 29], bees algorithm (BA) [30], bee foraging algorithm (BFA) [31], and bee swarm optimization (BSO) [32]. The second category includes marriage in honey-bees optimization (MBO) [32], fast marriage in honey-bees optimization (FMBO) [33], and modified fast marriage in honey-bees optimization (MFMBO) [34].

One of the foraging-based algorithms is the bees algorithm, a population-based search algorithm developed by Pham et al. in 2006 [30]. The algorithm mimics the food foraging behavior of swarms of honeybees (Figure 3). In its basic version, the algorithm performs a kind of neighborhood search combined with random search and can be used for optimization problems [30].

Differential evolution is an evolutionary algorithm (EA) that has been widely applied to optimization problems, mainly in continuous search spaces [35]. Differential evolution was introduced by Storn and Price in 1995 [36]. Global optimization is necessary in fields such as engineering, statistics, and finance, but many practical problems have objective functions that are nonlinear, noisy, noncontinuous, and multidimensional or have many local minima and constraints. Such problems are difficult if not impossible to solve analytically. Differential evolution can be used to find approximate solutions to such problems. Evolutionary algorithms also include genetic algorithms, evolutionary strategies, and evolutionary programming. Differential evolution encodes solutions as vectors, and each new candidate solution is compared with its parent. If the candidate is better than its parent, it replaces the parent in the population. Differential evolution can be applied in numerical optimization [37, 38].

In this paper, a hybrid evolutionary technique is used to solve the k-means problem. The proposed algorithm helps the clustering technique to escape from being trapped in local optima and takes the benefits of both evolutionary algorithms. In this study, some standard datasets are used for testing the proposed algorithm. To obtain the best cluster centers, the proposed algorithm combines the advantages of BA (bees algorithm) and DE (differential evolution) with a data preprocessing technique called CCIA (cluster center initialization algorithm). The experiments show that the proposed CCIA-BADE-K algorithm selects the cluster centers efficiently.

The main contribution of this paper is a novel combination of evolutionary algorithms, the bees algorithm and differential evolution, hybridized with the CCIA (cluster center initialization algorithm) preprocessing technique to address the data clustering problem.

The rest of this paper is arranged as follows. In Section 2, the data clustering issue is introduced. In Sections 3 and 4, the classic principles of the BA and DE evolutionary algorithms are discussed. In Section 5, the suggested approach is introduced. In Section 6, the application of the proposed algorithm to clustering is described. In Section 7, experimental results of the proposed algorithm are shown and compared with PSO-ANT, SA, ACO, GA, ACO-SA, TS, HBMO, PSO, and k-means on benchmark data. Section 8 applies the algorithm to image segmentation.

2. Data Clustering

Clustering is defined as grouping similar objects, either physically or in the abstract. The objects inside one cluster have the greatest similarity with each other and the greatest dissimilarity with the objects of other groups [39].

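Definition 1. Suppose the set $X = \{x_1, x_2, \dots, x_n\}$ contains $n$ objects. The purpose of clustering is to group the objects into $k$ clusters $C = \{C_1, C_2, \dots, C_k\}$ such that each cluster satisfies the following conditions [40]: (1) $C_i \neq \emptyset$ for $i = 1, \dots, k$; (2) $C_i \cap C_j = \emptyset$ for $i \neq j$; (3) $\bigcup_{i=1}^{k} C_i = X$.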

According to this definition, the number of possible ways of clustering $n$ objects into $k$ clusters is obtained as follows:

$$N(n, k) = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} (k - i)^{n}. \quad (1)$$

In most approaches, the cluster number, that is, $k$, is specified by an expert. Relation (1) implies that even with a given $k$, finding the optimum solution for clustering is not so simple. Moreover, the number of possible solutions for clustering $n$ objects into $k$ clusters increases by the order of $k^{n}/k!$. So, obtaining the best way of clustering $n$ objects into $k$ clusters is an intricate NP-hard problem that needs to be settled by optimization approaches [5].
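To give a feel for how quickly the search space grows, the following minimal sketch counts the ways of partitioning $n$ objects into $k$ non-empty clusters (the Stirling number of the second kind). The function name `count_clusterings` is an illustrative assumption, not part of the original work.

```python
from math import comb, factorial


def count_clusterings(n: int, k: int) -> int:
    """Number of ways to split n objects into k non-empty clusters."""
    # Inclusion-exclusion sum; the total is always divisible by k!.
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)


if __name__ == "__main__":
    # Even for three clusters the count explodes as n grows.
    for n in (10, 20, 30):
        print(n, count_clusterings(n, 3))
```

Exhaustive enumeration is therefore impractical even for small datasets, which motivates the heuristic search discussed in the rest of the paper.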

2.1. The k-Means Algorithm

Many algorithms have been suggested for addressing the clustering problem, and among them the k-means algorithm is one of the most famous and most practical [41]. In this method, besides the input dataset, $k$ samples are introduced into the algorithm as the initial centers of the $k$ clusters. These $k$ representatives are usually the first $k$ data samples [39]. The way these representatives are chosen influences the performance of the k-means algorithm [42]. The four stages of this algorithm are as follows.

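Stage I. Choose $k$ data items randomly from $X = \{x_1, x_2, \dots, x_n\}$ as the cluster centers $(m_1, m_2, \dots, m_k)$.

Stage II. Based on relation (2), add every data item to the relevant cluster. For example, if relation (2) holds, the object $x_i$ from the set $X$ is added to the cluster $C_j$:

$$\|x_i - m_j\| < \|x_i - m_p\|, \quad 1 \le p \le k, \; p \neq j. \quad (2)$$

Stage III. Now, based on the clustering of Stage II, the new cluster centers are calculated by using relation (3) ($n_j$ is the number of objects in the cluster $C_j$):

$$m_j = \frac{1}{n_j} \sum_{x_i \in C_j} x_i, \quad 1 \le j \le k. \quad (3)$$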

Stage IV. If the cluster centers have changed, repeat the algorithm from Stage II; otherwise, the clustering is given by the resulting centers.
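The four stages can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the experiments; the function name `kmeans`, the `seed` and `max_iter` parameters, and the rule of keeping the old center when a cluster becomes empty are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of coordinate tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                       # Stage I
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Stage II: attach every point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Stage III: each new center is the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[j])  # assumption: keep empty cluster's center
        # Stage IV: stop once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(data, 2))
```

Changing `seed` changes the Stage I sample, which is exactly the source of the run-to-run variation discussed next.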

The performance of the k-means clustering algorithm relies on the initial centers, and this is a major challenge for the algorithm. Random selection of the initial cluster centers makes the algorithm yield different results for different runs over the same dataset, which is considered one of its potential disadvantages [43]. Combining k-means with heuristic search makes it less sensitive to center initialization, but the combination still has a tendency towards local optimality. In this algorithm, strong ties between data points and the nearest centers prevent the cluster centers from leaving their local dense regions [44].

The bees algorithm, first developed by Karaboga and Basturk [3] and Pham et al. in 2006 [30], is a swarm-based algorithm that searches for solutions independently. The algorithm was inspired by the food foraging behavior of swarms of honeybees. In its classic version, the algorithm combines neighborhood search with random search to solve optimization problems.

2.2. Algorithm for Finding Cluster Initial Centers

In this study, for efficiency, all data objects are first clustered with the k-means algorithm on each of their attributes in turn, to find the initial cluster centers to be used in the solutions. Based on the generated clusters, the pattern for an object is built up from the label it receives on each attribute.

Objects with the same pattern are placed in one cluster, and hence all objects are clustered. The number of clusters obtained in this stage will be larger than the original number of clusters. For more information, refer to [6]. In this paper, clustering is completed in two stages. The first stage is performed as discussed above, and in the second stage similar clusters are merged with each other until the given number of clusters is reached. Algorithm 1 shows the proposed approach for the initial clustering of data objects; the resulting cluster centers are called seed points.

(1) Input: Data set ($X$), Attribute set ($A = \{a_1, a_2, \dots, a_m\}$), Cluster number ($K$)
(2) Output: Clusters Seed-Set ($SC$)
(3) Begin
(4) While ($\forall a_j \in A$)
   (4.1) Compute Standard Deviation ($\sigma_j$) and Mean ($\mu_j$)
   (4.2) Compute the $K$ initial cluster centers for this attribute from $\mu_j$ and $\sigma_j$
   (4.3) Execute k-means on this attribute
   (4.4) Allocate cluster labels obtained from Step (4.3) to every data pattern
(5) Find unique patterns ($K' \ge K$) and cluster each data object with the obtained patterns.
(6) Return SC
(7) End

As can be observed in Algorithm 1, a cluster label is generated for every attribute of a data object, and this label is added to the data object's pattern. Objects with identical patterns are placed in one cluster. To produce the labels for an attribute, first the mean and standard deviation of that attribute are computed over all data objects. Thereafter, based on the mean and standard deviation, the range of attribute values is broken into identical intervals so that the tail of each interval serves as an initial cluster center. Using these initial centers, all data objects are then clustered on that attribute by the k-means method.
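The first stage of Algorithm 1 can be sketched as follows. The rule used for the initial per-attribute centers, placing center $s$ at the $(2s-1)/(2K)$ quantile of a normal distribution with the attribute's mean and standard deviation, is an assumption made for this example, since the exact formula is not reproduced here. The function names are hypothetical, and the second stage (merging similar clusters down to $K$) is omitted.

```python
from statistics import NormalDist, mean, pstdev


def kmeans_1d(values, centers, max_iter=100):
    """1-D k-means from the given initial centers; returns one label per value."""
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda s: abs(v - centers[s]))
                  for v in values]
        new_centers = []
        for s in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == s]
            new_centers.append(sum(members) / len(members) if members else centers[s])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def ccia_patterns(data, k):
    """Build one label pattern per object, one label per attribute (Step 4)."""
    patterns = [[] for _ in data]
    for j in range(len(data[0])):
        column = [row[j] for row in data]
        mu, sigma = mean(column), pstdev(column)          # Step 4.1
        # Step 4.2 (assumed rule): quantile-based centers around the mean.
        centers = [mu + sigma * NormalDist().inv_cdf((2 * s - 1) / (2 * k))
                   for s in range(1, k + 1)]
        labels = kmeans_1d(column, centers)               # Step 4.3
        for pattern, label in zip(patterns, labels):      # Step 4.4
            pattern.append(label)
    return [tuple(p) for p in patterns]


def seed_clusters(data, k):
    """Step 5: objects with identical patterns share a cluster."""
    groups = {}
    for row, pattern in zip(data, ccia_patterns(data, k)):
        groups.setdefault(pattern, []).append(row)
    return groups


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(seed_clusters(data, 2))
```

On real data the number of distinct patterns is usually larger than $K$, which is why the merging stage is needed.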

2.3. Fitness Function

To calculate the fitness of each solution, the distance between the cluster centers and each data object is used. To do this, first a set of cluster centers is generated randomly and then the data objects are clustered based on (2). Then, according to the centers obtained in the iteration step, the new cluster centers are calculated based on (3), and the fitness of the solution is calculated by relation (4) [40]:

$$f(m_1, \dots, m_k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \|x_i - m_j\|. \quad (4)$$
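As an illustration, the cost of a candidate set of centers can be computed as the total distance from each data object to its nearest center. This sketch assumes plain (unsquared) Euclidean distance; the function name `clustering_cost` is hypothetical.

```python
import math


def clustering_cost(points, centers):
    """Total distance from every point to the center of the cluster it falls in."""
    # Assigning each point to its nearest center is the allocation rule of relation (2).
    return sum(min(math.dist(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    points = [(0, 0), (2, 0), (10, 0)]
    print(clustering_cost(points, [(1, 0), (10, 0)]))
```

A lower value means a tighter clustering, so the search algorithms in the following sections treat this quantity as a cost to be minimized.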

3. The Dance Language of Bees

For honeybees, finding nectar is essential to survival. Scout bees lead others to specific sources of food by identifying the resources they have visited through movements known as “dancing.” These dances are very precise and fast and are performed in different directions. Dancers convey information about a food resource by specifying the direction, distance, and quality of the visited food source [45].

3.1. Describing the Dance Language

Two kinds of dance are observed in bees: the “round dance” and the “waggle dance” [46]. When a food resource is less than fifty meters away, bees perform the round dance, and when it is more than fifty meters away, they perform the waggle dance (Figure 1).

The dance encodes several concepts. The angle between the vertical and the waggle run is equal to the angle between the sun and the food resource. The dance “tempo” shows the distance of the food resource (Figure 2): a slower tempo means that the food resource is farther away, and vice versa [47]. Another concept is the duration of the dance; a longer dance means that the food resource is richer and better [45]. The audience consists of the other bees, which follow the dancer. In this algorithm, there are two kinds of bees: scouts are bees that find new food sources and perform the dance, and recruits are bees that follow the scout bees' dance and then forage. One of the first people to decode the meaning of the waggle dance was the Austrian ethologist Karl von Frisch.

The distance between the flowers and the hive is indicated by the duration of the waggle dance: flowers that are farther from the hive produce a longer waggle dance. Each hundred meters of distance between the flowers and the hive adds close to 75 milliseconds to the waggle phase of the dance.

3.2. Bee in Nature

A colony of honeybees can extend itself over long distances (more than 10 km) and in multiple directions simultaneously to exploit a large number of food sources. In principle, flower patches with plentiful amount of nectar or pollen that can be collected with less effort should be visited by more bees, whereas, patches with less nectar or pollen should receive fewer bees [35, 47].

The foraging process begins in a colony with the scout bees being sent out to search for promising flower patches. Scout bees move randomly from one patch to another. During the harvest season, a colony continues its exploration, keeping a percentage of the population as scout bees. When the scout bees return to the hive, those that found a patch, which is rated above a certain quality threshold (measured as a combination of some constituents, such as sugar content), deposit their nectar or pollen and go to the “dance floor” to perform a dance known as “waggle dance” [46]. The waggle dance is essential for colony communication and contains three pieces of information regarding a flower patch: the direction in which it will be found, its distance from the hive, and its quality rating (or fitness). This information helps the colony to send its bees to flower patches precisely, without using guides or maps [45]. After the waggle dance on the dancing floor, the dancers (i.e., scout bee) go back to the flower patch with follower bees that are waiting inside the hive. More follower bees are sent to patches that are most promising [48, 49]. The flowchart of bee algorithm is shown in Figure 4 [50].

The Basic Bee Algorithm is shown as in Algorithm 2 [51].

(1) Initialize population with random solutions. ($n$ scout bees are placed randomly in the search space.)
(2) Evaluate fitness of the population.
(3) While (Repeat optimization cycles for the specified number)
   (4) Select sites for neighborhood search. (Bees that have the highest fitness are chosen as “selected” and
       sites visited by them are chosen for neighborhood search.)
   (5) Recruit bees for selected sites (more bees for best sites) and evaluate fitness.
   (6) Select the fittest bee from each patch. (For each patch, only the bee with the highest fitness will be
       selected to form the next bee population.)
   (7) Assign remaining bees to search randomly and evaluate their fitness.
(8) End While
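The steps of Algorithm 2 can be sketched for a continuous minimization problem as follows. The sketch works with a cost (lower is better) instead of a fitness, and the parameter names (`n_selected`, `n_elite`, `recruits_elite`, `recruits_other`, `patch`) as well as their default values are assumptions chosen for illustration.

```python
import random


def bees_algorithm(cost, bounds, n_scouts=20, n_selected=5, n_elite=2,
                   recruits_elite=8, recruits_other=4, patch=0.1,
                   cycles=100, seed=0):
    """Basic bees algorithm minimizing `cost` inside box `bounds`."""
    rng = random.Random(seed)

    def random_site():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def neighbour(site):
        # Sample inside a patch around the site, clipped to the bounds.
        return [min(hi, max(lo, x + rng.uniform(-patch, patch) * (hi - lo)))
                for x, (lo, hi) in zip(site, bounds)]

    # Steps 1-2: random scouts, ranked by cost.
    population = sorted((random_site() for _ in range(n_scouts)), key=cost)
    for _ in range(cycles):                                          # Step 3
        next_population = []
        for rank, site in enumerate(population[:n_selected]):       # Step 4
            # Step 5: better sites receive more recruited bees.
            n_recruits = recruits_elite if rank < n_elite else recruits_other
            candidates = [site] + [neighbour(site) for _ in range(n_recruits)]
            next_population.append(min(candidates, key=cost))       # Step 6
        # Step 7: the remaining bees scout at random.
        next_population += [random_site() for _ in range(n_scouts - n_selected)]
        population = sorted(next_population, key=cost)
    return population[0]


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(bees_algorithm(sphere, [(-5.0, 5.0)] * 2))
```

Because each selected site competes with its own recruits, the best cost found never gets worse from one cycle to the next.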

4. Differential Evolution

Differential evolution is an evolutionary algorithm closely related to the standard genetic algorithm. The differential evolution algorithm evaluates the initial population by using probability motion and observation models, and population evolution is performed by using evolution operators [52]. The main idea of the algorithm is to generate a new solution for each existing solution by using one constant member and two random members. In each generation, the best member of the population is selected and the difference between each member of the population and the best member is calculated. Two random members are then selected and the difference between them is calculated. A multiple of this difference is added to the $i$th member, and thus a new member is created. The cost of each new member is calculated, and if it is lower, the new member replaces the $i$th member; otherwise, the previous value is kept in the next generation [35].

Differential evolution is a popular population-based algorithm that uses a floating-point (real-coded) representation, as follows [53]:

$$X_i^{G} = [x_{1,i}^{G}, x_{2,i}^{G}, \dots, x_{D,i}^{G}], \quad i = 1, 2, \dots, NP,$$

where $G$ is the generation (iteration) number, $i$ indexes the members of the population (of size $NP$), and $D$ is the number of optimization parameters. In each generation (each iteration of the algorithm), one donor vector $V_i^{G}$ is formed for each member $X_i^{G}$ of the population. The variants of DE differ in how the donor vector is built. In the first variant, named DE/rand/1, the donor vector for the $i$th member is generated from three members of the current generation, $X_{r_1}^{G}$, $X_{r_2}^{G}$, and $X_{r_3}^{G}$, chosen randomly such that

$$r_1 \neq r_2 \neq r_3 \neq i.$$

Then, the difference between two of the three selected vectors is calculated, multiplied by the coefficient $F$, and added to the third vector [53]. This gives the donor vector $V_i^{G}$. The calculation of the $j$th element of the $i$th donor vector can be written as follows [54]:

$$v_{j,i}^{G} = x_{j,r_1}^{G} + F \left( x_{j,r_2}^{G} - x_{j,r_3}^{G} \right).$$

To increase the exploration of the algorithm, a crossover operation is then performed. Differential evolution generally has two kinds of crossover, exponential and binomial [55]. In this paper, to save time, the binomial mode is used. Binomial crossover requires a set $J$ of element indices, which is constituted as in Algorithm 3.

(1) Begin
(2) First $j_{rand}$ is selected randomly between 1 and $D$
(3) $j_{rand}$ is added to set $J$
(4) For all values of $j = 1, \dots, D$ the following operations are repeated:
   (a) One random number $rand_j$ is generated that has uniform distribution between zero and one
   (b) If $rand_j$ is less than or equal to $CR$ then $j$ is added to set $J$
(5) End.
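Algorithm 3 translates almost line by line into code. In this minimal sketch the indices are 0-based, and the function name `crossover_indices` and its parameters are illustrative assumptions.

```python
import random


def crossover_indices(dim, cr, rng):
    """Indices whose elements are taken from the donor vector (binomial crossover)."""
    j_rand = rng.randrange(dim)        # step 2: one index is always selected
    selected = {j_rand}                # step 3
    for j in range(dim):               # step 4
        if rng.random() <= cr:         # steps 4(a)-(b)
            selected.add(j)
    return selected


if __name__ == "__main__":
    print(sorted(crossover_indices(8, 0.5, random.Random(0))))
```

Forcing one index into the set guarantees that the trial vector differs from the target vector in at least one element, even when the crossover rate is very small.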

Therefore, for each target vector $X_i^{G}$, there is a trial vector $U_i^{G}$ as follows [56]:

$$u_{j,i}^{G} = \begin{cases} v_{j,i}^{G}, & \text{if } rand_j \le CR \text{ or } j = j_{rand}, \\ x_{j,i}^{G}, & \text{otherwise}, \end{cases}$$

where $j = 1, 2, \dots, D$ and $rand_j$ is a uniformly distributed number in $[0, 1]$. The set $J$ guarantees that there is at least one difference between $U_i^{G}$ and $X_i^{G}$. In the next step, the selection process is performed between the target vector and the trial vector as follows:

$$X_i^{G+1} = \begin{cases} U_i^{G}, & \text{if } f(U_i^{G}) < f(X_i^{G}), \\ X_i^{G}, & \text{otherwise}, \end{cases}$$

where $f$ is the function to be minimized.

In this paper, to escape from premature convergence, two new merging strategies have been studied. Basic DE multiplies the difference vector by $F$, where $F$ is a control parameter between 0.4 and one [55, 57]. To improve the convergence of DE, this paper proposes a modified scaling that depends on $rand$, a uniformly distributed number between zero and one. In general, the steps of the DE algorithm are as in Algorithm 4.

(1) Define algorithm parameter
(2) Generate and evaluate initial population or solutions
(3) For all members of population perform the following steps
   (a) With mutation operator create a new trial solution
   (b) By using the crossover generate new solution and evaluate them
   (c) Replace new solution with current solution if new solution is better than current solution otherwise,
      the current solution is retained.
(4) Return to step three if termination condition is not achieved.

In Figure 5, the process of differential evolution is illustrated.
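The steps of Algorithm 4 can be sketched as a compact DE/rand/1 loop with binomial crossover. This is a minimal illustration with a fixed scale factor `f`; the default parameter values, the in-place population update, and the clipping of trial vectors to the bounds are assumptions of the example, not settings taken from the paper.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """DE/rand/1/bin minimizing `cost` inside box `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 2: random initial population and its costs.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(generations):                          # Step 4
        for i in range(pop_size):                         # Step 3
            # (a) Mutation: three distinct members, all different from i.
            r1, r2, r3 = rng.sample([r for r in range(pop_size) if r != i], 3)
            donor = [pop[r1][j] + f * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # (b) Binomial crossover; j_rand forces at least one donor element.
            j_rand = rng.randrange(dim)
            trial = [donor[j] if (rng.random() <= cr or j == j_rand) else pop[i][j]
                     for j in range(dim)]
            trial = [min(hi, max(lo, t)) for t, (lo, hi) in zip(trial, bounds)]
            # (c) Greedy selection between target and trial.
            trial_cost = cost(trial)
            if trial_cost < costs[i]:
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(differential_evolution(sphere, [(-5.0, 5.0)] * 3))
```

The greedy selection in step (c) means that no member of the population ever gets worse, which is the property the hybrid algorithm of the next section relies on.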

5. Proposed Algorithm

As noted in the previous sections, studies of the BA method have shown that it can be a powerful approach with sufficient performance to handle different types of nonlinear problems in various fields. However, it can become trapped in local optima. Recently, several ideas have been used to reduce this problem by hybridizing different evolutionary techniques such as particle swarm optimization, genetic algorithms, and simulated annealing. In most population-based evolutionary algorithms, new members are generated in each iteration and movement operations are then applied to explore new positions that offer better opportunities. To increase diversity, in the differential evolution algorithm all members have a chance of reaching the global optimum and moving in that direction. The local search ability of the best particle also depends on the other particles, through the selection of two other particles and the calculation of the difference between them. This situation may lead to local convergence.

In the proposed algorithm, to avoid random selection of the global best particle, competency-based selection is used to choose it: the better a particle is relative to the other solutions, the greater its probability of being selected.

The basic idea behind the proposed algorithm is that the solutions are grouped as in the bees algorithm.

In addition, the algorithm proposes a new approach to movement and to selecting the recruited bees for the selected sites. The algorithm classifies the sites into three groups, named elite sites, nonelite sites, and nonselected sites. To increase diversity, two modes of movement based on the differential evolution operators, a parallel mode and a serial mode, were used. The suggested algorithm tries to use the advantages of these algorithms to find the best cluster centers and to improve the simulation results. In other words, a preprocessing technique is first performed on the data and then the proposed hybrid algorithm is used to find the best cluster centers for the k-means problem.

The pseudocode and flowchart of the combined algorithm, called CCIA-BADE-K, are illustrated in Algorithm 5 and Figure 6, respectively.

Begin
 (a)  Find seed cluster centers (preprocessing)
 (b)  Create an initial bees population randomly with $n$ scout bees
 (c)  Calculate the objective function for each individual
 (d)  Sort and update best site ever found
 (e)  Select the elite sites, non-elite sites, and non-selected sites (three site groups)
 (f)  Determine number of recruited bees for each kind of site
 (g)  While (iteration < 100)
      (I)   For each selected kind of site
            % calculate the neighborhoods
            (1)   For each recruited bee
                  % Mutation
                  (2)   Choose target site and base site from this group
                  (3)   Random choice of two sites from this group
                  (4)   Calculate weighted difference site
                  (5)   Add to base selected site
                  % Crossover
                  (6)   Perform crossover operation with Crossover Probability
                  (7)   Evaluate the trial site that is generated
                  % update site
                  (8)   If cost of trial site is less than cost of target site
                  (9)      Select trial site instead of target site
                  (10)  else
                  (11)     Select target site
                  (12)  End if
      (II)  End (for of recruited bees)
 (h)  End (for of selected sites)
 (i)  Sort and update best site ever found
 (j)  End While
End

6. Application of CCIA-BADE-K on Clustering

This section presents the application of the CCIA-BADE-K algorithm to the clustering problem. To find the best cluster centers with the CCIA-BADE-K algorithm, the following steps should be taken and repeated.

Step 1 (generate the seed cluster center). This step is a preprocessing step to find the seed cluster center to choose the best interval for each cluster.

Step 2 (generate the initial bees' population randomly). In other words, generate initial candidate solutions for the cluster centers, where each solution is a vector of $k$ cluster centers and each center has $d$ dimensions:

$$S_i = [C_1^{i}, C_2^{i}, \dots, C_k^{i}], \qquad C_j^{i} = (c_{j1}^{i}, c_{j2}^{i}, \dots, c_{jd}^{i}), \qquad x_{\min} \le c_{jl}^{i} \le x_{\max},$$

where $C_j^{i}$ is the $j$th cluster center for the $i$th scout bee and $d$ is the number of dimensions of each cluster center. In fact, each solution in the algorithm is a $k \times d$ matrix. $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of each dimension (each feature of a center).

Step 3 (calculate the objective function for each individual). Calculate the cost function for each solution (each site) in this algorithm.

Step 4 (sort the solutions and select scout bees for each group). The sites are sorted based on the objective function value.

Step 5 (select the first group of sites). New solutions are found by selecting a group of sites. There are three groups of sites: the first group, the elite sites, is evaluated first to find the neighbors of each selected site, followed by the nonelite sites and finally the nonselected sites. To find the neighbors of each group of sites, either the serial mode or the parallel mode may be used. This algorithm uses the parallel mode.

Step 6 (select the number of bees for each site). The number of bees for each site depends on its group and is assigned by competence: more bees for better sites. If the site is rich, more bees are allocated to it. In other words, a better solution is rated as more important than the other sites.

Step 7 (perform the differential evolution operator (mutation)). In this step, the target site is chosen from the group of sites and then two other sites from this group are selected randomly to calculate the weighted difference between them. This difference is then added to the base site, as shown in the following equation:

$$V_i = X_{\text{target}} + F \left( X_{r_1} - X_{r_2} \right),$$

where $X_{\text{target}}$ is the target site, $F$ is the weight, and $X_{r_1}$ and $X_{r_2}$ are the selected sites from the target's group. $V_i$ is the donor solution from which the trial solution is built for comparison purposes.

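Step 8 (perform crossover operation with crossover probability). The recombination step incorporates successful solutions from the previous generation. The trial vector $U_i$ is developed from the elements of the target vector $X_i$ and the elements of the donor vector $V_i$. Elements of the donor vector enter the trial vector with probability $CR$:

$$u_{j,i} = \begin{cases} v_{j,i}, & \text{if } rand_j \le CR \text{ or } j = j_{rand}, \\ x_{j,i}, & \text{otherwise}, \end{cases}$$

where $rand_j \sim U[0, 1]$ and $j_{rand}$ is a random integer from $\{1, 2, \dots, D\}$ that ensures that $U_i \neq X_i$.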

Step 9 (calculate the cost function for trial site). In the selection step, the target vector $X_i$ is compared with the trial vector $U_i$, and the new site is determined by a greedy criterion: if $U_i$ is better than $X_i$, then $X_i$ is replaced with $U_i$; otherwise, $X_i$ survives and $U_i$ is discarded.

Step 10. If not all sites from this group are selected, go to Step 6 and select another site from this group; otherwise, go to the next step.

Step 11. If not all groups are selected, go to Step 5 and select the next group; otherwise, go to the next step.

Step 12 (check the termination criteria). If the current number of iterations has not reached the maximum number of iterations, go to Step 4 and start the next generation; otherwise, stop and return the best site found.
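Steps 2 to 12 can be sketched as a grouped differential evolution search over candidate sets of cluster centers. This is a simplified illustration under stated assumptions, not the implementation evaluated in Section 7: the CCIA seeding of Step 1 is omitted, the three groups are processed one after another, and the group sizes, recruit counts, `f`, `cr`, and the function names are hypothetical choices.

```python
import math
import random


def clustering_cost(points, flat_centers, k):
    """Total distance from each point to its nearest center (centers are flattened)."""
    d = len(points[0])
    centers = [flat_centers[j * d:(j + 1) * d] for j in range(k)]
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def hybrid_bade(points, k, group_sizes=(3, 4, 5), recruits=(4, 2, 1),
                f=0.5, cr=0.9, iterations=100, seed=0):
    """Bees-style site groups (elite, non-elite, non-selected) moved by DE operators."""
    rng = random.Random(seed)
    d = len(points[0])
    bounds = [(min(p[a] for p in points), max(p[a] for p in points))
              for a in range(d)] * k
    dim = len(bounds)
    # Step 2: one random site (k centers, flattened) per scout bee.
    sites = [[rng.uniform(lo, hi) for lo, hi in bounds]
             for _ in range(sum(group_sizes))]
    costs = [clustering_cost(points, s, k) for s in sites]          # Step 3
    for _ in range(iterations):                                     # Step 12
        order = sorted(range(len(sites)), key=costs.__getitem__)    # Step 4
        sites = [sites[i] for i in order]
        costs = [costs[i] for i in order]
        start = 0
        for size, n_recruits in zip(group_sizes, recruits):         # Steps 5-6
            group = list(range(start, start + size))
            start += size
            for target in group:                                    # Steps 10-11
                for _ in range(n_recruits):
                    # Step 7: mutation around the target with two group members.
                    r1, r2 = rng.sample([g for g in group if g != target], 2)
                    donor = [sites[target][j] + f * (sites[r1][j] - sites[r2][j])
                             for j in range(dim)]
                    # Step 8: binomial crossover, then clip to the data range.
                    j_rand = rng.randrange(dim)
                    trial = [donor[j] if (rng.random() <= cr or j == j_rand)
                             else sites[target][j] for j in range(dim)]
                    trial = [min(hi, max(lo, t)) for t, (lo, hi) in zip(trial, bounds)]
                    # Step 9: greedy selection.
                    trial_cost = clustering_cost(points, trial, k)
                    if trial_cost < costs[target]:
                        sites[target], costs[target] = trial, trial_cost
    best = min(range(len(sites)), key=costs.__getitem__)
    return sites[best], costs[best]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(hybrid_bade(data, 2))
```

Better-ranked groups receive more recruited bees per site, so more search effort is spent around the most promising center sets, while the greedy selection keeps the best cost from increasing between iterations.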

7. Evaluation

To evaluate the accuracy and efficiency of the proposed algorithm, experiments have been performed on two artificial datasets and four real-life standard datasets to determine the correctness of the clustering algorithms. The real-life collection includes the Iris, Glass, Wine, and Contraceptive Method Choice (CMC) datasets, which have been chosen from the standard UCI repository.

The suggested algorithm is coded in an appropriate programming language and is run on an i5 computer with a 2.60 GHz processor and 4 GB of main memory. For measuring the performance of the proposed algorithm, the benchmark datasets of Table 1 are used.

The execution results of the proposed algorithm over the selected datasets, together with the comparison figures for the k-means, PSO, and K-NM-PSO results in [10], are tabulated in Table 2. As seen in Table 2, the suggested algorithm provides superior results relative to the k-means and PSO algorithms. Comparisons with several optimization algorithms on the real-life datasets are also included.

For better study and analysis of the proposed approach, its execution results along with the HBMO, PSO, ACO-SA, PSO-ACO, ACO, PSO-SA, TS, GA, SA, and k-means clustering algorithm results reported in [8] are tabulated in Tables 3–6. It is worth mentioning that the algorithms investigated in [8] were implemented in MATLAB 7.1 on a Pentium IV system with a 2.8 GHz CPU and 512 MB of main memory.

The first artificial dataset is described by ($n$, $k$, $d$), where $n$ is the number of instances, $k$ is the number of clusters, and $d$ is the number of dimensions. The instances were drawn from four independent classes, where each of these groups was distributed as $N(\mu, \Sigma)$, where $\Sigma$ and $\mu$ are the covariance matrix and mean vector, respectively [10]. The first artificial dataset is shown in Figure 7(a). Figure 7(b) illustrates the clustered data after applying the CCIA-BADE-K algorithm.

The second artificial dataset is described by ($n$, $k$, $d$), where $n$ is the number of instances, $k$ is the number of clusters, and $d$ is the number of dimensions. The instances were drawn from four independent classes, where each of these groups was distributed as $N(\mu, \Sigma)$, where $\Sigma$ and $\mu$ are the covariance matrix and mean vector, respectively [10]. The second artificial dataset is shown in Figure 8, which also shows the clusters after applying the proposed algorithm to the artificial dataset.

In Tables 3–6, the best, worst, and average results are reported for 100 runs. The resulting figures represent the total distance of every data object from the cluster center to which it belongs, computed by using relation (4). As observed in the tables, the proposed algorithm generates acceptable solutions with regard to execution time.

To clarify the issue, a scatterplot (scatter graph) is illustrated in Figure 10. A scatterplot is a kind of mathematical diagram that shows the values of two variables of a dataset using Cartesian coordinates. In this diagram, the data are displayed as a set of points. This type of diagram is also known as a scatter diagram or scattergram. It is also used to display the relation between response variables and control variables when a variable is under the control of the experimenter. One of the strongest aspects of the scatter diagram is its ability to show nonlinear relationships between variables. Figure 9 displays the scatter diagram of the Iris dataset, and Figure 10 shows the clustered Iris data on the scatter diagram.

In Table 4, the best, worst, and average results for the Wine dataset are reported for 100 runs. The resulting figures represent the distance of every data point from its cluster center.

In Figure 11, the best cost and the average of the best costs are reported for all datasets over 100 runs. The resulting figures represent the distance of every data point from its cluster center, computed by using relation (4). Figure 11(a) shows the best cost and the mean of the best cost for the Iris dataset, Figure 11(b) for the Wine dataset, Figure 11(c) for the CMC dataset, and Figure 11(d) for the Glass dataset.

According to the results reported in Tables 3 to 6, the proposed method provides the best results on the Iris, CMC, and Wine datasets in comparison with the other algorithms mentioned. According to Table 6, the suggested algorithm provides more acceptable results on the Glass dataset than the alternative algorithms. This behavior can be explained by the fact that, as the number of data objects increases, the efficiency of the alternative algorithms decreases, while the efficiency of the suggested algorithm becomes more evident.

8. Image Segmentation

In Section 7, it was shown that the proposed CCIA-BADE-K algorithm is one of the best methods for data clustering. To investigate its performance further, the algorithm was tested on one standard image and one industrial image. Each digital image in RGB space is formed by three color components: red, green, and blue. Each of these three components alone is a grayscale image, and the numerical value of each pixel is between 0 and 255. An image histogram is a chart of the number of pixels in an image at each brightness level [58]. To obtain the histogram of an image, it is enough to scan all pixels of the image and count the number of pixels at each brightness level. The normalized histogram is obtained by dividing each histogram value by the total number of pixels, so that every normalized value lies in the interval [0, 1]. Figures 12 and 13 show the sample images used in this paper for image segmentation. In Figure 12, the color, grayscale, and clustered modes of these images are shown, and, in Figure 13, the histogram diagrams for these four images are shown. Furthermore, these histograms are used to segment an image.
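The histogram computation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in the experiments: it assumes an 8-bit grayscale image stored as a list of rows of integer intensities, and the function names `histogram` and `normalized_histogram` are hypothetical.

```python
def histogram(gray, levels=256):
    """Count the pixels at each brightness level 0..levels-1.

    Assumption: `gray` is a list of rows of integer intensities,
    as in an 8-bit grayscale image.
    """
    counts = [0] * levels
    for row in gray:
        for value in row:
            counts[value] += 1
    return counts


def normalized_histogram(gray, levels=256):
    """Divide each histogram count by the total number of pixels."""
    counts = histogram(gray, levels)
    total = sum(counts)
    return [c / total for c in counts]


# Toy 2x3 grayscale image, for illustration only.
image = [[0, 0, 255],
         [128, 255, 255]]

h = histogram(image)               # h[0] = 2, h[128] = 1, h[255] = 3
nh = normalized_histogram(image)   # nh[255] = 3 / 6 = 0.5
```

Because every count is divided by the total number of pixels, each normalized value lies between 0 and 1 and the values sum to 1, so the normalized histogram can be read as an empirical distribution of brightness levels.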

9. Concluding Remarks

In this paper, a new technique based on a combination of the bees algorithm and the differential evolution algorithm with k-means was presented. In the proposed algorithm, the bees algorithm was assigned to perform the global search and the differential evolution algorithm was assigned to perform the local search on the k-means problem, which is responsible for finding the best cluster centers. The proposed CCIA-BADE-K algorithm combines the abilities of both algorithms: each algorithm's strengths are used to cover the other's shortcomings in finding the best cluster centers, starting from the seed cluster centers produced by the proposed initialization algorithm. Experimental results showed that the CCIA-BADE-K algorithm achieves acceptable results.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to express their cordial thanks to the Ministry of Education (MoE), University Technology Malaysia (UTM), for the Research University Grant no. Q.J130000.2528.06H90. The authors are also grateful to Soft Computing Research Group (SCRG) for their support and incisive comments in making this study a success.