An Improved Animal Migration Optimization Algorithm for Clustering Analysis
Animal migration optimization (AMO) is a recently introduced algorithm based on the migration behavior of animal swarms. This paper presents an improved AMO algorithm (IAMO), which significantly improves on the original AMO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique used in many fields. The best-known method for solving clustering problems is the k-means clustering algorithm; however, it depends heavily on the initial solution and easily falls into local optima. To overcome these defects of the k-means method, this paper applies IAMO to the clustering problem and tests it on synthetic and real-life data sets. The simulation results show that the algorithm performs better than the k-means, PSO, CPSO, ABC, CABC, and AMO algorithms in solving the clustering problem.
1. Introduction
Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. It is a main task of exploratory data mining and a common technique for statistical data analysis used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. Cluster analysis originated in anthropology with Driver and Kroeber in 1932, was introduced to psychology by Zubin in 1938 and Tryon in 1939, and was famously used by Cattell beginning in 1943 for trait theory classification in personality psychology. Many clustering methods have been proposed; they fall into two main categories: hierarchical and partitional. The k-means clustering method is one of the most commonly used partitional methods. However, the results of k-means depend heavily on the initial solution, and the method easily falls into local optima. Zhang et al. proposed an improved k-means clustering algorithm called k-harmonic means, but the accuracy of the results obtained by that method is not high enough.
In recent years, many optimization techniques have been inspired by animal behavior, such as the firefly algorithm (FA), cuckoo search (CS), the bat algorithm (BA), artificial bee colony (ABC), and particle swarm optimization (PSO). Because of their global search ability, parallel efficiency, robustness, and universality, these bioinspired algorithms have been widely used in constrained optimization and engineering optimization [9, 10], scientific computing, automatic control, and clustering problems [11–21]. In 1991, Colorni et al. presented the ant colony optimization (ACO) algorithm, based on the behavior of ants seeking a path between their colony and a source of food. Shelokar et al. and Kao and Cheng then solved the clustering problem using the ACO algorithm [17, 18] in 2004 and 2006, and Niknam et al. proposed an efficient hybrid evolutionary algorithm combining ACO and SA for the clustering problem [15, 16] in 2008. Eberhart and Kennedy proposed the particle swarm optimizer (PSO) algorithm, which simulates the movement of organisms in a bird flock or fish school, in 1995; it was adopted for the clustering problem by van der Merwe and Engelbrecht and by Omran et al. [19, 23] in 2003 and 2005. Kao et al. presented a hybrid approach combining the k-means algorithm, Nelder-Mead simplex search, and PSO for clustering analysis in 2008. Niknam et al. presented a hybrid evolutionary algorithm based on PSO and SA (the simulated annealing algorithm, 1989) to solve the clustering problem in 2009. Zou et al. proposed a cooperative artificial bee colony algorithm for the clustering problem and evaluated its performance on synthetic and real-life data sets in 2010. Niknam and Amiri proposed an efficient hybrid approach based on PSO, ACO, and k-means, called PSO-ACO-K, for clustering analysis in 2010.
The artificial bee colony (ABC) algorithm was described by Karaboga in 2005 and was adopted to solve the clustering problem by Karaboga and Ozturk in 2011. Voges and Pope used an evolutionary-based rough clustering algorithm for the clustering problem in 2012. Chen et al. used a monkey search algorithm for clustering analysis in 2014.
Animal migration optimization (AMO) is a new bioinspired intelligent optimization algorithm, proposed by Li et al. in 2013, that simulates animal migration behavior. AMO models the widespread migration phenomenon in the animal kingdom: through changes of position and replacement of individuals, it gradually finds the optimal solution. AMO has obtained good experimental results on many optimization problems. This paper presents an algorithm that improves the performance of AMO. We propose a new migration method, based on an operator that shrinks the animals' living area, which drives AMO to converge rapidly toward the global optimum. By selecting the better solution space around the current solution, it improves search ability, accelerates convergence, and has a better chance of finding the global optimum.
The structure of the paper is as follows. Section 2 presents the traditional k-means method for clustering. Section 3 introduces the original AMO algorithm. Section 4 describes our proposed migration process. Section 5 elaborates the improved AMO and explains some biological foundations of animal behavior. Section 6 presents the experiments and discusses the results. Section 7 studies how the size of the shrinkage coefficient affects the proposed algorithm. At the end of the paper, we conclude with future directions and developments for the improved AMO.
2. The k-Means Clustering Algorithm
The target of data clustering is to group data into a number of clusters; k-means is one of the simplest unsupervised learning algorithms that solve the clustering problem. It was proposed by MacQueen in 1967. The procedure classifies a given data set $S = \{x_1, x_2, \ldots, x_n\}$ into a certain number of clusters (assume $k$ clusters $C_1, C_2, \ldots, C_k$) fixed a priori; each data vector is a $d$-dimensional vector, and the clusters satisfy the following conditions [29, 30]:
(1) $C_j \neq \emptyset$, $j = 1, 2, \ldots, k$;
(2) $C_i \cap C_j = \emptyset$, $i \neq j$;
(3) $\bigcup_{j=1}^{k} C_j = S$.
The k-means clustering algorithm is as follows.
(1) Set the number of clusters $k$ and the data set $S$.
(2) Randomly choose $k$ points as the cluster centroids $c_1, c_2, \ldots, c_k$ from $S$.
(3) Assign each object to the group that has the closest centroid. The principle of division is as follows: if $\lVert x_i - c_j \rVert < \lVert x_i - c_m \rVert$ for all $m = 1, 2, \ldots, k$ with $m \neq j$, then the data point $x_i$ is assigned to the cluster $C_j$.
(4) When all objects have been assigned, recalculate the positions of the centroids:
$$c_j = \frac{1}{n_j} \sum_{x_i \in C_j} x_i, \quad (1)$$
where $n_j$ is the number of points in the cluster $C_j$.
(5) Repeat Steps (3) and (4) until the centroids no longer move.
The main idea is to define $k$ centroids, one for each cluster. These centroids should be placed carefully, because different locations lead to different results; the better choice is to place them as far away from each other as possible. In this study, we use the Euclidean metric as the distance metric. For a data vector $x_i$ and a centroid $c_j$, both $d$-dimensional, the expression is as follows:
$$\lVert x_i - c_j \rVert = \sqrt{\sum_{m=1}^{d} (x_{im} - c_{jm})^2}. \quad (2)$$
Finally, the algorithm aims at minimizing an objective function, in this case a squared error function over the clusters $C_1, C_2, \ldots, C_k$:
$$J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^2. \quad (3)$$
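The k-means steps and the squared error objective described in this section can be sketched in plain Python. This is a minimal illustration, not the authors' MATLAB implementation: the function names (`euclidean`, `assign`, `recompute`, `squared_error`, `kmeans`) and the toy data are hypothetical, and keeping the old centroid for an empty cluster is an assumption the text does not address.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Step (3): index of the closest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
            for p in points]


def recompute(points, labels, centroids):
    """Step (4): mean of each cluster; an empty cluster keeps its old centroid (assumption)."""
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(old))
    return updated


def squared_error(points, labels, centroids):
    """Objective: sum of squared distances from each point to its assigned centroid."""
    return sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def kmeans(points, initial_centroids, max_iter=100):
    """Steps (3)-(5): alternate assignment and centroid update until nothing moves."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(max_iter):
        updated = recompute(points, assign(points, centroids), centroids)
        if updated == centroids:  # step (5): centroids no longer move
            break
        centroids = updated
    return centroids, assign(points, centroids)


points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [9.0, 9.0], [9.2, 8.8], [8.8, 9.2]]
start = random.Random(0).sample(points, 2)  # step (2): random initial centroids
centroids, labels = kmeans(points, start)
print(labels, squared_error(points, labels, centroids))
```

Because the initial centroids are drawn at random, different seeds can end in different partitions, which is the sensitivity to the initial solution discussed above.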
3. Animal Migration Optimization (AMO)
The animal migration algorithm can be divided into an animal migration process and a population updating process. In the migration process, the algorithm simulates how groups of animals move from their current position to a new position. During the population updating process, the algorithm simulates how animals are replaced according to a probabilistic method.
3.1. Animal Migration Process
During the animal migration process, an animal should obey three rules: avoid collisions with its neighbors; move in the same direction as its neighbors; and remain close to its neighbors. To define the local neighborhood of an individual, we use a topological ring, as illustrated in Figure 1. For the sake of simplicity, we set the length of the neighborhood to five for each dimension of the individual. Note that, in our algorithm, the neighborhood topology is static and is defined on the set of indices of vectors. If the index of an animal is $i$, then its neighborhood consists of the animals with indices $i-2, i-1, i, i+1, i+2$; if the index of the animal is 1, the neighborhood consists of the animals with indices $NP-1, NP, 1, 2, 3$; and so forth, where $NP$ is the population size. Once the neighborhood topology has been constructed, we select one neighbor randomly and update the position of the individual according to this neighbor, as in the following formula:
$$X_{i,G+1} = X_{i,G} + \delta \cdot (X_{\mathrm{neighborhood},G} - X_{i,G}), \quad (4)$$
where $X_{\mathrm{neighborhood},G}$ is the current position of the neighbor, $\delta$ is produced by a random number generator controlled by a Gaussian distribution, $X_{i,G}$ is the current position of the $i$th individual, and $X_{i,G+1}$ is the new position of the $i$th individual.
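A minimal Python sketch of this migration step follows. It assumes the update has the form "new position = current position + Gaussian factor × (neighbor position − current position)", uses 0-based indices with wrap-around for the ring, draws one neighbor and one standard normal factor per animal, and the names `ring_neighbors` and `migrate` are hypothetical.

```python
import random


def ring_neighbors(i, n):
    """Five-member ring neighborhood of animal i (0-based indices, wraps around)."""
    return [(i + offset) % n for offset in (-2, -1, 0, 1, 2)]


def migrate(population, rng):
    """Move every animal relative to one randomly chosen ring neighbor."""
    n = len(population)
    moved = []
    for i, animal in enumerate(population):
        neighbor = population[rng.choice(ring_neighbors(i, n))]
        delta = rng.gauss(0.0, 1.0)  # assumption: one standard normal draw per animal
        moved.append([x + delta * (nx - x) for x, nx in zip(animal, neighbor)])
    return moved


population = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [3.0, 2.0], [4.0, 1.5], [5.0, 3.0]]
print(ring_neighbors(0, len(population)))
print(migrate(population, random.Random(42)))
```

With 0-based indexing, the first animal's neighborhood wraps to the last two animals, which mirrors the index-1 case in the text.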
3.2. Population Updating Process
During the population updating process, the algorithm simulates how some animals leave the group and some join the new population. Individuals are replaced by new animals with a probability $Pa$, which is determined by the quality of the fitness. We sort fitness in descending order, so the probability for the individual with the best fitness is $1/NP$, where $NP$ is the population size, and the probability for the individual with the worst fitness, by contrast, is 1. The process is shown in Algorithm 1.
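In Algorithm 1, $r_1, r_2 \in [1, \ldots, NP]$ are randomly chosen integers, $r_1 \neq r_2 \neq i$. After the new solution $X_{i,G+1}$ is produced, it is evaluated and compared with $X_{i,G}$, and we choose the individual with the better objective fitness:
$$X_i = \begin{cases} X_{i,G}, & \text{if } f(X_{i,G}) \text{ is better than } f(X_{i,G+1}), \\ X_{i,G+1}, & \text{otherwise}. \end{cases} \quad (5)$$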
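The population updating step can be sketched in Python under stated assumptions. The rank-based probability (best animal 1/NP, worst animal 1) and the greedy selection follow the description above. The recombination rule used to build a new animal (a random animal plus random steps toward the best animal and toward a second random animal) is the form commonly given for AMO and is an assumption here, as are the function names and the choice to treat smaller fitness as better.

```python
import random


def replacement_probabilities(fitness):
    """Rank-based probability: best animal gets 1/NP, worst gets 1 (minimization)."""
    n = len(fitness)
    worst_first = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    probs = [0.0] * n
    for rank, i in enumerate(worst_first):
        probs[i] = (n - rank) / n
    return probs


def update_population(population, fitness_fn, rng):
    """Replace coordinates with rank-based probability, then keep the better animal."""
    n, dim = len(population), len(population[0])
    fitness = [fitness_fn(a) for a in population]
    best = population[min(range(n), key=lambda i: fitness[i])]
    probs = replacement_probabilities(fitness)
    result = []
    for i, animal in enumerate(population):
        # Two distinct random animals, both different from animal i.
        r1, r2 = rng.sample([r for r in range(n) if r != i], 2)
        candidate = list(animal)
        for j in range(dim):
            if rng.random() < probs[i]:
                # Assumed recombination rule; not reproduced in the text above.
                candidate[j] = (population[r1][j]
                                + rng.random() * (best[j] - animal[j])
                                + rng.random() * (population[r2][j] - animal[j]))
        # Greedy selection: the candidate survives only if it is strictly better.
        result.append(candidate if fitness_fn(candidate) < fitness[i] else list(animal))
    return result


def sphere(x):
    """Toy objective used only for this demonstration."""
    return sum(v * v for v in x)


population = [[3.0, -2.0], [0.5, 0.5], [-4.0, 1.0], [2.0, 2.0], [-1.0, -3.0]]
print(update_population(population, sphere, random.Random(7)))
```

Because of the greedy selection, no animal ends the step with a worse fitness than it started with.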
4. The New Migration Process Method
In AMO, the algorithm uses a migration process and a population updating process to find a satisfactory solution. The proposed algorithm uses a new migration process: the leader animal (the individual with the best fitness value) establishes a living area, and the animals migrate from their current locations into this new living area to simulate the animal migration process.
At first, animals live in a living area, as shown in Figure 2(a), moving, eating, drinking, reproducing, and so on; some individuals move randomly and their positions are updated, and we then calculate the best position of the animals with the fitness function and record it. As time goes on, however, the amount of food or water gradually diminishes, as shown in Figure 2(b), and some animals migrate from the current areas, which have no food and water, to a new area with abundant food and water, as shown in Figure 2(c). In Figure 2, the green parts represent the living areas with abundant food and water; animals can live in these areas. The yellow parts represent the areas that lack food or water; animals can no longer live in these areas, and they must migrate to a new living area (the green parts in Figure 2(c)). We shrink the living area after a period of time (as shown in Figures 2(a) and 2(c)), and the animals then migrate to the new living area continually. As a rule of thumb, the globally optimal solution is usually near the current best solution; in IAMO, the animals' living area becomes smaller and smaller (by formula (6)) after each iteration, and the individuals get closer and closer to the globally optimal solution, so the convergence velocity and precision of the algorithm can be improved to some extent.
Figure 2: (a) The living area at the current iteration; (b) animals begin to migrate; (c) the living area at the next iteration.
The boundary of the living area is established by
$$low = X_{best} - R, \qquad up = X_{best} + R, \qquad R = \rho \cdot R, \quad (6)$$
where $X_{best}$ is the leader animal (the current best solution), $low$ and $up$ are the lower and upper bounds of the living area, $R$ is the living area radius, $\rho$ is the shrinkage coefficient, $\rho \in (0, 1)$, and $low$, $up$, and $R$ are all row vectors. In general, the initial value of $R$ depends on the size of the search space. As iterations go on, a large value of $R$ improves the exploration ability of the algorithm and a small value of $R$ improves its exploitation ability.
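The living area mechanism can be sketched as follows. The sketch assumes that the area is the box "leader ± radius" and that the radius is multiplied by the shrinkage coefficient after each iteration. The relocation rule in `move_into_area` (resampling out-of-area coordinates uniformly inside the area) is a hypothetical choice for illustration, since the text only states that animals outside the area migrate into it; all function names are hypothetical as well.

```python
import random


def living_area(best, radius):
    """Lower and upper bounds of the box centered on the leader animal."""
    low = [b - r for b, r in zip(best, radius)]
    up = [b + r for b, r in zip(best, radius)]
    return low, up


def shrink(radius, rho):
    """Multiply the radius by the shrinkage coefficient rho (0 < rho < 1)."""
    return [rho * r for r in radius]


def move_into_area(population, low, up, rng):
    """Resample every coordinate that lies outside [low, up] (assumed relocation rule)."""
    return [[x if lo <= x <= hi else rng.uniform(lo, hi)
             for x, lo, hi in zip(animal, low, up)]
            for animal in population]


rng = random.Random(0)
best, radius = [5.0, 5.0], [4.0, 4.0]
population = [[0.0, 9.5], [5.5, 4.0], [9.9, 0.2]]
for _ in range(3):
    low, up = living_area(best, radius)
    population = move_into_area(population, low, up, rng)
    radius = shrink(radius, 0.5)
print(low, up, population)
```

In this toy run the leader is held fixed so that only the shrinking box changes; in the full algorithm the leader would be re-evaluated every iteration.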
5. The IAMO Algorithm for Solving Clustering Problem
5.1. Initializing the Population
During the initialization process, the algorithm begins by initializing a set of animal positions $X_1, X_2, \ldots, X_{NP}$; each animal position $X_i$ is a $(k \times d)$-dimensional vector, where $k$ is the number of cluster centers and $d$ is the dimension of the test set $S$. The cluster centers are $c_1, c_2, \ldots, c_k$; each center $c_j$ is a $d$-dimensional vector. The lower bound of the centers is the minimum of each column in the test set $S$, namely, $c_{min} = \min(S)$, and the upper bound of the centers is the maximum of each column, $c_{max} = \max(S)$. So we can initialize the position of an individual as $X_i = (c_1, c_2, \ldots, c_k)$, and the lower and upper bounds of the solution space are then $low = (c_{min}, \ldots, c_{min})$ and $up = (c_{max}, \ldots, c_{max})$.
Animals are randomly and uniformly distributed between the prespecified lower initial parameter bound $low$ and the upper initial parameter bound $up$. So the $j$th component of the $i$th vector is as follows:
$$x_{i,j} = low_j + rand_{i,j}[0, 1] \cdot (up_j - low_j), \quad (7)$$
where $rand_{i,j}[0, 1]$ is a uniformly distributed random number between 0 and 1.
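The initialization can be sketched in Python as below. The sketch assumes each animal is stored as a flat list holding the cluster centers one after another, so the per-column minimum and maximum of the data are repeated once per center to form the bounds; `column_bounds`, `init_population`, and the toy data are hypothetical names and values.

```python
import random


def column_bounds(data):
    """Per-column minimum and maximum of the data set."""
    columns = list(zip(*data))
    return [min(c) for c in columns], [max(c) for c in columns]


def init_population(data, k, pop_size, rng):
    """Uniformly sample pop_size animals, each encoding k cluster centers."""
    c_min, c_max = column_bounds(data)
    low, up = c_min * k, c_max * k  # bounds repeated once per cluster center
    population = [[lo + rng.random() * (hi - lo) for lo, hi in zip(low, up)]
                  for _ in range(pop_size)]
    return population, low, up


data = [[1.0, 10.0], [3.0, 20.0], [2.0, 15.0]]
population, low, up = init_population(data, k=2, pop_size=5, rng=random.Random(0))
print(low, up)
print(population[0])
```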
5.2. Animals Migration
During the migration process, because animals hunt, forage, or drink in the living area, some parts of the living area come to lack food or water, or the climate conditions change, and some animals migrate from the current living area to a new living area that has abundant food and water or climate conditions suitable for living. We assume that there is only one living area, and animals outside the new living area migrate into it, as described in Section 4. We calculate the distance between the cluster centers and the test data set, then classify the test data set into categories according to the distance, and, finally, obtain the fitness according to the fitness function:
According to the fitness function, we obtain the best individual $X_{best}$, and the new living area can be established by $low = X_{best} - R$ and $up = X_{best} + R$, where $R$ is the living area radius.
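The fitness evaluation described above can be sketched as follows. The sketch assumes the fitness of an animal is the total Euclidean distance from every data point to its nearest encoded cluster center, with smaller values being better; this particular form, the flat encoding of the centers, and the names `decode` and `fitness` are assumptions made for illustration.

```python
import math


def decode(position, k):
    """Split a flat animal position into k cluster centers of equal length."""
    d = len(position) // k
    return [position[j * d:(j + 1) * d] for j in range(k)]


def fitness(position, data, k):
    """Assumed fitness: sum over points of the distance to the nearest center."""
    centers = decode(position, k)
    total = 0.0
    for x in data:
        total += min(math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))
                     for c in centers)
    return total


data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0]]
candidates = [[0.0, 1.0, 10.0, 0.0], [5.0, 5.0, 6.0, 6.0]]
best = min(candidates, key=lambda p: fitness(p, data, 2))
print(best, fitness(best, data, 2))
```

The animal with the smallest fitness plays the role of the leader around which the new living area is built.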
5.3. Individuals in Population Updating
During the population updating process, the algorithm simulates how some animals are preyed on by their enemies, some animals leave the group, some join the group from other groups, and some new animals are born. In IAMO, we assume that the number of available animals is fixed, and every animal is replaced with a rank-based probability, as described in Section 3.2.
The specific implementation steps of the improved animal migration optimization algorithm (IAMO) are shown in Algorithm 2.
6. Numerical Simulation Experiments
All of the algorithms were programmed in MATLAB R2008a, and the numerical experiments were run on an AMD Athlon(tm) II X4 640 processor with 2 GB of memory.
The experimental results compare the IAMO clustering algorithm with six typical algorithms, namely, the PSO, CPSO, ABC, CABC, AMO, and k-means algorithms, on two artificial data sets and eight real-life data sets (Iris, teaching assistant evaluation (TAE), wine, seeds, StatLog (heart), Haberman’s survival, balance scale, and Wisconsin breast cancer), which are selected from the UCI machine learning repository.
Artificial Data Set One. This is a three-featured problem with five classes, where every feature of the classes was distributed according to Class 1-Uniform (85, 100), Class 2-Uniform (70, 85), Class 3-Uniform (55, 70), Class 4-Uniform (40, 55), and Class 5-Uniform (25, 40) [12, 14]. The data set is illustrated in Figure 3.
Artificial Data Set Two. This is a two-featured problem with four unique classes. A total of 600 patterns were drawn from four independent bivariate normal distributions.
Iris Data. This data set contains 150 random samples of flowers from the Iris species setosa, versicolor, and virginica, collected by Anderson. From each species, there are 50 observations of sepal length, sepal width, petal length, and petal width in cm. This data set was used by Fisher in his introduction of the linear-discriminant-function technique [11, 12, 33].
Teaching Assistant Evaluation. The data consist of evaluations of teaching performance over three regular semesters and two summer semesters of 151 teaching assistant (TA) assignments at the Statistics Department of the University of Wisconsin-Madison. The scores were divided into 3 roughly equal-sized categories (“low,” “medium,” and “high”) to form the class variable.
Wine Data. This is the wine data set, which is also taken from the UCI repository. These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. There are 178 instances with 13 numeric attributes in the wine data set. All attributes are continuous, and there are no missing attribute values [11, 12, 33].
Seeds Data. This data set consists of 210 patterns belonging to three different varieties of wheat: Kama, Rosa, and Canadian. From each variety, there are 70 observations of area $A$, perimeter $P$, compactness ($C = 4\pi A / P^2$), length of kernel, width of kernel, asymmetry coefficient, and length of kernel groove.
StatLog (Heart) Data. This data set is a heart disease database similar to a database already present in the repository (heart disease databases) but in a slightly different form.
Haberman’s Survival. The data set contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago’s Billings Hospital on the survival of patients who had undergone surgery for breast cancer. It records the survival status of each patient (two classes) together with the age of the patient at the time of operation, the year of operation, and the number of positive axillary nodes detected.
Balance Scale Data. This data set was generated to model psychological experimental results. Each example is classified as having the balance scale tip to the right, tip to the left, or be balanced. The attributes are the left weight, the left distance, the right weight, and the right distance. The correct way to find the class is to take the greater of (left-distance * left-weight) and (right-distance * right-weight). If they are equal, it is balanced.
Wisconsin Breast Cancer. It consists of 683 objects characterized by nine features: clump thickness, cell size uniformity, cell shape uniformity, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses. There are two categories in the data: benign (444 objects) and malignant (239 objects) [11, 12, 33].
Here we set the parameters of AMO and IAMO as follows. The population size of the AMO and IAMO is 100. In IAMO, the original living area radius and shrinkage coefficient . For the PSO, inertia weight , acceleration coefficients , , and population size . The population size of the CPSO is 20. The population size of the ABC and CABC are 50 and 10, respectively. In order to compare with other algorithms, the maximum generations of all algorithms are 100.
For every data set, each algorithm is run 20 times independently with a random initial solution. For the Art1 and Art2 data sets, once the randomly generated parameters are determined, the same parameters are used to test the performance of three algorithms. We ranked each algorithm according to its mean result. The results are kept to four digits after the decimal point. The mean value, the best value, the worst value, the standard deviation, and the rank value are recorded in Tables 1–10.
Tables 1–10 show that IAMO is more precise than the other algorithms on the ten data sets. As seen from the results, the IAMO algorithm provides the best value and a small standard deviation in comparison with the other methods. For the Art1 and Art2 data sets in Tables 1 and 2, which were randomly generated, IAMO obtained the best mean and the smallest standard deviation. The mean value obtained by IAMO on Art1 is 1718.2540, while ABC and CABC obtained 1718.5496 and 1718.4434, and IAMO is 4 orders of magnitude better than ABC and CABC. Similarly, on Art2, IAMO obtained 513.9035, while CPSO, ABC, and CABC obtained 513.9046, 513.9037, and 513.9037, respectively, and the standard deviation of IAMO is at least 2 orders of magnitude better than theirs. For the Iris data set, the mean value, the optimum value, and the worst value of IAMO are all 96.6555, which reveals the robustness of IAMO. CABC also found the best solution 96.6555, but its standard deviation is larger than that of IAMO, while the best solutions of AMO, PSO, CPSO, ABC, and k-means are 97.0751, 96.6567, 96.6580, 96.6566, and 99.4582, respectively. Table 4 shows the results of the algorithms on the TAE data set. The mean value of IAMO is 1491.0900, which is smaller than that of AMO, PSO, CPSO, ABC, CABC, and k-means over 20 runs. For the wine data set, IAMO reached the mean value 16292.1855, while CABC reached the mean value 16292.1982. The best and worst values of IAMO are 16292.1849 and 16292.1862, which are also better than the 16292.1858 and 16292.2094 obtained by CABC, and the standard deviation of IAMO is also the smallest. Table 6 provides the results of the algorithms on the seeds data set; the results of the IAMO and CABC algorithms are superior to those obtained by the others. Although IAMO and CABC reached the same mean value 311.7980, the standard deviation of IAMO is 1 order of magnitude better than that of CABC.
On the StatLog (heart) data set, whose results are given in Table 7, IAMO obtains the best value 10622.9824, the same as CABC, while the mean values of the two algorithms are 10622.9824 and 10622.9904, so IAMO is better than the CABC algorithm. For Haberman’s survival data set, the optimum value 2566.9888 is obtained by IAMO, ABC, and CABC, but the standard deviations of ABC and CABC are worse than that obtained by IAMO. The standard deviation of PSO is a little larger than that of CPSO. For the balance scale data set in Table 9, the mean, best, and worst values of IAMO are all 1423.8204, which reflects the stability of IAMO. The three best algorithms on this test data are IAMO, CABC, and ABC, and their best results are 1423.8204, 1423.8206, and 1423.8308. For the Wisconsin breast cancer data set in Table 10, the mean value, the best value, and the worst value of IAMO are all 2964.3870, which is clearly superior to k-means, PSO, CPSO, ABC, and AMO.
As seen from Tables 1–10, we can conclude that although the convergence rate of IAMO is not as quick as that of ABC and CABC at the beginning of the iterations, its final results are the best among all algorithms on all test data sets. Most results of ABC and CABC are better than those of PSO and CPSO, and the k-means algorithm is the worst on most of the test data sets.
Figures 5–14 show the convergence curves of the various algorithms on the different data sets. Figures 15 and 16 show the original data distribution of the Iris data set and the clustering result obtained by the IAMO algorithm.
7. Living Area Radius Evaluation
The performance and results of the proposed algorithm are greatly affected by the size of the living area. At the beginning of the iterations, a large living area radius improves the exploration ability of the algorithm, and, at the end of the iterations, a small radius improves its exploitation ability. We adopted a fixed shrinkage coefficient to change the living area radius after each iteration, as shown in formula (6). To study the extent of its impact on the proposed algorithm, we selected the Art1 and Iris data sets and used different shrinkage coefficients to evaluate the performance of the proposed algorithm.
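To make the effect of a fixed shrinkage coefficient concrete, the following worked calculation assumes that the radius is multiplied by the same coefficient after every iteration. The symbols $R_0$ (initial radius), $R_t$ (radius after $t$ iterations), and $\rho$ (shrinkage coefficient) are notation chosen here for illustration, and the three coefficient values are examples rather than the settings used in the experiments.

```latex
R_t = \rho^{t} R_0, \qquad
\frac{R_{100}}{R_0} =
\begin{cases}
0.6^{100} \approx 6.5 \times 10^{-23}, & \rho = 0.6,\\
0.9^{100} \approx 2.7 \times 10^{-5}, & \rho = 0.9,\\
0.99^{100} \approx 0.37, & \rho = 0.99.
\end{cases}
```

Under this assumption, a small coefficient collapses the living area within a few iterations, which is consistent with premature convergence to a local optimum, while a coefficient close to 1 leaves the area almost unchanged over 100 generations, which is consistent with a slow convergence rate.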
Figure 17 shows the results of an experiment on Art1; we can conclude that if we choose between 0.6 and 0.9, it has a better convergence precision than that of or . If we choose , IAMO algorithm plunges into local optima, and if we choose , the IAMO algorithm has a very low convergence rate. And likewise in Figure 18, for Iris test data set, IAMO algorithm quickly converged at global optimum before 30 iterations if we choose , while IAMO could not escape from poor local optima and to global optimum if we choose , , or . So the best for solving Iris data set must exist between 0.7 and 0.99.
The results suggest that a proper shrinkage coefficient can greatly improve the convergence velocity and convergence precision of the algorithm, whereas an improper one may cause IAMO to fall into a local optimum.
8. Conclusions
In this paper, to address the deficiencies of the AMO algorithm, we improved it with a new migration method based on shrinking the animals' living area. Simulations on 10 typical standard test data sets show that the IAMO algorithm generally has strong global search ability and local optimization ability and can effectively avoid the tendency of conventional algorithms to fall into local optima. IAMO improves the convergence precision of AMO and ranks first on all test data sets; therefore, it is practical and effective for solving clustering problems. Finally, how to define a proper and unified living area radius needs to be considered in subsequent work.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Science Foundation of China under Grant nos. 61165015 and 61463007, the Key Project of Guangxi Science Foundation under Grant no. 2012GXNSFDA053028, and the Key Project of Guangxi High School Science Foundation under Grant no. 20121ZD008.
References
K. R. Zalik, “An efficient k-means clustering algorithm,” Pattern Recognition Letters, vol. 29, no. 8, pp. 1385–1391, 2008.
B. Zhang, M. Hsu, and U. Dayal, “K-harmonic means: a data clustering algorithm,” Tech. Rep. HPL-1999-124, Hewlett-Packard Laboratories, 1999.
X.-S. Yang, Nature-Inspired Metaheuristic Algorithms, Luniver Press, 2008.
R. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, Nagoya, Japan, October 1995.
T. Niknam, B. Bahmani Firouzi, and M. Nayeripour, “An efficient hybrid evolutionary algorithm for cluster analysis,” World Applied Sciences Journal, vol. 4, no. 2, pp. 300–307, 2008.
Y. Kao and K. Cheng, An ACO-Based Clustering Algorithm, Springer, Berlin, Germany, 2006.
A. Colorni, M. Dorigo, and V. Maniezzo, Distributed Optimization by Ant Colonies, Elsevier Publishing, Paris, France, 1991.
E. H. L. Aarts and J. H. Korst, Simulated Annealing and Boltzmann Machines, John Wiley & Sons, 1989.
D. Karaboga, “An idea based on honey bee swarm for numerical optimization,” Tech. Rep. TR06, Erciyes University Press, Erciyes, Turkey, 2005.
J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, pp. 281–297, University of California Press, Berkeley, Calif, USA, 1967.
X. Chen and J. Zhang, “Clustering algorithm based on improved particle swarm optimization,” Journal of Computer Research and Development, pp. 287–291, 2012.
X. Liu, Q. Sha, Y. Liu, and X. Duan, “Analysis of classification using particle swarm optimization,” Computer Engineering, vol. 32, no. 6, pp. 201–213, 2006.
J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of the IEEE International Conference on Neural Networks, pp. 1942–1948, December 1995.
C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.
E. Anderson, “The irises of the Gaspé Peninsula,” Bulletin of the American Iris Society, vol. 59, pp. 2–5, 1935.
R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Annals of Eugenics, vol. 7, part 2, pp. 179–188, 1936.