Abstract

Ant colony optimization (ACO) is often used to solve optimization problems such as the traveling salesman problem (TSP). When it is applied to the TSP, its runtime is proportional to the square of the problem size, which makes it inefficient for large instances. During the authors' long-term analysis of gene data using ACO, the following statistical feature was observed: when the data size becomes large, local clustering appears frequently. That is, some data cluster tightly in a small area and form a class, and the correlation between different classes is weak. This feature makes a divide-and-conquer approach feasible for estimating the solution of the TSP. In this paper an improved ACO algorithm is presented, which first divides all data into local clusters and calculates small TSP routes and then assembles them into one large TSP route. Simulation shows that the presented method improves the running speed of ACO by a factor of 200, provided that the data set exhibits local clustering.

1. Introduction

1.1. Introduction of Ant Colony Optimization (ACO)

In 1991, ant colony optimization (ACO) was first presented by Colorni et al. [1] and first applied to the TSP by Dorigo et al. [13]. Dorigo et al. thereby opened a research topic that is now studied by many scholars.

ACO is essentially an agent-based system that simulates the natural behavior of ants: real ants are able to find the shortest route from a food source to their nest without visual cues by exploiting pheromone information [2]. Ants deposit pheromone while walking along a route, and the pheromone provides heuristic information that other ants use to choose their routes. The denser the pheromone trail of a route, the more likely ants are to select it. Eventually, nearly all ants select the route with the densest pheromone trail, which is potentially the shortest route.
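The pheromone-guided choice described above can be sketched as a weighted random draw. This is a minimal illustration, not the selection rule of any particular ACO variant: practical ACO algorithms usually combine pheromone with a distance heuristic, and the function name `choose_route` and the dictionary layout are assumptions made for the example.

```python
import random

def choose_route(pheromone, rng=random):
    """Pick a route with probability proportional to its pheromone density.

    `pheromone` maps a route label to a non-negative pheromone amount
    (hypothetical layout chosen for this sketch).
    """
    routes = list(pheromone)
    weights = [pheromone[r] for r in routes]
    return rng.choices(routes, weights=weights, k=1)[0]

# A denser trail is chosen more often; a trail without pheromone is never chosen.
rng = random.Random(0)
picks = [choose_route({"short": 3.0, "long": 1.0}, rng) for _ in range(1000)]
print(picks.count("short") > picks.count("long"))
```

As more ants reinforce the shorter route, its weight grows and the draw concentrates on it, which is the positive feedback the paragraph describes.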

ACO has been applied widely and successfully to optimization problems such as the TSP [14], the quadratic assignment problem [5], image processing [6], data mining [7], classification and clustering analysis [8], and biology [9]. These applications have motivated the theoretical study of ACO. Gutjahr first analyzed the convergence property of ACO [10]. Stützle and Dorigo proved the important conclusion that ACO is able to find an optimal solution if its running time is long enough [11]. Another interesting property, revealed recently by Birattari et al., is that the sequence of solutions generated by some ACO algorithms does not depend on the scale of the problem instance [12].

ACO is especially well suited to difficult optimization problems on which traditional optimization methods are less efficient. However, ACO is not very efficient on large problems, because the running time is too long and the quality of the solution is still low. To address these two problems, the configuration of the parameters has been discussed [2, 3], and many further approaches have been proposed; among them, parallel computation and other methods are used to accelerate ACO [13]. In this study, we design a novel clustering algorithm named the special local clustering algorithm (SLC), which is used to classify the cities of a TSP instance. A colony of ants then acts on each class to obtain a local TSP path, and the convergence of the route length is used as the termination criterion of ACO. The experimental results indicate that the improved ACO is faster and produces higher-quality solutions on the test problems and that it is more robust than the compared approaches.

1.2. Clustering Correlates to the Running Time of ACO

One focus of ACO research is cutting down the running time. The running time of ACO is $O(N_c \cdot m \cdot n^2)$, where $N_c$, $m$, and $n$ denote the iteration number, the number of ants, and the number of cities, respectively [4]. The running time is thus proportional to $n^2$. Reducing the number of cities is the key to reducing the running time. Therefore, classifying all cities into different classes and letting ACO act on each class separately reduces the running time greatly. Hu and Huang used this method to improve the running speed of ACO [14]; their algorithm is named ACO-K-Means and is approximately 5–15 times faster than ACO. Simulations show that ACO-K-Means is valid only for sets of cities with an evident clustering feature and is invalid in more general situations. Nevertheless, ACO-K-Means shows that using a clustering method to improve the running speed of ACO is possible.
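The effect of splitting the cities can be checked with a small cost model. The sketch below assumes that the cost grows with the product of iterations, ants, and the square of the number of cities, that all classes have equal size, that iterations and ants stay unchanged per class, and that the cost of clustering and of joining the local routes is ignored; the function names are hypothetical.

```python
def aco_cost(iterations, ants, cities):
    """Operation count under the iterations * ants * cities**2 cost model."""
    return iterations * ants * cities ** 2

def clustered_cost(iterations, ants, cities, classes):
    """Cost of running ACO once per class, assuming equally sized classes.

    Clustering and route-joining overheads are deliberately left out.
    """
    per_class = cities // classes
    return classes * aco_cost(iterations, ants, per_class)

full = aco_cost(1000, 20, 1000)
split = clustered_cost(1000, 20, 1000, 10)
print(full / split)  # 10.0
```

Under these assumptions, splitting into k classes saves a factor of about k from the quadratic term alone; larger gains would additionally require fewer iterations or fewer ants per class.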

1.3. Introduction of Local Clustering Algorithm

Clustering classifies the objects of a set (called the training set) into different clusters (or groups, or classes) so that the data in each class (ideally) share some common traits. One of the most popular clustering algorithms is the K-Means algorithm [15, 16]. It assigns each point to the cluster whose center (i.e., centroid) is nearest to it and then updates the centroids. This process is repeated until a termination criterion is satisfied [16].
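The assign-and-update loop of K-Means can be sketched in a few lines. This is a plain textbook version for illustration; the termination test (largest centroid shift below `tol`) and the handling of empty clusters are assumptions of the sketch, not choices taken from [15, 16].

```python
import math

def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Plain K-Means on points given as tuples; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # assumed termination criterion for this sketch
            break
    return centroids, labels

cents, labs = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])
print(cents, labs)
```

The result depends on the initial centroids, which is the sensitivity the paper returns to when discussing the defect of ACO-SLC.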

During the $t$th iteration of the K-Means algorithm, the $i$th class has a distortion, defined as the average distance between each point of the class and the class centroid and denoted by $D_i^{(t)}$ ($i = 1, 2, \ldots, K$), where $K$ is the number of classes. Pang proves that for each $i$ the distortion sequence $\{D_i^{(t)}\}$ is convergent if the $i$th class is evidently separated from the other classes [16]. That is, the distortion sequence is locally convergent. Based on this property, an algorithm named the local clustering algorithm (LC) is presented [17]; its essential idea is as follows.

Step 1. K-Means is applied to a given training set to generate $K$ classes.

Step 2. The class whose distortion converges first is removed from the training set, so that the updated training set consists of the residual points. Go to Step 1.

Steps 1 and 2 are repeated until all data are classified.

The LC algorithm is approximately 4–13 times faster than the K-Means algorithm.
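The two steps above can be sketched as a loop that repeatedly removes the first class whose distortion settles. This is only one reading of the description, not the algorithm of [17]: the seeding of centroids with the first remaining points, the threshold `eps`, the iteration cap, and the fallback when nothing converges are all assumptions of the sketch, and points are assumed to be hashable tuples.

```python
import math

def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def local_clustering(points, k, eps=1e-6, max_iter=100):
    """Peel off, one at a time, the class whose distortion settles first."""
    remaining = list(points)
    classes = []
    while remaining:
        k_now = min(k, len(remaining))
        centroids = remaining[:k_now]  # assumption: first points seed the centroids
        previous = [None] * k_now
        stable = None
        for _ in range(max_iter):
            # Step 1: one K-Means pass (assign, then update centroids).
            groups = [[] for _ in range(k_now)]
            for p in remaining:
                nearest = min(range(k_now),
                              key=lambda j: math.dist(p, centroids[j]))
                groups[nearest].append(p)
            centroids = [_mean(g) if g else centroids[j]
                         for j, g in enumerate(groups)]
            # Distortion: average distance of a class's points to its centroid.
            current = [sum(math.dist(p, centroids[j]) for p in g) / len(g)
                       if g else None
                       for j, g in enumerate(groups)]
            # Step 2: find the first non-empty class whose distortion stopped changing.
            for j in range(k_now):
                if (current[j] is not None and previous[j] is not None
                        and abs(current[j] - previous[j]) < eps):
                    stable = j
                    break
            if stable is not None:
                break
            previous = current
        if stable is None:  # fallback if nothing converged within max_iter
            stable = next(j for j, g in enumerate(groups) if g)
        classes.append(groups[stable])
        removed = set(groups[stable])
        remaining = [p for p in remaining if p not in removed]
    return classes

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(local_clustering(pts, k=2))
```

Because each pass removes a non-empty class, the training set shrinks on every round, which is the source of the speedup over running K-Means to global convergence.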

Suppose that the $i$th class is $C_i^{(t)}$ during the $t$th iteration of the K-Means algorithm. The set $C_i^{(t)}$ has entropy $H(C_i^{(t)}) = -\sum_{x \in C_i^{(t)}} p(x) \log p(x)$, where $p(x)$ is the probability of data $x$. It is proved that the entropy sequence $\{H(C_i^{(t)})\}$ is convergent [16]. That is, the convergence criterion of the K-Means algorithm can be replaced by the convergence of the entropy sequence [18]. K-Means with the entropy convergence criterion is at least 2 times faster [18, 19].

2. Improve Local Clustering Algorithm to Generate Compact Class

2.1. Compact Set and the Method of Generation

A subset of the Euclidean space $\mathbb{R}^n$ is called a compact set if every sequence in the subset has a convergent subsequence whose limit point belongs to the subset. Compactness is a topological concept. To make it easier to understand, compactness can be pictured as many points clustering tightly in a small region, while a noncompact set is one in which most points are scattered loosely over a large region.

K-Means, LC, and other clustering algorithms aim to partition a training set into classes. Some classes are compact and some are not. Most commonly, a class contains a compact subset together with some loose points, and the points of the compact subset lie around the center of the class. That is, the central part of a class is likely to be compact. To extract the compact subset from a class, the following $3\sigma$-principle is introduced.

For a Gaussian distribution, let $\sigma$ denote the standard deviation of the random data. The $3\sigma$-principle states that a random point falls with more than 99% probability into the central region of the data set whose radius is $3\sigma$ [16]. The central region therefore contains more than 99% of the points. Thus, if the radius $3\sigma$ is small enough and the number of points is large enough, the central region is compact. If the central region with radius $3\sigma$ is not compact, shortening the radius to $2\sigma$, $\sigma$, and so on will make it compact. For a Gaussian distribution comprising enough points, a compact central region always exists. In general, for a class generated by a clustering algorithm, the distances of the points from the class centroid follow an approximately Gaussian distribution. Therefore, the central region of a class is likely to be compact.
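Extracting a central region around the centroid can be sketched as follows. The sketch assumes that the deviation is approximated by the root-mean-square distance of the points to the class centroid and that the radius is three deviations divided by a hypothetical `shrink` factor; neither detail is taken from the paper's formulas.

```python
import math

def central_region(points, shrink=1.0):
    """Split a class into (core, loose) points using a 3-sigma radius.

    `shrink` > 1 shortens the radius; it is a stand-in for the paper's
    radius-adjusting parameter, whose exact form is not assumed here.
    """
    centroid = tuple(sum(c) / len(points) for c in zip(*points))
    dists = [math.dist(p, centroid) for p in points]
    # Assumption: sigma is the root-mean-square distance to the centroid.
    sigma = math.sqrt(sum(d * d for d in dists) / len(points))
    radius = 3.0 * sigma / shrink
    core = [p for p, d in zip(points, dists) if d <= radius]
    loose = [p for p, d in zip(points, dists) if d > radius]
    return core, loose

cluster = [(0.0, 0.0)] * 20 + [(100.0, 0.0)]
core, loose = central_region(cluster)
print(len(core), loose)
```

In this example the single distant point falls outside the radius and is left for a later round of clustering, while the tight group is kept as the compact core.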

Suppose that the th class is at the th iteration of -Means or LC algorithm. With the increase of iteration, class sequence ( ) appears, where denotes the number of classes. Let where denotes the number of elements in and denotes distance.

Consider

Clearly, is the distortion of class and is the approximation of deviation of .

Consider

is the central region of class . Parameter is used to shorten the radius of central region and makes it compact. Figure 1 illustrates the -principle and compact subset .

2.2. Subroutine 1: Local Clustering Algorithm with $3\sigma$-Principle

The local clustering algorithm with the $3\sigma$-principle is used to classify points into classes and to extract the compact central regions of the classes. Its essential idea is described below.

Firstly, apply LC algorithm to cluster data. And apply the criterion of entropy convergence (i.e., ) to mark the stable class .

Secondly, extract compact central region from class and preserve it as a genuine class. Remove from training set and update it. Repeat the above two steps until all compact central regions are extracted. The details are described in Algorithm 1.

Input parameters:
: Training Set
: The number of classes
: The stop threshold for clustering.
: Initial centroids set.
: A parameter to adjust the size of compact subsets .
Output:
(i.e., the set of compact subsets; see Figure 1)
, where , and it is comprised by dispersive points
( , see Figure 1)
Void Subroutine  1  ( )
{
Step  1. Initialization: Let iteration number . Let . Let and , where
denotes empty set. According to initial centroids set , generate initial partition of training set
.
Step  2. While
Step  2.1. Generate new centroids set and new partition
/* Note: Check whether entropy sequence { } is convergent. If it is convergent,
let the convergent marker StableMarker  */
Step  2.2. For
Estimate the entropy of class , that is, .
If
Else
 }
/* Note: Extract the data around the centroid of class as a genuine class */
Step  2.3. For
If
Calculate compact central region according to formula (3)
Calculate : 
Let
Let
Update Training Set:
Update centroids set:
 }
}
}
}

2.3. Special LC Algorithm to Generate Compact Classes (SLC)

Note that Subroutine 1 above does not produce a partition of the training set. It extracts only the compact central regions of the classes, and the residual points remain unclassified. The residual points form a new training set, in which some points may again cluster tightly and form small compact subsets. These small compact subsets are new classes. To obtain these new classes and classify all points, the SLC algorithm is described in Algorithm 2.

Input parameters:
: Training Set
: The initial number of classes.
: The stop threshold for clustering.
Output:
Num: The final number of classes.
CLS: The partition of    , in which each class is compact.
SLC Algorithm:
Step  1. Initialization: Let   ,   ,   , and .
Step  2. For ( ) /*Note: denotes the integer */
Step  2.1. Generate initial centroids set   .
Step  2.2. Call Subroutine1
Step  2.3.   ;
Step  2.4. ;
 /* Note: Increase to get smaller compact class */
Step  2.5.   ;  
}
Step  3. Every residual point in the last set     is regarded as a class   . And let   .
Let  Num  denote the number of classes contained in  CLS. The two outputs are  CLS  and Num.

2.4. The Clustering for Mixture Distribution (SLC-Mixture)

The SLC algorithm presented above generates spherical classes only. In a general distribution, however, some classes are spherical, some are chain-shaped, with points clustering closely around a curve (or a line), and some contain isolated points. Such a distribution is called a mixture distribution. For a large-scale TSP, the distribution of cities is in general a mixture distribution. A clustering method for mixture distributions is proposed below.

2.4.1. The Simple Marker to Distinguish Spherical Class from Chain-Shaped Class

The position of a city on a map is a two-dimensional point. A given class can be divided into 8 areas by the north-south line, the west-east line, and the two diagonal lines through the centroid of the class. If the class is spherical, the percentage of points in each area is approximately the same and close to 1/8. If the class is chain-shaped (or part of a chain-shaped class), the percentages of all areas cannot be close to 1/8 at the same time. Therefore, the percentage of points in each area serves as the marker of a spherical class. Figure 2 illustrates the marker.
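The marker can be sketched by binning the direction of every point, as seen from the centroid, into eight 45-degree sectors. The tolerance that decides when a share counts as close to 1/8 is not specified in the text, so the value used below is an assumption, as are the function names.

```python
import math

def octant_shares(points):
    """Share of points in each of the 8 sectors around the class centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    counts = [0] * 8
    for x, y in points:
        if (x, y) == (cx, cy):
            continue  # a point on the centroid has no direction
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        counts[min(int(angle // (math.pi / 4)), 7)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * 8

def looks_spherical(points, tolerance=0.06):
    """True when every sector holds roughly 1/8 of the points (assumed tolerance)."""
    return all(abs(s - 1 / 8) <= tolerance for s in octant_shares(points))

ring = [(math.cos((k + 0.5) * math.pi / 8), math.sin((k + 0.5) * math.pi / 8))
        for k in range(16)]
chain = [(i, 0.5 * i) for i in range(-10, 11)]
print(looks_spherical(ring), looks_spherical(chain))
```

For the chain, all points fall into two opposite sectors, so most shares are far from 1/8 and the class is not marked as spherical.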

2.4.2. Applying SLC to Process Mixture Distribution (SLC-Mixture)

First, apply SLC to classify all data of the training set. Second, apply the marker presented above to identify the spherical classes and extract them from the training set. All residual points then form a new set, named the residual set, which contains only chain-shaped classes and isolated points. Third, apply the method presented in [20], named the chain-shaped clustering algorithm, to classify the points of the residual set into chain-shaped classes or to mark them as isolated points.

The clustering method presented in this section is called SLC-Mixture algorithm, which processes the mixture distribution of spherical classes, chain-shaped classes, and isolated points.

3. Apply SLC to ACO

3.1. The Termination Criterion of ACO

Suppose ACO acts on a compact class, and let $L_t$ denote the minimum route length generated at the $t$th iteration of the computation. The sequence $\{L_t\}$ is convergent under ideal conditions. The convergence of $\{L_t\}$ is proposed as the termination criterion of ACO in this paper.

In the following discussion, ACO refers to the algorithm whose termination criterion is the convergence of $\{L_t\}$.
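A convergence-based stopping test on the best route length can be sketched as below. The concrete test (relative change below `eps` over the last `window` iterations) and both default values are assumptions chosen for illustration; the paper's exact criterion is not reproduced here.

```python
def has_converged(best_lengths, eps=1e-3, window=5):
    """True when the best route length has barely changed over `window` iterations.

    `best_lengths` holds the minimum route length found at each iteration so far.
    """
    if len(best_lengths) <= window:
        return False  # not enough history yet
    old, new = best_lengths[-window - 1], best_lengths[-1]
    return abs(old - new) <= eps * max(abs(old), 1e-12)

history = [100, 90, 85, 84, 84, 84, 84, 84, 84]
print(has_converged(history))
```

An ACO loop would append the best length after each iteration and stop as soon as the test returns true, instead of always running a fixed number of iterations.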

3.2. Apply SLC to Improve the Running Speed of ACO (ACO-SLC)

In this section, the clustering algorithm SLC will be applied to improve the running speed of ACO. The method is named ACO-SLC and it is described as below.

Input parameter: the set of cities.

Output: the shortest TSP route obtained by the algorithm.

ACO-SLC Algorithm.

Step 1. Apply the SLC algorithm to partition the set of cities. Let the classes be $C_1, C_2, \ldots, C_k$ and let their centroids be $c_1, c_2, \ldots, c_k$, respectively.

Step 2. Construct graph $G$: the centroids $c_1, c_2, \ldots, c_k$ are regarded as virtual cities, and the virtual cities are the vertices of $G$. For a pair of classes $C_i$ and $C_j$, if there exist two cities $a \in C_i$ and $b \in C_j$ that are joined to each other, an edge joins the corresponding vertices $c_i$ and $c_j$. The weight of this edge is the minimum distance between the two classes; that is, $w(c_i, c_j) = \min \{ d(a, b) : a \in C_i,\ b \in C_j \}$, where $d$ denotes distance.

Step 3. Calculate a TSP route of graph $G$ to generate the traveling order of the classes: let the ACO algorithm act on $G$ to find a TSP route $c_{i_1} \to c_{i_2} \to \cdots \to c_{i_k} \to c_{i_1}$, where $i_1, i_2, \ldots, i_k$ is a permutation of the sequence $1, 2, \ldots, k$. Two classes that are consecutive on this route are called neighbor classes.

Step 4. Choose an edge as the bridge joining each pair of neighbor classes; this edge is named the bridge edge. Assume that the two neighbor classes are $C_i$ and $C_j$. If the edge $(a, b)$ with $a \in C_i$ and $b \in C_j$ is the bridge edge, then $a$ and $b$ are called border cities; $a$ and $b$ should not be used to join other neighbor classes.

Step 5. Calculate a local TSP route for every class $C_i$ ($i = 1, 2, \ldots, k$): add a new edge joining the two border cities in the class and mark it as a necessary edge of the local TSP route. This edge is named the pseudoedge. Let the ACO algorithm with the convergence criterion act on the class to generate a local TSP route.

Step 6. Construct a TSP route: walk along the traveling order obtained at Step 3; for every pair of neighbor classes, delete the pseudoedge of each class so that the local route is no longer closed. Then join the local routes of the two classes by the bridge edge between them.
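The edge weights of Step 2 and a candidate bridge edge for Step 4 can both be derived from the closest pair of cities between two classes. The sketch below assumes that every pair of cities can be joined (a complete graph) and that the closest pair is used as the bridge edge; the steps above only say that an edge is chosen, and the rule that a border city is not reused for another bridge is not enforced here. The function names are hypothetical.

```python
import math
from itertools import product

def bridge_edge(class_a, class_b):
    """Closest pair (a, b, distance) with a in class_a and b in class_b."""
    a, b = min(product(class_a, class_b), key=lambda pair: math.dist(*pair))
    return a, b, math.dist(a, b)

def class_graph(classes):
    """Weights of the virtual-city graph: minimum distance between each class pair."""
    return {(i, j): bridge_edge(classes[i], classes[j])[2]
            for i in range(len(classes))
            for j in range(i + 1, len(classes))}

classes = [[(0, 0), (1, 0)], [(4, 0), (5, 0)], [(1, 3)]]
print(class_graph(classes))
print(bridge_edge(classes[0], classes[1]))
```

The resulting weight dictionary is the input for the TSP over virtual cities in Step 3, and the returned pair of cities gives the border cities between two neighbor classes.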

Figure 3 illustrates the processing of ACO-SLC algorithm.

3.3. Using the Method of Little-Window and Removing Cross-Edge to Improve ACO-SLC (ACO-SLC-LWCR)

Although clustering greatly improves the running speed of ACO, it may introduce error into the solution. If all classes are compact and clearly separated, the quality of the ACO-SLC solution should be very good. In practice, however, the border between two neighbor classes is fuzzy. A fuzzy border causes inaccuracy in the solution, and parts of the route become much longer than necessary. Recognizing and removing such longer parts is likely to yield a better solution. It is well known that a shortest route in the Euclidean plane does not intersect itself, so two intersecting edges indicate a longer part of the route. In other words, the intersection of two edges is a likely marker of a longer part of a route. Removing longer edges according to this marker is called removing cross-edges (or removing intersection edges), which is similar to the method in [4]. (Note: in [4], the long and crossed edges are removed before executing ACO in order to improve the running speed of ACO, not the solution quality.)

Figure 4 illustrates the method of removing cross-edge.
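One common way to remove crossing edges is the 2-opt move: when two edges of the route intersect, the part of the route between them is reversed, which replaces the crossing pair by two shorter non-crossing edges. The sketch below uses this standard technique as a stand-in; it is not claimed to be the exact procedure of Figure 4, and the function names are hypothetical.

```python
def _ccw(a, b, c):
    """Signed area test: > 0 if a, b, c turn counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def edges_cross(p1, p2, p3, p4):
    """True when segments p1-p2 and p3-p4 properly intersect."""
    return (_ccw(p1, p2, p3) * _ccw(p1, p2, p4) < 0
            and _ccw(p3, p4, p1) * _ccw(p3, p4, p2) < 0)

def remove_cross_edges(route):
    """Reverse sub-routes (2-opt moves) until no two edges of the closed route cross."""
    route = list(route)
    n = len(route)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = route[i], route[i + 1]
                c, d = route[j], route[(j + 1) % n]
                if a == d:
                    continue  # the two edges share a city and cannot cross
                if edges_cross(a, b, c, d):
                    # Replace edges a-b and c-d by a-c and b-d.
                    route[i + 1:j + 1] = reversed(route[i + 1:j + 1])
                    improved = True
    return route

crossed = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(remove_cross_edges(crossed))
```

Each reversal strictly shortens the route, so the loop terminates; in the example the two diagonals of the unit square are replaced by two of its sides.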

In addition, a simple method named the little-window strategy is proposed in [21] to improve the running speed of ACO. For the $i$th city, construct a window set $W_i$ comprising $w$ accessible short edges that join the $i$th city, where $w$ is a preassigned constant. An ant that has arrived at the $i$th city selects the edge to its next city only from the window set $W_i$, not from all edges incident to this vertex. This improves the running speed of ACO.
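The window idea can be sketched as a precomputed list of the nearest cities for every city, filtered at move time to those not yet visited. The window size, the use of Euclidean distance, and the function names are assumptions of this sketch, and what an ant does when its whole window has been visited is left open.

```python
import math

def window_sets(cities, w):
    """For each city index, the indices of its w nearest other cities."""
    windows = {}
    for i, ci in enumerate(cities):
        others = sorted((j for j in range(len(cities)) if j != i),
                        key=lambda j: math.dist(ci, cities[j]))
        windows[i] = others[:w]
    return windows

def allowed_moves(i, windows, visited):
    """Cities an ant at city i may move to: window members not yet visited."""
    return [j for j in windows[i] if j not in visited]

cities = [(0, 0), (1, 0), (3, 0), (10, 0)]
windows = window_sets(cities, w=2)
print(windows)
print(allowed_moves(0, windows, visited={0, 1}))
```

Restricting each choice to a short window means an ant evaluates only a few candidate edges per step instead of one per remaining city.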

ACO-SLC with the little-window strategy and cross-edge removal is called ACO-SLC-LWCR.

3.4. The ACO-SLC for Mixture Distribution (ACO-SLC-Mixture)

ACO-SLC is suitable only for spherical distributions, and low-quality solutions are likely when it is applied to a mixture distribution. To process mixture distributions, the following method, named ACO-SLC-Mixture, is proposed in this paper.

First, apply SLC-Mixture (Section 2.4.2) to partition the set of cities into spherical classes, chain-shaped classes, and isolated points. Second, apply ACO-SLC-LWCR to each class and generate a TSP route.

4. Simulation

In this section, five related algorithms ACO, ACO-K-Means, ACO-SLC, ACO-SLC-LWCR, and ACO-SLC-Mixture are tested and compared. In the following simulation, ACO refers to ant-cycle presented by Colorni et al., which is very typical [1].

All test data in this paper are downloaded from http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/tsp/. All algorithms run on a personal computer (CPU: 1.80 GHz; memory: 480 MB; software: Matlab). The parameters are listed as below. Initialize pheromone trails , iteration number 1000, , , , , , and . Two performance items are tested. The first is the running time, measured by Ratio = Time(ACO)/Time(Algorithm); the bigger the ratio, the faster the algorithm. The advantage of the ratio is that the influence of other processes on the measured runtime largely cancels out, so it is more accurate than the raw measured runtime. The second item is the quality of the solution, measured by the relative error Error = (Solution − Optimum)/Optimum, where Optimum denotes the best solution currently known; the smaller the error, the better the quality of the solution.
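The two performance items can be written as two one-line functions. The numbers in the example are made up for illustration and are not measurements from the paper.

```python
def speed_ratio(time_aco, time_algorithm):
    """Ratio = Time(ACO) / Time(Algorithm); larger means the algorithm is faster."""
    return time_aco / time_algorithm

def relative_error(solution, optimum):
    """Error = (Solution - Optimum) / Optimum; smaller means better quality."""
    return (solution - optimum) / optimum

# Hypothetical values: 120 s for plain ACO, 0.5 s for the tested algorithm,
# and a route of length 102 against a best known length of 100.
print(speed_ratio(120.0, 0.5))
print(relative_error(102.0, 100.0))
```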

The performances of the five algorithms are listed in Figure 5. It shows that ACO-SLC, ACO-SLC-LWCR, and ACO-SLC-Mixture are faster than ACO by , , and 257–9419 of factors, respectively! However, some solutions of ACO-K-Means and ACO-SLC have low quality. The error of ACO-SLC-Mixture is lower than that of ACO in most cases and exceeds it by at most 2%.

The Defect of ACO-SLC. Figure 5 shows that the quality of the solution is good only when the data set exhibits a significant local clustering feature. The simulations in this paper show that the quality of the ACO-SLC solution depends on the quality of the clustering and that the clustering quality of SLC is sensitive to the initial centroids, just as in the K-Means algorithm. This is the main defect of ACO-SLC.

5. Conclusion

Time Complexity of ACO. ACO is inspired by the foraging behavior of ant colonies and has been applied to many optimization problems. Its typical application is the traveling salesman problem (TSP). The running time of ACO is $O(N_c \cdot m \cdot n^2)$, where $N_c$, $m$, and $n$ denote the iteration number, the number of ants, and the number of cities, respectively. Parameter $m$ is an empirical value. Parameter $n$ is the key factor of the running time because the running time is proportional to its square. Parameters $N_c$ and $n$ can be adjusted, and decreasing them cuts down the running time.

Focus of ACO Study. ACO can in general generate solutions of high quality, but its shortcoming is that the running time is too long. Cutting down the running time is one focus of ACO research, and one way is to decrease parameters $N_c$ and $n$, especially $n$.

Basic Idea for this Study Focus. For this study focus, the following basic idea is presented in this paper.

Firstly, all cities are classified into compact classes, where compact class is the class where all cities in this class cluster tightly in a small region. Secondly, let ACO act on every class to get a local TSP route. Thirdly, all local TSP routes are joined to form solution. Fourthly, the inaccuracy of solution caused by clustering is eliminated.

Realization of Basic Idea. The realization of the above idea is based on a novel clustering algorithm presented in this paper, named the special local clustering algorithm (SLC). The running time of SLC is far less than that of ACO. SLC generates compact classes, while currently popular clustering algorithms such as K-Means do not in general. The compactness of a class makes the length of the TSP route obtained at each iteration convergent, and the convergence of this length is proposed as the termination criterion of ACO in this paper. Thus, the iteration number is cut down, which improves the running speed of ACO. In addition, every class is small, so letting ACO act on small classes cuts down the number of cities per run, and the running speed improves further. Based on this analysis, the ACO-SLC algorithm is presented in this paper. Simulation shows that ACO-SLC is faster than ACO by of factors!

Elimination of the Solution Inaccuracy Caused by Clustering. Although the running speed is improved, the inaccuracy of the solution can be severe. Two factors causing the inaccuracy are identified in this paper. One is the cross-edges (see Section 3.3) and the other is the mismatch between ACO-SLC and mixture distributions (see Section 3.4). To address these two factors, ACO-SLC-LWCR and ACO-SLC-Mixture are presented in this paper as improvements of ACO-SLC. Simulation shows that ACO-SLC-LWCR and ACO-SLC-Mixture are faster than ACO by and 257–9419 of factors, respectively! The error of ACO-SLC-Mixture is lower than that of ACO in most cases and exceeds it by at most 2%.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The work is supported by Aeronautical Science Foundation of China (no. 2012ZD11) and partially by the Education Department of Sichuan Province (no. 12zA134 and no. 09zz028).