Journal of Optimization

Volume 2016 (2016), Article ID 4786268, 15 pages

http://dx.doi.org/10.1155/2016/4786268

## Introducing Complexity Curtailing Techniques for the Tour Construction Heuristics for the Travelling Salesperson Problem

^{1}School of Mathematical and Computer Sciences, Heriot Watt University, Edinburgh EH14 4AS, UK

^{2}Route Monkey Ltd., Livingston EH54 5DW, UK

Received 22 March 2016; Revised 23 May 2016; Accepted 8 June 2016

Academic Editor: Ling Wang

Copyright © 2016 Ziauddin Ursani and David W. Corne. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

In this paper, complexity curtailing techniques are introduced to create faster versions of insertion heuristics, namely, the cheapest insertion heuristic (CIH) and the largest insertion heuristic (LIH), effectively reducing their complexities from $O(n^3)$ to $O(n^2)$ with no significant effect on solution quality. This paper also examines the relatively little-known heuristic concept of max difference and shows that it can be developed into a full-fledged max difference insertion heuristic (MDIH) by defining its missing steps. The paper further extends the complexity curtailing techniques to MDIH to create a faster version of it. The resultant heuristic, the fast max difference insertion heuristic (FMDIH), outperforms the “farthest insertion” heuristic (FIH) across a wide spectrum of popular datasets with statistical significance, even though both heuristics have the same worst case complexity of $O(n^2)$. It should be noted that FIH is considered the best among the lowest-order complexity heuristics. The complexity curtailing techniques presented here open up a new area of research into their possible extension to other heuristics.

#### 1. Introduction

The Traveling Salesman Problem (TSP) is one of the most studied problems in the scientific literature and is sometimes referred to as the mother of all combinatorial optimization problems. It continues to be a testing ground for the development of combinatorial optimization methods, while having numerous practical applications in diverse areas, including logistics, genetics, manufacturing, telecommunications, and neuroscience [1]. Examples of real-world application domains with problems that can be naturally formulated as the TSP include VLSI design, vehicle routing, data clustering, and job-shop scheduling [2]. The earliest reference to the TSP can be found in an 1832 German handbook for travelling salesmen [1]. The problem consists of finding the shortest possible tour of a set of cities, such that the tour starts from and ends at the same city and visits each of the remaining cities precisely once. The problem is simple to state but has proved to be intractable and is included among the seven “millennium prize problems” described by the Clay Mathematics Institute, carrying a prize of one million US dollars for discovery of a polynomial time solution method.

Meanwhile, research and practice in the TSP have focused on heuristic methods that yield fast approximate solutions. These heuristic methods tend to cluster into three main groups: (i) tour construction heuristics, (ii) tour improvement heuristics, and (iii) composite heuristics (which combine elements of both tour construction and tour improvement). A tour construction heuristic constructs the tour from scratch, beginning with one city and iteratively expanding the subtour by one city at a time. In contrast, a tour improvement heuristic begins with a complete tour and makes one or more rearrangements in an attempt to improve it. A vast space of “composite” heuristics blurs the distinction between these two categories by using elements of both; for example, in the highly successful Lin-Kernighan heuristic [3], the context is that of iterated tour improvement; however, the improvement process consists of repeated construction of full solutions from partial solutions. The Concorde software code [4], which has solved most problems in TSPLIB to optimality, uses the Lin-Kernighan heuristic to find near-optimal tours and then applies various mathematical programming methodologies to achieve optimality.

With no general polynomial time solution method yet available, the quest for the development of new heuristics for the TSP remains active. Tour construction heuristics have played an important role in this quest, in particular because they can form components of a wide range of approaches. For example, they can be used to construct the starting solutions for tour improvement heuristics [5], and they can provide rough estimates of the cost of optimal solutions. In turn, they provide interesting analytical grounds for the study of upper bounds on solution quality [6] and provide material for empirical studies [7, 8] that attempt to understand how optimal solutions differ from solutions that arise from tour construction heuristics. Such studies lead to a better understanding of the structure of optimal solutions of the TSP, a problem that has remained at the centre of our intellectual curiosity for centuries.

The basic procedure of tour construction heuristics can be summarised as follows:

(1) Establishment of the initial small subtour (subtour establishment rule).

(2) Selection of a city not present in the current subtour (selection rule).

(3) Expansion of the current subtour to include the selected city (expansion rule).

(4) Iterative application of steps (2) and (3) until a complete tour is obtained.

In the repeated application of steps (2) and (3), the subtour is “successively augmented” with the insertion of a new city; this is why tour construction heuristics are sometimes also called “successive augmentation” heuristics (see, e.g., [9]). Different tour construction heuristics are characterised by the specific methods they choose for each of the three steps: the initial subtour establishment, selection, and expansion rules. Expansion rules can be grouped into two main types: insertion and addition. An insertion-based expansion rule chooses where in the permutation to place the new city on the basis of the cost of the resulting subtour, whereas an addition-based expansion rule bases this decision on next-hop distance. It has been reported that insertion heuristics generally perform better than addition heuristics (see, e.g., [8]); we have therefore chosen to focus on insertion-based expansion heuristics in this paper.

Three insertion heuristics are particularly prominent in the literature, namely, nearest, farthest, and cheapest [10]. We consider these three heuristics in the remainder of this work, along with a further two that are less well known. The first of these two is the “largest-cost insertion heuristic” (LIH) [11], which sits naturally alongside the aforementioned three, despite being rarely considered in the literature. The second is the “max difference insertion heuristic” (MDIH). Tunnel and Heath [12] presented “max difference” as a concept that could be attached to any other heuristic; however we argue that, with appropriate extensions that we later describe (resulting in MDIH), it is more appropriately seen as a tour construction heuristic in itself, and in fact we demonstrate that it is a particularly effective one.

The insertion heuristics under study can be divided into two groups, that is, distance based insertion heuristics (DBIH) and cost based insertion heuristics (CBIH). DBIH use a city selection rule based on distance. This group includes the nearest (NIH) and farthest insertion heuristics (FIH). CBIH use a city selection rule based on the cost of insertion (see (5)). This group includes the cheapest (CIH), largest (LIH), and max difference insertion heuristics (MDIH). The difference in selection rule brings about a difference in the worst case complexity of the heuristics, with DBIH having a worst case complexity of $O(n^2)$ and CBIH having a complexity of $O(n^3)$, where $n$ is the number of cities (see Section 4). This paper proposes techniques to curtail the complexity of CBIH to $O(n^2)$, effectively creating faster versions of CBIH, that is, FCIH, FLIH, and FMDIH, where the prefix “F” denotes “fast.”

Previously it has been reported that the farthest insertion heuristic (FIH) generally performs best among the insertion heuristics of $O(n^2)$ or lower time complexity (see, e.g., [13]). However, we will show that the performance of FMDIH is consistently better than that of FIH on a wide spectrum of popular datasets, even though the complexity of FMDIH is no greater than that of FIH.

The remainder of this paper is structured as follows. In Section 2, we describe four of the five insertion heuristics of interest: cheapest, largest, nearest, and farthest insertion. In Section 3 we describe the previously overlooked “max difference” concept and build on that to describe the “max difference insertion” heuristic. In Section 4, we discuss design and complexity issues for all five of the heuristics of interest, and in Section 5 we describe our complexity curtailing techniques to produce accelerated and approximated variants of cost based insertion heuristics. Section 6 provides summaries of empirical results, and we conclude with a statement of our main findings in Section 7. In the Appendix, we show the empirical results for the farthest and max difference insertion heuristics in finer detail.

#### 2. Description of Tour Construction Heuristics

Below, we explain the design details of four of the five main heuristics considered in this paper. Our elaboration of the details follows the four-part structure given in Section 1 for tour construction heuristics. Being insertion heuristics, the most important elements are the city selection and subtour expansion methods. The city selection method is also salient for another reason, which is the fact that the name of a tour construction heuristic (e.g., cheapest and farthest) tends to reflect the nature of this rule. We come back to this fact in Section 3, when we speculate about why the “max difference” heuristic has been largely overlooked. The simple “initial subtour establishment” rule is also described for completeness and to facilitate replication of our experiments. Meanwhile, it is worth noting an interesting aspect of the city selection rule.

##### 2.1. Initial Subtour Establishment

This is the first step in any tour construction heuristic. In each of the standard forms of the cheapest, largest, nearest, and farthest insertion heuristics, this initial “subtour” is simply a single city chosen uniformly at random.

##### 2.2. City Selection

Momentarily referring to the subtour expansion rule (in Section 2.3), we note that two of the heuristics of interest are based on cost, and two are based on distance. Where subtour expansion is based on cost, the city selection rule is as follows. Let $S$ be the set of cities in the subtour and let $S'$ be the set of cities not in the subtour, while city $k \in S'$ and $C_k$ is the expansion cost of subtour $S$ by insertion of city $k$. Then, the city $k^{*}$ is selected such that

$$k^{*} = \arg\min_{k \in S'} C_k \tag{1}$$

for CIH and

$$k^{*} = \arg\max_{k \in S'} C_k \tag{2}$$

for LIH. When subtour expansion is instead based on distance, the city selection rule is different. Now, let $d_{ik}$ be the distance between cities $i \in S$ and $k \in S'$; the city $k^{*}$ is selected such that

$$k^{*} = \arg\min_{k \in S'} \min_{i \in S} d_{ik} \tag{3}$$

for NIH and

$$k^{*} = \arg\max_{k \in S'} \min_{i \in S} d_{ik} \tag{4}$$

for FIH.
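The four selection rules can be illustrated with a minimal sketch. It assumes cities are 2D points with Euclidean distances and that the subtour is a Python list in visiting order; the names `expansion_cost`, `nearest_distance`, and `select_city` are illustrative and do not come from the paper.

```python
import math

def expansion_cost(tour, k, pts):
    """Smallest increase in subtour length from inserting city k on any edge."""
    m = len(tour)
    return min(
        math.dist(pts[tour[p]], pts[k])
        + math.dist(pts[k], pts[tour[(p + 1) % m]])
        - math.dist(pts[tour[p]], pts[tour[(p + 1) % m]])
        for p in range(m)
    )

def nearest_distance(tour, k, pts):
    """Distance from city k to its nearest city in the subtour."""
    return min(math.dist(pts[i], pts[k]) for i in tour)

def select_city(tour, outside, pts, rule):
    """Pick the next city according to the named selection rule."""
    if rule == "CIH":   # cheapest: minimum expansion cost
        return min(outside, key=lambda k: expansion_cost(tour, k, pts))
    if rule == "LIH":   # largest: maximum expansion cost
        return max(outside, key=lambda k: expansion_cost(tour, k, pts))
    if rule == "NIH":   # nearest: minimum distance to the subtour
        return min(outside, key=lambda k: nearest_distance(tour, k, pts))
    if rule == "FIH":   # farthest: maximum distance to the subtour
        return max(outside, key=lambda k: nearest_distance(tour, k, pts))
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical toy instance: a two-city subtour and three candidate cities.
pts = {0: (0, 0), 1: (10, 0), 2: (5, 0), 3: (0, 1), 4: (5, 12)}
tour, outside = [0, 1], [2, 3, 4]
choices = {r: select_city(tour, outside, pts, r) for r in ("CIH", "LIH", "NIH", "FIH")}
```

On this toy instance the cost-based and distance-based rules disagree: city 2 lies on the existing edge and is the cheapest to insert, whereas city 3 is the nearest to a subtour city.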

##### 2.3. Subtour Expansion

As indicated in the introduction, subtour expansion heuristics can be based around either *insertion* or *addition*. We are only concerned with insertion-based variants in this paper and so will confine our description accordingly. Let $i, j \in S$ and $k \in S'$, such that $i$ and $j$ are a pair of consecutive cities in the subtour $S$ and $k$ is a city not yet in it. Let $d_{ij}$ be the distance between the cities $i$ and $j$. The subtour is expanded by inserting the selected city $k$ between the consecutive cities $i$ and $j$ such that

$$C_k = \min_{(i,j)} \left( d_{ik} + d_{kj} - d_{ij} \right). \tag{5}$$

This step is applicable to each of CIH, LIH, FIH, and NIH. From this point onwards, we refer to the edge related to $C_k$ as the *expansion cost edge*. It should be noted that the “largest-cost insertion” heuristic has not yet been formally published; however, an unpublished paper about it is available on the internet [10]. We include it in our experiments, noting that it is the logical counterpoint to CIH, in the same way that FIH is related to NIH. It therefore completes a set of four heuristics of which two are based on minimum and maximum distance ((3)-(4)) while two are based on minimum and maximum expansion cost ((1)-(2)). We later find that LIH outperforms both CIH and NIH, despite being apparently absent from the literature.
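A minimal sketch of the expansion step follows, assuming Euclidean distances between 2D points and a list-based subtour; `insert_city` is a hypothetical helper name, not one used in the paper.

```python
import math

def insert_city(tour, k, pts):
    """Insert city k on the subtour edge with the least expansion cost.

    `tour` is a list of city ids in cyclic visiting order and is modified
    in place. Returns the expansion cost that was paid.
    """
    m = len(tour)

    def cost(p):
        # Cost of placing k between tour[p] and its successor.
        i, j = tour[p], tour[(p + 1) % m]
        return (math.dist(pts[i], pts[k]) + math.dist(pts[k], pts[j])
                - math.dist(pts[i], pts[j]))

    p = min(range(m), key=cost)   # index of the expansion cost edge
    paid = cost(p)                # evaluate before the list changes
    tour.insert(p + 1, k)
    return paid
```

For example, with `pts = {0: (0, 0), 1: (4, 0), 2: (4, 3), 3: (2, -1)}` and `tour = [0, 1, 2]`, calling `insert_city(tour, 3, pts)` places city 3 on the edge between cities 0 and 1.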

#### 3. The Max Difference Insertion Heuristic (MDIH)

The idea behind the MDIH was introduced in a Master’s thesis [12] a quarter of a century ago, and no research paper about it could be found in the literature. The “max difference” concept was presented by Tunnel and Heath as a way to engineer new variants of existing heuristics (they applied it to two versions of CIH and also to Stewart’s Algorithm [14]). However, since (as we will see) the concept relates directly to how the city selection step is done, from which the names of tour construction heuristics seem invariably to be derived, we would argue that it merits reinvention as a fully fledged tour construction heuristic. Its performance in the form of MDIH, as detailed in the next section, certainly supports this view. Meanwhile we speculate that Tunnel and Heath may not have put it forward as a standalone new heuristic, based on the belief that it was the “insertion” (subtour expansion) rather than the “max difference” (city selection) that was salient in such categorization; in turn, this seems to have led to its going unnoticed, despite its comparatively strong performance among similar heuristics.

We now explain the key “city selection” step for MDIH. First, we need some helper definitions. The first of these is the function “$m$th minimum,” denoted as $\min^{m}(X)$. The meaning of “$m$th minimum” is that if all the values in set $X$ are sorted from minimum to maximum then this function represents the $m$th value in that list. Now, the “$m$th expansion cost” of a subtour by the insertion of any city not in that subtour is equal to the function $\min^{m}(X)$, if the set $X$ is the set of all the expansion costs of that subtour on all of its edges caused by the insertion of that city. Now let $i, j \in S$ and $k \in S'$, such that $i$ and $j$ are a pair of consecutive cities in the subtour $S$ and $k$ is a city not yet in it. Let $d_{ij}$ be the distance between the cities $i$ and $j$. Let $C_k^{m}$ represent the $m$th expansion cost of the tour by the insertion of city $k$; then, by definition,

$$C_k^{m} = \min^{m}_{(i,j)} \left( d_{ik} + d_{kj} - d_{ij} \right). \tag{6}$$

It should be noted that, apart from here, this general function is used only in Sections 5.2 and 5.3. From this point onwards the edge related to $C_k^{m}$ is referred to as the $m$th expansion cost edge. In the city selection rule of MDIH the city $k^{*}$ is selected such that the difference between its expansion cost and its 2nd expansion cost is maximum among all the cities not in the tour; that is,

$$k^{*} = \arg\max_{k \in S'} \left( C_k^{2} - C_k^{1} \right). \tag{7}$$

It should be noted that there is no difference between $C_k$ and $C_k^{1}$. The 3rd step (subtour expansion) follows the same rule as represented by (5).
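The max difference selection rule can be sketched as follows. This is an illustration under the assumptions of Euclidean 2D points and a list-based subtour with at least two edges; the function names are hypothetical, and sorting is used only for clarity rather than speed.

```python
import math

def sorted_expansion_costs(tour, k, pts):
    """All expansion costs of inserting city k, one per subtour edge, ascending."""
    m = len(tour)
    return sorted(
        math.dist(pts[tour[p]], pts[k])
        + math.dist(pts[k], pts[tour[(p + 1) % m]])
        - math.dist(pts[tour[p]], pts[tour[(p + 1) % m]])
        for p in range(m)
    )

def mth_expansion_cost(tour, k, pts, m):
    """The m-th smallest expansion cost of city k (m = 1 is the usual cost)."""
    return sorted_expansion_costs(tour, k, pts)[m - 1]

def select_mdih(tour, outside, pts):
    """City whose 2nd expansion cost exceeds its 1st by the largest margin."""
    def difference(k):
        costs = sorted_expansion_costs(tour, k, pts)
        return costs[1] - costs[0]
    return max(outside, key=difference)

# Hypothetical toy instance: city 3 sits on an existing edge (cheap to insert
# now and still fairly cheap elsewhere), city 4 has one clearly preferable edge.
pts = {0: (0, 0), 1: (4, 0), 2: (4, 3), 3: (2, 0), 4: (4, -4)}
tour, outside = [0, 1, 2], [3, 4]
chosen = select_mdih(tour, outside, pts)
```

Here the cheapest rule would pick city 3, but the max difference rule picks city 4, because postponing city 4 risks losing its best edge and paying noticeably more on the second-best one.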

Finally, to present MDIH as a complete tour construction heuristic in itself, we also explore five variants of the subtour establishment rule (step (1)). Unlike the other insertion heuristics considered in this paper, the city selection step of MDIH requires an initial subtour containing at least three cities. We therefore test five subtour establishment rules: (i) a subtour of three randomly chosen cities; (ii) a subtour first formed by two randomly chosen cities, with the third city then chosen based on cheapest expansion; (iii) a subtour first formed by two randomly chosen cities, with the third city then chosen based on largest-cost expansion; (iv) a subtour initially formed by one randomly chosen city, with the next two chosen iteratively based on cheapest expansion; (v) a subtour first formed by one randomly chosen city, with the next two chosen iteratively on the basis of largest-cost expansion. Associated experiments and results are discussed later in Section 6.

#### 4. Design and Complexity Aspects of Tour Construction Heuristics

Design of any heuristic consists of the design of data structures and algorithms for its efficient implementation. Our implementation centres around two key data structures: a doubly circular linked list representing the set $S'$ of cities not in the subtour and a singly circular linked list representing the subtour $S$. For the set $S'$, the doubly circular linked list allows for a very simple and fast city deletion operation. For the subtour $S$, a singly circular linked list is sufficient since only an insertion operation is needed here. Initially, a doubly circular linked list is prepared representing the set of cities not in the subtour. From this list, cities are deleted one by one and correspondingly inserted in the singly circular linked list representing the subtour at each step. The procedure continues until the doubly circular linked list becomes empty.
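The two structures can be sketched as below. This is one possible array-backed realisation, assuming cities are numbered 0 to n-1; the class names `UnvisitedList` and `Subtour` are hypothetical and the paper's own implementation may differ.

```python
class UnvisitedList:
    """Doubly circular linked list of cities not yet in the subtour."""

    def __init__(self, n):
        self.next = [(c + 1) % n for c in range(n)]
        self.prev = [(c - 1) % n for c in range(n)]
        self.head = 0 if n else None
        self.size = n

    def remove(self, c):
        """Unlink city c in O(1) by joining its two neighbours."""
        p, q = self.prev[c], self.next[c]
        self.next[p], self.prev[q] = q, p
        self.size -= 1
        if self.size == 0:
            self.head = None
        elif self.head == c:
            self.head = q

    def __iter__(self):
        c = self.head
        for _ in range(self.size):
            yield c
            c = self.next[c]


class Subtour:
    """Singly circular linked list: succ[c] is the city visited after c."""

    def __init__(self, first):
        self.start = first
        self.succ = {first: first}

    def insert_after(self, i, k):
        """Place city k on the edge leaving city i, in O(1)."""
        self.succ[k] = self.succ[i]
        self.succ[i] = k

    def edges(self):
        c = self.start
        while True:
            yield c, self.succ[c]
            c = self.succ[c]
            if c == self.start:
                break
```

Only forward links are needed for the subtour because insertion touches a single edge, while deletion from the unvisited list needs both neighbours of the removed city.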

The design of step (2) (city selection) of DBIH, that is, the nearest and farthest insertion heuristics ((3)-(4)), consists of finding the nearest neighbour city present in the subtour for each of the cities not in the subtour. To reduce computation costs, the nearest neighbour city in the subtour for each city not in the subtour is recorded in each iteration, and in the next iteration only the distance to the most recently added city is compared with the distance to the nearest neighbour recorded in the previous iteration, thus updating the nearest neighbour information for the current iteration. Therefore, FIH and NIH can be implemented within a time complexity of $O(n^2)$ for $n$ cities.
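The incremental nearest neighbour update can be sketched for farthest insertion as follows. This is a simplified illustration that assumes Euclidean 2D points, uses plain Python lists instead of linked lists, and takes a fixed `start` city rather than a random one; the function names are hypothetical.

```python
import math

def farthest_insertion(pts, start=0):
    """Farthest insertion with a cached distance to the nearest subtour city."""
    def d(a, b):
        return math.dist(pts[a], pts[b])

    tour = [start]
    outside = [c for c in range(len(pts)) if c != start]
    near = {c: d(c, start) for c in outside}   # distance to nearest subtour city

    while outside:
        k = max(outside, key=near.__getitem__)  # farthest city from the subtour
        outside.remove(k)
        m = len(tour)
        # Expansion step: cheapest edge for the selected city.
        p = min(range(m), key=lambda q: d(tour[q], k) + d(k, tour[(q + 1) % m])
                - d(tour[q], tour[(q + 1) % m]))
        tour.insert(p + 1, k)
        # Only the newly added city can improve a cached nearest distance.
        for c in outside:
            near[c] = min(near[c], d(c, k))
    return tour

def tour_length(tour, pts):
    """Length of the closed tour."""
    m = len(tour)
    return sum(math.dist(pts[tour[p]], pts[tour[(p + 1) % m]]) for p in range(m))
```

Each iteration does a constant number of linear passes (selection, edge scan, cache update), which is where the quadratic total comes from.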

In the case of CBIH, that is, the cheapest and largest insertion heuristics, the design of the city selection step ((1) and (2)) consists of finding the expansion cost for insertion into the subtour of each city not in the subtour. This implies that the expansion cost needs to be computed for each city not in the subtour; in turn, computation of the expansion cost requires a visit to every edge of the subtour. To make this efficient, information should be preserved between iterations, so that in each new iteration the expansion cost is computed only on the newly added edges and compared with the expansion cost from the previous iteration to update the stored information. However, this shortcut may not be possible for all the cities not in the subtour, because some cities may have had their expansion cost in the previous iteration on an edge that is broken in the current iteration. The stored expansion cost of those cities is then no longer valid, since the relevant edge no longer exists. For such cities, we compute the expansion costs on the newly formed edges; if the smaller of these is less than or equal to the expansion cost from the previous iteration, then the new expansion cost has been found on the newly formed edges. If not, we are forced to recompute the expansion cost of these cities on all edges of the current subtour to find their new expansion cost. Therefore the worst case complexity of CIH and LIH is $O(n^3)$. The average complexity of CIH is reported as of the order [14].
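The caching scheme with its fallback rescan can be sketched for CIH as below. This is an illustrative reading of the description, assuming Euclidean 2D points, a fixed `start` city, and a successor dictionary in place of linked lists; `cheapest_insertion_cached` and the `full_rescans` counter are hypothetical names.

```python
import math

def cheapest_insertion_cached(pts, start=0):
    """Cheapest insertion keeping, per outside city, its best (cost, edge)."""
    def d(a, b):
        return math.dist(pts[a], pts[b])

    def cost(c, i, j):
        return d(i, c) + d(c, j) - d(i, j)

    succ = {start: start}                       # subtour as successor links
    outside = set(range(len(pts))) - {start}
    best = {c: (cost(c, start, start), (start, start)) for c in outside}
    full_rescans = 0

    while outside:
        k = min(outside, key=lambda c: (best[c][0], c))
        i, j = best.pop(k)[1]                   # edge (i, j) is broken by k
        outside.remove(k)
        succ[i], succ[k] = k, j
        for c in outside:
            old_cost, old_edge = best[c]
            new = min((cost(c, i, k), (i, k)), (cost(c, k, j), (k, j)))
            if old_edge != (i, j):
                best[c] = min(best[c], new)     # cached edge still exists
            elif new[0] <= old_cost:
                best[c] = new                   # a new edge is at least as good
            else:
                full_rescans += 1               # cached edge gone: scan all edges
                best[c] = min((cost(c, a, b), (a, b)) for a, b in succ.items())

    tour, c = [start], succ[start]
    while c != start:
        tour.append(c)
        c = succ[c]
    return tour, full_rescans
```

The `full_rescans` counter makes the source of the cubic worst case visible: every rescan costs a pass over all subtour edges, and in the worst case many outside cities need one in the same iteration.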

To design the city selection step for MDIH, for each city not in the subtour both the expansion cost and the 2nd expansion cost need to be computed (7). However, both of these quantities can be computed simultaneously in a single scan of the subtour. To implement the step efficiently, these quantities should be preserved between iterations for each city not in the subtour, and in the new iteration only the expansion cost and 2nd expansion cost on the recently added edges are computed and compared with those of the previous iteration to update them. However, just as was the case for CIH and LIH, this shortcut is not always possible; for some cities the expansion costs at all edges of the subtour need to be computed. Therefore the worst case complexity of MDIH is the same as that of the cheapest and largest insertion heuristics, that is, $O(n^3)$.
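The single scan that yields both quantities can be sketched as follows, assuming Euclidean 2D points and a list-based subtour; `two_smallest_expansion_costs` is a hypothetical name.

```python
import math

def two_smallest_expansion_costs(tour, k, pts):
    """One pass over the subtour edges, tracking the two cheapest insertions.

    Returns ((cost1, edge1), (cost2, edge2)) with cost1 <= cost2.
    """
    first = second = (math.inf, None)
    m = len(tour)
    for p in range(m):
        i, j = tour[p], tour[(p + 1) % m]
        c = (math.dist(pts[i], pts[k]) + math.dist(pts[k], pts[j])
             - math.dist(pts[i], pts[j]))
        if c < first[0]:
            first, second = (c, (i, j)), first   # old best becomes runner-up
        elif c < second[0]:
            second = (c, (i, j))
    return first, second
```

Keeping the edge alongside each cost matters for the caching scheme, since it is the identity of the stored edge that determines whether a cached value survives the next insertion.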

#### 5. Complexity Curtailing Techniques for Cost Based Insertion Heuristics

In this section, complexity curtailing techniques are introduced to create fast and approximate variants of CIH, LIH, and MDIH.

##### 5.1. Design of Fast Cheapest Insertion Heuristic (FCIH)

We can devise a new but faster heuristic based on CIH on the basis of simple geometrical principle. Consider the subtour “” in Figure 1.