Mathematical Problems in Engineering

Volume 2015, Article ID 713043, 14 pages

http://dx.doi.org/10.1155/2015/713043

## Reducing the Size of Combinatorial Optimization Problems Using the Operator Vaccine by Fuzzy Selector with Adaptive Heuristics

Instituto Politécnico Nacional-CITEDI, Avenue del Parque No. 1310, Mesa de Otay, 22510 Tijuana, BCN, Mexico

Received 10 March 2015; Revised 5 June 2015; Accepted 8 June 2015

Academic Editor: David Bigaud

Copyright © 2015 Oscar Montiel and Francisco Javier Díaz Delgadillo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Solving combinatorial problems optimally remains an open problem. Determining the best arrangement of elements is a very complex task that becomes critical as the problem size increases. Researchers have proposed various algorithms for solving Combinatorial Optimization Problems (COPs) that take scalability into account; however, larger COPs still run into hardware limitations such as memory and CPU speed. It has been shown that the Reduce-Optimize-Expand (ROE) method can solve COPs faster with the same resources; in this methodology, the reduction step is the most important procedure, since inappropriate reductions applied to the problem will produce suboptimal results in the subsequent stages. In this work, an algorithm to improve the reduction step is proposed. It uses a fuzzy inference system, together with metadata and adaptive heuristics, to classify portions of the problem and remove them, allowing COP-solving algorithms to make better use of the hardware resources by dealing with smaller problem sizes. The Travelling Salesman Problem has been used as a case study; instances that range from 343 to 3056 cities were used to show that the fuzzy logic approach produces a higher percentage of successful reductions.

#### 1. Introduction

In the fields of mathematics and computer science, one of the most challenging problems is searching for optimal arrangements of elements within hundreds of thousands of possibilities. Collectively, these problems are known as Combinatorial Optimization Problems (COPs), and their importance lies in the large diversity of real-life problems in the industry and engineering sectors that can be solved by studying them. They are challenging because the direct approach of evaluating every possibility and choosing the best one becomes infeasible: the number of possible combinations grows so quickly with the instance size that even modestly sized instances are beyond the capabilities of today's most powerful supercomputers. Therefore, a great deal of research has been invested in developing faster and better algorithms for solving COPs [1].

One of the best-known COPs is the Traveling Salesman Problem (TSP), whose popularity and importance can be attributed to its simple definition combined with the high complexity of solving it, making it an ideal research test problem. Additionally, many scientific, industrial, and commercial applications can be analyzed in an analogous way. The TSP consists in finding the shortest tour that a traveling salesman must take through a finite number of cities, starting and ending in the same city. The TSP is known to be NP-hard [2]; mathematical problems related to it were treated by W. R. Hamilton and Thomas Kirkman in the 1800s [3], and it was first formulated as a mathematical problem in 1930 by Karl Menger. Numerous approaches to solving it have been published [4, 5], from which two main categories can be identified: *exact solutions* and *approximate approaches* [6].

The most straightforward *exact solution* algorithm is brute force: generate every possible tour and choose the best one. This approach is not feasible for larger TSP instances; thus, better-performing algorithms were proposed. One of the best proposals was published under the title “Implementing the Dantzig-Fulkerson-Johnson algorithm for large traveling salesman problems” [7, 8], which was an early description of Concorde, the computer program solver for the symmetric TSP [9]. A modified version of Concorde’s algorithm was used to solve instances of up to 85,900 cities optimally [10]. Other approaches exist, such as the “Branch-and-Bound/Cut” method, which was described in [11] in 1958 and applied to the TSP in 1963 in [12].
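To make the cost of the brute-force approach concrete, the following minimal sketch enumerates every tour of a small Euclidean instance. The function name `brute_force_tsp` and the representation of cities as `(x, y)` tuples are assumptions made for illustration; they do not come from the cited works. With the first city fixed, $(n-1)!$ orderings are examined, which is why the approach is only usable for very small instances.

```python
from itertools import permutations
from math import dist, inf


def brute_force_tsp(cities):
    """Return (best_length, best_tour) by enumerating every tour.

    `cities` is a list of (x, y) tuples. City 0 is fixed as the start
    so that rotations of the same tour are not enumerated again.
    """
    n = len(cities)
    best_length, best_tour = inf, None
    for rest in permutations(range(1, n)):
        tour = (0, *rest)
        # Closed tour: the modulo wraps the last city back to the first.
        length = sum(
            dist(cities[tour[i]], cities[tour[(i + 1) % n]])
            for i in range(n)
        )
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour
```

For the four corners of a unit square the shortest tour follows the perimeter; ten cities already require 362,880 orderings, and twenty cities more than $10^{17}$.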

The search for exact solutions is not always the best course of action when trying to solve a problem, and this is where *approximate approaches* take the lead. Based on heuristics, these types of algorithms do not guarantee that the optimum solution will be found; however, they provide suboptimal results that can be “good enough” for the task at hand. One of the best-known approximate algorithms was published as “Polynomial Time Approximation Schemes for Euclidean Traveling Salesman and Other Geometric Problems” [13]. Other heuristics that have been adapted to solve the TSP include the following: Lin-Kernighan [14, 15], Tabu Search [16, 17], Evolutionary Algorithms [18–20], Ant Colony Optimization [21–23], Bee Colony [24], Neural Network Algorithms [25, 26], Memetic Approaches [27, 28], and hybrid strategies such as Neural Networks [29] and Memetic Algorithms [30] plus Lin-Kernighan local optimization.

Unlike the above proposals, the goal of this work is not to provide new solving algorithms but to allow current proposals to perform better, by providing an operator that performs a systematic selection, removal, optimization, and reconstruction of TSP instances to reduce the problem size. The Reduce-Optimize-Expand (ROE) method was conceived in [31]; here, we contribute to the method by providing a better selection strategy (operator) than the existing ones, which includes the use of adaptive heuristics. This produces a higher percentage of reductions that are part of the global optimum. With the ROE method, we can produce faster and higher-quality solutions and, in some cases, provide solutions that were previously impossible to obtain due to algorithm or hardware limitations; now, using the fuzzy operator with adaptive heuristics, it is possible to improve the quality of the solutions further. We present a series of comparative experiments on instances from the well-known TSPLIB that showcase the improvement of the fuzzy selection strategy over those previously published.

The main contributions of this work can be summarized as follows: a general method to reduce the size of TSP instances that uses a rule-based Fuzzy Inference System (FIS), and a procedure to prepare TSP instances for evaluation by the FIS; this includes the creation of custom-made linguistic variables based on TSP instance metadata and the logical rules that drive the FIS. These contributions improve the ROE methodology [31] by providing a higher quantity of reductions that are part of the global optimum.

This paper is organized as follows. Section 2 presents a summary of the state of the art of this extensive topic, focusing on the TSP. Section 3 provides a mathematical overview of the TSP. Section 4 describes the proposed methodology and its inclusion in the ROE method; it explains the fuzzy logic classifier with adaptive heuristics, which is the core of this work, and the use of advanced metadata. In Section 5, a detailed case study is explained with the aid of plots. Section 6 provides experimental results for ten TSP instances from the TSPLIB. Finally, Section 7 provides the conclusions and future work.

#### 2. Related Work

Research has been conducted to solve the scalability issues associated with COPs in different ways. One of the most relevant trends, which has proved to be very effective, is to utilize modern hardware architectures to provide an additional degree of scalability through high levels of parallelism. With the increased access to such technologies, mainly through the introduction of mainstream multicore CPU computing and, more recently, GPU-oriented computing, a significant amount of effort has shifted to developing algorithms that properly use these new paradigms. Researchers have shown that revisiting classic algorithms and adding parallelism to them is a very effective strategy for solving COPs. One example is the work in [32], where the authors revisited the design and implementation of Branch-and-Bound algorithms for solving large COPs on GPU-enhanced multicore machines; in [33], a high-performance GPU implementation of the 2-opt and 3-opt local search algorithms was presented. Other metaheuristics that have been enhanced using GPUs are Ant Colony Optimization [34, 35] and Genetic Algorithms [36, 37].

The systematic breakdown of COPs has also proven to be an effective strategy. Using divide-and-conquer strategies, algorithms can produce smaller problems that are solved separately and interconnected at a later stage. The challenge lies in providing proper dividing methodologies and the algorithms that interconnect the pieces. Works such as [38] describe the effectiveness of this approach and discuss multiple algorithms for undertaking the task. In [29], a hybrid Neural Network with local search via a modified Lin-Kernighan algorithm was introduced to tackle a million-city TSP. Although the work is very successful in allowing a higher degree of division of this huge TSP instance, thus enabling more scalability, an inherent issue appears in its results: *faster solution with lower quality*.

In the ROE method [31], a proposal to reduce the size of TSPs was presented; the main concept is to apply systematic reductions to the TSP instance in order to obtain a new instance that is representative of the original but with the added benefit that the TSP solving algorithms would require fewer resources to produce solutions. After it has been solved, the last stage of the ROE methodology is to reconstruct the original instance, giving us the final solution in the form of the expanded route. In [39] source code to implement the method is provided. Experimental results reported a 30% to 55% reduction in computational time, and the final solutions’ length remained between 6.48% and 15.30% from the global optimum.

An exhaustive literature review in the Scopus database, ACM Digital Library, IEEE Xplore, Springer link, and Google Scholar was performed, and there was no methodology found that works similarly to what is proposed in this work, that is, a generic methodology that aims to remove elements from the given problem using a Fuzzy Inference System based on metadata and adaptive heuristics. To compare the ROE method using the proposed fuzzy operator, we have surveyed the current state of clustering algorithms, taking a special interest in clustering by fuzzy strategies. ROE can be denoted as a clustering algorithm because it performs the task of grouping via classification, in such a way that the elements share fuzzy characteristics. The idea of using fuzzy classification for clustering has been also considered by other researchers; however, they do not use advanced metadata based on adaptive heuristics, which provides a good degree of intelligence in the grouping process.

Clustering algorithms are usually unable to locate global optima because of an underlying data structure that is difficult to propose optimally. However, important contributions in clustering have been reported, and, similarly to this new approach, their aim is to improve performance. Fuzzy classification has been employed as a powerful tool in clustering algorithms; two important advantages are as follows:

(1) The use of linguistic variables to describe important characteristics and concepts about the elements in the problem.

(2) Classification based on knowledge represented through fuzzy rules extracted from human expertise.
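As an illustration of these two ingredients, the sketch below defines a linguistic variable with three terms and evaluates one fuzzy rule. Everything in it is an assumption made for illustration: the triangular term shapes, the normalised $[0, 1]$ scale, and the single rule are hypothetical and are not the FIS used in this work. The variable names Proximity and Secludedness are borrowed from the metadata described in Section 4.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms on a normalised [0, 1] scale.
# The feet outside [0, 1] make "low" and "high" peak at the ends.
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(value):
    """Degree of membership of `value` in each linguistic term."""
    return {name: triangular(value, *abc) for name, abc in TERMS.items()}


def vaccine_degree(proximity, secludedness):
    """Firing strength of one illustrative rule (an assumption):

    IF proximity is high AND secludedness is high THEN edge is a vaccine.
    The AND connective is modelled with the minimum.
    """
    p = fuzzify(proximity)
    s = fuzzify(secludedness)
    return min(p["high"], s["high"])
```

A rule base of this kind encodes expert knowledge in readable form; a complete system would combine several rules and defuzzify the result, which this sketch omits.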

By surveying fuzzy logic applied to COPs, we found a clear tendency to use fuzzy C-means clustering techniques as the main form of grouping. From the year 2004 to the present, the consulted digital databases contain around 40 articles that match the search terms regarding fuzzy clustering and applications for the TSP. The keywords chosen were TSP, COP, fuzzy clustering, fuzzy sets, and classification, among different variations of them. A selection of the surveyed articles was carried out. The following works are of special interest because they are the most suitable for comparison with the proposed methodology; however, they are still substantially different because their grouping method and final purpose are dissimilar.

Focusing only on clustering-based algorithms, the other methodologies usually group all the elements that fall within certain decision criteria and then calculate the centroid of that group, this being the first step in the optimization process. In contrast, ROE with the proposed fuzzy operator classifies segments of the problem instance, and a centroid is calculated for those segments that are chosen to be removed; the segments may or may not be close to each other. This concept of classification and grouping of elements is one of the main contributions of this work.

Now, focusing on three-step methodologies for solving the TSP, the work presented in [40] proposes using a Genetic Algorithm (GA) based on unsupervised fuzzy clustering. In the first step, the cities in the problem are divided into several subtours using a clustering algorithm. In the second step, each partition of cities is considered a smaller TSP and is solved using a GA, obtaining optimal subtours of the cities for each partition. In the third step, the subtours are connected in an appropriate way to obtain an optimized tour. A fourth step is then required, since the final tour needs to be improved by the GA. The ROE method with the fuzzy operator, on the other hand, has three well-defined steps and, unlike the above method, does not depend on a particular method to achieve the optimization. Instead of working with independent subtours, which most of the time inevitably guides the optimization process to suboptimal solutions, ROE works with the whole problem at the same time. In the first step, the problem’s original size is reduced to a convenient size using a fuzzy classifier based on advanced metadata and heuristics. In the second step, the reduced problem can be optimized by any state-of-the-art method. In the third step, the optimized tour is methodically adapted to the original-size problem, producing better solutions.

In [41], a hybrid metaheuristic for solving the TSP is presented; it is a fused method that combines a GA and an adaptive fuzzy greedy search operator. In [42], the individuals of the population of a GA are grouped using a fuzzy clustering approach; hence, the reduction and the optimization stages are also fused. Contrary to these approaches, in ROE the optimization process is not blended with the method, which gives it great flexibility, since it does not require modifying the inner workings of the selected optimization algorithm.

Fuzzy c-means clustering is the most popular technique of all; twenty of the surveyed articles use this grouping method. In [43], a hybrid evolutionary fuzzy learning algorithm that automatically determines the near-optimal traveling path in large-scale traveling salesman problems is presented. It describes the steps required to identify, group, solve, and connect clusters of cities. The key difference in the aforementioned work is that the cities of the TSP are never removed; thus, the problem size remains the same. Other works that apply fuzzy clustering in a similar manner can be found in [42, 44, 45].

Hybrid approaches besides Evolutionary Algorithms exist. Although these types of approaches do not share as many characteristics as the ones discussed previously, it is important to note that fuzzy clustering can be applied in different ways. In [46], a fuzzy self-organizing map for an artificial Neural Network is presented.

Other approaches come in the form of fixing edges [47] and they provide significant improvements to the solving of the TSP instances as well. However, these approaches require additional local optimization as a preprocess.

#### 3. Mathematical Formulation of the TSP

The TSP is a highly studied problem for testing and benchmarking algorithms. Mathematically, the TSP can be expressed as an edge-weighted, directed graph $G = (V, E, d)$, where $V = \{1, 2, \ldots, n\}$ is the set of vertices (cities), $E \subseteq V \times V$ is the set of (directed) edges (paths between two cities), and $d : E \rightarrow \mathbb{R}^{+}$ is a *distance function* assigning each edge $(i, j) \in E$ a distance $d(i, j)$ [10, 48].

A tour is a *Hamiltonian cycle* that consists of a single unique permutation $\pi$ of all vertices in $V$, where the first and last elements of the permutation are the same, and $\Pi$ is the set of all tours $\pi$. By interpreting $\pi(i)$ as the city visited after city $i$, for $i = 1, \ldots, n$, the cost of a tour can be written as

$$C(\pi) = \sum_{i=1}^{n} d(i, \pi(i)).$$

Thus, the TSP is defined as a set of $n$ cities $\{c_1, c_2, \ldots, c_n\}$ where, for each pair of distinct cities $\{c_i, c_j\}$, there exists a distance $d(c_i, c_j)$; the goal is to find the ordering $\pi$ of the cities, in which $\pi(i)$ is read as the $i$th city of the ordering, that minimizes the quantity

$$\sum_{i=1}^{n-1} d\left(c_{\pi(i)}, c_{\pi(i+1)}\right) + d\left(c_{\pi(n)}, c_{\pi(1)}\right).$$

This quantity is referred to as the tour length since it is the length of the tour that a salesman would make when visiting the cities in the order specified by the permutation, returning at the end to the initial city. For two-dimensional problems, the vertices are points in the plane, $c_i = (x_i, y_i)$, and $d$ is the Euclidean distance [31] given by

$$d(c_i, c_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}.$$

Finally, solving the TSP consists in producing a tour that forms a closed route with no repeated nodes, except for the first and last node, which are the same, and whose length is minimal. This task is not easy to satisfy, especially as the number of cities increases: obtaining a tour with minimal cost for a large number of cities is classified as an NP-hard problem, and no polynomial-time algorithm is known for problems of this kind. For this very reason, there is a special interest in developing and using methodologies that limit the search space and then utilize a metaheuristic.
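The tour length and the tour validity condition described in this section can be sketched in a few lines. The helper names `euclidean`, `tour_length`, and `is_valid_tour`, as well as the zero-based city indices, are assumptions made for illustration.

```python
from math import hypot


def euclidean(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return hypot(a[0] - b[0], a[1] - b[1])


def tour_length(cities, order):
    """Length of the closed tour that visits `cities` in `order`
    and then returns to the first city of the ordering."""
    n = len(order)
    return sum(
        euclidean(cities[order[i]], cities[order[(i + 1) % n]])
        for i in range(n)
    )


def is_valid_tour(order, n):
    """True if `order` visits each of the n cities exactly once."""
    return sorted(order) == list(range(n))
```

For example, for the cities `(0, 0)`, `(3, 0)`, and `(3, 4)` visited in that order, the tour length is $3 + 4 + 5 = 12$.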

Many different variations of the TSP exist in the literature [6]; however, three of them stand out as the most widely studied and used: the symmetric Traveling Salesman Problem (sTSP), where the distances must always satisfy $d(c_i, c_j) = d(c_j, c_i)$; the asymmetric Traveling Salesman Problem (aTSP), in which $d(c_i, c_j) \neq d(c_j, c_i)$ for at least one pair of cities, with $1 \le i \le n$ and $1 \le j \le n$ in both cases; and, finally, the Multitraveling Salesman Problem (mTSP), where $m$ salesmen are deployed from the starting city with the objective of finding a tour for each salesman such that each city is visited exactly once by only one salesman and the cost of the tours is minimized.
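The distinction between the symmetric and asymmetric variants can be checked directly on a distance matrix. The sketch below assumes the instance is given as a square list of lists; the function name `is_symmetric` and the tolerance parameter are hypothetical.

```python
def is_symmetric(dist_matrix, tol=1e-9):
    """True if d(i, j) equals d(j, i) for every pair of cities (sTSP);
    False if at least one pair differs (aTSP)."""
    n = len(dist_matrix)
    return all(
        abs(dist_matrix[i][j] - dist_matrix[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )
```

Euclidean instances are always symmetric; asymmetric matrices arise, for example, when travel costs depend on direction.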

Each variation of the TSP requires a different mathematical formulation and consideration. Since the sTSP is the most cited variation [7], with specialized web pages such as the TSPLIB where the most recent advances and optimal results for different instances are summarized, we focused this proposal on this variation of the TSP, with the aim of having many examples for comparison with the latest results obtained by state-of-the-art algorithms. From this point forward, when speaking of the TSP, we are referring to the sTSP.

#### 4. The VFS with Adaptive Heuristics in the ROE Method

The ROE method [31] produces a temporarily smaller COP that represents the original problem. The purpose is to help COP-solving algorithms perform faster while providing high-quality solutions. For huge problems, where no other existing method can provide a solution, the ROE method makes it possible to obtain one, provided that the problem can be effectively reduced to a size manageable by the optimizing algorithms. The first reduction operators of ROE were inspired by Artificial Immune Systems (AIS) [49–51], specifically by artificial vaccination [52]; for this reason, the reduction operators in this method are called vaccines. Two main operators for selection were reported in [31]: Vaccination by Random Selector (VRS) and Vaccination by Elitist Selector (VES). In this paper, the operator Vaccination by Fuzzy Selector with adaptive heuristics (VFSah) is proposed.

The ROE method consists of three steps, which are as follows:

(i) *Reduction*. In this first step, the TSP instance is analyzed to select the nodes to be removed; it is the most important and most researched step. Many considerations are needed to guarantee that the reductions made to the instance actually form part of the optimal solution and that the reduced instance still represents the original TSP instance. The reduction step has the greatest influence on the outcome of the whole methodology. A selection or decision criterion is required to determine which candidates are chosen to be removed. In our program, a .tsp file is created to save the new reduced TSP instance. This stage requires the following actions:

(1) *Loading the TSP Instance and Representation*. The instance data is loaded into memory. Each city is represented by its coordinates. An example of this representation is shown in Figure 1(a).

(2) *Mesh Generation*. The mesh consists of a series of connections (edges) between nodes. Obtaining multiple paths from one node to the rest is desirable to give the fuzzy classifier more options; however, a large number of nodes can slow down algorithms due to memory and computing limitations. The simplest mesh can be created by calculating the full distance matrix for the particular TSP instance; nevertheless, this is not the best choice because it runs into the memory and computing limitations just mentioned. A better alternative is to create a distance matrix only to the nearest nodes, which reduces the size considerably. An example of a mesh is illustrated in Figure 1(b).

(3) *Advanced Metadata Generation*. This consists in calculating a metadata value for each edge of the mesh; this value contains both the Proximity and Secludedness values. Figure 1(c) shows a representation of each edge with its assigned value.

(4) *Using the Operator VFSah*. The VFSah is used to classify each of the edges in the mesh as vaccine or not vaccine; that is, the idea is to identify which edges can be removed from the original TSP. This classification is represented using colors in Figure 1(d): the edges chosen to be removed (vaccines) are shown in green, those that cannot be classified as vaccines in yellow, and those that will remain as part of the instance in red. The resulting reduced TSP instance is shown in Figure 1(e).

(ii) *Optimize*. The new reduced instance is ready to be solved by any state-of-the-art specialized optimizing algorithm. This is one of the most important advantages of the ROE methodology, since no change to the optimizing algorithm is needed. The resulting tour is used in the next step. Figure 1(f) shows a reduced TSP instance that has been solved by a TSP solving algorithm as usual. The TSP solving algorithm is expected to generate an output file in the form of a “.tour” file. In this paper, we will not discuss TSP solving algorithms.

(iii) *Expand*. The expand step takes the solution from the previous step and reconstructs the tour by returning the nodes removed in the reduction step. Local optimization can be used to create a better reconstruction when connecting the reduced nodes and paths. This step is critical because it provides the solution to the original TSP instance. Figure 1(g) illustrates this last step; it consists in the expansion of the previously generated tour by including the originally removed sections. This step requires the solution generated by the TSP solving algorithm, the original TSP instance data, and the reduced TSP instance; the final tour is expressed in terms of the original TSP instance, thus providing the real tour length.
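The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation provided in [39]: the neighbourhood size `k`, the replacement of each removed edge by the midpoint (centroid) of its endpoints, and the greedy orientation used during expansion are assumptions, and `pick_shortest_disjoint` is a placeholder selector that stands in for VFSah rather than reproducing it.

```python
from math import dist


def nearest_neighbour_mesh(cities, k):
    """Undirected edges (i, j), i < j, linking each city to its
    k nearest neighbours (assumed mesh construction)."""
    edges = set()
    for i, ci in enumerate(cities):
        others = sorted(
            (j for j in range(len(cities)) if j != i),
            key=lambda j: dist(ci, cities[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def pick_shortest_disjoint(cities, edges, count):
    """Placeholder selector (NOT VFSah): greedily take up to `count`
    of the shortest edges that share no endpoint."""
    chosen, used = [], set()
    for i, j in sorted(edges, key=lambda e: dist(cities[e[0]], cities[e[1]])):
        if len(chosen) == count:
            break
        if i not in used and j not in used:
            chosen.append((i, j))
            used.update((i, j))
    return chosen


def reduce_instance(cities, vaccines):
    """Replace each selected edge by the midpoint of its endpoints.

    `vaccines` must be node-disjoint. Returns the reduced city list and,
    for each reduced city, the original indices it stands for.
    """
    used = {v for edge in vaccines for v in edge}
    reduced, groups = [], []
    for i, c in enumerate(cities):
        if i not in used:
            reduced.append(c)
            groups.append((i,))
    for i, j in vaccines:
        (xi, yi), (xj, yj) = cities[i], cities[j]
        reduced.append(((xi + xj) / 2, (yi + yj) / 2))
        groups.append((i, j))
    return reduced, groups


def expand_tour(cities, groups, reduced_tour):
    """Map a tour over the reduced instance back to the original cities.

    Each restored edge is entered at the endpoint closer to the
    previously visited city (a simple greedy choice).
    """
    tour = []
    for r in reduced_tour:
        group = groups[r]
        if len(group) == 2 and tour:
            a, b = group
            prev = cities[tour[-1]]
            if dist(prev, cities[b]) < dist(prev, cities[a]):
                group = (b, a)
        tour.extend(group)
    return tour
```

In use, the reduced city list returned by `reduce_instance` would be handed to any TSP solver, and the solver's tour over the reduced indices would then be passed to `expand_tour` to recover a tour over the original cities.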