Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 3851520, 11 pages

http://dx.doi.org/10.1155/2016/3851520

## Feasible Initial Population with Genetic Diversity for a Population-Based Algorithm Applied to the Vehicle Routing Problem with Time Windows

Research Center in Engineering and Applied Sciences, Autonomous University of Morelos State, Avenida Universidad 1001, Colonia Chamilpa, 62209 Cuernavaca, MOR, Mexico

Received 23 August 2015; Accepted 17 January 2016

Academic Editor: Panos Liatsis

Copyright © 2016 Marco Antonio Cruz-Chávez and Alina Martínez-Oropeza. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A stochastic algorithm for obtaining feasible initial populations for the Vehicle Routing Problem with Time Windows is presented. The theoretical formulation for the Vehicle Routing Problem with Time Windows is explained. The proposed method is primarily divided into a clustering algorithm and a two-phase algorithm. The first step is the application of a modified *k*-means clustering algorithm which is proposed in this paper. The two-phase algorithm evaluates a partial solution to transform it into a feasible individual. The two-phase algorithm consists of a hybridization of four kinds of insertions which interact randomly to obtain feasible individuals. It has been proven that different kinds of insertions impact the diversity among individuals in initial populations, which is crucial for population-based algorithm behavior. A modification to the Hamming distance method is applied to the populations generated for the Vehicle Routing Problem with Time Windows to evaluate their diversity. Experimental tests were performed based on the Solomon benchmarking. Experimental results show that the proposed method facilitates generation of highly diverse populations, which vary according to the type and distribution of the instances.

#### 1. Introduction

The Routing Problem is one of the most important and widely studied combinatorial problems focused on distribution, logistics, and transportation systems. It is important to note that there is great variation in real problems. Each company has a different problem with specific features that makes it unique. According to its importance and hardness, several variants of the Routing Problem have been proposed to provide approaches for more realistic problems. Variants of the Routing Problem are defined by theoretical mathematical models that focus on general problems with specific features. Due to its high complexity and similarities with real problems, the Vehicle Routing Problem with Time Windows (VRPTW) is one of the most studied models.

The VRPTW can be described as a set of identical vehicles that have to serve a set of customers. Each customer has a fixed distance, a defined service time, a specific demand, and a time window during which it must be served. In addition, each vehicle has a limited capacity. A *route* is defined as the ordered set of customers served by a single vehicle, which starts and finishes at the depot. The objective is to find a solution that minimizes the total travel cost. The cost of going from one customer to another is the Euclidean distance between them.

The VRPTW is classified by Complexity Theory as NP-Complete [1–3]. No deterministic polynomial-time algorithm is known that solves this problem, so heuristic methods are needed to tackle problems in this class.

Although several heuristics and metaheuristics have been applied to the VRPTW in an attempt to develop efficient distribution strategies according to the constraints, this work is focused on the most widely used population-based methods: evolutionary algorithms [4] and genetic algorithms [5, 6]. Population-based algorithms sometimes suffer from premature convergence due to loss of diversity during the evolutionary process; they become rapidly trapped in local optima. The initial population diversity and the selection of genetic operators are crucial for the optimal performance of these kinds of algorithms [7, 8] because the proper selection improves the exploration and exploitation of the solution space.

Several methods exist for generating feasible solutions to the VRPTW. The most frequently used ones are clustering methods and insertion heuristics [2, 9, 10]. Both of these techniques are taken into account in this work. Some of the recent work related to this problem is presented in [11], where a basic insertion heuristic is implemented to solve the VRPTW with Multiple Routes per Vehicle in order to generate a feasible solution. First, an unrouted customer is selected to be inserted between two adjacent customers of a route, provided that the hard problem constraints are fulfilled. Once a customer is inserted, the route is updated. This process is repeated until all unrouted customers have been inserted into a route. In [12], a sequential insertion heuristic is applied to three variants of the Routing Problem: the Multiple Time Windows Vehicle Routing Problem, the Heterogeneous Fleet Vehicle Routing Problem, and the Double Scheduling Vehicle Routing Problem. The efficiency of the insertion heuristic is evaluated based on Solomon’s and Gehring and Homberger’s benchmarks. This method is very useful for heuristics that require a high-quality initial solution, such as Taboo Search.
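The basic insertion idea described above for [11] can be illustrated with a minimal Python sketch. It is not the implementation used in [11] or in this paper: the function name `insert_unrouted`, the first-fit position scan, the pluggable `is_feasible` predicate, and the rule of opening a new route when no position is feasible are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def insert_unrouted(
    routes: Sequence[Sequence[int]],
    unrouted: Sequence[int],
    is_feasible: Callable[[List[int]], bool],
) -> List[List[int]]:
    """Insert every unrouted customer at the first feasible position found.

    `is_feasible` is a hypothetical predicate standing in for the hard
    constraints of the problem (capacity, time windows, ...).
    """
    result = [list(r) for r in routes]  # work on copies, leave the input intact
    for customer in unrouted:
        placed = False
        for route in result:
            # Try every gap in the route, including both ends.
            for pos in range(len(route) + 1):
                candidate = route[:pos] + [customer] + route[pos:]
                if is_feasible(candidate):
                    route[:] = candidate  # accept the insertion, update the route
                    placed = True
                    break
            if placed:
                break
        if not placed:
            # Assumption: a route serving a single customer is always feasible.
            result.append([customer])
    return result
```

In this sketch the predicate is evaluated on the whole candidate route, which is simple but repeats work; an implementation aimed at performance would normally update loads and arrival times incrementally.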

According to the literature, a crucial factor in obtaining high-quality solutions in a population-based algorithm is the diversity. There are several methods for obtaining equilibrium between exploration and exploitation of the solution space to get the proper convergence and avoid stagnation behavior. It is well-known that the initial population plays an important role in the convergence of population-based algorithms. High-quality solutions will be obtained with the combination of a sufficiently diverse population and the application of the proper genetic operators that allow diversity to be maintained [13–15]. In addition, several methods exist that inject diversity into the population. One of the most relevant of these methods is the DDCGA (Dynamic Diversity Control Genetic Algorithms) proposed by Chang et al. [16]. This method is a chromosome control mechanism that generates a set of artificial chromosomes with high diversity and injects them into the population to effectively maintain the diversity during the evolutionary process. Anselmo and Pinheiro [17] presented an interesting work which is not applied to Routing Problems but is focused on the construction of phylogenetic trees based on the evaluation of Hamming distance among DNA sequences. They proposed an Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and neighbor joining (NJ). This method is applied to two binary data sets based on sequences of mitochondrial DNA from humans, chimpanzees, gorillas, and orangutans. The statistical properties of the DNA sequences are compared using Hamming distance to construct the phylogenetic tree based on their differences. In addition, this method has the capacity to efficiently deal with multiple and different numbers of sequences per group. The relevance of this work to the current paper is that the UPGMA examines several binary sequences. 
These DNA sequences are studied to identify which of them are related to one another, in order to construct a phylogenetic tree. The idea of using Hamming distance to identify relationships among compound data sets with several subgroups is adopted in this paper. The evaluation of dissimilarities allows for identification of the need to increase the diversity in the nonbinary initial population. The similarities help identify identical individuals, which could negatively affect the algorithm’s behavior. More recently, Yang and Li [18] presented another relevant method called the Autoenhanced Population Diversity (AEPD). This method identifies when the population should be rediversified based on its convergence or stagnation and reinitializes individuals to enhance population diversity.

Based on the previously explained work, some modifications of the classical *k*-means algorithm are implemented in this research. Those features are described in Section 3. The contribution of this paper is focused on the generation of feasible solutions from the partially feasible solutions obtained by the clustering method, where no time constraints have been evaluated yet. Therefore, a two-phase algorithm is proposed to evaluate the time windows and generate highly diverse initial populations. The first phase corresponds to an initial evaluation of time constraints, discarding those customers with time violations. The second phase implements a hybrid insertion heuristic to ensure the feasibility of the solutions. According to experimental results, the hybridization performed in this phase favors the diversity among solutions.

According to the literature, diversity is one of the crucial features of a population, because the convergence of a population-based algorithm depends on it. There are various methods for calculating diversity. The most widely used are the Levenshtein distance [19] and the Hamming distance [20].

The Hamming distance is the most common diversity measure for comparisons of categorical sequences and binary sequences [21]. Although there are several ways to apply this method to optimization problems, in this work a modification was made. The Hamming distance was applied, but it was necessary to adapt the method to the solution structure that the VRPTW represents. A computational method is proposed based on this methodology.
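As a point of reference, the classical Hamming distance between two equal-length sequences counts the positions at which they differ. The Python sketch below shows only this classical definition, not the modified measure proposed in this work; the helper `mean_pairwise_hamming` is a hypothetical summary statistic added for illustration and is an assumption, not the diversity measure of the paper.

```python
from itertools import combinations
from typing import Hashable, Sequence


def hamming(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Classical Hamming distance: number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(population: Sequence[Sequence[Hashable]]) -> float:
    """Average Hamming distance over all unordered pairs of individuals.

    Assumption: used here only as a simple illustrative diversity summary.
    """
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)
```

A distance of zero between two individuals identifies duplicates. The equal-length requirement is one reason the classical measure does not apply directly to VRPTW solutions, whose routes can differ in number and length.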

The present work is organized as follows. Section 2 provides a description of the Vehicle Routing Problem with Time Windows. Section 3 describes the clustering problem and the implemented clustering algorithm. Section 4 explains the two-phase algorithm proposed in this work. Section 5 introduces the classical Hamming distance method. It also presents the modifications, methodology, and algorithm proposed for calculating the diversity among solutions in a population for the undertaken problem. Section 6 shows the obtained experimental results based on the well-known Solomon benchmarking of 100 customers. Finally, the conclusions and future research direction are presented.

#### 2. Vehicle Routing Problem with Time Windows

The Vehicle Routing Problem with Time Windows (VRPTW) is one of the most studied problems due to its wide application in industry and its economic significance. It is important because it involves the main features of the classical problems of transportation, distribution, and logistics. Therefore, the focus of this paper is on the model proposed by Paolo and Daniele [2]. The VRPTW is considered to be a hard problem and is classified by Complexity Theory as NP-Complete [1–3].

The VRPTW can be formally described on a directed graph with a vertex set and an edge set. The vertex set consists of customers distributed in a two-dimensional space, where each customer has a defined service time, a specific demand, and a time window in which the customer must be served. Each customer must be served by at most a single vehicle. Each vehicle belongs to a homogeneous fleet of identical vehicles with limited capacity. The ordered sequence of customers visited by a vehicle is known as a *route*; a single route is exclusive to a single vehicle. Each route must start and finish at the depot (identified as 0 at the beginning of a route and by a second index at the end) within the time window of the depot.

The objective is to minimize the number of routes and the total travel cost of the solution. The cost of a route is given by the sum of the Euclidean distances of the edges between consecutive customers served by a single vehicle. Therefore, the total travel cost corresponds to the sum of the costs of the routes involved in a solution. According to this description, the integer linear programming model of the VRPTW [2] can be formulated as shown below.
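The cost computation described above can be sketched in a few lines of Python. The function names, the `coords` dictionary mapping each node to its planar coordinates, and the convention that the depot is node 0 at both ends of a route are assumptions chosen for this illustration.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def route_cost(route: Sequence[int], coords: Dict[int, Point], depot: int = 0) -> float:
    """Length of depot -> customers in order -> depot."""
    path = [depot] + list(route) + [depot]
    return sum(euclidean(coords[u], coords[v]) for u, v in zip(path, path[1:]))


def solution_cost(routes: Sequence[Sequence[int]], coords: Dict[int, Point], depot: int = 0) -> float:
    """Total travel cost: sum of the costs of all routes in the solution."""
    return sum(route_cost(r, coords, depot) for r in routes)
```

For example, with the depot at (0, 0) and customers 1 and 2 at (3, 4) and (3, 0), the single route [1, 2] costs 5 + 4 + 3 = 12, whereas serving each customer on its own route costs 10 + 6 = 16.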

*Integer Linear Programming Model of the VRPTW [2]*. Consider

The goal of the objective function (1) is to minimize the total cost of the solution by attempting to reduce the number of routes. The constraint set in (2) ensures that each customer is visited by at most a single vehicle. The constraint set in (3) ensures that only one customer connects directly to the depot at the beginning of a route. The inequality set in (4) ensures that the number of vehicles arriving at a customer is the same as the number leaving it. The constraint set in (5) ensures that only one customer connects to the depot at the end of a route. The constraint set in (6) guarantees the feasibility of a solution. The constraint set in (7) ensures that the time constraints of each customer are met. The constraint set in (8) guarantees that all of the routes are performed within the time window of the depot. The constraint set in (9) ensures that the sum of customer demands assigned to a specific vehicle does not exceed its maximum capacity. The constraint sets in (10) and (11) guarantee the nonnegativity of variables and define the formulation as a binary integer linear programming model.
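The capacity and time-window requirements described above can be checked for a single route with a short Python sketch. It is an illustration under stated assumptions rather than a transcription of constraints (1)–(11): travel time is taken to be equal to the Euclidean distance, a vehicle that arrives early waits until the window opens, arrival after the window closes is infeasible, and all parameter names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def route_is_feasible(
    route: Sequence[int],
    coords: Dict[int, Point],
    demand: Dict[int, float],
    ready: Dict[int, float],    # earliest service start per node
    due: Dict[int, float],      # latest service start per node
    service: Dict[int, float],  # service duration per node (0 for the depot)
    capacity: float,
    depot: int = 0,
) -> bool:
    """Check vehicle capacity and hard time windows along one route."""
    # Capacity: total demand on the route must fit in one vehicle.
    if sum(demand[c] for c in route) > capacity:
        return False

    time = ready[depot]  # the vehicle leaves when the depot window opens
    prev = depot
    for node in list(route) + [depot]:  # the return to the depot is checked too
        # Assumption: travel time equals Euclidean distance.
        time += math.hypot(coords[prev][0] - coords[node][0],
                           coords[prev][1] - coords[node][1])
        time = max(time, ready[node])   # wait if the vehicle arrives early
        if time > due[node]:            # too late: window already closed
            return False
        time += service[node]
        prev = node
    return True
```

The order of visits matters: the same set of customers can be feasible in one sequence and infeasible in another, which is why insertion position is evaluated explicitly in insertion heuristics.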

#### 3. Clustering Algorithm

Clustering is an important and difficult problem classified as NP-Hard [12], with many applications in different areas. In Combinatorial Optimization, it is applied to difficult problems based on the “*Divide and Conquer*” paradigm, where the problem is divided into segments or subsets [22]. Each subset groups together similar data, separating it by feature into different subsets so that a large problem can be approached as smaller independent subproblems. Formally, a clustering problem can be described as a problem divided into subsets. Therefore, a solution is composed of data subsets that can be handled independently.

Several clustering methods exist [23] that can be applied to Routing Problems, but one of the most well-known and widely used is *k*-means [24]. This method evaluates a set of data distributed in a Euclidean space, from which *k* disjoint subsets of elements have to be created according to their data features, where *k* is a known value at the beginning of the process. The first step is to determine the starting points, called *centroids*. A *centroid* is a point in space. Each centroid groups together a set of elements so as to minimize the mean squared Euclidean distance from each element to its nearest centroid. Despite its popularity, this method has advantages and disadvantages [8], which are listed below.

Advantages are as follows:

- It is simple and fast.
- It is relatively flexible and efficient.
- It can process values with good geometrical and statistical meaning.
- It has relatively good results for convex clusters.

Disadvantages are as follows:

- The number of clusters must be known at the beginning.
- It is sensitive to the starting point (selection of the first centroid).
- It tends to produce clusters with similar form and density.
- Centroids and distances must be recalculated repeatedly.
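For reference, the classical *k*-means procedure discussed above can be sketched in plain Python. This is the textbook iteration, not the modified algorithm proposed in this paper; the function name, the random choice of initial centroids from the data, the fixed seed, and the iteration cap are assumptions made for the example.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, iterations: int = 100,
           seed: int = 0) -> Tuple[List[int], List[Point]]:
    """Classical k-means: alternate assignment and centroid update."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids: List[Point] = [tuple(p) for p in rng.sample(list(points), k)]
    labels: List[int] = [0] * len(points)

    def nearest(p: Point) -> int:
        # Index of the centroid closest to p (ties go to the lowest index).
        return min(range(k), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated: List[Point] = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return labels, centroids
```

The sketch makes two of the listed disadvantages visible: `k` must be supplied by the caller, and the result depends on the initial centroids drawn through `seed`.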

Based on the features of *k*-means [23] and the known desirable features of clustering methods [25], some modifications were applied to the clustering method to fit the proposed mathematical model shown in (1)–(11). The algorithm is shown in Algorithm 1.