Scientific Programming

Volume 2016, Article ID 1682925, 11 pages

http://dx.doi.org/10.1155/2016/1682925

## Automatically Produced Algorithms for the Generalized Minimum Spanning Tree Problem

^{1}DEI, University of Bologna, Viale Risorgimento 2, 40136 Bologna, Italy

^{2}Departamento de Ingeniería Informática, Universidad de Santiago de Chile, 3659 Avenida Ecuador, 9170124 Santiago, Chile

Received 2 December 2015; Revised 2 February 2016; Accepted 16 February 2016

Academic Editor: Frédéric Saubion

Copyright © 2016 Carlos Contreras-Bolton et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The generalized minimum spanning tree problem consists of finding a minimum cost spanning tree in an undirected graph whose vertices are divided into clusters. Such a spanning tree includes exactly one vertex from each cluster. Despite the diverse practical applications of this problem, its NP-hardness continues to be a computational challenge. Good quality solutions for some instances of the problem have been found by combining specific heuristics or by including them within a metaheuristic. However, the combinations studied so far correspond to only a subset of all possible combinations. This study presents a technique based on a genotype-phenotype genetic algorithm that automatically constructs new algorithms for the problem, each containing a combination of heuristics. The produced algorithms are competitive in terms of the quality of the solution obtained, as shown by comparing their performance with problem-specific heuristics and with metaheuristic approaches.

#### 1. Introduction

Determining the minimum cost spanning tree in a graph is a problem with various applications in operations planning and management. It is known as the minimum spanning tree problem (MSTP) and consists of finding a tree of minimum cost that spans all vertices of a graph; it can be solved in polynomial time [1, 2]. However, several extensions of the MSTP are generally NP-hard, so no polynomial-time algorithm is known for them [3]. One such extension is the generalized minimum spanning tree problem (GMSTP), which remains a considerable challenge because it is NP-hard [4]. It has recently received much attention not only because of its difficulty but also because of its diverse applications. Dror et al. [5] study the design of a minimum-length irrigation network for agriculture in desert environments, while Myung et al. [4] consider decision support for locating public facilities, commercial offices, and distribution centers connected by links such as roads or telecommunications. In the same field, Golden et al. [6] designed backbones in communication networks.

The GMSTP consists of finding a minimum cost spanning tree in an undirected graph, the vertices of which are divided into clusters such that the spanning tree includes exactly one vertex from each cluster. Let $G = (V, E)$ be an undirected graph with $n$ nodes, and let $V_1, V_2, \ldots, V_m$ be a division of $V$ into $m$ subsets called clusters; that is, $V = V_1 \cup V_2 \cup \cdots \cup V_m$ and $V_l \cap V_k = \emptyset$, $\forall l, k \in \{1, \ldots, m\}$, $l \neq k$. The cost of an edge $e = (i, j) \in E$ is denoted by $c_e$, and the cost of a tree is obtained by adding the individual costs of the edges that compose it. Thus, the GMSTP consists of finding a tree of minimal cost that spans exactly one vertex $v_k \in V_k$, $k = 1, \ldots, m$. Different formulations using integer programming have been proposed in the literature [7, 8]. In particular, a formulation that is compact in terms of the number of variables and constraints was proposed by Pop [9, 10] by considering four types of binary variables.
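To make the definition concrete, the following minimal sketch evaluates a tiny GMSTP instance by exhaustive enumeration: it tries every way of picking one vertex per cluster and computes an ordinary minimum spanning tree (Prim's algorithm) on the picked vertices. The instance data and the function names `mst_cost` and `brute_force_gmst` are illustrative assumptions and do not come from the paper; the enumeration is exponential in the number of clusters and is only meant to illustrate the problem statement.

```python
from itertools import product

INF = float("inf")

def mst_cost(vertices, cost):
    """Cost of a minimum spanning tree over `vertices` (Prim's algorithm).

    `cost` maps frozenset({u, v}) to the edge cost; missing edges count as INF.
    """
    vertices = list(vertices)
    in_tree = {vertices[0]}
    total = 0.0
    while len(in_tree) < len(vertices):
        # Cheapest edge that leaves the current tree.
        w, v = min(
            (cost.get(frozenset((a, b)), INF), b)
            for a in in_tree
            for b in vertices
            if b not in in_tree
        )
        in_tree.add(v)
        total += w
    return total

def brute_force_gmst(clusters, cost):
    """Try every selection of one vertex per cluster (tiny instances only)."""
    best = min(product(*clusters), key=lambda sel: mst_cost(sel, cost))
    return best, mst_cost(best, cost)

# Hypothetical instance: 5 vertices in 3 clusters, edges given as (u, v, cost).
clusters = [[0, 1], [2, 3], [4]]
edge_list = [(0, 2, 4), (0, 3, 1), (0, 4, 5), (1, 2, 2),
             (1, 3, 6), (1, 4, 2), (2, 4, 3), (3, 4, 7)]
cost = {frozenset((u, v)): w for u, v, w in edge_list}

selection, total = brute_force_gmst(clusters, cost)
print(selection, total)
```

On this instance the best selection is vertices 1, 2, and 4, joined by the edges 1-2 and 1-4 for a total cost of 4.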

Because of its computational complexity, the GMSTP has been addressed using a variety of approaches. The different mathematical formulations proposed have been unable to solve large problems because long computational time or large memory resources are required [11]. Other approaches to solving the GMSTP are constructive and improvement heuristics. The former provide the solution by adding edges step by step until a generalized minimum spanning tree is constructed, for which adaptations of the well-known polynomial algorithms for the MSTP are used [5, 12].
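As an illustration of a constructive heuristic of this kind, the sketch below adapts Kruskal's algorithm to clustered vertices: edges are scanned in increasing cost order, and an edge is accepted only if it joins two different clusters, respects the vertex already fixed for each cluster, and does not close a cycle between clusters. This is a simplified sketch under these assumptions and is not necessarily the exact variant used in [5, 12]; the function name `kruskal_gmst` and the instance are hypothetical.

```python
def kruskal_gmst(clusters, edges):
    """Greedy Kruskal-style construction for the GMSTP.

    `edges` is an iterable of (cost, u, v). Returns (tree_edges, total_cost),
    or None if the greedy scan cannot connect all clusters.
    """
    if len(clusters) <= 1:
        return [], 0
    cluster_of = {v: k for k, members in enumerate(clusters) for v in members}
    chosen = [None] * len(clusters)      # vertex fixed for each cluster, if any
    parent = list(range(len(clusters)))  # union-find over clusters

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path compression
            k = parent[k]
        return k

    tree, total = [], 0
    for w, u, v in sorted(edges):
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            continue  # edges inside a cluster are never used
        if chosen[cu] not in (None, u) or chosen[cv] not in (None, v):
            continue  # a different vertex already represents this cluster
        ru, rv = find(cu), find(cv)
        if ru == rv:
            continue  # the two clusters are already connected
        parent[ru] = rv
        chosen[cu], chosen[cv] = u, v
        tree.append((u, v))
        total += w
        if len(tree) == len(clusters) - 1:
            return tree, total
    return None

# Hypothetical instance: 5 vertices in 3 clusters, edges given as (cost, u, v).
clusters = [[0, 1], [2, 3], [4]]
edges = [(4, 0, 2), (1, 0, 3), (5, 0, 4), (2, 1, 2),
         (6, 1, 3), (2, 1, 4), (3, 2, 4), (7, 3, 4)]
result = kruskal_gmst(clusters, edges)
print(result)
```

On this instance the greedy scan commits to the cheapest edge 0-3 and ends with a tree of cost 6, whereas selecting vertices 1, 2, and 4 gives a tree of cost 4. The gap shows why such constructive heuristics are usually paired with improvement steps.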

In search of new solutions for the most challenging GMSTP instances, some metaheuristics have been implemented. GRASP, tabu search, and genetic algorithms perform well on larger instances; however, none of them finds the optimal solution for all known large instances. Such instances, with 229 to 783 nodes, were proposed to test the performance of a model based on tabu search [13] in which the current solution is represented by an array of size equal to the number of clusters and the neighbor solutions are randomly generated. When compared with two versions of genetic algorithms [6], the tabu search model of Öncan et al. [13] achieved better or equal solutions. In addition, an adaptive version of GRASP that uses path-relinking and iterated local search improved some of the solution values on other subsets of instances. However, the best known value was not achieved by this approach [11]. Recently, Contreras-Bolton et al. [14] proposed a multioperator genetic algorithm (MGA) that considers two crossover and five mutation operators, three of which are local searches. This allowed the MGA to be competitive with the best algorithms in the literature in terms of both solution quality and computing time.
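The array representation mentioned above can be sketched as follows: position k of the array holds the vertex chosen for cluster k, and a random neighbor replaces the vertex of one randomly chosen cluster. The source states only that neighbors are randomly generated, so the specific move used here, as well as the names `random_solution` and `random_neighbor`, are assumptions made for illustration.

```python
import random

def random_solution(clusters, rng):
    """One vertex per cluster; entry k is the vertex chosen for cluster k."""
    return [rng.choice(members) for members in clusters]

def random_neighbor(solution, clusters, rng):
    """Replace the chosen vertex of one random cluster (assumed move)."""
    movable = [k for k, members in enumerate(clusters) if len(members) > 1]
    if not movable:
        return list(solution)  # every cluster is a singleton: no move exists
    k = rng.choice(movable)
    alternatives = [v for v in clusters[k] if v != solution[k]]
    neighbor = list(solution)
    neighbor[k] = rng.choice(alternatives)
    return neighbor

# Hypothetical instance with three clusters.
rng = random.Random(1)
clusters = [[0, 1], [2, 3], [4]]
current = random_solution(clusters, rng)
neighbor = random_neighbor(current, clusters, rng)
print(current, neighbor)
```

With this encoding, the tree itself need not be stored: once the vertices are fixed, the cheapest tree connecting them is an ordinary minimum spanning tree on the selected vertices.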

The approaches known for the GMSTP include various combinations of search heuristics. There are different ways to generate a neighbor solution: local search algorithms, different operators, or efficient well-known algorithms that solve only a subproblem. When properly combined, these heuristics give rise to novel and effective problem-solving techniques. Indeed, the genetic algorithm presented by Golden et al. [6] uses a local search method as a mutation operator together with a specially designed crossover. In addition, Haouari and Chaouachi [15] presented a random greedy algorithm that includes different search techniques. Ferreira et al. [11], in turn, presented five different heuristics that, when combined with path-relinking, give rise to six different versions of GRASP [16]. These approaches suggest that obtaining good results for the GMSTP requires a proper selection and combination of heuristics. However, the combinations explored so far seem to be only a subset of the many possibilities that can be examined; thus, studying the unexplored combinations may yield new algorithms for the GMSTP.

The task of automatically selecting and combining simple heuristics to generate a generic heuristic that solves any instance of a given optimization problem is known as a hyperheuristic [17, 18]. A hyperheuristic performs a search in the heuristic space rather than in the problem's solution space. Some of the most widely used approaches to generating hyperheuristics are genetic programming [19, 20], genetic algorithms [21], and learning systems [22]. The hyperheuristic approach has been applied to various optimization problems, such as packing [23], timetabling [24], scheduling [25], MAX-SAT [26], vertex coloring [27], and the binary knapsack problem [28, 29].

In this paper, an automated technique is presented that explores new heuristic combinations for the GMSTP. The algorithms are constructed from elementary heuristic components obtained from the methods described in the current literature and from a set of control structures typically used in any algorithm. The constructive process is carried out with a genetic algorithm in which a binary string represents the interactions between the elementary heuristics and the control structures [30–33].

In the following section, the procedures for generating the algorithms are described. The computational results of the generated algorithms are presented and discussed in the third section. The conclusions of the study are presented in the last section.

#### 2. Procedure for Generating Algorithms

The procedure for generating the algorithms involves a genotype-phenotype genetic algorithm (GPGA) [34], a set of structures that compose the algorithms, and a fitness procedure to evaluate their performance. The GPGA considers two search spaces: the first corresponds to the genotype and is composed of strings that represent characteristics of the algorithms; the second is the phenotype, composed of trees that assemble instructions which, when executed, find a solution to the GMSTP (Figure 1). Thus, from a population containing a fixed number of strings, a new population is generated by applying the selection, crossover, and mutation operators. To evaluate the performance of each string, the corresponding tree is constructed and evaluated. The process is repeated a number of times. The final algorithms are produced in two stages: in the first, the algorithms are evolved over a number of generations until convergence of the GPGA is detected; in the second, the best algorithms from the first stage are selected and evaluated with a different set of problem instances.
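The genotype-phenotype loop described above can be sketched in a deliberately simplified form. In the sketch below the genotype is a binary string, `decode` maps it to a phenotype, and fitness is measured on the phenotype only. Everything specific here is an assumption for illustration: the paper decodes strings into trees, whereas this sketch decodes into a flat list of instruction names; the instruction names, the toy fitness function, the operators (binary tournament, one-point crossover, bit-flip mutation, elitism), and all parameter values are hypothetical.

```python
import random

def decode(genotype, instructions, bits_per_gene=2):
    """Map each group of bits to one instruction name (the phenotype)."""
    program = []
    for i in range(0, len(genotype), bits_per_gene):
        index = int("".join(map(str, genotype[i:i + bits_per_gene])), 2)
        program.append(instructions[index % len(instructions)])
    return program

def evolve(fitness, instructions, length=12, pop_size=20, generations=30,
           p_mut=0.05, seed=0):
    """Evolve binary strings; lower fitness of the decoded program is better."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def score(genotype):
        # Fitness is always evaluated on the decoded phenotype.
        return fitness(decode(genotype, instructions))

    for _ in range(generations):
        pop.sort(key=score)
        nxt = [pop[0][:]]  # elitism: keep the best string unchanged
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 2), key=score)  # binary tournament
            b = min(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, length)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability p_mut.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=score)
    return best, decode(best, instructions)

# Hypothetical instruction set and toy fitness (count of idle instructions).
INSTRUCTIONS = ["add_cheapest_edge", "swap_cluster_vertex", "local_search", "noop"]

def toy_fitness(program):
    return program.count("noop")

best_genotype, best_program = evolve(toy_fitness, INSTRUCTIONS)
print(best_genotype, best_program)
```

In a realistic setting, the fitness procedure would execute the decoded algorithm on a set of GMSTP instances and score the solutions it finds, and the second stage described above would re-evaluate the best evolved strings on a separate set of instances.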