Complexity

Volume 2019, Article ID 8728245, 13 pages

https://doi.org/10.1155/2019/8728245

## Finding the Shortest Path with Vertex Constraint over Large Graphs

^{1}College of Intelligence and Computing, Tianjin University, China

^{2}State Key Laboratory of Digital Publishing Technology, Beijing, China

Correspondence should be addressed to Xin Wang; wangx@tju.edu.cn

Received 30 November 2018; Accepted 31 January 2019; Published 19 February 2019

Guest Editor: Xin Huang

Copyright © 2019 Yajun Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Graphs are an important complex network model for describing the relationships among entities in real applications, including knowledge graphs, social networks, and traffic networks. The shortest path query is an important problem over graphs and has been well studied. This paper studies a special case of the shortest path problem: finding the shortest path that passes through a set of vertices specified by the user, which is NP-hard. Most existing methods enumerate all permutations of the given vertices and then select the shortest path among these permutations. However, the computational cost is extremely high when the graph or the given set of vertices is large. In this paper, we first propose a novel exact heuristic algorithm based on best-first search and then give two optimization techniques to improve its efficiency. Moreover, we propose an approximate heuristic algorithm that runs in polynomial time for this problem over large graphs. We prove that the ratio bound of our approximate algorithm is 3. We confirm the efficiency of our algorithms by extensive experiments on real-life datasets. The experimental results show that our algorithms always outperform the existing methods, even when the graph or the given set of vertices is large.

#### 1. Introduction

Graphs are an important *complex network model* for describing the relationships among entities in real applications, including knowledge graphs, RDF graphs, linked data, social networks, biological networks, and traffic networks [1–4]. The shortest path query is a basic problem on the graph model. For example, in knowledge graphs, it finds the closest connection between two entities or concepts; in social networks, it finds the closest relationship, such as friendship, between two individuals; in traffic networks, it computes the shortest route between two locations.

Shortest path routing is an important problem in *location-based services* (LBS) and has been well studied in the past decades [5–7]. However, a special kind of shortest path query with vertex constraint is becoming increasingly important in real life. For instance, in knowledge graphs, a data miner may be interested in the closest relationship between two entities that is connected through some specified entities or concepts. In traffic networks, carpooling has become a common business with the rapid development of the sharing economy. A driver may carry several passengers on the way home from work, and the passengers get off at distinct locations. Thus a critical problem is how to find a route of minimum length that passes through all these locations. In the above examples, both the knowledge graph and the traffic network can be modeled as a large graph $G$. The query of the shortest path with vertex constraint can be defined as follows: given a starting vertex $s$, an ending vertex $t$, and a subset $V_s \subseteq V$, find a path with the minimum length among all the paths from $s$ to $t$ that pass through every $v \in V_s$. The subset $V_s$ is called the vertex constraint; that is, the shortest path must pass through every vertex in the subset $V_s$.

The above problem is a special case of the *Generalized Traveling Salesman Path (GTSP)* problem [8], which is known to be NP-hard. In the GTSP problem, all the vertices in $V$ are partitioned into several categories. The objective is to find a path that visits at least one vertex of every category specified by the user. For example, a tourist plans to visit three kinds of locations, e.g., a coffee shop, a gas station, and a bank. Because there may be several choices for every location category, it is necessary to find an optimal route. The basic idea of most existing works on the GTSP problem is as follows. They first compute all permutations of the given categories. Each permutation represents a class of paths that visit the categories in the same order. Next, for every permutation, these methods enumerate all possible paths from the source to the destination by concatenating the subpaths between vertices in two successive categories. Finally, they select the optimal one among these paths. In our problem, every vertex in $V_s$ represents a category of its own. Thus these methods need to compute all the permutations of the vertices to be visited, which incurs a heavy computational cost. However, most of these permutations are unnecessary for computing the shortest path. Therefore, the main challenge is how to avoid computing unnecessary permutations when finding the shortest path with vertex constraint. In this paper, we propose a novel efficient algorithm based on best-first search to compute the shortest path with vertex constraint. The main idea of our method is to discard unnecessary permutations as early as possible. We also propose an approximate algorithm in polynomial time, which is more efficient for large graphs.

The contributions of this paper are summarized below.

(i) We propose a novel and efficient exact heuristic algorithm with two optimizing techniques to find the shortest path with vertex constraint.

(ii) We also propose an approximate algorithm in polynomial time for our problem over large graphs. We prove the ratio bound of our approximate algorithm is 3.

(iii) We conduct extensive experiments on several real-life datasets. We compare our algorithms with the state-of-the-art methods. The experimental results validate the efficiency and effectiveness of our algorithms.
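
The permutation-based baseline described above can be sketched in a few lines. The following is a minimal illustration only, not the authors' algorithm or any cited implementation: it assumes the graph is stored as a symmetric dict of dicts, and the names `dijkstra` and `baseline_constrained_path` are hypothetical. It computes shortest-path weights from the start vertex, the end vertex, and each required vertex, and then tries every visiting order of the required vertices.

```python
import heapq
from itertools import permutations

INF = float("inf")


def dijkstra(adj, source):
    """Shortest-path weights from `source`; assumes nonnegative edge weights."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale queue entry, a shorter route to u was already found
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def baseline_constrained_path(adj, s, t, required):
    """Try every order of `required`; return (total weight, order of stops).

    The returned order lists only s, the required vertices, and t; the full
    path is the concatenation of shortest subpaths between successive stops
    and may revisit vertices.
    """
    dist = {u: dijkstra(adj, u) for u in {s, t, *required}}
    best_weight, best_stops = INF, None
    for order in permutations(required):
        stops = (s, *order, t)
        total = sum(dist[a].get(b, INF) for a, b in zip(stops, stops[1:]))
        if total < best_weight:
            best_weight, best_stops = total, stops
    return best_weight, best_stops


# Hypothetical toy graph: adj[u][v] == adj[v][u] is the weight of edge (u, v).
adj = {
    "s": {"a": 1, "b": 5},
    "a": {"s": 1, "b": 1, "t": 5},
    "b": {"s": 5, "a": 1, "t": 1},
    "t": {"a": 5, "b": 1},
}
weight, stops = baseline_constrained_path(adj, "s", "t", ["a", "b"])
```

Because the loop visits all $k!$ orders of the $k$ required vertices, the running time grows factorially with the size of the constraint, which is the cost that the discussion above attributes to existing methods.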

The rest of this paper is organized as follows. Section 2 gives the problem statement. Section 3 introduces the CH technique for preprocessing graphs. Section 4 proposes the best-first searching algorithm with two optimizing techniques. Section 5 proposes the approximate algorithm and analyzes the ratio bound. The experimental results are presented in Section 6. The related work is in Section 7. Finally, we conclude this paper in Section 8.

#### 2. Problem Statement

An undirected weighted graph is denoted as $G = (V, E, w)$ (or $G$ for short), where $V$ is the set of vertices and $E$ is the set of edges in $G$. $w$ is a function that assigns a nonnegative weight to every edge $(u, v) \in E$; i.e., $w(u, v) \ge 0$. Note that $(u, v)$ is equivalent to $(v, u)$ because $G$ is an undirected graph. The number of vertices (or edges) in $G$ is denoted as $|V|$ (or $|E|$). A path $p$ in $G$ is a sequence of vertices; i.e., $p = (v_0, v_1, \ldots, v_k)$, where every $(v_i, v_{i+1})$ is an edge in $G$ for $0 \le i < k$. The weight of path $p$, denoted as $w(p)$, is the sum of the weights of all the edges in $p$; i.e., $w(p) = \sum_{i=0}^{k-1} w(v_i, v_{i+1})$. We say a path $p$ is simple if and only if there is no repeated vertex in $p$. The shortest path between $u$ and $v$ is a path with the minimum $w(p)$ among all the paths between $u$ and $v$. For simplicity, in the following, we use $d(u, v)$ to denote the weight of the shortest path between $u$ and $v$ in $G$.
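
As a small illustration of these definitions, the sketch below stores an undirected weighted graph as a symmetric dict of dicts and evaluates the weight and the simplicity of a path. The representation and the names `make_graph`, `path_weight`, and `is_simple` are assumptions chosen for this example, not notation from the paper.

```python
def make_graph(edges):
    """Build an undirected graph from (u, v, weight) triples."""
    graph = {}
    for u, v, weight in edges:
        if weight < 0:
            raise ValueError("edge weights must be nonnegative")
        # Store both directions, since (u, v) and (v, u) are the same edge.
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


def path_weight(graph, path):
    """Sum the weights of consecutive edges; raises KeyError on a non-edge."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def is_simple(path):
    """A path is simple if and only if no vertex is repeated."""
    return len(set(path)) == len(path)


g = make_graph([("a", "b", 2), ("b", "c", 3)])
total = path_weight(g, ["a", "b", "c"])
```

A path such as `["a", "b", "a"]` is still a valid path under the definition above, but it is not simple because it repeats a vertex.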

In this paper, we study the problem of finding the shortest path with vertex constraint. Table 1 summarizes the symbols in this paper. We first give the definition below.