Abstract

The sensor location problem (SLP) discussed in this paper is to find the minimum number and the optimal locations of flow counting points in a road network so that the traffic flows over the whole network can be inferred uniquely. A flow conservation system at intersections is first formulated using the turning ratios as the prior information. The coefficient matrix of the flow conservation system is then proved to have full row rank. Based on that, the minimal number of counting points is determined to be the total number of exclusive incoming roads and dummy roads, where the dummy roads are added to the network to represent the trips generated on real roads. The task of the SLP model based on turning ratios is therefore only to determine the optimal sensor locations. The subsequent analysis shows that placing sensors on all the exclusive incoming roads and dummy roads always yields a unique network flow vector for any network topology. For the sake of feasibility in practice, a detection set composed of only real roads is then proven to exist. Finally, considering the importance of the roads and the cost of the sensors, a weighted SLP model is formulated to find the optimal detection set. The greedy algorithm is proven to provide the optimal solution for the proposed weighted SLP model.

1. Introduction

Link flow data in a road network are valuable information in many applications of traffic planning and management, such as road network planning and congestion analysis. Placing traffic sensors on every link is usually not realistic, because an urban road network normally has thousands of links and the investment in sensor installation and maintenance would be huge. It is therefore important to identify a subset of links such that the traffic flow on every link in the network can be inferred if sensors are installed on this subset to collect flow data. The sensor location problem (SLP) was proposed to address this problem.

The SLP has attracted increasing interest in the past few years for its importance in the area of transportation system analysis. Two main classes of SLP were summarized by Gentili and Mirchandani [1]: sensor location flow-observability problems and sensor location flow-estimation problems. The objective of the first-class SLP is to find the optimal sensor locations so that all link flows (or partial link flows) can be determined uniquely. The second-class SLP is trying to improve the accuracy of the estimation by optimizing the sensors locations. Early research in the literature mainly focused on the second-class SLP as a subproblem of origin-destination (OD) estimation. The “OD Covering Rule” was given by Yang et al. [2, 3] to guarantee the upper-bounded OD estimation error. Several other rules were also proposed to improve the estimation accuracy, like the “maximal flow fraction rule,” the “maximal flow-intercepting rule,” and so on (for details, see [15]). The first-class SLP was not formally defined until 2001. Bianco et al. [6] defined SLP as determining the minimum number and locations of counting points to infer all traffic flows in the network. We will focus on the first-class SLP in this paper.

The first-class SLP was proposed in the context of traffic planning and also was developed in this domain in most of the following research. So, essentially, it is a static optimization problem addressing the method of network sensor deployment under the steady-state traffic condition. Generally, in the sensor location flow-observability problems, a system of linear equations will be formulated at first, which represents the relationship of different link flows or the relationship between link flows and route flows (or OD trip) under the law of flow conservation. This system of equations is defined as the flow conservation system in this paper. Then, by deploying the sensors, equations representing the detected link flows will be added. These equations are defined as the detecting equations. The number of the detecting equations is equal to the number of sensors. Basically, the main idea of the first-class SLP is to find the optimal detecting equations so that all the variables are solvable. According to this idea, the SLP was normally designed as an integer linear programming model in the existing studies. The difference among these studies lies in the assumptions on the prior information, which lead to different formulations of the flow conservation system. According to the prior information, three kinds of flow conservation systems are formulated: OD-link based system, route-link based system, and link-link based system [1]. The OD-link based system uses the link choice proportions as the prior information, which is defined as the proportion of trips between one OD pair that uses a given link in the network. Similarly, the route-link based system uses the link-route coefficients as the prior information. When a link is used by a route, the corresponding coefficient will be 1; otherwise, the coefficient will be zero [7]. These two kinds of systems use the OD trips or route flows as the bridge between detected link flows and undetected link flows. 
So the OD matrix and route flows will be obtained at the same time. However, in the scenarios when the link flow is the only information we care about, the effort used in calculating the OD trips and route flows is not necessary. In particular for the route-based system, enumerating all routes is not easy for a large-scale network. At the same time, if we are only interested in the link flows of one small local network, the whole network still needs to be considered in the SLP since the OD trip and route flow are defined over the network. The third kind of system uses the relationship of the link flows directly. Without using the concept of OD trip or route flow, this kind of system can be used to formulate the SLP for local network. Bianco et al. [6] proposed the first system of this kind in 2001, using the split ratios of outgoing roads at intersections as the prior information. The split ratio of an outgoing link at an intersection is defined as the fraction of the flow on the outgoing link over the total incoming flow of the intersection. With the assumption of symmetrical network and node-based detection (the sensors are located on every in- and out-link of the intersection), a necessary condition was given to guarantee that the number of equations in the system is not less than the number of variables. They also presented a couple of greedy heuristics to find the lower and upper bounds on the number of sensors for random networks. After that, Bianco et al. [8] proved that the SLP proposed in 2001 is NP-hard in the computation complexity. Linear algorithms were invented for several special graphs, including the paths, circles, and combs. Based on the work of Bianco, Morrison et al. [9, 10] gave a stricter necessary condition for general graphs and proved the condition to be also sufficient for the special network with trees as its unmonitored components. 
Essentially, the link-link based sensor location flow-observability problems use network structure-driven logic: the traffic flows over the network are calculated from the network topology, and no behavioral assumption about the road users is required [6].

Until now, most discussions on the link-link based sensor location problems used the split ratios as the prior information. The network was assumed to be symmetrical and the detection was assumed to be node based. However, in reality, unidirectional roads normally exist, which leads to an asymmetrical network. At the same time, road-based detection is more reasonable. In this paper, the network is allowed to be asymmetrical and the detection is assumed to be road based. Turning ratios at intersections are used as the prior information. The turning ratio is defined as the ratio of the flow turning from an incoming link to an outgoing link over the total flow of the incoming link. The turning ratio is a popular parameter in many applications, and many methods have been designed to obtain it (for details, see [11-13]), which makes the estimation of turning ratios from historical data possible. Using turning ratios as the prior information, an integer linear programming model is designed in this paper to address the sensor location flow-observability problem under steady-state traffic conditions.

The present paper is organized as follows. In Section 2, the definition and assumptions of the road network used in this paper are specified. Then, the flow conservation system is formulated in Section 3 based on the turning ratios. After that, in Section 4, the coefficient matrix of the flow conservation system is proven to have full row rank. Based on that, in Section 5, the SLP model is given and the greedy algorithm is set up to solve the model. Finally, some conclusions are summarized in Section 6.

2. Definitions and Assumptions

Suppose the road network is composed of directed roads. The roads are numbered by . Represent the roads set as . Six assumptions are used in this paper.

Assumption 1. The road network which we are interested in is taken from a bigger one so that on the boundary there are directed roads whose upstream or downstream intersections are missing.

Definitions for Three Categories of Roads. Under Assumption 1, there must be three categories of roads in a network: incoming/outgoing road, which connects two adjacent intersections; exclusive incoming road, whose downstream intersection is in the network but the upstream intersection is not; exclusive outgoing road, whose upstream intersection is in the network but the downstream intersection is not. Figure 1 gives an example of a simple network including only two intersections.

In the network of Figure 1, roads 4 and 8 are the incoming/outgoing roads, roads 9, 10, 11, 12, 13, and 14 are the exclusive incoming roads, and roads 1, 2, 3, 5, 6, and 7 are the exclusive outgoing roads.

Represent the set of the incoming/outgoing roads as , the set of exclusive incoming roads as , and the set of exclusive outgoing roads as . Let and , which represent the set of the outgoing roads and the set of incoming roads, respectively. We have . Suppose there are roads in and roads in .

Assumption 2. Turning ratios for any road in are known.

Definition for Turning Ratios. The turning ratio is defined as the proportion of the flow on an incoming road which turns to an outgoing road at the intersection. Denote the turning ratio of traffic flow turning from road to road at intersection as and the traffic flow on road as . The traffic flow turning from road to road at intersection is .

Assumption 3. Flow conservation law is satisfied at the intersection.

For an outgoing road without trips originating or terminating, the traffic flow is equal to the sum of the flows turning into it from all the incoming roads at the intersection. Using as the traffic flow on outgoing road , we have

Definition for Net Balancing Flow. For the outgoing roads which have trips originating or terminating, the net balancing flow of the outgoing roads is defined as the originating trips minus the terminating trips. Denote the net balancing flow on road as and then for road , we have

Suppose there are roads in set which have nonzero net balancing flows. Normally, is far smaller than the size of , especially when the analysis period is long enough.
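As an illustration of Assumption 3 and of the net balancing flow, the following minimal Python sketch computes the flow on one outgoing road from the flows on the incoming roads of its intersection. The function name `outgoing_flow`, the road labels, and all numbers are hypothetical and are not taken from the paper.

```python
def outgoing_flow(incoming_flows, turning_ratios, net_balancing_flow=0.0):
    """Flow on one outgoing road under flow conservation.

    incoming_flows: flow on each incoming road of the intersection (assumed labels).
    turning_ratios: share of each incoming flow that turns into this outgoing road.
    net_balancing_flow: originating trips minus terminating trips on the road.
    """
    turned_in = sum(turning_ratios[road] * incoming_flows[road]
                    for road in turning_ratios)
    return turned_in + net_balancing_flow


# Hypothetical intersection with two incoming roads "a" and "b".
incoming = {"a": 300.0, "b": 200.0}
ratios_into_j = {"a": 0.5, "b": 0.25}

plain = outgoing_flow(incoming, ratios_into_j)            # no trips start or end on the road
balanced = outgoing_flow(incoming, ratios_into_j, -20.0)  # 20 more trips end than start
print(plain, balanced)
```

A negative net balancing flow, as in the second call, corresponds to a road on which more trips terminate than originate.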

Assumption 4. For any road set , there must be at least one road which has a turning ratio , where road .

If, in such a road set, all the turning ratios are directed to roads inside the set, then the traffic inside the set will never flow out. Assumption 4 therefore states that the turning ratios must not define a self-circulating system. This assumption is reasonable because any part of the road network has to be able to exchange traffic with another part.
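The role of Assumption 4 can be illustrated with a small sketch. It assumes a sign convention in which each conservation equation has -1 on the diagonal and the turning ratios into the road off the diagonal; the two-road example, the helper names `conservation_block` and `det2`, and the numbers are hypothetical. When two roads feed only each other, the conservation equations restricted to them become linearly dependent; as soon as part of the flow can leave the set, they are independent.

```python
from fractions import Fraction


def conservation_block(p12, p21):
    """Conservation rows for two roads that feed each other.

    Row j reads: -x_j + (ratio from the other road into j) * x_other = 0.
    p12 is the assumed turning ratio from road 1 into road 2, p21 the reverse.
    """
    return [[Fraction(-1), Fraction(p21)],
            [Fraction(p12), Fraction(-1)]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Self-circulating set: all flow on road 1 turns into road 2 and vice versa.
closed = det2(conservation_block(1, 1))

# Half of the flow on road 2 leaves the set, as Assumption 4 requires.
leaky = det2(conservation_block(1, Fraction(1, 2)))

print(closed, leaky)
```

A zero determinant in the first case means that any equal flow on both roads satisfies the equations, so the flows cannot be pinned down from the boundary flows alone.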

Assumption 5. Traffic flows are assumed to be detected based on roads.

Once a road is detected, the traffic flow on the road is known.

Assumption 6. The detecting flows and turning ratios are both assumed to be error-free.

3. Flow Conservation System Based on the Turning Ratios

From Assumption 3 in Section 2, we know that, for any road in set , either (1) or (2) can be formulated. In order to simplify the expressions of the flow conservation system and the corresponding parameters, we renew by numbering all the roads in at first so that and . On the other hand, the nonzero net balancing variable is looked at as the traffic flow on a dummy road which is an incoming road of the corresponding outgoing road with 1 as the turning ratio. The dummy roads can be looked at as the exclusive incoming roads. Compared to the real exclusive incoming roads, the dummy roads may have negative flows. To illustrate the method to add the dummy roads into the network, we suppose that there is a nonzero net balancing flow on road 4 in the network of Figure 1. Dummy road 15 is added to the network as shown in Figure 2. The turning ratio from dummy road 15 to road 4 is set to be 1, which means all the flow on dummy road 15 will turn into road 4.

The dummy roads are numbered after exclusive incoming roads. The roads compose set . Using to represent the traffic flow vector of the network, where is the traffic flow on road , , the flow conservation system can be formulated as follows:

The coefficient matrix of system (3) can be expressed as follows:

is matrix. The entries are given as follows:

And we have for and for .
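To make the structure of such a coefficient matrix concrete, the sketch below assembles one for a small hypothetical network. The layout is an assumption made for illustration: one row per outgoing road, columns ordered as outgoing roads first, then exclusive incoming roads, then dummy roads, with -1 on the road's own column and the turning ratio from each feeding road in that road's column. The network (roads 1 to 5), the function name `build_conservation_matrix`, and the ratios are not taken from the paper.

```python
from fractions import Fraction


def build_conservation_matrix(n_roads, outgoing_roads, turning_ratios):
    """Assemble one conservation row per outgoing road.

    turning_ratios maps (from_road, to_road) to the turning ratio; a dummy
    road feeds its real road with ratio 1. Roads are numbered from 1.
    """
    rows = []
    for j in outgoing_roads:
        row = [Fraction(0)] * n_roads
        row[j - 1] = Fraction(-1)           # minus the flow on road j itself
        for (i, k), ratio in turning_ratios.items():
            if k == j:
                row[i - 1] += Fraction(ratio)  # plus the flow turning from i into j
        rows.append(row)
    return rows


# Hypothetical network: road 4 is exclusive incoming, road 1 links two
# intersections, roads 2 and 3 are exclusive outgoing, road 5 is a dummy
# road carrying the net balancing flow of road 1.
ratios = {
    (4, 1): Fraction(3, 5),
    (4, 2): Fraction(2, 5),
    (1, 3): Fraction(1),
    (5, 1): Fraction(1),
}
A = build_conservation_matrix(5, [1, 2, 3], ratios)

# A flow vector that respects conservation makes every row evaluate to zero.
x = [70, 40, 70, 100, 10]
residuals = [sum(a * v for a, v in zip(row, x)) for row in A]
print(A, residuals)
```

Exact rational arithmetic is used so that later rank checks are not affected by floating-point rounding.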

The flow conservation system can also be written in the following form:

4. Proof for the Full Row Rank of Coefficient Matrix

The objective of the sensor location flow-observability problem is to find the minimum number and the locations of counting points needed to infer all traffic flows in the network. If we can prove that the coefficient matrix of the flow conservation system has full row rank, then the minimum number of counting points is determined, and the task of the corresponding SLP reduces to optimizing the sensor locations only.

According to the knowledge of linear algebra, if we can find a nonsingular sub matrix in , then matrix is proven to be full rank in the row space.

Let us select the first columns from to compose a square matrix :

The column vector of matrix is denoted as , . can be written as . can be written as . The flow conservation system can be rewritten as follows:

The variables , , on the left side correspond to the traffic flows on outgoing roads. On the right side, , , correspond to the traffic flows on incoming roads; , , correspond to the net balancing flows.

If we can prove that the traffic flows on the outgoing roads must be zero when the right side vector of (8) equals , then the column vectors are proven to be linear independent, which means must be nonsingular.

Proposition 7. If the right side vector of (8) equals , then the traffic flows on all the exclusive outgoing roads must be zero.

Proof. According to the analysis above, we have

Adding all the equations in system (8) together, we will easily get that the sum of the left side of (8) is minus total traffic flows in set and the sum of the right side of (8) is minus total traffic flows in set . Set the right side vector of (8) . So total traffic flows in set are zero. Then, total traffic flows in set are zero too, which determines that all the exclusive outgoing roads have no traffic on them.

Actually, this is an obvious result under the assumption of traffic flow conservation. For a road network, all the traffic flows (including the net balancing flows here) entering the network must equal the traffic flows leaving the network.

Proposition 8. If the right side vector of (8) equals , then the traffic flows on all the incoming/outgoing roads must be zero too.

Proof. Proof by contradiction is used here to prove Proposition 8.

Suppose there are incoming/outgoing roads which have nonzero traffic flows on them. All these roads compose a road set . We have . According to Assumption 4 in Section 2, there must be at least one road which has a turning ratio , where road . Since the traffic flow on road is not zero, there must be traffic flow turning from road into road . Then, the traffic flows on road must not be zero. This can be proved as follows.

For any road in set , the incoming traffic flows at the intersection can be divided into two parts: traffic flows coming from the exclusive incoming roads or/and dummy roads which correspond to the right side of (8) and traffic flows coming from the incoming/outgoing roads. So for a road in set , if the right side of (8) is set to zero and there are traffic flows turning from the incoming/outgoing roads into it, the traffic flows on it must be nonzero.

If road is still an incoming/outgoing road, set . Then, we can still find at least one outgoing road outside , whose traffic flow is not zero. Since , there must be at least one exclusive outgoing road which has nonzero flow. This cannot happen according to Proposition 7. So if the right side vector of (8) equals , then the traffic flows on all the incoming/outgoing roads must be zero too.

Now we have proven that if the right side vector of (8) equals , the traffic flows on the outgoing roads must be zero. So is nonsingular, which means that the coefficient matrix of the flow conservation system is full rank in the row space.

5. Sensor Location Optimization Model and Solution Algorithm

5.1. Full Flow-Observability SLP Based on Turning Ratios
5.1.1. SLP Description

As we stated above, the objective of full flow-observability SLP is to determine the minimum number and locations of the sensors over the network so that all the traffic flows can be inferred.

Based on turning ratios, linear equations can be formulated to compose the flow conservation system as system (6). The number of variables in the system is , which is larger than . It is easy to see that at least other equations are needed to derive all the variables uniquely. By installing sensors on road, detecting equation can be formulated as , where is the detected traffic flow. So at least roads need to be detected. Denote the coefficient matrix for the detecting equations as . Matrix is a matrix where only one entry equals one and the others equal zero in each row. For the undetected roads, the corresponding column vectors are zero vectors. Adding all the detecting equations, the complete system can be formulated as follows:

where is the detected flows vector, composed of the detected traffic flows .

System (10) will generate a unique solution if and only if the coefficient matrix is nonsingular. So the objective of full flow-observability SLP here is to find a detecting coefficient matrix so that the coefficient matrix of system (10) is nonsingular.

In Section 4, matrix is proven to have full row rank. So we can always find a matrix with to generate a nonsingular coefficient matrix for system (10), which means only roads are required to be detected.

5.1.2. SLP Modeling

Using as the decision variable in the model to determine whether road will be detected or not, means road will be detected; means road will not be detected. We design a matrix as follows:

Let . The full flow-observability SLP based on turning ratios can be described as follows.

Determine which satisfy

where is the rank of matrix .

The solutions of model (12) must exist according to the analysis in Section 5.1.1.
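The rank condition can be checked numerically for a candidate detection set. The sketch below stacks one unit row per detected road under the conservation rows and tests whether the stacked matrix has rank equal to the number of roads. It only tests the rank condition, not the count of sensors. The matrix `A` is a hypothetical five-road example under an assumed column ordering (outgoing roads 1 to 3, exclusive incoming road 4, dummy road 5), and the helper names are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_observable(conservation_rows, detected_roads):
    """True if the detected roads (numbered from 1) determine all flows."""
    n = len(conservation_rows[0])
    detecting_rows = []
    for road in sorted(detected_roads):
        unit = [0] * n
        unit[road - 1] = 1  # detecting equation: flow on this road is measured
        detecting_rows.append(unit)
    return rank(conservation_rows + detecting_rows) == n


# Hypothetical conservation rows for outgoing roads 1, 2, 3.
A = [
    [-1, 0, 0, Fraction(3, 5), 1],
    [0, -1, 0, Fraction(2, 5), 0],
    [1, 0, -1, 0, 0],
]

for roads in [(4, 5), (2, 3), (1, 3)]:
    print(roads, is_observable(A, roads))
```

In this example the set (1, 3) fails because road 3 carries exactly the flow of road 1, so the second sensor adds no new information.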

5.1.3. A Special Design: Closing Ring

Shao et al. [14, 15] introduced a special design of sensor location over the network called “closing ring,” which is given in Figure 3.

The “closing ring” is designed to install counting sensors on every exclusive incoming road of the network with the assumption that no net balancing flows exist. No proof was given in [14, 15] to guarantee a unique flow solution. By the analysis above, the assumption can be released by adding dummy roads into the network to represent the net balancing flows. So here the “closing ring” is designed to detect the flows on all exclusive incoming roads and dummy roads. This design can be easily proven to be a solution of model (12).

In , the last columns correspond to the exclusive incoming roads and dummy roads of the network. So the corresponding coefficient matrix for the detecting equations of closing ring design is , where is an identity square matrix and is a zero matrix. The whole coefficient matrix of system (10) can be written as follows:

where is the submatrix composed of the last columns of .

Obviously, the determinant of can be calculated as . Since has been proven to be nonsingular, we have . So we have . Obviously, is satisfied.

We do not use any assumption of the network topology here. So, if possible, “closing ring” can always be a sensor location design for any road network, which can be given without any calculation.
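A minimal sketch of the "closing ring" inference on a hypothetical five-road network follows. It stacks the conservation rows with one detecting row for each exclusive incoming road and dummy road and solves the resulting square system exactly. The matrix, the measured values, and the helper name `solve` are assumptions for illustration; the column ordering (outgoing roads 1 to 3, exclusive incoming road 4, dummy road 5) is also assumed.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a square nonsingular system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for c in range(n):
        # Raises StopIteration if the matrix is singular.
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n] for row in aug]


# Conservation rows for outgoing roads 1, 2, 3 (hypothetical turning ratios).
conservation = [
    [-1, 0, 0, Fraction(3, 5), 1],
    [0, -1, 0, Fraction(2, 5), 0],
    [1, 0, -1, 0, 0],
]
# Closing ring: detect exclusive incoming road 4 and dummy road 5.
detecting = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
measured = [100, 10]  # assumed sensor readings on roads 4 and 5

flows = solve(conservation + detecting, [0, 0, 0] + measured)
print(flows)
```

No search over sensor locations is needed here, which reflects the remark that this design can be written down without any calculation.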

5.1.4. Feasibility for Detecting Only Real Roads

In the “closing ring” design, all exclusive incoming roads and dummy roads are supposed to be detected. But, in reality, it is not easy to detect the net balancing flows. If we can always find real roads as a detection set in the above SLP, the solution will be more applicable in reality.

Renew by numbering the roads with nonzero net balancing flows after others. It is equivalent to permute rows and first columns in matrix . After that, permute the last columns in matrix so that the coefficient matrix for dummy roads is expressed by , where is an identity square matrix and is a zero matrix. After all the permutations, the coefficient matrix of the flow conservation system is as follows:

Denote the first columns of as . Submatrix is also the permutations result of matrix . Since is nonsingular, is nonsingular too. So the row vectors in are linear independent. Consequently, the first row vectors are linear independent. We can always find a nonsingular subsquare matrix in the first rows of . Represent the submatrix, which is composed of columns in with as its first rows, as , where is the submatrix composed of the last rows of . Also, represent the submatrix composed of the other columns of as and represent the submatrix composed of the columns from th to th in as . By exchanging the columns in and and some columns permutations, can be written as :

As a subsquare matrix of , matrix is a matrix. We have

Since is a nonsingular square matrix, is an identity square matrix and matrix is also nonsingular. According to the analysis in “closing ring” design, if we place the sensors on all the roads corresponding to the columns in and , the traffic flow vector can be inferred uniquely. Because all the roads corresponding to the columns in and are real roads, we can always find real roads as a detection set for the SLP based on turning ratios.

5.2. Weighted Full Flow-Observability SLP Model Based on Turning Ratios
5.2.1. Model Formulation

Obviously, many combinations of roads in the network can be used as the solution detection sets for the full flow-observability SLP based on turning ratios. How to choose the detection set among all the feasible solutions depends on people’s evaluation of the importance of the roads and the cost of the sensors’ installation and maintenance. For example, an arterial road is usually more important than a local street for traffic management. People may therefore want to detect the traffic flow on it so that they know its detailed traffic condition. A lower cost of installation and maintenance may also be preferred. A weight can be given to the road by integrating people’s evaluation of its importance and cost. A higher weight means that a road has higher importance and needs lower investment. For the whole network, a weight vector could be given as , where is the weight of the road . Here the dummy roads are included too. To get rid of dummy roads, we can simply set , . Discussion in Section 5.1.4 shows that we can always find the feasible solutions even though , .

The objective of the weighted SLP here is to find the optimal detection set so that the total weight of all the roads in the detection set is maximal. The weighted SLP model can be formulated as follows:

All the variables and parameters have been defined before.

5.2.2. Solution Algorithm

Let us change in into 1; then we get :

Denote the th row vector in as . The first row vectors compose matrix , . The last row vectors compose identity matrix , . Every row vector in is given a weight . If row vectors of and row vectors in can compose a maximal independence row vector group of , then these row vectors are called feasible vector group.

Solving model (17) is essentially finding the optimal feasible vector group whose total weight is maximal among all the feasible vector groups.

A greedy approach is used here to solve the SLP model (17). The algorithm is designed as follows.

Step 1. Set matrix .

Step 2. Sort row vectors in into monotonically decreasing order by the given weight .

Step 3. For each row vector , in , taken in monotonically decreasing order by weight , consider the following.

Calculate the rank of to see if has full row rank. If has full row rank, then set . When the rank of is equal to , stop the calculation and return .

The nonzero entries in the last rows of give the optimal solution of SLP model (17). The corresponding roads compose the optimal detection set.
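The steps above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' C# code: the conservation matrix `A`, the weights, and the function names are hypothetical, roads are numbered from 1, and the full-row-rank test is carried out as a check that adding a road's unit row increases the rank.

```python
from fractions import Fraction


def rank(rows):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def greedy_detection_set(conservation_rows, weights):
    """Pick roads in decreasing weight order while the rows stay independent."""
    n = len(conservation_rows[0])
    selected = [list(row) for row in conservation_rows]   # Step 1
    current_rank = rank(selected)
    chosen = []
    # Step 2: sort candidate roads by decreasing weight.
    for road in sorted(weights, key=weights.get, reverse=True):
        if current_rank == n:      # all flows are already determined
            break
        unit = [0] * n
        unit[road - 1] = 1
        # Step 3: keep the road only if its row is independent of the rest.
        if rank(selected + [unit]) > current_rank:
            selected.append(unit)
            current_rank += 1
            chosen.append(road)
    return chosen


# Hypothetical network: outgoing roads 1-3, exclusive incoming road 4,
# dummy road 5 (weight 0 so that it is considered last).
A = [
    [-1, 0, 0, Fraction(3, 5), 1],
    [0, -1, 0, Fraction(2, 5), 0],
    [1, 0, -1, 0, 0],
]
weights = {1: 5, 2: 3, 3: 4, 4: 2, 5: 0}

chosen = greedy_detection_set(A, weights)
total_weight = sum(weights[road] for road in chosen)
print(chosen, total_weight)
```

In this example road 3 has the second highest weight but is skipped, because its flow is already implied once road 1 is detected. Recomputing the rank from scratch at every step keeps the sketch short; an incremental elimination would be a natural choice for larger networks.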

The greedy algorithm can provide the optimal solution to model (17). This can be proven as follows.

Proposition 9. Let be the row vector with the maximum weight among all the row vectors which satisfy that has full row rank. Then, must belong to the optimal feasible vector group.

Proof. Denote the optimal feasible vector group as , where . Obviously, any row vector in will satisfy that has full rank. Assume that . Since has the maximal weight among all the row vectors which satisfy that has full row rank, we have .

Because the vectors in can compose a maximal independence row vector group of with row vectors in , the following equation must hold:

where at least one . Since has full rank, there must be one which has a nonzero value. Otherwise, if all is zero, then, from (19), is the linear combination of , , which is inconsistent with the fact that has full rank.

Suppose , ; then, can compose a maximal independence row vector group of with all the vectors in except . Otherwise, the following equation must hold:

From (19) and (20), we have

Since , (21) is not consistent with the fact that is a maximal independence row vector group of .

So is a maximal independence row vector group of . Because , the total weights of must be not smaller than the total weights of . If is the optimal feasible vector group, then must be the optimal feasible vector group too.

So must belong to the optimal feasible vector group.

Proposition 10. For any in the algorithm steps whose rank is smaller than , suppose all the row vectors in belong to the optimal feasible vector group. Let be the row vector with the maximum weight among all the row vectors which satisfy that has full row rank. must belong to the optimal feasible vector group.

Proposition 10 can be easily proven if we look at here as in Proposition 9. The proof then proceeds exactly as for Proposition 9.

Proposition 9 shows that the first row vector which is selected from into according to the proposed greedy algorithm must belong to the optimal feasible vector group. Proposition 10 shows that if all the row vectors, which have been selected from into , compose a subset of the optimal feasible vector group, then the next row vector, which will be selected from into according to the proposed greedy algorithm, must belong to the optimal feasible vector group too. So, by the method of mathematical induction, the proposed greedy algorithm is proven to be able to produce the optimal solution to model (17).

5.2.3. Numerical Examples

In this part, a simple road network composed of one exclusive incoming road, two exclusive outgoing roads, and six incoming/outgoing roads is used to examine the greedy algorithm. A second, bigger road network with different numbers of dummy roads is used to show the computational performance.

The simple road network is given in Figure 4.

There is nonzero net balancing flow generated on road 7. The net balancing flow can be transferred to a dummy road 10 as shown in Figure 4(b) with 1 as the turning ratio from road 10 to road 7.

The turning ratios for each road are given by Table 1.

The weight vector is given by .

According to the descending order of weights, the checking order of the roads is . By calculating the ranks of matrices, the final optimal detection set is , and the maximal weight is 14.5. Actually, after road 9 is determined to be detected, the flows on roads 3, 4, and 8 are also determined.

A bigger network is given to show the computation performance of the algorithm. The network is shown in Figure 5. This network is composed of 120 directed roads and 25 intersections.

Based on the road network in Figure 5, three scenarios are generated:
(a) No net balancing flow exists in the network.
(b) There are 20 roads whose net balancing flows are nonzero.
(c) There are 40 roads whose net balancing flows are nonzero.

The algorithm was implemented in the C# programming language and run 200 times under each scenario on a ThinkPad X201s. In every run, the turning ratios and weights are generated randomly. The average CPU time for the three scenarios is shown in Table 2.

6. Conclusions

The network sensor location problem based on the turning ratios discussed in this paper essentially uses the topology-driven logic to infer the traffic flow over the network. So no assumptions about the route choice are needed. The findings of this paper can be summarized as follows.

First, the coefficient matrix of the flow conservation system based on turning ratios is proven to have full row rank. The minimal number of roads required to infer all the traffic flows is therefore the total number of exclusive incoming roads and outgoing roads with nonzero net balancing flows.

Second, a special sensor location design called “closing ring” is improved in this paper. The analysis shows that if we can detect the traffic flows on all the exclusive incoming roads and all the net balancing flows in the network, then the whole traffic flow vector can be inferred whatever the topology of the network. The analysis also shows that we can always find a feasible solution including only real roads for the proposed SLP model, which is useful because the net balancing flow is usually difficult to detect.

Finally, a weighted SLP model is formulated to find the optimal detection set considering the importance of the roads and the cost of the sensors. A greedy algorithm is designed to solve the model and is proven to provide the optimal solution. Two numerical examples are given to test the algorithm, and the computational performance is reported.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The work of this paper is supported by the National Natural Science Foundation of China (Grant no. 51208379).