Abstract

To improve the accuracy, reliability, and economy of urban traffic information collection, this paper proposes an optimization model for traffic sensor layout. Considering the impact of traffic big data, a set of influencing factors for traffic sensor layout is established, including system cost, multisource data sharing, data demand, sensor failures, road infrastructure, and sensor type. These factors are incorporated into the traffic sensor layout optimization problem, which is formulated as a multiobjective programming model with three objectives (minimum system cost, maximum truncation flow, and minimum path coverage) and an origin-destination (OD) coverage constraint. The model is solved by the tolerant lexicographic method based on a genetic algorithm. A case study shows that the model reflects the influence of multisource data sharing and fault conditions and satisfies the OD coverage constraint while achieving multiobjective optimization of the traffic sensor layout.

1. Introduction

With the development of Intelligent Transportation Systems (ITS), large-scale acquisition of urban traffic data has become possible through fixed detectors, such as inductive loop detectors [1], microwave radar detectors [2], and video detectors [3]; mobile detectors, such as probe-vehicle systems [4]; and newer data sources, such as the BeiDou Navigation Satellite System [5, 6], mobile devices [7, 8], and wireless transmission technology [9]. Traffic sensors are therefore an important part of urban traffic information collection, and their optimal layout is of great significance.

In recent years, optimal traffic sensor layout models have mostly been based on intelligent algorithms [10], graph theory [11], and mathematical programming [12], considering various factors. On the one hand, the objective external conditions of detector layout have been well investigated. Hao et al. [13] proposed an optimal traffic detector location model and algorithm for dynamic OD estimation. Shao et al. [14] built a network sensor model based on turning ratios. Bertini and Lovell [15] explored infrastructure and traveler characteristics and put forward a model for accurate freeway travel time estimation for traveler information. Zhang and Jin [16] explored the problem of traffic flow detector layout on highways with the optimization goal of minimizing the travel time estimation error. Li et al. [17] investigated the spatial reliability of traffic information and then studied the optimal traffic sensor layout. Overall, most studies develop mathematical models to optimize the sensor layout based on road conditions and research objectives.

On the other hand, the sensor layout should consider not only the location and the reliability of the data but also the detection target, the engineering cost, the subsequent service life, and so on. Therefore, with the development of measurement technology, research related to the detector itself, such as sensor targets and fault conditions, has emerged. Barcelo et al. [18] built an optimal traffic sensor layout model considering link and node coverage in order to estimate time-varying origin-destination matrices. Luo et al. [19] considered the shortcomings of sensor configuration in short-time prediction and proposed an improved method for short-time traffic flow prediction based on a genetic algorithm and a wavelet neural network. Xiang et al. [20] built a predictive model of detector layout based on a multipoint prediction method. Zhu et al. [21] analyzed the classical traffic sensor layout model and proposed a two-stage optimization model that considers sensor fault conditions. Determining the optimal number and location of traffic sensors is the key to estimating OD travel matrices. However, techniques based on traditional link and node coverage do not collect sufficient road information, and short-term traffic flow prediction is susceptible to contingencies [22, 23], so a more refined technical approach is required to improve the efficiency and accuracy of obtaining road information.

In general, optimal traffic sensor layout is carried out with mechanism models and knowledge models, without consideration of traffic big data. In practice, however, both kinds of model have disadvantages in complex optimization problems. The detector layout problem involves many factors, and some of them are difficult to obtain or impossible to calibrate, which seriously affects the accuracy of a mechanism model and may cause the model to fail to match the actual phenomenon. A knowledge model, in turn, is based on practical experience, which is strongly influenced by the researcher’s subjectivity; its scientific validity cannot be guaranteed, so it is also unsuitable for modeling detector optimization.

The rapid development of the Mobile Internet, the Internet of Things, cloud computing, and other technologies has brought us into the age of big data [24]. Unlike traditional traffic data, traffic big data has “6V” characteristics [25] (volume, velocity, variety, veracity, value, and visualization).

With the evolution of information technology and the emergence of intelligent transportation cities, traffic big data has developed rapidly and been applied in various fields, such as logistics, public transportation, and the social economy. Traffic big data now plays an important role among traffic data resources, and scholars, governments, and companies cooperate to develop new mechanisms for sharing and applying data resources [26, 27]. Xu and Yang [28] addressed the low acquisition efficiency and poor data quality of channel traffic flow collection systems by proposing a design method for a waterway traffic data collection system based on big data analysis, together with data acquisition software, which can meet the demand of any type of transportation. Ji et al. [29] used traffic big data from video monitoring to study vehicle category mining and its applications, as well as the potential applications of urban traffic big data and point of interest (POI) data in urban planning. Zhu et al. [30] proposed an algorithm for real-time vehicle detection in automatic number plate recognition data, which can effectively discover platoon companions. Zhang et al. [31] analyzed the traffic big data of the main urban area in Chongqing and established a road network operation evaluation model to evaluate the urban tail number restriction scheme. Wen et al. [32] proposed a large-scale search algorithm for bus trip chains based on historical bus data, consisting of bus IC card data and bus GPS data, and constructed a method and database for estimating the bus capacity rate. In addition, traffic big data has been applied in urban traffic congestion evaluation [33], subway and taxi connection travel planning [34], and short-time traffic flow forecasting [35].
These studies show that traffic big data mainly comes from three sources: internet-based public travel service data, industry-based production supervision data of operating companies, and sensor data collected through the IoT (Internet of Things) and IoV (Internet of Vehicles). Big data contains a great deal of potential information and value, and its diverse data types can be used effectively for road management and research. Combining big data with detector optimization models, and using the data appropriately, can create very high value at low cost.

The concept of big data for transportation has spurred big data-driven theory, which in turn has led to a focus on big data-driven mathematical modeling approaches. A data-driven model is built bottom-up, starting from the data. It is widely used in the analysis of complex traffic systems and in various traffic optimization problems because of its high accuracy and because it does not depend on knowledge of the underlying mechanism. Therefore, an analysis of how traffic big data affects the number and location of traffic detectors can be combined with a data-driven modeling approach to build a deployment optimization model. Specifically, on the one hand, traffic big data can describe the influences on traffic flow detector layout in detail, which is useful for building and solving the model; on the other hand, data sharing can enrich the data sources of the urban traffic information collection system and reduce the number of traffic flow detectors that need to be deployed.

Therefore, the purpose of this paper is to present an optimal traffic sensor layout model in the context of traffic big data. According to the data characteristics of traffic big data, this paper establishes an evaluation index system for the influencing factors of traffic sensor layout and solves the resulting model by the tolerant lexicographic method based on a genetic algorithm, which verifies the validity of the optimization goal and the feasibility of the solution. Combining big data with layout optimization is a cost-effective way to improve the sensor layout. The remainder of this paper is organized as follows. Section 2 gives the influencing factors of traffic sensor layout. Section 3 presents the multiobjective optimization model for traffic sensor layout and elaborates the related parameters and constraints. Section 4 proposes a tolerant lexicographic method based on a genetic algorithm. Section 5 carries out a case study on the classical Nguyen-Dupuis network to verify the reliability and feasibility of the proposed model. Finally, Section 6 concludes the paper.

2. The Influencing Factor Set of Traffic Sensor Layout

2.1. Notation

To facilitate the presentation and analysis of the optimal traffic sensor layout model, all definitions and notations used throughout this work are described in Table 1.

2.2. The Analysis of the Influencing Factors

The main considerations in planning and designing an urban traffic information acquisition system are the quality and integrity of the data acquired by the system and the system's one-time investment and operation and maintenance costs. Both are closely related to the number and the specific positions of the sensors in the road network. In the context of traffic big data, the influencing factors of traffic sensor layout are analyzed as follows.

2.2.1. System Cost

The urban traffic information acquisition system is a subsystem of ITS in which hardware cost, especially the purchase of traffic sensors, accounts for a large proportion of the total investment. The construction cost of a traffic sensor varies according to the type and location of the road infrastructure. In addition, the sensors in the road network must be maintained regularly because of environmental factors and limited service life. Optimizing the layout of traffic sensors and reducing their acquisition, construction, and maintenance costs are the main ways to reduce the total investment in the Intelligent Transportation System.

The system cost related to the layout of traffic sensor can be expressed as

2.2.2. Multisource Data Sharing

Many subsystems of the ITS and other applications have data collection capabilities. For example, traffic flow data can be obtained not only from the traffic signal control system and its inductive loop detectors but also from a monitoring system (including traffic and speed detection checkpoints) with microwave or video detectors. Through multisource data sharing, these data can be fed directly into the urban traffic information collection system.

The sensors that other systems have already laid out on a section are represented by a 0-1 variable: a value of 1 indicates that other systems have a sensor of the given type on the section, and a value of 0 means the opposite.

2.2.3. Data Demand

The data acquired by the urban traffic information collection system consist of the data collected by the system's own sensors and the data accessed directly from other systems. Traffic managers and travelers hope to obtain as much OD data and traffic flow data as possible. Considering the functions of the signal control system, the monitoring system, and self-provided sensors, the sensor layout in the urban traffic information acquisition system can be designed primarily for OD data acquisition. The three basic principles of traffic sensor layout based on OD estimation are as follows [36, 37].

(1) OD coverage principle: The sensors in the road network should provide full coverage, so that the travel information of every OD pair can be observed. In other words, every OD pair must have an observed section. The presence of a detector on a section is expressed by 0-1 layout variables. A further 0-1 variable denotes whether a section belongs to an OD pair: it takes 1 when the section is on some path of the OD pair, and 0 otherwise. The OD coverage principle requires that, for every OD pair, at least one section on its paths carries a detector.

(2) Maximum truncation flow principle: Considering the accuracy of OD estimation, sensors should be installed on sections with a large net flow. For a given OD pair, the sensor should be installed on the path with the maximum flow ratio. Two 0-1 variables are used: one takes 1 when a section is on a given path, and the other takes 1 when there is a sensor on that path; a value of 0 means the opposite. The maximum truncation flow principle requires that the flow intercepted by the sensors be as large as possible.

(3) Minimum path coverage principle: Considering the accuracy of the OD estimate, sensors should be installed so that the total number of paths passing through them is as small as possible. For a given path, the sensor should be installed on the section traversed by the smallest number of paths. The number of paths through each section is counted, and a 0-1 variable indicates whether the section has a detector: it is 0 if there is no detector on the section, and 1 otherwise. The minimum path coverage principle requires that the total number of paths through the sections with detectors be as small as possible.

(4) Sensor failure: A sensor may fail because of a failure of the detection or communication equipment, or because of environmental and human factors, which reduces data quality. In the optimal layout of traffic sensors, the probability of sensor failure should be considered. This probability varies with the type of sensor and the type of section. For example, the failure probability of an inductive loop detector installed on a section frequently used by heavy trucks is much higher than average. A 0-1 variable describes detector failure: it takes 1 when a sensor of a given type on a given section has failed, and 0 otherwise.

(5) Road infrastructure: Not all road sections can be equipped with sensors, because of road infrastructure constraints. When it is impossible or difficult to install a given sensor type on a section, the feasible region of the corresponding layout variable can be reduced to 0. Alternatively, the construction cost of that sensor type on that section can be set to a very large number. In addition, the construction cost of different types of sensors installed on different types of sections varies, as does the number of installations; an integer parameter gives the number of sensors of each type that need to be installed on each section.

(6) Types of sensors: Currently, common traffic sensors include inductive loop sensors, video sensors, and microwave radar sensors. The selection of sensor types is generally carried out by qualitative analysis, the most important principles of which are to minimize the number of traffic flow sensor types and to use multifunctional sensors where possible, to facilitate management and reduce costs. Different types of sensors have different ranges of applicability [38]. For example, inductive loop sensors are highly susceptible to malfunction on roads that are frequently covered by ice and snow; long nights reduce the effectiveness of video sensors; and microwave radar sensors have low accuracy on roads with pronounced mixed traffic flows. Considering the principles for selecting sensor types and the applicable scope of each sensor, the optimal traffic sensor layout model assumes that microwave radar sensors are installed on urban expressways and video sensors on urban trunk roads and subtrunk roads.
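
The three OD-based principles above can be illustrated on a toy network. The following is a minimal sketch rather than the paper's formulation: the path set, the flows, and the function names (`od_covered`, `truncation_flow`, `path_coverage`) are hypothetical. Truncation flow is assumed to be the total flow on paths that pass at least one observed section, and path coverage is assumed to be the sum, over observed sections, of the number of paths through each section.

```python
# Hypothetical toy data: each path is (OD pair, sections on the path, flow).
PATHS = [
    ("A-B", {1, 2}, 300),
    ("A-B", {1, 3}, 200),
    ("C-D", {3, 4}, 400),
    ("C-D", {5}, 100),
]


def od_covered(observed):
    """OD coverage: every OD pair has a path through an observed section."""
    pairs = {od for od, _, _ in PATHS}
    seen = {od for od, sections, _ in PATHS if sections & observed}
    return pairs == seen


def truncation_flow(observed):
    """Assumed definition: total flow on paths crossing an observed section."""
    return sum(flow for _, sections, flow in PATHS if sections & observed)


def path_coverage(observed):
    """Assumed definition: sum over observed sections of paths through each."""
    return sum(
        sum(1 for _, sections, _ in PATHS if s in sections) for s in observed
    )


own = {3}      # sections equipped by the collection system itself
shared = {5}   # sections whose data are shared by other systems
observed = own | shared
print(od_covered(observed), truncation_flow(observed), path_coverage(observed))
```

Treating shared sections as observed without adding sensors is how multisource data sharing enters this sketch: the set `shared` contributes to coverage and intercepted flow but not to the number of new layout points.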

3. Multiobjective Optimization Model for Traffic Sensor Layout

3.1. Minimum System Cost Optimization

The minimum system cost optimization model based on (1) is established by the acquisition cost, construction cost, and maintenance cost of the traffic flow sensor.
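
As a rough illustration of this objective, the sketch below sums acquisition, construction, and maintenance costs over the sections where the system installs its own sensors. All names and cost figures are hypothetical, and sections whose data are shared by other systems are assumed to add no cost.

```python
from dataclasses import dataclass


@dataclass
class SensorCost:
    acquisition: float
    construction: float
    maintenance: float

    def per_unit(self) -> float:
        return self.acquisition + self.construction + self.maintenance


# Hypothetical unit costs per sensor type (arbitrary monetary units).
COSTS = {
    "video": SensorCost(1.0, 0.5, 0.2),
    "radar": SensorCost(1.5, 0.6, 0.2),
}


def system_cost(layout):
    """layout maps section -> (sensor type, number of sensors installed)."""
    return sum(COSTS[kind].per_unit() * count for kind, count in layout.values())


# One video sensor on section 2 and two radar sensors on section 3.
layout = {2: ("video", 1), 3: ("radar", 2)}
print(round(system_cost(layout), 2))
```

Minimizing this quantity over the admissible layouts corresponds to the first objective of the model; an installation that the road infrastructure rules out could be modeled by giving it a very large construction cost, as described in Section 2.2.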

3.2. Maximum Truncation Flow Optimization

Considering the impact of sensor failure, a reliability level is defined for each path [39] and compared with a reliability threshold. If the reliability level of a path satisfies the threshold, traffic data for the path are accessible; otherwise, they are unavailable.

A 0-1 variable indicates whether traffic data can be obtained for a path: it takes 1 if traffic data for the path are available, and 0 otherwise.
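
One plausible way to make this availability indicator concrete is sketched below. It assumes that sensors fail independently and that a path stays observable as long as at least one sensor on it works; neither assumption is stated in the text. The failure probabilities, the threshold value, the direction of the comparison, and the function names are hypothetical.

```python
import math


def path_reliability(failure_probs):
    """Probability that at least one sensor on the path works.

    Assumes independent failures; an empty list means no sensor on the path.
    """
    if not failure_probs:
        return 0.0
    return 1.0 - math.prod(failure_probs)


def data_available(failure_probs, threshold=0.9):
    """0-1 indicator: 1 if the path reliability reaches the threshold."""
    return 1 if path_reliability(failure_probs) >= threshold else 0


print(data_available([0.3]))       # one sensor with failure probability 0.3
print(data_available([0.3, 0.2]))  # a second sensor added on the same path
```

Under these assumptions, a single failure-prone sensor may leave a path below the threshold, while adding a second sensor lifts it above, which is the mechanism by which fault conditions can increase the required number of layout points.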

A maximum truncation flow optimization model is constructed based on (7), which is expressed as follows:

3.3. Minimum Path Coverage Optimization

The minimum path coverage is affected by multisource data sharing but not by sensor failure. The expression for the minimum path coverage optimization model is shown in (10).

3.4. OD Coverage Constraints

Considering the effect of sensor failure, a reliability level is defined for each OD pair as the probability of obtaining its traffic data, that is, the complement of the probability of not obtaining them. The expression is as follows:

If this level satisfies the reliability threshold, traffic data for the OD pair are accessible; otherwise, traffic data for the OD pair are unavailable.

On the basis of (5), the OD coverage constraint is presented:

3.5. Multiobjective Optimization Model

In summary, the multiobjective optimization model of the traffic sensor layout can be expressed as

4. Tolerant Lexicographic Method Based on Genetic Algorithm

4.1. Tolerant Lexicographic Method

In the multiobjective optimization problem of traffic sensor layout, an improvement in one subobjective may degrade one or more of the others, so the aim is to make each subobjective as good as possible and obtain a Pareto optimal solution. In this paper, the tolerant lexicographic method is chosen for solving the model [40]. The method ranks the objective functions of the optimization problem by importance and then solves them in order, so that each step preserves room for optimizing the next objective. Each decision step has a practical meaning and context, and the method transforms a multiobjective optimization problem into a series of single-objective optimization problems, which significantly improves solving efficiency [41]. The basic idea is that the optimal value of the previous step, relaxed by a tolerance, is added as a new constraint to the next step, and the solution of the final step is taken as the optimal solution of the original problem [42]. The specific process is as follows.

Step 1: the minimum system cost is obtained by solving the minimum system cost model.

Step 2: the maximum truncation flow is calculated after adding a constraint that keeps the system cost within a tolerance of its optimal value.

Step 3: the minimum path coverage, and with it the optimal solution, is obtained after adding a further constraint that keeps the truncation flow within a tolerance of its optimal value.
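
The three steps above can be sketched on a tiny 0-1 problem. This is an illustrative sketch only: it enumerates all candidates by brute force instead of using a genetic algorithm, and the section costs, flows, path counts, tolerance values, and the stand-in feasibility rule are all hypothetical.

```python
from itertools import product


def tolerant_lexicographic(candidates, objectives, tolerances):
    """Minimize objectives in priority order.

    After each step, keep only the candidates whose value is within the
    given tolerance of that step's optimum.
    """
    feasible = list(candidates)
    for objective, tolerance in zip(objectives, tolerances):
        best = min(objective(x) for x in feasible)
        feasible = [x for x in feasible if objective(x) <= best + tolerance]
    return feasible


# Hypothetical 3-section example: x[i] = 1 if section i gets a sensor.
COST = [1.0, 1.0, 1.0]
FLOW = [300, 500, 200]     # flow intercepted per section (assumed additive)
PATHS_THROUGH = [2, 4, 1]  # number of paths through each section


def cost(x):
    return sum(c * xi for c, xi in zip(COST, x))


def negative_flow(x):
    # Maximizing flow is written as minimizing its negative.
    return -sum(q * xi for q, xi in zip(FLOW, x))


def coverage(x):
    return sum(p * xi for p, xi in zip(PATHS_THROUGH, x))


# Stand-in feasibility rule: at least one sensor must be installed.
candidates = [x for x in product((0, 1), repeat=3) if sum(x) >= 1]
objectives = [cost, negative_flow, coverage]

print(tolerant_lexicographic(candidates, objectives, [1.0, 300, 0]))
print(tolerant_lexicographic(candidates, objectives, [0, 0, 0]))
```

With zero tolerances the method reduces to a strict lexicographic ordering, in which the first objective alone largely determines the result; the nonzero tolerances let the later objectives trade a little cost or flow for a better overall compromise.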

4.2. Genetic Algorithm

When the tolerant lexicographic method is applied to the multiobjective optimization model, the optimization problem for each subobjective must be solved. Each of these is a nonlinear 0-1 integer programming problem and can be solved by a genetic algorithm [43]. The basic steps [44] are as follows.

Step 1 (initialization): binary encoding is used, and the individual size (that is, the number of elements in the set), population size, selection probability, crossover probability, mutation probability, maximum number of generations, and so on are set.

Step 2 (fitness evaluation): for a minimization problem, the reciprocal of the function value is taken as the fitness value of the individual.

Step 3 (selection): the roulette method is used to randomly select good individuals from the old population to form a new population that breeds the next generation.

Step 4 (crossover): two individuals are randomly selected from the population, and their chromosomes are exchanged and recombined so that good characteristics are passed from the parent strings to the child strings, producing new, better individuals.

Step 5 (mutation): one individual is randomly selected from the population, and one point in the individual is selected to mutate, which may produce a better individual.
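
The steps above can be sketched as a small binary genetic algorithm. This is a minimal illustration, not the authors' implementation: the parameter values, the function name `genetic_minimize`, and the penalized objective are hypothetical, mutation is applied bit by bit to every child rather than to a single selected individual, and the objective is assumed to be strictly positive so that its reciprocal can serve as fitness.

```python
import random


def genetic_minimize(objective, n_bits, pop_size=30, generations=60,
                     p_cross=0.8, p_mut=0.05, seed=0):
    """Minimize a positive objective over 0-1 vectors (n_bits >= 2)."""
    rng = random.Random(seed)
    # Step 1: binary-encoded initial population (pop_size should be even).
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=objective)
    for _ in range(generations):
        # Step 2: fitness is the reciprocal of the objective value.
        fitness = [1.0 / objective(ind) for ind in pop]
        # Step 3: roulette-wheel selection.
        parents = rng.choices(pop, weights=fitness, k=pop_size)
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            a, b = a[:], b[:]
            # Step 4: single-point crossover.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            children += [a, b]
        # Step 5: bit-flip mutation.
        for child in children:
            for i in range(n_bits):
                if rng.random() < p_mut:
                    child[i] ^= 1
        pop = children
        best = min(pop + [best], key=objective)  # keep the best ever seen
    return best


def objective(x):
    # Hypothetical: one cost unit per sensor, plus a penalty when neither
    # section 0 nor section 3 is observed (stand-in coverage constraint).
    penalty = 0 if (x[0] or x[3]) else 100
    return 1 + sum(x) + penalty


best = genetic_minimize(objective, n_bits=6)
print(best, objective(best))
```

Handling the coverage constraint through a penalty term is one common design choice for genetic algorithms; the text does not say how constraints are handled, so this should be read as an assumption.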

5. Case Study

5.1. Basic Parameters of the Case

The classical Nguyen–Dupuis network [45], with 13 nodes, 19 sections, and 4 OD pairs, is used for the case study. The network and its basic characteristics are shown in Figure 1. The first item in parentheses is the section number, the second item is the free-flow travel time, and the third item is the traffic capacity. Nodes 1 and 4 are traffic demand generation points, and nodes 2 and 3 are traffic demand attraction points. The OD traffic demand of this network is given in Table 2.

The optimal traffic sensor layout in the context of big data requires considering a set of influencing factors. However, calculating the system cost requires a road network survey, and choosing the type of detector relies on qualitative methods; both are difficult to show in a case study. Therefore, the following specific conditions are set: all sections in the road network are trunk roads; section 13 is covered by the traffic monitoring system with video detectors; section 8 is covered by the signal control system with inductive loop detectors; no detectors can be installed on section 17; detectors can be installed on all other sections; and the cost of detectors of the same type is equal.

Effective path sets and possible flows are determined by static traffic assignment [46], as shown in Table 3. For ease of calculation, each flow value is rounded to an integer multiple of 5.

5.2. Optimizing the Effectiveness of Objectives

The minimum system cost, maximum truncation flow, and minimum path coverage models are solved by the genetic algorithm. The changes in system cost, traffic flow intercept, and path coverage with the number of sensor layout points are shown in Figures 2 to 4, respectively.

5.2.1. The Minimum System Cost Model

As shown in Figure 2, the system cost increases linearly with the number of points in the sensor layout, because sensors of the same type have the same cost on trunk roads. The higher the number of points, the greater the system cost.
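
The figures reported in Sections 5.2.4 and 5.3 are consistent with a constant cost per point. As a quick check, the sketch below assumes a unit cost of 1.68 million, inferred from the reported single-point minimum; the name `UNIT_COST` is illustrative.

```python
UNIT_COST = 1.68  # million per layout point, inferred from the reported minimum


def system_cost(points: int) -> float:
    """Linear system cost for a given number of layout points."""
    return round(UNIT_COST * points, 2)


for n in (1, 2, 18):
    print(n, system_cost(n))
```

The values for 1, 2, and 18 points are 1.68, 3.36, and 30.24 million, which match the reported minimum cost, the cost of the final two-point solution, and the reported maximum cost, respectively.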

5.2.2. The Maximum Truncation Flow Model

As shown in Figure 3, the truncation flow first increases with the number of points and reaches its optimal value once the number of points exceeds two. For a given number of points, there may be one or more optimal solutions for maximum truncation flow; one of them is chosen arbitrarily to calculate the path coverage. In addition, as the number of points increases, the path coverage corresponding to the maximum truncation flow increases with fluctuations.

5.2.3. The Minimum Path Coverage Model

As can be seen from Figure 4, the path coverage first decreases and then increases as the number of points grows. The initial decrease is due to the sensors already installed on the road network. For a given number of points, there may be one or more optimal solutions for minimum path coverage; one of them is chosen arbitrarily to calculate the truncation flow. Moreover, the path coverage corresponding to the truncation flow increases with fluctuations as the number of points increases.

5.2.4. Comparative Analysis

For single-objective optimization, the feasible number of points ranges over [1, 18]; the system cost varies within [1.68, 30.24] and is optimal when the number of points is 1; the truncation flow varies within [725, 1400] and takes its optimal value when the number of points is greater than 2; the path coverage varies within [4, 48] and takes its optimal value when the number of points is 2 or 3.

In summary, Figures 2 to 4 describe how system cost, traffic flow intercept, and path coverage change and how they relate to each other, which supports the effectiveness of the multiobjective programming model.

5.3. Calculation Procedure

The specific steps to solve the model are as follows:

Step 1 (minimum system cost optimization): the minimum and maximum system costs are obtained by solving (19); their values are 1.68 million and 30.24 million, respectively.

Step 2 (maximum truncation flow optimization): according to (20), and subject to the tolerance constraint on the system cost from Step 1, the optimal truncation flow and the maximum attainable truncation flow are both 1400.

Step 3 (minimum path coverage optimization): the minimum path coverage and the optimal solution are obtained by solving (21), subject to the tolerance constraints from the previous steps. In the optimal solution, video detectors are installed on sections 2 and 3. The system cost is 3.36 million, the traffic flow intercept is 1125, and the path coverage is 10.

5.4. Results and Discussion

The traffic sensor layout model has three optimization objectives (minimum system cost, maximum truncation flow, and minimum path coverage) and an origin-destination (OD) coverage constraint, so it meets the data requirements while optimizing the system cost. An optimal solution can be obtained by applying the tolerant lexicographic method based on a genetic algorithm.

In addition, the model takes into account the sharing of data from multiple sources. The traffic monitoring system covers section 13, and its video detector data are fed into the urban traffic information collection system, which reduces the number of detector points. For example, with the video detector on section 13, the OD coverage constraint can be satisfied by placing a single traffic sensor on section 7, section 9, or section 11; without it, the OD coverage constraint cannot be satisfied when the number of points is 1.

The model also considers sensor fault conditions. In the case study, the signal control system covers section 8, and the inductive loop detector on this section is prone to failure, so additional detectors are needed to improve system robustness. For example, if sensor fault conditions are ignored, installing a detector only on section 2 is a feasible solution; once fault conditions are considered, this layout cannot satisfy the OD coverage constraint. In general, the indicators considered in the optimization model, such as whether detectors are installed and the type and cost of the installed detectors, are easy to obtain in practice, which allows the influence of each factor on the model results to be characterized clearly. The computational steps of the model are clear, and the calculation is simple. The sensor layout obtained from the optimization model shows a clear improvement, which indicates that the model is feasible and practical.

6. Conclusions

This paper has studied an optimal traffic sensor layout model that aims to promote the construction and development of urban traffic information acquisition systems. Considering the impact of traffic big data, a set of influencing factors for traffic sensor layout has been established, including system cost, multisource data sharing, data demand, sensor failures, road infrastructure, and sensor type. With minimum system cost, maximum truncation flow, and minimum path coverage as objectives and OD coverage as a constraint, the model was formulated and solved by the tolerant lexicographic method based on a genetic algorithm. The classical Nguyen–Dupuis network was used as a case to demonstrate the validity of the optimization objectives and the feasibility of the solution.

The multiobjective optimization model for traffic sensor layout not only keeps the system cost low and satisfies the data requirements given by the OD coverage principle, the maximum truncation flow principle, and the minimum path coverage principle but also reduces the duplication of detectors under multisource data sharing. It also enhances system robustness in the case of detector failure.

However, this paper presents an analysis based on a numerical example, whereas an actual road network has more constraints and more complex traffic conditions, which makes the model harder to apply. Because different detectors are often managed by different departments, it is worth discussing whether data can actually be shared as the optimized layout assumes and whether the optimal effect can be achieved. In future work, the model should be extended and applied to a specific problem in a city to test its effectiveness and reliability. The paper also does not consider differences in data quality between detectors or the impact on the convenience of data storage, analysis, and application. A comparison of the data collected before and after detector optimization should be made to ensure that the model does not cause a loss of collected data. In addition, this paper considers detector failures but does not examine their causes or the random distribution characteristics of the failure probability. It also does not address the case where the OD coverage constraint is no longer met after a failure occurs, which is a direction for future research.

Data Availability

The OD traffic demand data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Humanity and Social Science Youth Foundation of Ministry of Education of China (No. 19YJC630148) and the China Postdoctoral Science Foundation (No. 2018M641169).