HD-RMPC: A Hierarchical Distributed and Robust Model Predictive Control Framework for Urban Traffic Signal Timing
Due to the nonlinearity and dynamics of transportation systems, traffic signal control (TSC) in urban traffic networks has always been an important challenge. In recent years, model predictive control (MPC) has shown extraordinary potential in TSC due to its outstanding ability to model dynamic systems. However, the relatively complex online computing, lack of reasonable setpoints for target solving, and uncertainty of traffic network hinder MPC from being further applied. To address these problems, we propose a hierarchical, distributed, and robust model predictive control (HD-RMPC) framework for urban TSC. At the slow-update layer, the road network is dynamically divided into several subareas according to regional attributes and real-time traffic demand. Meanwhile, the volume is coordinated in a robust way for the purpose of traffic equilibrium and overflow prevention. Then, the set-point matrix of each subarea is calculated to equalize the flow in the subarea. This distributed framework guarantees the real-time performance of MPC in urban traffic networks. At the fast-update layer, we adopt an improved prediction model by explicit modeling of the disturbance and reduce the prediction error. Finally, the objective function is reconstructed and solved at the control layer to obtain the optimal control law. Through continuous and asynchronous optimization of the set point and prediction model, the framework significantly improves the control effect. Simulation evaluation based on a real-world road network demonstrates that the proposed HD-RMPC method outperforms all baselines and maintains excellent real-time performance.
As a consequence of the rapid growth of the urban population and vehicle ownership, existing infrastructure has difficulty meeting increasing travel demand, resulting in serious traffic congestion [1–3]. It is generally recognized that enhancing traffic efficiency with advanced signal control strategies is the main means to alleviate urban traffic congestion [4–6].
Many efforts have been devoted to developing appropriate traffic signal control (TSC) strategies for urban traffic networks. In general, existing TSC methods can be classified into three categories: fixed-time control, real-time traffic-responsive control, and computational intelligence-based control [3, 7]. Traditional fixed-time approaches periodically cycle through predetermined phase settings. One of the most typical tools is the traffic network study tool (TRANSYT), which can compile a series of fixed-time signal plans for different times of the day. In a time-varying urban environment with dynamic traffic flow, such methods inevitably cause time loss. To overcome this drawback, real-time traffic-responsive control methods have been proposed. The split-cycle offset optimization technique (SCOOT), the Sydney coordinated adaptive traffic system (SCATS), and traffic-responsive urban control (TUC) are the leading examples and have been widely used in urban traffic networks. Unfortunately, the control performance of SCOOT and SCATS may deteriorate under saturated traffic conditions, and TUC usually needs to be redesigned when real traffic conditions change dramatically. With the penetration of intelligent transportation systems (ITSs), researchers have tried to regulate TSC systems in a “smart” way. More factors, such as infrastructure, human factors, and wireless communication, are taken into consideration in the implementation of traffic control. In this context, TSC methods based on computational intelligence, including but not limited to fuzzy logic, swarm intelligence algorithms, and artificial neural networks, have been widely adopted by researchers. Among them, the most remarkable are reinforcement learning (RL) methods.
Over the past ten years, numerous RL models, such as Q-learning, deep Q networks (DQNs) [12–14], actor-critic methods [15, 16], and deep deterministic policy gradients (DDPGs) [17, 18], have been used to improve the efficiency of TSC systems. Faced with large-scale road networks, researchers have further extended single-agent RL methods to multiagent RL (MARL) methods and introduced them to address multiple interacting intersections [19–21]. However, stability and learning efficiency remain the shackles that restrict RL methods from moving toward practical application [3, 5, 9, 22]. Besides, augmenting the TSC system with connected vehicles [23, 24], autonomous vehicles, or other emerging technologies [26–28] is also an effective approach. For example, in one such study, the authors incorporate trajectories of connected vehicles into signal timing optimization, formulate the TSC problem as a mixed-integer nonlinear program, and propose a multistage method to solve it. Simulation results show that the proposed method has the best fuel economy. In such strategies, connected autonomous vehicles provide the TSC system with more varied and more accurate data, which helps the system make better decisions. Unfortunately, the relevant basic technologies have not yet been widely deployed in the actual road environment, so such methods lack universality at this stage.
Unlike RL methods, whose rewards are subject to randomness, model predictive control (MPC) derives a time-series optimal signal plan by predicting future traffic states and executing rolling optimization and feedback correction, which is stable and reliable [29–31]. In addition, MPC can correct for uncertainty in the model-based prediction by using the system output actually measured at the next sampling moment to perform the next optimization. In this way, MPC can be applied to nonlinear time-varying systems that are difficult to model accurately, and it can still obtain high-quality control actions with only ordinary predictive accuracy. As a result, MPC has drawn much attention as a means of improving TSC in large-scale transportation networks. However, the application of MPC-based methods to TSC systems still faces the following challenges:
(i) Real-time performance: with the increase in the number of controlled intersections, the computational complexity of MPC increases exponentially, which is a major challenge to the real-time performance of online TSC. To improve the real-time performance of MPC, decentralized, hierarchical, or distributed control structures are widely applied. Ma et al. constructed a decentralized model predictive signal control method using a back-pressure policy, which effectively improved the performance under congested conditions in terms of throughput and comprehensive transportation efficiency. Ye et al. proposed a hierarchical and distributed MPC to improve the real-time performance of a TSC system. However, the control performance was lower than that of centralized MPC due to the added complexity brought about by the distribution scheme. In addition, it is unreasonable in these methods to set the control range of a single agent subjectively or to take a single intersection as a distributed control unit.
(ii) MPC set-point optimization: the quality of a set point directly determines the control performance of the system. However, most existing MPC-based TSC methods have not paid enough attention to the optimization of set points. Some researchers employ a macroscopic fundamental diagram (MFD) to determine the expected number of vehicles in a controlled area. However, because an MFD is not well defined for heterogeneous networks and often exhibits high scatter and hysteresis, this kind of method is not universal for real road networks. Others employ a model-free adaptive control (MFAC) approach to address the set-point determination of traffic control. Although it can obtain good results, such a method relies heavily on high-quality data and is not stable enough.
(iii) Uncertainty handling: although there is a feedback correction mechanism in MPC, the systematic defect of using a linear model to deal with nonlinear problems still introduces uncertainty into the control system. The two most representative sources are the influence of the unobservable disturbance on the prediction results in the road section and the uncertainty of subarea control parameters. For the former, some studies have taken the impact of a disturbance into account, but none has tried to model the disturbance [36–38]. For the latter, the use of a robust model can reduce the impact of uncertainty on the optimal control of traffic systems [39–41].
In summary, although there are improvements in many aspects, current MPC-based TSC methods still have some limitations. In a sophisticated MPC method for urban road networks, reasonable subarea division, dynamic set-point optimization, and full consideration of system uncertainty are necessary.
Based on the above discussion, we propose a hierarchical, distributed, and robust MPC approach (HD-RMPC) for urban TSC. Specifically, the system adopts a multihierarchical distributed design. At the slow-update layer, the road network is divided into several coordinated control subareas, and the set-point matrix of each subarea is dynamically optimized according to the network topology and real-time traffic flow. Meanwhile, system sampling is performed by the fast-update layer, and a model that explicitly considers disturbances is used to make predictions. At the control layer, the objective function is constructed and solved according to the prediction results and the set-point matrix, and the optimal control law is obtained. Compared with existing MPC-based TSC methods, our proposal continuously optimizes the set point so that the traffic flow in the road network tends toward equilibrium, which is more effective against local congestion. Moreover, the robust response to system uncertainty and the explicit modeling of disturbances significantly improve the accuracy of MPC. The main contributions of this paper are summarized as follows:
(1) A novel MPC-based TSC method named the HD-RMPC is developed for urban road networks. It formulates the TSC problem as a multiple combinatorial optimization problem and adopts a distributed computing form to solve it, which is economical and efficient.
(2) Inspired by the theory of traffic equilibrium, we develop a dynamic set-point optimization model that considers traffic flow, network topology, and evacuation demand in controlled subareas. Compared with existing methods, the proposed model has stable performance and does not depend on large amounts of historical data.
(3) To deal with the uncertainty in the system, the unobservable disturbance is modeled parametrically and updated in real time through online sampling, which greatly improves the prediction accuracy. In addition, the model with uncertain parameters in subarea coordination is replaced by its robust equivalent to improve the stability of the control system.
The rest of the paper is organized as follows. Section 2 formulates the problem. We present the details of the proposed method in Section 3. The simulation results and analysis are shown in Section 4. In Section 5, concluding remarks and trends for further research are discussed.
2. The General Form of an MPC-Based TSC Method
2.1. The Main Composition and Process of MPC
Figure 1 presents a schematic diagram of MPC for TSC. Generally, the execution process of an MPC approach is composed of three parts: a prediction model, an objective function, and a rolling optimization scheme.
A concrete example is given in Figure 2. MPC predicts the future state of the system in the finite time domain based on the prediction model according to the sampling information [31, 38]. Considering the system constraints, it establishes the control optimization problem in the future finite time domain and obtains a control sequence to make the system state tend to the set point. Moreover, MPC has the characteristics of feedback correction and receding horizon optimization, which can compensate for the uncertainty caused by model mismatch, distortion, interference, and other factors in a timely manner.
2.2. The General Form of MPC-Based TSC Methods
The most critical part of MPC-based methods is using a model to predict the future dynamics of the traffic flow of urban roads at each optimization step. To maintain universality, traffic flow dynamics under a discrete-time system can be expressed aswhere is the system state (i.e., the traffic volume) of the th discrete timestep and and denote the control input vector (i.e., the green time) and the measurement vector (i.e., the travel demand), respectively. The optimization function can be defined as follows:subject towhere and are the prediction and control horizons, respectively.
At each prediction step, the system performs equation (1) to obtain the prediction result . Meanwhile, the objective function equation (2) will be solved, and the optimal control sequence will be output. When the prediction horizon is moved forward one step, the same optimization process over the new prediction horizon is repeated.
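The receding-horizon procedure described above can be illustrated with a minimal Python sketch. It is not the formulation used in this paper: the one-link `predict` function, the discrete set of candidate green times, the quadratic tracking cost, and the function names `predict`, `mpc_step`, and `run_closed_loop` are all assumptions chosen only to keep the example self-contained.

```python
from itertools import product


def predict(x, u, d, discharge_rate=0.5):
    """Toy stand-in for the prediction model (assumed, not equation (1)).

    x: vehicles on the link, u: green time in seconds,
    d: vehicles arriving during the cycle.
    """
    return max(0.0, x + d - discharge_rate * u)


def mpc_step(x, demand_forecast, set_point, greens=(10, 20, 30, 40)):
    """Enumerate green-time sequences over the horizon; return the cheapest."""
    best_cost, best_seq = float("inf"), None
    for seq in product(greens, repeat=len(demand_forecast)):
        x_pred, cost = x, 0.0
        for u, d in zip(seq, demand_forecast):
            x_pred = predict(x_pred, u, d)
            cost += (x_pred - set_point) ** 2  # assumed quadratic tracking cost
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_closed_loop(x0, demand, set_point, horizon=3):
    """Apply only the first control of each plan, then shift the horizon."""
    x, applied = x0, []
    for k in range(len(demand) - horizon + 1):
        plan = mpc_step(x, demand[k:k + horizon], set_point)
        # The "real" system is taken to be identical to the model here.
        x = predict(x, plan[0], demand[k])
        applied.append(plan[0])
    return x, applied


final_state, greens_applied = run_closed_loop(30.0, [10.0] * 6, set_point=10.0)
print(final_state, greens_applied)
```

Exhaustive enumeration is used only because the toy problem is tiny; it stands in for whatever solver handles the actual objective function and constraints.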
From the above discussion, we can see that the application of MPC to TSC is based on an important assumption; that is, the travel demand of the whole traffic system and each subsystem can be observed. In practice, it can be obtained using fixed detectors, floating cars, etc. Therefore, as with most of the existing work, the same assumption is also used in this paper.
3. Distributed and Robust MPC with a Multihierarchical Structure
3.1. System Framework
Since an urban road network is a large nonlinear time-varying system with numerous traffic zones, centralized control approaches not only have a high computational burden but also face the problem of insufficient robustness. Applying distributed or hierarchical control structures is generally recognized as an effective solution [36, 42].
Figure 3 shows the proposed hierarchical and distributed hybrid MPC architecture for the TSC of urban road networks.
The slow-update layer is used for distributed processing of road networks and optimization of set points. Its input contains the topology of the road network, the real-time regional travel demand, and historical data. The outputs are the subarea division results and the set point. The fast-update layer is responsible for sampling the traffic state of each link at the time step and predicting the state of the next step, both of which will be used to build the final optimization function. Based on the output of other layers, an optimization function will be constructed and further solved at the control layer to obtain the optimal control law of the next step.
The components contained in each layer and how they work are described in detail as follows.
3.2. The Slow-Update Layer: Optimization of the Set Point
As a reference value in MPC, the set point directly determines the optimality of the solution. However, existing MPC-based TSC methods implicitly assume that the set point is known a priori. This means that the control objective is to drive the number of vehicles on the road toward a fixed value, which can hardly cope with time-varying traffic states. We regard the optimization of the set point as the process of bringing the traffic network to system equilibrium. In this way, the problem of set-point optimization is transformed into a combinatorial optimization problem, including traffic subarea division, subarea coordination control, and subarea dynamic traffic assignment.
Wardrop equilibrium was proposed to describe the user-optimal state; i.e., the traffic network reaches the equilibrium state when all users know exactly what the traffic status of the network is and try to choose the shortest path. In this case, the travel time on every used path of each OD pair is equal. Inspired by this, we regard the optimization of the set point as the process of bringing traffic flow to equilibrium. Combined with the idea of distributed control, the process is divided into three stages: traffic subarea division, subarea coordination control, and set-point calculation.
3.2.1. Traffic Subarea Division
Decomposing the network-level TSC problem into several subproblems to consequently reduce overall computation complexity is the basic idea of distributed control. The simplest way is to treat each individual signalized intersection as an agent, but this approach is apparently uneconomical. Considering the complexity of urban traffic networks, from simple geometric structure and road topology to regional attributes and historical traffic flow, many factors affect the division of traffic subareas. Referring to complex network theory [44, 45], this paper describes the “modularity (assumes to )” of road network from the perspectives of road network topology, area attributes, and traffic dynamic attributes and further presents a novel subarea division algorithm.
We use a directed graph to describe the topological structure of a road network. Each intersection is treated as a node, and each road link is set as an edge. Assume that the link connecting intersection and is a one-way street, and the relevancy degree can be calculated bywhere and are the traffic volumes and the number of lanes of this link, respectively. is the regional correlation of intersection and , which is generally defined as 0 (irrelevant), 0.1 (similar), or 0.2 (consistent). and are balance factors. In the same way, when the link is a two-way street, the relevancy degree can be calculated by
Then, we use the relevancy degree as the weight of edges to transform the directed graph into an undirected graph. Initialize the network to several subnetworks, each containing one node. Create a system weight matrix , where is represented as follows:where is the sum of the edge weights in the undirected graph. The increment of modularity is calculated by equation (7) after the subareas are merged.where is the sum of all edge weights connecting subarea . Select the direction of maximum growth or minimum reduction of for traffic subarea merging.
After completing subarea merging, the modularity of the road network will be calculated bywhere is the sum of the edge weights in subarea . Meanwhile, the weight matrix will be updated. Finally, repeat steps 4 to 6 until all traffic subareas are merged into one system. The partition result with the largest value in the merging process is selected as the optimal partition scheme.
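The merging procedure described above resembles greedy agglomerative modularity maximization from complex network theory. The following Python sketch shows that generic technique, under the assumption that the relevancy degrees have already been computed and are supplied as edge weights. The function name `greedy_modularity_partition`, the variable names, and the toy network are hypothetical, and the formulas in the comments are the standard ones, which may differ in detail from equations (7) and (8).

```python
def greedy_modularity_partition(edges):
    """Greedy agglomerative partition of an undirected weighted graph.

    edges: dict mapping (node_u, node_v) -> positive weight, no self-loops.
    Returns (best_partition, best_modularity).
    """
    total = sum(edges.values())
    nodes = sorted({n for pair in edges for n in pair})
    index = {n: i for i, n in enumerate(nodes)}
    comms = {i: {n} for n, i in index.items()}  # start: one node per subarea
    # e[i][j]: share of edge-weight ends linking subarea i to subarea j
    e = {i: {} for i in comms}
    for (u, v), w in edges.items():
        i, j = index[u], index[v]
        e[i][j] = e[i].get(j, 0.0) + w / (2 * total)
        e[j][i] = e[j].get(i, 0.0) + w / (2 * total)
    a = {i: sum(row.values()) for i, row in e.items()}
    q = sum(e[i].get(i, 0.0) - a[i] ** 2 for i in comms)  # modularity
    best_q, best = q, [set(c) for c in comms.values()]

    while len(comms) > 1:
        # Modularity change for merging each connected pair of subareas.
        candidates = [(2 * (e[i][j] - a[i] * a[j]), i, j)
                      for i in e for j in e[i] if i < j]
        if not candidates:  # remaining subareas are disconnected
            break
        gain, i, j = max(candidates)  # largest growth or smallest reduction
        for k, w in e.pop(j).items():  # fold subarea j into subarea i
            if k in (i, j):
                e[i][i] = e[i].get(i, 0.0) + w
            else:
                e[i][k] = e[i].get(k, 0.0) + w
                e[k][i] = e[k].get(i, 0.0) + w
                del e[k][j]
        e[i][i] += e[i].pop(j)
        a[i] += a.pop(j)
        comms[i] |= comms.pop(j)
        q += gain
        if q > best_q:  # remember the best partition seen while merging
            best_q, best = q, [set(c) for c in comms.values()]
    return best, best_q


# Toy network (assumed): two triangles of intersections joined by one link.
toy_edges = {(1, 2): 1, (1, 3): 1, (2, 3): 1,
             (4, 5): 1, (4, 6): 1, (5, 6): 1,
             (3, 4): 1}
partition, modularity = greedy_modularity_partition(toy_edges)
print(partition, round(modularity, 4))
```

On this toy network, merging continues until a single subarea remains, and the two-triangle split is returned because it gives the largest modularity seen during the process.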
3.2.2. Traffic Subarea Coordination and Boundary Control
To reduce the complexity of the problem, a subarea is selected as the main objective of coordination. Therefore, the average traffic saturation degree of a traffic network is taken as the coordination index of the controlled subareas. When the traffic saturation degree of the controlled subarea exceeds a threshold, its evacuation coefficient can be calculated by equations (9) and (10).where is the set of subareas adjacent to . is the evacuation coefficient from to . is the average traffic saturation degree of the controlled subarea . and are the traffic volumes of and . In an actual transportation system, the evacuation coefficient is restricted by the evacuation capacity of the subareas. Therefore, the constraint should be added to the optimization function. After obtaining the evacuation proportional coefficient of the controlled subarea, traffic flow coordination is realized by boundary control, and the control action is shown as follows:where , , and represent the average green time, evacuation flow rate, and turning rate, respectively, of connecting link . is the updated step length of the top layer, and is the number of evacuation routes.
In addition, considering the uncertainty of the evacuation coefficient between subareas, we employ a robust numerical expression of the coordinated control model. Assume that the evacuation coefficient intervals of subareas and based on historical data are and . In this paper, the linearly equivalent dual variables , , , and are introduced, and the parameters and between 0 and 1 are used to indicate whether a parameter is under robust control. In this way, the total traffic volume of the traffic subarea can be described aswhere is the travel demand in the th stage and is the percentage of the evacuated flow from subarea to subarea .
3.2.3. Calculation of the Set-Point Matrix
A traffic assignment coefficient matrix is established in each control subarea according to the static attribute values (i.e., lane grade, number of lanes, and length of sections) in the subarea, as follows:where is the traffic assignment coefficient of link in traffic subarea . , , and are the lane coefficient, lane number, and length, respectively, of link .
The set-point matrix of the traffic subarea in the stage can be expressed aswhere is the traffic assignment coefficient matrix of the traffic network. At this point, for the road network, the set-point optimization problem is transformed into the equalization distribution of traffic volume in several controlled subareas.
3.3. The Fast-Update Layer: A Prediction Model Explicitly considering Disturbance
The fast-update layer is responsible for sampling the traffic states from the road network, providing the sampling value and prediction state to the control layer. Herein, we apply a novel traffic flow prediction model which explicitly models the possible disturbance.
The store-and-forward (SF) model provides a succinct and clear mathematical description of the relationship between road network, traffic flows, and lights. Therefore, it is widely used in traffic control [46, 47]. The core idea of an SF model is regarding the urban road network as a directed graph with links and junctions. As a paradigm, Figure 4 presents a small urban traffic network that consists of two intersections, where the link connects the upstream and downstream. Its traffic flow dynamics can be expressed as follows.
Assume that the steady flow rate of the corresponding link is constant under fixed travel demand. The output flow of the link is shown as follows:where is the signal phase set of the link in the intersection . is the green time of the phase in the stage . represents the number of lanes corresponding to . is the corresponding traffic flow rate. By the same token, the input flow of the link is
Substituting equations (16) and (17) into equation (15), we have
Analogously, we can obtain the traffic flow dynamics of the link .
It can be seen from equations (18) and (19) that the future state of the road network is related to its current state, input traffic volume, signal control scheme, and traffic disturbance. Assuming that the traffic state is observable, for a traffic subarea, the state transition equation is shown as follows:where is the subarea feature matrix. is the control input vector consisting of all the green times. is the subarea travel demand matrix. is the travel demand vector.
Utilizing the simplified mathematical expression defined above, a linear state-space model for traffic networks of arbitrary size, topology, and characteristics can be clearly derived.
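To make the link-level bookkeeping concrete, the following Python sketch applies a vehicle-conservation update in the spirit of the store-and-forward model. It is a simplification under stated assumptions: outflow is taken as the sum over phases of green time, lane count, and per-lane flow rate, as described in the text; inflow is passed in directly rather than derived from upstream links; and the names `link_outflow` and `store_and_forward_step` and all numeric values are hypothetical.

```python
def link_outflow(green_times, lanes, flow_rates):
    """Vehicles discharged in one cycle, summed over the link's phases.

    green_times: seconds of green per phase
    lanes: number of lanes served by each phase
    flow_rates: discharge rate per lane in veh/s for each phase
    """
    return sum(g * n * s for g, n, s in zip(green_times, lanes, flow_rates))


def store_and_forward_step(x, inflow, outflow, disturbance=0.0):
    """Vehicles on the link after one control cycle (conservation of vehicles)."""
    return x + inflow - outflow + disturbance


# Assumed example: two phases, 0.5 veh/s per lane (1800 veh/h per lane).
outflow = link_outflow([30, 20], [2, 1], [0.5, 0.5])
x_next = store_and_forward_step(50.0, inflow=35.0, outflow=outflow)
print(outflow, x_next)
```

Stacking this update over all links of a subarea is what yields a linear state-space form, with the green times as the control input.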
As a disturbance item, is generated by the discrete approximation of the prediction model, the error of the input and output flow rates, and vehicles entering and leaving in the middle of a section, which are difficult to determine in an actual traffic network. Inspired by a BP neural network error correction mechanism, this paper proposes a disturbance modeling method to alleviate the influence of disturbance on prediction accuracy. We divide the disturbance model into an online term and an offline term , in which is roughly calibrated according to historical data and can be calculated by equation (21):where is the current system date and is the mean of the historical error of the prediction model on day . When the control horizon is rolled to the prediction horizon, we can obtain the real measured value of the last time point. Therefore, we can obtain the prediction error of the previous step online, and the online disturbance is shown as follows:where is the error learning rate and and are the actual and predicted values of the previous stage, respectively. Therefore, is as follows:
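The two-part disturbance model can be sketched in Python as follows. The exact forms of equations (21) to (23) are not reproduced here. The sketch assumes that the offline term is a plain average of the daily mean errors and that the online term is the previous prediction error scaled by the learning rate. The function names and the numbers are hypothetical.

```python
def offline_disturbance(daily_mean_errors):
    """Rough calibration from history: average of the per-day mean errors."""
    if not daily_mean_errors:
        return 0.0
    return sum(daily_mean_errors) / len(daily_mean_errors)


def online_disturbance(actual_prev, predicted_prev, learning_rate=0.5):
    """Correction from the latest measurement: scaled previous-step error."""
    return learning_rate * (actual_prev - predicted_prev)


def total_disturbance(daily_mean_errors, actual_prev, predicted_prev,
                      learning_rate=0.5):
    """Disturbance estimate as the sum of the offline and online terms."""
    return (offline_disturbance(daily_mean_errors)
            + online_disturbance(actual_prev, predicted_prev, learning_rate))


# Assumed numbers: the model under-predicted by 4 vehicles in the last stage.
d_hat = total_disturbance([2.0, 4.0], actual_prev=48.0, predicted_prev=44.0)
print(d_hat)
```

The learning rate controls how strongly the most recent error is trusted relative to the historical calibration.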
3.4. The Control Layer: Establishment and Solution of Optimization Function
The control layer will solve the optimization problem after obtaining the set-point matrix and predicted traffic flow from the slow-update and fast-update layers, respectively. Through the work above, the optimization problem of signal timing in a road network has been transformed into a convex quadratic programming problem with nonlinear constraints under the framework of hierarchical and distributed integration. Taking subarea as an example, the optimization objective is to obtain the optimal control law at the th step so that the predicted value approaches the set point as much as possible. The mathematical description of this problem is as follows:subject towhere is the objective function of traffic subarea . is the prediction step size. and are the abbreviated cycle time constraints and demand equilibrium constraints, respectively. is a correctional item, which will be executed when the green time obtained by the solution differs greatly from the previous stage. and are the road network weight matrix and correctional matrix, respectively. in equation (25) is designed to avoid excessive difference between the control schemes of the two stages.
Since the objective function takes the signal timing scheme of each intersection as an independent variable, the optimal signal control scheme sequence can be obtained by solving the minimum value of the function. The first element of the sequence is sent to the controlled traffic system for execution, and the one-step signal timing optimization is completed.
At the control layer, the TSC of an urban road network is decomposed into a series of discrete problems for each decision stage . At each sample stage, the optimization problem can be regarded as mixed-integer convex quadratic programming. Because of its very small size, existing optimization techniques can be used to solve it effectively. Here, we apply the sequential quadratic programming (SQP) algorithm to solve the problem [48, 49].
4. Simulation-Based Case Studies
4.1. Simulation Settings
The performance of the proposed HD-RMPC was evaluated using the microscopic traffic simulator VISSIM (version 9.0). The algorithm scripts are written in Python 3.9.
The interaction between the HD-RMPC controller and VISSIM is shown in Figure 5. Through the COM interface, VISSIM delivers real-time traffic status to the controller. The latter calculates the optimal control sequence of the system based on real-time traffic state, historical traffic state, and road network topology attributes and the first element of the control sequence will be fed back to VISSIM for execution via the COM interface. By repeating the above process at each time step, a simulation of online real-world traffic signal timing optimization can be completed.
4.1.2. Benchmark Methods and Metrics
To demonstrate the performance of the proposed HD-RMPC method, three TSC methods are used for comparison.
(i) Fixed-time control: the fixed-period F–B method proposed by F. Webster and Cobbe is adopted in this paper as a baseline. The scheme for each intersection is calculated based on the average flow.
(ii) Centralized MPC (C-MPC): this method regards the road network as a whole, and the objective is to minimize the sum of predicted vehicles on the road section through the predictive modeling of the controlled system. The C-MPC used in this paper follows previously published work. In contrast to the proposed HD-RMPC, C-MPC treats all intersections centrally instead of subdividing the road network. It also takes no measures against uncertainty, and its set points are set to 0 (as suggested in many previous papers, e.g., in [2, 38, 42]).
(iii) Distributed MPC (D-MPC): the applied distributed MPC was developed by Ye et al. It decomposes the entire system into multiple controlled subsystems and considers the coordination constraints between the subsystems on the basis of MPC for each controlled subsystem. In contrast to the proposed HD-RMPC, D-MPC lacks any treatment of uncertainty, and its set points are set to 0.
The following metric is used to evaluate the performance of the different TSC methods.
(iv) Average vehicle delay: as Wardrop suggested, we apply the average vehicle delay to evaluate the performance of the proposed method. For the vehicles in a given area, the average vehicle delay is calculated from the actual driving time of each vehicle and the time it would take to pass through the system at the design speed.
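As an illustration of the metric, the following Python sketch computes the average vehicle delay as the mean difference between each vehicle's actual driving time and its travel time at the design speed. This mean-of-differences form is an assumption consistent with the description above, and the function name and sample values are hypothetical.

```python
def average_vehicle_delay(actual_times, design_speed_times):
    """Mean of (actual driving time - design-speed travel time), in seconds."""
    if len(actual_times) != len(design_speed_times):
        raise ValueError("one design-speed time is needed per vehicle")
    if not actual_times:
        return 0.0
    total_delay = sum(t - t0 for t, t0 in zip(actual_times, design_speed_times))
    return total_delay / len(actual_times)


# Assumed sample: three vehicles with delays of 20 s, 50 s, and 20 s.
delay = average_vehicle_delay([120.0, 150.0, 100.0], [100.0, 100.0, 80.0])
print(delay)
```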
In addition to a comparison of these strategies, the study included an analysis of the efficiency of the different MPC-based solutions, which is reflected in the difference in CPU running time using the same operating platform environment. Meanwhile, in-process results in optimization control, such as the set point, traffic flow in each subarea, and the prediction accuracy of the model, are also shown and analyzed to illustrate the effectiveness of the improvement measures.
4.1.3. Study Area and Simulation Parameters
To evaluate the performance of the HD-RMPC, we chose a medium-sized urban network with prominent subarea characteristics. In the simulation process, we will also adjust the input flow of different links to simulate the change in traffic demand. Figure 6 illustrates the study area in Huangpu District, Shanghai, for the simulation experiment. There are 17 intersections in total, 9 of which are signal-controlled. The simulation network contains 44 links, 9 input links, and 9 output links. We assume that the average vehicle length is 5 meters and that the road lane is 3.5 m in width. The saturated flow rate for all links is set as 2000 veh/h.
We define the TSC cycle time as 60 seconds. During the simulation, the sampling time is also 60 seconds, and the simulation period is set to 4800 seconds (including 80 signal control cycles). Figure 7 shows the input flow to the simulation network. Herein, the flow rate of each link in the set varies once every 10 TSC cycles (600 seconds). The initial turn rate of the link is calibrated using historical data. Referring to [2, 32, 34, 38], in the simulation evaluation part, we repeated a set of experiments (including HD-RMPC, C-MPC, and D-MPC) five times with different random seeds at each prediction horizon and took the average result of the five experiments to avoid the impact of randomness; i.e., the evaluation was actually conducted 15 times (three different prediction horizons and five different random seeds). All of these experiments were performed on a computing platform with an Intel Core i9-10900K CPU (3.70 GHz), an NVIDIA GeForce RTX 2080 Ti GPU, and 32.0 GB of memory.
4.2.1. Evaluation of Process Performance
The process performance of the proposed HD-RMPC will be evaluated from three dimensions: the results of subarea division, the effect of subarea flow coordination, and the optimization of set points. We also compare the variation of prediction error with and without the robust expression at different prediction horizons.
At the beginning of the experiment, according to the subarea division method in Section 3.2, the simulation traffic network is divided into three subareas to obtain the maximum modularity and the results are shown in Table 1.
Figure 8 shows the evolution of traffic volume in each subarea during the simulation. At the beginning of the simulation, the total volume of traffic in subarea 2 is significantly higher than that in subareas 1 and 3. At the end of the simulation, the traffic capacity ratio of the three subareas is approximately 0.29 : 0.38 : 0.33; i.e., with the combined action of time-varying traffic demand, coordination, and optimization, the total traffic volume in the three subareas tends to the traffic capacity ratio.
Taking link 1 as an example, the relationship between set point and input traffic flow is given in Figure 9. We can see that the adjustment of set points has been done in a relatively gentle manner and the applied set-point optimization component responds to the change of input flow in general.
These results indicate that the proposed method can effectively balance the regional traffic flow of the road network, which supports the effectiveness of the dynamic subarea division and the set-point optimization components in HD-RMPC.
Figure 10 compares the prediction errors of HD-RMPC and D-MPC during the simulation. The difference is that the former applies a robust treatment of uncertainty and explicit modeling of disturbance, while the latter does not. Comparing Figures 10(a)–10(c), we can see that the prediction error of both methods increases with the prediction horizon. During TSC, accumulated prediction errors can steadily degrade the effect of MPC-based methods, as the subsequent evaluation of control results shows. Most importantly, although the initial prediction error of both HD-RMPC and D-MPC is considerable, HD-RMPC significantly reduces its prediction error through self-adjustment as the simulation progresses. This indicates that the proposed robust error characterization approach is valid.
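The qualitative behavior described here, a large initial error that shrinks as the disturbance estimate is corrected by feedback, can be reproduced with a toy example. This is a generic integrator-type disturbance estimator on a single scalar queue, not the robust error characterization of HD-RMPC; the function `simulate`, the gain, and all numeric values are hypothetical.

```python
def simulate(steps, d_true, gain, x0=20.0, net_flow=-1.0):
    """One-step prediction errors for a scalar queue with an unknown disturbance.

    True dynamics  : x[k+1] = x[k] + net_flow + d_true
    Prediction     : x_pred = x[k] + net_flow + d_hat
    Feedback update: d_hat <- d_hat + gain * (x[k+1] - x_pred)
    gain = 0 corresponds to a model without disturbance modeling.
    """
    x = x0
    d_hat = 0.0
    errors = []
    for _ in range(steps):
        x_pred = x + net_flow + d_hat
        x_next = x + net_flow + d_true   # stands in for the measured state
        err = x_next - x_pred            # equals d_true - d_hat
        errors.append(abs(err))
        d_hat += gain * err              # correct the disturbance estimate
        x = x_next
    return errors


errors_nominal = simulate(steps=6, d_true=4.0, gain=0.0)
errors_feedback = simulate(steps=6, d_true=4.0, gain=0.5)
print(errors_nominal)
print(errors_feedback)
```

Both variants start with the same error, but only the feedback variant reduces it over time; with a gain of 0.5 the error halves at every step in this noise-free setting.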
4.2.2. Evaluation of Control Effect
Figure 11 shows the average vehicle delay under different prediction horizons. As shown in Figure 11(a), when the prediction horizon is equal to 1, the average vehicle delay of the proposed method is 56.91%, 49.65%, and 51.89% lower than that of the fixed-time control, C-MPC, and D-MPC methods, respectively.
For the other MPC-based methods, we can see that the C-MPC method performs better than the D-MPC method (by about 4.45%). Although D-MPC takes the coordination of controlled subareas into account, its control performance is inevitably lower than that of C-MPC.
When , at the scale of the road network, the average vehicle delay of the proposed method is 54.88%, 64.56%, and 56.89% lower, respectively, than that of the three baselines. From this point on, C-MPC and D-MPC perform worse than fixed-time control because of the increase in prediction error. Meanwhile, the performance of C-MPC drops sharply due to its more serious error accumulation. Similarly, the HD-RMPC method also achieves the best performance when .
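The reported percentages can be cross-checked against each other. The sketch below assumes that each reduction is defined as (baseline delay − HD-RMPC delay) / baseline delay; this definition is an assumption, as is the helper name `relative_delays`. Under it, every baseline delay can be expressed in units of the HD-RMPC delay.

```python
def relative_delays(reductions_pct):
    """Baseline delays in units of the HD-RMPC delay.

    Assumes reduction = (baseline - proposed) / baseline, so that
    baseline / proposed = 1 / (1 - reduction).
    """
    return {name: 1.0 / (1.0 - r / 100.0) for name, r in reductions_pct.items()}


# Percentages quoted in the text for the first and second reported horizons.
first = relative_delays({"fixed-time": 56.91, "C-MPC": 49.65, "D-MPC": 51.89})
second = relative_delays({"fixed-time": 54.88, "C-MPC": 64.56, "D-MPC": 56.89})

# Relative advantage of C-MPC over D-MPC at the first horizon, in percent.
c_vs_d_first = 100.0 * (1.0 - first["C-MPC"] / first["D-MPC"])

print(round(c_vs_d_first, 2))
print(sorted(first, key=first.get))
print(sorted(second, key=second.get))
```

With this definition, the first set of percentages reproduces the stated advantage of about 4.45% for C-MPC over D-MPC, and the second set implies that both C-MPC and D-MPC have a higher delay than fixed-time control, with C-MPC the highest, which agrees with the discussion above.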
Comparing the simulation results across the tested prediction horizons, it can be found that the C-MPC method is more susceptible to traffic disturbance than the D-MPC method. The reason is that C-MPC models the whole traffic system, whereas D-MPC predicts and models the traffic flow of a target road section in each controlled subarea. When the dimension of modeling is reduced, the negative impact of prediction error is weakened accordingly. At the same time, the proposed HD-RMPC method can maintain the optimality of the solution and the robustness of the system in an actual traffic environment because of the hierarchical and distributed reconstruction of the algorithm framework and objective function.
Moreover, as the prediction horizon increases, the performance of C-MPC and D-MPC declines significantly, which differs from the results of pure numerical simulation experiments. This is because the traffic disturbance in the professional traffic simulation tool adopted in this study is unobservable and noisy, which is closer to real-world conditions.
On this basis, taking as an example, we further analyzed the variation of delays in each subarea during the simulation. As is shown in Figure 12, for different traffic subareas, it is obvious that the HD-RMPC method is superior to the other three methods in subarea 2 because the flexible set-point setting of the proposed method effectively evacuates the traffic flow in subarea 2. Combined with the analysis of process performance, we can conclude that the proposed set-point optimization and disturbance modeling methods can make the prediction state and the set goal tend to the real value at the same time and greatly improve the quality of the optimal solution.
4.2.3. Efficiency Analysis
To demonstrate the efficiency of the proposed algorithm, we report the average CPU running time per simulation spent solving the online optimization problem under different prediction horizons. The results are shown in Figure 13.
One counterintuitive phenomenon is that when , the efficiency of the C-MPC method is significantly higher than that of the other two methods. This can be attributed to the limited scale of the traffic network used here: in this setting, computational complexity is not the main factor affecting the CPU runtime, while the D-MPC and HD-RMPC methods require additional time for the decomposition and coordination of the traffic network. As the prediction horizon increases, the computational complexity of the C-MPC method grows geometrically, whereas the D-MPC and HD-RMPC methods retain their real-time efficiency thanks to their distributed structure.
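The crossover described above can be illustrated with a deliberately crude cost model. Everything in it is an assumption made for illustration: the cubic cost of a dense solve, the fixed coordination overhead, and the link counts per subarea are hypothetical and are not taken from the experiments or from Figure 13.

```python
# Hypothetical numbers, chosen only to illustrate the qualitative effect.
LINKS_PER_SUBAREA = [10, 12, 11]   # assumed number of controlled links per subarea
COORDINATION_OVERHEAD = 50_000     # assumed fixed cost of decomposition/coordination


def dense_solve_cost(n_vars):
    """Assumed cost of one dense optimization step: cubic in problem size."""
    return n_vars ** 3


def centralized_cost(links_per_subarea, horizon):
    """One problem over all links and all prediction steps."""
    return dense_solve_cost(sum(links_per_subarea) * horizon)


def distributed_cost(links_per_subarea, horizon, overhead):
    """One smaller problem per subarea plus a fixed coordination overhead."""
    subproblems = sum(dense_solve_cost(n * horizon) for n in links_per_subarea)
    return subproblems + overhead


for horizon in (1, 2, 3):
    print(
        horizon,
        centralized_cost(LINKS_PER_SUBAREA, horizon),
        distributed_cost(LINKS_PER_SUBAREA, horizon, COORDINATION_OVERHEAD),
    )
```

In this toy model the centralized formulation is cheaper at a horizon of 1 because the fixed overhead dominates, while the distributed formulation becomes cheaper as the horizon grows, mirroring the trend discussed in the text without reproducing its measured values.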
To summarize, faced with a road network with obvious subarea characteristics or local congestion, the proposed HD-RMPC method performs much better than all baselines with excellent computational efficiency. Meanwhile, the performance of the HD-RMPC method does not suffer as the prediction horizon length increases. Therefore, it has significant advantages in applications for real-world traffic networks.
MPC is proven to be a promising control paradigm for TSC. However, further efforts are still needed to cope with the challenges of real-time performance, set-point optimization, and prediction disturbance modeling. To address these problems, this paper proposes a distributed and robust MPC with a multihierarchical structure for urban traffic networks. Macroscopically, the applied multihierarchical distributed structure mode can achieve an ideal control effect with better real-time performance. Microscopically, the dynamic subarea division model, the robust coordination algorithm of subareas, and the set-point online optimization method based on traffic equilibrium theory have achieved good results. Meanwhile, traffic disturbances are explicitly modeled and feedback regulated for the first time, which further improves the prediction accuracy and control performance. A series of simulation experiments based on the real road network show that the proposed HD-RMPC performs better than the existing MPC-based method for the road network with significant subarea characteristics or local congestion.
Although the disturbance modeling method can significantly improve the accuracy of the prediction and reduce the error to within a reasonable range to maintain the validity of the prediction model, it cannot compensate for the structural defects of employing a linear model to solve nonlinear problems. This is also considered the main shortcoming of MPC-based methods. In future studies, more data-driven methods, such as data-enabled predictive control (DeePC), will be used to solve TSC problems.
The road network structure data used to support the findings of this study are included within the article.
Conflicts of Interest
No potential conflicts of interest were reported by the authors.
This study was supported by the National Natural Science Foundation of China (Nos. U1811463, 51908018, and 51878020).
D. Ma, J. Xiao, X. Song, X. Ma, and S. Jin, “A back-pressure-based model with fixed phase sequences for traffic signal optimization under oversaturated networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 9, pp. 5577–5588, 2021.
B.-L. Ye, W. Wu, L. Li, and W. Mao, “A hierarchical model predictive control approach for signal splits optimization in large-scale urban road networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 8, pp. 2182–2192, 2016.
K.-L. A. Yau, J. Qadir, H. L. Khoo, M. H. Ling, and P. Komisarczuk, “A survey on reinforcement learning models and algorithms for traffic signal control,” ACM Computing Surveys, vol. 50, no. 3, pp. 1–38, 2018.
Y. Ren, Y. Wang, G. Yu, H. Liu, and L. Xiao, “An adaptive signal control scheme to prevent intersection traffic blockage,” IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 6, pp. 1–10, 2016.
H. Wei, G. Zheng, V. Gayah, and Z. Li, “A survey on traffic signal control methods,” 2019, https://arxiv.org/abs/1904.08117.
L. Xu, J. Xu, X. Qu, and S. Jin, “An origin-destination demands-based multipath-band approach to time-varying arterial coordination,” IEEE Transactions on Intelligent Transportation Systems, no. 17, pp. 1–17, 2022.
D. Zhao, Y. Dai, and Z. Zhang, “Computational intelligence in urban traffic signal control: a survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, no. 4, pp. 485–494, 2011.
D. Kim and O. Jeong, “Cooperative traffic signal control with traffic flow prediction in multi-intersection,” Sensors, vol. 20, no. 1, p. 137, 2019.
P. W. Shaikh, M. El-Abd, M. Khanafer, and K. Gao, “A review on swarm intelligence and evolutionary algorithms for solving the traffic signal control problem,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 1, 2020.
A. Tizghadam, H. Khazaei, M. H. Y. Moghaddam, and Y. Hassan, “Machine learning in transportation,” Journal of Advanced Transportation, vol. 2019, Article ID 4359785, pp. 1–3, 2019.
B. Abdulhai, R. Pringle, and G. J. Karakoulas, “Reinforcement learning for true adaptive traffic signal control,” Journal of Transportation Engineering, vol. 129, no. 3, pp. 278–285, 2003.
D. Li, J. Wu, M. Xu, Z. Wang, and K. Hu, “Adaptive traffic signal control model on intersections based on deep reinforcement learning,” Journal of Advanced Transportation, vol. 2020, Article ID 6505893, pp. 1–14, 2020.
J. Zeng, J. Hu, and Y. Zhang, “Adaptive traffic signal control with deep recurrent Q-learning,” in Proceedings of the 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 1215–1220, Changshu, China, 26-30 June 2018.
J. Yoon, K. Ahn, J. Park, and H. Yeo, “Transferable traffic signal control: reinforcement learning with graph centric state representation,” Transportation Research Part C: Emerging Technologies, vol. 130, Article ID 103321, 2021.
D. Ma, B. Zhou, X. Song, and H. Dai, “A deep reinforcement learning approach to traffic signal control with temporal traffic pattern mining,” IEEE Transactions on Intelligent Transportation Systems, vol. 12, pp. 1–12, 2021.
H. Ge and D. L. Y. C. Y. G. Gao, “Multi-agent transfer reinforcement learning with multi-view encoder for adaptive traffic signal control,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–16, 2021.
S. Yang, B. Yang, H.-S. Wong, and Z. Kang, “Cooperative traffic signal control using multi-step return and off-policy asynchronous advantage actor-critic graph algorithm,” Knowledge-Based Systems, vol. 183, Article ID 104855, 2019.
Z. Li, H. Yu, G. Zhang, S. Dong, and C.-Z. Xu, “Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning,” Transportation Research Part C: Emerging Technologies, vol. 125, Article ID 103059, 2021.
T. Chu, J. Wang, L. Codeca, and Z. Li, “Multi-agent deep reinforcement learning for large-scale traffic signal control,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 3, pp. 1086–1095, 2020.
F. Rasheed, K.-L. A. Yau, and Y.-C. Low, “Deep reinforcement learning for traffic signal control under disturbances: a case study on Sunway city, Malaysia,” Future Generation Computer Systems, vol. 109, pp. 431–445, 2020.
J. Liu, H. Zhang, Z. Fu, and Y. Wang, “Learning scalable multi-agent coordination by spatial differentiation for traffic signal control,” Engineering Applications of Artificial Intelligence, vol. 100, Article ID 104165, 2021.
Y. Wang, X. Yang, H. Liang, and Y. Liu, “A review of the self-adaptive traffic signal control system based on future traffic environment,” Journal of Advanced Transportation, vol. 2018, Article ID 1096123, pp. 1–12, 2018.
C. B. Rafter, B. Anvari, S. Box, and T. Cherrett, “Augmenting traffic signal control systems for urban road networks with connected vehicles,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 4, Article ID 2971540, pp. 1728–1740, 2020.
W. Li and X. Ban, “Connected vehicles based traffic signal timing optimization,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 12, pp. 4354–4366, 2019.
Q. Guo, L. Li, and X. Ban, “Urban traffic signal control with connected and automated vehicles: a survey,” Transportation Research Part C: Emerging Technologies, vol. 101, pp. 313–334, 2019.
Z. Yao, L. Shen, R. Liu, Y. Jiang, and X. Yang, “A dynamic predictive traffic signal control framework in a cross-sectional vehicle infrastructure integration environment,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 4, pp. 1455–1466, 2020.
Y. Xu, D. Li, and Y. Xi, “A game-based adaptive traffic signal control policy using the vehicle to infrastructure (V2I),” IEEE Transactions on Vehicular Technology, vol. 68, no. 10, pp. 9425–9437, 2019.
C. Ma, W. Hao, A. Wang, and H. Zhao, “Developing a coordinated signal control system for urban ring road under the vehicle-infrastructure connected environment,” IEEE Access, vol. 6, pp. 52471–52478, 2018.
E. F. Camacho and C. B. Alba, Model Predictive Control, Springer Science & Business Media, Berlin, 2013.
Z. Hao, R. Boel, and Z. Li, “Model based urban traffic control, part I: local model and local model predictive controllers,” Transportation Research Part C: Emerging Technologies, vol. 97, pp. 61–81, 2018.
A. Hegyi, B. De Schutter, and H. Hellendoorn, “Model predictive control for optimal coordination of ramp metering and variable speed limits,” Transportation Research Part C: Emerging Technologies, vol. 13, no. 3, pp. 185–209, 2005.
D. Ma, J. Xiao, and X. Ma, “A decentralized model predictive traffic signal control method with fixed phase sequence for urban networks,” Journal of Intelligent Transportation Systems, vol. 25, no. 5, pp. 455–468, 2021.
Z. Zhou, S. Lin, Y. Xi, D. Li, and J. Zhang, “A hierarchical urban network control with integration of demand balance and traffic signal coordination,” IFAC-PapersOnLine, vol. 49, no. 3, pp. 31–36, 2016.
H. Yu and Z. Hou, “Two-level hierarchical optimal control for urban traffic networks,” Transportmetrica: Transportation Science, vol. 18, no. 1, pp. 144–165, 2020.
B.-L. Ye and W. K. L. T. H. Y. Wu, “A survey of model predictive control methods for traffic signal control,” IEEE/CAA Journal of Automatica Sinica, vol. 6, no. 3, pp. 623–640, 2019.
S. Timotheou, C. G. Panayiotou, and M. M. Polycarpou, “Distributed traffic signal control using the cell transmission model via the alternating direction method of multipliers,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 2, pp. 1–15, 2014.
J. Haddad and B. Mirkin, “Adaptive perimeter traffic control of urban road networks based on MFD model with time delays,” International Journal of Robust and Nonlinear Control, vol. 26, no. 6, pp. 1267–1285, 2016.
K. Lu, P. Du, J. Cao, Q. Zou, T. He, and W. Huang, “A novel traffic signal split approach based on explicit model predictive control,” Mathematics and Computers in Simulation, vol. 155, pp. 105–114, 2019.
M. Cannon and B. Kouvaritakis, “Optimizing prediction dynamics for robust MPC,” IFAC Proceedings Volumes, vol. 38, no. 1, pp. 239–244, 2005.
B. Ding, Y. Xi, M. T. Cychowski, and T. O’Mahony, “Improving off-line approach to robust MPC based-on nominal performance cost,” Automatica, vol. 43, no. 1, pp. 158–163, 2007.
M. Najafi, K. Eshghi, and W. Dullaert, “A multi-objective robust optimization model for logistics planning in the earthquake response phase,” Transportation Research Part E: Logistics and Transportation Review, vol. 49, no. 1, pp. 217–249, 2013.
B.-L. Ye, W. Wu, and W. Mao, “Distributed model predictive control method for optimal coordination of signal splits in urban traffic networks,” Asian Journal of Control, vol. 17, no. 3, pp. 775–790, 2015.
J. G. Wardrop, “Road paper. Some theoretical aspects of road traffic research,” Proceedings - Institution of Civil Engineers, vol. 1, no. 3, pp. 325–362, 1952.
A. Nematzadeh, E. Ferrara, A. Flammini, and Y.-Y. Ahn, “Optimal network modularity for information diffusion,” Physical Review Letters, vol. 113, no. 8, Article ID 088701, 2014.
J. M. Hofman and C. H. Wiggins, “Bayesian approach to network modularity,” Physical Review Letters, vol. 100, no. 25, Article ID 258701, 2008.
K. Aboudolas, M. Papageorgiou, and E. Kosmatopoulos, “Store-and-forward based methods for the signal control problem in large-scale congested urban road networks,” Transportation Research Part C: Emerging Technologies, vol. 17, no. 2, pp. 163–174, 2009.
M. Rostami Shahrbabaki, A. A. Safavi, M. Papageorgiou, and I. Papamichail, “A data fusion approach for real-time traffic state estimation in urban signalized links,” Transportation Research Part C: Emerging Technologies, vol. 92, pp. 525–548, 2018.
P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” Acta Numerica, vol. 4, pp. 1–51, 1995.
Q. Ma, S. Li, H. Zhang, Y. Yuan, and L. Yang, “Robust optimal predictive control for real-time bus regulation strategy with passenger demand uncertainties in urban rapid transit,” Transportation Research Part C: Emerging Technologies, vol. 127, Article ID 103086, 2021.
F. V. Webster, Traffic Signal Settings, London, 1958.
J. Coulson, J. Lygeros, and F. Dörfler, “Data-enabled predictive control: in the shallows of the DeePC,” in Proceedings of the 2019 18th European Control Conference (ECC), pp. 307–312, Naples, Italy, 25-28 June 2019.