Resilience Analysis of Urban Road Networks Based on Adaptive Signal Controls: Day-to-Day Traffic Dynamics with Deep Reinforcement Learning
Table 1
The training of the DQN.
Step 1
Initialize the parameters θ of the agent's evaluation network (EN); assign them to the parameters θ′ of the target network (θ′ ← θ); set the maximum number of training epochs and the batch size; set epoch = 0
Step 2
Set epoch = epoch + 1. Assign the free-flow costs of all routes as the initial perceived route costs of the drivers; initialize the state s₀ of the DTD dynamic model, and set the day counter n = 0
Step 3
While (the DTD model has not reached convergence)
Step 4
Based on equations (12) and (13), generate the action aₙ (the signal-control adjustment)
Step 5
According to equations (7) and (9), obtain the actual route costs cₙ on day n based on the route flows fₙ and the link flows xₙ; according to the route-perception updating process (1), update the perceived route travel cost ĉₙ₊₁ on day n + 1 based on the perceived route cost ĉₙ and the actual route costs cₙ on day n; following this, determine the route flows fₙ₊₁ on day n + 1 based on the route-choice probability formula (2) and the route-assignment equation (3); according to the network loading model (9) and equation (4), obtain the link flows xₙ₊₁ on day n + 1 (the new state sₙ₊₁) and the actual route costs cₙ₊₁ on day n + 1; since the DTD model has not yet converged, the reward rₙ takes its non-equilibrium value
Update the state, actual route cost, perceived route cost, and route flow: sₙ ← sₙ₊₁, cₙ ← cₙ₊₁, ĉₙ ← ĉₙ₊₁, and fₙ ← fₙ₊₁; set n = n + 1
Step 11
When the equilibrium is reached, the reward takes 10000 − RAI, where the RAI is derived from equation (23); update the parameters of the target network, θ′ ← θ. One epoch ends
Step 12
If epoch has not reached the set number of training times, return to Step 2; otherwise, store the parameters θ, and the training of the DQN ends
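The training loop in Table 1 can be sketched in code. The sketch below is illustrative only: the paper's DTD dynamics (equations (1)-(9)), action-generation rule (equations (12) and (13)), and resilience assessment index RAI (equation (23)) are replaced by hypothetical placeholder functions (`dtd_step`, `epsilon_greedy`, `rai`), and a simple linear Q-function stands in for the evaluation and target networks. Only the control flow — epoch loop, within-epoch DTD convergence loop, terminal reward 10000 − RAI, and target-network synchronization — follows the table.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATE, N_ACTION = 4, 3                              # link-flow state size, candidate signal plans (assumed)
theta = rng.normal(size=(N_STATE, N_ACTION)) * 0.1    # evaluation network (EN) parameters (Step 1)
theta_target = theta.copy()                           # target network parameters θ′ ← θ (Step 1)

def q_values(params, state):
    """Linear Q-function stand-in for the evaluation/target network."""
    return state @ params

def epsilon_greedy(state, eps=0.1):
    """Placeholder action generation (cf. equations (12) and (13))."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTION))
    return int(np.argmax(q_values(theta, state)))

def dtd_step(state, action):
    """Placeholder for one day of the DTD dynamics (equations (1)-(9)):
    perceived-cost update, route choice, and network loading."""
    next_state = 0.9 * state + 0.01 * rng.normal(size=N_STATE)
    converged = np.linalg.norm(next_state - state) < 0.05
    return next_state, converged

def rai(state):
    """Placeholder for the resilience assessment index of equation (23)."""
    return float(np.sum(np.abs(state)))

EPOCHS, GAMMA, LR = 5, 0.9, 0.01
for epoch in range(EPOCHS):                           # Steps 2 and 12
    state = rng.normal(size=N_STATE)                  # Step 2: initialize the DTD state
    converged = False
    while not converged:                              # Step 3
        action = epsilon_greedy(state)                # Step 4
        next_state, converged = dtd_step(state, action)   # Step 5
        # Zero intermediate reward is an assumption; the terminal reward follows Step 11.
        reward = (10000.0 - rai(next_state)) if converged else 0.0
        # One-step Q-learning update on the evaluation network.
        target = reward + (0.0 if converged else
                           GAMMA * np.max(q_values(theta_target, next_state)))
        td_error = target - q_values(theta, state)[action]
        theta[:, action] += LR * td_error * state
        state = next_state                            # state/cost/flow roll-over
    theta_target = theta.copy()                       # Step 11: θ′ ← θ, one epoch ends

print("trained parameter shape:", theta.shape)        # Step 12: store θ
```

In the paper's setting the placeholder `dtd_step` would be replaced by the full day-to-day assignment (perception update, logit route choice, network loading), and the linear Q-function by the trained deep network; the loop structure itself is unchanged.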