Research Article

Resilience Analysis of Urban Road Networks Based on Adaptive Signal Controls: Day-to-Day Traffic Dynamics with Deep Reinforcement Learning

Table 1

The training of the DQN.

Step 1. Initialize the parameters θ of the agent evaluation network (EN); assign θ to the parameters θ′ of the target network; set the number of training epochs and the batch size; set epoch = 0

Step 2. Set epoch = epoch + 1. Assign the free-flow costs of all routes as drivers' initial perceived route costs; initialize the state s₁ of the DTD dynamic model, and set the day index t = 1

Step 3. While the DTD model has not reached convergence, repeat Steps 4–10:

Step 4. Based on equations (12) and (13), generate the action aₜ (signal-control settings) for day t

Step 5. According to equations (7) and (9), obtain the actual route costs on day t from the action aₜ and the link flows. Using the route-perception updating process (1), update the perceived route travel costs on day t + 1 from the perceived and actual route costs on day t. Then determine the route flows on day t + 1 from the route-choice probability formula (2) and the route-assignment equation (3). According to the network loading model (9) and equation (4), obtain the link flows on day t + 1 (the new state sₜ₊₁) and the actual route costs on day t + 1. Since the DTD model has not yet converged, assign the corresponding non-equilibrium reward rₜ

Step 6. Store the experience (sₜ, aₜ, rₜ, sₜ₊₁) in the experience database D

Step 7. Randomly select a batch of experiences from D

Step 8. Based on equation (14), calculate the target return for each sampled experience

Step 9. According to equation (15), update the parameters θ of the evaluation network

Step 10. Update the state, actual route costs, perceived route costs, and route flows to their day-(t + 1) values, and set t = t + 1

Step 11. When equilibrium is reached, the reward takes the value 10000 − RAI, where the RAI is derived from equation (23); update the target-network parameters θ′ ← θ. One epoch ends

Step 12. If epoch has not reached the set number of training epochs, return to Step 2; otherwise, store the parameters θ, and the training of the DQN ends
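The loop in Table 1 can be sketched in code as follows. This is a minimal illustrative sketch only: the linear Q-function standing in for the deep evaluation network, the `dtd_step` placeholder for the day-to-day traffic dynamics, the RAI value used at equilibrium, and all dimensions and hyperparameters (`STATE_DIM`, `N_ACTIONS`, `GAMMA`, etc.) are assumptions for demonstration, not the paper's actual model.

```python
import random
from collections import deque

import numpy as np

# Illustrative sizes and hyperparameters (assumptions, not from the paper)
STATE_DIM, N_ACTIONS = 4, 3
GAMMA, LR, EPSILON = 0.95, 1e-3, 0.1
BATCH, EPOCHS, MAX_DAYS = 16, 5, 30

rng = np.random.default_rng(0)

def init_params():
    # Linear Q-network Q(s) = W s + b, a stand-in for the deep EN
    return {"W": rng.normal(0, 0.1, (N_ACTIONS, STATE_DIM)),
            "b": np.zeros(N_ACTIONS)}

def q_values(p, s):
    return p["W"] @ s + p["b"]

def dtd_step(state, action):
    """Placeholder day-to-day dynamics: returns the next link-flow state
    and whether the flows have (approximately) converged."""
    next_state = 0.9 * state + 0.1 * rng.normal(size=STATE_DIM) + 0.01 * action
    converged = bool(np.linalg.norm(next_state - state) < 0.05)
    return next_state, converged

replay = deque(maxlen=1000)          # experience database D
params = init_params()                # Step 1: evaluation network
target = {k: v.copy() for k, v in params.items()}  # target network

for epoch in range(EPOCHS):                       # Steps 2 and 12
    state = rng.normal(size=STATE_DIM)            # free-flow initialization
    for day in range(MAX_DAYS):                   # Step 3
        # Step 4: epsilon-greedy signal-control action
        if rng.random() < EPSILON:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(q_values(params, state)))
        # Step 5: advance the DTD model by one day
        next_state, converged = dtd_step(state, action)
        # Step 11 reward: 10000 - RAI at equilibrium; RAI = 5.0 is a placeholder
        reward = (10000.0 - 5.0) if converged else -1.0
        # Step 6: store the experience (s_t, a_t, r_t, s_{t+1})
        replay.append((state, action, reward, next_state, converged))
        # Steps 7-9: sample a batch and take one gradient step on the EN
        if len(replay) >= BATCH:
            for s, a, r, s2, done in random.sample(list(replay), BATCH):
                y = r if done else r + GAMMA * np.max(q_values(target, s2))
                td = q_values(params, s)[a] - y   # TD error
                params["W"][a] -= LR * td * s     # grad of 0.5 * td^2
                params["b"][a] -= LR * td
        state = next_state                        # Step 10
        if converged:
            break
    # Step 11: sync the target network at the end of the epoch
    target = {k: v.copy() for k, v in params.items()}

print("trained W shape:", params["W"].shape)
```

The two-network structure (evaluation network updated every batch, target network copied once per epoch) mirrors Steps 9 and 11 of the table; a real implementation would replace the linear Q-function with the deep network and `dtd_step` with the full DTD route-choice and network-loading models.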