#### Abstract

Predicting the individual mobility of subway passengers in large crowding events is crucial for subway safety management and crowd control. However, most previous models focused on individual mobility prediction under ordinary conditions. Here, we develop a passenger mobility prediction model that is also applicable to large crowding events. The developed model consists of a trip-making prediction part and a trip attribute prediction part. For trip-making prediction, we develop a regularized logistic regression model that employs the proposed individual and cumulative mobility features: the number of potential trips and the trip generation index. For trip attribute prediction, we develop an *n*-gram model incorporating a new feature, the trip attraction index, for each cluster of subway passengers. The incorporation of the three new features and the clustering of passengers considerably improve the accuracy of passenger mobility prediction, especially in large crowding events.

#### 1. Introduction

Large crowding events, which may be triggered by religious, commercial, or recreational activities [1–3], are observed around the world. In large crowding events, crowd density often exceeds its safety threshold [1], which puts tremendous pressure on crowd safety. If crowd density cannot be effectively controlled, crowd disasters may occur. Over the last decade, crowd disasters have caused more than 2,000 fatalities around the world [4]. On April 30, 2021 alone, 45 people were killed and about 150 people were injured in Israel while celebrating the Jewish holiday of Lag BaOmer. Thus, an in-depth understanding of mobility patterns in large crowding events is crucial for mobility anomaly detection and public safety management.

Human mobility patterns in large crowding events are different from those under ordinary conditions [5]. Remarkable efforts have been devoted to investigating collective human mobility behaviors [6, 7], including those in large crowding events [8]. Fan et al. [9] predicted short-term citizen movement using multiple random Markov chains. Using mobile phone GPS log data, Fan et al. predicted the number of people who went to the New Year’s Eve activities and the Comiket 80 (C80) show. Dong et al. [10] collected five months of call detail records (CDR) data of mobile phone users in Ivory Coast. The authors employed the cylindrical cluster detection algorithm to compare the travel patterns of residents in unusual activities and daily routines. Zhou et al. [11] introduced the gradient boosting decision tree (GBDT) algorithm to assess the risk of crowding events. After studying the crowding events that occurred in Beijing, Shanghai, and Shenzhen, Zhou et al. found that early warnings of potential crowding events could be released 1–3 hours in advance by using the Baidu Map query data. Lam et al. [12] used bike sharing data collected in Washington DC and Philadelphia to identify unusual events. The authors successfully identified unusual activities near Drexel University through spectral clustering and threshold-based detection. Taken together, collective human mobility in large crowding events has been comprehensively studied in recent years. However, individual human mobility in large crowding events is still not sufficiently explored.

With increasingly abundant big data, individual human mobility research has developed rapidly over the last decade. Mobile phone data [13, 14], social media data [15], taxi GPS data [16], automatic fare collection (AFC) data [17], and online-map location data [18] have been used to study and discover the laws of human mobility. In the area of individual mobility prediction, Gambs et al. [19] explored the next location prediction problem using the Markov model as early as 2012. In the recent decade, three further research topics have received more attention. In the first topic, various methods are employed to model individual mobility behaviors and predict an individual’s next location, such as the hidden Markov model [20], suffix trees [21], association rule mining [22], logistic regression [23], and recurrent neural networks [24]. Because abundant travel records are essential for modeling individual mobility patterns, researchers explored hybrid models that integrate individual historical mobility data and the collective mobility data of a group of individuals, which form the second research topic. For example, Asahara et al. [25] divided individuals into different groups and proposed the mixed Markov model (MMM) to predict the mobility patterns of individuals in each group; Alhasoun et al. [26] identified the “similar strangers” of each individual based on their historical mobility patterns and predicted an individual’s mobility by integrating his/her historical mobility information and the mobility patterns of his/her similar strangers. Noticing the importance of the time information of a trip, in the third research topic, researchers predict not only the location of the later trip but also the time when the later trip will take place. For example, Hsieh et al. [27] extracted a location time distribution (LTD) for each location and a transition time distribution (TTD) for each location pair, and made individual mobility predictions based on how well the location sequence matches the TTD and LTD; using smart card data, Zhao et al. [28] predicted three trip attributes for each trip, that is, the departure time, the origin, and the destination of the trip. Trip attributes of the first trip in a day were predicted based on the date factors, while trip attributes of the other trips were predicted based on the attributes of the previous trip. However, only a few works have investigated human mobility patterns under nonordinary conditions [29, 30]. Individual human mobility in large crowding events has not been well understood or modeled [31].

In summary, previous works focused on collective mobility patterns in crowding events [9–12] or individual mobility patterns under ordinary conditions [19–28]. Individual mobility patterns in crowding events are not sufficiently understood or modeled. The anomalous passenger flow during crowding events cannot be well reproduced by an individual mobility model established for ordinary conditions. To fill this research gap, we employ the subway smart card data of 2.5 million passengers of Shenzhen Metro to study passenger mobility patterns in a large crowding event and develop a hierarchical passenger mobility prediction model. We discover two primary changes in individual mobility patterns in crowding events: subway passengers make more trips, and only the passenger flow directed to the crowding station increases prominently. Therefore, in the present study, the trip generation index is proposed to capture the increased trips in crowding events, and the trip attraction index is introduced to capture subway passengers’ selection of trip destination. With these two indices, the proposed model can well predict the individual mobility patterns of passengers in crowding events and reproduce the anomalous passenger flow at the crowding station.

The main contributions of the present study can be summarized as follows:

(1) We introduce two new features, the number of potential trips and the trip generation index, in trip-making prediction, which improves the accuracy of trip-making prediction (whether a passenger will make a trip).

(2) We develop a hierarchical *n*-gram model by incorporating a new feature, the trip attraction index, for each group of clustered passengers, which improves the accuracy of trip attribute prediction, especially in large crowding events.

The remainder of this study is organized as follows: Section 2 introduces the data used in this study. Section 3 presents the model for predicting individual mobility of subway passengers. In Section 4, the proposed model is validated by comparing with several benchmark models. Section 5 concludes the findings of this study. Limitations of the research and future research directions are also discussed.

#### 2. Data

The data used in this study were provided by the Shenzhen Transportation Authority. The geographic information systems (GIS) data of the Shenzhen subway network were collected in 2014, when there were 5 lines and 118 stations (Figure 1). The smart card data were collected from October 13 to October 31, 2014. When a subway passenger entered or exited a subway station, he/she was required to tap the card, and the smart card ID, the station ID, and the time were recorded. The ID of each smart card holder was anonymized in the smart card data used. In addition, only statistical results are analyzed and presented in the present study for privacy protection. During the observation period, 2,520,455 passengers generated a total of 30,669,097 smart card records. Shenzhen Metro uses an off-board ticketing system with proof-of-payment, where fare evasion is almost absent [32]. Using the smart card data, we estimated the average number of trips made by smart card users per day, which is 1.61 million. The total number of subway trips per day in Shenzhen was approximately 2.84 million in 2014 [33]. That is, the smart card data recorded roughly 57% of the subway trips, and the other 43% of trips were made with one-way tickets. Therefore, the passenger demand was estimated based on an up-scaling factor $\alpha$:

$$\hat{D}(o, d, t) = \alpha \, D(o, d, t), \qquad (1)$$

where $o$ is the trip origin, $d$ is the trip destination, $t$ is the 1-hour time slot, $D(o, d, t)$ is the demand of smart card users, $\hat{D}(o, d, t)$ is the estimated passenger demand, and the up-scaling factor is estimated to be $\alpha = 2.84/1.61 \approx 1.76$. A crowding event caused by Halloween recreational activity took place near the Window of World station on October 31, 2014. This offers us the data for studying passenger mobility patterns during large crowding events.
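As a rough illustration of the up-scaling step, the sketch below scales smart card origin-destination counts by the ratio of total daily trips to smart card daily trips reported above. The function names and the toy counts are assumptions made for illustration; they are not part of the original study.

```python
# Daily totals quoted in the text (millions of trips per day).
TOTAL_TRIPS_PER_DAY = 2.84       # all subway trips
SMART_CARD_TRIPS_PER_DAY = 1.61  # trips made with smart cards


def upscaling_factor(total, observed):
    """Ratio used to scale smart card demand up to total demand."""
    return total / observed


def upscale_demand(card_demand, factor):
    """Scale each (origin, destination, hour) smart card count by `factor`."""
    return {key: count * factor for key, count in card_demand.items()}


alpha = upscaling_factor(TOTAL_TRIPS_PER_DAY, SMART_CARD_TRIPS_PER_DAY)

# Hypothetical smart card counts keyed by (origin, destination, hour slot).
card_demand = {("A", "B", 8): 100, ("A", "C", 18): 40}
estimated = upscale_demand(card_demand, alpha)
print(round(alpha, 2))
```

A single global factor assumes that one-way ticket users follow the same origin-destination and time-of-day distribution as smart card users.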

In this study, the smart card records collected from October 13 to October 26, 2014, are used as the training data, and the smart card records collected from October 27 to October 31, 2014, are used as the test data. The Window of World station is hereafter referred to as the crowding station. A large number of passengers took the subway to the Window of World station during the crowding event, which lasted from 4:00 p.m. to 11:30 p.m. (Figure 2(a)). We find that only the passenger flow directed to the crowding station increased prominently during the event period (Figure 2(b)). The passengers who exited the crowding station during the event period are defined as the event participants. In this study, only the passengers with more than 14 trips during the observation period (571,171 passengers in total) are used, to guarantee sufficient historical individual mobility data.

**Figure 2.** Panels (a) and (b).

#### 3. Model

##### 3.1. Modeling Framework

There are three modules in our proposed framework, that is, feature extraction, individual mobility prediction, and passenger clustering (Figure 3).

In the feature extraction module, multiple mobility features are extracted from smart card data to establish the individual mobility prediction model, including the number of potential trips, the trip generation index, and the trip attraction index.

The individual mobility prediction module is used to predict the individual mobility patterns of subway passengers. There are two ways to predict the location of an individual. One way is to predict the individual’s location for each time slot. The other way is to predict the departure time, the origin, and the destination of the next trip. In this study, the second way is employed, given that the mobility information provided by the smart card data is sparsely distributed in time. Inspired by Zhao et al. [28], the individual mobility prediction problem is split into two parts: trip-making prediction (TM) and trip attribute prediction (TA) (Figure 3). Trip-making prediction determines whether a passenger will make a trip. The number of potential trips and the trip generation index are used in trip-making prediction to quantify the traveling willingness of subway passengers. Trip attribute prediction determines the departure time, the origin, and the destination of the trip. The trip attraction index is used in trip attribute prediction to capture the attraction of subway stations to the passengers. The training and prediction procedures of the proposed individual mobility model are as follows. At the start of a day, trip-making prediction is first conducted to identify whether the passenger will make the first trip of the day (TMF). If so, the trip attributes of the first trip are predicted (TAF); otherwise, prediction is over for the day. After that, whenever a trip record is observed, trip-making prediction determines whether a later trip will occur (TML). If the later trip will not occur, prediction is over for the day; otherwise, the trip attributes of the later trip are predicted (TAL). TML and TAL are conducted repeatedly until no new trip record is observed within the day.
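The daily prediction procedure can be sketched as a small control loop. The four predictors below are hypothetical callables standing in for the TMF, TAF, TML, and TAL models; their names and signatures are assumptions for illustration, and the sketch stops for the day as soon as TML predicts that no later trip will occur.

```python
def predict_day(make_first, first_attrs, make_later, later_attrs, observed_trips):
    """Walk through one day following the TMF -> TAF -> (TML -> TAL)* procedure.

    All four predictors are hypothetical callables supplied by the caller:
      make_first()           -> bool                          (TMF)
      first_attrs()          -> (time, origin, destination)   (TAF)
      make_later(last_trip)  -> bool                          (TML)
      later_attrs(last_trip) -> (time, origin, destination)   (TAL)
    `observed_trips` is the list of trips actually recorded during the day.
    """
    predictions = []
    if not make_first():                       # TMF: no first trip expected
        return predictions
    predictions.append(first_attrs())          # TAF
    for trip in observed_trips:                # each new record triggers TML
        if not make_later(trip):               # TML: no later trip expected
            break
        predictions.append(later_attrs(trip))  # TAL
    return predictions


# Toy commuter: goes to work in the morning and returns home in the evening.
toy = predict_day(
    make_first=lambda: True,
    first_attrs=lambda: (8, "Home", "Work"),
    make_later=lambda trip: trip[2] != "Home",
    later_attrs=lambda trip: (18, trip[2], "Home"),
    observed_trips=[(8, "Home", "Work"), (18, "Work", "Home")],
)
print(toy)
```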

The passenger clustering module is introduced to cluster the passengers with similar temporal mobility patterns. Note that individuals tend to travel like their “similar strangers” [26]. By generating the time series, generating the overlap slots, and extracting the temporal mobility patterns, the studied passengers are clustered into groups based on their temporal mobility patterns. The silhouette coefficient is then employed to identify the optimal number of passenger clusters. When training the individual mobility model (TAF and TAL) for a cluster of passengers, only the mobility data of passengers in that cluster are used, while the mobility data of passengers in other clusters are not considered. The parameter and variable notations are introduced in Table 1.

##### 3.2. Trip-Making Prediction

At the start of a day, the probability that a passenger makes the first trip (TMF) is estimated using the conditional probability , where is a binary variable indicating if the passenger will make the first trip in the day; is the feature set, which includes the date information and the passenger’s historical mobility information; is the trip generation index proposed in this study. Trip generation index quantifies passengers’ overall traveling willingness in the day, which is a cumulative mobility feature of all passengers. When the ()th trip of the passenger is observed, the probability that a passenger makes the later trip (TML) is estimated using the conditional probability , where represents the ()th trip in the day, determines if the ()th trip will occur in the day, and represent the departure time, the origin, and the destination of the observed ()th trip in the day, respectively.

The trip generation index is calculated as the total number of trips on the event day divided by the average number of trips in ordinary days. It is set to 1 for ordinary days. On the event day, the value of the index is larger than 1, indicating that passengers are more willing to travel on that day. Given that passengers may go to the crowding location in their first trip or in later trips, the trip generation index is considered in both TMF and TML.
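A minimal sketch of this ratio is given below, using the daily totals reported in Section 4.1 (1,601,737 trips on an average ordinary day and 1,837,826 trips on the event day). The function name is an assumption for illustration.

```python
def trip_generation_index(trips_on_day, ordinary_daily_trips):
    """Total trips on the studied day divided by the ordinary-day average."""
    average = sum(ordinary_daily_trips) / len(ordinary_daily_trips)
    return trips_on_day / average


# The ordinary-day average is passed as a single value because only the
# average, not the per-day totals, is reported in the text.
index = trip_generation_index(1_837_826, [1_601_737])
print(round(index, 2))
```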

##### 3.3. Trip Attribute Prediction

For the first trip of a day, its attributes (TAF) are predicted using the conditional probability , where is the th cluster, represent the departure time, the origin, and the destination of the first trip, respectively, represents the date features, and is the trip attraction index. Similar to TML, the attributes of the later trip (TAL) are predicted using the conditional probability , where is the th cluster, represents the ()th trip in the day, and represent the departure time, the origin, and the destination of the ()th trip, respectively. The combination of () achieving the largest conditional probability is regarded as the predicted trip attributes. In this study, the resolution of the time is 1 hour.

The trip attraction index is determined as follows:

(1) Calculate the attraction coefficient of the crowding station for each passenger:

$$\lambda = \frac{P(d = s_c \mid x)}{P(d = s^{*} \mid x)}, \qquad (2)$$

where $s_c$ is the crowding station, $P(d = s_c \mid x)$ is the conditional probability that the destination of the trip is $s_c$ under condition $x$, $x$ includes the date features for TAF and the attributes of the last trip for TAL, $s^{*}$ is the subway station that the passenger visited most under condition $x$ in ordinary days, and $P(d = s^{*} \mid x)$ is the conditional probability that the destination of the passenger is $s^{*}$ under condition $x$. For passengers whose most visited station is the crowding station, the attraction coefficient is set to $\lambda = 1$, while for other passengers $\lambda < 1$. The smaller the attraction coefficient, the lower the probability that a passenger visits the crowding station. The attraction coefficient reflects the relative attraction of the crowding station compared with other stations.

(2) Arrange the attraction coefficients of all studied passengers in descending order ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_M$, where $M$ is the total number of passengers).

(3) Calculate the trip attraction index $A$ of the crowding event:

$$A = \frac{1}{\lambda_m}, \qquad (3)$$

where $m$ is the number of event participants. For a passenger $i$, if $\lambda_i A \ge 1$ (i.e., $i \le m$), he/she is predicted to go to the crowding station (event). Such passengers are identified as the predicted participants. For a passenger $i$ with $\lambda_i A < 1$ (i.e., $i > m$), he/she is predicted to go to the most visited station $s^{*}$. The number of predicted participants is therefore equal to the number of participants $m$. The trip attraction index takes the value $A$ only when the destination of the trip is the crowding station and the departure time of the trip is within the time period of the crowding event; in other cases, it is set to 1.
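The three steps above can be sketched in a few lines. The sketch assumes that the index is the reciprocal of the m-th largest attraction coefficient, so that multiplying the crowding-station probability by the index lifts exactly the top m passengers to or above the probability of their most visited station. This form is an assumption consistent with the description, not a confirmed transcription of the original equations, and all function names are hypothetical.

```python
def attraction_coefficient(p_crowding, p_most_visited):
    """Relative attraction of the crowding station for one passenger."""
    return p_crowding / p_most_visited


def trip_attraction_index(coefficients, n_participants):
    """Assumed form: reciprocal of the m-th largest attraction coefficient."""
    ranked = sorted(coefficients, reverse=True)
    return 1.0 / ranked[n_participants - 1]


def predicted_participants(coefficients, index):
    """Indices of passengers whose boosted coefficient reaches 1."""
    # A small tolerance guards against floating-point rounding at the threshold.
    return [i for i, c in enumerate(coefficients) if c * index >= 1.0 - 1e-12]


# Toy attraction coefficients for five passengers and m = 3 participants.
coefficients = [1.0, 0.5, 0.25, 0.125, 0.0625]
index = trip_attraction_index(coefficients, n_participants=3)
participants = predicted_participants(coefficients, index)
print(index, participants)
```

If several passengers tie at the threshold coefficient, this sketch predicts all of them as participants, so the predicted count can exceed m.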

##### 3.4. Clustering Subway Passengers into Groups

In trip attribute prediction, passengers with similar historical mobility patterns are clustered into the same group. The trip attributes of passengers in each group are predicted using the models specifically constructed for the group.

We cluster subway passengers based on their temporal mobility patterns [34]. First, a day is divided into 18 one-hour time slots since the operation hours of the Shenzhen Metro were from 6:00 a.m. to 12:00 a.m. For each passenger, a time series is generated to extract his/her temporal mobility pattern (Figure 4), where when the passenger swiped the card in the time slot of day ; otherwise, ; is the number of days in the training data.

We generate the overlap slots , each of which lasts for 3 hours. The attributes of each overlap slot are the average number of trips and the proportion of active days , both of which are calculated according to the time series :where the operator returns the maximum value. The overlap slots are generated to identify the typical temporal mobility patterns of subway passengers. There are two ways to split the time slots of a day. One way is to split a day into fixed intervals as tagged time slots, such as 0:00–2:59, 3:00–5:59, 6:00–8:59…. The other way is to split a day into overlap slots: 0:00–2:59, 1:00–3:59, 2:00–4:59…. If we use the first way to split time slots, passengers with similar temporal mobility patterns may be grouped into different clusters. For example, if passenger *A* takes subway during the period from 8:50 to 8:59 and passenger *B* takes subway during the period from 9:01 to 9:10, the two passengers will be regarded as the data samples of different tagged time slots (i.e., 6:00–8:59 for passenger *A* and 9:00–11:59 for passenger *B*), albeit that their temporal mobility patterns are rather similar. However, if we use the overlap slots, passenger *A* and passenger *B*, which have pretty similar temporal mobility patterns, will be regarded as the data samples of the same overlap slot (i.e., 7:00–9:59 or 8:00–10:59). Therefore, the overlap slots can better capture the temporal mobility patterns of subway passengers.
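The overlap-slot attributes can be illustrated with a short sketch. It assumes a binary time series per passenger (one row per day, one entry per hourly slot), takes the average number of trips as the mean windowed sum, and takes the proportion of active days as the mean windowed maximum. These forms follow the verbal description and are assumptions; the function name is hypothetical.

```python
def overlap_slot_features(series, width=3):
    """Compute (avg_trips, active_share) for every overlap slot.

    series[j][h] is 1 if the passenger tapped a card in hourly slot h of
    day j, and 0 otherwise. Slot `start` covers hours start..start+width-1.
    """
    n_days = len(series)
    n_slots = len(series[0])
    features = []
    for start in range(n_slots - width + 1):
        windows = [day[start:start + width] for day in series]
        avg_trips = sum(sum(w) for w in windows) / n_days
        active_share = sum(max(w) for w in windows) / n_days
        features.append((avg_trips, active_share))
    return features


# Toy example: two days, five hourly slots, hence three overlap slots.
series = [
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]
features = overlap_slot_features(series)
print(features)
```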

We sort the overlap slots in descending order of and generate the sorted overlap slots . Then, we iteratively select from to to generate the nonoverlap slots , where the elements in have no overlap in time. The first slot in the nonoverlap slots () is the first slot in the sorted overlap slots (, 6:00–8:59). Given that there are overlapped time slots between and (i.e., 7:00–8:59), is not selected to add into . Next, given that there are no overlapped time slots between and , is selected as the second slot in . We iteratively select from to to generate the nonoverlap slots. Finally, , and are selected to generate the nonoverlap slots . The proportions of active days of the top four elements in are used as features for passenger clustering [34].
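The greedy selection of nonoverlap slots can be sketched as follows. The sketch assumes that overlap slots are ranked by their proportion of active days and that each slot is identified by its starting hour; both the ranking key and the function name are assumptions for illustration.

```python
def select_nonoverlap_slots(slots, width=3):
    """Greedily keep the best-ranked slots that do not overlap in time.

    slots: list of (start_hour, active_share) pairs, each slot lasting
    `width` hours. Two slots overlap when their start hours differ by
    less than `width`.
    """
    ranked = sorted(slots, key=lambda s: s[1], reverse=True)
    chosen = []
    for start, share in ranked:
        if all(abs(start - other) >= width for other, _ in chosen):
            chosen.append((start, share))
    return chosen


# Toy passenger with a morning peak and an evening peak.
slots = [(6, 0.9), (7, 0.8), (17, 0.7), (9, 0.3), (16, 0.6)]
chosen = select_nonoverlap_slots(slots)
print(chosen)
```

In this toy case the 7:00 slot is skipped because it overlaps the 6:00 slot, which mirrors the example in the text.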

The *K*-means algorithm is used to cluster passengers into different groups. We use $(p_1, p_2, p_3, p_4)$ to generate the feature space, where $p_i$ is the proportion of active days of the $i$th element of the nonoverlap slots. The silhouette coefficient [35] is employed to determine the optimal number of clusters *K*. The number of passenger groups is tested from 2 to 17 (i.e., *K* = 2, 3, …, 17), and the value of *K* that achieves the highest silhouette coefficient is used.

##### 3.5. Solving the Model

For trip-making prediction, a binary variable determines if the ()th trip will occur in the day. The binary variable in TMF and TML can be solved using the logistic regression algorithm [28]. In order to avoid overfitting and reduce model complexity, L2 regularization [36] is employed in the logistic regression algorithm. Through inputting the feature set and the trip generation index , we train the L2 regularization model for TMF and predict if the passenger will make the first trip in a day. We input the features , and to train the L2 regularization model for TML and predict if the later trip will occur.
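To make the L2-regularized logistic regression concrete, the sketch below fits one by plain gradient descent on toy data. The feature layout (weekday flag, travelled-yesterday flag), the hyperparameters, and the function names are assumptions chosen for illustration; the study's actual feature set and solver are not reproduced here.

```python
import math


def sigmoid(z):
    # Clamp the input to avoid overflow in exp for large |z|.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_l2_logistic(X, y, lam=0.1, lr=0.5, epochs=2000):
    """Fit weights (bias stored last) by gradient descent on the
    mean log-loss plus an L2 penalty on the feature weights."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + w[d]
            err = sigmoid(score) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
            grad[d] += err / n
        for j in range(d):
            w[j] -= lr * (grad[j] + lam * w[j])  # penalty on weights only
        w[d] -= lr * grad[d]                     # bias is not penalized
    return w


def predict_proba(w, x):
    """Probability that the trip is made, given features x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


# Toy data: [is_weekday, travelled_yesterday] -> made a trip (1) or not (0).
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]
w = train_l2_logistic(X, y)
print(predict_proba(w, [1, 1]) > 0.5, predict_proba(w, [0, 0]) > 0.5)
```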

For trip attribute prediction, to reduce the feasible solution space and simplify the solution process of TAF and TAL, the joint probability in TAF and the joint probability in TAL are respectively decomposed into the product of three conditional probabilities based on the chain rule [28]:

In natural language processing, the *n*-gram model is widely used to predict the next word or phrase based on a given context and shows good performance in machine translation, speech recognition, and other applications [37]. We use the *n*-gram model [38] to solve the conditional probabilities obtained by the chain rule decomposition.

In equation (7), the origin of the first trip in a day is the variable to be predicted, where is the date feature, is the departure time of the first trip, and () represents that the origin of the first trip in a day is with given context (). For all possible , we calculate the conditional probability using the given context ():where is the time-smoothed counting function, is a scalar concentration parameter of Dirichlet prior, is the expectation of prior probability , and is the feasible solution set of . The time-smoothed counting function returns the estimated frequency of and is calculated using the following equation:where is the number of times that emerges in the training set, and is the collection of contexts “adjacent” to context (). Context () is adjacent to context () if or .
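The role of the Dirichlet prior can be illustrated with the standard posterior-mean estimate, in which observed counts are blended with a prior distribution weighted by the concentration parameter. The sketch uses plain counts and omits the time smoothing over adjacent contexts; the function name, the toy stations, and the exact functional form are assumptions rather than a transcription of the original equation.

```python
from collections import Counter


def dirichlet_smoothed_probs(counts, prior, alpha):
    """Posterior-mean estimate under a Dirichlet prior.

    counts: Counter of outcomes observed in the given context.
    prior:  dict mapping every feasible outcome to its prior probability
            (values are expected to sum to 1).
    alpha:  concentration parameter; larger values pull the estimate
            toward the prior.
    """
    total = sum(counts.values())
    return {
        outcome: (counts.get(outcome, 0) + alpha * p) / (total + alpha)
        for outcome, p in prior.items()
    }


# Toy context: the passenger entered at station A three times and B once.
counts = Counter({"A": 3, "B": 1})
prior = {"A": 0.5, "B": 0.25, "C": 0.25}
probs = dirichlet_smoothed_probs(counts, prior, alpha=4)
print(probs)
```

Station C was never observed in this context, yet it keeps a nonzero probability through the prior, which is the practical benefit of the smoothing.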

The expected prior probability is estimated usingthe following equation:where is the back-off prior of [39], is a collective prior of , -gram model and collective -gram model are balanced using the parameter . The collective probability is calculated as follows:where is a counting function representing the travel behaviors of all passengers in the ‐th cluster, is a collective prior of , and is a scalar concentration parameter of Dirichlet prior and is used to represent the effect of collective behaviors on individual mobility prediction. The counting function is estimated using the following equation:where represents the frequency of in the training data and is the collection of contexts “adjacent” to context () for all passengers in the ‐th cluster.

Note that the conditional probabilities and are calculated recursively. When the number of context variables () is zero, the individual probability that the first trip starts from and () and the collective probability that the first trip starts from () are calculated using the following equations:

In order to improve the model performance, parameters and are determined for each individual [28]. The training data are further divided into training part (from October 13 to October 19) and validation part (from October 20 to October 26). The value of parameters and are tested from 0 to 1 with a tolerance of 0.1. The values of and that achieve the highest average prediction accuracy rates are selected.

Next, the conditional probabilities , , , , and can be calculated, which are further used to estimate the joint probability in TAF equation (5) and the joint probability in TAL (equation 6).

##### 3.6. Prediction Performance Indices

Three indices are used to measure the performance of trip-making prediction: prediction accuracy rate, recall, and precision. The prediction accuracy rate of a passenger is defined as the number of correctly predicted binary variables divided by the total number of binary variables. The recall of a passenger is defined as the number of correctly predicted positive binary variables divided by the total number of positive binary variables. The precision of a passenger is defined as the number of correctly predicted positive binary variables divided by the total number of predicted positive binary variables.
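These three definitions translate directly into code. The sketch below is a minimal illustration; the function name and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
def trip_making_scores(actual, predicted):
    """Return (accuracy, recall, precision) for binary trip-making labels."""
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    actual_pos = sum(actual)
    predicted_pos = sum(predicted)
    accuracy = correct / len(pairs)
    recall = true_pos / actual_pos if actual_pos else 0.0
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, recall, precision


# Toy labels for one passenger: 1 = trip made, 0 = no trip.
accuracy, recall, precision = trip_making_scores(
    actual=[1, 1, 0, 1, 0],
    predicted=[1, 0, 0, 1, 1],
)
print(accuracy, recall, precision)
```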

Meanwhile, four indices are used to measure the performance of trip attribute prediction: the prediction accuracy rate for the prediction of individual mobility, and the mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean squared error (RMSE) for the prediction of the number of event participants from the passenger sources of the crowding station. The prediction accuracy rate is defined as the number of trips with correctly predicted trip attributes divided by the total number of trips in the test data. The indices MAPE, MAE, and RMSE are defined as follows:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%,$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2},$$

where $y_i$ represents the actual passenger flow from the $i$th passenger source of the crowding station, $\hat{y}_i$ represents the corresponding predicted passenger flow, and $n$ is the number of passenger sources of the crowding station.
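The three flow-error indices can be computed with the standard formulas, as in the sketch below. The function name and the toy flows are assumptions; MAPE is returned as a fraction (multiply by 100 for a percentage), and every actual flow must be nonzero.

```python
import math


def flow_errors(actual, predicted):
    """Return (MAPE, MAE, RMSE) between actual and predicted flows."""
    n = len(actual)
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    mape = sum(e / a for e, a in zip(abs_err, actual)) / n
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    return mape, mae, rmse


# Toy flows from three hypothetical passenger sources.
mape, mae, rmse = flow_errors(actual=[100, 200, 50], predicted=[110, 180, 50])
print(mape, mae, rmse)
```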

##### 3.7. Comparative Models

In this study, three benchmark models are established to validate the proposed hierarchical *n*-gram model, that is, a first-order Markov chain model (MC model), a basic *n*-gram model, and an *n*-gram model incorporating the proposed indices.

In the MC model, the transition probabilities , , and are obtained based on the mobility patterns in the training data:where is the number of times that the last trip starts in time slot and the later trip starts in time slot , is the number of times that the destination of the last trip is and the origin of the later trip is , is the number of times that the destination of the last trip is and the destination of the later trip is , is the number of possible time slots that a passenger may start a trip, is the number of possible subway stations that the passenger may enter or exit, and parameter is also used for smoothing. Next, the mobility probability of the later trip can be calculated based on the observed trip .
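A first-order Markov chain with smoothed transition probabilities can be sketched as follows. Additive smoothing, in which a constant is added to every count and the denominator grows by that constant times the number of states, is assumed here as the smoothing scheme; the function name and the toy stations are hypothetical.

```python
from collections import Counter


def transition_probs(pairs, states, beta=1.0):
    """Estimate P(next | previous) with additive (add-beta) smoothing.

    pairs:  list of (previous, next) observations from the training data.
    states: all feasible states, for example time slots or stations.
    """
    counts = Counter(pairs)
    probs = {}
    for prev in states:
        row_total = sum(counts[(prev, nxt)] for nxt in states)
        denom = row_total + beta * len(states)
        probs[prev] = {
            nxt: (counts[(prev, nxt)] + beta) / denom for nxt in states
        }
    return probs


# Toy data: destination of the last trip -> origin of the later trip.
states = ["A", "B", "C"]
pairs = [("A", "B"), ("A", "B"), ("A", "C")]
P = transition_probs(pairs, states)
print(P["A"])
```

Rows with no observations, such as B and C here, fall back to a uniform distribution over the states.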

In the basic *n*-gram model, trip-making prediction is estimated using the conditional probabilities for TMF and TML, and trip attribute prediction is estimated using the conditional probabilities for TAF and TAL, none of which includes the proposed indices.

In the *n*-gram model incorporating the proposed indices, the conditional probabilities for TMF and TML additionally include the proposed trip generation index, which captures the increased trips in crowding events, and the conditional probabilities for TAF and TAL additionally include the proposed trip attraction index, which captures subway passengers’ selection of trip destination. The conditional probabilities of both *n*-gram benchmark models are obtained using the mobility data of all studied passengers.

#### 4. Results

##### 4.1. Trip-Making Prediction

Trip-making prediction includes TMF and TML, which are solved by estimating the corresponding conditional probabilities defined in Section 3.2. The average number of trips is 1,601,737 per day in ordinary days, while there were 1,837,826 trips on the event day. The trip generation index is therefore set to 1.15 for the event day (1,837,826/1,601,737 ≈ 1.15). The feature set contains the date features and the mobility characteristics of the passenger (Table 2). The Category A features include the day of the week, whether the day is a holiday, and whether the passenger travelled on the previous day, which are common features in individual mobility prediction. The day of the week is a categorical variable and is converted into a series of binary variables (e.g., whether it is Monday or not). Note that there are no holidays in the data used. The Category B features include travel frequency and nontravel days. We propose a new Category C feature, the number of potential trips, which is defined as the difference between the average number of daily trips N of the passenger in ordinary days and the number of observed trips of the passenger in the predicted day. Based on the Category A, B, and C features, we employ the L2-regularized logistic regression model for trip-making prediction.

For TMF, we predict whether a passenger will make a trip. The number of potential trips of a passenger is set to the average number of daily trips N of the passenger in ordinary days. The medians of the coefficients in the L2 regularization model are presented in Table 2. The results show that passengers are more willing to travel on weekdays than on weekends. In addition, a passenger who made a trip on the previous day is more active and more willing to make a trip on the current day. We also notice that passengers with a large number of potential trips are more willing to travel.

For TML, we predict whether the later trip will occur based on five variables, namely, the order of the later trip, the number of potential trips, and the departure time, the origin, and the destination of the last trip. In this study, the departure time, the origin, and the destination of the last trip are converted into a series of binary variables. The departure time of the last trip is expected to have a great impact on trip-making prediction because the probability of a passenger making the later trip decreases with time. The destination of the last trip is also important because the last trip of a day usually ends at the same place. Meanwhile, the probability of a passenger making another trip decreases as the number of potential trips decreases. The coefficients of the variables are unique for each individual and are not presented here.

Our results show that the Category A + B features can effectively predict whether the first trip will occur, with an accuracy of 88.18% (Table 3, TMF). When the proposed new feature (i.e., the number of potential trips) replaces the Category B features, the prediction accuracy increases to 88.41% (Table 3, TMF). This highlights the effectiveness of the proposed feature in passenger mobility prediction. By combining the Category A + B + C features, the prediction accuracy further increases to 88.77%, and the recall increases to 93.96% (Table 3, TMF). The proposed model reaches the highest prediction accuracy (88.84%) and the highest recall (96.86%). Similarly, incorporating the number of potential trips increases the accuracy of next trip prediction from 81.55% to 82.42% (Table 3, TML). Finally, by introducing the trip generation index, the accuracy increases from 82.42% to 82.82%, the recall increases from 84.56% to 84.65%, and the precision increases from 88.25% to 88.61% for TML (Table 3). The results suggest that the trip generation index can be used to improve the performance of trip-making prediction.

##### 4.2. Trip Attribute Prediction

For trip attribute prediction, we first cluster passengers into different groups based on their temporal mobility patterns. We find that the silhouette coefficient reaches its maximum when the number of groups is set to 3. Hence, the studied passengers are clustered into 3 groups. We develop a hierarchical -gram model (i.e., -gram model) incorporating a new feature, the trip attraction index, to predict the trip attributes of passengers in each group.
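Selecting the number of groups by the silhouette coefficient can be illustrated with the sketch below. It assumes, purely for illustration, that each passenger is summarized by a single typical departure hour and that groups are formed by cutting the sorted values at the largest gaps; this simple one-dimensional splitter is a stand-in, not the clustering algorithm used in this study.

```python
def split_by_largest_gaps(values, k):
    """Stand-in 1-D clustering: cut the sorted values at the k-1 largest gaps."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [(values[order[j + 1]] - values[order[j]], j)
            for j in range(len(order) - 1)]
    cut_after = {j for _, j in sorted(gaps, key=lambda g: -g[0])[:k - 1]}
    labels, current = [0] * len(values), 0
    for pos, idx in enumerate(order):
        labels[idx] = current
        if pos in cut_after:
            current += 1
    return labels

def silhouette(values, labels):
    """Mean silhouette coefficient with absolute difference as the distance."""
    clusters = {}
    for v, l in zip(values, labels):
        clusters.setdefault(l, []).append(v)
    scores = []
    for v, l in zip(values, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)   # cohesion
        b = min(sum(abs(v - u) for u in other) / len(other)  # separation
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical typical departure hours for nine passengers.
dep_hours = [7.0, 7.5, 8.0, 12.0, 12.5, 13.0, 18.0, 18.5, 19.0]
scores = {k: silhouette(dep_hours, split_by_largest_gaps(dep_hours, k))
          for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

On this toy data the score peaks at three groups because the values form three well-separated bunches; real temporal mobility patterns would be multi-dimensional.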

For TAF, only the day of the week is used as the date feature to predict the start time, the origin, and the destination of the first trip. For TAL, the trip attributes of the later trip are predicted using the trip attributes of the last trip (i.e., its departure time, origin, and destination). We propose a new feature, the trip attraction index, to predict trip attributes. The effectiveness of the -gram model is validated from three aspects: (1) prediction of individual mobility, (2) prediction of out-passenger flow at the crowding station, and (3) prediction of the passenger sources of the crowding station. Several benchmark models, including the MC model, the -gram model proposed in reference [28], and the -gram model (-gram model incorporating the trip attraction index), are used for comparison. The prediction accuracy rates are listed in Table 4. The MC model cannot capture the individual mobility patterns for either TAF or TAL. The average prediction accuracy of the -gram model is 53.1% for TAF and 42.5% for TAL, implying that the -gram model can better predict passenger mobility attributes than the MC model. The -gram model achieves prediction accuracy similar to that of the -gram model when predicting trip attributes. The -gram model, which is constructed for each group of passengers specifically, can further improve the average prediction accuracy to 57.2% for TAF and 56.6% for TAL.
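The idea of predicting the attributes of a later trip from those of the last trip can be sketched with a count-based conditional model. The sketch below is an assumption-laden illustration: the `TripBigram` class, the backoff from individual to group counts, and the toy trips are hypothetical, and the trip attraction index is deliberately omitted because its definition (equations (2) and (3)) lies outside this section.

```python
from collections import Counter, defaultdict

class TripBigram:
    """Counts how often each (time slot, origin, destination) follows another."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, last_trip, next_trip):
        self.counts[last_trip][next_trip] += 1

    def predict(self, last_trip):
        """Most frequent follow-up trip, or None if last_trip was never seen."""
        followers = self.counts.get(last_trip)
        return followers.most_common(1)[0][0] if followers else None

def predict_with_backoff(individual, group, last_trip):
    """Use the passenger's own counts first, then the counts of their group."""
    guess = individual.predict(last_trip)
    return guess if guess is not None else group.predict(last_trip)

# Hypothetical trips: (time slot, origin, destination).
commute_out = ("08:00", "Home", "Office")
commute_back = ("18:00", "Office", "Home")
shopping = ("14:00", "Home", "Mall")
shopping_back = ("17:00", "Mall", "Home")

individual, group = TripBigram(), TripBigram()
for _ in range(3):
    individual.update(commute_out, commute_back)
    group.update(commute_out, commute_back)
group.update(shopping, shopping_back)  # seen only at the group level

print(predict_with_backoff(individual, group, commute_out))  # own history
print(predict_with_backoff(individual, group, shopping))     # group fallback
```

Building one group-level model per passenger cluster is one plausible reading of "constructed for each group of passengers specifically"; the backoff order is a design choice of this sketch.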

Out-passenger flow at the crowding station is important for subway management and crowd control. On the event day, 20,823 passengers exited the Window of World station during , and the trip attraction index = 1.15 for -gram model and = 4.28 for -gram model according to equations (2) and (3). The out-passenger flow at the Window of World station suddenly increased at 4:00 p.m. and reached its peak at 6:00 p.m. (Figure 5). We find that the MC model and the -gram model are not able to predict the anomalous passenger flow in the large crowding event. By introducing the trip attraction index, the -gram model reproduces the anomalous passenger flow in crowding events well, which suggests that the trip attraction index can be used to improve the performance of trip attribute prediction. In addition, the proposed -gram model can further improve the passenger flow estimation in crowding events (Figure 5).
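Once individual trips are predicted, the out-passenger flow at a station follows by aggregation. The sketch below assumes each predicted trip is a dictionary with hypothetical `destination` and `arrival_hour` fields; the trips themselves are invented for illustration.

```python
from collections import Counter

def hourly_outflow(predicted_trips, station):
    """Count predicted exits at `station` for each hour of the day."""
    flow = Counter()
    for trip in predicted_trips:
        if trip["destination"] == station:
            flow[trip["arrival_hour"]] += 1
    return dict(sorted(flow.items()))

def peak_hour(flow):
    """Hour with the largest predicted out-passenger flow."""
    return max(flow, key=flow.get)

# Hypothetical predicted trips on the event day.
predicted = [
    {"destination": "Window of World", "arrival_hour": 16},
    {"destination": "Window of World", "arrival_hour": 18},
    {"destination": "Window of World", "arrival_hour": 18},
    {"destination": "Window of World", "arrival_hour": 18},
    {"destination": "Other Station", "arrival_hour": 18},
    {"destination": "Window of World", "arrival_hour": 17},
    {"destination": "Window of World", "arrival_hour": 17},
]
flow = hourly_outflow(predicted, "Window of World")
print(flow, "peak hour:", peak_hour(flow))
```

Comparing such an aggregated curve with the observed exits per hour is how a predicted flow profile can be checked against the actual one.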

The trip-making prediction and trip attribute prediction can be conducted as soon as a passenger trip is observed. Hence, we can predict event participants’ trips and corresponding trip attributes before the start of the crowding event. The -gram model can predict 80% of actual passenger flow before the time period of the crowding event (Figure 6). This offers sufficient time for deploying crowd control strategies. A total of 2,198 predicted participants arrived at the crowding station in the first trip. Thus, these participants can be predicted at the start of the day (i.e., 6:00 a.m.), therefore providing sufficient time for the implementation of crowd control strategies. For example, we can close streets or place barriers to change the routes of crowds to reduce traffic congestion in streets.

Finally, we predict where the event participants start their trips. Here, only the trips with a destination at the crowding station are analyzed. Table 5 lists the main passenger sources of the event participants from 6:00 p.m. to 7:00 p.m. on the event day. Passengers mainly originated from the subway stations near the crowding station. Table 5 shows that the MC model and the -gram model could not capture the anomalous OD flow in some crowding events. However, by introducing the trip attraction index, the -gram model and -gram model can capture the anomalous passenger flow in OD pairs, which improves the prediction performance of the model.

#### 5. Conclusions

In summary, we predict the individual mobility of subway passengers in a large crowding event by introducing a new individual mobility feature (i.e., the number of potential trips) and two new cumulative mobility features (i.e., the trip generation index and the trip attraction index). Individual mobility prediction is split into trip-making prediction and trip attribute prediction, which are solved by the L2 regularization model and the proposed -gram model, respectively. The number of potential trips and the trip generation index enhance the accuracy of trip-making prediction. The other cumulative mobility feature, the trip attraction index, captures the anomalous mobility patterns in large crowding events, which improves the prediction accuracy of the trip attributes. The -gram model is also validated by predicting the out-passenger flow at the crowding station and the event participant sources, where it outperforms several benchmark models.

In this study, we only consider the subway passengers who participate in the crowding events. The taxi passengers and bus passengers involved in crowding events can be further analyzed in the future to improve the accuracy of trip-making prediction and trip attribute prediction. Besides, the trip prediction in the situations when multiple crowding events simultaneously occur may be an interesting future research direction. In addition, passenger demand could be underestimated due to device malfunction or fare evasion [32], which should be further investigated when relevant video surveillance data and subway trip volume data become available.

#### Data Availability

The subway smart card data and network data used to support the findings of this study have not been made available because of the confidentiality agreement.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The research was supported by the National Natural Science Foundation of China (grant no. 71871224), the 2021 Science and Technology Progress and Innovation Plan of Department of Transportation of Hunan Province (grant no. 212102), and the Hunan Provincial Natural Science Fund for Distinguished Young Scholars.