Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 9717582, 10 pages

http://dx.doi.org/10.1155/2016/9717582

## Three Revised Kalman Filtering Models for Short-Term Rail Transit Passenger Flow Prediction

^{1}Beijing Urban Transportation Infrastructure Engineering Technology Research Center, Beijing University of Civil Engineering and Architecture, Beijing 100044, China

^{2}Institute of Transportation Engineering, Tsinghua University, Beijing 100084, China

^{3}Parsons Transportation Group, 100 Broadway, New York, NY 10005, USA

^{4}New Jersey Department of Transportation (NJDOT), 1035 Parkway Avenue, Trenton, NJ 08625, USA

Received 16 December 2015; Accepted 10 March 2016

Academic Editor: Payman Jalali

Copyright © 2016 Pengpeng Jiao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Short-term prediction of passenger flow is very important for the operation and management of a rail transit system. Based on the traditional Kalman filtering method, this paper puts forward three revised models for real-time passenger flow forecasting. First, the paper introduces the historical prediction error into the measurement equation and formulates a revised Kalman filtering model based on an error correction coefficient (KF-ECC). Second, the paper employs the deviation between the real-time passenger flow and the corresponding historical data as the state variable and presents a revised Kalman filtering model based on Historical Deviation (KF-HD). Third, the paper integrates a nonparametric regression forecast into the traditional Kalman filtering method using a Bayesian combination technique and puts forward a revised Kalman filtering model based on Bayesian combination and nonparametric regression (KF-BCNR). A case study is carried out using statistical passenger flow data from rail transit line 13 in Beijing over a one-month period. The reported prediction results show that KF-ECC improves the applicability to historical trends, KF-HD achieves excellent accuracy and stability, and KF-BCNR yields the best performance. Comparisons among different periods further indicate that results during peak periods outperform those during nonpeak periods. All three revised models are accurate and stable enough for on-line predictions, especially during peak periods.

#### 1. Introduction

With rapid urbanization and motorization in most large Chinese cities, urban transportation systems face increasingly serious problems such as congestion, crashes, and pollution. As an efficient travel mode, rail transit plays an increasingly important role in addressing these problems. In Beijing, a total of 21 lines are now in operation, covering 527.2 kilometers (327.6 miles). During the past decade, the average daily passenger flow has increased dramatically to about 10 million riders. Therefore, the operation and management of the rail transit system, especially in real time, are very important.

During peak hours, pedestrian congestion happens frequently. For safety and efficiency, real-time passenger flows, especially the flows predicted for the next several time intervals, are key inputs for real-time intelligent operation of the rail transit system. However, while past and current passenger flows are easily detected, future flows cannot be observed directly. Therefore, a passenger flow forecasting method based on statistical data is valuable.

Most recently, Sun et al. [1] proposed a nonparametric regression method to forecast passenger flow at subway transfer stations. Apart from this, the literature shows that very few studies have focused directly on short-term rail transit passenger flow prediction. However, short-term traffic flow forecasting has been studied extensively within Intelligent Transportation Systems (ITS), and many practical models have been developed. With different input data, some of these models can readily be used to forecast rail transit passenger flow.

Existing traffic flow forecast models cover a wide range, including Historical Average (HA), Autoregressive Integrated Moving Average (ARIMA), Neural Network (NN), Kalman filtering (KF), nonparametric regression (NR), chaos theory, Support Vector Machine (SVM), and others. The HA model uses a simple time-series method [2] and is rarely used now. Ahmed and Cook [3] put forward an ARIMA model to forecast freeway traffic flows, and Williams et al. [4] extended it to the seasonal case and compared it with an Exponential Smoothing Method (ESM). Many researchers have formulated NN-based prediction models and obtained rather satisfying results, such as Smith and Demetsky [5], Florio and Mussone [6], Zhang et al. [7], Dougherty et al. [8], Park and Rilett [9], and Vlahogianni et al. [10]. Kalman filtering is a highly efficient recursive state forecast method that has also been widely used in short-term traffic flow prediction, for example, by Okutani and Stephanedes [11], Cathey and Dailey [12], and Shekhar and Williams [13]. As a nonlinear regression method, the NR model is well suited to uncertain and dynamic systems such as real-time transportation systems. Pioneering work on the NR method can be found in Yakowitz [14] and Karlsson and Yakowitz [15], and some scholars further developed it for traffic flow forecasting, for instance, Davis and Nihan [16], Smith and Demetsky [17], Oswald et al. [18], Smith et al. [19], Qi and Smith [20], and Kindzerske and Ni [21]. Huang et al. [22], Lu and Wang [23], Meng and Peng [24], Xue and Shi [25], and Pang and Zhao [26] applied chaos theory to traffic flow prediction and obtained acceptable results. SVM is a new statistical machine-learning method [27] that has been shown to have stronger learning and generalization abilities than the NN model. SVM has also been used in traffic flow forecasting, for example, by Ren et al. [28], Wu et al. [29], and Wang et al. [30].

Generally, the above methods can be classified into statistical and artificial intelligence models. Smith and Demetsky [17] and Smith et al. [19] compared some of these models and concluded that no single method was universally accepted as the best one. Therefore, combined methods have been developed on the basis of existing single models, and one of the most effective approaches is the Bayesian combined model. Zheng et al. [31], Dong et al. [32], Jiao et al. [33], and Jiao et al. [34] have demonstrated its effectiveness.

More recently, some studies have proposed new models for multistep prediction [35] and large-scale road network forecasting [36]. The latter employed cloud computing techniques for large-scale network applications.

Among all the above short-term traffic flow forecast models, the Kalman filtering method is very efficient because of its recursive nature and is convenient to use for rail transit passenger flow prediction. However, existing studies have shown that the traditional KF methods are not accurate and stable enough for on-line applications. Therefore, this paper revises the traditional KF method and proposes three revised models.

To predict passenger flow accurately and efficiently, the first key feature of the paper is to introduce error calibration measures or new state variables into the classical model and thereby construct revised KF forecast models. The second key feature is to integrate stable methods into the KF framework and formulate an innovative KF prediction model with good accuracy, stability, and robustness.

This paper consists of six sections. Following the Introduction, the basic KF model is described in the second section, including its state transition and measurement equations. Three revised KF models are formulated in the third section: the KF model based on the error correction coefficient (KF-ECC), the KF model based on Historical Deviation (KF-HD), and the KF model based on Bayesian combination and nonparametric regression (KF-BCNR). Solution algorithms for the NR model, the KF model, and the Bayesian combination model are designed in the fourth section. Prediction results using practical statistical passenger flow data are reported and analyzed in the fifth section. Conclusions and some future research directions are summarized in the last section.

#### 2. Basic Kalman Filtering Model

The KF model is a kind of state space method consisting of three important parts: state variable, state transition equation, and measurement equation.

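In rail transit passenger flow prediction, the short-term passenger flow to be forecasted is taken directly as the state variable. In this paper, we employ the passenger flow at the station. Using $q(k)$ to denote the passenger flow during time interval $k$ at a station, the state transition equation and measurement equation are formulated as follows:

$$\mathbf{Q}(k+1) = \mathbf{Q}(k) + \mathbf{W}(k), \tag{1}$$

$$\mathbf{Z}(k) = \mathbf{H}(k)\,\mathbf{Q}(k) + \mathbf{V}(k), \tag{2}$$

where $\mathbf{Q}(k)$ is the column vector form of passenger flow $q(k)$ and, accordingly, $\mathbf{Q}(k+1)$ is the column vector of $q(k+1)$; $\mathbf{W}(k)$ is a Gaussian white noise vector with mean value $E[\mathbf{W}(k)] = \mathbf{0}$ and covariance matrix $E[\mathbf{W}(k)\mathbf{W}^{T}(j)] = \mathbf{R}\,\delta_{kj}$, where $\mathbf{R}$ is a constant semipositive matrix and $\delta_{kj}$ is the Kronecker delta, that is, $\delta_{kj} = 1$ if $k = j$ and $\delta_{kj} = 0$ otherwise; $\mathbf{Z}(k)$ is the column vector form of the measurements, and here the Historical Average passenger flow during the same time interval is taken as the measurement; $\mathbf{H}(k)$ is the measurement matrix, which equals the identity matrix in passenger flow prediction and can therefore be neglected in the formulation; and $\mathbf{V}(k)$ is the column vector form of the detection errors, with mean value $E[\mathbf{V}(k)] = \mathbf{0}$ and covariance matrix $E[\mathbf{V}(k)\mathbf{V}^{T}(j)] = \mathbf{S}\,\delta_{kj}$, where $\mathbf{S}$ is a constant semipositive matrix similar to $\mathbf{R}$.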

Equations (1) and (2) together constitute the basic KF model. Existing studies have shown that the basic form of KF is efficient because of its recursive nature. However, its accuracy is not satisfactory. Therefore, we formulate revised KF models to improve the prediction accuracy.
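The recursion behind a model of this kind can be sketched in a few lines. The example below is a minimal scalar simplification, assuming a random-walk state equation and an identity measurement matrix; the function names, the noise variances `q_var` and `r_var`, and the sample flows are hypothetical and are not taken from the paper.

```python
def kalman_step(x_est, p_est, z, q_var, r_var):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed state model:       x(k+1) = x(k) + w(k),  Var[w] = q_var
    Assumed measurement model: z(k)   = x(k) + v(k),  Var[v] = r_var
    """
    # Predict: a random-walk state keeps its mean and gains process variance.
    x_pred = x_est
    p_pred = p_est + q_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r_var)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def kalman_filter(measurements, x0, p0, q_var, r_var):
    """Run the filter over a sequence and return the filtered estimates."""
    estimates = []
    x_est, p_est = x0, p0
    for z in measurements:
        x_est, p_est = kalman_step(x_est, p_est, z, q_var, r_var)
        estimates.append(x_est)
    return estimates


# Hypothetical flows (passengers per interval), not data from the paper.
flows = [1200.0, 1250.0, 1310.0, 1290.0, 1400.0]
print(kalman_filter(flows, x0=1200.0, p0=100.0, q_var=50.0, r_var=200.0))
```

Because each step needs only the previous estimate and its variance, the cost per time interval is constant, which is the recursive efficiency referred to above.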

#### 3. Three Revised Kalman Filtering Models

##### 3.1. The Revised KF Model Based on Error Correction Coefficient

Since historical passenger flow data can be collected easily, the trend of the flow changes can be tracked conveniently. Applying the basic KF model in (1) and (2) to historical cases yields the errors between the historical forecasts and the historical detections. Based on the characteristics of these errors, we introduce an error correction coefficient, derived from the historical forecasting deviations, into the measurement equation; the result is (3). The measurement matrix is omitted from (3) because it is an identity matrix.

The error correction coefficient varies under different conditions. It is closely correlated with the historical forecasting errors: it grows as the historical errors increase, and it can be obtained by fitting the historical data.

During weekdays, rail transit passenger flows usually move from morning peak hours to nonpeak hours and then to evening peak hours, and the historical forecasting errors show a similar pattern. Statistical analyses show that the coefficient can be fitted by a quadratic (parabolic) function, given as (4), whose two parameters are estimated by the data fitting procedure.

Equations (1), (3), and (4) together constitute the revised KF-ECC model.
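The data fitting step can be illustrated with an ordinary least-squares fit. The sketch below assumes, purely for illustration, a two-parameter parabola of the form `a * t**2 + b`, with `t` measured from the middle of the day; the exact expression used in the paper is not reproduced here, and the function name and sample values are hypothetical.

```python
def fit_parabola(times, errors):
    """Least-squares fit of errors ~ a * t**2 + b (assumed two-parameter form)."""
    u = [t * t for t in times]          # regress on t squared
    n = len(u)
    su, sy = sum(u), sum(errors)
    suu = sum(v * v for v in u)
    suy = sum(v * y for v, y in zip(u, errors))
    denom = n * suu - su * su
    if denom == 0:
        raise ValueError("need at least two distinct values of t**2")
    a = (n * suy - su * sy) / denom
    b = (sy - a * su) / n
    return a, b


# Hypothetical historical errors: larger in the peaks (|t| large), smaller at midday.
hours_from_midday = [-4, -2, 0, 2, 4]
historical_errors = [11.2, 4.9, 3.1, 5.0, 10.8]
a, b = fit_parabola(hours_from_midday, historical_errors)
print(a, b)
```

A positive `a` in such a fit would correspond to a coefficient that is larger in the morning and evening peaks than at midday.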

##### 3.2. The Revised KF Model Based on Historical Deviation

Since rail transit passenger flow fluctuates dramatically and its magnitude is rather large, a KF forecasting process that uses passenger volume directly as the state variable is not very stable. Further analyses of passenger flows show that the deviation between the real-time volume and the corresponding historical data is fairly smooth [37]. Therefore, this deviation is introduced into the KF model as the revised state variable to improve the accuracy and stability of the prediction. The revised KF-HD model is formulated as (5) and (6), which involve the column vector of historical passenger flow in the same time interval on the same weekday of the previous week. Importantly, this historical vector differs from the measurement vector in (2): the former corresponds to the same weekday in the previous week, while the latter is the average value of the historical data.

Equations (5) and (6) together constitute the revised KF-HD model, which is a basic KF formulation except that the state variable takes a deviation form. Since the historical quantities involved are available from statistical data, the real-time passenger flow can be obtained easily.
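The idea of filtering a deviation rather than the raw flow can be sketched as follows. This is one possible reading, not the paper's exact formulation: it assumes a scalar random-walk model for the deviation, uses the observed deviation as the measurement, and rebuilds the one-step-ahead flow by adding the filtered deviation to last week's value for the next interval. All names and numbers are hypothetical.

```python
def kf_hd_forecast(current, last_week, q_var, r_var, p0=1.0):
    """One-step-ahead forecasts from a filter run on (current - last week).

    `last_week` must hold one more interval than `current`, so that a
    historical value exists for every interval being forecast.
    """
    if len(last_week) != len(current) + 1:
        raise ValueError("last_week needs exactly one more value than current")
    d_est, p_est = 0.0, p0
    forecasts = []
    for k, flow in enumerate(current):
        z = flow - last_week[k]            # observed deviation in interval k
        p_pred = p_est + q_var             # predict (random-walk deviation)
        gain = p_pred / (p_pred + r_var)   # Kalman gain
        d_est += gain * (z - d_est)        # update the deviation estimate
        p_est = (1.0 - gain) * p_pred
        # The next deviation is predicted to equal the current estimate.
        forecasts.append(last_week[k + 1] + d_est)
    return forecasts


# Hypothetical data: today runs about 10 passengers above last week.
print(kf_hd_forecast([110.0, 121.0, 129.0],
                     [100.0, 110.0, 120.0, 130.0],
                     q_var=1.0, r_var=4.0))
```

The filter only has to track a small, slowly varying quantity, while the large daily swings are carried by the historical series.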

##### 3.3. The Revised KF Model Based on Bayesian Combination and Nonparametric Regression

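Existing studies [31–34] have demonstrated the effectiveness of the Bayesian combined approach in traffic flow forecasting. It is in fact a weighted average method, as shown below:

$$\hat{q}(k) = w_{\mathrm{KF}}(k)\,q_{\mathrm{KF}}(k) + w_{\mathrm{NR}}(k)\,q_{\mathrm{NR}}(k), \tag{7}$$

where, for time interval $k$, $q_{\mathrm{KF}}(k)$ is the result from the KF model, $q_{\mathrm{NR}}(k)$ is the result from the NR model, and $w_{\mathrm{KF}}(k)$ or $w_{\mathrm{NR}}(k)$ is the weight of the KF or the NR model.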

As stated before, the NR model is well suited to uncertain and dynamic transportation systems, and many studies have demonstrated its accuracy. Therefore, we introduce the NR method into the Bayesian combined model to further improve the prediction results. Here, the $K$-nearest neighbor nonparametric regression ($K$NNNR) method is employed.

From (7), we can see that, in the Bayesian combination framework, the KF model or the NR model may be strengthened or weakened by adjusting its weight. If the weight of the KF model is set to zero, the KF model is dropped from the combination; the same holds for the NR model if its weight is set to zero. In practice, both weights are adjusted dynamically according to the forecasting errors of the two single models. The detailed adjustment mechanism is described in Section 4.
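The effect of the weights can be seen in a minimal sketch of the weighted average. It assumes that the two weights sum to one, so only the KF weight is passed in; the function name and the sample forecasts are hypothetical, and the dynamic weight adjustment itself is not reproduced here.

```python
def combine(kf_forecast, nr_forecast, w_kf):
    """Weighted average of two forecasts; the NR weight is taken as 1 - w_kf."""
    if not 0.0 <= w_kf <= 1.0:
        raise ValueError("w_kf must lie in [0, 1]")
    return w_kf * kf_forecast + (1.0 - w_kf) * nr_forecast


# Hypothetical single-model forecasts for one interval.
kf_value, nr_value = 1300.0, 1360.0
print(combine(kf_value, nr_value, 0.0))   # KF weight zero: only the NR forecast remains
print(combine(kf_value, nr_value, 1.0))   # NR weight zero: only the KF forecast remains
print(combine(kf_value, nr_value, 0.5))   # equal weights: midpoint of the two
```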

We further take the NR prediction as the control variable and introduce it into the KF model. Specifically, we combine the NR result and the KF result, each taken in its own time interval, through the Bayesian combination method and integrate them into the state transition equation of the KF model, giving the revised formulation (8). There the NR and KF results appear in column vector form, and the other symbols are the same as before. The NR term is the control variable of the state transition equation; that is, it reflects the contribution of the NR model to the final prediction results.

Equations (8) and (2) together constitute the revised KF-BCNR model. The main purpose of this revised KF model is to bring more historical information and more accurate intermediate results into the forecasting process and thereby improve the accuracy and stability of the prediction.

Based on the adjustment algorithm for the Bayesian weights and the results of the NR model, we can finally obtain the forecasted passenger flows.

#### 4. Algorithms

##### 4.1. Nonparametric Regression Algorithm

The NR algorithm mainly consists of five steps: the preparation of historical data, the generation of the sample database, the definition of the state vector, the search for the $K$ nearest neighbors, and the prediction function. The general algorithm flow is shown in Figure 1.
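The five steps can be sketched with a small nearest-neighbor forecaster. This is an illustrative simplification, not the algorithm in Figure 1: it assumes a state vector made of the most recent `dim` flows, Euclidean distance for the neighbor search, and an unweighted average of the neighbors' successors as the prediction function. All names and data are hypothetical.

```python
import math


def build_samples(history, dim):
    """Steps 1-3: pair each state vector (dim consecutive flows) with the flow that followed."""
    return [(history[i:i + dim], history[i + dim])
            for i in range(len(history) - dim)]


def knn_forecast(samples, state, n_neighbors):
    """Steps 4-5: find the nearest stored states and average their successors."""
    if n_neighbors < 1 or not samples:
        raise ValueError("need at least one neighbor and one sample")
    ranked = sorted(samples, key=lambda s: math.dist(s[0], state))
    nearest = ranked[:n_neighbors]
    return sum(nxt for _, nxt in nearest) / len(nearest)


# Hypothetical historical flows for consecutive intervals.
history = [900.0, 1000.0, 1150.0, 1300.0, 1250.0, 1100.0, 950.0, 1000.0, 1140.0]
samples = build_samples(history, dim=2)
print(knn_forecast(samples, state=[990.0, 1145.0], n_neighbors=2))
```

No parametric model is fitted; the forecast comes directly from past situations that resemble the current one, which is why the quality of the sample database matters so much for this method.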