Journal of Advanced Transportation

Volume 2018, Article ID 3189238, 13 pages

https://doi.org/10.1155/2018/3189238

## Forecasting of Short-Term Metro Ridership with Support Vector Machine Online Model

^{1}Department of Transportation Engineering, Tongji Zhejiang College, Jiaxing 314000, China

^{2}Intelligent Transportation System Institute of Ministry of Education, Southeast University, Nanjing 211189, China

^{3}Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, USA

Correspondence should be addressed to Ning Zhang; moc.liamg@2791gnahzgnin

Received 18 January 2018; Revised 6 April 2018; Accepted 26 April 2018; Published 27 June 2018

Academic Editor: David F. Llorca

Copyright © 2018 Xuemei Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Forecasting short-term ridership is the foundation of metro operation and management. A prediction model must capture, in real time, the weekly periodicity and nonlinearity of short-term ridership. First, this research captures the inherent periodicity of ridership via the seasonal autoregressive integrated moving average (SARIMA) model and proposes a support vector machine overall online model (SVMOOL), which embeds the weekly periodic characteristics and retrains on the updated data day by day. Then, this research captures the nonlinear characteristics of ridership by using successive ridership values as inputs and proposes a support vector machine partial online model (SVMPOL), which embeds the nonlinear characteristics and retrains on the updated data of the predicted day at each time interval (such as 5 min). Afterwards, to avoid the drawbacks and exploit the strengths of the two individual online models, this research takes the average of the two models' predicted values as the final predicted values; the result is called the support vector machine combined online model (SVMCOL). Finally, this research uses 5-min ridership at Zhujianglu and Sanshanjie Stations of the Nanjing Metro to compare the SVMCOL model with three well-known prediction models: SARIMA, back-propagation neural network (BPNN), and SVM. The performance comparisons suggest that SARIMA is superior to the other models for stable weekday ridership, whereas the SVMCOL model is the best performer for unstable weekend and holiday ridership. For metro operation managers who are geared toward timely response to unstable and abnormal real-world situations, the SVMCOL model may therefore be a better tool than the three well-known models.

#### 1. Introduction

Short-term ridership forecasting is a vital component of metro operation and management. Accurate predictions can reflect real-time changes in ridership. The prediction results can become important inputs for decision-making in evaluating rail transit service level and system operating status and provide an important basis for station passenger crowd regulation and emergency response. In addition, short-term ridership forecasting is the key to the success of revenue management for railway operators [1].

Over the last two decades, traditional metro ridership forecasting has been based on travel demand forecasting models comprising the steps of trip generation, trip distribution, mode choice, and assignment [2, 3]. This type of long-term forecasting has been applied in the planning and construction of metro systems, but it cannot be adapted to the needs of operations management.

Though the spatial-temporal characteristics of metro ridership are not completely the same as those of vehicle traffic flow [4], short-term forecasting methods can likewise be divided into two categories: theory driven methods and data driven methods. The theory driven method relies on traffic flow mechanisms to investigate traffic dynamics [5, 6]. The data driven method, on the other hand, uses the traffic flow series itself to construct models and make predictions. The data driven model is the main method of short-term prediction and can be divided into linear, nonlinear, and hybrid forecasting methods. The linear forecasting methods mainly include the time series model [7–9] and the Kalman filtering model [10–12]. The nonlinear forecasting methods include nonparametric regression [13, 14], neural network algorithms [15–17], the support vector machine [18–20], and the Gaussian maximum likelihood model [21]. The hybrid forecasting method combines at least two methods to achieve better accuracy and reliability. Hybrid models mainly include the wavelet decomposition hybrid model [22, 23], Bayesian decomposition hybrid model [24, 25], empirical mode decomposition hybrid model [26], neural network hybrid model [27, 28], and support vector machine hybrid model [29–35].

Whether for traffic flow or passenger flow, the time series model has become one of the classic models of short-term flow prediction [36]. Of all the time series models, the seasonal autoregressive integrated moving average (SARIMA) model considers the periodicity of the time series, so it can capture the inherent periodicity of traffic flow data. Williams et al. [9–11] used the SARIMA model for short-term traffic flow prediction and verified its good performance. However, the time series model is linear, and its prediction performance may worsen significantly if the time series is nonstationary and nonlinear. Nevertheless, the SARIMA model is widely used as a benchmark for evaluating the forecasting performance of a novel model.

Neural networks are among the most widely used nonlinear models. A neural network trains neurons on historical data, maps the complicated nonlinear relation between input and output data, and uses that relation to make predictions for given inputs. Neural network algorithms are adaptive, learn from data, and are flexible in that they do not require a detailed, explicit model to be constructed, as other methods do. Vlahogianni et al. [37] optimized the neural network structure to forecast urban traffic flow parameters. However, because it follows the empirical risk minimization principle, a neural network cannot minimize the expected risk, which may lead to two major drawbacks: local minima and overfitting [38]. Local minima arise in the training process, which minimizes the difference between the predicted and observed outputs by optimizing the network weights. Overfitting leads to poor generalization ability and may produce inaccurate predictions on some particular testing data.

Compared with the neural network algorithm, the support vector machine (SVM) model can strike a compromise between prediction accuracy and generalization ability based on the structural risk minimization principle. With an appropriate choice of kernel function, SVM can address the problems of small samples, nonlinearity, the curse of dimensionality, overfitting, and local minima. Zhang and Xie [19] proposed a *υ*-support vector machine model for short-term traffic volume prediction and showed that it outperformed the multilayer feed-forward neural network (MLFNN) model. Zhang et al. [30] proposed a novel hybrid model that identified the SVM input dimensions via the SARIMA model to forecast short-term traffic volume, taking advantage of the individual strengths of the two models. Hong [33] presented a model for forecasting interurban traffic flow that combines the seasonal support vector regression model with a chaotic immune algorithm (SSVRCIA); it yielded more accurate forecasts than the SARIMA, BPNN, and seasonal Holt–Winters models. Wang and Shi [34] constructed a new kernel function from a wavelet function to capture the nonstationary characteristics of short-term traffic speed data, proposed a hybrid short-term traffic speed forecasting model (Chaos-Wavelet Analysis-Support Vector Machine model, C-WSVM), and achieved encouraging results. Chen et al. [35] proposed an approach that hybridizes the SVR model with an adaptive genetic algorithm (AGA) and seasonal index adjustment, namely, AGA-SSVR, to forecast holiday daily tourist flow.

Research on short-term metro ridership forecasting is a rather new undertaking. Tsai et al. [1] proposed two novel neural network structures based on temporal feature extraction and successfully applied them to short-term railway passenger demand forecasting in Taiwan. Wei and Chen [26] used empirical mode decomposition to extract neural network input variables to forecast the short-term ridership of the Taipei Rapid Transit Muzha Line. Sun et al. [29] proposed a novel hybrid Wavelet-SVM model, and the experimental results showed that the approach appeared promising and robust. These studies indicate that metro ridership has significant periodicity and nonlinearity, reflecting a variety of factors; however, how to embed these characteristics into the model without increasing its computational complexity is worth discussing. Moreover, for neural network or support vector models, the previous literature did not discuss whether the training time meets the demand of practical operation. If the training time is too long and causes a serious forecasting delay, the prediction model cannot meet the demand of practical operation even if it has good prediction performance. In addition, most existing research on short-term metro ridership forecasting focused mainly on normal situations; the applicability and prediction accuracy of the models under holidays, inclement weather, large sports events, or emergencies remain unclear. Sun et al. [29] selected the data including a Valentine’s Day (not a major holiday) as training data, not as a predictor. Finally, the short-term prediction interval in this literature is long (e.g., 15 min), which cannot meet the requirements of metro operators because departure intervals are short.

The reliability and operability of a model play a crucial role in the accuracy and real-time implementation of the prediction, so the choice of model is very important in practical applications. Since the characteristics of metro ridership are quite different from those of other transportation systems, most forecasting models provide unsatisfactory prediction performance. After comparing the time series model, the neural network model, and the SVM model, this paper selects the SVM model as the base short-term prediction model, with the aim of capturing in real time the periodicity and nonlinearity of short-term ridership mentioned previously. With this base model, this paper proposes a support vector machine overall online (SVMOOL) model, which extracts input features via the SARIMA model, trains on the updated data day by day, and optimizes the parameters with a particle swarm optimization (PSO) algorithm, to capture the periodicity of ridership in real time. This paper also proposes a support vector machine partial online (SVMPOL) model, which extracts input features based on the temporal continuity of ridership, trains on the updated data at each time interval (such as 5 min), and also optimizes the parameters with a PSO algorithm, to capture the nonlinearity of ridership. Afterwards, the support vector machine combined online (SVMCOL) model is proposed by combining the SVMOOL model and the SVMPOL model.

The main contributions of this paper are as follows.

This paper proposes a novel hybrid model combining the SVMOOL model and the SVMPOL model for short-term ridership forecasting, which better captures the periodicity and nonlinearity characteristics by training on the updated data set. The SVMCOL model takes advantage of the individual strengths of the two models. The actual results of 5-min short-term ridership forecasting show the feasibility and effectiveness of the proposed combined model in real-time implementation.

While the SARIMA model is superior to the other models for stable weekday ridership, experimental results indicate that the SVMOOL model is superior to the SARIMA, BPNN, and SVM models in terms of MAE and RMSE in the weekend and holiday ridership tests. It should be noted that predicting ridership under abnormal situations (such as holidays) is evidently more challenging than under normal conditions (such as weekdays) and, hence, is much desired by the operator. Therefore, the proposed SVMCOL model is found to be suitable and useful in real-world operations.

The experiments using LibSVM package on desktop computers indicate that the SVMOOL model needs about one hour for three weeks’ data (4284 observations) to construct the prediction function and the forecasting time takes less than 1 second for a one-step prediction using SVM. In the process of the implementation experiments, the SVMPOL model needs less than 1 s to construct due to the small data sample and the forecasting time needs less than 1 s for a one-step prediction. Therefore, the training time and the forecasting time can meet the real-time demand for the one-step prediction in implementation as well.
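As a minimal sketch of how the combined forecast and the two error measures named above could be computed, the following Python snippet averages two forecast sequences and evaluates MAE and RMSE. The function names and the ridership numbers are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
import math


def combine_forecasts(pred_a, pred_b):
    """Average two equally long forecast sequences element by element."""
    if len(pred_a) != len(pred_b):
        raise ValueError("forecast sequences must have equal length")
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up 5-min ridership counts (illustration only, not study data).
actual = [100.0, 120.0, 90.0]
svmool_pred = [110.0, 115.0, 80.0]  # hypothetical overall-online forecasts
svmpol_pred = [94.0, 121.0, 96.0]   # hypothetical partial-online forecasts

svmcol_pred = combine_forecasts(svmool_pred, svmpol_pred)
combined_mae = mae(actual, svmcol_pred)
combined_rmse = rmse(actual, svmcol_pred)
```

In this toy example the averaged forecast happens to have a smaller error than either input, but that is a property of the made-up numbers, not a general guarantee.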

In general, short-term forecasting means prediction for a specific time interval, such as 5 min, 10 min, or 15 min. For metro ridership, a 5-min interval is more useful for metro operation and management because the departure interval of metro vehicles is very short. In addition, ridership on workdays clearly differs from that on weekends or holidays. As discussed by Chen et al., some prediction models that work well for workday data may yield unsatisfactory results for weekend or holiday data. In order to discuss the applicability of the proposed model, three samples were selected. The first sample contains weekdays and weekends but no holidays, and the second and third samples contain weekdays, weekends, and holidays.

This paper attempts to develop an online hybrid model to improve the forecasting performance of metro ridership. The rest of this paper is organized in the following manner. A brief theoretical background of the SVM model is presented first, followed by detailed description on SVMOOL model, SVMPOL model, and SVMCOL model. After that, a brief description of the data source and the implementation of the models are given. Finally, results analysis and conclusions are presented.

#### 2. Methodology

Before the SVMOOL, SVMPOL, and SVMCOL models are introduced, the SVM model is described first.

##### 2.1. Support Vector Machine for Regression

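A detailed description of the SVM algorithm is given in Vapnik [38]. Assume that the training input data and the corresponding training output data are $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$, and $N$ denotes the total number of data. The basic idea of SVM is to map the low-dimensional input space to the high-dimensional feature space using a function $\varphi(x)$. The linear regression function can be stated as

$$f(x) = w^{T}\varphi(x) + b, \tag{1}$$

where $w$ and $b$ are coefficients. For SVM, these coefficients can be obtained by solving the following optimization problem:

$$\begin{aligned} \min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right) \\ \text{s.t.} \quad & y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\ & w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\ & \xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, \ldots, N, \end{aligned} \tag{2}$$

where $\varepsilon$ ($\ge 0$) is the insensitive loss coefficient, $\xi_i$ and $\xi_i^{*}$ are slack variables, and $C$ is a regularization parameter. The maximal dual function of (2) has the following form:

$$\begin{aligned} \max_{\alpha,\,\alpha^{*}} \quad & -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\left(\alpha_i - \alpha_i^{*}\right)\left(\alpha_j - \alpha_j^{*}\right)\varphi(x_i)^{T}\varphi(x_j) - \varepsilon\sum_{i=1}^{N}\left(\alpha_i + \alpha_i^{*}\right) + \sum_{i=1}^{N} y_i\left(\alpha_i - \alpha_i^{*}\right) \\ \text{s.t.} \quad & \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right) = 0, \quad 0 \le \alpha_i,\ \alpha_i^{*} \le C, \quad i = 1, \ldots, N, \end{aligned} \tag{3}$$

where $\alpha_i$ and $\alpha_i^{*}$ are Lagrange multipliers.

Ultimately, the decision function given by (1) has the explicit form:

$$f(x) = \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b, \tag{4}$$

where $K(x_i, x) = \varphi(x_i)^{T}\varphi(x)$ is the kernel function. There are several types of kernel functions, including polynomial, radial basis, and sigmoid. Generally, a Gaussian radial basis function (see (5)) is widely used because of better prediction performance:

$$K(x_i, x_j) = \exp\left(-\gamma\,\|x_i - x_j\|^{2}\right), \tag{5}$$

where $\gamma > 0$ is the kernel parameter.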
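To illustrate how the decision function is evaluated once training has produced the multipliers, the following Python sketch computes a Gaussian radial basis kernel and the kernel expansion of the prediction. It assumes the kernel is parameterized by a single width parameter `gamma` (an assumed notation), and the support vectors, coefficients, and bias are made-up values rather than the output of any real training run.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian radial basis kernel exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_decision(x, support_vectors, dual_coefs, b, gamma):
    """Evaluate f(x) = sum_i coef_i * K(x_i, x) + b.

    Each entry of dual_coefs stands for the difference of the two
    Lagrange multipliers belonging to one support vector.
    """
    expansion = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return expansion + b


# Hypothetical values, for illustration only.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [0.5, -0.25]
value = svr_decision([0.0, 0.0], support_vectors, dual_coefs, b=1.0, gamma=0.5)
```

Only training points with a nonzero coefficient contribute to the sum, which is why prediction can stay fast even when training is slow.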

##### 2.2. Input Features and Parameter Optimization

Identifying input features is a crucial step in SVM modeling. Metro ridership has significant characteristics of periodicity and nonlinearity. Abe [39] found that excessive features cause not only long training time but also poor generalization ability. Some researchers have documented the identification of input features in detail. For example, Zhang et al. [30] identified the SVM input dimensions via SARIMA. Wu et al. [40] extracted input features from successive actual values before the prediction time; that is to say, if the value at a future time is regarded as the output, then the real values at past times serve as inputs. Cao et al. [41] used principal component analysis, kernel principal component analysis, and independent component analysis for input extraction. Huang and Wang [42] and Lin et al. [43] used a genetic algorithm (GA) and a particle swarm optimization (PSO) algorithm, respectively, to extract input features.
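The idea of using successive past values as inputs can be sketched as a sliding window over the series. The helper name `make_lagged_samples`, the window length, and the ridership numbers below are assumptions made for illustration only.

```python
def make_lagged_samples(series, n_lags):
    """Build (inputs, targets) pairs from a univariate series.

    Each input holds the n_lags values immediately before the target,
    so the value at time t is predicted from times t-n_lags .. t-1.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


# Made-up 5-min ridership counts (illustration only).
ridership = [12, 15, 20, 18, 25, 30]
X, y = make_lagged_samples(ridership, n_lags=3)
```

Here `X` becomes `[[12, 15, 20], [15, 20, 18], [20, 18, 25]]` and `y` becomes `[18, 25, 30]`. A longer window adds features, which, as noted above, can lengthen training and hurt generalization.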

Parameter optimization aims to improve the forecasting accuracy of the SVM model. The parameters optimized are mainly the penalty coefficient, the insensitive loss coefficient, and the parameters of the kernel function. The LibSVM package [44] uses a grid-search algorithm combined with cross-validation to determine these parameters, but the process takes lengthy computation time. Hong et al. [45], Lin et al. [43], and Hong et al. [45] successfully used GA, PSO, and the ant colony optimization (ACO) algorithm, respectively, to find the optimal parameters. The advantages of PSO lie in easier application, fewer parameters to adjust, and faster convergence to the optimum. As a result, PSO is used to optimize the parameters in this study. PSO simulates social behavior, like birds flocking to a promising position, to achieve precise objectives in a multidimensional space [46]. PSO reaches the optimal solution through collaboration between individuals.
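A minimal sketch of a basic global-best PSO is given below to make the collaboration between particles concrete. It is not the implementation used in this study: the inertia and acceleration constants, the swarm size, the search bounds, and the quadratic objective (a hypothetical stand-in for a cross-validation error over the logarithms of the penalty coefficient and the kernel parameter) are all assumptions for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position of each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical stand-in for a cross-validation error surface."""
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(toy_error, bounds=[(-5.0, 15.0), (-15.0, 3.0)])
```

In practice the objective would train an SVM for each candidate parameter set and return its validation error, so each evaluation is far more expensive than in this toy surface.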

##### 2.3. Support Vector Machine Online Model

###### 2.3.1. Support Vector Machine Overall Online Model

The support vector machine overall online (SVMOOL) model builds on SVM theory: it extracts input features, trains on the batch-updated training data, uses an intelligent algorithm to find the optimal parameters, and obtains a time-varying prediction function for short-term forecasting.

Because rail transit ridership is clearly periodic, the SARIMA model is used to extract input features, since it is able to capture the periodicity of a time series. A time series is generated by the SARIMA$(p,d,q)(P,D,Q)_S$ process of Box and Jenkins, as described by Williams et al. [8, 10]; Zhang et al. [30] described in detail how to extract the features via the SARIMA model.

Considering the computation time of the training data and the real-time demand of the one-step prediction, the SVMOOL model is constructed by updating the training data day by day. That is to say, the training data is updated by adding the ridership data of the most recent day, and the time-varying prediction function is then constructed. In simpler terms, assume that $x_d(t)$ denotes the ridership value at time $t$ of day $d$, $t = 1, 2, \ldots, T$, where $T$ denotes the number of data points each day. All of the ridership values after day $d$ are forecast from the training data of ridership values up to and including day $d$. According to the SVMOOL model described above, the prediction function is obtained by using the SARIMA model to extract input features from the training data and using the PSO algorithm to optimize the parameters; it then forecasts the value of every time interval $\hat{x}_{d+1}(t)$ until the real values of day $d+1$ are fully obtained. After that, the training data is updated by adding the actual ridership values of day $d+1$. A new prediction function is then constructed, by retraining on the data and updating the parameters, to forecast every value of day $d+2$, and the process repeats. This process of constructing the SVMOOL model is shown in Figure 1.
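The day-by-day update described above can be sketched as a simple retraining loop. In the sketch below, `fit_slot_mean` is a deliberately trivial, hypothetical stand-in for the actual training step (feature extraction, parameter optimization, and SVM fitting), used only so that the loop runs on its own; the function names and ridership values are assumptions for illustration.

```python
def fit_slot_mean(training_days):
    """Hypothetical stand-in for model training.

    Returns a prediction function that, for each time slot, gives the
    mean of that slot over all training days. A real implementation
    would fit an SVM here instead.
    """
    n_days = len(training_days)
    n_slots = len(training_days[0])
    means = [sum(day[t] for day in training_days) / n_days for t in range(n_slots)]
    return lambda t: means[t]


def overall_online_forecast(history_days, incoming_days, fit=fit_slot_mean):
    """Retrain once per day, forecast that day, then absorb its actual values."""
    training = [list(day) for day in history_days]
    forecasts = []
    for actual_day in incoming_days:
        predict = fit(training)                 # new prediction function for this day
        forecasts.append([predict(t) for t in range(len(actual_day))])
        training.append(list(actual_day))       # day fully observed: update training data
    return forecasts


# Made-up data: two time slots per day (illustration only).
history = [[10.0, 20.0], [14.0, 24.0]]
incoming = [[18.0, 28.0], [22.0, 32.0]]
forecasts = overall_online_forecast(history, incoming)
```

With these numbers the first forecast day is `[12.0, 22.0]` and the second is `[14.0, 24.0]`, which shows the prediction function changing after the first incoming day is added to the training data.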