Abstract

With the rapid development of long and extra-long highway tunnels, accurate and reliable methods for forecasting road tunnel traffic flow are urgently needed to improve ventilation efficiency and save energy. This paper presents a new hybrid Gaussian process regression (GPR) model optimized by particle swarm optimization (PSO) to forecast the uncertain, nonlinear, and complex traffic flow in road tunnels. In the proposed coupled approach, the PSO algorithm searches for the optimal hyperparameters of the GPR model, overcoming two drawbacks of the traditional conjugate gradient algorithm: strong dependence of the optimization result on the initial value and a tendency to fall into local optima. The GPR model, in turn, captures the internal uncertainties and dynamic features of tunnel traffic flow. The predicted results indicate that the proposed PSO-GPR algorithm predicts road tunnel traffic flow accurately with each of the kernel functions tested. The PSO-GPR with the combined kernel function (PSO-GPR-CK) achieves higher forecasting accuracy than the same algorithm with a single kernel function and is worth promoting for road tunnel traffic flow forecasting to improve ventilation efficiency.

1. Introduction

In road tunnels, vehicle traffic is a significant source of air pollutants such as carbon monoxide, nitrogen oxides, and suspended particulate matter. Therefore, sufficient airflow must be provided inside a road tunnel to keep the visibility index at the required level and the pollutant concentrations within permitted limits. To this end, tunnel ventilation systems use mechanical facilities such as jet fans, blowers, and dust collectors to provide drivers with a safe and comfortable driving environment [1]. The total design scale of a tunnel ventilation system is determined by the pollutant level or smoke concentration, which depends on the traffic flow estimated by the traffic volume survey for the tunnel in the feasibility study phase [2, 3]. The actual traffic flow during operation is often lower than the traffic volume assumed in the feasibility study, and it varies over time, so keeping all facilities of the ventilation system, such as the jet fans, in operation consumes a considerable amount of electric power. An efficient method is therefore needed to ascertain the actual traffic flow inside the tunnel and to keep the operating condition of the ventilation system consistent with the traffic flow and pollutant concentration. One means of determining the traffic flow is current time detection [4], in which a traffic counter located at the tunnel entrance records the number of cars entering the tunnel and feeds the traffic condition back to the controller, which executes the control algorithms and adjusts the operating condition of the ventilation facilities.
However, determining tunnel traffic flow by current time detection has two problems [4]: (a) the detection of traffic flow always lags behind the pollution caused by the motor vehicles passing through the tunnel; (b) the response of a ventilation system controlled by traffic flow determined in this way is delayed. If the ventilation system is operated according to traffic flow detected by such sensors, it is difficult to generate a sufficient amount of airflow immediately and to dilute noxious and dangerous contaminants to acceptable levels. To overcome these challenges, forecasting methods for tunnel traffic flow are needed to realize feedforward control of the ventilation system and thereby improve ventilation efficiency and save energy [5, 6].

Various algorithms and models for traffic flow forecasting have been proposed in the literature. Early approaches relied on traditional mathematical and physical methods, including the historical average model, time series models, state space models (e.g., the Kalman filter (KF)), dynamic-traffic-assignment-based (DTA) models, and smoothing methods [7]. Because traffic flow in highway tunnels is uncertain, nonlinear, and complex, these models cannot overcome their inherent limitations in data fitting and extrapolation, which leads to low forecasting accuracy relative to the actual traffic volume. To address some of these problems, artificial intelligence techniques have received much attention over the last decade and are considered an alternative for traffic flow forecasting [8]. Smith and Demetsky applied an artificial neural network (ANN) to short-term traffic flow forecasting [9]. An ANN model for urban traffic flow was presented in [10]. Dia applied an object-oriented neural network approach to short-term traffic forecasting [11]. Recent studies apply different ANN architectures with different input parameters measured in field studies with advanced instruments, demonstrating that ANN modeling is an effective approach for short-term traffic flow modeling [12–14]. A short-term traffic flow forecasting model based on the support vector machine (SVM) was proposed in [15]. Theja and Vanajakshi investigated the application of SVM to short-term traffic flow forecasting and compared SVM with ANN, showing that the SVM-based model is superior to the ANN-based model in prediction accuracy, convergence time, generalization capability, and optimization possibility [16].
It should be noted that SVM cannot quantify the uncertainty of its predictions, and its predicted results are unsatisfactory when the data space is elongated or asymmetric [17]. Recently, a model based on Gaussian process regression (GPR) was used to predict traffic flow, and GPR has been shown to deliver promising performance [10].

GPR, a third-generation kernel machine, works well with small samples, supports probabilistic reasoning, and generalizes well. It is also easy to program, acquires its hyperparameters adaptively, and produces forecasts with a probabilistic interpretation, in which respects it is superior to SVM. At present, the hyperparameters of GPR are obtained by maximizing the likelihood function of the training samples with the conjugate gradient algorithm. However, this traditional algorithm depends strongly on the initial value, makes the number of iteration steps difficult to determine, and falls easily into local optima. In this paper, particle swarm optimization (PSO) is used to search the hyperparameters of the GPR model automatically and so overcome these shortcomings. Hybrid approaches have been increasingly investigated in pursuit of more accurate and stable forecasts [18–20]. This paper proposes a hybrid algorithm for predicting road tunnel traffic flow that combines the PSO algorithm and the GPR model. The hybrid algorithm is examined using 11 groups of verifying datasets of traffic flow monitored in one expressway tunnel. The results show that the hybrid algorithm predicts road tunnel traffic flow well with each kernel function tested and that PSO-GPR with the combined kernel function achieves higher forecasting accuracy than with a single kernel function.

2. PSO-GPR Algorithm

2.1. Gaussian Process Regression (GPR)

From the viewpoint of probability theory, a Gaussian process is also called a normal random process. A Gaussian process is a stochastic process of which any finite number of random variables has a joint Gaussian distribution. That is, for any positive integer $n$ and an arbitrary set of random variables $x_1, \ldots, x_n$, the corresponding states $f(x_1), \ldots, f(x_n)$ have an $n$-dimensional joint Gaussian distribution. A Gaussian process is completely specified by its mean function $m(x)$ and covariance function $k(x, x')$ [21]. Following [22], a Gaussian process is defined as

$$m(x) = E[f(x)], \quad k(x, x') = E[(f(x) - m(x))(f(x') - m(x'))], \quad f(x) \sim GP(m(x), k(x, x')),$$

where $x, x' \in R^d$ are any random variables.

2.1.1. Forecasting of Gaussian Process

Consider a training set of $n$ observations, $D = \{(x_i, y_i) \mid i = 1, \ldots, n\}$, where $x_i \in R^d$ denotes an input vector and $y_i$ denotes a scalar output or target. Assuming that the difference between the output value and its real value is $\varepsilon$, the general model of the Gaussian process problem can be established as follows [23]:

$$y = f(x) + \varepsilon,$$

where $x$ is the $d$-dimensional input vector, $y$ is the output scalar, and $\varepsilon$ is an independent random variable following the Gaussian distribution $\varepsilon \sim N(0, \sigma_n^2)$.

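Within the framework of Bayesian linear regression with a Gaussian prior on the weights, the prior distribution of the observed target values $y$ defined by formula (1) is

$$y \sim N(0, K(X, X) + \sigma_n^2 I), \tag{3}$$

where $X$ collects the training inputs $x_1, \ldots, x_n$ and $I$ is the identity matrix.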

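The joint Gaussian prior distribution of the training sample outputs $y$ derived from formula (3) and the testing sample output $f_*$ is

$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim N\left(0, \begin{bmatrix} K(X, X) + \sigma_n^2 I & K(X, x_*) \\ K(x_*, X) & k(x_*, x_*) \end{bmatrix}\right), \tag{4}$$

where $K(X, X) = (k_{ij})$ is the $n \times n$ symmetric positive definite covariance matrix, any element $k_{ij}$ of which measures the correlation between $x_i$ and $x_j$; $K(X, x_*) = K(x_*, X)^T$ is the $n \times 1$ covariance matrix between the testing sample $x_*$ and all inputs of the training set; and $k(x_*, x_*)$ is the testing sample $x_*$’s own covariance.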

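Given a testing sample $x_*$ and the training set $D$, the goal of Bayesian probabilistic forecasting is to calculate $p(f_* \mid X, y, x_*)$, which can be obtained from the Bayesian posterior probability formula as follows:

$$f_* \mid X, y, x_* \sim N(\bar{f}_*, \operatorname{cov}(f_*)), \tag{5}$$

where the mean and variance of $f_*$ are

$$\bar{f}_* = K(x_*, X)[K(X, X) + \sigma_n^2 I]^{-1} y, \tag{6}$$

$$\operatorname{cov}(f_*) = k(x_*, x_*) - K(x_*, X)[K(X, X) + \sigma_n^2 I]^{-1} K(X, x_*). \tag{7}$$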
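As an illustration, the predictive mean and variance can be computed in a few lines. The following is a minimal sketch, not the authors' Matlab program: it assumes the standard GPR predictive equations (mean $k_*^T C^{-1} y$ and variance $k(x_*, x_*) - k_*^T C^{-1} k_*$ with $C = K + \sigma_n^2 I$), a squared exponential kernel, and a plain Gaussian elimination solver. The names `se_kernel`, `solve`, and `gp_predict` and the toy data are hypothetical.

```python
import math

def se_kernel(a, b, sigma_f=1.0, ell=1.0):
    """Squared exponential covariance between two input vectors (assumed kernel)."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return sigma_f ** 2 * math.exp(-d2 / (2.0 * ell ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(X, y, x_star, sigma_n=0.1, **hyp):
    """Return the predictive mean and variance at a single test input x_star."""
    n = len(X)
    # C = K(X, X) + sigma_n^2 I
    C = [[se_kernel(X[i], X[j], **hyp) + (sigma_n ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [se_kernel(x, x_star, **hyp) for x in X]
    alpha = solve(C, y)        # C^{-1} y
    w = solve(C, k_star)       # C^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = se_kernel(x_star, x_star, **hyp) - sum(k * v for k, v in zip(k_star, w))
    return mean, var

# Toy one-dimensional data, purely illustrative
X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.0, 0.5]
mean, var = gp_predict(X, y, [1.5], sigma_n=0.1, sigma_f=1.0, ell=1.0)
```

Solving two linear systems instead of forming the inverse explicitly is a common choice for numerical stability; a Cholesky factorization would serve the same purpose.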

2.1.2. Gaussian Process Training

The Gaussian process method requires the covariance function to be positive definite on any finite set of input points, that is, to be a symmetric function satisfying the Mercer condition. The covariance function is therefore equivalent to a kernel function, and formula (6) can be rewritten as

$$\bar{f}_* = \sum_{i=1}^{n} \alpha_i k(x_i, x_*), \tag{8}$$

where $\alpha = [K(X, X) + \sigma_n^2 I]^{-1} y$.

Formula (8) shows that the mean of the predicted value is a linear combination of kernel functions, which map the nonlinear data to a feature space in which the relation becomes linear. Kernel functions of Gaussian processes for machine learning generally fall into two categories: automatic relevance determination (ARD) kernel functions and isotropic (ISO) kernel functions. The two have the same form and differ in whether the dimension of the relevance determination hyperparameter $\ell$ equals that of the input variable $x$. For an ARD kernel function, each component of $\ell$ corresponds to one component of $x$; the hyperparameter $\ell$ of an ISO kernel function, however, is always a scalar, so the relevance of all inputs to the output is the same.

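Traffic flow forecasting for road tunnels is in essence a multiple regression problem. Adopting an ISO kernel function reduces the number of hyperparameters in Gaussian process regression and improves computational efficiency. A Gaussian process can use different ISO kernel (covariance) functions, and the following three are adopted in this paper [22].

(1) Squared exponential kernel function (SEiso):

$$k_{SE}(x_p, x_q) = \sigma_f^2 \exp\left(-\frac{\|x_p - x_q\|^2}{2\ell^2}\right) + \sigma_n^2 \delta_{pq}. \tag{9}$$

(2) Rational quadratic kernel function (RQiso):

$$k_{RQ}(x_p, x_q) = \sigma_f^2 \left(1 + \frac{\|x_p - x_q\|^2}{2\alpha\ell^2}\right)^{-\alpha} + \sigma_n^2 \delta_{pq}. \tag{10}$$

(3) Combined kernel function (CKiso), the linear superposition of the two kernels above:

$$k_{CK}(x_p, x_q) = \sigma_f^2 \exp\left(-\frac{\|x_p - x_q\|^2}{2\ell^2}\right) + \sigma_f^2 \left(1 + \frac{\|x_p - x_q\|^2}{2\alpha\ell^2}\right)^{-\alpha} + \sigma_n^2 \delta_{pq}. \tag{11}$$

Here $\sigma_f^2$ is the signal variance of the kernel function, which controls the degree of local relevance; $\sigma_n^2$ is the variance of the noise; $\ell$ is the relevance determination hyperparameter of the kernel function (the larger the value of $\ell$, the weaker the relevance between input and output); $\alpha$ is the shape parameter of the kernel function; and $\delta_{pq}$ is the Kronecker symbol.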
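The three kernels can be written down directly. The sketch below assumes the standard isotropic forms of the squared exponential and rational quadratic kernels and assumes that the combined kernel shares $\sigma_f$ and $\ell$ between its two terms; the source does not state this, so treat the sharing as an illustrative choice. The function names are hypothetical.

```python
import math

def sq_dist(xp, xq):
    """Squared Euclidean distance between two input vectors."""
    return sum((a - b) ** 2 for a, b in zip(xp, xq))

def k_se(xp, xq, sigma_f, ell):
    """Isotropic squared exponential term (noise is added separately in gram)."""
    return sigma_f ** 2 * math.exp(-sq_dist(xp, xq) / (2.0 * ell ** 2))

def k_rq(xp, xq, sigma_f, ell, alpha):
    """Isotropic rational quadratic term with shape parameter alpha."""
    return sigma_f ** 2 * (1.0 + sq_dist(xp, xq) / (2.0 * alpha * ell ** 2)) ** (-alpha)

def k_ck(xp, xq, sigma_f, ell, alpha):
    """Combined kernel: sum of the SE and RQ terms (shared hyperparameters assumed)."""
    return k_se(xp, xq, sigma_f, ell) + k_rq(xp, xq, sigma_f, ell, alpha)

def gram(X, kernel, sigma_n, **hyp):
    """Covariance matrix with the noise variance on the diagonal (Kronecker delta term)."""
    n = len(X)
    return [[kernel(X[p], X[q], **hyp) + (sigma_n ** 2 if p == q else 0.0)
             for q in range(n)] for p in range(n)]

# Example: combined-kernel covariance matrix for three two-dimensional inputs
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram(X, k_ck, sigma_n=0.1, sigma_f=1.0, ell=1.0, alpha=2.0)
```

As the shape parameter grows, the rational quadratic term approaches the squared exponential term with the same $\ell$, so the combined kernel can be read as mixing one fixed length scale with a spread of length scales.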

2.1.3. Optimal Hyperparameter Determination

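A Gaussian process employs a set of hyperparameters $\theta$, including the relevance determination hyperparameter $\ell$, the signal variance $\sigma_f^2$, and the noise variance $\sigma_n^2$. The hyperparameters can be optimized within the log-likelihood framework: the optimal hyperparameters are found by maximizing the log-likelihood function of the training samples with the conjugate gradient algorithm as follows [23, 24]:

$$\begin{aligned} L(\theta) &= \log p(y \mid X, \theta) = -\frac{1}{2} y^T C^{-1} y - \frac{1}{2} \log |C| - \frac{n}{2} \log 2\pi, \\ \frac{\partial L(\theta)}{\partial \theta_i} &= \frac{1}{2} \operatorname{tr}\left(\left(\alpha\alpha^T - C^{-1}\right) \frac{\partial C}{\partial \theta_i}\right), \end{aligned} \tag{12}$$

where $C = K(X, X) + \sigma_n^2 I$, $\alpha = C^{-1} y$, $p$ denotes probability, and $\operatorname{tr}$ denotes the trace of a matrix.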

The predicted value and variance of a testing sample $x_*$ are obtained by substituting the optimal hyperparameters $\theta$ found with the above formulas into (6) and (7).
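For reference, the log-likelihood itself is cheap to evaluate once the covariance matrix is factorized. The sketch below assumes the standard Gaussian log marginal likelihood, $-\frac{1}{2} y^T C^{-1} y - \frac{1}{2} \log |C| - \frac{n}{2} \log 2\pi$, and a Cholesky factorization of $C = K + \sigma_n^2 I$; the helper names are hypothetical, and the gradient used by conjugate gradient methods is omitted.

```python
import math

def cholesky(C):
    """Lower-triangular factor L with C = L L^T (C must be symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def log_marginal_likelihood(C, y):
    """Gaussian log marginal likelihood of targets y under covariance matrix C."""
    n = len(y)
    L = cholesky(C)
    # Forward substitution L z = y, so that y^T C^{-1} y = z^T z
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    quad = sum(v * v for v in z)
    # log|C| = 2 * sum(log(diag(L)))
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)

# Example with a small hand-made covariance matrix
ll = log_marginal_likelihood([[2.0, 1.0], [1.0, 2.0]], [1.0, -1.0])
```

A gradient-free optimizer such as PSO needs only this function value (or another scalar objective) for each candidate hyperparameter vector, not the trace expression for the derivative.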

2.2. Particle Swarm Optimization (PSO)

Particle swarm optimization (PSO), originally introduced by Kennedy and Eberhart, is a simple and powerful optimization technique inspired by the social behavior of bird flocking and fish schooling [25]. A PSO system simulates the knowledge evolution of a social organism, in which individuals (particles) representing candidate solutions of an optimization problem traverse a multidimensional search space to find optima or suboptima. The position of each particle is evaluated according to the objective function, and particles in a local neighborhood share memories of their “best” positions. These memories are used to adjust the particles’ own velocities and subsequent positions. The velocity and position of the particles are updated using the following equation [26]:

$$v_i^{t+1} = w v_i^t + c_1 r_1 (p_i^t - x_i^t) + c_2 r_2 (p_g^t - x_i^t), \quad x_i^{t+1} = x_i^t + v_i^{t+1}, \tag{13}$$

where $x_i^t$ is the vector that represents the position of particle $i$ in the search space at iteration $t$ and $v_i^t$ denotes the velocity of this particle. $c_1$ and $c_2$ are called acceleration coefficients; usually their values are in [1.8, 2]. $r_1$ and $r_2$ are random numbers distributed uniformly in $[0, 1]$. Vector $p_i^t$ is the best previous position of particle $i$, called the personal best position. Vector $p_g^t$ is the position of the best particle among all the particles in the population, called the global best position. For an objective function $f$ to be minimized, the personal and global best positions at the next iteration are defined as

$$p_i^{t+1} = \begin{cases} p_i^t, & f(x_i^{t+1}) \ge f(p_i^t), \\ x_i^{t+1}, & f(x_i^{t+1}) < f(p_i^t), \end{cases} \qquad p_g^{t+1} = \arg\min_{p_i^{t+1}} f(p_i^{t+1}). \tag{14}$$

$w$ is the inertia weight: a larger value favors searching over a wider range, whereas a smaller value favors fine searching in a local region. Its value is set by linear interpolation:

$$w = w_{\max} - \frac{(w_{\max} - w_{\min})\, t}{t_{\max}}, \tag{15}$$

where $w_{\max}$ and $w_{\min}$ are the maximum and minimum inertia weights, $t_{\max}$ is the maximum iteration, and $t$ is the current iteration.
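A compact version of the update loop may help make the roles of the inertia weight and the two best positions concrete. This is a minimal sketch of a global-best PSO for minimization with a linearly decreasing inertia weight; it is not the authors' program. The function name `pso_minimize`, the asynchronous update of the global best, and the clamping of positions to the search bounds are assumptions added for illustration. The default parameter values mirror those reported in Section 3.3.

```python
import random

def pso_minimize(f, bounds, n_particles=40, t_max=500, c1=2.0, c2=2.0,
                 w_max=0.9, w_min=0.3, seed=0):
    """Minimize f over a box given as a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]            # personal best positions
    p_val = [f(xi) for xi in x]             # personal best objective values
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p_best[g][:], p_val[g]  # global best position and value
    for t in range(t_max):
        # Linearly decreasing inertia weight
        w = w_max - (w_max - w_min) * t / t_max
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Clamping to the bounds is an added safeguard, not part of the basic update
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = f(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
                if val < g_val:
                    g_best, g_val = x[i][:], val
    return g_best, g_val

# Example: minimize a two-dimensional sphere function
best, best_val = pso_minimize(lambda p: sum(c * c for c in p),
                              [(-5.0, 5.0), (-5.0, 5.0)], t_max=100)
```

Fixing the seed makes a run repeatable, which is convenient when comparing parameter settings.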

2.3. The Hybrid PSO-GPR Algorithm

The Gaussian process algorithm traditionally determines the optimal hyperparameters by (12) [27]. The conjugate gradient algorithm used with (12) depends strongly on the initial value, makes the number of iteration steps difficult to determine, and falls easily into local optima during optimization, so its performance in practice is not ideal. The main strengths of PSO are that it is easy to program, has few parameters, is simple to implement, and gives particles a memory; in addition, the best particle transfers information to the other particles in one direction, and this information-sharing mechanism helps the algorithm converge quickly to the global optimal solution. To overcome the above defects of the conjugate gradient algorithm, the PSO algorithm is used to search for the optimal hyperparameters during sample training, which yields the hybrid PSO-GPR algorithm. The flow chart of the PSO-GPR coupled algorithm is shown in Figure 1.

The hybrid algorithm is implemented in the following steps:
(1) Divide the training samples (a range of rows in Table 1) into two parts: one part is the learning samples for the PSO-GPR model, and the remaining part is the testing samples used to examine the generalization ability of the PSO-GPR model.
(2) Initialize the parameters of the PSO algorithm, including the population size of the initial particle swarm, the iteration number, and the random initial velocity and position of each particle, where each particle vector represents a GPR model. Set the iteration counter to its initial value.
(3) Train the GPR model on the learning samples for each particle; then forecast the values of the testing samples.
(4) Calculate the fitness value of each particle from the predicted values $\hat{y}_i$ of the testing samples given by the GPR model and the corresponding sample values $y_i$, $i = 1, \ldots, m$, where $m$ is the number of testing samples.
(5) Compare the fitness value of each particle calculated in the previous step with the particle's individual optimal solution from the iteration history; if the new fitness value is smaller, replace the individual optimal solution with it and substitute the new particle for the previous one.
(6) Compare each individual optimal solution with the global optimal solution; if the individual optimal solution is smaller, replace the global optimal solution with it and save the current state of the particles.
(7) Check whether the preset number of iteration steps has been reached; if so, end the program, return the particle with the current minimum fitness value as the optimal solution, and decode it to acquire the hyperparameters. Otherwise, start a new iteration.
(8) Update the position and velocity of each particle using (13)~(15) to generate new particles, increment the iteration counter, and return to the training step to start a new iteration.
(9) Repeat the training, evaluation, and update steps until the maximum iteration step is reached. End the program and return the optimal hyperparameters of the GPR model.
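The steps above can be sketched end to end on synthetic data. This is an illustrative sketch under several assumptions, not the authors' Matlab implementation: the fitness is taken to be the mean absolute error on the testing samples (the exact fitness formula is not reproduced here), only the squared exponential kernel is used, each particle is the vector $(\sigma_f, \ell, \sigma_n)$, and positions are clamped to the search bounds. All function names, the bounds, and the sine-wave data are hypothetical.

```python
import math
import random

def se(a, b, sigma_f, ell):
    """Squared exponential kernel (assumed choice for this sketch)."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return sigma_f ** 2 * math.exp(-d2 / (2.0 * ell ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gpr_mean(theta, X, y, X_new):
    """GPR predictive mean at each input in X_new for hyperparameters theta."""
    sigma_f, ell, sigma_n = theta
    n = len(X)
    C = [[se(X[i], X[j], sigma_f, ell) + (sigma_n ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(C, y)
    return [sum(se(x, xs, sigma_f, ell) * a for x, a in zip(X, alpha)) for xs in X_new]

def fitness(theta, learn, test):
    """Mean absolute error on the testing samples (assumed form of the fitness)."""
    pred = gpr_mean(theta, learn[0], learn[1], test[0])
    return sum(abs(p - t) for p, t in zip(pred, test[1])) / len(pred)

def pso_gpr(learn, test, bounds, n_particles=20, t_max=50, c1=2.0, c2=2.0,
            w_max=0.9, w_min=0.3, seed=0):
    """Search GPR hyperparameters with PSO; returns (best theta, best fitness)."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]
    p_fit = [fitness(xi, learn, test) for xi in x]
    g = min(range(n_particles), key=lambda i: p_fit[i])
    g_best, g_fit = p_best[g][:], p_fit[g]
    for t in range(t_max):
        w = w_max - (w_max - w_min) * t / t_max   # linearly decreasing inertia weight
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (p_best[i][d] - x[i][d])
                           + c2 * rng.random() * (g_best[d] - x[i][d]))
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)  # keep hyperparameters positive
            fit = fitness(x[i], learn, test)
            if fit < p_fit[i]:                    # update individual best
                p_best[i], p_fit[i] = x[i][:], fit
                if fit < g_fit:                   # update global best
                    g_best, g_fit = x[i][:], fit
    return g_best, g_fit

# Synthetic stand-in for the learning and testing samples
learn = ([[float(i)] for i in range(7)], [math.sin(i) for i in range(7)])
test = ([[i + 0.5] for i in range(6)], [math.sin(i + 0.5) for i in range(6)])
bounds = [(0.1, 5.0), (0.1, 5.0), (0.01, 1.0)]    # sigma_f, ell, sigma_n
theta, best_fit = pso_gpr(learn, test, bounds)
```

Keeping the lower bounds strictly positive avoids a zero length scale or zero noise, which would make the kernel undefined or the covariance matrix poorly conditioned.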

3. Traffic Flow Forecasting for Road Tunnel Using the Hybrid Algorithm

3.1. The Main Factors of Traffic Flow for Road Tunnel

The factors that influence road traffic are numerous and complex, for example, the regional economic development level and the grade and quality of the road. For a road tunnel, these macro factors are comparatively stable and can therefore be left out of short-term traffic flow forecasting for the time being (as operating time increases and relevant statistical data accumulate, they will also be taken into account in long-term traffic flow forecasting for tunnels). The comparatively short-term traffic flow forecasting in this paper mainly considers day category, weather, and season.

Weather has an obvious impact on traffic flow. Generally speaking, traffic flow is highest on sunny days, followed by rainy days, and lowest on foggy and snowy days. Day category, which comprises weekdays, weekends, and holidays, is also an important factor. In general, traffic flow is comparatively large on weekdays and declines on weekends because official and private car travel decreases. Traffic flow rebounds on holidays because of the rapid growth of private car travel and may grow explosively, especially during the Spring Festival and Golden Week (a long holiday in China). The season (namely, the temperature) is a factor that cannot be ignored. People’s activity decreases significantly in summer and winter, so traffic flow in these two seasons is lower than in the other seasons.

In theory, the hybrid PSO-GPR algorithm based on the historical traffic flow of a tunnel can be used to forecast long-term or short-term traffic flow. The required prediction period and headway depend on the time characteristics of the samples. If the prediction period (sampling period) is too short, the volatility of the acquired data will be high. If it is too long, the operating time of the ventilation facilities makes it unfavorable for saving electrical energy [28]. It is a rule in traffic flow analysis for road tunnels that at least one air exchange (about 15~20 minutes according to the guidelines for design of ventilation of highway tunnels in China) should be completed within the prediction period [28]. In addition, because the ventilation design of highway tunnels is based on hourly traffic volume, one hour is chosen as the prediction period. Traffic flow in a road tunnel at the same time of day shows a similar trend from day to day [29]. To forecast the tunnel traffic flow from 7:00 to 8:00 a.m. on a future day, traffic flow data for the same period on past and recent days, obtained by traffic monitoring of an expressway tunnel during its operation period, are listed in Table 1. Because the time at which the traffic flow of this tunnel is to be predicted is uncertain to some extent, it is difficult to specify the headway. Given the nature of the samples for the PSO-GPR algorithm in Table 1, the headway is at least 24 hours; however, since the purpose of forecasting tunnel traffic flow is to switch the ventilation equipment on or off in advance, the headway should, in the authors’ view, be the time needed to operate and realize feedforward control of the ventilation system inside the tunnel.

3.2. Training Samples of Traffic Flow Prediction Based on GPR Model

To provide the training samples (a range of rows in Table 1), the qualitative influential factors of tunnel traffic flow listed in Table 1 must be quantified. The quantification standard for day category, weather, and season is shown in Table 2.

Table 1 can be easily converted into quantitative form using the quantification standard for the three influential factors shown in Table 2. The quantitative form of Table 1 is then divided into three parts: the learning samples for the hybrid PSO-GPR algorithm shown in Table 3, the testing samples for the PSO-GPR algorithm shown in Table 4, and the verifying samples in Table 6.

3.3. Traffic Flow Forecasting for Road Tunnel Based on PSO-GPR Algorithm

The program for the hybrid PSO-GPR algorithm is written in Matlab according to the flow chart of the coupled model in Figure 1. The population size of the PSO algorithm is 40. The maximum and minimum inertia weights ($w_{\max}$ and $w_{\min}$) are 0.9 and 0.3, respectively. The acceleration coefficients ($c_1$ and $c_2$) are both 2.0. The maximum iteration step is 500.

The searched regions of these parameters for the GPR model such as , , , are, respectively, , , , and . The optimal hyperparameters of GPR model searched by PSO algorithm are listed in Table 5 after network training.

The evolution of the fitness value of the PSO algorithm with iteration step is shown in Figure 2. The decrease of the fitness value in Figure 2 reflects the optimization of the GPR hyperparameters. The closer the fitness value is to zero, the higher the regression precision, and the shorter the oscillating section of the curve, the stronger the searching ability of the algorithm. Figure 2 shows that the fitness value drops in a fluctuating manner before iteration step 100 and nearly converges to zero after iteration step 250. It is therefore reasonable to conclude from Figure 2 that high-precision optimal hyperparameters (shown in Table 5) have been found by the time the program finishes at iteration step 500.

To examine the predictive performance of the coupled PSO-GPR model with different kernel functions, the verifying samples in Table 6 are predicted using the optimal GPR hyperparameters in Table 5; the results are shown in Table 7 and Figure 3. All the monitoring data on traffic flow come from the same tunnel in the same period.

Table 7 and Figure 3 show that the results predicted by the hybrid PSO-GPR algorithm are acceptable with every kernel function tested. The PSO-GPR algorithm can therefore satisfy engineering requirements such as feedforward ventilation control and traffic administration in road tunnels, and it provides a favorable method for tunnel traffic flow forecasting.

Statistical outcomes of the relative error (RE) of predicted results for verifying samples are presented in Table 8.

Table 8 shows that the maximum relative error (5.47%) of the results for the verifying samples predicted by PSO-GPR with the combined kernel function (CKiso) is lower than that of PSO-GPR with the squared exponential kernel function (SEiso) or the rational quadratic kernel function (RQiso) (6.72% and 7.46%, respectively). According to Tables 7 and 8, PSO-GPR with CKiso improves on the predictive performance of PSO-GPR with a single kernel function.

3.4. Further Analysis of Forecasting Performance

To further analyze and evaluate the predictive performance of the proposed hybrid algorithm, three statistical indices are used to assess the forecasting results: the MAE (mean absolute error), MRE (mean relative error), and RMSE (root mean square error), for which small values indicate high forecast performance. These indices are defined as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |\hat{y}_i - y_i|, \quad \mathrm{MRE} = \frac{1}{n}\sum_{i=1}^{n} \frac{|\hat{y}_i - y_i|}{y_i} \times 100\%, \quad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2},$$

where $\hat{y}_i$ is the value of traffic flow predicted for verifying sample $i$ by the coupled PSO-GPR algorithm with a given kernel function, $y_i$ is the actual value for the corresponding sample, and $n$ is the number of verifying samples. The MAE reveals how similar the predicted values are to the actual values, whereas the RMSE measures the overall deviation between the predicted values and the actual values. The MRE is a unit-free measure of accuracy for the predicted traffic flow. Table 9 shows the evaluation results obtained from the hybrid approach with different kernel functions for the traffic flow of the verifying samples.
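The three indices are simple to compute from paired predictions and observations. The sketch below assumes the usual definitions (mean absolute error, mean relative error expressed in percent, and root mean square error); the function names and the sample numbers are hypothetical and are not taken from the paper's tables.

```python
import math

def mae(pred, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def mre(pred, actual):
    """Mean relative error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(p - a) / abs(a) for p, a in zip(pred, actual)) / len(actual)

def rmse(pred, actual):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

# Hypothetical hourly traffic volumes (vehicles per hour), for illustration only
actual = [100.0, 100.0]
pred = [110.0, 90.0]
scores = (mae(pred, actual), mre(pred, actual), rmse(pred, actual))
```

Because the RMSE squares each deviation, a single large error raises it more than it raises the MAE, which is why the two are usually reported together.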

Table 9 shows that the proposed hybrid PSO-GPR algorithm with the combined kernel function (PSO-GPR-CK) outperforms the algorithm with the other kernel functions on all three evaluation indices (MAE, MRE, and RMSE). As shown in Table 9, only three of the relative errors of PSO-GPR-CK exceed 3%.

More detailed analyses were performed. The proposed hybrid algorithm with the combined kernel function performed better than with the other kernel functions for different forecasting horizons. For example, PSO-GPR-CK achieves an RMSE of 2.41, compared with 3.88 for PSO-GPR-SE and 4.08 for PSO-GPR-RQ. Its MRE is 1.87%, whereas PSO-GPR-SE and PSO-GPR-RQ give higher MRE values of 3.37% and 3.52%, respectively.

The foregoing analysis demonstrates that the proposed hybrid PSO-GPR algorithm with the combined kernel function formed by the linear superposition of squared exponential kernel function (SEiso) and rational quadratic kernel function (RQiso) is effective in boosting the forecasting accuracy. Therefore it is worth promoting in the field of traffic flow forecasting for road tunnel.

4. Conclusions

Traffic flow forecasting with a high degree of accuracy is necessary in road tunnels with mechanical ventilation systems. For this purpose, a coupled PSO-GPR algorithm is proposed in this paper. The major conclusions are as follows:
(1) The particle swarm optimization (PSO) algorithm is simple to program and has few parameters. Substituting the PSO algorithm for the traditional conjugate gradient algorithm to search the optimal hyperparameters of the GPR is more effective for optimizing the GPR parameters and improves the generalization ability.
(2) The GPR model, with its many advantages, is a powerful algorithm for predicting road tunnel traffic flow. Traffic flow in highway tunnels is uncertain, nonlinear, and complex, which makes it difficult to forecast with a single model. This paper proposes a hybrid forecasting algorithm (called PSO-GPR) that integrates the merits of PSO and GPR and effectively improves the prediction accuracy of the single GPR model.
(3) The effectiveness of the developed PSO-GPR algorithm with different kernel functions is demonstrated with 11 groups of verifying datasets. The forecasting results show that the proposed hybrid algorithm, based on the GPR model integrated with the PSO algorithm, yields good predictions.
(4) CKiso, formed by the linear superposition of SEiso and RQiso, significantly improves on the generalization performance of a single kernel function. The PSO-GPR-CK outperforms the PSO-GPR-SE and PSO-GPR-RQ on the three evaluation indices. The PSO-GPR-CK can be recommended as an approach to real-time traffic flow forecasting for road tunnels and feedforward control of road tunnel ventilation systems.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is financially supported by Provincial Key Technological R&D Program of Henan of China (Grant no. 152102210318), National Natural Science Foundation of China (Grant no. 51778215), and Doctoral Foundation of Henan Polytechnic University (Grant no. B2012-016).