Abstract

This paper aims to create a prediction model for car body vibration acceleration that is reliable, effective, and close to real-world conditions. To this end, a large amount of data on railway parameters was collected by multiple sensors, and different correlation coefficients were used to screen out the parameters closely correlated with car body vibration acceleration. Taking the selected parameters and previous car body vibration acceleration as the inputs, a prediction model for car body vibration acceleration was established based on several training algorithms and neural network structures. Then, the model was successfully applied to predict the car body vibration acceleration of test datasets on different segments of the same railway. The results show that the proposed method overcomes the complexity and uncertainty of the multiparameter coupling analysis in traditional theoretical models. The research findings show strong potential for application.

1. Introduction

Passenger comfort is an important indicator of the operation quality of passenger trains. Previous studies [1, 2] have shown that passenger comfort can be estimated indirectly by parameters like vibration acceleration of the car body. Based on the estimated passenger comfort, it is possible to identify the warning signals or system statuses needed to ensure the smooth operation of the train.

Much research has been done to forecast the vibration acceleration of trains. For instance, Shafiullah et al. [3] predicted the forward and backward vertical acceleration conditions with popular regression algorithms. Zhai et al. [4] created a comprehensive train-track dynamics model to predict the ground vibrations of high-speed trains. Inspired by the dynamics model, Czop et al. [5] proposed a rail irregularity detection method based on the bearing box acceleration during train operation and successfully applied the method to recognize the rail irregularities of a typical railway in Poland. Qian et al. [6] established a model to predict the vibration acceleration of high-speed trains based on a nonlinear autoregressive neural network with exogenous inputs (NARX NN) and a multibody dynamic model and verified the prediction accuracy of the model through experimental analysis.

In addition, some scholars have attempted to infer important parameters of railways from the vibration acceleration of the car body. For example, Connolly et al. [7] assessed the effects of vibration acceleration on passenger comfort and track performance. Koo et al. [8] put forward theoretical derailment coefficients for single wheel pairs, considering the impacts from lateral vibration acceleration and gyroscopic factors as well as flange angle, friction coefficient, wheel unloading, wheel radius, gauge, and bearing position. Navik et al. [9] developed a new sensor system that captures the dynamic behaviour of high-speed rail with several sensors placed at an interval of 150 m and predicted the maximum vertical displacement, train speed, dynamic behaviour, and quantified modal parameters from vibration acceleration time series.

In general, previous research into vibration acceleration has concentrated on traditional multibody dynamics modelling, and the research results were mainly derived through simulation. In actual operation, the train is faced with a complex environment and uncertain track conditions. Thus, there is always some gap between the simulated state and the actual state of railway track and train operation. This calls for a new model that can accurately reflect the actual conditions of the train and the track.

In light of the above, this paper aims to propose a prediction model for car body vibration acceleration that is reliable, effective, and close to real-world conditions. To this end, a large amount of data on railway parameters was collected by multiple sensors, and different correlation coefficients were used to screen out the parameters closely correlated with car body vibration acceleration. Taking the selected parameters and previous car body vibration acceleration as the inputs, a prediction model for car body vibration acceleration was established based on several training algorithms and neural network structures. Then, the model was successfully applied to predict the car body vibration acceleration of test datasets on different segments of the same railway. The results show that the proposed method overcomes the complexity and uncertainty of the multiparameter coupling analysis in traditional theoretical models. The research findings show strong potential for application.

The remainder of this paper is organized as follows: Section 2 introduces the data preprocessing and feature selection methods; Section 3 describes the structures of the neural networks and several popular training algorithms; Section 4 verifies the effect of the proposed model on different datasets, and the results under different structures are discussed and compared; Section 5 wraps up this paper with some meaningful conclusions.

2. Data Preprocessing and Feature Selection

2.1. Data Preprocessing

The research data are a collection of useful data from actual railways. The sensors were subjected to noise reduction and antijamming processing, aiming to enhance the readability and usability of the collected data. In addition, the data underwent a multistep preprocessing.

Firstly, time synchronization was performed on the huge amount of data captured by multiple sensors to remove time points with missing or abnormal values and eliminate the variables of constant values. In this way, the data containing useful information were screened out.

Secondly, the modelling variables were determined, excluding those rarely used, irrelevant to mechanics, or difficult to measure in actual conditions.

Thirdly, the influencing factors of the relevant variables in the transfer part were minimized, e.g., the angular acceleration at different positions of car body, as the prediction variables were expected to consider such parameters as train structure, track state, and operation state. Note that the minimization only treats the transfer process as a black box, rather than overlooking the impacts of the influencing factors. The treatment simplifies the modelling process.

Finally, the preprocessed data were normalized for further use.
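
The preprocessing steps above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' code: the function name `preprocess`, the use of `None` to mark a missing value, and min-max scaling to [0, 1] are all assumptions (the paper does not state which normalization was applied), and the sample numbers are made up.

```python
from typing import List, Optional, Sequence


def preprocess(rows: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Drop incomplete time points, drop constant variables, min-max normalize."""
    # Step 1: remove time points (rows) that contain a missing value.
    complete = [list(r) for r in rows if all(v is not None for v in r)]
    if not complete:
        return []
    # Step 2: keep only the variables (columns) that actually vary.
    n_cols = len(complete[0])
    keep = [j for j in range(n_cols)
            if min(r[j] for r in complete) != max(r[j] for r in complete)]
    # Step 3: min-max normalization of each remaining column (assumed scheme).
    lows = [min(r[j] for r in complete) for j in keep]
    highs = [max(r[j] for r in complete) for j in keep]
    return [[(r[j] - lo) / (hi - lo) for j, lo, hi in zip(keep, lows, highs)]
            for r in complete]


# Hypothetical readings: three variables sampled at four time points.
sample = [
    [1.0, 5.0, 10.0],
    [2.0, 5.0, None],   # missing value: this time point is dropped
    [3.0, 5.0, 30.0],
    [2.0, 5.0, 20.0],
]
cleaned = preprocess(sample)
```

The second column is constant and is removed, so `cleaned` has two columns scaled to the unit interval.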

2.2. Variable Selection
2.2.1. Linear Correlation

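The linear correlation of two random variables can be measured by Pearson’s correlation coefficient (P). If each variable has $N$ scalar observations, then Pearson’s correlation coefficient can be defined as [10–12]

$$P(A, B) = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{A_i - \mu_A}{\sigma_A}\right)\left(\frac{B_i - \mu_B}{\sigma_B}\right),$$

where $\mu_A$ and $\sigma_A$ are the mean and standard deviation of $A$, respectively, and $\mu_B$ and $\sigma_B$ are the mean and standard deviation of $B$, respectively.

Pearson’s correlation coefficient can also be described based on the covariance of $A$ and $B$ as follows:

$$P(A, B) = \frac{\operatorname{cov}(A, B)}{\sigma_A \sigma_B}.$$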
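
The covariance form of Pearson's coefficient can be checked numerically. The sketch below is illustrative Python rather than the authors' implementation; the names `pearson`, `track`, and `accel` are hypothetical, the numbers are made up, and the sample (N − 1) normalization is an assumption.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / (n - 1))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / (n - 1))
    return cov / (std_a * std_b)


track = [0.1, 0.4, 0.2, 0.8, 0.5]          # made-up predictor values
accel = [2 * x + 1 for x in track]         # exactly linear in the predictor
p_linear = pearson(track, accel)           # close to +1
p_inverse = pearson(track, [-x for x in track])  # close to -1
```

A perfectly linear relationship gives a coefficient of magnitude one, with the sign indicating the direction of the relationship.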

2.2.2. Nonlinear Correlation

The linear correlation coefficient cannot fully reflect the relationship between variables, owing to the possible existence of nonlinear correlations. Thus, Spearman’s rank correlation coefficient (S) [13] was employed to analyze the nonlinear correlations between variables. This coefficient can be defined as Pearson’s correlation coefficient applied to the ranks of the observations:

$$S(A, B) = \frac{\operatorname{cov}\big(R(A), R(B)\big)}{\sigma_{R(A)}\,\sigma_{R(B)}},$$

where $R(A)$ and $R(B)$ are the rank variables of $A$ and $B$. By this definition, $A$ and $B$ are fully correlated as long as they share a monotonic functional relationship. This is different from Pearson’s correlation, in which only linearly related variables are considered fully correlated.

Then, the correlation coefficient matrix of two random variables is the matrix of correlation coefficients for each pairwise variable combination:

$$\rho = \begin{pmatrix} \rho(A, A) & \rho(A, B) \\ \rho(B, A) & \rho(B, B) \end{pmatrix}.$$

Since $A$ and $B$ are always directly correlated with themselves, the diagonal entries are 1, that is,

$$\rho = \begin{pmatrix} 1 & \rho(A, B) \\ \rho(B, A) & 1 \end{pmatrix}.$$

Through the above calculation, the variables with the greater values under the two types of correlation coefficients, $P$ and $S$, can be selected as predictor variables.
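
The difference between the two coefficients shows up on a monotonic but nonlinear relationship. The following Python sketch is an illustration only: it assumes Spearman's coefficient is computed as Pearson's coefficient on average ranks, and the helper names `ranks`, `pearson`, and `spearman` are hypothetical.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson's coefficient; the normalization factors cancel in the ratio."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman's coefficient as Pearson's coefficient of the rank variables."""
    return pearson(ranks(a), ranks(b))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]      # monotonic but nonlinear in x
s_xy = spearman(x, y)        # equals 1: the ranks coincide
p_xy = pearson(x, y)         # below 1: the relationship is not linear
```

This is why screening with both coefficients can retain predictors that a purely linear measure would rank lower.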

3. Method of Prediction Model

3.1. Training Algorithms

The training algorithms pursue the minimum gap between the predicted value and the measured value. In most cases, the minimization is achieved by adjusting the weights of each layer in the neural networks. Below is a brief introduction to the training algorithms adopted for our research.

(1) Broyden–Fletcher–Goldfarb–Shanno (BFGS) Quasi-Newton Backpropagation (BQ). The BQ is an alternative to the conjugate gradient methods for fast optimization. The basic formula [14] is as follows:

$$x_{k+1} = x_k - A_k^{-1} g_k,$$

where $x_{k+1}$ is the vector of weights and biases at step $k+1$; $x_k$ is the vector of weights and biases at step $k$; $A_k$ is the Hessian matrix (second derivatives) of the performance index at the current weights and biases; and $g_k$ is the gradient at step $k$.

(2) Conjugate Gradient Backpropagation with Powell-Beale Restarts (CGB). For all conjugate gradient algorithms, the search direction is periodically reset to the negative of the gradient. In the CGB, the reset happens whenever there is too little orthogonality left between the current and the previous gradients. This condition is tested with the following inequality [15]:

$$\left| g_{k-1}^{T} g_k \right| \geq 0.2 \left\| g_k \right\|^{2}.$$
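
The restart test can be written as a small predicate. The sketch below assumes the commonly used form of the Powell-Beale condition, in which the absolute inner product of successive gradients is compared with 0.2 times the squared norm of the current gradient; the function name `needs_restart` and the `threshold` parameter are hypothetical.

```python
from typing import Sequence


def needs_restart(g_prev: Sequence[float], g_curr: Sequence[float],
                  threshold: float = 0.2) -> bool:
    """Return True when successive gradients have lost too much orthogonality."""
    inner = sum(a * b for a, b in zip(g_prev, g_curr))
    norm_sq = sum(b * b for b in g_curr)
    return abs(inner) >= threshold * norm_sq


orthogonal = needs_restart([1.0, 0.0], [0.0, 1.0])  # gradients still orthogonal
parallel = needs_restart([1.0, 0.0], [1.0, 0.0])    # orthogonality fully lost
```

When the predicate is true, the search direction would be reset to the negative gradient.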

(3) Conjugate Gradient Backpropagation with Fletcher-Reeves Updates (CGF). The optimal distance to move along the current search direction, the new search direction, and the conjugate weight adjustment coefficient are, respectively, computed by the following equations [16]:

$$x_{k+1} = x_k + \alpha_k p_k,$$

$$p_k = -g_k + \beta_k p_{k-1},$$

$$\beta_k = \frac{g_k^{T} g_k}{g_{k-1}^{T} g_{k-1}},$$

where $\alpha_k$ is a variable chosen to minimize the performance along the current search direction; $p_k$ is the next search direction, which is conjugate to the previous search direction; and $\beta_k$ is a constant that adjusts the conjugate weights. Most conjugate gradient algorithms differ only in the calculation of the constant $\beta_k$.

(4) Conjugate Gradient Backpropagation with Polak-Ribiére Updates (CGP). For this algorithm, the search direction in each iteration is the same as the new search direction in the CGF algorithm [16]. The constant $\beta_k$ can be obtained by

$$\beta_k = \frac{\Delta g_{k-1}^{T} g_k}{g_{k-1}^{T} g_{k-1}},$$

where $\Delta g_{k-1}^{T}$ is the transpose of the change in the gradient from the previous iteration.
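
To make the role of the conjugate weight constant concrete, the sketch below runs conjugate gradient descent on a small quadratic problem with both update rules. It is an illustration, not network training code: the Fletcher-Reeves and Polak-Ribiére ratios are written in their standard forms, the step size comes from an exact line search (possible only because the objective is quadratic), and all function names and the 2 × 2 test problem are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def matvec(m: List[Vector], v: Vector) -> Vector:
    return [dot(row, v) for row in m]


def beta_fr(g_new: Vector, g_old: Vector) -> float:
    """Fletcher-Reeves: ratio of squared gradient norms."""
    return dot(g_new, g_new) / dot(g_old, g_old)


def beta_pr(g_new: Vector, g_old: Vector) -> float:
    """Polak-Ribiere: change in gradient times new gradient, over old squared norm."""
    delta = [a - b for a, b in zip(g_new, g_old)]
    return dot(delta, g_new) / dot(g_old, g_old)


def conjugate_gradient(a: List[Vector], b: Vector, x: Vector,
                       beta_rule: Callable[[Vector, Vector], float],
                       steps: int) -> Vector:
    """Minimize 0.5 * x^T A x - b^T x, whose gradient is A x - b."""
    g = [gi - bi for gi, bi in zip(matvec(a, x), b)]
    p = [-gi for gi in g]                       # first direction: steepest descent
    for _ in range(steps):
        if dot(g, g) < 1e-20:                   # already at the minimum
            break
        ap = matvec(a, p)
        alpha = -dot(g, p) / dot(p, ap)         # exact line search on a quadratic
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [gi - bi for gi, bi in zip(matvec(a, x), b)]
        beta = beta_rule(g_new, g)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
rhs = [1.0, 2.0]
x_fr = conjugate_gradient(A, rhs, [0.0, 0.0], beta_fr, steps=2)
x_pr = conjugate_gradient(A, rhs, [0.0, 0.0], beta_pr, steps=2)
```

On a quadratic with exact line searches the two rules coincide and reach the minimizer (1/11, 7/11) in two steps; they differ on the non-quadratic error surfaces met in network training.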

(5) One-Step Secant Backpropagation (OSS). The OSS algorithm is an approximate secant method with relatively small storage and computing load [17]. By this method, the weights can be adjusted in the following manner:

$$dX = -g_X + A_c X_{\mathrm{step}} + B_c\, dg_X,$$

where $g_X$ is the gradient and $dg_X$ is the change in the gradient; $A_c$ is the weight adjustment coefficient; $X_{\mathrm{step}}$ is the change in the weights of the previous iteration; and $B_c$ is the gradient adjustment coefficient.

(6) Resilient Backpropagation (RB). The RB is a local learning algorithm that is easy to implement and compute. In this algorithm, the weights are updated according to the behaviour of the sign sequence of the partial derivatives in each dimension of the weight space [18]:

$$\Delta w_{ij}(t) = -\epsilon \frac{\partial E}{\partial w_{ij}}(t) + \mu\, \Delta w_{ij}(t-1),$$

where $\mu$ is a parameter to scale the influence of the previous iteration and $\epsilon$ is the learning rate.

(7) Scaled Conjugate Gradient Backpropagation (SCG). The SCG is a step-size scaling algorithm [19, 20] created to expand the applicable scope of conjugate gradient (CG) algorithm from the functions with positive definite Hessian matrices. The SCG works faster than other second-order algorithms, as it prevents the time-consuming search in each iteration.

(8) Levenberg-Marquardt Backpropagation (LM). The LM algorithm uses the approximate Hessian matrix in the following Newton-like update [21, 22]:

$$H = J^{T} J, \qquad x_{k+1} = x_k - \left[ J^{T} J + \mu I \right]^{-1} J^{T} e,$$

where $H$ is the (approximate) Hessian matrix; $J$ is the Jacobian matrix containing the first-order derivatives of the network errors with respect to the weights and biases; and $e$ is a vector of network errors. When the scalar $\mu$ is zero, the LM algorithm is essentially Newton’s method using the approximate Hessian matrix.
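
The effect of the damping scalar can be seen on a problem small enough to solve by hand. The sketch below applies one LM update, in the standard form x_new = x − (JᵀJ + μI)⁻¹ Jᵀe, to a two-parameter straight-line fit instead of a neural network; the function name `lm_step`, the line model, and the data are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def lm_step(xs: Sequence[float], ys: Sequence[float],
            w: float, c: float, mu: float) -> Tuple[float, float]:
    """One Levenberg-Marquardt update for the model y = w * x + c."""
    # Network-style errors e_i = prediction - target; Jacobian rows are [x_i, 1].
    e = [w * x + c - y for x, y in zip(xs, ys)]
    # Entries of the symmetric 2x2 matrix J^T J + mu * I.
    a11 = sum(x * x for x in xs) + mu
    a12 = sum(xs)
    a22 = float(len(xs)) + mu
    # Entries of the vector J^T e.
    g1 = sum(x * ei for x, ei in zip(xs, e))
    g2 = sum(e)
    # Solve the 2x2 system with the closed-form inverse.
    det = a11 * a22 - a12 * a12
    dw = (a22 * g1 - a12 * g2) / det
    dc = (a11 * g2 - a12 * g1) / det
    return w - dw, c - dc


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                    # generated by y = 2x + 1
w_newton, c_newton = lm_step(xs, ys, 0.0, 0.0, mu=0.0)
w_damped, c_damped = lm_step(xs, ys, 0.0, 0.0, mu=1000.0)
```

With μ = 0 the step solves this linear least-squares problem in one update, while a large μ shrinks the step towards a short move along the negative gradient.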

(9) Bayesian Regularization Backpropagation (BR). Besides reducing the sum of squared errors, $E_D$, the regularization adds an additional term. Thus, the objective function can be expressed as

$$F = \beta E_D + \alpha E_W,$$

where $E_W$ is the sum of squares of the network weights and $\alpha$ and $\beta$ are two parameters of the objective function. Under the Bayesian framework [23, 24], this method can optimize the regularization parameters.
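
A regularized objective of this kind is a weighted sum of an error term and a weight-size term. The sketch below assumes the usual form F = β·E_D + α·E_W with fixed, hand-picked parameters; it does not perform the Bayesian re-estimation of α and β, and the function name and numbers are hypothetical.

```python
from typing import Sequence


def regularized_objective(errors: Sequence[float], weights: Sequence[float],
                          alpha: float, beta: float) -> float:
    """Weighted sum of squared errors (E_D) and squared weights (E_W)."""
    e_d = sum(e * e for e in errors)
    e_w = sum(w * w for w in weights)
    return beta * e_d + alpha * e_w


# Made-up errors and weights: E_D = 0.05 and E_W = 5.0.
objective = regularized_objective([0.1, -0.2], [1.0, -2.0], alpha=0.01, beta=1.0)
```

Raising α relative to β penalizes large weights more heavily, which favours smoother network responses.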

With different weight adjustment mechanisms, the above algorithms differ in training accuracy, storage, and running time. Their performance will be compared in the following section. In addition, the functions of the output layer and the hidden layer are as follows.

3.2. Structures of Neural Networks
3.2.1. Feedforward Neural Networks (FFNN)

The FFNN is one of the most popular neural networks. The networks have multiple layers, including an input layer, several hidden layers, and an output layer. Layers are connected to each other by nodes or neurons. The input layer is connected to the inputs, while the output layer exports the predicted results. Each hidden layer treats the output of the previous layer as its input.
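
The layer-by-layer computation can be illustrated with one hidden layer. In this sketch the hyperbolic tangent in the hidden layer and the linear output layer are assumptions (a common default, not confirmed by the text), and the function name `forward` and the weights are hypothetical.

```python
import math
from typing import List, Sequence


def forward(x: Sequence[float],
            w_hidden: Sequence[Sequence[float]], b_hidden: Sequence[float],
            w_out: Sequence[Sequence[float]], b_out: Sequence[float]) -> List[float]:
    """Forward pass of a single-hidden-layer feedforward network."""
    # Hidden layer: weighted sum of the inputs plus bias, then tanh (assumed).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum of hidden activations plus bias (assumed linear).
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


# Two inputs, two hidden neurons, one output; weights chosen by hand.
w_h = [[1.0, 0.0], [0.0, 1.0]]
b_h = [0.0, 0.0]
w_o = [[1.0, 1.0]]
b_o = [0.5]
out_zero = forward([0.0, 0.0], w_h, b_h, w_o, b_o)
out_sym = forward([1.0, -1.0], w_h, b_h, w_o, b_o)
```

In both calls the hidden activations cancel or vanish, so the output equals the output bias of 0.5.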

3.2.2. Time-Series Neural Network without Feedback Time Delays (TSNF)

Parameters such as vibration and body attitude are often affected by both the relevant factors and their own existing states. Variables of this type can often be predicted more accurately by time-series neural networks. A typical time-series neural network structure is shown in Figure 1, where the input vectors are formed by the input variables and their delays.

Other than the input layer, the other parts of the TSNF are similar to those of the FFNN. In other words, the TSNF also has multiple hidden layers and one output layer. Each hidden layer contains a certain number of neurons.
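
Forming input vectors from the input variables and their delays amounts to stacking the current and previous samples. The sketch below is one way to do this; the function name `delayed_inputs`, the ordering of the stacked values (most recent first), and the data are assumptions.

```python
from typing import List, Sequence


def delayed_inputs(u: Sequence[Sequence[float]], n_delays: int) -> List[List[float]]:
    """Stack u(t), u(t-1), ..., u(t-n_delays) into one input vector per time step."""
    rows = []
    for t in range(n_delays, len(u)):
        stacked: List[float] = []
        for d in range(n_delays + 1):
            stacked.extend(u[t - d])
        rows.append(stacked)
    return rows


# One input variable observed at four time steps, with two delays.
series = [[1.0], [2.0], [3.0], [4.0]]
stacked_rows = delayed_inputs(series, n_delays=2)
```

The first `n_delays` time steps produce no row because their history is incomplete, so four samples with two delays give two input vectors.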

3.2.3. Time-Series Neural Networks with Feedback Time Delays (TSF)

TSF is another common time-series neural network (Figure 2). The structure of TSF originates from the NARX. Unlike the TSNF, the TSF contains both input delayed variables and feedback delays.

The introduction of delayed feedback is equivalent to taking the states of the target close to the next predicted moment as the input variables. According to the analysis in the previous section, the TSF structure is expected to further improve the prediction accuracy. Hence, the performance of the prediction model can be optimized by this structure. Of course, the other two structures cannot be neglected in actual practice; it is sometimes necessary to make predictions based on predictors with or without feedback.
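
Adding feedback delays means that past values of the target join the delayed inputs in each training sample. The sketch below builds such samples using measured past targets (an open-loop arrangement, which is an assumption here); the function name `narx_samples` and the data are hypothetical.

```python
from typing import List, Sequence, Tuple


def narx_samples(u: Sequence[Sequence[float]], y: Sequence[float],
                 input_delays: int, feedback_delays: int
                 ) -> List[Tuple[List[float], float]]:
    """Pair [u(t), ..., u(t-du), y(t-1), ..., y(t-dy)] with the target y(t)."""
    start = max(input_delays, feedback_delays)
    samples = []
    for t in range(start, len(y)):
        features: List[float] = []
        for d in range(input_delays + 1):
            features.extend(u[t - d])          # current and delayed inputs
        for d in range(1, feedback_delays + 1):
            features.append(y[t - d])          # delayed target values
        samples.append((features, y[t]))
    return samples


inputs = [[10.0], [20.0], [30.0], [40.0]]
targets = [1.0, 2.0, 3.0, 4.0]
pairs = narx_samples(inputs, targets, input_delays=1, feedback_delays=2)
```

Each feature vector now carries the most recent states of the target itself, which is the information the TSNF structure lacks.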

3.3. Model Construction Process

The optimal prediction model can be constructed in two phases, namely, data preprocessing and variable selection, and the model construction based on neural networks. Figure 3 is the flow chart of the model construction.

In the data preprocessing and variable selection block, the dataset comes from a GJ-5 track inspection car. To make the subsequent steps more effective, we screen for missing and singular values and delete the corresponding sampling points, which may otherwise deteriorate the analysis results. The target variables are then defined as the vibration acceleration of the car body in three directions, i.e., the horizontal, vertical, and lateral directions. The remaining measured parameters (139 other parameters in our dataset) are all considered as predictors at this step. However, it is reasonable to expect that a large number of these predictors are redundant and have almost no impact on the vibration of the car body. To solve this problem, a correlation analysis between the predictors and the response parameters is carried out. Specifically, Pearson’s correlation coefficient and Spearman’s rank correlation coefficient are adopted to select the predictor variables. Since the accuracy of the car body vibration forecast model is a key index, all predictors with an absolute correlation coefficient over 0.1 are taken as inputs in the following model building process. It should be noted that if the dataset is very large or the selected predictors are still redundant, the bound on the correlation coefficient can be changed to improve efficiency.

Next, in the prediction model building block, the selected variables were used to generate models trained by different algorithms under three neural network structures. The algorithms chosen have been widely verified as effective for neural networks and are briefly introduced in Section 3.1: LM, BR, BQ, RB, SCG, CGB, CGF, CGP, and OSS. The three neural network structures are FFNN, TSNF, and TSF, which are described in Section 3.2. The performance evaluation indices of these models were compared to determine the optimal prediction model for car body vibration acceleration. In this work, accuracy is the primary ranking criterion, because we want to improve the suitability of the proposed forecast model and the measured values of car body vibration acceleration are very low, normally less than 0.1. Accordingly, the MSE is set as the main index of comparison. If the MSE values are very close, the R and MAE values are compared as auxiliary indices. When the dataset is large and the running times of the algorithms differ markedly, a more comprehensive ranking equation should be designed that combines the three indices and the running time with reasonable weights. Finally, the forecast model of car body vibration is obtained.
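
The ranking rule described above (MSE first, then R and MAE when the MSE values are very close) can be sketched as a sort with a composite key. The text does not quantify "very close", so this sketch assumes that two MSE values tie when they agree after rounding to a chosen number of decimal places; the function name `rank_models`, the `digits` parameter, and every number below are made up.

```python
from typing import Dict, List, Tuple


def rank_models(results: Dict[str, Tuple[float, float, float]],
                digits: int = 4) -> List[str]:
    """Rank by rounded MSE (ascending), then R (descending), then MAE (ascending).

    Each value in `results` is a tuple (mse, r, mae).
    """
    def key(name: str) -> Tuple[float, float, float]:
        mse, r, mae = results[name]
        return (round(mse, digits), -r, mae)

    return sorted(results, key=key)


# Hypothetical index values for three candidate models.
candidates = {
    "A": (0.00121, 0.95, 0.020),
    "B": (0.00118, 0.97, 0.021),
    "C": (0.00300, 0.99, 0.010),
}
ranking = rank_models(candidates)
```

Models A and B tie on the rounded MSE, so the higher R value places B first; C has the best R and MAE but a clearly larger MSE and ranks last.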

4. Results and Discussion

4.1. Measurement System

The training data were collected by a GJ-5 track inspection vehicle of ImageMap, Inc., between Shenzhen and Guangzhou, two first-tier cities in China. The verification data were acquired on the return section. The main test items include geometry inspection items, on-board dynamics test items, and ground dynamics test items. Specifically, the track geometry inspections include different wavelengths and gauge, track pitch variation rate, level, triangle pit twisting curvature, and curvature change rate; on-board dynamics test items mainly include wheel rail force and left and right wheel vertical forces, lateral force, derailment coefficient, deceleration rate, three-section acceleration reduction rate of wheel load, lateral stability index of the structure, stability and the vehicle body vertical and lateral acceleration of the left and right axle box, frame, and body of the vehicle left and right axle boxes, vertical and lateral acceleration of the frame, car body vertical and lateral acceleration of the left and right axle boxes, frames, and bodywork of the middle car; ground dynamics test items include derailment coefficient, load shedding rate, lateral force, vertical force, and vertical rail displacement horizontal. The main parameters of track inspection and test accuracy of the inspection vehicle are listed in Table 1.

Through the correlation analysis, the predictor variables for car body vibration acceleration were selected by the absolute values of coefficients falling between 0.1 and 0.9. This interval was chosen to exclude parameters loosely correlated with or similar to the target variable, making it possible to obtain a practical analysis of the impact of each track factor on the target variable. The selected variables are listed in Table 2.
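
The band-based screening can be sketched as a filter on the absolute coefficient values. The text does not say how the Pearson and Spearman results were combined, so taking the union of the two selections is an assumption; the function name `in_band` is hypothetical, and the coefficient values attached to the variable names are invented for illustration.

```python
from typing import Dict, List


def in_band(coefficients: Dict[str, float],
            lower: float = 0.1, upper: float = 0.9) -> List[str]:
    """Keep variables whose absolute coefficient lies in [lower, upper]."""
    return [name for name, value in coefficients.items()
            if lower <= abs(value) <= upper]


# Invented coefficients against one target variable.
pearson_vals = {"LSURFACE": 0.35, "CURVE_RATE": -0.04, "BOGIE_FRAME_VACCEL": 0.95}
spearman_vals = {"LSURFACE": 0.30, "CURVE_RATE": -0.12, "BOGIE_FRAME_VACCEL": 0.93}

# Union of both selections, with repeated names removed.
selected = sorted(set(in_band(pearson_vals)) | set(in_band(spearman_vals)))
```

In this made-up case the bogie frame acceleration is excluded as too similar to the target, while the curvature change rate is retained only because of its rank correlation.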

As shown in Table 2, the vertical vibration acceleration of the car body hinges directly on the surface conditions of the track. The horizontal and lateral accelerations are affected by relatively more factors, owing to the track state and train operation. Moreover, some variables are repeated, which is to be expected. Through the above processes, the main influencing factors of car body vibration acceleration were all identified, laying the basis for subsequent improvement of train structure and passenger comfort. Finally, the repeated variables were eliminated, leaving a total of 23 predictor variables.

4.2. Performance Evaluation Indices

Three indices were selected to evaluate the performance of our prediction model: the mean square error (MSE), mean absolute error (MAE), and regression coefficient (R). The MSE served as the main index and the other two as the auxiliary indices.

The MSE can be calculated by the equation below:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^{2},$$

where $\hat{Y}$ is a vector of $n$ predictions; $Y$ is the vector of observed values corresponding to the inputs; and $(Y_i - \hat{Y}_i)^{2}$ is the square of the errors. An MSE close to zero indicates that the model is suitable for prediction, provided that it is not overfitted.

The MAE is a yardstick of the accuracy of evaluation and prediction [25, 26]. The definition of the MAE is as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| \hat{Y}_i - Y_i \right|.$$

The R indicates the amount of variance explained by the prediction model. Its expression involves the mean value of the measured data, the mean value of the predicted data, the residual sum of squares, and the explained sum of squares. The value of R falls between 0 and 1. If the R is close to one, it means the model has explained the majority of the variance [27, 28].
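
The two error indices can be computed directly from paired observed and predicted values. The sketch below uses the standard definitions of the MSE and MAE; the R index is left out because its exact expression is not restated here, and the function names and acceleration values are hypothetical.

```python
from typing import Sequence


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of the squared differences between observations and predictions."""
    return sum((y - yh) ** 2 for y, yh in zip(observed, predicted)) / len(observed)


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of the absolute differences between observations and predictions."""
    return sum(abs(y - yh) for y, yh in zip(observed, predicted)) / len(observed)


# Made-up acceleration values, all below 0.1 in magnitude.
measured = [0.02, -0.01, 0.03, 0.00]
forecast = [0.01, -0.01, 0.05, 0.01]
mse_value = mse(measured, forecast)   # approximately 1.5e-4
mae_value = mae(measured, forecast)   # approximately 0.01
```

Because the measured values are small, the MSE is smaller still, which is one reason to compare orders of magnitude rather than raw differences between models.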

4.3. Model Performance

Our prediction models were built in MATLAB with functions from the Neural Network Toolbox. The program ran on a Lenovo workstation (CPU: Intel® Xeon® Processor E5-2623 v3, 3.00 GHz; RAM: 64 GB). To compare the models based on the three neural network structures, the number of hidden layers and the number of hidden layer nodes were set to 1 and 15, respectively. The number of hidden layer nodes was chosen because, when node counts from 10 to 20 were traversed, the FFNN structure achieved its best performance with 15 hidden layer nodes.

4.3.1. Training Performance

As mentioned before, the modelling data were captured at an interval of 0.25 m by a GJ-5 track inspection vehicle moving from Shenzhen to Guangzhou. In total, 50,000 sample points that are continuous in the time series were selected for modelling. The results of the FFNN-structure model (Table 3) show that the models trained with LM and BR outperformed those trained by the other algorithms, but the run times of these two models were relatively long.

For better understanding, the predicted values of the FFNN-structure model trained by BR were compared with the measured values in Figure 4, where the horizontal axis is the total number of samples and the vertical axis is the target variable (i.e., the car body vibration acceleration in three directions).

As shown in Figure 4, the predicted acceleration in all three directions basically conformed to the trend of the target variables. However, only the predicted acceleration in the horizontal direction was entirely consistent with the measured value, while that in the other two directions merely approximated the mean value of the target variables. The magnitude and range of the measured values were not well reflected.

The performance of the TSNF-structure prediction model is presented in Table 4. It can be seen that the three indices were all improved from the levels in Table 3, indicating that it is meaningful to consider the time delays. This is because the car body has some time delays in its response to the relevant factors, such as track surface and driver’s operation. Nonetheless, it is also learned that the TSNF increased the run time from that of the FFNN.

The predicted values of the TSNF-structure model trained by BR were compared with the measured values in Figure 5. It is clear that the TSNF-structure model reflected the magnitude and range of the measured values more accurately than the FFNN-structure model.

Finally, the performance of the TSF-structure model (Table 5) shows that the model performance changed little with the training algorithm. The order of magnitude of the MSE stayed the same across the algorithms, and the R value remained near 0.99. Therefore, the TSF structure gives the model good performance with a short run time.

The predicted values of the TSF-structure model trained by LM were compared with the measured values in Figure 6. It can be seen that the predicted values were highly consistent with the measured values. Compared to the other two structures, the TSF managed to reflect the actual magnitude and range of the measured values.

4.3.2. Additional Dataset Verification

To verify the universality of the proposed model, another 40,000 sample points were selected from the data acquired by the GJ-5 track inspection vehicle on the return journey. Table 6 compares the performances of the FFNN-, TSNF-, and TSF-structure models on the additional dataset. Figures 7–10 display the best performing model of each structure on the test dataset.

As shown in Figures 7, 8, 9, and 10, the predicted values of FFNN- and TSNF-structure models deviated significantly from the measured results in some intervals. The possible reasons are as follows: First, the target variables may be affected by other implicit variables. Second, the correlation analysis fails to screen out all the representative predictor variables. Before applying the models of these two structures to actual projects, it is necessary to expand the dataset to include various situations and calculate correlation coefficients in a proper manner. These are the necessary steps to acquire the typical influencing factors of the target variables.

Besides, the predicted values of the TSF-structure model trained by CGB and CGP were basically consistent with the measured values, which are obviously better than those of the other two structure models. In some stationary phases, however, there was a constant deviation between the predicted and measured values. A possible reason lies in the fact that the predicted values are less affected by other influencing factors in relatively stable phases and are only determined by the impacts from delayed feedback values. By contrast, the constant deviation did not appear in the FFNN-structure model. Thus, the FFNN structure might be helpful to eliminate the deviation in the TSF-structure model. This idea will be examined in future research.

Given the accurately predicted vibration acceleration of the car body, the passenger comfort can be derived according to international standards like ISO2631 or UIC513. Thus, the proposed prediction model lays the basis for early warning and fault detection in operation and maintenance processes.

5. Conclusions

This paper establishes a prediction model for car body vibration acceleration. First, the various parameters related to the track and the train were filtered by correlation analysis based on both Pearson’s correlation coefficient and Spearman’s rank correlation coefficient. The parameters closely correlated with the target variable were selected as predictor variables. Then, the selected variables were used to construct prediction models with three different neural network structures, namely, the FFNN, the TSNF, and the TSF. To verify the performance, the proposed models were applied to predict the car body vibration acceleration with actual railway datasets. The following phenomena were observed from the predicted results.

During the training process and new prediction, according to the obtained values of indices, the BR training algorithm achieved very good performances both on the training dataset and the new dataset under FFNN and TSNF structures, but it consumed too much time. The LM boasted the best performance under the TSF structure but performed poorly on the new test dataset. The TSF-structure models trained by CGB and CGP achieved even more accurate prediction on the new dataset.

Future research will further improve the adaptability of the proposed model and apply the predicted values to enhance passenger comfort.

Nomenclature

NN:Neural networks
P:Pearson correlation coefficient
S:Spearman correlation coefficient
BFGS:Broyden–Fletcher–Goldfarb–Shanno
BQ:BFGS Quasi-Newton backpropagation
CGB:Conjugate gradient backpropagation with Powell-Beale restarts
CGF:Conjugate gradient backpropagation with Fletcher-Reeves updates
CGP:Conjugate gradient backpropagation with Polak-Ribiére updates
OSS:One-step secant backpropagation
RB:Resilient backpropagation
SCG:Scaled conjugate gradient backpropagation
LM:Levenberg-Marquardt backpropagation
BR:Bayesian regularization backpropagation
FF:Feedforward net
TS_NF:Time-series neural networks without feedback time delays
TS_F:Time-series neural networks with feedback time delays
Superelev:Superelevation
C_IRREG_LEFT:Left complex irregularity
C_IRREG_RIGHT:Right complex irregularity
LOFFSET:Left offset
ROFFSET:Right offset
LSURFACE:Left surface
RSURFACE:Right surface
BOGIE_FRAME_HACCEL:Horizontal acceleration of bogie framework
BOGIE_FRAME_VACCEL:Vertical acceleration of bogie framework
L Surf 1:Left vertical irregularity of long wave
R Surf 1:Right vertical irregularity of long wave
L Surf 2:Left vertical irregularity of medium wave
R Surf 2:Right vertical irregularity of medium wave
L Surf 3:Left vertical irregularity under 20 cm chord
R Surf 3:Right vertical irregularity under 20 cm chord
L Surf 4:Left vertical irregularity under 10 cm chord
R Surf 4:Right vertical irregularity under 10 cm chord
CURVE_RATE:Curvature change rate
MSE:Mean square error
R:Regression coefficient
MAE:Mean absolute error

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Acknowledgments

This research was funded by National Natural Science Foundation of China [Grant No. 51478258 and 51405287] and Shanghai Committee of Science and Technology [Grant No. 18030501300].