Abstract
To further improve the accuracy of electric bus energy consumption estimation and reduce the complexity of the data required, this paper proposes a new method for estimating electric bus energy consumption that combines a deep learning approach with a data-driven model. The method estimates the single-trip energy consumption of an electric bus by applying a CNN (convolutional neural network) to time series prediction, using easily accessible trip data such as initial SOC (state of charge), average speed, and average temperature. First, the collected real-world trip data of electric buses are preprocessed to convert the raw data into a trip dataset. Then, a single trip of the bus from the origin station to the terminal station is taken as the basic unit for energy consumption estimation, and the trip data are processed as a quasi-time series. Next, the trip data are transformed so that the subsequent convolutional operations more closely match the interactions between adjacent trips, and a CNN-based time series prediction method is used in place of the regression analysis methods of traditional data-driven models. Finally, single-trip energy consumption estimation for electric buses is achieved with CNN-based time series prediction, and the method is compared with the LSTM (long short-term memory) time series prediction method and with the multivariate nonlinear regression prediction method of traditional data-driven models. The results show that the proposed energy consumption estimation model has higher prediction accuracy, improving by 3.68 percent over the traditional multivariate nonlinear regression prediction method and by 1.32 percent over the LSTM time series prediction method.
1. Introduction
Enhancing the precision of energy consumption and range estimation for pure electric buses makes it easier to handle problems such as cooperative charging scheduling of bus fleets, operation organization optimization, and charging pile design and construction. Currently, there are two mainstream ways of estimating EV energy consumption: dynamics-based and data-driven.
The dynamics-based EV energy consumption estimation model can more accurately describe the EV’s energy consumption process, which typically includes the energy consumed by the drive, wind resistance, rolling friction, road slope, air conditioning, and other vehicle factors. Gao et al. [1] assessed the energy consumption and battery performance of an urban electric bus based on bus trip data and standardized operating conditions. By modelling the nonlinear steady-state turning of an urban electric bus, Beckers and Besselink [2] calculated the proportion of turning energy in the overall energy consumption. Chen and Xie [3] developed an energy consumption estimation model based on the identification of working conditions that combined fuzzy energy consumption and the Kalman filter, and the model’s estimation accuracy was improved by 77% when compared to the traditional energy consumption estimation method. Lian et al. [4] separated real-world working condition data into segments by analysing the correlation between SOC and driving range and then developed an electric vehicle range prediction model based on battery SOC and driving condition identification. Liu et al. [5] predicted the vehicle’s energy consumption in a future period using the link between energy consumption and working conditions, which was obtained by determining the driving energy consumption and the accessory energy consumption separately. Zhang and Yao [6] investigated the energy consumption characteristics of electric cars from a statistical and physical standpoint and provided a microscopic modelling framework for predicting the energy consumption of electric vehicles based on real driving state data. Al-Wreikat et al. [7] investigated the influence of variables such as journey distance, ambient temperature, and road gradient on specific energy consumption in various driving modes. Yin et al. [8] developed a simulation model of an entirely electric vehicle by expressing the typical parameters of 12 driving conditions, such as speed and acceleration, in order to estimate the entire vehicle’s energy consumption by simulation. By dividing electric vehicle driving power into three categories, motor power loss, driving resistance loss, and acceleration and deceleration power, Wu et al. [9] developed an analytical approach for predicting energy consumption. Xu and Wang [10] predicted future vehicle speed profiles based on historical data, elevation information, and real-time road congestion information and developed a mileage consumption estimation model based on travel cycle identification and prediction.
The data-driven EV energy consumption estimation model examines the factors that influence energy consumption and employs statistical regression analysis and machine learning to obtain electric vehicle energy consumption estimates. Hu and Gao [11] proposed an energy consumption model based on temperature stratification to extract energy consumption influencing factors from an electric cab model’s operational data. Chen et al. [12] employed a neural network prediction model to estimate energy consumption or recovery values and a data partitioning approach to distinguish between charging and discharging modes. Fukushima et al. [13] developed a transfer learning-based approach that estimates the energy consumption of novel electric vehicle models from that of common electric vehicle models. Yang et al. [14] used a weighted type II fuzzy set model to estimate electric bus energy consumption based on the obtained pure electric bus position data and energy consumption-related data. Vatanparvar et al. [15] estimated electric vehicle energy consumption using the NARX neural network and driving behaviour. Fan et al. [16] employed observed and state data such as operating voltage and battery SOC to enhance the forecast accuracy for electric cars, using recursive least squares calibration of model parameters in conjunction with the ampere-hour integration approach and the EKF algorithm. Cauwer et al. [17] employed a data-driven method to estimate the energy consumption of an EV for a given road on a road network based on neural networks and multiple linear regression models, which combine microscopic driving parameters with external environmental parameters.
The dynamics-based method for estimating the energy consumption of an EV demands many types of data: it requires road condition and real-time weather data such as road slope, friction coefficient, wind speed, and wind direction. Data-driven energy consumption estimation approaches can work with more easily available data, but they frequently suffer from insufficient estimation accuracy. Applying a CNN within a data-driven model can address both the heavy data requirements of dynamics-based models and the insufficient accuracy of data-driven models. In this research, we employ a CNN for time series prediction on electric bus trip data. The trip data are processed as a quasi-time series by treating a single bus trip from origin to destination as the fundamental unit for estimating energy consumption. Finally, a real trip dataset from a number of electric buses in a city in Jilin Province is used to train, validate, and test the CNN model and so achieve single-trip energy consumption estimation for electric buses.
2. CNN-Based Model Development
Although the data-driven energy consumption estimation approach reduces the number of data types required and allows readily available data to be used as input variables, its estimation accuracy is insufficient. Deep learning can effectively address this shortcoming of typical data-driven models because it captures the nonlinear and stochastic properties of the data and is highly robust.
First, a CNN-based electric bus energy consumption estimation model is developed, and the collected electric bus trip data are preprocessed and adapted to the requirements of the CNN input matrix. The model is then trained, validated, and evaluated on the test set, and its time cost and prediction accuracy are compared with those of the multiple nonlinear regression model and the LSTM time series prediction method. The research framework of this paper is shown in Figure 1.

2.1. The CNN Model’s Basic Structure
The CNN model employed in this study is based on the LeNet-5 model, with changes to the structure and size of each layer. Taking into account the size of the input matrix, the computational load, and the model’s accuracy, the network is designed with two convolutional layers, one pooling layer, and one fully connected layer. The output is a single value, the loss function is MSE (mean squared error), and the stride of the convolution operation is 1. Figure 2 depicts the CNN’s structure.

Convolutional layer 1 uses four 3 × 3 convolutional kernels so that as many data characteristics as feasible are preserved while the computational cost is kept low. The pooling layer uses max pooling with a 2 × 2 pooling area. The fully connected layer maps the resulting features to the single output value.
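To make the layer sizes concrete, the following minimal sketch traces the feature-map shape through the network. Several details are assumptions made only for illustration and are not stated in the text: the layer order (convolution 1, pooling, convolution 2, flatten), no padding, a pooling stride of 2, a 17 × 17 input with four channels, a 3 × 3 kernel in convolutional layer 2, and eight kernels in convolutional layer 2.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Length of one spatial dimension after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size, in_channels, conv2_kernels):
    """Follow (height, width, channels) through conv1 -> max pool -> conv2 -> flatten.

    The layer order, padding, pooling stride and conv2 settings are assumptions.
    """
    shapes = {"input": (input_size, input_size, in_channels)}
    s = output_size(input_size, kernel=3, stride=1)   # four 3x3 kernels, stride 1
    shapes["conv1"] = (s, s, 4)
    s = output_size(s, kernel=2, stride=2)            # 2x2 max pooling
    shapes["max_pool"] = (s, s, 4)
    s = output_size(s, kernel=3, stride=1)            # hypothetical 3x3 kernels
    shapes["conv2"] = (s, s, conv2_kernels)
    shapes["flatten"] = (s * s * conv2_kernels,)      # input to the fully connected layer
    return shapes


shapes = trace_shapes(input_size=17, in_channels=4, conv2_kernels=8)
for name, shape in shapes.items():
    print(name, shape)
```

With these assumed settings the fully connected layer would receive 200 features and reduce them to the single energy consumption value.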
2.2. Quasi-Timing Data Extraction and Multichannel Input Matrix Construction
Quasi-time series analysis is commonly applied when observations are not spaced at a fixed time interval but each observation marks a change in state. A quasi-time dataset is created from the statistical temporal distribution of the acquired data and can be used for time series forecasting. In this work, the state is a single electric bus trip, and the change is the change in trip data between trips. Hence, the trip data are processed as a quasi-time series with a single bus trip as the fundamental unit.
For example, the departure times of an electric bus route are fixed, as listed in Table 1. A day’s operational duty on the route consists of eight and a half shifts, and each shift comprises two single trips (one round trip). Therefore, each bus on the route makes a total of 17 trips every day (8.5 × 2).
As shown in Table 2, the departure times of the vehicles operating on the line are counted and divided into different time intervals, so that each time interval contains the departure time of only 1 vehicle. Therefore, a quasi-time series based on 17 trips per vehicle per day can be constructed, as shown in Figure 3.
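As a minimal sketch of this indexing step, the function below maps a departure time to the position of the interval that contains it. The interval start times are hypothetical placeholders, not the values of Table 2, and the list is assumed to be sorted in ascending order.

```python
from bisect import bisect_right


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def quasi_time_index(departure, interval_starts):
    """Return the 1-based index of the interval that contains the departure time.

    interval_starts must be sorted; interval k runs from interval_starts[k-1]
    up to (but not including) the next start time.
    """
    starts = [to_minutes(s) for s in interval_starts]
    index = bisect_right(starts, to_minutes(departure))
    if index == 0:
        raise ValueError("departure precedes the first interval")
    return index


# Hypothetical interval boundaries for illustration only.
example_starts = ["05:30", "06:30", "07:30"]
print(quasi_time_index("06:45", example_starts))
```

Applying the function to every departure of one vehicle on one day yields the trip positions 1 to 17 that order the quasi-time series.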

Because the sequences of the different variables are not strongly correlated with one another, combining them directly into a single input matrix is not appropriate. This paper therefore feeds each variable in separately, forming a multichannel input matrix analogous to the three-channel RGB matrix of an image (as shown in Figure 4). Each variable serves as one input channel, and the channels are stacked into a three-dimensional input matrix. When the convolution operation is performed in the convolutional layer, each channel is kept relatively independent. The results of the convolution on each channel are then summed, and the output is obtained by adding the bias.
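The per-channel convolution followed by summation and bias can be sketched in plain Python as follows. This is an illustrative implementation of a standard multichannel convolution with stride 1 and no padding (the padding choice is an assumption), not the authors' code.

```python
def conv2d_valid(channel, kernel):
    """Stride-1, no-padding 2D convolution (cross-correlation) of one channel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def multichannel_conv(channels, kernels, bias):
    """Convolve each channel with its own kernel, sum the maps, then add the bias."""
    maps = [conv2d_valid(c, k) for c, k in zip(channels, kernels)]
    out_h, out_w = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[i][j] for m in maps) + bias for j in range(out_w)]
        for i in range(out_h)
    ]


# Two toy 3x3 channels and one 2x2 kernel per channel.
ones = [[1.0] * 3 for _ in range(3)]
twos = [[2.0] * 3 for _ in range(3)]
kernel = [[1.0, 1.0], [1.0, 1.0]]
result = multichannel_conv([ones, twos], [kernel, kernel], bias=0.5)
print(result)
```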

2.3. Quasi-Time Series Conversion of Trip Data Based on GAF
This paper employs the Gramian angular field (GAF) used in the study by Modi et al. [18] to raise the dimension of the data, so that the input matrix requirements of each channel in the model are met while the original time series relationship of the variables in each channel is preserved. GAF turns each variable’s one-dimensional quasi-time series into a two-dimensional channel, preserving the original temporal ordering while reflecting the interactions between points within the time series. Gramian angular fields come in two forms, the Gramian angular summation field (GASF) and the Gramian angular difference field (GADF). The main idea of GAF is to represent a one-dimensional time series in a polar coordinate system, where the time axis is encoded as the radius and the values of the time series are encoded as the angle, and then to apply trigonometric functions to pairs of time series values to generate the GAF matrix.
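The GAF’s precise computation procedure is as follows:

(1) Because the angular values of polar coordinates are restricted to a range of values, the time series data are first normalized to the interval [−1, 1] or [0, 1]. Equations (1) and (2) show the normalization procedure.

$$\tilde{x}_{i}^{-1} = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{1}$$

$$\tilde{x}_{i}^{0} = \frac{x_i - \min(X)}{\max(X) - \min(X)} \tag{2}$$

$x_i$ is the $i$-th observation value of time series set $X$, and $\tilde{x}_{i}^{-1}$ and $\tilde{x}_{i}^{0}$ are the corresponding values normalized to [−1, 1] and [0, 1].

(2) The time series data are reconstructed based on the polar coordinate system. The calculation method is shown in the following equation:

$$\phi_i = \arccos(\tilde{x}_i), \qquad r_i = \frac{t_i}{N} \tag{3}$$

$\tilde{x}_i$ is the $i$-th value of time series set $X$ after normalization, $t_i$ is the corresponding timestamp, $N$ is the constant used to regularize the span of the polar coordinate system, and the pairs $(r_i, \phi_i)$ form the curve of the series in the polar coordinate system. It should be noted that when [0, 1] normalization is used for the time series, the range of values of $\phi_i$ is $[0, \pi/2]$, and the range of values of $\phi_i$ is $[0, \pi]$ when [−1, 1] normalization is used.

(3) The construction of GAF is inspired by the Gram matrix, and the angular sum or difference of trigonometric functions can be used to evaluate the correlation between different time series points, as shown in the following equations:

$$\mathrm{GASF} = \left[\cos(\phi_i + \phi_j)\right] = \tilde{X}^{T} \cdot \tilde{X} - \sqrt{I - \tilde{X}^{2}}^{\,T} \cdot \sqrt{I - \tilde{X}^{2}} \tag{4}$$

$$\mathrm{GADF} = \left[\sin(\phi_i - \phi_j)\right] = \sqrt{I - \tilde{X}^{2}}^{\,T} \cdot \tilde{X} - \tilde{X}^{T} \cdot \sqrt{I - \tilde{X}^{2}} \tag{5}$$

$\phi_i$ and $\phi_j$ are the angular values of the two time series points, which can be converted to Cartesian coordinate form by the angle sum and difference formulas. $I$ is the unit row vector. $\tilde{X}$ and $\tilde{X}^{T}$ are the normalized time series in row-vector form and its transpose.

Taking GASF as an example, its matrix representation is shown in the following equation:

$$\mathrm{GASF} = \begin{bmatrix} \cos(\phi_1 + \phi_1) & \cdots & \cos(\phi_1 + \phi_n) \\ \vdots & \ddots & \vdots \\ \cos(\phi_n + \phi_1) & \cdots & \cos(\phi_n + \phi_n) \end{bmatrix} \tag{6}$$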
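The three steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration of the usual GASF construction with [−1, 1] normalization; the function names are chosen here for illustration, and the radius is omitted because the matrix depends only on the angles.

```python
import math


def rescale(series):
    """Min-max rescale a series to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [((x - hi) + (x - lo)) / (hi - lo) for x in series]


def gasf(series):
    """Gramian angular summation field: entry (i, j) is cos(phi_i + phi_j)."""
    scaled = rescale(series)
    # Clamp before acos to guard against floating-point drift outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


matrix = gasf([0.0, 1.0, 2.0])
for row in matrix:
    print([round(v, 3) for v in row])
```

A series of length n yields an n × n symmetric matrix, which is what allows each one-dimensional channel to be passed to a two-dimensional convolution.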
3. Energy Consumption Estimation Model Training
3.1. Base Data Collection for Model Training
In this paper, a data-driven model is used to estimate the energy consumption of electric buses. The data collection site was chosen in Jilin Province, and real-time operation data of relevant vehicles were acquired using an online onboard networking platform while taking data collection expenses into account.
The collected data include the following: (1) the number of passengers boarding and alighting at each stop along a predetermined route, together with arrival and departure times, obtained from in-bus surveillance cameras; (2) real-time vehicle position, state of charge, energy consumption, total mileage, and air conditioning switch data for the specified routes, obtained from an electric bus vehicle networking platform; (3) hourly temperature readings recorded by national surface weather stations. Some of the raw data recorded by the telematics platform are shown in Table 3.
3.2. Preprocessing of Base Data for Model Training
Because this study addresses only the change in energy consumption while electric buses are in operation, and the real-time data gathered may include periods when a bus is not in service, the data must be filtered further to retain only records from the operating period. Further processing is also needed because the raw data obtained from video surveillance and the telematics platform differ in type from the trip data required.
3.2.1. Revenue Passenger Kilometre
Because the video surveillance data are instantaneous values, quantities such as the number of passengers on board, vehicle position, and driving speed must be converted into values that describe a single trip. Integrating the passenger count curve over the mileage of one trip reflects how the passenger load changes across the whole trip, which affects the trip energy consumption. The revenue passenger kilometre integral for a single trip is calculated as shown in the following equation:

$$\mathrm{RPK} = \int_{0}^{L} p(s)\,\mathrm{d}s \tag{7}$$

where $p(s)$ is the number of passengers on board at mileage $s$ and $L$ is the length of the trip.
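Because the passenger count changes only at stops, the integral over mileage reduces to a sum of the on-board count multiplied by the length of each inter-stop segment. The sketch below shows this calculation; the argument names and the example figures are hypothetical and are not taken from the surveyed routes.

```python
def revenue_passenger_km(stop_km, boarding, alighting):
    """Integrate the on-board passenger count over the mileage of one trip.

    stop_km   : cumulative mileage of each stop, in km
    boarding  : passengers getting on at each stop
    alighting : passengers getting off at each stop
    """
    if not (len(stop_km) == len(boarding) == len(alighting)):
        raise ValueError("all inputs must have one entry per stop")
    onboard = 0
    total = 0.0
    for i in range(len(stop_km) - 1):
        onboard += boarding[i] - alighting[i]           # load after leaving stop i
        total += onboard * (stop_km[i + 1] - stop_km[i])
    return total


# Hypothetical four-stop trip.
rpk = revenue_passenger_km(
    stop_km=[0.0, 1.5, 4.0, 6.0],
    boarding=[10, 5, 2, 0],
    alighting=[0, 3, 6, 8],
)
print(rpk)
```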
3.2.2. Raw Data to Single-Trip Data Conversion
In the original records of the vehicle networking platform, each timestamp marks the instant at which a displayed value changed, for example the moment the displayed SOC dropped. In the trip data, by contrast, each record holds the value of every indicator over the time span of a single trip. As shown in Table 4, converting between the two types of data requires first determining the start and end points of the trip, from which quantities such as travel time and average speed are calculated.
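A minimal sketch of this conversion is given below, assuming the raw records of one trip have already been isolated and sorted by time. The field names (`time`, `odometer_km`, `soc`) are hypothetical and do not reflect the platform's actual schema.

```python
from datetime import datetime


def summarize_trip(records):
    """Collapse the time-ordered raw records of one trip into a single trip record."""
    first, last = records[0], records[-1]
    minutes = (last["time"] - first["time"]).total_seconds() / 60
    distance = last["odometer_km"] - first["odometer_km"]
    return {
        "start_soc": first["soc"],
        "soc_drop": first["soc"] - last["soc"],
        "trip_time_min": minutes,
        "avg_speed_kmh": distance / (minutes / 60) if minutes > 0 else 0.0,
    }


# Hypothetical raw records for one trip.
raw = [
    {"time": datetime(2021, 1, 5, 8, 0), "odometer_km": 1000.0, "soc": 90},
    {"time": datetime(2021, 1, 5, 8, 20), "odometer_km": 1005.0, "soc": 88},
    {"time": datetime(2021, 1, 5, 8, 40), "odometer_km": 1012.0, "soc": 85},
]
trip = summarize_trip(raw)
print(trip)
```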
3.2.3. Estimated Average Temperature for a Single Trip
The hourly air temperature value is used as the initial moment air temperature value for that hour in the trip average air temperature calculation, and a time-averaged linear estimation is performed and rounded to the nearest whole number, as shown in Table 5.
Figure 5 depicts the problem of estimating the trip’s average temperature, which necessitates first estimating the temperature values at the start and end of the journey. A weighted average estimate of the temperature over the different time periods is required if the starting and ending moments are not in the same whole time period. Equation (8) shows the specific calculation formula.

Equation (8) involves the following quantities: the average temperature of a single trip, which is the value to be estimated; the time moments of the hourly temperature observations; the start moment and end moment of the trip; the temperature at the start moment of the trip, which is an intermediate value to be estimated; the average temperature change during the trip, which is also an intermediate value; and the actual hourly temperature values.
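One way to realize a time-weighted linear estimate of this kind is sketched below. It is an illustration under stated assumptions and not a reproduction of Equation (8): the temperature is assumed to vary linearly between consecutive hourly readings, times are expressed as decimal hours, and the observation hours are sorted.

```python
def temperature_at(t, obs_hours, obs_temps):
    """Linearly interpolate the temperature at time t between hourly readings."""
    for k in range(len(obs_hours) - 1):
        t0, t1 = obs_hours[k], obs_hours[k + 1]
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return obs_temps[k] + frac * (obs_temps[k + 1] - obs_temps[k])
    raise ValueError("t lies outside the observation range")


def trip_average_temperature(start, end, obs_hours, obs_temps):
    """Time-weighted mean temperature over [start, end], rounded to a whole degree."""
    if end <= start:
        raise ValueError("the trip must end after it starts")
    # Break the trip at every hourly observation that falls inside it.
    points = [start] + [h for h in obs_hours if start < h < end] + [end]
    area = 0.0
    for a, b in zip(points, points[1:]):
        # The trapezoid rule is exact on each linear segment.
        mean_ab = (temperature_at(a, obs_hours, obs_temps)
                   + temperature_at(b, obs_hours, obs_temps)) / 2
        area += mean_ab * (b - a)
    return round(area / (end - start))


# Hypothetical readings at 08:00, 09:00 and 10:00; trip from 08:30 to 09:30.
print(trip_average_temperature(8.5, 9.5, [8, 9, 10], [10, 14, 16]))
```

When the trip lies within a single hour, the list of break points contains only the start and end moments, and the result is simply the mean of the two interpolated temperatures.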
3.3. Preprocessing of Data Set for Model Training
After processing is completed, the trip data are converted to an electric bus trip dataset. The data are then stratified within each month according to how the dataset is distributed across dates. Finally, 1291 data points were separated in chronological order as the training set, 277 data points as the validation set, and 278 data points as the test set (1846 trips in total). The training set and validation set are used to train the models, while the test set is used to compare the method presented in this paper with other methods.
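The chronological part of this split can be sketched as follows. The monthly stratification is not reproduced here, and the `day` field is a hypothetical stand-in for the trip date.

```python
def chronological_split(records, sizes, key):
    """Sort records by time and cut them into consecutive blocks of the given sizes."""
    if sum(sizes) != len(records):
        raise ValueError("split sizes must add up to the number of records")
    ordered = sorted(records, key=key)
    parts, start = [], 0
    for size in sizes:
        parts.append(ordered[start:start + size])
        start += size
    return parts


# 1846 dummy trips, deliberately supplied in reverse order.
trips = [{"day": d} for d in reversed(range(1846))]
train, val, test = chronological_split(trips, (1291, 277, 278), key=lambda r: r["day"])
print(len(train), len(val), len(test))
```

Splitting in time order rather than at random keeps every validation and test trip later than the trips used for training.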
3.3.1. PCA Downscaling and Normalization of Trip Data
(1) PCA Downscaling. PCA (principal component analysis) is used to reduce the dimensionality of the input variables in order to obtain more concise and orthogonalized input variables, minimize the number of variables to reduce the number of operations, and enhance the running performance.
PCA is a multi-indicator statistical analysis method that uses dimensionality reduction to keep the number of variables to a minimum. It compresses the variables by replacing the original, more numerous indicators with a smaller number of new indicators (principal components) that retain most of the information in the original indicators and are uncorrelated with one another.
The scikit-learn package for Python can be used to realize the computation of PCA. The training matrix is created by filtering five factors from the training set data: beginning SOC, journey time, average speed, average temperature, and air conditioning on time. The initial five variables are reduced to four distinctive variables using PCA. Table 6 shows the percentage variance of each of the four components that were kept, i.e., the variance contribution of individual variables.
Based on the variance percentage results, the cumulative variance contribution of the four principal components was calculated to be 99.37 percent, which covers the majority of the information in the data. As a result, the five variables can be transformed into four new principal component variables using the eigenvectors, as illustrated in Table 7.
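The selection rule behind the cumulative variance contribution can be illustrated with a short sketch. The variance ratios and the threshold below are made-up numbers chosen for easy arithmetic; they are not the values of Table 6.

```python
def components_needed(variance_ratios, threshold):
    """Return (k, cumulative) for the fewest leading components reaching the threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(variance_ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k, cumulative
    return len(variance_ratios), cumulative


# Hypothetical explained-variance ratios for five components, largest first.
ratios = [0.5, 0.25, 0.125, 0.0625, 0.0625]
k, covered = components_needed(ratios, threshold=0.9)
print(k, covered)
```

In scikit-learn, ratios of this kind are exposed by the `explained_variance_ratio_` attribute of a fitted `PCA` object.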
According to the values of the component coefficients of the feature vector matrix, it can be found that component 1 mainly contains the information of starting SOC (the absolute value of the component coefficient is greater than 0.8), component 2 mainly contains the information of average temperature (the absolute value of the component coefficient is greater than 0.7), component 3 mainly contains the information of average temperature and air conditioning on time (the absolute value of the component coefficient is greater than 0.6), and component 4 mainly contains the information of travel time and average speed (the absolute value of the component coefficient is greater than 0.5).
The four principal component variables extracted from the above training set using PCA are used as input variables in energy consumption estimation using CNN. Equation (9) gives the relationship between them and the five initial variables.
In Equation (9), the four outputs are the values of the four principal components corresponding to each record. The five inputs are the start SOC (%), the trip time (min), the average speed (km/h), the average temperature of the trip (°C), and the air conditioning on time during the trip (min).
(2) Normalization. Before performing the CNN operation, the input data must be normalized to unify the magnitudes of the input variables and keep the range of each variable within the interval [−1, 1] (as required by GAF). Equation (10) shows the specific calculation formula.

$$x' = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1 \tag{10}$$

$x'$ indicates the value after the normalization process, $x$ is the original value, and $x_{\max}$ and $x_{\min}$ indicate the maximum and minimum values of the set in which variable $x$ is located.
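In practice the extremes are usually taken from one reference set and then applied to new values. The sketch below separates those two steps; taking the extremes from the training data and clipping values that fall outside them are assumptions made here so that every scaled value remains a valid input to the inverse cosine used by GAF.

```python
def fit_range(reference_values):
    """Return (minimum, maximum) of the reference set used for scaling."""
    lo, hi = min(reference_values), max(reference_values)
    if hi == lo:
        raise ValueError("a constant variable cannot be scaled")
    return lo, hi


def scale_to_unit_interval(value, lo, hi):
    """Map value linearly so that lo -> -1 and hi -> 1, clipping anything outside."""
    scaled = 2 * (value - lo) / (hi - lo) - 1
    return max(-1.0, min(1.0, scaled))


lo, hi = fit_range([10, 15, 20])
print([scale_to_unit_interval(v, lo, hi) for v in (10, 15, 20, 25)])
```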
3.3.2. Input Sequence Construction and GAF Transformation
The symmetric quasi-time series generation approach is designed around the input matrix and output structure of the convolutional neural network, so that more trip data can be matched with energy consumption values and the correlation between neighbouring trips can be extracted. As shown in Figure 6, the component values of the trip whose energy consumption is the output are placed at the midpoint of the quasi-time series, and the quasi-time data before and after them are ordered by the time period in which each trip is made. Missing trip values are replaced by the overall average of the time series.

The constructed symmetric quasi-time series data are shown in Table 8. The values of each variable for the trip in question lie in the same row as its energy consumption, and the other rows hold the generated symmetric quasi-time series data. As a result, each energy consumption value is associated with the four component variables of its own trip as well as those of nearby trips, which allows the influence of surrounding trips to be represented more accurately.
The GASF method is employed to convert the one-dimensional quasi-time series data into a two-dimensional matrix. Figure 7 depicts the four component channels (from left to right, components 1 to 4) of the GAF generated for the first trip of an electric bus on one route, taken as an example.
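A minimal sketch of the symmetric window construction is shown below. The window half-width is a hypothetical parameter, missing trips are marked with `None`, and filling positions that fall before the first or after the last trip of the day with the series mean is an assumption that extends the rule stated for missing trips.

```python
def symmetric_window(day_values, center, half_width):
    """Build a series centred on one trip, with neighbouring trips on either side.

    day_values : component values ordered by trip slot, None where a trip is missing
    center     : 0-based slot of the trip whose energy consumption is the output
    """
    present = [v for v in day_values if v is not None]
    fill = sum(present) / len(present)   # overall average of the series
    window = []
    for slot in range(center - half_width, center + half_width + 1):
        inside = 0 <= slot < len(day_values)
        if inside and day_values[slot] is not None:
            window.append(day_values[slot])
        else:
            window.append(fill)
    return window


values = [1.0, 2.0, None, 4.0, 5.0]
print(symmetric_window(values, center=0, half_width=2))
print(symmetric_window(values, center=3, half_width=1))
```

Each window has odd length, so the target trip always sits at the midpoint and the resulting GAF matrix is square with the target at its centre.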

Figure 7 shows that the images of the first three principal components have a concentrated distribution, whereas component 4 has a more uniform distribution of values at each point, implying that the component variables have a greater influence. The trip of each vehicle in the corresponding time period on each survey day is used as an input matrix, and the matrices of the input component variables for the same vehicle on the same day are each set as a channel matrix and merged into a three-dimensional input matrix.
The software platform of the experimental environment is Python 3.8.6, and the hardware is an Intel(R) Core(TM) i5-4200M CPU @ 2.50 GHz. The input matrix (GAF quasi-time matrix), the designed CNN model, and the optimizer (RMSProp, chosen to achieve faster and more global gradient descent) are loaded into this environment, and the data of the different bus routes in the training set are trained separately to obtain a CNN energy consumption estimation model for each bus route.
4. Validation Analysis of Model
The test set consists of 278 trip records separated in chronological order from the electric bus trip dataset. After model training is completed, the test set is fed into the CNN model for energy consumption estimation. The multiple nonlinear regression prediction method and the LSTM time series prediction method are also used to predict the test set data, and their results are compared with the prediction results of the CNN time series prediction method in this paper.
The LSTM model used for the LSTM time series prediction method includes two LSTM layers (each with 100 neurons) and one fully connected layer, with the ReLU function for activation, the MSE function for loss, and the RMSProp method for optimization. Equation (11) gives the model relation obtained by the multiple nonlinear regression prediction approach.
In this paper, the program running time of the CNN time series prediction method is approximately 24% lower than that of the LSTM time series prediction method when using electric bus data for model training and testing.
To show the prediction accuracy of each method more intuitively, two electric buses with license plate numbers 3123 and 3338 from different bus routes in the test set were selected, and their energy consumption estimation results for all single trips in a day were compared and analysed. Figures 8 and 9 show the predicted single-trip energy consumption of bus 3123 (bus line 1) and bus 3338 (bus line 2), respectively, compared with the true values. The bars in each figure represent the real value of single-trip energy consumption, and the three dotted line plots in black, red, and blue represent the values predicted by the multivariate nonlinear regression prediction method, the LSTM time series prediction method, and the CNN time series prediction method, respectively.


According to the calculation results, the trained CNN model achieves a prediction accuracy of 88.30% for bus route 1, which is 2.39 percentage points higher than the 85.91% of the multivariate nonlinear regression prediction method and 1.21 percentage points higher than the 87.09% of the LSTM time series prediction method.
For bus route 2, the trained CNN model achieves a prediction accuracy of 94.31%, which is 4.98 percentage points higher than the 89.33% of the multivariate nonlinear regression prediction method and 1.43 percentage points higher than the 92.88% of the LSTM time series prediction method.
As a result, averaged over the two routes, the prediction accuracy of the trained CNN model improved by 3.69 percentage points compared to the multiple nonlinear regression method and by 1.32 percentage points compared to the LSTM time series prediction method.
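The overall figures can be checked from the per-route accuracies. The short calculation below assumes that the overall improvement is the unweighted mean of the two per-route gains, which is an inference from the reported numbers and is not stated explicitly in the text.

```python
route_accuracy = {
    "route 1": {"cnn": 88.30, "regression": 85.91, "lstm": 87.09},
    "route 2": {"cnn": 94.31, "regression": 89.33, "lstm": 92.88},
}


def mean_gain(baseline):
    """Average gain of the CNN over a baseline, in percentage points."""
    gains = [acc["cnn"] - acc[baseline] for acc in route_accuracy.values()]
    return sum(gains) / len(gains)


print(round(mean_gain("regression"), 3))
print(round(mean_gain("lstm"), 3))
```

The mean gain over the regression method is 3.685 percentage points, which sits exactly on a rounding boundary and may be reported as either 3.68 or 3.69.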
5. Conclusion
This paper proposes a CNN-based time series prediction method for trip data, which combines deep learning with data-driven models to achieve single-trip energy consumption estimation for electric buses. The prediction accuracy of this method is higher than that of the LSTM time series prediction method and the multiple nonlinear regression prediction method. At the same time, because it relies on easily accessible trip data, the method reduces the number of data types required, and it can provide a new theoretical basis for the rational scheduling and dispatching of electric buses. However, the quasi-time series data used in this paper take the trip as their unit, and at this coarse granularity they do not constitute a true time series, which still affects the accuracy of the electric bus energy consumption estimation. In future work, data with finer granularity will be studied to further improve the accuracy of the energy consumption estimation model.
Data Availability
The data used to support the findings of this study are included within the paper.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was financially supported by the General Project of National Natural Science Foundation of China and Basic Research Project of Education Department of Liaoning Province. The research projects are “Vehicle Static Scheduling and Dynamic Control Method under Overlapping Operating Ranges of High Frequency Bus Lines” (71771062), “Research on Evaluation Method of Urban Regional Road Network Structure Based on Supply and Demand” (LJKZ0590), and “Urban Crowd Evacuation Security System Based on 5G Technology” (2021JH2/10100005).