#### Abstract

Electric vehicles have attracted worldwide attention because of their energy-saving and carbon-reducing characteristics, and the power grid can be expected to face the problem of integrating electric vehicles on a large scale. To capture the user behavior characteristics of electric vehicle load, it is necessary to establish a model based on electric vehicle charging behavior. In this paper, the characteristic indicators of electric vehicle load are quantitatively analyzed by combining the electric vehicle charging demand with the situational awareness results of the dispatchable resources in the station area. A situational prediction method for electric vehicle load based on the random forest algorithm is proposed, and the sample set is divided and used for training. A simulation example verifies the effectiveness of the proposed method in load forecasting.

#### 1. Introduction

The energy crisis and climate warming have become key problems restricting the sustainable development of human society. Electric vehicles have significant advantages in alleviating global energy shortages and environmental pollution [1–3]. At the same time, distributed power generation, which consists mainly of photovoltaic and wind power, is developing rapidly [4]. However, the large-scale integration of electric vehicles and the intermittence of renewable energy will aggravate the peak-valley difference of the grid load, making the operation of the power grid more difficult to control [5, 6]. On the one hand, the electric vehicle is a flexible load: the time it spends parked and connected is usually longer than the time required to fully charge its battery. Regulating the charging behavior of a large number of electric vehicles in an orderly way can therefore not only avoid the impact of large-scale random access on the power grid but also achieve peak shaving and valley filling, which is important for the safe operation of the future power grid and for improving its economic benefits [7–10]. On the other hand, because the moments at which electric vehicles connect to and leave the power grid are highly random and the battery state of charge is uncertain, day-ahead scheduling cannot accurately predict the connection and departure of each electric vehicle [11–13].

Short-term load forecasting is an important part of power system load forecasting and mainly forecasts the load at a given future moment [14]. The factors that affect its accuracy mainly include sudden weather changes, seasonal changes, dispatching plans, emergencies, and large-scale social activities. The short-term load is therefore random and uncertain [15, 16]. Even so, while it changes randomly, the load still shows obvious periodicity over different periods, such as the year, month, week, and day. Taken together, the short-term load variation appears as a nonstationary random process in the time series [17, 18]. Accurate short-term load forecasting is of great significance to economic dispatch by power dispatching departments, the control of unit allocation, and the developing electricity market [19].

Early warning, decision making, and visualization methods in situational awareness technology play an important role in the safe and stable operation of smart grids. Reference [20] designed a distribution network dispatching system based on situational awareness technology, which can flexibly detect and evaluate the running state of the distribution network. Reference [21] applied situational awareness to microgrids and used the perceived microgrid situation to guide active decision making. Reference [22] proposed using the electric vehicle charging and discharging index for situational warning of the power supply capacity of the distribution system. At present, the application of situational awareness in distribution systems is mainly aimed at early warning and monitoring of power supply and consumption in the distribution network. However, situational awareness for forecasting and dispatching the charging load of electric vehicles in the distribution system has not yet been developed in depth.

In this paper, the basic user behavior characteristics of electric vehicle load are studied, and the charging behavior model of electric vehicle is established. The forecasting method of electric vehicle load based on random forest algorithm is put forward. The sample set is trained and compared with other methods in the field of short-term load forecasting. The validity and superiority of the random forest algorithm for electric vehicle load forecasting are verified, which can effectively improve the accuracy of electric vehicle charging load forecasting.

#### 2. Modeling of Electric Vehicle Charging Behavior

##### 2.1. Electric Vehicle Charging Behavior Analysis

After electric vehicles are connected to the power grid on a large scale, their charging load becomes an important factor that cannot be ignored in the power grid. Because large-scale electric vehicle groups have certain aggregation characteristics, this paper takes the large-scale electric vehicle group as the research object and analyzes the charging demand of electric vehicles in terms of battery characteristics, charging time, and mileage characteristics.

The characteristics of the battery mainly include the battery capacity and the state of charge. Charging-time features include the charging start time and the charging duration. Driving features mainly include the daily mileage and the power consumption per kilometer of electric vehicles. The relationship between these three groups of properties is as follows:where is the charging time in hours; is the electricity consumption per kilometer of the electric vehicle, in kilowatt-hours per kilometer; is the mileage of the electric vehicle, in kilometers; is the charging power in kilowatts; is the battery capacity in kWh; and is the battery state of charge, which is the ratio of the remaining battery power to the fully charged state.

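The daily mileage of electric vehicles obeys a normal distribution, and its probability density function is as follows:

$$f(d) = \frac{1}{\sigma_d \sqrt{2\pi}} \exp\left(-\frac{(d - \mu_d)^2}{2\sigma_d^2}\right),$$

where $d$ is the daily mileage, $\mu_d$ is the expected value of daily mileage, and $\sigma_d$ is the standard deviation of daily mileage. The values of $\mu_d$ and $\sigma_d$ are taken according to [23].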

The initial charging time of electric vehicles also follows a normal distribution, and its probability density function is as follows:

$$f(t_s) = \frac{1}{\sigma_s \sqrt{2\pi}} \exp\left(-\frac{(t_s - \mu_s)^2}{2\sigma_s^2}\right),$$

where $t_s$ is the initial charging time of each electric vehicle, and its value is in the range of [0, 1440] (24 h is divided into 1440 time periods, each of which is 1 minute); $\mu_s$ represents the expected value of the initial charging time; and $\sigma_s$ is its standard deviation. The values of $\mu_s$ and $\sigma_s$ are taken according to [23].
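The sampling of charging behavior described in this subsection can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. The function name `sample_ev`, the wrap-around of the start minute, and all numerical parameters are assumptions for illustration only; they are not the values recommended in [23]. The charging time is computed under the assumed relation that the energy recharged equals the energy consumed while driving, that is, hours = consumption per kilometer × mileage / charging power.

```python
import random

def sample_ev(rng, mu_d, sigma_d, mu_t, sigma_t, w_kwh_per_km, p_kw):
    """Draw one vehicle's daily mileage (km), start minute, and charging hours."""
    # Daily mileage: normal draw, truncated at zero (mileage cannot be negative).
    mileage = max(0.0, rng.gauss(mu_d, sigma_d))
    # Initial charging time: normal draw, wrapped into the 1440 one-minute
    # periods of a day (the wrap-around is an assumption of this sketch).
    start_minute = int(round(rng.gauss(mu_t, sigma_t))) % 1440
    # ASSUMPTION: recharged energy equals driven energy, so T = W * d / P.
    hours = w_kwh_per_km * mileage / p_kw
    return mileage, start_minute, hours

rng = random.Random(0)
# Placeholder parameters for illustration only; not the values from [23].
mileage, start_minute, hours = sample_ev(rng, 40.0, 10.0, 1080.0, 120.0, 0.15, 7.0)
```

Repeating the draw for every vehicle in the group gives a synthetic population whose start times and charging durations can be accumulated into a load curve.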

##### 2.2. The Model of Single Electric Vehicle Charging Behavior

It is assumed that the charging power of each electric vehicle is continuously adjustable and remains constant within each optimization period. The mathematical model of the charging behavior of a single electric vehicle is shown in where is the energy trajectory value of the electric vehicle at time ; and are the lower and upper bounds of the energy trajectory at time , respectively; is the charging efficiency; is the electric vehicle charging power at time and remains constant during the period between time and time ; is the time interval of the scheduling period; and are the access and departure times of the electric vehicle, respectively; that is, the electric vehicle is connected at time and leaves at time ; is the charging demand for electric vehicles; is the upper limit of the rated charging power of the electric vehicle battery; , , and are the charging power of the electric vehicle at time and the maximum and minimum charging power limited by the energy boundary constraints, respectively. The schematic diagram of the energy boundary model of a single electric vehicle is shown in Figure 1.

In Figure 1, it is assumed that the electric vehicle is connected at time and leaves at time . The curve abd is the upper bound of the energy boundary of the electric vehicle, which means that after the electric vehicle is connected to the power grid, it is charged at the maximum charging power immediately until it reaches the user’s expectation . The curve acd is the lower bound of the energy boundary of the electric vehicle, which means that the electric vehicle delays charging after it is connected to the power grid, so that the user’s expectation is reached exactly at the time of departure. The slopes of ab and cd represent the increase in battery energy per unit time at the maximum charging power of the electric vehicle, that is, .
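The two boundary curves described for Figure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the model of this paper: energy is measured relative to the energy at connection, time is discretized into equal steps, and the function name `energy_bounds` and its parameters are hypothetical.

```python
def energy_bounds(t_arrive, t_depart, e_demand, p_max, eta, dt):
    """Return (lower, upper) energy trajectories for steps t_arrive..t_depart.

    Energy is measured relative to the energy at connection (assumption).
    """
    steps = t_depart - t_arrive
    gain = eta * p_max * dt  # largest energy increase in one step
    if gain * steps < e_demand:
        raise ValueError("demand cannot be met before departure")
    # Upper bound (curve abd): charge at maximum power immediately,
    # then hold once the user's expected energy is reached.
    upper = [min(e_demand, k * gain) for k in range(steps + 1)]
    # Lower bound (curve acd): delay charging as long as possible while
    # still reaching the expected energy exactly at departure.
    lower = [max(0.0, e_demand - (steps - k) * gain) for k in range(steps + 1)]
    return lower, upper

lower, upper = energy_bounds(0, 4, 7.0, 3.5, 1.0, 1.0)
```

Any feasible charging schedule produces an energy trajectory that lies between the two returned lists at every step.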

##### 2.3. Cluster Equivalent Model for Electric Vehicles

After the electric vehicles are connected to the power grid, the electric vehicles with the same departure time can be grouped into the same subcluster according to the charging parameters input by the owner. By superimposing the charging models belonging to the same subcluster, the equivalent charging model of a single subcluster can be obtained as follows:where is the energy trajectory of subcluster at time . and are the lower and upper bounds of the energy trajectory of subcluster at time , respectively. is the total charging power of subcluster at time . and are the lower and upper limits of the charging power of subcluster at time , respectively. is the number of electric vehicles belonging to subcluster at time . , and , are the lower and upper bounds of the energy trajectory and the lower and upper bounds of the charging power of the L-th electric vehicle of subcluster at time , respectively.
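The superposition of single-vehicle models into a subcluster model can be sketched as an element-wise sum. The sketch assumes that every vehicle's lower and upper trajectories are already expressed on the same time grid; the function name `aggregate_subcluster` is hypothetical.

```python
def aggregate_subcluster(vehicle_bounds):
    """Sum per-vehicle (lower, upper) trajectories element-wise.

    vehicle_bounds: list of (lower, upper) pairs, each a list of equal length
    on a shared time grid (assumption of this sketch).
    """
    lowers, uppers = zip(*vehicle_bounds)
    agg_lower = [sum(values) for values in zip(*lowers)]
    agg_upper = [sum(values) for values in zip(*uppers)]
    return agg_lower, agg_upper

agg_lower, agg_upper = aggregate_subcluster([
    ([0.0, 1.0], [2.0, 3.0]),
    ([0.0, 2.0], [1.0, 4.0]),
])
```

The same summation applies to the lower and upper limits of the charging power.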

Because the electric vehicles belonging to the same subcluster will leave at the same time, the cluster charging model is equivalent to the single charging model; that is, the charging strategy of the subcluster meeting the cluster charging model must follow a certain energy allocation mode. For the cluster charging amount set that satisfies the cluster charging model, there must be at least one charging power distribution method so that all electric vehicles in the cluster are fully charged before leaving under the condition that the single charging model is satisfied. According to the cluster charging model, the following inequality holds for .where is the charging power demand of the L-th vehicle.

First, when , the inequality is transformed into

Using the strategy of giving priority to charging the electric vehicles that leave first, the charging plan for all electric vehicles at departure time can be set as

so that it can be filled before leaving. Therefore, when , the proposition holds.

It is assumed that, for all time points , the electric vehicles that leave before time point can be fully charged within the cluster by adopting a strategy of charging the electric vehicles that leave first. That is, there is at least one charging plan set that satisfies the following equation:

The result is as follows:where represents the total amount of charge of the electric vehicle whose departure time is after the time point . Due to the strategy of charging the electric vehicle that leaves first, once the electric vehicle at the departure time point has been fully charged, the electric vehicle at the departure time point will be preferentially arranged to be charged.

When

The electric vehicle whose departure time is is charged, and the electric vehicle with departure time point will be charged.

Then, the electric vehicle with departure time point will be charged, and the total amount of charging is .

When , because of

it can be obtained as follows:

The above equations show that the charging plan can meet the charging demand of the electric vehicle at the departure time point .

Because the upper and lower limits of the cluster charging power in the cluster charging model do not exceed the sums of the upper and lower limits of all electric vehicles in the cluster, every charging strategy that satisfies the cluster charging model has at least one power allocation under which both the cluster charging amount and the single charging model are satisfied.
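The allocation strategy used in the argument above, giving priority to the electric vehicles that leave first, can be sketched for a single time step as follows. This is an illustrative sketch only: the dictionary keys `depart` and `remaining`, the shared per-step cap, and the function name `allocate_edf` are assumptions and do not appear in the original model.

```python
def allocate_edf(total_energy, vehicles, per_step_cap):
    """Split one step's cluster energy among vehicles, earliest departure first.

    vehicles: list of dicts with 'depart' (departure time) and
    'remaining' (energy still needed). Returns allocations in input order.
    """
    allocation = [0.0] * len(vehicles)
    # Serve vehicles in ascending order of departure time.
    order = sorted(range(len(vehicles)), key=lambda i: vehicles[i]["depart"])
    left = total_energy
    for i in order:
        # Limited by what is left, what the vehicle needs, and its own cap.
        give = min(left, vehicles[i]["remaining"], per_step_cap)
        allocation[i] = give
        left -= give
    return allocation

plan = allocate_edf(
    5.0,
    [{"depart": 5, "remaining": 4.0}, {"depart": 2, "remaining": 3.0}],
    3.5,
)
```

In this example the vehicle leaving at time 2 is served in full before the vehicle leaving at time 5 receives the remaining energy.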

#### 3. Situational Awareness of Power Supply Resources for Electric Vehicles

##### 3.1. Situational Awareness Model of Electric Vehicle Power Supply Resources

In the process of grid power resource scheduling, it is necessary first to extract the factors related to changes in electric vehicle demand. The situation is then understood by integrating and classifying the obtained information, and finally the situation is predicted and its development trend is judged. In this way, the charging demand and power supply resources of electric vehicles can be accurately grasped, and the corresponding situation classification rules can be formulated so that the power supply resources can be reasonably allocated according to the real-time load status of the station area, as shown in Table 1.

The process of situational awareness mainly includes three parts: (1) extracting and preprocessing the characteristic parameters of situational awareness to provide data support for situational understanding and prediction; (2) understanding the situation by integrating and classifying the obtained information; and (3) making predictions and judgments about the situation.

Based on the charging characteristics of the electric vehicle, the characteristic parameters of the situational awareness model are extracted, including the current charging state of the electric vehicle, the charging start time, the charging time, and the charging pile usage interval time. We quantify the indicators and then obtain the situational awareness value, so as to reasonably dispatch the station resources. According to the mined indicators, the original data are collected, and the features are extracted to generate sample sets. If the total number of samples is $N$ and the number of feature indicators is $M$, the sample matrix will be an $N \times M$ matrix. Then, the training sample set is taken as the input, and the value corresponding to the sample matrix is taken as the training output, so as to train the ensemble model.

##### 3.2. Prediction of Electric Vehicle Load Situation Based on Random Forest Algorithm

In this section, a situation prediction model of electric vehicle power supply resources based on the random forest algorithm is proposed, and the situation of electric vehicle charging demand and regional power supply resources in the power supply station area is predicted.

The decision trees of the random forest algorithm generally adopt the Classification and Regression Tree (CART), which can effectively deal with large data samples and solve nonlinear problems. For classification problems, CART uses the Gini index as the attribute measure. The smaller the Gini index is, the purer the sample set and the more accurate the classification effect will be. The Gini index is defined as

$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^2,$$

where $p_k$ is the probability that a sample in $D$ belongs to class $k$, and $K$ is the number of classes. When $\mathrm{Gini}(D) = 0$, all samples belong to the same class.

If a sample satisfies the split condition on the selected attribute, the decision tree generation algorithm assigns the sample to the left subtree; otherwise, it assigns the sample to the right subtree. The CART decision tree generation algorithm selects the split attribute according to the principle of the smallest Gini index. Assuming that the attribute $A$ in the training set $D$ divides $D$ into $D_1$ and $D_2$, the Gini index of the given partition is

$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2).$$
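As a small illustration of the Gini index used by CART for classification, the following Python sketch computes the Gini index of a label set and the weighted Gini index of a binary partition. The function names `gini` and `gini_split` are chosen for this sketch only.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(left, right):
    """Size-weighted Gini index of a partition into two label sets."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

pure = gini(["fast", "fast"])
mixed = gini(["fast", "slow"])
split = gini_split(["fast", "fast"], ["slow", "slow"])
```

A split that separates the classes completely has a weighted Gini index of zero, which is why the attribute with the smallest value is preferred.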

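For regression problems, CART selects the optimal binary cut point based on the sum of squared errors, choosing the optimal splitting attribute $j$ and cut point $s$ by solving

$$\min_{j,s}\left[\min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2\right],$$

where $R_1(j,s)$ and $R_2(j,s)$ are the two regions produced by splitting attribute $j$ at cut point $s$, and $c_1$ and $c_2$ are the output values of the two regions.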
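The search for a cut point by the sum of squared errors can be sketched for a single attribute as follows. The sketch assumes that candidate cut points are the observed attribute values and that each region predicts the mean of its targets; the function names `sse` and `best_split` are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (cut_point, total_sse) minimizing the two-region squared error."""
    best = (None, float("inf"))
    # The largest value is skipped so that the right region is never empty.
    for s in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= s]
        right = [y for x, y in zip(xs, ys) if x > s]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (s, total)
    return best

cut, error = best_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
```

For several attributes, the same search is repeated per attribute and the pair with the smallest total error is kept.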

Random forest is an ensemble algorithm based on decision trees. In the process of building a random forest model, different training sets are constructed to train each decision tree, which increases the difference between the individual learners and makes the performance of the random forest algorithm surpass that of a single decision tree. Figure 2 shows the random forest training process. To reflect the randomness of the random forest model, the construction of the training set includes the following two key processes.

The random forest algorithm performs random sampling with replacement on the original training dataset. We construct a subdataset, the sample size of which is consistent with the original dataset. Samples in different subdatasets can be repeated, and samples in the same subdataset can be repeated. We generate a decision tree for each subset of data.

The splitting process of each decision tree in the random forest model only uses some of all the candidate features. The random forest algorithm first randomly selects a certain number of features from all the features to be selected and then uses the decision tree generation algorithm to select the optimal feature for splitting among the randomly selected features.

The specific implementation steps of the random forest algorithm are as follows:

(a) Preprocess the data required for forecasting. Missing data are filled in by linear interpolation, and the training sample set is divided according to the needs of the prediction algorithm.

(b) Train the random forest algorithm using the training sample set. According to the parameter settings of the random forest algorithm, several decision trees are built, and the random forest prediction model is completed.

(c) Use the characteristic data of the forecast samples to make the forecast. In the model, the characteristic data of each forecast sample pass through the prediction process of multiple decision trees, and the random forest algorithm aggregates their results to output the final prediction.
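The training and prediction steps can be illustrated with a deliberately small Python sketch. It is not the implementation used in this paper: each tree is reduced to a single split (a stump), the trees are averaged for regression, and the names `fit_stump`, `fit_forest`, and `predict` as well as the default parameters are assumptions. The sketch keeps the two sources of randomness described above, namely sampling with replacement and a random subset of candidate features.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Best single split over the given features by sum of squared errors."""
    best = None
    for j in feature_ids:
        # Skip the largest value so that neither side of the split is empty.
        for s in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, targets) if r[j] <= s]
            right = [y for r, y in zip(rows, targets) if r[j] > s]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or err < best[0]:
                best = (err, j, s, ml, mr)
    if best is None:  # no usable split: always predict the sample mean
        mean = sum(targets) / len(targets)
        return (None, None, mean, mean)
    return best[1:]

def fit_forest(rows, targets, n_trees=25, n_features=2, seed=0):
    """Train stumps on bootstrap samples with random feature subsets."""
    rng = random.Random(seed)
    n, m = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # sampling with replacement
        feats = rng.sample(range(m), min(n_features, m))  # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Average the predictions of all stumps."""
    preds = [ml if j is None or row[j] <= s else mr for j, s, ml, mr in forest]
    return sum(preds) / len(preds)

rows = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
targets = [1.0, 1.0, 5.0, 5.0]
forest = fit_forest(rows, targets)
low, high = predict(forest, rows[0]), predict(forest, rows[3])
```

A practical model would grow deeper trees and draw the feature subset at every split; a library implementation of random forest regression is the usual choice for that.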

The flowchart of short-term electric vehicle charging load prediction based on random forest algorithm is shown in Figure 3.

At present, with the continuous improvement of the collection and storage of electric vehicle data and other external data, there are many features to choose from. However, it is not advisable to let all the features participate in fitting the model, because this may lead to problems such as overfitting and reduce prediction accuracy. Therefore, considering the current research status of load forecasting, it is necessary to analyze the factors that affect the travel behavior of electric vehicles. This paper summarizes the following factors involved in feature engineering:

###### 3.2.1. Historical Data of Electric Vehicle Charging Load

The charging load of electric vehicles in the group also has the continuous characteristics of other conventional loads, and the historical load data closer to the predicted time can better reflect the load change trend. Of all the features, the historical data of load often have the greatest influence on the accuracy of the load model. The load history data of electric vehicle are selected as the input of the model, and the built feature set determines the accuracy of the load forecasting model to a great extent.

###### 3.2.2. Meteorological Factor Data

Meteorological factor data include temperature, humidity, and weather conditions. The influence of temperature and humidity on the behavior of electric vehicles is mainly reflected in the use of onboard air conditioners, which increases the power consumption of vehicles. The effect on the EV charging load is therefore a delay in the falling edge of the charging curve, because it takes longer to fully charge the EV battery. The weather often affects road conditions and the driving behavior of electric vehicles, and vehicles driven in bad weather also tend to consume more energy.

###### 3.2.3. Date Type

Date types are mainly divided into working days and weekends. In general, traffic congestion occurs during peak hours on working days, which affects the driving behavior of electric vehicles, increases their electricity consumption, and also affects their charging time, thus affecting the overall charging load.

###### 3.2.4. Time Type

The charging load of an electric vehicle group, like the conventional load, has a certain periodicity. Therefore, the load at a given time of day does not differ greatly from one day to the next. This is mainly because the charging characteristics of electric vehicle groups show a certain regularity. Therefore, the time type can also be used as an optional input feature.

#### 4. Study Case and Simulation Results

In this paper, the charging piles in a certain area are selected as the simulation object. We assume that the number of electric vehicles connected to the charging station in this area is 300 and that the connected vehicles are only pure electric vehicles. The electric vehicle parameters use the data in Table 2. We assume an EV charging efficiency of 99%. The charging pile is an AC charging pile, and the power selection is set to 3.5 kW and 7 kW; the two are randomly generated with equal probability.

First of all, it is necessary to divide training data and prediction data. This paper selects the historical electric vehicle charging load data, meteorological data, date type data, and time type data of Shenyang City from November 15, 2021, to November 21, 2021, to construct the training sample set, forecast the load from November 22, 2021, to November 28, 2021, and compare the forecast results with the actual values to verify the effectiveness of the method.

The model input includes the following: load data at the moment before the time to be predicted, load data at the previous two moments, and load data at the same moment last week, temperature, humidity, wind direction, wind speed, weather condition information, date type, and time type.
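The construction of model inputs from lagged loads and external features can be sketched as follows. The sketch assumes an evenly sampled load series, with the weather, date type, and time type values already encoded numerically per time step; the function name `build_samples` and the parameter `steps_per_week` are hypothetical.

```python
def build_samples(load, exog, steps_per_week):
    """Build (X, y) from a load series and per-step external features.

    Each row holds the load one step earlier, two steps earlier, and at the
    same moment one week earlier, followed by the external features.
    """
    X, y = [], []
    # Start once a full week of history is available for the weekly lag.
    for t in range(steps_per_week, len(load)):
        lags = [load[t - 1], load[t - 2], load[t - steps_per_week]]
        X.append(lags + list(exog[t]))
        y.append(load[t])
    return X, y

load = [float(v) for v in range(10)]
exog = [[0.0] for _ in range(10)]
X, y = build_samples(load, exog, steps_per_week=7)
```

With 15-minute data, for example, `steps_per_week` would be 672; the value 7 above is only a toy setting.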

The linear interpolation method is used to preprocess the numerical anomaly data of time series. It is assumed that a few continuous data points show a linear variation. It mainly adopts the mean value of the nonmissing time series data before and after as the repair value of the missing data. It is suitable for scenes with high data acquisition quality and few missing values.

Missing time series data are processed differently depending on their position, that is, whether they are missing at the beginning or end of the series or in the middle. For data missing at the beginning or end, the nearest nonmissing value is used as the fill value. The calculation formula is as follows:where is the nonmissing value closest to the first position in the daily load curve, and is the nonmissing value closest to the last position. is the data dimension of a single load data.

For the abnormal data of time series, this paper adopts the nearest-neighbor filling method. In the case of missing intermediate data, if a single data point is missing and the data of the load points before and after it are known, the corresponding data can be filled by linear interpolation; that is, the average of the load values before and after that time is used for filling. If several consecutive points are missing, the linear expression through the nonmissing data points before and after the gap is calculated, and all the missing data points are calculated proportionally. The calculation formula is as follows:where is the missing data, and and are the nearest nonmissing values before and after consecutive missing data.
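The filling rules described above can be sketched in Python as follows. Missing values are represented by `None`, which is an assumption of this sketch, as is the function name `fill_missing`.

```python
def fill_missing(series):
    """Return a copy of series with None entries filled.

    Leading and trailing gaps take the nearest known value; interior gaps
    are filled by linear interpolation between the neighboring known values.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    out = list(series)
    first, last = known[0], known[-1]
    for i in range(first):                    # gap at the beginning
        out[i] = series[first]
    for i in range(last + 1, len(series)):    # gap at the end
        out[i] = series[last]
    for a, b in zip(known, known[1:]):        # gaps in the middle
        for i in range(a + 1, b):
            out[i] = series[a] + (series[b] - series[a]) * (i - a) / (b - a)
    return out

filled = fill_missing([None, 2.0, None, 4.0, None, None, 10.0, None])
```

For a single missing point the interpolation reduces to the average of its two neighbors, as stated in the text.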

Figure 4 shows the prediction results based on the random forest algorithm on weekdays and weekends. The trend in the figure also reflects the travel habits of electric vehicle users in this area. After 8:00 p.m., the load gradually enters the charging peak, reaches its maximum around 11:00 p.m., and remains at a high level until around 5:00 a.m. As vehicles finish charging, the load begins to drop. It reaches a valley around 8:00 a.m. and then climbs again, with a small peak appearing at around 12:00 noon before dropping.

To verify the effectiveness and superiority of the random forest algorithm for electric vehicle load forecasting, three other algorithms are selected for forecasting, and their forecasting performance is compared with that of the random forest algorithm: support vector regression, Bagging regression, and stochastic gradient descent regression.

The prediction performance of the algorithms was evaluated using the mean absolute percentage error (MAPE) and the mean square error (MSE), which are calculated as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\%,$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2,$$

where $\hat{y}_i$ is the load value predicted by the algorithm at time $i$ of the day, $y_i$ is the actual load value at time $i$, and $n$ is the number of predicted time points. The smaller the MAPE and MSE are, the more accurate the prediction of the algorithm is.
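The two error measures can be computed with a few lines of Python. The sketch assumes that all actual load values are nonzero, since MAPE is undefined otherwise; the function names `mape` and `mse` are chosen for this sketch.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((p - a) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def mse(actual, predicted):
    """Mean square error."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
mape_value = mape(actual, predicted)
mse_value = mse(actual, predicted)
```

MAPE is scale-free and therefore convenient for comparing days with different load levels, whereas MSE penalizes large individual errors more strongly.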

Table 3 shows the load forecasting error statistics of each forecasting algorithm. Among all the algorithms tested, the random forest algorithm has the smallest prediction error, which demonstrates its effectiveness for electric vehicle load forecasting. The horizontal comparison also shows that the method adopted in this paper can effectively improve the prediction accuracy of electric vehicle load.

#### 5. Conclusions

This paper analyzes the basic user behavior characteristics of electric vehicle load and establishes an energy boundary model for single electric vehicle charging and a cluster equivalent model. An electric vehicle load situation prediction method based on the random forest algorithm is proposed. The situational awareness results of the dispatchable resources in the power supply area are divided into sample sets and used for training. Compared with other methods in the field of short-term load forecasting, the validity and superiority of the random forest algorithm for electric vehicle load forecasting are verified, and the charging performance and load forecasting accuracy of electric vehicles can be effectively improved. The following points can be explained by the simulation example.

(a) According to the usage habits of electric vehicle users, the travel model of electric vehicles is established; the mathematical model of the charging behavior of electric vehicle clusters is then established, and its correctness is proved. To address the adverse impact of the disorderly access of electric vehicles on power grid dispatching, and considering the changing characteristics of both the supply and demand sides, a situational awareness method is proposed to reasonably evaluate and forecast the power supply situation.

(b) From a data-driven point of view, a method of electric vehicle load forecasting based on the random forest algorithm is proposed, and the training set is constructed, which improves the accuracy of electric vehicle charging load forecasting. Moreover, the actual bearing capacity of the power supply area can also be taken into account while meeting the needs of users to the maximum extent.

#### Data Availability

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the data use agreement.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was funded by the Key R&D Program of Liaoning Province (2020JH2/10300101), the Liaoning Revitalization Talents Program (XLYC1907138), the Key R&D Program of Shenyang (GG200252), and the Liaoning PhD Initial Scientific Research Fund (2020-BS-179).