#### Abstract

The charging profiles of plug-in electric vehicles (PEVs) are highly volatile, which makes it difficult for aggregators to accurately identify loads and configure aggregated capacity. Therefore, this paper proposes an analysis and configuration method for responsive capacity based on clustering and the Markov model. Firstly, an adaptive two-scale clustering algorithm based on spectral clustering (ATCSC) is applied to the clustering of charging piles, with the offset compensation of the extreme points used to form the distance measurement in the clustering process. Then, the responsive aggregated power is obtained by changing the control of suitable charging piles. Finally, the variation characteristics of the actual charging profiles, based on the Markov model, are introduced into the reliability evaluation of the load curtailment service. Simulation results reveal the following. (1) The proposed method is more robust, especially for PEV charging profiles with strong volatility. (2) The validity of the aggregated configuration is verified; the sum of power deviation is 0.0707 kW when the change interval of the control strategy is 15 min. (3) The maximum shortage of configuration is −98.0875 kW when the entropy of the volatility is 37.027.

#### 1. Introduction

In 2015, China issued guidance on accelerating the construction of charging infrastructure in preparation for 100 million plug-in electric vehicles (PEVs) in 2050 [1]. Regulating demand-side resources is central to solving the power balance problem in the power system [2]. Charging stations (CSs) generate a large amount of charging data, which is valuable for studying the adjustable capacity of PEVs participating in demand response.

The demand response capability of a single charging pile is weak. The aggregator can aggregate charging piles to increase the capacity of responsive power [3]. However, the number of charging piles is huge, and their charging profiles differ from each other. Therefore, it is necessary to cluster these charging profiles. Clustering algorithms for load profiles can effectively identify the consumption modes of PEV users [4]. K-means clustering utilizes the Euclidean distance as the similarity measurement, and its existing applications are comparatively mature [5]. However, it is difficult to take differences in profile shape into account based on the Euclidean distance alone [6]. There are many other clustering methods, including hierarchical clustering [7], density-based spatial clustering [8], fuzzy clustering [9], and mean shift clustering [10]. Although these algorithms can improve the efficiency and quality of clustering to an extent, distance-based methods can only describe a profile’s characteristics at the overall or macroscopic level and may lose important information about the profile’s shape pattern. In the process of clustering, the similarity measurement describing the difference between two profiles must be determined [11]. The first-order differential operation [12] can effectively extract the power ramp rate between consecutive data points of the load profiles.

The performance of some improved clustering methods is more stable than that of K-means, such as the dimension-reduction technology [13], the Gaussian mixture model clustering [14], and the double-layer clustering [15]. A classification method based on segment aggregation approximation and spectral clustering is proposed in [16, 17], which has advantages in data degradation and computation efficiency. On the basis of the two-layer clustering method in [18], a two-layer iterative clustering method is proposed in [19]. These improved methods can effectively identify the changing trend of electricity consumption profiles, but their quality is greatly affected by large variation in the data. The charging profiles of PEVs have large volatility [20], which requires a robust clustering method. Even for the same CS, there are load peaks with different amplitudes at different occurrence times [21]. During the clustering process, the largest challenge is the high variability of the load patterns. However, studies that consider the offset of the extreme points for clustering are relatively scarce.

The comparison of relevant references is listed in Table 1.

Due to the randomness and discreteness of PEVs, high variability is the main difference between charging loads and traditional loads [22]. Existing methods lack a capability assessment of the adjustable power. It is necessary to analyze the variability characteristics to meet the requirements of demand response [23]. Therefore, this paper proposes a clustering and variation analysis method of PEV charging profiles for load curtailment services in the power system. This paper has the following contributions and features:

(1) The improvement of clustering robustness for PEV charging profiles. Current research studies perform profile smoothing without considering the effect of subtle fluctuations on the shape similarity [16, 17]. This paper proposes an adaptive two-scale clustering algorithm based on spectral clustering (ATCSC) for PEV charging profiles. In the distance measurement process, the offset compensation of extreme points reflects the difference between two profiles and ensures that the trajectory bias is comparatively small, which enhances the robustness of the clustering process.

(2) The introduction of the Markov model to analyze the variation characteristics of the CS, which provides a reliability assessment for participating in load curtailment programs. This step can reduce the deviation of capacity configuration and improve the reliability of control. In addition, the versatile probability distribution is utilized to statistically evaluate the charging load during 9–24 h, which has high peak shaving potential.

This paper is organized as follows. In Section 2, the clustering method (ATCSC) is established. The charging schedules can be obtained by the clustering of charging piles in the CS. Then, the Markov model of charging profiles is introduced to evaluate the reliability for the peak shaving in Section 3. The simulation results are discussed in Section 4. Section 5 concludes the paper.

#### 2. Clustering Method of Charging Profiles

The clustering methods of load profiles are applied to classify the electricity consumption patterns. The clustering method in this paper can be divided into four stages: (1) the preparation of charging data; (2) the extraction of morphological characteristics; (3) the measurement of the offset compensation of extreme points; and (4) the clustering process.

##### 2.1. Preparation of Charging Data

For the actual charging data, it is necessary to obtain low-dimensional data with an approximation. Assume that the original charging data of the *i*-th charging pile in one day are expressed as in (1), where *z*_{i,t} is the charging power of the *i*-th charging pile at time *t*.

Low-dimensional data are appropriate for the clustering algorithm. *Z*_{i} can be approximated by a profile *X*_{i} = [*x*_{i,1}, *x*_{i,2}, …, *x*_{i,T}] with *T* elements, where *x*_{i,t} is calculated as in (2).
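As a rough illustration of the dimension reduction described above, the sketch below averages consecutive equal-length segments of a raw profile. The function name `segment_aggregate` and the requirement that the raw length be a multiple of *T* are assumptions made for this example; the exact form of (2) may differ.

```python
def segment_aggregate(z, T):
    """Approximate a raw profile z by T segment means (illustrative sketch).

    Assumes len(z) is a multiple of T so that all segments have equal width.
    """
    n = len(z)
    if T <= 0 or n % T != 0:
        raise ValueError("len(z) must be a positive multiple of T in this sketch")
    width = n // T
    # Each low-dimensional value is the mean of one block of raw samples.
    return [sum(z[k * width:(k + 1) * width]) / width for k in range(T)]


# Four 15-min samples reduced to two 30-min values.
print(segment_aggregate([1.0, 3.0, 5.0, 7.0], 2))
```

For instance, 96 samples at 15-min resolution would reduce to 24 hourly values with `T = 24`.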

##### 2.2. Extraction of Morphological Characteristics

The first-order differential operation can accurately describe whether a profile is increasing, decreasing, or constant. The first-order difference is utilized to obtain the morphological characteristic index. The first-order difference values of the *i*-th profile are expressed as in (3), where *x*_{diff,i,t} is the first-order difference value of the *i*-th profile at time *t*.

Assume that the maximum and minimum values of the first-order difference of the *i*-th profile are max (*X*_{diff,i}) and min (*X*_{diff,i}), respectively. The morphological characteristic index of the *i*-th profile can be expressed as *Y*_{i} = [*y*_{i,1}, *y*_{i,2}, …, *y*_{i,T}]. *y*_{i,t} is obtained from (4), where *y*_{i,t} is the morphological characteristic index of the *i*-th profile at time *t*.

As shown in Figure 1, *y*_{i,t} reflects the change of states, such as increase, decrease, and constant. The extracted results of morphological characteristics conform to the change characteristics of the original profile.
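The idea of this subsection can be sketched as follows. The first-order difference is standard; the scaling in `morphological_index` (positive differences divided by the maximum difference, negative ones by the magnitude of the minimum) is only one plausible reading of how the maximum and minimum are used and should be treated as an assumption, not as equation (4). The sketch also returns *T* − 1 values, since no padding rule is assumed.

```python
def first_difference(x):
    """First-order difference between consecutive points of a profile."""
    return [b - a for a, b in zip(x, x[1:])]


def morphological_index(x):
    """Map each difference to [-1, 1]: >0 increase, <0 decrease, 0 constant.

    Hypothetical scaling chosen for illustration only.
    """
    d = first_difference(x)
    d_max, d_min = max(d), min(d)
    y = []
    for v in d:
        if v > 0:
            y.append(v / d_max)    # d_max > 0 whenever some v > 0
        elif v < 0:
            y.append(-v / d_min)   # d_min < 0 here, so the result is in [-1, 0)
        else:
            y.append(0.0)
    return y


print(morphological_index([0.0, 2.0, 2.0, 1.0, 5.0]))
```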

##### 2.3. Adaptive Two-Scale Clustering Algorithm Based on Spectral Clustering (ATCSC)

The difference between two profiles is considered as the distance and the shape. Therefore, the similarity measurements in the clustering method are the first-order difference and the distance measurement based on the offset compensation of extreme points. The correlation coefficient is set as the criterion for the optimal number of clusters.

###### 2.3.1. Data Normalization

The maximum-minimum method is utilized to normalize the data, as defined in (5) [17–19], where Norm(*x*_{i,t}) denotes the normalized electricity consumption at time *t*; Norm(*x*_{i,t}) ∈ [0, 1]; and *x*_{min} and *x*_{max} denote the minimum and maximum records, respectively.
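A minimal sketch of maximum-minimum normalization is given below. It assumes that the minimum and maximum are taken over a single profile, and it maps a flat profile to zeros to avoid division by zero; both choices are assumptions of this example.

```python
def min_max_normalize(x):
    """Scale a profile to [0, 1] using its own minimum and maximum."""
    lo, hi = min(x), max(x)
    if hi == lo:
        # A constant profile carries no shape information; return zeros.
        return [0.0] * len(x)
    return [(v - lo) / (hi - lo) for v in x]


print(min_max_normalize([2.0, 4.0, 6.0]))
```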

###### 2.3.2. Similarity Measurement Based on Euclidean Distance

The Euclidean distance *d*_{i,j} between profile *X*_{i} and *X*_{j} is shown in (6). Assume that *X*_{i} and *X*_{j} are the normalized data records.

###### 2.3.3. Similarity Measurement of Distance Based on the Offset Compensation of Extreme Points

The load profiles fluctuate greatly. The offset compensation of extreme points can reflect the trajectory bias of two profiles. These extreme points of *X*_{i} are shown as follows.

The extreme points of each profile form two sets: the set of local maximum points and the set of local minimum points, each indexed by the labels of the corresponding sequence points in the profile.

For *X*_{i} and *X*_{j}, each extreme point of one profile is paired with the closest point in the corresponding extreme-point set of the other profile, for the local maxima and the local minima alike, as shown in (7) and (8).

The offset distance between each such pair of points is then computed, for the maxima and for the minima, as shown in (9) and (10).

In this section, the offset distance of extreme points is added to the Euclidean distance to form a distance similarity measurement based on the offset compensation of the extreme points.

The offset interval of the extreme points is defined as the horizontal interval formed by an extreme point and its neighboring extreme point, denoted as [*t*_{1}, *t*_{2}].

The offset distance of extreme points is defined as the offset distance between the two profiles within the offset interval of the extreme points.

The distance compensation of the extreme points is defined as the result of adding the offset distance of the extreme points to the Euclidean distance between the two profiles.

When the adjacent extreme points have a time offset, the above method is used for the distance compensation; when the adjacent extreme points have no time offset, no distance compensation is required.

Therefore, the Euclidean distance of the two profiles considering the offset distance compensation of the extreme points, *dd*_{ij}, is defined in (11).

Since *dd*_{ij} is not necessarily equal to *dd*_{ji}, the distance measurement between the two profiles is the mean value of both, which is denoted as *a*_{ij}.

The distance compensation using extreme points not only reflects the difference caused by the extreme points but also is unaffected by the dimensional difference between the horizontal and vertical axes.
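Two pieces of this subsection are simple enough to sketch without the full equations: locating the local extreme points of a profile, and averaging *dd*_{ij} and *dd*_{ji} into the symmetric measurement *a*_{ij}. The helper names are hypothetical, and the compensated distance *dd*_{ij} itself is taken as an input here rather than computed, since its exact form is given by (7)–(11).

```python
def local_extrema(x):
    """Return (maxima, minima) as lists of interior indices of strict extrema."""
    maxima, minima = [], []
    for t in range(1, len(x) - 1):
        if x[t] > x[t - 1] and x[t] > x[t + 1]:
            maxima.append(t)
        elif x[t] < x[t - 1] and x[t] < x[t + 1]:
            minima.append(t)
    return maxima, minima


def symmetrize(dd):
    """a_ij = (dd_ij + dd_ji) / 2 for a square matrix of directed distances."""
    n = len(dd)
    return [[(dd[i][j] + dd[j][i]) / 2 for j in range(n)] for i in range(n)]


print(local_extrema([0.0, 2.0, 1.0, 3.0, 0.0]))
print(symmetrize([[0.0, 2.0], [4.0, 0.0]]))
```

Plateaus (equal neighboring values) are not reported as extrema in this sketch; a practical implementation would need a rule for them.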

###### 2.3.4. Similarity Measurement Based on the Morphological Characteristics

The difference degree *b*_{i,j} based on the morphological characteristics of *Y*_{i} and *Y*_{j} is defined in equations (13) and (14).

Equation (13) accumulates the differences in morphological characteristics between two profiles. A smaller *b*_{i,j} indicates greater similarity in the morphology of the two profiles.

###### 2.3.5. The Measurement Matrix of Two-Scale Similarity

After normalization, the two matrices [*a*_{i,j}] and [*b*_{i,j}] are denoted as *A* and *B*, respectively. The measurement matrix of two-scale similarity is defined as in (15), where *S* denotes the two-scale similarity matrix. The parameters *α* and *β* are the weighting coefficients of *A* and *B*, respectively. The coefficients can be tuned with a step size of 0.1 to obtain better performance [16], for example, *α* = 0.1 and *β* = 0.9, *α* = 0.2 and *β* = 0.8, and so on.
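The weighting can be sketched as below, assuming that the two-scale matrix is the weighted sum *S* = *αA* + *βB* with *α* + *β* = 1, which is what the example coefficient pairs suggest. The function names are hypothetical.

```python
def two_scale_matrix(A, B, alpha, beta):
    """Element-wise weighted sum of two normalized measurement matrices."""
    return [[alpha * a + beta * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


def weight_grid(step=0.1):
    """Candidate (alpha, beta) pairs with alpha + beta = 1, excluding 0 and 1."""
    n = round(1 / step)
    return [(round(k * step, 10), round(1 - k * step, 10)) for k in range(1, n)]


A = [[0.0, 1.0], [1.0, 0.0]]   # distance-based scale (toy values)
B = [[0.0, 0.5], [0.5, 0.0]]   # morphology-based scale (toy values)
print(two_scale_matrix(A, B, 0.5, 0.5))
print(weight_grid()[:2])
```

A search over `weight_grid()` would evaluate the clustering indicators for each pair and keep the best one.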

###### 2.3.6. Gaussian Kernel

The spectral clustering algorithm utilizes the similarity matrix in the form of a Gaussian kernel, as shown in (16), where *σ* is the parameter of the Gaussian kernel.
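A Gaussian kernel is commonly written as exp(−*s*²/(2*σ*²)). The sketch below uses that common form as an assumption; the exact exponent in (16) may be scaled differently. The default *σ* = 0.15 is the value used later in the case study.

```python
import math


def gaussian_kernel(S, sigma=0.15):
    """Turn a dissimilarity matrix S into affinities in (0, 1]."""
    two_sigma_sq = 2.0 * sigma ** 2
    return [[math.exp(-(s ** 2) / two_sigma_sq) for s in row] for row in S]


G = gaussian_kernel([[0.0, 0.15], [0.15, 0.0]])
print(G)
```

Identical profiles (dissimilarity 0) receive affinity 1, and the affinity decays quickly once the dissimilarity exceeds a few multiples of *σ*.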

###### 2.3.7. The Optimal Number of Clusters

As the number of clusters increases, the precision of clustering performance increases. The correlation coefficient is utilized to evaluate the linear correlation between variables. The iterative threshold is selected according to the range of the Pearson correlation coefficient in [15], so as to determine the optimal number of PEV clusters. The Pearson correlation coefficient of two profiles is given in (17), and the averages of the *i*-th and *j*-th profiles are given in (18).

The minimum threshold *r*_{0} of Pearson correlation coefficient is set according to the requirements of linear correlation [5, 19]. The specific steps are shown in Figure 2. The optimal number of clusters is determined by the process to meet the accuracy requirements of the case study.
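For reference, the Pearson correlation coefficient between two profiles can be computed as in the sketch below. Only the coefficient is shown; the iterative procedure of Figure 2 is not reproduced, and the threshold check at the end is merely an illustrative use of *r*_{0}.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


r0 = 0.6  # example threshold, as used in the case study
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(r, r >= r0)
```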

###### 2.3.8. Specific Steps of Spectral Clustering Algorithm

The spectral clustering algorithm is divided into three steps.

(1) The elements of each column of the Gaussian kernel *G* are summed, and the sums form the diagonal matrix, i.e., the degree matrix *D*.

(2) The Laplacian matrix *L* is obtained by the transformation *L* = *I* − *D*^{−1/2}*GD*^{−1/2}.

(3) The smallest *k* eigenvalues of *L* and the corresponding eigenvectors *V* are found. The eigenvectors *V* are clustered by K-means to obtain the *m*-dimensional column vector, which is the marker result of classification.
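Steps (1) and (2) can be sketched in plain Python as below. Step (3) is deliberately left out: the eigendecomposition and K-means would normally be delegated to a numerical library, and no particular library is assumed here. The function name is hypothetical.

```python
import math


def normalized_laplacian(G):
    """Compute L = I - D^{-1/2} G D^{-1/2} for a symmetric affinity matrix G."""
    n = len(G)
    # Step (1): column sums give the diagonal of the degree matrix D.
    degree = [sum(G[r][c] for r in range(n)) for c in range(n)]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    # Step (2): apply the symmetric normalization entry by entry.
    return [[(1.0 if i == j else 0.0) - inv_sqrt[i] * G[i][j] * inv_sqrt[j]
             for j in range(n)]
            for i in range(n)]


L = normalized_laplacian([[1.0, 0.5], [0.5, 1.0]])
print(L)
```

The sketch assumes every column sum is positive, which holds for a Gaussian kernel because its diagonal entries equal 1.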

##### 2.4. Evaluation of Clustering

The performance of clustering can be evaluated by various criteria including the sum of square error (SSE), Calinski–Harabasz indicator (CHI), and Davies–Bouldin index (DBI) [12]. A new index, maximum dissimilarity (MDS), is defined to evaluate the clustering performance in this paper. MDS reflects the maximum difference degree of all clusters, which is not embodied in other three indexes. The MDS indicator represents the maximum Euclidean distance among elements within a cluster of all *k* clusters. The equation is shown in (19). The smaller the MDS, the better the clustering performance.
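Following the verbal definition of MDS above, a minimal sketch is the largest pairwise Euclidean distance found inside any single cluster. The function name `mds` and the label encoding are assumptions of this example, and the formula in (19) may be normalized differently.

```python
import math
from itertools import combinations


def mds(profiles, labels):
    """Maximum within-cluster pairwise Euclidean distance over all clusters."""
    clusters = {}
    for profile, label in zip(profiles, labels):
        clusters.setdefault(label, []).append(profile)
    worst = 0.0
    for members in clusters.values():
        for p, q in combinations(members, 2):
            worst = max(worst, math.dist(p, q))
    return worst


profiles = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0], [10.0, 11.0]]
print(mds(profiles, [0, 0, 1, 1]))
```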

##### 2.5. Peak Shaving Response of Clustered Charging Piles

In order to adjust the peak load and make full use of the peak shaving potential of the aggregated PEVs, it is important to accurately measure the adjustable load in the peak shaving events [24–26]. The charging power of a single charging pile is limited. The CS can aggregate charging piles to increase the responsive capacity of PEVs. Firstly, the charging piles are classified by the clustering method. Then, the CS selects charging piles to participate in peak shaving according to the peak shaving demand. The objective of air conditioners dispatching in [27] is to meet the load reduction goal as much as possible. With the idea in [27, 28], the power regulation of CS can meet the load reduction requirement by finding the optimal combination of charging piles to control. With the objective of minimum deviation between the demand and the response, the optimal numbers for controlling in each cluster can be obtained according to (20) and (21). The selective control of the corresponding charging piles within a time interval is completed until the *T*-th moment. After that, the CS can effectively utilize the resources of the charging piles to obtain the aggregated adjustable power.

Constraint (21) means that the number of charging piles participating in control within a time interval cannot exceed the total number in corresponding clusters. *R*_{k}(*t*) is the peak shaving demand in the CS at period *t*, and the unit is kW. The charging piles within the CS have *M* clusters. *b*_{m_num} is the total quantity of *m*-th cluster charging piles. *b*_{m}(*t*) is the quantity in *m*-th cluster with suspended state at period *t*. The suspended state is the state of not charging at this period. *C*_{m}(*t*) is the available load reduction of one charging pile in *m*-th cluster, and the unit is kW.
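For a single period, the selection problem can be illustrated by an exhaustive search over the number of suspended piles per cluster. This is a sketch under two assumptions: the deviation is measured as an absolute difference, and brute force is used in place of whatever solver the authors applied. All names and toy numbers are hypothetical.

```python
from itertools import product


def best_suspension(demand, reduction_per_pile, cluster_sizes):
    """Choose b_m in 0..b_m_num minimizing |demand - sum(b_m * C_m)|."""
    best, best_dev = None, float("inf")
    for b in product(*(range(size + 1) for size in cluster_sizes)):
        response = sum(bm * cm for bm, cm in zip(b, reduction_per_pile))
        dev = abs(demand - response)
        if dev < best_dev:
            best, best_dev = b, dev
    return best, best_dev


# Toy case: 10 kW demand, two clusters offering 3 kW and 4 kW per pile.
print(best_suspension(10.0, [3.0, 4.0], [2, 2]))
```

The search space grows as the product of the cluster sizes plus one, so an integer programming solver would be the natural choice for larger stations.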

#### 3. Variation Characteristics of Charging Load

The variation characteristics of the charging load need to be analyzed because of their randomness and discreteness. The transition matrix can describe the migration law of PEVs on the physical level. It is also the main difference between PEV loads and traditional loads. This section studies the variation characteristics based on the actual charging profiles.

The Markov chain is a process that can be described by the conditional probability model, as shown in (22). The probability, *P*_{pq}, shows the possibility of transition from state *p* to state *q*.

First, to make the profiles suitable for analysis, a lower-dimensional discrete representation is obtained by the segment aggregation approximation, as shown in (2). Then, the normalized amplitude is partitioned into *N* intervals, and each representation corresponds to an interval [*s*_{q}, *s*_{q+1}]. Each approximated value is thus mapped to a symbol such as *a, b, c, d*, or *e*. For instance, the intervals of load level can be set as *a* = [0, 0.28), *b* = [0.28, 0.42), *c* = [0.42, 0.52), *d* = [0.52, 0.64), and *e* = [0.64, 1].

Therefore, the *N* different markers map *N* corresponding states. A normalized profile (the sampling interval is 15 min) and its lower-dimension presentation (the sampling interval is 1 h) are shown in Figure 3. As the number of states increases, the difference between the symbol sequence and the original data decreases. However, the more states it has, the larger the scale of the transition matrix is and the greater the sparsity is. For example, assume that the profile’s amplitude is partitioned into five states, which are represented as “*bbaaab babcdc ebceec ddddb*,” as shown in Figure 3.
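The mapping from normalized amplitudes to symbols can be sketched with a sorted breakpoint list. The breakpoints below are the example intervals given above; the function name `symbolize` is an assumption of this sketch.

```python
from bisect import bisect_right

# Upper edges of states a-d; state e covers [0.64, 1].
BREAKPOINTS = [0.28, 0.42, 0.52, 0.64]


def symbolize(x, breakpoints=BREAKPOINTS, symbols="abcde"):
    """Map each normalized value to the symbol of its left-closed interval."""
    return "".join(symbols[bisect_right(breakpoints, v)] for v in x)


print(symbolize([0.10, 0.28, 0.45, 0.60, 0.90]))
```

Using `bisect_right` places a value that equals a breakpoint in the higher state, which matches the left-closed intervals such as *b* = [0.28, 0.42).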

The Markov model is utilized to simulate the variation characteristics of the states in adjacent time intervals. At period *t*, the next-state transition quantity matrix *F*^{t} is built, as shown in (23); each element of *F*^{t} is the number of transitions from state *i* to state *j* at period *t*. As shown in (24), the corresponding element of the state transition probability matrix *P*^{t} is the probability that state *i* transfers to state *j* from period *t* to period *t* + 1, with the remaining terms defined in (25).
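A sketch of the period-wise counting is given below, assuming that each day is one symbol sequence and that row normalization turns the counts into probabilities (rows with no observations are left as zeros). The function name and the toy sequences are hypothetical.

```python
def transition_matrices(sequences, states):
    """Return (F, P): per-period transition counts and row-normalized probabilities."""
    index = {s: k for k, s in enumerate(states)}
    n = len(states)
    periods = len(sequences[0]) - 1
    F = [[[0] * n for _ in range(n)] for _ in range(periods)]
    for seq in sequences:
        for t in range(periods):
            F[t][index[seq[t]]][index[seq[t + 1]]] += 1
    P = []
    for Ft in F:
        Pt = []
        for row in Ft:
            total = sum(row)
            Pt.append([c / total if total else 0.0 for c in row])
        P.append(Pt)
    return F, P


F, P = transition_matrices(["ab", "ab", "aa", "bb"], "ab")
print(F[0], P[0])
```

With 56 daily sequences, each *F*^{t} would sum to 56, which is why a large number of states makes the matrices sparse.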

#### 4. Case Studies

The parameter settings of the case study are stated below. The PEV charging model in [29] is selected. The load data of 100 CSs are generated by combining the spatial and temporal distribution model [30]. The actual data of charging piles in one CS of Shanghai are used for the clustering process. It should be noted that the actual 56-day charging profiles of 4 charging piles are studied in the paper. Under the condition of limited information, these profiles can be regarded as the charging profiles of 56 charging units (CUs) in a CS. The two-scale weight parameters are *α* = 0.5 and *β* = 0.5 in model data. Additionally, *α* = 0.7 and *β* = 0.3 in actual data. The Gaussian kernel parameter is *σ* = 0.15. By setting the iteration threshold *r*_{0} = 0.6, these profiles of model data in each cluster have a strong positive correlation, so the optimal number of clusters is 6. Then, the charging data of charging piles in a CS are classified for the charging strategies of the CS. Finally, the variation characteristics of one CS are analyzed based on actual charging profiles. In this work, peak shaving applications based on the clustering and the Markov variation characteristics are proposed.

##### 4.1. Evaluation of Clustering Based on Charging Model

Indicators such as SSE, CHI, DBI, and MDS are utilized to evaluate the performance of the clustering algorithm and are shown in Figure 4.

Smaller SSE, larger CHI, smaller DBI, and smaller MDS indicate better clustering performance [16]. Figure 4 shows the inflection point of SSE at 6 clusters, and SSE decreases slowly as the number of clusters *k* increases. DBI and CHI reach their minimum and maximum values, respectively, when the number of clusters is 6, and MDS decreases more slowly once the number of clusters exceeds 6. Based on these results, the classification number is determined as 6. Notably, DBI decreases as the number of clusters approaches the optimal number. Once the number of clusters exceeds the optimal number, elements within a cluster are divided into more clusters and, by the definition of DBI [16], DBI rises. The choice of 6 as the optimal number of clusters is also verified by the Pearson correlation coefficient [5], as shown in Table 2.

##### 4.2. Clustering Based on Charging Model

Based on the spatial clustering results, 6 typical profiles can be obtained by calculating the mean value of the corresponding profiles in each cluster. The typical profiles after Gaussian filtering are shown in Figure 5. The reference load *P*_{based_CSs} = 1000 kW.

##### 4.3. Comparison of Algorithm Performance Based on Charging Model

The stability of the proposed algorithm is tested through multiple-run experiments. The number of clusters is fixed at 6. The clustering results of the proposed algorithm are compared with those of the K-means and mean shift algorithms over 8 repeated experiments, as shown in Figure 6.

Comparing the results in Figure 6, it can be seen intuitively that the K-means and mean shift algorithms have a dissimilarity rate of more than 50%. However, the consistency of the algorithm proposed in this paper reaches 100%. It is clear that the consistency of the proposed clustering algorithm is better than that of the K-means and mean shift algorithms.

The clustering results of the different methods are evaluated with the indicators. Smaller SSE, larger CHI, smaller DBI, and smaller MDS indicate better clustering performance [16]. The indicators of charging model data by different clustering methods are shown in Table 3, which also lists the execution time. The clustering method in this paper outperforms the other two methods in the four indicators. The execution time of the proposed method remains acceptable despite the added distance compensation of extreme points. Therefore, the proposed clustering method is superior to the other two methods.

##### 4.4. Clustering Results Based on Actual PEV Data

Based on the actual charging profiles, the number of clusters is 8 with the proposed clustering algorithm. Four typical clustering results are shown in Figure 7 for brevity. The reference load *P*_{based_piles} = 80.2 kW.

Due to the strong volatility of the PEV charging profiles, the clustering algorithm must be robust. Four experiments are set. The clustering stability of the K-means and mean shift algorithms is discussed in Experiments 1 and 2, respectively. In Experiment 3, the two similarity scales of spectral clustering are the Euclidean distance and the morphological characteristics; the consistency of its clustering results is less than 50%. Experiment 4 additionally takes into account the offset distance compensation of extreme points, and its clustering consistency reaches 100%. The robustness analysis is shown in Figure 8. In addition, the computation efficiency of the K-means algorithm is low: K-means is prone to long run times when the centers of clusters are difficult to find. In summary, the proposed algorithm is superior to the K-means and mean shift algorithms in terms of efficiency and stability.

The performance of clustering for Experiment 3 and Experiment 4 is shown in Figures 9 and 10. SSE of Experiment 3 fluctuates significantly, which is unfavorable for determining the optimal number of clusters. The index of Experiment 4 proves that the optimal number of clusters is 8, and Pearson’s correlation coefficient is 0.6064. The above simulation results prove the effectiveness of the proposed clustering algorithm.

The indicators based on actual charging data by different clustering methods are shown in Table 4. The clustering method in this paper outperforms the other two methods in four indicators. The time of proposed method is comparatively long due to the distance compensation of extreme points. Therefore, the performance of the proposed clustering method is better than the other two methods especially for the actual charging data with high variability.

##### 4.5. Clustering for the Application of Peak Shaving Service

Firstly, the peak shaving task is allocated to the CSs depending on the total peak shaving demand. Then, the corresponding peak shaving product of each CS is generated through the clustering of charging piles within the station. The peak shaving potential of the 6 CS clusters is shown in Figure 5. The quantity of CSs in each cluster is [9, 23, 30, 14, 15, 9]. With the objective of minimizing the deviation between the peak shaving demand and the response capacity, the quantity of CSs that participate in the peak shaving service, A_cs = [*a*_{1}, *a*_{2}, …, *a*_{6}], can be obtained, as shown in (26)–(28), where *R* = [*R*(1), *R*(2), …, *R*(*T*)] is the total peak shaving demand, with *R*(*t*) the demand at period *t*, and the unit is kW. *CS*_{k}(*t*) is the peak shaving potential of the *k*-th cluster CS at period *t*, and the unit is kW. *T* is the period of peak shaving service. Constraint (27) means that the peak shaving potential of the CSs participating in peak shaving must be greater than the total peak shaving demand within the peak shaving period. The allocated peak shaving demand *R*_{k}(*t*) of one CS in the *k*-th cluster is obtained by the dispatch coefficient *b*_{k}, as shown in (29) and (30) [31]. *E*_{obj} is the total peak shaving demand. *E*_{k} is the potential peak shaving capacity of one CS in the *k*-th cluster. Δ*t* is the time interval.

Assume that the total peak shaving demand of 800 kW is required during 12–18 periods. The quantity of CSs in the auxiliary service is A_cs = [0, 1, 7, 1, 0, 0].

Under the limited information, assume that the allocated demand of one CS is 200 kW during 12–18 periods. The adjustable power of aggregated charging piles is obtained by uniformly adjusting the charging status of the CUs’ clusters. In the CS, one CU consists of 4 normal charging piles. 8 CUs’ clusters are shown in the above section, and the quantity is [17, 12, 7, 5, 5, 4, 4, 2]. With the objective of minimum deviation between the allocated demand and the response, the charging strategies can be obtained according to (31) and (32). *R*_{k}(*t*) is the allocated demand in the CS at period *t*. *b*_{m}(*t*) is the quantity in *m*-th CU cluster with suspended state at period *t*. *C*_{m}(*t*) is the available load reduction of one CU in *m*-th CU cluster, and *m* = 1, 2, …, 8.

The minimum interval of the actual data points is 15 min, which can be set as the minimum time interval of changing the charging piles groups to control. Therefore, different scenarios are set as scene 1, scene 2, and scene 3 corresponding to the time interval 15 min, 30 min, and 1 h. Controllable quantity matrix of charging piles can be represented by B_pile = [*b*_{m}(*t*)]. The deviation between the allocated demand reduction and the response reduction is small by the suspended charging strategies as shown in Figure 11. In other words, the control strategy *b*_{m}(*t*) is effective for the objective in (31). In addition, only the sum of the controllable quantity at period *t* is shown for brevity.

The comparisons of three different scenarios are shown in Table 5. The smaller the change interval of charging pile control strategy, the smaller the deviation between the objective demand and the actual response. However, the corresponding controllable quantity matrix will be more complex. Additionally, the execution time of simulation case also increases with the decrease of control strategy change interval.

Two tasks are completed in the above discussion: the selection of the appropriate CS clusters for the peak shaving service and the design of the charging strategies in one CS. The following variation analysis of adjustable reliability identifies the appropriate CS for load curtailment.

##### 4.6. Transition Probability Matrix of Charging Load of the CS

This section studies the variation characteristics of the charging load based on the Markov model. The dataset consists of the charging profiles of 56 days in a CS of Shanghai. After the experimental analysis, the actual charging load data have smaller value in the periods 0:00–9:00. Therefore, one day is partitioned into two large time durations: (0:00–9:00) and (9:00–24:00).

The number of various load levels during the periods 9:00–24:00 is counted, so as to obtain the probability density distribution and cumulative probability distribution, as shown in Figures 12 and 13, respectively. From the analysis results, the period of 9:00–24:00 has a strong peak shaving capacity.

The expressions of the probability density function (PDF) and cumulative distribution function (CDF) of the versatile distribution [32] are shown in (33) and (34), respectively.

The *nlinfit* function in MATLAB is utilized to fit the actual probability distribution to obtain the shape parameters of Versatile-PDF and Versatile-CDF [32] as shown in Table 6.

According to the statistics of load levels shown in Figure 13, the states of the Markov model are set with equal probability. In order to determine the states’ number of the symbol sequence, the numbers of states from 1 to 14 are set, respectively. The average error between the original data and the symbol sequence corresponding to each case is calculated. Figure 14 shows the average error of the whole day (full periods) and the segment duration (periods 9–24).

The average error decreases as the number of states increases. It can be seen that the average error of time periods 9–24 is more stable than the whole time periods and both error curves converge. Considering the average error of period 9–24, when the number of states is greater than 5, the average error decreases slowly. Considering the lack of state information and the scale of the transition matrix, the number of Markov states during the period 9–24 is set to 5. The corresponding average error is 0.0494.

The amplitude of the charging load is partitioned into 5 states (*abcde*). Accordingly, there are 4 breakpoints corresponding to 20%, 40%, 60%, and 80% of the CDF. The corresponding breakpoints are found in Versatile-CDF in Figure 13, which are (0.28, 0.20), (0.42, 0.40), (0.52, 0.60), and (0.64, 0.80). Therefore, the intervals of load level corresponding to state 1 to state 5 are [0, 0.28), [0.28, 0.42), [0.42, 0.52), [0.52, 0.64), and [0.64, 1], respectively. Then, the next-state transition probability matrix at each period is calculated, respectively. After that, the Markov chain of charging load can be obtained for the analysis of variation characteristics.

##### 4.7. Load Curtailment Applications of the Transition Probability Matrix

The precision of the adjustable amount of charging load is the key reliability index in peak shaving services. Generally, the charging load of a CS with less variability is easier to predict. Therefore, CSs with less variability and higher consumption levels are suitable for direct load control (DLC), under which the corresponding amount of charging load can be controlled precisely. Assume that a peak shaving demand of 200 kW is allocated during periods 12–18. Type 1 CS and type 2 CS are selected for the peak shaving service by the clustering process. Their load curtailment services during periods 12–18 are shown in Figure 15.

The positive deviation of type 1 CS indicates that its capacity is sufficient to participate in the peak shaving service. In contrast, the amount of adjustable load in type 2 CS is highly uncertain because of its large volatility, as shown in Figure 15. Therefore, type 1 CS is more suitable for the peak shaving service.

To quantitatively evaluate the variation characteristics of different types of CSs, Shannon entropy is introduced [33]. The transition probability matrix *P*^{t} in (24) is used in (35) to calculate the entropy. The entropy values of the two types of CSs for periods 12–18 are shown in Table 7.

Smaller entropy generally indicates less variability. In practice, the PEV aggregator can set an entropy threshold. An adjustable PEV load whose entropy is within the threshold is suitable for the peak shaving service, whereas a highly volatile PEV load whose entropy exceeds the threshold is not. In the latter case, the load participating in the curtailment service needs to be reselected.
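The entropy screening can be sketched as follows. This is a minimal illustration under stated assumptions: the entropy is taken as the Shannon sum over all nonzero entries of each transition matrix, per-period values are added over the service window, and the logarithm base is 2. The function names, the two-state matrices, and the threshold value are hypothetical and do not come from Table 7.

```python
import math


def transition_entropy(matrix, base=2.0):
    """Shannon entropy summed over every nonzero entry of one transition matrix."""
    return -sum(p * math.log(p, base) for row in matrix for p in row if p > 0.0)


def window_entropy(matrices, base=2.0):
    """Assumed aggregation: add the per-period entropies over a window of periods."""
    return sum(transition_entropy(m, base) for m in matrices)


def eligible_stations(entropy_by_station, threshold):
    """Stations whose entropy does not exceed the aggregator's threshold."""
    return sorted(name for name, h in entropy_by_station.items() if h <= threshold)


# Hypothetical two-state matrices for two periods; not data from the paper.
steady = [[[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]]
volatile = [[[0.5, 0.5], [0.5, 0.5]], [[0.6, 0.4], [0.3, 0.7]]]

entropies = {
    "type 1 CS": window_entropy(steady),
    "type 2 CS": window_entropy(volatile),
}

# The threshold is a free parameter chosen by the aggregator (hypothetical value).
selected = eligible_stations(entropies, threshold=2.0)
```

A matrix whose rows are concentrated on a single next state contributes almost no entropy, while rows spread evenly over several states contribute the most, which matches the reading that smaller entropy means less variability.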

#### 5. Conclusion

This work presents a novel method for load curtailment applications based on clustering and the Markov model of charging profiles. The main conclusions are as follows.

(1) The proposed clustering method, which considers the offset compensation of extreme points, is more robust, especially for actual charging profiles with high variability. In the distance measurement process, the offset compensation of extreme points ensures that the trajectory bias is comparatively small. The clustering method in this paper outperforms the other two methods in four indicators.

(2) With the objective of offering load curtailment services, the CS can find the optimal combination of charging piles to control. The smaller the change interval of the charging pile control strategy, the smaller the deviation between the objective demand and the actual response. However, the corresponding controllable quantity matrix becomes more complex as the deviation becomes smaller.

(3) Considering the large volatility of charging profiles, the entropy from the Markov model is used to evaluate the reliability of a CS participating in the capacity configuration. When the entropy of a charging load exceeds the limit, the charging load is not suitable for the peak load shaving service.

The controllability of aggregated PEVs will play a key role in the flexible adjustment and matching of supply and demand in power systems with a high proportion of renewable energy. This paper provides technical support for the participation of PEVs in DR programs. The next step is to optimize the global charging control on the basis of this work and to realize the coordinated control of large-scale PEVs.

#### Abbreviations

| Symbol | Description |
| --- | --- |
| Z_{i} | The original charging data of i-th charging pile |
| r | The Pearson correlation coefficient |
| z_{i,t} | The charging power of i-th charging pile at time t |
|  | The average of i-th profile |
| X_{i} | The profile for the clustering of i-th charging pile |
|  | The average of j-th profile |
| x_{i,t} | The charging data of i-th profile at time t |
| D | The degree matrix |
| X_{diff,i} | The first-order difference values of i-th profile |
| L | The Laplacian matrix |
| x_{diff,i,t} | The first-order difference value of i-th profile at time t |
| SSE | Sum of square error |
| y_{i,t} | The morphological characteristic index of i-th profile at time t |
| CHI | Calinski–Harabasz indicator |
| x_{min} | The minimum record of charging data |
| DBI | Davies–Bouldin index |
| x_{max} | The maximum record of charging data |
| MDS | Maximum dissimilarity |
| Norm(x_{i,t}) | The normalized electricity consumption of i-th profile at time t |
| P^{t} | The state transition probability matrix |
| P_{i} | The set of local maximum points of i-th profile |
| P_{pq} | The probability of transition from state p to state q |
| Q_{i} | The set of local minimum points of i-th profile |
| F^{t} | The next-state transition quantity matrix |
|  | The k-th label of the local maximum points in the i-th profile |
| R_{k}(t) | The peak shaving demand of one CS in k-th cluster at period t |
|  | The k-th label of the local minimum points in the i-th profile |
| b_{m_num} | The quantity of m-th cluster charging piles |
| d_{i,j} | The Euclidean distance of i-th and j-th profiles |
| b_{m}(t) | The quantity in m-th cluster charging piles with suspended state at period t |
| d( ) | The offset distance of point and |
| C_{m}(t) | The available load reduction of one charging pile in m-th cluster |
| dd_{ij} | The distance of i-th and j-th profiles considering the offset distance compensation of the extreme points |
| CU | The charging unit |
| a_{i,j} | The distance measurement of i-th and j-th profiles |
| A_cs | The quantity matrix of CSs that participates in the peak shaving service |
| Y_{i} | The morphological characteristic index of i-th profile |
| CS_{k}(t) | The peak shaving potential of k-th cluster CS at period t |
| b_{i,j} | The difference degree based on the morphological characteristic of i-th and j-th profiles |
| E_{obj} | The total peak shaving demand |
| S | The two-scale similarity matrix |
| E_{k} | The potential peak shaving capacity of one CS in k-th cluster |
| A | The similarity matrix of distance measurement |
| B_pile | The control matrix of charging piles |
| B | The similarity matrix of morphological measurement |
| PDF | The probability density function |
| α | The weighting coefficient of A |
| CDF | The cumulative distribution function |
| β | The weighting coefficient of B |
| DLC | The direct load control |
| σ | The parameter of Gaussian kernel |
| α_{1}, β_{1}, γ_{1} | The shape parameters of Versatile-PDF and Versatile-CDF |
| G | The similarity matrix in the form of Gaussian kernel |

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This study was supported by the National Natural Science Foundation of China (51977128).