Abstract
In order to extract more information that affects customer arrears behavior, a feature extraction method is used to extend low-dimensional features to high-dimensional features for the warning problem of the user arrears risk model of electric charge recovery (ECR). However, the data contain many irrelevant or redundant features, which degrade prediction accuracy. In order to reduce the feature dimension and improve the prediction results, an improved hybrid feature selection algorithm is proposed that integrates nonlinear inertia weight binary particle swarm optimization with shrinking encircling and exploration mechanism (NBPSO-SEE) with sequential backward selection (SBS), namely, NBPSO-SEE-SBS, for selecting the optimal feature subset. NBPSO-SEE-SBS not only effectively removes redundant or irrelevant features from the feature subset selected by NBPSO-SEE but also improves classification accuracy. The experimental results show that, compared with one state-of-the-art optimization algorithm and seven well-known wrapper-based feature selection approaches, the proposed NBPSO-SEE-SBS effectively eliminates a large number of redundant features and stably improves the prediction results with low execution time for the risk prediction of ECR for power customers.
1. Introduction
With the rapid development of the global energy market, smart grids [1] have been built continuously in the power industry, and the scale of information data accumulated by power systems is growing ever larger. As the main income of power companies, the electric tariff plays a decisive role in the development of power enterprises. However, throughout the power marketing process, the risk posed by users in arrears has always existed, which hinders the development of power enterprises. Electric charge recovery (ECR) has always been a difficult problem that power supply enterprises urgently need to solve. It is also the most important management part of power meter reading, verification, and checking. Once the processing of ECR is delayed, it may adversely affect the charging result. It also causes power customers to occupy the funds of power enterprises, which is not conducive to the fund management of power companies. The main operating profit of power grid enterprises comes from ECR. Through the analysis of power clients' payment behavior and customer characteristics, the risk factors and risk levels of ECR are predicted and evaluated, which contributes to collecting electric charges in time, formulating effective preventive measures, reducing management risks, and safeguarding the economic benefits of power enterprises. Therefore, accurate and reliable forecasting of arrears risk is an important reference for determining the management of ECR.
In the past, researchers adopted methods such as feature extraction [2–4], information entropy theory [5], support vector machines (SVM) combined with the VIKOR method [6], artificial immune algorithms [7], and trust evaluation clouds combined with decision algorithms [8] to establish power arrears risk models, but the results predicted by these methods were not very good. In recent years, some scholars have used quantitative analysis [9], classification [10], clustering [11], ensembles [12], and improved algorithms [13, 14] to analyse the arrears behavior of electricity users and have improved prediction results. However, with the increase in huge quantities of data and the difficulty in capital operation, power companies urgently need faster and more accurate data processing methods to predict the future arrears risk of power users.
In order to extract more information that affects customer arrears behavior, low-dimensional features are extended to high-dimensional features via the feature extraction method. However, many irrelevant and redundant features reduce classification accuracy and raise the dimensional complexity. Therefore, feature selection is a very effective solution [15]. Feature selection and classification methods are widely used on high-dimensional and multiclass data sets [16, 17] and can improve the accuracy of model prediction by removing irrelevant and redundant features. Generally, the feature selection process includes the following stages: selecting feature subsets, evaluating feature subsets, and verifying results. The purpose of this process is to remove unrelated or redundant features and generate a smaller optimal feature subset. Feature selection methods are commonly divided into filter, wrapper, and embedded methods [18]. Filter algorithms use the inherent distribution characteristics of the data as the basis for feature selection, and their selection process is independent of the learner. Filter approaches can be further divided into single-variable and multivariate algorithms [19]. Single-variable algorithms evaluate each feature individually, ignoring feature interactions, which can reduce classification accuracy, whereas multivariate algorithms have the advantage of evaluating the correlation between features. Filter algorithms are independent of the classifier and compute quickly, but there is no interaction between the classification algorithm and the features during feature selection. Wrapper approaches rely on classification algorithms to evaluate the selected feature subsets, which can achieve higher classification accuracy than filter methods [20].
However, wrapper algorithms have high computational complexity for feature selection on high-dimensional data sets. In embedded approaches, feature selection is directly integrated into the training process of the learner [21], but embedded methods are conceptually more complex because it is not easy to obtain better classification results from an improved classification model. In contrast, wrapper approaches can use the performance of a machine learning algorithm as the evaluation criterion for selecting features, making them more flexible and more effective for analyzing high-dimensional data. In recent years, wrapper approaches have attracted much attention by solving feature selection problems and seeking global optimal solutions through heuristic algorithms.
In recent years, many researchers have successfully applied wrapper approaches to feature selection. Li and Yang [22] integrated the OS-extreme learning machine (EOS-ELM) and binary Jaya-based feature selection for real-time transient stability assessment using phasor measurement unit (PMU) data. Yang et al. [23] proposed a novel binary Jaya optimization algorithm, integrated with the lambda iteration method, to transform the dual objectives of economy and emission commitment into a single-objective problem. Aslan et al. [24] proposed the JayaX binary optimization algorithm, replacing Jaya's solution update rules with the XOR operator, and compared the results with the latest algorithms, showing that it can produce better-quality results for binary optimization problems. The whale optimization algorithm (WOA) [25, 26] was proposed, using the wrapper-based method to reach the optimal subset of features and effectively improve classification accuracy. Houssein et al. [27] proposed an S-shaped binary whale optimization algorithm for feature selection. Tawhid and Ibrahim [28] used a feature selection method based on the binary whale optimization algorithm (BWOA) to solve the feature selection problem. Rao et al. [29] used the artificial bee colony approach (ABC) based on a boosting decision tree model to improve the quality of the selected features; the method accelerates convergence and balances exploration and exploitation efficiently. Furthermore, the binary artificial bee colony algorithm (BABC) [30, 31] has been used to select feature subsets. S. Oreski and G. Oreski [32] used a genetic algorithm-based heuristic for feature selection in credit risk assessment, in which the genetic algorithm (GA) applies a neural network model to select the optimal subset of features and improves classification results. Chen et al. [33] introduced a heuristic feature selection approach into text categorization using chaos optimization and GA. Shukla et al.
[34] proposed a new hybrid feature subset selection framework based on a binary genetic algorithm and information theory, which accelerates the search for important feature subsets. Emary et al. [35] used binary grey wolf optimization (BGWO) approaches to select feature subsets. Al-Tashi et al. [36] introduced a GWO-based feature selection method for coronary artery disease classification. Several studies have used feature selection methods based on the particle swarm optimization (PSO) algorithm to search the feature space for optimal solutions [37–39]. PSO performs equally well or better than WOA, ABC, GA, and GWO in solving global optimization problems [40]. Therefore, PSO is a promising approach for many tasks, including feature selection.
Kennedy and Eberhart proposed the particle swarm optimization (PSO) algorithm in 1995 [41]. PSO has been widely applied in many fields due to its superior performance, such as neural network training [42], classifier design [43], clustering analysis [44], and network community detection [45]. Since many practical problems are discrete, Kennedy and Eberhart proposed the binary particle swarm optimization (BPSO) algorithm in 1997 [46], which uses a binary encoding to solve discrete optimal combination problems. BPSO has also been widely used in many fields, such as knapsack problems [46–48], power systems [49, 50], data mining [51–55], and image processing [56]. Because the search range of the particles cannot be dynamically adjusted, BPSO easily falls into local optima and premature convergence as population diversity declines. In view of these shortcomings, many scholars have proposed improved methods. First, Chuang and other scholars used the chaotic binary particle swarm optimization (CBPSO) algorithm [57–59], applying chaotic mapping to the inertia weight to improve the performance of BPSO; however, chaotic mapping does not settle at fixed points or on periodic or quasiperiodic orbits. Liu et al. [60] proposed BPSO with a linear adaptive inertia weight, which improved the search performance of the particles, but it often overlooks nearby optimal solutions during the linear transformation of the inertia weight. Wu et al. [61] proposed a feature selection algorithm based on hybrid improved binary quantum particle swarm optimization (HIBQPSO). The proposed HIBQPSO method introduces crossover and mutation strategies and effectively and efficiently improves classification accuracy, compared with other feature selection approaches on nine gene expression datasets and 36 UCI datasets.
The sequential backward selection (SBS) [62] is a heuristic search algorithm that is simple to implement, but its computational cost is greatly affected by the size of the initial feature set.
The traditional BWOA, BABC, BGA, BGWO, BPSO, and CBPSO algorithms have simple structures and few parameters. These algorithms are proved effective by using a binary mechanism to select feature subsets, but they find it difficult to escape local optima. Among state-of-the-art metaheuristic optimization algorithms, the HIBQPSO algorithm enhances diversity during the search and can effectively find the optimal feature subset, but it lacks stability in balancing exploration and exploitation when searching for the global optimal solution. In order to effectively balance exploration and exploitation during the search, a hybrid nonlinear inertia weight binary particle swarm optimization with shrinking encircling and exploration mechanism is proposed, and the SBS method is then introduced, yielding NBPSO-SEE-SBS, for solving feature selection tasks. NBPSO-SEE-SBS can effectively reduce the number of features while maintaining the best classification performance. In order to prove the effectiveness and superiority of NBPSO-SEE-SBS, two groups of comparative experiments are set up, using the logistic regression method, to realize the risk prediction of ECR for power customers in June, July, and August 2018. NBPSO-SEE-SBS can significantly reduce the feature dimension, improve classification accuracy, and effectively enhance the global search capability. The main contributions of this study are summarized as follows. Firstly, the proposed algorithm achieves a balance between local search and global search by nonlinearly updating the inertia weight and enhances the diversity of particles when searching for optimal solutions. Secondly, two dynamic contraction factors are introduced into the velocity and position updates, which not only effectively enhance the inheritance ability, self-recognition ability, and social cognitive ability of the particles but also improve the quality of the particle positions.
Furthermore, a novel position updating approach is proposed to escape local optima, and shrinking encircling and exploration mechanisms are introduced. Finally, the SBS algorithm is used to remove individual redundant features separately and help find potential optimal solutions.
2. Methodology
2.1. Binary Particle Swarm Optimization
PSO is a random search algorithm based on group cooperation, developed by simulating the foraging behavior of birds [41]. Assume that the dimension of the target search space is D and the size of the population is m. Each X_i = (x_{i1}, x_{i2}, \ldots, x_{iD}) is a D-dimensional vector representing the i^{th} particle, i = 1, 2, \ldots, m, where m is the number of particles. X_i is the position of particle i in the target search space, and V_i = (v_{i1}, v_{i2}, \ldots, v_{iD}) is the velocity determining the direction and distance of each particle's flight in dimension d. P_i and P_g are the optimal position found by the i^{th} particle and the optimal position found by the whole population, respectively. The velocity and position of the particles are updated by the following equations, respectively:

v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 (p_{id} - x_{id}^{t}) + c_2 r_2 (p_{gd} - x_{id}^{t}), (1)

x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}, (2)

where i = 1, 2, \ldots, m and d = 1, 2, \ldots, D. c_1 and c_2 are learning factors, r_1 and r_2 are random numbers in the range [0, 1], and t is the number of iterations.
The PSO is only suitable for continuous function solving, so Kennedy and Eberhart proposed BPSO [46] based on a binary encoding. In BPSO, the position of each particle is represented by a binary string, while the velocity vector remains continuous. The positions of the particles are updated according to the following equation [63]:

x_{id}^{t+1} = 1 if rand < S(v_{id}^{t+1}), and x_{id}^{t+1} = 0 otherwise, (3)

where S(v_{id}^{t+1}) = 1 / (1 + e^{-v_{id}^{t+1}}) is the sigmoid transfer function and rand is a random number uniformly distributed in [0, 1].
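The BPSO step described above, a continuous velocity update followed by the sigmoid transfer and a stochastic binary position update, can be sketched as follows; the function and parameter names are illustrative, not the paper's implementation.

```python
import math
import random

def bpso_update(position, velocity, pbest, gbest, w=0.9, c1=2.0, c2=2.0):
    """One BPSO step per equations (1) and (3): update the continuous
    velocity, pass it through the sigmoid, and draw a binary position."""
    new_pos, new_vel = [], []
    for d in range(len(position)):
        r1, r2 = random.random(), random.random()
        v = (w * velocity[d]
             + c1 * r1 * (pbest[d] - position[d])
             + c2 * r2 * (gbest[d] - position[d]))
        s = 1.0 / (1.0 + math.exp(-v))          # sigmoid transfer function
        new_pos.append(1 if random.random() < s else 0)
        new_vel.append(v)
    return new_pos, new_vel

pos, vel = bpso_update([0, 1, 0, 1], [0.0] * 4, [1, 1, 0, 0], [1, 0, 0, 1])
```

Each position bit indicates whether the corresponding feature is selected, so a particle encodes one candidate feature subset.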
2.2. Fitness Function
The purpose of feature selection is to obtain the best classification results with the fewest features. The fitness function is shown in the following equation:

fitness = (F1_{high} + F1_{medium} + F1_{low}) / 3. (4)

The fitness value is the average F1-measure of predicting the high-risk, medium-risk, and low-risk levels of customer ECR, and its range is [0, 1]. In equation (4), F1_{high}, F1_{medium}, and F1_{low} represent the F1-measure values of predicting the high-risk, medium-risk, and low-risk levels of customer ECR, respectively. In order to evaluate the performance of the model objectively, this paper introduces four evaluation criteria: accuracy, precision, recall, and F1-measure. The definitions are shown in the following equations:

Accuracy = (TP + TN) / (TP + TN + FP + FN), (5)

Precision = TP / (TP + FP), (6)

Recall = TP / (TP + FN), (7)

F1-measure = 2 \times Precision \times Recall / (Precision + Recall). (8)
TP, FP, FN, and TN represent true positive, false positive, false negative, and true negative, respectively, in equations (5)–(8). In theory, the higher the values of accuracy, precision, recall, and F1-measure, the higher the fitness value and the better the predictive performance of the model.
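The criteria in equations (4)–(8) follow directly from the confusion-matrix counts; a minimal sketch with illustrative counts:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-measure per equations (5)-(8)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

def fitness(f1_high, f1_medium, f1_low):
    """Fitness per equation (4): the mean F1 over the three risk levels."""
    return (f1_high + f1_medium + f1_low) / 3.0

acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

In a multiclass setting, the counts are taken per risk level (one-vs-rest) before averaging the three F1 values.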
3. Improved Hybrid Feature Selection Algorithm Based on NBPSO-SEE-SBS
The framework for the risk prediction of ECR includes three processes: data preprocessing, selecting the optimal feature subset based on NBPSO-SEE-SBS, and classification and evaluation using the logistic regression method. The framework for feature selection of ECR risk based on the NBPSO-SEE-SBS algorithm is shown in Figure 1.
3.1. Data Preprocessing
The feature dimension of the original power data is low, which cannot adequately express the arrears behavior of power users. To solve this problem, the data are expanded from low-dimensional features to high-dimensional features by feature extraction. From the electricity data covering 21 consecutive months, the features of the training set and test set are extracted separately: the training set contains the power data of 20 months, and the test set contains the power data of one month. For the data of each month, firstly, the categorical features are transformed by one-hot encoding; secondly, the characteristics of the previous six months are appended to the encoded data, and their maximum, minimum, average, median, variance, and standard deviation are calculated; finally, the original 34-dimensional features are extended to 748 dimensions.
The processing of feature extraction is shown in Algorithm 1, where m is the total number of months, each month's power data is processed in turn, and dataSet is the data set obtained after the features have been extracted.
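As an illustration of the per-feature statistics derived from a six-month history, a standard-library sketch follows; the function name and the dictionary keys are placeholders, not the paper's Algorithm 1.

```python
import statistics

def extend_feature(history):
    """Given one feature's values over the previous six months (most recent
    last), derive the statistical features described above."""
    return {
        "max": max(history),
        "min": min(history),
        "mean": statistics.mean(history),
        "median": statistics.median(history),
        "variance": statistics.pvariance(history),
        "std": statistics.pstdev(history),
    }

# e.g. six months of a payment-timeliness-rate feature
stats = extend_feature([0.9, 0.8, 1.0, 0.7, 0.95, 0.85])
```

Applying this expansion to each of the 26 monthly consumption features, together with one-hot encoding of the category features, yields the 748-dimensional representation.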

After all the features have been calculated, the min-max standardization method is used to transform the features and map the values into [0, 1]. The min-max standardization function is shown in the following equation:

x' = (x - x_{min}) / (x_{max} - x_{min}), (9)

where x is the feature value to be transformed, and x_{max} and x_{min} are the maximum and minimum of each feature, respectively.
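A minimal sketch of the min-max transformation of equation (9); the fallback for a constant feature is an added assumption, not taken from the paper.

```python
def min_max_scale(values):
    """Map one feature's values into [0, 1] per equation (9)."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant feature: assumption, map to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 30, 40])
```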
The expansion of features can reflect the information of users' historical electricity consumption behavior, as shown in Table 1 (only one of the features is taken as an example because there are too many features).
These 26 features of electricity consumption behavior in the current month are expanded by this method. Then, the statistical features such as the maximum value, minimum value, mean value, variance, and standard deviation of these historical electricity consumption features are calculated. Take the expansion of "jfjsl" (payment timeliness rate) as an example; the specific statistical analysis is shown in Table 2 (only one of the features is taken as an example).
In addition, the one-hot encoding method is used to transform the category features into numerical features. Finally, the original 34-dimensional features are extended to 748 dimensions.
4. Improved NBPSO-SEE Algorithm
The inertia weight \omega is a very important adjustable parameter of the BPSO algorithm, and its value plays an important role in the performance of the algorithm. A small value of \omega is convenient for local search of the current search area: a more accurate solution can be obtained, facilitating the convergence of the algorithm, but it is not easy to jump out of local extremum points. A large value of \omega is convenient for global search, but it is not easy to obtain an accurate solution. Literature [60] shows that linear optimization of the inertia weight can improve the performance of the algorithm, but this strategy cannot effectively match the optimization process of the algorithm. Therefore, in order to follow the actual evolutionary state of the algorithm more closely, this paper performs nonlinear incremental optimization of the inertia weight. In each iteration, the inertia weight is calculated according to equation (10), where t and T represent the current iteration number and the maximum number of iterations, respectively. As the number of iterations increases, the inertia weight exhibits a nonlinear incremental trend: the improved algorithm has a smaller \omega in the early stage of searching for the optimal solution, so the particles have a stronger local search capability, while in the later stage \omega is larger, so the particles have strong global search ability.
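As an illustration of such a nonlinearly increasing schedule between the minimum and maximum inertia weights, the quadratic form below is an assumption for the sketch; the paper's exact expression is given by equation (10).

```python
def nonlinear_inertia_weight(t, t_max, w_min=0.4, w_max=0.9):
    """Illustrative nonlinear increasing schedule: small w early in the run
    (local search), large w late in the run (global search). The quadratic
    ramp is an assumption, not the paper's equation (10)."""
    return w_min + (w_max - w_min) * (t / t_max) ** 2

w_start = nonlinear_inertia_weight(0, 100)
w_end = nonlinear_inertia_weight(100, 100)
```

Any monotone nonlinear ramp with the same endpoints would exhibit the early-local, late-global behavior described above.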
Furthermore, in order to enhance the optimization performance of the PSO, two new contraction factors are introduced into the position updating equation of the NBPSO-SEE algorithm. Clerc and Kennedy [64] proposed a particle swarm optimization algorithm with a contraction factor (CFPSO) in 2002, which is intended to improve the convergence speed while escaping local optimal values. The algorithm flow of CFPSO is similar to the original PSO, but the velocity updating formula of the particles differs: CFPSO uses the contraction factor to compress the particle velocity after each update, which changes both the influence of the particle's own historical velocity and the influence of the historical optimal positions on the particle velocity, so as to improve the convergence speed of the population. However, CFPSO has some drawbacks. Too large a contraction factor results in poor convergence performance and makes the PSO approach random search; too small a contraction factor makes the PSO converge prematurely and reduces classification accuracy. To solve this problem, two dynamic contraction factors are used to compress the velocity and position at each update, respectively, in equations (11) and (12). By adding a contraction factor to the velocity, NBPSO-SEE improves the inheritance ability, self-recognition ability, and social cognitive ability of the particles; the contraction factor on the position update improves the quality of the particle positions. In NBPSO-SEE, the two dynamic contraction factors enhance the performance of exploration and exploitation and improve the convergence speed:
These two dynamic contraction factors apply a nonlinear transformation to the particle velocity and position. They are calculated from the position value of the particle, the current iteration t, and a constant, as given in equations (13) and (14).
Then, in NBPSO-SEE, the two mechanisms of shrinking encircling and exploration improve the search ability of the population. Firstly, the moving position of the particle is determined between the current position and the target position via the shrinking encircling operation, which shortens the search range of the particles and thereby enhances the local search ability of the population. In addition to the shrinking encircling strategy, NBPSO-SEE adopts a random search mechanism to improve the diversity of the particles. When updating the particle position, the choice depends on the change of the coefficient A: if A exceeds the range [−1, 1], the distance coefficient D is updated randomly. In order to find the target, the particles traverse away from the original target direction, giving the population global search performance. In NBPSO-SEE, the particle position is updated by equation (18). NBPSO-SEE adopts the dynamic contraction strategy, the shrinking encircling operation, and the exploration mechanism with some probability, which not only helps escape local optimal solutions but also accelerates the convergence speed:
In equations (16) and (17), A represents a coefficient variable, and X_{rand} is a randomly searched target. In equation (14), t represents the current number of iterations, C represents a coefficient variable, X^{*} is defined as the optimal position of the current target, and r represents a random value in [0, 1].
The coefficient variables, namely, A and C, are calculated separately as follows:

A = 2 a \cdot r - a, (19)

C = 2 r. (20)
In the above formulas, a is a variable whose value ranges over [0, 2] and presents a linearly decreasing trend; it is updated in the form a = 2 - 2t/T, where t is the current number of iterations, T represents the maximum iteration number, and r is a random value in [0, 1].
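Assuming the standard shrinking-encircling coefficient forms A = 2a·r − a and C = 2r, with a decreasing linearly from 2 to 0, the coefficient update and the |A| > 1 exploration test can be sketched as follows; the function name is illustrative.

```python
import random

def encircling_coefficients(t, t_max):
    """Coefficients of the shrinking-encircling mechanism: a decreases
    linearly from 2 to 0 over the run, A lies in [-a, a], and C lies in
    [0, 2]. When |A| > 1 the particle explores toward a random target
    instead of encircling the current best."""
    a = 2.0 - 2.0 * t / t_max          # linear decrease over iterations
    A = 2.0 * a * random.random() - a
    C = 2.0 * random.random()
    explore = abs(A) > 1.0             # exploration vs. shrinking encircling
    return a, A, C, explore

a0, A0, C0, _ = encircling_coefficients(0, 100)
aT, AT, CT, exploreT = encircling_coefficients(100, 100)
```

Early in the run a is large, so |A| > 1 is frequent and exploration dominates; late in the run a shrinks toward 0 and the particles converge on the encircled target.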
4.1. SBS Algorithm
Since the feature subset selected by NBPSO-SEE still contains redundant features, and the features differ in importance while their ordering carries no meaning, this paper first uses the feature selection method of random forest to rank the features selected by NBPSO-SEE by importance and then uses the SBS algorithm to delete the features with low importance in turn.
For each node q on a random forest decision tree, features are usually randomly extracted from the d-dimensional feature set. Then, according to the Gini gain maximization principle [65], a feature is selected to divide the data on the node into left and right child nodes l and r, which means that the data on the parent node q are divided into its child nodes. The Gini gain maximization is to maximize the following equation:

G(q) = Gini(q) - w_l \cdot Gini(l) - w_r \cdot Gini(r). (21)
Here, Gini(q) = 1 - \sum_k p_k^2 is the Gini index of node q, p_k is the proportion of class-k samples in node q, and w_l and w_r are the proportions of data divided into the left and right child nodes l and r by parent node q, respectively. The importance of a feature on node q is its Gini gain, as shown in the following equation:

VIM_q = Gini(q) - w_l \cdot Gini(l) - w_r \cdot Gini(r). (22)
If Q_k is the set of nodes in which the feature appears as a node-partitioning attribute in the k^{th} decision tree, the importance of the feature on that decision tree is calculated as shown in the following equation:

VIM_k = \sum_{q \in Q_k} VIM_q. (23)
Assuming that there are n trees in the random forest, the importance of the feature in the random forest can be calculated as follows:

VIM = \frac{1}{n} \sum_{k=1}^{n} VIM_k. (24)
Here, n is the number of decision trees in the random forest. After NBPSO-SEE obtains the optimal feature subset, the SBS starts searching: it calculates the fitness value obtained when each feature is deleted separately and then selects the feature subset with the best fitness value to enter the next iteration. The iterative steps of the algorithm in the SBS stage are as follows:

Step 1: determine the optimal feature subset S_t of the SBS stage.

Step 2: delete the feature f in the current feature subset that maximizes fit(S_t - f), where S_t - f denotes the feature subset after removing feature f, t is the number of iterations, and fit is the fitness value. The larger the fitness value in this paper, the better the selected feature subset.

Step 3: update the optimal feature subset and the number of iterations.

Step 4: repeat steps 2 and 3 until the termination condition is met.
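The Gini quantities that drive the random forest importance ranking can be sketched as follows; the two-class split counts are a toy example, not the paper's data.

```python
def gini(class_counts):
    """Gini index 1 - sum(p_k^2) of a node, from per-class sample counts."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def gini_gain(parent, left, right):
    """Gini gain of a split, as maximized in equation (21): the parent
    impurity minus the weighted impurities of the two child nodes."""
    n = sum(parent)
    wl, wr = sum(left) / n, sum(right) / n
    return gini(parent) - wl * gini(left) - wr * gini(right)

# A split that separates two classes almost perfectly has a large gain.
gain = gini_gain(parent=[50, 50], left=[45, 5], right=[5, 45])
```

Summing such gains over every node at which a feature splits, and averaging over the trees, gives the feature's importance as in equations (23) and (24).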
The feature selection processing of NBPSO-SEE-SBS is shown in Algorithm 2, where maxIterations is the maximum number of iterations, swarmSize is the number of particles in the population, dimension is the dimension of each particle, and fitness is the fitness value.
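The SBS stage can be sketched as follows, with a toy fitness function standing in for the logistic regression evaluation; the stopping rule (stop once removing any feature hurts the fitness) is an assumption for the sketch.

```python
def sbs(features, evaluate, min_size=1):
    """Sequential backward selection: repeatedly drop the single feature
    whose removal yields the best fitness, while fitness does not degrade."""
    current = list(features)
    best_fit = evaluate(current)
    while len(current) > min_size:
        candidates = [[f for f in current if f != g] for g in current]
        fits = [evaluate(c) for c in candidates]
        top = max(range(len(fits)), key=fits.__getitem__)
        if fits[top] < best_fit:       # removing any feature hurts: stop
            break
        current, best_fit = candidates[top], fits[top]
    return current, best_fit

# Toy fitness: features 'a' and 'c' are useful, the rest are redundant.
useful = {"a", "c"}
fit = lambda subset: len(useful & set(subset)) - 0.01 * len(subset)
selected, value = sbs(["a", "b", "c", "d"], fit)
```

Because NBPSO-SEE has already removed most irrelevant features, the subset entering this stage is small, which keeps the quadratic cost of the candidate evaluations manageable.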

5. Computational Complexity of NBPSO-SEE-SBS
The time complexity of the BPSO algorithm can be expressed as O(T \cdot m \cdot T_c), where T indicates the number of iterations, m is the population size, and T_c represents the computing time for training and prediction of the logistic regression model. The time complexity of the CBPSO algorithm can be expressed as O(T \cdot m \cdot (T_c + T_s)), where T_s is the computing time of the logistic map sequence. The time complexity of the HIBQPSO algorithm can be expressed as O(T \cdot m \cdot (T_c + T_f)), where T_f is the computing time for the correction factors.
The time complexity of the proposed NBPSO-SEE algorithm can be expressed as O(T \cdot m \cdot (T_c + T_w + T_v + T_d)), where T_w is the computing time for calculating the inertia weight, T_v is the computing time for updating the coefficient variables A and C, and T_d is the computing time for the dynamic contraction factors. The time complexity of the proposed NBPSO-SEE-SBS algorithm can be expressed as O(T \cdot m \cdot (T_c + T_w + T_v + T_d) + T_{SBS}), where T_{SBS} is the computing time of the SBS method. The computational complexity of the NBPSO-SEE-SBS algorithm is thus higher than that of the BPSO and CBPSO algorithms. However, T_w, T_v, and T_d involve only simple numerical operations, according to equations (10), (13), (14), (19), and (20). Furthermore, since a large number of redundant or irrelevant features are already deleted by the NBPSO-SEE algorithm, the SBS method spends little time deleting the remaining unimportant features. Therefore, the proposed NBPSO-SEE-SBS algorithm does not significantly increase the computational complexity.
6. Results
6.1. Data
In this paper, the data set from January 2017 to September 2018 is provided by a power enterprise and includes the electricity consumption data of 11,860 high-voltage users who have been in arrears. According to the users' past payment behavior, the power enterprise divides the users into high-risk, medium-risk, and low-risk ones. A total of 34 features are used for data processing and model training: 8 category features represent the basic information of these users, while 26 statistical features describe their electricity consumption.
Experimental environment: CentOS 7.0 operating system, Intel Xeon E5-2620 v3 CPU, 128 GB memory, and Python 3.5 as the development language.
6.2. Experimental Results
In order to prove the effectiveness and superiority of the proposed algorithm, two groups of comparative experiments are set up, using the logistic regression model [66, 67], to realize the risk prediction of ECR for power customers in June, July, and August 2018. The first group of experiments verifies the effectiveness of NBPSO-SEE. The second group proves the superiority of the proposed hybrid feature selection algorithm based on NBPSO-SEE with SBS, called NBPSO-SEE-SBS.
The relevant parameters selected for NBPSO-SEE-SBS are listed in Table 3. A population size of 20 to 40 is optimal for optimization problems [68], and the population size is generally set to 20 [46, 58, 60]; hence, 20 particles form the particle swarm in this paper. The maximum iteration number is 100; the maximum and minimum values of the inertia weight are 0.9 and 0.4, respectively; and the learning factors are set as listed in Table 3.
The logistic regression model is trained on the training set with the optimal feature subset selected by the NBPSO-SEE-SBS algorithm. The model outputs each user's default probability on the test set. Then, appropriate threshold values are set, according to which users are divided into three levels: high risk, medium risk, and low risk. The specific division principles are shown in Table 4.
Users whose default probability exceeds the upper threshold are assigned to the high-risk level; users whose default probability lies between the two thresholds are assigned to the medium-risk level; and users whose default probability falls below the lower threshold are assigned to the low-risk level.
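The threshold-based division can be sketched as follows; the two threshold values shown are placeholders for those in Table 4.

```python
def risk_level(p_default, theta_high, theta_low):
    """Assign a risk level from the logistic regression default probability.
    The thresholds are placeholders for the values given in Table 4."""
    if p_default >= theta_high:
        return "high"
    if p_default >= theta_low:
        return "medium"
    return "low"

levels = [risk_level(p, theta_high=0.7, theta_low=0.3) for p in (0.9, 0.5, 0.1)]
```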
6.3. Experimental Results of NBPSO-SEE
Figures 2–4 show the fitness values and the number of features calculated by the BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, LBPSO-SEE, and proposed NBPSO-SEE algorithms for the risk prediction of ECR in June, July, and August 2018. In each figure, panel (a) plots the number of iterations vs. the fitness value, and panel (b) plots the number of iterations vs. the number of selected features.
Figure 2 shows the test results of the improved NBPSO-SEE for ECR risk in June 2018. Compared with the other eight feature selection algorithms, NBPSO-SEE attains the maximum fitness value with the fewest features. In terms of the number of features and the fitness value, the performance of HIBQPSO, LBPSO-SEE, and NBPSO-SEE is significantly better than that of the other six algorithms, with NBPSO-SEE the best, LBPSO-SEE second, and HIBQPSO third. In the process of calculating fitness, BABC obtains the lowest fitness value; the fitness values calculated by HIBQPSO, LBPSO-SEE, and NBPSO-SEE are 5.77%, 7.34%, and 11.61% higher than that of BABC, respectively, while NBPSO-SEE improves on HIBQPSO and LBPSO-SEE by 5.52% and 3.98%, respectively. Furthermore, in the process of selecting the feature subset, BABC selects the most features, while HIBQPSO, LBPSO-SEE, and NBPSO-SEE select 242, 216, and 205 features, respectively. NBPSO-SEE selects the fewest features, 190, 179, and 153 fewer than in the cases of HIBQPSO, LBPSO-SEE, and BABC, respectively. NBPSO-SEE selects approximately one-third of the features from the original feature set, removing a total of 543 redundant or unrelated features. In addition, BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, and LBPSO-SEE reach their optimal fitness values after 97, 37, 61, 84, 68, 35, 48, and 43 iterations, respectively. NBPSO-SEE obtains an optimal fitness value within 15 iterations and can still find better fitness values after 17, 20, and 98 iterations. In the initial search, NBPSO-SEE quickly raises the initial fitness value to a very high level, which reflects the convergence performance of the proposed algorithm. The search results show that the proposed NBPSO-SEE retains effective global search ability after falling into a locally optimal state and can continue to search for a better feature subset later in the search.
Figure 3 shows the test results of the proposed NBPSOSEE and the other feature selection algorithms for ECR risk in July. The fitness values calculated by NBPSOSEE and the comparison algorithms increase with the number of iterations, while the number of features selected by NBPSOSEE decreases. LBPSOSEE and NBPSOSEE achieve higher fitness values and select fewer features than the other seven algorithms. The fitness values calculated by LBPSOSEE are 2.21%, 4.17%, 6.94%, 4.41%, 5.60%, 4.41%, and 1.09% higher than those of BWOA, BABC, BGA, BGWO, BPSO, CBPSO, and HIBQPSO, respectively, and the number of features selected by LBPSOSEE is 122, 46, 106, 51, 80, 105, and 59 fewer than those of the seven algorithms. These results indicate that LBPSOSEE is better at avoiding local optima than the seven comparison algorithms. After 58 iterations, the fitness value of NBPSOSEE is 0.939, and its optimal feature subset contains 213 features. The fitness value calculated by NBPSOSEE is 1.62% higher than that of LBPSOSEE, and the number of features it selects is 69% lower. This shows that the global search ability of NBPSOSEE is stronger than that of LBPSOSEE and that NBPSOSEE significantly outperforms the other feature selection algorithms.
Figure 4 shows the test results of the improved NBPSOSEE and the other eight comparison algorithms for ECR risk in August. The search capabilities of LBPSOSEE and NBPSOSEE are significantly higher than those of the other algorithms: their optimal fitness values keep increasing while their selected features keep decreasing. The optimal fitness value calculated by LBPSOSEE is 0.957, which is 2.46%, 3.12%, 1.27%, 1.70%, 7.71%, 2.79%, and 0.95% higher than those of BWOA, BABC, BGA, BGWO, BPSO, CBPSO, and HIBQPSO, respectively. LBPSOSEE selects 172 features, 169, 144, 208, 137, 175, 184, and 111 fewer than the seven algorithms, respectively. The optimal feature subset selected by NBPSOSEE contains 102 features, with an optimal fitness value of 0.977; NBPSOSEE thus deletes 70 more redundant or irrelevant features than LBPSOSEE, and its optimal fitness value is 2.09% higher. The experimental results show that NBPSOSEE has the strongest ability to escape local optima and attains the highest fitness value while selecting the fewest features. In conclusion, NBPSOSEE converges faster and searches globally better than BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, and LBPSOSEE.
As shown in Figures 2–4, NBPSOSEE performs significantly better than BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, and LBPSOSEE. The proposed NBPSOSEE selects the fewest features while obtaining the highest fitness value in the ECR risk tests for June, July, and August 2018, verifying that its convergence speed and global search ability exceed those of the other algorithms and demonstrating its effectiveness and stability.
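The convergence behavior discussed above stems from a binary PSO core. As a rough illustration, one standard sigmoid-transfer BPSO update is sketched below; NBPSOSEE additionally uses a nonlinear inertia weight schedule and shrinking encircling/exploration terms that are not detailed in this section, so this sketch omits them and uses conventional parameter defaults:

```python
import math
import random


def sigmoid(v):
    """Transfer function mapping a real velocity to a selection probability."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(positions, velocities, pbest, gbest, w,
              c1=2.0, c2=2.0, vmax=6.0, rng=random):
    """One velocity/position update of sigmoid-transfer binary PSO.

    positions: list of bit lists (1 = feature selected).  w is the inertia
    weight; in NBPSOSEE it decreases nonlinearly over iterations, and extra
    shrinking-encircling/exploration terms (not reproduced here) modify the
    update.  c1, c2, and vmax are conventional defaults, not the paper's values.
    """
    for i, x in enumerate(positions):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            v = (w * velocities[i][d]
                 + c1 * r1 * (pbest[i][d] - x[d])
                 + c2 * r2 * (gbest[d] - x[d]))
            v = max(-vmax, min(vmax, v))  # clamp so the sigmoid stays responsive
            velocities[i][d] = v
            x[d] = 1 if rng.random() < sigmoid(v) else 0
    return positions, velocities
```

A nonlinear inertia schedule would then shrink w over the run, e.g. w(t) = w_max − (w_max − w_min)(t/T)² (a hypothetical form), encouraging broad exploration early and fine exploitation late.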
6.4. Experimental Results of NBPSOSEESBS
Although the results obtained by NBPSOSEE are improved, the feature subset it selects still contains many redundant features. Therefore, this paper proposes a hybrid feature selection algorithm, NBPSOSEESBS, to select the optimal feature subset; it not only effectively reduces the number of features but also improves the accuracy. Figures 5–7 show the test results of BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, LBPSOSEESBS, and the proposed NBPSOSEESBS for the risk prediction of ECR in June, July, and August 2018. The x-axis represents the number of iterations, and the maximum number of iterations equals the number of features selected by NBPSOSEE.
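The SBS stage prunes the subset handed over by the swarm search one feature at a time. A minimal greedy sketch of sequential backward selection is shown below; the stopping rule and the user-supplied `evaluate` scorer are assumptions, and the paper's exact SBS procedure may differ:

```python
def sbs(features, evaluate, min_size=1):
    """Greedy sequential backward selection.

    Starting from the given feature list, repeatedly drop a single feature
    whose removal improves the fitness returned by `evaluate` (a callable
    mapping a feature tuple to a score), stopping when no removal helps.
    """
    current = list(features)
    best_score = evaluate(tuple(current))
    improved = True
    while improved and len(current) > min_size:
        improved = False
        for f in list(current):
            trial = tuple(x for x in current if x != f)
            score = evaluate(trial)
            if score > best_score:
                best_score, current, improved = score, list(trial), True
                break  # greedy: accept the first improving removal
    return current, best_score


# Hypothetical scorer: useful features add 1, redundant ones subtract 2.
useful, redundant = {"a", "b"}, {"r1", "r2"}
score = lambda s: (sum(1 for f in s if f in useful)
                   - 2 * sum(1 for f in s if f in redundant))
selected, best = sbs(["a", "b", "r1", "r2"], score)
```

Under this toy scorer, both redundant features are stripped while the useful pair survives, mirroring how the hybrid stage deletes redundant features without sacrificing fitness.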
Figure 5 shows the test results of the improved NBPSOSEESBS for ECR risk in June 2018. As redundant or irrelevant features are progressively deleted, the fitness values calculated by NBPSOSEESBS and the other eight feature selection algorithms show an increasing trend. The optimal fitness value calculated by BWOASBS is 3.01% higher than that of BWOA, with 9 more redundant features deleted. The best fitness value of BABCSBS is 0.24% higher than that of BABC, with 6 fewer features selected. The optimal fitness value of BGWOSBS is 1.29% higher than that of BGWO, with 21 fewer features selected. The optimal fitness value of BPSOSBS is 1.9% higher than that of BPSO, with 12 fewer features selected. The optimal fitness value of CBPSOSBS is 0.7% higher than that of CBPSO, with 9 more redundant features deleted. The best fitness value of HIBQPSOSBS is 3.54% higher than that of HIBQPSO, with 35 more redundant features deleted. The best fitness value of LBPSOSEESBS is 0.79% higher than that of LBPSOSEE, with 14 fewer features selected. In addition, the optimal fitness value of NBPSOSEESBS is 0.54% higher than that of NBPSOSEE, with 11 fewer features selected. The experimental results show that NBPSOSEESBS obtains the highest fitness value and removes the most redundant or irrelevant features.
Figure 6 shows the test results of NBPSOSEESBS and the other feature selection algorithms for ECR risk in July. The best fitness values obtained by BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, LBPSOSEESBS, and the proposed NBPSOSEESBS are significantly better than those of BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, LBPSOSEE, and NBPSOSEE, respectively, and more redundant features are removed. NBPSOSEESBS obtains the highest fitness value, 3.68%, 3.68%, 8.79%, 5.73%, 1.18%, 1.95%, 2.84%, and 0.64% higher than those of BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, and LBPSOSEESBS, respectively. NBPSOSEESBS also selects the fewest features, 184, 88, 171, 121, 134, 158, 114, and 60 fewer than the eight comparison algorithms, respectively. This shows that the proposed NBPSOSEESBS balances local and global search better than the comparison algorithms.
Figure 7 shows the test results of the improved NBPSOSEESBS for ECR risk in August. The results obtained by NBPSOSEESBS are significantly better than those of the other algorithms: it deletes the most redundant or irrelevant features and obtains the highest fitness value. The optimal fitness value of NBPSOSEESBS is 4.45%, 5.79%, 4.34%, 4.23%, 6.71%, 5.45%, 3.79%, and 2.60% higher than those of the other eight algorithms, respectively, and the number of features it selects is 256, 236, 299, 227, 259, 280, 198, and 85 fewer, respectively. The experimental results show that the proposed NBPSOSEESBS has the best performance.
As shown in Figures 5–7, BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, LBPSOSEESBS, and the proposed NBPSOSEESBS are significantly better than BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, LBPSOSEE, and NBPSOSEE, respectively. The proposed NBPSOSEESBS obtains the best results, deleting the most redundant or irrelevant features while attaining the optimal fitness value. The experimental results show that NBPSOSEESBS improves the convergence speed of the particles and enhances their ability to avoid premature convergence.
7. Discussion
Tables 5–7 show comparative experiments of the sixteen traditional algorithms, a new feature selection algorithm, and the proposed NBPSOSEESBS for the risk prediction of ECR in June, July, and August 2018. Based on the number of selected features, accuracy, precision, recall, and F1-measure, the convergence speed and search performance of the proposed algorithm can be evaluated, verifying its effectiveness and superiority. In these tables, Acc, Pre, Rec, and F1 denote the accuracy, precision, recall, and F1-measure, respectively, reported alongside the number of selected features.
Table 5 shows the test results of the improved NBPSOSEESBS and the comparison algorithms for ECR risk in June 2018. For high-risk arrears users, the accuracy, precision, recall, and F1-measure calculated by all comparison algorithms reach 100%, because the number of high-risk arrears users is small and the features reflecting their arrears behavior are pronounced. Among medium-risk arrears users, HIBQPSO, NBPSOSEE, BWOASBS, BGASBS, BGWOSBS, HIBQPSOSBS, and the proposed NBPSOSEESBS achieve the highest precision, 100%. For low-risk arrears users, the proposed NBPSOSEESBS obtains the highest accuracy, precision, and F1-measure, which are 0.1%, 1.24%, and 0.67% higher than those of NBPSOSEE, respectively. Moreover, the proposed NBPSOSEESBS removes the most redundant or irrelevant features and attains the highest average F1-measure.
Table 6 shows the test results of the improved NBPSOSEESBS and the comparison algorithms for ECR risk in July 2018. The results calculated with the full feature set are the worst, while the results of the proposed NBPSOSEESBS are significantly better than those of the other algorithms. For medium-risk arrears users, the accuracy and F1-measure of NBPSOSEESBS are higher than those of all other algorithms except CBPSO, and NBPSOSEESBS achieves a precision of 100%. For low-risk arrears users, the accuracy and precision of the proposed NBPSOSEESBS are close to those of LBPSOSEESBS. Overall, the proposed algorithm selects the fewest features, less than one-third of the original features, and achieves the highest average accuracy and F1-measure.
Table 7 shows the test results of the improved NBPSOSEESBS and the comparison algorithms for ECR risk in August 2018. For low-risk arrears users, the proposed NBPSOSEESBS achieves the highest accuracy, precision, and F1-measure, which are 6.73%, 64.74%, and 53.38% higher, respectively, than the results calculated with all features. Furthermore, NBPSOSEESBS selects the fewest features in its optimal feature subset. The experimental results show that the proposed NBPSOSEESBS removes more than 90.4% of the original features as irrelevant or redundant and attains the highest average F1-measure.
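The per-risk-level accuracy, precision, recall, and F1-measure reported in Tables 5–7 can be computed one-vs-rest from the true and predicted risk labels. A minimal reimplementation (illustrative only, not the paper's code) is:

```python
def per_class_metrics(y_true, y_pred, label):
    """One-vs-rest accuracy, precision, recall, and F1 for one risk level.

    Treats `label` as the positive class and everything else as negative.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    tn = len(y_true) - tp - fp - fn
    acc = (tp + tn) / len(y_true)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    return acc, pre, rec, f1


# Toy example with three risk levels (hypothetical labels).
y_true = ["high", "high", "med", "low"]
y_pred = ["high", "med", "med", "low"]
acc, pre, rec, f1 = per_class_metrics(y_true, y_pred, "high")
```

Averaging F1 over the three risk levels gives the average F1-measure used to rank the algorithms in the tables.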
The execution time of an algorithm is also an important indicator of its performance: a long running time implies high complexity, and a short running time implies low complexity. The running times of the proposed NBPSOSEE, NBPSOSEESBS, and the other algorithms are shown in Figures 8 and 9. To remove more redundant features and improve the classification results, the hybrid feature selection algorithms with SBS take slightly more execution time than those without SBS. Nevertheless, the execution time of NBPSOSEESBS does not differ significantly from that of the other algorithms, and NBPSOSEESBS runs faster than BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, and LBPSOSEESBS in the July and August tests. In summary, the proposed NBPSOSEESBS outperforms the other algorithms: it effectively removes a large number of redundant features and stably improves the prediction results while keeping the execution time low.
8. Conclusion
To accurately predict the risk of ECR for power customers, a hybrid nonlinear inertia weight binary particle swarm optimization with shrinking encircling and exploration mechanism (NBPSOSEE) is proposed for feature selection. In addition, an improved feature selection approach combining NBPSOSEE with SBS, namely, NBPSOSEESBS, is proposed to select the optimal feature subset. The experimental results show that, compared with one state-of-the-art optimization algorithm and seven well-known wrapper-based feature selection approaches, the proposed NBPSOSEESBS steadily removes more redundant or irrelevant features and obtains better prediction results for the ECR risk of power customers while keeping the execution time low.
Data Availability
The experimental data contain private user information. Therefore, to protect the security of users, the data set cannot be made available.
Conflicts of Interest
There are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was financially supported by the National Natural Science Foundation of China (no. 61672470), Science and Technology Project of Henan Province (no. 202102210351), Doctoral Program of Zhengzhou University of Light Industry (no. 2017BSJJ046), and Key Research Projects of Henan Higher Education Institutions (no. 20A120011).
Supplementary Materials
The experimental results directory contains three subdirectories, 201806, 201807, and 201808, corresponding to the risk prediction of electric charge recovery in June, July, and August 2018. Each subdirectory holds 18 CSV files, one per comparison algorithm in the manuscript: BWOA, BABC, BGA, BGWO, BPSO, CBPSO, HIBQPSO, LBPSOSEE, NBPSOSEE, BWOASBS, BABCSBS, BGASBS, BGWOSBS, BPSOSBS, CBPSOSBS, HIBQPSOSBS, LBPSOSEESBS, and NBPSOSEESBS. Each file ⟨algorithm⟩_results.csv (or ⟨algorithm⟩_SBS_results.csv) contains the results calculated by the corresponding algorithm; the files HPSO_SSM_results.csv and HPSO_SSM_SBS_results.csv contain the results of HIBQPSO and HIBQPSOSBS, respectively. In the header line of these CSV files, features_num denotes the number of selected features. Accuracy_high, Loss_high, Precission_high, Recall_high, and F1_high denote the accuracy, loss, precision, recall, and F1-measure for predicting the high-risk level of customer electric charge recovery; Accuracy_med, Loss_med, Precission_med, Recall_med, and F1_med denote the same metrics for the medium-risk level; and Accuracy_low, Loss_low, Precission_low, Recall_low, and F1_low denote the same metrics for the low-risk level. (Supplementary Materials)