[Retracted] Application of Data Mining Algorithm in Electric Power Marketing Inspection Forecast Analysis

Wu, Weijiang; Xu, Gaojun; Qian, Xusheng; Chu, Chengbo

doi:https://doi.org/10.1155/2022/9229415

International Transactions on Electrical Energy Systems


This article has been retracted.

Special Issue: Artificial Intelligence for Smart Energy Storage Applications

Research Article | Open Access

Volume 2022 | Article ID 9229415 | https://doi.org/10.1155/2022/9229415


Weijiang Wu,¹ Gaojun Xu,¹ Xusheng Qian,¹ and Chengbo Chu²

Academic Editor: Nagamalai Vasimalai

Received 14 Jul 2022

Revised 28 Jul 2022

Accepted 12 Aug 2022

Published 28 Aug 2022

Abstract

To improve the accuracy of power load forecasting and to cope with the shortage of single-machine computing resources brought about by the intelligent power system, data mining algorithms are applied to electric power marketing forecast analysis. An online sequential extreme learning machine algorithm is optimized according to the characteristics of power load data. To improve the accuracy of the algorithm, the ideas of weighting and multiple agents are introduced. The MapReduce cloud computing programming framework is used at the same time to parallelize the algorithm and improve its ability to process large amounts of data. The real power load data provided by EUNITE were selected as the sample for a case study, and the experiments were run on a 32-node cloud computing cluster. In the experiments, the load data provided by EUNITE were expanded into four data sets of 1000, 2000, 4000, and 8000 times the original size, and these were run on cloud platforms with clusters of 4, 8, 16, and 32 nodes to calculate the speedup and scaleup. A perfectly parallel algorithm would attain the ideal speedup; in practice, however, as the number of cluster nodes increases, so does the network transmission cost between nodes. The conclusion is that the prediction accuracy of this model is better than that of the commonly used support vector regression and neural network prediction algorithms, and that its parallel implementation performs well.

1. Introduction

In recent years, power systems have accumulated tens of thousands or even hundreds of millions of records in their databases, and these records contain a great deal of useful information that can help managers make decisions. Modern information technology provides the technical means to exploit these data resources, and people try to discover the underlying patterns in them. If appropriate information technology is used to process these data, power enterprises can serve people better. Electric power is an industry that bears on the national economy and people's livelihood, and the quality of its service greatly affects people's quality of life. The data can also help power companies cut costs and increase profits. By studying the patterns of market change, China's electric power enterprises, represented by State Grid Corporation, are implementing informatization [1], which includes research on how to extract useful information from this vast amount of data with effective information technology. China now has a socialist market economy, and the informatization of power enterprises must also serve the market economy. How to achieve a win-win outcome between power enterprises and their customers is the problem that China's power enterprises face. Market orientation replaces the original production orientation: internally it improves efficiency and reduces costs, and externally it improves the service level and expands the market [2]. This places new requirements on the functions of the enterprises' electric power marketing systems. The electric power industry is a basic industry that affects all sectors of society, so the accuracy of electric power forecasts has a great influence on society and on the power enterprises themselves.
For example, China has experienced a serious shortage of power supply because investment in power facilities was insufficient, which in turn resulted from underestimating future electricity demand; this had a great negative impact on development during the 11th Five-Year Plan [3]. This shows the importance of power demand forecasting. Demand forecasting is a key component of an electric power marketing system: it plays a key role in the construction of electric power facilities, the formulation of electric power marketing strategy, and the making and tracking of electric power production schedules. How to improve the accuracy of electricity demand prediction has therefore become an important topic.

2. Literature Review

Tai et al. proposed applying intelligent decision-making technology to the electric power field and developed an intelligent decision support system (IDSS) of the form "DSS + problem solving + knowledge base" to carry out decision-making tasks [4]. In studying the principles and algorithms of data-mining-based decision support systems for the energy sector [5], Wulf et al. took into account the real problems of the energy market in China. Considering the nature and multidimensionality of big data in different data mining models, a design process model for a power decision support system was developed that combines a neural network structure with spatial selection algorithms for data mining and organically integrates problem-solving and interpretation functions [6]. In a study of short-term load forecasting based on an uncertain neural network, Wang et al. introduced the BP algorithm and developed a holiday model to calculate the load on specific holidays. Tested on data from a power plant in Guangxi, the neural-network-based load forecasting approach performed well [7]. Sharma reviewed recent research topics related to power decision support systems, for example, decision support for the energy sector based on data warehousing technology, forecasting of electricity production load, the use of cluster analysis in the sector, and research on the structure of electricity use; some research progress has been made [8]. Shafiei Chafi and Afrakhte developed a three-part DSS consisting of a language system (LS), a problem processing system (PPS), and a knowledge system (KS). This model is described as "problem-solving" and "conscious," and in some cases has consequences [9].

Aiming at the practical application scenario of power load prediction, this article proposes a distributed extreme learning machine power load prediction algorithm based on MapReduce. It applies the online sequential extreme learning machine to power load prediction and introduces cloud computing and multiagent technology to improve both its ability to process massive high-dimensional data and the accuracy of load prediction [10]. The parallel performance and load prediction accuracy of the improved algorithm were tested on a cluster built in the laboratory, and real power load data were used in a case study.

3. Research Methods

3.1. Cloud Computing

The Hadoop cloud computing platform developed by the Apache Foundation is a fully open-source platform that supports the MapReduce framework and distributed data storage. MapReduce is a programming model for processing large data sets in batches; it was first proposed by Google to address distributed computation [11]. The MapReduce framework is easy to use because it hides the details of parallel execution from the user. On large clusters, complex parallel computations are abstracted into two user-written functions, the Map function and the Reduce function.

The stages are as follows:
(1) Input. The input file is read from the distributed file system and split into slices. The MapReduce framework assigns the data slices to the worker nodes.
(2) Map. The MapReduce framework parses the data into a set of (key, value) pairs and runs the user-defined map function on each pair [12]. The result is a new set of intermediate (key, value) pairs.
(3) Shuffle. In this stage the intermediate (key, value) pairs are transferred from the Map nodes to the Reduce nodes; depending on bandwidth and CPU speed, this can take more time than the Map and Reduce stages themselves. The intermediate values that share the same key are merged into (key, list of values) pairs and sorted by key.
(4) Reduce. The user-written reduce function is executed. It iterates over each intermediate key and its list of values, applies the user's processing, and generates new (key, value) pairs [13].
(5) Output. The output of the reduce stage is written to the distributed file system.
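The data flow through these stages can be imitated in a few lines of plain Python. This is only a single-process sketch of the idea, not Hadoop code; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) and the sample readings are made up for illustration.

```python
from collections import defaultdict

def map_phase(records, map_fn):
    """Apply the user-written map function to every record of one data slice."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs

def shuffle_phase(pairs):
    """Group intermediate values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped, reduce_fn):
    """Apply the user-written reduce function to each (key, list of values) pair."""
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

# Hypothetical input: (day, load reading) records already split into two slices.
slices = [
    [("1997-01-01", 700.0), ("1997-01-01", 750.0)],
    [("1997-01-02", 680.0), ("1997-01-01", 720.0), ("1997-01-02", 710.0)],
]

def emit_reading(record):
    day, load = record
    return [(day, load)]          # one intermediate (key, value) pair per reading

def daily_peak(day, loads):
    return max(loads)             # reduce: keep the maximum load of the day

intermediate = []
for data_slice in slices:         # on a cluster, each slice would go to its own map task
    intermediate.extend(map_phase(data_slice, emit_reading))
daily_max = reduce_phase(shuffle_phase(intermediate), daily_peak)
```

Here the map step only re-emits each reading under its day, and the reduce step keeps the daily maximum, which is the kind of quantity the load forecasting task later predicts.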

3.2. Extreme Learning Machine Algorithm

3.2.1. Description of Extreme Learning Machine Algorithm

The extreme learning machine (ELM) algorithm differs from the traditional training of feedforward neural networks. It does not require iteration: the input weights and hidden layer biases are chosen randomly rather than being carefully tuned. To minimize the training error, the output weights of the hidden layer are determined analytically by the algorithm [14]. The ELM training algorithm is described as follows.

Given $N$ arbitrary samples $(x_j, t_j)$, where $x_j \in \mathbb{R}^{d}$ and $t_j \in \mathbb{R}^{m}$, the regression model of an extreme learning machine with $n$ hidden layer nodes and excitation function $G$ can be expressed as

$$\sum_{i=1}^{n} \beta_i \, G(a_i, b_i, x_j) = t_j, \qquad j = 1, \ldots, N, \tag{1}$$

where $a_i$ is the weight vector between the $i$th hidden layer node and the input nodes, $\beta_i$ is the weight vector between the $i$th hidden layer node and the output nodes, $b_i$ is the bias of the $i$th hidden layer node, and $n$ is the number of hidden layer nodes. Formula (1) can be abbreviated as

$$H\beta = T,$$

where $H$ is the output matrix of the hidden layer, and the $i$th column of $H$ is the output vector of the $i$th hidden layer node for the inputs $x_1, \ldots, x_N$. The output weights can be obtained as the least-squares solution of this linear system.

The least-squares solution is

$$\hat{\beta} = H^{\dagger} T,$$

where $H^{\dagger}$ is the Moore–Penrose generalized inverse of the hidden layer output matrix.
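As a rough illustration of this training procedure, the following pure-Python sketch draws random input weights and biases, builds the hidden layer output matrix, and solves for the output weights. Two things are assumptions rather than part of the text: a sigmoid is used as the excitation function, and the Moore–Penrose solution is replaced by lightly regularized normal equations (a small `ridge` term) solved by Gaussian elimination, so that the example needs no external library. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_output(x, weights, biases):
    """One row of H: the activation of every hidden node for input vector x."""
    return [sigmoid(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
            for w, b in zip(weights, biases)]

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def train_elm(xs, ts, n_hidden, ridge=1e-8, seed=0):
    """Random hidden layer, then output weights from (H^T H + ridge*I) beta = H^T T."""
    rng = random.Random(seed)
    dim = len(xs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_output(x, weights, biases) for x in xs]
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [sum(row[i] * t for row, t in zip(h, ts)) for i in range(n_hidden)]
    return weights, biases, solve(hth, htt)

def predict_elm(x, model):
    weights, biases, beta = model
    return sum(b * g for b, g in zip(beta, hidden_output(x, weights, biases)))

# Hypothetical toy data: learn t = x^2 on [0, 1] with five hidden nodes.
xs = [[i / 20.0] for i in range(21)]
ts = [x[0] ** 2 for x in xs]
model = train_elm(xs, ts, n_hidden=5)
```

Only the output weights are fitted; the hidden layer stays exactly as it was drawn, which is what makes the training a single linear solve instead of an iterative procedure.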

3.2.2. Disadvantages of Extreme Learning Machine Algorithm

Although the ELM training algorithm is superior to the SVM and BP algorithms for regression in terms of computation, accuracy, and training time, and has been introduced into power load prediction where it performs well, it is a batch training algorithm. It is therefore not fully suited to practical power load prediction, where online optimization is very important. The online sequential ELM algorithm does not need to retrain on old data when new data are added to the learning process [15]. In addition, cloud computing and multiagent technology are used to improve the algorithm's ability to process massive data, to cope with the large volume and high dimensionality of power data, and to improve the accuracy of load prediction. The resulting algorithm is called the MapReduce-based weighted online sequential extreme learning machine, namely, MR-OSELM-WA.

3.3. Design MR-OSELM-WA Algorithm Based on Cloud Computing

3.3.1. Online Sequential Extreme Learning Machine Algorithm

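The steps of the online sequential learning algorithm are as follows:
(1) Initialization phase: part of the data set is used as the initial training set, and the number of hidden layer nodes is set manually. Let $k = 0$. First, the weight vector $a_i$ between the $i$th hidden layer node and the input nodes and the parameters $b_i$ of the excitation function are generated randomly [16]. Then the initial hidden layer output matrix $H_0$ is calculated, and the initial output weight vector is computed as
$$\beta_0 = P_0 H_0^{T} T_0, \qquad P_0 = \left(H_0^{T} H_0\right)^{-1},$$
where $T_0$ is the output vector of the target values.
(2) Online sequential learning: when new training data arrive, they are treated as the $(k+1)$th sample of the training process. First, calculate the hidden layer output matrix $H_{k+1}$ for the new data. Then let $T_{k+1}$ denote the corresponding target values.
(3) Calculate the output weight vector $\beta_{k+1}$ through the following formulas:
$$P_{k+1} = P_k - P_k H_{k+1}^{T}\left(I + H_{k+1} P_k H_{k+1}^{T}\right)^{-1} H_{k+1} P_k,$$
$$\beta_{k+1} = \beta_k + P_{k+1} H_{k+1}^{T}\left(T_{k+1} - H_{k+1}\beta_k\right).$$
(4) Set $k = k + 1$, go back to step (2), and continue training on the next training data.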
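The sequential update can be sketched for the simplest case of one new sample at a time, where the matrix inverse in the standard OS-ELM recursion reduces to a scalar division. The sketch below is a minimal illustration under two assumptions that are not in the text: the hidden layer outputs are given directly as rows `h`, and the batch initialization is replaced by a scaled identity matrix `P = I / ridge` with zero initial weights. The function name `oselm_update` and the sample stream are hypothetical.

```python
def oselm_update(p, beta, h, t):
    """One sequential step for a single sample with hidden output row h and target t."""
    n = len(h)
    ph = [sum(p[i][j] * h[j] for j in range(n)) for i in range(n)]    # P_k h
    denom = 1.0 + sum(h[i] * ph[i] for i in range(n))                 # 1 + h^T P_k h
    # P_{k+1} = P_k - (P_k h)(P_k h)^T / (1 + h^T P_k h); P stays symmetric.
    p_new = [[p[i][j] - ph[i] * ph[j] / denom for j in range(n)] for i in range(n)]
    error = t - sum(h[i] * beta[i] for i in range(n))                 # t - h^T beta_k
    p_new_h = [sum(p_new[i][j] * h[j] for j in range(n)) for i in range(n)]
    # beta_{k+1} = beta_k + P_{k+1} h (t - h^T beta_k)
    beta_new = [beta[i] + p_new_h[i] * error for i in range(n)]
    return p_new, beta_new

n_hidden = 2
ridge = 1e-6                      # assumed regularization for the initial P
p = [[(1.0 / ridge if i == j else 0.0) for j in range(n_hidden)] for i in range(n_hidden)]
beta = [0.0] * n_hidden

# Made-up stream of (hidden output row, target) pairs consistent with beta = [2, -3].
stream = [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0), ([1.0, 1.0], -1.0)]
for h, t in stream:
    p, beta = oselm_update(p, beta, h, t)
```

Each step touches only the new sample, so earlier data never have to be kept in memory or revisited.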

3.3.2. MR-OSELM-WA Algorithm Based on Cloud Computing

(1) MR-OSELM-WA Algorithm Idea. As the power grid becomes more intelligent, the volume of power data grows geometrically, and a single-machine online sequential extreme learning machine (OSELM) algorithm is no longer sufficient for power load prediction [17]. As the smart grid cloud computing model continues to mature, this paper uses the multiagent concept and cloud computing to improve the extreme learning machine algorithm.

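The idea of the MR-OSELM-WA algorithm is that multiple agents each run OS-ELM and their outputs are combined by weighting: an OSELM node with higher prediction accuracy should receive a higher weight when the final predicted value is calculated. The final predicted value is obtained as the weighted average of the OSELM predicted values of the nodes:

$$\hat{y} = \sum_{i=1}^{M} w_i \hat{y}_i .$$

Here $\hat{y}$, $\hat{y}_i$, and $w_i$ are, respectively, the final predicted value, the predicted value of the $i$th OSELM, and the prediction weight of the $i$th OSELM, and $M$ is the number of OSELM agents. The prediction weights are calculated from the squared error function with a gradient method:

$$E = \frac{1}{2}\left(t - \hat{y}\right)^{2},$$

where $t$ is the target value of the input training set of each OSELM agent. According to the gradient strategy, which moves each weight in the direction that reduces $E$, the weight change is

$$\Delta w_i = -\eta \frac{\partial E}{\partial w_i}, \tag{14}$$

where $\eta$ is the learning rate. According to equation (14), it can be concluded that

$$\Delta w_i = \eta \left(t - \hat{y}\right) \hat{y}_i,$$

where $w_i$ can be updated by $w_i \leftarrow w_i + \Delta w_i$.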
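One plausible reading of this weighting scheme is sketched below. It assumes, beyond what the text states, that the final prediction is the plain weighted sum of the agents' predictions, that the error is the squared error between the target and that sum, and that each weight takes a step that reduces this error. The function names, the learning rate, and the numbers are made up, and the values are assumed to be normalized so that a fixed learning rate is reasonable.

```python
def combine(predictions, weights):
    """Final prediction as the weighted sum of the agents' predictions."""
    return sum(w * y for w, y in zip(weights, predictions))

def update_weights(predictions, weights, target, learning_rate):
    """One gradient step on E = 0.5 * (target - combined)^2 with respect to the weights."""
    error = target - combine(predictions, weights)
    # dE/dw_i = -(target - combined) * prediction_i, so stepping against the gradient adds:
    return [w + learning_rate * error * y for w, y in zip(weights, predictions)]

# Hypothetical example with two agents and normalized values.
predictions = [1.0, 2.0]
weights = [0.5, 0.5]
target = 2.0

before = combine(predictions, weights)
new_weights = update_weights(predictions, weights, target, learning_rate=0.1)
after = combine(predictions, new_weights)
```

In this toy case the second agent's prediction equals the target, and its weight grows more than the first agent's, so the combined value moves closer to the target after the update.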

(2) Detailed Steps of MR-OSELM-WA Algorithm. The main idea of the MapReduce programming framework is to exploit parallelism by writing the corresponding Map and Reduce functions. The intermediate results of the MR-OSELM-WA algorithm are stored in the HBase distributed database and the distributed cache. The distributed MR-OSELM-WA prediction and decision model based on MapReduce is shown in Figure 1:
(1) The large training set is read from the distributed storage on the cloud computing platform and is split into different training subsets by the input splitting of the MapReduce programming framework [18]. The number of subsets is the number of Map tasks in the cloud cluster.
(2) The training subsets are trained in parallel according to the logic of the Map function, which is the logic of the OS-ELM training algorithm; this is equivalent to training on different machines.
(3) The outputs of the Map tasks, which are the predicted values of the different learners, are passed through the Shuffle phase of the MapReduce programming framework to the Reduce phase, where the weights of the predicted values are determined. According to the weight calculation method above, the weight of the predicted value output by each Map task is determined, and then the final predicted value is calculated.
(4) For rolling prediction of medium- and long-term load, slide the window along the time axis as required and return to step (1) to predict the load of the following day.

4. Result Analysis

4.1. Experimental Preparation Stage

The newly built Hadoop platform has 32 nodes, each with an Intel(R) Core(TM) i5-2400 CPU, 4 GB of RAM, and 100 Mbit/s network bandwidth. The Hadoop version is 0.20.2.

4.1.1. Example Analysis Data Set

The real regional load data for 1997–1998 were taken from the 2001 mid-term load forecasting competition organized by the European Network on Intelligent Technologies for Smart Adaptive Systems (EUNITE). The data provided by EUNITE comprise the power load recorded every 0.5 hours from 1997 to 1998, the mean daily temperature from 1995 to 1998, and the holiday dates from 1997 to 1999 [19]. The objective of the load forecasting is to predict the maximum daily power load for the 31 days of January 1999 from these data.

4.1.2. Evaluation Indicators

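The accuracy of load prediction is measured by the mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|L_i - \hat{L}_i\right|}{L_i} \times 100\%, \tag{15}$$

where $L_i$ and $\hat{L}_i$ are the true and predicted power load values of day $i$, respectively, and $n$ is the number of days in the forecast month. In power load forecasting, the smaller the MAPE value, the more accurate the load forecast.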
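The MAPE index is straightforward to compute. The following sketch uses the usual definition (mean of the absolute errors relative to the true values, expressed in percent); the function name and the numbers are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent, over paired daily values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Made-up daily peak loads and forecasts for two days.
score = mape([100.0, 200.0], [110.0, 190.0])   # errors of 10% and 5%
```

Because each error is divided by the true load, days with a low true load weigh more heavily than the same absolute error on a high-load day.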

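Relative speedup and scaleup were used to evaluate the parallel performance of the MR-OSELM-WA algorithm. Speedup compares the execution time on a larger cluster with that on the original cluster for the same data. Scaleup compares the execution time when the cluster and the data are both enlarged by the same factor with the execution time on the original cluster and the original data.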
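Both indicators are simple ratios of run times. The sketch below uses the common definitions, which are assumed here rather than quoted from the article: speedup divides the run time on the smallest cluster by the run time on a larger one for the same data, and scaleup divides the baseline run time by the run time obtained when nodes and data are multiplied by the same factor. All timings are invented for illustration.

```python
def relative_speedup(times_by_nodes, base_nodes):
    """Run time on the base cluster divided by run time on each cluster (same data)."""
    base_time = times_by_nodes[base_nodes]
    return {nodes: base_time / t for nodes, t in sorted(times_by_nodes.items())}

def scaleup(times_by_factor):
    """Baseline run time divided by run time with factor-times the nodes and the data."""
    base_time = times_by_factor[1]
    return {factor: base_time / t for factor, t in sorted(times_by_factor.items())}

# Invented run times in seconds for one fixed data set on 4, 8, 16, and 32 nodes.
fixed_data_times = {4: 800.0, 8: 420.0, 16: 230.0, 32: 140.0}
speedups = relative_speedup(fixed_data_times, base_nodes=4)
ideal = {nodes: nodes / 4 for nodes in fixed_data_times}   # linear speedup

# Invented run times when nodes and data grow together by factors 1, 2, and 4.
grown_times = {1: 100.0, 2: 110.0, 4: 125.0}
scaleups = scaleup(grown_times)
```

With these invented numbers the speedup stays below the linear ideal and the scaleup falls below 1, which is the qualitative behaviour expected when communication cost grows with the cluster.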

4.1.3. Load Forecasting Training Set Design

The input sample includes three groups of features: each training input is composed of [date, temperature, historical load]. Seven binary digits are used to represent the date information. The predicted daily temperature is expressed as a decimal number and normalized. The historical load is the maximum load of the 7 days before the forecast date.

The objective of the experiment is to predict the maximum daily power load in January 1999. Extensive experiments show that temperature is correlated with power load. To improve the accuracy of prediction, the sample data are therefore restricted to the winter data from November to April. The output of the training set is the maximum power load value of the predicted day.
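A training input of this kind could be assembled as in the sketch below. Several details are assumptions, since the text does not spell them out: the seven binary digits are taken to be a one-hot encoding of the day of the week, the temperature is min-max normalized, and the historical load is taken to be the seven daily peak loads preceding the forecast day. The function name and all values are hypothetical.

```python
from datetime import date

def encode_sample(day, temperature, temp_min, temp_max, previous_peaks):
    """Build [7 weekday bits, normalized temperature, 7 previous daily peak loads]."""
    if len(previous_peaks) != 7:
        raise ValueError("expected the peak loads of the 7 preceding days")
    # Assumed encoding: one bit per weekday, Monday first.
    weekday_bits = [1.0 if day.weekday() == k else 0.0 for k in range(7)]
    # Assumed min-max normalization of the daily temperature to [0, 1].
    temp_scaled = (temperature - temp_min) / (temp_max - temp_min)
    return weekday_bits + [temp_scaled] + list(previous_peaks)

# Made-up values for the first forecast day.
sample = encode_sample(
    day=date(1999, 1, 1),
    temperature=0.0,
    temp_min=-10.0,
    temp_max=30.0,
    previous_peaks=[720.0, 735.0, 741.0, 698.0, 702.0, 730.0, 725.0],
)
```

The resulting vector has 15 entries; in practice the load values would probably be normalized as well before training.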

4.2. Example Analysis

4.2.1. Prediction Accuracy of MR-OSELM-WA

In this experiment, the MR-OSELM-WA algorithm is compared with the support vector regression (SVR) algorithm and the functional network algorithm, a generalization of neural networks. The SVR and functional network prediction algorithms both showed excellent prediction ability in the EUNITE competition. The performance of the proposed MR-OSELM-WA algorithm for power load prediction is tested against these two algorithms.

MAPE was calculated with formula (15) as the objective function, and the parameters of each algorithm were selected by 10-fold cross-validation. With these parameters fixed, MR-OSELM-WA, the functional network algorithm, and the SVR algorithm were all trained on the historical data of 1997 and 1998; MR-OSELM-WA uses a one-by-one online sequential learning mode to predict the power load values in January 1999. To ensure that the results are reliable, 50 runs were completed and their mean was taken as the final result. The MAPEs of the power load prediction algorithms are shown in Table 1 [20]. As shown in Table 1, the MR-OSELM-WA proposed in this article obtains the lowest MAPE value, i.e., it has the highest prediction accuracy and performs better than SVR and functional networks. In addition, the SVR and functional network prediction algorithms are batch training algorithms: the larger the training set, the more memory is required to run them. If the required memory exceeds the limit, the efficiency of the algorithm is greatly reduced. This situation is unlikely to occur with MR-OSELM-WA, because the training data handled at each step of its one-by-one (or chunk-by-chunk) online sequential learning mode are much smaller than in the batch ELM training mode.

Figures 2 and 3 compare the actual power load values in January 1999 with the values predicted by the MR-OSELM-WA, SVR, and functional network algorithms.

4.2.2. MR-OSELM-WA Parallel Performance

To assess the parallel performance of the MR-OSELM-WA algorithm, the sample load data provided by EUNITE were expanded into four data sets of 1000, 2000, 4000, and 8000 times the original number of records. These were run on cloud platforms with clusters of 4, 8, 16, and 32 nodes to calculate the speedup and scaleup. A perfectly parallel algorithm would achieve linear speedup, but in practice the network communication cost grows as the number of cluster nodes increases, so linear speedup is hard to reach, as shown in Figure 4. Figure 4 also shows that the speedup of MR-OSELM-WA comes closer to linear as the data scale grows, especially for large data sets. In practice, the more data there are, the better the speedup of MR-OSELM-WA; that is, MR-OSELM-WA can meet the requirements of large-scale power data processing.

In a perfectly parallel system the scaleup stays constant at 1, but this cannot be achieved in practice. As the data size increases, the scaleup of a parallel system gradually decreases. The test results are shown in Figure 5. The scaleup of the MR-OSELM-WA algorithm is good, because it decreases only slowly when the data set is large.

5. Conclusion

As power systems become more intelligent, the trend toward massive, high-dimensional power system data is unstoppable. The load forecasting algorithms widely used in power load forecasting, represented by support vector regression, have high computational complexity, and for load prediction on massive high-dimensional data a single machine cannot bear such a huge consumption of computing resources. Big data processing technology, which has become popular in recent years, is an effective way to solve this problem, and the algorithm parallelization it enables has become a research direction in load forecasting. In this article, an extreme learning machine power load prediction algorithm based on cloud computing is proposed, which not only shortens the training time and reduces the consumption of computing resources but also significantly improves the accuracy of power load prediction.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] Y. Guo and Y. Guo, “Intelligent network office system based on cloud computing and machine learning,” Mobile Information Systems, vol. 2021, no. 2, Article ID 5868261, 14 pages, 2021.
[2] W. Tang, Q. Yang, X. Hu, and W. Yan, “Deep learning-based linear defects detection system for large-scale photovoltaic plants based on an edge-cloud computing infrastructure,” Solar Energy, vol. 231, pp. 527–535, 2022.
[3] W. Jianbo and X. Cao, “Factors affecting the evolution of advanced manufacturing innovation networks based on cloud computing and multiagent simulation,” Mathematical Problems in Engineering, vol. 2021, no. 1, Article ID 5557606, 12 pages, 2021.
[4] W. L. Tai, Y. F. Chang, and W. H. Huang, “Security analyses of a data collaboration scheme with hierarchical attribute-based encryption in cloud computing,” International Journal on Network Security, vol. 22, no. 2, pp. 212–217, 2020.
[5] M. Nazari Jahantigh, A. Masoud Rahmani, N. Jafari Navimirour, and A. Rezaee, “Integration of internet of things and cloud computing: a systematic survey,” IET Communications, vol. 14, no. 2, pp. 165–176, 2020.
[6] F. Wulf, M. Westner, and S. Strahringer, “Cloud computing adoption: a literature review on what is new and what still needs to be addressed,” Communications of the Association for Information Systems, vol. 48, no. 1, pp. 523–561, 2021.
[7] R. Wang, C. Tian, and L. Yan, “Malware detection using cnn via word embedding in cloud computing infrastructure,” Scientific Programming, vol. 2021, no. 6, Article ID 8381550, pp. 1–7, 2021.
[8] G. Sharma, “Mpc built frequency regularization studies of multi-area electric power system base on short term load forecasting using ann,” International Journal of Engineering Research in Africa, vol. 50, pp. 145–161, 2020.
[9] Z. Shafiei Chafi and H. Afrakhte, “Short-term load forecasting using neural network and particle swarm optimization (pso) algorithm,” Mathematical Problems in Engineering, vol. 2021, no. 2, Article ID 5598267, pp. 1–10, 2021.
[10] B. S. Kwon, D. J. Bae, C. H. Moon, and K. B. Song, “Load forecasting algorithm for special days by considering temperature sensitivity and btm estimation,” The Transactions of the Korean Institute of Electrical Engineers, vol. 70, no. 2, pp. 290–296, 2021.
[11] Y. Guan, D. Li, S. Xue, and Y. Xi, “Feature-fusion-kernel-based Gaussian process model for probabilistic long-term load forecasting,” Neurocomputing, vol. 426, pp. 174–184, 2021.
[12] S. Ma, “A hybrid deep meta-ensemble networks with application in electric utility industry load forecasting,” Information Sciences, vol. 544, pp. 183–196, 2021.
[13] G. F. Savari, V. Krishnasamy, J. Sathik, Z. M. Ali, and S. H. Abdel Aleem, “Internet of things based real-time electric vehicle load forecasting and charging station recommendation,” ISA Transactions, vol. 97, pp. 431–447, 2020.
[14] J. W. Lee, H. J. Kim, and M. K. Kim, “Design of short-term load forecasting based on ann using bigdata,” The Transactions of the Korean Institute of Electrical Engineers, vol. 69, no. 6, pp. 792–799, 2020.
[15] B. S. Kwon, R. J. Park, and K. B. Song, “Analysis of the effect of weather factors for short-term load forecasting,” The Transactions of the Korean Institute of Electrical Engineers, vol. 69, no. 7, pp. 985–992, 2020.
[16] R. Huang and X. Yang, “Analysis and research hotspots of ceramic materials in textile application,” Journal of Ceramic Processing Research, vol. 23, no. 3, pp. 312–319, 2022.
[17] J. Jayakumar, B. Nagaraj, S. Chacko, and P. Ajay, “Conceptual implementation of artificial intelligent based E-mobility controller in smart city environment,” Wireless Communications and Mobile Computing, vol. 2021, Article ID 5325116, pp. 1–8, 2021.
[18] L. Li, Y. Diao, and X. Liu, “Ce-Mn mixed oxides supported on glass-fiber for low-temperature selective catalytic reduction of NO with NH3,” Journal of Rare Earths, vol. 32, no. 5, pp. 409–415, 2014.
[19] Z. Huang and S. Li, “Reactivation of learned reward association reduces retroactive interference from new reward learning,” Journal of Experimental Psychology: Learning, Memory, and Cognition, vol. 48, no. 2, pp. 213–225, 2022.
[20] Q. Zhang, “Relay vibration protection simulation experimental platform based on signal reconstruction of MATLAB software,” Nonlinear Engineering, vol. 10, no. 1, pp. 461–468, 2021.

Copyright

Copyright © 2022 Weijiang Wu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
