Abstract

More and more IoT (Internet of Things) devices have been connected to our lives in recent years, making life more convenient. Many countries are also using Internet of Things technology to carry out intelligent electricity network reform. One of the reform goals, balancing the supply and demand of electricity, has become a top priority, and real-time electricity pricing has become an effective way to achieve it. However, using traditional machine learning models for real-time electricity price prediction requires complex feature engineering, and the results are not satisfactory. In addition, mainstream fusion methods use data-level fusion, which puts very high pressure on communication bandwidth and computing resources. In this paper, an LSTM-based (long short-term memory-based) decision-level fusion of multisource data is proposed and applied to real-time electricity price prediction on actual electricity price datasets. The method addresses the difficulties of traditional machine learning models in dealing with complex nonlinear problems. It achieves local asynchronous processing of multisource data through decision-level fusion, reducing the requirement for bandwidth resources while providing accurate results in real-time electricity price prediction. The experimental results show that the prediction accuracy of the decision fusion prediction model based on LSTM is higher than that of the linear regression algorithm.

1. Introduction

With the development and progress of science and technology, more and more Internet of Things devices are connected to our lives. They provide increasing help in the intelligent grid [1], intelligent transportation [2], the smart home [3], and public safety [4], making life more convenient. Internet of Things devices collect all kinds of information through various information sensors, radio-frequency identification technology, the global positioning system, infrared sensors, laser scanners, and other devices and technologies and then connect all kinds of objects. Through various network links, they realize the ubiquitous connection between things and people and the intelligent perception, identification, and management of things and processes [5].

In recent years, many countries have carried out intelligent electricity network reform with the help of Internet of Things technology. With the reform of the smart grid, the electricity Internet of Things has also introduced a variety of the latest technologies, such as cloud computing [6], artificial intelligence [7], and big data [8], to realize information perception and processing in all links of the electricity Internet of Things [9]. The development of the electricity Internet of Things is bound to be accompanied by the access of large-scale terminal equipment, producing a large amount of collected data [10]. The electricity Internet of Things realizes real-time sensing and dynamic control of the electricity grid with the help of many sensing devices and various heterogeneous communication networks [11]. As shown in Figure 1, the Internet of Things in electricity system (IOTIES) is a three-tier structure composed of perception, network, and application layers. The perception layer, located at the bottom of the electricity Internet of Things, is the core of items [12]. The network (transmission) layer is the second layer of the electricity Internet of Things. As the link between the perception and application layers, it is a center of information processing, and its primary function is to transmit information safely and reliably from the perception layer to the application layer. The application layer is the top layer of the three-tier structure. It can calculate, process, and mine the data collected by the perception layer to realize real-time control, accurate management, and scientific decision-making of the physical world [13].

In promoting the development of the electricity Internet of Things, and because of the access of large-scale terminal equipment and the collection of massive data, some scholars have applied appropriate data collection methods, cloud communication, and cloud computing. Data acquisition mostly depends on various sensors, such as those in user equipment, smart meters, and electricity stations. Leikanger et al. [14] proposed a new method to upload sensor data directly to the online cloud through NFC, which removes the earlier drawback that data had to be submitted to a mobile phone before being forwarded. To improve the security of electricity information acquisition systems and reduce the risk of attack damage, Li et al. [15] proposed an electricity information acquisition system based on blockchain technology. After multisource data are acquired, cloud computing is introduced. A key benefit of connecting edge and cloud computing is achieving high throughput under high concurrent access, mobility support, real-time processing guarantees, and data persistence [16].

In parallel with the vigorous development of the Internet of Things for electricity and the market-oriented reform of electricity around the world, coupled with the frequent occurrence of electricity outages due to energy scarcity worldwide, more and more scholars are focusing on predicting real-time electricity prices. Currently, electricity prices in most regions are fixed rather than adjusted dynamically in real time, which creates imbalances in where and when electricity is consumed. Traditional pricing methods have led to wasted energy and to problems affecting grid stability. Balancing electricity consumption by adjusting prices in real time is a more effective approach, and accurate real-time price forecasting can help electricity companies coordinate the delivery of electricity. It can also be used to cope with electricity shortages or excesses during specific periods, ensuring the safety [17] and stability [18] of the electricity system while achieving the national goal of carbon neutrality [19].

Currently, there are two main types of approaches to real-time electricity price forecasting: traditional machine learning models and neural network models. Part of the research has used traditional machine learning models that mostly consider historical charge sequences, climate, time, and other factors [20–22]. Among conventional machine learning models for predicting electricity prices, Azmira et al. [23] combined a least squares support vector machine (LSSVM) with the bacterial foraging optimization algorithm (BFOA) in a multistage optimization approach, LSSVM-BFOA, which can improve the accuracy and efficiency of prediction and provide an ultraconcise solution for electricity forecasting studies. Ding and Ge [24] proposed a new adaptive Kalman filter-based method for day-ahead electricity price forecasting. Under the condition that the state transfer matrix and the statistical characteristics of the observation noise of the forecasting model are unknown, the unknown parameters of the forecasting model are estimated from the electricity market clearing tariff data to ensure that unit capacity can participate in market bidding and achieve the goal of maximizing profit. The method shows promising results on the Pennsylvania-New Jersey-Maryland (PJM) electricity market dataset.

However, using traditional machine learning models involves feature engineering, which is a tedious process. Therefore, some scholars have applied neural network approaches to real-time electricity price prediction to avoid this tedium. Zou et al. [25] proposed a deep learning model based on stacked autoencoders for electricity price prediction to address the problem that artificial neural networks train slowly when the input data are extensive and easily fall into local optima. The results showed that it could effectively solve the problem that neural networks are difficult to train. Li et al. [26] improved the genetic algorithm- (GA-) based BP neural network prediction algorithm. The traditional BP neural network tends to fall into local minima, and the genetic algorithm can mitigate this problem by optimizing the weights and thresholds of the BP neural network. The improved algorithm was used to predict electricity prices and produced good prediction results. However, because traditional neural networks have difficulty modeling future behavior from time-series features, the LSTM algorithm has been applied to time series problems [27]. The LSTM algorithm mitigates vanishing gradients and can capture long-term dependencies.

The research on information fusion methods can be divided into three levels: data-level fusion, feature-level fusion, and decision-level fusion. In data-level fusion, the least information is lost in the fusion process. However, the rapid development of Internet of Things data collection technology brings severe challenges for all kinds of data collection systems. Shah [28] proposed a deep learning model called tensor deep learning (TDL), which uses a higher-order backpropagation algorithm, based on the traditional backpropagation algorithm, that extends the data from linear space to multilinear space and trains the parameters of the proposed model. The experimental results show that the proposed model performs well in heterogeneous data fusion. However, data-level fusion has very significant limitations: it is highly influenced by the environment and has very high requirements on communication bandwidth and computing resources.

In contrast, feature-level fusion provides objective information compression of data and solves the difficulties of data-level fusion. To address the problem of how sensitive data, such as online banking data, can be used safely in the Internet of Things, Gad et al. [29] proposed a multialgorithm feature-level fusion scheme based on iris authentication; the results show that the accuracy of verifying the correct client in sensitive applications is significantly improved. Feature-level fusion can handle data coming from heterogeneous sensors. Still, the fused feature vectors are generally of high dimensionality, which makes the later pattern classification more difficult. In decision-level fusion, data from different sensors can first be preprocessed locally, followed by feature extraction and pattern classification. Decision-level fusion is less sensor-dependent: the participating sensors can be homogeneous or heterogeneous, and asynchronous information can be processed. Decision-level fusion is currently widely used in medical research. Hypoxia in daily life is difficult to recognize in the short term, but if the condition is severe, it may lead to decreased physical function or complete incapacity. For the hypoxia problem that often occurs in people who work at high altitudes for long periods, Acharya et al. [30] proposed a parallel decision-level real-time hypoxia monitoring system that builds a model from blood oxygen saturation and dysfunction at different measurements; tested on a real dataset, it showed outstanding results. Decision-level fusion has also demonstrated its advantages in diagnosing depression. Zhang et al. [31] proposed a decision-level fusion method based on deep forest for multimodal data, where a random forest regression (RFR) model was trained separately using heterogeneous data to obtain high-level feature vectors, and the cosine similarity of the two vectors was used as a complementary metric for the modal data. Experimental results on the DAIC-WOZ dataset show that this fusion method significantly outperforms other methods and can better identify depressed patients.

The main contributions of this paper are listed as follows:
(1) A distributed decision-level fusion method based on the power Internet of Things is proposed to predict real-time electricity prices.
(2) At present, electricity data is multisource and heterogeneous. Therefore, a distributed data acquisition method based on the edge cloud is proposed. It can effectively obtain information and use the edge server for data processing, reducing the communication cost and improving timeliness and prediction accuracy.
(3) The electricity price of different regions is predicted through deep learning. The fusion is realized at the decision-making level to complete the unified real-time electricity price prediction for multiple regions.

The rest of the paper is organized as follows. Section 2 briefly introduces the architecture, multisource data fusion framework, and system flow chart of electricity Internet of Things. Section 3 describes the LSTM algorithm, decision-level fusion, and experimental evaluation index in detail. The feasibility and validity of the model and algorithm are verified by error comparison in Section 4. Section 5 presents the conclusions and future research.

2. Deep Learning with Multisource Data Decision-Level Fusion

Before the emergence of the Internet of Things, communication infrastructure resources and power system infrastructure resources were not integrated, resulting in low information levels and low infrastructure utilization efficiency. With the development of the Internet of Things, these problems are gradually being solved, especially through the proposed multisource data fusion architecture. The multisource data fusion system under the Internet of Things introduced here includes a multisource data acquisition topology and a machine learning model with multisource data decision-level fusion.

2.1. Multisource Data Collection Topology

Figure 2 shows a tree-structured topology in which the edge collects data and uploads it to the cloud. This topology is mainly composed of a decision center, a central cloud, and edge clouds, where each edge cloud is connected to multiple wireless transmitters through which different sensor data are acquired. These sensor data sources include user equipment data, hydroelectric electricity plant data, and thermal electricity plant data.

In Figure 2, three regions perform edge data acquisition. They rely on various sensors to obtain data, including sensors from hydroelectric electricity plants, thermal electricity plants, user equipment, and environmental collection devices. The electricity usage and environmental information are transmitted to the edge server through wireless transmission devices. Then, the edge server performs edge calculations to obtain the predicted real-time electricity price for each region. The predicted electricity price for each region is then transmitted to the central cloud through a router. The central cloud applies the decision-level fusion machine learning model to obtain the final real-time electricity price prediction value.

The central and edge clouds communicate in both directions through a wired network, while the edge cloud and each sensor communicate through the wireless network. This design speeds up transmission and also lowers the cost of acquiring the various sensor data.
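The bandwidth argument can be illustrated with a rough count of the scalar values sent from the edge clouds to the central cloud. This is only a sketch under stated assumptions: the function name `values_uploaded`, the figure of 24 records per region, and the assumption that decision-level fusion uploads exactly two scalars per region (a predicted price and an average load) are illustrative and not taken from the paper.

```python
def values_uploaded(records_per_region, attributes, regions, decision_level):
    """Count scalar values sent from the edge clouds to the central cloud."""
    if decision_level:
        # Assumption: each region sends one predicted price plus its average load.
        return regions * 2
    # Data-level fusion ships every raw attribute of every record.
    return regions * records_per_region * attributes


# Hypothetical workload: 3 regions, 24 records each, 8 attributes per record.
raw = values_uploaded(24, 8, 3, decision_level=False)
fused = values_uploaded(24, 8, 3, decision_level=True)
print(raw, fused)  # 576 6
```

Under these assumptions the upload volume for decision-level fusion no longer grows with the number of raw records, which is the property the topology relies on.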

2.2. Overall Framework Diagram

First, each city in the same province collects historical electricity price information as the original dataset and, through data preprocessing, obtains the input dataset $\mathrm{Data}_{city}$ (city = A, B, ..., N) in the Dataset Input module. Then, each city uses $\mathrm{Data}_{city}$ to train the LSTM model locally and uses the trained LSTM model to predict its future electricity price, $\mathrm{Predict}_{city}$ (city = A, B, ..., N). Finally, the electricity price information predicted by each city is uploaded to the center, and the final forecast of the future electricity price of the province, PRICE, is obtained through a decision-level fusion algorithm.

As shown in Figure 3, after obtaining the prediction results in the first predict module, each city uploads the predicted electricity price information to the cloud center. The weight function of the decision-level fusion is formed according to the power load information of each city. After the fusion weights are determined, the decision-level fusion calculation is performed on the future electricity prices predicted by each city, and the unified forecast of the future electricity price of the province, PRICE, is obtained.

2.3. Flow Chart

Figure 4 shows the flowchart of the LSTM-based decision-level fusion of multisource data for the electricity price prediction model. The first step is data preprocessing: check whether the data are correct; if there are anomalies, remove the noisy data and fill in the missing data; then normalize the data, output the processed data, and divide it into a training set, a validation set, and a test set. Preprocessing effectively keeps harmful data out of model training. The second step is to train the LSTM model with the training set and validate it with the validation set; the model is retrained if it does not meet expectations, and the process moves to the next step if it does. The third step is to construct the decision-level fusion algorithm by assigning weights based on each region's share of the electricity data. The fourth step performs electricity price prediction on the test set: the test set is input into the trained LSTM model, and the decision-level fusion algorithm from the third step produces the final electricity price prediction.

3. Specific Implementation

This experiment uses the Australian electricity load and electricity price dataset. The original dataset is set to , and the total data volume is . The dataset contains eight attributes: date, hour, humidity, wet bulb temperature, dry bulb temperature, dew point temperature, electricity price, and electricity load. The data from 2006 to 2009 is used as the training set, and the data from 2010 is used as the test set.

Limited by the confidentiality of electricity price data, this experiment uniformly sampled the original dataset and divided it into three datasets of the same size, representing the electricity price data of three cities, A, B, and C, in the same province.
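The paper does not spell out how the uniform sampling is carried out. One simple possibility, shown here purely as an assumption, is to deal the records out round-robin so that every subset covers the whole time range; the function name `split_into_cities` and the city labels are hypothetical.

```python
def split_into_cities(records, names=("A", "B", "C")):
    """Deal records round-robin into len(names) equally sized subsets."""
    k = len(names)
    # Drop any remainder so that all subsets have the same size.
    usable = len(records) - len(records) % k
    return {name: records[i:usable:k] for i, name in enumerate(names)}


hours = list(range(10))  # stand-in for ten hourly records
parts = split_into_cities(hours)
print(parts)  # {'A': [0, 3, 6], 'B': [1, 4, 7], 'C': [2, 5, 8]}
```

A random shuffle followed by an equal three-way cut would also count as uniform sampling; the round-robin variant is chosen here only because it keeps each subset in chronological order.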

3.1. Data Preprocessing

The multisource data in the training set have different dimensions and units, which affects model training. Normalizing the data maps values of different scales to the interval [0, 1] and avoids errors caused by the difference in dimensions. Since the attributes of the training set are all numerical, this article uses the MinMaxScaler function provided in the Python third-party library scikit-learn to normalize the eight attributes of different dimensions. Assuming that the attribute set $X = \{x_1, x_2, \ldots, x_n\}$ contains $n$ elements, the normalization formula is shown in formula (1):

$$x_i' = \frac{x_i - \min(X)}{\max(X) - \min(X)} \tag{1}$$

where $x_i$ indicates an element in the attribute set $X$, $i = 1, 2, \ldots, n$. The min and max functions return the minimum and maximum values in the set $X$, respectively.
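As a minimal illustration of the min-max normalization described above, the following sketch applies it to a single attribute column in plain Python. The function name `min_max_normalize` and the sample humidity values are illustrative; in the paper the same mapping is performed per attribute by scikit-learn's MinMaxScaler.

```python
def min_max_normalize(values):
    """Map a list of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


humidity = [30.0, 45.0, 60.0]  # hypothetical readings
print(min_max_normalize(humidity))  # [0.0, 0.5, 1.0]
```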

Because the “electricity price” attribute in the training set contains abnormal noise data, this paper uses the third-party libraries pandas and NumPy for data processing to improve the generalization ability of the trained model. Records with electricity prices less than 0 or greater than 200 are treated as abnormal and filled in. After normalization and denoising, the preprocessed training set is obtained.
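The paper states the thresholds (below 0 or above 200) but not the value used to fill abnormal prices. The sketch below assumes a carry-forward fill, in which each abnormal price is replaced by the most recent valid one; this choice, the function name `fill_abnormal_prices`, and the sample prices are assumptions made for illustration.

```python
def fill_abnormal_prices(prices, low=0.0, high=200.0):
    """Replace prices outside [low, high] with the last valid price seen.

    Outliers at the start of the series, which have no earlier valid
    value, take the first valid price instead.
    """
    valid = [p for p in prices if low <= p <= high]
    if not valid:
        raise ValueError("no valid prices to fill from")
    cleaned = []
    last = valid[0]
    for p in prices:
        if low <= p <= high:
            last = p
        cleaned.append(last)
    return cleaned


print(fill_abnormal_prices([-5.0, 22.0, 950.0, 25.0]))  # [22.0, 22.0, 22.0, 25.0]
```

Filling with a column mean or median would be an equally plausible reading of the text; the thresholds are the only part taken from the paper.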

3.2. Deep Learning

Because of the characteristics of time series in electricity price data, this paper uses the Keras framework based on deep learning library TensorFlow 2.5 to construct an LSTM model as a local prediction model for each city.

RNN is a standard deep learning method for time series forecasting, but it suffers from vanishing or exploding gradients, which leads to short-term memory problems. The hidden layer information of the RNN at a given moment comes only from the current input and the hidden layer information of the previous moment, so there is no long-term memory function. When the time series is long, the RNN cannot learn the information from early moments.

LSTM is a variant of RNN with a much more complicated network structure. Based on RNN, three gate structures, the “forget gate,” “input gate,” and “output gate,” are added to determine whether information is preserved or forgotten. As information enters the model, the cell in the LSTM judges it: information that meets the rules is kept, and noncompliant information is forgotten. Based on this principle, LSTM can effectively solve the long-term memory problem of RNN.

The “forget gate” measures the importance of past memory. The information of the previous moment and the current moment is input into the nonlinear Sigmoid activation function $\sigma$ to obtain $f_t$, which determines whether the past information should be saved or forgotten (1 means all is saved, 0 means all is forgotten). The formula for the forget gate is

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$

where $W_f$ is the weight matrix of the forget gate and $b_f$ is the bias of the forget gate. $h_{t-1}$ represents the output of the unit state at the previous time, and $x_t$ represents the input at the current time.

Then, the “input gate” calculates the importance of the input information and yields $i_t$ through the Sigmoid function. $i_t$ determines whether the candidate data, compressed to between -1 and +1 by the tanh function, needs to be updated. The formula for the input gate is

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$

where $W_i$ is the weight matrix of the input gate and $b_i$ is the bias of the input gate. $h_{t-1}$ represents the output of the unit state at the previous time, and $x_t$ represents the input at the current time.

Finally, the importance of the updated memory to the next hidden layer is calculated through the “output gate” to determine the information passed to the next hidden layer. The formula for the output gate is

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$

where $W_o$ is the weight matrix of the output gate and $b_o$ is the bias of the output gate. $h_{t-1}$ represents the output of the unit state at the previous time, and $x_t$ represents the input at the current time.
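To show how the three gates interact, the following sketch computes one LSTM time step in plain Python. It follows the standard LSTM formulation, in which a tanh candidate and the cell-state update complement the three gates; the function names, the `params` layout, and the toy weights are illustrative assumptions, and the paper's actual model is built with Keras rather than by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'c' to (W, b) pairs."""
    concat = h_prev + x_t  # the concatenated vector [h_{t-1}, x_t]

    def gate(name, activation):
        W, b = params[name]
        return [activation(z) for z in affine(W, concat, b)]

    f_t = gate("f", sigmoid)      # forget gate: how much old memory to keep
    i_t = gate("i", sigmoid)      # input gate: how much new candidate to admit
    o_t = gate("o", sigmoid)      # output gate: how much memory to expose
    c_hat = gate("c", math.tanh)  # candidate memory, squashed into (-1, 1)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy example: hidden size 1, input size 1, all weights and biases zero,
# so every gate outputs sigmoid(0) = 0.5 and the candidate is tanh(0) = 0.
zero = ([[0.0, 0.0]], [0.0])
params = {"f": zero, "i": zero, "o": zero, "c": zero}
h_t, c_t = lstm_step([1.0], [0.0], [1.0], params)
print(c_t)  # [0.5]
```

With all gates at 0.5, half of the previous cell state survives, which makes the role of the forget gate easy to check by hand.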

In this paper, the mean square error (MSE) is selected as the loss function, elu is used as the nonlinear activation function, and the Adam optimizer is used with a preset learning rate to construct an LSTM model with a 5-layer network structure. The number of nodes in the input layer is set to 128, the numbers of nodes in the three hidden layers to (32, 16, 8), and the number of nodes in the output layer to 1. The formula for MSE is

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
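The MSE loss is simple enough to verify by hand. The helper below is a small sketch with hypothetical price values; the name `mean_squared_error` is chosen for readability and is not the call used inside Keras.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [30.0, 32.0, 35.0]     # hypothetical actual prices
predicted = [31.0, 32.0, 33.0]  # hypothetical predictions
print(mean_squared_error(actual, predicted))  # (1 + 0 + 4) / 3
```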

We input the training set into the LSTM model, set the number of iterations, and use the early stopping method to prevent the model from overfitting. We set the minimum amount of change to 0 and the number of tolerance periods to 15; that is, when the improvement of the monitored variable stays at or below 0 for more than 15 iterations, the iteration is stopped in advance and the training of the model is ended.
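The early stopping rule can be sketched as a small stand-alone function. The parameter names `min_delta` and `patience` follow the Keras EarlyStopping callback; the assumption that the monitored variable is a validation loss, the exact boundary behaviour, and the sample loss curve are illustrative rather than taken from the paper.

```python
def early_stop_epoch(val_losses, min_delta=0.0, patience=15):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement larger than min_delta: remember it and reset the counter.
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical loss curve: three improving epochs, then a plateau.
losses = [1.0, 0.8, 0.7] + [0.7] * 20
print(early_stop_epoch(losses))  # 18
```

Here the loss stops improving after epoch 3, so the fifteenth non-improving epoch is epoch 18, where training ends.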

3.3. Decision-Level Fusion

In cities A, B, and C, the local electricity price dataset is used to train the LSTM model, and the trained LSTM model is then used to predict the future electricity price of the city, $\mathrm{Predict}_{city}$ (city = A, B, C). Finally, cities A, B, and C upload the predicted future electricity price information to the cloud center.

Because of differences in urban population and economic foundation, the total electricity consumption of different cities varies greatly. The greater the total electricity consumption, the greater the impact of the prediction results uploaded by the city on the decision-level fusion calculation. We use the Softmax function to convert the average electricity load into the corresponding impact weight $w_{city}$. The calculation formula is

$$w_{city} = \frac{e^{L_{city}}}{\sum_{j \in \{A, B, C\}} e^{L_j}}$$

where $L_{city}$ indicates the average electricity load of each city.
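The Softmax weighting of average loads can be sketched as follows. The function name `softmax_weights` and the load values are hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the weights unchanged; the paper does not say whether the loads are rescaled first.

```python
import math


def softmax_weights(avg_loads):
    """Map {city: average load} to {city: weight}, with weights summing to 1."""
    peak = max(avg_loads.values())
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp().
    exps = {city: math.exp(load - peak) for city, load in avg_loads.items()}
    total = sum(exps.values())
    return {city: e / total for city, e in exps.items()}


# Hypothetical average loads on a small, already rescaled range.
weights = softmax_weights({"A": 1.0, "B": 2.0, "C": 3.0})
print(weights)
```

Because Softmax is exponential in its inputs, loads expressed in large raw units would push nearly all the weight onto the city with the highest load, so some rescaling of the loads is likely needed in practice.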

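Finally, the unified electricity price of the province, Electricity_Price, is obtained using the decision-level fusion algorithm. The calculation formula is

$$\mathrm{Electricity\_Price} = \sum_{city \in \{A, B, C\}} w_{city} \cdot \mathrm{Predict}_{city}$$

where $w_{city}$ is the impact weight of each city and $\mathrm{Predict}_{city}$ indicates the forecast electricity price uploaded by each city.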
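The fusion step itself is a weighted sum of the city-level forecasts. The sketch below uses hypothetical prices and weights, and the function name `fuse_predictions` is illustrative.

```python
def fuse_predictions(predictions, weights):
    """Weighted sum of per-city price forecasts; both dicts are keyed by city."""
    if predictions.keys() != weights.keys():
        raise ValueError("predictions and weights must cover the same cities")
    return sum(weights[city] * predictions[city] for city in predictions)


forecasts = {"A": 30.0, "B": 40.0, "C": 50.0}  # hypothetical city forecasts
weights = {"A": 0.25, "B": 0.25, "C": 0.5}     # hypothetical weights summing to 1
print(fuse_predictions(forecasts, weights))  # 42.5
```

Since the weights sum to 1, the fused price always lies between the lowest and highest city forecast.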

4. Experiment and Result Analysis

This experiment uses the Australian electricity load and electricity price dataset. Among them, the data from 2006 to 2009 is the training set, and the data from 2010 is the test set. The dataset contains a total of 8 attributes, namely, date, hour, humidity, wet bulb temperature, dry bulb temperature, dew point temperature, electricity price, and electricity load.

Limited by the confidentiality of electricity price data, the data for this experiment are uniformly sampled from the original dataset and divided into 3 equal datasets, representing the electricity price data of three cities, A, B, and C, in the same province.

4.1. Decision-Level Fusion Experiment and Comparison

In cities A, B, and C, the local electricity price dataset is used to train the LSTM model, and the trained LSTM model is then used to predict the future electricity price of the city. Finally, cities A, B, and C upload the future electricity price information to the cloud center.

To verify the feasibility of decision-level fusion, we upload the undivided raw dataset to the cloud center. We then train the LSTM model on its training set in the same way as for cities A, B, and C and use the trained LSTM model to predict the province’s future overall electricity price, denoted entirety.

Decision-level fusion is performed on the electricity price forecast results uploaded by cities A, B, and C, and the fusion result obtained is denoted fusion.

To compare fusion with entirety while taking the periodicity of electricity price fluctuations into account, the experimental results are divided by quarter into four groups, Season 1, Season 2, Season 3, and Season 4, as shown in Figures 5–8.
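A quarter-by-quarter comparison of this kind can be sketched as follows. The error measure is assumed here to be the mean absolute difference between the fusion and entirety series, which the paper does not state explicitly; the function name and the sample values are hypothetical.

```python
from datetime import date


def quarterly_mean_abs_error(dates, fusion, entirety):
    """Return {quarter: mean |fusion - entirety|} for the quarters present in dates."""
    sums, counts = {}, {}
    for d, f, e in zip(dates, fusion, entirety):
        quarter = (d.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
        sums[quarter] = sums.get(quarter, 0.0) + abs(f - e)
        counts[quarter] = counts.get(quarter, 0) + 1
    return {q: sums[q] / counts[q] for q in sorted(sums)}


days = [date(2010, 1, 11), date(2010, 2, 17), date(2010, 10, 5)]
fusion = [40.0, 30.0, 25.0]    # hypothetical fused forecasts
entirety = [30.0, 28.0, 24.0]  # hypothetical whole-dataset forecasts
print(quarterly_mean_abs_error(days, fusion, entirety))  # {1: 6.0, 4: 1.0}
```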

Season 1 compares the decision-level fusion result (fusion) with entirety from January to March, as shown in Figure 5. The fusion result in March is the closest to entirety. Due to the influence of subjective factors such as human control, the electricity prices on January 11, January 20, and February 17 fluctuate extremely, causing errors that make Season 1’s fusion error (me) much higher than that of other quarters. However, in other periods of the quarter, the decision-level fusion results approach entirety well. They also perform well on some days when individual electricity prices fluctuate sharply, such as January 23 and February 4.

Season 2 compares the decision-level fusion result (fusion) with entirety from April to June, as shown in Figure 6. As can be seen from the figure, fusion always maintains the same trend of change as entirety. Although there is some error when individual electricity prices fluctuate sharply, in other periods of Season 2, the fusion result maintains a higher accuracy rate.

Season 3 compares the decision-level fusion result (fusion) with entirety from July to September, as shown in Figure 7. Although electricity price fluctuations have some impact on the accuracy of decision-level fusion, fusion still maintains the same trend as entirety, and more accurate fusion results are obtained at most of the moments when electricity price fluctuations are flat.

Season 4 compares the decision-level fusion result (fusion) with entirety from October to December, as shown in Figure 8. Because electricity prices are relatively stable from October to December, the accuracy of the decision-level fusion results in Season 4 is significantly higher than that of other quarters. In Season 4, the fusion result always maintains a high accuracy rate, and the average error is as low as 2.67.

Combining Figures 5–8, it can be seen that when electricity price fluctuations are flat, the decision-level fusion result (fusion) maintains the same trend as entirety and a high accuracy rate. In extreme cases where electricity prices fluctuate sharply, the error of decision-level fusion is also within an acceptable range.

Through the above analysis, it can be concluded that while the decision-level fusion algorithm has a high accuracy rate, it also ensures that the error in the event of severe electricity price fluctuations is within a reasonable range. While reducing the communication burden, this solution also ensures that the error is controllable. Therefore, the decision-level fusion algorithm has high feasibility.

5. Conclusions

This paper presents an LSTM-based decision-level fusion of multisource data for real-time electricity price prediction. Experiments are conducted on an Australian electricity dataset, which is split into multiple subsets; the electricity consumption of each subset determines its fusion weight, and the weighted fusion yields the final real-time electricity price prediction. The experimental simulation results show that the machine learning model based on the decision-level fusion of the LSTM algorithm significantly outperforms other machine learning models using the same dataset, such as linear regression and XGBoost, in terms of prediction performance. The decision-level fusion machine learning model based on the LSTM algorithm for multisource data provides reasonable accuracy for predicting electricity market prices in multiple regions. In the future, we will improve the decision-level fusion model to maintain or improve the prediction accuracy while speeding up real-time prediction, and we will use multiple models to integrate the forecast to avoid the drawbacks of a single model.

Data Availability

We have not used specific data from other sources for the simulations of the results. The datasets in this paper can be downloaded from the following website: https://download.csdn.net/download/loading_123/11074748

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the funding of the National Key R&D Program of China (2020YFB0905900).