Abstract

In smart grid systems, electric utilities require real-time access to customer electricity data; however, these data might reveal users’ private information, presenting opportunities for edge computing to encrypt the information while also posing new challenges. In this paper, we propose an Edge-assisted Lightweight Power Data Aggregation Encryption (E-LPDAE) scheme for secure communication in a smart grid. First, in the edge privacy aggregation model, the data of smart meters are rationally divided and stored in a distributed manner using simulated annealing region division, and the edge servers of trusted organizations perform key one-time settings. The model encrypts the data using Paillier homomorphic encryption. It then runs a virtual name-based verification algorithm to achieve identity anonymization and verifiability of the encrypted data. The experimental results indicate that the E-LPDAE scheme reduces overall system power consumption and has significantly lower computation and communication overhead than existing aggregation schemes.

1. Introduction

In recent years, with the rapid development of modern science and technology and urbanization, the combination of power systems and information technology has produced a new concept—Smart Grid [1]. Smart Grid is the intelligence of the power grid. Building a smart grid can optimize resource allocation, reduce consumption, and increase efficiency. In smart grid applications, smart meters are deployed in all households in a residential area, each smart meter can collect the user’s electricity consumption data and report it to the control center periodically (for example, every 15 minutes), and the control center can perform actions based on the reported data and real-time data analysis and take corresponding measures to ensure the health of the power system. Therefore, in the process of data transmission, a large number of real-time electricity consumption data of users is interacted with and calculated on the transmission line [2].

By using container technology, edge computing [3] is able to collect heterogeneous data in real time across a wide range of devices and can provide elastic computing resources for deep learning models. The resource configuration of edge computing can satisfy offline processing and analysis of small-area data, thereby ensuring the safe transmission and processing of various data. In addition, edge computing can reduce network latency and improve the utilization of network transmission bandwidth with the help of high-speed communication technology. In the implementation process of smart grid, the introduction of edge computing has a good development prospect, as shown in Figure 1.

Interaction and calculation of real-time electricity consumption provide a great convenience for power companies to fully grasp the electricity consumption of their customers but, at the same time, pose serious security and privacy risks. As pointed out by the National Institute of Standards and Technology (NIST) in the United States, there are more and richer data in smart grid systems. While bringing convenience to services, data leakage will also bring many security threats. Once the real-time electricity consumption information is stolen by the attacker, through the analysis of the data, the user’s detailed family life habits and other information can be obtained. Therefore, how to protect user privacy and data security in smart grids has become a research hotspot in recent years [4].

In order to overcome the above challenges, we propose an edge-assisted lightweight power data aggregation and encryption scheme. The main contributions of this paper are summarized as follows:(i)An edge privacy aggregation model is proposed. The model uses simulated annealing (SA) to propose a segmentation algorithm for smart meters, Simulated Annealing Region Division (SARD). The algorithm can generate optimal area division according to the energy consumption of electricity meters, which is convenient for data collection and analysis of cluster electricity meters. The realization of distributed data storage is conducive to the privacy protection of smart meter data.(ii)The Trusted Organization (TO) can set all keys in the system at one time, improve the efficiency of the smart grid, and reduce the power consumption of the system. Since a trusted organization stores a large amount of sensitive information such as keys, if it is stolen by an attacker, it will seriously threaten the data privacy and security of users. Such issues can be resolved by using edge servers, which are relatively trustworthy.(iii)A virtual name-based authentication algorithm is proposed. The algorithm uses an encryption mechanism combining chameleon signature and Paillier cryptography to encrypt and verify the data to ensure the security of transmitted data while reducing the communication overhead; a selection strategy is developed using an attribute decision tree to improve the value of the data. Finally, the aggregated encrypted data is sent to the Cloud Power Distribution Center (CPDC). The CPDC decrypts the data in order to obtain the final result.

The rest of this paper is organized as follows. Section 2 summarizes the related work. In Section 3, we do some preparatory work. In Section 4, we describe the procedure and algorithm of the scheme. The results of the experimental analysis are reported in Section 5. Finally, the conclusions are discussed in Section 6.

Although many secure communication schemes to protect the privacy of smart grid users have been introduced over the years, not many privacy-preserving aggregation schemes such as [58] have been proposed so far. Electricity consumption data collection is an important process in smart grid communication systems. However, a report from the Netherlands argues that frequent reading of smart meters is problematic from a legal point of view [9], violates the European Convention on Human Rights, and generates many load issues. Fortunately, integrating edge computing into smart grids and designing data aggregation schemes that protect privacy can avoid these problems. First, Pacific Northwest National Laboratory first proposed “edge computing” in an internal report in 2013. With the rapid growth of the Internet of Things, edge computing has received a lot of attention. Shi et al. summarize typical examples of the smart home and collaborative edge and present some of the challenges and opportunities in the area of edge computing [10]. It moves some of the workloads used in the cloud to the edge nodes. The security of sensitive data stored on cloud servers through edge nodes will be of great concern to users. Therefore, consideration should be given to the resource requirements of edge devices, as well as the privacy of smart grid users.

To address these issues, we use data aggregation technology to solve the transmission conflict problem of a large number of data packets for smart grids in edge computing. To improve the security of the data aggregation model, traditional secure data aggregation schemes use hop-by-hop aggregation encryption [11]. However, frequent encryption and decryption operations may affect the aggregation efficiency and increase the corresponding additional energy consumption and the delay of the data aggregation process. An efficient privacy-preserving aggregation scheme (EPPA) for smart grid communication [7] was proposed by Lu et al. They used a super-incremental sequence to construct multidimensional data and encrypted the data with Paillier homomorphic encryption [12]; however, the scheme has security flaws. Shi et al. used an untrusted aggregator to differentially aggregate multiple time slots, which is more costly based on computationally intensive systems [13]. Fan et al. [14] proposed a secure power usage data aggregation scheme for smart grids, but it critically requires a third-party trust mechanism for distribution, adding an additional burden. Li et al. proposed a distributed incremental data aggregation approach where they used homomorphic encryption to solve the repetitive regular data aggregation task [5]. Garcia and Jacobs used homomorphic encryption to ensure the privacy of users and gave a measurement method [6]. Lu et al. proposed a lightweight privacy-preserving data aggregation scheme called Lightweight Privacy-Preserving Data Aggregation (LPDA), but it cannot achieve identity anonymization [15]. Hua et al. proposed an effective smart grid aggregation scheme against malicious data mining attacks but increased the computational overhead and communication overhead [16].

3. Preparation

This section reviews the main basic concepts related to our work, including Paillier homomorphic encryption, simulated annealing region partition [17], chameleon hash function [18], and Attribute decision tree.

3.1. Paillier Homomorphic Encryption

Paillier cryptography is an additive homomorphic public key cryptography, which has been widely used in the field of encrypted signal processing or third-party data processing. Its homomorphic property is that the corresponding arithmetic operation can be performed on the ciphertext directly after encryption, and the result of the operation is the same as that of the corresponding operation in the plaintext domain. Its probabilistic property is that for the same plaintext, different ciphertexts can be obtained by different encryption processes, thus ensuring the semantic security of the ciphertext. The mechanisms used for encryption and decryption are as follows:(1)Key generation: randomly select two large prime numbers and , calculate their product and the least common multiple of and , and then randomly select an integer that satisfies the following conditions:Among them, function and function are used to calculate the greatest common divisor of two numbers. is the set of integers less than , while is the set of integers coprime with in . and are public key and private key, respectively.(2)Encryption process: a random integer is selected. For any plaintext , the corresponding ciphertext is obtained by using public key encryption:According to the properties of the Paillier encryption system, when ciphertext is encrypted with the same public key, because the selection of ciphertext is random, different ciphertext can be obtained for the same plaintext , but the same plaintext can be restored after decryption, thus ensuring the semantic security of ciphertext.(3)Decryption process: decrypt ciphertext with private key to get the corresponding plaintext .

3.2. Simulated Annealing Region Division
3.2.1. Regional Division

For a given smart meter, the division of area is expressed as follows:

is the number of regions, is the number of links between smart meter nodes in the smart grid, is the number of regions in region , is the number of links between smart meter nodes in the smart grid, is the number of links between smart meter nodes in region , and is the sum of degrees of smart meter nodes in region . First, we use equation (4) to randomly place smart meters on the device layer into the area. Finally, we use a simulated annealing algorithm to find the optimal partition.

3.2.2. Simulated Annealing Algorithm

It is a general probabilistic algorithm that is used in our scheme to find the optimal solution to the zoning problem, where one can find low-cost smart meter regions, but not local minima for high-cost smart meter regions. We introduce the energy consumption of smart meters to achieve this. Starting from high , it gradually decreases and the system gradually approaches the minimum cost, avoiding the high-cost local minima.

The purpose of identifying modules is to maximize the use of modules, where costs and are the areas defined in equation (4). We update each energy consumption randomly, and the probability is expressed aswhere is the cost after the update and is the cost before the update, .

3.3. Chameleon Hash Function

Traditional cryptographic hash functions are difficult to find collisions. But the chameleon hash function can artificially set up a “back door”: if you master it, you can easily find collisions. This breaks the collision resistance of the hash function, but for most people, these properties remain, and the hash is still secure. Accenture applied the characteristics of the chameleon hash function and applied for a patent on an editable blockchain. Although the decentralization and irrevocability of the blockchain are damaged to a certain extent, on the other hand, it also expands the application scenario of the blockchain and meets part of the needs of the government’s regulatory requirements [19].

Principle description: suppose there exist two prime numbers , and is large enough. The private key of the chameleon hash function is , is the group of order , and is its generating element. The public key is . Given an arbitrary message with random value , now tampering the content to , it is now desired to find a random number such that . By the exponential property , . The solution procedure for is as follows:

Therefore, , , , and are known, .

3.4. Attribute Decision Tree

The attribute decision tree is modeled after the access control tree and is set up according to the needs of the data collector. The leaf nodes of the attribute decision tree represent various attributes, and the intermediate nodes and roots are replaced by AND and OR. When an attribute of the data satisfies the requirements of the attribute decision tree, it is passed and the next calculation is performed; if not, other calculations or steps are performed.

For example, Mr. Li is a professor in the school of computer science of a university, so his attribute set matches the attribute strategy, as shown in Figure 2. Miss Wang is a professor in the school of information security of a university. Her attribute set does not match the attribute policy, as shown in Figure 3.

4. Edge-Assisted Lightweight Power Data Aggregation Encryption Scheme

4.1. Edge Privacy Aggregation Model

The edge privacy aggregation model contains four subjects: the User’s Smart Meter (USM), the Marginal Power Services Institutions (MPSI), the Cloud Power Distribution Center, and the trusted organization. First, the USM encrypts data and divides it into optimal regions according to the change of energy consumption at different moments using a simulated annealing region partitioning algorithm, and as the energy consumption of USM changes at different moments, the number and location of clustered meters also change, thus realizing distributed data storage, which is conducive to the privacy protection of user data. Secondly, MPSI aggregates data with user identity anonymized and without affecting the privacy of any party. Finally, CPDC performs secure decryption, and TO performs key generation and distributes the key to the system. The model is shown in Figure 4.User’s Smart Meter. Smart meters use TPM chips to securely store and encrypt data. The SARD algorithm is executed using the handheld unit (including the sensor). Divide the smart meters of all users to meet the power load balance of the meters. The cluster meter regularly sends the collected data to the edge server. Perform data encryption calculation and chameleon signature calculation.Marginal Power Services Institutions(MPSI). It consists of edge servers. The edge server performs chameleon signature aggregation and verification calculations and data aggregation calculations.Cloud Power Distribution Center. The cloud server receives the aggregated data and decrypts it.Trusted Organizations. The real identities of all users are virtualized to form virtual names and distribute system parameters and all private keys, and the distribution channels are all secure channels. The three parties of cloud, edge, and smart meter collaborate with trusted organizations to generate all private keys, as shown in Figure 5. Compared with existing solutions, our private keys require only a one-time setup between the three parties, which is beneficial for resource-limited systems. In addition, the private keys owned by TO are involved in decrypting the ciphertext and verifying the ciphertext, confusing the attacker, and making it impossible to tamper with the ciphertext.

4.2. Scheme Construction

The scheme proposed in this paper realizes the security and integrity of real-time power consumption data transmission between the smart meter and power server. The steps are as follows.

4.2.1. Initialization

TO inputs safety parameter and gets related parameter , where is a large prime, are two additive cyclic groups, is a multiplicative cyclic group, is the order of the cyclic group, and are the generators of groups and , respectively, satisfying that and is an isomorphic mapping, is bilinear mapping, and the storage list is established. TO chooses a system master key , is a multiplication cycle group, and is the system public key. Two hash functions and .

TO publishes system parameters and functions, selects a security parameter for the Paillier encryption algorithm, and sends it to the smart meter for initialization of the Paillier encryption algorithm. TO generates other parameters of the Paillier encryption algorithm: select two large prime numbers and , where . The smart table computes and chooses as the generator to use as the public key of the Paillier encryption algorithm. CPDC computes the private key of the Paillier encryption algorithm .

For the initialization of the chameleon signature, TO selects an element of order in and an arbitrary index , then the private key of the chameleon signature is , and the public key is .

TO sets the regularized attribute set as a multiplicative cyclic group; then, any attribute in the attribute set is any element in the multiplicative cyclic group. The attribute set is sent to the smart meter. Similarly, if TO sets the attribute set of the attribute decision tree as a multiplicative cyclic group, then any attribute in the attribute set is any element in the multiplicative cyclic group, and the set attribute set is sent to CPDC.

4.2.2. User Registration

Assuming a secure channel between TO and the user, in order to complete the user registration, the operation steps between the user and TO are as follows:User sends ID, serial number of smart meter to TO.TO sends a Cert to user after confirmation.User uses the Cert to get permission to request the parameters and key of the algorithm from TO.TO sends the signature key etc. to the smart meter of user .TO calculation:TO calculates the signature key of user :TO sends the signature key , the real-time virtual name to the smart meter of user .

4.2.3. Data Processing

Within data acquisition time , the smart meter of user encrypts the data with the Paillier homomorphic encryption and signs the encrypted data with the chameleon hash function which is referred to as chameleon signature for short. The cluster meter collects data within the divided area. Finally, the real-time encrypted data and signatures are sent to MPSI. The steps are as follows.The smart meter of user selects a random number and encrypts data .The smart meter of user uses signature key , virtual name, and attribute set to sign encrypted data by the chameleon hash function and finally send it to the cluster meter .Cluster meter sends to MPSI.

MPSI receives the information and runs the virtual name-based verification algorithm as shown in Algorithm 1.

Input:
Output:
(1)fordo
(2);
(3);
(4);
(5) match ;
(6);
(7);
(8);
(9)end for
(10)MPSI sends to CPDC;

The algorithm first aggregates chameleon signatures. After verification, the attribute set of the data is obtained, and the attribute set of the data decision tree is matched in turn, and the data satisfying the data decision tree can be data aggregated with other data satisfying that decision tree for the data aggregation operation.

After aggregation, MPSI sends the aggregated data to the CPDC through the secure channel. Data decryption: CPDC decrypts the encrypted aggregate data.

, CPDC stores data for power grid operation and puts forward decisions.

4.2.4. Track

While making power consumption analysis and decision-making, CPDC may find that some power consumption values do not meet its predetermined range or abnormal conditions. At this time, CPDC will start the tracking process, and the steps are as follows:CPDC sends the command to the edge server that submits the relevant abnormal power consumption: let MPSI send the stored power consumption and virtual name at that time to CPDC.CPDC first decrypts each encrypted data received, detects and finds the abnormal power consumption, and locks its .CPDC sends the virtual name of the abnormal power consumption determined by it to TO and applies for identity tracking.TO can query the real identity of the users who send out abnormal electricity consumption. TO sends the real identity to CPDC, and CPDC processes the user and his power consumption accordingly.

4.3. Safety Analysis
4.3.1. User Identity Privacy Protection

Before sending data to the CPDC, the USM registers with the TO to obtain a virtual name and signing key. The USM uses the virtual name as the identity of the data transfer in the architecture and performs encryption, signing, and other actions based on it. The USM has a tamper-proof storage device. This storage device can be thought of as a “black box” that can read and write data, but only by the USM; no other device can read or write information. According to the one-way and collision-free characteristics of the hash function, even if the attacker obtains the virtual name, it cannot crack the real identity. This scheme can effectively protect user identity and prevent illegal intrusion.

4.3.2. Security Analysis of Chameleon Signature

The chameleon signature is a preferable designated verifier signature. Compared to other signatures, the chameleon signatures are more suitable for lightweight aggregated encryption schemes due to their ability to transmit data efficiently and reduce computational overhead. Chameleon signatures are also nontransmissible, nonforgeable, and nonrepudiation, which also ensure data security and meet the security requirements of the system.

4.3.3. User Fine-Grained Data Privacy Protection

USM encrypts the electricity consumption data using the Paillier encryption algorithm, sends it to MPSI, which does not have the ciphertext decryption key, and sends the ciphertext to CPDC after successful verification. CPDC mainly receives aggregated numbers of electricity consumption data, so it protects the user’s fine-grained data privacy, while CPDC can get the complete electricity consumption data.

5. Experimental Analysis

5.1. Simulated Annealing Region Division

Intraregional connectivity and participation: each region is divided into relatively balanced regions from one or several fully centralized regions based on the energy consumption of smart meters to achieve a balanced electrical load in each region. We define the intraregional connectivity, in order to measure whether the smart meter is well connected to other smart meters in the region.where is the number of links from the smart meter to other smart meters in zone , is the average number of links from all smart meters in the zone , and is the standard deviation of all links in the zone.

Of course, we also need to consider unexpected situations. For example, a smart meter may not be connected to its own area. Therefore, we define the participation degree of a smart meter .where is the number of links from the smart meter to smart meters in zone , and is the total number of degrees of the smart meter . According to equation (14), if the connections of the smart meter are evenly distributed in all areas, then the participation degree of the smart meter is close to 1. If all its connections are in its own area, the participation degree is 0.

We use a MATLAB environment with a Dell laptop (i5-6200u, CPU 2.40 GHz, Windows 10 OS) for simulation experiments. Assuming that 100 smart meters are randomly distributed in a 1.0 ∗ 1.0 km smart grid, and each smart meter has a random electricity consumption , a zoning model is established. First, the 100 randomly distributed smart meters are generated as a subset of the neighborhood of electricity consumption Download the open-source dataset from the website Open Energy Data Initiative (OEDI) and randomly select the electricity consumption information from 100 apartments with no missing points and a time granularity of 15 minutes. The average value is calculated based on the electricity load of 100 users at different times of the day, as shown in Figure 6. 14:00–20:00, the user’s electricity load continues to grow, with 20:00 reaching the highest peak of the day; 20:00–24:00, the user’s electricity load continues to fall to a stable value. After reasonable analysis, we divide the average value of the electricity load of 100 users in different time periods of a day into 6 electricity consumption states. A power consumption state of is randomly selected for the regional division scheme, and the next power consumption state of is randomly selected as the candidate scheme for the next regional division scheme. Calculate ; if , accept for the next region division scheme; otherwise, we judge the random update probability , ; if true, accept for the next region division scheme; namely, , . Then, we check whether the connectivity and participation in the region satisfy equations (13) and (14). Finally, we use for the region partition scheme and return the SARD algorithm.

Figures 7(a)7(f) show the experimental process of the SARD algorithm. We performed six rounds of state calculation, divided the six power consumption states into different regions, and terminated the algorithm. Cluster meters in each area are used to collect data and process the data accordingly to realize power load balancing under different power consumption states.

First, the power consumption of smart meters increases with the increase of users in the smart grid. Since all the data eventually needs to be sent to the cloud server of CPDC for processing, the power consumption of the cloud server also increases with the increase of data, as shown in Figure 8. Then, we introduce edge computing into the smart grid, and the power consumption of MPSI increases with the increase of edge servers. This layer processes a large amount of data and then sends it to the CPDC. Since the CPDC does not need to process a large amount of data, the power consumption of the cloud server in the CPDC does not fluctuate much, as shown in Figure 9. Comparing the experiments in the two figures, the introduction of edge servers to process large amounts of data in the edge privacy aggregation model of the smart grid effectively reduces the power consumption of the CPDC and the total system power consumption.

5.2. Total Computing Overhead

The computational overheads of this scheme and the LPDA scheme mainly involve the following three operations: bilinear pair operation, exponential operation, and Paillier homomorphic encryption and a decryption operation, and other operations are neglected. The bilinear pair operation and the exponential operation are and , respectively, and the encryption and decryption of the Paillier algorithm are and , respectively, and the other computational overheads are neglected. The AMDM scheme is mainly multiple operations, is the multiplication operation in the cyclic group , and are the multiplication operation in , is the exponential operation, and is the multiplication operation in the group because and produce little effect negligible.

The scheme uses the MATLAB environment of a Dell laptop (i5-6200u, CPU 2.40 GHz, Windows 10 OS) for simulation experiments. The simulation measures the amount of time needed by the Dell laptop to perform basic operations in the experimental environment. It takes 1.1 ms to calculate a single , 3.1 ms to calculate , 4.5 ms to calculate , and 2.1 ms to calculate . Since all three scenarios in this paper have only one pair of encryption and decryption operations, we first disregard and .

The scenarios in this paper consider the computational overhead of each of the three participants, Smart Meter, MPSI, and CPDC, and compare them with other scenarios, as shown in Table 1. The total computational overhead of all the solutions is the total computational overhead of the three participants. As can be seen from the table, this paper is significantly more efficient than the other two schemes.

As shown in Figure 10, the computing energy consumption of the scheme in this paper is significantly lower than the aggregated encryption schemes of the remaining two schemes, where the AMDM scheme resists malicious attacks and requires more computing energy and is significantly higher than the LPDA scheme and the scheme in this chapter, while the scheme in this chapter does not cause additional computation while ensuring data security due to the use of the chameleon signature, so the total computation overhead is lower, and it can be said that the scheme in this chapter is better than the LPDA scheme and the AMDM scheme.

5.3. Total Communication Overhead

The total communication overhead of this scheme mainly refers to all the communication data that needs to be transmitted in the system. The output data length of the hash function is 160 bits. Suppose the length of in Paillier encryption algorithm is 512 bits, the length of group element is 161 bits, the length of is 32 bits, the length of attribute set is 32 bits, and the length of is 32 bits. The total communication data volume of this scheme consists of two parts: the first part is from SM to MPSI, and the data transmitted is ; the second part is from MPSI to CPDC, and the data transmitted is . The total traffic of the LPDA scheme consists of two parts. The first part is from SM to ESP, which transmits 2048 bits through calculation, and the second part is from ESP to CC, which transmits 2048 bits through calculation. The total traffic of the AMDM scheme consists of two parts. The first part is SM to GW, which transmits 3264 bits through calculation, and the second part is GW to CC, which transmits 3264 bits through calculation. The comparison between this scheme and other schemes is shown in Table 2.The simulation experiment is carried out using MATLAB, and the results are shown in Figure 11.

We use the smart meter data of a year in London on the Open Energy Data Initiative (OEDI) website to simulate the total communication cost per day. As shown in Figure 12, different colors represent different communication situations; that is, when the number of edge servers and smart meters changes, the communication cost also changes. Based on the actual privacy requirements and cost requirements of the customer, we implement appropriate electricity usage data delivery mechanisms in the actual area.

6. Conclusion

In this paper, we consider the actual smart grid, introduce edge computing, and propose an edge-assisted lightweight electricity consumption data aggregation and encryption scheme, which solves the problem of sending electricity consumption data to the cloud by users securely and efficiently. The scheme uses a simulated annealing zone partitioning algorithm to reasonably partition smart meters according to their electricity consumption energy consumption to achieve load balancing of smart grid systems; at each sending of data, licensed users apply for virtual names from trusted organizations to enable them to communicate with the grid as anonymous, which effectively protects the privacy of user identity security; in encrypting data, CPDC uses a virtual name-based verification algorithm which is used to Paillier encryption technology combined with chameleon signature to ensure authentication, integrity, and nonrepudiation of data, so that CPDC can only obtain encrypted data aggregated by MPSI, protecting the privacy of users’ fine-grained data. Performance analysis shows that it is much better than the LPDA scheme and AMDM scheme in terms of communication overhead and computation overhead. In future work, we will evaluate our schemes in realistic smart grid scenarios with stronger adversarial models and study the impact of different signatures on system performance and security.

Data Availability

The research data are obtained from the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant nos. 61771289 and 61832012), Major Basic Research of Natural Science Foundation of Shandong Province (Grant no. ZR2019ZD10), and Key Research and Development Program of Shandong Province (Grant no. 2019GGX101050).