Abstract

Because the traditional grid produces a large amount of greenhouse gas and cannot adapt to new demands such as dynamic electricity prices, data analysis, and early warning, the smart grid, with its high efficiency and reliability, is increasingly valued. It plays a key role in achieving carbon neutrality. Nonetheless, the smart grid requires the collection of real-time power data, and personal privacy may be leaked through the frequent electricity measurement reports. Data aggregation schemes have emerged to support data analysis and prediction while preserving users’ personal privacy. However, existing schemes do not resolve all of these problems. Some do not consider smart meter failures, and most incur a high computation cost. In view of this, an efficient privacy-preserving data aggregation scheme with fault tolerance in the smart grid is put forward in this paper. Specifically, the proposed scheme is lightweight because it applies symmetric homomorphic encryption and elliptic curve cryptography. Even if some smart meters fail, the proposed scheme can still successfully obtain the aggregated data. Moreover, the proposed data aggregation scheme is proved to be secure, and all security requirements are satisfied. Performance evaluation shows that the proposed scheme has relatively low computation cost and communication overhead compared with other related schemes.

1. Introduction

In recent years, the negative effects of global warming have become increasingly significant, as can be observed from the rising sea levels and the destruction of biodiversity. All countries are looking for ways to achieve carbon neutrality [1]. The application of smart grid (SG) can effectively accelerate the realization of this goal, and SG is included in long-term development plans [2–6]. Compared with the traditional grid, SG has the advantages of conforming to low-carbon sustainable development, adopting a two-way communication mode, and allowing for diversified gradient electricity prices and early warning based on status analysis [7–9]. These features compensate for various shortcomings of the traditional power grid; therefore, SG is considered an excellent next-generation power system. Figure 1 illustrates the framework of SG, which consists of the markets, control center, service provider, energy generation, transmission, distribution, and customers [10].

For information communication in SG, a large number of sensors are employed, especially smart meters (SM), which need to collect real-time household power measurement data every 10–15 minutes and send them to the control center (CC) for electricity data analysis and dispatch [11]. Transmitting such a large amount of data is time-consuming; at the same time, real-time data transmission raises privacy concerns. According to the surveys [12, 13], individual real-time electricity consumption data expose sensitive information about users; for example, the lifestyle and living habits of family members might be exploited by a malicious adversary.

In order to address both of the above problems simultaneously, data aggregation technology has been proposed, where each SM uses homomorphic encryption to protect real-time power data and uploads the ciphertext to the gateway (GW); the ciphertexts are then aggregated by GW and sent to CC. Finally, CC can use its private key to decrypt the aggregated ciphertext, but it is unable to obtain the individual power measurement data of any single SM. In this way, the users’ personal privacy information can be protected, and in the meantime, CC can analyze power measurement data and allocate and adjust power supply in a timely and reasonable manner.

Although there have been many data aggregation schemes [14–37], many issues are still worthy of further improvement.

First, an SM exposed to an environment without any protection may malfunction and fail to send reports; the lack of some power data reports may make the system fail to recover the aggregated power data. Fault tolerance enables the recovery of the aggregated data despite SM malfunction. Some existing aggregation schemes [14, 16–20] do not support fault tolerance, so they cannot obtain the aggregated data when a meter fails, and the entire system is paralyzed. Although some other schemes [22, 24, 29] achieve fault tolerance, their trusted authority (TA) or CC needs to perform special operations, such as generating dummy ciphertexts; this is not very practical because the number of GW and SM under the control of the TA is enormous, which brings unbearable computation costs to the TA.

Second, since the GW and CC are semitrusted, they may launch collusion attacks to obtain the private data of a single SM. In the existing schemes [22, 28, 30, 37], the data reports are encrypted directly with the public key of CC. If GW sends the ciphertext of a certain SM to CC, or CC accidentally obtains a single ciphertext, the individual report can be decrypted by CC with its private key.

Third, identity privacy is also a kind of secret information that should be protected; meanwhile, when a malicious SM appears, its real identity should be revealed by the TA. Some existing schemes [18, 23, 25, 26] fail to consider identity privacy. Some other schemes [16, 17, 20, 27] achieve identity anonymity, but only by omitting identity information from the data reports, which makes it impossible to trace the identity of a malicious SM when it appears.

In order to settle the above problems and realize further optimization, an efficient data aggregation scheme that supports fault tolerance is proposed. The primary contributions are as follows:
(i) The proposed scheme applies lightweight symmetric homomorphic encryption and elliptic curve signatures to achieve efficiency, instead of commonly used time-consuming public key homomorphic encryption technologies such as Paillier [38] and BGN [39]. It also provides fault tolerance and can thus run normally even if some SM fail to upload data reports.
(ii) The security analysis formally proves that the proposed aggregation scheme is secure based on the -based decision problem and the elliptic curve discrete logarithm problem. Moreover, the proposed scheme meets the required security requirements, in particular resistance to collusion attacks.
(iii) Performance evaluation carries out a quantitative analysis, and the result shows that the proposed scheme involves less computation cost and communication overhead than other related data aggregation schemes.

The rest of this paper is organized as follows. Section 2 reviews related work on data aggregation schemes. The background and preliminaries are given in Sections 3 and 4, respectively. In Section 5, the proposed scheme is introduced in detail. Sections 6 and 7 present the security analysis and the performance evaluation, respectively. Finally, the conclusion is given in Section 8.

2. Related Work

Through long-term research on data aggregation technology, many problems have been considered in order to satisfy the security requirements, for instance, collusion attacks, fault tolerance, and identity privacy protection.

The related aggregation schemes are vulnerable to various attacks, especially collusion attacks. Compared with external adversaries, internal attackers are more likely to damage the SG system because they hold more private information. Fan et al. [14] first considered collusion attacks and resisted them by virtue of blinding factors assigned by a trusted third party. However, Bao and Lu [15] pointed out an integrity flaw in the scheme [14]: the private key of the user can easily be recovered, which allows data pollution. He et al. [16] created a certificateless data aggregation scheme based on elliptic curve cryptography, which speeds up the process and withstands collusion attacks. He et al. [17] improved the BGN scheme to realize a data aggregation scheme that resists collusion attacks. Zhang et al. [18] considered false data injection attacks and prevented them with blinding factors. Li et al. [19] applied BGN encryption and blinding factors to build a data aggregation scheme that can prevent collusion attacks. Shen et al. [20] put forward an aggregation scheme that can counteract new malicious data mining attacks and internal attacks with the BLS short signature. Based on elliptic curve cryptography, a scalable data aggregation scheme was designed by Chen et al. [21], where the encryption key was generated independently instead of using the public key of CC, so that even CC cannot decrypt a single ciphertext.

Once an SM malfunctions and fails to submit its electricity reports, most of the above schemes [14, 16–20] would be paralyzed. Therefore, fault tolerance needs to be taken into consideration. In the scheme [22] of Chen et al., a trusted third party additionally generated a dummy ciphertext for each damaged SM to ensure the smooth running of the protocol, as the trusted third party held the private keys of all users. Bao and Lu [23] proposed a differentially private aggregation scheme with fault tolerance, where CC was still able to obtain the aggregated data from the remaining reports. Pan et al. [24] used Lagrangian interpolation to propose a two-dimensional and fault-tolerant privacy-preserving aggregation scheme. Ge et al. [25] put forward a fine-grained data analysis scheme that could still run even if a meter failed and could obtain a variety of statistics. Xue et al. [26] proposed a privacy-preserving service outsourcing scheme, which supported a fault tolerance mechanism and flexible electricity prices. Guan et al. [27] utilized the Shamir sharing method and RSA signatures to implement a fault-tolerant aggregation protocol, yet this protocol cannot decrypt correctly. In [28], Ding et al. proposed an identity-based secure data aggregation scheme, which supported fault tolerance due to its particular ciphertext structure. Wang et al. [29] improved the Paillier encryption to obtain a fault-tolerant data aggregation scheme through collaboration between users; unfortunately, integrity was not considered. In general, when facing SM malfunction, TA performs additional operations to achieve fault tolerance, but it becomes overloaded because it manages many GW and many SM under each GW.

In addition, user identity privacy is an important security problem. Liu et al. [30] utilized the blind signature technology to realize an anonymous data aggregation scheme, where the token was unlinkable to any valid signature. Combined with the ring signature technology, Badra and Zeadally [31] blocked the connection between the content of the report and the identity of the SM. Tan et al. [32] designed a privacy-preserving pseudonym-based collection scheme, where the SM adopted the group key to generate a pseudoidentity so that the adversary was unable to get its real identity. Gong et al. [33] satisfied anonymity by separating data reports and identity of the SM.

Although the above schemes achieve different functions and features, most of them are time-consuming, which burdens SM with limited computational resources. He et al. [34] applied batch verification to accelerate the execution of the aggregation scheme. Combining elliptic curve cryptography and the super-increasing sequence technique, Ming et al. [35] came up with an efficient privacy-preserving multidimensional data aggregation scheme, which can classify power measurement data and achieve fine-grained analysis. In the scheme [36] of Shen et al., the XOR operation with a pseudorandom function was employed to encrypt power data and realize confusion so that the adversary could not identify the source of the reports. Zhang et al. [37] adopted online/offline signatures to create a lightweight aggregation scheme, which helps to speed up the signature verification process.

3. Background

This section describes the background of the proposed scheme, including the system model, the security requirements, and the design goal.

3.1. System Model

In the proposed scheme, the system model is divided into four entities: trusted authority (TA), control center (CC), gateway (GW), and smart meters (SM), as shown in Figure 2. For ease of description, considering only one GW, we link smart meters in the model.
(1) TA: it is a fully trusted entity that produces the blinding factors and the secret value for SM. TA can recover the exact identity of an SM with malicious behavior.
(2) CC: it is a semitrusted entity that generates the system parameters. CC is also in charge of the registration of SM and GW. In addition, after receiving aggregated encrypted data from GW, CC decrypts and analyzes them.
(3) GW: it is a semitrusted entity. GW collects and aggregates the encrypted electricity reports from each domain header; then, GW transmits the result to CC.
(4) SM: according to the geographical proximity principle, all SM are divided into domains , and each domain contains members, that is, . In the th domain , a random member is selected as the domain header. Without loss of generality, assume that the first member is appointed as the domain header by GW; the domain header itself is also a member in . Here, the domain member is in charge of collecting electricity measurement data of each user’s household and sending it to . The domain header is responsible for preaggregating the data reports in the domain and then uploading the result to GW. Besides, is not allowed to send electricity reports directly to GW. It is worth noting that no SM can collude with GW or CC.

3.2. Security Requirements

The security requirements that the proposed scheme should satisfy are as follows:
(1) Confidentiality: the electricity data are closely related to users’ privacy information. Therefore, an adversary who obtains the transmitted ciphertext should learn nothing useful from it.
(2) Authentication: any two entities that exchange a report must verify each other to ensure that both identities are legitimate.
(3) Integrity: a report transmitted over the open channel may be tampered with so that a wrong message is conveyed. The proposed scheme should therefore detect whether a report has been altered.
(4) Anonymity and traceability: no entity other than TA can determine or distinguish an identity by analyzing transmitted reports. Conversely, when a malicious SM uploads fake data, its true identity should be revealed by TA so that its behavior can be supervised.
(5) Resistance against common attacks: the proposed scheme should resist common types of attacks, including but not limited to collusion attack, modification attack, and replay attack.

3.3. Design Goal

The proposed scheme pursues the following objectives:
(1) Privacy preservation: nobody should be able to obtain the actual data or the identity of a single SM. CC is only allowed to decrypt the aggregated data ciphertext. In addition, the abovementioned attacks should be resisted.
(2) Fault tolerance: it is unacceptable for the aggregated data to become unrecoverable when a few SM are damaged. Therefore, even if some SM fail to submit reports, the system should continue to run normally.
(3) High efficiency: on the premise of fulfilling the above security requirements, the proposed scheme tries to reduce the computation cost and communication overhead. For a practical smart grid, an efficient scheme is more suitable for SM with limited resources.

4. Preliminaries

Two preliminaries are briefly reviewed in this section: elliptic curve cryptography and symmetric homomorphic encryption.

4.1. Elliptic Curve Cryptography

The definition of elliptic curve cryptography (ECC) comes from Miller [40]. Let be an additive cyclic group of prime order ; the generator is . The security problem and assumption are described as follows.

Elliptic curve discrete logarithm problem (ECDL problem) [41]: given two elements in the elliptic curve as input, output an integer where .

Elliptic curve discrete logarithm assumption (ECDL assumption) [41]: it is difficult for the probabilistic polynomial time algorithm to solve the ECDL problem with a nonnegligible advantage.
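The symbols in the two statements above did not survive in this text. A conventional way to write them is given below as a reading aid only. The names $\mathbb{G}$ (the group), $P$ (its generator), $q$ (its prime order), $\lambda$ (the security parameter), and $\mathcal{A}$ (the algorithm) are assumptions chosen for illustration and may differ from the notation of the original.

```latex
% Assumed notation: \mathbb{G} additive cyclic group, P generator, q prime order.
\text{ECDL problem: given } P,\; Q = xP \in \mathbb{G}, \text{ output } x \in \mathbb{Z}_q^{*}.

\text{ECDL assumption: for every probabilistic polynomial-time } \mathcal{A},\quad
\Pr\bigl[\mathcal{A}(P, xP) = x\bigr] \le \mathsf{negl}(\lambda).
```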

4.2. Symmetric Homomorphic Encryption

Mahdikhani et al. [42] designed a new symmetric homomorphic encryption; the algorithm is described as follows.

: on inputting the security parameter satisfying , the probabilistic key generation algorithm outputs the symmetric homomorphic encryption key , where the two large prime numbers satisfy and is randomly selected from . Next, the algorithm calculates and publishes the system parameters .

: on inputting the symmetric homomorphic encryption key and the plaintext , where the message space is , the encryption algorithm selects two random numbers and and encrypts the plaintext:

: on inputting the ciphertext and the symmetric homomorphic encryption key , the decryption algorithm decrypts the ciphertext:

The security of symmetric homomorphic encryption [42] is based on the following security assumption.

-based decision problem [43]: given , the -based decision problem is to determine whether an integer belongs to or without , where

-based decision assumption [43]: it is difficult for the probabilistic polynomial time algorithm to solve the -based decision problem with a nonnegligible advantage in and .
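The encryption and decryption formulas of this subsection are not recoverable from the text, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited form of this construction, namely a ciphertext $(rL + m)(1 + r'p) \bmod N$ and a decryption $(c \bmod p) \bmod L$, and it uses toy parameters (two Mersenne primes and short random values) that are far too small for real use. All function and constant names are hypothetical.

```python
import secrets

# Toy parameters (assumption, illustration only); both primes are Mersenne primes.
P = 2**127 - 1            # secret prime p
Q = 2**107 - 1            # secret prime q
N = P * Q                 # public modulus
K2, K0 = 32, 64           # assumed bit lengths of L (and r) and of r'

def keygen():
    """Return the secret key (p, q, L) with a random K2-bit L."""
    L = secrets.randbits(K2) | (1 << (K2 - 1))   # force the top bit so L is K2 bits long
    return P, Q, L

def encrypt(sk, m):
    """Assumed form: c = (r*L + m) * (1 + r'*p) mod N, for a small non-negative m."""
    p, _, L = sk
    r = secrets.randbits(K2)
    r_prime = secrets.randbits(K0)
    return ((r * L + m) * (1 + r_prime * p)) % N

def decrypt(sk, c):
    """Assumed form: m = (c mod p) mod L, mapped back to a signed value."""
    p, _, L = sk
    m = (c % p) % L
    return m if m < L // 2 else m - L

def aggregate(ciphertexts):
    """Additive homomorphism: the sum of ciphertexts decrypts to the sum of plaintexts."""
    total = 0
    for c in ciphertexts:
        total = (total + c) % N
    return total
```

With these toy sizes the sum of the masked plaintexts stays far below p, which is the condition the decryption relies on; for example, `decrypt(sk, aggregate([encrypt(sk, m) for m in readings]))` returns `sum(readings)` for small non-negative readings.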

5. The Proposed Scheme

In this section, a detailed privacy-preserving data aggregation scheme that supports fault tolerance in the smart grid is proposed, consisting of six phases: initialization, registration, report generation, report aggregation, report reading, and fault tolerance. All the notations used in this paper are described in Table 1. The general picture of the proposed scheme is depicted in Figure 3.

5.1. Initialization

In this phase, CC produces the system parameters, and TA generates the blinding factors and the secret value for the smart meters.

5.1.1. Control Center

(1)Given the security parameter , CC produces an additive cyclic group of the prime order satisfying ; is based on a nonsingular elliptic curve which is defined over a finite field , satisfying . CC chooses the generator of .(2)CC randomly chooses two large prime numbers satisfying and computes the public parameter . CC chooses arbitrary and computes and for , where and are two random numbers. CC secretly transmits the key to TA.(3)CC chooses five secure hash functions , , , where is the length of the real identity .(4)CC publishes the system parameters .

5.1.2. Trusted Authority

(1)TA selects the random number as the master secret key and calculates the corresponding public key .(2)TA selects arbitrary blinding factors satisfying .(3)TA chooses a random secret value .(4)TA secretly transmits to .
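The constraint on the blinding factors in step (2) is missing from this text. A common choice in aggregation schemes, and the one assumed in this sketch, is that the factors sum to zero modulo a public modulus, so that each individual value is masked while the masks cancel in the aggregate. The function names and the modulus below are hypothetical and are not taken from the source.

```python
import secrets

def make_blinding_factors(n, modulus):
    """Return n random values whose sum is 0 modulo `modulus` (assumed constraint)."""
    factors = [secrets.randbelow(modulus) for _ in range(n - 1)]
    factors.append((-sum(factors)) % modulus)   # last factor cancels all the others
    return factors

def blind(values, factors, modulus):
    """Mask each value with its own blinding factor."""
    return [(v + f) % modulus for v, f in zip(values, factors)]
```

Under this assumption, the sum of the blinded values equals the sum of the original values modulo the modulus, while a single blinded value reveals nothing without its factor.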

5.2. Registration

and GW would register with CC, respectively, in this section.

5.2.1. Smart Meters’ Registration

(1) randomly chooses and calculates the public key and knowledge signature:Then, transmits to CC.(2)After receiving , CC verifies whether . If it holds, CC randomly selects and calculates the pseudoidentity for , in which and .(3)CC publishes and secretly transmits to .

5.2.2. Gateway's Registration

(1)GW randomly selects and computes the public key and knowledge signatureThen, GW sends to CC.(2)CC verifies whether . If it holds, CC publishes .

5.3. Report Generation

In this section, would collect and transmit electricity data to GW.

5.3.1. Submits the Data Report to

(1) collects electricity measurement data , randomly selects , and computeswhere is the current timestamp.(2) randomly chooses and calculates(3) submits the data report to .

5.3.2. Uploads the Preaggregated Report to GW

(1) Given from other , examines the timestamp and verifies whether
In order to speed up the verification, uses the small exponent test [44] to achieve batch verification. randomly selects a set of small numbers and verifies whether
(2) Given data reports and its own data report , randomly selects and calculates
where is the current timestamp.
(3) Finally, uploads the preaggregated report to GW.
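The verification equations of step (1) are not recoverable here, so the sketch below only illustrates the small exponent test itself. As a stated substitution, it uses a Schnorr-type signature over a toy multiplicative group (p = 2039, q = 1019, g = 4) instead of the elliptic-curve signature of the scheme, so that it runs with the standard library alone. All names and parameters are hypothetical.

```python
import hashlib
import secrets

# Toy Schnorr group (assumption, illustration only): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4

def h(*parts):
    """Hash arbitrary values to an integer modulo Q."""
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def sign(x, msg):
    k = secrets.randbelow(Q - 1) + 1
    R = pow(G, k, P)
    return R, (k + h(R, msg) * x) % Q

def verify(X, msg, sig):
    """Single check: g^s == R * X^e with e = H(R, msg)."""
    R, s = sig
    return pow(G, s, P) == (R * pow(X, h(R, msg), P)) % P

def batch_verify(items):
    """items: list of (public_key, msg, (R, s)); one combined check with small random exponents."""
    lhs_exp, rhs = 0, 1
    for X, msg, (R, s) in items:
        d = secrets.randbelow(255) + 1            # small nonzero exponent, below Q
        lhs_exp = (lhs_exp + d * s) % Q
        rhs = (rhs * pow(R, d, P) * pow(X, d * h(R, msg), P)) % P
    return pow(G, lhs_exp, P) == rhs
```

The random exponents prevent errors in different signatures from cancelling each other, which is why a single combined equation can replace the individual checks.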
5.4. Report Aggregation

In this section, GW would verify and aggregate the preaggregated reports from . Afterward, GW would upload the aggregated report to CC.(1)Given from the domain headers , GW examines the timestamp , randomly selects a group of tiny values , and verifies whether(2)GW randomly selects and calculateswhere is the current timestamp.(3)Finally, GW uploads the aggregated report to CC.

5.5. Report Reading

CC would verify and decrypt the aggregated report from GW in this section.(1)Receiving from GW, CC checks the timestamp and verifies whether(2)CC decrypts aggregated electricity measurement data:(3)CC analyzes and processes the aggregated data and makes optimal allocation.

Correctness:where and .

5.6. Fault Tolerance

This section describes how to obtain the aggregated data when some smart meters fail to work normally.
(1) Assuming that receives only reports, performs the same operations as in Section 5.3.2 except . Then, broadcasts into the domain .
(2) Given , examines the timestamp and verifies the signature . If it is valid, randomly selects , and computes
where is the current timestamp. Then, submits the report to .
(3) After receiving data reports from , the domain header examines the timestamp and verifies the signature . If batch verification succeeds, randomly selects and calculates
where is the current timestamp. Finally, uploads the preaggregated report to GW.
(4) GW and CC execute the protocol normally as in Sections 5.4 and 5.5, respectively. Finally, CC obtains the aggregated electricity data of the smart meters that have not malfunctioned.

Correctness:where and .

6. Security Analysis

6.1. Indistinguishability

The proposed scheme is proved to be indistinguishable under chosen-plaintext attack (IND-CPA). The adversary can execute the following queries:

Hash query: given a hash query, output a random value.

Encryption query: given an encryption query on the message , output the ciphertext .

The security model is defined by an interactive game played between the adversary and the challenger .

Setup: produces the system parameters and sends them to .

Phase 1: adaptively executes the hash queries and the encryption queries polynomially many times.

Challenge: after completing Phase 1, randomly selects two messages and submits them to . Next, randomly chooses , calculates the ciphertext corresponding to , and returns it to .

Phase 2: executes the same queries as in Phase 1, except for encryption queries on the messages and .

Guess: outputs as its guess.

The advantage for the adversary to win the game is defined as

Definition 1. The proposed scheme is IND-CPA secure if the advantage of any adversary in the above game is negligible.
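The displayed advantage formula is missing from this text. For a game of this shape, the advantage is conventionally written as follows; the notation ($\mathcal{A}$ for the adversary, $b$ for the bit chosen by the challenger, $b'$ for the guess, $\lambda$ for the security parameter) is assumed and may differ from the original.

```latex
% Assumed notation: \mathcal{A} adversary, b challenge bit, b' the adversary's guess.
\mathrm{Adv}^{\text{IND-CPA}}_{\mathcal{A}}(\lambda)
  = \left| \Pr[\, b' = b \,] - \tfrac{1}{2} \right|
```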

Theorem 1. The proposed scheme is IND-CPA secure under the -based decision assumption.

Proof. Assume that the adversary wins the game in Definition 1 with a nonnegligible advantage ; an algorithm would be constituted for breaking the -based decision problem with advantage . Given an arbitrary bit , an instance is established, in which is arbitrarily selected from if and is arbitrarily selected from if . The ultimate target of is to guess .
Setup: sets the security parameter satisfying and chooses two large prime numbers which satisfy . calculates . Next, randomly chooses and sets up message space . secretly keeps and returns to .
For the purpose of rapid response and consistency, holds the following list.
(1) : it consists of tuples .
Phase 1: executes the following queries adaptively.
Hash query: makes a query on and responds according to the following steps:
(1) If is included in the list , replies the hash value to .
(2) If is not included in the list , randomly selects , inserts into the list , and returns to .
Encryption query: makes an encryption query on the message and randomly picks and calculates . Finally, returns to .
Challenge: two messages are provided by , which submits them to . Next, randomly picks and a bit , calculates , and returns it to .
Phase 2: the same queries are executed by as in Phase 1, except for encryption queries on the messages and .
Guess: outputs the guess and submits it to . If , outputs the guess .
When , which means and , the ciphertext is a valid ciphertext. The probability of correctly guessing is . Therefore, the probability that can successfully guess is .
When , which means and , the ciphertext is an invalid ciphertext. The probability of correctly guessing is . Therefore, the probability that can successfully guess is .
Based on the above two cases, the probability that would break the -based decision problem isConsequently, can break the -based decision problem with nonnegligible probability, . This generates a conflict with -based decision assumption; therefore, the proposed scheme is IND-CPA secure.
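The probabilities in the two cases above are missing from this text. In a reduction of this shape, the usual calculation is the one sketched below, assuming that the two cases are equally likely, that the adversary guesses correctly with probability $\tfrac{1}{2} + \epsilon$ on a valid ciphertext, and that it can do no better than $\tfrac{1}{2}$ on an invalid one. The symbols $\epsilon$ and $\mathcal{B}$ are assumed names for the adversary's advantage and the reduction algorithm.

```latex
% Assumed notation: \epsilon adversary's advantage, \mathcal{B} the reduction algorithm.
\Pr[\mathcal{B}\ \text{succeeds}]
  = \tfrac{1}{2}\Bigl(\tfrac{1}{2} + \epsilon\Bigr) + \tfrac{1}{2}\cdot\tfrac{1}{2}
  = \tfrac{1}{2} + \tfrac{\epsilon}{2}
```

Under these assumptions the reduction beats random guessing by $\epsilon/2$, which is nonnegligible whenever $\epsilon$ is.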

6.2. Unforgeability

The proposed scheme satisfies existential unforgeability under adaptively chosen message attack (EUF-CMA). The adversary can execute the following queries:

Hash query: given the hash query, output a random value.

Create user query: given a create user query on of , output the public key .

Corrupt user query: given a corrupt user query on of , output the private key .

Signature query: given a signature query on the ciphertext under of , output the signature .

The security model can be defined by the interactive game between the adversary and the challenger .

Initialization: chooses a challenging identity and submits it to .

Setup: produces the system parameters and sends them to .

Query: adaptively executes the hash queries, the create user queries, the corrupt queries, and the signature queries for polynomial times except the corrupt user query on .

Forgery: produces a forged signature on the ciphertext and the challenging identity , such that(1) is a valid signature(2) has never been queried in the corrupt user queries

The advantage for the adversary to win the game is defined as

Definition 2. The proposed scheme is EUF-CMA secure if the advantage of any adversary in the above game is negligible.

Theorem 2. The proposed scheme is EUF-CMA secure under ECDL assumption.

Proof. Assume that the adversary wins the game in Definition 2 with a nonnegligible advantage ; an algorithm would be constituted for breaking ECDL problem with advantage . An instance of ECDL assumption is established, the ultimate target of is to discover .
Initialization: selects a challenging identity and submits it to .
Setup: selects security parameter and the cyclic group . Then, randomly selects five hash functions that are regarded as random oracles. Finally, sends the system parameters to . maintains the following three lists:
(1) : it consists of tuples
(2) : it consists of tuples
(3) : it consists of tuples
Query: adaptively executes the following queries polynomially many times.
Hash query: makes a query on and responds according to the following steps:
(1) If is included in the list , responds to .
(2) If is not included in the list , randomly selects , inserts into the list , and responds to .
Hash query: executes a query for and responds according to the following steps:
(1) If is included in the list , responds to .
(2) If is not included in the list , randomly selects , inserts into the list , and responds to .
Create user query: this query is issued by on the identity of and responds according to the following steps:
(1) If is included in the list , responds to .
(2) If is not included in the list , executes the following steps:
(i) If , randomly chooses and sets and ; if already appears in the list , randomly selects another and tries again. Then, inserts into the list and inserts into the list , respectively. Finally, responds to .
(ii) If , executes the smart meters’ registration algorithm to produce and returns them to .
Corrupt user query: this query is performed by on the identity of and responds according to the following steps:
(1) If , aborts the game.
(2) If , executes the following steps:
(i) If is included in the list , responds to .
(ii) If is not included in the list , executes the create user query on and responds to .
Signature query: after receiving a ciphertext and for a signature query, responds according to the following steps:
(1) If , randomly chooses and calculates . If already appears in the list , randomly chooses another and tries again. Afterward, inserts into the list and responds to .
(2) If , executes the report generation algorithm to produce and returns them to .
Forgery: produces a forged signature on the ciphertext under identity of , such that
(1) If , aborts the game.
(2) If , can produce an additional valid signature through a different hash value according to the forking lemma [45]. The following two equations can be obtained:
We can calculate
The solution of the ECDL problem is obtained by :
Probability analysis: is allowed to execute at most times query, times query, times create user query, times corrupt user query, and times signature query. The situation in which breaks the ECDL problem is described by three events as follows:
(1) : never aborts the game for the corrupt user queries
(2) : can produce a valid signature
(3) :
According to the above simulation, there are , , and . Therefore, the probability that can solve the ECDL problem is
Thus, can break the ECDL problem with nonnegligible advantage . This contradicts the ECDL assumption; consequently, the proposed scheme satisfies unforgeability.
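The two equations obtained from the forking lemma are missing from this text. The sketch below shows the generic extraction step, assuming a Schnorr-type response of the form s = k + e·x mod q, in which the same nonce k is answered under two different hash values e1 and e2. The scheme's actual signature equation may differ, and the function name is hypothetical.

```python
def extract_key(s1, e1, s2, e2, q):
    """Solve s1 = k + e1*x and s2 = k + e2*x (mod q) for x; requires e1 != e2 mod q."""
    diff_e = (e1 - e2) % q
    # pow(..., -1, q) is the modular inverse (Python 3.8+); q is assumed prime.
    return ((s1 - s2) * pow(diff_e, -1, q)) % q
```

Subtracting the two equations eliminates the unknown nonce, which is why two valid signatures on the same commitment reveal the private key.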

6.3. Analysis of Security Requirement

The security requirements are analyzed comprehensively in this section.

6.3.1. Confidentiality

On the basis of Theorem 1, the adversary cannot decrypt the ciphertext , and to obtain the electricity data without the symmetric homomorphic encryption key . Consequently, confidentiality is satisfied.

6.3.2. Authentication

A legitimate smart meter registers its identity information with CC in advance. After receiving the reports of , verifies whether holds. Based on Theorem 2, the adversary cannot create a valid signature without the private key . Therefore, authentication is achieved.

6.3.3. Integrity

The ciphertext is signed to generate the signature . On the basis of Theorem 2, the adversary cannot generate a valid signature without the private key , and only valid reports are accepted. Thus, integrity is achieved.

6.3.4. Anonymity

Every is set as a pseudoidentity in the registration phase corresponding to the real identity , where and . The adversary cannot get real identity without or . Thus, anonymity is guaranteed in the proposed scheme.

6.3.5. Traceability

When has malicious behavior, only TA can calculate by using private key to uncover the true identity . In this way, the proposed scheme realizes traceability.
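The formulas for the pseudoidentity and for its recovery are missing from this text. A common construction consistent with the description, and the one assumed here, publishes a one-time group element together with the real identity XOR-masked by a hash of a Diffie-Hellman value that only TA can recompute with its secret key. As a stated substitution, the sketch uses a toy multiplicative group instead of the elliptic curve; all names are hypothetical.

```python
import hashlib
import secrets

# Toy group standing in for the elliptic curve (assumption): P = 2*Q + 1, G of order Q.
P, Q, G = 2039, 1019, 4

def mask(shared, length):
    """Derive a `length`-byte mask from a shared group element."""
    return hashlib.shake_256(str(shared).encode()).digest(length)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_pseudo_identity(real_id, ta_public):
    """Registration side: hide real_id under a one-time Diffie-Hellman value."""
    w = secrets.randbelow(Q - 1) + 1
    pid1 = pow(G, w, P)
    pid2 = xor(real_id, mask(pow(ta_public, w, P), len(real_id)))
    return pid1, pid2

def trace(pid, ta_secret):
    """TA side: recompute the shared value from pid1 and unmask the real identity."""
    pid1, pid2 = pid
    return xor(pid2, mask(pow(pid1, ta_secret, P), len(pid2)))
```

Anyone without the TA secret sees only a random-looking pair, while TA can always map a misbehaving pseudoidentity back to the registered identity.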

6.3.6. Resistance against Collusion Attack

GW can disclose an extra ciphertext to CC. Next, CC can obtain by calculating . However, they cannot recover the plaintext without and . Similarly, even if CC obtains the ciphertext by accident, it is still unable to obtain the real electricity data. Hence, the proposed scheme withstands the collusion attack.

6.3.7. Resistance against Modification Attack

According to the guarantee of Theorem 2, any modification of the data report by the polynomial adversary will be detected. Hence, the proposed scheme could resist the modification attack.

6.3.8. Resistance against Replay Attack

Since the reports , and contain the timestamp, the receiver could check the freshness of timestamp. Therefore, the replay attacks can be withstood.
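The text does not state how freshness is decided. A minimal sketch, assuming a fixed acceptance window around the receiver's clock (the window sizes are hypothetical and would be chosen per deployment), is:

```python
import time

def is_fresh(timestamp, now=None, max_age=5.0, max_skew=1.0):
    """Accept a report only if its timestamp lies in [now - max_age, now + max_skew] seconds."""
    if now is None:
        now = time.time()
    return now - max_age <= timestamp <= now + max_skew
```

A replayed report carries an old timestamp and falls outside the window, and altering the timestamp would invalidate the signature that covers it.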

6.4. Functionality Comparison

The functionality comparison with the related schemes [19–21, 27, 28] is shown in Table 2. Confidentiality, authentication, integrity, and fault tolerance are denoted by F1, F2, F3, and F4, respectively. In addition, anonymity, traceability, and resistance against collusion attack, modification attack, and replay attack are represented by F5, F6, F7, F8, and F9, respectively.

The schemes [19, 20] do not support fault tolerance. Furthermore, the schemes [19, 21, 28] do not protect users’ identity privacy, and the schemes [20, 27] cannot trace malicious behaviors. Besides, the schemes [27, 28] may be subject to collision attacks, and the scheme [27] may be subject to replay attacks. The other related schemes each fail to meet several requirements, whereas the proposed scheme fulfills all the security requirements simultaneously.

7. Performance Evaluation

In this section, the computation cost and the communication overhead of the related schemes [19–21, 27, 28] and the proposed scheme are compared quantitatively.

7.1. Computation Cost

To ensure a fair comparison, the proposed scheme is compared with the existing data aggregation schemes [19–21, 27, 28] at the same 80-bit security level. For the schemes [20, 27] based on Paillier encryption, the two large prime numbers are each 512 bits, and is 1024 bits. For the schemes [19–21, 27, 28] based on bilinear pairing, the symmetric bilinear pairing is used, where is an additive group with generator of the order , that is defined on the supersingular elliptic curve with embedding degree 2, is a 160-bit Solinas prime number, and is a 512-bit prime number satisfying . For the ECC-based schemes [20, 21] and the proposed scheme, an additive group with prime order is established on the nonsingular elliptic curve , in which are both 160-bit prime numbers and is a random 160-bit prime number. For the symmetric homomorphic encryption in this paper, and are two 512-bit prime numbers, and the length of is 160 bits.

To make the comparison more accurate, the running time of each cryptographic operation is estimated with the MIRACL Crypto SDK [46]. The test platform is a PC with a 2.90 GHz Intel i5-10400 CPU, 16 GB of memory, and a 64-bit Windows 10 operating system. Table 3 lists the mean time over 10000 executions of each cryptographic operation.
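The measurement procedure, averaging the time of one operation over many executions, can be sketched as below. This is not the MIRACL-based benchmark used in the paper: it is a plain Python stand-in, and the helper `mean_time_ms`, the chosen operation (a modular exponentiation), and the operand values are all assumptions made for illustration.

```python
import time


def mean_time_ms(operation, runs=10000):
    """Average wall-clock time of `operation` over `runs` executions, in ms."""
    start = time.perf_counter()
    for _ in range(runs):
        operation()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


# Stand-in operands (assumptions, not real cryptographic parameters).
modulus = (1 << 1023) | 1    # an odd 1024-bit number
base = 0x1234567
exponent = (1 << 159) | 1    # a 160-bit exponent

# Fewer runs than the paper's 10000, only to keep the sketch quick.
avg = mean_time_ms(lambda: pow(base, exponent, modulus), runs=1000)
print(f"mean time per modular exponentiation: {avg:.4f} ms")
```

Absolute timings from such a sketch depend on the machine and the language runtime, so they are not comparable with the values in Table 3.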

For simplicity, some lightweight operations, such as general hash functions and point addition, are ignored. The details are given in Table 4, in which represents the number of smart meters. Assume in the proposed scheme, that is, . The computation cost is divided into three phases: the report generation phase, the report aggregation phase, and the report reading phase.

First of all, the computation cost of the report generation phase is considered.

Li et al.’s scheme [19] employs exponentiation operations in , multiplication operations in , exponentiation operations in , and map-to-point hash operations. As a result, the running time is .

Shen et al.’s scheme [20] applies Paillier public key encryption operations, exponentiation operations in , scalar multiplication operations in ECC, and map-to-point hash operations. In this way, the running time is .

Chen et al.’s scheme [21] utilizes bilinear pairing operations, map-to-point hash operations, exponentiation operations in , and scalar multiplication operations in ECC. Consequently, the running time is .

Guan et al.’s scheme [27] demands exponentiation operations in , Paillier public key encryption operations, exponentiation operations in , and exponentiation operations in . As a consequence, the running time is .

Ding et al.’s scheme [28] needs exponentiation operations in and exponentiation operations in . Hence, the running time is .

In the report generation phase of the proposed aggregation scheme, executes n scalar multiplication operations in ECC and multiplication operations in . executes scalar multiplication operations in ECC. Therefore, the running time is .

Afterward, the computation cost of the report aggregation phase is analyzed.

Li et al.’s scheme [19] employs bilinear pairing operations, map-to-point hash operations, multiplication operations in , one exponentiation operation in , and one exponentiation operation in . As a result, the running time is .

Shen et al.’s scheme [20] applies bilinear pairing operations, map-to-point hash operations, and one scalar multiplication operation in ECC. In this way, the running time is .

Chen et al.’s scheme [21] utilizes bilinear pairing operations, map-to-point hash operations, and one scalar multiplication operation in ECC. Consequently, the running time is .

Guan et al.’s scheme [27] demands exponentiation operations in and exponentiation operations in . As a consequence, the running time is .

Ding et al.’s scheme [28] needs exponentiation operations in . Hence, the running time is .

In the report aggregation phase, the proposed scheme executes scalar multiplication operations in ECC. Therefore, the running time is .

Ultimately, the computation cost of the report reading phase is summarized.

Li et al.’s scheme [19] employs two bilinear pairing operations, one map-to-point hash operation, one exponentiation operation in , and one discrete logarithm computation. As a result, the running time is .

Shen et al.’s scheme [20] applies two bilinear pairing operations, one map-to-point hash operation, one Paillier public key decryption operation, and two exponentiation operations in . In this way, the running time is .

Chen et al.’s scheme [21] utilizes three bilinear pairing operations, two map-to-point hash operations, and one discrete logarithm computation. Consequently, the running time is .

Guan et al.’s scheme [27] demands one exponentiation operation in and one Paillier public key decryption operation. As a consequence, the running time is .

Ding et al.’s scheme [28] needs three exponentiation operations in , one bilinear pairing operation, and one discrete logarithm computation. Hence, the running time is .

In the report reading phase, the proposed scheme executes two scalar multiplication operations in ECC, one modular operation, and one modular operation. Therefore, the running time is .

The total running times of the other schemes [19–21, 27, 28] and the proposed scheme are , , , , , and , respectively. Figure 4 shows how the overall computation cost varies with the number of smart meters. For all schemes, the overall computation cost grows linearly with the number of smart meters. The proposed scheme requires the lowest overall computation cost and grows more slowly than the other schemes. Specifically, the proposed scheme reduces the cost by 94.5%, 94.3%, 96.7%, 89.1%, and 78.3%, respectively, compared with the other schemes [19–21, 27, 28]. Consequently, the proposed data aggregation scheme is better suited to smart meters with limited computation resources because it involves no time-consuming operations, such as map-to-point hash and bilinear pairing operations.
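The linear relationship between total cost and the number of smart meters, and the way a percentage reduction is derived from two totals, can be illustrated with a small sketch. The per-meter and fixed costs below are hypothetical placeholders chosen for the example; they are not the values from Table 3 or Table 4.

```python
def total_cost_ms(n, per_meter_ms, fixed_ms):
    """Linear cost model: a per-meter cost times n plus a fixed cost."""
    return per_meter_ms * n + fixed_ms


def reduction_percent(baseline, proposed):
    """Relative saving of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Hypothetical coefficients (assumptions for illustration only).
BASELINE = {"per_meter_ms": 10.0, "fixed_ms": 20.0}
PROPOSED = {"per_meter_ms": 1.0, "fixed_ms": 2.0}

for n in (50, 100, 200):
    base = total_cost_ms(n, **BASELINE)
    prop = total_cost_ms(n, **PROPOSED)
    print(f"n={n}: baseline {base:.1f} ms, proposed {prop:.1f} ms, "
          f"reduction {reduction_percent(base, prop):.1f}%")
```

Under such a model, the scheme with the smaller per-meter coefficient has the flatter line, which corresponds to the slower growth visible in Figure 4.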

7.2. Communication Overhead

The communication overhead is compared with that of the schemes [19–21, 27, 28]; the details are shown in Table 5, where denotes the bit size of . The size of the transmitted report is analyzed in two parts: the communication overhead from SM to GW and that from GW to CC. As before, the lengths of , , , , , and are 160 bits, 512 bits, 1024 bits, 160 bits, 1024 bits, and 2048 bits, respectively. Furthermore, the identity and the timestamp are both assumed to be 32 bits.

For the first part, the communication process from SM to GW is analyzed.

In Li et al.’s scheme [19], the electricity transmission data are , where is a 32-bit identity, is a 32-bit timestamp, and . Consequently, the communication overhead is .

In Shen et al.’s scheme [20], the electricity transmission data are , where , , , is a 32-bit timestamp, and . In this way, the communication overhead is .

In Chen et al.’s scheme [21], the electricity transmission data are , where , , is a 32-bit timestamp, and is a 32-bit identity. Therefore, the communication overhead is .

In Guan et al.’s scheme [27], the electricity transmission data are , where , , , , and are considered as 32 bits. Thus, the communication overhead is .

In Ding et al.’s scheme [28], the electricity transmission data are , where , , , is a 32-bit timestamp, and is a 32-bit identity. Hence, the communication overhead is .

In the proposed scheme, submits the data report to , where , , , , is 32 bits, and the size is . submits the data report to GW, and the size is . Therefore, the communication overhead is .

Figure 5 shows the relationship between the communication overhead from SM to GW and the number of smart meters. The communication overhead rises linearly as the number of smart meters increases. The proposed scheme demands bits, so it reduces the communication overhead by 47.2%, 52.7%, and 24.1%, respectively, compared with the other schemes [20, 27, 28]. Although the related schemes [19, 21] are slightly better than our work in terms of the communication overhead from SM to GW, this difference is acceptable because the proposed scheme provides both fault tolerance and anonymity, which those schemes do not jointly offer. Overall, the proposed scheme consumes fewer communication resources.

For the second part, the communication process from GW to CC is analyzed.

In Li et al.’s scheme [19], the electricity transmission data are , where is a 32-bit identity, is a 32-bit timestamp, and . Consequently, the communication overhead is .

In Shen et al.’s scheme [20], the electricity transmission data are , where , , , is a 32-bit timestamp, and . In this way, the communication overhead is .

In Chen et al.’s scheme [21], the electricity transmission data are , where , , is a 32-bit timestamp, and is a 32-bit identity. Therefore, the communication overhead is .

In Guan et al.’s scheme [27], the electricity transmission data are , where , , and is a 32-bit timestamp. Thus, the communication overhead is .

In Ding et al.’s scheme [28], the electricity transmission data are , where , , , , is a 32-bit timestamp, and is a 32-bit identity. Hence, the communication overhead is .

In the proposed scheme, the electricity report is , where is a 32-bit identity, , , , and is a 32-bit timestamp. Therefore, the communication overhead is .

The last column of Table 5 lists the communication overhead from GW to CC of the related schemes. The transmitted data report of the proposed scheme is 1408 bits, a reduction of 12.0%, 56.9%, 12.0%, 54.6%, and 49.4%, respectively, compared with the other schemes [19–21, 27, 28]. Consequently, the proposed scheme achieves lower communication overhead from GW to CC, which benefits a GW with limited communication resources.
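As a cross-check of the percentages above, the baseline size implied by each stated reduction can be recovered from the 1408-bit figure. The sketch below assumes the reduction is defined as (baseline − proposed) / baseline; the function name is illustrative, and the results are approximations limited by the one-decimal rounding of the percentages, not the exact entries of Table 5.

```python
def implied_baseline_bits(proposed_bits, reduction_percent):
    """Baseline size B such that (B - proposed) / B equals the stated reduction."""
    return proposed_bits / (1.0 - reduction_percent / 100.0)


PROPOSED_BITS = 1408
# Reductions as stated in the text, in the order of schemes [19-21, 27, 28].
REDUCTIONS = {"[19]": 12.0, "[20]": 56.9, "[21]": 12.0, "[27]": 54.6, "[28]": 49.4}

implied = {
    scheme: round(implied_baseline_bits(PROPOSED_BITS, percent))
    for scheme, percent in REDUCTIONS.items()
}
for scheme, bits in implied.items():
    print(f"{scheme}: about {bits} bits")
```

For example, a 12.0% reduction down to 1408 bits corresponds to a baseline of about 1600 bits.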

8. Conclusion

In this paper, we have employed symmetric homomorphic encryption and an elliptic curve signature to design a lightweight, privacy-preserving data aggregation scheme for smart grid. In the proposed scheme, even if some smart meters malfunction, the system can still run normally and obtain the aggregated data. Besides, it does not restrict the space of electricity data. The security analysis has demonstrated that the proposed scheme is IND-CPA and EUF-CMA secure and satisfies all security requirements. Finally, the performance analysis has shown that the proposed scheme is lightweight in terms of computation cost and communication overhead. Judging from the results, the proposed scheme is more practical for a smart grid with limited computation and communication capabilities.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (no. 62072054) and the Key Research and Development Program of Shaanxi Province (no. 2021GY-047).