#### Abstract

Recent advances in Internet and microelectronics technologies have led to the concept of the smart grid, which has attracted widespread attention from industry, governments, and academia. The openness of communications in the smart grid environment makes the system vulnerable to different types of attacks. Implementing secure communication and protecting consumers’ privacy have become challenging issues. Data aggregation is an important technique for preserving consumers’ privacy because it can stop the leakage of a specific consumer’s data. To satisfy the security requirements of practical applications, many data aggregation schemes have been presented over the last several years. However, most of them suffer from security weaknesses or perform poorly. To reduce computation cost and achieve better security, we construct a lightweight data aggregation scheme against internal attackers in the smart grid environment using Elliptic Curve Cryptography (ECC). Security analysis of our proposed approach shows that it is provably secure and can provide confidentiality, authentication, and integrity. Performance analysis of the proposed scheme demonstrates that both its computation and communication costs are much lower than those of the three previous schemes. As a result of these benefits, the proposed lightweight data aggregation scheme is more practical for deployment in the smart grid environment.

#### 1. Introduction

By providing bidirectional flows of electricity and information, the smart grid performs real-time monitoring of power usage [1]. Based on the real-time information, the providers can monitor power generation and consumption and obtain the immediate power demand of each area. Then, they can take prompt action to optimize the power supply. The consumer can also obtain the current power price and adjust his/her behavior to lower expenses. Therefore, the smart grid can achieve efficient, economical, and reliable power services. Due to such advantages, the smart grid has attracted widespread attention from governments, industry, and academia over the last decade and is considered the most promising candidate for the next-generation power system [2].

The National Institute of Standards and Technology (NIST) presents a model of the smart grid [3]. As shown in Figure 1 [4], a smart grid consists of seven important domains, that is, the power generation (PG) domain, the power transmission (PT) domain, the power distribution (PD) domain, the power customer (PC) domain, the power operation (PO) domain, the power market (PM) domain, and the power service provider (PSP) domain [5, 6]. After power is generated, transmitted, and distributed in the PG domain, the PT domain, and the PD domain, respectively, customers in the PC domain consume it. The PO domain, the PM domain, and the PSP domain manage the power flow, the participants, and all third-party operations, respectively [7, 8].

The smart meters in the smart grid collect the consumers’ power consumption data and other information and send them to the remote control center. Generally speaking, the smart meter is installed outside the consumer’s door, and because the communication channel is open, an attacker can easily take control of it. The attacker may maliciously modify the power consumption data to increase or decrease the consumer’s power expense. He/she can also learn the consumer’s daily routine in order to commit crimes. For example, the attacker can infer that the consumer is out when there is no power consumption and sneak into the house to steal valuables.

Achieving secure communications in the smart grid is therefore an issue that needs to be addressed. In particular, ensuring the integrity and confidentiality of the data is especially important. Several cryptographic schemes can be applied for secure communications in the smart grid. Many key management schemes [9–11], key distribution schemes [12–14], and key agreement schemes [15–17] were presented in recent years. However, many of these schemes cannot provide integrity and confidentiality simultaneously. To address this challenge, data aggregation schemes have been proposed by several researchers and applied in the smart grid. However, most of them are vulnerable to attacks from internal attackers. Although several data aggregation schemes against internal attackers were proposed to enhance security, their computation or communication costs are too high for practical smart grid applications, in which the smart meter has very limited computation and communication capabilities. It is therefore necessary to design lightweight data aggregation schemes for practical deployment.

##### 1.1. Our Contributions

To reduce both computation and communication costs, we propose a lightweight data aggregation scheme based on Elliptic Curve Cryptography (ECC) [18, 19], which can obtain the same security level with a much shorter key size. The main contributions of our paper are as follows:

(i) First, we propose a lightweight data aggregation scheme based on Schnorr’s signature scheme [18].

(ii) Second, we prove that the proposed lightweight data aggregation scheme is secure and is able to satisfy security requirements.

(iii) Finally, we analyze the performance of the proposed lightweight data aggregation scheme to demonstrate its high performance.

##### 1.2. Organization of the Paper

In Section 2, we briefly review related work on data aggregation schemes. In Section 3, we give some preliminaries, including background on ECC, the network model, and the security requirements of a data aggregation scheme. In Section 4, we present our lightweight data aggregation scheme based on ECC. In Section 5, we describe a security model for the data aggregation scheme and present the security analyses of our scheme. In Section 6, we present the computation and communication analyses of our data aggregation scheme.

#### 2. Related Works

To guarantee secure communication in open environments, many authentication schemes [20–22], encryption schemes [23–26], and secure outsourcing schemes [25, 27, 28] have been constructed in the last several years. Li et al. [29] and Garcia and Jacobs [30] designed two data aggregation schemes using Paillier’s encryption scheme [31]. To improve performance, Lu et al. [32] designed an improved data aggregation scheme using Paillier’s encryption scheme and the super-increasing sequence. However, the above three schemes [29, 30, 32] cannot protect consumers’ privacy because none of them can provide anonymity. To protect consumers’ privacy, Zhang et al. [33] designed a security-enhanced data aggregation scheme based on the Chinese Remainder Theorem and Paillier’s encryption scheme. Chen et al. [34] also designed a security-enhanced data aggregation scheme with fault tolerance based on Paillier’s encryption scheme.

Unfortunately, internal attacks are not considered in the above data aggregation schemes [29, 30, 32–34], thereby allowing internal attackers to access the consumers’ smart grid data. To address this weakness, Fan et al. [35] designed the first data aggregation scheme that can withstand attacks from internal attackers by using blinding technology. However, Bao and Lu [36] demonstrated that Fan et al.’s data aggregation scheme cannot guarantee the integrity of transmitted data. To enhance security, He et al. [4] designed an improved data aggregation scheme based on Boneh et al.’s encryption scheme [37]. The performance of Fan et al.’s data aggregation scheme [35] and He et al.’s data aggregation scheme [4] is not good enough because they use bilinear pairing operations.

#### 3. Preliminaries

##### 3.1. Elliptic Curve

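Given a prime number $p$, we say that the equation $y^2 = x^3 + ax + b \bmod p$ defines an elliptic curve $E$ over the finite field $F_p$, where $a, b \in F_p$ and $4a^3 + 27b^2 \neq 0 \bmod p$ [38]. It is well known that all points on $E$ and the infinite point $O$ make an additive group $G$. Given a generator point $P$ with a prime order $q$, the scalar multiplication operation is defined as $kP = P + P + \cdots + P$ ($k$ times), where $k$ is a positive integer.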
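The group law and scalar multiplication described above can be sketched in a few lines. The following is a minimal illustration only, assuming a textbook toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with generator $(5, 1)$ of prime order 19; these parameters and the function names `point_add` and `scalar_mult` are assumptions chosen for readability and are not the parameters used in the scheme.

```python
# Toy parameters (assumptions for illustration; far too small for real use).
P_MOD = 17        # field prime p
A, B = 2, 2       # curve y^2 = x^3 + 2x + 2 over F_17
G = (5, 1)        # generator point
ORDER = 19        # prime order of G
INF = None        # the infinite point (group identity)


def point_add(p1, p2):
    """Add two points on the toy curve using the standard chord-and-tangent rule."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    # P + (-P) is the infinite point.
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF
    if p1 == p2:
        # Tangent slope for point doubling.
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Chord slope for two distinct points.
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute kP with the double-and-add method."""
    result = INF
    addend = point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result
```

Double-and-add needs a number of group operations proportional to the bit length of $k$, rather than $k - 1$ additions. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.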

Previous research has shown that the following problems in the group $G$ are suitable for the design of public key cryptography because no probabilistic polynomial time algorithm can solve them efficiently [38].

*Discrete Logarithm (DL) Problem*. Given an element $Q \in G$, the DL problem is to extract an element $x \in Z_q^*$ such that $Q = xP$.

*Computational Diffie-Hellman (CDH) Problem*. Given two elements $aP, bP \in G$ with two unknown elements $a, b \in Z_q^*$, the CDH problem is to extract the element $abP$.

##### 3.2. Network Model

As shown in Figure 2 [4], there are three participants in the system of a data aggregation scheme, namely, a trusted third party (TTP), an aggregator (Agg), and a smart meter () [4, 35]. The functions of the above three participants are presented as below.

(i) TTP: it is a trusted third party and its function is to generate blinding factors to withstand the internal attackers.

(ii) Agg: it is the manager of the smart grid and its function is to generate the system parameters and the private keys of smart meters.

(iii): it is a smart meter and its function is to collect consumers’ electricity consumption data and send it to Agg.

The workflow of the system is presented as follows. (1) Agg produces the system parameters and the master private key; (2) registers in Agg and gets its private key; (3) TTP generates the blinding factors for Agg and ; (4) collects the electricity consumption, produces a ciphertext, and sends it to Agg; (5) after collecting all ciphertexts, Agg checks their validity and extracts the sum of all electricity consumption data.

##### 3.3. Security Requirements

Based on recent work, a data aggregation scheme for the smart grid should meet the security requirements below [4, 35].

*(i) Confidentiality*. The consumer’s power consumption data indicates his/her habit and its leakage may be used by an attacker to commit a crime. To ensure the consumer’s safety, a data aggregation scheme should provide confidentiality; that is, both the external attackers and the internal attackers cannot extract the electricity consumption data from intercepted messages.

*(ii) Authentication*. The malicious attacker may forge a message and impersonate the consumer. To ensure that the received message was transmitted by a legal , a data aggregation scheme should provide authentication; that is, Agg can check the legality of the received message.

*(iii) Integrity*. All messages are transmitted over open communication channels and the malicious attacker may modify them to break regular transactions. To protect the rights and interests of all participants in the smart grid, a data aggregation scheme should provide integrity; that is, Agg can detect any modification of the received data.

*(iv) Resistance against Attacks*. Due to the openness of communication channels in the smart grid, the system is vulnerable to many types of attacks. To obtain secure communications in the smart grid, a data aggregation scheme should supply resistance against attacks; that is, it can withstand the replay attack, the modification attack, the man-in-the-middle attack, and the impersonation attack.

#### 4. The Proposed Data Scheme

We describe our proposed lightweight data aggregation scheme, which consists of three phases, namely, the initialization phase, the registration phase, and the aggregation phase.

*Initialization Phase*. In this phase, Agg executes some steps to produce the system parameters. TTP and Agg execute some other steps to produce the blinding factors against internal attackers.

Agg runs the following steps to produce the system parameters.(1)Agg selects an elliptic curve determined by the equation , where is a prime and .(2)Agg selects an element with the order existing on , where is a prime.(3)Agg selects an element and calculates .(4)Agg selects three cryptographic hash functions .(5)Agg publishes and saves secretly.

TTP and Agg execute the following steps to produce the blinding factors.(1)TTP randomly selects a group of elements and computes . At last, TTP sends to Agg and also sends to , where .(2)Agg computes and keeps it secretly.
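The idea behind blinding factors can be illustrated with a small sketch. It assumes a zero-sum construction, in which the factor given to Agg cancels the factors given to the smart meters, and it works with plain integers modulo a prime rather than with elliptic curve points. Both choices are assumptions made for illustration and do not reproduce the scheme's exact formulas; the names `make_blinding_factors`, `blind`, and `aggregate` are hypothetical.

```python
import secrets

# Hypothetical modulus standing in for the group order (2**61 - 1 is prime).
Q = 2**61 - 1


def make_blinding_factors(n):
    """TTP side: pick n random factors for the meters and one cancelling factor for Agg."""
    meter_factors = [secrets.randbelow(Q) for _ in range(n)]
    agg_factor = (-sum(meter_factors)) % Q   # all factors sum to 0 mod Q
    return meter_factors, agg_factor


def blind(reading, factor):
    """Meter side: hide a single reading behind its blinding factor."""
    return (reading + factor) % Q


def aggregate(blinded_readings, agg_factor):
    """Agg side: the factors cancel only in the sum, so only the total is revealed."""
    return (sum(blinded_readings) + agg_factor) % Q


readings = [3, 5, 7]
meter_factors, agg_factor = make_blinding_factors(len(readings))
blinded = [blind(m, f) for m, f in zip(readings, meter_factors)]
total = aggregate(blinded, agg_factor)
```

In this sketch a party that lacks the Agg-side factor sees only uniformly distributed values, while Agg, which lacks the individual meter factors, can recover the total but not any single reading.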

*Registration Phase*. In this phase, registers in Agg. After registration, receives its private key and becomes a legal smart meter. As demonstrated in Table 1, and Agg run the following processes to finish the registration.(1) randomly chooses an element , computes , and transmits to Agg secretly.(2)Agg randomly chooses an element and computes , and . At last, Agg sends to secretly.(3) computes and checks if the equation holds. If not, rejects the session; otherwise, stores and finishes the registration.

Due to the fact that , , and , then we have

Therefore, the correctness of the registration phase is demonstrated.

*Aggregation Phase*. In this phase, extracts the power consumption data and sends it to Agg. Agg checks the validity of the received messages and aggregates all the received data. As demonstrated in Table 1, the steps below are executed by and Agg.(1) gets the power consumption data , randomly chooses an element , and computes , and . At last, transmits to Agg.(2)Agg checks if , where and . To improve performance, we use the small exponent test technology [39] to achieve the batch verification. Agg randomly chooses a group of integers and checks if the equation holds. Agg computes and extracts the sum of the power consumption data by computing .
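The small exponent test mentioned above can be sketched generically. The example below is not the scheme's verification equation: it assumes a toy multiplicative subgroup of prime order (p = 2039, q = 1019, g = 4) instead of an elliptic curve group, and it batches claims of the form $y_i = g^{x_i}$. The function name `batch_verify` and its parameters are hypothetical.

```python
import secrets

# Toy parameters (assumptions): P = 2 * Q + 1 with Q prime, G generates the order-Q subgroup.
P = 2039
Q = 1019
G = 4


def batch_verify(claims, exponents=None, bits=8):
    """Check many claims y_i == G**x_i (mod P) with a single combined equation.

    claims: list of (x_i, y_i) pairs.
    exponents: optional fixed small exponents (useful for deterministic tests);
               by default they are drawn at random with `bits` bits each.
    """
    if exponents is None:
        exponents = [secrets.randbits(bits) for _ in claims]
    lhs_exponent = 0
    rhs = 1
    for s, (x, y) in zip(exponents, claims):
        lhs_exponent = (lhs_exponent + s * x) % Q   # sum of s_i * x_i
        rhs = rhs * pow(y, s, P) % P                # product of y_i ** s_i
    return pow(G, lhs_exponent, P) == rhs


valid_claims = [(x, pow(G, x, P)) for x in (5, 17, 123)]
tampered_claims = list(valid_claims)
tampered_claims[1] = (17, valid_claims[1][1] * G % P)   # wrong y for x = 17
```

The random exponents prevent errors in different claims from cancelling each other out; with `bits`-bit exponents an invalid batch is accepted with probability of roughly $2^{-\text{bits}}$, so real deployments use larger exponents than this toy default.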

Due to the fact that and , we can derive

According to the above equations, the correctness of the aggregation phase of our scheme is demonstrated.

#### 5. Security Analysis

The security of the proposed lightweight data aggregation scheme is analyzed in this section. First, we present a security model for the data aggregation scheme. Second, we demonstrate that the proposed lightweight data aggregation scheme is provably secure in the security model. Finally, we demonstrate that the proposed lightweight data aggregation scheme can meet the security requirements presented in Section 3.

##### 5.1. Security Model

Based on security models [40] for signcryption schemes, we presented a security model for data aggregation schemes. The security of confidentiality and unforgeability is formally defined by two games executed by an attacker and a challenger . is allowed to make the following queries.(i): for such a query made by , randomly selects , sends to , and stores in the table , where .(ii): for such a query made by , generates ’s secret key and blinding factor and stores them in the table .(iii): for such a query made by , sends ’s private key and blinding factor to .(iv): for such a query made by , generates a ciphertext corresponding to the message .(v): for the query made by , checks the validity of the ciphertext and decrypts it to get the plaintext.

*Definition 1.* A data aggregation scheme is able to provide confidentiality [indistinguishability against adaptive chosen ciphertext attacks ()] if no attacker can win the following game with a nonnegligible advantage.

*Setup*. produces system parameters and transmits them to .

*Phase 1*. is able to adaptively make , CreateSM, CorruptSM, Signcrypt, and Designcrypt queries.

*Challenge*. picks a challenging identity , chooses two messages and , and sends them to . picks a random element , produces a signcrypted ciphertext , and sends it to .

*Phase 2*. In this phase, can adaptively make , CreateSM, CorruptSM, and Signcrypt queries except that it cannot make a CorruptSM query with or a Designcrypt query with .

Finally, gives its guess about the value of selected by .

The advantage of is defined by the equation . wins in the above game if it guesses the value of correctly.

*Definition 2.* A data aggregation scheme is able to provide unforgeability [existential unforgeability against adaptive chosen message attacks (EUF-CMA)] if no attacker wins the following game with a nonnegligible advantage.

*Setup*. produces the system parameters and sends them to .

*Query*. In this phase, picks a challenging identity and is able to adaptively make , CreateSM, CorruptSM, Signcrypt, and Designcrypt queries except that it cannot make a CorruptSM query with .

*Forgery*. In this phase, outputs a ciphertext corresponding to the challenging identity .

We say wins in the above game if is valid and it is not generated by executing a Signcrypt query.

##### 5.2. Security Analysis

Theorem 3. *The proposed data aggregation scheme is able to provide confidentiality if the CDH problem is hard.*

*Proof. *Assume that an attacker wins the game defined in Definition 1 with a nonnegligible advantage . Based on ’s capability, we can construct a challenger to solve the CDH problem with a nonnegligible advantage. Given an instance () of the CDH problem, sets and sends to . randomly picks up an identity as the challenging identity and answers queries from according to the rules below.(i): keeps a table of the form , where . Upon receiving such a query, checks if contains a tuple . If so, sends to ; otherwise, randomly selects an element , stores into , and sends to .(ii): keeps a table of the form . Upon receiving such a query, checks if contains a tuple . If so, sends to ; otherwise, randomly selects three integers and sets . stores () and () into , respectively.(iii): checks if contains a tuple (). If not, makes -query with the identity . After that, returns () to .(iv): checks if contains a tuple (). If not, makes -query with the identity . After that, gets the tuple () from and uses it to produce a ciphertext . At last, sends to .Given the power consumption data and , extracts () from and selects a random element . sets , randomly selects three elements , stores () into , and sends to .

After that, can make , CreateSM, CorruptSM, and Signcrypt queries and get the corresponding responses. Then, outputs as his/her guess against the confidentiality. selects a random tuple () from and outputs as the solution of the given CDH problem.

Let denote the number of -query involved in the game. The probability that can solve the given CDH problem is . Because of the nonnegligibility of , we know that is nonnegligible. This contradicts the hardness of the CDH problem. Thus, the proposed data aggregation scheme is able to provide confidentiality.

Theorem 4. *The proposed data aggregation scheme is able to provide unforgeability if the DL problem is hard.*

*Proof. *Assume that an attacker wins the game defined in Definition 2 with a nonnegligible advantage . Based on ’s capability, we can construct a challenger to solve the DL problem with a nonnegligible advantage. Given an instance () of the DL problem, picks a random integer , computes , and sends to . randomly selects an identity as the challenging identity and answers queries from according to the rules below.(i): keeps a table of the form , where . Upon receiving such a query, checks if contains a tuple . If so, sends to ; otherwise, randomly picks up an element , stores into , and sends to .(ii): keeps a table of the form . Upon receiving such a query, checks if contains a tuple . If so, sends to ; otherwise, answers the query through the rules below:(1)If , randomly picks two integers and sets . stores () and () into , respectively.(2)Otherwise (), randomly selects three integers and sets . stores () and () into , respectively.(iii): checks if contains a tuple (). If not, makes -query with the identity . After that, returns () to .(iv): checks if and are equal. If they are not, extracts the tuple () from and uses it to produce a ciphertext according to the description of the proposed data aggregation scheme; otherwise, randomly selects two integers and computes and . stores () into and sends to .(v): for the query made by , checks the validity of the ciphertext and decrypts it to get the plaintext using the system’s secret key .

At last, outputs a forged ciphertext (). stops the game if the equation holds. Based on the forking lemma [41], can output another valid ciphertext () by choosing a different hash function . Since both ciphertexts are valid, we can derive the following two equations:

Based on the above two equations, we can derive the equation below:

outputs as the solution of the given DL problem. To compute the probability that solves the DL problem, two related events are listed below.(i): equals .(ii): is able to forge a legal ciphertext.

Let denote the number of involved in the game. It is easy to get that and . Then, the probability that solves the DL problem is . Because of the nonnegligibility of , we know that is nonnegligible. This is in contradiction with the hardness of the DL problem. Thus, the proposed data aggregation scheme is able to provide unforgeability.

##### 5.3. Analysis of Security Requirements

We will show that the proposed lightweight data aggregation scheme is able to meet security requirements presented in Section 3.

*(i) Confidentiality*. The internal attacker against the proposed data aggregation scheme can compute . Without the blinding factor , he/she cannot extract the sum of the power consumption data by computing . Besides, Theorem 3 shows that the proposed lightweight data aggregation scheme can supply confidentiality against any external attacker. Thus, our lightweight data aggregation scheme can supply confidentiality.

*(ii) Authentication*. Theorem 4 shows that no attacker can forge a legal ciphertext. Then, Agg can verify the legality of received messages by verifying if the equation holds. Therefore, the proposed data aggregation scheme can provide authentication.

*(iii) Integrity*. Theorem 4 demonstrates that no attacker against the proposed data aggregation scheme can forge a legal ciphertext. Agg can detect any modification of the received data by verifying if the equation holds. Therefore, the proposed data aggregation scheme can provide integrity.

*(iv) Resistance against Attacks*. The proposed lightweight data aggregation scheme can resist the replay attack, the modification attack, the man-in-the-middle attack, and the impersonation attack. The reason is analyzed below.

*(1) Replay Attack*. The timestamp is involved in the ciphertext. Agg can detect any replay of a previous message by verifying the timestamp’s freshness. Thus, the proposed lightweight data aggregation scheme can resist the replay attack.
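A freshness check of this kind can be sketched as follows. The acceptance window `MAX_AGE_SECONDS` and the function name `is_fresh` are hypothetical; the text does not specify how freshness is measured, so this is only one plausible reading.

```python
# Hypothetical acceptance window; the text does not specify a value.
MAX_AGE_SECONDS = 60


def is_fresh(timestamp, now, max_age=MAX_AGE_SECONDS):
    """Accept a message only if its timestamp is not in the future and not too old."""
    age = now - timestamp
    return 0 <= age <= max_age
```

Because the timestamp is covered by the ciphertext's validity check, an attacker cannot simply replace an old timestamp with a current one without invalidating the message.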

*(2) Modification Attack*. Theorem 4 demonstrates that no attacker against the proposed data aggregation scheme can forge a legal ciphertext. Agg can detect any modification of the received data by verifying if holds. Thus, the proposed lightweight data aggregation scheme can resist the modification attack.

*(3) Man-in-the-Middle Attack*. The above analysis demonstrates that the proposed lightweight data aggregation scheme can supply authentication; that is, Agg can authenticate by checking if holds. Thus, the proposed lightweight data aggregation scheme can resist the man-in-the-middle attack.

*(4) Impersonation Attack*. Theorem 4 shows that no attacker can forge a legal ciphertext without ’s secret key. Then, Agg can detect any impersonation by verifying the validity of the received ciphertext. Therefore, the proposed lightweight data aggregation scheme can resist the impersonation attack.

#### 6. Performance Analysis

We analyze both computation and communication costs of our lightweight data aggregation scheme in this section. We also compare its performance with two of the most recently proposed data aggregation schemes to show its lightweight costs.

To achieve a fair comparison, we compare recently proposed aggregation schemes under the same security level. In the BGN encryption scheme [37], two 512-bit prime numbers and are applied in our experiments, where and are also large prime numbers. In schemes based on bilinear pairing, a Tate pairing based on a Type A elliptic curve with a prime order is applied in our experiments, where the lengths of and are 512 bits and 160 bits, respectively. In schemes based on ECC, an elliptic curve with a prime order is applied in our experiments, where the lengths of and are 160 bits.

##### 6.1. Analysis of Computation Costs

Using the well-known cryptographic library MIRACL [42], we have implemented all related operations on a personal computer with an Intel I5-3210M 2.50 GHz Central Processing Unit (CPU), 8 Gbytes of Random Access Memory (RAM), and the Windows 7 operating system. Table 3 presents the operations’ notations and runtime results.

Each in the Fan et al.’s scheme [35] runs one BGN encryption operation, one exponentiation in BGN algorithm, two multiplications related to BGN algorithm, one operation, one operation, and one general hash function. Therefore, ’s runtime is × = × = 36.254 ms. Agg in Fan et al.’s scheme [35] runs one BGN decryption, one exponentiation related to the BGN algorithm, multiplication related to BGN algorithm, hash-to-point, bilinear pairing, point multiplication related to the bilinear pairing, point multiplication with a short exponent related to the bilinear pairing, exponentiation related to the bilinear pairing, and one general hash function. Therefore, Agg’s runtime is + + = + + = ) microseconds.

Each in the proposed scheme executes two point multiplication operations related to ECC and two general hash functions. Therefore, ’s runtime is microseconds. Agg in the proposed scheme executes point multiplication related to ECC, point addition related to ECC, and general hash functions. Therefore, ’s runtime is + = + .

Table 4 and Figure 3 show the runtime comparisons among Fan et al.’s data aggregation scheme [35], He et al.’s scheme [4], and the proposed scheme. From Tables 4 and 2, the proposed scheme incurs a lower computation cost as compared to Fan et al.’s scheme and He et al.’s scheme at both sides of and Agg.

##### 6.2. Analysis of Communication Costs

Since the sizes of , , , , , and are 512 bits, 512 bits, 512 bits, 160 bits, 1024 bits, and 160 bits, respectively, we can determine that the sizes of elements in , , , , , and are 1024 bits, 1024 bits, 1024 bits, 160 bits, 1024 bits, and 160 bits, respectively. We assume that the timestamp and the identity are each 32 bits long. The communication costs of the related data aggregation schemes are shown below.

In Fan et al.’s data aggregation scheme [35] sends the message to Agg, where , , and is the timestamp. Therefore, the communication cost of Fan et al.’s data aggregation scheme is 1024 + 1024 + 32 = 2080 bits. In He et al.’s data aggregation scheme [4] sends the message to Agg, where , , , is ’s identity, and is the timestamp. Therefore, the communication cost of He et al.’s data aggregation scheme is 32 + 1024 + 160 + 1024 + 32 = 2272 bits. In the proposed data aggregation scheme, sends the message to Agg, where , , , and is the timestamp. Therefore, the communication cost of the proposed data aggregation scheme is 1024 + 1024 + 160 + 32 = 2240 bits.
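The arithmetic above can be checked with a short script. The per-element sizes are taken directly from the sums stated in the text; the dictionary keys and variable names are only labels chosen for this sketch.

```python
# Message component sizes in bits, in the order given in the text for each scheme.
MESSAGE_COMPONENTS = {
    "Fan et al. [35]": [1024, 1024, 32],
    "He et al. [4]": [32, 1024, 160, 1024, 32],
    "Proposed": [1024, 1024, 160, 32],
}

# Total communication cost per smart meter message, in bits.
costs = {scheme: sum(sizes) for scheme, sizes in MESSAGE_COMPONENTS.items()}

# Differences relative to the proposed scheme.
extra_over_fan = costs["Proposed"] - costs["Fan et al. [35]"]
saving_over_he = costs["He et al. [4]"] - costs["Proposed"]

for scheme, bits in costs.items():
    print(f"{scheme}: {bits} bits")
```

Under these figures the proposed scheme sends 160 bits more per message than Fan et al.'s scheme and 32 bits fewer than He et al.'s scheme.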

Based on the above evaluation, we note that the proposed data aggregation scheme incurs lower communication cost than He et al.’s data aggregation scheme. The proposed data aggregation scheme incurs a higher communication cost than Fan et al.’s data aggregation scheme. Security is most important for a data aggregation scheme. Therefore, it is reasonable to address serious security weaknesses in Fan et al.’s data aggregation scheme at the cost of increasing the communication cost slightly.

#### 7. Conclusion

To ensure security and protect privacy in the smart grid environment, several data aggregation schemes have been proposed in recent years. However, most of them are not secure against internal attackers. To address the problem, Fan et al. [35] proposed a data aggregation scheme to mitigate internal attacks. Unfortunately, their data aggregation scheme suffers from serious security weaknesses. To enhance security, He et al. [4] proposed an improved data aggregation scheme using bilinear pairing. However, He et al.’s scheme is not well suited to the smart grid environment because the smart meter has limited computation capability. In this paper, we have proposed a novel data aggregation scheme that can thwart internal attacks in the smart grid environment. The security analysis shows that the proposed scheme is provably secure and can meet the security requirements. In addition, the performance evaluation shows that the proposed scheme incurs lower computation costs than both schemes and a lower communication cost than He et al.’s scheme. The stronger security and better performance of the proposed scheme demonstrate that it is more suitable for smart grids.

With the fast development of quantum computing, traditional mathematical problems (such as the DL problem and the CDH problem) are likely to become solvable in polynomial time by quantum computers. All the above data aggregation schemes for the smart grid would then no longer be secure. Lattices have been widely used to construct cryptographic schemes that can resist the strong capabilities of quantum computers. However, no data aggregation scheme based on lattices has been proposed yet. To improve security, it is worthwhile to consider the design of a lattice-based data aggregation scheme for the smart grid.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The work was supported by the National Natural Science Foundation of China (nos. 61572370, 61501333, 61572379, and U1536204), the National High-Tech Research and Development Program of China (863 Program) (no. 2015AA016004), and the Natural Science Foundation of Hubei Province of China (no. 2015CFB257). Sherali Zeadally’s work has been supported by a University Research Professorship Award from the University of Kentucky.