Special Issue: Embedded Systems for Mobile Sensors
Random Secure Comparator Selection Based Privacy-Preserving MAX/MIN Query Processing in Two-Tiered Sensor Networks
Privacy-preserving data queries for wireless sensor networks (WSNs) have drawn much attention recently. This paper proposes RSCS-PMQ, a privacy-preserving MAX/MIN query processing approach for two-tiered sensor networks based on random secure comparator selection. The secret comparison model is built on the secure comparator, which is defined using 0-1 encoding and HMAC. We also propose MaxRSC, an algorithm that generates the minimal set of highest secure comparators and is the key to realizing RSCS-PMQ. In the data collection procedure, each sensor node encrypts its maximum data item into ciphertext, randomly selects one of the secure comparators generated for that item, and submits both to the nearby master node. In the query processing procedure, the master node uses the MaxRSC algorithm to determine the minimal set of candidate ciphertexts containing the query result and returns it to the base station, which obtains the plaintext query result through decryption. Theoretical analysis and experimental results indicate that RSCS-PMQ preserves the privacy of sensor data and query results against master nodes even if they are compromised, and that it incurs a lower network communication cost than existing approaches.
As wireless sensor networks (WSNs) have been widely used in a variety of important areas such as environment monitoring, medical care, national defense, and the military, data privacy problems are becoming more and more critical. For example, in rare animal monitoring, the locations of rare animals could be obtained for illegal hunting; in smart home applications, a household's water and electricity usage information could be stolen to plan a burglary. Therefore, privacy preservation has become a very important issue in WSNs.
Most large-scale WSNs are expected to apply a two-tiered architecture with resource-limited sensor nodes at the lower layer and resource-abundant master nodes at the upper layer. This architecture is used to construct the two-tiered wireless sensor networks (TWSNs) considered in this paper, as shown in Figure 1. The master nodes have abundant resources of energy, computation, communication, and so forth, while the sensor nodes only have limited resources. The sensor nodes are only responsible for collecting data and periodically submitting it to a nearby master node for storage; the master node responds to the query requests from the base station (BS) and returns the query results. Due to the simplicity of the topological structure, which contains multiple independent cells, and the resource abundance of master nodes, TWSNs have many advantages, such as stable link quality, simple route structure, and higher network scalability [1, 2].
However, because the master nodes are responsible not only for storing all the data from the sensor nodes but also for processing the query requests from BS, they are much more attractive and vulnerable to attackers in a hostile environment. Once a master node is compromised, the data privacy of the TWSN is seriously threatened: the attackers could use the compromised master node to obtain all the data collected by the sensor nodes as well as the query results. Thus, it is necessary to investigate the privacy-preserving problems in TWSNs and develop efficient and effective solutions.
MAX/MIN query is a useful data query method for obtaining the maximum or minimum data in the areas and epochs of interest. It can be utilized in event monitoring; for example, it can be applied to monitor the highest temperature in a warehouse so as to raise a fire alarm. The challenges in achieving privacy-preserving MAX/MIN query processing in TWSNs include the following:
(i) How to enable the master nodes to perform secure comparisons of data items without knowing their real values and then determine the maximum or minimum value, that is, the query result.
(ii) How to minimize the communication cost of the network, especially that of the sensor nodes, given their limited resources.
In this paper, we propose a privacy-preserving MAX/MIN query processing approach based on random secure comparator selection in TWSNs, which is denoted by RSCS-PMQ. The basic idea is as follows: once data collection is finished, the sensor nodes encrypt the collected data into ciphertext and select the corresponding random secure comparators, which are generated using 0-1 encoding and the hash-based message authentication code (HMAC). Then the ciphertext and the corresponding random secure comparators are submitted to the nearby master node. When the master node processes a query request from BS, it uses the MaxRSC algorithm to determine the minimal set of highest secure comparators, then determines the corresponding minimal set of candidate ciphertexts containing the query result, and returns it to BS. After decrypting the received ciphertext, BS obtains the query result in plaintext. Since the data storage and query response procedures in the master node do not involve the plaintext of collected data, adversaries cannot read any hosted data or query result from the master node even if it is compromised. Thus, the proposed RSCS-PMQ achieves privacy protection in MAX/MIN query processing. In addition, the evaluation indicates that RSCS-PMQ has a lower network communication cost than existing works.
The main contributions of this paper are as follows:
(i) We propose a comparison model based on random secure comparator selection through 0-1 encoding and the hash-based message authentication code, which supports data comparison without real values in master nodes.
(ii) We design MaxRSC, an algorithm that generates the minimal set of highest secure comparators based on the above comparison model and is the key algorithm of RSCS-PMQ.
(iii) We provide the concrete protocols of RSCS-PMQ, which consist of the data collection protocol and the query response protocol; the latter protects data privacy against master nodes even if they are compromised.
(iv) We analyze the privacy protection and communication cost of RSCS-PMQ and conduct a performance evaluation through comprehensive simulation.
The rest of this paper is organized as follows. Section 2 gives an overview of related works. Section 3 describes related models and problem statement. In Section 4, we present the privacy-preserving MAX/MIN query protocols based on random secure comparator selection. Then, Section 5 analyzes the privacy and communication costs of our approach. We evaluate the performance through simulation in Section 6 and conclude this paper in Section 7.
2. Related Works
Data query is an important operation for event monitoring or data analysis in TWSNs. Security issues such as privacy protection and integrity or completeness verification are hot spots in data query research. Recently, secure range query [5–11] and top-k query [12–17] have been broadly studied. However, there is limited research on MAX/MIN query, and only [18, 19] propose solutions for privacy-preserving MAX/MIN query in TWSNs.
Regarding the range query, a secure range query processing in TWSN is firstly proposed in , which employs bucket partition and symmetric encryption to achieve the privacy protection of collected data, and uses MAC to accomplish the completeness verification of query results. Based on , the spatiotemporal cross-check is introduced in [6, 7] to improve the efficiency of energy consumption. Furthermore, the spatiotemporal cross-check procedure of [6, 7] is optimized in , which balances the security and energy consumption and applies this method in multidimensional range query. A secure and energy-efficient range query processing protocol SafeQ is proposed in [9, 10], which is based on Prefix Membership Verification (PMV)  and neighborhood chain mechanism. In addition, the Bloom-Filter  is introduced to optimize the energy consumption. And a secure range query protocol based on order-preserving function and link watermarking QuerySec is proposed in , which is capable of saving energy during query processing. However, these secure range query methods are not suitable for solving the privacy-preserving MAX/MIN query.
For top-k query, the fine-grained verifiable top-k query methods are proposed in [12, 13], whereby the network owner can verify the completeness and authenticity of query results in TWSNs. The verification code which embeds ordered and adjacent relationships of the collected data by HMAC is utilized in  to achieve the verifiable top-k query processing. The symmetric encryption is applied in  to reduce the communication cost in verifiable top-k query processing. Works in [12–15] merely support the completeness and authenticity verification of query results, but they cannot achieve privacy protection. To preserve privacy, the privacy-preserving top-k query processing approaches based on Order Preserving Encryption  are proposed in [16, 17]. Though the top-k query can be transformed into MAX/MIN query when k = 1, it is wasteful in energy consumption. The reason is that each sensor node should submit all the data collected in every epoch since top-k is designed for obtaining the highest k data, where k is variable. In contrast, MAX/MIN query only has to submit the sole maximum or minimum value. Therefore, taking top-k query as MAX/MIN query will result in high unnecessary data communication. In conclusion, the secure top-k query methods are not suitable for solving the privacy-preserving MAX/MIN query.
For MAX/MIN query, the same PMV as in [9, 10], symmetric encryption, and HMAC are used in  to achieve privacy-preserving MAX/MIN query processing. Since more codes, generated by the PMV and HMAC functions, are transmitted in data submission, the energy consumption of  is high, which reduces the lifetime of the whole network. In contrast, our former work  adopts 0-1 encoding verification instead of PMV to achieve an energy-efficient privacy-preserving MAX/MIN query, which is denoted by EMQP. In EMQP, the codes generated by 0-1 encoding and HMAC for the same data are significantly fewer than those generated in , which reduces the energy consumption of sensor nodes. In this paper, we adopt random selection of codes on the basis of EMQP to save even more energy while still achieving privacy-preserving MAX/MIN query processing.
3. Models and Problem Statement
3.1. Network Models
We consider a similar TWSN model as in [5–17], as shown in Figure 1. The network is divided into multiple cells, each containing master node and several sensor nodes , which is named after . In particular, the master nodes are powerful devices, which have abundant resources of energy, storage, and computation. Additionally, they are also responsible for receiving and storing data collected by the sensor nodes and processing the query requests from BS, while the sensor nodes are cheap devices with limited resources. Each sensor node merely submits its collected data to the master node in the same cell. The master node can apply its long-distance and high-frequency communication capacity to communicate with the nearby nodes, which should then construct the upper-tier multihop networks. The query results will be returned from the queried master nodes to BS through the above networks. There is an on-demand wireless link (e.g., satellite) between the master nodes and BS to interact with each other. However, such wireless link is usually unstable and is of high consumption and low speed.
We assume that the time is divided into nonoverlapping epochs, and in every epoch , the sensor node collects sensor data . In TWSNs, BS owns the global network topological information, while a master node owns the network topological information of its located cell, and a sensor node only knows the locations of the master node in the same cell and 1-hop neighboring sensor nodes.
3.2. MAX/MIN Query Models
A MAX/MIN query in TWSN is a kind of query operation aimed at obtaining the maximum or minimum data among the data items collected in the specified epochs and area. Therefore, the following MAX/MIN query will be considered, which is denoted by a triple tuple:where refers to the query type, is the set of queried epoch numbers, and denotes the set of queried sensor nodes IDs which indicate a query region. For example, query is aimed at getting the maximum data collected by sensor nodes in the epoch .
For simplicity, we focus on the simple MAX query aimed at one cell and one epoch ; that is , where, . For other complicated queries covering multiple epochs and cells, it can be easily achieved by decomposing them into multiple simple ones. And we will conduct discussion in Section 4.5. Additionally, the MIN query is similar to the MAX query.
3.3. Problem Statement
In TWSNs, master nodes are highly vulnerable to attacks from adversaries, since they are responsible not only for storing all the data collected by the sensor nodes but also for processing the query requests from BS. If the collected data is in plaintext and a master node is compromised, any data stored in that master node and the query results it generates will be exposed to attackers, leading to privacy leakage. Therefore, it is necessary to take efficient and effective measures for privacy protection.
We adopt the same honest-but-curious threat model as in , where master nodes may try to breach privacy to steal sensitive data but faithfully follow the protocols while processing the query requests. BS and sensor nodes, in contrast to master nodes, are assumed to be trustworthy. Based on the above assumptions, the following conditions should be satisfied to achieve privacy-preserving MAX/MIN query processing:
(1) For the data collected by any sensor node in the network, only this sensor node and BS can obtain its real value; master nodes cannot.
(2) For the real value of query results, only BS can obtain it; master nodes cannot.
Moreover, master nodes have abundant energy resources, while the sensor nodes are energy-limited, so the lifetime of the whole network depends on the energy consumption of the sensor nodes. Most energy is consumed by communications according to . Therefore, the communication cost of sensor nodes is a key metric for evaluating the performance of query processing methods in TWSNs. We evaluate the in-cell communication cost and the query response communication cost in Section 6. The former represents the total cost in bits incurred by data transmissions between the sensor nodes and the master node per epoch, while the latter refers to the total information in bits transmitted between the master node and BS.
4. MAX/MIN Query Processing with Random Secure Comparator Selection
We use the 0-1 encoding verification , which can be utilized to compare data items without knowing their values. Let be a binary string with bits. The 0-encoding and 1-encoding of are denoted by and , respectively, where . For two data items and , if and only if ; otherwise . Obviously, if codes of and are of different types, they can be compared; otherwise they are incomparable.
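As a minimal sketch of 0-1 encoding, assume the standard construction: the 0-encoding collects, for every 0 bit, the prefix before that bit followed by a 1, and the 1-encoding collects every prefix that ends in a 1 bit. The function names and the parameter `w` (the bit-length of a data item) are illustrative and are not taken from the paper.

```python
def to_bits(x: int, w: int) -> str:
    """Return x as a w-bit binary string, most significant bit first."""
    if not 0 <= x < (1 << w):
        raise ValueError("x does not fit in w bits")
    return format(x, f"0{w}b")


def encode0(x: int, w: int) -> set[str]:
    """0-encoding: for each 0 bit, the prefix before it with a 1 appended."""
    bits = to_bits(x, w)
    return {bits[:i] + "1" for i, b in enumerate(bits) if b == "0"}


def encode1(x: int, w: int) -> set[str]:
    """1-encoding: every prefix that ends in a 1 bit."""
    bits = to_bits(x, w)
    return {bits[: i + 1] for i, b in enumerate(bits) if b == "1"}


def is_greater(x: int, y: int, w: int) -> bool:
    """x > y exactly when the 1-encoding of x meets the 0-encoding of y."""
    return bool(encode1(x, w) & encode0(y, w))
```

For example, with `w = 3`, the 1-encoding of 6 (`110`) is `{"1", "11"}` and the 0-encoding of 5 (`101`) is `{"11"}`; the shared element `"11"` witnesses 6 > 5 without comparing the values directly.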
In order to improve the efficiency of intersection computing, the numeralization functions are usually applied to convert the 0-1 encoding binary strings into numbers. Thus we adopt the same numeralization function as in , which satisfies the idea that, for any two 0-encoding or 1-encoding binary strings, and , if and only if . Additionally, we utilize HMAC to realize one-wayness and collision resistance of encoding data. We denote HMAC function by , where is the secret key of HMAC, which is only shared in sensor nodes.
4.1. Comparison Model Based on Random Secure Comparator Selection
Definition 1. For data , after applying 0-1 encoding, numeralization, and HMAC, the two generated code sets are denoted by the secure comparators of , where and are type 0 and type 1 secure comparators, respectively; that is, , .
According to the data comparison property of 0-1 encoding verification, we do not have to compare any two data items based on their values, but only the corresponding secure comparators. Thus, Lemma 2 is established.
Lemma 2. For data and , if , then ; otherwise .
Definition 3. The random secure comparator of data is denoted by , where is the random selection function of a set. Its value is denoted by ; that is, , and its type is denoted by , which is shown as follows:
Definition 4. For two secure comparators, and , if and only if , then is higher than , and we denote it by , which means ; if and only if , then is smaller than , and we denote it by , which means ; otherwise, and are incomparable.
According to Definition 4, two secure comparators are comparable only if they are of different types. Note that neither of the two relations is transitive.
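Definitions 1, 3, and 4 can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: HMAC-SHA256 stands in for the HMAC function, `numeralize` is a hypothetical numeralization that prepends a 1 bit so that distinct encoding strings map to distinct numbers, and both comparator types are assumed to be selected with equal probability.

```python
import hashlib
import hmac
import secrets


def numeralize(code: str) -> int:
    """Map an encoding string to a number; the leading 1 preserves its length."""
    return int("1" + code, 2)


def hmac_code(key: bytes, code: str) -> bytes:
    """HMAC-SHA256 over the numeralized encoding string."""
    n = numeralize(code)
    msg = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()


def secure_comparators(key: bytes, x: int, w: int):
    """Return (type-0 set, type-1 set) of HMAC codes for the w-bit value x."""
    bits = format(x, f"0{w}b")
    type0 = frozenset(hmac_code(key, bits[:i] + "1")
                      for i, b in enumerate(bits) if b == "0")
    type1 = frozenset(hmac_code(key, bits[: i + 1])
                      for i, b in enumerate(bits) if b == "1")
    return type0, type1


def random_secure_comparator(key: bytes, x: int, w: int):
    """Pick one of the two secure comparators uniformly at random."""
    ctype = secrets.randbelow(2)
    return ctype, secure_comparators(key, x, w)[ctype]


def relation(a, b):
    """Return 'higher', 'lower', or None (incomparable) for comparators a, b."""
    (ta, va), (tb, vb) = a, b
    if ta == 1 and tb == 0 and va & vb:
        return "higher"
    if ta == 0 and tb == 1 and va & vb:
        return "lower"
    return None
```

A comparator is represented here as a `(type, codes)` pair. Two comparators of the same type always yield `None`, which is why a chain such as "a is higher than b and b is higher than c" can never be formed.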
4.2. Algorithm for Generating Minimal Set of Highest Secure Comparators
In this section, we firstly give the definition of the minimal set of highest secure comparators, which is the theoretical basis to achieve RSCS-PMQ. Then, we provide the generation algorithm of this set and analyze the probability of the amount of its elements.
Definition 5. Assume that is the corresponding set of random secure comparators of . The minimal set of highest secure comparators is denoted by Ψ, where(1);(2), ;(3)
According to Definition 5, the secure comparators in are of the same type; therefore, they are incomparable. Moreover, for secure comparators and , if they are of different types, the former is obviously larger than the latter; if they are of the same type, there must exist another secure comparator with different type from and , which satisfies the idea that is larger than , while is larger than .
Lemma 6. Assume that is the minimal set of highest secure comparators of the data set ; then we have
Lemma 7. Given data set , its corresponding secure comparator set is ; then we have the probability of containing elements as follows:where and .
Proof. Assume that , and has secure comparators; then . According to Definition 5, all the secure comparators in are of the same type. And must be of different type from the ones in ; otherwise also belongs to , which contradicts to the given assumption. Therefore, the probability of having secure comparators is equivalent to the probability that are of the same type while differing from . Apparently, the probability of all being type 0 and being type 1 is . Similarly, the probability is also under the reverse circumstance. Thus, the probability of having secure comparators is .
The generation algorithm of the minimal set of highest secure comparators is given as Algorithm 2, denoted by MaxRSC.
As shown in algorithm MaxRSC, and are used to store the current sets of the highest and second highest secure comparators. The variable flag is to indicate the higher set between and , where indicates is higher; otherwise is higher. When the algorithm ends, the final minimal set of highest secure comparators is if ; otherwise it is . The algorithm is concise and direct, and its time complexity is only .
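The set that MaxRSC is meant to produce can be illustrated with a direct computation. The sketch below is not the MaxRSC procedure with its two working sets and flag; it is a simpler pairwise check, written under the assumption that a comparator belongs to the result exactly when no comparator of the other type exceeds it. It compares every type-1 comparator with every type-0 comparator, so it makes no claim about the complexity of MaxRSC. The helper `comparator` and the use of HMAC-SHA256 are illustrative choices.

```python
import hashlib
import hmac


def comparator(key: bytes, x: int, w: int, ctype: int):
    """Build the type-`ctype` secure comparator of the w-bit value x."""
    bits = format(x, f"0{w}b")
    if ctype == 0:
        codes = [bits[:i] + "1" for i, b in enumerate(bits) if b == "0"]
    else:
        codes = [bits[: i + 1] for i, b in enumerate(bits) if b == "1"]
    macs = frozenset(hmac.new(key, c.encode(), hashlib.sha256).digest()
                     for c in codes)
    return ctype, macs


def highest_set(comparators):
    """Return the comparators that no comparator of the other type exceeds."""
    ones = [c for c in comparators if c[0] == 1]
    zeros = [c for c in comparators if c[0] == 0]
    if not ones or not zeros:
        return list(comparators)  # a single type: nothing can be ruled out
    # A type-1 comparator stays if it is higher than every type-0 comparator.
    top_ones = [a for a in ones if all(a[1] & b[1] for b in zeros)]
    if top_ones:
        return top_ones
    # Otherwise keep the type-0 comparators that no type-1 comparator exceeds.
    return [b for b in zeros if not any(a[1] & b[1] for a in ones)]
```

For the values 9, 14, 3, 12 with types 1, 0, 1, 0, the result holds the comparators of 14 and 12: both are of type 0 and neither is exceeded by a type-1 comparator, while 9 and 3 are ruled out.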
4.3. Data Collection Protocol
The data collection protocol is concerned with how a sensor node transmits its collected data items to . For each sensor node , after collecting data items in epoch , it performs the following procedure:(i)Determine the maximum data among the collected data items; that is, .(ii)Compute the random secure comparator , and set its type according to the random selection.(iii)Encrypt by using the key shared with BS. The output ciphertext is denoted by .(iv)Submit the following message to , where is the ID of : (v)Once receives the above message from , it will store the data of the message.
As shown in the above protocol, since the HMAC function is one-way and collision-resistant and sensor nodes share the encryption and HMAC keys only with BS, it is computationally infeasible for the master node to recover the exact values of the collected data. Therefore, the data collection protocol preserves data privacy against the master node.
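The sensor-side steps of the data collection protocol can be sketched as below. Everything named here is an assumption made for illustration: the message is modelled as a dictionary, HMAC-SHA256 is used for the codes, and `toy_cipher` is a deliberately simple XOR keystream that only stands in for a real symmetric cipher and must not be used in practice.

```python
import hashlib
import hmac
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Illustration only, not a vetted cipher.

    Applying the function twice with the same key and nonce restores the data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def random_comparator(mac_key: bytes, x: int, w: int):
    """Randomly pick type 0 or 1 and return (type, HMAC codes) for x."""
    bits = format(x, f"0{w}b")
    ctype = secrets.randbelow(2)
    if ctype == 0:
        codes = [bits[:i] + "1" for i, b in enumerate(bits) if b == "0"]
    else:
        codes = [bits[: i + 1] for i, b in enumerate(bits) if b == "1"]
    macs = frozenset(hmac.new(mac_key, c.encode(), hashlib.sha256).digest()
                     for c in codes)
    return ctype, macs


def build_message(node_id: int, epoch: int, readings, enc_key: bytes,
                  mac_key: bytes, w: int) -> dict:
    """Assemble what a sensor node submits to its master node for one epoch."""
    d_max = max(readings)                                  # step (i)
    ctype, macs = random_comparator(mac_key, d_max, w)     # step (ii)
    nonce = node_id.to_bytes(4, "big") + epoch.to_bytes(4, "big")
    plaintext = d_max.to_bytes((w + 7) // 8, "big")
    ciphertext = toy_cipher(enc_key, nonce, plaintext)     # step (iii)
    return {"node_id": node_id, "epoch": epoch, "ciphertext": ciphertext,
            "type": ctype, "codes": macs}                  # step (iv)
```

The message carries only the node ID, the epoch number, one ciphertext, and the codes of one comparator type, so the master node never sees the maximum value itself.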
4.4. Query Response Protocol
The query response protocol is concerned with how cooperates with BS to accomplish the query requests from users. The main idea is that uses the MaxRSC algorithm to generate the minimal set of highest secure comparators on the basis of submitted random secure comparators of the queried sensor nodes and further determines the corresponding minimal set of candidate ciphertexts which is denoted by . Then, returns it to BS. And the final query result will be determined after BS performs decryption on . The concrete steps of query response protocol are as follows:(i)BS transmits the query request to , where .(ii)Once receives the query request, it firstly loads the ciphertext and the corresponding random secure comparator received from each sensor node in epoch . Assume that the loaded random secure comparators are . With them as the input, then generates the minimal set of highest secure comparators by using the MaxRSC algorithm, and the corresponding minimal set of candidate ciphertexts is determined, where(iii)Finally, constructs the following message and sends it to BS: (iv)When BS receives the response message from , it uses the key shared with the sensor nodes to decrypt the ciphertext in , and then the final query result will be determined.
Similar to the data collection protocol, it is also computationally infeasible for the master node to obtain the real values of the data or the query result in the query response protocol. Therefore, this protocol also preserves data privacy against the master node.
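An end-to-end sketch of the query response protocol for one cell and one epoch follows. It is written under several assumptions: 8-bit data items, HMAC-SHA256 codes, a toy XOR cipher in place of real encryption, and a pairwise selection of the highest comparators in place of the MaxRSC procedure. The record layout and function names are hypothetical.

```python
import hashlib
import hmac
import secrets

W = 8  # assumed bit-length of a data item


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy XOR cipher (illustration only); applying it twice decrypts."""
    pad = hashlib.sha256(key + nonce).digest()
    return bytes(d ^ p for d, p in zip(data, pad))


def sensor_record(node_id, epoch, value, enc_key, mac_key):
    """What one sensor node submits: ciphertext plus a random comparator."""
    bits = format(value, f"0{W}b")
    ctype = secrets.randbelow(2)
    if ctype == 0:
        codes = [bits[:i] + "1" for i, b in enumerate(bits) if b == "0"]
    else:
        codes = [bits[: i + 1] for i, b in enumerate(bits) if b == "1"]
    macs = frozenset(hmac.new(mac_key, c.encode(), hashlib.sha256).digest()
                     for c in codes)
    nonce = f"{node_id}:{epoch}".encode()
    ciphertext = keystream_xor(enc_key, nonce, value.to_bytes(W // 8, "big"))
    return {"node_id": node_id, "type": ctype, "codes": macs,
            "ciphertext": ciphertext}


def master_candidates(records):
    """Master-node side: uses only comparator types and HMAC codes."""
    ones = [r for r in records if r["type"] == 1]
    zeros = [r for r in records if r["type"] == 0]
    if not ones or not zeros:
        top = records
    else:
        top = [a for a in ones if all(a["codes"] & b["codes"] for b in zeros)]
        if not top:
            top = [b for b in zeros
                   if not any(a["codes"] & b["codes"] for a in ones)]
    return [(r["node_id"], r["ciphertext"]) for r in top]


def bs_result(candidates, epoch, enc_key):
    """Base-station side: decrypt every candidate and keep the largest value."""
    values = [
        int.from_bytes(keystream_xor(enc_key, f"{nid}:{epoch}".encode(), ct), "big")
        for nid, ct in candidates
    ]
    return max(values)
```

Only `bs_result` touches the encryption key; `master_candidates` works on types and code sets alone, which mirrors the division of knowledge between the master node and BS.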
Lemma 8. For the determined minimal set of highest secure comparators and the corresponding minimal set of candidate ciphertexts in the query response protocol, we have the following:(1), where means the number of elements in the set .(2)The query result of must be embedded in the ciphertext of .
Proof. According to the construction of in (6), we can easily have , and the values of secure comparators in are all embedded in the ciphertext of . And Lemma 6 indicates that the maximum data among the data items collected by the queried sensor nodes (i.e., query result) must exist in the corresponding data set of . Therefore, the query result must be embedded in the ciphertext of as well.
Lemma 9. Assume that there are elements in ; then we have its probability as follows:
Lemma 10. The mean quantity of elements in is the mathematical expectation of the ciphertext quantity in , whereand when is very large.
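The size distribution of the candidate set can be checked numerically. The sketch assumes that each of n sensor nodes picks its comparator type independently with probability 1/2 and that all data values are distinct; under these assumptions, following the argument in the proof of Lemma 7, the set has k elements with probability 1/2^k for k < n and 1/2^(n-1) for k = n, and its mean is 2 - 1/2^(n-1), which approaches 2 for large n. The closed forms are derived here from those assumptions and are cross-checked by enumerating every type assignment.

```python
from fractions import Fraction
from itertools import product


def size_pmf(n: int) -> dict:
    """Closed-form probability that the highest set has k elements (k = 1..n)."""
    pmf = {k: Fraction(1, 2 ** k) for k in range(1, n)}
    pmf[n] = Fraction(1, 2 ** (n - 1))
    return pmf


def expected_size(n: int) -> Fraction:
    """Mean number of elements implied by size_pmf."""
    return sum(k * p for k, p in size_pmf(n).items())


def brute_force_pmf(n: int) -> dict:
    """Enumerate all 2^n type assignments, listed in descending data order."""
    counts = {k: 0 for k in range(1, n + 1)}
    for types in product((0, 1), repeat=n):
        k = 1
        # The set is the leading run of comparators sharing the top one's type.
        while k < n and types[k] == types[0]:
            k += 1
        counts[k] += 1
    return {k: Fraction(c, 2 ** n) for k, c in counts.items()}
```

With exact fractions, `expected_size(10)` evaluates to `Fraction(1023, 512)`, already within 0.002 of 2.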
4.5. Complicated Query Processing Method
If complicated query involving multiple cells and epochs is applied, we can achieve it based on the basic ideas of the data collection and query response protocols. We give the overview of the complicated query processing method through an example.
As shown in Figure 2, there are four master nodes, A, B, C, and D, and BS composing the upper-tier tree-routing networks. Assume that the current query involves A, B, C, and D and several epochs. The main idea of processing is as follows. Firstly, A, B, C, and D use MaxRSC algorithm to determine their own minimal sets of highest secure comparators and the corresponding minimal set of candidate ciphertexts, which can be denoted by four pairs: , , , and , respectively. Then, A, B, and C submit , , and to D on their own. And D takes , , , and its own as inputs into MaxRSC algorithm to determine the global minimal set of highest secure comparators and the global corresponding minimal set of candidate ciphertexts . Obviously, the global query result is embedded in , and then D submits to BS. Consequently, BS decrypts the ciphertext in and gets the final query result of the complicated query .
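The two-stage processing in this example can be simulated with a small sketch. To keep it short, comparators are modelled as `(type, value, cell)` tuples and the plain values are compared directly, but only across different types, which mirrors what secure comparators allow; a real master node would intersect HMAC code sets instead. The function names are illustrative, and the sketch checks only that the global maximum survives the merge.

```python
def highest_set(items):
    """items: (type, value, cell) tuples; only cross-type comparisons are used."""
    ones = [i for i in items if i[0] == 1]
    zeros = [i for i in items if i[0] == 0]
    if not ones or not zeros:
        return list(items)
    # Type-1 items that exceed every type-0 item.
    top = [a for a in ones if all(a[1] > b[1] for b in zeros)]
    # Otherwise the type-0 items that no type-1 item exceeds.
    return top or [b for b in zeros if not any(a[1] > b[1] for a in ones)]


def hierarchical_query(cells):
    """cells maps a cell name to its items; returns the merged candidate set."""
    merged = []
    for items in cells.values():
        merged.extend(highest_set(items))  # local step at each master node
    return highest_set(merged)             # merge step at the collecting node
```

Because each local set keeps an item carrying its cell's maximum, the merged input always contains the global maximum, and the final selection keeps it.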
5. Protocol Analysis
5.1. Privacy-Preserving Analysis
(1) Collected data privacy preservation: on the premise that BS and sensor nodes are trustworthy, the privacy of the data collected by the sensor nodes is preserved against the master node as long as the master node cannot obtain the real value of any collected data. According to the data collection protocol, the data submitted by sensor nodes and stored in the master node are ciphertexts and HMAC codes rather than plaintext. Since the HMAC algorithm is one-way and collision-resistant and the encryption and HMAC keys are shared only by the sensor nodes and BS, given a random secure comparator and the corresponding ciphertext, it is computationally infeasible for the master node to obtain the value of the collected data. For the master node, breaching privacy is as hard as breaking the HMAC and the encryption. Thus, the proposed RSCS-PMQ protects the privacy of the collected data against master nodes.
(2) Query result privacy preservation: as shown in the query response protocol, the master node cooperates with BS to achieve MAX/MIN query processing. During the procedure, the master node takes the secure comparators as inputs to the MaxRSC algorithm, determines the minimal set of candidate ciphertexts that contains the encrypted query result, and transmits it to BS. BS then decrypts the received ciphertext and obtains the plaintext query result. The master node therefore cannot access any plaintext query result without breaking the encryption or the HMAC. Hence, RSCS-PMQ protects the privacy of query results against the master nodes.
5.2. Communication Cost Analysis
To analyze the communication costs of data collection and query response protocols, we present the parameters as follows: : the number of sensor nodes. : the bit-length of a sensor node ID. : the bit-length of an epoch. : the bit-length of an encrypted data item. : the bit-length of a HMAC data item. : the bit-length of a query request. : the bit-length of a collected data item. : the average hops from a sensor node to .
According to the 0-1 encoding properties, there are type 0 and type 1 secure comparators for every bits data. Thus, the random secure comparator of a -bits data item contains HMAC data on average.
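The counting argument above can be verified with a short sketch. It assumes that every 0 bit contributes one code to the type-0 comparator, every 1 bit contributes one code to the type-1 comparator, and either type is selected with probability 1/2; the function names and the parameter `w` for the bit-length are illustrative.

```python
from fractions import Fraction


def comparator_sizes(x: int, w: int):
    """Return (number of type-0 codes, number of type-1 codes) for w-bit x."""
    bits = format(x, f"0{w}b")
    return bits.count("0"), bits.count("1")


def expected_codes(x: int, w: int) -> Fraction:
    """Expected number of HMAC codes when either type is picked with prob. 1/2."""
    n0, n1 = comparator_sizes(x, w)
    return Fraction(n0 + n1, 2)
```

The two comparators of a `w`-bit item always hold `w` codes in total, so a randomly selected one holds `w / 2` codes on average regardless of the value; for 8-bit data that is 4 HMAC codes per submission instead of 8.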
As shown in the data collection protocol, each sensor node submits a node ID, an epoch number, a ciphertext, and a secure comparator to . Therefore, the communication cost of data collection in the cell, denoted by in-cell communication cost , can be calculated with
As shown in the query response protocol, the communication cost for executing a query consists of two parts: one part is the communication cost of BS for sending query requests to and the other part is of for returning the feedback messages to BS. Additionally, Lemma 10 indicates that the mean quantity of ciphertext returned to BS is the mathematical expectation of the ciphertext quantity in . As a result, the calculation of query response communication cost is as shown which is denoted by :According to Lemma 10, we havewhen is very large.
Then, we have the total communication cost as follows:
5.3. Computation Cost Analysis
We analyze the computation cost of the proposed RSCS-PMQ and compare it with two other privacy-preserving MAX/MIN query methods, PMV-PMQ and EMQP. First, since all three methods run encryption and HMAC on the sensor nodes, the computation cost of sensor nodes is mainly caused by encryption and HMAC. Second, the master node determines the encrypted query results according to the intersections of paired code sets, and many comparison operations are needed to find out whether the intersection of two sets is empty. Accordingly, Table 1 gives the computation cost of the three methods in two respects: the number of encryption and HMAC operations of a sensor node in an epoch and the number of comparison operations of the master node in a query.
As shown in Table 1, PMV-PMQ, EMQP, and RSCS-PMQ perform the same number of encryption operations in a sensor node, but RSCS-PMQ performs fewer HMAC operations than the other two methods. RSCS-PMQ and EMQP perform a similar number of comparison operations, and both generally perform better than PMV-PMQ. Therefore, the proposed RSCS-PMQ is more efficient in computation cost than PMV-PMQ and EMQP.
We will not discuss the robustness of our method since it is not the focus of this paper. And we assume that the robustness is supported by the low-layer protocols.
6. Performance Evaluations
To analyze and compare the performance of protocols, we implement the proposed RSCS-PMQ, PMV-PMQ, and EMQP on the improved simulator of . According to the experimental results of , we know that the energy consumed by data communication is much larger than that by the computation of encryption and HMAC. Therefore, this paper will focus on the evaluation of communication cost. We perform the evaluations on the following two aspects:(1)We firstly measure and analyze the in-cell communication cost () of these three methods. Since the amount of codes for each collected data item constructed in PMV-PMQ is within a certain range, we consider the upper and lower bounds of PMV-PMQ in our evaluations, respectively, that is, the highest and lowest in-cell communication costs, which are denoted by PMV-PMQ-T and PMV-PMQ-B, respectively. Additionally, since the hash-based optimization in EMQP is also suitable for RSCS-PMQ and PMV-PMQ, which is aimed at reducing the length of HMAC data, this paper only compares the of three methods without the hash-based optimization.(2)To evaluate the query response communication cost () generated by and BS, we firstly measure the probability of containing ciphertext and the average quantity of ciphertext in in the RSCS-PMQ method. Then, we measure of the three methods and calculate their proportions in the whole network communication costs while processing the MAX query.
The evaluations are performed on a PC with an Intel Core i5-3230M (quad-core 2.6 GHz) CPU and 8 GB of memory, running the Windows 7 operating system, Eclipse, and Matlab. The experimental data set is randomly generated. In this simulation, the sensor nodes are assumed to be uniformly distributed in a cell covering a 100 m × 100 m area, and the communication radius of a sensor is 20 m. The default settings of the other parameters are shown in Table 2.
6.1. In-Cell Communication Cost Evaluations
In each measurement, we randomly distribute the sensor nodes and generate 10 networks with different topologies, identified by different network IDs. We then determine the communication cost of a MAX/MIN query by averaging the communication cost over these 10 networks.
(1) In-cell communication cost versus network ID: Figure 3 shows that the in-cell communication costs of RSCS-PMQ, EMQP, and PMV-PMQ are all roughly uniform across networks with different topologies. PMV-PMQ has the highest in-cell communication cost, EMQP is intermediate, and RSCS-PMQ has the lowest. Under the experiment setting, the in-cell communication cost of RSCS-PMQ is 32.23% lower than the lower bound of PMV-PMQ and 24.54% lower than the lower bound of EMQP, since the amount of HMAC data used for secure comparison that the sensor nodes submit to the master node is smaller in RSCS-PMQ than in the other methods.
(2) In-cell communication cost versus the number of sensor nodes and the bit-length of a collected data item: as shown in Figure 4, when the number of sensor nodes increases, the in-cell communication costs of RSCS-PMQ, EMQP, and PMV-PMQ also increase, since the amounts of ciphertext and HMAC data transmitted in the network both increase. Figure 5 shows that the in-cell communication costs of the three methods also increase as the bit-length of a collected data item increases, because the amount of HMAC data used for secure comparison is proportional to that bit-length. In addition, Figures 4 and 5 indicate that the in-cell communication costs of the three methods grow linearly with both parameters, which is consistent with the theoretical analysis result shown in (10). Moreover, the in-cell communication cost of RSCS-PMQ is significantly lower than that of EMQP and PMV-PMQ: it is about 30% lower than the lower bound of PMV-PMQ and about 25% lower than the lower bound of EMQP.
(3) In-cell communication cost versus the bit-lengths of an encrypted data item and of an HMAC data item: we adopt different encryption and HMAC algorithms to set these two lengths. For example, the bit-length of an encrypted data item could be 64, 128, and 256 bits if DES, IDEA, and AES-256 are adopted, respectively, while the bit-length of an HMAC data item could be 128, 160, and 256 bits if HMAC-MD5, HMAC-SHA1, and HMAC-SHA256 are adopted, respectively.
Figure 6 shows that the of RSCS-PMQ, EPRQ, and PMV-PMQ have slow and unapparent increase as increases, while they increase obviously as increases. The reason is that there is only one encrypted data item in the message submitted from each sensor node to , but the amount of HMAC data is in proportion to the length of collected data , which is obviously bigger than the former one. And the increasing of has a more obvious influence on . Similar to the results of evaluations and in this section, Figures 6 and 7 indicate that RSCS-PMQ is significantly lower than EPRQ and PMV-PMQ in , and the former one is about 30% lower than the lower bound of PMV-PMQ and about 25% lower than the lower bound of EPRQ.
6.2. Query Response Communication Cost Evaluations
Assume that the sensor nodes collect data in 10000 epochs and transmit the corresponding ciphertext and HMAC data to the master node, and that the master node executes 10000 MAX queries, one for each of these epochs. We measure the probability distribution and the mean value of the number of ciphertexts in the result set returned from the master node to the BS. We repeat the experimental process 10 times and obtain the results shown in Figures 8 and 9.
From Figure 8, we can see that the measured probability of the result set containing a given number of ciphertexts closely matches the theoretical probability computed with (8) in Lemma 9, which supports the correctness of Lemma 9 from the standpoint of experimental statistics. Additionally, based on a large number of experimental samples, Figure 9 indicates that the mean number of ciphertexts in the result set agrees with the mathematical expectation computed with (9) in Lemma 10, and it approaches 2 as the number of test samples becomes very large. This result supports the correctness of Lemma 10 from the standpoint of experimental statistics.
Based on the 10 groups of data transmitted from the sensor nodes under the 10 networks with random topologies in Section 6.1, we process 10 MAX queries, one per network. We measure the query response communication costs and their average proportion of the total network communication costs for PMV-PMQ, EPRQ, and RSCS-PMQ. The experimental results are shown in Figures 10 and 11.
Figure 10 indicates that the query response communication costs of EPRQ and PMV-PMQ are constant and equal, while that of RSCS-PMQ is about 20% higher than the former two methods. The reason is as follows: in EPRQ and PMV-PMQ the master node determines a single ciphertext as the query result, while in RSCS-PMQ the result returned by the master node is a set containing multiple candidate ciphertexts. The probability statistics of the number of ciphertexts in this set are shown in Figure 8, and its mean size is about 2 according to Figure 9.
However, as shown in Figure 11, for PMV-PMQ, EPRQ, and RSCS-PMQ, where the total network communication cost is the sum of the in-cell and query response communication costs, the mean query response cost is significantly smaller than the mean in-cell cost and accounts for only a very small proportion of the total: 0.22%, 0.24%, and 0.38% on average, respectively. Here, the in-cell communication cost of PMV-PMQ is taken at its lower bound. In addition, the query response cost is incurred by the resource-abundant master nodes and the BS. As a result, the query response cost has little impact on the total cost, which is instead determined mainly by the in-cell cost, and the total cost of RSCS-PMQ is lower than that of PMV-PMQ and EPRQ.
From the above experimental results and analyses, we can conclude the following. Compared with the existing EPRQ and PMV-PMQ, the in-cell communication cost of RSCS-PMQ is lower: about 30% lower than the lower bound of PMV-PMQ and about 25% lower than the lower bound of EPRQ. Additionally, although the query response communication cost of RSCS-PMQ is higher than that of EPRQ and PMV-PMQ, it accounts for only a very small proportion of the total network communication cost, lower than 1%, and the same holds for the latter two methods. The total communication cost of RSCS-PMQ is therefore lower than that of EPRQ and PMV-PMQ. Thus, the RSCS-PMQ proposed in this paper performs better than the existing works.
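As a rough consistency check of these figures, the short sketch below recombines the rounded percentages reported in this section. It is not a reproduction of the experiment: the total cost of PMV-PMQ is normalized to 1, its query response share is taken as 0.22% of that total, and the in-cell reductions (30% and 25%) and the response-cost increase (20%) are the approximate values quoted in the text. All variable names are introduced here for illustration.

```python
# Normalize the total communication cost of PMV-PMQ to 1 (illustrative units).
pmv_resp = 0.0022            # assumed: response cost is 0.22% of the PMV-PMQ total
pmv_in = 1.0 - pmv_resp      # remainder is in-cell cost

# Rounded relations quoted in the text.
rscs_in = 0.70 * pmv_in      # RSCS-PMQ in-cell cost: about 30% lower than PMV-PMQ
eprq_in = rscs_in / 0.75     # RSCS-PMQ in-cell cost: about 25% lower than EPRQ
eprq_resp = pmv_resp         # EPRQ and PMV-PMQ response costs are equal
rscs_resp = 1.20 * pmv_resp  # RSCS-PMQ response cost: about 20% higher

def response_share(in_cell, resp):
    """Fraction of the total cost that is due to the query response."""
    return resp / (in_cell + resp)

pmv_total = pmv_in + pmv_resp
eprq_total = eprq_in + eprq_resp
rscs_total = rscs_in + rscs_resp

eprq_share = response_share(eprq_in, eprq_resp)
rscs_share = response_share(rscs_in, rscs_resp)

print(f"EPRQ response share:     {eprq_share:.2%}")
print(f"RSCS-PMQ response share: {rscs_share:.2%}")
print(f"RSCS-PMQ total relative to PMV-PMQ: {rscs_total / pmv_total:.3f}")
```

Under these assumptions the derived response shares come out at roughly 0.24% for EPRQ and 0.38% for RSCS-PMQ, in line with the proportions reported for Figure 11, and the total cost of RSCS-PMQ stays below that of both other methods despite its larger response set.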
In this paper, we propose a novel random secure comparator selection scheme and an algorithm for generating the minimal set of highest secure comparators to achieve privacy-preserving MAX/MIN queries in two-tiered wireless sensor networks. Our technique prevents a compromised master node from peeking at the hosted data while keeping the network communication cost of queries low. Moreover, the efficacy and efficiency of our method are confirmed through detailed evaluations and analysis. In future work, we will focus on verifying the completeness of query results and will further develop the key technique of this paper to support other types of data queries.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This research was supported by the National Natural Science Foundation of China under the Grants nos. 61300240, 61402014, 61572263, 61502251, 61472193, 61302157, 61373138, 61201163, and 61272084, the Natural Science Foundation of Jiangsu Province under the Grants nos. BK20151511 and BK20141429, the Project of Natural Science Research of Jiangsu University under Grants nos. 11KJA520002 and 14KJB520027, the Postdoctoral Science Foundation of China under the Grant no. 2013M541703, the Postdoctoral Science Foundation of Jiangsu Province under the Grant no. 1301042B, and Scientific & Technological Support Project (Society Development) of Lianyungang under the grant no. SH1306.
Y. Diao, D. Ganesan, G. Mathur, and P. Shenoy, “Rethinking data management for storage-centric sensor networks,” in Proceedings of the 3rd Biennial Conference on Innovative Data Systems Research (CIDR '07), pp. 22–31, Asilomar, Calif, USA, January 2007.
H.-Y. Lin and W.-G. Tzeng, “An efficient solution to the millionaires’ problem based on homomorphic encryption,” in Applied Cryptography and Network Security: Third International Conference, ACNS 2005, New York, NY, USA, June 7–10, 2005. Proceedings, vol. 3531 of Lecture Notes in Computer Science, pp. 456–466, Springer, Berlin, Germany, 2005.
H. Krawczyk, R. Canetti, and M. Bellare, “HMAC: keyed-hashing for message authentication,” Tech. Rep. RFC 2104, Internet Society, Reston, Va, USA, 1997.
B. Sheng and Q. Li, “Verifiable privacy-preserving range query in two-tiered sensor networks,” in Proceedings of the 27th IEEE International Conference on Computer Communications (INFOCOM '08), pp. 46–50, Phoenix, Ariz, USA, 2008.
Y. Yi, R. Li, F. Chen, A. X. Liu, and Y. Lin, “A digital watermarking approach to secure and precise range query processing in sensor networks,” in Proceedings of the 32nd IEEE Conference on Computer Communications (INFOCOM '13), pp. 1950–1958, IEEE, Turin, Italy, April 2013.
R. Zhang, J. Shi, Y. Liu, and Y. Zhang, “Verifiable fine-grained top-k queries in tiered sensor networks,” in Proceedings of 29th IEEE International Conference on Computer Communications (INFOCOM '10), pp. 1199–1207, IEEE, San Diego, Calif, USA, March 2010.
H. Dai, G. Yang, H. P. Huang et al., “Efficient verifiable top-k queries in two-tiered wireless sensor networks,” KSII Transactions on Internet and Information Systems, vol. 9, no. 6, pp. 2111–2131, 2015.
R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu, “Order preserving encryption for numeric data,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '04), pp. 563–574, Paris, France, June 2004.
A. Coman, M. A. Nascimento, and J. Sander, “A framework for spatio-temporal query processing over wireless sensor networks,” in 1st International Workshop on Data Management for Sensor Networks, DMSN '04, in Conjunction with VLDB 2004, pp. 104–110, Canada, August 2004.