Abstract

A distributed sensor network (DSN) can be deployed to collect information for military or civilian applications. However, characteristics of DSNs such as limited power make key distribution complex. In this paper, a neighbor-based path key establishment method and a seed-based algorithm are put forward to improve the original random key predistribution scheme. The new scheme is portable because it is independent of the routing protocol. Moreover, the connectivity of the entire network approaches 1. In particular, the new scheme can maintain high connectivity by adding a small amount of redundancy to the parameter values when the number of neighbors drops because of node dormancy or death. The resilience against node capture in our scheme is not lower than that in the -path scheme and the basic schemes when the number of hops in a path is larger than 5, and the simulation results show that the efficiency of our scheme is also slightly higher.

1. Introduction

Many secure key distribution techniques, for instance, public-key cryptography, cannot be applied in wireless sensor networks (WSNs) because of characteristics of WSNs such as large network scale, lack of trusted infrastructure, and physical constraints on energy and memory. A naive solution is to use a single master key in all communications, which is too vulnerable to the node capture attack. Matsumoto and Imai [1] defined key predistribution (KPD), which is the distribution of keys onto nodes before deployment. The nodes then build up the network using their secret keys after deployment, that is, when they reach their target position. Blom [2] (see also [3]) presented -secure key predistribution schemes. The tradeoff is that, unlike the pairwise-key scheme, those schemes are no longer perfect against node capture. Instead, the -secure KPD schemes [2–7] have perfect security when the number of captured nodes is less than . However, once more than nodes are controlled by a specific attacker, all keys in the entire network are compromised. Deployment-knowledge-based KPD schemes [8–10] utilize deployment knowledge to improve the connectivity and the resilience against node capture. However, these solutions are not quite viable since the location of each node may be unknown before deployment. The local-center-based scheme [11–13] is another type of general KPD technique, which assumes that the network contains a local trusted infrastructure. Some trusted and powerful nodes distribute keys to sensor nodes around them. However, this is not feasible in a randomly deployed, dynamic network. In WSNs, random key predistribution (RKPD) schemes [14–21] are widely accepted because of their simplicity, low overhead, scalability, and high global connectivity.

1.1. Related Work

In 2002, Eschenauer and Gligor [14] proposed a random key predistribution (RKPD) scheme, often called the basic scheme, using giant component theory. The basic scheme includes three phases: (1) key predistribution, (2) shared key discovery, and (3) path key establishment. The first phase is executed before the nodes are deployed. A controller generates a large key pool, and each node draws keys (to form a key ring) out of the pool without replacement. The second phase may occur during neighbor discovery. Two neighboring nodes (two nodes connected physically) try to find a common key identifier in their key rings. If such a key identifier is found, these nodes are logically connected, and the key corresponding to the common key identifier is called a shared key. Then, each of the two nodes adds a record containing the other party’s identifier and the shared key’s identifier to a list, called a neighbor list. The last phase occurs just before information is transmitted. A pair of nodes and (including nodes which are only physically connected with ) try to find a path , where represents that sensor nodes and are logically connected neighbors. Afterwards, and can distribute a key on the path, and the key is called a path key. The basic scheme can be parameterized so that the probability that the entire network is connected approaches 1.
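The first two phases of the basic scheme can be illustrated with a minimal Python sketch. The pool size, ring size, and function names below are illustrative assumptions, not values or interfaces taken from [14].

```python
import random


def draw_key_ring(pool_size, ring_size, rng):
    """Phase 1: draw ring_size distinct key identifiers without replacement."""
    return set(rng.sample(range(pool_size), ring_size))


def shared_key_ids(ring_a, ring_b):
    """Phase 2: key identifiers that appear in both key rings."""
    return sorted(ring_a & ring_b)


rng = random.Random(2002)              # fixed seed so the run is repeatable
ring_u = draw_key_ring(1000, 50, rng)  # assumed pool of 1000 keys, rings of 50
ring_v = draw_key_ring(1000, 50, rng)

common = shared_key_ids(ring_u, ring_v)
# Two physical neighbors are logically connected if they share at least one key.
logically_connected = bool(common)
```

If `common` is non-empty, either node could record the other party's identifier together with one of these key identifiers in its neighbor list.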

Later, Chan et al. [15] gave a q-composite scheme in which any two logically connected nodes need to share q keys instead of only one to establish a path key. An -path scheme was also presented in [15], in which the path key is broken up into nuggets, and these nuggets are passed to along disjoint paths. However, the problem of discovering multiple disjoint paths is computationally hard, and this process may incur too much overhead. Zhu et al. [18] relaxed the requirement so that a single physical path can be used as long as the nuggets of the path key are transmitted through multiple logically disjoint paths. However, discovering multiple logically disjoint paths is also expensive.

Gu et al. [20] pointed out that the performance of the basic scheme is satisfactory only in highly dense sensor networks, where the average number of physical neighbors per node is more than 20. They proposed a methodology called network decoupling and designed a new protocol to establish the path key between physical neighbors with the help of a proxy node. Li et al. [21] also proposed a multihop-proxy random key predistribution scheme. In their schemes, each node constructs a local logical graph and a local physical graph before the path key establishment, which incurs a large amount of storage, computation, and communication overhead. In addition, the physical graph of a WSN changes all the time as nodes move or die. Thus, if the local graph is not updated in time, the path key establishment will fail.

1.2. Our Contributions

Generally, a key establishment scheme should be independent of the routing protocol. However, the path in many RKPD schemes, such as the basic scheme, is set up through a routing protocol. That is, a customized routing protocol needs to be used along with those key distribution schemes, which seriously affects their portability. Moreover, the number of hops of the path may increase because the adjacent nodes in the path must be logically connected (see Section 6). In fact, the total computation cost of deciding whether a shared key exists in the key rings of two adjacent nodes is also very large because the routing process may involve many nodes (see Section 6).

A new method to establish the path key with the help of neighbors is presented in this paper. The basic steps of Phase 3 are changed, and a path whose adjacent nodes only need to be physically connected can be obtained with the original routing protocols [22]. To reduce the communication overhead, a seed-based method is used to choose the key ring for each node from the node’s identifier. Although the seed-based method has been mentioned in [18, 19], we are the first to construct a deterministic algorithm G, with which the number of times each key is selected by nodes approximates the average value. In addition, the connectivity of the new scheme is also higher if suitable parameters are chosen. Specifically, when the number of neighbors is less than the predefined out-degree because of node dormancy or death, the probability that the entire network is connected still approaches 1 if a small amount of redundancy is added to the parameters. The probability that a link is compromised when nodes are captured in our scheme is equal to that of the basic scheme, and it is higher than that of the q-composite scheme and that of the -path scheme. However, the probability that a path key is compromised in the establishment phase is lower than that of -path schemes when the number of hops is more than . Our scheme is less competitive in terms of the computation and communication complexity analysis. However, the execution time of a path key establishment for our scheme is slightly less than that of the basic scheme in the simulation. The reason is that the complexity analysis does not include the extra traffic and calculations caused by the customized routing protocol in the basic scheme.

2. The New RKPD Scheme

In the rest of this paper, represents the number of nodes in the entire wireless sensor network, and each node has a node identifier , . is the number of keys in the key pool, and each key also has a key identifier , . is the average number of nodes in single-hop communication range, called node density. And is the number of keys in the key ring for each node. In addition, expresses the number of hops of a path between the source node and the destination node .

2.1. The Scheme

A new random key predistribution scheme is described in this section. The scheme also includes three phases: (1) key predistribution, (2) shared key discovery, and (3) path key establishment.

Phase 1 (key predistribution). Before the deployment of nodes, for each node, a control center (CC) randomly chooses a key ring and loads it into the node (the flow chart is shown in Figure 1).

Step 1. The CC randomly generates keys and assigns a unique key identifier to each ; those keys and the corresponding identifiers compose a key pool.

Step 2. The CC chooses a deterministic algorithm G to decide the key identifiers allocated to each node from the node’s identifier.

Step 3. For each node , the CC inputs its identifier into G and outputs distinct values between and , denoted by . Finally, the CC draws keys whose key identifiers are . Those keys and the corresponding key identifiers compose a key ring which is loaded into node . The algorithm G is also loaded into each node.

Phase 2 (shared key discovery). After the deployment of nodes, each node creates its own neighbor list.

Step 1. Each node broadcasts its identifier and records the received identifiers, denoted by .

Step 2. For each node , node runs the procedure G and generates the key identifier set of node . If there is a common key identifier in that set and its own key ring, the two nodes are logically connected. Then node adds a record containing the node identifier and the common key identifier to its neighbor list.

Phase 3 (path key establishment). Node wants to establish a path key with node . If they are within wireless communication range and have a shared key, that is, they are on each other’s neighbor list, the shared key can be used as the path key. Otherwise, randomly generates a path key and encrypts with some key in ’s key ring. The ciphertext, denoted by , together with the identifier of the encryption key, is sent to on a physically connected path found by a routing protocol [22]. Finally, finds the encryption key corresponding to the received key identifier and decrypts CT to obtain . The following procedure explains how can find an encryption key which belongs to ’s key ring. Figure 2 shows the process of source node and Figure 3 shows the process of destination node and ’s neighbors .

Step 1. inputs ’s node identifier into the procedure G to generate ’s key identifier set and then uses binary search to look for a key identifier shared between ’s key identifier set and its own key ring. If a common key identifier is found, looks it up in its own key ring and obtains the encryption key. Otherwise, if all the key identifiers are different, goes to step 2.

Step 2.   broadcasts ’s identifier .

Step 3.  ’s neighbor   which receives the identifier looks up in its neighbor list. If  ’s identifier is not on the list,   stops.

Step 4. inputs ’s identifier into the procedure G to generate ’s key identifiers and then tries to find a key identifier shared between ’s key identifiers and its own. If does not find a common key identifier, it stops. Otherwise, sends “1” to for confirmation.

Step 5. chooses from the neighbors which have responded and obtains the shared key through a neighbor list query and a key ring query. Then, encrypts with the shared key and sends the temporary ciphertext CTtemp and the key identifier to .

Step 6. searches its key ring for the shared and decrypts CTtemp to obtain . After that, searches its key ring for the shared and encrypts with . Finally, sends the final ciphertext CT and the encryption key identifier to .

Step 7.   forwards CT and to .
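The seven steps above can be traced with the following minimal sketch. It rests on several assumptions that are not part of the scheme itself: `key_ids_for` is a seeded stand-in for algorithm G, `toy_cipher` is an XOR placeholder for a real cipher (it provides no security), the parameters are toy values, and routing of the final ciphertext (step 7) is omitted. The names `src`, `dst`, and `helper` are illustrative labels for the source node, the destination node, and the assisting neighbor.

```python
import random
import secrets

POOL_SIZE, RING_SIZE = 200, 12  # assumed toy parameters
KEY_POOL = {kid: secrets.token_bytes(8) for kid in range(POOL_SIZE)}


def key_ids_for(node_id):
    """Stand-in for algorithm G: a deterministic key identifier set per node."""
    return set(random.Random(node_id).sample(range(POOL_SIZE), RING_SIZE))


def key_ring(node_id):
    """Key ring loaded into a node: key identifier -> key."""
    return {kid: KEY_POOL[kid] for kid in key_ids_for(node_id)}


def toy_cipher(key, data):
    """XOR placeholder for a real cipher; applying it twice decrypts."""
    return bytes(d ^ k for d, k in zip(data, key))


def establish_path_key(src, dst, src_neighbors):
    """Return (path_key, key_id, ciphertext, helper), or None if no key is found."""
    path_key = secrets.token_bytes(8)
    src_ring = key_ring(src)
    # Step 1: the source regenerates the destination's key identifiers locally.
    direct = key_ids_for(src) & key_ids_for(dst)
    if direct:
        kid = min(direct)
        return path_key, kid, toy_cipher(src_ring[kid], path_key), None
    # Steps 2-4: each neighbor checks that it shares a key with the source
    # (standing in for the neighbor list lookup) and with the destination.
    for helper in src_neighbors:
        with_src = key_ids_for(helper) & key_ids_for(src)
        with_dst = key_ids_for(helper) & key_ids_for(dst)
        if not (with_src and with_dst):
            continue
        helper_ring = key_ring(helper)
        # Step 5: the source encrypts under the key it shares with the helper.
        ct_temp = toy_cipher(src_ring[min(with_src)], path_key)
        # Step 6: the helper decrypts, then re-encrypts for the destination.
        recovered = toy_cipher(helper_ring[min(with_src)], ct_temp)
        kid = min(with_dst)
        return path_key, kid, toy_cipher(helper_ring[kid], recovered), helper
    return None


def receive_path_key(dst, key_id, ciphertext):
    """Destination side: look up the key by identifier and decrypt."""
    return toy_cipher(key_ring(dst)[key_id], ciphertext)
```

In both branches the destination only needs the key identifier and the ciphertext, so intermediate nodes on the physical path merely forward the message.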

2.2. The Algorithm G

The deterministic algorithm G is used to decide the key ring allocated to each node. Specifically, for each node with a unique node identifier, the algorithm generates distinct integers between and using a pseudorandom number generator seeded with the node identifier. These integers are the identifiers of the keys for this node. In this paper, the algorithm is based on a bit-oriented linear feedback shift register (LFSR). The content of the LFSR is denoted by . The feedback polynomial of the LFSR, , is a primitive polynomial of degree .

2.2.1. Initialization

Before any bit stream is generated, the register must be initialized with the node identifier. Let the bits of the node identifier be denoted by ,  . The initialization phase is done as follows. First, load the LFSR with the node identifier bits, , , and then the remaining bits of the LFSR are filled with “1”s.

2.2.2. Key Identifier Set Generation

Create a one-dimensional array of length and initialize the array with “0”s. After the following steps, output the array, which is filled with the key identifiers for a node.

For each
(1) the LFSR shifts to produce a bit stream;
(2) once every bits have been generated, compute the value ;
(3) if or , go to (1);
(4) else, ;
(5) sort the sequence using the standard insertion sort method
end for.
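The initialization and key identifier set generation can be sketched as follows. This is an illustration under stated assumptions rather than the exact algorithm G: the register length, identifier length, chunk size, pool size, and ring size are hypothetical, and they are not checked against the design criteria of this paper. The feedback polynomial x^16 + x^14 + x^13 + x^11 + 1 is a commonly used primitive polynomial of degree 16, and rejecting 0, out-of-range values, and duplicates is an assumed reading of step (3).

```python
from bisect import insort

REG_BITS = 16     # assumed register length
ID_BITS = 10      # assumed node identifier length
CHUNK_BITS = 10   # assumed number of bits per candidate key identifier
POOL_SIZE = 1000  # assumed key pool size (identifiers 1..POOL_SIZE)
RING_SIZE = 20    # assumed key ring size


def init_register(node_id):
    """Load the node identifier bits, then fill the remaining bits with 1s."""
    fill = (1 << (REG_BITS - ID_BITS)) - 1
    return (node_id << (REG_BITS - ID_BITS)) | fill


def lfsr_bits(state):
    """Yield the output bits of a 16-bit Fibonacci LFSR."""
    while True:
        yield state & 1
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)


def key_identifier_set(node_id):
    """Return a sorted list of RING_SIZE distinct key identifiers."""
    bits = lfsr_bits(init_register(node_id))
    ring = []
    while len(ring) < RING_SIZE:
        value = 0
        for _ in range(CHUNK_BITS):
            value = (value << 1) | next(bits)
        # Discard invalid or repeated candidates and keep shifting.
        if value == 0 or value > POOL_SIZE or value in ring:
            continue
        insort(ring, value)  # insertion keeps the list sorted
    return ring
```

Because the output depends only on the node identifier, any node can recompute another node's key identifier set locally instead of receiving it over the air.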

2.2.3. Design Criteria

(1) The Number of Nodes to Which Each Key in the Key Pool Is Allocated Is Close to . In random sampling without replacement, each member of a population has an equal chance of being included in the sample. Hence, each key in the key pool has a probability of to be chosen by a node. Since the network size is , each key should statistically be allocated to nodes. In practice, however, computational algorithms can only produce long sequences that seem random, because the outputs are in fact determined by a shorter initial value, called a seed. Such generators are therefore regarded as pseudorandom number generators. As a result, the number of nodes by which each key is selected can only be close to in reality.

A binary sequence generated by a linear feedback shift register with a primitive feedback polynomial has the balance property, the run property, and the correlation property. Therefore, any truncation of the sequence can be regarded as a pseudorandom number. The experimental result (see Figure 5) also shows that the number of times each key identifier is selected is approximately .

(2)  . First of all, ensures that the register can be initialized to distinct states according to distinct node identifiers. Moreover, if , the key identifiers (in binary format) that have a run of length will never be selected by (refer to run test of LFSR for the proof [23]).

(3)  . This condition guarantees that a long subsequence of “0” will not appear at the start of the sequence, since most of those bits will be discarded in step 4 of the key identifier set generation procedure.

(4)  . This condition ensures that the key identifier set for a node can be generated in one cycle of the sequence produced by the LFSR.

Example 1. Suppose , , and . According to the design criteria, we choose and the feedback polynomial of the LFSR (see also Figure 4). Obviously, meets the conditions (2), (3), and (4).

The average number of times that each key is drawn by nodes is , and the actual statistical results are shown in Figure 5. It can be seen that most values are between 100 and 140, which is close to 120.

3. Connectivity

The connected probability refers to the probability that any two nodes in a distributed sensor network can establish a path key successfully. Next, we will prove that the connected probability of the new RKPD scheme is

As mentioned in Phase 3, a path key can be securely sent to y as long as an encryption key can be found. The probability of finding one is discussed in two cases.

Case 1.   finds a shared key with  y in step 1 of Phase 3, and this key is used as the encryption key.

Each key ring of size is randomly drawn out of the key pool, so the probability that two nodes do not share any key is . In other words, the first key ring is picked at random and the second key ring is drawn out of the remaining unused keys in the pool. Therefore, at least one shared key exists with probability .
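The counting argument above can be checked numerically with a short sketch. The parameter names `pool_size` and `ring_size` are illustrative labels for the key pool size and the key ring size, and the code assumes that both key rings are drawn uniformly at random without replacement.

```python
from math import comb


def p_no_shared_key(pool_size, ring_size):
    """Probability that the second ring avoids every key of the first ring."""
    return comb(pool_size - ring_size, ring_size) / comb(pool_size, ring_size)


def p_shared_key(pool_size, ring_size):
    """Probability that two key rings have at least one key in common."""
    return 1.0 - p_no_shared_key(pool_size, ring_size)
```

When the ring size exceeds half of the pool size, `comb` returns 0 for the numerator and the sharing probability is exactly 1, as the pigeonhole principle requires.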

Case 2.   cannot find any shared key with  y in step 1 of Phase 3, but a logically connected neighbor   which has a shared key with  y can be found. Hence, that shared key will be used as the encryption key.

A neighbor of node  , denoted by  , has both a shared key with   and a shared key with   which means that keys of  ’s key ring are taken from  ’s ring and keys are taken from  ’s ring. So the total number of the possible key rings of   is . Therefore, such a neighbor exists with probability .

In addition, the node density is , so there are neighbors in average. Therefore, the probability that at least one neighbor can help   encrypt in step 5 and step 6 is . In sum, the connected probability is
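As an independent sanity check on the two cases, the probability of finding an encryption key can also be estimated by simulation. The sketch below is a Monte Carlo estimate under simplifying assumptions, not the closed-form expression derived here: key rings are drawn uniformly at random, every source node has exactly `neighbors` physical neighbors, and all function and parameter names are illustrative.

```python
import random


def estimate_connectivity(pool_size, ring_size, neighbors, trials, seed=0):
    """Estimate the probability that a source can find an encryption key."""
    rng = random.Random(seed)
    pool = range(pool_size)
    successes = 0
    for _ in range(trials):
        src = set(rng.sample(pool, ring_size))
        dst = set(rng.sample(pool, ring_size))
        # Case 1: the source and destination share a key directly.
        if src & dst:
            successes += 1
            continue
        # Case 2: some neighbor shares a key with both the source and the destination.
        for _ in range(neighbors):
            helper = set(rng.sample(pool, ring_size))
            if helper & src and helper & dst:
                successes += 1
                break
    return successes / trials
```

Calling `estimate_connectivity` with increasing ring sizes for a fixed pool size and node density gives an empirical counterpart to the parameter choices discussed next.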

According to (2), the system parameters can be decided. For instance, let while the parameters are listed in Table 1. The first row is the size of key pool and the first column is the node density. Each entry of the table is the theoretical values of the key ring size.

Let the total number of nodes in the network be and let the connected probability be . Figure 6 shows how the key ring size varies along with the node density .

In the figures of this section, symbol denotes the new scheme, symbol denotes the basic scheme, and symbol denotes the q-composite scheme. It can be seen that the theoretical value of in the basic scheme and the q-composite scheme increases sharply when approaches . The reason is that the predefined out-degree is . In other words, if the node density is equal to or less than , the connectivity will be seriously affected. Since the out-degree cannot be modified once nodes are deployed, the size of the key ring must be very large to maintain high connectivity when nodes sleep or die. As a result, the amount of storage per node in the basic scheme and the q-composite scheme will increase. However, the ring size in the new scheme increases smoothly as the node density decreases. Thus, by adding a small amount of redundancy to the key ring size, the network will work well even when the number of active nodes is less than . For example, let when .

4. Security

Resilience in WSNs refers to the resistance of key distribution schemes against node capture. When sensor nodes are deployed in hostile areas (e.g., battlefield surveillance), an adversary can mount a physical attack on a sensor node and recover secret information from its memory. So we are interested in the following question: for any two nodes and which have not been captured by the attacker, what is the probability that the attacker can eavesdrop on their communications using the subset of the key pool recovered from the captured nodes? That is, what is the probability that the shared key of two physically connected neighbors (also called a link) is in the compromised key set, or that a path key is compromised during path key establishment?

Theorem 2. The probability that any secure link in the shared-key discovery phase between two uncompromised nodes is compromised when nodes have been captured is .

Proof. Let be a subset of the key identifiers of the key pool, where . And let be the event in which key with identifier is compromised after nodes are captured by an adversary. The probability is (please refer to [24] for the proof).
Let be the event in which the link between the two nodes is compromised, so . If is fixed, a link in our scheme is compromised if and only if the identifier of the key shared by the two nodes is in . Hence, .
Combining the above two results, the probability of compromising a link in the shared-key discovery is
The probability that a link is compromised in our scheme is equal to that in the basic scheme, and it is higher than that of the q-composite scheme.

Theorem 3. The probability that a path key between two nodes is compromised in the path key establishment phase when nodes have been captured is .

Proof. In path key establishment, let be the event in which a path key between the two nodes is compromised in the path key establishment phase when nodes have been captured. Similarly, .
In our scheme, a path key is encrypted with one shared , named direct encryption, with probability (see Figure 7) and is encrypted with two shared keys , named indirect encryption, with probability (see Figure 8). We already know that a shared key is compromised if and only if the shared key’s identifier is in . So in the direct encryption, the path key is compromised with probability . And in the indirect encryption, the probability that the path key is compromised is the probability of compromising one shared key plus the probability of compromising the other shared key minus the probability of compromising both shared keys.

So the probability that the path key is compromised is when is fixed. To sum up, in our scheme, the probability that the path key is compromised is

In the basic scheme [14] and the q-composite scheme [15], the path key is established on a multihop path (that is, over multiple links using different encryption keys), so the path key is compromised when at least one hop is compromised (see Figure 9). Assuming that one hop (a link) is compromised with probability when nodes are captured, the probability that all -hops are secure is . As a result, the probability that the path key is compromised is .

Accordingly, in the basic scheme, the probability that the path key is compromised is . And in the q-composite scheme, the probability that the path key is compromised is , where and . Please refer to [24] for the proof of in the basic scheme and the q-composite scheme.

In the -path scheme [15], a path key is compromised if and only if all the disjoint paths are compromised, that is, , where is the number of hops on the th path. So the probability that the path key is compromised is .
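The multihop and multipath expressions described above can be compared numerically with a small sketch. It assumes that every link is compromised independently with the same probability `link_prob`; the function and parameter names are illustrative, and the compromise probability of the new scheme is deliberately not modeled here.

```python
from math import prod


def multihop_compromise(link_prob, hops):
    """A multihop path key is exposed if at least one hop is compromised."""
    return 1.0 - (1.0 - link_prob) ** hops


def multipath_compromise(link_prob, hops_per_path):
    """A multipath key is exposed only if every disjoint path is compromised."""
    return prod(multihop_compromise(link_prob, h) for h in hops_per_path)
```

For a fixed link compromise probability, `multihop_compromise` grows with the number of hops, which is why the hop count matters when comparing the schemes.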

The probability that a path key will be compromised in the establishment phase when is listed in Table 2 (, , and ). It can be seen that the probability in our scheme will be lower than that of the -path scheme when the number of hops is larger than 4.

5. Complexity

In this section, the storage, computation, and communication complexity of the new scheme are presented.

(1) Storage Complexity. Same as the basic scheme, each node in the new scheme stores a key ring and a neighbor list. There are key-identifier/key pairs in the key ring. Let represent the length of a key, so the key ring takes bits of storage space. On the other hand, the probability that two neighbor nodes are logically connected is also , and the average number of neighbors for a node is . Thus, there are about records on the neighbor list. Each record includes a node identifier and a key identifier. Therefore, the neighbor list uses bits memory space. Consequently, the total storage complexity of the new scheme is .
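The storage estimate can be turned into a small calculator. The sketch below assumes that a key identifier takes ceil(log2(pool size)) bits and a node identifier takes ceil(log2(number of nodes)) bits, and that the neighbor list holds one record per logically connected neighbor on average; these encodings and all names are assumptions made for illustration.

```python
from math import ceil, comb, log2


def storage_bits(num_nodes, pool_size, ring_size, density, key_bits):
    """Approximate per-node storage: key ring plus expected neighbor list."""
    key_id_bits = ceil(log2(pool_size))   # assumed key identifier encoding
    node_id_bits = ceil(log2(num_nodes))  # assumed node identifier encoding
    p_share = 1 - comb(pool_size - ring_size, ring_size) / comb(pool_size, ring_size)
    ring_bits = ring_size * (key_id_bits + key_bits)
    list_bits = density * p_share * (node_id_bits + key_id_bits)
    return ring_bits + list_bits
```

The key ring term dominates for realistic key lengths, so the ring size is the main lever on per-node storage.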

(2) Communication Complexity. Let be the length of a ciphertext. If   and   share a key, the main message transmitted on the channel is the ciphertext and the key identifier of the encryption key. So, the communication complexity is , where represents the number of hops in a path.

If   and   do not share any keys, the process of searching for an encryption key (step 2 to step 6) will bring extra traffic.  ’s identifier with length is sent in step 2, and a key identifier and a ciphertext are transmitted both in step 5 and in step 6. The probability that a neighbor   has shared keys both with   and with   is , so about bits confirmation message will be sent to   in step 4. Hence, the extra traffic is .

And because the probability that   and   do not share any keys is , the average communication complexity is .

The communication complexity of the new scheme is slightly higher than that of the basic scheme. Figure 10 shows that the difference of the communication complexity between the two schemes can be almost negligible.

(3) Computation Complexity. Let denote the time complexity of random sequence generation such as LFSR, and let represent the time complexity of an encryption (or decryption) operation.

The amount of calculation for sorting a (key identifier) set of size is with, for example, bubble sort or insertion sort. The amount of calculation for comparing two ordered sets of size to find a common key identifier (by binary search), called key identifier search, is . Given a key identifier, the amount of calculation for looking up the corresponding key in a key ring (by binary search), called key ring query, is .
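The key identifier search and the key ring query can be sketched with the standard `bisect` module. The function names and the representation of a key ring as two parallel sorted lists are illustrative assumptions.

```python
from bisect import bisect_left


def find_shared_key_id(sorted_a, sorted_b):
    """Key identifier search: binary-search sorted_b for each element of sorted_a."""
    for kid in sorted_a:
        i = bisect_left(sorted_b, kid)
        if i < len(sorted_b) and sorted_b[i] == kid:
            return kid
    return None


def key_ring_query(ring_ids, ring_keys, kid):
    """Key ring query: look up the key stored for kid, or None if it is absent."""
    i = bisect_left(ring_ids, kid)
    if i < len(ring_ids) and ring_ids[i] == kid:
        return ring_keys[i]
    return None
```

Both functions rely on the identifier lists being sorted, which is why the key identifier set generation sorts its output.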

The average length of the neighbor list is and a node identifier has bits, so the computation amount of deciding whether a node identifier is on a neighbor list, denoted by neighbor list query, will be .

Case 1. If   and   have a shared key, only   and   are involved in establishing a path key regardless of the receiving and forwarding operations of nodes in the path.   executes a random sequence generation, a key identifier search, a key ring query, and a path key encryption in step 1, so the computation complexity of   is .

Accordingly,    executes a key ring query and a decryption; hence, the computation complexity of   is .

Case 2. If   and   do not have any shared key,  ’s neighbors are involved in finding the encryption key. neighbors may run neighbor list query (step 3), so the complexity is . About neighbors will run step 4 which includes a random sequence generation and a key identifier search; therefore, the computation amount is .

In step 5,   runs a neighbor list query and a key ring query to find the shared key with   and encrypts the ciphertext with the shared key. The total computation complexity is .

In step 6,   executes key ring query twice, encryption once, and decryption once. So the calculation amount is .

Considering the above two aspects, the computational complexity is

In Phase 2 of the new scheme, the algorithm is used to save on traffic at the cost of increasing the computation amount. So the total traffic in this stage is about , and the cost of computation is about . Finally, the complexity of the basic scheme and the new scheme is listed in Table 3.

The efficiency (in Phase 3) of the basic scheme and the new scheme is shown in Figures 11 and 10. In those figures, symbol denotes the new scheme and symbol denotes the basic scheme.

From (5), the computational complexity of the new scheme is concerned with , , and . Besides that, the basic scheme varies with the number of hops , which is shown in Figure 11 ( and ). It can be easily noticed that if is small, the basic scheme performs faster than the new scheme. However, the computation complexity of the basic scheme grows rapidly with the increase of the number of hops . When is more than , the complexity of the new scheme is lower than that of the basic one.

It is worth noting that the computation complexity of the basic scheme in Table 3 does not include the extra computation cost of finding logically connected neighbors in routing phase, which will be estimated in the next section.

6. Simulations

In this section, we simulate the basic scheme and our scheme on a computer. In the basic scheme, every hop of the path is not only physically but also logically connected. We estimate the amount of computation spent on deciding whether two adjacent nodes are logically connected (neighbor list query) in the routing process. We also show that the path length between two nodes increases in the basic scheme. Finally, the execution time of the path key establishment of the new scheme and the basic scheme is tested. The simulation environment is as follows:
operating system: Ubuntu 10.04,
CPU: Intel Core2 Quad Q8400 2.66 GHz,
RAM: 4.00 GB,
communication bandwidth: 4 KB/s,
routing protocol: AODV.

In those simulations, parameters are set as , , , and .

(1) Computation Amount. As previously mentioned, the basic scheme spends a lot of extra computation on neighbor list queries in the routing process. Table 4 shows the average number of nodes which execute the neighbor list query in 208 random routing trials. The hop limit is 30.

Evidently, more than half of the nodes are involved in a routing process. The computation cost of a neighbor list query is , so the total computation complexity is . The theoretical value is more than 90000 when the parameters are the same as those of the routing trials.

(2) Path Length. In fact, the path length of the basic scheme may be longer than that of the new scheme since the routing protocol is affected by the basic scheme. Consequently, the communication complexity of the basic scheme will be higher than the communication amount shown in Figure 10. We randomly chose two nodes which are physically connected neighbors and then conducted 10240 routing trials to find a path in which every two adjacent nodes are logically connected. The length of the path in each trial was counted, and the results are shown in Table 5.

Obviously, the probability that the path length increases () because of the influence of the basic scheme is at least .

It should be pointed out that parameters in those simulations are small, for instance, and , because we assume that the network has a lot of simple sensor nodes which have limited power. In fact, if parameters increase, for example, the above conclusions also hold.

(3) Execution Time of the Path Key Establishment. We simulate the path key establishment process on the computer, in which the encryption algorithm is DES. Figure 12 shows the average time spent on path key establishment.

The -coordinate in Figure 12 is the physical hops of two nodes  (). As mentioned above, the real number of hops in the path key establishment of the basic scheme may be larger than that. “Direct” means that   and   share a key . The path key is encrypted with this key, and then is sent to  . “Indirect” means that   transmits the to    with the help of a neighbor  . It can be seen in the figure that our scheme is slightly more efficient than the basic scheme.

7. Conclusions

The random key predistribution scheme is a good solution for distributed sensor networks, as it reduces the energy consumption of nodes at the cost of a little connectivity and security. In this paper, a new RKPD scheme which is independent of the routing protocol is given. The new scheme can be parameterized to meet the appropriate level of performance and to keep the connectivity stable within a certain range by introducing a small amount of redundancy in the parameters.

Moreover, an algorithm G which takes a node’s identifier as input is constructed to reduce the communication overhead. However, the traffic of the new scheme in the path key establishment phase is still higher than that of the basic scheme. Since Pottie and Kaiser [25] revealed that the energy of transmitting 1 K bits over 100 meters could be used to execute instructions, the communication overhead of the new scheme should be further reduced. An intuitive solution is not to generate a new path key. Instead, if and have a shared key , the key is used as the path key. Otherwise, the that shares with will be sent to (covered by ) by . In this way, only transmits the key identifier or to . However, the above solution may introduce a new security flaw, which is the same as that of the scheme in [17]. For example, a captured node may be cloned and redeployed in the WSN; the clone can then send fake messages (as in step 1 of the path key establishment) to its neighbors and fraudulently obtain fresh (uncompromised) keys in the key pool.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by National Natural Science Foundation of China (nos. 61370194 and 61202082) and the Fundamental Research Funds for the Central Universities (nos. BUPT2012RC0219 and BUPT2013RC0311).