Abstract

Many schemes have been proposed to prolong the lifetime of sensor networks efficiently. Among these schemes, clustering protocols are an effective way to extend network lifetime. However, when the data collected by sensor nodes easily overlap, some nodes consume energy unnecessarily. In this paper we propose a clustering method that reduces unnecessary data transmission among nodes by excluding duplicate data. Our method alleviates the problem of nearby nodes collecting the same data from adjacent areas by taking the sensing coverage of the nodes into account when selecting the nodes that form a cluster. It also introduces relay nodes, called repeaters, which forward data hop by hop to cluster head nodes in order to cope with the energy-hole and link failure problems. This prevents data loss caused by link failures, so the data is collected reliably. According to the results of the performance analysis, our method reduces energy consumption, increases transmission efficiency, and prolongs network lifetime compared to existing clustering methods.

1. Introduction

In wireless sensor networks, sensor nodes are randomly and densely deployed in the field for data collection, so the sensing areas of neighboring nodes often overlap. Moreover, sensor nodes run on disposable batteries with limited energy [1]. Thus, energy use is the most critical problem to consider.

Typically, energy consumption can be divided into three domains: sensing, data processing, and communication [2, 3]. Among these, the energy cost of communication is much higher than the others [4, 5]. Hence, to save energy and prolong network lifetime, we should consider how to minimize communication costs. In addition, any method which considers these problems [6, 7] needs to also maintain network stability.

Therefore, hierarchical structures that use clustering [8–13] are more appropriate than flat-based routing [14, 15].

However, in current clustering methods, the energy load is concentrated on the cluster head node. Eventually the remaining energy imbalance in the network gets larger and larger. In order to mitigate this energy imbalance, methods that periodically change the role of cluster head node have been suggested. In LEACH [8] cluster head nodes are elected through a setup process for reforming clusters.

In LEACH-C [8] cluster head nodes are elected by a sink or base station to prevent energy imbalances. HEED considers residual node energy to spread energy consumption [9]. BCDCP selects cluster head nodes from a candidate set of nodes [10]. TEEN [12] builds on LEACH and sets threshold values to reduce the energy consumption of member nodes and the cluster head node. APTEEN [13] was proposed to make up for the weak points of TEEN: it also sets a time at which data is transmitted periodically, for data accuracy and reliability. Nevertheless, these schemes did not consider multihop transmission and the energy hole problem [16].

Our research considers the sensing coverage of sensor nodes and data redundancy, and evaluates performance using meteorological data [17]. It reduces the problem of nodes deployed in a specific region redundantly collecting the same data. It also increases the accuracy of data collection and addresses the problem of disconnection among nodes.

This proposal is based on the basic concept of ARCT [18] which is composed of two kinds of clusters: regional and normal cluster.

In the regional cluster of ARCT, all nodes which participate in this cluster check whether their sensor values are the same or not.

However, our scheme includes one more value: the altitude. In other words, all the nodes that form a cluster need to consider two types of values: the sensor value (the air temperature) and the altitude. Furthermore, in order to decrease the error rate during data collection, our scheme uses one more condition for a node to be included in a cluster. Former schemes have typically considered only the transmission range when forming a cluster. To achieve higher sensor data accuracy, our scheme instead considers the sensing coverage, which is typically smaller than the transmission range. For transmission, in order to mitigate link failures, we introduce a data relay node, called a repeater, to support successful data transmission.

According to the results of the experimental evaluation, our method shows less energy consumption, higher degree of network connectivity, prolonged network lifetime, and higher data collection rate than the existing methods.

The rest of this paper is organized as follows. Section 2 introduces cluster-based protocols such as LEACH, LEACH-C, HEED, BCDCP, TEEN, APTEEN, and ARCT as related research. Section 3 explains our method. Section 4 measures the data collection rate and the network connection rate during the network lifetime, and analyzes them in relation to the number of cluster head nodes created during the network lifetime, the degree of energy consumption equalization across all nodes, the number of isolated nodes, and the lifetime of the network. Finally, we conclude the paper in Section 5.

2.1. LEACH

LEACH (low-energy adaptive clustering hierarchy), a cluster-based hierarchical routing protocol, is a representative clustering method for sensor networks. In this method the cluster head node collects data from its member nodes, aggregates it, and periodically retransmits it to the sink.

Consequently, the cluster head node consumes a lot of energy. Thus, the role of the cluster head node is rotated periodically in order to spread the energy consumption over the nodes of the entire network. This method collects the data of the member nodes through the cluster head node. Nevertheless, it consumes a lot of energy, so it is difficult for the network to stay operational for long periods.

2.2. LEACH-C

To solve the problems of cluster formation in LEACH, LEACH-C (low-energy adaptive clustering hierarchy-centralized) was proposed. In this scheme, to solve the problem of positioning cluster head nodes, the cluster head nodes are elected by a base station. All nodes in the network send a message that includes their position and remaining energy to the base station. The base station then broadcasts the resulting information to all the nodes deployed in the field to help form adequate clusters. However, this scheme incurs a high clustering overhead, because energy is consumed in transmitting every node's position and in the calculation needed to elect cluster head nodes.

2.3. HEED

HEED (hybrid, energy-efficient distributed clustering) is a method to extend network lifetime. It takes the nodes' residual energy into account in cluster formation and cluster head election. The node with more residual energy can be elected as cluster head node, which prolongs network lifetime. If several candidates for cluster head node have the same residual energy, their transmission costs are compared. In this scheme energy efficiency is considered in each round, except for the energy consumed by node transmissions when shifting between rounds.
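The selection rule described above can be sketched as follows. This is a minimal illustration only: the `Candidate` type, its field names, and the sample values are hypothetical, and the actual HEED protocol elects heads through an iterative, distributed process rather than a single comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    node_id: int
    residual_energy: float    # remaining energy in joules (illustrative values)
    transmission_cost: float  # hypothetical cost metric; lower is better


def elect_cluster_head(candidates):
    """Prefer the highest residual energy; break ties by the lowest transmission cost."""
    return max(candidates, key=lambda c: (c.residual_energy, -c.transmission_cost))


candidates = [
    Candidate(1, 1.8, 0.40),
    Candidate(2, 2.0, 0.55),
    Candidate(3, 2.0, 0.30),
]
head = elect_cluster_head(candidates)
print(head.node_id)
```

Nodes 2 and 3 tie on residual energy, so the lower transmission cost decides in favor of node 3.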

2.4. BCDCP

BCDCP (base station controlled dynamic clustering protocol) is in some respects the same as LEACH-C, for example in assigning the complex calculations to the base station. This method is composed of two phases: setup and data communication. In cluster formation, the base station first selects a candidate set from which the cluster head nodes are determined. To reach the required number of cluster head nodes, it uses a cluster splitting algorithm that divides the network repeatedly. Furthermore, in this method all the cluster head nodes send aggregated messages to the base station over multiple hops rather than by direct transmission. However, this method also uses the same transmission method as LEACH-C.

2.5. TEEN

TEEN (threshold sensitive energy-efficient sensor network protocol) is a reactive network protocol in which the sensor nodes transmit according to threshold values. Apart from its use of thresholds, it operates in the same way as LEACH.

Cluster formation follows the same method as LEACH. After cluster formation, cluster head nodes transmit two parameters to their member nodes: HT, the hard threshold value, and ST, the soft threshold value. A node first transmits data when the sensed value exceeds HT. After HT has been exceeded, the node transmits data only when the measured value changes by more than ST. As a result, not all nodes are active in every reporting period, which conserves energy. However, if the collected data never exceeds HT, the node does not transmit any data. Likewise, if a change does not exceed ST, we cannot know how the data has changed since the last report. Moreover, it is hard to judge whether silent nodes are alive or not, and even when the data surpasses both threshold values, the collected data is still aggregated by the cluster head node. Therefore the energy consumed in the redundant handling of the collected data should be considered.
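The threshold rule can be sketched as follows, following TEEN as it is commonly described: a node reports when the sensed value reaches the hard threshold and differs from the last reported value by at least the soft threshold. The function name, the state handling, and the sample readings are illustrative assumptions and are not taken from the evaluated implementation.

```python
def teen_reports(readings, hard_threshold, soft_threshold):
    """Return the readings a node would transmit under TEEN-style thresholds."""
    sent = []
    last_sent = None  # last value reported to the cluster head
    for value in readings:
        if value < hard_threshold:
            continue  # below HT: the node stays silent
        if last_sent is None or abs(value - last_sent) >= soft_threshold:
            sent.append(value)
            last_sent = value
    return sent


# Hypothetical temperature readings for one node
readings = [2.0, 3.0, 3.1, 3.3, 2.5, 3.4]
print(teen_reports(readings, hard_threshold=3.0, soft_threshold=0.2))
```

With these sample values only two of the six readings are transmitted, which shows both the energy saving and the loss of information about small changes.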

2.6. APTEEN

APTEEN (A Hybrid Protocol for Efficient Routing and Comprehensive Information Retrieval in Wireless Sensor Networks) is a hybrid protocol that combines the advantages of LEACH and TEEN: it unites the threshold-based data transmission of TEEN with the periodic data transmission of LEACH. It overcomes TEEN's difficulty in collecting data and in recognizing the state of the nodes. After cluster formation, the cluster head node transmits the threshold values and parameters, including the TDMA schedule and the count time, to the member nodes. All nodes transmit the collected data to the cluster head node at the time set by these parameters, even when the data does not exceed the threshold value. This solves the problem of TEEN. However, all the nodes of the network have to transmit data steadily, so an improvement is needed.

2.7. ARCT

ARCT (An Advanced Regional Clustering Scheme using Thresholddataset) is a protocol based on dynamic clusters. It is a proactive network protocol that transmits at predetermined times; in addition, it uses thresholds in a way similar to TEEN, so it can also be categorized as a reactive network protocol. This method uses two types of clusters, the regional cluster and the normal cluster, which collect data in different ways. A regional cluster is formed by comparing the data collected by adjacent nodes; afterwards only the head node of the regional cluster is active and transmits the collected data. A normal cluster is formed from the nodes that failed to join a regional cluster, and all of its nodes participate in data collection. These nodes increase their energy efficiency by using a threshold value table when collecting data.

3. Proposed Scheme

Sensor networks can be used for various applications. Our method, ARCS (an advanced region clustering scheme in wireless sensor networks for environment monitoring system) which is based on a dynamic cluster and proactive network, is proposed for environmental monitoring. Before addressing our scheme, we made some assumptions about sensor nodes, networks, and the environment.

All the nodes transmit data in a multihop fashion. All the nodes' positions are fixed. All the nodes are time synchronized.

All the nodes have the same initial energy. All the nodes can adjust their transmission power. All the nodes have a sensor which can measure altitude. All the nodes' sensor ranges are shorter than their transmission ranges. Links are symmetric: if node B can successfully receive a packet from node A, node A can also successfully receive a packet from node B. The base station (or sink) has no limitations in energy or transmission coverage.

In the regional cluster of ARCT, all nodes which participate in the cluster check whether their sensor values are the same or not.

However, our scheme uses one more value, namely the altitude. In other words, all the nodes in the cluster need to consider two types of values: the sensed value and the altitude. Former schemes typically considered only the transmission range when forming clusters, whereas our scheme considers the sensor range, which is shorter than the transmission range, as shown in Figure 1.

Altitude: having the same altitude is also included as a cluster forming condition. Even when all the nearby nodes report the same temperature, a difference in altitude is a major factor in temperature measurement, and it therefore influences how data should be collected.

Sensor coverage: regional clustering is a method that alleviates the redundancy of nodes collecting the same data. Thus it is not desirable to collect data from all the nodes within sensor range. All the nodes deployed in the area of interest (AOI) are chosen on the conditions above. The process of node selection is as follows.
(A.1.a.1) All nodes have a randomly generated time delay assigned before being deployed in the field.
(A.1.a.2) After deployment, the node that finishes counting its delay time first becomes a candidate cluster head node.
(A.1.a.3) When the node starts operating, it collects data and transmits it to the other nodes or the sink within its sensor range, as shown in Figure 2.
(A.1.a.4) All standby nodes that are within the sensor coverage of an operating node and have the same data values mentioned above stop their timers and go into sleep mode.
(A.1.a.5) If one of these conditions is not met, the node waits for its allotted operation time and then performs (A.1.a.3).

In accordance with these processes, selected nodes become regional nodes and thus go into the next process to become regional clusters and repeaters.
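The node selection steps can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Node` type, the state names, the exact-equality comparison of sensed values, and the sample coordinates are hypothetical, and message exchange is replaced by processing nodes in order of their delay.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    value: float      # first sensed value, e.g. air temperature
    altitude: float
    delay: int        # random delay assigned before deployment
    state: str = "normal"


def select_regional_nodes(nodes, sensing_range):
    """Wake nodes in order of delay and put redundant neighbours to sleep."""
    for node in sorted(nodes, key=lambda n: n.delay):
        if node.state != "normal":
            continue  # already asleep: it heard a matching advertisement
        node.state = "regional"
        for other in nodes:
            if other.state != "normal":
                continue
            close = math.hypot(node.x - other.x, node.y - other.y) <= sensing_range
            same_data = other.value == node.value and other.altitude == node.altitude
            if close and same_data:
                other.state = "sleep"
    return [n.node_id for n in nodes if n.state == "regional"]


nodes = [
    Node(1, 0.0, 0.0, 21.0, 100.0, delay=5),
    Node(2, 3.0, 0.0, 21.0, 100.0, delay=9),   # same data, inside range of node 1
    Node(3, 4.0, 0.0, 21.0, 140.0, delay=7),   # same temperature, different altitude
    Node(4, 50.0, 0.0, 21.0, 100.0, delay=2),  # same data, outside sensing range
]
print(select_regional_nodes(nodes, sensing_range=10.0))
```

Only node 2 goes to sleep: it lies inside the sensing coverage of node 1 and reports the same temperature and altitude, whereas node 3 differs in altitude and node 4 is too far away.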

After the time allotted for selecting regional nodes has elapsed, the selected regional nodes again generate a random time delay, as in (A.1.a.1).

The process of regional cluster head node selection is as follows.
(1) In the predetermined time period for cluster head node selection, the node with the lowest delay time is the first to broadcast a cluster head advertisement message. A regional cluster is then formed, with the nodes that receive the message as member nodes. As in (A.1.a.4) of the node selection process, all standby nodes that receive the message while counting their delay time give up becoming cluster head nodes and join the cluster as member nodes. After that, each member node counts the advertisement packets broadcast by nearby cluster head nodes and stores the number of nearby cluster head nodes.
(2) After regional clustering is finished, a member node whose number of nearby cluster head nodes is between 2 and 5 can be a candidate node for repeater.

The reasons for the conditions to be a repeater are as follows.

The main objective of the repeater is to support successful data retransmission in a multihop manner.

Thus, to relay information successfully, a repeater needs at least 2 nodes to contact. Figure 3 shows the reason for the maximum number limitation.

As shown in Figure 3, when we assume that all cluster head nodes are deployed in an ideal arrangement, a repeater is needed to maintain multihop transmission as long as the number of nearby cluster head nodes is 5 or fewer. When the number of nearby cluster head nodes is 6 or more, repeaters are unnecessary. In a real deployment based on random distribution, however, cluster head nodes can often communicate with each other even when there are fewer than 6 of them. Thus, we set a selection ratio for repeater nodes among the candidate nodes. Repeaters act as data relay nodes at the same level as cluster head nodes to support multihop data transmission.
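The repeater candidate condition (between 2 and 5 nearby cluster head nodes) can be sketched as follows. The function name, the dictionary layout, and the node IDs are hypothetical and serve only to illustrate the counting rule; the selection ratio applied afterwards is not modelled.

```python
def repeater_candidates(heard_heads, min_heads=2, max_heads=5):
    """Return member nodes that overheard between min_heads and max_heads cluster heads."""
    return sorted(
        member
        for member, heads in heard_heads.items()
        if min_heads <= len(heads) <= max_heads
    )


# Hypothetical example: member node ID -> set of cluster head IDs it overheard
heard_heads = {
    10: {1},                 # too few heads to relay between
    11: {1, 2},
    12: {1, 2, 3, 4, 5},
    13: {1, 2, 3, 4, 5, 6},  # heads are dense enough without a repeater
}
print(repeater_candidates(heard_heads))
```

Here only nodes 11 and 12 qualify as candidates.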

To achieve load balancing, our scheme uses weight values. Ordered by the role of the node, energy consumption is simply as shown below:

regional node < repeater < cluster head node.

In the process of node selection, nodes which are selected as regional nodes are assigned a weight point of 1.

In the same way, nodes which are selected as cluster head nodes are assigned a weight point of 2 and repeaters are assigned a weight point of 1. By this value, all nodes renew the available random delay time slot as shown in Figure 4.

Let $R_t$ denote the predetermined range of available delay time slots assigned before deployment, $mR_t$ its minimum value, $MR_t$ its maximum value, $r_t$ the currently available random delay time range, and $W_p$ the weight value of the node. The minimum value $mr_t$ and the maximum value $Mr_t$ of the currently available time slot $r_t$ are then determined as follows:

$$mr_t = MR_t - \frac{MR_t}{2^{W_p}}, \qquad Mr_t = MR_t. \tag{1}$$

For example, assume that a node acts as cluster head node in the first round. Then its weight value is 3: 1 point as a regional node plus 2 points as a cluster head node.

Because of these points, this node is at a disadvantage when the random delay time for selecting regional nodes is assigned. According to Pseudocode 1, this node is assigned only the 223–255 time slot. This increases the probability that the node stays in the sleep state in this round and conserves energy. In addition, after the time assignment is finished, all nodes whose weight is greater than 0 lose 1 point in each round to balance the consumption of energy. Pseudocode 1 shows the pseudocode of ARCS.

CH_regional: regional cluster head node
Repeater: relay node
N_RCH_member: regional cluster member node
N_normal: normal node before joining the cluster
N_regional: regional node
N_sleep: sleep node
Initialize:
  generate(random_delayTime)    // by normal nodes
Main Processing:    // clustering process by normal nodes
  if (node's weight value > 0)
    decrease weight value by 1 point
  end if
  calculate available time slot period by weight value
  delayTime <- generate(random_delayTime)
  wait for delayTime or until receiving {any Advertisement Message}
  if (delayTime_Expired)
    if (node state == N_normal)
      become N_regional
      broadcast(Advertisement Message {nodeID, the 1st sensing value, position, altitude})
    else
      cancel the delayTime
    end if
  else
    if (received Advertisement Message {the 1st sensing value, altitude} == own {the 1st sensing value, altitude})
      become N_sleep
    else
      wait for delayTime or until receiving {any Advertisement Message}
    end if
  end if
  if (node state == N_regional)
    increase weight value by 1 point
    delayTime <- generate(random_delayTime)
    wait for delayTime or until receiving {any Advertisement Message}
    if (delayTime_Expired)
      if (node state == N_regional)
        become CH_regional
        increase weight value by 2 points
        broadcast(Advertisement Message {nodeID, position})
      else
        cancel the delayTime
      end if
    else
      if (received Advertisement Message)
        cancel the delayTime
        become N_RCH_member
      else
        wait for delayTime or until receiving {any Advertisement Message}
      end if
    end if
  end if
  if (node state == N_RCH_member)
    if (# of neighbor CH_regional >= 2 && # of neighbor CH_regional < 6)
      become Repeater
      broadcast(Advertisement Message {nodeID, position})
    end if
  end if
CH_regional:
  broadcast(Advertisement Message {nodeID, the 1st sensing value, position})
  accept(join Message)
  aggregate sensing data
  transmit_data_to_Sink(sensing value, nodeID, position)
N_RCH_member:
  join to cluster(CHnodeID, nodeID, position)
Repeater:
  join to cluster(CHnodeID, nodeID, position)
  relay data
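The weighted delay slot can be worked through numerically. The sketch below assumes that equation (1) takes the form mr_t = MR_t − MR_t / 2^(W_p) with MR_t = 255 and rounding down; this reading is an assumption, chosen because it reproduces the 223–255 slot quoted for a weight value of 3. The function name is hypothetical.

```python
def available_delay_slot(weight, max_delay=255):
    """Return (minimum, maximum) of the random delay range for a given weight value.

    Assumed form of equation (1): minimum = max_delay - max_delay / 2**weight,
    rounded down; the maximum is unchanged.
    """
    minimum = int(max_delay - max_delay / 2 ** weight)
    return minimum, max_delay


for weight in range(4):
    print(weight, available_delay_slot(weight))
```

Under this assumption a node with weight 0 draws its delay from the full range, while each additional weight point halves the remaining range and pushes the node toward later time slots, so recently loaded nodes are less likely to win the next election.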

4. Performance Evaluation

In this section we present the experimental results of our method in terms of data collection performance, which is the basic function of a sensor network, the connection rate of the sensor network, the number of cluster head nodes created during the network's lifetime, the degree of equalization in energy consumption across all nodes, the number of isolated nodes, and the lifetime of the entire network. Experiments were conducted using MATLAB 7.0. The experimental environment is as follows: each round is composed of three frames, and the same conditions as used in LEACH are applied. The electronics energy is Eelec = 50 nJ/bit, the amplifier energy of the free space model is Efs = 10 pJ/bit/m^2, the amplifier energy of the multipath model is Emp = 0.0013 pJ/bit/m^4, the energy consumed by node scheduling is Eschedule = 5 nJ/bit/signal, the energy consumed when data is merged is Eda = 5 nJ/bit/signal, the data size is l = 1000 bits, the total number of nodes is 𝑁 = 1000, and the side length of the network area is 𝑀 = 100. The threshold values used in TEEN and APTEEN are 𝐻th = −3 to 3 and 𝑆th = 0.2, set according to the temperature data. The fixed transmission frequency of APTEEN is one transmission every three rounds. The proportion of cluster head nodes is set to 5%. The environmental data used in this measurement is the temperature change database provided by the Korean Meteorological Administration.
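The constants above plug into the first-order radio model that is commonly used in LEACH-style evaluations. The text lists the constants but not the formulas, so the formulas, the crossover distance `D0`, and the function names below are assumptions for illustration rather than the evaluated MATLAB code.

```python
import math

E_ELEC = 50e-9      # J/bit, electronics energy
E_FS = 10e-12       # J/bit/m^2, free-space amplifier energy
E_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier energy
E_DA = 5e-9         # J/bit/signal, data aggregation energy
PACKET_BITS = 1000  # data size l

# Assumed crossover distance between the free-space and multipath models
D0 = math.sqrt(E_FS / E_MP)


def tx_energy(bits, distance):
    """Energy in joules to transmit `bits` over `distance` metres."""
    if distance < D0:
        return bits * E_ELEC + bits * E_FS * distance ** 2
    return bits * E_ELEC + bits * E_MP * distance ** 4


def rx_energy(bits):
    """Energy in joules to receive `bits`."""
    return bits * E_ELEC


print(f"crossover distance: {D0:.1f} m")
print(f"send 1000 bits over 10 m: {tx_energy(PACKET_BITS, 10.0):.2e} J")
print(f"receive 1000 bits: {rx_energy(PACKET_BITS):.2e} J")
```

Under this model, short hops stay in the cheaper free-space regime, which is one reason relaying through nearby nodes can cost less than a single long transmission.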

4.1. Accuracy of Collected Data

Table 1 shows the degree of accuracy of the data collected.

According to Table 1, LEACH has the highest accuracy. This is because LEACH is a proactive method in which all nodes wake up and act in the same time slot. TEEN shows a low level of accuracy compared to the other methods, because it controls transmission using two kinds of thresholds, which makes it difficult to determine the real data. In addition, the accuracy for spring and winter is low. This suggests that no data was transmitted in certain regions or at certain times because the collected data was below the threshold. As a result, the energy consumption of the nodes decreases, but data collection itself becomes difficult. APTEEN shows an average degree of accuracy compared to the other methods. This is the result of its fixed-period transmission, which is why it scores relatively higher than TEEN. ARCS shows an accuracy higher than that of TEEN, APTEEN, and ARCT but lower than that of LEACH, because it omits the data collected by the regional cluster member nodes and depends only on the data collected by the cluster head nodes. However, its accuracy is close to that of LEACH, which means that it has quite a high degree of data accuracy. Even though it shows good results, an 18% average error still remains.

4.2. Network Connectivity

Table 2 shows the network connectivity measured by each protocol.

According to the table, all these protocols show results at almost the same level. In cluster-based protocols, cluster head nodes act as intermediate data transmitters between the source node and the sink, so the number of cluster head nodes is the most important factor. When cluster head nodes are elected on the basis of probability, a larger number of cluster head nodes means a higher probability of connection with the sink, and therefore a higher connection rate for the network. Typically, the energy consumption of cluster head nodes is higher than that of normal nodes. For this reason, former research has found that the proper proportion of head nodes is about 5% of the total nodes. In our method, however, regional clusters and repeaters do not consume much energy, so more cluster head nodes and repeaters can be set up.

In the process of data collection and restoration, when the network is stable without link disconnections, the raw data goes through the normal collection process and is transmitted to the sink node, where it is successfully restored and provided to the user.

However, when a part of the network in charge of data collection becomes disconnected, part of the raw data is lost because the nodes with disconnected links cannot transmit data, and it becomes harder for the sink node to restore the data normally in the map that it reconstructs.

Table 3 shows the number of nodes being isolated during network lifetime.

4.3. The Number of the Cluster Head Nodes

The number of cluster head nodes is related to the degree of the connection of the network. It is essential for the method of selecting the cluster head nodes on the basis of probability to constantly maintain the number of cluster head nodes. Figure 5 shows the number of cluster head nodes that were created during the lifetime of the network for each protocol.

LEACH, TEEN, and APTEEN select cluster head nodes in the same way. All of these methods apply the same cluster head selection probability, 5%, and select cluster head nodes according to this probability. However, as Figure 5 shows, the number of cluster head nodes is not stable. Cases in which the number of cluster head nodes falls below the optimal value are common, and the degree of connection between the nodes then decreases, which increases the probability that a problem will occur in data collection. When cluster head nodes are elected through a competition among nodes based on random time, as in ARCT, the number of cluster head nodes is maintained in the range of 35 to 40, and the error range is halved compared to the other methods. In our method, ARCS, fewer than 10 nodes are maintained throughout its lifetime with no significant errors.
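The instability of probability-based election can be illustrated with a simplified model in which each of the 1000 nodes independently becomes a cluster head with probability 5% in every round. This ignores the rotating threshold that LEACH actually uses and the loss of dead nodes, so it is only a rough sketch; the function names and the seed are hypothetical.

```python
import math
import random


def expected_heads(n_nodes, p):
    """Mean and standard deviation of the head count under independent election."""
    return n_nodes * p, math.sqrt(n_nodes * p * (1 - p))


def simulate_head_counts(n_nodes, p, rounds, seed=0):
    """Draw the number of elected heads for several rounds with a fixed seed."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n_nodes)) for _ in range(rounds)]


mean, std = expected_heads(1000, 0.05)
counts = simulate_head_counts(1000, 0.05, rounds=20)
print(f"expected {mean:.0f} heads, standard deviation {std:.1f}")
print("simulated range:", min(counts), "to", max(counts))
```

Even in this idealized setting the head count fluctuates by several nodes around the expected value from round to round, which is consistent with the unstable counts described for the probability-based methods.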

As shown in Figure 6, two types of nodes support successful multihop transmission: cluster head nodes and repeaters. Of these, cluster head nodes, which have high energy consumption, account for 20%. Consequently, our method does not strongly affect the energy consumption of the network.

4.4. Residual Energy of the Network

Figure 7 shows the residual energy of each node, measured at the point in time when the first node in the network depleted its energy.

According to Figures 7(a), 7(b), 7(c), and 7(d), the other four methods show residual energy distributions at similar levels of about 1 to 2 J. According to Figure 7(e), however, our method shows markedly better results than the other methods.

4.5. Energy Depletion Balance

In a multihop-based network, the network lifetime can be determined by the lifetime of the nodes deployed one hop away from the sink. As shown in Figure 8, we divide the network into 3 parts (zones) to measure the degree of energy depletion among the nodes in each part. The energy hole problem typically occurs in the nodes deployed one hop away from the sink. Therefore, by comparing the energy consumption ratio of area no. 1 with that of the other areas, we can estimate the network lifetime, stability, and other factors.

Figure 9 shows the number of dead nodes that occur during the lifetime of the network in the existing methods and in the suggested method.

According to the graph, in LEACH more than 90% of nodes were dead within a few rounds in every zone, which causes link failure in multihop-based transmission.

However, in the other three schemes fewer than 20% of nodes died in each round, which means a low rate of link failure and network partitioning. In our scheme, dead nodes are uniformly distributed throughout the lifetime and fewer than 4% of nodes died. This means that, with high probability, the network can sustain more rounds without link failure, which correlates strongly with the longevity of the network.

4.6. Network Lifetime

Figure 10 shows the network lifetime measured for each protocol. LEACH, the earliest of these methods, shows the shortest network lifetime, and TEEN, a reactive network protocol, shows a lifetime almost double that of LEACH. APTEEN shows a lifetime a little shorter than that of TEEN. ARCT uses a data collection method different from the above three methods: it has fixed-period transmission like LEACH, but it decreases the node density of the data collection area. It therefore reduces the number of nodes that participate in actual data transmission and prolongs the lifetime of the network. ARCS, our method, has the same periodic transmission rate as LEACH. However, it decreases the density of active nodes through a different cluster formation method and thus prolongs the network lifespan.

5. Conclusion

This research dealt with the problems that arise, and the improvements that are needed, when applying clustering-based sensor networks to environmental monitoring.

We suggested a clustering technique that is effective for this application. According to the performance evaluation, the proposed clustering method, which uses regional clusters and repeaters, provides higher accuracy, a higher degree of connectivity, a lower error rate, and a longer network lifetime than existing methods.

However, our method is optimized for the characteristics of temperature data collection and for specific sensing area specifications. Research that first examines the characteristics and environment of the data to be collected is helpful for increasing network performance and longevity in the design of protocols for sensor networks that monitor the environment. We expect that our clustering method will be applied to diverse environmental monitoring applications.