Abstract

Collection of rare but delay-critical messages from a group of sensor nodes is a key process in many wireless sensor network applications. This is particularly important for security-related applications like intrusion detection and fire alarm systems. An event sensed by multiple sensor nodes in the network can trigger many messages to be sent simultaneously. We present Alert, a MAC protocol for collecting event-triggered urgent messages from a group of sensor nodes with minimum latency and without requiring any cooperation or prescheduling among the senders or between senders and receiver during protocol execution. Alert is designed to handle multiple simultaneous messages from different nodes efficiently and reliably, minimizing the overall delay to collect all messages along with the delay to get the first message. Moreover, the ability of the network to handle a large number of simultaneous messages does not come at the cost of excessive delays when only a few messages need to be handled. We analyze Alert and evaluate its feasibility and performance with an implementation on commodity hardware. We further compare Alert with existing approaches through simulations and show the performance improvement possible through Alert.

1. Introduction

With the transition of many automated tasks from the wired to the wireless domain, wireless sensor networks (WSNs) are increasingly being applied in new application domains. Applications of a critical nature have been the forte of wired networks due to their reliability. The ever-increasing reliability of WSNs, coupled with their cost-effectiveness, has led to their gradual adoption for such critical applications as well. The nature of such applications, however, requires new MAC protocols for WSNs that meet the requirements and inspire sufficient confidence for their usage.

The requirements for applications of a critical nature can be fundamentally different from those of the applications for which current MAC protocols are designed. For example, energy is a valuable resource in sensor devices, and most existing MAC protocols are optimized to conserve energy, trading off latency, throughput, and other similar performance metrics in the process. These same protocols are typically not suitable when the application demands better performance at the expense of some additional energy. If latency is to be minimized, with energy consumption only a secondary issue, protocols need to be redesigned from that application perspective.

In this paper, we consider applications that require all wireless sensors to convey urgent messages to a centralized base station (i.e., a single hop away) with minimum delay from the time they are generated. These messages are triggered by events detected by sensor nodes and their task is to inform the base station for possible action. Such messages are triggered very rarely, and the aim is to focus on minimizing latency when they are triggered, even if some additional energy is expended during those times. (We describe in Section 3.1 why energy is not a concern in this application scenario and can be ignored.) Intrusion detection and fire alarm applications are some examples which require such a solution. Even though the messages are typically correlated, the collection of all triggered messages as opposed to one of them provides valuable information which can be used for detection of false positives or postevent analysis. (When sensors cover a large area, only a subset of these nodes will detect events and trigger messages to be sent to the base station. We require that all messages generated due to event detection by this subset be reported.) For example, the European Standard EN 54-25 for fire alarm systems specifies the duration within which the first alarm should be reported and by when all alarms must be received at the base station [1]. The challenges in designing WSN MAC protocols for such applications are the handling of a number of simultaneous messages at the same time without knowledge of how many, and planning for possible interference. Additionally, it is also important to ensure implementation feasibility taking into account the additional constraints imposed on WSNs like time synchronization and limited computation and storage capabilities.

We present the Alert MAC protocol that is designed to minimize latency when collecting simultaneous urgent messages. Alert minimizes contention among nodes by using a combination of time and frequency multiplexing. Multiple frequency channels are used within time slots and contention is minimized by controlling the selection probability of each channel by the nodes. Note that in spite of the use of multiple channels, we assume the presence of only one transceiver in all nodes including the receiver. The important features of Alert are the following: (a) minimizes delay of collecting first message as well as all messages; (b) noncarrier sense protocol; it thus eliminates hidden terminal collision problems; (c) dynamic shifting of frequency channels to provide robustness against interference; (d) adaptive characteristic enables operation without knowledge of number of contending nodes.

We make the following additional contributions in this paper. (i) Theoretical justifications for the choice of different design parameters of Alert. Our analytical results are of a fundamental nature and should prove useful for the design of other MAC protocols as well. (ii) Detailed performance analysis of Alert by comparing it with other existing protocols through simulations. These evaluations examine the protocols from an implementation perspective and take into account low level details like degree of time synchronization available. This further provides great insight into the design criteria for event-driven MAC protocols that focus on minimizing latency. (iii) Demonstration of feasibility and validation of analytical results through an implementation on commodity hardware. This validation step inspires the necessary confidence to trust Alert with applications of critical nature.

The rest of this paper is organized as follows. Section 2 presents the design space of MAC protocols in Wireless Sensor Networks. Section 3 presents our Alert protocol and describes some of its features in more detail. Section 4 presents some theoretical considerations in the selection of design parameters for Alert and our analytical results. Section 5 describes an adaptive algorithm used with Alert to handle cases where the number of contending nodes is unknown. In Section 6, we demonstrate the feasibility of implementing Alert and validate our analytical results. We further compare Alert with two other existing MAC protocols to point out its advantages. Concluding remarks are made in Section 7.

2. Related Work

The related MAC protocols for wireless sensor networks can be mainly classified into contention-free, contention-based, and energy-saving protocols. Contention-free protocols are mainly the ones based on time-division multiple access (TDMA), where slots are assigned to each node by the base station and each node sends its message (if it has one) only during its assigned slot (e.g., GUARD [2]). Such TDMA-based protocols perform very poorly when the number of nodes contending is unknown or keeps varying. For the specific application targeted in this work, only a subset (of unknown size) of nodes may have events to report. This makes assigning slots to all nodes undesirable, as it leads to a significant increase in the delay to receive the first message. For example, consider the case of only one node having a message to report. The average delay incurred to collect this message would be half the TDMA cycle, with the worst-case delay being the whole cycle. TDMA schemes are useful in cases where most of the nodes have events to report, a rare case for our scenario. Other contention-free approaches, for example, frequency-division multiple access (FDMA) [3], face similar limitations as outlined above in terms of an unknown or varying number of nodes that makes scheduling difficult. The work by Chintalapudi and Venkatraman [4] designs a MAC protocol for low-latency application scenarios similar to those considered in this paper, but with some important differences. They assume multiple base stations while our solution requires only one base station. Also, in their work, many of the concepts are based on a TDMA schedule, which has the same limitations as pointed out above. Finally, their model assumes that each message generated has its own deadline. In our scenario, we are indifferent to the order in which messages are received as long as constraints are met on the latency to receive the first message as well as all the messages.

Contention-based protocols can be bifurcated into carrier sense multiple access- (CSMA-) based and non-CSMA-based protocols. The IEEE 802.11 and 802.15.4 protocols are examples of CSMA protocols, with the latter designed specifically for applications catered to by wireless sensor networks [5, 6]. They use a variable-sized contention window whose size is adjusted at each node based on the success of the node in sending its message, with each node picking a slot in this window using a uniform probability distribution. These protocols do a good job in handling scenarios with a small number of nodes but do not handle a large number of simultaneous messages well. For a detailed performance evaluation of contention window-based schemes and a description of the deficiencies of the IEEE 802.11 protocol, refer to [7]. The Sift protocol was designed to overcome these deficiencies for WSN applications which need to handle such a large number of event-driven, spatially correlated messages [8, 9]. Sift is also CSMA based but uses a fixed-size contention window. Nodes pick slots from a geometric probability distribution such that only a few nodes contend for the first few slots, and thus it handles a large number of messages easily. A variation based on replacing the uniform-distribution contention window of IEEE 802.11 with a p-persistent backoff protocol was presented in [10]. Protocols based on Aloha, on the other hand, do not sense the channel before transmission and rely on each node picking a slot to transmit on randomly, with the probability of transmission depending on the number of messages contending [11, 12]. When this number is not known, these protocols do not adapt well. In general, CSMA-based protocols outperform Aloha-based protocols when the propagation time between nodes is small enough to make carrier sense useful. When the relative effectiveness of a CSMA-based and a TDMA-based scheme is unknown, a hybrid MAC protocol like Z-MAC can be used to adapt between these protocol types based on prevailing conditions [13].

Our protocol, Alert, is similar to Sift in that it uses a similar nonuniform distribution to control contention among nodes, but is non-CSMA based. Alert separates message transmissions across different frequency channels with this distribution while Sift does it over different time slots. This allows Alert to be free of hidden terminal issues while Sift is susceptible. (Aside from the performance perspective, operations performed at the sender side in Alert are comparatively much simpler than Sift or any other CSMA-based protocol. So fewer resources (less memory and less computational power) are required for implementation of Alert at individual nodes. This makes Alert a more cost-effective solution.) Alert extends this distribution to handle different interference levels as well. While Sift is also optimized for unknown number of messages, the adaptiveness of Alert allows it to perform well even when the number of contending messages is small.

The topic of energy-saving MAC protocols has been well researched for wireless sensor networks in the last few years, primarily due to the limited energy supply in these devices [14–16]. A more general framework and survey of MAC protocols for wireless sensor networks can be found in [17]. Our work focuses on applications for which latency is the primary concern and the goal is to let all urgent messages reach a receiver node as soon as possible. Energy is a concern as well, but the rare occurrence of messages in these applications allows the MAC protocol to focus solely on the latency aspects. With such applications, energy efficiency should be built into more common tasks carried out by each node, like time synchronization. Some researchers have focused on reducing latency in the collection of messages at a sink node from source nodes through duty cycling approaches (e.g., [18, 19]). These types of low-latency protocols solve a fundamentally different problem, which is to sleep as often as possible while ensuring that messages reach the sink as fast as possible (with latencies in the order of seconds instead of milliseconds). Our work focuses on event-driven message generation where the receiver is always awaiting messages. The receiver is assumed to be wall powered or periodically rechargeable, and hence, its energy consumption is not an issue. The transmitters try to send messages when they have one, and the only latency that needs to be reduced is the one created due to contention between nodes trying to send messages at the same time.

In the preliminary version of this work in [20], we presented some of the contributions above in terms of the Alert protocol. In this paper, there is additional emphasis on our theoretical results. We provide a full mathematical proof of the optimal channel probabilities to use with Alert. Further, theoretical results on the success probability of sending a message in an Alert slot as the number of nodes tends to infinity are given. Additionally, the optimal probability distributions for different optimization objectives are compared to each other.

3. Protocol Description

We present details of our Alert protocol in this section along with a description of the associated design parameters. Theoretical considerations for selecting values for these design parameters will be presented in the following section.

3.1. Alert Protocol Concepts

The protocol actions can be divided into those at a sender, a node that has a message to send, and a receiver, a node whose task is to collect messages from senders. The time is slotted into what we refer to as an Alert slot or simply a time slot. (We assume all the nodes in the network are time synchronized with each other. We use periodic broadcast beacons from our base station in our implementation in Section 6.) Each Alert slot can be used to exchange one data packet and its acknowledgment between a sender-receiver pair.

In each time slot, multiple frequency channels can be used by the senders or receiver. We denote the number of frequency channels in each slot by 𝑀. These channels have different priorities. (The topic of how priorities are calculated and assigned forms the basis for most of the analysis later in the paper.) The receiver samples them one by one based on their priority level and tries to receive a packet from one of the senders.

Each sender selects a frequency channel randomly and independently of all other senders, but the channels are not selected with equal probability. Less chance is given to select a higher-priority channel, that is, the selection probability decreases as we move toward higher priority channels. An example is shown in Figure 1 with 𝑀=3 channels and channel selection distribution (𝑝1,𝑝2,𝑝3)=(0.1,0.3,0.6). This nonuniform distribution is prespecified and known to all senders. It is designed to reduce the chance of collision among the senders. We discuss its effect and how to find the optimum distribution in detail in subsequent sections.

Once a sender selects a frequency channel randomly based on the prespecified channel selection probabilities, it switches to the selected frequency and sends a long preamble before sending its data packet. (Note that the diagrams in Figures 1 and 2 do not represent the actual timing scales within a time slot. To present the idea, the sampling and preamble durations are shown much longer than their actual value compared to the packet and ack exchange duration.) After the data packet, the node expects an acknowledgment packet (ack) from the receiver. If the ack packet is received correctly, the sender stops, otherwise, it tries to send its message again in the next time slot.

At the beginning of each time slot, the receiver samples the signal level on each of the 𝑀 frequency channels, starting with the highest priority channel. If a high signal level (high RSSI, where RSSI stands for received signal strength indicator) is sensed by the receiver, it stays on the same frequency (locking to that channel) and stops sampling any more channels. Then, the receiver waits to receive a packet. If a packet is received correctly, it sends an acknowledgment packet back in response; otherwise, after some fixed timeout period, the receiver stops and continues to the next Alert slot. If the sensed high signal on a channel is due to the simultaneous transmission of preambles by more than one sender, then it is very likely that the received packets are corrupted. (Note that a packet may still be received correctly due to the capture effect.) If the high signal is due to interference or noise, a packet never arrives from any node, and the receiver simply repeats the procedure in the following slot.

Note that a transmitter is not aware if a receiver has successfully “locked” onto its selected frequency and will thus always transmit the packet even if the receiver is waiting on some other channel. For the applications under consideration, a sender has a message to send very infrequently. For example, a typical system may encounter a situation that requires sending fire alarms only once or twice a year. The rarity of alarms allows for greater effort to be put into reducing latency even at the cost of some extra energy. The receiver, or centralized base station, is typically wall powered (AC-outlet) and its energy consumption is not an issue as mentioned in Section 2. This allows the focus on reducing contention among nodes reporting messages without worrying about receiver wakeup schedules.

While the number of channels, $M$, remains the same, the frequency of the channels (and their priority) can change across time slots. This is illustrated in Figure 2. The frequency table shown in this example is from a simple expression where we have 16 channels numbered from 0 to 15 and $f_m(k)$ represents the $m$th frequency channel in the $k$th Alert slot:
\[
f_m(k) = \bigl[\,5k + 9(m-1)\,\bigr] \bmod 16, \quad \text{for } m = 1,2,3. \tag{1}
\]

We assume that the frequency table is prespecified, and all the nodes in the network know this pattern. Varying frequency channels after each time slot increases the reliability and robustness against channel fading and interference.
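As a quick illustration of this hopping pattern, the short Python sketch below simply tabulates the channel indices produced by (1) for the first few Alert slots; the constants (16 channels, 𝑀 = 3) are the ones used in this example.

```python
def channel(m: int, k: int) -> int:
    """Frequency index of the m-th priority channel (m = 1, 2, 3) in Alert slot k, per (1)."""
    return (5 * k + 9 * (m - 1)) % 16

for k in range(4):
    print(f"slot {k}: " + ", ".join(str(channel(m, k)) for m in (1, 2, 3)))
# slot 0: 0, 9, 2
# slot 1: 5, 14, 7
# ...
```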

A summary of the protocol design space with all our assumptions is provided in Table 1 for convenience.

3.2. Collision Avoidance

Alert is designed to avoid collision among senders such that in most time slots (with high probability), one message is received correctly. Therefore, the protocol can collect messages from all senders in as few time slots as possible. If there is only one sender, there will be no collision and the frequency channel selected by the sender does not matter as the receiver will find and lock to the frequency picked by the sender. If two nodes are contending to send their messages, the node that selects the higher priority channel will be successful since the receiver will hear its preamble/tone first and stay in the channel awaiting its packet. If both select the same channel, a collision occurs. The channel selection probabilities are designed to reduce the chance of collision. As the number of senders increases, it becomes more probable that one of the senders selects a higher priority channel, and if only one node selects the higher priority channel, the message from that sender will be successfully received. The illustration in Figure 3 shows three example cases where there are different number of senders.
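To make this collision-avoidance rule concrete, the Python sketch below simulates a single Alert slot under the idealized model of this section (no capture effect); the channel count, probabilities, and interference level are illustrative example values, not parameters prescribed by the protocol.

```python
import random

def simulate_slot(n_senders, channel_probs, q_no_interference, rng=random.Random(1)):
    """Simulate one Alert slot under the idealized model (no capture effect).

    Each sender independently picks a channel from channel_probs
    (index 0 = highest priority). The receiver scans channels in priority
    order and locks onto the first busy one (occupied by a sender or hit
    by interference). The slot succeeds only if that first busy channel
    carries exactly one sender and is interference-free.
    """
    M = len(channel_probs)
    counts = [0] * M
    for _ in range(n_senders):            # each sender picks one channel
        r, acc = rng.random(), 0.0
        for ch, p in enumerate(channel_probs):
            acc += p
            if r < acc:
                counts[ch] += 1
                break
    interfered = [rng.random() > q_no_interference for _ in range(M)]
    for ch in range(M):                    # receiver scans in priority order
        if interfered[ch]:
            return False                   # locks onto noise, slot wasted
        if counts[ch] > 0:
            return counts[ch] == 1         # success iff a single sender here
    return False                           # no sender transmitted at all

# Rough estimate of the per-slot success probability for 10 contenders.
probs = (0.1, 0.3, 0.6)                    # same shape as the Figure 1 example
trials = 100_000
wins = sum(simulate_slot(10, probs, 0.95) for _ in range(trials))
print(f"estimated per-slot success probability: {wins / trials:.3f}")
```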

3.3. Design Parameters

Based on the protocol description, it should be clear that the two key design parameters are the number of frequency channels to use in the protocol and the probability distribution over the channels that minimizes the overall time to read all messages. A larger number of channels should decrease contention among the nodes that have messages to send. But this increases the size of a time slot and results in fewer time slots within a given period of time, exposing a tradeoff. The channel probability distribution controls the contention among the nodes. When the number of nodes with messages is large, assigning small probabilities to the higher priority channels ensures lower contention for those channels. This increases the chance that only one node chooses that channel. On the other hand, when the number of simultaneous messages is small, that is, the load is low, assigning small probabilities to the higher priority channels could lead to under utilization of these channels and higher utilization on the lower priority channels resulting in collisions and an increase in the overall latency. In the following section, we construct a theoretical framework to select these design parameters to optimize the performance of our protocol.

For final deployment, each node should be preloaded with information about the number of channels that are to be used in a slot and the probability distribution from which to select them. These should be designed taking into account the application under consideration which might give some information about the expected number of messages and interference levels it should be able to handle.

4. Analysis and Design

For selection of the right design parameters to use in the Alert protocol to minimize latency, we need to consider the probability that a message is sent successfully in a single slot and extend that to quantify the number of slots that will be required to read all outstanding messages. We begin our analysis by obtaining an expression for the success probability of a single slot.

Let 𝐩=(𝑝1,𝑝2,,𝑝𝑀) represent a row vector of channel probabilities corresponding to each channel 1,2,,𝑀. In this section, we assume that the same channel probabilities 𝐩 will be used by each node in all slots; that is, 𝐩 is not updated during the protocol execution. This allows for a simpler implementation. Also, the results of this section can be built upon by more sophisticated approaches (we show one such approach in the next section).

4.1. Probability of Success in a Single Slot

We assume $n$ nodes are contending to send their message across in the slot (each node is assumed to have only one message to send). Each node selects one of the $M$ channels based on the probability distribution $\mathbf{p}=(p_1,p_2,\dots,p_M)$. Let random variables $X_m$ (for $m=1,\dots,M$) denote the number of nodes deciding to transmit on channel $m$. The variables $(X_1,\dots,X_M)$ will have the joint multinomial distribution
\[
(X_1,\dots,X_M) \sim \mathrm{Multinomial}\bigl(n,\,p_1,\dots,p_M\bigr), \tag{2}
\]
\[
\mathbb{P}\bigl(X_1=x_1,\dots,X_M=x_M\bigr) =
\begin{cases}
\dfrac{n!}{x_1!\cdots x_M!}\,p_1^{x_1}\cdots p_M^{x_M}, & \text{if } \sum_{m=1}^{M} x_m = n,\\[4pt]
0, & \text{otherwise},
\end{cases} \tag{3}
\]
for nonnegative integers $x_1,\dots,x_M$.

In order to model interference, we use an indicator random variable $Y_m$ to specify whether channel $m$ has been interfered with or not, if and when the receiver samples the channel. We assume the $Y_m$'s are independent and have the Bernoulli distribution
\[
Y_m \sim \mathrm{Bernoulli}(1-Q), \tag{4}
\]
that is,
\[
\mathbb{P}\bigl(Y_m = y\bigr) =
\begin{cases}
Q, & \text{if } y = 0\ (\text{no interference}),\\
1-Q, & \text{if } y = 1.
\end{cases} \tag{5}
\]

Parameter $Q$ represents the probability that a channel sees no interference. The value of $Q=1$ corresponds to the ideal case of no interference on any channel. The assumption of independent interference on different channels is strengthened by the periodic switching of channels by all nodes in the Alert protocol. (Our analysis can be easily extended to consider a correlated model for interference on all channels.)

Assume that 𝑐 is a non-idle channel that is selected by at least one of the nodes and node 𝑢 is transmitting on this channel. If any other node had transmitted (or interference had caused a high signal) on another channel before 𝑐 (in order of priority), the receiver would never have been waiting on 𝑐 to receive 𝑢’s packet. If another node 𝑣 also selects channel 𝑐, the message from these two nodes will collide at the receiver. Ignoring the possibility of capture effect, the collision results in a slot failure with no message succeeding in the whole slot since the receiver will remain on channel 𝑐 for the whole slot. Thus, 𝑢 succeeds on channel 𝑐 if it is the only node transmitting on that channel (with no interference as well). So we have the following rule: if there is only one node transmitting in the first non-idle channel (in order of priority), an Alert time slot will be successful.

So a time slot is successful if, for some value of $m$ ($m$ between 1 and $M$), only one node selects the $m$th channel (event $\mathcal{E}_1=\{X_m=1\}$), no node selects any of the first $(m-1)$ higher-priority channels (event $\mathcal{E}_2=\{X_1=\cdots=X_{m-1}=0\}$), and there is no interference on any of the first $m$ channels, that is, $\mathrm{Int}=\{Y_1=\cdots=Y_m=0\}$. So the probability of successful message delivery in a slot is
\[
\mathcal{P}_n(\mathbf{p}) = \sum_{m=1}^{M}\mathbb{P}\bigl(\mathcal{E}_1\cap\mathcal{E}_2\cap\mathrm{Int}\bigr), \tag{6}
\]
where we assume that interference is independent of the activities of the nodes, and $\mathbb{P}(\mathrm{Int})=\mathbb{P}(Y_1=\cdots=Y_m=0)=Q^m$. Combining this with (3), we find the probability of successful message delivery in a slot for a given $Q$ and $M$, for a known number of senders $n$ and channel probability distribution $\mathbf{p}=(p_1,\dots,p_M)$, as
\[
\mathcal{P}_n(\mathbf{p}) = n\sum_{m=1}^{M} p_m\,Q^{m}\Bigl(1-\sum_{k=1}^{m}p_k\Bigr)^{n-1}. \tag{7}
\]
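For reference, (7) translates directly into a few lines of Python; the parameter values in the example call are illustrative.

```python
def slot_success_prob(p, n, Q):
    """P_n(p) from (7): probability that exactly one sender occupies the
    first non-idle channel and no interference hits channels 1..m."""
    total, cum = 0.0, 0.0
    for m, pm in enumerate(p, start=1):
        cum += pm
        total += pm * (Q ** m) * (1.0 - cum) ** (n - 1)
    return n * total

# Example with illustrative values (M = 3 channels, 5% interference).
print(slot_success_prob((0.1, 0.3, 0.6), n=10, Q=0.95))
```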

4.2. Number of Time Slots

We use random variable 𝑇𝑛 to denote the number of time slots required to collect 𝑛 messages. Next, we find the distribution of 𝑇𝑛 for a given probability distribution 𝐩.

If we start with $n$ nodes, then in the first time slot, with probability $\mathcal{P}_n$, one message is successfully received and $(n-1)$ nodes remain to go in the next time slots, or with probability $(1-\mathcal{P}_n)$ no node is successful and we still have $n$ messages left. Hence, we can write the following recursive expression:
\[
\xi(n,k) = \mathcal{P}_n\,\xi(n-1,\,k-1) + \bigl(1-\mathcal{P}_n\bigr)\,\xi(n,\,k-1), \tag{8}
\]
where $\xi(n,k)\triangleq\mathbb{P}(T_n=k)$ is defined as the probability that it takes $k$ time slots to collect all messages from $n$ nodes. We can solve (8) numerically using the following initial conditions: $\xi(0,i)=\xi(i,0)=0$ for all $i=1,2,\dots$, and $\xi(0,0)=1$.

We define the moment generating function (MGF) of $T_n$ as
\[
\Phi_n(z) = \mathbb{E}\bigl(z^{T_n}\bigr) = \sum_{k=0}^{\infty}\mathbb{P}\bigl(T_n=k\bigr)\,z^{k}; \tag{9}
\]
then from (8), we get
\[
\Phi_n(z) = \prod_{k=1}^{n}\frac{\mathcal{P}_k\,z}{1-\bigl(1-\mathcal{P}_k\bigr)z}. \tag{10}
\]

This shows that the random variable $T_n$ is the sum of $n$ independent geometrically distributed random variables with parameters $\mathcal{P}_1,\dots,\mathcal{P}_n$, that is,
\[
T_n \sim \sum_{k=1}^{n}\mathrm{Geom}\bigl(\mathcal{P}_k\bigr), \tag{11}
\]
where the geometric distribution ($X\sim\mathrm{Geom}(\alpha)$) describes the number of Bernoulli trials needed to get the first success and is defined by
\[
\mathbb{P}(X=k) = \alpha(1-\alpha)^{k-1}\ \text{for }k=1,2,\dots,\qquad
\mathbb{E}(X)=\frac{1}{\alpha},\qquad
\mathrm{Var}(X)=\frac{1-\alpha}{\alpha^{2}},\qquad
\Phi_X(z)=\mathbb{E}\bigl(z^{X}\bigr)=\frac{\alpha z}{1-(1-\alpha)z}. \tag{12}
\]

So from (11), we have
\[
\mathbb{E}\bigl(T_n\bigr) = \sum_{k=1}^{n}\frac{1}{\mathcal{P}_k},\qquad
\mathrm{Var}\bigl(T_n\bigr) = \sum_{k=1}^{n}\frac{1-\mathcal{P}_k}{\mathcal{P}_k^{2}}. \tag{13}
\]

There is a simple intuitive explanation behind the distribution in (11): when we start with $n$ nodes, the chance of success in each slot is $\mathcal{P}_n$, so it takes $\mathrm{Geom}(\mathcal{P}_n)$ trials/slots for the first message to go through. With the first message received, there remain $(n-1)$ senders/messages, so the second message requires an extra $\mathrm{Geom}(\mathcal{P}_{n-1})$ slots. So, in general, the $k$th message requires $\mathrm{Geom}(\mathcal{P}_{n-(k-1)})$ slots.
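The expected number of slots in (13) is equally direct to compute; the sketch below reuses (7) for each $\mathcal{P}_k$, again with illustrative parameter values.

```python
def expected_slots(p, n, Q):
    """E[T_n] from (13): sum over k = 1..n of 1 / P_k, where P_k follows (7)."""
    def P(k):
        total, cum = 0.0, 0.0
        for m, pm in enumerate(p, start=1):
            cum += pm
            total += pm * (Q ** m) * (1.0 - cum) ** (k - 1)
        return k * total
    return sum(1.0 / P(k) for k in range(1, n + 1))

# Illustrative check: slots needed to drain 15 messages over 3 channels.
print(round(expected_slots((0.1, 0.3, 0.6), n=15, Q=0.95), 2))
```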

4.3. Optimum Probability Distribution

The optimum distribution 𝐩 depends on the available information about number of nodes and interference level, and the performance metric that is optimized. In this subsection, we present different methods to find the distribution 𝐩 and rationale behind each case.

4.3.1. Maximizing 𝒫𝑛 for a Given 𝑛=𝑁

If we know (or estimate) the number of senders to be $N$, we can select $\mathbf{p}$ such that the probability of successful message delivery in one time slot is maximized for the given value of $n=N$. By maximizing $\mathcal{P}_N$, we guarantee that the average delay of receiving the first message is minimized when $N$ nodes are contending to send. The number of time slots needed to successfully receive the first message is a random variable with geometric distribution and mean $1/\mathcal{P}_N$. The optimum distribution $\mathbf{p}$ in this case can be found using the following recursive expression:
\[
p_{M-m} = \frac{Q-\gamma_m}{NQ-\gamma_m}\Bigl(1-\sum_{k=1}^{M-m-1}p_k\Bigr),\quad m=1,\dots,M-1, \tag{14}
\]
with $p_M = 1-\sum_{k=1}^{M-1}p_k$, where $\gamma_1=0$, and for $m=2,\dots,M-1$, we have
\[
\gamma_m = Q^{N+1}\Bigl(\frac{N-1}{NQ-\gamma_{m-1}}\Bigr)^{N-1}. \tag{15}
\]
Starting from $m=M-1$ (which determines $p_1$) and moving down to $m=1$, these relations yield $p_1, p_2,\dots,p_{M-1}$ in turn.

This case is very similar to the results in Sift [8] and our result is a generalization of their solution with addition of the effect of interference through parameter 𝑄. The proof is given by the following lemma.

Lemma 1. Let $\gamma_1=0$ and $\gamma_m = Q^{N+1}\bigl[(N-1)/(NQ-\gamma_{m-1})\bigr]^{N-1}$. Then, given a probability distribution $\mathbf{p}$, if $(\partial/\partial p_j)\bigl(\Pi_{\mathbf{p}}(N)/N\bigr)=0$ for $j=1,\dots,M-1$, then
\[
\bigl(NQ-\gamma_i\bigr)\,p_{M-i} = \bigl(Q-\gamma_i\bigr)\Bigl(1-\sum_{m=1}^{M-(i+1)}p_m\Bigr),\quad i=1,\dots,M-1. \tag{16}
\]

Proof. We have the success probability (writing $\Pi_{\mathbf{p}}(N)$ for $\mathcal{P}_N(\mathbf{p})$ of (7)):
\[
\Pi_{\mathbf{p}}(N) = N \sum_{s=1}^{M-1} p_s \Bigl(1 - \sum_{m=1}^{s} p_m\Bigr)^{N-1} Q^{s}. \tag{17}
\]
Now,
\[
\frac{\partial}{\partial p_j}\,\frac{\Pi_{\mathbf{p}}(N)}{N}
= \Bigl(1 - \sum_{r=1}^{j} p_r\Bigr)^{N-1} Q^{j}
- \sum_{s=j}^{M-1} p_s (N-1)\Bigl(1 - \sum_{m=1}^{s} p_m\Bigr)^{N-2} Q^{s}. \tag{18}
\]
Equating the left-hand side to 0, we have
\[
(N-1)\sum_{s=j}^{M-1} p_s \Bigl(1 - \sum_{r=1}^{s} p_r\Bigr)^{N-2} Q^{s}
= \Bigl(1 - \sum_{r=1}^{j} p_r\Bigr)^{N-1} Q^{j}. \tag{19}
\]
Now, we will use induction to prove (16).
For $i=1$, set $j = M-1$ in (19) to get
\[
(N-1)\,p_{M-1}\Bigl(1 - \sum_{r=1}^{M-1} p_r\Bigr)^{N-2} Q^{M-1}
= \Bigl(1 - \sum_{r=1}^{M-1} p_r\Bigr)^{N-1} Q^{M-1}, \tag{20}
\]
which can be simplified as $N p_{M-1} = 1 - \sum_{r=1}^{M-2} p_r$, which proves Lemma 1 for $i=1$ since $\gamma_1 = 0$.
Next, assume Lemma 1 is true for $i = l$. That is,
\[
\bigl(NQ - \gamma_l\bigr)\,p_{M-l} = \bigl(Q - \gamma_l\bigr)\Bigl(1 - \sum_{m=1}^{M-(l+1)} p_m\Bigr), \tag{21}
\]
which can be rearranged as
\[
(N-1)\,Q\,p_{M-l} = \bigl(Q - \gamma_l\bigr)\Bigl(1 - \sum_{m=1}^{M-l} p_m\Bigr). \tag{22}
\]
Now, we set $j = M-(l+1)$ in (19), which gives (along with splitting the left-hand side into two terms)
\[
(N-1)\,p_{M-(l+1)}\Bigl(1-\sum_{m=1}^{M-(l+1)} p_m\Bigr)^{N-2} Q^{M-(l+1)}
+ (N-1)\sum_{s=M-l}^{M-1} p_s \Bigl(1-\sum_{m=1}^{s} p_m\Bigr)^{N-2} Q^{s}
= \Bigl(1-\sum_{m=1}^{M-(l+1)} p_m\Bigr)^{N-1} Q^{M-(l+1)}. \tag{23}
\]
Using (19) (with $j = M-l$), we can write the above equation as
\[
(N-1)\,p_{M-(l+1)}\Bigl(1-\sum_{m=1}^{M-(l+1)} p_m\Bigr)^{N-2} Q^{M-(l+1)}
+ \Bigl(1-\sum_{m=1}^{M-l} p_m\Bigr)^{N-1} Q^{M-l}
= \Bigl(1-\sum_{m=1}^{M-(l+1)} p_m\Bigr)^{N-1} Q^{M-(l+1)}. \tag{24}
\]
It follows from (21) and (22) that
\[
(N-1)\,p_{M-(l+1)}\Bigl(\frac{NQ-\gamma_l}{Q-\gamma_l}\Bigr)^{N-2} p_{M-l}^{\,N-2}\, Q^{M-(l+1)}
+ p_{M-l}^{\,N-1}\, Q^{N-1}\Bigl(\frac{N-1}{Q-\gamma_l}\Bigr)^{N-1} Q^{M-l}
= \Bigl(\frac{NQ-\gamma_l}{Q-\gamma_l}\Bigr)^{N-1} Q^{M-(l+1)}\, p_{M-l}^{\,N-1}. \tag{25}
\]
Dividing through by $p_{M-l}^{\,N-2}\,Q^{M-(l+1)}\,\bigl[(NQ-\gamma_l)/(Q-\gamma_l)\bigr]^{N-2}$ and simplifying eventually gives
\[
(N-1)\,p_{M-(l+1)}
+ \Bigl(\frac{N-1}{NQ-\gamma_l}\Bigr)^{N-1} Q^{N+1}\,\frac{NQ-\gamma_l}{Q-\gamma_l}\,\frac{p_{M-l}}{Q}
= \frac{NQ-\gamma_l}{Q-\gamma_l}\, p_{M-l}. \tag{26}
\]
But from Lemma 1, $\gamma_{l+1} = \bigl[(N-1)/(NQ-\gamma_l)\bigr]^{N-1} Q^{N+1}$. Thus, with some rearranging, we get
\[
(N-1)\,p_{M-(l+1)} = \Bigl(1 - \frac{\gamma_{l+1}}{Q}\Bigr)\frac{NQ-\gamma_l}{Q-\gamma_l}\, p_{M-l} \tag{27}
\]
or
\[
(N-1)\,Q\,p_{M-(l+1)} = \bigl(Q - \gamma_{l+1}\bigr)\,\frac{NQ-\gamma_l}{Q-\gamma_l}\, p_{M-l}. \tag{28}
\]
From (21), we have
\[
\frac{NQ-\gamma_l}{Q-\gamma_l}\, p_{M-l} = 1 - \sum_{m=1}^{M-(l+1)} p_m. \tag{29}
\]
Thus, (28) becomes
\[
(N-1)\,Q\,p_{M-(l+1)} = \bigl(Q-\gamma_{l+1}\bigr)\Bigl(1 - \sum_{m=1}^{M-(l+1)} p_m\Bigr), \tag{30}
\]
which is the same as (22) for $l+1$ instead of $l$. Thus, by rearranging, we can get (21) for $l+1$ as well. That is,
\[
\bigl(NQ - \gamma_{l+1}\bigr)\,p_{M-(l+1)} = \bigl(Q - \gamma_{l+1}\bigr)\Bigl(1 - \sum_{m=1}^{M-(l+2)} p_m\Bigr). \tag{31}
\]
Thus, the lemma is true for $i = l+1$ as well, completing the proof by induction.

Note that for $N=1$, the optimal probabilities are $p_1=1$ and $p_m=0$ for all $m=2,\dots,M$, since the probability of success on a later channel is smaller, as it requires all the previous (higher priority) channels to be interference free as well. With these probabilities, the probability of success for $N=1$ becomes $p_1\times Q = Q$.
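A Python sketch of the recursion (14)-(15), including the special case for $N = 1$ noted above, is given below; the example values are illustrative.

```python
def optimal_probs(N, M, Q):
    """Channel probabilities maximizing the single-slot success probability
    for N contenders, following the recursion in (14)-(15).

    gamma[1] = 0 and gamma[m] = Q**(N+1) * ((N-1)/(N*Q - gamma[m-1]))**(N-1);
    channel j (priority order) then gets
        p_j = (Q - gamma[M-j]) / (N*Q - gamma[M-j]) * (1 - p_1 - ... - p_{j-1}),
    and the lowest-priority channel absorbs the remainder.
    """
    if N == 1:
        return [1.0] + [0.0] * (M - 1)   # all mass on the highest-priority channel
    gamma = [0.0] * M                    # gamma[1..M-1]; index 0 unused
    for m in range(2, M):
        gamma[m] = Q ** (N + 1) * ((N - 1) / (N * Q - gamma[m - 1])) ** (N - 1)
    p, cum = [], 0.0
    for j in range(1, M):
        g = gamma[M - j]
        pj = (Q - g) / (N * Q - g) * (1.0 - cum)
        p.append(pj)
        cum += pj
    p.append(1.0 - cum)                  # p_M takes whatever probability is left
    return p

print([round(x, 3) for x in optimal_probs(N=2, M=3, Q=0.5)])   # ~[0.429, 0.286, 0.286]
```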

4.3.2. Minimizing 𝔼(𝑇𝑛) for a Given 𝑛=𝑁

Here, we find the probability distribution $\mathbf{p}$ such that the expected number of time slots needed to collect from $n=N$ senders is minimized. In this way, we guarantee that all messages are collected in as few time slots as possible. The optimization problem can be solved by using numerical methods and gradient descent techniques [21]. The gradient of $\mathbb{E}(T_n)$ is
\[
\nabla\mathbb{E}\bigl(T_n\bigr) = \Bigl[\frac{\partial\mathbb{E}(T_n)}{\partial p_1}\ \ \frac{\partial\mathbb{E}(T_n)}{\partial p_2}\ \cdots\ \frac{\partial\mathbb{E}(T_n)}{\partial p_{M-1}}\Bigr]^{T}, \tag{32}
\]
and we have
\[
\frac{\partial\mathbb{E}(T_n)}{\partial p_j} = -\sum_{k=1}^{n}\frac{1}{\mathcal{P}_k^{2}}\,\frac{\partial\mathcal{P}_k}{\partial p_j},\qquad
\frac{\partial\mathcal{P}_n}{\partial p_j} = n\,Q^{j}\bigl(1-s_j\bigr)^{n-1} - \sum_{k=j}^{M-1} n(n-1)\,Q^{k} p_k\bigl(1-s_k\bigr)^{n-2}, \tag{33}
\]
for $n\ge 2$ and $s_m\triangleq\sum_{k=1}^{m}p_k$. For $n=1$, we have
\[
\frac{\partial\mathcal{P}_1}{\partial p_j} = Q^{j} - Q^{M}. \tag{34}
\]

4.3.3. Maximizing $\min\mathcal{P}_n$ for a Given Range of $n \le N$

If we do not know the number of nodes in the system, we can select the distribution $\mathbf{p}$ such that the probability of successful message delivery in a time slot is high across a given range. Essentially, we try to find the solution which maximizes the minimum value of $\mathcal{P}_n$ in the given range, that is,
\[
\max_{\mathbf{p}}\ \min_{1\le n\le N}\mathcal{P}_n. \tag{35}
\]
We used the subgradient method [21] to solve this optimization problem numerically.

4.4. Comparison of Optimal Distributions

We will present numerical examples and compare the different methods introduced in the previous subsection used to select the probability distribution 𝐩.

Figure 4 shows the success probability of an Alert slot, $\mathcal{P}_n$, as a function of the number of senders, $n$, for different values of $N$, with $M=3$ channels and $Q=0.95$. In Figure 4(a) (case 1), the probability distribution $\mathbf{p}$ is calculated to maximize $\mathcal{P}_N$. In Figure 4(b) (case 2), the probability distribution $\mathbf{p}$ is found to minimize the expected number of time slots required to receive from $N$ nodes. Figure 4(c) (case 3) corresponds to the case where $\mathbf{p}$ is found such that the minimum of $\mathcal{P}_n$ over the range $1\le n\le N$ is maximized.

Note that there are two variables representing number of nodes or senders: 𝑁 denoting the estimated number of senders based on which the distribution 𝐩 is calculated and 𝑛 which is the actual number of senders (the horizontal axis).

For the first case (Figure 4(a)) the probability of success is maximized if the estimated 𝑛 is correct, that is, for 𝑛=𝑁, but for smaller values of 𝑛<𝑁, the probability of success decreases. So for this case, the delay of getting the first message is smallest. The second case on the other hand puts emphasis on collecting all messages as fast as possible. The third case provides a more flat 𝒫𝑛 for all values.

Figure 5 shows the average number of time slots required to receive the first message or all the messages in the network for the three cases. The results are calculated for $N=50$ messages, with $M=3$ channels, and $Q=0.95$ (interference probability of 5%). For the correct number of senders ($n\approx N$), we see that case 2 (minimizing $\mathbb{E}(T_N)$) achieves the smallest number of time slots (on average) to collect all messages; case 3 follows closely, and then we have case 1. This is expected, as the optimization parameter in case 2 is the delay to collect from all nodes. Case 1 (maximizing $\mathcal{P}_N$) gives the best delay for collecting the first message at $n\approx N$ and for $n>N$, but for smaller values $n<N$, the delay of the first message and the overall delay to collect all messages are worse than in the other two cases. Case 3 (maximizing $\min_{1\le n\le N}\mathcal{P}_n$) gives an overall good performance between the two other cases and tries to keep the delay of collecting all messages and the delay of the first message low for the whole range $1\le n\le N$.

4.4.1. Asymptotic Limit of $\mathcal{P}_n$ for $n\to\infty$

It is interesting to see how the probability of successful message delivery in a time slot scales as $n\to\infty$. If we fix the probability distribution $\mathbf{p}$, it is easy to see that $\lim_{n\to\infty}\mathcal{P}_n = 0$. However, if we let the distribution $\mathbf{p}$ change with $n$, we can get nonzero limits for the probability of success. These limits represent the asymptotic bounds on the performance of Alert. The distribution $\mathbf{p}_n=(p_1^{(n)},\dots,p_M^{(n)})$ which gives a nonzero $\lim_{n\to\infty}\mathcal{P}_n(\mathbf{p}_n)$ should have the following form: $p_m^{(n)}=\alpha_m/n$ for $m=1,\dots,M-1$, and $p_M^{(n)}=1-\sum_{k=1}^{M-1}p_k^{(n)}$, which gives
\[
\mathcal{P}^{\infty}\triangleq\lim_{n\to\infty}\mathcal{P}_n(\mathbf{p}_n)
= \sum_{m=1}^{M-1}\alpha_m Q^{m}\exp\Bigl(-\sum_{k=1}^{m}\alpha_k\Bigr). \tag{36}
\]

We claim that the values of $\alpha_m$ which maximize $\mathcal{P}^{\infty}$ can be found from the recurrence $\alpha_m = 1 - Q e^{-\alpha_{m+1}}$ for $m=1,\dots,M-2$, with $\alpha_{M-1}=1$. These values give the following asymptotic (upper) bound on the probability of successful message delivery in a time slot:
\[
\mathcal{P}^{\infty} = Q e^{-\alpha_1}
= \frac{Q}{e}\,\exp\biggl(\frac{Q}{e}\exp\Bigl(\frac{Q}{e}\exp\bigl(\cdots\tfrac{Q}{e}\bigr)\Bigr)\biggr), \tag{37}
\]
where the factor $Q/e$ appears $(M-2)$ times in the nested exponentials.

The proof of above result can be obtained as follows.

From (7), for $n>1$, we have
\[
\mathcal{P}_n = n\sum_{m=1}^{M-1}p_m\,Q^{m}\Bigl(1-\sum_{k=1}^{m}p_k\Bigr)^{n-1}. \tag{38}
\]

If $\mathbf{p}$ is fixed and $p_1>0$, then the limit of $\mathcal{P}_n$ as $n\to\infty$ will be
\[
\lim_{n\to\infty}\mathcal{P}_n
= \sum_{m=1}^{M-1}p_m\,Q^{m}\,\lim_{n\to\infty}\,n\Bigl(1-\sum_{k=1}^{m}p_k\Bigr)^{n-1} = 0. \tag{39}
\]

This follows from the fact that
\[
\lim_{n\to\infty}\,n\Bigl(1-\sum_{k=1}^{m}p_k\Bigr)^{n-1} = 0\quad\text{for all } m=1,2,\dots,M-1. \tag{40}
\]

Note that with $p_1>0$, it is guaranteed that $\bigl(1-\sum_{k=1}^{m}p_k\bigr)<1$ for all $m=1,\dots,M-1$.

In order to get a nonzero limit, we need to let the vector $\mathbf{p}$ be a function of $n$ as well. The form $p_m=\alpha_m/n$ gives a simple expression for $\mathcal{P}_n$ which has a nonzero limit:
\[
\mathcal{P}_n = \sum_{m=1}^{M-1}\alpha_m Q^{m}\Bigl(1-\sum_{k=1}^{m}\frac{\alpha_k}{n}\Bigr)^{n-1}, \tag{41}
\]
with the following limit as $n\to\infty$:
\[
\mathcal{P}^{\infty}\triangleq\lim_{n\to\infty}\mathcal{P}_n
= \sum_{m=1}^{M-1}\alpha_m Q^{m}\lim_{n\to\infty}\Bigl(1-\sum_{k=1}^{m}\frac{\alpha_k}{n}\Bigr)^{n-1}
= \sum_{m=1}^{M-1}\alpha_m Q^{m}\,e^{-\sum_{k=1}^{m}\alpha_k}. \tag{42}
\]

The values of $\alpha_j$ which maximize $\mathcal{P}^{\infty}$ should satisfy
\[
\frac{\partial\mathcal{P}^{\infty}}{\partial\alpha_j} = 0,\quad\text{for } j=1,2,\dots,M-1. \tag{43}
\]
We have
\[
\frac{\partial\mathcal{P}^{\infty}}{\partial\alpha_j}
= Q^{j}e^{-\sum_{k=1}^{j}\alpha_k} - \sum_{m=j}^{M-1}\alpha_m Q^{m}e^{-\sum_{k=1}^{m}\alpha_k}
= Q^{j}e^{-\sum_{k=1}^{j}\alpha_k}\Bigl(1 - \sum_{m=j}^{M-1}\alpha_m Q^{m-j}e^{-\sum_{k=j+1}^{m}\alpha_k}\Bigr), \tag{44}
\]
which gives the following conditions for the optimum values of $(\alpha_1,\dots,\alpha_{M-1})$:
\[
\sum_{m=j}^{M-1}\alpha_m Q^{m-j}e^{-\sum_{k=j+1}^{m}\alpha_k} = 1,\quad\text{for } j=1,2,\dots,M-1. \tag{45}
\]

For $j=M-1$, we can simplify the previous condition to get
\[
\alpha_{M-1} = 1, \tag{46}
\]
and for $j=1,2,\dots,M-2$, we expand the sum, take out the first term (the term for $m=j$), and factor out $Qe^{-\alpha_{j+1}}$ from the remaining terms in the sum:
\[
1 = \sum_{m=j}^{M-1}\alpha_m Q^{m-j}e^{-\sum_{k=j+1}^{m}\alpha_k}
= \alpha_j + \sum_{m=j+1}^{M-1}\alpha_m Q^{m-j}e^{-\sum_{k=j+1}^{m}\alpha_k}
= \alpha_j + Qe^{-\alpha_{j+1}}\underbrace{\sum_{m=j+1}^{M-1}\alpha_m Q^{m-j-1}e^{-\sum_{k=j+2}^{m}\alpha_k}}_{=1}. \tag{47}
\]
Note that the expression under the brace is the condition (45) for $(j+1)$ and, therefore, should be equal to one. So, we obtain the following recursive expression to solve for $\alpha_j$ in terms of $\alpha_{j+1}$ (with initial value $\alpha_{M-1}=1$):
\[
\alpha_j = 1 - Qe^{-\alpha_{j+1}},\quad\text{for } j=1,2,\dots,M-2. \tag{48}
\]

The value of $\mathcal{P}^{\infty}$ for the optimum $(\alpha_1,\dots,\alpha_{M-1})$ can be found from (42) and (45) as follows:
\[
\mathcal{P}^{\infty} = \sum_{m=1}^{M-1}\alpha_m Q^{m}e^{-\sum_{k=1}^{m}\alpha_k}
= Qe^{-\alpha_1}\underbrace{\sum_{m=1}^{M-1}\alpha_m Q^{m-1}e^{-\sum_{k=2}^{m}\alpha_k}}_{=1\ \text{from (45) for } j=1}
= Qe^{-\alpha_1}. \tag{49}
\]

We get close to this asymptotic bound for relatively small values of $n$; for example, with $Q=0.9$ and $M\ge 3$, the maximum possible $\mathcal{P}_n$ is close to $\mathcal{P}^{\infty}$ for $n\ge 20$.
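The recurrence (48) and the bound (49) are easy to evaluate numerically; the following sketch computes $\mathcal{P}^{\infty}$ for a few illustrative values of $M$ with $Q=0.95$.

```python
import math

def asymptotic_success_bound(M, Q):
    """P_infinity from (49): alpha_{M-1} = 1, alpha_m = 1 - Q*exp(-alpha_{m+1}),
    and the bound is Q * exp(-alpha_1)."""
    alpha = 1.0                      # alpha_{M-1}
    for _ in range(M - 2):           # walk the recurrence (48) down to alpha_1
        alpha = 1.0 - Q * math.exp(-alpha)
    return Q * math.exp(-alpha)

for M in (2, 3, 5, 8):
    print(M, round(asymptotic_success_bound(M, 0.95), 3))
```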

4.5. Optimum Number of Channels

When multiple nodes contend to send messages in a slot, a larger value of 𝑀 (a larger number of channels) decreases the contention among them by increasing the likelihood that they select different channels. Thus, one would think that we should use as many channels per slot as we can. However, there are practical considerations that present a tradeoff. For each channel used, the receiver has to sample it and switch to the next channel. Thus, for each channel used, there is a sensing plus switching delay added to the size of a time slot. On one hand, as we increase 𝑀, the length of each time slot is increased; on the other hand, with larger 𝑀, we can get better success probabilities 𝒫𝑛 and, therefore, we can collect all messages in fewer time slots. Thus, selecting 𝑀 poses a tradeoff. We select 𝑀 to optimize this tradeoff and minimize the overall delay.

To find the optimum $M$, we need to have some timing constants from the radio. In general, we can write the length of a time slot as $\tau_{\mathrm{slot}}(M)=M\tau_1+\tau_2$, where $\tau_1$ specifies the time duration required by the receiver to sample the presence of a tone/preamble on one channel and switch to the next channel, and $\tau_2$ is the time for completion of the packet, ack message, and other constant times in one time slot (see Figure 11 for more details). Since the number of channels is a positive integer value and bounded by a small number (the total number of channels), we elected to calculate the optimum probabilities $\mathbf{p}$ for each value of $M$ and pick the optimum $M_{\mathrm{opt}}$ (and corresponding $\mathbf{p}_{\mathrm{opt}}$) as the one which minimizes the expected value of the overall delay to collect from $N$ nodes, that is, $\mathbb{E}(\mathcal{D}_N)\triangleq\mathbb{E}(T_N)\times\tau_{\mathrm{slot}}(M)$. (Considering each value of $M$ has additional benefits, as explained in the following section.)

Figure 6 shows the normalized delay ($\mathbb{E}(\mathcal{D}_N)/N$) as a function of $M$ for $Q=0.95$. For timing constants, we used $\tau_1=0.4$ ms and $\tau_2=6.0$ ms, which are representative values based on our measurements with the implementation on Bosch CC2420 node boards (see Section 6). The optimum $M$ which minimizes the delay is shown on each graph by a filled marker. The optimum value of $M$ depends on the values of $N$ and $Q$. However, it is not very sensitive to $N$, as we see that the curves are flat close to $M_{\mathrm{opt}}$. So we can increase or decrease $M$ by one or two and expect to get almost the same performance.
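The selection of $M$ described above can be sketched as follows: for each candidate $M$, the snippet computes channel probabilities via (14)-(15), the expected number of slots from (13), and the resulting delay $\mathbb{E}(T_N)\times\tau_{\mathrm{slot}}(M)$. It is a simplified sketch; it fixes the max-$\mathcal{P}_N$ probabilities as one possible choice, whereas the text above leaves the optimization criterion for $\mathbf{p}$ open, and the parameter values are the representative timing constants quoted above.

```python
def pick_num_channels(N, Q, tau1, tau2, M_max=8):
    """Choose M minimizing E[D_N] = E[T_N] * (M*tau1 + tau2), using the
    max-P_N probabilities of (14)-(15) for each candidate M (a sketch)."""
    def probs(M):
        if N == 1:
            return [1.0] + [0.0] * (M - 1)
        gamma = [0.0] * M
        for m in range(2, M):
            gamma[m] = Q ** (N + 1) * ((N - 1) / (N * Q - gamma[m - 1])) ** (N - 1)
        p, cum = [], 0.0
        for j in range(1, M):
            g = gamma[M - j]
            pj = (Q - g) / (N * Q - g) * (1.0 - cum)
            p.append(pj)
            cum += pj
        return p + [1.0 - cum]

    def expected_slots(p):
        def P(k):
            tot, cum = 0.0, 0.0
            for m, pm in enumerate(p, start=1):
                cum += pm
                tot += pm * Q ** m * (1.0 - cum) ** (k - 1)
            return k * tot
        return sum(1.0 / P(k) for k in range(1, N + 1))

    return min(range(2, M_max + 1),
               key=lambda M: expected_slots(probs(M)) * (M * tau1 + tau2))

# Timing constants (in seconds) measured on the CC2420 boards, Section 4.5.
print(pick_num_channels(N=50, Q=0.95, tau1=0.4e-3, tau2=6.0e-3))
```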

So far, we have found the optimal channel probabilities $\mathbf{p}$ to be used by nodes for known values of $N$ and $Q$. This can be useful if the application is such that every node can fairly accurately estimate the number of nodes that will respond to any event it detects, within an environment with known, unvarying interference levels.

4.6. Impact of 𝑄 on Probability of Successful Message Delivery in a Slot

Given 𝑁, 𝑄, and 𝑀, the optimal channel probabilities can be calculated based on (14). In this subsection, we evaluate numerically the impact various values of 𝑄 and 𝑁 have on the probability of successful message delivery in a slot. These results are important because in subsequent sections, we will assume we do not know the level of interference 𝑄 during protocol execution. Using (14), we study the sensitivity of optimum channel probabilities 𝐩 to the interference level 𝑄. The aim was to see if an estimate of 𝑄 would be sufficient to compute close to optimal 𝐩. We calculated the optimal probabilities for three values of 𝑄; 𝑄=1, 𝑄=0.6, and 𝑄=0.1 for various values of 𝑁 keeping 𝑀=3. The results are plotted in Figure 7. (For 𝑄 = 0.1, for large 𝑁, some values are undefined and hence those data points are missing.)

It can be seen that the probability of successful message delivery in a slot, for all values of 𝑄 on which channel probabilities were calculated, does not change much. This is because the computed channel probabilities themselves do not vary much across different values of 𝑄. This does not mean that 𝑄 has no effect on probability of success in a slot; as actual 𝑄 varies, the slot success probability decreases as shown in the plots. We saw similar results for other values of 𝑀 as well. Thus, we can conclude that a reasonable estimate on 𝑄 is good enough for calculating optimal probabilities.

5. Adaptive Algorithm for Alert

The previous section presented different theoretical approaches to minimize the number of time slots to collect all messages. These approaches focused on using a single set of channel probabilities $\mathbf{p}$ throughout the execution of Alert. Though this results in a simpler implementation, we can do better by adapting these probabilities as the protocol executes. We present one such approach in this section, where the probabilities to use are updated every time unit $t$. We thus seek a set of channel probabilities used throughout the protocol execution, represented as a list of vectors $\mathbf{P}=(\mathbf{p}^{(1)};\dots;\mathbf{p}^{(F)})$, or equivalently in matrix form:
\[
\mathbf{P} = \begin{pmatrix}\mathbf{p}^{(1)}\\ \vdots\\ \mathbf{p}^{(F)}\end{pmatrix}
= \begin{pmatrix} p_1^{(1)} & \cdots & p_M^{(1)}\\ \vdots & \ddots & \vdots\\ p_1^{(F)} & \cdots & p_M^{(F)}\end{pmatrix}, \tag{50}
\]
where $F$ is the number of time units required for reading all messages, with a new set of channel probabilities calculated for each unit. A time unit can be either a single time slot or a whole frame of time slots in the context of Alert. (The actual implementation of Alert is based on the concept of frames. A specific amount of time per frame is allocated for reading messages, with the rest used for other operations like time synchronization and maintenance, among others. This time can be divided up into any number of slots depending on the size of a slot.) In this section, we specifically look at the case where no knowledge of the number of messages is assumed and present an adaptive algorithm (Algorithm 1). We believe this to be the most important case to handle for the protocol. For simplicity, we assume that all initial messages arrive simultaneously. This can be justified when messages are rare, but correlated, and usually occur due to some event observed by a subset of the nodes, resulting in the generation of simultaneous messages. Generation of alarms by surveillance systems is such an example.

Algorithm 1: Adaptive algorithm executed at each sender.

(1) Get $M_{\text{opt}}$ for some guess on the number of nodes $N_{\text{guess}}$: $M_{\text{opt}} = \text{get}M(N_{\text{guess}})$.
(2) Start with some initial value of $N = N_{\text{init}}$.
(3) Get some estimate on the value of $Q$.
(4) Get $\mathbf{P}$ and the expected number of frames needed, $F$, based on the current estimate $N$ and for $M_{\text{opt}}$: $[\mathbf{P}, F] = \text{get}\mathbf{P}(N, M_{\text{opt}})$.
(5) while message not read do
(6)   for frame $t = 1$ to $F$ do
(7)     for all slots in frame $t$ do
(8)       Execute protocol with the calculated $\mathbf{p}^{(t)}$ for frame $t$
(9)       Terminate execution if message read and reset to initial state
(10)    end for
(11)  end for
(12)  Increase the estimate by a constant additive factor: $N = N + C_{\text{incf}}$
(13)  Get $\mathbf{P}$ and the expected number of frames needed, $F$, based on the current $N$ for $M_{\text{opt}}$: $[\mathbf{P}, F] = \text{get}\mathbf{P}(N, M_{\text{opt}})$
(14) end while

We begin by describing how we calculate 𝐏 for our proposed adaptive algorithm. Then, we give the details of our algorithm along with our strategy to calculate the number of channels to be used per slot considering we do not know the number of nodes sending messages.

5.1. Calculation of Channel Probabilities

The distribution 𝐩(𝑡) controls the contention of nodes with messages to send in a time unit 𝑡. Given 𝑀, the 𝐩(𝑡) used for each time unit must control contention such that the probability of successful message delivery in that time unit is maximized. As the protocol progresses, maintaining our assumption of simultaneous arrivals of messages, the number of remaining messages, 𝑛, decreases as they are successfully received by the base station. Thus, for subsequent slots, after the first slot, a new value of 𝐩(𝑡) must be calculated before each slot 𝑡 taking into account the current value of 𝑛. For large initial 𝑛, the time to read all messages can be quite large, and the requirement of recalculation of 𝐩(𝑡) before each slot 𝑡 can prove infeasible. Moreover, we would prefer that the values of 𝐩(𝑡) used in each slot be precalculated and stored in memory. Thus, due to the computational and memory limitations on wireless sensor devices, we adopt a perframe recalculation strategy.

Let $R_n$ be the number of messages that go through in frame $t$ with $n$ messages contending at the beginning of the frame. We seek, for frame $t$, the channel probabilities $\mathbf{p}^{(t)}$ that maximize the number of messages read; that is, the channel probabilities to use in a frame satisfy
\[
\max_{\mathbf{p}^{(t)}}\ \mathbb{E}\bigl(R_n\bigr). \tag{51}
\]
This can be solved through numerical methods using a recursive algorithm or through Markov chains in conjunction with (14). The distribution $\mathbf{p}^{(t)}$ is updated at the beginning of each frame $t$. The number of remaining messages is updated at the end of each frame by subtracting the expected number of messages read in the frame for the $\mathbf{p}^{(t)}$ used in the frame. Thus, if $M$ is the number of channels used in a slot and $F$ is the number of frames that are expected to be required by the protocol to read all messages, the distribution $\mathbf{P}$ is in the form of an $F\times M$ matrix. Each row of $\mathbf{P}$ is the distribution to use in the frame corresponding to that row. The steps to calculate $\mathbf{P}$ are given in Algorithm 2.

Algorithm 2: Calculation of the per-frame channel probabilities $\mathbf{P}$.

(1) for each frame $t$ do
(2)   Calculate (and store) $\mathbf{p}^{(t)}$ for the given $M$ with the current value of $n$ using (51)
(3)   Reduce $n$ by the expected number of messages read, that is, $n = n - \mathbb{E}(R_n)$
(4)   if $n \le N_{\text{thr}}$ then
(5)     Store the number of frames: $F = t + 1$
(6)     Repeat the last $\mathbf{p}^{(t)}$, that is, $\mathbf{p}^{(F)} = \mathbf{p}^{(t)}$
(7)     Break from for loop
(8)   end if
(9) end for
(10) Return $F$ and $\mathbf{P} = (\mathbf{p}^{(1)};\dots;\mathbf{p}^{(F)})$

This procedure calculates the number of frames required to read all messages down to some small threshold, $N_{\text{thr}}$, and then repeats the last $\mathbf{p}^{(t)}$ in the final frame $F$. The constant $N_{\text{thr}}$ is used to ensure that an underestimate of the number of messages remaining (compared to the actual value) does not result in using channel probabilities that make it almost impossible for further messages to go through. (For example, channel probabilities chosen when the estimated number of messages is 2, but the actual number of messages is 10, will result in probabilities that do not allow any of the 10 messages to go through and may delay reading any message for the next 100–500 attempts. On the other hand, if the estimated number is 10 and the actual number is 2, the additional delay is much smaller and is of the order of 1–10 extra attempts. So by setting $N_{\text{thr}}$ to some small value, say 10, we ensure that the estimated number left still allows further messages to go through with a high probability.)
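A simplified Python sketch of Algorithm 2 is shown below. Instead of solving (51) exactly, it substitutes the single-slot optimal distribution of (14)-(15) for the current estimate of $n$ (a simplification, not the exact procedure described above) and then computes $\mathbb{E}(R_n)$ for that fixed distribution by dynamic programming over the slots of a frame; all parameter values in the example call are illustrative.

```python
def frame_schedule(n0, M, Q, slots_per_frame, N_thr=10):
    """Sketch of Algorithm 2: build the per-frame probability matrix P."""
    def probs(N):
        # Single-slot optimal distribution per (14)-(15), used as a proxy for (51).
        if N <= 1:
            return [1.0] + [0.0] * (M - 1)
        gamma = [0.0] * M
        for m in range(2, M):
            gamma[m] = Q ** (N + 1) * ((N - 1) / (N * Q - gamma[m - 1])) ** (N - 1)
        p, cum = [], 0.0
        for j in range(1, M):
            g = gamma[M - j]
            pj = (Q - g) / (N * Q - g) * (1.0 - cum)
            p.append(pj)
            cum += pj
        return p + [1.0 - cum]

    def slot_success(p, k):
        tot, cum = 0.0, 0.0
        for m, pm in enumerate(p, start=1):
            cum += pm
            tot += pm * Q ** m * (1.0 - cum) ** (k - 1)
        return k * tot

    def expected_reads(p, n, S):
        # dist[j] = probability that j messages remain; advance S slots.
        dist = [0.0] * (n + 1)
        dist[n] = 1.0
        for _ in range(S):
            new = [0.0] * (n + 1)
            for j, pr in enumerate(dist):
                if j == 0:
                    new[0] += pr
                else:
                    s = slot_success(p, j)
                    new[j - 1] += pr * s
                    new[j] += pr * (1.0 - s)
            dist = new
        return n - sum(j * pr for j, pr in enumerate(dist))

    P, n = [], float(n0)
    while n > N_thr:
        p = probs(round(n))
        P.append(p)
        n -= expected_reads(p, round(n), slots_per_frame)
    P.append(P[-1] if P else probs(round(n)))   # repeat the last row for frame F
    return P

schedule = frame_schedule(n0=50, M=3, Q=0.95, slots_per_frame=20)
print(len(schedule), [round(x, 3) for x in schedule[0]])
```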

5.2. Adaptive Algorithm Details

When the actual number of messages to be sent is unknown, we desire that the protocol itself try to estimate it and calculate the corresponding design parameters for this estimate. We employ an adaptive approach where we vary a node’s estimate of the number of senders, $N$, until it succeeds in sending its message. The algorithm starts out by selecting the number of channels to use per slot, $M_{\text{opt}}$, based on some guess on the initial number of nodes (or messages, with our assumption of one message per node), $N_{\text{guess}}$, which need not be accurate due to the insensitivity of $M$ to the number of nodes. Further details of the calculation of $M_{\text{opt}}$ are explained in Section 5.3. The algorithm sets its estimate of the number of messages, $N$, to a small value $N_{\text{init}}$ as the starting point of its upward adaptation to larger values. Upward adaptation is chosen because changing to a different estimate only requires waiting a small time before realizing that the estimate may be incorrect. $Q$ is a deployment-environment-specific parameter that is measurable, and, hence, can be estimated. As mentioned in Section 4.6, a reasonable estimate is enough. In the next step, the algorithm finds the optimal channel probabilities, $\mathbf{P}$, and the expected maximum number of frames $F$ within which the message will be read for this estimate. If, after $F$ frames of protocol execution, the node’s message is still not successful, it increases the estimate $N$ by a constant additive factor $C_{\text{incf}}$ ($C_{\text{incf}}>0$) and recalculates $F$ and $\mathbf{P}$ to use in the subsequent frames. The algorithm continues until the message is finally read, after which the node resets back to the initial state and on the next event will reexecute the algorithm. The algorithm is designed such that all the calculations of $M_{\text{opt}}$ and $\mathbf{P}$ can be prestored and used from memory. The same initial $N$ and constant increase factor $C_{\text{incf}}$ ensure that we just need to store $M_{\text{opt}}$ calculated for some constant $N_{\text{guess}}$, and $\mathbf{P}$, $F$ for all estimates $N$, which can only take the values $\{N_{\text{init}}, N_{\text{init}}+C_{\text{incf}}, N_{\text{init}}+2C_{\text{incf}},\dots\}$ up to some maximum limit on the value of $N$ or the memory capacity available.

5.3. Number of Channels Per Slot

When the Alert protocol is deployed, we desire that only a single value of $M$ be used. Changing the number of channels per slot dynamically, as the number of messages $N$ that remain to be read changes, is difficult to implement in practice. Next, we describe our method to derive $M_{\text{opt}}$, given in Algorithm 3. It begins with some initial guess of $N$, $N_{\text{guess}}$, not necessarily close to the actual $N$, but not a small value. The optimal number of channels $M_{\text{opt}}$ to use is calculated based on this $N_{\text{guess}}$. Because we do not recalculate $M_{\text{opt}}$, we need to ensure that the initial value of $N_{\text{guess}}$ used gives us a good $M_{\text{opt}}$ to use for the rest of the protocol execution, as the adaptive estimates of $N$ change. From our earlier analysis (mentioned in Section 4.5), it was found that the optimal $M$ is quite insensitive to $N$.

Algorithm 3: Selection of the number of channels per slot, $M_{\text{opt}}$.

(1) Use timing constants $\tau_1$ and $\tau_2$ of the specific radio used
(2) for each $M \le M_{\max}$ do
(3)   $\tau_{\text{slot}}(M) = M\tau_1 + \tau_2$
(4)   Get the expected number of frames needed, $F(M)$, based on the value of $N_{\text{guess}}$: $[\mathbf{P}, F(M)] = \text{get}\mathbf{P}(N_{\text{guess}}, M)$
(5) end for
(6) Get the optimal $M$ for reading all messages: $M_{\text{temp}} = \arg\min_{M} F(M)\times\tau_{\text{slot}}(M)$
(7) Get the associated time to read all messages: $t^{\text{val}}_{M_{\text{temp}}} = F(M_{\text{temp}})\times\tau_{\text{slot}}(M_{\text{temp}})$
(8) Find the largest $M$ within some factor $\beta$ of $t^{\text{val}}_{M_{\text{temp}}}$: $M_{\text{opt}} = \max\{M \mid t^{\text{val}}_{M} \le (1+\beta)\, t^{\text{val}}_{M_{\text{temp}}}\}$
(9) Return $M_{\text{opt}}$

The calculation of $M_{\text{opt}}$ uses the same method explained in Section 4.5 but with a small difference. Assume $M_{\text{temp}}$ is the optimal value of $M$ for reading all messages with minimum delay. But this value may not be good for getting the first message through with minimum delay, for which a larger number of channels might be better. So we introduce a design factor $\beta$ ($\beta\ge 0$) by which we look for a possibly larger number of channels to use without incurring an expected time penalty greater than $(1+\beta)\,t^{\text{val}}_{M_{\text{temp}}}$, where $t^{\text{val}}_{M_{\text{temp}}}$ is the expected time to read all messages using $M_{\text{temp}}$. Parameter $\beta$ allows us to control our optimization criteria: $\beta=0$ specifies selection of the $M_{\text{opt}}$ that optimizes the time to read all messages, while larger values of $\beta$ increasingly look to optimize the time to send the first message by considering the usage of a larger number of channels.
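The final selection step of Algorithm 3 amounts to the following; the delay values in the example dictionary are hypothetical and only illustrate the $(1+\beta)$ rule.

```python
def choose_M(t_val, beta=0.1):
    """Algorithm 3's final step: given t_val[M], the expected time to read all
    messages with M channels (precomputed as in Section 4.5), prefer the largest
    M whose total delay stays within a factor (1 + beta) of the best."""
    M_temp = min(t_val, key=t_val.get)            # M minimizing time to read all
    budget = (1 + beta) * t_val[M_temp]
    return max(M for M, t in t_val.items() if t <= budget)

# Hypothetical precomputed delays (seconds) for M = 2..6, for illustration only.
t_val = {2: 0.95, 3: 0.71, 4: 0.68, 5: 0.70, 6: 0.78}
print(choose_M(t_val, beta=0.1))   # picks 5: within 10% of the best (M = 4) but larger
```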

6. Protocol Evaluation

In this section, we evaluate the feasibility and performance of Alert through an implementation on commodity hardware and also simulations. We begin with our implementations focusing on the feasibility of Alert.

6.1. Feasibility of Alert

We implemented the Alert protocol on Bosch CC2420-based wireless nodes. The Bosch boards use Chipcon/TI CC2420 radio which is an IEEE 802.15.4 compliant transceiver operating at 2.4 GHz band with direct sequence spread spectrum (DSSS) O-QPSK modulation and 250 Kbps data rate. An external power amplifier (max transmit power 10 mW) is used to improve the communication range.

In the first experiment setup, we deployed 15 senders in an office in Palo Alto, Calif., as shown in Figure 8. The receiver (base station) is in communication range of all nodes and kept them in sync by sending periodic beacon messages. Every second, all the nodes sent a message simultaneously using the Alert protocol with the following fixed probability distribution:
\[
\mathbf{p} = (0.05,\ 0.063,\ 0.092,\ 0.182,\ 0.613). \tag{52}
\]
The receiver measured the number of time slots to receive the first message and the number of time slots to collect all 15 messages. Each time slot is 8 ms long.

Figure 10 shows the measured distribution of the number of time slots (for both the first and all messages). We see that the Alert protocol performs better in the real deployment (the experiment setup) than what the analysis predicts. The calculations show that on average we need 24.82 time slots to collect from all 15 nodes, but our experiment gives an average of 17.60 time slots. This improvement in performance is mainly due to the capture effect. When two nodes are sending simultaneously, our analysis assumes that there will be a failure, but in many cases the receiver can correctly decode one of the packets while treating the signal from the other sender as noise. Since the CC2420 radio employs spread spectrum techniques, it can tolerate a higher level of interference, and this helps increase the chance of the capture effect. Note that Alert, by reducing the number of contending nodes at higher priority channels, increases the likelihood of the capture effect.

To validate our analysis, we reduced the chance of the capture effect by repeating the experiment with a different setup. In the second setup, all 15 nodes were placed close to each other and close to the receiver on a table. The network configuration is shown in Figure 9. Since all nodes are close to one another, the received power at the receiver from all senders is high and equal. This reduces the chance of the capture effect. The results are shown in Figure 10. We see that the measured distribution with the second setup matches very closely what the analysis predicts.

6.2. Simulation Setup

Using simulations enables us to evaluate the performance of Alert with a large number of nodes with messages to send, that is, for scalability, and also to compare against other protocols. We compare Alert with two other contention-based MAC protocols—Sift [8] and Slotted Aloha (S-Aloha) [11]. Sift was chosen because it is a CSMA-based protocol (unlike Alert) and was previously shown to do better than variable contention window protocols like 802.11 for the target application scenario (refer to Section 2 for more details). S-Aloha is a simple protocol, allowing each time slot to be very small, possibly providing advantages in reducing the time required to read messages. We believe comparisons of Alert with these two protocols cover a wide design space for MAC protocols for the target application. We had discussed the infeasibility of other possible MAC protocols (e.g., TDMA) in Section 2.

For the evaluations, we wrote a simulator in MATLAB. The important abstraction was the concept of time across different protocols from an implementation perspective. The interference was modeled as pointed out in Section 4. All protocols send messages to the receiver (or base station) in fixed time slots. (The 802.15.4 protocol also has a fixed slot structure with both a contention access period and a contention-free period within each frame [6].) The size of a time slot for each protocol is different (but of fixed size) based on how it is used. The timing of all three protocols as used in our simulations are shown in Figure 11. In this figure, 𝑡1 is the guard time plus the rx/tx switching time, 𝑡Skew is the maximum clock skew, 𝑡2 is the channel sensing time, 𝑡3 is the channel switching time, and 𝑡4 is the total time to exchange a packet and ack. Based on the implementation of Alert and measurement on the CC2420 radio, the values used for these constants are 𝑡1=0.5ms, 𝑡2=0.1ms, 𝑡3=0.3ms, 𝑡4=2.5ms.

In the S-Aloha protocol, each node tries to send its message in a slot with probability 1/𝑁 until it finally succeeds in doing so, where 𝑁 is the current estimate of number of messages. We let the S-Aloha protocol use the same methodology of adapting 𝑁 as Alert, and chose an initial value and increment factors that gave best results; this was initial 𝑁=10 with additive increments 𝐶incf=50. For S-Aloha, a slot duration consists of the guard time plus RX/TX switching time 𝑡1, a single adjustment for clock skew 𝑡Skew and the time to exchange a Packet and Ack.

Sift uses a fixed contention window (CW) size and relies on a geometric probability distribution with which nodes pick a backoff slot for transmission. Once a node counts down to its chosen backoff slot and is the only one that has chosen that slot, it completes the packet and ack exchange with the receiver and all nodes move on to the beginning of the next protocol time slot. For simplicity, we do not implement RTS/CTS with Sift and do not consider the hidden terminal effect in our evaluations. (We will mention how that consideration would affect the comparison between the protocols when we present our results.) The Sift slot duration consists of the guard time 𝑡1, the length of each backoff slot, which is the sum of the adjustment for clock skew and the channel sensing time, 𝑡Skew+𝑡2, and the time to exchange a packet and ack, 𝑡4. Since a node might capture the channel in some backoff slot within the CW and begin packet transmission at that time, there is the possibility of some time left over after the ack is sent back until the next slot begins. This limitation is due to implementation considerations for which a fixed slot duration is highly desirable, and often, the most practical.

For Alert, a slot duration consists of the guard time 𝑡1, adjustment for possible clock skew 𝑡Skew both before and after the sampling time by the receiver, multiple copies of the time to sample a channel plus the channel switching time, 𝑡2+𝑡3, and the time to exchange a packet and ack, 𝑡4. Because each transmitter is sending a continuous tone the whole time the receiver is sampling channels, we do not need to adjust for clock skew (𝑡Skew) at any time except before and after the sampling is done, when the transmitter is not sending the tone. Alert does not have any spare time left over in a slot like Sift does, because the packet and ack exchange takes place at a specified time in the slot regardless of which channel is used. The receiver simply waits on that channel at the time to receive a packet and return an ack. In our simulations, 𝑁init was taken as 10 with additive increments 𝐶incf=50. The value of 𝑀 was calculated for 𝑁guess=50 and 𝑁thr=10. The value of 𝛽 was set to 0.1, which provided a good balance between minimizing the delay of collecting the first message and collecting all messages.

Two levels of time synchronization were considered; tight and loose. Note that these are relative terms that are used to convey the compensation required for expected clock skew within slots. The tight synchronization allows smaller compensation times to be used with all protocols, but can prove to be a heavy burden on the higher level protocol that is responsible for it. Tight synchronization would require all nodes to participate in the synchronization protocol more frequently and would consume a lot of energy, even when the nodes have no messages to send. Thus, for applications targeting rare events, tight synchronization may not be feasible and a “looser” form of synchronization may be more desirable. We use the values 𝑡Skew=0.7 or 0.2ms for loose and tight synchronization, respectively. (Note that, for tight synchronization, the values used (𝑡Skew=0.2, 𝑡2=0.1 ms) give roughly the specified backoff slot duration of 0.32 ms in the IEEE 802.15.4 standard [6].)
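Putting the timing components of this subsection together, the slot-length budgets of the three protocols can be tabulated as in the sketch below. The Sift contention-window size and the Alert channel count used in the example are illustrative, and budgeting one backoff-slot length per contention-window slot inside Sift's fixed protocol slot is our reading of the description above, not a value taken from the simulations.

```python
def slot_durations_ms(t1=0.5, t2=0.1, t3=0.3, t4=2.5, t_skew=0.7, M=5, cw=32):
    """Slot-length budgets (ms) for the three protocols as described above.
    cw (Sift contention-window size) and M (Alert channels) are illustrative."""
    return {
        "S-Aloha": t1 + t_skew + t4,                      # guard + skew + packet/ack
        "Sift":    t1 + cw * (t_skew + t2) + t4,          # guard + CW backoff slots + packet/ack
        "Alert":   t1 + 2 * t_skew + M * (t2 + t3) + t4,  # guard + 2*skew + M samples + packet/ack
    }

for name, d in slot_durations_ms().items():
    print(f"{name:8s} {d:5.1f} ms")
```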

To get a sense of the effectiveness of the adaptive version of Alert, we also show a plot of the expected number of slots required to read all messages when exact value of 𝑁 is known throughout the protocol execution. This is calculated theoretically using (13) for a known 𝑁. The scheme is termed 𝐸𝑥𝑂𝑝𝑡𝐴𝑙𝑒𝑟𝑡. Each data plot shown is the average of 150 runs and 95% confidence intervals are shown for plots that show time required to read all messages. The maximum number of nodes was set at 100 which is a reasonably large number for a one hop centralized network.

6.3. Comparisons Through Simulations

Figure 12 shows the comparison between all 4 schemes for the average time required to send the first message. (Confidence intervals are not shown for this plot to allow a close up snapshot of the schemes other than S-Aloha.) It can be seen that Alert manages to send the first message out far earlier than Sift and S-Aloha, and is quite close to its optimal expected performance ExOptAlert. The chosen channel probabilities of Alert allow the first message to go through in the initial few slots. The same happens for Sift, but more backoff slots in its contention window mean it takes more time to send the first message even though it may be successful in the first time slot. A smaller contention window for Sift could be useful here, but that could have a negative impact on success probability of a single slot and, hence, the delay to send all messages. S-Aloha seems to have the most delay since the random slots picked by nodes to send may not be the initial slots, or if they are, may not be successful due to collisions. The small slot time does not seem to have helped S-Aloha in this case.

Figure 13 shows the comparisons for all schemes to read all messages when 𝑄=0.95 (interference level of 5%). For the tight synchronization case, we see that Alert does slightly better than Sift. Note that this result does not take into account additional procedures like RTS-CTS which Sift might need to employ to handle hidden terminal collisions. Alert, being a noncarrier sense protocol, does not suffer from such issues. When loose time synchronization is used, the difference between Alert and Sift increases; in fact, S-Aloha does better than Sift now due to the much smaller slot structure it uses. When a higher level of interference (𝑄=0.8) is taken into consideration, Sift does better than Alert for the tight synchronization case because the latter has a higher possibility of being affected due to its use of multiple frequency channels (see Figure 14). The possibility of such high levels of interference (𝑄=0.8, i.e., 20% interference) is, however, very unlikely. In practice, Alert switches the frequency channels it uses periodically so that an interference source on some channel does not affect performance for long.

7. Conclusions

We presented Alert, a MAC protocol to collect rare event-driven messages from multiple wireless sensor nodes with low latency. The protocol uses a novel time slot structure with nodes separated by prioritized frequency channels, which allows one node to succeed per slot with high probability. We provided extensive theoretical justifications for selecting values for the design parameters involved, and designed an adaptive algorithm for Alert to adjust parameter values based on the level of contention in the network. The feasibility and effectiveness of the protocol were demonstrated through both an implementation as well as extensive simulation-based comparisons with other protocols.

Disclosure

A preliminary version of this paper appeared in proceedings of ACM/IEEE International conference on Information Processing in Sensor Networks (ACM/IEEE IPSN), April 2008.