Abstract

Distributed clustering is widely used in ad hoc deployed wireless networks. Distributed clustering algorithms like DMAC, HEED, MEDIC, ANTCLUST-based, and EDCR produce well-distributed Cluster Heads (CHs) using dependent thinning techniques where a node’s decision to be a CH depends on the decision of its neighbors. An analytical technique to determine the cluster density of this class of algorithms is proposed. This information is required to set the algorithm parameters before a wireless network is deployed. Simulation results are presented in order to verify the analytical findings.

1. Introduction

Distributed clustering is a robust technique used to organize ad hoc deployed wireless nodes into a communication network [1]. It is widely adopted in energy-constrained ad hoc deployed wireless sensor networks (WSNs) [2]. Distributed clustering algorithms used in ad hoc deployed wireless networks can be broadly categorized into two classes. The first class consists of independent randomized cluster head (CH) selection algorithms; that is, a node’s decision to be a CH is made independently of the decisions of its neighboring nodes. For example, algorithms such as LEACH [3], LEACH-D [4], SEP [5], and EDAC [6] fall into this class. These algorithms do not produce well-distributed CHs [7]: they may select two or more adjoining nodes as CHs. Furthermore, the actual number of CHs obtained after deployment varies considerably from the theoretically expected number [8]. The second class consists of distributed clustering algorithms like DMAC [1], HEED [9], ANTCLUST based [10], MEDIC [11], EDCR [12], and its derivatives [13]. In these algorithms, whether a node becomes a CH depends on its neighbors’ decisions as well. This ensures that no two CHs appear in each other’s neighborhood and that every node either has at least one CH in its neighborhood or is itself a CH. These algorithms produce well-distributed clusters using dependent decision making and are referred to as the Dependent Thinning Distributed Clustering (DTDC) class of algorithms. We note that the CH selection process of the DTDC class of algorithms resembles a reverse price auction, sometimes known as the Dutch auction [11] method.

Irrespective of the distributed clustering algorithm used in ad hoc deployed wireless network applications, the expected number of clusters, denoted by 𝐸[𝑘], is an important parameter required at the planning stage of the network. For example, consider a WSN where data is collected periodically, aggregated at the CH, and then communicated to the base station (BS). The application may expect 𝐸[𝑘] clusters, each with an expected number of nodes, denoted by 𝐸[𝑛], in the given deployment area 𝐴. This requirement is derived from the level of reliability expected from the collected data; that is, the reliability is directly connected to the redundancy of the nodes within a cluster [14]. As another example, an ad hoc deployed wireless network application may be required to produce an optimal number of clusters 𝐸[𝑘] that minimizes the energy cost of communication and maximizes the network lifetime [15, 16]. In both examples, the WSN parameters should be set appropriately at the initial deployment stage so that, in operation, the desired number of clusters 𝐸[𝑘] is achieved and the design objective is met. In the first example, this objective is increased reliability, whereas in the second example it is maximum lifetime.

To illustrate the importance of knowing the WSN design parameters, consider a known example from the first class, the LEACH algorithm. LEACH uses a parameter 𝑝 which represents the expected proportion of nodes that become CHs. That is, a node becomes a CH with probability 𝑝, independent of the decisions of its neighbors. When an independent randomized clustering algorithm like LEACH is applied to an ad hoc deployed network where 𝑁 nodes are deployed uniformly at random in a given area, the expected number of clusters is 𝐸[𝑘]=𝑝𝑁 [3]. According to [17], the node distribution of such a system is considered to be a 2-D Poisson point process with intensity 𝜆=𝑁/𝐴, and the resultant CHs are also distributed as a 2-D Poisson point process with intensity (i.e., CH density) 𝜆𝑐=𝑝𝜆. Thus, by setting the WSN parameters 𝑝 and 𝑁 we can achieve a desired 𝐸[𝑘]; that is, such analytical expressions play an important part in achieving the required 𝐸[𝑘].

To the authors’ knowledge, no such analysis exists to determine the CH distribution and density (𝜆𝑐) of the DTDC class of algorithms. However, Bettstetter [18] has presented an empirical formula for the CH density of the DMAC algorithm based on simulation results. Because the formula is empirical, it cannot be generalized to the whole class. In this paper, we present an analytical expression for the CH density of the DTDC class of algorithms in order to address this gap.

In what follows, we will first establish that DTDC algorithms such as HEED, ANTCLUST based, DMAC, MEDIC, and EDCR fall into one common category in terms of their CH distribution. Then, we will determine the probability distribution of the cluster area of the DTDC class of algorithms. Subsequently, the distribution of the cluster area will be used to derive the cluster density. Furthermore, we will consider the boundary (or border) effect due to the finite geographical area in which the nodes are distributed and modify the expressions to accommodate it. The proposed analytical results confirm that the empirical results derived from simulations by Bettstetter in [18] are indeed accurate.

The rest of the paper is organized as follows. Section 2 presents the nomenclature. Section 3 provides a mathematical model of the CH selection and distribution common to all DTDC algorithms. In Section 4, this model is used to identify the probability distribution of the cluster area of the DTDC class of algorithms. In Section 5, these results are used to find the cluster density and the expected number of clusters in rectangular and circular deployment areas. Simulation results presented in Section 6 establish that the analytical findings are in line with the actual values presented in existing literature. Section 7 presents the conclusion.

2. Nomenclature

Table 1 gives the notations used in what follows. Some are extracted from [12].

3. Preliminaries

This section presents the background necessary to find the CH density of DTDC class algorithms, covering HEED, DMAC, ANTCLUST based, MEDIC, and EDCR. As mentioned before, these algorithms produce well-distributed CHs by making a node’s decision to be a CH depend on the decisions of other nodes in its neighborhood.

We assume that 𝑁 nodes are distributed uniformly at random in a given deployment area 𝐴, resulting in a 2-D Poisson point distribution of intensity 𝜆, where 𝜆 = 𝑁/𝐴 [17]. Furthermore, we assume that all clusters are well populated; that is, each cluster consists of a large number of less reliable, low-cost nodes which work collaboratively to achieve reliable results. Hence, 𝛼𝜆 ≫ 1, where 𝛼 is a random variable denoting the cluster area. According to the DTDC class of algorithms, the area covered by a CH candidacy message is 𝜋𝑅², where 𝑅 represents the maximum distance a CH candidacy announcement message would reach. Since 𝛼 < 𝜋𝑅²,

$$\pi R^2 \lambda \gg 1. \tag{1}$$

The following common features exist in the DTDC class of algorithms.

(a) The DTDC class of algorithms does not allow two CHs to be within a distance 𝑅 of each other. Furthermore, it ensures that every node is either discovered by a CH (i.e., there is a CH within a distance 𝑅 of a regular node) or is itself a CH.

(b) Each node calculates a time 𝑇𝑖 at which it will broadcast the CH candidacy announcement, provided it has not heard a similar message from a neighbor by this time. The calculation of 𝑇𝑖 is algorithm specific. However, all algorithms ensure that 𝑇𝑖 is inversely proportional to the fitness of a node to be a CH. For example, in the EDCR algorithm, 𝑇𝑖 is inversely proportional to the relative residual energy level of a node [12]. As such, the node with the highest fitness to become a CH has the lowest 𝑇𝑖, so it announces its CH candidacy first and becomes the CH for that neighborhood.

(c) All the algorithms use a random component for tiebreaking. Hence, when all nodes are equally fit to be CHs, 𝑇𝑖 is purely random. This is true for the EDCR, HEED, MEDIC, and ANTCLUST algorithms at the initial deployment stage since all nodes have equal energy.

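The above features of the DTDC class of algorithms reaffirm that the selected CHs represent a dependent thinning point process on the original 2-D Poisson point process. Let $\mathcal{S}$ represent the set of all deployed nodes, where $\mathcal{S} \subset \mathbb{R}^2$ with $|\mathcal{S}| = N$. The clustering process yields a random set $\mathcal{H} \subset \mathcal{S}$ of secondary points, which are the CHs, with the property that $|ij| > R$, where $i, j \in \mathcal{H}$ and $i \neq j$. Note that $\mathcal{S} \setminus \mathcal{H}$ are the regular (non-CH) member nodes. For any node $m_k \in \mathcal{S} \setminus \mathcal{H}$ we have $|m_k i| < R$ and $T_{m_k} > T_i$ for at least one CH node $i \in \mathcal{H}$. Further, it should be noted that $m_k$ is a member of the cluster with CH $i$ when $|m_k i| < |m_k j| < R$ for all $j \in \mathcal{H}$ with $i \neq j$.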

According to [19], the aforementioned dependent thinning point process follows a Matérn Type III process when 𝑇𝑖 is a purely random value. Hence, we can conclude that the CH distribution of dependent thinning algorithms like HEED, ANTCLUST, DMAC, MEDIC, and EDCR immediately after deployment would resemble a Matérn Type III point process.

Example 1. Figure 1 gives a simplified description of Matérn Type III process applied to 3 random nodes 𝑎, 𝑏, and 𝑐 with |𝑎𝑏|=0.30, |𝑏𝑐|=0.23, |𝑎𝑐|=0.53, 𝑅=0.4, 𝑇𝑎=0.27, 𝑇𝑏=0.52, and 𝑇𝑐=0.78.

According to this illustration, since 𝑇𝑎<𝑇𝑏, 𝑎 eliminates 𝑏; since 𝑏 is eliminated, even though 𝑇𝑏<𝑇𝑐, 𝑐 would not be eliminated; hence, nodes 𝑎 and 𝑐 will be elected as CHs. Even though the above description clearly indicates that the DTDC class of algorithms resemble a Matérn Type III process, we cannot find the resultant CH density (or expected number of clusters) using this information. As Bertil Matérn has shown in [20], the point distribution of Matérn Type III-dependent thinning process is mathematically intractable.
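The elimination logic of Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `select_cluster_heads` and the collinear coordinates are assumptions chosen to reproduce the stated distances (|𝑎𝑏| + |𝑏𝑐| = |𝑎𝑐|), not the layout of Figure 1 itself.

```python
import math

def select_cluster_heads(positions, timers, R):
    """Timer-ordered dependent thinning: a node becomes a CH only if no
    already-elected CH lies within distance R when its timer fires."""
    order = sorted(positions, key=lambda name: timers[name])
    heads = []
    for name in order:
        if all(math.dist(positions[name], positions[h]) > R for h in heads):
            heads.append(name)
    return heads

# Hypothetical collinear layout giving |ab| = 0.30, |bc| = 0.23, |ac| = 0.53.
positions = {"a": (0.0, 0.0), "b": (0.30, 0.0), "c": (0.53, 0.0)}
timers = {"a": 0.27, "b": 0.52, "c": 0.78}

print(select_cluster_heads(positions, timers, R=0.4))  # ['a', 'c']
```

Node 𝑏 is silenced by 𝑎 and therefore never announces, which is why 𝑐 survives even though it lies within 𝑅 of 𝑏.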

Based on this background, we will derive the CH density of the class of DTDC algorithms by finding the probability density function (p.d.f) of 𝛼 for practical cases satisfying (1) in the next section.

4. Probability Density Function of Cluster Area

Based on our analysis, we observe that the probability of 𝛼 depends on the following two scenarios.

(1) For a given cluster area, there are no uncovered nodes in its cluster neighborhood (an uncovered node is a node that has not heard from a neighboring CH almost at the end of a new CH candidacy announcement time interval).

(2) The chance of having no such uncovered nodes.

Let $P_B$ be the probability that no uncovered nodes exist in a given cluster neighborhood. Then the conditional probability $P_{A|B}$ is the probability of a cluster area given that no uncovered nodes exist in the cluster neighborhood. Based on these facts, we find the joint probability $P_{A.B}$ of a resultant cluster area $\alpha$ when no uncovered nodes exist. One finds that

$$P_{A.B}(\delta_1 \le \alpha \le \delta_2) = P_{A|B}(\delta_1 \le \alpha \le \delta_2)\, P_B(\delta_1 \le \alpha \le \delta_2), \tag{2}$$

where $0 \le \delta_1 < \delta_2 \le \pi R^2$.

We use Figures 2, 3, and 4 to explain (2). Please note that the radius of each disk is 𝑅 in all the figures.

According to the class of DTDC algorithms, the smallest possible cluster area results whenever a given CH’s neighboring CHs sit on the perimeter of its CH broadcasting coverage disc of radius 𝑅, since no two CHs can be selected within each other’s CH broadcasting range 𝑅. This situation is shown in Figure 2.

Hence, we can write

$$P_{A|B}\left(0 < \alpha < \frac{\sqrt{3}R^2}{2}\right) = 0. \tag{3}$$

In other words, Figure 2 shows the highest possible CH density (number of CHs in a given unit area). According to the DTDC class of algorithms, we can expect cluster area sizes between the smallest, $\sqrt{3}R^2/2$, and the largest, $\pi R^2$, provided that there are no uncovered nodes in the cluster neighborhood. Therefore, we can write

$$0 < P_{A|B}(\delta_3 \le \alpha \le \delta_4) \le 1, \tag{4}$$

where $\sqrt{3}R^2/2 \le \delta_3 < \delta_4 \le \pi R^2$.

Further, when we have close-packed clusters (the smallest as shown in Figure 2 and the largest as shown in Figure 3), there cannot be any uncovered areas. Conversely, when the cluster area $\alpha > 3\sqrt{3}R^2/2$, there can be uncovered nodes in its neighborhood since there can be uncovered neighboring regions, as shown in Figure 4.

$P_B(\alpha)$ represents the probability that there are no uncovered nodes in the neighborhood of a given cluster (with area $\alpha$). This can be expressed by

$$P_B(\alpha) = P(n = 0 \mid \lambda A_u) = e^{-\lambda A_u}, \tag{5}$$

where $A_u$ is any uncovered area formed by the cluster setup, as shown in Figure 4. We can show that the neighboring clusters are close packed when the cluster area $\alpha \le 3\sqrt{3}R^2/2$. In other words, there is no uncovered area, resulting in $A_u = 0$ for $\alpha \le 3\sqrt{3}R^2/2$. As a result, the probability that there are no uncovered nodes is given by

$$P_B\left(\alpha \le \frac{3\sqrt{3}R^2}{2}\right) = 1. \tag{6}$$

According to (5), $P_B(\alpha)$ is an exponentially decaying function when $\alpha > 3\sqrt{3}R^2/2$. Now let us consider Figure 5. This is a special case of Figures 3 and 4 where nodes 0 and 6 are placed a distance $2R$ apart. According to Figure 5, there is a chance for a node to be in the uncovered area $A_u$ shaded in gray. The cluster area $\alpha$ of Figure 5 can be expressed as

$$\alpha = \frac{3\sqrt{3}R^2}{2} + \left(\frac{\pi}{6} - \frac{\sqrt{3}}{4}\right)R^2 = \frac{3\sqrt{3}R^2}{2}\,(1 + 0.0349). \tag{7}$$

This is only 3.49% bigger than the size of the cluster area shown in Figure 3. The uncovered area $A_u$ of Figure 5 is

$$A_u = 2\left(\Delta_{p,q,r} + \Delta_{p,q,c_0} + \Delta_{p,r,c_1} + \Delta_{q,r,c_6} - S_{p,q,c_0} - S_{p,r,c_1} - S_{q,r,c_6}\right), \tag{8}$$

where, in general, $\Delta_{x,y,z}$ represents the area of the triangle $\{x, y, z\}$, and $S_{x,y,z}$ represents the area of the sector $\{x, y, z\}$. Since $p = (0.5R, 0.8660R)$, $q = (0, R)$, $r = (0.5446R, 1.1613R)$, $c_0 = (0, 0)$, $c_1 = (1.5R, 0.8660R)$, and $c_6 = (0, 2R)$, we can derive $A_u = 0.094R^2$.
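The value 𝐴𝑢 = 0.094𝑅² can be checked numerically from the coordinates listed above. The following is a minimal sketch, assuming the factor 2 in (8) multiplies the whole triangle-minus-sector expression (this reading reproduces the quoted figure); the helper names `triangle_area` and `sector_area` are illustrative and not from the original.

```python
import math

def triangle_area(p, q, r):
    """Area of triangle {p, q, r} via the cross product (shoelace) formula."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2.0

def sector_area(p, q, c, radius=1.0):
    """Area of the circular sector centred at c spanning the minor arc from p to q."""
    ang_p = math.atan2(p[1] - c[1], p[0] - c[0])
    ang_q = math.atan2(q[1] - c[1], q[0] - c[0])
    theta = abs(ang_p - ang_q)
    theta = min(theta, 2.0 * math.pi - theta)  # take the minor arc
    return 0.5 * radius ** 2 * theta

# Coordinates from the text, in units of R (so areas are in units of R^2).
p, q, r = (0.5, 0.8660), (0.0, 1.0), (0.5446, 1.1613)
c0, c1, c6 = (0.0, 0.0), (1.5, 0.8660), (0.0, 2.0)

one_side = (triangle_area(p, q, r)
            + triangle_area(p, q, c0) + triangle_area(p, r, c1) + triangle_area(q, r, c6)
            - sector_area(p, q, c0) - sector_area(p, r, c1) - sector_area(q, r, c6))
A_u = 2.0 * one_side  # factor 2 taken from (8)
print(f"A_u = {A_u:.3f} R^2")
```

Each triangle-plus-centre term minus its sector removes one circular segment from the triangle {𝑝, 𝑞, 𝑟}, leaving the curvilinear region bounded by the three coverage discs.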

We have shown that $\lambda\pi R^2 \gg 1$ in (1). Therefore, if we consider a WSN with 100 nodes in a given node neighborhood, then $\lambda\pi R^2 = 100$ and the resultant $P_B = P(n = 0 \mid \lambda A_u) = 0.0502$. On the other hand, when the neighborhood contains 200 nodes, this is further reduced to $P_B = P(n = 0 \mid \lambda A_u) = 0.0025$. Hence, we can conclude that

$$P_B\left(\alpha > \frac{3\sqrt{3}R^2}{2}\right) \to 0, \tag{9}$$

where $\lambda\pi R^2 \gg 1$.
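The two probabilities quoted above follow directly from (5) with 𝐴𝑢 = 0.094𝑅². A short sketch (the function name `prob_no_uncovered` is illustrative):

```python
import math

def prob_no_uncovered(nodes_in_range, uncovered_fraction=0.094):
    """P_B = exp(-lambda * A_u), with lambda * pi * R^2 = nodes_in_range
    and A_u = uncovered_fraction * R^2."""
    lam_times_R2 = nodes_in_range / math.pi
    return math.exp(-lam_times_R2 * uncovered_fraction)

for m in (100, 200):
    print(m, round(prob_no_uncovered(m), 4))  # 0.0502 and 0.0025
```

Because the exponent grows linearly with the neighborhood population, doubling the population squares the probability, which is why it vanishes so quickly for well-populated clusters.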

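Therefore, we can approximate that

$$P_B\left(\alpha \le \frac{3\sqrt{3}R^2}{2}\right) = 1, \qquad P_B\left(\alpha > \frac{3\sqrt{3}R^2}{2}\right) = 0, \tag{10}$$

provided that $\lambda\pi R^2 \gg 1$. Hence, once we combine (2), (3), (4), and (10), we obtain that

$$P_{A.B}\left(\alpha < \frac{\sqrt{3}R^2}{2} \ \text{or} \ \alpha > \frac{3\sqrt{3}R^2}{2}\right) = 0. \tag{11}$$

Therefore,

$$P_{A.B}\left(\frac{\sqrt{3}R^2}{2} \le \alpha \le \frac{3\sqrt{3}R^2}{2}\right) = 1. \tag{12}$$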

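The resultant cluster areas $\alpha$ of the DTDC class of algorithms have an equal chance of falling anywhere in the interval $[\sqrt{3}R^2/2,\ 3\sqrt{3}R^2/2]$, because all nodes have an equal chance of getting the lowest $T_i$ when they have equal fitness to be a CH. As a result, the cluster area p.d.f, $p_{A.B}(\alpha)$, is uniform. Hence,

$$p_{A.B}(\alpha) = \begin{cases} \dfrac{1}{\sqrt{3}R^2}, & \dfrac{\sqrt{3}R^2}{2} \le \alpha \le \dfrac{3\sqrt{3}R^2}{2}, \\ 0, & \text{otherwise}, \end{cases} \tag{13}$$

provided that $\pi R^2 \lambda \gg 1$.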
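As a quick consistency check, the constant in (13) is simply the reciprocal of the interval length, which follows from requiring a uniform density $c$ to integrate to one:

```latex
\[
\int_{\sqrt{3}R^2/2}^{3\sqrt{3}R^2/2} c \, d\alpha
  = c\left(\frac{3\sqrt{3}}{2} - \frac{\sqrt{3}}{2}\right)R^2
  = c\,\sqrt{3}R^2 = 1
  \quad\Longrightarrow\quad
  c = \frac{1}{\sqrt{3}R^2}.
\]
```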

Thus far we have derived the p.d.f of the cluster area 𝛼. This result will be used in deriving the expected cluster density in the subsequent section.

5. Derivation of Expected Cluster Density

In this section, we will derive the expected cluster density (or CH density as each cluster is served by one and only one CH) for the class of DTDC algorithms.

Let us define $y$ as the probability that a randomly chosen node is a CH. Thus,

$$y = \frac{\text{Number of CHs in a given area}}{\text{Number of all nodes in the same given area}} = \frac{1}{\text{Number of nodes in a random cluster}} = \frac{1}{\alpha\lambda}. \tag{14}$$

We note that when $z = f(x)$ and $x$ is a random variable with a p.d.f of $p_X(x)$, then the p.d.f of $z$ is given by

$$p_Z(z) = \frac{p_X\left(f^{-1}(z)\right)}{\left|f'\left(f^{-1}(z)\right)\right|}. \tag{15}$$

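We can write the p.d.f of the random variable $y$, $p_Y(y)$, using (13) as

$$p_Y(y) = \begin{cases} \dfrac{1}{\sqrt{3}\,y^2\lambda R^2}, & \dfrac{2}{3\sqrt{3}\lambda R^2} \le y \le \dfrac{2}{\sqrt{3}\lambda R^2}, \\ 0, & \text{otherwise}. \end{cases} \tag{16}$$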
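The intermediate steps behind (16) are short enough to spell out. Applying the change-of-variable rule (15) to the mapping in (14), with the uniform cluster-area density of (13), gives:

```latex
\[
y = f(\alpha) = \frac{1}{\alpha\lambda}, \qquad
f^{-1}(y) = \frac{1}{y\lambda}, \qquad
\left|f'(\alpha)\right| = \frac{1}{\alpha^2\lambda}
\;\Longrightarrow\;
\left|f'\!\left(f^{-1}(y)\right)\right| = y^2\lambda ,
\]
\[
p_Y(y) = \frac{p_{A.B}\!\left(\tfrac{1}{y\lambda}\right)}{y^2\lambda}
       = \frac{1}{\sqrt{3}R^2}\cdot\frac{1}{y^2\lambda}
       = \frac{1}{\sqrt{3}\,y^2\lambda R^2},
\]
\[
\frac{\sqrt{3}R^2}{2} \le \frac{1}{y\lambda} \le \frac{3\sqrt{3}R^2}{2}
\;\Longleftrightarrow\;
\frac{2}{3\sqrt{3}\lambda R^2} \le y \le \frac{2}{\sqrt{3}\lambda R^2}.
\]
```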

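According to (14), $y = k/N$, where $k$ is the total number of CHs at a given moment and $N$ is the total number of nodes. Hence $E[y]$, the expected probability that a given node is a CH, can be given as

$$E[y] = E\left[\frac{k}{N}\right] = \frac{E[k]}{N} = \frac{\lambda_c}{\lambda}, \tag{17}$$

where $\lambda_c$ is the CH density. So we have

$$\lambda_c = \lambda E[y], \tag{18}$$

$$E[y] = \int y\,p_Y(y)\,dy = \int_{2/(3\sqrt{3}\lambda R^2)}^{2/(\sqrt{3}\lambda R^2)} y\,\frac{1}{\sqrt{3}\,y^2\lambda R^2}\,dy = \frac{1}{\left(\sqrt{3}/(\pi\ln 3)\right)\pi\lambda R^2} = \frac{1}{0.5018\,\pi\lambda R^2}. \tag{19}$$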

According to (19), we can expect a 0.5018 fraction of nodes belonging to a given CH’s broadcasting range 𝑅 neighborhood to join its cluster.

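Further, using (18) and (19), we can show that

$$\lambda_c = \frac{\lambda}{\left(\sqrt{3}/(\pi\ln 3)\right)\pi\lambda R^2} = \frac{\ln 3}{\sqrt{3}R^2} = \frac{1}{0.5018\,\pi R^2}. \tag{20}$$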
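The constant 0.5018 and the closed form of the integral in (19) can be verified numerically. The sketch below uses a plain midpoint rule; the values of `lam` and `R` are arbitrary illustrative choices, and the function name is not from the original.

```python
import math

def expected_ch_fraction(lam, R, steps=100_000):
    """Midpoint-rule estimate of E[y] = integral of y * p_Y(y) over the support in (16)."""
    lo = 2.0 / (3.0 * math.sqrt(3) * lam * R ** 2)
    hi = 2.0 / (math.sqrt(3) * lam * R ** 2)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        p_y = 1.0 / (math.sqrt(3) * y ** 2 * lam * R ** 2)
        total += y * p_y * h
    return total

lam, R = 0.03, 20.0  # hypothetical node density (nodes/m^2) and range (m)
numeric = expected_ch_fraction(lam, R)
closed_form = math.log(3) / (math.sqrt(3) * lam * R ** 2)
fraction = math.sqrt(3) / (math.pi * math.log(3))

print(round(fraction, 4))        # 0.5018
print(numeric, closed_form)      # the two values should agree closely
print(lam * closed_form)         # CH density from (18); equals ln(3) / (sqrt(3) * R^2)
```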

Hence, we can conclude that the expected CH density 𝜆𝑐 is independent of the node density provided that 𝜋𝑅²𝜆 ≫ 1.

Observation 1. The result obtained in (20) matches the empirical formula proposed by Bettstetter in [18], where $\lambda_c = \lambda/(1 + \mu/2)$ and $\mu = \pi R^2\lambda$. When $\pi R^2\lambda \gg 1$, the empirical formula proposed by Bettstetter reduces to

$$\lambda_c = \frac{\lambda}{0.5\mu} = \frac{1}{0.5\,\pi R^2}. \tag{21}$$
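How quickly the empirical formula approaches the analytical limit can be seen by tabulating their ratio for increasing 𝜇. This is an illustrative sketch only; the function names and the choice 𝑅 = 1 are assumptions made for the comparison.

```python
import math

def density_analytical(R):
    """CH density from (20); independent of the node density."""
    return math.log(3) / (math.sqrt(3) * R ** 2)

def density_bettstetter(lam, R):
    """Empirical CH density lambda / (1 + mu / 2) with mu = pi * R^2 * lambda."""
    mu = math.pi * R ** 2 * lam
    return lam / (1.0 + mu / 2.0)

R = 1.0
for mu in (5, 20, 100, 1000):
    lam = mu / (math.pi * R ** 2)
    ratio = density_bettstetter(lam, R) / density_analytical(R)
    print(f"mu = {mu:5d}: empirical / analytical = {ratio:.4f}")
```

The ratio rises toward 0.5018/0.5 ≈ 1.004 as 𝜇 grows, so the two expressions differ by well under 1% in the well-populated regime, while for small 𝜇 the analytical limit overestimates the density.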

In the analysis thus far, we have ignored the influence of the node deployment region boundary and its effects. In what follows, we analyze the boundary effect. The CHs closest to the boundary do not have any neighboring CHs beyond the boundary; that is, nodes at the boundary have a higher isolation probability even though all the nodes are uniformly distributed within the deployed area. Hence, CHs are more likely to be found at the boundary. This was observed and confirmed in [18].

We can use (17) and (19) to derive the expected number of clusters $E[k]$ to be formed, assuming that the boundary effect does not exist. In other words, we have relaxed the reality that there can be more CHs close to the boundary than in the rest of the area. Thus,

$$E[k] = N E[y] = \frac{N}{M\left(\sqrt{3}/(\pi\ln 3)\right)}, \tag{22}$$

where $M = \pi R^2\lambda$ is the expected number of nodes in any given CH’s broadcasting range $R$. That is, in (22), we have not considered the boundary effect. In what follows, we derive $M$ considering the boundary effect for frequently considered node deployment region shapes, namely, a rectangular region and a circular region. Subsequently, we use these results to obtain $E[k]$ accounting for the boundary effect.
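To give a feel for the size of the boundary effect, (22) can be evaluated with the boundary ignored. The sketch below borrows the numbers of the first design scenario in Section 6 (300 nodes, 100 × 100 m², 𝑅 = 19.42 m) purely as illustrative inputs; the function name is an assumption.

```python
import math

COVERAGE_FRACTION = math.sqrt(3) / (math.pi * math.log(3))  # ~0.5018, from (19)

def expected_clusters_no_boundary(n_nodes, area, R):
    """E[k] from (22) with M = pi * R^2 * lambda, i.e. ignoring the boundary."""
    lam = n_nodes / area
    M = math.pi * R ** 2 * lam
    return n_nodes / (M * COVERAGE_FRACTION)

print(round(expected_clusters_no_boundary(300, 100.0 * 100.0, 19.42), 2))
```

This evaluates to roughly 16.8 clusters, noticeably fewer than the 20 clusters that the same 𝑅 is designed to produce once the boundary is accounted for.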

5.1. Boundary Effect on 𝐸[𝑘] due to a Rectangular Deployment Area

We derive $M$ for a rectangular region with dimensions $a \times b$ and $N$ ad hoc deployed nodes. For this scenario, the probability $P_0$ that two uniformly distributed nodes are within the CH candidacy broadcasting range $R$ of each other is given by the integral

$$P_0 = \int_0^R f_S(s)\,ds, \tag{23}$$

where $f_S(s)$ is the p.d.f of the distance $S$ between two nodes that are independently and uniformly distributed (at random) in a rectangular area of size $a \times b$, where $a \ge b > R$. According to [21], $f_S(s)$ is given by

$$f_S(s) = \frac{4s}{a^2 b^2}\left(\frac{\pi}{2}ab - as - bs + \frac{1}{2}s^2\right) \quad \text{for } 0 \le s \le b. \tag{24}$$

Further, when there are $N$ ($\gg 1$) uniformly distributed nodes in the deployment region, we can expect $M$ nodes in a given CH neighborhood of radius $R$, where $M$ is given by

$$M = N P_0. \tag{25}$$

Hence, using (23)–(25), we can derive

$$M = N\,\frac{\pi R^2}{ab}\left(1 - \frac{4R}{3\pi ab}(a + b) + \frac{R^2}{2\pi ab}\right). \tag{26}$$
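The closed form (26) can be cross-checked by integrating (24) numerically. A minimal sketch with a midpoint rule follows; the function names and the sample values (a 100 × 100 region with 𝑅 = 19.42 and 𝑁 = 300) are illustrative assumptions.

```python
import math

def f_S_rect(s, a, b):
    """Distance p.d.f. (24) for two uniform points in an a x b rectangle, 0 <= s <= b."""
    return (4.0 * s / (a * a * b * b)) * (math.pi * a * b / 2.0 - a * s - b * s + s * s / 2.0)

def p0_numeric(R, a, b, steps=20_000):
    """Midpoint-rule integral of f_S over [0, R], as in (23)."""
    h = R / steps
    return sum(f_S_rect((i + 0.5) * h, a, b) for i in range(steps)) * h

def p0_closed_form(R, a, b):
    """P_0 = M / N from (26)."""
    return (math.pi * R * R / (a * b)) * (
        1.0 - 4.0 * R * (a + b) / (3.0 * math.pi * a * b) + R * R / (2.0 * math.pi * a * b))

a, b, R, N = 100.0, 100.0, 19.42, 300
print(p0_numeric(R, a, b), p0_closed_form(R, a, b))
print("M =", N * p0_closed_form(R, a, b))  # expected neighbours within range R
```

For these inputs 𝑀 comes out near 29.9, compared with 𝜋𝑅²𝜆 ≈ 35.5 when the boundary is ignored.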

Therefore, when (26) is used with (22), we can derive the expected number of CHs. Thus,

$$E[k] = \frac{ab\ln 3}{\sqrt{3}R^2} \cdot \frac{1}{1 - \left(4R/(3\pi ab)\right)(a + b) + R^2/(2\pi ab)}. \tag{27}$$

As we have already discussed, deriving the CH candidacy broadcasting range $R$ for a desired $E[k]$ is a salient requirement in most applications. Hence, rearranging (27), we obtain

$$\frac{\sqrt{3}}{2\pi ab}R^4 - \frac{4\sqrt{3}(a + b)}{3\pi ab}R^3 + \sqrt{3}R^2 - \frac{ab\ln 3}{E[k]} = 0. \tag{28}$$

By solving (28), we can derive 𝑅 for a given ad hoc network setup in a rectangular deployment region with the desired number of clusters, 𝐸[𝑘], provided that 𝜋𝑅²𝜆 ≫ 1.
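One simple way to solve for 𝑅 in practice is bisection on (27), which is equivalent to finding the relevant root of the quartic (28). The sketch below is an assumption about how one might do this, not the authors’ procedure; it relies on 𝐸[𝑘] decreasing monotonically in 𝑅 over (0, min(𝑎, 𝑏)), which holds for the region sizes tried here.

```python
import math

def expected_clusters_rect(R, a, b):
    """E[k] from (27) for an a x b rectangle (a >= b > R)."""
    edge = 1.0 - 4.0 * R * (a + b) / (3.0 * math.pi * a * b) + R * R / (2.0 * math.pi * a * b)
    return a * b * math.log(3) / (math.sqrt(3) * R * R * edge)

def solve_range_rect(k, a, b, tol=1e-9):
    """Bisection for R in (0, min(a, b)) with expected_clusters_rect(R) == k.

    Assumes E[k] decreases monotonically in R on this bracket.
    """
    lo, hi = 1e-6 * min(a, b), min(a, b)
    if not (expected_clusters_rect(lo, a, b) > k > expected_clusters_rect(hi, a, b)):
        raise ValueError("target k is not bracketed on (0, min(a, b))")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_clusters_rect(mid, a, b) > k:
            lo = mid  # too many clusters: a larger range is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(solve_range_rect(20, 100.0, 100.0), 2))
print(round(solve_range_rect(30, 150.0, 100.0), 2))
```

For 20 clusters in a 100 × 100 m² region and 30 clusters in a 150 × 100 m² region, the returned ranges are close to the 19.42 m and 19.11 m quoted in Section 6.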

5.2. Boundary Effect on 𝐸[𝑘] due to a Circular Deployment Region

Let us now derive $E[k]$ for a circular deployment region. We follow the same approach as in the rectangular deployment region case. Let us assume that the ad hoc deployed wireless network consists of $N$ nodes deployed uniformly at random in a circular deployment region of radius $r$, resulting in $\lambda = N/(\pi r^2)$.

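The expected number of neighboring nodes $M$ in a given CH’s candidacy broadcasting range $R$, for a circular deployment area of radius $r$, is also given by (25). Note that the $P_0$ given in (23) is still applicable. However, $f_S(s)$, that is, the p.d.f of the distance $S$ between two nodes that are independently and uniformly distributed (at random) in a circular area with radius $r$, is given by

$$f_S(s) = \frac{4s}{\pi r^2}\left[\cos^{-1}\left(\frac{s}{2r}\right) - \frac{s}{2r}\sqrt{1 - \left(\frac{s}{2r}\right)^2}\,\right], \quad \text{for } 0 \le s \le 2r, \tag{29}$$

according to [22]. Hence, we can write the $E[k]$ of a given circular area with radius $r$ as

$$E[k] = \frac{\pi^2\ln 3}{2\sqrt{3}\,D(R/r)}, \tag{30}$$

where

$$D(R/r) = 4\left(\frac{R}{2r}\right)^2\cos^{-1}\left(\frac{R}{2r}\right) - 3\,\frac{R}{2r}\left(1 - \left(\frac{R}{2r}\right)^2\right)^{1/2} + 2\,\frac{R}{2r}\left(1 - \left(\frac{R}{2r}\right)^2\right)^{3/2} + \sin^{-1}\left(\frac{R}{2r}\right). \tag{31}$$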

Thus, we can determine 𝑅 for a given circular deployment area with radius 𝑟 for an expected number of clusters 𝐸[𝑘] by solving the reordered (30).
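As with the rectangular case, (30) can be inverted numerically. The following bisection sketch is one possible approach under the assumption that 𝐷 increases with 𝑅 on (0, 2𝑟), so that 𝐸[𝑘] decreases; the function names are illustrative.

```python
import math

def D(ratio):
    """D(R/r) from (31), written in terms of x = R / (2r)."""
    x = ratio / 2.0
    root = math.sqrt(1.0 - x * x)
    return (4.0 * x * x * math.acos(x) - 3.0 * x * root
            + 2.0 * x * root ** 3 + math.asin(x))

def expected_clusters_circle(R, r):
    """E[k] from (30) for a circular region of radius r."""
    return math.pi ** 2 * math.log(3) / (2.0 * math.sqrt(3) * D(R / r))

def solve_range_circle(k, r, tol=1e-9):
    """Bisection for R in (0, 2r) with expected_clusters_circle(R) == k.

    Assumes E[k] decreases monotonically in R on this bracket.
    """
    lo, hi = 1e-3 * r, 2.0 * r
    if not (expected_clusters_circle(lo, r) > k > expected_clusters_circle(hi, r)):
        raise ValueError("target k is not bracketed on (0, 2r)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_clusters_circle(mid, r) > k:
            lo = mid  # too many clusters: a larger range is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(solve_range_circle(20, 200.0), 2))
```

For 20 clusters in a circular region of radius 200 m, the result is close to the 68.24 m quoted in Section 6.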

Note 1. We derived 𝐸[𝑘] assuming that 𝑇𝑖 is a random variable. This is true only when all nodes have equal fitness to be a CH; that is, the residual energies of all the nodes are the same. This is in fact true for the HEED, ANTCLUST, and EDCR algorithms during initial deployment, under the assumption that the sensors are ideal. In subsequent rounds, however, 𝑇𝑖 is weighted by each node’s residual energy level at the beginning of cluster formation; that is, the node with the highest residual energy becomes the CH in a given neighborhood. We know that the node closest to a CH spends the minimum energy in communication. As a result, it is the highest-energy node in that neighborhood at the beginning of the subsequent CH selection phase. Hence, in a subsequent round, the CHs would be the nodes closest to the previous CHs. Thus, we can expect, on average, the same number of clusters to be formed in subsequent reclustering rounds, and (28) remains valid for all subsequent rounds as well.

In this section, we presented an analytical technique to find the Cluster/CH density of DTDC class of algorithms. Further, we derived the expected number of clusters in a finite area considering the boundary effect. In what follows, we compare the analytical results with simulation experiment results.

6. Simulation Results

In this section, the proposed analytical method to determine the cluster density and expected number of clusters for the DTDC class of algorithms is evaluated using MATLAB simulations. It is already established that the proposed analytical results match the empirical results derived using the DMAC algorithm in [18]. For comparison, the simulation results for the HEED, ANTCLUST, and EDCR algorithms are presented as well. The results are presented based on the following design scenarios.

(1) Design requirement of 20 clusters, each with 15 nodes, monitoring a square area of 100 × 100 m². That is, 300 nodes should be deployed in this region. According to (28), the computed broadcasting distance is 𝑅 = 19.42 m to achieve the 20-cluster requirement.

(2) Design requirement of 30 clusters, each with 20 nodes, monitoring a rectangular area of 150 × 100 m². That is, 600 nodes should be deployed in this region. According to (28), the computed broadcasting distance is 𝑅 = 19.11 m to achieve the 30-cluster requirement.

(3) Design requirement of 20 clusters, each with 20 nodes, monitoring a circular area with radius 200 m. That is, 400 nodes should be deployed in this region. According to (30), the computed broadcasting distance is 𝑅 = 68.24 m to achieve the 20-cluster requirement.
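For readers who want a rough sanity check of the first scenario without the full protocol implementations, the initial-deployment behavior can be approximated by a purely random announcement order. The Python sketch below is a simplified model under that assumption; it is not the MATLAB code or the HEED, ANTCLUST, or EDCR implementations used for the reported results, and the function names, trial count, and seed are illustrative.

```python
import math
import random

def count_cluster_heads(points, R, rng):
    """Count CHs when nodes announce in a purely random order (equal fitness)
    and a node becomes a CH only if no elected CH lies within distance R."""
    order = list(range(len(points)))
    rng.shuffle(order)
    heads = []
    for idx in order:
        x, y = points[idx]
        if all((x - hx) ** 2 + (y - hy) ** 2 > R * R for hx, hy in heads):
            heads.append((x, y))
    return len(heads)

def mean_clusters_square(n_nodes, side, R, trials, seed=0):
    """Average CH count over independent uniform deployments in a square."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        pts = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n_nodes)]
        total += count_cluster_heads(pts, R, rng)
    return total / trials

# Scenario (1): 300 nodes, 100 x 100 m^2, R = 19.42 m, target of 20 clusters.
print(mean_clusters_square(300, 100.0, 19.42, trials=200))
```

The printed average can then be compared with the design target of 20 clusters; energy-aware timers, message loss, and later reclustering rounds are outside the scope of this sketch.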

The simulation results for the scenarios described above are given in Table 2. H1, A1, and E1 denote the results of the HEED, ANTCLUST, and EDCR algorithms, respectively, for scenario 1 (square area). Similarly, H2, A2, and E2 represent the results for scenario 2 (rectangular area), and H3, A3, and E3 represent the results for scenario 3 (circular area). Note that $E[k]$ denotes the desired number of clusters in each case. The average and standard deviation (AV ± SD) of the actual number of clusters ($E[k]_A$), obtained via a large number of different random node deployment simulations for each scenario, are tabulated. The $E[k]_A$ tabulated in column “Beginning” corresponds to the cluster formation results at the initial deployment stage with a fresh set of homogeneous energy nodes, column “End” corresponds to the average number of clusters close to the end of life of the sensor bed (we used 95% of nodes alive as the lifetime measurement [12]), and column “Middle” corresponds to the average number of clusters at a point halfway between the “Beginning” and “End” scenarios. Further, the cumulative average of these three cases is presented in the column “Overall”.

The results given in Table 2 show us that the analytical estimation for 𝑅 based on 𝐸[𝑘] cluster requirement is indeed valid as only a minimal variation of 𝐸[𝑘] is seen in all simulation results. These results (based on HEED, ANTCLUST and EDCR algorithms) and independent simulation results of DMAC algorithm (and its corresponding empirical formula) given in [18] affirm the validity and applicability of the proposed analytical technique in determining the cluster density and the expected number of clusters of DTDC class of algorithms.

As can be seen from Table 2, all major algorithms in the DTDC class respond in a similar manner. Hence, without loss of generality, the EDCR algorithm is selected from this class for further analysis. For the analysis, we use 15 different hypothetical node deployment requirements (cases) that cover the applicability of the analytical method to square, rectangular, and circular deployment regions, with different expected numbers of clusters for a given deployment region and different expected numbers of nodes per cluster. These requirements are listed in Table 3. The case number links each deployment requirement to its tabulated test results in Table 4. The column under the heading “Area” presents the dimensions of the deployment region (e.g., 𝑎×𝑏 for a rectangular region and 𝜋𝑟² for a circular region), while the remaining columns present the expected number of clusters 𝐸[𝑘], the expected number of nodes in a cluster 𝐸[𝑛], and the total number of nodes to be deployed in the region, 𝑁 (= 𝐸[𝑘]×𝐸[𝑛]). The last column presents the 𝑅 calculated for each case using either (28) or (30), depending on the shape of the region.

Table 4 shows the simulation results for the deployment requirements listed in Table 3. It presents the average and standard deviation (AV ± SD) of the actual number of clusters observed over a large number of different random node deployments for each case. The results tabulated in Table 4 indicate that the proposed analytical technique for estimating 𝑅 for a desired number of clusters 𝐸[𝑘] is indeed an accurate method to realize the actual number of clusters. Furthermore, there is minimal variation from 𝐸[𝑘] irrespective of the shape of the deployment region (rectangular, square, or circular), the desired number of clusters, and the expected member population in each cluster, provided that all clusters are well populated.

The simulation results presented thus far show the applicability of the proposed analytical technique in estimating the expected number of clusters of the DTDC class of algorithms, provided that each cluster is well populated, that is, $\pi R^2\lambda \gg 1$. To identify a minimum threshold for $\pi R^2\lambda$, or for the expected number of nodes in a cluster $E[n]$, for a given application requirement, we observe the behavior of the curves of the average number of actual clusters, $E[k]_A$, versus node density, $\lambda$, for different CH broadcasting ranges $R$. Figures 6 and 7 present these curves ($E[k]_A$ versus $\lambda$) for the EDCR algorithm applied to a square deployment region of size 200 × 200 m² and a circular deployment region with radius 100 m, respectively. Both graphs contain $E[k]_A$ versus $\lambda$ curves for $R$ = 25, 30, 35, 40, and 45. The expected number of clusters $E[k]$, calculated using (28) for Figure 6 and (30) for Figure 7, is plotted as a vertical dotted line for each $R$.

Figures 6 and 7 indicate that all the $E[k]_A$ versus $\lambda$ curves are asymptotic to, and close to, the expected number of clusters $E[k]$. The vertical solid error bars marked on each $E[k]$ line show the 5% (short) and 10% (long) levels below $E[k]$, which are reached at $\pi R^2\lambda$ values of 30 and 20, respectively. It was already identified in Section 5 that a 0.5018 fraction of the nodes in any given CH’s broadcasting range $R$ neighborhood ($M$) join its cluster. Therefore, the proposed analytical technique can be used to determine the CH candidacy broadcasting range $R$ of the DTDC class of algorithms with a maximum error of 10% for a required expected number of clusters $E[k]$ when the expected number of nodes in a cluster, $E[n]$, is more than 10. The number of nodes in a cluster is well above this figure in most practical applications.

The simulation results presented above, together with the empirical formula derived from simulation experiments in [18], affirm the accuracy of the proposed analytical method in determining 𝑅 for a given expected number of clusters, 𝐸[𝑘], of the DTDC class of algorithms at the network planning stage.

7. Conclusion

Distributed clustering is a popular technique for organizing ad hoc deployed wireless networks, including WSNs. We found that clustering algorithms like DMAC, HEED, ANTCLUST, MEDIC, and EDCR can be categorized into the class of DTDC algorithms based on the common underlying Dutch auction principle in CH selection, which results in a similar CH distribution. In this research, we have provided an analytical framework which can be used to derive the cluster density, 𝜆𝑐, for a given deployment requirement where each cluster is assumed to be well populated. Furthermore, the analysis framework has been extended to include the effects of the boundary resulting from a finite deployment region when computing the expected number of clusters. The proposed analytical technique was verified via simulation experiments, and the results were presented. Further, the empirical formula proposed by Bettstetter in [18] independently verifies the accuracy of the proposed technique and vice versa. The authors feel that this analytical framework can be extended in future research to derive 𝜆𝑐 for any generic situation described by a Matérn Type III dependent thinning point process [20].