Abstract

In a distributed parameter estimation problem, a typical sensor node communicates its estimate at each sampling instant, either by the diffusion algorithm or by the incremental algorithm. Both of these conventional distributed algorithms involve significant communication overhead and, consequently, defeat the basic purpose of wireless sensor networks. In the present paper, we therefore propose two new distributed algorithms, namely, block diffusion least mean square (BDLMS) and block incremental least mean square (BILMS), by extending the concept of block adaptive filtering to the distributed adaptation scenario. Performance analysis of the proposed BDLMS and BILMS algorithms has been carried out and shows performance similar to that of the conventional diffusion LMS and incremental LMS algorithms, respectively. The convergence behavior of the proposed algorithms observed in the simulation study is also found to agree with the theoretical analysis. The remarkable and interesting aspect of the proposed block-based algorithms is that their communication overhead per node and their latency are lower than those of the conventional algorithms by a factor as high as the block size used in the algorithms.

1. Introduction

A wireless sensor network (WSN) consists of a group of sensor nodes that perform distributed sensing by coordinating themselves through wireless links. Since the nodes in a WSN operate with limited battery power, it is important to design the network with a minimum of communication among the nodes for estimating the required parameter vector [1, 2]. A number of research papers in the literature address the energy issues of sensor networks. According to the energy estimation scheme based on the fourth-power path-loss model with Rayleigh fading [3], the transmission of 1 kb of data over a distance of 100 m, operating at 1 GHz using BPSK modulation with a $10^{-6}$ bit-error rate, requires 3 J of energy. The same energy can be used to execute 300 M instructions on a 100 MIPS/W general-purpose processor. Therefore, it is of great importance to minimize the communication among nodes by maximizing local estimation at each sensor node.
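As a back-of-the-envelope check on this figure (a restatement of the numbers cited from [3], not an additional measurement), 100 MIPS/W is equivalent to $10^{8}$ instructions per joule, so
$$3\,\text{J} \times 10^{8}\,\frac{\text{instructions}}{\text{J}} = 3 \times 10^{8} = 300\,\text{M instructions}.$$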

Each node in a WSN collects noisy observations related to certain desired parameters. In the centralized solution, every node in the network transmits its data to a central fusion center (FC) for processing. This approach has the disadvantage of not being robust to failure of the FC, and it needs a powerful central processor. A further problem with centralized processing is the lack of scalability and the requirement for large communication resources [1]. If the intended application and the sensor architecture allow more local processing, the network can be made more energy efficient than with communication-intensive centralized processing. Alternatively, each node in the network can act as an individual adaptive filter, estimating the parameter from local observations and by cooperating with its neighbors. There is therefore a need for new distributed adaptive algorithms that reduce the communication overhead, for low power consumption and for the low latency required in real-time operation.

The performance of distributed algorithms depends on the mode of cooperation among the nodes, for example, incremental [4, 5], diffusion [6], probabilistic diffusion [7], and diffusion with adaptive combiners [8]. To improve robustness against the spatial variation of the signal-to-noise ratio (SNR) over the network, an efficient adaptive combination strategy has recently been proposed [8]. A fully distributed and adaptive implementation enabling each node in the network to make individual decisions is dealt with in [9].

In the block filtering technique [10], the filter coefficients are adjusted once for each new block of data, in contrast to once for each new input sample in the least mean square (LMS) algorithm. The block adaptive filter therefore permits faster implementation while maintaining performance equivalent to that of the widely used LMS adaptive filter. Block LMS algorithms could thus be used at each node in order to reduce the amount of communication.

With this in mind, we present a block formulation of the existing cooperative algorithms [4, 11] based on distributed protocols. Distinctively, in this paper, an adaptive mechanism is proposed in which the nodes of the same neighborhood communicate with each other after processing a block of data, instead of communicating their estimates to the neighbors after every sample of input data. As a result, the average bandwidth for communication among the neighboring nodes decreases by a factor equal to the block size of the algorithm. In real-time scenarios, the nodes in the sensor network follow a particular protocol for communication [12-14], where the communication time is much longer than the processing time. The proposed block distributed algorithms provide an excellent balance between message transmission delay and processing delay, by increasing the interval between two messages and by increasing the computational load on each node in the interval between two successive transmissions. The main contribution here is to propose communication-efficient block distributed LMS algorithms (of both incremental and diffusion type). We analyze the performance of the proposed algorithms and compare them with the existing distributed LMS algorithms.

The remainder of the paper is organized as follows. In Section 2, we present the BDLMS algorithm and its global network model. The performance analysis of BDLMS and its learning characteristics obtained from a simulation study are presented in Section 3. The performance analysis of BILMS and its simulation results are presented in Section 4. The performance of the proposed algorithms in terms of communication cost and latency is compared with that of the conventional distributed adaptive algorithms in Section 5. Finally, Section 6 presents the conclusions of the paper.

2. Block Adaptive Distributed Solution

Consider a sensor network with $N$ sensor nodes randomly distributed over the region of interest. The topology of the sensor network is modeled by an undirected graph. Let $\mathcal{G}$ be an undirected graph defined by a set of nodes $\mathcal{V}$ and a set of edges $\mathcal{E}$. Nodes $i$ and $j$ are called neighbors if they are connected by an edge, that is, if $(i,j) \in \mathcal{E}$. We also consider a loop, which consists of a set of nodes $i_1, i_2, \ldots, i_N$ such that node $i_k$ is $i_{k+1}$'s neighbor, $k = 1, 2, \ldots, N-1$, and $i_1$ is $i_N$'s neighbor. Every node $i \in \mathcal{V}$ in the network is associated with a noisy output $d_i$ corresponding to the input data vector $\mathbf{u}_i$. We assume that the noise is independent of both the input and output data; the observations are therefore spatially and temporally independent. The neighborhood of node $i$ is defined as the set of nodes connected to node $i$, that is, $\mathcal{N}_i = \{\, j : (i,j) \in \mathcal{E} \,\}$ [15].

Now, the objective is to estimate an $M \times 1$ unknown vector $\mathbf{w}$ from the measurements of the $N$ nodes. In order to estimate this, every node is modeled as a block adaptive linear filter, where each node updates its weights using the set of errors observed in the estimated output vector and broadcasts the result to its neighbors. The estimated weight vector of the $k$th node at time $n$ is denoted by $\mathbf{w}_k(n)$. Let $u_k(n)$ be the input data of the $k$th node at time instant $n$; then the input vector to the filter at time instant $n$ is
$$\mathbf{u}_k(n) = \left[u_k(n), u_k(n-1), \ldots, u_k(n-M+1)\right]^T. \tag{1}$$
The corresponding desired output of the node for the input vector $\mathbf{u}_k(n)$ is modeled as [16, 17]
$$d_k(n) = \mathbf{u}_k^T(n)\,\mathbf{w} + \upsilon_k(n), \tag{2}$$
where $\upsilon_k(n)$ denotes temporally and spatially uncorrelated white noise with variance $\sigma^2_{\upsilon,k}$.

The block index $j$ is related to the time index $n$ as
$$n = jL + i, \quad i = 0, 1, \ldots, L-1, \quad j = 1, 2, \ldots, \tag{3}$$
where $L$ is the block length. The $j$th block thus contains the time indices $n = jL, jL+1, \ldots, jL+L-1$. The input vectors of the $k$th node for block $j$ are combined to form the matrix
$$\mathbf{X}_k^j = \left[\mathbf{u}_k(jL), \mathbf{u}_k(jL+1), \ldots, \mathbf{u}_k(jL+L-1)\right]^T, \tag{4}$$
and the corresponding desired response of the $k$th node at block index $j$ is represented as
$$\mathbf{d}_k^j = \left[d_k(jL), d_k(jL+1), \ldots, d_k(jL+L-1)\right]^T. \tag{5}$$
Let $\mathbf{e}_k^j$ denote the $L \times 1$ error signal vector for the $j$th block of the $k$th node, defined as
$$\mathbf{e}_k^j = \mathbf{d}_k^j - \mathbf{X}_k^j \mathbf{w}_k^j, \tag{6}$$
where $\mathbf{w}_k^j$ is the $M \times 1$ estimated weight vector of the filter at the $k$th node when the $j$th block of data is the input.
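For concreteness, the blocking in (3)-(6) might be sketched in a few lines of Python; the helper names (`regressor`, `make_block`) and the zero pre-windowing of samples before time 0 are our illustrative choices, not part of the formulation above:

```python
import numpy as np

def regressor(u, n, M):
    """u_k(n) = [u(n), u(n-1), ..., u(n-M+1)]^T, cf. (1);
    samples before time 0 are taken as zero (pre-windowing assumed)."""
    return np.array([u[n - m] if n - m >= 0 else 0.0 for m in range(M)])

def make_block(u, d, j, L, M):
    """Form X_k^j (L x M) and d_k^j (length L) for block j, cf. (3)-(5)."""
    X = np.vstack([regressor(u, j * L + q, M) for q in range(L)])
    return X, d[j * L : j * L + L]

# block error vector of (6) for a current weight estimate w:
#   e = d_blk - X_blk @ w
```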

The regression input data and the corresponding desired responses are distributed across all the nodes and are collected in two global matrices:
$$\mathbf{X}_{bg}^j = \operatorname{col}\left\{\mathbf{X}_1^j, \mathbf{X}_2^j, \ldots, \mathbf{X}_N^j\right\}, \qquad \mathbf{d}_{bg}^j = \operatorname{col}\left\{\mathbf{d}_1^j, \mathbf{d}_2^j, \ldots, \mathbf{d}_N^j\right\}. \tag{7}$$
The objective is to estimate the $M \times 1$ vector $\mathbf{w}$ from these quantities, which collect the data across the $N$ nodes. Using this global data, the block error vector for the whole network is
$$\mathbf{e}_{bg}^j = \mathbf{d}_{bg}^j - \mathbf{X}_{bg}^j \mathbf{w}. \tag{8}$$
Now, the vector $\mathbf{w}$ can be estimated by minimizing the MSE cost
$$\min_{\mathbf{w}} \; E\left\|\mathbf{d}_{bg} - \mathbf{X}_{bg}\mathbf{w}\right\|^2. \tag{9}$$
The time index is dropped here for notational simplicity. Since the data are collected across the network in block format, the block mean square error (BMSE) is to be minimized. The BMSE is given by [17, 18]
$$\text{BMSE} = \frac{1}{L}\left(E\left[\mathbf{d}_{bg}^T\mathbf{d}_{bg}\right] - E\left[\mathbf{d}_{bg}^T\mathbf{X}_{bg}\right]\mathbf{w} - \mathbf{w}^T E\left[\mathbf{X}_{bg}^T\mathbf{d}_{bg}\right] + \mathbf{w}^T E\left[\mathbf{X}_{bg}^T\mathbf{X}_{bg}\right]\mathbf{w}\right). \tag{10}$$
Let the input regression data $\mathbf{u}$ be Gaussian with correlation function $r(l) = \sigma^2\alpha^{|l|}$ in the covariance matrix, where $\alpha$ is the correlation index and $\sigma^2$ is the variance of the input regression data. The relations between the correlation and cross-correlation quantities of the blocked and unblocked data can then be written as [10]
$$\mathbf{R}_{X_{bg}} = L\mathbf{R}_{gU}, \qquad \mathbf{R}_{bg\,dX} = L\mathbf{R}_{g\,du}, \qquad \mathbf{R}_{d_{bg}} = L\mathbf{R}_{gd}, \tag{11}$$
where $\mathbf{R}_{X_{bg}} = E[\mathbf{X}_{bg}^T\mathbf{X}_{bg}]$, $\mathbf{R}_{bg\,dX} = E[\mathbf{X}_{bg}^T\mathbf{d}_{bg}]$, and $\mathbf{R}_{d_{bg}} = E[\mathbf{d}_{bg}^T\mathbf{d}_{bg}]$ are the autocorrelation and cross-correlation matrices of the global data in blocked form. Similarly, the correlation matrices for the unblocked data are defined as $\mathbf{R}_{gU} = E[\mathbf{U}_g^T\mathbf{U}_g]$, $\mathbf{R}_{g\,du} = E[\mathbf{U}_g^T\mathbf{d}_g]$, and $\mathbf{R}_{gd} = E[\mathbf{d}_g^T\mathbf{d}_g]$, where the global data across the network are represented as $\mathbf{U}_g = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_N]^T$ and $\mathbf{d}_g = [\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_N]^T$. These relations are also valid for the data of individual nodes.

Now, the block mean square error in (10) reduces to
$$\text{BMSE} = \frac{1}{L}\left(L\mathbf{R}_{gd} - L\mathbf{R}_{g\,du}^T\mathbf{w} - L\mathbf{w}^T\mathbf{R}_{g\,du} + L\mathbf{w}^T\mathbf{R}_{gU}\mathbf{w}\right) = \mathbf{R}_{gd} - \mathbf{R}_{g\,du}^T\mathbf{w} - \mathbf{w}^T\mathbf{R}_{g\,du} + \mathbf{w}^T\mathbf{R}_{gU}\mathbf{w} = \text{MSE}. \tag{12}$$
Comparing (12) with the MSE of the conventional LMS algorithm for global data [17, 19], it can be concluded that the MSE is the same in both cases. Hence, the block LMS algorithm has properties similar to those of the conventional LMS algorithm. Now, (9) for blocked data can be reduced to a form similar to that for unblocked data:
$$\min_{\mathbf{w}} \; E\left\|\mathbf{d}_g - \mathbf{U}_g\mathbf{w}\right\|^2. \tag{13}$$
The basic difference between blocked and unblocked LMS lies in the estimation of the gradient vector used in their respective implementations. The block LMS algorithm uses a more accurately estimated gradient because of the time averaging, and the accuracy increases with the block size. Taking into account these advantages of block LMS over conventional LMS, the distributed block LMS algorithms are proposed here.
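The time-averaging argument can be made concrete: the block gradient estimate $\mathbf{X}^T(\mathbf{d} - \mathbf{X}\mathbf{w})/L$ is, by linearity, exactly the average of the $L$ instantaneous LMS gradient estimates evaluated at a weight vector held fixed over the block. A minimal numerical check with synthetic data (all values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 3, 3
w_est = rng.standard_normal(M)              # weight estimate frozen over the block
X = rng.standard_normal((L, M))             # rows are the regressors u(jL+q)^T
d = X @ (np.ones(M) / M) + 1e-3 * rng.standard_normal(L)

g_block = X.T @ (d - X @ w_est) / L         # block gradient estimate

# average of the L instantaneous LMS gradient estimates u(n) e(n)
g_avg = np.mean([X[q] * (d[q] - X[q] @ w_est) for q in range(L)], axis=0)

assert np.allclose(g_block, g_avg)          # identical, by linearity
```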

2.1. Adaptive Block Distributed Algorithms

In an adaptive block LMS algorithm, each node $k$ in the network receives the estimates of its neighboring nodes after each block of input data in order to adapt to local changes in the environment. Two different types of distributed LMS for WSNs have been reported in the literature, namely, incremental and diffusion LMS [6, 19]. These algorithms use the conventional LMS algorithm for the local learning process, which in turn needs large communication resources. In order to achieve the same performance with fewer communication resources, the block distributed LMS algorithms are proposed here.

2.1.1. The Block Incremental LMS (BILMS) Algorithm

In the incremental mode of cooperation, information flows in a sequential manner from one node to the adjacent one in the network after processing one sample of data [4]. The communication involved in the incremental mode of cooperation can be reduced if each node needs to communicate only after processing a block of data. For any block of data $j$, it is assumed that node $k$ has access to the estimate $\mathbf{w}_{k-1}^j$ of its predecessor node, as defined by the network topology. Based on these assumptions, the proposed block incremental LMS algorithm can be stated by reducing the conventional incremental LMS algorithm ((16) in [19]) to blocked-data form as follows:
$$\mathbf{w}_0^j = \mathbf{w}^{j-1},$$
$$\mathbf{w}_k^j = \mathbf{w}_{k-1}^j + \frac{\mu_k}{L}\sum_{q=0}^{L-1}\mathbf{u}_k(jL+q)\left[d_k(jL+q) - \mathbf{u}_k^T(jL+q)\,\mathbf{w}_{k-1}^j\right] = \mathbf{w}_{k-1}^j + \frac{\mu_k}{L}\mathbf{X}_k^{jT}\left(\mathbf{d}_k^j - \mathbf{X}_k^j\mathbf{w}_{k-1}^j\right), \quad k = 1, 2, \ldots, N,$$
$$\mathbf{w}^j = \mathbf{w}_N^j, \tag{14}$$
where $\mu_k$ is the local step size and $L$ is the block size.
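A minimal simulation sketch of one cycle of (14), assuming the blocks $\mathbf{X}_k^j$ and $\mathbf{d}_k^j$ have already been formed (for instance as in the sketch after (6)); the function name and data layout are illustrative:

```python
import numpy as np

def bilms_cycle(w_prev, X_blocks, d_blocks, mu, L):
    """One incremental cycle around the ring for block j, cf. (14).
    X_blocks[k]: L x M matrix X_k^j; d_blocks[k]: length-L vector d_k^j."""
    w = w_prev.copy()                        # w_0^j = w^{j-1}
    for k in range(len(X_blocks)):           # visit nodes 1..N in ring order
        Xk, dk = X_blocks[k], d_blocks[k]
        w = w + (mu[k] / L) * (Xk.T @ (dk - Xk @ w))
    return w                                 # w^j = w_N^j, start of next cycle
```

Only the single vector `w` circulates around the ring, once per block, which is where the factor-of-$L$ saving in messages comes from.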

2.1.2. The Block Diffusion LMS (BDLMS) Algorithm

Here, each node $k$ updates its estimate by a simple local rule based on the average of its own estimate and the information received from its neighborhood $\mathcal{N}_k$. In this case, for every $j$th block of data, the $k$th node has access to a set of estimates from its neighbors $\mathcal{N}_k$. Similar to block incremental LMS, the proposed block diffusion strategy, for a set of local combiners $c_{kl}$ and local step size $\mu_k$, can be described as a reduced form of the conventional diffusion LMS algorithm [6, 20]:
$$\boldsymbol{\theta}_k^{j-1} = \sum_{l \in \mathcal{N}_k} c_{kl}\,\mathbf{w}_l^{j-1}, \qquad \boldsymbol{\theta}_k^{(-1)} = \mathbf{0},$$
$$\mathbf{w}_k^j = \boldsymbol{\theta}_k^{j-1} + \frac{\mu_k}{L}\sum_{q=0}^{L-1}\mathbf{u}_k(jL+q)\left[d_k(jL+q) - \mathbf{u}_k^T(jL+q)\,\boldsymbol{\theta}_k^{j-1}\right]. \tag{15}$$
The weight update equation can be rewritten in a more compact form using the data in the block format of (4) and (5) as
$$\mathbf{w}_k^j = \boldsymbol{\theta}_k^{j-1} + \frac{\mu_k}{L}\mathbf{X}_k^{jT}\left(\mathbf{d}_k^j - \mathbf{X}_k^j\boldsymbol{\theta}_k^{j-1}\right). \tag{16}$$
Comparing (16) with (19) in [21] shows that the weight update equation has been recast in block format.
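Correspondingly, a sketch of (15)-(16), assuming the combiners $c_{kl}$ are stored in a row-stochastic $N \times N$ matrix `C` whose zero pattern encodes the neighborhoods (again an illustrative layout, not the authors' implementation):

```python
import numpy as np

def bdlms_step(W_prev, X_blocks, d_blocks, C, mu, L):
    """One diffusion update for block j, cf. (15)-(16).
    W_prev: N x M array of w_k^{j-1}; C[k, l] = c_kl (zero unless l in N_k)."""
    Theta = C @ W_prev                       # theta_k^{j-1} = sum_l c_kl w_l^{j-1}
    W = np.empty_like(W_prev)
    for k in range(W_prev.shape[0]):
        Xk, dk = X_blocks[k], d_blocks[k]
        W[k] = Theta[k] + (mu[k] / L) * (Xk.T @ (dk - Xk @ Theta[k]))
    return W
```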

3. Performance Analysis of BDLMS Algorithm

The performance of an adaptive filter is evaluated in terms of its transient and steady-state behaviors, which, respectively, indicate how fast and how well the filter learns. Such performance analysis is usually challenging in an interconnected network because each node $k$ is influenced by local data with local statistics $\{\mathbf{R}_{dx,k}, \mathbf{R}_{X,k}\}$, by its neighboring nodes through local diffusion, and by local noise with variance $\sigma^2_{\upsilon,k}$. In the case of a block distributed system, the analysis becomes even more challenging, as it has to handle data in block form. The key performance metrics used in the analysis are the mean square deviation (MSD), the excess mean square error (EMSE), and the MSE, defined locally and globally as
$$\eta_k^j = E\left\|\tilde{\mathbf{w}}_k^{j-1}\right\|^2 \;(\text{MSD}), \qquad \zeta_k^j = E\left|e_{a,k}^j\right|^2 \;(\text{EMSE}), \qquad \xi_k^j = E\left|e_k^j\right|^2 = \zeta_k^j + \sigma^2_{\upsilon,k} \;(\text{MSE}), \tag{17}$$
with the local error signals, namely the weight error vector and the a priori error at the $k$th node for the $j$th block, given by
$$\tilde{\mathbf{w}}_k^{j-1} = \mathbf{w} - \mathbf{w}_k^{j-1}, \qquad e_{a,k}^j = \mathbf{u}_k^j\,\tilde{\mathbf{w}}_k^{j-1}. \tag{18}$$
The algorithm described in (15) looks like an interconnection of block adaptive filters, instead of conventional LMS adaptive filters, among all the nodes of the network. Since (12) shows that the block LMS algorithm has properties similar to those of the conventional LMS algorithm, the convergence analysis of the proposed block diffusion LMS algorithm can be carried out along the lines of the diffusion LMS analysis in [18, 21].

The estimated weight vectors for the $j$th block across the network are collected as
$$\mathbf{w}^j = \operatorname{col}\left\{\mathbf{w}_1^j, \ldots, \mathbf{w}_N^j\right\}. \tag{19}$$
Let $C$ be the $N \times N$ Metropolis combination matrix with entries $[c_{kl}]$; then the global combiner matrix $G$ is defined as $G = C \otimes I_M$. The diffused global vector for the $j$th block is defined as
$$\boldsymbol{\theta}^j = G\,\mathbf{w}^j. \tag{20}$$
The global input data matrix for the $j$th block is defined as
$$\mathbf{X}^j = \operatorname{diag}\left\{\mathbf{X}_1^j, \ldots, \mathbf{X}_N^j\right\}. \tag{21}$$
The desired block response at each node $k$ is assumed to obey the traditional data model used in the literature [16-18], that is,
$$\mathbf{d}_k^j = \mathbf{X}_k^j\mathbf{w} + \mathbf{v}_k^j, \tag{22}$$
where $\mathbf{v}_k^j$ is a background noise vector of length $L$. The noise is assumed to be spatially and temporally independent with variance $\sigma^2_{\upsilon,k}$. Using the blocked desired response of a single node in (22), the global response for the $j$th block can be modeled as
$$\mathbf{d}_{bg}^j = \mathbf{X}_{bg}^j\mathbf{w}_g + \mathbf{v}^j, \tag{23}$$
where $\mathbf{w}_g = \operatorname{col}\{\mathbf{w}, \ldots, \mathbf{w}\}$ is the optimum global weight vector, repeated for every node, and
$$\mathbf{v}^j = \operatorname{col}\left\{\mathbf{v}_1^j, \ldots, \mathbf{v}_N^j\right\} \quad (LN \times 1) \tag{24}$$
is the additive Gaussian noise for block index $j$.
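The combiner matrix of (20) can be built, for example, with the Metropolis rule and a Kronecker product; the adjacency lists below are a hypothetical four-node topology, not the eight-node network of the simulations:

```python
import numpy as np

def metropolis_weights(adj):
    """Metropolis rule: c_kl = 1 / max(deg_k, deg_l) for l in N_k, l != k,
    and c_kk = 1 - sum of the off-diagonal row entries (row-stochastic C)."""
    N = len(adj)
    C = np.zeros((N, N))
    deg = [len(nbrs) for nbrs in adj]
    for k in range(N):
        for l in adj[k]:
            C[k, l] = 1.0 / max(deg[k], deg[l])
        C[k, k] = 1.0 - C[k].sum()
    return C

adj = [[1, 2], [0, 2], [0, 1, 3], [2]]     # hypothetical adjacency lists
C = metropolis_weights(adj)
M = 3
G = np.kron(C, np.eye(M))                  # G = C (x) I_M, cf. (20)
```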

Using the relations defined above, the block diffusion strategy in (15) can be written in global form as
$$\mathbf{w}^j = \boldsymbol{\theta}^{j-1} + \frac{1}{L}\mathbf{S}\mathbf{X}^{jT}\left(\mathbf{d}_{bg}^j - \mathbf{X}^j\boldsymbol{\theta}^{j-1}\right), \tag{25}$$
where the step sizes of all the nodes are embedded in the matrix
$$\mathbf{S} = \operatorname{diag}\left\{\mu_1 I_M, \mu_2 I_M, \ldots, \mu_N I_M\right\} \quad (NM \times NM). \tag{26}$$
Using (20), this can be written as
$$\mathbf{w}^j = G\,\mathbf{w}^{j-1} + \frac{1}{L}\mathbf{S}\mathbf{X}^{jT}\left(\mathbf{d}_{bg}^j - \mathbf{X}^j G\,\mathbf{w}^{j-1}\right). \tag{27}$$

3.1. Mean Transient Analysis

The mean behavior of the proposed BDLMS algorithm is similar to that of diffusion LMS in [18, 21]. The mean of the weight error vector evolves as
$$E\left[\tilde{\mathbf{w}}^j\right] = \left(I_{NM} - \frac{1}{L}\mathbf{S}\mathbf{R}_X\right) G\,E\left[\tilde{\mathbf{w}}^{j-1}\right], \tag{28}$$
where $\mathbf{R}_X = \operatorname{diag}\{\mathbf{R}_{X,1}, \mathbf{R}_{X,2}, \ldots, \mathbf{R}_{X,N}\}$ is a block diagonal matrix and
$$\mathbf{R}_{X,k} = E\left[\mathbf{X}_k^{jT}\mathbf{X}_k^j\right] = L\,E\left[\mathbf{U}_k^T\mathbf{U}_k\right] = L\mathbf{R}_U. \tag{29}$$
Hence, (28) can be written as
$$E\left[\tilde{\mathbf{w}}^j\right] = \left(I_{NM} - \mathbf{S}\mathbf{R}_U\right) G\,E\left[\tilde{\mathbf{w}}^{j-1}\right]. \tag{30}$$
Comparing (30) with the corresponding recursion for diffusion LMS ((35) in [21]), we find that block diffusion LMS and diffusion LMS yield the same characteristic equation for the convergence of the mean; it can therefore be concluded that the block diffusion protocol defined in (15) has the same stabilizing effect on the network as diffusion LMS.
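Recursion (30) converges in the mean when the spectral radius of $(I_{NM} - \mathbf{S}\mathbf{R}_U)G$ is below one, which is easy to check numerically. A self-contained sketch, assuming white unit-variance regressors, a uniform step size, and a hypothetical four-node ring combiner (all values illustrative):

```python
import numpy as np

M = 3
C = np.array([[0.50, 0.25, 0.00, 0.25],    # a row-stochastic combiner for
              [0.25, 0.50, 0.25, 0.00],    # a hypothetical 4-node ring
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
N = C.shape[0]
G = np.kron(C, np.eye(M))                  # global combiner, cf. (20)
S = np.kron(0.1 * np.eye(N), np.eye(M))    # mu_k = 0.1 for all k, cf. (26)
R_U = np.eye(N * M)                        # white, unit-variance regressors assumed

B = (np.eye(N * M) - S @ R_U) @ G          # mean-error transition matrix in (30)
rho = np.max(np.abs(np.linalg.eigvals(B)))
print(f"spectral radius = {rho:.4f}; mean error converges if this is < 1")
```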

3.2. Mean-Square Transient Analysis

The variance estimate is a key quantity in the mean-square transient analysis of any adaptive system. The variance relation for blocked data is similar to that of conventional diffusion LMS:
$$E\left\|\tilde{\mathbf{w}}^j\right\|^2_{\Sigma} = E\left\|\tilde{\mathbf{w}}^{j-1}\right\|^2_{\Sigma'} + \frac{1}{L^2} E\left[\mathbf{v}^{jT}\mathbf{X}^j\mathbf{S}\Sigma\mathbf{S}\mathbf{X}^{jT}\mathbf{v}^j\right], \tag{31}$$
$$\Sigma' = G^T\Sigma G - \frac{1}{L} G^T\Sigma\mathbf{S}E\left[\mathbf{X}^{jT}\mathbf{X}^j\right]G - \frac{1}{L} G^T E\left[\mathbf{X}^{jT}\mathbf{X}^j\right]\mathbf{S}\Sigma G + \frac{1}{L^2} G^T E\left[\mathbf{X}^{jT}\mathbf{X}^j\right]\mathbf{S}\Sigma\mathbf{S}E\left[\mathbf{X}^{jT}\mathbf{X}^j\right]G. \tag{32}$$
Using $E[\mathbf{X}^{jT}\mathbf{X}^j] = L\,E[\mathbf{U}^{jT}\mathbf{U}^j]$ in the definition (32), we obtain
$$\Sigma' = G^T\Sigma G - G^T\Sigma\mathbf{S}E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]G - G^T E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]\mathbf{S}\Sigma G + G^T E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]\mathbf{S}\Sigma\mathbf{S}E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]G, \tag{33}$$
which is similar to (45) in [21]. Using the properties of expectation and trace [18], the second term of (31) is evaluated as
$$\frac{1}{L^2} E\left[\mathbf{v}^{jT}\mathbf{X}^j\mathbf{S}\Sigma\mathbf{S}\mathbf{X}^{jT}\mathbf{v}^j\right] = \frac{1}{L^2}\operatorname{tr}\left\{E\left[\mathbf{X}^j\mathbf{S}\Sigma\mathbf{S}\mathbf{X}^{jT}\right]E\left[\mathbf{v}^j\mathbf{v}^{jT}\right]\right\} = E\left[\mathbf{n}^{jT}\mathbf{U}^j\mathbf{S}\Sigma\mathbf{S}\mathbf{U}^{jT}\mathbf{n}^j\right], \tag{34}$$
where the noise vector $\mathbf{n}^j$ is not in block form, and the noise is assumed to be stationary and Gaussian. Equations (31) and (32) may therefore be written as
$$E\left\|\tilde{\mathbf{w}}^j\right\|^2_{\Sigma} = E\left\|\tilde{\mathbf{w}}^{j-1}\right\|^2_{\Sigma'} + E\left[\mathbf{n}^{jT}\mathbf{U}^j\mathbf{S}\Sigma\mathbf{S}\mathbf{U}^{jT}\mathbf{n}^j\right], \tag{35}$$
$$\Sigma' = G^T\Sigma G - G^T\Sigma\mathbf{S}E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]G - G^T E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]\mathbf{S}\Sigma G + G^T E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]\mathbf{S}\Sigma\mathbf{S}E\left[\mathbf{U}^{jT}\mathbf{U}^j\right]G. \tag{36}$$
It may be noted that the variance relation (36) for the BDLMS algorithm is exactly the same as that of DLMS [21]. In the block LMS algorithm, the local step size is chosen to be $L$ times the local step size of diffusion LMS in order to obtain the same level of performance. As the proposed algorithm and the diffusion LMS algorithm have similar properties, the evolution of their variances is also similar. Therefore, the recursion equations of the global variances for BDLMS are similar to (73) and (74) in [21]. Similarly, the local node performances are similar to (89) and (91) of [21].

3.3. Learning Behavior of BDLMS Algorithm

The learning behavior of the BDLMS algorithm is examined using simulations. The learning (variance) curves are plotted for BDLMS and compared with those of DLMS. Row regressors with shift-invariant input [18] are used, each regressor having the form
$$\mathbf{u}_k(i) = \left[u_k(i), u_k(i-1), \ldots, u_k(i-M+1)\right]^T. \tag{37}$$
In block LMS, the regressors for $L = 3$ and $M = 3$ are given by
$$\mathbf{X}_k(1) = \begin{bmatrix} u_k(1) & 0 & 0 \\ u_k(2) & u_k(1) & 0 \\ u_k(3) & u_k(2) & u_k(1) \end{bmatrix}, \qquad \mathbf{X}_k(2) = \begin{bmatrix} u_k(4) & u_k(3) & u_k(2) \\ u_k(5) & u_k(4) & u_k(3) \\ u_k(6) & u_k(5) & u_k(4) \end{bmatrix}. \tag{38}$$
The desired data are generated according to the model given in the literature [18]. The unknown vector $\mathbf{w}$ is set to $[1, 1, \ldots, 1]^T / M$.

The input sequence $\{u_k(i)\}$ is assumed to be spatially correlated and is generated as
$$u_k(i) = a_k u_k(i-1) + b_k n_k(i), \tag{39}$$
where $a_k \in [0, 1)$ is the correlation index and $n_k(i)$ is a spatially independent white Gaussian process with unit variance, with $b_k = \sqrt{\sigma^2_{u,k}\left(1 - a_k^2\right)}$. The regressor power profile is given by $\sigma^2_{u,k} \in (0, 1]$. The resulting regressors have a Toeplitz covariance with correlation sequence $r_k(i) = \sigma^2_{u,k}(a_k)^{|i|}$, $i = 0, 1, \ldots, M-1$.
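A sketch of the regressor generation in (39); the seed, correlation index, and sample count below are arbitrary choices:

```python
import numpy as np

def correlated_input(a_k, var_u, n_samples, rng):
    """First-order AR model of (39): u(i) = a_k u(i-1) + b_k n(i),
    with b_k = sqrt(var_u (1 - a_k^2)) so that Var[u(i)] -> var_u."""
    b_k = np.sqrt(var_u * (1.0 - a_k ** 2))
    u = np.zeros(n_samples)
    noise = rng.standard_normal(n_samples)
    for i in range(1, n_samples):
        u[i] = a_k * u[i - 1] + b_k * noise[i]
    return u

rng = np.random.default_rng(1)
u1 = correlated_input(a_k=0.5, var_u=0.8, n_samples=5000, rng=rng)
```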

Figure 1 shows an eight-node network topology used in the simulation study. The network settings are given in Figures 2(a) and 2(b).

3.4. The Simulation Conditions

The algorithm is valid for any block length greater than one [10], although $L = M$ is the preferred and optimal choice.

The background noise is assumed to be white Gaussian with variance $\sigma^2_{\upsilon,k} = 10^{-3}$, and the data used in the study are generated using $d_k(n) = \mathbf{u}_k^T(n)\mathbf{w} + \upsilon_k(n)$. In order to generate the performance curves, 50 independent experiments are performed and averaged. The steady-state results are obtained by averaging the last 50 samples of the corresponding learning curves. The global MSD curve is shown in Figure 3; it is obtained by averaging $E\|\tilde{\mathbf{w}}_k^{j-1}\|^2$ across all the nodes over 100 experiments. Similarly, the global EMSE curve, obtained by averaging $E\|\mathbf{e}_{a,k}^j\|^2$, where $\mathbf{e}_{a,k}^j = \mathbf{X}_k^j\tilde{\mathbf{w}}_k^{j-1}$, across all the nodes over 100 experiments, is displayed in Figure 4. The global MSE is depicted in Figure 5. In both cases, the MSE curves match exactly.

Since the weights are updated and then communicated for local diffusion after every $L$ data samples, the number of communications between neighbors is reduced by a factor of $L$ compared with diffusion LMS, where the weights are updated and communicated after each sample of data.

The global performance is the aggregate contribution of all the individual nodes and is obtained by taking the mean performance over all the nodes. Simulation results are also provided for comparison with those obtained by diffusion LMS at individual nodes. The local MSD evolution at node 1 is given in Figure 6(a) and at node 5 in Figure 6(b). Similarly, the local EMSE evolution at nodes 1 and 7 is depicted in Figure 7. The convergence speed is nearly the same for both the MSD and EMSE evolutions, but the performance is slightly degraded in the case of BDLMS. This loss of performance could be traded for the huge reduction in communication bandwidth.

4. Performance Analysis of BILMS Algorithm

To show that the BILMS algorithm has guaranteed convergence, we follow the steady-state performance analysis of the algorithm using the same data model as is commonly used for the conventional sequential adaptive algorithms [5, 22, 23].

The weight-energy relation is derived using the definitions of the weighted a priori and a posteriori errors [18]:
$$\left\|\tilde{\mathbf{w}}_k^j\right\|^2_{\Sigma} + \frac{\left|e_{a,k}^{j\Sigma}\right|^2}{\left\|\mathbf{X}_k^j\right\|^2_{\Sigma}} = \left\|\tilde{\mathbf{w}}_k^{j-1}\right\|^2_{\Sigma} + \frac{\left|e_{p,k}^{j\Sigma}\right|^2}{\left\|\mathbf{X}_k^j\right\|^2_{\Sigma}}. \tag{40}$$
Equation (40) is similar to (35) in [19]; thus, the performance of BILMS is similar to that of ILMS. The variance relation is obtained from the energy relation (40) by replacing the a posteriori error with its equivalent expression and then averaging both sides:
$$E\left\|\tilde{\mathbf{w}}_k^j\right\|^2_{\Sigma} = E\left\|\tilde{\mathbf{w}}_k^{j-1}\right\|^2_{\Sigma'} + \left|\frac{\mu_k}{L}\right|^2 E\left[\mathbf{v}_k^{jT}\mathbf{X}_k^j\Sigma\mathbf{X}_k^{jT}\mathbf{v}_k^j\right],$$
$$\Sigma' = \Sigma - \frac{\mu_k}{L}\left(\Sigma\mathbf{X}_k^{jT}\mathbf{X}_k^j + \mathbf{X}_k^{jT}\mathbf{X}_k^j\Sigma\right) + \left|\frac{\mu_k}{L}\right|^2 \mathbf{X}_k^{jT}\mathbf{X}_k^j\Sigma\mathbf{X}_k^{jT}\mathbf{X}_k^j. \tag{41}$$
The variance relation in (41) is similar to that of ILMS in [19]. The performance of ILMS has been studied in detail in the literature. It is observed that the theoretical performances of the block incremental LMS and conventional incremental LMS algorithms are similar because both have the same variance expressions. The simulation results validate this analysis.

4.1. Simulation Results of BILMS Algorithm

For the simulation study of BILMS, we use shift-invariant regressors with the same desired data as in the case of the BDLMS algorithm. Time-correlated sequences are generated at every node according to the network statistics. The same network as defined for the block diffusion case in Section 3.3 is chosen here for the simulation study. In the incremental mode of cooperation, each node receives information from its previous node, updates the estimate using its own data, and sends the updated estimate to the next node. The ring topology used here is shown in Figure 8. We assume the background noise to be temporally and spatially uncorrelated additive white Gaussian noise with variance $10^{-3}$. The learning curves are obtained by averaging the performance of 100 independent experiments, each generated from 5,000 samples in the network. It can be observed from the figures that the steady-state performance at the different nodes of the network achieved by BILMS matches very closely that of the ILMS algorithm. The EMSE plots, which are more sensitive to local statistics, are depicted in Figures 9(a) and 9(b); a good match between BILMS and ILMS is observed in these plots. In [19], the authors have already proved the theoretical matching of the steady-state nodal performance with simulation results. As the MSE roughly reflects the noise power, and the plots indicate the good performance of the adaptive network, it may be inferred that the adaptive nodes perform well in the steady state.

The global MSD curve shown in Figure 10 is obtained by averaging $E\|\tilde{\boldsymbol{\psi}}_k(j-1)\|^2$ across all the nodes over 50 experiments. Similarly, the global EMSE and MSE plots are displayed in Figures 11 and 12, respectively. These are obtained by averaging $E\|\mathbf{e}_{a,k}(j)\|^2$, where $\mathbf{e}_{a,k}(j) = \mathbf{x}_{k,j}\tilde{\boldsymbol{\psi}}_k(j-1)$, across all the nodes over 50 experiments.

Since the weights are updated after every $L$ data samples and only then communicated to the next node, the number of communications between neighbors is reduced by a factor of $L$ compared with ILMS, where the weights are updated and communicated after processing each sample of data. Therefore, similar to BDLMS, the communication overhead in BILMS is reduced by a factor of $L$ relative to the ILMS algorithm.

The performance comparison between the two proposed algorithms, BDLMS and BILMS, for the same network is shown in Figures 13-15. One can observe from Figure 13 that the MSE of the BILMS algorithm converges faster than that of BDLMS. Since the same noise model is used for both algorithms, the steady-state performances after convergence are the same for both. In the MSD and EMSE performances in Figures 14 and 15, however, a small difference is observed, which is due to the different cooperation schemes used by the two algorithms. The diffusion cooperation scheme is more adaptive to environmental change than the incremental scheme, but BDLMS requires a higher communication overhead than the BILMS algorithm.

5. Performance Comparison

In this section, we present an analysis of communication cost and latency to provide a theoretical comparison of the distributed LMS algorithms with their block distributed counterparts.

5.1. Analysis of Communication Cost

Assuming that the messages are of fixed bit width, the communication cost is modeled as the number of messages transmitted to reach the steady-state value in the network. Let $N$ be the number of nodes in the network and $M$ the filter length. The block length $L$ is chosen to be the same as the filter length. Let $\tau$ be the average time required for the transmission of one message, that is, for one communication between nodes [24-26].

5.1.1. ILMS and BILMS Algorithms

In the incremental mode of cooperation, every node sends its estimated weight vector to its adjacent node in a unidirectional cyclic manner. Since at any instant of time only one node is active and allowed to transmit, to only one designated node, the number of messages transmitted in one complete cycle is $N - 1$. Let $K$ be the number of cycles required to attain the steady-state value in the network. The total number of communications required for the network to converge to steady state is therefore
$$C_{\text{ILMS}} = (N-1)K. \tag{42}$$
In the case of BILMS also, at any instant of time only one node in the network is active and allowed to transmit to one designated follower node, as in ILMS. However, in BILMS each node sends its estimated weight vector to its follower node only once every $L$ sample periods, after processing a block of $L$ data samples. The number of messages sent by a node is thus reduced to $K/L$, and accordingly the total communication cost is
$$C_{\text{BILMS}} = \frac{(N-1)K}{L}. \tag{43}$$

5.1.2. DLMS and BDLMS Algorithms

The diffusion-based algorithms are communication intensive. In the DLMS mode of cooperation, in each cycle every node in the network sends its estimated information to all the nodes connected to it. The total number of messages transmitted by all the nodes in a cycle is therefore
$$c = \sum_{i=1}^{N} n_i, \tag{44}$$
where $n_i$ is the number of nodes connected to the $i$th node, and the total communication cost to attain convergence is
$$C_{\text{DLMS}} = cK. \tag{45}$$
In the proposed block diffusion strategy, the number of connected nodes $n_i$ and the size of the messages remain the same as in DLMS, but each node distributes its message only after every $L$ data samples. The communication is therefore reduced by a factor equal to the block length, and the total communication cost in this case is
$$C_{\text{BDLMS}} = \frac{cK}{L}. \tag{46}$$
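Expressions (42)-(46) reduce to a few lines of code; the neighbor counts and cycle counts in the example call below are placeholders, not values from the paper's network:

```python
def comm_costs(n_connected, K_inc, K_diff, L):
    """Total message counts to reach steady state, cf. (42)-(46).
    n_connected[i] = number of neighbors of node i; K_* = cycles to converge."""
    N = len(n_connected)
    c = sum(n_connected)                     # messages per diffusion cycle, (44)
    return {
        "ILMS":  (N - 1) * K_inc,            # (42)
        "BILMS": (N - 1) * K_inc / L,        # (43)
        "DLMS":  c * K_diff,                 # (45)
        "BDLMS": c * K_diff / L,             # (46)
    }

print(comm_costs(n_connected=[2, 2, 3, 1], K_inc=50, K_diff=250, L=10))
```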

5.2. Analysis of Duration for Convergence

The time interval between the arrival of input at a node and the reception of the corresponding update by the designated node(s) may be assumed to comprise two major components: the processing delay, required to perform the computations in a node to obtain the updated estimate, and the communication delay, involved in transferring the message to the receiving node(s). The processing delay depends heavily on the hardware architecture of the nodes, which could vary widely. Without losing much generality, however, we can assume that each node has $M$ parallel multipliers and one full adder to implement the LMS algorithm. Let $T_M$ and $T_A$ be the times required to execute a multiplication and an addition, respectively. The processing delay needed for a single LMS update is then
$$D = 2T_M + (M+1)T_A. \tag{47}$$
The communication delay is mostly due to the implementation of the protocols for transmission and reception, which remains almost the same for different nodes. The location of the nodes does not contribute significantly to the delay unless the destination node is far away and a relay node is required for the message to reach the destination. Against this backdrop, we can assume that the same average delay $\tau$ is required to transfer each message for all receiver-transmitter pairs in the network.

5.2.1. Estimation of Delays for the ILMS and BILMS Algorithms

In the case of ILMS, the duration of one updating cycle over all the nodes is
$$ND + (N-1)\tau, \tag{48}$$
and the total duration for convergence of the network is
$$L_{\text{ILMS}} = \left[ND + (N-1)\tau\right]K. \tag{49}$$
If the same hardware as that of ILMS is used for the implementation of BILMS, the delay for processing one block of data is $2MT_M + M(M+1)T_A = MD$. The duration of one update cycle of the block incremental LMS is then $N\{2MT_M + M(M+1)T_A\} + (N-1)\tau$, and the duration for convergence of this algorithm is
$$L_{\text{BILMS}} = \left[NMD + (N-1)\tau\right]\frac{K}{L}. \tag{50}$$
For $L = M$, this expression reduces to
$$L_{\text{BILMS}} = \left[ND + \frac{(N-1)\tau}{L}\right]K. \tag{51}$$
Comparing (51) with (49), we find that in BILMS the processing delay remains the same as in ILMS, while the communication overhead is reduced by a factor of $L$.

5.2.2. Estimation of Delays for the DLMS and BDLMS Algorithms

As with ILMS, it is assumed here that the update of a node reaches all the connected nodes after the same average delay $\tau$. The communication delay therefore remains the same as in ILMS, but in this case more processing delay is needed to combine the estimates received from the connected neighboring nodes. This additional combination delay per cycle can be given by $cT_A + NT_M$, where $c$ is the total number of messages transferred in a cycle, given by (44). The total duration for convergence of diffusion LMS, with the same hardware constraints, is then
$$L_{\text{DLMS}} = \left[cT_A + NT_M + ND + (N-1)\tau\right]K. \tag{52}$$
In the case of BDLMS, the number of update cycles required is reduced by a factor of $L$, which can be expressed as
$$L_{\text{BDLMS}} = \left[cT_A + NT_M + NMD + (N-1)\tau\right]\frac{K}{L}. \tag{53}$$
The mathematical expressions for the communication cost and latency of the distributed LMS and block distributed LMS algorithms are summarized in Table 1. A numerical example is given in Table 2 to show the advantage of the block distributed algorithms over the sequential distributed algorithms. The authors have simulated the hardware for 8-bit multiplication and addition in TSMC 90 nm technology; the addition and multiplication times are found to be $T_A = 105$ ns and $T_M = 103$ ns. We assume the transmission delay $\tau = 10^{-2}$ s. From the convergence curves obtained in the simulation studies, the network attains steady state after 250 input samples in the DLMS case and 50 input samples in the ILMS case. The filter length $M$ and the block size $L$ are both taken to be 10 in the numerical study.
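For a Table-2-style comparison, (47)-(53) can likewise be evaluated directly; the constants in the example call echo the numerical study above, except for the per-cycle message count `c`, which is a placeholder standing in for the actual topology:

```python
def latencies(N, M, L, c, K_inc, K_diff, T_M, T_A, tau):
    """Convergence durations in seconds, cf. (47)-(53); tau = per-message delay."""
    D = 2 * T_M + (M + 1) * T_A              # per-sample LMS processing delay, (47)
    return {
        "ILMS":  (N * D + (N - 1) * tau) * K_inc,                           # (49)
        "BILMS": (N * M * D + (N - 1) * tau) * K_inc / L,                   # (50)
        "DLMS":  (c * T_A + N * T_M + N * D + (N - 1) * tau) * K_diff,      # (52)
        "BDLMS": (c * T_A + N * T_M + N * M * D + (N - 1) * tau) * K_diff / L,  # (53)
    }

print(latencies(N=8, M=10, L=10, c=20, K_inc=50, K_diff=250,
                T_M=103e-9, T_A=105e-9, tau=1e-2))
```

As the printed dictionary makes plain, the $(N-1)\tau$ communication term dominates all four totals under these constants, and dividing it by $L$ in the block variants is what shortens the convergence duration.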

6. Conclusion

We have proposed block implementations of the distributed LMS algorithms for WSNs. The theoretical analysis and the corresponding simulation results demonstrate that the performance of the block distributed LMS algorithms is similar to that of the sequential distributed LMS algorithms. The remarkable feature of the proposed algorithms is that a node requires $L$ (the block size) times fewer communications than in the conventional sequential distributed LMS algorithms. This is of great advantage in reducing the communication bandwidth and the power consumed in the transmission and reception of messages across the resource-constrained nodes of a WSN. In the coming years, with continuing advances in microelectronics, enough computing resources can be accommodated in the nodes to reduce the processing delays, but the communication bandwidth and communication delay could remain the major operational bottlenecks in WSNs. The proposed block formulation would therefore have further advantages over its sequential counterpart in the years to come.