Abstract

Location awareness is a key enabling feature and fundamental challenge in present and future wireless networks. Most existing localization methods rely on existing infrastructure and thus lack the flexibility and robustness necessary for large ad hoc networks. In this paper, we build upon SPAWN (sum-product algorithm over a wireless network), which determines node locations through iterative message passing, but does so at a high computational cost. We compare different message representations for SPAWN in terms of performance and complexity and investigate several types of cooperation based on censoring. Our results, based on experimental data with ultra-wideband (UWB) nodes, indicate that parametric message representation combined with simple censoring can give excellent performance at relatively low complexity.

1. Introduction

Location awareness has the potential to revolutionize a diverse array of present and future technologies. Accurate knowledge of a user's location is essential for a wide variety of commercial, military, and social applications, including next-generation cellular services [1, 2], sensor networks [3, 4], search-and-rescue [5, 6], military target tracking [7, 8], health care monitoring [9, 10], robotics [11, 12], data routing [13, 14], and logistics [15, 16]. Typically, only a small fraction of the nodes in the network, known as anchors, have prior knowledge about their location. The remaining nodes, known as agents, must determine their locations through a process of localization or positioning. The ad hoc and often dynamic nature of wireless networks requires distributed and autonomous localization methods. Moreover, location-aware wireless networks are frequently deployed in unknown environments and hence can rely only on minimal (if any) infrastructure, human maintenance, and a priori location information.

Cooperation is an emerging paradigm for localization in which agents take advantage of network connections and interagent measurements to improve their location estimates. Non-Bayesian cooperative localization in wireless sensor networks is discussed in [17]. Different variations of Bayesian cooperation have been considered, including Monte-Carlo sequential estimation [18] and nonparametric belief propagation in static networks [19]. For a comprehensive overview of Bayesian and non-Bayesian cooperative localization in wireless networks, we refer the reader to [20], which also introduces a distributed cooperative algorithm for large-scale mobile networks called SPAWN (sum-product algorithm over a wireless network). This message-passing algorithm achieves improved localization accuracy and coverage compared to other methods and will serve as the basic algorithm in this paper.

The complexity and cost associated with the SPAWN algorithm depend largely on how messages are represented for computation and transmission. As wireless networks typically operate under tight power and resource constraints, the choice of message representation heavily impacts the feasibility and ease of implementation of the algorithm. The method of message representation and ensuing tradeoff between communication cost and localization performance are thus of great practical importance in the deployment of realistic localization systems. Particle-based messages do not necessarily lend themselves well to wireless exchange between devices in practice, due to their high computational complexity and communication overhead [21]. Other message-passing methods have been developed that rely on parametric message representation, thus alleviating these drawbacks but limiting representational flexibility. In particular, in [22], the expectation propagation algorithm is considered with Gaussian messages, while in [23], variational message passing with parametric messages is shown to exhibit low complexity. A variation of SPAWN combining GPS and UWB was evaluated in [24], using a collection of parametric distributions with ellipsoidal, conic, and cylindrical shapes.

This paper addresses the need for accurate, resource-efficient localization with an in-depth comparison of various message representations for SPAWN. We describe and evaluate different parametric and nonparametric message representations in terms of complexity and accuracy. Additionally, we analyze the performance of various cooperative schemes and message representations in a simulated large-scale ultra-wide bandwidth (UWB) network using experimental UWB ranging data. UWB is an attractive choice for ranging and communication due to its ability to resolve multipath [25, 26], penetrate obstacles [27], and provide high resolution distance measurements [28, 29]. Recent research advances in UWB signal acquisition [30, 31], multiuser interference [32, 33], multipath channels [34, 35], non-line-of-sight (NLOS) propagation [28, 29], and time-of-arrival estimation [36] increase the potential for highly accurate UWB-based localization systems in harsh environments. Consequently, significant attention has been paid to both algorithm design [37–42] and fundamental limits of accuracy [43–46] for UWB localization. It is expected that UWB will be exploited in future location-aware systems that utilize coexisting networks of sensors, controllers, and peripheral devices [47, 48].

2. Problem Formulation

We consider a wireless network of $N$ nodes in an environment $\mathcal{E}$. Time is slotted, with nodes moving independently from time slot to time slot. The position of node $i$ at time $t$ is described by the random variable $\mathbf{x}_i^{(t)}$; the vector of all positions is denoted by $\mathbf{x}^{(t)}$. At each time $t$, node $i$ may collect internal position-related measurements $z_{i,\mathrm{self}}^{(t)}$, for example, from an inertial measurement unit. The set of all internal measurements is denoted by $\mathbf{z}_{\mathrm{self}}^{(t)}$. Within the network, nodes communicate with each other via wireless transmissions. We denote the set of nodes from which node $i$ can receive transmissions at time $t$ by $\mathcal{S}_{\to i}^{(t)}$. Note that the communication link may not be bidirectional; that is, $j \in \mathcal{S}_{\to i}^{(t)}$ does not imply $i \in \mathcal{S}_{\to j}^{(t)}$. Using packets received from $j \in \mathcal{S}_{\to i}^{(t)}$, node $i$ may collect a set of relative measurements, represented by the vector $z_{j \to i}^{(t)}$, which we will limit to distance measurements. We denote the set of all relative measurements made in the network at time $t$ by $\mathbf{z}_{\mathrm{rel}}^{(t)}$. The full set of relative and internal measurements is denoted $\mathbf{z}^{(t)}$.

The objective of the localization problem is for each node $i$ to determine the a posteriori distribution $p(\mathbf{x}_i^{(t)} \mid \mathbf{z}^{(1:t)})$ of its position $\mathbf{x}_i^{(t)}$ at each time $t$, given information up to and including $t$.

3. A Brief Introduction to SPAWN

In [20], we proposed a cooperative localization algorithm by factorizing the joint distribution $p(\mathbf{x}^{(0:T)} \mid \mathbf{z}^{(1:T)})$, formulating the problem as a factor graph with temporal and spatial constraints, and applying the sum-product algorithm. This leads to a distributed algorithm, known as SPAWN, presented in Algorithm 1. The aim of SPAWN is to compute a belief $b_i^{(t)}(\mathbf{x}_i^{(t)})$ available to node $i$ at the end of any time slot $t$, which serves as an approximation of the marginal a posteriori distribution $p(\mathbf{x}_i^{(t)} \mid \mathbf{z}^{(1:t)})$. Note that each operation of SPAWN requires only information local to an individual node. Information is shared between nodes via physical transmissions. Each node can therefore perform the computations in Algorithm 1 using its local information and transmissions received from neighboring nodes.

Algorithm 1: SPAWN.
(1) Initialize belief $b^{(0)}(\mathbf{x}_i^{(0)}) = p(\mathbf{x}_i^{(0)})$, $\forall i$
(2) for $t = 1$ to $T$ do {time index}
(3)   for all $i$ do {mobility update}
(4)     Mobility update:
          $$\tilde{b}_i^{(t)}(\mathbf{x}_i^{(t)}) \propto \int p(\mathbf{x}_i^{(t)} \mid \mathbf{x}_i^{(t-1)})\, p(z_{i,\mathrm{self}}^{(t)} \mid \mathbf{x}_i^{(t-1)}, \mathbf{x}_i^{(t)})\, b_i^{(t-1)}(\mathbf{x}_i^{(t-1)})\, d\mathbf{x}_i^{(t-1)} \tag{A-1}$$
(5)   end for
(6)   Initialize $b_i^{(t)}(\mathbf{x}_i^{(t)}) = \tilde{b}_i^{(t)}(\mathbf{x}_i^{(t)})$, $\forall i$
(7)   for $l = 1$ to $N_{\mathrm{it}}$ do {iteration index; begin cooperative update}
(8)     for all $i$ do
(9)       for all $j \in \mathcal{S}_{\to i}^{(t)}$ do
(10)        Receive and convert $b_j^{(t)}(\mathbf{x}_j^{(t)})$ to a distribution $c_{j \to i}^{(l)}(\mathbf{x}_i^{(t)})$:
              $$c_{j \to i}^{(l)}(\mathbf{x}_i^{(t)}) \propto \int p(z_{j \to i}^{(t)} \mid \mathbf{x}_i^{(t)}, \mathbf{x}_j^{(t)})\, b_j^{(t)}(\mathbf{x}_j^{(t)})\, d\mathbf{x}_j^{(t)} \tag{A-2}$$
(11)        Update and broadcast $b_i^{(t)}(\mathbf{x}_i^{(t)})$:
              $$b_i^{(t)}(\mathbf{x}_i^{(t)}) \propto \tilde{b}_i^{(t)}(\mathbf{x}_i^{(t)}) \prod_{k \in \mathcal{S}_{\to i}^{(t)}} c_{k \to i}^{(l)}(\mathbf{x}_i^{(t)}) \tag{A-3}$$
(12)      end for
(13)    end for
(14)  end for {end cooperative update}
(15) end for {end current time step}

Observe that Algorithm 1 contains a number of key steps.
(i) Mobility update (line 4), requiring knowledge of mobility models $p(\mathbf{x}_i^{(t)} \mid \mathbf{x}_i^{(t-1)})$ and self-measurement likelihood functions $p(z_{i,\mathrm{self}}^{(t)} \mid \mathbf{x}_i^{(t-1)}, \mathbf{x}_i^{(t)})$.
(ii) Message conversion (line 10) of position information from neighboring devices to account for relative measurements, requiring knowledge of the neighbors and of relative measurement likelihood functions $p(z_{j \to i}^{(t)} \mid \mathbf{x}_i, \mathbf{x}_j)$.
(iii) Belief update (line 11), to fuse information from the mobility update with information from the current neighbors.
The first two operations can be interpreted as message filtering, while the last operation is a message multiplication. How these operations can be implemented in practice is the topic of Section 4.

4. Message Representation

4.1. Key Operations

In SPAWN, probabilistic information is exchanged and computed through messages. The manner in which these messages are represented for transmission between nodes and internal computation is closely related to the complexity and performance of the localization algorithm. In traditional communications problems, such as decoding, messages can be represented efficiently and exactly through, for instance, log-likelihood ratios [49]. In SPAWN, exact representation is impossible, so we must resort to different types of approximate message representations. Any representation must be able to capture the salient properties of the true message and must enable efficient computation of the key steps in SPAWN, namely, message filtering (A-1)-(A-2) and message multiplication (A-3). We consider three types of message representation: discretized, sample-based, and parametric.

For convenience, we introduce some additional notation. For the filtering operation, the incoming message is denoted by $p_{\mathbf{X}}(\mathbf{x})$, the filtering operation by $h(\mathbf{x}, \mathbf{y})$, and the outgoing message by $p_{\mathbf{Y}}(\mathbf{y})$, with

$$p_{\mathbf{Y}}(\mathbf{y}) \propto \int h(\mathbf{x}, \mathbf{y})\, p_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x}. \tag{1}$$

For the multiplication operation, we assume $M$ incoming messages $p_{\mathbf{X}}^{(i)}(\mathbf{x})$ ($i = 1, \ldots, M$) over a single variable $\mathbf{X}$, and an outgoing message

$$\phi_{\mathbf{X}}(\mathbf{x}) \propto \prod_{i=1}^{M} p_{\mathbf{X}}^{(i)}(\mathbf{x}). \tag{2}$$

Note that (1) maps to (A-1) through the following association:

$$\mathbf{x} \to \mathbf{x}_i^{(t-1)}, \quad \mathbf{y} \to \mathbf{x}_i^{(t)}, \quad h(\mathbf{x}, \mathbf{y}) \to p(\mathbf{x}_i^{(t)} \mid \mathbf{x}_i^{(t-1)})\, p(z_{i,\mathrm{self}}^{(t)} \mid \mathbf{x}_i^{(t-1)}, \mathbf{x}_i^{(t)}), \quad p_{\mathbf{X}}(\mathbf{x}) \to b_i^{(t-1)}(\mathbf{x}_i^{(t-1)}). \tag{3}$$

Similarly, (1) maps to (A-2) through the following association, in which the incoming message is the belief of the neighboring node $j$:

$$\mathbf{x} \to \mathbf{x}_j^{(t)}, \quad \mathbf{y} \to \mathbf{x}_i^{(t)}, \quad h(\mathbf{x}, \mathbf{y}) \to p(z_{j \to i}^{(t)} \mid \mathbf{x}_i^{(t)}, \mathbf{x}_j^{(t)}), \quad p_{\mathbf{X}}(\mathbf{x}) \to b_j^{(t)}(\mathbf{x}_j^{(t)}). \tag{4}$$

4.2. Discretized Message Representation

A naive but simple approach to represent a continuous distribution $p_{\mathbf{X}}(\mathbf{x})$ is to uniformly discretize the domain of $\mathbf{X}$, yielding a set of quantization points $Q = \{\mathbf{x}_1, \ldots, \mathbf{x}_R\}$. The distribution is then approximated as a finite list of values, $\{p_{\mathbf{X}}(\mathbf{x}_k)\}_{k=1}^{R}$. The filtering operation then becomes

$$p_{\mathbf{Y}}(\mathbf{y}_k) \propto \sum_{l=1}^{R} h(\mathbf{x}_l, \mathbf{y}_k)\, p_{\mathbf{X}}(\mathbf{x}_l), \tag{5}$$

requiring $\mathcal{O}(R^2)$ operations. The multiplication becomes

$$\phi_{\mathbf{X}}(\mathbf{x}_k) \propto \prod_{i=1}^{M} p_{\mathbf{X}}^{(i)}(\mathbf{x}_k), \tag{6}$$

requiring $\mathcal{O}(RM)$ operations. Because $R$ scales exponentially with the dimensionality of $\mathbf{X}$ and a large number of points are required in every dimension to capture fine features of the messages, discretization is impractical for SPAWN in UWB localization.
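As a minimal sketch of (5) and (6), the following Python fragment applies both operations on a one-dimensional grid. The grid, the Gaussian incoming message, and the Gaussian ranging likelihood with standard deviation `sigma_z` are illustrative assumptions, not the setup used in the experiments.

```python
import numpy as np

def discretized_filter(h, p_x):
    """Eq. (5): p_Y(y_k) ∝ sum_l h(x_l, y_k) p_X(x_l), with h[l, k] = h(x_l, y_k)."""
    p_y = h.T @ p_x                      # O(R^2) multiply-adds
    return p_y / p_y.sum()

def discretized_multiply(messages):
    """Eq. (6): phi_X(x_k) ∝ prod_i p_X^(i)(x_k); `messages` has shape (M, R)."""
    phi = np.prod(messages, axis=0)      # O(R M) multiplications
    return phi / phi.sum()

# Hypothetical 1-D example with R = 201 grid points on [0, 10] m.
grid = np.linspace(0.0, 10.0, 201)
p_x = np.exp(-0.5 * ((grid - 3.0) / 0.5) ** 2)
p_x /= p_x.sum()

z, sigma_z = 4.0, 0.3                    # assumed range measurement and noise std
dist = np.abs(grid[:, None] - grid[None, :])
h = np.exp(-0.5 * ((z - dist) / sigma_z) ** 2)

p_y = discretized_filter(h, p_x)         # mass concentrates about 4 m from x = 3
phi = discretized_multiply(np.stack([p_x, p_y]))
```

In two dimensions the same code applies after flattening the grid, but the matrix `h` then has $R^2$ entries, which illustrates why discretization quickly becomes impractical.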

4.3. Sample-Based Message Representation

A sample-based message representation, as used in [19, 50], overcomes the drawback of discretization by representing messages as samples, concentrated where the messages have significant mass. Before describing the detailed implementation of the filtering and multiplication operations, we give a brief overview of generic sampling techniques (see also [51, 52]) and kernel density estimation (KDE).

4.3.1. Background: Sampling and Kernel Density Estimation

We say that a list of samples with associated weights $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$ is a representation for a distribution $p_{\mathbf{X}}(\mathbf{x})$ if, for any integrable function $g(\mathbf{x})$, we have the following approximation:

$$I = \int g(\mathbf{x})\, p_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x} \approx \sum_{k=1}^{R} w_k\, g(\mathbf{x}_k). \tag{7}$$

Popular methods for obtaining the list of weighted samples include (i) direct sampling, where we draw $R$ i.i.d. samples from $p_{\mathbf{X}}(\mathbf{x})$, each with weight $1/R$; and (ii) importance sampling, where we draw $R$ i.i.d. samples from a distribution $q_{\mathbf{X}}(\mathbf{x})$, with a support that includes the support of $p_{\mathbf{X}}(\mathbf{x})$, and set the weight corresponding to sample $\mathbf{x}_k$ as $w_k = p_{\mathbf{X}}(\mathbf{x}_k) / (R\, q_{\mathbf{X}}(\mathbf{x}_k))$, so that direct sampling is recovered when $q_{\mathbf{X}} = p_{\mathbf{X}}$. In both cases, it can easily be verified that the approximation is unbiased with mean $I$ and variance that reduces with $R$ (and that depends on $q_{\mathbf{X}}(\mathbf{x})$, for importance sampling). Most importantly, the variance does not depend on the dimensionality of $\mathbf{x}$.

A variation of importance sampling that is not unbiased but that often has smaller variance is obtained by setting the weights as follows: $w_k \propto p_{\mathbf{X}}(\mathbf{x}_k) / q_{\mathbf{X}}(\mathbf{x}_k)$, $\sum_k w_k = 1$. This approach has the additional benefit that it does not require knowledge of the normalization constants of $p_{\mathbf{X}}(\mathbf{x})$ or $q_{\mathbf{X}}(\mathbf{x})$. A list of $R$ equally weighted samples can be obtained from $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$ through resampling, that is, by drawing (with repetition) $R$ samples from the probability mass function defined by $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$.
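The self-normalized variant and the resampling step can be sketched as follows. The one-dimensional Gaussian target and proposal are hypothetical choices made only so that the result is easy to check; both are deliberately left unnormalized.

```python
import numpy as np

rng = np.random.default_rng(0)
R = 20000

def target_unnorm(x):
    # Unnormalized target p_X: Gaussian with mean 1 and std 0.5 (assumed).
    return np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

def proposal_unnorm(x):
    # Unnormalized proposal q_X: Gaussian with mean 0 and std 2 (assumed).
    return np.exp(-0.5 * (x / 2.0) ** 2)

x = rng.normal(0.0, 2.0, size=R)              # x_k ~ q_X
w = target_unnorm(x) / proposal_unnorm(x)     # w_k ∝ p_X(x_k) / q_X(x_k)
w /= w.sum()                                  # enforce sum_k w_k = 1

mean_est = np.sum(w * x)                      # approximation (7) with g(x) = x

# Resampling: draw R indices with repetition from the pmf defined by {x_k, w_k}.
idx = rng.choice(R, size=R, replace=True, p=w)
x_equal = x[idx]                              # equally weighted samples (weight 1/R)
```

Because the weights are normalized explicitly, the missing normalization constants of the two densities have no effect on `mean_est`.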

For numerical stability reasons, weights are often computed and stored in the logarithmic domain, that is, $\lambda_k = \log p_{\mathbf{X}}(\mathbf{x}_k) - \log q_{\mathbf{X}}(\mathbf{x}_k)$. When the distributions involved contain exponentials or products, the log-domain representation is also computationally efficient. Operations such as additions can be evaluated efficiently in the log-domain as well, using the Jacobian logarithm [49, pages 90–94]. Once all $R$ log-domain weights are computed, they are translated, exponentiated, and normalized: $w_k \propto \exp(\lambda_k - \max_l \lambda_l)$.
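A short sketch of the translate, exponentiate, and normalize step is given below. The function name `normalize_log_weights` and the example values are illustrative only.

```python
import numpy as np

def normalize_log_weights(log_w):
    """Return w_k ∝ exp(lambda_k - max_l lambda_l), normalized to sum to one."""
    log_w = np.asarray(log_w, dtype=float)
    shifted = log_w - np.max(log_w)     # translate: the largest exponent becomes 0
    w = np.exp(shifted)                 # exponentiate without underflow of the peak
    return w / w.sum()                  # normalize

lam = np.array([-1000.0, -1001.0, -1002.0])
w = normalize_log_weights(lam)
naive = np.exp(lam)                     # direct exponentiation underflows to zero
```

Without the translation, all three weights in this example underflow and cannot be normalized, even though their ratios are perfectly well defined.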

Given a sample representation $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$ of a distribution $p_{\mathbf{X}}(\mathbf{x})$, we obtain a kernel density estimate of $p_{\mathbf{X}}(\mathbf{x})$ as

$$\hat{p}_{\mathbf{X}}(\mathbf{x}) = \sum_{k=1}^{R} w_k\, K_{\sigma}(\mathbf{x} - \mathbf{x}_k), \tag{8}$$

where $K_{\sigma}(\mathbf{x})$ is the so-called kernel with bandwidth $\sigma$. The kernel is a symmetric distribution with a width parameter that is tuned through $\sigma$. For instance, a two-dimensional Gaussian kernel is given by

$$K_{\sigma}(\mathbf{x}) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{\|\mathbf{x}\|^2}{2\sigma^2}\right). \tag{9}$$

While the choice of kernel affects the performance of the estimate to some limited extent (e.g., in an MMSE sense, where the error is $\int |p_X(x) - \hat{p}_X(x)|^2\, p_X(x)\, dx$), the crucial parameter is the bandwidth $\sigma$, which needs to be estimated from the samples $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$. A large choice of $\sigma$ makes $\hat{p}_{\mathbf{X}}(\mathbf{x})$ smooth, but it may no longer capture the interesting features of $p_{\mathbf{X}}(\mathbf{x})$. When $\sigma$ is too small, $\hat{p}_{\mathbf{X}}(\mathbf{x})$ may exhibit artificial structure not present in $p_{\mathbf{X}}(\mathbf{x})$ [53].

With this background in sampling techniques and KDE, we return to the problem at hand: filtering and multiplication of messages.

4.3.2. Message Filtering

We assume a message representation of $p_{\mathbf{X}}(\mathbf{x})$ as $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$ and wish to obtain a message representation of $p_{\mathbf{Y}}(\mathbf{y}) \propto \int h(\mathbf{x}, \mathbf{y})\, p_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x}$. Let us interpret $h(\mathbf{x}, \mathbf{y})$ as a conditional distribution $p_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x})$, up to some arbitrary constant. Suppose we can draw samples $\{[\mathbf{x}_k, \mathbf{y}_k], w_k\}_{k=1}^{R} \sim p_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x})\, p_{\mathbf{X}}(\mathbf{x})$; then $\{\mathbf{y}_k, w_k\}_{k=1}^{R}$ will form a sample representation of $p_{\mathbf{Y}}(\mathbf{y})$. Now the problem reverts to drawing samples from $p_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x})\, p_{\mathbf{X}}(\mathbf{x})$. This can be accomplished as follows: first, for every sample $\mathbf{x}_k$, draw $\mathbf{y}_k \sim q_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x}_k)$ from some distribution $q_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x}_k)$. Second, set the weight of sample $[\mathbf{x}_k, \mathbf{y}_k]$ as

$$v_k = w_k\, \frac{p_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y}_k \mid \mathbf{x}_k)}{q_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y}_k \mid \mathbf{x}_k)}. \tag{10}$$

Finally, renormalize the weights $v_k$ to $v_k / \sum_l v_l$. The complexity of the filtering operation scales as $\mathcal{O}(R)$, a significant improvement from $\mathcal{O}(R^2)$ for discretization. In addition, $R$ can generally be much smaller in a particle-based representation.

Let us consider some examples of the filtering operation in SPAWN.
(i) Mobility update (A-1): let $p_{\mathbf{X}}(\mathbf{x})$ be the belief before movement (represented by $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$) and $p_{\mathbf{Y}}(\mathbf{y})$ the belief after movement. Assume that we are able to measure perfectly the distance traveled (given by $z_{\mathrm{self}}$), but have no information regarding the direction, and furthermore that the direction is chosen uniformly in $(0, 2\pi]$. In that case,

$$h(\mathbf{x}, \mathbf{y}) \propto \delta\left(z_{\mathrm{self}} - \|\mathbf{x} - \mathbf{y}\|\right), \tag{11}$$

where $\delta$ is a Dirac delta function, so that $q_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x}_k) = p_{\mathbf{Y} \mid \mathbf{X}}(\mathbf{y} \mid \mathbf{x}_k) \propto \delta(z_{\mathrm{self}} - \|\mathbf{x}_k - \mathbf{y}\|)$ is a reasonable choice. For every $\mathbf{x}_k$, we can now draw values for $\mathbf{y} = \mathbf{x}_k + r \times [\cos\theta \;\; \sin\theta]^T$ by drawing $\theta \sim \mathcal{U}(0, 2\pi)$ and setting $r = z_{\mathrm{self}}$, leading to $\mathbf{y}_k = \mathbf{x}_k + z_{\mathrm{self}} \times [\cos\theta_k \;\; \sin\theta_k]^T$, with $v_k \propto w_k$.
(ii) Ranging update (A-2): let $p_{\mathbf{X}}(\mathbf{x})$ be a message (represented by $\{\mathbf{x}_k, w_k\}_{k=1}^{R}$) from a node with which we have performed ranging, resulting in a range estimate $z$. Let $h(\mathbf{x}, \mathbf{y}) = p_{Z \mid D}(z \mid d)$, where $d = \|\mathbf{x} - \mathbf{y}\|$. Note that $p_{Z \mid D}(z \mid d)$ is a likelihood function, since the measurement $z$ is known. Assume that we have a model for the ranging performance in the form of distributions $p_{Z \mid D}(z \mid d)$ for any value of $d$. We then sample $p_{\mathbf{Y}}(\mathbf{y})$ as follows: for every $\mathbf{x}_k$, draw $\mathbf{y} = \mathbf{x}_k + r \times [\cos\theta \;\; \sin\theta]^T$ by drawing $\theta \sim \mathcal{U}(0, 2\pi)$ and $r \sim q_{R \mid Z}(r \mid z)$, for some well-chosen $q_{R \mid Z}(r \mid z)$ (e.g., a Gaussian distribution with mean equal to the distance estimate, $z$, and a standard deviation that is sufficiently large with respect to the standard deviation of $p_{Z \mid D}(z \mid d)$ for any $d$). The weights are set as

$$v_k = w_k\, \frac{p_{Z \mid D}(z \mid r_k)}{q_{R \mid Z}(r_k \mid z)}. \tag{12}$$
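The ranging update in (ii) can be sketched as follows. The Gaussian likelihood with fixed standard deviation `sigma_z` is an assumption made for brevity (the ranging model used later in the paper is distance dependent), and the function name `ranging_update` is hypothetical.

```python
import numpy as np

def ranging_update(x, w, z, sigma_z, sigma_q, rng):
    """Sample-based filtering for (A-2) with a Gaussian proposal q_{R|Z}(r | z)."""
    R = len(w)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=R)       # theta_k ~ U(0, 2*pi)
    r = rng.normal(z, sigma_q, size=R)                  # r_k ~ q_{R|Z}(r | z)
    y = x + r[:, None] * np.column_stack([np.cos(theta), np.sin(theta)])
    # Log-domain version of (12). Constants common to all k are dropped.
    # The likelihood is evaluated at the distance |r_k| (assumed Gaussian model).
    log_lik = -0.5 * ((z - np.abs(r)) / sigma_z) ** 2
    log_q = -0.5 * ((r - z) / sigma_q) ** 2
    log_v = np.log(w) + log_lik - log_q
    v = np.exp(log_v - log_v.max())
    return y, v / v.sum()

rng = np.random.default_rng(1)
R = 2000
x = 0.1 * rng.standard_normal((R, 2))    # neighbor belief concentrated near the origin
w = np.full(R, 1.0 / R)
y, v = ranging_update(x, w, z=5.0, sigma_z=0.2, sigma_q=0.6, rng=rng)
mean_range = v @ np.linalg.norm(y, axis=1)   # weighted mean distance from the origin
```

The resulting samples lie on a ring of radius close to the measured range around the neighbor, which is the circular shape exploited by the parametric representation in Section 4.4.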

4.3.3. Message Multiplication

Here we assume message representations $\{\mathbf{x}_k^{(i)}, w_k^{(i)}\}_{k=1}^{R}$ for $p_{\mathbf{X}}^{(i)}(\mathbf{x})$, $i = 1, \ldots, M$. In contrast to the discretization approach, we cannot directly compute $\prod_{i=1}^{M} p_{\mathbf{X}}^{(i)}(\mathbf{x})$ for arbitrary values of $\mathbf{x}$. Rather, for every message $p_{\mathbf{X}}^{(i)}(\mathbf{x})$, we create a KDE $\hat{p}_{\mathbf{X}}^{(i)}(\mathbf{x}) = \sum_{k=1}^{R} w_k^{(i)} K_{\sigma^{(i)}}(\mathbf{x} - \mathbf{x}_k^{(i)})$ with a Gaussian kernel and a bandwidth estimated using the methods from [53]. Suppose we now draw $R$ samples $\mathbf{x}_k$ from a distribution $q_{\mathbf{X}}(\mathbf{x})$; then the weights are

$$v_k \propto \frac{\prod_{i=1}^{M} \hat{p}_{\mathbf{X}}^{(i)}(\mathbf{x}_k)}{q_{\mathbf{X}}(\mathbf{x}_k)}, \tag{13}$$

which can be computed efficiently in the log-domain. A reasonable choice for $q_{\mathbf{X}}(\mathbf{x})$ could be one of the incoming messages $p_{\mathbf{X}}^{(i)}(\mathbf{x})$ (e.g., the one with the smallest entropy) or a mixture of the incoming messages. The computational complexity of the message multiplication operation scales as $\mathcal{O}(MR^2)$, since each of the $R$ samples requires evaluating $M$ KDEs with $R$ kernels each. This appears worse than the discretized case (complexity $\mathcal{O}(MR)$), but note that $R$ is much smaller for sample-based representations than for discretization (e.g., $R = 10^3$ or $R = 10^4$ for the sample-based representation compared to $R = 10^8$ in the discretization).
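A minimal sketch of (13) follows, using the first incoming message as the proposal. The fixed bandwidths, the function names, and the two Gaussian test messages are assumptions for illustration; in particular, the bandwidths are not estimated with the methods of [53].

```python
import numpy as np

def log_kde(query, samples, weights, sigma):
    """Log of the weighted Gaussian KDE (8)-(9) at each row of `query`."""
    sq = np.sum((query[:, None, :] - samples[None, :, :]) ** 2, axis=-1)  # (Q, R)
    log_terms = (np.log(weights)[None, :] - sq / (2.0 * sigma**2)
                 - np.log(2.0 * np.pi * sigma**2))
    peak = log_terms.max(axis=1, keepdims=True)          # log-sum-exp for stability
    return peak[:, 0] + np.log(np.exp(log_terms - peak).sum(axis=1))

def multiply_messages(messages, sigmas, rng):
    """Weights (13), with the KDE of messages[0] acting as the proposal q_X."""
    x_q, w_q = messages[0]
    R = len(w_q)
    # Sampling from a KDE: resample its support points, then add kernel noise.
    idx = rng.choice(R, size=R, p=w_q)
    x = x_q[idx] + sigmas[0] * rng.standard_normal((R, 2))
    log_v = -log_kde(x, x_q, w_q, sigmas[0])             # - log q_X(x_k)
    for (x_i, w_i), s_i in zip(messages, sigmas):        # O(M R^2) overall
        log_v += log_kde(x, x_i, w_i, s_i)               # + log p_hat^(i)(x_k)
    v = np.exp(log_v - log_v.max())
    return x, v / v.sum()

rng = np.random.default_rng(2)
R = 1000
msg_a = (rng.standard_normal((R, 2)), np.full(R, 1.0 / R))
msg_b = (rng.standard_normal((R, 2)) + np.array([2.0, 0.0]), np.full(R, 1.0 / R))
x, v = multiply_messages([msg_a, msg_b], sigmas=[0.3, 0.3], rng=rng)
product_mean = v @ x                                     # expected near (1, 0)
```

Each call to `log_kde` builds an $R \times R$ distance table, which makes the quadratic cost of this operation directly visible.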

4.4. Parametric Message Representation
4.4.1. Choosing a Suitable Parameterization

From the previous section, it is clear that the bottleneck of the sample-based message representation lies in the message multiplication, which scales quadratically with the number of samples. An alternative approach is to represent each message as a set of parameters (e.g., a Gaussian distribution characterized by a mean and covariance matrix). In contrast to the sample-based message representation, which can represent messages of any shape, parametric representations must be specially tailored to the problem at hand. For example, single two-dimensional Gaussian parametric messages are utilized in [22] for localization with both range and angle measurements. Our choice of parametric message is based on the following observations.
(i) For the filtering operation with a two-dimensional Gaussian input $p_{\mathbf{X}}(\mathbf{x})$, the output $p_{\mathbf{Y}}(\mathbf{y})$ can be approximated by a circular distribution with the same mean for both the mobility update (A-1) and the ranging update (A-2).
(ii) Multiplying Gaussian distributions yields a Gaussian distribution.
(iii) The multiplication of multiple circular distributions can be approximated by a Gaussian distribution or a mixture of Gaussian distributions.

We will use as a basic building block the following distribution in two dimensions:

$$\mathcal{D}(\mathbf{x}; m_1, m_2, \sigma^2, \rho) = \frac{1}{C(\sigma^2, \rho)} \exp\left\{-\frac{\left[\sqrt{(x_1 - m_1)^2 + (x_2 - m_2)^2} - \rho\right]^2}{2\sigma^2}\right\}, \tag{14}$$

where $[m_1, m_2]$ is the midpoint of the distribution, $\rho$ is the radius, $\sigma^2$ is the variance, and $C(\sigma^2, \rho)$ is a normalization constant equal to

$$C(\sigma^2, \rho) = 2\pi\sigma^2 \left[\exp\left(-\frac{\rho^2}{2\sigma^2}\right) + \frac{1}{2}\sqrt{\frac{2\pi\rho^2}{\sigma^2}} \left(1 + \operatorname{erf}\sqrt{\frac{\rho^2}{2\sigma^2}}\right)\right]. \tag{15}$$

As a special case, we note that, when $\rho = 0$, (14) reverts to a two-dimensional Gaussian. Moreover, we will represent all messages as a mixture of two distributions of the type (14), so that

$$p_{\mathbf{X}}(\mathbf{x}) = \frac{1}{2}\mathcal{D}\left(\mathbf{x}; m_1^{(a)}, m_2^{(a)}, \sigma^2, \rho\right) + \frac{1}{2}\mathcal{D}\left(\mathbf{x}; m_1^{(b)}, m_2^{(b)}, \sigma^2, \rho\right), \tag{16}$$

which can be represented by the six-dimensional vector $[m_1^{(a)}, m_2^{(a)}, m_1^{(b)}, m_2^{(b)}, \rho, \sigma^2]$. We will denote the family of distributions of the form (16) by $\mathcal{D}_2$. Note that it is trivial to extend this distribution, which is designed for two-dimensional localization systems, for use in three-dimensional systems. Before we describe the message filtering and message multiplication operations, let us first show how the parameters of (14) can be estimated from a list of samples.
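The density (14), its normalization constant (15), and the mixture (16) translate directly into code. The sketch below is illustrative; the function names are hypothetical and the parameter values in the check are arbitrary.

```python
import math
import numpy as np

def d_norm_const(sigma2, rho):
    """Normalization constant C(sigma^2, rho) from (15)."""
    a = rho**2 / (2.0 * sigma2)
    return 2.0 * math.pi * sigma2 * (
        math.exp(-a)
        + 0.5 * math.sqrt(2.0 * math.pi * rho**2 / sigma2) * (1.0 + math.erf(math.sqrt(a)))
    )

def d_pdf(x, m, sigma2, rho):
    """Density (14) at points x (shape (..., 2)) with midpoint m."""
    r = np.linalg.norm(np.asarray(x) - np.asarray(m), axis=-1)
    return np.exp(-((r - rho) ** 2) / (2.0 * sigma2)) / d_norm_const(sigma2, rho)

def d2_pdf(x, m_a, m_b, sigma2, rho):
    """Equal-weight two-component mixture (16)."""
    return 0.5 * d_pdf(x, m_a, sigma2, rho) + 0.5 * d_pdf(x, m_b, sigma2, rho)

# Numerical check on a grid: the density should integrate to approximately one.
step = 0.02
g = np.arange(-8.0, 8.0, step)
pts = np.stack(np.meshgrid(g, g), axis=-1)
total = d_pdf(pts, [0.0, 0.0], 0.25, 3.0).sum() * step**2
```

For $\rho = 0$ the constant reduces to $2\pi\sigma^2$, the normalization of an isotropic two-dimensional Gaussian, consistent with the special case noted above.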

4.4.2. ML Estimation of the Parameters $m_1, m_2, \sigma^2, \rho$

Given a list of samples $\{\mathbf{x}_k\}_{k=1}^{R}$, we can estimate the parameters $[m_1, m_2, \sigma^2, \rho]$ as follows. The midpoint $\mathbf{m} = [m_1, m_2]$ is estimated by

$$\hat{\mathbf{m}} = \frac{1}{R}\sum_{k=1}^{R} \mathbf{x}_k. \tag{17}$$

To find the radius $\rho$ and variance $\sigma^2$ of the $\mathcal{D}$-distribution, we use maximum likelihood (ML) estimation, assuming the $R$ samples are independent. Introducing $\boldsymbol{\alpha} = [\rho \;\; \sigma^2]^T$, we find that

$$\hat{\boldsymbol{\alpha}}_{\mathrm{ML}} = \arg\max_{\boldsymbol{\alpha}} \underbrace{\sum_{k=1}^{R} \log p_{\mathbf{X}}(\mathbf{x}_k; \hat{\mathbf{m}}, \boldsymbol{\alpha})}_{\Lambda(\boldsymbol{\alpha})}. \tag{18}$$

Treating the log-likelihood function (LLF) $\Lambda(\boldsymbol{\alpha})$ as an objective function, we find its maximum through the gradient ascent algorithm

$$\hat{\boldsymbol{\alpha}}^{(n+1)} = \hat{\boldsymbol{\alpha}}^{(n)} + \varepsilon\, \nabla_{\boldsymbol{\alpha}} \Lambda(\boldsymbol{\alpha})\big|_{\boldsymbol{\alpha} = \hat{\boldsymbol{\alpha}}^{(n)}}, \tag{19}$$

where $\varepsilon > 0$ is a suitably small step size, and the gradient vector can be approximated using finite differences. To initialize (19), we consider two initial estimates for $\sigma^2$ and $\rho$: one assuming $\rho \ll \sigma$ and a second assuming $\rho \gg \sigma$. The LLF is evaluated for both preliminary solutions, and the one with the largest log-likelihood is used as the initial estimate in (19).
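The estimation procedure can be sketched as below. Several details are assumptions not specified in the text: the two initial guesses (moment-based formulas for the Gaussian-like and ring-like cases), the use of the mean rather than the summed log-likelihood (which only rescales the step size), the projection that keeps $\sigma^2$ positive, and the step size, which would need tuning to the scale of the data.

```python
import math
import numpy as np

def log_d_pdf(r, rho, sigma2):
    """Log of (14) as a function of the distance r to the midpoint."""
    a = rho**2 / (2.0 * sigma2)
    c = 2.0 * math.pi * sigma2 * (
        math.exp(-a) + 0.5 * math.sqrt(4.0 * math.pi * a) * (1.0 + math.erf(math.sqrt(a)))
    )
    return -((r - rho) ** 2) / (2.0 * sigma2) - math.log(c)

def fit_d_parameters(samples, step=0.05, n_iter=500, eps=1e-4):
    m_hat = samples.mean(axis=0)                               # midpoint, eq. (17)
    r = np.linalg.norm(samples - m_hat, axis=1)
    llf = lambda al: np.mean(log_d_pdf(r, al[0], al[1]))       # Lambda(alpha) / R
    # Assumed initial guesses: rho << sigma (Gaussian-like), rho >> sigma (ring-like).
    inits = [np.array([0.0, np.mean(r**2) / 2.0]), np.array([r.mean(), r.var()])]
    alpha = max(inits, key=llf)
    for _ in range(n_iter):
        # Central finite-difference approximation of the gradient.
        grad = np.array([(llf(alpha + eps * e) - llf(alpha - eps * e)) / (2.0 * eps)
                         for e in np.eye(2)])
        alpha = alpha + step * grad                            # ascent step, eq. (19)
        alpha = np.maximum(alpha, [0.0, 1e-3])                 # keep rho >= 0, sigma^2 > 0
    return m_hat, alpha[0], alpha[1]

# Synthetic ring-shaped samples: radial density ∝ r * exp(-(r - 3)^2 / (2 * 0.25)).
rng = np.random.default_rng(3)
r_prop = rng.normal(3.0, 0.5, size=20000)
r_prop = r_prop[r_prop > 0]
r_true = rng.choice(r_prop, size=4000, p=r_prop / r_prop.sum())
theta = rng.uniform(0.0, 2.0 * np.pi, size=4000)
samples = np.array([1.0, -2.0]) + r_true[:, None] * np.column_stack([np.cos(theta), np.sin(theta)])
m_hat, rho_hat, sigma2_hat = fit_d_parameters(samples)
```

Plain gradient ascent is poorly conditioned when $\sigma^2$ is small, since the curvature of the log-likelihood in $\sigma^2$ grows as $1/\sigma^4$; a smaller step size or a reparameterization may be needed in that regime.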

4.4.3. Message Filtering

To perform message filtering, we use the fact that sample-based message filtering is a low-complexity operation. We decompose $p_{\mathbf{X}}(\mathbf{x})$, represented in parametric form, into its two mixture components. From each component, we draw $R/2$ samples and perform sample-based message filtering, as outlined in Section 4.3.2. We can then estimate the new $\mathcal{D}$-parameters for each mixture component using the ML method described above. We thus have $p_{\mathbf{Y}}(\mathbf{y})$ in parametric form. The complexity of this operation scales as $\mathcal{O}(R)$.

4.4.4. Message Multiplication

The motivation for using the parametric message representation is to avoid the complexity associated with sample-based message multiplication. Given $M$ distributions $p_{\mathbf{X}}^{(i)}(\mathbf{x}) \in \mathcal{D}_2$, our goal is to compute

$$\phi_{\mathbf{X}}(\mathbf{x}) = \prod_{i=1}^{M} p_{\mathbf{X}}^{(i)}(\mathbf{x}). \tag{20}$$

Typically, $\phi_{\mathbf{X}}(\mathbf{x}) \notin \mathcal{D}_2$, so we will approximate $\phi_{\mathbf{X}}(\mathbf{x})$ by $q_{\mathbf{X}}^{*}(\mathbf{x}) \in \mathcal{D}_2$ by projecting $\phi_{\mathbf{X}}(\mathbf{x})$ onto the family $\mathcal{D}_2$:

$$q_{\mathbf{X}}^{*}(\cdot) = \arg\min_{q_{\mathbf{X}} \in \mathcal{D}_2} D_{\mathrm{KL}}\left(q_{\mathbf{X}} \,\|\, \phi_{\mathbf{X}}\right), \tag{21}$$

where $D_{\mathrm{KL}}(\cdot \,\|\, \cdot)$ denotes the Kullback-Leibler (KL) divergence, defined as

$$D_{\mathrm{KL}}\left(q_{\mathbf{X}} \,\|\, \phi_{\mathbf{X}}\right) = \int q_{\mathbf{X}}(\mathbf{x}) \log\frac{q_{\mathbf{X}}(\mathbf{x})}{\phi_{\mathbf{X}}(\mathbf{x})}\, d\mathbf{x}. \tag{22}$$

Observe that all elements of $\mathcal{D}_2$ are characterized by the parameters $\mathbf{p} \triangleq [m_1^{(a)}, m_2^{(a)}, m_1^{(b)}, m_2^{(b)}, \rho, \sigma^2]^T$ and that the optimization (21) is therefore a six-dimensional problem over all possible $\mathbf{p}$. The divergence $D_{\mathrm{KL}}(q \,\|\, \phi)$ for an arbitrary $\mathbf{p} \in \mathbb{R}^4 \times \mathbb{R}_+^2$ can be determined using Monte-Carlo integration as follows. We rewrite (22) as

$$D_{\mathrm{KL}}\left(q_{\mathbf{X}} \,\|\, \phi_{\mathbf{X}}\right) = \int q_{\mathbf{X}}(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x}, \tag{23}$$

where $f(\mathbf{x}) = \log q_{\mathbf{X}}(\mathbf{x}) - \sum_{i=1}^{M} \log p_{\mathbf{X}}^{(i)}(\mathbf{x})$. By drawing $R$ weighted samples $\{w_k, \mathbf{x}_k\}_{k=1}^{R}$ from $q_{\mathbf{X}}(\mathbf{x})$ (e.g., through importance sampling), we can approximate (23) by

$$D_{\mathrm{KL}}\left(q_{\mathbf{X}} \,\|\, \phi_{\mathbf{X}}\right) \approx \sum_{k=1}^{R} w_k\, f(\mathbf{x}_k). \tag{24}$$

Using this approximation, the six-dimensional optimization problem (21) is solved through gradient descent, similar to (19). The complexity of this operation scales as $\mathcal{O}(RM)$. The initial estimate of $\mathbf{p}$ is obtained through a set of heuristics: we first decide whether $\phi_{\mathbf{X}}(\mathbf{x})$ can reasonably be represented by a distribution in $\mathcal{D}_2$. If not, the outgoing message is not computed. Otherwise, we use a geometric argument to find at most two midpoints. The initial estimates for $\rho$ and $\sigma^2$ are set to a small constant value.
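The Monte-Carlo estimate (24) can be sketched as follows. To keep the example short and verifiable, isotropic Gaussians (the $\rho = 0$ special case of (14)) stand in for general members of $\mathcal{D}_2$, and direct sampling with weights $1/R$ is used; these simplifications and the function names are assumptions.

```python
import numpy as np

def log_gauss2d(x, mean, sigma2):
    """Log density of an isotropic 2-D Gaussian, i.e., (14) with rho = 0."""
    sq = np.sum((x - mean) ** 2, axis=-1)
    return -sq / (2.0 * sigma2) - np.log(2.0 * np.pi * sigma2)

def kl_monte_carlo(q_mean, q_sigma2, incoming, R=20000, seed=4):
    """Estimate (24): sum_k w_k f(x_k), with x_k ~ q_X and w_k = 1/R."""
    rng = np.random.default_rng(seed)
    x = q_mean + np.sqrt(q_sigma2) * rng.standard_normal((R, 2))   # samples from q_X
    f = log_gauss2d(x, q_mean, q_sigma2)                           # log q_X(x_k)
    for mean_i, sigma2_i in incoming:                              # O(R M) work
        f -= log_gauss2d(x, mean_i, sigma2_i)                      # - log p^(i)(x_k)
    return f.mean()

# Single incoming message: the closed form is ||mean difference||^2 / (2 sigma^2) = 0.5.
kl = kl_monte_carlo(np.array([0.0, 0.0]), 1.0, [(np.array([1.0, 0.0]), 1.0)])
```

Fixing the random seed across evaluations makes the estimate a deterministic function of the parameters, which tends to help when its gradient is approximated by finite differences.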

4.5. Comparison of Message Representations

The complexities of the discretized, sample-based, and parametric message representations are compared in Table 1.

5. Performance Analysis

In this section, we compare the performance of the SPAWN algorithm with sample-based versus parametric message representation in a simulated wireless network. We also analyze the use of different subsets of information in the algorithm and its effect on localization performance.

5.1. Simulation Setup and Performance Measures

We simulate a large-scale ultra-wide bandwidth (UWB) network in a 100 m × 100 m homogeneous environment, with 100 uniformly distributed agents and 13 fixed anchors in a grid configuration. Each node is able to measure its range to other nodes within 20 meters. The simulated ranging measurements are independently drawn from the UWB ranging model developed in [54]. The model, based on data collected in a variety of indoor scenarios, consists of three component Gaussian densities, where the mean and variance of each component are experimentally determined functions of the true distance between the ranging nodes. To decouple the effect of mobility from that of the message representation, we consider a single time slot, where every agent has a uniform a priori distribution over the environment $\mathcal{E}$. SPAWN was run for $N_{\mathrm{it}} = 20$ iterations, though convergence was generally achieved well before 10 iterations. For the sample-based representation, the number of samples is set to $R = 2048$ unless otherwise stated.

We quantify localization performance using the complementary cumulative distribution function (CCDF) of the localization error $e = \|\mathbf{x}_i - \hat{\mathbf{x}}_i\|$, where $\hat{\mathbf{x}}_i$ is the estimated location of node $i$, taken as the mean of the belief, similar to [18]. To estimate the CCDF, we consider 50 random network topologies and collect position estimates at every iteration for every agent. Note that a CCDF of 0.01 at an error of, say, $e = 1$ m means that 99% of the nodes have an error less than 1 meter.
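For concreteness, an empirical CCDF can be computed from a list of localization errors as in the sketch below. The error values are made up for illustration and are not results from the paper.

```python
import numpy as np

def empirical_ccdf(errors, thresholds):
    """Fraction of localization errors strictly exceeding each threshold."""
    errors = np.asarray(errors, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (errors[None, :] > thresholds[:, None]).mean(axis=1)

# Hypothetical errors (in meters) collected over agents and topologies.
errors = np.array([0.2, 0.4, 0.5, 0.9, 1.5, 3.0, 0.1, 0.3, 0.7, 12.0])
ccdf = empirical_ccdf(errors, [0.5, 1.0, 10.0])
```

Here three of the ten errors exceed 1 m, so the CCDF at $e = 1$ m is 0.3, meaning that 70% of the nodes in this toy example have an error of at most 1 meter.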

5.2. Cooperation with Censoring

In Section 3, we considered processing messages between all neighboring pairs of nodes. However, information from neighbors may not always be useful: (i) when the receiving node's belief is already very informative (e.g., concentrated around the mean); or (ii) when the transmitting node's belief is very uninformative. To better understand how much cooperative information is beneficial to localization, we will consider varying the subset of nodes that broadcast and update their location beliefs at each iteration. We distinguish between these subsets by the level of cooperative information they induce in the algorithm. The level of cooperative information indicates how each node utilizes information from its neighbors at each iteration.

We introduce the following terminology: a distribution is said to be โ€œsufficiently informativeโ€ when 95% of the probability mass is located within 2โ€‰m of the mean; a node becomes a virtual anchor when its belief is sufficiently informative; a virtual bianchor is a node with a bimodal belief, with each mode being sufficiently informative; a node that is neither a virtual anchor nor a virtual bianchor will be called a blind agent. We are now ready to introduce four levels of cooperative information at each iteration.(i)Level 1 (L1): virtual anchors broadcast their beliefs, while all other nodes censor their belief broadcast. Virtual anchors do not update their beliefs.(ii)Level 2 (L2): virtual anchors and virtual bianchors broadcast their beliefs, while blind nodes censor their belief broadcast. Virtual anchors do not update their beliefs.(iii)Level 3 (L3): all nodes broadcast their beliefs. Virtual anchors do not update their beliefs.(iv)Level 4 (L4): all nodes broadcast their beliefs. All nodes update their beliefs. In terms of cooperation, note that L4 utilizes more cooperative information than L3, L3 utilizes more cooperative information than L2, and L2 utilizes more cooperative information than L1. In this sense, the levels of cooperative information are strict subsets.
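The censoring rules can be summarized in a few lines of code. The sketch below tests the "sufficiently informative" criterion on a weighted sample list and encodes the broadcast and update rules of the four levels; the function names and the string labels for node types are hypothetical, and detecting bimodal beliefs is not covered.

```python
import numpy as np

def is_sufficiently_informative(samples, weights, radius=2.0, mass=0.95):
    """True if at least `mass` of the probability lies within `radius` m of the mean."""
    mean = weights @ samples
    dist = np.linalg.norm(samples - mean, axis=1)
    return weights[dist <= radius].sum() >= mass

def broadcasts(level, node_type):
    """Whether a node of type 'anchor', 'bianchor', or 'blind' broadcasts its belief."""
    allowed = {
        1: {"anchor"},
        2: {"anchor", "bianchor"},
        3: {"anchor", "bianchor", "blind"},
        4: {"anchor", "bianchor", "blind"},
    }
    return node_type in allowed[level]

def updates_belief(level, node_type):
    """Virtual anchors keep their beliefs fixed in L1-L3; all nodes update in L4."""
    return level == 4 or node_type != "anchor"
```

A node would call `is_sufficiently_informative` on its current belief at each iteration to decide its type, then apply `broadcasts` and `updates_belief` for the chosen level.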

From previous sections, we know that the algorithm complexity scales linearly in 𝑀, the number of incoming messages in the multiplication operation. Hence, the level of cooperative information directly affects the algorithm's computational cost, with lower levels requiring less computation.
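The dependence of the number of incoming messages on the level of cooperative information can be sketched by counting, for one receiving node, how many neighbors broadcast under each level. The names below (NodeType, BROADCASTERS, incoming_messages) and the example neighborhood are hypothetical; true anchors are not modeled separately in this sketch.

```python
from enum import Enum


class NodeType(Enum):
    VIRTUAL_ANCHOR = 1
    VIRTUAL_BIANCHOR = 2
    BLIND_AGENT = 3


ALL_TYPES = set(NodeType)

# Node types that broadcast their beliefs at each level (L1 to L4).
BROADCASTERS = {
    "L1": {NodeType.VIRTUAL_ANCHOR},
    "L2": {NodeType.VIRTUAL_ANCHOR, NodeType.VIRTUAL_BIANCHOR},
    "L3": ALL_TYPES,
    "L4": ALL_TYPES,
}


def broadcasts(level, node_type):
    """True if a node of this type broadcasts its belief at this level."""
    return node_type in BROADCASTERS[level]


def updates(level, node_type):
    """True if a node of this type updates its belief at this level.

    Virtual anchors keep their beliefs fixed in L1 to L3; in L4 all nodes update.
    """
    return level == "L4" or node_type is not NodeType.VIRTUAL_ANCHOR


def incoming_messages(level, receiver_type, neighbor_types):
    """Number of messages the receiver multiplies in one iteration."""
    if not updates(level, receiver_type):
        return 0
    return sum(1 for t in neighbor_types if broadcasts(level, t))


# Hypothetical neighborhood: 2 virtual anchors, 1 virtual bianchor, 3 blind agents.
neighbors = (
    [NodeType.VIRTUAL_ANCHOR] * 2
    + [NodeType.VIRTUAL_BIANCHOR]
    + [NodeType.BLIND_AGENT] * 3
)
counts = {
    level: incoming_messages(level, NodeType.BLIND_AGENT, neighbors)
    for level in BROADCASTERS
}
```

For this illustrative neighborhood, a blind agent multiplies 2 messages under L1, 3 under L2, and 6 under L3 or L4, so its multiplication cost grows in the same proportion.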

5.3. Numerical Results

We now examine how localization performance varies with the algorithm parameters. In particular, numerical results show the effect of message representation (sample-based or parametric) and level of cooperative information (L1, L2, L3, or L4) on the CCDF of the localization error.

We first consider the localization performance as a function of the number of samples 𝑅 and level of cooperative information. Figure 1 displays the CCDF at 𝑒 = 1 m after 10 iterations. As expected, for any level of cooperative information, the CCDF decreases as the number of samples is increased. However, the decrease in CCDF comes with a cost in computation time; as 𝑅 is increased, the per-node complexity increases quadratically. Figure 1 also shows that levels L1, L2, and L4 are not as sensitive to 𝑅 as L3 and that each generally outperforms L3. This effect is particularly pronounced when 𝑅 is small. L3 broadcasts more complex distributions than L2 and L1, and these elaborate distributions are not accurately represented with a small number of samples.
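The two quantities traded off above can be made concrete with a short sketch. It assumes that the CCDF at 𝑒 is the fraction of nodes whose localization error exceeds 𝑒 and that the per-node cost is proportional to the square of the number of samples; the function names, error values, and sample counts are hypothetical and are not taken from the figures.

```python
def ccdf_at(errors_m, e_m):
    """Empirical CCDF: fraction of nodes whose error exceeds e_m (meters)."""
    return sum(1 for err in errors_m if err > e_m) / len(errors_m)


def relative_cost(r_new, r_old):
    """Cost ratio when the sample count changes, assuming quadratic scaling."""
    return (r_new / r_old) ** 2


# Hypothetical per-node localization errors in meters.
errors_m = [0.2, 0.5, 0.9, 1.5, 3.0]

# Two of the five errors exceed 1 m.
outage_at_1m = ccdf_at(errors_m, 1.0)

# Doubling the (hypothetical) sample count quadruples the per-node cost.
cost_ratio = relative_cost(1024, 512)
```

Under these assumptions, any reduction in the CCDF obtained by doubling the number of samples has to be weighed against a fourfold increase in per-node computation.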

Second, we investigate the level of cooperative information and its effect on localization performance, with numerical results presented in Figures 2 and 3, after 𝑁it = 20 iterations. Note that each curve exhibits a “floor” because there is always some subset of nodes that have insufficient information to localize without ambiguity. This may be due to lack of connectivity or large flip ambiguities. Let us focus on the sample-based representation in Figure 2 and consider the effect of the level of cooperative information on localization performance. In general, L4 has the best performance in terms of accuracy and floor. Intuitively, one might expect L3 to have the next best performance, followed by L2, and then L1. However, Figure 2 demonstrates that in some cases L3 has poorer accuracy than L2 and a similar floor. This effect can be explained as follows. Agents that do not become virtual anchors within 𝑁it = 20 iterations tend to have large localization errors, creating a floor. Such agents comprise 1.7% of the total nodes for L1 and 0.3% for both L2 and L3. Since L2 and L3 have a similar fraction of agents that do not become virtual anchors, they have similar floors. In addition, the accuracy of beliefs belonging to agents that have become virtual anchors turns out to be highest for L1, followed by L2, and then L3. This is because L3 uses less reliable information than L2, which in turn uses less reliable information than L1. The final CCDF depends on both the fraction of virtual anchors (lowest for L1) and the accuracy of those virtual anchors (highest for L1). Note that we cannot compare L4 in this context, since there is no concept of a virtual anchor in L4.

We now move on to the parametric representation, still in Figure 2. We observe that L4 has the lowest overall CCDF for any 𝑒, for both types of message representation. For the parametric messages, the differences among different levels of cooperative information are smaller, and we generally obtain better performance (for 𝑒 < 1 m) compared to sample-based messages.

Finally, in Figure 3, we evaluate the convergence speed of the different message representations and levels of cooperation, for a fixed error of 1 m. We see that the parametric messages generally lead to faster convergence and lower CCDF than their sample-based counterparts. Levels L2, L3, and L4 all converge in around 5 iterations with a final CCDF at 𝑒 = 1 m of around 0.01 for the parametric representation. Our results show that more cooperative information leads to faster improvement in terms of accuracy. The lowest level of cooperative information, L1, is consistently slower to converge and less accurate. However, higher levels of cooperative information also require the computation and representation of more complicated distributions. As a possible consequence, convergence issues may occur for levels L3 and L4. We also see that the parametric message representation performs approximately as well as or better than the sample-based messages in terms of both convergence and accuracy, while requiring much less execution time. Overall, parametric message representations yield a better performance/complexity tradeoff. This is because the parametric distributions are well tailored to the localization problem and the homogeneous simulation environment.

6. Conclusions and Extensions

In this paper, we considered different message representations for Bayesian cooperative localization in wireless networks: a generic sample-based representation and a tailored parametric representation. We used experimentally derived UWB ranging models to evaluate the performance of SPAWN as a function of message representation and level of cooperative information. Our results show that the tradeoffs between message representation, cooperative information, localization accuracy, and algorithm convergence are not straightforward and should be tailored to the scenario.

Through large-scale network simulations, we demonstrated that more cooperative information may improve localization accuracy but also increase the complexity of messages. Higher levels of cooperative information do not always correspond to an improvement in localization accuracy or convergence rate. As complicated distributions associated with location-uncertain nodes are computed and transmitted, the resulting increases in computational complexity and signal interference can actually reduce localization performance. It may therefore be advantageous to broadcast only confident information in cooperative localization networks, especially considering the resources saved by a node censorship policy.

We also demonstrated that though parametric messages have less representational flexibility, they can outperform nonparametric message representation at a much lower computational cost. In our simulations, the parametric representation achieved a lower probability of outage for errors under 1 meter while converging in equal or fewer iterations than the sample-based representation. Clearly, a parametric representation well tailored to the localization scenario is desirable in terms of both resource efficiency and localization accuracy.

The use of parametric distributions for localization can be extended to (i) different ranging models; (ii) different types of measurements; (iii) more general scenarios. In terms of ranging models, the proposed distributions can be applied as long as typical distributions in SPAWN roughly resemble a distribution in the 𝒟2-family. Note that a Gaussian ranging error satisfies this criterion, as would many other, more realistic, models. Other models, such as those derived from received signal strength, will require different types of parametric distributions. The same comment applies to the use of different types of measurements. For instance, with angle-of-arrival measurements, the parametric distributions should include a collection of linear distributions. Finally, more general scenarios may require tailor-made distributions. With NLOS measurements that can be modeled as biased Gaussians [20], for example, mixtures of 𝒟2 distributions would easily accommodate LOS/NLOS propagation, without relying on explicit NLOS identification.