Abstract

Knowledge of deployed transmitters’ (Tx) locations in a wireless network improves many aspects of network management. Operators and building administrators are interested in locating unknown Txs for optimizing new Tx placement, detecting and removing unauthorized Txs, selecting the nearest Tx for traffic offloading, and constructing radio maps for indoor and outdoor navigation. This survey provides a comprehensive review of existing algorithms that estimate the location of a wireless Tx given a set of observations of the received signal strength. Algorithms that require the observations to be location-tagged are suitable for outdoor mapping or small-scale indoor mapping, while algorithms that allow most observations to be unlocated trade off some accuracy to enable large-scale crowdsourcing. This article presents an empirical evaluation of the algorithms using numerical simulations and real-world Bluetooth Low Energy data.

1. Introduction

Locating the wireless transmitters (Tx) in the network provides mobile network operators with important and relevant information for a wide range of purposes, including finding rogue and nonfunctional access points (AP), planning and operating the communication networks, and estimating the radio frequency propagation properties of an area. Tx location determination is also used when constructing radio maps for localization services.

Every operator aims at providing good coverage so that subscribers in most locations can access the network. Competition between operators in providing the subscribers with continuous and uninterrupted data usage prompts them to find unknown Txs, which mainly belong to their competitors. Based on the knowledge of the deployed Tx locations, operators decide optimal places for installing new infrastructure within the area or steering the beam directions. An unknown Tx can be a WLAN (wireless local area network) AP with unlicensed spectrum or a femtocell AP whose spectrum is licensed to the operator. These Txs may be managed by individuals, by groups, or by the operator itself.

The operators offload users from their 3G or 4G cells to adjacent small cells or indoor femtocells when the traffic becomes heavy [1]. Knowing Txs’ locations and coverage areas helps the operators to identify which cells are nearby.

Locating unknown Txs helps the administrators secure the network when security loopholes are detected or when there are intruders that breach the area managed by the administrators [2]. Also when administrators update their network infrastructure within an authorized area, a map of existing Tx locations helps to determine optimal locations for new Txs.

Moreover, knowledge of Tx locations assists navigation in environments where GNSS (Global Navigation Satellite System) navigation is not feasible, such as indoors. Indoor navigation requires detailed knowledge of the network topology of the building, and unmanaged Txs can also be used provided that their locations are estimated. In many indoor localization studies [3–5], it is assumed that Tx locations are known a priori. This assumption is usually only valid for Txs that belong to the owner of the infrastructure.

This survey provides the reader with a comprehensive review on methods for locating wireless Txs using a set of measurements of the received signal strength (RSS). Most of the presented methods can be applied to different types of wireless networks, such as WLAN, Bluetooth Low Energy (BLE), and cellular networks. Figure 1 shows examples of an outdoor cellular base station and an indoor BLE Tx and RSS measurement sets collected in the respective areas. RSS information is available in reception reports of most wireless networks’ receivers (Rx) without any special hardware or software modifications [6].

This article categorizes the methods based on two criteria: measurement type and reference location requirement. The measurement type can be the actual RSS from a Tx or just the connectivity, that is, whether the Tx can be sensed or not. Some of the reviewed methods rely on located measurements; that is, they assume that every observation includes accurate information about the location of the measurement. Some methods assume that most of the observations are unlocated, lacking the location information. The former are more accurate but are costly to implement, while the latter are especially suitable for crowdsourcing. We evaluate methods that use located measurements through numerical simulations and real-world BLE data.

The structure of this article is as follows: firstly, the methods are presented in detail, connectivity-only methods first in Section 2, then RSS based methods that use measurements with known locations in Section 3, and finally RSS based methods that do not require all measurements to be located in Section 4. Secondly, experimental results are presented in Section 5, along with a table that summarizes the basic practical properties of each method. Finally, Section 6 presents the conclusions.

2. Connectivity Based Methods with Located Observations

Connectivity based Tx localization algorithms assume that the closer one is to the Tx, the higher the probability of observing the Tx when listening with a Rx device. The observations consist of tuples $(\mathbf{x}_i, \mathcal{I}_i)$, where $\mathbf{x}_i$ is the reference position of the $i$th measurement and $\mathcal{I}_i$ is the set of Tx identifiers observed in the $i$th measurement. Connectivity actually means that the RSS exceeds the receiver’s sensitivity threshold [7]. Thus, connectivity based methods in fact rely on a very coarsely quantized RSS.

The simplest connectivity based Tx localization algorithm is the (unweighted) centroid algorithm that was proposed for the localization of a wireless sensor network’s nodes by Bulusu et al. [8]. The centroid algorithm has also been proposed and tested at least in [9–13]. The estimate of the location of the $j$th Tx is the mean of the measurement locations

$$\hat{\mathbf{m}}_j = \frac{\sum_{i=1}^{n} \mathbb{1}_{\mathcal{I}_i}(j)\,\mathbf{x}_i}{\sum_{i=1}^{n} \mathbb{1}_{\mathcal{I}_i}(j)}, \tag{1}$$

where $|\mathcal{A}|$ is the number of elements in set $\mathcal{A}$, $j \in \mathcal{I}_i$ means that $j$ is an element of the set $\mathcal{I}_i$, $n$ is the total number of measurements, and $\mathbb{1}_{\mathcal{I}}$ is the indicator function

$$\mathbb{1}_{\mathcal{I}}(j) = \begin{cases} 1, & j \in \mathcal{I}, \\ 0, & j \notin \mathcal{I}. \end{cases} \tag{2}$$

The centroid location is the solution of the optimization problem

$$\hat{\mathbf{m}}_j = \operatorname*{arg\,min}_{\mathbf{m}} \sum_{i:\, j \in \mathcal{I}_i} \lVert\mathbf{x}_i - \mathbf{m}\rVert^2 \tag{3}$$

because (3) can be expressed as the weighted least squares problem

$$\hat{\mathbf{m}}_j = \operatorname*{arg\,min}_{\mathbf{m}} \left(\mathbf{y} - \mathbf{H}\mathbf{m}\right)^{\mathrm{T}} \mathbf{W} \left(\mathbf{y} - \mathbf{H}\mathbf{m}\right), \tag{4}$$

where $\mathbf{y} = [\mathbf{x}_{i_1}^{\mathrm{T}}\ \cdots\ \mathbf{x}_{i_N}^{\mathrm{T}}]^{\mathrm{T}}$ stacks the $N$ measurement positions where the $j$th Tx is observed, $\mathbf{H} = [\mathbf{I}\ \cdots\ \mathbf{I}]^{\mathrm{T}}$, $\mathbf{W} = \mathbf{I}$, and $\mathbf{I}$ is the identity matrix, and (1) follows from (4) by the weighted linear least squares formula. Typically only the measurements where the $j$th Tx has been observed are used in the estimation of $\mathbf{m}_j$; that is, the information of not observing the $j$th Tx in a measurement is omitted. In this sense, the centroid also has a probabilistic interpretation as the maximum likelihood solution to the measurement model

$$p(\mathbf{x}_i \mid \mathbf{m}_j) = \mathrm{N}(\mathbf{x}_i;\ \mathbf{m}_j,\ \boldsymbol{\Sigma}), \tag{5}$$

where the positive-definite matrix $\boldsymbol{\Sigma}$ is a constant that does not affect the solution and $p$ denotes probability. Koski et al. [12] also estimate the coverage area parameter matrix $\boldsymbol{\Sigma}$ for the purpose of online mobile Rx positioning.
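To make the estimator concrete, here is a minimal NumPy sketch of the centroid (1); the function, variable names, and toy data are ours for illustration, not from the references.

```python
import numpy as np

def centroid(positions, observed):
    """Unweighted centroid estimate (1) of one Tx's location.

    positions: (N, 2) array of reference positions x_i
    observed:  (N,) boolean array, True where the Tx was heard
               (the indicator function (2))
    """
    return positions[observed].mean(axis=0)

# Toy example: five reference positions, Tx heard in three of them
X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 1.0], [9.0, 9.0], [1.0, 2.0]])
heard = np.array([True, True, False, False, True])
print(centroid(X, heard))  # -> [1.0, 0.667]
```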

The algorithm of Piché [14] can be considered a robustified version of the centroid algorithm. This work addresses the possibility of outlier measurements, for example, ones with an erroneous reference position, in the observation set by relying on the Student’s t-distribution, which gives a higher probability for occasionally receiving the signal far from the Tx:

$$p(\mathbf{x}_i \mid \mathbf{m}_j) = \mathrm{t}_{\nu}(\mathbf{x}_i;\ \mathbf{m}_j,\ \boldsymbol{\Sigma}), \tag{6}$$

where $\boldsymbol{\Sigma}$ is a constant that does not affect the solution and $\nu$ is a model parameter, the degrees of freedom; the closer $\nu$ is to zero, the more robust the algorithm is. Based on the Student’s t model, [14] uses an EM (expectation–maximization) algorithm to solve the maximum a posteriori values of the centroid and the coverage area matrix. Figure 2 shows a localization scenario where four out of the 50 measurements are outliers. In this scenario the robust centroid’s Tx location estimates are significantly closer to the true location than the conventional centroid algorithm’s. Notice that both [12, 14] mainly concentrate on online positioning and do not explicitly assume that the Tx is actually located at $\mathbf{m}_j$; the estimate rather models the center point of the Tx’s coverage area.
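For illustration, the sketch below implements the standard EM iteration for the location and scatter of a multivariate Student’s t-distribution; it conveys the idea behind (6) but is not the exact algorithm of [14], which also incorporates prior terms.

```python
import numpy as np

def robust_centroid(X, nu=4.0, n_iter=5):
    """EM for the location and scatter of a multivariate Student's t.

    X:  (N, d) measurement locations where the Tx was heard
    nu: degrees of freedom; smaller nu -> more robust to outliers
    """
    N, d = X.shape
    m = X.mean(axis=0)                     # initialize at the plain centroid
    S = np.cov(X.T) + 1e-9 * np.eye(d)
    for _ in range(n_iter):
        # E-step: downweight points far from the current center
        diff = X - m
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(S), diff)
        w = (nu + d) / (nu + maha)
        # M-step: weighted location and scatter updates
        m = w @ X / w.sum()
        diff = X - m
        S = (diff * w[:, None]).T @ diff / N
    return m, S
```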

The connectivity based methods rest on two assumptions: the Tx’s antenna is omnidirectional, and the measurements are collected uniformly over the whole reception area [8, 10, 19]. As Bulusu et al. [8] point out, the performance of the centroid algorithm is highly dependent on the data. Some studies report [10, 19] that the Tx position estimate will be biased towards areas with the highest measurement densities. A possible solution to this problem is to model the thoroughness of the data collection in each location, which would also introduce information on where the Tx is not hearable; this information has been used for mobile Rx localization [7]. Another approach is gridding, that is, clustering the observations in a regular grid so that each grid point represents all the measurements in its vicinity, which can partly mitigate the problem of uneven measurement distribution. Algorithms for detecting insufficient data collection and automatically proposing new measurement locations have also been proposed [25].

The centroid method is straightforward to understand and implement. The basic centroid algorithm is computationally light and the robustified version is still computationally feasible for most purposes even though it is a constant factor heavier than the basic centroid. Furthermore, the centroid algorithms have a small number of tunable configuration parameters, which might be advantageous if there is little prior information on the Tx locations and signal propagation models, and there is no risk of overconfident RSS models.

One important property of a method is whether the Tx position estimate can be updated when new observations appear without needing to access all the old observations. For all the presented connectivity based methods, updateability can be achieved with a very low cost; only the point estimate and the number of samples used need to be stored in the database.

3. RSS Based Methods with Located Observations

This section reviews methods that estimate the Tx location using observations that consist of tuples $(\mathbf{x}_i, \mathcal{I}_i, \mathbf{r}_i)$, where $\mathbf{x}_i$ is the reference location, $\mathcal{I}_i$ is the list of Tx identifiers, and $\mathbf{r}_i$ is the vector of corresponding RSSs.

The RSS is negatively correlated with the distance between the Tx and Rx. Attenuation of the signal strength (path loss, PL) is due to both free space propagation loss governed by the Friis equation and losses generated by various obstructions in the environment [26, Ch. 4]. Accurate modeling of these obstructions is in most practical cases infeasible, so simplifying probabilistic models are commonly used. The conventional probabilistic model in both outdoor and indoor environments is the log-normal shadowing model [26, Ch. 4]

$$r = r_0 - 10\, n \log_{10}\frac{d}{d_0} + v, \qquad v \sim \mathrm{N}(0, \sigma^2), \tag{7}$$

where $r$ is the RSS in dBm (dB referenced to milliwatt) at distance $d$ from the Tx, $d_0$ is a reference distance, typically 1 m, $r_0$ is the RSS at the reference distance, $n$ is the PL exponent parameter, and $v$ is a normally distributed shadowing term with variance $\sigma^2$. The environment dependent PL parameters $n$ and $\sigma^2$ are usually estimated for a certain environment or for a certain Tx based on data, while the transmission power dependent parameter $r_0$ can be assumed to be known based on the Tx properties [26, Ch. 4]. A typical assumption is that the shadowing term is a statistically independent random variable for each measurement, while another possible approach is to assume spatially correlated shadowing; see, for example, the Gaussian process based algorithm [27]. For localization purposes, it is usually adequate to use distances in 2-dimensional Cartesian coordinates; in the localization of a cellular base station tower, for example, the Tx’s altitude affects the distance in the proximity of the Tx, but as most data typically comes from farther distances, this effect can be neglected [16].
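For concreteness, the following sketch draws RSS values from the log-normal shadowing model (7); the parameter values are arbitrary illustrative choices, not values recommended by the references.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_rss(x, m, r0=-40.0, n=2.5, sigma=6.0, d0=1.0):
    """Draw RSS values (dBm) at positions x for a Tx at m via model (7).

    r0:    RSS at the reference distance d0 (here 1 m), Tx power dependent
    n:     path loss exponent
    sigma: shadowing standard deviation in dB
    """
    d = np.maximum(np.linalg.norm(x - m, axis=-1), d0)
    return r0 - 10.0 * n * np.log10(d / d0) + rng.normal(0.0, sigma, d.shape)

# Usage: 250 measurement locations around a Tx at (25, 25)
X = rng.uniform(0, 50, size=(250, 2))
m_true = np.array([25.0, 25.0])
rss = simulate_rss(X, m_true)
```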

The signal shadowing consists of so-called small-scale and large-scale components, and the PL models typically capture the average of both statistically, since accurate analysis of the multipath propagation patterns that cause the small-scale fading is not feasible in large systems; see further discussion in [28, Ch. 7.2]. Currently most wireless communication networks transmit continuous waveforms, and optimization for impulse signals is out of the scope of this article. It should be noted that in most WLANs, for example, the mapping from the reported RSS indicator to the actual RSS in dBm is unknown. This problem can be circumvented, for example, by using RSS ratios [29] or an RSS histogram [30]. The Rx can have one or multiple antennas; in the latter case the Rx device can either report all the measurements separately or combine them into a single RSS measurement.

3.1. Closed Form Solutions

A commonly proposed closed form solution for the Tx position using RSS measurements is the weighted centroid algorithm that was proposed for the localization of wireless sensor nodes by Blumenthal et al. [31]. It has been proposed for WLAN Tx localization, for example, in [9, 11]. In the weighted centroid approach, the estimate of the $j$th Tx’s location is

$$\hat{\mathbf{m}}_j = \frac{\sum_{i:\, j \in \mathcal{I}_i} w(r_{ij})\,\mathbf{x}_i}{\sum_{i:\, j \in \mathcal{I}_i} w(r_{ij})}, \tag{8}$$

where $w$ is a weighting function that depends on the RSS and $r_{ij}$ is the RSS of the signal transmitted by the $j$th Tx and received at the location $\mathbf{x}_i$. Usually the weights are chosen so that the stronger the RSS, the greater the weight. The standard weighting methods are the distance based weighting [31]

$$w(r_{ij}) = \frac{1}{\hat{d}(r_{ij})^{\,g}}, \tag{9}$$

where $\hat{d}(r_{ij})$ is a distance estimate computed from the RSS, and the RSS based weighting [11]

$$w(r_{ij}) = \left(r_{ij} - r_{\min}\right)^{g}. \tag{10}$$

In both weighting methods, $g$ is a free parameter and $r_{\min}$ is the signal detection threshold, that is, the lowest possible RSS. These weighting methods are compared for wireless node localization in [11], where it is found that the two methods have equal average performance. The weighted centroid location is the solution of the optimization problem

$$\hat{\mathbf{m}}_j = \operatorname*{arg\,min}_{\mathbf{m}} \sum_{i:\, j \in \mathcal{I}_i} w(r_{ij})\,\lVert\mathbf{x}_i - \mathbf{m}\rVert^2 \tag{11}$$

because (11) can be expressed as the weighted least squares problem (4) by setting $\mathbf{W} = \operatorname{diag}\left(w(r_{i_1 j}), \ldots, w(r_{i_N j})\right) \otimes \mathbf{I}$, and (8) follows from (4) by the linear least squares formula. A corresponding probabilistic interpretation is that the weighting function value in the measurements where the $j$th Tx is observed follows the exponential distribution whose scale parameter is inversely proportional to the squared Tx–Rx distance. By the change of variables formula for PDFs (probability density functions) this gives

$$p(r_{ij} \mid \mathbf{m}_j) \propto \lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2 \exp\!\left(-c\, w(r_{ij})\,\lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2\right), \tag{12}$$

where $c$ is a constant that does not affect the solution. To obtain the objective function of (11) for the maximum likelihood solution, the Rx–Tx distance should appear only in the exponent, so it is removed from the normalization constant by modeling the probability of the RSS exceeding the signal detection threshold as inversely proportional to the squared Rx–Tx distance,

$$P(j \in \mathcal{I}_i \mid \mathbf{m}_j) \propto \frac{1}{\lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2}, \tag{13}$$

for distances exceeding a limit.
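A minimal sketch of the weighted centroid (8) with the RSS based weighting (10) follows; the default parameter values are placeholders to be tuned per deployment, not recommendations from the references.

```python
import numpy as np

def weighted_centroid(X, rss, r_min=-100.0, g=5.0):
    """Weighted centroid (8) with RSS-based weights (r - r_min)^g (10).

    X:     (N, 2) positions where the Tx was heard
    rss:   (N,) RSS values in dBm
    r_min: signal detection threshold in dBm
    g:     free weighting exponent
    """
    w = np.maximum(rss - r_min, 0.0) ** g
    return w @ X / w.sum()
```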

The comments regarding the unweighted centroid apply to the weighted version as well, although modeling of the RSS somewhat reduces the sensitivity to uneven data density. Figure 3 shows a simulated example where the Tx’s actual location is not in the middle of its coverage area. In such a case the weighted centroid algorithm outperforms the unweighted centroid due to the RSS measurement information.

Another RSS based closed form solution is proposed by Koo and Cha [15]. Similar formulas have been proposed earlier for wireless sensor networks in [32]. The same formulas are used in [33] for distance measurement based wireless transmitter positioning without the estimation of the signal propagation parameter. Instead of the log-normal shadowing model (7), [15] uses a different PL model

$$p(r_{ij} \mid \mathbf{m}_j) = \mathrm{N}\!\left(r_{ij};\ \lambda_j - \gamma_j \lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2,\ \sigma_j^2\right), \tag{14}$$

where $\mathrm{N}(z; \mu, \Sigma)$ is the PDF of the (possibly multivariate) normal distribution with mean $\mu$ and covariance matrix $\Sigma$ evaluated at $z$, and $\lambda_j$ and $\gamma_j$ are the parameters of this nonlogarithmic PL model that are not directly related to the PL parameters $r_0$ and $n$ in (7). (The notation is simplified from [15].) Thus, the distribution of the difference of two conditionally independent RSSs is

$$p(r_{ij} - r_{kj} \mid \mathbf{m}_j) = \mathrm{N}\!\left(r_{ij} - r_{kj};\ \gamma_j\left(\lVert\mathbf{x}_k - \mathbf{m}_j\rVert^2 - \lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2\right),\ 2\sigma_j^2\right), \tag{15}$$

where

$$\lVert\mathbf{x}_k - \mathbf{m}_j\rVert^2 - \lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2 = \lVert\mathbf{x}_k\rVert^2 - \lVert\mathbf{x}_i\rVert^2 - 2\left(\mathbf{x}_k - \mathbf{x}_i\right)^{\mathrm{T}}\mathbf{m}_j \tag{16}$$

is linear in $\mathbf{m}_j$. Given a flat prior for the Tx position and an improper prior for the remaining parameters, the posterior of $\mathbf{m}_j$ is thus given by the standard linear least squares (LLS) formulas

$$\hat{\boldsymbol{\theta}}_j = \left(\mathbf{A}_j^{\mathrm{T}}\mathbf{A}_j\right)^{-1}\mathbf{A}_j^{\mathrm{T}}\mathbf{b}_j, \tag{17}$$

$$\operatorname{cov}\left(\boldsymbol{\theta}_j\right) = 2\sigma_j^2 \left(\mathbf{A}_j^{\mathrm{T}}\mathbf{A}_j\right)^{-1}, \tag{18}$$

where the rows of $\mathbf{A}_j$ are $\left[\lVert\mathbf{x}_k\rVert^2 - \lVert\mathbf{x}_i\rVert^2\ \ -2(\mathbf{x}_k - \mathbf{x}_i)^{\mathrm{T}}\right]$, the elements of $\mathbf{b}_j$ are the corresponding RSS differences $r_{ij} - r_{kj}$, and the estimate $\hat{\boldsymbol{\theta}}_j$ contains $\hat{\gamma}_j$ and $\widehat{\gamma_j\mathbf{m}_j}$, from which $\hat{\mathbf{m}}_j = \widehat{\gamma_j\mathbf{m}_j}/\hat{\gamma}_j$. In these formulas, each RSS measurement is used only once to avoid correlations between the RSS differences. A strength of this LLS method is the existence of closed form formulas; the method thus has rather low and predictable computational cost, and convergence is not an issue. Adding prior information on the Tx location is also straightforward. However, if the actual RSS follows the log-normal shadowing model (7), the approximation (14) can be crude.
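The following sketch illustrates one way to implement such an RSS-difference LLS estimate under the quadratic model (14); the pairing scheme and variable names are our assumptions for illustration, not necessarily the choices of [15].

```python
import numpy as np

def lls_tx_estimate(X, rss):
    """RSS-difference linear LS in the spirit of [15] (our sketch).

    Model: r_i = lambda - gamma * ||x_i - m||^2 + noise. Differencing
    paired measurements cancels lambda and the ||m||^2 term, so the
    unknowns theta = [gamma, gamma*m] appear linearly as in (16)-(17).
    """
    # Pair measurements (0,1), (2,3), ... so each RSS is used only once
    i = np.arange(0, len(X) - 1, 2)
    k = np.arange(1, len(X), 2)
    A = np.column_stack([
        (X[k] ** 2).sum(1) - (X[i] ** 2).sum(1),   # multiplies gamma
        -2.0 * (X[k] - X[i]),                      # multiplies gamma*m
    ])
    b = rss[i] - rss[k]
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta[1:] / theta[0]                    # m = (gamma*m) / gamma
```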

3.2. Iterative Methods

Maximizing the likelihood of the Tx position and possibly some model parameters using the model (7) leads to the nonlinear least squares (LS) problem

$$\hat{\boldsymbol{\theta}}_j = \operatorname*{arg\,min}_{\boldsymbol{\theta}_j} \sum_{i:\, j \in \mathcal{I}_i} \left(r_{ij} - h(\mathbf{x}_i;\ \boldsymbol{\theta}_j)\right)^2, \qquad \boldsymbol{\theta}_j = \begin{bmatrix}\mathbf{m}_j^{\mathrm{T}} & r_{0,j} & n_j\end{bmatrix}^{\mathrm{T}}, \tag{19}$$

where

$$h(\mathbf{x}_i;\ \boldsymbol{\theta}_j) = r_{0,j} - 10\, n_j \log_{10}\frac{\lVert\mathbf{x}_i - \mathbf{m}_j\rVert}{d_0} \tag{20}$$

is the model function of one measurement.

This optimization problem can be solved using various nonlinear LS methods that are typically iterative algorithms [34]. The general form of the nonlinear LS problem is

$$\hat{\boldsymbol{\theta}} = \operatorname*{arg\,min}_{\boldsymbol{\theta}} \lVert\mathbf{y} - \mathbf{f}(\boldsymbol{\theta})\rVert^2, \tag{21}$$

where $\mathbf{f}$ is a known nonlinear function and $\lVert\cdot\rVert$ is the Euclidean norm. Many solution methods are based on differentiation, either on the first-order derivative (gradient, Jacobian), such as the steepest descent and Gauss–Newton (GN) methods, or on the second-order derivative (Hessian matrix), such as the Newton method [34]. To the authors’ knowledge, the second-order information has not been used in problem (19) because of the difficulty of analytical differentiation. Given an initial point $\boldsymbol{\theta}^{(0)}$, a GN iteration is

$$\boldsymbol{\theta}^{(k+1)} = \boldsymbol{\theta}^{(k)} + \left(\mathbf{J}_k^{\mathrm{T}}\mathbf{J}_k\right)^{-1}\mathbf{J}_k^{\mathrm{T}}\left(\mathbf{y} - \mathbf{f}(\boldsymbol{\theta}^{(k)})\right), \tag{22}$$

where $\mathbf{J}_k$ is the Jacobian matrix of the function $\mathbf{f}$ evaluated at $\boldsymbol{\theta}^{(k)}$.

The GN method has been applied to problem (19) in [16, 17], for example. In this case the model function is the function $\mathbf{f}$ whose $i$th element is

$$f_i(\boldsymbol{\theta}_j) = r_{0,j} - 10\, n_j \log_{10}\frac{\lVert\mathbf{x}_i - \mathbf{m}_j\rVert}{d_0}, \tag{23}$$

and the $i$th row of its Jacobian matrix is

$$[\mathbf{J}]_{i,:} = \begin{bmatrix} \dfrac{10\, n_j}{\ln 10}\, \dfrac{(\mathbf{x}_i - \mathbf{m}_j)^{\mathrm{T}}}{\lVert\mathbf{x}_i - \mathbf{m}_j\rVert^2} & 1 & -10 \log_{10}\dfrac{\lVert\mathbf{x}_i - \mathbf{m}_j\rVert}{d_0} \end{bmatrix}. \tag{24}$$

If the parameters $r_{0,j}$ and $n_j$ are known for a certain environment, the corresponding columns can be left out from the matrix (24). The GN algorithm can sometimes diverge. A less divergence-prone GN version is the Levenberg–Marquardt (LM) algorithm, used for Tx localization in [18]. Alternatively, the divergence can be addressed by using an additional line search algorithm that ensures decrease of the objective function value, as in [16].
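A bare-bones implementation of the GN iteration (22) for the model (23) with the Jacobian rows (24) is sketched below; step damping, line search, and convergence checks are omitted for brevity, so this sketch can diverge just as the plain GN can.

```python
import numpy as np

def gauss_newton_tx(X, rss, theta0, n_iter=50, d0=1.0):
    """Gauss-Newton for theta = [m_x, m_y, r0, n] in problem (19).

    theta0: initial point, e.g., strongest measurement location plus
            rough PL parameter guesses such as [x_s, y_s, -40.0, 2.5]
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        m, r0, n = theta[:2], theta[2], theta[3]
        diff = X - m                              # (N, 2)
        d = np.maximum(np.linalg.norm(diff, axis=1), 1e-6)
        f = r0 - 10.0 * n * np.log10(d / d0)      # model prediction (23)
        # Jacobian rows (24): derivatives w.r.t. [m, r0, n]
        J = np.column_stack([
            (10.0 * n / np.log(10.0)) * diff / d[:, None] ** 2,
            np.ones(len(X)),
            -10.0 * np.log10(d / d0),
        ])
        step, *_ = np.linalg.lstsq(J, rss - f, rcond=None)  # GN step (22)
        theta += step
    return theta
```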

As pointed out in [35], if the posterior covariance matrix is approximated by the covariance matrix of the linearized model, the estimate can be updated when new measurements are obtained. Including a Gaussian prior distribution $\boldsymbol{\theta}_j \sim \mathrm{N}(\boldsymbol{\mu}, \mathbf{P})$ keeps the problem a nonlinear LS problem

$$\hat{\boldsymbol{\theta}}_j = \operatorname*{arg\,min}_{\boldsymbol{\theta}_j}\left\{\left(\boldsymbol{\theta}_j - \boldsymbol{\mu}\right)^{\mathrm{T}}\mathbf{P}^{-1}\left(\boldsymbol{\theta}_j - \boldsymbol{\mu}\right) + \left(\mathbf{y} - \mathbf{f}(\boldsymbol{\theta}_j)\right)^{\mathrm{T}}\mathbf{R}^{-1}\left(\mathbf{y} - \mathbf{f}(\boldsymbol{\theta}_j)\right)\right\}, \tag{25}$$

for which a GN iteration is [35]

$$\boldsymbol{\theta}^{(k+1)} = \boldsymbol{\theta}^{(k)} + \left(\mathbf{P}^{-1} + \mathbf{J}_k^{\mathrm{T}}\mathbf{R}^{-1}\mathbf{J}_k\right)^{-1}\left(\mathbf{J}_k^{\mathrm{T}}\mathbf{R}^{-1}\left(\mathbf{y} - \mathbf{f}(\boldsymbol{\theta}^{(k)})\right) - \mathbf{P}^{-1}\left(\boldsymbol{\theta}^{(k)} - \boldsymbol{\mu}\right)\right), \tag{26}$$

where $\mathbf{R} = \sigma^2\mathbf{I}$ is the measurement noise covariance matrix and $\sigma^2$ is the shadowing variance in (7). This iteration enables approximative updating of the estimate without storing the old observations by using the covariance matrix update $\mathbf{P}^{(k+1)} = (\mathbf{P}^{-1} + \mathbf{J}_k^{\mathrm{T}}\mathbf{R}^{-1}\mathbf{J}_k)^{-1}$. Notice that if there is enough knowledge of the PL parameters, the Tx location estimate can be outside the observation area, as illustrated by Figure 4.

The GN converges to a local minimum, so the choice of the initial point $\boldsymbol{\theta}^{(0)}$ is important. Proposed choices of $\boldsymbol{\theta}^{(0)}$ in Tx localization are the location of the strongest observation [16], the centroid of all observations [17], or the result of a grid-type algorithm, which is discussed in Section 3.3. A drawback of GN and LM is that if there are several separate areas of strong measurements, the computed Tx location estimate depends strongly on the initial point, so that different strong areas are not compared.

Due to the assumption of normally distributed shadowing, the GN algorithm can be sensitive to outlier measurements, where the RSS differs significantly from the value predicted by the PL model. Outlier removal procedures for tackling this issue have been proposed at least in [36].

3.3. Monte Carlo and Grid Methods

This section discusses methods that are based on explicit evaluation of the Tx location’s PDF at several points of the location space. In grid methods prespecified evaluation points are used, while Monte Carlo (MC) algorithms are based on pseudorandom evaluation points.

Importance sampling is a basic form of MC sampling. Kim et al. [10] use a method where the MC samples of the location of one Tx are generated from a prespecified prior distribution and then given weights based on the training measurements and known PL parameters. Kim et al. do not explain their weighting method, but the formula based on the model (7) is

$$w^{(k)} \propto \prod_{i:\, j \in \mathcal{I}_i} \mathrm{N}\!\left(r_{ij};\ r_0 - 10\, n \log_{10}\frac{\lVert\mathbf{x}_i - \mathbf{m}^{(k)}\rVert}{d_0},\ \sigma^2\right). \tag{27}$$

The MC estimate of the posterior mean is then the weighted average of the samples

$$\hat{\mathbf{m}}_j = \frac{\sum_{k=1}^{K} w^{(k)}\,\mathbf{m}^{(k)}}{\sum_{k=1}^{K} w^{(k)}}, \tag{28}$$

where $K$ is the number of MC samples. This method can also be called a particle filter with a static state model as in [10], since the weights can be updated recursively. A drawback is that importance sampling suffers from sample impoverishment in static state estimation [37, Ch. 3.4]: all weight will over time concentrate on a few samples, and there will be little variability because of the lack of dynamics. This problem can to some extent be overcome by using resampling techniques such as the resample-move algorithm or Markov chain Monte Carlo techniques [37, Ch. 3.4].
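A sketch of the importance sampling estimate (27)-(28) follows, assuming known PL parameters and a Gaussian prior over the Tx location; the parameter defaults are illustrative, and log-domain weights are used for numerical stability.

```python
import numpy as np

def importance_sampling_tx(X, rss, prior_mean, prior_std,
                           r0=-40.0, n=2.5, sigma=6.0, K=5000, seed=0):
    """Weight prior samples by the likelihood (27), average as in (28)."""
    rng = np.random.default_rng(seed)
    M = rng.normal(prior_mean, prior_std, size=(K, 2))        # samples m^(k)
    d = np.linalg.norm(X[None, :, :] - M[:, None, :], axis=2)  # (K, N)
    pred = r0 - 10.0 * n * np.log10(np.maximum(d, 1.0))
    logw = -0.5 * ((rss - pred) ** 2).sum(axis=1) / sigma**2   # log of (27)
    w = np.exp(logw - logw.max())                              # stabilized
    return w @ M / w.sum()                                     # mean (28)
```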

For some models a solution to sample impoverishment is Rao-Blackwellization [37, Ch. 3.4], which is proposed for simultaneous mobile Rx and static Tx localization by Bruno and Robertson [21]. They do online Tx localization using a Rao-Blackwellized particle filter (RBPF) so that the training measurements’ locations are obtained by inertial positioning. Thus, the distribution of the Rx locations is obtained by MC sampling. The distribution of each Tx location is approximated by a Gaussian mixture for each MC sample and each value of the PL parameter $r_0$. The PL exponent $n$ is assumed to be known. This gives a recursive algorithm for joint estimation of the Rx and Tx locations. This solution is suitable for cases where the locations of the training measurements are imprecise but form a time series that can be filtered.

Han et al. [19] propose a grid method, where a plane is fitted to the 3-dimensional position–RSS space around each grid point. The direction of the fitted plane’s gradient is then considered an estimate of the Tx’s direction, and the Tx location estimate is defined as the point that minimizes the mean square error of the directions over the grid points. Han et al. use a dense grid search for the minimization, but they also suggest that more efficient optimization tools could be used.

Some authors exploit the fact that the PL parameters appear linearly in the measurement model given the Tx location. Thus, the PL parameters can be fitted analytically at each point of a set of candidate Tx locations, as sketched below. Shrestha et al. [20] make a linear least squares fit of the PL parameters for every measurement location, assuming that the Tx is located at the considered measurement location. The Tx estimate is chosen to be the measurement location that minimizes the mean square error of the PL parameter fit. Dependence on the measurement density can be reduced by using a regular grid as the set of candidate points; this makes the algorithm more flexible but also increases the computational complexity. Achtzehn et al. [38] propose a genetic algorithm, but its details are left unexplained. If the PL parameters $r_0$, $n$, and $\sigma^2$ are assumed known, the Tx location’s likelihood can simply be evaluated at each grid point [39]. A grid can also assist the GN or LM algorithm so that each grid point gives an initial point for the iterative algorithm [18].
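The sketch below illustrates the grid variant of this point-wise fit: model (7) is linear in the PL parameters $r_0$ and $n$ for a fixed candidate location, so they can be fitted by LLS and the candidate with the smallest residual chosen. The grid construction in the comment is an illustrative choice, not a prescription from the references.

```python
import numpy as np

def grid_pl_fit(X, rss, grid):
    """Choose the grid point whose linear PL-parameter fit has the
    smallest residual; (r0, n) enter model (7) linearly."""
    best, best_err = None, np.inf
    for m in grid:
        d = np.maximum(np.linalg.norm(X - m, axis=1), 1e-6)
        A = np.column_stack([np.ones(len(X)), -10.0 * np.log10(d)])
        coef, *_ = np.linalg.lstsq(A, rss, rcond=None)   # fit (r0, n)
        err = ((rss - A @ coef) ** 2).sum()
        if err < best_err:
            best, best_err = m, err
    return best

# Illustrative grid around the strongest RSS measurement:
# center = X[np.argmax(rss)]
# gx, gy = np.meshgrid(*[np.arange(-30.0, 30.75, 0.75)] * 2)
# grid = center + np.column_stack([gx.ravel(), gy.ravel()])
```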

Grid algorithms can achieve arbitrary modeling accuracy, but the computational complexity will increase rapidly along with the state dimension and grid density. Furthermore, optimal values of critical parameters such as grid density and grid size may vary in different subregions in large-scale systems.

4. Tx Localization with Unlocated Observations

All methods presented this far rely on a set of measurements with reference locations that are assumed known, either accurately or as a probability distribution. However, this assumption is not always realistic, especially in indoor environments, where accurate GNSS services are unavailable and manual entry of reference locations is too laborious, especially for data collected by crowdsourcing. This section reviews algorithms where the Txs’ locations relative to other Txs are estimated using unlocated observations and the undirected graph created by connecting Txs that appear in a common observation. The basic assumption is that the more frequently two Txs are observed in the same measurement location, the closer to each other they are probably located. It is also possible to use the RSS: if two Txs’ signals are strong in the same location, the Txs are probably close to each other. The locations in global coordinates, that is, the correct scaling and rotation of the radio map, are obtained by adding some measurements with reference locations: it is assumed that when a Tx is observed (with a high RSS) in a located measurement, the Tx is probably close to this measurement’s location. The principle is illustrated in Figure 5.

Koo and Cha [22] propose multidimensional scaling (MDS). The RSSs in a measurement with more than one Tx determine the dissimilarity between the observed Txs, and the MDS finds the 2-dimensional Tx locations whose mutual distances best agree with the dissimilarity matrix. In [22] the dissimilarity of the mobile Rx and the Tx is defined to be a certain decreasing function of the RSS, and the dissimilarity of two Txs is the smallest sum of the Rx–Tx dissimilarities observed in the same training measurement. The dissimilarities of Txs that are not connected by a common measurement are determined through the other dissimilarities by using a graph construction. Since the dissimilarities are not simple functions of distance and contain noise, the Tx localization is a nonmetric MDS problem, for which iterative algorithms exist [40]. If some reference locations are available, the relative MDS location estimates are transformed to global coordinates by an optimal scaling, rotation, and translation given by Procrustes analysis [22]. A drawback of this algorithm is that if two Txs are located close to each other but the closest training measurement location is far from both, the Koo–Cha dissimilarity will overestimate the distance between the Txs, because the dissimilarity corresponds to the distance via the closest training measurement location. Furthermore, the most natural choice for the mapping from the RSS to dissimilarity would be the exponential relation derived from the log-normal shadowing model (7), which is different from the choice of [22].
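Once a dissimilarity matrix has been constructed, the nonmetric MDS embedding itself is available in standard libraries; the brief sketch below uses scikit-learn as an illustrative tool choice (not prescribed by [22]) and leaves out the dissimilarity construction and the Procrustes alignment.

```python
import numpy as np
from sklearn.manifold import MDS

def relative_tx_map(D, seed=0):
    """Nonmetric MDS embedding of Txs from a precomputed dissimilarity
    matrix D ((n_tx, n_tx), symmetric, zero diagonal).

    Returns 2-D coordinates determined only up to rotation, scale, and
    translation; anchoring to global coordinates (e.g., by Procrustes
    analysis as in [22]) is a separate step.
    """
    mds = MDS(n_components=2, metric=False, dissimilarity='precomputed',
              random_state=seed)
    return mds.fit_transform(D)
```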

Raitoharju et al. [23] propose several algorithms that use unlocated data. Based on their tests, they recommend a closed form solution called access point least squares (APLS). The APLS is based on the model

$$\mathbf{m}_j - \mathbf{m}_l \sim \mathrm{N}\!\left(\mathbf{0},\ \sigma_{\mathrm{AP}}^2\mathbf{I}\right), \qquad \mathbf{m}_j \sim \mathrm{N}\!\left(\bar{\mathbf{x}}_j,\ \sigma_{\mathrm{pos}}^2\mathbf{I}\right), \tag{29}$$

where the Txs $j$ and $l$ are observed in the same measurement, the mean location of Tx $j$’s located measurements is $\bar{\mathbf{x}}_j$, and $\sigma_{\mathrm{AP}}$ and $\sigma_{\mathrm{pos}}$ are constants whose values do not affect the solution if no prior distribution is used for the Tx locations. This results in a linear Gaussian measurement model whose solution is the standard linear least squares formula. Raitoharju et al. [23] also propose that the accuracy can be improved at the cost of increased running time by applying a Gauss–Newton method (22), where the log-normal shadowing model with fixed PL parameter values is used so that both Tx locations and mobile Rx locations are unknown. The GN algorithm is more accurate than the APLS due to the modeling of RSS, but multimodality of the posterior distribution can cause convergence to nonglobal extrema [24].
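Under the reconstruction of model (29) above, the APLS solve reduces to a sparse linear LS problem; the following sketch is our illustration of that structure, not the reference implementation of [23].

```python
import numpy as np

def apls(n_tx, pairs, anchors):
    """APLS-style linear solve (our sketch of model (29)).

    n_tx:    number of Txs
    pairs:   list of (j, l) Tx index pairs co-observed in a measurement
    anchors: dict {j: mean location of Tx j's located measurements}
    Solves the stacked linear model by LS; both coordinates share one
    system, so the right-hand side has two columns.
    """
    rows, rhs = [], []
    for j, l in pairs:                 # m_j - m_l ~ 0
        row = np.zeros(n_tx)
        row[j], row[l] = 1.0, -1.0
        rows.append(row)
        rhs.append(np.zeros(2))
    for j, x in anchors.items():       # m_j ~ mean of located observations
        row = np.zeros(n_tx)
        row[j] = 1.0
        rows.append(row)
        rhs.append(np.asarray(x, dtype=float))
    A, B = np.array(rows), np.array(rhs)
    M, *_ = np.linalg.lstsq(A, B, rcond=None)   # (n_tx, 2) Tx locations
    return M
```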

Chintalapudi et al. [24] present a method that relies on a genetic algorithm for finding initial points for iterative optimization methods. In the first phase, all initial points are generated randomly; genetic algorithms are thus Monte Carlo algorithms. The initial points are then treated in a manner that depends on the objective function value (fitness) of the local optimum given by the iterative optimization method for each initial point. The initial points with high fitness are retained, while the initial points with low fitness are replaced by newly generated values, perturbed with random noise, or mixed by random convex combinations. This cycle is iterated until the solution stops improving. Chintalapudi et al. estimate the mobile user locations $\mathbf{x}_i$, the Tx locations $\mathbf{m}_j$, and the PL parameters $r_{0,j}$ and $n_j$ jointly over all measurements and Txs. They use a fitness function that is based on the mean absolute error, but the standard least squares approach of (19) can also be used for more standard modeling and a wider range of optimization methods. The genetic algorithm is capable of finding the global optimum with a much higher probability than a single gradient descent algorithm. The disadvantage is the increased computational burden. Chintalapudi et al. discuss criteria for selecting a subset of Txs and training data so that the computational requirements are somewhat reduced without significant loss of accuracy.

5. Tests

5.1. Simulations

We implemented 11 Tx localization methods in MATLAB. We simulated 100 Txs with 250 measurements each. We generated the measurement points from bivariate normal distributions whose covariance matrices were generated separately for each Tx from a Wishart distribution with three degrees of freedom. Each measurement point $\mathbf{x}_i$ was then assigned an RSS value generated from the distribution

$$r_i \sim \mathrm{N}\!\left(r_0 - 10\, n \log_{10}\frac{\lVert\mathbf{x}_i - \mathbf{m}\rVert}{d_0},\ \sigma^2\right); \tag{30}$$

the used PL parameter values are approximately in line with those given in [41]. Each Tx localization method that uses measurements with known reference locations was then applied to each measurement set.

The parameter values used in the tests were the following. In the robust centroid algorithm the number of EM iterations was five, with a fixed value for the degrees of freedom $\nu$. In the weighted centroid, we fixed the signal detection threshold $r_{\min}$ (in dBm). We optimized the parameter $g$ with a Monte Carlo simulation using 10,000 replications, and the median Tx positioning error as a function of $g$ is shown in Figure 6. Based on this, we set the parameter value to $g = 0.07$ for the distance based and $g = 5$ for the RSS based weighted centroid. The GN iteration was terminated when the change in the Tx location between two successive iterations was less than 1 mm or after 1000 iterations. The importance sampling used 5000 Monte Carlo samples. In the RSS gradient method the gradients were fitted at each point of a regular grid with 1-meter spacing, and grid squares without any measurements were removed. The window size of the gradient fitting was chosen according to the advice given in [19]: the window size was increased until at least 30% of the grid points had at least three measurements for fitting the gradient. The grid-point-wise PL parameter fitting method used a regular grid with 0.75-meter spacing over the square with 60 m side length centered at the strongest RSS measurement.

The Tx localization error distributions are illustrated in Figure 7. In these boxplots, the asterisks show the maximum and minimum error of each method, and the box levels are the 5%, 25%, 50%, 75%, and 95% error quantiles. In the left subplot, the measurement locations are generated from the bivariate normal distributions. In the right subplot, the measurements whose east coordinates are greater than those of the Tx are removed; this test is done to study the robustness of the methods to training data distributions that are not symmetric with respect to the Tx location. Some of the algorithms can be given prior information on the PL parameters; note that this kind of prior information is not always available in real-world scenarios. The red boxes in Figure 7 show the error distributions when the PL parameters are given an informative Gaussian prior. With the importance sampling method, estimation without a prior means using a prior with a large variance.

Figure 7 shows that when the measurement data distribution is point-symmetric, Gauss–Newton (GN) and the grid-point-wise fit (grid-fit) are the most accurate methods. The importance sampling method is very close in accuracy and has flexibility, for example, for extensions to non-Gaussian models, but it requires a good prior distribution to produce an efficient importance distribution. The accuracy of the measurement-point-wise fit (meas-fit) is limited by the measurement point density and by whether the measured area covers the true Tx location. The gradient method performs well with point-symmetric measurement sets but suffers dramatically from the removal of the measurements of an area. A possible reason is that the method is based solely on the measurement geometry; it does not use the logarithmic shape of the propagation model. That is, in the west–east direction there will mainly be gradient arrows pointing east, which can deteriorate the accuracy in the west–east direction. The linear least squares (LLS) method of [15] suffers from approximating the logarithmic PL model with a linear one; the method seems to fit the linear PL model overweighting the weak RSSs, which form the majority of the data, and therefore the RSS peak location estimate is biased.

The centroid algorithms that do not use RSSs achieve good accuracy with point-symmetric data distributions. The error is typically slightly higher than that of the GN, but the overall performance can be regarded as competitive considering the simplicity and computational ease of the centroid methods. The centroid methods are robust against deviations from the logarithmic PL model, but especially the nonweighted centroids are sensitive to asymmetric data sets. However, the weighted centroid still has accuracy slightly lower than but comparable with that of the GN. The robust centroid is less accurate than the distance-weighted centroid but slightly more accurate than the nonweighted centroid due to its non-Gaussian coverage area model.

In some cases the distribution of the RSS is not a function of the distance only; there can, for example, be several RSS peaks, that is, areas governed by strong RSS measurements. These can be due to, for example, uneven terrain topology, reflective building materials, or unmapped strong-RSS areas. Figure 8 shows the Tx localization error distributions when 20% of the training measurements are generated from a normal distribution centered at a random point $\mathbf{m}^{\ast}$ close to the true Tx location. For each measurement point we then generated one RSS from the model (30) and another RSS from the same model using $\mathbf{m}^{\ast}$ as the Tx location, and combined the two values into the actual RSS measurement. Figure 8 shows that the methods that perform best in the unimodal RSS distribution’s case, that is, the weighted centroid and GN, produce some large Tx localization errors with the bimodal RSS distribution. The weighted centroid and GN tend to choose one RSS peak, the weighted centroid based on the strongest measurements and the GN based on the initial guess given to the algorithm. The centroid, importance sampling, and point-wise fitting methods give more weight to the whole RSS distribution and do not converge to nonglobal local extrema. Thus, the weighted centroid and GN have median accuracy close to the other methods, but they may require some heuristics to cope with cases with multiple RSS peaks.

5.2. Real Bluetooth Low Energy Data

We installed 82 Bluetooth Low Energy (BLE) Txs in a building on the campus of Tampere University of Technology. The ground truths of the Tx locations were measured relative to map objects using a measuring tape. Furthermore, we collected measurements of the received BLE signal strengths using an Android-run Samsung tablet device. The true location of each RSS measurement was obtained manually by clicking an indoor map figure at each turn and interpolating between the turns. Floor estimation was assumed perfect, so only training data collected on the true floor of each Tx was used. The locations of the Txs and the training measurements are shown in Figure 9.

Figure 10 shows the Tx localization error distributions for the real data test. The results mostly resemble the simulation results with the non-point-symmetric measurement point distribution in Section 5.1. The root-mean-square errors (RMSE) of the methods are given in Table 1.

6. Concluding Remarks

This paper reviews and tests mathematical models and methods for wireless transmitter localization based on received signal strength information. Empirical comparison results using simulated and real-world data are provided. The key features of each presented method are summarized in Table 2. Note that the accuracy column refers to how accurately the method can be adapted to the assumed signal model, such as the path loss model; the real-world localization error can depend on the details of the scenario. Updateability means that an algorithm for recursive updating without storing the entire training database has been proposed.

The methods can be categorized based on what information they use: RSS or only connectivity, with or without known reference positions. The methods that require reference positions are suitable for so-called wardriving, that is, outdoor network surveying where GNSS provides the reference positions, or for small-scale indoor mapping. The unlocated methods trade off some accuracy to enable large-scale crowdsourcing even in GNSS-less environments. Computational efficiency and ease of updating the estimate without storing large training databases are crucial in large-scale applications. An example of such a system is ubiquitous indoor positioning, which requires efficient initialization, improvement, and updating of large-scale radio maps that contain not only 2-dimensional locations but also floor information.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors are grateful to Simo Ali-Löytty, Jukka Talvitie, Lauri Wirola, and Jari Syrjärinne for enlightening conversations. Henri Nurminen receives funding from Tampere University of Technology Graduate School, the Foundation of Nokia Corporation, and Tekniikan edistämissäätiö.