Abstract

Ubiquitous global positioning is not feasible by GNSS alone, as it lacks accurate position fixes in dense urban centres and indoors. Hybrid positioning methods have been developed to aid GNSS in those environments. Fingerprinting localization in wireless local area networks (WLANs) is a promising aiding system because of its availability, accuracy, and error mechanisms opposed to that of GNSS. This article presents a low-cost approach to ubiquitous, seamless positioning based on a particle filter integrating GNSS pseudoranges and WLAN received signal strength indicators (RSSIs). To achieve accurate location estimates indoors/outdoors and in the transition zones, appropriate likelihood functions are essential as they determine the influence of each sensor information on the position estimate. We model the spatial RSSI distributions with Gaussian processes and use these models to predict RSSIs at the particle’s positions to obtain point estimates of the RSSI likelihood function. The particle filter’s performance is assessed with real data of two test trajectories in an environment challenging for GNSS and WLAN fingerprinting localization. Outcomes of an extended Kalman filter using pseudoranges and a WLAN position as observation is included as benchmark. The proposed algorithm achieves accurate and robust seamless localization with a median accuracy of five meters.

1. Introduction

The global navigation satellite system (GNSS) is present in almost all domains of modern life and it provides the base for many applications and services ranging from transportation and logistics, via surveying and mapping, to an uncountable number of leisure activities. With the development of the Internet of Things, the amount of location-aware services and the demanded accuracy will increase further. In densely constructed urban areas and indoors, GNSS has deficits. Shadowing and multipath phenomenon, which are often the largest error sources in these situations [1], inhibit robust, ubiquitous positioning. GNSS alone is unable to satisfy many location based services.

A common strategy to achieve robust ubiquitous positioning is to support GNSS with additional sensor information. Different technologies are available and many of them have been used in combination with GNSS. Positioning systems based upon wireless local area networks (WLAN) are emerging from the large amount of options [2, 3]. This is due to the global dissemination of WLAN infrastructure (limiting the costs of its exploitation) and WLAN enabled devices and the trade-off between range and potential positioning accuracy.

The approaches to using WLANs for localization are diverse. In this work, we resort to WLAN location fingerprinting for the reasons set forth as follows. WLAN location fingerprinting is based on the recognition of prerecorded signal strength (RSSI) readings. These RSSIs are ideally unique at different locations. This ideal is usually not achieved. Nonetheless, a heterogeneous signal strength distribution, achieved by diverse signal attenuations, facilitates the recognition of signal strength patterns. That applies in environments with many obstacles and thereby improves the localization accuracy. Hence, this method works best indoors and in urban areas. In contrast, GNSS is based upon the signal propagation delays between the satellites and the receiver, so called pseudoranges. Exact GNSS positioning demands the line-of-sight between receiver and satellites. Open outdoor environments meet this condition best; harsh GNSS environments are dense urban centres and indoors. The operation principles of both localization technologies differ completely. They are well suited to complement each other, thus motivating the integration of pseudoranges and received signal strengths.

Several authors studied hybrid positioning systems based on GNSS and WLAN. The approaches differ in the level of integration and in the kind of information used from each of the systems. In a survey of these approaches, we found four categories in which the fusion of information of GNSS and WLAN can be grouped [4]: WLAN assisted GNSS, switching between position estimates of GNSS and of a WLAN based positioning system, weighting the position estimates of GNSS and of a WLAN based system, and deep integration of GNSS pseudoranges with different features of a WLAN based positioning system. We refer to [4] for the implications of the different integration schemes on the overall system and continue here with a brief overview of the works that study the fusion of GNSS and WLAN raw data, as they are the most relevant. They commonly rely on recursive Bayesian estimation methods, such as Kalman filter, particle filter, and their variants. Furthermore, all of them employ GNSS pseudoranges, but the features used from WLAN differ.

The authors of [59] deduce ranges from WLAN signals and combine them with pseudoranges. The WLAN ranges are obtained either from signal propagation times or from WLAN power measurements. In any case, the access point positions must be known, which is an unrealistic assumption in public areas, such as city centres. Another drawback of using ranges is the requirement of a line-of-sight between transmitter and receiver. This line-of-sight condition is rare indoors and makes accurate, seamless indoor/outdoor positioning very difficult. The advantage of this approach is that the same physical quantity is fused. This means that to fuse the ranges one can use the same, well-known methods as for GNSS positioning.

We consider the fusion of pseudoranges directly with RSSIs more promising, because of their contrary error mechanisms. In [10, 11], this approach was already investigated; likelihood functions from pseudorange and RSS measurements are established in both works and then fused within a particle filter. However, as fingerprints are spatially discrete distributed, the accuracy depends strongly on the fingerprint density. This is a clear disadvantage for medium- to large-area localization applications, where the laborious construction of fingerprint radio maps presents the biggest drawback and challenge.

This study extends the previous works and provides a solution that deals with the discrete fingerprints. We present a particle filter that integrates pseudoranges and RSSIs based on their likelihood functions, where a particle approximation of the signal strength likelihood function is obtained from a Gaussian process regression model. This method provides an automatic, well balanced weighting of the two sensor information sources and achieves accurate, robust, and smooth seamless positioning. We compare the algorithm with an extended Kalman filter that feeds WLAN position estimates as observation into a regular GPS extended Kalman filter.

2. Interpolating WLAN Signal Strength

To perform seamless location estimation in this study, we consider the problem of fusing WLAN RSSI with GNSS pseudoranges on a continuous state space. Fingerprinting techniques are based on spatial samples to construct the required radio map. Interpolating RSSI provides two principle advantages. First, it reduces the labor intensive construction of radio maps, because less fingerprints are needed to obtain a fingerprint database that is equally exact compared to without interpolation. Second and more importantly here, it provides the continuous model for the fingerprint database, facilitating the fusion of RSSI and pseudoranges.

This section explains interpolation of RSSI radio maps through Gaussian process regression.

2.1. Preliminaries

Gaussian processes are stochastic processes, basically a generalization of multivariate Gaussian distributions to infinite dimension. Every point of the infinite input domain has a Gaussian random variable associated. The defining property is that any finite collection of those random variables has a joint multivariate Gaussian distribution. Gaussian processes are completely characterized by a mean function and a covariance function :where is a vector from the index set.

2.2. Gaussian Process RSSI Model

Consider a nonlinear function describing the relationship between positions in space and the corresponding RSSIs, . The term is the latent function we seek in order to predict/interpolate RSSIs and represents i.i.d noise. In practice, one deals with a finite collection of input points. Constructing a fingerprint radio map with fingerprints corresponds to drawing a set of noisy RSSI samples at known positions from that unknown function. In the context of machine learning, this data is called training data.

Gaussian process modelling can be seen from a Bayesian point of view. The basic characteristics of the latent function is described by a Gaussian process a priori distribution and the function that we want to infer is modelled by a Gaussian process a posteriori distribution. To model the latent function appropriately, the mean and covariance function of the prior distribution have to be chosen and their parameters, so called hyperparameters, need to be found. (The hyperparameters can be learned from the training data, which is described in Section 2.3.2.) The likelihood function of the Gaussian process modelling reflects how likely are the RSSI samples in the light of the model.

The observed RSSIs are spatially correlated; neighbouring RSSIs are usually more strongly correlated than RSSIs that are distant. This relationship between the RSSIs is modelled by a covariance function. Covariance functions are specified by kernels. For two different input positions in space, and , these functions are one at zero input distance, , and decay with increasing distance. In [12], it has been shown that a constant mean function and a Matérn covariance function fit RSSI data very well. The constant mean function constitutes a single hyperparameter and is simply . As in [12], we use the Matérn kernel (with [13]). To account for the i.i.d observation noise, the kernel is enhanced with a noise term:This covariance function has three hyperparameters: is the signal variance, basically scaling the RSSIs; is the length scale, determining the variability of the underlying process; and is the noise variance, specifying the power of the noise. The constant mean, , specifies the value to which the Gaussian process converges if training data are absent.

2.3. Gaussian Process Regression

The Gaussian process that describes a finite collection of observed, noisy RSSI samples is a multivariate Gaussian distribution , with the mean vector and the covariance matrix . Here, is a matrix that contains the covariances, determined by the Matérn kernel, for all pairs of the position matrix .

The described relationship between already observed RSSIs extends as well to not yet seen RSSIs, that is, to values we wish to predict. We denote those RSSIs with and the corresponding positions with and call the collection of these two sets test data. Both sets of RSSIs, the training outputs and the test outputs, originate from the same physical phenomena and are thus jointly Gaussian distributed:The covariance matrix contains the covariances evaluated at all pairs of training and test positions and at all pairs of test positions, respectively.

By conditioning the joint Gaussian distribution (3) on the observations and inputs, the Gaussian process posterior distribution is obtained: Its mean and covariance functions are given withEquation (5) enables the prediction of RSSIs at arbitrary positions and (6) enables computing the covariance at these positions. These covariances provide a useful measure of the uncertainty of the predictions.

2.3.1. Nonzero Mean Function

The term is a deterministic mean function. Regression with a fixed mean function is simple. The fixed mean is subtracted from the observations, reverting the Gaussian process to a zero mean process. Then the standard Gaussian process regression procedure is applied, after which the mean function is added again. By replacing the fixed mean function with a set of basis functions, the mean function can be parametrized. Its hyperparameters, the parameters of the basis functions, can then be inferred from the data [13].

2.3.2. Finding the Hyperparameters

A final point we need to mention is how to obtain the hyperparameter of the mean and covariance function. The likelihood function of Gaussian process regression, , is the key to that.

For notational convenience, we collect the hyperparameters in a vector . The posterior distribution’s dependence on the hyperparameter is expressed by an intractable integral. Nonetheless, it can be approximated by using the most probable values of the hyperparameters [14]. The most probable hyperparameters can be determined based on the posterior probability of , . Maximizing allows inferring the hyperparameters. This fraction reduces to the marginal likelihood function, as the denominator is independent of and is modelled uniformly (usually no prior knowledge about the hyperparameters is available). In practice, the inference is done by minimizing the negative log-likelihood:

Assuming a normally distributed noise process and prior distribution, an analytic expression for the marginal likelihood can be derived [13]:

The optimal hyperparameters are found by computing the partial derivatives of this function with respect to the hyperparameters and then applying a gradient search algorithm to find them.

3. Sequential Bayesian Estimation

The localization problem can be optimally described by the sequential Bayesian filter, a recursive framework that also enables fusing the sensory data. It propagates the location estimate over time based on the previous estimate of the mobile terminal’s position and arriving observations.

It consists of two steps that are recursively executed. The process update predicts the current location based on the previous one and a model of the mobile terminal’s motion. The process update is expressed by the Chapman-Kolmogorov equation [15]:where is the state vector at discrete time . The time index denotes a sequence of variables, for example, all observations, , obtained up to time . The motion of the mobile terminal is modelled by the state transition probability density . The term is the prior probability density. It expresses information about the mobile terminal available before the current estimation step and represents the previous state of the object or some initial state. As soon as observations are available, the predicted state can be corrected. This is done in the measurement update step:where is the likelihood function [15]. It encodes the information of the observations and is used to correct the prediction to yield the posterior density . This recursion estimates the location of the mobile terminal optimally; however, these integrals do usually not have an analytic solution and must be approximated.

3.1. Particle Filter

One of the most versatile approximations to the recursive Bayesian estimation problem is particle filters. Particle filters rely on sampling from functions to approximate them. The versatility stems from their ability to approximate arbitrary functions. A detailed description of the generalized particle filter can be found in [15], that we follow in this subsection.

Complex integrals, as the ones that describe the sequential Bayesian estimator, can be approximated numerically if we can sample the involved functions. If the samples cover the function’s domain well, sums replace complex integrals while the simplified operation on the samples provides a sufficiently accurate approximation. We denote the samples by .

Nevertheless, it is often not possible to sample a function, as it is unknown or too complex. This issue is circumvented by sampling a similar function, , which is proportional to the target function . The term is the importance density. The introduced weights compensate the difference between the importance density and the target density. These weights need to be normalized to sum to unity, so that the functions are valid probability densities: . The idea of approximating arbitrary functions by a set of weighted particles is known as importance sampling. Consider the samples . Thus, we can approximate a density by samples:

In the context of recursive state estimation, is the state of the th particle and is its associated weight. The target density is the posterior density . To this point, it was left open how the weights are obtained. They are first defined in general by The derivation of the weights is based on a factorization of the importance density. A factorization that allows augmenting the importance density of the previous time step with a factor that represents the current state, so that particles that are distributed according to the importance density of the previous state can be updated with particles that approximate the current state. This leads to a recursive weight update which is given byEquation (15) allows approximating the posterior density by particles and their weights.The weights must be normalized after each iteration.

The described procedure constitutes the problem of particle degeneration; that is, after each of the iterations, only a few particles gain weights while, for the majority of particles, the weight is reduced, until just one particle concentrates almost the complete weight and the weight of the remaining particles is negligible.

The choice of the importance density may slow down this effect, but usually an additional resampling (with replacement) step is introduced to overcome it. During the resampling step, particles are multiplied according to their weights, so that particles that possess large weights are selected several times and particles with very small weights are selected once or possibly vanish.

As the resampling introduces further problems (principally reduces the diversity, because many particles are repeated), we resample when the degeneracy of the algorithm becomes severe. This can be assessed by the effective sample size . We resample if the effective samples size exceeds of the numbers of particles.

In this work, we choose the importance density to be the state transition density . This simplifies (15) further; it reduces the weight update to be a product of the previous weights and the likelihood function:

4. Fusing GNSS Pseudoranges and WLAN Received Signal Strength

This section starts with an explanation of the used models in Section 4.1 and describes their integration into the Bayesian filter framework in Section 4.2. The extended Kalman filter that is used as benchmark is detailed at the end of this section.

4.1. Models

The state space for the self-localization problem is defined in a local level coordinate frame, where a position in space is denoted by . The corresponding velocity is its time derivative . We denote the state vector at time by .

4.1.1. Process Model

To model the motion of the mobile terminal, we have chosen a random walk velocity process as in [16]. It stems from a Langevin process, a stochastic differential equation, that describes the velocity of a Brownian particle. For motion in a single dimension, this process reads withIt is driven by a white and normally distributed excitation process , which allows the uncertainty of the velocity to grow during prediction and also enhances the diversity of the particles. Parameter is the process’ rate constant. It determines the velocity increment of the mobile terminal during a discretization time step , where steady-state root-mean-square velocity, , confines the velocity of the mobile terminal. For the experiments described above, we determined these parameters empirically to  s−1,  s−1 and  m s−1, and  m s−1.

This model results in a linear process model:with the state transition and noise coupling matrix

The noise covariance matrix is simply ; thus, the state transition probability density is Gaussian with the corresponding mean and covariance matrix:

This process model is rather unspecific and therefore very general. Matrix implies basically a static mobile terminal; no specific motion or direction is presumed. Instead it increases the variance over time, thus broadening the region of the potential location. The choice of parameters increases the variance in east and north equally but restricts it in -direction. The reason for that setting is attributed to the lack of an useful geographic height in the radio map. As the -coordinate cannot be corrected in GPS denied areas, we basically slow down the motion in -direction considerably.

4.1.2. GNSS Pseudorange Likelihood Function

A pseudorange describes the distance from the satellite to the receiver. This consists of the geometric distance, the difference between the satellite clocks and the receiver clock, and an ionospheric and tropospheric correction term. Most of these terms can be corrected using information transmitted by the satellites. However, errors occurring in the user segment remain. This is the receiver clock offset, , which can usually be estimated. Other considerable errors are attributed to multipath propagation and shadowing effects. We collect these in , together with errors from the ephemerides prediction, relativistic effects, residuals from the correction terms, and noise. The pseudorange of the th satellite becomes

The error term is commonly modelled as a normally distributed, zero mean random variable . Assuming furthermore that the pseudoranges from different satellites are independent, the joint pseudorange likelihood function for the collection of pseudoranges, , can be written asIt expresses the likelihood that the observed pseudoranges were measured at , where . The standard deviation is composed by various terms that are set or estimated by RTKLIB [17]; among these terms are a deviation estimate based on the user accuracy index and fix values set by RTKLIB accounting for the errors in estimating delays attributed to the troposphere, ionosphere, and code bias.

Notice that the receiver clock offset is not part of the state vector and it is not estimated by the particle filter. As in [11], the least squares method is used iteratively to estimate the receiver clock offset.

4.1.3. WLAN RSSI Likelihood Function

A likelihood function for the RSSI observations, , is established as follows. As we work with time averaged RSSI, to compensate the variations and to get closer to the assumption of normally distributed observations, we denote the arithmetic mean of RSSIs by . To evaluate the likelihood of an observation at different positions, a model predicting the observation, , is required. However, the lack of an accurate model that relates RSSI and space is the reason to resort to the empirical method of fingerprinting. As RSSI measurements only exist at fingerprint locations, one has to approximate the RSSI measurement at the predicted position of the mobile terminal.

These RSSI estimates are obtained through Gaussian process regression; recall Section 2.3. Assume we have access points and that we receive their packets during survey and localization phase. The Gaussian process model for each access point can be trained off-line, before the actual positioning phase. Let be the average of RSSIs received from the th access point. Once the Gaussian process models are constructed, one uses (5) and (6) to predict a vector of RSSIs, , and the corresponding covariance matrix, , at the positions of interest .

Finally, to compute the likelihood that were observed at the position , we establish a Gaussian likelihood function:where is the interpolated RSSI and the corresponding standard deviation at the test input . This likelihood function is based on the assumption that the RSSI observations are independent. Notice, the radio map contains only two-dimensional location information; therefore and the Boolean matrix is .

Compared with the pseudorange likelihood function, the predicted RSSI and variance are independent of time, because they are inferred from the (static) radio map. The same test position yields the same prediction. In contrast, the standard deviation in the pseudorange likelihood function is independent of the location of the mobile platform; it is only determined by the signal reception conditions. The choice of the RSSI variance arises from measurement setup. For many access points, only a single packet was captured; hence, a reasonable deviation measure could not be estimated. An alternative would be to exclude the observations from these access points and then use only the sample deviation measure from the remaining RSSI.

4.2. Particle Filter Tight Fusing of Pseudoranges and Signal Strength

This section combines the ideas previously described; it details the fusion of GPS pseudoranges and WLAN RSSI within the particle filter. Briefly, the key probability density functions of the sequential Bayesian estimator are approximated with particles and their weights and then plugged into the particle filter of Section 3.1.

4.2.1. Process Update

The probability density function that models the motion is already stated in (22). One could sample that density, compute the particle approximation to (11), and predict the mobile terminal’s new location. Since an analytic process model is available, it is easier to propagate the particles that approximate the prior density, , through the process model (20):where are samples from the noise model.

When the filter is initialized, the prior density is usually not known. A common choice for is a uniform density over a plausible area.

4.2.2. Measurement Update

The measurement update is formally defined by (12). The particle filter computes the posterior density through (16) and (17). At this step, a particle approximation of the predictive density is already available , as are the weights from the previous time step. Initially and after resampling, the weights are all equal and sum up to unity. What is left is to update the weights via the likelihood function.

A particle approximation of the pseudorange likelihood is simply obtained by computing the likelihood for each particle:where is a prediction of the th pseudorange for a particle position according to the pseudorange model.

We proceed similarly for the RSSI likelihood function and compute the likelihood for each particle. This involves predicting a RSSI and variance value at each position of a particle; the RSSI likelihood function for the test positions becomesThe term denotes the RSSI predictions at the particle’s positions and is the standard deviation at these positions, respectively.

Pseudoranges and RSSIs are clearly independent of each other, thus fusing RSSIs and pseudoranges equals multiplying their likelihood functions. An approximation of the posterior density is given by

Likelihood functions provide information on the parameters only up to a multiplicative constant. To yield a valid probability density, the updated weights have to be normalized so that they sum up to unity.

In situations when no observation from one of the sensors was obtained, the likelihood function cannot be computed and is therefore ignored. If no measurement at all is available, the algorithm proceeds directly to the process update.

The influence of the measurements, either a pseudorange or a RSSI average, on the particles is controlled by these likelihood functions. On the one hand, the variances affect the particles weights. In the case of pseudoranges, it increases if the GPS observations are believed to be inexact or noisy. The variance of the RSSI likelihood function increases if the mobile terminal is in an area with few fingerprints, because, in these areas, information about RSSI is little. Large variances lead to small weights and vice versa. On the other hand, the position of a particle also affects its weight, because a particle far from the location indicated by an observation causes a large difference in the exponential of the functions, which decreases the weight. Due to these mechanisms, the proposed particle filter balances automatically between the two positioning systems.

4.2.3. Position Estimate

The final position estimate results from the minimization of the mean squared error, [18], and is known as the expected a posteriori estimate:for the particle filter that becomes .

We compute also the standard deviation of the particles as quality measure of the point estimate:where denotes the nonzero weights [19].

4.3. Benchmark Extended Kalman Filter

To put the results of the proposed particle filter into context, we present them along with results obtained from an extended Kalman filter. This filter is based upon the process and measurement update stated in (11) and (12), but instead of propagating and correcting the probability densities it assumes the noises to be zero mean and normally distributed and therefore it propagates only the mean, , and the error covariance matrix, , in time. Furthermore, we extend the state vector by the receiver clock offset, , and its time derivative : .

We present first the process and measurement models and then the update and the correction step of the extended Kalman filter.

4.3.1. Process Model

The motion of the mobile terminal is described by the linear model (20). The process model of the receiver clock is according to [20] a two-state random process model where the state transition matrix reads The noise coupling matrix, , can be obtained from the covariance matrix of the clock offset noise by , where denotes the eigenvectors and the eigenvalues of ; stands for the square root of the matrix’ elements. The elements of the noise covariance matrix itself are determined by the characteristics of the clock, given through the Allan variance coefficients , and : which can be found in [20] (We assume the GPS receiver clock to have a temperature compensated crystal clock.).

Combining the state transition and the noise coupling matrices of the motion and the receiver clock model we can write the complete process model:

4.3.2. Measurement Model

Measurements to the extended Kalman filter are the pseudoranges and, if available, a position from WLAN. Thus, the measurement vector is . The measurement model is the common GPS measurement model for an eight-state extended Kalman filter; see [20]. Recalling the pseudorange model (23), we will abbreviate it here with . Its Jacobian matrix reads where stands for the partial derivative of the pseudorange measurement model (23) with respect to for the satellite.

In the case that a position estimate from WLAN is available, the measurement matrix becomes simply The position estimated from WLAN is a maximum likelihood estimate obtained from (28): where corresponds to the first three elements of .

For completeness, we denote the general measurement model including all observed pseudoranges and a position from WLAN based system:The noise term, is, as in (23), zero mean and white and follows a normal distribution with covariance matrix .

4.3.3. Process Update

In the time update step of the extended Kalman filter, the mean, , and the error covariance matrix, , are propagated in time [15, 20]: where denotes the process noise covariance matrix. The final noise covariance matrix results as block matrices of and (given by (34)):

4.3.4. Measurement Update

The extended Kalman filter approximates (12) as follows [15, 20]: Matrix is the Kalman gain and is given by . Here, is the measurement covariance matrix that contains a variance estimate estimated by RTKLIB for each pseudorange on its diagonal. These variance estimates are based on the user range accuracy index. If a WLAN position is obtained, the covariance matrix is extended by a diagonal block containing a variance for each coordinate of the WLAN position: We found a standard deviation of to achieve the lowest errors for both trajectories, although we expected that larger values would model the error of the WLAN maximum likelihood position estimate better [21].

5. Experimental Results

To analyze the performance of the particle filter, we conducted experiments for two trajectories. We begin this section with a description of these experiments and continue with some notes concerning the implementation of the particle filter. The outcomes of these experiments are reported subsequently in Sections 5.3 and 5.4.

5.1. Experimental Setup and Data Recording

The data used to evaluate our algorithm is from a test bed at the Universidad Autónoma de Querétaro. The test bed constitutes two relatively small buildings (about 8 × 40 meters) with roofed passageways through which the buildings are entered. These two buildings and a larger four-storey building in the south west surround a larger open space. In addition to the roofed passageways, several trees block the view to the sky. Figure 1 depicts the test bed and the two trajectories used to evaluate the algorithm. The first trajectory, named Trajectory-1, commences indoors without GPS observations and runs mainly indoors and along the semiopen passageways along the buildings. It enters the right building only once and ends in the open space, nearby some trees. The second trajectory is called Trajectory-2. It starts in the open space and continues on it, passes below trees, and leads to the passageway of the right building. Following that passageway, it enters the right building once and proceeds to the roofed passageway of the left building. From the middle of the left building, it runs around the small lawn area and returns on the passageway to the passageway of the right building, from where it turns sharp right towards the open space. Trajectory-2 ends further southwest on the open space, in front of a four-storey building. Since Trajectory-1 has larger indoor sections, we refer to it as indoor-like trajectory. The opposite is true for Trajectory-2; we call it outdoor-like trajectory.

This localization scenario presents a harsh environment for GNSS, as scattering, blockage and multipath propagation are very likely. The two buildings used in the indoor sections of the experiments are relatively lightweight. They are made of brick walls and the sectioning indoors is principally done by soft-partitions. As the climate favours open doors and windows, these buildings attenuate signals rather little; thus, the indoor environment is challenging for WLAN fingerprinting as well.

We recorded the RSSIs with a laptop with a consumer-grade wireless network interface card. To create the radio maps, RSSIs and MAC addresses need to be captured at certain, known locations. Therefore we modified JMapViewer, a map application providing access to OpenStreetMap data. By a click on the map, at the location that corresponds to that of the experimenter, the position is obtained and the WLAN data capturing process is initialized: RSSIs from all available access points and the corresponding MAC addresses are captured. A second click on the user interface, after three to five seconds, issues a command to compute the arithmetic mean and the variance of the RSSIs from each access point. The capturing process finishes by storing the WLAN data and the corresponding position (obtained by from the first click on the map) in a database.

This software and procedure were also used to record the WLAN data for the test trajectories. This resulted in a stop-and-go motion, which may not be the most common motion of a pedestrian, but this facilitates recording a ground truth. Again, the WLAN data at each position of a trajectory was recorded for a few seconds. The ground truth of both paths is depicted in Figure 1.

The data sets of the trajectories also contain GPS raw data, from which pseudoranges, satellite positions, and pseudorange correction terms were obtained. This data was recorded continuously in time. The used receiver is an u-blox LEA-6T-0 GPS receiver ([22]). To parse and preprocess the data from the GPS receiver, we used RTKLIB.

Time synchronization is achieved by associating a Julian date time-stamp directly after parsing the WLAN and GPS data, respectively. After adding the time-stamps, the data was postprocessed and subsequently stored in the databases. Time synchronization is at least accurate up to half a second, enough to estimate the location of a (walking) pedestrian every second.

During data recording, the experimenter held the laptop in front of his chest, the GPS antenna was mounted on the experimenters head, whereas the WLAN antenna was connected directly to the laptop. The experimenter’s body probably influenced the WLAN RSSIs and his head movements may have affected the GNSS data reception. Between the radio map survey phase and localization phase, there are more than 18 months. A degradation of the radio map is likely, nonetheless not severe, because the university’s official WLAN infrastructure is relatively stable.

The radio map (see Supplementary Material in [12]) was transformed to a local coordinate frame with an origin at (20.59°, −100.415°, 1800.0 m) and the developed filter was also implemented to compute positions in this frame.

5.2. Filter Implementation Details

Before the observation data is processed, it is synchronized with help of the assigned time-stamps.

The reception of all necessary GPS data, especially ephemerides, takes a few seconds. For that reason and the knowledge that a WLAN based location estimate is likely inexact in that area, we skip the first 35 seconds of Trajectory-2 and start the estimation with GPS observations available.

To initialize the filter we distributed the particles uniformly over the test bed and set the velocity to zero. The initial position estimate is close to the centroid of the initial particle distribution and thus relatively close to the centre of the test bed.

We repeat the estimation of a trajectory 50 times with random initialization of the particles. The results, that is RMSE, standard deviation, and the data to create the figures, are averages over these 50 repetitions.

The particle filter operates with 1000 particles. This assures a certain reliability of the filter outcomes.

5.3. Indoor-Like Trajectory: Trajectory-1

Figure 2 depicts the trajectory estimated by the particle filter (in three configurations), by the extended Kalman filter and the ground truth data. The path estimated by the particle filter follows the ground truth quite well. Visible are jumps of the position estimates. They are caused by a lack of observations. That is because RSSIs are received only every 2 s to 5 s. If, in that time interval, no GPS observations are obtained, the filter can only predict but not correct that prediction, which does not alter the position estimate. In the moment that again a reliable RSSI measurement is received, the filter corrects the prediction and the position estimate jumps to the new corrected estimate. The last section consists of an indoor part immediately followed by an outdoor part. When the trajectory enters the building (around (247, 94) m), the estimated location is already off by a few meters and also the path inside the room is not that accurately estimated. After leaving the room, the estimates drift further and the error increases. During the last section of the trajectory, the particle filter estimates follow the ground truth poorly with an offset. The extended Kalman filter shows a very poor performance. The trajectory shows large position jumps. A comparison of the extended Kalman filter estimates with that from the particle filter suggests that the extended Kalman filter takes the position estimate from WLAN not much into account; if it does, these large jumps occur. This was not expected, because the variance of the WLAN position estimate is set lower than former experiments suggested. Furthermore, the estimates of the extended Kalman filter on the roofed passageway of the right building are pulled into the open space. This indicates that the extended Kalman filter weights the GPS information much more than the particle filter does, even though the GPS data are apparently of low quality. Only towards the end of the path, when outdoors on the open space, the extended Kalman filter estimates get better, even better than those of the particle filter.

The availability of sensory data confirms the described behaviour. Information about the availability can be deduced from Figure 3. This figure depicts the magnitude error over time for the three particle filter configurations and for the extended Kalman filter. The error of the WLAN stand-alone solution indicates that WLAN data is available almost all the time. The error is low, except after the initialization, when no measurements are received for some seconds, and during the outdoor section, where WLAN location fingerprinting is known to perform poorly. The GPS stand-alone solution yields low errors during the first minutes or so, although no GPS data is received. This is due to the initial position that is relatively close to the true start position. The similarity of errors of the particle filter hybrid solution and the WLAN-only solution implies that no GPS data was received for the largest time of the trajectory, or it at least implies that GPS information was not fused into the hybrid solution.

Taking into consideration that WLAN data was available all the time and GPS-only towards the end, the accuracy of the hybrid solution can be explained in more detail: up to minute 24.5, the estimates are governed by the RSSI observations. Therefore, the errors are small where fingerprints can be distinguished well, indoors and most of the time along the passageways. (The reader is referred to [21] for a performance evaluation of the (Gaussian process regression based) WLAN-only location estimator.) The large errors during the last minutes can be explained as follows. After a few seconds outside of the right building, GPS data are received and the filter integrates GPS observations into the solution. The error of the GPS stand-alone solution is still high but decreases as the solution converges to the true location. Simultaneously, while the mobile terminal departs further from the building, the radio map data becomes more ambiguous, which increases the WLAN-only error. The error of the hybrid solution increases because GPS data are still inaccurate, they are still converging to the true location, and WLAN data are becoming inexact. The particle filter weights the WLAN data more than the GPS data until minute 25. Then the GPS observations get more accurate and the particle filter begins weighting the GPS observations more than the WLAN observations. This is the desired effect; as one can see, during the last minute, the GPS-only solution yields higher accuracy than the WLAN stand-alone solution. Shifting the weights to GPS observation induces a decrease of the GPS + WLAN solution’s error. Notice that the hybrid solution’s error is always equal or lower than the error of the better performing solution, pointing to an adequate weighting of the two information sources. However, the GPS data, on which the hybrid solution is relying during the last minute, is still relatively poor, maybe because the filter has not yet converged after the huge deviations, or because the last part of Trajectory-1 is surrounded by trees.

The error of the extended Kalman filter is initially equal to that of the GPS-only configuration. It takes about three minutes until a WLAN position estimate is considered by the extended Kalman filter. In that moment, the error decreases to the magnitude of the error of the particle filter. While the particle filter solution mainly ignores the GPS information, the extended Kalman filter occasionally weights the GPS data and deviates from the quite accurate WLAN-only solution. A better performance can be seen at the end of the path, when GPS data quality increases due to better visibility of the satellites. The GPS measurements get more weight while the WLAN data is weighted less. During the last minute, the extended Kalman filter outperforms the particle filter.

The precision of the algorithm can be assessed from Figure 4, which illustrates the empirical cumulative distribution function of the errors. The hybrid solution presents the highest accuracy and precision, with a median accuracy of 5 m and an accuracy of 11 m at 90% probability.

The extended Kalman filter solution yielded an accuracy of 10 m at 50% and 20 m at 90% probability. That is about twice as low compared with the accuracy of the particle filter. The maximum error of the extended Kalman filter solution is about 30 m, whereas the maximum error of the particle filter is 15 m.

Table 1 compares the performance of the particle filter hybrid solution with the individual GPS and WLAN solutions and with the extended Kalman filter numerically, that is, in terms of the root-mean-square error (RMSE) and the standard deviation (SD). The hybrid solution clearly outperforms the two other particle filter configurations. The GPS-only solution is very inaccurate, because GPS observations were only available during the last section of the trajectory. For the same reason, the WLAN stand-alone solution is almost as good as the hybrid solution. Nevertheless, it has a higher probability for large errors. Due to the additional information in terms of GPS observations in the last section of the trajectory, the hybrid solution is more accurate than the WLAN-only solution. The RMSE of the particle filter WLAN + GNSS configuration is in comparison with the extended Kalman filter solution two times smaller. However, considering the standard deviation of these two filters, the outcome of extended Kalman filter is two times smaller than that of the particle filter.

The error distribution function and the standard deviation of the particle cloud show that the hybrid solution is not only more accurate but also more precise; its empirical error distribution function presents the lightest tails. The GPS + WLAN configuration also yielded the smallest standard deviation.

5.4. Outdoor-Like Trajectory: Trajectory-2

The overall performance of the filter for Trajectory-2 is depicted in Figure 5. The continuous availability of GPS observations is observable; the stop-and-go motion causes the position estimates to cluster, because, during the phases without motion, the filter continues estimating position of the static mobile terminal based on GPS observations. In between these clusters, the estimated position tends to jump, which can be explained again by the reception of a reliable RSSI observation after some time. After the estimates converge closer to the starting point, the estimated path follows roughly the ground truth. The estimates become more exact where the mobile terminal enters the right building. However, most times along the roofed passageways and around the small lawn area the localization performance is only fair. More exact estimates can be seen about the corner, where the two buildings almost touch, and later on the open space again.

The extended Kalman filter shows a similar positioning performance. It estimates well the entering of the right building. On the semiopen passageway of the left building and around the small lawn area, the accuracy decreases as well; nevertheless, position estimates of the extended Kalman filter appear smoother than those of the particle filter. The accuracy of the extended Kalman filter estimates increase again towards the end of the path on the open space.

To reason about the causes of the described performances, we refer to Figure 6. It illustrates the magnitude error, again for the three configurations of the hybrid algorithm and the extended Kalman filter. This figure shows that the error of the particle filter GPS + WLAN solution is in average below 5 m; nevertheless, in various moments, it exceeds 10 m. The GPS-only solution yields errors similar to the hybrid solution and errors smaller than that of the WLAN stand-alone solution. The outcomes based on only WLAN are very poor, especially about 24 min to 26 min and for the last 4 min of the trajectory.

The low accuracy of the particle filter’s GPS + WLAN solutions about 24-25 min is because of the poor quality of WLAN and GPS observations. During that time interval, the WLAN observations are very inexact for which reason they are basically unregarded, and therefore the accuracy of the hybrid solution equals that of the GPS-only solution. Between minutes 34 and 35, the poor accuracy of the hybrid solution is caused by an inappropriate weighting of the observations. The filter is too confident about the GPS observations or estimates the quality of the WLAN observations as too poor, respectively. The same aspect is visible at the beginning around 23 min and also about 27.5 min. In general, however, the accuracy of the hybrid solution, again, leans towards the more accurate sensor information; it never presents the largest errors among the three configurations. Figure 6 indicates that the weighting mechanism works reasonably well besides a few sporadic exceptions.

The accuracy of the extended Kalman filter is also most of the time about 5 m and shows only a few peaks that are larger than 10 m; see, for example, minutes 24, 28, and 34. It is worth noting that in some occasions the extended Kalman filter presents the lowest errors; refer to minutes 25, 35, 37, and 38. In general, Figure 6 confirms the similarity of the accuracy of the particle filter in GPS + WLAN configuration and the extended Kalman filter.

We examine again the cumulative distribution function of the error, as shown in Figure 7. The proposed algorithm yielded a median accuracy of 4.5 m and only 10% of the errors are larger than 10 m. As the hybrid solution makes use of both sensor data, it is not surprising that it outperforms the individual solutions. The median error for GPS-only is roughly 6.5 m and that of the WLAN-only configuration is 8 m. The difference is even bigger at 90% probability; GPS-only yielded 12 m and WLAN-only 23 m. Thus, in comparison with the two stand-alone solutions, the integration of the two systems improves also the precision.

The median accuracy of the extended Kalman filter is 5.5 m, only one meter larger than that of the particle filter hybrid solution. The error at 90% probability is, as for the particle filter in GPS + WLAN configuration, 10 m. The maximum error of the extended Kalman filter is about 18 m, that is almost 10 m less than the maximum error of the hybrid solution. However, the largest error of the particle filter stems clearly from the initial estimate, before the first observations were obtained; the maximum error of the extended Kalman filter occurred about 34 min.

Table 2 summarizes the performance analysis and confirms the previously made notions. It presents the RMS error and the standard deviation derived from the particle cloud and contrasts them for the three configurations, as well the RMSE and the standard deviation of the extended Kalman filter are shown. The extended Kalman filter yielded the smallest RMS error and standard deviation, although the GPS + WLAN configuration’s median error is smaller than the median error of the extended Kalman filter. The hybrid solution of the particle filter is in average slightly more inaccurate and its standard deviation is more than twice than that of the extended Kalman filter. As expected, the particle filter in GPS + WLAN configuration outperforms the GPS-only and WLAN-only solutions. The GPS stand-alone solution performs also well considering the localization scenario. The performance based on only WLAN observations is mediocre, because large parts of the paths are outdoors, where fingerprints are ambiguous.

The stated error figures are specific for that system and scenario and shall be put into context. If the measurements of GPS and WLAN are severely degraded, the particle filter can only yield poor estimates. An idea about the quality of the measurements can be gained from Figure 8.

Figure 8 compares the estimates for Trajectory-2 of the GPS-only particle filter solution with that of the commercial GPS receiver ([22]) and with the least squares (LS) solution (that was used to estimate the clock offset of the receiver). Additionally, this figure depicts a second trajectory from the commercial GPS receiver from another experimental run. The black trajectory is derived from the same GPS observations that were fed into the particle filter (teal curve) and least square algorithm (cyan curve). The grey trajectory is an alternative trajectory estimated by the GPS receiver and was recorded on the same path, but during another experiment; the experimenter’s motion during this experiment was continuous.

The commercial GPS receiver estimates (black path) the trajectory well at the beginning, until the path enters the right building. The opposite holds for the GPS-only solution of the particle filter; its estimates are pulled towards the open space. The indoor part is not recognized by the GPS receiver, whereas the outcomes of the GPS stand-alone solution are good in the indoor section. On the roofed passageway of the left building, only the GPS receiver provides more or less adequate estimates; the estimates appear of higher accuracy than that of the GPS-only solution. Later, on the passageway of the right building, the GPS receiver’s accuracy decreases again; its estimates cross the right building until they come back on track which in the receiver is on the open area. The GPS-only solution shows better results on the passageway of the right building, but on the open space the GPS receiver appears to be more accurate again.

The least squares solution illustrates the quality of the GPS measurements and the alternative trajectory of the GPS receiver shows the harshness of the environment. The estimates of the least squares solution are similar compared to that of the GPS-only solution, but much more noisy due to the missing motion model. Nonetheless, position fixes during the indoor part are surprisingly good. The alternative GPS trajectory is worse than the black trajectory, especially on the semiopen passageway of the right building and after surrounding the small lawn area; no position fixes could be estimated during the last minutes of the trajectory.

A comparison of the accuracy of the trajectory of the GPS receiver and the hybrid particle filter is shown in Figure 9. That figure depicts the magnitude error of Trajectory-2 obtained from the hybrid particle filter next to the error from the GPS receiver. Additionally, we show the number of satellites. At the beginning of the trajectory, on the open space, the GPS receiver is more accurate than the hybrid solution. It actually follows the ground truth very well, even on the passageway of the right building, which is remarkable under these conditions (it might even be an exception as the alternative trajectory of the GPS receiver suggests). The indoor section and the path next to the left building, in general after minute 29, are nevertheless estimated poorly. Errors rise up to 20 m and 30 m. These large errors occur when the number of available satellites is varying a lot, indicating quick changes of the sky view which degrades GPS performance. Time intervals of good GPS performance correlate with sections where the number of observations is constant.

For most of the time of the trajectory, 6 to 7 GPS observations are available, a fair amount, given the scenario. Only in a few epochs, the number of satellites decreases to four and only one time it is below four. That means, that during time intervals of poor GPS performance, the number of observations was fair enough to get a good position fix, but that the quality of the observations was not sufficient for accurate position estimates.

The integration of WLAN data improves the localization accuracy. The GPS receiver’s RMSE for that trajectory is 9.79 m, about three meters larger than the RMSE of the particle filter and of the extended Kalman filter.

6. Discussion

The performance of GNSS degrades in areas with obstructed satellite-to-receiver link, because, for example, scattering and multipath effects delay the signal and increase the propagation time. However, the same obstructions attenuate WLAN signals so that the spatial RSSI distribution is rather heterogeneous. This facilitates distinguishing nearby RSSIs and improves the resolution and the accuracy of a WLAN fingerprint localization system. The opposite effect occurs in open areas, where GNSS’s position fixes are usually accurate, but position estimates relying on WLAN RSSI often show large errors. These large errors are caused by very similar RSSI patterns at distant locations.

As expected, the particle filter hybrid solution attains the highest accuracy and precision compared with the stand-alone solutions. The smaller error of the two stand-alone configurations presents an upper bound for the error of the hybrid solution.

In comparison with the extended Kalman filter, the proposed algorithms outperformed the extended Kalman filter in the indoor-like scenario. For Trajectory-2, the performance of the particle filter and the extended Kalman filter are comparable, the particle filter performed in average slightly worse than the extended Kalman filter. The extended Kalman filter estimates are more precise than the estimates of the particle filter. This is because of the little amounts of noise that are introduced in the process model of the particle filter to mitigate sample impoverishment (caused by resampling). However, the extended Kalman filter’s performance is considerably poorer (large position jumps) than that of the particle filter when only WLAN data was available. The integration of WLAN data works better in the particle filter. From our point of view the disadvantage that the extended Kalman filter presents in scenarios without or with few GPS data does not outweigh the small advantage shown in the scenario where GPS and WLAN data were basically available all the time.

Sections of low accuracy of the hybrid solution are due to the loss of one of the systems or due to very poor quality of one of the observations in combination with sharp direction turns and/or changes of the sky view. As for most of the time one of the systems operates reasonably well, extreme deviations from ground truth, as it is common for fusion on a position level and can be seen for the extended Kalman filter for Trajectory-1, can be avoided, because the continuous adaption of the weights results in smooth location estimates, also during indoor/outdoor transitions.

The chosen motion model is very general and versatile and served well in our experiments. However, an adequate model must be chosen according to the final application.

One drawback of our method is the computational complexity of the RSSI interpolation. Predicting RSSIs requires the inversion of the covariance matrix, which has a dimension equal to the number of used particles. One way to reduce the computational costs is to reduce the number of particles. To compensate the loss of accuracy due to reduction of particles, a better importance density is advisable. For some scenarios, an extended Kalman filter solution can achieve similar or better performance at lower computational complexity. Our methodology used a maximum likelihood estimator to estimate a position from WLAN RSSI; this is based as well on the Gaussian process regression. Replacing the used WLAN position estimation with a simpler algorithm (e.g., k-Nearest Neighbour) would reduce the complexity of the extended Kalman filter further. The concrete impact of the above-mentioned changes on the positioning accuracy of the particle filter requires further research.

The ubiquity of the proposed method depends heavily on the dissemination of WLAN fingerprinting location systems, more precisely on the coverage with radio maps, and that they are accessible and compatible. For further thoughts on that we refer to [4]. Another point we are presuming is the accessibility to the pseudoranges of a GNSS receiver. However, most off-the-shelf receivers do not provide pseudoranges yet. This is, however, changing right now as smartphones with access to raw measurement recently appeared on the market.

One objective of this study was the exploitation of the contrary information provided by pseudoranges on the one hand and RSSI on the other hand. We consider the automatic weighting of the different sensor information as functioning, but it is also apparent that it is improvable. Some relatively simple and straight forward ideas and changes in the presented system are likely to improve the localization accuracy.

The simplest change is the use of a multiconstellation, multifrequency receiver. This improves the coverage because the use of those receivers increases the number of receivable signals, and it improves the localization performance because a second frequency enables a more accurate correction of GNSS errors. Exploiting the carrier phase information would significantly decrease GNSS positioning error; nevertheless, such receivers are just starting to penetrate the market. Additional improvement is expected from multipath mitigation methods in the receiver, for example, at the correlator stage, if these are not yet implemented in the used receiver. Multipath mitigation based on the RSSI data is also worth investigating.

On the WLAN side, the most crucial issue is the choice of the variance in the RSSI likelihood function. It appears advisable to consider time-dependent variance estimates that express the uncertainty of the WLAN observations and not of the predictions of the Gaussian process model. Combining the currently used RSSI variance, containing spatial information, and a RSSI variance estimated from the observed samples, containing temporal information, improves very likely the weighting between the two information sources. Approaches to reducing the computational costs of the interpolation of WLAN RSSI are another open point.

Dependent on the use case, one could create fingerprints only on the paths. This indirectly includes information of the environment and improves the localization performance by removing potentially ambiguous fingerprints. This idea also translates to the interpolation with Gaussian processes, because the variance in areas without fingerprints (training data) will be higher.

7. Conclusion

This study presents a new algorithm combining GPS pseudoranges and WLAN RSSIs. The proposed algorithm achieves accurate and robust seamless localization as it exploits the complementary information contained in the sensory data. We developed a likelihood function that overcomes the drawback of the spatially discrete fingerprints by interpolating RSSIs through Gaussian process regression. The integration of the RSSI likelihood function and the pseudorange likelihood function into the particle filter yields a fine grained, automatic weighting of GPS and WLAN observations, which improves the accuracy and precision when compared with the stand-alone solutions using only GPS or WLAN data. This gain in accuracy and precision comes from an increase in computational complexity. The extended Kalman filter used as benchmark, that integrates a WLAN position estimate with GPS pseudoranges, achieved in one scenario similar positioning performance compared to the proposed particle filter. However, without enough GPS data, the extended Kalman filter accuracy degrades much, which is why we consider the particle filter worth its complexity.

In realistic indoor/outdoor experiments, in an environment harsh for GNSS and WLAN fingerprinting localization, the proposed method achieved an average positioning accuracy about 7 m. The proposed integration approach of GPS pseudoranges and WLAN RSSI is of low-cost and can improve localization performance of mobile platforms, whenever pseudoranges and RSSIs are available in the navigation processor.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was partially supported by the Consejo Nacional de Ciencia y Tecnología and the División de Investigación y Posgrado de la Facultad de Ingeniería de la Universidad Autónoma de Querétaro. The authors thank Karla Alejandra Rojas Camargo for the assistance in programming the data logger and data acquisition. Philipp Richter would also like to thank Eduardo Castaño-Tostado for his comments that improved this work.