#### Abstract

We address the indoor tracking problem by combining an *Impulse Radio-Ultra-Wideband* handset with an ankle-mounted *Inertial Measurement Unit* embedding an accelerometer and a gyroscope. The latter unit makes possible the detection of the stance phases to overcome velocity drifts. Regarding radiolocation, a *time-of-arrival* estimator adapted to energy-based receivers is applied to mitigate the effects of dense multipath profiles. A novel quality factor associated with this estimator is also provided as a function of the received *signal-to-noise ratio*, enabling us to identify outliers corresponding to obstructed radio links and to scale the covariance matrix of radiolocation measurements. Finally, both radio and inertial subsystems are loosely-coupled into one single navigation solution relying on a specific *extended Kalman filter*. In the proposed fusion strategy, processed inertial data control the filter state prediction whereas *Combined Time Differences Of Arrival* are formed as input observations. These combinations offer low computational complexity as well as a unique filter structure over time, even after removing outliers. Experimental results obtained in a representatively harsh indoor environment emphasize the complementarity of the two technologies and the relevance of the chosen fusion method while operating with low-cost, noncollocated, asynchronous, and heterogeneous sensors.

#### 1. Introduction

For the last past years, new *location and tracking* (LT) needs have been gradually introduced into a wide variety of applications, such as security, health care, rescue, logistics; or house automation. A growing interest has been more particularly expressed in location-dependent indoor services, which require seamless pedestrian navigation capabilities in harsh environments where satellite-based solutions cannot operate. In this context, alternative technologies are currently under investigation, based on for example, location-enabled wireless networks [1, 2].

On their own, most of modern wireless networks can indeed retrieve the positions of mobile radio devices relative to the known position of reference anchors or *base stations* (BS). The radiolocation functionality simply relies on the measurement of radio metrics, which depend on the distance traveled in the air by transmitted signals. For instance, when a mobile radio device is synchronized with a BS (e.g., using *n*-way ranging protocol transactions [3]), range information can be derived from the *time of arrival* (TOA) of the received signal. Several TOA-based range measurements collected (with respect to fixed BSs with known locations) can hence feed positioning or tracking algorithms to solve out a circular (resp., spherical) location estimation problem in 2D (resp., 3D).

Alternatively, if the surrounding BSs are strictly synchronized (i.e., independently of the clock of a mobile transmitter), the *time difference of arrival* (TDOA) can be considered, leading to a hyperbolic problem formulation.

Benefiting from unprecedented resolution and synchronization capabilities for the acquisition of such temporal radiolocation measurements, the *Impulse Radio-Ultra-Wide band* (IR-UWB) technology is today viewed as a credible solution to address LT applications through short-range and *low-data-rate* (LDR) links [4–9], *a fortiori* in the context of *ad hoc wireless sensor networks* (WSN). However, despite the claimed fine properties of IR-UWB signals, obstructed radio links notoriously introduce additional biases and high dispersion onto measurements, for example, when direct paths are blocked, or when transmitted paths are severely attenuated and shifted in time [3]. Overall, IR-UWB systems claim to provide very fine location and tracking performances in *line of sight* (LOS) (i.e., typically within submetric precision), but they can hardly guarantee such a precision level under generalized *non-line-of-sight* (NLOS) conditions (e.g., affecting several links with respect to distinct BSs simultaneously), what is a rather common situation in indoor environments. Hence, one first challenge is to derive robust TOA estimators with practical receiver architectures. One more point would be to evaluate the instantaneous quality of TOA estimates, or even to efficiently remove measurement outliers due to NLOS to assist and enhance the fed LT algorithms.

But it is also well known that radiolocation solutions operating in harsh radio environments could benefit from external means, like assisting *inertial navigation systems* (INSs) based on *inertial measurement units* (IMUs). In a navigation scenario, one INS can deliver relevant information out of raw inertial measurements for example, the pedestrian displacement amplitude, velocity, or heading [10–12]. Such information can then be used in addition to radiolocation measurements, as proposed by recent works in the field of hybrid data fusion for example, with GNSS pseudoranges in [13], with IR-UWB TOA in [14, 15], with IR-UWB TDOA and AOA in [16, 17], or even with WiFi *received signal strength indicators* (RSSI) fingerprints in [18].

Besides strict performance considerations, fusion is all the more relevant, since mobile devices (e.g., personal terminals) are expected in the near future to physically integrate multiple standards and sensors and/or to cooperate with other systems or networks, making heterogeneous modalities naturally available on the user side. Accordingly, several system architectures and configurations involving either collocated or noncollocated radio devices and IMUs can be considered for navigation purposes. As an illustrating example, smart clothes comprising distributed IMUs might form a *body area network* (BAN) using one first radio access technology and interacting with a portable handset displaying or relaying inertial information to an external access point, to the infrastructure equipping the building, or to another wearable network in the vicinity (potentially through another radio technology). Finally, coupling low-power and low-cost technologies particularly makes sense for the perennial and massive deployment of such systems. This implies the use of adequate radio technologies with energy-efficient transceiver design, reasonably simple inertial units (i.e., with a limited number of embedded sensors), and low-computational complexity for further postprocessing (including data fusion tasks).

In this context, we address herein the pedestrian navigation problem in indoor environments that present dense multipath profiles and magnetic perturbations, by loosely coupling an IR-UWB handset transmitter with a shoe-mounted IMU endowed with a 3-axis accelerometer and a 3-axis gyroscope. A specific *extended Kalman filter* (EKF) is defined to optimally hybridize the radio and inertial data and to cope with the specific system architecture constrains (i.e., operating with noncollocated sensors). Adapting and extending previous results from [19], this formulation also combines UWB measurements (obtained through the method proposed in [20]) into new observations defined as *combined TDOA* (CTDOA). While reducing filter complexity (as a function of the number of available measurements), the proposed solution allows us to remove outlier measurements, without reconfiguring the whole filter structure and without omitting relevant measurement information. The selection of nonoutlier measurements is performed by monitoring the instantaneous quality of TOA estimates, based on a new practical SNR-dependent indicator. As for inertial data, we take benefit from the IMU place to detect stance phases and to reset the foot velocity, hence mitigating in turn the drift that usually affects the estimated INS velocity. The obtained pedestrian heading and average body velocity subsequently feed the fusion filter to control the state prediction phase. Indoor experiments were carried out to validate the proposed fusion approach, as well as to draw intermediary statistical UWB channel parameters useful to TOA estimation.

The paper is structured as follows. In Section 2, we present a selection of techniques and concerns from the recent state of the art in the fields of IR-UWB TOA-based ranging, inertial navigation systems, and the fusion of both modalities. We also try to position the main contributions of this paper in comparison with existing solutions. Then in Section 3, we briefly discuss the specificities of the statistical UWB channel models, and we estimate channel parameters from real indoor measurements. Then a robust TOA estimator adapted to energy detection receivers is recalled. We also show how to practically evaluate the quality of this estimator and to remove outlier measurements, from a filter-oriented perspective. In Section 4, we present our INS, which basically consists of an ankle/foot-mounted magnetometer-free IMU. In particular, we show how to use detected stance phases to remove the drift on INS velocity and infer the average body velocity. In Section 5 we detail and justify further our loosely-coupled fusion strategy, along with the corresponding filter structure. Then we account in Section 6 for experimental results, which were obtained in a typical indoor environment offering mixed operating conditions in terms of *signal-to-noise ratio* (SNR), radio links obstructions, *geometric dilution of precision* (GDOP), and magnetic disturbances. Finally, Section 7 concludes the paper.

#### 2. Related State-of-the-Art and Paper Contributions

##### 2.1. Impulse Radio-Ultra-Wideband Time-of-Arrival Estimation

Many solutions have been described in the literature to cope with IR-UWB TOA estimation, including sophisticated algorithms inspired by former high-resolution channel estimation solutions [6] requiring high sampling rates. More recently, various other techniques adapted to the low-complexity LDR context have been proposed and compared (e.g., [7]). A specific focus is usually made on noncoherent receivers like *energy detectors* (ED), for which one simple approach consists in comparing the energy collected in consecutive time bins with an appropriate detection threshold. The index of the first bin exceeding this threshold is then associated with the TOA estimate [7, 21]. Unfortunately, within realistic indoor channels, threshold-based ED suffers from overlapping *multipath components* (MPC) and poor *signal-to-noise ratio* (SNR) conditions, introducing significant estimation errors and biases. One weakness of these methods is that they do not take benefit from the whole MPC profile (though conveying constructive information), but they only depend on marginal and independent energy terms. Therefore, new TOA estimators have been proposed very recently in [20], assuming realistic path amplitude statistics in compliance with IEEE 802.15.4a recommendations [22], and considering the whole observed energy profile before making a decision on the estimated TOA. These estimators were shown to exhibit low-estimation dispersion around the actual channel leading edge, over a wide range of practical SNRs and channels.

But one more challenge is to allow the real-time prediction of TOA estimation uncertainty or dispersion. As regards to this measurement quality assessment, for most of the TOA estimators proposed in the literature, one could estimate offline the uncertainty based on the received signal power and conditioned on the environment category (e.g., indoor industrial, indoor residential, outdoor, etc.), on the channel configuration (e.g., LOS, NLOS, severe NLOS, etc.), and/or on the actual distance. A relationship could then be established *a priori* between the error affecting the measured TOA and the SNR through simulations, experimental campaigns, or even theoretical analysis (e.g., [23]). However, this kind of method can hardly benefit from the specificities of the received signal at each instant (e.g., of the current multipath energy profile) under mobility. As an example, under given SNR conditions, the received signal could have either sparse or dense multipath profile, which directly impacts the reliability of the estimated TOA. Moreover, offline characterization is usually mostly intended for ranging performance assessment, but still remains unexploited for online tracking purposes.

Hence a new practical method is still required to associate the instantaneous TOA estimation quality with practical estimators (e.g., [20]).

##### 2.2. Inertial Navigation Systems

INSs based on integrated IMUs are more and more used for navigation purposes due to their low cost, low weight, and low consumption. Moreover, as stand-alone autonomous solutions, they can be used indifferently in indoor or outdoor situations, in the lack of surrounding means or infrastructure. They can also be considered for *dead reckoning navigation* (DRN) (i.e., estimation of the current position from a known starting point) when any absolute positioning system or GNSS is not available, ensuring the navigation service continuity.

A typical IMU consists of a combination of low-cost *microelectromechanical sensors* (MEMS) for example, accelerometers, magnetometers, or gyroscopes. Such units, available as commercial devices now [24], are intended for numerous applications such as motion capture, unmanned vehicular control, antenna stabilization, video gaming, and pedestrian navigation. Those IMUs provide metrics related to acceleration, orientation, magnetic field, and angular rate. The raw measurements must be processed and analyzed further in order to get relevant navigation information, such as the direction, displacement, speed, and so forth.

In the specific pedestrian navigation context, DRN systems rely on step detection and, for each detected step, on length and heading estimation [25]. Examples of step detection methods are given in [26] or [27]. For step length estimation, two kinds of approaches can be considered. One approach is based on an empirical model linking the step length with some parameters extracted from the sensors measurement during the step [28, 29]. The advantage of this approach is that the models can be adapted to any placement of the IMU. The disadvantage is that a calibration is needed for each pedestrian. A second approach uses the integration of the gyroscope and the accelerometer raw data. Nevertheless, even small errors on the measurements would lead, after integration, to a large error cumulated over time. The drift of the estimated velocity can be reduced with a foot-mounted IMU using *zero velocity update* (ZUPT) [10], which enables to reset the velocity to zero when the foot is on the floor. Other methods enable to limit the drift of foot-mounted INSs [30]. As no calibration is needed and drifts are limited, this kind of method leads to better performances than empirical methods. However, the IMU has to be placed on the foot in order to benefit from zero velocity. For heading estimation, magnetometers can be used. But they can be subject to local magnetic disturbances induced by pieces of furniture, buried or on-body metallic materials. Those perturbation are particularly present near walls and floors, making hazardous the use of foot-mounted magnetometers in typical indoor environments.

##### 2.3. Hybrid Data Fusion Strategies

Three different fusion strategies can be applied to merge heterogeneous data in a tracking context.

The first one is the so-called *decoupled* strategy, for example [13]: the mobile node position is estimated separately and independently by each subsystem and then the set of estimated positions (delivered by independent systems) is fused into one final solution. This strategy is preferably applied when both location systems are not subject to drift, or when raw data are not available.

The second strategy is *tightly coupled*, for example [15]: some raw data are available at each subsystem, which are processed at once to track the location. Hence, it necessitates a global model that relates the heterogeneous measurements of both systems to a correction model accounting for defected measurements (i.e., orientation drift, outlier ranging measurements, magnetic disturbance). The tightly coupled fusion seems to be the best method at first sight since it enables to jointly optimize the output estimate given all the available data. However, these solutions require that the involved subsystems are physically collocated or rigidly connected, what is neither necessarily relevant nor practical for inertial-based pedestrian navigation. Finally, the synchronization issue between subsystems, which is inherent to any fusion strategy, is all the more critical within these tightly coupled solutions.

The last fusion strategy is * loosely coupled*, for example [16, 17]: the inputs to the location estimator can be raw and/or preprocessed data issued from sub-systems. This strategy enables to mitigate the previously mentioned drawbacks of the decoupled and tightly coupled fusion strategies. It tolerates that the radio part includes a controlling module to remove outlier measurements and/or to adjust the assumed quality of radio measurements. Finally, this loosely-coupled strategy enables different placements of the involved subsystems.

Besides the coupling strategy itself, the fusion of heterogeneous data also requires the use of specific estimation tools and advanced filtering techniques, such as a particle filter in [14], a combination of a particle filter and an extended Kalman filter (EKF) in [15, 18], or a backward/forward Kalman filter with a recording/smoothing unit in [31]. But another feature of navigation fusion filters concerns the way the IMU data are exploited. In a first approach, the inertial data are used in the prediction phase of the filter [17]. In the second one, they are integrated as observations in the correction phase [16].

In the following, we will provide detailed justifications for retaining a loosely-coupled scheme based on EKF and using inertial data in the prediction phase of the filter, given our system architecture constraints.

##### 2.4. Summary of the Main Paper Contributions

Overall, the main paper contributions are as follows: (i)application of an energy-based Bayesian TOA estimator under realistic channel parameters; (ii)proposal of a new quality indicator for TOA estimates, feeding the tracking filter; (iii)proposal of a specific fusion-oriented filter admitting: (a)IMU-based pedestrian heading and velocity measurements as control inputs into the filter state prediction; (b)combined temporal radiolocation measurements as filter observations, reducing computational complexity and mitigating local harmful effects due to EKF linearization; (iv)evaluation of the previous items within one single unified tracking scheme through representative experiments in a typical indoor environment.

#### 3. TOA Estimation with UWB Energy Detection Receivers

##### 3.1. Received Signal Statistics

The received IR-UWB multipath signal can be modeled as a sum of weighted replicas of an energy-normalized pulse of duration , as follows:
where is the overall channel energy received per transmitted pulse, and are, respectively, the arrival time and the normalized complex gain of the path (with without loss of generality), if and denote the path amplitude and phase, respectively, and is the TOA of the first path. The received signal is filtered with an ideal *Bandpass filter* (BPF) of bandwidth . Accordingly, a zero-mean *additive white Gaussian noise* (AWGN) process with double-sided power spectral density is also filtered in the band of the transmitted signal into , with the resulting variance . Finally, we define .

Considering a typical ED receiver [20, 21, 32] (see Figure 1), the overall observation time is divided into time slots of length . As we assume nonoverlapping replicas of the transmitted pulse hereafter, the noise samples taken in different slots are considered as statistically independent. The slots are numbered starting from slot (i.e., for ) up to slot (i.e., for ). The output sample associated with the time slot of the ED can be written as
Let be the slot index to be estimated, which is associated with the bin that contains . Then the slots with indexes correspond to the multipath region. According to [33], under the AWGN assumption the normalized energy samples at the integrator output of the ED follow a Chi-square distribution. For , is either a central Chi-square distributed (C) *random variable* (r.v.) with degrees of freedom or a noncentral Chi-square distributed r.v. (NC), still with degrees of freedom, but with a noncentrality parameter that accounts for the energy of the useful signal in slot . is usually assumed as a Nakagami- distributed r.v. with parameters and . Finally, in [34] the *average power delay profile* (APDP) is modeled by a single exponential decay, with a decay constant . In [22], the APDP also follows an exponential decay, but with further multipath clustering effects.

To verify these model assumptions and characterize key channel parameters, a channel measurement campaign was carried out in the indoor environment described in Section 6.1 for our tracking experiments. Averaging over all the measured channel profiles, an exponential model was fitted to the empirical average power decay, and an empirical function was drawn for the path presence probability, depending on the channel excess delay (or time index, equivalently). To get a better fit of the power decay to our measurements, we preliminarily conditioned received energy upon the presence of a path in each slot so that the slots with no detected path energy were not taken into account in the average. In order to decide whether contains only noise energy, we simply set a threshold equal to dB above the noise power. Finally, the probability of path presence was computed over all the observed channel profiles, as follows: According to [22], each received profile realization is affected by a distinct shadowing effect. Thus, the measured channel profiles were normalized in power before averaging. Figure 2 represents the exponential power decay model fitted to the measured channel profiles, along with the empirical path presence probability , both of which decrease as a function of the excess slot index (i.e., the time index after channel leading edge). For the exponential power decay, the best fit was obtained for ns, which is rather close to the mean value ns found in [34].

Since is Nakagami- distributed with a mean value , then is Gamma distributed with a shape parameter equal to and a scale parameter equal to . For the sake of simplicity, we assume that all the received paths have the same parameter ( corresponding to the part of the received signal comprising the useful multipath energy), like in [21]. As shown in Figure 3, for the theoretical *cumulative density function* (CDF) of the normalized path energy satisfactorily fits to the empirical one computed out of our experiments. In the following, the exponential power decay and Nakagami- parameters, as well as the probability of path presence as a function of the excess time index, are used as prior information to feed our TOA estimator. Similarly, in [21, 33], such channel characteristics are assumed available *a priori* to set optimal detection thresholds.

##### 3.2. TOA Estimation

We consider the *minimum mean square error* (MMSE) estimator proposed in [20]. Accordingly, the sample index associated with the estimated TOA is defined as
where is now a conditional probability density function (pdf). Complete analytical developments with their intermediary results can be found in [20]. The TOA estimate is finally obtained as , where is the ED sampling period.

##### 3.3. TOA Covariance Estimation

We now intend to provide an indicator reflecting the quality of energy-based TOA estimation. One step beyond, the underlying idea is to feed a tracking filter with further information to dynamically adjust the covariance matrices used in the filter correction step, and/or even to help the detection of outlier TOA-based observations, for instance based on a filter innovation test [35].

For this purpose, we rely here on the instantaneous SNR of the received signal. First, the noise power spectral density can be rather straightforwardly estimated in the absence of transmitted signal. Thus we assume that it is available on the receiver side. One ideal solution would consist in jointly estimating the SNR and the TOA. However, for the sake of practicability and simplicity, it is chosen to estimate the SNR first out of all the collected normalized energy samples . Unfortunately, there is no close form for the *maximum likehood estimator* (MLE) of the corresponding (NC) noncentrality parameters . However, it is at least shown in [36] that the following estimator:
has a lower *mean square error* (MSE) than the MLE for a single observation and for , which is practically verified in our case. Now, since the TOA is *a priori* unknown before applying (4), then the mean energy value of the multipath components within each time slot is also unknown. Hence, we use to determine the SNR. Here, two possibilities are available, as follows:
where is an NC distributed r.v. with degrees of freedom and a noncentrality parameter equal to . Both estimators in (6) provide the same estimation result if , which is more probable at high SNR values. Otherwise, through simulations, we found out that gives lower MSE than . Moreover, for , is unbiased, and the standard deviation of the relative error is equal to , which decreases when increases (i.e., at high SNR).

Let , where is the sum of Gamma distributed r.v. , each with a probability , then is also a r.v. whose *characteristic function* (CF) is given by
where is the square root of , and still assuming (for all corresponding to the part of the received signal comprising the useful multipath energy). Suppose now that is given, then a nonbiased estimator of the SNR can be built as
where is the expected value of , which can be practically computed as , that is to say, as the derivative of (7) with respect to evaluated at .

Using the experimental and found in Section 3.1, it turns out that practically, which is fairly acceptable.

Finally, SNR is estimated at the real receiver using the approximation from (6) instead of in (8), as follows:
At this point, deriving the analytical expression for the *root mean square error* (RMSE) of the estimated TOA as a function of still remains challenging. Consequently, we proceed by simulations to draw the required relationship. In these simulations, the true TOA is a uniformly distributed r.v. in the time interval (i.e., in the entire observation window), and the received signal is simulated according to the statistical parameters given in Section 3.1. Figure 4 shows the RMSE of the estimated TOA as a function of , the latter being computed on the receiver side based on (9). This monotonic evolution can be tabulated as a function of . In an online tracking context, the instantaneous standard deviation of the current TOA estimate with respect to the BS can then be approximated by the previous tabulated RMSE function, assuming unbiased TOA estimate (i.e., RMSE ). In the following, is used in Sections 5 and 6 to determine the instantaneous measurement covariance matrix at each update of a tracking filter (out of the collected energy samples).

##### 3.4. Measurement Outliers Detection and Discarding

In the previous section, we have proposed a dispersion estimator , assuming unbiased estimated TOAs. Unfortunately, most of NLOS situations result in biased estimates.

From a tracking filter perspective, a first approach to mitigate harmful NLOS effects consists in increasing the diagonal elements of the measurement covariance matrix. Different solutions can also be used to detect NLOS outliers, for instance based on innovation tests [15, 16]. In this case, the innovation is related to the difference between the current observation and the predicted one, computed under an unbiased hypothesis. However, the complete measurement set would be discarded, even if only one single measurement was biased. Moreover, in some scenarios, severely obstructed NLOS situations might also coincide with situations where the location information is adversely conditioned, for instance due to a poor *geometrical dilution of precision* (GDOP). In this case, the values in the predicted state covariance matrix are so large (and so are the values in the covariance matrix of predicted measurements) that outliers can hardly be detected through simple innovation tests. Another method is to preliminarily identify the channel state LOS/NLOS based for example, on collected channel energy [37, 38]. Then, identified NLOS channel conditions are associated with TOA measurement outliers. Whatever the detection method, outliers can be discarded before the location estimation step [15] independently of the latest state estimate. In our context, as ED receivers naturally provide access to the channel energy profile, the last approach is considered.

One indicator of the channel status (and hence indirectly of potential TOA estimation biases) is conditioned on the received signal energy as follows: Practically, this indicator reflects the variance of the estimated TOA conditioned on the energy samples of the received signal, averaging over all the possible true , supposed to be uniformly distributed in the observation window. Since the multipath energy spread of NLOS channels is notoriously larger [22], the hypothesis conditioned on the observed energy samples in (10) is likely equiprobable for all the slot indexes. Therefore, the indicator will exhibit larger values for NLOS channels. Note that computing is rather convenient, since all the required quantities are already available for the calculation of (see Section 3.2).

Practically, to decide if one TOA measurement is biased (i.e., if it is an outlier) or not, is compared with a threshold, which is set at three times above the LOS shown in Figure 4. This threshold ensures correctly identifying of the LOS channels under the default centered Gaussian assumption.

#### 4. Inertial Navigation System

As already pointed out in Section 2.2, it is better to place the INS on the foot in order to benefit from ZUPT resetting methods [10] or other update methods [30]. Moreover, we do not consider the use of a magnetometer as magnetic disturbances are too important near the floor. Then we assume a foot/ankle-mounted IMU with a triaxis gyrometer and a triaxis accelerometer only.

First, two main frames are defined: (i)the *body frame* (BF) is the frame in which raw data are measured. BF data are referred to their definition frame with ; (ii)the *navigation frame* (NF) is the reference frame, which is linked to the earth frame. The NF data are referred to this frame with . Since no magnetometer is used, it is not possible to estimate the orientation of the sensor with respect to the North. Hence the navigation frame includes a rotation around the vertical axis with respect to the North frame defined by the East, North, and Up orientations. This rotation is fixed at the beginning of the walk, using the projection of the sensor axis on the horizontal plane instead of East and North axis. Vertical axis (orthogonal to the horizontal plane) is given during the stance phase by the measurement of the accelerometer. Let be the rotation matrix related to the unitary quaternion , denoting the orientation of the NF in the BF using the relation given in the appendix. Initial orientation with respect to the vertical axis is given, at the beginning of the walk and during the stance phase, by the accelerometer measurement. Considering our definition of the NF, with respect to horizontal axis, initial orientation is set to zero. Then, as the gyroscope measures the angular rate, and then can be estimated by integrating its raw data. This rotation matrix also transforms the coordinates of a vector in the NF into its coordinates in the BF. Then the raw data issued at the accelerometer can be written as
where and denote, respectively, the gravity field and the proper acceleration in the NF. The proper acceleration corresponds to the derivative of the velocity and to the second derivative of the position. The velocity can hence be estimated by integrating this proper acceleration. In order to limit the drift due to noise or bias integration, it is necessary to detect correctly the stance phase and suppose the velocity is null during this phase, as proposed within the ZUPT method [10].

##### 4.1. Stance Phase Detection

Many step detection methods are proposed in the literature. For instance, in [39] the step detection method is based on the Fourier transform through counting zero-crossing points over a threshold of the accelerometer output. In [26], the pedestrian step pattern is first detected. Then, the beginning and the end of the step can be defined according to this detected pattern. In [27] several other methods are also compared.

In the scenario considered here, the pedestrian is assumed to be walking in a building, moving from one room to another one with some stops and a direction that may change after a few steps. The pedestrian cannot walk very quickly, thus the stance phase of his foot is large enough to be easily detectable. A simple method used to detect the stance phase consists in comparing the acceleration variance to a threshold. Note that even if the ankle trajectory looks reproducible over distinct steps, the pattern of the IMU measurements will depend on the sensor orientation, see (11). To overcome this problem, as does not depend on the instantaneous rotation of the IMU, one can simply use the amplitude of the 3D accelerometer measurement [26]. During the stance phase, this amplitude is mostly constant and equal to the amplitude of the gravity vector. We then compute the variance of in a sliding window of temporal length equal to s. When the variance value falls under a fixed threshold, a stance phase is detected. During our experiments presented in Section 6, this very simple method has efficiently detected all the steps.

##### 4.2. IMU Proper Acceleration

Let be the swing phase time interval related to the detected step, and let be the index of inertial data within this time interval (i.e., ). Then the acceleration has to be integrated only during this time interval, whereas during the remaining time, the velocity is set to zero. Thus, the velocity of the pedestrian ankle is computed step by step, whereas the orientation of the NF in the BF at time index is continuously determined with the quaternion results from the integration of the D angular rate measurements , as follows: where denotes the quaternion product and where is the time sampling interval of the IMU and . Once the body orientation is determined, the accelerometer measurement terms are simply rotated into the NF as follows: where .

Then we have where denotes the IMU velocity. Since at and the foot is on the floor, then , and consequently Finally, the proper acceleration is given by where in (17) compensates at the same time the gravity and the velocity bias.

##### 4.3. Inertial Support to Pedestrian Navigation

Our aim is to continuously compute the ankle velocity of the pedestrian together with his heading. Note that the heading is defined as the angle given by the direction of the walk in the horizontal plane of the Navigation Frame. Since during the stance phase the velocity is set to zero, then the velocity is computed separately for each step with Unlike in [10], the velocity here is continuous between different gait phases after centering the acceleration in (17) during the swing phase of each step. Thus, the computed IMU velocity in (18) starts at zero and ends up at zero.

We consider that the pedestrian walks on a flat floor. The pieces of horizontal information of the ankle velocity and heading are then given as follows:

During a walk with a constant speed, the ankles alternate between stance and swing phases. Hence, the pedestrian waist experiences almost a constant velocity, whereas the ankles experience high velocity variations alternating between maximum and null speed values. As the horizontal waist velocity of the pedestrian does not vary as much as the ankle velocity, then we compute the waist velocity using a smoothed version of . In the following, the horizontal velocity of the pedestrian and the heading are the two processed inertial data simultaneously incorporated into the tracking filter, contrarily to the loosely-coupled fusion strategies in [16, 17], where the IMU is used only for heading and step detection. Furthermore, in our proposal, there is no need to have the same pedestrian heading and displacement orientation. For instance, pedestrian side walk and back walk are freely enabled since the real displacement of the foot is estimated. Moreover, there is no need to estimate the step length as in [16, 17] or to calibrate the leg length as in [40].

#### 5. Tracking Problem Statement and Fusion Filter Formulation

In the considered scenario, we remind that inertial sensors are attached on the ankle, whereas the pedestrian holds in his hands an IR-UWB transmitter, and known reference receivers are disseminated in the environment. Hence, out of , , , and the problem here is to track at least the unknown pedestrian position , where and refer to the Cartesian coordinates. For this purpose, we consider using an *Extended Kalman Filter* (EKF), which is widely used in mobile tracking applications. After recalling the general EKF formulation in Section 5.1, we will then justify our overall fusion strategy in Section 5.2. Finally, the state and observation models will be detailed in Sections 5.3 and 5.4.

##### 5.1. Generic Extended Kalman Filter Formulation

We start formulating the tracking problem with the following generic state equation where is the time index, , and denote respectively the state vector that contains all the parameters to be estimated, the input control to the dynamic system, and an i.i.d. process noise sequence. is the function that relates two consecutive states. As for the observation function , it binds the state vector to an observation vector composed of measurements, as follows: where is an i.i.d. process noise sequence.

The EKF is a popular linearized version of the Kalman filter that can handle nonlinear functions for both estimated states and observations [35]. It enables to estimate the state vector and the related state covariance matrix at time increment from the previous estimates and , the control input obtained at , and the current observations .

The estimation of with the EKF is typically split in two steps, namely the state prediction and the state update, which rely, respectively, on the dynamic state equation in (20) and the observation equation in (21). (a) Prediction: (b) Update:

where is the Jacobian of with respect to computed in , and is the Jacobian of computed in . and , which are assumed to be known, are the covariance matrices of and .

##### 5.2. Overall Fusion Strategy and Filter Structure

In Section 2.3, we presented “decoupled,” “tightly coupled,” and “loosely coupled” options as three different fusion strategies to merge radio and inertial subsystems into a single navigation solution.

We choose a “loosely coupled” strategy here because it authorizes different placements of the subsystems. For instance, the pedestrian can hold the UWB transmitter in his hand, avoiding many near-floor obstacles, and increasing visibility with respect to anchor nodes, whereas a foot-mounted IMU can take advantage of ZUPT to overcome the INS velocity drift. Moreover, with the “loosely coupled” strategy, each drift-free observation issued from the UWB subsystem can be used to limit the INS drifts.

In our application, we choose to integrate the IMU data in the prediction phase of the filter for the following reasons: (i)it is hard to quantify the errors affecting the inertial data model due to for example, step missing and pedestrian walk behavior (variations of walk speed or unusual pedestrian movements); (ii)in an alternative approach, the fusion filter update would have to run at the rate of the inertial data instead of the UWB rate (e.g., Hz instead of -4 Hz in our experiments), while the update part of the filter includes matrix inversions/manipulations and a few matrix multiplications, which is computationally demanding for real-time applications; (iii)the first proposed approach naturally supports different sampling rates for the prediction and the correction phases. For instance, in our application, the UWB rate is applied to the correction phase, whereas the inertial rate is applied to the prediction phase.

Figure 5 shows the global structure of the proposed EKF fusion filter. The raw inertial data are preprocessed as described in Section 4 to get pedestrian velocity and heading as inputs to the filter prediction phase. In parallel, IR-UWB TOA measurements are delivered at each receiver with their estimated variance, as computed in Section 3.3. In the update part of the filter, specific combinations of such measurements are performed, as introduced in [19], enabling us to remove outliers using the indicator presented in Section 3.4 without changing the filter structure (e.g., the size of involved matrices).

##### 5.3. IMU-Based State Prediction

Generally speaking, at time index , in the absence of state noise, is expressed as a function of the previous state , the pedestrian velocity , and the nonbiased or biased compensated heading, as follows: For instance if , then is fully accounted by with the inertial refreshment rate.

As already pointed out, the INS on its own suffers from heading drift because of the integration of the biased angular rate measurement from the gyroscope. Hence, the heading angle determined in Section 4.3 is biased by . Thus, as one further variable to estimate, we incorporate the latter bias in the modeled EKF state , whereas the control input is composed of the data issued from the INS, namely, (i.e., assuming is sufficiently representative for ). Therefore, in our filter the state prediction at time step is given by with the inertial refreshment rate and hence the Jacobian of computed at the predicted state is as follows:

Note that the state covariance prediction also runs at the inertial rate . The state covariance matrix is supposed to be time invariant. The update part of the EKF is achieved only when new radiolocation measurements are available. Thus changes only at the update step, and holds the same value during the prediction step.

##### 5.4. Filter Update with Combined T(D)OA Measurements

At the * base station* (BS) (), TOA-based measurements are performed through ED (see Section 3.2), along with an approximation for the corresponding standard deviation (see Section 3.3). It is also assumed that the receivers are synchronized, but independently of the mobile transmitter, so that all the pseudo-TOA measurements are biased by a common unknown delay.

Filter complexity may be critical with respect to both hardware and software capabilities, especially when many available measurements must be processed in realtime. It is hence worth noting that the most significant part of EKF complexity results from filter gain computations, which directly depend on the size of the observation vector. Hence we consider using the combination-based observations proposed in [19] to reduce the number of observation functions from the number of anchors (or measurements) down to the dimension of the location problem with no precision degradation.

For instance, let be an combination matrix generating observation functions used in filter update (out of measurements), as follows: where is a vector composed of the pseudodistances between the mobile node and the 4 BSs, and it is reminded . Thus, the same position accuracy as that of the standard TDOA formulation can be preserved using CTDOA, as follows: where with , if is an vector of ones and the diagonal covariance matrix of TOA-based measurements, whose diagonal is practically composed of (. Note that similar expressions are available in [19] for the simpler CTOA formulation.

Using such combinations, it is still possible to remove outliers with the indicator , while maintaining a constant filter architecture as a function of time. Within a more classical filter formulation, this would lead to change the size of the filter gain matrix, whereas this size is constant in our proposal and only dependent on . Practically, so as to discard the measurement, the only operation consists in replacing in the diagonal element by when computing . Finally, the problem of choosing the best referential BS among the BSs providing nonoutlier measurements does not exist anymore here, unlike in the classical TDOA formulation.

#### 6. Indoor Experiments

##### 6.1. Experimental Setup

Real-life tracking experiments were carried out in a typical indoor environment at CEA-Leti Minatec premises, as shown on Figure 6. A pedestrian followed a m-long round-trip path in two rooms and a corridor, referring to visual markers on the floor. The pedestrian was holding an IR-UWB transmitter in his hand and an IMU was attached to one of his ankles, in compliance with the fusion scenario and system architecture considered so far. We used inertial data from a 3A3G IMU at the sampling rate Hz to estimate the heading and the waist average velocity. The latter was obtained by smoothing the online ankle velocity with a low-pass filter whose impulse response is s-width rectangular. Without loss of generality with respect to TOA resolution or fusion concepts, the considered IR-UWB transmitter emitted in the band GHz for implementation convenience at the *pulse repetition period* (PRP) of 1 s. UWB antennas were connected to the four synchronous channels of a GHz-bandwidth *digital storage oscilloscope* (DSO) using cables with N/SMA connectors, serving as four surrounding BSs in Room A. The DSO, enabling signal acquisition at Gsps, integrated a PC for postprocessing. Relying on this setup, after averaging the incoming signal over sequences of successive pulses to increase SNR before energy integration, the received signal was stored for consecutive time intervals of ns (i.e., for the duration of the ED observation window). Then the ED-based TOA estimation method described in Section 3.2 was applied to the acquired signals. At the emulated receivers, the observation window was divided into resolvable time bins of length ns (i.e., approximately the pulse width). The four DSO Rx channels were triggered synchronously but affected by the same unknown delay, which could vary from one acquisition to the next, hence requiring CTDOA as observations in the tracking filter, as previously mentioned. Finally, the UWB acquisition rate was not constant, and reached to acquisitions per second, which is much lower than the inertial sampling rate.

**(a)**

**(b)**

##### 6.2. Evaluation Procedure and Algorithms Benchmark

Five different estimator settings, depicted as so-called “scenarios” in the following, were tested and compared: (i)scenario : EKF tracking with IR-UWB only, using in the observation vector CTDOA combinations of the 4 available pseudo-TOA measurements, with a time-invariant covariance matrix (i.e., being the time step whenever an UWB observation is available, with a multiple of ); (ii)scenario : idem as scenario 1, except the covariance matrix that is adjusted depending on (using the tabulated function in Figure 4); (iii)scenario : idem as scenario 2, with additional outliers detection with ; (iv)scenario : tracking with a stand-alone INS only; (v)scenario : EKF tracking fusing IR-UWB and INS data as in Section 5, that is, idem as scenario 3, with INS outputs controlling the filter state prediction (with predictions of the IMU frequency and updates whenever UWB measurements are available).

For scenarios 1, 2, and 3 (i.e., UWB only) the state vector of the corresponding EKF just contains the D cartesian coordinates and velocities , with a classical linear transformation matrix where is the time elapsed between samples and at the UWB subsystem. Scenario 4 being without UWB (i.e., stand-alone INS), only the state prediction step is applied, and the heading bias is systematically set to zero.

##### 6.3. Performance Indicators

To assess and benchmark the performances of the proposed tracking schemes, two kinds of metrics are used. First, we consider an error distance between each estimated position issued at the filter output at time step and its nearest orthogonal projection onto the actual trajectory This instantaneous distance error is finally averaged into , over specific portions of the trajectory or geographic areas (i.e., over selected but contiguous sequences of ).

As for the second performance indicator, we consider the difference between the real and estimated traveled distance (over the entire back and forth trajectory), normalized by the real traveled distance. This second relative performance indicator gives an idea about the uncertainty on the overall trajectory length, what could be interesting in several applications besides navigation (e.g., sports analysis, activity monitoring in physical rehab or as dietetics support, etc.).

##### 6.4. Results and Discussion

Relying uniquely on the heading angle and on the pedestrian velocity (i.e., after smoothing the ankle velocity ) shown on Figure 7 as a function of time, the tracking performance of a stand-alone dead-reckoning INS (i.e., scenario ) is illustrated in Figure 8(a). The drifts affecting both the estimated heading and position clearly justify the use of a side IR-UWB subsystem here.

**(a)**

**(b)**

Figure 8(b) shows the estimated trajectory obtained within scenario 5, where the location drift is now significantly reduced. The dashed part of the trajectory refers to the first part of the walk (i.e., the one-way portion of the trajectory forth). Noting that and are in NLOS configurations in this first portion, thus the mobile could not properly correct the position drift until it gets sufficiently good pseudorange estimates. Consequently, the mobile gets closer to the real trajectory in the middle of the scene, and it even sticks to the real trajectory for the remaining part of the walk (see e.g., the trip back in straight lines).

For each scenario, Figure 9 shows the average location error over the entire trajectory and in each room separately. The error is particularly large in Room B for a single IR-UWB radiolocation system (i.e., scenarios 1 to 3). This is due to the combined harmful effects of generalized NLOS links (with respect to the four BSs simultaneously), to body shadowing, and to poor GDOP conditions. Comparing for the three first scenarios in Room B specifically, one can notice that using an adaptive observation covariance matrix (scenario 2) and removing further outlier measurements (scenario 3) clearly help to reduce the error. In more favorable areas, for example in Room A (with at least one LOS links systematically) or in the corridor with light NLOS conditions (i.e., through plasterboard walls), even in the first scenario, the IR-UWB system alone would slightly outperform the INS system alone. As expected, fusing the two subsystems (scenario ) reduces systematically the overall error, even if the enhancement is far more spectacular in Room B in comparison with both scenarios 3 and 4. These results open the floor to parsimonious fusion schemes, where one could switch from a stand-alone subsystem into the complete fusion-oriented system on demand, depending on the operating conditions, hence saving energy and complexity at the price of slight performance degradations.

Finally, Table 1 shows the average estimated distance per actually traveled meter. This takes into account the distance between consecutively estimated locations. Due to NLOS situations, a tracking system only based on IR-UWB would tend to overestimate the traveled distance mostly because of occasional but strongly biased TOA-based measurements, which lead to nonstraight and more erratic estimated paths. Omitting measurement outliers, the estimated traveled distance is significantly reduced, but still rather large. The use of INS then enables us to reduce significantly this error.

The previous results illustrate the complementarity of the two subsystems and the potential of the proposed fusion scheme, under the architectural constraints of noncollocated and asynchronous sensors. The IR-UWB part of the system tends to correct the heading drift and resolves the growing error of the INS in time, whereas the INS part helps IR-UWB radiolocation in generalized NLOS situations and/or in penalizing mobile locations that would experience bad geometrical configurations with respect to the BSs (e.g., in room B). Finally coupling both systems enables reliable and robust pedestrian tracking with an almost uniform quality all over the scene.

#### 7. Conclusion

In this paper, we have addressed the problem of pedestrian tracking by coupling an IR-UWB transmitter handset with an ankle-mounted INS device. One motivation to couple these subsystems was to overcome their respective limitations in harmful operating indoor environments, while benefiting from their complementary capabilities.

The raw measurements of each subsystem have been carefully studied, and new preprocessing techniques have been proposed before applying hybrid data fusion techniques. Regarding IR-UWB first, TOAs are estimated at low complexity energy-based receiver following a Bayesian approach. A new practical criterion predicting the standard deviation of estimated TOAs and a methodology to identify outlier measurements have been proposed. Real channel measurements have been carried out and exploited to validate a few statistical multipath parameters. As for INS, in order to limit the drift due to noise integration and avoid magnetic disturbances, we have considered one ankle-mounted IMU with a 3-axis gyroscope and a 3-axis accelerometer, whose measurements are processed to determine the pedestrian average velocity and biased heading. This ankle-mounted INS (when considered with a UWB Tx handset) obviously imposes further challenging constraints in terms of system architecture, while operating with noncollocated and asynchronous heterogeneous devices, hence impacting in turn the fusion strategy.

To merge the heterogeneous data from both subsystems, we have proposed a specific EKF filter formulation. Combined time difference of arrival [19] of nonoutlier TOA measurements are used as observation inputs to the filter, the observation covariance matrix being dynamically scaled thanks to the new practical TOA estimation quality indicator. The use of CTDOA enables to remove outliers without changing the low complexity structure of the tracking filter. The velocity and heading estimates issued at the IMU are taken into account into the filter during the prediction phase. This option enables us to take benefit from zero velocity information during the stance phase, as well as to operate under different sampling/refreshment rates for the IMU and IR-UWB subsystems.

Experimental results illustrate the complementarity of the two subsystems and the efficiency of the proposed fusion scheme. In particular, the IR-UWB part corrects the heading drift and resolves the growing error of the INS as a function of time, whereas the INS part advantageously assists IR-UWB radiolocation in generalized NLOS situations and/or in penalizing mobile locations that would experience bad geometrical configurations with respect to the BSs. Overall, coupling both systems enables reliable and robust tracking with uniform quality of service over the scene.

Finally, the possibility to apply parsimonious fusion schemes, hence saving energy and complexity at the price of slight performance degradations, has been pointed out and briefly discussed. Accordingly, one could switch from a stand-alone subsystem into the complete fusion-oriented system on demand, depending on the operating conditions (i.e., while experiencing generalized NLOS or generalized LOS, poor or favorable GDOP, etc.).

#### Appendix

Let be the quaternion that denotes the orientation of the “*X*” frame relative to the “*Y*” frame, then the associated rotation matrix is given in the following equation:

#### Acknowledgments

This work has been performed in the framework of the *WHERE* (ICT-217033) and *WHERE2* (ICT-248894) projects, which are partly funded by the European Union. The authors would like to thank also the people from Movea (http://www.movea.com/) for their precious technological support as regards to inertial systems. The authors also thank the reviewers for their valuable comments and suggestions.