#### Abstract

This paper presents a ranging receiver architecture able to timestamp IEEE 802.11b Wireless LAN signals with sub-100 picosecond precision enabling time-based range measurements. Starting from the signal model, the performance of the proposed architecture is assessed in terms of statistical bounds when perturbed by zero-mean additive white Gaussian noise (AWGN) as well as in case of multipath propagation. Results of the proposed architecture, implemented in a Field Programmable Gate Array-(FPGA-) based prototype, are presented for different environments. For AWGN channels, the prototype system is able to attain an accuracy of 1.2 cm while the ranging accuracy degrades in dynamic multipath scenarios to about 0.6 m for 80% of the measurements due to the limited bandwidth of the signal.

#### 1. Introduction

Despite the fact that Global Navigation Satellite Systems (GNSSes) cover nearly 100% of the planet, satellite-based localization is not available within buildings as the roofing and walls deteriorate the signal to a degree where an errorless decoding is no longer possible. Mounting pseudolites, devices transmitting the navigation signals, under the roofs are certainly not a valid solution, not only due to legal restrictions. The differences between indoor and outdoor localization are more substantial than just the received power. Radio propagation within complex environments, typical for indoor scenarios, are challenging for high-speed wireless communication, but even tougher for any form of localization service.

Many localization concepts (e.g., based on ultrasonic, electromagnetic waves, inertial sensors) have been proposed to bridge the gap between GNSSes and the lack of indoor locating systems. Nevertheless, for indoor environments, there is still no general satisfactory solution available as different key factors, such as low power consumption and high refreshment rate are incompatible. One major reason why indoor radio localization systems are way behind satellite navigation solutions is that the majority of all current wireless communication standards have not been designed with position determination in mind. These signals are often referred to as Signals of Opportunity (SoO). In theory, adding a localization service upon an existing standard is always possible. However, the key parameters like accuracy, reliability, or cost depend on the restrictions of the wireless standard. As a result, retrofitting a localization service to an existing technology might turn out to be highly complex as, for example, the integration of the Enhanced 911 service into GSM networks. The degree of complexity depends primarily on the measurement principles ranging from Received Signal Strength (RSS) to time or angle measurements, or a combination of these. Some principles require proprietary hardware, while others only require software modifications, but impose other restrictions, such as limited range or accuracy. As a result, there is no generally best solution to retrofit localization to a communication standard.

In particular, Wireless LAN (WLAN) seems to be an interesting candidate for localization as it is a widely accepted industrial standard deployed in billions of mobile devices in home and office environments. RSS-based WLAN localization is a thoroughly investigated subject, and still the positional accuracy can be considered rather poor. Time-based localization methods like Time of Arrival (ToA) or Time Difference of Arrival (TDoA) are attractive not only for satellite-based positioning, but also for indoor applications. The fact that the measured time is a linear function of the range makes it an ideal candidate even for larger distances in contrast to RSS.

The outline of the paper is as follows. Section 2 describes the localization process and technologies related to WLAN locating. In the following section, the signal model and common cross-correlation TDoA estimation methods are discussed. The proposed receiver architecture able to measure the ToA by means of timestamping is found in Section 4. Section 5 discusses the error due to multipath propagation and mitigation techniques for the presented receiver architecture. Subsequently, a set of these receivers is used in a hardware implementation to assess the performance under various conditions in Section 6. The conclusion summarizes the findings, and finally an outlook for further investigations is given.

#### 2. Related Work

##### 2.1. The Localization Process

Locating a target in a radio localization system is based on the exchange of signals between the target and a number of base stations with known positions. In a network-based localization system, the location of the target can be calculated directly from the received signals collected by a locating unit or by the mobile device in case of handset-based localization. This approach is called direct position determination (DPD) [1]. Similarly, if an extraction of ranging parameters from the received signals precedes the position determination step, locating is based on a two-step approach as shown in Figure 1. While DPD offers higher performance for low signal-to-noise ratio (SNR) conditions [2], the joint position estimation increases the computational complexity of DPD and therefore decreases its practical applicability.

The simulations performed by [3] showed that DPD performs nearly identical than the conventional two-step approach, and DPD can only provide improved positional accuracy in case of difficult situations such as obstructed propagation paths. In this paper, we assume the commonly used two-step approach. The second step, position determination, can be grouped in subtasks starting with parameter conversion (propagation models, fingerprinting), preprocessing (filtering, weighting, selection), localization (nearest neighbor search, Bayesian or probabilistic approaches [4, 5]) up to tracking algorithms (Kalman or particle filtering, Markov models, neural networks [6]). Nevertheless, all these sophisticated methods depend on the quality of the parameter estimates of the first step. In this paper, we focus on the parameter estimation process, particularly for time-based WLAN ranging systems.

##### 2.2. Related Work

RSS localization in WLANs is the most investigated solution as the measurement of the RSS is a mandatory feature in standard compliant WLAN devices. All that is required is to read the RSS information from the devices and collect it at the locating unit performing the localization based on fingerprinting or geometric approaches. One of the oldest RSS system is the RADAR system from Microsoft research [7] using fingerprinting techniques based on combining empirical measurements with signal propagation models. It can be considered as a network-based system as the APs collect the signal strength of all associated targets. In the recent past, a plethora of RSS propagation models have been proposed for WLAN modeling different environments, such as offices, parking lots, outdoor or industrial environments [8–10]. Despite all the effort and optimizations, the location accuracy of RSS solutions is poor, in the 2–20 m range depending on the environment and distance to the AP [11].

Time-based WLAN ranging has been analyzed by the authors of [12]. The basis for their approach is two-way ToA, where the target operates as interrogator sending ranging requests to the access point (AP). The round-trip time of the frame minus the turn-around time in the AP yields the time of flight in both directions. As neither the target nor the AP has any capabilities to perform accurate time measurements, the authors propose to use software timestamping within the Linux Kernel of the operating system (OS) in the AP and the target. Software timestamping is influenced by various jitter sources in the OS such as interrupt latency, bus arbitration, and therefore the measured round-trip times suffer from a large variance. With software modifications, the authors measured an average round-trip time of 269.83 *μ*s with a standard deviation of 371 ns (corresponding to 111 m).

Even higher accuracy can be achieved directly in the physical layer. A WLAN mechanism exploiting the Request to Send/Clear to Send (RTS/CTS) handshake mechanism for two-way range measurements is presented in [13]. The authors have built a printed circuit board (PCB) supporting hardware timestamping, which starts a counter when transmitting an RTS frame. The counter stops when the CTS response frame from the target is received. Despite some concurrence and interference issues, which are mitigated through filtering and selection, the system is bound by the 22.72 ns resolution of the counter running with 44 MHz. To decrease the large variance of the measurement, robust linear regression techniques are used, averaging over 50 measurements. Even closer to the physical layer are methods processing the sampled baseband signal, for instance, methods based on dirty templates [14, 15], where a previously received signal is used as a correlation template. Alternatively, Generalized Cross Correlation-(GCC-) based ToA estimation methods are used for WLAN as well [16].

As alternative to cross-correlation, some authors have proposed to use subsample interpolation methods belonging to the group of super resolution methods, such as the Estimation of Signal Parameters via Rotational Invariance Technique (ESPRIT), Root-MUltiple Signal Classification (RootMUSIC), or Matrix Pencil (MP) [17, 18]. Others prefer to estimate the ranging parameters in the frequency domain, such as the Prony method used by [19]. Compared with the matched filter approach, which is optimal in AWGN environments, these algorithms offer improved accuracy in multipath environments when the channel complies to a certain known model.

The Global Positioning System (GPS) also belongs to the group of time-based systems and can be considered similar to ToA WLAN localization. However, the properties of GPS are specifically designed for ranging applications, whereas WLAN was designed as more or less a cable replacement. The largest difference is that the GPS C/A code is permanently transmitted with a chip rate of 1.023 MChips/s and a data rate of just 50 bit/s [20]. In contrast, IEEE 802.11b WLAN employs a Carrier Sense Multiple Access (CSMA) scheme with 11 MChips/s and a data rate of up to 11 Mbit/s. As WLAN devices do not send constantly, they can naturally only be located during the transmission of frames. The received signal power and therefore the signal-to-noise ratio (SNR) in GPS are 60 dB and more lower than in WLAN, which is compensated in GPS by length of the Gold code used for signal correlation. In total, the higher chip rate of WLAN suggests that localization in WLANs should be even more accurate than GPS given the same channel conditions apply.

##### 2.3. Previous Work

The work presented in this paper is based on previous investigations extended in terms of accuracy, multipath mitigation, and a detailed analysis of impairments. In [21], we described the basic measurement principle of WLAN ranging as a piggy-back solution to an existing WLAN transceiver architecture. Finally, we designed a stand-alone transceiver solution based on the Maxim 2822 transceiver chipset. The design considerations for the hardware platform are described in [22]. A predecessor of the receiver architecture achieving nanosecond precision is described in [23]. In all papers, we consider a network-based TDoA solution, where synchronized base stations capture the wireless signal and process it locally to generate ToA timestamps. These timestamps are submitted (together with frame identification information) to a central locating unit, which groups them by the unique frame check sum and performs parameter-based position determination, the second step of Figure 1. Taking timestamps in the base stations requires a priori knowledge of the modulation and frame format. Knowledge of the used modulation is required to decode the frame and to optimize the estimation algorithm. The structure of the wireless frame is necessary to detect a unique position within the frame (epoch) to take a timestamp. As the epoch is agreed among all base stations, the difference of two ToA timestamps yields a similar estimate as correlation-based approaches. The necessary synchronization of the base stations is only briefly discussed within this paper, different approaches based on wired and wireless synchronization can be found in [24]. As outlined in Section 6.1, all base stations are assumed to be supplied by a common clock sourced from an Ethernet switch.

#### 3. Signal Model and Range Estimation Techniques

##### 3.1. Signal Model

The data of a WLAN frame, according to the IEEE 802.11b standard [25], is modulated using either Differential Binary Phase-Shift Keying (DBPSK) or Differential Quadrature Phase-Shift Keying (DQPSK) in combination with Direct Sequence Spread Spectrum (DSSS) or Complementary Code Keying (CCK) baseband modulation. Common to all encodings is that the transmitted baseband signal can be expressed by a sum of periodic pulses as given in

describes the baseband signal neglecting any modulation to an RF carrier, is a real-valued baseband pulse, is the chip rate, is the sequence of complex-valued chips, and refers to the chip counter. In particular for WLAN, the chip rate is 11000 000 , the alphabet of is in case of BPSK, and in case of QPSK. Given that the chip sequence is wide-sense stationary, the transmitted signal is a wide-sense cyclostationary process with period . The cyclostationary property is obvious as the generated sequence has the same statistical properties as a time-shifted process for any arbitrary integer . The actual shape of the baseband pulse is not specified, but only the spectral mask for the transmit pulse in subsection 15.4.7.4 of the 802.11b standard [25]. That is, the side lobes 11 or 22 MHz apart from the carrier frequency must be suppressed by −30 or −50 dB with respect to the carrier.

The baseband signal is modulated to the RF carrier and demodulated again in the receiver. If the carrier frequencies in the receiver and the target are sourced from different oscillators, the carrier frequencies are not exactly the same as each oscillator is subject to different tolerances and jitter. The frequency of an oscillator can be modeled as the sum of the nominal frequency , a frequency skew , and a random term as shown in Hence, the received signal includes also a residual carrier frequency skew with and the frequency skews of the transmitter and receiver with respect to the nominal frequency . The received signal delayed by and with a phase offset and skew in presence of AWGN noise can therefore be described by

Depending on the length of the transmitted frame and the joint stability of and , the residual carrier frequency may be considered constant or variable with time. For the latter case, some kind of carrier recovery algorithm has to constantly track and remove it. The removal of the residual carrier frequency is commonly termed carrier wipeoff. The IEEE 802.11 standard requests that the oscillator used for generating the carrier may deviate 20 ppm from the nominal frequency; hence, each carrier recovery algorithm must be able to deal with up to 49 kHz frequency skew from its nominal value.

Not only the carrier can be subject to minor frequency skews. The chip rate of receiver and transmitter may slightly vary due to oscillator tolerances. As the chip rate is considerably lower than the carrier frequency by a factor of more than 200, the impact of imprecise clocks on the receiver performance is far lower. As 20 ppm chip frequency skew is equivalent to a time stretch of 20 ns/ms, it cannot be neglected for WLAN ranging applications with typical frame durations in the millisecond range. It should be noted that the chip and carrier frequency skew may be correlated if sourced from the same physical oscillator. In this case, the carrier skew estimate can be used to calculate the chip skew as the carrier skew estimate is about 100 times less noisy than the chip skew estimate as shown by [26] for GPS.

##### 3.2. Range Estimation Based on Cross-Correlation

The estimate (the hat notation is used to indicate that the variable is an estimate) for the propagation time difference of a signal transmitted by a target and received by two synchronized base stations in a TDoA system can be calculated by cross-correlation of the two received waveforms. Consider two base stations receiving a continuous time signal with different zero-mean noise realizations uncorrelated to . Let the signals have different amplitudes and phases , then the signals can be modeled as with the baseband signal (see (1)). Then, the TDoA estimate can be found by maximizing the cross-correlation function by

This approach works without the knowledge of the transmitted data or modulation, given the transmitted signal's autocorrelation has a significant peak. Hence, continuous wave signals cannot be used as their autocorrelation is periodic.

The performance of the estimator can be improved by passing the signals through a filter with matched frequency response ; this method is referred to as Generalized Cross-Correlation (GCC) [27, 28]. As the correlation and filtering are computationally simpler in the frequency domain, the TDoA estimation is often calculated using the Fourier transform as shown in (6) with the Fourier transform of :

Apart from cross-correlation approaches neglecting the DSSS modulation characteristics, Cyclic Cross-Correlation (CCC) approaches can be applied to the baseband signal as well. CCC algorithms exploit the fact that the autocorrelation function of a cyclostationary signal is periodic and can therefore be expressed by a sum of functions with cycle frequencies as [29] While the GCC method is simpler, Gardner and Chen [30] have shown that the CCC method yields better estimation results (lower variance) based on a Monte-Carlo analysis. One particular issue with CCC is that the cyclic cross-correlation is dependent on the sum of the time delays and a search through all possible delay values has to be performed to find the correlation peak [31]. Teplitsky and Yeredor [32] have extended the previous work and have proven analytically that CCC is in general better than GCC when comparing the lower estimation bounds by introducing a cyclic-correlation-based CRLB.

Although cross-correlation is a feasible instrument for estimating the TDoA in the baseband (e.g., for audio applications like [33]), the presence of carrier and chip frequency skews imposes restrictions to correlation-based methods. If the carrier frequencies among receivers are not phase-locked, then the residual frequency skew after the mixer is not the same in all receivers; the s, modeling the phase and frequency skew, in (4) are then receiver dependent and time variable. Hence, the signals are cyclostationary and the cross-correlation depends on the actual frequency skews. TDoA estimation using GCC (assuming stationary signals) is therefore no longer applicable unless the residual carrier frequency skew is removed in advance. Although frequency-locked loops exist that remove the carrier frequency skew for any kind of signal, such as the delay-and-multiply carrier recovery [34], Classen and Meyr showed that these algorithms provide only a poor frequency estimate [35]. Hence, carrier synchronizers neglecting the chip timing information (also called ND algorithms [36]) can only be used at high SNR, thus limiting the possible range for locating systems.

Even in presence of carrier frequency skews, CCC methods can still be used. However, the introduction of carrier frequency skews increases the complexity by adding one search dimension. As the chip rate of common WLAN devices is uncorrelated to the carrier frequency, the estimation problem is extended by another dimension.

A particular issue with distributed signal capturing and centralized correlation methods is the transport of the sampling data to the correlation unit. If the system is intended to capture the data continuously, then a high-speed connection is required. For WLAN, a sample rate of at least 44 MHz is required, and given that each sample contains 8 bits, it requires a connection with at least 352 Mbps per receiver. Hence, for a simple scenario with 4 receivers, even a 1 Gbps connection to the locating unit is not sufficient. Due to these substantial drawbacks of correlation-based localization in WLANs, this paper proposes to perform ToA estimation in each base station and to calculate the position based on the TDoAs (from the ToAs) in the locating unit. Hence, the TDoA estimation is factored into ToA timestamping in each base station instead of a joint estimation of the TDoAs using correlation methods.

#### 4. Proposed Receiver Architecture

In the previous section, we have identified that the localization of a target by means of timestamping is an attractive approach. Timestamping requires that all base stations use a commonly agreed algorithm to capture the instant when a frame is received at a base station, which can range from stopping a counter, when a certain part of the frame is received (like in [13]), or more sophisticated approaches. To outline our approach, let us assume a standard receiver architecture as depicted in Figure 2, where the received signal from the antenna is amplified, bandpass filtered, and mixed with the carrier signal generated from a local clock source. Due to the inevitable residual carrier frequency skew (caused by oscillator clock skews and Doppler shifts), it must be wiped off by a carrier numerical controlled oscillator (NCO). The code NCO regenerates the code replica and wipes off the spread code using a multiplier followed by an integrate-and-dump block.

In many DSSS receivers (e.g., GPS chipsets), the code NCO is purely binary, and therefore the NCO only outputs a sign function simplifying the multiplication to an adjustable inverter. The drawback of this approach is that the alignment of the code replica is quantized by the clock period of the NCO. In this case, the clock period of the NCO should be significantly shorter than the chip period of the received signal. For instance, the GPS receiver presented in [37] uses a 20 MHz NCO frequency for the GPS chip rate of 1.023 MChips/s and generates ToA estimates with 50 ns quantization. Carrier and code NCO are driven by discriminators, which compute an error signal and steer the NCOs. Algorithms aligning the code replica to the received data are termed clock synchronizers or timing recovery, whereas algorithms aiming to remove the carrier frequency skew and phase offset are commonly called carrier, frequency or phase recovery. Since the rise of digital communication, synchronizers have been a well-established field of research covered by dozens of publications and books, such as [34, 36, 38, 39]. As a result, a multitude of clock and carrier synchronization algorithms have been created.

##### 4.1. Fractional Delay Ranging Receiver

For the proposed ranging WLAN Fractional Delay Ranging Receiver (FDRR), we assume that the received signal is sampled using a fixed local clock with period shared among all base stations. After passing the signal through a matched filter compensating for the Channel Impulse Response (CIR) and in presence of noise , the phase offset , and the frequency skew , the received baseband signal sampled at time can be described by

Despite the dependency of the baseband pulse shape and data chips , the received signal is dependent on three other parameters, which have to be estimated by the receiver: the fractional delay , the phase offset , and the frequency skew . These are unknown but deterministic and can be estimated by either a joint parameter estimation or by a separate estimation of each parameter. The former will result in the Maximum-Likelihood Estimation (MLE). The variance of the MLE is bound by the Cramer Rao Lower Bound (CRLB) [38, 39], the inverse of the Fisher information matrix. We assume that these terms are uncoupled, and therefore each parameter can be estimated separately without degrading the estimation. For ranging based on the baseband code, the interesting term in (8) is the fractional delay , estimated by the synchronizer, as it describes the delay of the received signal with respect to the local clock.

This leads to the modified receiver architecture shown in Figure 3. It differs from Figure 2 by the aspect that the NCOs have been included into the carrier and code wipeoff blocks (timing and phase recovery) and that the code synchronization is factored into a chip and spread code synchronization. This split is motivated by the fact that WLAN does not use the same spread code for all data rates, but the 11 bit Barker sequence for the 1 and 2 Mbps mode and CCK together with an increased symbol rate of 1.375 MS/s for the higher data rates.

The timing recovery calculates the fractional delay estimate and finally interpolates and decimates the input signal with rate to the chip rate of the transmitter . We have selected the squaring timing recovery belonging to the group of non-data-aided synchronizers generating a spectral line at the chip rate and multiples of it. It has been shown in [36] that the signal transitions of the chips create a cyclostationary process with period at the output of the squarer, that spectral line at the chip rate contains an estimate for the chip timing. The argument of the Fourier coefficient of the squared magnitude of yields the unbiased estimate for the chip timing. It can be calculated by with denoting the received sampled signal, the number of samples per chip, and the number of chips to average. The estimation is the MLE under the assumption that the sampling rate is at least twice the required sampling rate of , is an integer, and the baseband pulse is assumed symmetric and real-valued ().

The residual frequency and phase offset are wiped off in the timing-aided (D) and data-aided (DA) carrier and phase recovery block, where the aiding information is provided by the receiver control unit. Other estimated parameters, such as the carrier phase and frequency, and , the equalizer coefficients , the DSSS lock-in position , and the carrier lock-in position are also marked above the corresponding blocks in Figure 3. A summary of the resolution and ambiguities of these estimates is given in Table 1.

The baseband code synchronization is factored into the timing recovery and correlator. The correlator lock-in position is required to determine the correct chip for timestamping. As the evaluation of together with still yields an ambiguity of 1 *μ*s (the length of the Barker DSSS sequence), the epoch is required to find the unique position within the frame. The epoch is agreed among all base stations and is found by decoding and analyzing of WLAN frame format (e.g., the first chip of the frame header).

The term resolution in Table 1 needs some further explanation. Consider the estimation of using a squaring synchronizer based on (9). As the fractional delay of the received signal with respect to the local sampling clock is available as a numerical term, it is not bound by any quantization, and therefore the resolution can be arbitrarily high. If we would simply monitor the arrival of a certain chip (epoch), the resolution would be bound by the duration of a chip (91 ns) or by the clock period of the receiver hardware, given that we mark a certain clock cycle as the arrival of the epoch. In the proposed FDRR, the fractional delay estimate is propagated through the entire receiver and therefore the resolution is preserved.

However, there is a common misconception by several authors (e.g., [40–42]) that the ranging resolution is in general bound by the clock frequency of the timing source or receiver architecture. For instance, Bensky states on page 90 of his book [40] that a 1 m range or 6.67 ns time resolution requires a 150 MHz clock. Yet, for some particular receiver architectures, the resolution constraint is indeed valid. For instance, when evaluating the cross-correlation peak of a sampled signal, the resolution is limited, as the correlation peak cannot be narrower than a single sample. However, given that the sampling theorem is fulfilled, all required information is still contained in the sampled data. Different approaches have been proposed to mitigate the resolution constraint. Increasing the clock rate is one possible solution to increase the resolution, interpolation as described in [43] is the other solution. Our approach can be classified as an interpolating approach using CCC and exploiting the cyclostationarity at the chip period of the WLAN signal. The full signal decoding is required to find the epoch, to identify the target, and to group different frames belonging to the same target.

Until now, we have considered taking a single timestamp at one specific instant in a frame, the epoch, and submit it to the locating unit calculating the position based on the timestamp set from different base stations. It is known from estimation theory that averaging as many realizations of a stochastic function as possible decreases the estimation variance by the improvement of the test statistic. For ranging, the estimation should be averaged, if possible, over the entire frame requiring very narrow bandwidths in the timing recovery. As a solution, we propose a two-step approach. The loop bandwidth in the timing recovery is optimized for chip synchronization, while the noisy outputs are filtered in a separate timestamping unit after the receiver control unit using a linear regression model. The timestamp optimization using linear regression, as described in the Appendix A, enables the calculation of the MLE timestamp for the center of each wireless frame.

##### 4.2. Theoretical Bounds

The performance of various ranging estimators can be assessed by statistical bounds, which are used for comparison in Section 6. In the field of estimation theory, there exist a number of such bounds, one of them is the CRLB defining the lowest bound for the variance of an unbiased estimator. The CRLB is based on the assumption that the Probability Density Function (PDF) of parametrized on obeys the regularity condition, that the expectation of the derivative of the log-likelihood of is zero as stated in

Then, the CRLB defines a lower bound for any unbiased estimator as where the derivative is evaluated at . When the CRLB is applied to the ToA estimation problem, the ToA variance with the propagation speed is bound by [44] where is the range estimate and SNR refers to the signal-to-noise ratio. represents the effective bandwidth of the signal, which is the square root of the second moment of the spectrum, expressed by the Fourier transform of the pulse, , normalized over the energy of the signal by

Increasing the effective bandwidth or the SNR improves the ToA ranging variance. This objective can be achieved by using a large signal bandwidth as in Ultra-Wideband (UWB) communication systems. For WLAN with raised-cosine pulses (e.g., with a roll-off factor of 50%), the effective bandwidth is 2.96 MHz and therefore the variance can only be decreased by increasing the SNR.

#### 5. Multipath

##### 5.1. Multipath Modeling

Multipath propagation is present when the transmitted signal arrives at the receiver via multiple echoes. In contrast to interference with AWGN, the multipath components (MPCs) have similar statistical properties and are correlated to the direct signal. If the number of MPCs is , the received signal can be written as the sum of weighted and delayed transmit signals in form of a tapped delay line model perturbed by zero-mean AWGN by The complex-valued channel coefficients are represented by , and the path delays by . The CIR decays for long delays due to the path loss. The terms, where the channel coefficients are nonzero, define the multipath spread. If the target, the base stations, and the Interfering Objects (IOs) are static, then the channel coefficients can be considered time-invariant. For practical applications with moving objects, this assumption does not hold and the time variability of the channel coefficients must be taken into account [45].

When multipath is present, the tracking loop of the timing recovery locks on the composite signal consisting of the line-of-sight signal (LOS) and the MPCs as the receiver is unable to differentiate between this perturbed signal and the desired signal. An MLE receiver tries to find the ToA by maximizing the cross-correlation function of the received signal with the stored template (e.g., by means of a matched filter). In presence of multipath, the cross-correlation function is no longer identical to the autocorrelation function (ACF) , it is rather a weighted sum of shifted ACFs as

Hence, the cross-correlation function becomes a nonsymmetric function with a correlation peak that may be shifted with respect to the correlation peak of the direct signal. The estimation error due to the distortion is known as multipath error. It is dependent on the current channel coefficients (amplitude, phase, delay), which are defined by the propagation conditions. As the phase of the channel coefficients changes for movements as small as the carrier wavelength, the error is not predictable by a receiver in the vicinity. This small-scale multipath effect is well known for GPS and addressed by a number of mitigation strategies such as modified synchronizer discriminator functions (e.g., narrow correlators for delay-locked loops [46]).

In spite of the limited realism, it is a common model to describe multipath propagation by the two-path channel model, where the CIR consists only of two terms, the LOS signal and a single multipath component. We consider the CIR as time-invariant, which represents the case of zero Doppler spread or infinite coherence time, respectively. This setup is, for instance, present if all transmitters, receivers, and IOs are static. Without loss of generality, we assume in the two-path scenarios that the LOS signal has unity amplitude ( in (14)), zero phase, and zero delay . The simplified CIR is therefore

##### 5.2. Multipath Ranging Errors

Even for the two-path channel model, three parameters influence the multipath error: the amplitude , the phase , and the delay .

Let us recall that a spectral-line generating squaring synchronizer, as used by our proposed FDRR, squares the absolute of the signal and calculates the Fourier coefficient . The argument of times yields an unbiased estimate of the chip timing, with the chip period and the composite received signal as

If we insert the composite signal , generated by convolution of the direct signal with (16), into (17) and exploit the cosine formula, the multipath error can be derived as the difference between the estimates of the composite and direct signal by

It can be seen that the multipath error depends not only on (amplitude and phase) and , but also on the received signal itself. As the squaring synchronizer works in non-data-aided (NDA) mode, it operates on the plain received signal without taking any symbol decisions in the receiver into account. is cyclostationary with the chip period given that the alphabet sequence used to generate is uncorrelated (see (1)). A useful interpretation for wide-sense cyclostationary signals is that the ACF can be expressed by a sum of functions with cycle frequencies [29] (cf. (7)). As the ACF of a cyclostationary signal is insensitive to a shift , the th Fourier coefficient of the ACF can be calculated by setting as

The Fourier coefficients are also referred to as cyclic autocorrelation functions [47]. Using (19), the timing estimate of (17) evaluates for and . Hence, the multipath error can be rewritten as

For certain multipath amplitudes, the error can be depicted in form of a ranging error envelope by varying the multipath delay and selecting the phase that generates the largest multipath error. The ranging error envelopes for 10, 30 and 50% multipath field strength are depicted in Figure 4. The graph was generated based on a simulation using uncorrelated data encoded with BPSK baseband modulation, filtered by a raised-cosine filter with 50% roll-off factor, and the WLAN chip period of 91 ns. The -axis depicts the multipath delay, and the -axis shows the multipath ranging error. Positive multipath errors arise in case the MPC is inphase with the direct signal as the delay shifts the correlation peak into the future. Negative multipath errors are created by destructive interference. It can be observed that destructive interference causes large ranging errors, even if the multipath delay is rather short. These results for a single path, corresponding to ToA ranging, can be applied for TDoA as well, if any two points within the error envelope are added. Thus, the TDoA multipath ranging error may be even twice as large.

The multipath error depends on the cyclic function evaluated at and according to (20). It can be seen that the multipath error is zero, whenever the delay matches a multiple of the chip period because the error terms in (20) are inphase with . Figure 5 depicts the same error as Figure 4 but also shows the impact of MPCs with delays larger than the chip period . The NDA timing recovery based on the cyclic autocorrelation function has the disadvantage that only the center term of (20) decays for larger delays. As a result, the MPCs with delays larger than the chip period still contribute to the multipath error as depicted in Figure 5.

The drawback that the multipath error can be influenced by far specular reflections is alleviated, if we consider that the extra time delay causes an additional path loss. In addition, depending on the environment, large smooth surfaces acting as specular reflectors are unlikely. In terms of resolvability, delays, which are longer than the chip period, can be resolved (e.g., by spectral estimation of the equalizer or pseudospectrum estimates of superresolution methods). As a consequence, we can conclude that the multipath ranging error is mainly induced by MPCs with short delays, which are unfortunately not resolvable due to the limited bandwidth and the lack of the pulse shape definition.

##### 5.3. Multipath Mitigation

Multipath can cause significant ranging errors in the order of several meters as the receiver tracks the composite signal. Despite particular antenna designs which attenuate the MPCs, multipath can also be tackled by signal and data processing using nonparametric, parametric, and averaging techniques. Nonparametric techniques use a modified receiver template, which rather tracks the derivative of the pulse shape than the pulse shape itself, such as a narrow correlator setup commonly used in GPS receivers. The modified template is designed to achieve a sharper correlation result that is less influenced by multipath propagation. Parametric techniques rely on a certain CIR model and estimate the nuisance parameters such as phases, amplitudes, and delays of the MPCs. A common approach within this group is superresolution techniques in the time and frequency domain [48].

A necessary assumption for all estimation methods is that the signal (or at least its autocorrelation and covariance) is either known in advance or the receiver is able to reconstruct this information. Whereas the digital information and signal constellation of each chip in the baseband can be reconstructed, the WLAN pulse shape is not defined and may vary from device to device. Hence, an estimation of the CIR is not possible because the transmitted signal cannot be reconstructed. On the other hand, nonparametric multipath mitigation techniques, such as narrow correlators with an early-late spacing of resulted in the same multipath error as the squaring synchronizer (Figure 4). The reason for the discrepancy with the multipath mitigation improvements reported by other authors (e.g., [49]) stems from the fact that commonly a triangular ACF (responding to the transmission of rectangular pulses) is assumed, while the bandwidth limitation of WLAN imposes a rounded (raised-cosine) ACF of the signal and therefore diminishes the advantages of special correlator arm arrangements.

Multipath is in particular the limiting factor for point positioning applications. When the channel is static and therefore the channel coherence time is infinite, the range estimate may include a constant ranging bias. In principle, multipath mitigation through averaging can be done by any kind of frequency or space diversity, which is able to change the CIR or minimize the coherence time. If the target changes the position by about 10 times the carrier wavelength, the small-scale fading effects and connected to these the multipath errors average. For WLAN, this equates to a displacement of more than 1 m. Spatial averaging is impractical as it prohibits point positioning and requires a constant movement to generate varying CIRs.

Frequency diversity is another option to mitigate multipath errors as different carrier frequencies change the CIR. The main disadvantage of this approach is that changing the WLAN channel causes a loss of connection, if not enforced simultaneously by the AP and all associated targets. Yet, we still propose this method as it is one of the few methods, which requires only software modifications of the target, namely, preplanned channel hopping. As shown in the next section, frequency diversity can significantly improve the multipath performance.

#### 6. Results

This section presents the results of the FDRR architecture introduced in Section 4 based on a digital logic implementation using an FPGA platform. Several measurements have been performed to assess the synthetic performance (perfect channel) and the practical performance in line-of-sight and multipath conditions.

##### 6.1. Measurement Setup and Platform

The proposed FDRR architecture has been implemented using the SMart integrated Localization Extension 3 (SMiLE 3) base station hardware, which is a revised version of our previous hardware described in [22]. It consists primarily of an RF mixer IC to convert the WLAN signal to the baseband, a dual-channel ADC/DAC, an FPGA for signal processing and message handling, and an Ethernet connection to communicate the captured ranging parameters to the locating PC calculating the positions. Two methods for synchronizing the base stations are available: Ethernet synchronization and Board-Link. Ethernet synchronization exploits the fact that the transmit signal of a 100Base-TX Ethernet link is synchronous to the source oscillator of the transmitter. When all SMiLEs are connected to a common switch, the base stations are able to recover the transmit Ethernet clock from the received Ethernet signal and can build a synchronous network with phase-locked sampling clocks . The same can be done when connecting two base stations with a dedicated clock cable named Board-Link. Still, as the startup time of all base stations is not synchronized, clock offsets between the base stations exist. These offsets can be resolved by transmitting a WLAN frame from a known position and compensating for the propagation time to the base stations. This step is referred to as clock offset calibration.

Various measurements have been conducted by deploying either two or four base stations for one- and two-dimensional setups. The RF mixer of SMiLE 3 platform has a dependency of the delay on the premixer and postmixer amplification by a few nanoseconds. In all measurements, these systematic errors are compensated by a table lookup.

##### 6.2. Synthetic Performance

First, we assess the synthetic performance of the hardware, when the RF signal between the base stations is connected via cables and attenuators as depicted in Figure 6. The setup consists of a transmitter sending frames with adjustable frame length and period to two base stations and . The transmitter's RF signal is fed into an 18 dB splitter, and each of the branches is connected by attenuators to one base station. Each base station captures the frame, timestamps it, and transmits the timestamp together with the frame check sum and the identification MAC address to the locating PC, which subtracts the timestamps and records the TDoA.

The measurement results for both synchronization approaches (Ethernet and Board-Link) for variable frame lengths are depicted in Figure 7. It is remarkable that the SMiLE 3 hardware is able to consistently achieve range measurements with a TDoA standard deviation slightly above 40 ps, equivalent to 1.2 cm. Note that the clock calibration removes not only the clock offsets, but also minor delay differences caused by the cabling.

The resolution ps of the fractional delay estimate (and therefore the timestamps) has been selected as of the chip duration as this represents a good compromise between resolution and logic demands in the FPGA. Given that the quantization errors in both receivers are uncorrelated, the standard deviation of the resulting triangular distribution is 36.24 ps. In Figure 7, this is shown as quantization bound. Compared to the hardware implementation of [13] with a quantization of 22.72 ns (equivalent to ns), our hardware implementation offers a more than 200 times lower standard deviation.

The GCC CRLB for chips can be calculated by (12). It assumes an SNR of 49.9 dB (the estimated SNR Figure due to the jitter of the ADC sampling clock) per chip, an effective bandwidth of 2.96 MHz, 11 chips per symbol (1 Mbps), and no correlation between the receivers. For this particular setup with high SNR, the CRLB imposes no real limit. Note that the performance is only impaired by the synchronization, in particular for short frames, as seen by the difference between Ethernet and Board-Link. The reason of the degraded Ethernet performance can be found in the large phase noise of the Ethernet clock distribution scheme.

When attenuators are inserted into the cabling between the target and the base stations, an increased distance and degradation of the SNR can be tested. For the experiment a frame length of chips has been selected. The graph showing the variance of each SNR value is depicted together with the CRLBs in Figure 8. The ranging standard deviation stays approximately constant at 60 ps for SNRs above 12 dB and slightly increases for lower SNR values until the signal can no longer be decoded. Interestingly, the measurement shows to be better than the GCC CLRB using (12). This may seem as a contradiction to the definition of the CRLB as lowest bound for the variance of an unbiased estimator, albeit the definition is in fact not violated.

The GCC CRLB applies to the cross-correlation of the entire baseband signal with a noise-free template in each receiver given that there are independent range measurement pulses. The low standard deviation even at low SNRs stems from the fact that the timing recovery exploits the cyclostationarity of the signal by calculating the argument of Fourier coefficient . Let the captured signal be , and let be defined by with samples (squaring synchronizer), then frequency bins can be calculated by the Fourier series expansion, of which only one, , is evaluated. Consequently, the Neyman-Fisher factorization (21) can be applied [50]. If the PDF can be factored into a function depending only on via , and is a function only depending on , then is a sufficient statistic for : For the estimation of the fractional delay, is a sufficient statistic. If the WLAN frame consists of independent samples and the noise is uncorrelated to the signal, then the frequency selectivity (by evaluating only ) improves the SNR by the same factor . Therefore, the ranging standard deviation improves by . As the averaging itself (without frequency selectivity) improves the CRLB already by , for a cyclostationary signal with pulses, appears twice in the denominator of the CRLB (12). Consequently, the CRLB is identical to a signal with times the bandwidth, or times the SNR as

This bound is depicted as CCC CRLB in Figure 8. The actual receiver design diverts from this approach as it uses an NDA timing recovery, which has an increased variance compared to a decision-directed timing recovery for conditions with an SNR below 7 dB in case of BPSK modulation, as shown by the closed-form evaluation in [51]. The CCC CRLB assumes perfect synchronization and no chip frequency skews. In practical implementations, chip frequency skews are always present, and therefore the loop bandwidth in the timing recovery must be sufficiently large to track frequency changes. As a result, the noise over a larger bandwidth needs to be collected for tracking reasons, thus leading to an increased variance.

##### 6.3. Performance in LOS Conditions

In all the following measurements, a Linksys WRT54GL wireless access point has been used sending beacons with chips with an interval of approximately 10 ms. The target is equipped with a standard dipole antenna with horizontal polarization, which ensures angular-independent uniform signal coverage. The target is moved within the area of interest, and the position of the target is calculated in the locating unit based on the parameter estimates captured by the base stations. Once the setup is built up, the initial clock offset calibration is performed and the same offsets are used for all positions of the measurement.

A two-dimensional medium-scale measurement has been performed on a flat meadow with four base stations positioned in the corners of a 36 m square. The square has been divided into sections of 6 m forming a grid. To assess the point localization performance, the target is located on each grid point and the position estimates are recorded. The base stations and the target are mounted on tripods with a height of 1.6 m to enable LOS propagation without the Fresnel zone touching the ground.

Figure 9 shows the true positions marked with crosses, and the measured positions marked with circles. It displays that the positions determined by the range estimates follow the true position with high accuracy. The empirical Cumulative Distribution Function (CDF) of the position error is shown in Figure 10. In 80% of all measurements, the distance between the true position and the measured one is below 0.81 m. The maximum error is 1.70 m.

An evaluation of the measured data shows that the TDoA RMS error of the unfiltered timestamps is 2.45 ns. As the TDoA standard deviation is only 390 ps, the largest contribution to the RMS error is the offset. Compared to the synthetic range measurements, the standard deviation is larger and a significant average offset exists. Offsets may arise due to positioning errors of the target and the base stations. These errors are inevitable as the markings of the positions on the meadow are accurate to approximately 10 cm. As shown in Section 5, large measurement errors may also originate from multipath propagation. A possible interpretation of the RMS error is that a small fraction of the signal is reflected from the ground leading to multipath-induced offsets.

##### 6.4. Multipath Performance

For the assessment of the multipath performance, we consider a one-dimensional setup as there is only a single range to be extracted, and the ranging and position error are independent of the geometric setup in contrast to multidimensional locating. This allows for a more intuitive interpretation of the results as only a single time error is present. Clearly, the setup can be extended to more dimensions as well.

In the experiment, the target is moved along the perpendicular path that joins the base stations and , positioned 6 m apart from each other. The setup is depicted in Figure 11 together with the hyperbolas for multiples of 5 ns propagation time difference. Given that the path is centered between the base stations, the hyperbola becomes a straight line and any movement along this axis should not change the TDoA. Hence, the measured range corresponds to the TDoA ranging error. The setup is placed in an office environment surrounded by cupboards, desks, chairs, and other typical furniture. The rooms within the office building are separated by drywalls.

The results of the unfiltered ranges for this experiment are shown in Figure 12. At the start of the measurement, the offset is calibrated, and, as a result, the TDoA bias is set to zero. When the target is moved on the perpendicular line, the propagation conditions change and the multipath error varies resulting in an error of up to 4 m. After 4200 samples, the movement of the target is stopped and the multipath error becomes static again. However, the error is not zero, but approximately −2.1 m. Two problems are associated with locating in multipath environments: the synchronization at the start of the experiment is already biased, and each range measurement includes an unknown bias as well. The synchronization bias is caused by the fact that the selected synchronization approach requires one device (the target) to transmit from a known position to calibrate the receiver offsets. Yet, this calibration may already introduce a bias by multipath propagation. Both biases can be partially mitigated by Frequency-Hopping Spread Spectrum (FHSS) through all available WLAN channels.

Figure 13 depicts the results of the FHSS TDoA measurements by the average error and ranging error envelope. The average error refers to the TDoA average of one FHSS sequence using 15 channels separated by 5 MHz with no other filtering applied. The TDoA error envelope is defined by the maximum and minimum TDoA of each FHSS sequence and can be understood as a bound for the multipath error when using a single channel. As in the previous setup, the measurement is calibrated at the start of the system. Yet, already at this point, the TDoA range varies from channel to channel by 2.6 m. During the movement of the target TDoA multipath errors of up to 12 m arise on some channels. The diversity introduced by FHSS leads to a reduced multipath error, in this particular setup the TDoA falls within −2.3 up to 2.0 m, which is a significant improvement compared to a single channel.

As FHSS over 15 channels takes 150 ms with 10 ms frame period, the question arises, if a similar performance can be achieved with fewer channels. If only 4 channels are evaluated (1, 5, 9, and 13), the TDoA ranging error is slightly higher with maximum range offsets of −2.8 up to 2.4 m. The empirical CDF for the unfiltered ranging error is shown in Figure 14 for FHSS using 4 or 15 channels as well as using only one channel (channel 1 at 2412 MHz). When using all 15 channels, in 80% of all measurements, the ranging error is below 0.59 m, and 0.79 m in case of 4 channels. When using no FHSS, for the same probability, the error is below 1.73 m. It should be noted that these results show a significant improvement compared to the WLAN locating system of [13], who reported a multipath error using a single channel of 4 m. Yet the results are not entirely comparable because the authors used a hardware with 22.72 ns timestamp resolution, and therefore the measurement required filtering of about 50 measurements.

Interestingly, multiple measurements in different environments showed that ranging errors using FHSS are significantly larger when the target is moving, while for stationary targets the FHSS-averaged ranging bias is always below 1 m. A possible interpretation is that fast-fading channels generate temporary ranging errors as the channel conditions change during the reception of the frame, while the equalizer is set into tracking mode and cannot compensate for the distortions. This open point is to be investigated in the future.

#### 7. Conclusion

In this paper, we proposed a novel fractional delay ranging receiver architecture for ranging purposes in IEEE 802.11b WLANs. The particular feature of this architecture is that the timing recovery estimates the chip timing by a CCC approach and outputs its fractional delay estimate to all subsequent blocks. It has been shown that the proposed architecture can achieve a timestamping precision in the sub-100 picosecond range, which imposes a significant improvement to previously presented architectures. This high precision makes noise filtering techniques more or less obsolete.

The multipath-induced ranging bias, however, imposes the largest restriction to WLAN ranging. As WLAN has a narrow signal bandwidth and the baseband pulse is not defined, common multipath mitigation techniques, such as narrow correlators, showed no improvements. Yet this is not a limitation caused by a poor receiver implementation but justified by the fact that the wideband CIR is not resolvable with the limited signal bandwidth. The problem can be mitigated by performing frequency hopping or by averaging the TDoA estimates over all selected channels. With frequency hopping, the multipath error improved from 1.73 m with a probability of 80% for a single channel to 0.59 m when using all channels. As the WLAN channels overlap, using only 4 channels resulted in almost the same multipath resilience.

The future development of the system will focus on minimizing the impact of multipath propagation as further improvements in terms of ranging variance have virtually no impact on the practical accuracy. The receiver architecture already implements an equalizer compensating for the CIR of the channel. It is planned to develop an algorithm which reconstructs the specific pulse shape of each transmitter based on the equalizer coefficients captured from all base stations. With the knowledge of the pulse shape, parametric multipath mitigation techniques, such as superresolution methods, can be applied and multipath compensation values can be calculated. Particularly for point positioning with a static CIR, this technique has the potential to significantly reduce the multipath error without the necessity to transport the sampled signals of each base station to the central locating PC.

#### Appendix

#### A. Timestamp Optimization

The linear regression model enables to generate an optimized timestamp exploiting the high accuracy of the fractional timing delay together with averaging. In a real-world timing recovery, the loop bandwidth of the code-tracking loop cannot be reduced to any arbitrary low number for a number of reasons. Recall that in WLAN chip frequencies may deviate up to 40 ppm between transmitter and receiver. A synchronizer should be able to compensate for the frequency skew and any random clock fluctuations within reasonable amount of time and follow the chip rate of the transmitter. This presents a lower limit on the code-tracking loop. Yet, for ranging, the loop bandwidth should be very narrow. As a solution, we propose a two-step approach. The loop bandwidth in the timing recovery is optimized for chip synchronization, while the noisy outputs are filtered in a separate timestamping unit after the receiver control unit. We assume that the timestamps vector follows a simple linear regression model as with the unknown time offset, the unknown chip frequency skew, a vector containing the local clock at the timestamp instances and the actual timestamps, and the random uncorrelated error vector. This model assumes that there are no major chip frequency drifts within the timespan of interest (the frame duration). In this case, the MLE boils down to the linear regression of the data. The expectation of and can be calculated by the quotient of the covariance and the variance of and as

Either these parameters are directly transmitted to the location unit or a filtered version of a timestamp for any arbitrary position in the frame can be calculated. For a TDoA architecture, this can be simplified as the locating unit uses range differences from two base stations. The distance estimation between base station and base station can be written as the difference of two timestamps as with and the corresponding linear regression coefficients for each base station. It can be assumed that the velocity of a possible WLAN target is limited to about 5 m/s. For a baseband chip frequency of 11 MHz, this limits the Doppler frequency shift defined by with the speed of light, to about 0.18 Hz or 17 ppb. For a typical frame duration of 1 ms, the Doppler effect will contract the frame by just 16 ps. As all base stations are assumed synchronized and the effect of the Doppler shift is negligible for the baseband signal, the expectations of the chip frequency skews can be considered as the same for all base stations (). As the base stations are synchronized (), the range difference only depends on the estimated offsets . Substituting (A.2) into (A.3) shows that the TDoA only depends on the averaged timestamps of two base stations as The constraint that the chip frequency skews must be identical among all base stations is not required, if we define that the timestamp epoch to be at the center of the frame with . If we use this definition and substitute (A.2) into (A.1), the center timestamp is just the expectation of independent of the chip frequency skews . Defining the epoch at the center of the frame is rather unconventional as typical timestamping or ranging systems like UWB or IEEE1588 use an epoch at the start, such as the start frame delimiter [52].

#### Acknowledgments

This paper was partly financed by the province of Lower Austria, the European Regional Development Fund, the FIT-IT project *-WiFi* under Contract 813319 in cooperation with Oregano Systems and the EU under the FP7 STREP ^{flex}WARE Contract no. 224350.