Mobile Information Systems

Volume 2015, Article ID 416197, 14 pages

http://dx.doi.org/10.1155/2015/416197

## Crowd-Sourced Mobility Mapping for Location Tracking Using Unlabeled Wi-Fi Simultaneous Localization and Mapping

^{1}Chongqing Key Lab of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

^{2}Ericsson, San Jose, CA 95134, USA

^{3}China Internet Research Lab, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China

Received 1 March 2015; Revised 23 April 2015; Accepted 29 April 2015

Academic Editor: Laurence T. Yang

Copyright © 2015 Mu Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Owing to the increasing demand for seamless, round-the-clock location-based services (LBSs), interest in Wi-Fi network aided location tracking has grown steadily over the past decade. A significant problem of conventional Wi-Fi location tracking approaches based on received signal strength (RSS) fingerprinting is the time-consuming and labor-intensive work involved in location fingerprint calibration. To solve this problem, a novel unlabeled Wi-Fi simultaneous localization and mapping (SLAM) approach is developed that requires neither location fingerprinting nor additional inertial or vision sensors. In this approach, an unlabeled mobility map of the coverage area is first constructed by crowd-sourcing from a batch of sporadically recorded Wi-Fi RSS sequences using spectral cluster assembling. Then, a sequence alignment algorithm is applied to conduct location tracking and mobility map updating. Finally, the effectiveness of this approach is verified by extensive experiments carried out in a campus-wide area.

#### 1. Introduction

Location tracking in wireless mobile environments plays an important role in the recent development of location-based services (LBSs), such as visitor navigation, elderly health care, facility management, transportation, and emergency rescue [1–3]. With the rapidly growing interest in Wi-Fi technology and the ubiquitous deployment of Wi-Fi infrastructure, intense attention has been paid to various approaches to Wi-Fi network aided location tracking [4]. Location tracking using Wi-Fi received signal strength (RSS) fingerprints presents many challenges because the variations of RSSs with respect to the distances from the access points (APs) to the receiver are hard to predict. Wi-Fi network aided location tracking approaches can generally be classified into two categories. In the first category, only Wi-Fi RSS data are used, either to estimate the parameters of propagation models [5, 6] or to construct a radio map of the coverage area [7, 8]. The target locations are then estimated by propagation model aided triangulation or by radio map aided fingerprint matching. The major limitations of this category are that (i) RSS propagation modeling requires prior knowledge of the AP locations and easily leads to low accuracy when the relations between the RSSs and their corresponding distances are not captured appropriately and (ii) radio map construction usually involves a site survey over a large number of reference points (RPs) and their associated RSSs, which limits the adaptability to large coverage areas, and the radio map can easily become outdated.

In the second category, the Wi-Fi RSSs and the measurements acquired from several types of inertial and vision sensors are integrated to achieve simultaneous localization and mapping (SLAM) [9]. The SLAM approach can be used to construct a virtual floor plan of the coverage area while conducting location tracking. Although RSS propagation modeling and radio map construction are not required, the combined problem of target localization and mobility mapping must be addressed carefully to make the SLAM approach robust to radio noise. Specifically, the mobility map built up from the environment helps to keep track of the target locations, while the results of location tracking feed the process of mobility mapping. The aim of this paper is to use the Wi-Fi network alone to construct a mobility map of the coverage area without location fingerprinting or inertial or vision sensing, referred to as an unlabeled mobility map. Moreover, the proposed approach is expected to achieve a sufficiently high probability of locating new RSS data in the corridor where the data are actually recorded, namely, corridor-level location accuracy. In summary, the following two problems are addressed in this paper.

(i) How to construct an unlabeled mobility map by using crowd-sourcing based on the sporadically recorded RSS data without explicit information of their physical coordinates?

(ii) How to simultaneously conduct location tracking and mobility map updating?

The unlabeled mobility map is constructed by spectral cluster assembling, which consists of an intrasequence spectral clustering step and an intersequence cluster assembling step. Target location tracking and mobility map updating are performed by the sequence alignment algorithm, which matches the new RSS data against the constructed mobility map and selects the location point (LP) with the highest alignment similarity as the tracked location for each location query.

*Solution 1 (spectral cluster assembling).* Mobility map construction by crowd-sourcing consists of two main steps: intrasequence clustering and intersequence assembling. In the intrasequence clustering step, each string of time-stamped consecutive RSS samples recorded by a person is first represented as an RSS sequence, in which the RSS samples are arranged in chronological order. Then, spectral clustering is performed on each recorded RSS sequence to classify its RSS samples into different clusters. The spectral clustering used in this paper preserves the locality of RSS samples in both RSSs and time-stamps. Another significant advantage of spectral clustering is that the high-dimensional RSS samples can be mapped into a low-dimensional space, which is beneficial to information retrieval and data mining [10]. Specifically, after an RSS sequence is recorded, the similarity of every two RSS samples in this sequence is first calculated, and then a low-dimensional mapped space is constructed by using Laplacian embedding. Finally, k-means clustering is performed to classify the RSS samples into different clusters.

In the intersequence assembling step, the concept of the winning path used in the Smith-Waterman algorithm for protein sequencing [11] is first applied to identify the clusters that are highly similar to each other. The similarity of any two clusters is measured by their cumulative matching score, which is calculated from the Kullback-Leibler (KL) divergence of their RSS distributions and the cumulative matching scores of several previous pairs of clusters, that is, pairs of clusters with time-stamps before those of the current pair. After the matrix of cumulative matching scores between the clusters, namely, the scoring space, is obtained for each pair of RSS sequences, the pairs of RSS sequences that have one or more pairs of clusters with cumulative matching scores higher than a given threshold are assembled, and the corresponding pairs of clusters are merged into new clusters. This process is repeated until the cumulative matching scores of all the remaining pairs of clusters are lower than the threshold. At this point, each remaining cluster is recognized as an LP in the mobility map.

*Solution 2 (sequence alignment algorithm).* To conduct location tracking and mobility map updating simultaneously, the three RSS sequences containing the largest number of RSS samples in each LP, namely, the three longest RSS sequences, are first selected as the virtual fingerprints. The concept of sequence alignment in protein sequencing is then used to find the strings of tracking LPs that have the highest alignment similarities to the new RSS sequences. Finally, based on the time-stamped transition relations of LPs, the strings of tracking LPs are modified so that the target locations are tracked with improved accuracy. A preliminary discussion of the time-stamped transition relations of LPs in the mobility map can be found in the previous work [4]. In this paper, to validate the crowd-sourced mobility map construction and the unlabeled Wi-Fi SLAM, extensive experiments are conducted in a campus-wide area.
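The selection of virtual fingerprints described in Solution 2 can be illustrated with a minimal Python sketch. The data layout (a dictionary from a hypothetical LP identifier to the list of RSS sequences assigned to that LP) and the function name `select_virtual_fingerprints` are assumptions made for illustration only and are not taken from the authors' implementation.

```python
def select_virtual_fingerprints(lp_sequences, top_k=3):
    """Return, for each LP, its top_k longest RSS sequences.

    lp_sequences: dict mapping a (hypothetical) LP id to a list of RSS
    sequences, where each sequence is a list of RSS sample vectors.
    """
    fingerprints = {}
    for lp_id, sequences in lp_sequences.items():
        # Rank by number of RSS samples, longest first.
        ranked = sorted(sequences, key=len, reverse=True)
        fingerprints[lp_id] = ranked[:top_k]
    return fingerprints


# Toy example: one LP holding four sequences of 1, 3, 2, and 5 samples.
example = {
    "LP1": [
        [[-50.0]],
        [[-51.0], [-52.0], [-50.5]],
        [[-49.0], [-48.0]],
        [[-55.0], [-54.0], [-53.0], [-52.0], [-51.0]],
    ]
}
selected = select_virtual_fingerprints(example)
```

An LP holding fewer than three sequences simply keeps all of them in this sketch; how the original system handles that case is not stated in the text.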

The remainder of this paper is organized as follows. Section 2 reviews some related work on SLAM based location tracking. In Section 3, the steps of the proposed crowd-sourced mobility mapping for location tracking without the site survey on location fingerprinting and inertial or vision sensing are described in detail. The experimental results and the related discussions are shown in Section 4. Finally, this paper is concluded in Section 5.

#### 2. Related Work

Tracking the target locations cost-efficiently by using conventional RSS propagation modeling and radio map construction is challenging since both involve a time-consuming and labor-intensive site survey of the relations between RSSs and physical locations [12, 13]. To improve cost-efficiency, recent works have begun to focus on SLAM based location tracking approaches, which adapt well to the environment with low site survey effort and are highly robust to environmental noise.

Wang and Thorpe [14] integrated SLAM with detection and tracking of moving objects (DTMO) and verified the efficiency of the integrated approach at high speed in a large crowded city environment. A simulation system was designed to analyze and validate SLAM based on the well-known extended Kalman filter (EKF) [15]. Three major approaches are generally studied for SLAM, namely, the optimal control approach, the local submap approach, and the frontier based approach, to achieve localization and mapping in an active and intelligent way. Similar work on Kalman filter (KF) based SLAM has been addressed extensively in the literature [16, 17]. Chatterjee and Matsuno [16] introduced a new approach that uses the neurofuzzy assisted EKF to enhance the performance of SLAM. This approach has two limitations. First, to use the neurofuzzy supervision suitably, the free parameters of the neurofuzzy system must be learned carefully. Second, this approach may not be applicable when the process and sensor noise covariance matrices are inaccurate. To improve the accuracy and convergence speed of the state estimation involved in KF, the pseudolinear model based Kalman filter (PLKF) based SLAM was proposed [17]. The PLKF based SLAM outperforms the conventional EKF based SLAM since the pseudolinear model preserves the nonlinearity in the system, motion, and observation models.

Wang et al. [18] showed that particle filtering (PF) can also be used to improve the performance of SLAM. PF not only reduces the complexity of the data but also enhances the real-time capacity of SLAM. The weakness of the PF based SLAM is that many types of sensors and the related data fusion process are required to guarantee the effectiveness of feature extraction in unknown and highly complex environments. The improved Rao-Blackwellised particle filter (IRBPF) based SLAM is used to achieve accurate localization and generate a consistent map of the environment [19]. The IRBPF based SLAM first uses PF to estimate the posterior probability distributions of the target. An adaptive resampling approach is then applied to reduce the risk of sample depletion. Finally, the motion and observation models are constructed based on the data from a ranging sensor and an odometer. However, this approach is not suitable for highly dynamic environments.

Luo et al. [20] presented a new concept of vision based SLAM (V-SLAM), in which a visual feature point buffer and human body elimination are used to reduce the estimation errors of the system. Specifically, the V-SLAM uses the visual feature point buffer to filter the temporary feature points extracted by the features from accelerated segment test (FAST) corner detector and conducts human body elimination to further improve accuracy. As an application example, the V-SLAM is used to aid inertial navigation by compensating for inertial navigation divergence [21]. Sazdovski and Silson [21] showed that such an integrated inertial navigation system with V-SLAM requires coordination between the guidance and control measurements to achieve high navigation accuracy.

The aforementioned SLAM approaches are based on a single target and a single model. The decentralized platform developed by Saeedi et al. [22] is known as the first application of neural networks to SLAM with multiple targets. Their proposed approach consists of five modules: high-level map segmentation for map preprocessing, application of self-organizing maps for preprocessed map clustering, inclusion of map uncertainty in the learning phase, estimation of the relative transformation matrix of every two maps, and use of surface norms for relative transformation determination. Yingmin and Ding [23] investigated the nonlinear interacting multiple model (IMM) based SLAM to solve the problem of the statistical property mutation of SLAM. The steps involved in the nonlinear IMM based SLAM are model condition reinitialization, model condition filtering and data association, model probability updating and estimate fusion, and state augmentation and map building.

For indoor Wi-Fi networks, the Wi-Fi SLAM was developed to gather location and mapping information simultaneously [24]. Specifically, the Wi-Fi SLAM first uses location fingerprinting to characterize how the construction of a particular building affects the Wi-Fi RSS distributions. The initial trajectories are then constructed based on the measurements from multiple sensors on a smartphone, including the accelerometer, gyroscope, and magnetometer. Finally, the constructed trajectories are matched with the results of Wi-Fi RSS trilateration to provide fine-grained localization and create accurate indoor maps. Since the measurements are gathered by different sensors, the Wi-Fi SLAM uses pattern recognition and machine learning to find the correlations between these measurements for data fusion. The SmartSLAM is another representative Wi-Fi network based SLAM, which uses the measurements from inertial sensors and the Wi-Fi network [9]. A significant problem of SmartSLAM is that the high energy cost of continuously scanning multiple sensors and Wi-Fi channels seriously limits its practical use.

In this paper, a solution to simultaneous mobility mapping and location tracking is provided that relies on crowd-sourcing from sporadically recorded Wi-Fi RSS data and requires neither location fingerprinting nor inertial or vision sensing. In summary, the three major contributions of this paper are as follows.

(i) An unlabeled Wi-Fi SLAM approach is developed to avoid the location fingerprinting and additional inertial or vision sensors.

(ii) The mobility map is constructed by using crowd-sourcing based on the sporadically recorded Wi-Fi RSS data without explicit information of their physical coordinates.

(iii) The concept of alignment similarities between the newly recorded RSS data and prestored virtual fingerprints is utilized to achieve the location tracking.

The major notations and parameters used in this paper are summarized in the Notations and Parameters section.

#### 3. System Description

##### 3.1. Intrasequence Clustering

###### 3.1.1. Problem Statement

In the system, each person holding a Wi-Fi RSS receiver walks around the coverage area and collects Wi-Fi RSS sequences. It is assumed that a set of such sequences is collected, each consisting of a number of time-stamped RSS samples. Each sample is a vector containing the RSS values from all the APs, where the kth element of a sample is the RSS value from the kth AP. For any two RSS samples in the same sequence, two differences are calculated: the difference in RSSs, obtained from the two sample vectors, and the difference in time-stamps, obtained from the time-stamps of the two samples and the sampling interval.

Here, the intrasequence clustering problem is as follows. The RSS samples of a sequence are assumed to belong to a low-dimensional mapped space that is embedded in the raw RSS space, whose dimension equals the number of APs. A set of mapped vectors in the mapped space is first found, one for each RSS sample, and then the mapped vectors are classified into RSS clusters. Finally, the RSS samples whose corresponding mapped vectors fall in the same cluster are also classified into the same RSS cluster.

The first advantage of intrasequence clustering is that the similarities of RSS samples in RSSs and time-stamps between any two clusters are minimized. Second, the graph Laplacian selected for dimensionality reduction prevents isolated RSS samples from accidentally forming outlier RSS clusters, because it minimizes the values of the normalized cut (Ncut) over all the RSS clusters [10, 25].

###### 3.1.2. Steps

The steps of intrasequence clustering are provided as follows.

*Step 1.* The similarity of any two RSS samples in a sequence is computed from the normalized difference in RSSs and the normalized difference in time-stamps between them, which are weighted by two tunable weighting factors.
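As an illustration of Step 1, the following Python sketch builds a similarity matrix for one RSS sequence. Several choices here are assumptions rather than the authors' definitions: the RSS difference is taken as the Euclidean distance, the time-stamp difference is counted in sampling intervals, both are normalized by their maximum, and the similarity is an exponential kernel of their weighted sum with hypothetical weights `alpha` and `beta`.

```python
import math


def pairwise_differences(samples, timestamps, sampling_interval):
    """Pairwise RSS and time-stamp differences within one sequence."""
    n = len(samples)
    # Assumption: Euclidean distance between RSS vectors.
    d_rss = [[math.sqrt(sum((a - b) ** 2 for a, b in zip(samples[j], samples[k])))
              for k in range(n)] for j in range(n)]
    # Assumption: time gap expressed in number of sampling intervals.
    d_time = [[abs(timestamps[j] - timestamps[k]) / sampling_interval
               for k in range(n)] for j in range(n)]
    return d_rss, d_time


def normalize(matrix):
    """Scale all entries into [0, 1] by the largest entry."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


def similarity_matrix(d_rss, d_time, alpha=0.5, beta=0.5):
    """Assumed kernel: exp(-(alpha * d_rss + beta * d_time)) on normalized inputs."""
    nr, nt = normalize(d_rss), normalize(d_time)
    n = len(nr)
    return [[math.exp(-(alpha * nr[j][k] + beta * nt[j][k]))
             for k in range(n)] for j in range(n)]


# Three samples from two APs, recorded one second apart.
samples = [[-40.0, -70.0], [-43.0, -66.0], [-60.0, -50.0]]
timestamps = [0.0, 1.0, 2.0]
d_rss, d_time = pairwise_differences(samples, timestamps, sampling_interval=1.0)
w = similarity_matrix(d_rss, d_time)
```

With this kernel, samples that are close in both signal space and time receive a similarity near 1, which matches the stated intent of preserving locality in RSSs and time-stamps.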

*Step 2.* Consider the problem of mapping the RSS samples in each sequence onto a line such that RSS samples with large similarities in RSSs and time-stamps correspond to mapped points that stay as close together as possible on the line. The optimal objective function for this problem is defined over the set of mapped points and penalizes large distances between the mapped points of similar RSS samples.

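*Step 3.* Compute the Laplacian matrix for each RSS sequence from the similarities and an associated diagonal matrix: each diagonal entry of the latter is set from the similarities of the corresponding RSS sample, and every off-diagonal entry is set as 0. By using the Lagrange multiplier method [4], the problem described in Step 2 is converted to an eigenvalue problem.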

*Step 4.* Consider the general problem of mapping each RSS sequence from the raw RSS space into a space of lower dimension. The set of mapped vectors, namely, the mapped sequence, is given by a matrix whose rows are the transposed mapped vectors of the RSS samples. Based on the previous work [4], the solution to the corresponding optimal objective function is provided by the eigenvectors corresponding to the smallest eigenvalues of a generalized eigenvalue problem.
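For reference, the standard Laplacian eigenmaps formulation, which the description in Steps 2 to 4 appears to follow, can be written as below. The symbols are assumptions chosen for this sketch and may differ from the authors' notation: $w_{jk}$ is the similarity of samples $j$ and $k$, $W = [w_{jk}]$, $y_j$ is the mapped point of sample $j$, and $Y$ stacks the mapped vectors as rows.

```latex
% Assumed notation; a standard formulation, not necessarily the authors' exact equations.
\begin{align}
  \mathbf{y}^{*} &= \arg\min_{\mathbf{y}^{T} D \mathbf{y} = 1}
      \tfrac{1}{2} \sum_{j,k} (y_j - y_k)^2 \, w_{jk}
      = \arg\min_{\mathbf{y}^{T} D \mathbf{y} = 1} \mathbf{y}^{T} L \mathbf{y}, \\
  L &= D - W, \qquad D_{jj} = \sum_{k} w_{jk}, \qquad D_{jk} = 0 \ \ (j \neq k), \\
  L \mathbf{y} &= \lambda D \mathbf{y}, \\
  Y^{*} &= \arg\min_{Y^{T} D Y = I} \operatorname{tr}\!\left( Y^{T} L Y \right).
\end{align}
```

In this formulation the constant eigenvector with eigenvalue 0 carries no information, so the embedding is usually taken from the eigenvectors of the next smallest eigenvalues.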

*Step 5.* As discussed in the literature [10, 26], the preceding dimensionality reduction, which preserves the locality of RSS samples in RSSs and time-stamps, also yields a solution similar to that of RSS clustering. In concrete terms, after the dimensionality reduction from the raw RSS space to the mapped space, k-means clustering is conducted on the mapped vectors to obtain the RSS clusters of each RSS sequence.
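The clustering in Step 5 can be sketched with a plain Lloyd-style k-means on the mapped vectors. This is a minimal illustration under stated assumptions: the number of clusters `k` is given, and the initial centroids are spread evenly over the time-ordered samples, which is a convenience of this sketch and not a choice documented in the text.

```python
def kmeans(points, k, iterations=50):
    """Lloyd's algorithm on low-dimensional mapped vectors.

    points: list of equal-length lists of floats, in chronological order.
    Returns (labels, centroids).
    """
    # Assumption: deterministic initialisation spread over the time order.
    step = max(len(points) // k, 1)
    centroids = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    labels = [0] * len(points)
    for it in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        if it > 0 and new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# One-dimensional mapped points forming two well-separated groups.
mapped = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
labels, centroids = kmeans(mapped, k=2)
```

Samples whose mapped vectors share a label are then placed in the same RSS cluster of the sequence.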

##### 3.2. Intersequence Assembling

###### 3.2.1. Problem Statement

The problem to be solved by intersequence assembling is to assemble the RSS clusters obtained in the intrasequence clustering step into a mobility map. Since each RSS sequence can be represented by a string of consecutive time-stamped RSS clusters, the concept of the winning path in the scoring space used by Smith-Waterman alignment is applied to identify the RSS clusters that need to be merged. Viewed graphically, each string of consecutive time-stamped RSS clusters is first treated as a string of consecutive vertices in a graph, where any two adjacent vertices are connected by an edge. Second, string assembling is conducted by merging specific vertices in different strings. Finally, after all the strings of consecutive vertices have been assembled, the mobility map is obtained as a graph G = (V, E), in which V and E represent the sets of vertices and edges, respectively.

###### 3.2.2. Steps

The four steps of intersequence assembling are as follows.

*Step 1.* For every two strings of consecutive vertices, their initial scoring space is set as a zero matrix. The entry at position (i, j) in the scoring space is the cumulative matching score between the ith vertex of the first string and the jth vertex of the second string.

*Step 2.* The entries in the scoring space are calculated row by row, proceeding from the first to the last row. The entry at each position is calculated case by case: three special cases are handled separately, and otherwise the entry is obtained from a recurrence that involves a missing factor, a damping factor, a threshold, and the KL divergence between the RSS distributions of the two corresponding clusters. The KL divergence is evaluated from the probabilities of each RSS value under the RSS distributions of the two clusters.
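The scoring-space computation can be illustrated with a generic Smith-Waterman-style fill driven by the KL divergence. This is a sketch under explicit assumptions and not the paper's exact recurrence: the match score is taken as a hypothetical `kl_threshold` minus a symmetrized KL divergence, and a single hypothetical `gap_penalty` stands in for the missing and damping factors, whose precise roles are not reproduced here. Row 0 and column 0 are zero padding.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete RSS distributions over the same bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized KL divergence (an assumption of this sketch)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def fill_scoring_space(dists_a, dists_b, kl_threshold=0.5, gap_penalty=0.5):
    """Cumulative matching scores between two strings of RSS clusters.

    dists_a, dists_b: per-cluster RSS histograms (lists of probabilities).
    Returns a (len(dists_a) + 1) x (len(dists_b) + 1) matrix.
    """
    rows, cols = len(dists_a), len(dists_b)
    score = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):          # first row to last row
        for j in range(1, cols + 1):
            # Positive when the two RSS distributions are similar.
            match = kl_threshold - symmetric_kl(dists_a[i - 1], dists_b[j - 1])
            score[i][j] = max(
                0.0,
                score[i - 1][j - 1] + match,      # extend along the diagonal
                score[i - 1][j] - gap_penalty,    # skip a cluster in the first string
                score[i][j - 1] - gap_penalty,    # skip a cluster in the second string
            )
    return score


# Two strings whose clusters have matching RSS histograms in the same order.
string_a = [[0.9, 0.1], [0.1, 0.9]]
string_b = [[0.9, 0.1], [0.1, 0.9]]
scores = fill_scoring_space(string_a, string_b)
```

Because each entry depends on the entries above and to the left, similar clusters that appear in the same order in both strings accumulate a growing score along the diagonal.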

*Step 3.* The winning path creation is started at the position with the largest entry in the scoring space, namely, the first position on the winning path, subject to a condition on a given threshold.

After the first position on the winning path is obtained, it is required to go backwards to one of the three neighboring positions (the upper-left, upper, and left positions) and compare the entries of these three positions. A connection from the current position to the selected neighboring position is constructed as the first jump on the winning path.

This process is continued until the winning path reaches the first row or the first column of the scoring space or reaches a position whose entry is not larger than the threshold.
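The backward construction of the winning path can be sketched as a Smith-Waterman-style traceback. The assumptions are that the scoring space carries a zero-padded row 0 and column 0, that each jump goes to the neighbor with the largest entry (ties favor the diagonal), and that the hypothetical parameter `threshold` plays the role of the stopping threshold.

```python
def trace_winning_path(score, threshold=0.0):
    """Return the positions on the winning path, first position first."""
    # Start at the position holding the largest entry.
    value, i, j = max(
        (score[r][c], r, c)
        for r in range(len(score))
        for c in range(len(score[0]))
    )
    path = []
    # Stop at the padded border or at an entry not larger than the threshold.
    while i > 0 and j > 0 and score[i][j] > threshold:
        path.append((i, j))
        candidates = [
            (score[i - 1][j - 1], i - 1, j - 1),  # diagonal jump
            (score[i - 1][j], i - 1, j),          # top-down jump
            (score[i][j - 1], i, j - 1),          # left-right jump
        ]
        # max() keeps the first maximal candidate, so ties favor the diagonal.
        _, i, j = max(candidates, key=lambda c: c[0])
    return path


# Scoring space with a clear diagonal of accumulated scores.
space = [
    [0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
winning_path = trace_winning_path(space)
```

Consecutive positions on the returned path differ by one of the three jump types, which is the information needed to decide which clusters are merged.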

*Step 4.* After the winning path is created, the specific RSS clusters in the two RSS sequences are merged based on the concept of jumps used in Smith-Waterman alignment [27]. It is required to start from the last jump and go forward to the first jump on the winning path to identify the specific RSS clusters to be merged. There are three types of jumps involved [27]: (i) a diagonal jump, implying that the two corresponding RSS clusters, one from each sequence, are required to be merged into an LP; (ii) a top-down jump, implying that the corresponding RSS cluster in one sequence is an isolated LP; and (iii) a left-right jump, implying that the corresponding RSS cluster in the other sequence is an isolated LP. Figure 1 gives an example of the winning path creation, in which the two RSS sequences both happen to be four RSS clusters in length. Based on the winning path created in Figure 1, three pairs of RSS clusters are merged, and a mobility map containing 5 LPs and 5 transitions between the LPs is eventually constructed.
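The assembling of two strings into a mobility map can be sketched with a small union-find over cluster identifiers. The identifiers, the function name `assemble`, and the particular set of merged pairs are hypothetical; the example uses two strings of four clusters with three merged pairs, which matches the counts quoted for Figure 1 but is not claimed to reproduce the figure itself.

```python
def assemble(string_a, string_b, merged_pairs):
    """Merge two strings of cluster ids into a graph of LPs and transitions.

    merged_pairs: (id_in_a, id_in_b) pairs identified by diagonal jumps.
    Returns (vertices, edges), with edges as directed (from_lp, to_lp) tuples.
    """
    parent = {c: c for c in string_a + string_b}

    def find(c):
        # Follow parent links to the representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a_id, b_id in merged_pairs:
        parent[find(b_id)] = find(a_id)   # fuse the pair into one LP

    vertices = {find(c) for c in parent}
    edges = set()
    for string in (string_a, string_b):
        # Consecutive clusters in a string become a transition between LPs.
        for u, v in zip(string, string[1:]):
            ru, rv = find(u), find(v)
            if ru != rv:
                edges.add((ru, rv))
    return vertices, edges


a = ["a1", "a2", "a3", "a4"]
b = ["b1", "b2", "b3", "b4"]
lps, transitions = assemble(a, b, [("a1", "b1"), ("a2", "b2"), ("a4", "b4")])
```

In this hypothetical configuration the eight clusters collapse into 5 LPs joined by 5 transitions; unmerged clusters such as `a3` and `b3` remain as isolated LPs on parallel branches between the merged ones.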