Abstract

Research on two-dimensional indoor positioning based on wireless LAN and location fingerprinting has matured, but in real indoor positioning scenarios users also care about the height at which they stand. Because three-dimensional indoor positioning covers a larger space, more features are needed to describe a location fingerprint, and applying a machine learning algorithm directly to such fingerprints reduces its classification ability. To solve this problem, this paper adopts a “divide and conquer” strategy: the three-dimensional location space is first clustered into a number of service areas with the k-medoids algorithm, and then a multicategory SVM with fewer features is built for each service area for further positioning. Our experiment shows that the approach combining the k-medoids algorithm and a multicategory SVM achieves a finer error-distance resolution than the approach using only an SVM, and that it effectively reduces “crazy predictions.”

1. Introduction

With the development of mobile communication technology and the growing demand for new services, location-aware computing, especially the Location Based Service (LBS), is attracting increasing attention, and determining the user’s location is the core issue in LBS. The global positioning system (GPS) [1] and cellular network positioning systems are location service systems widely used in open outdoor environments [2]. However, these systems do not perform satisfactorily indoors, and indoor positioning technology has therefore been studied extensively [3].

Various radio techniques have been applied to indoor positioning, such as UWB (ultra-wide band), RFID (radio-frequency identification), and UHF (ultra-high frequency) technology [4–7]. However, these positioning systems require redeploying the network and adding signal measurement hardware, so their cost is relatively high and their application is limited.

In recent years, with the increasing popularity of the application of WLAN, WLAN-based indoor positioning technology has been rapidly developed. The technology can leverage existing wireless LAN resources without additional network deployments or facilities, which leads to the most attractive advantage, low cost.

Research on WLAN-based indoor positioning mostly uses the location fingerprint method and is usually based on nearest neighbor search, naive Bayesian statistics, BP neural networks, support vector machines [8–13], or other machine learning techniques. Positioning based on location fingerprints has focused mainly on the two-dimensional case, and the relevant experiments are also conducted in a two-dimensional plane [14]. In practice, however, location systems in large shopping malls, libraries, offices, hospitals, airports, museums, and other places need to provide location information that includes not only latitude and longitude coordinates, or other two-dimensional representations, but also the height or the room number.

In three-dimensional indoor positioning environments the problem becomes harder, and the methods mentioned above may not adapt or scale easily; a “divide and conquer” strategy is often a good guide.

Cotroneo et al. [15] proposed a naive partition positioning method in which the communication range covered by an AP is treated as a subregion, and the subregion that the mobile terminal belongs to is determined by the signal strength it receives from each AP.

Xu et al. [16] divided the location space into multiple zones and used a distance-loss model with different parameters for each zone. In the positioning phase, they used maximum likelihood estimation to determine the location of the mobile terminal.

Nel Samama et al. [17] pointed out that many indoor positioning scenarios do not require high accuracy and only need to give the user symbolic information, such as which corridor or room they are currently in. They proposed a 3D symbolic positioning algorithm that divides the location space into symbolic subspaces and designed a simple resolution rule over these subspaces to give the user positioning information.

Gansemer et al. [18] pointed out that 3D indoor positioning is mostly realized with UWB, RFID, and other technologies. They stressed the need for 3D WLAN indoor positioning and proposed a method that extends the isolines algorithm [19], used in 2D WLAN indoor positioning, to 3D space.

Zhong-liang et al. [20] adopted the k-means clustering algorithm to partition a three-dimensional indoor space into multiple regions: location fingerprints that are close in Euclidean distance are clustered into one region, and the central fingerprint of every region is saved. In the positioning phase, the central fingerprint closest to the fingerprint received by the mobile terminal indicates the region in which the terminal is located. However, they used this principle only to determine which floor the mobile terminal is on.

These works suggest and demonstrate the effectiveness of partitioning, which is also used in this paper. We first use the k-medoids algorithm to cluster the location space into different service areas to achieve coarse positioning and then build a multicategory SVM for each service area to achieve fine positioning.

3. The Hybrid Indoor Positioning Approach

3.1. Cluster by k-Medoids

Definition 1 (location fingerprint). A location fingerprint is a vector bound to a specified location, which consists of a series of Wi-Fi signal strength values received by the mobile terminal from different APs (access points, Wi-Fi hotspots, or WLAN wireless routers):

$$F = (s_1, s_2, \ldots, s_d),$$

where $s_i$ is the signal strength received from the $i$th AP and $d$ is the number of APs.

Definition 2 (record). A record here is a structure consisting of a location representation $L$ and a location fingerprint $F$:

$$R = (L, F).$$

We adopt the k-medoids algorithm [21] to partition the location space into different service areas. Compared with the k-means algorithm, the k-medoids algorithm is not sensitive to outliers, which allows us to get a better partition center.

The complexity of the original implementation of k-medoids (Partitioning Around Medoids, PAM [22, 23]) is too high for our purpose. We need only a method to partition the location space, in particular to decide whether locations on stairs are better assigned to the upper-floor or the lower-floor service area. Hence, we design our own implementation of the k-medoids algorithm, as shown in Algorithm 1.

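Input: R_1, R_2, …, R_n, where R_i is the ith record and n is the total number of records;
   M_1, M_2, …, M_k, where M_j is the jth initial medoid and k is the total number of medoids; maxIter, the maximum number of iterations.
Output: M_1, M_2, …, M_k, where M_j is a record serving as the jth clustered medoid.
(1) For j = 1 to k do
(2)   C_j = ∅
(3) EndFor
(4) While changed && count < maxIter do
(5)   For i = 1 to n do
(6)     dist = EuclideanDistance(R_i, M_1)
(7)     index = 1
(8)     For j = 2 to k do
(9)       temp = EuclideanDistance(R_i, M_j)
(10)      If temp < dist do
(11)        dist = temp
(12)        index = j
(13)      EndIf
(14)    EndFor
(15)    add R_i to C_index
(16)  EndFor
(17)  For j = 1 to k do
(18)    centroid = Average(C_j)
(19)    M_j = record in C_j nearest to centroid
(20)  EndFor
(21)  If M_1, M_2, …, M_k don’t change do
(22)    changed = false
(23)  EndIf
(24)  count++
(25) EndWhile

Note. C_j is the set of records in the jth service area with M_j as its medoid; the initial value of the variable changed is true; the initial value of the variable count is 0.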
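
The clustering procedure of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `euclidean`, `assign`, and `cluster_k_medoids`, the representation of a fingerprint as a tuple of signal strengths, and the default `max_iter` value are all hypothetical. The sketch rebuilds the clusters from scratch in every iteration and keeps the old medoid when a cluster becomes empty; both choices are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(fingerprints, medoids):
    """Put every fingerprint into the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for fp in fingerprints:
        nearest = min(range(len(medoids)),
                      key=lambda j: euclidean(fp, medoids[j]))
        clusters[nearest].append(fp)
    return clusters


def cluster_k_medoids(fingerprints, initial_medoids, max_iter=100):
    """Simplified k-medoids: each medoid moves to the cluster member
    closest to the cluster average (no PAM swap search)."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        clusters = assign(fingerprints, medoids)
        new_medoids = []
        for old, members in zip(medoids, clusters):
            if not members:  # assumption: keep the old medoid if empty
                new_medoids.append(old)
                continue
            centroid = [sum(col) / len(members) for col in zip(*members)]
            new_medoids.append(
                min(members, key=lambda fp: euclidean(fp, centroid)))
        if new_medoids == medoids:  # medoids did not change: stop
            break
        medoids = new_medoids
    return medoids, assign(fingerprints, medoids)
```

Choosing the member nearest to the average costs one pass over each cluster per iteration, which is why this variant is cheaper than the pairwise swap search of PAM.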

3.2. Classify by Multiclass SVM

The support vector machine (SVM) is a popular classification technique. Professor Chih-Jen Lin has studied SVMs in depth for many years, and he and his colleagues and students developed and maintain LIBSVM [24], a very useful SVM library.

The standard SVM is a binary classifier, but more often we need a multicategory SVM, especially in the indoor positioning problem, where each location is one category. Three common methods for building a multicategory SVM from binary SVMs are one-against-all, one-against-one, and DAG-SVM [25]. We choose the DAG-SVM because its testing time is shorter than that of the other two, and testing is an online, time-sensitive operation.
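
To illustrate why DAG-SVM testing is fast, the following sketch walks a decision DAG over pairwise classifiers: each evaluation eliminates one candidate class, so a sample with k candidate classes needs only k − 1 binary decisions. The function `dag_predict` and the callback `prefers_first` are hypothetical names; the callback stands in for a trained binary SVM and is not part of any library.

```python
def dag_predict(classes, prefers_first, x):
    """Evaluate a decision DAG over pairwise classifiers.

    classes: list of candidate labels.
    prefers_first(a, b, x): hypothetical pairwise decision; returns True
    when the binary classifier for the pair (a, b) votes for a on x.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefers_first(first, last, x):
            remaining.pop()       # the last class is eliminated
        else:
            remaining.pop(0)      # the first class is eliminated
    return remaining[0]
```

With three location categories, for example, the loop runs twice regardless of which category wins.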

In [26], Hsu et al. also presented some tips for improving the performance of an SVM, such as scaling the input data and using cross-validation to find appropriate parameters for the RBF kernel.
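
As a small illustration of the scaling step, the sketch below linearly maps each feature into [−1, 1] using ranges computed on the training fingerprints and then reuses the same ranges for new fingerprints. The function names `fit_ranges` and `scale`, the target interval, and the handling of constant features are assumptions made for this example; the cross-validation search for kernel parameters is not shown.

```python
def fit_ranges(rows):
    """Per-feature (min, max), computed on the training fingerprints only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, ranges, lower=-1.0, upper=1.0):
    """Linearly map each feature into [lower, upper] using training ranges."""
    scaled = []
    for value, (lo, hi) in zip(row, ranges):
        if hi == lo:
            # Assumption: a constant feature is mapped to the midpoint.
            scaled.append((lower + upper) / 2)
        else:
            scaled.append(lower + (upper - lower) * (value - lo) / (hi - lo))
    return scaled
```

Applying the training ranges unchanged to test fingerprints keeps both sets on the same scale, which matters here because signal strengths measured at a testing spot may fall slightly outside the training range.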

3.3. Indoor Positioning Approach Combining k-Medoids and Multicategory SVM

The location space that we are now faced with is not limited to the scope of a room or a floor but has been extended to the whole building.

Take our college building as an example: usually there is an AP in each room and there are several APs in each corridor, so the total number of APs in the whole building is considerable.

Hence, the number of APs becomes large, and so does the dimension of the location fingerprint. If an SVM is applied directly to classify such high-dimensional location fingerprints, its classification ability will decrease.

In the preparing phase, we first use the k-medoids algorithm on the high-dimensional location fingerprints to partition the location space into several service areas and save a medoid location fingerprint for each service area. Thus, a set of location fingerprints (including the medoid itself) is bound to each service area. Then we reduce the dimension of the fingerprints in every service area by deleting those APs that are not received at any location in that service area. Finally, we create a multicategory SVM on each set of location fingerprints. Detailed processing is shown in Algorithm 2.

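Input: C_1, C_2, …, C_k, where C_j is the set of records in the jth service area with M_j as its medoid;
Output: SVM_1, SVM_2, …, SVM_k, where SVM_j is the trained SVM for the jth service area;
   A_1, A_2, …, A_k, where A_j is the AP list of the fingerprints for the jth service area.
(1) For j = 1 to k do
(2)   For each record R in C_j do
(3)     F = fingerprint of R
(4)     For each AP a in F do
(5)       If a is received in F && a ∉ A_j do
(6)         add a to A_j
(7)       EndIf
(8)     EndFor
(9)   EndFor
(10)  For each record R in C_j do
(11)    F = fingerprint of R
(12)    delete APs not in A_j from F
(13)  EndFor
(14)  train SVM_j on new C_j with LIBSVM
(15) EndFor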
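
The dimension-reduction part of Algorithm 2 can be sketched as below. It assumes that an AP that was not received is stored as −100.0 dBm, the initial element value used when fingerprints are constructed in the preparing phase; the names `MISSING`, `observed_aps`, and `reduce_fingerprint` are hypothetical, and the SVM training step is omitted.

```python
MISSING = -100.0  # default strength (dBm) of an AP that was not received


def observed_aps(fingerprints):
    """Indices of the APs received at least once in one service area."""
    return [i for i in range(len(fingerprints[0]))
            if any(fp[i] != MISSING for fp in fingerprints)]


def reduce_fingerprint(fp, ap_list):
    """Keep only the entries listed in ap_list, preserving their order."""
    return [fp[i] for i in ap_list]
```

The AP list of each service area has to be stored next to its trained SVM, because a new fingerprint must be reduced with the same list before it is classified.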

In the positioning phase, the mobile terminal receives Wi-Fi signals from many different APs, and some of them are used to construct a fingerprint. The medoid fingerprint nearest to this fingerprint determines the service area in which the mobile terminal is located. Finally, the dimension-reduced fingerprint is input to the multicategory SVM of that service area, and the SVM outputs the fine location. Detailed processing is shown in Algorithm 3.

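Input: F, the fingerprint of a certain place;
   M_1, M_2, …, M_k, where M_j is the medoid of the jth service area;
   A_1, A_2, …, A_k, where A_j is the AP list of the fingerprints for the jth service area;
   SVM_1, SVM_2, …, SVM_k, where SVM_j is the trained SVM for the jth service area.
Output: predicted location representation.
(1) dist = EuclideanDistance(F, M_1)
(2) index = 1
(3) For j = 2 to k do
(4)   temp = EuclideanDistance(F, M_j)
(5)   If temp < dist do
(6)     dist = temp
(7)     index = j
(8)   EndIf
(9) EndFor
(10) delete APs not in A_index from F
(11) call SVM_index with input new F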
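
A compact sketch of the positioning phase in Algorithm 3 follows. The trained SVMs are replaced by plain Python callables, so `classifiers` is a stand-in and not a LIBSVM interface; the function name `locate` and the representation of AP lists as index lists are likewise assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate(fingerprint, medoids, ap_lists, classifiers):
    """Coarse step: pick the service area with the nearest medoid.
    Fine step: reduce the fingerprint and ask that area's classifier."""
    area = min(range(len(medoids)),
               key=lambda j: euclidean(fingerprint, medoids[j]))
    reduced = [fingerprint[i] for i in ap_lists[area]]
    return area, classifiers[area](reduced)
```

Only one classifier is evaluated per query, and it sees a shorter fingerprint than a single SVM trained on the whole building would.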

4. Experiment and Analysis

4.1. The Experiment Procedure

Our experiment is conducted on the 2nd, 3rd, and 4th floors of the building of the College of Electronics and Information Engineering (CEIE), Jiading campus of Tongji University.

Figure 1 shows one of the experiment deployment floor plans; the other two are similar. 100 sample spots are chosen in the corridors and on the stairs. These spots are labeled from number 1 to number 100, and neighboring spots are 5 m, 8 m, or 12 m apart.

Table 1 lists two Wi-Fi info items received from nearby APs by an Android application (Wi-Fi scanner) that we developed, with the mobile phone placed at one of the sample spots. In the CEIE building the application detects a list of info items on every scan (frequently from more than 10 nearby APs), and the same holds in the library and dormitory buildings of the campus. Although we do not know the locations of these APs, the information we can get is sufficient for our experiment: the “address of AP” and “signal strength” are used to construct the fingerprint of a location.

The main steps of our experiment are shown below.

4.1.1. The Preparing Phase

(1) At every sample spot, with the mobile phone held in the tester’s hand, run Wi-Fi scanner 5 times, facing north, west, south, east, and a random direction, respectively; store the Wi-Fi info items of every scan.
(2) Delete from the lists of Wi-Fi info items those items whose “address of AP” does not appear in all 5 scans, because their signal may be weak or unstable at that sample spot.
(3) Count the frequency of every “address of AP,” sort the addresses by frequency, keep the first 2/3 of the sorted addresses, and discard the remaining 1/3, because we only need the addresses shared by more sample spots.
(4) Shuffle the retained 2/3 of the addresses and store them as the set of index indicators of a fingerprint, called the Index Set.
(5) Construct 5 fingerprints for every sample spot from its 5 lists of Wi-Fi info items: if an item’s “address of AP” is in the Index Set, its “signal strength” is assigned to the fingerprint element with the same index (the initial value of all elements is −100.0 dBm). These 5 fingerprints are bound to the sample spot as one location, or category, for the SVM below.
(6) This yields a large set of fingerprints. Because the data are collected from 9 corridors on 3 floors, we choose a fingerprint in the middle of every corridor to get 9 initial medoids; then we use Algorithm 1, with the set of fingerprints, the 9 initial medoids, and the maximum iteration count as input, to partition the fingerprints into 9 subsets, each serving as a service area, and store the output (9 clustered medoids).
(7) As a service area does not need so many “addresses of AP,” we discard in every subset the addresses that are not received at any sample spot of that subset; in other words, we reduce the dimension of the fingerprints in that service area.
(8) Use the multicategory SVM of LIBSVM to train on the data of every service area; we finally get 9 SVMs, one for each of the 9 service areas.
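
Steps (2) to (5) above can be sketched as follows. The sketch assumes that each scan is a dictionary from “address of AP” to “signal strength” and that the frequency in step (3) counts the number of sample spots at which an address is stably received; both are assumptions, as are the function names and the optional `seed` parameter.

```python
import random
from collections import Counter

MISSING = -100.0  # initial value (dBm) of every fingerprint element


def stable_addresses(scans):
    """Step 2: addresses present in every scan taken at one sample spot."""
    return set.intersection(*(set(scan) for scan in scans))


def build_index_set(scans_by_spot, seed=None):
    """Steps 3 and 4: rank addresses by the number of spots receiving them,
    keep the first 2/3, then shuffle them into the Index Set."""
    counts = Counter()
    for scans in scans_by_spot.values():
        counts.update(stable_addresses(scans))
    ranked = sorted(counts, key=lambda addr: (-counts[addr], addr))
    kept = ranked[: len(ranked) * 2 // 3]
    random.Random(seed).shuffle(kept)
    return kept


def build_fingerprint(scan, index_set):
    """Step 5: one strength per indexed address, MISSING where not received.
    Callers should first drop the unstable addresses of the spot."""
    return [scan.get(addr, MISSING) for addr in index_set]
```

Because the Index Set fixes the element order, the same list must be reused when a fingerprint is built at a testing spot.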

4.1.2. The Positioning Phase

(1) Run Wi-Fi scanner at a testing spot, get a list of Wi-Fi info items, and construct a new fingerprint with the same method as in the preparing phase.
(2) Compare this new fingerprint with every clustered medoid (9 medoids in total) and choose the medoid nearest to the new fingerprint in Euclidean distance; the testing spot is located in the corresponding service area.
(3) According to the subset of fingerprints of that service area, reduce the dimension of the new fingerprint and input the dimension-reduced fingerprint to the corresponding SVM; we finally get the category, or location, of the testing spot.

4.2. Testing and Analysis

We consider two testing scenarios to evaluate the performance of the hybrid indoor positioning approach.

One is the in-place testing scenario and the other is the middle-place testing scenario, as shown in Figure 2. The left and right circles represent two sample spots. In the in-place testing scenario, every testing spot is chosen to be almost the same as a certain sample spot, while in the middle-place testing scenario, every testing spot is chosen in the middle of two neighboring sample spots.

In these two scenarios, the hybrid k-medoids + SVM approach is compared with the SVM-only approach, which has no partitioning or dimension-reduction steps and directly trains one multicategory SVM on all fingerprints.

Figure 3 shows the cumulative distribution function (CDF) of the error distance for the hybrid k-medoids + SVM approach and the SVM-only approach in the in-place scenario, while Figure 4 shows the counterpart in the middle-place scenario.

From Figure 3, in the in-place scenario, taking the 75th percentile as an example, the error distance is less than 3.25 m with the k-medoids + SVM approach and less than 3.5 m with the SVM-only approach. The difference between the two approaches is therefore small. This is to be expected, because the in-place scenario represents an ideal situation in which both approaches achieve their best performance.
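
For readers who want to reproduce this kind of reading from raw error distances, the sketch below computes an empirical CDF value and a percentile. The function names are hypothetical, the nearest-rank percentile definition is an assumption (the paper does not state which definition it uses), and any numbers passed in are the caller's own data rather than the measurements behind Figure 3.

```python
import math


def cdf_at(errors, d):
    """Empirical CDF: fraction of error distances that are <= d."""
    return sum(e <= d for e in errors) / len(errors)


def percentile(errors, p):
    """Nearest-rank percentile: the smallest observed error e such that
    at least p percent of the samples are <= e."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

Reading “the 75th percentile is less than x m” from a CDF plot corresponds to `cdf_at(errors, x) >= 0.75`.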

From Figure 4, we can see that the k-medoids + SVM approach performs better than the SVM-only approach in the middle-place scenario. The comparison in this scenario is listed in Table 2.

In Table 2, compared with the SVM-only approach, the k-medoids + SVM approach improves by more than 1 m at both the 50th and the 75th percentiles of the error distance. Because the distance between two neighboring sample spots in our experiment is 5 m, 8 m, or 12 m, the error distance is somewhat large; however, positioning resolution is not the main concern of this paper. It is worth mentioning that the SVM-only approach produced 3 “crazy predictions,” for instance, an actual spot on the 3rd floor predicted to be on the 2nd or 4th floor. The k-medoids + SVM approach reduces and can even avoid this kind of “crazy prediction,” which matches the requirements of indoor positioning, especially in 3D space.

5. Conclusions

A hybrid approach is proposed in this paper. It uses the k-medoids algorithm to partition the set of fingerprints into several subsets, reduces the dimension of the fingerprints in every subset, and trains a multicategory SVM on the data of each subset. In terms of error-distance resolution, the hybrid approach outperforms the approach that trains a single SVM on all high-dimensional fingerprints. In addition, the hybrid approach with the k-medoids algorithm and multicategory SVM can effectively reduce “crazy predictions.” Finally, we conclude that the hybrid approach can be used to solve the 3D WLAN indoor positioning problem with better performance.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the Natural Science Foundation Programs of Shanghai (Grant no. 13ZR1443100), by the 973 Program of China (under Grant 2010CB328101), by ISTCP (under Grant 2013DFM10100), by the National Science and Technology Support Plan (Grant no. 2012BAH15F03), and by NSFC (under Grants 51034003 and 51174210).