Abstract

Many factors influence the positioning performance of WLAN RSSI fingerprinting systems, and summarizing these factors is an important but challenging task. Moreover, analyzing the impact of nonalgorithm factors is significant for system application and quality control, yet little research has been conducted on it. This paper analyzes and summarizes the potential impact factors using an Ishikawa diagram that considers radio signal transmitting, propagating, receiving, and processing. A simulation platform was developed to facilitate the analysis experiment, and the potential factors are classified into controllable, uncontrollable, nuisance, and held-constant factors according to simulation feasibility. Five nonalgorithm controllable factors (APs density, APs distribution, radio signal propagating attenuation factor, radio signal propagating noise, and RPs density) are taken into consideration, and the OFAT analysis method is adopted in the experiment. The positioning results were obtained using deterministic and probabilistic algorithms, and the error is presented by RMSE and CDF. The results indicate that a high APs density, a high signal propagating attenuation factor, and a high RPs density, together with a low signal propagating noise level, favor better performance, while APs distribution has no particular impact pattern on the positioning error. Overall, this paper contributes to the quality control of WLAN fingerprinting solutions.

1. Introduction

During the past two decades, the localization and positioning field has developed exceptionally, benefiting from the Global Positioning System (GPS) [1]. However, GPS performs poorly when the radio signal is obstructed, for example, in urban canyons and indoor environments. To fill the gap, many indoor positioning systems have been proposed using different methods [2–5], including triangulation, proximity, scene analysis, and pedestrian dead reckoning (PDR). There are already many commercial off-the-shelf (COTS) systems for indoor localization, such as Situm (https://situm.es/en), Kio-RTLS (http://www.eliko.ee), OpenRTLS (https://openrtls.com), Aero Scout (http://www.aeroscout.com), and Ubisense (http://ubisense.net/en). Among these indoor localization systems, location fingerprinting is a solution that uses the scene analysis method and usually works with the wireless local area network (WLAN) received signal strength indicator (RSSI), for example, the Skyhook (http://www.skyhookwireless.com/) commercial positioning system. The scene analysis algorithm consists of two phases: offline learning and online positioning. First, it builds a location fingerprint database on some reference points (RPs) with given location coordinates during the offline learning phase. Then, in the online positioning phase, the main work is to search for the nearest RP or RPs by using the RSSI received at an unknown point. Two kinds of searching algorithms are in common use: the deterministic [6] and probabilistic [7, 8] algorithms.

In a WLAN location fingerprinting system, many factors may have an impact on positioning performance. For example, location fingerprinting is a cost-saving solution because it uses previously deployed access points (APs) intended for communication. However, these nondedicated APs also have limitations for localization. For example, only a limited number of APs can be detected in a particular environment. Moreover, the WLAN RSSI is nonstationary because the radio wave attenuates from APs to RPs under the influence of environment noise. Therefore, the number of APs and the nonstationary RSSI may influence the positioning result. Searching for and summarizing the potential impact factors is thus important for guiding further impact analysis, yet it is a challenging task that has not received much attention.

There have been many studies on WLAN location fingerprinting in different aspects, such as developing diverse systems, implementing novel searching algorithms, and constructing fingerprint databases using different approaches, which will be reviewed in Section 2 in detail. However, little research effort has been made on impact factors because of limitations in conducting analysis experiments. Compared with factors such as the searching algorithm and the parameters in an algorithm, nonalgorithm factors are less convenient to analyze experimentally. However, nonalgorithm factors are also important in technique application and system quality control. For instance, environment noise causes instability of WLAN RSSI, which may influence positioning results, yet environment factors such as the crowd, temperature, and humidity are not easy to control in a real experiment field. Moreover, the experiment becomes complicated, and the number of tests extremely large, when interactions of all factors are taken into account. For example, assuming there are $k$ factors that are regarded as factors of interest and the $i$th factor has $l_i$ levels to be analyzed, the number of tests would be $\prod_{i=1}^{k} l_i$ in a factorial experiment.
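To make the growth in test count concrete, the following minimal sketch compares a full factorial design with a one-factor-at-a-time design. The level counts are hypothetical and chosen only for illustration, and the OFAT count assumes one common convention (a single baseline run plus one run for every non-baseline level of each factor), which is not necessarily the design used in this paper.

```python
from math import prod


def full_factorial_runs(levels):
    """Number of tests when every combination of factor levels is tried."""
    return prod(levels)


def ofat_runs(levels):
    """Number of tests for one baseline run plus each factor varied alone.

    Assumption: a baseline-plus-variations OFAT convention.
    """
    return 1 + sum(n - 1 for n in levels)


# Hypothetical level counts for five factors (illustrative only).
levels = [20, 10, 5, 5, 4]
print(full_factorial_runs(levels))  # 20000
print(ofat_runs(levels))            # 40
```

The product grows multiplicatively with every added factor, whereas the OFAT count grows only additively, at the cost of ignoring interactions between factors.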

To address these issues, the research objectives of this study are as follows:
(1) To summarize the potential factors and provide a reference for factors research, this paper uses an Ishikawa diagram [9] method with consideration of WLAN radio signal transmitting, propagating, receiving, and processing in WLAN location fingerprinting systems (Section 3).
(2) To facilitate the analysis experiment of factors and impacts, an open-source simulation WLAN location fingerprinting platform (https://github.com/cumtlkq/WiFiPosSimu) was developed, and this paper classifies the factors into four types including controllable, uncontrollable, nuisance, and held-constant factors considering simulation feasibility. All the categorized factors are presented in another Ishikawa diagram (Section 4).
(3) To promote the application and quality control of WLAN location fingerprinting, this paper selected five nonalgorithm controllable factors (APs density, APs distribution, radio signal propagating attenuation factor, radio signal noise, and RPs density) as factors of interest (Section 4).
(4) To reduce the complexity and number of tests in a factorial experiment, the analysis experiment was conducted using the one-factor-at-a-time (OFAT) method with different test settings to analyze the impact patterns of the selected factors on positioning error (Section 5).

2. Related Work

The most famous fingerprint localization system was proposed by Bahl and Padmanabhan [6], named RADAR. It combined location fingerprinting with triangulation on signal propagation modeling to determine user location. The median resolution of the RADAR system is in the range of 2 to 3 meters, and the accuracy of location fingerprinting method is superior to signal propagation modeling. In their following work [10], they enhanced RADAR with a Viterbi-like algorithm which improved the accuracy of user location by over 33% and helped alleviate problems due to signal aliasing.

After RADAR, there have been many WLAN location fingerprinting systems developed. Horus [7] system used the probabilistic algorithm in positioning, and the experiment results showed that accuracy reached over 90% probability within 2.1 m. Castro et al. [11] presented a WiFi location service called Nibble, which used Bayesian networks to infer the location of a device. In their experiment setting, the location service reached 97% in accuracy. Taheri et al. [12] described an independent location fingerprinting system, Locus, for comparing the performance of various location fingerprinting algorithms. Kontkanen et al. [13] showed that the probabilistic modeling approach offers a theoretical solution to the positioning problem and many other issues such as calibration, active learning, error estimation, and tracking with history. This solution was also used to develop the Ekahau (http://www.ekahau.com/) commercial WLAN location fingerprinting positioning system. Ching et al. [14] presented the WiPos system using WiFi APs deployed across a university and the system involved developing a server and a client running on the Android platform. Testing showed that university’s WiFi network is sufficient to provide “room to room” level accuracy.

In recent years, a number of indoor positioning systems have been developed by integrating other techniques with WLAN location fingerprinting to enhance the indoor positioning performance. Wang et al. [15] presented a floor-map-aided WiFi/pseudo-odometry integration based indoor positioning system and the field experiment showed that it could reliably achieve meter-level accuracy. Karlsson et al. [16] utilized signals of both 2.4 and 5.0 GHz to obtain more information of WiFi, and a particle filter was used to combine location fingerprinting with PDR to improve the accuracy. Ban et al. [17] proposed a high accuracy indoor positioning method by using residual magnetism in addition to PDR and WiFi-based localization methods. They evaluated the method in real environments and confirmed that it could provide accurate indoor positioning with a mean error less than 8 m and more accurate position detection than existing techniques. Kim et al. [18] combined the magnetic field strength, cellular signal strength, and WiFi to build a hybrid system to address some limitations in WiFi localization, and the experiment demonstrated that the performance was improved in terms of not only accuracy but also computational efficiency.

In addition to the development of different systems, many researchers have conducted research on novel algorithms. Battiti et al. [19] presented a neural network method (the multilayer perceptron) for building a flexible mapping between the raw signal measurements and the position of the mobile terminal. The average accuracy reached approximately 2.3 meters when the environmental changes during the day were taken into account. Youssef et al. [20] provided two implementations for indoor location determination, joint clustering and incremental triangulation, and described the performance regarding accuracy and computation load. The results of their experiment showed that both techniques achieved over 90% accuracy within 7 feet with low computation. Saha et al. [21] assessed the performance of different classifiers including neural network, nearest neighbor, and histogram matching method. The neural network algorithm provided an error of less than one meter with 72% probability, less than 2.6 meters with over 95% probability, and with almost 98% probability it could infer the location within 3.3 meters. A hybrid method that combines the strength of radio frequency propagation loss with location fingerprinting was developed by Kwon et al. in [22]. This hybrid method exhibited 20 to 40% improvement in positioning accuracy to that of competing methods in real site experiment and reached 5 to 7 meters even for a very sparse placement of APs. Ito and Kawaguchi [23] proposed the Bayesian-based location estimation system which achieved an accuracy of 3.0 meters with a cumulative probability of 0.35 in a lecture room and 0.81 in a hallway. Roos et al. [24] studied the WLAN user location estimation problem following a machine learning framework and presented a probabilistic model to solve the estimation problem. 
In their test, an average location estimation error below 2 meters was easy to obtain, and the two probabilistic methods produced slightly better results than nearest neighbor methods. Yu et al. [25] utilized a support vector machine (SVM) algorithm in the location fingerprinting system and compared it with three kernel functions. Experimental results indicated that the algorithm could improve the localization accuracy and, among the three kernel functions, the radial basis function performs best. Feng et al. [26] presented an accurate RSSI-based indoor positioning system using compressive sensing method, and the experimental results showed that the proposed system leads to substantial improvement in localization accuracy and complexity over the widely used traditional methods. Lu et al. [27] proposed the extreme learning machine with dead zone to address the problems related to signal variations and environmental dynamics in indoor settings. And the real-world experimental results demonstrated that the proposed algorithm could not only provide higher accuracy but also improve the repeatability.

Other research aspects such as fingerprinting database building have also attracted research interest, especially in recent years. Li et al. [28] presented a method based on Kriging to reduce the workload, save training time, and make fingerprinting techniques more flexible and easier to implement. Pan et al. [29] presented a system called LeManCoR based on manifold coregularization, a machine learning technique for building a mapping function between data; the system could adapt the static mapping function effectively and is robust to the number of RPs. Zhao et al. [30] improved a WiFi fingerprinting based indoor positioning system by efficiently combining the universal Kriging interpolation method, $k$-nearest neighbor ($k$-NN), and naive Bayes classifier, and their lab experiments showed that 28 observation points could achieve an average positioning error of 1.265 m. Ma et al. [31] proposed a fingerprint recovery method based on the inexact augmented Lagrange multiplier algorithm. The experiment results indicated that the method could precisely recover the fingerprint and achieve good positioning performance. Liu et al. [32] designed and tested a fast setup algorithm for collecting data for a fingerprinting database with the help of smartphone built-in motion sensors. Experiments showed that there was no significant difference in positioning accuracy between the fast setup method and the traditional method. A novel method called RSSI geography weighted regression was proposed by Du et al. [33] to solve the fingerprint database construction problem. Extensive experiments were performed to validate that the proposed method was robust and workforce efficient. Further, two autonomous crowdsourcing systems were proposed to build fingerprint databases on handheld devices by Zhuang et al. [34]. The proposed systems can run on smartphones, build and update databases autonomously, and adaptively account for dynamic environments.
Results in different test scenarios indicated that the average positioning errors of both proposed systems are all less than 5.75 m.

Different from the related papers that focus on building individual systems, implementing different algorithms, and constructing fingerprint databases, this paper focuses on impact factors. In the past, some impact factors have been studied. Prasithsangaree et al. [35] presented a study of the positioning performance and placement issues, including algorithm, granularity of the grid in the database, AP fault tolerance, and building architecture. Li et al. [36] discussed the pros and cons of different techniques used in WLAN location fingerprinting, including the number of RPs in use, the parameter $k$ in the $k$-NN algorithm, distance weighting and universal Kriging in generating the fingerprinting database, and the probabilistic algorithm in online processing. Honkavirta et al. [37] provided a comparative survey on WLAN RSSI location fingerprinting by introducing the mathematical formulation and tuning the parameters in the formulation. Although papers [35–37] already analyzed some impact factors, to the best of our knowledge, there is no systematic summary of the impact factors using Ishikawa diagrams [9] and no experimental analysis of the nonalgorithm factors, which are the main focus of this paper.

3. Analysis and Summary of Potential Impact Factors Using the Ishikawa Diagram

This section first introduces the basic quality control tool, Ishikawa diagram, by presenting an example and the diagram constructing steps, and gives the reasons for choosing this tool. It then constructs an Ishikawa diagram for WLAN location fingerprinting systems and analyzes the potential impact factors to positioning performance in a step-by-step manner, considering WLAN radio wave transmitting, propagating, receiving, and processing.

3.1. The Ishikawa Diagram Tool

The Ishikawa diagram [9] was developed by Dr. Kaoru Ishikawa at the University of Tokyo in 1943, who is also a pioneer in quality management techniques in Japan. The diagram has been used in process improvement methods to identify the contributing causes and factors likely to be causing a problem in a systematic way. There are also some other names for Ishikawa diagrams, such as fishbone diagrams, Fishikawa, herringbone diagrams, and cause-and-effect diagrams. Figure 1 shows an example of the Ishikawa diagram, and the name of fishbone diagram comes from its shape. As illustrated in Figure 1, a completed Ishikawa diagram includes a central “spine” and several branches reminiscent of a “fish skeleton.” The “fish head” represents the main problem, and the potential factors causing a problem are indicated in the “fish bones” of the diagrams. In constructing an Ishikawa diagram, the first thing is to determine the “fish head” which presents a problem of interest, and the discussion of factors should focus on the problem. The next step is to decide how to categorize the main factors. There are two basic methods to do so, including by function and by process sequence. The third step is to determine the factors in every main factor by analysis, discussion, and summary. Some factors may have subfactors, and all the main factors, factors, and subfactors should be presented in “fish bones.”

The Ishikawa diagram is considered one of the seven basic tools of quality control [9]. Moreover, this methodology can be applied to any problem and can be tailored by users to fit their circumstances. The other six quality control tools are the check sheet, control chart, histogram, Pareto chart, scatter diagram, and stratification, and these six tools emphasize process control or post hoc analysis. In contrast, the Ishikawa diagram is an initial step in the screening process of quality control and can be a useful technique for organizing some of the information generated in preexperimental planning [38]. After potential causes and factors have been identified systematically, further testing is necessary to confirm the true causes and factors and their impact patterns. In addition, this method encourages group participation and utilizes group knowledge. The structure provided by the Ishikawa diagram also helps team members to think systematically and follow a structured approach [39]. Based on these characteristics and the objectives of this paper, the Ishikawa diagram method is adopted.

3.2. The Ishikawa Diagram Construction for WLAN Fingerprinting

As previously described, the first thing in constructing an Ishikawa diagram is to present a problem of interest. In this paper, the problem should be the positioning performance of a WLAN RSSI fingerprinting system. The next step is to decide how to categorize the factors. This paper chooses the process sequence in a WLAN RSSI fingerprinting system. Radio signal goes through four steps in location fingerprinting system in either offline or online phase: transmitting, propagating, receiving, and processing. These four steps can be regarded as main factors in the Ishikawa diagram. In the WLAN fingerprinting system, radio signal is first transmitted from APs (online/offline), propagated from transmitters to receivers (online/offline) through environment, received on RPs (offline) or any unknown target location (online), and finally processed to construct fingerprint database (offline) or compute location (online). The positioning performance can be influenced in every step. Therefore, the potential factors will be analyzed step-by-step in detail as follows.

In the signal transmitting step, all factors are about transmitters, which are APs in WLAN RSSI location fingerprinting systems. As previously mentioned, this technique is a cost-saving solution, which means it can make full use of already deployed APs, such as WiFi hotspots. However, hotspots are made by distinct vendors and are diverse with different device parameters such as maximum transmit power. Meanwhile, hotspots may also be placed with different densities, distributions, and heights. As such, a receiver may access different numbers of hotspots in different areas. In summary, differences in APs’ device models, density, distribution, and height may influence the RSSI values in the receiver at this step.

At the second step, the radio signal propagates from transmitters to receivers, and RSSI may be influenced by the indoor physical environment and by radio signals from other sources. For example, indoor structure, building material, and furniture placement may have impacts on radio signal attenuation, reflection, and diffraction [40]. Crowds of people in indoor environments may absorb and obstruct the radio signal, and other dynamic conditions such as temperature and humidity may influence radio signal propagation as well. Interference may appear between WLAN signals from different APs or from other sources such as a microwave oven.

In the signal receiving stage, RSSI sampling in offline learning should take place on RPs whose coordinates are known in advance. Differences in operators and in RPs’ density, distribution, and height will influence the complexity level and RSSI values in fingerprinting databases. In the online positioning phase, the main work is to search for the nearest RP or RPs by using RSSI sampled at an unknown point, which means that sampling is also needed online. Thus, differences between the receiver models used offline and online may have impacts. For example, a single smartphone is usually used in fingerprint database construction, but users’ devices in online positioning are diverse, with different brands and models. Additionally, different sampling rates and numbers of samples may also influence the samples in fingerprint databases and online positioning results.

The last step is signal processing; the works in this step involve RSSI fingerprint database constructing (offline phase) and location computing (online phase). In database constructing phase, signal filtering method, feature extraction, and selection may change the data and data structure in the fingerprint database. And these factors also determine the features used in online positioning. In online positioning processing, positioning result is computed from received RSSI online and fingerprint database using machine learning algorithms. Hence, the main factors of this step are signal filtering, feature extraction, feature selection, and machine learning algorithm.

To summarize the potential impact factors which influence the WLAN RSSI location fingerprinting performance, an Ishikawa diagram is used in Figure 2. The “fish head” in Figure 2 is WLAN RSSI location fingerprinting positioning performance and “fish bones” are main factors and potential factors.

4. Simulation Platform and Factors of Interest Selection

According to Section 3, there are many potential impact factors on WLAN RSSI location fingerprinting positioning performance. To understand the impact pattern between factors and performance, a number of experiments with different factor settings need to be conducted. For this purpose, this paper builds a simulation platform used as the test environment. The positioning performance should be measurable as well; thus, this paper chooses the root mean square error (RMSE) and cumulative distribution function (CDF) to present the positioning error. Because only some factors can be controlled in the simulation platform, simulation feasibility should be taken into consideration when choosing factors of interest for experiments. The potential factors are classified into four types (controllable, uncontrollable, nuisance, and held-constant factors) in another Ishikawa diagram, and all the controllable factors are regarded as factors of interest.

4.1. A WLAN Location Fingerprinting Simulation Platform

The platform simulates a cuboid test field with 10-meter length, 10-meter width, and 3-meter height. All APs are placed at 3-meter height on the edges of the testing field, and sampling RSSI in offline learning is on grid-shape RPs at one-meter height. Figure 3 shows the placement of four APs in the field and sampling RSSI on a particular RP with coordinates of .

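Radio signal propagating is described by path loss, which is a positive quantity measured in decibels (dB). The common model for indoor radio signal propagating is the Log-distance path loss model [41–44] as follows:

$$PL(d) = PL(d_0) + 10 n \log_{10}\left(\frac{d}{d_0}\right) + X_\sigma \tag{1}$$

where $PL(d)$ is the path loss at distance $d$; $PL(d_0)$ is the path loss at the near distance $d_0$, a known received power reference point that is usually chosen as 1 (m) in indoor environments; $n$ is the attenuation factor; and $X_\sigma$ is the signal noise error, which obeys a normal distribution with zero mean and standard deviation $\sigma$. Thus, the received power at a distance $d$ can be calculated using

$$P_r(d) = P_t - PL(d) \tag{2}$$

where $P_t$ is the transmitted power and $P_r(d)$ is the received power.

When setting the near distance $d_0 = 1$ (m), the path loss $PL(d_0)$ can be calculated by the free space propagation model in

$$PL(d_0) = -10 \log_{10}\left(\frac{P_r}{P_t}\right) = -10 \log_{10}\left(\frac{G_t G_r \lambda^2}{(4\pi)^2 d_0^2}\right) \tag{3}$$

where $P_t$ is the transmitted power, $P_r$ is the received power, $G_t$ is the transmitter antenna gain, $G_r$ is the receiver antenna gain, and $\lambda$ is the wavelength.

For the frequency $f$ of the WLAN radio signal (GHz), the velocity of the radio signal is $c = 3 \times 10^8$ (m/s), and the wavelength is $\lambda = c/f$. Given $G_t$ and $G_r$, formula (3) gives $PL(d_0)$ (dB). Usually, a real AP device’s transmitted power is 20 (dBm), which is 100 milliwatts. Thus, the received power measured by RSSI in the simulation work can be calculated by

$$RSSI = P_r(d) = 20 - PL(d_0) - 10 n \log_{10}(d) - X_\sigma \tag{4}$$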
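The propagation model can be sketched in a few lines of Python. This is a minimal illustration rather than the platform's actual code: the function names are hypothetical, and the 2.4 GHz carrier frequency and unit antenna gains are assumptions made here for the example. The reference distance is 1 m and the transmitted power is 20 dBm, as stated in the text.

```python
import math
import random

C = 3.0e8  # propagation speed of the radio signal (m/s)


def free_space_pl_d0(freq_hz, d0=1.0, gt=1.0, gr=1.0):
    """Free-space path loss (dB) at the reference distance d0 (m).

    Assumption: unit antenna gains unless stated otherwise.
    """
    lam = C / freq_hz  # wavelength (m)
    ratio = gt * gr * lam ** 2 / ((4.0 * math.pi) ** 2 * d0 ** 2)
    return -10.0 * math.log10(ratio)


def simulate_rssi(d, n, sigma, pt_dbm=20.0, freq_hz=2.4e9, rng=random):
    """One simulated RSSI sample (dBm) at distance d (m), with d0 = 1 m.

    n is the attenuation factor and sigma the noise standard deviation (dB).
    Assumption: a 2.4 GHz carrier; d must be positive.
    """
    pl_d0 = free_space_pl_d0(freq_hz)
    noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    path_loss = pl_d0 + 10.0 * n * math.log10(d) + noise
    return pt_dbm - path_loss


print(round(free_space_pl_d0(2.4e9), 2))  # 40.05
```

Under these assumptions the reference path loss is about 40 dB, so a noise-free sample at 10 m with an attenuation factor of 2 comes out near -40 dBm. Passing a seeded `random.Random` instance as `rng` makes a simulation run repeatable.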

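As aforementioned, there are two kinds of commonly used algorithms in location fingerprinting: deterministic and probabilistic methods. Traditional deterministic algorithms can be easily implemented based on $k$-NN. Probabilistic algorithms are based on statistical inference between the positioning target and the fingerprint database, and they search for the positioning result by using the maximum likelihood. Concretely, suppose the number of RPs is $M$, $RP_i$ is the $i$th RP, $S_i$ is the RSSI fingerprint on $RP_i$, and the number of APs is $N$, so that $S_i = (s_{i1}, s_{i2}, \ldots, s_{iN})$; $T$ is the positioning target point, and $S_t$ is the RSSI fingerprint on $T$; thus, $S_t = (s_{t1}, s_{t2}, \ldots, s_{tN})$. Then, the positioning result of $T$ can be calculated from formula (5) if the 1-NN is taken as the positioning result. 1-NN is also used in this paper.

$$\hat{T} = \arg\min_{RP_i,\; 1 \le i \le M} D(S_t, S_i) \tag{5}$$

In formula (5), the distance is measured by the Euclidean distance of RSSI, which is presented as follows:

$$D(S_t, S_i) = \sqrt{\sum_{j=1}^{N} (s_{tj} - s_{ij})^2} \tag{6}$$

Formula (7) shows the positioning result obtained by the probabilistic algorithm:

$$\hat{T} = \arg\max_{RP_i,\; 1 \le i \le M} P(RP_i \mid S_t) \tag{7}$$

According to Bayes’ theorem, the probability can be further transformed into

$$P(RP_i \mid S_t) = \frac{P(S_t \mid RP_i)\, P(RP_i)}{P(S_t)} \tag{8}$$

In formula (8), $P(S_t)$ is the same for all RPs in the search, and $P(RP_i)$ is the prior probability, usually regarded as $1/M$. Thus, the positioning result searching is transformed into formula (9).

$$\hat{T} = \arg\max_{RP_i,\; 1 \le i \le M} P(S_t \mid RP_i) \tag{9}$$

The probability that a signal $s$ appears on the $i$th RP is $P(s \mid RP_i)$, which can be approximately calculated by parametric distributions. In this paper, a Gaussian distribution is selected as shown in formula (10),

$$P(s \mid RP_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right) \tag{10}$$

where $\mu_i$ and $\sigma_i$ are statistic parameters (mean and standard deviation) according to $S_i$.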
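Both searches can be sketched briefly in Python. The function names and the toy fingerprint database below are hypothetical. The probabilistic variant assumes that the RSSI values from different APs are independent, so the per-AP Gaussian log-likelihoods are summed, and that every standard deviation is positive; neither assumption is stated in the text.

```python
import math


def euclidean_rssi_distance(s_t, s_i):
    """Euclidean distance between two RSSI vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s_t, s_i)))


def locate_1nn(s_t, rp_coords, rp_means):
    """Deterministic 1-NN: coordinates of the RP with the closest mean RSSI."""
    best = min(range(len(rp_coords)),
               key=lambda i: euclidean_rssi_distance(s_t, rp_means[i]))
    return rp_coords[best]


def gaussian_log_likelihood(s_t, means, stds):
    """Log-likelihood of s_t at one RP (assumes independent APs, stds > 0)."""
    total = 0.0
    for s, mu, sd in zip(s_t, means, stds):
        total += -math.log(math.sqrt(2.0 * math.pi) * sd)
        total += -((s - mu) ** 2) / (2.0 * sd ** 2)
    return total


def locate_max_likelihood(s_t, rp_coords, rp_means, rp_stds):
    """Probabilistic search: RP maximizing the likelihood under a uniform prior."""
    best = max(range(len(rp_coords)),
               key=lambda i: gaussian_log_likelihood(s_t, rp_means[i], rp_stds[i]))
    return rp_coords[best]


# Toy database: three RPs, two APs (illustrative values only).
rp_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rp_means = [[-40.0, -60.0], [-50.0, -50.0], [-60.0, -40.0]]
rp_stds = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
s_t = [-49.0, -52.0]
print(locate_1nn(s_t, rp_coords, rp_means))                       # (1.0, 0.0)
print(locate_max_likelihood(s_t, rp_coords, rp_means, rp_stds))   # (1.0, 0.0)
```

Working with log-likelihoods instead of raw probabilities avoids numerical underflow when many APs are visible, and it leaves the arg max unchanged.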

4.2. Output Response and Selection of Factors of Interest

For the response output of WLAN RSSI location fingerprinting performance, the measurement should be based on the positioning error obtained by formula (11):

$$e = \lVert p - \hat{p} \rVert_2 \tag{11}$$

where $e$ is the positioning error, $p$ is the true coordinates of a test point, and $\hat{p}$ is the positioning result of the test point.

To facilitate description, this paper chooses the RMSE as the measurement of positioning results, which can be calculated by (12). To give more details about the errors, this paper also presents the error CDF, measured by the cumulative number of test points (CNTP).

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{\mathrm{TP}}} \sum_{i=1}^{N_{\mathrm{TP}}} e_i^2} \tag{12}$$

where $N_{\mathrm{TP}}$ is the total number of test points and $e_i$ is the $i$th positioning error.
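As a minimal sketch of how these two outputs could be computed, the following Python functions take lists of true and estimated coordinates. The function names and the sample coordinates are hypothetical, and the positioning error is assumed to be the Euclidean distance between the true and estimated points.

```python
import math


def positioning_errors(true_pts, est_pts):
    """Euclidean error of each test point (assumed error definition)."""
    return [math.dist(p, q) for p, q in zip(true_pts, est_pts)]


def rmse(errors):
    """Root mean square of the positioning errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def cntp(errors, thresholds):
    """Cumulative number of test points with error <= each threshold."""
    return [sum(1 for e in errors if e <= t) for t in thresholds]


# Illustrative values only.
true_pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
est_pts = [(0.0, 0.0), (1.0, 2.0), (5.0, 6.0)]
errors = positioning_errors(true_pts, est_pts)
print(errors)                          # [0.0, 1.0, 5.0]
print(cntp(errors, [0.5, 1.0, 5.0]))   # [1, 2, 3]
```

Dividing each CNTP value by the number of test points gives the empirical CDF value at the corresponding error threshold.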

According to Montgomery [38], impact factors can be classified as controllable, uncontrollable, nuisance, and held-constant factors, and controllable factors are factors of interest. Because only some factors can be controlled in the simulation platform, simulation feasibility should be taken into consideration when choosing factors of interest for experiments. Additionally, because this paper focuses on nonalgorithm factors, signal filtering, feature extraction, feature selection, and machine learning algorithm are regarded as held-constant factors. Specifically, this paper uses two algorithms, deterministic and probabilistic, for positioning, with the RSSI mean value and standard deviation as features.

In the signal transmitting step, the density of APs relies on the number in sight. Since the simulation field is a fixed 10 × 10 (m²) square, the density of APs is measured by the number of APs (APN). The placement heights of APs always relate to the building floor height, which means they are similar on the same floor. Therefore, this factor can be regarded as a held-constant factor and is set as 3 (m) in the simulation platform. This setting also simplifies the APs’ distribution problem from three-dimensional to two-dimensional. The two-dimensional APs’ distribution can be described by two aspects: (1) the distance from the APs’ centroid to the indoor field centroid, which is notated as centroid distance (CD), and (2) the coefficient of variation (CV) of the distances from all APs to the centroid, which can be calculated by formula (8). With regard to differences among AP models, this paper classifies this factor as a nuisance factor, which is outside the research scope.
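The two distribution measures can be illustrated with a short Python sketch. Several details are assumptions made here because the text leaves them open: the function name is hypothetical, the CV is taken over distances to the APs' own centroid, the population standard deviation is used, and the field centroid is placed at (5, 5) for the 10 × 10 (m²) field.

```python
import math
import statistics


def ap_layout_metrics(aps, field_centroid=(5.0, 5.0)):
    """Return (CD, CV) for a list of two-dimensional AP coordinates.

    CD: distance from the APs' centroid to the field centroid.
    CV: std / mean of distances from each AP to the APs' centroid
        (assumed reading; population standard deviation).
    """
    cx = sum(x for x, _ in aps) / len(aps)
    cy = sum(y for _, y in aps) / len(aps)
    cd = math.dist((cx, cy), field_centroid)
    dists = [math.dist(ap, (cx, cy)) for ap in aps]
    mean_d = statistics.fmean(dists)
    cv = statistics.pstdev(dists) / mean_d if mean_d > 0 else 0.0
    return cd, cv


# Four APs on the field corners: centered and perfectly regular.
cd, cv = ap_layout_metrics([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)])
print(cd, cv)  # 0.0 0.0
```

A small CD means the APs are centered on the field, while a small CV means they sit at similar distances from their centroid, as in a ring-like layout.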

Radio signal propagating is described by the Log-distance path loss model as in formula (1). Some other propagating models have been proposed to obtain more accurate signal strength by improving formula (1) to account for the impact of floors and walls, so as to retrieve precise distances for indoor positioning. For example, the RADAR [6] system presented a propagating model which takes walls’ impact into account. In this paper, the Log-distance path loss model is selected, using the coefficients $n$ and $\sigma$ to simulate complex indoor environments. It is impracticable to control physical environment impact factors in a real experiment platform. However, the influence of indoor structure, building material, and furniture placement can be summarized as the signal propagating attenuation factor (notated as $n$) in the radio signal propagating model in the simulation platform. Meanwhile, dynamic crowds, temperature, humidity, and other signal sources can be regarded as the signal noise standard deviation (notated as $\sigma$) in the propagating model as well. Thus, attenuation factor and signal noise can be chosen as controllable factors in the signal propagating step, and the empirical values of $n$ and $\sigma$ are presented in Table 1 according to Rappaport [41].

For the signal receiving, the fingerprint sampling can be operated on RPs in the grid shape. Thus, the distribution can be regarded as held-constant factors, and the density of RPs can be measured by RPs grid interval distance (GID). In the simulation platform, RPs’ heights, sample rate, and the number of samples are also regarded as held-constant factors and are set as 1 (m), 1 (Hz), and 240 (four orientations), respectively. Nevertheless, operators may be a person or a robot with different receiver models, and these two factors are complex in reality. In this paper, they are classified as nuisance factors.
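A grid of RPs for a given GID can be generated as in the following minimal sketch. The function name is hypothetical, and the grid is assumed to start at the field origin and to include points on the field edges, which the text does not specify.

```python
import math


def grid_reference_points(gid, length=10.0, width=10.0, height=1.0):
    """RP coordinates on a square grid with interval gid (m).

    Assumption: the grid starts at (0, 0) and includes the field edges.
    """
    nx = int(math.floor(length / gid + 1e-9))  # number of intervals along x
    ny = int(math.floor(width / gid + 1e-9))   # number of intervals along y
    return [(i * gid, j * gid, height)
            for i in range(nx + 1) for j in range(ny + 1)]


print(len(grid_reference_points(1.0)))  # 121
print(len(grid_reference_points(2.0)))  # 36
```

Under these assumptions, doubling the GID from 1 m to 2 m cuts the number of RPs from 121 to 36, which shows how quickly the offline sampling workload changes with RPs density.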

Figure 4 illustrates the response output and classified factors using another Ishikawa diagram. It should be noted that a controllable factor remains constant and does not change throughout an experiment test, while an uncontrollable factor varies randomly during the test. In reality, the signal noise $\sigma$ in radio signal propagating is an uncontrollable factor, but for the controllability of the simulation platform, $\sigma$ is regarded as a controllable factor as shown in Figure 4. Table 2 lists all the controllable factors (factors of interest), measurement details, and notations.

5. Experiment and Results

According to Section 4, a simulation platform was built for experiment and it took five controllable factors as factors of interest. In the experiment, the OFAT (one-factor-at-a-time) analysis method was adopted, which omits the interaction impact of all controllable factors. This section describes the details of experiment settings and results.

5.1. Tests Settings in the Analysis Experiment

To conduct the OFAT analysis experiment, several tests with different settings of controllable factors were conducted in the simulation system. Table 3 shows all the experiment settings. It should be noted that a (start, end, step) notation was used to describe a level range and a list notation was used to enumerate discrete factor levels. For example, in the first test, APN was set with 20 levels as (1, 20, 1) and σ was set with five levels as 1, 5, 8, 10, and 15 (dB). Meanwhile, it is inconvenient to show the CD and CV of APs distribution in a single table cell because CD and CV depend on the coordinates of APs, which vary in every test. Therefore, this paper uses the term “open-shape” to describe APs distribution in the experiment when the distribution is not the test factor. The “open-shape” APs distribution favors a dispersed shape, which simulates the placement of APs in a real environment for communication. Figure 5 shows the open-shape APs placement when APN is set as 3, 5, 10, 15, and 20, which are common settings in Table 3. It should also be noted that the empirical coefficient values for indoor propagation in Table 1 were taken into consideration in all settings for attenuation factor and signal noise.
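The (start, end, step) notation can be expanded into explicit factor levels as in the short sketch below. The helper name is hypothetical; the example ranges are the ones stated in the test descriptions of this section.

```python
def expand_levels(start, end, step):
    """Expand a (start, end, step) range into a list of levels, end inclusive."""
    count = int(round((end - start) / step)) + 1
    # Rounding guards against floating-point drift for fractional steps.
    return [round(start + i * step, 10) for i in range(count)]

# APs density test: APN from 1 to 20 with a step of 1 AP.
apn_levels = expand_levels(1, 20, 1)
# Attenuation factor test: n from 1 to 5.5 with an interval of 0.5.
n_levels = expand_levels(1, 5.5, 0.5)
# Discrete levels are simply enumerated, e.g. signal noise in dB.
sigma_levels = [1, 5, 8, 10, 15]
```

In an OFAT test, one such list is swept for the factor under test while every other factor is held at a fixed setting.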

According to Table 3, the first test factor was APs density, measured by APN. In the APs density test, APN changed from 1 to 20 with a step of 1 AP, and every APN setting used the open-shape APs distribution. The attenuation factor n was set as 2, the signal noise σ took five levels, that is, 1, 5, 8, 10, and 15 (dB), and GID was 1 (m).

The second test factor was APs distribution measured by CD and CV, and it took 110 different shapes of five APs into consideration. In the APs distribution test, APN was 5, n was 2, σ took five levels, 1, 5, 8, 10, and 15 (dB), and GID was 1 (m). Figure 6 shows all 110 different shapes in the simulation field.

Considering the empirical coefficient values for indoor propagation in Table 1, the next test factor was the attenuation factor n, which ranged from 1 to 5.5 with an interval of 0.5 in the test. The APs were placed in the open-shape with APN set to 3, 5, 10, 15, and 20, σ was set as 5 (dB), and GID was set as 1 (m).

The next test factor, the signal noise standard deviation σ, ranged from 1 to 15 (dB) with a step of 1 (dB) in the signal noise test. The APs were deployed in the same way as in the attenuation factor test, n was 2, and GID was 1 (m).

The last controllable test factor was RPs density, described by the GID between two neighboring RPs. In the test of this factor, GID took eight different settings: 0.1, 0.2, 0.5, 1, 1.25, 2, 2.5, and 5 (m). Figure 7 shows the RPs distributions for the different GID values. Other factors were set as follows: five APs in open-shape, n was 2, and σ had five levels, 1, 5, 8, 10, and 15 (dB).

In every OFAT test, there was one factor under test, and the procedure on the simulation platform is as follows: (1) input the coordinates of APs, GID, and the number of test points (set as 100 in this study), (2) generate RPs and the fingerprint database of RPs, (3) generate the test points randomly and generate the signal strength of all the test points, (4) locate all the test points with the deterministic algorithm, (5) locate all the test points with the probabilistic algorithm, and (6) output the error, calculate the statistics, and plot the RMSE and CDF graphs. These steps are shown in Figure 8.
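The six steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' platform: the k-NN centroid estimator, the independent-Gaussian likelihood, the square field of side `size`, the reference power `p0`, the number of samples per RP, and all function names are hypothetical choices made only to show the flow of one test.

```python
import math
import random
import statistics

def rssi(ap, pt, n, sigma, rng, p0=-40.0):
    """One noisy RSSI sample (dBm); p0 at 1 m is an assumed reference power."""
    d = max(math.dist(ap, pt), 1.0)  # clamp at the 1 m reference distance
    return p0 - 10.0 * n * math.log10(d) + rng.gauss(0.0, sigma)

def build_database(aps, rps, n, sigma, rng, samples=20):
    """Step 2: store per-RP mean and standard deviation of RSSI for each AP."""
    db = []
    for rp in rps:
        readings = [[rssi(ap, rp, n, sigma, rng) for _ in range(samples)] for ap in aps]
        means = [statistics.fmean(r) for r in readings]
        stds = [max(statistics.pstdev(r), 0.5) for r in readings]  # floor avoids zero variance
        db.append((rp, means, stds))
    return db

def locate_deterministic(db, obs, k=3):
    """Step 4: k-NN in signal space; returns the centroid of the k closest RPs."""
    nearest = sorted(db, key=lambda e: math.dist(e[1], obs))[:k]
    xs = [e[0][0] for e in nearest]
    ys = [e[0][1] for e in nearest]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def locate_probabilistic(db, obs):
    """Step 5: RP that maximizes an independent-Gaussian log-likelihood."""
    def loglik(entry):
        _, means, stds = entry
        return sum(-math.log(s) - (o - m) ** 2 / (2.0 * s * s)
                   for o, m, s in zip(obs, means, stds))
    return max(db, key=loglik)[0]

def run_test(aps, gid, n, sigma, test_points=100, size=20.0, seed=0):
    """Steps 1 to 6 for one factor setting; returns (RMSE_det, RMSE_prob) in meters."""
    rng = random.Random(seed)
    steps = int(size / gid + 1e-9)
    rps = [(i * gid, j * gid) for i in range(steps + 1) for j in range(steps + 1)]
    db = build_database(aps, rps, n, sigma, rng)
    sq_det = sq_prob = 0.0
    for _ in range(test_points):
        truth = (rng.uniform(0.0, size), rng.uniform(0.0, size))  # step 3
        obs = [rssi(ap, truth, n, sigma, rng) for ap in aps]
        sq_det += math.dist(truth, locate_deterministic(db, obs)) ** 2
        sq_prob += math.dist(truth, locate_probabilistic(db, obs)) ** 2
    return math.sqrt(sq_det / test_points), math.sqrt(sq_prob / test_points)
```

A single setting could be run as `run_test([(2, 2), (18, 2), (10, 10), (2, 18), (18, 18)], gid=1.0, n=2.0, sigma=5.0)`, where the AP coordinates are made up for the example. An OFAT sweep repeats this call while changing only the factor under test.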

5.2. Tests Results in the Analysis Experiment
5.2.1. APs Density

Figure 9 shows the test results of positioning error obtained by applying the deterministic and probabilistic algorithms at different APs densities. Figure 9(a) shows the trend of RMSE over different APNs in the simulation test area at different σ levels when n = 2 and GID = 1 (m). In this figure, we can find that the more APs a receiver can access, the better the positioning result when the noise error is low. But if the noise error is high, for example, σ = 15 (dB), APN becomes less important. It should be noted that the lowest noise level tested, σ = 1 (dB), approximates an ideal noise situation, and in this ideal situation, RMSE is high when APN is less than 3 and declines gently when APN increases from 3 to 20. This trend indicates that, theoretically, at least 3 APs are needed for RSSI location fingerprinting. Figure 9(b) gives more details about the error distribution at different σ levels, and a line located towards the “up-left” of the CDF plot means better accuracy. It can be seen from Figure 9(b) that lines with higher APN are more towards “up-left” than ones with lower APN, but as the noise error increases, the error lines move towards “down-right” and closer to each other. These lines indicate that a large APN can reduce the positioning error, but as the noise becomes higher, the influence of APN diminishes.
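For reference, the two error summaries used in these results can be computed from a list of per-test-point positioning errors as in the sketch below. The function names are illustrative; the empirical CDF shown is the standard step-function definition, which may differ in presentation from the plots in the figures.

```python
import math

def rmse(errors):
    """Root mean square error of a list of positioning errors (m)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def empirical_cdf(errors):
    """Return (error, fraction of test points with error <= that value) pairs."""
    xs = sorted(errors)
    total = len(xs)
    return [(x, (i + 1) / total) for i, x in enumerate(xs)]

# Example with four hypothetical errors in meters.
sample_errors = [2.0, 1.0, 4.0, 3.0]
curve = empirical_cdf(sample_errors)
```

A curve that reaches a high fraction at a small error lies towards the “up-left” of the plot, which is why that position indicates better accuracy. Multiplying each fraction by the number of test points gives a cumulative count of test points instead of a fraction.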

5.2.2. APs Distribution

Figure 10 shows the test results of positioning errors obtained by using the deterministic and probabilistic algorithms on 110 different APs distributions measured by CD and CV, with the model set as APN = 5, n = 2, and GID = 1 (m). Figures 10(a) and 10(c) plot the RMSE against CD and CV, respectively, at different σ levels; however, we can hardly find clear patterns of RMSE with CD and CV in these two figures, except that higher σ gives higher RMSE levels. Figures 10(b) and 10(d) plot the CNTP with ordered CD and CV. In these two figures, we can see that lines with lower σ are more towards “up-left” than those with higher σ, but whatever σ is, the lines show no ordering with CD and CV. Thus, similar to RMSE, we can hardly summarize an influence trend of CD and CV on the error distribution according to Figures 10(b) and 10(d).

5.2.3. Attenuation Factor

Figure 11 illustrates the test results of positioning errors obtained by using the deterministic and probabilistic algorithms at different values of the attenuation factor n. Figure 11(a) shows the RMSE versus n under σ = 5 (dB) and GID = 1 (m) at different APN levels. According to this figure, the RMSE declines as n increases. A larger APN also gives a lower RMSE level: for example, the line with APN = 20 lies below the line with APN = 3. But the impact of APN diminishes once APN is already large; for instance, the lines for APN = 10, 15, and 20 are closer to each other than those for APN = 3 and 5. Error distributions are shown in Figure 11(b); it can be seen that lines with higher n and more APs are more towards “up-left,” which means that higher n and more APs favor better performance. However, error distributions are similar when APN is large, such as 10, 15, and 20, no matter which algorithm is used.

5.2.4. Signal Noise

Test results of positioning errors obtained by using the deterministic and probabilistic algorithms at different signal noise levels σ, with model settings n = 2 and GID = 1 (m), are illustrated in Figure 12. The RMSE results versus σ at different APN levels are shown in Figure 12(a). RMSE grows with σ for all APN settings in Figure 12(a), which means that the higher the noise level in the environment, the worse the positioning performance. It is interesting to see that as σ increases, the lines start close together, move apart, and then come close again. These lines indicate that APN can hardly influence the result when the environment has a very low or very high noise level, for example, when σ is lower than 3 (dB) or higher than 13 (dB) in Figure 12(a). Figure 12(b) shows the error distributions of positioning results. In this figure, the influence of signal noise can be clearly seen from the colored lines in every error distribution; however, the impact of APN is not clear because the error distributions are similar, especially when APN is larger than 3.

5.2.5. RPs Density

Test results of positioning errors obtained by using the deterministic and probabilistic algorithms at different RPs densities, measured by GID, are shown in Figure 13. The model was set as APN = 5 and n = 2, and Figure 13(a) illustrates the RMSE results versus GID at different σ levels. All the RMSE lines show increasing trends, but the increase becomes flatter when σ is high. This phenomenon is much clearer when the error distributions in Figure 13(b) are taken into consideration. When σ is low, all the CNTP lines with smaller GID are located more “up-left” no matter which algorithm is used. However, as σ increases, these lines tend to move “down-right” and much closer together. According to Figure 13, reducing GID does not benefit positioning performance significantly once GID is less than 1 (m), especially when noise is low. It seems that 1.25 (m) is a good choice for GID if noise is high.

6. Conclusions and Future Work

WLAN location fingerprinting is a cost-saving solution, but it still faces difficulties in practical applications because of factors such as inhomogeneous APs placement, nonstationary WLAN RSSI, and additional offline learning work. There are many factors that influence positioning performance in location fingerprinting systems. A good summary of potential factors and an analysis of their impact patterns can benefit the applications and quality control of location fingerprinting. The issue is challenging, and little effort has been made on systematic investigation.

This paper analyzed the impact factors of positioning performance in RSSI location fingerprinting systems step by step, considering radio signal transmitting, propagating, receiving, and processing, and summarized potential factors by using an Ishikawa diagram to provide a reference for further research. To facilitate the analysis experiment of factors and impacts, this paper presented a simulated WLAN location fingerprinting platform. The paper classified all the factors into controllable, uncontrollable, nuisance, and held-constant factors in another Ishikawa diagram with consideration of the feasibility of the simulation platform. Finally, the paper considered five controllable factors (APs density, APs distribution, radio signal propagating attenuation factor, radio signal propagating noise, and RPs density) as factors of interest and used the OFAT analysis method to reduce the complexity and number of tests in the factor analysis experiment.

The results indicate that high APs density, a high signal propagating attenuation factor, and high RPs density, together with a low level of signal propagating noise, favor better positioning performance, while APs distribution has no particular impact pattern on the performance. Increasing RPs density further is not necessary once GID is less than 1 meter. Moreover, high RPs density means a heavy workload for building the fingerprinting database.

According to the results, some observations can be drawn to guide quality control in applications of WLAN location fingerprinting: (1) at least 3 APs are needed for location fingerprinting, and deploying additional APs can improve the performance, especially in a noisy environment, regardless of how these APs are distributed. However, if the environment is very noisy (e.g., σ is about 15 (dB) according to the experiments), this method can hardly be effective; (2) an environment with complex structures, such as a unit with many walls and rooms (high attenuation factor), is preferable for deploying location fingerprinting to a simple one like an underground parking lot (low attenuation factor); and (3) increasing RPs density can benefit positioning, but it brings little gain when the GID is less than 1 meter. Meanwhile, high RPs density means more work for building and updating the fingerprinting database.

In the near future, real field experiments will be designed and conducted to further verify the conclusions given in this paper. It should be noted that, to reduce the complexity and number of tests in the factor analysis experiment, an OFAT method was used, which omits factor interactions. Although each test can reveal the interaction between two factors, a more complicated factor analysis experiment needs to be conducted to extend the OFAT experiment in the future. We also plan to carry out practical applications of WLAN location fingerprinting and control the quality of positioning results based on the outcomes of this paper.

Acronyms

WLAN: Wireless local area network
AP: Access point
OFAT: One-factor-at-a-time
GPS: Global positioning system
k-NN: k-nearest neighbor
CNTP: Cumulative number of test points
CD: Centroid distance
GID: Grid interval distance
RSSI: Received signal strength indicator
RP: Reference point
RMSE: Root mean square error
PDR: Pedestrian dead reckoning
CDF: Cumulative distribution function
APN: APs number
CV: Coefficient of variation.

Competing Interests

The authors declare no conflict of interests.

Authors’ Contributions

Keqiang Liu and Yunjia Wang conceived the paper and designed the experiments. Keqiang Liu summarized the impact factors, developed the simulation platform, and conducted the analysis experiment. Lixin Lin and Guoliang Chen participated in analyzing the data. Keqiang Liu mainly composed this paper, and Yunjia Wang, Lixin Lin, and Guoliang Chen contributed to revising it. All authors participated in elaborating the paper and proofreading.

Acknowledgments

The research work presented in this paper is partly supported by China National Key Research and Development Program (Grant no. 2016YFB0502102), the Natural Science Foundation of Jiangsu Province (no. BK20161181), the Key Laboratory of Advanced Engineering Surveying of National Administration of Surveying, Mapping and Geo-Information (Grant no. TJES1302), Education Department of Jiangsu (Grant no. KYLX_1394), the National Natural Science Foundation of China (no. 41371423), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (Grant no. SZBF2011-6-B35). All the authors would like to thank Professor Songnian Li from Ryerson University, Canada, for helping them in English writing which greatly improved the manuscript.