A reliable Indoor Positioning System (IPS) is a crucial part of the Ambient-Assisted Living (AAL) concept. The use of Wi-Fi fingerprinting techniques to determine the location of the user, based on the Received Signal Strength Indication (RSSI) mapping, avoids the need to deploy a dedicated positioning infrastructure but comes with its own issues. Heterogeneity of devices and RSSI variability in space and time due to changing environmental conditions pose a challenge to positioning systems based on this technique. The primary purpose of this research is to examine the viability of leveraging other sensors to help the positioning system provide more accurate predictions. In particular, the experiments presented in this work show that Inertial Motion Units (IMU), which are present by default in smart devices such as smartphones or smartwatches, can increase the performance of Indoor Positioning Systems in AAL environments. Furthermore, this paper assesses a set of techniques to predict the future performance of the positioning system based on the training data, as well as complementary strategies such as data scaling and the use of consecutive Wi-Fi scanning to further improve the reliability of the IPS predictions. This research shows that a robust positioning estimation can be derived from such strategies.

1. Introduction

The current increase in the population’s average age [1] leads to new requirements in the healthcare domain, particularly in aspects such as care-giving, home assistance, rehabilitation, early detection of diseases, or physical support [2]. Assistance and healthcare for the elderly are becoming increasingly necessary for social as well as economic reasons. This trend calls for suitable assistance systems that improve the quality of life of older people [3] and help them age actively and productively at an affordable cost [4]. Due to the underlying and often debilitating health conditions associated with old age, aspects of everyday living can become physically and mentally challenging for elderly people. Technology can be integrated into the healthcare of senior citizens to provide safe, high-quality lives, improve their health and happiness, and enable a longer period of independent living. Assistive technical applications should be easy to use, unobtrusive, suitably designed, and adaptable to changing needs and individual preferences.

The Ambient Assisted Living (AAL) concept has been defined as a set of products and services aimed at building intelligent environments to assist these groups of people [5]. One of the goals of AAL is to provide reliable and meaningful information to health professionals, caregivers, psychologists, or family members. AAL applications consist of networks of heterogeneous information appliances and smart artifacts that can assist people with special needs in several areas such as daily task facilitation, mobility assistance, healthcare and rehabilitation, and social inclusion and communication. With the help of Artificial Intelligence (AI), AAL facilitates the use of technology in a nonintrusive way to support safe, high-quality, and independent lives for the frail and elderly. AAL platforms strongly rely on an accurate underlying localization system in order to provide timely and reliable services to elderly users. Knowing their position and actions is vital for medical observation, timely accident prevention, behavioral pattern characterization, or anomaly detection [6].

While the problem of localization in outdoor environments has been solved by the use of satellite positioning systems such as GPS or Galileo, which provide an acceptable level of accuracy and precision, indoor positioning remains an open issue. Satellite signals are largely unavailable inside buildings, since they are attenuated and scattered by roofs, walls, and other building elements and do not reach the user’s device with enough intensity to provide precise positioning services. Researchers and industry are currently involved in the investigation, development, and improvement of reliable Indoor Positioning Systems. Although significant progress has been made, there is still no accurate and widely accepted solution to this problem.

Indoor Positioning Systems (IPS) are systems that locate and track people or objects inside buildings using radio waves, magnetic fields, acoustic signals, images, or other information collected by sensors [7]. A suitable IPS for AAL has to be able to localize the assisted person in an indoor environment, with accuracy and performance high enough to reliably monitor his/her activity and provide meaningful assistance and services. These systems have to be deployed at the user’s living place, in real scenarios where the particular constructive characteristics and peculiarities of the building may affect the way that signals propagate. These scenarios are very different from controlled experiments where environmental conditions are known in advance. The fact that homes are diverse, with different layouts and varied architectural particularities, makes it complex, time-consuming, and expensive to model the propagation of radio-frequency signals each time the positioning system has to be installed. Furthermore, technical proposals for AAL should be easy to use, unobtrusive, and inexpensive, so deployment has to be as simple as possible.

In order to predict the position of an agent in an indoor environment, traditional approaches rely on the construction of reliable models for signal propagation, which is a complex topic. The performance of these methods depends on the correct assumptions about the underlying rules governing the observed signals. If the assumptions are wrong, the model will not describe the observations and will not be able to make solid predictions. On the other hand, Machine Learning (ML) algorithms allow the identification of correlations in datasets without the need for a proper determination of the underlying model. In other words, ML techniques treat the model as a black box for which an explicit characterization is unknown. The training process of the supervised ML algorithms is able to uncover meaningful and characteristic patterns directly from the data and to build effective and predictive models that perform well on unseen data.

Wi-Fi fingerprinting, which uses ML techniques to take advantage of an already deployed infrastructure, is a good choice for such systems [8, 9]. Nevertheless, factors such as channel interference, sensor orientation, or multipath propagation and fading introduce a level of indeterminacy in Wi-Fi sensor readings, impacting negatively on the performance of this positioning method [10]. In scenarios where accuracy at room level is enough to provide relevant services, selecting an adequate classifier algorithm and collecting data appropriately can significantly improve precision. Furthermore, taking into consideration readings from other sensors, such as Inertial Motion Units (IMU), along with Wi-Fi fingerprints, can help to account for changes in the user’s location, providing valuable information that can be utilized to improve the IPS results.

This paper presents the results of a study to assess the impact on the IPS performance of strategies based on the utilization of motion sensor readings to detect user states. This work also presents some preliminary data study techniques that can help to predict the quality of the data recorded by the users, which directly affects the accuracy of the IPS. Furthermore, we also carried out a set of experiments to evaluate the effectiveness of a series of actions aimed at reducing the influence of Wi-Fi signal uncertainty and selecting the most appropriate ML algorithm for the positioning system. In summary, the main contributions of this paper are as follows:

(i) We perform a set of data analysis techniques that help to predict the performance of the positioning system based on the characteristics of the training data recorded by the user

(ii) We assess the performance gain of considering readings from body-worn inertial sensors to recognize room transitions

(iii) We compare the impact of some strategies to increase the performance of the positioning algorithms and to reduce uncertainty in Wi-Fi signals

(iv) We also compare the performance of four machine learning algorithms in room level indoor localization tasks

A preliminary version of this work, entitled “Improving Positioning Accuracy in Ambient Assisted Living Environments. A Multi-Sensor Approach” [11] was presented at the 15th International Conference on Intelligent Environments (IE19). With respect to the preliminary version, we have extended and partially rewritten all sections. Furthermore, we have added a new section dedicated to exploring data characteristics and their relationship with the performance of the classification algorithms used in the experiments.

The organization of this paper is as follows: Section “Background” briefly provides context on Ambient Assisted Living and Indoor Positioning Systems. Section “System Overview” presents an overview of the positioning system used to perform the experiments. Section “Data” presents a study of the data and discusses the outcomes. Section “Experiments Description” describes the experiments performed and section “Results” discusses their results. Finally, section “Conclusions and Future Work” underlines the main conclusions of this work and explores possible future lines of research.

2. Background

General AI and Machine Learning (ML) based systems are being developed and used in areas such as context-awareness, agent-based technologies or computer vision, to provide more intelligent, flexible, natural and supportive services for healthcare. Some examples of how services based on AI and ML techniques can be used in healthcare services are:

•Human Activity Recognition (HAR): systems can combine data from multiple sensors to recognize user’s activities and identify behavioral patterns. The performance of daily activities can be used as a measure of the cognitive and physical condition of the elderly [12–14].

•Anomaly Detection: anomaly detection techniques can expose declining health conditions. Changes and anomalies in the user behavior can be of use in chronic diseases monitoring [15] and early depression detection [16] and can denote elder-specific illnesses such as cognitive decline, Alzheimer’s disease, dementia or functional impairment [17, 18].

•Decision Support: decision support systems assemble different types of data from multiple patients and help doctors and healthcare professionals to organize their work, to analyze people’s personal needs, or to survey some common phenomenon.

Some of the key aspects of deploying an IPS for Ambient Assisted Living are related to choosing the right positioning technology while implementing the system in a passive, device-free, and unobtrusive way. This objective might require the use of an existing infrastructure, the deployment of a new one, the use of the so-called signals-of-opportunity, or even a combination of some of these techniques. Many of these techniques take advantage of the radio-frequency signals emitted by devices, whose position can be known or not, to estimate the user’s position from the perceived strength of these signals, the Received Signal Strength Indication (RSSI). RSSI is used to measure the relative quality of a received signal to a client device. The value read by a device is given on a logarithmic scale and can correspond to an instant reading or a mean of some consecutive readings, but each chipset manufacturer is free to define their own scale for this term. There are many kinds of devices and technologies that can be used for positioning purposes, such as Wi-Fi access points, Bluetooth beacons, Radio-Frequency Identification (RFID), or Ultra-Wide Band (UWB) devices [19]. The effectiveness of these techniques can be improved by leveraging the use of other sensors that are commonly present in wearable devices.

Beyond Wi-Fi or Bluetooth signals, the use of the Earth’s magnetic field to build an Indoor Positioning System has been explored in recent years in several research works. Man-made constructions cause disturbances that alter the magnetic field. These magnetic anomalies are location specific and temporally stable and can be leveraged to build an indoor positioning framework. In many works, this approach is combined with other sensors to enhance its performance. For example, in [20], the authors use the magnetic field along with opportunistic Wi-Fi signals to achieve a 90-percentile accuracy of 3.5 m for localization. The use of different deep learning architectures, such as deep neural networks (DNN) or convolutional neural networks (CNN), has also been shown to achieve good localization accuracy [21–23].

Inertial sensor data have previously been used to solve diverse problems related to localization. In [24], the authors present a pedestrian dead reckoning tracking system that relies on two modules: a step counter and a stride length estimator. Although the reported results are good, their solution relies on a homogeneous walking pattern, which cannot be assumed in some small indoor scenarios such as homes. Combining radio-frequency signals with other sensors’ data has been implemented in several previous works. For example, Xie et al. [25] achieve good accuracy in large indoor buildings using magnetic field fingerprinting together with an augmented particle filter. The use of particle filters to fuse data from various sensors has been a common practice for indoor localization systems [26, 27], generally providing good results. In [28], the authors use magnetic field readings along with Wi-Fi to create a spatiotemporal signal fusion graph to identify crowd-flows in large indoor scenarios such as malls or airports. This technique can be applied to applications or services like advertisement and recommendation or urban-flow monitoring systems.

Techniques used for indoor location can be divided into three general categories: proximity, triangulation, and fingerprinting [29]:

(i) Proximity methods compare the RSSI value from different transmitters and determine the position of the client assuming that the received signal with the highest value is from the closest access point. The accuracy is generally low and relates to the density of deployed beacons and their signal range.

(ii) Triangulation methods use the geometric properties of triangles to determine the target location. When the position of at least three transmitters is known, the position of the mobile node can be estimated by calculating its distance to each device. The difficulty lies in finding the right model for transforming RSSI to distance. Triangulation methods can be divided into two groups: lateration techniques such as Time-of-Arrival (ToA), Time-Difference-of-Arrival (TDoA), or Round-Trip-Travel-Time (RTTT), based on the measurement of the propagation time, and angulation techniques, such as Angle-of-Arrival (AoA), based on the angle of the arriving wave. These technologies are not available to inexpensive positioning infrastructures due to the need for antenna arrays or time synchronization [30].

(iii) Fingerprinting methods assume that, for a given indoor environment, a signal mapping exists and that such a map can be reconstructed by measuring the RSSI signal at discrete locations of the mapped area. In the case of Wi-Fi fingerprinting, its main advantage lies in the fact that there already is an existing Wi-Fi infrastructure in the majority of urban areas. Therefore, the location of the user can be obtained without deploying any additional equipment. Obstacles, reflections, multipath interference, environmental changes, or device orientation are factors that affect the signal propagation [31] and can degrade the performance of IPS based on Wi-Fi fingerprinting.
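To make the RSSI-to-distance step of the triangulation methods more concrete, the following minimal Python sketch inverts the log-distance path-loss model, RSSI(d) = RSSI(d0) − 10·n·log10(d/d0), and also shows the proximity rule of picking the strongest transmitter. The choice of this propagation model, the function names, and all parameter values (reference RSSI, path-loss exponent n, access point names) are assumptions made for illustration; they are not taken from the system described in this paper and would need calibration in any real environment.

```python
def rssi_to_distance(rssi_dbm, rssi_at_ref_dbm=-40.0, path_loss_exponent=3.0,
                     ref_distance_m=1.0):
    """Invert the log-distance path-loss model to estimate distance in meters.

    All default parameter values are illustrative assumptions and would have
    to be calibrated for each environment and device.
    """
    exponent = (rssi_at_ref_dbm - rssi_dbm) / (10.0 * path_loss_exponent)
    return ref_distance_m * 10.0 ** exponent


def closest_transmitter(readings):
    """Proximity method: return the transmitter with the strongest RSSI."""
    return max(readings, key=readings.get)


# Hypothetical scan: transmitter name -> RSSI in dBm.
readings = {"ap_kitchen": -55.0, "ap_bedroom": -70.0}
print(closest_transmitter(readings))
print({ap: round(rssi_to_distance(rssi), 1) for ap, rssi in readings.items()})
```

Small changes in the assumed path-loss exponent change the estimated distance considerably, which illustrates why finding the right RSSI-to-distance model is the hard part of these methods.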

Mapping fingerprinting assumes that an RSSI map exists, and it is constructed by measuring the RSSI at some locations of interest. The radio map, or fingerprinting dataset, is composed of a set of collected fingerprints and the associated positions where the measurements were taken and may contain some additional variables, such as the type of device used or a timestamp of the observation, along with any other useful data. This stage in which the data is acquired to construct the radio map is known as training, offline, or survey phase. During operation, once the radio map is completed, the IPS will use this map as a database for location purposes [32]. This stage is known as the online phase (see Figure 1).

Determining the location of a receiver device at room level based on the RSSI mapping can be seen as a classification problem, where the classes are the mapped rooms and the features are the RSSI signals. However, there are some issues that make it difficult to achieve good classification performance in IPS. The main problems are caused by the heterogeneity of devices and RSSI variability in space and time due to changing environmental conditions. Regarding the former, since RSSI is a nonstandardized indication of the power level received by a wireless device, any device manufacturer may implement its measurement in a different way. Therefore, the RSSI value can vary depending on the hardware, software driver libraries, operating system, or software monitoring implementation. With respect to the latter, RSSI is sensitive to dynamic environmental conditions such as channel noise, interference, reflection, and attenuation. This can degrade the performance of the IPS when circumstances change from the offline to the online phase. There have been some proposals to tackle different aspects of the heterogeneity problems. For example, in [33], the authors find that the relation in the order of RSS values from different APs at a fixed location is more stable than the values themselves and propose the use of an algorithm that uses this relation to construct a more stable fingerprint. Other works [34] propose hybrid systems based on the use of pyroelectric infrared sensors to process sets of zone-based fingerprints with the goal of excluding outliers due to device diversity or shadowing effects. Other authors [35] disregard the traditional approach for fingerprinting and propose a system that exploits the Wi-Fi access points’ coverage area uniqueness and the coverage area overlap to calculate the user’s current position while mitigating the impact of using heterogeneous devices.
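The classification view of room-level positioning can be illustrated with a minimal Python sketch. It stores the radio map as a list of (fingerprint, room) pairs and predicts the room of a new fingerprint with a 1-nearest-neighbor rule. The nearest-neighbor classifier, the fill value of −100 dBm for access points missing from a scan, and all names and readings are assumptions chosen for illustration; they are not necessarily among the algorithms or settings evaluated in this paper.

```python
import math

MISSING_RSSI = -100.0  # assumed fill value for access points absent from a scan


def to_vector(fingerprint, ap_order):
    """Turn a {access point: RSSI} dict into a fixed-length feature vector."""
    return [fingerprint.get(ap, MISSING_RSSI) for ap in ap_order]


def predict_room(radio_map, fingerprint):
    """Return the room of the radio-map entry closest to the fingerprint.

    radio_map is a list of (fingerprint dict, room label) pairs collected in
    the offline phase. Access points never seen in the radio map are ignored.
    """
    ap_order = sorted({ap for fp, _ in radio_map for ap in fp})
    query = to_vector(fingerprint, ap_order)
    best_room, best_dist = None, math.inf
    for fp, room in radio_map:
        dist = math.dist(query, to_vector(fp, ap_order))
        if dist < best_dist:
            best_room, best_dist = room, dist
    return best_room


# Hypothetical radio map with two access points and two rooms.
radio_map = [
    ({"ap_a": -40.0, "ap_b": -80.0}, "kitchen"),
    ({"ap_a": -85.0, "ap_b": -45.0}, "bedroom"),
]
print(predict_room(radio_map, {"ap_a": -42.0, "ap_b": -78.0}))
```

Because the features are raw RSSI values, a device that reports systematically different values than the one used in the offline phase would shift every query vector, which is one way the heterogeneity problem described above shows up in practice.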

Furthermore, collecting and maintaining a radio fingerprint database is a costly and time-consuming task. This cost can be reduced considerably in household environments when room level positioning is enough to provide most AAL services. In those cases, the training stage has to be performed at least once in each room. In a typical house with 6 rooms, this process can take 10–20 minutes.

The ubiquity of smartphones and smartwatches and the availability of different wireless interfaces, such as Wi-Fi, 3G, and Bluetooth, make them an attractive platform for indoor monitoring. Smart home-based behavioral data have already been found to be useful in assisting older adults to live independently and to monitor health state and the onset and progress of age-related diseases and disorders such as dementia and Alzheimer’s disease [36]. Psychological health in older adults (loneliness, depression, or emotional states) has been assessed by means of such data too [37]. Nevertheless, the level of technology readiness for home health monitoring technologies is still low [38].

When choosing a particular device to implement an IPS in AAL, one of the most important factors to consider is that it has to be as unobtrusive as possible and must not modify, disturb, limit, or interfere with the user’s daily activities or lifestyle. Most Wi-Fi fingerprinting location systems are based on the use of a smartphone. Nevertheless, tracking the user location requires the device to be permanently attached to the user, which may not be realistic in daily life at home. For instance, leaving the device on the bedside table would lead the IPS to assume that the user is still in bed.

Smartwatches can be seen as an extension of a smartphone which looks like a common watch. A smartwatch is always attached to the user, so it is less likely to be forgotten on top of the bedside table than a smartphone. In addition, it is a nonobtrusive, relatively cheap, and easy-to-use tool, which can also provide direct communication between the user and caregivers, nurses, or general practitioners.

As most smartphones do, smartwatches also embody several sensors such as accelerometer, gyroscope, ambient light intensity, and compass. On the connectivity layer, most of them also embody Bluetooth, NFC, and Wi-Fi communications, which allows the use of Wi-Fi fingerprinting technology as a suitable positioning candidate to be deployed in such devices. Moreover, most of these devices also include a GPS chip. This sensor can be used along with an IPS to provide the location of the user both outdoors and indoors.

3. System Overview

The Indoor Positioning System designed to perform these experiments is part of the research project “Senior Monitoring” [39], which is aimed at providing solutions for monitoring elderly people’s behavior and detecting short-term issues (falls) as well as long-term issues (cognitive decay). The IPS consists of a smartwatch, which is worn by the user who is being monitored, and a paired smartphone, which is used to configure and control the smartwatch behavior and to communicate with a central cloud server (see Figure 2). The server stores the sensory data gathered through the smartwatch and offers assistance to provide decision support services by performing analysis tasks such as indoor positioning, activity recognition, or anomaly detection.

3.1. Hardware

The IPS described in this paper requires the use of a smartwatch attached to the user’s wrist and a smartphone that communicates with the former through a user-friendly application in the following way.

•Smartwatch: The wearable device used is the model SmartWatch 3, manufactured by Sony. This device runs Android Wear as its operating system and embodies a Wi-Fi chip along with GPS, accelerometer, compass, gyroscope, and ambient light sensors. Connectivity is supported through Wi-Fi, NFC, and Bluetooth. The resolution of its 1.8″ screen is 320 × 320 pixels. This device runs an application that can be set up to continuously scan for signals from any nearby Wireless Access Points (WAPs), as well as to record readings from some other sensors. This application is controlled via a paired Android smartphone that runs its corresponding version of the application.

•Smartphone: The smartwatch is paired with a smartphone that controls its behavior. The smartphone is used to configure some sensor options such as scan intervals, number of consecutive scans, and sensor activation. Both devices communicate through Bluetooth. All the smartwatch readings are sent to a central server through the smartphone. In case the devices are not in range, the smartwatch buffers the data to be sent when a network connection becomes available.

3.2. Software

The Android application that runs in the smartwatch is in charge of collecting the sensor data. The configuration and behavior of this application are controlled by means of its reciprocal application installed in the paired smartphone. Figure 3 shows the main screen of the smartphone application. This version of the software is used for research purposes, so it shows some information relevant only to this purpose, along with general information that is useful for end users. The smartphone is the interface through which the elderly user can perform tasks such as checking the smartwatch status, viewing his/her level of physical activity, observing the readings and status of active sensors, or responding to notifications delivered by his/her caregivers, health professionals, psychologists, or automatic healthcare services provided by the analysis system.

The smartphone sends the sensory data to a cloud server using the MQTT protocol. The server stores the data for posterior analysis using Elasticsearch as a NoSQL database. The data provided through user interaction, such as login data or interaction through notifications, is sent to a REST cloud server and stored in the same database.

3.3. Sensors

The goal of the structure described so far is to collect meaningful sensory data to build systems able to provide reliable AAL services to its users. To this end, the software previously outlined has been designed to make use of the following sensors:

3.3.1. Wi-Fi

This sensor constitutes the base of the positioning system. The smartwatch performs a given number of consecutive Wi-Fi scans every minute. The default number of scans is 5, but this setting can be modified through the smartphone app. The procedure is described as follows:

(1) The app sends a startScan command to the Wi-Fi module to scan for nearby AP signals.

(2) The Wi-Fi module performs a scan and stores its results in the cache. A notification is sent to the operating system.

(3) The operating system notifies the app when a scan is completed. The app then sends a getScanResults command to request the scanning results stored in the cache.

When a scan is performed, the Wi-Fi module updates some data in the cache while keeping some intact. Some WAPs may be present in the scan results although they have not been detected in the most recent scan. The details of the cache updating algorithm are unknown, but outdated data may persist during some scans. Moreover, in highly crowded WAP environments, channel interference is very likely. This means that some WAP signals, especially those whose RSSI is low, may appear and disappear stochastically. Other circumstances such as heating and ventilation have their own impact on the radio signals. Because of these conditions, scan results are likely to include readings attributed to the wrong location or time, introducing errors in data analysis. To minimize the impact of this behavior and lessen stochasticity, the application completes a default number of 5 consecutive scans, each one taking approximately one second to complete. For the smartwatch model used in the experiments, these settings allowed for around 15 hours of battery life, long enough to collect data during the daytime and recharge the device during the night.
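One possible way to exploit the consecutive scans is to merge them into a single, more stable fingerprint. The following Python sketch takes the median RSSI of each access point across the scans and discards access points that appear in fewer than half of them. The text above does not specify how the scans are combined, so the median, the presence threshold, and all names and values here are assumptions made purely for illustration.

```python
from statistics import median


def aggregate_scans(scans, min_presence=0.5):
    """Merge consecutive scans into one fingerprint.

    scans is a list of {access point: RSSI} dicts. An access point is kept
    only if it appears in at least min_presence of the scans (an assumed
    threshold); its value is the median of the observed RSSI readings.
    """
    observed = {}
    for scan in scans:
        for ap, rssi in scan.items():
            observed.setdefault(ap, []).append(rssi)
    total = len(scans)
    return {
        ap: median(values)
        for ap, values in observed.items()
        if len(values) / total >= min_presence
    }


# Hypothetical burst of 5 scans; ap_b and ap_c are weak and flicker in and out.
scans = [
    {"ap_a": -50, "ap_b": -90},
    {"ap_a": -52},
    {"ap_a": -48},
    {"ap_a": -51, "ap_c": -88},
    {"ap_a": -49},
]
print(aggregate_scans(scans))
```

The median is used here instead of the mean because a single stale cache entry then has little influence on the merged value.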

3.3.2. Significant Motion Sensor

The user’s physical activity can be determined with the use of inertial sensors such as the accelerometer and gyroscope. Both sensors are capable of measuring human motion and estimating body position, which allows determining the physical activity the user is performing, such as walking, running, or sitting [40]. The main drawback of using these sensors in a smartwatch is their energy cost. Continuously monitoring inertial readings keeps the system from going into low power/sleep mode and drastically reduces the battery life to less than a whole day, which is a minimum requirement for monitoring applications.

An alternative is the use of the Significant Motion Sensor (SMS), a virtual sensor that uses the physical accelerometer but is triggered only when it detects a motion that might lead to a change in the user’s location. Thus, although this sensor does not allow determining the activity the user is performing, it provides a way to detect a possible change in his/her location. Conversely, if the SMS has not been triggered during an interval of time, it may be assumed that the user has not changed his/her location during that period.
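The assumption that the user stays in place while the SMS is silent can be turned into a simple post-processing rule, sketched below in Python: a new room prediction is accepted only if significant motion was reported since the previous prediction; otherwise the previous room is kept. This is a minimal sketch of one possible use of the sensor, with hypothetical function and variable names, and should not be read as the exact strategy evaluated in the experiments.

```python
def smooth_with_motion(predictions, motion_detected):
    """Hold the previous room while no significant motion is reported.

    predictions: room predicted by the Wi-Fi classifier at each time step.
    motion_detected: for each time step, True if the significant motion
    sensor was triggered since the previous step.
    """
    smoothed = []
    for room, moved in zip(predictions, motion_detected):
        if smoothed and not moved:
            # No motion since the last step: assume the user did not move.
            smoothed.append(smoothed[-1])
        else:
            smoothed.append(room)
    return smoothed


# Hypothetical per-minute classifier output with two spurious room changes.
predictions = ["kitchen", "bedroom", "kitchen", "living", "living"]
motion = [True, False, False, True, False]
print(smooth_with_motion(predictions, motion))
```

A limitation of this rule is that an early misclassification persists until the next motion event, so in practice it would likely be combined with some form of voting over the readings collected while the user is still.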

3.3.3. Step Counter

This sensor detects the number of steps taken by the user since the last time the sensor was activated. The application automatically resets the counter every day at midnight. Similarly to the SMS, the step counter could help to detect intervals during which the user is not walking.

3.3.4. Activity Recognition

In order to automatically monitor the user’s activity, at least one inertial sensor, preferably the accelerometer, has to be continuously monitored and its data analyzed in search of patterns that characterize the activities of interest. This would cause a considerable battery drain, seriously compromising the device’s usefulness. To remedy this situation, the Android API allows registering for activity recognition updates. To keep the power usage to a minimum, the activity detection is done by periodically waking up the smartwatch and reading short bursts of motion sensor data. It can detect if the user is currently on foot, in a vehicle, on a bicycle, or still, but the accuracy of the prediction depends on the update interval. Larger interval values will result in fewer activity detections, while smaller values will result in more frequent activity detections but will consume more power. Each detection result contains a list of activities sorted by their probability, which indicates how likely each activity is.

To prevent excessive battery use, the activity reporting service may stop when the device is motionless for an extended period of time. Once the device moves again, which is detected through the SMS, the service will resume.

3.3.5. Magnetic Field

Geomagnetic fingerprinting (GF) is a technique that maps disturbances of the Earth’s magnetic field caused by the metal construction of buildings and uses this data to achieve indoor localization through pattern matching [41].

The 3D magnetometer of the smartwatch measures the magnetic field in its own coordinate system. As the smartwatch may be oriented arbitrarily on the user’s wrist, the measurements have to be transformed into the coordinate system of the indoor plan, which can be done with the aid of inertial sensors such as the accelerometer and the gyroscope. An alternative to this transformation is to use only the magnitude (modulus) of the measured field vector, thus eliminating the need for other sensor readings but compromising the quality of the localization.
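The reason why the magnitude can be used without any coordinate transformation is that it does not change when the device is rotated. The short Python sketch below computes the magnitude of a 3D reading and checks that it is preserved under a rotation about the vertical axis. The function names and the sample reading are hypothetical and serve only as an illustration.

```python
import math


def field_magnitude(bx, by, bz):
    """Magnitude of a 3D magnetometer reading (same unit as the inputs)."""
    return math.sqrt(bx * bx + by * by + bz * bz)


def rotate_about_z(vec, angle_rad):
    """Rotate a 3D vector about the z axis, mimicking a turn of the wrist."""
    x, y, z = vec
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return (x * cos_a - y * sin_a, x * sin_a + y * cos_a, z)


reading = (3.0, 4.0, 12.0)  # hypothetical reading in device coordinates
rotated = rotate_about_z(reading, math.radians(75))

# The individual components change with orientation, the magnitude does not.
print(field_magnitude(*reading), field_magnitude(*rotated))
```

The price of this invariance is that three components are collapsed into one scalar, which is why the localization quality is lower than with a properly transformed field vector.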

Geomagnetic fingerprinting can be integrated with some other positioning technology in a sensor fusion system to improve localization. For instance, Wi-Fi fingerprinting can be used to determine the location at room level and GF to estimate the most likely position within the room.

The smartwatch scans the magnetic field continuously and sends the collected data to the server every minute.

4. Data

When the system is deployed in a home, users manually create the radio map while wearing the smartwatch and following the indications of the smartphone application. The users first select a set of rooms and then the software guides them to collect training data in certain points of the selected rooms, such as the center or any commonly used location. When this process finishes, the collected training data is sent to the server. During the system’s normal operation, the data acquired by the device sensors are sent every minute to the paired smartphone, which in turn dispatches it to the server to be stored and analyzed.

The data used to perform these experiments were collected by four users, two males and two females, at their homes for two months. During this period, the users manually reported many intervals of time at which they were in a particular room performing activities of their daily living. This labeled information constitutes the test data used to assess the accuracy of the predictions. Table 1 shows some of the characteristics of each dataset, such as the number of data points for each room, the total number of access points detected, or the number of rooms that were selected by each user.

4.1. Data Exploration

Since data has been labeled at room level and given that usually rooms are separated by walls that attenuate the perceived intensity of the Wi-Fi signal, the feature space of the data should reflect this; that is, we should be able to find a way to separate the RSSI data into a number of clusters equal to the number of labels. Each one of these clusters is formed by signals that have high similarity among them but are dissimilar to signals in other clusters. Therefore, we can have a measure of the predictive quality of the data by finding these clusters and comparing them to the actual labels. The more similar the clusters are to the labels, the more feasible it will be for a machine learning algorithm to find these discriminative patterns between classes and achieve a good classification accuracy.

The well-known k-means clustering algorithm groups data into a given number of clusters by computing the Euclidean distance between each data instance and the cluster means and assigning each observation to the cluster with the nearest mean. The algorithm iteratively minimizes the within-cluster squared Euclidean distances until the solution converges (that is, the assignments do not change from the previous iteration) or until the maximum number of iterations has been reached.
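The iteration described above can be sketched in a few lines of plain Python. This is only an illustrative version of the standard Lloyd procedure, not the implementation used in the experiments; the function names, the random initialization, and the toy RSSI vectors are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative k-means (Lloyd's algorithm). Returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct samples chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the nearest mean.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no change from the previous iteration
        assignments = new_assignments
        # Update step: recompute each mean from its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Toy fingerprints (RSSI in dBm for two hypothetical WAPs), two "rooms".
fingerprints = [(-40, -80), (-42, -78), (-41, -79),
                (-85, -45), (-83, -47), (-84, -46)]
centroids, clusters = kmeans(fingerprints, k=2)
```

With real fingerprints, each vector would hold one RSSI value per detected WAP, and k would be set to the number of labeled rooms.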

To find out whether the training data are well segmented and whether we can expect good classification accuracy, we apply the k-means algorithm to the data from each user and then compare the obtained clusters with the actual labels. Figure 4 shows four heatmaps with the results, with darker colors representing higher room-cluster correlation and values denoting percentages. A perfect correlation would show 100% on each diagonal value, that is, each cluster being composed only of data from the correct label (room).
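As a rough illustration of how such a room-cluster table can be obtained, the sketch below computes, for each cluster, the percentage of its samples that come from each room. The per-cluster normalization and the function name are assumptions; the exact computation behind Figure 4 may differ.

```python
from collections import Counter


def cluster_composition(rooms, clusters):
    """Return {cluster: {room: percentage of the cluster's samples}}."""
    pair_counts = Counter(zip(clusters, rooms))
    cluster_sizes = Counter(clusters)
    table = {}
    for (cluster, room), n in pair_counts.items():
        table.setdefault(cluster, {})[room] = 100.0 * n / cluster_sizes[cluster]
    return table


# Hypothetical labels and cluster ids for five fingerprints.
rooms = ["kitchen", "kitchen", "bedroom", "bedroom", "bedroom"]
clusters = [0, 0, 0, 1, 1]
table = cluster_composition(rooms, clusters)
```

In this toy case, cluster 1 contains only bedroom samples (100%), while cluster 0 mixes two rooms, which is the kind of confusion the heatmaps make visible.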

Since users usually did not spend the same time in all rooms, the training and test datasets may be imbalanced. Therefore, we adopt the f1 score as the classification performance metric, since it is more robust than accuracy on imbalanced datasets:

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

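When more than two classes are considered, we report the weighted average of the individual f1-scores of all classes as the evaluation metric for each model:

$$F_1^{\text{weighted}} = \frac{\sum_{i=1}^{c} w_i \, F_{1,i}}{\sum_{i=1}^{c} w_i}$$

where c is the number of classes, F_{1,i} is the f1-score of the ith class, and w_i is the weight (the number of instances) of the ith class.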
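The weighted f1-score can be computed directly from the true and predicted labels. The following sketch uses only the standard library; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average of per-class f1-scores, weighted by the number of true instances."""
    support = Counter(y_true)
    total = 0.0
    for label, weight in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = weight - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        total += weight * f1
    return total / len(y_true)


# Toy example: class "a" has f1 = 0.8 and class "b" has f1 = 2/3.
score = weighted_f1(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

Here the result is (3 · 0.8 + 1 · 2/3) / 4 = 23/30, so the majority class contributes three times as much as the minority class.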

The calculated metrics for each user, shown in Table 2, can be seen as a predictor of the quality of data. Higher values will indicate well defined boundaries between classes, revealing potentially useful hidden predictive information that will make it easier for a classifier to assign the correct room to a new instance of data. From these results, it is clear that the k-means clustering algorithm was able to find a better partitioning of the data space for users 2 and 3 than for users 1 and 4. Therefore, we may expect better classification results for these users.

Users 2 and 3 are also the users with a higher number of Wi-Fi access points detected, which could partially explain the results obtained. Since the sensor used to record the data is the same for all users, and leaving aside unknown factors such as the layout of each house, it is plausible that the number of WAPs influences the ability of the k-means algorithm to differentiate between classes.

A visual representation can make it easier to detect meaningful patterns and outliers in groups of data. To be able to find a structure in data in a way that can be visualized, we need to reduce its dimensionality while trying to keep most of the knowledge. There are many techniques available to automatically reduce the complexity of high-dimensional data. Some of these techniques are as follows.
(i) Principal Components Analysis (PCA) is an unsupervised technique that finds the components that hold most of the variance (information) of the data. Each component has both direction and magnitude. The direction represents across which principal axes the data has most variance, and the magnitude expresses the amount of variance that is captured of the data when projected onto that axis. Each subsequent principal component is orthogonal to the previous and has less variance. The final result is a set of uncorrelated principal components.
(ii) Linear Discriminant Analysis (LDA) identifies a suitable low-dimensional representation of original data by finding not only the component axes that maximize the variance of the data (PCA) but also the axes that maximize the separation between multiple classes, thus maintaining the class-discriminatory information. LDA is a supervised technique since it needs label information to determine a suitable feature space in order to distinguish between patterns that belong to different classes.
(iii) t-Distributed Stochastic Neighbor Embedding (t-SNE) is an unsupervised, nonlinear technique primarily used for exploration and visualization of high-dimensional data. It differs from PCA by preserving only small pairwise distances or local similarities whereas PCA preserves large pairwise distances to maximize variance. The algorithm calculates a similarity measure between pairs of instances in the high-dimensional space and the low-dimensional space and tries to minimize the difference between these two similarity measures using gradient descent and the Kullback–Leibler divergence (KL) as the cost function.

Figure 5 shows the visualization obtained for each user and algorithm. PCA does not seem to reveal any clear pattern for any user. For users 2 and 3, there are some rooms that seem to be well segmented, but there is still some confusion with the remaining groups. LDA finds a good separation between classes for dataset 1 and, especially, for datasets 2 and 3, whereas for user 4 the representation looks more cluttered. Finally, the t-SNE algorithm shows some structure for datasets 2 and 3, where a clear separation between some classes is visible. On the other hand, the plots corresponding to data from users 1 and 4 look more chaotic.

In order to get a numeric evaluation of these figures, we use the Silhouette metric [42]. The silhouette analysis can be used to study the separation distance between the resulting clusters, as a measure of the quality of clustering achieved. This value measures the space between clusters with a value in the range −1 to 1. If cluster cohesion is good and cluster separation is good, the value will be close to 1. On the other hand, if samples have been assigned to the wrong clusters, the score will be near to −1. Figure 6 shows the values obtained for this metric for each one of the algorithms used for visualization.
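For reference, the mean silhouette coefficient can be sketched as follows. For each sample, a is the mean distance to the other samples of its own cluster and b is the smallest mean distance to the samples of any other cluster; the silhouette is (b − a) / max(a, b). This is a plain-Python illustration with Euclidean distance, and the function name and toy points are assumptions (it requires Python 3.8 or later for `math.dist`).

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances from p to every other sample, grouped by cluster label.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        own = by_cluster.pop(label, [])
        if not own or not by_cluster:
            scores.append(0.0)  # convention for singleton or single-cluster cases
            continue
        a = sum(own) / len(own)  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # samples grouped with far neighbors
```

The first labeling yields a value close to 1, whereas the second yields a negative value, matching the interpretation given above.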

The conclusions that arise from these visualization and silhouette plots are consistent with the results obtained with the k-means clustering. We can expect machine learning classifiers to have more difficulty finding discriminative patterns for those datasets on which clustering and visualization techniques have not been able to find significant differentiation among groups of instances belonging to distinct rooms. In particular, silhouette values predict a better classification accuracy for users 2 and 3 with respect to users 1 and 4, for which the algorithm could not find clear boundaries between clusters.

5. Experiments’ Description

The goal of the experiments is to evaluate the influence of a set of parameters on the accuracy of the positioning system and to assess the impact of considering the lack of motion as a constraint for its predictions. Each experiment consists of the evaluation of the classification metric for a particular dataset and a given set of parameter values. The parameters that determine each experiment are the following.

•Classifier: A total of four classification algorithms have been tested: Decision Tree (DT), k-Nearest Neighbors (kNN), Neural Net (NN), and Random Forest (RF). The best parameters for each classifier were determined through a series of tests before the experiments:
(i) DT: max. depth = 20
(ii) kNN: k = 3, distance = Euclidean
(iii) NN: 5 hidden layers, units = 50, activation = ReLU
(iv) RF: max. depth = 20, max. nodes = 50

•Scaling: The RSSI values from the Wi-Fi fingerprints are usually in the range (−100, −30). One common strategy [43] to ease the work of classification algorithms and increase their performance is to scale those values into the range (0, 1), where 0 means that the WAP is not present in the fingerprint and 1 represents the maximum value for an RSSI (see equation (1)). We compare the performance of this strategy against feeding the classifiers with the raw, unprocessed data.
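A minimal sketch of this kind of scaling is given below, assuming a simple min-max mapping over the range (−100, −30) with undetected WAPs mapped to 0. The exact form of the authors' equation may differ; the constants, the use of `None` for a missing WAP, and the function name are assumptions made for illustration.

```python
RSSI_MIN = -100.0  # assumed weakest RSSI (dBm)
RSSI_MAX = -30.0   # assumed strongest RSSI (dBm)


def scale_rssi(rssi):
    """Map an RSSI in dBm to [0, 1]; None (WAP not detected) maps to 0.0."""
    if rssi is None:
        return 0.0
    # Clip out-of-range readings before applying min-max scaling.
    clipped = min(max(rssi, RSSI_MIN), RSSI_MAX)
    return (clipped - RSSI_MIN) / (RSSI_MAX - RSSI_MIN)


def scale_fingerprint(fingerprint):
    """Scale every RSSI value of a fingerprint (one entry per known WAP)."""
    return [scale_rssi(value) for value in fingerprint]


scaled = scale_fingerprint([-30, -65, None, -100])
```

Under these assumptions, a reading of −65 dBm maps to 0.5 and the strongest reading maps to 1.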

•Reducing Stochasticity: As stated in Section 3, the IPS described in this work performs a number of consecutive scans to minimize the impact of uncertainty in the RSSI values of Wi-Fi fingerprints. These scan instances, five by default, are passed to the classifiers and the predictions are determined by a majority vote. We assess the performance impact of this strategy against classifying only the first scan instance.
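The majority vote over consecutive scans can be sketched as follows. The `classify` argument stands for any trained classifier wrapped as a function from a fingerprint to a room label; it and the function names are hypothetical and used only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Most frequent label; ties are resolved in favor of the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def predict_room(scans, classify):
    """Classify each consecutive scan and return the majority prediction."""
    return majority_vote([classify(scan) for scan in scans])


# Toy stand-in for a classifier: picks a room from the strongest WAP index.
def toy_classify(scan):
    rooms = ["kitchen", "living room"]
    return rooms[scan.index(max(scan))]


scans = [(-40, -80), (-45, -70), (-75, -60), (-42, -79), (-50, -66)]
room = predict_room(scans, toy_classify)
```

In this toy case, four of the five scans are classified as the kitchen, so the single outlier scan does not affect the final prediction.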

•Minimum Interval without Significant Motion (MISM): To improve indoor localization accuracy, especially in room-level applications, motion sensors can play an important role, since the detection of steps or any significant motion could be used to discover transitions from one room to another. If a body-worn sensor does not register a significant motion in a given period of time, it can be supposed that the user has not changed his/her location. In this scenario, all fingerprints received during this interval must correspond to the same room. Knowing this, the most reasonable procedure may seem to be to take the locations estimated by the IPS for that period and assume that all of them correspond to the location that occurs most frequently.

As an example, let us consider that the user gets into the living room and sits on the sofa. Since the user was moving, we have a signal from the SMS. Now she stays on the sofa for 30 minutes and then goes to the bathroom. When she gets up from the sofa, we receive another signal from the SMS. We know that she was in the same room for 30 minutes, but we do not know which room it is. During this time, the Wi-Fi sensor has been sending signals every minute, so we have 30 fingerprints. Let us suppose now that inferring the position of the user from the Wi-Fi signals gives us these results: 22 in the living room, 6 in the kitchen, and 2 in the bathroom. Is it safe to assume, given that we know she did not move, that she was in the living room?

The MISM parameter represents the minimum period to consider when recognizing intervals during which the user has not made a significant movement and, therefore, is supposed to be in the same room. We considered 20 different intervals, between 10 and 200 minutes, in steps of 10 minutes.

•Prediction Threshold (PT): During the interval of time in which the user stays in a particular room, the IPS generates a series of predictions, specifically, one per minute. Due to the particularities of Wi-Fi signals, environment changes, or user orientation, the predictions produced by the classification algorithm for this period may not be uniform and may contain different predicted rooms. If we assume that the user has not changed his/her position, the best policy to determine the actual position may be to select the most commonly occurring prediction. In this experiment, we evaluate the performance gain of following this strategy, in relation to the ratio of the most frequent prediction over the total number of predictions. To this end, we considered 50 values for this ratio, in the range (0.50, 1.00) with a step value of 0.01. A ratio value of 0.50 would correspond to a case where the most frequent prediction accounts for 50% of the IPS predictions. A ratio of 1.00 would correspond to a case where all predictions are equal. In this case, there would not be any improvement when considering motion sensors to amend the IPS predictions.
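The interplay of the MISM and PT parameters can be illustrated with the sofa example above. The sketch below replaces all per-minute predictions of a no-motion interval with the most frequent room, provided that the interval lasts at least `mism` minutes and the most frequent room reaches the ratio `pt`. Treating PT as a decision threshold, as well as the function name and default values, are assumptions made for illustration rather than the exact evaluation procedure.

```python
from collections import Counter


def smooth_still_interval(predictions, interval_minutes, mism=10, pt=0.65):
    """Replace the predictions of a no-motion interval with their mode.

    The correction is applied only if the interval lasts at least `mism`
    minutes and the most frequent room reaches the ratio `pt`.
    """
    if not predictions or interval_minutes < mism:
        return list(predictions)
    room, count = Counter(predictions).most_common(1)[0]
    if count / len(predictions) < pt:
        return list(predictions)  # not dominant enough: keep raw predictions
    return [room] * len(predictions)


# Sofa example: 30 minutes without significant motion, one fingerprint per minute.
raw = ["living room"] * 22 + ["kitchen"] * 6 + ["bathroom"] * 2
corrected = smooth_still_interval(raw, interval_minutes=30, mism=10, pt=0.65)
unchanged = smooth_still_interval(raw, interval_minutes=30, mism=10, pt=0.80)
```

Here the ratio of the most frequent room is 22/30 ≈ 0.73, so the predictions are unified for a threshold of 0.65 but left untouched for a threshold of 0.80.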

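This configuration gives a total of 16000 experiments for each dataset. This figure is broken down as follows: 4 classifiers × 2 scaling options (scaled or raw data) × 2 scan strategies (majority vote or first scan only) × 20 MISM values × 50 PT values = 16000.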

To compare the results and determine the best parameter values, we applied the Wilcoxon signed-rank test [44], a paired difference test that evaluates differences in mean ranks, to the f1 metric.
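To make the test concrete, the sketch below computes the signed-rank statistic and a two-sided p-value with the normal approximation, using only the standard library. It is a simplified illustration: zero differences are discarded, tied absolute differences share average ranks, and no tie or continuity correction is applied. In practice, a library routine such as `scipy.stats.wilcoxon` would normally be preferred, especially for small samples. The function name and the toy scores are assumptions.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W, p) for paired samples, using the normal approximation.

    W is the smaller of the positive and negative rank sums.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p


# Hypothetical paired scores (in percent) for two configurations.
scores_a = [90, 88, 85, 92, 87, 91]
scores_b = [89, 86, 82, 88, 82, 85]
w, p = wilcoxon_signed_rank(scores_a, scores_b)
```

In this toy case, every difference favors the first configuration, so W = 0 and the approximate p-value falls below 0.05.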

6. Results

Figure 7 shows a boxplot presenting the results obtained for each classifier on the four datasets. The Random Forest algorithm seems to perform better in all scenarios. To detect significant differences between the performances of the four algorithms and determine the most reliable option, we apply the Wilcoxon signed-rank test as a statistical method for testing the differences among the outcomes. The results of the test, comparing the RF algorithm against each of the remaining classification algorithms for each dataset, are shown in Table 3. In all cases, the Null Hypothesis (H0) of equivalence of means can be rejected (p-value < 0.05). Therefore, the experimental results show an improved performance in room detection when using the RF algorithm.

These results also endorse the results obtained in Section 4.1. The Random Forest classifier obtains better results in the test data for those datasets that showed a significant structure or pattern when using a clustering or visualization technique. In particular, the best results are obtained for users 2 and 3 that show an average f1 of 0.88 and 0.89, respectively. On the other hand, results for users 1 and 4, with an average f1 of 0.83 and 0.76, also confirm the intuition that the correlation between clustering groups and rooms in training data is a good predictor for the performance of the positioning system on test data.

Figure 8 displays a boxplot comparing the classification effectiveness of scaled data versus raw data. Table 4 shows the results of the Wilcoxon signed-rank test used to compare the results. The outcome shows that, for all the datasets, the Null Hypothesis (H0) can be rejected, showing that scaling the data increases the performance of the algorithms.

With regard to diminishing the influence of the Wi-Fi signal stochasticity, Figure 9 displays a boxplot evaluating the use of the majority vote strategy applied to the consecutive samples acquired by the Wi-Fi sensors during each scan process. As shown in Table 5, the Null Hypothesis is rejected for all datasets, indicating that a majority vote strategy significantly increases the accuracy of the positioning algorithm.

The previous tests helped to determine the best positioning algorithm and strategies to improve the positioning accuracy. In order to assess the impact of considering the SMS data to further improve the performance of the IPS, we used scaling and majority vote strategies and RF as the selected classifier. Figure 10 shows the average performance increase obtained depending on the value for the PT parameter, for all possible values of MISM.

The averaged results show an f1 increase of around 3% for PT values in the range of 0.50 to 0.80. Therefore, the results suggest that it is safe to assume the most predicted class as the outcome for the positioning system for a given period of time with no motion detected. Moreover, the performance increase is mostly independent of the MISM parameter.

The results for each particular dataset show much variability in the range of PT between 0.5 and 0.7, where the accuracy cost of making an incorrect prediction is greater. This variability may be caused by the different room layouts of the houses in each dataset. Spaces like open plan kitchen/dining rooms may need additional information, such as the use of Bluetooth beacons or magnetic field sensors, to help the IPS discriminate areas in the same open space. Nevertheless, the maximum cost in accuracy of incorrect predictions is around 1%, and it occurs only for PT values lower than 0.65. Hence, for PT values greater than 0.65, there is a general performance increase in the positioning system when using the SMS as an indicator of room/position changes.

In the process of continuously improving the performance of the positioning system and with the goal of assessing the validity of these findings, we scheduled a new round of data collection four months after the data used for the previous experiments were recorded. These new datasets were recorded by seven elderly users, two females and five males, who performed the training process while following the indications shown by the application. In the same way as with the previous dataset, the process was conducted at their homes, where they used the positioning system during a period that varied between one and two months. Following the conclusions drawn from the previous experiments, we used these new data to validate the method of using the SMS as a landmark to detect possible room changes. The results are shown in Figure 11.

The results for the second stage show a similar pattern to the results shown earlier, but with increased variability in the range of PT between 0.5 and 0.7. In this interval, there is not a general gain in performance, since the majority of the users report a decrease in accuracy. This behavior was already detected for two datasets in the first stage, and now it happens for four out of seven users. As discussed earlier, this drop in performance in this interval of PT values can be expected, since it is risky to assume that the correct room can be predicted with only 50–65% of occurrences in a given period of time. For PT values greater than 0.65, the results validate the proposed approach, since there is a general gain in performance for all users.

7. Conclusions and Future Work

The experiments presented in this paper show an improved accuracy in room detection when using strategies such as data scaling and the use of consecutive Wi-Fi scanning. The results also demonstrate that the use of a Significant Motion Sensor along with the Wi-Fi fingerprints can help to significantly increase the performance of Indoor Positioning Systems.

As future work, more data from a variety of new users are being collected and will be used to validate the conclusions of this work. These data will also make it possible to test new strategies to improve positioning accuracy, such as the step counter and the activity recognition API, and to assess whether magnetic field readings can be used to determine the position of the user within the room.

Data Availability

The data used in this work are available upon request to the authors.


A preliminary version of this work, entitled “Improving Positioning Accuracy in Ambient Assisted Living Environments. A Multi-Sensor Approach” was presented at the 15th International Conference on Intelligent Environments (IE19).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


This work has been partially funded by the Spanish Ministry of Science, Innovation and Universities through the “Retos Investigación” programme (RTI2018-095168-B-C53) and by the Universitat Jaume I “Pla de promoció de la investigació 2017” programme (UJI-B2017-45). Oscar Belmonte-Fernández had a grant from the Spanish Ministry of Science, Innovation and Universities (PRX18/00123) for developing part of this work. The authors also would like to thank Pilar Bayarri Iturralde for her contribution to the organization of the experiments.