Abstract

Indoor localization detection is an important problem with wide applications in wireless Internet of Things (IoT) networks. In recent years, WiFi-based localization using the latest artificial intelligence methods to improve detection accuracy has attracted the attention of many researchers. Granular computing is a newly emerged computing paradigm in artificial intelligence that focuses on structured thinking based on multiple levels of granularity. We therefore introduce granular computing approaches to the task of wireless indoor localization detection and propose a novel heuristic data discretization method based on binary ant colony optimization and rough set (BACORS) for the selection of optimal granularity. BACORS searches for the globally optimal cut point set with binary ant colony optimization to discretize multiple attributes simultaneously, while the accuracy of approximation of classification from rough set theory is used to determine the consistency of the multiattribute data. To validate the effectiveness of BACORS, it is applied to a wireless indoor localization data set, and the experimental results indicate that it has promising performance.

1. Introduction

With the rapid development of wireless indoor positioning systems and techniques in the Internet of Things (IoT), applications of indoor localization detection have become increasingly widespread [1]. To name a few, one can consider detecting the location of criminals in a bounded area, locating vehicles in a large underground garage, and finding products stored in a warehouse [2]. In recent years, indoor localization detection based on WiFi signal strength has received great attention due to its wide availability and low cost. In the existing literature, many methods treat positioning as a classification problem, and numerous machine learning techniques have been used to improve the prediction accuracy of localization detection [3, 4]. For instance, Pei et al. [5] presented a motion recognition-assisted wireless positioning method using least squares support vector machines (LS-SVM) to detect common motion states that occur during indoor navigation. Wang et al. [6] studied a deep network with three hidden layers pretrained with a greedy learning algorithm for location prediction. Zhang et al. [7] explored a novel wireless positioning method using a four-layer neural network pretrained by a stacked denoising autoencoder (SDA) to extract features from massive, widely fluctuating WiFi data. As received WiFi signal strength is vulnerable to varying environments, Zhang et al. [8] proposed a multi-information fusion algorithm to improve the accuracy of indoor location. Rohra et al. [2] used a neural network to detect the location of users in an indoor environment based on the WiFi signal strength received from various routers; to obtain higher accuracy, a fuzzy hybrid of particle swarm optimization and the gravitational search algorithm was used to train the weights. Bhatti et al. [9] put forth a technique named iF_Ensemble that combines isolation forest (IForest), support vector machine (SVM), k-nearest neighbors (KNN), and random forest (RF) to effectively detect outlier users in a WiFi indoor localization environment.

To overcome the drawback that the accuracy of indoor localization detection based on WiFi signal strength is easily disturbed by diverse environments, researchers have devoted themselves to investigating novel detection methods for improving accuracy. Granular computing has emerged as one of the fastest growing computing paradigms for the formation, processing, and communication of information granules in the artificial intelligence field [10], and it has been widely applied in many real-world areas such as private data protection [11], medical diagnosis [12], image segmentation [13], and web service recommendation [14]. It is therefore meaningful to apply granular computing methods to indoor localization detection to improve detection accuracy.

In granular computing, the construction of information granules and computation with granules are two basic issues. The selection of appropriate granularity levels plays an important role in addressing application problems, and optimal granularity selection has become a research hotspot. Discretization is an essential data preprocessing procedure used to obtain equivalence-based information granules and improve the performance of granular computing methods. Over the past decades, research on discretization-based optimal granularity selection has received significant attention, and many discretization methods have been developed [15, 16]. These methods can be classified under different taxonomies; Liu et al. [17] summarized the commonly used discretization methods in detail. The distinction between unsupervised and supervised methods is the most common taxonomy. Unsupervised discretization methods do not consider class labels during the discretization process; they are generally easy to implement and have low time complexity. Supervised discretization methods, in contrast, relate class labels to the discretization process. Research on discretization has therefore mainly focused on improving supervised methods.

Supervised discretization methods mainly include entropy-based, χ²-measure-based, and dependency-based methods; the main difference among them is how the optimal cut point set is chosen. Entropy-based discretization methods find the optimal cut point set by using various kinds of entropies to evaluate the importance of cut point sets. MDLP (minimal description length principle) is one of the most common of these [18-20]; it uses the class information entropy as the measure to evaluate the importance of cut points during discretization. For a given continuous attribute, the cut point that minimizes the entropy function over all candidate cut points is selected as the optimal cut point. The method is then applied recursively to both intervals induced by the optimal cut point until the MDLP stopping condition defined by Fayyad and Irani is met. χ²-measure-based discretization methods evaluate the importance of cut points based on the χ² statistic; ChiMerge and Chi2 are two typical examples. ChiMerge merges the adjacent pair of intervals with the lowest χ² value until a termination condition is met [21]. One drawback of ChiMerge is that it requires a significance level α to be specified when computing the χ² value; too large or too small an α will over- or underdiscretize the continuous data, and it is hard to find an appropriate α for ChiMerge. Chi2 stems from ChiMerge and is an automated version of it [22]. Dependency-based discretization methods evaluate the importance of cut points based on the dependency between the continuous data and the class labels, calculated from a contingency table; CAIM and CACC are the main examples. CAIM (class-attribute interdependency maximization) defines the interdependency between the class labels and the discretization scheme of a continuous attribute. It maximizes the class-attribute interdependence and generates a possibly minimal number of cut points [23]. The method does not need a predefined number of intervals and usually achieves good discretization results; its drawback is that the number of intervals it generates is very close to the number of target classes. CACC (class-attribute contingency coefficient) improves on CAIM and overcomes its drawbacks: it defines and maximizes a contingency coefficient during discretization instead of maximizing the class-attribute interdependence as CAIM does [24].
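The class-information-entropy criterion that MDLP minimizes can be sketched in a few lines; the function names below are illustrative, not taken from the cited works.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy of the class distribution within one interval."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def cut_entropy(values, labels, cut):
    """Weighted class-information entropy of the two intervals induced by
    `cut`; MDLP selects the candidate cut point minimizing this quantity."""
    left = [l for v, l in zip(values, labels) if v <= cut]
    right = [l for v, l in zip(values, labels) if v > cut]
    n = len(labels)
    return (len(left) / n) * class_entropy(left) + \
           (len(right) / n) * class_entropy(right)

# A cut that separates the classes perfectly has entropy 0, so it wins.
best = min([1.5, 5.0, 8.5],
           key=lambda c: cut_entropy([1, 2, 8, 9], [0, 0, 1, 1], c))
```

In MDLP this evaluation is applied recursively to each resulting interval until the stopping condition is reached.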

Most commonly used supervised discretization methods are local methods that operate on a single attribute: they generate intervals on each attribute independently, without taking into account the interdependence among attributes. In contrast, global discretization methods discretize multiple continuous attributes simultaneously, so the discretization process can be defined as a search for an optimal cut point set over multiple continuous attributes. Global discretization should obtain fewer cut points and better discretization results than local discretization. However, Nguyen and Skowron proved that finding the optimal cut point set of multiple continuous attributes is NP-hard [25]. Therefore, the commonly used discretization methods are basically heuristic, using efficient heuristics to return a suboptimal cut point set.

To obtain optimal equivalence-based information granules, this article proposes a new global heuristic supervised discretization method based on binary ant colony optimization and rough set. It uses binary ant colony optimization to search for the smallest optimal cut point set over multiple attributes while keeping the indiscernibility relation defined in rough set theory unchanged, thereby maintaining the consistency of the multiattribute continuous data. An experiment applying the proposed method to a wireless indoor localization data set indicates that it is effective in improving the prediction accuracy of indoor localization detection.

2.1. Discretization

Discretization investigates how to partition continuous attributes into several discrete intervals with selected optimal cut points. Let a be a continuous attribute of a data set with multiple continuous attributes, and let a cut point c be a value within the range of a. A cut point set D_a = {c_1, c_2, ..., c_k} is used to partition a into k + 1 intervals, each represented by a different label, during the discretization process. The essential problem of discretization is the selection of the optimal cut point set from the candidate cut point set. For a continuous attribute a, let the value set be {v_1, v_2, ..., v_n} after sorting in ascending order and removing duplicate values, where v_1 < v_2 < ... < v_n. The candidate cut point set of a can be defined as C_a = {(v_i + v_{i+1})/2 | i = 1, 2, ..., n - 1}, and the candidate cut point set of the whole data set is the union of the sets C_a over all continuous attributes.

For the available discretization methods, the main difference is the method used to find the optimal cut point set. Local and global discretization methods search for the optimal cut point set of a single attribute and of multiple attributes, respectively. For supervised local or global discretization methods, the optimal cut point set is determined by taking into account the relationship between attributes and class labels.

A discretization process mainly includes four steps: (1) sort the continuous values of the attribute and remove duplicate values, (2) calculate the candidate cut point set of the attribute, (3) search for the optimal cut point set, and (4) use the optimal cut point set to discretize the continuous attribute into a discrete attribute.
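For a single attribute, steps (1), (2), and (4) can be sketched as follows; step (3), the search itself, is method-specific and omitted here. Function names are illustrative.

```python
def candidate_cut_points(values):
    """Steps 1-2: sort, deduplicate, and take midpoints of adjacent values."""
    v = sorted(set(values))
    return [(v[i] + v[i + 1]) / 2 for i in range(len(v) - 1)]

def discretize(values, cuts):
    """Step 4: map each continuous value to the index of its interval,
    i.e., the number of cut points lying below it."""
    cuts = sorted(cuts)
    return [sum(x > c for c in cuts) for x in values]

signal = [-64.0, -56.0, -61.0, -56.0, -71.0]   # e.g. WiFi signal strengths
cand = candidate_cut_points(signal)            # midpoints of distinct values
labels = discretize(signal, [-60.0])           # one cut point -> two intervals
```

With one cut at -60.0, every reading maps to interval 0 (weaker) or 1 (stronger).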

2.2. The Rough Set Theory

The rough set theory proposed by Pawlak et al. is an extension of classic set theory and has proved very useful in dealing with inconsistency problems [26]. Since rough sets are useful for analyzing vague and uncertain data, they have been widely applied in uncertainty data mining and knowledge discovery. Rough set theory operates on an information system S = (U, C ∪ D, V, f), where U is a finite nonempty set of objects, C is a finite nonempty set of condition attributes, D is a finite nonempty set of decision attributes, V is a nonempty set of values of all attributes, and f is an information function such that f(x, a) ∈ V for each x ∈ U and a ∈ C ∪ D. For each nonempty subset of attributes B ⊆ C ∪ D, an indiscernibility relation can be defined as IND(B) = {(x, y) ∈ U × U | f(x, a) = f(y, a) for all a ∈ B}.

IND(B) is called the B-indiscernibility relation. The family of all equivalence classes of the relation IND(B) is denoted by U/IND(B), and the equivalence class containing an object x is denoted by [x]_B.

Let S be an information system, X ⊆ U be a nonempty subset, and B ⊆ C ∪ D. The lower and upper approximations of X are defined as follows: B_*(X) = {x ∈ U | [x]_B ⊆ X} and B^*(X) = {x ∈ U | [x]_B ∩ X ≠ ∅}.

B_*(X) is the set of objects whose equivalence classes under the indiscernibility relation IND(B) are certainly contained in X; B^*(X) is the set of objects whose equivalence classes under IND(B) are possibly contained in X. For any nonempty subset X ⊆ U and B ⊆ C ∪ D, X is a definable set with respect to B iff B_*(X) = B^*(X); otherwise, X is a rough set with respect to B iff B_*(X) ≠ B^*(X). The vagueness of X can be described by the accuracy of approximation α_B(X) = |B_*(X)| / |B^*(X)|, where X ≠ ∅. α_B(X) provides a measure of how close the lower and upper approximations of X are.

The definition of the accuracy of approximation can be extended to a partition or classification of U. Let the subsets X_1, X_2, ..., X_r be the equivalence classes of IND(D). The lower and upper approximations of the classification F = {X_1, X_2, ..., X_r} are expressed as B_*(F) = {B_*(X_1), ..., B_*(X_r)} and B^*(F) = {B^*(X_1), ..., B^*(X_r)}, respectively. The accuracy of approximation of the classification is defined as α_B(F) = Σ_{i=1}^{r} |B_*(X_i)| / Σ_{i=1}^{r} |B^*(X_i)|.

α_B(F) denotes the ratio of all correctly classified objects to all possibly classified objects by means of the attributes in B; it measures the classification ability of B with respect to F. The closer the value of α_B(F) is to 1, the greater the classification accuracy with respect to B.
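The approximations and the accuracy of approximation of a classification can be illustrated with a short sketch; the helper names are ours, not from [26].

```python
from collections import defaultdict

def partition(universe, key):
    """Equivalence classes of the indiscernibility relation induced by `key`
    (objects with equal attribute values fall into the same block)."""
    blocks = defaultdict(set)
    for x in universe:
        blocks[key(x)].add(x)
    return list(blocks.values())

def lower_upper(blocks, target):
    """Lower approximation: union of blocks fully inside `target`.
    Upper approximation: union of blocks that intersect `target`."""
    lower = set().union(*([b for b in blocks if b <= target] or [set()]))
    upper = set().union(*([b for b in blocks if b & target] or [set()]))
    return lower, upper

def classification_accuracy(blocks, classes):
    """alpha_B(F): total lower-approximation size over total upper size."""
    lows = ups = 0
    for target in classes:
        lo, up = lower_upper(blocks, target)
        lows, ups = lows + len(lo), ups + len(up)
    return lows / ups

# Six objects grouped pairwise by an attribute; two decision classes that
# split one block, so the classification is only partially definable.
blocks = partition(range(6), lambda x: x // 2)   # {0,1}, {2,3}, {4,5}
acc = classification_accuracy(blocks, [{0, 1, 2}, {3, 4, 5}])
```

Here the block {2, 3} straddles both classes, so α_B(F) = (2 + 2)/(4 + 4) = 0.5.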

2.3. Binary Ant Colony Optimization

Ant colony optimization (ACO), initially proposed by Dorigo et al., is a stochastic metaheuristic for global combinatorial optimization problems [27]. ACO is a nature-inspired intelligent algorithm motivated by the observation that ants can always find the shortest path between their nest and food. The main idea of ACO is that a number of ants cooperate through the pheromone laid on the paths. ACO has been successfully applied in data mining, particularly for learning classification rules [28]. Binary ant colony optimization (BACO) is a binary version of ACO for solving binary optimization problems. In BACO, artificial ants walk on a mapping graph that represents the given binary optimization problem, as described in Figure 1, and each path corresponds to a potential solution to the given problem [29].

An ant walks from node 1 to node n, selecting at each node i one of the two paths i0 and i1 depending on the pheromone on the paths. τ_ij represents the pheromone level of path ij. A solution of BACO can be represented by a binary string b = (b_1, b_2, ..., b_n): if the ant selects path i0, then b_i = 0, and if the ant selects path i1, then b_i = 1. The probability that an ant selects path ij (j ∈ {0, 1}) can be defined as [30] p_ij = (τ_ij)^α (η_ij)^β / ((τ_i0)^α (η_i0)^β + (τ_i1)^α (η_i1)^β), where η_i0 and η_i1 represent the visibility of paths i0 and i1, and α and β are both weight factors.

Every ant makes a complete selection over the entire sequence of paths to generate a solution. After all ants complete their routines, the solutions generated during the current iteration are evaluated by the fitness function. The ants with better fitness values are selected to reinforce the pheromone. The pheromone on each path is updated as τ_ij = (1 − ρ)τ_ij + Δτ_ij, where ρ is the evaporation rate of the pheromone and Δτ_ij is the intensified pheromone deposited by an ant on path ij.
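The per-bit path selection and the pheromone update can be sketched as follows, assuming the standard ACO probability and evaporation rules; parameter names mirror the text, and the functions are illustrative.

```python
import random

def choose_bit(tau0, tau1, eta0=1.0, eta1=1.0, alpha=1.0, beta=1.0):
    """Probabilistic 0/1 choice for one bit of the solution string,
    weighted by pheromone (tau) and visibility (eta)."""
    w0 = (tau0 ** alpha) * (eta0 ** beta)
    w1 = (tau1 ** alpha) * (eta1 ** beta)
    return 1 if random.random() < w1 / (w0 + w1) else 0

def update_pheromone(tau, rho, delta):
    """Evaporate a fraction rho of the pheromone, then add reinforcement."""
    return (1.0 - rho) * tau + delta
```

If all pheromone sits on path i1 the ant always picks bit 1, and vice versa; in between, the choice is proportional to the weighted pheromone levels.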

3. BACORS

Optimal discretization of multiple attributes is in fact a subset optimization problem: it selects the optimal cut point set from the candidate cut point set. A solution can be represented by a binary-encoded bit string: for each candidate cut point, the bit value is 1 if it belongs to the optimal cut point set and 0 otherwise. This is clearly a binary optimization problem, and since BACO is a useful technique for binary optimization, it is used here to solve the optimal discretization problem of multiple attributes. The selection of the optimal cut point set with BACO during the discretization process can also be described by Figure 1, where the candidate cut points are represented by nodes 1 to n. The length of the routine equals the cardinality of the candidate cut point set. If an ant walks through the upper path i0, the corresponding candidate cut point is not selected; if it walks through the lower path i1, the candidate cut point is selected. The pheromone laid on the paths guides the selection of 0 or 1 for each bit of the solution string, and the probabilities of selecting 0 or 1 are calculated from the pheromone.

The fitness function is used to evaluate each solution (ant) in BACO. To find the optimal cut point set, the fitness of a solution is defined in terms of two constants, the cardinality |P| of the selected cut point set P, and the accuracy of approximation of classification induced by P. For multiple attributes, optimal discretization seeks the fewest cut points over all attributes while keeping the data consistent with respect to the class. Therefore, the fitness function takes both the consistency, measured by the accuracy of approximation defined in rough set theory, and the number of cut points into account. The solution of an ant with a higher accuracy of approximation and fewer cut points has a higher fitness value.

Each ant deposits an amount of pheromone on the paths it travels, which can be detected by other ants; paths with a richer pheromone concentration will be selected by more ants. The pheromone laid by an ant on path ij is defined accordingly in Eq. (11).

The pheromone on each path is updated for each ant, and the selection probabilities of paths i0 and i1 for each ant are then recalculated. To improve the search ability, some randomness is added when determining the transfer direction of the ant: the value of each bit of the solution is determined using two random numbers drawn from [0, 1]. The steps of BACORS are as follows:

Input: information system ;
Output: cut point set corresponding to the solutions having highest fitness values;
Initialize the parameters of BACORS, including the ant colony population P, the number of generations G, the best fitness, the pheromone on each path, the visibility of each path, the evaporation rate of the pheromone, and the accuracy of approximation classification defined in rough set theory;
Calculate the candidate cut point set of multiple continuous attributes C;
Initialize the solution of each ant;
for g=1, 2, ..., G
 for m=1, 2, ..., P
  Determine the routine of ant m by Eq. (7) and Eq. (8);
  Obtain the cut point set according to the routine;
  Calculate the accuracy of approximation classification by Eq. (6);
  Calculate the fitness of ant m by Eq. (10);
  if the fitness of ant m is greater than the best fitness then
   Update the best fitness and record the solution of ant m;
   Obtain the cut point set corresponding to the solution;
  end if
  if ant m is selected to reinforce the pheromone then
   Calculate the intensified pheromone of ant m by Eq. (11);
  end if
 end for
 Update the pheromone on the paths by Eq. (9);
end for
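The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fitness form w1*alpha - w2*|P| and the reinforcement rule are our assumptions standing in for Eqs. (9)-(11).

```python
import random

def bacors(data, labels, n_ants=10, n_gens=20, w1=1.0, w2=0.1, rho=0.1, seed=0):
    """Sketch of the BACORS loop: BACO search over candidate cut points,
    guided by the rough-set accuracy of approximation of classification."""
    rng = random.Random(seed)
    # Candidate cuts: midpoints of adjacent distinct values, per attribute.
    cands = []
    for j in range(len(data[0])):
        v = sorted({row[j] for row in data})
        cands += [(j, (v[i] + v[i + 1]) / 2) for i in range(len(v) - 1)]
    tau = [[1.0, 1.0] for _ in cands]          # pheromone on paths i0 / i1

    def evaluate(bits):
        cuts = [c for b, c in zip(bits, cands) if b]
        blocks = {}                            # equivalence classes of the
        for i, row in enumerate(data):         # discretized data
            key = tuple(sum(x > t for jj, t in cuts if jj == j)
                        for j, x in enumerate(row))
            blocks.setdefault(key, []).append(i)
        # Pure blocks form the lower approximations; each block contributes
        # to the upper approximation of every class it intersects.
        low = sum(len(b) for b in blocks.values()
                  if len({labels[i] for i in b}) == 1)
        up = sum(len(b) * len({labels[i] for i in b}) for b in blocks.values())
        alpha = low / up
        return w1 * alpha - w2 * len(cuts), cuts, alpha

    best_fit, best_cuts = float("-inf"), []
    for _ in range(n_gens):
        for _ in range(n_ants):
            bits = [1 if rng.random() < t1 / (t0 + t1) else 0 for t0, t1 in tau]
            fit, cuts, alpha = evaluate(bits)
            if fit > best_fit:
                best_fit, best_cuts = fit, cuts
            if alpha == 1.0:                   # reinforce consistent solutions
                for i, b in enumerate(bits):
                    tau[i][b] += 0.1
        tau = [[(1 - rho) * t0, (1 - rho) * t1] for t0, t1 in tau]
    return best_cuts
```

On a toy attribute with classes separable by one cut, the search should retain the class-separating cut at the midpoint 5.0 while discarding redundant ones.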

4. Experimental Study

4.1. Data Sets

A wireless indoor localization data set was gathered from the U.C. Irvine repository. It was collected to study how signal strengths can be used to determine the indoor location of a user. The data set has 2000 instances and eight attributes. The first seven attributes are the signal strengths captured by an Android device from seven routers, and the last attribute is the class label, taking the values 1, 2, 3, and 4, which correspond to four locations in an office, such as the conference room, the kitchen, or the indoor sports room.

4.2. Experiment Settings

The performance of our proposed method is compared with seven other commonly used discretization methods:
(1) EW (equal width)
(2) EF (equal frequency)
(3) MDLP (minimal description length principle)
(4) ChiMerge
(5) Chi2
(6) CAIM (class-attribute interdependency maximization)
(7) CACC (class-attribute contingency coefficient)
(8) BACORS (hybrid BACO and RS)

Among the eight discretization methods, EW and EF are unsupervised and require the user to specify the number of intervals, which is set to 5 and 10 in our experiment. The remaining six methods are supervised. MDLP is an entropy-based discretization method. ChiMerge and Chi2 are χ²-measure-based methods; for ChiMerge, we set the significance level to 0.95. CAIM and CACC are dependency-based methods.

To validate the effectiveness of the different discretization methods, the C4.5, Naive Bayes, and Bayes Network classifiers are applied to the experimental data set; discretization has been shown to improve the accuracy and efficiency of these three classifiers, which is why they were chosen. The wireless indoor localization data set is split into training and test sets. The training instances are selected randomly from the original data, and five training and testing schemes are generated, with the training ratios set to 0.1, 0.15, 0.2, 0.25, and 0.3 of the original data, respectively. The discretization methods are applied to the training sets, and the test sets are discretized using the cut points generated from the training sets. The classification accuracies, as the major evaluation indicator, are compared among the eight discretization methods.
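The evaluation protocol — learning cut points on the training split only and reusing them on the test split, so no test information leaks into the discretization — can be sketched as follows; names and values are illustrative.

```python
def apply_cuts(rows, cuts_per_attr):
    """Discretize held-out rows with cut points learned on the training set:
    each value maps to the count of training cut points below it."""
    out = []
    for row in rows:
        out.append(tuple(sum(x > t for t in sorted(cuts_per_attr[j]))
                         for j, x in enumerate(row)))
    return out

# Hypothetical cut points produced by a discretizer on the training split.
train_cuts = {0: [-60.0], 1: [-55.0, -45.0]}
test_rows = [[-64.0, -50.0], [-58.0, -40.0]]
coded = apply_cuts(test_rows, train_cuts)
```

The discrete codes produced here are what the C4.5, Naive Bayes, and Bayes Network classifiers would consume in place of the raw signal strengths.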

4.3. Results

The wireless indoor localization data are discretized with the proposed BACORS and the other discretization methods, and the impact of the discretization methods on the classification results is analyzed. The original and discretized wireless indoor localization data are classified by the C4.5, Naive Bayes, and Bayes Network classifiers. The classification accuracies of the three classifiers with the different discretization methods are shown in Tables 1-3. To reflect the differences among the classification accuracies in Tables 1-3 more effectively, four maps are drawn (Figures 2-5), which give a direct, visual view of how the classification accuracies change with the different discretization methods.

4.4. Discussions

In our experimental study, BACORS is used to discretize the multiattribute continuous wireless indoor location data to improve the classification accuracies. To assess its effectiveness, the classification accuracies of the C4.5, Naive Bayes, and Bayes Network classifiers are compared between BACORS and the other commonly used discretization methods. Figure 2 displays a map of the mean classification accuracies for the original and discretized wireless indoor localization data with the different discretization methods, averaged over the five training and testing schemes. The discretization methods clearly have an impact on the performance of the classifiers. Compared with the original data, the classification results improve when the data are discretized with some of the methods, but the performance of a given discretization method differs across classifiers. The mean classification accuracies of the C4.5 and Naive Bayes classifiers on data discretized with the other commonly used methods are lower than the mean classification accuracy on the original data. However, the classification accuracies of the Bayes Network classifier improve when EW (10), EF (10), MDLP, CAIM, ChiMerge, and Chi2 are used to discretize the original data. The mean classification accuracies for the data discretized with BACORS are 96.29%, 97.07%, and 97.25% for the three classifiers; these are the highest among all accuracies obtained on the original data or on data discretized with the other methods.

Figures 3-5 display maps of the classification accuracies of C4.5, Naive Bayes, and Bayes Network, respectively, when the different discretization methods are applied to the five training and testing schemes. From these three maps, the impacts of the discretization methods on the classification results can be compared. Figure 3 shows that the classification accuracies of the C4.5 classifier are the highest when BACORS is applied, for all five training and testing schemes, and that they increase as the training sets become larger. The other discretization methods perform poorly, especially the unsupervised methods EW and EF: the classification accuracies of the C4.5 classifier based on these methods are lower than those for the original data. The number of intervals is a parameter of EW and EF that must be predefined before discretization; as there is no rule for finding the optimal number of intervals, 5 and 10 are used in this article. Compared with EW and EF, the classification accuracies of the C4.5 classifier are significantly higher when BACORS is used.

Analysis of Figure 4 shows that the differences in classification accuracy become smaller for the Naive Bayes classifier. The classification accuracies are larger than those for the original data when the training ratio is 0.1 and MDLP, CAIM, CACC, or Chi2 is used for discretization. The classification accuracies are also improved when ChiMerge and Chi2 are used with training ratios of 0.25 and 0.3, respectively. The classification accuracies of the Naive Bayes classifier remain the highest for all training and testing schemes when the data are discretized with BACORS. It can be seen from Figure 5 that the classification accuracies of Bayes Network are very close for most of the discretization methods. When the training ratios are 0.1, 0.25, and 0.3, the classification accuracies based on most of the discretization methods are much larger than those for the original data, indicating that discretization helps improve the classification results for the Bayes Network classifier. With BACORS, the classification accuracies are the highest except at the training ratio of 0.3, where ChiMerge reaches the highest accuracy and BACORS the second highest. Based on this comparison of classification accuracies, we regard BACORS as an optimal discretization method for enhancing the accuracy of indoor localization detection.

5. Conclusions and Future Works

In this paper, we propose a novel discretization method named BACORS that constructs optimal granularity structures for multiple continuous attributes by combining binary ant colony optimization with rough set theory. BACORS uses binary ant colony optimization (BACO) to search for the globally optimal cut point set of multiple continuous attributes, while the accuracy of approximation of classification defined in rough set theory is used to measure the consistency of the data and to construct the fitness function guiding the path selection of the ants. We apply BACORS to the wireless indoor localization data to validate its effectiveness; compared with several typical discretization methods, BACORS achieves relatively better classification performance with the C4.5, Naive Bayes, and Bayes Network classifiers. However, some issues remain to be resolved. For example, several parameters must be initialized when BACO is used to search for the optimal cut point set, such as the colony size, the number of generations, the evaporation rate, and the visibility of the paths, and the efficiency of BACO depends on these choices; how to select optimal BACO parameters to obtain better discretization results needs further research. In BACORS, the accuracy of approximation of classification defined in rough set theory measures the consistency of the multiattribute data. Multigranulation rough set (MGRS) is an extended version of classic rough sets that uses multiple binary relations rather than a single binary relation to construct granular structures; it is a new kind of information fusion strategy and has become a desirable direction in granular computing [31-33]. In future research, we will investigate further discretization methods based on BACO and MGRS to extend the applicability of the proposed method [34, 35].

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China (Grant Nos. 62072291, 62072294, 61672332, 62002210), Anhui Provincial Natural Science Foundation (2008085MF202), Natural Science Foundation of Shanxi Province, China (Grant No. 20210302123455), and Open Project Program of the Key Laboratory of Embedded System and Service Computing Ministry of Education (Tongji University) (Grant No. 2021-04).