Abstract

Granular computing represents, constructs, and processes information granules, which have been formalized in many different approaches, each emphasizing the same fundamental facets in its own way. In this paper, we present multigranulation rough set models in the intuitionistic fuzzy neighborhood information system generated from Internet of Things (IoT) data with weighted features and develop the basic properties of the proposed models. Moreover, a multigranulation-based optimal granularity selection approach is established by introducing the concept of granularity significance. Experimental results on eight IoT datasets demonstrate that the proposed methods are more efficient than three classical methods under the intuitionistic fuzzy weighted neighborhood information system.

1. Introduction

The Internet of Things (IoT) [1–4] refers to the real-time collection, through various information sensors, laser scanners, and other devices and technologies, of important information or processes that need monitoring, connection, and interaction, thereby forming IoT datasets. Granular computing [5–8] is an area of study that explores different levels of granularity in human-centered perception, problem solving, and information processing, as well as their applications in the design and implementation of knowledge-intensive intelligent systems. As an emerging area, granular computing contributes a great deal of original and practically relevant ideas. It unifies the fundamental ideas of interval analysis, fuzzy sets, and rough sets, and supports a coherent view of all of them under the overarching concept of information granularity. It helps identify the main processing problems and the key features of such processing that are common to all the formalisms considered. In this way, granular computing forms a coherent conceptual and algorithmic platform that directly benefits from the well-established notions of information granules formed in the settings of set theory, rough sets, fuzzy sets, and others.

Since Zadeh introduced fuzzy set theory [9], several generalizations have been proposed [10–13]. Among them, the intuitionistic fuzzy set proposed by Atanassov [10] provides a flexible mathematical framework to cope not only with vagueness but also with the hesitancy originating from imprecise information. The intuitionistic fuzzy set extends the fuzzy set by considering both a membership degree and a nonmembership degree, each a function valued in the interval $[0, 1]$, whereas the fuzzy set gives a membership degree only. That is, intuitionistic fuzzy sets are described by two functions expressing the degree of membership (belongingness) and the degree of nonmembership (nonbelongingness) of elements of the universe to the set. Compared with a fuzzy set, an intuitionistic fuzzy set can improve the accuracy of the results even when the intuitionistic index is 0. The intuitionistic fuzzy set can thus describe the fuzziness of the objective world more accurately, and it has attracted the attention of many scholars [14–19]. Different aspects of intuitionistic fuzzy sets have been used for pattern recognition and decision-making, where imperfect facts coexist with imprecise knowledge. Numerous scholars have put forward many theories on the foundation of intuitionistic fuzzy set theory, such as the intuitionistic fuzzy ordered decision table and the intuitionistic fuzzy neighborhood rough set model. Intuitionistic fuzzy set theory has been extensively applied in data classification [20, 21], decision-making [22, 23], attribute reduction [24], prediction [25, 26], risk evaluation [27], and so on.

Rough set theory, proposed by Pawlak [28–31], is an extension of classical set theory and can be regarded as a mathematical and soft computing tool for handling imprecision, vagueness, and uncertainty in data analysis. This relatively new soft computing methodology has received great attention in recent years, and its effectiveness has been confirmed by successful applications in many science and engineering fields, such as pattern recognition, data mining, image processing, and medical diagnosis. Rough set theory is built on a classification mechanism: objects are classified by an equivalence relation on a specific universe, and the equivalence relation constitutes a partition of the universe. Owing to the uncertainty and complexity of particular problems, several extensions of the rough set model have been proposed for various requirements, such as the variable precision rough set model and the rough set model based on a neighborhood relation. These extended rough set models fall roughly into two perspectives: (1) extending the data type, including incomplete, set-valued, interval-valued, fuzzy, and intuitionistic fuzzy data, and (2) extending the binary relation, including similarity, tolerance, dominance, and neighborhood relations. From the perspective of granular computing, the binary relation used can be regarded as a granulation. Hence, classical rough sets are based on a single granulation (only one equivalence relation). However, rough sets may be associated with many granulations. Qian et al. extended Pawlak's single-granulation rough set model to a multiple-granulation rough set model [32], and since the multigranulation rough set was first proposed, many researchers have extended it further [33–38]. Within the multigranulation framework, optimistic multigranulation and pessimistic multigranulation are the two most basic lines of research.

For an IoT dataset, the attributes or features sometimes do not have the same weight [39]: some weight factors are large and some are small. The attributes are not always equally important; for example, when judging a person's gender from certain characteristics, hair length is considered more important than age. Thus, assigning a different weight to each attribute is extremely important. However, previous studies along these lines were carried out under a single granulation. With the advent of massive data, it is necessary to study granularity under the multigranulation setting; in this situation, still using the traditional rough set for data analysis is clearly inappropriate, and new data analysis methods that can handle weighted features are needed. For the intuitionistic fuzzy weighted neighborhood information system formed from IoT data, the purpose of this paper is to propose several multigranulation intuitionistic fuzzy weighted neighborhood rough set models, study their important properties, and then perform optimal granularity selection. The main contents and innovations of this article can be summarized as follows:
(1) The multigranulation rough set models in intuitionistic fuzzy weighted neighborhood information systems and their corresponding basic properties are discussed.
(2) A granularity selection criterion based on granularity significance is proposed to select the optimal granularity from intuitionistic fuzzy weighted neighborhood information systems.
(3) An experimental evaluation is performed on 8 publicly available datasets, and the superiority of optimal granularity selection is shown by analyzing the experimental results.

Selecting the optimal granularity eliminates irrelevant or redundant granularities, thereby reducing the number of granularities, improving the accuracy of the model, and shortening the running time. That is, it reduces the amount of data to be processed, saves processing time, lessens the impact of noise in the data, and improves the performance of the information processing system. This paper is organized as follows. In Section 2, related concepts about the intuitionistic fuzzy weighted neighborhood information system and the multigranulation rough set model are reviewed briefly. In Section 3, three kinds of multigranulation rough sets for intuitionistic fuzzy weighted data are constructed, and their properties and interrelationships are discussed. In Section 4, the concepts of dependency degree and granularity significance are introduced, and a heuristic algorithm is presented to select the optimal granularity of the intuitionistic fuzzy weighted neighborhood information system. In Section 5, the corresponding experimental testing is conducted on IoT-related data from public datasets to verify the effectiveness of the proposed method. Finally, Section 6 covers some conclusions.

2. Preliminaries

In this section, we review the basic concepts of the intuitionistic fuzzy set and the intuitionistic fuzzy weighted neighborhood rough sets. The notion of an information system provides a convenient basis for representing objects in terms of their attributes.

Definition 1. An information system is a tuple $(U, A, V, f)$, where $U = \{x_1, x_2, \ldots, x_n\}$ is a nonempty finite set of objects; $A = \{a_1, a_2, \ldots, a_m\}$ is a nonempty finite set of attributes; $V = \bigcup_{a \in A} V_a$, where $V_a$ is the domain of attribute $a$; and $f : U \times A \rightarrow V$ is an information function such that $f(x, a) \in V_a$ is the value of $x$ on attribute $a$.

A decision information system is an information system $(U, C \cup D, V, f)$, where $C \cap D = \emptyset$; $C$ is the condition attribute set, while $D$ is called the decision attribute set. In the decision information system, $R_C$ and $R_D$ are the equivalence relations induced by $C$ and $D$, respectively, constructed as $R_C = \{(x, y) \in U \times U : f(x, a) = f(y, a), \forall a \in C\}$ and $R_D = \{(x, y) \in U \times U : f(x, d) = f(y, d), \forall d \in D\}$. It is easy to see that $R_C$ partitions the universe into disjoint subsets, and so does $R_D$. Such a partition of the universe is a quotient set of $U$, denoted by $U / R_C = \{[x]_C : x \in U\}$, where $[x]_C = \{y \in U : (x, y) \in R_C\}$ is the equivalence class containing $x$ with respect to $C$. If $R_C \subseteq R_D$, then we say the system is consistent; otherwise, it is inconsistent. For the sake of simplicity, in the sequel we write $U / C$ for $U / R_C$, and $U / D = \{D_1, D_2, \ldots, D_r\}$ denotes the set of decision classes.

Let $(U, A, V, f)$ be an information system, $P \subseteq A$, and $R_P$ the induced equivalence relation. For any $X \subseteq U$, one can characterize $X$ by a pair of lower and upper approximations:
$$\underline{R_P}(X) = \{x \in U : [x]_P \subseteq X\}, \qquad \overline{R_P}(X) = \{x \in U : [x]_P \cap X \neq \emptyset\}.$$

For a target concept $X$, if $\underline{R_P}(X) = \overline{R_P}(X)$, $X$ is called a definable set; if $\underline{R_P}(X) \neq \overline{R_P}(X)$, then $X$ is called a Pawlak rough set. Three regions can be obtained as $\mathrm{POS}(X) = \underline{R_P}(X)$, $\mathrm{NEG}(X) = U - \overline{R_P}(X)$, and $\mathrm{BND}(X) = \overline{R_P}(X) - \underline{R_P}(X)$, which are called the positive region, negative region, and boundary region of $X$, respectively.
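As a computational illustration (the function and variable names are ours, not from the paper), the lower and upper approximations and the derived regions can be computed directly from a partition of the universe:

```python
def approximations(partition, X):
    """Pawlak lower/upper approximations of a target set X from a
    partition of the universe (a list of disjoint equivalence classes)."""
    X = set(X)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= X:
            lower |= block        # block entirely contained in X
        if block & X:
            upper |= block        # block intersects X
    return lower, upper

partition = [{1, 2}, {3, 4}, {5, 6}]
low, up = approximations(partition, {1, 2, 3})
boundary = up - low               # the negative region is U - up
```

Here `low` is the positive region and `boundary` collects the blocks that meet the target set without being contained in it.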

In an information system, the equivalence class of an object with respect to an attribute subset is a granule from the viewpoint of granular computing. A partition of the universe is a granular structure, and an attribute set or its partition can also be called a granulation. The rough set proposed by Pawlak is a single-granulation model, whose granular structure is induced by the indiscernibility relation of the attribute set. In general, this assumption cannot always be satisfied or required in practical problems. In the three cases discussed in reference [40], the single-granulation rough set has limitations for practical problems with multiple partitions, and the multigranulation rough set solves these problems better. Under those circumstances, we must describe a target concept through multiple binary relations on the universe according to a user's requirements or the targets of problem solving. In multigranulation rough sets, a concept is approximated through multiple partitions of the universe, induced by multiple equivalence relations.

Definition 2. Let $(U, A, V, f)$ be an information system and $R_1, R_2, \ldots, R_m$ be equivalence relations induced by attribute subsets $A_1, A_2, \ldots, A_m \subseteq A$. For any $X \subseteq U$, the optimistic and pessimistic multigranulation lower and upper approximations of the target set $X$ are
$$\underline{\textstyle\sum_{i=1}^{m} R_i}^{O}(X) = \{x \in U : \exists i,\ [x]_{R_i} \subseteq X\}, \qquad \overline{\textstyle\sum_{i=1}^{m} R_i}^{O}(X) = \Big(\underline{\textstyle\sum_{i=1}^{m} R_i}^{O}(X^c)\Big)^c,$$
$$\underline{\textstyle\sum_{i=1}^{m} R_i}^{P}(X) = \{x \in U : \forall i,\ [x]_{R_i} \subseteq X\}, \qquad \overline{\textstyle\sum_{i=1}^{m} R_i}^{P}(X) = \Big(\underline{\textstyle\sum_{i=1}^{m} R_i}^{P}(X^c)\Big)^c.$$
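Definition 2's operators translate directly into set computations. The sketch below (with our own names) implements the optimistic (existential) and pessimistic (universal) lower approximations together with their dual upper approximations:

```python
def block_of(partition, x):
    """Equivalence class of x in a partition (a list of disjoint sets)."""
    return next(set(b) for b in partition if x in b)

def multigranulation(partitions, U, X):
    """Optimistic/pessimistic multigranulation approximations of X."""
    X = set(X)
    opt_lower = {x for x in U if any(block_of(p, x) <= X for p in partitions)}
    pes_lower = {x for x in U if all(block_of(p, x) <= X for p in partitions)}
    # Upper approximations follow by duality on the complement of X.
    opt_upper = {x for x in U if all(block_of(p, x) & X for p in partitions)}
    pes_upper = {x for x in U if any(block_of(p, x) & X for p in partitions)}
    return opt_lower, opt_upper, pes_lower, pes_upper

U = {1, 2, 3, 4}
P1 = [{1, 2}, {3, 4}]
P2 = [{1}, {2, 3}, {4}]
opt_l, opt_u, pes_l, pes_u = multigranulation([P1, P2], U, {1, 2})
```

The pessimistic lower approximation is always contained in the optimistic one, since the universal condition is stronger than the existential one.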

An intuitionistic fuzzy set $A$ on $U$ has the form $A = \{\langle x, \mu_A(x), \nu_A(x)\rangle : x \in U\}$, where $\mu_A : U \rightarrow [0, 1]$ and $\nu_A : U \rightarrow [0, 1]$ are called the membership degree and nonmembership degree of the object $x$ to $A$. Furthermore, they satisfy $0 \leq \mu_A(x) + \nu_A(x) \leq 1$ for any $x \in U$. In general, we use $\mathrm{IF}(U)$ to denote all intuitionistic fuzzy sets in the universe $U$. Let $A, B \in \mathrm{IF}(U)$. If both $\mu_A(x) = \mu_B(x)$ and $\nu_A(x) = \nu_B(x)$ for any $x \in U$, then we say $A$ is equal to $B$, denoted by $A = B$. The universe set and empty set are special intuitionistic fuzzy sets, where $\mu_U(x) = 1, \nu_U(x) = 0$ and $\mu_\emptyset(x) = 0, \nu_\emptyset(x) = 1$. The intersection and union of $A$ and $B$ are denoted as $A \cap B = \{\langle x, \min(\mu_A(x), \mu_B(x)), \max(\nu_A(x), \nu_B(x))\rangle : x \in U\}$ and $A \cup B = \{\langle x, \max(\mu_A(x), \mu_B(x)), \min(\nu_A(x), \nu_B(x))\rangle : x \in U\}$, respectively. Moreover, we denote the complement of $A$ by $A^c = \{\langle x, \nu_A(x), \mu_A(x)\rangle : x \in U\}$.

For $A \in \mathrm{IF}(U)$ and $x \in U$, the membership degree of $x$ is $\mu_A(x)$, and the nonmembership degree of $x$ is $\nu_A(x)$. Then, the intuitionistic index (or hesitancy degree) of $x$ is $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$, and it satisfies $0 \leq \pi_A(x) \leq 1$. Under the condition that the intuitionistic index is a constant, this formula can be used to determine the nonmembership degree of an intuitionistic fuzzy set.
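The intuitionistic fuzzy operations above can be illustrated on single $(\mu, \nu)$ pairs (the helper names are ours):

```python
def ifs_check(mu, nu):
    """An intuitionistic fuzzy value must satisfy 0 <= mu + nu <= 1."""
    return 0.0 <= mu and 0.0 <= nu and mu + nu <= 1.0

def hesitancy(mu, nu):
    """Intuitionistic (hesitancy) index: pi = 1 - mu - nu."""
    return 1.0 - mu - nu

def ifs_complement(mu, nu):
    """Complement swaps membership and nonmembership."""
    return (nu, mu)

def ifs_intersection(a, b):
    """Pointwise min of memberships, max of nonmemberships."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def ifs_union(a, b):
    """Pointwise max of memberships, min of nonmemberships."""
    return (max(a[0], b[0]), min(a[1], b[1]))
```

For instance, the pair $(0.6, 0.3)$ is admissible with hesitancy $0.1$, while $(0.7, 0.5)$ violates the constraint $\mu + \nu \leq 1$.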

An information system is called an intuitionistic fuzzy information system if the domain of each condition attribute is an intuitionistic fuzzy set of the universe. A decision intuitionistic fuzzy information system is an intuitionistic fuzzy information system with a decision attribute set, namely, one in which a decision attribute is distinguished. For an intuitionistic fuzzy information system, in many cases the condition attributes are no longer equally important; rather, each attribute has a different impact on the system. In other words, the weight of each attribute should be considered. Below we introduce an objective method for computing weights in an intuitionistic fuzzy information system and provide the concept of the intuitionistic fuzzy weighted neighborhood information system.

Definition 3 (see [39]). Given an intuitionistic fuzzy weighted neighborhood information system and a neighborhood threshold, an intuitionistic fuzzy weighted neighborhood relation is defined as follows.

Denote as the intuitionistic fuzzy weighted neighborhood class of . , it satisfies . is a function to compute the distance between elements and , which is defined as

where refers to the attribute value of element under attribute . where means the attribute ’s membership degree weight, and means the attribute ’s nonmembership degree weight.

The formation of and is calculated as follows. where and are defined as and means the membership degree of in the attribute ; similarly, means the nonmembership degree of in the attribute , and is the decision vector

Sometimes, some columns in the membership (or nonmembership) matrix are linearly correlated, or the number of samples is less than the number of attributes, which makes the corresponding product matrices singular. Under these circumstances, the calculation formulas of the weights turn into
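The singular case can be handled numerically with the Moore-Penrose pseudo-inverse. The sketch below assumes a least-squares form for the weight computation (the paper's exact formulas are not reproduced in this copy), and the names `safe_solve`, `M`, and `d` are illustrative:

```python
import numpy as np

def safe_solve(M, d):
    """Least-squares weights for M w = d. When M^T M is singular
    (linearly dependent columns, or fewer samples than attributes),
    fall back to the Moore-Penrose pseudo-inverse."""
    MtM = M.T @ M
    if np.linalg.matrix_rank(MtM) < MtM.shape[0]:
        return np.linalg.pinv(M) @ d       # singular case: pseudo-inverse
    return np.linalg.solve(MtM, M.T @ d)   # regular case: normal equations

# Rank-deficient example: the second column is twice the first.
M = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
d = np.array([1.0, 2.0, 3.0])
w = safe_solve(M, d)                       # minimal-norm solution
```

Because `d` lies in the column space of `M`, the pseudo-inverse route still reproduces `d` exactly while returning the minimal-norm weight vector.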

3. Multigranulation Rough Sets for Intuitionistic Fuzzy IoT Data with Weighted Attributes

In this section, we construct multigranulation rough sets in intuitionistic fuzzy weighted neighborhood information systems and discuss their corresponding properties. We denote the pessimistic multigranulation intuitionistic fuzzy weighted neighborhood rough set as model I and the optimistic multigranulation intuitionistic fuzzy weighted neighborhood rough set as model II.

Definition 4. Let be an , , and are intuitionistic fuzzy weighted neighborhood relations induced by . and are thresholds, where . The weight of each granularity can be calculated from formula (7). Then, the upper and lower approximations of model I and model II are defined as follows.
Model I: Model II:

Remark 5.
(1) The relation satisfies reflexivity and symmetry, and it induces a covering on the universe.
(2) When , and , the relation degenerates to an intuitionistic fuzzy equivalence relation.

In order to facilitate the generalized multigranulation rough sets in intuitionistic fuzzy weighted neighborhood information system, the definition of characteristic function is discussed in the following.

Definition 6. Let be an intuitionistic fuzzy weighted neighborhood information system, and let be the intuitionistic fuzzy weighted neighborhood relations induced by . The characteristic functions and are defined as follows. represents the relationship between the conditional probability of under and the parameter , and represents the relationship between the conditional probability of under and the parameter . indicates the total number of granularities whose conditional probability satisfies , and indicates the total number of granularities whose conditional probability satisfies .
With the above concepts, model I and model II can be expressed as
Model I: Model II: We introduce a parameter to study the upper and lower approximations of the generalized multigranulation intuitionistic fuzzy weighted neighborhood rough sets below.

Definition 7. Let be a intuitionistic fuzzy weighted neighborhood information system, , and are intuitionistic fuzzy weighted neighborhood relations induced by . The generalized upper approximation and lower approximation are defined as follows:

The positive region, negative region, upper boundary region, and lower boundary region are derived as
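The exact formulas of Definition 7 are not reproduced in this copy. As an illustrative sketch only, following the usual generalized multigranulation convention, an object belongs to the generalized lower approximation when the fraction of granularities whose neighborhood-conditional probability of the target reaches the first threshold is at least the information level; all names below are ours:

```python
def generalized_lower(neigh_classes, U, X, alpha, beta):
    """Sketch of a generalized multigranulation lower approximation.
    neigh_classes: one dict per granularity, mapping each object to its
    neighborhood (a set). x is kept when the fraction of granularities
    with conditional probability |n(x) & X| / |n(x)| >= alpha is >= beta."""
    X = set(X)
    m = len(neigh_classes)
    lower = set()
    for x in U:
        support = sum(1 for n in neigh_classes
                      if len(n[x] & X) / len(n[x]) >= alpha)
        if support / m >= beta:
            lower.add(x)
    return lower

# Two toy granularities over U = {1, 2, 3}.
n1 = {1: {1}, 2: {2, 3}, 3: {2, 3}}
n2 = {1: {1, 2}, 2: {1, 2}, 3: {3}}
low = generalized_lower([n1, n2], {1, 2, 3}, {1, 2}, alpha=1.0, beta=0.5)
```

With `alpha=1.0` and `beta` at the two extremes `1/m` and `1`, this support-counting form recovers the optimistic and pessimistic lower approximations, respectively, mirroring Theorem 10.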

Proposition 8. Given an and granularity sets , , there exists the following: (1)(2)(3)

Proof. (1)From Definition 7, ; hence, (2)(3)

Proposition 9. Given an and granularity set , , then
(L1)
(U1)
(L2)
(U2)
(L3)
(U3)
(LU1)
(LU2)

Proof. (L1) For , if , then we can obtain and . Therefore, by combining Definitions 6 and 7, it follows that and , namely,
(U1) This item can be proved similarly to item (L1) in this proposition
(L2) For or , we can obtain or from Definitions 6 and 7. Hence,
(U2) (U2) can be proved similarly to (L2)
(L3) , for as ; if , we can obtain . Thus, we can say that when , there exists
(U3) (U3) can be obtained similarly to (L3)
(LU1) , . holds for every granularity . So, . Because , is established
(LU2) , . holds for every granularity . So, . Because , is established

Theorem 10. (1) When , the generalized multigranulation rough sets degenerate into optimistic multigranulation rough sets; (2) when , the generalized multigranulation rough sets degenerate into pessimistic multigranulation rough sets.

Proof.
(1) When , there are Due to and , it can only take an integer between , so that equals and equals .
(2) When , the result follows in the same way.
(3) This can be easily obtained.

In the following, some properties of generalized multigranulation rough sets are studied when they degenerate into pessimistic multigranulation rough sets and optimistic multigranulation rough sets.

Proposition 11. If , for any target set ,

Proof. For any : , from Definition 4, we can obtain , from which we find , namely, . Similarly, from Definition 4 we can obtain and , and then is derived. By the same reasoning, one can easily prove and .

Proposition 12. Given an intuitionistic fuzzy weighted neighborhood information system, for and a target set , the following properties hold.
(1)
(2) When and , the proposed models degenerate to the classical intuitionistic fuzzy multigranulation weighted neighborhood rough set; in addition, if , model I and model II degenerate to the classical intuitionistic fuzzy multigranulation weighted rough set.

Proof. (1) From Definition 4, it is easy to see that under each granularity , the upper and lower approximations satisfy . Then, , that is to say, the result holds. (2) This can be obtained from Definition 4.

To make it easier for readers to follow, an example is given of solving the upper and lower approximations for generalized multigranulation rough sets when (model II).

Example 13. Table 1 gives an information table, the COVID-19 Surveillance Data Set, which is downloaded from UCI (the original data has 13 samples and 7 attributes). To facilitate calculation, the experiment selects only 6 samples and the first three attributes, and each granularity considers only one attribute.

Assume that the membership degree of an element under an attribute corresponds to the attribute value under that attribute, and the nonmembership degree is computed by . After processing, we obtain the processed data in Table 2. Let , and the distance matrix at each granularity is shown as

Then, from the distance matrix, the neighborhood classes under each granularity can be obtained: and

Let ; based on model II, considering granularity , the lower approximation is Considering granularity , the lower approximation is Considering granularity , the lower approximation is Considering all granularities, the lower approximation of model II is

4. Optimal Granularity Selection Based on Granularity Significance

This section aims to select the optimal granularity from the intuitionistic fuzzy weighted neighborhood information systems.

Definition 14. Let be an , , is a covering on , and is a decision attribute. Under the decision attribute , a partition is derived. With the relation of , the dependency degree of is defined as follows:

In this formula, means the cardinality of . can describe the ability of subset to approximate . It is obvious that .
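The dependency degree admits a direct computational reading: it is the fraction of objects whose neighborhood is contained in some decision class. The sketch below uses our own names under that reading:

```python
def dependency_degree(neigh, U, decision_partition):
    """gamma = |POS(D)| / |U|: the fraction of objects whose neighborhood
    lies inside a single decision class (sketch of Definition 14)."""
    pos = {x for x in U
           if any(neigh[x] <= set(block) for block in decision_partition)}
    return len(pos) / len(U)

# Object 2's neighborhood straddles both decision classes, so it is
# excluded from the positive region.
neigh = {1: {1}, 2: {1, 2, 3}, 3: {3, 4}, 4: {3, 4}}
decision = [{1, 2}, {3, 4}]
gamma = dependency_degree(neigh, {1, 2, 3, 4}, decision)
```

Since three of the four objects are consistently classified, the dependency degree here is 0.75; it always lies in $[0, 1]$.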

Definition 15. Let be an , granularity and granularity subset . The internal significance and external significance of granularity are defined as follows:

Lemma 16. If , then is important on granularity set . If , we say is unnecessary on granularity set .

When , granularity is unimportant. Deleting such granularities yields the optimal granularity selection. The optimal granularity selection results of Example 13 are given below according to this rule.

Example 17. From Example 13, we can calculate under each granularity: : , : , and : ; considering all granularities: . If we compute under and , we get , which is the same as with all granularities. There is no doubt that , which means is unimportant. Therefore, is the optimal granularity selection.

To explain our model, we provide the program flow chart and the algorithm for solving the optimal granularity selection below. Figure 1 is the program flow chart: it depicts the process of data processing and solving the optimal granularity. Algorithm 1 is the optimal granularity selection of the generalized multigranulation rough set model. The parameter is a threshold that determines the size of the intuitionistic fuzzy weighted neighborhood class; when , the intuitionistic fuzzy weighted neighborhood class degenerates into the intuitionistic fuzzy weighted class. The parameters , , and are three parameters used to define the generalized upper and lower approximations.

  Input: An intuitionistic fuzzy multigranularity table and four parameters , , and
  Output: Optimal granularity set
1  begin
2    ;
3    Compute the weight of each conditional attribute by formula (3);
4    foreach  do
5      Compute  for each granularity by formula (8);
6    end
7    foreach  do
8      ;
9      if  then
10        ;
11      end
12    end
     return: 
13  end

In the following analysis, assume that is the number of attributes, is the maximum number of attributes in a granularity, and is the number of granularities. In step 1, the time complexity is ; in step 2, computing the weight of each granularity by formula (3) has complexity ; in steps 3-5, the complexity of computing the intuitionistic fuzzy weighted neighborhood relation for each granularity is , because computing each element's neighborhood class requires traversing all other elements. Steps 6-11 calculate the significance of each granularity in and then update the set . When computing the significance for each granularity, we need to perform the calculation times, and each time we must compute the weighted positive region WPOS(), which requires the lower approximation for each granularity. When computing the lower approximation for one granularity, we need to traverse each element to determine whether it lies in the lower approximation, with complexity . Each decision class needs its lower approximation computed, so the complexity of computing WPOS(D) under one granularity is . Therefore, the complexity of steps 6-11 is . In summary, the complexity of this algorithm is .
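The elimination loop of steps 6-11 can be sketched as follows; the callable `gamma` stands in for the dependency degree of a granularity subset, and the zero-significance deletion rule follows Lemma 16 (names and the toy dependency function are ours):

```python
def select_granularities(granularities, gamma):
    """Backward elimination by granularity significance (sketch of steps
    6-11 of Algorithm 1). `gamma` maps a granularity subset to its
    dependency degree; a granularity whose removal leaves the dependency
    degree unchanged (zero significance) is unnecessary and is dropped."""
    kept = list(granularities)
    for g in list(granularities):
        rest = [h for h in kept if h != g]
        if rest and abs(gamma(kept) - gamma(rest)) < 1e-12:
            kept = rest                      # g is unnecessary: drop it
    return kept

# Toy dependency function: only granularity "g1" carries information.
gamma = lambda subset: 1.0 if "g1" in subset else 0.5
kept = select_granularities(["g1", "g2", "g3"], gamma)
```

In this toy run, `g2` and `g3` have zero significance and are eliminated, leaving `["g1"]` as the selected granularity set, in the spirit of Example 17.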

When evaluating the model, not only the classification accuracy but also the number of selected granularities should be considered. Consequently, the classification accuracy and number of selected granularities (CAN) measure is defined below.

Definition 18. Assume that the total number of granularities for a dataset is . If the machine learning model is trained with the granularities selected by one model, the classification accuracy is , and the number of selected granularities is , then the classification accuracy and number of selected granularities measure CAN is

CAN describes the quality of a granularity selection model. and are parameters describing the importance of and and satisfy . If , more attention is paid to accuracy in the granularity selection results. Obviously, when and the parameters are fixed, the smaller is, the larger CAN is, which means the model is superior; when and the parameters are fixed, the larger is, the larger CAN is. In practice, classification accuracy is often more important: if many granularities are removed but the classification accuracy is low, such a reduction has little practical significance. Therefore, when defining CAN, the weight of accuracy is set larger than that of the removed granularities, so in Section 5 we set and .
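The paper's exact CAN formula and weight values are elided in this copy; the sketch below assumes a weighted-sum form that matches the qualitative behaviour described above (CAN grows with accuracy, shrinks as more granularities are kept, and weights accuracy more heavily), with the default weights being our own placeholders:

```python
def can_score(acc, s, total, w_acc=0.7, w_red=0.3):
    """Hypothetical CAN measure: weighted sum of classification accuracy
    and the fraction of granularities removed. The weights must sum to 1,
    with accuracy weighted more heavily, as stated in the text."""
    assert abs(w_acc + w_red - 1.0) < 1e-12
    return w_acc * acc + w_red * (1.0 - s / total)

score = can_score(0.9, 2, 10)   # 0.7*0.9 + 0.3*0.8 = 0.87
```

Under this form, selecting fewer granularities at the same accuracy strictly increases the score, as the definition requires.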

5. Numerical Analysis

In this section, we design a numerical analysis experiment to verify the effectiveness of the defined models. We use four models to select data for training a machine learning model: gradient boosting regression trees (abbreviated GBRT). The final classification accuracy is the average over 10 runs. Moreover, we compare both the classification accuracy and the number of selected granularities to evaluate the methods. All code is executed in Anaconda 3 on an Intel(R) Core(TM) i5-9300H CPU @ 2.40 GHz with 8.00 GB RAM. The Movement Libras dataset is from the KEEL dataset repository, and the other 7 datasets are from UCI. Table 3 shows the basic information of the data and the granularity division. We preprocessed the data as follows: for the Shill Bidding Dataset, we dropped one attribute with text-type values, and each dataset was normalized. However, each attribute in these datasets has only a single value; to simulate an intuitionistic fuzzy information system and obtain membership and nonmembership degrees, we assume that the membership degree of an element under an attribute corresponds to the (normalized) attribute value, and the nonmembership degree is computed from it. The number of granularities is obtained by dividing the number of attributes by an integer and taking the ceiling. When the attributes do not divide evenly, all granularities contain the same number of attributes except the last one. Incidentally, we found that the weight matrices of the micromass dataset are singular and therefore used the alternative formulas to calculate the weights instead.

This experiment uses the machine learning model gradient boosting regression trees as the classifier. GBRT is an ensemble learning method that trains multiple weak classifiers and determines the final classification result by voting [41]. The sequential model construction proceeds by functional gradient descent; that is, a new tree is added at each step to minimize the loss function [42, 43]. To simplify the selection of experimental parameters, we set the parameter to . Four models are used to select granularities. Model 1 is the proposed generalized multigranulation rough set model, with the parameters and randomly generated in [0.5, 1] and [0, 0.5], respectively. Model 2 is the predecessor of model 1, in which attributes are not weighted, with the same parameters as model 1. Model 3 is model I with . Model 4 is the classical pessimistic multigranulation rough set model under IFWN, in which attributes are not weighted and the parameter is 1. The selected granularities are then used to train the GBRT model, which is trained 10 times; each training set accounts for 85% of the total data, and the training and test sets are randomly divided. The final classification accuracy is the average of the 10 accuracies. If the accuracy variance exceeds 0.0001, the accuracy result is reported together with its variability; otherwise, only the average is reported.
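The evaluation protocol above (GBRT classifier, 10 random 85%/15% splits, averaged accuracy) can be sketched with scikit-learn; the Iris dataset is a stand-in here, since the paper's UCI/KEEL IoT datasets are not bundled with this sketch:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def repeated_accuracy(X, y, runs=10, train_size=0.85, seed=0):
    """Average test accuracy over `runs` random 85%/15% splits,
    mirroring the evaluation protocol of Section 5."""
    rng = np.random.RandomState(seed)
    accs = []
    for _ in range(runs):
        Xtr, Xte, ytr, yte = train_test_split(
            X, y, train_size=train_size, random_state=rng.randint(10**6))
        clf = GradientBoostingClassifier().fit(Xtr, ytr)
        accs.append(clf.score(Xte, yte))
    return float(np.mean(accs)), float(np.std(accs))

# Iris is a stand-in dataset; in the paper, X would contain only the
# features belonging to the granularities selected by each model.
X, y = load_iris(return_X_y=True)
mean_acc, std_acc = repeated_accuracy(X, y)
```

Reporting the mean together with the spread over the 10 runs corresponds to the paper's rule of showing variability when the accuracy variance exceeds 0.0001.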

Tables 4 and 5 give the accuracy and the number of selected granularities for the 8 datasets under the different models. From Tables 4 and 5, the CAN value in each case can be calculated, yielding Table 6. Figure 2 shows, for each dataset under the different models, the accuracy, the number of selected granularities, and the CAN value. It can be seen from Table 6 and Figure 2 that the granularity selection results of the novel generalized multigranulation rough set model (model 1) are significantly better than those of the other three models, since the optimal granularity selection results on 7 of the datasets come from the novel model, which shows that the models proposed in this paper are effective and feasible.

6. Conclusions

Optimal granularity selection is an effective strategy for data dimensionality reduction: in an information system, it uses certain information or variables to select useful granularities. In this paper, we propose a weighted neighborhood relation in an intuitionistic fuzzy information system and establish the generalized multigranulation rough set model under the intuitionistic fuzzy weighted neighborhood information system. In addition, based on the novel model and combined with granularity significance, an algorithm for optimal granularity selection is studied. Finally, to compare the efficiency of the new model with classical models, a series of experiments are carried out on 8 datasets to verify its effectiveness. Experimental results show that the algorithm based on the generalized multigranulation rough set yields better granularity selection results, and the parameters of the model can be adjusted according to the actual data, which indicates that the new model is robust to a certain extent. However, the behaviour of the model under different parameter values has not been studied in depth in this paper. Facing rich data and various problems, our future work will study whether there is a relationship between the parameters of the new model and the characteristics of the dataset, so as to facilitate the determination of model parameters and improve the scalability of the model.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

There is no conflict of interest.

Acknowledgments

This work is supported by the Science and Technology Research Program of Chongqing Education Commission (Nos. KJQN202100205, KJQN202100206) and the China Postdoctoral Science Foundation (2021M700432).