Mathematical Problems in Engineering


Research Article | Open Access

Volume 2021 | Article ID 9977488 | https://doi.org/10.1155/2021/9977488

Jie Yang, Tian Luo, Fan Zhao, Shuai Li, Wei Zhou, "Fuzzy Knowledge Distance with Three-Layer Perspectives in Neighborhood System", Mathematical Problems in Engineering, vol. 2021, Article ID 9977488, 15 pages, 2021. https://doi.org/10.1155/2021/9977488

Fuzzy Knowledge Distance with Three-Layer Perspectives in Neighborhood System

Academic Editor: Anna M. Gil-Lafuente
Received 30 Mar 2021
Accepted 01 May 2021
Published 22 May 2021

Abstract

Information granule is the basic element in granular computing (GrC), and it can be obtained according to the granulation criterion. In neighborhood rough sets, current uncertainty measures focus on computing the knowledge granulation of single granular space and have two main limitations: (i) neglecting the structural information of boundary regions and (ii) the inability to reflect the difference between neighborhood granular spaces with the same uncertainty for approximating a target concept. Firstly, a fuzziness-based uncertainty measure for neighborhood rough sets is introduced to characterize the structural information of boundary regions. Moreover, from the perspective of distance, based on the idea of density peaks, we present a fuzzy-neighborhood-granule-distance- (FNGD-) based method to discover the relationship between granules in a granular space. Then, to characterize the difference between granular spaces for approximating a target concept, we present the fuzzy neighborhood granular space distance (FNGSD) and fuzzy neighborhood boundary region distance (FNBRD). FNGD, FNGSD, and FNBRD are hierarchically organized from fineness to coarseness according to the semantics of granularity, which provide three-layer perspectives in the neighborhood system.

1. Introduction

Granular computing (GrC) is an emerging computing paradigm of intelligent information processing [1–3]. GrC aims to establish a granular view for interpreting and solving problems. As the main GrC models, fuzzy sets [4], rough sets [5], quotient space [6], and the cloud model [7] realize the representation and transformation of uncertain knowledge from different views. The information granule is the basic element in GrC for the construction of multigranularity spaces. All the information granules and the relationships between them construct a granular space. Different granulation criteria correspond to multiple granular spaces. All the granular spaces and the relationships among them construct a granular topological space, which is also called a granule structure. Yao [8] introduced a basic progressive computing algorithm to explore a sequence of information granulation from fineness to coarseness. Pedrycz [1, 2] proposed the principle of justifiable granularity, which combines the construction and optimization of information granularity. Wang [9] reviewed the previous works of GrC from granularity optimization to multigranularity joint problem solving and proposed a diagram of the relationship among the three basic modes of GrC.

Rough set is an effective tool to handle uncertain information by using the given information granulation. Generally speaking, the target concepts in classical rough sets are usually accurate and crisp. However, the target concept may be fuzzy [10] in many real applications. For an uncertain concept, it can be described by a pair of lower and upper approximation sets. In 1988, Lin [11] first introduced the concept of neighborhood system (NS). Based on the notion of NS, Yao [12] proposed the definition of neighborhood rough sets and introduced a distance function to compute a neighborhood. Recently, neighborhood rough sets have received much attention. To handle the information system with heterogeneous data, Hu [13, 14] presented a uniform framework to construct neighborhood-based classifiers. Li [15] proposed a neighborhood-based decision-theoretic rough set model. From the perspective of knowledge engineering, Yang [16] presented a general framework for neighborhood rough sets to deal with incomplete information system. Furthermore, to overcome the limited labeled property of big data, Qian [17] first proposed the local neighborhood rough sets. In addition, neighborhood rough set has been applied in various fields, including attribute reduction [18, 19], image classification [20], EEG processing [21], optical signal processing [22], medical application [23], and other fields [24].

In rough set theory, uncertainty measure [22] plays an important role in knowledge acquisition from information systems. Yao [25] proposed an expected granularity class to build a unified framework for the information granularity. Liang and Qian [26] presented an axiomatic definition of knowledge granulation and modified the three existing measures (accuracy, roughness, and approximation accuracy). Dai [27] introduced a new form of conditional entropy to measure the uncertainty of incomplete systems. To measure the uncertainty in probabilistic rough sets and their three regions, Zhang [28] presented and revealed the change rules of uncertainty of probabilistic rough sets. Wang [29] constructed a class of uncertainty measures for feature selection, which often chooses fewer features and improves the classification accuracy in most cases. Moreover, Wang extended several uncertainty measures [30–32] to attribute reduction in fuzzy rough sets. In general, these uncertainty measures fall mainly into two types: algebraic and information-theoretic. However, they are based on discrete decision systems and are not applicable to the neighborhood system.

Currently, there are limited works on uncertainty measures for neighborhood systems. Chen et al. [33, 34] proposed several uncertainty measures, including neighborhood accuracy, information quantity, neighborhood entropy, and entropy-based roughness, to handle the uncertainty of a neighborhood information system. Tang [35] utilized neighborhood approximation accuracy to measure the uncertainty. The above methods only focus on the knowledge granularity or boundary region sizes, while the structural information of boundary regions is ignored. Moreover, these methods can hardly characterize the difference among neighborhood granular spaces with the same uncertainty. To solve this problem, Qian [36, 37] first proposed the concept of knowledge distance to reflect the difference between two granular spaces. For neighborhood systems, Chen [38] proposed several distance measures of neighborhood granules and granule swarms and discussed some properties of these measures. However, these works only focus on the granularity difference among granular spaces without considering the approximation of a target concept in a neighborhood system. If two neighborhood granular spaces possess the same uncertainty when approximating a target concept, it does not mean they are equivalent, yet their difference cannot be distinguished.

Research on the uncertainty in multigranularity spaces becomes a basic issue of uncertainty measure. If the uncertainty measure is not accurate enough, two different rough approximation spaces of a target concept may have the same uncertainty, and the difference between them for describing a target concept cannot be reflected. In this case, attribute reduction, granularity selection, and multigranularity measure cannot be conducted effectively. Therefore, establishing an uncertainty measure model with strong distinguishing ability in multigranularity spaces is a key issue in uncertainty knowledge processing. More specifically, in neighborhood rough set, current uncertainty measures focus on computing the knowledge granulation of single granular space. Two main shortcomings remain in current uncertainty measures for neighborhood system: (1) The structural information of boundary regions in granular spaces cannot be reflected. (2) Current uncertainty measures fail to reflect the difference between granular spaces with the same uncertainty for approximating a target concept in neighborhood system.

In this paper, based on our previous works [39, 40], we address these issues and propose effective uncertainty measures for neighborhood systems from the perspective of distance. As shown in Figure 1, to characterize the difference between granular spaces for approximating a target concept, the fuzzy neighborhood granule distance (FNGD), fuzzy neighborhood granular space distance (FNGSD), and fuzzy neighborhood boundary region distance (FNBRD) are proposed. They are hierarchically organized from fineness to coarseness according to the semantics of granularity and provide three-layer perspectives in the neighborhood system. Tri-level thinking [41] provides a general tri-level framework. Similar to tri-level thinking, our three-layer approach provides three-layer perspectives to measure uncertainty at different granularities and specifically focused levels: a top level, a middle level, and a bottom level; that is, FNGD, FNGSD, and FNBRD are hierarchically established from fine to coarse according to the semantics of granularity. Different from tri-level thinking, each level here focuses on a particular aspect by using a different distance measure from the perspective of granular computing; by integration, a synthesis of the investigations at the three levels may provide a full understanding of the knowledge distance measure. Herein, FNGD is utilized to discover the relation among neighborhood granules in a granular space at the bottom layer. Furthermore, FNGSD and FNBRD characterize the difference between granular spaces at the level of granules and regions, respectively. However, in some applications, such as granularity optimization and attribute reduction, FNGSD is too fine-grained for comparing granular spaces in hierarchical multigranulation spaces; moreover, from the perspective of complexity [42], the complexity of FNGSD is higher than that of FNBRD, because FNGSD involves more blocks than FNBRD. FNBRD is proposed to characterize the difference in the structural information of the boundary region between two granular spaces. Compared with FNGSD, FNBRD is suitable for granularity selection or attribute reduction because it considers only the uncertainty information of the boundary region, and it reduces the complexity of problem-solving by ignoring the granularity partition of the boundary region.

The remainder of this paper is organized as follows. Related preliminary concepts are introduced in Section 2. In Section 3, the fuzziness for neighborhood rough set is presented. Section 4 introduces the fuzziness-based neighborhood granule distance. In Section 5, knowledge distance with three-layer perspectives is presented. Finally, conclusions are drawn in Section 6.

2. Preliminaries

To facilitate the description of this paper, several basic concepts are reviewed briefly in this section. In this paper, we denote a decision system by $S = (U, C \cup D, V, f)$, where $U$ is a nonempty finite domain, $C$ is the set of condition attributes, $D$ is the decision attribute, $V$ is the set of all attribute values, and $f: U \times (C \cup D) \to V$ is an information function.

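Definition 1 (rough set [5]). Given a decision system $S = (U, C \cup D, V, f)$, $R \subseteq C$, and $X \subseteq U$, the lower and upper approximation sets of $X$ are defined as follows:

$$\underline{R}(X) = \{x \in U \mid [x]_R \subseteq X\}, \qquad \overline{R}(X) = \{x \in U \mid [x]_R \cap X \neq \emptyset\}.$$

Here, $[x]_R$ denotes the equivalence class induced by the equivalence relation $R$; namely, $[x]_R = \{y \in U \mid f(y, a) = f(x, a), \forall a \in R\}$.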

In this paper, a partition space is also called a knowledge space or granularity space. For simplicity, let in case of confusion. If , is a definable set; otherwise is a rough set. The universe is divided by positive region, boundary region, and negative region; then the three regions can be defined, respectively, as follows:
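In the standard Pawlak formulation, the positive region is the lower approximation, the boundary region is the upper approximation minus the lower approximation, and the negative region is the complement of the upper approximation. The following is a minimal sketch under that assumption; the function names and the toy universe, partition, and target set are hypothetical and do not come from the paper.

```python
def approximations(partition, target):
    """Lower and upper approximations of `target` under a partition of the universe."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block shares at least one object with the target
            upper |= block
    return lower, upper


def three_regions(universe, partition, target):
    """Positive, boundary, and negative regions derived from the two approximations."""
    lower, upper = approximations(partition, target)
    return {"POS": lower, "BND": upper - lower, "NEG": universe - upper}


# Hypothetical toy data (not taken from the paper).
U = set(range(1, 9))
partition = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]
X = {1, 2, 3, 5}

regions = three_regions(U, partition, X)
print(regions)
```

In this toy case, the blocks {3, 4} and {5, 6} each overlap the target without being contained in it, so both fall in the boundary region.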

To distinguish it from the traditional decision system, we denote a neighborhood decision system by $NS = (U, C \cup D, V, f, \delta)$, where $U$ is a nonempty finite domain, $C$ is the set of condition attributes, $D$ is the decision attribute, $V$ is the set of all attribute values, and $f$ is an information function; $\delta$ denotes the neighborhood radius.

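Definition 2 (neighborhood rough set [26]). Given a decision system $NS = (U, C \cup D, V, f, \delta)$, $R \subseteq C$, and $X \subseteq U$, the lower and upper approximation sets of $X$ are defined as follows:

$$\underline{R}_\delta(X) = \{x \in U \mid n_R^\delta(x) \subseteq X\}, \qquad \overline{R}_\delta(X) = \{x \in U \mid n_R^\delta(x) \cap X \neq \emptyset\},$$

where $n_R^\delta(x) = \{y \in U \mid \Delta_R(x, y) \le \delta\}$ and $\Delta_R$ is a distance function. The three regions can be defined, respectively, as follows:

$$POS_R(X) = \underline{R}_\delta(X), \qquad BND_R(X) = \overline{R}_\delta(X) - \underline{R}_\delta(X), \qquad NEG_R(X) = U - \overline{R}_\delta(X).$$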

Suppose that and are two neighborhood granular spaces. If , then is finer than , denoted by . If , then is strictly finer than , denoted by .
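To make Definition 2 concrete, the following sketch builds neighborhood granules with a radius and derives the lower and upper approximations from them. It assumes the common convention that the neighborhood of a sample contains every sample within distance `delta` under the Euclidean metric; the function names and the one-dimensional toy data are hypothetical.

```python
import math


def neighborhood(data, i, delta):
    """Indices of all samples whose Euclidean distance to sample i is at most delta."""
    return {j for j, y in enumerate(data) if math.dist(data[i], y) <= delta}


def neighborhood_approximations(data, target, delta):
    """Lower/upper approximations of `target` (a set of indices) from neighborhoods."""
    lower, upper = set(), set()
    for i in range(len(data)):
        granule = neighborhood(data, i, delta)
        if granule <= target:   # the whole neighborhood lies inside the target
            lower.add(i)
        if granule & target:    # the neighborhood touches the target
            upper.add(i)
    return lower, upper


# Hypothetical one-dimensional samples (not taken from the paper).
data = [(0.0,), (0.1,), (0.2,), (0.6,), (0.7,), (1.0,)]
X = {0, 1, 2, 3}

lower, upper = neighborhood_approximations(data, X, delta=0.15)
boundary = upper - lower
print(lower, upper, boundary)
```

Unlike equivalence classes, neighborhoods may overlap: here samples 0 and 2 are both neighbors of sample 1 but not of each other, so the granules form a covering rather than a partition.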

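Definition 3 (see [33]). Given a decision system $NS$, $R \subseteq C$, and $X \subseteq U$, the roughness related to $X$ is defined as follows:

$$\rho_R(X) = 1 - \frac{|\underline{R}_\delta(X)|}{|\overline{R}_\delta(X)|},$$

where $\underline{R}_\delta(X)$ and $\overline{R}_\delta(X)$ are the lower and upper approximation sets of $X$ and $|\cdot|$ denotes the cardinality.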

Definition 4 (see [33]). Given a decision system , , and , the entropy-based roughness related to is defined as follows:Here, is called neighborhood entropy.

Example 1. Let and be two granular spaces, and let be a target concept.
Obviously, we have .

Example 1 shows that, on the one hand, although the roughness considers the size of the boundary region, it ignores the structural information of the boundary region. On the other hand, entropy-based roughness takes the knowledge granulation in the three regions (positive region, boundary region, and negative region) into account, but it also neglects the structural information of the boundary region.
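The point about roughness can be illustrated with a small sketch. It assumes the usual roughness form, one minus the ratio of the lower approximation size to the upper approximation size, and, as a simplification, uses two hypothetical partitions instead of neighborhood granular spaces. The data are not those of Example 1.

```python
def approximations(partition, target):
    """Lower and upper approximations of `target` under a partition."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


def roughness(partition, target):
    """Assumed roughness: 1 - |lower| / |upper| (0.0 by convention if upper is empty)."""
    lower, upper = approximations(partition, target)
    return 1 - len(lower) / len(upper) if upper else 0.0


def boundary_blocks(partition, target):
    """Blocks that overlap the target without being contained in it."""
    return [b for b in partition if (b & target) and not (b <= target)]


# Two hypothetical granular spaces over the same universe {1, ..., 8}.
X = {1, 2, 3, 5}
P1 = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]
P2 = [{1, 2}, {3, 4, 5, 6}, {7, 8}]

r1, r2 = roughness(P1, X), roughness(P2, X)
print(r1, r2)                                            # identical roughness values
print(boundary_blocks(P1, X), boundary_blocks(P2, X))    # different boundary structure
```

Both spaces yield the same roughness because the lower and upper approximations coincide, yet the boundary region is split into two blocks in one space and forms a single block in the other.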

3. Fuzziness-Based Uncertainty Measure for Neighborhood Systems

In order to reflect the structural information of the boundary region, we propose a fuzziness-based uncertainty measure for neighborhood rough sets in this section. Moreover, in many real decision-making applications, the states of the target concept are usually fuzzy or uncertain. For generality, we study the case where the target concept is a fuzzy set.

Definition 5. Given a decision system , , and , is a neighborhood granular space induced by . The fuzziness-based uncertainty measure for neighborhood systems is defined as follows:where and denotes the membership degree of the object belonging to target set .
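The exact formula of Definition 5 is not recoverable from the text above, so the following sketch only illustrates one commonly used fuzziness form and should be read as an assumption: each object receives the average membership of its neighborhood granule, and the measure is four times the mean of m(1 - m) over the universe. The function name, the neighborhoods, and the membership values are hypothetical.

```python
def fuzziness(neighborhoods, mu):
    """Assumed fuzziness: (4 / |U|) * sum over x of m(x) * (1 - m(x)),
    where m(x) is the average membership over the neighborhood of x."""
    total = 0.0
    for granule in neighborhoods.values():
        avg = sum(mu[y] for y in granule) / len(granule)
        total += avg * (1 - avg)
    return 4 * total / len(neighborhoods)


# Hypothetical neighborhoods (object -> granule) and memberships of the target concept.
neighborhoods = {0: {0, 1}, 1: {0, 1, 2}, 2: {1, 2}, 3: {3, 4}, 4: {3, 4}, 5: {5}}
mu = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 0.0, 5: 0.0}

value = fuzziness(neighborhoods, mu)
print(value)
```

Under this assumed form, granules that lie entirely inside or entirely outside the target contribute nothing, so only the boundary region and its internal membership structure affect the result.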

Obviously, the fuzziness-based uncertainty measure takes the structural information of boundary region into account, which is more reasonable and suitable for attribute evaluation. Moreover, the following theorem holds.

Theorem 1. Given a decision system , , and , and are two neighborhood granular spaces induced by and . Then,(1), if and, , or (2) if,

Proof. (1) and, , or ; from Definition 5, obviously, . Therefore, .
(2) The proof is straightforward by Definition 5.

4. Fuzzy Neighborhood Granule Distance

A neighborhood granule is a set of objects, and how to measure the distance between two neighborhood granules is a challenging problem. As is well known, distance measure is an effective tool to handle uncertainty. For example, Qian [43] first proposed the binary granule-based knowledge distance (GBKD) as a mathematical tool. Yang [39] proposed a partition-based knowledge distance based on the Earth Mover’s Distance (EMD) and generalized knowledge distance into rough approximation spaces [40]. Xiao [44] proposed an evidential fuzzy multicriteria decision-making (MCDM) method to process the uncertainty in MCDM, which can effectively decrease the uncertainty caused by human subjectivity. Furthermore, Xiao [45] proposed the complex evidential distance (CED), a more generalized evidential distance, to measure the difference or dissimilarity between complex basic belief assignments (CBBAs). To measure the difference between intuitionistic fuzzy sets (IFSs), Xiao [46] developed a distance measure between IFSs based on the Jensen-Shannon divergence, which is able to generate more reasonable results than other methods. However, there are currently few works on measuring the difference between granular spaces for approximating a target concept. To solve this issue, a fuzzy neighborhood granule distance is proposed in this paper. In addition, research on distance measures of neighborhood granules helps to apply granular computing to machine learning tasks, such as clustering and classification, to reduce the computational complexity.

To measure the difference between two neighborhood granules, Chen [38] proposed the concept of neighborhood granule distance (NGD).

Definition 6 (see [38]). Let U be a nonempty universe. , and are two neighborhood granules, and the granule distance between and is defined aswhere .
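The formula of Definition 6 is missing from the text above. A frequently used set-based granule distance is the size of the symmetric difference normalized by the size of the universe, and the sketch below assumes that form purely for illustration; it may differ from the definition in [38]. The granules are hypothetical.

```python
from fractions import Fraction
from itertools import product


def granule_distance(a, b, universe_size):
    """Assumed granule distance: |a symmetric-difference b| / |U|, kept exact."""
    return Fraction(len(a ^ b), universe_size)


# Hypothetical granules over a universe of six objects.
U_SIZE = 6
granules = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 2, 3}]

d = granule_distance(granules[0], granules[1], U_SIZE)
print(d)  # the symmetric difference is {1, 4}

# Brute-force check of the triangle inequality on the toy granules.
triangle_ok = all(
    granule_distance(a, c, U_SIZE)
    <= granule_distance(a, b, U_SIZE) + granule_distance(b, c, U_SIZE)
    for a, b, c in product(granules, repeat=3)
)
print(triangle_ok)
```

With this assumed form, identical granules are at distance 0 and disjoint granules that together cover the universe are at distance 1.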

Example 2. Let ; there are two granules and ; then

Based on the granule distance, Chen [38] proposed a KNN classifier. Different from the traditional KNN classifier, the granule-distance-based KNN classifier first performs neighborhood granulation and then adopts the granule distance to measure the similarity between neighborhood granules. This provides a new classification method from the perspective of granular computing. As is well known, the idea of the density peak clustering (DPC) algorithm [47] is simple and novel. Two parameters need to be calculated for each sample: local density and relative distance. The DPC algorithm is based on two characteristics of cluster centers: (1) The local density of a cluster center is larger; that is, the density of its neighbors does not exceed its own. (2) The distance between a cluster center and other data points with higher density is relatively large. The DPC algorithm uses density and distance to draw the decision graph, find the cluster centers, and assign the remaining points efficiently. In this paper, by combining the granule distance with the concept of density peaks, we propose a method to discover the relations among neighborhood granules in a granular space. Before the algorithm is given, two main parameters are defined as follows.

Definition 7. The local density of is defined as follows:where denotes the distances of the pairs of data points in and and is the cutoff distance.

Definition 8. The relative distance of is defined as follows:where the index set is defined as .

Obviously, when , we have .

The specific processes of the discovery of neighborhood granular relation based on density peaks (Algorithm 1) are as follows:

(1) Set a neighborhood parameter and granulate the dataset to form neighborhood granules , .
(2) Compute the distance matrix of neighborhood granules by using formula (4).
(3) Compute the local density and relative distance of .
(4) Establish the structure of neighborhood granules by linking each granule to its unique nearest granule with higher local density according to the results of steps 2 and 3.
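Steps 2 to 4 above can be sketched as follows, assuming the cutoff-kernel local density and the relative distance of the standard density peak clustering algorithm [47]. The distance matrix would come from the granule distance of step 2; here it is a hypothetical matrix built from positions on a line, and all function names are illustrative.

```python
def local_density(dist, cutoff):
    """Cutoff kernel: number of other granules closer than the cutoff distance."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff) for i in range(n)]


def relative_distance(dist, rho):
    """For each granule, distance to (and index of) the nearest granule of higher density.
    Granules with no denser granule get the maximum distance and no parent."""
    n = len(dist)
    delta, parent = [], []
    for i in range(n):
        higher = [j for j in range(n) if rho[j] > rho[i]]
        if higher:
            j = min(higher, key=lambda k: dist[i][k])
            delta.append(dist[i][j])
            parent.append(j)
        else:
            delta.append(max(dist[i]))
            parent.append(None)
    return delta, parent


# Hypothetical distance matrix: six granules placed at positions on a line.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]

rho = local_density(dist, cutoff=1.5)
delta, parent = relative_distance(dist, rho)
print(rho, delta, parent)
```

Granules with both a high local density and a large relative distance (indices 1 and 4 in this toy case) play the role of cluster centers, and the `parent` links give the structure described in step 4.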

The above processes are illustrated by the simple example in Figure 2. As shown in Figure 2, 15 points embedded in a two-dimensional space denote the neighborhood granules in a knowledge space. Obviously, the density maxima are at points 1 and 9, which are the cluster centers. Then, the structure of neighborhood granules is established by linking each of the other granules to its unique nearest granule with higher local density. According to Algorithm 1, we can discover the relation among neighborhood granules and find the key granules in a knowledge space. For example, points 1 and 9 are the key granules in Figure 2, because they are the centers of their respective clusters based on the granule distance.

In the view of rough set, in order to measure the difference between two neighborhood granules for characterizing the target concept whether it is fuzzy or crisp concept, we propose a fuzzy neighborhood granule distance (FNGD) based on the concept of neighborhood granule distance as follows.

Definition 9. Let U be a nonempty universe, let be a target concept on . and are two neighborhood granules; then the fuzzy granule distance between and is defined aswhere

Note that because , holds obviously.

Example 3. Let . and are two neighborhood granules, and is a target concept; namely, ; then
In Example 3, when , .

In this paper, and denote the finest approximation space and the coarsest approximation space, respectively. The following theorem holds.

Theorem 2. Let U be a nonempty universe; let be a target concept on . , is a neighborhood granule, and holds.

Proof. As we know, ; then

According to Theorem 2, it is obvious that can be characterized in the view of granule distance; that is, the larger is and the smaller is, the larger is, and vice versa.

Lemma 1. Let be a nonempty universe; then is a distance measure on .

Proof. Suppose that , and are three neighborhood granules. Obviously, it is positive and symmetric. According to [48], we have ; then . Then, . Therefore, is a distance measure on .

Theorem 3. Let be a nonempty universe, and let be a target concept on . and are two neighborhood granules; if , then .

Proof. Because , and . Obviously, we have .

From Theorem 3, we know that when computing the FNGD between two neighborhood granules in positive region, the FNGD degenerates to NGD.

Theorem 4. Let U be a nonempty universe, and let be a target concept on . and are two neighborhood granules; if , then .

Proof. Because , we have . Obviously, we have .

This is easy to understand, because the membership of information granules in negative region is 0. This implies that the information granules are useless for constructing lower/upper approximation of , which can be regarded as useless for describing the target concept. Therefore, their fuzzy neighborhood granule distance is equal to 0.

Theorem 5. Let U be a nonempty universe, and let be a target concept on . , and are three neighborhood granules. If , then the following theorem holds:(1)(2)(3)

Proof. Because and , we haveTherefore, . Similarly, we can draw .
Becausewe have .

Theorem 6. Let be a nonempty universe, and let be a target concept on . and are two neighborhood granules; holds.

From the above discussion, NGD can only measure the difference in grain size between neighborhood granules, while FNGD focuses on reflecting the difference between neighborhood granules for constructing the lower/upper approximation of a target concept. Therefore, by combining NGD and FNGD, we further propose a method, based on Algorithm 1, to discover the relation among neighborhood granules when describing a target concept in a granular space. The specific processes are as follows:

(1) Invoke Algorithm 1.
(2) Compute the distance matrix of neighborhood granules by using formula (11) in each cluster.
(3) Compute the local density and relative distance of .
(4) Establish the structure of neighborhood granules by linking each granule to its unique nearest granule with higher local density according to the results of steps 2 and 3.

The above processes are illustrated by the simple example in Figure 3. According to Algorithm 1, we can discover the relation among neighborhood granules and find the key granules in a granular space. The relation among fuzzy neighborhood granules established by FNGD differs from the relation established by NGD. In this section, FNGD is designed to describe the relation between the neighborhood granules for constructing the lower/upper approximation of a target concept in a granular space. In the next section, we will further present the neighborhood granular space distance based on FNGD.

5. Knowledge Distance with Three-Layer Perspectives

In terms of the finer granularity layer, we discussed the relevant properties of FNGD and discovered the relation among fuzzy neighborhood granules in a granular space based on FNGD in Section 4. In terms of the coarser granularity layers, the fuzzy neighborhood granular space distance (FNGSD) and fuzzy neighborhood boundary region distance (FNBRD) are proposed in this section. FNGD, FNGSD, and FNBRD are hierarchically organized from fine to coarse according to the semantics of granularity and provide three-layer perspectives in neighborhood systems.

5.1. Fuzzy Neighborhood Granular Space Distance

In literature [38], neighborhood granular space distance (NGSD) is proposed to measure the partition difference between two neighborhood granular spaces.

Definition 10. Given a decision system and are two neighborhood granular spaces. The NGSD between and is defined aswhere .

However, NGSD cannot reflect the difference between granular spaces for describing a target concept. As is well known, different neighborhood granular spaces could be induced for an information system by its different attribute subsets; thus, the multigranulation spaces of a target concept could be developed. If the uncertainty measure is not accurate enough, two different granular spaces when approximating a target concept may have the same result, and the difference between them cannot be reflected.

Example 4. Let be a target concept. and are two neighborhood granular spaces.
Obviously, we have , and . According to Definition 3, we have

From Example 4, and possess the same uncertainty (i.e., entropy-based roughness or fuzziness). However, this does not mean that these two neighborhood granular spaces are completely equivalent to each other, and the descriptions of and for a target concept cannot be distinguished by these measures. Therefore, establishing an uncertainty measure model with strong distinguishing ability in multigranulation spaces is a key issue in uncertainty knowledge processing. To solve this problem, we propose the fuzzy neighborhood granular space distance (FNGSD) based on FNGD.

Definition 11. Given a decision system , . and are two neighborhood granular spaces. The FNGSD between and is defined as
where .

Example 4 (continued). Because , namely, , then .

In Example 4, is a crisp set. When is a fuzzy concept, the FNGSD is still valid. For example, when is a target concept, we have

Therefore, FNGSD possesses a stronger distinguishing ability than traditional uncertainty measure in multigranulation neighborhood spaces.

As analyzed in Section 4, the information granules in the negative region are useless for constructing the lower/upper approximation of a target concept. Figure 4 shows the difference between NGSD and FNGSD. Of the two types of knowledge distance for measuring the difference between two neighborhood granular spaces, NGSD only compares the grain size of granules in each granular space, while FNGSD only compares the membership of granules with respect to describing the target concept in each neighborhood granular space.

Theorem 7. Given a decision system , . , , and are three neighborhood granular spaces. If , .

Proof. According to Theorem 5, Theorem 7 obviously holds.

In this paper, and denote the finest approximation space and the coarsest approximation space, respectively. According to Theorem 7, the following corollary holds.

Theorem 8. Let be a nonempty universe; is a distance measure on .

Proof. Suppose that , , and are three neighborhood granular spaces. Obviously, is positive and symmetric; namely, and . According to Lemma 1, we have ; hence, also satisfies the triangle inequality. Therefore, is a distance measure on U.

Corollary 1. Given a decision system and , and are two neighborhood granular spaces. If , then .

Corollary 2. Given a decision system and , and are two neighborhood granular spaces. If , then .

Theorem 9. Given a decision system and , is a neighborhood granular space. is a granularity measure.

Proof. (1)From Theorem 8, (2)When , obviously, (3)Suppose that is a neighborhood granular space; from Corollary 1, if , then holds

Theorem 10. Given a decision system and , is a neighborhood granular space. is an information measure.

Proof. (1)From Theorem 8, (2)When , obviously, (3)Suppose that is a neighborhood granular space; from Corollary 1, if , then holds

Theorem 11. Given a decision system , , and are three neighborhood granular spaces. If , then .

Proof. According to Theorem 7, Theorem 11 obviously holds.

From Theorem 11, we can find that the FNGSD among granular spaces which have the partial order relation is linearly additive.

Corollary 3. Given a decision system and , , , and are three neighborhood granular spaces. If , then .

Corollary 4. Given a decision system and , , , and are three neighborhood granular spaces. If , then .

According to Corollary 3 and Theorem 9, for a target concept, the FNGSD among granular spaces which have the partial order relation is equal to their granularity difference. According to Corollary 4 and Theorem 10, for a target concept, the FNGSD among granular spaces which have the partial order relation is equal to their information difference. Combined with FNGSD and NGSD, we design a formula to reflect the consistency of a knowledge space as follows.

Definition 12. Given a decision system and , is a neighborhood granular space. The consistency of can be defined as

Example 5. Given a decision system and , is a target concept on , and and are two neighborhood granular spaces.
According to formula (20), we have and . Therefore, .

It is worth noting that when neighborhood relation degenerates to equivalence relation in a knowledge base, and can be denoted as follows:where , , and .

The constructions of formulas (9) and (10) are similar to Earth Mover’s Distance [49, 50], where represents a distance between and and represents the flow from to . For details, refer to literature [49].

Example 6. Given a decision system , is a target concept on U, and and are two neighborhood granular spaces. Because and , according to formula (9), we have
Similarly, according to formula (24), we have

Example 6 can be further described intuitively by Figure 5, which explains the NGSD and FNGSD in the view of equivalence relation. Obviously, when and are two neighborhood granular spaces, and degenerate to many-to-many matching distance. This idea is similar to Earth Mover’s Distance [49, 50], which was first proposed for image retrieval based on a solution to the well-known transportation problem [51] and it is capable of calculating partial matches. For more details, researchers can refer to [49, 50, 52].

5.2. Fuzzy Neighborhood Boundary Region Distance

Although FNGSD provides a tool to measure the difference between two neighborhood granular spaces, it is too meticulous to be suitable for comparing granular spaces in hierarchical multigranulation spaces, such as granularity optimization and attribute reduction. In particular, for evaluating the significance of attributes, we only need to consider its uncertainty information of boundary region and ignore its granularity information.

Definition 13. Given a decision system and , is a target concept on . The attribute significance of can be defined as

Theorem 12. Given a decision system and , and are two neighborhood granular spaces. is a sufficient condition for and .

Example 7. Given a decision system