Abstract
Granular reduction deletes dispensable elements from a covering. It is an efficient way to reduce granular structures and to remove redundant information from information systems. In this paper, we develop an algorithm based on discernibility matrixes to compute all the granular reducts of covering rough sets. Moreover, the discernibility matrix is simplified to a minimal form. In addition, a heuristic algorithm is proposed so that a single granular reduct can be generated rapidly.
1. Introduction
With the development of technology, the volume of information is growing at a surprising rate. It is a great challenge to extract valuable knowledge from such massive information. Rough set theory was proposed by Pawlak [1, 2] to deal with uncertainty and vagueness, and it has been applied to information processing in various areas [3–8].
One of the most important topics in rough set theory is to design reduction algorithms. The reduction of Pawlak's rough sets is to reduce dispensable elements from a family of equivalence relations which induce the equivalence classes, or a partition.
Covering generalized rough sets [9–19] and binary relation generalized rough sets [20–26] are the two main extensions of Pawlak's rough sets. The reduction theory of covering rough sets [10, 11, 15, 23, 27, 28] plays an important role in practice. A partition is no longer a partition if any of its elements is deleted, while a covering may still be a covering with invariant set approximations after dropping some elements. Therefore, there are two types of reduction on covering rough sets. One is to remove redundant coverings from a family of coverings, referred to as attribute reduction. The other is to remove redundant elements from a covering, referred to as granular reduction; its goal is to find the minimal subsets of a covering that generate the same set approximations as the original covering. Because granular reduction is used to reduce granular structures and databases and also interacts with attribute reduction, we believe it should by no means be ignored. In this paper, we devote ourselves to investigating the granular reduction of covering rough sets.
To compute all attribute reducts for Pawlak's rough sets, the discernibility matrix was first presented in [29]. Tsang et al. [15] developed an algorithm based on discernibility matrices to compute attribute reducts for one type of covering rough sets. Zhu and Wang [17] and Zhu [18] first built one type of granular reduction for two covering rough set models. In addition, Yang et al. systematically examined granular reduction in [30] and the relationship between reducts and topology in [31]. Unfortunately, no effective algorithm for granular reduction has hitherto been proposed.
In this paper, we bridge this gap by constructing an algorithm based on discernibility matrixes that computes all granular reducts of covering rough sets. This algorithm can reduce granular structures and remove redundant information from information systems. The discernibility matrix is then simplified to a minimal form. Finally, based on this simplification, a heuristic algorithm is proposed as well.
The remainder of this paper proceeds as follows. Section 2 reviews the relevant background knowledge about the granular reduction. Section 3 constructs the algorithm based on discernibility matrix. Section 4 simplifies the discernibility matrix and proposes a heuristic algorithm. Section 5 concludes the study.
2. Background
Our aim in this section is to give a glimpse of rough set theory.
Let be a finite and nonempty set, and let be an equivalence relation on . generates a partition on , where is an equivalence class of generated by the equivalence relation . We call it elementary sets of in rough set theory. For any set , we describe by the elementary sets of , and the two sets are called the lower and upper approximations of , respectively. If is an -exact set. Otherwise, it is an -rough set.
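As a minimal sketch of the Pawlak approximations just described, assuming the standard definitions (the lower approximation is the union of the equivalence classes contained in the target set, and the upper approximation is the union of the classes that meet it), the following Python function computes both. The function name `pawlak_approximations` and the toy partition are illustrative and are not taken from the paper.

```python
def pawlak_approximations(partition, target):
    """Return (lower, upper) approximations of `target` w.r.t. `partition`."""
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical partition of {1, ..., 5} into three equivalence classes.
partition = [{1, 2}, {3}, {4, 5}]
lower, upper = pawlak_approximations(partition, {1, 2, 3, 4})
print(sorted(lower), sorted(upper))
```

In this toy case the target set is rough, because the two approximations differ: the class {4, 5} overlaps the target without being contained in it.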
Let be a family of equivalence relations, and let , denoted as . is dispensable in if and only if . Otherwise, is indispensable in . The family is independent if every is indispensable in . Otherwise, is dependent. is a reduct of if is independent and . The set of all indispensable relations in is called the core of , denoted as CORE(). Evidently, CORE() is the intersection of all members of RED(), where RED() is the family of all reducts of . The discernibility matrix method was proposed to compute all reducts of information systems and relative reducts of decision systems [29].
is called a covering of , where is a nonempty domain of discourse, and is a family of nonempty subsets of and .
It is clear that a partition of is certainly a covering of , so the concept of a covering is an extension of the concept of a partition.
Definition 2.1 (minimal description [9]). Let be a covering of , is called the minimal description of . When there is no confusion, we omit the from the subscript.
Definition 2.2 (neighborhood [9, 19]). Let be a covering of , and is called the neighborhood of . Generally, we omit the subscript when there is no confusion.
Minimal description and neighborhood are regarded as related information granules to describe , which are used as approximation elements in rough sets (as shown in Definition 2.3). It shows that . The neighborhood of can be seen as the minimum description of , and it is the most precise description (see [9] for more details).
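The two granules can be illustrated with a short Python sketch. It assumes the definitions commonly used in the covering rough set literature: the minimal description of an object is the family of covering elements that contain the object and are minimal under inclusion among such elements, and the neighborhood is the intersection of all covering elements containing the object. The function names and the toy covering are hypothetical.

```python
def minimal_description(covering, x):
    """Covering elements containing x that are minimal under inclusion."""
    containing = [K for K in covering if x in K]
    return [K for K in containing if not any(S < K for S in containing)]


def neighborhood(covering, x):
    """Intersection of all covering elements containing x."""
    containing = [K for K in covering if x in K]
    return frozenset.intersection(*containing)


# Hypothetical covering of the universe {1, 2, 3, 4}.
covering = [frozenset(s) for s in ({1, 2}, {2, 3}, {3, 4}, {1, 2, 3})]
for x in (1, 2, 3, 4):
    md = minimal_description(covering, x)
    print(x, [sorted(K) for K in md], sorted(neighborhood(covering, x)))
```

On a finite covering, intersecting the members of the minimal description gives the same set as the neighborhood, which is one way to read the remark that the neighborhood is the most precise description of an object.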
Definition 2.3 (covering lower and upper approximation operations [19]). Let be a covering of . The operations and are defined as follows: for all ,
We call the first, the second, the third, or the fourth covering lower approximation operations and the fifth, the sixth, or the seventh covering lower approximation operations, with respect to the covering .
The operations , , , , , , and are defined as follows: for all ,
, , , , , , and are called the first, the second, the third, the fourth, the fifth, the sixth, and the seventh covering upper approximation operations with respect to , respectively. We leave out at the subscript when there is no confusion.
As shown in [32], every approximation operation in Definition 2.3 may be applicable in certain circumstances, and the suitable operation is chosen according to the specific situation. It is therefore important to design granular reduction algorithms for all of these models.
More precise approximation spaces are proposed in [30]. As a further result, a reasonable granular reduction of coverings is also introduced. Let , . is the approximation space of the first and the third types of covering rough sets, is the approximation space of the second and the fourth types of covering rough sets, and is the approximation space of the fifth, the sixth, and the seventh types of covering rough sets (see [30] for the details). In this paper, we design the algorithm of granular reduction for the fifth, the sixth, and the seventh types of covering rough sets.
Let be a covering of , denoting a covering approximation space. denotes an -approximation space. represents an -approximation space. We omit at the subscript when there is no confusion (referred to [30] for the details).
3. Discernibility Matrixes Based on Covering Granular Reduction
In Pawlak's original rough sets, the family of equivalence classes induced by equivalence relations is a partition. Once any of its elements is deleted, a partition is no longer a partition. Granular reduction refers to reducing granular structures and removing redundant information from databases; therefore, it is not applicable to Pawlak's original rough sets. However, in covering rough sets, one of the most important extensions of Pawlak's rough sets, a covering still works after some of its elements are omitted, as long as the set approximations remain invariant. The purpose of covering granular reduction is to find minimal subsets that keep the same set approximations. It is therefore meaningful and necessary to develop an algorithm for covering granular reduction.
The quintuple () is called a covering rough set system (CRSS), where is a covering of , and are the lower and upper approximation operations with respect to the covering , and is the approximation space. According to the categories of covering approximation operations in [30], there are two kinds of situations as follows.
(1) If or , then ; thus, is the unique granular reduct of . There is no need to develop an algorithm to compute granular reducts for the first, the second, the third, and the fourth types of covering rough sets.
(2) If , generally, is not a subset of . Consequently, an algorithm is needed to compute all granular reducts of for the fifth, the sixth, and the seventh types of covering rough set models.
Next we examine the algorithm of granular reduction for the fifth, the sixth, and the seventh type of covering rough sets. Let be a covering of , since , and is the collection of all approximation elements of the fifth, the sixth, or the seventh type of lower/upper approximation operations. is called the -approximation space of . Given a pair of approximation operations, the set approximations of any are determined by the -approximation spaces. Thus, for the fifth, the sixth, and the seventh type of covering rough set models, the purpose of granular reduction is to find the minimal subsets of such that . The granular reducts based on the -approximation spaces are called the -reducts. is the set of all -reducts of , and is the set of all -irreducible elements of (referred to [30] for the details).
In Pawlak's rough set theory, for every pair of , if belongs to the equivalence class containing , we say that and are indiscernible. Otherwise, they are discernible. Let be a family of equivalence relations on , . is indispensable in if and only if there is a pair of such that the relation between and is altered after deleting from . The attribute reduction of Pawlak's rough sets is to find minimal subsets of which keep the relations invariant for any . Based on this observation, the discernibility matrix method for computing all reducts of Pawlak's rough sets was proposed in [29]. In covering rough sets, however, the discernibility relation between is different from that in Pawlak's rough sets.
Let be a covering on , . Then we call indiscernible if , that is, . Otherwise, is discernible. When is a partition, the new discernibility relation coincides with that in Pawlak's. It is an extension of Pawlak's discernibility relation. In Pawlak's rough sets, is indiscernible if and only if is indiscernible. However, for a general covering, if and , that is, and , is discernible while is indiscernible. Thereafter, we call these relations the relations of with respect to . The following theorem characterizes these relations.
Proposition 3.1. Let be a covering on , and let .(1) if and only if .(2) if and only if there is such that and .
Proof. for any , for any , .
It is evident from .
Theorem 3.2. Let be a covering on , . Then if and only if there is whose discernibility relation with respect to is changed after deleting from .
Proof. Suppose that , then there is at least one element such that , that is, . Since , suppose that , then and . Namely, is discernible with respect to , while is indiscernible with respect to .
Suppose that there is whose discernibility relation with respect to is changed after deleting from . Put differently, is discernible with respect to , while is indiscernible with respect to . Then we have and , so . Thus, . It implies .
The purpose of granular reducts of a covering is to find the minimal subsets of which keep the same classification ability as or, put differently, keep invariant. In Theorem 3.2, is kept unchanged to make the discernibility relations of any invariant. Based on this statement, we are able to compute granular reducts with discernibility matrix.
Definition 3.3. Let , be a covering on . is an matrix called a discernibility matrix of , where(1), ,(2), .
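A minimal Python sketch of how such a matrix might be built is given here. It rests on an assumption suggested by Proposition 3.1: the entry for an ordered pair of objects is empty when the second object lies in the neighborhood of the first, and otherwise collects the covering elements that contain the first object but not the second. The names `neighborhood`, `discernibility_matrix`, and the labelled toy covering are hypothetical.

```python
def neighborhood(covering, x):
    """Intersection of all covering elements (dict values) containing x."""
    return frozenset.intersection(*[K for K in covering.values() if x in K])


def discernibility_matrix(universe, covering):
    """Map each ordered pair (x, y) to a set of covering-element labels.

    Assumed rule: empty if y is in the neighborhood of x, otherwise the
    labels of the elements that contain x but not y.
    """
    matrix = {}
    for x in universe:
        nx = neighborhood(covering, x)
        for y in universe:
            if y in nx:
                matrix[x, y] = frozenset()
            else:
                matrix[x, y] = frozenset(
                    name for name, K in covering.items()
                    if x in K and y not in K
                )
    return matrix


# Hypothetical covering of {0, 1, 2, 3}; labels stand for covering elements.
universe = [0, 1, 2, 3]
covering = {
    "A": frozenset({0, 1}), "B": frozenset({0, 2}), "D": frozenset({0, 3}),
    "S1": frozenset({1}), "S2": frozenset({2}), "S3": frozenset({3}),
}
matrix = discernibility_matrix(universe, covering)
print(sorted(matrix[0, 1]), sorted(matrix[1, 0]), sorted(matrix[1, 2]))
```

The matrix is not symmetric in general: in the toy covering, the entry for the pair (0, 1) differs from the entry for (1, 0), which reflects the asymmetry of the discernibility relation for coverings discussed above.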
This definition of discernibility matrix is more concise than the one in [11, 15] due to the reasonable statement of the discernibility relations. Likewise, we restate the characterizations of -reduction.
Proposition 3.4. Consider that for some .
Proof. For any , , then there is such that and . It implies that and . Moreover, for any , since , we have if . Thus, .
If for some , then and . And for any , if , then , that is, and , then . Namely, , which implies .
Proposition 3.5. Suppose that , then if and only if for every .
Proof.
for any , if and only if ,
for any , there is such that and if and only if there is such that and ,
for any , .
Proposition 3.6. Suppose that , then if and only if is a minimal set satisfying for every .
Definition 3.7. Let , let be a covering of , and let be the discernibility matrix of . A discernibility function is a Boolean function of Boolean variables, , corresponding to the covering elements , respectively, defined as .
Theorem 3.8. Let be a family of covering on , let be the discernibility function, and let be the reduced disjunctive form of by applying the multiplication and absorption laws. If , where , and every element in only appears once, then .
Proof. For every , for any , so . Let for any , then . If for every , we have , then for every , that is, , which is a contradiction. It implies that there is such that . Thus, is a reduct of .
For any , we have for every , so , which implies . Suppose that, for every , we have , then for every , there is . By rewriting , . Thus, there is such that , that is, , which is a contradiction. So for some , since both and are reducts, and it is evident that . Consequently, .
Algorithm 3.9. Consider the following:
input: ,
output: and  // The set of all granular reducts and the set of all -irreducible elements.
Step 1: =, for each , let .
Step 2: for each , compute .
        .
Step 3: .
Step 4: compute to where ,
        , and every element in only appears once.
Step 5: output , .
Step 6: end.
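The reduction step of the algorithm can be sketched in Python as follows. The sketch assumes that the discernibility function is the conjunction, over all nonempty matrix entries, of the disjunction of the covering elements in each entry, so that its reduced disjunctive form lists exactly the minimal sets of covering elements that meet every nonempty entry. The function name `all_reducts` and the sample entries are hypothetical, and the running time is exponential in the worst case.

```python
def all_reducts(entries):
    """Minimal sets of labels that intersect every nonempty entry.

    Expands the conjunction of clauses one clause at a time
    (multiplication law) and discards any term that properly contains
    another term (absorption law).
    """
    clauses = {frozenset(c) for c in entries if c}
    terms = {frozenset()}
    for clause in clauses:
        expanded = {t | {k} for t in terms for k in clause}
        terms = {t for t in expanded if not any(s < t for s in expanded)}
    return terms


# Hypothetical nonempty matrix entries over labels A, B, D, S1.
entries = [{"B", "D"}, {"A", "D"}, {"A", "B"}, {"S1"}, {"A", "S1"}, set()]
reducts = all_reducts(entries)
core = frozenset.intersection(*reducts)
print(sorted(sorted(r) for r in reducts), sorted(core))
```

In this toy case there are three reducts, and the core consists of the single label that occurs as a singleton entry, in line with the characterization of irreducible elements by singleton entries.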
The following example is used to illustrate our idea.
Example 3.10. Suppose that , where denote six objects, and let denote seven properties; the information is presented in Table 1, that is, the th object possesses the th attribute is indicated by a in the -position of the table.
is the set of all objects possessing the attribute , and it is denoted by . Similarly, , , , , , and . Evidently, is a covering on .
Then, , , , , , and .
The discernibility matrix of is exhibited as follows:
So , . As a result, Table 1 can be simplified into Table 2 or Table 3, and the ability of classification is invariant. Obviously, the granular reduction algorithm can reduce data sets as shown.
4. The Simplification of Discernibility Matrixes
To find the set of all granular reducts, we have proposed a method based on the discernibility matrix. Unfortunately, the computation is at least NP-hard, since the discernibility matrix in this paper is more complex than the one in [33]. Accordingly, we simplify the discernibility matrixes in this section. In addition, a heuristic algorithm is presented to avoid this NP-hard computation.
Definition 4.1. Let be the discernibility matrix of . For any , if there is a nonempty element such that , let ; otherwise, . We then get a new discernibility matrix , which is called the simplification discernibility matrix of .
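The simplification can be sketched in Python under one assumption about the condition in Definition 4.1: an entry is replaced by the empty set exactly when it properly contains some other nonempty entry, and it is kept otherwise. The dictionary representation and the name `simplify` are hypothetical.

```python
def simplify(matrix):
    """Empty every entry that properly contains another nonempty entry."""
    nonempty = {c for c in matrix.values() if c}
    return {
        pair: (c if not any(d < c for d in nonempty) else frozenset())
        for pair, c in matrix.items()
    }


# Hypothetical fragment of a discernibility matrix, keyed by object pairs.
matrix = {
    (0, 1): frozenset({"B", "D"}),
    (1, 0): frozenset({"S1"}),
    (1, 2): frozenset({"A", "S1"}),  # properly contains the entry {S1}
    (1, 1): frozenset(),
}
simplified = simplify(matrix)
print({pair: sorted(c) for pair, c in simplified.items()})
```

Proper containment is used rather than plain containment so that two equal entries at different positions do not erase each other. Any set of covering elements that meets every remaining entry also meets the erased ones, which is the content of Theorem 4.2.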
Theorem 4.2. Let be the discernibility matrix of , and is the simplification discernibility matrix, . Then for any nonempty element if and only if for any nonempty element .
Proof. If for every and , it is evident that for every and .
Suppose that for every and . For any nonempty , if there is a nonempty element such that , and for any nonempty element , , then . Since , then ; thus, . If for any nonempty element , then . Since , then . Thus, for every nonempty .
Proposition 4.3. Suppose that , then if and only if is a minimal set satisfying for every and
Proposition 4.4. Consider that .
Proof. Suppose that , then there is such that and . For any , if , let . Otherwise, , where . Suppose that ; it is easy to prove that . Thus, .
Suppose that , then there is such that . From Proposition 4.3, we know that is a minimal set satisfying for every and . So there is a such that , or else is redundant in . Thus, .
In summary, .
Proposition 4.5. Let be the simplified discernibility matrix of , then is the minimal matrix to compute all granular reducts of , that is, for any matrix where , can compute all granular reducts of if and only if for .
Proof. If for , then , and can compute all granular reducts of .
Suppose that there is a nonempty such that . If , suppose that , then . From the definition of the simplification discernibility matrix, we know that for any , then for any . So cannot compute any granular reducts of . If , we suppose that . Then there is a , and let . For any , if , let . Otherwise, let where . Let and , and it is easy to prove that . However, , that is, cannot compute all granular reducts of . Thus, if can compute all granular reducts of , then for .
From the above propositions, we know that the simplified discernibility matrix is the minimal discernibility matrix which can compute the same reducts as the original one. Hereafter, we only examine simplified discernibility matrixes instead of general discernibility matrixes. The following example is used to illustrate our idea.
Example 4.6. The discernibility matrix of in Example 3.10 is as follows: So , .
From the above example, it is easy to see that simplified discernibility matrix can simplify the computing processes remarkably. Especially when is a consistent covering proposed in [30], that is, , the unique reduct .
Unfortunately, although the simplified discernibility matrixes are simpler, computing reducts by the discernibility function is still NP-hard. Accordingly, we develop a heuristic algorithm to obtain a reduct from a discernibility matrix directly.
Let be a discernibility matrix. We denote the number of the elements in by . For any , denotes the number of which contain . Let , if for any , , then . Since , if , then the elements in may either be deleted from or be preserved. Suppose that , if for any , is called the maximal element with respect to the simplified discernibility matrix . The heuristic algorithm to get a reduct from a discernibility matrix directly proceeds as follows.
Algorithm 4.7. Consider the following:
input: ,
output: a granular reduct red
Step 1: =, for each , let .
Step 2: for each , compute .
        .
Step 3: for each ,
        if there is a nonempty element such that
        , let // get the simplified discernibility matrix.
Step 4: for each , compute and select the maximal
        element of .
        For each ,
        if ,
        let .
Step 5: if there is such that ,
        return to Step 3;
        else
        output .
Step 6: end.
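A greedy variant of this idea can be sketched in Python. The sketch is not a transcription of Algorithm 4.7: it assumes that the maximal element is the covering element occurring in the largest number of not-yet-covered entries, it breaks ties alphabetically, and it adds a final redundancy check, which is an addition of this sketch, so that the returned set is guaranteed to be minimal. The names and sample entries are hypothetical.

```python
from collections import Counter


def greedy_reduct(entries):
    """Greedily pick labels until every nonempty entry is met, then prune."""
    clauses = {frozenset(c) for c in entries if c}
    # Keep only the minimal entries (the simplified matrix).
    clauses = {c for c in clauses if not any(d < c for d in clauses)}

    chosen, remaining = [], set(clauses)
    while remaining:
        counts = Counter(k for c in remaining for k in c)
        best = max(sorted(counts), key=counts.get)  # ties: alphabetical
        chosen.append(best)
        remaining = {c for c in remaining if best not in c}

    # Redundancy check (added in this sketch): drop any label whose
    # removal still leaves every entry met.
    for k in list(chosen):
        rest = [j for j in chosen if j != k]
        if all(any(j in c for j in rest) for c in clauses):
            chosen = rest
    return set(chosen)


entries = [{"A", "B"}, {"A", "C"}, {"B", "E"}, {"C", "F"}]
print(sorted(greedy_reduct(entries)))  # ['B', 'C']
```

The sample entries show why the pruning pass is included: the greedy phase first selects A, then B and C, after which A has become redundant. Without the check, the selection would meet every entry but would not be a minimal set.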
Example 4.8. The simplified discernibility matrix of in Example 3.10 is as follows:
For a maximal element of , let , then we get as follows:
Thus, is a granular reduct of .
For a maximal element of , let , then we get as follows:
Thus, is also a granular reduct of .
From the above example, we see that the heuristic algorithm sidesteps the NP-hard computation of the discernibility function and generates a granular reduct directly from the simplified discernibility matrix. With the heuristic algorithm, granular reduction based on discernibility matrixes is no longer limited to the theoretical level but becomes applicable in practice.
5. Conclusion
In this paper, we develop, for the first time, an algorithm based on discernibility matrixes to compute all the granular reducts of covering rough sets. A simplification of the discernibility matrix is also proposed for the first time. Moreover, a heuristic algorithm is presented that avoids the NP-hard computation of all reducts, so that a single granular reduct is generated rapidly.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants no. 11201490 and no. 11061004 and by the Science and Technology Plan Projects of Hunan Province under no. 2011FJ3152.