Abstract

This paper constructs uncertainty measures for ordered information systems using the pure rough set approach. Four types of definitions of lower and upper approximations are investigated, together with the corresponding uncertainty measures: accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree. Theoretical analysis indicates that all four types can be used to evaluate uncertainty in ordered information systems; in particular, the first type and the third type turn out to be essentially the same. To illustrate the approach, experiments on real-life data sets were conducted to test the four types of uncertainty measures. The results show that these measures can indeed measure the uncertainty in ordered information systems.

1. Introduction

Rough set theory, originated by Pawlak in the early 1980s [1, 2], is an extension of classical set theory and can be regarded as a soft computing tool for handling imprecision, vagueness, and uncertainty in data analysis. The theory has found successful applications in pattern recognition [3], medical diagnosis [4], data mining [5, 6], conflict analysis [7], algebra [8, 9], and other fields [10-12]. Recently, the theory has attracted growing interest among researchers.

Several extensions of the rough set model have been proposed to meet various requirements. For example, by exploring the relationship between rough sets and modal logics, Yao proposed and examined a number of extended rough set models; graded and probabilistic rough set models, based on graded and probabilistic modal logics, are also discussed in [13]. Yao also summarized various formulations of the standard rough set theory, demonstrated how those formulations can be adopted to develop different generalized rough set theories, and discussed the relationships between rough set theory and other theories [14]. In [15], Wu presented a general framework for the study of the mathematical structure of rough sets in infinite universes of discourse: lower and upper approximations of a crisp set with respect to an infinite approximation space are first defined, and the connections between rough sets and the Dempster-Shafer theory of evidence are then explored. Other extensions have also been introduced, such as the variable precision rough set (VPRS) model [16], the rough set model based on tolerance relation [17, 18], the Bayesian rough set model [19], the fuzzy rough set model, and the rough fuzzy set model [20, 21]. Many further achievements have been made in rough set theory. For example, Grzymala-Busse [22] developed the LERS system for rule induction, which can handle inconsistencies and induce both certain and possible rules. Polkowski [23] worked on using granular rough mereological structures in classification of data. Skowron et al. [24] worked on the relation and the combination of rough set theory and granular computing [25]. Lin proposed a granular computing model based on binary relations [26]. Yao studied three-way decisions in the probabilistic rough set model [27, 28].

The equivalence relation is a basic notion in Pawlak’s rough set model. However, the original rough set approaches do not consider attributes with preference-ordered domains, that is, criteria. In many real situations, we are faced with problems in which the ordering of attribute values plays a crucial role. One such type of problem is the ordering of objects. For this reason, Yao considered the problem of mining ordering rules as finding associations between orderings of attribute values and the overall ordering of objects [29]. For mining ordering rules, the notion of information tables is generalized to ordered information tables by adding order relations on attribute values. Iwiński also addressed the problem from the viewpoint of ranking objects in information systems [30, 31]. Moreover, Greco et al. proposed an extension of rough set theory, called the dominance-based rough set approach (DRSA), to take into account the ordering properties of criteria [32-34]. This innovation is mainly based on substituting a dominance relation for the indiscernibility relation. Greco et al. also characterized the DRSA and the decision rules induced from rough approximations, and presented the usefulness of the DRSA and its advantages over the CRSA (classical rough set approach) [32-34]. In DRSA, condition attributes are criteria and classes are preference ordered. Several studies have addressed properties and algorithmic implementations of DRSA [35-37].

Uncertainty measurement is an important issue in rough set theory. The pure rough set approach and the information theory approach are the two methodologies for dealing with the uncertainty measurement problem in rough set theory. In the pure rough set approach, the accuracy measure, the roughness measure, the approximation quality measure, the approximation accuracy measure, the dependency degree measure, and the importance degree measure are important numerical characterizations that quantify the imprecision of a rough set caused by its boundary region. Recently, Yao [38] studied two definitions of approximations and associated measures based on equivalence relations. In the information theory approach, entropy and its variants have been introduced into rough set theory [39-42].

The classical rough set model is based on an equivalence relation, or partition. Thus, the corresponding uncertainty measures are not suitable for ordered information systems. Several authors have defined uncertainty measures in ordered information systems by the information theory approach. Xu et al. introduced the concepts of rough entropy and knowledge granulation in ordered information systems [43]. Xu et al. also defined the knowledge granulation, knowledge entropy, and knowledge uncertainty measure in ordered information systems and gave some of their properties [44]. However, there are few studies on uncertainty measurement based on the pure rough set approach in ordered information systems. In this paper, we mainly focus on extending Pawlak’s pure rough set uncertainty measures to ordered information systems.

The organization of the remainder of this paper is as follows. In Section 2, some basic concepts in classical rough set theory and ordered information system are reviewed. Four types of lower and upper approximations and their corresponding uncertainty measures are investigated in Section 3, and some important properties are studied. Also we find that the essence of the first type and the third type is the same. In Section 4, four types of uncertainty measures are tested on some real-life data. And in Section 5, we conclude the paper with a summary and outlook for further research.

2. Preliminaries

In this section, we review some basic notions of classical rough set theory and of rough sets in ordered information systems.

Throughout this paper, we assume that the universe $U$ is a nonempty finite set; the class of all subsets of $U$ is denoted by $\mathcal{P}(U)$, and the complement of $X$ in $U$ is denoted by $\sim X$.

2.1. Rough Set Approximations in Classical Information System

A classical information system is an ordered triple $I = (U, A, F)$, where $U$ is a nonempty finite set of objects, $A$ is a nonempty finite set of condition attributes, and, for any $a \in A$, $f_a : U \to V_a$ is a map, where $V_a$ is the domain of the attribute $a$. In particular, a classical target information system is given by $I = (U, A \cup D, F)$, where $D$ is a nonempty finite set of decision attributes, and, for any $d \in D$, $f_d : U \to V_d$ is a map, where $V_d$ is the domain of the attribute $d$.

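Suppose that $I = (U, A, F)$ is a classical information system and $B \subseteq A$; let $U/B$ be the partition of $U$ induced by the attribute subset $B$. For any $x \in U$, $[x]_B = \{y \in U : f_a(y) = f_a(x) \text{ for all } a \in B\}$ denotes the equivalence class of $x$; more information can be found in [45-47].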

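Let $X$ be a subset of $U$; the lower and upper approximations of $X$ with respect to $B$ are defined, respectively, as follows:

$$\underline{B}(X) = \{x \in U : [x]_B \subseteq X\} = \bigcup \{[x]_B : [x]_B \subseteq X\},$$
$$\overline{B}(X) = \{x \in U : [x]_B \cap X \neq \emptyset\} = \bigcup \{[x]_B : [x]_B \cap X \neq \emptyset\}.$$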

From the definition, we can see that two different approaches have been employed for constructing the lower and upper approximations. The first is the element-based approach, while the second is the class-based approach. The lower approximation of a set $X$ with respect to $B$ is the set of all objects that certainly belong to $X$ with respect to $B$. The upper approximation of a set $X$ with respect to $B$ is the set of all objects that possibly belong to $X$ with respect to $B$.
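The two constructions can be sketched in a few lines. The partition below is hypothetical toy data, and the function names `lower_upper_element` and `lower_upper_class` are illustrative only; the sketch merely shows that on a partition the element-based and class-based approaches return the same sets.

```python
# Toy universe and a partition of it (hypothetical data for illustration).
U = {"x1", "x2", "x3", "x4", "x5"}
partition = [{"x1"}, {"x2", "x4"}, {"x3", "x5"}]


def block_of(x, partition):
    """Return the equivalence class that contains x."""
    return next(block for block in partition if x in block)


def lower_upper_element(X, U, partition):
    """Element-based approach: test each object's own class against X."""
    lower = {x for x in U if block_of(x, partition) <= X}
    upper = {x for x in U if block_of(x, partition) & X}
    return lower, upper


def lower_upper_class(X, partition):
    """Class-based approach: take unions of whole classes."""
    lower = set().union(*(b for b in partition if b <= X))
    upper = set().union(*(b for b in partition if b & X))
    return lower, upper


X = {"x1", "x2"}
print(lower_upper_element(X, U, partition) == lower_upper_class(X, partition))
```

Here the lower approximation is {x1} and the upper approximation is {x1, x2, x4} under both approaches.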

Let be a classical target information system and let be the set of decision classes of the information system .

2.2. Uncertainty Measures in Rough Set Theory

Rough sets can also be characterized numerically by accuracy measure, roughness measure, and approximation quality, which can be used for evaluating uncertainty of a set. And approximation accuracy can be used to evaluate the uncertainty of a rough classification [2]. Besides, dependency degree and importance degree can be employed to evaluate condition attribute subset with respect to decision attribute [1]. The definitions of the uncertainty measures are shown as follows.

Definition 1 (see [2]). Let be a classical information system, , and . The accuracy of set according to is The roughness of set with respect to is And the approximation quality of set with respect to is

In fact, the roughness measure is the well-known Marczewski-Steinhaus distance between the lower and upper approximations according to Yao [48].

Definition 2 (see [2]). Let be a classical decision information system, be the classification of the universe , and be the attribute subset that . The approximation accuracy of according to is The dependency degree and importance degree of with respect to are defined as [1]

According to the definitions of these measures, we know that the accuracy measure is equal to the degree of the completeness of knowledge about the given object set and the approximation quality can also evaluate the completeness degree of the set , while the roughness measure represents the incompleteness of the knowledge. Meanwhile, the approximation accuracy provides the percentage of possible correct decisions when classifying objects by employing the attribute set . The dependency degree and importance degree are used to measure the degree of the dependency and the importance of with respect to .
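As a rough illustration of how such measures are computed, the sketch below assumes Pawlak's standard formulas: accuracy as the ratio of the sizes of the lower and upper approximations, roughness as one minus accuracy, approximation accuracy as the ratio of summed lower sizes to summed upper sizes over all decision classes, and dependency degree as the summed lower sizes divided by the size of the universe. These formulas, the function names, and the data are assumptions made for the example; the approximation quality of a single set and the importance degree are left out.

```python
def accuracy(lower, upper):
    """|lower| / |upper|; taken as 1.0 when the upper approximation is empty."""
    return len(lower) / len(upper) if upper else 1.0


def roughness(lower, upper):
    """Complement of the accuracy."""
    return 1.0 - accuracy(lower, upper)


def approximation_accuracy(approximations):
    """Summed lower sizes over summed upper sizes across decision classes."""
    total_lower = sum(len(lo) for lo, _ in approximations)
    total_upper = sum(len(up) for _, up in approximations)
    return total_lower / total_upper


def dependency_degree(approximations, universe_size):
    """Fraction of objects that fall in some lower approximation."""
    return sum(len(lo) for lo, _ in approximations) / universe_size


# Hypothetical (lower, upper) pairs for two decision classes on a 5-object universe.
approx = [
    ({"x1"}, {"x1", "x2", "x4"}),
    ({"x3", "x5"}, {"x2", "x3", "x4", "x5"}),
]
print(accuracy(*approx[0]), roughness(*approx[0]))
print(approximation_accuracy(approx), dependency_degree(approx, 5))
```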

Moreover, to investigate the uncertainty measures, a partial relation is defined such that given two families of the equivalence relations and are induced by the attribute subsets and , respectively. One can define if and only if, for each , there exist such that ; then is said to be coarser than (or is finer than ). If and , then is said to be strictly coarser than (or is strictly finer than ) and it can be denoted by .

Many uncertainty measures can be defined, but not all of them are reasonable. For the accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree measures to be reasonable, they should have the following properties.

Accuracy. Let be a classical information system and . If , then .

Roughness. Let be a classical information system and . If , then .

Approximation Quality. Let be a classical information system and . If , then .

Approximation Accuracy. Let be a classical decision information system and . If , then .

Dependence Degree. Let be a classical decision information system and . If , then .

Importance Degree. Let be a classical decision information system and . If , then .

Obviously, these measures are reasonable to be used as uncertainty measures in classical rough set theory.

2.3. Ordered Information Systems and Dominance Relation

An ordered information system is an ordered triple $I^{\succeq} = (U, A, F)$, where $U$ is a nonempty finite set of objects, $A$ is a nonempty finite set of condition attributes, and, for any $a \in A$, $f_a : U \to V_a$ is a map, where $V_a$ is the domain of the attribute $a$. In particular, an ordered decision information system is given by $I^{\succeq} = (U, A \cup D, F)$, where $D$ is a nonempty finite set of decision attributes, and, for any $d \in D$, $f_d : U \to V_d$ is a map, where $V_d$ is the domain of the attribute $d$.

Definition 3 (see [34]). Let be an ordered information system, for ; then is called the dominance relation with respect to : And the dominance class of an object with respect to an attribute subset is

In ordered information system, just like it in classical information system, assume that is coarser than (or is finer than ), denoted by , if, for any , . If and , then is said to be strictly coarser than (or is strictly finer than ) and it can be denoted by .

Note that if , then .
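A small sketch may help to see how dominance classes behave when the attribute subset grows. It assumes that the dominance class of an object collects all objects that are at least as good on every attribute in the subset; this direction of the order, the table, and the name `dominance_class` are assumptions made for the example.

```python
# Hypothetical ordered table: each object maps to its values on two criteria.
table = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}


def dominance_class(x, attrs, table):
    """Objects that are at least as good as x on every attribute in attrs."""
    return {y for y, row in table.items()
            if all(row[a] >= table[x][a] for a in attrs)}


for x in sorted(table):
    coarse = dominance_class(x, (0,), table)    # one attribute
    fine = dominance_class(x, (0, 1), table)    # both attributes
    assert x in fine and fine <= coarse         # reflexive, and finer is smaller
    print(x, sorted(fine), sorted(coarse))
```

Because every object belongs to its own class, the classes cover the universe, but unlike equivalence classes they may overlap.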

Definition 4. Let , be two ordered information systems; they have the same object set, attribute set, but they may have different attribute values on some objects. If, for any , , either or if , we can get , and then we say is coarser than (or is finer than ), which is denoted by .

Note that if , then exist and , such that .

Theorem 5. Let , be two ordered information systems and . If , then, for any , .

Proof. (1) If, for any , , , then, for any , we have .
(2) If there exists , , such that . So if , then . Hence, .
This completes the proof.

3. Approximations and Uncertainty Measures in Ordered Information System

In this section, we investigate four types of definitions of lower and upper approximations in ordered information systems. We focus on whether these definitions are appropriate for the uncertainty measures (accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree). The family of dominance classes induced by the dominance relation discussed in the last section forms a covering of the universe. Thus, one can obtain four types of definitions of lower and upper approximations based on coverings. The first, the third, and the fourth types of definitions were studied by Yao in [49]. Yao defined these three types of approximation operators based on an arbitrary binary relation, while in this paper the relation is confined to the dominance relation defined in the last section, which is only one special type of binary relation. Most importantly, the granule in an ordered information system is in fact a successor neighborhood as used in [49]. These three types are natural, direct extensions of the Pawlak rough set model obtained by replacing the equivalence relation with the dominance relation, while the second definition replaces the element-based approach with the class-based approach and can be viewed as an indirect extension of the Pawlak rough set model.

3.1. The First Type of Approximations and Corresponding Measures

In this subsection, we will consider the first type of lower and upper approximations which are the element-based type. It can be defined as follows.

Definition 6. Let be an ordered information system, , and . The first type of lower approximation and upper approximation of according to are defined as follows:
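The element-based construction can be sketched as follows, assuming that an object belongs to the lower approximation when its dominance class is contained in the target set, and to the upper approximation when its dominance class meets the target set. The table, the direction of the dominance order, and the names `dominance_class` and `first_type` are assumptions made for the example.

```python
# Hypothetical ordered table: each object maps to its values on two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}


def dominance_class(x, attrs, table=TABLE):
    """Objects at least as good as x on every criterion in attrs (assumed direction)."""
    return {y for y in table if all(table[y][a] >= table[x][a] for a in attrs)}


def first_type(X, attrs, table=TABLE):
    """Element-based approximations: test the dominance class of each object."""
    lower = {x for x in table if dominance_class(x, attrs, table) <= X}
    upper = {x for x in table if dominance_class(x, attrs, table) & X}
    return lower, upper


lower, upper = first_type({"x1", "x4"}, attrs=(0, 1))
print(sorted(lower), sorted(upper))
```

For the target set {x1, x4}, the lower approximation is {x1, x4} and the upper approximation is {x1, x2, x4}.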

Based on the above definition of lower and upper approximations, one can define the accuracy, roughness, and approximation quality based on the first type as

For an ordered decision information system and , let be the set of equivalence decision classes of the ordered decision information system; then the approximation accuracy of according to can be defined as

The dependency degree and importance degree of with respect to can also be defined as

We investigate some new properties which are important when investigating whether the uncertainty measurement concepts including accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree are appropriate for uncertainty measures or not.

Theorem 7. Let be an ordered information system and ; for any , one has(1),(2),(3),(4).

Proof. (1) Suppose ; then ; according to the definition of , we have, for any , . For any , ; then , so . Thus, .
(2) Suppose ; according to the definition of , there exist , , such that . Then and . For any , ; then , so . Thus, .
(3) Suppose ; then and according to the definition of , we have for any , . For any , ; then , so . Thus, .
(4) Suppose ; according to the definition of , there exist , , such that . Then and . For any , ; then , so . Thus, .
Thus, the theorem is proved.

From the theorem above, one can get the following theorem easily.

Theorem 8. Let be an ordered decision information system and ; for any , the following properties hold:(1),(2),(3),(4),(5),(6),(7) ,(8),(9),(10),(11),(12).

Proof. (1) Suppose ; then . From (1) and (3) in Theorem 7, we have
(2) Suppose ; from (2) and (4) in Theorem 7, we have
(3) It is straightforward by (1).
(4) It is straightforward by (2).
(5) Suppose ; then . From (1) and (3) in Theorem 7, we have
(6) Suppose ; from (2) and (4) in Theorem 7, we have
(7) Suppose ; then . From (1) and (3) in Theorem 7, we have
Then,
So,
(8) It can be proved similar to (7) by (2) and (4) in Theorem 7.
(9) Suppose ; then . From (1) and (3) in Theorem 7, we have
(10) It can be proved similar to (9) by (2) and (4) in Theorem 7.
(11) From (9), we have
Then,
So,
(12) It can be proved similar to (11) by (2) and (4) in Theorem 7.

The theorem above shows that the accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree measures of Definition 6 are reasonable. Therefore, , , , , , and can be used as the uncertainty measures.
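The monotonicity can be checked numerically on a toy example. The sketch below assumes the element-based approximations and the accuracy ratio of lower to upper sizes; the data, the direction of the dominance order, and all names are hypothetical.

```python
# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}


def dominance_class(x, attrs):
    """Objects at least as good as x on every criterion in attrs (assumed direction)."""
    return {y for y in TABLE if all(TABLE[y][a] >= TABLE[x][a] for a in attrs)}


def first_type_accuracy(X, attrs):
    """Accuracy |lower| / |upper| under the element-based approximations."""
    lower = {x for x in TABLE if dominance_class(x, attrs) <= X}
    upper = {x for x in TABLE if dominance_class(x, attrs) & X}
    return len(lower) / len(upper) if upper else 1.0


X = {"x1", "x4"}
acc_small = first_type_accuracy(X, (0,))      # attribute subset with one criterion
acc_large = first_type_accuracy(X, (0, 1))    # superset with both criteria
print(acc_small, acc_large)
```

With one criterion no dominance class fits inside X, so the accuracy is 0; adding the second criterion raises it to 2/3, and the roughness falls accordingly.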

3.2. The Second Type of Approximations and Corresponding Measures

In this subsection, we will consider the second type of lower and upper approximations which are the class-based type. It can be defined as follows.

Definition 9. Let be an ordered information system, , and . The second type of lower approximation and upper approximation of according to are defined as follows:
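A sketch of the class-based construction, assuming that the lower approximation is the union of all dominance classes contained in the target set and the upper approximation is the union of all dominance classes that meet it. The table, the direction of the dominance order, and the names used are assumptions made for the example.

```python
# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}


def dominance_classes(attrs):
    """Map each object to the set of objects at least as good on attrs."""
    return {x: {y for y in TABLE if all(TABLE[y][a] >= TABLE[x][a] for a in attrs)}
            for x in TABLE}


def second_type(X, attrs):
    """Class-based approximations: unions of whole dominance classes."""
    classes = dominance_classes(attrs).values()
    lower = set().union(*(c for c in classes if c <= X))
    upper = set().union(*(c for c in classes if c & X))
    return lower, upper


lower, upper = second_type({"x1", "x4"}, attrs=(0, 1))
print(sorted(lower), sorted(upper))
```

For the target set {x1, x4}, the upper approximation becomes {x1, x2, x4, x5}: the class of x2 meets the target set and brings x5 along with it.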

Based on the above definition of lower and upper approximations, one can define the accuracy, roughness, and approximation quality based on the second type as

For an ordered decision information system , , is the set of equivalence decision class of the ordered information system; then the approximation accuracy of according to can be defined as

The dependency degree and importance degree of with respect to can also be defined as

Similarly, we investigate some new properties which are important when investigating whether the uncertainty measurement concepts including accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree are appropriate for uncertainty measures or not.

Theorem 10. Let be an ordered information system and ; for any , one has(1),(2),(3),(4).

Proof. (1) Suppose ; then and according to the definition of , we have, for any , .
For any , we have and . It is clear that and . Then we only need to prove and . For any , there exist such that . Then, for any , . While , for any , and . So, for any , ; hence . Thus and .
For any , there exist such that . Then and , so . Thus, .
(2) Suppose ; according to the definition of , there exist , , such that . Then and .
For any , we have and . It is clear that and . Then we only need to prove and . For any , there exist such that . Then for any , . While , then, for any , . So for any , ; hence . Thus and .
For any , there exist such that . Then and , so . Thus, .
(3) Suppose ; then . According to the definition of , we have, for any , . For any , ; then , so . Thus, .
(4) Suppose , according to the definition of , there exist , , such that . Then and . For any , ; then , so . Thus, .
Thus, the theorem is proved.

From the theorem above, one can get the following theorem easily.

Theorem 11. Let be an ordered decision information system and ; for any , the following properties hold:(1),(2),(3),(4),(5),(6),(7) ,(8), (9),(10),(11),(12).

Proof. The proof is similar to Theorem 8.

The theorem above shows that the accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree measures of Definition 9 are reasonable. Therefore, , , , , , and can also be used as the uncertainty measures.

3.3. The Third Type of Approximations and Corresponding Measures

In this subsection, we will consider the third type of lower and upper approximations, in which the lower approximation is the class-based lower approximation and the upper approximation is defined by duality. They can be defined as follows.

Definition 12. Let be an ordered information system, , and . The third type of lower approximation and upper approximation of according to are defined as follows:
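A sketch of the third construction, assuming that the lower approximation is the union of the dominance classes contained in the target set and that the upper approximation is obtained by duality, as the complement of the lower approximation of the complement. The data, the direction of the dominance order, and the names are hypothetical.

```python
# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}
U = set(TABLE)


def dominance_classes(attrs):
    """Map each object to the set of objects at least as good on attrs."""
    return {x: {y for y in U if all(TABLE[y][a] >= TABLE[x][a] for a in attrs)}
            for x in U}


def class_lower(X, attrs):
    """Union of the dominance classes contained in X."""
    return set().union(*(c for c in dominance_classes(attrs).values() if c <= X))


def third_type(X, attrs):
    """Class-based lower approximation; upper approximation by duality."""
    return class_lower(X, attrs), U - class_lower(U - X, attrs)


lower, upper = third_type({"x1", "x4"}, attrs=(0, 1))
print(sorted(lower), sorted(upper))
```

On this data the result, ({x1, x4}, {x1, x2, x4}), coincides with what the element-based construction gives for the same target set.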

Based on the above definition of lower and upper approximations, one can define the accuracy, roughness, and approximation quality based on the third type as

For an ordered decision information system and , let be the set of equivalence decision classes of the ordered decision information system; then the approximation accuracy of according to can be defined as

The dependency degree and importance degree of with respect to can also be defined as

Similarly, we investigate some new properties which are important when investigating whether the uncertainty measurement concepts including accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree are appropriate for uncertainty measures or not.

Theorem 13. Let be an ordered information system and ; for any , one has(1),(2),(3),(4).

Proof. (1) The proof is the same as that of (1) in Theorem 10.
(2) The proof is the same as that of (2) in Theorem 10.
(3) From (1) and Definition 12, we have .
(4) From (2) and Definition 12, we have .
Thus, the theorem is proved.

From the theorem above, one can get the following theorem easily.

Theorem 14. Let be an ordered decision information system and ; for any , the following properties hold:(1),(2),(3),(4),(5),(6),(7) ,(8),(9),(10),(11),(12).

Proof. The proof is similar to Theorem 8.

The theorem above shows that the accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree measures of Definition 12 are reasonable. Therefore, , , , , , and can also be used as the uncertainty measures.

3.4. The Fourth Type of Approximations and Corresponding Measures

In this subsection, we will consider the fourth type of lower and upper approximations, in which the upper approximation is the class-based upper approximation and the lower approximation is defined by duality. They can be defined as follows.

Definition 15. Let be an ordered information system, , and . The fourth type of lower approximation and upper approximation of according to are defined as follows:
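A sketch of the fourth construction, assuming that the upper approximation is the union of the dominance classes that meet the target set and that the lower approximation is obtained by duality, as the complement of the upper approximation of the complement. The data, the direction of the dominance order, and the names are hypothetical.

```python
# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}
U = set(TABLE)


def dominance_classes(attrs):
    """Map each object to the set of objects at least as good on attrs."""
    return {x: {y for y in U if all(TABLE[y][a] >= TABLE[x][a] for a in attrs)}
            for x in U}


def class_upper(X, attrs):
    """Union of the dominance classes that intersect X."""
    return set().union(*(c for c in dominance_classes(attrs).values() if c & X))


def fourth_type(X, attrs):
    """Class-based upper approximation; lower approximation by duality."""
    return U - class_upper(U - X, attrs), class_upper(X, attrs)


lower, upper = fourth_type({"x1", "x4"}, attrs=(0, 1))
print(sorted(lower), sorted(upper))
```

Here the lower approximation shrinks to {x1}, because x4 lies in the class of x2, which also contains objects outside the target set.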

Based on the above definition of lower and upper approximations, one can define the accuracy, roughness, and approximation quality based on the fourth type as

For an ordered decision information system and , let be the set of equivalence decision classes of the ordered decision information system; then the approximation accuracy of according to can be defined as

The dependency degree and importance degree of with respect to can also be defined as

Similarly, we investigate some new properties which are important when investigating whether the uncertainty measurement concepts including accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree are appropriate for uncertainty measures or not.

Theorem 16. Let be an ordered information system and ; for any , one has(1),(2),(3),(4).

Proof. (3) Suppose ; then . According to the definition of , we have, for any , . For any , ; then , so . Thus, .
(1) According to (3), we have ; therefore, .
(4) It can be proved similarly to (3).
(2) It can be proved similarly to (1).
Thus, the theorem is proved.

From the theorem above, one can get the following theorem easily.

Theorem 17. Let be an ordered decision information system and ; for any , the following properties hold:(1),(2),(3),(4),(5),(6),(7) ,(8),(9),(10),(11),(12).

Proof. The proof is similar to Theorem 8.

And similar to the above three types of approximations and corresponding measures, the , , , , , and can be employed to evaluate the uncertainty.

3.5. Relationships among These Four Types of Approximations

We first discuss the relationships among the four types of approximation operators, that is to say, the relationships among the four types of lower and upper approximation operators.

Theorem 18. Let be an ordered decision information system and ; for any , the four approximations have the following property:

Proof. (1) If , then ; that is to say, . Then , so ;
(2) If , then ; according to Definition 6, we have . Then , so . And, for any , there exist such that ; then for any , ; that is to say, , so . Thus, .
(3) It is straightforward that .
(4) The first type lower and upper approximations are defined based on the element, while the third type lower approximations are the class-based lower approximation, and the upper approximation is defined by the duality. From (2) we have their lower approximations which are the same, and both the upper approximations have the duality, so their upper approximations are also the same.
(5) If , then ; according to Definition 15, we have , so .
(6) It is straightforward that .

From the proof of the theorem above, one can see that the first type and the third type are in fact the same, so their corresponding uncertainty measures coincide as well.
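These relationships can be checked exhaustively on a small example. The sketch below assumes the four constructions described in Sections 3.1 to 3.4 and a dominance relation that is reflexive and transitive; the data and all names are hypothetical. It tests, for every subset of the universe, that the first and third types coincide, that the fourth lower approximation is contained in the common lower approximation of the first and second types, and that the first upper approximation is contained in the common upper approximation of the second and fourth types.

```python
from itertools import combinations

# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}
U = set(TABLE)
ATTRS = (0, 1)

# Dominance class of each object: objects at least as good on every criterion.
CLASSES = {x: {y for y in U if all(TABLE[y][a] >= TABLE[x][a] for a in ATTRS)}
           for x in U}


def lower_elem(X):
    return {x for x in U if CLASSES[x] <= X}


def upper_elem(X):
    return {x for x in U if CLASSES[x] & X}


def lower_cls(X):
    return set().union(*(c for c in CLASSES.values() if c <= X))


def upper_cls(X):
    return set().union(*(c for c in CLASSES.values() if c & X))


def four_types(X):
    """(lower, upper) pairs of the four types, keyed by type number."""
    return {
        1: (lower_elem(X), upper_elem(X)),
        2: (lower_cls(X), upper_cls(X)),
        3: (lower_cls(X), U - lower_cls(U - X)),
        4: (U - upper_cls(U - X), upper_cls(X)),
    }


def relationships_hold(X):
    t = four_types(X)
    return (t[1] == t[3]
            and t[4][0] <= t[1][0] == t[2][0] <= X
            and X <= t[1][1] <= t[2][1] == t[4][1])


all_subsets = [set(c) for r in range(len(U) + 1)
               for c in combinations(sorted(U), r)]
print(all(relationships_hold(X) for X in all_subsets))
```

An exhaustive check on one table is of course no proof, but it gives a quick sanity test when implementing the operators.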

Since all the four types of definitions of accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree can be used to measure the uncertainty of knowledge in ordered information system, here we study the relationships between the four types of measures.

Theorem 19. Let be an ordered decision information system and ; for any , the four types of uncertainty measures have the following properties:(1),(2),(3),(4) ,(5),(6).

Proof. (1) From Theorem 18, we have
So That is to say,
(2) From (1), we have Similarly, we can prove the following.
(3) ,
(4) ,
(5) ,
(6) .
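The ordering of the measures across types can be illustrated numerically. The sketch below assumes the accuracy ratio of lower to upper sizes and the first, second, and fourth constructions as described above (the third is omitted because it coincides with the first); the data and all names are hypothetical.

```python
# Hypothetical ordered table with two criteria.
TABLE = {"x1": (1, 3), "x2": (2, 2), "x3": (3, 1), "x4": (2, 3), "x5": (3, 2)}
U = set(TABLE)
CLASSES = {x: {y for y in U if all(TABLE[y][a] >= TABLE[x][a] for a in (0, 1))}
           for x in U}


def upper_cls(X):
    """Union of the dominance classes that intersect X."""
    return set().union(*(c for c in CLASSES.values() if c & X))


def accuracies(X):
    """Accuracy |lower| / |upper| for the first, second, and fourth types."""
    lower1 = {x for x in U if CLASSES[x] <= X}
    upper1 = {x for x in U if CLASSES[x] & X}
    lower2 = set().union(*(c for c in CLASSES.values() if c <= X))
    upper2 = upper_cls(X)
    lower4 = U - upper_cls(U - X)
    return {1: len(lower1) / len(upper1),
            2: len(lower2) / len(upper2),
            4: len(lower4) / len(upper2)}  # the fourth upper equals the second


acc = accuracies({"x1", "x4"})   # X must be nonempty so the upper sets are nonempty
print(acc)
```

For this target set the accuracies are 2/3, 1/2, and 1/4 for the first, second, and fourth types, so the roughness values increase in the same order.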

Example 20. Table 1 shows an ordered decision information system , where is the universe, is the conditional attribute set, and is the decision attribute set.

We first calculate the dominance class of each object with respect to attribute set :

Let ; then ~. The four types of lower and upper approximations are calculated as follows:(1), (2), (3), (4).

From the results above, we can find that . After obtaining the lower and upper approximations, we can calculate the accuracy, roughness, and approximation quality as follows:

Consider . In order to calculate the approximation accuracy, we also need to calculate the four types of lower and upper approximations of sets and . The four types of lower and upper approximations of with respect to are:(1), (2), (3), (4) .

Thus the approximation accuracy is

And the dependency degree of with respect to can also be calculated as

4. Empirical Experiments

In Section 3, we found from the theoretical point of view that all four types can be used to evaluate the uncertainty of knowledge in ordered information systems, and that the first type and the third type are the same. In this section, we test the first, second, and fourth types of measures on real-life data sets (the third type is omitted because it is essentially the same as the first). Three real-life data sets available from UCI are used. The characteristics of the data sets are summarized in Table 2.

Figures 1, 2, and 3 show the values of the accuracy, roughness, and approximation quality measures, respectively, of the first 70 percent of all objects with respect to attribute sets of different sizes in the ordered information system. The x-axis represents the size of the attribute set, from one attribute to all attributes. The y-axis represents the value of the measure. The values of the approximation accuracy of the decision classification are shown in Figure 4, and the dependency degree of the different attribute sets with respect to the decision attribute is shown in Figure 5. Each figure has three lines, one for each tested type.

From the figures, we can see that, as the attribute set grows, the accuracy, approximation quality, approximation accuracy, and dependency degree of all types increase, while the roughness decreases. Moreover, the accuracy and approximation accuracy of the first type are larger than those of the second type, which in turn are larger than those of the fourth type. Conversely, the roughness of the first type is smaller than that of the second type, which in turn is smaller than that of the fourth type. The approximation quality and the dependency degree of the first type are equal to those of the second type, and they are larger than those of the fourth type. These results verify the properties of Theorems 8, 11, and 17.

5. Conclusions

Uncertainty measurement is an important issue in rough set theory. In this paper, we investigated four types of lower and upper approximations and the corresponding accuracy, roughness, approximation quality, approximation accuracy, dependency degree, and importance degree in ordered information systems. We found that the uncertainty measures of all four types are monotonic and can be used to evaluate the uncertainty in ordered information systems. Furthermore, the relationships among the four types of lower and upper approximation operators and among the corresponding uncertainty measures were obtained. Finally, the measures were tested on real-life data sets, and the results show that they can indeed measure the uncertainty in ordered information systems. In the future, we will consider applications of the presented uncertainty measures, especially in attribute reduction and rule generation in ordered information systems.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61105041), the National Natural Science Foundation of CQ CSTC (nos. cstc 2011jj A40037 and cstc 2013jcy A40051), and the Science and Technology Program of Chongqing University of Technology (no. YCX2012203).