#### Abstract

The multiscale information system is a new knowledge representation system for expressing knowledge at different levels of granulation. In this paper, by considering unknown values, which are common in real-world applications, the incomplete multiscale information system is first investigated. The descriptor technique is employed to construct rough sets at different scales for analyzing hierarchically structured data. The problem of deriving decision rules at different scales is also addressed. Finally, reduct descriptors are formulated to simplify the decision rules that can be derived at different scales. Some numerical examples are employed to substantiate the conceptual arguments.

#### 1. Introduction

As one of the important mathematical tools for granular computing [1, 2], the theory of rough set [3] has been demonstrated to be useful in fields such as data mining, knowledge discovery, decision support, machine learning, and pattern recognition.

Pawlak’s rough set was proposed on the basis of an indiscernibility relation, which generates a granulation space on the universe of discourse. Such a granulation space is actually a partition since the indiscernibility relation is an equivalence relation. To meet different requirements, a variety of extended rough set models have been proposed. For example, the tolerance relation [4–7], similarity relation [8–10], characteristic relation [11, 12], and neighborhood system [13] based rough sets can be used to deal with incomplete information systems; the dominance-based rough set approach [14–18] can be used to deal with multicriteria decision problems; the covering based rough sets [19–22] are constructed on the basis of a covering, which is a generalization of the partition on the universe; the fuzzy rough set approaches [23–26] are proposed to approximate fuzzy concepts in fuzzy environments; the variable precision rough set approaches [27–29] allow some inconsistency to exist, which can not only solve classification problems with uncertain data but also relax the boundary definition of Pawlak’s rough set to improve its applicability.

Obviously, the above rough sets are constructed on the basis of one and only one set of information granules, which can be generated from a binary relation or a covering. From this point of view, we may call these rough sets the single-granulation rough sets. In single-granulation rough sets, a partition is a granulation space, a binary neighborhood system induced by a binary relation is a granulation space, and a covering is also a granulation space. Nevertheless, it should be noticed that, as argued in [30], we often need to describe a target concept concurrently from several independent environments; that is, multigranulation spaces are needed in problem solving. From this point of view, Qian et al. [31–33] proposed the concept of multigranulation rough sets. The main difference between single-granulation and multigranulation rough sets is that the latter use several different sets of information granules to approximate the target concept. Since each set of information granules can be considered as a granulation space, the space induced by several different sets of information granules is referred to as the multigranulation space. For example, a family of partitions can be regarded as a partition based multigranulation space.

Presently, multigranulation rough set approaches are developing rapidly. For instance, Qian et al. classified their multigranulation rough sets into two categories: the optimistic case and the pessimistic case. Yang et al. [34] generalized the optimistic and pessimistic multigranulation rough sets to the fuzzy environment and then proposed the multigranulation fuzzy rough set models. Furthermore, Qian et al. also proposed a positive approximation [35, 36], which can be used to accelerate a heuristic process of attribute reduction. Since the positive approximation uses a preference ordering that makes the granulation space finer step by step, that is, a finer granulation space is obtained from the previous granulation space, the positive approximation also reflects the idea of multigranulation. Bittner and Smith [37, 38] proposed the concept of granular partition, which provides what may be thought of as a hierarchical family of partial equivalence relations. Khan and Banerjee [39] studied the rough set approach to multiple-source information systems, which reflects the situation where information arrives from multiple sources. Wu and Leung [40, 41] investigated a new knowledge representation system called the multiscale information system. In such a system, the data are represented at different scales corresponding to different levels of granulation, and the granular information is transformed from a finer to a coarser level of granulation.

It must be noticed that the multiscale information system is a very important knowledge representation approach; it helps us analyze data from the viewpoint of different levels of granulation. For example, maps can be hierarchically organized into different scales, from large to small and vice versa. The smaller the scale, the finer the partition that can be obtained; conversely, the bigger the scale, the coarser the partition that can be obtained. However, what Wu and Leung investigated are complete multiscale information systems. Gore, in his influential book Earth in the Balance [42], notes that “We must acknowledge that we never have complete information. Yet we have to make decisions anyway.” This quote illustrates not only the difficulty of making decisions about environmental issues but also the fact that making such decisions with partial information is ultimately inevitable. Therefore, the investigation of the incomplete multiscale information system has become a necessity. Unlike the complete multiscale information system, since unknown values exist in an incomplete multiscale information system, the granulation space obtained at each level of granulation is not necessarily a partition but a covering. To solve this problem, Wu and Leung's approach to the multiscale information system has to be reexamined for the incomplete multiscale information system. This is what will be discussed in this paper.

In the next section, we first introduce some basic notions related to Pawlak's rough set and the multiscale information system. The incomplete multiscale information system and the rule induction problem are explored in Section 3. In Section 4, the concept of reducts is introduced into descriptors in the incomplete multiscale decision system in order to derive simplified decision rules. We then conclude the paper with a summary and outlook for further research in Section 5.

#### 2. Preliminary Knowledge

##### 2.1. Rough Set

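Formally, an information system [3] can be considered as a pair $S = (U, AT)$, in which

(i) $U$ is a nonempty finite set of objects; it is called the universe;
(ii) $AT$ is a nonempty finite set of attributes, such that, $\forall a \in AT$, $V_a$ is the domain of attribute $a$.

$\forall x \in U$, let us denote by $a(x)$ the value that $x$ holds on $a$ ($a \in AT$). For an information system $S$, one can then describe the relationship between objects through their attribute values. With respect to a subset of attributes $A$ such that $A \subseteq AT$, an indiscernibility relation [3] may be defined as

$$IND(A) = \{(x, y) \in U^2 : \forall a \in A,\ a(x) = a(y)\}.$$

The relation $IND(A)$ is reflexive, symmetric, and transitive; hence $IND(A)$ is an equivalence relation. By the indiscernibility relation $IND(A)$, one can derive the lower and upper approximations of an arbitrary subset $X$ of $U$. They are defined as

$$\underline{A}(X) = \{x \in U : [x]_A \subseteq X\}, \qquad \overline{A}(X) = \{x \in U : [x]_A \cap X \neq \emptyset\},$$

respectively, where $[x]_A = \{y \in U : (x, y) \in IND(A)\}$ is the $A$-equivalence class containing $x$. The pair $(\underline{A}(X), \overline{A}(X))$ is referred to as Pawlak's rough set of $X$ with respect to the set of attributes $A$.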
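As a minimal sketch of the definitions above, the following Python snippet computes the equivalence classes of an indiscernibility relation and the resulting lower and upper approximations. The table `TABLE`, the attribute names `a` and `b`, and the function names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def equivalence_classes(table, attrs):
    """Group objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def approximations(table, attrs, target):
    """Return the Pawlak lower and upper approximations of target."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attrs):
        if block <= target:      # block lies entirely inside the target
            lower |= block
        if block & target:       # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy information system (not taken from the paper).
TABLE = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": 0},
    "x3": {"a": 0, "b": 1},
    "x4": {"a": 0, "b": 0},
}
LOWER, UPPER = approximations(TABLE, ["a", "b"], {"x1", "x3"})
print(sorted(LOWER), sorted(UPPER))  # ['x3'] ['x1', 'x2', 'x3']
```

Here `x1` and `x2` are indiscernible, so `x1` can only be approximated from above: it belongs to the upper approximation but not to the lower one.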

##### 2.2. Multiscale Information System

A multiscale information system [40] is a tuple $S = (U, AT)$, where

(i) $U$ is a nonempty, finite set of objects called the universe of discourse;
(ii) $AT = \{a_1, a_2, \ldots, a_m\}$ is a nonempty, finite set of attributes, and each $a_j \in AT$ is a multiscale attribute; that is, for the same object in $U$, attribute $a_j$ can take on different values at different scales.

In the multiscale information system, Wu and Leung [41] assumed that all the attributes have the same number $I$ of levels of granulation. Therefore, a multiscale information system can be rewritten as a system $S = (U, \{a_j^k : k = 1, 2, \ldots, I,\ j = 1, 2, \ldots, m\})$, where $a_j^k : U \rightarrow V_j^k$ is a surjective function and $V_j^k$ is the domain of the $k$th scale attribute $a_j^k$. For the set of the $k$th scale attributes $AT^k = \{a_j^k : j = 1, 2, \ldots, m\}$, one then denotes an equivalence relation such that

$$R_{AT^k} = \{(x, y) \in U^2 : a_j^k(x) = a_j^k(y),\ j = 1, 2, \ldots, m\}.$$

The partition induced by $R_{AT^k}$ is denoted by $U/R_{AT^k}$ such that

$$U/R_{AT^k} = \{[x]_{AT^k} : x \in U\},$$

where $[x]_{AT^k} = \{y \in U : (x, y) \in R_{AT^k}\}$ is the equivalence class that includes $x$ at scale $k$.

It should be noticed that, in the multiscale information system, different scales represent different levels of granulation, so there is a hierarchical structure among the $I$ scales. Such a hierarchical structure can be expressed by the inclusion relation among equivalence relations; that is,

$$R_{AT^1} \subseteq R_{AT^2} \subseteq \cdots \subseteq R_{AT^I}.$$

A system $S = (U, AT \cup \{d\})$ is referred to as a multiscale decision system, where $(U, AT)$ is a multiscale information system and $d \notin AT$ is a special attribute called the decision. Such a multiscale decision system can be decomposed into $I$ decision systems $S^k = (U, AT^k \cup \{d\})$, $k = 1, 2, \ldots, I$, with the same decision $d$. Each decomposed decision system represents information at a specific level of granulation, that is, scale.

In [40], Wu and Leung said that the multiscale decision system $S$ is referred to as consistent if and only if the decision system under the first (finest) level of scale, that is, $S^1 = (U, AT^1 \cup \{d\})$, is consistent; otherwise it is referred to as inconsistent. Following this work, we propose a more general definition of consistency in the multiscale decision system.

*Definition 1 (see [40]).* Let $S = (U, AT \cup \{d\})$ be a multiscale decision system; then $S$ is referred to as $k$-scale consistent if and only if the $k$-scale decision system, that is, $S^k = (U, AT^k \cup \{d\})$, is consistent; otherwise $S$ is referred to as $k$-scale inconsistent.

In what follows, $R_d = \{(x, y) \in U^2 : d(x) = d(y)\}$ denotes the equivalence relation induced by the decision $d$, so that $S^k$ is consistent if and only if $R_{AT^k} \subseteq R_d$. By Definition 1, we can see that if $k = 1$, 1-scale consistency is the same as what has been proposed by Wu and Leung. In such a case, we have $R_{AT^1} \subseteq R_d$. Moreover, since $R_{AT^1} \subseteq R_{AT^k}$ ($k = 2, \ldots, I$), we do not always have $R_{AT^k} \subseteq R_d$; it follows that, among the $I$ levels of granulation, we need only at least one level to satisfy the condition of consistency. This type of consistency, that is, 1-scale consistency, can therefore be referred to as the optimistic consistency in the multiscale decision system.

On the other hand, let us consider $I$-scale consistency. In such a case, we have $R_{AT^I} \subseteq R_d$. Moreover, since $R_{AT^k} \subseteq R_{AT^I}$ ($k = 1, \ldots, I - 1$), we also have $R_{AT^k} \subseteq R_d$; it follows that, among the $I$ levels of granulation, we need all the levels to satisfy the condition of consistency. This type of consistency, that is, $I$-scale consistency, can therefore be referred to as the pessimistic consistency in the multiscale decision system.

From the discussion above, it is not difficult to observe that the optimistic and pessimistic consistencies are both special cases of $k$-scale consistency in the multiscale decision system.

Proposition 2 (see [40]). *Let $S$ be a multiscale decision system; if $S$ is $k$-scale consistent, then $S$ is also $l$-scale consistent, where $k, l \in \{1, 2, \ldots, I\}$ and $l \leq k$.*

The above proposition tells us that if a multiscale decision system is consistent at a given scale, then it is also consistent at every scale smaller than the given scale.

*Remark 3.* It should be noticed that the converse of Proposition 2 does not always hold; that is, if the multiscale decision system is consistent at a smaller scale, then it is not always consistent at a bigger scale.

*Definition 4 (see [40]).* Let $S = (U, AT \cup \{d\})$ be a multiscale decision system; then, $\forall X \subseteq U$, the $k$-scale lower and upper approximations of $X$ are denoted by $\underline{AT^k}(X)$ and $\overline{AT^k}(X)$, respectively, where

$$\underline{AT^k}(X) = \{x \in U : [x]_{AT^k} \subseteq X\}, \qquad \overline{AT^k}(X) = \{x \in U : [x]_{AT^k} \cap X \neq \emptyset\}.$$

The pair $(\underline{AT^k}(X), \overline{AT^k}(X))$ is referred to as the $k$-scale rough set of $X$ in the multiscale decision system. Since the $k$-scale rough set in Definition 4 is still based on an equivalence relation, the properties of Pawlak’s rough set still hold for the $k$-scale rough set. We omit these properties in this paper.

By Definition 4, we can see that, for each $x$ in the 1-scale lower approximation, we have $[x]_{AT^1} \subseteq X$. Since $[x]_{AT^1} \subseteq [x]_{AT^k}$ ($k = 2, \ldots, I$), we do not always have $[x]_{AT^k} \subseteq X$; it follows that, among the $I$ levels of granulation, we need only at least one level to satisfy the inclusion condition between equivalence class and target concept. Such an explanation is compatible with that of Qian et al.’s optimistic multigranulation lower approximation [31, 33]. Moreover, for each $x$ in the 1-scale upper approximation, we have $[x]_{AT^1} \cap X \neq \emptyset$. Since $[x]_{AT^1} \subseteq [x]_{AT^k}$, we always have $[x]_{AT^k} \cap X \neq \emptyset$; it follows that, among the $I$ levels of granulation, we need all the levels to satisfy the intersection condition between equivalence class and target concept. Such an explanation is also compatible with that of Qian et al.’s optimistic multigranulation upper approximation. From this point of view, the 1-scale rough set is also referred to as the optimistic multiscale rough set in the multiscale decision system.

On the other hand, let us consider the $I$-scale lower approximation; for each $x$ in it, we have $[x]_{AT^I} \subseteq X$. Since $[x]_{AT^k} \subseteq [x]_{AT^I}$ ($k = 1, \ldots, I - 1$), we also have $[x]_{AT^k} \subseteq X$; it follows that, among the $I$ levels of granulation, we need all the levels to satisfy the inclusion condition between equivalence class and target concept. Such an explanation is compatible with that of Qian et al.'s pessimistic multigranulation lower approximation [32]. Moreover, for each $x$ in the $I$-scale upper approximation, we have $[x]_{AT^I} \cap X \neq \emptyset$. Since $[x]_{AT^k} \subseteq [x]_{AT^I}$, we do not always have $[x]_{AT^k} \cap X \neq \emptyset$; it follows that, among the $I$ levels of granulation, we need at least one level to satisfy the intersection condition between equivalence class and target concept. Such an explanation is also compatible with that of Qian et al.’s pessimistic multigranulation upper approximation. From this point of view, the $I$-scale rough set is also referred to as the pessimistic multiscale rough set in the multiscale decision system.

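Proposition 5 (see [40]). *Let $S$ be a multiscale decision system; then, $\forall X \subseteq U$, one has*

$$\underline{AT^I}(X) \subseteq \cdots \subseteq \underline{AT^2}(X) \subseteq \underline{AT^1}(X), \qquad \overline{AT^1}(X) \subseteq \overline{AT^2}(X) \subseteq \cdots \subseteq \overline{AT^I}(X).$$

The above proposition tells us that, as the level of granulation increases, the $k$-scale lower approximations become smaller while the $k$-scale upper approximations become bigger. In other words, we can obtain a sequence of rough sets through different levels of granulation in the multiscale decision system.

The accuracy of the $k$-scale rough approximation is defined by

$$\alpha_{AT^k}(X) = \frac{|\underline{AT^k}(X)|}{|\overline{AT^k}(X)|},$$

where $|X|$ denotes the cardinal number of set $X$. Obviously, $0 \leq \alpha_{AT^k}(X) \leq 1$ holds.

Proposition 6 (see [40]). *Let $S$ be a multiscale decision system; then, $\forall X \subseteq U$, one has*

$$\alpha_{AT^I}(X) \leq \cdots \leq \alpha_{AT^2}(X) \leq \alpha_{AT^1}(X).$$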
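The monotonicity stated in Propositions 5 and 6 can be checked on a small example. The sketch below is illustrative only: the single attribute, its values, and the coarsening maps `to_scale2` and `to_scale3` are hypothetical and are not taken from [40].

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finest-scale (scale 1) values of a single attribute.
SCALE1 = {"x1": 95, "x2": 88, "x3": 72, "x4": 65, "x5": 40, "x6": 35}


def to_scale2(score):
    """Hypothetical surjective map from scale-1 values to scale-2 values."""
    if score >= 80:
        return "high"
    return "mid" if score >= 60 else "low"


def to_scale3(level):
    """Hypothetical surjective map from scale-2 values to scale-3 values."""
    return "pass" if level in ("high", "mid") else "fail"


def partition(values):
    """Equivalence classes induced by one attribute at one scale."""
    blocks = defaultdict(set)
    for obj, value in values.items():
        blocks[value].add(obj)
    return list(blocks.values())


def approximations(blocks, target):
    """Lower and upper approximations of target with respect to a partition."""
    lower = set().union(*(b for b in blocks if b <= target))
    upper = set().union(*(b for b in blocks if b & target))
    return lower, upper


SCALE2 = {obj: to_scale2(v) for obj, v in SCALE1.items()}
SCALE3 = {obj: to_scale3(v) for obj, v in SCALE2.items()}
TARGET = {"x1", "x2", "x3"}

RESULTS = [approximations(partition(s), TARGET) for s in (SCALE1, SCALE2, SCALE3)]
ACCURACY = [Fraction(len(lo), len(up)) for lo, up in RESULTS]
print(ACCURACY)  # [Fraction(1, 1), Fraction(1, 2), Fraction(0, 1)]
```

As the scale grows, the lower approximation shrinks from three objects to none, the upper approximation grows from three objects to four, and the accuracy decreases accordingly.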

#### 3. Incomplete Multiscale Information System

##### 3.1. $k$-Scale Descriptors Based Rough Set

An incomplete multiscale information system is still denoted by $S$ in this paper. Given an incomplete multiscale information system $S$, if $a_j^k(x) = *$, then we say that the value of object $x$ is unknown on the attribute $a_j$ at the $k$-scale. Moreover, we assume that the unknown value can be compared with any other value in the domain of the corresponding attribute [4, 5]. Therefore, we use the descriptor based rough set for analyzing the incomplete multiscale information system.

In the discussion to follow, the symbols $\wedge$ and $\vee$ denote the logical connectives “and” (conjunction) and “or” (disjunction), respectively [15]. Given an incomplete multiscale information system $S$, if $A^k \subseteq AT^k$, then any attribute-value pair $(a, v)$ is called an *$A^k$-atomic property*, where $a \in A^k$ and $v \in V_a$. Any $A^k$-atomic property or conjunction of different $A^k$-atomic properties is called an *$A^k$-descriptor*. If $(a, v)$ is an atomic property occurring in the $A^k$-descriptor $t$, we simply say that $(a, v) \in t$. Obviously, $t$ is constructed at scale $k$; it can also be called a $k$-scale descriptor.

Let $t$ and $t'$ be $A^k$-descriptors; if, for all $(a, v) \in t$, we have $(a, v) \in t'$, that is, $t$ is constructed from a subset of the atomic properties occurring in $t'$, then we say $t$ is *coarser* than $t'$ or $t'$ is *finer* than $t$, which is denoted by $t \preceq t'$ or $t' \succeq t$. If $t$ is constructed from a proper subset of the atomic properties occurring in $t'$, then we say $t$ is *properly coarser* than $t'$, which is denoted by $t \prec t'$ or $t' \succ t$.

Let $t$ be an $A^k$-descriptor; the set of attributes occurring in $t$ is denoted by $A(t)$. Moreover, if $t$ is an $A^k$-descriptor and $A(t) = A^k$, then $t$ is called a *full $A^k$-descriptor*. Here, suppose that $t$ is a full $A^k$-descriptor; we denote

$$\|t\| = \{x \in U : \forall (a, v) \in t,\ a(x) = v \vee a(x) = *\};$$

then $\|t\|$ is referred to as the support of $t$.

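Here, let us denote by $FDES(AT^k)$ the set of all full $AT^k$-descriptors with nonempty supports:

$$FDES(AT^k) = \{t : t \text{ is a full } AT^k\text{-descriptor},\ \|t\| \neq \emptyset\}.$$

By the descriptor technique, the universe can be covered at scale $k$ by several subsets that may overlap, and the result is denoted by $C^k$ such that

$$C^k = \{\|t\| : t \in FDES(AT^k)\}.$$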
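The descriptor technique at a single scale can be sketched as follows. This is an illustration under stated assumptions: the unknown value is written `*` and matches every value, as assumed in the text; the table `TABLE`, the attribute names, and the function names are hypothetical.

```python
from itertools import product

UNKNOWN = "*"  # assumed marker for an unknown value


def support(descriptor, table):
    """Objects whose known values agree with every atomic property."""
    return frozenset(
        obj for obj, row in table.items()
        if all(row[a] in (v, UNKNOWN) for a, v in descriptor)
    )


def full_descriptors(table, attrs):
    """Map each full descriptor with nonempty support to that support."""
    domains = [
        sorted({row[a] for row in table.values()} - {UNKNOWN}) for a in attrs
    ]
    result = {}
    for values in product(*domains):
        descriptor = tuple(zip(attrs, values))
        supp = support(descriptor, table)
        if supp:
            result[descriptor] = supp
    return result


# Hypothetical incomplete table at one scale (not Table 1 of the paper).
TABLE = {
    "x1": {"a": "G", "b": "L"},
    "x2": {"a": "*", "b": "L"},
    "x3": {"a": "B", "b": "*"},
    "x4": {"a": "B", "b": "H"},
}
FDES = full_descriptors(TABLE, ["a", "b"])
for descriptor, supp in FDES.items():
    print(descriptor, sorted(supp))
```

In this toy table the supports are `{x3, x4}`, `{x2, x3}`, and `{x1, x2}`: they cover the universe but overlap, so they form a covering and not a partition.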

In the complete multiscale decision system, the hierarchical structure is represented by a partial order among different equivalence relations or among different partitions. In the incomplete multiscale decision system, for each level of granulation we obtain a family of supports of descriptors, which forms a covering of the universe of discourse; we can then use those supports to represent the hierarchical structure such that

$$C^1 \preceq C^2 \preceq \cdots \preceq C^I,$$

where $C^k$ is the family of supports of the full $AT^k$-descriptors and, $\forall k \in \{1, \ldots, I - 1\}$, $C^k \preceq C^{k+1}$ means that the following two conditions hold:

(1) $\forall \|t\| \in C^k$, there must be $\|t'\| \in C^{k+1}$ such that $\|t\| \subseteq \|t'\|$;
(2) $\forall \|t'\| \in C^{k+1}$, there must be $\|t\| \in C^k$ such that $\|t\| \subseteq \|t'\|$.

*Remark 7.* It should be noticed that if $C^k$ and $C^{k+1}$ are both partitions, then condition (1) implies condition (2) or condition (2) implies condition (1); it follows that only one of the above conditions is needed. However, since, in the incomplete multiscale information system, $C^k$ and $C^{k+1}$ may be coverings instead of partitions, the above two conditions are needed simultaneously.
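The two conditions can be tested directly on families of supports. The following sketch uses hypothetical coverings `FINE`, `GOOD`, and `BAD` of a four-object universe; the function name `covering_conditions` is likewise an assumption made for illustration.

```python
def covering_conditions(fine, coarse):
    """Evaluate the two hierarchy conditions between two coverings.

    cond1: every fine block is contained in some coarse block.
    cond2: every coarse block contains some fine block.
    """
    cond1 = all(any(f <= c for c in coarse) for f in fine)
    cond2 = all(any(f <= c for f in fine) for c in coarse)
    return cond1, cond2


# Hypothetical coverings of the universe {x1, x2, x3, x4}.
FINE = [{"x1", "x2"}, {"x3", "x4"}]
GOOD = [{"x1", "x2", "x3"}, {"x3", "x4"}]
BAD = [{"x1", "x2", "x3", "x4"}, {"x2", "x3"}]

print(covering_conditions(FINE, GOOD))  # (True, True)
print(covering_conditions(FINE, BAD))   # (True, False)
```

For `BAD`, every fine block sits inside the first coarse block, yet the coarse block `{x2, x3}` contains no fine block. This shows that, for coverings, the first condition alone does not guarantee the second.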

The above two conditions for expressing the hierarchical structure in incomplete multiscale information system are consistent with the basic thinking of surjective function. In other words, in an incomplete multiscale information system , and , a surjective function can be defined as Such surjective function transforms the granulation spaces from a smaller scale to a bigger scale in the incomplete multiscale information system.

*Example 8. *Table 1 shows an example of incomplete multiscale decision system , where , is the set of the condition attributes, and is the decision attribute. The system has three levels of granulations, where “,” “,” “,” “,” “,” “,” “,” and “” stand for, respectively, “good,” “fair,” “bad,” “low,” “medium,” “high,” “yes,” and “no.”

By the descriptor technique mentioned above, it is not difficult to obtain the descriptors and their supports at each level of granulation. The full $AT^1$-, $AT^2$-, and $AT^3$-descriptors are shown in Tables 2, 3, and 4, respectively.

*Definition 9.* Let $S$ be an incomplete multiscale decision system; then $S$ is referred to as $k$-scale consistent if and only if the $k$-scale decision system $S^k$ is consistent; that is, for each $t \in FDES(AT^k)$, there exists a decision class $D_j$ such that $\|t\| \subseteq D_j$; otherwise $S$ is referred to as $k$-scale inconsistent.

Similar to the complete case, the 1-scale consistent incomplete multiscale decision system is referred to as the optimistic consistent incomplete multiscale decision system, while the $I$-scale consistent incomplete multiscale decision system is referred to as the pessimistic consistent incomplete multiscale decision system.

Proposition 10. *Let $S$ be an incomplete multiscale decision system; if $S$ is $k$-scale consistent, then $S$ is also $l$-scale consistent, where $k, l \in \{1, 2, \ldots, I\}$ and $l \leq k$.*

*Remark 11.* It should be noticed that the converse of Proposition 10 does not always hold; that is, if the incomplete multiscale decision system is consistent at a smaller scale, then it is not always consistent at a bigger scale.

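*Definition 12.* Let $S$ be an incomplete multiscale decision system; then, $\forall X \subseteq U$, the $k$-scale lower and upper approximations of $X$ are denoted by $\underline{AT^k}_{DES}(X)$ and $\overline{AT^k}_{DES}(X)$, respectively, where

$$\underline{AT^k}_{DES}(X) = \{\|t\| : t \in FDES(AT^k),\ \|t\| \subseteq X\}, \qquad \overline{AT^k}_{DES}(X) = \{\|t\| : t \in FDES(AT^k),\ \|t\| \cap X \neq \emptyset\}.$$

By Definition 12, the $k$-scale boundary region of $X$ is then

$$BN^k_{DES}(X) = \overline{AT^k}_{DES}(X) - \underline{AT^k}_{DES}(X).$$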
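A minimal sketch of descriptor based approximations is given below. It assumes that the lower approximation, upper approximation, and boundary region are families of descriptor supports, as is common in the descriptor literature; the covering `COVER` and the decision classes are hypothetical.

```python
def descriptor_approximations(cover, target):
    """Lower, upper, and boundary families of supports for a target set."""
    lower = [s for s in cover if s <= target]          # support inside target
    upper = [s for s in cover if s & target]           # support meets target
    boundary = [s for s in upper if s not in lower]    # meets but not inside
    return lower, upper, boundary


# Hypothetical family of supports at one scale and two decision classes.
COVER = [
    frozenset({"x1", "x2"}),
    frozenset({"x2", "x3"}),
    frozenset({"x3", "x4"}),
]
D_YES = {"x1", "x2"}
D_NO = {"x3", "x4"}

LOWER, UPPER, BOUNDARY = descriptor_approximations(COVER, D_YES)
print(LOWER, UPPER, BOUNDARY)
```

The support `{x2, x3}` meets both decision classes without being contained in either, so it lies in the boundary region of each class and this toy system is inconsistent at this scale.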

*Example 13.* Take, for instance, Table 1; the decision attribute partitions the universe into three disjoint decision classes. By Definition 12, we obtain the following 1-scale, 2-scale, and 3-scale rough sets of these decision classes:

(1) *1-scale lower and upper approximations* of the three decision classes;
(2) *2-scale lower and upper approximations* of the three decision classes;
(3) *3-scale lower and upper approximations* of the three decision classes.

[The explicit approximation sets are not reproduced here.]

By the above computations, we can see that Table 1 is inconsistent at each level of granulation, that is, each scale.

Proposition 14. *Let be an incomplete multiscale decision system; then, , one has
*

The results in Proposition 14 are consistent with those in Proposition 5; that is, as the level of granulation varies in the incomplete multiscale decision system, the lower approximations, upper approximations, and boundary regions change monotonically.

##### 3.2. $k$-Scale Decision Rules

The end result of rough set analysis is a representation of the information contained in the data system in terms of “if… then…” decision rules [43, 44]. Since an incomplete multiscale decision system contains a family of systems with different levels of granulation, one can derive decision rules at each scale. For example, suppose that $S$ is an incomplete multiscale decision system; then a $k$-scale decision rule is represented by

$$r : t \rightarrow (d, v),$$

where $k \in \{1, 2, \ldots, I\}$, $t \in FDES(AT^k)$, $v \in V_d$, and $t$ and $(d, v)$ are, respectively, called the condition and decision parts of the rule $r$.

For each $k$-scale decision rule $r$, we associate a quantitative measure, called the certainty of $r$, which is defined by

$$Cer(r) = \frac{|\|t\| \cap D_v|}{|\|t\||},$$

where $D_v = \{x \in U : d(x) = v\}$ is the decision class determined by $v$.

A $k$-scale decision rule $r$ is referred to as certain if and only if $Cer(r) = 1$; a $k$-scale decision rule $r$ is referred to as possible if and only if $0 < Cer(r) < 1$. Similar to the traditional rough set approach, the certain decision rules are supported by the descriptors in the lower approximation while the possible rules are supported by the descriptors in the boundary region. In other words, if $\|t\| \in \underline{AT^k}_{DES}(D_v)$, then $r$ is a certain decision rule; if $\|t\| \in BN^k_{DES}(D_v)$, then $r$ is a possible decision rule.
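The certainty measure and the distinction between certain and possible rules can be sketched as follows. The certainty is taken here as the fraction of the supporting objects that belong to the decision class, which is the usual definition and is an assumption about the paper's formula; the supports in `SUPPORTS`, the class `D_YES`, and the label `"none"` for a rule with no supporting object in the class are hypothetical.

```python
from fractions import Fraction


def certainty(supp, decision_class):
    """Fraction of the supporting objects that belong to the decision class."""
    return Fraction(len(supp & decision_class), len(supp))


def rule_kind(supp, decision_class):
    """Classify a rule as 'certain', 'possible', or 'none' by its certainty."""
    cer = certainty(supp, decision_class)
    if cer == 1:
        return "certain"
    return "possible" if cer > 0 else "none"


# Hypothetical descriptor supports at one scale and one decision class.
SUPPORTS = {
    "t1": {"x1", "x2"},
    "t2": {"x2", "x3"},
    "t3": {"x3", "x4"},
}
D_YES = {"x1", "x2"}

KINDS = {name: rule_kind(s, D_YES) for name, s in SUPPORTS.items()}
print(KINDS)  # {'t1': 'certain', 't2': 'possible', 't3': 'none'}
```

Using `Fraction` keeps the certainty exact, so the comparison with 1 is not affected by floating-point rounding.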

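In an incomplete multiscale decision system $S$,

(1) since $C^k \preceq C^{k+1}$, then, $\forall \|t'\| \in C^{k+1}$, there must be $\|t\| \in C^k$ such that $\|t\| \subseteq \|t'\|$; it tells us that if we have a certain $(k+1)$-scale decision rule such that $t' \rightarrow (d, v)$, then we can also obtain a certain $k$-scale decision rule; that is, $t \rightarrow (d, v)$;
(2) since $C^k \preceq C^{k+1}$, then, $\forall \|t\| \in C^k$, there must be $\|t'\| \in C^{k+1}$ such that $\|t\| \subseteq \|t'\|$; it tells us that if we have a possible $k$-scale decision rule such that $t \rightarrow (d, v)$, then we can also obtain a possible $(k+1)$-scale decision rule; that is, $t' \rightarrow (d, v)$.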

*Example 15.* Following the approximations obtained in Example 13, it is not difficult to derive decision rules at the three levels of granulation in Table 1:

(1) *1-scale decision rules:* (a) *1-scale certain decision rules*, each supported by a descriptor in a 1-scale lower approximation; (b) *1-scale possible decision rules*, each listed with its certainty Cer and supported by a descriptor in a 1-scale boundary region;
(2) *2-scale decision rules:* (a) *2-scale certain decision rules*; (b) *2-scale possible decision rules*, each listed with its certainty Cer;
(3) *3-scale decision rules:* (a) *3-scale certain decision rules*; (b) *3-scale possible decision rules*, each listed with its certainty Cer.

[The individual rules are not reproduced here.]

#### 4. Reductions in Incomplete Multiscale Decision System

##### 4.1. Reducts of Descriptors

*Definition 16.* Let $S$ be an incomplete multiscale decision system, let $t$ be a full $AT^k$-descriptor, let $t'$ be an $A^k$-descriptor, and let $t' \preceq t$; then $t'$ is referred to as a reduct descriptor of $t$ if and only if the following two conditions hold:

(1) $\|t'\| = \|t\|$;
(2) $\|t''\| \neq \|t\|$ for each $t'' \prec t'$.

Here the support of an arbitrary descriptor is defined in the same way as that of a full descriptor. By Definition 16, we can see that a reduct descriptor of $t$ is a minimal conjunction of the atomic properties in $t$ that preserves the support of $t$. The reduct descriptor allows us to classify objects with the smallest number of required atomic properties.

To compute the reduct descriptor of , let us define then is referred to as the discernibility matrix.

Theorem 17. *Let be an incomplete multiscale decision system in which , , and ; then* * for each .*

*Proof.* $\Rightarrow$: Suppose such that ; then by the definition of , we have because . Moreover, since , then ; such a result contradicts the condition .

$\Leftarrow$: Since , then holds obviously. Therefore, it must be proved that . For each , , since ; then there must be such that ; it follows that , from which we can conclude that ; that is, . That completes the proof.

*Definition 18. *Let be an incomplete multiscale decision system in which ; then the discernibility function is defined as

Theorem 19. *Let be a multiscale decision system in which , , and ; then* * is a reduct descriptor of if and only if is the prime implicant of the discernibility function .*

*Proof.* $\Rightarrow$: By Theorem 17, we have for each . We claim that, for each , there must be such that . In fact, if , there exist such that , where ; let , where ; then by Theorem 17 we know , which contradicts the fact that is a reduct descriptor of ; it follows that is the prime implicant of the .

$\Leftarrow$: If is the prime implicant of the , then by Definition 18, we have for each . Moreover, for each , there exists such that . Consequently, and , ; we then conclude that is a reduct descriptor of . That completes the proof.
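The computation of reduct descriptors can be sketched in Python. The sketch rests on an assumption borrowed from the standard descriptor literature: for each object outside the support of a full descriptor, the discernibility entry is taken to be the set of atomic properties that the object violates. Under that assumption, the prime implicants of the discernibility function are the minimal sets of atomic properties that meet every entry. The table `TABLE` and all function names are hypothetical, and the brute-force search is meant for small descriptors only.

```python
from itertools import combinations

UNKNOWN = "*"  # assumed marker for an unknown value


def support(descriptor, table):
    """Objects whose known values agree with every atomic property."""
    return frozenset(
        obj for obj, row in table.items()
        if all(row[a] in (v, UNKNOWN) for a, v in descriptor)
    )


def discernibility_entries(descriptor, table):
    """For each object outside the support, the atomic properties it violates."""
    supp = support(descriptor, table)
    return [
        frozenset((a, v) for a, v in descriptor if table[y][a] not in (v, UNKNOWN))
        for y in table if y not in supp
    ]


def reduct_descriptors(descriptor, table):
    """Minimal subsets of atomic properties that meet every entry."""
    entries = discernibility_entries(descriptor, table)
    reducts = []
    for size in range(len(descriptor) + 1):   # smaller candidates first
        for candidate in combinations(descriptor, size):
            cand = frozenset(candidate)
            meets_all = all(cand & entry for entry in entries)
            minimal = not any(r <= cand for r in reducts)
            if meets_all and minimal:
                reducts.append(cand)
    return reducts


# Hypothetical incomplete table at one scale (not Table 1 of the paper).
TABLE = {
    "x1": {"a": "G", "b": "L", "c": "Y"},
    "x2": {"a": "*", "b": "L", "c": "Y"},
    "x3": {"a": "B", "b": "L", "c": "N"},
    "x4": {"a": "G", "b": "H", "c": "N"},
}
T = (("a", "G"), ("b", "L"), ("c", "Y"))
REDUCTS = reduct_descriptors(T, TABLE)
print([sorted(r) for r in REDUCTS])
```

For this table the full descriptor has support `{x1, x2}` and two reduct descriptors, `(c, Y)` and `(a, G) ∧ (b, L)`; each has the same support as the full descriptor.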

##### 4.2. Lower Approximation and Boundary Region Reduct Descriptors

In Section 3.2, we mentioned that the certain decision rules can be generated from the descriptors in the lower approximation, while the possible decision rules can be generated from the descriptors in the boundary region. Therefore, to obtain simplified decision rules, the concept of reduct can also be introduced into the lower approximation and boundary region.

*Definition 20. *Let be an incomplete multiscale decision system in which , , and ; define
Then is referred to as the lower approximation reduct descriptor of if and only if the following two conditions hold: (1);(2) for each ;

is referred to as the boundary region reduct descriptor of if and only if the following two conditions hold: (1);(2) for each .

By Definition 20, we can see that a lower approximation reduct descriptor of is a minimal conjunction of the atomic properties in , which preserves the inclusion relation between support of and the decision classes; a boundary region reduct descriptor of is a minimal conjunction of the atomic properties in , which preserves the intersection relation between support of and the decision classes.

*Remark 21. *It should be noticed that if , then for each ; it follows that no certain decision rules are supported by the -scale descriptor . Therefore, it is meaningless to compute the lower approximation reduct descriptor of if .

Given an incomplete multiscale decision system $S$, if $\|t\| \in \underline{AT^k}_{DES}(D_v)$, then $\|t\| \subseteq D_v$; we can obtain a certain decision rule $r$ such that $r : t \rightarrow (d, v)$. Moreover, suppose that $t'$ is a lower approximation reduct descriptor of $t$; then $t'$ is a minimal conjunction of the atomic properties in $t$ that preserves the inclusion property; therefore, the decision rule $r$ can be simplified to $r' : t' \rightarrow (d, v)$. The condition part of $r'$ is shortened because $t' \preceq t$. Similarly, by the boundary region reduct descriptor, the possible rules supported by $t$ can also be simplified.

To compute the lower approximation and boundary region reduct descriptor of , let us define