Abstract

Rule extraction is the core task in Rough Set theory. Rule extraction comprises two procedures: attribute reduction and attribute value reduction. It has been proved, from the perspective of computational complexity, that obtaining all reductions, the minimum attribute reduction, and the minimum attribute value reduction are NP-hard problems. Therefore, heuristic reduction methods are generally used to solve attribute reduction and attribute value reduction. However, most heuristic methods are hard to put into practice and incur high computational cost; moreover, some of them extract redundant rules. To obtain a quick and effective model for rule extraction in decision systems, relevant concepts and basic theorems of rule extraction are proposed based on the concept of the distinguishable relation. In order to obtain concise and accurate rules quickly, algorithms for finding the conflict object set, finding the duplicate object set, and finding redundant rules are given. Then, using the decision dependency degree as attribute importance to determine the importance of each attribute in a rule object, a new rule extraction model based on the decision dependency degree is proposed in this paper. Compared with previous models, this model does not generate a matrix; instead, it finds the conflict object set and the duplicate object set by equivalence classes and consequently improves the time performance to , , and . Theoretical analysis and experimental research show that the new model more accurately and effectively reduces redundant data and extracts more concise decision rules from datasets.

1. Introduction

Today, in the ubiquitous and intelligent interconnection environment, after integration, cleansing, and transformation, the data in a dataset (referred to as an information system in this paper) collected from all kinds of perceived data and context data are not equally important [1]. Some data are even redundant, which interferes with policymakers and seriously affects the efficiency of follow-up operations. Hence, it is necessary to perform attribute reduction and attribute value reduction, or similar simplifying processes, on the dataset to reduce redundancy and extract decision rules before executing follow-up operations.

Rough Set is a data induction tool that can effectively process massive data [2, 3]. One of its main advantages is that it does not need any preliminary or additional information about the data, such as a probability distribution over the dataset or membership grades. Without any information other than the dataset itself, Rough Set keeps the original classification information unchanged; starting from the description set of a problem, it obtains the essential features and inherent laws of the problem by deleting redundant information.

Rule extraction is the core task in Rough Set. Through rule extraction, the clarity of the knowledge system can be improved and the decision knowledge underlying the dataset can be discovered. Rule extraction comprises two procedures: attribute reduction [4–13] and attribute value reduction [14–28].

By conducting attribute reduction on the decision system, redundant information can be removed. On the premise that the decision system keeps the same distinguishability, attribute reduction simplifies and clarifies the original system to a certain extent. However, after attribute reduction on the whole system, the system is still not maximally simplified: each rule object may contain redundant information. That is, the condition attribute values of a rule object are not equally important, and not all of them are necessary for extracting decision rules. Therefore, to improve the quality of data reduction, it is necessary to evaluate all condition attribute values of each rule object and remove the redundant ones that have no impact on rule extraction. This process is called attribute value reduction in decision systems. Using attribute reduction and attribute value reduction, small and concise decision rules can be extracted from the original system. The data in the system can then be presented as valuable and straightforward information that is easy to use.

Wong and Ziarko [3] proved, from the perspective of computational complexity, that obtaining all reductions, the minimum attribute reduction, and the minimum attribute value reduction are NP-hard problems. Because of the computational complexity of obtaining core attributes and value-core attributes during attribute reduction and attribute value reduction, and because of the large number of candidates for reduction, combinatorial explosion is likely. Therefore, heuristic reduction methods are generally used to solve attribute reduction and attribute value reduction.

Today, value reduction of decision systems is one of the important topics in the research and application of Rough Set. Researchers have done a great deal of work and gradually put forward various methods, primarily value reduction algorithms based on the discernibility matrix [14–22] and heuristic value reduction algorithms [23–28]. However, when a discernibility matrix is used for value reduction on decision systems with massive data, a huge matrix is required, which leads to a high time-space cost for constructing and traversing the matrix; discernibility matrix methods therefore need further research. Most heuristic methods are hard to put into practice and incur high computational cost. Moreover, some of these methods extract redundant rules, for example, the methods in [23].

To resolve the above problems and reduce the time-space complexity, drawing on traditional reduction algorithms, we propose a new heuristic value reduction and rule extraction model. The model uses an improved chain base sorting method to sort the system and partition it into equivalence classes, and then compares rule objects. Moreover, it uses condition equivalence classes to determine conflict objects and duplicate objects, so the time cost of comparing rule objects is significantly decreased. At the same time, the model uses the theory of the decisive distinguishable matrix (without ever creating the intermediate matrix) to determine the importance of attributes, which effectively controls the complexity. The model performs attribute reduction before value reduction, so after reduction the number of condition attributes will not be very large. Compared with currently known methods, even on massive data, this model can significantly reduce the time-space complexity and improve efficiency.

The term "distinguishable object set" is defined from the distinguishable relation. Based on this concept, theorems on the distinguishable object set and on consistency in decision systems are given. With these theorems, a formalized concept of attribute value reduction, a theorem on the value-core attributes of an object, and formalized concepts of the conflict object set and the duplicate object set during value reduction are defined. Then, algorithms for the conflict object set, the duplicate object set, and redundant rules are proposed, respectively. With the dependency based on the distinguishable relation, an attribute value reduction and rule extraction model is given for decision systems. Finally, the correctness and feasibility of the model are demonstrated by concrete examples and experiments. The model can provide an effective solution for data value reduction and decision rule extraction in all kinds of intelligent networking and intelligent control systems.

2. Attribute Reduction of Decision Systems

Definition 1 (decision system). Let be an information system. U =  is a nonempty finite set of objects, called the universe. A =  is a nonempty finite set of attributes. If A consists of a condition attribute set C and a decision attribute set D such that the two sets satisfy and  = , then S is a decision system, denoted by . When there is only one decision attribute in D, the decision system is commonly denoted by .
In general, the basic structure of decision system S =  is shown in Table 1.

Definition 2 (equivalence class). Let be a decision system. For any , , define the indiscernibility relation as follows:And the equivalence class of object according to attribute set P is defined as follows:The indiscernibility relation corresponds to a partition of U, and the partition is denoted by . In general, is used as the abbreviation.
If , then and are indiscernible by P. contains all the objects that are indiscernible from by attribute set P. That is, an equivalence class is a set of objects in the universe that take the same values on every attribute in P.
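As an illustration of Definition 2, the indiscernibility partition can be computed by grouping objects on their attribute-value tuples. This is a minimal sketch assuming a dictionary-of-dictionaries table representation; the names (`partition`, `table`) are illustrative, not from the paper.

```python
from collections import defaultdict

def partition(universe, attrs, table):
    """Partition `universe` into equivalence classes of IND(attrs).

    `table[x][a]` is the value of attribute `a` on object `x`; objects
    with identical values on every attribute in `attrs` fall into the
    same equivalence class.
    """
    classes = defaultdict(list)
    for x in universe:
        key = tuple(table[x][a] for a in attrs)
        classes[key].append(x)
    return list(classes.values())

# Toy decision table: condition attributes 'a', 'b'; decision 'd'.
table = {
    1: {'a': 0, 'b': 1, 'd': 'yes'},
    2: {'a': 0, 'b': 1, 'd': 'yes'},
    3: {'a': 1, 'b': 0, 'd': 'no'},
}
print(partition([1, 2, 3], ['a', 'b'], table))  # [[1, 2], [3]]
```

Objects 1 and 2 agree on both condition attributes, so they form one equivalence class; object 3 forms its own.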

Definition 3 (attribute reduction of decision system). Let be a decision system, and . If (that is, ), then P is a consistent set of S [28]. Furthermore, if P is a consistent set and any is not a consistent set of S, then P is called as an attribute reduction of S [28].
The attribute reduction of a decision system is a reduction for its condition attributes. The attribute reduction is the smallest subset of the condition attribute set found on the premise of maintaining the original system classification ability.
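A subset P of condition attributes preserves the system's classification exactly when every P-equivalence class carries a single decision value. The following one-pass check is a sketch of that test (illustrative names; the dict-of-dicts table representation is an assumption, not the paper's notation):

```python
def is_consistent_set(universe, P, dec, table):
    """Return True iff every P-equivalence class of `universe` has a
    single decision value, i.e. P keeps the original classification
    ability (cf. Definition 3)."""
    seen = {}  # maps a P-value tuple to the decision value first seen
    for x in universe:
        key = tuple(table[x][a] for a in P)
        if seen.setdefault(key, table[x][dec]) != table[x][dec]:
            return False  # two P-indiscernible objects disagree on dec
    return True

table = {
    1: {'a': 0, 'b': 0, 'd': 'yes'},
    2: {'a': 0, 'b': 1, 'd': 'no'},
    3: {'a': 1, 'b': 0, 'd': 'yes'},
}
print(is_consistent_set([1, 2, 3], ['a', 'b'], 'd', table))  # True
print(is_consistent_set([1, 2, 3], ['a'], 'd', table))       # False
```

A reduction in the sense of Definition 3 is then a consistent set none of whose proper subsets is itself consistent.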

3. Distinguishable Relation and Consistent Reduction with Distinguishability

In order to facilitate subsequent discussion and research, related concepts and theorems are given as follows.

Among them, Definition 4 “Distinguishable Relation of Attribute Set” is discussed from the perspective of distinguishability between objects, while Definition 2 “Indiscernibility Relation and Equivalence Class” in the previous section is discussed from the perspective of indiscernibility between objects.

The definitions “Consistent Reduction with Distinguishability” and “Decisive Distinguishable Matrix” are based on Definition 4, while Definition 3 “Attribute Reduction of Decision System” in the previous section is based on Definition 2.

Based on the distinguishable relation of attribute set, the traditional logical knowledge granulation process related to Definition 3 can be transformed into matrix computing and the rapid reduction of redundant attributes can be realized.

Definition 4 (distinguishable relation of attribute set). Let be a decision system, and . Define the distinguishable relation of as follows: is used as the distinguishable relation of P.
If , then and are distinguishable by P. That is, there is a condition attribute, and the attribute values of the two objects are different.
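Checking whether a pair of objects belongs to the distinguishable relation of P is a direct test on their attribute values. A minimal sketch (illustrative names, same dict-based table assumption as elsewhere):

```python
def distinguishable(x, y, attrs, table):
    """(x, y) is in DIS(attrs) iff at least one attribute in `attrs`
    takes different values on x and y."""
    return any(table[x][a] != table[y][a] for a in attrs)

table = {
    1: {'a': 0, 'b': 1},
    2: {'a': 0, 'b': 0},
}
print(distinguishable(1, 2, ['a'], table))       # False
print(distinguishable(1, 2, ['a', 'b'], table))  # True
```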

Definition 5 (distinguishable unit set of attribute set). Let be a decision system, and . If , then is called a distinguishable unit. The distinguishable unit set of P is denoted by .
contains all the units that can be distinguished by P.

Definition 6 (distinguishable unit set of attribute). Let be a decision system, and . For any attribute , let is used as the distinguishable relation of . The distinguishable unit set of is denoted by , which contains all the units that can be distinguished by .
Based on Definition 4, Definition 5, and Definition 6, Definition 7 “Consistent Reduction with Distinguishability” is given as follows.

Definition 7 (consistent reduction with distinguishability). Let be a decision system, and . If , then P is a consistent attribute set with distinguishability of S [28]. For any and , if and , then P is a consistent reduction with distinguishability of S [28].
Under the premise of keeping the system classification ability unchanged, a consistent reduction with distinguishability can be achieved after deleting the irrelevant or unimportant knowledge.

Definition 8 (decisive distinguishable matrix). Let , and . For any attribute , let is used as the kth column vector of the decisive distinguishable matrix , which contains rows and m columns.
is the intersection of and . Given a decision system , and , is used as the decisive distinguishable matrix of .

Definition 9 (core with distinguishability). For a decision system , which has r reductions, , a core set with distinguishability of S is defined as follows:In the process of attribute reduction, the core cannot be deleted from the attribute set. Because the distinguishability of the core cannot be replaced by other attributes, the core is indispensable in expressing the distinguishability of the system. The distinguishability of other noncore attributes can be replaced by other attributes.

Theorem 1. Let be a decision system, and . The following propositions are equivalent:(1)S is a consistent decision system(2)(3)

Proof. (2) (3). Evidently, is equivalent to . That is, for any , if , then .
According to Definition 5 and Definition 6, if , then and are indistinguishable by C. That is, . Similarly, if , then and are indistinguishable by D. That is, .
Thus, is equivalent to any , if , then . That is, . Therefore, (2) (3).
Similarly, it can be proved that (1) (2).

Theorem 2. Let be a decision system, and . has the following properties.(1)If , then (2)

Proof. The proof of Theorem 2 is similar to that of Theorem 2 in [27].

Theorem 3. Let be a decision system, and , then .

Proof. According to Definition 8, for any , , and . Thus, according to property 2 in Theorem 2, .

Theorem 4. Let be a decision system; then, consistent reduction with distinguishability must exist.

Theorem 5. Let be a decision system. In decision system S, consistent set and consistent attribute set with distinguishability are equivalent.

Theorem 6. Let be a decision system. In decision system S, attribute reduction and consistent reduction with distinguishability are equivalent.

Theorem 7. Let be a decision system, and . The following propositions are equivalent.(1)P is a consistent attribute set with distinguishability(2)(3)

Theorem 8. Let be a decision system, and . The following propositions are equivalent.(1)P is a consistent attribute set with distinguishability(2), and (3), and , is not true
The proofs of the above theorems are similar to those of Theorems 11 to 15 in [29] and are not elaborated here.

4. Decision Dependency Degree Based on Distinguishable Relation

Based on the distinguishable relation, Definition 10 and Definition 11 are proposed as follows. Definition 10 “Dependency Based on Distinguishable Relation” is used to determine whether there is a dependency relationship between the attribute sets and . And Definition 11 “Dependency Degree based on Distinguishable Relation” is used to assess the extent of the dependency or replacement relationship that exists between and .

Definition 10 (dependency based on distinguishable relation). Let be a decision system, and . Then, the definition “dependency based on distinguishable relation” is defined as follows:(1)Attribute set depends on attribute set based on distinguishable relation iff . This decision dependency can be denoted as .(2)Attribute set is equivalent to attribute set based on distinguishable relation iff and . This equivalent relationship can be denoted as . It is obvious that , iff .

Definition 11 (distinguishable relation-based dependency degree). Let be a decision system, and ; the decision dependency degree is denoted asThen, the extent to which depends on attribute set based on the distinguishable relation is , called . That is, depends on in decision dependency degree k, or can be substituted by in decision substitution degree, which is denoted as .(1)If , then does not depend on based on the distinguishable relation, or can be completely replaced (substituted) by (2)If , then partly depends on based on the distinguishable relation, or can be partly replaced (substituted) by (3)If , then completely depends on based on the distinguishable relation, or there is no decision substitution relationship between and 
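The dependency degree formula above did not survive extraction. One plausible form, consistent with the distinguishable unit sets of Definitions 5 and 6, measures what fraction of the units distinguishable by one attribute set are also distinguishable by the other. The following is a sketch under that assumption only; the paper's actual formula may differ.

```python
def dependency_degree(units_p1, units_p2):
    """Assumed sketch: the degree to which P1 can cover P2's
    distinguishability, computed as the fraction of P2's
    distinguishable units that P1 also distinguishes.
    Units are represented as sets of object pairs."""
    if not units_p2:
        return 1.0  # nothing to cover
    return len(units_p1 & units_p2) / len(units_p2)

u1 = {(1, 2), (1, 3)}
u2 = {(1, 2), (2, 3)}
print(dependency_degree(u1, u2))  # 0.5
```

A value of 0 then matches case (1) of Definition 11 (no dependency), values strictly between 0 and 1 match case (2), and 1 matches case (3).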

Theorem 9. Let be a decision system, , , and ; then, the following propositions are true:(1)Decision dependency degree is 0 if (2)Decision dependency degree is 1 if or

Proof. (1)According to Theorem 3, it is obvious that . Thus, is 0 if .(2)According to Definition 11 and Theorem 3, it is obvious that is 1 if or .

Theorem 10. Let be a decision system, , , and ; is inversely proportional to decision dependency degree and is proportional to decision substitution degree.

Proof. According to Definition 11 and Theorem 3, it is obvious that is inversely proportional to decision dependency degree and is proportional to decision substitution degree.
With the above analysis, can well reflect the decision dependency and decision substitution relationship between and . Due to the denominator being a fixed value and inversely proportional to , the decision dependence degree of P can be directly calculated by .

5. Value-Core Attribute Set of Object and Attribute Value Reduction Set

Performing reduction on the decision system, a simplified attribute set can be achieved without impacting the distinguishability of the original system. However, for each object instance in the decision system after reduction, there still exists an attribute redundancy problem. Here, obtaining minimum attribute reduction for each object instance is equivalent to value reduction.

Because the distinguishability of core attributes in the decision system cannot be substituted during attribute reduction, core attributes must not be removed; the other attributes can be replaced. Similar to the core attributes in attribute reduction, there are value-core attributes in attribute value reduction. The term “value-core attribute” is defined for value reduction in Rough Set.

After performing attribute reduction and deleting duplicate object instances, no duplicate object instances remain in the system, and each object instance corresponds to a rule. These object instances are named “rule objects” (or simply “rules”) in this paper. At this point, a rule object has little clustering power, which is not conducive to matching object instances. To improve the clustering ability, similar to attribute reduction, the distinguishability of each condition attribute of each rule object should be evaluated. Rules can be simplified and generalized by removing redundant, non-value-core condition attributes, which improves the ability of the rules to match instances.

Given a decision system , and . If is a consistent decision system, then in , and , after removing from , the consistency of may change. There are three possible scenarios:

(1) After deleting from , in the , , the remaining condition attribute values of and are the same while the decision attribute values are not; that is, a decision conflict occurs. In this case, is a value-core attribute of and should not be deleted, so has to be recovered.

Similarly, if is removed from , then in the , , a decision conflict is surely found between and . In this case, is also a value-core attribute of and should not be deleted, so has to be recovered.

This shows that in system , , if is removed, then has a decision conflict with other objects; that is, the consistency of the system changes after removing , which makes an inconsistent system. In this case, is a value-core attribute of and of the other objects in conflict with . can avoid decision conflicts between and other objects, so it should not be removed from those objects.

(2) After deleting , in the , , both the condition attribute values and the decision attribute values of and are the same; that is, no decision conflict is generated, but duplicate objects arise. In this case, is not a value-core attribute of , and its deletion will not affect the decision of the new duplicate object, so can be removed.

Similarly, if is removed from , then in the , , both the condition attribute values and the decision attribute values of and are the same; that is, is a new duplicate of . In this case, is not a value-core attribute of , so can be removed.

This shows that in system , , if is removed and new objects arise that duplicate , then is not a value-core attribute of any of the duplicate objects, including , so it can be removed.

(3) After deleting , in the , there is no decision conflict and there are no duplicate objects. In this case, the existing information cannot determine whether deleting of will affect the decision consistency of the system; additional information is needed to decide how to proceed with .

Combined with above analysis, definitions “Distinguishable Set of Object”, theorems of “Distinguishable Set of Object” and “System Consistency” are given. With the theorems, formalized definitions “Attribute Value Reduction,” “Value-core Attribute of Object,” “Conflict Object Set,” and “Duplicate Object Set” during value reduction are proposed.

Definition 12 (distinguishable set of object). Let be a decision system, and . For any , let is used as the distinguishable set of . Object belongs to distinguishable unit which is generated by in universe. Hereby, can also be defined asLet be a decision system and be a decision system after attribute reduction where is an attribute reduction of S. For , -1 is true.
It shows that after reduction, derived from attribute reduction , for , the distinguishable set of is . That is, after reduction, for , can be distinguishable with all the other objects in by .

Theorem 11. Let be a decision system, and . The following propositions are equivalent:(1) is a consistent decision system(2)(3)For ,

Proof. According to Theorem 1, it is obvious that (1) (2).

(2) (3). According to Theorem 1, is equivalent to , that is, is equivalent to , (namely, ). As and , is equivalent to , that is equivalent to . Thus, for , is equivalent to . Therefore, is equivalent to , .

From the above theorems, if the objects that can be distinguished from by the condition attributes are at least as many as those that can be distinguished from by the decision attributes, the system is a consistent decision system. That is, only when all the distinguishable objects deduced by condition attribute set P contain those deduced by decision attribute set D is the system consistent. In other words, if all the objects in the system satisfy the condition , the system is a consistent decision system.

Theorem 12. Let be a decision system, and . The following propositions are equivalent:(1)For , (2)For , (3)For ,

Proof. (1) (2).

According to Theorem 1, it is obvious that and is equivalent to , which is a consistent decision system; that is, is true.

According to Theorem 1, is equivalent to , and is equivalent to , (that is, ). As is equivalent to , , is equivalent to . Thus, is equivalent to and .

Because and , is equivalent to . That is, is equivalent to . Thus, , is equivalent to , and .

Therefore, , is equivalent to , and .

(2) (3).

It is obvious that . According to Theorem 2, is true. Thus, is equivalent to .

Theorem 13. Let be a decision system, and . The following propositions are equivalent: (1) is an inconsistent decision system (2) , (3) , .

Proof. (1) (2).
According to Theorem 11, it is obvious that is a consistent decision system, which is equivalent to and . Then, is an inconsistent decision system, which is equivalent to .
According to Theorem 12, it is obvious that , is equivalent to , and :Because and ,Because is equivalent to , according to the definition and properties of equivalence classes, is true, so the reasoning process is as follows: is an inconsistent decision system .
Thus, Proposition (1) is equivalent to Proposition (2).
(1) (3).
Because is equivalent to , according to the definition and properties of equivalence class, is true, so the reasoning process is as follows:Alternatively, according to Theorem 1, Theorem 2, and Theorem 12, the reasoning process is as follows:
is a consistent decision system:Thus, is an inconsistent decision system:

Definition 13 (value-core attribute set of object). Let be a consistent decision system, if , , and satisfy , then attribute set is called value-core attribute set of and is called non-value-core attribute set of .
The distinguishability of the value-core cannot be replaced by other condition attributes, while the distinguishability of non-value-core attributes can be replaced by other attributes. The rule can be simplified and generalized by evaluating and deleting redundant non-value-core attributes.

Definition 14 (attribute value reduction set). Let be a consistent decision system; if , , and satisfy and satisfy for , then P is called an attribute value reduction set of .
Based on the above definitions and theorems, it can be proved that the distinguishability of and the original consistency of S will not change based on the attribute value reduction set P only, and no objects can be found with the same condition attribute values and different decision attribute values with . For any , if attribute set arises and meets the conditions , then the process is finished.

Theorem 14. Let be a decision system, , if has r value reduction sets, , then value-core attribute sets and attribute value reduction sets of satisfy (comply with) the following formula:The value-core attribute set is the intersection of all value reductions, and all attributes in it are essential and indispensable for every value reduction.
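Theorem 14 states that the value-core attribute set is the intersection of all value reduction sets. Given the r reductions as collections of attribute names, this is a short computation (an illustrative sketch; the name `value_core` is not from the paper):

```python
def value_core(reductions):
    """Value-core attribute set of an object: the intersection of all
    of its value reduction sets (Theorem 14)."""
    core = set(reductions[0])
    for r in reductions[1:]:
        core &= set(r)  # only attributes present in every reduction survive
    return core

print(value_core([['a', 'b'], ['a', 'c']]))  # {'a'}
```

Here 'a' appears in both reductions, so it is indispensable; 'b' and 'c' are non-value-core attributes that can be substituted for each other.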

6. Algorithm for Conflict Object Set and Duplicate Object Set

Based on Theorem 11, Theorem 12, and Theorem 13, the definitions of “Conflict Object Set” and “Duplicate Object Set” are given as follows.

Definition 15 (conflict object set). Let be a consistent decision system, be an attribute reduction of S, and be the decision system after reduction. If , , and satisfy (or ), then the object set () is called the conflict object set of P.
The condition shows that is not equal to . Due to the lack of in P, becomes a conflict object set. That is, satisfies () and (). In other words, and are indistinguishable by P alone, but and are distinguishable by . In short, is an inconsistent decision system: it contains conflict objects that have the same condition attribute values but different decision attribute values, and these conflict objects cause the inconsistency of .

Definition 16 (duplicate object set). Let be a consistent decision system, be an attribute reduction of S, and be the decision system after reduction. If , , and satisfy (or ) and (or or ), then the object set () is called the duplicate object set of after reduction.
The conditions and show that is true by P or by . Due to the lack of in P, becomes an indistinguishable object set without any conflict. That is, , which satisfies (), (), and (). In other words, and are indistinguishable by P or , but and are distinguishable by . In short, is a consistent decision system in which duplicate objects exist with the same condition attribute values and the same decision attribute values; the consistency of is not changed, while the distinguishability of is changed by the duplicate objects.
Based on the above formal analysis and definitions (especially, Definitions 15 and 16), an algorithm for conflict object set and duplicate object set (ConDup algorithm) is designed in Algorithm 1.
In Algorithm ConDup, denotes the current condition equivalence class, C is used to count the base of , and D is used to count the number of decision values in .
Steps 3–30 are the main part of the algorithm. Steps 5–20 calculate the conflict object set and duplicate object set of . The condition “” shows that the base of is greater than 1 and all the objects in have the same decision attribute values; namely, is not a conflict equivalence class but a duplicate equivalence class, so all objects in should be added into . If the condition “” is not true while “” is true, the base of is greater than 1 and the objects in have different decision attribute values; namely, is not a duplicate equivalence class but a conflict equivalence class, so all objects in should be added into . In addition, steps 21–24 judge whether and are duplicate objects, and steps 25–29 judge whether and are conflict objects.
The time complexity of is mainly decided by steps 3–30. Generally, is 1 or far less than . So, the time complexity is  = .

Input: a sorted decision system
Output: a duplicate object set and a conflict object set
(1)Initialize arrays C and D to 1
(2)Let pointer s point to
(3)for each do
(4) = 0
(5)for each do
(6)  if , then
(7)   if , then
(8)    
(9)   end
(10)   if , then
(11)    
(12)   end
(13)   j = j + 1
(14)    = 1
(15)   Let pointer s point to
(16)   break
(17)  else
(18)   i = i + 1
(19)  end
(20)end
(21)if Flag == , then
(22)  
(23)  j = j + 1
(24)end
(25)if Flag == , then
(26)  
(27)  
(28)  j = j + 1
(29)end
(30)end
(31)Output and
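The pointer-based pseudocode above assumes a pre-sorted system. The same classification can be sketched in Python by grouping the universe into condition equivalence classes directly; the dictionary grouping and the names are illustrative, not the paper's pointer walk, but the decision rule per class is the same one ConDup applies.

```python
from collections import defaultdict

def con_dup(universe, cond, dec, table):
    """Sketch of the ConDup idea: each condition equivalence class of
    size > 1 contributes either duplicate objects (a single decision
    value) or conflict objects (several decision values)."""
    classes = defaultdict(list)
    for x in universe:
        classes[tuple(table[x][a] for a in cond)].append(x)
    dup, conf = [], []
    for objs in classes.values():
        if len(objs) > 1:
            decisions = {table[x][dec] for x in objs}
            (dup if len(decisions) == 1 else conf).extend(objs)
    return dup, conf

table = {
    1: {'a': 0, 'd': 'yes'},
    2: {'a': 0, 'd': 'no'},   # conflicts with object 1
    3: {'a': 1, 'd': 'yes'},
    4: {'a': 1, 'd': 'yes'},  # duplicates object 3
}
print(con_dup([1, 2, 3, 4], ['a'], 'd', table))  # ([3, 4], [1, 2])
```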

7. Algorithm for Redundant Rule Object Set

After performing value reduction, some rule objects may have the same decision attributes. Moreover, the attribute value set of one object may be a subset of another object's; namely, an inclusion relation may exist among these rule objects. In that case, the longer rule objects are redundant and should be removed from the current system.

With the removal of the redundant rules, the rule generalization can be promoted in the system and a more simplified system can be achieved.

Based on the above analysis, for , the condition that rule object is a redundant rule is that at least one rule object exists, which complies with the following conditions:

(1) and belong to the same decision equivalence class.

(2) and have the same values on the intersection of their condition attributes.

(3) The rule length of (that is, the number of 's condition attributes that retain a value) is less than 's.

If and satisfy condition (3), a proper relationship exists between the attribute value set of and that of . If and satisfy all three conditions, then is a redundant rule and should be added into the redundant rule object set .

In [29], the algorithm first sorts objects based on and then adds the first object of each consistent condition equivalence class into to obtain . Based on and the characteristics of redundant rules, an algorithm for the redundant rule object set (RedRul algorithm) is designed in Algorithm 2.

Input: marked decision system
Output: redundant rule object set
(1)Min_rule = 
(2)for each do
(3) Flag = 0
(4)for each do
(5)  if then
(6)   Min_rule = 
(7)   Flag = 1
(8)   Break
(9)  else
(10)   
(11)  end
(12)end
(13)if Flag== Min_rule and have the same exact values among their condition attribute set then
(14)  if the rule length of is less than Min_rules then
(15)   Min_rule ⟶ 
(16)   Min_rule = 
(17)  else
(18)   
(19)  end
(20)end
(21)if Flag== Min_rule and do not have the same exact values among their condition attribute set then
(22)  Min_rule = 
(23)end
(24)
(25)end
(26)Output
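The three redundancy conditions of Section 7 can be sketched as a pairwise test. Here a rule is represented as a (condition-values, decision) pair in which deleted attribute values are simply absent; this representation and the function name are assumptions for illustration, and condition (2) is realized as set inclusion of the shorter rule's value pairs.

```python
def is_redundant(rule, other):
    """True iff `rule` is redundant with respect to the shorter rule
    `other`: same decision (1), agreement on the shared condition
    attributes, expressed as inclusion (2), and `other` strictly
    shorter (3)."""
    cond, dec = rule
    o_cond, o_dec = other
    if dec != o_dec:                                   # condition (1)
        return False
    if not set(o_cond.items()) <= set(cond.items()):   # condition (2)
        return False
    return len(o_cond) < len(cond)                     # condition (3)

long_rule = ({'a': 0, 'b': 1}, 'yes')
short_rule = ({'a': 0}, 'yes')
print(is_redundant(long_rule, short_rule))  # True
print(is_redundant(short_rule, long_rule))  # False
```

RedRul avoids running this quadratic test naively by keeping a running shortest candidate (Min_rule) per decision equivalence class.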

8. Rule Extraction Model against Decision Dependency Degree

8.1. Rule Extraction Model

As the core problem in Rough Set, value reduction and rule extraction algorithms have been studied by many scholars, who have achieved substantial results [14–28].

An analysis of most existing algorithms shows that some are unreasonable in how they select and delete condition attributes, and some have overly complicated calculation processes. These algorithms have high time and space complexity because they construct a discernibility matrix or discernibility function. Moreover, most algorithms are based on the algorithms in [23], whose basic principle is as follows: when performing value reduction on a decision system on which attribute reduction has been completed, each row of the system is treated as one decision rule object. For each rule object, its attributes are deleted one by one, and after each deletion a decision-conflict check verifies whether the deletion is valid. Repeating this principle eventually yields the value reduction.

Compared with the previous models, the rule extraction model designed in this paper does not generate a matrix. Its basic idea is as follows: while seeking the minimum reduction, the decision dependency degree is computed and the value-core attributes that have the greatest impact on decision making are obtained. After the value-core attributes of an object are obtained, the decision dependency degree is used as attribute importance information to rank the remaining attributes. Under the premise of not causing a decision conflict, that is, of maintaining system consistency, the subsequent attribute values are evaluated by the decision dependency degree. Repeating this procedure yields a more refined decision rule set (abbreviated as ) and completes the rule extraction process.

Based on the above basic idea, a rule extraction model based on the decision dependency degree ( Algorithm) is designed as follows.

8.2. Related Algorithms

Let be a decision system, . Several algorithms from [29] will be used in the rule extraction model proposed in this paper. Here is a brief introduction.

Generally, a sorting algorithm implements sorting by comparing and moving keys, and the average time complexity of such a comparison-based process is at best . For example, Liu Shao-Hui et al. [30] use quick sort on the condition attribute set C to partition the universe into equivalence classes, which results in time complexity for their reduction model. However, radix sort does not require pairwise key comparisons. A universe sorting method based on static-chain radix sorting is given in [29]; it has time complexity (normally equals 1 or is far smaller than in practice).
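The distribution-and-collection idea behind radix-style universe sorting can be sketched as follows; this is a simplified list-based version of the static-chain method in [29], with illustrative names and the assumption of small integer-coded attribute values.

```python
# A minimal sketch of radix-style universe sorting: objects are "distributed"
# into buckets keyed by one attribute's value, then "collected" in bucket
# order, repeating over the attributes from last to first. No pairwise key
# comparisons are needed; each attribute costs one linear pass.
def radix_sort_universe(objs, attrs):
    """objs: list of dicts attr->small int value; passes are stable."""
    for a in reversed(attrs):          # least significant attribute first
        buckets = {}
        for o in objs:                 # distribution
            buckets.setdefault(o[a], []).append(o)
        objs = [o for v in sorted(buckets) for o in buckets[v]]  # collection
    return objs
```

Because each pass is stable, the final order is lexicographic over the attribute list, which is what groups the universe into equivalence classes.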

In [29], the algorithm extracts the first element of each equivalence class of the universe. During this extraction over the whole universe, it preserves the consistency of the system and has time complexity . Working with the sorting method , the algorithm can effectively reduce the size of the universe from to .
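On a universe already sorted by the attribute set, extracting the first object of each equivalence class is a single linear scan, which is where the stated time complexity comes from. A minimal sketch, with illustrative names:

```python
# Consecutive objects with equal attribute values form one equivalence class
# in a sorted universe, so keeping the first object of each run deduplicates
# the universe in one pass.
def first_of_each_class(sorted_objs, attrs):
    result, prev_key = [], object()    # sentinel that equals nothing
    for o in sorted_objs:
        key = tuple(o[a] for a in attrs)
        if key != prev_key:            # start of a new equivalence class
            result.append(o)
            prev_key = key
    return result
```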

In [29], the algorithm is used to obtain of P. When obtaining , it applies the idea of Definition 9 (corresponding to Definition 8 in this paper) and does not create a decision distinguishability matrix as an intermediate result; hence, the complexity is kept at (Algorithm 3).

Input: decision system
Output: the decision rule set
(1)Call DecDep_Red reduction model in [29] to get ; initialize rule set
(2)Call RadSort and RedSet (, ) in [29] to get
(3)for each do
(4) Let ; call RadSort and DecDep_Deg in [29] to get of P
(5)end
(6)Add relevant attributes into Array in descending order according to the achieved values
(7)for each do
(8) Let and call RadSort
(9) Call ConDup to get and
(10) Execute a mark procedure to get
(11)end
(12)for each do
(13)if , then
(14)  For all attributes of , add attributes with original value to P, add attributes with “!” to
(15)end
(16)if , then
(17)  ; Restore to its original value and update
(18)end
(19)for each do
(20)  
(21)end
(22) Call RadSort and get
(23)while
(24)  Restore to its original value
(25)  Update P and
(26)  Call RadSort and update
(27)end
(28) Replace each “!” with “” for all attribute value of ; and update
(29)end
(30)Call RedSort
(31)Call RedSet , delete duplicate rules, and delete rules only with “” to update of
(32)Call RedSort
(33)Call RedRul to get
(34)For  = , update − 
(35)Delete “” in each rule of to get

In [29], the attribute importance reduction model based on the decision dependency degree is given together with Theorems 11 to 15 (corresponding to Theorems 4 to 8 in this paper). The model uses a bottom-up method, starting from the core and using the decision dependency degree as heuristic information. The time complexity of the model is .
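For orientation, the classical rough-set dependency degree, on which heuristics of this kind are built, can be sketched as follows. The exact "decision dependency degree" of [29] may differ in detail from this classical gamma measure; the sketch only illustrates the general idea of scoring an attribute set by how decision-consistent its equivalence classes are.

```python
from collections import defaultdict

# Classical dependency degree gamma_P(D) = |POS_P(D)| / |U|: the fraction of
# objects whose P-equivalence class contains only one decision value.
def dependency_degree(rows, P):
    """rows: list of (dict attr->value, decision)."""
    classes = defaultdict(list)
    for cond, dec in rows:
        classes[tuple(cond[a] for a in P)].append(dec)
    consistent = sum(len(decs) for decs in classes.values()
                     if len(set(decs)) == 1)   # pure classes = positive region
    return consistent / len(rows)
```

Grouping by equivalence class keeps this linear in the number of objects, mirroring the paper's point that no distinguishability matrix is needed.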

8.3. Analyzing the Rule Extraction Model Based on Decision Dependency

Let be a decision system, . The steps of the model are analyzed in their order of execution as follows.

Firstly, the attribute reduction model defined in [29] is called in step 1 to complete the attribute reduction and obtain the reduced decision system . After that, in step 2, and in [29] are called to complete the extraction over the global universe. With the redundant objects deleted, the new system is obtained.

Meanwhile, based on the decision dependency degree, steps 3–5 compute of the attribute set for every , where measures the distinguishability of P. Since the distinguishability of is constant, a smaller means that the distinguishing power of P without becomes weaker; that is, the smaller is, the stronger the distinguishability of is. In step 6, the count of is , and the relevant attributes are added to in descending order of the achieved values.

Steps 7–11 mark the system. For each object in the system, its useless attributes are marked as redundant with “,” its value-core attributes, whose absence would lead to a decision conflict, keep their original values, and the rest are marked with “!” for further assessment on whether to keep or remove them. In step 8, is called to sort all rule objects by , which makes step 9 efficient. In step 9, is called; it determines whether the current equivalence class is a conflict equivalence class or a duplicate equivalence class and obtains the conflict object set and the duplicate object set . After that, the marking procedure is executed in step 10: it replaces the value of with “” for objects in , keeps the original value of for objects in , and replaces the value of with “!” for objects in .
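The conflict/duplicate classification performed in step 9 can be sketched as follows, again under an assumed (condition, decision) representation; the helper name is illustrative, not the ConDup interface of the paper.

```python
from collections import defaultdict

# Split a universe into a conflict object set and a duplicate object set using
# equivalence classes on attribute set P: a class holding more than one
# decision value is a conflict class; a consistent class with more than one
# object contributes duplicates (every member after the first).
def conflict_and_duplicates(rows, P):
    """rows: list of (dict attr->value, decision); returns index sets."""
    classes = defaultdict(list)
    for idx, (cond, dec) in enumerate(rows):
        classes[tuple(cond[a] for a in P)].append((idx, dec))
    conflict, duplicate = set(), set()
    for members in classes.values():
        if len({dec for _, dec in members}) > 1:
            conflict.update(idx for idx, _ in members)
        else:
            duplicate.update(idx for idx, _ in members[1:])  # keep the first
    return conflict, duplicate
```

On a sorted universe the same classification can be done in one linear scan over adjacent objects, which is why step 8's sort makes step 9 efficient.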

In steps 12–29, for each object, the non-value-core attributes marked with “!” are evaluated to decide whether to keep or remove them. The attribute set stores the attributes marked with “!,” and the relevant attributes are added to in descending order of the values; is the sequence number of the relevant attribute for each object. In step 22, is called. Since the system is sorted by , where the attribute set P of keeps its original values, the condition equivalence class of is obtained easily and quickly. Within this condition equivalence class, the objects that share the condition attribute values of P with but have different decision attribute values are counted to obtain the decision conflict set of . In steps 23–27, the attribute of with the minimum is restored to its original value, where is the sequence number of that attribute for . After that, the attribute set P is updated by adding that attribute and the attribute set is updated by deleting it. is then called to sort the unmarked system by and to update the conflict set of . When the loop finishes, is empty; at that point, the decision of can be determined by all of its attributes that keep their original values. Finally, in steps 28-29, every remaining “!” of is set to “” to obtain the new marked system .
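The restore loop of steps 23–27 can be sketched as follows. All names are illustrative: `core` stands for the object's value-core attributes, `candidates` for its "!"-marked attributes in importance order, and the conflict check is recomputed from scratch here rather than maintained incrementally via sorting as in the paper.

```python
# Restore "!"-marked attributes one by one, in importance order, until the
# object's decision conflict set becomes empty.
def restore_until_consistent(obj, dec, rows, core, candidates):
    """obj: dict attr->value; rows: list of (dict attr->value, decision)."""
    P = list(core)
    def conflicts():
        # objects matching obj on P but carrying a different decision
        return [d for cond, d in rows
                if d != dec and all(cond[a] == obj[a] for a in P)]
    for a in candidates:
        if not conflicts():
            break
        P.append(a)                    # restore this attribute's value
    return P
```

The returned attribute set P is exactly the set whose original values suffice to determine the object's decision, matching the loop's exit condition.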

Algorithm is called in step 30 to sort the universe of the marked system by . is called in step 31 to delete duplicate decision rules and rules consisting only of “” and then to update the universe of the marked system .

uses the two operations “distribution” and “collection” from [29] to sort the objects in the universe by an attribute set. In short, it mainly adopts radix sorting, in which “distribution” in effect finds equivalence classes. Considering the characteristics of redundant rules, if the rule objects in the universe can be aggregated by the decision attribute after sorting, that is, if rule objects with the same decision value are adjacent, then searching for and deleting redundant rules becomes easy. Therefore, all the rule objects can first be “distributed” and “collected” by the condition attributes and then “distributed” and “collected” by the decision attribute, so that the whole universe is sorted by decision equivalence class and each decision equivalence class is sorted by condition equivalence class.
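The net effect of that ordering (decision attribute as the most significant key, condition attributes below it) can be sketched with a plain key-based sort; the names are illustrative, and missing (reduced-away) condition values are assumed to be integer-coded so a placeholder can stand in for them.

```python
# Sort rules so that same-decision rules are adjacent, grouped internally by
# their condition values; redundant-rule search then becomes a local scan.
def sort_for_redundancy_search(rules, cond_attrs):
    """rules: list of (dict attr->value, decision); integer-coded values
    assumed, with -1 standing in for reduced-away attributes."""
    return sorted(rules, key=lambda r: (r[1],
                  tuple(r[0].get(a, -1) for a in cond_attrs)))
```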

To this end, is called in step 32 to sort the rule objects of the current universe by (not by as in [29]). That is, when sorting by , for any , if columns 0 to m − 2 of are the condition attributes after reduction and column m − 1 is the decision attribute, then i takes values from 0 to m − 1 in ascending order. Since performs radix sorting by the condition attributes first, followed by the decision attribute, rule objects with the same decision value are gathered into groups, so the decision equivalence classes are obtained easily. Therefore, it is easy to determine the inclusion relationships among rule instances.

The importance of is evaluated by , and attributes with smaller are given preference when changing from “!” back to the original value. Likewise, when sorting rule objects by , it is reasonable to sort the attributes with smaller later. As a result, after the universe is sorted, the attributes of rule objects that keep their original values appear in the leading columns (the leading-column attributes have stronger distinguishability). It is then convenient to compare, search, and locate multiple redundant rule objects among which inclusion relationships exist.

After the whole universe is sorted by with , the rule objects are aggregated by decision equivalence class and condition equivalence class. In step 33, is called to obtain the redundant rule set from . Finally, in step 34, the difference between the rule object set and the redundant rule set is the nonredundant rule set. After deleting “” in this nonredundant rule set, the reduced rule set is finally obtained.

8.4. Analysis of Model Time Complexity

Step 1 calls the attribute reduction model in [29], whose time complexity is . The remaining steps, that is, steps 2–35, constitute the attribute value reduction model.

Specifically, the time complexity of step 2 is , and that of steps 3–5 is . The time complexity of is mainly determined by steps 3–30. Generally, is 1 or far smaller than , so the time complexity is . The time complexity of steps 7–11 is mainly determined by steps 8 and 9 and is therefore . The time complexity of steps 12–29 is mainly determined by steps 22–27 and is therefore . Because P is the set of attributes of object that keep their original values and  = , the time complexity of steps 12–29 is . The time complexity of step 30 is , that of step 31 is , and that of steps 32–34 is .

Given the above analysis, time complexity of steps 2–35 is , , and .

As is and is , and . That is, time complexity of the rule extraction model is , , , ,  = , ,  = , , .

9. Simulation Example Analysis

Let the decision system be the reduction result after performing steps 1-2, where , , and . The rule extraction procedure is illustrated in detail with the demo in Figure 1(a) (the rule objects in the table have been processed by “sorting the whole universe U by ” and “full-universe extraction”).

Firstly, for each is obtained by completing steps 3–5. The results are as follows:  = 151 + 26 + 21 ∗ 2 = 219,  = 152 + 26 + 21 ∗ 2 = 220, = 149 + 24 + 21 + 19 = 213, and  = 146 + 25 + 21 + 20 = 212. Then, in step 6, the order of the attributes in Array (in descending order of attribute importance) is . Steps 7–11 are performed for every . In step 8, is called to sort the universe by . When , the results are shown in Figure 1(b).

In step 9, is called to obtain the conflict object set and the duplicate object set , giving and  =  . After the value column of a is marked in step 10, the marked system shown in Figure 1(c) is obtained.

In a similar way, steps 7–11 are performed for the other attributes in , and the marked results are shown in Figure 1(d).

Steps 12–29 are executed to evaluate the rule objects. Let and be restored to the original value “2”; then and . In step 22, according to the value of , is called to sort the unmarked system by . With reference to Figure 1(a), the conflict set of is . Then, steps 23–27 are performed to restore the value of attribute c of to 3 and to update and . According to the value of , is called to sort the unmarked system by ; the sorted result is . With reference to Figure 1(a), the conflict set of is obtained. Repeating steps 23–27, a of is restored to the original value “2,” , , and . At that time, according to the value of , the conflict set . When the loop over steps 23–27 completes, the remaining “!” of is restored to “.

Similarly, rule objects and are processed in sequence. These two objects can make the decision in the original decision system according to their current values, so the remaining “!” of and is restored to “.

The marked system shown in Figure 1(e) is obtained. After steps 30-31, step 32 is performed to sort by , giving the marked system shown in Figure 1(f), in which the rule objects with strikethrough are duplicate objects. Then, the rule set shown in Figure 1(g) is obtained after the redundant rules are deleted in steps 32–34.

Thereafter, the “” entries in each rule are deleted and the reduced rule set  = ; ; ; ; ; ; ; is obtained.

10. Experiment Analysis

The dataset “Iris Plants Database” from [31] is used to compare rule extraction methods. Method A (proposed in this paper), method B (proposed in [22]), and method C (proposed in [23]) are compared on accuracy and effectiveness. There are 150 instances in the Iris Plants Database, with 4 condition attributes, “sepal length,” “sepal width,” “petal length,” and “petal width,” and one decision attribute, “class.” The instances are categorized into 3 decision classes. Because all the condition attributes are continuous, the discretization method in [32] is used to discretize them, as rough set theory has limited capacity for processing continuous data.
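As a simple illustration of what such preprocessing does, the sketch below discretizes a continuous attribute by equal-width binning. The actual method of [32] may differ; this only shows the general mapping from continuous values to interval codes that rough-set processing requires.

```python
# Map continuous values to k equal-width interval codes 0..k-1.
def equal_width_bins(values, k):
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0       # guard against a constant column
    return [min(int((v - lo) / width), k - 1) for v in values]
```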

Experimental comparison results for the three methods are given in Tables 2–4 and Figure 2. All instances are randomly split into two parts: a training set and a testing set. The training set consists of randomly chosen objects and is used as the learning samples; the testing set is the difference between the whole dataset and the training set.

Testing of three groups is performed on the set, including:
(1) Count of instances in training set/count of instances in testing set = 80/70
(2) Count of instances in training set/count of instances in testing set = 100/50
(3) Count of instances in training set/count of instances in testing set = 120/30

The count of rules, the average rule length, and the average accuracy of all rules for each method are calculated in each test, where the average accuracy is . To avoid the randomness of a single experiment, each group of experiments is executed 10 times and the average value is taken.

The comparison results are shown in Tables 2–4 and Figure 2: the average accuracies of methods A and B are almost the same across the three groups of tests, and in two groups the average accuracy of method A is slightly higher. Moreover, the count of rules and the average rule length of method A are smaller than those of methods B and C. As the number of training instances increases, the count of rules obtained by method A does not increase significantly.

Considering run time, the “Geriatric Care Medical Dataset” and three other datasets are used as testing datasets in Table 5. The “Geriatric Care Medical Dataset” comes from the Canadian Study of Health and Aging; a detailed description of the dataset is available in [33]. The dataset has 8,547 instances, of which 5,089 are female and the rest are male, and 44 attributes, such as chest, diabetes, and dental. Its class (decision) attribute is a binary value indicating whether an individual has died during follow-up.

The LEM2 algorithm (Learning from Examples Module, version 2) [34] is a rule extraction algorithm based on rough sets. Since the LEM2 algorithm does not change the content or structure of the original system and the extracted rules are not affected by default values, it has become one of the most commonly used rule extraction algorithms in recent years. Method D (the LEM2 algorithm) and the other three methods are compared on time.

Experimental comparison results for the four methods are given in Table 5. The blind selection of attribute-value pairs and the repeated traversals over multiple cycles lead to the inefficiency of the LEM2 algorithm. From the analysis in the section “Analysis of Model Time Complexity,” the time performance of method A is clearly better than that of the other three methods, and the experimental results also show that the proposed method A is better in terms of running time.

11. Challenges

The model calls algorithm , which extracts the first element of each equivalence class in the universe to complete the extraction and obtain . Before calling algorithms and , the model calls algorithm or algorithm ; these two algorithms can effectively complete the extraction of the universe.

Therefore, the model is well suited to datasets that contain much duplicate or redundant data, but less suited to datasets that contain very few duplicate object instances or little redundant data.

We also find that the rule extraction model by itself does not work well at dataset scales of or larger, because it takes too long to obtain the rules. We believe that, combined with parallel processing, the model could handle such dataset scales better. This is what we will study further in the future.

12. Conclusion

In this paper, based on the formalized definitions of “attribute value reduction” and “value-core attribute,” an algorithm for the conflict and duplicate object sets and an algorithm for the redundant rule object set are proposed. Combined with the algorithms in [29] and the related theorems on attribute value reduction, a rule extraction model based on the decision dependency degree is given. As the model does not generate a matrix during reduction and uses object equivalence classes to obtain the conflict and duplicate object sets, its time-space complexity is effectively controlled. The analysis of the simulation example and the testing sets shows that the new model reduces redundant data more accurately and effectively while keeping the classification ability of the system unchanged. In conclusion, the new model extracts concise and key information and is effective and fast.

Data Availability

In the past, the data supporting the conclusions of this study were available at “Silicon Graphics International Corp. Sgi-Mlc++: Datasets from UCI [EB/OL]. [2014-08-08]: http://www.sgi.com/tech/mlc/db/”. The same data are now available at http://www.martinbroadhurst.com/stl/. These datasets are cited at the relevant places within the text as reference [31].

Conflicts of Interest

The author declares that there are no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (grant nos. 61976032, 61371090, 61602076, and 61702072), the China Postdoctoral Science Foundation (2017M621122 and 2017M611211), the Natural Science Foundation of Liaoning Province (nos. 20170540144 and 20170540232), and the Fundamental Research Funds for the Central Universities (nos. 3132017118, 3132017121, and 3132017123).