Attribute Reduction Based on Consistent Covering Rough Set and Its Application
As an important processing step in rough set theory, attribute reduction aims to eliminate data redundancy and extract useful information. Covering rough set theory, a generalization of classical rough set theory, has attracted wide attention in both theory and application. With covering rough sets, the discretization of continuous attributes can be avoided. This paper first focuses on the consistent covering rough set and reviews some of its basic concepts. We then establish a model of attribute reduction and elaborate the steps of attribute reduction based on the consistent covering rough set. Finally, we apply the studied method to actual logging data. The results show that the method is feasible, and the reduction results are recognized by Least Squares Support Vector Machine (LS-SVM) and Relevance Vector Machine (RVM). Furthermore, the recognition results are consistent with the actual test results of a gas well, which verifies the effectiveness and efficiency of the presented method.
1. Introduction
Attribute reduction has become an important step in pattern recognition and machine learning tasks [1, 2]. Its main goal is to remove redundant information from datasets and extract useful information so as to improve classification ability. Classical rough set theory, proposed by Pawlak in 1982, has been used as a mathematical tool to deal with various types of insufficient and imperfect data. Rough set theory provides a popular mathematical framework for knowledge discovery, feature selection, data mining, and rule extraction, and it has attracted considerable research attention since it was first proposed. Generally speaking, traditional rough set theory partitions the objects of a universe into mutually exclusive equivalence classes on the basis of equivalence relations. The data table analyzed by rough set theory is called an information system. The information system, as a mathematical model in artificial intelligence, is regarded as an important application of rough sets [5, 6]. Over the last decades, there has been much work on information systems with rough sets, including successful applications in machine learning, decision analysis, and knowledge discovery. Rough set theory has therefore played a significant role in unpredictable and uncertain information systems [7, 8].
A drawback of attribute reduction in traditional rough sets is that it can only deal with discrete databases. Continuous databases therefore need to be discretized before attribute reduction. Existing discretization methods can be roughly classified into two categories: supervised and unsupervised. Supervised discretization methods include discretization based on information entropy and discretization based on the ChiMerge algorithm, while unsupervised discretization methods include equal-frequency or equal-width binning, intuitive division discretization, and discretization based on cluster analysis [11, 12]. Traditional attribute reduction based on rough set theory thus has two limitations: (1) most real-world databases are numerical, so they cannot be handled directly by traditional rough set theory; (2) numerical data have to be discretized before attribute reduction, which inevitably leads to information loss. It is therefore desirable to develop an efficient method that can deal with numerical databases directly. Covering rough set theory was proposed to solve this problem, and it avoids attribute discretization.
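The two unsupervised binning schemes mentioned above can be illustrated with a minimal sketch. The function names and the `temperature` readings are hypothetical and chosen only for illustration; they do not come from the paper's tables. The example shows that the two schemes can group the same readings differently, which is one way discretization can alter the indiscernibility relation.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals (labels 0..k-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum lies on the right edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign labels by rank so that each bin holds about len(values) / k items.

    Ties are broken by position, so equal values may land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


temperature = [18.0, 18.4, 19.1, 25.0, 25.2, 31.7]  # hypothetical readings
width_labels = equal_width_bins(temperature, 3)
freq_labels = equal_frequency_bins(temperature, 3)
print(width_labels)  # [0, 0, 0, 1, 1, 2]
print(freq_labels)   # [0, 0, 1, 1, 2, 2]
```

Under equal width, 19.1 is indiscernible from 18.0 and 18.4; under equal frequency, it is grouped with 25.0 instead.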
Covering rough set theory is a generalization of traditional rough set theory that can deal directly with numerical data. Since its introduction, it has attracted great interest, and many researchers have studied approximation problems based on covering rough sets [14–17]. However, to the best of the authors’ knowledge, relatively few results have been published on attribute reduction with covering rough sets together with its practical application, which motivates the present study.
In this paper, we first review the theory of traditional rough sets and upper and lower approximations, present some basic concepts of consistent covering rough set theory, and establish a model of attribute reduction. Then, attribute reduction based on the consistent covering rough set is presented, generalized, and compared with attribute reduction based on the traditional rough set. Finally, we apply the studied attribute reduction method to actual logging data of a gas well. LS-SVM and RVM algorithms are used to recognize the reduction results in order to confirm their validity.
2. Basic Theory Related to Rough Set
2.1. Pawlak’s Rough Sets
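In rough set theory, the quadruplet $IS = \langle U, A, V, f \rangle$ is called an information system, where $U = \{x_1, x_2, \ldots, x_n\}$ is a nonempty set of samples, called a universe or a sample space, and $A$ is a nonempty set of attributes or features, which is divided into the set $C$ of conditional attributes and the set $D$ of decision attributes, $A = C \cup D$. Every subset of attributes $B \subseteq A$ induces a binary relation $\mathrm{IND}(B)$, called the $B$-indiscernibility relation and defined as $\mathrm{IND}(B) = \{(x, y) \in U \times U : f(x, a) = f(y, a), \forall a \in B\}$. Here $V = \bigcup_{a \in A} V_a$, where $V_a$ is the range of attribute $a$, and $f : U \times A \to V$ is a mapping function that gives an attribute value for each object, that is, $f(x, a) \in V_a$ for $x \in U$, $a \in A$. For $X \subseteq U$, the $B$-lower approximation and the $B$-upper approximation are defined as follows: $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ and $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$, where $[x]_B$ is the equivalence class of $x$ under $\mathrm{IND}(B)$; the boundary area is $\overline{B}X - \underline{B}X$. Figure 1 intuitively shows the $B$-lower approximation, the $B$-upper approximation, and the boundary area.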
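An attribute $a \in B$ ($B \subseteq C$) is called relatively dispensable in $B$ if $\mathrm{POS}_{B - \{a\}}(D) = \mathrm{POS}_B(D)$, where the positive region $\mathrm{POS}_B(D)$ is the union of the $B$-lower approximations of the decision classes in $U/D$; otherwise, $a$ is indispensable in $B$. The set of all indispensable attributes in $B$ is called the core of $B$, denoted as $\mathrm{CORE}(B)$. If $B' \subseteq B$ is relatively independent and $\mathrm{POS}_{B'}(D) = \mathrm{POS}_B(D)$, $B'$ is called a reduction of $B$.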
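The notions of this subsection can be illustrated with a small sketch. It assumes the standard Pawlak definitions (equivalence classes of the indiscernibility relation, lower and upper approximations, and a core taken relative to the positive region of the decision attribute). The table and the helper names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def partition(table, attrs):
    """Equivalence classes of IND(attrs): objects with equal values on attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def lower_upper(table, attrs, target):
    """Lower and upper approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def positive_region(table, cond, dec):
    """Union of the lower approximations of all decision classes."""
    pos = set()
    for d_block in partition(table, dec):
        pos |= lower_upper(table, cond, d_block)[0]
    return pos


def core(table, cond, dec):
    """Attributes whose removal changes the positive region."""
    full = positive_region(table, cond, dec)
    return [a for a in cond
            if positive_region(table, [c for c in cond if c != a], dec) != full]


# Hypothetical decision table; attribute "c" duplicates "a" on purpose.
table = {
    "x1": {"a": 0, "b": 0, "c": 0, "d": "no"},
    "x2": {"a": 0, "b": 0, "c": 0, "d": "yes"},
    "x3": {"a": 0, "b": 1, "c": 0, "d": "yes"},
    "x4": {"a": 1, "b": 1, "c": 1, "d": "yes"},
    "x5": {"a": 1, "b": 0, "c": 1, "d": "no"},
}
lower, upper = lower_upper(table, ["a", "b", "c"], {"x2", "x3", "x4"})
print(sorted(lower))                   # ['x3', 'x4']
print(sorted(upper))                   # ['x1', 'x2', 'x3', 'x4']
print(sorted(upper - lower))           # boundary: ['x1', 'x2']
print(core(table, ["a", "b", "c"], ["d"]))  # ['b']
```

Because "c" carries the same information as "a", either one can be dropped without changing the positive region, so only "b" belongs to the core.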
2.2. Basic Notions Related to Covering Rough Set
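Definition 1 (see ). Let $U$ be a universe of discourse and $\mathcal{C}$ be a family of subsets of $U$. Then, $\mathcal{C}$ is called a covering of $U$ if no subset in $\mathcal{C}$ is empty and $\bigcup \mathcal{C} = U$.
It is obvious that a partition of $U$ is certainly a covering of $U$, and the concept of a covering is an extension of a partition. In [19, 20] the notion of coverings was used to construct lower and upper approximation operators and to study properties of these operators.
Definition 2 (see ). Let $\mathcal{C} = \{C_1, C_2, \ldots, C_n\}$ be a covering of $U$. For every $x \in U$, let $C_x = \bigcap \{C_j : C_j \in \mathcal{C}, x \in C_j\}$. Then $\mathrm{Cov}(\mathcal{C}) = \{C_x : x \in U\}$ is also a covering of $U$, and $C_x$ is the minimal set including $x$ in $\mathrm{Cov}(\mathcal{C})$; one calls $\mathrm{Cov}(\mathcal{C})$ the induced covering of $\mathcal{C}$.
For every $x \in U$, $C_x$ is the minimal set including $x$ in $\mathrm{Cov}(\mathcal{C})$. No element of $\mathrm{Cov}(\mathcal{C})$ can be written as the union of other elements of $\mathrm{Cov}(\mathcal{C})$. $\mathrm{Cov}(\mathcal{C}) = \mathcal{C}$ if and only if $\mathcal{C}$ is a partition. For any $x, y \in U$, if $y \in C_x$ then $C_y \subseteq C_x$; so if $y \in C_x$ and $x \in C_y$, then $C_x = C_y$.
Definition 3 (see ). Let $\Delta = \{\mathcal{C}_i : i = 1, \ldots, m\}$ be a family of coverings of $U$. For every $x \in U$, let $\Delta_x = \bigcap \{C_{ix} : C_{ix} \in \mathrm{Cov}(\mathcal{C}_i), x \in C_{ix}\}$ and $\mathrm{Cov}(\Delta) = \{\Delta_x : x \in U\}$; $\mathrm{Cov}(\Delta)$ is then also a covering of $U$, called the induced covering of $\Delta$.
Obviously, $\Delta_x$ is the intersection of all the covering elements in $\Delta$ that include $x$. So, for every $x \in U$, $\Delta_x$ is the minimal set including $x$ in $\mathrm{Cov}(\Delta)$. $\mathrm{Cov}(\Delta)$ can be viewed as the intersection of the coverings in $\Delta$. No element of $\mathrm{Cov}(\Delta)$ can be written as the union of other elements of $\mathrm{Cov}(\Delta)$. If every covering in $\Delta$ is a partition, then $\mathrm{Cov}(\Delta)$ is also a partition and $\Delta_x$ is the equivalence class that includes $x$. For every $x, y \in U$, if $y \in \Delta_x$, then $\Delta_y \subseteq \Delta_x$, so if $y \in \Delta_x$ and $x \in \Delta_y$, then $\Delta_x = \Delta_y$.
Definition 4 (see ). Let $\Delta$ be a family of coverings of $U$. For any $X \subseteq U$, the lower and upper approximations of $X$ with respect to $\mathrm{Cov}(\Delta)$ are defined as follows: $\underline{\Delta}(X) = \bigcup \{\Delta_x : \Delta_x \subseteq X\}$, $\overline{\Delta}(X) = \bigcup \{\Delta_x : \Delta_x \cap X \neq \emptyset\}$.
The positive domain, negative domain, and boundary domain of $X$ relative to $\Delta$ are, respectively, computed by the following formulas: $\mathrm{POS}_\Delta(X) = \underline{\Delta}(X)$, $\mathrm{NEG}_\Delta(X) = U - \overline{\Delta}(X)$, and $\mathrm{BN}_\Delta(X) = \overline{\Delta}(X) - \underline{\Delta}(X)$.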
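A minimal sketch of these covering-based notions follows. It assumes the commonly used definitions: the neighborhood of an object x is the intersection of all covering blocks containing x, taken over every covering in the family; the lower approximation is the union of the neighborhoods contained in X; and the upper approximation is the union of the neighborhoods that meet X. The universe, the two coverings, and all function names are hypothetical.

```python
def induced_covering(universe, covering):
    """Map each x to the intersection of all blocks of `covering` containing x."""
    result = {}
    for x in universe:
        blocks = [set(b) for b in covering if x in b]
        result[x] = set.intersection(*blocks)  # a covering has at least one block per x
    return result


def family_neighborhoods(universe, coverings):
    """Map each x to its neighborhood with respect to the whole family."""
    per_cov = [induced_covering(universe, c) for c in coverings]
    return {x: set.intersection(*(ic[x] for ic in per_cov)) for x in universe}


def approximations(universe, nbhd, target):
    """Lower and upper approximations of `target` as unions of neighborhoods."""
    lower = set().union(*(nbhd[x] for x in universe if nbhd[x] <= target))
    upper = set().union(*(nbhd[x] for x in universe if nbhd[x] & target))
    return lower, upper


universe = [1, 2, 3, 4]
cov_1 = [{1, 2}, {2, 3}, {3, 4}]   # hypothetical covering (blocks overlap)
cov_2 = [{1, 2, 3}, {3, 4}]        # second hypothetical covering
nbhd = family_neighborhoods(universe, [cov_1, cov_2])
lower, upper = approximations(universe, nbhd, {1, 2, 4})

positive = lower
negative = set(universe) - upper
boundary = upper - lower
print({x: sorted(s) for x, s in nbhd.items()})  # {1: [1, 2], 2: [2], 3: [3], 4: [3, 4]}
print(sorted(positive), sorted(negative), sorted(boundary))  # [1, 2] [] [3, 4]
```

Unlike equivalence classes, the neighborhoods may overlap (here the neighborhood of 1 contains that of 2), which is what distinguishes a covering from a partition.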
3. Attribute Reduction and Simulation Experiment
3.1. Attribute Reduction Based on Traditional Rough Set
In an information system, attribute reduction is an important application of rough set theory. The key idea is to remove redundant information while maintaining the indiscernibility relation. Traditional reduction methods, such as attribute reduction based on the discernibility matrix, attribute reduction based on heuristic information, and attribute reduction based on evolutionary computation, can be used to obtain the attribute reduction results. In the following, we take the particle swarm optimization (PSO) algorithm as an example to study attribute reduction based on an evolutionary algorithm.
3.1.1. Attribute Reduction Based on PSO Algorithm
The basic concepts of attribute reduction in rough set theory and the ideas of particle swarm optimization (PSO) are combined to construct an attribute reduction algorithm based on PSO, which effectively reduces algorithm complexity. The steps of the algorithm are as follows.
Step 1. Discretize the data in the original information table (the discretization method is attribute discretization based on curve inflection points).
Step 2. Initialize the particle swarm randomly.
Step 3. Construct the fitness function and calculate the fitness value of each particle.
Step 4. For each particle, set the current fitness value as the new pbest if it is better than the previous one. Select the best pbest as gbest, and continue to update the positions.
Step 5. Determine whether the termination condition is satisfied (the number of iterations may also be taken as the termination condition); if yes, go to Step 6. Otherwise, return to the fitness evaluation and continue iterating.
Step 6. Test each particle by using the reduction definition, get all the candidate reduction sets, remove the redundant attributes, and then get the final reduction sets.
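The steps above can be sketched as a binary PSO over attribute masks. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The fitness function is assumed to be a weighted sum of the rough-set dependency degree and the fraction of removed attributes, which is a form often used in the literature. The sigmoid position update, the constants 0.7, 1.5, and 4.0, the weight `alpha`, and the already discretized table `rows` are likewise hypothetical choices.

```python
import math
import random
from collections import defaultdict


def dependency(rows, subset, dec):
    """Fraction of objects whose equivalence class on `subset` has one decision."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in subset)].append(row[dec])
    pure = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return pure / len(rows)


def fitness(rows, cond, dec, mask, alpha=0.9):
    """Assumed fitness: alpha * dependency + (1 - alpha) * share of removed attributes."""
    subset = [a for a, bit in zip(cond, mask) if bit]
    removed = (len(cond) - len(subset)) / len(cond)
    return alpha * dependency(rows, subset, dec) + (1 - alpha) * removed


def prune(rows, subset, dec):
    """Step 6: drop attributes whose removal keeps the dependency unchanged."""
    target = dependency(rows, subset, dec)
    kept = list(subset)
    for a in list(kept):
        trial = [b for b in kept if b != a]
        if dependency(rows, trial, dec) == target:
            kept = trial
    return kept


def pso_reduce(rows, cond, dec, particles=10, iters=30, seed=0):
    rng = random.Random(seed)
    n = len(cond)
    # Step 2: random swarm; one particle starts as the full attribute set.
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(particles)]
    pos[0] = [1] * n
    vel = [[0.0] * n for _ in range(particles)]
    # Step 3: initial fitness values.
    pbest = [p[:] for p in pos]
    pfit = [fitness(rows, cond, dec, p) for p in pos]
    g = max(range(particles), key=lambda i: pfit[i])
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):  # Step 5: iteration count as termination condition
        for i in range(particles):
            for j in range(n):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(rows, cond, dec, pos[i])
            if f > pfit[i]:  # Step 4: update pbest, then gbest
                pbest[i], pfit[i] = pos[i][:], f
                if f > gfit:
                    gbest, gfit = pos[i][:], f
    return prune(rows, [a for a, bit in zip(cond, gbest) if bit], dec)


# Hypothetical discretized table in which D is determined by c2 and c4.
cond = ["c1", "c2", "c3", "c4"]
values = ["0000", "0011", "1100", "1111", "0101", "1010", "0110", "1001"]
rows = [dict(zip(cond, map(int, v))) for v in values]
for row in rows:
    row["D"] = row["c2"] ^ row["c4"]

reduct = pso_reduce(rows, cond, "D")
print(reduct)
```

Seeding one particle with the full attribute set guarantees that the best particle never scores below the full set; for this small table that is enough to ensure the returned subset preserves the dependency degree, and the pruning pass then removes any remaining redundant attribute.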
3.1.2. Classic Example Simulation and Comparative Analysis
Table 1 lists weather information of a city during the daytime, where “No.” is the index of the observed day. There are 5 condition attributes: “c1” (luminosity), “c2” (temperature), “c3” (relative humidity), “c4” (wind speed), and “c5” (precipitation). “D” is the decision attribute, which represents the travel condition.
Table 2 shows the data after discretization based on curve inflection points.
After attribute reduction based on the PSO algorithm, the reduction result is $\{c_2, c_4, c_5\}$, which means that the condition attributes c1 and c3 are redundant. Three key attributes determine the travel condition: c2 (temperature), c4 (wind speed), and c5 (precipitation).
However, if discretization based on information entropy is applied instead, the result is Table 3.
With the same PSO-based reduction method but a different discretization method, the reduction result is $\{c_1, c_3, c_4, c_5\}$. In this case, four key attributes determine the travel condition: c1 (luminosity), c3 (relative humidity), c4 (wind speed), and c5 (precipitation).
This example shows that numerical data must be discretized before traditional rough set theory can be applied. However, attribute discretization destroys the indiscernibility relations between condition attributes and decision attributes to some extent; it also loses information and, as the example illustrates, leads to different reduction results depending on the discretization method. As a result, the accuracy of attribute reduction is affected. To avoid continuous attribute discretization altogether, we present a method of attribute reduction based on the consistent covering rough set, which can improve the accuracy and efficiency of attribute reduction.
3.2. Attribute Reduction Based on Consistent Covering Rough Set
3.2.1. Basic Definitions and Principles
In practical applications, a large number of databases cannot be directly handled by classical rough sets. For this reason, neighborhood rough sets and similarity relation rough sets were developed. These models induce coverings of a universe instead of partitions and can thus be categorized into covering rough sets. In the following, we review some definitions of consistent covering rough sets.
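Definition 5 (see ). Let $\Delta = \{\mathcal{C}_i : i = 1, \ldots, m\}$ be a family of coverings of $U$. $D$ is a decision attribute set. $U/D$ is a decision division on $U$.
If, $\forall x \in U$, $\exists D_j \in U/D$ such that $\Delta_x \subseteq D_j$, where $\Delta_x$ is the minimal set including $x$ in $\mathrm{Cov}(\Delta)$, then decision system $(U, \Delta, D)$ is called a consistent covering decision system and denoted as $\mathrm{Cov}(\Delta) \le U/D$. The positive region of $D$ relative to $\Delta$ is defined as $\mathrm{POS}_\Delta(D) = \bigcup_{X \in U/D} \underline{\Delta}(X)$. Otherwise, $(U, \Delta, D)$ is called an inconsistent covering decision system.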
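The consistency condition can be checked directly once the neighborhood of every object (the minimal set containing it in the induced covering) is known. The sketch below assumes that these neighborhoods have already been computed and are passed in as a dictionary. The neighborhoods, the decision labels, and the function names are hypothetical.

```python
def positive_region(nbhd, decision):
    """Union of all neighborhoods that lie inside a single decision class."""
    pos = set()
    for block in nbhd.values():
        if len({decision[y] for y in block}) == 1:
            pos |= block
    return pos


def is_consistent(nbhd, decision):
    """A covering decision system is consistent if every neighborhood is pure."""
    return positive_region(nbhd, decision) == set(nbhd)


# Hypothetical neighborhoods of four objects.
nbhd = {1: {1, 2}, 2: {2}, 3: {3}, 4: {3, 4}}

consistent_labels = {1: 0, 2: 0, 3: 1, 4: 1}
mixed_labels = {1: 0, 2: 1, 3: 1, 4: 1}   # neighborhood of 1 now mixes two classes

print(is_consistent(nbhd, consistent_labels))          # True
print(is_consistent(nbhd, mixed_labels))               # False
print(sorted(positive_region(nbhd, mixed_labels)))     # [2, 3, 4]
```

In the second labeling, object 1 falls outside the positive region because its neighborhood {1, 2} spans both decision classes, so the system is inconsistent.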
Definition 6. Let be a consistent covering decision system. Supposing , by we denote matrix , called the discernibility matrix of and defined as follows.
(1) when ,(2) When ,If for , is one of the covers to maintain the relation between and with respect to . Here we should point out that if , the relations between elements in are a disjunction; if , we mean it is conjunction between and , , , , . Since is symmetric and , for , we represent only by elements in the lower triangle of .
Theorem 7. Let be a consistent covering decision system, is the discernibility matrix of , and the discernibility function is as follows:where means, for every , or conducts a disjunctive operation. Make ; then the set is the collection of all the reductions of the system.
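The step from a discernibility function to the set of reductions can be sketched as follows. The sketch makes a simplifying assumption: every nonempty matrix entry is a plain disjunction of coverings, so the discernibility function is a conjunction of such clauses and the reductions are exactly the minimal sets of coverings that intersect every clause. Entries that contain conjunctive terms are not handled here. The clauses and names are hypothetical, and the enumeration is exponential, so it is suitable only for small examples.

```python
from itertools import combinations


def reducts_from_clauses(clauses, attributes):
    """All minimal attribute sets that intersect every nonempty clause."""
    clauses = [frozenset(c) for c in clauses if c]
    found = []
    # Enumerate candidates by increasing size so that minimality is automatic.
    for size in range(len(attributes) + 1):
        for cand in combinations(attributes, size):
            s = frozenset(cand)
            if any(r <= s for r in found):
                continue  # a smaller reduct is already contained in s
            if all(s & c for c in clauses):
                found.append(s)
    return found


# Hypothetical nonempty entries of a discernibility matrix.
clauses = [{"C1", "C2"}, {"C2", "C3"}, {"C3"}]
reducts = reducts_from_clauses(clauses, ["C1", "C2", "C3"])
core = frozenset.intersection(*reducts)

print([sorted(r) for r in reducts])  # [['C1', 'C3'], ['C2', 'C3']]
print(sorted(core))                  # ['C3']
```

The intersection of all reductions gives the core; in this example C3 appears alone in one clause, so it must belong to every reduction.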
3.2.2. Classic Example Simulation
In order to further validate the feasibility of the algorithm, an illustrative example is used for simulation analysis.
Table 4 is a logging dataset of typical wells, where “No.” is the well index. “c1” represents acoustic time, “c2” represents caliper, “c3” represents natural gamma, “c4” represents plate radius, “c5” represents induction resistivity, and “c6” represents flushed zone resistivity. “D” is the type of a well: “0” represents a dry well and “1” represents an oil well.
All the condition attributes in Table 4 are numerical. Traditional rough sets cannot handle such data directly; the data would first have to be discretized. The consistent covering rough set can instead be applied to the data in Table 4 directly, so that the information loss caused by discretization is avoided.
According to the definition of covering rough set, is a covering of sample set . Therefore, constitutes 4 coverings of sample . In decision attribute table, for every condition attribute, the “descending order” is used to establish the equivalence relation. So we can get , , , , , , . The sample can be divided into two categories which are , according to decision attribute , and , . Obviously, due to the definition of consistent covering rough set. And , . Therefore, the discernibility matrix of is as follows: For every , calculation results are as follows:Reduction results based on discernibility function are as follows:
According to the results, there are two reduction results in this decision system, which are , , respectively. Apparently, the condition attributes , , are the key information to distinguish between “dry well” and “oil well,” so .
4. Algorithm Description Based on Consistent Covering Rough Set
Attribute reduction is a core application of rough sets. In this paper, the main emphasis is on attribute reduction based on the consistent covering rough set. For a consistent covering decision system, the essence of attribute reduction is to find a minimal subset of conditional attributes that keeps the decision system consistent. Following the classic example above (Section 3.2.2), the flow chart of the attribute reduction model based on the consistent covering rough set is provided in Figure 2.
According to Figure 2, the algorithm steps are designed as follows. The algorithm of attribute reduction based on the consistent covering rough set has also been programmed for this paper.
Step 1. Read sample data in decision information table.
Step 2. Sort the sample data in descending order, and build the coverings of the sample.
Step 3. Check that the decision system is consistent; if so, proceed to Step 4 (we only consider consistent covering decision systems in this paper).
Step 4. Build discernibility matrix , that is, , .
Step 5. Write discernibility function: according to discernibility matrix.
Step 6. Get the reduction set through conjunctive and disjunctive forms; that is, .
5. Practical Application and Experimental Analysis
In order to validate the effectiveness of the studied method for attribute reduction based on the consistent covering rough set, we adopt the logging data of a gas well named “Su6” in Xinjiang (China), as shown in Table 5, and conduct a comparative analysis. All condition attributes are numerical. To maintain confidentiality, 200 sample points (well depth 3000 m–3400 m) are selected instead of all logging data. According to the actual test results, 80 of them are gas layer points and 120 are nongas layer points. There are 13 condition attributes in Table 5: GR (natural gamma), DT (acoustic time), SP (spontaneous potential), WQ (flush zone resistivity), LLD (deep investigated double lateral resistivity), LLS (shallow investigated double lateral resistivity), DEN (density), NPHI (compensated neutron), PE (photoelectric absorption index), U (uranium), TH (thorium), K (potassium), and CALI (borehole diameter). The decision attribute takes two values: “0” for nongas layer and “1” for gas layer. (Note: “gas field” is short for natural gas field, a field that is rich in natural gas. Typically, oil is produced from organic matter buried at depths between 1 and 6 km, at temperatures between 65 and 150 degrees Celsius, while natural gas is produced at greater depths.)
According to the definition of consistent covering rough set, . Obviously, logging data in Table 5 is consistent decision system. The data in Table 5 are input to the program of attribute reduction based on consistent covering rough set. Then, reduction results are . The two different traditional reduction methods of the rough set, which are attribute reduction based on identification matrix and particle swarm optimization- (PSO-) based attribute reduction of rough set, are used to deal with the data in Table 5 for comparison and analysis. Reduction results are shown in Table 6.
According to reduction results and running time in Table 6, we know that reduction method based on consistent covering rough set has advantages of fewer reduction attributes and shorter running time.
In order to further validate the effectiveness of attribute reduction based on the consistent covering rough set, the reduction results are recognized by Least Squares Support Vector Machine (LS-SVM) and Relevance Vector Machine (RVM). Recognition results are shown in Table 7. The recognition accuracies of the studied algorithm are 94.2% (LS-SVM) and 91.5% (RVM), which are higher than those of the other two reduction algorithms.
Figure 3 shows the actual gas distribution. Figure 4 shows the recognition results of the studied algorithm with LS-SVM, with a recognition accuracy of 94.2%. Figure 5 shows the recognition results of the studied algorithm with RVM, with a recognition accuracy of 91.5%.
Comparing the recognition results in Figures 3, 4, and 5, the recognition accuracies of the studied algorithm with both LS-SVM and RVM exceed 90%. The method can effectively reduce the tedious work in gas recognition and improve the recognition accuracy. Figures 6 and 7 show the classification of the studied algorithm by LS-SVM and RVM, respectively. The red line indicates the classification line, the green points indicate gas layer points, and the black asterisks indicate nongas layer points.
The proposed attribute reduction based on the consistent covering rough set is of practical significance. On the one hand, it avoids the tedious steps of continuous attribute discretization and reduces the loss of important information in the decision information table. For these reasons, the accuracy and efficiency of attribute reduction can be improved considerably compared with reduction based on the traditional rough set. On the other hand, the proposed algorithm can directly handle real-world numerical data and significantly reduces the workload compared with traditional attribute reduction. The presented method was applied to actual logging data; the results show that it is effective for gas layer recognition and that the recognition accuracy is high. The presented method is feasible and reasonable, and it has theoretical significance and practical value for artificial intelligence and data mining.
An efficient attribute reduction algorithm based on the consistent covering rough set has been presented. The theory of the traditional rough set and the covering rough set has been reviewed, and the drawbacks of attribute reduction based on the traditional rough set and the advantages of the covering rough set have been discussed. Actual logging data have been used to test the feasibility and efficiency of the presented algorithm. The experimental results show that the studied reduction method can effectively handle numerical data and is more efficient than reduction based on traditional rough set theory. The reduction results have been recognized by the LS-SVM and RVM algorithms and compared with the actual test results so as to validate the algorithm's effectiveness. The recognition results are consistent with the actual gas distribution, and the recognition accuracy is high.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by Hebei Province Natural Science Foundation (no. E2016202341) and Hebei Province Foundation for Returned Scholars (no. C2012003038).
References
[1] W. Ding, J. Wang, and Z. Guan, “A novel minimum attribute reduction algorithm based on hierarchical elitist role model combining competitive and cooperative co-evolution,” Chinese Journal of Electronics, vol. 22, no. 4, pp. 677–682, 2013.
[2] D. Chen, S. Zhao, L. Zhang, Y. Yang, and X. Zhang, “Sample pair selection for attribute reduction with rough set,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 11, pp. 2080–2093, 2012.
[3] Z. Pawlak, “Rough sets,” International Journal of Computer and Information Sciences, vol. 11, no. 5, pp. 341–356, 1982.
[4] Z. Pawlak, “Rough sets and their applications,” Computational Intelligence in Theory and Practice, vol. 8, pp. 73–91, 2001.
[5] C. Wang, Q. Hu, X. Wang, D. Chen, and Y. Qian, “Feature selection based on neighborhood discrimination index,” IEEE Transactions on Neural Networks and Learning Systems, vol. 99, pp. 1–14, 2017.
[6] L. Guan and G. Wang, “Generalized approximations defined by non-equivalence relations,” Information Sciences, vol. 193, pp. 163–179, 2012.
[7] D. Miao and D. Li, Rough Set Theory, Algorithm and Application, Tsinghua University Press, Beijing, China, 2008.
[8] Y. Qian, J. Liang, and C. Dang, “Incomplete multigranulation rough set,” IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, vol. 40, no. 2, pp. 420–431, 2010.
[9] C.-J. Tsai, C.-I. Lee, and W.-P. Yang, “A discretization algorithm based on Class-Attribute Contingency Coefficient,” Information Sciences, vol. 178, no. 3, pp. 714–731, 2008.
[10] R. Kerber, “ChiMerge: discretization of numeric attributes,” in Proceedings of the Tenth Conference on Artificial Intelligence, pp. 123–128, 1992.
[11] D.-Q. Miao, “A New method of discretization of continuous attributes in rough sets,” Acta Automatica Sinica, vol. 27, no. 3, pp. 296–302, 2001.
[12] G. Zhang, W. Jin, and L. Hu, “Generalized discretization of continuous attributes in rough set theory,” Control and Decision, vol. 20, no. 4, pp. 372–376, 2005.
[13] L. Hou, G. Wang, N. Niu, and Y. Wu, “Discretizaton in rough set theory,” Computer Science, vol. 27, no. 12, pp. 89–94, 2000.
[14] R. Slowinski and D. Vanderpooten, “A generalized definition of rough approximations based on similarity,” IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 2, pp. 331–336, 1996.
[15] N. D. Thuan, “Covering rough sets from a topological point of view,” International Journal of Computer Theory and Engineering, vol. 1, no. 5, pp. 601–609, 2012.
[16] W. Zhu, “Topological approaches to covering rough sets,” Information Sciences, vol. 177, no. 6, pp. 1499–1508, 2007.
[17] G. Resconi and C. Hinde, “Agents and rough sets,” International Journal of Computational Intelligence Systems, vol. 7, no. 1, pp. 137–157, 2014.
[18] Z. Bonikowski, E. Bryniarski, and U. Wybraniec-Skardowska, “Extensions and intentions in the rough set theory,” Information Sciences, vol. 107, no. 1-4, pp. 149–167, 1998.
[19] J. A. Pomykała, “Approximation operations in approximation space,” Bulletin of the Polish Academy of Sciences, vol. 35, no. 1, pp. 653–662, 1987.
[20] J. A. Pomykała, “On definability in the nondeterministic information system,” Bulletin of the Polish Academy of Sciences: Mathematics, vol. 36, no. 3, pp. 193–210, 1988.
[21] C. Wang, Q. He, D. Chen, and Q. Hu, “A novel method for attribute reduction of covering decision systems,” Information Sciences, vol. 254, pp. 181–196, 2014.
[22] Y. Lv, H. Liu, and J. Jiang, “The method of attribute reduction based on discernibility matrix,” in Proceedings of the International Conference on Computer Application System Modelling, vol. 1, pp. 370–370, 2010.
[23] R.-S. Cao, S.-H. Ni, and P. Zhang, “New heuristic attribute reduction algorithm based on sample selection,” Computer Science, vol. 43, no. 8, pp. 40–43, 2016.
[24] M. Alweshah, O. A. Alzubi, and J. A. Alzubi, “Solving attribute reduction problem using wrapper genetic programming,” International Journal of Computer Science and Network Security, vol. 16, no. 5, pp. 77–84, 2016.
[25] J. Bai, K. Xia, Y. Chi, and L. Liu, “Continuous attribute discretization based on inflection point,” Journal of Information and Computational Science, vol. 11, no. 4, pp. 1327–1333, 2014.
[26] C. Degang, W. Changzhong, and H. Qinghua, “A new approach to attribute reduction of consistent and inconsistent covering decision systems with covering rough sets,” Information Sciences, vol. 177, no. 17, pp. 3500–3518, 2007.
[27] C. Wang and D. Chen, Theory and Method of Knowledge Acquisition Based on Rough Sets, Harbin Institute of Technology Press, China, 2010.
[28] Y. Yao and B. Yao, “Covering based rough set approximations,” Information Sciences, vol. 200, pp. 91–107, 2012.
[29] M. M. Adankon and M. Cheriet, “Model selection for the LS-SVM. Application to handwriting recognition,” Pattern Recognition, vol. 42, no. 12, pp. 3264–3270, 2009.
[30] G. M. Foody, “RVM-based multi-class classification of remotely sensed data,” International Journal of Remote Sensing, vol. 29, no. 6, pp. 1817–1823, 2008.