Complexity

Volume 2017 (2017), Article ID 1608147, 33 pages

https://doi.org/10.1155/2017/1608147

## Recent Fuzzy Generalisations of Rough Sets Theory: A Systematic Review and Methodological Critique of the Literature

^{1}Faculty of Management, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia

^{2}Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia

^{3}Department of Computer Engineering, Lahijan Branch, Islamic Azad University, Lahijan, Iran

^{4}Department of Construction Management and Real Estate, Vilnius Gediminas Technical University, Sauletekio Al. 11, LT-10223 Vilnius, Lithuania

^{5}Business Systems and Analytics Department, La Salle University, Philadelphia, PA 19141, USA

^{6}Business Information Systems Department, Faculty of Business Administration and Economics, University of Paderborn, 33098 Paderborn, Germany

^{7}Department of Graphical Systems, Vilnius Gediminas Technical University, Saulėtekio Ave. 11, LT-10223 Vilnius, Lithuania

Correspondence should be addressed to Jurgita Antucheviciene; jurgita.antucheviciene@vgtu.lt

Received 12 April 2017; Revised 23 August 2017; Accepted 10 September 2017; Published 29 October 2017

Academic Editor: Danilo Comminiello

Copyright © 2017 Abbas Mardani et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Rough set theory has been used extensively in the fields of complexity, cognitive sciences, and artificial intelligence, especially in areas such as expert systems, knowledge discovery, information systems, inductive reasoning, intelligent systems, data mining, pattern recognition, decision-making, and machine learning. Recently proposed rough set models have been developed by applying different fuzzy generalisations. Currently, there is no systematic literature review and classification of these new generalisations of rough set models. Therefore, this review study attempts to provide a comprehensive systematic review of the methodologies and applications of recent generalisations discussed in the area of fuzzy-rough set theory. The Web of Science database was chosen to select the relevant papers. The systematic review and meta-analysis approach known as “PRISMA” was applied, and the selected articles were classified based on the author and year of publication, author nationalities, application field, type of study, study category, study contribution, and journal in which the articles appeared. Based on the results of this review, we found that there are many challenging issues related to the different application areas of fuzzy-rough set theory which can motivate future research studies.

#### 1. Introduction

Rough set theory is a powerful and popular machine learning method [1]. It is particularly appropriate for dealing with information systems that exhibit inconsistencies [2]. Fuzzy set theory can be integrated with rough set theory to handle data with continuous attributes and to detect inconsistencies in the data. Because the fuzzy-rough set model is a powerful tool for analysing inconsistent and vague data, it has proven to be very useful in many application areas. Rough set theory, introduced by Pawlak [3] in the 1980s, has applications in many data mining tasks [4–11], attribute and feature selection [12–25], and data prediction [26, 27]. Rough set theory deals with information systems that contain inconsistent data, such as two patients who have the same symptoms but different diseases. In rough set analysis, data is expected to be discrete. Therefore, a continuous numeric attribute has to be discretized. Fuzzy-rough set theory [28] is an extension of rough set theory that deals with continuous numerical attributes. It can solve the same problems as rough set theory and can handle both numerical and discrete data. The importance of fuzzy-rough set theory is clearly seen in several application areas. For example, Wang [29] and Wang [30] investigated topological characterizations of generalised fuzzy-rough sets in the context of basic rough equalities. Pan et al. [31] enhanced the fuzzy preference relation rough set model with an additive consistent fuzzy preference relation. Namburu et al. [32] suggested soft fuzzy-rough set-based magnetic resonance brain image segmentation for handling the uncertainty related to indiscernibility and vagueness. Li et al. [33] proposed an effective fuzzy-rough set model for feature selection. Feng et al. [34] used uncertainty measures for reduction of multigranulation based on fuzzy-rough sets, avoiding the negative and positive regions.
Sun et al. [35] presented three kinds of multigranulation fuzzy-rough sets over two universes using a constructive method. Zhao and Hu [36] examined a decision-theoretic rough set model in the context of models of interval-valued fuzzy and fuzzy probabilistic approximation spaces. Zhang and Shu [37] suggested a new paradigm based on generalised interval-valued fuzzy-rough sets by combining the theory of rough sets and theory of interval-valued fuzzy sets based on axiomatic and constructive methods. Zhang et al. [24] proposed a new fuzzy-rough set theory based on information entropy for feature selection. Wang and Hu [38] proposed arbitrary fuzzy relations by integrating granular variable precision fuzzy-rough sets and general fuzzy relations. Vluymans et al. [19] suggested a new kind of classifier for imbalanced multi-instance data based on fuzzy-rough set theory. Feng and Mi [39] investigated and reviewed the variable precision of multigranulation fuzzy decision-theoretic rough sets in an information system. Wang and Hu [40] presented novel generalised *L*-fuzzy-rough sets for generalisation of the notion of *L*-fuzzy-rough sets.

In recent decades, various kinds of models have been proposed and developed regarding the fuzzy generalisation of rough set theory. However, the literature review has not kept pace with the rapid addition of knowledge in this field. Therefore, we believe that there is a need for a systematic consideration of the most relevant recent studies conducted in this area. This review paper attempts to systematically review the previous studies that proposed or developed fuzzy-rough sets theory. This review paper adds significant insight into the literature of fuzzy-rough set theory, by considering some new perspectives in examining the articles, such as the classification of the papers based on author and year of publication, author nationalities, application field, type of study, study category, study contribution, and the journals in which they appear.

This review study is organised as follows. Section 2 briefly reviews the background on fuzzy logic and fuzzy sets, rough sets, fuzzy set theory, fuzzy logical operators, fuzzy relations, rough set theory, and fuzzy-rough set theory. Section 3 presents the related works. Section 4 presents the research methodology, including the systematic review, meta-analysis, and the procedures of this study. Section 5 provides the findings of this review based on the application areas. Section 6 presents the distribution of papers by journal. Section 7 presents the distribution of papers by year of publication. Section 8 presents the distribution of papers by the nationality of the authors. Section 9 discusses the results of this review with a focus on the recent fuzzy generalisations of rough set theory and further investigations in this area. Finally, Section 10 presents the conclusion, limitations, and recommendations for future studies.

#### 2. Fuzzy-Rough Sets Theory

##### 2.1. Fuzzy Logic and Fuzzy Sets

Binary logic is discrete and has only two truth values, true and false, that is, 1 and 0. In real life, however, things are often true only to some extent. For example, a patient may be sick, very sick, or only starting to become sick. Therefore, we cannot simply say whether the patient is sick or not; the statement holds with a certain intensity. Fuzzy logic [43] extends binary logic by adding a range of values that specifies the extent to which something is true. The truth values range between 0 and 1: the closer the truth value of a statement is to 1, the truer the statement. For example, a very sick patient could have a degree of sickness of around 0.9, whereas a patient with a sickness degree of 0.1 has nearly recovered from the illness. A fuzzy set [44] is a set of elements, each of which belongs to the set with a membership degree. For instance, assume that there are two fuzzy sets representing two groups of people, old and young. The greater a person's age, the higher that person's membership degree in the set of old people and the lower the membership degree in the set of young people. Just as fuzzy logic extends binary logic, it also extends its logical operations. These are the fuzzy logical *t*-norm, *t*-conorm, and implicator [45], which extend the binary logical conjunction, disjunction, and implication, respectively.

##### 2.2. Rough Sets

In very large datasets with many items, calculating the gradual indiscernibility relation is very demanding in terms of memory and runtime. Rough set theory (Pawlak [46], Polkowski et al. [47]) is a mathematical technique for dealing with uncertain and inexact knowledge in various real-life fields such as information analysis, medicine, and data mining. Rough sets are a powerful machine learning technique that has been used in many application areas such as feature selection, prediction, instance selection, and decision-making, as well as medical data analysis, image processing, finance, and many other real-life problems [3]. A rough set approximates a given set of elements by two sets, the lower approximation and the upper approximation. Fuzzy-rough set theory is built on two theories, rough set theory and fuzzy set theory. In the following sections, we present these two theories and their hybridisation.

##### 2.3. Fuzzy Set Theory

Zadeh [44] observed that a traditional crisp set is not capable of describing everything in a realistic manner and proposed fuzzy sets to solve this problem. A fuzzy set $A$ is defined as a mapping from the universe $U$ to the interval $[0, 1]$. The value $A(x)$ for $x \in U$ is called the degree of membership of $x$ in $A$. With this approach, elements of the universe can belong to a set to a certain degree. Finding a good fuzzy set to model a concept can be subjective and challenging; however, it is more meaningful than trying to create an artificial crisp distinction among elements. A fuzzy set is an extension of a crisp set. Consequently, any crisp set $A$ can be modelled using a fuzzy set as follows:

$$\forall x \in U: \quad A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{otherwise.} \end{cases}$$

The cardinality of a fuzzy set $A$ is defined as the sum of the membership values of all elements in the universe to $A$:

$$|A| = \sum_{x \in U} A(x).$$
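As an illustration of these definitions, the following minimal Python sketch represents a fuzzy set as a dictionary from elements of the universe to membership degrees. The people, their ages, and the linear membership function `old` (with thresholds 20 and 100) are assumptions made only for this example; they do not come from the reviewed literature.

```python
# Hypothetical universe: people and their ages (illustrative data only).
ages = {"ann": 20, "bob": 40, "cat": 60, "dan": 100}

def old(age, lo=20.0, hi=100.0):
    """Assumed linear membership function for the fuzzy set 'old'."""
    return min(1.0, max(0.0, (age - lo) / (hi - lo)))

def crisp_as_fuzzy(universe, subset):
    """Model a crisp set as a fuzzy set: degree 1 inside, 0 outside."""
    return {x: 1.0 if x in subset else 0.0 for x in universe}

def cardinality(fuzzy_set):
    """Cardinality of a fuzzy set: the sum of all membership degrees."""
    return sum(fuzzy_set.values())

# Fuzzy set of old people: each person belongs to it to some degree.
old_people = {name: old(age) for name, age in ages.items()}

# A crisp set (hypothetical: people over 50) expressed as a fuzzy set.
over_fifty = crisp_as_fuzzy(ages, {"cat", "dan"})

print(old_people)               # membership degrees in [0, 1]
print(cardinality(old_people))  # sum of the degrees
print(cardinality(over_fifty))  # equals the number of elements in the crisp set
```

For a crisp set, the fuzzy cardinality reduces to the ordinary element count, which is why the fuzzy set can be regarded as an extension of the crisp set.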

##### 2.4. Fuzzy Logical Operators

New logical operators are needed to extend crisp sets to fuzzy sets. In crisp set theory, for example, the proposition that an element belongs to the sets $A$ and $B$ is either true or false. To extend this proposition to fuzzy set theory, fuzzy logical operators are needed that extend the logical conjunction $\wedge$, in order to express to what extent an element $x$ belongs to $A$ and $B$, given the membership degrees $A(x)$ and $B(x)$.

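The conjunction $\wedge$ and the disjunction $\vee$ are extended by a *t*-norm $T$ and a *t*-conorm $S$, respectively, which are mappings $T, S: [0,1]^2 \to [0,1]$ satisfying the following conditions: (i) $T$ and $S$ are increasing in both arguments; (ii) $T$ and $S$ are commutative; (iii) $T$ and $S$ are associative; (iv) $T(x, 1) = x$ and $S(x, 0) = x$ for all $x \in [0,1]$.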

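The most important examples of *t*-norms are the minimum operator $T_M$, which is the largest *t*-norm, the product operator $T_P$, and the Łukasiewicz *t*-norm $T_L$. For all $x, y \in [0,1]$:

$$T_M(x, y) = \min(x, y), \qquad T_P(x, y) = xy, \qquad T_L(x, y) = \max(0, x + y - 1).$$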

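Important examples of *t*-conorms are the maximum operator $S_M$, which is the smallest *t*-conorm, the probabilistic sum $S_P$, and the Łukasiewicz *t*-conorm $S_L$. For all $x, y \in [0,1]$:

$$S_M(x, y) = \max(x, y), \qquad S_P(x, y) = x + y - xy, \qquad S_L(x, y) = \min(1, x + y).$$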
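The standard minimum, product, and Łukasiewicz *t*-norms and their dual *t*-conorms can be written in a few lines. The sketch below is only a spot-check on a small grid of values, not a proof; the function names and the grid are assumptions made for this example.

```python
from itertools import product

# Standard t-norms (fuzzy conjunctions).
def t_min(x, y): return min(x, y)              # minimum
def t_prod(x, y): return x * y                 # product
def t_luk(x, y): return max(0.0, x + y - 1.0)  # Lukasiewicz

# Standard t-conorms (fuzzy disjunctions).
def s_max(x, y): return max(x, y)              # maximum
def s_prob(x, y): return x + y - x * y         # probabilistic sum
def s_luk(x, y): return min(1.0, x + y)        # Lukasiewicz

# Assumed test grid; multiples of 1/4 keep the float arithmetic exact.
GRID = [k / 4 for k in range(5)]

def satisfies_axioms(op, neutral):
    """Spot-check monotonicity, commutativity, associativity, and the
    neutral element (1 for a t-norm, 0 for a t-conorm) on GRID."""
    for x, y, z in product(GRID, repeat=3):
        if op(x, y) != op(y, x):
            return False
        if op(op(x, y), z) != op(x, op(y, z)):
            return False
        if y <= z and op(x, y) > op(x, z):
            return False
    return all(op(x, neutral) == x for x in GRID)

def pointwise_ordered(ops):
    """Check that each operator is pointwise <= the next one on GRID."""
    return all(f(x, y) <= g(x, y)
               for f, g in zip(ops, ops[1:])
               for x, y in product(GRID, repeat=2))

print(all(satisfies_axioms(t, 1.0) for t in (t_min, t_prod, t_luk)))
print(all(satisfies_axioms(s, 0.0) for s in (s_max, s_prob, s_luk)))
print(pointwise_ordered([t_luk, t_prod, t_min, s_max, s_prob, s_luk]))
```

The last check reflects that the minimum is the largest *t*-norm and the maximum is the smallest *t*-conorm: every *t*-norm value lies at or below `t_min`, and every *t*-conorm value lies at or above `s_max`.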

The implication $\rightarrow$ is extended by fuzzy implicators, which are mappings $I: [0,1]^2 \to [0,1]$ that satisfy the following: (i) $I$ is decreasing in the first and increasing in the second argument; (ii) $I$ satisfies $I(1, 0) = 0$ and $I(1, 1) = I(0, 1) = I(0, 0) = 1$.

A well-known implicator is the Łukasiewicz implicator $I_L$, defined by

$$I_L(x, y) = \min(1, 1 - x + y), \quad \forall x, y \in [0,1].$$
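A brief sketch of the Łukasiewicz implicator, using its standard formula $\min(1, 1 - x + y)$, shows that it agrees with binary implication on the truth values 0 and 1 and grades the intermediate cases. The function name and the sample values are assumptions made for this example.

```python
def implicator_lukasiewicz(x, y):
    """Lukasiewicz implicator: min(1, 1 - x + y) for x, y in [0, 1]."""
    return min(1.0, 1.0 - x + y)

# On crisp truth values it reproduces the binary implication table.
crisp_table = {(x, y): implicator_lukasiewicz(x, y)
               for x in (0.0, 1.0) for y in (0.0, 1.0)}
print(crisp_table)

# Graded case: a largely true premise with a largely false conclusion
# yields an implication that holds only to a limited degree.
print(implicator_lukasiewicz(0.75, 0.25))

# The value is 1 whenever the premise is at most as true as the conclusion.
print(implicator_lukasiewicz(0.25, 0.75))
```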

##### 2.5. Fuzzy Relations

Binary fuzzy relations in $U$ are a special type of fuzzy set: they are fuzzy sets $R$ in $U \times U$ and express to what extent two elements $x$ and $y$ are related to each other. In the field of fuzzy-rough sets, relations are usually used to model indiscernibility between examples. Therefore, we refer to them as indiscernibility relations. We require $R$ to be at least a fuzzy tolerance relation; that is, $R$ is reflexive, $\forall x \in U: R(x, x) = 1$, and symmetric, $\forall x, y \in U: R(x, y) = R(y, x)$.

These two conditions correspond to the reflexivity and symmetry conditions of an equivalence relation. The third condition for an equivalence relation, transitivity, is translated to $T$-transitivity for a certain *t*-norm $T$:

$$\forall x, y, z \in U: \quad T(R(x, y), R(y, z)) \le R(x, z).$$

In this case, $R$ is called a $T$-similarity relation. When $R$ is $T_M$-transitive, where $T_M$ is the minimum *t*-norm, $R$ is $T$-transitive for all *t*-norms $T$. In this case, $R$ is called a similarity relation.
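The properties of a fuzzy relation can be checked directly on a small example. In the sketch below, the three objects, their attribute values, and the distance-based relation $R(x, y) = 1 - |a(x) - a(y)|$ are assumptions chosen only for illustration; the relation turns out to be a fuzzy tolerance relation that is transitive with respect to the Łukasiewicz *t*-norm but not with respect to the minimum.

```python
from itertools import product

def t_min(a, b): return min(a, b)              # minimum t-norm
def t_luk(a, b): return max(0.0, a + b - 1.0)  # Lukasiewicz t-norm

# Hypothetical objects with one attribute scaled to [0, 1].
values = {"x1": 0.0, "x2": 0.5, "x3": 1.0}
U = list(values)

# Assumed indiscernibility relation: closer attribute values, higher degree.
R = {(x, y): 1.0 - abs(values[x] - values[y])
     for x, y in product(U, repeat=2)}

def is_reflexive(R, U):
    return all(R[(x, x)] == 1.0 for x in U)

def is_symmetric(R, U):
    return all(R[(x, y)] == R[(y, x)] for x, y in product(U, repeat=2))

def is_t_transitive(R, U, T):
    """T-transitivity: T(R(x, y), R(y, z)) <= R(x, z) for all x, y, z."""
    return all(T(R[(x, y)], R[(y, z)]) <= R[(x, z)]
               for x, y, z in product(U, repeat=3))

print(is_reflexive(R, U), is_symmetric(R, U))  # fuzzy tolerance relation
print(is_t_transitive(R, U, t_luk))            # Lukasiewicz-transitive
print(is_t_transitive(R, U, t_min))            # fails for x1, x2, x3
```

The minimum fails here because `x1` and `x3` are each related to `x2` to degree 0.5 but are not related to each other at all, so the chain through `x2` promises more similarity than the relation delivers.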

##### 2.6. Rough Set Theory

Pawlak [3] proposed rough set theory to handle the problem of incomplete information. Pawlak considered a universe $U$ of elements, an equivalence relation $R$ on $U$, and a concept $A \subseteq U$ within the universe. The problem of incomplete information means that it may not be possible to determine the concept $A$ based on the equivalence relation $R$; that is, there are two elements $x$ and $y$ in $U$ that are equivalent with respect to $R$, but $x$ belongs to $A$ and $y$ does not. Figure 1 represents this case, in which the universe is divided into squares by the equivalence relation. The concept $A$ does not follow the lines of the squares, which means that $A$ cannot be described using $R$. In the real world, this kind of problem with incomplete information occurs often, for example in spam classification. Assume that the universe contains spam and nonspam e-mails and that the concept $A$ is spam. The equivalence relation is defined by a predefined list of 10 words that usually appear in spam e-mails: two e-mails are equivalent if they contain the same words from this list. Some equivalence classes will be completely contained in the spam group, and some will be completely contained in the nonspam group. However, it is very likely that there are two e-mails that contain the same words from the list, of which one is spam and the other is nonspam. In this case, the equivalence relation is not able to distinguish between spam and nonspam. Pawlak [3] addressed this kind of problem by approximating the concept $A$. The lower approximation includes all the equivalence classes that are included in $A$, and the upper approximation includes the equivalence classes for which at least one element is in $A$. Figure 2 represents the concept of lower and upper approximations.
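The spam example can be sketched in code. The e-mails, the listed words they contain, and the spam labels below are hypothetical data invented for illustration (with a two-word list instead of ten to keep the example short); the functions follow the description above, collecting the equivalence classes fully inside the concept for the lower approximation and those that overlap it for the upper approximation.

```python
from collections import defaultdict

# Hypothetical e-mails, each described by the listed words it contains.
emails = {
    "e1": frozenset({"free", "winner"}),
    "e2": frozenset({"free", "winner"}),
    "e3": frozenset({"free"}),
    "e4": frozenset({"free"}),
    "e5": frozenset(),
}
spam = {"e1", "e2", "e3"}  # the concept to approximate (assumed labels)

def equivalence_classes(universe, signature):
    """Group elements that share the same signature (same listed words)."""
    classes = defaultdict(set)
    for x in universe:
        classes[signature(x)].add(x)
    return list(classes.values())

def approximations(universe, signature, concept):
    """Return the (lower, upper) approximations of the concept."""
    lower, upper = set(), set()
    for cls in equivalence_classes(universe, signature):
        if cls <= concept:   # class entirely inside the concept
            lower |= cls
        if cls & concept:    # class shares at least one element with it
            upper |= cls
    return lower, upper

lower, upper = approximations(emails, lambda e: emails[e], spam)
print(sorted(lower))          # certainly spam
print(sorted(upper))          # possibly spam
print(sorted(upper - lower))  # boundary region: indistinguishable cases
```

In this toy data, `e3` and `e4` contain the same listed words but carry different labels, so their class lies in the upper approximation only, which is the inconsistency that the approximations are designed to capture.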