Abstract

The weighting of knowledge characteristics plays an important role in classifying knowledge effectively and accurately. Most existing weighting methods rely heavily on experts' a priori knowledge, whereas rough set weighting methods do not and therefore meet the need for objectivity. However, current rough set weighting methods cannot obtain a balanced redundant characteristic set: too much redundancy may cause inaccuracy, while too little may cause ineffectiveness. In this paper, a new method based on rough set and knowledge granulation theory is proposed to ascertain characteristics weights. Experimental results on several UCI data sets demonstrate that the proposed weighting method effectively avoids subjective arbitrariness and avoids treating nonredundant characteristics as redundant ones.

1. Introduction

In data mining, in order to classify knowledge effectively, we need to properly assess the sets of knowledge characteristics. Therefore, computing the weights of the characteristics is very important. Weights reflect the role of characteristics in the classification process and directly affect the validity and accuracy of the classifier. Common weighting methods include the expert scoring method, the fuzzy statistics method [1-3], the Analytic Hierarchy Process (AHP) method [4-6], and the Principal Component Analysis (PCA) method [7, 8]. In all of these methods, a priori knowledge must be used.

Rough set theory was first proposed by Pawlak in 1982 [9]. It has become an extremely useful tool for handling imprecise and uncertain knowledge [9, 10]. Rough set theory can analyze and process fuzzy or uncertain data without a priori knowledge [11-17]. It has been widely used in pattern recognition [18-20], data mining [21-23], machine learning [24-29], and other fields [30-36].

In recent years, rough set methods have been studied for calculating characteristics weights. For instance, based on the concept of characteristics importance, Wang et al. proposed a method to determine characteristics weights; however, this method did not consider the influence of the decision characteristics on the condition characteristics [37]. Cao and Liang combined the characteristics importance of rough set theory with experts' a priori knowledge to determine the characteristics weights [38]. This method unified subjective a priori knowledge with the objective situation, but it ignored the internal differences within the equivalence partitions, so some nonredundant characteristics were treated as redundant. Bao et al. proposed a weighting method based on rough set theory and conditional information entropy, which avoids treating some nonredundant characteristics as redundant; however, in this method the characteristics importance obtained from redundant characteristics could be higher than that obtained from nonredundant ones [39]. Zhu and Chen improved Bao's work by constructing a priority queue of characteristics importance and presented a weighting method based on conditional information entropy and rough set theory, but that method incurs additional computational cost [40].

In this paper, a new knowledge characteristics weighting method based on rough set and knowledge granulation theory is proposed. The accuracy of the equivalence partitions induced by knowledge characteristics is studied, and the differences among equivalence classes are analyzed. Experimental results on several UCI data sets confirm our theoretical results. By comparing the numerical results with those of the AHP method, the PCA method, and two rough set based methods, we conclude that the new method effectively avoids taking nonredundant characteristics as redundant characteristics and improves classification accuracy.

The rest of the paper is structured as follows. Some basic concepts of rough set theory are briefly introduced in Section 2. In Section 3, a new knowledge characteristics weighting method is proposed and studied. Experimental results are given in Section 4 to show the effectiveness of the proposed weighting method. Finally, we end the paper with some conclusions in Section 5.

2. Basic Concepts

2.1. Rough Set

Rough set theory regards knowledge as a partition of the domain of objects. The equivalence relations, and the equivalence classes they produce, constitute valid information or knowledge about the domain. Let U denote the universe of objects, a nonempty set, and let R be an equivalence relation on U, called knowledge on the universe U. The equivalence relation R divides U into disjoint subsets, denoted U/R = {X1, X2, ..., Xn}, representing all the equivalence classes; the class containing an object x is written [x]_R. For a subset X of the universe U, two approximation sets are defined: the lower approximation R_*(X) = {x in U : [x]_R ⊆ X} and the upper approximation R^*(X) = {x in U : [x]_R ∩ X ≠ ∅}. The lower approximation of X is also called the positive region POS_R(X). The set BN_R(X) = R^*(X) - R_*(X) is referred to as the R-boundary region of X. Obviously, the larger the boundary region, the rougher the set X is with respect to R. The roughness of X with respect to the equivalence relation R is therefore

rho_R(X) = 1 - |R_*(X)| / |R^*(X)|.  (1)

The accuracy of X with respect to the equivalence relation R is defined as

alpha_R(X) = |R_*(X)| / |R^*(X)|,  (2)

where |.| represents the number of elements in a set and X is nonempty. When alpha_R(X) = 1, X is an exact set with respect to R; when alpha_R(X) < 1, X is a rough set with respect to R.
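
To make these notions concrete, the following minimal Python sketch (our own illustration; the helper names and the toy universe are not from the paper) builds the partition induced by an equivalence relation given as a labeling function and computes the lower and upper approximations together with the accuracy of (2).

from collections import defaultdict

def partition(universe, key):
    """Group the objects of the universe into equivalence classes by a key function."""
    classes = defaultdict(set)
    for x in universe:
        classes[key(x)].add(x)
    return list(classes.values())

def approximations(universe, key, target):
    """Lower/upper approximations of a target set under the partition induced by key."""
    lower, upper = set(), set()
    for block in partition(universe, key):
        if block <= target:          # block lies entirely inside the target set
            lower |= block
        if block & target:           # block overlaps the target set
            upper |= block
    return lower, upper

# Toy universe: objects 0..7, equivalence relation = parity of the object index.
U = set(range(8))
X = {0, 1, 2, 3}                     # target concept
low, up = approximations(U, lambda x: x % 2, X)
accuracy = len(low) / len(up)        # equation (2); the roughness of (1) is 1 - accuracy
print(low, up, accuracy)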

Suppose P and Q are two equivalence relations on the universe U. If P ⊆ Q, then for every x in U we have [x]_P ⊆ [x]_Q. Thus, the equivalence classes of U/P can be considered finer than those of U/Q, and the knowledge P is more accurate than the knowledge Q; see [37-40] for details.

2.2. Knowledge Granularity

From rough set theory, we learn that knowledge is related to the equivalence classes, which shows that knowledge is granular. For this reason, some scholars identify the structure of knowledge granularity through the equivalence classes and calculate the size of the knowledge granularity [39].

Suppose that K = (U, R) is a knowledge base and R is an equivalence relation, also known as knowledge. Knowledge granularity is defined as

GD(R) = |R| / |U|^2,  (3)

where R is viewed as the set of ordered pairs of objects it relates. If the granularity of R reaches its minimum, i.e., R is the identity relation, then GD(R) = 1/|U|. If an equivalence class of R reaches the whole universe U, i.e., the granularity reaches its maximum, then GD(R) = 1. If (x, y) ∈ R, the objects x and y belong to the same equivalence class under the equivalence relation R; they are indiscernible. Obviously, the smaller GD(R) is, the stronger the discernibility of R becomes.

Assume that R is an equivalence relation on the knowledge base K = (U, R) and that U/R = {X1, X2, ..., Xn} is the corresponding set of equivalence classes. According to (3), the knowledge granularity can be expressed as

GD(R) = (|X1|^2 + |X2|^2 + ... + |Xn|^2) / |U|^2,  (4)

and the discernibility of R is defined as

Dis(R) = 1 - GD(R).  (5)

According to (4), there is 1/|U| <= GD(R) <= 1. Therefore, we have 0 <= Dis(R) <= 1 - 1/|U|.
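
The granularity of (4) and the discernibility of (5) depend only on the sizes of the blocks in the partition. A short sketch (again our own, with a toy universe) that checks the two extreme cases described above:

def knowledge_granularity(universe, blocks):
    """GD(R) of equation (4): sum of squared block sizes over |U|^2."""
    return sum(len(b) ** 2 for b in blocks) / (len(universe) ** 2)

def discernibility(universe, blocks):
    """Dis(R) of equation (5)."""
    return 1.0 - knowledge_granularity(universe, blocks)

U = set(range(8))
finest = [{x} for x in U]     # identity relation: GD = 1/|U|, Dis = 1 - 1/|U|
coarsest = [set(U)]           # universal relation: GD = 1, Dis = 0
print(knowledge_granularity(U, finest), discernibility(U, finest))
print(knowledge_granularity(U, coarsest), discernibility(U, coarsest))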

3. Knowledge Characteristics Weighting Based on Rough Set and Knowledge Granulation

Cao and Liang calculated the characteristics weights as the cardinality of the positive region over the cardinality of the universe, but the results may be inaccurate [38]. Consider, for example, a universe on which a decision characteristic and two condition characteristics are defined, each inducing an equivalence relation, such that the two condition characteristics produce different equivalence classes but identical positive regions with respect to the decision. Since the weight of a knowledge characteristic in this scheme is the number of elements in its positive region divided by the number of elements in the universe, the two characteristics receive exactly the same weight, even though their equivalence classes are different.
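
Since the numerical values of this example are not reproduced here, the following toy decision table (entirely our own construction) illustrates the phenomenon: the two condition characteristics a and b have identical positive regions with respect to the decision d, and hence identical weights under the positive-region scheme, yet b induces a strictly finer partition than a.

from collections import defaultdict

def partition(objs, attr):
    blocks = defaultdict(set)
    for o, vals in objs.items():
        blocks[vals[attr]].add(o)
    return list(blocks.values())

def positive_region(objs, attr, decision):
    dec_blocks = partition(objs, decision)
    pos = set()
    for block in partition(objs, attr):
        if any(block <= d for d in dec_blocks):
            pos |= block
    return pos

def granularity(objs, attr):
    return sum(len(b) ** 2 for b in partition(objs, attr)) / len(objs) ** 2

# Hypothetical 6-object table: a and b have the same positive region w.r.t. d,
# so their positive-region weights coincide, but b partitions the universe more finely.
objs = {
    1: {"a": 0, "b": 0, "d": 0},
    2: {"a": 0, "b": 1, "d": 0},
    3: {"a": 0, "b": 1, "d": 0},
    4: {"a": 1, "b": 2, "d": 1},
    5: {"a": 1, "b": 3, "d": 1},
    6: {"a": 1, "b": 3, "d": 1},
}
print(positive_region(objs, "a", "d"), positive_region(objs, "b", "d"))  # equal
print(granularity(objs, "a"), granularity(objs, "b"))                    # different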

In order to solve the problem above, we use knowledge granularity to study the relationships among the subsets in the equivalence-class structure and propose a method based on knowledge granularity to compute the discernibility of knowledge characteristics. The knowledge characteristics weights are then determined according to the relationship between the discernibility and the weights of the knowledge characteristics.

3.1. The Discernibility of Knowledge Characteristics

We first give a definition about the discernibility of the knowledge characteristics.

Definition 1. Suppose that K = (U, R) is a knowledge base, R is an equivalence relation, and r is a characteristic. Let U/R and U/(R ∪ {r}) be the corresponding equivalence partitions. Then the discernibility of R ∪ {r}, denoted Dis(R ∪ {r}), is computed by (5) from the partition U/(R ∪ {r}).

By Definition 1, we know that the larger Dis(R ∪ {r}) is, the stronger the discerning ability of R ∪ {r} becomes. When two objects are selected at random from U, there are |U|^2 ordered ways to do so. After the characteristic r is added to R, every equivalence class of U/R is split into one or more classes of U/(R ∪ {r}), so the number of equivalence classes is greater than or equal to that of the original partition. Thus the discerning ability is improved, and the discernibility increases from Dis(R) to Dis(R ∪ {r}).
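
A quick numerical check of this monotonicity, using a toy partition and a refinement of it (our own data, not the paper's):

def granularity(universe, blocks):
    return sum(len(b) ** 2 for b in blocks) / len(universe) ** 2

def discernibility(universe, blocks):
    return 1.0 - granularity(universe, blocks)

U = set(range(6))
coarse = [{0, 1, 2}, {3, 4, 5}]      # partition induced by R
refined = [{0, 1}, {2}, {3, 4, 5}]   # partition induced by R ∪ {r}: a refinement of coarse
assert discernibility(U, refined) >= discernibility(U, coarse)
print(discernibility(U, coarse), discernibility(U, refined))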

Theorem 2. Let K = (U, R) be a knowledge base, let r be a characteristic, and let Dis(R ∪ {r}) denote the discernibility of R ∪ {r}; then Dis(R) <= Dis(R ∪ {r}) <= 1 - 1/|U|.

Proof. From (4) and (5), we have Dis(R) = 1 - (sum of |Xi|^2 over all equivalence classes Xi of U/R) / |U|^2. After the characteristic r is added to R, each equivalence class of U/R is split into one or more classes of U/(R ∪ {r}), so the number of equivalence classes does not decrease and the sum of the squared class sizes does not increase, which shows Dis(R) <= Dis(R ∪ {r}).
When the granularity attains its minimum, each equivalence class contains only one element and the discernibility reaches its maximum 1 - 1/|U|; when an equivalence class reaches the whole universe U, the granularity reaches its maximum and the discernibility equals 0. Thus Dis(R) <= Dis(R ∪ {r}) <= 1 - 1/|U| is proved.

3.2. Method to Determine Characteristics Weight

To propose our new characteristics weight method, we further give two definitions.

Definition 3. Suppose that K = (U, C ∪ D) is a knowledge base, where C denotes the condition characteristics and D denotes the decision characteristics. U/D = {D1, D2, ..., Dk} identifies the equivalence classes of the universe partitioned by the decision characteristics D, and Dis(C) is the discernibility of C on the universe U. The discernibility of the knowledge characteristics C on the decision class Dj is then defined in terms of the accuracy of Dj under C and the discernibility of C.

According to (2) and (5), this discernibility of C on Dj can be written explicitly as follows:

Definition 4. Suppose that K = (U, C ∪ D) is a knowledge base, where C is the condition characteristics and D is the decision characteristics, and U/D identifies the equivalence classes of the universe partitioned by the decision characteristics D. For a condition characteristic r in C, the discernibility of C is Dis_D(C) and the discernibility of C - {r} is Dis_D(C - {r}). Then the discernibility of the characteristic r is defined as the difference Dis_D(C) - Dis_D(C - {r}).

According to Definitions 3 and 4, we present a new formula to compute the weight of each characteristic in the following definition. The detailed computation process is shown in Algorithm 1.

Input: The knowledge base ;
Output: the weight of the characteristic, ;
(1) compute the equivalence class
(2) for   to   do
(3) compute the equivalence class
(4) end for
(5) for   to   do
(6) compute the equivalence class
(7) for   to   do
(8) compute the upper approximation on the set ,  
(9) compute the lower approximation on the set ,  
(10) end for
(11) compute the discernibility of the knowledge characteristics
(12) end for
(13) for   to   do
(14) compute the discernibility of the knowledge characteristics
(15) end for

Definition 5. Suppose that K = (U, C ∪ D) is a knowledge base, where C denotes the condition characteristics and D denotes the decision characteristics, and U/D identifies the equivalence classes of the universe partitioned by the decision characteristics D. Dis_D(r) is the discernibility of the knowledge characteristic r with respect to this decision partition. For any condition characteristic r in C, the weight of the characteristic r is defined as its discernibility Dis_D(r) normalized over all condition characteristics in C.
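
Because the displayed formulas of Definitions 3-5 are not reproduced here, the following sketch is only one plausible reading, stated as an assumption: each condition characteristic's score combines its granularity-based discernibility of (5) with the accuracies of the decision classes from (2), and the weight of Definition 5 is assumed to be that score normalized over all condition characteristics. The helper names and the toy table are ours, not the paper's.

from collections import defaultdict

def partition(table, attrs):
    """Equivalence classes of the objects under the given attribute set."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def discernibility(table, attrs):
    """Dis of equation (5) for the partition induced by attrs."""
    n = len(table)
    return 1.0 - sum(len(b) ** 2 for b in partition(table, attrs)) / n ** 2

def accuracy(table, attrs, target):
    """Accuracy of equation (2) of a target set under the partition induced by attrs."""
    lower = upper = 0
    for b in partition(table, attrs):
        if b <= target:
            lower += len(b)
        if b & target:
            upper += len(b)
    return lower / upper if upper else 0.0

def weights(table, cond_attrs, dec_attr):
    """Assumed weighting: decision-accuracy-weighted discernibility, normalized over C."""
    dec_blocks = partition(table, [dec_attr])
    score = {}
    for a in cond_attrs:
        score[a] = discernibility(table, [a]) * sum(
            accuracy(table, [a], d) * len(d) / len(table) for d in dec_blocks)
    total = sum(score.values()) or 1.0
    return {a: s / total for a, s in score.items()}

# Toy decision table (hypothetical values, not the Pima data).
table = {
    1: {"c1": 1, "c2": 0, "d": 0}, 2: {"c1": 1, "c2": 1, "d": 0},
    3: {"c1": 2, "c2": 0, "d": 1}, 4: {"c1": 2, "c2": 1, "d": 1},
    5: {"c1": 3, "c2": 0, "d": 1}, 6: {"c1": 3, "c2": 1, "d": 0},
}
print(weights(table, ["c1", "c2"], "d"))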

Theorem 6. Assume that U/D = {D1, D2, ..., Dk} is the set of equivalence classes of the universe partitioned by the decision characteristics D. For any condition characteristic r in C, Dis_D(r) is the discernibility of r with respect to D, and it satisfies 0 <= Dis_D(r) <= 1.

Proof. By rough set theory, the accuracy of each decision class lies between 0 and 1, and according to Theorem 2 we have 0 <= Dis <= 1 - 1/|U|. Thus it is easy to check that the stated bound holds.

Theorem 7. Assume that U/D is the set of equivalence classes of the universe partitioned by the decision characteristics D. For any condition characteristic r in C, Dis_D(r) is the discernibility of r with respect to D. Then (1) if P and Q are two equivalence relations on U and P is finer than Q, then the corresponding discernibilities satisfy Dis_D(P) >= Dis_D(Q); (2) the discernibility reaches its maximum when the condition partition consists of singleton classes.

Proof. According to Definition 3 and (11), we obtain the corresponding expressions for the two discernibilities.
(1) For the universe U, P and Q are two equivalence relations on U, and every equivalence class of P is contained in an equivalence class of Q. Hence the inequality in (16) is satisfied, and substituting (18) into (16) yields (19). Therefore, Dis_D(P) >= Dis_D(Q).
(2) From (16) and (19), when the condition partition becomes the finest one, it partitions the universe into |U| equivalence classes, each comprising a single element. In this case the discernibility reaches its maximum, and the second claim is obtained.

4. Experimental Results

In this section, several experiments are presented to show the effectiveness of the new method. The data used in the experiments come from the Pima Indians Diabetes Data Set, which includes a total of 768 cases; 392 of them are complete, and the remaining cases have missing characteristics values. Note that the Pima Indians Diabetes Data Set is no longer available due to permission restrictions.

In actual computations, these 392 cases are used for experimentation. The condition characteristics include "plasma glucose concentration at 2 hours in an oral glucose tolerance test", "diastolic blood pressure (mm Hg)", "triceps skin fold thickness (mm)", "2-hour serum insulin (mu U/ml)", and "body mass index (weight in kg/(height in m)2)". The data set is given in Table 1, where these five condition characteristics are denoted by short symbols and the decision characteristic stands for the "class variable (0 or 1)". The condition characteristics values are then discretized into three or four levels; see Table 2.
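
Since the concrete cut points of Table 2 are not shown here, the following pandas sketch uses placeholder bin edges (our own, not the paper's) to illustrate how a continuous characteristic can be discretized into three or four ordered levels.

import pandas as pd

# Hypothetical subset of Pima-style data; the values are made up for illustration.
df = pd.DataFrame({
    "glucose": [89, 137, 78, 197, 125, 110],
    "bmi":     [28.1, 43.1, 31.0, 30.5, 22.4, 35.2],
    "class":   [0, 1, 0, 1, 0, 1],
})

# Discretize each condition characteristic into ordered levels (placeholder bin edges).
df["glucose_level"] = pd.cut(df["glucose"], bins=[0, 100, 140, 300], labels=[1, 2, 3])
df["bmi_level"] = pd.cut(df["bmi"], bins=[0, 25, 30, 35, 100], labels=[1, 2, 3, 4])
print(df[["glucose_level", "bmi_level", "class"]])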

According to Algorithm 1, the following characteristics weights can be obtained:

Two experiments are conducted to show the advantages of the new method. The first compares our method with other rough set based methods; the second compares it with the AHP and PCA methods. Both comparisons show that the proposed method is more effective than those methods.

In the first experiment, we choose two other rough set based methods: one calculates the characteristics weights from the dependence degree in rough set theory, and the other is based on rough sets and conditional information entropy.

In the knowledge base K = (U, C ∪ D), the first method defines the dependence of the decision characteristics on the condition characteristics, derives the characteristics importance from that dependence, and obtains the characteristics weights by normalizing the importance values [39]. By calculation, we have
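
A minimal sketch of a dependence-based weighting of this kind, under the standard formulation (dependence = fraction of objects in the positive region; importance = drop in dependence when a characteristic is removed); the exact formulas of [39] may differ, and the helper names are ours.

from collections import defaultdict

def partition(table, attrs):
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def dependence(table, cond_attrs, dec_attr):
    """gamma_C(D) = |POS_C(D)| / |U|: fraction of objects classified consistently."""
    dec_blocks = partition(table, [dec_attr])
    pos = sum(len(b) for b in partition(table, cond_attrs)
              if any(b <= d for d in dec_blocks))
    return pos / len(table)

def dependence_weights(table, cond_attrs, dec_attr):
    """Importance = drop in dependence when a characteristic is removed; then normalize."""
    full = dependence(table, cond_attrs, dec_attr)
    sig = {a: full - dependence(table, [c for c in cond_attrs if c != a], dec_attr)
           for a in cond_attrs}
    total = sum(sig.values()) or 1.0
    return {a: s / total for a, s in sig.items()}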

The second method derives the characteristics importance from the conditional information entropy of the decision characteristics with respect to the condition characteristics and likewise obtains the characteristics weights by normalization [40]. By calculation, we have
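
For the conditional-entropy route, a commonly used formulation (shown here as an assumption about [40], not a verbatim reproduction) measures the importance of a characteristic by how much the conditional entropy of the decision rises when that characteristic is dropped.

import math
from collections import defaultdict

def partition(table, attrs):
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def conditional_entropy(table, cond_attrs, dec_attr):
    """H(D | C): expected entropy of the decision within each condition class."""
    n = len(table)
    dec_blocks = partition(table, [dec_attr])
    h = 0.0
    for c in partition(table, cond_attrs):
        for d in dec_blocks:
            p = len(c & d) / len(c)
            if p > 0:
                h -= (len(c) / n) * p * math.log2(p)
    return h

def entropy_weights(table, cond_attrs, dec_attr):
    """Importance = entropy increase when a characteristic is dropped; then normalize."""
    base = conditional_entropy(table, cond_attrs, dec_attr)
    sig = {a: conditional_entropy(table, [c for c in cond_attrs if c != a], dec_attr) - base
           for a in cond_attrs}
    total = sum(sig.values()) or 1.0
    return {a: s / total for a, s in sig.items()}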

In Table 3, we list the weighting results of the three rough set based methods, and Figure 1 shows their comparison. From Table 3 and Figure 1, when the method based on the dependence of rough set and the method based on rough set and conditional information entropy are used to calculate the characteristics weights, the characteristics "diastolic blood pressure (mm Hg)" and "body mass index (weight in kg/(height in m)2)" are treated as redundant, whereas with the proposed method they are not. These two characteristics are only weakly related to diabetes, but they are related. From this point of view, the new method is more accurate than the other two rough set based methods.

In the second experiment, the AHP method and the PCA method are used to calculate the characteristics weights, and their results are compared with ours.

For the AHP method, we construct the pairwise comparison matrix according to the opinions of medical experts [41]. Then we obtain the weights:
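
A common way to turn such a judgment matrix into weights is the principal-eigenvector method of AHP; the 5x5 matrix below is a made-up example, not the experts' actual judgments.

import numpy as np

# Hypothetical pairwise comparison matrix for five characteristics (Saaty scale).
A = np.array([
    [1,   5,   3,   4,   2  ],
    [1/5, 1,   1/2, 1/2, 1/3],
    [1/3, 2,   1,   1,   1/2],
    [1/4, 2,   1,   1,   1/2],
    [1/2, 3,   2,   2,   1  ],
])

# The principal eigenvector of A, normalized to sum to 1, gives the AHP weights.
eigvals, eigvecs = np.linalg.eig(A)
principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
weights = principal / principal.sum()
print(weights.round(3))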

For the PCA method, representative variables are selected through a transformation of multiple variables. The SPSS software is then used to obtain the total variance explained and the component matrix. We take the variance contribution rates of the principal components as weights [41] and finally normalize them to obtain the weights:
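
A sketch of how variance contribution rates can be obtained with plain numpy (SPSS is used in the paper; the random data matrix below is only a placeholder). How these component-level contributions are mapped back to per-characteristic weights is not spelled out here, so the sketch stops at the contribution rates.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(392, 5))             # placeholder data matrix: 392 cases, 5 characteristics

# PCA via the correlation matrix: eigenvalues give each component's variance contribution.
corr = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)
order = np.argsort(eigvals)[::-1]
contrib = eigvals[order] / eigvals.sum()  # variance contribution rate of each component
print(contrib.round(3))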

The weighting results are given in Table 4, and Figure 2 shows the comparison between the proposed method and the two well-known methods. From Table 4 and Figure 2, the ranking of the weights calculated with our method shows that "plasma glucose concentration at 2 hours in an oral glucose tolerance test" is most closely related to diabetes, while "diastolic blood pressure (mm Hg)" is only weakly related to it. From Figure 2, these results are a synthetic optimization of the results calculated by AHP and PCA. According to the medical experts we consulted, the results calculated by our method accord better with the actual situation.

However, the AHP method is based on the subjective judgment of experts, and the PCA method needs to extract representative principal components and requires additional a priori information and evaluation criteria. Therefore, these two methods cannot objectively reflect the weight distribution. The new method does not need prior knowledge, and yet the obtained weights are in line with the actual situation.

From the above discussion, the weighting method based on rough set theory avoids the arbitrariness of subjective judgment, and introducing granularity theory effectively avoids taking nonredundant characteristics as redundant characteristics. We can conclude that the new method distributes the weights among the characteristics reasonably; the weights reflect the importance of each characteristic and objectively reflect the actual situation of the patient's body. Thus, the proposed method is a powerful tool for knowledge classification.

5. Conclusions

Knowledge characteristics help us understand the knowledge base, and determining their weights helps us classify the knowledge base effectively, so as to support knowledge management and decision making. In this paper, the weights of knowledge characteristics are determined on the basis of rough set theory and knowledge granularity theory. Experimental results show that the proposed method effectively avoids taking nonredundant characteristics as redundant characteristics and effectively determines the weights of knowledge characteristics.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472256, no. 61771265) and the Natural Science Foundation of Jiangsu Province (no. BK20151272).