Abstract

Dempster-Shafer evidence theory is a useful tool for decision-making with uncertain information. However, the classical evidence theory is no longer applicable when the frame of discernment (FOD) is incomplete. Moreover, an incomplete FOD is an important source of conflict. It is therefore necessary to identify whether the FOD of the system is complete. In this paper, a method is proposed to identify an incomplete FOD under the framework of the generalized evidence theory, which deals with incomplete information. In the proposed method, pieces of evidence are first generated from the attributes of each sample; three criteria are then used to identify whether the FOD is incomplete according to this evidence. The main parameters of the criteria are the number of times the empty set $\emptyset$ is a focal element in the generated evidence, the mass of $\emptyset$ in the weighted average of the generated evidence, and the mass of $\emptyset$ in the combination of the generated evidence. Some experiments are used to demonstrate the effectiveness of the proposed method.

1. Introduction

Dempster-Shafer evidence theory (D-S evidence theory) [1, 2] is widely used in many fields, such as decision-making [3–6], evidential reasoning [7, 8], uncertainty measure [9, 10], and others [11–15], because of its advantages in handling uncertain information. The theory is also widely used in practical applications, such as fault diagnosis [16, 17], knowledge acquisition [18], risk and reliability analysis [19, 20], and failure mode analysis [21–23]. However, counterintuitive results can be obtained when the given pieces of evidence highly conflict with each other, and hundreds of methods have been proposed to address this issue [24]. Conflict management therefore remains an important issue in D-S evidence theory. In general, there are two main reasons that may lead to conflict: one is an incomplete frame of discernment (FOD), and the other is that the sensors are disturbed. To better combine conflicting evidence, it is necessary to identify whether the FOD is complete.

In previous studies, Lefevre et al. [25] proposed a unified belief function combination method to manage conflict, mainly considering the issue of conflict redistribution. Haenni [26] suggested preprocessing the evidence and then using Dempster’s combination rule to manage conflict. Murphy [27] presented an averaging method that combines belief functions and balances multiple pieces of evidence, but it does not offer convergence toward certainty. Based on this, an improved method was presented in [28]. However, these studies ignored the fact that an incomplete FOD is also an important cause of conflict. To address this, Smets and Kennes [29] proposed the TBM model under the open world assumption. Recently, a generalized evidence theory was presented in [30], addressing the combination of conflicting evidence in the open world. It greatly expands the application of evidence theory, for example to fuzzy theory [31–33] and game theory [34], and goes beyond the original model in dealing with conflict. However, the aforementioned research did not state under which conditions the system has an incomplete FOD. Identifying an incomplete FOD is therefore still an open issue and has not received the attention it deserves.

In this paper, a method is proposed to identify an incomplete FOD under the framework of the generalized evidence theory. Generalized evidence theory [30] is a novel theory that can express and deal with uncertain information in an incomplete FOD. Considering that the empty set $\emptyset$ can express the information of an incomplete FOD, three parameters of $\emptyset$ are used in the proposed method: the mass of $\emptyset$ in the weighted average of the generated pieces of evidence, the number of times $\emptyset$ is a focal element in the generated pieces of evidence, and the mass of $\emptyset$ in the combination of the generated pieces of evidence. In the proposed method, pieces of evidence are first generated from the attributes of each sample; three criteria are then used to identify whether the FOD is incomplete according to this evidence.

The proposed method takes into consideration the information in both the evidence and the samples. It uses the correlation coefficient [35], which performs better than other coefficients, to express the similarity of evidence. The method collects information about the FOD of the system from three aspects: the mass of $\emptyset$ in each piece of evidence, the mass of $\emptyset$ in the weighted average evidence, and the mass of $\emptyset$ in the combination result. Some experiments are used to demonstrate the effectiveness of the proposed method. The experiments show that, for a collected sample, if the criteria are satisfied, the FOD is regarded as incomplete; otherwise, it is regarded as complete.

The rest of this paper is organised as follows. In Section 2, the preliminaries about D-S evidence and generalized evidence theory are briefly introduced. Section 3 presents the proposed method with three criteria. In Section 4, some experiments are shown to demonstrate the effectiveness of our method. An application about the motor rotor fault diagnosis is shown in Section 5. Finally, a brief conclusion is made in Section 6.

2. Preliminaries

2.1. Dempster-Shafer (D-S) Evidence Theory

D-S evidence theory was introduced by Dempster [1] and then developed by Shafer [2]. Owing to its outstanding performance in modeling and processing uncertainty, this theory is widely applied to decision-making, optimization, and reliability and risk analysis.

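Definition 1. Let $\Theta$ be a finite nonempty set of mutually exclusive hypotheses, indicated by
$$\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\},$$
where the set $\Theta$ is called a frame of discernment. The power set of $\Theta$, $2^{\Theta}$, is indicated as follows:
$$2^{\Theta} = \{\emptyset, \{\theta_1\}, \ldots, \{\theta_N\}, \{\theta_1, \theta_2\}, \ldots, \Theta\}.$$

Definition 2. A mass function $m$ is a mapping from $2^{\Theta}$ to $[0, 1]$, formally noted by
$$m: 2^{\Theta} \rightarrow [0, 1],$$
which satisfies the following condition:
$$m(\emptyset) = 0, \qquad \sum_{A \in 2^{\Theta}} m(A) = 1.$$
When $m(A) > 0$, $A$ is called a focal element of the mass function.

Definition 3. Evidence combination in D-S evidence theory is noted as $\oplus$. Assume that there are two BPAs indicated by $m_1$ and $m_2$; the evidence combination of the two BPAs with Dempster’s combination rule [1] is formulated as follows:
$$m(A) = \begin{cases} \dfrac{1}{1-k} \sum\limits_{B \cap C = A} m_1(B)\, m_2(C), & A \neq \emptyset, \\ 0, & A = \emptyset, \end{cases}$$
with
$$k = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C),$$
where $k$ reflects the conflict between the two BPAs $m_1$ and $m_2$.

When $n$ pieces of evidence are given, the evidence fusion with Dempster’s combination rule can be written as in (8), owing to the commutativity and associativity of the combination rule:
$$m = m_1 \oplus m_2 \oplus \cdots \oplus m_n.$$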
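As an illustration of Definition 3, the following Python sketch combines two BPAs. It assumes the standard form of Dempster’s rule, in which products of masses are summed over pairs of focal elements with the same nonempty intersection and renormalized by one minus the conflict. The function names `dempster_combine` and `combine_all`, the dictionary representation of a BPA, and the numerical masses are assumptions made for this example and do not come from the paper.

```python
from functools import reduce


def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule.

    A BPA is represented (by assumption) as a dict that maps a frozenset,
    the focal element, to its mass. Returns (combined BPA, conflict k).
    """
    unnormalized = {}
    k = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection:
                unnormalized[intersection] = (
                    unnormalized.get(intersection, 0.0) + mass_b * mass_c
                )
            else:
                # Empty intersection: the product contributes to the conflict.
                k += mass_b * mass_c
    if sum(unnormalized.values()) == 0.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    combined = {a: mass / (1.0 - k) for a, mass in unnormalized.items()}
    return combined, k


def combine_all(bpas):
    """Fold Dempster's rule over a sequence of BPAs."""
    return reduce(lambda acc, m: dempster_combine(acc, m)[0], bpas)


A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {A: 0.6, B: 0.1, AB: 0.3}  # illustrative masses, not from the paper
m2 = {A: 0.5, B: 0.2, AB: 0.3}
fused, conflict = dempster_combine(m1, m2)
print(conflict, fused)
```

Because the rule is commutative and associative, `combine_all` can process the pieces of evidence in any order.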

Recently, Jiang proposed a correlation coefficient [35] to measure the degree of correlation between pieces of evidence.

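Definition 4. For a discernment frame $\Theta$ with $N$ elements, suppose the mass functions of two pieces of evidence are denoted by $m_1$, $m_2$. A correlation coefficient is defined as follows:
$$r_{BPA}(m_1, m_2) = \frac{c(m_1, m_2)}{\sqrt{c(m_1, m_1) \cdot c(m_2, m_2)}},$$
where $c(m_1, m_2)$ is the degree of correlation, denoted as
$$c(m_1, m_2) = \sum_{i=1}^{2^N} \sum_{j=1}^{2^N} m_1(A_i)\, m_2(A_j)\, \frac{|A_i \cap A_j|}{|A_i \cup A_j|},$$
where $i, j = 1, \ldots, 2^N$; $A_i$, $A_j$ are the focal elements of $m_1$, $m_2$, respectively; and $|\cdot|$ is the cardinality of a subset.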
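The correlation coefficient of Definition 4 can be sketched in Python as follows. The sketch assumes that the degree of correlation weights each product of masses by the ratio of the cardinality of the intersection to the cardinality of the union of the two focal elements, and that the coefficient normalizes it by the geometric mean of the two self-correlations. The names `overlap`, `degree_of_correlation`, and `correlation_coefficient`, as well as the masses, are hypothetical.

```python
from math import sqrt


def overlap(a, b):
    """Return |a ∩ b| / |a ∪ b| for two focal elements (frozensets)."""
    union = a | b
    # Classical BPAs put no mass on the empty set, so an empty union
    # is not expected here; 0.0 is returned only as a safe default.
    return len(a & b) / len(union) if union else 0.0


def degree_of_correlation(m1, m2):
    """Sum of m1(Ai) * m2(Aj) * overlap(Ai, Aj) over all focal elements."""
    return sum(
        mass_a * mass_b * overlap(a, b)
        for a, mass_a in m1.items()
        for b, mass_b in m2.items()
    )


def correlation_coefficient(m1, m2):
    """Degree of correlation normalized to the interval [0, 1]."""
    return degree_of_correlation(m1, m2) / sqrt(
        degree_of_correlation(m1, m1) * degree_of_correlation(m2, m2)
    )


A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {A: 0.6, B: 0.1, AB: 0.3}  # illustrative masses, not from the paper
m2 = {B: 1.0}
print(correlation_coefficient(m1, m1))
print(correlation_coefficient(m1, m2))
```

Identical BPAs give a coefficient of 1, and BPAs whose focal elements never intersect give 0.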

2.2. Generalized Evidence Theory

Generalized evidence theory (GET) [30] removes the closed-world constraint and builds its framework on an open world. To some extent, GET is an extension of D-S theory and, compared with D-S theory, can express and deal with more uncertain information in the open world.

Definition 5. Let $U$ be a frame of discernment (FOD) in an open world. Its power set $2^{U}$ is composed of $2^{|U|}$ propositions. For $A \subseteq U$, the mass function $m_G$ is a mapping $m_G: 2^{U} \rightarrow [0, 1]$ that satisfies
$$\sum_{A \in 2^{U}} m_G(A) = 1;$$
then $m_G$ is the generalized basic probability assignment (GBPA) of the FOD $U$.

The difference between a GBPA and a classical BPA is the restriction in (4): in a GBPA, the empty set $\emptyset$ can also be regarded as a focal element, and it represents the union of the elements outside the given FOD. If $m_G(\emptyset) = 0$, the GBPA degenerates to a classical BPA.

Like GET, the TBM model also assigns mass to the empty set to represent unknown information. The difference lies in how the mass of the empty set is generated. The TBM model simply removes the normalization step of Dempster’s combination rule and assigns the value of the conflict coefficient to the empty set; when evidence is generated, the mass of the empty set is still 0. In GET, mass can be assigned to the empty set when the evidence is generated, which means that the restriction in (4) does not apply.

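Definition 6. Given two GBPAs ($m_1$ and $m_2$), the empty sets $\emptyset_1$ and $\emptyset_2$ are regarded as conflicting with each other, and the mass $m_1(\emptyset)\, m_2(\emptyset)$ is assigned to conflict. The generalized combination rule (GCR) is defined as follows:
$$m(A) = \frac{\left(1 - m(\emptyset)\right) \sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - K}, \quad A \neq \emptyset,$$
with
$$K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C),$$
$$m(\emptyset) = m_1(\emptyset)\, m_2(\emptyset),$$
$$m(\emptyset) = 1 \text{ if and only if } K = 1.$$
Equation (13) defines the generalized conflict coefficient $K$; when $m(\emptyset) = 0$, which means that the frame of discernment is complete, the generalized conflict coefficient degenerates to the classical conflict coefficient.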

However, Jiang and Zhan [30] pointed out two shortcomings of GCR. One is that the way $m(\emptyset)$ is obtained is unreasonable and lacks a specific physical meaning. The other is that the way the generalized conflict coefficient is obtained in (16) is not consistent with the GBPA. The modified generalized combination rule (mGCR) in GET was therefore proposed in [30].

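Definition 7. In mGCR, $\emptyset_1$ and $\emptyset_2$ are considered as a support for $\emptyset$. The orthogonal sum of $\emptyset_1$ and $\emptyset_2$ should also be normalized like the other focal elements. Given two GBPAs ($m_1$ and $m_2$), the mGCR is defined as follows:
$$m(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - K}, \quad A \neq \emptyset,$$
$$m(\emptyset) = \frac{m_1(\emptyset)\, m_2(\emptyset)}{1 - K},$$
with
$$K = \sum_{B \cap C = \emptyset,\; B \cup C \neq \emptyset} m_1(B)\, m_2(C).$$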
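The description of mGCR can be made concrete with a small Python sketch. It rests on one reading of Definition 7, stated here as an assumption: the product of the two empty-set masses supports $\emptyset$, every other pair of focal elements with an empty intersection counts as conflict, and all surviving masses, including that of $\emptyset$, are divided by one minus the conflict. The function name `mgcr_combine`, the dictionary representation, and the masses are hypothetical.

```python
EMPTY = frozenset()


def mgcr_combine(m1, m2):
    """Combine two GBPAs under the assumed reading of mGCR.

    A GBPA is a dict mapping frozensets to masses; frozenset() stands
    for the empty set. Returns (combined GBPA, conflict K).
    """
    unnormalized = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            both_empty = not b and not c
            if intersection or both_empty:
                # Agreement, including empty set with empty set.
                unnormalized[intersection] = (
                    unnormalized.get(intersection, 0.0) + mass_b * mass_c
                )
            else:
                # Empty intersection with at least one nonempty set.
                conflict += mass_b * mass_c
    if sum(unnormalized.values()) == 0.0:
        raise ValueError("total conflict: the combination is undefined")
    combined = {a: v / (1.0 - conflict) for a, v in unnormalized.items()}
    return combined, conflict


A = frozenset("a")
m1 = {EMPTY: 0.6, A: 0.4}  # illustrative masses, not from the paper
m2 = {EMPTY: 0.7, A: 0.3}
combined, K = mgcr_combine(m1, m2)
print(K, combined)
```

In this example the two GBPAs agree on $\emptyset$ with product 0.42 and on the singleton with product 0.12, so after normalization the empty set keeps most of the mass.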

A distance between two bodies of evidence is also defined in GET, in the same way as in D-S evidence theory.

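Definition 8. Let $m_1$ and $m_2$ be two GBPAs on the frame of discernment $U$; the distance between $m_1$ and $m_2$ can be defined as follows:
$$d_{GBPA}(m_1, m_2) = \sqrt{\frac{1}{2} (\vec{m}_1 - \vec{m}_2)^{T} D\, (\vec{m}_1 - \vec{m}_2)},$$
where $D$ is a $2^N \times 2^N$ matrix whose element is expressed as follows:
$$D(A, B) = \frac{|A \cap B|}{|A \cup B|}, \quad A, B \in 2^{U},$$
and its computing method is
$$d_{GBPA}(m_1, m_2) = \sqrt{\frac{1}{2} \left( \langle \vec{m}_1, \vec{m}_1 \rangle + \langle \vec{m}_2, \vec{m}_2 \rangle - 2 \langle \vec{m}_1, \vec{m}_2 \rangle \right)},$$
where $\langle \vec{m}_1, \vec{m}_2 \rangle = \sum_{i=1}^{2^N} \sum_{j=1}^{2^N} m_1(A_i)\, m_2(A_j)\, \frac{|A_i \cap A_j|}{|A_i \cup A_j|}$, $A_i, A_j \in 2^{U}$.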

Equation (19) can be used in the situation when the frame of discernment is complete, and the result is similar to the distance using the definition in [36].

3. The Proposed Method

Generally speaking, the empty set $\emptyset$ indicates that no elements are included, and in the classical evidence theory no mass is assigned to $\emptyset$. In GET [30], however, $\emptyset$ is considered to indicate the elements that are not in the frame, so it represents information outside the FOD. Based on this idea, a method that mainly employs the mass of $\emptyset$ is proposed under the GET framework to identify an incomplete FOD.

An incomplete FOD means that there are targets, classes, or other elements that are not included in the current FOD. Consider a classification problem. Assume that all known samples belong to $N$ classes, which constitute a FOD $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$, and that each sample has $n$ attributes. Now a new sample is obtained. How can the completeness of the FOD be identified from this sample? In this paper, $n$ GBPAs that allow the empty set to have mass (i.e., the mass of $\emptyset$ is not required to be 0), denoted as $m_1, m_2, \ldots, m_n$, are first generated from the attributes of the sample. Then, three criteria are used as follows.

3.1. Criterion 1

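Criterion 1. Let $p_i$ be a variable taking value in $\{0, 1\}$, $i = 1, \ldots, n$. If $m_i(\emptyset)$ exceeds 0.5 in GBPA $m_i$, then $p_i = 1$; else, $p_i = 0$. The sample supports that the FOD is incomplete if $\sum_{i=1}^{n} p_i$ is more than half of $n$. Otherwise, the FOD is said to be complete.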

This criterion states that if the number of generated GBPAs in which the mass of $\emptyset$ exceeds 0.5 is more than half of the number of pieces of evidence, the sample is considered to be in an incomplete FOD. The criterion has a physical meaning. First, the mass of $\emptyset$ in a generated GBPA represents the confidence of that piece of evidence in an incomplete FOD, because $\emptyset$ is treated as a focal element that expresses the elements not in the FOD. That is, the larger the mass of $\emptyset$, the stronger the support that the FOD is incomplete. Then, taking 0.5 as a threshold, a piece of evidence whose mass of $\emptyset$ exceeds the threshold is judged to support an incomplete FOD. Therefore, the number of pieces of evidence that exceed the threshold is used to identify the FOD. If the criterion is satisfied, that is, more than half of the pieces of evidence support that the sample is outside the FOD, the FOD is incomplete.
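Criterion 1 amounts to a majority vote over the generated GBPAs, as the following Python sketch shows. It assumes strict comparisons ("exceeds 0.5", "more than half"), following the wording of the explanation above. The function name `criterion_1`, the dictionary representation, and the four GBPAs are hypothetical and are not the values used in the paper.

```python
EMPTY = frozenset()


def criterion_1(gbpas, threshold=0.5):
    """Count the GBPAs whose empty-set mass exceeds the threshold.

    Returns (number of votes, True if the FOD is judged incomplete).
    Strict inequalities are an assumption of this sketch.
    """
    votes = sum(1 for m in gbpas if m.get(EMPTY, 0.0) > threshold)
    return votes, votes > len(gbpas) / 2


A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
gbpas = [  # illustrative GBPAs, one per attribute
    {A: 0.8, EMPTY: 0.2},
    {EMPTY: 0.7, B: 0.3},
    {EMPTY: 0.9, A: 0.1},
    {EMPTY: 0.6, AB: 0.4},
]
votes, incomplete = criterion_1(gbpas)
print(votes, incomplete)
```

Here three of the four GBPAs put more than 0.5 on the empty set, so the sample supports an incomplete FOD.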

3.2. Criterion 2

Criterion 2. Let $\bar{m}$ be the weighted average of $m_1, m_2, \ldots, m_n$. The FOD is incomplete if $\bar{m}(\emptyset)$ is more than 0.5. Otherwise, the FOD is said to be complete.

The criterion indicates that if the mass of $\emptyset$ in the weighted average evidence is more than 0.5, the sample is considered to be in an incomplete FOD. In this criterion, a weighted average of the generated GBPAs is calculated, considering that the evidence generated from different attributes should have different importance. The mass of $\emptyset$ in the weighted average represents the total support for an incomplete FOD, taking the correlation and difference of the generated evidence into account. Its value is therefore also a piece of information for identifying the FOD: the larger the value, the stronger the support that the FOD is incomplete. Taking 0.5 as a threshold, if the mass of $\emptyset$ in the weighted average exceeds it, the evidence is judged to support an incomplete FOD.

As shown above, this criterion is based on the weighted average evidence. In this paper, Deng’s approach given in [28] is used to obtain it. The process is given below.

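Step 1. For each pair of generated GBPAs $m_i$ and $m_j$, the similarity between $m_i$ and $m_j$ is denoted as $Sim(m_i, m_j)$. Deng proposed calculating it based on the distance of evidence. However, Jiang showed in [35] that the correlation coefficient performs better than the distance of evidence, so in this paper the correlation coefficient is used as the similarity measure.

Definition 9. Let $U$ be a frame of discernment (FOD) in an open world, containing $N$ mutually exclusive and exhaustive hypotheses. The similarity measure is expressed as
$$Sim(m_i, m_j) = \frac{c(m_i, m_j)}{\sqrt{c(m_i, m_i) \cdot c(m_j, m_j)}},$$
where
$$c(m_i, m_j) = \sum_{p=1}^{2^N} \sum_{q=1}^{2^N} m_i(A_p)\, m_j(A_q)\, \frac{|A_p \cap A_q|}{|A_p \cup A_q|},$$
where $p, q = 1, \ldots, 2^N$; $A_p$, $A_q$ are the focal elements of $m_i$, $m_j$, respectively; and $|\cdot|$ is the cardinality of a subset, with the empty set $\emptyset$ treated as a special case.

Step 2. For the $n$ generated GBPAs, the similarity measure between $m_i$ and $m_j$ ($i, j = 1, \ldots, n$) can be calculated. A similarity measure matrix (SMM) can then be constructed to give insight into the agreement between the pieces of evidence:
$$SMM = \begin{bmatrix} 1 & Sim(m_1, m_2) & \cdots & Sim(m_1, m_n) \\ Sim(m_2, m_1) & 1 & \cdots & Sim(m_2, m_n) \\ \vdots & \vdots & \ddots & \vdots \\ Sim(m_n, m_1) & Sim(m_n, m_2) & \cdots & 1 \end{bmatrix}.$$

Step 3. After the similarity measure matrix SMM is obtained, the support degree of each piece of evidence $m_i$ ($i = 1, \ldots, n$) is defined by
$$Sup(m_i) = \sum_{j = 1,\, j \neq i}^{n} Sim(m_i, m_j).$$
Then, the credibility degree of each piece of evidence (i.e., $Crd_i$) is obtained as
$$Crd_i = \frac{Sup(m_i)}{\sum_{i=1}^{n} Sup(m_i)}.$$
For each piece of evidence, its credibility degree is taken as its weight.

Step 4. Finally, the modified weighted average evidence is given as
$$\bar{m} = \sum_{i=1}^{n} Crd_i \times m_i.$$

Once $\bar{m}$ is obtained, $\bar{m}(\emptyset)$ is known as well. According to this criterion, if $\bar{m}(\emptyset)$ is more than 0.5, the FOD is judged to be incomplete.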
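Steps 1 to 4 can be sketched in Python as follows. The sketch makes two assumptions that are not taken from the paper: the similarity is the correlation coefficient with the overlap of $\emptyset$ with itself set to 1, and the overlap of $\emptyset$ with any nonempty set set to 0. All function names and the four GBPAs are hypothetical.

```python
from math import sqrt

EMPTY = frozenset()


def overlap(a, b):
    """|a ∩ b| / |a ∪ b|, with overlap(∅, ∅) = 1 as an ASSUMPTION."""
    union = a | b
    return 1.0 if not union else len(a & b) / len(union)


def degree(m1, m2):
    return sum(
        v1 * v2 * overlap(a, b) for a, v1 in m1.items() for b, v2 in m2.items()
    )


def similarity(m1, m2):
    return degree(m1, m2) / sqrt(degree(m1, m1) * degree(m2, m2))


def weighted_average(gbpas):
    """Return (weighted average GBPA, credibility weights)."""
    n = len(gbpas)
    # Step 2: similarity measure matrix.
    smm = [[similarity(gbpas[i], gbpas[j]) for j in range(n)] for i in range(n)]
    # Step 3: support degree (off-diagonal row sum) and credibility degree.
    support = [sum(smm[i][j] for j in range(n) if j != i) for i in range(n)]
    total = sum(support)
    if total == 0.0:
        raise ValueError("no two pieces of evidence are similar")
    weights = [s / total for s in support]
    # Step 4: credibility-weighted average of the masses.
    average = {}
    for w, m in zip(weights, gbpas):
        for focal, mass in m.items():
            average[focal] = average.get(focal, 0.0) + w * mass
    return average, weights


A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
gbpas = [  # illustrative GBPAs, not the values of the paper
    {A: 0.8, EMPTY: 0.2},
    {EMPTY: 0.7, B: 0.3},
    {EMPTY: 0.9, A: 0.1},
    {EMPTY: 0.6, AB: 0.4},
]
average, weights = weighted_average(gbpas)
print(weights, average[EMPTY])
```

The first GBPA disagrees with the other three, so it receives the smallest weight, and the empty-set mass of the weighted average is larger than the plain mean of 0.6.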

3.3. Criterion 3

Criterion 3. Let $m$ be the combination of $m_1, m_2, \ldots, m_n$ obtained with the modified generalized combination rule (mGCR). The FOD is incomplete if $m(\emptyset)$ is more than 0.5. Otherwise, the FOD is said to be complete.

This criterion states that if the mass of $\emptyset$ in the combination of the generated pieces of evidence with mGCR is more than 0.5, the sample is considered to be in an incomplete FOD. This is because the combined mass of $\emptyset$ represents the total confidence of the pieces of evidence in an incomplete FOD, and the combination rule takes all generated pieces of evidence into account to obtain a final result that identifies which set the sample belongs to. If the mass assigned to $\emptyset$ is large, the mass assigned to the other sets is correspondingly small, so the result supports $\emptyset$ and an incomplete FOD. Taking 0.5 as a threshold, if the combined mass of $\emptyset$ exceeds it, the system is judged to support an incomplete FOD.
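A minimal Python sketch of Criterion 3 is given below. It assumes a particular reading of mGCR (empty set with empty set supports $\emptyset$, every other empty intersection is conflict, and all masses are renormalized) and folds the rule pairwise over the GBPAs. The names `mgcr_pair` and `criterion_3` and the four GBPAs are hypothetical.

```python
from functools import reduce

EMPTY = frozenset()


def mgcr_pair(m1, m2):
    """Pairwise combination under the assumed reading of mGCR."""
    unnormalized = {}
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection or (not b and not c):
                unnormalized[intersection] = (
                    unnormalized.get(intersection, 0.0) + mass_b * mass_c
                )
    surviving = sum(unnormalized.values())  # equals 1 - K for normalized inputs
    if surviving == 0.0:
        raise ValueError("total conflict: the combination is undefined")
    return {a: v / surviving for a, v in unnormalized.items()}


def criterion_3(gbpas, threshold=0.5):
    """Return (combined GBPA, True if the FOD is judged incomplete)."""
    combined = reduce(mgcr_pair, gbpas)
    return combined, combined.get(EMPTY, 0.0) > threshold


A, AB = frozenset("a"), frozenset("ab")
gbpas = [  # illustrative GBPAs, not the values of the paper
    {A: 0.8, EMPTY: 0.2},
    {EMPTY: 0.7, A: 0.3},
    {EMPTY: 0.9, A: 0.1},
    {EMPTY: 0.6, AB: 0.4},
]
combined, incomplete = criterion_3(gbpas)
print(combined, incomplete)
```

Dividing by the surviving mass is the same as dividing by one minus the conflict when each input GBPA sums to 1.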

3.4. A Numerical Example for the Three Criteria

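The proposed method can identify an incomplete FOD once three parameters are obtained: the number of generated GBPAs in which the mass of $\emptyset$ exceeds 0.5, the mass of $\emptyset$ in the weighted average evidence, and the mass of $\emptyset$ in the combined evidence. In this subsection, an illustrative example is given to show the identification result according to the three criteria.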

Example 10. Assume there is a FOD, and a new sample with four attributes is obtained. It is assumed that four GBPAs, $m_1$, $m_2$, $m_3$, and $m_4$, are generated from this sample. The proposed criteria can now be used to identify the completeness of the FOD in terms of $m_1$, $m_2$, $m_3$, and $m_4$.
At first, according to Criterion 1, the FOD is incomplete, since the mass of $\emptyset$ exceeds 0.5 in more than half of the four GBPAs.
Then, from Criterion 2, the FOD is also identified as incomplete; the process is as follows.
Steps 1 and 2. The similarity measure matrix (SMM) is first calculated with (22) and (24).
Step 3. The weights of the four pieces of evidence $m_1$, $m_2$, $m_3$, and $m_4$ are then calculated according to (25) and (26).
Step 4. Finally, the weighted average evidence of the four pieces of evidence is obtained. The mass of $\emptyset$ in it exceeds 0.5, so according to Criterion 2 the FOD is again identified as incomplete.
At last, Criterion 3 is used to identify the completeness of the FOD. By using mGCR to combine the four pieces of evidence, a combined GBPA is obtained in which the mass of $\emptyset$ exceeds 0.5. According to Criterion 3, the FOD is incomplete. For an intuitive overview, these results are all shown in Table 1.

As shown in this example, one piece of evidence supports that the sample belongs to an element of the FOD with a mass of 0.7988, while the other three pieces of evidence support that the sample belongs to $\emptyset$, which represents the elements outside the FOD. If none of these pieces of evidence is disturbed, one would intuitively consider this sample to be in an incomplete FOD. From Table 1, all three criteria are satisfied, so this sample is in an incomplete FOD, as the evidence implies.

4. Case Study

In this section, several experiments are given to demonstrate the effectiveness of the proposed method. The Iris data set, a popular data set for classification problems, is used for modeling and simulation in these experiments. In this data set, 150 samples belong to three categories, namely, setosa, versicolor, and virginica. Each category has 50 samples. Every sample in the data set has four attributes: sepal length (SL), sepal width (SW), petal length (PL), and petal width (PW).

Now the data set is divided into two parts: a training set, which includes 90 samples randomly selected in equal numbers from the three categories, and a test set, which contains the remaining 60 samples from the three categories. To establish an incomplete FOD, only the samples belonging to two of the three categories are extracted from the training set to construct the training model. By using the method in [37], a triangular fuzzy number model for each attribute of these two categories is constructed according to the minimum, mean, and maximum values of that attribute in the training set. Thus, for each attribute, two triangular fuzzy numbers associated with the two categories are generated to form the membership functions of the attribute. The relevant values are listed in Table 2, and the associated training models are shown in Figure 1.
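The construction of a triangular fuzzy number from the minimum, mean, and maximum of an attribute can be sketched as follows. The sketch uses the usual piecewise-linear triangular membership function, which is assumed here and may differ in detail from the method of [37]. The function names and the attribute values are hypothetical, and the later step that turns membership degrees into GBPAs [38] is not covered.

```python
def triangular_number(values):
    """Return (minimum, mean, maximum) of the training values of one attribute."""
    return min(values), sum(values) / len(values), max(values)


def membership(x, triangle):
    """Degree to which x belongs to the triangular fuzzy number."""
    low, peak, high = triangle
    if x < low or x > high:
        return 0.0
    if x <= peak:
        # Rising edge; a degenerate edge (peak == low) jumps straight to 1.
        return 1.0 if peak == low else (x - low) / (peak - low)
    # Falling edge; x > peak here, so high > peak and the division is safe.
    return (high - x) / (high - peak)


# Illustrative training values of one attribute for one category.
training_values = [4.0, 5.0, 6.0]
triangle = triangular_number(training_values)
print(triangle)
print(membership(4.5, triangle), membership(5.0, triangle), membership(7.0, triangle))
```

A test value that falls outside the triangles of every known category has zero membership in all of them, which is the situation in which a GBPA generation method can assign mass to the empty set.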

Then, based on the training models shown in Figure 1, four pieces of evidence can be generated for each sample in the test set, because an Iris sample has four attributes. The method in [38] is used to generate the GBPAs. In this way, for each sample in the test set, four pieces of evidence (i.e., GBPAs) associated with the four attributes are obtained, as shown in Figures 2–4.

Now four experiments are carried out to verify the effectiveness of the proposed criteria of identifying incomplete FOD. Three of them will consider a single criterion, while the last one will use all three criteria simultaneously.

Experiment 1. Only Criterion 1 is considered. For each sample in the test set, the number of generated GBPAs in which the mass of $\emptyset$ exceeds 0.5 is calculated. Criterion 1 is then used to judge whether the FOD is incomplete.

Experiment 2. Only Criterion 2 is considered. For the generated GBPAs of each sample in the test set, the method in Section 3.2 is used to derive the weighted average evidence, from which the mass of $\emptyset$ is obtained. The completeness of the FOD is then judged according to Criterion 2.

Experiment 3. Only Criterion 3 is considered. mGCR is used to combine the generated evidence of each sample, and the mass of $\emptyset$ in the combination result is obtained. The completeness of the FOD is then judged according to Criterion 3.

Experiment 4. In this case, Criteria 1, 2, and 3 are considered simultaneously. The parameters of the three criteria are derived with the method in Section 3 and used to judge whether the FOD is incomplete. If all three criteria are satisfied, the FOD is incomplete. If any one of the criteria is not satisfied, the FOD is complete.
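The decision rule of Experiment 4 can be written as a short Python function. It takes the three parameters as already computed inputs and assumes strict comparisons against the 0.5 threshold. The function name `identify_fod` and the input values are hypothetical.

```python
def identify_fod(empty_masses, avg_empty_mass, combined_empty_mass, threshold=0.5):
    """Apply the three criteria jointly to one sample.

    empty_masses: mass of the empty set in each generated GBPA.
    avg_empty_mass: mass of the empty set in the weighted average evidence.
    combined_empty_mass: mass of the empty set in the combined evidence.
    """
    votes = sum(1 for mass in empty_masses if mass > threshold)
    criterion_1 = votes > len(empty_masses) / 2
    criterion_2 = avg_empty_mass > threshold
    criterion_3 = combined_empty_mass > threshold
    # The FOD is judged incomplete only if all three criteria are satisfied.
    if criterion_1 and criterion_2 and criterion_3:
        return "incomplete"
    return "complete"


# Illustrative inputs, not results from the paper.
print(identify_fod([0.2, 0.7, 0.9, 0.6], 0.66, 0.89))
print(identify_fod([0.2, 0.7, 0.9, 0.6], 0.66, 0.40))
```

A single unsatisfied criterion is enough to return "complete", which is why the joint rule is harder to satisfy than any individual criterion.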

To show the results of these experiments clearly, the confusion matrix [39], which contains information about the actual and identified situations, is used. Based on this matrix, indices such as accuracy, sensitivity (also called recall), and precision have been developed to evaluate the performance of each criterion. For each sample of the two categories in the FOD, if it does not meet the criteria, the FOD is identified as complete, which is correct; if it does, the FOD is identified as incomplete, which is incorrect. For each sample of the remaining category, the identification is correct only when the result supports that the FOD is incomplete. Therefore, each sample either implies that the FOD is complete or supports that the FOD is incomplete. The simulation results of the four experiments are given in Table 3.

According to Table 3, the identification of the FOD is divided into two situations: complete FOD and incomplete FOD. Based on the experimental results, four confusion matrices are derived, shown in Tables 4–7. The recall rate, precision rate, and overall accuracy rate are then calculated, as shown in Table 8. As shown in the last column of Table 8, the fourth experiment achieves the highest identification accuracy, which shows that considering the three criteria simultaneously leads to better accuracy than considering a single criterion. Table 8 also shows that Criterion 2 has the best accuracy among the three individual criteria, because it takes the correlation and difference of the evidence into account.
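The recall, precision, and accuracy rates follow from the confusion matrix in the usual way, as in the Python sketch below. The function name `confusion_metrics`, the label strings, and the six illustrative identifications are hypothetical and do not reproduce the numbers in Tables 4–8.

```python
def confusion_metrics(actual, predicted, positive):
    """Recall, precision, and accuracy with `positive` as the class of interest."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = correct / len(pairs)
    return recall, precision, accuracy


# Illustrative labels: four samples inside the FOD, two outside it.
actual = ["complete"] * 4 + ["incomplete"] * 2
predicted = ["complete", "complete", "complete", "incomplete",
             "incomplete", "incomplete"]
print(confusion_metrics(actual, predicted, "incomplete"))
print(confusion_metrics(actual, predicted, "complete"))
```

In this toy case one sample inside the FOD is misjudged, which lowers the precision of the incomplete situation and the recall of the complete situation while leaving the recall of the incomplete situation at 1.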

To give a clear comparison of the results, Figures 5–7 graphically show the recall, precision, and accuracy rates derived from the different experiments. In every experiment, the recall rate of the incomplete FOD is greater than that of the complete FOD, which means that samples outside the FOD are more likely to be judged correctly. The precision rate of the incomplete FOD is smaller than that of the complete FOD, which means that an identification result of complete FOD is more accurate than one of incomplete FOD.

Figure 7 shows that Experiment 4, which considers the three criteria simultaneously, has the best accuracy. There are several reasons for this result. For example, the condition for identifying a complete FOD is weakened: the three criteria do not need to be handled together, because the FOD is identified as complete as soon as one criterion is not satisfied. According to Table 8, the accuracy rate of Criterion 3 is lower than that of the other experiments, so this condition removes some of the influence of Criterion 3.

In summary, the FOD of a sample can be identified with a high degree of accuracy, which shows that the proposed method is effective. Moreover, if all three criteria are satisfied, the identification of the system as being in an incomplete frame of discernment is correct with an accuracy of 88%.

5. Application

In this section, a case of motor rotor fault diagnosis is shown to demonstrate the effectiveness of the proposed method. In this case, the rotor acceleration spectrum and the average amplitude of the time-domain vibration displacement are obtained from the sensor data. The kind of rotor fault is then judged with the proposed method.

There are three kinds of faults for the motor rotor: imbalance, misalignment, and loose support base. The vibration amplitudes at the frequencies of 25 Hz, 50 Hz, and 75 Hz and the average amplitude of the vibration displacement in the time domain are selected as the four attributes. Each attribute was collected 40 times continuously within a time interval, for a total of 5 groups. These data are divided into two parts: a training set and a test set. The training set again includes 90 samples, and the test set includes the remaining 30 samples.

For each group, the samples belonging to two of the faults are extracted to establish the incomplete FOD. Based on the method in Section 4, the triangular membership functions in Figure 8 and the GBPAs in Figures 9–11 are generated. As in the experiments above, four experiments are carried out, considering the criteria individually and simultaneously. The average identification results, together with the recall, precision, and accuracy rates based on the confusion matrix, are shown in Table 9.

Table 9 leads to the same conclusion: the fourth experiment achieves the highest identification accuracy. Moreover, the accuracy of each experiment is quite high, and in particular all samples outside the FOD are identified correctly in this application. The experimental results demonstrate the effectiveness of the proposed method.

6. Conclusions

As stressed above, previous studies have not presented a method to identify an incomplete frame of discernment, even though an incomplete frame is an important source of conflict. In this paper, a new method is proposed under the framework of GET to identify an incomplete frame of discernment, making full use of the information contained in the pieces of evidence generated from a sample. The case study with four experiments demonstrates the effectiveness of the proposed method. In the experiments, the parameters of the three criteria have a great influence on identifying the incomplete FOD.

The proposed method can be applied in many areas, such as fault identification and infectious disease surveillance. However, an incorrect identification result may be obtained if the membership function is inaccurate. In the future, the proposed criteria for identifying an incomplete FOD will be merged with the combination of highly conflicting pieces of evidence to obtain more reasonable combination results.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Wen Jiang and Yue Chang conceived and designed the experiments. Yue Chang performed the experiments and wrote the paper. Shiyu Wang analyzed the data.

Acknowledgments

The work is partially supported by National Natural Science Foundation of China (Program nos. 61671384 and 61703338), Natural Science Basic Research Plan in Shaanxi Province of China (Program no. 2016JM6018), Project of Science and Technology Foundation, and Fundamental Research Funds for the Central Universities (Program no. 3102017OQD020).