Research Article  Open Access
Ezgi Türkarslan, Jun Ye, Mehmet Ünver, Murat Olgun, "Consistency Fuzzy Sets and a Cosine Similarity Measure in Fuzzy Multiset Setting and Application to Medical Diagnosis", Mathematical Problems in Engineering, vol. 2021, Article ID 9975983, 9 pages, 2021. https://doi.org/10.1155/2021/9975983
Consistency Fuzzy Sets and a Cosine Similarity Measure in Fuzzy Multiset Setting and Application to Medical Diagnosis
Abstract
The main purpose of this study is to lay the foundation for a new fuzzy set concept, called the consistency fuzzy set (CFS), for expressing multidimensional uncertain data. Our motivation is to reduce the complexity and difficulty caused by the information contained in the truth sequence of a fuzzy multiset (FMS) and to present the data of the truth sequence in a more understandable and compact manner. Therefore, this paper introduces the concept of CFS, which is characterized by a truth function defined on a universal set. The first component of the truth pair of a CFS is the average value of the truth sequence of a FMS, and the second component is the consistency degree, that is, the fuzzy complement of the standard deviation of the truth sequence of the same FMS. The main contribution of a CFS is that it reflects both the average level of data that may be expressed with different sequence lengths and, via the consistency degree, the degree of reasonable information in the data. To develop this new concept, this paper also presents a correlation coefficient and a cosine similarity measure between CFSs. Furthermore, the proposed correlation coefficient and cosine similarity measure are applied to a multiperiod medical diagnosis problem. Finally, a comparison between the obtained results and existing results in the literature is given to show the efficiency and rationality of the proposed correlation coefficient and cosine similarity measure.
1. Introduction
Fuzzy set theory was introduced by Zadeh [1] in 1965 through the concept of a membership (truth) function, which serves as an effective tool for handling uncertainty in science, and it has applications in many different fields such as economics, engineering, decision-making, management, and medicine [2–4]. There are many generalizations of the concept of the fuzzy set in the literature, and their applications to areas such as decision-making and medical diagnosis have been studied to model the uncertain data that is often encountered in science. For example, Akram et al. [5] have proposed a new decision-making method in a complex spherical fuzzy environment, and Das et al. [6] have introduced a medical diagnosis model by using fuzzy logic and intuitionistic fuzzy logic. Moreover, a decision-making method for the selection of an effective sanitizer to reduce COVID-19, one of the most pressing problems of recent times, has been presented in [7]. One of the generalizations of fuzzy sets is the concept of hesitant fuzzy set (HFS) [8], which is characterized by a membership (truth) function whose value is a set of crisp values in [0, 1]. A HFS can model uncertain data better than a fuzzy set, thanks to its flexible structure, so it has frequently been preferred by researchers for solving multicriteria (group) decision-making or multiperiod medical diagnosis problems [9–12]. However, a HFS discards repeated information because crisp sets do not record multiplicity. For example, suppose that a doctor evaluates a target patient's symptoms at four different times and some of the resulting membership degrees coincide. If the result of this evaluation is expressed as a HFS, then the repeated assessment is lost because of the way the HFS is formed. In such a situation, the concept of fuzzy multiset (FMS) is useful for expressing the information that would otherwise be lost.
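The difference between the two representations can be illustrated with a short Python sketch. The four membership degrees below are hypothetical values chosen only for illustration and are not taken from the paper; the hesitant-style view is modelled as a plain set and the multiset-style view as values with counts.

```python
from collections import Counter

# Hypothetical membership degrees from four evaluations (illustrative only).
evaluations = [0.4, 0.3, 0.4, 0.6]

# Hesitant-fuzzy-style view: a crisp set, so the repeated 0.4 collapses.
hesitant_view = set(evaluations)

# Fuzzy-multiset-style view: each value is kept together with its count.
multiset_view = Counter(evaluations)

print(sorted(hesitant_view))  # only the distinct values remain
print(multiset_view[0.4])     # how many times 0.4 was assessed
```

The set keeps only distinct values, whereas the counter retains how often each value was assigned, which is the information a count function is meant to preserve.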
The concept of FMS was proposed by Yager in 1986 [13, 14] with the help of a count function. In a fuzzy multiset setting, the membership degrees of the elements of a universal set are presented as sequences that may have different lengths (cardinalities) and may contain the same or different fuzzy values. Therefore, more accurate results can be obtained by preventing the loss of repeated information. Moreover, this type of fuzzy set is more appropriate for solving multicriteria group decision-making problems and multiperiod medical diagnosis problems. Although FMSs preserve repeated information, the uncertainty increases as the length of the sequences in the FMSs increases. This makes it difficult to express reasonable information and complicates the selection of an alternative in a decision-making problem. To make the information carried by a sequence in a FMS more understandable and to reduce the dependence of this information on the length of the sequence, statistical quantities such as the arithmetic mean and the standard deviation of the elements of the sequence can be used. Recently, Ye et al. [15] have used this idea in a neutrosophic environment. Motivated by this, we propose a new concept called the consistency fuzzy set (CFS). This concept is expressed as an ordered pair whose components are the average value and the consistency degree of the sequence, respectively. We then propose a correlation coefficient and a cosine similarity measure between CFSs.
Correlation analysis is an important research topic in fuzzy set theory and its generalizations because it can measure the relationship between two fuzzy sets. Correlation coefficients have therefore attracted attention from researchers, and their applications in various fields have been widely studied. For instance, Ye [16] has proposed a weighted correlation coefficient between intuitionistic fuzzy sets. Moreover, Guan et al. [17] have put forward a synthetic correlation coefficient between HFSs. Recently, Lin et al. [18] have developed directional correlation coefficient measures for Pythagorean fuzzy information and have applied them to medical diagnosis and cluster analysis. Several other researchers have also proposed correlation coefficients in various fuzzy environments (see, e.g., [19, 20]).
The concept of similarity measure plays an important role in determining the degree of similarity between two fuzzy sets. There are several types of similarity measures in the literature (see, e.g., [21–25]). The cosine similarity measure is one of them; it is defined as the inner product of two vectors divided by the product of their lengths, that is, the cosine of the angle between the vector representations of the fuzzy sets [26]. In this paper, we introduce a correlation coefficient and a cosine similarity measure between CFSs, and we present multiperiod medical diagnosis approaches based on them to show the efficiency of these new concepts.
The important contributions of the paper are listed below:
(i) The concept of CFS reduces the dependence of information on the length of the sequence in a FMS and presents the information carried by the sequence in a FMS in a more compact form.
(ii) A CFS that is based on the average value and the consistency degree can give reasonable information about the sequences in a FMS.
(iii) A CFS contains both the average level of data that may be expressed with different sequence lengths and the degree of consistency of the data, obtained via the fuzzy complement of the standard deviation of a sequence in a FMS.
(iv) A CFS makes the problem easier to understand, so the decision-making process works with compact information.
(v) The proposed correlation coefficient and cosine similarity measure between CFSs provide a useful ranking method, and they are beneficial mathematical tools for multiperiod medical diagnosis and multicriteria group decision-making problems in the FMS environment.
(vi) The developed medical diagnosis approach not only improves decision-making reliability but also supplies a new and effective way of handling multiperiod medical diagnosis problems in the FMS environment.
The remainder of this paper is organized as follows. In Section 2, we introduce the concept of CFS and give a correlation coefficient between CFSs. We then apply it to a multiperiod medical diagnosis problem to demonstrate the efficiency of the proposed correlation coefficient. In Section 3, we propose a cosine similarity measure between CFSs and apply it to the same multiperiod medical diagnosis problem. Moreover, we compare the results of the proposed correlation coefficient and the proposed cosine similarity measure with each other and with existing results in the literature. In Section 4, we conclude with some remarks.
2. CFSs and a Correlation Coefficient between CFSs
In this section, we recall the concepts of FMS and a correlation coefficient between FMSs. Then, we introduce the concept of CFS and a correlation coefficient between CFSs. Next, we apply it to a multiperiod medical diagnosis problem.
2.1. The Concept of CFS
Definition 1 (see [14]). Let be a finite set. A FMS in is characterized by a count membership function such that , where is the set of all crisp multisets in [0, 1]. The membership (truth) sequence is defined as such that , for . Therefore, a FMS is given by where is the length of the sequence for the th element. Obviously, a FMS reduces to a fuzzy set when every truth sequence has length one.
Now, we define the concept of CFS, which reduces the dependence of information on the length of the sequence in a FMS and presents the information carried by the sequence in a FMS in a more compact form.
Definition 2. Let be a finite set and let be a FMS in . Average values and consistency degrees of the membership (truth) sequences in are defined by for each (), respectively, where is the standard deviation of the th membership (truth) sequence in FMS . A CFS is defined by Moreover, the consistency fuzzy element (CFE) in CFS is simply denoted as , for each .
Example 1. Let be a finite set and let be the FMS in defined by Then, we construct the CFS corresponding to FMS by using (2) and (3).
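As a minimal sketch of the construction described in Definition 2, the following Python function maps each truth sequence to an (average, consistency degree) pair. It assumes the standard fuzzy complement, that is, one minus the standard deviation, and by default the population standard deviation; whether the definition normalizes by the sequence length or by the length minus one cannot be read off from the text here, so the `sample` flag is a hypothetical switch. The function names and the input data are illustrative and are not the data of Example 1.

```python
import statistics


def consistency_element(sequence, sample=False):
    """Return (average, consistency degree) for one truth sequence in [0, 1]."""
    if not sequence:
        raise ValueError("a truth sequence needs at least one value")
    if any(not 0.0 <= value <= 1.0 for value in sequence):
        raise ValueError("membership degrees must lie in [0, 1]")
    average = statistics.fmean(sequence)
    if len(sequence) == 1:
        sigma = 0.0  # a single value has no spread
    elif sample:
        sigma = statistics.stdev(sequence)   # divides by (length - 1)
    else:
        sigma = statistics.pstdev(sequence)  # divides by length
    # Assumed fuzzy complement of the standard deviation.
    return average, 1.0 - sigma


def fms_to_cfs(fms):
    """Map {element: truth sequence} to {element: (average, consistency)}."""
    return {element: consistency_element(seq) for element, seq in fms.items()}


# Hypothetical fuzzy multiset with sequences of different lengths.
fms = {"x1": [0.5, 0.5, 0.5], "x2": [0.2, 0.4], "x3": [0.7]}
cfs = fms_to_cfs(fms)
for element, pair in cfs.items():
    print(element, pair)
```

Whatever the sequence length, each element ends up as one pair, and a constant or single-valued sequence receives consistency degree one.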
By using CFSs, we make a statistical inference about the information carried by the truth sequences in a FMS, and we express the information presented in these sequences in a compact and understandable way. Thus, we simplify the decision-making process by reducing the complexity created by the length of the truth sequences in a FMS. We also eliminate the dependence of the information on the length of these truth sequences.
Fuzzy set theory has often been preferred by researchers, especially for solving real-life problems such as medical diagnosis and decision-making, since it can model uncertain information well. In solving these problems, once the uncertainty in the environment has been modeled with fuzzy sets, the optimal choice is usually determined by using aggregation functions or information measures such as similarity measures, entropy measures, and divergence measures. The correlation coefficient is a crucial measure that determines the relationship between two fuzzy sets. Now, we recall a correlation coefficient for FMSs.
Definition 3 (see [27]). Let be a finite set and let be two FMSs in . A correlation coefficient between and is given with where
Proposition 1 (see [27]). The correlation coefficient satisfies the following properties: If , then
Now, motivated by the definition of the correlation coefficient between FMSs, we propose a correlation coefficient between CFSs.
Definition 4. Let be a finite set and let and be two FMSs in . The correlation coefficient between CFSs and is given with where
Proposition 2. The correlation coefficient satisfies the following properties: If , then
Proof. Let be a finite set and let and be two CFEs in for a fixed . From the Schwarz inequality, we obtain Thus, we have Now, using the Cauchy-Schwarz inequality, we have Then, we obtain The proofs of and are straightforward.
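To make the idea of a correlation coefficient between CFSs concrete, here is a hedged Python sketch. It assumes the coefficient has the usual form of an inner product over the (average, consistency degree) pairs divided by the product of the corresponding Euclidean norms, which is the form to which the Cauchy-Schwarz inequality applies directly; the exact formula of Definition 4 is not legible in this text, so this form, the function name, and the data are assumptions.

```python
import math


def cfs_correlation(cfs_a, cfs_b):
    """Assumed correlation between two CFSs given as lists of (average, consistency)."""
    if len(cfs_a) != len(cfs_b) or not cfs_a:
        raise ValueError("both CFSs must cover the same non-empty universal set")
    inner = sum(a1 * a2 + c1 * c2 for (a1, c1), (a2, c2) in zip(cfs_a, cfs_b))
    norm_a = math.sqrt(sum(a * a + c * c for a, c in cfs_a))
    norm_b = math.sqrt(sum(a * a + c * c for a, c in cfs_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("correlation is undefined for an all-zero CFS")
    return inner / (norm_a * norm_b)


# Hypothetical CFSs over a three-element universal set.
cfs_a = [(0.60, 0.90), (0.30, 0.95), (0.80, 0.85)]
cfs_b = [(0.50, 0.80), (0.40, 0.90), (0.70, 0.95)]

print(cfs_correlation(cfs_a, cfs_b))
print(cfs_correlation(cfs_a, cfs_a))
```

Under this assumed form the value is symmetric in its arguments, lies in [0, 1] because all components are nonnegative, and is one when a CFS is compared with itself.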
Now, we propose a weighted version of the correlation coefficient for CFSs as follows.
Definition 5. Let be a finite set and let and be two FMSs in . A weighted correlation coefficient between CFSs and is given with where is the weight vector with , for all , such that .
2.2. An Application
Multiperiod medical diagnosis is a decision-making process that determines the disease of a target patient. In this process, the decision maker evaluates the effect of the symptoms on the target patient at several different times. What distinguishes this process from other medical diagnosis processes is that the solution algorithm takes the time variable into account [24]. Therefore, it can be convenient to present the patient's symptoms and diseases with the help of a sequence of fuzzy values.
Now, we adopt an illustrative example from [27] to show the applicability and effectiveness of the proposed correlation coefficient under FMS setting.
Example 2. Let be a set of patients and let be sets of diseases and symptoms, respectively. Suppose that all patients are examined at different time intervals with respect to all the symptoms and they are represented by the following FMSs: Moreover, assume that each disease , for , is given as a FMS with respect to all of the symptoms as follows: Now, we construct CFSs. Firstly, all patients in are expressed as CFSs , and as follows: respectively, and all diseases in are expressed as CFSs , and as follows: respectively. Let the weight of each symptom be , for . Now, we apply the proposed weighted correlation coefficient to determine the optimal disease for each patient. New results obtained in this study and some existing results in [27] are given in Table 1.
The process of assigning each patient to a disease is described by for fixed .
The numerical results in Table 1 show that the third and fourth patients suffer from throat disease and typhoid, respectively, according to both the correlation coefficient for FMSs [27] and the proposed correlation coefficient for CFSs. The remaining assignments in Table 1 differ between the two approaches, which may be attributable to the new approach used in this study.
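The assignment step can be sketched in Python as choosing, for a fixed patient, the disease whose CFS gives the largest weighted correlation. The weighted form below, in which the weights enter both the inner product and the norms, is an assumption because the formula of Definition 5 is not legible here; the disease labels, symptom profiles, and weights are hypothetical and are not the data of Example 2.

```python
import math


def weighted_cfs_correlation(cfs_a, cfs_b, weights):
    """Assumed weighted correlation between two CFSs of (average, consistency) pairs."""
    if not (len(cfs_a) == len(cfs_b) == len(weights)) or not weights:
        raise ValueError("CFSs and weights must have the same non-zero length")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    inner = sum(w * (a1 * a2 + c1 * c2)
                for w, (a1, c1), (a2, c2) in zip(weights, cfs_a, cfs_b))
    norm_a = math.sqrt(sum(w * (a * a + c * c) for w, (a, c) in zip(weights, cfs_a)))
    norm_b = math.sqrt(sum(w * (a * a + c * c) for w, (a, c) in zip(weights, cfs_b)))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("correlation is undefined for a zero-norm CFS")
    return inner / (norm_a * norm_b)


def diagnose(patient, diseases, weights):
    """Return the disease label with the largest weighted correlation."""
    return max(diseases,
               key=lambda name: weighted_cfs_correlation(patient, diseases[name], weights))


# Hypothetical disease profiles over two symptoms, as (average, consistency) pairs.
diseases = {
    "disease_1": [(0.80, 0.90), (0.20, 0.95)],
    "disease_2": [(0.30, 0.90), (0.70, 0.85)],
}
weights = [0.5, 0.5]
patient = [(0.75, 0.90), (0.25, 0.90)]

print(diagnose(patient, diseases, weights))
```

Ties are resolved by the iteration order of the dictionary, which is a choice of this sketch and not something the text specifies.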

3. A Cosine Similarity Measure for CFSs
3.1. A Cosine Similarity Measure
The cosine similarity measure is defined as the inner product of two vectors divided by the product of their lengths. In other words, a cosine similarity measure is the cosine of the angle between the vector representations of the two fuzzy sets. Now, motivated by [26], we introduce a cosine similarity measure and its weighted version for CFSs as follows.
Definition 6. Let be a finite set and let and be two FMSs in . A cosine similarity measure between CFSs and is given with
If we take , the cosine similarity measure reduces to the correlation coefficient , i.e., .
Proposition 3. The cosine similarity measure satisfies the following properties: If , then
Proof. Let be a finite set and let and be two CFEs in . Then, we have where is the radian measure of the angle between and . Therefore, is true. and are trivial.
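The description of the cosine similarity measure as the cosine of an angle between vector representations can be sketched as follows. This sketch assumes that each CFE is treated as a two-dimensional vector (average, consistency degree) and that the measure averages the element-wise cosines over the universal set; the exact formula of Definition 6 is not legible in this text, so the form, the function name, and the data are assumptions.

```python
import math


def cfs_cosine_similarity(cfs_a, cfs_b):
    """Assumed cosine similarity: mean cosine of the angle between paired CFEs."""
    if len(cfs_a) != len(cfs_b) or not cfs_a:
        raise ValueError("both CFSs must cover the same non-empty universal set")
    total = 0.0
    for (a1, c1), (a2, c2) in zip(cfs_a, cfs_b):
        denominator = math.hypot(a1, c1) * math.hypot(a2, c2)
        if denominator == 0.0:
            raise ValueError("the angle is undefined for a zero CFE")
        total += (a1 * a2 + c1 * c2) / denominator
    return total / len(cfs_a)


# Hypothetical CFSs over a three-element universal set.
cfs_a = [(0.60, 0.90), (0.30, 0.95), (0.80, 0.85)]
cfs_b = [(0.50, 0.80), (0.40, 0.90), (0.70, 0.95)]

print(cfs_cosine_similarity(cfs_a, cfs_b))
```

Because both components of every CFE are nonnegative, each angle lies between 0 and a right angle, so every element-wise cosine, and hence their average, lies in [0, 1].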
Now, we introduce the weighted version of the proposed cosine similarity measure between CFSs.
Definition 7. Let be a finite set and let and be two FMSs in . A weighted cosine similarity measure between CFSs and is given with where is the weight vector with , for all , such that .
It is clear that if we take , for any , then . Obviously, the proposed weighted cosine similarity measure also satisfies the properties .
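The reduction of the weighted measure to the unweighted one under equal weights can be checked numerically. The sketch below assumes that the weighted cosine similarity is the weighted sum of element-wise cosines between (average, consistency degree) pairs; this form, the helper names, and the data are assumptions, since the formula of Definition 7 is not legible here.

```python
import math


def element_cosine(p, q):
    """Cosine of the angle between two CFEs viewed as 2-D vectors."""
    (a1, c1), (a2, c2) = p, q
    denominator = math.hypot(a1, c1) * math.hypot(a2, c2)
    if denominator == 0.0:
        raise ValueError("the angle is undefined for a zero CFE")
    return (a1 * a2 + c1 * c2) / denominator


def weighted_cfs_cosine(cfs_a, cfs_b, weights):
    """Assumed weighted cosine similarity: weighted sum of element-wise cosines."""
    if not (len(cfs_a) == len(cfs_b) == len(weights)) or not weights:
        raise ValueError("CFSs and weights must have the same non-zero length")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    return sum(w * element_cosine(p, q) for w, p, q in zip(weights, cfs_a, cfs_b))


# Hypothetical CFSs over a three-element universal set.
cfs_a = [(0.60, 0.90), (0.30, 0.95), (0.80, 0.85)]
cfs_b = [(0.50, 0.80), (0.40, 0.90), (0.70, 0.95)]
n = len(cfs_a)

equal_weighted = weighted_cfs_cosine(cfs_a, cfs_b, [1.0 / n] * n)
unweighted = sum(element_cosine(p, q) for p, q in zip(cfs_a, cfs_b)) / n
print(equal_weighted, unweighted)
```

With equal weights the two values agree up to floating-point rounding, while unequal weights let more important symptoms contribute more to the similarity.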
3.2. An Application
Now, we examine the same multiperiod medical diagnosis problem, which is adapted from [27], to illustrate the applicability and effectiveness of the proposed cosine similarity measure for CFSs under the FMS setting. To this end, we use the CFSs of all the patients and all the diseases in Example 2.
Example 3. Let the weight of each symptom be for each . Now, we apply the proposed weighted cosine similarity measure to determine the optimal disease for all patients. New results obtained in this study and some existing results in [27] are given in Table 2.
The process of assigning each patient to a disease is described by for fixed .
The results in Table 2 show that the second, third, and fourth patients suffer from tuberculosis, throat disease, and typhoid, respectively, according to both the correlation coefficient in [27] and the proposed cosine similarity measure in this study. The remaining assignments in Table 2 differ between the two approaches, which may be attributable to the new approach used in this study.
The results in Table 3 show that the first and fourth patients suffer from typhoid, whereas the third patient suffers from throat disease, according to both the proposed correlation coefficient and the proposed cosine similarity measure.

