#### Abstract

Dual hesitant fuzzy sets (DHFSs), proposed by Zhu et al. (2012), encompass fuzzy sets, intuitionistic fuzzy sets, hesitant fuzzy sets, and fuzzy multisets as special cases. Correlation analysis is an important research topic. In this paper, we define correlation measures for dual hesitant fuzzy information and discuss their properties in detail. A numerical example illustrates these correlation measures. We then present a direct transfer algorithm that avoids the costly matrix composition needed to construct an equivalent correlation matrix when clustering DHFSs. Furthermore, we prove that the direct transfer algorithm is equivalent to the transfer closure algorithm, while its asymptotic time complexity and space complexity are superior to those of the latter. A real-world example, diamond evaluation and classification, is employed to show the effectiveness of the association coefficient and of the algorithm for clustering DHFSs.

#### 1. Introduction

Correlation indicates how well two variables move together in a linear fashion; in other words, it reflects a linear relationship between two variables. It is an important measure in data analysis, in particular in decision making, medical diagnosis, pattern recognition, and other real-world problems [1–7]. Zadeh [8] introduced the concept of fuzzy sets (FSs), whose basic component is only a membership function, with the nonmembership function being one minus the membership function. In fuzzy environments, Hung and Wu [9] used the concept of “expected value” to define the correlation coefficient of fuzzy numbers, which lies in $[-1, 1]$. Hong [10] considered the computational aspect of the $T_W$-based extension principle when the principle is applied to a correlation coefficient of $L$-$R$ fuzzy numbers and gave the exact solution of a fuzzy correlation coefficient without programming or the aid of computer resources. Atanassov [11, 12] gave a generalized form of fuzzy set, called intuitionistic fuzzy set (IFS), which is characterized by a membership function and a non-membership function. In intuitionistic fuzzy environments, Gerstenkorn and Mańko [13] defined a function measuring the correlation of IFSs and introduced a coefficient of such a correlation. Bustince and Burillo [14] introduced the concepts of correlation and correlation coefficient of interval-valued intuitionistic fuzzy sets (IVIFSs) [12]. Hung [15] and Mitchell [16] derived the correlation coefficient of intuitionistic fuzzy sets from a statistical viewpoint by interpreting an intuitionistic fuzzy set as an ensemble of ordinary fuzzy sets. 
Hung and Wu [17] proposed a method to calculate the correlation coefficient of intuitionistic fuzzy sets by means of “centroid.” Xu [18] gave a detailed survey on association analysis of intuitionistic fuzzy sets and pointed out that most existing methods for deriving association coefficients cannot guarantee that the association coefficient of any two intuitionistic fuzzy sets equals one if and only if the two sets are the same. Szmidt and Kacprzyk [5] discussed a concept of correlation for data represented as intuitionistic fuzzy sets, adopting concepts from statistics, and proposed a formula for measuring the correlation coefficient (lying in $[-1, 1]$) of intuitionistic fuzzy sets. Robinson and Amirtharaj [19] defined the correlation coefficient of interval vague sets lying in the interval $[0, 1]$ and proposed a new method for computing the correlation coefficient of interval vague sets lying in the interval $[-1, 1]$ using $\alpha$-cuts over the vague degrees through statistical confidence intervals, illustrated by an example. Instead of the point-based membership used in fuzzy sets, a vague set uses interval-based membership. In [20], Robinson and Amirtharaj presented a detailed comparison between vague sets and intuitionistic fuzzy sets and defined the correlation coefficient of vague sets through simple examples. Hesitant fuzzy sets (HFSs) were originally introduced by Torra [21, 22]. In hesitant fuzzy environments, Chen et al. [23] derived some correlation coefficient formulas for HFSs and applied them to two real-world examples by using clustering analysis under hesitant fuzzy environments. Xu and Xia [24] defined the correlation measures for hesitant fuzzy information and then discussed their properties in detail.

Recently, Zhu et al. [26] introduced the definition of dual hesitant fuzzy set. A dual hesitant fuzzy set can reflect human hesitance more objectively than the other classical extensions of fuzzy set (intuitionistic fuzzy set, type-2 fuzzy set (T-2FS) [25], hesitant fuzzy set, etc.). The motivation for proposing DHFSs is that when people make a decision, they are usually hesitant and irresolute for one reason or another, which makes it difficult to reach a final agreement. They further indicated that DHFSs can better deal with situations in which both the membership and the nonmembership of an element to a given set may take several different values, as can arise in a group decision making problem. For example, in the organization, some decision makers discuss the membership degree 0.6 and the non-membership 0.3 of an alternative that satisfies a criterion . Some possibly assign , while the others assign . No consistency is reached among these decision makers. Accordingly, the difficulty of establishing a common membership degree and a non-membership degree is not because we have a margin of error (intuitionistic fuzzy set) or some possibility distribution values (type-2 fuzzy set), but because we have a set of possible values (hesitant fuzzy set). For such a case, the satisfactory degrees can be represented by a dual hesitant fuzzy element , which is obviously different from intuitionistic fuzzy number or and hesitant fuzzy number . The aforementioned measures, however, cannot be used to deal with the correlation measures of dual hesitant fuzzy information. Thus, it is necessary to develop some theories for dual hesitant fuzzy sets. However, little has been done about this issue. In this paper, we mainly discuss the correlation measures of dual hesitant fuzzy information. The remainder of the paper is organized as follows. Section 2 presents some basic concepts related to DHFSs, HFSs, and IFSs. 
In Section 3, we propose some correlation measures of dual hesitant fuzzy elements, obtain several important conclusions, and give an example to illustrate the correlation measures. In Section 4, we propose a direct transfer clustering algorithm based on DHFSs and then use a numerical example to illustrate our algorithm. Finally, Section 5 concludes the paper with some remarks and presents future challenges.

#### 2. Preliminaries

##### 2.1. DHFSs, HFSs, and IFSs

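*Definition 1 (see [26]). *Let $X$ be a fixed set; then a dual hesitant fuzzy set (DHFS) $D$ on $X$ is described as
$$D = \{\langle x, h(x), g(x)\rangle \mid x \in X\},$$
in which $h(x)$ and $g(x)$ are two sets of some values in $[0, 1]$, denoting the possible membership degrees and non-membership degrees of the element $x \in X$ to the set $D$, respectively, with the conditions
$$0 \le \gamma, \eta \le 1, \qquad 0 \le \gamma^{+} + \eta^{+} \le 1,$$
where $\gamma \in h(x)$, $\eta \in g(x)$, $\gamma^{+} \in h^{+}(x) = \bigcup_{\gamma \in h(x)} \max\{\gamma\}$, and $\eta^{+} \in g^{+}(x) = \bigcup_{\eta \in g(x)} \max\{\eta\}$ for all $x \in X$. For convenience, the pair $d(x) = (h(x), g(x))$ is called a dual hesitant fuzzy element (DHFE), denoted by $d = (h, g)$, with the conditions $\gamma \in h$, $\eta \in g$, $\gamma^{+} \in h^{+} = \bigcup_{\gamma \in h} \max\{\gamma\}$, and $\eta^{+} \in g^{+} = \bigcup_{\eta \in g} \max\{\eta\}$, $0 \le \gamma, \eta \le 1$, and $0 \le \gamma^{+} + \eta^{+} \le 1$.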

*Definition 2 (see [21, 22]). *Let $X$ be a fixed set; a hesitant fuzzy set (HFS) $M$ on $X$ is defined in terms of a function that, when applied to $X$, returns a subset of $[0, 1]$, which can be represented as the following mathematical symbol:
$$M = \{\langle x, h_M(x)\rangle \mid x \in X\},$$
where $h_M(x)$ is a set of values in $[0, 1]$, denoting the possible membership degrees of the element $x \in X$ to the set $M$. For convenience, we call $h_M(x)$ a hesitant fuzzy element (HFE). We use $M = \{\langle x, h_M(x)\rangle\}$ for all $x \in X$ to represent HFSs.

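*Definition 3 (see [11, 12]). *Let $X$ be a fixed set; an intuitionistic fuzzy set (IFS) $A$ on $X$ is an object having the form
$$A = \{\langle x, \mu_A(x), \nu_A(x)\rangle \mid x \in X\},$$
which is characterized by a membership function $\mu_A$ and a non-membership function $\nu_A$, where $\mu_A : X \to [0, 1]$ and $\nu_A : X \to [0, 1]$, with the condition $0 \le \mu_A(x) + \nu_A(x) \le 1$, for all $x \in X$. We use $A = \{\langle x, \mu_A(x), \nu_A(x)\rangle\}$ for all $x \in X$ to represent IFSs considered in the rest of the paper without explicitly mentioning it. Furthermore, $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$ is called a hesitancy degree or an intuitionistic index of $x$ in $A$. In the special case $\pi_A(x) = 0$, that is, $\mu_A(x) + \nu_A(x) = 1$, the IFS $A$ reduces to an FS.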

##### 2.2. Correlation Coefficients of HFSs and IFSs

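Many approaches [4, 13, 17, 20, 21] have been introduced to compute the correlation coefficients of IFSs. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a discrete universe of discourse, and consider any two IFSs $A = \{\langle x_i, \mu_A(x_i), \nu_A(x_i)\rangle\}$ and $B = \{\langle x_i, \mu_B(x_i), \nu_B(x_i)\rangle\}$ on $X$.

The correlation of the IFSs $A$ and $B$ is defined as [13]
$$C_{\mathrm{IFS}}(A, B) = \sum_{i=1}^{n} \bigl(\mu_A(x_i)\,\mu_B(x_i) + \nu_A(x_i)\,\nu_B(x_i)\bigr).$$
Then, the correlation coefficient of the IFSs $A$ and $B$ is defined as
$$\rho_{\mathrm{IFS}}(A, B) = \frac{C_{\mathrm{IFS}}(A, B)}{\sqrt{C_{\mathrm{IFS}}(A, A)\, C_{\mathrm{IFS}}(B, B)}}.$$
In [23], Chen et al. defined the correlation and correlation coefficient for HFSs as follows, respectively:
$$C_{\mathrm{HFS}}(A, B) = \sum_{i=1}^{n} \left(\frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i)\right),$$
$$\rho_{\mathrm{HFS}}(A, B) = \frac{C_{\mathrm{HFS}}(A, B)}{\sqrt{C_{\mathrm{HFS}}(A, A)\, C_{\mathrm{HFS}}(B, B)}},$$
where $h_{A\sigma(j)}(x_i)$ is the $j$th largest value in $h_A(x_i)$, $l_i = \max\{l(h_A(x_i)), l(h_B(x_i))\}$ for each $x_i$ in $X$, and $l(h_A(x_i))$ and $l(h_B(x_i))$ represent the number of values in $h_A(x_i)$ and $h_B(x_i)$, respectively. We will discuss $l_i$ in detail in the next section.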
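As a small illustration of the IFS case, the following Python sketch computes the correlation of Gerstenkorn and Mańko [13] (the sum of products of membership degrees plus products of non-membership degrees) and the coefficient obtained by normalizing with the geometric mean of the two self-correlations. The function names and the sample data are hypothetical, and each IFS is assumed to be given as a list of (membership, non-membership) pairs over the same finite universe, with at least one nonzero degree so that the denominator does not vanish.

```python
from math import sqrt

def ifs_correlation(a, b):
    """Sum over the universe of mu_A * mu_B + nu_A * nu_B."""
    return sum(ma * mb + na * nb for (ma, na), (mb, nb) in zip(a, b))

def ifs_correlation_coefficient(a, b):
    """Correlation normalized by sqrt(C(A, A) * C(B, B))."""
    return ifs_correlation(a, b) / sqrt(ifs_correlation(a, a) * ifs_correlation(b, b))

# Hypothetical IFSs over a three-element universe, as (mu, nu) pairs.
A = [(0.6, 0.3), (0.5, 0.4), (0.8, 0.1)]
B = [(0.7, 0.2), (0.4, 0.5), (0.9, 0.0)]

print(round(ifs_correlation_coefficient(A, B), 4))
```

By the Cauchy-Schwarz inequality the coefficient cannot exceed one, and it equals one when the two arguments coincide.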

#### 3. Correlation Measures of DHFEs

In this section, we first introduce the concept of correlation and correlation coefficient for DHFSs and then propose several correlation coefficient formulas and discuss their properties.

We arrange the elements in in decreasing order and let be the th largest value in and the th largest value in . Let the number of values in and be the number of values in . For convenience, . In most cases, for two DHFSs and , ; that is, , . To compute the correlation correctly, we should extend the shorter one until both have the same length when we compare them. In [24, 27], Xu and Xia extended the shorter one by adding different values in hesitant fuzzy environments. Similarly, Chen et al. [23] applied this idea to derive some correlation coefficient formulas for HFSs. In fact, we can extend the shorter one by adding any value in it. The selection of this value mainly depends on the decision makers’ risk preferences. Optimists anticipate desirable outcomes and may add the maximum value, while pessimists expect unfavorable outcomes and may add the minimum value. The same situation can also be found in many existing references [13, 14].
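The extension rule described above can be sketched in Python as follows. The helper names `extend` and `align` are hypothetical, each set of degrees is assumed to be a nonempty list of numbers, and the padding value is the maximum for an optimist or the minimum for a pessimist, as in the text.

```python
def extend(values, length, optimistic=True):
    """Sort in decreasing order and pad up to `length` by repeating the
    maximum (optimistic) or the minimum (pessimistic) value."""
    ordered = sorted(values, reverse=True)
    pad = ordered[0] if optimistic else ordered[-1]
    padded = ordered + [pad] * (length - len(ordered))
    return sorted(padded, reverse=True)

def align(u, v, optimistic=True):
    """Bring two sets of degrees to a common length."""
    n = max(len(u), len(v))
    return extend(u, n, optimistic), extend(v, n, optimistic)

# A pessimist repeats the smallest value of the shorter set.
print(align([0.5, 0.2], [0.7, 0.4, 0.3], optimistic=False))
# An optimist repeats the largest value instead.
print(align([0.5, 0.2], [0.7, 0.4, 0.3], optimistic=True))
```

The same helper applies to membership and non-membership sets alike, since both are compared position by position after sorting.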

We define several correlation coefficients for DHFEs.

*Definition 4. *For two DHFSs and on , the correlation of and , denoted as , is defined by

*Definition 5. *For two DHFSs and on , the correlation coefficient of and , denoted as , is defined by:

*Definition 6. *For two DHFSs and on , the correlation coefficient of and , denoted as , is defined by
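To make the shape of such coefficients concrete, here is a hedged Python sketch. It assumes, by analogy with the IFS and HFS formulas of Section 2.2, that the correlation of two DHFSs sums, over the universe, the mean product of the sorted membership values plus the mean product of the sorted non-membership values; this assumed form is for illustration and is not claimed to reproduce the formulas of Definitions 4–6 exactly. The names `rho1` (normalization by the geometric mean of the self-correlations) and `rho2` (normalization by their maximum) are likewise hypothetical, and every set of degrees is assumed nonempty.

```python
from math import sqrt

def _pad(values, n, optimistic):
    # Sort decreasingly and repeat the largest or smallest value up to length n.
    ordered = sorted(values, reverse=True)
    fill = ordered[0] if optimistic else ordered[-1]
    return sorted(ordered + [fill] * (n - len(ordered)), reverse=True)

def _align(a, b, optimistic):
    # Pad membership (h) and non-membership (g) sets element by element.
    out_a, out_b = [], []
    for (ha, ga), (hb, gb) in zip(a, b):
        k, l = max(len(ha), len(hb)), max(len(ga), len(gb))
        out_a.append((_pad(ha, k, optimistic), _pad(ga, l, optimistic)))
        out_b.append((_pad(hb, k, optimistic), _pad(gb, l, optimistic)))
    return out_a, out_b

def _mean_product(u, v):
    return sum(x * y for x, y in zip(u, v)) / len(u)

def correlation(a, b):
    """Assumed form of the correlation; inputs must already be aligned."""
    return sum(_mean_product(ha, hb) + _mean_product(ga, gb)
               for (ha, ga), (hb, gb) in zip(a, b))

def rho1(a, b, optimistic=False):
    a, b = _align(a, b, optimistic)
    return correlation(a, b) / sqrt(correlation(a, a) * correlation(b, b))

def rho2(a, b, optimistic=False):
    a, b = _align(a, b, optimistic)
    return correlation(a, b) / max(correlation(a, a), correlation(b, b))

# Hypothetical DHFSs over a two-element universe, as (h, g) pairs.
A = [([0.5, 0.4], [0.3]), ([0.7], [0.2, 0.1])]
B = [([0.6], [0.3, 0.2]), ([0.8, 0.6], [0.1])]
print(round(rho1(A, B), 4), round(rho2(A, B), 4))
```

One design choice is worth noting: the self-correlations are computed on the padded sets, so that the Cauchy-Schwarz bound applies to vectors of equal length and both values stay in the unit interval.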

Theorem 7. *For two DHFSs and , the correlation coefficient of and , denoted as , should satisfy the following properties:*(1)*;*(2)*;*(3)*; .*

*Proof. *(1) The inequality and is obvious. Below let us prove that , :
Using the Cauchy-Schwarz inequality
where , we obtain
Therefore,
So, .

In fact, we have
Then
We also obtain .

(2) and (3) are straightforward.

Moreover, from the proof of Theorem 7, we have Theorem 8 easily.

Theorem 8. *For two DHFSs and on , then .*

However, from Theorem 7, we notice that all the above correlation coefficients cannot guarantee that the correlation coefficient of any two DHFSs equals one if and only if these two DHFSs are the same. Thus, how to derive the correlation coefficients of the DHFSs satisfying this desirable property is an interesting research topic. To solve this issue, in what follows, we develop a new method to calculate the correlation coefficient of the DHFSs and .

*Definition 9. *For two DHFSs and on , the correlation coefficient of and , denoted as , is defined by
where

Equation (17) is motivated by the generalized idea provided by Xu [18]. Obviously, the greater the value of , the closer to . By Definition 9, we have Theorem 10.

Theorem 10. *The correlation coefficient satisfies the following properties:*(1)*;*(2)*;*(3)*.*

*Proof. *(1) The inequality is obvious. Below let us prove that
We obtain
(2) and (3) are obvious.

Usually, in practical applications, the weight of each element should be taken into account, and, so, we present the following weighted correlation coefficient. Assume that the weight of the element is with and ; then we extend the correlation coefficient formulas given: where

Note that all these formulas satisfy the properties in Theorem 7.

In what follows, we use a medical diagnosis problem in [28, 29] to illustrate the developed correlation coefficient formulas. Actually, this is also a pattern recognition problem.

*Example 11. *To make a proper diagnosis (viral fever), (malaria), (typhoid), (stomach problem), and (chest problem)} for a patient with the given values of the symptoms, (temperature), (headache), (cough), (stomach pain), and (chest pain)}, Xu [18] considered all possible diagnoses and symptoms as HFEs. Utilizing DHFSs can take much more information into account; the more values we obtain from patients, the greater epistemic certainty we have. So, in this paper, we use DHFEs to deal with such cases; each symptom is described by a DHFE, which is described by two sets and . indicates the degree that symptoms characteristic satisfies the considered diagnoses and indicates the degree that the symptoms characteristic does not satisfy the considered diagnoses . The data are given in Table 1. The set of patients is . The symptoms which can be also described by DHFEs are given in Table 2. We need to seek a diagnosis for each patient.

We utilize the correlation coefficient to derive a diagnosis for each patient. All the results for the considered patients are listed in Table 3. From the arguments in Table 3, we can find that Ted suffers from viral fever, Al and Joe from malaria, and Bob from stomach problem.

If we utilize the correlation coefficient formulas and to derive a diagnosis, then the results are listed in Tables 4 and 5, respectively.

From Tables 3–5 we know that the results obtained by different correlation coefficient formulas are different. That is because these correlation coefficient formulas are based on different linear relationships.

#### 4. Clustering Method Based on Direct Transfer Algorithm for DHFSs

Based on the clustering algorithms for IFSs [30, 31] and HFSs [23], and on the correlation coefficient formulas developed above for DHFSs, we propose a direct transfer algorithm for clustering analysis. It addresses the costly matrix composition required to convert a similarity (analogical) relation into an equivalence relation when clustering under dual hesitant fuzzy environments. Before doing this, we introduce some concepts.

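*Definition 12. *Let $A_j$ ($j = 1, 2, \ldots, m$) be $m$ DHFSs; then $C = (\rho_{ij})_{m \times m}$ is called an association matrix, where $\rho_{ij} = \rho(A_i, A_j)$ is the association coefficient of $A_i$ and $A_j$, which has the following properties: (1) $0 \le \rho_{ij} \le 1$, for all $i, j = 1, 2, \ldots, m$; (2) $\rho_{ij} = 1$ if and only if $A_i = A_j$; (3) $\rho_{ij} = \rho_{ji}$, for all $i, j = 1, 2, \ldots, m$.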

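*Definition 13 (see [23, 30]). *Let $C = (\rho_{ij})_{m \times m}$ be an association matrix; if $C^2 = C \circ C = (\bar{\rho}_{ij})_{m \times m}$, then $C^2$ is called a composition matrix of $C$, where $\bar{\rho}_{ij} = \max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\}$, for all $i, j = 1, 2, \ldots, m$.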
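A minimal Python sketch of the composition, assuming the usual max-min rule (entry $(i, j)$ of the composed matrix is the maximum over $k$ of the minimum of entries $(i, k)$ and $(k, j)$). The function name `compose` and the sample matrix are hypothetical.

```python
def compose(c):
    """Max-min composition of a square matrix (list of rows) with itself."""
    m = len(c)
    return [[max(min(c[i][k], c[k][j]) for k in range(m)) for j in range(m)]
            for i in range(m)]

# Hypothetical association matrix of three objects.
C = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

C2 = compose(C)
for row in C2:
    print(row)
```

In this example the entry linking the first and third objects rises from 0.3 to 0.5, because the path through the second object has a weakest link of 0.5. Each composition costs on the order of $m^3$ comparisons for $m$ objects, which is the cost the direct transfer algorithm is designed to avoid.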

Based on Definition 13, we have the following theorem.

Theorem 14 (see [23, 30]). *Let $C$ be an association matrix; then the composition matrix $C^2$ is also an association matrix.*

Theorem 15 (see [23, 30]). *Let $C$ be an association matrix; then, for any nonnegative integer $k$, the composition matrix $C^{2^{k+1}}$ derived from $C^{2^{k+1}} = C^{2^k} \circ C^{2^k}$ is also an association matrix.*

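*Definition 16 (see [23, 30]). *Let $C = (\rho_{ij})_{m \times m}$ be an association matrix; if $C^2 \subseteq C$, that is,
$$\max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\} \le \rho_{ij}, \quad \text{for all } i, j = 1, 2, \ldots, m,$$
then $C$ is called an equivalent association matrix.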

By the transitivity principle of equivalent matrix, we can easily prove the following theorem.

Theorem 17 (see [23, 30, 32]). *Let $C$ be an association matrix; then, after a finite number of compositions $C \to C^2 \to C^4 \to \cdots \to C^{2^k} \to \cdots$, there must exist a positive integer $k$ such that $C^{2^k} = C^{2^{(k+1)}}$, and $C^{2^k}$ is also an equivalent association matrix.*

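*Definition 18 (see [23, 30, 31]). *Let $C = (\rho_{ij})_{m \times m}$ be an equivalent correlation matrix. Then we call $C_\lambda = ({}_\lambda\rho_{ij})_{m \times m}$ the $\lambda$-cutting matrix of $C$, where
$${}_\lambda\rho_{ij} = \begin{cases} 0, & \text{if } \rho_{ij} < \lambda, \\ 1, & \text{if } \rho_{ij} \ge \lambda, \end{cases} \qquad i, j = 1, 2, \ldots, m,$$
and $\lambda$ is the confidence level with $\lambda \in [0, 1]$.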

Next, a traditional transfer closure algorithm is given as follows.

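*Step 1. *Let $\{A_1, A_2, \ldots, A_m\}$ be a set of DHFSs in $X = \{x_1, x_2, \ldots, x_n\}$. We can calculate the correlation coefficients of the DHFSs and then construct a correlation matrix $C = (\rho_{ij})_{m \times m}$, where $\rho_{ij} = \rho(A_i, A_j)$.

*Step 2. *Check whether $C$ is an equivalent correlation matrix; that is, check whether it satisfies $C^2 \subseteq C$, where
$$C^2 = C \circ C = (\bar{\rho}_{ij})_{m \times m}, \qquad \bar{\rho}_{ij} = \max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\}, \quad i, j = 1, 2, \ldots, m.$$
If it does not hold, we construct the equivalent correlation matrix $C^{2^k}$: $C \to C^2 \to C^4 \to \cdots \to C^{2^k} \to \cdots$, until $C^{2^k} = C^{2^{(k+1)}}$.

*Step 3. *For a confidence level $\lambda$, we construct a $\lambda$-cutting matrix $C_\lambda = ({}_\lambda\rho_{ij})_{m \times m}$ through Definition 18 in order to classify the DHFSs $A_j$ ($j = 1, 2, \ldots, m$). If all elements of the $i$th row (column) are the same as the corresponding elements of the $j$th row (column) in $C_\lambda$, then the DHFSs $A_i$ and $A_j$ are of the same type. By means of this principle, we can classify all these $A_j$ ($j = 1, 2, \ldots, m$).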
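The three steps above can be sketched in Python as follows, starting from an already computed correlation matrix. The sketch assumes max-min composition, a symmetric matrix with ones on the diagonal, and a cut that maps entries at or above the confidence level to 1 and the rest to 0; all function names and the sample matrix are hypothetical.

```python
def compose(c):
    """Max-min composition of a square matrix with itself."""
    m = len(c)
    return [[max(min(c[i][k], c[k][j]) for k in range(m)) for j in range(m)]
            for i in range(m)]

def equivalent_matrix(c):
    """Square the matrix repeatedly until it stops changing (Step 2)."""
    while True:
        nxt = compose(c)
        if nxt == c:
            return c
        c = nxt

def lambda_cut(c, lam):
    """Entries at or above the confidence level become 1, the rest 0."""
    return [[1 if x >= lam else 0 for x in row] for row in c]

def classify(cut):
    """Group the indices whose rows in the cut matrix coincide (Step 3)."""
    groups = {}
    for i, row in enumerate(cut):
        groups.setdefault(tuple(row), []).append(i)
    return sorted(groups.values())

def transfer_closure_clusters(c, lam):
    return classify(lambda_cut(equivalent_matrix(c), lam))

# Hypothetical correlation matrix of three DHFSs (indices 0, 1, 2).
C = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

for lam in (0.5, 0.8, 0.9):
    print(lam, transfer_closure_clusters(C, lam))
```

The repeated calls to `compose` are where the running time goes: every squaring touches all triples of indices.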

The transfer closure algorithm above has one drawback: the costly matrix composition needed to construct the equivalent correlation matrix. To avoid it, we establish the following theorem on correlation coefficients in the dual hesitant fuzzy environment.

Theorem 19. *For all , for the confidence level , if , when , , , then and are of the same type.*

*Proof. *we are motivated by the generalized idea based on the transitivity principle of ordinary equivalent relation : for all (here, is an ordinary set, not a fuzzy set), , when , , we can have .

And from Definition 18, we can see that the -cutting matrix of is an ordinary correlation matrix, which completes the proof of Theorem 19.

From the above theoretical analysis, we propose a direct transfer algorithm for clustering DHFSs as follows.

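*Step 1. *Let $\{A_1, A_2, \ldots, A_m\}$ be a set of DHFSs in $X = \{x_1, x_2, \ldots, x_n\}$. We can calculate the correlation coefficients of the DHFSs and then construct a correlation matrix $C = (\rho_{ij})_{m \times m}$, where $\rho_{ij} = \rho(A_i, A_j)$.

*Step 2. *By setting the threshold to the confidence level $\lambda$, we can construct a $\lambda$-cutting matrix $C_\lambda = ({}_\lambda\rho_{ij})_{m \times m}$ directly from $C$. If ${}_\lambda\rho_{ij} = 1$, this means that the DHFSs $A_i$ and $A_j$ are of the same type. By means of this principle, together with the transitivity stated in Theorem 19, we can classify all these $A_j$ ($j = 1, 2, \ldots, m$).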
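A possible Python sketch of the direct transfer algorithm follows. It reads Step 2 together with Theorem 19 as: link two DHFSs whenever their correlation coefficient reaches the confidence level, and let "same type" propagate transitively. The union-find structure used to propagate the links is an implementation choice of this sketch, not something prescribed by the text, and the names and sample matrix are hypothetical.

```python
def direct_transfer_clusters(c, lam):
    """Cluster indices by thresholding the correlation matrix c at lam."""
    m = len(c)
    parent = list(range(m))  # parent[i] == i marks a cluster representative

    def find(i):
        # Follow parent links to the representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(m):
        for j in range(i + 1, m):
            if c[i][j] >= lam:           # the cut entry is 1: same type
                parent[find(i)] = find(j)

    groups = {}
    for i in range(m):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical correlation matrix of three DHFSs (indices 0, 1, 2).
C = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

for lam in (0.5, 0.8, 0.9):
    print(lam, direct_transfer_clusters(C, lam))
```

Each pair of objects is inspected once and no matrix is composed, so the work grows with the number of pairs rather than with the number of triples.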

We can see that the transfer closure algorithm must construct the equivalent correlation matrix $C^{2^k}$: $C \to C^2 \to C^4 \to \cdots \to C^{2^k} \to \cdots$, until $C^{2^k} = C^{2^{(k+1)}}$, and then construct a $\lambda$-cutting matrix through Definition 18 in order to classify the DHFSs $A_j$ ($j = 1, 2, \ldots, m$). By contrast, the direct transfer algorithm only constructs a $\lambda$-cutting matrix of the correlation matrix itself, by setting the threshold to the confidence level $\lambda$, and then classifies the DHFSs directly. In what follows, we discuss the relationship between the transfer closure algorithm and the direct transfer algorithm.

Theorem 20. *The clustering results are the same by the transfer closure algorithm and the direct transfer algorithm, at the same confidence level. *

*Proof. *(1) For a confidence level , for all , , if , , , then and are of the same type by the direct transfer algorithm.

Assume that we construct the equivalent correlation matrix when we employ the transfer closure algorithm. We must prove that . Consider

So , and, for the same reason, we have . Consider

For a confidence level , when we get that and are of the same type using the direct transfer algorithm, we can also have the same clustering results by the transfer closure algorithm.

(2) For a confidence level , for all , the equivalent correlation matrix , , then, and are of the same type by the transfer closure algorithm.

Let

Then , , , .

So and are the same type in by the direct transfer algorithm.

For the same reason, , , . , , .

and are the same type in by the direct transfer algorithm.

So, , , , .

and are the same type in by the direct transfer algorithm.

For a confidence level , when we get and are of the same type using the transfer closure algorithm, we can also have the same clustering results by the direct transfer algorithm, which completes the proof.
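The equivalence can also be checked empirically. The Python sketch below implements both procedures under the same assumptions as before (max-min composition, symmetric matrices with unit diagonal, cut at "greater than or equal to") and compares their clusterings on randomly generated matrices at every confidence level that occurs in each matrix. The names, the random test data, and the relabelling scheme used for the direct procedure are all hypothetical choices of this sketch.

```python
import random

def compose(c):
    m = len(c)
    return [[max(min(c[i][k], c[k][j]) for k in range(m)) for j in range(m)]
            for i in range(m)]

def closure_clusters(c, lam):
    """Transfer closure: square until stable, cut, group identical rows."""
    while True:
        nxt = compose(c)
        if nxt == c:
            break
        c = nxt
    groups = {}
    for i, row in enumerate(c):
        groups.setdefault(tuple(x >= lam for x in row), []).append(i)
    return sorted(groups.values())

def direct_clusters(c, lam):
    """Direct transfer: cut the raw matrix and merge linked items."""
    m = len(c)
    label = list(range(m))  # label[i] is the current cluster id of item i
    for i in range(m):
        for j in range(i + 1, m):
            if c[i][j] >= lam and label[i] != label[j]:
                old, new = label[j], label[i]
                label = [new if x == old else x for x in label]
    groups = {}
    for i, x in enumerate(label):
        groups.setdefault(x, []).append(i)
    return sorted(groups.values())

def random_correlation_matrix(m, rng):
    c = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            c[i][j] = c[j][i] = round(rng.random(), 2)
    return c

def agree(trials=200, m=6, seed=0):
    """Return True if both procedures agree on every trial and level."""
    rng = random.Random(seed)
    for _ in range(trials):
        c = random_correlation_matrix(m, rng)
        for lam in sorted({x for row in c for x in row}):
            if closure_clusters(c, lam) != direct_clusters(c, lam):
                return False
    return True

print(agree())
```

Such a check does not replace the proof, but it is a cheap safeguard when implementing either algorithm.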

We assume to be a set of DHFSs, and we construct the equivalent correlation matrix : , until and then construct a -cutting matrix for the transfer closure algorithm. Consequently, the running time of the transfer closure algorithm is ; by the same arguments, the direct transfer algorithm requires time on the same example. Moreover, the step of constructing the equivalent correlation matrix in the transfer closure algorithm has a space bound of at least , while the direct transfer algorithm only constructs a -cutting matrix by setting the threshold to the confidence level and needs space bound. We can see that the computational complexity of both algorithms depends on the number of , and the direct transfer algorithm exhibits better behavior.

Below, we conduct experiments in order to demonstrate the effectiveness of the proposed clustering algorithm for DHFSs.

*Example 21. *Every diamond is a miracle of time and place and chance. Like snowflakes, no two are exactly alike. Every consumer shopping for diamonds is faced with endless diamond combinations. In addition to different diamond combinations, prices are also influenced by market supply and demand conditions, fashion trends, and so forth. While consumers' tastes and budgets change, most seek to find a fair price for the diamond they choose. Until the middle of the twentieth century, there was no agreed-upon standard by which diamonds could be judged. No matter how beautiful a diamond may look, you simply cannot see its true quality. GIA created the first and now globally accepted standard for describing diamonds: color, clarity, cut, and carat weight. Concerning color, the less color there is in the stone, the more desirable and valuable it is. Grades run from “D” to “Z.” Clarity measures the amount, size, and placement of internal “inclusions” and external “blemishes.” Grades run from “Flawless” to “Included.” Cut does not refer to a diamond's shape but to the proportion and arrangement of its facets and the quality of workmanship. Grades range from “excellent” to “poor.” Carat refers to a diamond's weight. Generally speaking, the higher the carat weight, the more expensive the stone. Two diamonds of equal carat weight, however, can have very different quality and price when the other three Cs are considered. We choose a “perfect” diamond whose 4C is “D” color, “FL” clarity, “3 excellent” cut, and “1 carat” weight. For the convenience of analysis, the weight vector of these attributes is . Here, there are ten diamonds. To make a better assessment, several evaluation organizations are consulted. The normalized evaluation data of the diamonds, represented by DHFSs, are displayed in Table 6.

Now we utilize the direct transfer algorithm to cluster the ten diamonds, which involves the following steps.

*Step 1. *Utilize (21) to calculate the association coefficients, and then construct an association matrix:

*Step 2. *We give a detailed sensitivity analysis with respect to the confidence level, and, by (26), we get all the possible classifications of the ten diamonds; see Table 7 and Figure 1.

From the above numerical analysis, under the group setting, the experts’ evaluation information usually does not reach an agreement for the objects that need to be classified. Example 21 clearly shows that the clustering algorithm based on DHFSs provides a proper way to resolve this issue.

In the following, a comparison is made among the method proposed in this paper, Chen et al.’s method [23], and Zhao et al.’s method [31] in Table 8.

From Table 8, it is worth pointing out that the clustering results of the direct transfer clustering method proposed in this paper are exactly the same as those of Chen et al.’s transfer clustering method and Zhao et al.’s Boole method, but our method does not need the transitive closure technique to calculate the equivalent matrix of the association matrix and thus requires much less computational effort than Chen et al.’s method. The relatively high computational complexity of Chen et al.’s method and Zhao et al.’s method is what motivates the clustering method proposed in this paper. Furthermore, from Example 21 we can see that the clustering results depend strongly on the threshold: since two DHFSs are grouped together only when their association coefficient reaches the confidence level, the larger the confidence level is, the more detailed the clustering will be.

#### 5. Conclusions

Dual hesitant fuzzy set, as an extension of fuzzy set, can describe the situation that people have hesitancy when they make a decision more objectively than other extensions of fuzzy set (interval-valued fuzzy set, intuitionistic fuzzy set, type-2 fuzzy set, and fuzzy multiset). In this paper, the correlation coefficients for DHFSs have been studied. Their properties have been discussed, and the differences and correlations among them have been investigated in detail. We have made the clustering analysis under dual hesitant fuzzy environments with one typical real world example. To further extend the application range of the present clustering algorithm, in particular for the case that needs to assign weights for different experts, it will be necessary to generalize the original definition of DHFSs.

DHFSs are a suitable technique for representing uncertain information that is widely encountered in daily life. Applications of our algorithm in fields such as data mining, information retrieval, and pattern recognition are possible directions for future research.

#### Acknowledgments

The authors are very grateful to the anonymous reviewers for their insightful and constructive comments and suggestions, which have led to an improved version of this paper. This work is supported by the National Natural Science Foundation of China (no. 70971136).