Special Issue: Fuzzy Methods for Data Analysis
Intuitionistic Fuzzy Possibilistic C Means Clustering Algorithms
Intuitionistic fuzzy sets (IFSs) provide a mathematical framework, based on fuzzy sets, for describing vagueness in data. They find interesting and promising applications in different domains. Here, we develop an intuitionistic fuzzy possibilistic C means (IFPCM) algorithm to cluster IFSs by hybridizing the concepts of FPCM, IFSs, and distance measures. IFPCM resolves inherent problems encountered with information regarding membership values of objects to each cluster by generalizing membership and nonmembership with a hesitancy degree. The algorithm is extended to cluster interval valued intuitionistic fuzzy sets (IVIFSs), leading to interval valued intuitionistic fuzzy possibilistic C means (IVIFPCM), in which the membership and nonmembership degrees are intervals. The algorithms provide the membership and typicality degrees of samples to all clusters. Experiments are performed on both real and simulated datasets. The algorithm generates valuable information and produces overlapped clusters with different membership degrees, taking into account the inherent uncertainty in the information captured by IFSs. Advantages of the algorithms include simplicity, flexibility, and low computational complexity. The algorithm is evaluated through cluster validity measures, and its clustering accuracy is investigated on classification datasets with labeled patterns. It maintains appreciable performance compared to other methods in terms of pureness ratio.
Clustering algorithms [1, 2] form an integral part of computational intelligence and pattern recognition research. Clustering analysis is commonly used as an important tool to classify a collection of objects into homogeneous groups, such that objects within a given group are similar to each other whereas objects in different groups are dissimilar. The concept is based on the notion of similarity, which is a basic component of intelligence and ubiquitous in scientific endeavor. Clustering finds numerous applications across a variety of disciplines such as taxonomy, image processing, information retrieval, data mining, pattern recognition, microbiology, archaeology, and geographical analysis. It is an exploratory tool for deducing the nature of data by providing labels to individual objects that describe how the data separate into groups. It has improved the performance of other systems by separating the problem domain into manageable subgroups. Researchers are often confronted with challenging datasets that are large and unlabeled. Many methods are available in exploratory data analysis [5, 6] by which researchers can elucidate these data.
Clustering an unlabeled dataset $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^p$ is the partitioning of $X$ into $c$ subgroups, $1 < c < n$, such that each subgroup represents natural substructure in $X$. This is done by assigning labels to the vectors in $X$ and hence to the objects generating $X$. A $c$-partition of $X$ is a set of $cn$ values $\{u_{ik}\}$ that can be conveniently represented as a $c \times n$ matrix $U = [u_{ik}]$. There are generally three sets of partition matrices [7, 8]:
$$M_{pcn} = \left\{U \in \mathbb{R}^{c \times n} : 0 \le u_{ik} \le 1\ \forall i, k;\ \forall k\ \exists i \text{ such that } u_{ik} > 0\right\}, \tag{1}$$
$$M_{fcn} = \left\{U \in M_{pcn} : \sum_{i=1}^{c} u_{ik} = 1\ \forall k;\ \sum_{k=1}^{n} u_{ik} > 0\ \forall i\right\}, \tag{2}$$
$$M_{hcn} = \left\{U \in M_{fcn} : u_{ik} \in \{0, 1\}\ \forall i, k\right\}. \tag{3}$$
The set in (1) has the property that for any $k$ there exists at least one index $i$ such that $u_{ik}$ is greater than 0. In (2), since the entries of every column $k$ sum to 1, at least one $u_{ik}$ in that column is necessarily greater than 0, so the condition in (1) holds automatically. The set in (3) is formed by the Boolean matrices in (2). Equations (1), (2), and (3) thus define the sets of possibilistic, fuzzy or probabilistic, and crisp $c$-partitions of $X$, respectively. Hence, there are four kinds of label vectors, but fuzzy and probabilistic label vectors are mathematically identical, having entries between 0 and 1 that sum to 1 over each column. The reason these matrices are called partitions follows from the interpretation of their entries. If $U$ is crisp or fuzzy, $u_{ik}$ is taken as the membership of $x_k$ in the $i$th partitioning fuzzy subset or cluster of $X$. If $U$ in $M_{fcn}$ is probabilistic, $u_{ik}$ is the posterior probability $p(i \mid x_k)$. If $U$ in $M_{pcn}$ is possibilistic, it has entries between 0 and 1 that do not necessarily sum to 1 over any column. In this case, $u_{ik}$ is taken as the possibility that $x_k$ belongs to class $i$. An alternate interpretation of possibility is that it measures the typicality of $x_k$ to cluster $i$. It is observed that $M_{hcn} \subset M_{fcn} \subset M_{pcn}$. A clustering algorithm $\mathcal{C}$ finds a $U$ which best explains and represents an unknown structure in $X$ with respect to the model that defines $\mathcal{C}$. For $c = 1$, $U$ in $M_{hcn}$ is represented uniquely by the hard 1-partition $\mathbf{1}_n = [1\ 1\ \cdots\ 1]$, which unequivocally assigns all $n$ objects to a single cluster, and $c = n$ is represented uniquely by $U = I_n$, the $n \times n$ identity matrix, up to a permutation of columns. In this case, each object is in its own singleton cluster. Choosing $c = 1$ or $c = n$ rejects the hypothesis that $X$ contains clusters.
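As an illustration of how the three partition sets nest, the following minimal Python sketch classifies a given matrix as crisp, fuzzy, or possibilistic. The function name `partition_kind`, the tolerance `tol`, and the string labels are hypothetical choices made for this example and are not part of the original formulation.

```python
from typing import List

Matrix = List[List[float]]  # c rows (clusters) x n columns (data points)


def partition_kind(U: Matrix, tol: float = 1e-9) -> str:
    """Return the most specific partition set that U belongs to.

    Labels (illustrative only): "crisp", "fuzzy", "possibilistic", "none".
    """
    c, n = len(U), len(U[0])

    # Possibilistic condition: entries in [0, 1], every column has a positive entry.
    entries_ok = all(-tol <= u <= 1.0 + tol for row in U for u in row)
    col_has_positive = all(any(U[i][k] > tol for i in range(c)) for k in range(n))
    if not (entries_ok and col_has_positive):
        return "none"

    # Fuzzy condition: every column sums to 1 and no cluster (row) is empty.
    cols_sum_to_one = all(
        abs(sum(U[i][k] for i in range(c)) - 1.0) <= tol for k in range(n)
    )
    rows_nonempty = all(sum(row) > tol for row in U)
    if not (cols_sum_to_one and rows_nonempty):
        return "possibilistic"

    # Crisp condition: every entry is 0 or 1.
    is_binary = all(min(abs(u), abs(u - 1.0)) <= tol for row in U for u in row)
    return "crisp" if is_binary else "fuzzy"
```

Because the checks are applied from the loosest set to the tightest, a crisp matrix passes all three tests, which mirrors the inclusion of crisp partitions in fuzzy partitions and of fuzzy partitions in possibilistic ones.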
In the last few decades, a variety of clustering techniques [3, 5, 6, 9, 10] have been developed to classify data. Clustering techniques are broadly divided into hierarchical and partition methods. Hierarchical clustering generates a hierarchical tree of clusters, called a dendrogram, and can be either divisive or agglomerative. The former is a top-down splitting technique which starts with all objects in one cluster and forms the hierarchy by iteratively dividing objects into smaller clusters until the desired number of clusters is achieved or each object constitutes its own cluster. The latter starts by considering each object as a cluster and then compares the clusters with one another using a distance measure. The clusters separated by the smallest distance are considered to constitute a single group and are merged. The merging procedure is repeated until the desired number of clusters is achieved or only one cluster containing all objects is left. A partition clustering method gives a single partition of the objects into $c$ clusters, with $c$ being the predefined number of clusters. One of the most widely used partition clustering algorithms is fuzzy C means (FCM). FCM is a combination of the k-means clustering algorithm and fuzzy logic [1, 7]. It works iteratively, with the desired number of clusters and the initial seeds predefined. The FCM algorithm assigns memberships to $x_k$ which are inversely related to the relative distance of $x_k$ to the point prototypes, that is, the cluster centers. For $c = 2$, if $x_k$ is equidistant from the two prototypes, the membership of $x_k$ in each cluster will be the same regardless of the absolute value of the distance of $x_k$ from the two centroids as well as from the other points. The problem this creates is that noise points which are far from, but equidistant to, the central structure of the two clusters are nonetheless given equal membership in both, when it seems far more natural that such points be given very low or no membership in either cluster.
This problem was overcome by Krishnapuram and Keller, who proposed possibilistic C means (PCM), which relaxes the column sum constraint in (2) so that the sum of each column satisfies the looser constraint $0 < \sum_{i=1}^{c} u_{ik} \le c$. In other words, each element of the $k$th column can be between 0 and 1, as long as at least one of them is positive. They suggested that the value $u_{ik}$ should be interpreted as the typicality of $x_k$ relative to cluster $i$. They interpreted each row of $U$ as a possibility distribution over $X$. The objective function of the PCM algorithm sometimes helps to identify outliers, that is, noise points. However, Barni et al. pointed out that PCM pays a price for its freedom to ignore noise points: it is very sensitive to initialization and sometimes generates coincident clusters. Moreover, typicality can be very sensitive to the choice of the additional parameters needed by the PCM algorithm. The coincident cluster problem of the PCM algorithm was avoided by two possibilistic fuzzy clustering algorithms proposed by Timm et al. [13–15]. They modified the PCM objective function by adding an inverse function of the distances between the cluster centers. This term acts as a repulsive force and avoids coincident clusters. In [13, 14], Timm et al. used the same concept to modify the objective function of the Gustafson and Kessel clustering algorithm. These algorithms exploit the benefits of both fuzzy and possibilistic clustering. Pal et al. justified the need for both possibility (that is, typicality) and membership values, and proposed a model and corresponding algorithm called fuzzy possibilistic C means (FPCM). This algorithm normalizes possibility values, so that the sum of possibilities of all data points in a cluster is 1. Although FPCM is much less prone to the errors encountered by FCM and PCM, the possibility values become very small as the size of the dataset increases.
The notion of intuitionistic fuzzy set (IFS), coined by Atanassov as a generalization of fuzzy sets, has interesting and useful applications in different domains such as logic programming, decision making problems, and medical diagnostics [23–26]. This generalization presents degrees of membership and nonmembership together with a degree of hesitancy. Thus, knowledge and semantic representation become more meaningful and applicable [27, 28]. Sometimes it is not appropriate to assume that the membership and nonmembership degrees of an object are exactly defined, but value ranges or value intervals can be assigned. In such cases, the IFS can be generalized to an interval valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful for describing and dealing with vague and uncertain data [28, 30]. With this motivation, it is desirable to develop practical approaches to clustering IFSs and IVIFSs. An intuitionistic fuzzy similarity matrix has been defined in earlier work, and from it an intuitionistic fuzzy equivalence matrix was developed. An approach was then given to transform intuitionistic fuzzy similarity matrices into intuitionistic fuzzy equivalence matrices, based on which a procedure for clustering intuitionistic fuzzy sets was proposed. Methods for calculating association coefficients of IFSs or IVIFSs, and a corresponding clustering algorithm, have also been introduced. That algorithm uses the derived association coefficients of IFSs or IVIFSs to construct an association matrix and then transforms it into an equivalent association matrix. An intuitionistic fuzzy hierarchical algorithm for clustering IFSs, based on the traditional hierarchical clustering procedure and an intuitionistic fuzzy aggregation operator, has been introduced as well. None of these algorithms can provide information about the membership degrees of objects to each cluster.
In this work, an intuitionistic fuzzy possibilistic C means (IFPCM) algorithm to cluster IFSs is developed. IFPCM is obtained by applying IFSs to FPCM, a known clustering method, and is based on basic distance measures between IFSs [34, 35]. At each stage of the algorithm the seeds are modified, and for each IFS the membership and typicality degrees to each of the clusters are estimated. The algorithm ends when all given IFSs are clustered according to the estimated membership and typicality degrees. It overcomes the inherent problems encountered with information regarding membership values of objects to each cluster by generalizing membership and nonmembership with a hesitancy degree. The algorithm is then extended to interval valued intuitionistic fuzzy possibilistic C means (IVIFPCM) for clustering IVIFSs. The algorithms are illustrated through experiments on different datasets. The algorithm is evaluated through cluster validity measures, and its clustering accuracy is determined on classification datasets with labeled patterns. The IFPCM algorithm is simple and flexible in nature and provides information about the membership and typicality degrees of samples to all clusters with low computational complexity.
This paper is organized as follows. In Section 2, the concepts of IFSs and IVIFSs are defined. The FPCM clustering algorithm is given in Section 3. Section 4 presents the IFPCM clustering algorithms for IFSs and IVIFSs, respectively. The experimental results on both real world and simulated datasets are illustrated in Section 5. Finally, in Section 6 conclusions are given.
2. Intuitionistic Fuzzy Sets and Interval Valued Intuitionistic Fuzzy Sets
In this section, we present some basic definitions associated with IFSs and IVIFSs.
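Definition 1. Considering $X$ as the universe of discourse, an IFS $A$ in $X$ is defined as
$$A = \{\langle x, \mu_A(x), \nu_A(x)\rangle \mid x \in X\}. \tag{4}$$
In (4), $\mu_A(x)$ and $\nu_A(x)$ are the membership and nonmembership degrees of $x$ in $A$, respectively, satisfying the following constraint:
$$0 \le \mu_A(x) + \nu_A(x) \le 1. \tag{5}$$
Equation (5) is subject to the condition that $\mu_A(x) \in [0, 1]$, $\nu_A(x) \in [0, 1]$ for all $x \in X$.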
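Definition 2. For each IFS $A$ in $X$, if $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$, then $\pi_A(x)$ is called the hesitation degree (or intuitionistic index) of $x$ to $A$. Obviously $\pi_A(x)$ is specified in the range $0 \le \pi_A(x) \le 1$; in particular, if $\pi_A(x) = 0$ for all $x \in X$, then the IFS $A$ is reduced to a fuzzy set. If $\mu_A(x)$ and $\nu_A(x)$ have 0 values such that $\pi_A(x) = 1$, then the IFS $A$ is completely intuitionistic.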
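A small worked sketch of the hesitation degree may help fix the definition. The function name `hesitation` and the validation it performs are illustrative assumptions for this example; the formula itself is the standard one, hesitation equals one minus membership minus nonmembership.

```python
def hesitation(mu: float, nu: float) -> float:
    """Hesitation degree of one element, given its membership and nonmembership."""
    # An intuitionistic fuzzy pair needs both degrees in [0, 1] and a sum of at most 1.
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
        raise ValueError("need 0 <= mu, nu <= 1 and mu + nu <= 1")
    # Clamp at zero so rounding error never produces a tiny negative value.
    return max(0.0, 1.0 - mu - nu)
```

For instance, a pair with membership 0.5 and nonmembership 0.3 leaves a hesitation of about 0.2, a pair with membership 0.6 and nonmembership 0.4 behaves like an ordinary fuzzy set (no hesitation), and the pair (0, 0) is completely intuitionistic.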
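Considering the fact that the elements $x_l$, $l = 1, 2, \ldots, n$, in the universe $X$ have different importance, let us assume that $w = (w_1, w_2, \ldots, w_n)$ is the weight vector of $x_l$, $l = 1, 2, \ldots, n$, with
$$w_l \ge 0, \quad \sum_{l=1}^{n} w_l = 1. \tag{6}$$
Xu defined the following weighted Euclidean distance between IFSs $A$ and $B$:
$$d_1(A, B) = \sqrt{\frac{1}{2}\sum_{l=1}^{n} w_l\left[\left(\mu_A(x_l) - \mu_B(x_l)\right)^2 + \left(\nu_A(x_l) - \nu_B(x_l)\right)^2 + \left(\pi_A(x_l) - \pi_B(x_l)\right)^2\right]}. \tag{7}$$
In particular, if $w = (1/n, 1/n, \ldots, 1/n)$, then (7) is reduced to the normalized Euclidean distance, which is defined as follows:
$$d_1(A, B) = \sqrt{\frac{1}{2n}\sum_{l=1}^{n}\left[\left(\mu_A(x_l) - \mu_B(x_l)\right)^2 + \left(\nu_A(x_l) - \nu_B(x_l)\right)^2 + \left(\pi_A(x_l) - \pi_B(x_l)\right)^2\right]}. \tag{8}$$
Atanassov and Gargov pointed out that sometimes it is not appropriate to assume that the membership and nonmembership degrees of an element are exactly defined, but value ranges or value intervals can be given. In this context, they extended the IFS and introduced the concept of IVIFS, which is characterized by a membership degree and a nonmembership degree whose values are intervals rather than exact numbers.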
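The weighted Euclidean distance between two IFSs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each IFS is stored as a list of (membership, nonmembership) pairs, the hesitation degree is derived from them, and the distance uses the commonly cited form with a factor of one half over the squared differences of all three degrees. The names `IFS` and `ifs_distance` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

IFS = List[Tuple[float, float]]  # one (membership, nonmembership) pair per element


def ifs_distance(a: IFS, b: IFS, w: Optional[Sequence[float]] = None) -> float:
    """Weighted Euclidean distance between two IFSs over the same universe.

    With w omitted, equal weights 1/n are used (the normalized distance).
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("both IFSs must have the same number of elements")
    if w is None:
        w = [1.0 / n] * n
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b), w_l in zip(a, b, w):
        # Hesitation degrees follow from membership and nonmembership.
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        total += w_l * ((mu_a - mu_b) ** 2 + (nu_a - nu_b) ** 2 + (pi_a - pi_b) ** 2)
    return math.sqrt(0.5 * total)
```

With this scaling the distance between full membership and full nonmembership on a single element is exactly 1, and the distance of any IFS to itself is 0.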
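Definition 3. An IVIFS $\tilde{A}$ over $X$ is an object having the following form:
$$\tilde{A} = \{\langle x_l, \tilde{\mu}_{\tilde{A}}(x_l), \tilde{\nu}_{\tilde{A}}(x_l)\rangle \mid x_l \in X\}. \tag{9}$$
Here, $\tilde{\mu}_{\tilde{A}}(x_l) = [\mu^L_{\tilde{A}}(x_l), \mu^U_{\tilde{A}}(x_l)] \subseteq [0, 1]$ and $\tilde{\nu}_{\tilde{A}}(x_l) = [\nu^L_{\tilde{A}}(x_l), \nu^U_{\tilde{A}}(x_l)] \subseteq [0, 1]$ are intervals with $\mu^U_{\tilde{A}}(x_l) + \nu^U_{\tilde{A}}(x_l) \le 1$ for every $x_l \in X$, and the hesitancy interval is $\tilde{\pi}_{\tilde{A}}(x_l) = [\pi^L_{\tilde{A}}(x_l), \pi^U_{\tilde{A}}(x_l)]$, where $\pi^L_{\tilde{A}}(x_l) = 1 - \mu^U_{\tilde{A}}(x_l) - \nu^U_{\tilde{A}}(x_l)$, $\pi^U_{\tilde{A}}(x_l) = 1 - \mu^L_{\tilde{A}}(x_l) - \nu^L_{\tilde{A}}(x_l)$. In particular, if $\mu^L_{\tilde{A}}(x_l) = \mu^U_{\tilde{A}}(x_l)$ and $\nu^L_{\tilde{A}}(x_l) = \nu^U_{\tilde{A}}(x_l)$, then $\tilde{A}$ is reduced to an IFS.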
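Now we extend the weighted Euclidean distance measure given in (7) to IVIFS theory:
$$d_2(\tilde{A}, \tilde{B}) = \sqrt{\frac{1}{4}\sum_{l=1}^{n} w_l\left[\left(\mu^L_{\tilde{A}} - \mu^L_{\tilde{B}}\right)^2 + \left(\mu^U_{\tilde{A}} - \mu^U_{\tilde{B}}\right)^2 + \left(\nu^L_{\tilde{A}} - \nu^L_{\tilde{B}}\right)^2 + \left(\nu^U_{\tilde{A}} - \nu^U_{\tilde{B}}\right)^2 + \left(\pi^L_{\tilde{A}} - \pi^L_{\tilde{B}}\right)^2 + \left(\pi^U_{\tilde{A}} - \pi^U_{\tilde{B}}\right)^2\right]}, \tag{10}$$
where every degree is evaluated at $x_l$. Particularly, if $w = (1/n, 1/n, \ldots, 1/n)$, then (10) is reduced to the normalized Euclidean distance, which is given as follows:
$$d_2(\tilde{A}, \tilde{B}) = \sqrt{\frac{1}{4n}\sum_{l=1}^{n}\left[\left(\mu^L_{\tilde{A}} - \mu^L_{\tilde{B}}\right)^2 + \left(\mu^U_{\tilde{A}} - \mu^U_{\tilde{B}}\right)^2 + \left(\nu^L_{\tilde{A}} - \nu^L_{\tilde{B}}\right)^2 + \left(\nu^U_{\tilde{A}} - \nu^U_{\tilde{B}}\right)^2 + \left(\pi^L_{\tilde{A}} - \pi^L_{\tilde{B}}\right)^2 + \left(\pi^U_{\tilde{A}} - \pi^U_{\tilde{B}}\right)^2\right]}. \tag{11}$$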
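The interval valued extension of the distance can be sketched in the same way. The code below is an illustration only: it assumes that each element of an IVIFS is stored as the four interval endpoints (lower and upper membership, lower and upper nonmembership), that the hesitancy interval is derived from them, and that the distance averages six squared endpoint differences with a factor of one quarter. The names `IVIFS` and `ivifs_distance` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

# One element: (mu_low, mu_up, nu_low, nu_up)
IVIFS = List[Tuple[float, float, float, float]]


def ivifs_distance(a: IVIFS, b: IVIFS, w: Optional[Sequence[float]] = None) -> float:
    """Weighted Euclidean distance between two IVIFSs over the same universe."""
    n = len(a)
    if len(b) != n:
        raise ValueError("both IVIFSs must have the same number of elements")
    if w is None:
        w = [1.0 / n] * n  # equal weights give the normalized distance
    total = 0.0
    for ea, eb, w_l in zip(a, b, w):
        # Hesitancy interval: lower end uses the upper degrees, and vice versa.
        pa = (1.0 - ea[1] - ea[3], 1.0 - ea[0] - ea[2])
        pb = (1.0 - eb[1] - eb[3], 1.0 - eb[0] - eb[2])
        sq = sum((x - y) ** 2 for x, y in zip(ea, eb))
        sq += sum((x - y) ** 2 for x, y in zip(pa, pb))
        total += w_l * sq
    return math.sqrt(0.25 * total)
```

When every interval collapses to a single point, the six squared terms are the three IFS terms counted twice, so the factor of one quarter reproduces the IFS distance with its factor of one half.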
3. Fuzzy Possibilistic C Means Clustering Algorithm
This section describes the FPCM clustering algorithm proposed by Pal et al. in 1997 to exploit the benefits of fuzzy and possibilistic modeling while circumventing their weaknesses. To correctly interpret the data substructure, FPCM clustering uses both memberships (relative typicality) and possibilities (absolute typicality). When we want to crisply label a data point, membership is a plausible choice, as it is natural to assign a point to the cluster whose prototype is closest to the point. On the other hand, while estimating the centroids, typicality is an important means of alleviating the undesirable effects of outliers. Here, the number of clusters is fixed a priori to a default value chosen according to the dataset used in the application, so that it is completely data driven. Generally it is advisable to avoid trivial clusters, which may be either too large or too small.
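FPCM extends the FCM clustering algorithm by normalizing possibility values so that the sum of possibilities of all data points in a cluster is 1. Although FPCM is much less prone to the problems of both FCM and PCM, the possibility values become very small as the size of the dataset increases. Analogous to the FCM clustering algorithm, the membership term in FPCM is a function of the data point and all centroids. The typicality term in FPCM is a function of the data point and the cluster prototype alone. That is, the membership term is influenced by the positions of all cluster centers whereas the typicality term is affected by only one. Incorporating the abovementioned facets, the FPCM model is defined by the following optimization problem:
$$\min_{(U, T, V)} \left\{ J_{m,\eta}(U, T, V; X) = \sum_{i=1}^{c}\sum_{k=1}^{n}\left(u_{ik}^{m} + t_{ik}^{\eta}\right)\|x_k - v_i\|^2 \right\} \tag{12}$$
subject to the constraints $m > 1$, $\eta > 1$, $0 \le u_{ik}, t_{ik} \le 1$, $\sum_{i=1}^{c} u_{ik} = 1\ \forall k$, and $\sum_{k=1}^{n} t_{ik} = 1\ \forall i$. The transpose of an admissible $T$ is a member of the set $M_{fnc}$; that is, $T^{\mathsf{T}}$ satisfies the fuzzy partition constraints in (2) with the roles of clusters and data points exchanged. $T$ is viewed as a typicality assignment of the $n$ objects to the $c$ clusters. The possibilistic term $\sum_{i=1}^{c}\sum_{k=1}^{n} t_{ik}^{\eta}\|x_k - v_i\|^2$ distributes the $t_{ik}$ with respect to all $n$ data points, but not with respect to all $c$ clusters. Under the usual conditions placed on c-means optimization problems, the first order necessary conditions for extrema of $J_{m,\eta}$ are stated in terms of the following theorem.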
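Theorem FPCM. If $D_{ik} = \|x_k - v_i\| > 0$ for all $i$ and $k$, $m, \eta > 1$, and $X$ contains at least $c$ distinct data points, then $(U, T^{\mathsf{T}}, V) \in M_{fcn} \times M_{fnc} \times \mathbb{R}^{cp}$ may minimize $J_{m,\eta}$ only if
$$u_{ik} = \left(\sum_{j=1}^{c}\left(\frac{D_{ik}}{D_{jk}}\right)^{2/(m-1)}\right)^{-1}, \quad 1 \le i \le c,\ 1 \le k \le n,$$
$$t_{ik} = \left(\sum_{j=1}^{n}\left(\frac{D_{ik}}{D_{ij}}\right)^{2/(\eta-1)}\right)^{-1}, \quad 1 \le i \le c,\ 1 \le k \le n,$$
$$v_i = \frac{\sum_{k=1}^{n}\left(u_{ik}^{m} + t_{ik}^{\eta}\right)x_k}{\sum_{k=1}^{n}\left(u_{ik}^{m} + t_{ik}^{\eta}\right)}, \quad 1 \le i \le c.$$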
The proof of the above theorem follows from the original FPCM work. FPCM has the same type of singularity as FCM. FPCM does not suffer from the sensitivity problem that PCM seems to exhibit. Unfortunately, when the number of data points is large, the typicality values will be very small. Thus, after the FPCM-AO algorithm for approximating solutions to (12) based on iteration through (17) terminates, the typicality values may need to be scaled up. Conceptually, this is no different from scaling typicality as is done in PCM. While scaling seems to solve the small value problem, which is caused by the row sum constraint on $T$, the scaled values do not possess any additional information about points in the data. Thus scaling is an artificial fix for a mathematical drawback of FPCM.
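The alternating optimization described above can be illustrated by a single update pass. The following Python sketch assumes the standard FPCM necessary conditions with the plain Euclidean norm: memberships are normalized over clusters, typicalities are normalized over data points, and each prototype is the mean of the data weighted by the sum of the powered membership and typicality. The function names and the default exponents `m = 2` and `eta = 2` are choices made for this example.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(x: Vector, v: Vector) -> float:
    """Squared Euclidean distance between a data point and a prototype."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def fpcm_step(
    X: List[Vector], V: List[Vector], m: float = 2.0, eta: float = 2.0
) -> Tuple[List[List[float]], List[List[float]], List[Vector]]:
    """One alternating-optimization pass: returns (U, T, updated prototypes)."""
    c, n, p = len(V), len(X), len(X[0])
    D2 = [[sq_dist(X[k], V[i]) for k in range(n)] for i in range(c)]
    if any(d == 0.0 for row in D2 for d in row):
        # The update formulas are singular when a point coincides with a prototype.
        raise ValueError("a data point coincides with a prototype")

    # Memberships: each column (data point) sums to 1 over the clusters.
    U = [[1.0 / sum((D2[i][k] / D2[j][k]) ** (1.0 / (m - 1.0)) for j in range(c))
          for k in range(n)] for i in range(c)]
    # Typicalities: each row (cluster) sums to 1 over the data points.
    T = [[1.0 / sum((D2[i][k] / D2[i][j]) ** (1.0 / (eta - 1.0)) for j in range(n))
          for k in range(n)] for i in range(c)]

    new_V = []
    for i in range(c):
        g = [U[i][k] ** m + T[i][k] ** eta for k in range(n)]
        total = sum(g)
        new_V.append([sum(g[k] * X[k][d] for k in range(n)) / total for d in range(p)])
    return U, T, new_V
```

Because each row of the typicality matrix must sum to 1 over all data points, the individual typicalities shrink as the number of points grows, which is the small value problem noted above.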
4. Intuitionistic Fuzzy Possibilistic C Means Clustering Algorithms
In this section, we discuss intuitionistic fuzzy possibilistic C means clustering algorithms for IFSs and IVIFSs, respectively.
4.1. Intuitionistic Fuzzy Possibilistic C Means Algorithm for IFSs
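We develop the intuitionistic fuzzy possibilistic C means (IFPCM) model and corresponding algorithm for IFSs. We take the basic distance measure $d_1$ in (7) as the proximity function of IFPCM; the objective function of the IFPCM model can then be defined as follows:
$$\min J_{m,\eta}(U, T, V) = \sum_{i=1}^{c}\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)d_1^2(A_j, V_i), \tag{18}$$
subject to $\sum_{i=1}^{c} u_{ij} = 1$ for $1 \le j \le p$, $\sum_{j=1}^{p} t_{ij} = 1$ for $1 \le i \le c$, and $u_{ij} \ge 0$, $t_{ij} \ge 0$. Here $A = \{A_1, A_2, \ldots, A_p\}$ are $p$ IFSs each with $n$ elements, $c$ is the number of clusters $(1 \le c \le p)$, and $V = \{V_1, V_2, \ldots, V_c\}$ are the prototypical IFSs, that is, the centroids of the clusters. The parameter $m$ is the fuzzy factor $(m > 1)$, $u_{ij}$ is the membership degree of the $j$th sample $A_j$ to the $i$th cluster, $U = (u_{ij})_{c \times p}$ is a matrix of order $c \times p$, the parameter $\eta$ is the typicality factor $(\eta > 1)$, $t_{ij}$ is the typicality of the $j$th sample $A_j$ to the $i$th cluster, and $T = (t_{ij})_{c \times p}$ is the typicality matrix.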
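To solve the optimization problem stated in (18), we make use of the Lagrange multiplier method, which is discussed below. Considering
$$L = \sum_{i=1}^{c}\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)d_1^2(A_j, V_i) - \sum_{j=1}^{p}\lambda_j\left(\sum_{i=1}^{c} u_{ij} - 1\right) - \sum_{i=1}^{c}\gamma_i\left(\sum_{j=1}^{p} t_{ij} - 1\right), \tag{19}$$
where
$$d_1^2(A_j, V_i) = \frac{1}{2}\sum_{l=1}^{n} w_l\left[\left(\mu_{A_j}(x_l) - \mu_{V_i}(x_l)\right)^2 + \left(\nu_{A_j}(x_l) - \nu_{V_i}(x_l)\right)^2 + \left(\pi_{A_j}(x_l) - \pi_{V_i}(x_l)\right)^2\right]. \tag{20}$$
Furthermore, for all $i$, $j$ with $1 \le i \le c$, $1 \le j \le p$, let
$$\frac{\partial L}{\partial u_{ij}} = 0, \quad \frac{\partial L}{\partial \lambda_j} = 0, \quad \frac{\partial L}{\partial t_{ij}} = 0, \quad \frac{\partial L}{\partial \gamma_i} = 0. \tag{21}$$
From the above system of equations, we have the following expressions:
$$u_{ij} = \frac{1}{\sum_{r=1}^{c}\left(d_1(A_j, V_i)/d_1(A_j, V_r)\right)^{2/(m-1)}}, \quad 1 \le i \le c,\ 1 \le j \le p, \tag{22}$$
$$t_{ij} = \frac{1}{\sum_{s=1}^{p}\left(d_1(A_j, V_i)/d_1(A_s, V_i)\right)^{2/(\eta-1)}}, \quad 1 \le i \le c,\ 1 \le j \le p. \tag{23}$$
Now we proceed to compute $V_i$, $1 \le i \le c$, the prototypical IFSs. Let us assume that
$$\frac{\partial L}{\partial \mu_{V_i}(x_l)} = \frac{\partial L}{\partial \nu_{V_i}(x_l)} = \frac{\partial L}{\partial \pi_{V_i}(x_l)} = 0, \quad 1 \le i \le c,\ 1 \le l \le n. \tag{24}$$
From the above expression we have
$$\mu_{V_i}(x_l) = \frac{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)\mu_{A_j}(x_l)}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}, \tag{25}$$
$$\nu_{V_i}(x_l) = \frac{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)\nu_{A_j}(x_l)}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}, \tag{26}$$
$$\pi_{V_i}(x_l) = \frac{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)\pi_{A_j}(x_l)}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}. \tag{27}$$
For simplicity, we define the weighted average operator for IFSs as follows.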
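Let $A = \{A_1, A_2, \ldots, A_p\}$ be a set of $p$ IFSs each with $n$ elements; let $\omega = (\omega_1, \omega_2, \ldots, \omega_p)$ be a set of weights for the IFSs, respectively, with $\sum_{j=1}^{p}\omega_j = 1$; then the weighted average operator $f$ is defined as
$$f(A, \omega) = \left\{\left\langle x_l, \sum_{j=1}^{p}\omega_j\mu_{A_j}(x_l), \sum_{j=1}^{p}\omega_j\nu_{A_j}(x_l)\right\rangle \,\middle|\, 1 \le l \le n\right\}. \tag{28}$$
According to (25) to (28), if we assume
$$\omega^{(i)} = \left\{\frac{u_{i1}^{m} + t_{i1}^{\eta}}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}, \frac{u_{i2}^{m} + t_{i2}^{\eta}}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}, \ldots, \frac{u_{ip}^{m} + t_{ip}^{\eta}}{\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)}\right\}, \quad 1 \le i \le c, \tag{29}$$
the prototypical IFSs of the IFPCM model can be computed as follows:
$$V_i = f\left(A, \omega^{(i)}\right) = \left\{\left\langle x_l, \sum_{j=1}^{p}\omega^{(i)}_j\mu_{A_j}(x_l), \sum_{j=1}^{p}\omega^{(i)}_j\nu_{A_j}(x_l)\right\rangle \,\middle|\, 1 \le l \le n\right\}, \quad 1 \le i \le c. \tag{30}$$
Since the above equations (22), (23), and (30) are computationally interdependent, we exploit an iterative procedure similar to the FPCM algorithm to solve these equations. The steps of the algorithm are as follows.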
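A minimal sketch of the weighted average operator for IFSs is given below. It assumes that each IFS is stored as a list of (membership, nonmembership) pairs and that the operator averages memberships and nonmemberships element by element; the name `weighted_average` and the input checks are illustrative.

```python
from typing import List, Sequence, Tuple

IFS = List[Tuple[float, float]]  # one (membership, nonmembership) pair per element


def weighted_average(ifss: List[IFS], omega: Sequence[float]) -> IFS:
    """Element-wise weighted average of several IFSs; the weights must sum to 1."""
    if len(ifss) != len(omega):
        raise ValueError("one weight per IFS is required")
    if abs(sum(omega) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(ifss[0])
    return [
        (
            sum(w * a[l][0] for w, a in zip(omega, ifss)),  # averaged membership
            sum(w * a[l][1] for w, a in zip(omega, ifss)),  # averaged nonmembership
        )
        for l in range(n)
    ]
```

Since the weights are nonnegative and sum to 1, the result is a convex combination of valid pairs, so its membership and nonmembership still sum to at most 1 and the prototype remains an IFS.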
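Step 1. Initialize the seed values $V(0)$; let $k = 0$ and set $\varepsilon > 0$.
Step 2(i). Calculate $U(k) = (u_{ij}(k))_{c \times p}$, where (a) if $d_1(A_j, V_r(k)) > 0$ for all $j$, $r$, then $u_{ij}(k) = 1\big/\sum_{r=1}^{c}\left(d_1(A_j, V_i(k))/d_1(A_j, V_r(k))\right)^{2/(m-1)}$; $1 \le i \le c$, $1 \le j \le p$, (b) if there exist $j$, $r$ such that $d_1(A_j, V_r(k)) = 0$, then let $u_{rj}(k) = 1$ and $u_{ij}(k) = 0$ for all $i \ne r$.
Step 2(ii). Calculate $T(k) = (t_{ij}(k))_{c \times p}$, where (a) if $d_1(A_j, V_r(k)) > 0$ for all $j$, $r$, then $t_{ij}(k) = 1\big/\sum_{s=1}^{p}\left(d_1(A_j, V_i(k))/d_1(A_s, V_i(k))\right)^{2/(\eta-1)}$; $1 \le i \le c$, $1 \le j \le p$, (b) if there exist $j$, $r$ such that $d_1(A_j, V_r(k)) = 0$, then let $t_{rj}(k) = 1$ and $t_{rs}(k) = 0$ for all $s \ne j$.
Step 3. Calculate $V(k+1) = \{V_1(k+1), V_2(k+1), \ldots, V_c(k+1)\}$, where $V_i(k+1) = f\left(A, \omega^{(i)}(k+1)\right)$ with weights $\omega^{(i)}_j(k+1) = \left(u_{ij}^{m}(k) + t_{ij}^{\eta}(k)\right)\big/\sum_{s=1}^{p}\left(u_{is}^{m}(k) + t_{is}^{\eta}(k)\right)$, $1 \le j \le p$, $1 \le i \le c$.
Step 4. If $\sum_{i=1}^{c} d_1(V_i(k), V_i(k+1))/c < \varepsilon$, then go to Step 5; otherwise, let $k = k + 1$, and return to Step 2.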
Step 5. End
The pseudocode of the IFPCM algorithm is given in Algorithm 1.
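The iterative procedure in Steps 1 to 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the distance is the weighted Euclidean distance over membership, nonmembership, and hesitation with a factor of one half; zero distances are floored at a tiny constant instead of being handled by the special-case branch (b); and the names `ifpcm`, `dist`, `eps`, and `max_iter` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

IFS = List[Tuple[float, float]]  # one (membership, nonmembership) pair per element


def dist(a: IFS, b: IFS, w: Sequence[float]) -> float:
    """Weighted Euclidean distance between two IFSs (hesitation derived)."""
    s = 0.0
    for (ma, na), (mb, nb), wl in zip(a, b, w):
        s += wl * ((ma - mb) ** 2 + (na - nb) ** 2 + ((1 - ma - na) - (1 - mb - nb)) ** 2)
    return math.sqrt(0.5 * s)


def ifpcm(data: List[IFS], seeds: List[IFS], w: Sequence[float],
          m: float = 2.0, eta: float = 2.0, eps: float = 1e-6, max_iter: int = 100):
    """Return (U, T, V): memberships, typicalities, and prototypical IFSs."""
    c, p, n = len(seeds), len(data), len(w)
    V = [list(v) for v in seeds]                      # Step 1: initial seeds
    for _ in range(max_iter):
        # Assumption: floor distances at 1e-12 rather than branching on zeros.
        D = [[max(dist(data[j], V[i], w), 1e-12) for j in range(p)] for i in range(c)]
        # Step 2(i): memberships, normalized over clusters for each sample.
        U = [[1.0 / sum((D[i][j] / D[r][j]) ** (2.0 / (m - 1.0)) for r in range(c))
              for j in range(p)] for i in range(c)]
        # Step 2(ii): typicalities, normalized over samples for each cluster.
        T = [[1.0 / sum((D[i][j] / D[i][s]) ** (2.0 / (eta - 1.0)) for s in range(p))
              for j in range(p)] for i in range(c)]
        # Step 3: prototypes as weighted averages of the samples.
        new_V = []
        for i in range(c):
            g = [U[i][j] ** m + T[i][j] ** eta for j in range(p)]
            total = sum(g)
            new_V.append([(sum(g[j] * data[j][l][0] for j in range(p)) / total,
                           sum(g[j] * data[j][l][1] for j in range(p)) / total)
                          for l in range(n)])
        # Step 4: stop when the average prototype shift falls below eps.
        shift = sum(dist(V[i], new_V[i], w) for i in range(c)) / c
        V = new_V
        if shift < eps:
            break
    return U, T, V
```

Note that with exponents `m` or `eta` very close to 1 the powers can overflow; the defaults of 2 avoid this. The returned memberships and typicalities are those computed from the prototypes of the last completed pass, matching the order of Steps 2 and 3.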
4.2. Interval Valued Intuitionistic Fuzzy Possibilistic C Means Algorithm for IVIFSs
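If the collected data are expressed as IVIFSs, then we extend IFPCM to the interval valued intuitionistic fuzzy possibilistic C means (IVIFPCM) model. We take the basic distance measure $d_2$ in (10) as the proximity function of IVIFPCM. The objective function of the IVIFPCM model can be defined as follows:
$$\min J_{m,\eta}(U, T, \tilde{V}) = \sum_{i=1}^{c}\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)d_2^2(\tilde{A}_j, \tilde{V}_i),$$
subject to $\sum_{i=1}^{c} u_{ij} = 1$ for $1 \le j \le p$, $\sum_{j=1}^{p} t_{ij} = 1$ for $1 \le i \le c$, and $u_{ij} \ge 0$, $t_{ij} \ge 0$. Here $\tilde{A} = \{\tilde{A}_1, \tilde{A}_2, \ldots, \tilde{A}_p\}$ are $p$ IVIFSs each with $n$ elements, $c$ is the number of clusters $(1 \le c \le p)$, and $\tilde{V} = \{\tilde{V}_1, \tilde{V}_2, \ldots, \tilde{V}_c\}$ are the prototypical IVIFSs, that is, the centroids of the clusters. The parameter $m$ is the fuzzy factor $(m > 1)$, $u_{ij}$ is the membership degree of the $j$th sample $\tilde{A}_j$ to the $i$th cluster, $U = (u_{ij})_{c \times p}$ is a matrix of order $c \times p$, the parameter $\eta$ is the typicality factor $(\eta > 1)$, $t_{ij}$ is the typicality of the $j$th sample $\tilde{A}_j$ to the $i$th cluster, and $T = (t_{ij})_{c \times p}$ is the typicality matrix.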
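To solve the optimization problem stated in (30) to (35), we make use of the Lagrange multiplier method, which is discussed below. Considering
$$L = \sum_{i=1}^{c}\sum_{j=1}^{p}\left(u_{ij}^{m} + t_{ij}^{\eta}\right)d_2^2(\tilde{A}_j, \tilde{V}_i) - \sum_{j=1}^{p}\lambda_j\left(\sum_{i=1}^{c} u_{ij} - 1\right) - \sum_{i=1}^{c}\gamma_i\left(\sum_{j=1}^{p} t_{ij} - 1\right),$$
where $d_2^2(\tilde{A}_j, \tilde{V}_i)$ is the square of the weighted Euclidean distance in (10), that is, one quarter of the weighted sum over $x_l$ of the squared differences of the six interval endpoints $\mu^L$, $\mu^U$, $\nu^L$, $\nu^U$, $\pi^L$, $\pi^U$ of $\tilde{A}_j$ and $\tilde{V}_i$. Similar to the IFPCM model, we establish the system of partial differential functions of $L$ as follows:
$$\frac{\partial L}{\partial u_{ij}} = 0, \quad \frac{\partial L}{\partial \lambda_j} = 0, \quad \frac{\partial L}{\partial t_{ij}} = 0, \quad \frac{\partial L}{\partial \gamma_i} = 0, \quad \frac{\partial L}{\partial \mu^L_{\tilde{V}_i}(x_l)} = \frac{\partial L}{\partial \mu^U_{\tilde{V}_i}(x_l)} = \frac{\partial L}{\partial \nu^L_{\tilde{V}_i}(x_l)} = \frac{\partial L}{\partial \nu^U_{\tilde{V}_i}(x_l)} = 0.$$
The solution for the above system of equations is
$$u_{ij} = \frac{1}{\sum_{r=1}^{c}\left(d_2(\tilde{A}_j, \tilde{V}_i)/d_2(\tilde{A}_j, \tilde{V}_r)\right)^{2/(m-1)}}, \quad t_{ij} = \frac{1}{\sum_{s=1}^{p}\left(d_2(\tilde{A}_j, \tilde{V}_i)/d_2(\tilde{A}_s, \tilde{V}_i)\right)^{2/(\eta-1)}},$$
$$\tilde{V}_i = f\left(\tilde{A}, \omega^{(i)}\right) = \left\{\left\langle x_l, \left[\sum_{j=1}^{p}\omega^{(i)}_j\mu^L_{\tilde{A}_j}(x_l), \sum_{j=1}^{p}\omega^{(i)}_j\mu^U_{\tilde{A}_j}(x_l)\right], \left[\sum_{j=1}^{p}\omega^{(i)}_j\nu^L_{\tilde{A}_j}(x_l), \sum_{j=1}^{p}\omega^{(i)}_j\nu^U_{\tilde{A}_j}(x_l)\right]\right\rangle \,\middle|\, 1 \le l \le n\right\},$$
for $1 \le i \le c$, $1 \le j \le p$, where
$$\omega^{(i)}_j = \frac{u_{ij}^{m} + t_{ij}^{\eta}}{\sum_{s=1}^{p}\left(u_{is}^{m} + t_{is}^{\eta}\right)}.$$
Because (41) and (42) are computationally interdependent, we exploit a similar iteration procedure as follows.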
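Step 1. Initialize the seed values $\tilde{V}(0)$; let $k = 0$ and set $\varepsilon > 0$.
Step 2(i). Calculate $U(k) = (u_{ij}(k))_{c \times p}$, where (a) if $d_2(\tilde{A}_j, \tilde{V}_r(k)) > 0$ for all $j$, $r$, then $u_{ij}(k) = 1\big/\sum_{r=1}^{c}\left(d_2(\tilde{A}_j, \tilde{V}_i(k))/d_2(\tilde{A}_j, \tilde{V}_r(k))\right)^{2/(m-1)}$; $1 \le i \le c$, $1 \le j \le p$, (b) if there exist $j$, $r$ such that $d_2(\tilde{A}_j, \tilde{V}_r(k)) = 0$, then let $u_{rj}(k) = 1$ and $u_{ij}(k) = 0$ for all $i \ne r$.
Step 2(ii). Calculate $T(k) = (t_{ij}(k))_{c \times p}$, where (a) if $d_2(\tilde{A}_j, \tilde{V}_r(k)) > 0$ for all $j$, $r$, then $t_{ij}(k) = 1\big/\sum_{s=1}^{p}\left(d_2(\tilde{A}_j, \tilde{V}_i(k))/d_2(\tilde{A}_s, \tilde{V}_i(k))\right)^{2/(\eta-1)}$; $1 \le i \le c$, $1 \le j \le p$, (b) if there exist $j$, $r$ such that $d_2(\tilde{A}_j, \tilde{V}_r(k)) = 0$, then let $t_{rj}(k) = 1$ and $t_{rs}(k) = 0$ for all $s \ne j$.
Step 3. Calculate $\tilde{V}(k+1) = \{\tilde{V}_1(k+1), \tilde{V}_2(k+1), \ldots, \tilde{V}_c(k+1)\}$, where $\tilde{V}_i(k+1) = f\left(\tilde{A}, \omega^{(i)}(k+1)\right)$ with weights $\omega^{(i)}_j(k+1) = \left(u_{ij}^{m}(k) + t_{ij}^{\eta}(k)\right)\big/\sum_{s=1}^{p}\left(u_{is}^{m}(k) + t_{is}^{\eta}(k)\right)$, $1 \le j \le p$, $1 \le i \le c$.
Step 4. If $\sum_{i=1}^{c} d_2(\tilde{V}_i(k), \tilde{V}_i(k+1))/c < \varepsilon$, then go to Step 5; otherwise, let $k = k + 1$, and return to Step 2.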
Step 5. End
The pseudocode of the IVIFPCM algorithm is given in Algorithm 2.
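The step that differs most from the IFS case is the prototype update, which must now average interval endpoints. The sketch below is a hedged illustration: it assumes that each IVIFS element is stored as the four endpoints (lower and upper membership, lower and upper nonmembership), that the prototype is the endpoint-wise weighted average of the samples, and that the weights combine powered membership and typicality. The names `prototype_weights` and `ivifs_weighted_average` are hypothetical.

```python
from typing import List, Sequence, Tuple

# One element: (mu_low, mu_up, nu_low, nu_up)
IVIFS = List[Tuple[float, ...]]


def prototype_weights(u_row: Sequence[float], t_row: Sequence[float],
                      m: float = 2.0, eta: float = 2.0) -> List[float]:
    """Weights of the samples for one cluster, normalized to sum to 1."""
    g = [u ** m + t ** eta for u, t in zip(u_row, t_row)]
    total = sum(g)
    return [x / total for x in g]


def ivifs_weighted_average(sets: List[IVIFS], omega: Sequence[float]) -> IVIFS:
    """Endpoint-wise weighted average of several IVIFSs."""
    if len(sets) != len(omega):
        raise ValueError("one weight per IVIFS is required")
    if abs(sum(omega) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(sets[0])
    # Average each of the four endpoints separately for every element.
    return [
        tuple(sum(w * s[l][q] for w, s in zip(omega, sets)) for q in range(4))
        for l in range(n)
    ]
```

Averaging endpoints with nonnegative weights preserves the ordering of lower and upper bounds and keeps the sum of the upper membership and upper nonmembership at most 1, so the result is again a valid IVIFS.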
5. Experimental Results
In this section, we present the results of experiments performed on both real world and simulated datasets in order to demonstrate the effectiveness of the IFPCM clustering algorithm. The IFPCM algorithm is implemented in MATLAB. We first explain the steps of the algorithm using some experimental data, which is evaluated through cluster validity measures. Next, the algorithm is applied to some classification datasets, that is, data with labeled patterns, in order to examine its clustering accuracy.
5.1. Application of IFPCM Algorithm on Experimental Data
The parameters set in IFPCM algorithm are shown in Table 1. It is to be noted that if