Abstract

Fuzzy C-means (FCM) is an important clustering algorithm with broad applications such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. In particular, the parameters of FCM influence the clustering results; however, many FCM variants do not address how these parameters should be set. In this study, we present a method for computing parameter values according to the role the parameters play in the clustering process. New weight parameters are assigned to membership and typicality to modify the objective function, on the basis of which the Lagrange equation is constructed and the iterative equations for membership, typicality, and cluster centers are derived. Finally, a new possibilistic fuzzy C-means algorithm based on weight parameters (WPFCM) is proposed. To test the efficiency of the algorithm, experiments on different datasets were conducted to compare WPFCM with FCM, possibilistic C-means (PCM), and possibilistic fuzzy C-means (PFCM). Experimental results show that WPFCM requires about 25% fewer iterations than FCM and about 65% fewer than PFCM on dataset X12, and that its resubstitution errors are about 19% lower than FCM, 74% lower than PCM, and 10% lower than PFCM on the IRIS dataset.

1. Introduction

Clustering is a method of unsupervised learning and has been applied in various fields, including data mining, pattern recognition, computer vision, and bioinformatics. Clustering methods can be summarized as partition-based [1, 2], hierarchy-based [3], density-based [4–6], and grid-based [7]. Partition methods include hard partition [8, 9] and soft partition [10–12]. Soft partition is represented by fuzzy membership, whose value lies in the interval [0, 1]. Many fuzzy clustering algorithms have been developed and widely used in a variety of areas [13–16], such as data mining and pattern recognition. Ruspini [17] regarded fuzzy C-means (FCM) as a clustering algorithm, and Dunn [18] analyzed the fuzzy exponent m and set its value to 2; Bezdek later generalized the algorithm to any fuzzy exponent m > 1. The probabilistic constraint of FCM may cause memberships that conflict with the intuitive degree of belonging; furthermore, it makes the clustering results sensitive to noise. To overcome this defect, Krishnapuram and Keller [19] relaxed the constraint and proposed a new algorithm named possibilistic C-means (PCM) [20], which reduces the influence of noise on clustering and has good robustness. However, PCM relies on the initialization condition and may produce coincident clusters [21]. Many algorithms were developed to overcome the coincidence problem. For example, studies [22, 23] modified the PCM objective function by adding an inverse function of the distances between cluster centers. The study [23] proposed a new model named fuzzy possibilistic C-means (FPCM), which introduced, for unlabeled data, both membership and typicality values $t_{ij}$ subject to the row-sum constraint $\sum_{j=1}^{n} t_{ij} = 1$. FPCM reduced the sensitivity to noise in FCM and resolved the coincidence problem in PCM; however, because of the row-sum constraint, the typicality values become very small as the dataset scale increases.

The study by Pal [24] proposed a new algorithm named possibilistic fuzzy C-means (PFCM), which is a hybridization of PCM and FCM and overcomes the problems of PCM, FCM, and FPCM, in particular the sensitivity to noise. PFCM has been widely applied in many fields [25–27] and has solved some problems well. PFCM added coefficients a and b for membership and possibility, which measure their relative importance in the computation of centroids; however, the values of a and b were simply fixed at 1, meaning that membership and possibility had the same importance during the computation of centroids. This setting makes the clustering results less evident in some cases, and PFCM gives no scientific and rational method to compute these parameters. The main objective of this study is to generalize the FCM, PCM, and PFCM algorithms and propose a new algorithm named weight possibilistic fuzzy C-means (WPFCM). We designed a new objective function on the basis of PFCM. Following the requirement of minimizing the objective function, the iterative functions of membership, typicality, and centroid were obtained by constructing the Lagrange function and setting its partial derivatives to zero.

This study is quite different from the literature [28]. First, this study focuses on clustering with possibilistic fuzzy C-means, while the study by Schneider [28] addressed the possibilistic C-means algorithm; the underlying algorithms are different. Second, and more importantly, the design of the weight parameter is different. The algorithm in this study can automatically allocate weight values to inlier and outlier samples according to the calculation method of the weight parameters, which maximizes the membership values of inliers and reduces the influence of outliers on estimation. The weight parameter satisfies the optimization of the objective function, makes it iterate faster, and avoids the coincidence problem. The method in the study by Schneider [28] does not have these advantages.

Experiments on different datasets show that the new algorithm not only makes clustering results more evident but also partitions overlapping data better, reduces the number of iterations, and speeds up convergence. The rest of this study is organized as follows: Section 2 reviews the FCM, PCM, FPCM, and PFCM clustering algorithms. Section 3 provides a new method for the computation of parameters and proposes WPFCM. Section 4 experimentally demonstrates the performance improvement of WPFCM on some UCI datasets. Section 5 offers the conclusion.

2. Related Algorithms

Since fuzzy set theory was introduced by Zadeh, it has been rapidly applied in clustering algorithms. FCM is one of the most famous such algorithms; it obtains clustering results by minimizing an objective function and iterating membership and centroid updates. The objective function of FCM is designed as follows:

$$J_{FCM}(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}, \tag{1}$$

where the fuzzy exponent m is subject to m > 1 and the Euclidean distance is defined as $d_{ij} = \|x_j - v_i\|$. The membership can be obtained by minimizing objective function (1). The following equations are the iterative functions of membership and centroid:

$$u_{ij} = \left[\sum_{k=1}^{c} \left(\frac{d_{ij}}{d_{kj}}\right)^{2/(m-1)}\right]^{-1}, \qquad v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}. \tag{2}$$

The clustering performance is good; however, the algorithm is subject to the following three constraints: $u_{ij} \in [0, 1]$, $\sum_{i=1}^{c} u_{ij} = 1$ for every j, and $0 < \sum_{j=1}^{n} u_{ij} < n$ for every i, which make the algorithm sensitive to noise and often lead to center deviation caused by individual anomalous data points.
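For concreteness, the following is a minimal sketch of one FCM iteration per equations (1) and (2); it assumes NumPy, a data matrix X of shape (n, d), and a centroid matrix V of shape (c, d). The names are illustrative, not taken from the original paper.

```python
import numpy as np

def fcm_step(X, V, m=2.0):
    """One FCM iteration: update memberships, then centroids."""
    # Squared Euclidean distances d_ij^2 between centers V (c, d) and points X (n, d)
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # shape (c, n)
    d2 = np.fmax(d2, 1e-12)                                  # avoid division by zero
    # Equation (2): u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)); columns of U sum to 1
    ratio = (d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))
    U = 1.0 / ratio.sum(axis=1)
    # Equation (2): v_i = sum_j u_ij^m x_j / sum_j u_ij^m
    W = U ** m
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, V_new
```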

The constraints of FCM require each data point to consider its relation to points in the current cluster as well as in the other clusters; therefore, the membership may conflict with the intuitive degree of belonging and does not directly reflect the real clustering results. The FCM algorithm is sensitive to noise and obtains poor clustering results in noisy data environments. Krishnapuram and Keller [19] improved FCM and proposed the possibilistic C-means algorithm, which relaxes the constraint. The objective function is designed as follows:

$$J_{PCM}(T, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} t_{ij}^{q} d_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} (1 - t_{ij})^{q}, \tag{3}$$

where $\eta_i$ is the scaling parameter of the ith class, defined as $\eta_i = K \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} / \sum_{j=1}^{n} u_{ij}^{m}$ (commonly K = 1), the exponent q is subject to the constraint q > 1, and the Euclidean distance is defined as $d_{ij} = \|x_j - v_i\|$. The iterative functions of typicality and centroid are obtained by minimizing objective function (3). Equations (4) and (5) are the iterative functions:

$$t_{ij} = \frac{1}{1 + \left(d_{ij}^{2} / \eta_i\right)^{1/(q-1)}}, \tag{4}$$

$$v_i = \frac{\sum_{j=1}^{n} t_{ij}^{q} x_j}{\sum_{j=1}^{n} t_{ij}^{q}}. \tag{5}$$

In equation (3), $t_{ij}$ is not a membership but a possibility, and the clustering results are easy to interpret. PCM [29] relaxes the probabilistic constraint $\sum_{i=1}^{c} u_{ij} = 1$ and only requires $\max_i t_{ij} > 0$ for each j, so the rows and columns are independent and the data structure becomes loose. Therefore, the algorithm is insensitive to noise and can deal with datasets including outliers; on the other hand, it has another weakness: experiments show that PCM's clustering results depend on initialization and can generate coincident clusters. Pal [30] held that the clustering centroids are close to the data centers due to the effect of membership and proposed a new algorithm, FPCM, on the basis of FCM and PCM, which uses the data center as the clustering center. This is feasible to a great extent. Membership is a good means when data points need to be labeled clearly, because it is natural to assign a point to the cluster whose prototype is nearest to it, while possibility is important for estimating the cluster centers, as it effectively reduces the influence of anomalous data points. The objective function was designed as follows:

$$J_{FPCM}(U, T, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} \left(u_{ij}^{m} + t_{ij}^{q}\right) d_{ij}^{2}, \tag{6}$$

where the membership is subject to the constraint $\sum_{i=1}^{c} u_{ij} = 1$ (j = 1, …, n), the typicality is subject to the constraint $\sum_{j=1}^{n} t_{ij} = 1$ (i = 1, …, c), the other constraints are m > 1, q > 1, and 0 < $u_{ij}$, $t_{ij}$ < 1, and the Euclidean distance is defined as $d_{ij} = \|x_j - v_i\|$. The iterative functions of membership, typicality, and prototype can be obtained by minimizing the objective function. Equations (7)–(9) are the iterative functions, respectively:

$$u_{ij} = \left[\sum_{k=1}^{c} \left(\frac{d_{ij}}{d_{kj}}\right)^{2/(m-1)}\right]^{-1}, \tag{7}$$

$$t_{ij} = \left[\sum_{k=1}^{n} \left(\frac{d_{ij}}{d_{ik}}\right)^{2/(q-1)}\right]^{-1}, \tag{8}$$

$$v_i = \frac{\sum_{j=1}^{n} \left(u_{ij}^{m} + t_{ij}^{q}\right) x_j}{\sum_{j=1}^{n} \left(u_{ij}^{m} + t_{ij}^{q}\right)}. \tag{9}$$

Although FPCM overcomes the weaknesses of PCM and FCM, the typicality values become very small as the number of samples increases: on a large dataset, the typicality values are inconsistent with the real values because of the row-sum constraint. Pal [24] improved FPCM by relaxing the row-sum constraint on typicality while retaining the column-sum constraint on membership, and proposed a new algorithm named possibilistic fuzzy C-means (PFCM). The objective function is designed as follows:

$$J_{PFCM}(U, T, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} \left(a\, u_{ij}^{m} + b\, t_{ij}^{q}\right) d_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} (1 - t_{ij})^{q}, \tag{10}$$

where the parameters are subject to the constraints m > 1, q > 1, η_i > 0, 0 < $u_{ij}$, $t_{ij}$ < 1, 1 ≤ i ≤ c, and 1 ≤ j ≤ n, the Euclidean distance is defined as $d_{ij} = \|x_j - v_i\|$, and the parameters a and b are constants. The iterative functions of membership, typicality, and prototype can be obtained by minimizing the objective function. Equations (11)–(13) are the iterative functions:

$$u_{ij} = \left[\sum_{k=1}^{c} \left(\frac{d_{ij}}{d_{kj}}\right)^{2/(m-1)}\right]^{-1}, \tag{11}$$

$$t_{ij} = \frac{1}{1 + \left(b\, d_{ij}^{2} / \eta_i\right)^{1/(q-1)}}, \tag{12}$$

$$v_i = \frac{\sum_{j=1}^{n} \left(a\, u_{ij}^{m} + b\, t_{ij}^{q}\right) x_j}{\sum_{j=1}^{n} \left(a\, u_{ij}^{m} + b\, t_{ij}^{q}\right)}, \tag{13}$$

where $\eta_i$ is defined as $\eta_i = K \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} / \sum_{j=1}^{n} u_{ij}^{m}$, and usually K is a constant (K = 1).
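The PFCM update step can be sketched in the same style as the FCM snippet above; the following follows equations (11)–(13) under the same NumPy assumptions, with eta passed in as a precomputed vector of the η_i values.

```python
import numpy as np

def pfcm_step(X, V, eta, a=1.0, b=1.0, m=2.0, q=2.0):
    """One PFCM iteration per equations (11)-(13)."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # (c, n)
    d2 = np.fmax(d2, 1e-12)
    # Equation (11): FCM-style membership (column sums equal 1)
    U = 1.0 / ((d2[:, None, :] / d2[None, :, :]) ** (1.0 / (m - 1.0))).sum(axis=1)
    # Equation (12): typicality, with the relaxed row-sum constraint
    T = 1.0 / (1.0 + (b * d2 / eta[:, None]) ** (1.0 / (q - 1.0)))
    # Equation (13): centroids weighted by a*u^m + b*t^q
    W = a * U ** m + b * T ** q
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, T, V_new
```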

3. WPFCM Algorithm

This section consists of three parts. The motivation for the weight parameters is introduced first; the calculation method of the weight parameters is presented in the second part; and the last part gives the objective function and the steps of the algorithm.

3.1. Motivation for Weight Parameters

PFCM integrates the merits of PCM and FCM by including both membership and typicality. PFCM reduces the sensitivity to noise in FCM, overcomes the coincidence problem in PCM, and avoids the problem in FPCM that the typicality values become very small as the number of data points increases. After analyzing the parameters a and b, we found that their values influence membership and typicality and thus affect the clustering results. If a is greater than b, the prototype is affected more by membership than by typicality; conversely, if b is greater than a, the prototype is affected more by typicality than by membership. Therefore, if we want to reduce the influence of outliers on the clustering results, we should choose a lower than b. Determining suitable parameter values is difficult. Usually, both are fixed at 1, which means that membership and typicality have the same importance for the clustering results; in that case, introducing the two parameters a and b becomes meaningless. In many situations, we do not know which values fit the parameters, and determining a and b then depends on experience. Assigning the values of a and b lacks a mathematical basis, so the choice in PFCM is arbitrary and unscientific, and the clustering results become unstable. Another weakness of PFCM is that all data vectors share the same parameter values in the clustering process, although different vectors have different importance for clustering; it is therefore unreasonable for a and b to be fixed at 1. To overcome these weaknesses, we propose a new method to compute the values of weight parameters that replace a and b in PFCM. The new parameters account for the importance of each sample in the clustering process, so the calculation method is more reasonable. The importance of these parameters lies in the fact that the values of a and b directly affect the typicality value $t_{ij}$ and the centroid value $v_i$, affect the membership indirectly, and thereby influence the clustering results.

3.2. Calculation Method of Weight Parameters

Several studies have proposed methods for calculating weight parameters [31–33]. The study by Fan et al. [32] assigned weights to properties according to the importance of each property to the clustering process. For example, in the IRIS dataset [34], the third and fourth properties are beneficial for obtaining evident clustering results, so they are assigned a high weight value and the others a low one. The premise is that we must know in advance which properties are important and which are not; for an unknown dataset, this method cannot be applied. The study by Nock and Nielsen [33] estimated the probability density of all samples by using the analogy method, which requires a great deal of computation. The study by Hung [29] gave a prototype-driven learning of the parameter $\eta_i$, based on the exponential separation strength between clusters and updated at each iteration, to improve the performance of FCM. Equation (14) is the definition of the parameter $\eta_i$:

$$\eta_i = \exp\left(-\min_{k \ne i} \frac{\|v_k - v_i\|^{2}}{\beta}\right), \tag{14}$$

where the normalization parameter $\beta$ is defined through the distances from the data points $x_j$ to the sample mean:

$$\beta = \frac{1}{n} \sum_{j=1}^{n} \|x_j - \bar{x}\|^{2}, \tag{15}$$

and $\bar{x}$ can be defined as the sample mean:

$$\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j. \tag{16}$$
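A short sketch of these quantities, under the reconstruction of equations (14)–(16) given above (the exact published form may differ slightly); all names are illustrative.

```python
import numpy as np

def separation_eta(X, V):
    """eta_i per the reconstructed equations (14)-(16): exponential separation strength."""
    x_bar = X.mean(axis=0)                                   # equation (16): sample mean
    beta = ((X - x_bar) ** 2).sum(axis=1).mean()             # equation (15)
    # equation (14): eta_i = exp(-min_{k != i} ||v_k - v_i||^2 / beta)
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)                            # exclude k == i
    return np.exp(-sep.min(axis=1) / beta)
```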

Definition 1. A given sample set to be classified is denoted by X (X = {x1, x2, …, xn} ⊂ F(X)); X is partitioned into c (0 < c < n) fuzzy subsets, where c is the number of clusters.

Definition 2. According to the importance of the data point xj (xj ∈ X) during the clustering process, the weight parameter can be defined as γij, the weight of xj with respect to class i. The following equation is the calculation method:

$$\gamma_{ij} = 1 - \exp\left(-\frac{c\, d^{2}(x_j, v_i)}{\beta}\right), \tag{17}$$

where $\beta$ comes from equation (15).

Theorem 1. The distance from xj to the center can be regarded as a weight: if the distance is long, the value of the weight is high; conversely, if the distance is short, the value of the weight is low.

Proof. $\bar{x}$ is the sample mean, and the difference between xj and $\bar{x}$ is reflected by the distance from xj to $\bar{x}$, which is constant for a given point. The smaller the value of $d^{2}(x_j, v_i)$, the shorter the distance from xj to class i. We can deduce that the larger the value of $\exp(-c\, d^{2}(x_j, v_i)/\beta)$, the smaller the value of γij; conversely, the longer the distance from xj to class i, the larger the value of γij. Optimization of the objective function requires a minimum value, and the weight parameter should satisfy this optimization objective, so a long distance should receive a large γij. It is therefore appropriate to use γij as the weight parameter.
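The following sketch computes γij under the exponential form reconstructed above for equation (17); the formula and names are assumptions consistent with Theorem 1, not a verbatim transcription of the original equation.

```python
import numpy as np

def weights(X, V, beta):
    """Typicality weights gamma_ij = 1 - exp(-c * d_ij^2 / beta), in (0, 1)."""
    c = V.shape[0]
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # (c, n) squared distances
    gamma = 1.0 - np.exp(-c * d2 / beta)                     # grows with distance
    return gamma                                             # membership weight is 1 - gamma
```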

3.3. Design of the Objective Function

According to the rule of classification, there should be little difference among data in the same class and great difference between different classes. When designing the objective function, in order to assign a data point, the nearest distance from the data point to a center should be selected, which corresponds to maximizing the membership value; the typicality value can be used to reduce the influence of outliers on estimation. The new objective function should therefore meet two requirements: on the one hand, the role of membership in the objective function should be increased when a sample is an inlier; on the other hand, the role of the typicality value should be increased when a sample is an outlier. Accordingly, the objective function is designed as in the following equation, which includes two parts: the first part is the fuzzy term weighted by the membership weight parameter, and the second part is the typicality term weighted by the typicality weight parameter.

Definition 3. The new objective function, based on FCM, PCM, and PFCM, is designed as follows:

$$J_{WPFCM}(U, T, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} \left[(1 - \gamma_{ij})\, u_{ij}^{m} + \gamma_{ij}\, t_{ij}^{q}\right] d_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} (1 - t_{ij})^{q}, \tag{18}$$

where γij (0 < γij < 1) denotes the weight between data point xj and class i, which comes from equation (17). Different data points have different weight values, so the clustering results are more reasonable and the coincidence problem is avoided. U, T, and V denote the membership matrix (c×n), the typicality matrix (c×n), and the centroid matrix (c×1), respectively. Here, uij (0 < uij < 1) is the membership of feature point xj in cluster ci and tij (0 < tij < 1) is the typicality of xj in cluster ci. $d_{ij} = \|x_j - v_i\|$ is the Euclidean distance between data point xj and centroid $v_i$. The parameters m (m > 1) and q (q > 1) are the fuzzy exponents. The parameter ηi (ηi > 0) is a constant, defined by $\eta_i = K \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} / \sum_{j=1}^{n} u_{ij}^{m}$, where K is usually fixed at 1.
According to the preceding analysis, the nearer the distance between data point xj and cluster ci, the smaller the value of the weight parameter γij. A short distance from xj to cluster ci indicates that xj belongs to the ith cluster, so the weight of the membership term should be increased and is set to (1−γij). On the contrary, the farther the distance between xj and the ith cluster, the greater the difference between them, and xj may be an anomalous point; the typicality weight should then be increased to reduce the effect of xj on clustering, and it is set to γij. As the membership weight increases (decreases), the typicality weight decreases (increases). The weight parameter is calculated per sample data point, which overcomes the unreasonable fixed values of a and b in PFCM and resolves the coincidence problem caused by a small value of a and poor initialization of the centroids.
According to Definition 3, the Lagrangian multiplier method is used to construct the Lagrange equation. To minimize equation (18), the partial derivatives with respect to uij and tij are computed under the constraints $\sum_{i=1}^{c} u_{ij} = 1$ and 0 < tij < 1, giving the membership uij, the typicality tij, and the centroid $v_i$ as follows:

$$u_{ij} = \left[\sum_{k=1}^{c} \left(\frac{(1 - \gamma_{ij})\, d_{ij}^{2}}{(1 - \gamma_{kj})\, d_{kj}^{2}}\right)^{1/(m-1)}\right]^{-1}, \tag{19}$$

$$t_{ij} = \frac{1}{1 + \left(\gamma_{ij}\, d_{ij}^{2} / \eta_i\right)^{1/(q-1)}}, \tag{20}$$

$$v_i = \frac{\sum_{j=1}^{n} \left[(1 - \gamma_{ij})\, u_{ij}^{m} + \gamma_{ij}\, t_{ij}^{q}\right] x_j}{\sum_{j=1}^{n} \left[(1 - \gamma_{ij})\, u_{ij}^{m} + \gamma_{ij}\, t_{ij}^{q}\right]}. \tag{21}$$
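A sketch of one WPFCM iteration implementing the update rules (19)–(21) as reconstructed above, with γij held fixed within the iteration (consistent with treating the weights as constants when differentiating); it assumes NumPy and the same illustrative shapes as the earlier snippets.

```python
import numpy as np

def wpfcm_step(X, V, gamma, eta, m=2.0, q=2.0):
    """One WPFCM iteration per the reconstructed equations (19)-(21)."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # (c, n)
    d2 = np.fmax(d2, 1e-12)
    # Equation (19): membership with per-pair weights (1 - gamma_ij)
    num = (1.0 - gamma) * d2
    U = 1.0 / ((num[:, None, :] / num[None, :, :]) ** (1.0 / (m - 1.0))).sum(axis=1)
    # Equation (20): typicality damped by gamma_ij
    T = 1.0 / (1.0 + (gamma * d2 / eta[:, None]) ** (1.0 / (q - 1.0)))
    # Equation (21): centroids weighted by (1-gamma)*u^m + gamma*t^q
    W = (1.0 - gamma) * U ** m + gamma * T ** q
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, T, V_new
```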

3.4. WPFCM Algorithm

According to the objective function, the steps of the algorithm are provided in Algorithm 1.

4. Experiments

In order to validate the efficiency of the algorithm, experiments on different datasets were carried out. The initial values of the parameters are set as follows: ε = 0.000001, maximum number of iterations max_iter = 100, constant K = 1, number of clusters Cluster_n = 2 for dataset X12, and Cluster_n = 3 for dataset IRIS [34].

Experiment 1. Dataset: X12 [35]; algorithms: FCM, PCM, PFCM, and WPFCM; initialization:
X12 is a two-dimensional dataset with 12 data points. The coordinates of X12 are given in Table 1, and Figure 1 shows their distribution. Ten points form two clusters of five points each on the left and right sides of the y axis. Data points x6 and x12 are considered noise, and each has the same distance to the two clusters.
Table 2 presents the centroids generated by running FCM, PFCM, and WPFCM on X12. Define the distance DistX = ||VX12 − VX||2, which denotes the distance from the real centroid VX12 to the centroid VX generated by algorithm X. Computing this distance for each algorithm gives DistFCM = 0.4212, DistPFCM = 0.3860, and DistWPFCM = 0.1537. Comparing the three distances, although each algorithm obtains a good result, DistWPFCM is the minimum; that is to say, VWPFCM is nearer to the real centroid than the other VX and reflects the real cluster centers better.
Table 3 provides the minimum iteration counts of FCM, PFCM, and WPFCM with the optimal given parameters. The iteration count of WPFCM is slightly less than that of FCM and far less than that of PFCM. Therefore, WPFCM has less running time on large datasets and a high convergence speed.
Table 4 presents the membership values obtained by running FCM, PFCM, and WPFCM. By comparison, the membership values of WPFCM are better than those of the other two algorithms; in particular, for data points x3 and x9, the membership values are equal to one. Data points x3 and x9 are the centers of the two clusters, which shows that WPFCM recognizes cluster centers more easily. Membership values cannot distinguish the noisy data points x6 and x12, but the noisy data are identified by the typicality values in Table 5. Analyzing the data in Table 5, the typicality values of WPFCM are greater than those of PFCM; a data point with a larger typicality value is more likely to belong to the cluster. The typicality values of x3 and x9 reach 1 in Table 5, which shows that each of these points belongs to its respective cluster with high possibility.
The membership values of the noisy points x6 and x12 are equal to 0.5. Figure 1 shows that the distance from x6 to the two cluster centers is far smaller than that from x12, but Table 4 cannot show this difference. Table 5 shows that the typicality values of the ten normal data points are greater than 0.9, while those of x6 and x12 are far smaller, so we consider x6 and x12 to be noise. We also find in Table 5 that the typicality value of x12 is far smaller than that of x6, which shows that the noisy point x12 belongs to the two clusters with less possibility than x6 and reflects the distribution of x6 and x12 in Figure 1. WPFCM thus improves on the defect of FCM. From Table 5, we also find that the typicality values of WPFCM are better than those of PFCM, so WPFCM obtains more evident clustering results.
Table 6 presents the centroids and iteration counts obtained by running WPFCM on dataset X12 with different parameters. The clustering results of WPFCM are better than those of FCM and PFCM as a whole, and the iteration counts are somewhat smaller. When m is kept unchanged and q varies from 2 to 5, the membership values show a slight increasing tendency, while the typicality values decrease evidently. The cluster centers also change considerably, increasing and moving nearer to the real centroid, and the iteration counts decrease. As q increases, the influence of the weight parameter γij on the clustering results grows. The weight parameter γij is generated in the iterative procedure and the initial centroids are generated randomly, so WPFCM overcomes the defect of randomly selecting a and b and reduces the uncertainty of the clustering results. The clustering results in Table 7 are better than those in Table 6; however, the membership values decrease greatly and the iteration counts increase. Comprehensive consideration suggests setting m and q to 1.5 and 5, respectively.

Experiment 2. Dataset: IRIS; algorithms: FCM, PCM, PFCM, and WPFCM.
IRIS is a four-dimensional dataset including three classes: setosa, versicolor, and virginica. Each class has 50 data points, adding up to 150 data points. The first class, setosa, is well separated from the other two classes without overlapping; there is some overlap between versicolor and virginica.
The data in Tables 8 and 9 were acquired by running FCM, PCM, PFCM, and WPFCM many times on IRIS. Each algorithm obtained good clustering centroids. Compared with the other algorithms, WPFCM acquired more evident membership and typicality values and better separation. The two centroids of versicolor and virginica obtained by PCM almost overlap. It is difficult to see the separation between clusters directly in Tables 8 and 9, so in order to compare the separation between different classes, we define the distance between classes as Distij = ||Vi − Vj||2, which denotes the distance from the ith cluster to the jth cluster.
Table 10 provides the distance values between the centroids generated by FCM, PCM, PFCM, and WPFCM on IRIS. The values of Dist12 and Dist13 computed with Distij = ||Vi − Vj||2 for WPFCM, FCM, and PFCM reflect the fact that setosa is separated from the other two classes, versicolor and virginica. In PCM, however, Dist12 and Dist13 are almost identical and Dist23 is nearly zero, so the results do not reflect the features of the dataset; this is caused by the coincident clusters of PCM. Although FCM, PFCM, and WPFCM all reflect the separation of setosa from the other two classes and the overlap between versicolor and virginica, comparing Dist23 shows that the value in WPFCM is the nearest to the real one. We conclude that WPFCM reflects the characteristics of the dataset better than the other algorithms and more easily obtains a good partition, especially for the classes versicolor and virginica.
Table 11 provides the distances between the centroids generated by FCM, PCM, PFCM, and WPFCM and the real centroids. Formula (24) defines the sum of the distances between the centroids acquired by each algorithm and the real centroids:

$$Dist_{X} = \sum_{i=1}^{c} Dist_{xi}, \tag{24}$$

where Distxi represents the distance from the ith cluster center to the real centroid. Each Distxi of WPFCM is less than that of the other algorithms. Comparing the values of DistX, we obtain the relation DistWPFCM < DistPFCM < DistFCM < DistPCM in Table 11, which shows that there is little difference between the centroids of WPFCM and the real centroids.
The iteration counts of FCM, PCM, PFCM, and WPFCM on IRIS are given in Table 12. The iteration count of WPFCM is slightly larger than that of FCM but far less than those of PCM and PFCM. The WPFCM algorithm thus acquires the cluster centers quickly and converges fast.
The numbers of resubstitution errors of FCM, PCM, PFCM, and WPFCM on the dataset are given in Table 13. The resubstitution errors of WPFCM are slightly less than those of FCM and PFCM but far less than those of PCM, with regard to both membership and typicality values. Table 13 shows two relations: UeWPFCM < UePFCM < UeFCM < UePCM and TeWPFCM < TePFCM < TeFCM < TePCM. The resubstitution errors of membership and typicality reach 50 in PCM, far greater than in the other algorithms; the reason is that PCM suffers from the coincident clusters problem and there are overlapping data between versicolor and virginica.

5. Conclusions

A new possibilistic fuzzy C-means algorithm based on weight parameters was proposed according to the importance of membership and typicality in the clustering process. First, aiming at the unreasonable parameters a and b, we designed the weight parameter γij based on the literature [23] and provided a concrete calculation method. The weight parameters (1−γij) and γij were assigned to membership and typicality, respectively, the objective function (equation (18)) was improved, and the new algorithm (Algorithm 1) was provided. Experiments on different datasets show that the new algorithm performs well in dealing with noisy data and obtains better clustering results; WPFCM resolves the coincidence problem and overcomes the sensitivity to noisy data. We also examined the influence of different values of the exponent parameters m and q on the membership values, typicality values, and centroids, and determined the exponents by comprehensively comparing these quantities. The experiments also compare the iteration counts of the different algorithms: WPFCM requires fewer iterations and converges fast. The resubstitution errors of WPFCM are close to those of FCM and PFCM but far less than those of PCM. Taken together, the performance indexes suggest that WPFCM overcomes the noise sensitivity of FCM, the coincidence problem of PCM, and the unreasonable weight parameters of PFCM. Future work is to extend the new algorithm to nonpoint prototype clustering models such as the spherical, quadric, and shell prototypes.

(1) Initializing the parameters m (m > 1), q (q > 1), ε, and c (0 < c < n), setting the maximum cycle number max_iter, setting the initial cycle number to 1, and randomly generating the centroids V0
(2) Computing the distances according to $d_{ij} = \|x_j - v_i\|$
(3) Computing the weight parameters γij and (1−γij) by using equation (17)
(4) Computing the membership values uij and typicality values tij by using equations (19) and (20)
(5) Computing the objective function obj_fcn by using equation (18)
(6) If |obj_fcn(i) − obj_fcn(i−1)| < ε or the number of iterations reaches max_iter, then stop
 Else obj_fcn(i) ⟶ obj_fcn(i−1)
(7) Computing the centroids by using equation (21) and going to step 2
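Assembling the pieces, the loop below sketches Algorithm 1 end to end; it reuses the helper functions sketched earlier (separation_eta, weights, wpfcm_step) and is an illustrative assembly under the reconstructed equations, not the authors' reference implementation.

```python
import numpy as np

def wpfcm(X, c, m=1.5, q=5.0, eps=1e-6, max_iter=100, seed=0):
    """Run the sketched WPFCM loop on data X (n, d) with c clusters."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]          # step 1: random centroids
    beta = ((X - X.mean(axis=0)) ** 2).sum(axis=1).mean()     # equation (15)
    prev_obj = np.inf
    for _ in range(max_iter):                                 # step 6 bounds the loop
        gamma = weights(X, V, beta)                           # steps 2-3: distances and gamma_ij
        eta = separation_eta(X, V)                            # eta_i per equation (14)
        U, T, V = wpfcm_step(X, V, gamma, eta, m=m, q=q)      # steps 4 and 7
        d2 = np.fmax(((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2), 1e-12)
        obj = (((1 - gamma) * U**m + gamma * T**q) * d2).sum() \
              + (eta[:, None] * (1 - T) ** q).sum()           # step 5: objective (18)
        if abs(obj - prev_obj) < eps:                         # step 6: convergence test
            break
        prev_obj = obj
    return U, T, V
```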

Data Availability

The data used to support the findings of this study are available at http://archive.ics.uci.edu/ml/datasets/iris.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by “Petrel Program of Lianyungang Jiangsu Province, China” (KK18088), and “the Program of Science and Technology Associate Chief Engineer of Jiangsu Province of China” (FZ20200458).