Abstract

Clustering involves grouping data points together according to some measure of similarity. Clustering is one of the most significant unsupervised learning problems and does not need any labeled data. There are many clustering algorithms, among which fuzzy c-means (FCM) is one of the most popular approaches. FCM has an objective function based on Euclidean distance. Some improved versions of FCM with rather different objective functions have been proposed in recent years. Generalized Improved fuzzy partitions FCM (GIFP-FCM) is one of them; it uses the $L_p$-norm distance measure and competitive learning and outperforms the previous algorithms in this field. In this paper, we present a novel FCM clustering method with improved fuzzy partitions that utilizes shadowed sets and aims to improve GIFP-FCM on noisy data sets. It enhances the efficiency of GIFP-FCM and improves the clustering results by correctly eliminating most outliers during the clustering steps. We name the novel fuzzy clustering method shadowed set-based GIFP-FCM (SGIFP-FCM). Several experiments on vessel segmentation in retinal images of the DRIVE database illustrate the efficiency of the proposed method.

1. Introduction

Clustering is one of the first mental activities of humans; its goal is to find structure in data and assign similar data to a group (cluster). It tries to put unlabeled data into groups so that data points in a group are more similar to each other than to those in other groups. In other words, the goal is to maximize the intracluster similarity while minimizing the intercluster similarity.

Clustering has been used in diverse areas such as machine vision and pattern recognition, as well as in medical applications. There is a wide array of clustering algorithms [1–8], but they can generally be classified into the following groups:
(i) exclusive clustering: each data point belongs to exactly one cluster, with no overlapping (K-means and any linear classifier belong to this kind),
(ii) overlapping clustering: a data point can belong to two or more clusters (fuzzy C-means),
(iii) hierarchical clustering,
(iv) probabilistic clustering (model based).

Among the above, overlapping clustering or partition-based clustering methods that group objects with some membership degrees are often used to handle noisy and uncertain data and hence can have a good utility in many practical applications.

Probabilistic and fuzzy clustering are two kinds of overlapping clustering methods that take the uncertainty in data into account. Crisp clustering can be considered a special case of these two clustering methods. FCM [9] is the most popular of the fuzzy clustering methods, and the fuzzy clustering methods proposed later are in general based on it. FCM and its derived methods cluster data according to an objective function and several constraints. For instance, the memberships of each data point in all clusters must sum to one.

A novel way of constraining the membership functions is presented in [10], together with a new FCM algorithm based on improved fuzzy partitions, named IFP-FCM. This algorithm favors crisp membership degrees, so it appears more resistant to noise and outliers. However, the fuzziness index m in [10] is fixed and cannot be changed. Since the clusters in a data set have different densities, the performance of FCM may depend significantly on the choice of fuzziness index. Therefore, the value of this parameter should be chosen according to the data distribution in the data set [11], and another objective function is needed that allows m to be adjusted. In [12], GIFP-FCM is proposed, which does not restrict m to a fixed value. A parameter of this algorithm ($\alpha$) connects it to FCM and IFP-FCM. In short, this algorithm has the benefits of IFP-FCM and also generalizes it, because m can take various values.

In this paper, we present a novel Shadowed set-based Generalized Improved fuzzy partitions FCM (SGIFP-FCM). The new algorithm uses shadowed sets, and its main idea is to improve the performance of GIFP-FCM in the stage where new cluster centers are determined at each iteration. This is accomplished by removing the outliers and unsuitable data that have negative effects on the structure of clusters. We will see that it improves the clustering results on noisy data and also decreases the clustering time in comparison with GIFP-FCM and the previous methods.

Clustering methods are often used in image processing applications such as image segmentation [7]. To investigate the proficiency and utility of the presented method, the algorithm is applied here to vessel extraction in retinal images. These images are often used in medical applications to diagnose diseases such as diabetes. The proposed approach is compared against other competing clustering algorithms on this database as well as on an artificial data set.

Some fuzzy clustering algorithms are briefly reviewed in Section 2. Section 3 describes the proposed algorithm (SGIFP-FCM) in detail. Section 4 shows the experimental results of our work. Finally, Section 5 concludes the paper.

2. Previous Fuzzy Clustering Algorithms

Many areas, such as pattern recognition and machine vision, use fuzzy clustering to solve their problems. There are a great number of fuzzy clustering algorithms, and most of them use distance criteria. One of the most common and popular of these algorithms is FCM [13], which uses inverse distances to compute fuzzy memberships.

In FCM, each feature vector can belong to every cluster with a coefficient between zero and one. At the end, the algorithm labels each data point (feature vector) with the cluster for which this coefficient is maximal.

The fuzzy membership matrix and the cluster centers are computed by minimizing the following partition functional:

$$J_f(C, m) = \sum_{i=1}^{c} \sum_{k=1}^{n} \left(u_{i,k}\right)^m d_{i,k} \quad \text{subject to} \quad \sum_{i=1}^{c} u_{i,k} = 1. \tag{1}$$

In this equation, $n$ denotes the number of data points, $c$ the number of clusters, $u_{i,k}$ the fuzzy membership of the $k$th data point in the $i$th cluster, $d_{i,k}$ the Euclidean distance between the data point and the cluster center, and $m \in (1, \infty)$ a fuzzy weighting factor that defines the degree of fuzziness of the results. The data classes become fuzzier and less discriminating with increasing $m$. In general, $m = 2$ is chosen, although this value of $m$ does not produce the optimal solution for all problems.

The constraint in (1) implies that each point must entirely distribute its membership among all the clusters. The cluster centers (centroids) are determined as the fuzzy-weighted center of gravity of the data $X$:

$$v_i = \frac{\sum_{k=1}^{n} \left(u_{i,k}\right)^m x_k}{\sum_{k=1}^{n} \left(u_{i,k}\right)^m}, \quad i = 1, 2, \ldots, c. \tag{2}$$

Since $u_{i,k}$ affects the computation of the cluster center $v_i$, data points with a high membership influence the prototype location more than points with a low membership. For the fuzzy C-means algorithm, the distance $d_{i,k}$ is defined as follows:

$$\left(d_{i,k}\right)^2 = \left\| x_k - v_i \right\|^2. \tag{3}$$

The cluster centers $v_i$ represent the typical values of that cluster, whereas the $u_{i,k}$ component of the membership matrix denotes the extent to which the data point $x_k$ is similar to its prototype. Minimizing the partition functional (1) gives the following expression for the membership:

$$u_{i,k} = \frac{1}{\sum_{j=1}^{c} \left( d_{i,k} / d_{j,k} \right)^{1/(m-1)}}. \tag{4}$$

Equation (4) is evaluated iteratively, since the distance $d_{i,k}$ depends on the cluster centers, which in turn depend on the memberships $u_{i,k}$.

The FCM procedure is as follows.
(1) Choose the number of clusters $c$, $2 \le c < n$; choose $m$, $1 < m < \infty$. Initialize $U^{(0)}$.
(2) Calculate the cluster centers $v_i$ using (2).
(3) Calculate the new partition matrix $U^{(j+1)}$ using (4).
(4) Compare $U^{(j)}$ and $U^{(j+1)}$. If the variation of the membership degrees $u_{i,k}$, calculated with an appropriate norm, is smaller than a given threshold, terminate the algorithm; otherwise go back to step (2).
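The four steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in the experiments: the function name `fcm`, the random initialization, the tolerance `tol`, and the use of the maximum absolute membership change as the stopping norm are assumptions. The membership update applies the exponent $1/(m-1)$ of (4) to squared Euclidean distances.

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Plain fuzzy c-means on data X of shape (n, p); returns (U, V)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 1: random initial partition matrix U (c x n); columns sum to one.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        # Step 2: cluster centers as fuzzy-weighted means, Eq. (2).
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Squared Euclidean distances, Eq. (3); shape (c, n).
        D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        D2 = np.maximum(D2, 1e-12)  # guard against division by zero
        # Step 3: membership update, Eq. (4), on squared distances.
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Step 4: stop when the largest membership change is below tol.
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    return U, V
```

Hard labels can then be obtained with `U.argmax(axis=0)`, which assigns each point to the cluster with the maximum membership coefficient.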

FCM clustering has a shortcoming in the membership functions it produces. For instance, these functions are not bounded and do not have to decay rapidly, so they cannot be interpreted locally. The IFP-FCM method [10] changes FCM's objective function so that distances to the Voronoi cell of the cluster are used instead of distances to the cluster prototypes. The objective function of IFP-FCM is

$$J_{\text{IFP-FCM}} = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{2}\, d^{2}(x_j, v_i) - \sum_{j=1}^{n} a_j \sum_{i=1}^{c} \left( u_{ij} - \frac{1}{2} \right)^{2}. \tag{5}$$

In this equation, $a_j$ is a rewarding parameter. IFP-FCM appears more resistant to outliers and even to noise.

The fuzziness index m is equal to 2 in many cases for FCM and IFP-FCM. However, optimal or near-optimal results require different values of this parameter for different data, so the fuzziness index m should be flexible. This calls for another objective function.

For that reason, the GIFP-FCM clustering approach was presented in [12]. It is based on the rival-penalized competitive learning (RPCL) concept [14]:

$$J_{\text{GIFP-FCM}} = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d^{2}(x_j, v_i) + \sum_{j=1}^{n} a_j \sum_{i=1}^{c} u_{ij} \left( 1 - u_{ij}^{m-1} \right). \tag{6}$$

The new objective function uses the idea of RPCL: during minimization, only one specified $u_{ij}$ receives the maximum reward, and the other $u_{kj}$ (the rivals) receive the minimum reward.

The authors of [12] showed that GIFP-FCM reduces to FCM or IFP-FCM for proper choices of some parameters, and that it converges faster than these two clustering algorithms while having the same computational complexity. The fuzziness index $m$ is fixed at 2 in IFP-FCM, whereas it can be chosen freely in GIFP-FCM.

3. The Proposed Method

In this section, we briefly describe shadowed sets and then propose our new clustering algorithm, SGIFP-FCM.

3.1. Shadowed Sets

Suppose that A is a fuzzy set in $X$. A shadowed set B induced by A is an interval-valued set that maps the elements of $X$ into 0, 1, and the unit interval [0,1]; that is, B is a mapping $B : X \to \{0, 1, [0,1]\}$, where 0, 1, and [0,1] denote complete exclusion from B, complete inclusion in B, and complete ignorance, respectively. Shadowed sets have some notable characteristics. For example, they are isomorphic with a three-valued logic, and shadowed sets and this logic have similar operations (for more details, see [15, 16]).

Figure 1 depicts a fuzzy set and also its shadowed set.

3.2. Creating a Shadowed Set

Pedrycz [17] presented a way of creating a shadowed set from a fuzzy set. Two thresholds $\alpha$ and $\beta$ are defined first, with $\alpha \in [0, 0.5]$ and $\beta \in [0.5, 1]$. Next, low membership grades (below $\alpha$) are reduced to 0 and high membership grades (above $\beta$) are elevated to 1. Finally, the membership grades between $\alpha$ and $\beta$ are mapped to the whole interval [0,1], which forms the shadow. Figure 2 shows this procedure for a unimodal fuzzy set.
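As a small illustration of this construction, the following sketch maps a vector of membership grades to a shadowed set. The function name `to_shadowed`, the use of NaN to encode the shadow (the whole interval [0,1]), and the inclusive treatment of the two thresholds are assumptions made for the example.

```python
import numpy as np

def to_shadowed(u, alpha, beta):
    """Map membership grades to 0 (excluded), 1 (included), or NaN (shadow)."""
    u = np.asarray(u, dtype=float)
    shadowed = np.full(u.shape, np.nan)  # NaN stands for the shadow [0, 1]
    shadowed[u <= alpha] = 0.0           # low grades are reduced to 0
    shadowed[u >= beta] = 1.0            # high grades are elevated to 1
    return shadowed
```

For example, with `alpha=0.3` and `beta=0.7`, the grades 0.1, 0.5, and 0.9 become 0, shadow, and 1, respectively.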

In fact, the thresholds $\alpha$ and $1 - \alpha$ are proper choices, so the threshold $\alpha$ is an essential part of constructing a shadowed set, and finding its optimal value is of utmost importance. Pedrycz considers the balance of uncertainty for this purpose. The balance of uncertainty is preserved by compensating the changes of membership grades to zero and one with the shadow, which "absorbs" the partial membership eliminated at the low and high ranges of membership.

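According to Figure 2, the balance equation for a fuzzy set with a discrete membership function is defined as

$$\Omega_1 + \Omega_2 = \Omega_3, \qquad \Omega_1 = \sum_{u_i \le \alpha} u_i, \qquad \Omega_2 = \sum_{u_i \ge 1 - \alpha} \left( 1 - u_i \right), \qquad \Omega_3 = \operatorname{card}\left\{ u_i \mid \alpha < u_i < 1 - \alpha \right\}. \tag{7}$$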

To find the optimal value of the parameter $\alpha$, it is better to restate the above problem as the minimization

$$\text{Minimize} \quad Q(\alpha) = \left| \Omega_1 + \Omega_2 - \Omega_3 \right|. \tag{8}$$
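One simple way to solve (8) numerically is a grid search over candidate thresholds. The sketch below is an assumption about how this could be done, not a procedure taken from [17]: the names `balance` and `optimal_alpha`, the grid of 50 candidates, and the upper limit 0.49 (chosen so that the shadow region is never empty by construction) are all illustrative.

```python
import numpy as np

def balance(u, alpha):
    """Q(alpha) = |Omega1 + Omega2 - Omega3| from Eqs. (7) and (8)."""
    u = np.asarray(u, dtype=float)
    omega1 = u[u <= alpha].sum()                  # grades reduced to 0
    omega2 = (1.0 - u[u >= 1.0 - alpha]).sum()    # grades elevated to 1
    omega3 = np.count_nonzero((u > alpha) & (u < 1.0 - alpha))  # shadow size
    return abs(omega1 + omega2 - omega3)

def optimal_alpha(u, grid=None):
    """Grid search for the threshold in [0, 0.5) that minimizes Q(alpha)."""
    if grid is None:
        grid = np.linspace(0.0, 0.49, 50)  # hypothetical candidate thresholds
    scores = [balance(u, a) for a in grid]
    return float(grid[int(np.argmin(scores))])
```

A finer grid, or a search restricted to the membership values that actually occur, would trade run time for accuracy.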

3.3. Applying Shadowed Sets to GIFP-FCM

As we saw, the cluster centers are modified in each iteration of the clustering algorithm, and all data points take part in the computation of each cluster center, which increases the time complexity of the algorithm. To decrease the run time and improve the efficiency, we use shadowed sets to remove unsuitable data, outliers, and data on cluster borders before determining the cluster centers. We denote the partition matrix by $U = (u_{ij})_{c \times n}$; each row of this matrix describes a cluster. We compute an optimal threshold $\alpha_i$ ($i = 1, 2, \ldots, c$) for each row and can then remove the outliers. Each $\alpha_i$ is obtained by solving the following optimization problem:

$$\text{Minimize} \quad Q(\alpha_i) = \left| \Omega_{i1} + \Omega_{i2} - \Omega_{i3} \right|, \tag{9}$$

where

$$\Omega_{i1} = \sum_{j : u_{ij} \le \alpha_i} u_{ij}, \qquad \Omega_{i2} = \sum_{j : u_{ij} \ge 1 - \alpha_i} \left( 1 - u_{ij} \right), \qquad \Omega_{i3} = \operatorname{card}\left\{ u_{ij} \mid \alpha_i < u_{ij} < 1 - \alpha_i \right\}, \qquad j = 1, 2, \ldots, n. \tag{10}$$

3.4. SGIFP-FCM Algorithm

We can now summarize the proposed algorithm (SGIFP-FCM) in the following steps.

Step 1. Set the number of clusters $c$ (between 1 and $n$) and choose proper values for the stopping threshold $\varepsilon$, the parameter $\alpha'$ in [0, 1) that determines the rate of rewarding, the maximum number of iterations, the fuzziness index $m$, and the initial values of $u_{ij}$.

Step 2. Compute the optimal threshold $\alpha_i$ for each row of the partition matrix using (9). A data point with $u_{ij} < \alpha_i$ plays no role in determining the center of cluster $i$.

Step 3. Compute the cluster centers $v_i^{(l+1)}$ using

$$v_i^{(l+1)} = \frac{\sum_{j : u_{ij} \ge \alpha_i} u_{ij}^{m}\, x_j}{\sum_{j : u_{ij} \ge \alpha_i} u_{ij}^{m}}, \quad i = 1, 2, \ldots, c. \tag{11}$$

Step 4. Compute the membership functions $u_{ij}^{(l+1)}$ using

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{d^{2}(x_j, v_i) - \alpha' \min_{1 \le a \le c} d^{2}(x_j, v_a)}{d^{2}(x_j, v_k) - \alpha' \min_{1 \le a \le c} d^{2}(x_j, v_a)} \right)^{1/(m-1)}}. \tag{12}$$

Step 5. If the terminating condition is satisfied, stop; otherwise, increment the iteration counter and go back to Step 2.
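Steps 1 to 5 can be sketched as follows. This is a minimal illustration under several assumptions rather than the implementation evaluated in Section 4: the threshold of (9) is found by a grid search over hypothetical candidates in [0, 0.49], the balance uses the minus sign of (8), the initialization is random, the stopping test uses the maximum absolute membership change, and a row that would lose all its points keeps them all. The names `row_threshold`, `sgifp_fcm`, and `GRID` are illustrative.

```python
import numpy as np

GRID = np.linspace(0.0, 0.49, 50)  # hypothetical candidate thresholds

def row_threshold(u):
    """Threshold minimizing |Omega_i1 + Omega_i2 - Omega_i3| over GRID."""
    def q(a):
        low = u[u <= a].sum()
        high = (1.0 - u[u >= 1.0 - a]).sum()
        shadow = np.count_nonzero((u > a) & (u < 1.0 - a))
        return abs(low + high - shadow)
    return GRID[int(np.argmin([q(a) for a in GRID]))]

def sgifp_fcm(X, c, m=2.0, alpha_prime=0.9, eps=1e-5, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 1: random initial memberships; columns sum to one.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    V = np.zeros((c, X.shape[1]))
    for _ in range(max_iter):
        for i in range(c):
            # Step 2: shadowed-set threshold for row i.
            keep = U[i] >= row_threshold(U[i])
            if not keep.any():   # fallback (assumption): keep every point
                keep[:] = True
            # Step 3: center from the retained points only, Eq. (11).
            w = U[i, keep] ** m
            V[i] = (w[:, None] * X[keep]).sum(axis=0) / w.sum()
        # Step 4: membership update of Eq. (12) on squared distances.
        D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        G = D2 - alpha_prime * D2.min(axis=0, keepdims=True)
        G = np.maximum(G, 1e-12)  # guard against division by zero
        inv = G ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Step 5: stop when memberships change by less than eps.
        done = np.abs(U_new - U).max() < eps
        U = U_new
        if done:
            break
    return U, V
```

Because points with $u_{ij} < \alpha_i$ are skipped in Step 3, each center is computed from fewer points than in GIFP-FCM, which is where the reduction in work per iteration comes from.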

4. Results

In the first experiment, we create a random artificial data set with three clusters, shown in Figure 3. Figure 4 shows the results of clustering these data with the three algorithms FCM, GIFP-FCM, and SGIFP-FCM. FCM and SGIFP-FCM give almost the same results and detect the three clusters correctly. In GIFP-FCM, in contrast to SGIFP-FCM, all data points take part in determining the centers, and the data are not clustered exactly. This is because outliers and data on cluster borders affect the centers; SGIFP-FCM removes these points.

In Figure 4, clusters are distinguished by different colors and cluster centers are marked with "+". The parameters $m$ and $\alpha$ are set to 2 and 0.9, respectively.

In the second experiment, we use retinal images of the DRIVE database [18]. We want to detect vessels in these images using clustering methods. For this purpose, we first extract some features for each pixel of the processed image and then feed these feature vectors to the clustering algorithm, which divides the pixels into two clusters, vessels and nonvessels. There are many feature extraction methods; we use the LBP (Local Binary Patterns) method [19]. The feature vector of each pixel is composed of the gray level intensity, the variance, and the LBP value in three local windows around the central pixel [20].

A sample retinal image segmented with the proposed algorithm (SGIFP-FCM) is presented in Figure 5, along with the manually segmented version of the image.

To evaluate the algorithm, we calculated the true positive ratio (TPR) and the false positive ratio (FPR). TPR is the number of pixels in the resulting image that are correctly clustered as vessel (according to the image generated by a human expert), divided by the total number of pixels labeled as vessel in the human-expert image. FPR is the number of pixels in the resulting image that are incorrectly clustered as vessel, divided by the total number of pixels labeled as background in the human-expert image.
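These two ratios can be computed directly from a pair of binary masks. The sketch below assumes that both the clustering result and the expert segmentation are available as boolean arrays of equal shape in which True marks a vessel pixel; the function name `tpr_fpr` is illustrative.

```python
import numpy as np

def tpr_fpr(predicted, reference):
    """Return (TPR, FPR) for boolean vessel masks of equal shape."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = np.count_nonzero(predicted & reference)    # vessel pixels found
    fp = np.count_nonzero(predicted & ~reference)   # background marked as vessel
    tpr = tp / np.count_nonzero(reference)          # over expert vessel pixels
    fpr = fp / np.count_nonzero(~reference)         # over expert background pixels
    return tpr, fpr
```

The function assumes that the reference mask contains at least one vessel pixel and at least one background pixel; otherwise one of the ratios is undefined.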

Table 1 compares the performance of retinal vessel extraction with SGIFP-FCM and GIFP-FCM against the second human observer in terms of TPR, FPR, and clustering time.

The results in Table 1 are averages over the 20 test images of the DRIVE database.

As Table 1 shows, SGIFP-FCM performs better than GIFP-FCM not only in the extracted vessels but also in clustering time.

5. Conclusion

In this paper, we presented a generalized fuzzy c-means clustering method with improved fuzzy partitions and shadowed sets (SGIFP-FCM). As illustrated, the new algorithm improves on previous clustering algorithms such as GIFP-FCM on noisy data sets. In the proposed algorithm, shadowed sets reduce the effects of outliers and noisy data on the determination of the cluster centers. The algorithm was tested in several experiments. Its performance was evaluated on an image segmentation application (retinal images), and because these images contain a great deal of noise, it gave better results than the GIFP-FCM algorithm.