Special Issue: Deep and Transfer Learning Approaches for Complex Data Analysis in the Industry 4.0 Era
Entropy-Based Multiview Data Clustering Analysis in the Era of Industry 4.0
Abstract
In the era of Industry 4.0, single-view clustering algorithms struggle with complex data such as multiview data. Multiview clustering, an extension of traditional single-view clustering, has therefore become increasingly popular in recent years. Although multiview clustering algorithms are more effective than single-view ones, almost all current multiview clustering algorithms share two weaknesses: (1) the multiview collaborative clustering strategy lacks theoretical support, and (2) every view is given the same weight. To address these problems, we use the Havrda-Charvat entropy and a fuzzy index to construct a new collaborative multiview fuzzy c-means clustering algorithm with fuzzy weighting, called Co-MVFCM. The experimental results show that Co-MVFCM achieves the best clustering performance among all the compared clustering algorithms.
1. Introduction
In the era of Industry 4.0, as data collection methods become more diverse, the complexity of data also increases. For example, a driverless car collects environmental data through a variety of sensors and analyzes and processes it from multiple views while driving. In unsupervised learning, clustering is commonly used to analyze complex data. However, traditional clustering methods, such as K-means [1, 2], fuzzy C-means (FCM) [3, 4], maximum entropy clustering (MEC) [5, 6], and possibilistic C-means (PCM) [7, 8], are all designed for single-view data analysis. When single-view algorithms [9–11] face a multiview clustering task, the common practice is to treat each view as an independent clustering task. After every view has been clustered, an integrated (ensemble) learning mechanism selects an appropriate strategy to combine the per-view clustering results into the final clustering result. However, if the clustering result of one view deviates markedly, or if the results differ greatly across views, this strategy of artificially separating the views for independent analysis may yield an inaccurate global partition or unstable algorithm performance.
In many real applications, multiview representations of data are becoming more and more popular, especially in the field of medicine [14, 15]. For example, people's living standards and economic situation have improved since entering the twentieth century, but the incidence rate of cancer has increased by nearly 50% compared with that of the 1880s due to environmental pollution, food safety, and working pressure. Meanwhile, with the continuous improvement of medical care, various detection methods have been proposed and applied, such as laboratory examination (routine examination, serological examination, gene or gene product examination, etc.), imaging and endoscopy (X-ray examination, B-ultrasound examination, CT examination, radionuclide imaging, etc.), and cytopathological examination (puncture biopsy, forceps biopsy, section analysis, etc.). These methods examine a suspected patient from different views, which is a typical multiview data representation problem. Such examples show that multiview clustering algorithms are needed to observe and mine the essence of data from its diverse descriptions and thereby obtain a clustering result that satisfies every representation (view) simultaneously. Introducing multiview technology into traditional clustering analysis, so that collaborative learning takes place during clustering, is considered an effective solution. In recent years, several effective multiview clustering methods have been proposed along these lines. Yamanishi et al. proposed Co-EM, a collaborative clustering algorithm based on the EM algorithm that addresses multiview problems from the perspective of probability, and tested its effectiveness on text-like samples.
Inspired by the FCM algorithm, Pedrycz controlled the fuzzy partition between the various views, constructed a divisional cooperative control function, and finally obtained the Co-FC algorithm. The algorithm has shown certain advantages on various datasets. Earlier related multiview clustering studies can be found in [18, 19].
As mentioned above, much research has begun to focus on the construction of multiview clustering algorithms. Summarizing the current research on multiview clustering, we find the following: (1) early multiview clustering algorithms usually preprocess the data itself, and the most direct method is to merge multiview data into single-view data through feature fusion and then cluster the fused data; (2) most multiview clustering algorithms proposed in recent years use a collaborative learning strategy, which enhances the performance on each view during clustering; (3) most multiview clustering algorithms with collaborative learning ability assign the same weight to every view. As an example, consider the classic Co-FKM algorithm, one of the representative multiview clustering algorithms of recent years. It designs a very effective multiview collaborative learning strategy in which the data of different views learn collaboratively through their memberships during clustering. However, the algorithm has a serious disadvantage: it treats every view equally and does not assign different weights to different views. In addition, its collaborative learning strategy lacks a clear physical meaning, so it cannot explain why the strategy contributes to the final clustering performance.
In response to the above-mentioned challenges, this study first constructs a new view space division criterion based on the Havrda-Charvat entropy. The criterion controls the space division results across different views so that they become as consistent as possible, which yields a more stable and more comprehensive global space division. Furthermore, we introduce a fuzzy index and fuzzy weights to adaptively adjust the weight of each view during clustering so that the view with the clearest space division receives a larger weight. Finally, by combining the Havrda-Charvat entropy with the fuzzy index, we propose a new collaborative multiview fuzzy c-means clustering algorithm with fuzzy weighting, called Co-MVFCM. The contributions of this study are summarized as follows:
(1) We construct a new view space division criterion using the Havrda-Charvat entropy. The criterion can be used to control the space division results across different views.
(2) We construct a view weighting mechanism using the fuzzy index. The mechanism can be used to recognize the importance degree of each view.
Overall, our proposed Co-MVFCM algorithm not only has good space division ability but also has the ability to adaptively recognize the best view.
2. Related Work
To handle multiview clustering tasks, Cleuziou et al. proposed the Co-FKM method based on the classical FCM. In Co-FKM, multiview clustering is achieved through a constraint on the fuzzy membership degrees that keeps the partition results of the views as consistent as possible. Here, the Co-FKM method is defined as and , , .
The objective function of Co-FKM can be optimized by introducing Lagrange multipliers. So the fuzzy membership degree and center are obtained as
To obtain a fuzzy division standard with global considerations, the fuzzy membership of each view can be computed by using the geometric mean method . The specific expression is as follows:
From the Co-FKM algorithm, we can draw a general framework for multiview clustering, which is illustrated in Figure 1. Co-FKM incorporates the space division and approximation criteria across different views into the clustering and thereby realizes collaborative learning across views; it achieves better multiview clustering performance than traditional single-view integrated clustering. However, as stated above, it still leaves the view weighting and the interpretability of the collaborative strategy unresolved.
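The geometric-mean fusion of per-view memberships mentioned above can be illustrated with a minimal sketch. It assumes the memberships are stored as a NumPy array of shape (views, samples, clusters); the function names and the toy numbers are hypothetical and are not taken from the original Co-FKM implementation.

```python
import numpy as np

def geometric_mean_consensus(memberships):
    """Fuse per-view fuzzy memberships into one global membership matrix.

    memberships: array of shape (K, N, C) for K views, N samples, C clusters.
    Returns an (N, C) matrix holding the element-wise geometric mean over views.
    """
    u = np.asarray(memberships, dtype=float)
    # Geometric mean computed in log space; the small epsilon guards log(0).
    log_mean = np.mean(np.log(u + 1e-12), axis=0)
    return np.exp(log_mean)

def hard_labels(consensus):
    """Assign each sample to the cluster with the largest fused membership."""
    return np.argmax(consensus, axis=1)

# Two views, three samples, two clusters (made-up values for illustration).
views = np.array([
    [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]],
    [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]],
])
consensus = geometric_mean_consensus(views)
labels = hard_labels(consensus)
```

Note that the two views disagree on the second sample; the geometric mean resolves the disagreement in favor of the view that is more confident.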
3. Collaborative Multiview Fuzzy Clustering (Co-MVFCM) Using Entropy Technology
In view of the two shortcomings of current multiview clustering methods, we introduce the following two new techniques based on entropy theory:
(1) We use the Havrda-Charvat entropy to construct a new approximation criterion for the view space division and to find the most similar components among the views. This improves clustering performance and also gives the approximation criterion a new physical meaning from the perspective of entropy.
(2) We propose an entropy-weighted multiview clustering technique. By weighting each view and effectively controlling the weights, we can find the best view during the iterative optimization and obtain the best fuzzy division result at the same time.
Figure 2 illustrates the new framework of multiview clustering.
3.1. Approximation Criterion of Space Division from Different Views Based on the Havrda-Charvat Entropy
In this study, the Havrda-Charvat entropy of is defined as
It is obvious that if the fuzzy membership degree is considered as a probability matrix, when the constraint holds, is equal to 0. It is very intuitive to show that the uncertainty of belonging to each division in the sample set of this view reaches the minimum value. That is to say, when the objective function reaches its minimum value, the Havrda-Charvat entropy of also reaches its minimum value.
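As a small numerical illustration of the statement above, the sketch below evaluates a Havrda-Charvat (Tsallis-type) entropy on one membership row. It assumes the common normalization H_alpha(p) = (1 - sum_i p_i^alpha) / (alpha - 1); the normalization constant used in this paper may differ, and the function name is hypothetical. Under any such normalization the entropy vanishes for a crisp (0/1) assignment.

```python
import numpy as np

def havrda_charvat_entropy(p, alpha=2.0):
    """Havrda-Charvat entropy of a discrete distribution p.

    Assumed form: H_alpha(p) = (1 - sum(p_i ** alpha)) / (alpha - 1),
    with alpha > 0 and alpha != 1.
    """
    if alpha <= 0 or alpha == 1.0:
        raise ValueError("alpha must be positive and different from 1")
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** alpha)) / (alpha - 1.0)

# A crisp membership row: the sample belongs to exactly one cluster.
h_crisp = havrda_charvat_entropy([1.0, 0.0, 0.0])
# A maximally fuzzy membership row: equal membership in all three clusters.
h_fuzzy = havrda_charvat_entropy([1 / 3, 1 / 3, 1 / 3])
```

The crisp row gives zero entropy and the uniform row gives the largest value for three clusters, which is the sense in which minimizing the entropy reduces the uncertainty of the division.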
Although Equation (6) can ensure that the uncertainty of division can be minimized, it is limited to a single view. In order to expand it into a field of multiple views, in this study, we expand Equation (6) into the following expression form by referring to the relevant strategies used in :
So we have
We observe from Equations (7) and (8) that can be used to effectively regulate the weight relationship between the current view and the membership degree division of other views ( and ). So we can get the weighted average of membership degree and finally make the membership degree division of each view as consistent as possible, so as to obtain the spatial division result with a more global view.
3.2. Multiview Adaptive Weighting Based on Fuzzy Index
In this study, we develop an automatic view weighting strategy using fuzzy index to recognize the best view. Suppose represents the weight of view under the condition that and , then can be considered as the probability distribution which is defined as
Fuzzy index technology is introduced through the above methods to make the objective function achieve the optimal entropy as much as possible, which is also the classical fuzzy c-means clustering principle .
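To make the role of a fuzzy index on the view weights concrete, the following sketch shows the generic closed-form solution of minimizing sum_k w_k^p D_k subject to sum_k w_k = 1, which is w_k proportional to D_k^(-1/(p-1)). Here D_k is a hypothetical per-view clustering cost and p > 1 a hypothetical fuzzy index on the weights; this is the standard result for this kind of weighted objective and is only an assumption about the form of the Co-MVFCM update, not the paper's own formula.

```python
import numpy as np

def view_weights(view_costs, p=2.0):
    """Minimize sum_k w_k**p * D_k subject to sum_k w_k = 1 and w_k >= 0.

    Setting the derivative of the Lagrangian to zero gives
    w_k = D_k**(-1/(p-1)) / sum_l D_l**(-1/(p-1)).
    """
    if p <= 1.0:
        raise ValueError("the fuzzy index p must be greater than 1")
    d = np.maximum(np.asarray(view_costs, dtype=float), 1e-12)  # avoid division by zero
    inv = d ** (-1.0 / (p - 1.0))
    return inv / inv.sum()

# Hypothetical within-view costs: the first view is the most compact.
costs = [2.0, 4.0, 8.0]
w = view_weights(costs, p=2.0)
```

A view with a smaller cost, that is, a clearer space division, receives a larger weight, and a larger p pushes the weights toward the uniform distribution.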
According to the above definitions, we propose our new multiview clustering method. The objective function of Co-MVFCM is and , , .
Obviously, the objective function contains two main parts. The first one is which is derived from Havrda-Charvat and used for collaborative clustering. The essence of the first part is to find out as many similar parts among different perspectives as possible through multiview clustering technology and finally make the spatial division results of different views tend to be the same. The second part is which is derived from fuzzy index. This part can be used to adaptively calculate the weight values of each view, and finally, when the algorithm reaches the optimal level, the optimal view partitioning results can be obtained according to the weight matrix of the views. The parameter can be set to . The parameter can be determined by using grid optimization [13, 21].
To obtain the final result of space division with global characteristics, the integration strategy of global space division mentioned in  is abandoned in this study. We define a new integration strategy to obtain the final space division as
The objective function of the proposed multiview method can be optimized by introducing Lagrange multipliers. In this section, we give three theorems that provide the updating rules for the fuzzy membership degrees, the view weights, and the cluster centers.
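The alternating scheme described above follows the same pattern as classical single-view FCM, where centers and memberships are updated in turn with the other held fixed. The sketch below shows only that classical single-view loop as a point of reference; it is not the Co-MVFCM update, which additionally involves the entropy-based collaboration term and the view weights. The function name, parameters, and toy data are hypothetical.

```python
import numpy as np

def fcm(x, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Classical fuzzy c-means by alternating optimization (single view)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Center update with memberships fixed: weighted mean with weights u**m.
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
        # Membership update with centers fixed, from squared Euclidean distances.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return u, centers

# Two well-separated toy groups (made-up points for illustration).
x = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
u, centers = fcm(x, n_clusters=2)
labels = u.argmax(axis=1)
```

In a multiview setting, a third step that updates the view weights with memberships and centers fixed would be added to each iteration.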
Theorem 1. When and are fixed, the cluster center can be solved by
Proof. By setting , we have . Therefore, Theorem 1 is achieved.
Theorem 2. When the cluster center and view weight matrix are fixed, the fuzzy membership degree matrix can be solved by
Proof. By introducing Lagrange multipliers and considering the constraint , we have the following objective function:
By setting the partial derivative of w.r.t. to 0, i.e., , we have Similarly, with , we have By combining Equations (15) and (16) to remove , we have Therefore, the proof of Theorem 2 is achieved.
Theorem 3. When the center matrix and the fuzzy membership degree matrix are fixed, the weight matrix can be solved by
Remark 4. In this section, a novel multiview clustering method called Co-MVFCM is proposed. The method can adaptively find the most important view, and it can also obtain the best space division by using Equation (11). However, Co-MVFCM has three predefined parameters, which must be set by grid optimization, and this is time-consuming. In future work, we will consider how to reduce the number of these predefined parameters.
4. Experimental Studies
In this study, we use several UCI datasets to evaluate the proposed multiview clustering method. For a fair comparison, Co-FKM, LSSMTC, CombKM, and Coclustering are used as benchmark methods.
We use two common criteria, NMI and RI, to evaluate all clustering methods. They are defined as follows.
(1) Normalized Mutual Information (NMI) [24, 25] where represents the number of samples in the th cluster, represent the matching degree of the th cluster and the th cluster, and represents the size of the dataset.
(2) Rand Index (RI) [24, 25] where represents the number of pairing points that have the same class label and belong to the same class and represents the number of matching points with different class labels and belonging to different classes of data points.
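The two criteria can be sketched in code as follows. The Rand Index counts the pairs of points on which the true labels and the predicted clusters agree, as described above. For NMI, the sketch assumes the widely used form that normalizes the mutual information by the geometric mean of the two entropies; the exact normalization used in the paper is not recoverable from the text, so this choice is an assumption. Function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def rand_index(y_true, y_pred):
    """Fraction of sample pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(y_true)), 2))
    agree = 0
    for i, j in pairs:
        same_class = y_true[i] == y_true[j]
        same_cluster = y_pred[i] == y_pred[j]
        # Agreement: together in both partitions, or separated in both.
        if same_class == same_cluster:
            agree += 1
    return agree / len(pairs)

def nmi(y_true, y_pred):
    """Mutual information normalized by sqrt(H(classes) * H(clusters))."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    # Contingency table: counts of samples shared by class c and cluster k.
    cont = np.array([[np.sum((y_true == c) & (y_pred == k)) for k in clusters]
                     for c in classes], dtype=float)
    n_c, n_k = cont.sum(axis=1), cont.sum(axis=0)
    nz = cont > 0
    mi = np.sum(cont[nz] * np.log(n * cont[nz] / np.outer(n_c, n_k)[nz]))
    h_c = -np.sum(n_c * np.log(n_c / n))
    h_k = -np.sum(n_k * np.log(n_k / n))
    denom = np.sqrt(h_c * h_k)
    # Returns 0.0 when either partition consists of a single group.
    return mi / denom if denom > 0 else 0.0

y_true = [0, 0, 1, 1]
ri_perfect, nmi_perfect = rand_index(y_true, [1, 1, 0, 0]), nmi(y_true, [1, 1, 0, 0])
ri_poor, nmi_poor = rand_index(y_true, [0, 1, 0, 1]), nmi(y_true, [0, 1, 0, 1])
```

Both measures are invariant to a relabeling of the clusters, which is why the first prediction, with the cluster names swapped, still scores as a perfect match.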
Both indexes take values in [0, 1]; the closer the value is to 1, the better the clustering performance. The experiments were run on an Intel Core i7 CPU with 16 GB of memory, and the programming environment was MATLAB 2010.
4.2. Experimental Results
In this section, four real-world datasets from the well-known UCI repository are used to test our algorithm: (1) the Iris dataset, (2) the Multiple Features (MF) dataset, (3) the Image Segmentation (IS) dataset, and (4) the Water Treatment Plant (WTP) dataset. These datasets are used to verify and analyze the performance of the proposed Co-MVFCM algorithm on real multiview clustering tasks. To give a more intuitive impression of the views contained in the four datasets, their composition is shown in Table 1. The experimental results of the algorithm comparison on these four datasets are shown in Table 2.
On the Iris dataset, the proposed Co-MVFCM has the best clustering performance among all the compared algorithms. This result shows that the two proposed multiview collaborative clustering strategies have significant advantages in multiview clustering tasks. For the other three datasets, LSSMTC cannot be applied, because it requires every clustering task to have the same dimensionality, whereas the views of MF, IS, and WTP have different dimensions. On the MF dataset, the results of the remaining algorithms show that the multiview algorithms, Co-FKM and the proposed algorithm, have a clear clustering advantage. However, because no view of the MF data is obviously more separable than the others, the importance of the views is roughly balanced, and the mean NMI and RI of the proposed algorithm are therefore similar to those of Co-FKM. In terms of variance, the proposed method is more stable, so it still has a certain advantage. On the IS dataset, the effect of the proposed method is more obvious: its clustering indexes are significantly better than those of the other algorithms, which further confirms the effectiveness of Co-MVFCM. Finally, the experimental results on the WTP dataset lead to the same conclusion as the above two datasets. In summary, the experiments on real multiview datasets indicate that multiview clustering algorithms are generally superior to the other clustering algorithms on clustering tasks with multiple views, and that Co-MVFCM, which can additionally select among views, performs better than the previous multiview clustering algorithms.
5. Conclusions
In this paper, multiview learning is introduced on the basis of the classical FCM algorithm, and the Havrda-Charvat entropy is used to construct an approximation criterion for the space divisions of different views. The proposed Co-MVFCM method can better find the similarities between views, and it also gives the approximation of the different view space divisions a reasonable physical explanation from the viewpoint of entropy, which makes the resulting overall space partition more meaningful. Another contribution of this paper is to obtain the importance degree of each view. Drawing on fuzzy theory, we propose an adaptive view weighting strategy based on the fuzzy index, incorporate it into the multiview fuzzy clustering objective function, and derive the optimal solution of the new objective function. The importance of each view can then be evaluated from the relationship between the view weights. The obtained importance degrees also provide a new, weighted way of integrating the view space divisions into a global one. Experimental results on four real UCI datasets show that Co-MVFCM has better adaptability to the samples and superior performance compared with previous related algorithms. However, since the algorithm is built on the framework of the classical fuzzy c-means (FCM) algorithm, its effectiveness may be challenged when dealing with higher-dimensional data, which points to the direction of our future research on multiview clustering for high-dimensional data.
Data Availability
The datasets analyzed in this study can be found at http://archive.ics.uci.edu/ml/index.php.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 61772241 and in part by the 2018 Six Talent Peaks Project in Jiangsu Province under Grant XYDXX-127.
References
J. Hämäläinen, T. Kärkkäinen, and T. Rossi, “Improving scalable K-means++,” Algorithms, vol. 14, no. 1, p. 6, 2021.
F. Bu, C. Hu, Q. Zhang, C. Bai, L. T. Yang, and T. Baker, “A cloud-edge-aided incremental high-order possibilistic c-means algorithm for medical data clustering,” IEEE Transactions on Fuzzy Systems, vol. 29, no. 1, pp. 148–155, 2021.
H. Wang, H. Shan, and A. Banerjee, “Bayesian cluster ensembles,” in Proceedings of the Ninth SIAM International Conference on Data Mining, pp. 211–222, John Ascuaga’s Nugget, Sparks, Nevada, USA, 2009.
J. Heer, “Mining the structure of user activity using cluster stability,” in Proc. SIAM Conf. Data Mining, Web Analytics Workshop, Chicago, Illinois, USA, 2002.
B. Long, P. S. Yu, and Z. M. Zhang, “A general model for multiple view unsupervised learning,” in Proc. 8th SIAM Int. Conf. Data Mining, pp. 822–833, Atlanta, GA, 2008.
G. Cleuziou, M. Exbrayat, L. Martin, and J.-H. Sublemontier, “CoFKM: a centralized method for multiple-view clustering,” in Proceedings of the 9th IEEE International Conference on Data Mining (ICDM 2009), pp. 752–757, Miami, Florida, USA, 2009.
Q. Gu and J. Zhou, “Learning the shared subspace for multi-task clustering and transductive transfer classification,” in Proceedings of the Ninth IEEE International Conference on Data Mining, pp. 159–168, Miami Beach, FL, USA, 2009.
Q. Gu and J. Zhou, “Co-clustering on manifolds,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, June 28-July 01, Paris, France, 2009.
R. Hathaway, J. Bezdek, and W. Tucker, “An improved convergence theorem for the fuzzy c-means clustering algorithms,” in Analysis of Fuzzy Information, vol. III, J. Bezdek, Ed., pp. 123–131, CRC Press, Boca Raton, 1987.