Computational Intelligence and Neuroscience

Volume 2017 (2017), Article ID 3961718, 11 pages

https://doi.org/10.1155/2017/3961718

## Consensus Kernel k-Means Clustering for Incomplete Multiview Data

^{1}College of Computer, National University of Defense Technology, Changsha, China

^{2}State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China

Correspondence should be addressed to Yongkai Ye

Received 28 April 2017; Revised 28 August 2017; Accepted 6 September 2017; Published 22 October 2017

Academic Editor: Ezequiel López-Rubio

Copyright © 2017 Yongkai Ye et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Multiview clustering aims to improve clustering performance through the optimal integration of information from multiple views. Though they demonstrate promising performance in various applications, existing multiview clustering algorithms cannot effectively handle incomplete views. Recently, a pioneering work was proposed that handles this issue by integrating multiview clustering and imputation into a unified learning framework. While its framework is elegant, we observe that it overlooks the consistency between views, which leads to a reduction in clustering performance. To address this issue, we propose a new unified learning method for incomplete multiview clustering, which simultaneously imputes the incomplete views and learns a consistent clustering result with explicit modeling of between-view consistency. More specifically, the similarity between each view’s clustering result and the consistent clustering result is measured. The consistency between views is then modeled as the sum of these similarities. The incomplete views are imputed so as to achieve an optimal clustering result in each view while maintaining between-view consistency. Extensive comparisons with state-of-the-art methods on both synthetic and real-world incomplete multiview datasets validate the superiority of the proposed method.

#### 1. Introduction

The term “multiview data” refers to data that have different sources or modalities. Each source or modality is considered one “view,” and different views have different physical meanings and statistical properties. For example, a web page can be described by the pictures and the text it contains, while a news story may be reported by different sites, each with its own viewpoint. A significant number of studies have investigated learning from multiple views [1, 2]. Multiview clustering, one branch of multiview learning, aims at grouping samples by utilizing information from different views. Extensive research has been conducted on multiview clustering; these methods can be roughly categorized into early fusion approaches and late fusion approaches. Early fusion approaches fuse the multiview information at an early stage of the process and then perform clustering [3–9], while late fusion approaches group data by fusing previously clustered results from separate views [10, 11].

However, in real-world applications, some views may be incomplete for a variety of reasons, which hurts the clustering performance of multiview data. For example, in the context of patient grouping, the data from different tests can serve as different views. If a test is too expensive, some patients may be unable to afford it, which leads to an incomplete view for this particular test. Similarly, in webpage clustering, image data and text data are two modalities that represent a page; however, some pages may not contain any images, which makes the data for the image view incomplete.

Existing studies of incomplete multiview clustering can be roughly divided into two categories: subspace methods and imputation methods. The method outlined in [12], which was the first subspace method for incomplete multiview clustering, learns the common subspace of two views via nonnegative matrix factorization. Several variants of this method were proposed following its introduction. In [13], feature learning is integrated into the subspace learning process, and the assumption that the data is nonnegative is no longer required. The method proposed in [14] learns a latent global graph representation and the subspace simultaneously by adding a novel Laplacian graph regularization term. The other important category of methods for incomplete multiview clustering is imputation methods, which handle incomplete views by filling in the missing parts. The method proposed in [15] fills the kernel of an incomplete view according to the Laplacian regularization of the other, complete view. Subsequently, the method proposed in [16] tackles the situation where both views are incomplete by alternately updating one view according to the other. In [17], the incomplete views are imputed via low-rank decomposition. As different views are assumed to be generated from a shared subspace, the data matrices of different views can be decomposed using a common factor. Most of these imputation methods simply execute a conventional multiview clustering algorithm after filling the incomplete views. Most recently, a method was proposed in [18] whereby the imputation is not separated from the multiview clustering process. More specifically, the imputation and the multiple kernel clustering are integrated into a unified procedure for better clustering performance.
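To make the two-stage strategy concrete, the following sketch imputes each incomplete view with the feature-wise mean of its observed instances and then clusters the concatenated views. The toy data, the mean-imputation rule, and the plain k-means step are illustrative stand-ins, not any specific method from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy views of the same 6 samples; two instances of view 2 are missing.
view1 = rng.normal(size=(6, 4))
view2 = rng.normal(size=(6, 3))
view2[1] = np.nan            # sample 1's instance is missing in view 2
view2[4] = np.nan            # sample 4's instance is missing in view 2

def mean_impute(X):
    """Stage 1: fill each missing instance (row) with the feature-wise
    mean of the observed instances of the same view."""
    X = X.copy()
    missing = np.isnan(X).any(axis=1)
    X[missing] = X[~missing].mean(axis=0)
    return X

def kmeans(X, k, iters=50):
    """Stage 2: a plain (single-view) k-means on the fused features."""
    centers = X[:k].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Impute each view independently, then cluster the concatenation.
X = np.hstack([mean_impute(view1), mean_impute(view2)])
labels = kmeans(X, k=2)
```

Because the imputation here is blind to the clustering objective, the filled values need not help the downstream grouping, which is exactly the weakness that unified approaches such as [18] address.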

Integrating imputation and multiview clustering into a unified learning process makes the imputation better serve the clustering objective. This advantage helps the method in [18] outperform other methods that perform imputation and clustering separately. However, the disadvantage of the method in [18] is that the multiview clustering solution it adopts overlooks the consistency between views, which may reduce the final clustering performance. In [18], multiview clustering is achieved by learning a linear combination of kernels that reaches the optimal kernel k-means clustering result. Consequently, the linear combination that builds the best kernel for clustering is learned without considering the relationships between views. Similarly, the imputation is guided only by the clustering objective, and the consistency between views is neglected. However, the consistency between views is one of the inherent properties of multiview data [1]; if this critical property is not considered, the learning of the linear combination of kernels and the imputation in [18] may lead to poor clustering performance. Previous research into multiview clustering has shown that considering the consistency between views helps to boost multiview clustering performance [3]. In this study, we build on the advances made in [18] while also considering the consistency between views in order to further improve clustering performance. Therefore, we propose a novel incomplete multiview clustering method that simultaneously fills the incomplete kernels from incomplete views and learns a consistent clustering result. To model the between-view consistency, the similarity between the consistent clustering result and the clustering result of each view is calculated. The consistency between views is measured by the sum of these similarities.
The missing parts of the kernels and the consistent clustering result are learned so as to achieve the optimal clustering result in each view while maintaining consistency between views. Here, the learning process considers both the data structures within views and the consistent relations between views, which benefits multiview clustering performance. The proposed objective function is then solved by alternately optimizing partial variables. Each subproblem that optimizes the corresponding partial variables either can be solved by means of eigenvector decomposition or has a closed-form solution. To evaluate the performance of the proposed method, we compare it with state-of-the-art methods on three synthetic incomplete multiview datasets and one real-world incomplete multiview dataset. Empirical results validate the superiority of the proposed method for incomplete multiview clustering.

The main contributions of this paper can be summarized as follows:

1. We propose a novel incomplete multiview clustering method, which simultaneously learns a consistent clustering decision and fills the incomplete kernels from incomplete views with explicit modeling of between-view consistency.
2. We design an alternating optimization algorithm to solve the proposed method’s optimization problem. Here, the optimization problem is divided into three subproblems, each of which either can be solved by means of eigenvector decomposition or has a closed-form solution.
3. We also provide a thorough convergence analysis of the alternating optimization algorithm, including a theoretical proof and empirical validations.

#### 2. The Proposed Method

To model the consistency, we propose that a consistent clustering decision be learned that is similar to each view’s kernel k-means clustering result. To handle the incomplete views, we simultaneously fill the incomplete views and learn the consistent clustering decision. In the following subsections, we first introduce the notation used in the problem formulation, after which kernel k-means is briefly reviewed. We then outline how a consistent decision can be found. Next, we introduce the objective function of our method to explain how the kernel filling and decision learning processes are integrated. Finally, we analyze the convergence of the proposed algorithm.
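As background for the kernel k-means component, the following is a minimal sketch of its standard spectral relaxation (maximize Tr(HᵀKH) subject to HᵀH = I, solved by the top-k eigenvectors of the kernel matrix); the RBF kernel and the toy data are illustrative assumptions, not details taken from the method itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two well-separated Gaussian blobs.
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(3.0, 0.1, (5, 2))])

# RBF kernel matrix (bandwidth chosen arbitrarily for illustration).
sq_dists = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)

# Spectral relaxation of kernel k-means: max Tr(H^T K H) s.t. H^T H = I.
# The maximizer is the matrix of top-k eigenvectors of K (Ky Fan theorem),
# and the attained value upper-bounds every discrete clustering assignment.
k = 2
vals, vecs = np.linalg.eigh(K)   # eigenvalues in ascending order
H = vecs[:, -k:]                 # top-k eigenvectors

# In practice, discrete labels are then recovered from the relaxed
# solution, e.g., by running ordinary k-means on the rows of H.
relaxed_objective = np.trace(H.T @ K @ H)
```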

##### 2.1. Notation

Assume that there are n samples and V views in the multiview data. For clarity, a sample’s information in a view is referred to as an instance of the sample in this paper. For incomplete multiview data, a sample’s instance in a view could be missing. M is an n × V zero-one matrix that indicates which instances are missing; when M_{iv} = 0, sample i’s instance in view v is missing. m_v denotes the vth column of M. Because our method is based on kernel k-means, we assume that the input multiview data is kernel data. For each view v, we have an n × n kernel matrix K_v. The details of how kernel data are built can be found in Section 3.1, where the datasets used in this paper are introduced.
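A minimal sketch of this bookkeeping, assuming an n × V zero-one indicator matrix in which 0 marks a missing instance; the symbol names are illustrative, since the rendered math symbols were not preserved in this text.

```python
import numpy as np

n, V = 5, 2                      # number of samples and views (toy sizes)
M = np.ones((n, V), dtype=int)   # 1 = observed, 0 = missing (assumed convention)
M[1, 1] = 0                      # sample 1's instance is missing in view 1
M[4, 1] = 0                      # sample 4's instance is missing in view 1

m = M[:, 1]                          # indicator column for view 1
observed = np.flatnonzero(m == 1)    # samples with a visible instance
missing = np.flatnonzero(m == 0)     # samples whose instance must be imputed
```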

In a view v, some instances may be missing, which leads to an incomplete kernel K_v. To describe the visible and missing parts of the incomplete kernel K_v, we define an operator K_v(a, b), which selects the corresponding rows and columns of K_v according to zero-one vectors a and b (where 1 indicates selected). Moreover, we define m̄_v = 1 − m_v, the complement of m_v. Thus K_v(m_v, m_v) is the visible part of the kernel matrix, while K_v(m_v, m̄_v), K_v(m̄_v, m_v), and K_v(m̄_v, m̄_v) are the missing parts. Figure 1 shows a simple example of this notation with three samples.
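Such a selection operator can be realized with boolean indexing on the kernel matrix; the function and variable names below are illustrative reconstructions, since the rendered symbols were not preserved in this text.

```python
import numpy as np

def select(K, a, b):
    """Submatrix of K taking rows where a == 1 and columns where b == 1."""
    return K[np.ix_(a.astype(bool), b.astype(bool))]

# Three samples; sample 2's instance is missing in this view.
m = np.array([1, 1, 0])          # 1 = observed
m_bar = 1 - m                    # complement: 1 = missing

K = np.arange(9, dtype=float).reshape(3, 3)
K = (K + K.T) / 2                # symmetrize, like a kernel matrix

visible = select(K, m, m)        # observed-observed block (fully known)
# The three blocks involving at least one missing instance:
row_missing = select(K, m_bar, m)
col_missing = select(K, m, m_bar)
both_missing = select(K, m_bar, m_bar)
```

By symmetry of the kernel matrix, the row-missing and column-missing blocks are transposes of each other, so imputation effectively has to recover only two of the three missing blocks.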