Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 329753, 8 pages

http://dx.doi.org/10.1155/2015/329753

## Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China

Received 14 November 2014; Revised 4 February 2015; Accepted 7 February 2015

Academic Editor: Gerhard-Wilhelm Weber

Copyright © 2015 Huibin Lu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

In multiview sample classification with different distributions, the training and testing samples come from different domains. To improve classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the invariant information serves as the bridge for knowledge transfer from the source domain to the target domain. Second, an L1-Graph is constructed on the basis of sparse representation, so that nearest neighbors are found adaptively and the geometric structure of the samples is preserved. Finally, the two complementary objective functions are integrated into a unified optimization problem that is solved with an iterative algorithm, yielding the predicted labels of the testing samples. Comparative experiments are conducted on the USPS-Binary digit database, the Three-Domain Object Benchmark database, and the ALOI database; the experimental results verify the effectiveness of the proposed algorithm, which improves recognition accuracy and ensures robustness.

#### 1. Introduction

Traditional machine learning algorithms generally assume that training and testing samples are drawn from the same feature space with the same distribution. When the features or the distribution change, most statistical models must be rebuilt from a new collection of training samples. In practice, however, training and testing samples are often collected at different times and in different environments, so their distributions may differ. Transfer learning addresses this problem: it aims to transfer knowledge from a labeled source domain to an unlabeled target domain, with applications in, for example, natural language processing [1], sentiment analysis [2], and image classification [3]. Using transfer learning to improve recognition accuracy and robustness under differing distributions has attracted wide attention from scholars at home and abroad.

In recent years, given that labeled training samples and unlabeled testing samples come from different domains, more and more researchers have turned to transfer learning [4, 5]. Its basic idea is that although the data distributions of the source and target domains differ, related domains can share part of the same knowledge structure, which can serve as a bridge for transferring knowledge from the source domain to the target domain. Existing methods usually search for these common structures by optimizing predefined objective functions, including maximizing the empirical likelihood and preserving the geometric structure. From the perspective of empirical likelihood, Dai et al. proposed a classification-based coclustering model [6]: the source-domain data imposes constraints on the word set to supply the classification structure and partial class information, and coclustering acts as the bridge through which classification structure and information are transferred from the source domain to the target domain. The shortcoming of this method is that it considers only identical concepts across documents, so Zhuang et al. studied the relationship between the word set and document classes and, via matrix trifactorization, used certain invariant factors as the bridge for transferring information from the source domain to the target domain [7]; this method, however, considers only similar concepts in the text. With a similar idea, Wang et al. applied it to network data classification across different domains [8]. To address the shortcomings of the above methods, a joint model of similar and identical concepts was built [9], learning the marginal and conditional distributions simultaneously. Then, taking the concept unique to each domain's documents into consideration, Zhuang et al.
learned the shared and the domain-specific concepts in all domains simultaneously by matrix trifactorization [10], which is more flexible in fitting the data and thus achieves a better recognition rate. From the geometric point of view, if two samples in a domain are close in the intrinsic geometric structure of the data distribution, their labels should also be close [11]. To retain this intrinsic structure, Ling et al. explored the consistency between the supervision of the source domain and the intrinsic structure of the target domain through spectral learning [12]. Pan et al. put forward transfer component analysis [13], which finds a set of common transfer components for the two domains; projecting the samples into the resulting subspace reduces, to different degrees, the distribution differences between domains. With the same idea, Wang and Mahadevan projected the different domains into a new latent space, simultaneously matching corresponding samples and preserving the geometric structure of each domain [14]. Combining the two views, graph coregularized transfer learning (GTL) [15] was put forward, which retains the geometric structure while maximizing the empirical likelihood.

Building on graph coregularized transfer learning, and considering that the L1-Graph [16, 17] adapts better and is more stable than a nearest-neighbor graph, we put forward a multiview classification algorithm based on L1-Graph domain adaptation learning, following Long et al. [15]. The idea is first to construct a framework of nonnegative matrix trifactorization based on transfer learning, in which the invariant information serves as the bridge for knowledge transfer from the source domain to the target domain; next, to construct an L1-Graph on the basis of sparse representation, so that nearest neighbors are found adaptively and the geometric structure of the samples is preserved; finally, to solve the unified optimization problem over the joint objective with an iterative algorithm and thus complete the classification of the testing samples.

#### 2. The Multidomain and Multiview Sample Classification Based on L1-Graph Domain Adaptation Learning

##### 2.1. Description of L1-Graph Domain Adaptation Learning

L1-Graph domain adaptation learning can be applied to any number of domains, but, for purposes of explanation, two domains are considered here: the source domain and the target domain. Each domain has a characteristic (feature-sample) matrix, and, in order to find the common structure, this matrix is decomposed into three nonnegative matrices by trifactorization. In each domain, the feature samples are classified by maximizing the empirical likelihood of the trifactorization; the middle factor represents the relationship between the feature clusters and the sample classes and, because it is stable across domains, can be regarded as the bridge of knowledge transfer. In addition, sparse representation is used to construct two graphs that capture the geometric information of the feature space and the sample space of each domain, respectively. The basic idea of graph-regularized domain adaptation learning is presented in [15]. In general, similar features carry the same meaning and, likewise, similar samples share the same label. The two graphs therefore serve as a joint regularization function, so that the trifactorization model predicts the sample labels while retaining the intrinsic geometric structure; the geometric information is effectively integrated into the clustering process, ensuring that the common structural information effectively promotes transfer learning.

##### 2.2. Model of Matrix Trifactorization

Assume a source domain and a target domain that share the same feature space and label space, with the same number of features and the same number of categories. Each domain is represented by a feature-sample matrix whose columns are the samples of that domain. The label matrix of the source domain indicates, for each source sample, the category to which it belongs: the entry for that category is 1 and all other entries are 0.

The same structural information exists in every associated domain. To extract it, nonnegative matrix trifactorization is applied to the feature-sample matrix of each domain, and the optimization is formulated as the minimization of the reconstruction error in the Frobenius norm. The columns of the first factor represent semantic concepts, namely feature clusters; the columns of the third factor represent sample classes; these two factors give the clustering results for the features and the samples, respectively. The middle factor represents the relationship between the feature clusters and the sample classes, and it remains more stable across domains than the other two factors. Therefore, assuming that this middle factor is shared by every domain, the collective matrix trifactorization couples the per-domain objectives through it.
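The two optimization problems referred to above were lost in extraction; they can be sketched in standard notation, which is assumed here rather than taken from the original: $X_r \in \mathbb{R}_+^{m \times n_r}$ is the feature-sample matrix of domain $r$, $U_r \in \mathbb{R}_+^{m \times k}$ holds the feature clusters, $V_r \in \mathbb{R}_+^{n_r \times c}$ the sample classes, and $H \in \mathbb{R}_+^{k \times c}$ the association between them.

```latex
% Per-domain nonnegative matrix trifactorization (notation assumed):
\min_{U_r,\, H,\, V_r \ge 0} \;
  \bigl\| X_r - U_r H V_r^{\mathsf{T}} \bigr\|_F^2 .

% Collective trifactorization over R domains: the middle factor H is
% shared, coupling the otherwise independent per-domain objectives.
\min_{\{U_r\},\, H,\, \{V_r\} \ge 0} \;
  \sum_{r=1}^{R} \bigl\| X_r - U_r H V_r^{\mathsf{T}} \bigr\|_F^2 .
```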

The common structural information, as the stable bridge of knowledge transfer, can carry the supervision of the source domain: the sample-class factor of the source domain is fixed to its known label matrix. Through this bridge, the label knowledge of the source domain is transferred to the samples of the target domain, and the process corresponds to maximizing the multidomain empirical likelihood.

##### 2.3. Sparse Representation of L1-Graph Structure

From the geometric point of view, the data points can be regarded as sampled from a distribution supported on a low dimensional manifold embedded in a high dimensional data space. To keep the essence of the data distribution unchanged and to preserve the geometric structure during the transfer, the following assumption is made: if two samples in a domain are close in the intrinsic geometric structure of the data distribution, their labels should also be close. A model of the geometric structure of the sample space can therefore be formed. Traditional graph construction methods often depend heavily on the choice of parameters and have difficulty reflecting the complexity of the data distribution. According to the theory of sparse representation, any sample can be linearly reconstructed from the other samples (allowing some reconstruction error), and the sparse reconstruction coefficients of a sample can be obtained by solving an L1-norm optimization problem. The reconstruction coefficients, used as the weights between pairs of samples, adjust the relationships among samples adaptively, so that the resulting sparse graph, which encodes the local relationships between samples, contains more useful structural information. By the duality between features and samples, the features are likewise sampled from a distribution supported on a low dimensional manifold embedded in a high dimensional space: if two features in a domain are close in the geometric structure of the data distribution, their feature clusters should also be close. Therefore, a sparse graph based on the principle of sparse representation can be constructed for the feature space to retain the feature geometric structure in each domain, just as for the sample space.
The sparse graph of the sample space is constructed in the following steps; the sparse graph of the feature space is built in the same way.

*(1) Input*. The samples of the domain; each sample is normalized to unit norm.

*(2) Solve the Reconstruction Coefficients*. The sparse reconstruction coefficients of each sample in each domain are obtained by solving an L1-norm minimization problem, in which the overcomplete dictionary consists of all the other samples and the solution is a column vector of reconstruction coefficients encoding the relationship between the sample and the other samples.
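The minimization problem itself was lost in extraction; a standard reconstruction, in assumed notation, is the following, where $x_i^{(r)}$ is the $i$-th normalized sample of domain $r$, $D_i^{(r)}$ is the dictionary of all the other samples, and $\alpha_i$ is the coefficient vector.

```latex
% Exact-reconstruction form of the L1 minimization (notation assumed):
\min_{\alpha_i} \; \|\alpha_i\|_1
  \quad \text{s.t.} \quad x_i^{(r)} = D_i^{(r)} \alpha_i ,

% or, allowing reconstruction error, the Lasso-type relaxation:
\min_{\alpha_i} \; \tfrac{1}{2}\bigl\| x_i^{(r)} - D_i^{(r)} \alpha_i \bigr\|_2^2
  + \lambda \|\alpha_i\|_1 .
```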

*(3) Set the Edge Weights of the L1-Graph*. The L1-Graph consists of the set of all sample nodes in the domain together with a weight matrix, that is, the similarity matrix: the weight between a sample and itself is zero, and the weight between two distinct samples is given by the corresponding reconstruction coefficient. The number of nearest neighbors of each sample is thus determined by solving the L1 minimization problem rather than by manually set parameters.

*(4) Symmetrize the Similarity Matrix*. The weight matrix is symmetrized, for example by averaging it with its transpose.

Similarly, the weight matrix of the feature space can also be obtained.
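Steps (1)-(4) can be sketched in numpy. This is a minimal illustration, not the paper's implementation: the L1 solver is a plain coordinate-descent Lasso, the absolute values of the coefficients are used as edge weights (one common convention), and the sparsity parameter `lam` is illustrative.

```python
import numpy as np

def lasso_cd(D, x, lam=0.05, n_iter=200):
    """Coordinate descent for min_a 0.5*||x - D a||^2 + lam*||a||_1.
    D: (d, m) dictionary of the other samples; x: (d,) target sample."""
    m = D.shape[1]
    a = np.zeros(m)
    col_sq = (D ** 2).sum(axis=0)          # per-column squared norms
    r = x - D @ a                          # current residual
    for _ in range(n_iter):
        for j in range(m):
            if col_sq[j] == 0:
                continue
            r += D[:, j] * a[j]            # remove atom j's contribution
            rho = D[:, j] @ r
            # soft-thresholding step of coordinate descent
            a[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= D[:, j] * a[j]
    return a

def l1_graph(X, lam=0.05):
    """Build the symmetric L1-graph weight matrix for the rows of X."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # step (1): normalize
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]          # dictionary: other samples
        a = lasso_cd(X[idx].T, X[i], lam)              # step (2): L1 coefficients
        W[i, idx] = np.abs(a)                          # step (3): edge weights
    return 0.5 * (W + W.T)                             # step (4): symmetrize
```

Larger `lam` yields sparser graphs, so each sample keeps fewer, more reliable neighbors; this is the self-adaptive neighbor selection the text describes.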

To preserve the geometric structure of the sample space in each domain, an L1-Graph regularization function over the sample factors is minimized; to further preserve the geometric structure of the feature space, the corresponding L1-Graph regularization function over the feature factors is minimized at the same time.
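The formulas of the two regularizers were lost; the standard graph-Laplacian form, in assumed notation, is the following, where $W^{(r)}$ and $\tilde{W}^{(r)}$ are the sample and feature weight matrices of domain $r$, $V_r$ and $U_r$ the sample and feature factors, and $D^{(r)}$ the degree matrix.

```latex
% Sample-space L1-Graph regularizer of domain r (notation assumed):
R_V^{(r)} = \tfrac{1}{2} \sum_{i,j} W^{(r)}_{ij}
            \bigl\| (V_r)_{i\cdot} - (V_r)_{j\cdot} \bigr\|_2^2
          = \operatorname{tr}\!\bigl( V_r^{\mathsf{T}} L^{(r)} V_r \bigr),
  \qquad L^{(r)} = D^{(r)} - W^{(r)} .

% Feature-space regularizer, built from the feature graph the same way:
R_U^{(r)} = \operatorname{tr}\!\bigl( U_r^{\mathsf{T}} \tilde{L}^{(r)} U_r \bigr),
  \qquad \tilde{L}^{(r)} = \tilde{D}^{(r)} - \tilde{W}^{(r)} .
```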

##### 2.4. Joint Optimization

Evidently, (4) and (5) state that the geometric structures of the samples and of the features can be retained through the L1-Graph regularization functions. Therefore, the two terms, as a combined L1-Graph regularization function, can be integrated into (2), which defines the optimization problem of L1-Graph coregularized collective matrix trifactorization (L1-GCMF). Two regularization parameters weight the sample-graph and feature-graph terms, and the optimization problem is better posed when the norm of each column of the feature and sample factors is constrained. From the optimized factors, the label of any sample in the target domain is easily inferred as the class with the largest entry in the corresponding row of the target sample factor.
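Under the same assumed notation as above, the joint objective and the label-inference rule can be sketched as follows; $\lambda$ and $\gamma$ stand for the two regularization parameters, and $Y_r$ for the known label matrix of a labeled (source) domain.

```latex
% L1-GCMF: collective trifactorization + combined L1-Graph regularizers
\min_{\{U_r\},\, H,\, \{V_r\} \ge 0} \;
  \sum_{r=1}^{R} \Bigl[
      \bigl\| X_r - U_r H V_r^{\mathsf{T}} \bigr\|_F^2
    + \lambda \operatorname{tr}\!\bigl( V_r^{\mathsf{T}} L^{(r)} V_r \bigr)
    + \gamma  \operatorname{tr}\!\bigl( U_r^{\mathsf{T}} \tilde{L}^{(r)} U_r \bigr)
  \Bigr],
  \quad \text{with } V_r = Y_r \text{ fixed in each labeled source domain.}

% Label inference for target sample i:
y_i = \arg\max_{1 \le j \le c} \; (V_t)_{ij} .
```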

During the optimization of (6), the common structural information, which is found by simultaneously maximizing the empirical likelihood and preserving the geometric structure, becomes smoother over the course of transfer learning. L1-GCMF can be extended to handle multidomain problems and to learn the structure common to the whole collection. The solution of (6) can be derived from constrained optimization theory: the updating rules are obtained by fixing all but one variable and optimizing that variable, and the process is repeated until convergence. To handle the nonnegativity and the norm constraints, a Lagrange function is constructed with nonnegative Lagrange multipliers for the constraints (the norm constraint involves an all-ones vector). Applying the complementary slackness conditions of Karush-Kuhn-Tucker (KKT) yields the stationarity conditions of the factors.

According to the KKT conditions, the updating scheme is obtained. The explicit computation of the Lagrange multipliers can be avoided by an iterative normalization technique: in each iteration, each relevant column is normalized to unit norm, which makes two terms of the update equal; these terms can then be omitted from the updating scheme without affecting convergence, giving the final updating rules.

Similarly, the remaining updating rules can be obtained, in which the operations are the element-wise product, element-wise division, and element-wise square root.
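To make the shape of such an update concrete, here is a minimal numpy sketch of one multiplicative update of the sample factor, assuming the factorization X ≈ U H Vᵀ and a graph regularizer tr(Vᵀ(D−W)V); the function name, the weighting `lam`, and the `eps` guard are illustrative choices, not taken from the paper.

```python
import numpy as np

def update_V(X, U, H, V, W, lam=0.1, eps=1e-12):
    """One multiplicative update of the sample factor V for X ≈ U H V^T
    with an L1-graph regularizer tr(V^T (D - W) V).
    Mirrors the update pattern in the text: element-wise product,
    element-wise division, and element-wise square root."""
    Dg = np.diag(W.sum(axis=1))             # degree matrix of the graph
    num = X.T @ U @ H + lam * (W @ V)       # terms pulling V up
    den = V @ (H.T @ U.T @ U @ H) + lam * (Dg @ V) + eps
    return V * np.sqrt(num / den)           # multiply, divide, sqrt
```

Because every factor stays nonnegative and the update only rescales entries, iterating this rule (together with the analogous rules for the other factors) keeps the constraints satisfied at every step.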

##### 2.5. Description of a Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

See Algorithm 1.