Complexity

Volume 2018, Article ID 2743678, 12 pages

https://doi.org/10.1155/2018/2743678

## Graph Sparse Nonnegative Matrix Factorization Algorithm Based on the Inertial Projection Neural Network

^{1}College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
^{2}College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing 401520, China

Correspondence should be addressed to Chuandong Li; cdli@swu.edu.cn

Received 3 May 2017; Accepted 13 December 2017; Published 19 March 2018

Academic Editor: Pietro De Lellis

Copyright © 2018 Xiangguang Dai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We present a novel method, called graph sparse nonnegative matrix factorization, for dimensionality reduction. The affinity graph and a sparse constraint are both taken into consideration in nonnegative matrix factorization, and it is shown that the proposed matrix factorization method can respect the intrinsic graph structure and provide a sparse representation. Unlike some existing traditional methods, an inertial projection neural network is developed to optimize the proposed matrix factorization problem. By adjusting one parameter of the neural network, the global optimal solution can be searched. Finally, simulations on numerical examples and clustering on real-world data illustrate the effectiveness and performance of the proposed method.

#### 1. Introduction

Dimensionality reduction plays a fundamental role in image processing, and many researchers have been seeking effective methods to solve this problem. A given image database contains many distinct features, but the truly informative features are far fewer. Thus, it is of great significance to find useful low-dimensional features to represent the original feature space. For this purpose, matrix factorization techniques have attracted great attention in recent decades [1–3], and many different methods have been developed using different criteria. The most familiar methods include Singular Value Decomposition (SVD) [4], Principal Component Analysis (PCA) [5], and Vector Quantization (VQ) [6]. The main idea of matrix factorization methods is to find several matrices whose product approximates the original matrix. In dimensionality reduction, the dimension of the decomposed matrices is smaller than that of the original matrix. This gives rise to a low-dimensional compact representation of the original data points, which can facilitate clustering or classification.

Among these matrix factorization methods, one of the most widely used is nonnegative matrix factorization (NMF) [3], which requires the decomposed matrices to be nonnegative. The nonnegative constraint leads NMF to learn a part-based representation of high-dimensional data, and it has been applied to many areas such as signal processing [7], data mining [8, 9], and computer vision [10]. In general, NMF has been shown to be effective for unsupervised learning problems but is not applicable to supervised ones. To overcome this problem, some researchers [11–13] have incorporated semi-supervised learning to achieve better performance in dimensionality reduction. In the light of locality preserving projection, a graph regularized nonnegative matrix factorization method (GNMF) has been proposed to impose geometrical information on the data space; the geometrical structure is constructed by a nearest-neighbor graph [11]. Based on the idea of label propagation, Liu et al. [13] imposed a label information constraint on nonnegative matrix factorization (CNMF). The idea of CNMF is that neighboring data points with the same class label should merge together in the low-dimensional representation space.

Motivated by previous research on matrix factorization, in this paper we propose a novel method, called graph sparse nonnegative matrix factorization (GSNMF), for dimensionality reduction, which can be used for semi-supervised learning problems. In GSNMF, a sparse constraint is imposed on GNMF, which leads the matrix factorization to learn a sparse part-based representation of the original data space. The sparse constraint makes GSNMF a nonconvex nonsmooth problem, which cannot be solved directly by traditional optimization algorithms. Recently, numerous neural networks have emerged as powerful tools for optimization problems [14–27]. For some nonconvex problems, an inertial projection neural network (IPNN) [16] has been proposed that can search different local optimal solutions by means of the inertial term. In [17], a shuffled frog leaping algorithm (SFLA) has been developed using the recurrent neural network; based on the SFLA framework, the global optimal solution can be searched. Moreover, there are many optimization methods for nonconvex nonsmooth problems that use neural networks [22–27].

It is worth highlighting some advantages of our proposed method as follows:

(i) Traditional algorithms for GNMF [11] and NMF [3] can easily become trapped in local optima and are sensitive to initial values, while our proposed algorithm, which uses the inertial projection neural network, can avoid these problems.

(ii) Our proposed algorithm can be initialized with sparser matrices, whereas GNMF and NMF may fail in this case.

(iii) By adjusting one parameter of the neural network, GSNMF achieves a better clustering effect than GNMF and NMF.

The rest of the paper is organized as follows. In Section 2, work related to NMF is briefly reviewed, and we then introduce GSNMF. Section 3 reviews the inertial projection neural network theory and provides a convergence proof for GSNMF. Section 4 presents numerical examples to demonstrate the validity of our proposed algorithm. Experiments on clustering are given in Section 5. Finally, we present some concluding remarks and future work in Section 6.

#### 2. Problem Formulation

To find effective features of high-dimensional data, matrix factorization can be used to learn sets of features that represent the data. Given a data matrix $X = [x_1, x_2, \dots, x_n] \in \mathbb{R}^{m \times n}$ and an integer $k$, matrix factorization aims to find two matrices $U \in \mathbb{R}^{m \times k}$ and $V \in \mathbb{R}^{n \times k}$ such that

$$X \approx UV^{T}. \tag{1}$$

When $k \ll \min(m, n)$, the matrix factorization method can be regarded as a dimensionality reduction method. In image dimensionality reduction, each column of $U$ is a basis vector that captures the original image data, and each column of $V^{T}$ is the representation of one image with respect to the new basis. The most used measure of the approximation quality is the Frobenius norm:

$$\min_{U, V} \|X - UV^{T}\|_F^{2}. \tag{2}$$

Different matrix factorization methods impose different constraints on (2) to solve different practical problems. At present, the most used matrix factorization method is nonnegative matrix factorization (NMF) [3], which places nonnegative constraints on $U$ and $V$. The classic algorithm is summarized by the multiplicative update rules

$$u_{ik} \leftarrow u_{ik} \frac{(XV)_{ik}}{(UV^{T}V)_{ik}}, \qquad v_{jk} \leftarrow v_{jk} \frac{(X^{T}U)_{jk}}{(VU^{T}U)_{jk}}. \tag{3}$$
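For concreteness, the following NumPy sketch runs the update rules (3); the random initialization and the small constant `eps` guarding against division by zero are implementation details not specified in the paper.

```python
import numpy as np

def nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Minimal sketch of the classic multiplicative updates (3)
    for min ||X - U V^T||_F^2 subject to U, V >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))   # nonnegative basis matrix
    V = rng.random((n, k))   # nonnegative coefficient matrix
    for _ in range(iters):
        U *= (X @ V) / (U @ V.T @ V + eps)     # update rule for U
        V *= (X.T @ U) / (V @ U.T @ U + eps)   # update rule for V
    return U, V
```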

Recently, Cai et al. [11] proposed a graph regularized nonnegative matrix factorization method (GNMF), which incorporates geometrical information into the data space. The goal of GNMF is to find effective basis vectors that respect the intrinsic structure. The method rests on the natural assumption that if two data points $x_i$ and $x_j$ from $X$ are close in the intrinsic geometry of the data distribution, then their new representations $z_i$ and $z_j$ are also close to each other. For each data point $x_i$, we find its $p$ nearest neighbors and put edges between $x_i$ and its neighbors. The edges between data points are collected in the weight matrix $W$: if nodes $i$ and $j$ are connected by an edge, then $W_{ij} = 1$. $W$ can be described by

$$W_{ij} = \begin{cases} 1, & \text{if } x_i \in N_p(x_j) \text{ or } x_j \in N_p(x_i), \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$

where $N_p(x_i)$ denotes the set of $p$ nearest neighbors of $x_i$.
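As an illustration of this construction, the sketch below builds the 0-1 weight matrix of a $p$-nearest-neighbor graph using dense pairwise distances; the function name `knn_graph` and the brute-force distance computation are illustrative choices, not code from the paper.

```python
import numpy as np

def knn_graph(X, p=5):
    """0-1 weight matrix W of the p-nearest-neighbor graph, as in (4).
    X holds one data point per column. W[i, j] = 1 iff x_j is among the
    p nearest neighbors of x_i or vice versa, so W is symmetric."""
    n = X.shape[1]
    sq = np.sum(X**2, axis=0)                        # squared column norms
    D2 = sq[:, None] + sq[None, :] - 2 * (X.T @ X)   # pairwise squared distances
    np.fill_diagonal(D2, np.inf)                     # exclude self-loops
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(D2[i])[:p]] = 1.0            # connect i to its p neighbors
    return np.maximum(W, W.T)                        # symmetrize ("or" rule)
```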

The low-dimensional representation of $x_j$ with respect to the new basis is $z_j = [v_{j1}, \dots, v_{jk}]^{T}$, the $j$th row of $V$. The Euclidean distance is used to measure the dissimilarity between $z_i$ and $z_j$:

$$d(z_i, z_j) = \|z_i - z_j\|^{2}. \tag{5}$$

With the above analysis, the following term is used to measure the smoothness of the low-dimensional representation:

$$R = \frac{1}{2} \sum_{i,j=1}^{n} \|z_i - z_j\|^{2} W_{ij} = \mathrm{Tr}(V^{T}LV), \tag{6}$$

where $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix, $L = D - W$, and $D$ is the diagonal degree matrix with $D_{ii} = \sum_{j} W_{ij}$. Combining (6) and (2), the new objective function is defined by the Euclidean distance:

$$\min_{U \geq 0, \, V \geq 0} \|X - UV^{T}\|_F^{2} + \lambda \, \mathrm{Tr}(V^{T}LV), \tag{7}$$

where $\lambda \geq 0$ is the regularization parameter. The algorithm to solve (7) is presented as follows:

$$u_{ik} \leftarrow u_{ik} \frac{(XV)_{ik}}{(UV^{T}V)_{ik}}, \qquad v_{jk} \leftarrow v_{jk} \frac{(X^{T}U + \lambda WV)_{jk}}{(VU^{T}U + \lambda DV)_{jk}}. \tag{8}$$
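A minimal sketch of one round of the updates (8), assuming the standard GNMF rules of Cai et al. [11]; `lam` plays the role of $\lambda$, and `eps` is a numerical guard added here.

```python
import numpy as np

def gnmf_round(X, U, V, W, lam=100.0, eps=1e-9):
    """One round of the GNMF multiplicative updates (8): the U rule is
    unchanged from NMF, while the V rule mixes in the graph matrices
    W and D to promote smoothness over the neighbor graph."""
    D = np.diag(W.sum(axis=1))     # degree matrix, D_ii = sum_j W_ij
    U *= (X @ V) / (U @ V.T @ V + eps)
    V *= (X.T @ U + lam * (W @ V)) / (V @ U.T @ U + lam * (D @ V) + eps)
    return U, V
```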

When $\lambda = 0$ or $W = 0$, GNMF is equivalent to nonnegative matrix factorization. In the representation of image data, GNMF and NMF only consider the Euclidean structure of the image space. However, recent research has shown that human-generated images may lie on a submanifold of the ambient Euclidean space [28, 29]. In general, human-generated images cannot uniformly fill the high-dimensional Euclidean space. Therefore, matrix factorization should respect the intrinsic manifold structure and learn a sparse basis to represent the image data. In the light of sparse coding [30], we impose a sparse constraint on (7), and the optimization problem is transformed into

$$\min_{U \geq 0, \, V \geq 0} \|X - UV^{T}\|_F^{2} + \lambda \, \mathrm{Tr}(V^{T}LV) + \gamma \left( \|U\|_{1} + \|V\|_{1} \right), \tag{9}$$

where $\gamma \geq 0$ controls the sparseness of the factors.
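Under the formulation (9) as written above, the objective value can be evaluated directly; the helper below is a small sketch of that computation, with `lam` and `gamma` standing for $\lambda$ and $\gamma$.

```python
import numpy as np

def gsnmf_objective(X, U, V, W, lam, gamma):
    """Value of the GSNMF objective (9): reconstruction error
    + graph smoothness + l1 sparsity penalties."""
    Lap = np.diag(W.sum(axis=1)) - W      # graph Laplacian L = D - W
    fit = np.linalg.norm(X - U @ V.T, "fro") ** 2
    smooth = lam * np.trace(V.T @ Lap @ V)
    sparse = gamma * (np.abs(U).sum() + np.abs(V).sum())
    return fit + smooth + sparse
```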

Because the optimization problem (9) is nonconvex, a block-coordinate descent (BCD) [31] structure is proposed to optimize GSNMF. Given initial $U$ and $V$, BCD alternately solves

$$U \leftarrow \arg\min_{U \geq 0} \|X - UV^{T}\|_F^{2} + \gamma \|U\|_{1}, \tag{10}$$

$$V \leftarrow \arg\min_{V \geq 0} \|X - UV^{T}\|_F^{2} + \lambda \, \mathrm{Tr}(V^{T}LV) + \gamma \|V\|_{1} \tag{11}$$

until convergence. Since (11) and (10) have a similar form, we only consider how to solve (10); then (11) can be solved accordingly. Problem (10) can be transformed into the following vector form:

$$\min_{u} \frac{1}{2} u^{T} Q u + c^{T} u + \gamma \|u\|_{1}, \tag{12}$$

where

$$u = \mathrm{vec}(U), \qquad Q = 2 \left( V^{T}V \otimes I_m \right), \qquad c = -2 \, \mathrm{vec}(XV). \tag{13}$$

It is evident that the $\ell_1$-norm is not differentiable. However, [32] has presented a method to handle it. Supposing $u = u^{+} - u^{-}$ with $u^{+}, u^{-} \geq 0$, problem (12) can be rewritten as follows:

$$\min_{y \geq 0} \frac{1}{2} y^{T} \tilde{Q} y + \tilde{c}^{T} y, \tag{14}$$

where $y = [(u^{+})^{T}, (u^{-})^{T}]^{T}$, and $\tilde{Q}$ and $\tilde{c}$ are, respectively, defined as follows:

$$\tilde{Q} = \begin{bmatrix} Q & -Q \\ -Q & Q \end{bmatrix}, \qquad \tilde{c} = \begin{bmatrix} c + \gamma \mathbf{1} \\ -c + \gamma \mathbf{1} \end{bmatrix}. \tag{15}$$

According to the BCD structure, (14) can be separated into two subproblems. Given initial $u^{+}$ and $u^{-}$, one alternately solves

$$u^{+} \leftarrow \arg\min_{u^{+} \geq 0} \frac{1}{2} (u^{+})^{T} Q u^{+} + \left( c + \gamma \mathbf{1} - Q u^{-} \right)^{T} u^{+}, \tag{16}$$

$$u^{-} \leftarrow \arg\min_{u^{-} \geq 0} \frac{1}{2} (u^{-})^{T} Q u^{-} + \left( -c + \gamma \mathbf{1} - Q u^{+} \right)^{T} u^{-} \tag{17}$$

until convergence. Since (16) and (17) have a similar form, we only consider how to solve (16); then (17) can be solved accordingly. Equation (16) can be transformed into the following convex quadratic program (CQP):

$$\min_{x \geq 0} f(x) = \frac{1}{2} x^{T} A x + b^{T} x, \tag{18}$$

where

$$A = Q, \qquad b = c + \gamma \mathbf{1} - Q u^{-}. \tag{19}$$
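The sketch below shows the splitting step of [32] in isolation: it lifts an $\ell_1$-penalized quadratic of the form (12) into the nonnegative quadratic program (14), following the block construction (15). The arguments `Q`, `c`, and `gamma` stand for the quantities in (13).

```python
import numpy as np

def l1_to_cqp(Q, c, gamma):
    """Splitting x = x_plus - x_minus: turn
        min 0.5 x^T Q x + c^T x + gamma * ||x||_1
    into the nonnegative CQP of (14)-(15) in y = [x_plus; x_minus]."""
    n = Q.shape[0]
    Qt = np.block([[Q, -Q], [-Q, Q]])
    ct = np.concatenate([c + gamma * np.ones(n),
                         -c + gamma * np.ones(n)])
    return Qt, ct   # recover x afterwards as y[:n] - y[n:]
```

At the optimum, at most one of $u^{+}_i$ and $u^{-}_i$ is nonzero in each component, so $\mathbf{1}^{T}(u^{+} + u^{-})$ equals $\|u\|_1$ and the lifted problem matches (12).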

According to the above analysis, problem (11) can also be transformed into a convex quadratic program. To save space, we do not provide the derivation process. In the following section, we will introduce IPNN to optimize (18).

#### 3. Neural Network Model and Analysis

##### 3.1. Inertial Projection Neural Network

To solve problem (18), we establish a neural network, model (20), using the IPNN of [16], where $P_{\Omega}(\cdot)$ denotes the projection onto the feasible set $\Omega = \{x \mid x \geq 0\}$ and $\nabla f(x) = Ax + b$. Now, we are ready to show the convergence and optimality of (20) through the following theorems.
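Since the exact inertial dynamics are specified in [16], the sketch below only simulates one plausible inertial projection system by forward Euler integration: a second-order flow driven by the projection residual $P_{\Omega}(x - \alpha \nabla f(x)) - x$ with damping `k`. The parameter names `k`, `alpha`, `h`, and the precise way the inertial term enters are assumptions for illustration, not the published model (20).

```python
import numpy as np

def ipnn_cqp(Q, c, k=0.5, alpha=0.1, h=0.01, steps=20000):
    """Illustrative forward-Euler simulation of an inertial projection
    dynamics for the CQP (18): min 0.5 x^T Q x + c^T x s.t. x >= 0.
    Assumed second-order form: x'' = P(x - alpha*grad) - x - k*x'."""
    n = Q.shape[0]
    x = np.zeros(n)                                  # network state
    v = np.zeros(n)                                  # velocity dx/dt
    for _ in range(steps):
        grad = Q @ x + c                             # gradient of the objective
        resid = np.maximum(x - alpha * grad, 0.0) - x  # projection residual
        v += h * (resid - k * v)                     # inertia plus damping
        x += h * v                                   # integrate state
    return x
```

An equilibrium of this flow satisfies $P_{\Omega}(x - \alpha \nabla f(x)) = x$, which is exactly the optimality condition of the CQP (18).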

Theorem 1. *For any initial point $x_0$ with the initial condition $x(0) = x_0$, there exists a unique continuous solution $x(t)$ of model (20), where $t \in [0, +\infty)$.*

*Proof.* Note that $\nabla f(\cdot)$ is Lipschitz continuous; let $l$ be its Lipschitz constant. Since the projection operator $P_{\Omega}(\cdot)$ is nonexpansive, for any $x_1$ and $x_2$ we obtain the bound (21), where $I$ and $l$ are the unit matrix and the Lipschitz constant, respectively. Therefore, the right-hand side of (20) is Lipschitz continuous on its domain, and there exists a unique solution with the initial condition $x(0) = x_0$ by the local existence theorem of solutions to ordinary differential equations.

Theorem 2. *Let $\Omega^{*}$ denote the optimal solution set of (18), and define the energy function $E(t)$ used below. Suppose the following two conditions hold: $(C_1)$ for any $x$ and $t$, inequality (22) is satisfied, where $\theta$ is the angle between $P_{\Omega}(x - \nabla f(x)) - x$ and $dx/dt$; $(C_2)$ the network parameter is chosen sufficiently large. Then the solution of model (20) converges to the optimal solution set $\Omega^{*}$ of (18).*

*Proof.* Considering model (20) and the Lyapunov function $E(t)$, we obtain (23). Since Condition $(C_1)$ holds, (23) can be rewritten as (24). Then, (24) can be transformed into (25), which indicates that the function $E(t)$ is monotone nonincreasing. Thus, for any $t \geq 0$, (26) holds. Since $E(t)$ is bounded below, we obtain that $E(t)$ is bounded. Further, (27) holds. Multiplying inequality (27) by a suitable positive factor, we obtain the relation which implies (28). Integrating (28) from $0$ to $t$, it is obtained that the state remains bounded. Therefore, the trajectory of model (20) is bounded.

Since the trajectory is bounded and inequality (26) holds, we obtain (29). Thus, (29) can be rewritten as (30). Since the trajectory is bounded, (30) indicates that $dx/dt$ is also bounded. From (25) and (30), one obtains that $E(t)$ converges and $\int_{0}^{\infty} \|\dot{x}(t)\|^{2} \, dt$ is finite.

Assume, for contradiction, that $\|\dot{x}(t)\|$ does not tend to zero. Since $\dot{x}$ is bounded, there exists $T > 0$ such that the corresponding lower bound holds for any $t \geq T$. Therefore, one obtains (31), and it is easy to obtain (32). According to the theory of calculus, the limit $\lim_{t \to \infty} E(t)$ exists; therefore, the corresponding limit of the integral also exists. Since $\int_{0}^{\infty} \|\dot{x}(t)\|^{2} \, dt$ is finite, we obtain a contradiction, and hence $\dot{x}(t) \to 0$. Since the trajectory is bounded, it has at least one limit point.

It follows from (20) that (33) holds. If $\dot{x}(t) \to 0$, then (34) holds, which implies that $P_{\Omega}(x - \nabla f(x)) - x \to 0$; that is, the trajectory approaches the equilibrium set of (20). In the following, we will prove that the limit point lies in $\Omega^{*}$.

Defining an auxiliary function and substituting it into (20), one obtains the corresponding differential inequality. Since Condition $(C_2)$ holds, we have the estimate leading to (35). Integrating (35) gives the required limit. Since the equilibria of (20) satisfy the optimality condition of (18), the limit point is optimal. Thus, the solution of system (20) converges to the optimal set $\Omega^{*}$. The proof is completed.

*Remark 3.* In Theorem 2, Condition $(C_1)$ should be satisfied. In the following, we discuss the existence of the parameter in Condition $(C_2)$. For any $x_1$ and $x_2$, we have (36). Therefore, $P_{\Omega}(x - \nabla f(x)) - x$ is also Lipschitz continuous, and (37) holds. Thus, we get (38), where $\theta$ is the angle between $P_{\Omega}(x - \nabla f(x)) - x$ and $dx/dt$. It is then easy to obtain (39).

##### 3.2. Algorithms

Based on the above analysis, we summarize Algorithms 1, 2, and 3 to optimize GSNMF. Firstly, the parameters $Q$, $c$, $A$, and $b$ in (13) and (19) are derived by Algorithm 1. Secondly, Algorithm 2 applies IPNN to optimize the CQP problem. Thirdly, the optimization problem (9) is divided into two CQP problems, which are solved alternately by Algorithm 3. In the following, we analyse the time complexity of our proposed algorithms. The main cost is spent on the calculation of the gradient in Algorithm 2: to optimize subproblem (10), the dominant operation is the matrix product appearing in the gradient $\nabla f(x) = Ax + b$, and the cost of optimizing subproblem (11) with Algorithm 2 is of the same order. The resulting per-iteration complexity of GSNMF, together with that of GNMF and NMF, is summarized in Table 1. At each iteration, there are two update operations, the same as in NMF and GNMF.
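To tie the pieces together, the following sketch is a hypothetical outer loop in the spirit of Algorithm 3: it alternates the two subproblems (10) and (11), posing each as a nonnegative CQP and delegating it to the `ipnn_cqp` sketch above. Because $U, V \geq 0$, the $\ell_1$ penalty is folded in here as a linear term rather than via the splitting of [32], so this driver only illustrates the BCD structure, not the published Algorithms 1-3.

```python
import numpy as np

def gsnmf_bcd(X, k, W, lam=100.0, gamma=0.1, rounds=20, seed=0):
    """Illustrative BCD driver for GSNMF (9): alternate the U- and
    V-subproblems (10)/(11), each vectorized into a nonnegative CQP
    (column-stacking vec, Kronecker-product Hessians)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    Lap = np.diag(W.sum(axis=1)) - W        # graph Laplacian L = D - W
    U = rng.random((m, k))
    V = rng.random((n, k))
    for _ in range(rounds):
        # U-subproblem (10): Hessian 2(V^T V kron I_m), linear term -2*vec(XV) + gamma*1
        Qu = 2 * np.kron(V.T @ V, np.eye(m))
        cu = (-2 * (X @ V)).flatten(order="F") + gamma
        U = ipnn_cqp(Qu, cu, steps=2000).reshape((m, k), order="F")
        # V-subproblem (11): graph term adds 2*lam*(I_k kron L) to the Hessian
        Qv = 2 * np.kron(U.T @ U, np.eye(n)) + 2 * lam * np.kron(np.eye(k), Lap)
        cv = (-2 * (X.T @ U)).flatten(order="F") + gamma
        V = ipnn_cqp(Qv, cv, steps=2000).reshape((n, k), order="F")
    return U, V
```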