Complexity

Volume 2018, Article ID 2685745, 12 pages

https://doi.org/10.1155/2018/2685745

## Analysis Sparse Representation for Nonnegative Signals Based on Determinant Measure by DC Programming

^{1}National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan^{2}School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu, Japan^{3}School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China

Correspondence should be addressed to Wuhui Chen; nc.ude.usys.liam@huwnehc

Received 1 September 2017; Accepted 15 March 2018; Published 24 April 2018

Academic Editor: Tsendsuren Munkhdalai

Copyright © 2018 Yujie Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Analysis sparse representation has recently emerged as an alternative approach to the synthesis sparse model. Most existing algorithms typically employ the -norm, which is generally NP-hard. Other existing algorithms employ the -norm to relax the -norm, which sometimes cannot promote adequate sparsity. Most of these existing algorithms focus on general signals and are not suitable for nonnegative signals. However, many signals are necessarily nonnegative such as spectral data. In this paper, we present a novel and efficient analysis dictionary learning algorithm for nonnegative signals with the determinant-type sparsity measure which is convex and differentiable. The analysis sparse representation can be cast in three subproblems, sparse coding, dictionary update, and signal update, because the determinant-type sparsity measure would result in a complex nonconvex optimization problem, which cannot be easily solved by standard convex optimization methods. Therefore, in the proposed algorithms, we use a difference of convex (DC) programming scheme for solving the nonconvex problem. According to our theoretical analysis and simulation study, the main advantage of the proposed algorithm is its greater dictionary learning efficiency, particularly compared with state-of-the-art algorithms. In addition, our proposed algorithm performs well in image denoising.

#### 1. Introduction

Real signals around our daily life are always distributed in a high dimensional space; however low-dimensional structures are found in the signals, then we can represent the signals with a proper model by only a few parameters [1]. A proper model should be simple while matching the signals. In the past decades, the sparse and redundant representation model has been proven to be an efficient and beneficial model [2–5]. The theoretical background for sparse models is given by compressed sensing (CS) [6–8]. CS mathematically declares that if a signal is sparse or compressive, this original signal can be reconstructed by a few measurements, which are much fewer than the counts suggested by previous theories [6, 7, 9–11]. Sparse representation has also been described as an extraordinary powerful solution for a wide range of real-word applications, especially in image processing, such as image denoising, deblurring, inpainting, restoration, superresolution, and also in the field of machine learning, computer vision, and so on [12–21].

Sparse representation can be formulated by either a synthesis model or an analysis model. The synthesis model is popular and mature. The analysis model has been less investigated for sparse representation, although several analysis dictionary learning algorithms have been proposed, such as the analysis K-SVD [12], Greedy Analysis Pursuit (GAP) [22], and the analysis thresholding algorithm [23].

In practice, some signals such as chemical concentrations in experimental results and pixels in video frames and images have inherent nonnegativity, and dedicated factorization methods have been proposed [24, 25]. The above analysis sparse representation algorithms are all for general signals which contain nonnegative and negative elements. The methods for general signals directly applied to nonnegative signals cannot achieve satisfying results. An existing analysis dictionary learning method for nonnegative signals [26], which uses blocked determinants as the sparseness measure, is quite difficult or computationally expensive. The purpose of this paper is to address the problem on nonnegative analysis sparse representation.

In this paper, we present a novel algorithm of analysis sparse representation for nonnegative signals, which is parallel to the synthesis sparse representation in the principle and structure. Though this model has been studied in the past, there is still not a matured field for nonnegative analysis representation, while the algorithms designed for general signal cannot sufficiently be applied to the nonnegative signals. Thus, we focus on the nonnegative sparse representation with the analysis model. We cast the analysis sparse representation into three subproblems (analysis dictionary update, sparse coding, and signal recovery) and use an alternating scheme to obtain an optimization solution. We utilize the determinant-type of sparseness measure as the sparseness constraint, which is convex and differentiable. The objective function for sparse coding is nonconvex, and standard convex optimization methods cannot be employed. Fortunately the objective function is the difference of two convex functions, then we can introduce difference of convex (DC) programming to solve this nonconvex optimization problem.

The remainder of this paper is organized as follows. The conventional sparse representation problem is reviewed in Section 2. In Section 3, we introduce the analysis representation. In Section 4, we describe the problem formulation for analysis sparse representation and present the optimization framework. The experiments described in Section 5 demonstrate the practical advantages of the proposed algorithms compared with state-of-the-art algorithms both with artificial and real-world datasets. Finally, we present our conclusions in Section 6.

##### 1.1. Notations

Here we list notations used in this paper. A boldface uppercase letter like is defined as a matrix, and a lowercase letter like is defined as the th entry of . A boldface lowercase letter, such as , is defined as a vector, and a lowercase letter is defined as the th entry of . Matrix slices and are defined as the th row and the th column of matrix , respectively. The Frobenius norm of matrix is defined as . The determinant value of a matrix is denoted by . Note that, in this paper, all parameters take real values.

#### 2. Preliminaries

##### 2.1. Sparse Representation

Sparse representation decomposes observed signals into a product of a dictionary matrix which contains signal bases and a sparse coefficient matrix [13–17], and there are two different structures: synthesis model and analysis model. The synthesis model is the first proposed sparse model and more popular. We first review it in this section.

Assume that we want to model the signals , where is the signal dimensionality and is the number of measurements. The synthesis sparse model suggests that the signals could be expressed asorwhere refer to as a dictionary, is a representation coefficient matrix, and is a small residual [27]. Here, is the number of bases, which are also called dictionary atoms. We further assume that the representation matrix is sparse (i.e., many zero entries) to obtain sparse representations of the signals. Equations (1) or (2) mean that each signal can be represented as a linear combination of a few atoms from the dictionary matrix .

A key issue in the sparse representation is the choice of the dictionary which the observed signals are used to decompose. One choice is a predefined dictionary, such as discrete Fourier transform (DFT), discrete cosine transform (DCT), and wavelets [28], which can be employed for learning a signal-specific dictionary from observed signals. Another choice, the learned dictionary, results in better matching to the contents of signals. In addition, the learned dictionary often exhibits better performance compared to predefined dictionaries in real-world applications [29, 30].

Intriguingly, there exists a “twin” of the synthesis model called the analysis model [31]. Assume that there is a matrix that produces a sparse coefficient matrix by being multiplied to the signal matrix: . This equation can be obtained as the solution to a minimization problem of the error function . Remarkably, this error function is convex, and standard optimization methods can be employed. Since error functions in the synthesis model are nonconvex, optimization in the analysis model is often easier. We call the analysis dictionary. Atoms in the analysis dictionary are its rows, rather columns in the synthesis dictionary . The term “analysis” means the dictionary analyzes the signal to produce a sparse result [32]. To emphasize the difference between the analysis and synthesis models, the term “cosparsity” has been introduced in the literature [31, 33], which counts the number of 0-valued elements of , that is, zero elements coproduced by and [34]. The analysis sparse model is also called the cosparse model, and then the analysis dictionary is also called the cosparse dictionary.

Now we look more closely at the analysis sparse model. The analysis model for one signal , which is a column in the signal matrix , can be represented using a proper analysis dictionary . The th row, namely, th atom, in is denoted by . We want to make the analysis representation vector sparse. This is formulated by introducing a sparsity measure , so that it negatively behaves with the sparsity of , and minimizing yields the sparsest solution:Although employing the -norm, that is, setting , yields the sparsest solution [35], the optimization problem is combinatorial and often NP-hard. Therefore other sparsity measures such as the -norm are employed to have easier optimization problems. Nevertheless, it is known that the -norm often overpenalizes large elements and solutions are too sparse.

##### 2.2. Sparseness Measure

The -norms, where , or , are popular measures for assessing the sparseness of a vector. Since the -norm yields an NP-hard problem, its convex relaxation, the -norm, is often preferred [36, 37]. The -norm of a vector is defined to be the sum of the absolute values of ; namely, . If the vector is nonnegative, that is, , the -norm of is just . For nonnegative vectors, their -norm is differentiable and smooth, and gradient methods can be used in optimization. Some authors introduce the -norm with nonnegative matrix factorization since the nonnegative constraints yield sparse solutions. However, the results with the -norm are not sparser than those with the -norm or -norm [38].

The sparsity measures mentioned above can reflect the instantaneous sparseness of one single signal [35], but they are not suitable for evaluating sparsity across different measurements [39]. In order to describe the joint sparseness of the nonnegative sources, we introduce determinant-type of sparsity measure. In spectral unmixing for remote sensing image interpretation, where signals are nonnegative, the determinant-type of sparsity measure [40] can get good results similar to the other sparseness-based methods from the results of numerical experiments [41]. Thus, the determinant-type measure can explicitly measure the sparseness of nonnegative matrices.

The determinant-type sparse measure has several good qualities. If a nonnegative matrix is normalized, we can find that the determinant value of the nonnegative matrix is well bounded and its value interpolates monotonously between two extremes of and , along with the increasing of the sparsity. For instance, if the nonnegative matrix is nonsparse with its rows and satisfies sum-to-one, then the determinant of , , is close to . On the other hand, approaches to if and only if the matrix is the most sparse [42]. Namely, the determinant value satisfies that , where if all entries of are the same, and when the following two criteria are satisfied at the same time:(1)For all , only one element in is nonzero.(2)For all and , and are orthogonal, that is, .

The detailed proof can be found in [40]. Thus we can use the determinant measure in the cost function. Figure 1 illustrates the sparseness degrees of three different matrices gauged by the determinant measure. The determinant values of the matrices from left to right are 0.0625, 0.5, and 1. We can see that the sparser the matrix is, the larger value the determinant measure is. Thus the sparse coding problem with determinant constraints can be expressed as an optimization problem