Computational Intelligence and Neuroscience

Volume 2008 (2008), Article ID 764206, 10 pages

http://dx.doi.org/10.1155/2008/764206

## Theorems on Positive Data: On the Uniqueness of NMF

^{1}Department of Electronic Systems, Aalborg University, Niels Jernes Vej 12, 9220 Aalborg, Denmark

^{2}Department of Electronic Engineering, Queen Mary, University of London, Mile End Road, London E1 4NS, UK

^{3}Department of Informatics and Mathematical Modeling, Technical University of Denmark, Richard Petersens Plads, Building 321, 2800 Lyngby, Denmark

Received 1 November 2007; Accepted 13 March 2008

Academic Editor: Wenwu Wang

Copyright © 2008 Hans Laurberg et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We investigate the conditions under which nonnegative matrix factorization (NMF) is unique and introduce several theorems that can determine whether the decomposition is in fact unique. The theorems are illustrated by several examples showing their use and their limitations. We show that corruption of a unique NMF matrix by additive noise leads to a noisy estimation of the noise-free unique solution. Finally, we use a stochastic view of NMF to analyze which characterization of the underlying model will result in an NMF with small estimation errors.

#### 1. Introduction

Large quantities of positive data occur in research areas such as music analysis, text analysis, image analysis, and probability theory. Before deductive science is applied to large quantities of data, it is often appropriate to reduce data by preprocessing, for example, by matrix rank reduction or by feature extraction. Principal component analysis is an example of such preprocessing. When the original data is nonnegative, it is often desirable to preserve this property in the preprocessing. For example, elements in a power spectrogram, probabilities, and pixel intensities should still be nonnegative after the processing to be meaningful. This has led to the construction of algorithms for rank reduction of matrices and feature extraction generating nonnegative output. Many of the algorithms are related to the nonnegative matrix factorization (NMF) algorithm proposed by Lee and Seung [1, 2]. NMF algorithms factorize a nonnegative matrix $V\in\mathbb{R}_+^{n\times m}$ or $R\in\mathbb{R}_+^{n\times m}$ into two nonnegative matrices $W\in\mathbb{R}_+^{n\times r}$ and $H\in\mathbb{R}_+^{r\times m}$:

$$V\approx R=WH. \tag{1}$$

There are no closed-form solutions to the problem of finding $W$ and $H$ given a $V$, but Lee and Seung [1, 2] proposed two computationally efficient algorithms for minimizing the difference between $V$ and $WH$ for two different error functions. Numerous other algorithms have been proposed since (see [3]).
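As a concrete illustration of the factorization problem, the following is a minimal sketch of the multiplicative update rule for the Frobenius-norm error attributed to Lee and Seung [2]. The function name `nmf_multiplicative`, the random initialization, the iteration count, and the small constant `eps` that guards against division by zero are assumptions made for this sketch and are not prescribed by the text.

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=500, seed=0, eps=1e-12):
    """Approximate a nonnegative V (n x m) by W @ H with W (n x r) and H (r x m) nonnegative."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random start; multiplicative updates keep the factors nonnegative.
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Synthetic data generated from a known nonnegative pair (illustrative values only).
rng = np.random.default_rng(1)
W_true = rng.random((6, 2))
H_true = rng.random((2, 8))
V = W_true @ H_true

W_est, H_est = nmf_multiplicative(V, r=2)
rel_err = np.linalg.norm(V - W_est @ H_est) / np.linalg.norm(V)
```

A small `rel_err` only says that the product is reproduced well; whether `W_est` and `H_est` resemble `W_true` and `H_true` is exactly the uniqueness question studied in this paper.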

An interesting question is whether the NMF of a particular matrix is unique. The importance of this question depends on the particular application of NMF. There can be two different viewpoints when using a model like NMF: either one can believe that the model describes nature and that the variables $W$ and $H$ have a physical meaning, or one can believe that the model can capture the part of interest even though there is not a one-to-one mapping between the parameters and the model, and the physical system. When using NMF, one can wonder whether $V$ is a disturbed version of some underlying $WH$ or whether the data is constructed by another model or, in other words, whether a ground truth $W$ and $H$ exists. These questions are important in evaluating whether or not it is a problem that there is another NMF solution, $W'H'$, to the same data, that is,

$$V\approx R=WH=W'H'. \tag{2}$$

If NMF is used even though the data is not assumed to be generated by (1), it may not be a problem that there are several other solutions. On the other hand, if one assumes that a ground truth exists, it may be a problem if the model is not detectable, that is, if it is not possible to find $W$ and $H$ from the data matrix $V$.

The first articles on the subject were two correspondences between Berman and Thomas. In [4] Berman asked for what amounts to a simple characterization of the class of nonnegative matrices for which an NMF exists. As we shall see, the answer by Thomas [5] can be transferred into an NMF uniqueness theorem.

The first article investigating the uniqueness of NMF is Donoho and Stodden [6]. They use convex duality to conclude that in some situations, where the column vectors of $W$ “describe parts,” and for that reason are nonoverlapping and thereby orthogonal, the NMF solution is unique.

Simultaneously with the development of NMF, Plumbley [7] worked with nonnegative independent component analysis where one of the problems is to estimate a rotation matrix $Q$ from observations of the form $Qs$, where $s$ is a nonnegative vector. In this setup, Plumbley investigates a property for a nonnegative independent and identically distributed (i.i.d.) vector $s$ such that $Q$ can be estimated. He shows that if the elements in $s$ are grounded and a sufficiently large set of observations is used, then $Q$ can be estimated. The uniqueness constraint in [7] is a statistical condition on $s$.

The result in [7] is highly relevant to the NMF uniqueness due to the fact that in most cases new NMF solutions will have the forms $WQ$ and $Q^{-1}H$ as described in Section 3. By using Plumbley's result twice, a restricted uniqueness theorem for NMF can be constructed.

In this paper, we investigate the circumstances under which NMF of an observed nonnegative matrix is unique. We present novel necessary and sufficient conditions for the uniqueness. Several examples illustrating these conditions and their interpretations are given. Additionally, we show that NMF is robust to additive noise. More specifically, we show that it is possible to obtain accurate estimates of $W$ and $H$ from noisy data when the generating NMF is unique. Lastly, we consider the generating NMF as a stochastic process and show that particular classes of such processes almost surely result in unique NMFs.

This paper is structured as follows. Section 2 introduces the notation, some definitions, and basic results. A precise definition and two characterizations of a unique NMF are given in Section 3. The minimum constraints of $W$ and $H$ for a unique NMF are investigated in Section 4. Conditions and examples of a unique NMF are given in Section 5. In Section 6, it is shown that in situations where noise is added to a data matrix with a unique NMF, it is possible to bound the error of the estimates of $W$ and $H$. A probabilistic view on the uniqueness is considered in Section 7. The implication of the theorems is discussed in Section 8, and Section 9 concludes the paper.

#### 2. Fundamentals

We will here introduce convex duality that will be the framework of the paper, but first we shall define the notation to be used. Nonnegative real numbers are denoted by $\mathbb{R}_+$, $\|\cdot\|_F$ denotes the Frobenius norm, and $\operatorname{span}(\cdot)$ is the space spanned by a set of vectors. Each type of variable has its own font: scalars, column vectors, row vectors, matrices, sets, and random variables are typeset differently. Moreover, $x_i$ is the $i$th element of the vector $x$. When a condition for a set is used to describe a matrix, it refers to the set of column vectors in the matrix. The NMF is symmetric in $W^T$ and $H$, so the theorems for one matrix may also be used for the other.

In the paper, we make a geometric interpretation of the NMF similar to that used in both [5, 6]. For that, we need the following definitions.

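*Definition 1. *The *positive span* is given by $\operatorname{span}_+(b_1,\dots,b_d)=\{v=\sum_{i}a_ib_i \mid a\in\mathbb{R}_+^d\}$.

In some literature, the positive span is called the conical hull.

*Definition 2. *A set $\mathcal{A}$ is called a *simplicial cone* if there is a set $\mathcal{B}$ such that $\mathcal{A}=\operatorname{span}_+(\mathcal{B})$. The *order* of a simplicial cone is the minimum number of elements in $\mathcal{B}$.

*Definition 3. *The *dual* to a set $\mathcal{A}$, denoted $\mathcal{A}^{\ast}$, is given by $\mathcal{A}^{\ast}=\{v \mid v^Ta\ge 0 \text{ for all } a\in\mathcal{A}\}$.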

The following lemma is easy to prove and will be used subsequently. For a more general introduction to convex duality, see [8].

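Lemma 1. *(a) If $\mathcal{X}=\operatorname{span}_+(b_1,\dots,b_d)$, then $y\in\mathcal{X}^{\ast}$ if and only if $y^Tb_n\ge 0$ for all $n$.*

*(b) If $\mathcal{X}=\operatorname{span}_+(B^T)$ and $B$ is invertible, then $\mathcal{X}^{\ast}=\operatorname{span}_+(B^{-1})$.*

*(c) If $\mathcal{Y}\subseteq\mathcal{X}$, then $\mathcal{X}^{\ast}\subseteq\mathcal{Y}^{\ast}$.*

*(d) If $\mathcal{Y}$ and $\mathcal{X}$ are closed simplicial cones and $\mathcal{Y}\subsetneq\mathcal{X}$, then $\mathcal{X}^{\ast}\subsetneq\mathcal{Y}^{\ast}$.*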
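Parts (a) and (b) of the lemma can be checked numerically on a small case. The sketch below is illustrative only: the helper `in_dual`, the matrix `B`, and the tolerance are assumptions introduced here, and the cone is taken to be generated by the rows of `B`, so that the dual is generated by the columns of its inverse.

```python
import numpy as np

def in_dual(y, generators, tol=1e-12):
    """Part (a): y is in the dual of span_+(generators) iff y^T b >= 0
    for every generator b (generators are stored as columns)."""
    return bool(np.all(generators.T @ y >= -tol))

# Hypothetical invertible matrix; its ROWS generate the cone span_+(B^T).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
cone_generators = B.T                  # columns of B^T are the rows of B

# Part (b): the dual cone is generated by the columns of inv(B).
dual_generators = np.linalg.inv(B)

columns_in_dual = [in_dual(dual_generators[:, k], cone_generators)
                   for k in range(dual_generators.shape[1])]
outside = in_dual(np.array([0.0, -1.0]), cone_generators)
```

Every column of `dual_generators` passes the test in part (a), while the vector `(0, -1)` has a negative inner product with the generator `(1, 1)` and is therefore rejected.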

#### 3. Dual Space and the NMF

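In this section, our definition of unique NMF and some general conditions for unique NMF are given. As a starting point, let us assume that both $W$ and $H$ have full rank, that is, $r=\operatorname{rank}(R)$.

Let $W'$ and $H'$ be any matrices that fulfil $R=WH=W'H'$. Then, $\operatorname{span}(W)=\operatorname{span}(R)=\operatorname{span}(W')$. The column vectors of $W$ and $W'$ are therefore both bases for the same space and as a result there exists a basis shift matrix $Q\in\mathbb{R}^{r\times r}$ such that $W'=WQ$. It follows that $H'=Q^{-1}H$. Therefore, all NMF solutions where $r=\operatorname{rank}(R)$ are of the form $R=WQQ^{-1}H$. In these situations, the ambiguity of the NMF is the $Q$ matrix. Note that if $r>\operatorname{rank}(R)$ the above arguments are not valid because $\operatorname{rank}(W)$ can differ from $\operatorname{rank}(W')$ and thereby $\operatorname{span}(W)$ can differ from $\operatorname{span}(W')$.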
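The role of the basis shift matrix can be made concrete with a small numerical sketch. All matrices below are hypothetical values chosen for illustration: one choice of the shift matrix is a scaled permutation, which yields another valid nonnegative factorization, and the other is a generic invertible matrix, which preserves the product but breaks nonnegativity.

```python
import numpy as np

# Hypothetical nonnegative factors with inner dimension r = 2.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
R = W @ H

# A scaled permutation: swaps the two components and rescales them.
Q_perm = np.array([[0.0, 2.0],
                   [0.5, 0.0]])
W1, H1 = W @ Q_perm, np.linalg.inv(Q_perm) @ H

# A generic invertible matrix: same product, but W2 gets a negative entry.
Q_shear = np.array([[1.0, -0.5],
                    [0.0, 1.0]])
W2, H2 = W @ Q_shear, np.linalg.inv(Q_shear) @ H

same_product = np.allclose(W1 @ H1, R) and np.allclose(W2 @ H2, R)
perm_is_valid_nmf = bool((W1 >= 0).all() and (H1 >= 0).all())
shear_is_valid_nmf = bool((W2 >= 0).all() and (H2 >= 0).all())
```

Both transformed pairs reproduce `R`, but only the scaled permutation keeps both factors nonnegative in this instance.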

*Example 1. *The following is an example of an $R$ matrix of rank 3, where there are two NMF solutions but no $Q$ matrix to connect the solutions:

$$R=\begin{bmatrix}1&1&0&0\\1&0&1&0\\0&1&0&1\\0&0&1&1\end{bmatrix}=RI=IR,$$

where $I$ is the identity matrix. We mention in passing that Thomas [5] uses this matrix to illustrate a related problem. This completes the example.
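The situation described in this example can be reproduced numerically. The sketch below assumes a well-known 4 × 4 nonnegative matrix of rank 3 as a stand-in for the matrix of the example; the two factorizations use inner dimension 4, and a least-squares solve is used only to show that no matrix maps one left factor onto the other.

```python
import numpy as np

# Assumed illustrative matrix: 4 x 4, nonnegative, rank 3.
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
I = np.eye(4)
rank_R = np.linalg.matrix_rank(R)

# Two nonnegative factorizations with inner dimension 4.
W_a, H_a = R, I
W_b, H_b = I, R

# Best least-squares Q with W_a @ Q ~ W_b; an exact Q would need R to be invertible.
Q, *_ = np.linalg.lstsq(W_a, W_b, rcond=None)
residual = np.linalg.norm(W_a @ Q - W_b)
```

Because the left factor `W_a` is singular, the residual stays well away from zero, so the two solutions are not related by any basis shift.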

Lemma 2 (Minc [9, Lemma 1.1]). *The inverse of a nonnegative matrix is nonnegative if and only if it is a scaled permutation.*

Lemma 2 shows that all NMF solutions of the forms $WQ$ and $Q^{-1}H$, where $Q$ is a scaled permutation, are valid, and thereby that NMF can only be unique up to a permutation and scaling. This leads to the following definition of unique NMF in this paper.
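The statement of Lemma 2 is easy to probe on small matrices. The following sketch is an illustration, not a proof: the helper names and the two test matrices are assumptions introduced here.

```python
import numpy as np

def is_scaled_permutation(Q, tol=1e-12):
    """True if Q is square, nonnegative, and has exactly one nonzero per row and per column."""
    Q = np.asarray(Q, dtype=float)
    if Q.ndim != 2 or Q.shape[0] != Q.shape[1]:
        return False
    nonzero = np.abs(Q) > tol
    return bool(np.all(Q >= -tol)
                and np.all(nonzero.sum(axis=0) == 1)
                and np.all(nonzero.sum(axis=1) == 1))

def has_nonnegative_inverse(Q, tol=1e-12):
    """True if the inverse of the (assumed invertible) matrix Q is entrywise nonnegative."""
    return bool(np.all(np.linalg.inv(Q) >= -tol))

Q_scaled_perm = np.array([[0.0, 2.0, 0.0],
                          [0.0, 0.0, 3.0],
                          [0.5, 0.0, 0.0]])
Q_other = np.array([[1.0, 1.0],
                    [0.0, 1.0]])   # nonnegative and invertible, but not a scaled permutation
```

For `Q_scaled_perm` both checks succeed, whereas `Q_other` has the inverse `[[1, -1], [0, 1]]`, which contains a negative entry.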

*Definition 4. *A matrix $R$ has a *unique NMF* if the ambiguity is a permutation and a scaling of the columns in $W$ and rows in $H$.

The scaling and permutation ambiguity in the uniqueness definition is a well-known ambiguity that occurs in many blind source separation problems. With this definition of unique NMF, it is possible to make the following two characterizations of the unique NMF.

Theorem 1. *If $r=\operatorname{rank}(R)$, an NMF is unique if and only if the positive orthant is the only $r$-order simplicial cone $\mathcal{Q}$ such that $\operatorname{span}_+(W^T)\subseteq\mathcal{Q}\subseteq\operatorname{span}_+(H)^{\ast}$.*

*Proof. *The proof follows the analysis of the $Q$ matrix above in combination with Lemma 1(b). The theorem can also be proved by following the steps of the proof in [5].

Theorem 2 (see [6]). *The NMF is unique if and only if there is only one $r$-order simplicial cone $\mathcal{Q}$ such that $\operatorname{span}_+(R)\subseteq\mathcal{Q}\subseteq\mathcal{P}$, where $\mathcal{P}$ is the positive orthant.*

*Proof. *The proof follows directly from the definitions.

The first characterization is inspired by [5] and the second characterization is implicitly introduced in [6]. Note that the two characterizations of the unique NMF analyze the problem from two different viewpoints. Theorem 1 takes a known $W$ and $H$ pair as starting point and looks at the solution from the “inside,” that is, the $r$-dimensional space of row vectors in $W$ and column vectors in $H$. Theorem 2 looks at the problem from the “outside,” that is, the $n$-dimensional column space of $R$.

#### 4. Matrix Conditions

If $R=WH$ is unique, then both $W$ and $H$ have to be unique, respectively, that is, there is only one NMF of $W$ and one of $H$, namely, $W=WI$ and $H=IH$. In this section, a necessary condition for $W$ and $H$ is given and a sufficient condition is shown.

The following definition will be shown to be a necessary condition for both the set of row vectors in $W$ and column vectors in $H$.

*Definition 5. *A set of vectors $\mathcal{S}$ in $\mathbb{R}_+^d$ is called *boundary close* if for all $j\neq i$ and $k>0$ there is an element $s\in\mathcal{S}$ such that $s_j<k\,s_i$.

In the case of closed sets, the boundary close condition is that $s_j=0$ and $s_i\neq 0$. In this section, the sets will be finite (and therefore closed), but in Section 7 the general definition above is needed.
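For finite sets, the closed-set form of the condition can be tested directly. The sketch below assumes that the vectors are stored as the rows of an array and that the condition reads: for every ordered pair of distinct coordinates there is a vector that is zero in one coordinate and nonzero in the other. The function name and the example sets are hypothetical.

```python
import numpy as np

def is_boundary_close(S, tol=0.0):
    """Finite-set check: for all i != j some row s has s[j] == 0 and s[i] != 0."""
    S = np.asarray(S, dtype=float)
    d = S.shape[1]
    for i in range(d):
        for j in range(d):
            if i == j:
                continue
            # Look for a vector on the boundary s_j = 0 that is active in coordinate i.
            if not np.any((S[:, j] <= tol) & (S[:, i] > tol)):
                return False
    return True

rows_on_boundary = np.array([[0.0, 1.0, 1.0],
                             [1.0, 0.0, 1.0],
                             [1.0, 1.0, 0.0]])
rows_interior = np.array([[1.0, 2.0],
                          [2.0, 1.0]])
```

The first set touches every coordinate boundary and passes the check; the second set is strictly positive and fails it, so by Theorem 3 a matrix with such row vectors cannot be part of a unique NMF.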

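Theorem 3. *The set of row vectors in $W$ has to be boundary close for the corresponding NMF to be unique.*

*Proof. *If the set of row vectors in $W$ is not boundary close, there exist indexes $j$ and $i$ such that the $j$th element is always more than $k$ times larger than the $i$th element in the row vectors in $W$. Let $\mathcal{Q}=\operatorname{span}_+(q_1,\dots,q_r)$, where $q_n=e_n+k\,\delta_{n,i}\,e_j$, $\delta_{n,i}$ is the Kronecker delta, and $e_n$ denotes the $n$th standard basis vector. This set fulfils the condition $\mathcal{P}\supsetneq\mathcal{Q}\supseteq\operatorname{span}_+(W^T)$, where $\mathcal{P}$ is the positive orthant, and we therefore, using Theorem 1, conclude that the NMF cannot be unique.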

That not only the row vectors of with small elements determine the uniqueness can be seen from the following example.

*Example 2. *The following is an example where is not unique but is.

Let Here is boundary close but not unique since .
The uniqueness of can be verified by plotting the matrix as
shown in Figure 1, and observe that the conditions of Theorem 1 are fulfilled. This completes the example.

In three dimensions, as in Example 2, it is easy to investigate whether a boundary close is unique: if , then can only have two types of structure, either the trivial (desired) solution where or a solution where only the diagonal of is zero. In higher dimensions, the number of combinations of nontrivial solutions increases and it becomes more complicated to investigate all possible nontrivial structures. For example, if is the matrix from Example 2, then the matrix is boundary close and can be decomposed in several ways, for example,

Instead of seeking necessary and sufficient conditions for a unique , a sufficient condition not much stronger than the necessary is given. In this sufficient condition, we only focus on the row vectors of with a zero (or very small) element.

*Definition 6. *A set of vectors in is called *strongly boundary close* if it is boundary close, and there exists a and a numbering of the elements in the vectors such that for all and there are vectors from that fulfil the following:

(1) for all ; and

(2) , where is the “condition number” of the matrix defined as the ratio between the largest and smallest singular values [10, page 81], and is a projection matrix that picks the last element of a vector in .

Theorem 4. *If is strongly boundary close, then is unique.*

The proof is quite technical and is therefore given in the Appendix. The most important thing to notice is that the necessary condition in Theorem 3 and the sufficient conditions in Theorem 4 are very similar. The first item in the strongly boundary close definition states that there have to be several vectors with small value. The second item ensures that the vectors with small value are linearly independent in the last elements.

#### 5. Uniqueness of R

In this section, a condition for unique $R$ is analyzed. First, Example 3 is used to investigate when a strongly boundary close $W$ and $H$ pair is unique. The section ends with a constraint for $W$ and $H$ that results in a unique NMF.

*Example 3. *This is
an investigation of uniqueness of when and are given as where .
Both and are strongly boundary close and the parameter can be calculated as The equation above shows that
small will result in a close to one and an close to one results in a large .
In Figure 2, the matrix is plotted for .
The dashed line is the desired solution and is repeated in all figures. It is
seen that the shaded area is decreasing when increases, and that the solid border increases when increases. For all -values, both the shaded area and the solid
border intersect with the dashed triangle. Therefore, it is not possible to get
another solution by simply increasing/decreasing
the desired solution. The figure shows that the NMF is unique for and not unique for where the alternative solution is shown by a
dotted line. That the NMF is not unique for can also be verified by selecting the to be the symmetric orthonormal
matrix and see that both and are nonnegative. If ,
then the matrix is given by This shows that needs no zeros for the NMF to be unique. This completes the example.
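The verification step mentioned in the example, picking a symmetric orthonormal matrix and checking that both transformed factors stay nonnegative, can be sketched as follows. The matrices used here are assumptions made for illustration and are not claimed to be the ones of the example: `assumed_H` is a hypothetical 3 × 6 matrix with a parameter `alpha`, the left factor is taken as its transpose, and `Q` is one particular symmetric orthonormal matrix.

```python
import numpy as np

def assumed_H(alpha):
    """Hypothetical 3 x 6 nonnegative matrix, chosen only for illustration."""
    return np.array([[alpha, 1.0, alpha, 1.0, 0.0, 0.0],
                     [1.0, alpha, 0.0, 0.0, alpha, 1.0],
                     [0.0, 0.0, 1.0, alpha, 1.0, alpha]])

# Assumed symmetric orthonormal matrix: Q equals its transpose and Q @ Q is the identity.
Q = np.array([[-1.0, 2.0, 2.0],
              [2.0, -1.0, 2.0],
              [2.0, 2.0, -1.0]]) / 3.0

def q_gives_alternative(alpha, tol=1e-12):
    """True if W @ Q and inv(Q) @ H are both nonnegative and reproduce W @ H."""
    H = assumed_H(alpha)
    W = H.T
    W_alt = W @ Q
    H_alt = np.linalg.inv(Q) @ H
    same_product = np.allclose(W_alt @ H_alt, W @ H)
    nonnegative = bool(np.all(W_alt >= -tol) and np.all(H_alt >= -tol))
    return bool(same_product and nonnegative)
```

With these assumed matrices, `q_gives_alternative(0.5)` succeeds, so the factorization is not unique at that parameter value, while `q_gives_alternative(0.3)` fails. A failure only rules out this particular `Q`; it does not by itself establish uniqueness.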

In the example above, $W$ equals $H^T$ and thereby fulfils the same constraints. In many applications, the meaning of $W$ and $H$ differs, for example, in music analysis where the column vectors of $W$ are spectra of notes and $H$ is a note activity matrix [11].

Next, it is investigated how to make an asymmetric uniqueness constraint.

*Definition 7. *A set of vectors $\mathcal{S}$ in $\mathbb{R}_+^d$ is called *sufficiently spread* if for all $j$ and $k>0$, there is an element $s\in\mathcal{S}$ such that $s_j>k\sum_{i\neq j}s_i$.

Note that in the definition for a sufficiently spread set the $j$th element is larger than the sum, in contrast to the strongly boundary close definition, where the corresponding element is smaller than the sum.

Lemma 3. *The dual space of a sufficiently spread set is the positive orthant.*

*Proof. *A sufficiently spread set is nonnegative and the positive orthant is therefore part of the dual set for any sufficiently spread set. Let $v$ be a vector with a negative element in the $j$th element and select $k=\max_i|v_i|/|v_j|$. In any sufficiently spread set, an $s$ exists such that $s_j>k\sum_{i\neq j}s_i$, and therefore $v^Ts=v_js_j+\sum_{i\neq j}v_is_i\le|v_j|\left(-s_j+k\sum_{i\neq j}s_i\right)<0$. The vector $v$ is therefore not in the dual to any sufficiently spread set.

In the case of finite sets, the sufficiently spread condition is the same as the requirement for a scaled version of all the standard basis vectors to be part of the sufficiently spread set. It is easy to verify that a sufficiently spread set also is strongly boundary close and that the parameter is one.
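The finite-set form of the condition stated above translates into a short check. In this sketch the vectors are assumed to be stored as rows, and the function name and example sets are hypothetical.

```python
import numpy as np

def is_sufficiently_spread(S, tol=0.0):
    """Finite-set check: for every coordinate j some row is a positive multiple of e_j."""
    S = np.asarray(S, dtype=float)
    d = S.shape[1]
    for j in range(d):
        others = np.delete(S, j, axis=1)          # all coordinates except j
        is_scaled_basis = (S[:, j] > tol) & np.all(others <= tol, axis=1)
        if not np.any(is_scaled_basis):
            return False
    return True

spread = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 1.0, 1.0]])
not_spread = np.array([[1.0, 1.0, 0.0],
                       [0.0, 1.0, 1.0],
                       [1.0, 0.0, 1.0]])
```

The set `spread` contains a scaled copy of each standard basis vector and passes; `not_spread` touches every coordinate boundary but contains no scaled basis vector, so it fails this check.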

Theorem 5. *If $H$ is sufficiently spread and $W^T$ is strongly boundary close, then the NMF of $R=WH$ is unique.*

*Proof. *Lemma 3 states that the dual set of a sufficiently spread set is the positive orthant, Theorem 4 states that $W=WI$ is unique, and by using (16) and Theorem 1 we conclude that $R=WH$ is unique.

Theorem 5 is a stronger version of the results of Donoho and Stodden [6, Theorem 1]. Theorem 1 in [6] also assumes that $H$ is sufficiently spread, but the condition for $W^T$ is stronger than the strongly boundary close assumption.

#### 6. Perturbation Analysis

In the previous sections, we have analyzed situations with a unique solution. In this section, it is shown that in some situations the nonuniqueness can be seen as estimation noise on $W$ and $H$. The error function that describes how close an estimated $[W',H']$ pair is to the true $[W,H]$ pair is

$$J_{(W,H)}(W',H')=\min_{P,D}\left(\|W-W'(DP)\|_F+\|H-(DP)^{-1}H'\|_F\right),$$

where $P$ is a permutation matrix and $D$ is a diagonal matrix.
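An error measure of this kind, invariant to permutation and scaling, can be approximated for small inner dimensions by brute force. The sketch below is a simplification and not the exact minimization: it enumerates all permutations but fixes the scaling heuristically by matching the column norms of the estimated left factor to those of the true one, so it returns an upper bound on the minimum over all scalings. It assumes nonzero columns; the function name and test matrices are hypothetical.

```python
import itertools
import numpy as np

def factorization_error(W, H, W_est, H_est):
    """Upper bound on the permutation- and scaling-invariant distance between (W, H) and (W_est, H_est)."""
    r = W.shape[1]
    best = np.inf
    for perm in itertools.permutations(range(r)):
        idx = list(perm)
        Wp = W_est[:, idx]                 # permute columns of the estimated W
        Hp = H_est[idx, :]                 # permute rows of the estimated H accordingly
        # Heuristic scaling: give each column of Wp the norm of the matching column of W.
        d = np.linalg.norm(W, axis=0) / np.linalg.norm(Wp, axis=0)
        err = np.linalg.norm(W - Wp * d) + np.linalg.norm(H - Hp / d[:, None])
        best = min(best, err)
    return best

# Hypothetical true pair and an estimate that differs only by a scaled permutation.
W = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]])
Q = np.array([[0.0, 2.0], [0.5, 0.0]])
err_equivalent = factorization_error(W, H, W @ Q, np.linalg.inv(Q) @ H)
err_perturbed = factorization_error(W, H, W + 0.1, H)
```

An estimate that differs from the true pair only by a scaled permutation gets an error of essentially zero, while a genuinely different estimate gets a positive value.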

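Theorem 6. *Let $R=WH$ be a unique NMF. Given some $\epsilon>0$, there exists a $\delta>0$ such that any nonnegative $V=R+N$, where $\|N\|_F<\delta$, fulfils $J_{(W,H)}(W',H')<\epsilon$, where $J$ is the error function defined at the start of this section and $[W',H']=\arg\min_{W'\ge 0,\,H'\ge 0}\|V-W'H'\|_F$.*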

The proof is given in the appendix. The theorem states that if the observation is corrupted by additive noise, then it will result in noisy estimation of $W$ and $H$. Moreover, Theorem 6 shows that if the noise is small, it will result in small estimation errors. In this section, the Frobenius norm is used in (17) and (19) to make Theorem 6 concrete. Theorem 6 is also valid with the same proof if any continuous metric is used instead of the Frobenius norm in those equations.

*Example 4. *This example investigates the connection between the additive noise in $V$ and the estimation error on $W$ and $H$. The column vectors in $W$ are basis pictures of a man, a dog, and the sun as shown in Figures 3(a), 3(b), and 3(c). In Figure 3(d), the sum of the three basis pictures is shown. The matrix $H$ is the set of all combinations of the pictures. Theorem 5 can be used to conclude that the NMF of $R=WH$ is unique because both $W^T$ and $H$ are sufficiently spread and thereby also strongly boundary close.

In the example, two different noise matrices, $N_N$ and $N_M$, are used. The matrix $N_N$ models noisy observation and has elements that are random uniform i.i.d. The matrix $N_M$ contains elements that are minus one in the positions where $R$ has elements that are two and zero elsewhere, that is, $N_M$ is minus one in the positions where the dog and the man are overlapping. In this case, the error matrix simulates a model mismatch that occurs in the following two types of real-world data. If the data set is composed of pictures, the basis pictures will be overlapping and a pixel in $V$ will consist of one basis picture and not a mixture of the overlapping pictures. If the data is a set of amplitude spectra, the true model is an addition of complex values and not an addition of the amplitudes.

The estimation error of the factorization is plotted in Figure 4 as a function of the norm of the error matrix, $\|N\|_F$. An estimate of the $[W',H']$ pair is calculated by using the iterative algorithm for Frobenius norm minimization by Lee and Seung [2]. The algorithm is run for 500 iterations and is started from 100 different positions. The decomposition that minimizes $\|V-W'H'\|_F$ is chosen, and the error function $J_{(W,H)}(W',H')$ is calculated numerically. Figure 4 shows that when the added error is small, it is possible to estimate the underlying parameters. When the norm of the added noise matrix increases, the behavior of the two noise matrices, $N_N$ and $N_M$, differs. For $N_N$, the error of the estimate increases slowly with the norm of the added matrix, while the estimation error for $N_M$ increases dramatically once the norm exceeds a certain threshold. In the simulation, we have made the following observation that can explain the difference in the performance of the two types of noise. When $N_N$ is used, the basis pictures remain noisy versions of the man, the dog, and the sun. When $N_M$ is used and the norm is above that threshold, the basis pictures are the man excluding the overlap, the dog excluding the overlap, and the overlap of man and dog. Another way to describe the difference is that the rank of $N_M$ is one and the disturbance is in one dimension, whereas $N_N$ is full rank and the disturbance is in many dimensions. This completes the example.

Corollary 1. *Let be a unique NMF and , where and . Given and there exists a such that if the largest absolute value of both and is smaller than , then*

*where are any NMF of .*

*Proof. *This follows directly from Theorem 6.

The corollary can be used in situations where there are small elements in $W$ and $H$ but no (or not enough) zero elements, as in the following example.

*Example 5. *Let ,
where is generated as in Example 3. Let all elements
in both and be equal to .
In Figure 5, is plotted when and .
In this example, neither the shaded area nor the solid border intersect with
the desired solution. Therefore, it is possible to get other solutions by
simply increasing/decreasing the desired solution. For ,
the corners of the solutions are close to the corners of the desired solution.
When ,
the corners can be placed mostly on the solid
border and still form a triangle that contains the shaded area. When ,
the corners can be anywhere on the solid border. This completes the example.

#### 7. Probability and Uniqueness

In this section, the row vectors of $W$ and the column vectors of $H$ are seen as results of two random variables. Characteristics of the sample space (the possible outcomes) of a random variable that lead to unique NMF will be investigated.

Theorem 7. *Let the
row vectors of be generated by the random variable and let the column
vectors of be generated by a random variable .
If the sample space of is strongly boundary close and the sample
space of is sufficiently spread, then for all and ,
there exist and such that ** where is any matrix such that and are nonnegative and the data size is such that and .*

*Proof. *If the data is scaled, ,
it does not change the nonuniqueness of the solutions when measured by the matrix. The proof is therefore done on the
normalized versions of and .
Let and be the normalized version of and .
There exist finite sets and of vectors in the closure of and that are strongly boundary close and
sufficiently spread. By Theorem 5, it is known that is unique. By increasing the number
of vectors sampled from and ,
for any ,
there will be two subsets of the vectors, and ,
that with a probability larger than any will fulfil It is possible to use Corollary 1 on this subset. The fact that limiting is equivalent to limiting (21) when the
vectors are normalized concludes the proof.

*Example 6. *Let all the elements in be exponential i.i.d. and therefore generated
with a sufficiently spread sample space. Additionally, let each row in be exponential i.i.d. plus a random vector with
the sample space and thereby strongly boundary close. In Figure 6, the above variables are shown for the following
four matrix sizes . This completes the example.

#### 8. Discussion

The approach in this paper is to investigate when nonnegativity leads to uniqueness in connection with NMF, $R=WH$. Nonnegativity is the only assumption for the theorems, and the theorems therefore cannot be used as an argument for an NMF to be nonunique if there is additional information about $W$ or $H$. An example with stronger uniqueness results is the sparse NMF algorithm of Hoyer [12], built on the assumption that the row vectors in $H$ have known ratios between the $L_1$ norm and the $L_2$ norm. Theis et al. [13] have investigated uniqueness in this situation and shown strong uniqueness results. Another example is data matrices with an added constant on each row. For this situation, the affine NMF algorithm [14] can make NMF unique even though the setup violates Theorem 3 in this paper.

As shown in Figure 4, the type of noise greatly influences the error curves. In applications where noise is introduced because the additive model does not hold as, for example, when $V$ is pictures or spectra, it is possible to influence the noise by applying a nonlinear function to the elements of $V$. Such a nonlinear function is introduced in [15] and experiments show that it improves the results. A theoretical framework for finding good nonlinear functions will be interesting to investigate.

The sufficiently spread condition defined in Section 5 has an important role for unique NMF due to Lemma 3. The sufficiently spread assumption is seen indirectly in related areas where it also leads to unique solutions, for example, in [7] where the groundedness assumption leads to variables with a sufficiently spread sample space. If the $H$ matrix is sufficiently spread, then the columns in $W$ will occur (almost) alone as columns in $V$. Deville [16] uses the “occur alone” assumption, and thereby the sufficiently spread assumption, to make blind source separation possible.

#### 9. Conclusion

We have investigated the uniqueness of NMF from three different viewpoints as follows:

(i) uniqueness in noise-free situations;

(ii) the estimation error of the underlying model when a matrix with unique NMF is corrupted by additive noise; and

(iii) the random processes that lead to matrices where the underlying model can be estimated with small errors.

By doing this, we have shown that it is possible to make many novel and useful characterizations that can be used as theoretical underpinning for using the numerous NMF algorithms. Several open issues can be found in all the three viewpoints that, if addressed, will give a better understanding of nonnegative matrix factorization.

#### Appendix

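*Proof of Theorem 4. *The theorem states that $W=WI$ is a unique NMF. To prove this, it is shown that the condition for Theorem 1 is fulfilled. The positive orthant is self-dual ($\mathcal{P}=\mathcal{P}^{\ast}$) and thereby $\operatorname{span}_+(W^T)\subseteq\mathcal{Q}\subseteq\mathcal{P}$, where $\mathcal{Q}$ is an $r$-order simplicial cone that contains $\operatorname{span}_+(W^T)$.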
Let the set of row vectors in be denoted by . An -order simplicial cone, like ,
is a closed set and it therefore needs to contain the closure of denoted by . The two items in Definition 6 of strongly boundary close can be reformulated
for that contains the border:

(1) for all ,(2)the vectors are linearly independent. The rest of the
proof follows by induction. If ,
then and is therefore unique. Let therefore .
Then linearly independent vectors in have zero as the first element, and of the basis vectors therefore need to have
zero in the first element. In other words, there is only one basis vector with
a nonzero first element. Let us call this vector .
For all there is a vector in which is nonnegative in the first element and
zero in the th element, so all the elements in except the first have to be zero. The proof is
completed by seeing that if the first element is removed from the vectors in ,
it is still strongly boundary close and the problem is therefore the dimensional problem.

*Proof of Theorem 6. *Let be the open set of all pairs that are close to and , Let be the set of all nonnegative pairs that are not in and where .
The uniqueness of ensures that for all .
The fact that the Frobenius norm is continuous, is a closed bounded set, and the statement
above is positive ensures that since a continuous function
attains its limits on a closed bounded set [17, Theorem 4.28]. The pairs that are not in and where can either be transformed by a diagonal matrix
into a matrix pair from , ,
having the same product or it can be transformed into a pair where
both and have large elements, that is, and thereby .

Select to be The error of the desired solution can be bounded by .
Let be any matrix constructed by a nonnegative
matrix pair not from .
Because of the way is selected, .
By the triangle inequality, we get All solutions that are not in therefore have a larger error than and will not be the minimizer of the error.

#### Acknowledgments

This research was supported by the Intelligent Sound project, Danish Technical Research Council Grant no. 26-02-0092. The work of M. G. Christensen is supported by the Parametric Audio Processing project, Danish Research Council for Technology, and Production Sciences Grant no. 274-06-0521. Part of this work was previously presented at a conference [18].

#### References

1. D. D. Lee and H. S. Seung, “Learning the parts of objects by non-negative matrix factorization,” *Nature*, vol. 401, no. 6755, pp. 788–791, 1999.
2. D. D. Lee and H. S. Seung, “Algorithms for non-negative matrix factorization,” in *Advances in Neural Information Processing Systems 13*, pp. 556–562, MIT Press, Cambridge, Mass, USA, 2000.
3. M. W. Berry, M. Browne, A. N. Langville, V. P. Pauca, and R. J. Plemmons, “Algorithms and applications for approximate nonnegative matrix factorization,” *Computational Statistics & Data Analysis*, vol. 52, no. 1, pp. 155–173, 2007.
4. A. Berman, “Problem 73-14, rank factorization of nonnegative matrices,” *SIAM Review*, vol. 15, no. 3, p. 655, 1973.
5. L. Thomas, “Solution to problem 73-14, rank factorizations of nonnegative matrices,” *SIAM Review*, vol. 16, no. 3, pp. 393–394, 1974.
6. D. Donoho and V. Stodden, “When does non-negative matrix factorization give a correct decomposition into parts?” in *Advances in Neural Information Processing Systems 16*, pp. 1141–1148, MIT Press, Cambridge, Mass, USA, 2004.
7. M. Plumbley, “Conditions for nonnegative independent component analysis,” *IEEE Signal Processing Letters*, vol. 9, no. 6, pp. 177–180, 2002.
8. R. T. Rockafellar, *Convex Analysis*, Princeton University Press, Princeton, NJ, USA, 1st edition, 1970.
9. H. Minc, *Nonnegative Matrices*, John Wiley & Sons, New York, NY, USA, 1st edition, 1988.
10. G. H. Golub and C. F. Van Loan, *Matrix Computations*, Johns Hopkins University Press, Baltimore, Md, USA, 3rd edition, 1996.
11. P. Smaragdis and J. Brown, “Non-negative matrix factorization for polyphonic music transcription,” in *Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '03)*, pp. 177–180, New Paltz, NY, USA, October 2003.
12. P. O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” *The Journal of Machine Learning Research*, vol. 5, pp. 1457–1469, 2004.
13. F. Theis, K. Stadlthanner, and T. Tanaka, “First results on uniqueness of sparse non-negative matrix factorization,” in *Proceedings of the 13th European Signal Processing Conference (EUSIPCO '05)*, Antalya, Turkey, September 2005.
14. H. Laurberg and L. K. Hansen, “On affine non-negative matrix factorization,” in *Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07)*, vol. 2, pp. 653–656, Honolulu, Hawaii, USA, April 2007.
15. M. N. Schmidt, J. Larsen, and F.-T. Hsiao, “Wind noise reduction using non-negative sparse coding,” in *Proceedings of the IEEE Workshop on Machine Learning for Signal Processing (MLSP '07)*, pp. 431–436, Thessaloniki, Greece, August 2007.
16. Y. Deville, “Temporal and time-frequency correlation-based blind source separation methods,” in *Proceedings of the 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA '03)*, pp. 1059–1064, Nara, Japan, April 2003.
17. T. M. Apostol, *Mathematical Analysis*, Addison-Wesley, Reading, Mass, USA, 2nd edition, 1974.
18. H. Laurberg, “Uniqueness of non-negative matrix factorization,” in *Proceedings of the 14th IEEE/SP Workshop on Statistical Signal Processing (SSP '07)*, pp. 44–48, Madison, Wis, USA, August 2007.