A Tale of Two Matrix Factorizations

In statistical practice, rectangular tables of numeric data are commonplace, and are often analyzed using dimension-reduction methods like the singular value decomposition and its close cousin, principal component analysis (PCA). This analysis produces score and loading matrices representing the rows and the columns of the original table, and these matrices may be used both for prediction and to gain structural understanding of the data. In some tables, the data entries are necessarily nonnegative (apart, perhaps, from some small random noise), and so the matrix factors meant to represent them should arguably also contain only nonnegative elements. This thinking, and the desire for parsimony, underlies such techniques as rotating factors in a search for “simple structure.” These attempts to transform score or loading matrices of mixed sign into nonnegative, parsimonious forms are, however, indirect and at best imperfect. The recent development of nonnegative matrix factorization, or NMF, is an attractive alternative. Rather than attempt to transform a loading or score matrix of mixed signs into one with only nonnegative elements, it directly seeks matrix factors containing only nonnegative elements. The resulting factorization often leads to substantial improvements in the interpretability of the factors. We illustrate this potential with synthetic examples and a real dataset. The question of exactly when NMF is effective is not fully resolved, but we give some indicators of its domain of success. We also point out that the NMF factors can be used in much the same way as those coming from PCA for such tasks as ordination, clustering, and prediction. Supplementary materials for this article are available online.
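
The contrast drawn above can be made concrete with a small sketch. It is not the article's own code or data: the synthetic table, the helper name `nmf_multiplicative`, the rank, and the iteration count are all assumptions chosen for illustration, and the fitting rule is the standard least-squares multiplicative update scheme, used here only because it is short. The sketch builds a nonnegative rank-2 table, factors it by the SVD (whose score and loading matrices contain entries of mixed sign), and then fits factors that are nonnegative by construction.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=500, seed=0, eps=1e-9):
    """Hypothetical helper: approximate a nonnegative X (n x p) by W @ H
    with W >= 0 and H >= 0, using least-squares multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Strictly positive random starting values keep the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, p)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never flip.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative table with exact rank 2 (an assumption for the demo).
rng = np.random.default_rng(1)
W_true = rng.random((30, 2))
H_true = rng.random((2, 8))
X = W_true @ H_true

# SVD route: scores and loadings of mixed sign.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :2] * s[:2]
loadings = Vt[:2]

# NMF route: factors that are nonnegative by construction.
W, H = nmf_multiplicative(X, rank=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

print("SVD scores contain negative entries:", bool((scores < 0).any()))
print("NMF factors are nonnegative:", bool((W >= 0).all() and (H >= 0).all()))
print("NMF relative reconstruction error:", rel_err)
```

The rows of `W` can then be plotted or clustered in the same way as PCA scores. The factorization is not unique and depends on the starting values, so in practice several random starts would usually be compared.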
