Nonsmooth nonnegative matrix factorization (nsNMF)

We propose a novel nonnegative matrix factorization model that aims at finding localized, part-based representations of nonnegative multivariate data items. Unlike the classical nonnegative matrix factorization (NMF) technique, this new model, denoted "nonsmooth nonnegative matrix factorization" (nsNMF), corresponds to the optimization of an unambiguous cost function designed to explicitly represent sparseness, in the form of nonsmoothness, which is controlled by a single parameter. In general, this method produces a set of basis and encoding vectors that not only represent the original data but also extract highly focalized patterns, which generally lend themselves to improved interpretability. The properties of this new method are illustrated with several data sets. Comparisons to previously published methods show that the new nsNMF method has some advantages: it remains faithful to the data while achieving a high degree of sparseness for both the estimated basis and the encoding vectors, and it yields more interpretable factors.
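To make the role of the single nonsmoothness parameter concrete, here is a minimal Python sketch. It rests on assumptions the abstract does not state: the factorization is taken as V ≈ W S H with smoothing matrix S = (1 - θ) I + (θ/q) 1 1ᵀ, the form commonly given for nsNMF, and the fit uses squared-error multiplicative updates in the style of Lee and Seung, which may differ from the cost function the authors optimize. The names `smoothing_matrix`, `nsnmf`, `theta`, and `q` are illustrative only.

```python
import numpy as np

def smoothing_matrix(q, theta):
    """Assumed form: S = (1 - theta) * I + (theta / q) * ones, with 0 <= theta <= 1."""
    return (1.0 - theta) * np.eye(q) + (theta / q) * np.ones((q, q))

def nsnmf(V, q, theta=0.5, n_iter=200, eps=1e-9, seed=0):
    """Sketch of V ~ W @ S @ H using squared-error multiplicative updates.

    The update rule is an assumption, not taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, q)) + eps   # nonnegative basis vectors (columns)
    H = rng.random((q, m)) + eps   # nonnegative encoding vectors (columns)
    S = smoothing_matrix(q, theta)
    for _ in range(n_iter):
        # Update H as in plain NMF, with the basis replaced by W @ S.
        WS = W @ S
        H *= (WS.T @ V) / (WS.T @ WS @ H + eps)
        # Update W as in plain NMF, with the encodings replaced by S @ H.
        SH = S @ H
        W *= (V @ SH.T) / (W @ SH @ SH.T + eps)
    return W, S, H

if __name__ == "__main__":
    V = np.random.default_rng(1).random((20, 15))
    W, S, H = nsnmf(V, q=3, theta=0.5)
    print("reconstruction error:", np.linalg.norm(V - W @ S @ H))
```

Under these assumptions, `theta = 0` makes S the identity and the model reduces to plain NMF, while values closer to 1 make S average more strongly across components, so W and H have to become sparser to keep reproducing the data.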
