Versatile Sparse Matrix Factorization and Its Applications in High-Dimensional Biological Data Analysis

Non-negative matrix factorization and sparse representation models have been successfully applied in high-throughput biological data analysis. In this paper, we propose a versatile sparse matrix factorization (VSMF) model for biological data mining. We show that many well-known sparse models are special cases of VSMF. By tuning its parameters, sparsity, smoothness, and non-negativity can each be controlled in VSMF. Our computational experiments corroborate the advantages of VSMF.
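
The abstract does not state the VSMF objective, so the following is only a minimal sketch of how tuning parameters could control sparsity, smoothness, and non-negativity in a factorization X ≈ AY. It assumes an elastic-net-style penalty (L1 terms for sparsity, squared L2 terms for smoothness) and optional sign constraints. The function names, the parameter names `alpha1`, `alpha2`, `lam1`, `lam2`, and the flags `nonneg_A`, `nonneg_Y` are hypothetical and are not taken from the paper.

```python
import numpy as np


def vsmf_objective(X, A, Y, alpha1=0.0, alpha2=0.0, lam1=0.0, lam2=0.0):
    """Assumed elastic-net-style objective for X ~ A @ Y (illustrative only).

    alpha1 / lam1 weight the L1 norms of A / Y (sparsity);
    alpha2 / lam2 weight the squared L2 norms of A / Y (smoothness).
    """
    # Reconstruction error: half the squared Frobenius norm of the residual.
    fit = 0.5 * np.linalg.norm(X - A @ Y, "fro") ** 2
    penalty_A = alpha1 * np.abs(A).sum() + 0.5 * alpha2 * (A ** 2).sum()
    penalty_Y = lam1 * np.abs(Y).sum() + 0.5 * lam2 * (Y ** 2).sum()
    return float(fit + penalty_A + penalty_Y)


def is_feasible(A, Y, nonneg_A=False, nonneg_Y=False):
    """Check the optional non-negativity constraints on each factor."""
    ok_A = (not nonneg_A) or bool((A >= 0).all())
    ok_Y = (not nonneg_Y) or bool((Y >= 0).all())
    return ok_A and ok_Y


# Small usage example with an exact factorization.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = np.array([[1.0, -2.0], [0.0, 3.0]])
X = A @ Y
unpenalized = vsmf_objective(X, A, Y)            # fit term only
sparse_Y = vsmf_objective(X, A, Y, lam1=1.0)     # adds the L1 norm of Y
```

Under these assumptions, setting every penalty weight to zero and both flags to true would correspond to plain non-negative matrix factorization, while positive L1 weights would give sparse variants. Whether this matches the paper's exact formulation is not confirmed by the abstract.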
