Multiclass Nonnegative Matrix Factorization for Comprehensive Feature Pattern Discovery

In the big data era, interpretable machine learning models are in strong demand for the comprehensive analysis of large-scale multiclass data. Characterizing all features in such data is a key but challenging step toward understanding its complexity, and existing feature selection methods do not meet this need. To address this problem, we propose a Bayesian multiclass nonnegative matrix factorization (MC-NMF) model with structured sparsity that can discover both ubiquitous and class-specific features. We derive variational update rules for efficient decomposition. To reduce the need for model selection and to describe feature patterns stably, we further propose MC-NMF with stability selection, an ensemble method that collectively detects feature patterns from many runs of MC-NMF using different hyperparameter values and training subsets. We assessed our models on both simulated count data and multitumor RNA-seq data. The experiments showed that our models recovered the predefined feature patterns from the simulated data and identified biologically meaningful patterns from the pan-cancer data.
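The stability-selection idea described above (many runs over different hyperparameter values and training subsets, then pooling the selected features) can be illustrated with a minimal sketch. This is not the Bayesian MC-NMF model or its variational updates: as a stand-in it uses a plain L1-penalized NMF with multiplicative updates under squared error. The function names `sparse_nmf` and `selection_frequencies`, the penalty grid, the subsampling fraction, and the relative threshold `rel_tol` used to call a feature "selected" are all assumptions made for illustration.

```python
import numpy as np

def sparse_nmf(X, k, l1, n_iter=200, seed=0):
    """Approximate X (features x samples) by W @ H with an L1 penalty on W.

    Stand-in factorizer (multiplicative updates, squared error);
    it is NOT the Bayesian MC-NMF model from the paper.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + 1e-3
    H = rng.random((k, X.shape[1])) + 1e-3
    eps = 1e-10
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Fix the scale ambiguity: unit-norm rows of H, scale moved into W,
        # so that the penalty on W cannot be undone by inflating H.
        scale = np.linalg.norm(H, axis=1) + eps
        H /= scale[:, None]
        W *= scale[None, :]
        # The L1 penalty enters the denominator and shrinks weak loadings.
        W *= (X @ H.T) / (W @ (H @ H.T) + l1 + eps)
    return W, H

def selection_frequencies(X, k, l1_grid, n_runs=20, subsample=0.5,
                          rel_tol=0.05, seed=0):
    """Fraction of runs in which each feature has a non-negligible loading.

    Each run draws a random sample subset and a random penalty from l1_grid
    (both choices are assumptions of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    m = max(k, int(subsample * n_samples))
    counts = np.zeros(n_features)
    for run in range(n_runs):
        cols = rng.choice(n_samples, size=m, replace=False)
        l1 = rng.choice(l1_grid)
        W, _ = sparse_nmf(X[:, cols], k, l1, seed=run)
        strength = W.max(axis=1)  # strongest loading of each feature
        counts += strength > rel_tol * strength.max()
    return counts / n_runs
```

Features whose frequency exceeds a chosen cutoff (for example 0.8, again an assumption) would be reported as stable. Extending the sketch to class-specific patterns would require per-class factor blocks, which this illustration does not attempt.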
