A Novel Deep Learning Model by Stacking Conditional Restricted Boltzmann Machine and Deep Neural Network

A real-world system often exhibits complex dynamics arising from interactions among its subunits. In machine learning and data mining, these interactions are usually formulated as dependencies and correlations among system variables. Just as Convolutional Neural Networks handle spatially correlated features and Recurrent Neural Networks handle temporally correlated features, in this paper we present a novel deep learning model that tackles functionally interacting features by stacking a Conditional Restricted Boltzmann Machine and a Deep Neural Network (CRBM-DNN). Variables and their dependency relationships are organized into a bipartite graph, which is then converted into a Restricted Boltzmann Machine conditioned on domain knowledge. We integrate this CRBM and a DNN into one deep learning model constrained by one overall cost function. CRBM-DNN can solve both supervised and unsupervised learning problems. Compared to a regular neural network of the same size, CRBM-DNN has fewer parameters and therefore requires fewer training samples. We perform extensive comparative studies against a large number of supervised and unsupervised learning methods on several challenging real-world datasets, and achieve significantly superior performance.
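The parameter saving described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the bipartite graph can be encoded as a 0/1 mask over the visible-to-hidden weight matrix, so that only connections supported by domain knowledge carry a free parameter. The names `build_mask` and `masked_hidden_probs` and the toy feature groups are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_mask(n_visible, groups):
    """Encode a bipartite graph as a 0/1 matrix of shape (n_visible, n_hidden).

    groups[j] lists the visible-variable indices linked to hidden unit j
    (a hypothetical stand-in for domain knowledge such as functional modules).
    """
    mask = np.zeros((n_visible, len(groups)))
    for j, members in enumerate(groups):
        mask[members, j] = 1.0
    return mask

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def masked_hidden_probs(v, W, b, mask):
    """Hidden activation probabilities using only the edges allowed by the mask."""
    return sigmoid(v @ (W * mask) + b)

rng = np.random.default_rng(0)
n_visible = 6
groups = [[0, 1, 2], [2, 3], [4, 5]]   # toy dependency groups (assumed)
mask = build_mask(n_visible, groups)

W = rng.normal(scale=0.1, size=mask.shape)  # weights; masked entries are never used
b = np.zeros(len(groups))                   # hidden biases
v = rng.normal(size=(4, n_visible))         # a batch of 4 toy samples

h = masked_hidden_probs(v, W, b, mask)

n_free = int(mask.sum())   # weights that actually take part in the model
n_dense = mask.size        # weights of a fully connected layer of the same shape
print(h.shape, n_free, n_dense)
```

In this toy setting the masked layer uses 7 weights where a fully connected layer of the same shape would use 18, which is the kind of reduction the abstract attributes to CRBM-DNN. In a stacked model, the output `h` would be passed on to the DNN layers.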
