Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach

Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few are specifically designed to exploit both the deep intrinsic statistical properties of each input modality and the complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called a multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among the inherent features of each single modality are first encoded into multiple layers of hidden variables, and a joint latent model is then employed to fuse the common features derived from the multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features that captures both intra- and cross-modality correlations, and can identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among these key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients.
These results indicate that our multimodal DBN-based data analysis approach may have practical applications in cancer pathogenesis studies and may provide useful guidelines for personalized cancer therapy.
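
The architecture described above (a stack of hidden layers per modality, followed by a joint layer that fuses them, all trained with contrastive divergence) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary units throughout, inputs already scaled to [0, 1], one-step CD (CD-1) with full-batch updates, and greedy layer-wise training. The names `RBM`, `train_stack`, and `multimodal_features`, the layer sizes, and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step down to the visibles and back up.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # CD-1 update: data statistics minus reconstruction statistics.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)

    def fit(self, data, epochs=20, lr=0.1):
        for _ in range(epochs):
            self.cd1_step(data, lr)
        return self


def train_stack(x, layer_sizes, rng):
    """Greedy layer-wise training: each RBM sees the hidden probabilities of the one below."""
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng).fit(x)
        x = rbm.hidden_probs(x)
    return x


def multimodal_features(modalities, layer_sizes_per_modality, joint_size, seed=0):
    """Encode each modality separately, then fuse the top-level features in a joint RBM."""
    rng = np.random.default_rng(seed)
    tops = [train_stack(x, sizes, rng)
            for x, sizes in zip(modalities, layer_sizes_per_modality)]
    fused_input = np.hstack(tops)
    joint = RBM(fused_input.shape[1], joint_size, rng).fit(fused_input)
    return joint.hidden_probs(fused_input)


# Toy usage with random binary "modalities" for 30 samples (synthetic data, not real measurements).
demo_rng = np.random.default_rng(1)
expression = (demo_rng.random((30, 12)) > 0.5).astype(float)
mirna = (demo_rng.random((30, 8)) > 0.5).astype(float)
methylation = (demo_rng.random((30, 10)) > 0.5).astype(float)

features = multimodal_features(
    [expression, mirna, methylation],
    [[8, 4], [6, 3], [8, 4]],
    joint_size=5,
)
print(features.shape)  # (30, 5): one fused feature vector per sample
```

Each row of `features` is a unified representation of one sample, which could then be passed to a clustering algorithm of choice to group patients. Real expression or methylation measurements are continuous, so a practical model would need a suitable preprocessing step or a different visible-unit type than the binary units assumed here.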
