Use of deep neural network ensembles to identify embryonic-fetal transition markers: repression of COX7A1 in embryonic and cancer cells

Here we present the application of deep neural network (DNN) ensembles trained on transcriptomic data to identify novel markers associated with the mammalian embryonic-fetal transition (EFT). Molecular markers of this process could provide important insights into the regulatory mechanisms of normal development, epimorphic tissue regeneration and cancer. Subsequent analysis of the most significant genes behind the DNN classifier on an independent dataset of adult-derived and human embryonic stem cell (hESC)-derived progenitor cell lines identified the COX7A1 gene as a potential EFT marker. COX7A1, which encodes a cytochrome c oxidase subunit, was up-regulated in post-EFT murine and human cells, including adult stem cells, but was not expressed in pre-EFT pluripotent embryonic stem cells or their in vitro-derived progeny. COX7A1 expression was undetectable or low in multiple sarcoma and carcinoma cell lines compared with normal controls. Knockout of the gene in mice led to a marked glycolytic shift reminiscent of the Warburg effect that occurs in cancer cells. The DNN approach facilitated the elucidation of a potentially new biomarker of cancer and pre-EFT cells, the embryo-onco phenotype, which may be used as a target for controlling the embryonic-fetal transition.
