Exploring the druggable space around the Fanconi anemia pathway using machine learning and mechanistic models

Background: Despite the abundance of genomic data, predictive models that describe phenotypes as a function of gene expression or mutations are difficult to obtain because they suffer from the curse of dimensionality: candidate genes far outnumber the available samples. The problem is especially acute when samples are hard to obtain, as is the case for rare diseases.

Results: We applied multi-output regression machine learning methods to predict the potential effect of external proteins on the signaling circuits that trigger Fanconi anemia-related cell functionalities, whose activity was inferred with a mechanistic model. This allowed us to detect over 20 potential therapeutic targets.

Conclusions: Using artificial intelligence methods to predict potentially causal relationships between proteins of interest and cell activities linked to disease phenotypes opens promising avenues for the systematic search for new targets in rare diseases.
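As an illustration only, the multi-output regression idea described in the Results can be sketched as follows. The abstract does not name the algorithm, so the random forest regressor used here is an assumption, and the data are synthetic: `X` stands in for the expression of candidate external genes and `Y` for circuit activities that, in the actual study, would come from the mechanistic model. All variable names and sizes are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical sizes: few samples relative to candidate genes is the typical setting.
n_samples, n_genes, n_circuits = 200, 30, 4

# Synthetic stand-in for expression of candidate "external" genes (features).
X = rng.normal(size=(n_samples, n_genes))

# Synthetic stand-in for circuit activities (targets). By construction,
# only genes 0 and 1 influence the circuits; every other gene is noise.
W = np.zeros((n_genes, n_circuits))
W[0, :] = 2.0
W[1, :] = -1.5
Y = X @ W + 0.1 * rng.normal(size=(n_samples, n_circuits))

# One model predicts all circuit activities at once (multi-output regression).
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, Y)

# Rank genes by impurity-based importance; high-ranking genes are the
# candidates whose expression best predicts the circuit activities.
importances = model.feature_importances_
ranking = np.argsort(importances)[::-1]
top_genes = ranking[:2].tolist()
print("Top-ranked candidate genes:", top_genes)
```

In this toy setting the two genes that drive the synthetic targets should rank first. A high importance indicates predictive relevance, not causality, so any candidate flagged this way would still need independent validation.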
