Artificial intelligence in neurodegenerative disease research: use of IBM Watson to identify additional RNA-binding proteins

Amyotrophic lateral sclerosis (ALS) is a devastating neurodegenerative disease with no effective treatments. Numerous RNA-binding proteins (RBPs) have been shown to be altered in ALS: mutations in 11 RBPs cause familial forms of the disease, and 6 more RBPs show abnormal expression or distribution in ALS without any known mutations. RBP dysregulation is widely accepted as a contributing factor in ALS pathobiology. Because the human genome encodes at least 1542 RBPs, other, as yet unidentified RBPs may also be linked to the pathogenesis of ALS. We used IBM Watson® to sift through all RBPs in the genome and identify new RBPs linked to ALS (ALS-RBPs). IBM Watson extracts features from published literature to compute semantic similarities and identify new connections between entities of interest. Here, it analyzed all published abstracts of previously known ALS-RBPs and applied that text-based knowledge to all RBPs in the genome, ranking them by semantic similarity to the known set. We then validated the ten top-ranked RBPs at the protein and RNA levels in tissues from ALS and non-neurological disease controls, as well as in patient-derived induced pluripotent stem cells. Five RBPs previously unlinked to ALS (hnRNPU, Syncrip, RBMS3, Caprin-1 and NUPL2) showed significant alterations in ALS compared to controls. Overall, we successfully used IBM Watson to help identify additional RBPs altered in ALS, highlighting the use of artificial intelligence tools to accelerate scientific discovery in ALS and possibly other complex neurological disorders.
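
The ranking step described above (score every candidate by how closely its literature resembles that of a known set) can be illustrated with a toy sketch. This is not IBM Watson's method, which the abstract does not specify. It swaps in a plain TF-IDF bag-of-words model with cosine similarity to the centroid of the known set. The function names, the entity labels (`KNOWN_A`, `CAND_X`, and so on) and the short texts are all hypothetical placeholders, not real abstracts or results.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, strip basic punctuation, and drop very short tokens."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return [w for w in cleaned.split() if len(w) > 2]


def tfidf_vectors(docs):
    """Map each entity name to a sparse {term: tf-idf weight} vector."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    doc_freq = Counter(t for toks in tokens.values() for t in set(toks))
    vectors = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        total = max(len(toks), 1)  # guard against empty documents
        vectors[name] = {
            t: (c / total) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(docs, known):
    """Rank every entity outside `known` by similarity to the known-set centroid."""
    vectors = tfidf_vectors(docs)
    centroid = Counter()
    for name in known:
        for term, weight in vectors[name].items():
            centroid[term] += weight / len(known)
    scores = [
        (name, cosine(vec, centroid))
        for name, vec in vectors.items()
        if name not in known
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder texts standing in for pooled abstracts per entity.
toy_docs = {
    "KNOWN_A": "rna binding protein stress granule aggregation motor neuron",
    "KNOWN_B": "rna splicing stress granule cytoplasmic inclusion motor neuron",
    "CAND_X": "stress granule assembly rna binding neuron",
    "CAND_Y": "membrane lipid transport enzyme liver",
}

for name, score in rank_candidates(toy_docs, known={"KNOWN_A", "KNOWN_B"}):
    print(f"{name}\t{score:.3f}")
```

In this toy corpus `CAND_X` shares vocabulary with the known set and ranks first, while `CAND_Y` shares none and scores zero. A ranking of this kind only prioritizes candidates. As in the study, top-ranked entries still need experimental validation.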
