NeuroRDF: semantic integration of highly curated data to prioritize biomarker candidates in Alzheimer's disease

Background: Neurodegenerative diseases are incurable and debilitating indications with huge social and economic impact, and much remains to be learnt about the underlying molecular events. Mechanistic disease models could offer a knowledge framework to help decipher the complex interactions that occur at the molecular and cellular levels. This motivates an approach that integrates highly curated, heterogeneous data into a disease model spanning different regulatory data layers. Although several disease models exist, they often do not consider the quality of the underlying data. Moreover, even with current advancements in semantic web technology, we still do not have a cure for complex diseases such as Alzheimer’s disease. One key reason for this could be the increasing gap between generated data and derived knowledge.

Results: In this paper, we describe an approach, called NeuroRDF, to develop an integrative framework for modeling curated knowledge in the area of complex neurodegenerative diseases. The core of this strategy is the use of well-curated, context-specific data for integration into a single semantic web-based framework, the Resource Description Framework (RDF). This increases the probability that the derived knowledge is novel and reliable in a specific disease context. The infrastructure integrates highly curated data from databases (BIND, IntAct, etc.), literature (PubMed), and gene expression resources (such as GEO and ArrayExpress). We illustrate the effectiveness of our approach by asking real-world biomedical questions that link these resources to prioritize plausible biomarker candidates. Among the 13 prioritized candidate genes, we identified MIF as a potential emerging candidate due to its role as a pro-inflammatory cytokine. We additionally report on the effort and challenges involved in generating such an indication-specific knowledge base comprising curated and quality-controlled data.

Conclusion: Although many alternative approaches have been proposed and practiced for modeling diseases, semantic web technology is a flexible and well-established solution for harmonized aggregation. The benefit of this work, namely the use of high-quality and context-specific data, becomes apparent in the proposal of previously overlooked biomarker candidates around a well-known mechanism, which can be further leveraged for experimental investigation.
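As an illustration of the integration idea only, the sketch below stores evidence from three layers (interactions, literature, expression) as subject-predicate-object triples and ranks genes by how many layers support them. It is not the NeuroRDF implementation: the predicate names (`ex:interactsWith`, `ex:mentionedIn`, `ex:differentiallyExpressedIn`), the gene and dataset identifiers, and the layer-counting score are all hypothetical assumptions, and a plain Python set stands in for a real RDF store and SPARQL queries.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object). All identifiers are
# placeholders for illustration; none of them is data from the paper.
TRIPLES = {
    ("gene:G1", "ex:interactsWith", "gene:G2"),
    ("gene:G1", "ex:mentionedIn", "pubmed:PLACEHOLDER_1"),
    ("gene:G1", "ex:differentiallyExpressedIn", "geo:PLACEHOLDER_SERIES"),
    ("gene:G2", "ex:interactsWith", "gene:G1"),
    ("gene:G2", "ex:mentionedIn", "pubmed:PLACEHOLDER_2"),
    ("gene:G3", "ex:differentiallyExpressedIn", "geo:PLACEHOLDER_SERIES"),
}

# One (assumed) predicate per evidence layer.
EVIDENCE_PREDICATES = (
    "ex:interactsWith",
    "ex:mentionedIn",
    "ex:differentiallyExpressedIn",
)


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def rank_candidates(triples, predicates):
    """Rank subjects by the number of distinct evidence layers supporting them."""
    layers = defaultdict(set)
    for pred in predicates:
        for subj, _, _ in match(triples, p=pred):
            layers[subj].add(pred)
    # Most layers first; ties are broken alphabetically for a stable order.
    ordered = sorted(layers.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(gene, len(preds)) for gene, preds in ordered]


ranking = rank_candidates(TRIPLES, EVIDENCE_PREDICATES)
for gene, n_layers in ranking:
    print(gene, n_layers)
```

Counting supporting layers is only one possible prioritization rule; it is used here because it makes the link between integrated resources and candidate ranking easy to see.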
