The Autoimmune Disease Database: a dynamically compiled literature-derived database

Background: Autoimmune diseases are disorders caused by an immune response directed against the body's own organs, tissues and cells. In practice, more than 80 clinically distinct diseases, among them systemic lupus erythematosus and rheumatoid arthritis, are classified as autoimmune diseases. Although their etiology is unclear, these diseases share certain similarities at the molecular level, such as chromosomal susceptibility regions or the involvement of common genes. A manual literature review is not a feasible way to gain an overview of these related diseases; automated methods are required to analyze the more than 500,000 Medline documents related to autoimmune disorders.

Results: In this paper we present the first version of the Autoimmune Disease Database, which to our knowledge is the first comprehensive literature-based database covering all known or suspected autoimmune diseases. This dynamically compiled database allows researchers to link autoimmune diseases to candidate genes or proteins through named entity recognition, which identifies genes/proteins in the corresponding Medline abstracts. The Autoimmune Disease Database covers 103 autoimmune disease concepts. This list was expanded to include synonyms and spelling variants, yielding a list of over 1,200 disease names. The current version of the database provides links to 541,690 abstracts and over 5,000 unique genes/proteins.

Conclusion: The Autoimmune Disease Database provides the researcher with a tool to navigate potential gene-disease relationships in Medline abstracts in the context of autoimmune diseases.
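
The linking step described in the Results can be illustrated with a minimal sketch. It is not the database's actual pipeline: plain dictionary lookup stands in for named entity recognition, and the dictionaries, the function names `compile_dictionary` and `link_genes_to_diseases`, and the abstract identifiers are all hypothetical. It shows only the general idea of expanding each disease concept into its synonyms and then recording which genes are mentioned in the same abstract.

```python
import re
from collections import defaultdict

# Hypothetical toy dictionaries (assumption): the database described above
# uses 103 disease concepts with over 1,200 names; only two are shown here.
DISEASE_SYNONYMS = {
    "systemic lupus erythematosus": ["systemic lupus erythematosus", "SLE"],
    "rheumatoid arthritis": ["rheumatoid arthritis"],
}
GENE_NAMES = {
    "TNF": ["TNF", "tumor necrosis factor"],
    "PTPN22": ["PTPN22"],
}


def compile_dictionary(dictionary):
    """Map each concept to one case-insensitive, word-bounded pattern."""
    compiled = {}
    for concept, names in dictionary.items():
        # Longest names first so multi-word names win over shorter ones.
        alternatives = sorted((re.escape(n) for n in names), key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(alternatives) + r")\b"
        compiled[concept] = re.compile(pattern, re.IGNORECASE)
    return compiled


def link_genes_to_diseases(abstracts):
    """Return {(disease, gene): set of abstract ids} for co-mentions."""
    diseases = compile_dictionary(DISEASE_SYNONYMS)
    genes = compile_dictionary(GENE_NAMES)
    links = defaultdict(set)
    for abstract_id, text in abstracts.items():
        found_diseases = [d for d, p in diseases.items() if p.search(text)]
        found_genes = [g for g, p in genes.items() if p.search(text)]
        for disease in found_diseases:
            for gene in found_genes:
                links[(disease, gene)].add(abstract_id)
    return links


if __name__ == "__main__":
    # Made-up example sentences, not real Medline abstracts.
    example = {
        "a1": "PTPN22 variants are associated with rheumatoid arthritis.",
        "a2": "Tumor necrosis factor levels in SLE patients.",
    }
    for pair, ids in sorted(link_genes_to_diseases(example).items()):
        print(pair, sorted(ids))
```

Co-occurrence in an abstract only suggests a potential gene-disease relationship; short, ambiguous names such as "SLE" are exactly where a real named entity recognition system would be needed in place of this lookup.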
