Gene fingerprint model for literature based detection of the associations among complex diseases: a case study of COPD

Background: Disease comorbidity is common and has a significant impact on treatment. Revealing the associations among diseases may help to explain disease mechanisms, improve prevention and treatment, and support the discovery of new drugs or new uses for existing drugs.

Methods: We introduce a mathematical model that represents each gene-related disease by a series of associated genes, selected on the basis of the overrepresentation of gene-disease co-occurrence in the PubMed literature. We also describe an efficient way to reveal implicit connections between Chronic Obstructive Pulmonary Disease (COPD) and other diseases using this model.

Results: We applied this approach to analyze the relationships between COPD and the other diseases under the Lung Diseases branch of the Medical Subject Headings (MeSH) index, and detected 4 novel diseases relevant to COPD. As judged by domain experts, the F score of our approach is up to 77.6%.

Conclusions: The results demonstrate the effectiveness of a literature-based gene fingerprint model for diseases.
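To make the idea of a literature-based gene fingerprint more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that overrepresentation is scored with a one-sided hypergeometric test on paper counts, that a fingerprint keeps genes below a significance cutoff weighted by -log10(p), and that two diseases are compared by cosine similarity. The abstract states none of these choices; the function names, the `alpha` cutoff, and the data layout are all hypothetical.

```python
from math import comb, log10, sqrt


def hypergeom_tail(k, n_papers, n_gene_papers, n_disease_papers):
    """P(X >= k): chance that at least k of the disease's papers mention the
    gene if gene mentions were spread at random over the whole corpus."""
    upper = min(n_gene_papers, n_disease_papers)
    total = comb(n_papers, n_disease_papers)
    hits = sum(
        comb(n_gene_papers, i)
        * comb(n_papers - n_gene_papers, n_disease_papers - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def fingerprint(disease_counts, gene_totals, n_disease_papers, n_papers, alpha=0.05):
    """Assumed fingerprint: genes overrepresented in the disease's papers,
    each weighted by -log10(p). `alpha` is an illustrative cutoff."""
    fp = {}
    for gene, k in disease_counts.items():
        p = hypergeom_tail(k, n_papers, gene_totals[gene], n_disease_papers)
        if p < alpha:
            # Clamp p so that log10 never receives zero.
            fp[gene] = -log10(max(p, 1e-300))
    return fp


def cosine(fp_a, fp_b):
    """Cosine similarity between two sparse fingerprints (dicts gene -> weight)."""
    dot = sum(fp_a[g] * fp_b[g] for g in set(fp_a) & set(fp_b))
    norm_a = sqrt(sum(v * v for v in fp_a.values()))
    norm_b = sqrt(sum(v * v for v in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, a fingerprint would be built once per disease from co-occurrence counts, and diseases whose fingerprints score a high `cosine` value against the COPD fingerprint would be candidates for expert review. Exact binomial coefficients are used for clarity; on a corpus the size of PubMed a log-space or library implementation of the hypergeometric tail would be more practical.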
