Assessment of curated phenotype mining in neuropsychiatric disorder literature.

Clinical evaluation of patients and diagnosis of their disorder are crucial for deciding on appropriate therapies. In addition, for genetic disorders resulting from gene abnormalities, phenotypic effects may guide basic research on the mechanisms of a disorder, helping to find the mutated gene and thereby to propose novel targets for drug therapy. However, this approach is complicated by two facts. First, the relationship between genes and disorders is not simple: one gene may be related to multiple disorders, and a disorder may be caused by mutations in different genes. Second, recognizing relevant phenotypes can be difficult for clinicians working with patients who have closely related complex disorders. Neuropsychiatric disorders best illustrate these difficulties, since their phenotypes range from metabolic to behavioral aspects, the latter being extremely complex. Based on our clinical expertise in five neurodegenerative disorders, and drawing on the wealth of bibliographical data on neuropsychiatric disorders, we have built a resource to infer associations between genes, chemicals, and phenotypes for a total of 31 disorders. An initial step of automated text mining of the literature related to the 31 disorders returned thousands of enriched terms. From these, clinicians manually selected a smaller set of phenotypic terms relevant to the five neural disorders of their expertise, and this set was used to analyze the complete set of disorders. Analysis of the data indicates general relationships between neuropsychiatric disorders, which can be used to classify and characterize them. Correlation analyses allowed us to propose novel associations of genes and drugs with disorders.
More generally, the results led us to uncover mechanisms of disease that span multiple neuropsychiatric disorders; for example, genes related to synaptic transmission and receptor functions tend to be involved in many disorders, whereas genes related to sensory perception and channel transport functions are associated with fewer disorders. Our study shows that, starting from expertise covering a limited set of neurological disorders and using text and data mining methods, meaningful and novel associations regarding genes, chemicals, and phenotypes can be derived for an expanded set of neuropsychiatric disorders. Our results are intended to help clinicians evaluate patients and to help basic scientists propose new gene targets for drug therapies. This strategy can be extended to virtually all diseases and takes advantage of the ever-increasing amount of biomedical literature.
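
The two analysis steps named above, term enrichment and correlation, can be illustrated with a minimal sketch. The abstract does not state which statistics were used, so everything here is an assumption made for illustration: the hypergeometric tail as the enrichment score, Pearson correlation over phenotype profiles, the function names, and all counts and profile values, which are invented and not taken from the study.

```python
from math import comb, sqrt

def enrichment_p_value(k, K, n, N):
    """Hypergeometric tail P(X >= k).

    N: abstracts in the background corpus
    K: background abstracts that mention the term
    n: abstracts associated with the disorder
    k: disorder abstracts that mention the term
    (Assumed scoring scheme; the study's actual method is not given.)
    """
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric profiles."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical counts: a term appears in 40 of 200 disorder abstracts
# but in only 500 of 10000 background abstracts.
p = enrichment_p_value(k=40, K=500, n=200, N=10000)
print(f"enrichment p-value: {p:.3g}")

# Hypothetical phenotype profiles (one score per curated term) for two disorders.
disorder_a = [0.9, 0.1, 0.4, 0.0, 0.7]
disorder_b = [0.8, 0.2, 0.5, 0.1, 0.6]
print(f"profile correlation: {pearson(disorder_a, disorder_b):.3f}")
```

A small p-value flags a term as over-represented in a disorder's literature, and a high profile correlation suggests two disorders share phenotypes; in practice a multiple-testing correction would be needed when thousands of terms are scored.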
