A multi-dimensional evidence-based candidate gene prioritization approach for complex diseases: schizophrenia as a case

MOTIVATION: Over the past decade, the volume of genetic data generated for complex disease studies has grown exponentially. Across a variety of complex biological problems, there is a strong trend towards integrating data from multiple sources. So far, candidate gene prioritization approaches have been designed for specific purposes, either utilizing only some of the available sources of genetic studies or using a simple weighting scheme. For psychiatric disorders specifically, there has been no prioritization approach that fully utilizes all major sources of experimental data.

RESULTS: Here we present a multi-dimensional evidence-based candidate gene prioritization approach for complex diseases and demonstrate it in schizophrenia. In this approach, we first collect and curate genetic studies for schizophrenia from four major categories: association studies, linkage analyses, gene expression and literature search. Genes in these data sets are initially scored by category-specific scoring methods. An optimal weight matrix is then sought through a two-step procedure (using core genes and unbiased P-values from independent genome-wide association studies). Finally, genes are prioritized by their combined scores under the optimal weight matrix. Our evaluation suggests that this approach generates prioritized candidate genes that are promising for further analysis or replication. The approach can be applied to other complex diseases.

AVAILABILITY: The collected data, prioritized candidate genes, and gene prioritization tools are freely available at http://bioinfo.mc.vanderbilt.edu/SZGR/.
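The scoring-and-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, per-category scores, the integer weight grid, and the "core gene recovery" objective are all hypothetical, chosen only to show how category scores might be combined under a weight vector and how a weight search could favor rankings that place known core genes near the top. Only the first step of the two-step procedure is sketched; the second step, which uses P-values from independent genome-wide association studies, is omitted.

```python
from itertools import product

# The four evidence categories named in the abstract.
CATEGORIES = ("association", "linkage", "expression", "literature")

# Hypothetical per-category scores in [0, 1]; names and values are made up.
scores = {
    "GENE_A": (0.9, 0.6, 0.4, 0.8),
    "GENE_B": (0.2, 0.3, 0.95, 0.9),
    "GENE_C": (0.8, 0.7, 0.2, 0.5),
    "GENE_D": (0.1, 0.3, 0.1, 0.2),
}

# Hypothetical set of well-supported "core" genes used to judge a weighting.
core_genes = {"GENE_A", "GENE_C"}


def combined_score(gene_scores, weights):
    """Weighted sum of one gene's category scores."""
    return sum(s * w for s, w in zip(gene_scores, weights))


def rank_genes(scores, weights):
    """Gene names sorted by combined score, highest first."""
    return sorted(
        scores,
        key=lambda gene: combined_score(scores[gene], weights),
        reverse=True,
    )


def core_recovery(ranking, core_genes, top_n):
    """Fraction of core genes found among the top_n ranked genes."""
    return len(set(ranking[:top_n]) & core_genes) / len(core_genes)


def search_weights(scores, core_genes, top_n, grid=(1, 2, 3)):
    """Exhaustive search over an assumed integer weight grid.

    Returns the first weight vector (in grid order) that maximizes
    core-gene recovery, together with that recovery value.
    """
    best_weights, best_value = None, -1.0
    for weights in product(grid, repeat=len(CATEGORIES)):
        value = core_recovery(rank_genes(scores, weights), core_genes, top_n)
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


if __name__ == "__main__":
    equal = (1, 1, 1, 1)
    print("equal weights:", rank_genes(scores, equal))
    weights, value = search_weights(scores, core_genes, top_n=2)
    print("searched weights:", dict(zip(CATEGORIES, weights)), "recovery:", value)
    print("ranking:", rank_genes(scores, weights))
```

In this toy data set, equal weights let a gene with strong expression and literature scores outrank one of the core genes, whereas the searched weights restore both core genes to the top of the list. A real application would need a far larger gene set and an independent validation step, as the abstract describes.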
