Prioritization of Disease Susceptibility Genes Using LSM/SVD

Understanding the role of genetics in disease is one of the most important tasks of the postgenome era. Experimentally validating every candidate gene for a disease is generally too expensive and time consuming, so computational methods play an important role in prioritizing candidates. Here we propose an approach that prioritizes disease genes using latent semantic mapping (LSM) based on singular value decomposition (SVD). Our hypothesis is that functionally similar genes are likely to cause similar diseases, so new disease susceptibility genes can be predicted by measuring the functional similarity between known disease susceptibility genes and candidate genes. Taking autism as an example, analysis of the top ten prioritized genes suggests that they may be autism susceptibility genes, which indicates that the approach can discover new disease susceptibility genes and latent disease-gene relations. The prioritized results can also serve disease researchers as computational evidence that supports diverse interpretations and complements experimental views.
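The general idea of ranking candidates by similarity in an SVD-derived latent space can be illustrated with a minimal sketch. This is not the authors' implementation: the gene-by-annotation matrix, the gene names G1 to G5, the function name `prioritize`, the rank `k`, and the choice of mean cosine similarity as the score are all assumptions made for illustration.

```python
import numpy as np

def prioritize(gene_term, gene_names, known, k=2):
    """Rank candidate genes by mean cosine similarity to known disease genes
    in a rank-k latent space obtained from a truncated SVD.

    gene_term : (n_genes, n_terms) array of annotation weights (hypothetical).
    known     : names of genes already associated with the disease.
    """
    # Thin SVD: gene_term = U @ diag(S) @ Vt
    U, S, _ = np.linalg.svd(gene_term, full_matrices=False)
    # Coordinates of each gene in the k-dimensional latent space
    latent = U[:, :k] * S[:k]

    # Normalize rows so that dot products are cosine similarities
    norms = np.linalg.norm(latent, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = latent / norms

    known_idx = [gene_names.index(g) for g in known]
    cand_idx = [i for i in range(len(gene_names)) if i not in known_idx]

    # Score each candidate by its mean similarity to the known genes
    scores = (unit[cand_idx] @ unit[known_idx].T).mean(axis=1)
    order = np.argsort(-scores)  # descending
    return [(gene_names[cand_idx[i]], float(scores[i])) for i in order]

# Toy data (made up): rows are genes, columns are four annotation terms.
genes = ["G1", "G2", "G3", "G4", "G5"]
A = np.array([
    [1, 1, 0, 0],   # G1, known disease gene
    [1, 1, 0, 0],   # G2, known disease gene
    [1, 1, 1, 0],   # G3, candidate sharing terms with G1 and G2
    [0, 0, 1, 1],   # G4, candidate
    [0, 0, 0, 1],   # G5, candidate
], dtype=float)

ranking = prioritize(A, genes, known=["G1", "G2"], k=2)
for name, score in ranking:
    print(name, round(score, 3))
```

In this toy matrix the candidate that shares annotation terms with the known genes (G3) comes out at the top of the ranking. A real application would replace the toy matrix with functional annotations of actual genes and choose `k` empirically.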
