Statistical and Computational Methods for Genetic Diseases: An Overview

The causes of genetic diseases have been investigated with approaches of increasing complexity. Innovations in genetic methodologies produce large amounts of data, which require the support of statistical and computational methods to be processed correctly. The aim of this paper is to provide an overview of these statistical and computational methods, with particular attention to methods for sequence analysis and for complex diseases.
