An Application and Empirical Comparison of Statistical Analysis Methods for Associating Rare Variants to a Complex Phenotype

The contribution of collections of rare sequence variations (or 'variants') to phenotypic expression has begun to receive considerable attention within the biomedical research community. However, how best to capture the effects of rare variants in statistical analysis models remains an open question. In this paper we describe the application of a number of statistical methods for testing associations between rare variants in two genes and obesity. We consider the relative merits of the different methods as well as important implementation details, such as leveraging genomic annotations and determining p-values.
