A penalized linear mixed model for genomic prediction using pedigree structures

Genetic Analysis Workshop 18 provided a platform for evaluating genomic prediction power based on single-nucleotide polymorphisms from single-nucleotide polymorphism array data and sequencing data. Also, Genetic Analysis Workshop 18 provided a diverse pedigree structure to be explored in prediction. In this study, we attempted to combine pedigree information with single-nucleotide polymorphism data to predict systolic blood pressure. Our results suggested that the prediction power based on pedigree information only could be unsatisfactory. Using additional information such as single-nucleotide polymorphism genotypes would improve prediction accuracy. In particular, the improvement can be significant when there exist a few single-nucleotide polymorphisms with relatively larger effect sizes. We also compared the prediction performance based on genome-wide association study data (ie, common variants) and sequencing data (ie, common variants plus low-frequency variants). The experimental result showed that inclusion of low frequency variants could not lead to improvement of prediction accuracy.

[1]  P. Visscher,et al.  A Commentary on ‘Common SNPs Explain a Large Proportion of the Heritability for Human Height’ by Yang et al. (2010) , 2010, Twin Research and Human Genetics.

[2]  Trevor Hastie,et al.  Regularization Paths for Generalized Linear Models via Coordinate Descent. , 2010, Journal of statistical software.

[3]  H. Zou,et al.  Regularization and variable selection via the elastic net , 2005 .

[4]  Eleazar Eskin,et al.  Improved linear mixed models for genome-wide association studies , 2012, Nature Methods.

[5]  R. Tibshirani,et al.  Regression shrinkage and selection via the lasso: a retrospective , 2011 .

[6]  Ying Liu,et al.  FaST linear mixed models for genome-wide association studies , 2011, Nature Methods.

[7]  P. Visscher,et al.  Common SNPs explain a large proportion of heritability for human height , 2011 .

[8]  Daniel Gianola,et al.  Predicting genetic predisposition in humans: the promise of whole-genome markers , 2010, Nature Reviews Genetics.

[9]  D. Allison,et al.  Beyond Missing Heritability: Prediction of Complex Traits , 2011, PLoS genetics.

[10]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[11]  José Crossa,et al.  Predicting Quantitative Traits With Regression Models for Dense Molecular Markers and Pedigree , 2009, Genetics.

[12]  Oliver Stegle,et al.  A Lasso multi-marker mixed model for association mapping with population structure correction , 2013, Bioinform..

[13]  M. Stephens,et al.  Genome-wide Efficient Mixed Model Analysis for Association Studies , 2012, Nature Genetics.

[14]  Daniel Gianola,et al.  Additive Genetic Variability and the Bayesian Alphabet , 2009, Genetics.

[15]  P. Bühlmann,et al.  Estimation for High‐Dimensional Linear Mixed‐Effects Models Using ℓ1‐Penalization , 2010, 1002.3784.

[16]  T. Hastie,et al.  SparseNet: Coordinate Descent With Nonconvex Penalties , 2011, Journal of the American Statistical Association.

[17]  M. Gail Discriminatory accuracy from single-nucleotide polymorphisms in models to predict breast cancer risk. , 2008, Journal of the National Cancer Institute.