Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr
Andrey Ziyatdinov | Michael G. B. Blum | Hugues Aschard | Florian Privé
[1] Danny C. Sorensen,et al. Deflation Techniques for an Implicitly Restarted Arnoldi Iteration , 1996, SIAM J. Matrix Anal. Appl..
[2] Stephen Weston,et al. Scalable Strategies for Computing with Massive Data , 2013 .
[3] Dirk Eddelbuettel,et al. Rcpp: Seamless R and C++ Integration , 2011 .
[4] Gad Abraham,et al. Fast Principal Component Analysis of Large-Scale Genome-Wide Data , 2014, bioRxiv.
[5] Yaohui Zeng,et al. The biglasso Package: A Memory- and Computation-Efficient Solver for Lasso Model Fitting with Big Data in R , 2017, R J..
[6] Alan M. Kwong,et al. A reference panel of 64,976 haplotypes for genotype imputation , 2015, Nature Genetics.
[7] D. Reich,et al. Principal components analysis corrects for stratification in genome-wide association studies , 2006, Nature Genetics.
[8] Bernard W. Silverman,et al. Warping Functional Data in R and C via a Bayesian Multiresolution Approach , 2010 .
[9] Gad Abraham,et al. FlashPCA2: principal component analysis of biobank-scale genotype datasets , 2016, bioRxiv.
[10] R Core Team,et al. R: A language and environment for statistical computing. , 2014 .
[11] Nilanjan Chatterjee,et al. Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies , 2013, Nature Genetics.
[12] B. Browning,et al. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. , 2007, American journal of human genetics.
[13] Yurii S. Aulchenko,et al. GenABEL: an R library for genome-wide association analysis , 2007, Bioinform.. doi:10.1093/bioinformatics/btm108
[14] David Levine,et al. A high-performance computing toolset for relatedness and principal component analysis of SNP data , 2012, Bioinform..
[15] R. Tibshirani,et al. Strong rules for discarding predictors in lasso‐type problems , 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.
[16] M. Blum,et al. Pcadapt: An R Package to Perform Genome Scans for Selection Based on Principal Component Analysis , 2016, bioRxiv.
[17] Cameron D. Palmer,et al. Bias Characterization in Probabilistic Genotype Data and Improved Signal Detection with Multiple Imputation , 2016, PLoS genetics.
[18] Justin Zobel,et al. SparSNP: Fast and memory-efficient analysis of all SNPs for phenotype prediction , 2012, BMC Bioinformatics.
[19] John Novembre,et al. The Population Reference Sample, POPRES: a resource for population, disease, and pharmacological genetics research. , 2008, American journal of human genetics.
[20] Sara E. Kalla,et al. Complex disease and phenotype mapping in the domestic dog , 2016, Nature Communications.
[21] Carson C Chow,et al. Second-generation PLINK: rising to the challenge of larger and richer datasets , 2014, GigaScience.
[22] Paul H. C. Eilers,et al. GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies , 2013, BMC Bioinformatics.
[23] Thomas Mailund,et al. SNPFile – A software library and file format for large scale association mapping and population genetics studies , 2008, BMC Bioinformatics.
[24] P. Deloukas,et al. Multiple common variants for celiac disease influencing immune gene expression , 2010, Nature Genetics.
[25] K. Shianna,et al. Long-range LD can confound genome scans in admixed populations. , 2008, American journal of human genetics.
[26] Manuel A. R. Ferreira,et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. , 2007, American journal of human genetics.
[27] Jack Euesden,et al. PRSice: Polygenic Risk Score software , 2014, Bioinform..
[28] F. Dudbridge. Power and Predictive Accuracy of Polygenic Risk Scores , 2013, PLoS genetics.
[29] Lusheng Wang,et al. Fast accurate missing SNP genotype local imputation , 2012, BMC Research Notes.
[30] Tianqi Chen,et al. XGBoost: A Scalable Tree Boosting System , 2016, KDD.
[31] David Levine,et al. GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies , 2012, Bioinform..
[32] Trevor Hastie,et al. Regularization Paths for Generalized Linear Models via Coordinate Descent. , 2010, Journal of statistical software.
[33] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .
[34] Gad Abraham,et al. FlashPCA2: principal component analysis of biobank-scale genotype datasets , 2016 .
[35] Sayan Mukherjee,et al. Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia. , 2016, American journal of human genetics.