An integrated approach to reduce the impact of minor allele frequency and linkage disequilibrium on variable importance measures for genome-wide data
[1] Andreas Ziegler,et al. A Statistical Approach to Genetic Epidemiology: With Access to E-Learning Platform by Friedrich Pahlke , 2010 .
[2] Mark Daly,et al. Haploview: analysis and visualization of LD and haplotype maps , 2005, Bioinform..
[3] P. Armitage. Tests for Linear Trends in Proportions and Frequencies , 1955 .
[4] Yi Yu,et al. Performance of random forest when SNPs are in linkage disequilibrium , 2009, BMC Bioinformatics.
[5] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .
[6] R. Carroll,et al. Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants , 2011, Proceedings of the National Academy of Sciences.
[7] M. Friedman. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance , 1937 .
[8] M. McCarthy,et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges , 2008, Nature Reviews Genetics.
[9] A. Foulkes,et al. Application of two machine learning algorithms to genetic association studies in the presence of covariates , 2008, BMC Genetics.
[10] H. Cordell,et al. SNP Selection in Genome-Wide and Candidate Gene Studies via Penalized Logistic Regression , 2010, Genetic epidemiology.
[11] Heping Zhang,et al. Detecting significant single-nucleotide polymorphisms in a rheumatoid arthritis study using random forests , 2009 .
[12] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[13] Andrey A. Shabalin,et al. Matrix eQTL: ultra fast eQTL analysis via large matrix operations , 2011, Bioinform..
[14] J. Friedman. Greedy function approximation: A gradient boosting machine. , 2001 .
[15] Nilanjan Chatterjee,et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries , 2010, Nature Genetics.
[16] K. Lunetta,et al. Screening large-scale association study data: exploiting interactions using random forests , 2004, BMC Genetics.
[17] Paola Zuccolotto,et al. Analysis and correction of bias in Total Decrease in Node Impurity measures for tree-based algorithms , 2010, Stat. Comput..
[18] Marco Sandri,et al. A Bias Correction Algorithm for the Gini Variable Importance Measure in Classification Trees , 2008 .
[19] G. Rosner,et al. A modified forward multiple regression in high‐density genome‐wide association studies for complex traits , 2009, Genetic epidemiology.
[20] Satish Chikkagoudar,et al. Ranking causal variants and associated regions in genome-wide association studies by the support vector machine and random forest , 2011, Nucleic acids research.
[21] Hon-Cheong So,et al. Uncovering the total heritability explained by all true susceptibility variants in a genome‐wide association study , 2011, Genetic epidemiology.
[22] B. Maher. Personal genomes: The case of the missing heritability , 2008, Nature.
[23] I. König,et al. A Statistical Approach to Genetic Epidemiology: Concepts and Applications , 2006 .
[24] Manuel A. R. Ferreira,et al. Common variants in the trichohyalin gene are associated with straight hair in Europeans. , 2009, American journal of human genetics.
[25] Guimei Liu,et al. An empirical comparison of several recent epistatic interaction detection methods , 2011, Bioinform..
[26] Achim Zeileis,et al. Conditional variable importance for random forests , 2008, BMC Bioinformatics.
[27] Adele Cutler,et al. An application of Random Forests to a genome-wide association dataset: Methodological considerations & new findings , 2010, BMC Genetics.
[28] Jason H. Moore,et al. The Ubiquitous Nature of Epistasis in Determining Susceptibility to Common Human Diseases , 2003, Human Heredity.
[29] Judy H. Cho,et al. Finding the missing heritability of complex diseases , 2009, Nature.
[30] Achim Zeileis,et al. Bias in random forest variable importance measures: Illustrations, sources and a solution , 2007, BMC Bioinformatics.
[31] James D. Malley,et al. Predictor correlation impacts machine learning algorithms: implications for genomic studies , 2009, Bioinform..
[32] D. Clayton,et al. Genome-wide association studies: theoretical and practical concerns , 2005, Nature Reviews Genetics.
[33] J. H. Moore,et al. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. , 2001, American journal of human genetics.
[34] K. Frazer,et al. Human genetic variation and its contribution to complex traits , 2009, Nature Reviews Genetics.
[35] Rich Caruana,et al. An empirical comparison of supervised learning algorithms , 2006, ICML.
[36] Andy Liaw,et al. Classification and Regression by randomForest , 2007 .
[37] Hans C. van Houwelingen,et al. The Elements of Statistical Learning, Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani and Jerome Friedman, Springer, New York, 2001. No. of pages: xvi+533. ISBN 0‐387‐95284‐5 , 2004 .
[38] W. Kruskal,et al. Use of Ranks in One-Criterion Variance Analysis , 1952 .
[39] Ricardo Cao,et al. Evaluating the Ability of Tree‐Based Methods and Logistic Regression for the Detection of SNP‐SNP Interaction , 2009, Annals of human genetics.
[40] K. Lunetta,et al. Identifying SNPs predictive of phenotype using random forests , 2005, Genetic epidemiology.
[41] Guifang Fu,et al. The Bayesian lasso for genome-wide association studies , 2011, Bioinform..
[42] Andreas Ziegler and Inke R. König,et al. A statistical approach to genetic epidemiology , 2013 .
[43] Atanu Biswas,et al. A new bivariate binomial distribution , 2002 .
[44] Carolin Strobl,et al. Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations , 2012, Briefings Bioinform..
[45] Marc Toussaint,et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes , 2006, ICML.
[46] K. Roeder,et al. Screen and clean: a tool for identifying interactions in genome‐wide association studies , 2010, Genetic epidemiology.
[47] Jing Li,et al. Detecting epistatic effects in association studies at a genomic level based on an ensemble approach , 2011, Bioinform..
[48] P. Visscher,et al. Common SNPs explain a large proportion of heritability for human height , 2011 .
[49] R Core Team,et al. R: A language and environment for statistical computing. , 2014 .
[50] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[51] Yan V. Sun,et al. Machine learning in genome‐wide association studies , 2009, Genetic epidemiology.
[52] Qianchuan He,et al. BIOINFORMATICS ORIGINAL PAPER , 2022 .
[53] E. S. Pearson,et al. THE USE OF CONFIDENCE OR FIDUCIAL LIMITS ILLUSTRATED IN THE CASE OF THE BINOMIAL , 1934 .
[54] Hans-Peter Piepho,et al. A comparison of random forests, boosting and support vector machines for genomic selection , 2011, BMC proceedings.