Learning on complex, biased, and big data: disease risk prediction in epidemiological studies and genomic medicine on the example of childhood asthma
[1] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[2] John Van Hoewyk,et al. A multivariate technique for multiply imputing missing values using a sequence of regression models , 2001 .
[3] T. Roumeliotaki,et al. Variations in the prevalence of childhood asthma and wheeze in MeDALL cohorts in Europe , 2017, ERJ Open Research.
[4] S. Alberti,et al. Epigenetic inheritance and the missing heritability , 2015, Human Genomics.
[5] A. Dupuy,et al. Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. , 2007, Journal of the National Cancer Institute.
[6] Robert P. W. Duin,et al. Bagging for linear classifiers , 1998, Pattern Recognit..
[7] Kenny Q. Ye,et al. An integrated map of genetic variation from 1,092 human genomes , 2012, Nature.
[8] Jenny Donovan,et al. Evaluating the Prostate Cancer Prevention Trial High Grade prostate cancer risk calculator in 10 international biopsy cohorts: results from the prostate biopsy collaborative group , 2014, World journal of urology.
[9] Amy L. McGuire,et al. Personalized genomic information: preparing for the future of genetic medicine , 2010, Nature Reviews Genetics.
[10] Nicholas Eriksson,et al. Comparison of Family History and SNPs for Predicting Risk of Complex Disease , 2012, PLoS genetics.
[11] C. Calì,et al. Some mathematical properties of the ROC curve and their applications , 2015 .
[12] Rossen I. Valkanov,et al. Boundaries of Predictability: Noisy Predictive Regressions , 2000 .
[13] Craig K Enders,et al. A 'missing not at random' (MNAR) and 'missing at random' (MAR) growth model comparison with a buprenorphine/naloxone clinical trial. , 2015, Addiction.
[14] Stef van Buuren,et al. MICE: Multivariate Imputation by Chained Equations in R , 2011 .
[15] Kenneth F Schulz,et al. Refining clinical diagnosis with likelihood ratios , 2005, The Lancet.
[16] D. Vercelli. Gene–environment interactions in asthma and allergy: the end of the beginning? , 2010, Current opinion in allergy and clinical immunology.
[17] Ewout W Steyerberg,et al. Validation and updating of predictive logistic regression models: a study on sample size and shrinkage , 2004, Statistics in medicine.
[18] A. Hall,et al. The Rho Target PRK2 Regulates Apical Junction Formation in Human Bronchial Epithelial Cells , 2010, Molecular and Cellular Biology.
[19] Q. Deng,et al. Single-cell RNA sequencing: Technical advancements and biological applications. , 2017, Molecular aspects of medicine.
[20] Jürgen Unützer,et al. A comparison of imputation methods in a longitudinal randomized clinical trial , 2005, Statistics in medicine.
[21] H. Ortega,et al. Role of local eosinophilopoietic processes in the development of airway eosinophilia in prednisone‐dependent severe asthma , 2016, Clinical and experimental allergy : journal of the British Society for Allergy and Clinical Immunology.
[22] J E White,et al. A two stage design for the study of the relationship between a rare exposure and a rare disease. , 1982, American journal of epidemiology.
[23] Johanna M Seddon,et al. Prediction model for prevalence and incidence of advanced age-related macular degeneration based on genetic, demographic, and environmental variables. , 2009, Investigative ophthalmology & visual science.
[24] Anne-Laure Boulesteix,et al. Added predictive value of high-throughput molecular data to clinical data and its validation , 2011, Briefings Bioinform..
[25] Jamis J. Perrett,et al. Bonferroni Adjustments in Tests for Regression Coefficients , 2006 .
[26] Thomas Lengauer,et al. Permutation importance: a corrected feature importance measure , 2010, Bioinform..
[27] Helen E. Parkinson,et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog) , 2016, Nucleic Acids Res..
[28] D. Belsky,et al. Polygenic risk and the development and course of asthma: an analysis of data from a four-decade longitudinal study. , 2013, The Lancet. Respiratory medicine.
[29] T. Illig,et al. Identification of novel immune phenotypes for allergic and nonallergic childhood asthma. , 2015, The Journal of allergy and clinical immunology.
[30] Margaret Sullivan Pepe,et al. Assessing risk prediction models in case–control studies using semiparametric and nonparametric methods , 2010, Statistics in medicine.
[31] Anne-Laure Boulesteix,et al. Over-optimism in bioinformatics: an illustration , 2010, Bioinform..
[32] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.
[33] J. Robins,et al. Estimation of Regression Coefficients When Some Regressors are not Always Observed , 1994 .
[34] Fabian J Theis,et al. Feature ranking of type 1 diabetes susceptibility genes improves prediction of type 1 diabetes , 2014, Diabetologia.
[35] Thomas Lengauer,et al. ROCR: visualizing classifier performance in R , 2005, Bioinform..
[36] Lisa G. Johnston,et al. An Empirical Comparison of Respondent-driven Sampling, Time Location Sampling, and Snowball Sampling for Behavioral Surveillance in Men Who Have Sex with Men, Fortaleza, Brazil , 2008, AIDS and Behavior.
[37] J. Gris,et al. Polymorphisms of human placental alkaline phosphatase are associated with in vitro fertilization success and recurrent pregnancy loss. , 2014, The American journal of pathology.
[38] Bhramar Mukherjee,et al. Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. , 2017, American journal of epidemiology.
[39] P. Lichtenstein,et al. Heritability and confirmation of genetic association studies for childhood asthma in twins , 2016, Allergy.
[40] Kurt Hornik,et al. Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien , 2015 .
[41] F. Collins,et al. Shattuck lecture--medical and societal consequences of the Human Genome Project. , 1999, The New England journal of medicine.
[42] M. Ege. Asthma and Prenatal Inflammation. , 2017, American journal of respiratory and critical care medicine.
[43] Nathalie Japkowicz,et al. The Class Imbalance Problem: Significance and Strategies , 2000 .
[44] A. Mccarthy. Development , 1996, Current Opinion in Neurobiology.
[45] Peter M Visscher,et al. Prediction of individual genetic risk to disease from genome-wide association studies. , 2007, Genome research.
[46] Gary King,et al. Logistic Regression in Rare Events Data , 2001, Political Analysis.
[47] Janet Stocks,et al. An official American Thoracic Society/European Respiratory Society statement: pulmonary function testing in preschool children. , 2007, American journal of respiratory and critical care medicine.
[48] Mark J. van der Laan,et al. A Note on Risk Prediction for Case-Control Studies , 2008 .
[49] Fabian J Theis,et al. A strategy for combining minor genetic susceptibility genes to improve prediction of disease in type 1 diabetes , 2012, Genes and Immunity.
[50] Richard G. F. Visser,et al. Integration of multi-omics data for prediction of phenotypic traits using random forest , 2016, BMC Bioinformatics.
[51] D. Duffy,et al. Genetics of asthma and hay fever in Australian twins. , 1990, The American review of respiratory disease.
[52] Ludwig Fahrmeir,et al. Regression: Models, Methods and Applications , 2013 .
[53] Zenghui Wang,et al. Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review , 2017, Neural Computation.
[54] J. Friedman. Greedy function approximation: A gradient boosting machine. , 2001 .
[55] F. Dudbridge,et al. Estimation of significance thresholds for genomewide association scans , 2008, Genetic epidemiology.
[56] Miyoung Shin,et al. Developing disease risk prediction model based on environmental factors , 2014, The 18th IEEE International Symposium on Consumer Electronics (ISCE 2014).
[57] Andreas Ziegler,et al. Risk estimation and risk prediction using machine-learning methods , 2012, Human Genetics.
[58] D. Horvitz,et al. A Generalization of Sampling Without Replacement from a Finite Universe , 1952 .
[59] R. Erasmus,et al. Genomic medicine and risk prediction across the disease spectrum , 2015, Critical reviews in clinical laboratory sciences.
[60] J. Parsons,et al. Src family protein tyrosine kinases: cooperating with growth factor and adhesion signaling pathways. , 1997, Current opinion in cell biology.
[61] Naomi R. Wray,et al. Estimating Trait Heritability , 2008 .
[62] D. Strachan,et al. Gene-environment interaction for childhood asthma and exposure to farming in Central Europe. , 2011, The Journal of allergy and clinical immunology.
[63] Isabelle Guyon,et al. A Scaling Law for the Validation-Set Training-Set Size Ratio , 1997 .
[64] James J Schlesselman. Case-Control Studies: Design, Conduct, Analysis , 1982 .
[65] Matthew Nahorniak,et al. Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples , 2015, PloS one.
[66] W. Phipatanakul,et al. Utility of the Asthma Predictive Index in predicting childhood asthma and identifying disease-modifying interventions. , 2014, Annals of allergy, asthma & immunology : official publication of the American College of Allergy, Asthma, & Immunology.
[67] Jason H. Moore,et al. Chapter 11: Genome-Wide Association Studies , 2012, PLoS Comput. Biol..
[68] Roger A. Sugden,et al. Multiple Imputation for Nonresponse in Surveys , 1988 .
[69] W. Busse,et al. Endotypes of difficult-to-control asthma in inner-city African American children , 2017, PloS one.
[70] Constantine Frangakis,et al. Multiple imputation by chained equations: what is it and how does it work? , 2011, International journal of methods in psychiatric research.
[71] Axel Benner,et al. Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information , 2016, BMC Bioinformatics.
[72] G. Anderson,et al. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease , 2008, The Lancet.
[73] Juha Karvanen,et al. Secondary Analysis under Cohort Sampling Designs Using Conditional Likelihood , 2012 .
[74] Stef van Buuren,et al. A toolkit in SAS for the evaluation of multiple imputation methods , 2003 .
[75] Johnny S. H. Kwan,et al. Risk prediction of complex diseases from family history and known susceptibility loci, with applications for cancer screening. , 2011, American journal of human genetics.
[76] B. Schaub,et al. The puzzle of immune phenotypes of childhood asthma , 2016, Molecular and Cellular Pediatrics.
[77] E. DeLong,et al. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. , 1988, Biometrics.
[78] Foster Provost,et al. Well-Trained PETs: Improving Probability Estimation Trees , 2000 .
[79] W. DuMouchel,et al. Using Sample Survey Weights in Multiple Regression Analyses of Stratified Samples , 1983 .
[80] J. Alcorn,et al. A Multiomics Approach to Identify Genes Associated with Childhood Asthma Risk and Morbidity , 2017, American journal of respiratory cell and molecular biology.
[81] Xiaoyu Jiang,et al. IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data , 2017, Comput. Math. Methods Medicine.
[82] F. Agakov,et al. Genomic prediction of complex human traits: relatedness, trait architecture and predictive meta-models , 2015, Human molecular genetics.
[83] A. Price,et al. Dissecting the genetics of complex traits using summary association statistics , 2016, Nature Reviews Genetics.
[84] Stef van Buuren,et al. Flexible Imputation of Missing Data , 2012 .
[85] Anne-Laure Boulesteix,et al. A computationally fast variable importance test for random forests for high-dimensional data , 2015, Adv. Data Anal. Classif..
[86] J. Catania,et al. Health-related characteristics of men who have sex with men: a comparison of those living in "gay ghettos" with those living elsewhere. , 2001, American journal of public health.
[87] Fabian J. Theis,et al. Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression , 2016, J. Comput. Biol..
[88] J. Heckman. Sample selection bias as a specification error , 1979 .
[89] Siti Mariyam Shamsuddin,et al. Classification with class imbalance problem: A review , 2015, SOCO 2015.
[90] C. Reinero,et al. The potential use of tyrosine kinase inhibitors in severe asthma , 2012, Current opinion in allergy and clinical immunology.
[91] O. Stegle,et al. Deep learning for computational biology , 2016, Molecular systems biology.
[92] Dacheng Tao,et al. A Survey on Multi-view Learning , 2013, ArXiv.
[93] Luís Torgo,et al. OpenML: networked science in machine learning , 2014, SKDD.
[94] E. Ashley. Towards precision medicine , 2016, Nature Reviews Genetics.
[95] E. Bleecker,et al. Genome-wide association study of asthma identifies RAD50-IL13 and HLA-DR/DQ regions. , 2010, The Journal of allergy and clinical immunology.
[96] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[97] C. Ober. Asthma Genetics in the Post-GWAS Era. , 2016, Annals of the American Thoracic Society.
[98] Ken P Kleinman,et al. Much Ado About Nothing , 2007, The American statistician.
[99] Trevor Hastie,et al. Regularization Paths for Generalized Linear Models via Coordinate Descent. , 2010, Journal of statistical software.
[100] Sylvia Richardson,et al. JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects , 2016, Genetic epidemiology.
[101] G. Abecasis,et al. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes , 2010, Genetic epidemiology.
[102] Edgar Wingender,et al. Connecting high-dimensional mRNA and miRNA expression data for binary medical classification problems , 2013, Comput. Methods Programs Biomed..
[103] David D. Lewis,et al. Heterogeneous Uncertainty Sampling for Supervised Learning , 1994, ICML.
[104] Li Li,et al. Deep Learning to Predict Patient Future Diseases from the Electronic Health Records , 2016, ECIR.
[105] Sergey Plis,et al. Deep Learning Applications for Predicting Pharmacological Properties of Drugs and Drug Repurposing Using Transcriptomic Data. , 2016, Molecular pharmaceutics.
[106] Raphael Gottardo,et al. Orchestrating high-throughput genomic analysis with Bioconductor , 2015, Nature Methods.
[107] R. Terracciano,et al. Benralizumab in the treatment of severe asthma: design, development and potential place in therapy , 2018, Drug design, development and therapy.
[108] David Hinkley,et al. Bootstrap Methods: Another Look at the Jackknife , 2008 .
[109] D. Rubin. Inference and Missing Data , 1975 .
[110] Stan Matwin,et al. Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.
[111] Yvonne Vergouwe,et al. A simple method to adjust clinical prediction models to local circumstances , 2009, Canadian journal of anaesthesia = Journal canadien d'anesthesie.
[112] Chris J. Skinner,et al. Analysis of complex surveys , 1991 .
[113] Tobi Saidel,et al. Baseline integrated behavioural and biological assessment among most at-risk populations in six high-prevalence states of India: design and implementation challenges , 2008, AIDS.
[114] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[115] Yang I Li,et al. An Expanded View of Complex Traits: From Polygenic to Omnigenic , 2017, Cell.
[116] R. Serfozo. Basics of Applied Stochastic Processes , 2012 .
[117] Chao Chen,et al. Using Random Forest to Learn Imbalanced Data , 2004 .
[118] Florence Demenais,et al. A large-scale, consortium-based genomewide association study of asthma. , 2010, The New England journal of medicine.
[119] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .
[120] Fabian J Theis,et al. Prediction of type 1 diabetes using a genetic risk model in the Diabetes Autoimmunity Study in the Young , 2018, Pediatric diabetes.
[121] U. Frey,et al. Farming environments and childhood atopy, wheeze, lung function, and exhaled nitric oxide. , 2012, The Journal of allergy and clinical immunology.
[122] Lori J Sokoll,et al. Prostate Cancer Prevention Trial risk calculator 2.0 for the prediction of low- vs high-grade prostate cancer. , 2014, Urology.
[123] E. Kerwin,et al. Randomized, double-blind, placebo-controlled study of brodalumab, a human anti-IL-17 receptor monoclonal antibody, in moderate to severe asthma. , 2013, American journal of respiratory and critical care medicine.
[124] Hans-Peter Piepho,et al. A comparison of random forests, boosting and support vector machines for genomic selection , 2011, BMC proceedings.
[125] Fabian J. Theis,et al. Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies , 2017, Comput. Math. Methods Medicine.
[126] Ian Davidson,et al. On Sample Selection Bias and Its Efficient Correction via Model Averaging and Unlabeled Examples , 2007, SDM.
[127] N. Terry,et al. The Emergence of National Electronic Health Record Architectures in the United States and Australia: Models, Costs, and Questions , 2005, Journal of medical Internet research.
[128] Ping Zhang,et al. Risk Prediction with Electronic Health Records: A Deep Learning Approach , 2016, SDM.
[129] Carole Ober,et al. Gene-environment interactions in human disease: nuisance or opportunity? , 2011, Trends in genetics : TIG.
[130] D. Ankerst,et al. Three general concepts to improve risk prediction: good data, wisdom of the crowd, recalibration , 2016 .
[131] C. Ober,et al. Asthma genetics 2006: the long and winding road to gene discovery , 2006, Genes and Immunity.
[132] R. Tibshirani,et al. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. , 2009, Biostatistics.
[133] J. Mesirov,et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.
[134] Bernhard Schölkopf,et al. Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.
[135] Thomas Lumley,et al. Analysis of Complex Survey Samples , 2004 .
[136] Sylvia Richardson,et al. Evolutionary Stochastic Search for Bayesian model exploration , 2010, 1002.2706.
[137] Mathias Fuchs,et al. Minimization and estimation of the variance of prediction errors for cross-validation designs , 2016 .
[138] J. Castro‐Rodriguez,et al. A clinical index to define risk of asthma in young children with recurrent wheezing. , 2000, American journal of respiratory and critical care medicine.
[139] W. A. Clark,et al. Simulation of self-organizing systems by digital computer , 1954, Trans. IRE Prof. Group Inf. Theory.
[140] K. Shortman,et al. Flow cytometry and cell-separation procedures. , 1991, Current opinion in immunology.
[141] R. Little. Missing-Data Adjustments in Large Surveys , 1988 .
[142] Deepayan Sarkar,et al. Lattice: Multivariate Data Visualization with R , 2008 .
[143] P. Thompson,et al. Histone Modifications and Asthma. The Interface of the Epigenetic and Genetic Landscapes. , 2015, American journal of respiratory cell and molecular biology.
[144] Greg Ridgeway,et al. Generalized Boosted Models: A guide to the gbm package , 2006 .
[145] V. Tremaroli,et al. Dynamics and Stabilization of the Human Gut Microbiome during the First Year of Life , 2015, Cell host & microbe.
[146] D. Vercelli,et al. Discovering susceptibility genes for asthma and allergy , 2008, Nature Reviews Immunology.
[147] Bianca Zadrozny,et al. Learning and evaluating classifiers under sample selection bias , 2004, ICML.
[148] C. Begg,et al. Two‐Stage Designs for Gene–Disease Association Studies with Sample Size Constraints , 2004, Biometrics.
[149] J. Friedman. Stochastic gradient boosting , 2002 .
[150] Tom Fawcett,et al. An introduction to ROC analysis , 2006, Pattern Recognit. Lett..
[151] Hongyu Zhao,et al. Practical Issues in Building Risk-Predicting Models for Complex Diseases , 2010, Journal of biopharmaceutical statistics.
[152] Xavier Robin,et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves , 2011, BMC Bioinformatics.
[153] John Langford,et al. Cost-sensitive learning by cost-proportionate example weighting , 2003, Third IEEE International Conference on Data Mining.
[154] J. Genuneit. Sex-specific development of asthma differs between farm and nonfarm children: a cohort study. , 2014, American journal of respiratory and critical care medicine.
[155] M. Gail,et al. Strategies for Developing Prediction Models From Genome‐Wide Association Studies , 2013, Genetic epidemiology.
[156] David P. Strachan,et al. Comparisons of power of statistical methods for gene–environment interaction analyses , 2013, European Journal of Epidemiology.
[157] Charles Elkan,et al. The Foundations of Cost-Sensitive Learning , 2001, IJCAI.
[158] David H. Wolpert,et al. No free lunch theorems for optimization , 1997, IEEE Trans. Evol. Comput..
[159] K. Rabe,et al. Oral Glucocorticoid–Sparing Effect of Benralizumab in Severe Asthma , 2017, The New England journal of medicine.
[160] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.