RaSaR: a novel methodology for the detection of epistasis

Complex diseases, which affect a large proportion of the population today, demand more strategic methods to produce significant association results. Numerous disorders and diseases have yet to be linked to a causal genetic variant, despite research evidence indicating high genetic concordance. Breast cancer is one of the most prominent cancers in the female population, with approximately 55K new cases and approximately 11K deaths each year in the UK. The genetic component of breast cancer is a popular research area and has yielded many genetic associations, ranging from high to low penetrance. The dataset used in this research was obtained from the DRIVE project, one of five projects introduced under the GAME-ON initiative. The general research use DRIVE dataset contains approximately 533K single-nucleotide polymorphisms (SNPs), of which more than 280K were sequenced with reference to the five most prominent cancers: colon, breast, ovarian, prostate and lung. SNPs are sequenced for approximately 28K subjects, of whom approximately 14K were diagnosed with one of three stages of breast cancer: unknown, in-situ and invasive. Epistasis analysis is a progressive approach that complements the ‘common disease, common variant’ hypothesis by highlighting the potential for connected networks of genetic variants to collaborate in producing a phenotypic expression. It is commonly performed either pairwise, testing variant against variant, or at unlimited arity, testing higher-order interactions. This type of analysis increases the number of tests beyond that of a standard approach such as a genome-wide association study (GWAS), in which the false discovery rate (FDR) was already an issue; multiplying the number of tests at up to a factorial rate compounds the FDR problem.
Further to this, epistasis analysis introduces its own limitation of computational complexity, which depends on the analysis performed; in the most intensive case, a multivariate analysis has a time complexity of O(n!). This thesis discusses approaches, methods and techniques for epistasis analysis and GWAS, together with their limitations and how these can be addressed. It proposes a novel methodology for the detection of epistasis that uses interpretable methods and best practice to outline interactions through filtering processes. RaSaR refers to the process of Random Sampling Regularisation, which randomly splits the data into sample sets and applies a voting system across them to regularise the significance and reliability of biological markers (SNPs). In parallel, the proposed methodology adjusts for the common limitations of computational complexity and false discovery by using filter selection and a novel method of association analysis. Preliminary results are promising: interactions are detected concisely when benchmarked against standard approaches that apply the common corrections for multiple testing. In the classification of breast cancer patients, the results indicated nine candidate risk interactions among five variants and a single candidate variant with a strong protective association.
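The random sampling and voting idea behind RaSaR can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not specify the association test, the number of sample sets, or the voting threshold, so the 1-df chi-square test on carrier status, the function names (`chi2_p_value`, `snp_p_value`, `rasar_votes`, `select_by_vote`), the parameters (`n_splits`, `sample_frac`, `alpha`, `min_vote_frac`) and the toy data below are all assumptions made for illustration.

```python
import math
import random


def chi2_p_value(a, b, c, d):
    """P-value of a 1-df chi-square test on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 1.0  # degenerate table: treat as no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def snp_p_value(carrier, labels, idx):
    """Test carrier status against case/control status on the subjects in idx."""
    a = b = c = d = 0
    for i in idx:
        if labels[i]:            # case
            if carrier[i]:
                a += 1
            else:
                b += 1
        else:                    # control
            if carrier[i]:
                c += 1
            else:
                d += 1
    return chi2_p_value(a, b, c, d)


def rasar_votes(genotypes, labels, n_splits=20, sample_frac=0.5,
                alpha=0.05, seed=0):
    """Count, per SNP, in how many random sample sets it reaches significance.

    All defaults are hypothetical; they are not taken from the thesis.
    """
    rng = random.Random(seed)
    n = len(labels)
    k = max(1, int(n * sample_frac))
    votes = {snp: 0 for snp in genotypes}
    for _ in range(n_splits):
        idx = rng.sample(range(n), k)  # one random sample set of subjects
        for snp, carrier in genotypes.items():
            if snp_p_value(carrier, labels, idx) < alpha:
                votes[snp] += 1
    return votes


def select_by_vote(votes, n_splits, min_vote_frac=0.8):
    """Keep SNPs that were significant in at least min_vote_frac of the sets."""
    return sorted(s for s, v in votes.items() if v / n_splits >= min_vote_frac)


# Toy data (fabricated for illustration only): 200 subjects, alternating labels.
labels = [i % 2 for i in range(200)]
genotypes = {
    # Carrier status tracks case status, with every tenth subject flipped.
    "snp_assoc": [lab if i % 10 else 1 - lab for i, lab in enumerate(labels)],
    # Carrier status is balanced across cases and controls.
    "snp_null": [(i // 2) % 2 for i in range(200)],
}
votes = rasar_votes(genotypes, labels, n_splits=20)
selected = select_by_vote(votes, n_splits=20)
print(votes, selected)
```

The design point the sketch is meant to show is that a marker is retained only if it is repeatedly significant across independent random sample sets, so a variant that reaches significance by chance in one subset is unlikely to collect enough votes to be reported.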
