The evidential statistical paradigm in genetics

Concerns over reproducibility in research have reinvigorated the discourse on P‐values as measures of statistical evidence. In a position statement, the American Statistical Association board of directors warns of P‐value misuse and refers to the availability of alternatives. Despite the common practice of comparing P‐values across different hypothesis tests in genetics, it is well appreciated that P‐values must be interpreted alongside the sample size and experimental design used for their computation. Here, we discuss the evidential statistical paradigm (EP), an alternative to the Bayesian and frequentist paradigms that has been implemented in human genetics studies. Using applications in cystic fibrosis genetic association analyses, and describing recent theoretical developments, we review how to measure statistical evidence using the EP in the presence of covariates, under model misspecification, and for composite hypotheses. Novel graphical displays are presented, and software for their computation is highlighted. The implications of multiple hypothesis testing for the EP are delineated in the analyses, demonstrating a view more consistent with scientific reasoning; the EP provides a theoretical justification for replication, which is a requirement in genetic association studies. As genetic studies grow in size and complexity, a fresh look at measures of statistical evidence that remain sensible amid the analysis of big data is required.
