A/B Testing with Fat Tails

We propose a new framework for optimal experimentation, which we term the “A/B testing problem.” Our model departs from the existing literature by allowing for fat tails. Our key insight is that the optimal strategy depends on whether most gains accrue from typical innovations or from rare, unpredictable large successes. If the tails of the unobserved distribution of innovation quality are not too fat, the standard approach of using a few high-powered “big” experiments is optimal. However, if the distribution is very fat tailed, a “lean” strategy of trying more ideas, each with possibly smaller sample sizes, is preferred. Our theoretical results, along with an empirical analysis of Microsoft Bing’s EXP platform, suggest that simple changes to business practices could increase innovation productivity.
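The trade-off between "big" and "lean" experimentation can be illustrated with a small Monte Carlo sketch. It is not the paper's model or estimation procedure. It assumes idea quality is drawn from a zero-centered Student-t distribution (degrees of freedom stand in for tail fatness), that each experiment returns the true quality plus Gaussian noise shrinking with the square root of the sample size, and that an idea ships when its estimate is positive. The function name `simulated_gain` and all parameter names and default values are hypothetical.

```python
import numpy as np


def simulated_gain(n_ideas_tested, total_users, tail_index,
                   noise_sd=1.0, scale=0.01, n_sims=20000, seed=0):
    """Average total gain from splitting total_users evenly across tests.

    Assumptions (illustrative only): quality ~ scale * Student-t(tail_index),
    estimate = quality + Gaussian noise with sd noise_sd / sqrt(users per test),
    and an idea is shipped whenever its estimate is positive.
    """
    rng = np.random.default_rng(seed)
    users_per_test = total_users / n_ideas_tested

    # True (unobserved) quality of each tested idea, one row per simulation.
    quality = scale * rng.standard_t(tail_index, size=(n_sims, n_ideas_tested))

    # Noisy experimental estimate; fewer users per test means more noise.
    noise = noise_sd / np.sqrt(users_per_test) * rng.standard_normal(quality.shape)
    estimate = quality + noise

    # Ship ideas with a positive estimate and collect their true quality.
    shipped = estimate > 0
    return float((quality * shipped).sum(axis=1).mean())


if __name__ == "__main__":
    total_users = 10_000
    for tail_index in (1.5, 8.0):          # smaller value = fatter tails
        for n_ideas in (5, 20, 100):       # few big tests vs. many lean tests
            gain = simulated_gain(n_ideas, total_users, tail_index)
            print(f"tail_index={tail_index}, ideas={n_ideas}, gain={gain:.4f}")
```

Holding `total_users` fixed, raising `n_ideas_tested` trades precision per test for breadth. Comparing the printed gains across `tail_index` values gives a rough feel for how tail fatness could shift that balance. With very fat tails the simulated averages are themselves noisy, so any pattern should be read with caution.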
