Optimal Testing in the Experiment-rich Regime

Motivated by the widespread adoption of large-scale A/B testing in industry, we propose a new experimentation framework for the setting where potential experiments are abundant (i.e., many hypotheses are available to test) and observations are costly; we refer to this as the experiment-rich regime. Such scenarios require the experimenter to internalize the opportunity cost of assigning a sample to a particular experiment. We fully characterize the optimal policy and give an algorithm to compute it. Furthermore, we develop a simple heuristic that also provides intuition for the optimal policy. We use simulations based on real data to compare both the optimal algorithm and the heuristic to other natural experimental-design alternatives. In particular, we discuss the paradox of power: high-powered classical tests can lead to highly inefficient sampling in the experiment-rich regime.
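
The paradox of power can be illustrated with a toy simulation (this is not the paper's algorithm, just a hedged sketch): when hypotheses are abundant and most are null, a classical test sized for high power spends a large fixed sample on every hypothesis, while a cheap sequential screen abandons nulls after only a couple of observations.

```python
import random

random.seed(0)

# Toy model: each hypothesis is a coin; nulls land heads with p = 0.5.
# Strategy A: a fixed-sample test sized for high power (say 100 flips per coin).
# Strategy B: a sequential screen that abandons a coin at its first tail,
# so strongly biased coins survive while nulls are discarded almost instantly.
# (The coin metaphor and the 100-flip budget are illustrative assumptions.)

def flips_fixed(p, n=100):
    """Spend n flips on a coin regardless of interim evidence."""
    return n

def flips_screen(p, max_flips=100):
    """Flip until the first tail (or max_flips), then abandon the coin."""
    used = 0
    for _ in range(max_flips):
        used += 1
        if random.random() >= p:  # observed a tail: abandon immediately
            break
    return used

# Average flips spent per null hypothesis under each strategy.
NULL_P = 0.5
avg_fixed = sum(flips_fixed(NULL_P) for _ in range(1000)) / 1000
avg_screen = sum(flips_screen(NULL_P) for _ in range(1000)) / 1000
print(avg_fixed, avg_screen)  # screening spends ~2 flips per null, not 100
```

With many candidate hypotheses and few true effects, nearly all samples go to nulls, so the per-null cost (100 vs. roughly 2 flips here) dominates the total budget; this is the inefficiency the abstract's experiment-rich regime is designed to avoid.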
