Near-optimal inference in adaptive linear regression

When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normality can lead to erroneous results. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. Our proposed methods take advantage of the covariance structure present in the dataset and provide sharper estimates in directions for which more information has accrued. We establish an asymptotic normality property for our proposed online debiasing estimators under mild conditions on the data collection process and provide asymptotically exact confidence intervals. We additionally prove a minimax lower bound for the adaptive linear regression problem, thereby providing a baseline against which to compare estimators. Under various conditions, our proposed estimators achieve the minimax lower bound up to logarithmic factors. We demonstrate the usefulness of our theory via applications to multi-armed bandits, autoregressive time series estimation, and active learning with exploration.
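As a rough illustration of the problem the abstract describes (not of the proposed debiasing estimators), the following sketch simulates a two-armed bandit in which both arms have true mean 0 and arms are chosen greedily. With one-hot arm indicators as covariates, the least squares estimate for an arm is simply its sample mean, so the simulation shows how adaptive collection can distort a plain least squares estimate. The greedy rule, Gaussian rewards, horizon, and all function names are assumptions made for this example only.

```python
import random


def greedy_two_armed_run(horizon, rng):
    """Run one greedy two-armed bandit and return arm 0's final sample mean.

    Assumption for illustration: both arms pay N(0, 1) rewards, so the
    true mean of each arm is 0 and any systematic deviation is bias.
    """
    sums = [0.0, 0.0]
    counts = [0, 0]
    for t in range(horizon):
        if t < 2:
            arm = t  # pull each arm once to initialize the sample means
        else:
            mean0 = sums[0] / counts[0]
            mean1 = sums[1] / counts[1]
            arm = 0 if mean0 >= mean1 else 1  # greedy (adaptive) choice
        reward = rng.gauss(0.0, 1.0)
        sums[arm] += reward
        counts[arm] += 1
    # With one-hot covariates, this sample mean equals the OLS estimate.
    return sums[0] / counts[0]


def estimate_bias(num_runs=5000, horizon=50, seed=0):
    """Average arm 0's sample mean over many independent greedy runs."""
    rng = random.Random(seed)
    estimates = [greedy_two_armed_run(horizon, rng) for _ in range(num_runs)]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # The true mean is 0; a clearly nonzero average indicates bias.
    print("average sample mean of arm 0:", estimate_bias())
```

Under non-adaptive sampling the average would be close to 0. Under the greedy rule an arm that looks bad early is rarely pulled again, so its low estimate is never corrected and the average tends to fall below 0.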
