PAC Mode Estimation using PPR Martingale Confidence Sequences

We consider the problem of correctly identifying the mode of a discrete distribution P with sufficiently high probability by observing a sequence of i.i.d. samples drawn from P. This problem reduces to the estimation of a single parameter when P has a support set of size K = 2. After noting that this special case is handled very well by prior-posterior-ratio (PPR) martingale confidence sequences (Waudby-Smith and Ramdas, 2020), we propose a generalisation to mode estimation, in which P may take K ≥ 2 values. To begin, we show that the “one-versus-one” principle for generalising from K = 2 to K ≥ 2 classes is more efficient than the “one-versus-rest” alternative. We then prove that our resulting stopping rule, denoted PPR-1v1, is asymptotically optimal (as the mistake probability is taken to 0). PPR-1v1 is parameter-free and computationally light, and requires significantly fewer samples than competitors even in the non-asymptotic regime. We demonstrate its gains in two practical applications of sampling: election forecasting and verification of smart contracts in blockchains.

Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022, Valencia, Spain. PMLR: Volume 151. Copyright 2022 by the author(s).
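The abstract names the one-versus-one PPR stopping rule without spelling it out. The following is a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes a uniform Beta(1, 1) prior on each pairwise parameter, a union bound that gives each of the K − 1 pairwise tests a mistake budget of δ/(K − 1), and a known support set. The function names `log_ppr_at_half`, `one_vs_one_stop`, and `ppr_1v1` are hypothetical.

```python
import math
import random
from collections import Counter


def log_ppr_at_half(n_i, n_j):
    """Log prior-posterior ratio at theta = 1/2 for a pairwise test.

    Assumes a uniform Beta(1, 1) prior, so the posterior after seeing
    n_i samples of value i and n_j samples of value j is
    Beta(n_i + 1, n_j + 1), and the prior density is 1 everywhere.
    """
    a, b = n_i + 1, n_j + 1
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_posterior = (n_i + n_j) * math.log(0.5) - log_beta_fn
    return -log_posterior


def one_vs_one_stop(counts, values, delta):
    """Return the empirical leader if it wins every pairwise test, else None."""
    if not counts or len(values) < 2:
        return None
    # Assumed union bound: each of the K - 1 pairwise tests gets delta / (K - 1).
    threshold = math.log((len(values) - 1) / delta)
    leader, n_lead = counts.most_common(1)[0]
    for v in values:
        if v != leader and log_ppr_at_half(n_lead, counts[v]) < threshold:
            return None
    return leader


def ppr_1v1(stream, values, delta):
    """Consume samples until the stopping rule fires; return (mode, samples used)."""
    counts = Counter()
    for t, x in enumerate(stream, start=1):
        counts[x] += 1
        winner = one_vs_one_stop(counts, values, delta)
        if winner is not None:
            return winner, t
    return None, sum(counts.values())


if __name__ == "__main__":
    # Illustrative distribution over three values (made up for this demo).
    rng = random.Random(0)
    values = ["a", "b", "c"]
    probs = [0.5, 0.3, 0.2]
    stream = (rng.choices(values, weights=probs)[0] for _ in range(100_000))
    print(ppr_1v1(stream, values, delta=0.05))
```

With K = 2 the threshold reduces to 1/δ, which matches the single-parameter case mentioned in the abstract: sampling stops once 1/2 is excluded from the confidence sequence for the pairwise parameter. The leader only needs to be checked against each rival, because it is the only value that can currently win every pairwise comparison.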

Vinay J. Ribeiro | Shivaram Kalyanakrishnan | Jian Vora | Sourav Das | Shubham Anand Jain | Sanit Gupta | Denil Mehta | Inderjeet Jayakumar Nair | Rohan Shah | Sushil Khyalia

[1] K. Gross, et al. Sequential probability ratio test for nuclear plant component surveillance, 1991.

[2] Aditya Gopalan, et al. Sequential Mode Estimation with Oracle Queries, 2019, AAAI.

[3] R. Karandikar, et al. Predicting the 1998 Indian parliamentary election, 2002.

[4] E. Parzen. On Estimation of a Probability Density Function and Mode, 1962.

[5] Ambuj Tewari, et al. PAC Subset Selection in Stochastic Multi-armed Bandits, 2012, ICML.

[6] C. Payne. Election Forecasting in the UK: The BBC's Experience, 2003.

[7] Massimiliano Pontil, et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.

[8] Jayadev Misra, et al. Finding Repeated Elements, 1982, Sci. Comput. Program.

[9] Sourav Das, et al. YODA: Enabling computationally intensive contracts on blockchains with Byzantine and Selfish nodes, 2019, NDSS.

[10] Shivaram Kalyanakrishnan, et al. Information Complexity in Bandit Subset Selection, 2013, COLT.

[11] S. Nakamoto, et al. Bitcoin: A Peer-to-Peer Electronic Cash System, 2008.

[12] Wouter M. Koolen, et al. Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals, 2018, J. Mach. Learn. Res.

[13] Aaditya Ramdas, et al. Confidence sequences for sampling without replacement, 2020, NeurIPS.

[14] R. Khan, et al. Sequential Tests of Statistical Hypotheses, 1972.

[15] Matthew Malloy, et al. lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits, 2013, COLT.

[16] Aurélien Garivier, et al. Optimal Best Arm Identification with Fixed Confidence, 2016, COLT.

[17] Arnab Bhattacharyya, et al. Sample Complexity for Winner Prediction in Elections, 2015, AAMAS.

[18] Kaigui Bian, et al. Robust Distributed Spectrum Sensing in Cognitive Radio Networks, 2008, IEEE INFOCOM 2008 - The 27th Conference on Computer Communications.

[19] Vitalik Buterin. A Next Generation Smart Contract & Decentralized Application Platform, 2015.

[20] Paul Perry, et al. Certain Problems in Election Survey Methodology, 1979.

[21] Aurélien Garivier, et al. Informational confidence bounds for self-normalized averages and applications, 2013, 2013 IEEE Information Theory Workshop (ITW).

[22] Jon D. McAuliffe, et al. Time-uniform Chernoff bounds via nonnegative supermartingales, 2018, Probability Surveys.