Paper Information - Near-Optimal Target Learning With Stochastic Binary Signals

Near-Optimal Target Learning With Stochastic Binary Signals

We study learning in a noisy bisection model: specifically, Bayesian algorithms to learn a target value V given access only to noisy realizations of whether V is less than or greater than a threshold theta. At step t = 0, 1, 2, ..., the learner sets a threshold theta_t and observes a noisy realization of sign(V - theta_t). After T steps, the goal is to output an estimate V^ that is within an eta-tolerance of V. This problem has been studied predominantly in environments with a fixed error probability q < 1/2 for the noisy realization of sign(V - theta_t). In practice, q can often approach 1/2, especially as theta -> V, and little is known about this case. We give a pseudo-Bayesian algorithm that provably converges to V. When the true prior matches our algorithm's Gaussian prior, we show near-optimal expected performance. Our methods extend to the general multiple-threshold setting, where the observation noisily indicates which of k >= 2 regions V belongs to.
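The noisy bisection setting described in the abstract can be illustrated with a minimal sketch. This is not the paper's pseudo-Bayesian algorithm: it is the classical probabilistic-bisection baseline, assuming a fixed, known error probability q, a Gaussian prior discretized onto a grid, and thresholds placed at the posterior median. All function names and parameters (`noisy_sign`, `posterior_median`, `learn_target`, the grid size `n`) are hypothetical and chosen only for illustration.

```python
import math
import random


def noisy_sign(v, theta, q, rng):
    """Return sign(v - theta) as +1/-1, flipped with probability q."""
    s = 1 if v >= theta else -1
    return -s if rng.random() < q else s


def posterior_median(grid, weights):
    """Smallest grid point whose cumulative weight reaches half the total."""
    half = 0.5 * sum(weights)
    acc = 0.0
    for x, w in zip(grid, weights):
        acc += w
        if acc >= half:
            return x
    return grid[-1]


def learn_target(v, q=0.1, steps=300, mu=0.0, sigma=1.0, n=2001, seed=0):
    """Estimate v from noisy threshold comparisons (assumes q is fixed and known)."""
    rng = random.Random(seed)
    lo, hi = mu - 5.0 * sigma, mu + 5.0 * sigma
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    # Unnormalized Gaussian prior on the grid.
    weights = [math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in grid]
    for _ in range(steps):
        theta = posterior_median(grid, weights)  # threshold theta_t
        s = noisy_sign(v, theta, q, rng)         # noisy observation of sign(v - theta_t)
        # Bayes update: candidates that agree with the signal get likelihood
        # 1 - q, the others get q.
        weights = [
            w * ((1.0 - q) if (1 if x >= theta else -1) == s else q)
            for x, w in zip(grid, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]   # renormalize to avoid underflow
    return posterior_median(grid, weights)
```

Calling `learn_target(0.3, q=0.1, steps=300)` returns a grid point as the estimate V^. With `q=0.0` the update reduces to ordinary noiseless bisection on the grid. The sketch does not model the case the abstract highlights, where q approaches 1/2 as theta nears V.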
