Paper Information - Collaborative 20 Questions for Target Localization

Collaborative 20 Questions for Target Localization

We consider the problem of 20 questions with noise for multiple players under the minimum entropy criterion in the setting of stochastic search, with application to target localization. Each player yields a noisy response to a binary query governed by a certain error probability. First, we propose a sequential policy for constructing questions that queries each player in sequence and refines the posterior of the target location. Second, we consider a joint policy that asks all players questions in parallel at each time instant, and we characterize the structure of the optimal policy for constructing the sequence of questions. This generalizes the single-player probabilistic bisection method for stochastic search problems. Third, we prove an equivalence between the two schemes, showing that, although the sequential scheme has access to a more refined filtration, the joint scheme performs just as well on average. Fourth, we establish convergence rates of the mean-square error and derive error exponents. Finally, we obtain an extension to the case of unknown error probabilities. This framework provides a mathematical model for incorporating a human in the loop for active machine learning systems.
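The sequential scheme described in the abstract can be illustrated with a minimal sketch of noisy probabilistic bisection on a discretized interval. This is not the authors' implementation: the grid discretization, the function names (`median_split`, `bisection_update`, `localize`), and the assumption that each player's error probability is known and below 0.5 are all illustrative choices made here. Each player in turn is asked whether the target lies to the left of the current posterior median, and the posterior is refined by Bayes' rule after every answer.

```python
import random


def median_split(posterior):
    """Return the smallest k such that cells 0..k-1 hold at least half the mass."""
    acc = 0.0
    for i, p in enumerate(posterior):
        acc += p
        if acc >= 0.5:
            return i + 1
    return len(posterior)


def bisection_update(posterior, k, answer, eps):
    """Bayes update for the query 'is the target in cells 0..k-1?'.

    `answer` is the player's (possibly wrong) yes/no reply and `eps` is the
    assumed-known probability that the reply is flipped.
    """
    like_in = 1 - eps if answer else eps    # likelihood for cells inside the query set
    like_out = eps if answer else 1 - eps   # likelihood for cells outside it
    updated = [p * (like_in if i < k else like_out) for i, p in enumerate(posterior)]
    total = sum(updated)
    return [u / total for u in updated]


def localize(target_cell, n_cells, error_probs, n_rounds, rng):
    """Query each player in sequence for n_rounds, starting from a uniform prior."""
    posterior = [1.0 / n_cells] * n_cells
    for _ in range(n_rounds):
        for eps in error_probs:             # one query per player per round
            k = median_split(posterior)
            truth = target_cell < k
            # The simulated player flips the true answer with probability eps.
            answer = truth if rng.random() >= eps else not truth
            posterior = bisection_update(posterior, k, answer, eps)
    return posterior


if __name__ == "__main__":
    rng = random.Random(0)
    post = localize(target_cell=37, n_cells=100,
                    error_probs=[0.1, 0.2, 0.3], n_rounds=40, rng=rng)
    print("most likely cell:", post.index(max(post)))
```

Because every answer reweights the posterior by the answering player's own error probability, more reliable players shift the estimate more strongly than noisy ones. The joint scheme from the abstract, which queries all players in parallel, is not covered by this sketch.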

Alfred O. Hero | Brian M. Sadler | Theodoros Tsiligkaridis

[1] Susan A. Murphy, et al. Monographs on statistics and applied probability, 1990.

[2] Robert Nowak, et al. Active Learning and Sampling, 2008.

[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[4] Yi Zhang, et al. Exploration and Exploitation in Adaptive Filtering Based on Bayesian Active Learning, 2003, ICML.

[5] Pascal Fua, et al. An Optimal Policy for Target Localization with Application to Electron Microscopy, 2013, ICML.

[6] Devavrat Shah, et al. Budget-Optimal Task Allocation for Reliable Crowdsourcing Systems, 2011, Oper. Res.

[7] R. Nowak, et al. Upper and Lower Error Bounds for Active Learning, 2006.

[8] Burr Settles, et al. Active Learning Literature Survey, 2009.

[9] P. W. Jones, et al. Bandit Problems, Sequential Allocation of Experiments, 1987.

[10] Keith D. Kastella, et al. Foundations and Applications of Sensor Management, 2010.

[11] Robert D. Nowak, et al. Active Ranking using Pairwise Comparisons, 2011, NIPS.

[12] M. Degroot. Optimal Statistical Decisions, 1970.

[13] Allen Gersho, et al. Vector quantization and signal compression, 1991, The Kluwer international series in engineering and computer science.

[14] Dimitri P. Bertsekas, et al. Stochastic optimal control: the discrete time case, 2007.

[15] Peter I. Frazier, et al. A Bayesian approach to stochastic root finding, 2011, Proceedings of the 2011 Winter Simulation Conference (WSC).

[16] Michael Horstein, et al. Sequential transmission using noiseless feedback, 1963, IEEE Trans. Inf. Theory.

[17] Tibor Hegedüs, et al. Generalized Teaching Dimensions and the Query Complexity of Learning, 1995, COLT.

[18] Peter I. Frazier, et al. Twenty Questions with Noise: Bayes Optimal Policies for Entropy Loss, 2012, Journal of Applied Probability.

[19] Tibor Hegedűs, et al. Generalized teaching dimensions and the query complexity of learning, 1995, Annual Conference Computational Learning Theory.

[20] Eli Upfal, et al. Computing with Noisy Information, 1994, SIAM J. Comput.

[21] Robert D. Nowak, et al. Query Complexity of Derivative-Free Optimization, 2012, NIPS.

[22] Robert D. Nowak, et al. The Geometry of Generalized Binary Search, 2009, IEEE Transactions on Information Theory.

[23] Rui M. Castro, et al. Active Learning and Adaptive Sampling for Non-Parametric Inference, 2007.

[24] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.

[25] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[26] James O. Berger, et al. Bayesian analysis of dynamic item response models in educational testing, 2013, arXiv:1304.4441.

[27] H. Robbins. A Stochastic Approximation Method, 1951.

[28] Onésimo Hernández-Lerma, et al. Controlled Markov Processes, 1965.