Near-optimal Bayesian Active Learning with Correlated and Noisy Tests

We consider the Bayesian active learning and experimental design problem, where the goal is to learn the value of some unknown target variable through a sequence of informative, noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly. In this paper, we propose ECED, a novel, computationally efficient active learning algorithm, and prove strong theoretical guarantees that hold with correlated, noisy tests. Rather than directly optimizing the prediction error, at each step, ECED picks the test that maximizes the gain in a surrogate objective, which takes into account the dependencies between tests. Our analysis relies on an information-theoretic auxiliary function to track the progress of ECED, and uses adaptive submodularity to attain the near-optimal bound. We demonstrate strong empirical performance of ECED on two problem instances: a Bayesian experimental design task intended to distinguish among economic theories of how people make risky decisions, and an active preference learning task via pairwise comparisons.
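To make the baseline heuristic mentioned above concrete, the following is a minimal sketch of greedy test selection by expected reduction in uncertainty (entropy) of the target. It is not ECED, whose surrogate objective is not specified in this abstract. The sketch assumes a finite set of hidden hypotheses, each mapped to a target value, and known outcome likelihoods for each test given the hypothesis; all names (`expected_info_gain`, `pick_test`, `t_split`, `t_within`) and the toy numbers are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def target_marginal(post, target_of):
    """Collapse a distribution over hypotheses onto the target variable."""
    marg = {}
    for h, p in post.items():
        marg[target_of[h]] = marg.get(target_of[h], 0.0) + p
    return marg

def expected_info_gain(post, lik_t, target_of):
    """Expected entropy reduction of the target from running one test.

    lik_t[h][x] is the assumed probability of outcome x under hypothesis h.
    """
    before = entropy(target_marginal(post, target_of).values())
    outcomes = next(iter(lik_t.values())).keys()
    after = 0.0
    for x in outcomes:
        z = sum(post[h] * lik_t[h][x] for h in post)  # P(outcome = x)
        if z == 0:
            continue
        new_post = {h: post[h] * lik_t[h][x] / z for h in post}  # Bayes update
        after += z * entropy(target_marginal(new_post, target_of).values())
    return before - after

def pick_test(post, lik, target_of, remaining):
    """Greedy step: choose the remaining test with the largest expected gain."""
    return max(remaining, key=lambda t: expected_info_gain(post, lik[t], target_of))

# Toy instance (assumed numbers): four hypotheses, two target values.
target_of = {"h0": "A", "h1": "A", "h2": "B", "h3": "B"}
prior = {h: 0.25 for h in target_of}
lik = {
    # Noisy test that separates target A from target B.
    "t_split": {h: ({0: 0.9, 1: 0.1} if target_of[h] == "A" else {0: 0.1, 1: 0.9})
                for h in target_of},
    # Noisy test that only separates hypotheses sharing the same target value.
    "t_within": {h: ({0: 0.1, 1: 0.9} if h in ("h1", "h3") else {0: 0.9, 1: 0.1})
                 for h in target_of},
}

best = pick_test(prior, lik, target_of, ["t_split", "t_within"])
gain_split = expected_info_gain(prior, lik["t_split"], target_of)
gain_within = expected_info_gain(prior, lik["t_within"], target_of)
```

In this toy instance the heuristic selects `t_split`, because `t_within` carries no immediate information about the target. The criterion is myopic: it scores each test by its one-step effect on the target's entropy only.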
