Light at the end of the tunnel: a Monte Carlo approach to computing value of information

Computing the expected value of information (VOI) for sequences of observations under uncertainty is intractable in the general case, because the branching tree of potential outcomes for every set of observations must be considered. We address the combinatorial challenge of computing ideal observational policies in situations where long sequences of weak evidential updates may have to be considered. We introduce and validate Monte Carlo procedures for computing VOI over such long evidential sequences. We evaluate the procedure on a synthetic dataset and on a challenging citizen-science problem, and demonstrate how it can effectively cut through the intractability of the combinatorial space.
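
To make the idea concrete, the following is a minimal sketch of a sampling-based VOI estimate, not the procedure evaluated in the paper. It assumes a toy setting that does not come from the source: a binary hidden state, independent noisy votes that are each correct with probability `acc`, a utility of 1 for a correct final decision, and a fixed `cost` per observation. The function names and all parameter values are hypothetical. Instead of enumerating every branch of the outcome tree, it samples observation sequences and averages the utility of acting on the resulting posterior.

```python
import random


def posterior_update(p, vote, acc):
    """Bayes update of P(state = 1) after one noisy binary vote."""
    like1 = acc if vote == 1 else 1 - acc   # P(vote | state = 1)
    like0 = 1 - acc if vote == 1 else acc   # P(vote | state = 0)
    return p * like1 / (p * like1 + (1 - p) * like0)


def mc_voi(p, n_obs, acc=0.6, cost=0.01, n_samples=5000, rng=None):
    """Monte Carlo estimate of the net value of collecting n_obs more votes.

    Toy assumptions (not from the paper): utility is 1 for a correct
    decision and 0 otherwise, and each vote costs `cost`.
    """
    rng = rng or random.Random(0)
    act_now = max(p, 1 - p)  # expected utility of deciding immediately
    total = 0.0
    for _ in range(n_samples):
        # Sample a hidden state from the current belief.
        state = 1 if rng.random() < p else 0
        belief = p
        # Sample one sequence of votes and update the belief along it.
        for _ in range(n_obs):
            vote = state if rng.random() < acc else 1 - state
            belief = posterior_update(belief, vote, acc)
        # Expected utility of the best decision at the end of the sequence.
        total += max(belief, 1 - belief)
    return total / n_samples - n_obs * cost - act_now


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(n, round(mc_voi(0.7, n, cost=0.001), 3))
```

In this toy setting, with a prior of 0.7 a single vote of accuracy 0.6 can never flip the decision, so its one-step value is just the negative of its cost, while a longer sequence of the same weak votes has positive net value. That gap between one-step and long-sequence value is the situation the abstract describes.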
