Nonmyopic Multifidelity Active Search

Active search is a learning paradigm where we seek to identify as many members of a rare, valuable class as possible given a labeling budget. Previous work on active search has assumed access to a faithful (and expensive) oracle reporting experimental results. However, some settings offer access to cheaper surrogates, such as computational simulations, that may aid in the search. We propose a model of multifidelity active search, as well as a novel, computationally efficient policy for this setting that is motivated by state-of-the-art classical policies. Our policy is nonmyopic and budget-aware, allowing for a dynamic tradeoff between exploration and exploitation. We evaluate the performance of our solution on real-world datasets and demonstrate significantly better performance than natural benchmarks.
