Discovering Valuable Items from Massive Data

Suppose there is a large collection of items, each with an associated cost and an inherent utility that is revealed only once we commit to selecting it. Given a budget on the cumulative cost of the selected items, how can we pick a subset of maximal value? This task generalizes several important problems, such as multi-armed bandits, active search, and the knapsack problem. We present an algorithm, GP-SELECT, which utilizes prior knowledge about similarity between items, expressed as a kernel function. GP-SELECT uses Gaussian process prediction to balance exploration (estimating the unknown value of items) and exploitation (selecting items of high value). We extend GP-SELECT to discover sets that simultaneously have high utility and are diverse. The preference for diversity can be specified as an arbitrary monotone submodular function that quantifies the diminishing returns obtained when selecting similar items. Furthermore, we exploit the structure of the model updates to achieve an order-of-magnitude (up to 40x) speedup in our experiments without resorting to approximations. We provide strong guarantees on the performance of GP-SELECT and apply it to three real-world case studies of industrial relevance: (1) refreshing a repository of prices in a Global Distribution System for the travel industry, (2) identifying diverse peptides with high binding affinity in a vaccine design task, and (3) maximizing clicks in a web-scale recommender system by recommending items to users.
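
To make the selection rule concrete, here is a minimal sketch of a Gaussian process upper-confidence-bound loop of the kind the abstract describes. It is an illustration under simplifying assumptions (unit item costs, a squared-exponential kernel with unit prior variance, a fixed exploration weight beta); the names gp_select and reveal_value and all parameter values are hypothetical, not taken from the paper.

```python
# A minimal sketch of a GP-SELECT-style loop (not the authors' code).
# Assumptions, all illustrative: unit item costs, a squared-exponential
# kernel with unit prior variance, and a fixed exploration weight `beta`.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X_sel, y_sel, X_all, noise=1e-2):
    """GP posterior mean and variance at every item, given observed values."""
    K = rbf_kernel(X_sel, X_sel) + noise * np.eye(len(X_sel))
    Ks = rbf_kernel(X_sel, X_all)                      # cross-covariances
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_sel))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - (v**2).sum(axis=0)                     # k(x, x) = 1 for RBF
    return mu, np.maximum(var, 0.0)

def gp_select(X, reveal_value, budget, beta=2.0):
    """Greedily pick `budget` items by upper confidence bound (UCB)."""
    n = len(X)
    selected, values = [], []
    for _ in range(budget):                            # unit costs assumed
        if selected:
            mu, var = gp_posterior(X[selected], np.array(values), X)
        else:
            mu, var = np.zeros(n), np.ones(n)          # prior: mean 0, var 1
        ucb = mu + beta * np.sqrt(var)                 # optimism under
        ucb[selected] = -np.inf                        # uncertainty
        i = int(np.argmax(ucb))
        selected.append(i)
        values.append(reveal_value(i))                 # utility revealed now
    return selected, values

# Toy usage: 200 items with 2-D features whose hidden utility is a smooth
# function of the features, revealed only upon selection.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
hidden = np.sin(X[:, 0]) + 0.5 * np.cos(X[:, 1])
picked, vals = gp_select(X, lambda i: hidden[i] + 0.05 * rng.normal(), budget=20)
print(f"mean value of selected items: {np.mean(vals):.3f}")
```

For the diversity extension, one natural reading of the abstract (a sketch, not a verbatim account of the paper's method) is to augment each item's UCB score with the marginal gain of the chosen monotone submodular function before taking the argmax. The abstract attributes the up-to-40x speedup to structure in the model updates; a classical device in this family is lazy, Minoux-style re-evaluation of scores, which would be consistent with the exact, approximation-free speedup reported, though the abstract does not spell out the precise mechanism.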
