Bandit-Based Monte Carlo Optimization for Nearest Neighbors

The celebrated Monte Carlo method estimates an expensive-to-compute quantity by random sampling. Bandit-based Monte Carlo optimization is a general technique for computing the minimum of many such expensive-to-compute quantities by adaptive random sampling. The technique converts an optimization problem into a statistical estimation problem which is then solved via multi-armed bandits. We apply this technique to solve the problem of high-dimensional $k$-nearest neighbors, developing an algorithm which we prove is able to identify exact nearest neighbors with high probability. We show that under regularity assumptions on a dataset of $n$ points in $d$-dimensional space, the complexity of our algorithm scales logarithmically with the dimension of the data as $O\left((n+d)\log^{2}\frac{nd}{\delta}\right)$ for error probability $\delta$, rather than linearly as in exact computation requiring $O(nd)$. We corroborate our theoretical results with numerical simulations, showing that our algorithm outperforms both exact computation and state-of-the-art algorithms such as kGraph, NGT, and LSH on real datasets.
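
To make the reduction concrete, the following is a minimal Python sketch of the general idea for the special case $k = 1$. It is not the algorithm analyzed in the paper. Each candidate point is treated as an arm, and pulling an arm means sampling a random coordinate and observing the squared difference to the query in that coordinate, so the arm's mean is the squared distance divided by $d$. Arms are discarded by a standard successive-elimination rule with Hoeffding confidence intervals, and any arm that has been sampled $d$ times is resolved by exact computation. The function name `bandit_nearest_neighbor`, the parameters `batch` and `value_range`, and the choice of confidence bound are assumptions made for illustration.

```python
import numpy as np


def bandit_nearest_neighbor(query, points, delta=0.01, batch=64,
                            value_range=1.0, seed=0):
    """Illustrative successive-elimination search for the nearest neighbor.

    Assumption: every squared coordinate difference (q_j - x_ij)^2 lies in
    [0, value_range], so Hoeffding's inequality applies to the samples.
    Returns (index of the selected point, number of coordinate evaluations).
    """
    rng = np.random.default_rng(seed)
    n, d = points.shape
    active = np.arange(n)   # arms (points) still in contention
    sums = np.zeros(n)      # running sum of sampled squared differences
    pulls = 0               # samples drawn so far for each active arm
    cost = 0                # total coordinate-wise evaluations

    while len(active) > 1:
        if pulls >= d:
            # Sampling budget reached d: fall back to exact distances
            # for the few arms that are still hard to separate.
            exact = ((points[active] - query) ** 2).sum(axis=1)
            cost += len(active) * d
            return int(active[np.argmin(exact)]), cost

        # Draw a batch of coordinates (with replacement), shared by all arms.
        coords = rng.integers(0, d, size=batch)
        diffs = points[np.ix_(active, coords)] - query[coords]
        sums[active] += (diffs ** 2).sum(axis=1)
        pulls += batch
        cost += len(active) * batch

        # Hoeffding interval with a union bound over arms and sample counts.
        means = sums[active] / pulls
        width = value_range * np.sqrt(
            np.log(4 * n * pulls ** 2 / delta) / (2 * pulls)
        )

        # Keep an arm only if its lower bound does not exceed the smallest
        # upper bound; the arm with the smallest mean always survives.
        active = active[means - width <= means.min() + width]

    return int(active[0]), cost
```

When one point is much closer to the query than the rest, most arms are eliminated after a number of samples that depends on the distance gaps rather than on $d$, which is the source of the savings over the $O(nd)$ exact computation. Extending the sketch to $k > 1$ would require a top-$k$ elimination rule, which is omitted here.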
