Tree-Independent Dual-Tree Algorithms

Dual-tree algorithms are a widely used class of branch-and-bound algorithms. Unfortunately, developing dual-tree algorithms for use with different trees and problems is often complex and burdensome. We introduce a four-part logical split: the tree, the traversal, the point-to-point base case, and the pruning rule. We provide a meta-algorithm which allows development of dual-tree algorithms in a tree-independent manner and easy extension to entirely new types of trees. Representations are provided for five common algorithms; for k-nearest neighbor search, this leads to a novel, tighter pruning bound. The meta-algorithm also allows straightforward extensions to massively parallel settings.
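The four-part split described above can be illustrated with a minimal sketch for single nearest-neighbor search. Everything named here (`Node`, `NearestNeighborRules`, `dual_tree_traversal`, `nearest_neighbors`, the kd-tree-style tree, and the simple pruning bound) is an assumption made for illustration. It is not the paper's meta-algorithm as published and not its tighter k-nearest-neighbor bound.

```python
import math
import random


class Node:
    """Tree: a kd-tree-style binary space tree; each node stores its bounding box."""

    def __init__(self, points, idx, leaf_size=4):
        self.idx = idx  # indices of all descendant points of this node
        dims = range(len(points[0]))
        self.lo = [min(points[i][d] for i in idx) for d in dims]
        self.hi = [max(points[i][d] for i in idx) for d in dims]
        self.children = []
        if len(idx) > leaf_size:
            # Split at the median along the widest dimension.
            d = max(dims, key=lambda k: self.hi[k] - self.lo[k])
            order = sorted(idx, key=lambda i: points[i][d])
            mid = len(order) // 2
            self.children = [Node(points, order[:mid], leaf_size),
                             Node(points, order[mid:], leaf_size)]


def min_dist(a, b):
    """Smallest possible distance between the bounding boxes of two nodes."""
    total = 0.0
    for alo, ahi, blo, bhi in zip(a.lo, a.hi, b.lo, b.hi):
        gap = max(alo - bhi, blo - ahi, 0.0)
        total += gap * gap
    return math.sqrt(total)


class NearestNeighborRules:
    """Problem-specific parts: the base case and the pruning rule."""

    def __init__(self, queries, refs):
        self.queries, self.refs = queries, refs
        self.best_dist = [math.inf] * len(queries)
        self.best_idx = [-1] * len(queries)

    def base_case(self, qi, ri):
        # Point-to-point work: update the best candidate for query qi.
        d = math.dist(self.queries[qi], self.refs[ri])
        if d < self.best_dist[qi]:
            self.best_dist[qi], self.best_idx[qi] = d, ri

    def score(self, qnode, rnode):
        # Deliberately simple (loose) bound: the worst current candidate
        # distance over all query points in qnode. Returns True to prune.
        bound = max(self.best_dist[i] for i in qnode.idx)
        return min_dist(qnode, rnode) > bound


def dual_tree_traversal(qnode, rnode, rules):
    """Traversal: depth-first over node pairs; knows nothing about the problem."""
    if rules.score(qnode, rnode):
        return
    if not qnode.children and not rnode.children:
        for qi in qnode.idx:
            for ri in rnode.idx:
                rules.base_case(qi, ri)
        return
    # Recurse into children where they exist; a leaf is paired as itself.
    for qc in qnode.children or [qnode]:
        for rc in rnode.children or [rnode]:
            dual_tree_traversal(qc, rc, rules)


def nearest_neighbors(queries, refs, leaf_size=4):
    rules = NearestNeighborRules(queries, refs)
    qroot = Node(queries, list(range(len(queries))), leaf_size)
    rroot = Node(refs, list(range(len(refs))), leaf_size)
    dual_tree_traversal(qroot, rroot, rules)
    return rules.best_idx, rules.best_dist


# Usage: 50 query points against 150 reference points in the unit square.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(200)]
idx, dist = nearest_neighbors(pts[:50], pts[50:])
```

In this sketch the traversal only calls `score` and `base_case`, so a different problem would need only a different rules class, and a different tree would need only a node type exposing `children`, `idx`, and whatever bounding information `score` relies on.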
