Paper information - Barnes-Hut-SNE

Barnes-Hut-SNE

The paper presents an O(N log N) implementation of t-SNE, an embedding technique commonly used to visualize high-dimensional data in scatter plots, which normally runs in O(N^2) time. The new implementation uses vantage-point trees to compute sparse pairwise similarities between the input data objects, and it uses a variant of the Barnes-Hut algorithm, which astronomers use to perform N-body simulations, to approximate the forces between the corresponding points in the embedding. The experiments show that the new algorithm, called Barnes-Hut-SNE, offers substantial computational advantages over standard t-SNE and makes it possible to learn embeddings of data sets with millions of objects.
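The Barnes-Hut idea mentioned in the abstract can be illustrated with a minimal sketch for a two-dimensional embedding. This is not the paper's implementation: the names `Cell`, `build_tree`, `repulsion`, and `exact` are hypothetical, the opening criterion (cell width divided by distance to the cell's center of mass, compared against `theta`) is the classic Barnes-Hut rule and is assumed here, and the weights are the Student-t kernel w = 1 / (1 + d^2) from standard t-SNE. The sketch covers only the repulsive sums for one query point and leaves out the vantage-point-tree similarity search entirely.

```python
import random


class Cell:
    """Square quadtree cell that tracks its point count and center of mass."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # midpoint and half-width
        self.n = 0                                  # number of points inside
        self.mx, self.my = 0.0, 0.0                 # center of mass
        self.children = None                        # four sub-cells once split

    def insert(self, x, y):
        old = (self.mx, self.my)  # equals the single stored point when n == 1
        self.mx = (self.mx * self.n + x) / (self.n + 1)
        self.my = (self.my * self.n + y) / (self.n + 1)
        self.n += 1
        if self.n == 1 or self.half < 1e-9:
            return  # first point in this leaf, or cell too small to split
        if self.children is None:
            h = self.half / 2
            self.children = [Cell(self.cx + sx * h, self.cy + sy * h, h)
                             for sx in (-1, 1) for sy in (-1, 1)]
            self._child(*old).insert(*old)  # push the old point down
        self._child(x, y).insert(x, y)

    def _child(self, x, y):
        # Index matches the (sx, sy) order used when the children were created.
        return self.children[2 * (x >= self.cx) + (y >= self.cy)]


def build_tree(points):
    xs, ys = zip(*points)
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = Cell((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2, half)
    for x, y in points:
        root.insert(x, y)
    return root


def repulsion(node, x, y, theta):
    """Approximate (sum_j w_j^2 (y_i - y_j), sum_j w_j) for the query (x, y)."""
    if node.n == 0:
        return 0.0, 0.0, 0.0
    dx, dy = x - node.mx, y - node.my
    d2 = dx * dx + dy * dy
    leaf = node.children is None
    if leaf and d2 == 0.0:
        return 0.0, 0.0, 0.0  # the query point itself
    # Assumed opening criterion: cell width / distance < theta.
    if leaf or (2 * node.half) ** 2 < theta * theta * d2:
        w = 1.0 / (1.0 + d2)  # whole cell summarized by its center of mass
        return node.n * w * w * dx, node.n * w * w * dy, node.n * w
    fx = fy = z = 0.0
    for child in node.children:
        cfx, cfy, cz = repulsion(child, x, y, theta)
        fx, fy, z = fx + cfx, fy + cfy, z + cz
    return fx, fy, z


def exact(points, i):
    """Brute-force O(N) reference for point i, for comparison."""
    xi, yi = points[i]
    fx = fy = z = 0.0
    for j, (xj, yj) in enumerate(points):
        if j != i:
            w = 1.0 / (1.0 + (xi - xj) ** 2 + (yi - yj) ** 2)
            fx, fy, z = fx + w * w * (xi - xj), fy + w * w * (yi - yj), z + w
    return fx, fy, z
```

With `theta = 0` no cell is ever summarized, so `repulsion` reduces to the exact sum; larger values of `theta` trade accuracy for fewer visited cells. Exact duplicates of the query point are ignored in this sketch.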

Laurens van der Maaten
