Paper information - Accelerating t-SNE using tree-based algorithms


The paper investigates the acceleration of t-SNE, an embedding technique commonly used to visualize high-dimensional data in scatter plots, using two tree-based algorithms. In particular, it develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N) time. The experiments show that the resulting algorithms substantially accelerate t-SNE and make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
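To make the Barnes-Hut idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a two-dimensional embedding with distinct points, a quadtree whose cells store a point count and center of mass, and a summary test of the form "cell width divided by distance to the cell's center of mass is below `theta`". It approximates only the repulsive part of the t-SNE gradient: the unnormalized sum over `(1 + d^2)^-2 * (y_i - y_j)` and the matching contribution to the normalizer Z. All names (`Cell`, `build_tree`, `repulsion`, `exact_repulsion`) are hypothetical.

```python
import random


class Cell:
    """Square quadtree cell that tracks its point count and coordinate sums."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.n = 0                  # number of points inside this cell
        self.sum_x = self.sum_y = 0.0
        self.point = None           # the single point stored while this is a leaf
        self.children = None

    def _split(self):
        h = self.half / 2
        self.children = [Cell(self.cx + dx * h, self.cy + dy * h, h)
                         for dx in (-1, 1) for dy in (-1, 1)]

    def _child_for(self, p):
        # Index matches the creation order in _split.
        return self.children[2 * (p[0] >= self.cx) + (p[1] >= self.cy)]

    def insert(self, p):
        # Assumption: all points are distinct, otherwise splitting never ends.
        self.n += 1
        self.sum_x += p[0]
        self.sum_y += p[1]
        if self.n == 1:
            self.point = p
            return
        if self.children is None:
            self._split()
            old, self.point = self.point, None
            self._child_for(old).insert(old)
        self._child_for(p).insert(p)


def build_tree(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = Cell(cx, cy, half)
    for p in points:
        root.insert(p)
    return root


def repulsion(cell, p, theta):
    """Return (fx, fy, z): approximate unnormalized repulsion on p and its share of Z."""
    if cell.n == 0 or (cell.children is None and cell.point is p):
        return 0.0, 0.0, 0.0        # empty cell, or the query point itself
    dx = p[0] - cell.sum_x / cell.n
    dy = p[1] - cell.sum_y / cell.n
    d2 = dx * dx + dy * dy
    # Summarize a leaf, or a cell that is small relative to its distance from p.
    if cell.children is None or (2 * cell.half) ** 2 < theta * theta * d2:
        w = 1.0 / (1.0 + d2)        # Student-t kernel used by t-SNE
        return cell.n * w * w * dx, cell.n * w * w * dy, cell.n * w
    fx = fy = z = 0.0
    for child in cell.children:
        a, b, s = repulsion(child, p, theta)
        fx, fy, z = fx + a, fy + b, z + s
    return fx, fy, z


def exact_repulsion(points, p):
    """Brute-force O(N) reference for a single point."""
    fx = fy = z = 0.0
    for q in points:
        if q is p:
            continue
        dx, dy = p[0] - q[0], p[1] - q[1]
        w = 1.0 / (1.0 + dx * dx + dy * dy)
        fx, fy, z = fx + w * w * dx, fy + w * w * dy, z + w
    return fx, fy, z


random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
tree = build_tree(points)
print("exact      :", exact_repulsion(points, points[0]))
print("theta = 0.5:", repulsion(tree, points[0], 0.5))
```

With `theta = 0` the traversal never summarizes an internal cell, so it reproduces the brute-force sums; larger values trade accuracy for fewer visited cells. Summing the third return value over all points gives an estimate of Z, which is then used to normalize the repulsive forces.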

Laurens van der Maaten
