Stochastic Neighbor Compression

We present Stochastic Neighbor Compression (SNC), an algorithm that compresses a data set for k-nearest neighbor (kNN) classification. Given training data, SNC learns a much smaller synthetic data set that minimizes the stochastic 1-nearest neighbor classification error on the training data. This approach has several appealing properties. First, because the compressed set is small, it speeds up kNN testing drastically (by up to several orders of magnitude in our experiments). Second, it makes the kNN classifier substantially more robust to label noise. Third, on 4 of 7 data sets it yields lower test error than kNN on the entire training set, even at compression ratios as low as 2%. Finally, SNC compression yields large speedups over kNN even when both are combined with ball-tree data structures, hashing, and LMNN dimensionality reduction. This demonstrates that SNC is complementary to existing state-of-the-art algorithms for speeding up kNN classification and leads to substantial further improvements.
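To make the idea of "minimizing the stochastic 1-nearest neighbor classification error" concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes the stochastic neighbor probabilities are a softmax over negative squared Euclidean distances with a fixed scale `gamma`, that the objective is the mean negative log-probability of picking a same-class prototype, and that the prototypes are initialized by random subsampling and updated by gradient descent with a simple step-halving rule. The function names (`stochastic_1nn_loss_and_grad`, `compress`, `predict_1nn`) and all parameters are hypothetical.

```python
import numpy as np

def stochastic_1nn_loss_and_grad(X, y, Z, c, gamma=1.0):
    """Mean negative log-probability that each training point picks a
    same-class prototype as its stochastic nearest neighbor (assumed form)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)   # (n, m) squared distances
    logits = -gamma * d2
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)                     # P[i, j]: x_i picks z_j
    M = (y[:, None] == c[None, :]).astype(float)          # 1 where labels match
    p_correct = np.clip((P * M).sum(axis=1), 1e-12, None)
    loss = -np.log(p_correct).mean()
    # Analytic gradient of the loss with respect to the prototypes Z.
    W = P * (M / p_correct[:, None] - 1.0)
    grad = -2.0 * gamma * (W.T @ X - W.sum(axis=0)[:, None] * Z) / len(X)
    return loss, grad

def compress(X, y, m, gamma=1.0, steps=200, lr=1.0, seed=0):
    """Learn m synthetic prototypes; labels are fixed at initialization."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)       # assumed init: subsample
    Z, c = X[idx].astype(float).copy(), y[idx].copy()
    loss, grad = stochastic_1nn_loss_and_grad(X, y, Z, c, gamma)
    history = [loss]
    for _ in range(steps):
        Z_new = Z - lr * grad
        new_loss, new_grad = stochastic_1nn_loss_and_grad(X, y, Z_new, c, gamma)
        if new_loss < loss:                               # accept only improving steps
            Z, loss, grad = Z_new, new_loss, new_grad
            lr *= 1.1
        else:
            lr *= 0.5
        history.append(loss)
    return Z, c, history

def predict_1nn(Xq, Z, c):
    """Plain (deterministic) 1-NN prediction against the compressed set."""
    d2 = ((Xq[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return c[d2.argmin(axis=1)]
```

In this sketch, `compress(X, y, m)` would be called with `m` set to a small fraction of `len(X)` (for instance about 2% of the training set, matching the lowest compression ratio mentioned above), and test points would then be classified with `predict_1nn` against the `m` prototypes only. Random subsampling can miss a class entirely when `m` is very small; a stratified initialization would avoid that.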
