ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices

Several real-world applications require real-time prediction on resource-scarce devices such as an Internet of Things (IoT) sensor. Such applications demand prediction models with small storage and computational complexity that do not compromise significantly on accuracy. In this work, we propose ProtoNN, a novel algorithm that addresses the problem of real-time and accurate prediction on resource-scarce devices. ProtoNN is inspired by k-Nearest Neighbor (kNN) but has several orders of magnitude lower storage and prediction complexity. ProtoNN models can be deployed even on devices with puny storage and computational power (e.g., an Arduino Uno with 2 kB of RAM) to get excellent prediction accuracy. ProtoNN derives its strength from three key ideas: (a) learning a small number of prototypes to represent the entire training set, (b) sparse, low-dimensional projection of the data, and (c) joint discriminative learning of the projection and prototypes with an explicit model-size constraint. We conduct a systematic empirical evaluation of ProtoNN on a variety of supervised learning tasks (binary, multi-class, and multi-label classification) and show that it gives nearly state-of-the-art prediction accuracy on resource-scarce devices while consuming several orders of magnitude less storage and minimal working memory.
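The three ideas combine into a simple kNN-style prediction rule over learned prototypes. Below is a minimal Python/NumPy sketch of that inference step, assuming a Gaussian (RBF) similarity kernel; the function name protonn_predict and the parameter names W (sparse projection), B (prototypes), Z (per-prototype label scores), and gamma (kernel width) are illustrative choices for this sketch, not the authors' reference implementation.

    import numpy as np

    def protonn_predict(x, W, B, Z, gamma):
        # Project the d-dimensional input into d_hat dimensions with the
        # sparse matrix W, so all distances live in the small space.
        z = W @ x                                                # shape: (d_hat,)
        # Gaussian (RBF) similarity between the projected point and each
        # of the m learned prototypes (rows of B).
        sim = np.exp(-gamma**2 * np.sum((B - z) ** 2, axis=1))  # shape: (m,)
        # Each prototype carries a vector of label scores (row of Z);
        # aggregate them weighted by similarity and take the best label.
        scores = sim @ Z                                         # shape: (L,)
        return int(np.argmax(scores))

Under this sketch the deployed model is only the nonzeros of W plus the m x (d_hat + L) entries of B and Z, which is the quantity the explicit model-size constraint bounds; shrinking m, d_hat, and the sparsity of W is what lets the model fit in a few kilobytes.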
