Sigma-Delta and Distributed Noise-Shaping Quantization Methods for Random Fourier Features

We propose the use of low bit-depth Sigma-Delta and distributed noise-shaping methods for quantizing the random Fourier features (RFFs) associated with shift-invariant kernels. We prove that our quantized RFFs, even in the case of 1-bit quantization, allow a high-accuracy approximation of the underlying kernels, and that the approximation error decays at least polynomially fast as the dimension of the RFFs increases. We also show that the quantized RFFs can be further compressed, yielding an excellent trade-off between memory use and accuracy: the approximation error then decays exponentially as a function of the number of bits used. Finally, experiments on several machine learning tasks show that our methods compare favorably to other state-of-the-art quantization methods in this context.
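
To make the noise-shaping idea concrete, the following is a minimal sketch of a standard first-order, 1-bit Sigma-Delta quantizer applied to the RFFs of a single point. It is an illustration under stated assumptions, not the scheme analyzed in the paper: the Gaussian kernel, the bandwidth `sigma`, the feature count `m`, and the helper names `rff_features` and `sigma_delta_1bit` are all hypothetical choices made for this example. The sketch only shows the property that noise shaping relies on, namely that the running sum of the quantization error stays bounded.

```python
import math
import random

def rff_features(x, omegas, phases):
    """Unquantized RFFs cos(<omega_i, x> + b_i); every entry lies in [-1, 1]."""
    return [math.cos(sum(w * xj for w, xj in zip(omega, x)) + b)
            for omega, b in zip(omegas, phases)]

def sigma_delta_1bit(z):
    """First-order Sigma-Delta with alphabet {-1, +1}.

    q_i = sign(u_{i-1} + z_i),  u_i = u_{i-1} + z_i - q_i,  u_0 = 0.
    Returns the bits q and the sequence of internal states u.
    """
    q, states, u = [], [], 0.0
    for zi in z:
        v = u + zi                 # state plus new sample
        qi = 1 if v >= 0 else -1   # 1-bit quantization
        u = v - qi                 # carry the error forward
        q.append(qi)
        states.append(u)
    return q, states

# Assumed setup: Gaussian kernel with bandwidth sigma, m features, input dimension d.
rng = random.Random(0)
d, m, sigma = 5, 256, 1.0
omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(d)] for _ in range(m)]
phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]
x = [rng.uniform(-1.0, 1.0) for _ in range(d)]

z = rff_features(x, omegas, phases)
q, states = sigma_delta_1bit(z)

# Because |z_i| <= 1, the state never leaves [-1, 1], so the cumulative
# quantization error sum(z[:k]) - sum(q[:k]) = u_k is bounded for every k.
max_state = max(abs(u) for u in states)
print(max_state <= 1.0)
```

Each feature is stored with a single bit, while the error is pushed into the bounded state variable rather than accumulating. Recovering a kernel estimate from such bits requires a suitable reconstruction step, which this sketch does not attempt.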
