Structured Transforms for Small-Footprint Deep Learning

We consider the task of building compact deep learning pipelines suitable for deployment on storage- and power-constrained mobile devices. We propose a unified framework for learning a broad family of structured parameter matrices characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter-sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs compared to a number of existing techniques. On keyword spotting for mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state-of-the-art models, while providing more than 3.5-fold compression.
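
The fast evaluation claimed above rests on classical displacement-structure results: the simplest member of this matrix family is the circulant matrix, whose product with a vector is a circular convolution and can therefore be computed in O(n log n) time via the FFT instead of O(n^2) with a dense matrix; roughly speaking, a Toeplitz-like matrix of displacement rank r admits a matrix-vector product in O(r n log n) time, and r is the knob that moves the layer along the structured-to-unstructured continuum. The following is a minimal sketch of the circulant case only, not code from the paper; it assumes NumPy, and the helper name circulant_matvec is chosen here purely for illustration:

    import numpy as np

    def circulant_matvec(c, x):
        # Multiply the n x n circulant matrix whose first column is `c`
        # by the vector `x`. A circulant matvec is exactly a circular
        # convolution, so it can be done in O(n log n) time with the FFT.
        return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

    # Sanity check against the explicitly materialized dense circulant.
    rng = np.random.default_rng(0)
    n = 8
    c = rng.standard_normal(n)
    x = rng.standard_normal(n)
    C = np.column_stack([np.roll(c, j) for j in range(n)])  # C[i, j] = c[(i - j) mod n]
    assert np.allclose(C @ x, circulant_matvec(c, x))

Storing only the n-vector c, rather than the n^2 entries of a dense layer, is the source of the compression the abstract reports; learning several such generator vectors instead of one is, informally, how a larger displacement rank buys back modeling capacity.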
