VoiceHD: Hyperdimensional Computing for Efficient Speech Recognition

In this paper, we propose VoiceHD, a novel speech recognition technique based on brain-inspired hyperdimensional (HD) computing. VoiceHD maps preprocessed voice signals in the frequency domain to random hypervectors and combines them into one hypervector per class, which serves as the learned pattern for that class. During inference, VoiceHD computes a query hypervector in the same way; classification is done by measuring the similarity of the query hypervector to every learned class hypervector and selecting the class with the highest similarity. We further extend VoiceHD to VoiceHD+NN, which adds a neural network with a single small hidden layer that operates directly on the similarity outputs of VoiceHD to slightly improve classification accuracy. We evaluate the efficiency of VoiceHD and VoiceHD+NN against a deep neural network with three large hidden layers on the Isolet spoken-letter dataset. Our benchmarking results on CPU show that VoiceHD and VoiceHD+NN provide 11.9X and 8.5X higher energy efficiency, 5.3X and 4.0X faster testing time, and 4.6X and 2.9X faster training time than the deep neural network, while providing marginally better classification accuracy.
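The encode, bundle, and compare pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dimensionality `D`, the number of quantization levels `N_LEVELS`, the bipolar (+1/-1) representation, the ID/level binding scheme, the cosine similarity, and the synthetic data in `run_demo` are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

D = 10_000      # hypervector dimensionality (assumed, not from the abstract)
N_LEVELS = 16   # amplitude quantization levels (assumed)

def make_id_hvs(rng, n_features):
    """One random bipolar hypervector per frequency-domain feature."""
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_features, D))

def make_level_hvs(rng, n_levels):
    """Level hypervectors built by progressive flipping, so that nearby
    levels stay similar and the extreme levels are nearly orthogonal."""
    current = rng.choice([-1, 1], size=D)
    flip_order = rng.permutation(D)
    step = D // (2 * (n_levels - 1))
    levels = np.empty((n_levels, D), dtype=np.int8)
    levels[0] = current
    for i in range(1, n_levels):
        current[flip_order[(i - 1) * step : i * step]] *= -1
        levels[i] = current
    return levels

def encode(features, ids, levels):
    """Map a feature vector with values in [0, 1] to one hypervector:
    bind each feature ID to its quantized level, then bundle by summing."""
    q = np.clip((features * N_LEVELS).astype(int), 0, N_LEVELS - 1)
    return (ids * levels[q]).sum(axis=0, dtype=np.int32)

def train(X, y, ids, levels, n_classes):
    """Accumulate the encoded samples of each class into a class hypervector."""
    prototypes = np.zeros((n_classes, D), dtype=np.int64)
    for x, label in zip(X, y):
        prototypes[label] += encode(x, ids, levels)
    return prototypes

def similarities(query, prototypes):
    """Cosine similarity between the query and every class hypervector."""
    q = query / np.linalg.norm(query)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return p @ q

def classify(x, prototypes, ids, levels):
    """Return the index of the most similar class hypervector."""
    return int(np.argmax(similarities(encode(x, ids, levels), prototypes)))

def run_demo(seed=0, n_features=20, n_classes=3, n_train=30, n_test=30):
    """Train and test on synthetic data (a stand-in, not Isolet)."""
    rng = np.random.default_rng(seed)
    ids = make_id_hvs(rng, n_features)
    levels = make_level_hvs(rng, N_LEVELS)
    centers = rng.uniform(0.1, 0.9, size=(n_classes, n_features))

    def sample(n):
        y = np.arange(n) % n_classes  # every class gets samples
        noise = rng.normal(0.0, 0.03, size=(n, n_features))
        return np.clip(centers[y] + noise, 0.0, 1.0), y

    X_train, y_train = sample(n_train)
    X_test, y_test = sample(n_test)
    prototypes = train(X_train, y_train, ids, levels, n_classes)
    preds = np.array([classify(x, prototypes, ids, levels) for x in X_test])
    return float(np.mean(preds == y_test))

print(f"synthetic-data accuracy: {run_demo():.2f}")
```

The vector returned by `similarities` has one entry per class. In VoiceHD+NN, as described above, a vector of this kind is what the small single-hidden-layer network would take as input; that network is not sketched here.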
