A 90 nm CMOS, 6 µW Power-Proportional Acoustic Sensing Frontend for Voice Activity Detection

This work presents a sub-6 µW acoustic front-end for speech/non-speech classification in a voice activity detection (VAD) system, implemented in 90 nm CMOS. The power consumption of the VAD system is minimized by designing the architecture around a new power-proportional sensing paradigm and by using machine-learning-assisted, moderate-precision analog analytics for classification. Power-proportional sensing allows hierarchical, context-aware scaling of the front-end's power consumption according to the complexity of the ongoing information extraction. The analog analytics improve power efficiency further by switching the computation of individual features on or off depending on how useful each feature is in a particular context. The proposed VAD system consumes 10× less power than state-of-the-art systems, yet achieves an 89% average hit rate at a 12 dB signal-to-acoustic-noise ratio in a babble context, which is on par with software-based VAD systems.
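The feature-gating idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names, the per-feature power costs, the context labels, and the always-on overhead are all hypothetical values chosen only to show how power can scale with the set of features a context requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    power_uw: float  # hypothetical cost of computing this feature, in microwatts


# Hypothetical feature bank; names and power costs are illustrative only.
FEATURES = {
    "band_low": Feature("band_low", 0.2),
    "band_mid": Feature("band_mid", 0.4),
    "band_high": Feature("band_high", 0.8),
}

# Hypothetical mapping from acoustic context to the features judged useful there.
CONTEXT_FEATURES = {
    "quiet": ["band_low"],
    "babble": ["band_low", "band_mid", "band_high"],
}


def active_features(context: str) -> list[Feature]:
    """Return the features that stay switched on in the given context."""
    return [FEATURES[name] for name in CONTEXT_FEATURES[context]]


def frontend_power_uw(context: str, always_on_uw: float = 0.5) -> float:
    """Estimate front-end power: an assumed always-on overhead plus active features."""
    return always_on_uw + sum(f.power_uw for f in active_features(context))


if __name__ == "__main__":
    for ctx in CONTEXT_FEATURES:
        names = [f.name for f in active_features(ctx)]
        print(f"{ctx}: features={names}, power={frontend_power_uw(ctx):.2f} uW")
```

In this sketch a simple context needs fewer features and therefore draws less power, which is the sense in which consumption is proportional to the complexity of the information being extracted.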
