Energy-Efficient Floating-Point MFCC Extraction Architecture for Speech Recognition Systems

This brief presents an energy-efficient architecture for extracting mel-frequency cepstral coefficients (MFCCs) in real-time speech recognition systems. Exploiting the algorithmic properties of MFCC feature extraction, the architecture uses floating-point arithmetic units to cover a wide dynamic range with a small bit-width. The operations required for MFCC extraction are also examined to optimize the operational bit-width and the lookup tables needed to compute nonlinear functions, such as trigonometric and logarithmic functions. In addition, the dataflow of MFCC extraction is tailored to minimize computation time. As a result, energy consumption is considerably reduced compared with previous MFCC extraction systems.
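The dynamic-range argument can be illustrated with a minimal Python sketch. It is not the architecture described in this brief: the helper names `quantize_float` and `quantize_fixed` and the 8-bit widths are assumptions chosen only for illustration. The sketch rounds a value to a short floating-point mantissa and, for comparison, to a fixed-point grid with the same number of fractional bits.

```python
import math


def quantize_float(x, mant_bits):
    """Round x to a float with a mant_bits-bit mantissa (hypothetical helper).

    The exponent is left unbounded here; real hardware would also limit it.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << mant_bits
    m_q = round(m * scale) / scale  # keep only mant_bits bits of the mantissa
    return math.ldexp(m_q, e)


def quantize_fixed(x, frac_bits):
    """Round x to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def relative_error(x, x_q):
    return abs(x - x_q) / abs(x)


if __name__ == "__main__":
    # Illustrative values only: one very small and one large magnitude.
    for value in (3e-5, 12345.678):
        fl = quantize_float(value, 8)
        fx = quantize_fixed(value, 8)
        print(value, fl, relative_error(value, fl), fx)
```

With a short mantissa, the relative error of the floating-point value stays bounded by about `2**-mant_bits` regardless of magnitude, whereas the fixed-point grid with the same number of fractional bits rounds the small value to zero. This is the general property that lets a floating-point datapath cover a wide range with few bits.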
