FPGA-Based Hardware Accelerator for Feature Extraction in Automatic Speech Recognition

We describe a hardware-based scheme for speeding up a real-time automatic speech recognition (ASR) system by implementing a parallel feature extraction algorithm on a Field-Programmable Gate Array (FPGA). A computationally intensive block of the algorithm is identified and implemented in hardware logic on the FPGA. One such block is the mel-frequency cepstral coefficient (MFCC) algorithm used for feature extraction. We demonstrate that the FPGA platform can perform feature extraction for speech recognition more efficiently than general-purpose processors, including an ARM processor. The MFCC implementation targets the Xilinx Zynq-7000 System on Chip (SoC) platform. With this implementation, the FPGA platform is approximately 500× faster than a sequential CPU implementation and 60× faster than a sequential ARM implementation. These results indicate that a parallelized and optimized MFCC architecture on an FPGA can significantly reduce the execution time of an ASR system compared with the CPU and ARM platforms.
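For orientation, the sequential software baseline that such an accelerator is compared against can be sketched as a textbook MFCC pipeline: pre-emphasis, framing, windowing, FFT power spectrum, mel filterbank, logarithm, and DCT. This is a minimal illustration only, not the architecture described in this paper. The function names and every parameter value (16 kHz sample rate, 400-sample frames, 160-sample hop, 512-point FFT, 26 filters, 13 coefficients, 0.97 pre-emphasis) are assumptions chosen as common defaults.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (assumed; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    # All default values above are illustrative assumptions.
    # 1. Pre-emphasis to boost high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3. Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4. Mel filterbank energies, then logarithm.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # 5. Unnormalized DCT-II over the filter axis; keep the first n_ceps terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = mfcc(rng.standard_normal(16000))  # one second of noise
    print(features.shape)
```

In this sketch, steps 2 through 5 operate on each frame independently of the others, which is the kind of structure a parallel hardware implementation can exploit.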
