Paper Information - An Energy-Efficient Speech-Extraction Processor for Robust User Speech Recognition in Mobile Head-Mounted Display Systems

An energy-efficient speech extraction (SE) processor is proposed for robust user speech recognition (SR) in head-mounted display (HMD) systems. User SE is essential for robust user SR in noisy environments. To achieve low-latency SE, the FastSE algorithm is proposed; it overcomes the time-consuming user speech selection process based on constrained independent component analysis and brings SE latency below 2 ms. In addition, a reinforced-FastSE scheme achieves 97.2% accuracy with only 33 kB of on-chip FastSE memory, which suits low-power HMD applications. A reconfigurable matrix operation accelerator is also implemented to accelerate the dominant matrix operations in SE energy-efficiently. As a result, the proposed SE processor achieves 1.3× higher speed with 4.24× smaller memory than the state-of-the-art work, making SR in noisy environments feasible for mobile HMD applications.

Hoi-Jun Yoo | Jinmook Lee | Injoon Hong | Seongwook Park
