Paper information - Efficient speech edge detection for mobile health applications

Efficient speech edge detection for mobile health applications

Intelligent audio sensors that continuously record and analyze sounds are a critical component of many emerging and future embedded applications. These applications have a very tight power budget, and the analog front end consumes a major share of it. An efficient analog front end should adapt its power consumption to the instantaneous bandwidth of the audio signal of interest, rather than constantly drawing a fixed amount of power sized for a fixed signal bandwidth. In this paper, we introduce a novel algorithm that identifies the edges of speech in the time-frequency domain and uses them to detect the instantaneous bandwidth of speech. A circuit implementation of our algorithm consumes 42.4 µW of power and can extract the instantaneous bandwidth of a signal to within 1%, even at SNRs as low as 10 dB.
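The abstract does not spell out the edge-detection algorithm itself, so the following is only a minimal sketch of the underlying idea of an "instantaneous bandwidth": for each short frame, find the highest probed frequency that still carries significant energy. It is not the authors' method or circuit. The function names, the use of the Goertzel recurrence as the per-frequency energy probe, the fixed frame length, and the fixed power threshold are all assumptions made for illustration.

```python
import math


def goertzel_power(frame, freq_hz, sample_rate_hz):
    """Normalized power of `frame` at one frequency (Goertzel recurrence).

    A unit-amplitude sinusoid that completes a whole number of cycles in
    the frame yields a value close to 1.0.
    """
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return power / (len(frame) / 2.0) ** 2


def instantaneous_bandwidth(signal, sample_rate_hz, probe_freqs_hz,
                            frame_len, threshold):
    """Per frame, return the highest probed frequency whose power exceeds
    `threshold`, or 0.0 when no probed frequency is active.

    Hypothetical helper: a simple threshold stands in for the paper's
    time-frequency edge detector, which is not described in the abstract.
    """
    bandwidths = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        active = [f for f in probe_freqs_hz
                  if goertzel_power(frame, f, sample_rate_hz) > threshold]
        bandwidths.append(max(active, default=0.0))
    return bandwidths
```

A front end driven by such an estimate could, in principle, scale its bias current or filter bandwidth frame by frame. A fixed threshold like the one above would be fragile in noise, which is the regime the paper's edge-based detector is designed to handle.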

Dingkun Du | Kofi Odame
