Paper information: A Robust Speech Recognition System for Service-Robotics Applications

A Robust Speech Recognition System for Service-Robotics Applications

Mobile service robots in human environments need versatile abilities to perceive and interact with their surroundings. Spoken language is a natural way to interact with a robot in general, and to instruct it in particular. However, most existing speech recognition systems suffer from the high environmental noise present in the target domain, and adapting them to reach the desired accuracy requires in-depth knowledge of the underlying theory. We propose and evaluate an architecture for a robust, speaker-independent speech recognition system that uses off-the-shelf technology and simple additional methods. We first use close speech detection to segment closed utterances, which eases the recognition process. By further combining a finite-state-grammar (FSG) based speech decoder with an N-gram based speech decoder, we reduce false-positive recognitions while achieving high accuracy.

Gerhard Lakemeyer | Stefan Schiffer | Masrur Doostdar
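The abstract names two ideas without giving details: segmenting closed utterances before decoding, and cross-checking an FSG-based decoder against an N-gram based decoder to reject false positives. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. The energy threshold, the hangover length, the word-overlap criterion, and all function names (`segment_utterances`, `word_overlap`, `accept_command`) are assumptions made for illustration, and the decoder outputs are passed in as plain strings rather than produced by a real recognizer.

```python
from collections import Counter
from typing import List, Tuple


def segment_utterances(
    energies: List[float],
    threshold: float,
    min_frames: int = 3,
    hangover: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive) of loud segments.

    Assumption: a simple per-frame energy threshold stands in for the
    close speech detection mentioned in the abstract.
    """
    segments = []
    start = None   # index of the first loud frame of the current segment
    silence = 0    # number of consecutive quiet frames seen so far
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence > hangover:
                end = i - silence + 1  # one past the last loud frame
                if end - start >= min_frames:
                    segments.append((start, end))
                start, silence = None, 0
    if start is not None:  # the signal ended while a segment was open
        end = len(energies) - silence
        if end - start >= min_frames:
            segments.append((start, end))
    return segments


def word_overlap(fsg_words: List[str], ngram_words: List[str]) -> float:
    """Fraction of FSG hypothesis words that also occur in the N-gram hypothesis."""
    if not fsg_words:
        return 0.0
    remaining = Counter(ngram_words)
    hits = 0
    for word in fsg_words:
        if remaining[word] > 0:
            remaining[word] -= 1
            hits += 1
    return hits / len(fsg_words)


def accept_command(fsg_hyp: str, ngram_hyp: str, min_overlap: float = 0.75) -> bool:
    """Accept the grammar-constrained hypothesis only if the unconstrained
    N-gram hypothesis roughly agrees with it (hypothetical criterion)."""
    overlap = word_overlap(fsg_hyp.lower().split(), ngram_hyp.lower().split())
    return overlap >= min_overlap
```

The intuition behind the cross-check is that a grammar-constrained decoder always maps its input onto some valid command, even for noise or out-of-grammar speech, whereas a less constrained N-gram decoder is free to produce something different. Disagreement between the two is therefore a cheap signal that the FSG result may be a false positive.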
