A Robust Speech Recognition System for Service-Robotics Applications

Mobile service robots in human environments need versatile abilities to perceive and interact with their surroundings. Spoken language is a natural way to interact with a robot in general and to instruct it in particular. However, most existing speech recognition systems suffer from the high environmental noise present in the target domain, and adapting them to reach the desired accuracy requires in-depth knowledge of the underlying theory. We propose and evaluate an architecture for a robust, speaker-independent speech recognition system that uses off-the-shelf technology and simple additional methods. We first use close speech detection to segment closed utterances, which eases the recognition process. By further combining a finite state grammar (FSG) based decoder with an N-gram based decoder, we reduce false positive recognitions while achieving high accuracy.
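The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how utterances are segmented or how the two decoder outputs are combined. The sketch assumes a simple energy threshold for segmentation and a word-overlap check in which the N-gram hypothesis must support the FSG hypothesis before a command is accepted. The function names `segment_utterances` and `accept_command` and the parameters `threshold`, `min_speech`, `max_silence`, and `min_overlap` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_utterances(
    energies: Sequence[float],
    threshold: float,
    min_speech: int = 3,
    max_silence: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, of closed utterances.

    Assumption: a frame counts as speech when its energy reaches `threshold`.
    An utterance is closed after more than `max_silence` consecutive quiet
    frames; segments shorter than `min_speech` frames are discarded as noise.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # first frame of the utterance currently being tracked
    silence = 0   # consecutive quiet frames seen inside that utterance
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence > max_silence:
                end = i - silence + 1  # one past the last speech frame
                if end - start >= min_speech:
                    segments.append((start, end))
                start, silence = None, 0
    if start is not None:
        # The signal ended while an utterance was still open.
        end = len(energies) - silence
        if end - start >= min_speech:
            segments.append((start, end))
    return segments


def accept_command(fsg_hyp: str, ngram_hyp: str, min_overlap: float = 0.5) -> bool:
    """Accept the FSG hypothesis only if the N-gram hypothesis supports it.

    Assumption: the grammar-based decoder always maps audio onto some sentence
    of its grammar, so the less constrained N-gram result serves as a sanity
    check. Support is measured as the fraction of FSG words that also occur
    in the N-gram hypothesis.
    """
    fsg_words = fsg_hyp.lower().split()
    ngram_words = set(ngram_hyp.lower().split())
    if not fsg_words:
        return False
    matched = sum(1 for word in fsg_words if word in ngram_words)
    return matched / len(fsg_words) >= min_overlap
```

In this sketch each segment returned by `segment_utterances` would be handed to both decoders, and `accept_command` would reject the grammar result when the N-gram result shares too few words with it. A real system would more likely compare confidence scores than raw word overlap.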
