Speech Interface Evaluation on Car Navigation System - Many Undesirable Utterances and Severe Noisy Speech -

Recently, automatic speech recognition (ASR) functions have been used commercially in various consumer applications, including car navigation systems. However, many technical and usability problems remain before ASR applications are ready for real business use. Our goal is to make ASR technologies suitable for such use. To that end, we first evaluate a car navigation interface that uses ASR as an input method, and then evaluate an ASR module using real noisy in-car speech. As ASR applications, we envision mobile environments, e.g. mobile information service systems such as car navigation systems and cellular phones, on which an embedded speech recognizer (Kokubo et al., 2006) runs and which are connected to remote servers that support various information-seeking tasks. Over 75% of commercially available car navigation systems currently have ASR interfaces, yet very few drivers have experience using them. This gap is caused by the usability problems of ASR. In this chapter, we report the results of two experimental evaluations of ASR interfaces for mobile use, specifically for car navigation applications. First, we evaluate the usability of the speech interface; second, we evaluate the problems of noisy in-car speech in order to propose an effective method of coping with it. For the first evaluation, we use a prototype with a promising speech interface, called FlexibleShortcuts and Select&Voice, produced by Waseda University (Nakano et al., 2007). We found many undesirable out-of-vocabulary (OOV) utterances, which degrade the interface. Based on the second experiment, which examines car-noise problems, we propose combining an array microphone with the Spectrum Subtraction (SS) technique to increase recognition accuracy.
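The abstract names Spectrum Subtraction but does not describe how it is applied, so the following is only a minimal sketch of the textbook single-channel magnitude form, not the authors' implementation. It assumes a separate noise-only recording is available for estimating the noise spectrum, uses non-overlapping rectangular frames, and leaves out the array-microphone stage entirely. All function names, the frame length, the spectral floor, and the synthetic test signal are hypothetical choices made for illustration. A naive DFT keeps the example dependency-free; a practical system would use an FFT library and overlapping windowed frames.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n)
             for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def split_frames(signal, frame_len):
    """Cut a signal into non-overlapping frames, dropping any short tail."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def spectral_subtraction(noisy, noise_only, frame_len=64, floor=0.05):
    """Subtract an average noise magnitude spectrum from each noisy frame.

    frame_len and floor are illustrative values, not taken from the source.
    """
    # Average the magnitude spectrum over all noise-only frames.
    noise_frames = split_frames(noise_only, frame_len)
    noise_mag = [0.0] * frame_len
    for frame in noise_frames:
        for k, c in enumerate(dft(frame)):
            noise_mag[k] += abs(c) / len(noise_frames)

    enhanced = []
    for frame in split_frames(noisy, frame_len):
        cleaned = []
        for k, c in enumerate(dft(frame)):
            mag = abs(c)
            # Keep a small fraction of the original magnitude as a floor
            # so that no bin goes negative after subtraction.
            new_mag = max(mag - noise_mag[k], floor * mag)
            # Reuse the noisy phase; only the magnitude is modified.
            cleaned.append(cmath.rect(new_mag, cmath.phase(c)))
        enhanced.extend(idft(cleaned))
    return enhanced


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic stand-in for in-car speech: a sine tone plus Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 4 * t / 64) for t in range(512)]
noise_only = [rng.gauss(0.0, 0.3) for _ in range(512)]
noisy = [s + rng.gauss(0.0, 0.3) for s in clean]

enhanced = spectral_subtraction(noisy, noise_only)
print("MSE before:", mean_squared_error(noisy, clean))
print("MSE after: ", mean_squared_error(enhanced, clean))
```

The noisy phase is reused unchanged, which is the usual simplification in this family of methods. Real car noise is not stationary, so a noise estimate taken once from a noise-only segment, as assumed here, would need to be updated over time in practice.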

[1] Tetsunori Kobayashi et al. Extensible speech recognition system using proxy-agent. 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007.

[2] Yasunari Obuchi et al. Development and evaluation of speech database in automotive environments for practical speech recognition systems. INTERSPEECH, 2006.

[3] Kiyohiro Shikano et al. Embedded Julius: Continuous Speech Recognition Software for Microprocessor. 2006 IEEE Workshop on Multimedia Signal Processing, 2006.

[4] Keith Vertanen. Combining open vocabulary recognition and word confusion networks. 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008.

[5] R.M. Stern et al. Missing-feature approaches in speech recognition. IEEE Signal Processing Magazine, 2005.

[6] N. Hataoka et al. Robust speech dialog interface for car telematics service. First IEEE Consumer Communications and Networking Conference (CCNC 2004), 2004.