Application of Speech Technology in Vehicles

Speech technology has been regarded as one of the most promising technologies for operating in-vehicle information systems. Cameron [1] has pointed out that people are more likely to use a speech system when at least one of four criteria is met: (1) they are offered no choice; (2) it corresponds to the privacy of their surroundings; (3) their hands or eyes are busy with another task; or (4) it is quicker than any alternative. For the driver, driving is a typical “hands and eyes are busy” task. In most situations, the driver is either alone in the car or accompanied by passengers who know each other well, so the “privacy of surroundings” criterion is also met. There is a long history of interest in applying speech technology to the control of in-vehicle information systems, and some commercial cars are already equipped with embedded speech technology. In 1996, the Mercedes-Benz S-Class introduced Linguatronic, the first generation of in-car speech systems intended for anybody who drives a car [2]. Since then, the number of in-vehicle applications using speech technology has been increasing [3].

[1]  Clifford Nass,et al.  Improving automotive safety by pairing driver emotion and car voice emotion , 2005, CHI Extended Abstracts.

[2]  Paul P Jovanis,et al.  Effect of In-Vehicle Route Guidance Systems on Driver Workload and Choice of Vehicle Speed: Findings From a Driving Simulator Experiment , 1997, Ergonomics and Safety of Intelligent Driver Interfaces.

[3]  Gordon H. Bower,et al.  Affect, memory, and social cognition. , 2000 .

[4]  Chong Kwan Un,et al.  Speech recognition in noisy environments using first-order vector Taylor series , 1998, Speech Commun..

[5]  Hynek Hermansky,et al.  RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..

[6]  Louise Dulude,et al.  Automated telephone answering systems and aging , 2002, Behav. Inf. Technol..

[7]  Fang Chen,et al.  Speech interaction system - how to increase its usability? , 2004, INTERSPEECH.

[8]  H. Lunenfeld Human factor considerations of motorist navigation and information systems , 1989, Conference Record of papers presented at the First Vehicle Navigation and Information Systems Conference (VNIS '89).

[9]  Niels Ole Bernsen,et al.  A Multimodal Virtual Co-driver's Problems with the Driver , 2002 .

[10]  Paul Heisterkamp Linguatronic: Product-Level Speech System for Mercedes-Benz Car , 2001, HLT.

[11]  S. Hart,et al.  Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research , 1988 .

[12]  Clifford Nass,et al.  Thank you, I did not see that: in-car speech based information systems for older adults , 2005, CHI Extended Abstracts.

[13]  Mark J. F. Gales,et al.  Robust speech recognition in additive and convolutional noise using parallel model combination , 1995, Comput. Speech Lang..

[14]  Kuldip K. Paliwal,et al.  Automatic Speech and Speaker Recognition , 1996 .

[15]  John Swarbrooke,et al.  Case Study 18 – Las Vegas, Nevada, USA , 2007 .

[16]  Staffan Larsson,et al.  Issue-based Dialogue Management , 2002 .

[17]  D. Strayer,et al.  Cell phone-induced failures of visual attention during simulated driving. , 2003, Journal of experimental psychology. Applied.

[18]  Andrew W. Gellatly,et al.  Speech Recognition and Automotive Applications: Using Speech to Perform in-Vehicle Tasks , 1998 .

[19]  Mary Zajicek,et al.  Evaluation and context for in-car speech systems for older adults , 2005, CLIHC '05.

[20]  Niels Ole Bernsen,et al.  Evaluation and usability of multimodal spoken language dialogue systems , 2004, Speech Commun..

[21]  Fang Chen,et al.  Zonal adaptive workload management systems: Limiting secondary tasks while driving , 2008, 2008 IEEE Intelligent Vehicles Symposium.

[22]  Paul Green,et al.  Safety and Usability of Speech Interfaces for In-Vehicle Tasks while Driving: A Brief Literature Review , 2006 .

[23]  Gerald L. Clore,et al.  Emotions and Beliefs: Feeling is believing: Some affective influences on belief , 2000 .

[24]  Glenn F. Wilson,et al.  Performance Enhancement with Real-Time Physiologically Controlled Adaptive Aiding , 2000 .

[25]  Sadaoki Furui,et al.  Speaker-independent isolated word recognition using dynamic features of speech spectrum , 1986, IEEE Trans. Acoust. Speech Signal Process..

[26]  Magda B. Arnold,et al.  The nature of emotion , 1968 .

[27]  Richard Bishop,et al.  Intelligent Vehicle Technology and Trends , 2005 .

[28]  Mary Zajicek,et al.  Solutions for Elderly Visually Impaired People Using the Internet , 2000, BCS HCI.

[29]  Johan Engström,et al.  Sensitivity of eye-movement measures to in-vehicle task difficulty , 2005 .

[30]  Niels Ole Bernsen,et al.  Exploring Natural Interaction in the Car , 2001 .

[31]  Karen Cheng,et al.  On the road and on the Web?: comprehension of synthetic and human speech while driving , 2001, CHI.

[32]  Lars Bo Larsen,et al.  Assessment of spoken dialogue system usability - what are we really measuring? , 2003, INTERSPEECH.

[33]  Satoshi Nakamura,et al.  Cepstrum derived from differentiated power spectrum for robust speech recognition , 2003, Speech Commun..

[34]  E. Rogers,et al.  HOMOPHILY-HETEROPHILY: RELATIONAL CONCEPTS FOR COMMUNICATION RESEARCH , 1970 .

[35]  Wiebo H Brouwer,et al.  OLDER DRIVERS AND ATTENTIONAL DEMANDS: CONSEQUENCES FOR HUMAN FACTORS RESEARCH. , 1993 .

[36]  Donald A. Norman,et al.  The Design of Future Things , 2007 .

[37]  Stefan W. Hamerich Towards advanced speech driven navigation systems for cars , 2007 .

[38]  Oskar Juhlin,et al.  Watching the cars go round and round: designing for active spectating , 2006, CHI.

[39]  F. McKenna,et al.  The effect of interference on dynamic risk‐taking judgments , 1999 .

[40]  Hsiao-Chuan Wang,et al.  Robust features derived from temporal trajectory filtering for speech recognition under the corruption of additive and convolutional noises , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[41]  Dick de Waard,et al.  A simple procedure for the assessment of acceptance of advanced transport telematics , 1997 .

[42]  Sharon L. Oviatt,et al.  When do we interact multimodally?: cognitive load and multimodal communication patterns , 2004, ICMI '04.

[43]  James J. Gross,et al.  Emotion and emotion regulation , 1999 .

[44]  W. Brouwer,et al.  Age differences in divided attention in a simulated driving task. , 1988, Journal of gerontology.

[45]  Terry C. Lansdown,et al.  The design of in-car speech recognition interfaces for usability and user acceptance , 1998 .

[46]  Li Deng,et al.  HMM adaptation using vector taylor series for noisy speech recognition , 2000, INTERSPEECH.

[47]  Petra Geutner,et al.  Design of the VICO Spoken Dialogue System: Evaluation of User Expectations by Wizard-of-Oz Experiments , 2002, LREC.

[48]  G. Saulnier,et al.  Ergonomic evaluation of a prototype guidance system in an urban area. Discussion about methodologies and data collection tools , 1995, Pacific Rim TransTech Conference. 1995 Vehicle Navigation and Information Systems Conference Proceedings. 6th International VNIS. A Ride into the Future.

[49]  David Schlangen,et al.  Towards Reducing and Managing Uncertainty in Spoken Dialogue Systems , 2007 .

[50]  Timothy L. Brown,et al.  Speech-Based Interaction with In-Vehicle Computers: The Effect of Speech-Based E-Mail on Drivers' Attention to the Roadway , 2001, Hum. Factors.

[51]  Mohammad Mehdi Homayounpour,et al.  Features based on filtering and spectral peaks in autocorrelation domain for robust speech recognition , 2007, Comput. Speech Lang..

[52]  Jeff Allen Greenberg,et al.  EVALUATION OF DRIVER DISTRACTION USING AN EVENT DETECTION PARADIGM , 2003 .

[53]  Hsiao-Chuan Wang,et al.  Robust features for noisy speech recognition based on temporal trajectory filtering of short-time autocorrelation sequences , 1999, Speech Commun..

[54]  Gabriel Skantze,et al.  Exploring human error recovery strategies: Implications for spoken dialogue systems , 2005, Speech Communication.

[55]  Paul F. Lazarsfeld,et al.  Mass communication popular taste and organized social action. , 1948 .

[56]  Te-Won Lee,et al.  A Spatio-Temporal Speech Enhance Speech Recogn , 2002 .

[57]  J. Gross Antecedent- and response-focused emotion regulation: divergent consequences for experience, expression, and physiology. , 1998, Journal of personality and social psychology.

[58]  J. Groeger Understanding Driving: Applying Cognitive Psychology to a Complex Everyday Task , 2000 .

[59]  Climent Nadeu,et al.  Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition , 1997, IEEE Trans. Speech Audio Process..

[60]  Mary Zajicek,et al.  Speech Output for Older Visually Impaired Adults , 2001, BCS HCI/IHM.

[61]  Rod Barrett,et al.  Hands-free mobile phone speech while driving degrades coordination and control , 2004 .

[62]  Stuart Goose,et al.  WIRE3: Driving Around the Information Super-Highway , 2002, Personal and Ubiquitous Computing.

[63]  J.-M. Boucher,et al.  A New Method Based on Spectral Subtraction for Speech Dereverberation , 2001 .

[64]  Mark J. F. Gales,et al.  Robust continuous speech recognition using parallel model combination , 1996, IEEE Trans. Speech Audio Process..

[65]  Rosalind W. Picard Affective Computing , 1997 .

[66]  Clifford Nass,et al.  Don't blame me I am only the driver: impact of blame attribution on attitudes and attention to driving task , 2004, CHI EA '04.

[67]  John L. Campbell,et al.  Speech Recognition and In-Vehicle Telematics Devices: Potential Reductions in Driver Distraction , 2004, Int. J. Speech Technol..

[68]  Christopher Kermorvant A comparison of noise reduction techniques for robust speech recognition , 1999 .

[69]  M. A. Recarte,et al.  Mental workload while driving: effects on visual search, discrimination, and decision making. , 2003, Journal of experimental psychology. Applied.

[70]  Biing-Hwang Juang,et al.  The short-time modified coherence representation and noisy speech recognition , 1989, IEEE Trans. Acoust. Speech Signal Process..