Paper information - Multimedia data collection of in-car speech communication

Multimedia data collection of in-car speech communication

This paper reports the details of a multimedia data collection, comprising audio, video, and auxiliary vehicle information, recorded during spoken dialogue in a moving car. A system purpose-built into a Data Collection Vehicle (DCV) supports synchronous recording of multi-channel audio from 16 microphones that can be positioned flexibly, multi-channel video from 3 cameras, and vehicle-related data. Multimedia data were collected from 200 subjects, each of whom completed three spoken-dialogue sessions during a drive of about 60 minutes. Two dialogue modes were recorded: (1) prompted dialogue between the driver and an accompanying operator, and (2) natural dialogue between the driver and a telephone operator, in which the driver accesses information over a cellular phone while driving. The corpus can be used to analyze multimedia data in a moving-car environment and to model spoken dialogue in scenarios such as information access while driving.

Kazuya Takeda | Fumitada Itakura | Nobuo Kawaguchi | Shigeki Matsubara
