Construction and Analysis of a Multi-Layered In-car Spoken Dialogue Corpus

In this chapter, we discuss the construction of a multi-layered in-car spoken dialogue corpus and the preliminary results of its analysis. We have developed a recording system, built into a Data Collection Vehicle (DCV), that supports synchronous recording of multi-channel audio data from 16 microphones that can be placed in flexible positions, multi-channel video data from 3 cameras, and vehicle-related data. Multimedia data have been collected from each of 800 subjects over three sessions of spoken dialogue with different types of navigator during a drive of about 60 minutes. We have defined the Layered Intention Tag for analyzing the dialogue structure of each speech unit, and have tagged all of the dialogues, covering over 35,000 speech units. Using a dialogue sequence viewer that we have developed, we can analyze the basic dialogue strategy of the human navigator. We also report a preliminary analysis of the relation between intentions and linguistic phenomena.