The FASiL speech and multimodal corpora

In the context of the FASiL project, we studied natural language interactions with unimodal (speech-only) and multimodal (speech and graphics) interfaces to a personal information management database. We collected multilingual corpora to investigate these interactions in Portuguese, English, and Swedish. The corpora are used to train language models, to update acoustic models, and to study semantic concepts, multimodal interactions, and dialogue management strategies. The corpora are annotated uniformly with timings, transcriptions, and semantics. We report on the structure and design of the corpora, which are now available via ELRA.
