Acquisition and Labelling of a Spontaneous Speech Dialogue Corpus

Advances in speech technology have enabled new speech-based applications such as dialogue systems, which can be applied to a variety of tasks. In a dialogue system, a computer interacts with users through dialogue, simulating a human interlocutor. The behaviour of a dialogue system can be defined by probabilistic models, and estimating these models requires large labelled corpora. Acquiring and labelling a dialogue corpus for the task is therefore a common preliminary step in dialogue system development. In this work, we present the acquisition and labelling of a Spanish dialogue corpus of queries about train services. We describe the Wizard of Oz strategy used for the acquisition, the labelling rules, and the tools used for the labelling.
