A Bayesian approach to semantic composition in spoken dialog systems

Focusing on the interpretation component of spoken dialog systems, this paper introduces a stochastic approach based on dynamic Bayesian networks (DBNs) to infer and compose semantic structures from speech. Word strings, basic concept sequences and composed semantic frames (as defined in the Berkeley FrameNet paradigm) are derived sequentially from the users' inputs. A semi-automatic process provides a reference frame annotation of the speech training data. The DBNs trained on these data are then used to hypothesize the frames and their constituents from the test data. Finally, a rule-based process produces the composed frame annotation. Experimental results on the French MEDIA dialog corpus show the appropriateness of the technique, which both achieves good semantic tree identification performance and can provide the dialog manager with n-best lists of scored hypotheses.

KEYWORDS: spoken dialog system, semantic composition, dynamic Bayesian networks.
