Assessment of user simulators for spoken dialogue systems by means of subspace multidimensional clustering

The assessment of user simulators in terms of their similarity to real users requires processing and interpreting large dialogue corpora, for which many interaction parameters can be considered. In this setting, the high dimensionality of the data makes it difficult to compare dialogues, as it is not always appropriate to weight all features equally when drawing meaningful interpretations. We propose to use subspace clustering for the assessment of user simulators, as this technique has been successfully applied to classify high-dimensional data in other areas of study. We created and assessed a user simulator for the Let’s Go spoken dialogue system. The experimental results show that the proposed approach is easy to set up and helps to better interpret whether the user simulator behaves similarly to real human users, by identifying clusters in different dimensional subspaces that cannot be found with plain clustering techniques.
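As a rough illustration of the idea (not the authors' implementation), the sketch below performs a brute-force form of subspace clustering: it runs k-means on every low-dimensional subset of hypothetical per-dialogue interaction parameters and ranks the subspaces by silhouette score, so clusters that only emerge in some dimensions can be inspected. Dedicated algorithms such as CLIQUE or SUBCLU search the subspace lattice far more efficiently; all feature names, thresholds, and the synthetic real/simulated data here are assumptions.

```python
# Minimal sketch of subspace clustering for comparing real and simulated
# dialogues. Feature names and scoring heuristics are illustrative only.
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical per-dialogue interaction parameters (one row per dialogue).
FEATURES = ["n_turns", "asr_confidence", "task_success", "barge_in_rate"]

def best_subspaces(X, feature_names, k=2, max_dim=3, min_score=0.0):
    """Cluster every feature subset of 2..max_dim dimensions and return
    (subspace, silhouette, labels) triples sorted by cluster quality.
    Raise min_score to keep only subspaces with clear structure."""
    X = StandardScaler().fit_transform(X)
    results = []
    for dim in range(2, max_dim + 1):
        for idx in combinations(range(X.shape[1]), dim):
            sub = X[:, list(idx)]
            labels = KMeans(n_clusters=k, n_init=10,
                            random_state=0).fit_predict(sub)
            score = silhouette_score(sub, labels)
            if score >= min_score:
                results.append(
                    ([feature_names[i] for i in idx], score, labels))
    return sorted(results, key=lambda r: -r[1])

# Usage: pool real and simulated dialogues, then check how the two
# populations distribute over the clusters found in each subspace.
rng = np.random.default_rng(0)
X_real = rng.normal(0.0, 1.0, size=(100, len(FEATURES)))  # synthetic stand-in
X_sim = rng.normal(0.3, 1.0, size=(100, len(FEATURES)))   # synthetic stand-in
X = np.vstack([X_real, X_sim])
for feats, score, labels in best_subspaces(X, FEATURES)[:3]:
    print(feats, round(score, 3))
```

In such a setup, a simulator whose dialogues fall into the same subspace clusters, in similar proportions to the real users' dialogues, would be judged closer to real behaviour than one whose dialogues concentrate in clusters of their own.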
