WITcHCRafT: A Workbench for Intelligent exploraTion of Human ComputeR conversaTions

We present Witchcraft, an open-source framework for evaluating prediction models for spoken dialogue systems based on interaction logs and audio recordings. Witchcraft serves two purposes. First, it provides an adaptable user interface for managing and browsing thousands of logged dialogues (e.g., calls). Second, with the help of the underlying models and the connected machine learning framework RapidMiner, the workbench displays, at each dialogue turn, the probability of the task being completed given the dialogue history. It also estimates the emotional state, gender, and age of the user. While browsing a logged conversation, the user can directly observe the models' predictions at each dialogue step. In this way, Witchcraft helps spot problematic dialogue situations and reveals where the current system and the prediction models have design flaws. Witchcraft will be made publicly available to the community as an open-source project.
