Reasoning on Uncertainties and Learning for Conventional Dialogue Systems

More and more industries need dialogue applications. In customer relationship management, they range from home shopping to after-sales service, order tracking and directory enquiries. The hotlines behind these voice services are very costly and often overloaded, so the resulting service is both expensive and of low quality. Faced with this problem, industry and research struggle to converge. On the one hand, industrial dialogue systems are designed as decision-making automata that describe the dialogue logic. These automata are reputed to be simplistic, hard to design and suboptimal. On the other hand, researchers focus on advanced techniques that only experts can implement and that remain hard to monitor. Building on the global architecture of the system, this PhD thesis endeavours to reconcile research with industry by embedding scientific advances in the industrial process. The work led to the definition of a new model for reasoning on uncertainties, to the definition of a new non-Markovian decision process, and to the implementation and optimisation of plug-and-play algorithms dedicated to the problem. The advanced functionalities developed in this thesis make it possible to improve robustness, to guarantee the optimality of design choices and to give project managers easy-to-understand usage feedback. These results motivated the implementation of the world's first commercial dialogue application incorporating online reinforcement learning.
