Paper information - Bayesian networks based multi-modality fusion for error handling in human-robot dialogues under noisy conditions

Abstract In this paper, we introduce a probabilistic model-based architecture for error handling in human–robot spoken dialogue systems under adverse audio conditions. In this architecture, a Bayesian network framework is used to interpret multi-modal signals in the spoken dialogue between a tour-guide robot and visitors in mass exhibition conditions. In particular, we report on experiments interpreting speech and laser scanner signals in the dialogue management system of the autonomous tour-guide robot RoboX, successfully deployed at the Swiss National Exhibition (Expo.02). Correctly interpreting a user’s (visitor’s) goal or intention at each dialogue state is key to successful voice-enabled communication between tour-guide robots and visitors. To infer the visitor’s goal under the uncertainty intrinsic to these two modalities, we introduce Bayesian networks that combine noisy speech recognition results with data from a laser scanner, which are independent of acoustic noise. Experiments with real-world data collected during the operation of RoboX at Expo.02 demonstrate the effectiveness of the approach in adverse environments. The proposed architecture makes it possible to model error-handling processes in spoken dialogue systems that involve complex combinations of different multi-modal information sources, in cases where such information is available.
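
As a rough illustration of the fusion idea described in the abstract, the following Python sketch computes a posterior over a visitor's goal from two observations: a noisy speech recognition result and a laser scanner reading. It is a minimal sketch only, not the network used in the paper. The goal states, the observation values, the assumption that the two observations are conditionally independent given the goal, and every probability in the tables are hypothetical and chosen purely for illustration.

```python
# Minimal sketch of Bayesian fusion of two modalities.
# ASSUMPTIONS (not from the paper): the goal states, the observation
# values, the network structure (goal -> speech, goal -> laser), and all
# probabilities below are made up for illustration.

GOALS = ("yes", "no", "undefined")

# Prior belief over the visitor's goal, P(goal).
PRIOR = {"yes": 0.4, "no": 0.3, "undefined": 0.3}

# Noisy speech recognizer model, P(recognized word | goal).
P_SPEECH = {
    "yes": {"yes": 0.6, "no": 0.1, "garbage": 0.3},
    "no": {"yes": 0.1, "no": 0.6, "garbage": 0.3},
    "undefined": {"yes": 0.3, "no": 0.3, "garbage": 0.4},
}

# Laser scanner model, P(person detected in front of the robot | goal).
# The laser reading does not depend on acoustic noise.
P_LASER = {
    "yes": {True: 0.9, False: 0.1},
    "no": {True: 0.9, False: 0.1},
    "undefined": {True: 0.2, False: 0.8},
}


def posterior(recognized_word, person_detected):
    """Return P(goal | speech, laser) via Bayes' rule, assuming the two
    observations are conditionally independent given the goal."""
    unnormalized = {
        goal: PRIOR[goal]
        * P_SPEECH[goal][recognized_word]
        * P_LASER[goal][person_detected]
        for goal in GOALS
    }
    total = sum(unnormalized.values())
    return {goal: value / total for goal, value in unnormalized.items()}


def most_likely_goal(recognized_word, person_detected):
    """Return the goal with the highest posterior probability."""
    beliefs = posterior(recognized_word, person_detected)
    return max(beliefs, key=beliefs.get)


if __name__ == "__main__":
    # Same recognized word, different laser evidence.
    print(posterior("yes", True))
    print(posterior("yes", False))
```

With these illustrative numbers, a recognized "yes" is accepted as the goal when the laser detects a person in front of the robot, whereas the same recognition result without a detected person shifts most of the belief to the "undefined" goal. This is the kind of case in which a dialogue manager might choose an error-handling action instead of acting on the recognizer output.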

Plamen J. Prodanov | Andrzej Drygajlo
