Natural Interactive Communication for Edutainment NICE Deliverable D3.7b Multimodal Output Generation Module for the NICE fairy-tale game

This report, Deliverable D3.7 from the HLT project Natural Interactive Communication for Edutainment (NICE), describes the construction of the output generation module for the fairy-tale characters. The module receives a richly parameterized semantic instruction from the dialogue manager (WP5) and generates a multimodal output request. The request comprises verbal utterances with a lip-synchronisation track, additional behavioral instructions to the animated character (face, gaze, body), and manipulations of physical objects in the 3D environment. It is then rendered by the synthesis module and the animation render module.
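The data flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `SemanticInstruction` and `OutputRequest`, the function `generate_output`, the template table, and the behavior labels are hypothetical and do not come from the NICE deliverable. The lip-synchronisation track is reduced to a word list as a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticInstruction:
    """Hypothetical stand-in for the dialogue manager's parameterized instruction."""
    act: str
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class OutputRequest:
    """Hypothetical multimodal output request handed to the renderers."""
    utterance: str              # verbal output
    lipsync_track: List[str]    # placeholder: one entry per word
    behaviors: Dict[str, str]   # face, gaze, body instructions
    object_actions: List[str]   # manipulations of objects in the 3D world


# Illustrative templates only; the real module's surface realisation is not shown in the source.
TEMPLATES = {
    "greet": "Hello, {name}!",
    "offer": "Would you like the {object}?",
}


def generate_output(instr: SemanticInstruction) -> OutputRequest:
    """Map one semantic instruction to one multimodal output request."""
    utterance = TEMPLATES[instr.act].format(**instr.params)
    behaviors = {
        "face": "smile" if instr.act == "greet" else "neutral",
        "gaze": "user",
        "body": "idle",
    }
    # Only instructions that mention an object trigger an object manipulation.
    object_actions = []
    if "object" in instr.params:
        object_actions.append("pick_up:" + instr.params["object"])
    return OutputRequest(utterance, utterance.split(), behaviors, object_actions)


if __name__ == "__main__":
    print(generate_output(SemanticInstruction("offer", {"object": "key"})))
```

Keeping the verbal, behavioral, and object channels in one request object mirrors the single output request the abstract describes, so a downstream renderer could consume each channel separately.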
