Avatar-augmented online conversation

One of the most important roles played by technology is connecting people and mediating their communication with one another. Building technology that mediates conversation presents a number of challenging research and design questions. Apart from the fundamental issue of what exactly gets mediated, two of the more crucial questions are how the person being mediated interacts with the mediating layer and how the receiving person experiences the mediation. This thesis is concerned with both of these questions and proposes a theoretical framework of mediated conversation by means of automated avatars. This new approach relies on a model of face-to-face conversation and derives an architecture for implementing these features through automation. First, the thesis describes the process of face-to-face conversation and which nonverbal behaviors contribute to its success. It then presents a theoretical framework that explains how a text message can be automatically analyzed in terms of its communicative function based on discourse context, and how behaviors shown to support those same functions in face-to-face conversation can then be automatically performed by a graphical avatar in synchrony with the message delivery. An architecture, Spark, built on this framework demonstrates the approach in an actual system that abstracts function from behavior and introduces the concept of an avatar agent, responsible for coordinated delivery and continuous maintenance of the communication channel. A derived application, MapChat, is an online collaboration system where users represented by avatars in a shared virtual environment can chat and manipulate an interactive map while their avatars generate face-to-face behaviors. A study evaluating the strength of the approach compares groups collaborating on a route-planning task using MapChat with and without the animated avatars.
The results show that while task outcome was equally good for both groups, the group using the avatars felt that the task was significantly less difficult, and their feelings of efficiency and consensus were significantly stronger. An analysis of the conversation transcripts shows a significant improvement in the overall conversational process and significantly fewer messages spent on channel maintenance in the avatar groups. The avatars also significantly improved the users' perception of each other's effort. Finally, MapChat with avatars was found to be significantly more personal, more enjoyable, and easier to use. The ramifications of these findings with respect to mediating conversation are discussed.
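
The two-stage idea described above, in which a text message is first analyzed for communicative function using discourse context and the functions are then mapped to avatar behaviors, can be illustrated with a minimal sketch. This is not Spark's implementation: the function labels (`new_information`, `give_turn`), the behavior names, the `DiscourseContext` class, and the rules themselves are all hypothetical and chosen only to show how function can be kept separate from behavior.

```python
from dataclasses import dataclass, field

# Hypothetical function-to-behavior table; labels are illustrative assumptions,
# not the mappings used in the thesis.
BEHAVIOR_FOR_FUNCTION = {
    "new_information": "raise_eyebrows",
    "give_turn": "gaze_at_listener",
}


@dataclass
class DiscourseContext:
    """Toy discourse history: the set of words already mentioned."""
    seen_words: set = field(default_factory=set)


def annotate_functions(message, context):
    """Stage 1: tag word positions with communicative functions.

    Assumed toy rules: a word not yet in the discourse history counts as
    new information, and a trailing question mark hands over the turn.
    """
    words = message.strip().lower().rstrip("?.!").split()
    annotations = []
    for index, word in enumerate(words):
        if word not in context.seen_words:
            annotations.append((index, "new_information"))
            context.seen_words.add(word)
    if words and message.strip().endswith("?"):
        annotations.append((len(words) - 1, "give_turn"))
    return annotations


def generate_behaviors(annotations):
    """Stage 2: map each function to a behavior, keeping the word index
    so the behavior could be timed against message delivery."""
    return [(index, BEHAVIOR_FOR_FUNCTION[function])
            for index, function in annotations]


context = DiscourseContext(seen_words={"where", "is", "the"})
annotations = annotate_functions("Where is the library?", context)
behaviors = generate_behaviors(annotations)
```

Because the first stage emits only function labels, the behavior table can be changed without touching the analysis step, which is the kind of separation the abstract attributes to the architecture.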
