Rules for Responsive Robots: Using Human Interactions to Build Virtual Interactions

By Joseph N. Cappella and Catherine Pelachaud

Chapter prepared for Reis, Fitzpatrick, and Vangelisti (Eds.), Stability and Change in Relationships.

Joseph N. Cappella may be reached at the Annenberg School for Communication, University of Pennsylvania, 3620 Walnut St., Philadelphia, PA 19104-6220; Fax: 215-898-2024; Tel: 215-898-7059; JCAPPELLA@ASC.UPENN.EDU. Catherine Pelachaud may be reached at Università di Roma "La Sapienza", Dipartimento di Informatica e Sistemistica, Via Buonarroti, 12, 00185 Roma, Italy.

Postprint version. This book chapter is available at ScholarlyCommons: http://repository.upenn.edu/asc_papers/102

Computers seem to be everywhere and able to do almost anything. Automobiles have Global Positioning Systems that give advice about travel routes and destinations. Virtual classrooms supplement, and sometimes replace, face-to-face classroom experiences with web-based systems (such as Blackboard) that allow postings, virtual discussion sections with virtual whiteboards, and continuous access to course documents, outlines, and the like. Various forms of "bots" search for information about intestinal diseases, plan airline reservations to Tucson, and inform us of the release of new movies that might fit our cinematic preferences. Instead of talking to the agent at AAA, the professor, the librarian, the travel agent, or the cinephile two doors down, we are interacting with electronic social agents. Some entrepreneurs are even trying to create toys that are sufficiently responsive to engender emotional attachments between the toy and its owner.

These trends are seen by some as the leading edge of a broader phenomenon: not just interactive computer agents but emotionally responsive computers and emotionally responsive virtual agents. Nicholas Negroponte answers the obvious question: "Absurd? Not really. Without the ability to recognize a person's emotional state, computers will remain at the most trivial levels of endeavor. ... What you remember most about an influential teacher is her compassion and enthusiasm, not the rigors of grammar or science" (Negroponte, 1996, p. 184). The editors of PC Magazine do not consider emotionally responsive computers science fiction: "[I]n the not so distant future, your computer may know exactly how you feel" (PC Magazine, 1999, p. 9).
Researchers at Microsoft are developing lifelike avatars to represent their owners; such an avatar could participate in a virtual meeting while the owner remains at the office, available only remotely (Miller, 1999, p. 113). Computer gurus are not the only people predicting the "emotionalization" of the human-computer interface. Scholars, such as Rosalind Picard (1997), have given serious attention to the possibility and value of programming computers and computer agents to be emotionally responsive. Part of her interest in this possibility is based on how people typically respond to computers.

Reeves and Nass (1996) have built a strong case for the "media equation," namely that people treat computers and new media like real people. Their claim is that people are primarily social beings, ready to default to social judgments and evaluations even when they are dealing with inanimate entities such as computers. For example, in one of their studies people were led to believe that they were evaluating a teaching program run by one computer. When asked by the computer that had taught them how effective the teaching program was, participants offered more positive assessments than when the same evaluation of the teaching computer was requested by a different computer. The authors argue that this result is explained by a norm of social politeness. Just as a person might direct less criticism to their own (human) teacher but offer harsher criticism of the teacher when asked by a third party, so participants did with the computer stations. The social rule of politeness was adopted as the default even when acting in a nonsocial context. In a different study, computers employing a dominant verbal style of interaction were preferred by users who possessed a dominant personality, while users with submissive personalities preferred computers with a submissive style. This pattern parallels the social preferences that people have for other humans. Across a wide variety of studies, Reeves and Nass have shown that people are first and foremost social in their interactions, even when those interactions are with inanimate media rather than flesh-and-blood Homo sapiens.

Picard reasons that if people are social even in non-social interactions, then human users should prefer to interact with computers and their representative systems that are more rather than less human. To be social and to be human is in part to be emotionally responsive. Picard's treatment of emotionally responsive computers involves reviewing literature on human emotional expression and recognition as well as recent thinking on emotional intelligence (Gardner, 1983, 1993; Goleman, 1995). She reports recent advances in the automatic recognition of emotion and in work on the animation of facial displays of emotion. The automated recognition and expression of emotion present immense problems for programmers. However, even if these problems are solved, a large gap will remain. Affective interaction in human-computer interchanges cannot be reduced to sequences of recognition and expression. The fundamental feature of human interaction is contingent responsiveness, which is not reducible to a mere sequence of recognition and expression by two agents. This chapter is about what it means to act in a way that is contingently responsive.
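As a concrete illustration of that distinction (our own sketch, not a model proposed in this chapter), the short Python example below contrasts a recognize-then-express agent with one whose next behavior is contingent on both the partner's most recent behavior and its own prior behavior. The class names, the "smile" and "latency" features, and the reciprocity weight are all hypothetical choices made for the example.

```python
# A minimal, hypothetical sketch (not the chapter's model): it contrasts a
# recognize-then-express pipeline with a contingently responsive agent.
# "Behavior", the smile/latency features, and the reciprocity weight are
# illustrative assumptions, not claims about any existing system.

from dataclasses import dataclass


@dataclass
class Behavior:
    smile: float    # smile intensity, 0..1
    latency: float  # seconds of pause before responding


class ScriptedAgent:
    """Recognize-then-express: each recognized state triggers a canned display,
    so an exchange with it amounts to the juxtaposition of two monologues."""

    CANNED = {
        "positive": Behavior(smile=1.0, latency=0.5),
        "neutral": Behavior(smile=0.2, latency=0.5),
    }

    def respond(self, partner: Behavior) -> Behavior:
        label = "positive" if partner.smile > 0.5 else "neutral"
        return self.CANNED[label]


class ContingentAgent:
    """Contingent responsiveness: the next behavior depends on what the partner
    just did AND on the agent's own previous behavior, so identical partner
    states can yield different responses at different points in the exchange."""

    def __init__(self, reciprocity: float = 0.6):
        self.reciprocity = reciprocity  # how strongly to move toward the partner
        self.prev = Behavior(smile=0.3, latency=1.0)

    def respond(self, partner: Behavior) -> Behavior:
        nxt = Behavior(
            smile=self.prev.smile + self.reciprocity * (partner.smile - self.prev.smile),
            latency=self.prev.latency + self.reciprocity * (partner.latency - self.prev.latency),
        )
        self.prev = nxt
        return nxt


if __name__ == "__main__":
    partner_turns = [Behavior(0.9, 0.4), Behavior(0.9, 0.4), Behavior(0.1, 2.0)]
    scripted, contingent = ScriptedAgent(), ContingentAgent()
    for turn in partner_turns:
        print("scripted:", scripted.respond(turn), "| contingent:", contingent.respond(turn))
```

Given identical partner behavior on the first two turns, the scripted agent repeats itself exactly, while the contingent agent's smile and pause keep shifting toward the partner. That history dependence, rather than recognition or expression taken separately, is the property the sketch is meant to illustrate.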
Our argument is essentially that modeling social interaction as it is experienced by humans requires certain mechanisms or rules without which simulated interactions are little more than the juxtaposition of two monologues. We present our position by (1) defining responsiveness; (2) discussing computer simulation tools; (3) presenting empirical models of two-person interactions; (4) describing the importance of responsive and unresponsive interactions to people; and (5) concluding with general rules for realistic virtual interaction between human and non-human agents.

Virtual Interactions and Human Relationships

Before taking up these issues, it is fair to ask what this chapter has to do with human relationships. The development of computer simulations of human interactions is well underway. Service industries that provide simple transactions, such as banking exchanges, fast food services, and so on, are eager to replace their service personnel with autonomous agents who will be the friendly, responsive representatives of the company that their more expensive, late, and sometimes surly and uncivil human counterparts are not. However, the models for such simulations, if they are to be accepted as viable replacements for humans, must have human social abilities. Much of what is known about human social interaction is ignored by computer modelers. Instead, they often import their own assumptions into their models. Attend even one computer conference on "real characters" and you will find fascinating models, elegantly presented, but with little empirical foundation. Understanding the human and empirical basis for social interaction is crucial for AI specialists. The science of relationships, especially human interaction in relationships, needs to be imported into the science of modeling interactions.

But does modeling virtual relationships have anything to do with understanding human relationships? The answer is an unequivocal "Yes!" in at least two senses. First, to provide useful information to computer simulators requires very precise claims and a very solid empirical base. This is a challenge to researchers who study human relationships. Our work will have little influence unless it is precise and empirically well founded. In Zen and the Art of Motorcycle Maintenance, Robert Pirsig explores the differences between classical and romantic conceptions of knowing. Complex devices, such as motorcycles, can be appreciated for the beauty of their superficial structure and function or for their underlying causal operation. The latter, classical view leads Pirsig's hero on an intellectual journey exploring what it can mean to know the underlying, unobserved structure and function of physical and social systems. He concludes that deep knowledge is knowledge that allows one to build a replica of the system being scrutinized. So it is with models of human interaction: deep understanding comes when research and theory allow the simulation of the behaviors being modeled. The data we present on responsiveness in human interaction are pertinent to both the principles that will guide the simulations of virtual human interaction and
References

[1] P. Ekman. Universals and cultural differences in facial expressions of emotion. 1972.
[2] J. Cappella. Talk and silence sequences in informal conversations II. 1979.
[3] Clifford Nass et al. The media equation: How people treat computers, television, and new media like real people and places. 1996.
[4] Norman I. Badler et al. Interactive behaviors for bipedal articulated figures. SIGGRAPH, 1991.
[5] J. Cappella. Mutual influence in expressive behavior: Adult-adult and infant-adult dyadic interaction. Psychological Bulletin, 1981.
[6] Daniel Thalmann et al. The Direction of Synthetic Actors in the Film Rendez-vous à Montréal. IEEE Computer Graphics and Applications, 1987.
[7] J. Cappella. Behavioral and Judged Coordination in Adult Informal Social Interactions: Vocal and Kinesic Indicators. 1997.
[8] Henrique S. Malvar et al. Making faces. SIGGRAPH Courses, 1998.
[9] Kristinn R. Thórisson. Layered Modular Action Control for Communicative Humanoids. 1997.
[10] Deborah Davis et al. Perceptions of unresponsive others: Attributions, attraction, understandability, and memory of their utterances. 1984.
[11] Norman I. Badler et al. Inverse kinematics positioning using nonlinear programming for highly articulated figures. ACM Transactions on Graphics, 1994.
[12] E. Vesterinen et al. Affective Computing. Encyclopedia of Biometrics, 2009.
[13] P. Ekman et al. Autonomic nervous system activity distinguishes among emotions. Science, 1983.
[14] Daniel Thalmann et al. SMILE: A Multilayered Facial Animation System. Modeling in Computer Graphics, 1991.
[15] Michael Girard et al. Computer animation of knowledge-based human grasping. SIGGRAPH, 1991.
[16] W. Johnson et al. Task-oriented collaboration with embodied agents in virtual worlds. 2001.
[17] Giacomo Mauro D'Ariano. The Journal of Personality and Social Psychology. 2002.
[18] K. Chang et al. Embodiment in conversational interfaces: Rea. CHI '99, 1999.
[19] Akikazu Takeuchi et al. Speech Dialogue With Facial Displays: Multimodal Human-Computer Conversation. ACL, 1994.
[20] Mark Steedman et al. Animated conversation: Rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents. SIGGRAPH, 1994.
[21] Catherine Pelachaud et al. Eye Communication in a Conversational 3D Synthetic Agent. AI Communications, 2000.
[22] Norman I. Badler et al. A Parameterized Action Representation for Virtual Human Agents. 1998.
[23] Norman I. Badler et al. Instructions, Intentions and Expectations. Artificial Intelligence, 1995.
[24] Donald W. Fiske et al. Face-to-face interaction: Research, methods, and theory. 1977.
[25] Nicole Chovil. Social determinants of facial displays. 1991.
[26] Akikazu Takeuchi et al. Communicative facial displays as a new conversational modality. INTERCHI, 1993.
[27] J. Cappella et al. Attitude similarity, relational history, and attraction: The mediating effects of kinesic and vocal behaviors. 1990.
[28] Norman I. Badler et al. Making Them Move: Mechanics, Control & Animation of Articulated Figures. 1990.
[29] Deborah Davis et al. Consequences of responsiveness in dyadic interaction: Effects of probability of response and proportion of content-related responses on interpersonal attraction. 1979.
[30] Justine Cassell et al. Human conversation as a system framework: Designing embodied conversational agents. 2001.
[31] S. Feldstein et al. Rhythms of dialogue. 1970.
[32] J. Cappella. The Facial Feedback Hypothesis in Human Interaction. 1993.
[33] H. Martin et al. When pleasure begets pleasure: Recipient responsiveness as a determinant of physical pleasuring between heterosexual dating couples and strangers. 1978.
[34] Norman I. Badler et al. Where to Look? Automating Attending Behaviors of Virtual Human Characters. Agents, 1999.
[35] L. Hudson. Frames of Mind. 1970.
[36] Lance Williams et al. Animating images with drawings. SIGGRAPH, 1994.
[37] K. Scherer. Vocal affect expression: A review and a model for future research. Psychological Bulletin, 1986.
[38] Hyeongseok Ko. Kinematic and dynamic techniques for analyzing, predicting, and animating human locomotion. 1995.
[39] Norman I. Badler et al. Where to Look? Automating Attending Behaviors of Virtual Human Characters. AGENTS '99, 1999.
[40] J. Cappella. The Biological Origins of Automated Patterns of Human Interaction. 1991.
[41] H. Gardner. Multiple intelligences: The theory in practice. 1993.
[42] K. R. Thórisson et al. Layered modular action control for communicative humanoids. Proceedings of Computer Animation '97, 1997.
[43] Catherine Pelachaud et al. Performative facial expressions in animated faces. 2001.
[44] David Zeltzer et al. Task-level graphical simulation: Abstraction, representation, and control. 1991.
[45] J. Cappella. Production Principles for Turn-Taking Rules in Social Interaction: Socially Anxious vs. Socially Secure Persons. 1985.
[46] Thomas Rist et al. The automated design of believable dialogues for animated presentation teams. 2001.
[47] Rodney A. Brooks et al. A robot that walks; emergent behaviors from a carefully evolved network. Proceedings of the 1989 International Conference on Robotics and Automation, 1989.
[48] James C. Lester et al. Deictic and emotive communication in animated pedagogical agents. 2001.
[49] H. Jones et al. Frames of Mind. Mental Health, 1969.
[50] Peter C. Litwinowicz et al. Facial Animation by Spatial Mapping. 1991.
[51] J. Burgoon et al. Interpersonal Adaptation: Dyadic Interaction Patterns. 1995.