A central issue in any attempt to build machines that learn new things by interacting with people is the ability to understand intentions: the ability to predict and form expectations about the intended goal state of an agent’s actions. By the age of four, children are undeniably able to reason about intentions. In this paper I present various findings from cognitive science and developmental psychology about mechanisms that precede a fully developed theory of mind and may in fact be instrumental to its development: eye recognition and tracking, low-level pattern recognition of the physical and temporal structure of action sequences, intermodal self-representation, efficiency expectations for goal-directed actions, attraction to others who are ‘like me’, and an appetite for being imitated. I then address each of these from an implementation perspective as design guidelines for building a machine that is to understand intentions.

INTRODUCTION

In the effort to build machines with human-like intelligence, one approach is to gain insight and clues from human development. Hence, to build a machine that learns something new, this approach directs us to look at how humans learn something new. There are essentially two ways in which people learn new things: either through their own experience or through interacting with another person. Learning on one’s own generally involves a process of trial and error, in which one has a particular goal or end state in mind (Minsky, 2003). Without discounting the importance of this process in the grand scheme, it is quite inefficient unless you have luck or prior experience to get you going in the right direction. From an evolutionary point of view, when survival is the ultimate goal, other people (especially parents, with whom infants presumably spend the most time) are likely to be the most efficient source of information.
In developmental psychology this is described as parental scaffolding, in which the parent provides constraints or structure to facilitate the child’s learning process. Researchers in artificial intelligence envision this kind of social learning as an avenue towards machines that acquire new knowledge autonomously and increase in complexity without intervention from the designer (Breazeal, 2002).

Fundamentally, learning something new from another person requires being able to decipher their actions and understand their intentions. How exactly children come to understand intentions is an active area of research in cognitive science and developmental psychology. In the first half of this paper, I present findings and speculation from these fields concerning the mechanisms instrumental to the development of an ability to understand intentions. In the second half, I address these mechanisms from an implementation perspective, proposing five design guidelines for building a machine that understands intentions.

MECHANISMS FOR UNDERSTANDING INTENTIONS

The ability to learn new goals requires seeing the world not merely as a series of events but in terms of goal-directed action, an endless sequence of means and ends. Learning something new then involves finding a way to an end state that has not been seen before, or perhaps learning a new path to a known end state. If this learning takes place while interacting with another individual, it requires the ability to make sense of their actions and understand their intentions.

Intentions and intentionality belong to that set of commonsense terms that everyone understands, yet it is hard to find a single definition that covers all of what we mean by these words. In this paper I use the term understanding intentions to refer to the ability to predict and form expectations about the intended goal state of an agent’s actions.
This involves reasoning about the agent’s actions in terms of their goals, beliefs and desires. This reasoning about mental states is why understanding intentions and theory of mind are sometimes used interchangeably in the literature.

Since the 1980s, the false-belief task has been a widely accepted litmus test for theory of mind (Lewis & Mitchell, 1994). There are two widely used versions of the false-belief task: the unexpected transfer and the deceptive box. In a typical unexpected transfer experiment, a child sees one experimenter come into the room, put chocolate in a cabinet, and then leave the room. Then a second experimenter comes into the room and moves the chocolate to a different cabinet. When the first experimenter returns to the room, the child is asked where that experimenter will look for the chocolate. In a typical deceptive box experiment, the child is shown a Smarties box and asked ‘What do you think is in here?’ and, as expected, replies ‘Smarties!’ Then the child is shown that there are actually pencils inside, and is asked what another child will think is in the box. Children under four consistently fail these tasks; around the age of four they become able to reason about another person’s beliefs being different from their own and consistently pass them.

The sharp difference between the abilities of 3- and 4-year-olds in the false-belief task leads many scientists to conclude that theory of mind develops in stages and that something simply changes in a child around their fourth birthday. Other scientists, however, are not satisfied with this conclusion and seek to understand what precedes the four-year-old’s mechanism for reasoning about beliefs and understanding intentions. This section details various findings about mechanisms that precede a fully developed theory of mind and may in fact be instrumental to its development.
This group of findings is not exhaustive; I have selected those that I believe lead to concrete principles for developing intentional understanding in machines.

Self-Representation and the ‘Like-Me’ Attraction

We can learn a lot about what children understand of intentions by looking at imitation. Some of the most seminal and extensive work on imitation comes from Meltzoff and Moore. Numerous studies show that when children imitate they are doing more than copying actions: they are copying the goal, and they can do so in various contexts and after a delay, well before the age of four (Meltzoff, 1993; Meltzoff, 1996). Meltzoff et al. hypothesize that the ‘commonsense psychology’ or theory of mind that children undoubtedly show by the age of five has roots in infants’ ability to recognize that certain other things are ‘like me’ and in their predisposition to attend to those ‘like-me’ agents.

One of their most important findings is that very young infants (a few hours old) can and do imitate facial expressions (tongue out, lips out, lips open). In this study, the results of which have been replicated numerous times, infants’ faces are videotaped as they observe an experimenter making various facial expressions. Adults who cannot see the model presented to the infants are then asked to identify which facial expressions the infants on the video are making. The results show that the facial expression of the model significantly increased the likelihood of the infant producing the same expression. Moreover, a variation of the experiment showed that this is not just a reflex behavior. Until then, imitation was assumed to develop later, but these findings suggest that some aspects must be innate. The fact that this is facial imitation (and the infants have surely never seen their own faces) shows an innate cross-modal imitation ability, in which body-movements-as-seen are successfully mapped onto body-movements-as-felt.
Meltzoff and Moore term this innate self-representation ‘active intermodal mapping’. An additional set of studies found that infants also have an attraction to, or appetite for, being imitated. They prefer to attend to an adult who acts ‘just like me’, and they recognize both the temporal and the structural equivalences of the imitation. The fact that infants enjoy and attend to people imitating them is essentially why parental scaffolding works: some actions are selectively imitated more often and thus reinforced.

This work points out three important precursors to understanding intentions: an intermodal body map, recognition of and attraction to others who are ‘like me’, and an appetite for being imitated. Meltzoff and Gopnik tie these together nicely, characterizing mutual imitation as children’s ‘tutorial in naïve psychology’, just as observing physical reactions is a ‘tutorial in naïve physics’.

Parsing Action Sequences Along Intentional Lines

One way to view the challenge of understanding intentions is as a perceptual problem. The brain is constantly bombarded with streams of sensory data that it must translate into something meaningful. Understanding intentions is a continual process: when observing an action sequence, it involves parsing the action stream into groups of actions that collectively represent an intended act. Baird and Baldwin designed a series of experiments to explore the relation between intentions and action parsing (Baird & Baldwin, 2001). In the same way that we naturally parse speech along phoneme lines, Baird and Baldwin found that adults naturally parse action sequences along intentional lines. The experiment involved video sequences of a woman cleaning her kitchen. A group of subjects coded the video sequences, determining precisely where adults agreed that one intended act ended and another began. Notably, there was a high level of agreement among the coders.
For the experiment, short tones were placed in the audio track according to intentional lines. Tones were placed either at the endpoint of an intentional act or at the midpoint. Then a second group of adults each watched four sequences of the video, two with endpoint tones and two with midpoint tones, and were asked to remember exactly when the beep occurre
[1] D. Premack. The infant's theory of self-propelled objects. Cognition, 1990.
[2] G. Csibra. Teleological and referential understanding of action in infancy. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 2003.
[3] Jodie A. Baird, et al. Making sense of human behavior: Action parsing and intentional inference. 2001.
[4] J. E. Rose, et al. Autonomic Nervous System Activity Distinguishes Among Emotions. 2009.
[5] S. Baron-Cohen, et al. A model of the mindreading system: Neuropsychological and neurobiological perspectives. 1994.
[6] P. Bloom. Mindreading, Communication and the Learning of Names for Things. 2002.
[7] A. Goldman, et al. Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences, 1998.
[8] S. Baron-Cohen, et al. Understanding other minds: perspectives from autism. 1994.
[9] Carlos Hitoshi Morimoto, et al. Pupil detection and tracking using multiple light sources. Image Vis. Comput., 2000.
[10] C. Breazeal, et al. Robots that imitate humans. Trends in Cognitive Sciences, 2002.
[11] C. Heyes, et al. Social learning in animals: the roots of culture. 1996.
[12] Dare A. Baldwin, et al. Introduction: The significance of intentionality. 2001.
[13] D. Perrett, et al. Imitation, mirror neurons and autism. Neuroscience & Biobehavioral Reviews, 2001.
[14] A. Meltzoff. Chapter 16 - The Human Infant as Imitative Generalist: A 20-Year Progress Report on Infant Imitation with Implications for Comparative Psychology. 1996.
[15] N. Freeman. Children's early understanding of mind. 1994.