The majority of our waking hours are spent engaging in social interactions. Some of these interactions occur at the level of long-term strategic planning, while others
take place at faster time scales, such as in conversations or card games. The ability to perceive subtle gestural, postural, and facial cues, in addition to verbal language,
in real time is a critical component of such interactions. An understanding of the underlying perceptual
primitives that support this kind of real-time social cognition is key to understanding
social development.
Robots present an ideal opportunity to study the development of social interaction
in infants [Fasel, Deak, Triesch, Movellan 2002]. It is possible to create robots that
exhibit precisely controlled contingency structures. By observing how infants interact
with these robots we gain an opportunity to understand how infants identify the
operating characteristics of the social agents with whom they interact.
We have recently developed a social interaction robot, "Ruby", designed to communicate
with children. Ruby is endowed with the following real-time perceptual and motor
primitives to facilitate social interaction: face tracking, motor control, and speech
detection. It communicates via head and eye movements, and pilot studies indicate
that Ruby is fun and non-threatening to children.
Ruby's face-tracking system consists of three cues taken from three inputs. The first two inputs
are high-resolution pan-tilt-zoom color cameras, which serve as the "eyes". The third
input is an omni-directional camera acting as Ruby's peripheral vision. Each eye
uses the MPLab's contrast-feature-based frontal face finder [Fasel et al. CVIU 2004]
and an adaptive color-based tracker [Ishiguro et al. 2003; Hershey et al. CVPR 2004].
Ruby combines both of these cues to find both frontal and rotated faces at more than
30 frames per second.
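To make the cue combination concrete, the Python sketch below shows one plausible per-camera fusion rule: a confident frontal detection wins and re-seeds the color model, and the color cue takes over when the face rotates away. The CueFusionTracker class, the FaceBox fields, and the confidence threshold are illustrative assumptions, not the MPLab implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaceBox:
    x: float          # box centre, image coordinates
    y: float
    size: float       # box side length in pixels
    confidence: float


class CueFusionTracker:
    """Per-camera fusion of two face cues, in the spirit of the text above:
    a frontal face finder (reliable but frontal-only) and an adaptive color
    tracker (keeps following rotated faces once initialised). Names and
    thresholds are illustrative assumptions, not Ruby's actual code."""

    def __init__(self, min_frontal_conf: float = 0.5):
        self.min_frontal_conf = min_frontal_conf
        self.color_track: Optional[FaceBox] = None  # last color-tracker state

    def update(self,
               frontal: Optional[FaceBox],
               color: Optional[FaceBox]) -> Optional[FaceBox]:
        """Combine the two cues for one frame of one camera."""
        if frontal is not None and frontal.confidence >= self.min_frontal_conf:
            # A confident frontal detection wins and re-seeds the color model,
            # so the color tracker can take over when the face turns away.
            self.color_track = frontal
            return frontal
        if color is not None:
            # No frontal hit this frame: fall back to the color cue.
            self.color_track = color
            return color
        return None  # neither cue fired; peripheral camera may re-acquire


if __name__ == "__main__":
    tracker = CueFusionTracker()
    # Frame 1: frontal face seen; frame 2: face turns, only the color cue remains.
    print(tracker.update(FaceBox(120, 80, 40, 0.9), None))
    print(tracker.update(None, FaceBox(135, 82, 38, 0.7)))
```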
Ruby's motor control system currently has three components: neck control, eye control, and control of external objects used in experiments. Ruby
also features speech detection [Pellom 2004] and responses governed by variable delay parameters.
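The controlled contingency structure mentioned above is the experimentally interesting part: the timing of Ruby's responses can be manipulated precisely. The toy Python loop below illustrates the idea, drawing each response latency from an experimenter-set delay range after speech is detected. The function names, polling scheme, and uniform delay distribution are hypothetical stand-ins, not Ruby's actual control software.

```python
import random
import time
from typing import Callable


def run_contingency_loop(speech_detected: Callable[[], bool],
                         respond: Callable[[], None],
                         delay_range_s: tuple = (0.2, 1.5),
                         poll_interval_s: float = 0.05,
                         max_steps: int = 200) -> None:
    """Toy event loop for an experimenter-controlled contingency structure.

    speech_detected and respond stand in for the robot's speech detector and
    its head/eye response; the uniform delay range is an illustrative choice,
    not a parameter reported in the text.
    """
    for _ in range(max_steps):
        if speech_detected():
            # Draw this trial's response latency from the configured range,
            # so the contingency timing can be varied per condition.
            delay = random.uniform(*delay_range_s)
            time.sleep(delay)
            respond()
        time.sleep(poll_interval_s)


if __name__ == "__main__":
    # Simulated stand-ins: speech occurs with 5% probability per poll,
    # and the "response" is just a log line.
    run_contingency_loop(
        speech_detected=lambda: random.random() < 0.05,
        respond=lambda: print("Ruby: orienting head and eyes toward speaker"),
        delay_range_s=(0.2, 1.0),
        max_steps=100,
    )
```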
We are now adding eye and eye-blink detection [Fasel et al. CVIU 2004],
expression recognition [Littlewort-Ford, Bartlett et al. 2004], recognition of common
communicative words in English, arm movements, finger pointing, and touch sensors.
We hope to use Ruby to collect and analyze data on social interaction and contingency,
and on the development of social interaction in infants.
[1] Tetsuo Ono et al. Development and evaluation of an interactive humanoid robot "Robovie". Proceedings 2002 IEEE International Conference on Robotics and Automation, 2002.
[2] Ian R. Fasel et al. Combining embodied models and empirical research for understanding the development of shared attention. Proceedings 2nd International Conference on Development and Learning (ICDL 2002), 2002.
[3] Cynthia Breazeal et al. Designing Sociable Robots. 2002.
[4] Tetsuo Ono et al. Development of an Interactive Humanoid Robot "Robovie": An Interdisciplinary Research Approach between Cognitive Science and Robotics. 2003.
[5] Large-Scale Convolutional HMMs for Real-Time Video Tracking. 2003.
[6] Ian R. Fasel et al. A generative framework for real time object detection and classification. Computer Vision and Image Understanding, 2005.