From Talking and Listening Robots to Intelligent Communicative Machines

It is a popular view that the future will be inhabited by intelligent talking and listening robots with whom we shall converse using the full palette of linguistic expression available to us as human beings. Of course, recent technical and engineering developments such as Siri would appear to suggest that important steps are being made in that direction – and indeed they are. However, it is argued here that we need to go far beyond our current capabilities and understanding towards a more integrated perspective; simply interfacing state-of-the-art speech technology with a state-of-the-art robot is very unlikely to lead to effective human-robot interaction. We need to move from developing robots that simply talk and listen to evolving intelligent communicative machines that are capable of truly understanding human behavior, and this means that we need to look beyond speech, beyond words, beyond meaning, beyond communication, beyond dialog and beyond one-off interactions.
