You just do not understand me! Speech Recognition in Human Robot Interaction

Speech recognition has not yet fully permeated our interaction with devices. We therefore advocate ROILA, a speech-recognition-friendly artificial language that was initially shown to outperform English, albeit under constrained conditions. ROILA is intended for talking to robots, and in this paper we present an experimental study comparing the recognition of ROILA and English when speech is captured by a robot's microphones, both while the robot's head is moving and while it is stationary. Our results show no significant difference between ROILA and English, but the type of microphone and the robot's head movement each had a significant effect. In conclusion, we suggest implications for human-robot (speech) interaction.
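As a hedged illustration of the kind of analysis such a comparison involves (not the authors' actual pipeline), the sketch below shows how per-utterance word-level accuracy could be computed from reference and recognized transcripts and then aggregated over the experimental factors named in the abstract: language (ROILA vs. English), microphone type, and head movement. All data and helper names here are hypothetical.

```python
# Hypothetical sketch: word-level recognition accuracy per utterance,
# aggregated over the factors described in the abstract
# (language: ROILA vs. English, microphone type, head moving vs. stationary).
# Toy data and function names are illustrative, not the authors' method.

from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


# Each trial: (language, microphone, head_state, reference, recognizer output).
# Toy examples only; real transcripts would come from the robot's recognizer.
trials = [
    ("ROILA",   "robot_mic",    "moving",     "pito bama",    "pito bama"),
    ("English", "robot_mic",    "moving",     "walk forward", "work for word"),
    ("ROILA",   "external_mic", "stationary", "pito bama",    "pito bama"),
    ("English", "external_mic", "stationary", "walk forward", "walk forward"),
]

# Aggregate mean word accuracy (1 - WER) per experimental condition.
scores = defaultdict(list)
for language, microphone, head_state, ref, hyp in trials:
    scores[(language, microphone, head_state)].append(1.0 - word_error_rate(ref, hyp))

for condition, accs in sorted(scores.items()):
    print(condition, f"mean word accuracy = {sum(accs) / len(accs):.2f}")
```

In a real study, the per-condition accuracies produced this way would feed into a significance test across the language, microphone, and head-movement factors.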
