Improving speech recognition with the robot interaction language

Abstract: This article presents the design and evaluation of the Robot Interaction Language (ROILA), a spoken artificial language optimized for speech recognition and intended for humans interacting with robots. We evaluated ROILA in a Dutch high school. The language was taught as part of the science curriculum, followed by a controlled experiment in which it was compared against English. The results showed that ROILA performed better than English in terms of both objective recognition accuracy and subjective assessment by the students. We estimate the trade-off between this benefit and the effort required to learn ROILA. In a regular usage scenario, it would pay off to use ROILA.
