Learning novel objects using out-of-vocabulary word segmentation and object extraction for home assistant robots

This paper presents a method by which home assistant robots learn novel objects from audio-visual input. Objects are learned using out-of-vocabulary word segmentation and object extraction. The latter half of the paper is devoted to evaluation. We propose using a task adopted from the RoboCup@Home league as a standard evaluation for real-world applications. We implemented the proposed method on a real humanoid robot and evaluated it on a task called “Supermarket”. The results show that our integrated system works well in a real application: our robot's score exceeded the highest score obtained in the RoboCup@Home 2009 competitions.
