Pick-and-place application development using voice and visual commands

Purpose – The purpose of this paper is to design an interactive industrial robotic system that assists a "layperson" in re-casting a generic pick-and-place application. A user can program a pick-and-place application simply by pointing at objects in the work area and speaking simple, intuitive natural-language commands.

Design/methodology/approach – The system was implemented in C# using the EMGU wrapper classes for OpenCV as well as the MS Speech Recognition API. The target language to be recognized was modelled using traditional augmented transition networks, which were implemented as XML grammars. The authors developed an original finger-pointing algorithm using a unique combination of standard morphological and image-processing techniques. Recognized voice commands trigger the vision component to capture what the user is pointing at. If the specified action requires robot movement, the required information is sent to the robot-control component of the system, which then transmits the com...
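As a minimal sketch of the voice-command side of such a pipeline, the following C# example loads a small command grammar into the MS Speech Recognition API (the System.Speech.Recognition namespace) and reacts to a recognized phrase. The vocabulary ("pick up the gear", etc.), the class name VoiceCommandListener, and the grammar file name commands.grxml are hypothetical illustrations, not the authors' actual grammar, which is an augmented transition network expressed as an XML (SRGS) grammar.

```csharp
using System;
using System.Speech.Recognition; // MS Speech Recognition API

class VoiceCommandListener
{
    static void Main()
    {
        // Hypothetical command vocabulary. An equivalent SRGS XML grammar
        // could instead be loaded from a file: new Grammar("commands.grxml").
        var objects = new Choices("gear", "bracket", "bolt");
        var pick = new GrammarBuilder("pick up the");
        pick.Append(objects);
        var grammar = new Grammar(pick) { Name = "pickCommand" };

        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(grammar);
            recognizer.SetInputToDefaultAudioDevice();

            // In the described system, a recognized command would trigger
            // the vision component (EMGU/OpenCV) to capture what the user
            // is pointing at; here we simply print the recognized text.
            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine($"Recognized: {e.Result.Text}");

            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening... press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

Constraining recognition to a small phrase grammar, rather than free dictation, is what makes this approach practical on a noisy factory floor: the recognizer only has to discriminate among a handful of expected commands.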
