Understanding Human Hand Gestures for Learning Robot Pick-and-Place Tasks

Programming robots by human demonstration is an intuitive approach, particularly through hand gestures. Because robot pick-and-place tasks are widely used in industrial factories, this paper proposes a framework for learning robot pick-and-place tasks by understanding human hand gestures. The proposed framework consists of a gesture-recognition module and a robot behaviour control module. In the gesture-recognition module, transport empty (TE), transport loaded (TL), grasp (G), and release (RL) from Gilbreth's therbligs are the hand gestures to be recognized. A convolutional neural network (CNN) is adopted to recognize these gestures from camera images. To achieve robust performance, a skin model based on a Gaussian mixture model (GMM) filters out non-skin colours in the image, and position and orientation calibration is applied to obtain a neutral hand pose before CNN training and testing. In the robot behaviour control module, robot motion primitives corresponding to TE, TL, G, and RL are implemented on the robot. To manage these primitives in the robot system, a behaviour-based programming platform built on the Extensible Agent Behavior Specification Language (XABSL) is adopted. Because XABSL makes the robot primitives flexible and reusable, the hand-motion sequence produced by the gesture-recognition module can be used directly in the XABSL programming platform to implement robot pick-and-place tasks. In an experimental evaluation with seven subjects performing seven hand gestures, the average recognition rate was 95.96%. Moreover, using the XABSL programming platform, a cube-stacking task was readily programmed by human demonstration.
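To make the recognition pipeline concrete, the sketch below illustrates the skin-colour filtering stage that precedes the CNN. It is a minimal example assuming scikit-learn's GaussianMixture; the colour space, number of mixture components, and likelihood threshold are illustrative assumptions, not the settings used in the paper.

```python
# Minimal sketch of GMM-based skin-colour filtering (illustrative parameters only).
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_skin_model(skin_pixels: np.ndarray, n_components: int = 3) -> GaussianMixture:
    """Fit a GMM to an (N, 3) array of pixel values sampled from skin regions."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=0)
    gmm.fit(skin_pixels)
    return gmm


def skin_mask(image: np.ndarray, gmm: GaussianMixture,
              log_lik_threshold: float = -12.0) -> np.ndarray:
    """Return a boolean mask keeping pixels whose skin log-likelihood exceeds the threshold."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    log_lik = gmm.score_samples(pixels)  # per-pixel log p(x | skin)
    return (log_lik > log_lik_threshold).reshape(h, w)


if __name__ == "__main__":
    # Toy demonstration: synthetic "skin" samples clustered around one colour.
    rng = np.random.default_rng(0)
    skin_samples = rng.normal(loc=[200.0, 150.0, 130.0], scale=10.0, size=(1000, 3))
    model = fit_skin_model(skin_samples)

    image = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
    mask = skin_mask(image, model)
    print("skin pixels kept:", int(mask.sum()), "of", mask.size)
```

The masked image would then be pose-calibrated and passed to the CNN classifier, whose output gesture labels (TE, TL, G, RL) drive the corresponding XABSL motion primitives.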
