Online feature extraction for the incremental learning of gestures in human-swarm interaction

We present a novel approach for the online learning of hand gestures in swarm robotic (multi-robot) systems. We address the problem of online feature learning by proposing Convolutional Max-Pooling (CMP), a simple feed-forward two-layer network derived from the deep hierarchical Max-Pooling Convolutional Neural Network (MPCNN). To learn and classify gestures online and incrementally, we employ a second-order online learning method, the Soft Confidence-Weighted (SCW) learning scheme. So that all robots take part collectively in the learning and recognition task and reach a swarm-level classification, we build a distributed consensus by fusing the individual decisions of the robots with the individual weights generated by their classifiers. We verify the accuracy, robustness, and scalability of the resulting solutions through emulation experiments on a large data set of real images acquired by a networked swarm of robots.
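The two ideas named above, a convolution followed by max-pooling for feature extraction and a weighted fusion of per-robot opinions, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`convolve_valid`, `max_pool`, `extract_features`, `fuse_opinions`), the fixed filter values, the non-overlapping pooling windows, and the weighted-sum voting rule are all assumptions made for illustration, and the SCW classifier itself is not shown.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def convolve_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """'Valid' 2-D cross-correlation of an image with one filter (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(fmap: Matrix, size: int) -> Matrix:
    """Non-overlapping max-pooling; rows/columns that do not fill a window are dropped."""
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def extract_features(image: Matrix, kernels: Sequence[Matrix], pool_size: int) -> List[float]:
    """Convolution layer followed by a max-pooling layer, flattened to one vector."""
    features: List[float] = []
    for kernel in kernels:
        pooled = max_pool(convolve_valid(image, kernel), pool_size)
        features.extend(value for row in pooled for value in row)
    return features


def fuse_opinions(opinions: Sequence[Tuple[str, float]]) -> str:
    """Assumed fusion rule: sum the weights per label and return the heaviest label."""
    totals: Dict[str, float] = {}
    for label, weight in opinions:
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    diagonal_filter = [[1, 0], [0, 1]]  # hypothetical filter values
    print(extract_features(image, [diagonal_filter], pool_size=2))
    # Each tuple is one robot's (predicted gesture, weight); values are made up.
    print(fuse_opinions([("stop", 0.9), ("go", 0.5), ("go", 0.6)]))
```

In a setting like the one described, each robot would feed its own feature vector to its online classifier, and only the resulting label and weight would need to be shared to reach a swarm-level decision.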
