Interactive multiple object learning with scanty human supervision

Efficient approach for learning and detecting multiple objects in image sequences. Interactive object learning using scanty human supervision. Computation of multiple online classifiers using human-robot interaction. Real-time performance in diverse recognition problems.

We present a fast, online human-robot interaction approach that progressively learns multiple object classifiers using scanty human supervision. Given an input video stream recorded during the human-robot interaction, the user needs to annotate only a small fraction of frames to compute object-specific classifiers based on random ferns that share the same features. The resulting methodology is fast (complex object appearances can be learned in a few seconds), versatile (it can be applied to unconstrained scenarios), and scalable (real experiments show we can model up to 30 different object classes), and it minimizes the amount of human intervention by leveraging the uncertainty measures associated with each classifier.

We thoroughly validate the approach on synthetic data and on real sequences acquired with a mobile platform in indoor and outdoor scenarios containing a multitude of different objects. We show that, with little human assistance, we are able to build object classifiers that are robust to viewpoint changes, partial occlusions, varying lighting, and cluttered backgrounds.
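The two ideas named in the abstract, ferns whose features are shared across object classes and a human query driven by classifier uncertainty, can be illustrated with a small sketch. This is not the authors' implementation. The class `SharedFerns`, the helper `needs_human`, the pairwise comparison test `x[i] > x[j]`, the per-bin count initialization, and the parameters `beta` and `theta` are all assumptions made for illustration.

```python
import random


class SharedFerns:
    """Toy online ferns: every class reuses the same binary tests (assumed design)."""

    def __init__(self, n_ferns, n_tests, dim, seed=0):
        rng = random.Random(seed)
        # A fern is a list of index pairs (i, j); each test asks x[i] > x[j].
        self.ferns = [[tuple(rng.sample(range(dim), 2)) for _ in range(n_tests)]
                      for _ in range(n_ferns)]
        self.n_bins = 2 ** n_tests
        self.counts = {}  # label -> one [positives, negatives] pair per fern bin

    def leaves(self, x):
        """Evaluate the shared tests once; the result serves all classes."""
        out = []
        for fern in self.ferns:
            idx = 0
            for i, j in fern:
                idx = (idx << 1) | int(x[i] > x[j])
            out.append(idx)
        return out

    def add_class(self, label):
        # Start every bin at one positive and one negative (uniform prior).
        self.counts[label] = [[[1, 1] for _ in range(self.n_bins)]
                              for _ in self.ferns]

    def update(self, label, leaves, is_positive):
        # Online update: increment one count per fern for this class only.
        for fern_id, bin_id in enumerate(leaves):
            self.counts[label][fern_id][bin_id][0 if is_positive else 1] += 1

    def confidence(self, label, leaves):
        # Mean over ferns of the positive ratio in the bin the sample hit.
        ratios = []
        for fern_id, bin_id in enumerate(leaves):
            pos, neg = self.counts[label][fern_id][bin_id]
            ratios.append(pos / (pos + neg))
        return sum(ratios) / len(ratios)


def needs_human(conf, beta=0.5, theta=0.2):
    """Query the user when conf lies within theta / 2 of the threshold beta."""
    return abs(conf - beta) < theta / 2
```

In an interaction loop built on this sketch, `leaves` would be computed once per candidate window, each class would score it with `confidence`, and the user would be asked for a label only when `needs_human` returns true. Confident predictions could then be used to update the classifier without supervision, which is one plausible way to keep the number of annotated frames small.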
