On the Use of a Low-Cost Thermal Sensor to Improve Kinect People Detection in a Mobile Robot

Detecting people is a key capability for robots that operate in populated environments. In this paper, we adopt a hierarchical approach that combines classifiers built by supervised learning to decide whether or not a person is in the robot's field of view. Our approach uses vision, depth and thermal sensors mounted on top of a mobile platform. The sensor suite combines a Kinect sensor, which provides rich vision and depth data at low cost, with a thermopile array sensor. Experiments carried out with a mobile platform on a manufacturing shop floor and in a science museum show that the combined approach drastically reduces the false positive rate obtained with any single cue. Our algorithm also outperforms other well-known approaches, such as C4 and histogram of oriented gradients (HOG).
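
As a rough illustration of why combining cues can lower false positives, the following minimal sketch fuses three per-cue scores with a fixed majority vote. It is not the hierarchy used in the paper, which is learned from data; the `CueScores` type, the `detect_person` function, the 0.5 threshold and the two-vote rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CueScores:
    """Hypothetical per-cue classifier scores, each assumed to lie in [0, 1]."""
    vision: float
    depth: float
    thermal: float


def detect_person(scores: CueScores, threshold: float = 0.5, min_votes: int = 2) -> bool:
    """Report a person only if at least `min_votes` cues exceed `threshold`.

    A fixed majority vote stands in here for a learned second-level classifier.
    """
    votes = [
        scores.vision >= threshold,
        scores.depth >= threshold,
        scores.thermal >= threshold,
    ]
    return sum(votes) >= min_votes


# A warm machine fires only the thermal cue, so the fused decision rejects it.
warm_machine = CueScores(vision=0.2, depth=0.1, thermal=0.9)
# A person in view fires all three cues.
person = CueScores(vision=0.8, depth=0.7, thermal=0.9)

print(detect_person(warm_machine))
print(detect_person(person))
```

Requiring agreement between cues means a source that fools one sensor, such as a hot surface for the thermopile, does not by itself produce a detection; a learned combiner could weight the cues differently.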
