Scene in the Loop: Towards Adaptation-by-Tracking in RGB-D Data

This paper addresses the problem of adapting an existing object detector to the characteristics of its environment in an unsupervised manner. The technique aims to reject false positive detections by exploiting information from the environment and from the tracking system. We follow the intuition that objects present in the same scene share similar characteristics. Our aim is therefore to identify false positives by finding the detections that do not share common properties in RGB-D feature space. To this end, we use a one-class SVM trained without labels, which allows our approach to adapt to the environment in which it is tracking. We developed and evaluated our method on a people detection and tracking system that operates on Kinect data. Our experimental evaluation shows that the method outperforms standard outlier detection techniques and that it removes over 50% of the false positives without eliminating a significant number of correct detections.
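The outlier-rejection idea in the abstract can be illustrated with a minimal sketch, assuming scikit-learn's `OneClassSVM` as the one-class SVM. The helper name `filter_detections`, the two-dimensional toy features, and the values of `nu` and `gamma` are all hypothetical and are not taken from the paper, which works on RGB-D features of tracked people detections.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def filter_detections(features, nu=0.1, gamma=1.0):
    """Return a boolean mask that is True for detections kept as inliers.

    The model is fitted on the unlabeled detections themselves, so no
    ground-truth annotation is needed (hypothetical helper, not the
    authors' implementation).
    """
    model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
    labels = model.fit_predict(features)  # +1 = inlier, -1 = outlier
    return labels == 1


# Toy stand-in for per-detection feature vectors (assumed, not real data):
# 60 detections that share similar properties, laid out on a small grid ...
inliers = np.array(
    [[1.70 + 0.01 * (i % 6), 0.50 + 0.01 * (i // 6)] for i in range(60)]
)
# ... and 3 detections whose features differ strongly from the rest.
outliers = np.array([[0.5, 3.0], [3.0, -2.0], [-1.0, 0.0]])
features = np.vstack([inliers, outliers])

keep = filter_detections(features)
print("kept", int(keep.sum()), "of", len(keep), "detections")
```

In this sketch `nu` is an upper bound on the fraction of detections treated as outliers, so it trades false positive removal against the loss of correct detections. A real system would need to choose it, and the kernel width `gamma`, for the scene at hand.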
