Object Localization Using Kinect

Ambient intelligence is an emerging research topic that involves scene understanding and object recognition. Because scene understanding requires the position of objects, a binary classification that only decides whether an object is present in the scene is not sufficient. This paper proposes a system that localizes objects with a 3D bounding box. This is achieved by augmenting the 2D location, extracted using a sliding-window-based method, with 3D information acquired from a stereo camera. Because an exhaustive sliding-window search is computationally intensive, a branch-and-bound approach is used instead; it reduces the processing time, typically running in sublinear time, without giving up the optimality guarantee. To discriminate between object classes, an SVM classifier is applied to a bag-of-visual-words representation. The system is tested on real-world objects using a Microsoft Kinect sensor.
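The step of augmenting a 2D window with depth data can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a pinhole camera model, a depth map already registered to the colour image, a depth value of 0 marking missing measurements, and an axis-aligned 3D box. The function name `lift_box_to_3d` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

# (left, top, right, bottom) in inclusive pixel coordinates (assumed convention)
Box2D = Tuple[int, int, int, int]
Point3D = Tuple[float, float, float]
# (min corner, max corner) of an axis-aligned box in camera coordinates
Box3D = Tuple[Point3D, Point3D]


def lift_box_to_3d(
    depth: List[List[float]],
    box: Box2D,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Box3D]:
    """Back-project every valid depth pixel inside a 2D window and
    return the axis-aligned 3D box that encloses the resulting points."""
    left, top, right, bottom = box
    points: List[Point3D] = []
    for v in range(top, bottom + 1):
        for u in range(left, right + 1):
            z = depth[v][u]
            if z <= 0:
                # Assumption: non-positive depth means "no measurement".
                continue
            # Pinhole back-projection from pixel (u, v) and depth z.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    if not points:
        return None  # no usable depth inside the window
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs
```

Note that this sketch encloses every valid pixel in the window, including background pixels, so a practical system would probably filter the depth values (for example, by keeping only those near the median) before computing the box.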
