Paper Information - Kinsight: Localizing and Tracking Household Objects Using Depth-Camera Sensors

Kinsight: Localizing and Tracking Household Objects Using Depth-Camera Sensors

We solve the problem of localizing and tracking household objects using a depth-camera sensor network. We design and implement Kinsight, which tracks household objects indirectly: by tracking human figures, and by detecting and recognizing objects from human-object interactions. We devise two novel algorithms: (1) Depth Sweep, which uses depth information to efficiently extract objects from an image, and (2) Context Oriented Object Recognition, which uses location history and activity context along with an RGB image to recognize objects at home. We thoroughly evaluate Kinsight's performance with a rich set of controlled experiments. We also deploy Kinsight in real-world scenarios and show that it achieves an average localization error of about 13 cm.

John A. Stankovic | S. M. Shahriar Nirjon

[1] Tony F. Chan,et al. Active contours without edges , 2001, IEEE Trans. Image Process..

[2] P. Fua,et al. Towards Recognizing Feature Points using Classification Trees , 2004 .

[3] Mohamed R. Amer,et al. Multiobject tracking as maximum weight independent set , 2011, CVPR 2011.

[4] Stephen Gould,et al. Region-based Segmentation and Object Detection , 2009, NIPS.

[5] Vincent Lepetit,et al. Stable real-time 3D tracking using online and offline information , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[7] Prashant J. Shenoy,et al. Sherlock: automatically locating objects for humans , 2008, MobiSys '08.

[8] Tarek F. Abdelzaher,et al. Range-free localization schemes for large scale sensor networks , 2003, MobiCom '03.

[9] Junzhou Huang,et al. Robust tracking using local sparse appearance model and K-selection , 2011, CVPR 2011.

[10] John A. Stankovic,et al. Context-aware wireless sensor networks for assisted living and residential monitoring , 2008, IEEE Network.

[11] Chong Wang,et al. RFID-Based 3-D Positioning Schemes , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[12] Andrea Vedaldi,et al. Objects in Context , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[13] Derek Hoiem,et al. Category Independent Object Proposals , 2010, ECCV.

[14] Subhransu Maji,et al. Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[15] Paul A. Viola,et al. Robust Real-time Object Detection , 2001 .

[16] Ian D. Reid,et al. Real-time tracking of multiple occluding objects using level sets , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17] Tsuhan Chen,et al. Extracting adaptive contextual cues from unlabeled regions , 2011, 2011 International Conference on Computer Vision.

[18] James M. Rehg,et al. A Scalable Approach to Activity Recognition based on Object Use , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[19] Radu Stoleru,et al. Mobile Sensor Network Localization in Harsh Environments , 2010, DCOSS.

[20] Yong Jae Lee,et al. Object-graphs for context-aware category discovery , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21] Prashant J. Shenoy,et al. Ferret: RFID Localization for Pervasive Multimedia , 2006, UbiComp.

[22] Bart Selman,et al. Human Activity Detection from RGBD Images , 2011, Plan, Activity, and Intent Recognition.

[23] Kikuo Fujimura,et al. Visual Tracking Using Depth Data , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[24] Michael R. Souryal,et al. RFID-based localization and tracking technologies , 2011, IEEE Wireless Communications.

[25] Reinhard German,et al. ALF: An autonomous localization framework for self-localization in indoor environments , 2011, 2011 International Conference on Distributed Computing in Sensor Systems and Workshops (DCOSS).

[26] G. McLachlan,et al. The EM Algorithm and Extensions: Second Edition , 2008 .

[27] Gary J. Sullivan,et al. Reduced-complexity search for video coding geometry partitions using texture and depth data , 2011, 2011 Visual Communications and Image Processing (VCIP).

[28] Vincent Lepetit,et al. Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes , 2011, 2011 International Conference on Computer Vision.

[29] Andrew W. Fitzgibbon,et al. Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[30] Neil A. Thacker,et al. The Bhattacharyya metric as an absolute similarity measure for frequency coded data , 1998, Kybernetika.

[31] Dieter Fox,et al. Toward object discovery and modeling via 3-D scene comparison , 2011, 2011 IEEE International Conference on Robotics and Automation.

[32] Antonio Torralba,et al. Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.