Egocentric Visual Event Classification with Location-Based Priors

We present a method for visual classification of actions and events captured from an egocentric point of view. The method tackles the challenge of a moving camera by building deformable graph models for action classification. These models are learned from low-resolution, roughly stabilized difference images acquired with a single monocular camera. In parallel, raw images from the camera are used to estimate the user's location with a visual Simultaneous Localization and Mapping (SLAM) system. Action-location priors, learned from a labeled set of locations, further aid action classification and place events in context. We report results on a dataset collected in a cluttered environment, consisting of routine manipulations performed on untagged objects.
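
The fusion of appearance-based action scores with action-location priors can be illustrated as a simple Bayesian product, P(action | features, location) ∝ P(features | action) · P(action | location). The sketch below is a minimal illustration under assumed placeholders, not the paper's implementation: the action labels, location set, prior table, and score vector are hypothetical, and in the paper the likelihoods come from deformable graph models over difference images rather than the generic classifier output assumed here.

```python
import numpy as np

# Hypothetical action and location labels; the paper's actual label set is
# the routine object manipulations recorded in its dataset.
ACTIONS = ["pour_kettle", "open_fridge", "chop_vegetables"]
LOCATIONS = ["sink", "fridge", "counter"]

# P(action | location): an action-location prior table, learned from a
# labeled set of locations (rows: locations, columns: actions).
# Values here are invented for illustration.
action_given_location = np.array([
    [0.70, 0.05, 0.25],   # sink
    [0.10, 0.80, 0.10],   # fridge
    [0.25, 0.15, 0.60],   # counter
])

def classify_with_location_prior(action_scores, location_idx):
    """Fuse per-action likelihoods with a location-conditioned prior.

    action_scores: appearance-based likelihoods P(features | action)
    (in the paper, from the deformable graph action models);
    location_idx: index of the location estimate from the visual SLAM
    system. Returns the MAP action and the normalized posterior under
    P(a | x, l) proportional to P(x | a) * P(a | l).
    """
    posterior = action_scores * action_given_location[location_idx]
    posterior = posterior / posterior.sum()  # normalize for readability
    return ACTIONS[int(np.argmax(posterior))], posterior

# Toy example: appearance scores alone are ambiguous.
scores = np.array([0.40, 0.35, 0.25])  # hypothetical classifier output
best, post = classify_with_location_prior(scores, LOCATIONS.index("fridge"))
print(best, post)
```

In this toy example the appearance scores alone are nearly uniform, but conditioning on the fridge location makes "open_fridge" the clear MAP estimate, which is the sense in which location priors bring events into context.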
