SEVA: sensor-enhanced video annotation

In this paper, we study how a sensor-rich world can be exploited by digital recording devices such as cameras and camcorders to improve a user's ability to search through a large repository of image and video files. We design and implement a digital recording system that records the identities and locations of objects (as advertised by their sensors) along with visual images (as recorded by a camera). The process, which we refer to as sensor-enhanced video annotation (SEVA), combines a series of correlation, interpolation, and extrapolation techniques. It produces a tagged stream that can later be used to search efficiently for videos or frames containing particular objects or people. We present detailed experiments with a prototype of our system using both stationary and mobile objects, as well as both GPS and ultrasound. Our experiments show that: (i) SEVA has a zero error rate for static objects, except very close to the boundary of the viewable area; (ii) for moving objects or a moving camera, SEVA misses objects entering or leaving the viewable area by only 1-2 frames; (iii) SEVA can scale to 10 fast-moving objects using current sensor technology; and (iv) SEVA runs online using relatively inexpensive hardware.
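As a rough illustration of the interpolation and viewable-area ideas mentioned above, the following minimal sketch tags each video frame with the objects estimated to be in view. It is not the SEVA implementation: the function names (`interpolate_position`, `in_view`, `tag_frames`), the 2D geometry, the linear interpolation between location reports, and the fixed camera pose are all assumptions made for this example.

```python
import math
from bisect import bisect_left


def interpolate_position(samples, t):
    """Estimate (x, y) at time t from sensor reports.

    `samples` is a list of (time, x, y) tuples with strictly increasing
    times (an assumption of this sketch). Positions are linearly
    interpolated between reports and clamped outside the reported range.
    """
    times = [s[0] for s in samples]
    i = bisect_left(times, t)
    if i == 0:
        return samples[0][1:]
    if i == len(samples):
        return samples[-1][1:]
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    w = (t - t0) / (t1 - t0)
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


def in_view(cam_xy, heading_deg, half_fov_deg, max_range, obj_xy):
    """Return True if obj_xy lies inside a hypothetical 2D viewing wedge."""
    dx = obj_xy[0] - cam_xy[0]
    dy = obj_xy[1] - cam_xy[1]
    dist = math.hypot(dx, dy)
    if dist > max_range:
        return False
    if dist == 0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular difference folded into [-180, 180).
    diff = (bearing - heading_deg + 180) % 360 - 180
    return abs(diff) <= half_fov_deg


def tag_frames(frame_times, tracks, cam_xy, heading_deg, half_fov_deg, max_range):
    """Produce a list of (frame_time, [visible object names]) tags."""
    tags = []
    for t in frame_times:
        visible = sorted(
            name
            for name, samples in tracks.items()
            if in_view(cam_xy, heading_deg, half_fov_deg, max_range,
                       interpolate_position(samples, t))
        )
        tags.append((t, visible))
    return tags


if __name__ == "__main__":
    # Hypothetical data: one static object and one object crossing the view.
    tracks = {
        "cup": [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],
        "dog": [(0.0, 5.0, -10.0), (1.0, 5.0, 10.0)],
    }
    for tag in tag_frames([0.0, 0.5, 1.0], tracks, (0.0, 0.0), 0.0, 30.0, 20.0):
        print(tag)
```

A tagged stream of this shape can then be searched by object name. A real system would also have to handle a moving camera, sensor noise, and reports that arrive later than the frames they describe, none of which this sketch attempts.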
