The ContextCam: Automated Point of Capture Video Annotation

Rich, structured annotations of video recordings enable interesting uses, but existing manual and even semi-automated tagging techniques can be too time-consuming. This paper presents the ContextCam, a prototype consumer video camera that annotates recorded video at the point of capture with time, location, person presence, and event information. Both low- and high-level metadata are discovered through a variety of sensing and active tagging techniques, and through machine learning techniques that use past annotations to suggest metadata for the current recording. The ContextCam also gives users a minimally intrusive interface for correcting predicted high-level metadata during video recording.
