Validating the Detection of Everyday Concepts in Visual Lifelogs

The Microsoft SenseCam is a small, lightweight wearable camera that passively captures photos and other sensor readings from a user's day-to-day activities. It can capture up to 3,000 images per day, equating to roughly 1 million images per year. It is used to aid memory by creating a personal multimedia lifelog, or visual recording of the wearer's life. However, the sheer volume of image data captured within a visual lifelog creates a number of challenges, particularly for locating relevant content. In this work, we explore the applicability of semantic concept detection, a method often used in video retrieval, to the novel domain of visual lifelogs. A concept detector models the correspondence between low-level visual features and high-level semantic concepts (such as indoors, outdoors, people, and buildings) using supervised machine learning, and in doing so estimates the probability that a concept is present in an image. We apply detectors for 27 everyday semantic concepts to a lifelog collection of 257,518 SenseCam images from 5 users. The results are then evaluated on a subset of 95,907 images to determine the precision of detection for each semantic concept and to draw inferences about the lifestyles of those 5 users. We also present future applications of concept detection within the domain of lifelogging.
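To make the idea of a concept detector concrete, the following is a minimal sketch, not the detector used in this work. It assumes a hypothetical two-dimensional feature vector per image (mean brightness and fraction of blue pixels), synthetic toy data for a single "outdoors" concept, and plain logistic regression as a stand-in supervised learner. All function names are illustrative. It shows the two steps described above: estimating the probability that a concept is present, and measuring the precision of the resulting detections.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_images(n, seed):
    """Synthetic (features, labels) for a hypothetical 'outdoors' concept.

    Each feature vector is [mean_brightness, blue_fraction]; these features
    and their distributions are assumptions made purely for illustration.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n):
        if i % 2 == 0:  # outdoors: brighter, more blue
            data.append(([rng.gauss(0.75, 0.08), rng.gauss(0.60, 0.10)], 1))
        else:           # indoors: darker, less blue
            data.append(([rng.gauss(0.35, 0.08), rng.gauss(0.20, 0.10)], 0))
    rng.shuffle(data)
    return [x for x, _ in data], [y for _, y in data]


def train_detector(features, labels, epochs=50, lr=0.5):
    """Fit logistic regression with stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def concept_probability(model, x):
    """Estimated probability that the concept is present in one image."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def precision(model, features, labels, threshold=0.5):
    """Fraction of images flagged as containing the concept that truly do."""
    tp = fp = 0
    for x, y in zip(features, labels):
        if concept_probability(model, x) >= threshold:
            if y == 1:
                tp += 1
            else:
                fp += 1
    return tp / (tp + fp) if (tp + fp) else 0.0


train_x, train_y = make_toy_images(200, seed=0)
test_x, test_y = make_toy_images(100, seed=1)
model = train_detector(train_x, train_y)
print("P(outdoors | bright, blue image):", concept_probability(model, [0.8, 0.6]))
print("precision on held-out images:", precision(model, test_x, test_y))
```

In a real lifelog setting, one such detector would be trained per concept on annotated images, and precision would be measured against human judgements rather than synthetic labels.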
