Paper Information - Semantic Summarization of Egocentric Photo Stream Events

With the rapid growth in recent years in the number of wearable camera users and in the amount of data they produce, there is a strong need for automatic retrieval and summarization techniques. This work addresses the problem of automatically summarizing egocentric photo streams captured with a wearable camera by taking an image retrieval perspective. After non-informative images are removed by a new CNN-based filter, the remaining images are ranked by relevance to ensure semantic diversity and finally re-ranked by a novelty criterion to reduce redundancy. To assess the results, a new evaluation metric is proposed that takes into account the non-uniqueness of the solution. Experiments on a database of 7,110 images from 6 different subjects, evaluated by experts, yielded an expert satisfaction rate of 95.74% and a Mean Opinion Score of 4.57 out of 5.0.
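The rank-then-re-rank step described above can be illustrated with a minimal sketch. The abstract does not specify the novelty criterion, so this example substitutes a greedy Maximal Marginal Relevance (MMR)-style rule, which is an assumption and not necessarily the authors' method. The function names, the cosine similarity measure, the trade-off parameter `lam`, and the toy feature vectors and relevance scores are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank_by_novelty(features, relevance, lam=0.5):
    """Greedy MMR-style re-ranking (illustrative assumption, not the paper's criterion).

    At each step, pick the image that maximizes
        lam * relevance - (1 - lam) * (max similarity to images already selected).
    Returns image indices in selection order.
    """
    remaining = list(range(len(features)))
    selected = []
    while remaining:
        def score(i):
            # Redundancy is the similarity to the closest already-selected image.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: images 0 and 1 are near-duplicates, image 2 shows something different.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
relevance = [0.9, 0.8, 0.7]
order = rerank_by_novelty(features, relevance)
print(order)
```

In this toy case the less relevant but visually distinct image 2 is promoted ahead of the near-duplicate image 1. Setting `lam=1.0` disables the redundancy penalty and falls back to the pure relevance ranking.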
