Paper information - Where is my Phone?: Personal Object Retrieval from Egocentric Images

Where is my Phone?: Personal Object Retrieval from Egocentric Images

This work presents a retrieval pipeline and an evaluation scheme for finding the last appearance of personal objects in a large dataset of images captured by a wearable camera. Each personal object is modelled by a small set of images that define a query for a visual search engine. The retrieved results are reranked using the timestamps of the images to increase the relevance of later detections. Finally, a temporal interleaving of the results is introduced for robustness against false detections. The Mean Reciprocal Rank is proposed as a metric to evaluate this problem. This application could help in developing personal assistants that assist users who do not remember where they left their personal belongings.
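As a rough illustration of the proposed metric, the sketch below computes the Mean Reciprocal Rank over a set of queries. It is not taken from the paper: the function names, the use of image identifiers, and the representation of relevance as a set of identifiers per query are all assumptions made for this example. How the paper decides which images count as relevant (for instance, those showing the last appearance of the object) is not specified here.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over (ranked list, relevant set) pairs.

    Each pair stands for one query, e.g. one personal object (assumption).
    """
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical image identifiers, for illustration only.
    queries = [
        (["img_07", "img_03", "img_11"], {"img_07"}),  # relevant at rank 1
        (["img_02", "img_09", "img_05"], {"img_09"}),  # relevant at rank 2
        (["img_01", "img_04"], {"img_99"}),            # relevant not retrieved
    ]
    print(mean_reciprocal_rank(queries))
```

Because only the first relevant result contributes to the score, the metric suits a task where the user needs a single correct answer near the top of the ranking.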

Noel E. O'Connor | Eva Mohedano | Kevin McGuinness | Xavier Giró | Cristian Reyes
