Paper information - A saliency-based approach to event recognition

A saliency-based approach to event recognition

Abstract: Over the last few years, a number of solutions covering different aspects of event recognition have been proposed for event-based multimedia analysis. Existing approaches mostly focus on efficient image representations and advanced classification schemes. It would, however, be desirable to focus on the event-specific information available in the image, namely the so-called event saliency. In this paper, we propose a novel approach based on multiple instance learning (MIL) to learn the visual features contained in event-salient regions, which are extracted through a crowd-sourcing study. In total, we collect the salient regions for 76 different events from 4 large-scale datasets. The experimental results demonstrate the efficacy of using only event-related regions, achieving a significant gain in performance over the state of the art.
