Event-based media processing and analysis: A survey of the literature

Research on event-based processing and analysis of media is receiving increasing attention from the scientific community because of its relevance to many applications, from consumer video management and video surveillance to lifelogging and social media. Events can semantically encode relationships among different informational modalities, such as visual, audio, and text content, time, and the agents and objects involved, with the spatio-temporal component of events being a key feature for contextual analysis. This unveils an enormous potential for exploiting new information sources and opening new research directions. In this paper, we survey the existing literature in this field. We extensively review the conceptualizations of the notion of event in multimedia, the techniques for event representation and modeling, and the feature representation and event inference approaches for event detection in audio, visual, and textual content. Furthermore, we review some key event-based multimedia applications and various benchmarking activities that provide solid frameworks for measuring the performance of different event processing and analysis systems. We provide an in-depth discussion of the insights obtained from reviewing the literature and identify future directions and challenges.

We survey the literature in event-based media processing and analysis. We examine the different definitions of events. We study various techniques for event representation and modeling. We survey feature representation and event inference approaches in multimedia content. We review event-based multimedia applications and various benchmarking activities.
