Leveraging multimodal information for event summarization and concept-level sentiment analysis

The rapid growth in the amount of user-generated content (UGC) online requires social media companies to automatically extract knowledge structures (concepts) from photos and videos in order to provide diverse multimedia-related services. However, real-world photos and videos are complex and noisy, and extracting semantics and sentics from the multimedia content alone is very difficult because suitable concepts may be exhibited in different representations. Hence, it is desirable to analyze UGC from multiple modalities for a better understanding. To this end, we first present the EventBuilder system, which addresses semantics understanding and automatically generates a multimedia summary for a given event in real time by leveraging different social media such as Wikipedia and Flickr. Subsequently, we present the EventSensor system, which addresses sentics understanding and produces a multimedia summary for a given mood. It extracts concepts and mood tags from the visual content and textual metadata of UGC and exploits them to support several significant multimedia-related services, such as a musical multimedia summary. Moreover, EventSensor supports sentics-based event summarization by leveraging EventBuilder as its semantics engine component. Experimental results on the YFCC100M dataset confirm that both EventBuilder and EventSensor outperform their baselines and efficiently summarize knowledge structures.
