Syncing Shared Multimedia through Audiovisual Bimodal Segmentation

This work is motivated by the nature of contemporary social media storytelling, in which multiple users and publishing channels capture and share the same public events, experiences, and places. Multichannel presentation and visualization mechanisms are pursued, along with novel audiovisual mixing techniques (time-delay compensation, perceptual mixing, quality-based content selection, linking to context-aware metadata, and propagation of multimedia semantics), to support multimodal editing, processing, and authoring of social media content. Although exploiting multiple time-based media (audio and video) that describe the same event can significantly enhance the content, the constituent events must first be detected and temporally synchronized across streams. In many cases, events can be identified from audio features alone, enabling an initial, cost-effective annotation of the multimedia content. This article introduces a new audio-driven approach for the temporal alignment and management of shared audiovisual streams, presents its theoretical framework, and demonstrates the methodology in real-world scenarios. This article is part of a special issue on social multimedia and storytelling.
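
As a minimal sketch of the underlying idea, consider estimating the relative delay between two recordings of the same event by cross-correlating their audio tracks. This generic Python example is illustrative only: it is not the article's bimodal-segmentation method, and the signal, function, and parameter names are assumptions introduced here.

    import numpy as np

    def estimate_offset(ref, query, sample_rate):
        """Return the time in seconds by which shared content appears later
        in `query` than in `ref`; trimming that amount from the start of
        `query` roughly aligns the two streams."""
        # Normalize both signals so that level differences between capture
        # devices do not dominate the correlation.
        ref = (ref - ref.mean()) / (ref.std() + 1e-12)
        query = (query - query.mean()) / (query.std() + 1e-12)
        # Full cross-correlation; the index of the peak encodes the lag.
        corr = np.correlate(query, ref, mode="full")
        lag = int(np.argmax(corr)) - (len(ref) - 1)
        return lag / sample_rate

    # Hypothetical usage: two mono recordings of the same decaying tone,
    # captured with different start times.
    sr = 4000
    t = np.linspace(0.0, 1.0, sr, endpoint=False)
    event = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)
    ref = np.concatenate([np.zeros(sr // 2), event])   # event starts at 0.50 s
    query = np.concatenate([np.zeros(sr), event])      # event starts at 1.00 s
    print(f"offset: {estimate_offset(ref, query, sr):+.3f} s")  # approx +0.500 s

In practice, raw-waveform correlation is fragile for user-generated content recorded on different devices at different distances; robust audio features (for example, spectral fingerprints or segmentation-derived event boundaries, as pursued in the article) would replace the raw samples here.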
