Less talk, more rock: automated organization of community-contributed collections of concert videos

We describe a system for synchronizing and organizing user-contributed content from live music events. We start with a set of short video clips taken at a single event by multiple contributors using a varied set of capture devices. Using audio fingerprints, we synchronize these clips so that overlapping clips can be displayed simultaneously. Furthermore, we use the timing and link structure generated by the synchronization algorithm to improve the findability and representation of the event content, including identifying key moments of interest and generating descriptive text for important captured segments of the show. We also identify the preferred audio track when multiple clips overlap. We thus create a much improved representation of the event that builds on the automatic content-based matching. Our work demonstrates important principles in the use of content analysis techniques for social media content on the Web, and applies those principles in the domain of live music capture.
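
The abstract does not include code; as an illustration only, here is a minimal Python sketch of the kind of audio-fingerprint matching it describes, using Shazam-style spectrogram-peak landmarks: each clip is reduced to a set of (hash, time) landmarks from pairs of local spectral peaks, and the relative offset between two clips is taken as the mode of time differences between matching hashes. All names and parameters here (landmarks, estimate_offset, fan_out, bin_width) are hypothetical choices for the sketch, not the authors' implementation.

```python
# Hedged sketch of fingerprint-based clip synchronization; not the paper's code.
import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import maximum_filter
from collections import defaultdict

def landmarks(audio, sr, fft_size=2048, hop=512, fan_out=5):
    """Extract (hash, time_in_seconds) landmarks from spectrogram peaks."""
    f, t, S = spectrogram(audio, fs=sr, nperseg=fft_size,
                          noverlap=fft_size - hop)
    logS = np.log(S + 1e-10)
    # Peaks: bins that equal the maximum of their local neighborhood
    # and sit above the mean energy.
    peaks = (logS == maximum_filter(logS, size=(20, 20))) & (logS > logS.mean())
    fi, ti = np.nonzero(peaks)          # frequency / time-frame indices
    order = np.argsort(ti)
    fi, ti = fi[order], ti[order]
    marks = []
    # Pair each anchor peak with a few nearby later peaks; the hash is the
    # (anchor freq, target freq, frame gap) triple, as in Wang-style schemes.
    for i in range(len(ti)):
        for j in range(i + 1, min(i + 1 + fan_out, len(ti))):
            dt = ti[j] - ti[i]
            if 0 < dt <= 64:
                marks.append(((int(fi[i]), int(fi[j]), int(dt)), float(t[ti[i]])))
    return marks

def estimate_offset(marks_a, marks_b, bin_width=0.05):
    """Start time of clip B relative to clip A, via a histogram of time
    differences over matching hashes; returns (offset_seconds, votes)."""
    index = defaultdict(list)
    for h, ta in marks_a:
        index[h].append(ta)
    votes = defaultdict(int)
    for h, tb in marks_b:
        for ta in index.get(h, ()):
            # Same sound at clip-A time ta and clip-B time tb implies
            # B starts (ta - tb) seconds after A.
            votes[round((ta - tb) / bin_width)] += 1
    if not votes:
        return None, 0
    best = max(votes, key=votes.get)
    return best * bin_width, votes[best]
```

Given such pairwise offsets, matches whose vote count clears a threshold can be linked into clusters of overlapping clips, yielding the synchronized timeline and link structure that the organization and summarization steps build on.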
