Audeosynth: Music-driven Video Montage

We introduce music-driven video montage, a media format that offers an engaging way to browse or summarize video clips collected from various occasions, such as gatherings and adventures. In a music-driven video montage, the music drives the composition of the video content: according to the musical movement and beats, video clips are cut and arranged to form a montage that visually reflects the experiential properties of the music. However, creating such a montage by hand requires enormous effort and artistic expertise. In this paper, we develop a framework for automatically generating music-driven video montages. The input is a set of video clips and a piece of background music. By analyzing the music and video content, our system extracts carefully designed temporal features from the input, casts the synthesis problem as an optimization, and solves for the parameters through Markov chain Monte Carlo sampling. The output is a video montage whose visual activity is cut and synchronized to the rhythm of the music, rendering a symphony of audio-visual resonance.
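
The abstract only names the optimization strategy; the sketch below illustrates, under stated assumptions, how a Metropolis-Hastings (MCMC) sampler could search for a clip-to-segment assignment that matches visual activity to musical intensity. The toy data, the names music_intensity and clip_activity, and the squared-mismatch energy are illustrative placeholders, not the paper's actual features or objective, which also scores cut placement and transitions.

```python
import math
import random

# Hypothetical toy data: per-segment musical "intensity" (e.g. onset strength)
# and per-clip visual "activity" (e.g. mean optical-flow magnitude).
music_intensity = [0.2, 0.9, 0.5, 0.7, 0.3]          # one value per music segment
clip_activity   = [0.1, 0.8, 0.4, 0.95, 0.6, 0.25]   # one value per candidate clip

def energy(assignment):
    """Sum of squared mismatches between each music segment and its clip.
    Lower energy = better audio-visual agreement (a stand-in for the full
    objective described in the paper)."""
    return sum((music_intensity[s] - clip_activity[c]) ** 2
               for s, c in enumerate(assignment))

def metropolis_hastings(n_iters=20000, temperature=0.05, seed=0):
    rng = random.Random(seed)
    # Start from a random assignment of one clip per music segment.
    state = [rng.randrange(len(clip_activity)) for _ in music_intensity]
    best, best_e = list(state), energy(state)
    e = best_e
    for _ in range(n_iters):
        # Propose a local move: reassign one randomly chosen segment.
        proposal = list(state)
        proposal[rng.randrange(len(proposal))] = rng.randrange(len(clip_activity))
        e_new = energy(proposal)
        # Accept with the Metropolis criterion; downhill moves always accepted.
        if e_new <= e or rng.random() < math.exp((e - e_new) / temperature):
            state, e = proposal, e_new
            if e < best_e:
                best, best_e = list(state), e
    return best, best_e

if __name__ == "__main__":
    assignment, cost = metropolis_hastings()
    print("clip index per music segment:", assignment)
    print("residual mismatch:", round(cost, 4))
```

Local reassignment moves keep each proposal cheap to evaluate; the temperature controls how often worse assignments are accepted, which helps the sampler escape poor local arrangements.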
