Rhythm is a Dancer: Music-Driven Motion Synthesis With Global Structure

Synthesizing human motion with a global structure, such as a choreography, is a challenging task. Existing methods tend to concentrate on smooth local pose transitions and neglect the global context or theme of the motion. In this work, we present a music-driven motion synthesis framework that generates long-term sequences of human motions that are synchronized with the input beats and jointly form a global structure respecting a specific dance genre. In addition, our framework enables the generation of diverse motions that are controlled by the content of the music, not only by the beat. Our music-driven dance synthesis framework is a hierarchical system consisting of three levels: pose, motif, and choreography. The pose level consists of an LSTM component that generates temporally coherent sequences of poses. The motif level guides sets of consecutive poses to form a movement that belongs to a specific distribution, using a novel motion perceptual loss. Finally, the choreography level selects the order of the performed movements and drives the system to follow the global structure of a dance genre. Our results demonstrate the effectiveness of our framework in generating natural and consistent movements across various dance types, controlling the content of the synthesized motions, and respecting the overall structure of the dance.
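To make the three-level hierarchy concrete, the following is a minimal PyTorch sketch of one plausible reading of the architecture. The feature dimensions, the pretrained `motion_encoder`, and the genre transition matrix `transitions` are hypothetical placeholders introduced for illustration; this is not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class PoseGenerator(nn.Module):
    """Pose level: an LSTM that autoregressively generates temporally
    coherent poses, conditioned on per-frame audio features."""
    def __init__(self, pose_dim=72, audio_dim=35, hidden_dim=512):
        super().__init__()
        self.lstm = nn.LSTM(pose_dim + audio_dim, hidden_dim,
                            num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden_dim, pose_dim)

    def forward(self, prev_poses, audio, state=None):
        # prev_poses: (B, T, pose_dim), audio: (B, T, audio_dim)
        h, state = self.lstm(torch.cat([prev_poses, audio], dim=-1), state)
        return self.out(h), state

def motif_perceptual_loss(motion_encoder, generated, reference):
    """Motif level: pull a window of consecutive generated poses toward a
    target movement distribution by matching deep features of a pretrained
    motion encoder, analogous to image-space perceptual losses."""
    return torch.mean((motion_encoder(generated) - motion_encoder(reference)) ** 2)

def choreography_schedule(transitions, start_motif, num_motifs):
    """Choreography level: sample an ordered sequence of motif labels from a
    genre-specific (row-stochastic) transition matrix, imposing the dance's
    global structure."""
    order = [start_motif]
    for _ in range(num_motifs - 1):
        order.append(torch.multinomial(transitions[order[-1]], 1).item())
    return order
```

Under this reading, the choreography level fixes the motif order for the whole dance, the motif-level loss shapes each window of generated poses toward its assigned movement class, and the pose-level LSTM fills in beat-synchronized frame transitions.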
