Melody Structure Transfer Network: Generating Music with Separable Self-Attention

Symbolic music generation has attracted increasing attention, but most methods focus on generating short pieces (mostly fewer than 8 bars, and at most 32 bars). Generating long music requires effectively expressing a coherent musical structure. Despite their success on long sequences, self-attention architectures still struggle with long-term music, which requires additional care for subtle musical structure. In this paper, we propose to transfer the structure of training samples to newly generated music, and we develop a novel separable self-attention based model that enables learning and transferring the structure embedding. We show that our transfer model can generate music sequences (up to 100 bars) with interpretable structures, which share similar structures and composition techniques with the template music from the training set. Extensive experiments show its ability to generate music with a target structure and good diversity. The 3,000 generated sets of music are uploaded as supplemental material.
