A Hierarchical Recurrent Neural Network for Symbolic Melody Generation

In recent years, neural networks have been used to generate symbolic melodies. However, the long-term structure of melodies makes it difficult to design a good model. In this article, we present a hierarchical recurrent neural network (HRNN) for melody generation, which consists of three long short-term memory (LSTM) subnetworks working in a coarse-to-fine manner along time. Specifically, the three subnetworks generate bar profiles, beat profiles, and notes, in turn, and the outputs of the high-level subnetworks are fed into the low-level subnetworks, where they guide the generation of the finer time-scale melody components. Two human behavioral experiments demonstrate the advantage of this structure over a single-layer LSTM, which attempts to learn all hidden structures in melodies. Compared with the recently proposed models MidiNet and MusicVAE, the HRNN produces melodies that human evaluators rate as better.
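
The coarse-to-fine schedule described in the abstract can be sketched as follows. This is only an illustration of the control flow, not the authors' implementation: `ToyLayer` is a hypothetical stand-in for an LSTM subnetwork (it is not an LSTM), and the resolution of 4 time steps per beat and 4 beats per bar, the vocabulary sizes, and all names are assumptions made for the example.

```python
import random

STEPS_PER_BEAT = 4   # assumed resolution: 4 time steps per beat
BEATS_PER_BAR = 4    # assumed meter: 4 beats per bar


class ToyLayer:
    """Hypothetical stand-in for one LSTM subnetwork.

    It is NOT an LSTM. It only mimics the interface (previous symbol and
    guidance in, next symbol out) so the update schedule can be shown.
    """

    def __init__(self, vocab_size, rng):
        self.vocab_size = vocab_size
        self.rng = rng
        self.state = 0

    def step(self, prev_symbol, guidance=()):
        # Fold the previous output and higher-level guidance into the state.
        self.state = (self.state * 31 + prev_symbol + 7 * sum(guidance)) % 9973
        # Pick a symbol; a trained model would sample from a softmax instead.
        return (self.state + self.rng.randrange(self.vocab_size)) % self.vocab_size


def generate(num_bars, seed=0):
    """Return (notes, trace), where trace[t] is the (bar, beat) profile at step t."""
    rng = random.Random(seed)
    bar_layer = ToyLayer(vocab_size=8, rng=rng)    # emits bar profiles
    beat_layer = ToyLayer(vocab_size=8, rng=rng)   # emits beat profiles
    note_layer = ToyLayer(vocab_size=16, rng=rng)  # emits note events
    steps_per_bar = STEPS_PER_BEAT * BEATS_PER_BAR
    bar = beat = note = 0
    notes, trace = [], []
    for t in range(num_bars * steps_per_bar):
        if t % steps_per_bar == 0:
            # Coarsest layer: updated once per bar, no guidance from above.
            bar = bar_layer.step(bar)
        if t % STEPS_PER_BEAT == 0:
            # Middle layer: updated once per beat, guided by the bar profile.
            beat = beat_layer.step(beat, guidance=(bar,))
        # Finest layer: updated every step, guided by both profiles.
        note = note_layer.step(note, guidance=(bar, beat))
        notes.append(note)
        trace.append((bar, beat))
    return notes, trace
```

Calling `generate(2)` yields 32 note symbols. The bar profile stays fixed for each run of 16 steps and the beat profile for each run of 4 steps, which is the sense in which the higher-level outputs act as slowly changing guidance for the note-level layer.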
