Coupled Recurrent Models for Polyphonic Music Composition

This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music. We propose an efficient new conditional probabilistic factorization of musical scores, viewing a score as a collection of concurrent, coupled sequences, i.e., voices. To model the conditional distributions, we borrow ideas from both convolutional and recurrent neural models; we argue that these ideas are natural for capturing music's pitch invariances, temporal structure, and polyphony. We train models for single-voice and multi-voice composition on 2,300 scores from the KernScores dataset.
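To make the factorization concrete, one natural reading of "a collection of concurrent, coupled sequences" is a voice-wise autoregressive decomposition. The sketch below is an illustration under that assumption (the symbols $x_{v,t}$, $V$, and $T$ are notation introduced here, not taken from the paper): each voice's note at time $t$ is conditioned on the full history of all voices, so the voices remain coupled rather than modeled independently.

$$
p(x_{1:V,\,1:T}) \;=\; \prod_{t=1}^{T} \prod_{v=1}^{V} p\!\left(x_{v,t} \,\middle|\, x_{1:V,\,<t},\; x_{<v,\,t}\right),
$$

where $x_{v,t}$ denotes the event produced by voice $v$ at time step $t$, $x_{1:V,\,<t}$ is the history of all $V$ voices before time $t$, and $x_{<v,\,t}$ covers the voices already generated at the current step. Each conditional would then be parameterized by the recurrent/convolutional model described in the paper.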
