Interactive Control of Explicit Musical Features in Generative LSTM-based Systems

Long Short-Term Memory (LSTM) neural networks have been applied effectively to learning and generating musical sequences, aided by sophisticated musical representations and by integration with other deep learning models. Deep neural networks, including LSTM-based systems, learn implicitly: given a sufficiently large amount of data, they transform information into high-level features that do not, however, correspond to the high-level features perceived by humans. For instance, such models can compose music in the style of the Bach chorales, but they cannot compose a rhythmically sparser version of that style, or a chorale that begins with low pitches and ends with high ones -- let alone do so interactively in real time. This paper presents an approach to building such systems. A very basic LSTM-based architecture is developed that composes music conforming to user-provided values of rhythm density and pitch height (register). A small initial dataset is augmented to incorporate more pronounced variations of these two features, and the system learns to generate music that not only reflects the style but also, most importantly, reflects the feature values explicitly given as input at each specific time. This system -- and future versions that will incorporate more advanced architectures and representations -- is suitable for generating music whose features are defined in real time and/or interactively.
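The conditioning idea described above can be sketched minimally: at every timestep, the network receives the current note token together with the user-controlled feature values, so that generation can track feature changes as they arrive. The following PyTorch sketch is illustrative only and is not the paper's actual architecture; the class name, dimensions, and the choice to concatenate two normalised scalars (rhythm density, pitch register) to the note embedding are all assumptions.

```python
import torch
import torch.nn as nn

class ConditionalLSTM(nn.Module):
    """Toy next-note model conditioned on two explicit musical features.

    Hypothetical sketch: both conditioning values (rhythm density and
    pitch register) are assumed to be scalars normalised to [0, 1] and
    are concatenated to the note embedding at every timestep.
    """
    def __init__(self, vocab_size=64, embed_dim=32, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Input per timestep = note embedding + 2 conditioning scalars.
        self.lstm = nn.LSTM(embed_dim + 2, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, notes, density, register, state=None):
        # notes: (batch, time) integer tokens
        # density, register: (batch, time) floats in [0, 1]
        x = self.embed(notes)
        cond = torch.stack([density, register], dim=-1)  # (batch, time, 2)
        h, state = self.lstm(torch.cat([x, cond], dim=-1), state)
        return self.out(h), state  # logits over the next-note vocabulary

model = ConditionalLSTM()
notes = torch.randint(0, 64, (1, 16))
density = torch.full((1, 16), 0.8)   # request a rhythmically dense passage
register = torch.full((1, 16), 0.2)  # request a low pitch register
logits, _ = model(notes, density, register)
print(logits.shape)
```

Because the conditioning values are ordinary inputs rather than fixed network parameters, they can be changed between timesteps during sampling, which is what makes real-time interactive control possible in this setup.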
