Understanding Musical Predictions With an Embodied Interface for Musical Machine Learning

Machine-learning models of music often exist outside the world of musical performance practice, abstracted from the physical gestures of musicians. In this work, we consider how a recurrent neural network (RNN) model of simple musical gestures may be integrated into a physical instrument so that predictions are sonically and physically entwined with the performer's actions. We introduce EMPI, an embodied musical prediction interface that simplifies musical interaction and prediction to just one dimension of continuous input and output. The predictive model is a mixture density RNN trained to estimate the performer's next physical input action and the time at which it will occur. Predictions are represented sonically through synthesized audio and physically with a motorized output indicator. We use EMPI to investigate how performers understand and exploit different predictive models to make music, through a controlled study of performances with different models and levels of physical feedback. We show that while performers often favor a model trained on human-sourced data, they find different musical affordances in models trained on synthetic, and even random, data. The physical representation of predictions appeared to affect the length of performances. This work contributes new understandings, backed by experimental evidence, of how musicians use generative ML models in real-time performance. We argue that a constrained musical interface can expose the affordances of embodied predictive interactions.
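To make the model concrete, below is a minimal sketch (not the authors' implementation) of a mixture density RNN that predicts the performer's next one-dimensional input position together with the time until it occurs. The layer sizes, number of mixture components, and the Gaussian mixture loss shown here are illustrative assumptions; the abstract specifies only that the model is a mixture density RNN over a single continuous dimension and time.

```python
# Minimal MDN-RNN sketch: an LSTM predicts the parameters of a Gaussian
# mixture over (position, time-delta). All sizes are assumptions.
import numpy as np
import tensorflow as tf

SEQ_LEN = 30   # assumed context window of (position, dt) pairs
N_MIX = 5      # assumed number of Gaussian mixture components
OUT_DIM = 2    # predicted outputs: (position, time delta)

def mdn_loss(y_true, params):
    """Negative log-likelihood of y_true under a diagonal Gaussian mixture."""
    logits, mus, log_sigmas = tf.split(
        params, [N_MIX, N_MIX * OUT_DIM, N_MIX * OUT_DIM], axis=-1)
    mus = tf.reshape(mus, [-1, N_MIX, OUT_DIM])
    sigmas = tf.exp(tf.reshape(log_sigmas, [-1, N_MIX, OUT_DIM]))
    y = tf.expand_dims(y_true, 1)  # [batch, 1, OUT_DIM] for broadcasting
    # Per-component log-density, summed over the output dimensions.
    comp = -0.5 * tf.reduce_sum(
        ((y - mus) / sigmas) ** 2 + 2.0 * tf.math.log(sigmas)
        + tf.math.log(2.0 * np.pi), axis=-1)
    log_mix = tf.nn.log_softmax(logits, axis=-1)
    return -tf.reduce_mean(tf.reduce_logsumexp(log_mix + comp, axis=-1))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, OUT_DIM)),
    tf.keras.layers.LSTM(64),                            # assumed RNN size
    tf.keras.layers.Dense(N_MIX + 2 * N_MIX * OUT_DIM),  # mixture parameters
])
model.compile(optimizer="adam", loss=mdn_loss)
```

At performance time, such a model would be queried with the recent input history, a mixture component sampled from the predicted logits, and a (position, time-delta) pair drawn from the corresponding Gaussian to drive the synthesized audio and the motorized indicator.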
